Towards a Semantic Web for
Heritage Resources
Thematic Issue 3, May 2003
DigiCULT Consortium
www.digicult.info ISBN 3-902448-00-8
CONTENT
Guntram Geser
Introduction and Overview
Seamus Ross
Position Paper: Towards a Semantic Web for Heritage Resources
Interview with Janneke van Kersen
Development of the Semantic Web Must Begin at the Grass Roots Level
Michael Steemson
DigiCULT’s Expert 13 Tangle with the Semantic Web
Semantic Web Terms and Reading List: A-X
Interview with Nicola Guarino
Semantic Web should be based on Well-founded Ontologies
Guntram Geser
A Cultural Heritage Semantic Web Example & Primer
The Darmstadt Forum Participants
DigiCULT Project Information
Imprint
INTRODUCTION AND OVERVIEW
By Guntram Geser
FUNCTION AND FOCUS
DigiCULT, as a support measure within the Information Society Technologies Programme (IST), will for a period of 30 months (beginning March 2002) provide a technology watch mechanism for the cultural and scientific heritage sector. Backed by a network of peer experts, the project monitors, discusses and analyses existing and emerging technologies likely to bring benefits to the sector.
To promote the results and encourage early take-up of relevant technologies, DigiCULT has put in place a rigorous publication agenda of seven Thematic Issues, three in-depth Technology Watch Reports, as well as the DigiCULT.Info e-journal, pushed to a growing database of interested persons and organisations on a regular basis. All DigiCULT products can be downloaded from the project Web site http://www.digicult.info as they become available. The opportunity to subscribe to DigiCULT.Info is also found here.
March 2003 saw the release of the first DigiCULT Technology Watch Report. This report covers the topics Customer Relationship Management, Digital Asset Management Systems, Smart Labels and Smart Tags, Virtual Reality and Display Technologies, Human Interfaces, and Games Technologies. Addressing primarily technological issues, it serves as a guide to what a heritage institution needs to consider when buying into one of these technologies.
In comparison with the Technology Watch Reports, the Thematic Issues focus more on the organisational, policy and economic aspects of the technologies under consideration. They are based on the expert round tables organised by the DigiCULT Forum secretariat. In addition to the Forum discussion, they provide opinions of other experts in the form of articles and interviews, case studies, short descriptions of related projects, together with a selection of relevant literature.
TOPIC AND CHALLENGE
This third Thematic Issue addresses the questions: What is the Semantic Web? What will it do for heritage institutions? And what is the role of certain languages, in particular XML and RDF?
In short, the Semantic Web vision proclaims a Web of machine-readable data which allows software agents to automatically carry out rather complex tasks for humans. Key to realising this vision is semantic interoperability of Web resources. Yet such interoperability is not the primary goal of heritage institutions (and intelligent software agents are not readily at hand).
What the institutions are looking for are new ways of providing scholarly and non-expert users (e.g. school classes, lifelong learners) with access to their collections and related knowledge. This goal can be accomplished, for example, through online collections and exhibitions that not only display objects and simple descriptions (drawn from metadata) but also allow for understanding relationships between objects (created by semantically interrelated metadata). The Semantic Web community promises to assist in achieving this goal, but the challenge for the heritage institutions would be to first implement the necessary data infrastructure.
[Illustration: Philosophy in Discussion With a Philosopher]
The challenge for the Semantic Web expert round table was, or at least the DigiCULT Secretariat thought it was, not to run into a debate between ‘theory’ and ‘practice’; in other words, between what academic Semantic Web scholars and what practitioners from heritage institutions think needs to be accomplished, what is feasible and affordable, and where to concentrate efforts. For the discussion, XML seemed to provide a good starting point: XML, on the one hand, is increasingly considered by heritage institutions as a key standard for publishing metadata on the Web; on the other hand, it is a major building block for the Semantic Web. It proved different, in a positive sense. In the discussion, wide use of XML was taken for granted, while the key area of interest that surfaced, and was seen to be most fruitful to explore, was ontologies.
OVERVIEW
Setting the context for this Issue, the position paper looks into the requirements for achieving the goals of the Semantic Web and assesses whether the available technologies will be able to deliver on what the advocates of the Semantic Web envisage, as well as whether the cultural heritage sector is in a position to take substantial steps towards semantic interoperability. It concludes with the argument that the sector is more likely to be left behind, due in particular to the fact that, for the institutions, the rewards for the necessary investments are still too nebulous.
Janneke van Kersen from the Dutch Digital Heritage Association, in her interview with the DigiCULT Journalist, suggests that, despite the cloudy Semantic Web horizon, there are medium-term benefits to be gained for heritage institutions in taking steps towards the vision. And she states that it is up to associations like hers, together with larger institutions, to take the lead in this, prove that proposed solutions work, and support smaller institutions in taking advantage of them. On the other hand, Nicola Guarino, in his interview, believes that reaching the ‘real’ Semantic Web lies in taking ‘the fundamental route’ of implementing generic ontologies, based on linguistics and logics, within the Semantic Web fabric. He also claims that even incremental progress along this path can have remarkable pay-offs.
Michael Steemson’s summary of the Darmstadt Forum illustrates that the Semantic Web topic resembles a labyrinth, with currently no definite map or Ariadne’s Thread at hand. Building on the many technologies the Forum participants mentioned as some of the labyrinth’s angles, we have added to the summary a list of resources related to these technologies.
In an effort to raise the veil of mystery surrounding the Semantic Web, this issue includes an example from the sector on the implementation of semantic interoperability of metadata, combined with a primer that explains core building blocks such as XML, RDF and ontologies. While a detailed primer of, for example, RDF would alone exhaust the limits of this issue,¹ the goal here is to deliver an ‘all-inclusive’ primer within the space permitted, with all the inevitable limitations this entails. The primer attempts to provide a general understanding of the Semantic Web architecture without obliging the reader to wander through the long and perplexing corridors of language specifications.
Finally, we want to thank the Koninklijke Bibliotheek, National Library of the Netherlands, for their kind permission to use selected images from their collection of illuminated medieval manuscripts.² We hope you will appreciate the little narratives they represent within the overall fabric of this DigiCULT Thematic Issue.
1 cf. F. Manola, E. Miller: RDF Primer (W3C Working Draft 23 January 2003), http://www.w3.org/TR/rdf-primer/
2 See their online collection of such images at http://www.kb.nl/kb/manuscripts, which offers advanced search and presentation features.
POSITION PAPER
By Seamus Ross
Tim Berners-Lee and his colleagues at W3C have recognised that the real benefits of the web-based information revolution will come from enabling the interoperability of content. The current generation of web delivery is, they have argued, designed for human users, who struggle to make effective use of the billions of pages of information currently accessible. When we search for something at the moment, we sometimes discover suitable candidate information, but more often than not this is far from being the case. More than this, the entire process of searching, discovery and use is designed to be driven by humans. When we discover one piece of the puzzle, we need manually to position that information so that it can help us to search out the next piece of the puzzle. We find that Darmstadt is near Frankfurt. Then we find that there are flights from Glasgow to Frankfurt and there is a bus from Frankfurt Airport to Darmstadt. Then I search for timetables, make manual comparisons, and decide which times best meet my requirements. In the Shangri-La that is the Semantic Web, my ‘agent’ would recognise from its regular review of my diary that I needed to be at a meeting in Darmstadt on the 21st of January 2003, and it would search out the options, analyse the timetables, identify the optimum travel arrangements, book my non-smoking hotel accommodation, and order the taxi to take me to the airport. (It might even check the weather forecasts and warn me to bring particular types of clothing.) Certainly, to make this happen there has to be a fundamental shift in the way data, information and knowledge are represented on the web.
[Illustration: Genesis – The Creation: Division of Light and Darkness]
The proliferation of web-based resources makes finding what you are looking for increasingly difficult. According to Internet user studies, in 1996 50% of Internet users reported spending time looking for information without finding it, but by 2002 only about 40% of users ended their ‘searching sessions’ unsuccessfully. At first glance, we might conclude that web discovery tools have improved and/or the information searching skills of users have improved. Over the past seven years the quantity of content has mushroomed, the search tools have become more efficient, developers approach the use of meta-tags more effectively, and anecdotal evidence suggests that the searching techniques of users have become more sophisticated. We should continue to be surprised by the high failure rate, and wonder why it remains proportionally so high as the numbers of users have grown to nearly 600 million. In reality, there is just too much content available. It is poorly described. It is not interconnected. Search engines themselves are blunt instruments. Most users of the web do not have very mature searching strategies and rarely use even the blunt instruments as effectively as they might. A solution is to make more of the information capable of discovery, interpretation and reuse by automated information processing tools themselves. However, the current ways content is represented on the web make it nearly impossible for machines to search the web meaningfully and effectively – even with the limitations of their skills and tools, humans are better at searching the web than the most powerful of the current generation of agents. The emergence of the Semantic Web would solve this problem.
The web has made us realise the tremendous potential of digital resources and made them widely available. Content as presented on the web currently is mute. By adding descriptive information to content and resources, and representing both the descriptive information and the content in well-defined, consistent and structured ways, ‘mechanised agents’ could be enabled to use web information ‘intelligently’. Tim Berners-Lee, Jim Hendler and many other researchers believe that commercial and public sector institutions are increasingly recognising the benefits of ensuring that their content is adequately represented so that it is visible and discoverable within the context of the Semantic Web.
[Illustration: Genesis – The Creation: Division of the Waters Above and Below the Firmament]
The Semantic Web will enable the heritage sector to make its information available in meaningful ways to researchers, the general public and even its own curators. The public will be able to plan visits to institutions by, for example, dynamically relating opening times to public transport schedules. They will be able to use information to discover whether or not that vase in the attic or basement is really Ming, as their grandmother claimed, by comparing it to the holdings of heritage institutions across the world. Curators will benefit from the ability to define an exhibition and have the entire process, from the identification of the pieces to be shown in the exhibition to the production of the catalogue and publicity material, automatically handled by their ‘exhibition agents’.
TOWARDS AN INTEROPERABLE SEMANTIC WEB FOR HERITAGE RESOURCES
Delivering the Semantic Web to the heritage sector depends upon (a) the syntactical and semantic mark-up of content, (b) the development of better knowledge analysis and modelling tools, (c) widespread adoption of interoperable knowledge representation languages, and (d) the construction of suitable ontologies. In most of this the heritage sector is lagging behind. We have not yet successfully represented sufficient quantities of our data in ways that make it accessible to human web users, let alone in ways that would make it feasible for ‘mechanised agents’ to reason about in meaningful ways. ‘Languages for representing data and knowledge are an important aspect of the Semantic Web’ (Klein 2001, 26). The languages that are currently the focus of the most substantial discussion, such as RDF, DAML+OIL and OWL,¹ do not necessarily provide a suitable framework for delivering the Semantic Web. This point has been increasingly argued in the literature, although in practice we still tend to emphasise the possibilities of representation mechanisms such as RDF(S) because it provides a flexible and extensible mechanism to represent metadata. A debate is raging about which language should be used to represent semantics on the web. Resource Description Framework (RDF), an XML-based mechanism for expressing metadata, has been put forward at the basic level, but there is a growing body of opinion that indicates it does not have the richness that is necessary to make a suitable language. One of its shortcomings is that it cannot support syntax. In response, other languages such as DAML+OIL have been developed. As an indication of the current levels of flux, in a fundamental paper Patel-Schneider and Siméon from Bell Labs Research remark that ‘…there is a semantic discontinuity at the very bottom of the Semantic Web, interfering with the stated goal of the Semantic Web: If Semantic languages do not respect World-Wide Web data, then how can the Semantic Web be an extension of the World-Wide Web at all?’ (2002a, 147).
The strength of XML is that it does not itself constrain how the data will be interpreted. While XML does not imply a specific interpretation of the data, how the material is marked up does constrain how it can be used. Fallside (2001) has made plain the weaknesses of using DTDs as a way of specifying semantic properties in XML (eXtensible Markup Language). XML Schemas offer a solution to these weaknesses, especially where those weaknesses arise from representational problems. On the other hand, the hierarchical nature of XML does not fit all domains, it ‘does not encode the data’s use and semantics’, and DTDs and XML Schemas do not specify the data’s meaning, although they do specify the names of elements and attributes. Will the Semantic Web produce different levels of sophistication in the representation of data and knowledge in the web-world? If it does, will this create a patchy representation of web information that will make the Semantic Web of limited value?
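The point that element names alone do not carry meaning can be illustrated with a small sketch. The two records below are hypothetical, as are the helper names `extract_a` and `extract_b`; the sketch assumes two institutions that describe the same object with different mark-up, and uses only the Python standard library parser.

```python
import xml.etree.ElementTree as ET

# Two hypothetical records stating the same fact with different mark-up.
RECORD_A = "<painting><title>Salome</title><creator>Unknown</creator></painting>"
RECORD_B = '<object type="painting" name="Salome"><maker>Unknown</maker></object>'


def extract_a(xml_text):
    """Interpret a record that follows institution A's (assumed) conventions."""
    root = ET.fromstring(xml_text)
    return {"title": root.findtext("title"), "creator": root.findtext("creator")}


def extract_b(xml_text):
    """Interpret a record that follows institution B's (assumed) conventions."""
    root = ET.fromstring(xml_text)
    return {"title": root.get("name"), "creator": root.findtext("maker")}


facts_a = extract_a(RECORD_A)
facts_b = extract_b(RECORD_B)

# Both documents are well-formed XML, and both readers recover the same facts,
# but only because a person wrote down that <creator> and <maker> mean the same.
same_facts = facts_a == facts_b

# Applying A's reading to B's record parses without error yet finds nothing:
# the parser knows the element names, not what they mean.
misread = extract_a(RECORD_B)
```

In this sketch the knowledge that `creator` and `maker` are equivalent lives entirely in the hand-written functions, which is the gap that the semantic layers discussed above are meant to close.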
1 See the ‘Semantic Web Terms and Reading List’ in this Issue.
ONTOLOGIES – THE JEWELS OF THE SEMANTIC WEB
For the Semantic Web to succeed, it will require not only modelling languages such as XML, RDF and OWL, but it will also require methodologies for extracting and defining the knowledge that is to be represented. Decades of research and commercial attempts to exploit the knowledge-based systems have demonstrated the complexity of knowledge modelling. Until there is such a methodology, the possibilities of XML (or any other technology) as a knowledge representation language will not be achieved.
The success of the Semantic Web will depend heavily upon the creation of suitable ontologies. To avoid adding new variants to definitions, we will follow James Hendler’s definition of ontology as ‘a set of knowledge terms, including the vocabulary, the semantic interconnections, and some simple rules of inference and logic for some particular topic’ (Hendler 2001, 30). One of the major hurdles facing us in building the Semantic Web is the lack of suitable ontologies. Languages such as OWL enable ontologies to represent ‘class taxonomies’ and provide mechanisms to enable their rapid development. For example, concepts and relationships can be established, such as ‘watercolour is a type of painting’ or ‘a necklace is a type of jewellery’. But what about their multilingual capabilities? An ontology may well know that a ‘watercolour is a painting’, but it does not necessarily mean that it knows that an ‘aquarelle is a type of painting’ or that a ‘watercolour is a type of peinture’. In addition, and probably first, we need to consider:
• Can we cost the creation of appropriate ontologies for the heritage sector?
• How can we prioritise the ontologies that are needed (e.g. which ones should the heritage sector develop and which ones will we be able to borrow from other sectors)?
• What heritage-based organisations should focus on ontology creation?
• Ontologies often fail to be interoperable. What solutions are there to this problem and how can they be made to work effectively?
• Does OWL (W3C’s Web Ontology Semantic Markup Language for publishing and sharing ontologies) provide a suitable mechanism for ontology creation for the heritage sector?
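The watercolour and aquarelle example above can be made concrete with a minimal sketch. It is not OWL and not any particular ontology tool: the dictionary `SUBCLASS_OF`, the label table `LABELS` and the function `is_a` are all hypothetical names, and the multilingual labels are assumed to have been supplied by hand.

```python
# "Is a type of" statements taken from the examples in the text.
SUBCLASS_OF = {
    "watercolour": "painting",
    "necklace": "jewellery",
}

# Assumed, hand-made table linking foreign-language labels to ontology terms.
LABELS = {
    "aquarelle": "watercolour",
    "peinture": "painting",
}


def is_a(term, target, labels=None):
    """Return True if `term` equals `target` or is a (transitive) subtype of it.

    `labels` optionally maps alternative labels onto ontology terms; without
    it, only the terms literally present in SUBCLASS_OF are understood.
    """
    labels = labels or {}
    current = labels.get(term, term)
    goal = labels.get(target, target)
    seen = set()
    while current is not None and current not in seen:
        if current == goal:
            return True
        seen.add(current)  # guards against accidental cycles in the table
        current = SUBCLASS_OF.get(current)
    return False


knows_watercolour = is_a("watercolour", "painting")
knows_aquarelle = is_a("aquarelle", "painting")
knows_aquarelle_with_labels = is_a("aquarelle", "painting", LABELS)
knows_peinture_with_labels = is_a("watercolour", "peinture", LABELS)
```

Without the label table the lookup fails for ‘aquarelle’ even though the class hierarchy is unchanged, which suggests that multilingual coverage is extra work on top of the taxonomy rather than something the taxonomy provides by itself.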
Gómez-Pérez and Corcho (2002), in an analysis of ‘Ontology Languages for the Semantic Web’, found that the measure of expressiveness in the current generation of ontology creation languages is a spectrum from XOL, RDF(S), SHOE, OML, OIL to DAML+OIL at the richest end of the scale. Indeed, in their experience, while any of these languages will work for very simple ontologies, any attempt to use a weak language to create a complex ontology will fail.
Proof and trust is emerging as another central issue. How do we know that what our agent has discovered through its trawl of the Semantic Web can be trusted? Even in the case of ontologies, how should we decide whose ontology to trust? This is especially important where the two ontologies may conflict with one another. Similarly, we are faced with the difficulties of ensuring and maintaining semantic integrity and a lack of methods for testing its presence.
LEGITIMISING THE SEMANTIC WEB INVESTMENT
Heflin and Hendler (2001) make the valuable proposal that semantic mark-up should be seen as one aspect of webpage design. This, in their view, would go a long way to ensuring that the costs of this mark-up (and the underlying information analysis that is necessary to make it happen) were met at the appropriate stage of the process of putting material up on the web. However, Heflin and Hendler’s proposal that semantic mark-up should be embedded into web-page design fails to recognise that the fundamental fabric of the web is changing. For this to happen, we need a stronger argument for the benefits that such investment will bring to the heritage sector.
Haustein and Pleumann (2002) have noted that the successful development of the World Wide Web benefited from two factors: ‘Participation was simple and the results of effort were immediately visible to the creator.’ As they argue, while these two success criteria, best classified in my view as ease of use and instant gratification, were characteristic of the WWW, they are not embedded into the fabric of the Semantic Web. The Semantic Web is hard and rewards are neither immediate nor assured. While in the long term it may bring tremendous benefits, the near-term take-up will be slow.
At least three other factors contributed to fostering the success of the web. Firstly, the early web developments concentrated on content creation and not on the creation of representation languages. The initial instantiation of HTML was simple, but it worked, and material tagged using it remained accessible. We were not forced to dispose of work that we had ‘webised’ unless we wished to replace its representation with more sophisticated ones. Secondly, the value of the content that we put up increased as more users put up content of their own, because the additional content attracted more users. Thirdly, to benefit you did not need to generate a lot of content; a very little would do, and you could incrementally add more later. Slowly, heritage institutions found ways to take advantage of the opportunities offered by the web; there are still many small and medium-sized heritage institutions that have not.
[Illustration: Genesis – The Creation: Division of Sea and Earth; Creation of Trees and Plants]
Indeed, the heritage sector is likely to be left behind because the financial rewards for creating the mark-up necessary to make the Semantic Web a reality are only evident to the commercial sector. There can be little doubt that the access to and understanding of the heritage would benefit from a world in which the vision of the Semantic Web were realised. But this is not the first information technology for which the benefits were promising. Even very simple strategies, such as the use of databases to enable collection description, have been shown over a period of nearly thirty-five years to bring benefits to the heritage sector institutions through better knowledge about, care of, and access to their collections. In the ALM sector, only libraries can be said to have fully taken advantage of the technology to describe their collections, and even here a close look shows that this has not covered all their holdings and not every institution. For instance, few libraries in the UK have online catalogues of their pre-1700 items, and almost none have accurately described their photographic holdings at anything deeper than collection level. The same can be said of museums, where descriptions are limited, except of course at the major institutions. In 1997, a survey in the United Kingdom showed that small and medium-sized institutions were struggling to participate in the computer-based description of their holdings. This was even before they considered putting the output of those holdings online. I would argue that this should hardly be surprising, as the heritage sector has already been left behind in the development of online information in the web-world. Too few institutions have too little visible content that is actually usable. If the heritage sector is to make a near-term contribution to the development of the Semantic Web, it is going to be very moderate. It is very unlikely that developments will be related to reasoning about the heritage in the ways considered by Amann et al. (2002). The ALM sector is more than likely to participate in the development of the Semantic Web through the creation of semantic mark-up of information about access arrangements, such as opening hours and details of facilities. This information is more likely to be useful to the tourism agent described by Tim Berners-Lee in his 2001 Scientific American article. While this may be a very positive way of integrating the heritage into the Semantic Web, it does not maximise the potential benefits.
The days when a curator who wishes to hold an exhibition on the representations of Salome since the 15th century will be able to ‘load’ an agent with the request to identify, select, negotiate the loan of, and arrange the transportation of the key 100 works of art are a long way off. The fundamental descriptions of holdings are not currently available; where they exist they are not online, and certainly have not been semantically encoded to make them usable by our Salome agent. For those who have worked on Knowledge Representation, the vision of the Semantic Web holds promise. Knowledge Representation is hard, especially if you intend any particular representation to be usable by others, either as a decision-making resource or for research purposes. Efforts in the 1980s and early 1990s, such as those in archaeology, failed. The reasons Knowledge Representation failed to achieve its promise ranged from the poor quality of knowledge extraction strategies, the lack of fundamental representation methodologies, the limited applicability of methods to knowledge domains, and the problems of boundary constraint and creep, to the high costs of developing applications. The Semantic Web could breathe new life into this earlier promise by providing ways to carve up the problem while bringing us immediate successes.
CONCLUSION
Over the next five years, the possibilities offered by the Semantic Web will bring little near-term benefit for the heritage sector unless that sector co-ordinates its efforts to ensure that the fundamental building blocks that are necessary for the Semantic Web to be a success are put in place. We need some quick wins. A quick win involves identifying a domain that every institution can be encouraged to represent semantically, and the placing in the public domain of a ‘personalisable agent’ that can take advantage of these semantics. Three factors could underpin a quick win: (a) a narrowly restricted knowledge domain of real public value, (b) an accessible and narrow ontology, and (c) a personalisable tool for processing knowledge.
Ultimately, the same factors that constrain the heritage sector’s ability to take full advantage of the Web will constrain the penetration and pervasiveness of the Semantic Web in the heritage sector. The success of the Semantic Web in the heritage sector depends upon its adopting an XML-based approach and a significant experiment that demonstrates its benefits to the wider community. Even for all its weaknesses, the Semantic Web offers a tantalising solution to the problem of information overload created by the web, and the heritage sector needs to address how it can take advantage of the opportunities it offers.
Bibliography
Amann, B., Beeri, C., Fundulaki, I. and Scholl, M. 2002. Ontology-Based Integration of XML Resources, in I. Horrocks and J. Hendler (eds), The Semantic Web – ISWC 2002, Berlin: Springer, 117-131.
Berners-Lee, T., Hendler, J. and Lassila, O. 2001. The Semantic Web, in Scientific American, May 2001.
Candan, K.S., Liu, H. and Suvarna, R. 2001. Resource Description Framework: Metadata and Its Application, in SIGKDD Explorations 3:1, 6-19. http://www.acm.org/sigs/sigkdd/explorations/issue3-1/candan.pdf
Doan, A., Madhavan, J., Domingos, P. and Halevy, A. 2002. Learning to Map between Ontologies on the Semantic Web, in Proceedings WWW2002, 7-11 May 2002 (Honolulu), 662-673.
Gómez-Pérez, A. and Corcho, O. 2002. Ontology Languages for the Semantic Web, in IEEE Intelligent Systems, January/February 2002, 54-60.
Haustein, S. and Pleumann, J. 2002. Is Participation in the Semantic Web Too Difficult?, in I. Horrocks and J. Hendler (eds), The Semantic Web – ISWC 2002, Berlin: Springer, 448-453.
Heflin, J. and Hendler, J. 2001. A Portrait of the Semantic Web in Action, in IEEE Intelligent Systems, March/April 2001, 54-59.
Hendler, J. 2001. Agents and the Semantic Web, in IEEE Intelligent Systems, March/April 2001, 30-37.
Hendler, J., Berners-Lee, T. and Miller, E. 2002. Integrating Applications on the Semantic Web, in Journal of the Institute of Electrical Engineers of Japan 120:10, 676-680.
Klein, M. 2001. XML, RDF and Relatives, in IEEE Intelligent Systems, March/April 2001, 26-28.
McGuinness, D. L., Fikes, R., Hendler, J. and Stein, L.A. 2002. DAML+OIL: An Ontology Language for the Semantic Web, in IEEE Intelligent Systems, September/October 2002, 72-80.
Patel-Schneider, P. and Siméon, J. 2002a. Building the Semantic Web on XML, in I. Horrocks and J. Hendler (eds), The Semantic Web – ISWC 2002, Berlin: Springer, 147-161. http://www-dbs.cs.uni-sb.de/lehre/ss03/xml-seminar/Material/PS02.pdf
Patel-Schneider, P. and Siméon, J. 2002b. The Yin/Yang Web: XML Syntax and RDF Semantics, in Proceedings WWW2002, 7-11 May 2002 (Honolulu), 443-453.
Renear, A., Dubin, D., Sperberg-McQueen, C.M. and Huitfeldt, C. 2002. Towards a Semantics for XML Markup, in DocEng’02, 8-9 November 2002 (McLean, VA), ACM Publication.
Shah, U., Finin, T., Joshi, A., Cost, R. S. and Mayfield, J. 2002. Information Retrieval on the Semantic Web, in CIKM’02, 4-9 November 2002, 461-468.
DEVELOPMENT OF THE SEMANTIC WEB MUST BEGIN AT THE GRASS ROOTS LEVEL
AN INTERVIEW WITH JANNEKE VAN KERSEN, DUTCH DIGITAL HERITAGE ASSOCIATION, THE NETHERLANDS
By Joost van Kasteren
‘To be successful, the Semantic Web for the cultural heritage sector will have to develop from the grass roots level. A top-down approach, whereby institutions have to squeeze themselves into a certain format, is not going to work.’ Janneke van Kersen has strong views on the initiatives that are currently being undertaken to develop a Semantic Web. ‘They are not going to work for the cultural heritage institutions if you do not take into account the position that they are in. Especially not if the institutions are forced to overhaul their digitisation projects completely.’
Kersen graduated in Art History and did a postgraduate course on Historical Information Processing. Since 1999 she has been a consultant with the Dutch Digital Heritage Association (Vereniging DEN), which supports cultural heritage institutions, large and small, in developing strategies to face the digital future.
The key objectives of the DEN are to assist institutions in digitising and documenting their collections according to high quality standards, and assuring cross-domain and cross-institutional access to heritage information in a context-rich, structured environment. The methods used to realise these objectives are knowledge dissemination, best practice and standardisation. The Association propagates open standards like XML, OAI and Dublin Core (qualified).
The DEN has approximately 60 member institutions, among them most of the large heritage institutions of The Netherlands. It provides access to the databases of the member organisations through the portal http://www.cultuurwijzer.nl. The Cultuurwijzer (culture pointer) to the collections uses the Aqua Browser to search for terms in a non-hierarchical, associative way. Databases can also be accessed through subject fields based on the Dublin Core standard. Research is carried out to apply the Art and Architecture Thesaurus in a post-coordinative way, using it as an additional search aid.
Kersen: ‘The Dublin Core has some drawbacks, but it is one of the few international standards available for exchange of information. Mapped to the 5 Ws (who, where, what, when and why), it turns out to be a nice tool for interoperability across the databases of heritage institutions. Of course we have to accept a certain kind of fuzziness and lack of precision compared with domain-specific access at the institutional level.’
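How a mapping from Dublin Core to the 5 Ws might support searching across databases can be sketched as follows. The interview does not say which elements were assigned to which W, so the table `FIVE_WS` is purely an illustrative assumption, as are the two sample records and the function names; only the Dublin Core element names themselves are taken from the standard.

```python
# Assumed assignment of Dublin Core elements to the 5 Ws (illustrative only).
FIVE_WS = {
    "creator": "who", "contributor": "who", "publisher": "who",
    "coverage": "where",
    "title": "what", "subject": "what", "type": "what",
    "date": "when",
    "description": "why",
}

# Two hypothetical records, as if drawn from different institutions' databases.
RECORDS = [
    {"title": "Salome", "creator": "Unknown master",
     "date": "15th century", "coverage": "Utrecht"},
    {"title": "Letter on the Salome panel", "contributor": "Unknown master",
     "date": "1901", "format": "text"},
]


def group_by_w(record):
    """Regroup a Dublin Core record under the five question words."""
    grouped = {}
    for element, value in record.items():
        w = FIVE_WS.get(element)
        if w is None:
            continue  # elements without an assumed W are simply skipped
        grouped.setdefault(w, []).append(value)
    return grouped


def search(records, w, term):
    """Return records whose values under question word `w` contain `term`."""
    term = term.lower()
    return [
        record for record in records
        if any(term in value.lower() for value in group_by_w(record).get(w, []))
    ]


who_hits = search(RECORDS, "who", "unknown master")
when_hits = search(RECORDS, "when", "1901")
```

The ‘who’ search finds both records although one stores the name under `creator` and the other under `contributor`. That is also where the fuzziness Kersen mentions comes from, since the distinction between the two roles is lost in the merged view.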
According to Kersen a real Semantic Web is still along way off lsquoWe simply do not have the tools yetfor a meaningful exchange and representation ofinformation XML and RDF do not provide theinteroperability that is needed On the other hand Ido not believe in developing a fundamental ontologyto give meaning to information on the Net It looksto me like the 18th-century endeavour to write anencyclopaedia that contains all the knowledge in theworld I am afraid it does not work that wayA lot ofknowledge even scientific knowledge cannot bedescribed in a logical way Especially in the arts a lotof ldquoknowledgerdquo is the result of heuristics andassociative thinkingApart from that there is thepractical problem that cultural heritage institutions donot have the money and the staff to describe their
collections anew in a way that fits the ontologyThatis just too much workrsquo
Kersen thinks the Semantic Web will growgradually from the grass roots level onwardWithinthe Dutch Digital Heritage Association she can pointat several initiatives Considered on their own theymight not seem like much ie in relation to thegargantuan task of developing a Semantic Web butwhen they are combined a certain pattern begins toemerge
There is, for instance, a project, SitecheckerRDF, which will look into ways in which RDF can be used for describing content for Web-based delivery. Furthermore, several standardisation projects are running that enable the participating institutions to develop the description of their specific knowledge domain, e.g. graphic domain, religion, art history. The formal and semantic mapping schemes used in these projects will include Dublin Core, Encoded Archival Description (EAD), IMS Learning Resources Meta-data Specification, Art & Architecture Thesaurus, as well as the CIDOC reference model. Kersen: ‘At the moment most of the reference terms are developed at the level of institutions, which means their use is limited to a certain domain. Or, to put it another way, every domain is developing its own dialect. Maybe the development of a combined reference scheme will be a step towards a Semantic Web.’
Another important project is the development of a scheme for description at the collection level, in order to offer a clearer and more hierarchical access to heritage collections. This project has its roots in the Dutch project for collection level description, MUSIP (Museum Inventarisation Project). The description scheme will be broadened to make it available for other heritage institutions as well, for example archives.
The description and results of these projects and the programme lines will be made available on the Web site of the Vereniging DEN, http://www.den.nl. As Cultuurwijzer is used as a proof of concept, the results will be directly accessible at http://www.cultuurwijzer.nl. (Kersen kindly invites interested parties to put questions directly to their organisation.)
Projects are always carried out in co-operation with the member organisations. Kersen: ‘We first try it ourselves until we are sure it works, a “proof of concept” you could say. These tests are overseen by a small group of automation experts working for our member organisations. The next step is to test the method on a larger scale with some of our member organisations. A larger working group oversees these tests. Only then is the method released to our member organisations. The advantage is that smaller member organisations can ride on the experience of the larger ones. By taking a step-by-step approach we also enhance the level of commitment. You could say we are providing some order in the information chaos that exists on the Internet. A few small steps along the long road towards the Semantic Web.’
Vereniging Digitaal Erfgoed Nederland: http://www.den.nl
Cultuurwijzer: http://www.cultuurwijzer.nl
DIGICULT’S EXPERT 13 TANGLE WITH THE SEMANTIC WEB
By Michael Steemson
Genesis – The Creation: Birds and Fishes
It was really very kind of the Moderator. He asked the cultural heritage experts: ‘What is the message that we should give our scientific writer, who will do a write-up of this meeting for the Thematic Issue?’ Their inclinations had not always been clear. But the question focused minds, and they were certain now.
‘I would put my money on the Semantic Web,’ said two of them, not quite in unison. ‘The Semantic Web is a direction; it is like North. You go North, but you never arrive and say “here it is”. This is the Semantic Web,’ said another.
The course of the debate at the Darmstadt DigiCULT Forum had not always been so direct. It had started with the Position Paper’s dismaying thought that ‘the limited understanding of information processing in the heritage sector almost makes the Semantic Web an impossibility to apply’.
It had touched on the semantics of Simeon poetry, art works of the biblical seductress Salome, weather forecasts for the northern English city of York, and the revolutionary theories of 16th-century Italian astronomer Galileo. But the experts agreed, finally, that the cultural heritage sector needed the Semantic Web and that a good deal of education and guidance would be required to make it appreciate that need.
The experts numbered 13, lucky for some, at this, the third DigiCULT Forum of the European Union’s technology watchdog for cultural and scientific heritage institutions. In the previous 12 months, other forum groups had discussed authenticity and integrity for digitisation programmes and, later, digital asset management systems. Now the Darmstadt 13 - historians, language and information technology scientists, academics and publishers - were looking even further down the information autobahn, to the vision of WWW inventor, Englishman Tim Berners-Lee, who sees a new kind of automated Web that learns and understands each user’s particular requirements and delivers complete, reliable, tested information sets.
In a co-authored May 2001 Scientific American article,¹ Mr Berners-Lee imagines a family facing the horrors of re-scheduling its lives around a mother’s unexpected illness. The sons and daughters rely on Semantic Web ‘agents’, small executable Web files, to search online medical records, hospital bed lists, transport timetables, doctors’ appointment books, road condition reports and home diaries to find treatment, plan travel and re-arrange personal engagements to fit the emergency.
The vision requires huge world-wide investment in time and effort, creating countless ‘ontologies’, containing perhaps XML (eXtensible Mark-up Language) and RDF (Resource Description Framework) data, to which the electronic ‘agents’ could refer for understanding before applying to specially formatted Web pages for the information.
The Berners-Lee et al. dazzling forecast is: ‘The Semantic Web will enable machines to comprehend semantic documents and data, not human speech and writings. Properly designed, the Semantic Web can assist the evolution of human knowledge as a whole.’
THE DAZZLING PROSPECTS
Dazzling it is, and the Darmstadt 13 were attracted. But they were not blinded. Moderator Dr Seamus Ross, the Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII), suggested that, while the current WWW content got a lot of ‘bang’ for its development dollars, the Semantic Web needed huge, expensive content before it could work well.
Application of the Berners-Lee ideas to cultural heritage use was a long way off, he thought, and wondered: ‘Is there enough benefit from the Semantic Web in the near term to make it a realisable dream 50 years down the road?’
Italian National Research Council Applied Ontologies Laboratory director Nicola Guarino has been working on the subject for 12 years, and he knows the difficulties. He said: ‘This is the ideal view which Tim Berners-Lee has: machines which work for you, your proxy which works for you, performing these dynamic connections for the Web which preserve meaning. It is pretty ambitious, but this is his idea. I would be happier if, rather than using an automatic proxy, we could just let people establish these dynamic connections using their brain and the Web. This is already something that is not easily done.’
Austria’s Wernher Behrendt had encountered other snags. At Salzburg Research, the secretariat for the DigiCULT Forums, he co-ordinates another European Commission IST project, CULTOS (Cultural Units of Learning - Tools and Services). He conceded: ‘There is a 50-year research vision behind the issue of the Semantic Web,’ and went on: ‘but there are incremental steps that, with good utility, can be built in a reasonable time. One of the intellectual challenges is to break the vision into these manageable steps.’
LANGUAGE REPRESENTATION HITCHES
A CULTOS group had, Behrendt explained, taken one of these incremental steps and built an ontology² for digitised works of art. It had encountered problems with language representation, like: ‘Are there tools to support knowledge representation language? Are the users then actually able to work usefully with that? Can we incorporate the multimedia authoring component, where people who have not built the
1 T. Berners-Lee, J. Hendler, O. Lassila: The Semantic Web. In: Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html
2 Ontology n. 1. Philosophy. The branch of metaphysics that deals with the nature of being. 2. Logic. The set of entities presupposed by a theory. Collins English Dictionary, Third edition, Glasgow 1991.
Cultural Units of Learning – Tools and Services (CULTOS)
CULTOS is an RTD project co-funded by the European Commission under the Information Society Technologies (IST) Programme, which will run until October 2003. The application domain of CULTOS is intertextual studies in literature and arts. The project is developing a multimedia authoring and presentation environment that allows scholars to make the different relationships between cultural works explicit in a way that approximates contextualisation in interpretative processes. The result of the authoring processes is multimedia objects called ‘intertextual cultural threads’. These are based on ‘EMMOs’, a novel type of structured multimedia object containing expert knowledge, conforming to current and emerging standards such as XML, SMIL (with interactive extensions), MPEG-7 and RDF.
http://www.cultos.org
ontology themselves should then use it to combine multimedia assets with each other?’
The Dutch had doubts too. Dr Janneke van Kersen is an art historian with Digital Heritage Netherlands (Digitaal Erfgoed Nederland, http://www.den.nl), where an XML-based content management system is being combined with a Resource Description Framework (RDF) to join databases from several cultural heritage institutions. She told the experts: ‘I need to be assured that we will be able to build a layered structure that is equally applicable to each knowledge domain. Furthermore, I think that the cultural heritage sector is too much of a niche market to develop this.’
Her countryman, Dr Frank Nack, from the national research institute for mathematics and computer science, CWI (Centrum voor Wiskunde en Informatica, http://www.cwi.nl) in Amsterdam, works with a multimedia and human computer interaction group. His concern: ‘Our group believes in the Semantic Web, but we needed some mechanisms to structure the information so that various groups can work with it. What we came up with was the belief that you can classify the user at a particular time. But that was simply not good enough.’
The group had found that users change their requirements widely, and these shifts were invisible to a system. ‘Humans can look at material one day, and the next day they look at the same stuff differently and describe it differently because they are in a different mood,’ he said. He characterised the problem as: ‘Now I would like to see something for my work, and now I want to be entertained. Which means I would like to access the information differently.’
He had one other worry: Webised mixed media. He said: ‘This discussion has been heavily linguistic based, which I can understand because most people do still think of the Web as text driven. The issue of describing various media items that are not text will, I think, very soon become important for the Semantic Web. We had better start thinking about that too.’
THE GALILEO CONUNDRUM
The Institute and Museum of Science History in Florence, Italy, has tried to create an ontology around the works and sciences of its city’s famous son, the revolutionary astronomer, mathematician and physicist Galileo Galilei (1564-1642).³ But it ran into difficulties when it came to the radical changes in theory that he created.
Institute relational database expert Andrea Scotti told the Forum: ‘Galileo’s scientific theory negates another scientific theory. This negating or development of theories was very difficult to represent in the ontology when dealing with the time factor. That is central to historical documentation, but representational time was not part of the process available to us.’
Dr Costis Dallas, the Athens chairman of the European communication and technology group Critical Publics, thought the Florence museum’s project was very ambitious. The time argument was difficult because ‘of course, it isn’t possible to represent time properly within a relational database’. But there were mechanisms - he mentioned software by the Virginia, US, IT group Telos⁴ - that better represented issues of time.
Italian National Research Council’s Nicola Guarino chipped in: ‘The CIDOC⁵ reference models have partial answers to these questions.’
Dr Dallas went on: ‘I do not believe that the whole exercise is futile, but we found that in practice you cannot make a subject language for everybody. It has to be for a community of users. If you provide them with a richer representation (for instance, if they can know that this is a person and this person lived in a place), then users will have a much richer experience.’
3 The museum and Web site are rich resources for the life and work of Galileo: http://galileo.imss.firenze.it
4 Telos Corporation, http://www.telos.com, Ashburn, VA, US
5 CIDOC, International Committee for Documentation of the International Council of Museums, http://www.willpowerinfo.myby.co.uk/cidoc/CIDOCe (ICOM-CIDOC). Forum for documentation interests of museums and related organisations; one of 25 international committees of the International Council of Museums (ICOM).
CAPITALS AND ACRONYMS
The Forum moved on to a discussion of the science and software behind the Semantic Web. Moderator Seamus Ross started by questioning whether ‘key thinkers’ were ‘missing a fundamental point: that Web pages are dead, that database-driven Web pages are the future, that people are going to stop making Web pages and make databases’.
He recalled that James A. Hendler, co-author with Tim Berners-Lee of the Semantic Web paper and a professor in the Department of Computer Science at the University of Maryland, had recommended semantic representation as part of any Web pages. Dr Ross commented: ‘This notion of the Semantic Web is not going to work with these databases. Is this true or not?’
Bert Degenhart-Drenth, the managing director of Netherlands ADLIB Information Systems (http://www.nl.adlibsoft.com), thought it was true. But it was a practical problem that would be solved
Open Archives Initiative (OAI) Protocol for Metadata Harvesting
The Open Archives Initiative (OAI) is a mainly US-based group of people and organisations that evolved out of a need to increase access to scholarly publications through interoperable digital repositories. Support for the OAI’s goals comes from the Digital Library Federation, the Coalition for Networked Information and from an NSF Grant. One of its major achievements is an application-independent interoperability framework based on metadata harvesting: the OAI Protocol for Metadata Harvesting.
The OAI Protocol is based on the Web standards HTTP and XML, and employs Dublin Core (unqualified) as metadata standard. Heritage organisations who have systems that support the OAI Protocol can expose metadata about the content in their repository, i.e. allow service providers to harvest the data for services such as search engines. In the OAI Protocol, the XML schema is used at two levels: to define the format of responses to all OAI Protocol requests, and to define the format of metadata streams embedded in the GetRecord and ListRecords responses. In both cases the goal is to provide a mechanism for data validation.
http://www.openarchives.org
The OAI-PMH version 2.0, released in June 2002, can be found at http://www.openarchives.org/OAI/2.0/openarchivesprotocol.htm
See also John Perkins: A New Way of Making Cultural Information Resources Visible on the Web: Museums and the Open Archives Initiative. Museums and the Web Conference 2001, http://www.archimuse.com/mw2001/papers/perkins
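As a rough illustration of the harvesting requests described in this box, the sketch below builds the URLs a service provider might send for the ListRecords and GetRecord verbs, asking for unqualified Dublin Core via the oai_dc metadata prefix. The repository address and the record identifier are invented for the example, and the helper name oai_request_url is ours, not part of the protocol; no request is actually sent.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Hypothetical repository endpoint; a real OAI-PMH base URL would go here.
BASE_URL = "http://repository.example.org/oai"


def oai_request_url(base_url, verb, **arguments):
    """Build an OAI-PMH request URL: base URL plus the verb and its arguments."""
    params = {"verb": verb}
    params.update(arguments)
    return base_url + "?" + urlencode(params)


# ListRecords: harvest records as unqualified Dublin Core (prefix oai_dc).
list_url = oai_request_url(BASE_URL, "ListRecords", metadataPrefix="oai_dc")

# GetRecord: fetch a single record; this identifier is made up for illustration.
get_url = oai_request_url(
    BASE_URL,
    "GetRecord",
    identifier="oai:repository.example.org:object-42",
    metadataPrefix="oai_dc",
)

print(list_url)
print(get_url)
```

The XML that comes back from such requests is what the schema validation mentioned above would apply to; parsing and validating it is left out of this sketch.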
Open Archives Forum Project
http://www.oaforum.org
The Open Archives Forum Project is a two-year Fifth Framework Programme IST accompanying measure that will run until September 2003. The Forum is building a Web-based database on OAI-related projects, software implementations and services, and supports the information exchange between OAI user communities.
Their surveys provide good insight into the status of uptake of the OAI in Europe. For an overview of the results, see S. Dobratz, B. Matthaei: Open Archives Activities and Experiences in Europe: An Overview by the Open Archives Forum. In: D-Lib Magazine, Vol. 9, No. 1, January 2003, http://www.dlib.org/dlib/january03/dobratz/01dobratz.html
Recently, one of their workshops, ‘Providing Access to Hidden Resources’ (Lisbon, December 2002), targeted the libraries and archives communities. Requirements, standards, best practice and solutions to interoperability problems of these communities were analysed and compared with the features provided by the OAI Protocol for Metadata Harvesting. The Tutorial ‘OAI and OAI-PMH for Beginners’ and other presentations can be found at http://www.oaforum.org/workshops/lisb_programme.php
by projects like the Web Services of the Open Archives Initiative (OAI).
Several members spoke of difficulties created by dynamic Web pages generated from ASP databases. Paul Miller, the UK Interoperability Focus for the University of Bath’s UKOLN (formerly the UK Office for Library Networking) project, said the UK’s
Office of the e-Envoy,⁶ the agency in charge of Britain’s e-Government programme, drove it with a Web site called Govtalk.⁷ ‘One big ASP database,’ he called it. It carried the Government’s interoperability standard framework with a URL of ‘a great big long string, something, something, dot ASP’. ‘That is how this key piece of legislation is referred to, and next week it will be something else.’
Dr van Kersen thought the new Web Ontology Language, with the transposed acronym OWL, would help. Italian ontology researcher Nicola Guarino discussed the European Commission’s On-To-Knowledge project, its RDF tool Sesame⁸ and the Ontology Inference Layer (OIL) language.
Dr Ross described the US Defense Department’s DARPA (Defense Advanced Research Projects Agency) Agent Mark-up Language (DAML) programme and its DAML+OIL variant. Mr Degenhart-Drenth highlighted the importance of protocols used in Web Services, such as SOAP (Simple Object Access Protocol) and UDDI (Universal Description, Discovery and Integration), and also pointed to a SPECTRUM XML standard for museums that he helped write. Then there was the moviemakers’ audio-visual search standard MPEG-7 and the SMIL (Synchronized Multimedia Integration Language). No one mentioned SHOE (Simple HTML Ontology Extensions), which was surprising as the discussion became more and more alphabetic and upper case.
ONTOLOGY TUTORIAL IN 800 WORDS
Nicola Guarino brought the discussion back on track. The ontology expert delivered a fascinating, impromptu 800-word dissertation on ontology genetics.
Ontologies, he said, started because it was realised that controlled vocabularies, which worked well enough for limited periods, needed something extra to make them really useful. They needed clarification of intended meaning. This could be achieved in much the same way as dictionaries did it, by reference to other, more basic terms. This was, he said, the key point.
‘Ontologies can work if the basic terms are really used in a principled way. There is a hidden assumption here: that it is indeed possible to express the meaning in terms of a relatively small set of primitive terms.’
He explained further: ‘There are general terms that have a universal meaning. Take the term “part”, for instance, or “set”. Or take temporal relations: “before”. Suppose I have two different periods, the Renaissance and another period, and suppose I say that this period comes “before” the other one: do you exclude the case whereby the two periods overlap, or not? This is just a matter of stipulation; this is a general term that is not domain specific. I can simply stipulate exactly whether the “before” relationship between the two intervals includes the case of overlapping or not. And I can do that by means of axioms. Once you clarify the meaning of the basic terms (topological relations, mereological⁹ relations, dependence relations, these kinds of things), then you have the basic vocabulary that helps you to introduce more domain-related things. And this is what people are doing in the area of what are called “Foundational Ontologies”, and this is what I am doing. I believe this is the only way to solve the problem of semantic interoperability. So: not just controlled vocabularies, but vocabularies that are formally defined in minimal terms.’
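The stipulation Mr Guarino describes for “before” can be made concrete with a small sketch. The code below is only an illustration under our own assumptions: periods are reduced to a start and end year, the year ranges are invented for the example rather than authoritative datings, and the two functions stand in for the two axioms one might choose between.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Period:
    name: str
    start: int  # first year of the period
    end: int    # last year of the period

    def __post_init__(self):
        if self.start > self.end:
            raise ValueError("a period cannot end before it starts")


def before_strict(a, b):
    """Stipulation 1: a is before b only if a has ended when b starts (no overlap)."""
    return a.end < b.start


def before_loose(a, b):
    """Stipulation 2: a is before b if it starts earlier, even if the two overlap."""
    return a.start < b.start


# Illustrative year ranges only, chosen so that the two periods overlap.
renaissance = Period("Renaissance", 1400, 1600)
baroque = Period("Baroque", 1580, 1750)

print(before_strict(renaissance, baroque))  # False: the periods overlap
print(before_loose(renaissance, baroque))   # True: the Renaissance starts first
```

Two catalogues that both record “Renaissance before Baroque” agree only if they have adopted the same stipulation, which is the interoperability point being made.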
Now it was clear, but would it be available to heritage institutions? Dr Ross wanted to know if fundamental ontologies of use to the heritage sector already existed. What would they be? Before any
On-To-Knowledge
On-To-Knowledge is an IST RTD project that was completed in June 2002. The project developed tools and methods for supporting knowledge management in large and distributed organisations. The technical backbone of On-To-Knowledge was the use of ontologies for the various tasks of information integration and mediation. For the project’s many results, see their tools repository, project deliverables and publications at http://www.ontoknowledge.org/pubs.html
See also the On-To-Knowledge book: ‘Towards the Semantic Web: Ontology-driven Knowledge Management’, J. Davies, D. Fensel, F. van Harmelen (eds.), John Wiley, December 2002.
6 Office of the e-Envoy: http://www.e-envoy.gov.uk
7 Govtalk: http://www.govtalk.gov.uk
8 Sesame environment: http://www.ontoknowledge.org/tools/factsheetSesame.html
9 Mereology n. The formal study of the logical properties of the relation of part and whole. Collins English Dictionary, Third edition, Glasgow 1991.
development could begin, basic classifications and methodologies would be required to form a foundation for the work.
Italian online publisher Marco Meli, a manager of the EU’s MESMUSES (Metaphors for Science Museums) project, insisted: ‘You need a clear definition in this particular domain. What are the key words, the terms?’
Mr Guarino answered: ‘The key concepts and the key relations.’
THE ELUSIVE SEMANTIC GRAIL
By now the Darmstadt 13 were beginning to realise they were not having much luck with their search for the Semantic Grail. Their discussion became a little tetchy.
Someone talked about ‘meta-ontology’ and another growled: ‘The term “meta” is abused.’
‘Outdated technical optimism’ was mentioned.
‘What is your alternative?’ someone else wanted to know.
‘I don’t have one.’
‘Is this the way forward?’
MESMUSES - Metaphors for Science Museums
MESMUSES is an IST RTD project that will run until July 2003. It aims at designing a general method and supporting tools to produce knowledge maps for use in self-learning environments of science museums. In the project, a knowledge map is defined as a set of related concepts and facts that is offered to learners with some guidance or suggestions on possible itineraries that they may follow to explore the knowledge space.
The method and tools developed in MESMUSES are being tested and validated by two large science museums, the Cité des Sciences et de l’Industrie in Paris and the Istituto e Museo di Storia della Scienza in Florence, which provide access to their digital catalogues. Both museums are developing knowledge maps and itineraries on different themes in Biology (Genome) and Physics (Galileo and the laws of motion).
Project Web site: http://cweb.inria.fr/Projects/Mesmuses
See also M. Meli: Knowledge Management: a new challenge for science museums. In: Cultivate Interactive, Issue 9, 7 February 2003, http://www.cultivate-int.org/issue9/mesmuses
‘What part of the problem would that solve?’
‘We will know when we have tried that out.’
‘Here we are approaching a scary field.’
Civility and peace were restored as Frank Nack, the CWI Netherlands scientist, introduced the thought: ‘There are ontologies for art, and they are very old and well crafted. There are very clear rules about why they did what they did, because they have worked on them for a thousand years. What you could suggest is that we strip down to the basics for one field, say art, and apply it to all the other fields we have in cultural heritage: architecture, film, whatever, all working with very different substances.’
Seamus Ross added: ‘So we need one fundamental ontology on which we can build all the others.’
THE LUCK CHANGES
The luck of the 13 was beginning to change. The Athenian heritage informatics expert, Dr Dallas, described work among his company clients on developing ‘an upper ontology’. All the issues the Forum had been discussing (what to do about time, basic concepts, process, agents and so on) were being examined. They were beginning to develop ‘something very much like a thesaurus’, with term expansion that created sub-categories of relationships. He called it ‘generic layering’, a process that could identify the ‘generic grammar’ of relationships within a specific domain - art history, for example.
‘This is useful,’ he said. ‘This way we can create Web systems that present an association of content for users that is meaningful to them. Let’s say, a “cultural” meaning.’
Nicola Guarino went further. He believed that the International Council of Museums’ Committee for Documentation (CIDOC) Conceptual Reference Model (CRM) was the ‘best starting point’ for the heritage community. The CIDOC CRM is the result of 10 years’ work by a standards working group. The
model is under review for adoption as an International Organization for Standardization (ISO) publication.
Mr Guarino was enthusiastic: ‘I am not biased on this. I am a reviewer on the CHIOS project that supports this proposal for an ISO standard, and I am amazed by the fact that this standard is more or less principled. On the one hand, the authors really strive to get these principled things, and on the other hand they have extensively accounted for existing practice. It is the result of a large community of work. It is not perfect, but it really is a starting point for this community.’
He affirmed Dr Ross’s delighted question: ‘So this is an ontology we can borrow?’
And the Italian expert had more to add: ‘The question before was “how can we be sure that the principled things can really solve our problems?” I do not have a crisp answer, but I do have some evidence that even a tiny result on the foundational side has a high pay-off. So you do not need to solve all the foundational issues.’
The distinctions between an object and its role, or individual and classes of items, were delicate but, once understood, could lead to significant data improvement, he said, adding: ‘Take, for instance, the distinction between object and event. This is only one tiny distinction, but it is so fundamental that, once you understand it, you can save time in developing your own application ontology. Tiny conceptual progress does have a high pay-off. This is why I believe it is useful.’
There were still one or two doubts, but CULTOS project co-ordinator Wernher Behrendt tidied up with a daring stance: ‘Let me be a heretic for a second. How many of us have an operating system other than Windows? What I am saying is that standardisation often helps. Even if it is not the best standard, it does help get people working together. It is perfectly fair to define a standard for the world now. There will be a lot of discussion, but it will wind down to a few constructs. It is a better method of getting an ontology accepted than having ontologies mushrooming all over the place that must then be integrated.’
The Darmstadt 13 were pleased. They had a model. They had ‘Web Services’ stepping stones to work across. They weren’t going to fall into the trap of insisting just yet that the Semantic Web was important for the heritage sector, but they wanted an education process for unconvinced curators.
They needed someone to make the first ontology move. Seamus Ross suggested asking the J. Paul Getty Trust (http://www.getty.edu). They needed automated tools for testing ontologies. They did not want to be delivered into the entertainment industry, but saw benefit in what University of Florence Associate Professor Franco Niccolucci called ‘cultural entertainment’.
So, where would the experts put their money, asked Mr Behrendt: ‘On there not being any benefits in the Semantic Web for the cultural heritage sector, or there being some benefits in building such things, whatever they may be?’
Amsterdamer Frank Nack was in no doubt: ‘It is going to happen. It will probably look very different from how we imagine it right now, but it is going to happen.’ His countryman, ADLIB chief Bert Degenhart-Drenth, thought so too: ‘We have put our money there. All our applications work with XML.’ And Dr van Kersen agreed: ‘I would put my money on the Semantic Web.’
That seemed to make it game, set and match.
CIDOC CRM: ‘The Semantic Glue’
The ‘CIDOC object-oriented Conceptual Reference Model’ (CIDOC CRM) was developed by the ICOM/CIDOC Documentation Standards Group. Since September 2000, the CIDOC CRM has been in development as an ISO standard.
‘The CIDOC CRM is intended to promote a shared understanding of cultural heritage information by providing a common and extensible semantic framework to which any cultural heritage information can be mapped. It is intended to be a common language for domain experts and implementers to formulate requirements for information systems and to serve as a guide for good practice in conceptual modelling. In this way, it can provide the semantic glue needed to mediate between different sources of cultural heritage information, such as that published by museums, libraries and archives.’
http://cidoc.ics.forth.gr/what_is_crm.html
CHIOS - Cultural Heritage Interchange Ontology Standardization project
Since June 2001, the work of the CIDOC CRM Special Interest Group has been supported by CHIOS, a two-year project which receives funding from the Fifth Framework IST Programme. The CHIOS consortium forms an integral part of the CIDOC CRM Special Interest Group which, by organising shared meetings, represents the interests and requirements of the cultural heritage community to the ISO Working Group (ISO/TC46/SC4/WG9).
http://cidoc.ics.forth.gr/chios_iso.html
SEMANTIC WEB TERMS AND READING LIST A-X
Compiled by Guntram Geser and Michael Steemson
Genesis – The Creation: Stars and Fishes
In the Summary we have provided information boxes on projects, but not on the many Semantic Web standards, technologies etc. mentioned in the Forum discussion. The projects included are only a small fraction of many ongoing activities and are related mainly to the cultural and scientific heritage community. Links to many important Semantic Web development projects can be found at their community portal, http://www.semanticweb.org
The following guide points to resources and readings on terms mentioned in the Forum Summary. It is not intended to provide a comprehensive list of Semantic Web materials. Rather, it represents different entry points and levels to this topic.
Annotation & Authoring
‘How to annotate and have fun’, http://annotation.semanticweb.org/faq/faq2, and other papers. See also http://annotation.semanticweb.org/tools, Institute for Applied Informatics and Formal Description Methods, University of Karlsruhe, Germany.
The Annotea Project, a ‘Live Early Adoption and Demonstration (LEAD)’ project of the World Wide Web Consortium (W3C): collaboration environment with shared annotations, www.w3.org/2001/Annotea
CIDOC CRM
The CIDOC Conceptual Reference Model was an important reference point in the Forum discussion (see also the information box on the CIDOC CRM in the Summary). CIDOC, the Comité International pour la Documentation, is part of the International Council for Museums (ICOM). Its Web site can be found at http://cidoc.ics.forth.gr
A report on CIDOC’s work on the CRM is provided in ‘The CIDOC Conceptual Reference Model: A Standard for Communicating Cultural Contents’ by Nick Crofts, Martin Doerr and Tony Gill, in Cultivate Interactive, Issue 9, February 2003, http://www.cultivate-int.org/issue9/chios
See also their tutorials at http://cidoc.ics.forth.gr/tutorials.html
DAML - DARPA Agent Markup Language
The Defense Advanced Research Projects Agency (DARPA) is the central research and development organisation for the US Department of Defense. Its DAML programme is developing a language and tools to facilitate Semantic Web concepts. http://www.daml.org
‘Why Use DAML?’, Adam Pease, white paper, Teknowledge, 10 April 2002, http://www.daml.org/2002/04/why.html
DAML+OIL
DAML+OIL is a product of the DARPA Joint United States/European Union ad hoc Agent Markup Language Committee. The committee created a language with the best features of SHOE, DAML, OIL and several other markup approaches. It is a Web ontology language (latest release March 2001) expected to provide a basis for future Web standards for ontologies. See http://www.daml.org/2001/03/daml+oil-index.html
OIL - Ontology Inference Layer
OIL is a proposal for a Web-based representation and inference layer for ontologies. It uses a layered approach to defining a standard ontology language. Each layer adds functionality and complexity to the previous layer. This is done such that machines that can only process a lower layer can still partially understand high-level ontologies. See http://www.ontoknowledge.org/oil
A white paper on OIL functions: ‘An informal description of Standard OIL and Instance OIL’ by the On-To-Knowledge group, led by the Department of Computer Science, University of Manchester, UK, 28 November 2000, www.ontoknowledge.org/oil/downl/oil-whitepaper.pdf
Ontologies
Ontology research: Laboratory for Applied Ontology (LOA), Institute of Cognitive Sciences and Technology (ISTC), http://www.ladseb.pd.cnr.it/infor/ontology/ontology.html
Dieter Fensel, Ontologies: A Silver Bullet for Knowledge Management and Electronic Commerce, New York: Springer, 2001
‘Ontology Infrastructure for the Semantic Web’, includes detailed subject bibliography, WonderWeb Project, Department of Computer Science, Victoria University of Manchester, UK, http://wonderweb.semanticweb.org/deliverables/documents/D15.pdf
Standard Upper Ontology (SUO): an upper ontology for data interoperability, information search and retrieval, automated inferencing and natural language processing. IEEE Standard Upper Ontology (SUO) Working Group, http://suo.ieee.org
OWL - Web Ontology Language
The Web Ontology Language is a semantic markup language for publishing and sharing ontologies on the World Wide Web. OWL is developed as a vocabulary extension of the Resource Description Framework (RDF) and is derived from the DAML+OIL Web Ontology Language. For the development of this language, see the documents of the Web Ontology (WebOnt) Working Group, http://www.w3.org/2001/sw/WebOnt
See also OWL Web Ontology Language Reference (W3C Working Draft, 31 March 2003) at http://www.w3.org/TR/owl-ref and OWL Web Ontology Language Guide (W3C Working Draft, 31 March 2003), http://www.w3.org/TR/owl-guide
For an understanding of the goals, requirements and usage scenarios for a Web ontology language, see ‘Web Ontology Language (OWL) Use Cases and Requirements’ (W3C Working Draft, 31 March 2003), http://www.w3.org/TR/webont-req
RDF - Resource Description Framework
See Cultural Heritage Semantic Web Example & Primer, pp. 32-34
Semantic Web
‘The Semantic Web’, Tim Berners-Lee with James Hendler and Ora Lassila, Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html
‘Enhanced Science and the Semantic Web’, J. A. Hendler, in Science magazine, Volume 299, Number 5606, 24 January 2003, pp. 520-521
‘Peer-to-Peer: The Infrastructure for the Semantic Web’, Stanford University. The Semantic Web as the next evolutionary step of the Internet, http://p2p.semanticweb.org
The Semantic Web Community Portal, SemanticWeb.org, currently operated by three research groups: the Onto-Agents and Scalable Knowledge Composition (SKC) Research Group at Stanford University, the Ontobroker-Group at the University of Karlsruhe, Germany, and the Protégé Research Group at Stanford University, http://www.semanticweb.org
The W3C Semantic Web Activity Statement explains the consortium’s plans in the areas of enabling standards (driven by the RDF Core and Web Ontology Working Groups), education and outreach (RDF Interest Group), as well as co-ordination and advanced development, http://www.w3.org/2001/sw/Activity
See also Kim Veltman’s warning of what he sees to be too narrow a definition of the Semantic Web, one that will not allow the historical dimension, the richness of cultural expression, the unique and the diversity of interpretations to be adequately dealt with. Cf. K. Veltman, Challenges for a Semantic Web (July 2002), http://www.cultivate-int.org/issue7/semanticweb/
SHOE - Simple HTML Ontology Extensions
SHOE was one of the first ontology-based markup languages developed for use on the World Wide Web. It is a small extension to HTML that allows Web page authors to annotate their Web documents with machine-readable knowledge. See http://www.cs.umd.edu/projects/plus/SHOE
SMIL - Multimedia on the Web
The Synchronized Multimedia Integration Language (SMIL, pronounced ‘smile’) enables authoring of interactive audiovisual presentations. SMIL is typically used for ‘rich media’/multimedia presentations which integrate streaming audio and video with images, text or any other media type. SMIL is an HTML-like language and may be written using a simple text editor. W3C Synchronized Multimedia, http://www.w3.org/AudioVideo
SPECTRUM-XML DTD
SPECTRUM (Standard Procedures for Collections Recording Used in Museums) was created by the mda (http://www.mda.org.uk). It is a guide to good practice for museum documentation that describes procedures for documenting objects and the processes they undergo, as well as the necessary information that needs to be recorded to support the procedures. For SPECTRUM, an XML Document Type Definition has been produced, which serves as a system-neutral interchange format for museum data.
‘SPECTRUM: The UK Museum Documentation Standard’ is available in its second edition; see http://www.mda.org.uk/spectrum.htm
For a description of the creation, structure and deployment of the SPECTRUM-XML DTD, see Bert Degenhart-Drenth, Building on the mda SPECTRUM-XML DTD for Collections Management Data Interchange, Museums and the Web Conference 2001, http://www.archimuse.com/mw2001/papers/degenhart/degenhart.html
Web Services
Although the concept of Web Services is not a new one, the definition of Web Services standards allows the wider use of these services. Currently three main protocols are used in the context of Web Services: UDDI (Universal Description, Discovery and Integration), a registry system to find resources and Web Services; WSDL (Web Services Description Language), an interface description language; and SOAP (Simple Object Access Protocol), the communication protocol for Web Services. All three protocols are based on XML.
For a description of the many ways in which XML can enhance Web Services, see http://www.w3.org/2002/ws/Activity
World Wide Web
Looking back as well as into the future: ‘Weaving the Web: The Original Design and Ultimate Destiny of the World Wide Web by Its Inventor’, Tim Berners-Lee with Mark Fischetti, Harper San Francisco, 1999
XML - eXtensible Markup Language
See Cultural Heritage Semantic Web Example & Primer, pp. 27-30
For the work done at W3C within the XML activity, see XML Working Groups, http://www.w3.org/XML; the XML Specifications can be found at http://www.w3.org/XML/Core
‘The bane of my existence is doing things that I know the computer could do for me.’ - Dan Connolly, The XML Revolution, October 1998, http://www.nature.com/nature/webmatters/xml
‘The Semantic Web, as it is advocated by people like Tim Berners-Lee and James Hendler, does not take enough advantage of the experience built up in knowledge engineering and conceptual modelling. There is this anarchistic idea of the Web as a place where everyone can do his or her own thing. I have no problem with that; a lot of people are able to find what they want on the Web. But if you want real interoperability, with search engines that can grasp the intended meaning of information, that approach falls short. To create a real Semantic Web we have to develop and use well-founded generic ontologies, based on linguistics and logics.’
Nicola Guarino has clear views on the Semantic Web and its development. He is a senior researcher at the Institute for Cognitive Sciences and Technologies in Italy, where he leads the Laboratory for Applied Ontology. Since 1991 he has played an active role in the Artificial Intelligence community in promoting the interdisciplinary study of ontological foundations of knowledge engineering and conceptual modelling. Guarino: ‘In our Laboratory the focus is on content and not so much on representation. The use of ontologies is unavoidable when referring to content. People do it implicitly all the time when they are communicating and trying to understand each other. If we want machines to understand each other, in other words real interoperability, we need to make these ontologies explicit in an unambiguous way.’
An ontology is a hierarchical description of the relations between concepts in a certain domain, plus an unambiguous description of the concepts themselves. As they are created for a certain domain, ontologies often fail to be interoperable because of the ambiguity that results from the use of the same terms for different concepts (and vice versa) between different domains. The term ‘net’, for instance, has quite a different meaning for Web designers and fishermen. That is why there is a need for well-founded generic ontologies. An example of a term that a generic ontology has to pin down is ‘part’, which can have different meanings both within a domain and between domains. For instance, the violist plays a part in the orchestra. His finger is part of him. Can his finger be part of the orchestra? According to Guarino, this is a genuine ontological problem that can only be solved by giving an unambiguous meaning to the term ‘part’.
Another example cited by Guarino is the term ‘in’. What exactly are you describing when you say the spoon is in the cup? Does it mean that the spoon is totally embedded in the cup, or is it only partly in the cup? Guarino: ‘These examples seem trivial, but if you want real interoperability between different knowledge domains you will have to prevent the problems that come with the ambiguity of day-to-day language.’
In this respect Guarino thinks it is a drawback that computer science curricula scarcely ever contain an introduction to ontological foundations of conceptual modelling. ‘Students learn all about Java, HTML and C++, and name all the other languages, and they also learn how to use these. But when they graduate they hardly know a thing about formal ontology. I really think people should know more about the work on ontology that has been done in philosophy. It is certainly not much harder to acquire than, say, studying differential equations or learning how to use Java.’
It seems as if it is an enormous job to develop well-founded generic ontologies, but it is not as enormous a task as it appears. Guarino: ‘I would say that a few dozen would get you on the way nicely. But you have to take the fundamental route. At the moment, development of the Semantic Web is driven by the need for short-term results. Hence interoperability is realised by putting the right tags on the information. That is not what I call semantics; that is syntax. XML and RDF are very useful for this, but they fall short when you want to create a real Semantic Web.’
Laboratory of Applied Ontology, http://ontology.ip.rm.cnr.it
SEMANTIC WEB SHOULD BE BASED ON WELL-FOUNDED ONTOLOGIES
AN INTERVIEW WITH NICOLA GUARINO, LABORATORY OF APPLIED ONTOLOGY, ITALY
By Joost van Kasteren
The Semantic Web, according to a statement of its well-known advocate Tim Berners-Lee, is a vision of ‘a distributed machine which should function so as to perform socially useful tasks’.1 This machine should allow intelligent software agents to understand semantic relationships between Web resources in order to seek relevant information and perform transactions for humans.
Contrasted with the existing human-readable Web, the Semantic Web is envisaged as a Web of machine-readable data that will be based on ‘languages for expressing information in a machine processable form’.2 Key to an understanding of the Semantic Web, therefore, is how these languages function: how information is expressed in order that computers can automatically process Web sources and assist in making the Web more useful for humans. The aim of this chapter is to provide an overview of the Semantic Web concept by describing its general architecture, i.e. the interplay of its languages.
The chapter has two interrelated parts. Part 1 describes a Finnish project that strives to build the foundations for the “Finnish Museums on the Semantic Web” (FMS), a future semantic museum portal. This part consists of the information boxes on the following pages, which briefly describe the necessary elements and steps in the set-up of the FMS system. It is recommended to start by reading this description (see also graphic 3 on page 36, which provides an overview of the set-up of the FMS system). It should be helpful in gaining a general understanding of how semantic interoperability of, and new ways of interacting with, semantically marked-up cultural heritage information can be realised.
Part 2, the texts below the information boxes, is a primer that explains terms used in part 1, which represent core elements of the Semantic Web architecture, as well as providing illustrative examples. The explanations are not intended to give in-depth definitions of these elements; such definitions are provided in the relevant W3C specifications. The examples have been kept as simple as possible but build on each other. In this way we will develop a (fictitious) Website, http://www.m-i.org, that provides semantically enhanced access to such marvellous medieval images as the ones we have used to illustrate this Thematic Issue.
How to Make Collection Metadata of Museums Semantically Interoperable on the Web - The “Finnish Museums on the Semantic Web” (FMS)
The Semantic Web concept is visionary, and there are dedicated people, also in the heritage sector, who are trying to make it a reality. In our example, a group of researchers and technology developers who work at the University of Helsinki and the Helsinki Institute for Information Technology are translating the Semantic Web vision for a future semantic museum portal.
The group’s two-year project will run until spring 2004 and is being carried out in co-operation with, and with funding from, major organisations including Espoo City Museum, Helsinki University Museum, National Board of Antiquities, Nokia, TietoEnator and the National Technology Agency (TEKES).
A CULTURAL HERITAGE SEMANTIC WEB EXAMPLE & PRIMER
By Guntram Geser
1 Tim Berners-Lee, Interpretation and Semantics on the Semantic Web (1998), http://www.w3.org/DesignIssues/Interpretation.html
2 Tim Berners-Lee, Semantic Web Road Map (1998), http://www.w3.org/DesignIssues/Semantic.html
3 Robert DuCharme, http://lists.xml.org/archives/xml-dev/200211/msg00190.html
The major goals of the project are to make collection metadata, which stem from heterogeneous databases, semantically interoperable on the Web, and to provide facilities for semantic browsing and searching in the combined knowledge base of the participating museums.
The project’s vision is called “Finnish Museums on the Semantic Web” (FMS), and its architecture allows for all Finnish museums to join in. However, in an approach of starting small but ambitious, the project is at present using the collection databases of two museums, the Espoo City Museum and the National Museum of Finland. Furthermore, the implementation is currently restricted to only one part of the collections: textiles.
In order to reach the FMS system’s goal of making the museums’ metadata semantically interoperable on the Web, the data must be harmonised on the syntactic and semantic level. For this harmonisation the eXtensible Markup Language (XML) and the Resource Description Framework (RDF) are being used, of which RDF is the key language for achieving semantic interoperability of the heterogeneous sets of metadata.
RDF and Metadata - A Natural Fit
An observer of the diffusion of the Resource Description Framework (RDF) into various domains, Robert DuCharme, has commented: ‘I still find it a little ironic that while RDF has gotten so much publicity as a technology for warm and fuzzy AI (Artificial Intelligence) pie-in-the-sky technology, it’s gotten most of its traction in the mundane world of metadata.’3
Yet given the importance of metadata for the Semantic Web vision in general, it does not come as a surprise that metadata of key information communities belong to the first of RDF’s intended uses. RDF seems to gain momentum in particular among the library and other communities that use Dublin Core.
The current W3C RDF Primer (Working Draft, 23 January 2003), edited by Frank Manola and Eric Miller, labels RDF as ‘an ideal representation for Dublin Core information’ and describes Dublin Core as one of its ‘RDF in the field’ examples. (Cf. http://www.w3.org/TR/rdf-primer)
At the Dublin Core Metadata Initiative (DCMI), ‘Expressing Simple Dublin Core in RDF/XML’ was announced as a DCMI Recommendation in October 2002, ‘the first in a series of recommendations for encoding Dublin Core metadata using mainstream Web technologies’, i.e. XML/RDF/XHTML. ‘Expressing Qualified Dublin Core in RDF/XML’ is currently a Proposed Recommendation. Cf. http://dublincore.org/groups/architecture
Syntactic Transformation 1: Creating the XML Documents
In the FMS system, the eXtensible Markup Language (XML) is used as the data transfer format. This transfer format enables the system to make use of the data originally stored in the museums’ heterogeneous collection databases. Therefore each museum participating in the FMS initiative provides the relevant collection data as an XML document repository.
In a process of syntactic harmonisation, the data from a museum’s collection database are retrieved and transformed to an XML format conforming to the XML Schema of the FMS initiative.
The data to be published are read from the database through a ‘view’, which helps create the XML format. The view is a queryable interface, a virtual table that results from an SQL query which may join multiple tables of the database. Through the view, the data are queried so that the rows of the tables are grouped by collection items. For each item, the set of rows is combined into a single XML document.
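The grouping step described in the info box can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the FMS implementation: the rows are plain tuples standing in for the result of the view, the function name rows_to_documents is hypothetical, and the element names and namespace are those of the fictitious m-i.org example used in this primer.

```python
import xml.etree.ElementTree as ET

# Fictitious namespace taken from the primer's m-i.org example.
MI_NS = "http://www.m-i.org/images"
ET.register_namespace("mi", MI_NS)

# Assumed column order of the view: item_id first, then the descriptive fields.
FIELDS = ("type", "subject", "iconclass", "creator", "manuscript", "place", "year")


def rows_to_documents(rows):
    """Group view rows by item_id and build one XML document per item."""
    roots = {}
    for item_id, *values in rows:
        if item_id not in roots:
            roots[item_id] = ET.Element(f"{{{MI_NS}}}image", {"image_id": item_id})
        root = roots[item_id]
        for field, value in zip(FIELDS, values):
            tag = f"{{{MI_NS}}}{field}"
            # A join can repeat the same value on several rows; keep it once.
            already_there = any(child.text == value for child in root.findall(tag))
            if value is not None and not already_there:
                ET.SubElement(root, tag).text = value
    return {
        item_id: ET.tostring(root, encoding="unicode")
        for item_id, root in roots.items()
    }


if __name__ == "__main__":
    view_rows = [
        ("image5kb78d38i", "Column Miniature", "Eve emerges from Adam's body",
         "71A3421", "Alexander Master", "Historic Bible", "Utrecht", "circa 1430"),
    ]
    for xml_text in rows_to_documents(view_rows).values():
        print(xml_text)
```

Grouping in application code rather than in SQL is a choice made here for brevity; a real system might equally let the database do the grouping.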
XML
XML is a markup language for describing data. It is a language created to allow anyone to design the structure of their own documents. An XML document contains text that consists of markup in the form of tags and plain text between them, the latter being just pure information (for example: <creator>Alexander Master</creator>). XML tags are not predefined; everyone can define his or her own tags.
FMS Documents
A short presentation is provided in Eero Hyvönen et al., Cultural Semantic Interoperability on the Web: Case Finnish Museums Online, http://iswc2002.semanticweb.org/posters/hyvonen_a4.pdf; for detailed descriptions see Vilho Raatikka, Eero Hyvönen, Ontology-based Semantic Metadata Validation, and Hyvönen, Eero et al., Semantic Interoperability on the Web: Case Finnish Museums Online. Both texts can be found in Towards the Semantic Web and Web Services: Proceedings of the XML Finland 2002 Conference, http://www.cs.helsinki.fi/u/eahyvone/xmlfinland2002/ProceedingsXML2002-final.pdf
ITEM_VIEW (columns of the view): item_id, type, subject, iconclass, creator, manuscript, place, year
XML shares the syntax and bracketed tags of the well-known HyperText Markup Language (HTML), but XML serves a different goal. While HTML is used to define the layout of pages on the WWW, XML is used to define the content of documents, for example to specify that an area of text is the name of a creator.
XML allows for creating markup (e.g. <creator>) that seems to carry some semantics. However, for a computer, a tag like <creator> carries as much semantics as a tag like <H1>. A computer simply does not know what a creator is and how the concept creator is related to other concepts (e.g. manuscript). For an XML processor, <H1> and <creator> or <manuscript> are all equally (and totally) meaningless. XML is all about describing data; on its own it does not do anything. There needs to be a processing program that uses the markup to interpret the various elements.
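The point can be made concrete with a small Python sketch. It is an illustration only: the functions describe and interpret and the HANDLERS table are hypothetical names introduced here. A generic parser reports tag names and text and nothing else; any ‘meaning’ comes from the processing program, which in this sketch happens to know what to do with creator and knows nothing about H1.

```python
import xml.etree.ElementTree as ET


def describe(xml_text):
    """Return all a generic XML parser 'knows': tag names and their text."""
    return [(el.tag, (el.text or "").strip()) for el in ET.fromstring(xml_text).iter()]


# The processing program supplies the interpretation. This table is a
# hypothetical stand-in for application logic; nothing in XML itself defines it.
HANDLERS = {
    "creator": lambda text: f"Created by {text}",
}


def interpret(xml_text):
    """Apply the application's handlers to the elements it recognises."""
    results = []
    for tag, text in describe(xml_text):
        handler = HANDLERS.get(tag)
        if handler is not None:
            results.append(handler(text))
    return results


if __name__ == "__main__":
    with_creator = "<record><creator>Alexander Master</creator></record>"
    with_h1 = "<record><H1>Alexander Master</H1></record>"
    print(describe(with_creator))   # same shape as the next line, only the label differs
    print(describe(with_h1))
    print(interpret(with_creator))  # the program, not the tag, provides the meaning
    print(interpret(with_h1))
```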
The graphic below illustrates the database rows to XML process as described in the info box. It provides a very simple example of an XML document that describes some data for one of the medieval column miniatures from the Koninklijke Bibliotheek, The Hague, which we were permitted to use for illustrating this Thematic Issue. It includes the Iconclass classification for this image: 71A3421 Eve emerges from Adam’s body (for the hierarchical path of this classification, see the section on ontologies).
Short explanations for the XML document (image5kb78d38i.xml) shown in the graphic below:
A well-formed XML document is one that conforms to the XML syntax rules, of which we would like to highlight the following:
(1) The document should begin with the XML declaration, which defines the XML version and the character encoding used in the document. In the example below we use <?xml version="1.0" encoding="ISO-8859-1"?>, i.e. the document conforms to the 1.0 specification of XML and uses the ISO-8859-1 (Latin-1/West European) character set.
(2a, 10) The XML document must contain a single tag pair to define a root element, in our example <mi:image> </mi:image>.
Database rows, grouped by item (item data from database rows):
image5kb78d38i, ‘Column Miniature’, ‘Eve emerges from Adam’s Body’, ‘71A3421’, ‘Alexander Master’, ‘Historic Bible’, ‘Utrecht’, ‘circa 1430’

Rows to XML process

XML document (image5kb78d38i.xml):
(1) <?xml version="1.0" encoding="ISO-8859-1"?>
(2) <mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">
(3)   <mi:type>Column Miniature</mi:type>
(4)   <mi:subject>Eve emerges from Adam’s body</mi:subject>
(5)   <mi:iconclass>71A3421</mi:iconclass>
(6)   <mi:creator>Alexander Master</mi:creator>
(7)   <mi:manuscript>Historic Bible</mi:manuscript>
(8)   <mi:place>Utrecht</mi:place>
(9)   <mi:year>circa 1430</mi:year>
(10) </mi:image>

Graphic 1: Database rows to XML process
(2b) Namespace: Since element names in XML are not fixed, name conflicts can occur when different documents use the same names describing different types of elements. To prevent such conflicts, a unique namespace should be defined using a Uniform Resource Identifier (URI). An XML namespace is a collection of names that are used as element types and attribute names (cf. http://www.w3.org/TR/REC-xml-names). Our namespace declaration in the start tag of the root element is xmlns:mi="http://www.m-i.org/images".
The namespace prefix mi (for medieval images) functions as a placeholder for the namespace name. It needs to show up in all element tags (e.g. <mi:type> </mi:type>).
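That the prefix is only a placeholder can be checked with a short Python sketch. It is an illustration under assumptions: the prefix img and the second namespace http://example.org/other are invented here for contrast, and expanded_names is a hypothetical helper. Python's standard parser reports each element under its expanded name, i.e. the namespace URI plus the local name, so two documents that use different prefixes for the same URI yield identical names, while the same local name in another namespace does not.

```python
import xml.etree.ElementTree as ET

# Same namespace, two different prefixes (the prefix "img" is invented here).
DOC_MI = (
    '<mi:image xmlns:mi="http://www.m-i.org/images">'
    "<mi:type>Column Miniature</mi:type></mi:image>"
)
DOC_IMG = (
    '<img:image xmlns:img="http://www.m-i.org/images">'
    "<img:type>Column Miniature</img:type></img:image>"
)
# Same local names, but a different (hypothetical) namespace.
DOC_OTHER = (
    '<x:image xmlns:x="http://example.org/other">'
    "<x:type>JPEG</x:type></x:image>"
)


def expanded_names(xml_text):
    """List element names as {namespace-URI}local-name, as the parser sees them."""
    return [el.tag for el in ET.fromstring(xml_text).iter()]


if __name__ == "__main__":
    print(expanded_names(DOC_MI))
    print(expanded_names(DOC_MI) == expanded_names(DOC_IMG))    # prefixes do not matter
    print(expanded_names(DOC_MI) == expanded_names(DOC_OTHER))  # namespaces do
```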
(2c) image_id="image5kb78d38i": This is the id attribute, which contains the unique identifier of the data source record.
(3-9) All other elements must be within the root element and can themselves have sub-elements (child elements), which must be properly nested within their parent element. Our elements do not have sub-elements.
Other syntax rules are, for example: all start tags must match end-tags; because XML tags are case sensitive (i.e. the tag <mi:Creator> is different from the tag <mi:creator>), they must also be written with the same case; all elements must have a closing tag; all attribute values must be within quotation marks (e.g. "image5kb78d38i").
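These rules are exactly what an XML parser enforces. As a minimal sketch (the helper name is_well_formed and the sample strings are ours, chosen to mirror the rules just listed), Python's standard parser can be used to test whether a text is well-formed: it raises a ParseError as soon as one of the rules is broken.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the text parses as XML, False if a syntax rule is broken."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Sample strings mirroring the syntax rules discussed in the text.
EXAMPLES = {
    "well-formed": '<image image_id="image5kb78d38i"><creator>Alexander Master</creator></image>',
    "tag case differs": "<image><creator>Alexander Master</Creator></image>",
    "missing closing tag": "<image><creator>Alexander Master</image>",
    "unquoted attribute value": "<image image_id=image5kb78d38i></image>",
    "two root elements": "<image></image><image></image>",
}

if __name__ == "__main__":
    for label, text in EXAMPLES.items():
        print(f"{label}: {is_well_formed(text)}")
```

Note that well-formedness says nothing about whether the right elements are present; that is the task of validation against a schema.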
Syntactic Transformation 2: The XML Schema
In order to allow for syntactic harmonisation, the XML documents of the museums should conform to the XML Schema of the FMS initiative. Therefore the museums use the initiative’s XML Schema when they create their XML documents, validating them against the Schema. If the documents are valid, the process can continue to the semantic level.
XML Schema
The XML Schema defines the building blocks of an XML document, including:
- elements and attributes that can appear in a document
- which elements are child elements, as well as their order and number
- whether an element is empty or can include text
- the data types for elements and attributes
- as well as default and fixed values for elements and attributes
XML with an XML Schema is designed to be self-descriptive. One of the greatest strengths of XML Schema is that it allows for data typing. The most common data types are xs:string, xs:decimal, xs:integer, xs:boolean, xs:date, xs:time. In the example below, which is the XML Schema for the XML document (image5kb78d38i.xml) shown in graphic 1, we only use the data type xs:string. This data type is used for values that contain character strings.
(1) <?xml version="1.0"?>
(2a) <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
(2b)   targetNamespace="http://www.m-i.org/images"
(2c)   xmlns="http://www.m-i.org/images"
(2d)   elementFormDefault="qualified">
(3) <xs:element name="image">
(4)   <xs:complexType>
(5)     <xs:sequence>
(6)       <xs:element name="type" type="xs:string"/>
(7)       <xs:element name="subject" type="xs:string"/>
(8)       <xs:element name="iconclass" type="xs:string"/>
(9)       <xs:element name="creator" type="xs:string"/>
(10)      <xs:element name="manuscript" type="xs:string"/>
(11)      <xs:element name="place" type="xs:string"/>
(12)      <xs:element name="year" type="xs:string"/>
(13)    </xs:sequence>
(14)    <xs:attribute name="image_id" type="xs:string" use="required"/>
(15)  </xs:complexType>
(16) </xs:element>
(17) </xs:schema>

Short explanations
(1) The XML declaration, which states that the document conforms to the 1.0 specification of XML.
(2a) Determines that the elements and data types that are used to construct the schema come from the W3C’s XML Schema namespace. Consequently, each of the elements and data types in the schema has the prefix xs:, which identifies them as belonging to the vocabulary of the XML Schema language rather than the vocabulary of our (fictitious) organisation M-i.org.
(2b) Indicates that the elements defined by this schema come from our http://www.m-i.org/images namespace.
(2c) Is our default namespace.
(2d) Demands that any elements used by an XML document which were declared in this schema must be namespace qualified.
(3, 16) The parent element for the data typing of the image descriptions we provide at http://www.m-i.org/images. It is (4) defined as an xs:complexType, i.e. it contains child elements (6-12), which are (5, 13) surrounded by an xs:sequence element that defines an ordered sequence of these elements.
(6-12) The child elements, which in our example are simple types because they do not contain other elements. They define various elements of our XML documents, e.g. “creator”, to be of the data type xs:string.
(14) Furthermore, the Schema determines that for the element xs:element name="image" there is a required attribute image_id of the data type xs:string (for example "image5kb78d38i").
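To show what validating a document against these declarations amounts to, the following Python sketch checks the same constraints by hand. It is emphatically not an XML Schema processor: the function check_image_document and its messages are hypothetical, and the rules are hard-coded from the schema explanations above (root element in the m-i.org namespace, the seven children in sequence order, text-only content, required image_id attribute). In practice a real XML Schema validator would read the schema itself.

```python
import xml.etree.ElementTree as ET

MI_NS = "http://www.m-i.org/images"  # fictitious namespace from the primer
SEQUENCE = ("type", "subject", "iconclass", "creator", "manuscript", "place", "year")

VALID = (
    '<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">'
    "<mi:type>Column Miniature</mi:type>"
    "<mi:subject>Eve emerges from Adam's body</mi:subject>"
    "<mi:iconclass>71A3421</mi:iconclass>"
    "<mi:creator>Alexander Master</mi:creator>"
    "<mi:manuscript>Historic Bible</mi:manuscript>"
    "<mi:place>Utrecht</mi:place>"
    "<mi:year>circa 1430</mi:year>"
    "</mi:image>"
)


def check_image_document(xml_text):
    """Hand-written checks mirroring the schema; returns a list of problems."""
    root = ET.fromstring(xml_text)
    if root.tag != f"{{{MI_NS}}}image":
        return ["root element is not image in the expected namespace"]
    problems = []
    # Mirrors the attribute declaration with use="required".
    if root.get("image_id") is None:
        problems.append("required attribute image_id is missing")
    # Mirrors xs:sequence: the children must appear in exactly this order.
    expected = [f"{{{MI_NS}}}{name}" for name in SEQUENCE]
    if [child.tag for child in root] != expected:
        problems.append("child elements do not match the declared sequence")
    # Mirrors type="xs:string": simple content, no nested elements.
    if any(len(child) > 0 for child in root):
        problems.append("child elements must contain text only")
    return problems


if __name__ == "__main__":
    print(check_image_document(VALID))
```

An empty list means that all of the hand-written checks passed.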
Major Benefits of XML
In the context of the Semantic Web, XML provides an interoperable syntactical foundation upon which solutions to the issues of representing relationships and meaning can be built. We also want to highlight the many benefits of XML that are adding to its rapid uptake in the first place and might, in the longer term, be supportive in realising the Semantic Web vision on a broader scale.
XML is one of the most important standards developments in recent years. It is an international, universal, non-system and non-application specific data exchange standard. XML is international because it employs Unicode. This means that there is no restriction to the western alphabet; Arabic, Chinese, Greek, Hebrew, Thai etc. can be easily integrated.
XML is non-system specific because it is an open standard set by the World Wide Web Consortium (W3C). As such there is no owner of XML. All the major software suppliers support it; it can be used on any computing platform, from Windows and MacOS to Linux. This makes it easier for organisations to change systems or combine different systems.
XML is also non-application specific, i.e. it can be used in various applications such as data exchange, data harvesting, Web site management etc. XML is gaining ever-wider acceptance in many application domains, including in particular the cultural heritage community.
Bear in mind also that the major collection management software producers have implemented support for XML in their systems, enabling, for example, the integration of data from different collections and their combination over the Web.
XML allows for multi-channel publishing, i.e. with XML it is easy to produce different products or services from digital cultural heritage assets. Once the data are structured in XML, they can be displayed across a variety of media using an associated style sheet that contains the display information.
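The idea of multi-channel publishing can be sketched as follows. The sketch rests on an assumption that should be stated plainly: two small Python functions, to_html and to_text (names invented here), stand in for the style sheets mentioned above, which in practice would be written in a style sheet language rather than in program code. The point is only that one XML source feeds several outputs without being changed.

```python
import xml.etree.ElementTree as ET
from html import escape

MI = "{http://www.m-i.org/images}"  # fictitious namespace from the primer

RECORD = (
    '<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">'
    "<mi:creator>Alexander Master</mi:creator>"
    "<mi:year>circa 1430</mi:year>"
    "</mi:image>"
)


def _fields(xml_text):
    """Read (local name, text) pairs from the single XML source."""
    root = ET.fromstring(xml_text)
    return [(child.tag.replace(MI, ""), child.text or "") for child in root]


def to_html(xml_text):
    """Channel 1: an HTML list, e.g. for a Web page."""
    items = "".join(
        f"<li>{escape(name)}: {escape(text)}</li>" for name, text in _fields(xml_text)
    )
    return f"<ul>{items}</ul>"


def to_text(xml_text):
    """Channel 2: plain text, e.g. for a printed label or a small display."""
    return "\n".join(f"{name}: {text}" for name, text in _fields(xml_text))


if __name__ == "__main__":
    print(to_html(RECORD))
    print(to_text(RECORD))
```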
Finally, XML can be used to create new languages. For example, the Wireless Markup Language (WML), which is used to mark up Internet applications for handheld devices, is written in XML.
Ontological concepts
The goal of the FMS project is to make metadata of the museums’ textiles collections semantically interoperable on the Web. In order to achieve such interoperability, an ontology is being designed that describes the common (lower-level) ontological concepts in this domain of knowledge.4
Ontology
In the Semantic Web architecture, the semantic relationships are not embedded but explicitly represented by an ontology, or rather an interrelated set of ontologies. In fact, the wide array of information residing on the Web and the perceived need to make it more machine-processable have acted as a strong impetus for the development of ontology languages.
Yet what is an ontology? An abstract definition of an ontology is that it describes a formal, shared conceptualisation of a particular domain of interest, for example cultural heritage objects held in art museums. In particular, an ontology allows for constraining, expressing and analysing the intended meaning of the shared vocabulary of concepts and relations in a domain of knowledge.5
If these concepts and relations are formalised to a high degree, the domain has at hand a major building block for developing semantically aware information systems. With Semantic Web technologies the domain ontology can be made available on the network, cross-referenced with upper-level and other domain ontologies,6 and remote applications or intelligent software agents can refer to it when they interact to provide a certain information service.
4 The available documents (in English) on the FMS initiative state that their ontology is being created using RDF Schema (RDFS). To develop a fully fledged ontology, advanced languages such as DAML+OIL or Web Ontology Language (OWL) would be required.
5 For more elaborate and formal descriptions see Tom Gruber, What is an Ontology? (1995), http://www-ksl.stanford.edu/kst/what-is-an-ontology.html; Nicola Guarino, Ontology-Driven Conceptual Modelling, part 1-3 (2002), http://ontology.ip.rm.cnr.it/Tutorials
The degree of formalisation of concepts and their relations varies considerably between different domains of knowledge. At the lower end one finds lexicons and simple taxonomies (i.e. ordered classification systems where terms are related hierarchically). At the middle level one might place thesauri, i.e. controlled vocabularies that are structured to show relationships between terms and concepts and, for example, allow for retrieving them from a database. At the high end of formalisation of knowledge there are axiomatised logic theories. Such theories include rules to ensure the well-formedness and logical validity of statements expressed in the language of the scientific discipline.
In the cultural heritage sector, a powerful IT-supported example of a hierarchical classification system is Iconclass. This supports the documentation of images, in particular art historical images, by providing a systematic collection of 28,000 ready-made definitions of objects, persons, events, situations and abstract ideas that can be the subject of an image. The definitions consist of an alphanumeric classification code and its textual correlate.7
For example, for the image we have described in XML in graphic 1, the Iconclass definition is 71A3421 Eve emerges from Adam’s body. The Medieval Illustrated Manuscripts Website of the Koninklijke Bibliotheek, The Hague, has an Iconclass Browser in place8 that provides the hierarchical path for this concept in the classification system.
An outstanding example of a controlled vocabulary is the Art & Architecture Thesaurus (AAT), one of the Getty Research Institute’s Vocabulary Databases. It is a structured vocabulary of more than 125,000 terms and other information about concepts that are used for describing fine art, architecture, decorative arts, archival materials and material culture.9
The W3Crsquos Resource Description Framework
(RDF) Schema Specification 10 in a section onits scope mentions concept navigation and stateslsquoThesauri and library classification schemes arewell known examples of hierarchical systems forrepresenting subject taxonomies in terms of therelationships between named conceptsThe RDFSchema specification provides sufficient resourcesfor creating RDF models that represent the logicalstructure of thesauri (and other library classificationsystems)rsquo10
Yet, for realising a full-blown cultural heritage ontology for the Semantic Web, there are currently limitations on both sides. On the one hand, hierarchical classification systems and structured vocabularies do not lend themselves easily to rich inter-linking of conceptual 'trees'.
A major step further in this direction is the CIDOC object-oriented Conceptual Reference Model (CRM).¹¹ This provides an ontology of 81 classes and 130 properties, which describes in a formal language concepts and relations relevant to the documentation of cultural heritage.
On the other hand, RDF Schema has limitations when it comes to expressing complex ontological relationships. New languages based on description logics are being developed. These include DAML+OIL and the upcoming Web Ontology Language (OWL), which are capable of fully describing ontologies.
Also worth highlighting is that tools for ontology building are proliferating at the present time. These ontology editors need to be carefully assessed, as their capabilities differ considerably.¹²
Iconclass path for 71A3421:
7 Bible
71 Old Testament
71A Genesis: from the creation to the expulsion from paradise, and later years of Adam and Eve
71A3 creation of man; the Garden of Eden (Genesis 1:26-2)
71A34 creation of Eve
71A342 Eve is fashioned from Adam's rib
71A3421 Eve emerges from Adam's body
6 Upper-level ontologies describe the basic concepts and relationships invoked when information about any domain is expressed in natural language.
7 For in-depth information, see the official Iconclass Website, http://www.iconclass.nl
8 Medieval Illustrated Manuscripts Website, http://www.kb.nl/kb/manuscripts/browser. The subject access system for the Website was conceived by Mnemosyne Partners, building on the Iconclass classification system and technologies. See the valuable information they provide at httpwwwmnemosyneorg businessmsstempexhtml
9 A detailed description of the AAT is provided at http://www.getty.edu/research/tools/vocabulary/aat/about.html
10 http://www.w3.org/TR/2000/CR-rdf-schema-20000327
11 See the information box at the end of the Forum discussion and the sources mentioned in the Semantic Web Terms and Reading List.
12 Michael Denny: Ontology Building: A Survey of Editing Tools (06-11-2002), http://www.xml.com/pub/a/2002/11/06/ontologies.html
Semantic Transformation 1: The RDF Data Model
Progressing towards semantic interoperability, the metadata in the XML documents are now transformed into RDF statements corresponding to the RDF data model. With these so-called RDF 'triples', the XML metadata elements are mapped to the RDF classes and properties which are defined by the RDF Schema of the FMS initiative (see section 3).
'XML is nothing more than a way to standardize data formats. This is not to underplay XML's importance. A data-format standard makes all of the more glamorous technologies possible, and RDF is the leading example of the benefit that comes once the data format has been standardized. Many proclaim that RDF is really the XML's killer app, and with good reason. Despite all this, RDF remains somewhat obscure. This is mainly because, at its core, RDF is very abstract, very dry, and very academic.'
Uche Ogbuji: An introduction to RDF (2000)
http://www-106.ibm.com/developerworks/library/w-rdf/?dwzone=xml
RDF Data Model
In order to make Web resources semantically interoperable, we need resources that provide machine-understandable information about themselves. In the Semantic Web architecture, these statements are built by using the Resource Description Framework (RDF).
RDF defines a data model for statements describing typed relationships between uniquely identified resources. RDF distinguishes between:
| resources: familiar examples are a Web page, an electronic document or a digital image, but in RDF also entities that are not 'network retrievable', e.g. museums, curators or bound medieval manuscripts, can be resources;
| properties: these identify a specific aspect, characteristic, attribute or relation used to describe the resource;
| statements: these associate a value for a named property with the resource.
Hence, RDF provides a model for describing relationships between resources in terms of named properties and values. The RDF data model intrinsically supports only binary relations. Its base element is the 'triple', which takes the form of subject, predicate, object: a resource (the subject) is linked to another resource (the object) through an arc labelled with a third resource (the predicate). The semantics of a triple clearly depends on the property used as predicate.
A convenient way to visualise this is to draw nodes for subject and object and an arrow between them for the predicate (see graphic 2). In this labelled directed graph, subject and predicate (property) are Uniform Resource Identifiers (URIs), and the object is either a URI or a literal (which is drawn as a box). Everything in RDF can be represented by a graph with nodes and arcs, and the data model allows for using the same URI as a node and as an arc label. To represent RDF statements in a machine-processable way, RDF builds on XML. With RDF/XML, a specific XML markup language, RDF information can be represented and exchanged between machines.
Graphic 2: RDF Data Model

Triple 1
Subject: http://www.m-i.org/schemas/images#ColumnMiniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Miniature

Triple 2
Subject: http://www.m-i.org/schemas/images#Miniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Image

With these two triples we state that ColumnMiniature is a subclass of Miniature, and that Miniature is a subclass of Image.

The predicate in our statements is the rdfs:subClassOf property, which is predefined in the RDF Schema namespace http://www.w3.org/2000/01/rdf-schema#

The classes Image, Miniature and ColumnMiniature would also need to be defined in our RDF Schema namespace http://www.m-i.org/schemas/images#

For details on how to use RDF Schema for defining your domain ontology, see section 3. In a nutshell, RDF Schema is a 'higher-level' language which is itself defined using RDF.
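As a rough illustration of the data model, the two subclass triples of graphic 2 can be written as plain Python tuples and serialised one statement per line in N-Triples style. This is a minimal sketch and not the serialisation used in the FMS project, which builds on RDF/XML. The `Literal` class and the helper names are hypothetical, and the '#' separator in the `MI` namespace is an assumption.

```python
from dataclasses import dataclass

RDFS = "http://www.w3.org/2000/01/rdf-schema#"
MI = "http://www.m-i.org/schemas/images#"  # assumption: '#' separates namespace and name


@dataclass(frozen=True)
class Literal:
    """A literal object value (drawn as a box in the graph)."""
    value: str


def term(t) -> str:
    """Serialise one node or arc label: URIs in angle brackets, literals quoted."""
    if isinstance(t, Literal):
        escaped = t.value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
        return f'"{escaped}"'
    return f"<{t}>"


def to_ntriples(triples) -> str:
    """One 'subject predicate object .' line per triple."""
    return "\n".join(f"{term(s)} {term(p)} {term(o)} ." for s, p, o in triples)


# The two statements of graphic 2: (subject, predicate, object).
triples = [
    (MI + "ColumnMiniature", RDFS + "subClassOf", MI + "Miniature"),
    (MI + "Miniature", RDFS + "subClassOf", MI + "Image"),
]

print(to_ntriples(triples))
```

Keeping each statement as a three-element tuple mirrors the binary nature of the model: anything more complex has to be broken down into several triples.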
Semantic Transformation 2: Creating and Validating the RDF Statements
In the mapping process, an editor tool (in the FMS case, a tool developed by the project team, called Meedio) receives as input the XML documents and assists in transforming them into semantically valid RDF statements (instance descriptions). The tool serves as an instance editor, which provides a convenient way of finding and selecting from the XML metadata elements correct instance values for a particular property. The editor tool also serves as a semantic metadata validator. When the museum cataloguer saves the set of RDF statements corresponding to an XML document, the semantics of these statements are validated against the property constraints of the FMS ontology. The result of a successful mapping and validation process is a unique set of RDF triples, called the RDF card.
RDF Mapping Rules
When RDF is used to define the meaning of XML metadata elements, a set of mapping rules is created. A mapping rule is a template of RDF triples where XPath expressions are used to identify the actual element values. XPath is a language for addressing parts of an XML document and was designed for use in XML parsing software (XSLT, XPointer and others).
When applying such a rule to an XML document, the XPath expressions are instantiated with matching element values. If the rule matches, the RDF template evaluates to a set of RDF triples where XPath expressions are substituted by the corresponding values of the XML elements.
For example, by applying the template rule
<image5kb78d38itype mi:hasCreator image5kb78d38icreator>
to the XML document described in the section on XML, the following result would be obtained:
<Image Miniature mi:hasCreator 'Alexander Master'>
mi:hasCreator is an example of an RDF property. Such properties are explained in section 3, RDF Schema.
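The idea of a mapping rule can be sketched with Python's standard library, whose ElementTree module supports a limited subset of XPath. The XML fragment, the element names and the rule layout below are hypothetical stand-ins, since the FMS rule syntax and the original XML document are not reproduced here. The sketch only shows the principle: path expressions in a triple template are replaced by the element values they match.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified metadata record (not the primer's actual XML document).
XML_DOC = """
<image id="example-1">
  <type>Miniature</type>
  <creator>Alexander Master</creator>
</image>
"""

# Hypothetical rule: (path for the subject value, predicate, path for the object value).
RULE = ("./type", "mi:hasCreator", "./creator")


def apply_rule(rule, root):
    """Instantiate the triple template against one XML record.

    Returns a list with one triple if both paths match, otherwise an empty
    list (the rule does not apply to this record).
    """
    subject_path, predicate, object_path = rule
    subject = root.findtext(subject_path)
    obj = root.findtext(object_path)
    if subject is None or obj is None:
        return []
    return [(subject, predicate, obj)]


root = ET.fromstring(XML_DOC)
print(apply_rule(RULE, root))
```

A rule that finds no matching element simply yields no triples, which is one plausible way for an editor tool to signal that the cataloguer has to supply the value by hand.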
Note: Due to the limited space permitted, we do not address issues of term mapping. This is an important aspect of the mapping and validation process carried out in the FMS project. Working with metadata from different museums, they need to deal with partly different terminologies. Their technical solution to synonymous terms (i.e. different terms referring to the same concepts) is to attach synonym sets to the FMS ontology classes. With situations where polysemous terms occur (i.e. the same terms refer to different concepts), the editor tool cannot cope, and the cataloguer needs to select the correct interpretation.
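The distinction drawn in this note can be illustrated with a small lookup table. The class names and terms below are invented for the example and are not taken from the FMS ontology. A term that appears in exactly one synonym set is resolved automatically, while a term that appears in several sets is polysemous and is handed back for a human decision.

```python
# Hypothetical synonym sets attached to (hypothetical) ontology classes.
SYNONYM_SETS = {
    "mi:Miniature": {"miniature", "illumination"},
    "mi:DecoratedInitial": {"initial", "decorated initial"},
    "mi:PortraitMiniature": {"miniature", "portrait miniature"},
}


def candidate_classes(term: str) -> list:
    """Return every class whose synonym set contains the term (case-insensitive)."""
    needle = term.strip().lower()
    return sorted(cls for cls, terms in SYNONYM_SETS.items() if needle in terms)


def resolve(term: str):
    """Map a term to a class, or return None when a cataloguer must decide."""
    candidates = candidate_classes(term)
    if len(candidates) == 1:
        return candidates[0]
    return None  # unknown term, or polysemous term with several readings


print(resolve("Illumination"))         # a synonym that maps to a single class
print(candidate_classes("miniature"))  # a polysemous term with two candidate classes
```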
Semantic Transformation 3: The RDF Schema (RDFS)
The shared ontology for the textiles domain is created by using Resource Description Framework Schema (RDFS). An RDF Schema is a tool for indicating the classes of resources one wants to describe, as well as for defining the properties used to describe those resources. Furthermore, class/sub-class relationships and property/sub-property relationships can be defined. The museums are mapping their metadata to the classes and properties defined by the RDF Schema of the FMS initiative. Thereby, they are making the meaning of the metadata explicit and representing them in a harmonised, uniform way.
RDF Schema
In the section 'Semantic Transformation 1' we have described the data model provided by RDF for expressing statements about Web resources. But we also need a vocabulary for the RDF statements, namely classes and properties defined with RDF Schema (RDFS).
In brief, the RDF Schema mechanism provides a pre-defined vocabulary, a basic type system that can be used in creating domain-specific schemas. Its role is to allow for declaring metadata properties (e.g. for 'type', 'subject' or 'creator'), to define the classes of resources they may be used with, to restrict possible combinations, and to detect violations of those restrictions.
Defining classes
With RDF Schema (RDFS), Web resources can be defined as instances of one or more classes. In addition, classes can be organised in a hierarchical fashion. As we hold a collection of digital images drawn from illustrated medieval manuscripts, we first need to define a class of things that are images. In RDF Schema, a class is any resource having an rdf:type property whose value is the RDFS-defined resource rdfs:Class.
So, using the basic RDF data model, we define:
mi:Image [resource] rdf:type [property] rdfs:Class [value]
The self-defined prefix mi (for medieval images) stands for the URI reference of our RDF Schema namespace, http://www.m-i.org/schemas/images#
In our image collection, we have various special kinds of digitised images, such as column miniatures, decorated initials, schematic drawings, etc. To distinguish, for example, the miniatures, we first need to define a general class Miniature and subclasses of miniatures, e.g. a subclass ColumnMiniature:
mi:Miniature rdf:type rdfs:Class
mi:ColumnMiniature rdf:type rdfs:Class
Secondly, we need to define that mi:ColumnMiniature is a subclass of mi:Miniature and that mi:Miniature is a subclass of mi:Image, for which we use the predefined rdfs:subClassOf property:
mi:Miniature rdfs:subClassOf mi:Image
mi:ColumnMiniature rdfs:subClassOf mi:Miniature
As the rdfs:subClassOf property is transitive, this means that mi:ColumnMiniature is also implicitly a subclass of mi:Image.
Graphic 2 visualises this with the nodes and arcs of the basic RDF data model.
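The effect of transitivity can be made concrete with a short sketch that follows the rdfs:subClassOf arcs upwards and collects every class reached. The dictionary layout and the function name are illustrative assumptions; an RDF toolkit would normally derive the same result from the triples themselves.

```python
# Direct subclass statements, as declared above: class -> set of direct superclasses.
SUB_CLASS_OF = {
    "mi:ColumnMiniature": {"mi:Miniature"},
    "mi:Miniature": {"mi:Image"},
}


def superclasses(cls, sub_class_of):
    """Return all declared and implied superclasses of `cls`."""
    found = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in sub_class_of.get(current, ()):
            if parent not in found:  # also guards against accidental cycles
                found.add(parent)
                stack.append(parent)
    return found


print(sorted(superclasses("mi:ColumnMiniature", SUB_CLASS_OF)))
```

Only the two direct statements are stored; the fact that mi:ColumnMiniature is a subclass of mi:Image is derived when needed.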
Defining properties
In order to make the meaning of our metadata (i.e. 'type') explicit, we need to be capable of declaring specific properties that characterise the classes of things we hold at http://www.m-i.org, e.g. digital images of medieval column miniatures.
Basically, RDF Schema defines properties in terms of the classes of resources to which they apply. This is the role of the rdfs:domain and rdfs:range mechanisms.
rdfs:range
The range constraint defines the class or set of classes whose instances can be values of a particular property. If we want to define the property mi:hasType, we must describe this resource (which we locate at http://www.m-i.org/schemas/images) with an rdf:type property whose value is rdf:Property:
mi:hasType [resource] rdf:type [property] rdf:Property [value]
The following RDF statements indicate that mi:ColumnMiniature is a class, mi:hasType is a property, and RDF statements using the mi:hasType property have instances of mi:ColumnMiniature as values:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:range mi:ColumnMiniature
rdfs:domain
The domain constraint restricts the set of classes whose instances may have a particular property attached to them. If we want to indicate that the property mi:hasType applies to instances of class mi:ColumnMiniature, we would write:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:domain mi:ColumnMiniature
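Read as constraints, as this primer does, domain and range declarations allow a simple check of each statement: the subject must belong to the domain class and the object to the range class, taking subclass relations into account. The sketch below assumes this constraint reading (later RDF Schema versions use domain and range for inference rather than validation). The schema, the instances and the mi:Artist and mi:Place classes are hypothetical, and each class is given at most one superclass to keep the example short.

```python
# Hypothetical schema: one property with a domain and a range.
DOMAIN = {"mi:hasCreator": "mi:Image"}
RANGE = {"mi:hasCreator": "mi:Artist"}
SUB_CLASS_OF = {"mi:ColumnMiniature": "mi:Miniature", "mi:Miniature": "mi:Image"}

# Hypothetical instance data: resource -> its declared class.
TYPES = {
    "ex:folio12": "mi:ColumnMiniature",
    "ex:alexanderMaster": "mi:Artist",
    "ex:scriptorium": "mi:Place",
}


def is_instance_of(resource, cls):
    """True if the resource's class is `cls` or one of its subclasses."""
    current = TYPES.get(resource)
    seen = set()
    while current is not None and current not in seen:
        if current == cls:
            return True
        seen.add(current)
        current = SUB_CLASS_OF.get(current)
    return False


def violations(triple):
    """List the domain and range constraints that a statement breaks."""
    subject, predicate, obj = triple
    problems = []
    if predicate in DOMAIN and not is_instance_of(subject, DOMAIN[predicate]):
        problems.append(f"{subject} is not in the domain {DOMAIN[predicate]}")
    if predicate in RANGE and not is_instance_of(obj, RANGE[predicate]):
        problems.append(f"{obj} is not in the range {RANGE[predicate]}")
    return problems


print(violations(("ex:folio12", "mi:hasCreator", "ex:alexanderMaster")))
print(violations(("ex:folio12", "mi:hasCreator", "ex:scriptorium")))
```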
Benefits of RDF
In a SearchWebServices.com definition of RDF, some benefits of RDF are mentioned:
| 'By providing a consistent framework, RDF will encourage the providing of metadata about Internet resources.
| Because RDF will include a standard syntax for describing and querying data, software that exploits metadata will be easier and faster to produce.
| The standard syntax and query capability will allow applications to exchange information more easily.
| Searchers will get more precise results from searching, based on metadata rather than on indexes derived from full text gathering.
| Intelligent software agents will have more precise data to work with.'¹³
This is a well-crafted listing of RDF benefits, from provision and exchange of better metadata to agents working with them, hopefully for the benefit of humans. But, as explicitly stated by SearchWebServices.com, these are only potential benefits, i.e. they depend on the level of actual uptake of RDF.
13 whatis.com / searchWebServices.com Definitions - Resource Description Framework, http://searchwebservices.techtarget.com/sDefinition/0,,sid26_gci213545,00.html (last updated July 27, 2001)
Generating and Using the Knowledge Space
The RDF cards represent the original XML documents at the semantic level. The union of such RDF cards constitutes a knowledge base, which is a harmonised semantic representation of the underlying heterogeneous databases.
However, so far the RDF instance descriptions have not left the museum. The museum has complete control of the information it wants to publish, and it does not need to allow the FMS system access to its internal database system. The RDF data are placed in a public directory on the museum's WWW server.
The Web crawler of the FMS system harvests the instance descriptions from the different museums, and the system combines them into an RDF repository. This repository is a large semantic graph that consists of the shared ontology and metadata.
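Because an RDF graph is a set of triples, combining harvested instance descriptions amounts to a set union. The following sketch shows only that step, with invented sample data; fetching the files from the museums' Web servers and parsing RDF/XML are left out, and the function name is hypothetical.

```python
def build_repository(ontology, rdf_cards):
    """Union of the shared ontology and every harvested RDF card.

    Triples are (subject, predicate, object) tuples; statements harvested
    more than once collapse into a single entry because a set is used.
    """
    repository = set(ontology)
    for card in rdf_cards:
        repository.update(card)
    return repository


# Invented sample data for two museums.
ONTOLOGY = {("mi:Miniature", "rdfs:subClassOf", "mi:Image")}
MUSEUM_A = [
    ("a:item1", "rdf:type", "mi:Miniature"),
    ("a:item1", "mi:hasCreator", "Alexander Master"),
]
MUSEUM_B = [
    ("b:item7", "rdf:type", "mi:Image"),
    ("a:item1", "rdf:type", "mi:Miniature"),  # a statement already harvested from museum A
]

repository = build_repository(ONTOLOGY, [MUSEUM_A, MUSEUM_B])
print(len(repository), "distinct statements")
```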
How does a user now search and navigate in this knowledge space? In the FMS system, this is implemented by a server-side software called Ontogator. Based on the semantic graph, this software dynamically generates semantic linkages for the user's Web browser.
One way of using the FMS system is view-based filtering. The user can select classes of resources from the ontology, and the system finds the instances that match the selected class restrictions. By constraining classes (views) further, the collection instance data searched for are eventually found.
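View-based filtering can be pictured as follows: an instance matches when, for every class the user has selected, it belongs to that class or to one of its subclasses. The sketch below is not the Ontogator implementation. The data, the ex: names and the idea of modelling places as a second class hierarchy are assumptions made for the example.

```python
# Hypothetical class hierarchies: class -> direct superclass.
SUB_CLASS_OF = {
    "mi:ColumnMiniature": "mi:Miniature",
    "mi:Miniature": "mi:Image",
    "ex:Paris": "ex:France",
}

# Hypothetical collection records: instance -> classes it is assigned to.
INSTANCE_CLASSES = {
    "ex:item1": {"mi:ColumnMiniature", "ex:Paris"},
    "ex:item2": {"mi:Miniature", "ex:Bruges"},
    "ex:item3": {"mi:Image", "ex:France"},
}


def with_ancestors(cls):
    """The class itself plus every class above it in the hierarchy."""
    chain = []
    while cls is not None and cls not in chain:
        chain.append(cls)
        cls = SUB_CLASS_OF.get(cls)
    return set(chain)


def view_filter(selected_classes):
    """Instances that satisfy every selected class restriction."""
    wanted = set(selected_classes)
    hits = []
    for instance, classes in INSTANCE_CLASSES.items():
        reachable = set()
        for cls in classes:
            reachable |= with_ancestors(cls)
        if wanted <= reachable:
            hits.append(instance)
    return sorted(hits)


print(view_filter(["mi:Miniature"]))
print(view_filter(["mi:Miniature", "ex:France"]))
```

Adding a second restriction narrows the first result, which is the 'constraining classes further' step described above.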
The software also supports topic-based navigation by providing semantic links between topics of interest, the creation of which is based on the collection domain ontology and the related metadata of the collection records. This means that the links also provide the user with an impression of the wider context and pragmatics of the objects in the museums' collections.
From human users to software agents
As described in the Finnish Museums on the Semantic Web example, the RDF repository is a large semantic graph that consists of the shared ontology and metadata of the participating museums. Such a repository can be queried, and the results, a set of pointers to the relevant resources, can be accessed using Web browsers. The opportunities provided by a system like the one developed by the FMS initiative (e.g. topic-based navigation) are at present restricted primarily to human users.
The Semantic Web vision includes intelligent software agents which 'understand' semantic relationships between Web resources and seek relevant information, as well as perform transactions for humans.¹⁴
This software would be capable of autonomous action, i.e. could run without direct human control or constant supervision, and ideally would be very flexible in doing this. Characterisations of this flexibility include actions that are 'reactive', 'proactive' and 'social' (see below).
While the basic idea of agents is very intuitive and appealing, the actual theory is complex, the tools are immature, and the solutions small and prototype-based. In fact, as a parallel distributed systems technology, agents belong to the most complex class of software technology.
However, this primer will conclude with a summary of what an intelligent software agent is, and what such software would generally be capable of doing. This should also serve as an indication of how great the challenge for research and technological development is to make the full Semantic Web vision a reality.
Intelligent Software Agents
The following definitions are taken from Michael Wooldridge's introduction to multiagent systems.¹⁵
Agent
'An agent is a computer system capable of autonomous action in some environment.'
Intelligent agent
'An intelligent agent is a computer system capable of flexible autonomous action in some environment.'
Flexible autonomous action
'By flexible autonomous action, we mean reactive, proactive, social.'
| Reactivity: 'A reactive system is one that maintains an ongoing interaction with its environment, and responds to changes that occur in it (in time for the response to be useful).'
| Proactiveness: 'An agent serves a purpose and therefore exhibits goal-directed behaviour, including the capacity to recognise opportunities for useful courses of action.'
| Social ability: 'Social ability in agents is the ability to interact with other agents (and possibly humans) via some kind of agent communication language, and perhaps cooperate with others.'
Desirable further properties of agents are:
| Mobility: the ability to move around an electronic network.
| Rationality: an agent will act in such a way that it does not prevent itself from achieving its goals (as far as this is possible with a limited set of beliefs representing its world knowledge).
| Learning: an agent will improve its performance over time.
14 T. Berners-Lee, J. Hendler, O. Lassila, Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html
15 M. Wooldridge: An Introduction to Multiagent Systems. Chichester: Wiley, 2002, and http://www.csc.liv.ac.uk/~mjw/pubs/imas
Resources
In the primer, no references are made to the documents of the World Wide Web Consortium (W3C). All relevant W3C recommendations can be found at http://www.w3c.org
Of the wealth of introductory materials on XML and RDF available on the Web, the following in particular are useful to consult for further details:

http://www.w3schools.com/xml
http://www.w3schools.com/schema
http://www.w3.org/TR/rdf-primer
K. S. Candan, H. Liu, R. Suvarna: Resource Description Framework: Metadata and its Applications. In: SIGKDD Explorations, Vol. 3/1 (2001), 6-19.

Pierre-Antoine Champin: RDF Tutorial (2001), http://www710.univ-lyon1.fr/~champin/rdf-tutorial/rdf-tutorial.html

S. Decker, M. P. Mitra, S. Melnik: Framework for the Semantic Web: An RDF Tutorial, http://www.ida.liu.se/~asmpa/courses/sweb/rdf/rdf_tutorial.pdf
Graphic 3: Set-up of the Finnish Museums on the Semantic Web System (overview of the system's set-up)
| User client: WWW browser (topic-based navigation, view-based filtering)
| Server: Ontogator software and RDF database holding the semantic graph, i.e. the knowledge space of shared ontology and metadata
| Web crawler: collects the RDF instances from the museums
| Museums (collection databases 1 to n): metadata editor, XML repository, RDF instances
| Layers: RDF Schema (semantic interoperability), XML Schema (syntactic interoperability), relational schemas (DBMS)
THE DARMSTADT FORUM PARTICIPANTS

Wernher Behrendt, Salzburg Research, Austria
Wernher Behrendt is a Senior Researcher at the Sun Technology and Research Excellence Center at Salzburg Research (Austria), working on multimedia middleware and interoperation issues. He holds an MSc in Cognitive Science from Manchester University and has more than 10 years' experience in near-to-market IT research. From 1989 to 1995 he was a Senior Research Associate in the Informatics Department at Rutherford Appleton Laboratory (UK), working on embedded knowledge based systems in distributed multimedia presentation systems. From 1995 to 1998 he was a Senior Research Associate at Cardiff University (UK), working on interoperation between heterogeneous information systems. Mr Behrendt has held courses in Computer Science and has worked on projects ranging from software engineering methods and quality assurance to legacy system reengineering using migration methods and distributed systems middleware.
E-mail: wernherbehrendtsalzburgresearchat

Paolo Buonora, Archivio di Stato di Roma, Italy
Paolo Buonora holds a degree in Philosophy from the University of Rome 'La Sapienza' (1976). He worked in the Italian State Archive Administration from 1978, where he was first involved in editing the Guida Generale degli Archivi di Stato italiani. From 1986 he worked in the Soprintendenza archivistica per il Lazio, surveying audiovisual archives and municipal archives; from 1989 to 1991 he was at Perugia University, engaged in doctoral research in 'Urban and rural history'; and from 1991 to 1994 he was again in the Soprintendenza archivistica per il Lazio. After 1994 he worked in the Archivio di Stato di Roma, where he was responsible for the photograph service and several working groups on informatics application in archival documentation. From 1997 until the present time he has planned and directed the Imago II project in the Archivio di Stato di Roma.
See httpwwwasrmarchivbeniculturaliitsidimagoIMAGOIIenhtml

Samuel Cruz-Lara, University of Nancy 2 and LORIA, France
Samuel Cruz-Lara obtained a Master's degree in Computer Science in 1984 (University Henri Poincaré, Nancy 1) and a PhD degree in Computer Science in 1988 (National Polytechnic Institute of Lorraine). The central topic of his PhD thesis was the generation of integrated development environments by using attribute grammars.
He is currently Associate Professor at the University of Nancy 2 (Institute of Technology, Computer Science Department) and permanent Researcher at LORIA (Lorraine Laboratory for Research in Computer Science and its Applications - UMR 7503 - CNRS - INRIA - Universities of Nancy). He is a member of the 'Language and Dialogue' team and has conducted several research activities on distributed software architectures and textual linguistic resources management. He is currently working in the context of distributed architectures and multimedia resources management. Dr Cruz-Lara has participated in several projects, in particular CNRS/SILFIDE and MLIS-ELAN, and he is at present co-leader of the 'Digital Museum' project. This is a joint project between LORIA and the National Chi-Nan University. The 'Digital Museum' project is sponsored by the National Science Council of the Republic of China (Taiwan) and supported by INRIA (France).

Costis Dallas, Critical Publics SA, Greece
Costis Dallas is Chairman and Senior Researcher of Critical Publics (http://www.criticalpublics.com), a group of companies active in the field of strategic communications, creative design and technology. He is currently a Lecturer in the Department of Communication and Mass Media of Panteion University and has over 15 years of research and professional experience in hypermedia applications, human factor issues and cultural information systems. Dr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA, co-founder and Chair of the Multimedia Working Group of the International Council of Museums (CIDOC/ICOM), Head of Documentation and Systems of the Benaki Museum, General Director of the Foundation of the Hellenic World, Special Secretary of the Greek Ministry of Education in charge of libraries, archives and instructional technologies, and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: infocriticalpublicscom
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (http://www.nl.adlibsoft.com), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that he joined Databasix, producer of the ADLIB software. In 1991, Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was renamed later as ADLIB Information Systems.
ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1,000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include:
| CIMI Dublin Core test bed: ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database, in addition to the 'standard' HTML/XML access.
| SPECTRUM-XML: a project of the UK mda, together with a team of software vendors, to produce a schema for exchange of data which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software.
| Internet Gelderse Musea (http://www.igem.nl) and Maritiem Digitaal (http://www.maritiemdigitaal.nl): two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.
E-mail: bertnladlibsoftcom
Nicola Guarino, Italian National Research Council, Italy
Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991 and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS) and associate editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: NicolaGuarinoladsebpdcnrit
Janneke van Kersen, Digital Heritage Association, The Netherlands
Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992, she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing, and more recently she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University.
Since October 1999, van Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata, and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations, such as museums, archives, archaeological organisations, monuments and special collections of libraries, in a context-rich and XML-based environment. Interoperability from both a technical and an organisational perspective is therefore a major issue. The Association offers access to heritage information to the general public at http://www.cultuurwijzer.nl and customised access for the educational field at http://www.cultuurwijs.nl
E-mail: jannekevankersendennl
Marco Meli, EDW International, Italy
Marco Meli is CEO and co-founder of EDW International (http://www.edw-international.com), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia and VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meliedwit
Paul Miller, UKOLN, UK
Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, http://www.ukoln.ac.uk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK's Further and Higher Education Funding Councils and Resource, the Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI), and the Metadata Working Group of the UK Government's Office of the e-Envoy.
E-mail: PMillerukolnacuk
Frank Nack, CWI, The Netherlands
Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on 'The Application of Video Semantics and Theme Representation for Automated Film Editing' at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling, and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document, and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: FrankNackcwinl
Franco Niccolucci, Florence University, Italy
Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference.
See also http://www.geog.port.ac.uk/hist-bound/people/niccolucci.htm
Seamus Ross, HATII, University of Glasgow, UK
Dr Seamus Ross is Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Network) (IST-2001-32706), a European Union funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously, he was Assistant Secretary for Information Technology at the British Academy and, before that, worked for a company specialising in expert systems and software development, as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group’s PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: S.Ross@hatii.arts.gla.ac.uk
Andrea Scotti, Institute & Museum for the History of Science, Italy
Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995 he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.
Since 1996 Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza and the National Central Library, under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (010201-300703), funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies.
See: http://cweb.inria.fr/Projects/Mesmuses/
E-mail: scottian@galileo.imss.firenze.it
DIGICULT PROJECT INFORMATION
DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over a period of 30 months (03/2002-08/2004).
In order to encourage early take-up, DigiCULT produces seven Thematic Issues and three Technology Watch Reports, along with the newsletter DigiCULT.Info.
DigiCULT draws on the results of the strategic study ‘Technological Landscapes for Tomorrow’s Cultural Economy (DigiCULT)’ that was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications), in 2000 and completed in 2001.
Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at http://www.digicult.info.
For further information on DigiCULT please contact the team of the project co-ordinator:
Mr Guntram Geser, guntram.geser@salzburgresearch.at, Phone: +43-(0)662-2288-303
Mr John Pereira, john.pereira@salzburgresearch.at, Phone: +43-(0)662-2288-247
Salzburg Research Forschungsgesellschaft, Jakob-Haringer-Str. 5/III, A-5020 Salzburg, Austria
Phone: +43-(0)662-2288-200, Fax: +43-(0)662-2288-222
http://www.salzburgresearch.at
Project Partner:
HATII - Humanities Advanced Technology and Information Institute, University of Glasgow
http://www.hatii.arts.gla.ac.uk
Contact: Mr Seamus Ross, s.ross@hatii.arts.gla.ac.uk
The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom
DigiCULT Thematic Issue 1 – Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum, held in Barcelona on May 6th, 2002, in the context of the DLM-Conference 2002.
DigiCULT Thematic Issue 2 – Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum, held in Essen, Germany, on September 3rd, 2002, in the context of the AIIM Conference @ DMS EXPO.
DigiCULT Thematic Issue 3 – Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum, held on January 21st, 2003, at Fraunhofer IPSI, Darmstadt, Germany.
DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum, on Learning Objects, that will take place at the Koninklijke Bibliotheek – National Library of the Netherlands, The Hague, on July 2nd, 2003.
IMPRINT
This Thematic Issue is a product of the DigiCULT Project (IST-2001-34898).
Authors:
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy
Images:
Images for this Thematic Issue have been provided by and are reproduced with kind permission of the Koninklijke Bibliotheek – National Library of the Netherlands, The Hague, Netherlands.
Graphics & Layout:
Jan Steindl, Salzburg Research
ISBN 3-902448-00-8
Printed in Austria
© 2003
IMAGES
Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I. Image on p. 5 from Fol. 264r, size 80x75; illuminator: Orosius Master a.o.
Bible Historiale, Paris, c. 1320-1340, Volume I. Image on p. 7 from Fol. 2r, size 45x50; illum.: ‘Sub-Fauvel’ Master.
Historic Bible, Utrecht, c. 1430, Volume I. Images on pp. 8, 10, 14, 28 from Fol. 3r, 3v, 4v, 7v, size 55x85 to 60x85; illum.: Alexander Master o.a.
Psalter Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468. Images on pp. 11, 13: Wednesday Matins, Invitatorium, from Fol. 219r.
Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350. Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron) from Fol. 104rb1, 106rb2, 124r, 111ra, size 40x55 to 50x55.
Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335. Image on p. 21 from Fol. 4va1, size 45x55.
Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460. Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41.
Psalter, Normandy, c. 1180. Image on p. 26 from Fol. 3v, size 160x125.
Breviary, Cambray(?), c. 1275-1300. Images on pp. 27, 29, 31, 33, 34, 35, 37 from Fol. 211r, size 205x135.
© Koninklijke Bibliotheek, The Hague. Used with permission.
INTRODUCTION AND OVERVIEW
By Guntram Geser
Philosophy in Discussion With a Philosopher
FUNCTION AND FOCUS
DigiCULT, as a support measure within the Information Society Technologies Programme (IST), will for a period of 30 months (beginning March 2002) provide a technology watch mechanism for the cultural and scientific heritage sector. Backed by a network of peer experts, the project monitors, discusses and analyses existing and emerging technologies likely to bring benefits to the sector.
To promote the results and encourage early take-up of relevant technologies, DigiCULT has put in place a rigorous publication agenda of seven Thematic Issues and three in-depth Technology Watch Reports, as well as the DigiCULT.Info e-journal, pushed to a growing database of interested persons and organisations on a regular basis. All DigiCULT products can be downloaded from the project Web site http://www.digicult.info as they become available. The opportunity to subscribe to DigiCULT.Info is also found here.
March 2003 saw the release of the first DigiCULT Technology Watch Report. This report covers the topics Customer Relationship Management, Digital Asset Management Systems, Smart Labels and Smart Tags, Virtual Reality and Display Technologies, Human Interfaces, and Games Technologies. Addressing primarily technological issues, it serves as a guide to what a heritage institution needs to consider when buying into one of these technologies.
In comparison with the Technology Watch Reports, the Thematic Issues focus more on the organisational, policy and economic aspects of the technologies under consideration. They are based on the expert round tables organised by the DigiCULT Forum secretariat. In addition to the Forum discussion, they provide opinions of other experts in the form of articles and interviews, case studies, and short descriptions of related projects, together with a selection of relevant literature.
TOPIC AND CHALLENGE
This third Thematic Issue addresses the questions: What is the Semantic Web? What will it do for heritage institutions? And what is the role of certain languages, in particular XML and RDF?
In short, the Semantic Web vision proclaims a Web of machine-readable data which allows software agents to carry out rather complex tasks for humans automatically. Key to realising this vision is semantic interoperability of Web resources. Yet such interoperability is not the primary goal of heritage institutions (and intelligent software agents are not readily at hand).
What the institutions are looking for are new ways of providing scholarly and non-expert users (e.g. school classes, lifelong learners) with access to their collections and related knowledge. This goal can be accomplished, for example, through online collections and exhibitions that not only display objects and simple descriptions (drawn from metadata), but also allow for understanding relationships between objects (created by semantically interrelated metadata). The Semantic Web community promises to assist in achieving this goal, but the challenge for the heritage institutions would be first to implement the necessary data infrastructure.
The challenge for the Semantic Web expert round table was, or at least the DigiCULT Secretariat thought it was, not to run into a debate between ‘theory’ and ‘practice’; in other words, between what academic Semantic Web scholars and what practitioners from heritage institutions think needs to be accomplished, what is feasible and affordable, and where to concentrate efforts. For the discussion, XML seemed to provide a good starting point: on the one hand, XML is increasingly considered by heritage institutions as a key standard for publishing metadata on the Web; on the other hand, it is a major building block for the Semantic Web. It proved different, in a positive sense. In the discussion, wide use of XML was taken for granted, while the key area of interest that surfaced, and was seen to be most fruitful to explore, was ontologies.
OVERVIEW
Setting the context for this Issue, the position paper looks into the requirements for achieving the goals of the Semantic Web and assesses whether the available technologies will be able to deliver on what the advocates of the Semantic Web envisage, as well as whether the cultural heritage sector is in a position to take substantial steps towards semantic interoperability. It concludes with the argument that the sector is more likely to be left behind, due in particular to the fact that, for the institutions, the rewards for the necessary investments are still too nebulous.
Janneke van Kersen from the Dutch Digital Heritage Association, in her interview with the DigiCULT Journalist, suggests that despite the cloudy Semantic Web horizon there are medium-term benefits to be gained for heritage institutions in taking steps towards the vision. And she states that it is up to associations like hers, together with larger institutions, to take the lead in this, prove that proposed solutions work, and support smaller institutions in taking advantage of them. On the other hand, Nicola Guarino, in his interview, believes that reaching the ‘real’ Semantic Web lies in taking ‘the fundamental route’ of implementing generic ontologies based on linguistics and logics within the Semantic Web fabric. He also claims that even incremental progress along this path can have remarkable pay-offs.
Michael Steemson’s summary of the Darmstadt Forum illustrates that the Semantic Web topic resembles a labyrinth, with currently no definite map or Ariadne’s Thread at hand. Building on the many technologies the Forum participants mentioned as some of the labyrinth’s angles, we have added to the summary a list of resources related to these technologies.
In an effort to raise the veil of mystery surrounding the Semantic Web, this issue includes an example from the sector on the implementation of semantic interoperability of metadata, combined with a primer that explains core building blocks such as XML, RDF and ontologies. While a detailed primer of, for example, RDF would alone exhaust the limits of this issue,[1] the goal here is to deliver an ‘all-inclusive’ primer within the space permitted, with all the inevitable limitations this entails. The primer attempts to provide a general understanding of the Semantic Web architecture without obliging the reader to wander through the long and perplexing corridors of language specifications.
Finally, we want to thank the Koninklijke Bibliotheek, National Library of the Netherlands, for their kind permission to use selected images from their collection of illuminated medieval manuscripts.[2]
We hope you will appreciate the little narratives they represent within the overall fabric of this DigiCULT Thematic Issue.
[1] Cf. F. Manola, E. Miller: RDF Primer (W3C Working Draft, 23 January 2003), http://www.w3.org/TR/rdf-primer/
[2] See their online collection of such images at http://www.kb.nl/kb/manuscripts/, which offers advanced search and presentation features.
POSITION PAPER
By Seamus Ross
Genesis – The Creation: Division of Light and Darkness
Tim Berners-Lee and his colleagues at W3C have recognised that the real benefits of the web-based information revolution will come from enabling the interoperability of content. The current generation of web delivery is, they have argued, designed for human users, who struggle to make effective use of the billions of pages of information currently accessible. When we search for something at the moment, we sometimes discover suitable candidate information, but more often than not this is far from being the case. More than this, the entire process of searching, discovery and use is designed to be driven by humans. When we discover one piece of the puzzle, we need manually to position that information so that it can help us to search out the next piece of the puzzle. We find that Darmstadt is near Frankfurt. Then we find that there are flights from Glasgow to Frankfurt and there is a bus from Frankfurt Airport to Darmstadt. Then I search for timetables, make manual comparisons and decide which times best meet my requirements. In the Shangri-La that is the Semantic Web, my ‘agent’ would recognise from its regular review of my diary that I needed to be at a meeting in Darmstadt on the 21st of January 2003, and it would search out the options, analyse the timetables, identify the optimum travel arrangements, book my non-smoking hotel accommodation and order the taxi to take me to the airport. (It might even check the weather forecasts and warn me to bring particular types of clothing.) Certainly, to make this happen there has to be a fundamental shift in the way data, information and knowledge are represented on the web.
The proliferation of web-based resources makes finding what you are looking for increasingly difficult. According to Internet user studies, in 1996 50% of Internet users reported spending time looking for information without finding it, but by 2002 only about 40% of users ended their ‘searching sessions’ unsuccessfully. At first glance we might conclude that web discovery tools have improved and/or the information searching skills of users have improved. Over the past seven years the quantity of content has mushroomed, the search tools have become more efficient, developers approach the use of meta-tags more effectively, and anecdotal evidence suggests that the searching techniques of users have become more sophisticated. We should continue to be surprised by the high failure rate and wonder why it remains proportionally so high as the numbers of users have grown to nearly 600 million. In reality, there is just too much content available. It is poorly described. It is not interconnected. Search engines themselves are blunt instruments. Most users of the web do not have very mature searching strategies and rarely use even the blunt instruments as effectively as they might. A solution is to make more of the information capable of discovery, interpretation and reuse by automated information processing tools themselves. However, the current ways content is represented on the web make it nearly impossible for machines to search the web meaningfully and effectively; even with the limitations of their skills and tools, humans are better at searching the web than the most powerful of the current generation of agents. The emergence of the Semantic Web would solve this problem.
The web has made us realise the tremendous potential of digital resources and made them widely available. Content as presented on the web currently is mute. By adding descriptive information to content and resources, and representing both the descriptive information and the content in well-defined, consistent and structured ways, ‘mechanised agents’ could be enabled to use web information ‘intelligently’. Tim Berners-Lee, Jim Hendler and many other researchers believe that commercial and public sector institutions are increasingly recognising the benefits of ensuring that their content is adequately represented so that it is visible and discoverable within the context of the Semantic Web.
Genesis – The Creation: Division of the Waters Above and Below the Firmament
The Semantic Web will enable the heritage sector to make its information available in meaningful ways to researchers, the general public and even its own curators. The public will be able to plan visits to institutions by, for example, dynamically relating opening times to public transport schedules. They will be able to use information to discover whether or not that vase in the attic or basement is really Ming, as their grandmother claimed, by comparing it to the holdings of heritage institutions across the world. Curators will benefit from the ability to define an exhibition and have the entire process, from the identification of the pieces to be shown in the exhibition to the production of the catalogue and publicity material, automatically handled by their ‘exhibition agents’.
TOWARDS AN INTEROPERABLE SEMANTIC WEB FOR HERITAGE RESOURCES
Delivering the Semantic Web to the heritage sector depends upon (a) the syntactical and semantic mark-up of content, (b) the development of better knowledge analysis and modelling tools, (c) widespread adoption of interoperable knowledge representation languages, and (d) the construction of suitable ontologies. In most of this the heritage sector is lagging behind. We have not yet successfully represented sufficient quantities of our data in ways that make it accessible to human web users, let alone in ways that would make it feasible for ‘mechanised agents’ to reason about in meaningful ways. ‘Languages for representing data and knowledge are an important aspect of the Semantic Web’ (Klein 2001, 26). The languages that are currently the focus of the most substantial discussion, such as RDF, DAML+OIL and OWL,[1] do not necessarily provide a suitable framework for delivering the Semantic Web. This point has been increasingly argued in the literature, although in practice we still tend to emphasise the possibilities of representation mechanisms such as RDF(S), because it provides a flexible and extensible mechanism to represent metadata. A debate is raging about which language should be used to represent semantics on the web. Resource Description Framework (RDF), an XML-based mechanism for expressing metadata, has been put forward at the basic level, but there is a growing body of opinion that indicates it does not have the richness that is necessary to make a suitable language. One of its shortcomings is that it cannot support syntax. In response, other languages such as DAML+OIL have been developed. As an indication of the current levels of flux, in a fundamental paper Patel-Schneider and Siméon from Bell Labs Research remark that ‘…there is a semantic discontinuity at the very bottom of the Semantic Web, interfering with the stated goal of the Semantic Web. If Semantic languages do not respect World-Wide Web data, then how can the Semantic Web be an extension of the World-Wide Web at all?’ (2002a, 147).
The strength of XML (eXtensible Markup Language) is that it does not itself constrain how the data will be interpreted. While XML does not imply a specific interpretation of the data, how the material is marked up does constrain how it can be used. Fallside (2001) has made plain the weaknesses of using DTDs as a way of specifying semantic properties in XML. XML Schemas offer a solution to these weaknesses, especially where those weaknesses arise from representational problems. On the other hand, the hierarchical nature of XML does not fit all domains, it ‘does not encode the data’s use and semantics’, and DTDs and XML Schemas do not specify the data’s meaning, although they do specify the names of elements and attributes. Will the Semantic Web produce different levels of sophistication in the representation of data and knowledge in the web-world? If it does, will this create a patchy representation of web information that will make the Semantic Web of limited value?
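The observation that XML fixes element names but not their meaning can be made concrete with a small sketch. Everything in it is an assumption made for illustration: the two records, their element names and the `FIELD_MAP` table are invented and come from no real collection schema. It simply shows that a parser sees two unrelated tags until a person supplies the mapping between them.

```python
import xml.etree.ElementTree as ET

# Two hypothetical collection records that state the same kind of fact
# using different element names (both invented for this sketch).
RECORD_A = "<painting><title>Harbour at Dusk</title><creator>A. Painter</creator></painting>"
RECORD_B = "<object><name>Harbour at Dawn</name><artist>A. Painter</artist></object>"

# Assumed, hand-written mapping from local element names to shared concepts.
# XML itself carries no such information; someone has to provide it.
FIELD_MAP = {"creator": "maker", "artist": "maker", "title": "label", "name": "label"}


def raw_lookup(xml_text, tag):
    """Return the text of the first child element called `tag`, or None."""
    element = ET.fromstring(xml_text).find(tag)
    return None if element is None else element.text


def mapped_lookup(xml_text, concept):
    """Return the text of the first child whose tag maps to `concept`."""
    for child in ET.fromstring(xml_text):
        if FIELD_MAP.get(child.tag) == concept:
            return child.text
    return None
```

A query for `creator` succeeds on the first record and silently fails on the second, although both name a maker; only the lookup that goes through the externally supplied mapping treats the two records alike. That mapping is the kind of knowledge that schemas alone do not provide.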
[1] See the ‘Semantic Web Terms and Reading List’ in this Issue.
ONTOLOGIES – THE JEWELS OF THE SEMANTIC WEB
For the Semantic Web to succeed, it will require not only modelling languages such as XML, RDF and OWL, but also methodologies for extracting and defining the knowledge that is to be represented. Decades of research and commercial attempts to exploit knowledge-based systems have demonstrated the complexity of knowledge modelling. Until there is such a methodology, the possibilities of XML (or any other technology) as a knowledge representation language will not be achieved.
The success of the Semantic Web will depend heavily upon the creation of suitable ontologies. To avoid adding new variants to definitions, we will follow James Hendler’s definition of ontology as ‘a set of knowledge terms, including the vocabulary, the semantic interconnections, and some simple rules of inference and logic for some particular topic’ (Hendler 2001, 30). One of the major hurdles facing us in building the Semantic Web is the lack of suitable ontologies. Languages such as OWL enable ontologies to represent ‘class taxonomies’ and provide mechanisms to enable their rapid development. For example, concepts and relationships can be established such as ‘watercolour is a type of painting’ or ‘a necklace is a type of jewellery’. But what about their multilingual capabilities? An ontology may well know that a ‘watercolour is a painting’, but that does not necessarily mean that it knows that an ‘aquarelle is a type of painting’ or that a ‘watercolour is a type of peinture’. In addition, and probably first, we need to consider:
| Can we cost the creation of appropriate ontologies for the heritage sector?
| How can we prioritise the ontologies that are needed (e.g. which ones should the heritage sector develop and which ones will we be able to borrow from other sectors)?
| What heritage-based organisations should focus on ontology creation?
| Ontologies often fail to be interoperable. What solutions are there to this problem and how can they be made to work effectively?
| Does OWL (W3C’s Web Ontology Language, a semantic markup language for publishing and sharing ontologies) provide a suitable mechanism for ontology creation for the heritage sector?
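The watercolour example above can be sketched in a few lines. This is a toy model only, not OWL: the `SUBCLASS_OF` table, the `LABELS` table and the function `is_a` are hypothetical names introduced for illustration. It shows a class taxonomy with transitive ‘is a type of’ reasoning, and how that reasoning stops at a language boundary unless the multilingual labels are supplied as well.

```python
# Toy class taxonomy: each term points to its direct parent (hypothetical data).
SUBCLASS_OF = {
    "watercolour": "painting",
    "painting": "artwork",
    "necklace": "jewellery",
}

# Assumed multilingual labels mapping foreign terms onto the taxonomy's own terms.
LABELS = {
    "aquarelle": "watercolour",  # French
    "peinture": "painting",      # French
}


def is_a(term, ancestor, labels=None):
    """Return True if `term` is `ancestor` or a (transitive) subclass of it."""
    labels = labels or {}
    current = labels.get(term, term)
    target = labels.get(ancestor, ancestor)
    seen = set()  # guards against accidental cycles in the table
    while current is not None and current not in seen:
        if current == target:
            return True
        seen.add(current)
        current = SUBCLASS_OF.get(current)
    return False
```

Called without labels, `is_a("aquarelle", "painting")` is false even though `is_a("watercolour", "painting")` is true; passing `LABELS` closes the gap. The sketch leaves open who builds and maintains such label tables, which is one of the cost questions raised in the list above.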
Gómez-Pérez and Corcho (2002), in an analysis of ‘Ontology Languages for the Semantic Web’, found that the measure of expressiveness in the current generation of ontology creation languages is a spectrum from XOL, RDF(S), SHOE, OML and OIL to DAML+OIL at the richest end of the scale. Indeed, in their experience, while any of these languages will work for very simple ontologies, any attempt to use a weak language to create a complex ontology will fail.
Proof and trust is emerging as another central issue. How do we know that what our agent has discovered through its trawl of the Semantic Web can be trusted? Even in the case of ontologies, how should we decide whose ontology to trust? This is especially important where two ontologies may conflict with one another. Similarly, we are faced with the difficulties of ensuring and maintaining semantic integrity, and with a lack of methods for testing its presence.
LEGITIMISING THE SEMANTIC WEB INVESTMENT
Heflin and Hendler (2001) make the valuable proposal that semantic mark-up should be seen as one aspect of web-page design. This, in their view, would go a long way to ensuring that the costs of this mark-up (and of the underlying information analysis that is necessary to make it happen) were met at the appropriate stage of the process of putting material up on the web. However, Heflin and Hendler’s proposal that semantic mark-up should be embedded into web-page design fails to recognise that the fundamental fabric of the web is changing. For this to happen, we need a stronger argument for the benefits that such investment will bring to the heritage sector.
Haustein and Pleumann (2002) have noted that the successful development of the World Wide Web benefited from two factors: ‘Participation was simple and the results of effort were immediately visible to the creator.’ As they argue, while these two success criteria, best classified in my view as ease of use and instant gratification, were characteristic of the WWW, they are not embedded into the fabric of the Semantic Web. The Semantic Web is hard, and rewards are neither immediate nor assured. While in the long term it may bring tremendous benefits, the near-term take-up will be slow.
At least three other factors contributed to fostering the success of the web. Firstly, the early web developments concentrated on content creation and not on the creation of representation languages. The initial instantiation of HTML was simple, but it worked, and material tagged using it remained accessible. We were not forced to dispose of work that we had ‘webised’ unless we wished to replace its representation with a more sophisticated one. Secondly, the value of the content that we put up increased as more users put up content of their own, because the additional content attracted more users. Thirdly, to benefit you did not need to generate a lot of content; a very little would do, and you could incrementally add more later. Slowly, heritage institutions found ways to take advantage of the opportunities offered by the web, although there are still many small and medium-sized heritage institutions that have not.
Genesis – The Creation: Division of Sea and Earth, Creation of Trees and Plants
Indeed, the heritage sector is likely to be left behind because the financial rewards for creating the mark-up necessary to make the Semantic Web a reality are only evident to the commercial sector. There can be little doubt that access to and understanding of the heritage would benefit from a world in which the vision of the Semantic Web were realised. But this is not the first information technology for which the benefits were promising. Even very simple strategies, such as the use of databases to enable collection description, have been shown over a period of nearly thirty-five years to bring benefits to heritage sector institutions through better knowledge about, care of, and access to their collections. In the ALM sector only libraries can be said to have fully taken advantage of the technology to describe their collections, and even here a close look shows that this has not covered all their holdings and not every institution. For instance, few libraries in the UK have online catalogues of their pre-1700 items, and almost none have accurately described their photographic holdings at anything deeper than collection level. The same can be said of museums, where descriptions are limited except, of course, at the major institutions. In 1997 a survey in the United Kingdom showed that small and medium-sized institutions were struggling to participate in the computer-based description of their holdings. This was even before they considered putting the output of those holdings online. I would argue that this should hardly be surprising, as the heritage sector has already been left behind in the development of online information in the web-world. Too few institutions have visible content that is actually usable. If the heritage sector is to make a near-term contribution to the development of the Semantic Web, it is going to be very moderate. It is very unlikely that developments will be related to reasoning about the heritage in the ways considered by Amann et al. (2002). The ALM sector is more than likely to participate in the development of the Semantic Web through the creation of semantic mark-up of information about access arrangements, such as opening hours and details of facilities. This information is more likely to be useful to the tourism agent described by Tim Berners-Lee in his 2001 Scientific American article. While this may be a very positive way of integrating the heritage into the Semantic Web, it does not maximise the potential benefits.
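What semantic mark-up of access arrangements might amount to can be sketched as follows. This is a minimal illustration under stated assumptions: the statements are written as subject-predicate-object tuples loosely in the spirit of RDF triples, and the identifier `museum:example`, the predicates `opens`, `closes` and `has_facility`, and both functions are invented here rather than taken from any published vocabulary.

```python
from datetime import time

# Hypothetical statements in subject-predicate-object form.
# The vocabulary is invented for this sketch.
TRIPLES = [
    ("museum:example", "opens", time(10, 0)),
    ("museum:example", "closes", time(17, 0)),
    ("museum:example", "has_facility", "cafe"),
]


def objects(subject, predicate, triples=TRIPLES):
    """Return every object stated for the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def is_open(subject, at, triples=TRIPLES):
    """Return True if `at` falls within the stated opening hours."""
    opens = objects(subject, "opens", triples)
    closes = objects(subject, "closes", triples)
    if not opens or not closes:
        return False  # nothing stated: the agent cannot conclude it is open
    return opens[0] <= at < closes[0]
```

A tourism agent of the kind mentioned above could call `is_open("museum:example", time(11, 30))` while assembling an itinerary. For an institution that has published no statements the function returns false, which mirrors the position of institutions whose information is not yet online.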
The days when a curator who wishes to hold an exhibition on the representations of Salome since the 15th century will be able to ‘load’ an agent with the request to identify, select, negotiate the loan of, and arrange the transportation of the key 100 works of art are a long way off. The fundamental descriptions of holdings are not currently available; where they exist, they are not online, and they certainly have not been semantically encoded to make them usable by our Salome agent. For those who have worked on Knowledge Representation, the vision of the Semantic Web holds promise. Knowledge Representation is hard, especially if you intend any particular representation to be usable by others, either as a decision-making resource or for research purposes. Efforts in the 1980s and early 1990s, such as those in archaeology, failed. The reasons Knowledge Representation failed to achieve its promise ranged from the poor quality of knowledge extraction strategies, the lack of fundamental representation methodologies, the limited applicability of methods to knowledge domains, and the problems of boundary constraint and creep, to the high costs of developing applications. The Semantic Web could breathe new life into this earlier promise by providing ways to carve up the problem while bringing us immediate successes.
CONCLUSION
Over the next five years, the possibilities offered by the Semantic Web will bring little near-term benefit for the heritage sector unless that sector co-ordinates its efforts to ensure that the fundamental building blocks that are necessary for the Semantic Web to be a success are put in place. We need some quick wins. A quick win involves identifying a domain that every institution can be encouraged to represent semantically, and the placing in the public domain of a ‘personalisable agent’ that can take advantage of these semantics. Three factors could underpin a quick win: (a) a narrowly restricted knowledge domain of real public value, (b) an accessible and narrow ontology, and (c) a personalisable tool for processing knowledge.
Ultimately the same factors that constrain theheritagersquos sectors ability to take full advantage of theWeb will constrain the penetration and pervasivenessof the Semantic Web in the heritage sectorThesuccess of the Semantic Web in the heritage sectordepends upon its adopting a XML based approachand a significant experiment that demonstrates itsbenefits to the wider community Even for all itsweaknesses the Semantic Web offers a tantalisingsolution to the problem of information overloadcreated by the web and the heritage sector needs toaddress how it can take advantage of theopportunities it offers
Bibliography
Amann, B., Beeri, C., Fundulaki, I. and Scholl, M., 2002, ‘Ontology-Based Integration of XML Resources’, in I. Horrocks and J. Hendler (eds), The Semantic Web - ISWC 2002, Berlin: Springer, 117-131.
Berners-Lee, T., Hendler, J. and Lassila, O., 2001, ‘The Semantic Web’, in Scientific American, May 2001.
Candan, K.S., Liu, H. and Suvarna, R., 2001, ‘Resource Description Framework: Metadata and Its Application’, in SIGKDD Explorations, 3(1), 6-19. http://www.acm.org/sigs/sigkdd/explorations/issue3-1/candan.pdf
Doan, A., Madhavan, J., Domingos, P. and Halevy, A., 2002, ‘Learning to Map between Ontologies on the Semantic Web’, in Proceedings WWW2002, 7-11 May 2002 (Honolulu), 662-673.
Gómez-Pérez, A. and Corcho, O., 2002, ‘Ontology Languages for the Semantic Web’, in IEEE Intelligent Systems, January/February 2002, 54-60.
Haustein, S. and Pleumann, J., 2002, ‘Is Participation in the Semantic Web Too Difficult?’, in I. Horrocks and J. Hendler (eds), The Semantic Web - ISWC 2002, Berlin: Springer, 448-453.
Heflin, J. and Hendler, J., 2001, ‘A Portrait of the Semantic Web in Action’, in IEEE Intelligent Systems, March/April 2001, 54-59.
Hendler, J., 2001, ‘Agents and the Semantic Web’, in IEEE Intelligent Systems, March/April 2001, 30-37.
Hendler, J., Berners-Lee, T. and Miller, E., 2002, ‘Integrating Applications on the Semantic Web’, in Journal of the Institute of Electrical Engineers of Japan, 120(10), 676-680.
Klein, M., 2001, ‘XML, RDF, and Relatives’, in IEEE Intelligent Systems, March/April 2001, 26-28.
McGuinness, D.L., Fikes, R., Hendler, J. and Stein, L.A., 2002, ‘DAML+OIL: An Ontology Language for the Semantic Web’, in IEEE Intelligent Systems, September/October 2002, 72-80.
Patel-Schneider, P. and Siméon, J., 2002a, ‘Building the Semantic Web on XML’, in I. Horrocks and J. Hendler (eds), The Semantic Web - ISWC 2002, Berlin: Springer, 147-161. http://www-dbs.cs.uni-sb.de/lehre/ss03/xml-seminar/Material/PS02.pdf
Patel-Schneider, P. and Siméon, J., 2002b, ‘The Yin/Yang Web: XML Syntax and RDF Semantics’, in Proceedings WWW2002, 7-11 May 2002 (Honolulu), 443-453.
Renear, A., Dubin, D., Sperberg-McQueen, C.M. and Huitfeldt, C., 2002, ‘Towards a Semantics for XML Markup’, in DocEng’02, 8-9 November 2002 (McLean, VA), ACM Publication.
Shah, U., Finin, T., Joshi, A., Cost, R.S. and Mayfield, J., 2002, ‘Information Retrieval on the Semantic Web’, in CIKM’02, 4-9 November 2002, 461-468.
DEVELOPMENT OF THE SEMANTIC WEB
MUST BEGIN AT THE GRASS ROOTS LEVEL
AN INTERVIEW WITH JANNEKE VAN KERSEN, DUTCH DIGITAL HERITAGE ASSOCIATION, THE NETHERLANDS
By Joost van Kasteren
‘To be successful, the Semantic Web for the cultural heritage sector will have to develop from the grass roots level. A top-down approach, whereby institutions have to squeeze themselves into a certain format, is not going to work.’ Janneke van Kersen has strong views on the initiatives that are currently being undertaken to develop a Semantic Web. ‘They are not going to work for the cultural heritage institutions if you do not take into account the position that they are in. Especially not if the institutions are forced to overhaul their digitisation projects completely.’
Kersen graduated in Art History and did a postgraduate course on Historical Information Processing. Since 1999 she has been a consultant with the Dutch Digital Heritage Association (Vereniging DEN), which supports cultural heritage institutions, large and small, in developing strategies to face the digital future.
The key objectives of the DEN are to assist institutions in digitising and documenting their collections according to high quality standards, and to assure cross-domain and cross-institutional access to heritage information in a context-rich, structured environment. The methods used to realise these objectives are knowledge dissemination, best practice and standardisation. The Association propagates open standards like XML, OAI and Dublin Core (qualified).
The DEN has approximately 60 member institutions, among them most of the large heritage institutions of The Netherlands. It provides access to the databases of the member organisations through the portal http://www.cultuurwijzer.nl. The Cultuurwijzer (culture pointer) to the collections uses the Aqua Browser to search for terms in a non-hierarchical, associative way. Databases can also be accessed through subject fields based on the Dublin Core standard. Research is being carried out into applying the Art and Architecture Thesaurus in a post-coordinative way, using it as an additional search aid.
Kersen: ‘The Dublin Core has some drawbacks, but it is one of the few international standards available for exchange of information. Mapped to the 5 Ws (who, where, what, when and why), it turns out to be a nice tool for interoperability across the databases of heritage institutions. Of course, we have to accept a certain kind of fuzziness and lack of precision compared with domain-specific access at the institutional level.’
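The mapping Kersen mentions can be pictured with a minimal Python sketch. The assignment of unqualified Dublin Core elements to each of the five questions is an assumption made purely for illustration (the interview does not spell it out), and the sample record is invented.

```python
# Hypothetical assignment of Dublin Core elements to the "5 Ws".
# The grouping is an illustrative assumption, not DEN's actual mapping.
FIVE_WS = {
    "who": ["creator", "contributor", "publisher"],
    "where": ["coverage"],
    "what": ["title", "subject", "type"],
    "when": ["date"],
    "why": ["description"],
}


def to_five_ws(record):
    """Group a flat Dublin Core record (element -> list of values) by the 5 Ws."""
    view = {}
    for question, elements in FIVE_WS.items():
        values = []
        for element in elements:
            values.extend(record.get(element, []))
        view[question] = values
    return view


# Invented sample record, for illustration only.
sample = {
    "title": ["Salome with the Head of John the Baptist"],
    "creator": ["Unknown painter"],
    "date": ["1530"],
    "coverage": ["Amsterdam"],
}

if __name__ == "__main__":
    for question, values in to_five_ws(sample).items():
        print(question, "->", values)
```

Because every institution's record is reduced to the same five coarse slots, records from different databases become comparable, at the cost of the fuzziness Kersen accepts: an empty ‘why’ slot here simply means the source record had no description.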
According to Kersen, a real Semantic Web is still a long way off. ‘We simply do not have the tools yet for a meaningful exchange and representation of information. XML and RDF do not provide the interoperability that is needed. On the other hand, I do not believe in developing a fundamental ontology to give meaning to information on the Net. It looks to me like the 18th-century endeavour to write an encyclopaedia that contains all the knowledge in the world. I am afraid it does not work that way. A lot of knowledge, even scientific knowledge, cannot be described in a logical way. Especially in the arts, a lot of “knowledge” is the result of heuristics and associative thinking. Apart from that, there is the practical problem that cultural heritage institutions do not have the money and the staff to describe their collections anew in a way that fits the ontology. That is just too much work.’
Kersen thinks the Semantic Web will grow gradually from the grass roots level onward. Within the Dutch Digital Heritage Association she can point at several initiatives. Considered on their own, they might not seem like much, i.e. in relation to the gargantuan task of developing a Semantic Web, but when they are combined a certain pattern begins to emerge.
There is, for instance, a project, SitecheckerRDF, which will look into ways in which RDF can be used for describing content for Web-based delivery. Furthermore, several standardisation projects are running that enable the participating institutions to develop the description of their specific knowledge domain, e.g. graphic domain, religion, art history. The formal and semantic mapping schemes used in these projects will include Dublin Core, Encoded Archival Description (EAD), IMS Learning Resources Meta-data Specification, Art & Architecture Thesaurus, as well as the CIDOC reference model. Kersen: ‘At the moment most of the reference terms are developed at the level of institutions, which means their use is limited to a certain domain. Or, to put it another way, every domain is developing its own dialect. Maybe the development of a combined reference scheme will be a step towards a Semantic Web.’
Another important project is the development of a scheme for description at the collection level, in order to offer a clearer and more hierarchical access to heritage collections. This project has its roots in the Dutch project for collection level description, MUSIP (Museum Inventarisation Project). The description scheme will be broadened to make it available for other heritage institutions as well, for example archives.
The description and results of these projects and the programme lines will be made available on the Web site of the Vereniging DEN, http://www.den.nl. As Cultuurwijzer is used as a proof of concept, the results will be directly accessible at http://www.cultuurwijzer.nl. (Kersen kindly invites interested parties to put questions directly to their organisation.)
Projects are always carried out in co-operation with the member organisations. Kersen: ‘We first try it ourselves until we are sure it works, a “proof of concept” you could say. These tests are overseen by a small group of automation experts working for our member organisations. The next step is to test the method on a larger scale with some of our member organisations. A larger working group oversees these tests. Only then is the method released to our member organisations. The advantage is that smaller member organisations can ride on the experience of the larger ones. By taking a step-by-step approach we also enhance the level of commitment. You could say we are providing some order in the information chaos that exists on the Internet. A few small steps along the long road towards the Semantic Web.’
Vereniging Digitaal Erfgoed Nederland: http://www.den.nl
Cultuurwijzer: http://www.cultuurwijzer.nl
DIGICULT’S EXPERT 13 TANGLE
WITH THE SEMANTIC WEB
By Michael Steemson
Genesis – The Creation Birds and Fishes
It was really very kind of the Moderator. He asked the cultural heritage experts: ‘What is the message that we should give our scientific writer, who will do a write-up of this meeting for the Thematic Issue?’ Their inclinations had not always been clear. But the question focused minds and they were certain now.
‘I would put my money on the Semantic Web’, said two of them, not quite in unison. ‘The Semantic Web is a direction, it is like North. You go North but you never arrive and say “here it is”. This is the Semantic Web’, said another.
The course of the debate at the Darmstadt DigiCULT Forum had not always been so direct. It had started with the Position Paper’s dismaying thought that ‘the limited understanding of information processing in the heritage sector almost makes the Semantic Web an impossibility to apply’.
It had touched on the semantics of Simeon poetry, art works of the biblical seductress Salome, weather forecasts for the northern English city of York, and the revolutionary theories of 16th-century Italian astronomer Galileo. But the experts agreed, finally, that the cultural heritage sector needed the Semantic Web and that a good deal of education and guidance would be required to make it appreciate that need.
The experts numbered 13, lucky for some, at this, the third DigiCULT Forum of the European Union’s technology watchdog for cultural and scientific heritage institutions. In the previous 12 months, other forum groups had discussed authenticity and integrity for digitisation programmes and, later, digital asset management systems. Now the Darmstadt 13 - historians, language and information technology scientists, academics and publishers - were looking even further down the information autobahn, to the vision of WWW inventor, Englishman Tim Berners-Lee, who sees a new kind of automated Web that learns and understands each user’s particular requirements and delivers complete, reliable, tested information sets.
In a co-authored May 2001 Scientific American article¹, Mr Berners-Lee imagines a family facing the horrors of re-scheduling its lives around a mother’s unexpected illness. The sons and daughters rely on Semantic Web ‘agents’, small executable Web files, to search online medical records, hospital bed lists, transport timetables, doctors’ appointment books, road condition reports and home diaries to find treatment, plan travel and re-arrange personal engagements to fit the emergency.
The vision requires huge world-wide investment in time and effort, creating countless ‘ontologies’ containing, perhaps, XML (eXtensible Mark-up Language) and RDF (Resource Description Framework) data, to which the electronic ‘agents’ could refer for understanding before applying to specially formatted Web pages for the information.
The Berners-Lee et al. dazzling forecast is: ‘The Semantic Web will enable machines to comprehend semantic documents and data, not human speech and writings. Properly designed, the Semantic Web can assist the evolution of human knowledge as a whole.’
THE DAZZLING PROSPECTS
Dazzling it is, and the Darmstadt 13 were attracted. But they were not blinded. Moderator Dr Seamus Ross, the Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII), suggested that, while the current WWW content got a lot of ‘bang’ for its development dollars, the Semantic Web needed huge, expensive content before it could work well.
Application of the Berners-Lee ideas to cultural heritage use was a long way off, he thought, and wondered: ‘Is there enough benefit from the Semantic Web in the near term to make it a realisable dream 50 years down the road?’
Italian National Research Council Applied Ontologies Laboratory director Nicola Guarino has been working on the subject for 12 years and he knows the difficulties. He said: ‘This is the ideal view which Tim Berners-Lee has: machines which work for you, your proxy which works for you, performing these dynamic connections for the Web which preserve meaning. It is pretty ambitious, but this is his idea. I would be happier if, rather than using an automatic proxy, we could just let people establish these dynamic connections using their brain and the Web. This is already something that is not easily done.’
Austria’s Wernher Behrendt had encountered other snags. At Salzburg Research, the secretariat for the DigiCULT Forums, he co-ordinates another European Commission IST project, CULTOS (Cultural Units of Learning - Tools and Services). He conceded: ‘There is a 50-year research vision behind the issue of the Semantic Web’, and went on: ‘but there are incremental steps that, with good utility, can be built in a reasonable time. One of the intellectual challenges is to break the vision into these manageable steps.’
LANGUAGE REPRESENTATION HITCHES
A CULTOS group had, Behrendt explained, taken one of these incremental steps and built an ontology² for digitised works of art. It had encountered problems with language representation, like: ‘Are there tools to support knowledge representation language? Are the users then actually able to work usefully with that? Can we incorporate the multimedia authoring component, where people who have not built the
1 T. Berners-Lee, J. Hendler, O. Lassila: The Semantic Web. In: Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html
2 Ontology n. 1. Philosophy. The branch of metaphysics that deals with the nature of being. 2. Logic. The set of entities presupposed by a theory. Collins English Dictionary, Third edition, Glasgow 1991
Cultural Units of Learning – Tools and Services (CULTOS)
CULTOS is an RTD project co-funded by the European Commission under the Information Society Technologies (IST) Programme, which will run until October 2003. The application domain of CULTOS is intertextual studies in literature and arts. The project is developing a multimedia authoring and presentation environment that allows scholars to make the different relationships between cultural works explicit in a way that approximates contextualisation in interpretative processes. The result of the authoring processes is multimedia objects called ‘intertextual cultural threads’. These are based on ‘EMMOs’, a novel type of structured multimedia object containing expert knowledge, conforming to current and emerging standards such as XML, SMIL (with interactive extensions), MPEG-7 and RDF.
http://www.cultos.org
ontology themselves should then use it to combine multimedia assets with each other?’
The Dutch had doubts too. Dr Janneke van Kersen is an art historian with Digital Heritage Netherlands (Digitaal Erfgoed Nederland, http://www.den.nl), where an XML-based content management system is being combined with a Resource Description Framework (RDF) to join databases from several cultural heritage institutions. She told the experts: ‘I need to be assured that we will be able to build a layered structure that is equally applicable to each knowledge domain. Furthermore, I think that the cultural heritage sector is too much of a niche market to develop this.’
Her countryman, Dr Frank Nack, from the national research institute for mathematics and computer science, CWI (Centrum voor Wiskunde en Informatica, http://www.cwi.nl) in Amsterdam, works with a multimedia and human computer interaction group. His concern: ‘Our group believes in the Semantic Web, but we needed some mechanisms to structure the information so that various groups can work with it. What we came up with was the belief that you can classify the user at a particular time. But that was simply not good enough.’
The group had found that users change their requirements widely and these shifts were invisible to a system. ‘Humans can look at material one day and the next day they look at the same stuff differently and describe it differently because they are in a different mood’, he said. He characterised the problem as: ‘Now I would like to see something for my work and now I want to be entertained. Which means I would like to access the information differently.’
He had one other worry: Webised mixed media. He said: ‘This discussion has been heavily linguistic based, which I can understand because most people do still think of the Web as text driven. The issue of describing various media items that are not text will, I think, very soon become important for the Semantic Web. We had better start thinking about that too.’
THE GALILEO CONUNDRUM
The Institute and Museum of Science History in Florence, Italy, has tried to create an ontology around the works and sciences of its city’s famous son, the revolutionary astronomer, mathematician and physicist Galileo Galilei (1564-1642)³. But it ran into difficulties when it came to the radical changes in theory that he created.
Institute relational database expert Andrea Scotti told the Forum: ‘Galileo’s scientific theory negates another scientific theory. This negating or development of theories was very difficult to represent in the ontology when dealing with the time factor. That is central to historical documentation, but representational time was not part of the process available to us.’
Dr Costis Dallas, the Athens chairman of the European communication and technology group Critical Publics, thought the Florence museum’s project was very ambitious. The time argument was difficult because ‘of course, it isn’t possible to represent time properly within a relational database’. But there were mechanisms - he mentioned software by the Virginia, US, IT group Telos⁴ - that better represented issues of time.
Italian National Research Council’s Nicola Guarino chipped in: ‘The CIDOC⁵ reference models have partial answers to these questions.’
Dr Dallas went on: ‘I do not believe that the whole exercise is futile, but we found that in practice you cannot make a subject language for everybody. It has to be for a community of users. If you provide them with a richer representation, for instance if they can know that this is a person and this person lived in a place, then users will have a much richer experience.’
3 The museum and Web site are rich resources for the life and work of Galileo: http://galileo.imss.firenze.it
4 Telos Corporation, http://www.telos.com, Ashburn, VA, US
5 CIDOC: International Committee for Documentation of the International Council of Museums, http://www.willpowerinfo.myby.co.uk/cidoc/CIDOCe (ICOM-CIDOC). Forum for documentation interests of museums and related organisations, one of 25 international committees of the International Council of Museums (ICOM)
CAPITALS AND ACRONYMS
The Forum moved on to a discussion of the science and software behind the Semantic Web. Moderator Seamus Ross started by questioning whether ‘key thinkers’ were ‘missing a fundamental point: that Web pages are dead, that database-driven Web pages are the future, that people are going to stop making Web pages and make databases’.
He recalled that James A. Hendler, co-author with Tim Berners-Lee in the Semantic Web paper and a professor in the Department of Computer Science at the University of Maryland, had recommended semantic representation as part of any Web pages. Dr Ross commented: ‘This notion of the Semantic Web is not going to work with these databases. Is this true or not?’
Bert Degenhart-Drenth, the managing director of Netherlands ADLIB Information Systems (http://www.nl.adlibsoft.com), thought it was true. But it was a practical problem that would be solved
Open Archives Initiative (OAI) Protocol for Metadata Harvesting
The Open Archives Initiative (OAI) is a mainly US-based group of people and organisations that evolved out of a need to increase access to scholarly publications through interoperable digital repositories. Support for the OAI’s goals comes from the Digital Library Federation, the Coalition for Networked Information, and from an NSF Grant. One of its major achievements is an application-independent interoperability framework based on metadata harvesting, the OAI Protocol for Metadata Harvesting.
The OAI Protocol is based on the standard Web protocols HTTP and XML, and employs Dublin Core (unqualified) as metadata standard. Heritage organisations who have systems that support the OAI Protocol can expose metadata about the content in their repository, i.e. allow service providers to harvest the data for services such as search engines. In the OAI Protocol, the XML schema is used at two levels: to define the format of responses to all OAI Protocol requests, and to define the format of metadata streams embedded in the GetRecord and ListRecords responses. In both cases the goal is to provide a mechanism for data validation.
http://www.openarchives.org
The OAI-PMH version 2.0, released in June 2002, can be found at http://www.openarchives.org/OAI/2.0/openarchivesprotocol.htm
See also John Perkins: A New Way of Making Cultural Information Resources Visible on the Web: Museums and the Open Archives Initiative. Museums and the Web Conference 2001, http://www.archimuse.com/mw2001/papers/perkins
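To make the harvesting idea concrete, here is a minimal Python sketch of what a service provider might do: build a ListRecords request and pull identifiers and Dublin Core titles out of the XML response. The repository address and the sample record are invented for illustration, and the response is trimmed (a real OAI-PMH reply also carries elements such as responseDate and request); only the verb, the metadataPrefix parameter and the XML namespaces come from the protocol itself.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# XML namespaces used by OAI-PMH 2.0 responses carrying unqualified Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build a ListRecords request URL for a repository's base URL."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


def extract_titles(response_xml):
    """Return (identifier, title) pairs from a ListRecords or GetRecord response."""
    root = ET.fromstring(response_xml)
    pairs = []
    for record in root.iterfind(".//oai:record", NS):
        identifier = record.findtext(
            "oai:header/oai:identifier", default="", namespaces=NS
        )
        title = record.findtext(".//dc:title", default="", namespaces=NS)
        pairs.append((identifier, title))
    return pairs


# Trimmed, invented response; a real harvester would fetch this over HTTP.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:museum.example.org:42</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Telescope lens</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    print(list_records_url("http://museum.example.org/oai"))
    print(extract_titles(SAMPLE_RESPONSE))
```

The sketch skips schema validation, resumption tokens and error responses, all of which a production harvester would need to handle.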
Open Archives Forum Project, http://www.oaforum.org
The Open Archives Forum Project is a two-year Fifth Framework Programme IST accompanying measure that will run until September 2003. The Forum is building a Web-based database on OAI-related projects, software implementations and services, and supports the information exchange between OAI user communities.
Their surveys provide good insight into the status of uptake of the OAI in Europe. For an overview of the results see S. Dobratz, B. Matthaei: Open Archives Activities and Experiences in Europe: An Overview by the Open Archives Forum. In: D-Lib Magazine, Vol. 9, No. 1, January 2003, http://www.dlib.org/dlib/january03/dobratz/01dobratz.html
Recently, one of their workshops, ‘Providing Access to Hidden Resources’ (Lisbon, December 2002), targeted the libraries and archives communities. Requirements, standards, best practice and solutions to interoperability problems of these communities were analysed and compared with the features provided by the OAI Protocol for Metadata Harvesting. The Tutorial ‘OAI and OAI-PMH for Beginners’ and other presentations can be found at http://www.oaforum.org/workshops/lisb_programme.php
by projects like the Web Services of the Open Archives Initiative (OAI).
Several members spoke of difficulties created by dynamic Web pages generated from ASP databases. Paul Miller, the UK Interoperability Focus for the University of Bath’s UKOLN (formerly the UK Office for Library Networking) project, said the UK’s
Office of the e-Envoy⁶, the agency in charge of Britain’s e-Government programme, drove it with a Web site called Govtalk⁷. ‘One big ASP database’, he called it. It carried the Government’s interoperability standard framework, with a URL of ‘a great big long string, something, something, dot ASP’. ‘That is how this key piece of legislation is referred to, and next week it will be something else.’
Dr van Kersen thought the new Web Ontology Language, with the transposed acronym OWL, would help. Italian ontology researcher Nicola Guarino discussed the European Commission’s On-To-Knowledge project, its RDF tool Sesame⁸, and the Ontology Inference Layer (OIL) language.
Dr Ross described the US Defense Department’s DARPA (Defense Advanced Research Projects Agency) Mark-up Language (DAML) programme and its DAML+OIL variant. Mr Degenhart-Drenth highlighted the importance of protocols used in Web Services, such as SOAP (Simple Object Access Protocol) and UDDI (Universal Description, Discovery and Integration), and also pointed to a SPECTRUM XML standard for museums that he helped write. Then there was the moviemakers’ audio-visual search standard, MPEG-7, and the SMIL (Synchronized Multimedia Integration Language). No one mentioned SHOE (Simple HTML Ontology Extensions), which was surprising as the discussion became more and more alphabetic and upper case.
ONTOLOGY TUTORIAL IN 800 WORDS
Nicola Guarino brought the discussion back on track. The ontology expert delivered a fascinating, impromptu, 800-word dissertation on ontology genetics.
Ontologies, he said, started because it was realised that controlled vocabularies, which worked well enough for limited periods, needed something extra to make them really useful. They needed clarification of intended meaning. This could be achieved in much the same way as dictionaries did it, by reference to other, more basic terms. This was, he said, the key point.
‘Ontologies can work if the basic terms are really used in a principled way. There is a hidden assumption here that it is indeed possible to express the meaning in terms of a relatively small set of primitive terms.’
He explained further: ‘There are general terms that have a universal meaning. Take the term “part”, for instance, or “set”. Or take temporal relations: “before”. Suppose I have two different periods, the Renaissance and another period, and suppose I say that this period comes “before” the other one: do you exclude the case whereby the two periods overlap, or not? This is just a matter of stipulation; this is a general term that is not domain specific. I can simply stipulate exactly whether the “before” relationship between the two intervals includes the case of overlapping or not. And I can do that by means of axioms. Once you clarify the meaning of the basic terms (topological relations, mereological⁹ relations, dependence relations, these kinds of things), then you have the basic vocabulary that helps you to introduce more domain-related things. And this is what people are doing in the area of what are called “Foundational Ontologies”, and this is what I am doing. I believe this is the only way to solve the problem of semantic interoperability. So, not just controlled vocabularies, but vocabularies that are formally defined in minimal terms.’
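Guarino's point about stipulating the meaning of ‘before’ can be illustrated with a short Python sketch. The two functions below are two possible stipulations, one excluding and one admitting overlap; the function names, the integer-year representation and the period dates are assumptions chosen for illustration, not definitions taken from any foundational ontology.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Period:
    name: str
    start: int  # first year of the period
    end: int    # last year of the period


def before_strict(a, b):
    """'Before' stipulated to exclude overlap: a must end before b starts."""
    return a.end < b.start


def before_allowing_overlap(a, b):
    """'Before' stipulated to admit overlap: a starts and ends earlier than b."""
    return a.start < b.start and a.end < b.end


# Rough, invented dates used only to produce an overlapping pair.
renaissance = Period("Renaissance", 1400, 1600)
baroque = Period("Baroque", 1580, 1750)

if __name__ == "__main__":
    print(before_strict(renaissance, baroque))
    print(before_allowing_overlap(renaissance, baroque))
```

With these sample dates the two stipulations disagree, which is exactly why a shared vocabulary has to fix one of them by axiom before two institutions can exchange statements such as ‘period A comes before period B’.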
Now it was clear, but would it be available to heritage institutions? Dr Ross wanted to know if fundamental ontologies of use to the heritage sector already existed. What would they be? Before any
On-To-Knowledge
On-To-Knowledge is an IST RTD project that was completed in June 2002. The project developed tools and methods for supporting knowledge management in large and distributed organisations. The technical backbone of On-To-Knowledge was the use of ontologies for the various tasks of information integration and mediation. For the project’s many results, see their tools repository, project deliverables and publications at http://www.ontoknowledge.org/pubs.html
See also the On-To-Knowledge book ‘Towards the Semantic Web: Ontology-driven Knowledge Management’, J. Davies, D. Fensel, F. van Harmelen (eds), John Wiley, December 2002
6 Office of the e-Envoy: http://www.e-envoy.gov.uk
7 Govtalk: http://www.govtalk.gov.uk
8 Sesame environment: http://www.ontoknowledge.org/tools/factsheetSesame.html
9 Mereology n. The formal study of the logical properties of the relation of part and whole. Collins English Dictionary, Third edition, Glasgow 1991
development could begin, basic classifications and methodologies would be required to form a foundation for the work.
Italian online publisher Marco Meli, a manager of the EU’s MESMUSES (Metaphors for Science Museums) project, insisted: ‘You need a clear definition in this particular domain. What are the key words, the terms?’
Mr Guarino answered: ‘The key concepts and the key relations.’
THE ELUSIVE SEMANTIC GRAIL
By now the Darmstadt 13 were beginning to realise they were not having much luck with their search for the Semantic Grail. Their discussion became a little tetchy.
Someone talked about ‘meta-ontology’ and another growled: ‘The term “meta” is abused.’
‘Outdated technical optimism’ was mentioned.
‘What is your alternative?’ someone else wanted to know.
‘I don’t have one.’
‘Is this the way forward?’
MESMUSES - Metaphors for Science Museums
MESMUSES is an IST RTD project that will run until July 2003. It aims at designing a general method and supporting tools to produce knowledge maps for use in self-learning environments of science museums. In the project, a knowledge map is defined as a set of related concepts and facts that is offered to learners with some guidance or suggestions on possible itineraries that they may follow to explore the knowledge space.
The method and tools developed in MESMUSES are being tested and validated by two large science museums, the Cité des Sciences et de l’Industrie in Paris and the Istituto e Museo di Storia della Scienza in Florence, which provide access to their digital catalogues. Both museums are developing knowledge maps and itineraries on different themes in Biology (Genome) and Physics (Galileo and the laws of motion).
Project Web site: http://cweb.inria.fr/Projects/Mesmuses
See also M. Meli: Knowledge Management: a new challenge for science museums. In: Cultivate Interactive, Issue 9, 7 February 2003, http://www.cultivate-int.org/issue9/mesmuses
‘What part of the problem would that solve?’
‘We will know when we have tried that out.’
‘Here we are approaching a scary field.’
Civility and peace were restored as Frank Nack, the CWI Netherlands scientist, introduced the thought: ‘There are ontologies for art and they are very old and well crafted. There are very clear rules about why they did what they did, because they have worked on them for a thousand years. What you could suggest is that we strip down to the basics for one field, say art, and apply it to all the other fields we have in cultural heritage: architecture, film, whatever, all working with very different substances.’
Seamus Ross added: ‘So we need one fundamental ontology on which we can build all the others.’
THE LUCK CHANGES
The luck of the 13 was beginning to change. The Athenian heritage informatics expert Dr Dallas described work among his company clients on developing ‘an upper ontology’. All the issues the Forum had been discussing (what to do about time, basic concepts, process, agents and so on) were being examined. They were beginning to develop ‘something very much like a thesaurus’, with term expansion that created sub-categories of relationships. He called it ‘generic layering’, a process that could identify the ‘generic grammar’ of relationships within a specific domain - art history, for example.
‘This is useful’, he said. ‘This way we can create Web systems that present an association of content for users that is meaningful to them. Let’s say, a “cultural” meaning.’
Nicola Guarino went further. He believed that the International Council of Museums’ Committee for Documentation (CIDOC) Conceptual Reference Model (CRM) was the ‘best starting point’ for the heritage community. The CIDOC CRM is the result of 10 years’ work by a standards working group. The model is under review for adoption as an International Standards Organisation (ISO) publication.
Mr Guarino was enthusiastic: 'I am not biased on this. I am a reviewer on the CHIOS project that supports this proposal for an ISO standard and I am amazed by the fact that this standard is more or less principled. On the one hand, the authors really strive to get these principled things and, on the other hand, they have extensively accounted for existing practice. It is the result of a large community of work. It is not perfect, but it really is a starting point for this community.'
He affirmed Dr Ross's delighted question: 'So this is an ontology we can borrow?'
And the Italian expert had more to add: 'The question before was "how can we be sure that the principled things can really solve our problems?" I do not have a crisp answer, but I do have some evidence that even a tiny result on the foundational side has a high pay-off. So you do not need to solve all the foundational issues.'
The distinctions between an object and its role, or between individuals and classes of items, were delicate but, once understood, could lead to significant data improvement, he said, adding: 'Take, for instance, the distinction between object and event. This is only one tiny distinction, but it is so fundamental that once you understand it you can save time in developing your own application ontology. Tiny conceptual progress does have a high pay-off. This is why I believe it is useful.'
There were still one or two doubts, but CULTOS project co-ordinator Wernher Behrendt tidied up with a daring stance: 'Let me be a heretic for a second. How many of us have an operating system other than Windows? What I am saying is that standardisation often helps. Even if it is not the best standard, it does help get people working together. It is perfectly fair to define a standard for the world now. There will be a lot of discussion, but it will wind down to a few constructs. It is a better method of getting an ontology accepted than having ontologies mushrooming all over the place that must then be integrated.'
The Darmstadt 13 were pleased. They had a model. They had 'Web Services' stepping stones to work across. They weren't going to fall into the trap of insisting, just yet, that the Semantic Web was important for the heritage sector, but they wanted an education process for unconvinced curators.
They needed someone to make the first ontology move. Seamus Ross suggested asking the J. Paul Getty Trust (http://www.getty.edu). They needed automated tools for testing ontologies. They did not want to be delivered into the entertainment industry, but saw benefit in what University of Florence Associate Professor Franco Niccolucci called 'cultural entertainment'.
So where would the experts put their money, asked Mr Behrendt: 'On there not being any benefits in the Semantic Web for the cultural heritage sector, or there being some benefits in building such things, whatever they may be?'
Amsterdamer Frank Nack was in no doubt: 'It is going to happen. It will probably look very different from how we imagine it right now, but it is going to happen.' His countryman, ADLIB chief Bert Degenhart-Drenth, thought so too: 'We have put our money there. All our applications work with XML.' And Dr van Kersen agreed: 'I would put my money on the Semantic Web.'
That seemed to make it game, set and match.
CIDOC CRM: 'The Semantic Glue'
The 'CIDOC object-oriented Conceptual Reference Model' (CIDOC CRM) was developed by the ICOM/CIDOC Documentation Standards Group. Since September 2000, the CIDOC CRM has been under development as an ISO standard.
'The CIDOC CRM is intended to promote a shared understanding of cultural heritage information by providing a common and extensible semantic framework to which any cultural heritage information can be mapped. It is intended to be a common language for domain experts and implementers to formulate requirements for information systems and to serve as a guide for good practice in conceptual modelling. In this way, it can provide the semantic glue needed to mediate between different sources of cultural heritage information, such as that published by museums, libraries and archives.'
http://cidoc.ics.forth.gr/what_is_crm.html
CHIOS - Cultural Heritage Interchange Ontology Standardization project
Since June 2001, the work of the CIDOC CRM Special Interest Group has been supported by CHIOS, a two-year project which receives funding from the Fifth Framework IST Programme. The CHIOS consortium forms an integral part of the CIDOC CRM Special Interest Group which, by organising shared meetings, represents the interests and requirements of the cultural heritage community to the ISO Working Group (ISO/TC46/SC4/WG9).
http://cidoc.ics.forth.gr/chios_iso.html
Annotation & Authoring
'How to annotate and have fun', http://annotation.semanticweb.org/faq/faq2, and other papers. See also http://annotation.semanticweb.org/tools, Institute for Applied Informatics and Formal Description Methods, University of Karlsruhe, Germany.
The Annotea Project, a 'Live Early Adoption and Demonstration (LEAD)' project of the World Wide Web Consortium (W3C): a collaboration environment with shared annotations, www.w3.org/2001/Annotea
CIDOC CRM
The CIDOC Conceptual Reference Model was an important reference point in the Forum discussion (see also the information box on the CIDOC CRM in the Summary). CIDOC, the Comité International pour la Documentation, is part of the International Council of Museums (ICOM). Its Web site can be found at http://cidoc.ics.forth.gr
A report on CIDOC's work on the CRM is provided in 'The CIDOC Conceptual Reference Model: A Standard for Communicating Cultural Contents' by Nick Crofts, Martin Doerr and Tony Gill, in Cultivate Interactive, Issue 9, February 2003, http://www.cultivate-int.org/issue9/chios
See also their tutorials at http://cidoc.ics.forth.gr/tutorials.html
DAML - DARPA Agent Markup Language
The Defense Advanced Research Projects Agency (DARPA) is the central research and development organisation for the US Department of Defense. Its DAML programme is developing a language and tools to facilitate Semantic Web concepts, http://www.daml.org
'Why Use DAML?', Adam Pease, white paper, Teknowledge, 10 April 2002, http://www.daml.org/2002/04/why.html
DAML+OIL
DAML+OIL is a product of the DARPA Joint United States/European Union ad hoc Agent Markup Language Committee. The committee created a language with the best features of SHOE, DAML, OIL and several other markup approaches. It is a Web ontology language (latest release March 2001) expected to provide a basis for future Web standards for ontologies. See http://www.daml.org/2001/03/daml+oil-index.html
Genesis - The Creation, Stars and Fishes
SEMANTIC WEB TERMS AND READING LIST A-X
Compiled by Guntram Geser and Michael Steemson
In the Summary, we have provided information boxes on projects, but not on the many Semantic Web standards, technologies, etc. mentioned in the Forum discussion. The projects included are only a small fraction of many ongoing activities and are related mainly to the cultural and scientific heritage community. Links to many important Semantic Web development projects can be found at their community portal, http://www.semanticweb.org
The following guide points to resources and readings on terms mentioned in the Forum Summary. It is not intended to provide a comprehensive list of Semantic Web materials. Rather, it represents different entry points and levels to this topic.
OIL - Ontology Inference Layer
OIL is a proposal for a Web-based representation and inference layer for ontologies. It uses a layered approach to defining a standard ontology language. Each layer adds functionality and complexity to the previous layer. This is done such that machines that can only process a lower layer can still partially understand high-level ontologies. See http://www.ontoknowledge.org/oil
A white paper on OIL functions: 'An informal description of Standard OIL and Instance OIL' by the On-To-Knowledge group, led by the Department of Computer Science, University of Manchester, UK, 28 November 2000, www.ontoknowledge.org/oil/downl/oil-whitepaper.pdf
Ontologies
Ontology research: Laboratory for Applied Ontology (LOA), Institute of Cognitive Sciences and Technology (ISTC), http://www.ladseb.pd.cnr.it/infor/ontology/ontology.html
Dieter Fensel: Ontologies: A Silver Bullet for Knowledge Management and Electronic Commerce. New York: Springer, 2001.
'Ontology Infrastructure for the Semantic Web', includes a detailed subject bibliography. WonderWeb Project, Department of Computer Science, Victoria University of Manchester, UK, http://wonderweb.semanticweb.org/deliverables/documents/D15.pdf
Standard Upper Ontology (SUO): An upper ontology for data interoperability, information search and retrieval, automated inferencing and natural language processing. IEEE Standard Upper Ontology (SUO) Working Group, http://suo.ieee.org
OWL - Web Ontology Language
The Web Ontology Language is a semantic markup language for publishing and sharing ontologies on the World Wide Web. OWL is developed as a vocabulary extension of the Resource Description Framework (RDF) and is derived from the DAML+OIL Web Ontology Language. For the development of this language, see the documents of the Web Ontology (WebOnt) Working Group, http://www.w3.org/2001/sw/WebOnt
See also OWL Web Ontology Language Reference (W3C Working Draft, 31 March 2003), http://www.w3.org/TR/owl-ref, and OWL Web Ontology Language Guide (W3C Working Draft, 31 March 2003), http://www.w3.org/TR/owl-guide
For an understanding of the goals, requirements and usage scenarios for a Web ontology language, see 'Web Ontology Language (OWL) Use Cases and Requirements' (W3C Working Draft, 31 March 2003), http://www.w3.org/TR/webont-req
RDF - Resource Description Framework
See Cultural Heritage Semantic Web Example & Primer, pp. 32-34.
Semantic Web
'The Semantic Web', Tim Berners-Lee with James Hendler and Ora Lassila, Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html
'Enhanced Science and the Semantic Web', J. A. Hendler, in Science magazine, Volume 299, Number 5606, 24 January 2003, pp. 520-521.
'Peer-to-Peer: The Infrastructure for the Semantic Web', Stanford University. The Semantic Web as the next evolutionary step of the Internet, http://p2p.semanticweb.org
The Semantic Web Community Portal, SemanticWeb.org, currently operated by three research groups: the Onto-Agents and Scalable Knowledge Composition (SKC) Research Group at Stanford University, the Ontobroker-Group at the University of Karlsruhe, Germany, and the Protégé Research Group at Stanford University, http://www.semanticweb.org
The W3C Semantic Web Activity Statement explains the consortium's plans in the areas of enabling standards (driven by the RDF Core and Web Ontology Working Groups), education and outreach (RDF Interest Group), as well as co-ordination and advanced development, http://www.w3.org/2001/sw/Activity
See also Kim Veltman's warning of what he sees to be too narrow a definition of the Semantic Web, one that will not allow the historical dimension, the richness of cultural expression, the unique and the diversity of interpretations to be adequately dealt with. Cf. K. Veltman, Challenges for a Semantic Web (July 2002), http://www.cultivate-int.org/issue7/semanticweb
SHOE - Simple HTML Ontology Extensions
SHOE was one of the first ontology-based markup languages developed for use on the World Wide Web. It is a small extension to HTML that allows Web page authors to annotate their Web documents with machine-readable knowledge. See http://www.cs.umd.edu/projects/plus/SHOE
SMIL - Multimedia on the Web
The Synchronized Multimedia Integration Language (SMIL, pronounced 'smile') enables authoring of interactive audiovisual presentations. SMIL is typically used for 'rich media'/multimedia presentations which integrate streaming audio and video with images, text or any other media type. SMIL is an HTML-like language and may be written using a simple text editor. W3C Synchronized Multimedia, http://www.w3.org/AudioVideo
SPECTRUM-XML DTD
SPECTRUM (Standard Procedures for Collections Recording Used in Museums) was created by the mda (http://www.mda.org.uk). It is a guide to good practice for museum documentation that describes procedures for documenting objects and the processes they undergo, as well as the information that needs to be recorded to support the procedures. For SPECTRUM, an XML Document Type Definition has been produced which serves as a system-neutral interchange format for museum data.
'SPECTRUM: The UK Museum Documentation Standard' is available in its second edition, see http://www.mda.org.uk/spectrum.htm
For a description of the creation, structure and deployment of the SPECTRUM-XML DTD, see Bert Degenhart-Drenth: Building on the mda SPECTRUM-XML DTD for Collections Management Data Interchange, Museums and the Web Conference 2001, http://www.archimuse.com/mw2001/papers/degenhart/degenhart.html
Web Services
Although the concept of Web Services is not a new one, the definition of Web Services standards allows the wider use of these services. Currently, three main protocols are used in the context of Web Services: UDDI (Universal Description, Discovery and Integration), a registry system to find resources and Web Services; WSDL (Web Services Description Language), an interface description language; and SOAP (Simple Object Access Protocol), the communication protocol for Web Services. All three protocols are based on XML.
For a description of the many ways in which XML can enhance Web Services, see http://www.w3.org/2002/ws/Activity
World Wide Web
Looking back as well as into the future: 'Weaving the Web: The Original Design and Ultimate Destiny of the World Wide Web by Its Inventor', Tim Berners-Lee with Mark Fischetti, Harper San Francisco, 1999.
XML - eXtensible Markup Language
See Cultural Heritage Semantic Web Example & Primer, pp. 27-30.
For the work done at W3C within the XML activity, see XML Working Groups, http://www.w3.org/XML; the XML Specifications can be found at http://www.w3.org/XML/Core
'The bane of my existence is doing things that I know the computer could do for me.' - Dan Connolly, The XML Revolution, October 1998, http://www.nature.com/nature/webmatters/xml
'The Semantic Web, as it is advocated by people like Tim Berners-Lee and James Hendler, does not take enough advantage of the experience built up in knowledge engineering and conceptual modelling. There is this anarchistic idea of the Web as a place where everyone can do his or her own thing. I have no problem with that; a lot of people are able to find what they want on the Web. But if you want real interoperability, with search engines that can grasp the intended meaning of information, that approach falls short. To create a real Semantic Web, we have to develop and use well-founded generic ontologies based on linguistics and logics.'
Nicola Guarino has clear views on the Semantic Web and its development. He is a senior researcher at the Institute for Cognitive Sciences and Technologies in Italy, where he leads the Laboratory for Applied Ontology. Since 1991, he has played an active role in the Artificial Intelligence community in promoting the interdisciplinary study of ontological foundations of knowledge engineering and conceptual modelling. Guarino: 'In our Laboratory the focus is on content and not so much on representation. The use of ontologies is unavoidable when referring to content. People do it implicitly all the time when they are communicating and trying to understand each other. If we want machines to understand each other, in other words real interoperability, we need to make these ontologies explicit in an unambiguous way.'
An ontology is a hierarchical description of the relations between concepts in a certain domain, plus an unambiguous description of the concepts themselves. As they are created for a certain domain, ontologies often fail to be interoperable because of the ambiguity that results from the use of the same terms for different concepts (and vice versa) between different domains. The term 'net', for instance, has quite a different meaning for Web designers and fishermen. That is why there is a need for well-founded generic ontologies. An example of a notion that a generic ontology has to pin down is the term 'part', which can have different meanings both within a domain and between domains. For instance, the violist plays a part in the orchestra. His finger is part of him. Can his finger be part of the orchestra? According to Guarino, this is a genuine ontological problem that can only be solved by giving an unambiguous meaning to the term 'part'.
Another example cited by Guarino is the term 'in'. What exactly are you describing when you say the spoon is in the cup? Does it mean that the spoon is totally embedded in the cup, or is it only partly in the cup? Guarino: 'These examples seem trivial, but if you want real interoperability between different knowledge domains you will have to prevent the problems that come with the ambiguity of day-to-day language.'
In this respect, Guarino thinks it is a drawback that computer science curricula scarcely ever contain an introduction to ontological foundations of conceptual modelling. 'Students learn all about Java, HTML and C++, and name all the other languages, and they also learn how to use these. But when they graduate, they hardly know a thing about formal ontology. I really think people should know more about the work on ontology that has been done in philosophy. It is certainly not much harder to acquire than, say, studying differential equations or learning how to use Java.'
It seems as if it is an enormous job to develop well-founded generic ontologies, but it is not as enormous a task as it appears. Guarino: 'I would say that a few dozen would get you on the way nicely. But you have to take the fundamental route. At the moment, development of the Semantic Web is driven by the need for short-term results. Hence, interoperability is realised by putting the right tags on the information. That is not what I call semantics; that is syntax. XML and RDF are very useful for this, but they fall short when you want to create a real Semantic Web.'
Laboratory of Applied Ontology: http://ontology.ip.rm.cnr.it
SEMANTIC WEB SHOULD BE BASED ON WELL-FOUNDED ONTOLOGIES
AN INTERVIEW WITH NICOLA GUARINO, LABORATORY OF APPLIED ONTOLOGY, ITALY
By Joost van Kasteren
The Semantic Web, according to a statement of its well-known advocate Tim Berners-Lee, is a vision of 'a distributed machine which should function so as to perform socially useful tasks'.[1] This machine should allow intelligent software agents to understand semantic relationships between Web resources in order to seek relevant information and perform transactions for humans.
Contrasted with the existing human-readable Web, the Semantic Web is envisaged as a Web of machine-readable data that will be based on 'languages for expressing information in a machine processable form'.[2] Key to an understanding of the Semantic Web, therefore, is how these languages function: how information is expressed in order that computers can automatically process Web sources and assist in making the Web more useful for humans. The aim of this chapter is to provide an overview of the Semantic Web concept by describing its general architecture, i.e. the interplay of its languages.
The chapter has two interrelated parts. Part 1 describes a Finnish project that strives to build the foundations for the "Finnish Museums on the Semantic Web" (FMS), a future semantic museum portal. This part consists of the information boxes on the following pages, which briefly describe the necessary elements and steps in the set-up of the FMS system. It is recommended to start by reading this description (see also graphic 3 on page 36, which provides an overview of the set-up of the FMS system). It should be helpful in gaining a general understanding of how semantic interoperability of, and new ways of interacting with, semantically marked-up cultural heritage information can be realised.
Part 2, the texts below the information boxes, is a primer that explains terms used in part 1 which represent core elements of the Semantic Web architecture, as well as providing illustrative examples. The explanations are not intended to give in-depth definitions of these elements; such definitions are provided in the relevant W3C specifications. The examples have been kept as simple as possible, but build on each other. In this way, we will develop a (fictitious) Website, http://www.m-i.org, that provides semantically enhanced access to such marvellous medieval images as the ones we have used to illustrate this Thematic Issue.
How to Make Collection Metadata of Museums Semantically Interoperable on the Web - The "Finnish Museums on the Semantic Web" (FMS)
The Semantic Web concept is visionary, and there are dedicated people, also in the heritage sector, who are trying to make it a reality. In our example, a group of researchers and technology developers who work at the University of Helsinki and the Helsinki Institute for Information Technology are translating the Semantic Web vision for a future semantic museum portal.
The group's two-year project will run until spring 2004 and is being carried out in co-operation with, and with funding from, major organisations including Espoo City Museum, Helsinki University Museum, National Board of Antiquities, Nokia, TietoEnator and the National Technology Agency (TEKES).
A CULTURAL HERITAGE SEMANTIC WEB EXAMPLE & PRIMER
By Guntram Geser
[1] Tim Berners-Lee: Interpretation and Semantics on the Semantic Web (1998), http://www.w3.org/DesignIssues/Interpretation.html
[2] Tim Berners-Lee: Semantic Web Road Map (1998), http://www.w3.org/DesignIssues/Semantic.html
[3] Robert DuCharme, http://lists.xml.org/archives/xml-dev/200211/msg00190.html
The major goals of the project are to make collection metadata, which stem from heterogeneous databases, semantically interoperable on the Web, and to provide facilities for semantic browsing and searching in the combined knowledge base of the participating museums.
The project's vision is called "Finnish Museums on the Semantic Web" (FMS), and its architecture allows for all Finnish museums to join in. However, in an approach of starting small but ambitious, the project is at present using the collection databases of two museums, the Espoo City Museum and the National Museum of Finland. Furthermore, the implementation is currently restricted to only one part of the collections - textiles.
In order to reach the FMS system's goal of making the museums' metadata semantically interoperable on the Web, the data must be harmonised on the syntactic and semantic level. For this harmonisation, the eXtensible Markup Language (XML) and the Resource Description Framework (RDF) are being used, of which RDF is the key language for achieving semantic interoperability of the heterogeneous sets of metadata.
RDF and Metadata - A Natural Fit
An observer of the diffusion of the Resource Description Framework (RDF) into various domains, Robert DuCharme, has commented: 'I still find it a little ironic that, while RDF has gotten so much publicity as a technology for warm and fuzzy AI (Artificial Intelligence) pie-in-the-sky technology, it's gotten most of its traction in the mundane world of metadata.'[3]
Yet, given the importance of metadata for the Semantic Web vision in general, it does not come as a surprise that metadata of key information communities belong to the first of RDF's intended uses. RDF seems to gain momentum in particular among the library and other communities that use Dublin Core.
The current W3C RDF Primer (Working Draft, 23 January 2003), edited by Frank Manola and Eric Miller, labels RDF as 'an ideal representation for Dublin Core information' and describes Dublin Core as one of their 'RDF in the field' examples (cf. http://www.w3.org/TR/rdf-primer).
At the Dublin Core Metadata Initiative (DCMI), 'Expressing Simple Dublin Core in RDF/XML' was announced as a DCMI Recommendation in October 2002, 'the first in a series of recommendations for encoding Dublin Core metadata using mainstream Web technologies', i.e. XML/RDF/XHTML. 'Expressing Qualified Dublin Core in RDF/XML' is currently a Proposed Recommendation. Cf. http://dublincore.org/groups/architecture
Syntactic Transformation 1: Creating the XML Documents
In the FMS system, the eXtensible Markup Language (XML) is used as the data transfer format. This transfer format enables the system to make use of the data originally stored in the museums' heterogeneous collection databases. Therefore, each museum participating in the FMS initiative provides the relevant collection data as an XML document repository.
In a process of syntactic harmonisation, the data from a museum's collection database are retrieved and transformed to an XML format conforming to the XML Schema of the FMS initiative.
The data to be published are read from the database through a 'view', which helps create the XML format. The view is a queryable interface, a virtual table that results from an SQL query which may join multiple tables of the database. Through the view, the data are queried so that the rows of the tables are grouped by collection items. For each item, the set of rows is combined into a single XML document.
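The view-to-document step described above can be sketched in a few lines of Python using only the standard library. This is a minimal illustration and not the FMS implementation: the table names (item, detail), the column names field and value, and the two functions are assumptions made for the example, and the namespace is the fictitious one used in this primer.

```python
import sqlite3
import xml.etree.ElementTree as ET

MI_NS = "http://www.m-i.org/images"  # fictitious namespace from this primer


def build_database() -> sqlite3.Connection:
    """Create a tiny in-memory collection database (hypothetical tables)."""
    con = sqlite3.connect(":memory:")
    con.executescript(
        """
        CREATE TABLE item (item_id TEXT PRIMARY KEY, type TEXT, creator TEXT);
        CREATE TABLE detail (item_id TEXT, field TEXT, value TEXT);
        INSERT INTO item VALUES
            ('image5kb78d38i', 'Column Miniature', 'Alexander Master');
        INSERT INTO detail VALUES
            ('image5kb78d38i', 'place', 'Utrecht'),
            ('image5kb78d38i', 'year', 'circa 1430');
        -- The view is a virtual table defined by a query joining both tables.
        CREATE VIEW item_view AS
            SELECT i.item_id, i.type, i.creator, d.field, d.value
            FROM item i JOIN detail d ON d.item_id = i.item_id;
        """
    )
    return con


def rows_to_documents(con: sqlite3.Connection) -> dict:
    """Group the view's rows by item and build one XML document per item."""
    ET.register_namespace("mi", MI_NS)
    roots = {}
    rows = con.execute(
        "SELECT item_id, type, creator, field, value FROM item_view "
        "ORDER BY item_id, field"
    )
    for item_id, type_, creator, field, value in rows:
        if item_id not in roots:
            # First row of a new item: create the root and the shared fields.
            root = ET.Element(f"{{{MI_NS}}}image", {"image_id": item_id})
            ET.SubElement(root, f"{{{MI_NS}}}type").text = type_
            ET.SubElement(root, f"{{{MI_NS}}}creator").text = creator
            roots[item_id] = root
        # Every row contributes one further child element to its item.
        ET.SubElement(roots[item_id], f"{{{MI_NS}}}{field}").text = value
    return {key: ET.tostring(root, encoding="unicode") for key, root in roots.items()}


if __name__ == "__main__":
    for xml_text in rows_to_documents(build_database()).values():
        print(xml_text)
```

Several view rows that share an item_id end up in one document, which is the grouping the info box describes; a production system would also validate each document against the agreed schema before publishing it.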
XML
XML is a markup language for describing data. It is a language created to allow anyone to design the structure of their own documents. An XML document contains text that consists of markup in the form of tags and plain text between them, the latter being just pure information (for example, <creator>Alexander Master</creator>). XML tags are not predefined; everyone can define his or her own tags.
FMS Documents
A short presentation is provided in Eero Hyvönen et al.: Cultural Semantic Interoperability on the Web. Case Finnish Museums Online, http://iswc2002.semanticweb.org/posters/hyvonen_a4.pdf; for detailed descriptions see Vilho Raatikka, Eero Hyvönen: Ontology-based Semantic Metadata Validation, and Hyvönen, Eero et al.: Semantic Interoperability on the Web. Case Finnish Museums Online. Both texts can be found in: Towards the Semantic Web and Web Services. Proceedings of the XML Finland 2002 Conference, http://www.cs.helsinki.fi/u/eahyvone/xmlfinland2002/ProceedingsXML2002-final.pdf
ITEM_VIEW (columns of the database view): item_id, type, subject, iconclass, creator, manuscript, place, year
XML shares the syntax and bracketed tags of the well-known HyperText Markup Language (HTML), but XML serves a different goal. While HTML is used to define the layout of pages on the WWW, XML is used to define the content of documents, for example to specify that an area of text is the name of a creator.
XML allows for creating markup (e.g. <creator>) that seems to carry some semantics. However, for a computer, a tag like <creator> carries as much semantics as a tag like <H1>. A computer simply does not know what a creator is and how the concept creator is related to other concepts (e.g. manuscript). For an XML processor, <H1> and <creator> or <manuscript> are all equally (and totally) meaningless. XML is all about describing data; on its own, it does not do anything. There needs to be a processing program that uses the markup to interpret the various elements.
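The point that tags carry no meaning for the parser can be made concrete with a small Python sketch. The record and both function names are invented for this illustration; only the standard-library parser is real.

```python
import xml.etree.ElementTree as ET

# A made-up record in which <H1> and <creator> hold the same text.
SAMPLE = (
    "<record>"
    "<H1>Alexander Master</H1>"
    "<creator>Alexander Master</creator>"
    "</record>"
)


def describe(xml_text: str) -> list:
    """Return (tag, text) pairs; the parser treats every tag name alike."""
    root = ET.fromstring(xml_text)
    return [(child.tag, child.text) for child in root]


def creators(xml_text: str) -> list:
    """Application logic: this function, not XML, decides what 'creator' means."""
    return [text for tag, text in describe(xml_text) if tag == "creator"]


if __name__ == "__main__":
    print(describe(SAMPLE))
    print(creators(SAMPLE))
```

The parser hands back both elements in exactly the same form. Any notion that one of them names a person lives in the processing program, here the creators function.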
The graphic below illustrates the database rows to XML process as described in the info box. It provides a very simple example of an XML document that describes some data for one of the medieval column miniatures from the Koninklijke Bibliotheek, The Hague, which we were permitted to use for illustrating this Thematic Issue. It includes the Iconclass classification for this image: 71A3421, Eve emerges from Adam's body (for the hierarchical path of this classification, see the section on ontologies).
Short explanations for the XML document (image5kb78d38i.xml) shown in the graphic below:
A well-formed XML document is one that conforms to the XML syntax rules, of which we would like to highlight the following:
(1) The document should begin with the XML declaration, which defines the XML version and the character encoding used in the document (the declaration is optional in XML 1.0, but recommended). In the example below we use <?xml version="1.0" encoding="ISO-8859-1"?>, i.e. the document conforms to the 1.0 specification of XML and uses the ISO-8859-1 (Latin-1/West European) character set.
(2a, 10) The XML document must contain a single tag pair to define a root element, in our example <mi:image> </mi:image>.
Graphic 1: Database rows to XML process

Item data from database rows (grouped by item):
'image5kb78d38i', 'Column Miniature', 'Eve emerges from Adam's Body', '71A3421', 'Alexander Master', 'Historic Bible', 'Utrecht', 'circa 1430'

XML document produced by the rows-to-XML process (image5kb78d38i.xml):
(1)  <?xml version="1.0" encoding="ISO-8859-1"?>
(2)  <mi:image xmlns:mi="http://www.m-i.org/images"
         image_id="image5kb78d38i">
(3)    <mi:type>Column Miniature</mi:type>
(4)    <mi:subject>Eve emerges from Adam's body</mi:subject>
(5)    <mi:iconclass>71A3421</mi:iconclass>
(6)    <mi:creator>Alexander Master</mi:creator>
(7)    <mi:manuscript>Historic Bible</mi:manuscript>
(8)    <mi:place>Utrecht</mi:place>
(9)    <mi:year>circa 1430</mi:year>
(10) </mi:image>
(2b) Namespace: Since element names in XML are not fixed, name conflicts can occur when different documents use the same names describing different types of elements. To prevent such conflicts, a unique namespace should be defined using a Uniform Resource Identifier (URI). An XML namespace is a collection of names that are used as element types and attribute names (cf. http://www.w3.org/TR/REC-xml-names). Our namespace declaration in the start tag of the root element is xmlns:mi="http://www.m-i.org/images".
The namespace prefix mi: (for medieval images) functions as a placeholder for the namespace name. It needs to show up in all element tags (e.g. <mi:type> </mi:type>).
(2c) image_id="image5kb78d38i": This is the id attribute, which contains the unique identifier of the data source record.
(3-9) All other elements must be within the root element and can themselves have sub-elements (child elements), which must be properly nested within their parent element. Our elements do not have sub-elements.
Other syntax rules are, for example: all start tags must match end-tags; because XML tags are case sensitive (i.e. the tag <mi:Creator> is different from the tag <mi:creator>), they must also be written with the same case; all elements must have a closing tag; all attribute values must be within quotation marks (e.g. "image5kb78d38i").
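The syntax rules above can be tried out with any XML parser. The following Python sketch uses the standard-library parser and a shortened version of our example document; the constants and function names are chosen for this illustration only.

```python
import xml.etree.ElementTree as ET

MI_NS = "http://www.m-i.org/images"  # fictitious namespace from this primer

GOOD = (
    '<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">'
    "<mi:creator>Alexander Master</mi:creator>"
    "</mi:image>"
)
# Start tag and end tag differ in case, so they no longer match.
BAD_CASE = GOOD.replace("</mi:creator>", "</mi:Creator>")
# The attribute value has lost its quotation marks.
BAD_QUOTES = GOOD.replace('image_id="image5kb78d38i"', "image_id=image5kb78d38i")


def is_well_formed(xml_text: str) -> bool:
    """True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


def creator_of(xml_text: str) -> str:
    """Look up the creator using the namespace name rather than the prefix."""
    root = ET.fromstring(xml_text)
    # The parser expands mi:creator to {namespace-URI}creator.
    return root.findtext(f"{{{MI_NS}}}creator")


if __name__ == "__main__":
    for name, text in [("good", GOOD), ("case", BAD_CASE), ("quotes", BAD_QUOTES)]:
        print(name, is_well_formed(text))
    print(creator_of(GOOD))
```

Note that the lookup in creator_of never mentions the prefix mi. The prefix is only a placeholder; what identifies an element is the pair of namespace name and local name.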
Syntactic Transformation 2: The XML Schema
In order to allow for syntactic harmonisation, the XML documents of the museums should conform to the XML Schema of the FMS initiative. Therefore, the museums use the initiative's XML Schema when they create their XML documents, validating them against the Schema. If the documents are valid, the process can continue to the semantic level.
XML Schema
The XML Schema defines the building blocks of an XML document, including
| elements and attributes that can appear in a document
| which elements are child elements, as well as their order and number
| whether an element is empty or can include text
| the data types for elements and attributes
| as well as default and fixed values for elements and attributes
XML with an XML Schema is designed to be self-descriptive. One of the greatest strengths of XML Schema is that it allows for data typing. The most common data types are xs:string, xs:decimal, xs:integer, xs:boolean, xs:date, xs:time. In the example below, which is the XML Schema for the XML document (image5kb78d38i.xml) shown in graphic 1, we only use the data type xs:string. This data type is used for values that contain character strings.
(1)  <?xml version="1.0"?>
(2a) <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
(2b)     targetNamespace="http://www.m-i.org/images"
(2c)     xmlns="http://www.m-i.org/images"
(2d)     elementFormDefault="qualified">
(3)  <xs:element name="image">
(4)    <xs:complexType>
(5)      <xs:sequence>
(6)        <xs:element name="type" type="xs:string"/>
(7)        <xs:element name="subject" type="xs:string"/>
(8)        <xs:element name="iconclass" type="xs:string"/>
(9)        <xs:element name="creator" type="xs:string"/>
(10)       <xs:element name="manuscript" type="xs:string"/>
(11)       <xs:element name="place" type="xs:string"/>
(12)       <xs:element name="year" type="xs:string"/>
(13)     </xs:sequence>
(14)     <xs:attribute name="image_id" type="xs:string" use="required"/>
(15)   </xs:complexType>
(16) </xs:element>
(17) </xs:schema>

Short explanations
(1) The XML declaration, which states that the document conforms to the 1.0 specification of XML.
(2a) Determines that the elements and data types that are used to construct the schema come from the W3C's XML Schema namespace. Consequently, each of the elements and data types in the schema has the prefix xs:, which identifies them as belonging to the vocabulary of the XML Schema language rather than the vocabulary of our (fictitious) organisation M-i.org.
(2b) Indicates that the elements defined by this schema come from our http://www.m-i.org/images namespace.
(2c) Is our default namespace.
(2d) Demands that any elements used by an XML document which were declared in this schema must be namespace qualified.
(3, 16) The parent element for the data typing of the image descriptions we provide at http://www.m-i.org/images. It is (4) defined as an xs:complexType, i.e. it contains child elements (6-12), which are (5, 13) surrounded by an xs:sequence element that defines an ordered sequence of these elements.
(6-12) The child elements, which in our example are simple types because they do not contain other elements. They define various elements of our XML documents, e.g. "creator", to be of the data type xs:string.
(14) Furthermore, the Schema determines that for the element xs:element name="image" there is a required attribute image_id of the data type xs:string (for example, image5kb78d38i).
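To give a feeling for what validation against this schema checks, the following Python sketch re-implements three of its constraints by hand: the root element, the required image_id attribute and the ordered sequence of child elements. It is not an XML Schema validator, and the function names and the sample builder are assumptions made for this illustration. A real system would pass the schema itself to a schema-aware validating parser.

```python
import xml.etree.ElementTree as ET

MI_NS = "http://www.m-i.org/images"  # fictitious namespace from this primer
# Child elements in the order required by the xs:sequence of the example schema.
EXPECTED_CHILDREN = [
    "type", "subject", "iconclass", "creator", "manuscript", "place", "year",
]


def check_image_document(xml_text: str) -> list:
    """Return a list of problems; an empty list means all checks passed."""
    problems = []
    root = ET.fromstring(xml_text)
    if root.tag != f"{{{MI_NS}}}image":
        problems.append("root element must be 'image' in the m-i.org namespace")
    if root.get("image_id") is None:
        problems.append("required attribute 'image_id' is missing")
    found = [child.tag for child in root]
    expected = [f"{{{MI_NS}}}{name}" for name in EXPECTED_CHILDREN]
    if found != expected:
        problems.append("child elements do not match the required sequence")
    for child in root:
        # xs:string is a simple type, so no nested elements are allowed.
        if len(child) > 0:
            problems.append(f"{child.tag} must not contain child elements")
    return problems


def sample(children: list, with_id: bool = True) -> str:
    """Build a small test document (hypothetical helper for this sketch)."""
    attr = ' image_id="image5kb78d38i"' if with_id else ""
    body = "".join(f"<mi:{name}>x</mi:{name}>" for name in children)
    return f'<mi:image xmlns:mi="{MI_NS}"{attr}>{body}</mi:image>'


if __name__ == "__main__":
    print(check_image_document(sample(EXPECTED_CHILDREN)))
    print(check_image_document(sample(EXPECTED_CHILDREN, with_id=False)))
```

The advantage of a declarative schema over such hand-written checks is that every museum can validate its documents against the same published rules without sharing program code.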
Major Benefits of XML
In the context of the Semantic Web, XML provides an interoperable syntactical foundation upon which solutions to the issues of representing relationships and meaning can be built. We also want to highlight the many benefits of XML that are adding to its rapid uptake in the first place, and might in the longer term be supportive in realising the Semantic Web vision on a broader scale.
XML is one of the most important standards developments in recent years. It is an international, universal, non-system and non-application specific data exchange standard. XML is international because it employs Unicode. This means that there is no restriction to the western alphabet; Arabic, Chinese, Greek, Hebrew, Thai, etc. can be easily integrated.
XML is non-system specific because it is an open standard set by the World Wide Web Consortium (W3C). As such, there is no owner of XML. All the major software suppliers support it, and it can be used on any computing platform from Windows and MacOS to Linux. This makes it easier for organisations to change systems or combine different systems.
XML is also non-application specific, i.e. it can be used in various applications such as data exchange, data harvesting, Web site management, etc. XML is gaining ever-wider acceptance in many application domains, including in particular the cultural heritage community.
Bear in mind also that the major collection management software producers have implemented support for XML in their systems, enabling, for example, the integration of data from different collections and their combination over the Web.
XML allows for multi-channel publishing, i.e. with XML it is easy to produce different products or services from digital cultural heritage assets. Once the data are structured in XML, they can be displayed across a variety of media using an associated style sheet that contains the display information.
Finally, XML can be used to create new languages. For example, the Wireless Markup Language (WML), which is used to mark up Internet applications for handheld devices, is written in XML.
Ontological concepts
The goal of the FMS project is to make metadata of the museums' textiles collections semantically interoperable on the Web. In order to achieve such interoperability, an ontology is being designed that describes the common (lower-level) ontological concepts in this domain of knowledge.[4]
Ontology
In the Semantic Web architecture the semantic relationships are not embedded, but explicitly represented by an ontology or, rather, an interrelated set of ontologies. In fact, the wide array of information residing on the Web and the perceived need to make it more machine-processable have acted as a strong impetus for the development of ontology languages.
Yet what is an ontology? An abstract definition of an ontology is that it describes a formal, shared conceptualisation of a particular domain of interest, for example cultural heritage objects held in art museums. In particular, an ontology allows for constraining, expressing and analysing the intended meaning of the shared vocabulary of concepts and relations in a domain of knowledge.[5]
If these concepts and relations are formalised to a high degree, the domain has at hand a major building block for developing semantically aware information systems. With Semantic Web technologies, the domain ontology can be made available on the network, cross-referenced with upper-level and other domain ontologies,[6] and remote applications or intelligent software agents can refer to it when they interact to provide a certain information service.
[4] The available documents (in English) on the FMS initiative state that their ontology is being created using RDF Schema (RDFS). To develop a fully fledged ontology, advanced languages such as DAML+OIL or Web Ontology Language (OWL) would be required.
[5] For more elaborate and formal descriptions see Tom Gruber: What is an Ontology? (1995), http://www-ksl.stanford.edu/kst/what-is-an-ontology.html; Nicola Guarino: Ontology-Driven Conceptual Modelling, part 1-3 (2002), http://ontology.ip.rm.cnr.it/Tutorials
The degree of formalisation of concepts and their relations varies considerably between different domains of knowledge. At the lower end one finds lexicons and simple taxonomies (i.e. an ordered classification system where terms are related hierarchically). At the middle level one might place thesauri, i.e. controlled vocabularies that are structured to show relationships between terms and concepts and, for example, allow for retrieving them from a database. At the high end of formalisation of knowledge there are axiomatised logic theories. Such theories include rules to ensure the well-formedness and logical validity of statements expressed in the language of the scientific discipline.
In the cultural heritage sector, a powerful IT-supported example of a hierarchical classification system is Iconclass. This supports the documentation of images, in particular art historical images, by providing a systematic collection of 28,000 ready-made definitions of objects, persons, events, situations and abstract ideas that can be the subject of an image. The definitions consist of an alphanumeric classification code and its textual correlate.[7]
For example, for the image we have described in XML in graphic 1, the Iconclass definition is 71A3421: Eve emerges from Adam's body. The Medieval Illustrated Manuscripts Website of the Koninklijke Bibliotheek, The Hague, has an Iconclass Browser in place[8] that provides the hierarchical path for this concept in the classification system.
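How such a hierarchical path relates to the classification code can be sketched in a few lines of Python. The labels are copied from the Iconclass path printed for this example (7 Bible down to 71A3421); the function name hierarchical_path is hypothetical. The sketch assumes that every broader concept is denoted by a leading part of the code, which holds for this plain notation but should not be taken as a general rule for all Iconclass notations.

```python
# Code -> textual correlate, taken from the path shown for the example image.
ICONCLASS = {
    "7": "Bible",
    "71": "Old Testament",
    "71A": "Genesis from the creation to the expulsion from paradise, "
           "and later years of Adam and Eve",
    "71A3": "creation of man; the Garden of Eden (Genesis 1:26-2)",
    "71A34": "creation of Eve",
    "71A342": "Eve is fashioned from Adam's rib",
    "71A3421": "Eve emerges from Adam's body",
}


def hierarchical_path(code, table=ICONCLASS):
    """List (code, label) pairs from the broadest concept down to `code`.

    Assumption: each ancestor is a prefix of the code, as in this example.
    """
    prefixes = [code[:n] for n in range(1, len(code) + 1)]
    return [(p, table[p]) for p in prefixes if p in table]


for notation, label in hierarchical_path("71A3421"):
    print(notation, label)
```

Run as a script, this prints the seven levels from '7 Bible' to '71A3421 Eve emerges from Adam's body', i.e. the kind of path an Iconclass browser displays for the concept.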
An outstanding example of a controlled vocabulary is the Art & Architecture Thesaurus (AAT), one of the Getty Research Institute's Vocabulary Databases. It is a structured vocabulary of more than 125,000 terms and other information about concepts that are used for describing fine art, architecture, decorative arts, archival materials and material culture.[9]
The W3C's Resource Description Framework (RDF) Schema Specification 1.0, in a section on its scope, mentions concept navigation and states: 'Thesauri and library classification schemes are well known examples of hierarchical systems for representing subject taxonomies in terms of the relationships between named concepts. The RDF Schema specification provides sufficient resources for creating RDF models that represent the logical structure of thesauri (and other library classification systems).'[10]
Yet for realising a full-blown cultural heritage ontology for the Semantic Web, there are currently limitations on both sides. On the one hand, hierarchical classification systems and structured vocabularies do not lend themselves easily to rich inter-linking of conceptual 'trees'.
A major step further in this direction is the CIDOC object-oriented Conceptual Reference Model (CRM).[11] This provides an ontology of 81 classes and 130 properties, which describes in a formal language concepts and relations relevant to the documentation of cultural heritage.
On the other hand, RDF Schema has limitations when it comes to expressing complex ontological relationships. New languages based on description logics are being developed. These include DAML+OIL and the upcoming Web Ontology Language (OWL), which are capable of fully describing ontologies.
Also worth highlighting is that tools for ontology building are proliferating at the present time. These ontology editors need to be carefully assessed, as their capabilities differ considerably.[12]
7 Bible
71 Old Testament
71A Genesis from the creation to the expulsion from paradise, and later years of Adam and Eve
71A3 creation of man; the Garden of Eden (Genesis 1:26-2)
71A34 creation of Eve
71A342 Eve is fashioned from Adam's rib
71A3421 Eve emerges from Adam's body
[6] Upper-level ontologies describe the basic concepts and relationships invoked when information about any domain is expressed in natural language.
[7] For in-depth information see the official Iconclass Website, http://www.iconclass.nl
[8] Medieval Illustrated Manuscripts Website, http://www.kb.nl/kb/manuscripts/browser. The subject access system for the Website was conceived by Mnemosyne Partners, building on the Iconclass classification system and technologies. See the valuable information they provide at http://www.mnemosyne.org/business/msstempex.html
[9] A detailed description of the AAT is provided at http://www.getty.edu/research/tools/vocabulary/aat/about.html
[10] http://www.w3.org/TR/2000/CR-rdf-schema-20000327
[11] See the information box at the end of the Forum discussion and the sources mentioned in the Semantic Web Terms and Reading List.
[12] Michael Denny: Ontology Building: A Survey of Editing Tools (06-11-2002), http://www.xml.com/pub/a/2002/11/06/ontologies.html
Semantic Transformation 1: The RDF Data Model
Progressing towards semantic interoperability, the metadata in the XML documents are now transformed into RDF statements corresponding to the RDF data model. With these so-called RDF 'triples', the XML metadata elements are mapped to the RDF classes and properties, which are defined by the RDF Schema of the FMS initiative (see section 3).
'XML is nothing more than a way to standardize data formats. This is not to underplay XML's importance. A data-format standard makes all of the more glamorous technologies possible, and RDF is the leading example of the benefit that comes once the data format has been standardized. Many proclaim that RDF is really the XML's killer app, and with good reason. Despite all this, RDF remains somewhat obscure. This is mainly because at its core RDF is very abstract, very dry and very academic.'
Uche Ogbuji: An introduction to RDF (2000), http://www-106.ibm.com/developerworks/library/w-rdf/?dwzone=xml
RDF Data Model
In order to make Web resources semantically interoperable, we need resources that provide machine-understandable information about themselves. In the Semantic Web architecture, these statements are built by using the Resource Description Framework (RDF).
RDF defines a data model for the statements describing typed relationships between uniquely identified resources. RDF distinguishes between:
| resources: familiar examples are a Web page, electronic document or digital image, but in RDF also entities that are not 'network retrievable', e.g. museums, curators or bound medieval manuscripts, can be resources
| properties: these identify a specific aspect, characteristic, attribute or relation used to describe the resource
| statements: these associate a value for a named property with the resource
Hence, RDF provides a model for describing relationships between resources in terms of named properties and values. The RDF data model intrinsically supports only binary relations. Its base element is the 'triple', which takes the form of <subject, predicate, object>: a resource (the subject) is linked to another resource (the object) through an arc labelled with a third resource (the predicate). The semantics of a triple clearly depends on the property used as predicate.
A convenient way to visualise this is to draw nodes for subject and object, and an arrow between them for the predicate (see graphic 2). In this labelled directed graph, subject and predicate (property) are Uniform Resource Identifiers (URIs), and the object is either a URI or a literal (which is drawn as a box). Everything in RDF can be represented by a graph with nodes and arcs, and the data model allows for using the same URI as a node and as an arc label. To represent RDF statements in a machine-processable way, RDF builds on XML. With RDF/XML, a specific XML markup language, RDF information can be represented and exchanged between machines.
Graphic 2: RDF Data Model
Subject: http://www.m-i.org/schemas/images#ColumnMiniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Miniature
Subject: http://www.m-i.org/schemas/images#Miniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Image
With these two triples we state that ColumnMiniature is a subclass of Miniature, and that Miniature is a subclass of Image.
The predicate in our statements is the rdfs:subClassOf property, which is predefined in the RDF Schema namespace http://www.w3.org/2000/01/rdf-schema#
The classes Image, Miniature and ColumnMiniature would also need to be defined in our RDF Schema namespace http://www.m-i.org/schemas/images
For details on how to use RDF Schema for defining your domain ontology, see section 3. In a nutshell, RDF Schema is a 'higher-level' language which is itself defined using RDF.
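The two triples of Graphic 2 can also be written down as plain data. The following Python sketch models a triple as a (subject, predicate, object) record and the graph as a set of such records. It is not an RDF library; the helper name objects and the '#' separator in the example namespace are assumptions made for illustration.

```python
from collections import namedtuple

# One RDF statement: an arc labelled `predicate` from `subject` to `object`.
Triple = namedtuple("Triple", ["subject", "predicate", "object"])

RDFS = "http://www.w3.org/2000/01/rdf-schema#"
MI = "http://www.m-i.org/schemas/images#"   # assumed form of the example namespace

graph = {
    Triple(MI + "ColumnMiniature", RDFS + "subClassOf", MI + "Miniature"),
    Triple(MI + "Miniature", RDFS + "subClassOf", MI + "Image"),
}


def objects(graph, subject, predicate):
    """Follow the arcs labelled `predicate` that leave the node `subject`."""
    return {t.object for t in graph
            if t.subject == subject and t.predicate == predicate}


# The same URI can be the object of one statement and the subject of another.
print(objects(graph, MI + "ColumnMiniature", RDFS + "subClassOf"))
print(objects(graph, MI + "Miniature", RDFS + "subClassOf"))
```

Because the Miniature URI appears as the object of the first triple and as the subject of the second, the two statements join into one small graph, which is what the drawing expresses with shared nodes.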
Semantic Transformation 2: Creating and Validating the RDF Statements
In the mapping process, an editor tool (in the FMS case a tool developed by the project team, called Meedio) receives as input the XML documents and assists in transforming them into semantically valid RDF statements (instance descriptions). The tool serves as instance editor, which provides a convenient way of finding and selecting from the XML metadata elements correct instance values for a particular property. The editor tool also serves as semantic metadata validator. When the museum cataloguer saves the set of RDF statements corresponding to an XML document, the semantics of these statements are validated against the property constraints of the FMS ontology. The result of a successful mapping and validation process is a unique set of RDF triples, called the RDF card.
RDF Mapping Rules
When RDF is used to define the meaning of XML metadata elements, a set of mapping rules is created. A mapping rule is a template of RDF triples where XPath expressions are used to identify the actual element values. XPath is a language for addressing parts of an XML document, and was designed for use in XML parsing software (XSLT, XPointer and others).
When applying such a rule to an XML document, the XPath expressions are instantiated with matching element values. If the rule matches, the RDF template evaluates to a set of RDF triples where XPath expressions are substituted by the corresponding values of the XML elements.
For example, by applying the template rule
<image5kb78d38i/type, mi:hasCreator, image5kb78d38i/creator>
to the XML document described in the section on XML, the following result would be obtained:
<Image Miniature, mi:hasCreator, 'Alexander Master'>
mi:hasCreator is an example of an RDF property. Such properties are explained in section 3, RDF Schema.
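The mechanics of such a mapping rule can be sketched in Python with the limited XPath support of the standard library. This is not the FMS editor tool: the rule notation (terms prefixed with 'xpath:'), the function name apply_rule and the abridged sample document are assumptions for illustration, and the value 'Image Miniature' for the type element is inferred from the result shown above.

```python
import xml.etree.ElementTree as ET

NS = {"m": "http://www.m-i.org/images"}   # prefix used in the path expressions

# Abridged sample document; only the two elements the rule needs are included.
DOC = """<image xmlns="http://www.m-i.org/images" image_id="image5kb78d38i">
  <type>Image Miniature</type>
  <creator>Alexander Master</creator>
</image>"""

# A rule is a triple template. Terms starting with "xpath:" are path
# expressions to be evaluated against the document; other terms are kept.
RULE = ("xpath:m:type", "mi:hasCreator", "xpath:m:creator")


def apply_rule(rule, xml_text):
    """Instantiate the template; return None if a path expression has no match."""
    root = ET.fromstring(xml_text)
    triple = []
    for term in rule:
        if term.startswith("xpath:"):
            value = root.findtext(term[len("xpath:"):], namespaces=NS)
            if value is None:
                return None   # the rule does not match this document
            term = value
        triple.append(term)
    return tuple(triple)


print(apply_rule(RULE, DOC))
```

For the sample document the rule evaluates to the triple ('Image Miniature', 'mi:hasCreator', 'Alexander Master'). A rule that refers to an element absent from the document yields None, mirroring the case where the rule does not match.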
Note: Due to the limited space permitted, we do not address issues of term mapping. This is an important aspect of the mapping and validation process carried out in the FMS project. Working with metadata from different museums, they need to deal with partly different terminologies. Their technical solution to synonymous terms (i.e. different terms referring to the same concepts) is to attach synonym sets to the FMS ontology classes. With situations where polysemous terms occur (i.e. the same terms refer to different concepts), the editor tool cannot cope, and the cataloguer needs to select the correct interpretation.
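A minimal Python sketch may illustrate the difference between the two cases. The class names, the terms in the synonym sets and both function names are invented for the example; the FMS vocabulary and its tool are not reproduced here.

```python
# Hypothetical synonym sets attached to ontology classes.
SYNSETS = {
    "mi:Miniature": {"miniature", "illumination"},
    "mi:DecoratedInitial": {"decorated initial", "initial"},
    "ex:Monogram": {"monogram", "initial"},
}


def candidate_classes(term):
    """All classes whose synonym set contains the term (case-insensitive)."""
    key = term.lower()
    return sorted(cls for cls, terms in SYNSETS.items() if key in terms)


def map_term(term):
    """Resolve a term automatically only when exactly one class matches.

    None means the term is unknown or polysemous, so a cataloguer must decide.
    """
    candidates = candidate_classes(term)
    return candidates[0] if len(candidates) == 1 else None
```

Here map_term("Illumination") resolves to mi:Miniature because only one synonym set contains the term, whereas map_term("initial") returns None: two classes claim the term, and the choice is left to the cataloguer.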
Semantic Transformation 3: The RDF Schema (RDFS)
The shared ontology for the textiles domain is created by using Resource Description Framework Schema (RDFS). An RDF Schema is a tool for indicating the classes of resources one wants to describe, as well as for defining the properties used to describe those resources. Furthermore, class/sub-class relationships and property/sub-property relationships can be defined. The museums are mapping their metadata to the classes and properties defined by the RDF Schema of the FMS initiative. Thereby they are making the meaning of the metadata explicit and representing them in a harmonised, uniform way.
RDF Schema
In section Semantic Transformation 1 we have described the data model provided by RDF for expressing statements about Web resources. But we also need a vocabulary for the RDF statements, namely classes and properties defined with RDF Schema (RDFS).
In brief, the RDF Schema mechanism provides a pre-defined vocabulary, a basic type system that can be used in creating domain-specific schemas. Its role is to allow for declaring metadata properties (e.g. for 'type', 'subject' or 'creator'), to define the classes of resources they may be used with, to restrict possible combinations, and to detect violations of those restrictions.
Defining classes
With RDF Schema (RDFS), Web resources can be defined as instances of one or more classes. In addition, classes can be organised in a hierarchical fashion. As we hold a collection of digital images drawn from illustrated medieval manuscripts, we first need to define a class of things that are images. In RDF Schema, a class is any resource having an rdf:type property whose value is the RDFS-defined resource rdfs:Class.
So, using the basic RDF data model, we define:
mi:Image [resource] rdf:type [property] rdfs:Class [value]
The self-defined prefix mi: (for medieval images) stands for the URI reference of our RDF Schema namespace, http://www.m-i.org/schemas/images
In our image collection we have various special kinds of digitised images, such as column miniatures, decorated initials, schematic drawings, etc. To distinguish, for example, the miniatures, first we need to define a general class Miniature and subclasses of miniatures, e.g. a subclass ColumnMiniature:
mi:Miniature rdf:type rdfs:Class
mi:ColumnMiniature rdf:type rdfs:Class
Secondly, we need to define that mi:ColumnMiniature is a subclass of mi:Miniature, and that mi:Miniature is a subclass of mi:Image, for which we use the predefined rdfs:subClassOf property:
mi:Miniature rdfs:subClassOf mi:Image
mi:ColumnMiniature rdfs:subClassOf mi:Miniature
As the rdfs:subClassOf property is transitive, this means that mi:ColumnMiniature is also implicitly a subclass of mi:Image.
Graphic 2 on page 32 visualises this with the nodes and arcs of the basic RDF data model.
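What transitivity means in practice can be shown with a short Python sketch that derives the implicit superclasses from the directly stated rdfs:subClassOf relationships. The dictionary layout and the function name superclasses are assumptions for illustration; an RDF Schema processor would draw the same conclusion from the triples themselves.

```python
# Directly stated rdfs:subClassOf relationships from the example.
SUBCLASS_OF = {
    "mi:ColumnMiniature": {"mi:Miniature"},
    "mi:Miniature": {"mi:Image"},
    "mi:Image": set(),
}


def superclasses(cls, direct=SUBCLASS_OF):
    """All classes reachable from `cls` by following rdfs:subClassOf arcs."""
    seen = set()
    stack = [cls]
    while stack:
        for parent in direct.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


# mi:Image is never stated as a superclass of mi:ColumnMiniature, yet it follows.
print(sorted(superclasses("mi:ColumnMiniature")))
```

The call reports both mi:Image and mi:Miniature for mi:ColumnMiniature, although only the second relationship was stated explicitly.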
Defining properties
In order to make the meaning of our metadata (e.g. 'type') explicit, we need to be capable of declaring specific properties that characterise the classes of things we hold at http://www.m-i.org, e.g. digital images of medieval column miniatures.
Basically, RDF Schema defines properties in terms of the classes of resources to which they apply. This is the role of the rdfs:domain and rdfs:range mechanisms.
rdfs:range
The range constraint defines the class or set of classes whose instances can be values of a particular property. If we want to define the property mi:hasType, we must describe this resource (which we locate at http://www.m-i.org/schemas/images) with an rdf:type property whose value is rdf:Property:
mi:hasType [resource] rdf:type [property] rdf:Property [value]
The following RDF statements indicate that mi:ColumnMiniature is a class, mi:hasType is a property, and RDF statements using the mi:hasType property have instances of mi:ColumnMiniature as values:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:range mi:ColumnMiniature
rdfs:domain
The domain constraint restricts the set of classes whose instances may have a particular property attached to them. If we want to indicate that the property mi:hasType applies to instances of class mi:ColumnMiniature, we would write:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:domain mi:ColumnMiniature
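How a tool might detect violations of such restrictions is sketched below in Python, following the constraint reading of rdfs:domain and rdfs:range used in this text. The instance identifiers, the class ex:Person and the function name violations are hypothetical, and the check compares classes for exact equality, i.e. it ignores subclass relationships for brevity.

```python
# Property declarations taken from the statements in the text.
SCHEMA = {
    "mi:hasType": {"domain": "mi:ColumnMiniature", "range": "mi:ColumnMiniature"},
}

# rdf:type of each instance; both identifiers are invented for the example.
TYPES = {
    "ex:image1": "mi:ColumnMiniature",
    "ex:curator1": "ex:Person",
}


def violations(subject, prop, value, schema=SCHEMA, types=TYPES):
    """Return which constraints the statement <subject, prop, value> breaks."""
    rule = schema.get(prop)
    if rule is None:
        return ["unknown property"]
    found = []
    if types.get(subject) != rule["domain"]:
        found.append("domain")   # property attached to an instance of the wrong class
    if types.get(value) != rule["range"]:
        found.append("range")    # value is not an instance of the required class
    return found
```

A statement whose subject and value are both instances of mi:ColumnMiniature passes with an empty list, while using ex:curator1 as subject is reported as a domain violation and using it as value as a range violation.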
Benefits of RDF
In a SearchWebServices.com definition of RDF, some benefits of RDF are mentioned:
| 'By providing a consistent framework, RDF will encourage the providing of metadata about Internet resources.
| Because RDF will include a standard syntax for describing and querying data, software that exploits metadata will be easier and faster to produce.
| The standard syntax and query capability will allow applications to exchange information more easily.
| Searchers will get more precise results from searching, based on metadata rather than on indexes derived from full text gathering.
| Intelligent software agents will have more precise data to work with.'[13]
This is a well-crafted listing of RDF benefits, from provision and exchange of better metadata to agents working with them, hopefully for the benefit of humans. But, as explicitly stated by SearchWebServices.com, these are only potential benefits, i.e. they depend on the level of actual uptake of RDF.
[13] whatis.com / searchWebServices.com Definitions - Resource Description Framework, http://searchwebservices.techtarget.com/sDefinition/0,,sid26_gci213545,00.html (last updated July 27, 2001)
Generating and Using the Knowledge Space
The RDF cards represent the original XML documents at the semantic level. The union of such RDF cards constitutes a knowledge base, which is a harmonised semantic representation of the underlying heterogeneous databases.
However, so far the RDF instance descriptions have not left the museum. The museum has complete control of the information it wants to publish, and it does not need to allow the FMS system access to its internal database system. The RDF data are placed in a public directory on the museum's WWW server.
The Web crawler of the FMS system harvests the instance descriptions from the different museums, and the system combines them into an RDF repository. This repository is a large semantic graph that consists of the shared ontology and metadata.
How does a user now search and navigate in this knowledge space? In the FMS system this is implemented by a server-side software called Ontogator. Based on the semantic graph, this software dynamically generates semantic linkages for the user's Web browser.
One way of using the FMS system is view-based filtering. The user can select classes of resources from the ontology, and the system finds the instances that match the selected class restrictions. By constraining classes (views) further, the collection instance data searched for are eventually found.
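The principle of view-based filtering can be sketched in Python as follows. This is not the Ontogator implementation: the two views ('kind' and 'place'), the small class hierarchy, the three instances and the function names are all invented for the example. An instance matches when, in every view the user has constrained, its class is the selected class or one of its subclasses.

```python
# Hypothetical class hierarchy: child -> parent, covering two views.
PARENT = {
    "ColumnMiniature": "Miniature",
    "Miniature": "Image",
    "DecoratedInitial": "Image",
    "Utrecht": "Netherlands",
    "Bruges": "Flanders",
}

# Hypothetical instance metadata: the class of each instance per view.
INSTANCES = {
    "img1": {"kind": "ColumnMiniature", "place": "Utrecht"},
    "img2": {"kind": "DecoratedInitial", "place": "Utrecht"},
    "img3": {"kind": "ColumnMiniature", "place": "Bruges"},
}


def is_a(cls, ancestor):
    """True if `cls` equals `ancestor` or lies below it in the hierarchy."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = PARENT.get(cls)
    return False


def filter_views(selection, instances=INSTANCES):
    """Ids of instances that satisfy every selected (view, class) restriction."""
    return sorted(
        name for name, classes in instances.items()
        if all(is_a(classes.get(view), chosen)
               for view, chosen in selection.items())
    )


print(filter_views({"kind": "Miniature"}))
print(filter_views({"kind": "Miniature", "place": "Netherlands"}))
```

Selecting only the class Miniature leaves img1 and img3; adding the restriction Netherlands in the second view narrows the result to img1, which illustrates how constraining the views further leads step by step to the records sought.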
The software also supports topic-based navigation by providing semantic links between topics of interest, the creation of which is based on the collection domain ontology and the related metadata of the collection records. This means that the links also provide the user with an impression of the wider context and pragmatics of the objects in the museums' collections.
From human users to software agents
As described in the Finnish Museums on the Semantic Web example, the RDF repository is a large semantic graph that consists of the shared ontology and metadata of the participating museums. Such a repository can be queried, and the results, a set of pointers to the relevant resources, can be accessed using Web browsers. The opportunities provided by a system like the one developed by the FMS initiative (e.g. topic-based navigation) are at present restricted primarily to human users.
The Semantic Web vision includes intelligent software agents which 'understand' semantic relationships between Web resources and seek relevant information as well as perform transactions for humans.[14]
This software would be capable of autonomous action, i.e. could run without direct human control or constant supervision, and ideally is very flexible in doing this. Characterisations of this flexibility include actions that are 'reactive', 'proactive' and 'social' (see below).
While the basic idea of agents is very intuitive and appealing, the actual theory is complex, the tools are immature, and the solutions small and prototype-based. In fact, as a parallel distributed systems technology, agents belong to the most complex class of software technology.
However, this primer will conclude with a summary of what an intelligent software agent is, and what such a software would generally be capable of doing. This should also serve as an indication of how great the challenge for research and technological development is to make the full Semantic Web vision a reality.
Intelligent Software Agents
The following definitions are taken from Michael Wooldridge's introduction to multiagent systems.[15]
Agent
'An agent is a computer system capable of autonomous action in some environment.'
Intelligent agent
'An intelligent agent is a computer system capable of flexible autonomous action in some environment.'
Flexible autonomous action
'By flexible autonomous action we mean reactive, proactive, social.'
| Reactivity: 'A reactive system is one that maintains an ongoing interaction with its environment, and responds to changes that occur in it (in time for the response to be useful).'
| Proactiveness: 'An agent serves a purpose and therefore exhibits goal-directed behaviour, including the capacity to recognise opportunities for useful courses of action.'
| Social ability: 'Social ability in agents is the ability to interact with other agents (and possibly humans) via some kind of agent communication language, and perhaps cooperate with others.'
Desirable further properties of agents are:
| Mobility: the ability to move around an electronic network
| Rationality: an agent will act in such a way that it does not prevent itself from achieving its goals (as far as this is possible with a limited set of beliefs representing its world knowledge)
| Learning: an agent will improve its performance over time
[14] T. Berners-Lee, J. Hendler, O. Lassila: Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html
[15] M. Wooldridge: An Introduction to Multiagent Systems. Chichester: Wiley, 2002, and http://www.csc.liv.ac.uk/~mjw/pubs/imas
Resources
In the primer, no references are made to the documents of the World Wide Web Consortium (W3C). All relevant W3C recommendations can be found at http://www.w3c.org
Of the wealth of introductory materials on XML and RDF available on the Web, the following in particular are useful to consult for further details:
http://www.w3schools.com/xml
http://www.w3schools.com/schema
http://www.w3.org/TR/rdf-primer
K. S. Candan, H. Liu, R. Suvarna: Resource Description Framework: Metadata and its Applications. In: SIGKDD Explorations, Vol. 3/1 (2001), 6-19
Pierre-Antoine Champin: RDF Tutorial (2001), http://www710.univ-lyon1.fr/~champin/rdf-tutorial/rdf-tutorial.html
S. Decker, M. P. Mitra, S. Melnik: Framework for the Semantic Web: An RDF Tutorial, http://www.ida.liu.se/~asmpa/courses/sweb/rdf/rdf_tutorial.pdf
The Finnish Museums on the Semantic Web: Overview of the System's Set-up
| User client: WWW browser (topic-based navigation, view-based filtering)
| Server: Ontogator software and RDF database
| Semantic graph: knowledge space of shared ontology (RDFS) and metadata
| Web crawler: collects the RDF instances of the museums
| RDF Schema (semantic interoperability): Metadata Editor, RDF instances
| XML Schema (syntactic interoperability): XML repository
| Relational schemas, DBMS: collection database 1, collection database 2, ... collection database n
Graphic 3: Set-up of the Finnish Museums on the Semantic Web System
THE DARMSTADT FORUM PARTICIPANTS
Wernher Behrendt, Salzburg Research, Austria
Wernher Behrendt is a Senior Researcher at the Sun Technology and Research Excellence Center at Salzburg Research (Austria), working on multimedia middleware and interoperation issues. He holds an MSc in Cognitive Science from Manchester University and has more than 10 years' experience in near-to-market IT research. From 1989 to 1995 he was a Senior Research Associate in the Informatics Department at Rutherford Appleton Laboratory (UK), working on embedded knowledge based systems in distributed multimedia presentation systems. From 1995 to 1998 he was a Senior Research Associate at Cardiff University (UK), working on interoperation between heterogeneous information systems. Mr Behrendt has held courses in Computer Science and has worked on projects ranging from software engineering methods and quality assurance to legacy system reengineering using migration methods and distributed systems middleware.
E-mail: wernher.behrendt@salzburgresearch.at
Paolo Buonora, Archivio di Stato di Roma, Italy
Paolo Buonora holds a degree in Philosophy from the University of Rome 'La Sapienza' (1976). He worked in the Italian State Archive Administration from 1978, where he was first involved in editing the Guida Generale degli Archivi di Stato italiani. From 1986 he worked in the Soprintendenza archivistica per il Lazio, surveying audiovisual archives and municipal archives; from 1989 to 1991 at the Perugia University, engaged in doctoral research in 'Urban and rural history'; and from 1991 to 1994 again in the Soprintendenza archivistica per il Lazio. After 1994 he worked in the Archivio di Stato di Roma, where he was responsible for the photograph service and several working groups on informatics application in archival documentation. From 1997 until the present time he has planned and directed the Imago II project in the Archivio di Stato di Roma.
See http://www.asrm.archivi.beniculturali.it/sid/imago/IMAGOIIen.html
Samuel Cruz-Lara, University of Nancy 2 and LORIA, France
Samuel Cruz-Lara obtained a Master's degree in Computer Science in 1984 (University Henri Poincaré, Nancy 1) and a PhD degree in Computer Science in 1988 (National Polytechnic Institute of Lorraine). The central topic of his PhD thesis was the generation of integrated development environments by using attribute grammars.
He is currently Associate Professor at the University of Nancy 2 (Institute of Technology, Computer Science Department) and permanent Researcher at LORIA (Lorraine Laboratory for Research in Computer Science and its Applications – UMR 7503 – CNRS – INRIA – Universities of Nancy). He is a member of the 'Language and Dialogue' team and has conducted several research activities on distributed software architectures and textual linguistic resources management. He is currently working in the context of distributed architectures and multimedia resources management. Dr Cruz-Lara has participated in several projects, in particular CNRS/SILFIDE and MLIS-ELAN, and he is at present co-leader of the 'Digital Museum' project. This is a joint project between LORIA and the National Chi-Nan University. The 'Digital Museum' project is sponsored by the National Science Council of the Republic of China (Taiwan) and supported by INRIA (France).
Costis Dallas, Critical Publics SA, Greece
Costis Dallas is Chairman and Senior Researcher of Critical Publics (http://www.criticalpublics.com), a group of companies active in the field of strategic communications, creative design and technology. He is currently a Lecturer in the Department of Communication and Mass Media of Panteion University, and has over 15 years of research and professional experience in hypermedia applications, human factor issues and cultural information systems. Dr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA, co-founder and Chair of the Multimedia Working Group of the International Council of Museums (CIDOC/ICOM), Head of Documentation and Systems of the Benaki Museum, General Director of the Foundation of the Hellenic World, Special Secretary of the Greek Ministry of Education in charge of libraries, archives and instructional technologies, and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: info@criticalpublics.com
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (http://www.nl.adlibsoft.com), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that he joined Databasix, producer of the ADLIB software. In 1991 Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was renamed later as ADLIB Information Systems.
ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include:
| CIMI Dublin Core test bed - ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database, in addition to the 'standard' HTML/XML access.
| SPECTRUM-XML - A project of the UK mda, together with a team of software vendors, to produce a schema for exchange of data which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software.
| Internet Gelderse Musea (http://www.igem.nl) and Maritiem Digitaal (http://www.maritiemdigitaal.nl) - Two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.
E-mail: bert@nl.adlibsoft.com
Nicola Guarino, Italian National Research Council, Italy
Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991, and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS) and associate editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: Nicola.Guarino@ladseb.pd.cnr.it
Janneke van Kersen, Digital Heritage Association, The Netherlands
Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992, she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing, and more recently she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University.
Since October 1999, van Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations, such as museums, archives, archaeological organisations, monuments and special collections of libraries, in a context-rich and XML-based environment. Interoperability from both a technical and an organisational perspective is therefore a major issue. The Association offers access to heritage information to the general public at http://www.cultuurwijzer.nl and customised access for the educational field at http://www.cultuurwijs.nl.
E-mail: janneke.van.kersen@den.nl
Marco Meli, EDW International, Italy
Marco Meli is CEO and co-founder of EDW International (http://www.edw-international.com), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia and VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meli@edw.it
Paul Miller, UKOLN, UK
Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, http://www.ukoln.ac.uk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK’s Further and Higher Education Funding Councils and Resource, the Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI), and the Metadata Working Group of the UK Government’s Office of the e-Envoy.
E-mail: P.Miller@ukoln.ac.uk
Frank Nack, CWI, The Netherlands
Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on ‘The Application of Video Semantics and Theme Representation for Automated Film Editing’ at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling, and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document, and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: Frank.Nack@cwi.nl
Franco Niccolucci, Florence University, Italy
Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference.
See also http://www.geog.port.ac.uk/hist-bound/people/niccolucci.htm
Seamus Ross, HATII, University of Glasgow, UK
Dr Seamus Ross is Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Network) (IST-2001-32706), a European Union funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously, he was Assistant Secretary for Information Technology at the British Academy, and before that worked for a company specialising in expert systems and software development, as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group’s PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: S.Ross@hatii.arts.gla.ac.uk
Andrea Scotti, Institute & Museum for the History of Science, Italy
Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995 he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.
Since 1996, Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza and the National Central Library, and under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (01/02/01-30/07/03), funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies.
See http://cweb.inria.fr/Projects/Mesmuses
E-mail: scottian@galileo.imss.firenze.it
DIGICULT PROJECT INFORMATION

DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over a period of 30 months (03/2002-08/2004).

In order to encourage early take-up, DigiCULT produces seven Thematic Issues and three Technology Watch Reports, along with the newsletter DigiCULT.Info.

DigiCULT draws on the results of the strategic study ‘Technological Landscapes for Tomorrow’s Cultural Economy (DigiCULT)’, which was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications), in 2000 and completed in 2001.

Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at http://www.digicult.info.

For further information on DigiCULT, please contact the team of the project co-ordinator:

Mr Guntram Geser, guntram.geser@salzburgresearch.at
Phone: +43-(0)662-2288-303

Mr John Pereira, john.pereira@salzburgresearch.at
Phone: +43-(0)662-2288-247

Salzburg Research Forschungsgesellschaft
Jakob-Haringer-Str. 5/III, A-5020 Salzburg, Austria
Phone: +43-(0)662-2288-200
Fax: +43-(0)662-2288-222
http://www.salzburgresearch.at

Project Partner:
HATII - Humanities Advanced Technology and Information Institute, University of Glasgow
http://www.hatii.arts.gla.ac.uk
Contact: Mr Seamus Ross, s.ross@hatii.arts.gla.ac.uk

The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom
DigiCULT Thematic Issue 1 - Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum, held in Barcelona on May 6th, 2002, in the context of the DLM-Conference 2002.

DigiCULT Thematic Issue 2 - Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum, held in Essen, Germany, on September 3rd, 2002, in the context of the AIIM Conference @ DMS EXPO.

DigiCULT Thematic Issue 3 - Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum, held on January 21st, 2003, at Fraunhofer IPSI, Darmstadt, Germany.

DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum on Learning Objects, which will take place at the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, on July 2nd, 2003.
IMPRINT
This Thematic Issue is a product of the DigiCULT Project (IST-2001-34898).

Authors:
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy

Images:
Images for this Thematic Issue have been provided by, and are reproduced with kind permission of, the Koninklijke Bibliotheek – National Library of the Netherlands, The Hague, Netherlands.

Graphics & Layout:
Jan Steindl, Salzburg Research

ISBN 3-902448-00-8
Printed in Austria
© 2003
IMAGES
Augustine, La Cité de Dieu (Book 1-10)
Paris, c. 1400-1410
Volume I
Image on p. 5 from Fol. 264r, size 80x75
illuminator: Orosius Master a.o.

Bible Historiale, Paris, c. 1320-1340
Volume I
Image on p. 7 from Fol. 2r, size 45x50
illum.: ‘Sub-Fauvel’ Master

Historic Bible, Utrecht, c. 1430
Volume I
Images on pp. 8, 10, 14, 28
from Fol. 3r, 3v, 4v, 7v, size 55x85 to 60x85
illum.: Alexander Master o.a.

Psalter, Breviary of St Bridget, Den Bosch
Monastery Marienwater, Bridgettines, 1468
Images on pp. 11, 13
Wednesday Matins, Invitatorium
from Fol. 219r

Jacob van Maerlant, Der Naturen Bloeme
Flanders, c. 1350
Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron)
from Fol. 104rb1, 106rb2, 124r, 111ra
size 40x55 to 50x55

Jacob van Maerlant, Spieghel Historiael
West Flanders, c. 1325-1335
Image on p. 21 from Fol. 4va1
size 45x55

Lambert of St Omer, Liber Floridus
Lille and Ninove, 1460
Images of Signs of the Zodiac
on pp. 22, 23, 24, 38, 39, 40, 41

Psalter, Normandy, c. 1180
Image on p. 26 from Fol. 3v
size 160x125

Breviary, Cambray(?), c. 1275-1300
Images on pp. 27, 29, 31, 33, 34, 35, 37
from Fol. 211r, size 205x135

© Koninklijke Bibliotheek, The Hague
Used with permission
INTRODUCTION AND OVERVIEW
By Guntram Geser

Philosophy in Discussion With a Philosopher

FUNCTION AND FOCUS

DigiCULT, as a support measure within the Information Society Technologies Programme (IST), will for a period of 30 months (beginning March 2002) provide a technology watch mechanism for the cultural and scientific heritage sector. Backed by a network of peer experts, the project monitors, discusses and analyses existing and emerging technologies likely to bring benefits to the sector.

To promote the results and encourage early take-up of relevant technologies, DigiCULT has put in place a rigorous publication agenda of seven Thematic Issues and three in-depth Technology Watch Reports, as well as the DigiCULT.Info e-journal, pushed to a growing database of interested persons and organisations on a regular basis. All DigiCULT products can be downloaded from the project Web site, http://www.digicult.info, as they become available. The opportunity to subscribe to DigiCULT.Info is also found here.

March 2003 saw the release of the first DigiCULT Technology Watch Report. This report covers the topics Customer Relationship Management, Digital Asset Management Systems, Smart Labels and Smart Tags, Virtual Reality and Display Technologies, Human Interfaces, and Games Technologies. Addressing primarily technological issues, it serves as a guide to what a heritage institution needs to consider when buying into one of these technologies.

In comparison with the Technology Watch Reports, the Thematic Issues focus more on the organisational, policy and economic aspects of the technologies under consideration. They are based on the expert round tables organised by the DigiCULT Forum secretariat. In addition to the Forum discussion, they provide opinions of other experts in the form of articles and interviews, case studies and short descriptions of related projects, together with a selection of relevant literature.

TOPIC AND CHALLENGE

This third Thematic Issue addresses the questions: What is the Semantic Web? What will it do for heritage institutions? And what is the role of certain languages, in particular XML and RDF?

In short, the Semantic Web vision proclaims a Web of machine-readable data which allows software agents to automatically carry out rather complex tasks for humans. Key to realising this vision is semantic interoperability of Web resources. Yet such interoperability is not the primary goal of heritage institutions (and intelligent software agents are not readily at hand).

What the institutions are looking for are new ways of providing scholarly and non-expert users (e.g. school classes, lifelong learners) with access to their collections and related knowledge. This goal can be accomplished, for example, through online collections and exhibitions that not only display objects and simple descriptions (drawn from metadata) but also allow for understanding relationships between objects (created by semantically interrelated metadata). The Semantic Web community promises to assist in achieving this goal, but the challenge for the heritage institutions would be to first implement the necessary data infrastructure.
The challenge for the Semantic Web expert round table was, or at least the DigiCULT Secretariat thought it was, not to run into a debate between ‘theory’ and ‘practice’; in other words, between what academic Semantic Web scholars and what practitioners from heritage institutions think needs to be accomplished, what is feasible and affordable, and where to concentrate efforts. For the discussion, XML seemed to provide a good starting point: XML, on the one hand, is increasingly considered by heritage institutions as a key standard for publishing metadata on the Web; on the other hand, it is a major building block for the Semantic Web. It proved different, in a positive sense. In the discussion, wide use of XML was taken for granted, while the key area of interest that surfaced, and was seen to be most fruitful to explore, was ontologies.
OVERVIEW
Setting the context for this Issue, the position paper looks into the requirements for achieving the goals of the Semantic Web and assesses whether the available technologies will be able to deliver on what the advocates of the Semantic Web envisage, as well as whether the cultural heritage sector is in a position to take substantial steps towards semantic interoperability. It concludes with the argument that the sector is more likely to be left behind, due in particular to the fact that, for the institutions, the rewards for the necessary investments are still too nebulous.
Janneke van Kersen from the Dutch Digital Heritage Association, in her interview with the DigiCULT Journalist, suggests that, despite the cloudy Semantic Web horizon, there are medium-term benefits to be gained for heritage institutions in taking steps towards the vision. And she states that it is up to associations like hers, together with larger institutions, to take the lead in this, prove that proposed solutions work, and support smaller institutions in taking advantage of them. On the other hand, Nicola Guarino, in his interview, believes that reaching the ‘real’ Semantic Web lies in taking ‘the fundamental route’ of implementing generic ontologies, based on linguistics and logics, within the Semantic Web fabric. He also claims that even incremental progress along this path can have remarkable pay-offs.
Michael Steemson’s summary of the Darmstadt Forum illustrates that the Semantic Web topic resembles a labyrinth, with currently no definite map or Ariadne’s Thread at hand. Building on the many technologies the Forum participants mentioned as some of the labyrinth’s angles, we have added to the summary a list of resources related to these technologies.
In an effort to raise the veil of mystery surrounding the Semantic Web, this issue includes an example from the sector on the implementation of semantic interoperability of metadata, combined with a primer that explains core building blocks such as XML, RDF and ontologies. While a detailed primer of, for example, RDF would alone exhaust the limits of this issue,1 the goal here is to deliver an ‘all-inclusive’ primer within the space permitted, with all the inevitable limitations this entails. The primer attempts to provide a general understanding of the Semantic Web architecture without obliging the reader to wander through the long and perplexing corridors of language specifications.
Finally, we want to thank the Koninklijke Bibliotheek, National Library of the Netherlands, for their kind permission to use selected images from their collection of illuminated medieval manuscripts.2 We hope you will appreciate the little narratives they represent within the overall fabric of this DigiCULT Thematic Issue.
1 Cf. F. Manola, E. Miller: RDF Primer (W3C Working Draft, 23 January 2003), http://www.w3.org/TR/rdf-primer
2 See their online collection of such images at http://www.kb.nl/kb/manuscripts, which offers advanced search and presentation features.
POSITION PAPER
By Seamus Ross

Genesis – The Creation: Division of Light and Darkness

Tim Berners-Lee and his colleagues at W3C have recognised that the real benefits of the web-based information revolution will come from enabling the interoperability of content. The current generation of web delivery is, they have argued, designed for human users, who struggle to make effective use of the billions of pages of information currently accessible. When we search for something at the moment, we sometimes discover suitable candidate information, but more often than not this is far from being the case. More than this, the entire process of searching, discovery and use is designed to be driven by humans. When we discover one piece of the puzzle, we need manually to position that information so that it can help us to search out the next piece of the puzzle. We find that Darmstadt is near Frankfurt. Then we find that there are flights from Glasgow to Frankfurt and there is a bus from Frankfurt Airport to Darmstadt. Then I search for timetables, make manual comparisons and decide which times best meet my requirements. In the Shangri-La that is the Semantic Web, my ‘agent’ would recognise from its regular review of my diary that I needed to be at a meeting in Darmstadt on the 21st of January 2003, and it would search out the options, analyse the timetables, identify the optimum travel arrangements, book my non-smoking hotel accommodation and order the taxi to take me to the airport. (It might even check the weather forecasts and warn me to bring particular types of clothing.) Certainly, to make this happen there has to be a fundamental shift in the way data, information and knowledge are represented on the web.
The proliferation of web-based resources makes finding what you are looking for increasingly difficult. According to Internet user studies, in 1996 50% of Internet users reported spending time looking for information without finding it, but by 2002 only about 40% of users ended their ‘searching sessions’ unsuccessfully. At first glance we might conclude that web discovery tools have improved and/or the information searching skills of users have improved. Over the past seven years the quantity of content has mushroomed, the search tools have become more efficient, developers approach the use of meta-tags more effectively, and anecdotal evidence suggests that the searching techniques of users have become more sophisticated. We should continue to be surprised by the high failure rate and wonder why it remains proportionally so high as the numbers of users have grown to nearly 600 million. In reality, there is just too much content available. It is poorly described. It is not interconnected. Search engines themselves are blunt instruments. Most users of the web do not have very mature searching strategies and rarely use even the blunt instruments as effectively as they might. A solution is to make more of the information capable of discovery, interpretation and reuse by automated information processing tools themselves. However, the current ways content is represented on the web make it nearly impossible for machines to search the web meaningfully and effectively; even with the limitations of their skills and tools, humans are better at searching the web than the most powerful of the current generation of agents. The emergence of the Semantic Web would solve this problem.
The web has made us realise the tremendous potential of digital resources and made them widely available. Content as presented on the web currently is mute. By adding descriptive information to content and resources, and representing both the descriptive information and the content in well-defined, consistent and structured ways, ‘mechanised agents’ could be enabled to use web information ‘intelligently’. Tim Berners-Lee, Jim Hendler and many other researchers believe that commercial and public sector institutions are increasingly recognising the benefits of ensuring that their content is adequately represented so that it is visible and discoverable within the context of the Semantic Web.

Genesis – The Creation: Division of the Waters Above and Below the Firmament
The Semantic Web will enable the heritage sector to make its information available in meaningful ways to researchers, the general public and even its own curators. The public will be able to plan visits to institutions by, for example, dynamically relating opening times to public transport schedules. They will be able to use information to discover whether or not that vase in the attic or basement is really Ming, as their grandmother claimed, by comparing it to the holdings of heritage institutions across the world. Curators will benefit from the ability to define an exhibition and have the entire process, from the identification of the pieces to be shown in the exhibition to the production of the catalogue and publicity material, automatically handled by their ‘exhibition agents’.
TOWARDS AN INTEROPERABLE SEMANTIC WEB FOR HERITAGE RESOURCES
Delivering the Semantic Web to the heritage sector depends upon (a) the syntactical and semantic mark-up of content, (b) the development of better knowledge analysis and modelling tools, (c) widespread adoption of interoperable knowledge representation languages, and (d) the construction of suitable ontologies. In most of this, the heritage sector is lagging behind. We have not yet successfully represented sufficient quantities of our data in ways that make it accessible to human web users, let alone in ways that would make it feasible for ‘mechanised agents’ to reason about in meaningful ways. ‘Languages for representing data and knowledge are an important aspect of the Semantic Web’ (Klein 2001, 26). The languages that are currently the focus of the most substantial discussion, such as RDF, DAML+OIL and OWL,1 do not necessarily provide a suitable framework for delivering the Semantic Web. This point has been increasingly argued in the literature, although in practice we still tend to emphasise the possibilities of representation mechanisms such as RDF(S) because it provides a flexible and extensible mechanism to represent metadata.

A debate is raging about which language should be used to represent semantics on the web. Resource Description Framework (RDF), an XML based mechanism for expressing metadata, has been put forward at the basic level, but there is a growing body of opinion that indicates it does not have the richness that is necessary to make a suitable language. One of its shortcomings is that it cannot support syntax. In response, other languages such as DAML+OIL have been developed. As an indication of the current levels of flux, in a fundamental paper Patel-Schneider and Siméon from Bell Labs Research remark that ‘…there is a semantic discontinuity at the very bottom of the Semantic Web, interfering with the stated goal of the Semantic Web. If Semantic languages do not respect World-Wide Web data, then how can the Semantic Web be an extension of the World-Wide Web at all?’ (2002a, 147).
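To make the idea of RDF as a mechanism for expressing metadata a little more concrete, the following minimal Python sketch models metadata as subject-predicate-object statements and follows links between them. It is an illustration only and uses no RDF library or RDF syntax: the `Triple` type, the `find` and `titles_by` helpers and the `ex:` and `dc:` style identifiers are hypothetical names introduced here, and the two manuscript titles are simply borrowed from the image list of this Issue.

```python
from typing import Iterator, List, NamedTuple, Optional


class Triple(NamedTuple):
    """One metadata statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


# Hypothetical identifiers, invented for illustration only.
TRIPLES = [
    Triple("ex:object/42", "dc:title", "Der Naturen Bloeme"),
    Triple("ex:object/42", "dc:creator", "ex:person/maerlant"),
    Triple("ex:object/43", "dc:title", "Spieghel Historiael"),
    Triple("ex:object/43", "dc:creator", "ex:person/maerlant"),
    Triple("ex:person/maerlant", "ex:name", "Jacob van Maerlant"),
]


def find(subject: Optional[str] = None,
         predicate: Optional[str] = None,
         obj: Optional[str] = None) -> Iterator[Triple]:
    """Yield every triple that agrees with each argument that is not None."""
    for t in TRIPLES:
        if subject is not None and t.subject != subject:
            continue
        if predicate is not None and t.predicate != predicate:
            continue
        if obj is not None and t.obj != obj:
            continue
        yield t


def titles_by(creator_name: str) -> List[str]:
    """Follow name -> person -> work -> title across the statements."""
    titles = []
    for person in find(predicate="ex:name", obj=creator_name):
        for work in find(predicate="dc:creator", obj=person.subject):
            for title in find(subject=work.subject, predicate="dc:title"):
                titles.append(title.obj)
    return sorted(titles)


if __name__ == "__main__":
    print(titles_by("Jacob van Maerlant"))
```

The point of the sketch is that relationships between objects are themselves data that a program can traverse; what the statements mean is still a matter of agreement on the vocabulary, which is where the richer languages discussed above come in.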
The strength of XML is that it does not itself constrain how the data will be interpreted. While XML does not imply a specific interpretation of the data, how the material is marked up does constrain how it can be used. Fallside (2001) has made plain the weaknesses of using DTDs as a way of specifying semantic properties in XML (eXtensible Markup Language). XML Schemas offer a solution to these weaknesses, especially where those weaknesses arise from representational problems. On the other hand, the hierarchical nature of XML does not fit all domains; it ‘does not encode the data’s use and semantics’, and DTDs and XML Schemas do not specify the data’s meaning, although they do specify the names of elements and attributes. Will the Semantic Web produce different levels of sophistication in the representation of data and knowledge in the web-world? If it does, will this create a patchy representation of web information that will make the Semantic Web of limited value?
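The observation that XML mark-up names elements without saying what they mean can be illustrated with a small Python sketch. It assumes two hypothetical museum records whose element names were invented for this example (the titles and dates are borrowed from the image list of this Issue). Both records are well-formed XML, yet nothing in the mark-up tells a program that `title` and `name` describe the same thing; that knowledge has to be supplied from outside, here as a hand-made `CROSSWALK` table, which is also an assumption of the sketch.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Two hypothetical records; element names are invented for illustration.
RECORD_A = "<object><title>Psalter</title><made>c. 1180</made></object>"
RECORD_B = "<item><name>Breviary</name><date>c. 1275-1300</date></item>"

# The agreement about meaning lives outside the XML documents themselves.
CROSSWALK: Dict[str, str] = {
    "title": "title",
    "name": "title",
    "made": "creation_date",
    "date": "creation_date",
}


def raw_tags(xml_text: str) -> List[str]:
    """Return the child element names exactly as they appear in the mark-up."""
    return sorted(child.tag for child in ET.fromstring(xml_text))


def to_common(xml_text: str) -> Dict[str, str]:
    """Map a record onto shared field names using the external crosswalk."""
    root = ET.fromstring(xml_text)
    return {
        CROSSWALK[child.tag]: (child.text or "")
        for child in root
        if child.tag in CROSSWALK
    }


if __name__ == "__main__":
    # Without the crosswalk the two records share no element names.
    print(set(raw_tags(RECORD_A)) & set(raw_tags(RECORD_B)))
    # With it, both records can be read as the same kind of description.
    print(to_common(RECORD_A))
    print(to_common(RECORD_B))
```

A crosswalk of this kind is only a flat mapping between names; it is a much weaker device than the ontologies discussed in the next section, but it shows where the meaning has to come from.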
1 See the ‘Semantic Web Terms and Reading List’ in this Issue.
ONTOLOGIES – THE JEWELS OF THE SEMANTIC WEB
For the Semantic Web to succeed, it will require not only modelling languages such as XML, RDF and OWL, but also methodologies for extracting and defining the knowledge that is to be represented. Decades of research and commercial attempts to exploit knowledge-based systems have demonstrated the complexity of knowledge modelling. Until there is such a methodology, the possibilities of XML (or any other technology) as a knowledge representation language will not be achieved.
The success of the Semantic Web will depend heavily upon the creation of suitable ontologies. To avoid adding new variants to definitions, we will follow James Hendler’s definition of ontology as ‘a set of knowledge terms, including the vocabulary, the semantic interconnections, and some simple rules of inference and logic for some particular topic’ (Hendler 2001, 30). One of the major hurdles facing us in building the Semantic Web is the lack of suitable ontologies. Languages such as OWL enable ontologies to represent ‘class taxonomies’ and provide mechanisms to enable their rapid development. For example, concepts and relationships can be established such as ‘watercolour is a type of painting’ or ‘a necklace is a type of jewellery’. But what about their multilingual capabilities? An ontology may well know that a ‘watercolour is a painting’, but it does not necessarily mean that it knows that an ‘aquarelle is a type of painting’ or that a ‘watercolour is a type of peinture’. In addition, and probably first, we need to consider:
| Can we cost the creation of appropriate ontologies for the heritage sector?
| How can we prioritise the ontologies that are needed (e.g. which ones should the heritage sector develop and which ones will we be able to borrow from other sectors)?
| What heritage-based organisations should focus on ontology creation?
| Ontologies often fail to be interoperable. What solutions are there to this problem and how can they be made to work effectively?
| Does OWL (W3C’s Web Ontology Semantic Markup Language for publishing and sharing ontologies) provide a suitable mechanism for ontology creation for the heritage sector?
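The watercolour and aquarelle example above can be sketched in a few lines of Python. This is not OWL and does not use any ontology toolkit; it is a deliberately small model, under the assumption that a class taxonomy is a set of ‘is a type of’ links between language-neutral concepts, with labels attached per language. The `IS_A` and `LABELS` tables and the function names are hypothetical and exist only for this illustration.

```python
from typing import Dict, Optional, Set

# Concept -> broader concept ("is a type of"). Hypothetical mini-taxonomy.
IS_A: Dict[str, str] = {
    "watercolour": "painting",
    "necklace": "jewellery",
}

# Language -> (label -> concept). Without the "fr" entries the taxonomy
# would know nothing about 'aquarelle' or 'peinture'.
LABELS: Dict[str, Dict[str, str]] = {
    "en": {
        "watercolour": "watercolour",
        "painting": "painting",
        "necklace": "necklace",
        "jewellery": "jewellery",
    },
    "fr": {
        "aquarelle": "watercolour",
        "peinture": "painting",
    },
}


def concept_for(label: str, lang: str) -> Optional[str]:
    """Resolve a label in a given language to a concept, if one is known."""
    return LABELS.get(lang, {}).get(label)


def broader_concepts(concept: str) -> Set[str]:
    """Collect every concept reachable by following 'is a type of' links."""
    found: Set[str] = set()
    while concept in IS_A:
        concept = IS_A[concept]
        if concept in found:  # guard against accidental cycles
            break
        found.add(concept)
    return found


def is_type_of(label: str, lang: str, parent_label: str, parent_lang: str) -> bool:
    """True if both labels resolve and the first concept is narrower."""
    child = concept_for(label, lang)
    parent = concept_for(parent_label, parent_lang)
    if child is None or parent is None:
        return False
    return parent in broader_concepts(child)


if __name__ == "__main__":
    print(is_type_of("watercolour", "en", "painting", "en"))
    print(is_type_of("aquarelle", "fr", "painting", "en"))
    print(is_type_of("watercolour", "en", "peinture", "fr"))
```

In this sketch the multilingual questions succeed only because the French labels were added by hand to `LABELS`; remove them and the taxonomy still ‘knows’ that a watercolour is a painting but can say nothing about an aquarelle, which is the gap the paragraph above points to.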
Gomez-Perez and Corcho (2002), in an analysis of ‘Ontology Languages for the Semantic Web’, found that the measure of expressiveness in the current generation of ontology creation languages is a spectrum from XOL, RDF(S), SHOE, OML and OIL to DAML+OIL at the richest end of the scale. Indeed, in their experience, while any of these languages will work for very simple ontologies, any attempt to use a weak language to create a complex ontology will fail.
Proof and trust are emerging as another central issue. How do we know that what our agent has discovered through its trawl of the Semantic Web can be trusted? Even in the case of ontologies, how should we decide whose ontology to trust? This is especially important where two ontologies may conflict with one another. Similarly, we are faced with the difficulties of ensuring and maintaining semantic integrity and a lack of methods for testing its presence.
LEGITIMISING THE SEMANTIC WEB INVESTMENT
Heflin and Hendler (2001) make the valuable proposal that semantic mark-up should be seen as one aspect of web-page design. This, in their view, would go a long way to ensuring that the costs of this mark-up (and the underlying information analysis that is necessary to make it happen) were met at the appropriate stage of the process of putting material up on the web. However, Heflin and Hendler’s proposal that semantic mark-up should be embedded into web-page design fails to recognise that the fundamental fabric of the web is changing. For this to happen, we need a stronger argument for the benefits that such investment will bring to the heritage sector.
Haustein and Pleumann (2002) have noted that the successful development of the World Wide Web benefited from two factors: ‘Participation was simple, and the results of effort were immediately visible to the creator.’ As they argue, while these two success criteria, best classified in my view as ease of use and instant gratification, were characteristic of the WWW, they are not embedded into the fabric of the Semantic Web. The Semantic Web is hard, and rewards are neither immediate nor assured. While in the long term it may bring tremendous benefits, the near-term take-up will be slow.
At least three other factors contributed to fostering the success of the web. Firstly, the early web developments concentrated on content creation and not on the creation of representation languages. The initial instantiation of HTML was simple, but it worked, and material tagged using it remained accessible. We were not forced to dispose of work that we had ‘webised’ unless we wished to replace its representation with more sophisticated ones. Secondly, the value of the content that we put up increased as more users put up content of their own, because the additional content attracted more users. Thirdly, to benefit you did not need to generate a lot of content; a very little would do, and you could incrementally add more later. Slowly, heritage institutions found ways to take advantage of the opportunities offered by the web; there are still many small and medium-sized heritage institutions that have not.
Genesis – The Creation: Division of Sea and Earth; Creation of Trees and Plants
Indeed, the heritage sector is likely to be left behind because the financial rewards for creating the mark-up necessary to make the Semantic Web a reality are only evident to the commercial sector. There can be little doubt that the access to and understanding of the heritage would benefit from a world in which the vision of the Semantic Web were realised. But this is not the first information technology for which the benefits were promising. Even very simple strategies, such as the use of databases to enable collection description, have been shown over a period of nearly thirty-five years to bring benefits to heritage sector institutions through better knowledge about, care of and access to their collections. In the ALM sector, only libraries can be said to have fully taken advantage of the technology to describe their collections, and even here a close look shows that this has not covered all their holdings, nor every institution. For instance, few libraries in the UK have online catalogues of their pre-1700 items, and almost none have accurately described their photographic holdings at anything deeper than collection level. The same can be said of museums, where descriptions are limited except, of course, at the major institutions. In 1997 a survey in the United Kingdom showed that small and medium-sized institutions were struggling to participate in the computer-based description of their holdings. This was even before they considered putting the output of those holdings online. I would argue that this should hardly be surprising, as the heritage sector has already been left behind in the development of online information in the web-world. Too few institutions have too little visible content that is actually usable. If the heritage sector is to make a near-term contribution to the development of the Semantic Web, it is going to be very moderate. It is very unlikely that developments will be related to reasoning about the heritage in the ways considered by Amann et al. (2002). The ALM sector is more than likely to participate in the development of the Semantic Web through the creation of semantic mark-up of information about access arrangements, such as opening hours and details of facilities. This information is more likely to be useful to the tourism agent described by Tim Berners-Lee in his 2001 Scientific American article. While this may be a very positive way of integrating the heritage into the Semantic Web, it does not maximise the potential benefits.
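To make the idea of semantic mark-up for access arrangements more concrete, the following is a minimal sketch in Python, not a description of any existing heritage system. It models statements about an institution as subject-predicate-object triples, in the spirit of RDF, and shows how a simple tourism agent might query them. The institution, the `ex:` property names and the opening times are all invented for illustration and do not belong to any real vocabulary.

```python
# Hypothetical triples about one invented institution. The "ex:" property
# names are placeholders for illustration, not a real RDF vocabulary.
TRIPLES = [
    ("museum:example", "ex:name", "Example Town Museum"),
    ("museum:example", "ex:opens", "10:00"),
    ("museum:example", "ex:closes", "17:00"),
    ("museum:example", "ex:facility", "wheelchair access"),
    ("museum:example", "ex:facility", "cafe"),
]


def objects(triples, subject, predicate):
    """Return every object stated for the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def is_open(triples, subject, hhmm):
    """Return True if hhmm (zero-padded 24-hour "HH:MM") falls within
    the stated opening hours; False if the hours are not stated."""
    opens = objects(triples, subject, "ex:opens")
    closes = objects(triples, subject, "ex:closes")
    if not opens or not closes:
        return False
    # Zero-padded "HH:MM" strings compare correctly as plain strings.
    return opens[0] <= hhmm < closes[0]


if __name__ == "__main__":
    print(is_open(TRIPLES, "museum:example", "11:30"))
    print(objects(TRIPLES, "museum:example", "ex:facility"))
```

Even a narrow description of this kind only becomes useful to an agent once many institutions publish it using the same agreed property names, which is the coordination problem raised above.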
The days when a curator who wishes to hold an exhibition on the representations of Salome since the 15th century will be able to ‘load’ an agent with the request to identify, select, negotiate the loan of, and arrange the transportation of the key 100 works of art are a long way off. The fundamental descriptions of holdings are not currently available; where they exist, they are not online, and they certainly have not been semantically encoded to make them usable by our Salome agent. For those who have worked on Knowledge Representation, the vision of the Semantic Web holds promise. Knowledge Representation is hard, especially if you intend any particular representation to be usable by others, either as a decision-making resource or for research purposes. Efforts in the 1980s and early 1990s, such as those in archaeology, failed. The reasons Knowledge Representation failed to achieve its promise ranged from the poor quality of knowledge extraction strategies, the lack of fundamental representation methodologies, the limited applicability of methods to knowledge domains, and the problems of boundary constraint and creep, to the high costs of developing applications. The Semantic Web could breathe new life into this earlier promise by providing ways to carve up the problem while bringing us immediate successes.
CONCLUSION
Over the next five years the possibilities offered by the Semantic Web will bring little near-term benefit for the heritage sector unless that sector co-ordinates its efforts to ensure that the fundamental building blocks that are necessary for the Semantic Web to be a success are put in place. We need some quick wins. A quick win involves identifying a domain that every institution can be encouraged to represent semantically, and the placing in the public domain of a ‘personalisable agent’ that can take advantage of these semantics. Three factors could underpin a quick win: (a) a narrowly restricted knowledge domain of real public value; (b) an accessible and narrow ontology; and (c) a personalisable tool for processing knowledge.
Ultimately, the same factors that constrain the heritage sector’s ability to take full advantage of the Web will constrain the penetration and pervasiveness of the Semantic Web in the heritage sector. The success of the Semantic Web in the heritage sector depends upon its adopting an XML-based approach and upon a significant experiment that demonstrates its benefits to the wider community. Even for all its weaknesses, the Semantic Web offers a tantalising solution to the problem of information overload created by the web, and the heritage sector needs to address how it can take advantage of the opportunities it offers.
Bibliography
Amann, B., Beeri, C., Fundulaki, I. and Scholl, M., 2002, ‘Ontology-Based Integration of XML Resources’, in I. Horrocks and J. Hendler (eds), The Semantic Web - ISWC 2002, Berlin: Springer, 117-131.
Berners-Lee, T., Hendler, J. and Lassila, O., 2001, ‘The Semantic Web’, in Scientific American, May 2001.
Candan, K.S., Liu, H. and Suvarna, R., 2001, ‘Resource Description Framework: Metadata and Its Application’, in SIGKDD Explorations, 3(1), 6-19. http://www.acm.org/sigs/sigkdd/explorations/issue3-1/candan.pdf
Doan, A., Madhavan, J., Domingos, P. and Halevy, A., 2002, ‘Learning to Map between Ontologies on the Semantic Web’, in Proceedings WWW2002, 7-11 May 2002 (Honolulu), 662-673.
Gómez-Pérez, A. and Corcho, O., 2002, ‘Ontology Languages for the Semantic Web’, in IEEE Intelligent Systems, January/February 2002, 54-60.
Haustein, S. and Pleumann, J., 2002, ‘Is Participation in the Semantic Web Too Difficult?’, in I. Horrocks and J. Hendler (eds), The Semantic Web - ISWC 2002, Berlin: Springer, 448-453.
Heflin, J. and Hendler, J., 2001, ‘A Portrait of the Semantic Web in Action’, in IEEE Intelligent Systems, March/April 2001, 54-59.
Hendler, J., 2001, ‘Agents and the Semantic Web’, in IEEE Intelligent Systems, March/April 2001, 30-37.
Hendler, J., Berners-Lee, T. and Miller, E., 2002, ‘Integrating Applications on the Semantic Web’, in Journal of the Institute of Electrical Engineers of Japan, 120(10), 676-680.
Klein, M., 2001, ‘XML, RDF and Relatives’, in IEEE Intelligent Systems, March/April 2001, 26-28.
McGuinness, D.L., Fikes, R., Hendler, J. and Stein, L.A., 2002, ‘DAML+OIL: An Ontology Language for the Semantic Web’, in IEEE Intelligent Systems, September/October 2002, 72-80.
Patel-Schneider, P. and Siméon, J., 2002a, ‘Building the Semantic Web on XML’, in I. Horrocks and J. Hendler (eds), The Semantic Web - ISWC 2002, Berlin: Springer, 147-161. http://www-dbs.cs.uni-sb.de/lehre/ss03/xml-seminar/Material/PS02.pdf
Patel-Schneider, P. and Siméon, J., 2002b, ‘The Yin/Yang Web: XML Syntax and RDF Semantics’, in Proceedings WWW2002, 7-11 May 2002 (Honolulu), 443-453.
Renear, A., Dubin, D., Sperberg-McQueen, C.M. and Huitfeldt, C., 2002, ‘Towards a Semantics for XML Markup’, in DocEng’02, 8-9 November 2002 (McLean, VA), ACM Publication.
Shah, U., Finin, T., Joshi, A., Cost, R.S. and Mayfield, J., 2002, ‘Information Retrieval on the Semantic Web’, in CIKM’02, 4-9 November 2002, 461-468.
DEVELOPMENT OF THE SEMANTIC WEB MUST BEGIN AT THE GRASS ROOTS LEVEL
AN INTERVIEW WITH JANNEKE VAN KERSEN, DUTCH DIGITAL HERITAGE ASSOCIATION, THE NETHERLANDS
By Joost van Kasteren
‘To be successful, the Semantic Web for the cultural heritage sector will have to develop from the grass roots level. A top-down approach, whereby institutions have to squeeze themselves into a certain format, is not going to work.’ Janneke van Kersen has strong views on the initiatives that are currently being undertaken to develop a Semantic Web. ‘They are not going to work for the cultural heritage institutions if you do not take into account the position that they are in. Especially not if the institutions are forced to overhaul their digitisation projects completely.’
Kersen graduated in Art History and did a postgraduate course on Historical Information Processing. Since 1999 she has been a consultant with the Dutch Digital Heritage Association (Vereniging DEN), which supports cultural heritage institutions, large and small, in developing strategies to face the digital future.
The key objectives of the DEN are to assist institutions in digitising and documenting their collections according to high quality standards, and assuring cross-domain and cross-institutional access to heritage information in a context-rich, structured environment. The methods used to realise these objectives are knowledge dissemination, best practice and standardisation. The Association propagates open standards like XML, OAI and Dublin Core (qualified).
The DEN has approximately 60 member institutions, among them most of the large heritage institutions of The Netherlands. It provides access to the databases of the member organisations through the portal http://www.cultuurwijzer.nl. The Cultuurwijzer (culture pointer) to the collections uses the Aqua Browser to search for terms in a non-hierarchical, associative way. Databases can also be accessed through subject fields based on the Dublin Core standard. Research is carried out to apply the Art and Architecture Thesaurus in a post-coordinative way, using it as an additional search aid.
Kersen: ‘The Dublin Core has some drawbacks, but it is one of the few international standards available for exchange of information. Mapped to the 5 Ws (who, where, what, when and why), it turns out to be a nice tool for interoperability across the databases of heritage institutions. Of course, we have to accept a certain kind of fuzziness and lack of precision compared with domain-specific access at the institutional level.’
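The interview does not say which Dublin Core elements the DEN assigns to which of the 5 Ws, so the following Python sketch rests on an assumed mapping chosen purely for illustration. The element names (`creator`, `coverage`, `date` and so on) are genuine Dublin Core elements; the grouping into who, where, what, when and why, and the sample record, are hypothetical.

```python
from collections import defaultdict

# Assumed, illustrative assignment of Dublin Core elements to the 5 Ws.
# This is NOT the mapping used by the DEN, which the text does not give.
FIVE_WS = {
    "creator": "who",
    "contributor": "who",
    "publisher": "who",
    "coverage": "where",
    "title": "what",
    "subject": "what",
    "type": "what",
    "date": "when",
    "description": "why",
}


def to_five_ws(record):
    """Group the values of a flat Dublin Core record under the 5 Ws.

    Elements without an assigned W are dropped, which is one source of
    the 'fuzziness and lack of precision' mentioned in the interview.
    """
    grouped = defaultdict(list)
    for element, value in record.items():
        facet = FIVE_WS.get(element)
        if facet is not None:
            grouped[facet].append(value)
    return dict(grouped)


if __name__ == "__main__":
    # Hypothetical record from one institution's database.
    record = {"title": "Salome", "creator": "Unknown", "date": "1500", "format": "oil"}
    print(to_five_ws(record))
```

Two databases with different local fields can be searched together once each is mapped to the same five facets, at the cost of losing whatever does not fit.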
According to Kersen, a real Semantic Web is still a long way off. ‘We simply do not have the tools yet for a meaningful exchange and representation of information. XML and RDF do not provide the interoperability that is needed. On the other hand, I do not believe in developing a fundamental ontology to give meaning to information on the Net. It looks to me like the 18th-century endeavour to write an encyclopaedia that contains all the knowledge in the world. I am afraid it does not work that way. A lot of knowledge, even scientific knowledge, cannot be described in a logical way. Especially in the arts, a lot of “knowledge” is the result of heuristics and associative thinking. Apart from that, there is the practical problem that cultural heritage institutions do not have the money and the staff to describe their collections anew in a way that fits the ontology. That is just too much work.’
Kersen thinks the Semantic Web will grow gradually from the grass roots level onward. Within the Dutch Digital Heritage Association she can point at several initiatives. Considered on their own they might not seem like much, i.e. in relation to the gargantuan task of developing a Semantic Web, but when they are combined a certain pattern begins to emerge.
There is, for instance, a project, SitecheckerRDF, which will look into ways in which RDF can be used for describing content for Web-based delivery. Furthermore, several standardisation projects are running that enable the participating institutions to develop the description of their specific knowledge domain, e.g. graphic domain, religion, art history. The formal and semantic mapping schemes used in these projects will include Dublin Core, Encoded Archival Description (EAD), IMS Learning Resources Metadata Specification and the Art & Architecture Thesaurus, as well as the CIDOC reference model. Kersen: ‘At the moment most of the reference terms are developed at the level of institutions, which means their use is limited to a certain domain. Or, to put it another way, every domain is developing its own dialect. Maybe the development of a combined reference scheme will be a step towards a Semantic Web.’
Another important project is the development of a scheme for description at the collection level, in order to offer a clearer and more hierarchical access to heritage collections. This project has its roots in the Dutch project for collection level description, MUSIP (Museum Inventarisation Project). The description scheme will be broadened to make it available for other heritage institutions as well, for example archives.
The description and results of these projects and the programme lines will be made available on the Web site of the Vereniging DEN, http://www.den.nl. As Cultuurwijzer is used as a proof of concept, the results will be directly accessible at http://www.cultuurwijzer.nl. (Kersen kindly invites interested parties to put questions directly to their organisation.)
Projects are always carried out in co-operation with the member organisations. Kersen: ‘We first try it ourselves until we are sure it works, a “proof of concept” you could say. These tests are overseen by a small group of automation experts working for our member organisations. The next step is to test the method on a larger scale with some of our member organisations. A larger working group oversees these tests. Only then is the method released to our member organisations. The advantage is that smaller member organisations can ride on the experience of the larger ones. By taking a step-by-step approach we also enhance the level of commitment. You could say we are providing some order in the information chaos that exists on the Internet. A few small steps along the long road towards the Semantic Web.’
Vereniging Digitaal Erfgoed Nederland: http://www.den.nl
Cultuurwijzer: http://www.cultuurwijzer.nl
DIGICULT’S EXPERT 13 TANGLE WITH THE SEMANTIC WEB
By Michael Steemson
It was really very kind of the Moderator. He asked the cultural heritage experts: ‘What is the message that we should give our scientific writer, who will do a write-up of this meeting for the Thematic Issue?’ Their inclinations had not always been clear. But the question focused minds, and they were certain now.
‘I would put my money on the Semantic Web,’ said two of them, not quite in unison. ‘The Semantic Web is a direction; it is like North. You go North, but you never arrive and say “here it is”. This is the Semantic Web,’ said another.
The course of the debate at the Darmstadt DigiCULT Forum had not always been so direct. It had started with the Position Paper’s dismaying thought that ‘the limited understanding of information processing in the heritage sector almost makes the Semantic Web an impossibility to apply’.
It had touched on the semantics of Simeon poetry, art works of the biblical seductress Salome, weather forecasts for the northern English city of York and the revolutionary theories of 16th-century Italian astronomer Galileo. But the experts agreed, finally, that the cultural heritage sector needed the Semantic Web, and that a good deal of education and guidance would be required to make it appreciate that need.
Genesis – The Creation: Birds and Fishes
The experts numbered 13, lucky for some, at this, the third DigiCULT Forum of the European Union’s technology watchdog for cultural and scientific heritage institutions. In the previous 12 months, other forum groups had discussed authenticity and integrity for digitisation programmes and, later, digital asset management systems. Now the Darmstadt 13 - historians, language and information technology scientists, academics and publishers - were looking even further down the information autobahn, to the vision of WWW inventor, Englishman Tim Berners-Lee, who sees a new kind of automated Web that learns and understands each user’s particular requirements and delivers complete, reliable, tested information sets.
In a co-authored May 2001 Scientific American article,¹ Mr Berners-Lee imagines a family facing the horrors of re-scheduling its lives around a mother’s unexpected illness. The sons and daughters rely on Semantic Web ‘agents’, small executable Web files, to search online medical records, hospital bed lists, transport timetables, doctors’ appointment books, road condition reports and home diaries to find treatment, plan travel and re-arrange personal engagements to fit the emergency.
The vision requires huge world-wide investment in time and effort, creating countless ‘ontologies’ containing, perhaps, XML (eXtensible Mark-up Language) and RDF (Resource Description Framework) data, to which the electronic ‘agents’ could refer for understanding before applying to specially formatted Web pages for the information.
The Berners-Lee et al. dazzling forecast is: ‘The Semantic Web will enable machines to comprehend semantic documents and data, not human speech and writings. Properly designed, the Semantic Web can assist the evolution of human knowledge as a whole.’
THE DAZZLING PROSPECTS
Dazzling it is, and the Darmstadt 13 were attracted. But they were not blinded. Moderator Dr Seamus Ross, the Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII), suggested that, while the current WWW content got a lot of ‘bang’ for its development dollars, the Semantic Web needed huge, expensive content before it could work well.
Application of the Berners-Lee ideas to cultural heritage use was a long way off, he thought, and wondered: ‘Is there enough benefit from the Semantic Web in the near term to make it a realisable dream 50 years down the road?’
Italian National Research Council Applied Ontologies Laboratory director Nicola Guarino has been working on the subject for 12 years, and he knows the difficulties. He said: ‘This is the ideal view which Tim Berners-Lee has: machines which work for you, your proxy which works for you, performing these dynamic connections for the Web which preserve meaning. It is pretty ambitious, but this is his idea. I would be happier if, rather than using an automatic proxy, we could just let people establish these dynamic connections using their brain and the Web. This is already something that is not easily done.’
Austria’s Wernher Behrendt had encountered other snags. At Salzburg Research, the secretariat for the DigiCULT Forums, he co-ordinates another European Commission IST project, CULTOS (Cultural Units of Learning - Tools and Services). He conceded: ‘There is a 50-year research vision behind the issue of the Semantic Web,’ and went on: ‘but there are incremental steps that, with good utility, can be built in a reasonable time. One of the intellectual challenges is to break the vision into these manageable steps.’
LANGUAGE REPRESENTATION HITCHES
A CULTOS group had, Behrendt explained, taken one of these incremental steps and built an ontology² for digitised works of art. It had encountered problems with language representation, like: ‘Are there tools to support knowledge representation language? Are the users then actually able to work usefully with that? Can we incorporate the multimedia authoring component, where people who have not built the ontology themselves should then use it to combine multimedia assets with each other?’
Cultural Units of Learning – Tools and Services (CULTOS)
CULTOS is an RTD project co-funded by the European Commission under the Information Society Technologies (IST) Programme, which will run until October 2003. The application domain of CULTOS is intertextual studies in literature and arts. The project is developing a multimedia authoring and presentation environment that allows scholars to make the different relationships between cultural works explicit, in a way that approximates contextualisation in interpretative processes. The result of the authoring processes is multimedia objects called ‘intertextual cultural threads’. These are based on ‘EMMOs’, a novel type of structured multimedia object containing expert knowledge, conforming to current and emerging standards such as XML, SMIL (with interactive extensions), MPEG-7 and RDF.
http://www.cultos.org
1 T. Berners-Lee, J. Hendler, O. Lassila: The Semantic Web. In: Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html
2 Ontology, n. 1. Philosophy. The branch of metaphysics that deals with the nature of being. 2. Logic. The set of entities presupposed by a theory. Collins English Dictionary, Third edition, Glasgow 1991.
The Dutch had doubts too. Dr Janneke van Kersen is an art historian with Digital Heritage Netherlands (Digitaal Erfgoed Nederland, http://www.den.nl), where an XML-based content management system is being combined with a Resource Description Framework (RDF) to join databases from several cultural heritage institutions. She told the experts: ‘I need to be assured that we will be able to build a layered structure that is equally applicable to each knowledge domain. Furthermore, I think that the cultural heritage sector is too much of a niche market to develop this.’
Her countryman, Dr Frank Nack, from the national research institute for mathematics and computer science, CWI (Centrum voor Wiskunde en Informatica, http://www.cwi.nl) in Amsterdam, works with a multimedia and human-computer interaction group. His concern: ‘Our group believes in the Semantic Web, but we needed some mechanisms to structure the information so that various groups can work with it. What we came up with was the belief that you can classify the user at a particular time. But that was simply not good enough.’
The group had found that users change their requirements widely, and these shifts were invisible to a system. ‘Humans can look at material one day, and the next day they look at the same stuff differently and describe it differently because they are in a different mood,’ he said. He characterised the problem as: ‘Now I would like to see something for my work, and now I want to be entertained. Which means I would like to access the information differently.’
He had one other worry: Webised mixed media. He said: ‘This discussion has been heavily linguistic based, which I can understand because most people do still think of the Web as text driven. The issue of describing various media items that are not text will, I think, very soon become important for the Semantic Web. We had better start thinking about that too.’
THE GALILEO CONUNDRUM
The Institute and Museum of Science History in Florence, Italy, has tried to create an ontology around the works and sciences of its city’s famous son, the revolutionary astronomer, mathematician and physicist Galileo Galilei (1564-1642).³ But it ran into difficulties when it came to the radical changes in theory that he created.
Institute relational database expert Andrea Scotti told the Forum: ‘Galileo’s scientific theory negates another scientific theory. This negating or development of theories was very difficult to represent in the ontology when dealing with the time factor. That is central to historical documentation, but representational time was not part of the process available to us.’
Dr Costis Dallas, the Athens chairman of the European communication and technology group Critical Publics, thought the Florence museum’s project was very ambitious. The time argument was difficult because, ‘of course, it isn’t possible to represent time properly within a relational database’. But there were mechanisms - he mentioned software by the Virginia, US, IT group Telos⁴ - that better represented issues of time.
Italian National Research Council’s Nicola Guarino chipped in: ‘The CIDOC⁵ reference models have partial answers to these questions.’
Dr Dallas went on: ‘I do not believe that the whole exercise is futile, but we found that in practice you cannot make a subject language for everybody. It has to be for a community of users. If you provide them with a richer representation, for instance if they can know that this is a person and this person lived in a place, then users will have a much richer experience.’
3 The museum and Web site are rich resources for the life and work of Galileo: http://galileo.imss.firenze.it
4 Telos Corporation, http://www.telos.com, Ashburn, VA, US.
5 CIDOC: International Committee for Documentation of the International Council of Museums, http://www.willpowerinfo.myby.co.uk/cidoc/CIDOCe (ICOM-CIDOC). Forum for documentation interests of museums and related organisations; one of 25 international committees of the International Council of Museums (ICOM).
CAPITALS AND ACRONYMS
The Forum moved on to a discussion of the science and software behind the Semantic Web. Moderator Seamus Ross started by questioning whether ‘key thinkers’ were ‘missing a fundamental point: that Web pages are dead, that database-driven Web pages are the future, that people are going to stop making Web pages and make databases’.
He recalled that James A. Hendler, co-author with Tim Berners-Lee in the Semantic Web paper and a professor in the Department of Computer Science at the University of Maryland, had recommended semantic representation as part of any Web pages. Dr Ross commented: ‘This notion of the Semantic Web is not going to work with these databases. Is this true or not?’
Bert Degenhart-Drenth, the managing director of Netherlands ADLIB Information Systems (http://www.nl.adlibsoft.com), thought it was true. But it was a practical problem that would be solved by projects like the Web Services of the Open Archives Initiative (OAI).
Open Archives Initiative (OAI) Protocol for Metadata Harvesting
The Open Archives Initiative (OAI) is a mainly US-based group of people and organisations that evolved out of a need to increase access to scholarly publications through interoperable digital repositories. Support for the OAI’s goals comes from the Digital Library Federation, the Coalition for Networked Information and from a NSF Grant. One of its major achievements is an application-independent interoperability framework based on metadata harvesting, the OAI Protocol for Metadata Harvesting.
The OAI Protocol is based on the standard Web protocols HTTP and XML, and employs Dublin Core (unqualified) as metadata standard. Heritage organisations who have systems that support the OAI Protocol can expose metadata about the content in their repository, i.e. allow service providers to harvest the data for services such as search engines. In the OAI Protocol, the XML schema is used at two levels: to define the format of responses to all OAI Protocol requests, and to define the format of metadata streams embedded in the GetRecord and ListRecords responses. In both cases, the goal is to provide a mechanism for data validation.
http://www.openarchives.org
The OAI-PMH version 2.0, released in June 2002, can be found at http://www.openarchives.org/OAI/2.0/openarchivesprotocol.htm
See also: John Perkins, A New Way of Making Cultural Information Resources Visible on the Web: Museums and the Open Archives Initiative. Museums and the Web Conference 2001, http://www.archimuse.com/mw2001/papers/perkins
Open Archives Forum Project, http://www.oaforum.org
The Open Archives Forum Project is a two-year Fifth Framework Programme IST accompanying measure that will run until September 2003. The Forum is building a Web-based database on OAI-related projects, software implementations and services, and supports the information exchange between OAI user communities.
Their surveys provide good insight into the status of uptake of the OAI in Europe. For an overview of the results, see S. Dobratz, B. Matthaei: Open Archives Activities and Experiences in Europe: An Overview by the Open Archives Forum. In: D-Lib Magazine, Vol. 9, No. 1, January 2003, http://www.dlib.org/dlib/january03/dobratz/01dobratz.html
Recently one of their workshops, ‘Providing Access to Hidden Resources’ (Lisbon, December 2002), targeted the libraries and archives communities. Requirements, standards, best practice and solutions to interoperability problems of these communities were analysed and compared with the features provided by the OAI Protocol for Metadata Harvesting. The Tutorial ‘OAI and OAI-PMH for Beginners’ and other presentations can be found at http://www.oaforum.org/workshops/lisb_programme.php
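The harvesting requests mentioned in the OAI Protocol box can be illustrated with a short Python sketch. In OAI-PMH a harvester sends ordinary HTTP requests whose query string names a verb such as `ListRecords` or `GetRecord`, together with a `metadataPrefix` (`oai_dc` for unqualified Dublin Core). The repository address and the record identifier below are invented placeholders, and the sketch only builds the request URLs; it does not contact any server or parse the XML responses.

```python
from urllib.parse import urlencode

# Hypothetical repository address, used only for illustration.
BASE_URL = "http://repository.example.org/oai"


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build a ListRecords request URL, optionally limited to one set."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def get_record_url(base_url, identifier, metadata_prefix="oai_dc"):
    """Build a GetRecord request URL for a single record identifier."""
    params = {
        "verb": "GetRecord",
        "identifier": identifier,
        "metadataPrefix": metadata_prefix,
    }
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    print(list_records_url(BASE_URL))
    # "oai:example.org:1" is an invented identifier.
    print(get_record_url(BASE_URL, "oai:example.org:1"))
```

Because the request is just a URL, a database-driven site can answer it from its records directly, which is why the protocol was seen as a practical route around the dynamic-page problem.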
Several members spoke of difficulties created by dynamic Web pages generated from ASP databases. Paul Miller, the UK Interoperability Focus for the University of Bath’s UKOLN (formerly the UK Office for Library Networking) project, said the UK’s Office of the e-Envoy,⁶ the agency in charge of Britain’s e-Government programme, drove it with a Web site called Govtalk.⁷ ‘One big ASP database,’ he called it. It carried the Government’s interoperability standard framework with a URL of ‘a great big long string, something, something, dot ASP. That is how this key piece of legislation is referred to, and next week it will be something else.’
Dr van Kersen thought the new Web Ontology Language, with the transposed acronym OWL, would help. Italian ontology researcher Nicola Guarino discussed the European Commission’s On-To-Knowledge project, its RDF tool Sesame⁸ and the Ontology Inference Layer (OIL) language.
Dr Ross described the US Defense Department’s DARPA (Defense Advanced Research Projects Agency) Mark-up Language (DAML) programme and its DAML+OIL variant. Mr Degenhart-Drenth highlighted the importance of protocols used in Web Services, such as SOAP (Simple Object Access Protocol) and UDDI (Universal Description, Discovery and Integration), and also pointed to a SPECTRUM XML standard for museums that he helped write. Then there was the moviemakers’ audio-visual search standard, MPEG-7, and the SMIL (Synchronized Multimedia Integration Language). No one mentioned SHOE (Simple HTML Ontology Extensions), which was surprising, as the discussion became more and more alphabetic and upper case.
ONTOLOGY TUTORIAL IN 800 WORDS
Nicola Guarino brought the discussion back on track. The ontology expert delivered a fascinating, impromptu 800-word dissertation on ontology genetics.
Ontologies, he said, started because it was realised that controlled vocabularies, which worked well enough for limited periods, needed something extra to make them really useful. They needed clarification of intended meaning. This could be achieved in much the same way as dictionaries did it, by reference to other, more basic terms. This was, he said, the key point.
‘Ontologies can work if the basic terms are really used in a principled way. There is a hidden assumption here, that it is indeed possible to express the meaning in terms of a relatively small set of primitive terms.’
He explained further: ‘There are general terms that have a universal meaning. Take the term “part”, for instance, or “set”. Or take temporal relations: “before”. Suppose I have two different periods, the Renaissance and another period, and suppose I say that this period comes “before” the other one: do you exclude the case whereby the two periods overlap or not? This is just a matter of stipulation; this is a general term that is not domain specific. I can simply stipulate exactly whether the “before” relationship between the two intervals includes the case of overlapping or not. And I can do that by means of axioms. Once you clarify the meaning of the basic terms, topological relations, mereological⁹ relations, dependence relations, these kinds of things, then you have the basic vocabulary that helps you to introduce more domain-related things. And this is what people are doing in the area of what are called “Foundational Ontologies”, and this is what I am doing. I believe this is the only way to solve the problem of semantic interoperability. So, not just controlled vocabularies, but vocabularies that are formally defined in minimal terms.’
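Guarino’s point about stipulating the meaning of “before” can be shown with a small Python sketch. It is only an illustration of the idea, not a rendering of any foundational ontology: each function below stands in for one possible axiom, and the two periods and their years are arbitrary placeholders rather than historical claims.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Period:
    """A named time interval given by start and end years (illustrative)."""
    name: str
    start: int
    end: int


def before_strict(a, b):
    """Stipulation 1: a is 'before' b only if a has ended before b starts,
    so overlapping periods are excluded."""
    return a.end < b.start


def before_allowing_overlap(a, b):
    """Stipulation 2: a is 'before' b if a starts earlier than b,
    even when the two periods overlap."""
    return a.start < b.start


if __name__ == "__main__":
    # Arbitrary placeholder periods that overlap between 1550 and 1600.
    period_a = Period("Period A", 1400, 1600)
    period_b = Period("Period B", 1550, 1700)
    print(before_strict(period_a, period_b))
    print(before_allowing_overlap(period_a, period_b))
```

Two catalogues that both record “A before B” agree only if they share the same stipulation, which is why the choice has to be written down formally rather than left to each vocabulary’s users.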
Now it was clear, but would it be available to heritage institutions? Dr Ross wanted to know if fundamental ontologies of use to the heritage sector already existed. What would they be? Before any development could begin, basic classifications and methodologies would be required to form a foundation for the work.
On-To-Knowledge
On-To-Knowledge is an IST RTD project that was completed in June 2002. The project developed tools and methods for supporting knowledge management in large and distributed organisations. The technical backbone of On-To-Knowledge was the use of ontologies for the various tasks of information integration and mediation. For the project’s many results, see their tools repository, project deliverables and publications at http://www.ontoknowledge.org/pubs.html
See also the On-To-Knowledge book: ‘Towards the Semantic Web: Ontology-driven Knowledge Management’, J. Davies, D. Fensel, F. van Harmelen (eds.), John Wiley, December 2002.
6 Office of the e-Envoy: http://www.e-envoy.gov.uk
7 Govtalk: http://www.govtalk.gov.uk
8 Sesame environment: http://www.ontoknowledge.org/tools/factsheet/Sesame.html
9 Mereology, n.: the formal study of the logical properties of the relation of part and whole. Collins English Dictionary, Third edition, Glasgow, 1991.
Italian online publisher Marco Meli, a manager of the EU’s MESMUSES (Metaphors for Science Museums) project, insisted: ‘You need a clear definition in this particular domain. What are the key words, the terms?’
Mr Guarino answered: ‘The key concepts and the key relations.’
THE ELUSIVE SEMANTIC GRAIL
By now, the Darmstadt 13 were beginning to realise they were not having much luck with their search for the Semantic Grail. Their discussion became a little tetchy. Someone talked about ‘meta-ontology’ and another growled: ‘The term “meta” is abused.’
‘Outdated technical optimism’ was mentioned.
‘What is your alternative?’ someone else wanted to know.
‘I don’t have one.’
‘Is this the way forward?’
‘What part of the problem would that solve?’
‘We will know when we have tried that out.’
‘Here we are approaching a scary field.’
MESMUSES - Metaphors for Science Museums
MESMUSES is an IST RTD project that will run until July 2003. It aims at designing a general method and supporting tools to produce knowledge maps for use in self-learning environments of science museums. In the project, a knowledge map is defined as a set of related concepts and facts that is offered to learners with some guidance or suggestions on possible itineraries that they may follow to explore the knowledge space.
The method and tools developed in MESMUSES are being tested and validated by two large science museums, the Cité des Sciences et de l’Industrie in Paris and the Istituto e Museo di Storia della Scienza in Florence, which provide access to their digital catalogues. Both museums are developing knowledge maps and itineraries on different themes in Biology (Genome) and Physics (Galileo and the laws of motion).
Project Web site: http://cweb.inria.fr/Projects/Mesmuses
See also M. Meli: Knowledge Management: a new challenge for science museums. In: Cultivate Interactive, Issue 9, 7 February 2003, http://www.cultivate-int.org/issue9/mesmuses
Civility and peace were restored as Frank Nack, the CWI Netherlands scientist, introduced the thought: ‘There are ontologies for art and they are very old and well crafted. There are very clear rules about why they did what they did because they have worked on them for a thousand years. What you could suggest is that we strip down to the basics for one field, say art, and apply it to all the other fields we have in cultural heritage: architecture, film, whatever, all working with very different substances.’
Seamus Ross added: ‘So we need one fundamental ontology on which we can build all the others.’
THE LUCK CHANGES
The luck of the 13 was beginning to change. The Athenian heritage informatics expert, Dr Dallas, described work among his company clients on developing ‘an upper ontology’. All the issues the Forum had been discussing (what to do about time, basic concepts, process, agents and so on) were being examined. They were beginning to develop ‘something very much like a thesaurus’ with term expansion that created sub-categories of relationships. He called it ‘generic layering’, a process that could identify the ‘generic grammar’ of relationships within a specific domain - art history, for example.
‘This is useful,’ he said. ‘This way we can create Web systems that present an association of content for users that is meaningful to them. Let’s say, a “cultural” meaning.’
Nicola Guarino went further. He believed that the International Council of Museums’ Committee for Documentation (CIDOC) Conceptual Reference Model (CRM) was the ‘best starting point’ for the heritage community. The CIDOC CRM is the result of 10 years’ work by a standards working group. The model is under review for adoption as an International Organization for Standardization (ISO) publication.
Mr Guarino was enthusiastic: ‘I am not biased on this. I am a reviewer on the CHIOS project that supports this proposal for an ISO standard and I am amazed by the fact that this standard is more or less principled. On the one hand, the authors really strive to get these principled things and, on the other hand, they have extensively accounted for existing practice. It is the result of a large community of work. It is not perfect, but it really is a starting point for this community.’
He affirmed Dr Ross’s delighted question: ‘So this is an ontology we can borrow?’
And the Italian expert had more to add: ‘The question before was “how can we be sure that the principled things can really solve our problems?” I do not have a crisp answer, but I do have some evidence that even a tiny result on the foundational side has a high pay-off. So you do not need to solve all the foundational issues.’
The distinctions between an object and its role, or individual and classes of items, were delicate but, once understood, could lead to significant data improvement, he said, adding: ‘Take, for instance, the distinction between object and event. This is only one tiny distinction, but it is so fundamental that, once you understand it, you can save time in developing your own application ontology. Tiny conceptual progress does have a high pay-off. This is why I believe it is useful.’
There were still one or two doubts, but CULTOS project co-ordinator Wernher Behrendt tidied up with a daring stance: ‘Let me be a heretic for a second. How many of us have an operating system other than Windows? What I am saying is that standardisation often helps. Even if it is not the best standard, it does help get people working together. It is perfectly fair to define a standard for the world now. There will be a lot of discussion, but it will wind down to a few constructs. It is a better method of getting an ontology accepted than having ontologies mushrooming all over the place that must then be integrated.’
The Darmstadt 13 were pleased. They had a model. They had ‘Web Services’ stepping stones to work across. They weren’t going to fall into the trap of insisting just yet that the Semantic Web was important for the heritage sector, but they wanted an education process for unconvinced curators.
They needed someone to make the first ontology move. Seamus Ross suggested asking the J. Paul Getty Trust (http://www.getty.edu). They needed automated tools for testing ontologies. They did not want to be delivered into the entertainment industry, but saw benefit in what University of Florence Associate Professor Franco Niccolucci called ‘cultural entertainment’.
So where would the experts put their money, asked Mr Behrendt: ‘On there not being any benefits in the Semantic Web for the cultural heritage sector, or there being some benefits in building such things, whatever they may be?’
Amsterdamer Frank Nack was in no doubt: ‘It is going to happen. It will probably look very different from how we imagine it right now, but it is going to happen.’ His countryman, ADLIB chief Bert Degenhart-Drenth, thought so, too: ‘We have put our money there. All our applications work with XML.’ And Dr van Kersen agreed: ‘I would put my money on the Semantic Web.’
That seemed to make it game, set and match.
CIDOC CRM: ‘The Semantic Glue’
The ‘CIDOC object-oriented Conceptual Reference Model’ (CIDOC CRM) was developed by the ICOM/CIDOC Documentation Standards Group. Since September 2000, the CIDOC CRM has been under development as an ISO standard.
‘The CIDOC CRM is intended to promote a shared understanding of cultural heritage information by providing a common and extensible semantic framework to which any cultural heritage information can be mapped. It is intended to be a common language for domain experts and implementers to formulate requirements for information systems and to serve as a guide for good practice in conceptual modelling. In this way, it can provide the semantic glue needed to mediate between different sources of cultural heritage information, such as that published by museums, libraries and archives.’
http://cidoc.ics.forth.gr/what_is_crm.html
CHIOS - Cultural Heritage Interchange Ontology Standardization project
Since June 2001, the work of the CIDOC CRM Special Interest Group has been supported by CHIOS, a two-year project which receives funding from the Fifth Framework IST Programme. The CHIOS consortium forms an integral part of the CIDOC CRM Special Interest Group which, by organising shared meetings, represents the interests and requirements of the cultural heritage community to the ISO Working Group (ISO/TC46/SC4/WG9).
http://cidoc.ics.forth.gr/chios_iso.html
SEMANTIC WEB TERMS AND READING LIST A-X
Compiled by Guntram Geser and Michael Steemson
In the Summary, we have provided information boxes on projects, but not on the many Semantic Web standards, technologies, etc. mentioned in the Forum discussion. The projects included are only a small fraction of many ongoing activities and are related mainly to the cultural and scientific heritage community. Links to many important Semantic Web development projects can be found at their community portal: http://www.semanticweb.org
The following guide points to resources and readings on terms mentioned in the Forum Summary. It is not intended to provide a comprehensive list of Semantic Web materials. Rather, it represents different entry points and levels to this topic.
Illustration: Genesis – The Creation, Stars and Fishes
Annotation & Authoring
‘How to annotate and have fun’, http://annotation.semanticweb.org/faq/faq2, and other papers. See also http://annotation.semanticweb.org/tools. Institute for Applied Informatics and Formal Description Methods, University of Karlsruhe, Germany.
The Annotea Project, a ‘Live Early Adoption and Demonstration (LEAD)’ project of the World Wide Web Consortium (W3C): collaboration environment with shared annotations, www.w3.org/2001/Annotea
CIDOC CRM
The CIDOC Conceptual Reference Model was an important reference point in the Forum discussion (see also the information box on the CIDOC CRM in the Summary). CIDOC, the Comité International pour la Documentation, is part of the International Council of Museums (ICOM). Its Web site can be found at http://cidoc.ics.forth.gr
A report on CIDOC’s work on the CRM is provided in ‘The CIDOC Conceptual Reference Model: A Standard for Communicating Cultural Contents’ by Nick Crofts, Martin Doerr and Tony Gill, in Cultivate Interactive, Issue 9, February 2003, http://www.cultivate-int.org/issue9/chios
See also their tutorials at http://cidoc.ics.forth.gr/tutorials.html
DAML - DARPA Agent Markup Language
The Defense Advanced Research Projects Agency (DARPA) is the central research and development organisation for the US Department of Defense. Its DAML programme is developing a language and tools to facilitate Semantic Web concepts: http://www.daml.org
‘Why Use DAML?’, Adam Pease, white paper, Teknowledge, 10 April 2002, http://www.daml.org/2002/04/why.html
DAML+OIL
DAML+OIL is a product of the DARPA Joint United States/European Union ad hoc Agent Markup Language Committee. The committee created a language with the best features of SHOE, DAML, OIL and several other markup approaches. It is a Web ontology language (latest release: March 2001) expected to provide a basis for future Web standards for ontologies. See http://www.daml.org/2001/03/daml+oil-index.html
OIL - Ontology Inference Layer
OIL is a proposal for a Web-based representation and inference layer for ontologies. It uses a layered approach to defining a standard ontology language. Each layer adds functionality and complexity to the previous layer. This is done such that machines that can only process a lower layer can still partially understand high-level ontologies. See http://www.ontoknowledge.org/oil
A white paper on OIL functions: ‘An informal description of Standard OIL and Instance OIL’ by the On-To-Knowledge group led by Department of Computer Science, University of Manchester, UK, 28 November 2000, www.ontoknowledge.org/oil/downl/oil-whitepaper.pdf
Ontologies
Ontology research: Laboratory for Applied Ontology (LOA), Institute of Cognitive Sciences and Technology (ISTC), http://www.ladseb.pd.cnr.it/infor/ontology/ontology.html
Dieter Fensel: Ontologies: A Silver Bullet for Knowledge Management and Electronic Commerce. New York: Springer, 2001.
‘Ontology Infrastructure for the Semantic Web’ includes detailed subject bibliography. WonderWeb Project, Department of Computer Science, Victoria University of Manchester, UK, http://wonderweb.semanticweb.org/deliverables/documents/D15.pdf
Standard Upper Ontology (SUO): An upper ontology for data interoperability, information search and retrieval, automated inferencing and natural language processing. IEEE Standard Upper Ontology (SUO) Working Group, http://suo.ieee.org
OWL - Web Ontology Language
The Web Ontology Language is a semantic markup language for publishing and sharing ontologies on the World Wide Web. OWL is developed as a vocabulary extension of the Resource Description Framework (RDF) and is derived from the DAML+OIL Web Ontology Language. For the development of this language, see the documents of the Web Ontology (WebOnt) Working Group: http://www.w3.org/2001/sw/WebOnt
See also OWL Web Ontology Language Reference (W3C Working Draft, 31 March 2003) at http://www.w3.org/TR/owl-ref and OWL Web Ontology Language Guide (W3C Working Draft, 31 March 2003) at http://www.w3.org/TR/owl-guide
For an understanding of the goals, requirements and usage scenarios for a Web ontology language, see ‘Web Ontology Language (OWL) Use Cases and Requirements’ (W3C Working Draft, 31 March 2003), http://www.w3.org/TR/webont-req
RDF - Resource Description Framework
See Cultural Heritage Semantic Web Example & Primer, pp. 32-34.
Semantic Web
‘The Semantic Web’, Tim Berners-Lee with James Hendler and Ora Lassila, Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html
‘Enhanced Science and the Semantic Web’, J. A. Hendler, in Science magazine, Volume 299, Number 5606, 24 January 2003, pp. 520-521.
‘Peer-to-Peer: The Infrastructure for the Semantic Web’, Stanford University. The Semantic Web as the next evolutionary step of the Internet, http://p2p.semanticweb.org
The Semantic Web Community Portal, SemanticWeb.org, currently operated by three research groups: the Onto-Agents and Scalable Knowledge Composition (SKC) Research Group at Stanford University, the Ontobroker-Group at the University of Karlsruhe, Germany, and the Protégé Research Group at Stanford University, http://www.semanticweb.org
The W3C Semantic Web Activity Statement explains the consortium’s plans in the areas of enabling standards (driven by the RDF Core and Web Ontology Working Groups), education and outreach (RDF Interest Group), as well as co-ordination and advanced development: http://www.w3.org/2001/sw/Activity
See also Kim Veltman’s warning of what he sees to be too narrow a definition of the Semantic Web, one that will not allow the historical dimension, the richness of cultural expression, the unique and the diversity of interpretations to be adequately dealt with. Cf. K. Veltman: Challenges for a Semantic Web (July 2002), http://www.cultivate-int.org/issue7/semanticweb
SHOE – Simple HTML Ontology Extensions
SHOE was one of the first ontology-based markup languages developed for use on the World Wide Web. It is a small extension to HTML that allows Web page authors to annotate their Web documents with machine-readable knowledge. See http://www.cs.umd.edu/projects/plus/SHOE
SMIL - Multimedia on the Web
The Synchronized Multimedia Integration Language (SMIL, pronounced ‘smile’) enables authoring of interactive audiovisual presentations. SMIL is typically used for ‘rich media’/multimedia presentations which integrate streaming audio and video with images, text or any other media type. SMIL is an HTML-like language and may be written using a simple text editor. W3C Synchronized Multimedia: http://www.w3.org/AudioVideo
SPECTRUM-XML DTD
SPECTRUM (Standard Procedures for Collections Recording Used in Museums) was created by the mda (http://www.mda.org.uk). It is a guide to good practice for museum documentation that describes procedures for documenting objects and the processes they undergo, as well as the necessary information that needs to be recorded to support the procedures. For SPECTRUM, an XML Document Type Definition has been produced, which serves as a system-neutral interchange format for museum data.
‘SPECTRUM: The UK Museum Documentation Standard’ is available in its second edition; see http://www.mda.org.uk/spectrum.htm
For a description of the creation, structure and deployment of the SPECTRUM-XML DTD, see Bert Degenhart-Drenth: Building on the mda SPECTRUM-XML DTD for Collections Management Data Interchange, Museums and the Web Conference 2001, http://www.archimuse.com/mw2001/papers/degenhart/degenhart.html
Web Services
Although the concept of Web Services is not a new one, the definition of Web Services standards allows the wider use of these services. Currently, three main protocols are used in the context of Web Services: UDDI (Universal Description, Discovery and Integration), a registry system to find resources and Web Services; WSDL (Web Service Description Language), an interface description language; and SOAP (Simple Object Access Protocol), the communication protocol for Web Services. All three protocols are based on XML.
For a description of the many ways in which XML can enhance Web Services, see http://www.w3.org/2002/ws/Activity
World Wide Web
Looking back as well as into the future: ‘Weaving the Web: The Original Design and Ultimate Destiny of the World Wide Web by Its Inventor’, Tim Berners-Lee with Mark Fischetti, Harper San Francisco, 1999.
XML – eXtensible Markup Language
See Cultural Heritage Semantic Web Example & Primer, pp. 27-30.
For the work done at W3C within the XML activity, see XML Working Groups, http://www.w3.org/XML; the XML Specifications can be found at http://www.w3.org/XML/Core
‘The bane of my existence is doing things that I know the computer could do for me.’ – Dan Connolly, The XML Revolution, October 1998, http://www.nature.com/nature/webmatters/xml
SEMANTIC WEB SHOULD BE BASED ON WELL-FOUNDED ONTOLOGIES
AN INTERVIEW WITH NICOLA GUARINO, LABORATORY OF APPLIED ONTOLOGY, ITALY
By Joost van Kasteren
‘The Semantic Web, as it is advocated by people like Tim Berners-Lee and James Hendler, does not take enough advantage of the experience built up in knowledge engineering and conceptual modelling. There is this anarchistic idea of the Web as a place where everyone can do his or her own thing. I have no problem with that; a lot of people are able to find what they want on the Web. But if you want real interoperability, with search engines that can grasp the intended meaning of information, that approach falls short. To create a real Semantic Web, we have to develop and use well-founded generic ontologies based on linguistics and logics.’
Nicola Guarino has clear views on the Semantic Web and its development. He is a senior researcher at the Institute for Cognitive Sciences and Technologies in Italy, where he leads the Laboratory for Applied Ontology. Since 1991, he has played an active role in the Artificial Intelligence community in promoting the interdisciplinary study of ontological foundations of knowledge engineering and conceptual modelling. Guarino: ‘In our Laboratory, the focus is on content and not so much on representation. The use of ontologies is unavoidable when referring to content. People do it implicitly all the time when they are communicating and trying to understand each other. If we want machines to understand each other, in other words real interoperability, we need to make these ontologies explicit in an unambiguous way.’
An ontology is a hierarchical description of the relations between concepts in a certain domain, plus an unambiguous description of the concepts themselves. As they are created for a certain domain, ontologies often fail to be interoperable because of the ambiguity that results from the use of the same terms for different concepts (and vice versa) between different domains. The term ‘net’, for instance, has quite a different meaning for Web designers and fishermen. That is why there is a need for well-founded generic ontologies. An example of a term that a generic ontology has to pin down is ‘part’, which can have different meanings both within a domain and between domains. For instance, the violist plays a part in the orchestra. His finger is part of him. Can his finger be part of the orchestra? According to Guarino, this is a genuine ontological problem that can only be solved by giving an unambiguous meaning to the term ‘part’.
Another example cited by Guarino is the term ‘in’. What exactly are you describing when you say the spoon is in the cup? Does it mean that the spoon is totally embedded in the cup, or is it only partly in the cup? Guarino: ‘These examples seem trivial, but if you want real interoperability between different knowledge domains, you will have to prevent the problems that come with the ambiguity of day-to-day language.’
In this respect, Guarino thinks it is a drawback that computer science curricula scarcely ever contain an introduction to ontological foundations of conceptual modelling: ‘Students learn all about Java, HTML and C++, and name all the other languages, and they also learn how to use these. But when they graduate, they hardly know a thing about formal ontology. I really think people should know more about the work on ontology that has been done in philosophy. It is certainly not much harder to acquire than, say, studying differential equations or learning how to use Java.’
It seems as if it is an enormous job to develop well-founded generic ontologies, but it is not as enormous a task as it appears. Guarino: ‘I would say that a few dozen would get you on the way nicely. But you have to take the fundamental route. At the moment, development of the Semantic Web is driven by the need for short-term results. Hence, interoperability is realised by putting the right tags on the information. That is not what I call semantics; that is syntax. XML and RDF are very useful for this, but they fall short when you want to create a real Semantic Web.’
Laboratory of Applied Ontology: http://ontology.ip.rm.cnr.it
A CULTURAL HERITAGE SEMANTIC WEB EXAMPLE & PRIMER
By Guntram Geser
The Semantic Web, according to a statement of its well-known advocate Tim Berners-Lee, is a vision of ‘a distributed machine which should function so as to perform socially useful tasks’.¹ This machine should allow intelligent software agents to understand semantic relationships between Web resources in order to seek relevant information and perform transactions for humans.
Contrasted with the existing human-readable Web, the Semantic Web is envisaged as a Web of machine-readable data that will be based on ‘languages for expressing information in a machine processable form’.² Key to an understanding of the Semantic Web, therefore, is how these languages function, how information is expressed in order that computers can automatically process Web sources and assist in making the Web more useful for humans. The aim of this chapter is to provide an overview of the Semantic Web concept by describing its general architecture, i.e. the interplay of its languages.
The chapter has two interrelated parts. Part 1 describes a Finnish project that strives to build the foundations for the “Finnish Museums on the Semantic Web” (FMS), a future semantic museum portal. This part consists of the information boxes on the following pages, which briefly describe the necessary elements and steps in the set-up of the FMS system. It is recommended to start by reading this description (see also graphic 3 on page 36, which provides an overview of the set-up of the FMS system). It should be helpful in gaining a general understanding of how semantic interoperability of, and new ways of interacting with, semantically marked-up cultural heritage information can be realised.
Part 2, the texts below the information boxes, is a primer that explains terms used in part 1, which represent core elements of the Semantic Web architecture, as well as providing illustrative examples. The explanations are not intended to give in-depth definitions of these elements; such definitions are provided in the relevant W3C specifications. The examples have been kept as simple as possible, but build on each other. In this way, we will develop a (fictitious) Website, http://www.m-i.org, that provides semantically enhanced access to such marvellous medieval images as the ones we have used to illustrate this Thematic Issue.
How to Make Collection Metadata of Museums Semantically Interoperable on the Web – The “Finnish Museums on the Semantic Web” (FMS)
The Semantic Web concept is visionary, and there are dedicated people, also in the heritage sector, who are trying to make it a reality. In our example, a group of researchers and technology developers who work at the University of Helsinki and the Helsinki Institute for Information Technology are translating the Semantic Web vision for a future semantic museum portal.
The group’s two-year project will run until spring 2004 and is being carried out in co-operation with, and with funding from, major organisations including Espoo City Museum, Helsinki University Museum, National Board of Antiquities, Nokia, TietoEnator and the National Technology Agency (TEKES).
The major goals of the project are to make collection metadata, which stem from heterogeneous databases, semantically interoperable on the Web, and to provide facilities for semantic browsing and searching in the combined knowledge base of the participating museums.
The project’s vision is called “Finnish Museums on the Semantic Web” (FMS), and its architecture allows for all Finnish museums to join in. However, in an approach of starting small but ambitious, the project is at present using the collection databases of two museums, the Espoo City Museum and the National Museum of Finland. Furthermore, the implementation is currently restricted to only one part of the collections - textiles.
In order to reach the FMS system’s goal of making the museums’ metadata semantically interoperable on the Web, the data must be harmonised on the syntactic and semantic level. For this harmonisation, the eXtensible Markup Language (XML) and the Resource Description Framework (RDF) are being used, of which RDF is the key language for achieving semantic interoperability of the heterogeneous sets of metadata.
1 Tim Berners-Lee: Interpretation and Semantics on the Semantic Web (1998), http://www.w3.org/DesignIssues/Interpretation.html
2 Tim Berners-Lee: Semantic Web Road Map (1998), http://www.w3.org/DesignIssues/Semantic.html
3 Robert DuCharme: http://lists.xml.org/archives/xml-dev/200211/msg00190.html
RDF and Metadata - A Natural Fit
An observer of the diffusion of the Resource Description Framework (RDF) into various domains, Robert DuCharme, has commented: ‘I still find it a little ironic that, while RDF has gotten so much publicity as a technology for warm and fuzzy AI (Artificial Intelligence) pie-in-the-sky technology, it’s gotten most of its traction in the mundane world of metadata.’³
Yet, given the importance of metadata for the Semantic Web vision in general, it does not come as a surprise that metadata of key information communities belong to the first of RDF’s intended uses. RDF seems to gain momentum in particular among the library and other communities that use Dublin Core.
The current W3C RDF Primer (Working Draft, 23 January 2003), edited by Frank Manola and Eric Miller, labels RDF as ‘an ideal representation for Dublin Core information’ and describes Dublin Core as one of their ‘RDF in the field’ examples (cf. http://www.w3.org/TR/rdf-primer).
At the Dublin Core Metadata Initiative (DCMI), ‘Expressing Simple Dublin Core in RDF/XML’ was announced as a DCMI Recommendation in October 2002, ‘the first in a series of recommendations for encoding Dublin Core metadata using mainstream Web technologies’, i.e. XML/RDF/XHTML. ‘Expressing Qualified Dublin Core in RDF/XML’ is currently a Proposed Recommendation. Cf. http://dublincore.org/groups/architecture
Syntactic Transformation 1: Creating the XML Documents
In the FMS system, the eXtensible Markup Language (XML) is used as the data transfer format. This transfer format enables the system to make use of the data originally stored in the museums’ heterogeneous collection databases. Therefore, each museum participating in the FMS initiative provides the relevant collection data as an XML document repository.
In a process of syntactic harmonisation, the data from a museum’s collection database are retrieved and transformed to an XML format conforming to the XML Schema of the FMS initiative.
The data to be published are read from the database through a ‘view’, which helps create the XML format. The view is a queryable interface, a virtual table that results from an SQL query which may join multiple tables of the database. Through the view, the data are queried so that the rows of the tables are grouped by collection items. For each item, the set of rows is combined into a single XML document.
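The view-then-group step described in this box can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the FMS project's actual implementation: the two-table schema (`item`, `item_field`), the view name `item_view`, the function `rows_to_documents` and the un-namespaced element names are all hypothetical; only the sample values are taken from the example image used in this chapter.

```python
import sqlite3
import xml.etree.ElementTree as ET
from itertools import groupby

# Hypothetical schema: the text does not describe the museums' real databases.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE item (item_id TEXT PRIMARY KEY, type TEXT);
CREATE TABLE item_field (item_id TEXT, name TEXT, value TEXT);
INSERT INTO item VALUES ('image5kb78d38i', 'Column Miniature');
INSERT INTO item_field VALUES ('image5kb78d38i', 'creator', 'Alexander Master');
INSERT INTO item_field VALUES ('image5kb78d38i', 'place', 'Utrecht');
-- The view is a virtual table produced by an SQL query that joins both tables.
CREATE VIEW item_view AS
  SELECT i.item_id, i.type, f.name, f.value
  FROM item i JOIN item_field f ON f.item_id = i.item_id;
""")


def rows_to_documents(connection):
    """Read the view, group its rows by item and build one XML document per item."""
    rows = connection.execute(
        "SELECT item_id, type, name, value FROM item_view ORDER BY item_id, name"
    ).fetchall()
    documents = {}
    # groupby needs rows sorted by the grouping key, hence the ORDER BY above.
    for item_id, group in groupby(rows, key=lambda row: row[0]):
        group = list(group)
        root = ET.Element("image", {"image_id": item_id})
        ET.SubElement(root, "type").text = group[0][1]
        for _, _, name, value in group:
            ET.SubElement(root, name).text = value
        documents[item_id] = ET.tostring(root, encoding="unicode")
    return documents


docs = rows_to_documents(conn)
print(docs["image5kb78d38i"])
```

A real transformation would additionally validate each document against the project's XML Schema, a step that is left out of this sketch.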
XML
XML is a markup language for describing data. It is a language created to allow anyone to design the structure of their own documents. An XML document contains text that consists of markup in the form of tags and plain text between them, the latter being just pure information (for example, <creator>Alexander Master</creator>). XML tags are not predefined; everyone can define his or her own tags.
FMS Documents
A short presentation is provided in Eero Hyvönen et al.: Cultural Semantic Interoperability on the Web: Case Finnish Museums Online, http://iswc2002.semanticweb.org/posters/hyvonen_a4.pdf; for detailed descriptions, see Vilho Raatikka, Eero Hyvönen: Ontology-based Semantic Metadata Validation, and Hyvönen, Eero et al.: Semantic Interoperability on the Web: Case Finnish Museums Online. Both texts can be found in: Towards the Semantic Web and Web Services, Proceedings of the XML Finland 2002 Conference, http://www.cs.helsinki.fi/u/eahyvone/xmlfinland2002/Proceedings/XML2002-final.pdf
ITEM_VIEW (columns): item_id, type, subject, iconclass, creator, manuscript, place, year
XML shares the syntax and bracketed tags of the well-known HyperText Markup Language (HTML), but XML serves a different goal. While HTML is used to define the layout of pages on the WWW, XML is used to define the content of documents, for example to specify that an area of text is the name of a creator.
XML allows for creating markup (e.g. <creator>) that seems to carry some semantics. However, for a computer, a tag like <creator> carries as much semantics as a tag like <H1>. A computer simply does not know what a creator is and how the concept creator is related to other concepts (e.g. manuscript). For an XML processor, <H1> and <creator> or <manuscript> are all equally (and totally) meaningless. XML is all about describing data; on its own, it does not do anything. There needs to be a processing program that uses the markup to interpret the individual elements.
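A short Python sketch may help to show what “a processing program that uses the markup” means in practice. It is an illustration only: the `LABELS` mapping and the `describe` function are hypothetical names introduced here, and the tiny document is cut down from the example used in this chapter. The parser hands back tag names as plain strings; any meaning is supplied by the program that reads them.

```python
import xml.etree.ElementTree as ET

# A cut-down document; the parser treats both tag names as opaque strings.
doc = ET.fromstring(
    "<image>"
    "<creator>Alexander Master</creator>"
    "<manuscript>Historic Bible</manuscript>"
    "</image>"
)

# All the XML processor can report is the names and the text between the tags.
tags = [child.tag for child in doc]

# The 'semantics' live in the application: a hypothetical mapping from tag
# name to a human-readable label chosen by the program's author.
LABELS = {"creator": "Created by", "manuscript": "Found in manuscript"}


def describe(element):
    """Turn each child element into a labelled line; unknown tags keep their raw name."""
    return [f"{LABELS.get(child.tag, child.tag)}: {child.text}" for child in element]


print(tags)
print(describe(doc))
```

Renaming <creator> to <H1> in both the document and the mapping would give the same result, which is exactly the sense in which the tags themselves carry no meaning for the computer.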
The graphic below illustrates the database rows to XML process as described in the info box. It provides a very simple example of an XML document that describes some data for one of the medieval column miniatures from the Koninklijke Bibliotheek, The Hague, which we were permitted to use for illustrating this Thematic Issue. It includes the Iconclass classification for this image: 71A3421, Eve emerges from Adam’s body (for the hierarchical path of this classification, see the section on ontologies).
Short explanations for the XML document (image5kb78d38i.xml) shown in the graphic below:
A well-formed XML document is one that conforms to the XML syntax rules, of which we would like to highlight the following:
(1) The document should begin with the XML declaration, which defines the XML version and the character encoding used in the document. In the example below we use <?xml version="1.0" encoding="ISO-8859-1"?>, i.e. the document conforms to the 1.0 specification of XML and uses the ISO-8859-1 (Latin-1/West European) character set.
(2, 10 a) The XML document must contain a single tag pair to define a root element, in our example <mi:image> </mi:image>.
Graphic 1: Database rows to XML process

Database rows (item data from database rows, grouped by item):
  'image5kb78d38i', 'Column Miniature', 'Eve emerges from Adam's body', '71A3421', 'Alexander Master', 'Historic Bible', 'Utrecht', 'circa 1430'

Rows-to-XML process

XML document (image5kb78d38i.xml):
(1)  <?xml version="1.0" encoding="ISO-8859-1"?>
(2)  <mi:image xmlns:mi="http://www.m-i.org/images"
         image_id="image5kb78d38i">
(3)    <mi:type>Column Miniature</mi:type>
(4)    <mi:subject>Eve emerges from Adam's body</mi:subject>
(5)    <mi:iconclass>71A3421</mi:iconclass>
(6)    <mi:creator>Alexander Master</mi:creator>
(7)    <mi:manuscript>Historic Bible</mi:manuscript>
(8)    <mi:place>Utrecht</mi:place>
(9)    <mi:year>circa 1430</mi:year>
(10) </mi:image>
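The rows-to-XML step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the tool used by any museum: the helper name row_to_xml is hypothetical, and the namespace URI and field names are simply those of graphic 1.

```python
import xml.etree.ElementTree as ET

# Namespace URI and field names follow graphic 1; row_to_xml is a
# hypothetical helper written for this illustration only.
MI_NS = "http://www.m-i.org/images"
FIELDS = ["type", "subject", "iconclass", "creator", "manuscript", "place", "year"]


def row_to_xml(row):
    """Turn one database row (item_id followed by the field values) into XML text."""
    item_id, *values = row
    ET.register_namespace("mi", MI_NS)  # serialise with the mi: prefix
    root = ET.Element(f"{{{MI_NS}}}image", {"image_id": item_id})
    for name, value in zip(FIELDS, values):
        child = ET.SubElement(root, f"{{{MI_NS}}}{name}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


row = ("image5kb78d38i", "Column Miniature", "Eve emerges from Adam's body",
       "71A3421", "Alexander Master", "Historic Bible", "Utrecht", "circa 1430")
xml_text = row_to_xml(row)
print(xml_text)
```

The output is a single mi:image element with the seven child elements in the order of the database columns; the XML declaration and pretty-printing are left out to keep the sketch short.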
(2b) Namespace: Since element names in XML are not fixed, name conflicts can occur when different documents use the same names describing different types of elements. To prevent such conflicts, a unique namespace should be defined using a Uniform Resource Identifier (URI). An XML namespace is a collection of names that are used as element types and attribute names (cf. http://www.w3.org/TR/REC-xml-names). Our namespace declaration in the start tag of the root element is xmlns:mi="http://www.m-i.org/images".
The namespace prefix mi (for medieval images) functions as a placeholder for the namespace name. It needs to show up in all element tags (e.g. <mi:type> ... </mi:type>).
(2c) image_id="image5kb78d38i": This is the id attribute, which contains the unique identifier of the data source record.
(3-9) All other elements must be within the root element and can themselves have sub-elements (child elements), which must be properly nested within their parent element. Our elements do not have sub-elements.
Other syntax rules are, for example: all start tags must match end tags; because XML tags are case sensitive (i.e. the tag <mi:Creator> is different from the tag <mi:creator>), they must also be written with the same case; all elements must have a closing tag; all attribute values must be within quotation marks (e.g. "image5kb78d38i").
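The syntax rules listed above can be tried out with any XML parser. The following minimal sketch uses the parser from Python's standard library; the helper is_well_formed and the three deliberately broken documents are our own illustrations.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the XML parser accepts the text, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


NS = 'xmlns:mi="http://www.m-i.org/images"'
good = (f'<mi:image {NS} image_id="image5kb78d38i">'
        '<mi:creator>Alexander Master</mi:creator></mi:image>')
# Start and end tag differ in case, so they do not match.
bad_case = f'<mi:image {NS}><mi:Creator>Alexander Master</mi:creator></mi:image>'
# The attribute value is not within quotation marks.
bad_quotes = f'<mi:image {NS} image_id=image5kb78d38i></mi:image>'
# The mi:creator element has no closing tag.
bad_unclosed = f'<mi:image {NS}><mi:creator>Alexander Master</mi:image>'

for label, doc in [("good", good), ("case mismatch", bad_case),
                   ("unquoted attribute", bad_quotes),
                   ("missing end tag", bad_unclosed)]:
    print(label, is_well_formed(doc))
```

Only the first document is accepted; each of the others violates one of the rules and is rejected by the parser before any application code sees the data.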
Syntactic Transformation 2: The XML Schema
In order to allow for syntactic harmonisation, the XML documents of the museums should conform to the XML Schema of the FMS initiative. Therefore the museums use the initiative's XML Schema when they create their XML documents, for validating them against the Schema. If the documents are valid, the process can continue to the semantic level.

XML Schema
The XML Schema defines the building blocks of an XML document, including:
| elements and attributes that can appear in a document
| which elements are child elements, as well as their order and number
| whether an element is empty or can include text
| the data types for elements and attributes
| as well as default and fixed values for elements and attributes
XML with an XML Schema is designed to be self-descriptive. One of the greatest strengths of XML Schema is that it allows for data typing. The most common data types are xs:string, xs:decimal, xs:integer, xs:boolean, xs:date, xs:time. In the example below, which is the XML Schema for the XML document (image5kb78d38i.xml) shown in graphic 1, we only use the data type xs:string. This data type is used for values that contain character strings.
XML Schema for the XML document image5kb78d38i.xml:
(1)  <?xml version="1.0"?>
(2a) <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
(2b)     targetNamespace="http://www.m-i.org/images"
(2c)     xmlns="http://www.m-i.org/images"
(2d)     elementFormDefault="qualified">
(3)    <xs:element name="image">
(4)      <xs:complexType>
(5)        <xs:sequence>
(6)          <xs:element name="type" type="xs:string"/>
(7)          <xs:element name="subject" type="xs:string"/>
(8)          <xs:element name="iconclass" type="xs:string"/>
(9)          <xs:element name="creator" type="xs:string"/>
(10)         <xs:element name="manuscript" type="xs:string"/>
(11)         <xs:element name="place" type="xs:string"/>
(12)         <xs:element name="year" type="xs:string"/>
(13)       </xs:sequence>
(14)       <xs:attribute name="image_id" type="xs:string"
               use="required"/>
(15)     </xs:complexType>
(16)   </xs:element>
(17) </xs:schema>

Short explanations:
(1) The XML declaration, which states that the document conforms to the 1.0 specification of XML.
(2a) Determines that the elements and data types that are used to construct the schema come from the W3C's XML Schema namespace. Consequently, each of the elements and data types in the schema has the prefix xs:, which identifies them as belonging to the vocabulary of the XML Schema language rather than the vocabulary of our (fictitious) organisation M-i.org.
(2b) Indicates that the elements defined by this schema come from our http://www.m-i.org/images namespace.
(2c) Is our default namespace.
(2d) Demands that any elements used by an XML document which were declared in this schema must be namespace qualified.
(3/16) The parent element for the data typing of the image descriptions we provide at http://www.m-i.org/images. It is (4) defined as an xs:complexType, i.e. it contains child elements (6-12), which are (5/13) surrounded by an xs:sequence element that defines an ordered sequence of these elements.
(6-12) The child elements, which in our example are simple types because they do not contain other elements. They define various elements of our XML documents, e.g. "type", to be of the data type xs:string.
(14) Furthermore, the Schema determines that for the element xs:element name="image" there is a required attribute image_id of the data type xs:string (for example image5kb78d38i).
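Validating a document against an XML Schema requires a schema-aware processor, which Python's standard library does not include. The sketch below is therefore not a schema validator: it hand-codes three of the constraints explained above (a namespace-qualified root element, the required image_id attribute, and the ordered sequence of child elements) simply to show what such a check amounts to. The names check_image and make_doc are hypothetical.

```python
import xml.etree.ElementTree as ET

MI_NS = "http://www.m-i.org/images"
# Child elements in the order required by the xs:sequence in the schema.
EXPECTED_SEQUENCE = ["type", "subject", "iconclass", "creator",
                     "manuscript", "place", "year"]


def check_image(xml_text):
    """Return a list of constraint violations (an empty list means none found)."""
    root = ET.fromstring(xml_text)
    problems = []
    if root.tag != f"{{{MI_NS}}}image":
        problems.append("root element is not a namespace-qualified image")
    if root.get("image_id") is None:
        problems.append("required attribute image_id is missing")
    found = [child.tag for child in root]
    wanted = [f"{{{MI_NS}}}{name}" for name in EXPECTED_SEQUENCE]
    if found != wanted:
        problems.append("child elements do not match the declared sequence")
    return problems


def make_doc(names, with_id=True):
    """Build a small test document with the given child element names."""
    attr = ' image_id="image5kb78d38i"' if with_id else ""
    body = "".join(f"<mi:{n}>x</mi:{n}>" for n in names)
    return f'<mi:image xmlns:mi="{MI_NS}"{attr}>{body}</mi:image>'


print(check_image(make_doc(EXPECTED_SEQUENCE)))
print(check_image(make_doc(list(reversed(EXPECTED_SEQUENCE)))))
print(check_image(make_doc(EXPECTED_SEQUENCE, with_id=False)))
```

In practice one would hand both the schema and the document to a validating XML processor instead of writing such checks by hand; the point of the schema is precisely that the rules are declared once and enforced by generic software.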
Major Benefits of XML
In the context of the Semantic Web, XML provides an interoperable syntactical foundation upon which solutions to the issues of representing relationships and meaning can be built. We also want to highlight the many benefits of XML that are adding to its rapid uptake in the first place and might, in the longer term, be supportive in realising the Semantic Web vision on a broader scale.
XML is one of the most important standards developments in recent years. It is an international, universal, non-system and non-application specific data exchange standard. XML is international because it employs Unicode. This means that there is no restriction to the western alphabet, but Arabic, Chinese, Greek, Hebrew, Thai, etc. can be easily integrated.
XML is non-system specific because it is an open standard set by the World Wide Web Consortium (W3C). As such, there is no owner of XML. All the major software suppliers support it; it can be used on any computing platform, from Windows and MacOS to Linux. This makes it easier for organisations to change systems or combine different systems.
XML is also non-application specific, i.e. it can be used in various applications such as data exchange, data harvesting, Web site management, etc. XML is gaining ever-wider acceptance in many application domains, including in particular the cultural heritage community.
Bear in mind also that the major collection management software producers have implemented support for XML in their systems, enabling, for example, the integration of data from different collections and their combination over the Web.
XML allows for multi-channel publishing, i.e. with XML it is easy to produce different products or services from digital cultural heritage assets. Once the data are structured in XML, they can be displayed across a variety of media using an associated style sheet that contains the display information.
Finally, XML can be used to create new languages. For example, the Wireless Markup Language (WML), which is used to mark up Internet applications for handheld devices, is written in XML.
Ontological concepts
The goal of the FMS project is to make metadata of the museums' textiles collections semantically interoperable on the Web. In order to achieve such interoperability, an ontology is being designed that describes the common (lower-level) ontological concepts in this domain of knowledge.4

Ontology
In the Semantic Web architecture the semantic relationships are not embedded but explicitly represented by an ontology, or rather an interrelated set of ontologies. In fact, the wide array of information residing on the Web and the perceived need to make it more machine-processable have acted as a strong impetus for the development of ontology languages.
Yet what is an ontology? An abstract definition of an ontology is that it describes a formal, shared conceptualisation of a particular domain of interest, for example cultural heritage objects held in art museums. In particular, an ontology allows for constraining, expressing and analysing the intended meaning of the shared vocabulary of concepts and relations in a domain of knowledge.5
If these concepts and relations are formalised to a high degree, the domain has at hand a major building block for developing semantically aware information systems. With Semantic Web technologies the domain ontology can be made available on the network, cross-referenced with upper-level and other domain ontologies,6 and remote applications or intelligent software agents can refer to it when they interact to provide a certain information service.

4 The available documents (in English) on the FMS initiative state that their ontology is being created using RDF Schema (RDFS). To develop a fully fledged ontology, advanced languages such as DAML+OIL or Web Ontology Language (OWL) would be required.
5 For more elaborate and formal descriptions see Tom Gruber: What is an Ontology? (1995), http://www-ksl.stanford.edu/kst/what-is-an-ontology.html; Nicola Guarino: Ontology-Driven Conceptual Modelling, part 1-3 (2002), http://ontology.ip.rm.cnr.it/Tutorials
The degree of formalisation of concepts and their relations varies considerably between different domains of knowledge. At the lower end one finds lexicons and simple taxonomies (i.e. an ordered classification system where terms are related hierarchically). At the middle level one might place thesauri, i.e. controlled vocabularies that are structured to show relationships between terms and concepts and, for example, allow for retrieving them from a database. At the high end of formalisation of knowledge there are axiomatised logic theories. Such theories include rules to ensure the well-formedness and logical validity of statements expressed in the language of the scientific discipline.
In the cultural heritage sector a powerful IT-supported example of a hierarchical classification system is Iconclass. This supports the documentation of images, in particular art historical images, by providing a systematic collection of 28,000 ready-made definitions of objects, persons, events, situations and abstract ideas that can be the subject of an image. The definitions consist of an alphanumeric classification code and its textual correlate.7
For example, for the image we have described in XML in graphic 1, the Iconclass definition is 71A3421 Eve emerges from Adam's body. The Medieval Illustrated Manuscripts Website of the Koninklijke Bibliotheek, The Hague, has an Iconclass Browser in place8 that provides the hierarchical path for this concept in the classification system.
An outstanding example of a controlled vocabulary is the Art & Architecture Thesaurus (AAT), one of the Getty Research Institute's Vocabulary Databases. It is a structured vocabulary of more than 125,000 terms and other information about concepts that are used for describing fine art, architecture, decorative arts, archival materials and material culture.9
The W3C's Resource Description Framework (RDF) Schema Specification 1.0, in a section on its scope, mentions concept navigation and states: 'Thesauri and library classification schemes are well known examples of hierarchical systems for representing subject taxonomies in terms of the relationships between named concepts. The RDF Schema specification provides sufficient resources for creating RDF models that represent the logical structure of thesauri (and other library classification systems).'10
Yet for realising a full-blown cultural heritage ontology for the Semantic Web there are currently limitations on both sides. On the one hand, hierarchical classification systems and structured vocabularies do not lend themselves easily to rich inter-linking of conceptual 'trees'.
A major step further in this direction is the CIDOC object-oriented Conceptual Reference Model (CRM).11 This provides an ontology of 81 classes and 130 properties which describes, in a formal language, concepts and relations relevant to the documentation of cultural heritage.
On the other hand, RDF Schema has limitations when it comes to expressing complex ontological relationships. New languages based on description logics are being developed. These include DAML+OIL and the upcoming Web Ontology Language (OWL), which are capable of fully describing ontologies.
Also worth highlighting is that tools for ontology building are proliferating at the present time. These ontology editors need to be carefully assessed, as their capabilities differ considerably.12
Hierarchical path of the Iconclass classification 71A3421:
7 Bible
71 Old Testament
71A Genesis from the creation to the expulsion from paradise, and later years of Adam and Eve
71A3 creation of man; the Garden of Eden (Genesis 1:26-2)
71A34 creation of Eve
71A342 Eve is fashioned from Adam's rib
71A3421 Eve emerges from Adam's body
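In the path listed above, every level is a prefix of the next, so for a simple notation such as 71A3421 the ancestors can be derived from the code itself. The following sketch illustrates this. It is an assumption that covers only plain notations of this kind; Iconclass notations may contain other constructs that the sketch does not handle, and the labels are simply copied from the list above rather than looked up in the Iconclass system.

```python
# Labels copied from the hierarchical path shown in this section; a real
# application would retrieve them from the Iconclass system itself.
LABELS = {
    "7": "Bible",
    "71": "Old Testament",
    "71A": "Genesis from the creation to the expulsion from paradise, "
           "and later years of Adam and Eve",
    "71A3": "creation of man; the Garden of Eden",
    "71A34": "creation of Eve",
    "71A342": "Eve is fashioned from Adam's rib",
    "71A3421": "Eve emerges from Adam's body",
}


def hierarchical_path(code):
    """List all prefixes of a simple notation, broadest first, ending with the code."""
    return [code[:length] for length in range(1, len(code) + 1)]


for notation in hierarchical_path("71A3421"):
    print(notation, LABELS.get(notation, "(label not listed here)"))
```

This is what makes a hierarchical classification useful for retrieval: a search for the broader concept 71A34 (creation of Eve) can find every image whose code starts with those characters.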
6 Upper-level ontologies describe the basic concepts and relationships invoked when information about any domain is expressed in natural language.
7 For in-depth information see the official Iconclass Website: http://www.iconclass.nl
8 Medieval Illustrated Manuscripts Website: http://www.kb.nl/kb/manuscripts/browser. The subject access system for the Website was conceived by Mnemosyne Partners, building on the Iconclass classification system and technologies. See the valuable information they provide at http://www.mnemosyne.org/business/msstempex.html
9 A detailed description of the AAT is provided at http://www.getty.edu/research/tools/vocabulary/aat/about.html
10 http://www.w3.org/TR/2000/CR-rdf-schema-20000327
11 See the information box at the end of the Forum discussion and the sources mentioned in the Semantic Web Terms and Reading List.
12 Michael Denny: Ontology Building: A Survey of Editing Tools (06-11-2002), http://www.xml.com/pub/a/2002/11/06/ontologies.html
Semantic Transformation 1: The RDF Data Model
Progressing towards semantic interoperability, the metadata in the XML documents are now transformed into RDF statements corresponding to the RDF data model. With these so-called RDF 'triples', the XML metadata elements are mapped to the RDF classes and properties which are defined by the RDF Schema of the FMS initiative (see section 3).

'XML is nothing more than a way to standardize data formats. This is not to underplay XML's importance. A data-format standard makes all of the more glamorous technologies possible, and RDF is the leading example of the benefit that comes once the data format has been standardized. Many proclaim that RDF is really the XML's killer app, and with good reason. Despite all this, RDF remains somewhat obscure. This is mainly because at its core RDF is very abstract, very dry and very academic.'
Uche Ogbuji: An introduction to RDF (2000), http://www-106.ibm.com/developerworks/library/w-rdf/?dwzone=xml
RDF Data Model
In order to make Web resources semantically interoperable, we need resources that provide machine-understandable information about themselves. In the Semantic Web architecture these statements are built by using the Resource Description Framework (RDF).
RDF defines a data model for the statements, describing typed relationships between uniquely identified resources. RDF distinguishes between:
| resources: familiar examples are a Web page, electronic document or digital image, but in RDF also entities that are not 'network retrievable', e.g. museums, curators or bound medieval manuscripts, can be resources
| properties: these identify a specific aspect, characteristic, attribute or relation used to describe the resource
| statements: these associate a value for a named property with the resource
Hence RDF provides a model for describing relationships between resources in terms of named properties and values. The RDF data model intrinsically supports only binary relations. Its base element is the 'triple', which takes the form of subject, predicate, object: a resource (the subject) is linked to another resource (the object) through an arc labelled with a third resource (the predicate). The semantics of a triple clearly depends on the property used as predicate.
A convenient way to visualise this is to draw nodes for subject and object and an arrow between them for the predicate (see graphic 2). In this labelled directed graph, subject and predicate (property) are Uniform Resource Identifiers (URIs), and the object is either a URI or a literal (which is drawn as a box). Everything in RDF can be represented by a graph with nodes and arcs, and the data model allows for using the same URI as a node and as an arc label.
To represent RDF statements in a machine-processable way, RDF builds on XML. With RDF/XML, a specific XML markup language, RDF information can be represented and exchanged between machines.
Graphic 2: RDF Data Model

Triple 1
  Subject:   http://www.m-i.org/schemas/images#ColumnMiniature
  Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
  Object:    http://www.m-i.org/schemas/images#Miniature
Triple 2
  Subject:   http://www.m-i.org/schemas/images#Miniature
  Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
  Object:    http://www.m-i.org/schemas/images#Image

With these two triples we state that ColumnMiniature is a subclass of Miniature and that Miniature is a subclass of Image.
The predicate in our statements is the rdfs:subClassOf property, which is predefined in the RDF Schema namespace http://www.w3.org/2000/01/rdf-schema#
The classes Image, Miniature and ColumnMiniature would also need to be defined in our RDF Schema namespace http://www.m-i.org/schemas/images
For details on how to use RDF Schema for defining your domain ontology see section 3. In a nutshell, RDF Schema is a 'higher-level' language which is itself defined using RDF.
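For readers who think more easily in data structures than in diagrams, the two triples of graphic 2 can be written down as follows. This is only a minimal sketch of the data model, not an RDF library: the Triple type and the helper objects_of are hypothetical names, and joining our example namespace to the class names with '#' is an assumption.

```python
from typing import NamedTuple

RDFS = "http://www.w3.org/2000/01/rdf-schema#"
# Assumption: the example namespace is joined to class names with '#'.
MI = "http://www.m-i.org/schemas/images#"


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# An RDF graph is simply a set of triples.
graph = {
    Triple(MI + "ColumnMiniature", RDFS + "subClassOf", MI + "Miniature"),
    Triple(MI + "Miniature", RDFS + "subClassOf", MI + "Image"),
}


def objects_of(triples, subject, predicate):
    """Follow the arcs labelled `predicate` that leave the node `subject`."""
    return {t.obj for t in triples
            if t.subject == subject and t.predicate == predicate}


print(objects_of(graph, MI + "ColumnMiniature", RDFS + "subClassOf"))
```

Note that Miniature appears once as an object and once as a subject: the same URI can be a node at either end of an arc, which is what links the individual statements into one graph.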
Semantic Transformation 2: Creating and Validating the RDF Statements
In the mapping process an editor tool (in the FMS case a tool developed by the project team, called Meedio) receives as input the XML documents and assists in transforming them into semantically valid RDF statements (instance descriptions). The tool serves as instance editor, which provides a convenient way of finding and selecting from the XML metadata elements correct instance values for a particular property. The editor tool also serves as semantic metadata validator. When the museum cataloguer saves the set of RDF statements corresponding to an XML document, the semantics of these statements are validated against the property constraints of the FMS ontology. The result of a successful mapping and validation process is a unique set of RDF triples called the RDF card.

RDF Mapping Rules
When RDF is used to define the meaning of XML metadata elements, a set of mapping rules is created. A mapping rule is a template of RDF triples where XPath expressions are used to identify the actual element values. XPath is a language for addressing parts of an XML document and was designed for use in XML parsing software (XSLT, XPointer and others).
When applying such a rule to an XML document, the XPath expressions are instantiated with matching element values. If the rule matches, the RDF template evaluates to a set of RDF triples where XPath expressions are substituted by the corresponding values of the XML elements.
For example, by applying the template rule
<image5kb78d38i/type mi:hasCreator image5kb78d38i/creator>
to the XML document described in the section on XML, the following result would be obtained:
<'Column Miniature' mi:hasCreator 'Alexander Master'>
mi:hasCreator is an example of an RDF property. Such properties are explained in section 3, RDF Schema.
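How such a template is instantiated can be sketched as follows. This is an illustration under stated assumptions, not the rule format of the FMS editor tool: the rule is reduced to a plain tuple of two path expressions and a property name, apply_rule is a hypothetical helper, and Python's standard library supports only a limited subset of XPath.

```python
import xml.etree.ElementTree as ET

NS = {"mi": "http://www.m-i.org/images"}
DOC = """<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">
  <mi:type>Column Miniature</mi:type>
  <mi:creator>Alexander Master</mi:creator>
</mi:image>"""

# A rule is a triple template: a path for the subject, a fixed property name
# and a path for the object. The tuple layout is a hypothetical simplification.
RULE = ("mi:type", "mi:hasCreator", "mi:creator")


def apply_rule(rule, xml_text):
    """Instantiate the template; return None when a path matches nothing."""
    subject_path, prop, object_path = rule
    root = ET.fromstring(xml_text)
    subject = root.findtext(subject_path, namespaces=NS)
    obj = root.findtext(object_path, namespaces=NS)
    if subject is None or obj is None:
        return None  # the rule does not match this document
    return (subject, prop, obj)


print(apply_rule(RULE, DOC))
```

A rule whose path expressions find no element in the document simply produces no triple, which is the sense in which a rule 'matches' or does not match.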
Note: Due to the limited space permitted, we do not address issues of term mapping. This is an important aspect of the mapping and validation process carried out in the FMS project. Working with metadata from different museums, the project has to deal with partly different terminologies. Its technical solution to synonymous terms (i.e. different terms referring to the same concept) is to attach synonym sets to the FMS ontology classes. Where polysemous terms occur (i.e. the same term refers to different concepts), the editor tool cannot cope and the cataloguer needs to select the correct interpretation.
Semantic Transformation 3: The RDF Schema (RDFS)
The shared ontology for the textiles domain is created by using Resource Description Framework Schema (RDFS). An RDF Schema is a tool for indicating the classes of resources one wants to describe, as well as for defining the properties used to describe those resources. Furthermore, class/sub-class relationships and property/sub-property relationships can be defined. The museums are mapping their metadata to the classes and properties defined by the RDF Schema of the FMS initiative. Thereby they are making the meaning of the metadata explicit and representing them in a harmonised, uniform way.

RDF Schema
In section Semantic Transformation 1 we have described the data model provided by RDF for expressing statements about Web resources. But we also need a vocabulary for the RDF statements, namely classes and properties defined with RDF Schema (RDFS).
In brief, the RDF Schema mechanism provides a pre-defined vocabulary, a basic type system that can be used in creating domain-specific schemas. Its role is to allow for declaring metadata properties (e.g. for 'type', 'subject' or 'creator'), to define the classes of resources they may be used with, to restrict possible combinations, and to detect violations of those restrictions.
Defining classes
With RDF Schema (RDFS), Web resources can be defined as instances of one or more classes. In addition, classes can be organised in a hierarchical fashion. As we hold a collection of digital images drawn from illustrated medieval manuscripts, we first need to define a class of things that are images. In RDF Schema a class is any resource having an rdf:type property whose value is the RDFS-defined resource rdfs:Class.
So, using the basic RDF data model, we define:
mi:Image [resource] rdf:type [property] rdfs:Class [value]
The self-defined prefix mi: (for medieval images) stands for the URI reference of our RDF Schema namespace, http://www.m-i.org/schemas/images
In our image collection we have various special kinds of digitised images, such as column miniatures, decorated initials, schematic drawings, etc. To distinguish, for example, the miniatures, first we need to define a general class Miniature and subclasses of miniatures, e.g. a subclass ColumnMiniature:
mi:Miniature rdf:type rdfs:Class
mi:ColumnMiniature rdf:type rdfs:Class
Secondly, we need to define that mi:ColumnMiniature is a subclass of mi:Miniature and that mi:Miniature is a subclass of mi:Image, for which we use the predefined rdfs:subClassOf property:
mi:Miniature rdfs:subClassOf mi:Image
mi:ColumnMiniature rdfs:subClassOf mi:Miniature
As the rdfs:subClassOf property is transitive, this means that mi:ColumnMiniature is also implicitly a subclass of mi:Image.
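What 'transitive' means for a program that reads these statements can be shown in a short sketch. The function superclasses is a hypothetical helper that starts from the two explicitly stated subclass relations and derives the implicit one.

```python
# The two rdfs:subClassOf statements from the text, as (subclass, superclass) pairs.
SUBCLASS_OF = {
    ("mi:ColumnMiniature", "mi:Miniature"),
    ("mi:Miniature", "mi:Image"),
}


def superclasses(cls, pairs):
    """Return every direct or inherited superclass of cls."""
    found = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for sub, sup in pairs:
            if sub == current and sup not in found:
                found.add(sup)
                frontier.append(sup)  # keep climbing from the superclass
    return found


print(sorted(superclasses("mi:ColumnMiniature", SUBCLASS_OF)))
```

Although no statement says so directly, mi:Image is reported as a superclass of mi:ColumnMiniature, so a search for images would also return column miniatures.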
Graphic 2 visualises this with the nodes and arcs of the basic RDF data model.
Defining properties
In order to make the meaning of our metadata (e.g. 'type') explicit, we need to be capable of declaring specific properties that characterise the classes of things we hold at http://www.m-i.org, e.g. digital images of medieval column miniatures.
Basically, RDF Schema defines properties in terms of the classes of resources to which they apply. This is the role of the rdfs:domain and rdfs:range mechanisms.
rdfs:range
The range constraint defines the class or set of classes whose instances can be values of a particular property. If we want to define the property mi:hasType, we must describe this resource (which we locate at http://www.m-i.org/schemas/images) with an rdf:type property whose value is rdf:Property:
mi:hasType [resource] rdf:type [property] rdf:Property [value]
The following RDF statements indicate that mi:ColumnMiniature is a class, mi:hasType is a property, and RDF statements using the mi:hasType property have instances of mi:ColumnMiniature as values:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:range mi:ColumnMiniature
rdfs:domain
The domain constraint restricts the set of classes whose instances may have a particular property attached to them. If we want to indicate that the property mi:hasType applies to instances of class mi:ColumnMiniature, we would write:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:domain mi:ColumnMiniature
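The following sketch shows how a validator might use these two constraints to detect violations. It is a deliberately simplified illustration: it treats domain and range strictly as constraints, as described above, and ignores subclass relations. The resources ex:image_a, ex:image_b and ex:initial_c and the class mi:DecoratedInitial are hypothetical examples.

```python
# Constraints taken from the statements in the text.
DOMAIN = {"mi:hasType": "mi:ColumnMiniature"}  # mi:hasType rdfs:domain mi:ColumnMiniature
RANGE = {"mi:hasType": "mi:ColumnMiniature"}   # mi:hasType rdfs:range mi:ColumnMiniature

# Hypothetical instance data: the class each resource belongs to.
TYPE_OF = {
    "ex:image_a": "mi:ColumnMiniature",
    "ex:image_b": "mi:ColumnMiniature",
    "ex:initial_c": "mi:DecoratedInitial",
}


def violations(triple):
    """Return the names of the constraints that the triple violates."""
    subject, prop, obj = triple
    problems = []
    if prop in DOMAIN and TYPE_OF.get(subject) != DOMAIN[prop]:
        problems.append("domain")
    if prop in RANGE and TYPE_OF.get(obj) != RANGE[prop]:
        problems.append("range")
    return problems


print(violations(("ex:image_a", "mi:hasType", "ex:image_b")))
print(violations(("ex:initial_c", "mi:hasType", "ex:image_a")))
print(violations(("ex:image_a", "mi:hasType", "ex:initial_c")))
```

The first statement passes, the second fails the domain check because its subject is not a column miniature, and the third fails the range check because its value is not one.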
Benefits of RDF
In a SearchWebServices.com definition of RDF some benefits of RDF are mentioned:
| 'By providing a consistent framework, RDF will encourage the providing of metadata about Internet resources.
| Because RDF will include a standard syntax for describing and querying data, software that exploits metadata will be easier and faster to produce.
| The standard syntax and query capability will allow applications to exchange information more easily.
| Searchers will get more precise results from searching, based on metadata rather than on indexes derived from full text gathering.
| Intelligent software agents will have more precise data to work with.'13
This is a well-crafted listing of RDF benefits, from provision and exchange of better metadata to agents working with them, hopefully for the benefit of humans. But, as explicitly stated by SearchWebServices.com, these are only potential benefits, i.e. they depend on the level of actual uptake of RDF.
13 whatis.com / searchWebServices.com Definitions - Resource Description Framework, http://searchwebservices.techtarget.com/sDefinition/0,,sid26_gci213545,00.html (last updated July 27, 2001)
Generating and Using the Knowledge Space
The RDF cards represent the original XML documents at the semantic level. The union of such RDF cards constitutes a knowledge base, which is a harmonised semantic representation of the underlying heterogeneous databases.
However, so far the RDF instance descriptions have not left the museum. The museum has complete control of the information it wants to publish, and it does not need to allow the FMS system access to its internal database system. The RDF data are placed in a public directory on the museum's WWW server.
The Web crawler of the FMS system harvests the instance descriptions from the different museums, and the system combines them into an RDF repository. This repository is a large semantic graph that consists of the shared ontology and metadata.
How does a user now search and navigate in this knowledge space? In the FMS system this is implemented by server-side software called Ontogator. Based on the semantic graph, this software dynamically generates semantic linkages for the user's Web browser.
One way of using the FMS system is view-based filtering. The user can select classes of resources from the ontology, and the system finds the instances that match the selected class restrictions. By constraining classes (views) further, the collection instance data searched for are eventually found.
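The two ideas just described, combining RDF cards into one repository and filtering instances by a selected class, can be sketched as follows. This is not how Ontogator is implemented; the RDF cards, the resources ex:item1 to ex:item3, and the class mi:DecoratedInitial with its place in the hierarchy are hypothetical, and the function names are our own.

```python
# Hypothetical RDF cards from two museums, each a set of (subject, predicate, object).
card_museum_1 = {
    ("ex:item1", "rdf:type", "mi:ColumnMiniature"),
    ("ex:item1", "mi:hasCreator", "Alexander Master"),
}
card_museum_2 = {
    ("ex:item2", "rdf:type", "mi:DecoratedInitial"),
    ("ex:item3", "rdf:type", "mi:ColumnMiniature"),
}
# Shared ontology; placing mi:DecoratedInitial under mi:Image is an assumption.
ontology = {
    ("mi:ColumnMiniature", "rdfs:subClassOf", "mi:Miniature"),
    ("mi:Miniature", "rdfs:subClassOf", "mi:Image"),
    ("mi:DecoratedInitial", "rdfs:subClassOf", "mi:Image"),
}

# The repository is the union of the shared ontology and all harvested cards.
repository = ontology | card_museum_1 | card_museum_2


def subclasses_of(cls, triples):
    """Return cls itself plus every class below it in the subclass hierarchy."""
    result = {cls}
    changed = True
    while changed:
        changed = False
        for s, p, o in triples:
            if p == "rdfs:subClassOf" and o in result and s not in result:
                result.add(s)
                changed = True
    return result


def instances_of(cls, triples):
    """View-based filtering: find instances of cls or of any of its subclasses."""
    wanted = subclasses_of(cls, triples)
    return {s for s, p, o in triples if p == "rdf:type" and o in wanted}


print(sorted(instances_of("mi:Image", repository)))
print(sorted(instances_of("mi:Miniature", repository)))
```

Selecting the broad class mi:Image returns the items of both museums, while narrowing the view to mi:Miniature leaves only the two column miniatures, which mirrors the step-by-step constraining of views described above.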
The software also supports topic-based navigation by providing semantic links between topics of interest, the creation of which is based on the collection domain ontology and the related metadata of the collection records. This means that the links also provide the user with an impression of the wider context and pragmatics of the objects in the museums' collections.
From human users to software agents
As described in the Finnish Museums on the Semantic Web example, the RDF repository is a large semantic graph that consists of the shared ontology and metadata of the participating museums. Such a repository can be queried, and the results, a set of pointers to the relevant resources, can be accessed using Web browsers. The opportunities provided by a system like the one developed by the FMS initiative (e.g. topic-based navigation) are at present restricted primarily to human users.
The Semantic Web vision includes intelligent software agents which 'understand' semantic relationships between Web resources and seek relevant information as well as perform transactions for humans.14
This software would be capable of autonomous action, i.e. could run without direct human control or constant supervision, and ideally is very flexible in doing this. Characterisations of this flexibility include actions that are 'reactive', 'proactive' and 'social' (see below).
While the basic idea of agents is very intuitive and appealing, the actual theory is complex, the tools are immature, and the solutions small and prototype-based. In fact, as a parallel distributed systems technology, agents belong to the most complex class of software technology.
However, this primer will conclude with a summary of what an intelligent software agent is and what such software would generally be capable of doing. This should also serve as an indication of how great the challenge for research and technological development is to make the full Semantic Web vision a reality.
Intelligent Software Agents
The following definitions are taken from Michael Wooldridge's introduction to multiagent systems.15

Agent
'An agent is a computer system capable of autonomous action in some environment.'

Intelligent agent
'An intelligent agent is a computer system capable of flexible autonomous action in some environment.'

Flexible autonomous action
'By flexible autonomous action we mean reactive, proactive, social.'
| Reactivity: 'A reactive system is one that maintains an ongoing interaction with its environment, and responds to changes that occur in it (in time for the response to be useful).'
| Proactiveness: 'An agent serves a purpose and therefore exhibits goal-directed behaviour, including the capacity to recognise opportunities for useful courses of action.'
| Social ability: 'Social ability in agents is the ability to interact with other agents (and possibly humans) via some kind of agent communication language, and perhaps cooperate with others.'
Desirable further properties of agents are:
| Mobility: the ability to move around an electronic network
| Rationality: an agent will act in such a way that it does not prevent itself from achieving its goals (as far as this is possible with a limited set of beliefs representing its world knowledge)
| Learning: an agent will improve its performance over time
14 T. Berners-Lee, J. Hendler, O. Lassila: Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html
15 M. Wooldridge: An Introduction to Multiagent Systems. Chichester: Wiley 2002, and http://www.csc.liv.ac.uk/~mjw/pubs/imas/
Resources
In the primer no references are made to the documents of the World Wide Web Consortium (W3C). All relevant W3C recommendations can be found at http://www.w3c.org
Of the wealth of introductory materials on XML and RDF available on the Web, the following in particular are useful to consult for further details:
http://www.w3schools.com/xml/
http://www.w3schools.com/schema/
http://www.w3.org/TR/rdf-primer/
K. S. Candan, H. Liu, R. Suvarna: Resource Description Framework: Metadata and its Applications. In: SIGKDD Explorations, Vol. 3/1 (2001), 6-19
Pierre-Antoine Champin: RDF Tutorial (2001), http://www710.univ-lyon1.fr/~champin/rdf-tutorial/rdf-tutorial.html
S. Decker, M. P. Mitra, S. Melnik: Framework for the Semantic Web: An RDF Tutorial, http://www.ida.liu.se/~asmpa/courses/sweb/rdf/rdf_tutorial.pdf
Graphic 3: Set-up of the Finnish Museums on the Semantic Web System

The Finnish Museums on the Semantic Web: Overview of the System's Set-up
| User: Client WWW Browser (topic-based navigation, view-based filtering)
| Server: Ontogator software
| RDF database: semantic graph, the knowledge space of shared ontology and metadata
| Web crawler: collects the RDF instances of the museums
| Museums 1 to n, each with: Collection database, XML repository, Metadata Editor, RDF instances

Levels of interoperability:
| RDF Schema (Ontology/RDFS): semantic interoperability
| XML Schema: syntactic interoperability
| Relational Schemas: DBMS
THE DARMSTADT FORUM PARTICIPANTS

Wernher Behrendt, Salzburg Research, Austria
Wernher Behrendt is a Senior Researcher at the Sun Technology and Research Excellence Center at Salzburg Research (Austria), working on multimedia middleware and interoperation issues. He holds an MSc in Cognitive Science from Manchester University and has more than 10 years’ experience in near-to-market IT research. From 1989 to 1995 he was a Senior Research Associate in the Informatics Department at Rutherford Appleton Laboratory (UK), working on embedded knowledge-based systems in distributed multimedia presentation systems. From 1995 to 1998 he was a Senior Research Associate at Cardiff University (UK), working on interoperation between heterogeneous information systems. Mr Behrendt has held courses in Computer Science and has worked on projects ranging from software engineering methods and quality assurance to legacy system reengineering using migration methods and distributed systems middleware.
E-mail: wernher.behrendt@salzburgresearch.at

Paolo Buonora, Archivio di Stato di Roma, Italy
Paolo Buonora holds a degree in Philosophy from the University of Rome ‘La Sapienza’ (1976). He worked in the Italian State Archive Administration from 1978, where he was first involved in editing the Guida Generale degli Archivi di Stato italiani. From 1986 he worked in the Soprintendenza archivistica per il Lazio, surveying audiovisual archives and municipal archives; from 1989 to 1991 he was at Perugia University, engaged in doctoral research in ‘Urban and rural history’; and from 1991 to 1994 he was again in the Soprintendenza archivistica per il Lazio. After 1994 he worked in the Archivio di Stato di Roma, where he was responsible for the photograph service and several working groups on informatics application in archival documentation. From 1997 until the present time he has planned and directed the Imago II project in the Archivio di Stato di Roma.
See httpwwwasrmarchivbeniculturaliitsidimagoIMAGOIIenhtml

Samuel Cruz-Lara, University of Nancy 2 and LORIA, France
Samuel Cruz-Lara obtained a Master’s degree in Computer Science in 1984 (University Henri Poincaré, Nancy 1) and a PhD degree in Computer Science in 1988 (National Polytechnic Institute of Lorraine). The central topic of his PhD thesis was the generation of integrated development environments by using attribute grammars. He is currently Associate Professor at the University of Nancy 2 (Institute of Technology, Computer Science Department) and permanent Researcher at LORIA (Lorraine Laboratory for Research in Computer Science and its Applications - UMR 7503 - CNRS - INRIA - Universities of Nancy). He is a member of the ‘Language and Dialogue’ team and has conducted several research activities on distributed software architectures and textual linguistic resources management. He is currently working in the context of distributed architectures and multimedia resources management. Dr Cruz-Lara has participated in several projects, in particular CNRS/SILFIDE and MLIS-ELAN, and he is at present co-leader of the ‘Digital Museum’ project. This is a joint project between LORIA and the National Chi-Nan University. The ‘Digital Museum’ project is sponsored by the National Science Council of the Republic of China (Taiwan) and supported by INRIA (France).

Costis Dallas, Critical Publics SA, Greece
Costis Dallas is Chairman and Senior Researcher of Critical Publics (http://www.criticalpublics.com), a group of companies active in the field of strategic communications, creative design and technology. He is currently a Lecturer in the Department of Communication and Mass Media of Panteion University and has over 15 years of research and professional experience in hypermedia applications, human factor issues and cultural information systems. Dr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA, co-founder and Chair of the Multimedia Working Group of the International Council of Museums (CIDOC/ICOM), Head of Documentation and Systems of the Benaki Museum, General Director of the Foundation of the Hellenic World, Special Secretary of the Greek Ministry of Education in charge of libraries, archives and instructional technologies, and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: info@criticalpublics.com

Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (http://www.nl.adlibsoft.com), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that he joined Databasix, producer of the ADLIB software. In 1991 Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was later renamed ADLIB Information Systems. ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include:
CIMI Dublin Core test bed - ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database, in addition to the ‘standard’ HTML/XML access.
SPECTRUM-XML - A project of the UK mda, together with a team of software vendors, to produce a schema for exchange of data which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software.
Internet Gelderse Musea (http://www.igem.nl) and Maritiem Digitaal (http://www.maritiemdigitaal.nl) - Two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.
E-mail: bert@nl.adlibsoft.com

Nicola Guarino, Italian National Research Council, Italy
Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991 and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS) and associated editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: Nicola.Guarino@ladseb.pd.cnr.it

Janneke van Kersen, Digital Heritage Association, The Netherlands
Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992 she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing, and more recently she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University. Since October 1999 van Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations, such as museums, archives, archaeological organisations, monuments and special collections of libraries, in a context-rich and XML-based environment. Interoperability, from both a technical and an organisational perspective, is therefore a major issue. The Association offers access to heritage information to the general public at http://www.cultuurwijzer.nl and customised access for the educational field at http://www.cultuurwijs.nl
E-mail: janneke.vankersen@den.nl

Marco Meli, EDW International, Italy
Marco Meli is CEO and co-founder of EDW International (http://www.edw-international.com), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia and VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meli@edw.it

Paul Miller, UKOLN, UK
Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, http://www.ukoln.ac.uk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK’s Further and Higher Education Funding Councils and Resource, the Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI) and the Metadata Working Group of the UK Government’s Office of the e-Envoy.
E-mail: P.Miller@ukoln.ac.uk

Frank Nack, CWI, The Netherlands
Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on ‘The Application of Video Semantics and Theme Representation for Automated Film Editing’ at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document, and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: Frank.Nack@cwi.nl

Franco Niccolucci, Florence University, Italy
Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference.
See also http://www.geog.port.ac.uk/hist-bound/people/niccolucci.htm

Seamus Ross, HATII, University of Glasgow, UK
Dr Seamus Ross is Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Access Network) (IST-2001-32706), a European Union funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously he was Assistant Secretary for Information Technology at the British Academy, and before that worked for a company specialising in expert systems and software development, as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group’s PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: S.Ross@hatii.arts.gla.ac.uk

Andrea Scotti, Institute & Museum for the History of Science, Italy
Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995 he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.
Since 1996 Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza and the National Central Library, and under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (01.02.01-30.07.03), funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies.
See http://cweb.inria.fr/Projects/Mesmuses
E-mail: scottian@galileo.imss.firenze.it

DIGICULT PROJECT INFORMATION

DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over the period of 30 months (03/2002-08/2004).
In order to encourage early take-up, DigiCULT produces seven Thematic Issues and three Technology Watch Reports, along with the newsletter DigiCULT.Info.
DigiCULT draws on the results of the strategic study ‘Technological Landscapes for Tomorrow’s Cultural Economy (DigiCULT)’, which was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications) in 2000 and completed in 2001.
Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at http://www.digicult.info
For further information on DigiCULT please contact the team of the project co-ordinator:
Mr Guntram Geser, guntram.geser@salzburgresearch.at, Phone: +43-(0)662-2288-303
Mr John Pereira, john.pereira@salzburgresearch.at, Phone: +43-(0)662-2288-247
Salzburg Research Forschungsgesellschaft, Jakob-Haringer-Str. 5/III, A-5020 Salzburg, Austria
Phone: +43-(0)662-2288-200, Fax: +43-(0)662-2288-222
http://www.salzburgresearch.at
Project Partner:
HATII - Humanities Advanced Technology and Information Institute, University of Glasgow, http://www.hatii.arts.gla.ac.uk
Contact: Mr Seamus Ross, s.ross@hatii.arts.gla.ac.uk
The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom
DigiCULT Thematic Issue 1 - Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum, held in Barcelona on May 6th, 2002, in the context of the DLM-Conference 2002.
DigiCULT Thematic Issue 2 - Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum, held in Essen, Germany, on September 3rd, 2002, in the context of the AIIM Conference DMS EXPO.
DigiCULT Thematic Issue 3 - Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum, held on January 21st, 2003, at Fraunhofer IPSI, Darmstadt, Germany.
DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum on Learning Objects, which will take place at the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, on July 2nd, 2003.
IMPRINT
This Thematic Issue is a product of the DigiCULT Project (IST-2001-34898).
Authors:
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy
Images:
Images for this Thematic Issue have been provided by and are reproduced with kind permission of the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, Netherlands.
Graphics & Layout:
Jan Steindl, Salzburg Research
ISBN 3-902448-00-8
Printed in Austria
© 2003
IMAGES
Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I. Image on p. 5 from Fol. 264r, size 80x75; illuminator: Orosius Master a.o.
Bible Historiale, Paris, c. 1320-1340, Volume I. Image on p. 7 from Fol. 2r, size 45x50; illum.: ‘Sub-Fauvel’ Master
Historic Bible, Utrecht, c. 1430, Volume I. Images on pp. 8, 10, 14, 28 from Fol. 3r, 3v, 4v, 7v, size 55x85 to 60x85; illum.: Alexander Master o.a.
Psalter, Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468. Images on pp. 11, 13: Wednesday Matins, Invitatorium, from Fol. 219r
Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350. Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron) from Fol. 104rb1, 106rb2, 124r, 111ra, size 40x55 to 50x55
Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335. Image on p. 21 from Fol. 4va1, size 45x55
Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460. Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41
Psalter, Normandy, c. 1180. Image on p. 26 from Fol. 3v, size 160x125
Breviary, Cambray (?), c. 1275-1300. Images on pp. 27, 29, 31, 33, 34, 35, 37 from Fol. 211r, size 205x135
© Koninklijke Bibliotheek, The Hague. Used with permission.
INTRODUCTION AND OVERVIEW
By Guntram Geser

FUNCTION AND FOCUS

DigiCULT, as a support measure within the Information Society Technologies Programme (IST), will for a period of 30 months (beginning March 2002) provide a technology watch mechanism for the cultural and scientific heritage sector. Backed by a network of peer experts, the project monitors, discusses and analyses existing and emerging technologies likely to bring benefits to the sector.
To promote the results and encourage early take-up of relevant technologies, DigiCULT has put in place a rigorous publication agenda of seven Thematic Issues and three in-depth Technology Watch Reports, as well as the DigiCULT.Info e-journal, pushed to a growing database of interested persons and organisations on a regular basis. All DigiCULT products can be downloaded from the project Website http://www.digicult.info as they become available. The opportunity to subscribe to DigiCULT.Info is also found here.
March 2003 saw the release of the first DigiCULT Technology Watch Report. This report covers the topics Customer Relationship Management, Digital Asset Management Systems, Smart Labels and Smart Tags, Virtual Reality and Display Technologies, Human Interfaces, and Games Technologies. Addressing primarily technological issues, it serves as a guide to what a heritage institution needs to consider when buying into one of these technologies.
In comparison with the Technology Watch Reports, the Thematic Issues focus more on the organisational, policy and economic aspects of the technologies under consideration. They are based on the expert round tables organised by the DigiCULT Forum secretariat. In addition to the Forum discussion, they provide opinions of other experts in the form of articles and interviews, case studies, and short descriptions of related projects, together with a selection of relevant literature.

TOPIC AND CHALLENGE

This third Thematic Issue addresses the questions: What is the Semantic Web? What will it do for heritage institutions? And what is the role of certain languages, in particular XML and RDF?
In short, the Semantic Web vision proclaims a Web of machine-readable data which allows software agents to automatically carry out rather complex tasks for humans. Key to realising this vision is semantic interoperability of Web resources. Yet such interoperability is not the primary goal of heritage institutions (and intelligent software agents are not readily at hand).
What the institutions are looking for are new ways of providing scholarly and non-expert users (e.g. school classes, lifelong learners) with access to their collections and related knowledge. This goal can be accomplished, for example, through online collections and exhibitions that not only display objects and simple descriptions (drawn from metadata), but also allow for understanding relationships between objects (created by semantically interrelated metadata). The Semantic Web community promises to assist in achieving this goal, but the challenge for the heritage institutions would be to first implement the necessary data infrastructure.
[Image: Philosophy in Discussion With a Philosopher]
The challenge for the Semantic Web expert round table was, or at least the DigiCULT Secretariat thought it was, not to run into a debate between ‘theory’ and ‘practice’; in other words, between what academic Semantic Web scholars and what practitioners from heritage institutions think needs to be accomplished, what is feasible and affordable, and where to concentrate efforts. For the discussion, XML seemed to provide a good starting point: XML, on the one hand, is increasingly considered by heritage institutions as a key standard for publishing metadata on the Web; on the other hand, it is a major building block for the Semantic Web. It proved different, in a positive sense. In the discussion, wide use of XML was taken for granted, while the key area of interest that surfaced, and was seen to be most fruitful to explore, was ontologies.
OVERVIEW
Setting the context for this Issue, the position paper looks into the requirements for achieving the goals of the Semantic Web and assesses whether the available technologies will be able to deliver on what the advocates of the Semantic Web envisage, as well as whether the cultural heritage sector is in a position to take substantial steps towards semantic interoperability. It concludes with the argument that the sector is more likely to be left behind, due in particular to the fact that, for the institutions, the rewards for the necessary investments are still too nebulous.
Janneke van Kersen from the Dutch Digital Heritage Association, in her interview with the DigiCULT Journalist, suggests that despite the cloudy Semantic Web horizon there are medium-term benefits to be gained for heritage institutions in taking steps towards the vision. And she states that it is up to associations like hers, together with larger institutions, to take the lead in this, prove that proposed solutions work, and support smaller institutions in taking advantage of them. On the other hand, Nicola Guarino in his interview believes that reaching the ‘real’ Semantic Web lies in taking ‘the fundamental route’ of implementing generic ontologies based on linguistics and logics within the Semantic Web fabric. He also claims that even incremental progress along this path can have remarkable pay-offs.
Michael Steemson’s summary of the Darmstadt Forum illustrates that the Semantic Web topic resembles a labyrinth, with currently no definite map or Ariadne’s Thread at hand. Building on the many technologies the Forum participants mentioned as some of the labyrinth’s angles, we have added to the summary a list of resources related to these technologies.
In an effort to raise the veil of mystery surrounding the Semantic Web, this issue includes an example from the sector on the implementation of semantic interoperability of metadata, combined with a primer that explains core building blocks such as XML, RDF and ontologies. While a detailed primer of, for example, RDF would alone exhaust the limits of this issue,¹ the goal here is to deliver an ‘all-inclusive’ primer within the space permitted, with all the inevitable limitations this entails. The primer attempts to provide a general understanding of the Semantic Web architecture without obliging the reader to wander through the long and perplexing corridors of language specifications.
Finally, we want to thank the Koninklijke Bibliotheek, National Library of the Netherlands, for their kind permission to use selected images from their collection of illuminated medieval manuscripts.² We hope you will appreciate the little narratives they represent within the overall fabric of this DigiCULT Thematic Issue.
1 cf. F. Manola, E. Miller: RDF Primer (W3C Working Draft, 23 January 2003), http://www.w3.org/TR/rdf-primer/
2 See their online collection of such images at http://www.kb.nl/kb/manuscripts, which offers advanced search and presentation features.
POSITION PAPER
By Seamus Ross

[Image: Genesis - The Creation: Division of Light and Darkness]

Tim Berners-Lee and his colleagues at W3C have recognised that the real benefits of the web-based information revolution will come from enabling the interoperability of content. The current generation of web delivery is, they have argued, designed for human users who struggle to make effective use of the billions of pages of information currently accessible. When we search for something at the moment we sometimes discover suitable candidate information, but more often than not this is far from being the case. More than this, the entire process of searching, discovery and use is designed to be driven by humans. When we discover one piece of the puzzle, we need manually to position that information so that it can help us to search out the next piece of the puzzle. We find that Darmstadt is near Frankfurt. Then we find that there are flights from Glasgow to Frankfurt and there is a bus from Frankfurt Airport to Darmstadt. Then I search for timetables, make manual comparisons and decide which times best meet my requirements. In the Shangri-La that is the Semantic Web, my ‘agent’ would recognise from its regular review of my diary that I needed to be at a meeting in Darmstadt on the 21st of January 2003, and it would search out the options, analyse the timetables, identify the optimum travel arrangements, book my non-smoking hotel accommodation and order the taxi to take me to the airport. (It might even check the weather forecasts and warn me to bring particular types of clothing.) Certainly, to make this happen there has to be a fundamental shift in the way data, information and knowledge are represented on the web.
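To give a feel for what ‘analysing the timetables’ would mean once they are published as structured data rather than as pages for human eyes, here is a minimal sketch. Everything in it is an assumption made for illustration: the departure and arrival times are invented, the 30-minute transfer margin and the 09:00 meeting time are arbitrary, and ‘optimum’ is simply taken to mean the latest departure that still arrives in time.

```python
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M"

# Hypothetical machine-readable timetables (all times invented).
FLIGHTS = [  # Glasgow -> Frankfurt
    {"depart": "2003-01-20 07:10", "arrive": "2003-01-20 10:05"},
    {"depart": "2003-01-20 14:30", "arrive": "2003-01-20 17:25"},
]
BUSES = [  # Frankfurt Airport -> Darmstadt
    {"depart": "2003-01-20 11:00", "arrive": "2003-01-20 11:35"},
    {"depart": "2003-01-20 18:00", "arrive": "2003-01-20 18:35"},
]


def parse(text):
    """Parse a timetable timestamp into a datetime."""
    return datetime.strptime(text, FMT)


def plan(flights, buses, deadline, transfer=timedelta(minutes=30)):
    """Pick the flight/bus pair that leaves latest but still arrives in time."""
    options = []
    for flight in flights:
        for bus in buses:
            connects = parse(bus["depart"]) >= parse(flight["arrive"]) + transfer
            in_time = parse(bus["arrive"]) <= deadline
            if connects and in_time:
                options.append((flight, bus))
    # 'Optimum' here means the latest possible departure from Glasgow.
    return max(options, key=lambda pair: parse(pair[0]["depart"]), default=None)


# Assumed meeting time on the day of the Forum.
meeting = parse("2003-01-21 09:00")
best = plan(FLIGHTS, BUSES, meeting)
print(best[0]["depart"], "->", best[1]["arrive"])
```

The comparison itself is trivial; the hard part, as the paper argues, is getting airlines, bus operators and diaries to expose their data in a form that a program can combine at all.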
The proliferation of web-based resources makes finding what you are looking for increasingly difficult. According to Internet user studies, in 1996 50% of Internet users reported spending time looking for information without finding it, but by 2002 only about 40% of users ended their ‘searching sessions’ unsuccessfully. At first glance we might conclude that web discovery tools have improved and/or the information searching skills of users have improved. Over the past seven years the quantity of content has mushroomed, the search tools have become more efficient, developers approach the use of meta-tags more effectively, and anecdotal evidence suggests that the searching techniques of users have become more sophisticated. We should continue to be surprised by the high failure rate and wonder why it remains proportionally so high as the number of users has grown to nearly 600 million. In reality there is just too much content available. It is poorly described. It is not interconnected. Search engines themselves are blunt instruments. Most users of the web do not have very mature searching strategies and rarely use even the blunt instruments as effectively as they might. A solution is to make more of the information capable of discovery, interpretation and reuse by automated information processing tools themselves. However, the current ways content is represented on the web make it nearly impossible for machines to search the web meaningfully and effectively - even with the limitations of their skills and tools, humans are better at searching the web than the most powerful of the current generation of agents. The emergence of the Semantic Web would solve this problem.
The web has made us realise the tremendous potential of digital resources and made them widely available. Content as presented on the web currently is mute. By adding descriptive information to content and resources, and representing both the descriptive information and the content in well-defined, consistent and structured ways, ‘mechanised agents’ could be enabled to use web information ‘intelligently’. Tim Berners-Lee, Jim Hendler and many other researchers believe that commercial and public sector institutions are increasingly recognising the benefits of ensuring that their content is adequately represented so that it is visible and discoverable within the context of the Semantic Web.
[Image: Genesis - The Creation: Division of the Waters Above and Below the Firmament]
The Semantic Web will enable the heritage sector to make its information available in meaningful ways to researchers, the general public and even its own curators. The public will be able to plan visits to institutions by, for example, dynamically relating opening times to public transport schedules. They will be able to use information to discover whether or not that vase in the attic or basement is really Ming, as their grandmother claimed, by comparing it to the holdings of heritage institutions across the world. Curators will benefit from the ability to define an exhibition and have the entire process, from the identification of the pieces to be shown in the exhibition to the production of the catalogue and publicity material, automatically handled by their ‘exhibition agents’.
TOWARDS AN INTEROPERABLE SEMANTIC WEB FOR HERITAGE RESOURCES
Delivering the Semantic Web to the heritage sector depends upon (a) the syntactical and semantic mark-up of content, (b) the development of better knowledge analysis and modelling tools, (c) widespread adoption of interoperable knowledge representation languages, and (d) the construction of suitable ontologies. In most of this the heritage sector is lagging behind. We have not yet successfully represented sufficient quantities of our data in ways that make it accessible to human web users, let alone in ways that would make it feasible for ‘mechanised agents’ to reason about in meaningful ways. ‘Languages for representing data and knowledge are an important aspect of the Semantic Web’ (Klein 2001, 26). The languages that are currently the focus of the most substantial discussion, such as RDF, DAML+OIL and OWL,¹ do not necessarily provide a suitable framework for delivering the Semantic Web. This point has been increasingly argued in the literature, although in practice we still tend to emphasise the possibilities of representation mechanisms such as RDF(S) because it provides a flexible and extensible mechanism to represent metadata. A debate is raging about which language should be used to represent semantics on the web. Resource Description Framework (RDF), an XML-based mechanism for expressing metadata, has been put forward at the basic level, but there is a growing body of opinion that indicates it does not have the richness that is necessary to make a suitable language. One of its shortcomings is that it cannot support syntax. In response, other languages such as DAML+OIL have been developed. As an indication of the current levels of flux, in a fundamental paper Patel-Schneider and Siméon from Bell Labs Research remark that ‘…there is a semantic discontinuity at the very bottom of the Semantic Web, interfering with the stated goal of the Semantic Web: If Semantic languages do not respect World-Wide Web data, then how can the Semantic Web be an extension of the World-Wide Web at all?’ (2002a, 147).
The strength of XML is that it does not itself constrain how the data will be interpreted. While XML does not imply a specific interpretation of the data, how the material is marked up does constrain how it can be used. Fallside (2001) has made plain the weaknesses of using DTDs as a way of specifying semantic properties in XML (eXtensible Markup Language). XML Schemas offer a solution to these weaknesses, especially where those weaknesses arise from representational problems. On the other hand, the hierarchical nature of XML does not fit all domains; it ‘does not encode the data’s use and semantics’, and DTDs and XML Schemas do not specify the data’s meaning, although they do specify the names of elements and attributes. Will the Semantic Web produce different levels of sophistication in the representation of data and knowledge in the web-world? If it does, will this create a patchy representation of web information that will make the Semantic Web of limited value?
[1] See the ‘Semantic Web Terms and Reading List’ in this Issue.
ONTOLOGIES – THE JEWELS OF THE SEMANTIC WEB
For the Semantic Web to succeed, it will require not only modelling languages such as XML, RDF and OWL, but also methodologies for extracting and defining the knowledge that is to be represented. Decades of research and commercial attempts to exploit knowledge-based systems have demonstrated the complexity of knowledge modelling. Until there is such a methodology, the possibilities of XML (or any other technology) as a knowledge representation language will not be achieved.
The success of the Semantic Web will depend heavily upon the creation of suitable ontologies. To avoid adding new variants to definitions, we will follow James Hendler’s definition of ontology as ‘a set of knowledge terms, including the vocabulary, the semantic interconnections, and some simple rules of inference and logic for some particular topic’ (Hendler 2001, 30). One of the major hurdles facing us in building the Semantic Web is the lack of suitable ontologies. Languages such as OWL enable ontologies to represent ‘class taxonomies’ and provide mechanisms to enable their rapid development. For example, concepts and relationships can be established, such as ‘watercolour is a type of painting’ or ‘a necklace is a type of jewellery’. But what about their multilingual capabilities? An ontology may well know that a ‘watercolour is a painting’, but it does not necessarily mean that it knows that an ‘aquarelle is a type of painting’ or that a ‘watercolour is a type of peinture’. In addition, and probably first, we need to consider:
• Can we cost the creation of appropriate ontologies for the heritage sector?
• How can we prioritise the ontologies that are needed (e.g. which ones should the heritage sector develop, and which ones will we be able to borrow from other sectors)?
• What heritage-based organisations should focus on ontology creation?
• Ontologies often fail to be interoperable. What solutions are there to this problem, and how can they be made to work effectively?
• Does OWL (W3C’s Web Ontology Language, a semantic markup language for publishing and sharing ontologies) provide a suitable mechanism for ontology creation for the heritage sector?
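The watercolour and aquarelle example above can be made concrete with a small sketch. It is not OWL and does not stand for any real heritage ontology; the concept names, the label table and the helper functions are all hypothetical, chosen only to show why a class taxonomy on its own does not ‘know’ that terms in different languages point at the same concept.

```python
# A toy class taxonomy (hypothetical data, for illustration only).
SUBCLASS_OF = {
    "watercolour": "painting",
    "necklace": "jewellery",
}

# Labels in more than one language mapped onto one concept identifier.
# Without this table the taxonomy cannot relate 'aquarelle' to 'painting'.
LABELS = {
    "watercolour": "watercolour",
    "aquarelle": "watercolour",
    "painting": "painting",
    "peinture": "painting",
    "necklace": "necklace",
    "jewellery": "jewellery",
}


def concept_for(term):
    """Return the concept identifier for a label, or None if unknown."""
    return LABELS.get(term.lower())


def is_a(term, ancestor_term):
    """True if 'term' names a (strict) subclass of what 'ancestor_term' names."""
    child = concept_for(term)
    ancestor = concept_for(ancestor_term)
    if child is None or ancestor is None:
        return False
    # Walk up the taxonomy one superclass at a time.
    current = SUBCLASS_OF.get(child)
    while current is not None:
        if current == ancestor:
            return True
        current = SUBCLASS_OF.get(current)
    return False


if __name__ == "__main__":
    print(is_a("watercolour", "painting"))
    print(is_a("aquarelle", "painting"))
    print(is_a("watercolour", "peinture"))
```

The design point is that the multilingual knowledge lives in a separate label layer; an ontology that only stores the English class names would answer the first question and fail on the other two.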
Gómez-Pérez and Corcho (2002), in an analysis of ‘Ontology Languages for the Semantic Web’, found that the measure of expressiveness in the current generation of ontology creation languages is a spectrum from XOL, RDF(S), SHOE, OML and OIL to DAML+OIL at the richest end of the scale. Indeed, in their experience, while any of these languages will work for very simple ontologies, any attempt to use a weak language to create a complex ontology will fail.
Proof and trust is emerging as another central issue. How do we know that what our agent has discovered through its trawl of the Semantic Web can be trusted? Even in the case of ontologies, how should we decide whose ontology to trust? This is especially important where two ontologies may conflict with one another. Similarly, we are faced with the difficulties of ensuring and maintaining semantic integrity, and a lack of methods for testing its presence.
LEGITIMISING THE SEMANTIC WEB INVESTMENT
Heflin and Hendler (2001) make the valuable proposal that semantic markup should be seen as one aspect of webpage design. This, in their view, would go a long way to ensuring that the costs of this mark-up (and the underlying information analysis that is necessary to make it happen) were met at the appropriate stage of the process of putting material up on the web. However, Heflin and Hendler’s proposal that semantic mark-up should be embedded into web-page design fails to recognise that the fundamental fabric of the web is changing. For this to happen we need a stronger argument for the benefits that such investment will bring to the heritage sector.
Haustein and Pleumann (2002) have noted that the successful development of the World Wide Web benefited from two factors: ‘Participation was simple and the results of effort were immediately visible to the creator.’ As they argue, while these two success criteria, best classified in my view as ease of use and instant gratification, were characteristic of the WWW, they are not embedded into the fabric of the Semantic Web. The Semantic Web is hard, and rewards are neither immediate nor assured. While in the long term it may bring tremendous benefits, the near-term take-up will be slow.
At least three other factors contributed to fostering the success of the web. Firstly, the early web developments concentrated on content creation and not on the creation of representation languages. The initial instantiation of HTML was simple, but it worked, and material tagged using it remained accessible. We were not forced to dispose of work that we had ‘webised’ unless we wished to replace its representation with more sophisticated ones. Secondly, the value of the content that we put up increased as more users put up content of their own, because the additional content attracted more users. Thirdly, to benefit you did not need to generate a lot of content; a very little would do, and you could incrementally add more later. Slowly, heritage institutions found ways to take advantage of the opportunities offered by the web, although there are still many small and medium-sized heritage institutions that have not.
[Illustration: Genesis – The Creation: Division of Sea and Earth; Creation of Trees and Plants]
Indeed, the heritage sector is likely to be left behind because the financial rewards for creating the mark-up necessary to make the Semantic Web a reality are only evident to the commercial sector. There can be little doubt that access to and understanding of the heritage would benefit from a world in which the vision of the Semantic Web were realised. But this is not the first information technology for which the benefits were promising. Even very simple strategies, such as the use of databases to enable collection description, have been shown over a period of nearly thirty-five years to bring benefits to heritage sector institutions through better knowledge about, care of and access to their collections. In the ALM sector only libraries can be said to have fully taken advantage of the technology to describe their collections, and even here a close look shows that this has not covered all their holdings and not every institution. For instance, few libraries in the UK have online catalogues of their pre-1700 items, and almost none have accurately described their photographic holdings at anything deeper than collection level. The same can be said of museums, where descriptions are limited, except of course at the major institutions. In 1997 a survey in the United Kingdom showed that small and medium-sized institutions were struggling to participate in the computer-based description of their holdings. This was even before they considered putting the output of those holdings online. I would argue that this should hardly be surprising, as the heritage sector has already been left behind in the development of online information in the web-world. Too few institutions have too little visible content that is actually usable. If the heritage sector is to make a near-term contribution to the development of the Semantic Web, it is going to be very moderate. It is very unlikely that developments will be related to reasoning about the heritage in the ways considered by Amann et al. (2002). The ALM sector is more than likely to participate in the development of the Semantic Web through the creation of semantic mark-up of information about access arrangements, such as opening hours and details of facilities. This information is more likely to be useful to the tourism agent described by Tim Berners-Lee in his 2001 Scientific American article. While this may be a very positive way of integrating the heritage into the Semantic Web, it does not maximise the potential benefits.
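To make the access-arrangements case tangible, here is a minimal sketch of how an agent might relate machine-readable opening hours to a list of public transport arrival times. Everything in it is an assumption made for illustration: the record layout, the institution name, the hours and the `usable_arrivals` helper are invented, and a real deployment would use an agreed vocabulary rather than an ad hoc Python dictionary.

```python
from datetime import time

# Hypothetical machine-readable access record for one institution.
OPENING_HOURS = {
    "institution": "Example Museum",
    "hours": {
        "tuesday": (time(10, 0), time(17, 0)),
        "wednesday": (time(10, 0), time(17, 0)),
        "saturday": (time(11, 0), time(16, 0)),
    },
}


def minutes(t):
    """Minutes since midnight for a datetime.time value."""
    return t.hour * 60 + t.minute


def usable_arrivals(record, weekday, arrivals, min_visit_minutes=60):
    """Keep the arrival times that fall within opening hours and still
    leave at least 'min_visit_minutes' before closing."""
    slot = record["hours"].get(weekday.lower())
    if slot is None:
        return []  # closed on that day
    opens, closes = slot
    return [
        a for a in arrivals
        if opens <= a and minutes(closes) - minutes(a) >= min_visit_minutes
    ]


if __name__ == "__main__":
    bus_arrivals = [time(10, 30), time(12, 15), time(15, 30)]
    for arrival in usable_arrivals(OPENING_HOURS, "Saturday", bus_arrivals):
        print(arrival.strftime("%H:%M"))
```

The point of the sketch is how little knowledge is needed: a narrow, well-defined domain such as opening hours can be exploited by a very simple agent once the data are published in a consistent structure.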
The days when a curator who wishes to hold an exhibition on the representations of Salome since the 15th century will be able to ‘load’ an agent with the request to identify, select, negotiate the loan of, and arrange the transportation of the key 100 works of art are a long way off. The fundamental descriptions of holdings are not currently available; where they exist they are not online, and they certainly have not been semantically encoded to make them usable by our Salome agent. For those who have worked on Knowledge Representation, the vision of the Semantic Web holds promise. Knowledge Representation is hard, especially if you intend any particular representation to be usable by others, either as a decision-making resource or for research purposes. Efforts in the 1980s and early 1990s, such as those in archaeology, failed. The reasons Knowledge Representation failed to achieve its promise ranged from the poor quality of knowledge extraction strategies, the lack of fundamental representation methodologies, the limited applicability of methods to knowledge domains, and the problems of boundary constraint and creep, to the high costs of developing applications. The Semantic Web could breathe new life into this earlier promise by providing ways to carve up the problem while bringing us immediate successes.
CONCLUSION
Over the next five years the possibilities offered by the Semantic Web will bring little near-term benefit for the heritage sector unless that sector co-ordinates its efforts to ensure that the fundamental building blocks that are necessary for the Semantic Web to be a success are put in place. We need some quick wins. A quick win involves identifying a domain that every institution can be encouraged to represent semantically, and the placing in the public domain of a ‘personalisable agent’ that can take advantage of these semantics. Three factors could underpin a quick win: (a) a narrowly restricted knowledge domain of real public value, (b) an accessible and narrow ontology, and (c) a personalisable tool for processing knowledge.
Ultimately, the same factors that constrain the heritage sector’s ability to take full advantage of the Web will constrain the penetration and pervasiveness of the Semantic Web in the heritage sector. The success of the Semantic Web in the heritage sector depends upon its adopting an XML-based approach and a significant experiment that demonstrates its benefits to the wider community. Even for all its weaknesses, the Semantic Web offers a tantalising solution to the problem of information overload created by the web, and the heritage sector needs to address how it can take advantage of the opportunities it offers.
Bibliography
Amann, B., Beeri, C., Fundulaki, I. and Scholl, M. 2002. Ontology-Based Integration of XML Resources, in I. Horrocks and J. Hendler (eds), The Semantic Web - ISWC 2002. Berlin: Springer, 117-131.
Berners-Lee, T., Hendler, J. and Lassila, O. 2001. The Semantic Web, in Scientific American, May 2001.
Candan, K.S., Liu, H. and Suvarna, R. 2001. Resource Description Framework: Metadata and Its Application, in SIGKDD Explorations 3(1), 6-19. http://www.acm.org/sigs/sigkdd/explorations/issue3-1/candan.pdf
Doan, A., Madhavan, J., Domingos, P. and Halevy, A. 2002. Learning to Map between Ontologies on the Semantic Web, in Proceedings WWW2002, 7-11 May 2002 (Honolulu), 662-673.
Gómez-Pérez, A. and Corcho, O. 2002. Ontology Languages for the Semantic Web, in IEEE Intelligent Systems, January/February 2002, 54-60.
Haustein, S. and Pleumann, J. 2002. Is Participation in the Semantic Web Too Difficult?, in I. Horrocks and J. Hendler (eds), The Semantic Web - ISWC 2002. Berlin: Springer, 448-453.
Heflin, J. and Hendler, J. 2001. A Portrait of the Semantic Web in Action, in IEEE Intelligent Systems, March/April 2001, 54-59.
Hendler, J. 2001. Agents and the Semantic Web, in IEEE Intelligent Systems, March/April 2001, 30-37.
Hendler, J., Berners-Lee, T. and Miller, E. 2002. Integrating Applications on the Semantic Web, in Journal of the Institute of Electrical Engineers of Japan 120(10), 676-680.
Klein, M. 2001. XML, RDF and Relatives, in IEEE Intelligent Systems, March/April 2001, 26-28.
McGuinness, D.L., Fikes, R., Hendler, J. and Stein, L.A. 2002. DAML+OIL: An Ontology Language for the Semantic Web, in IEEE Intelligent Systems, September/October 2002, 72-80.
Patel-Schneider, P. and Siméon, J. 2002a. Building the Semantic Web on XML, in I. Horrocks and J. Hendler (eds), The Semantic Web - ISWC 2002. Berlin: Springer, 147-161. http://www-dbs.cs.uni-sb.de/lehre/ss03/xml-seminar/Material/PS02.pdf
Patel-Schneider, P. and Siméon, J. 2002b. The Yin/Yang Web: XML Syntax and RDF Semantics, in Proceedings WWW2002, 7-11 May 2002 (Honolulu), 443-453.
Renear, A., Dubin, D., Sperberg-McQueen, C.M. and Huitfeldt, C. 2002. Towards a Semantics for XML Markup, in DocEng’02, 8-9 November 2002 (McLean, VA). ACM Publication.
Shah, U., Finin, T., Joshi, A., Cost, R.S. and Mayfield, J. 2002. Information Retrieval on the Semantic Web, in CIKM’02, 4-9 November 2002, 461-468.
DEVELOPMENT OF THE SEMANTIC WEB MUST BEGIN AT THE GRASS ROOTS LEVEL
AN INTERVIEW WITH JANNEKE VAN KERSEN, DUTCH DIGITAL HERITAGE ASSOCIATION, THE NETHERLANDS
By Joost van Kasteren
‘To be successful, the Semantic Web for the cultural heritage sector will have to develop from the grass roots level. A top-down approach, whereby institutions have to squeeze themselves into a certain format, is not going to work.’ Janneke van Kersen has strong views on the initiatives that are currently being undertaken to develop a Semantic Web. ‘They are not going to work for the cultural heritage institutions if you do not take into account the position that they are in. Especially not if the institutions are forced to overhaul their digitisation projects completely.’
Kersen graduated in Art History and did a postgraduate course on Historical Information Processing. Since 1999 she has been a consultant with the Dutch Digital Heritage Association (Vereniging DEN), which supports cultural heritage institutions, large and small, in developing strategies to face the digital future.
The key objectives of the DEN are to assist institutions in digitising and documenting their collections according to high quality standards, and assuring cross-domain and cross-institutional access to heritage information in a context-rich, structured environment. The methods used to realise these objectives are knowledge dissemination, best practice and standardisation. The Association propagates open standards like XML, OAI and Dublin Core (qualified).
The DEN has approximately 60 member institutions, among them most of the large heritage institutions of The Netherlands. It provides access to the databases of the member organisations through the portal http://www.cultuurwijzer.nl. The Cultuurwijzer (culture pointer) to the collections uses the Aqua Browser to search for terms in a non-hierarchical, associative way. Databases can also be accessed through subject fields based on the Dublin Core standard. Research is carried out to apply the Art and Architecture Thesaurus in a post-coordinative way, using it as an additional search aid.
Kersen: ‘The Dublin Core has some drawbacks, but it is one of the few international standards available for exchange of information. Mapped to the 5 Ws (who, where, what, when and why), it turns out to be a nice tool for interoperability across the databases of heritage institutions. Of course, we have to accept a certain kind of fuzziness and lack of precision compared with domain-specific access at the institutional level.’
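As a rough illustration of what mapping Dublin Core to the 5 Ws might look like in practice, the sketch below groups the fields of a simple record under who, where, what, when and why. The element names are those of the Dublin Core element set, but the particular assignment of elements to the five questions is an assumption made for this example (the interview does not say how DEN maps them), and the `to_five_ws` helper and sample record are hypothetical.

```python
# Assumed assignment of Dublin Core elements to the 5 Ws (illustrative only).
FIVE_WS = {
    "creator": "who",
    "contributor": "who",
    "publisher": "who",
    "coverage": "where",
    "title": "what",
    "subject": "what",
    "type": "what",
    "date": "when",
    "description": "why",
}


def to_five_ws(record):
    """Group a flat Dublin Core record under the five question words.
    Elements with no assumed mapping (e.g. 'format') are ignored."""
    result = {w: [] for w in ("who", "where", "what", "when", "why")}
    for element, value in record.items():
        facet = FIVE_WS.get(element)
        if facet is not None:
            values = value if isinstance(value, list) else [value]
            result[facet].extend(values)
    return result


if __name__ == "__main__":
    # Invented sample record.
    sample = {
        "title": "View of a harbour",
        "creator": "Unknown workshop",
        "date": "1650",
        "coverage": "Amsterdam",
        "format": "oil on panel",
    }
    for question, values in to_five_ws(sample).items():
        print(question, values)
```

The ‘fuzziness’ Kersen mentions shows up directly: several distinct elements collapse into one facet, and anything without a mapping is simply lost to the cross-database search.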
According to Kersen, a real Semantic Web is still a long way off. ‘We simply do not have the tools yet for a meaningful exchange and representation of information. XML and RDF do not provide the interoperability that is needed. On the other hand, I do not believe in developing a fundamental ontology to give meaning to information on the Net. It looks to me like the 18th-century endeavour to write an encyclopaedia that contains all the knowledge in the world. I am afraid it does not work that way. A lot of knowledge, even scientific knowledge, cannot be described in a logical way. Especially in the arts, a lot of “knowledge” is the result of heuristics and associative thinking. Apart from that, there is the practical problem that cultural heritage institutions do not have the money and the staff to describe their collections anew in a way that fits the ontology. That is just too much work.’
Kersen thinks the Semantic Web will grow gradually from the grass roots level onward. Within the Dutch Digital Heritage Association she can point at several initiatives. Considered on their own they might not seem like much, i.e. in relation to the gargantuan task of developing a Semantic Web, but when they are combined a certain pattern begins to emerge.
There is, for instance, a project, SitecheckerRDF, which will look into ways in which RDF can be used for describing content for Web-based delivery. Furthermore, several standardisation projects are running that enable the participating institutions to develop the description of their specific knowledge domain, e.g. graphic domain, religion, art history. The formal and semantic mapping schemes used in these projects will include Dublin Core, Encoded Archival Description (EAD), IMS Learning Resources Metadata Specification, Art & Architecture Thesaurus, as well as the CIDOC reference model. Kersen: ‘At the moment most of the reference terms are developed at the level of institutions, which means their use is limited to a certain domain. Or, to put it another way, every domain is developing its own dialect. Maybe the development of a combined reference scheme will be a step towards a Semantic Web.’
Another important project is the development of a scheme for description at the collection level, in order to offer a clearer and more hierarchical access to heritage collections. This project has its roots in the Dutch project for collection level description, MUSIP (Museum Inventarisation Project). The description scheme will be broadened to make it available for other heritage institutions as well, for example archives.
The description and results of these projects and the programme lines will be made available on the Web site of the Vereniging DEN, http://www.den.nl. As Cultuurwijzer is used as a proof of concept, the results will be directly accessible at http://www.cultuurwijzer.nl. (Kersen kindly invites interested parties to put questions directly to their organisation.)
Projects are always carried out in co-operation with the member organisations. Kersen: ‘We first try it ourselves until we are sure it works; a “proof of concept”, you could say. These tests are overseen by a small group of automation experts working for our member organisations. The next step is to test the method on a larger scale with some of our member organisations. A larger working group oversees these tests. Only then is the method released to our member organisations. The advantage is that smaller member organisations can ride on the experience of the larger ones. By taking a step-by-step approach we also enhance the level of commitment. You could say we are providing some order in the information chaos that exists on the Internet. A few small steps along the long road towards the Semantic Web.’
Vereniging Digitaal Erfgoed Nederland: http://www.den.nl
Cultuurwijzer: http://www.cultuurwijzer.nl
DIGICULT’S EXPERT 13 TANGLE WITH THE SEMANTIC WEB
By Michael Steemson
It was really very kind of the Moderator. He asked the cultural heritage experts: ‘What is the message that we should give our scientific writer, who will do a write-up of this meeting for the Thematic Issue?’ Their inclinations had not always been clear. But the question focused minds, and they were certain now.
‘I would put my money on the Semantic Web,’ said two of them, not quite in unison. ‘The Semantic Web is a direction; it is like North. You go North, but you never arrive and say “here it is”. This is the Semantic Web,’ said another.
The course of the debate at the Darmstadt DigiCULT Forum had not always been so direct. It had started with the Position Paper’s dismaying thought that ‘the limited understanding of information processing in the heritage sector almost makes the Semantic Web an impossibility to apply’.
It had touched on the semantics of Simeon poetry, art works of the biblical seductress Salome, weather forecasts for the northern English city of York, and the revolutionary theories of 16th-century Italian astronomer Galileo. But the experts agreed, finally, that the cultural heritage sector needed the Semantic Web, and that a good deal of education and guidance would be required to make it appreciate that need.
[Illustration: Genesis – The Creation: Birds and Fishes]
The experts numbered 13, lucky for some, at this, the third DigiCULT Forum of the European Union’s technology watchdog for cultural and scientific heritage institutions. In the previous 12 months, other forum groups had discussed authenticity and integrity for digitisation programmes and, later, digital asset management systems. Now the Darmstadt 13 (historians, language and information technology scientists, academics and publishers) were looking even further down the information autobahn, to the vision of WWW inventor, Englishman Tim Berners-Lee, who sees a new kind of automated Web that learns and understands each user’s particular requirements and delivers complete, reliable, tested information sets.
In a co-authored May 2001 Scientific American article [1], Mr Berners-Lee imagines a family facing the horrors of re-scheduling its lives around a mother’s unexpected illness. The sons and daughters rely on Semantic Web ‘agents’, small executable Web files, to search online medical records, hospital bed lists, transport timetables, doctors’ appointment books, road condition reports and home diaries to find treatment, plan travel and re-arrange personal engagements to fit the emergency.
The vision requires huge, world-wide investment in time and effort, creating countless ‘ontologies’ containing, perhaps, XML (eXtensible Mark-up Language) and RDF (Resource Description Framework) data, to which the electronic ‘agents’ could refer for understanding before applying to specially formatted Web pages for the information.
The dazzling forecast of Berners-Lee et al. is: ‘The Semantic Web will enable machines to comprehend semantic documents and data, not human speech and writings. Properly designed, the Semantic Web can assist the evolution of human knowledge as a whole.’
THE DAZZLING PROSPECTS
Dazzling it is, and the Darmstadt 13 were attracted. But they were not blinded. Moderator Dr Seamus Ross, the Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII), suggested that while the current WWW content got a lot of ‘bang’ for its development dollars, the Semantic Web needed huge, expensive content before it could work well.
Application of the Berners-Lee ideas to cultural heritage use was a long way off, he thought, and wondered: ‘Is there enough benefit from the Semantic Web in the near term to make it a realisable dream 50 years down the road?’
Italian National Research Council Applied Ontologies Laboratory director Nicola Guarino has been working on the subject for 12 years, and he knows the difficulties. He said: ‘This is the ideal view which Tim Berners-Lee has: machines which work for you, your proxy which works for you, performing these dynamic connections for the Web which preserve meaning. It is pretty ambitious, but this is his idea. I would be happier if, rather than using an automatic proxy, we could just let people establish these dynamic connections using their brain and the Web. This is already something that is not easily done.’
Austria’s Wernher Behrendt had encountered other snags. At Salzburg Research, the secretariat for the DigiCULT Forums, he co-ordinates another European Commission IST project, CULTOS (Cultural Units of Learning - Tools and Services). He conceded: ‘There is a 50-year research vision behind the issue of the Semantic Web,’ and went on: ‘but there are incremental steps that, with good utility, can be built in a reasonable time. One of the intellectual challenges is to break the vision into these manageable steps.’
LANGUAGE REPRESENTATION HITCHES
A CULTOS group had, Behrendt explained, taken one of these incremental steps and built an ontology [2] for digitised works of art. It had encountered problems with language representation, like: ‘Are there tools to support knowledge representation language? Are the users then actually able to work usefully with that? Can we incorporate the multimedia authoring component, where people who have not built the ontology themselves should then use it to combine multimedia assets with each other?’
[1] T. Berners-Lee, J. Hendler, O. Lassila: The Semantic Web. In: Scientific American, May 2001. http://www.sciam.com/2001/0501issue/0501berners-lee.html
[2] Ontology, n. 1. Philosophy: the branch of metaphysics that deals with the nature of being. 2. Logic: the set of entities presupposed by a theory. Collins English Dictionary, Third edition, Glasgow, 1991.
Cultural Units of Learning – Tools and Services (CULTOS)
CULTOS is an RTD project co-funded by the European Commission under the Information Society Technologies (IST) Programme, which will run until October 2003. The application domain of CULTOS is intertextual studies in literature and arts. The project is developing a multimedia authoring and presentation environment that allows scholars to make the different relationships between cultural works explicit, in a way that approximates contextualisation in interpretative processes. The result of the authoring processes is multimedia objects called ‘intertextual cultural threads’. These are based on ‘EMMOs’, a novel type of structured multimedia object containing expert knowledge, conforming to current and emerging standards such as XML, SMIL (with interactive extensions), MPEG-7 and RDF.
http://www.cultos.org
The Dutch had doubts, too. Dr Janneke van Kersen is an art historian with Digital Heritage Netherlands (Digitaal Erfgoed Nederland, http://www.den.nl), where an XML-based content management system is being combined with a Resource Description Framework (RDF) to join databases from several cultural heritage institutions. She told the experts: ‘I need to be assured that we will be able to build a layered structure that is equally applicable to each knowledge domain. Furthermore, I think that the cultural heritage sector is too much of a niche market to develop this.’
Her countryman, Dr Frank Nack, from the national research institute for mathematics and computer science, CWI (Centrum voor Wiskunde en Informatica, http://www.cwi.nl) in Amsterdam, works with a multimedia and human-computer interaction group. His concern: ‘Our group believes in the Semantic Web, but we needed some mechanisms to structure the information so that various groups can work with it. What we came up with was the belief that you can classify the user at a particular time. But that was simply not good enough.’
The group had found that users change their requirements widely, and these shifts were invisible to a system. ‘Humans can look at material one day, and the next day they look at the same stuff differently and describe it differently because they are in a different mood,’ he said. He characterised the problem as: ‘Now I would like to see something for my work, and now I want to be entertained. Which means I would like to access the information differently.’
He had one other worry: Webised mixed media. He said: ‘This discussion has been heavily linguistic-based, which I can understand because most people do still think of the Web as text-driven. The issue of describing various media items that are not text will, I think, very soon become important for the Semantic Web. We had better start thinking about that, too.’
THE GALILEO CONUNDRUM
The Institute and Museum of Science History in Florence, Italy, has tried to create an ontology around the works and sciences of its city’s famous son, the revolutionary astronomer, mathematician and physicist Galileo Galilei (1564-1642) [3]. But it ran into difficulties when it came to the radical changes in theory that he created.
Institute relational database expert Andrea Scotti told the Forum: ‘Galileo’s scientific theory negates another scientific theory. This negating, or development, of theories was very difficult to represent in the ontology when dealing with the time factor. That is central to historical documentation, but representational time was not part of the process available to us.’
Dr Costis Dallas, the Athens chairman of the European communication and technology group Critical Publics, thought the Florence museum’s project was very ambitious. The time argument was difficult because, ‘of course, it isn’t possible to represent time properly within a relational database’. But there were mechanisms (he mentioned software by the Virginia, US, IT group Telos [4]) that better represented issues of time.
Italian National Research Council’s Nicola Guarino chipped in: ‘The CIDOC [5] reference models have partial answers to these questions.’
Dr Dallas went on: ‘I do not believe that the whole exercise is futile, but we found that, in practice, you cannot make a subject language for everybody. It has to be for a community of users. If you provide them with a richer representation, for instance if they can know that this is a person and this person lived in a place, then users will have a much richer experience.’
[3] The museum and Web site are rich resources for the life and work of Galileo: http://galileo.imss.firenze.it
[4] Telos Corporation, Ashburn, VA, US: http://www.telos.com
[5] CIDOC: International Committee for Documentation of the International Council of Museums (ICOM-CIDOC). Forum for documentation interests of museums and related organisations; one of 25 international committees of the International Council of Museums (ICOM). http://www.willpowerinfo.myby.co.uk/cidoc/CIDOCe
CAPITALS AND ACRONYMS
The Forum moved on to a discussion of thescience and software behind the Semantic WebModerator Seamus Ross started by questioningwhether lsquokey thinkersrsquo were lsquomissing a fundamentalpoint that Web pages are dead that database-drivenWeb pages are the future that people are going tostop making Web pages and make databasesrsquo
He recalled that James A Hendler co-author withTim Berners-Lee in the Semantic Web paper and aprofessor in the Department of Computer Science atthe University of Maryland had recommended seman-tic representation as part of any Web pages Dr RosscommentedlsquoThis notion of the Semantic Web is notgoing to work with these databases Is this true or notrsquo
Bert Degenhart-Drenth, the managing director of Netherlands ADLIB Information Systems (http://www.nl.adlibsoft.com), thought it was true. But it was a practical problem that would be solved by projects like the Web Services of the Open Archives Initiative (OAI).
Open Archives Initiative (OAI) Protocol for Metadata Harvesting
The Open Archives Initiative (OAI) is a mainly US-based group of people and organisations that evolved out of a need to increase access to scholarly publications through interoperable digital repositories. Support for the OAI’s goals comes from the Digital Library Federation, the Coalition for Networked Information and from an NSF grant. One of its major achievements is an application-independent interoperability framework based on metadata harvesting, the OAI Protocol for Metadata Harvesting.
The OAI Protocol is based on the Web standards HTTP and XML, and employs Dublin Core (unqualified) as metadata standard. Heritage organisations that have systems that support the OAI Protocol can expose metadata about the content in their repository, i.e. allow service providers to harvest the data for services such as search engines. In the OAI Protocol, XML Schema is used at two levels: to define the format of responses to all OAI Protocol requests, and to define the format of metadata streams embedded in the GetRecord and ListRecords responses. In both cases, the goal is to provide a mechanism for data validation. http://www.openarchives.org
The OAI-PMH version 2.0, released in June 2002, can be found at http://www.openarchives.org/OAI/2.0/openarchivesprotocol.htm
See also: John Perkins, A New Way of Making Cultural Information Resources Visible on the Web: Museums and the Open Archives Initiative. Museums and the Web Conference 2001, http://www.archimuse.com/mw2001/papers/perkins
Open Archives Forum Project, http://www.oaforum.org
The Open Archives Forum Project is a two-year Fifth Framework Programme IST accompanying measure that will run until September 2003. The Forum is building a Web-based database on OAI-related projects, software implementations and services, and supports the information exchange between OAI user communities.
Their surveys provide good insight into the status of uptake of the OAI in Europe. For an overview of the results, see: S. Dobratz, B. Matthaei, Open Archives Activities and Experiences in Europe: An Overview by the Open Archives Forum. In: D-Lib Magazine, Vol. 9, No. 1, January 2003, http://www.dlib.org/dlib/january03/dobratz/01dobratz.html
Recently, one of their workshops, ‘Providing Access to Hidden Resources’ (Lisbon, December 2002), targeted the libraries and archives communities. Requirements, standards, best practice and solutions to interoperability problems of these communities were analysed and compared with the features provided by the OAI Protocol for Metadata Harvesting. The tutorial ‘OAI and OAI-PMH for Beginners’ and other presentations can be found at http://www.oaforum.org/workshops/lisb_programme.php
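As a rough illustration of the metadata harvesting described in the OAI information box above, the following Python sketch builds the kind of HTTP request a service provider might send to a repository. It is a minimal sketch under stated assumptions, not a harvester: the endpoint `museum.example.org` and the record identifier are hypothetical, and the helper `build_oai_request` is our own invention. Only the verb names and the `oai_dc` prefix for unqualified Dublin Core are taken from OAI-PMH version 2.0.

```python
from urllib.parse import urlencode

# Hypothetical repository endpoint, used for illustration only.
BASE_URL = "http://museum.example.org/oai"

# The six request types ("verbs") defined by OAI-PMH version 2.0.
OAI_VERBS = {
    "Identify", "ListMetadataFormats", "ListSets",
    "ListIdentifiers", "ListRecords", "GetRecord",
}


def build_oai_request(base_url, verb, **arguments):
    """Return the HTTP GET URL for a single OAI-PMH request."""
    if verb not in OAI_VERBS:
        raise ValueError(f"unknown OAI-PMH verb: {verb}")
    # The verb and its arguments travel as ordinary query parameters.
    query = urlencode({"verb": verb, **arguments})
    return f"{base_url}?{query}"


# Ask for all records as unqualified Dublin Core (prefix "oai_dc").
list_url = build_oai_request(BASE_URL, "ListRecords", metadataPrefix="oai_dc")

# Ask for one record; the identifier below is an assumed example.
record_url = build_oai_request(
    BASE_URL,
    "GetRecord",
    identifier="oai:museum.example.org:textile-42",
    metadataPrefix="oai_dc",
)

print(list_url)
print(record_url)
```

The repository would answer each request with an XML document, which the harvester could then validate against the XML schemas mentioned in the box before indexing the Dublin Core fields.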
Several members spoke of difficulties created by dynamic Web pages generated from ASP databases. Paul Miller, the UK Interoperability Focus for the University of Bath’s UKOLN (formerly the UK Office for Library Networking) project, said the UK’s Office of the e-Envoy⁶, the agency in charge of Britain’s e-Government programme, drove it with a Web site called Govtalk⁷. ‘One big ASP database,’ he called it. It carried the Government’s interoperability standard framework with a URL of ‘a great big long string, something, something, dot ASP. That is how this key piece of legislation is referred to, and next week it will be something else.’
Dr van Kersen thought the new Web Ontology Language, with the transposed acronym OWL, would help. Italian ontology researcher Nicola Guarino discussed the European Commission’s On-To-Knowledge project, its RDF tool Sesame⁸ and the Ontology Inference Layer (OIL) language.
Dr Ross described the US Defense Department’s DARPA (Defense Advanced Research Projects Agency) Agent Markup Language (DAML) programme and its DAML+OIL variant. Mr Degenhart-Drenth highlighted the importance of protocols used in Web Services, such as SOAP (Simple Object Access Protocol) and UDDI (Universal Description, Discovery and Integration), and also pointed to a SPECTRUM XML standard for museums that he helped write. Then there was the moviemakers’ audio-visual search standard, MPEG-7, and the SMIL (Synchronized Multimedia Integration Language). No one mentioned SHOE (Simple HTML Ontology Extensions), which was surprising as the discussion became more and more alphabetic and upper case.
ONTOLOGY TUTORIAL IN 800 WORDS
Nicola Guarino brought the discussion back on track. The ontology expert delivered a fascinating, impromptu, 800-word dissertation on ontology genetics.
Ontologies, he said, started because it was realised that controlled vocabularies, which worked well enough for limited periods, needed something extra to make them really useful. They needed clarification of intended meaning. This could be achieved in much the same way as dictionaries did it, by reference to other, more basic terms. This was, he said, the key point.
‘Ontologies can work if the basic terms are really used in a principled way. There is a hidden assumption here, that it is indeed possible to express the meaning in terms of a relatively small set of primitive terms.’
He explained further: ‘There are general terms that have a universal meaning. Take the term “part”, for instance, or “set”. Or take temporal relations: “before”. Suppose I have two different periods, the Renaissance and another period, and suppose I say that this period comes “before” the other one: do you exclude the case whereby the two periods overlap, or not? This is just a matter of stipulation; this is a general term that is not domain specific. I can simply stipulate exactly whether the “before” relationship between the two intervals includes the case of overlapping or not. And I can do that by means of axioms. Once you clarify the meaning of the basic terms, topological relations, mereological⁹ relations, dependence relations, these kinds of things, then you have the basic vocabulary that helps you to introduce more domain-related things. And this is what people are doing in the area of what are called “Foundational Ontologies”, and this is what I am doing. I believe this is the only way to solve the problem of semantic interoperability. So, not just controlled vocabularies, but vocabularies that are formally defined in minimal terms.’
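The point about stipulating ‘before’ can be made concrete with a small Python sketch. It is only an illustration of the idea, not anything presented at the Forum: the class `Period`, the two function names and the year ranges are all assumptions chosen for the example, and the dates are deliberately rough rather than historical claims.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Period:
    """A named time interval, given as first and last year (assumed values)."""
    name: str
    start: int
    end: int

    def __post_init__(self):
        if self.start > self.end:
            raise ValueError("a period cannot end before it starts")


def before_strict(a, b):
    """Stipulation 1: a is 'before' b only if a has ended when b begins."""
    return a.end < b.start


def before_allowing_overlap(a, b):
    """Stipulation 2: a is 'before' b if a begins first; overlap is allowed."""
    return a.start < b.start


# Illustrative year ranges only; the two periods overlap by design.
renaissance = Period("Renaissance", 1400, 1600)
other_period = Period("Another period", 1580, 1750)

print(before_strict(renaissance, other_period))            # overlap excluded
print(before_allowing_overlap(renaissance, other_period))  # overlap admitted
```

Two systems that both record ‘Renaissance before another period’ would disagree on this pair unless they had fixed, in advance and explicitly, which of the two definitions the word ‘before’ stands for. That explicit choice plays the role of the axioms mentioned above.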
Now it was clear, but would it be available to heritage institutions? Dr Ross wanted to know if fundamental ontologies of use to the heritage sector already existed. What would they be? Before any development could begin, basic classifications and methodologies would be required to form a foundation for the work.
On-To-Knowledge
On-To-Knowledge is an IST RTD project that was completed in June 2002. The project developed tools and methods for supporting knowledge management in large and distributed organisations. The technical backbone of On-To-Knowledge was the use of ontologies for the various tasks of information integration and mediation. For the project’s many results, see their tools repository, project deliverables and publications at http://www.ontoknowledge.org/pubs.html
See also the On-To-Knowledge book: ‘Towards the Semantic Web: Ontology-driven Knowledge Management’, J. Davies, D. Fensel, F. van Harmelen (eds), John Wiley, December 2002.
6 Office of the e-Envoy: http://www.e-envoy.gov.uk
7 Govtalk: http://www.govtalk.gov.uk
8 Sesame environment: http://www.ontoknowledge.org/tools/factsheet/Sesame.html
9 Mereology, n. The formal study of the logical properties of the relation of part and whole. Collins English Dictionary, Third edition, Glasgow 1991.
Italian online publisher Marco Meli, a manager of the EU’s MESMUSES (Metaphors for Science Museums) project, insisted: ‘You need a clear definition in this particular domain. What are the key words, the terms?’
Mr Guarino answered: ‘The key concepts and the key relations.’
THE ELUSIVE SEMANTIC GRAIL
By now, the Darmstadt 13 were beginning to realise they were not having much luck with their search for the Semantic Grail. Their discussion became a little tetchy.
Someone talked about ‘meta-ontology’ and another growled: ‘The term “meta” is abused.’
‘Outdated technical optimism’ was mentioned.
‘What is your alternative?’ someone else wanted to know.
‘I don’t have one.’
‘Is this the way forward?’
MESMUSES - Metaphors for Science Museums
MESMUSES is an IST RTD project that will run until July 2003. It aims at designing a general method and supporting tools to produce knowledge maps for use in self-learning environments of science museums. In the project, a knowledge map is defined as a set of related concepts and facts that is offered to learners with some guidance or suggestions on possible itineraries that they may follow to explore the knowledge space.
The method and tools developed in MESMUSES are being tested and validated by two large science museums, the Cité des Sciences et de l’Industrie in Paris and the Istituto e Museo di Storia della Scienza in Florence, which provide access to their digital catalogues. Both museums are developing knowledge maps and itineraries on different themes in Biology (Genome) and Physics (Galileo and the laws of motion).
Project Web site: http://cweb.inria.fr/Projects/Mesmuses
See also: M. Meli, Knowledge Management: a new challenge for science museums. In: Cultivate Interactive, Issue 9, 7 February 2003, http://www.cultivate-int.org/issue9/mesmuses
‘What part of the problem would that solve?’
‘We will know when we have tried that out.’
‘Here we are approaching a scary field.’
Civility and peace were restored as Frank Nack, the CWI Netherlands scientist, introduced the thought: ‘There are ontologies for art and they are very old and well crafted. There are very clear rules about why they did what they did because they have worked on them for a thousand years. What you could suggest is that we strip down to the basics for one field, say art, and apply it to all the other fields we have in cultural heritage: architecture, film, whatever, all working with very different substances.’
Seamus Ross added: ‘So we need one fundamental ontology on which we can build all the others.’
THE LUCK CHANGES
The luck of the 13 was beginning to change. The Athenian heritage informatics expert, Dr Dallas, described work among his company clients on developing ‘an upper ontology’. All the issues the Forum had been discussing, what to do about time, basic concepts, process, agents and so on, were being examined. They were beginning to develop ‘something very much like a thesaurus’ with term expansion that created sub-categories of relationships. He called it ‘generic layering’, a process that could identify the ‘generic grammar’ of relationships within a specific domain - art history, for example.
‘This is useful,’ he said. ‘This way, we can create Web systems that present an association of content for users that is meaningful to them. Let’s say, a “cultural” meaning.’
Nicola Guarino went further. He believed that the International Council of Museums’ Committee for Documentation (CIDOC) Conceptual Reference Model (CRM) was the ‘best starting point’ for the heritage community. The CIDOC CRM is the result of 10 years’ work by a standards working group. The model is under review for adoption as an International Organization for Standardization (ISO) standard.
Mr Guarino was enthusiastic: ‘I am not biased on this. I am a reviewer on the CHIOS project that supports this proposal for an ISO standard, and I am amazed by the fact that this standard is more or less principled. On the one hand, the authors really strive to get these principled things and, on the other hand, they have extensively accounted for existing practice. It is the result of a large community of work. It is not perfect, but it really is a starting point for this community.’
He affirmed Dr Ross’s delighted question: ‘So this is an ontology we can borrow?’
And the Italian expert had more to add: ‘The question before was, “how can we be sure that the principled things can really solve our problems?” I do not have a crisp answer, but I do have some evidence that even a tiny result on the foundational side has a high pay-off. So you do not need to solve all the foundational issues.’
The distinctions between an object and its role, or individual and classes of items, were delicate but, once understood, could lead to significant data improvement, he said, adding: ‘Take, for instance, the distinction between object and event. This is only one tiny distinction, but it is so fundamental that, once you understand it, you can save time in developing your own application ontology. Tiny conceptual progress does have a high pay-off. This is why I believe it is useful.’
There were still one or two doubts, but CULTOS project co-ordinator Wernher Behrendt tidied up with a daring stance: ‘Let me be a heretic for a second. How many of us have an operating system other than Windows? What I am saying is that standardisation often helps. Even if it is not the best standard, it does help get people working together. It is perfectly fair to define a standard for the world now. There will be a lot of discussion, but it will wind down to a few constructs. It is a better method of getting an ontology accepted than having ontologies mushrooming all over the place that must then be integrated.’
The Darmstadt 13 were pleased. They had a model. They had ‘Web Services’ stepping stones to work across. They weren’t going to fall into the trap of insisting just yet that the Semantic Web was important for the heritage sector, but they wanted an education process for unconvinced curators.
They needed someone to make the first ontology move. Seamus Ross suggested asking the J. Paul Getty Trust (http://www.getty.edu). They needed automated tools for testing ontologies. They did not want to be delivered into the entertainment industry, but saw benefit in what University of Florence Associate Professor Franco Niccolucci called ‘cultural entertainment’.
So, where would the experts put their money, asked Mr Behrendt: ‘On there not being any benefits in the Semantic Web for the cultural heritage sector, or there being some benefits in building such things, whatever they may be?’
Amsterdamer Frank Nack was in no doubt: ‘It is going to happen. It will probably look very different from how we imagine it right now, but it is going to happen.’ His countryman, ADLIB chief Bert Degenhart-Drenth, thought so too: ‘We have put our money there. All our applications work with XML.’ And Dr van Kersen agreed: ‘I would put my money on the Semantic Web.’
That seemed to make it game, set and match.
CIDOC CRM: ‘The Semantic Glue’
The ‘CIDOC object-oriented Conceptual Reference Model’ (CIDOC CRM) was developed by the ICOM/CIDOC Documentation Standards Group. Since September 2000, the CIDOC CRM has been under development as an ISO standard.
‘The CIDOC CRM is intended to promote a shared understanding of cultural heritage information by providing a common and extensible semantic framework to which any cultural heritage information can be mapped. It is intended to be a common language for domain experts and implementers to formulate requirements for information systems and to serve as a guide for good practice in conceptual modelling. In this way, it can provide the semantic glue needed to mediate between different sources of cultural heritage information, such as that published by museums, libraries and archives.’
http://cidoc.ics.forth.gr/what_is_crm.html
CHIOS - Cultural Heritage Interchange Ontology Standardization project
Since June 2001, the work of the CIDOC CRM Special Interest Group has been supported by CHIOS, a two-year project which receives funding from the Fifth Framework IST Programme. The CHIOS consortium forms an integral part of the CIDOC CRM Special Interest Group which, by organising shared meetings, represents the interests and requirements of the cultural heritage community to the ISO Working Group (ISO/TC46/SC4/WG9).
http://cidoc.ics.forth.gr/chios_iso.html
Genesis – The Creation, Stars and Fishes
SEMANTIC WEB TERMS AND READING LIST A-X
Compiled by Guntram Geser and Michael Steemson
In the Summary we have provided information boxes on projects, but not on the many Semantic Web standards, technologies, etc. mentioned in the Forum discussion. The projects included are only a small fraction of many ongoing activities and are related mainly to the cultural and scientific heritage community. Links to many important Semantic Web development projects can be found at their community portal, http://www.semanticweb.org
The following guide points to resources and readings on terms mentioned in the Forum Summary. It is not intended to provide a comprehensive list of Semantic Web materials. Rather, it represents different entry points and levels to this topic.
Annotation & Authoring
‘How to annotate and have fun’, http://annotation.semanticweb.org/faq/faq2, and other papers. See also http://annotation.semanticweb.org/tools, Institute for Applied Informatics and Formal Description Methods, University of Karlsruhe, Germany.
The Annotea Project, a ‘Live Early Adoption and Demonstration (LEAD)’ project of the World Wide Web Consortium (W3C): collaboration environment with shared annotations, www.w3.org/2001/Annotea
CIDOC CRM
The CIDOC Conceptual Reference Model was an important reference point in the Forum discussion (see also the information box on the CIDOC CRM in the Summary). CIDOC, the Comité International pour la Documentation, is part of the International Council of Museums (ICOM). Its Web site can be found at http://cidoc.ics.forth.gr
A report on CIDOC’s work on the CRM is provided in ‘The CIDOC Conceptual Reference Model: A Standard for Communicating Cultural Contents’ by Nick Crofts, Martin Doerr and Tony Gill, in Cultivate Interactive, Issue 9, February 2003, http://www.cultivate-int.org/issue9/chios. See also their tutorials at http://cidoc.ics.forth.gr/tutorials.html
DAML - DARPA Agent Markup Language
The Defense Advanced Research Projects Agency (DARPA) is the central research and development organisation for the US Department of Defense. Its DAML programme is developing a language and tools to facilitate Semantic Web concepts. http://www.daml.org
‘Why Use DAML?’, Adam Pease, white paper, Teknowledge, 10 April 2002, http://www.daml.org/2002/04/why.html
DAML+OIL
DAML+OIL is a product of the DARPA Joint United States/European Union ad hoc Agent Markup Language Committee. The committee created a language with the best features of SHOE, DAML, OIL and several other markup approaches. It is a Web ontology language (latest release March 2001) expected to provide a basis for future Web standards for ontologies. See http://www.daml.org/2001/03/daml+oil-index.html
OIL - Ontology Inference Layer
OIL is a proposal for a Web-based representation and inference layer for ontologies. It uses a layered approach to defining a standard ontology language. Each layer adds functionality and complexity to the previous layer. This is done such that machines that can only process a lower layer can still partially understand high-level ontologies. See http://www.ontoknowledge.org/oil
A white paper on OIL functions: ‘An informal description of Standard OIL and Instance OIL’ by the On-To-Knowledge group, led by Department of Computer Science, University of Manchester, UK, 28 November 2000, www.ontoknowledge.org/oil/downl/oil-whitepaper.pdf
Ontologies
Ontology research: Laboratory for Applied Ontology (LOA), Institute of Cognitive Sciences and Technology (ISTC), http://www.ladseb.pd.cnr.it/infor/ontology/ontology.html
Dieter Fensel, Ontologies: A Silver Bullet for Knowledge Management and Electronic Commerce. New York: Springer, 2001.
‘Ontology Infrastructure for the Semantic Web’, includes detailed subject bibliography. WonderWeb Project, Department of Computer Science, Victoria University of Manchester, UK, http://wonderweb.semanticweb.org/deliverables/documents/D15.pdf
Standard Upper Ontology (SUO): An upper ontology for data interoperability, information search and retrieval, automated inferencing and natural language processing. IEEE Standard Upper Ontology (SUO) Working Group, http://suo.ieee.org
OWL - Web Ontology Language
The Web Ontology Language is a semantic markup language for publishing and sharing ontologies on the World Wide Web. OWL is developed as a vocabulary extension of the Resource Description Framework (RDF) and is derived from the DAML+OIL Web Ontology Language. For the development of this language, see the documents of the Web Ontology (WebOnt) Working Group, http://www.w3.org/2001/sw/WebOnt
See also OWL Web Ontology Language Reference (W3C Working Draft, 31 March 2003) at http://www.w3.org/TR/owl-ref and OWL Web Ontology Language Guide (W3C Working Draft, 31 March 2003), http://www.w3.org/TR/owl-guide
For an understanding of the goals, requirements and usage scenarios for a Web ontology language, see ‘Web Ontology Language (OWL) Use Cases and Requirements’ (W3C Working Draft, 31 March 2003), http://www.w3.org/TR/webont-req
RDF - Resource Description Framework
See: Cultural Heritage Semantic Web Example & Primer, pp. 32-34
Semantic Web
‘The Semantic Web’, Tim Berners-Lee with James Hendler and Ora Lassila, Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html
‘Enhanced Science and the Semantic Web’, J.A. Hendler, in Science magazine, Volume 299, Number 5606, 24 January 2003, pp. 520-521
‘Peer-to-Peer: The Infrastructure for the Semantic Web’, Stanford University. The Semantic Web as the next evolutionary step of the Internet, http://p2p.semanticweb.org
The Semantic Web Community Portal, SemanticWeb.org, currently operated by three research groups: the Onto-Agents and Scalable Knowledge Composition (SKC) Research Group at Stanford University, the Ontobroker-Group at the University of Karlsruhe, Germany, and the Protégé Research Group at Stanford University, http://www.semanticweb.org
The W3C Semantic Web Activity Statement explains the consortium’s plans in the areas of enabling standards (driven by the RDF Core and Web Ontology Working Groups), education and outreach (RDF Interest Group), as well as co-ordination and advanced development, http://www.w3.org/2001/sw/Activity
See also Kim Veltman’s warning of what he sees to be too narrow a definition of the Semantic Web, one that will not allow the historical dimension, the richness of cultural expression, the unique and the diversity of interpretations to be adequately dealt with. Cf. K. Veltman, Challenges for a Semantic Web (July 2002), http://www.cultivate-int.org/issue7/semanticweb
SHOE – Simple HTML Ontology Extensions
SHOE was one of the first ontology-based markup languages developed for use on the World Wide Web. It is a small extension to HTML that allows Web page authors to annotate their Web documents with machine-readable knowledge. See http://www.cs.umd.edu/projects/plus/SHOE
SMIL - Multimedia on the Web
The Synchronized Multimedia Integration Language (SMIL, pronounced ‘smile’) enables authoring of interactive audiovisual presentations. SMIL is typically used for ‘rich media’/multimedia presentations which integrate streaming audio and video with images, text or any other media type. SMIL is an HTML-like language and may be written using a simple text editor. W3C Synchronized Multimedia, http://www.w3.org/AudioVideo
SPECTRUM-XML DTD
SPECTRUM (Standard Procedures for Collections Recording Used in Museums) was created by the mda (http://www.mda.org.uk). It is a guide to good practice for museum documentation that describes procedures for documenting objects and the processes they undergo, as well as the necessary information that needs to be recorded to support the procedures. For SPECTRUM, an XML Document Type Definition has been produced which serves as a system-neutral interchange format for museum data.
‘SPECTRUM: The UK Museum Documentation Standard’ is available in its second edition; see http://www.mda.org.uk/spectrum.htm
For a description of the creation, structure and deployment of the SPECTRUM-XML DTD, see: Bert Degenhart-Drenth, Building on the mda SPECTRUM-XML DTD for Collections Management Data Interchange. Museums and the Web Conference 2001, http://www.archimuse.com/mw2001/papers/degenhart/degenhart.html
Web Services
Although the concept of Web Services is not a new one, the definition of Web Services standards allows the wider use of these services. Currently, three main protocols are used in the context of Web Services: UDDI (Universal Description, Discovery and Integration), a registry system to find resources and Web Services; WSDL (Web Services Description Language), an interface description language; and SOAP (Simple Object Access Protocol), the communication protocol for Web Services. All three protocols are based on XML.
For a description of the many ways in which XML can enhance Web Services, see http://www.w3.org/2002/ws/Activity
World Wide Web
Looking back as well as into the future: ‘Weaving the Web: The Original Design and Ultimate Destiny of the World Wide Web by Its Inventor’, Tim Berners-Lee with Mark Fischetti, Harper San Francisco, 1999.
XML – eXtensible Markup Language
See: Cultural Heritage Semantic Web Example & Primer, pp. 27-30
For the work done at W3C within the XML activity, see XML Working Groups, http://www.w3.org/XML; the XML Specifications can be found at http://www.w3.org/XML/Core
‘The bane of my existence is doing things that I know the computer could do for me.’ – Dan Connolly, The XML Revolution, October 1998, http://www.nature.com/nature/webmatters/xml
SEMANTIC WEB SHOULD BE BASED ON WELL-FOUNDED ONTOLOGIES
AN INTERVIEW WITH NICOLA GUARINO, LABORATORY OF APPLIED ONTOLOGY, ITALY
By Joost van Kasteren
‘The Semantic Web, as it is advocated by people like Tim Berners-Lee and James Hendler, does not take enough advantage of the experience built up in knowledge engineering and conceptual modelling. There is this anarchistic idea of the Web as a place where everyone can do his or her own thing. I have no problem with that; a lot of people are able to find what they want on the Web. But if you want real interoperability, with search engines that can grasp the intended meaning of information, that approach falls short. To create a real Semantic Web, we have to develop and use well-founded generic ontologies based on linguistics and logics.’
Nicola Guarino has clear views on the Semantic Web and its development. He is a senior researcher at the Institute for Cognitive Sciences and Technologies in Italy, where he leads the Laboratory for Applied Ontology. Since 1991, he has played an active role in the Artificial Intelligence community in promoting the interdisciplinary study of ontological foundations of knowledge engineering and conceptual modelling. Guarino: ‘In our Laboratory, the focus is on content and not so much on representation. The use of ontologies is unavoidable when referring to content. People do it implicitly all the time when they are communicating and trying to understand each other. If we want machines to understand each other, in other words, real interoperability, we need to make these ontologies explicit in an unambiguous way.’
An ontology is a hierarchical description of the relations between concepts in a certain domain, plus an unambiguous description of the concepts themselves. As they are created for a certain domain, ontologies often fail to be interoperable because of the ambiguity that results from the use of the same terms for different concepts (and vice versa) between different domains. The term ‘net’, for instance, has quite a different meaning for Web designers and fishermen. That is why there is a need for well-founded generic ontologies. An example of what a generic ontology has to pin down is the term ‘part’, which can have different meanings both within a domain and between domains. For instance, the violist plays a part in the orchestra. His finger is part of him. Can his finger be part of the orchestra? According to Guarino, this is a genuine ontological problem that can only be solved by giving an unambiguous meaning to the term ‘part’.
Another example cited by Guarino is the term ‘in’. What exactly are you describing when you say the spoon is in the cup? Does it mean that the spoon is totally embedded in the cup, or is it only partly in the cup? Guarino: ‘These examples seem trivial, but if you want real interoperability between different knowledge domains, you will have to prevent the problems that come with the ambiguity of day-to-day language.’
In this respect, Guarino thinks it is a drawback that computer science curricula scarcely ever contain an introduction to ontological foundations of conceptual modelling. ‘Students learn all about Java, HTML and C++, and name all the other languages, and they also learn how to use these. But when they graduate, they hardly know a thing about formal ontology. I really think people should know more about the work on ontology that has been done in philosophy. It is certainly not much harder to acquire than, say, studying differential equations or learning how to use Java.’
It seems as if it is an enormous job to develop well-founded generic ontologies, but it is not as enormous a task as it appears. Guarino: ‘I would say that a few dozen would get you on the way nicely. But you have to take the fundamental route. At the moment, development of the Semantic Web is driven by the need for short-term results. Hence, interoperability is realised by putting the right tags on the information. That is not what I call semantics; that is syntax. XML and RDF are very useful for this, but they fall short when you want to create a real Semantic Web.’
Laboratory of Applied Ontology: http://ontology.ip.rm.cnr.it
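Guarino’s ‘part’ example from the interview can be sketched in a few lines of Python. This is only an illustrative toy under our own assumptions, not a formal mereology and not Guarino’s method: the relation names `COMPONENT_OF` and `MEMBER_OF`, the helper `holds_transitively` and the facts themselves are invented for the example. It shows how a single, undifferentiated ‘part’ relation lets a program conclude that the finger is part of the orchestra, while two explicitly distinguished relations do not.

```python
# Assumed toy facts, each written as (part, whole).
COMPONENT_OF = {("fingertip", "finger"), ("finger", "violist")}  # bodily parts
MEMBER_OF = {("violist", "orchestra")}                            # group membership

# A naive vocabulary that calls both kinds of link simply "part".
NAIVE_PART_OF = COMPONENT_OF | MEMBER_OF


def holds_transitively(x, y, relation):
    """Return True if y can be reached from x by chaining pairs of `relation`."""
    frontier = [x]
    seen = set()
    while frontier:
        current = frontier.pop()
        for part, whole in relation:
            if part == current and whole not in seen:
                if whole == y:
                    return True
                seen.add(whole)
                frontier.append(whole)
    return False


# With one ambiguous relation, the unwanted conclusion follows.
print(holds_transitively("finger", "orchestra", NAIVE_PART_OF))

# With the two meanings kept apart, chaining stays within one meaning.
print(holds_transitively("finger", "orchestra", COMPONENT_OF))
print(holds_transitively("fingertip", "violist", COMPONENT_OF))
```

The design choice here, treating component-of as transitive but never chaining it with member-of, is one possible stipulation among several. The interview’s point is that some such stipulation has to be made explicitly before two systems can exchange statements about ‘parts’.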
A CULTURAL HERITAGE SEMANTIC WEB EXAMPLE & PRIMER
By Guntram Geser
The Semantic Web, according to a statement of its well-known advocate Tim Berners-Lee, is a vision of ‘a distributed machine which should function so as to perform socially useful tasks’.¹ This machine should allow intelligent software agents to understand semantic relationships between Web resources in order to seek relevant information and perform transactions for humans.
Contrasted with the existing human-readable Web, the Semantic Web is envisaged as a Web of machine-readable data that will be based on ‘languages for expressing information in a machine processable form’.² Key to an understanding of the Semantic Web, therefore, is how these languages function, how information is expressed in order that computers can automatically process Web sources and assist in making the Web more useful for humans. The aim of this chapter is to provide an overview of the Semantic Web concept by describing its general architecture, i.e. the interplay of its languages.
The chapter has two interrelated parts. Part 1 describes a Finnish project that strives to build the foundations for the “Finnish Museums on the Semantic Web” (FMS), a future semantic museum portal. This part consists of the information boxes on the following pages, which briefly describe the necessary elements and steps in the set-up of the FMS system. It is recommended to start by reading this description (see also graphic 3 on page 36, which provides an overview of the set-up of the FMS system). It should be helpful in gaining a general understanding of how semantic interoperability of, and new ways of interacting with, semantically marked-up cultural heritage information can be realised.
Part 2, the texts below the information boxes, is a primer that explains terms used in part 1 which represent core elements of the Semantic Web architecture, as well as providing illustrative examples. The explanations are not intended to give in-depth definitions of these elements; such definitions are provided in the relevant W3C specifications. The examples have been kept as simple as possible, but build on each other. In this way, we will develop a (fictitious) Website, http://www.m-i.org, that provides semantically enhanced access to such marvellous medieval images as the ones we have used to illustrate this Thematic Issue.
How to Make Collection Metadata of Museums Semantically Interoperable on the Web – The “Finnish Museums on the Semantic Web” (FMS)
The Semantic Web concept is visionary, and there are dedicated people, also in the heritage sector, who are trying to make it a reality. In our example, a group of researchers and technology developers who work at the University of Helsinki and the Helsinki Institute for Information Technology are translating the Semantic Web vision for a future semantic museum portal.
The group’s two-year project will run until spring 2004 and is being carried out in co-operation with, and with funding from, major organisations including Espoo City Museum, Helsinki University Museum, National Board of Antiquities, Nokia, TietoEnator and the National Technology Agency (TEKES).
1 Tim Berners-Lee: Interpretation and Semantics on the Semantic Web (1998), http://www.w3.org/DesignIssues/Interpretation.html
2 Tim Berners-Lee: Semantic Web Road Map (1998), http://www.w3.org/DesignIssues/Semantic.html
3 Robert DuCharme, http://lists.xml.org/archives/xml-dev/200211/msg00190.html
The major goals of the project are to make collection metadata, which stem from heterogeneous databases, semantically interoperable on the Web, and to provide facilities for semantic browsing and searching in the combined knowledge base of the participating museums.
The project’s vision is called “Finnish Museums on the Semantic Web” (FMS), and its architecture allows for all Finnish museums to join in. However, in an approach of starting small but ambitious, the project is at present using the collection databases of two museums, the Espoo City Museum and the National Museum of Finland. Furthermore, the implementation is currently restricted to only one part of the collections: textiles.
In order to reach the FMS system’s goal of making the museums’ metadata semantically interoperable on the Web, the data must be harmonised on the syntactic and semantic level. For this harmonisation the Extensible Markup Language (XML) and the Resource Description Framework (RDF) are being used, of which RDF is the key language for achieving semantic interoperability of the heterogeneous sets of metadata.
RDF and Metadata - A Natural Fit
An observer of the diffusion of the Resource Description Framework (RDF) into various domains, Robert DuCharme, has commented: ‘I still find it a little ironic that while RDF has gotten so much publicity as a technology for warm and fuzzy AI (Artificial Intelligence) pie-in-the-sky technology, it’s gotten most of its traction in the mundane world of metadata.’3
Yet, given the importance of metadata for the Semantic Web vision in general, it does not come as a surprise that metadata of key information communities belong to the first of RDF’s intended uses. RDF seems to gain momentum in particular among the library and other communities that use Dublin Core.
The current W3C RDF Primer (Working Draft, 23 January 2003), edited by Frank Manola and Eric Miller, labels RDF as ‘an ideal representation for Dublin Core information’ and describes Dublin Core as one of their ‘RDF in the field’ examples (cf. http://www.w3.org/TR/rdf-primer/).
At the Dublin Core Metadata Initiative (DCMI), ‘Expressing Simple Dublin Core in RDF/XML’ was announced as a DCMI Recommendation in October 2002, ‘the first in a series of recommendations for encoding Dublin Core metadata using mainstream Web technologies’, i.e. XML, RDF and XHTML. ‘Expressing Qualified Dublin Core in RDF/XML’ is currently a Proposed Recommendation. Cf. http://dublincore.org/groups/architecture/
Syntactic Transformation 1: Creating the XML Documents
In the FMS system the Extensible Markup Language (XML) is used as the data transfer format. This transfer format enables the system to make use of the data originally stored in the museums’ heterogeneous collection databases. Therefore each museum participating in the FMS initiative provides the relevant collection data as an XML document repository.
In a process of syntactic harmonisation, the data from a museum’s collection database are retrieved and transformed to an XML format conforming to the XML Schema of the FMS initiative.
The data to be published are read from the database through a ‘view’, which helps create the XML format. The view is a queryable interface, a virtual table that results from an SQL query which may join multiple tables of the database. Through the view the data are queried so that the rows of the tables are grouped by collection items. For each item, the set of rows is combined into a single XML document.
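The rows-to-XML step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the FMS implementation: the in-memory SQLite database, the table name items and the helper names are hypothetical, and each item is assumed to arrive as a single row of the view, whereas a real view may return several rows per item that have to be grouped first. The view name ITEM_VIEW, the column names and the sample values are taken from this primer.

```python
import sqlite3
import xml.etree.ElementTree as ET

MI_NS = "http://www.m-i.org/images"  # fictitious namespace used in this primer
FIELDS = ["type", "subject", "iconclass", "creator", "manuscript", "place", "year"]


def demo_connection():
    """Build a throwaway in-memory database exposing ITEM_VIEW (hypothetical set-up)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE items (item_id TEXT, type TEXT, subject TEXT, iconclass TEXT, "
        "creator TEXT, manuscript TEXT, place TEXT, year TEXT)"
    )
    conn.execute(
        "INSERT INTO items VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
        ("image5kb78d38i", "Column Miniature", "Eve emerges from Adam's body",
         "71A3421", "Alexander Master", "Historic Bible", "Utrecht", "circa 1430"),
    )
    conn.execute("CREATE VIEW ITEM_VIEW AS SELECT * FROM items")
    return conn


def rows_to_xml_documents(conn):
    """Query the view and return {item_id: XML text}, one document per item."""
    ET.register_namespace("mi", MI_NS)  # serialise with the mi: prefix
    query = "SELECT item_id, " + ", ".join(FIELDS) + " FROM ITEM_VIEW ORDER BY item_id"
    documents = {}
    for item_id, *values in conn.execute(query):
        root = ET.Element(f"{{{MI_NS}}}image", {"image_id": item_id})
        for name, value in zip(FIELDS, values):
            ET.SubElement(root, f"{{{MI_NS}}}{name}").text = value
        documents[item_id] = ET.tostring(root, encoding="unicode")
    return documents


documents = rows_to_xml_documents(demo_connection())
print(documents["image5kb78d38i"])
```

Building the elements through a library rather than by string concatenation means that special characters in the database values are escaped automatically.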
XML
XML is a markup language for describing data. It is a language created to allow anyone to design the structure of their own documents. An XML document contains text that consists of markup in the form of tags and plain text between them, the latter being just pure information (for example <creator>Alexander Master</creator>). XML tags are not predefined; everyone can define his or her own tags.
FMS Documents
A short presentation is provided in Eero Hyvönen et al.: Cultural Semantic Interoperability on the Web. Case Finnish Museums Online, http://iswc2002.semanticweb.org/posters/hyvonen_a4.pdf; for detailed descriptions see Vilho Raatikka, Eero Hyvönen: Ontology-based Semantic Metadata Validation, and Hyvönen, Eero et al.: Semantic Interoperability on the Web. Case Finnish Museums Online. Both texts can be found in: Towards the Semantic Web and Web Services. Proceedings of the XML Finland 2002 Conference, http://www.cs.helsinki.fi/u/eahyvone/xmlfinland2002/ProceedingsXML2002-final.pdf
ITEM_VIEW (columns of the database view): item_id, type, subject, iconclass, creator, manuscript, place, year
XML shares the syntax and bracketed tags of the well-known HyperText Markup Language (HTML), but XML serves a different goal. While HTML is used to define the layout of pages on the WWW, XML is used to define the content of documents, for example to specify that an area of text is the name of a creator.
XML allows for creating markup (e.g. <creator>) that seems to carry some semantics. However, for a computer a tag like <creator> carries as much semantics as a tag like <H1>. A computer simply does not know what a creator is and how the concept creator is related to other concepts (e.g. manuscript). For an XML processor, <H1> and <creator> or <manuscript> are all equally (and totally) meaningless. XML is all about describing data; on its own it does not do anything. There needs to be a processing program that uses the markup to interpret the various elements.
The graphic below illustrates the database rows to XML process as described in the info box. It provides a very simple example of an XML document that describes some data for one of the medieval column miniatures from the Koninklijke Bibliotheek, The Hague, which we were permitted to use for illustrating this Thematic Issue. It includes the Iconclass classification for this image, 71A3421 Eve emerges from Adam’s body (for the hierarchical path of this classification see the section on ontologies).
Short explanations for the XML document (image5kb78d38i.xml) shown in the graphic below:
A well-formed XML document is one that conforms to the XML syntax rules, of which we would like to highlight the following:
(1) The document should begin with the XML declaration, which defines the XML version and the character encoding used in the document (the declaration is optional in XML 1.0, but recommended). In the example below we use <?xml version="1.0" encoding="ISO-8859-1"?>, i.e. the document conforms to the 1.0 specification of XML and uses the ISO-8859-1 (Latin-1, West European) character set.
(2a, 10) The XML document must contain a single tag pair to define a root element, in our example <mi:image> </mi:image>.
Graphic 1: Database rows to XML process
Database rows (item data from the database rows, grouped by item):
‘image5kb78d38i’, ‘Column Miniature’, ‘Eve emerges from Adam’s body’, ‘71A3421’, ‘Alexander Master’, ‘Historic Bible’, ‘Utrecht’, ‘circa 1430’
Rows to XML process
XML document (image5kb78d38i.xml):
(1)  <?xml version="1.0" encoding="ISO-8859-1"?>
(2)  <mi:image xmlns:mi="http://www.m-i.org/images"
         image_id="image5kb78d38i">
(3)    <mi:type>Column Miniature</mi:type>
(4)    <mi:subject>Eve emerges from Adam’s body</mi:subject>
(5)    <mi:iconclass>71A3421</mi:iconclass>
(6)    <mi:creator>Alexander Master</mi:creator>
(7)    <mi:manuscript>Historic Bible</mi:manuscript>
(8)    <mi:place>Utrecht</mi:place>
(9)    <mi:year>circa 1430</mi:year>
(10) </mi:image>
(2b) Namespace: Since element names in XML are not fixed, name conflicts can occur when different documents use the same names describing different types of elements. To prevent such conflicts, a unique namespace should be defined using a Uniform Resource Identifier (URI). An XML namespace is a collection of names that are used as element types and attribute names (cf. http://www.w3.org/TR/REC-xml-names). Our namespace declaration in the start tag of the root element is xmlns:mi="http://www.m-i.org/images".
The namespace prefix mi (for medieval images) functions as a placeholder for the namespace name. It needs to show up in all element tags (e.g. <mi:type> </mi:type>).
(2c) image_id="image5kb78d38i": This is the id attribute, which contains the unique identifier of the data source record.
(3-9) All other elements must be within the root element and can themselves have sub-elements (child elements), which must be properly nested within their parent element. Our elements do not have sub-elements.
Other syntax rules are, for example: all start tags must match their end tags, and because XML tags are case sensitive (i.e. the tag <mi:Creator> is different from the tag <mi:creator>) they must also be written with the same case; all elements must have a closing tag; all attribute values must be within quotation marks (e.g. "image5kb78d38i").
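To see these syntax rules at work, the example document can be handed to an XML parser. The sketch below uses Python's standard xml.etree.ElementTree module; the function name is_well_formed and the prefix map NS are our own illustrative choices, and the apostrophe in the subject line is written as a plain ASCII character.

```python
import xml.etree.ElementTree as ET

# The example document, given as bytes so that the declared encoding applies.
DOC = b"""<?xml version="1.0" encoding="ISO-8859-1"?>
<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">
  <mi:type>Column Miniature</mi:type>
  <mi:subject>Eve emerges from Adam's body</mi:subject>
  <mi:iconclass>71A3421</mi:iconclass>
  <mi:creator>Alexander Master</mi:creator>
  <mi:manuscript>Historic Bible</mi:manuscript>
  <mi:place>Utrecht</mi:place>
  <mi:year>circa 1430</mi:year>
</mi:image>"""

# Maps the prefix used in our queries to the namespace name.
NS = {"mi": "http://www.m-i.org/images"}


def is_well_formed(data):
    """Return True if the parser accepts the document, False otherwise."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False


root = ET.fromstring(DOC)
print(root.get("image_id"))                        # the id attribute (2c)
print(root.findtext("mi:creator", namespaces=NS))  # text of a child element

# Tags are case sensitive: <mi:Creator> closed by </mi:creator> is rejected.
BROKEN = b'<mi:image xmlns:mi="http://www.m-i.org/images"><mi:Creator>x</mi:creator></mi:image>'
print(is_well_formed(DOC), is_well_formed(BROKEN))
```

Note that the parser only checks well-formedness; as explained above, it attaches no meaning to an element such as creator.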
Syntactic Transformation 2: The XML Schema
In order to allow for syntactic harmonisation, the XML documents of the museums should conform to the XML Schema of the FMS initiative. Therefore the museums use the initiative’s XML Schema when they create their XML documents, validating them against the Schema. If the documents are valid, the process can continue to the semantic level.
XML Schema
The XML Schema defines the building blocks of an XML document, including:
| elements and attributes that can appear in a document,
| which elements are child elements, as well as their order and number,
| whether an element is empty or can include text,
| the data types for elements and attributes,
| as well as default and fixed values for elements and attributes.
XML with an XML Schema is designed to be self-descriptive. One of the greatest strengths of XML Schema is that it allows for data typing. The most common data types are xs:string, xs:decimal, xs:integer, xs:boolean, xs:date and xs:time. In the example below, which is the XML Schema for the XML document (image5kb78d38i.xml) shown in graphic 1, we only use the data type xs:string. This data type is used for values that contain character strings.
(1)   <?xml version="1.0"?>
(2a)  <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
(2b)             targetNamespace="http://www.m-i.org/images"
(2c)             xmlns="http://www.m-i.org/images"
(2d)             elementFormDefault="qualified">
(3)     <xs:element name="image">
(4)       <xs:complexType>
(5)         <xs:sequence>
(6)           <xs:element name="type" type="xs:string"/>
(7)           <xs:element name="subject" type="xs:string"/>
(8)           <xs:element name="iconclass" type="xs:string"/>
(9)           <xs:element name="creator" type="xs:string"/>
(10)          <xs:element name="manuscript" type="xs:string"/>
(11)          <xs:element name="place" type="xs:string"/>
(12)          <xs:element name="year" type="xs:string"/>
(13)        </xs:sequence>
(14)        <xs:attribute name="image_id" type="xs:string" use="required"/>
(15)      </xs:complexType>
(16)    </xs:element>
(17)  </xs:schema>
Short explanations
(1) The XML declaration, which states that the document conforms to the 1.0 specification of XML.
(2a) Determines that the elements and data types that are used to construct the schema come from the W3C’s XML Schema namespace. Consequently, each of the elements and data types in the schema has the prefix xs:, which identifies them as belonging to the vocabulary of the XML Schema language rather than the vocabulary of our (fictitious) organisation M-i.org.
(2b) Indicates that the elements defined by this schema come from our http://www.m-i.org/images namespace.
(2c) Is our default namespace.
(2d) Demands that any elements used by an XML document which were declared in this schema must be namespace qualified.
(3, 16) The parent element for the data typing of the image descriptions we provide at http://www.m-i.org/images. It is (4, 15) defined as an xs:complexType, i.e. it contains child elements (6-12), which are (5, 13) surrounded by an xs:sequence element that defines an ordered sequence of these elements.
(6-12) The child elements, which in our example are simple types because they do not contain other elements. They define various elements of our XML documents, e.g. “creator”, to be of the data type xs:string.
(14) Furthermore, the Schema determines that for the element declared with xs:element name="image" there is a required attribute image_id of the data type xs:string (for example image5kb78d38i).
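Python's standard library has no XML Schema validator, so the following sketch does not validate against the schema itself. It hand-codes the three constraints the schema above expresses (root element, ordered sequence of text-only child elements, required image_id attribute) to show what validation amounts to for this document. The function name check_image_document and the wording of the messages are hypothetical; a production system would use a real XML Schema processor instead.

```python
import xml.etree.ElementTree as ET

MI_NS = "http://www.m-i.org/images"
# The ordered xs:sequence of child elements declared in lines (6)-(12).
EXPECTED_CHILDREN = ["type", "subject", "iconclass", "creator",
                     "manuscript", "place", "year"]

VALID = b"""<?xml version="1.0" encoding="ISO-8859-1"?>
<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">
  <mi:type>Column Miniature</mi:type>
  <mi:subject>Eve emerges from Adam's body</mi:subject>
  <mi:iconclass>71A3421</mi:iconclass>
  <mi:creator>Alexander Master</mi:creator>
  <mi:manuscript>Historic Bible</mi:manuscript>
  <mi:place>Utrecht</mi:place>
  <mi:year>circa 1430</mi:year>
</mi:image>"""


def check_image_document(xml_bytes):
    """Return a list of problems; an empty list means all checks passed."""
    try:
        root = ET.fromstring(xml_bytes)
    except ET.ParseError as err:
        return [f"not well-formed: {err}"]
    problems = []
    if root.tag != f"{{{MI_NS}}}image":
        problems.append("root element must be image in the m-i.org namespace")
    if "image_id" not in root.attrib:
        problems.append("required attribute image_id is missing")
    found = [child.tag for child in root]
    expected = [f"{{{MI_NS}}}{name}" for name in EXPECTED_CHILDREN]
    if found != expected:
        problems.append("child elements differ from the declared sequence")
    if any(len(child) > 0 for child in root):
        problems.append("child elements must contain text only")
    return problems


print(check_image_document(VALID))
```

An empty list corresponds to the case in which, in the FMS process, the document is valid and can proceed to the semantic level.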
Major Benefits of XML
In the context of the Semantic Web, XML provides an interoperable syntactical foundation upon which solutions to the issues of representing relationships and meaning can be built. We also want to highlight the many benefits of XML that are adding to its rapid uptake in the first place and might, in the longer term, be supportive in realising the Semantic Web vision on a broader scale.
XML is one of the most important standards developments in recent years. It is an international, universal, non-system and non-application specific data exchange standard. XML is international because it employs Unicode. This means that there is no restriction to the western alphabet, but Arabic, Chinese, Greek, Hebrew, Thai etc. can be easily integrated.
XML is non-system specific because it is an open standard set by the World Wide Web Consortium (W3C). As such there is no owner of XML. All the major software suppliers support it; it can be used on any computing platform, from Windows and MacOS to Linux. This makes it easier for organisations to change systems or combine different systems.
XML is also non-application specific, i.e. it can be used in various applications such as data exchange, data harvesting, Web site management etc. XML is gaining ever-wider acceptance in many application domains, including in particular the cultural heritage community.
Bear in mind also that the major collection management software producers have implemented support for XML in their systems, enabling for example the integration of data from different collections and their combination over the Web.
XML allows for multi-channel publishing, i.e. with XML it is easy to produce different products or services from digital cultural heritage assets. Once the data are structured in XML, they can be displayed across a variety of media using an associated style sheet that contains the display information.
Finally, XML can be used to create new languages. For example the Wireless Markup Language (WML), which is used to mark up Internet applications for handheld devices, is written in XML.
Ontological concepts
The goal of the FMS project is to make metadata of the museums’ textiles collections semantically interoperable on the Web. In order to achieve such interoperability, an ontology is being designed that describes the common (lower-level) ontological concepts in this domain of knowledge.4
Ontology
In the Semantic Web architecture the semantic relationships are not embedded but explicitly represented by an ontology, or rather an interrelated set of ontologies. In fact, the wide array of information residing on the Web and the perceived need to make it more machine-processable have acted as a strong impetus for the development of ontology languages.
Yet what is an ontology? An abstract definition of an ontology is that it describes a formal, shared conceptualisation of a particular domain of interest, for example cultural heritage objects held in art museums. In particular, an ontology allows for constraining, expressing and analysing the intended meaning of the shared vocabulary of concepts and relations in a domain of knowledge.5
If these concepts and relations are formalised to a high degree, the domain has at hand a major building block for developing semantically aware information systems. With Semantic Web technologies, the domain ontology can be made available on the network, cross-referenced with upper-level and other domain ontologies,6 and remote applications or intelligent software agents can refer to it when they interact to provide a certain information service.
4 The available documents (in English) on the FMS initiative state that their ontology is being created using RDF Schema (RDFS). To develop a fully fledged ontology, advanced languages such as DAML+OIL or Web Ontology Language (OWL) would be required.
5 For more elaborate and formal descriptions see Tom Gruber: What is an Ontology? (1995), http://www-ksl.stanford.edu/kst/what-is-an-ontology.html; Nicola Guarino: Ontology-Driven Conceptual Modelling, part 1-3 (2002), http://ontology.ip.rm.cnr.it/Tutorials
The degree of formalisation of concepts and their relations varies considerably between different domains of knowledge. At the lower end one finds lexicons and simple taxonomies (i.e. an ordered classification system where terms are related hierarchically). At the middle level one might place thesauri, i.e. controlled vocabularies that are structured to show relationships between terms and concepts and, for example, allow for retrieving them from a database. At the high end of formalisation of knowledge there are axiomatised logic theories. Such theories include rules to ensure the well-formedness and logical validity of statements expressed in the language of the scientific discipline.
In the cultural heritage sector, a powerful IT-supported example of a hierarchical classification system is Iconclass. This supports the documentation of images, in particular art historical images, by providing a systematic collection of 28,000 ready-made definitions of objects, persons, events, situations and abstract ideas that can be the subject of an image. The definitions consist of an alphanumeric classification code and its textual correlate.7
For example, for the image we have described in XML in graphic 1, the Iconclass definition is 71A3421 Eve emerges from Adam’s body. The Medieval Illustrated Manuscripts Website of the Koninklijke Bibliotheek, The Hague, has an Iconclass Browser in place8 that provides the hierarchical path for this concept in the classification system.
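For a plain notation such as 71A3421, the hierarchical path can be read off the code itself, because each additional character narrows the concept by one level. The short Python sketch below illustrates this; the labels are the ones given for this path in this chapter, while the function name is our own and the prefix rule is a simplifying assumption that does not cover the more elaborate Iconclass notations (for instance those with bracketed text).

```python
# Labels for the path of 71A3421, as listed in this chapter.
ICONCLASS_LABELS = {
    "7": "Bible",
    "71": "Old Testament",
    "71A": "Genesis from the creation to the expulsion from paradise, "
           "and later years of Adam and Eve",
    "71A3": "creation of man; the Garden of Eden",
    "71A34": "creation of Eve",
    "71A342": "Eve is fashioned from Adam's rib",
    "71A3421": "Eve emerges from Adam's body",
}


def hierarchical_path(notation):
    """Return all broader notations, from the top level down to the notation itself."""
    return [notation[:length] for length in range(1, len(notation) + 1)]


for code in hierarchical_path("71A3421"):
    print(code, ICONCLASS_LABELS[code])
```

A search system can use such a path to let a query for a broader concept, for example 71A34 creation of Eve, also retrieve images classified under the narrower codes.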
An outstanding example of a controlled vocabulary is the Art & Architecture Thesaurus (AAT), one of the Getty Research Institute’s Vocabulary Databases. It is a structured vocabulary of more than 125,000 terms and other information about concepts that are used for describing fine art, architecture, decorative arts, archival materials and material culture.9
The W3C’s Resource Description Framework (RDF) Schema Specification 1.0, in a section on its scope, mentions concept navigation and states: ‘Thesauri and library classification schemes are well known examples of hierarchical systems for representing subject taxonomies in terms of the relationships between named concepts. The RDF Schema specification provides sufficient resources for creating RDF models that represent the logical structure of thesauri (and other library classification systems).’10
Yet for realising a full-blown cultural heritage ontology for the Semantic Web there are currently limitations on both sides. On the one hand, hierarchical classification systems and structured vocabularies do not lend themselves easily to rich inter-linking of conceptual ‘trees’.
A major step further in this direction is the CIDOC object-oriented Conceptual Reference Model (CRM).11 This provides an ontology of 81 classes and 130 properties, which describes in a formal language concepts and relations relevant to the documentation of cultural heritage.
On the other hand, RDF Schema has limitations when it comes to expressing complex ontological relationships. New languages based on description logics are being developed. These include DAML+OIL and the upcoming Web Ontology Language (OWL), which are capable of fully describing ontologies.
Also worth highlighting is that tools for ontology building are proliferating at the present time. These ontology editors need to be carefully assessed, as their capabilities differ considerably.12
7 Bible
71 Old Testament
71A Genesis from the creation to the expulsion from paradise, and later years of Adam and Eve
71A3 creation of man; the Garden of Eden (Genesis 1:26-2)
71A34 creation of Eve
71A342 Eve is fashioned from Adam’s rib
71A3421 Eve emerges from Adam’s body
6 Upper-level ontologies describe the basic concepts and relationships invoked when information about any domain is expressed in natural language.
7 For in-depth information see the official Iconclass Website: http://www.iconclass.nl
8 Medieval Illustrated Manuscripts Website: http://www.kb.nl/kb/manuscripts/browser. The subject access system for the Website was conceived by Mnemosyne Partners, building on the Iconclass classification system and technologies. See the valuable information they provide at httpwwwmnemosyneorg businessmsstempexhtml
9 A detailed description of the AAT is provided at http://www.getty.edu/research/tools/vocabulary/aat/about.html
10 http://www.w3.org/TR/2000/CR-rdf-schema-20000327
11 See the information box at the end of the Forum discussion and the sources mentioned in the Semantic Web Terms and Reading List.
12 Michael Denny: Ontology Building: A Survey of Editing Tools (06-11-2002), http://www.xml.com/pub/a/2002/11/06/ontologies.html
Semantic Transformation 1: The RDF Data Model
Progressing towards semantic interoperability, the metadata in the XML documents are now transformed into RDF statements corresponding to the RDF data model. With these so-called RDF ‘triples’, the XML metadata elements are mapped to the RDF classes and properties which are defined by the RDF Schema of the FMS initiative (see section 3).
‘XML is nothing more than a way to standardize data formats. This is not to underplay XML’s importance. A data-format standard makes all of the more glamorous technologies possible, and RDF is the leading example of the benefit that comes once the data format has been standardized. Many proclaim that RDF is really the XML’s killer app, and with good reason. Despite all this, RDF remains somewhat obscure. This is mainly because at its core, RDF is very abstract, very dry and very academic.’
Uche Ogbuji: An introduction to RDF (2000), http://www-106.ibm.com/developerworks/library/w-rdf/?dwzone=xml
RDF Data Model
In order to make Web resources semantically interoperable, we need resources that provide machine-understandable information about themselves. In the Semantic Web architecture these statements are built by using the Resource Description Framework (RDF). RDF defines a data model for the statements, describing typed relationships between uniquely identified resources. RDF distinguishes between:
| resources: familiar examples are a Web page, electronic document or digital image, but in RDF also entities that are not ‘network retrievable’, e.g. museums, curators or bound medieval manuscripts, can be resources;
| properties: these identify a specific aspect, characteristic, attribute or relation used to describe the resource;
| statements: these associate a value for a named property with the resource.
Hence, RDF provides a model for describing relationships between resources in terms of named properties and values. The RDF data model intrinsically supports only binary relations. Its base element is the ‘triple’, which takes the form of subject, predicate, object: a resource (the subject) is linked to another resource (the object) through an arc labelled with a third resource (the predicate). The semantics of a triple clearly depends on the property used as predicate.
A convenient way to visualise this is to draw nodes for subject and object and an arrow between them for the predicate (see graphic 2). In this labelled, directed graph, subject and predicate (property) are Uniform Resource Identifiers (URIs), and the object is either a URI or a literal (which is drawn as a box). Everything in RDF can be represented by a graph with nodes and arcs, and the data model allows for using the same URI as a node and as an arc label. To represent RDF statements in a machine-processable way, RDF builds on XML. With RDF/XML, a specific XML markup language, RDF information can be represented and exchanged between machines.
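The triple model is simple enough to be sketched with ordinary Python tuples. The example below is an illustration only, not an RDF library: URIs are plain strings, literals are wrapped in a small Literal class of our own, and the instance URI for the image as well as the '#' separator in the m-i.org URIs are assumptions made for the sake of the example.

```python
from dataclasses import dataclass

RDFS = "http://www.w3.org/2000/01/rdf-schema#"
MI = "http://www.m-i.org/schemas/images#"            # assumed form of our namespace
IMAGE = "http://www.m-i.org/images/image5kb78d38i"   # hypothetical instance URI


@dataclass(frozen=True)
class Literal:
    """A literal value (drawn as a box in the graph), as opposed to a URI."""
    value: str


# Each statement is a (subject, predicate, object) triple.
triples = [
    (MI + "ColumnMiniature", RDFS + "subClassOf", MI + "Miniature"),
    (MI + "Miniature", RDFS + "subClassOf", MI + "Image"),
    (IMAGE, MI + "hasCreator", Literal("Alexander Master")),
]


def graph_nodes(statements):
    """Nodes of the labelled directed graph: every subject and every object."""
    return {s for s, _, _ in statements} | {o for _, _, o in statements}


def objects_of(statements, subject, predicate):
    """Follow all arcs labelled `predicate` that start at `subject`."""
    return [o for s, p, o in statements if s == subject and p == predicate]


print(len(graph_nodes(triples)))
print(objects_of(triples, IMAGE, MI + "hasCreator"))
```

Because every statement has the same three-part shape, statements from different sources can be combined simply by pooling the triples.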
Graphic 2: RDF Data Model
Triple 1
Subject: http://www.m-i.org/schemas/images#ColumnMiniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Miniature
Triple 2
Subject: http://www.m-i.org/schemas/images#Miniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Image
With these two triples we state that ColumnMiniature is a subclass of Miniature and that Miniature is a subclass of Image.
The predicate in our statements is the rdfs:subClassOf property, which is predefined in the RDF Schema namespace http://www.w3.org/2000/01/rdf-schema#
The classes Image, Miniature and ColumnMiniature would also need to be defined in our RDF Schema namespace http://www.m-i.org/schemas/images#
For details on how to use RDF Schema for defining your domain ontology, see section 3. In a nutshell, RDF Schema is a ‘higher-level’ language which is itself defined using RDF.
Semantic Transformation 2: Creating and Validating the RDF Statements
In the mapping process, an editor tool (in the FMS case a tool developed by the project team, called Meedio) receives as input the XML documents and assists in transforming them into semantically valid RDF statements (instance descriptions). The tool serves as instance editor, which provides a convenient way of finding and selecting from the XML metadata elements correct instance values for a particular property. The editor tool also serves as semantic metadata validator. When the museum cataloguer saves the set of RDF statements corresponding to an XML document, the semantics of these statements are validated against the property constraints of the FMS ontology. The result of a successful mapping and validation process is a unique set of RDF triples, called the RDF card.
RDF Mapping Rules
When RDF is used to define the meaning of XML metadata elements, a set of mapping rules is created. A mapping rule is a template of RDF triples where XPath expressions are used to identify the actual element values. XPath is a language for addressing parts of an XML document and was designed for use in XML parsing software (XSLT, XPointer and others).
When applying such a rule to an XML document, the XPath expressions are instantiated with matching element values. If the rule matches, the RDF template evaluates to a set of RDF triples where XPath expressions are substituted by the corresponding values of the XML elements. For example, by applying the template rule <image5kb78d38i/type, mi:hasCreator, image5kb78d38i/creator> to the XML document described in the section on XML, the following result would be obtained: <Image Miniature, mi:hasCreator, ‘Alexander Master’>
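The mechanics of such a template rule can be sketched as follows. This is a hedged illustration, not the rule syntax of the FMS editor tool: a rule is modelled as a tuple of two path expressions and a predicate name, the paths use the limited XPath subset supported by Python's xml.etree.ElementTree, and the function name apply_rule is hypothetical. The subject of the resulting triple is simply whatever text the type element holds in the sample document.

```python
import xml.etree.ElementTree as ET

NS = {"mi": "http://www.m-i.org/images"}

DOC = b"""<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">
  <mi:type>Column Miniature</mi:type>
  <mi:creator>Alexander Master</mi:creator>
</mi:image>"""

# (path to the subject value, predicate, path to the object value)
RULE = ("mi:type", "mi:hasCreator", "mi:creator")


def apply_rule(root, rule):
    """Instantiate the template with element values; return None if it does not match."""
    subject_path, predicate, object_path = rule
    subject = root.findtext(subject_path, namespaces=NS)
    obj = root.findtext(object_path, namespaces=NS)
    if subject is None or obj is None:
        return None  # one of the paths found no element, so the rule does not apply
    return (subject, predicate, obj)


root = ET.fromstring(DOC)
print(apply_rule(root, RULE))
```

Keeping the rules as data, separate from the code that applies them, is what allows one set of rules to be reused for every document that conforms to the same XML Schema.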
mi:hasCreator is an example of an RDF property. Such properties are explained in section 3, RDF Schema.
Note: Due to the limited space permitted, we do not address issues of term mapping. This is an important aspect of the mapping and validation process carried out in the FMS project. Working with metadata from different museums, they need to deal with partly different terminologies. Their technical solution to synonymous terms (i.e. different terms referring to the same concepts) is to attach synonym sets to the FMS ontology classes. With situations where polysemous terms occur (i.e. the same terms refer to different concepts) the editor tool cannot cope, and the cataloguer needs to select the correct interpretation.
Semantic Transformation 3: The RDF Schema (RDFS)
The shared ontology for the textiles domain is created by using Resource Description Framework Schema (RDFS). An RDF Schema is a tool for indicating the classes of resources one wants to describe, as well as for defining the properties used to describe those resources. Furthermore, class/sub-class relationships and property/sub-property relationships can be defined. The museums are mapping their metadata to the classes and properties defined by the RDF Schema of the FMS initiative. Thereby they are making the meaning of the metadata explicit and representing them in a harmonised, uniform way.
RDF Schema
In section Semantic Transformation 1 we have described the data model provided by RDF for expressing statements about Web resources. But we also need a vocabulary for the RDF statements, namely classes and properties defined with RDF Schema (RDFS).
In brief, the RDF Schema mechanism provides a pre-defined vocabulary, a basic type system that can be used in creating domain-specific schemas. Its role is to allow for declaring metadata properties (e.g. for ‘type’, ‘subject’ or ‘creator’), to define the classes of resources they may be used with, to restrict possible combinations, and to detect violations of those restrictions.
Defining classes
With RDF Schema (RDFS), Web resources can be defined as instances of one or more classes. In addition, classes can be organised in a hierarchical fashion. As we hold a collection of digital images drawn from illustrated medieval manuscripts, we first need to define a class of things that are images. In RDF Schema, a class is any resource having an rdf:type property whose value is the RDFS-defined resource rdfs:Class.
So, using the basic RDF data model, we define:
mi:Image [resource] rdf:type [property] rdfs:Class [value]
The self-defined prefix mi (for medieval images) stands for the URI reference of our RDF Schema namespace, http://www.m-i.org/schemas/images#
In our image collection we have various special kinds of digitised images, such as column miniatures, decorated initials, schematic drawings etc. To distinguish, for example, the miniatures, first we need to define a general class Miniature and subclasses of miniatures, e.g. a subclass ColumnMiniature:
mi:Miniature rdf:type rdfs:Class
mi:ColumnMiniature rdf:type rdfs:Class
Secondly, we need to define that mi:ColumnMiniature is a subclass of mi:Miniature and that mi:Miniature is a subclass of mi:Image, for which we use the predefined rdfs:subClassOf property:
mi:Miniature rdfs:subClassOf mi:Image
mi:ColumnMiniature rdfs:subClassOf mi:Miniature
As the rdfs:subClassOf property is transitive, this means that mi:ColumnMiniature is also implicitly a subclass of mi:Image.
Graphic 2 on page 32 visualises this with the nodes and arcs of the basic RDF data model.
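The effect of transitivity can be made concrete with a few lines of Python. The sketch below stores only the two explicitly stated rdfs:subClassOf relations and derives the implicit one; the dictionary layout and the function name superclasses are illustrative choices, not part of RDF Schema.

```python
# Explicitly stated rdfs:subClassOf relations: class -> direct superclasses.
SUBCLASS_OF = {
    "mi:ColumnMiniature": {"mi:Miniature"},
    "mi:Miniature": {"mi:Image"},
}


def superclasses(cls, subclass_of):
    """Return all stated and inferred superclasses of cls (transitive closure)."""
    seen = set()
    stack = list(subclass_of.get(cls, ()))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            # Follow the chain upwards: superclasses of a superclass also apply.
            stack.extend(subclass_of.get(parent, ()))
    return seen


print(sorted(superclasses("mi:ColumnMiniature", SUBCLASS_OF)))
```

The seen set also keeps the walk from looping forever should a schema accidentally contain a cycle of subclass statements.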
Defining properties
In order to make the meaning of our metadata (i.e. ‘type’) explicit, we need to be capable of declaring specific properties that characterise the classes of things we hold at http://www.m-i.org, e.g. digital images of medieval column miniatures.
Basically, RDF Schema defines properties in terms of the classes of resources to which they apply. This is the role of the rdfs:domain and rdfs:range mechanisms.
rdfs:range
The range constraint defines the class or set of classes whose instances can be values of a particular property. If we want to define the property mi:hasType, we must describe this resource (which we locate at http://www.m-i.org/schemas/images) with an rdf:type property whose value is rdf:Property:
mi:hasType [resource] rdf:type [property] rdf:Property [value]
The following RDF statements indicate that mi:ColumnMiniature is a class, mi:hasType is a property, and RDF statements using the mi:hasType property have instances of mi:ColumnMiniature as values:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:range mi:ColumnMiniature
rdfs:domain
The domain constraint restricts the set of classes whose instances may have a particular property attached to them. If we want to indicate that the property mi:hasType applies to instances of class mi:ColumnMiniature, we would write:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:domain mi:ColumnMiniature
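How a tool might check statements against such domain and range declarations can be sketched as follows, treating both as constraints in the way this primer describes them. Everything beyond mi:hasCreator, mi:Image and mi:ColumnMiniature is a hypothetical name introduced for the illustration (the classes mi:Creator and mi:Place, the instance identifiers and the function check_statement), and the instance types are assumed to already include inherited superclasses.

```python
# Hypothetical schema: mi:hasCreator links images to creators.
SCHEMA = {
    "mi:hasCreator": {"domain": "mi:Image", "range": "mi:Creator"},
}

# Classes each instance belongs to (superclasses already included).
TYPES = {
    "mi:image5kb78d38i": {"mi:ColumnMiniature", "mi:Miniature", "mi:Image"},
    "mi:AlexanderMaster": {"mi:Creator"},
    "mi:Utrecht": {"mi:Place"},
}


def check_statement(statement, schema, types):
    """Return a list of constraint violations for one (subject, property, value) triple."""
    subject, prop, value = statement
    constraints = schema.get(prop)
    if constraints is None:
        return [f"{prop} is not a declared property"]
    problems = []
    if constraints["domain"] not in types.get(subject, set()):
        problems.append(f"{subject} is outside the domain of {prop}")
    if constraints["range"] not in types.get(value, set()):
        problems.append(f"{value} is outside the range of {prop}")
    return problems


good = ("mi:image5kb78d38i", "mi:hasCreator", "mi:AlexanderMaster")
bad = ("mi:image5kb78d38i", "mi:hasCreator", "mi:Utrecht")
print(check_statement(good, SCHEMA, TYPES))
print(check_statement(bad, SCHEMA, TYPES))
```

This is the kind of check that the semantic metadata validator described in the info box performs when a cataloguer saves a set of statements.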
Benefits of RDF
In a SearchWebServices.com definition of RDF some benefits of RDF are mentioned:
| ‘By providing a consistent framework, RDF will encourage the providing of metadata about Internet resources.
| Because RDF will include a standard syntax for describing and querying data, software that exploits metadata will be easier and faster to produce.
| The standard syntax and query capability will allow applications to exchange information more easily.
| Searchers will get more precise results from searching, based on metadata rather than on indexes derived from full text gathering.
| Intelligent software agents will have more precise data to work with.’13
This is a well-crafted listing of RDF benefits, from provision and exchange of better metadata to agents working with them, hopefully for the benefit of humans. But, as explicitly stated by SearchWebServices.com, these are only potential benefits, i.e. they depend on the level of actual uptake of RDF.
13 whatis.com / searchWebServices.com: Definitions - Resource Description Framework, http://searchwebservices.techtarget.com/sDefinition/0,,sid26_gci213545,00.html (last updated July 27, 2001)
Generating and Using the Knowledge Space
The RDF cards represent the original XML documents at the semantic level. The union of such RDF cards constitutes a knowledge base, which is a harmonised semantic representation of the underlying heterogeneous databases.
However, so far the RDF instance descriptions have not left the museum. The museum has complete control of the information it wants to publish, and it does not need to allow the FMS system access to its internal database system. The RDF data are placed in a public directory on the museum’s WWW server.
The Web crawler of the FMS system harvests the instance descriptions from the different museums, and the system combines them into an RDF repository. This repository is a large semantic graph that consists of the shared ontology and metadata.
How does a user now search and navigate in this knowledge space? In the FMS system this is implemented by a server-side software called Ontogator. Based on the semantic graph, this software dynamically generates semantic linkages for the user’s Web browser.
One way of using the FMS system is view-based filtering. The user can select classes of resources from the ontology, and the system finds the instances that match the selected class restrictions. By constraining classes (views) further, the collection instance data searched for are eventually found.
The software also supports topic-based navigation by providing semantic links between topics of interest, the creation of which is based on the collection domain ontology and the related metadata of the collection records. This means that the links also provide the user with an impression of the wider context and pragmatics of the objects in the museums’ collections.
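The combination step can be pictured as a set union over triples. The sketch below is an assumption-laden illustration, not the FMS implementation: the museum keys, the fms: class and property names, and the item identifiers are all hypothetical, and the harvesting itself (fetching files from each museum's public directory) is replaced by ready-made sets.

```python
# Sketch: merging harvested RDF instance descriptions into one repository.
# All names below are hypothetical stand-ins for real harvested data.

shared_ontology = {
    ("fms:Textile", "rdfs:subClassOf", "fms:Artefact"),
    ("fms:Tool", "rdfs:subClassOf", "fms:Artefact"),
}

# What a crawler might have collected from each museum's public directory.
harvested = {
    "museum-a": {
        ("a:item1", "rdf:type", "fms:Textile"),
        ("a:item1", "fms:placeOfUse", "fms:Helsinki"),
    },
    "museum-b": {
        ("b:item9", "rdf:type", "fms:Tool"),
        ("b:item9", "fms:placeOfUse", "fms:Helsinki"),
    },
}


def build_repository(ontology, harvested_by_source):
    """Union the shared ontology with every museum's instance triples."""
    repository = set(ontology)
    for triples in harvested_by_source.values():
        repository |= triples
    return repository


repository = build_repository(shared_ontology, harvested)
print(len(repository))
```

Because every museum describes its objects against the same ontology, the merged set forms one graph without any per-museum mapping step.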
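View-based filtering as described here amounts to intersecting the instance sets selected by each view. The following is a rough sketch under stated assumptions: every view hierarchy (including places) is modelled with rdfs:subClassOf for simplicity, and all fms:, a: and b: names are hypothetical. It is not a description of how Ontogator works internally.

```python
# Sketch: view-based filtering over a small set of triples.
triples = {
    ("fms:Textile", "rdfs:subClassOf", "fms:Artefact"),
    ("fms:Tool", "rdfs:subClassOf", "fms:Artefact"),
    ("fms:Helsinki", "rdfs:subClassOf", "fms:Finland"),
    ("fms:Turku", "rdfs:subClassOf", "fms:Finland"),
    ("a:item1", "rdf:type", "fms:Textile"),
    ("a:item1", "fms:placeOfUse", "fms:Helsinki"),
    ("b:item9", "rdf:type", "fms:Tool"),
    ("b:item9", "fms:placeOfUse", "fms:Turku"),
}


def subclasses_of(cls, graph):
    """Return cls together with all of its direct and indirect subclasses."""
    found = {cls}
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for s, p, o in graph:
            if p == "rdfs:subClassOf" and o == current and s not in found:
                found.add(s)
                frontier.append(s)
    return found


def filter_by_views(selections, graph):
    """Return resources that satisfy every (property, class) selection."""
    result = None
    for prop, cls in selections:
        allowed = subclasses_of(cls, graph)
        matching = {s for s, p, o in graph if p == prop and o in allowed}
        result = matching if result is None else result & matching
    return result if result is not None else set()


broad = filter_by_views([("rdf:type", "fms:Artefact")], triples)
narrow = filter_by_views(
    [("rdf:type", "fms:Artefact"), ("fms:placeOfUse", "fms:Helsinki")],
    triples,
)
print(sorted(broad))
print(sorted(narrow))
```

Adding the second selection narrows the result from both items to the single item used in Helsinki, which mirrors how constraining views further leads to the instances sought.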
From human users to software agents
As described in the Finnish Museums on the Semantic Web example, the RDF repository is a large semantic graph that consists of the shared ontology and metadata of the participating museums. Such a repository can be queried, and the results, a set of pointers to the relevant resources, can be accessed using Web browsers. The opportunities provided by a system like the one developed by the FMS initiative (e.g. topic-based navigation) are at present restricted primarily to human users.
The Semantic Web vision includes intelligent software agents which ‘understand’ semantic relationships between Web resources and seek relevant information as well as perform transactions for humans. [14]
This software would be capable of autonomous action, i.e. it could run without direct human control or constant supervision, and ideally it would be very flexible in doing this. Characterisations of this flexibility include actions that are ‘reactive’, ‘proactive’ and ‘social’ (see below).
While the basic idea of agents is very intuitive and appealing, the actual theory is complex, the tools are immature, and the solutions are small and prototype-based. In fact, as a parallel, distributed systems technology, agents belong to the most complex class of software technology.
However, this primer will conclude with a summary of what an intelligent software agent is and what such software would generally be capable of doing. This should also serve as an indication of how great the challenge for research and technological development is to make the full Semantic Web vision a reality.
Intelligent Software Agents
The following definitions are taken from Michael Wooldridge’s introduction to multiagent systems. [15]
Agent
‘An agent is a computer system capable of autonomous action in some environment.’
Intelligent agent
‘An intelligent agent is a computer system capable of flexible autonomous action in some environment.’
Flexible autonomous action
‘By flexible autonomous action we mean reactive, proactive, social.’
| Reactivity: ‘A reactive system is one that maintains an ongoing interaction with its environment, and responds to changes that occur in it (in time for the response to be useful).’
| Proactiveness: ‘An agent serves a purpose and therefore exhibits goal-directed behaviour, including the capacity to recognise opportunities for useful courses of action.’
| Social ability: ‘Social ability in agents is the ability to interact with other agents (and possibly humans) via some kind of agent communication language, and perhaps cooperate with others.’
Desirable further properties of agents are:
| Mobility: the ability to move around an electronic network.
| Rationality: an agent will act in such a way that it does not prevent itself from achieving its goals (as far as this is possible with a limited set of beliefs representing its world knowledge).
| Learning: an agent will improve its performance over time.
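To make the three facets of flexible autonomous action a little more concrete, here is a deliberately tiny sketch. It is not taken from Wooldridge's book; the class SketchAgent, its fields and its message format are all hypothetical, and real agent platforms involve far more (communication languages, planning, concurrency).

```python
from dataclasses import dataclass, field


@dataclass
class SketchAgent:
    """Toy agent; every name and behaviour here is hypothetical."""
    name: str
    goal: str
    inbox: list = field(default_factory=list)
    log: list = field(default_factory=list)

    def step(self, percept=None, peers=()):
        # Reactivity: respond to a change in the environment first.
        if percept is not None:
            self.log.append(f"react:{percept}")
        # Social ability: read messages sent by other agents.
        while self.inbox:
            sender, text = self.inbox.pop(0)
            self.log.append(f"heard:{sender}:{text}")
        # Proactiveness: with nothing to react to, work towards the goal
        # and ask peers for help.
        if percept is None:
            self.log.append(f"pursue:{self.goal}")
            for peer in peers:
                peer.inbox.append((self.name, f"help with {self.goal}?"))


a = SketchAgent("a", "find timetable")
b = SketchAgent("b", "book hotel")
a.step(peers=[b])
b.step(percept="timetable changed")
print(a.log)
print(b.log)
```

In this run agent a acts on its own goal and messages agent b, while b reacts to an environmental change and then processes the message it received.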
14 T. Berners-Lee, J. Hendler, O. Lassila: Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html
15 M. Wooldridge: An Introduction to Multiagent Systems. Chichester: Wiley 2002, and http://www.csc.liv.ac.uk/~mjw/pubs/imas
Resources
In the primer no references are made to the documents of the World Wide Web Consortium (W3C). All relevant W3C recommendations can be found at http://www.w3c.org
Of the wealth of introductory materials on XML and RDF available on the Web, the following in particular are useful to consult for further details:
http://www.w3schools.com/xml
http://www.w3schools.com/schema
http://www.w3.org/TR/rdf-primer
K. S. Candan, H. Liu, R. Suvarna: Resource Description Framework: Metadata and its Applications. In: SIGKDD Explorations, Vol. 3/1 (2001), 6-19
Pierre-Antoine Champin: RDF Tutorial (2001), http://www710.univ-lyon1.fr/~champin/rdf-tutorial/rdf-tutorial.html
S. Decker, M. P. Mitra, S. Melnik: Framework for the Semantic Web: An RDF Tutorial, http://www.ida.liu.se/~asmpa/courses/sweb/rdf/rdf_tutorial.pdf
Graphic 3: Set-up of the Finnish Museums on the Semantic Web System (The Finnish Museums on the Semantic Web: Overview of the System’s Set-up). Components shown: collection databases 1 to n (relational schemas, DBMS); XML repositories (XML Schema, syntactic interoperability); metadata editor; RDF instances (RDF Schema, semantic interoperability); Web crawler; RDF database holding the semantic graph (knowledge space of shared ontology and metadata); server with the Ontogator software; user client (WWW browser) offering topic-based navigation and view-based filtering.
THE DARMSTADT FORUM PARTICIPANTS
Wernher Behrendt, Salzburg Research, Austria
Wernher Behrendt is a Senior Researcher at the Sun Technology and Research Excellence Center at Salzburg Research (Austria), working on multimedia middleware and interoperation issues. He holds an MSc in Cognitive Science from Manchester University and has more than 10 years’ experience in near-to-market IT research. From 1989 to 1995 he was a Senior Research Associate in the Informatics Department at Rutherford Appleton Laboratory (UK), working on embedded knowledge based systems in distributed multimedia presentation systems. From 1995 to 1998 he was a Senior Research Associate at Cardiff University (UK), working on interoperation between heterogeneous information systems. Mr Behrendt has held courses in Computer Science and has worked on projects ranging from software engineering methods and quality assurance to legacy system reengineering using migration methods and distributed systems middleware.
E-mail: wernher.behrendt@salzburgresearch.at
Paolo Buonora, Archivio di Stato di Roma, Italy
Paolo Buonora holds a degree in Philosophy from the University of Rome ‘La Sapienza’ (1976). He worked in the Italian State Archive Administration from 1978, where he was first involved in editing the Guida Generale degli Archivi di Stato italiani. From 1986 he worked in the Soprintendenza archivistica per il Lazio, surveying audiovisual archives and municipal archives; from 1989 to 1991 he was at the Perugia University, engaged in doctoral research in ‘Urban and rural history’; and from 1991 to 1994 he was again in the Soprintendenza archivistica per il Lazio. After 1994 he worked in the Archivio di Stato di Roma, where he was responsible for the photograph service and several working groups on informatics application in archival documentation. From 1997 until the present time he has planned and directed the Imago II project in the Archivio di Stato di Roma.
See httpwwwasrmarchivbeniculturaliitsidimagoIMAGOIIenhtml
Samuel Cruz-Lara, University of Nancy 2 and LORIA, France
Samuel Cruz-Lara obtained a Master’s degree in Computer Science in 1984 (University Henri Poincaré, Nancy 1) and a PhD degree in Computer Science in 1988 (National Polytechnic Institute of Lorraine). The central topic of his PhD thesis was the generation of integrated development environments by using attribute grammars. He is currently Associate Professor at the University of Nancy 2 (Institute of Technology, Computer Science Department) and permanent Researcher at LORIA (Lorraine Laboratory for Research in Computer Science and its Applications – UMR 7503 – CNRS – INRIA – Universities of Nancy). He is a member of the ‘Language and Dialogue’ team and has conducted several research activities on distributed software architectures and textual linguistic resources management. He is currently working in the context of distributed architectures and multimedia resources management. Dr Cruz-Lara has participated in several projects, in particular CNRS/SILFIDE and MLIS-ELAN, and he is at present co-leader of the ‘Digital Museum’ project. This is a joint project between LORIA and the National Chi-Nan University. The ‘Digital Museum’ project is sponsored by the National Science Council of the Republic of China (Taiwan) and supported by INRIA (France).
Costis Dallas, Critical Publics SA, Greece
Costis Dallas is Chairman and Senior Researcher of Critical Publics (http://www.criticalpublics.com), a group of companies active in the field of strategic communications, creative design and technology. He is currently a Lecturer in the Department of Communication and Mass Media of Panteion University and has over 15 years of research and professional experience in hypermedia applications, human factor issues and cultural information systems. Dr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA, co-founder and Chair of the Multimedia Working Group of the International Council of Museums (CIDOC/ICOM), Head of Documentation and Systems of the Benaki Museum, General Director of the Foundation of the Hellenic World, Special Secretary of the Greek Ministry of Education in charge of libraries, archives and instructional technologies, and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: info@criticalpublics.com
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (http://www.nl.adlibsoft.com), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that he joined Databasix, producer of the ADLIB software. In 1991 Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was renamed later as ADLIB Information Systems. ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include:
CIMI Dublin Core test bed: ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database, in addition to the ‘standard’ HTML/XML access.
SPECTRUM-XML: a project of the UK mda together with a team of software vendors to produce a schema for exchange of data which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software.
Internet Gelderse Musea (http://www.igem.nl) and Maritiem Digitaal (http://www.maritiemdigitaal.nl): two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.
E-mail: bert@nl.adlibsoft.com
Nicola Guarino, Italian National Research Council, Italy
Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991, and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS) and associated editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: Nicola.Guarino@ladseb.pd.cnr.it
Janneke van Kersen, Digital Heritage Association, The Netherlands
Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992 she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing, and more recently she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University. Since October 1999 Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata, and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations such as museums, archives, archaeological organisations, monuments and special collections of libraries in a context-rich and XML-based environment. Interoperability from both a technical and an organisational perspective is therefore a major issue. The Association offers access to heritage information to the general public at http://www.cultuurwijzer.nl and customised access for the educational field at http://www.cultuurwijs.nl
E-mail: jannekevankersendennl
Marco Meli, EDW International, Italy
Marco Meli is CEO and co-founder of EDW International (http://www.edw-international.com), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia and VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meli@edw.it
Paul Miller, UKOLN, UK
Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, http://www.ukoln.ac.uk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK’s Further and Higher Education Funding Councils and Resource, the Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI) and the Metadata Working Group of the UK Government’s Office of the e-Envoy.
E-mail: P.Miller@ukoln.ac.uk
Frank Nack, CWI, The Netherlands
Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on ‘The Application of Video Semantics and Theme Representation for Automated Film Editing’ at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling, and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document, and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: Frank.Nack@cwi.nl
Franco Niccolucci, Florence University, Italy
Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference.
See also http://www.geog.port.ac.uk/hist-bound/people/niccolucci.htm
Seamus Ross, HATII/University of Glasgow, UK
Dr Seamus Ross is Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Network) (IST-2001-32706), a European Union funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously he was Assistant Secretary for Information Technology at the British Academy, and before that worked for a company specialising in expert systems and software development as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group’s PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: S.Ross@hatii.arts.gla.ac.uk
Andrea Scotti, Institute & Museum for the History of Science, Italy
Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995 he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.
Since 1996 Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza and the National Central Library, and under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (01/02/01-30/07/03), funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies.
See http://cweb.inria.fr/Projects/Mesmuses
E-mail: scottian@galileo.imss.firenze.it
DIGICULT PROJECT INFORMATION
DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over the period of 30 months (03/2002-08/2004).
In order to encourage early take-up, DigiCULT produces seven Thematic Issues and three Technology Watch Reports, along with the newsletter DigiCULT.Info.
DigiCULT draws on the results of the strategic study ‘Technological Landscapes for Tomorrow’s Cultural Economy (DigiCULT)’ that was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications) in 2000 and completed in 2001.
Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at http://www.digicult.info
For further information on DigiCULT please contact the team of the project co-ordinator:
Mr Guntram Geser, guntram.geser@salzburgresearch.at, Phone: +43-(0)662-2288-303
Mr John Pereira, john.pereira@salzburgresearch.at, Phone: +43-(0)662-2288-247
Salzburg Research Forschungsgesellschaft, Jakob-Haringer-Str. 5/III, A - 5020 Salzburg, Austria
Phone: +43-(0)662-2288-200
Fax: +43-(0)662-2288-222
http://www.salzburgresearch.at
Project Partner:
HATII - Humanities Advanced Technology and Information Institute, University of Glasgow
http://www.hatii.arts.gla.ac.uk
Contact: Mr Seamus Ross, s.ross@hatii.arts.gla.ac.uk
The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom
DigiCULT Thematic Issue 1 – Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum, held in Barcelona on May 6th, 2002, in the context of the DLM-Conference 2002.
DigiCULT Thematic Issue 2 – Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum, held in Essen, Germany, on September 3rd, 2002, in the context of the AIIM Conference DMS EXPO.
DigiCULT Thematic Issue 3 – Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum, held on January 21st, 2003, at Fraunhofer IPSI, Darmstadt, Germany.
DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum on Learning Objects that will take place at the Koninklijke Bibliotheek – National Library of the Netherlands, The Hague, on July 2nd, 2003.
IMPRINT
This Thematic Issue is a product of the DigiCULT Project (IST-2001-34898).
Authors:
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy
Images:
Images for this Thematic Issue have been provided by and are reproduced with kind permission of the Koninklijke Bibliotheek – National Library of the Netherlands, The Hague, Netherlands.
Graphics & Layout:
Jan Steindl, Salzburg Research
ISBN 3-902448-00-8
Printed in Austria
© 2003
IMAGES
Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I. Image on p. 5 from Fol. 264r, size 80x75, illuminator: Orosius Master a.o.
Bible Historiale, Paris, c. 1320-1340, Volume I. Image on p. 7 from Fol. 2r, size 45x50, illum.: ‘Sub-Fauvel’ Master
Historic Bible, Utrecht, c. 1430, Volume I. Images on pp. 8, 10, 14, 28 from Fol. 3r, 3v, 4v, 7v, size 55x85 to 60x85, illum.: Alexander Master o.a.
Psalter, Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468. Images on pp. 11, 13: Wednesday Matins, Invitatorium, from Fol. 219r
Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350. Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron) from Fol. 104rb1, 106rb2, 124r, 111ra, size 40x55 to 50x55
Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335. Image on p. 21 from Fol. 4va1, size 45x55
Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460. Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41
Psalter, Normandy, c. 1180. Image on p. 26 from Fol. 3v, size 160x125
Breviary, Cambray (?), c. 1275-1300. Images on pp. 27, 29, 31, 33, 34, 35, 37 from Fol. 211r, size 205x135
© Koninklijke Bibliotheek, The Hague. Used with permission.
and exhibitions that not only display objects and simple descriptions (drawn from metadata), but also allow for understanding relationships between objects (created by semantically interrelated metadata). The Semantic Web community promises to assist in achieving this goal, but the challenge for the heritage institutions would be to first implement the necessary data infrastructure.
The challenge for the Semantic Web expert round table was, or at least the DigiCULT Secretariat thought it was, not to run into a debate between ‘theory’ and ‘practice’; in other words, between what academic Semantic Web scholars and what practitioners from heritage institutions think needs to be accomplished, what is feasible and affordable, and where to concentrate efforts. For the discussion, XML seemed to provide a good starting point: XML, on the one hand, is increasingly considered by heritage institutions as a key standard for publishing metadata on the Web; on the other hand, it is a major building block for the Semantic Web. It proved different, in a positive sense. In the discussion, wide use of XML was taken for granted, while the key area of interest that surfaced, and was seen to be most fruitful to explore, was ontologies.
OVERVIEW
Setting the context for this Issue, the position paper looks into the requirements for achieving the goals of the Semantic Web, and assesses whether the available technologies will be able to deliver on what the advocates of the Semantic Web envisage, as well as whether the cultural heritage sector is in a position to take substantial steps towards semantic interoperability. It concludes with the argument that the sector is more likely to be left behind, due in particular to the fact that, for the institutions, the rewards for the necessary investments are still too nebulous.
Janneke van Kersen, from the Dutch Digital Heritage Association, in her interview with the DigiCULT Journalist suggests that, despite the cloudy Semantic Web horizon, there are medium-term benefits to be gained for heritage institutions in taking steps towards the vision. And she states that it is up to associations like hers, together with larger institutions, to take the lead in this, prove that proposed solutions work, and support smaller institutions in taking advantage of them. On the other hand, Nicola Guarino, in his interview, believes that reaching the ‘real’ Semantic Web lies in taking ‘the fundamental route’ of implementing generic ontologies based on linguistics and logics within the Semantic Web fabric. He also claims that even incremental progress along this path can have remarkable pay-offs.
Michael Steemson’s summary of the Darmstadt Forum illustrates that the Semantic Web topic resembles a labyrinth with currently no definite map or Ariadne’s Thread at hand. Building on the many technologies the Forum participants mentioned as some of the labyrinth’s angles, we have added to the summary a list of resources related to these technologies.
In an effort to raise the veil of mystery surrounding the Semantic Web, this issue includes an example from the sector on the implementation of semantic interoperability of metadata, combined with a primer that explains core building blocks such as XML, RDF and ontologies. While a detailed primer of, for example, RDF would alone exhaust the limits of this issue [1], the goal here is to deliver an ‘all-inclusive’ primer within the space permitted, with all the inevitable limitations this entails. The primer attempts to provide a general understanding of the Semantic Web architecture without obliging the reader to wander through the long and perplexing corridors of language specifications.
Finally, we want to thank the Koninklijke Bibliotheek, National Library of the Netherlands, for their kind permission to use selected images from their collection of illuminated medieval manuscripts. [2] We hope you will appreciate the little narratives they represent within the overall fabric of this DigiCULT Thematic Issue.
1 cf. F. Manola, E. Miller: RDF Primer (W3C Working Draft, 23 January 2003), http://www.w3.org/TR/rdf-primer
2 See their online collection of such images at http://www.kb.nl/kb/manuscripts, which offers advanced search and presentation features.
POSITION PAPER
By Seamus Ross
Tim Berners-Lee and his colleagues at W3C have recognised that the real benefits of the web-based information revolution will come from enabling the interoperability of content. The current generation of web delivery is, they have argued, designed for human users, who struggle to make effective use of the billions of pages of information currently accessible. When we search for something at the moment, we sometimes discover suitable candidate information, but more often than not this is far from being the case. More than this, the entire process of searching, discovery and use is designed to be driven by humans. When we discover one piece of the puzzle, we need manually to position that information so that it can help us to search out the next piece of the puzzle. We find that Darmstadt is near Frankfurt. Then we find that there are flights from Glasgow to Frankfurt and there is a bus from Frankfurt Airport to Darmstadt. Then I search for timetables, make manual comparisons and decide which times best meet my requirements. In the Shangri-La that is the Semantic Web, my ‘agent’ would recognise from its regular review of my diary that I needed to be at a meeting in Darmstadt on the 21st of January 2003, and it would search out the options, analyse the timetables, identify the optimum travel arrangements, book my non-smoking hotel accommodation, and order the taxi to take me to the airport. (It might even check the weather forecasts and warn me to bring particular types of clothing.) Certainly, to make this happen there has to be a fundamental shift in the way data, information and knowledge are represented on the web.
Genesis – The Creation: Division of Light and Darkness
The proliferation of web-based resources makes finding what you are looking for increasingly difficult. According to Internet user studies, in 1996 50% of Internet users reported spending time looking for information without finding it, but by 2002 only about 40% of users ended their ‘searching sessions’ unsuccessfully. At first glance we might conclude that web discovery tools have improved and/or the information searching skills of users have improved. Over the past seven years the quantity of content has mushroomed, the search tools have become more efficient, developers approach the use of meta-tags more effectively, and anecdotal evidence suggests that the searching techniques of users have become more sophisticated. We should continue to be surprised by the high failure rate, and wonder why it remains proportionally so high as the numbers of users have grown to nearly 600 million. In reality, there is just too much content available. It is poorly described. It is not interconnected. Search engines themselves are blunt instruments. Most users of the web do not have very mature searching strategies and rarely use even the blunt instruments as effectively as they might. A solution is to make more of the information capable of discovery, interpretation and reuse by automated information processing tools themselves. However, the way content is currently represented on the web makes it nearly impossible for machines to search the web meaningfully and effectively – even with the limitations of their skills and tools, humans are better at searching the web than the most powerful of the current generation of agents. The emergence of the Semantic Web would solve this problem.
The web has made us realise the tremendous potential of digital resources and made them widely available. Content as presented on the web currently is mute. By adding descriptive information to content and resources, and representing both the descriptive information and the content in well-defined, consistent and structured ways, ‘mechanised agents’ could be enabled to use web information ‘intelligently’. Tim Berners-Lee, Jim Hendler and many other researchers believe that commercial and public sector institutions are increasingly recognising the benefits of ensuring that their content is adequately represented so that it is visible and discoverable within the context of the Semantic Web.
Illustration: Genesis – The Creation. Division of the Waters Above and Below the Firmament
The Semantic Web will enable the heritage sector to make its information available in meaningful ways to researchers, the general public and even its own curators. The public will be able to plan visits to institutions by, for example, dynamically relating opening times to public transport schedules. They will be able to use information to discover whether or not that vase in the attic or basement is really Ming, as their grandmother claimed, by comparing it to the holdings of heritage institutions across the world. Curators will benefit from the ability to define an exhibition and have the entire process, from the identification of the pieces to be shown in the exhibition to the production of the catalogue and publicity material, automatically handled by their ‘exhibition agents’.
TOWARDS AN INTEROPERABLE SEMANTIC WEB FOR HERITAGE RESOURCES
Delivering the Semantic Web to the heritage sector depends upon (a) the syntactical and semantic mark-up of content, (b) the development of better knowledge analysis and modelling tools, (c) widespread adoption of interoperable knowledge representation languages, and (d) the construction of suitable ontologies. In most of this the heritage sector is lagging behind. We have not yet successfully represented sufficient quantities of our data in ways that make it accessible to human web users, let alone in ways that would make it feasible for ‘mechanised agents’ to reason about in meaningful ways. ‘Languages for representing data and knowledge are an important aspect of the Semantic Web’ (Klein 2001, 26). The languages that are currently the focus of the most substantial discussion, such as RDF, DAML+OIL and OWL¹, do not necessarily provide a suitable framework for delivering the Semantic Web. This point has been increasingly argued in the literature, although in practice we still tend to emphasise the possibilities of representation mechanisms such as RDF(S) because it provides a flexible and extensible mechanism to represent metadata.
A debate is raging about which language should be used to represent semantics on the web. Resource Description Framework (RDF), an XML-based mechanism for expressing metadata, has been put forward at the basic level, but there is a growing body of opinion that indicates it does not have the richness that is necessary to make a suitable language. One of its shortcomings is that it cannot support syntax. In response, other languages such as DAML+OIL have been developed. As an indication of the current levels of flux, in a fundamental paper Patel-Schneider and Siméon from Bell Labs Research remark that ‘…there is a semantic discontinuity at the very bottom of the Semantic Web, interfering with the stated goal of the Semantic Web. If Semantic languages do not respect World-Wide Web data, then how can the Semantic Web be an extension of the World-Wide Web at all?’ (2002a, 147).
The strength of XML is that it does not itself constrain how the data will be interpreted. While XML does not imply a specific interpretation of the data, how the material is marked up does constrain how it can be used. Fallside (2001) has made plain the weaknesses of using DTDs as a way of specifying semantic properties in XML (eXtensible Markup Language). XML Schemas offer a solution to these weaknesses, especially where those weaknesses arise from representational problems. On the other hand, the hierarchical nature of XML does not fit all domains, it ‘does not encode the data’s use and semantics’, and DTDs and XML Schemas do not specify the data’s meaning, although they do specify the names of elements and attributes. Will the Semantic Web produce different levels of sophistication in the representation of data and knowledge in the web-world? If it does, will this create a patchy representation of web information that will make the Semantic Web of limited value?
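The point that DTDs and XML Schemas fix the names of elements but not their meaning can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the two records, their element names and the hand-written equivalence table are hypothetical and do not come from any real heritage schema.

```python
import xml.etree.ElementTree as ET

# Two hypothetical records from different institutions describing the same
# kind of object with different element names.
RECORD_A = "<object><creator>A. N. Artist</creator><type>watercolour</type></object>"
RECORD_B = "<item><artist>A. N. Artist</artist><kind>aquarelle</kind></item>"


def element_names(xml_text):
    """Return the set of element names used in a record."""
    return {element.tag for element in ET.fromstring(xml_text).iter()}


def shared_names(first, second):
    """Element names that two records have in common."""
    return element_names(first) & element_names(second)


# A hand-written (hypothetical) equivalence table supplies the meaning that
# the markup itself does not carry.
EQUIVALENT = {"item": "object", "artist": "creator", "kind": "type"}


def normalised_names(xml_text):
    """Element names after applying the equivalence table."""
    return {EQUIVALENT.get(name, name) for name in element_names(xml_text)}
```

Both records parse as well-formed XML, yet `shared_names(RECORD_A, RECORD_B)` is empty: nothing in the markup tells a machine that `artist` and `creator` mean the same thing. Only the external table lets the two be compared, which is the role the text assigns to shared semantics.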
1 See the ‘Semantic Web Terms and Reading List’ in this Issue.
ONTOLOGIES – THE JEWELS OF THE SEMANTIC WEB
For the Semantic Web to succeed it will require not only modelling languages such as XML, RDF and OWL, but it will also require methodologies for extracting and defining the knowledge that is to be represented. Decades of research and commercial attempts to exploit the knowledge-based systems have demonstrated the complexity of knowledge modelling. Until there is such a methodology, the possibilities of XML (or any other technology) as a knowledge representation language will not be achieved.
The success of the Semantic Web will depend heavily upon the creation of suitable ontologies. To avoid adding new variants to definitions we will follow James Hendler’s definition of ontology as ‘a set of knowledge terms, including the vocabulary, the semantic interconnections, and some simple rules of inference and logic for some particular topic’ (Hendler 2001, 30). One of the major hurdles facing us in building the Semantic Web is the lack of suitable ontologies. Languages such as OWL enable ontologies to represent ‘class taxonomies’ and provide mechanisms to enable their rapid development. For example, concepts and relationships can be established such as ‘watercolour is a type of painting’ or ‘a necklace is a type of jewellery’. But what about their multilingual capabilities? An ontology may well know that a ‘watercolour is a painting’, but it does not necessarily mean that it knows that an ‘aquarelle is a type of painting’ or that a ‘watercolour is a type of peinture’. In addition, and probably first, we need to consider:
• Can we cost the creation of appropriate ontologies for the heritage sector?
• How can we prioritise the ontologies that are needed (e.g. which ones should the heritage sector develop and which ones will we be able to borrow from other sectors)?
• What heritage-based organisations should focus on ontology creation?
• Ontologies often fail to be interoperable. What solutions are there to this problem and how can they be made to work effectively?
• Does OWL (W3C’s Web Ontology Semantic Markup Language for publishing and sharing ontologies) provide a suitable mechanism for ontology creation for the heritage sector?
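The watercolour and aquarelle examples above can be sketched as a tiny class taxonomy in which labels in several languages point at the same concept. This is a minimal illustration in plain Python, not OWL. The concept identifiers, the `BROADER` and `LABELS` tables and the function names are all hypothetical, and only the French labels mentioned in the text are included.

```python
# "Is a type of" links between concept identifiers (hypothetical identifiers).
BROADER = {
    "watercolour": "painting",
    "necklace": "jewellery",
}

# Labels per language for each concept. The French labels are the ones used
# in the text; the other concepts carry English labels only.
LABELS = {
    "watercolour": {"en": "watercolour", "fr": "aquarelle"},
    "painting": {"en": "painting", "fr": "peinture"},
    "necklace": {"en": "necklace"},
    "jewellery": {"en": "jewellery"},
}


def concept_for(label):
    """Return the concept identifier for a label in any language, or None."""
    wanted = label.lower()
    for concept, by_language in LABELS.items():
        if wanted in by_language.values():
            return concept
    return None


def is_a(narrow_label, broad_label):
    """True if the first label names the second concept or a narrower one."""
    narrow = concept_for(narrow_label)
    broad = concept_for(broad_label)
    if narrow is None or broad is None:
        return False
    current = narrow
    while current is not None:
        if current == broad:
            return True
        current = BROADER.get(current)  # climb one level up the taxonomy
    return False
```

With the labels attached to concepts rather than to the relationships, `is_a("aquarelle", "painting")` and `is_a("watercolour", "peinture")` both hold without restating the taxonomy per language. A label missing from the table, however, simply fails to resolve, which is the gap the text warns about.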
Gómez-Pérez and Corcho (2002), in an analysis of ‘Ontology Languages for the Semantic Web’, found that the measure of expressiveness in the current generation of ontology creation languages is a spectrum from XOL, RDF(S), SHOE, OML, OIL to DAML+OIL at the richest end of the scale. Indeed, in their experience, while any of these languages will work for very simple ontologies, any attempt to use a weak language to create a complex ontology will fail.
Proof and trust is emerging as another central issue. How do we know that what our agent has discovered through its trawl of the Semantic Web can be trusted? Even in the case of ontologies, how should we decide whose ontology to trust? This is especially important where two ontologies may conflict with one another. Similarly, we are faced with the difficulties of ensuring and maintaining semantic integrity, and a lack of methods for testing its presence.
LEGITIMISING THE SEMANTIC WEB INVESTMENT
Heflin and Hendler (2001) make the valuable proposal that semantic mark-up should be seen as one aspect of web-page design. This, in their view, would go a long way to ensuring that the costs of this mark-up (and the underlying information analysis that is necessary to make it happen) were met at the appropriate stage of the process of putting material up on the web. However, Heflin and Hendler’s proposal that semantic mark-up should be embedded into web-page design fails to recognise that the fundamental fabric of the web is changing. For this to happen we need a stronger argument for the benefits that such investment will bring to the heritage sector.
Haustein and Pleumann (2002) have noted that the successful development of the World Wide Web benefited from two factors: ‘Participation was simple and the results of effort were immediately visible to the creator.’ As they argue, while these two success criteria, best classified in my view as ease of use and instant gratification, were characteristic of the WWW, they are not embedded into the fabric of the Semantic Web. The Semantic Web is hard and rewards are neither immediate nor assured. While in the long term it may bring tremendous benefits, the near-term take-up will be slow.
At least three other factors contributed to fostering the success of the web. Firstly, the early web developments concentrated on content creation and not on the creation of representation languages. The initial instantiation of HTML was simple, but it worked, and material tagged using it remained accessible. We were not forced to dispose of work that we had ‘webised’ unless we wished to replace its representation with more sophisticated ones. Secondly, the value of the content that we put up increased as more users put up content of their own, because the additional content attracted more users. Thirdly, to benefit you did not need to generate a lot of content; a very little would do and you could incrementally add more later. Slowly, heritage institutions found ways to take advantage of the opportunities offered by the web, although there are still many small and medium-sized heritage institutions that have not.
Illustration: Genesis – The Creation. Division of Sea and Earth. Creation of Trees and Plants
Indeed, the heritage sector is likely to be left behind because the financial rewards for creating the mark-up necessary to make the Semantic Web a reality are only evident to the commercial sector. There can be little doubt that the access to and understanding of the heritage would benefit from a world in which the vision of the Semantic Web were realised. But this is not the first information technology for which the benefits were promising. Even very simple strategies, such as the use of databases to enable collection description, have been shown over a period of nearly thirty-five years to bring benefits to the heritage sector institutions through better knowledge about, care of and access to their collections. In the ALM (archives, libraries and museums) sector only libraries can be said to have fully taken advantage of the technology to describe their collections, and even here a close look shows that this has not covered all their holdings and not every institution. For instance, few libraries in the UK have online catalogues of their pre-1700 items and almost none have accurately described their photographic holdings at anything deeper than collection level. The same can be said of museums, where descriptions are limited, except of course at the major institutions. In 1997 a survey in the United Kingdom showed that small and medium-sized institutions were struggling to participate in the computer-based description of their holdings. This was even before they considered putting the output of those holdings online. I would argue that this should hardly be surprising, as the heritage sector has already been left behind in the development of online information in the web-world. Too few institutions have too little visible content that is actually usable. If the heritage sector is to make a near-term contribution to the development of the Semantic Web, it is going to be very moderate. It is very unlikely that developments will be related to reasoning about the heritage in the ways considered by Amann et al. (2002). The ALM sector is more than likely to participate in the development of the Semantic Web through the creation of semantic mark-up of information about access arrangements, such as opening hours and details of facilities. This information is more likely to be useful to the tourism agent described by Tim Berners-Lee in his 2001 Scientific American article. While this may be a very positive way of integrating the heritage into the Semantic Web, it does not maximise the potential benefits.
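What machine-readable access information could offer a visitor's agent can be sketched as follows. This is only an illustration under stated assumptions: the opening hours, the bus arrival times and the function names are invented for the example, and a real deployment would publish such data in an agreed vocabulary rather than in a Python dictionary.

```python
from datetime import time

# Hypothetical opening hours of one institution (weekday 0 = Monday).
OPENING_HOURS = {
    0: None,  # closed on Mondays
    1: (time(10, 0), time(17, 0)),
    2: (time(10, 0), time(17, 0)),
    3: (time(10, 0), time(17, 0)),
    4: (time(10, 0), time(17, 0)),
    5: (time(10, 0), time(17, 0)),
    6: (time(10, 0), time(17, 0)),
}

# Hypothetical arrival times of a bus at the stop nearest the institution.
BUS_ARRIVALS = [time(9, 15), time(10, 30), time(16, 45), time(17, 30)]


def minutes(moment):
    """Minutes since midnight for a time value."""
    return moment.hour * 60 + moment.minute


def is_open(weekday, at):
    """True if the institution is open on that weekday at that time."""
    hours = OPENING_HOURS.get(weekday)
    if hours is None:
        return False
    opens, closes = hours
    return opens <= at < closes


def usable_arrivals(weekday, min_visit_minutes=60):
    """Bus arrivals that leave at least min_visit_minutes before closing."""
    hours = OPENING_HOURS.get(weekday)
    if hours is None:
        return []
    opens, closes = hours
    result = []
    for arrival in BUS_ARRIVALS:
        start = max(arrival, opens)  # early arrivals wait for the doors to open
        if minutes(closes) - minutes(start) >= min_visit_minutes:
            result.append(arrival)
    return result
```

Once both sides are structured, relating opening times to transport schedules is a few lines of logic. The costly part, as the text argues, is getting every institution to publish the data in a consistent form.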
The days when a curator who wishes to hold an exhibition on the representations of Salome since the 15th century will be able to ‘load’ an agent with the request to identify, select, negotiate the loan of, and arrange the transportation of the key 100 works of art are a long way off. The fundamental descriptions of holdings are not currently available; where they exist they are not online, and they certainly have not been semantically encoded to make them usable by our Salome agent. For those who have worked on Knowledge Representation, the vision of the Semantic Web holds promise. Knowledge Representation is hard, especially if you intend any particular representation to be usable by others, either as a decision-making resource or for research purposes. Efforts in the 1980s and early 1990s, such as those in archaeology, failed. The reasons Knowledge Representation failed to achieve its promise ranged from the poor quality of knowledge extraction strategies, the lack of fundamental representation methodologies, the limited applicability of methods to knowledge domains, and the problems of boundary constraint and creep, to the high costs of developing applications. The Semantic Web could breathe new life into this earlier promise by providing ways to carve up the problem while bringing us immediate successes.
CONCLUSION
Over the next five years the possibilities offered by the Semantic Web will bring little near-term benefit for the heritage sector unless that sector co-ordinates its efforts to ensure that the fundamental building blocks that are necessary for the Semantic Web to be a success are put in place. We need some quick wins. A quick win involves identifying a domain that every institution can be encouraged to represent semantically, and the placing in the public domain of a ‘personalisable agent’ that can take advantage of these semantics. Three factors could underpin a quick win: (a) a narrowly restricted knowledge domain of real public value, (b) an accessible and narrow ontology, and (c) a personalisable tool for processing knowledge.
Ultimately, the same factors that constrain the heritage sector’s ability to take full advantage of the Web will constrain the penetration and pervasiveness of the Semantic Web in the heritage sector. The success of the Semantic Web in the heritage sector depends upon its adopting an XML-based approach and a significant experiment that demonstrates its benefits to the wider community. Even for all its weaknesses, the Semantic Web offers a tantalising solution to the problem of information overload created by the web, and the heritage sector needs to address how it can take advantage of the opportunities it offers.
Bibliography
Amann, B., Beeri, C., Fundulaki, I. and Scholl, M., 2002, ‘Ontology-Based Integration of XML Resources’, in I. Horrocks and J. Hendler (eds), The Semantic Web - ISWC 2002, Berlin: Springer, 117-131.
Berners-Lee, T., Hendler, J. and Lassila, O., 2001, ‘The Semantic Web’, in Scientific American, May 2001.
Candan, K.S., Liu, H. and Suvarna, R., 2001, ‘Resource Description Framework: Metadata and Its Application’, in SIGKDD Explorations, 3.1, 6-19, http://www.acm.org/sigs/sigkdd/explorations/issue3-1/candan.pdf
Doan, A., Madhavan, J., Domingos, P. and Halevy, A., 2002, ‘Learning to Map between Ontologies on the Semantic Web’, in Proceedings WWW2002, 7-11 May 2002 (Honolulu), 662-673.
Gómez-Pérez, A. and Corcho, O., 2002, ‘Ontology Languages for the Semantic Web’, in IEEE Intelligent Systems, January/February 2002, 54-60.
Haustein, S. and Pleumann, J., 2002, ‘Is Participation in the Semantic Web Too Difficult?’, in I. Horrocks and J. Hendler (eds), The Semantic Web - ISWC 2002, Berlin: Springer, 448-453.
Heflin, J. and Hendler, J., 2001, ‘A Portrait of the Semantic Web in Action’, in IEEE Intelligent Systems, March/April 2001, 54-59.
Hendler, J., 2001, ‘Agents and the Semantic Web’, in IEEE Intelligent Systems, March/April 2001, 30-37.
Hendler, J., Berners-Lee, T. and Miller, E., 2002, ‘Integrating Applications on the Semantic Web’, in Journal of the Institute of Electrical Engineers of Japan, 12010, 676-680.
Klein, M., 2001, ‘XML, RDF, and Relatives’, in IEEE Intelligent Systems, March/April 2001, 26-28.
McGuinness, D.L., Fikes, R., Hendler, J. and Stein, L.A., 2002, ‘DAML+OIL: An Ontology Language for the Semantic Web’, in IEEE Intelligent Systems, September/October 2002, 72-80.
Patel-Schneider, P. and Siméon, J., 2002a, ‘Building the Semantic Web on XML’, in I. Horrocks and J. Hendler (eds), The Semantic Web - ISWC 2002, Berlin: Springer, 147-161, http://www-dbs.cs.uni-sb.de/lehre/ss03/xml-seminar/Material/PS02.pdf
Patel-Schneider, P. and Siméon, J., 2002b, ‘The Yin/Yang Web: XML Syntax and RDF Semantics’, in Proceedings WWW2002, 7-11 May 2002 (Honolulu), 443-453.
Renear, A., Dubin, D., Sperberg-McQueen, C.M. and Huitfeldt, C., 2002, ‘Towards a Semantics for XML Markup’, in DocEng’02, 8-9 November 2002 (McLean, VA), ACM Publication.
Shah, U., Finin, T., Joshi, A., Cost, R.S. and Mayfield, J., 2002, ‘Information Retrieval on the Semantic Web’, in CIKM’02, 4-9 November 2002, 461-468.
DEVELOPMENT OF THE SEMANTIC WEB MUST BEGIN AT THE GRASS ROOTS LEVEL
AN INTERVIEW WITH JANNEKE VAN KERSEN, DUTCH DIGITAL HERITAGE ASSOCIATION, THE NETHERLANDS
By Joost van Kasteren
‘To be successful, the Semantic Web for the cultural heritage sector will have to develop from the grass roots level. A top-down approach, whereby institutions have to squeeze themselves into a certain format, is not going to work.’ Janneke van Kersen has strong views on the initiatives that are currently being undertaken to develop a Semantic Web. ‘They are not going to work for the cultural heritage institutions if you do not take into account the position that they are in. Especially not if the institutions are forced to overhaul their digitisation projects completely.’
Kersen graduated in Art History and did a postgraduate course on Historical Information Processing. Since 1999 she has been a consultant with the Dutch Digital Heritage Association (Vereniging DEN), which supports cultural heritage institutions, large and small, in developing strategies to face the digital future.
The key objectives of the DEN are to assist institutions in digitising and documenting their collections according to high quality standards, and assuring cross-domain and cross-institutional access to heritage information in a context-rich, structured environment. The methods used to realise these objectives are knowledge dissemination, best practice and standardisation. The Association propagates open standards like XML, OAI and Dublin Core (qualified).
The DEN has approximately 60 member institutions, among them most of the large heritage institutions of The Netherlands. It provides access to the databases of the member organisations through the portal http://www.cultuurwijzer.nl. The Cultuurwijzer (culture pointer) to the collections uses the Aqua Browser to search for terms in a non-hierarchical, associative way. Databases can also be accessed through subject fields based on the Dublin Core standard. Research is carried out to apply the Art and Architecture Thesaurus in a post-coordinative way, using it as an additional search aid.
Kersen: ‘The Dublin Core has some drawbacks, but it is one of the few international standards available for exchange of information. Mapped to the 5 Ws (who, where, what, when and why), it turns out to be a nice tool for interoperability across the databases of heritage institutions. Of course, we have to accept a certain kind of fuzziness and lack of precision compared with domain-specific access at the institutional level.’
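The idea of mapping Dublin Core fields to the 5 Ws can be sketched in a few lines. The interview does not say which elements go under which W, so the grouping below is an assumption made purely for illustration, as are the two sample records and the function names. The element names themselves (`creator`, `title`, `date`, `coverage` and so on) are standard Dublin Core elements.

```python
# Assumed grouping of Dublin Core elements under the five Ws. The interview
# does not specify the mapping; this is one plausible reading.
FIVE_WS = {
    "who": ["creator", "contributor", "publisher"],
    "what": ["title", "subject", "type"],
    "when": ["date"],
    "where": ["coverage"],
    "why": ["description"],
}

# Hypothetical records from two different institutions.
RECORDS = [
    {"title": "View of a harbour", "creator": "Unknown artist",
     "date": "1650", "coverage": "Delft"},
    {"title": "Harbour ledger", "publisher": "City archive",
     "date": "1652", "coverage": "Rotterdam"},
]


def by_five_ws(record):
    """Group the values of one Dublin Core record under the five Ws."""
    grouped = {w: [] for w in FIVE_WS}
    for w, elements in FIVE_WS.items():
        for element in elements:
            value = record.get(element)
            if value:
                grouped[w].append(value)
    return grouped


def search(records, w, term):
    """Return the records whose values under one W contain the term."""
    needle = term.lower()
    return [
        record for record in records
        if any(needle in value.lower() for value in by_five_ws(record)[w])
    ]
```

A cross-database query such as `search(RECORDS, "where", "delft")` then works the same way for a museum record and an archive record. The price is the fuzziness Kersen mentions: `coverage`, for instance, can hold either a place or a period, so the ‘where’ grouping is only approximate.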
According to Kersen, a real Semantic Web is still a long way off. ‘We simply do not have the tools yet for a meaningful exchange and representation of information. XML and RDF do not provide the interoperability that is needed. On the other hand, I do not believe in developing a fundamental ontology to give meaning to information on the Net. It looks to me like the 18th-century endeavour to write an encyclopaedia that contains all the knowledge in the world. I am afraid it does not work that way. A lot of knowledge, even scientific knowledge, cannot be described in a logical way. Especially in the arts, a lot of “knowledge” is the result of heuristics and associative thinking. Apart from that, there is the practical problem that cultural heritage institutions do not have the money and the staff to describe their collections anew in a way that fits the ontology. That is just too much work.’
Kersen thinks the Semantic Web will grow gradually from the grass roots level onward. Within the Dutch Digital Heritage Association she can point at several initiatives. Considered on their own they might not seem like much, i.e. in relation to the gargantuan task of developing a Semantic Web, but when they are combined a certain pattern begins to emerge.
There is, for instance, a project, SitecheckerRDF, which will look into ways in which RDF can be used for describing content for Web-based delivery. Furthermore, several standardisation projects are running that enable the participating institutions to develop the description of their specific knowledge domain, e.g. graphic domain, religion, art history. The formal and semantic mapping schemes used in these projects will include Dublin Core, Encoded Archival Description (EAD), IMS Learning Resources Meta-data Specification, Art & Architecture Thesaurus, as well as the CIDOC reference model. Kersen: ‘At the moment most of the reference terms are developed at the level of institutions, which means their use is limited to a certain domain. Or, to put it another way, every domain is developing its own dialect. Maybe the development of a combined reference scheme will be a step towards a Semantic Web.’
Another important project is the development of a scheme for description at the collection level, in order to offer a clearer and more hierarchical access to heritage collections. This project has its roots in the Dutch project for collection level description, MUSIP (Museum Inventarisation Project). The description scheme will be broadened to make it available for other heritage institutions as well, for example archives.
The description and results of these projects and the programme lines will be made available on the Web site of the Vereniging DEN, http://www.den.nl. As Cultuurwijzer is used as a proof of concept, the results will be directly accessible at http://www.cultuurwijzer.nl. (Kersen kindly invites interested parties to put questions directly to their organisation.)
Projects are always carried out in co-operation with the member organisations. Kersen: ‘We first try it ourselves until we are sure it works, a “proof of concept” you could say. These tests are overseen by a small group of automation experts working for our member organisations. The next step is to test the method on a larger scale with some of our member organisations. A larger working group oversees these tests. Only then is the method released to our member organisations. The advantage is that smaller member organisations can ride on the experience of the larger ones. By taking a step-by-step approach we also enhance the level of commitment. You could say we are providing some order in the information chaos that exists on the Internet. A few small steps along the long road towards the Semantic Web.’
Vereniging Digitaal Erfgoed Nederland: http://www.den.nl
Cultuurwijzer: http://www.cultuurwijzer.nl
DIGICULT’S EXPERT 13 TANGLE WITH THE SEMANTIC WEB
By Michael Steemson
It was really very kind of the Moderator. He asked the cultural heritage experts: ‘What is the message that we should give our scientific writer who will do a write-up of this meeting for the Thematic Issue?’ Their inclinations had not always been clear. But the question focused minds and they were certain now.
‘I would put my money on the Semantic Web,’ said two of them, not quite in unison. ‘The Semantic Web is a direction, it is like North. You go North but you never arrive and say “here it is”. This is the Semantic Web,’ said another.
The course of the debate at the Darmstadt DigiCULT Forum had not always been so direct. It had started with the Position Paper’s dismaying thought that ‘the limited understanding of information processing in the heritage sector almost makes the Semantic Web an impossibility to apply’.
It had touched on the semantics of Simeon poetry, art works of the biblical seductress Salome, weather forecasts for the northern English city of York and the revolutionary theories of 16th-century Italian astronomer Galileo. But the experts agreed, finally, that the cultural heritage sector needed the Semantic Web and that a good deal of education and guidance would be required to make it appreciate that need.
Illustration: Genesis – The Creation. Birds and Fishes
The experts numbered 13, lucky for some, at this, the third DigiCULT Forum of the European Union’s technology watchdog for cultural and scientific heritage institutions. In the previous 12 months, other forum groups had discussed authenticity and integrity for digitisation programmes and, later, digital asset management systems. Now, the Darmstadt 13 - historians, language and information technology scientists, academics and publishers - were looking even further down the information autobahn to the vision of WWW inventor, Englishman Tim Berners-Lee, who sees a new kind of automated Web that learns and understands each user’s particular requirements and delivers complete, reliable, tested information sets.
In a co-authored May 2001 Scientific American article¹, Mr Berners-Lee imagines a family facing the horrors of re-scheduling its lives around a mother’s unexpected illness. The sons and daughters rely on Semantic Web ‘agents’, small executable Web files, to search online medical records, hospital bed lists, transport timetables, doctors’ appointment books, road condition reports and home diaries to find treatment, plan travel and re-arrange personal engagements to fit the emergency.
The vision requires huge world-wide investment in time and effort creating countless ‘ontologies’ containing, perhaps, XML (eXtensible Mark-up Language) and RDF (Resource Description Framework) data to which the electronic ‘agents’ could refer for understanding before applying to specially formatted Web pages for the information.
The dazzling forecast of Berners-Lee et al. is: ‘The Semantic Web will enable machines to comprehend semantic documents and data, not human speech and writings. Properly designed, the Semantic Web can assist the evolution of human knowledge as a whole.’
THE DAZZLING PROSPECTS
Dazzling it is, and the Darmstadt 13 were attracted. But they were not blinded. Moderator Dr Seamus Ross, the Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII), suggested that while the current WWW content got a lot of ‘bang’ for its development dollars, the Semantic Web needed huge, expensive content before it could work well.
Application of the Berners-Lee ideas to cultural heritage use was a long way off, he thought, and wondered: ‘Is there enough benefit from the Semantic Web in the near term to make it a realisable dream 50 years down the road?’
Italian National Research Council Applied Ontologies Laboratory director Nicola Guarino has been working on the subject for 12 years and he knows the difficulties. He said: ‘This is the ideal view which Tim Berners-Lee has: machines which work for you, your proxy which works for you, performing these dynamic connections for the Web which preserve meaning. It is pretty ambitious, but this is his idea. I would be happier if, rather than using an automatic proxy, we could just let people establish these dynamic connections using their brain and the Web. This is already something that is not easily done.’
Austria’s Wernher Behrendt had encountered other snags. At Salzburg Research, the secretariat for the DigiCULT Forums, he co-ordinates another European Commission IST project, CULTOS (Cultural Units of Learning - Tools and Services). He conceded: ‘There is a 50-year research vision behind the issue of the Semantic Web,’ and went on: ‘but there are incremental steps that, with good utility, can be built in a reasonable time. One of the intellectual challenges is to break the vision into these manageable steps.’
LANGUAGE REPRESENTATION HITCHES
A CULTOS group had, Behrendt explained, taken one of these incremental steps and built an ontology² for digitised works of art. It had encountered problems with language representation, like: ‘Are there tools to support knowledge representation language? Are the users then actually able to work usefully with that? Can we incorporate the multimedia authoring component where people who have not built the ontology themselves should then use it to combine multimedia assets with each other?’
1 T. Berners-Lee, J. Hendler, O. Lassila: The Semantic Web. In: Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html
2 Ontology, n. 1. Philosophy: The branch of metaphysics that deals with the nature of being. 2. Logic: The set of entities presupposed by a theory. Collins English Dictionary, Third edition, Glasgow 1991.
Cultural Units of Learning – Tools and Services (CULTOS)
CULTOS is an RTD project co-funded by the European Commission under the Information Society Technologies (IST) Programme, which will run until October 2003. The application domain of CULTOS is intertextual studies in literature and arts. The project is developing a multimedia authoring and presentation environment that allows scholars to make the different relationships between cultural works explicit, in a way that approximates contextualisation in interpretative processes. The result of the authoring processes is multimedia objects called ‘intertextual cultural threads’. These are based on ‘EMMOs’, a novel type of structured multimedia object containing expert knowledge, conforming to current and emerging standards such as XML/SMIL (with interactive extensions), MPEG-7 and RDF. http://www.cultos.org
The Dutch had doubts, too. Dr Janneke van Kersen is an art historian with Digital Heritage Netherlands (Digitaal Erfgoed Nederland, http://www.den.nl), where an XML-based content management system is being combined with a Resource Description Framework (RDF) to join databases from several cultural heritage institutions. She told the experts: ‘I need to be assured that we will be able to build a layered structure that is equally applicable to each knowledge domain. Furthermore, I think that the cultural heritage sector is too much of a niche market to develop this.’
Her countryman, Dr Frank Nack, from the national research institute for mathematics and computer science, CWI (Centrum voor Wiskunde en Informatica, http://www.cwi.nl) in Amsterdam, works with a multimedia and human computer interaction group. His concern: ‘Our group believes in the Semantic Web, but we needed some mechanisms to structure the information so that various groups can work with it. What we came up with was the belief that you can classify the user at a particular time. But that was simply not good enough.’
The group had found that users change their requirements widely and these shifts were invisible to a system. ‘Humans can look at material one day and the next day they look at the same stuff differently and describe it differently because they are in a different mood,’ he said. He characterised the problem as: ‘Now I would like to see something for my work, and now I want to be entertained. Which means I would like to access the information differently.’
He had one other worry: Webised mixed media. He said: ‘This discussion has been heavily linguistic based, which I can understand because most people do still think of the Web as text driven. The issue of describing various media items that are not text will, I think, very soon become important for the Semantic Web. We had better start thinking about that, too.’
THE GALILEO CONUNDRUM
The Institute and Museum of Science History in Florence, Italy, has tried to create an ontology around the works and sciences of its city’s famous son, the revolutionary astronomer, mathematician and physicist Galileo Galilei (1564-1642).³ But it ran into difficulties when it came to the radical changes in theory that he created.
Institute relational database expert Andrea Scotti told the Forum: ‘Galileo’s scientific theory negates another scientific theory. This negating or development of theories was very difficult to represent in the ontology when dealing with the time factor. That is central to historical documentation, but representational time was not part of the process available to us.’
Dr Costis Dallas, the Athens chairman of the European communication and technology group Critical Publics, thought the Florence museum’s project was very ambitious. The time argument was difficult because ‘of course it isn’t possible to represent time properly within a relational database’. But there were mechanisms - he mentioned software by the Virginia, US, IT group Telos⁴ - that better represented issues of time.
Italian National Research Council’s Nicola Guarino chipped in: ‘The CIDOC⁵ reference models have partial answers to these questions.’
Dr Dallas went on: ‘I do not believe that the whole exercise is futile, but we found that in practice you cannot make a subject language for everybody. It has to be for a community of users. If you provide them with a richer representation, for instance if they can know that this is a person and this person lived in a place, then users will have a much richer experience.’
3 The museum and Web site are rich resources for the life and work of Galileo: http://galileo.imss.firenze.it
4 Telos Corporation, Ashburn, VA, US: http://www.telos.com
5 CIDOC, International Committee for Documentation of the International Council of Museums (ICOM-CIDOC): httpwwwwillpowerinfomybycoukcidocCIDOCe. Forum for documentation interests of museums and related organisations; one of 25 international committees of the International Council of Museums (ICOM).
CAPITALS AND ACRONYMS
The Forum moved on to a discussion of the science and software behind the Semantic Web. Moderator Seamus Ross started by questioning whether ‘key thinkers’ were ‘missing a fundamental point: that Web pages are dead, that database-driven Web pages are the future, that people are going to stop making Web pages and make databases’.
He recalled that James A Hendler, co-author with Tim Berners-Lee of the Semantic Web paper and a professor in the Department of Computer Science at the University of Maryland, had recommended semantic representation as part of any Web page. Dr Ross commented: ‘This notion of the Semantic Web is not going to work with these databases. Is this true or not?’
Bert Degenhart-Drenth, the managing director of Netherlands ADLIB Information Systems (http://www.nl.adlibsoft.com), thought it was true. But it was a practical problem that would be solved
Open Archives Initiative (OAI) Protocol for Metadata Harvesting
The Open Archives Initiative (OAI) is a mainly US-based group of people and organisations that evolved out of a need to increase access to scholarly publications through interoperable digital repositories. Support for the OAI’s goals comes from the Digital Library Federation, the Coalition for Networked Information and an NSF grant. One of its major achievements is an application-independent interoperability framework based on metadata harvesting, the OAI Protocol for Metadata Harvesting.
The OAI Protocol is based on the Web standards HTTP and XML, and employs Dublin Core (unqualified) as metadata standard. Heritage organisations whose systems support the OAI Protocol can expose metadata about the content in their repository, i.e. allow service providers to harvest the data for services such as search engines. In the OAI Protocol, XML Schema is used at two levels: to define the format of responses to all OAI Protocol requests, and to define the format of metadata streams embedded in the GetRecord and ListRecords responses. In both cases the goal is to provide a mechanism for data validation.
http://www.openarchives.org
The OAI-PMH version 2.0, released in June 2002, can be found at http://www.openarchives.org/OAI/2.0/openarchivesprotocol.htm
See also John Perkins: A New Way of Making Cultural Information Resources Visible on the Web: Museums and the Open Archives Initiative. Museums and the Web Conference 2001. http://www.archimuse.com/mw2001/papers/perkins
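The harvesting exchange described in this box can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the repository address and the sample record are invented, and the sample response is abbreviated (a real response also carries elements such as the response date and an echo of the request). The verb ListRecords, the metadataPrefix value oai_dc and the three namespace URIs are those of OAI-PMH 2.0 and unqualified Dublin Core.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespaces used by OAI-PMH 2.0 responses carrying unqualified Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

def build_request(base_url, verb, **arguments):
    """Return the HTTP GET URL a harvester would send for one OAI-PMH verb."""
    query = urlencode({"verb": verb, **arguments})
    return f"{base_url}?{query}"

def extract_titles(response_xml):
    """Collect (identifier, title) pairs from a ListRecords response."""
    root = ET.fromstring(response_xml)
    pairs = []
    for record in root.iterfind(".//oai:record", NS):
        identifier = record.findtext("oai:header/oai:identifier", namespaces=NS)
        title = record.findtext(".//dc:title", namespaces=NS)
        pairs.append((identifier, title))
    return pairs

# Hypothetical repository address and abbreviated sample record (assumptions).
BASE_URL = "http://repository.example.org/oai"
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:repository.example.org:item-1</identifier>
        <datestamp>2003-01-15</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Astrolabe</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""

request_url = build_request(BASE_URL, "ListRecords", metadataPrefix="oai_dc")
titles = extract_titles(SAMPLE_RESPONSE)
```

A service provider such as a search engine would fetch request_url over HTTP and feed the response body to extract_titles; the sketch parses a stored sample instead so that it runs without a network connection.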
Open Archives Forum Project
http://www.oaforum.org
The Open Archives Forum Project is a two-year Fifth Framework Programme IST accompanying measure that will run until September 2003. The Forum is building a Web-based database on OAI-related projects, software implementations and services, and supports the information exchange between OAI user communities.
Their surveys provide good insight into the status of uptake of the OAI in Europe. For an overview of the results see S. Dobratz, B. Matthaei: Open Archives Activities and Experiences in Europe: An Overview by the Open Archives Forum. In: D-Lib Magazine, Vol. 9, No. 1, January 2003. http://www.dlib.org/dlib/january03/dobratz/01dobratz.html
Recently, one of their workshops, ‘Providing Access to Hidden Resources’ (Lisbon, December 2002), targeted the libraries and archives communities. Requirements, standards, best practice and solutions to interoperability problems of these communities were analysed and compared with the features provided by the OAI Protocol for Metadata Harvesting. The tutorial ‘OAI and OAI-PMH for Beginners’ and other presentations can be found at http://www.oaforum.org/workshops/lisb_programme.php
by projects like the Web Services of the Open Archives Initiative (OAI).
Several members spoke of difficulties created by dynamic Web pages generated from ASP databases. Paul Miller, the UK Interoperability Focus for the University of Bath’s UKOLN (formerly the UK Office for Library Networking) project, said the UK’s Office of the e-Envoy,⁶ the agency in charge of Britain’s e-Government programme, drove it with a Web site called Govtalk.⁷ ‘One big ASP database,’ he called it. It carried the Government’s interoperability standard framework with a URL of ‘a great big long string, something, something, dot ASP’. ‘That is how this key piece of legislation is referred to, and next week it will be something else.’
Dr van Kersen thought the new Web Ontology Language, with the transposed acronym OWL, would help. Italian ontology researcher Nicola Guarino discussed the European Commission’s On-To-Knowledge project, its RDF tool Sesame⁸ and the Ontology Inference Layer (OIL) language.
Dr Ross described the US Defense Department’s DARPA (Defense Advanced Research Projects Agency) Agent Markup Language (DAML) programme and its DAML+OIL variant. Mr Degenhart-Drenth highlighted the importance of protocols used in Web Services, such as SOAP (Simple Object Access Protocol) and UDDI (Universal Description, Discovery and Integration), and also pointed to a SPECTRUM XML standard for museums that he helped write. Then there was the moviemakers’ audio-visual search standard MPEG-7 and SMIL (Synchronized Multimedia Integration Language). No one mentioned SHOE (Simple HTML Ontology Extensions), which was surprising as the discussion became more and more alphabetic and upper case.
ONTOLOGY TUTORIAL IN 800 WORDS
Nicola Guarino brought the discussion back on track. The ontology expert delivered a fascinating impromptu 800-word dissertation on ontology genetics.
Ontologies, he said, started because it was realised that controlled vocabularies, which worked well enough for limited periods, needed something extra to make them really useful. They needed clarification of intended meaning. This could be achieved in much the same way as dictionaries did it, by reference to other, more basic terms. This was, he said, the key point.
‘Ontologies can work if the basic terms are really used in a principled way. There is a hidden assumption here that it is indeed possible to express the meaning in terms of a relatively small set of primitive terms.’
He explained further: ‘There are general terms that have a universal meaning. Take the term “part”, for instance, or “set”. Or take temporal relations: “before”. Suppose I have two different periods, the Renaissance and another period, and suppose I say that this period comes “before” the other one: do you exclude the case whereby the two periods overlap or not? This is just a matter of stipulation; this is a general term that is not domain specific. I can simply stipulate exactly whether the “before” relationship between the two intervals includes the case of overlapping or not. And I can do that by means of axioms. Once you clarify the meaning of the basic terms, topological relations, mereological⁹ relations, dependence relations, these kinds of things, then you have the basic vocabulary that helps you to introduce more domain-related things. And this is what people are doing in the area of what are called “Foundational Ontologies”, and this is what I am doing. I believe this is the only way to solve the problem of semantic interoperability. So, not just controlled vocabularies, but vocabularies that are formally defined in minimal terms.’
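The point about stipulating “before” can be made concrete with a small Python sketch. It is an illustration only, not Mr Guarino’s formalisation: the two function names, the choice of the Baroque as the second period and the year ranges are assumptions introduced here, and historians would date both periods differently.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Period:
    name: str
    start: int  # first year of the period
    end: int    # last year of the period

def before_strict(a, b):
    """Stipulation 1: 'before' excludes overlap; a must end before b starts."""
    return a.end < b.start

def before_by_start(a, b):
    """Stipulation 2: 'before' tolerates overlap; a only has to start first."""
    return a.start < b.start

# Illustrative year ranges only (assumptions, chosen so the periods overlap).
renaissance = Period("Renaissance", 1400, 1600)
baroque = Period("Baroque", 1580, 1750)

strict_answer = before_strict(renaissance, baroque)     # False: they overlap
lenient_answer = before_by_start(renaissance, baroque)  # True: earlier start
```

Two systems that both record ‘Renaissance before Baroque’ agree only if they share one of these stipulations, which is the role the axioms play in the explanation above.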
Now it was clear, but would it be available to heritage institutions? Dr Ross wanted to know if fundamental ontologies of use to the heritage sector already existed. What would they be? Before any
On-To-Knowledge
On-To-Knowledge is an IST RTD project that was completed in June 2002. The project developed tools and methods for supporting knowledge management in large and distributed organisations. The technical backbone of On-To-Knowledge was the use of ontologies for the various tasks of information integration and mediation. For the project’s many results see their tools repository, project deliverables and publications at http://www.ontoknowledge.org/pubs.html
See also the On-To-Knowledge book: ‘Towards the Semantic Web: Ontology-driven Knowledge Management’, J. Davies, D. Fensel, F. van Harmelen (eds), John Wiley, December 2002.
6 Office of the e-Envoy: http://www.e-envoy.gov.uk
7 Govtalk: http://www.govtalk.gov.uk
8 Sesame environment: httpwwwontoknowledgeorgtoolsfactsheetSesamehtml
9 Mereology, n. The formal study of the logical properties of the relation of part and whole. Collins English Dictionary, Third edition, Glasgow 1991.
development could begin, basic classifications and methodologies would be required to form a foundation for the work.
Italian online publisher Marco Meli, a manager of the EU’s MESMUSES (Metaphors for Science Museums) project, insisted: ‘You need a clear definition in this particular domain. What are the key words, the terms?’
Mr Guarino answered: ‘The key concepts and the key relations.’
THE ELUSIVE SEMANTIC GRAIL
By now the Darmstadt 13 were beginning to realise they were not having much luck with their search for the Semantic Grail. Their discussion became a little tetchy.
Someone talked about ‘meta-ontology’ and another growled: ‘The term “meta” is abused.’
‘Outdated technical optimism’ was mentioned.
‘What is your alternative?’ someone else wanted to know.
‘I don’t have one.’
‘Is this the way forward?’
MESMUSES - Metaphors for Science Museums
MESMUSES is an IST RTD project that will run until July 2003. It aims at designing a general method and supporting tools to produce knowledge maps for use in self-learning environments of science museums. In the project, a knowledge map is defined as a set of related concepts and facts that is offered to learners with some guidance or suggestions on possible itineraries that they may follow to explore the knowledge space.
The method and tools developed in MESMUSES are being tested and validated by two large science museums, the Cité des Sciences et de l’Industrie in Paris and the Istituto e Museo di Storia della Scienza in Florence, which provide access to their digital catalogues. Both museums are developing knowledge maps and itineraries on different themes in Biology (Genome) and Physics (Galileo and the laws of motion).
Project Web site: http://cweb.inria.fr/Projects/Mesmuses
See also M. Meli: Knowledge Management: a new challenge for science museums. In: Cultivate Interactive, Issue 9, 7 February 2003. http://www.cultivate-int.org/issue9/mesmuses
‘What part of the problem would that solve?’
‘We will know when we have tried that out.’
‘Here we are approaching a scary field.’
Civility and peace were restored as Frank Nack, the CWI Netherlands scientist, introduced the thought: ‘There are ontologies for art and they are very old and well crafted. There are very clear rules about why they did what they did because they have worked on them for a thousand years. What you could suggest is that we strip down to the basics for one field, say art, and apply it to all the other fields we have in cultural heritage: architecture, film, whatever, all working with very different substances.’
Seamus Ross added: ‘So we need one fundamental ontology on which we can build all the others.’
THE LUCK CHANGES
The luck of the 13 was beginning to change. The Athenian heritage informatics expert Dr Dallas described work among his company clients on developing ‘an upper ontology’. All the issues the Forum had been discussing, what to do about time, basic concepts, process, agents and so on, were being examined. They were beginning to develop ‘something very much like a thesaurus’ with term expansion that created sub-categories of relationships. He called it ‘generic layering’, a process that could identify the ‘generic grammar’ of relationships within a specific domain - art history, for example.
‘This is useful,’ he said. ‘This way we can create Web systems that present an association of content for users that is meaningful to them. Let’s say, a “cultural” meaning.’
Nicola Guarino went further. He believed that the International Council of Museums’ Committee for Documentation (CIDOC) Conceptual Reference Model (CRM) was the ‘best starting point’ for the heritage community. The CIDOC CRM is the result of 10 years’ work by a standards working group. The model is under review for adoption as an International Organization for Standardization (ISO) standard.
Mr Guarino was enthusiastic: ‘I am not biased on this. I am a reviewer on the CHIOS project that supports this proposal for an ISO standard, and I am amazed by the fact that this standard is more or less principled. On the one hand the authors really strive to get these principled things, and on the other hand they have extensively accounted for existing practice. It is the result of a large community of work. It is not perfect, but it really is a starting point for this community.’
He affirmed Dr Ross’s delighted question: ‘So this is an ontology we can borrow?’
And the Italian expert had more to add: ‘The question before was “how can we be sure that the principled things can really solve our problems?” I do not have a crisp answer, but I do have some evidence that even a tiny result on the foundational side has a high pay-off. So you do not need to solve all the foundational issues.’
The distinctions between an object and its role, or individual and classes of items, were delicate but, once understood, could lead to significant data improvement, he said, adding: ‘Take for instance the distinction between object and event. This is only one tiny distinction, but it is so fundamental that once you understand it you can save time in developing your own application ontology. Tiny conceptual progress does have a high pay-off. This is why I believe it is useful.’
There were still one or two doubts, but CULTOS project co-ordinator Wernher Behrendt tidied up with a daring stance: ‘Let me be a heretic for a second. How many of us have an operating system other than Windows? What I am saying is that standardisation often helps. Even if it is not the best standard, it does help get people working together. It is perfectly fair to define a standard for the world now. There will be a lot of discussion, but it will wind down to a few constructs. It is a better method of getting an ontology accepted than having ontologies mushrooming all over the place that must then be integrated.’
The Darmstadt 13 were pleased. They had a model. They had ‘Web Services’ stepping stones to work across. They weren’t going to fall into the trap of insisting just yet that the Semantic Web was important for the heritage sector, but they wanted an education process for unconvinced curators.
They needed someone to make the first ontology move. Seamus Ross suggested asking the J Paul Getty Trust (http://www.getty.edu). They needed automated tools for testing ontologies. They did not want to be delivered into the entertainment industry but saw benefit in what University of Florence Associate Professor Franco Niccolucci called ‘cultural entertainment’.
So where would the experts put their money, asked Mr Behrendt: ‘On there not being any benefits in the Semantic Web for the cultural heritage sector, or there being some benefits in building such things, whatever they may be?’
Amsterdamer Frank Nack was in no doubt: ‘It is going to happen. It will probably look very different from how we imagine it right now, but it is going to happen.’ His countryman, ADLIB chief Bert Degenhart-Drenth, thought so too: ‘We have put our money there. All our applications work with XML.’ And Dr van Kersen agreed: ‘I would put my money on the Semantic Web.’
That seemed to make it game, set and match.
CIDOC CRM: ‘The Semantic Glue’
The ‘CIDOC object-oriented Conceptual Reference Model’ (CIDOC CRM) was developed by the ICOM/CIDOC Documentation Standards Group. Since September 2000 the CIDOC CRM has been under development as an ISO standard.
‘The CIDOC CRM is intended to promote a shared understanding of cultural heritage information by providing a common and extensible semantic framework to which any cultural heritage information can be mapped. It is intended to be a common language for domain experts and implementers to formulate requirements for information systems and to serve as a guide for good practice in conceptual modelling. In this way, it can provide the semantic glue needed to mediate between different sources of cultural heritage information, such as that published by museums, libraries and archives.’
http://cidoc.ics.forth.gr/what_is_crm.html
CHIOS - Cultural Heritage Interchange Ontology Standardization project
Since June 2001 the work of the CIDOC CRM Special Interest Group has been supported by CHIOS, a two-year project which receives funding from the Fifth Framework IST Programme. The CHIOS consortium forms an integral part of the CIDOC CRM Special Interest Group which, by organising shared meetings, represents the interests and requirements of the cultural heritage community to the ISO Working Group (ISO/TC46/SC4/WG9).
http://cidoc.ics.forth.gr/chios_iso.html
Annotation & Authoring
‘How to annotate and have fun’, http://annotation.semanticweb.org/faq/faq2, and other papers. See also http://annotation.semanticweb.org/tools, Institute for Applied Informatics and Formal Description Methods, University of Karlsruhe, Germany.
The Annotea Project, a ‘Live Early Adoption and Demonstration (LEAD)’ project of the World Wide Web Consortium (W3C): collaboration environment with shared annotations. www.w3.org/2001/Annotea
CIDOC CRM
The CIDOC Conceptual Reference Model was an important reference point in the Forum discussion (see also the information box on the CIDOC CRM in the Summary). CIDOC, the Comité International pour la Documentation, is part of the International Council of Museums (ICOM). Its Web site can be found at http://cidoc.ics.forth.gr
A report on CIDOC’s work on the CRM is provided in ‘The CIDOC Conceptual Reference Model: A Standard for Communicating Cultural Contents’ by Nick Crofts, Martin Doerr and Tony Gill, in Cultivate Interactive, Issue 9, February 2003. http://www.cultivate-int.org/issue9/chios
See also their tutorials at http://cidoc.ics.forth.gr/tutorials.html
DAML - DARPA Agent Markup Language
The Defense Advanced Research Projects Agency (DARPA) is the central research and development organisation for the US Department of Defense. Its DAML programme is developing a language and tools to facilitate Semantic Web concepts. http://www.daml.org
‘Why Use DAML?’, Adam Pease, white paper, Teknowledge, 10 April 2002. http://www.daml.org/2002/04/why.html
DAML+OIL
DAML+OIL is a product of the DARPA Joint United States/European Union ad hoc Agent Markup Language Committee. The committee created a language with the best features of SHOE, DAML, OIL and several other markup approaches. It is a Web ontology language (latest release March 2001) expected to provide a basis for future Web standards for ontologies. See http://www.daml.org/2001/03/daml+oil-index.html
Genesis - The Creation, Stars and Fishes
SEMANTIC WEB TERMS AND READING LIST A-X
Compiled by Guntram Geser and Michael Steemson
In the Summary we have provided information boxes on projects, but not on the many Semantic Web standards, technologies etc. mentioned in the Forum discussion. The projects included are only a small fraction of many ongoing activities and are related mainly to the cultural and scientific heritage community. Links to many important Semantic Web development projects can be found at their community portal, http://www.semanticweb.org
The following guide points to resources and readings on terms mentioned in the Forum Summary. It is not intended to provide a comprehensive list of Semantic Web materials. Rather, it represents different entry points and levels to this topic.
OIL - Ontology Inference Layer
OIL is a proposal for a Web-based representation and inference layer for ontologies. It uses a layered approach to defining a standard ontology language. Each layer adds functionality and complexity to the previous layer. This is done such that machines that can only process a lower layer can still partially understand high-level ontologies. See http://www.ontoknowledge.org/oil
A white paper on OIL functions: ‘An informal description of Standard OIL and Instance OIL’, by the On-To-Knowledge group led by the Department of Computer Science, University of Manchester, UK, 28 November 2000. www.ontoknowledge.org/oil/downl/oil-whitepaper.pdf
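The layering idea can be pictured with a short Python sketch. It does not reproduce the OIL specification: the layer numbers, the assignment of constructs to layers and the sample statements are assumptions made purely for illustration. It only shows how a processor that knows the lower layers can still read part of a richer ontology.

```python
# Hypothetical layer numbering and construct assignment, for illustration only.
LAYER_OF_CONSTRUCT = {
    "subclass-of": 1,      # basic class hierarchy
    "slot-constraint": 2,  # richer class definitions
    "instance-of": 3,      # statements about individuals
}

def readable_statements(ontology, max_layer):
    """Keep the statements whose construct lies at or below max_layer."""
    return [
        statement for statement in ontology
        if LAYER_OF_CONSTRUCT[statement[0]] <= max_layer
    ]

# A tiny made-up ontology mixing constructs from all three layers.
ontology = [
    ("subclass-of", "Painting", "Artwork"),
    ("slot-constraint", "Painting", "has-creator", "Person"),
    ("instance-of", "MonaLisa", "Painting"),
]

basic_view = readable_statements(ontology, max_layer=1)  # hierarchy only
full_view = readable_statements(ontology, max_layer=3)   # everything
```

A layer-1 processor still learns that a Painting is an Artwork, even though it has to skip the statements it does not understand.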
Ontologies
Ontology research: Laboratory for Applied Ontology (LOA), Institute of Cognitive Sciences and Technology (ISTC). http://www.ladseb.pd.cnr.it/infor/ontology/ontology.html
Dieter Fensel: Ontologies: A Silver Bullet for Knowledge Management and Electronic Commerce. New York: Springer, 2001.
‘Ontology Infrastructure for the Semantic Web’, includes detailed subject bibliography. WonderWeb Project, Department of Computer Science, Victoria University of Manchester, UK. http://wonderweb.semanticweb.org/deliverables/documents/D15.pdf
Standard Upper Ontology (SUO): An upper ontology for data interoperability, information search and retrieval, automated inferencing and natural language processing. IEEE Standard Upper Ontology (SUO) Working Group. http://suo.ieee.org
OWL - Web Ontology Language
The Web Ontology Language is a semantic markup language for publishing and sharing ontologies on the World Wide Web. OWL is developed as a vocabulary extension of the Resource Description Framework (RDF) and is derived from the DAML+OIL Web Ontology Language. For the development of this language see the documents of the Web Ontology (WebOnt) Working Group. http://www.w3.org/2001/sw/WebOnt
See also OWL Web Ontology Language Reference (W3C Working Draft, 31 March 2003) at http://www.w3.org/TR/owl-ref and OWL Web Ontology Language Guide (W3C Working Draft, 31 March 2003) at http://www.w3.org/TR/owl-guide
For an understanding of the goals, requirements and usage scenarios for a Web ontology language see ‘Web Ontology Language (OWL) Use Cases and Requirements’ (W3C Working Draft, 31 March 2003). http://www.w3.org/TR/webont-req
RDF - Resource Description Framework
See Cultural Heritage Semantic Web Example & Primer, pp. 32-34.
Semantic Web
‘The Semantic Web’, Tim Berners-Lee with James Hendler and Ora Lassila, Scientific American, May 2001. http://www.sciam.com/2001/0501issue/0501berners-lee.html
‘Enhanced Science and the Semantic Web’, J. A. Hendler, in Science magazine, Volume 299, Number 5606, 24 January 2003, pp. 520-521.
‘Peer-to-Peer: The Infrastructure for the Semantic Web’, Stanford University. The Semantic Web as the next evolutionary step of the Internet. http://p2p.semanticweb.org
The Semantic Web Community Portal SemanticWeb.org, currently operated by three research groups: the Onto-Agents and Scalable Knowledge Composition (SKC) Research Group at Stanford University, the Ontobroker-Group at the University of Karlsruhe, Germany, and the Protégé Research Group at Stanford University. http://www.semanticweb.org
The W3C Semantic Web Activity Statement explains the consortium’s plans in the areas of enabling standards (driven by the RDF Core and Web Ontology Working Groups), education and outreach (RDF Interest Group), as well as co-ordination and advanced development. http://www.w3.org/2001/sw/Activity
See also Kim Veltman’s warning of what he sees to be too narrow a definition of the Semantic Web, one that will not allow the historical dimension, the richness of cultural expression, the unique and the diversity of interpretations to be adequately dealt with. Cf. K. Veltman: Challenges for a Semantic Web (July 2002). http://www.cultivate-int.org/issue7/semanticweb
SHOE - Simple HTML Ontology Extensions
SHOE was one of the first ontology-based markup languages developed for use on the World Wide Web. It is a small extension to HTML that allows Web page authors to annotate their Web documents with machine-readable knowledge. See http://www.cs.umd.edu/projects/plus/SHOE
SMIL - Multimedia on the Web
The Synchronized Multimedia Integration Language (SMIL, pronounced ‘smile’) enables authoring of interactive audiovisual presentations. SMIL is typically used for ‘rich media’/multimedia presentations which integrate streaming audio and video with images, text or any other media type. SMIL is an HTML-like language and may be written using a simple text editor. W3C Synchronized Multimedia: http://www.w3.org/AudioVideo
SPECTRUM-XML DTD
SPECTRUM (Standard Procedures for Collections Recording Used in Museums) was created by the mda (http://www.mda.org.uk). It is a guide to good practice for museum documentation that describes procedures for documenting objects and the processes they undergo, as well as the necessary information that needs to be recorded to support the procedures. For SPECTRUM, an XML Document Type Definition has been produced which serves as a system-neutral interchange format for museum data.
‘SPECTRUM: The UK Museum Documentation Standard’ is available in its second edition, see http://www.mda.org.uk/spectrum.htm
For a description of the creation, structure and deployment of the SPECTRUM-XML DTD see Bert Degenhart-Drenth: Building on the mda SPECTRUM-XML DTD for Collections Management Data Interchange. Museums and the Web Conference 2001. http://www.archimuse.com/mw2001/papers/degenhart/degenhart.html
Web Services
Although the concept of Web Services is not a new one, the definition of Web Services standards allows the wider use of these services. Currently, three main protocols are used in the context of Web Services: UDDI (Universal Description, Discovery and Integration), a registry system to find resources and Web Services; WSDL (Web Services Description Language), an interface description language; and SOAP (Simple Object Access Protocol), the communication protocol for Web Services. All three protocols are based on XML.
For a description of the many ways in which XML can enhance Web Services see http://www.w3.org/2002/ws/Activity
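To show what ‘based on XML’ means for SOAP in practice, the following Python sketch builds a minimal SOAP 1.1 envelope and reads it back. It is a simplified illustration: the envelope namespace is the one defined by SOAP 1.1, but the service namespace, the operation name FindObjects and its parameters are hypothetical, and a real service would publish them in a WSDL description and exchange the message over HTTP.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope
SERVICE_NS = "http://museum.example.org/search"        # hypothetical service

def build_envelope(operation, parameters):
    """Wrap one operation call and its parameters in a SOAP envelope."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in parameters.items():
        ET.SubElement(call, f"{{{SERVICE_NS}}}{name}").text = str(value)
    return envelope

def read_call(envelope):
    """Recover the operation name and parameters, as a receiver would."""
    call = envelope.find(f"{{{SOAP_NS}}}Body")[0]
    operation = call.tag.split("}", 1)[1]
    arguments = {child.tag.split("}", 1)[1]: child.text for child in call}
    return operation, arguments

# Hypothetical call: search a museum catalogue for a keyword.
request = build_envelope("FindObjects", {"keyword": "astrolabe", "limit": 5})
wire_format = ET.tostring(request, encoding="unicode")  # XML text to be sent
operation, arguments = read_call(ET.fromstring(wire_format))
```

Because the message is plain XML, any system that can parse XML can take part in the exchange, whatever software produced the request.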
World Wide Web
Looking back as well as into the future: ‘Weaving the Web: The Original Design and Ultimate Destiny of the World Wide Web by Its Inventor’, Tim Berners-Lee with Mark Fischetti, Harper San Francisco, 1999.
XML - eXtensible Markup Language
See Cultural Heritage Semantic Web Example & Primer, pp. 27-30.
For the work done at W3C within the XML activity see XML Working Groups, http://www.w3.org/XML; the XML Specifications can be found at http://www.w3.org/XML/Core
‘The bane of my existence is doing things that I know the computer could do for me.’ - Dan Connolly, The XML Revolution, October 1998. http://www.nature.com/nature/webmatters/xml
‘The Semantic Web, as it is advocated by people like Tim Berners-Lee and James Hendler, does not take enough advantage of the experience built up in knowledge engineering and conceptual modelling. There is this anarchistic idea of the Web as a place where everyone can do his or her own thing. I have no problem with that; a lot of people are able to find what they want on the Web. But if you want real interoperability, with search engines that can grasp the intended meaning of information, that approach falls short. To create a real Semantic Web we have to develop and use well-founded generic ontologies, based on linguistics and logics.’
Nicola Guarino has clear views on the Semantic Web and its development. He is a senior researcher at the Institute for Cognitive Sciences and Technologies in Italy, where he leads the Laboratory for Applied Ontology. Since 1991 he has played an active role in the Artificial Intelligence community in promoting the interdisciplinary study of ontological foundations of knowledge engineering and conceptual modelling. Guarino: ‘In our Laboratory the focus is on content and not so much on representation. The use of ontologies is unavoidable when referring to content. People do it implicitly all the time when they are communicating and trying to understand each other. If we want machines to understand each other, in other words real interoperability, we need to make these ontologies explicit in an unambiguous way.’
An ontology is a hierarchical description of the relations between concepts in a certain domain, plus an unambiguous description of the concepts themselves. As they are created for a certain domain, ontologies often fail to be interoperable because of the ambiguity that results from the use of the same terms for different concepts (and vice versa) between different domains. The term ‘net’, for instance, has quite a different meaning for Web designers and fishermen. That is why there is a need for well-founded generic ontologies. An example of a term that a generic ontology has to pin down is ‘part’, which can have different meanings both within a domain and between domains. For instance, the violist plays a part in the orchestra. His finger is part of him. Can his finger be part of the orchestra? According to Guarino, this is a genuine ontological problem that can only be solved by giving an unambiguous meaning to the term ‘part’.
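The violist example can be turned into a small Python sketch. It is one possible reading, not Guarino’s own formal treatment: the two kinds of parthood (‘component’ and ‘member’), the extra ‘fingertip’ fact and the rule that parthood chains only within a single kind are all assumptions introduced for illustration.

```python
# Each fact is (part, kind of parthood, whole); kinds and facts are illustrative.
FACTS = {
    ("fingertip", "component", "finger"),
    ("finger", "component", "violist"),
    ("violist", "member", "orchestra"),
}

def is_part(part, whole, kind, facts=FACTS):
    """True if `part` reaches `whole` through links of one single kind."""
    frontier, seen = [part], set()
    while frontier:
        current = frontier.pop()
        if current in seen:
            continue
        seen.add(current)
        for p, k, w in facts:
            if p == current and k == kind:
                if w == whole:
                    return True
                frontier.append(w)
    return False

fingertip_in_violist = is_part("fingertip", "violist", "component")
violist_in_orchestra = is_part("violist", "orchestra", "member")
finger_in_orchestra = (is_part("finger", "orchestra", "component")
                       or is_part("finger", "orchestra", "member"))
```

Once the kinds of ‘part’ are kept apart, the fingertip counts as part of the violist and the violist as part of the orchestra, while the unwanted conclusion that the finger is part of the orchestra no longer follows.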
Another example cited by Guarino is the term ‘in’. What exactly are you describing when you say the spoon is in the cup? Does it mean that the spoon is totally embedded in the cup, or is it only partly in the cup? Guarino: ‘These examples seem trivial, but if you want real interoperability between different knowledge domains, you will have to prevent the problems that come with the ambiguity of day-to-day language.’
In this respect, Guarino thinks it is a drawback that computer science curricula scarcely ever contain an introduction to ontological foundations of conceptual modelling. ‘Students learn all about Java, HTML and C++, and name all the other languages, and they also learn how to use these. But when they graduate they hardly know a thing about formal ontology. I really think people should know more about the work on ontology that has been done in philosophy. It is certainly not much harder to acquire than, say, studying differential equations or learning how to use Java.’
It seems as if it is an enormous job to develop well-founded generic ontologies, but it is not as enormous a task as it appears. Guarino: ‘I would say that a few dozen would get you on the way nicely. But you have to take the fundamental route. At the moment, development of the Semantic Web is driven by the need for short-term results. Hence interoperability is realised by putting the right tags on the information. That is not what I call semantics, that is syntax. XML and RDF are very useful for this, but they fall short when you want to create a real Semantic Web.’
Laboratory of Applied Ontology: http://ontology.ip.rm.cnr.it
SEMANTIC WEB SHOULD BE BASED ON WELL-FOUNDED ONTOLOGIES
AN INTERVIEW WITH NICOLA GUARINO, LABORATORY OF APPLIED ONTOLOGY, ITALY
By Joost van Kasteren
A CULTURAL HERITAGE SEMANTIC WEB EXAMPLE & PRIMER
By Guntram Geser

The Semantic Web, according to a statement of its well-known advocate Tim Berners-Lee, is a vision of ‘a distributed machine which should function so as to perform socially useful tasks’.1 This machine should allow intelligent software agents to understand semantic relationships between Web resources in order to seek relevant information and perform transactions for humans.

Contrasted with the existing human-readable Web, the Semantic Web is envisaged as a Web of machine-readable data that will be based on ‘languages for expressing information in a machine processable form’.2 Key to an understanding of the Semantic Web, therefore, is how these languages function: how information is expressed so that computers can automatically process Web sources and assist in making the Web more useful for humans. The aim of this chapter is to provide an overview of the Semantic Web concept by describing its general architecture, i.e. the interplay of its languages.

The chapter has two interrelated parts. Part 1 describes a Finnish project that strives to build the foundations for the “Finnish Museums on the Semantic Web” (FMS), a future semantic museum portal. This part consists of the information boxes on the following pages, which briefly describe the necessary elements and steps in the set-up of the FMS system. It is recommended to start by reading this description (see also graphic 3 on page 36, which provides an overview of the set-up of the FMS system). It should be helpful in gaining a general understanding of how semantic interoperability of, and new ways of interacting with, semantically marked-up cultural heritage information can be realised.

Part 2, the texts below the information boxes, is a primer that explains terms used in part 1, which represent core elements of the Semantic Web architecture, and provides illustrative examples. The explanations are not intended to give in-depth definitions of these elements; such definitions are provided in the relevant W3C specifications. The examples have been kept as simple as possible, but build on each other. In this way, we will develop a (fictitious) Website, http://www.m-i.org, that provides semantically enhanced access to such marvellous medieval images as the ones we have used to illustrate this Thematic Issue.

How to Make Collection Metadata of Museums Semantically Interoperable on the Web – The “Finnish Museums on the Semantic Web” (FMS)

The Semantic Web concept is visionary, and there are dedicated people, also in the heritage sector, who are trying to make it a reality. In our example, a group of researchers and technology developers who work at the University of Helsinki and the Helsinki Institute for Information Technology are translating the Semantic Web vision into a future semantic museum portal.

The group’s two-year project will run until spring 2004 and is being carried out in co-operation with, and with funding from, major organisations including Espoo City Museum, Helsinki University Museum, National Board of Antiquities, Nokia, TietoEnator and the National Technology Agency (TEKES).

1 Tim Berners-Lee: Interpretation and Semantics on the Semantic Web (1998), http://www.w3.org/DesignIssues/Interpretation.html
2 Tim Berners-Lee: Semantic Web Road Map (1998), http://www.w3.org/DesignIssues/Semantic.html
3 Robert DuCharme, http://lists.xml.org/archives/xml-dev/200211/msg00190.html
The major goals of the project are to make collection metadata, which stem from heterogeneous databases, semantically interoperable on the Web, and to provide facilities for semantic browsing and searching in the combined knowledge base of the participating museums.
The project’s vision is called “Finnish Museums on the Semantic Web” (FMS), and its architecture allows for all Finnish museums to join in. However, in an approach of starting small but ambitious, the project is at present using the collection databases of two museums, the Espoo City Museum and the National Museum of Finland. Furthermore, the implementation is currently restricted to only one part of the collections: textiles.
In order to reach the FMS system’s goal of making the museums’ metadata semantically interoperable on the Web, the data must be harmonised on the syntactic and semantic level. For this harmonisation, the Extensible Markup Language (XML) and the Resource Description Framework (RDF) are being used, of which RDF is the key language for achieving semantic interoperability of the heterogeneous sets of metadata.
RDF and Metadata - A Natural Fit

An observer of the diffusion of the Resource Description Framework (RDF) into various domains, Robert DuCharme, has commented: ‘I still find it a little ironic that while RDF has gotten so much publicity as a technology for warm and fuzzy AI (Artificial Intelligence) pie-in-the-sky technology, it’s gotten most of its traction in the mundane world of metadata.’3

Yet, given the importance of metadata for the Semantic Web vision in general, it does not come as a surprise that metadata of key information communities belong to the first of RDF’s intended uses. RDF seems to gain momentum in particular among the library and other communities that use Dublin Core.
The current W3C RDF Primer (Working Draft, 23 January 2003), edited by Frank Manola and Eric Miller, labels RDF as ‘an ideal representation for Dublin Core information’ and describes Dublin Core as one of its ‘RDF in the field’ examples (cf. http://www.w3.org/TR/rdf-primer).
At the Dublin Core Metadata Initiative (DCMI), ‘Expressing Simple Dublin Core in RDF/XML’ was announced as a DCMI Recommendation in October 2002, ‘the first in a series of recommendations for encoding Dublin Core metadata using mainstream Web technologies’, i.e. XML/RDF/XHTML. ‘Expressing Qualified Dublin Core in RDF/XML’ is currently a Proposed Recommendation. Cf. http://dublincore.org/groups/architecture
Syntactic Transformation 1: Creating the XML Documents

In the FMS system, the Extensible Markup Language (XML) is used as the data transfer format. This transfer format enables the system to make use of the data originally stored in the museums’ heterogeneous collection databases. Therefore, each museum participating in the FMS initiative provides the relevant collection data as an XML document repository.
In a process of syntactic harmonisation, the data from a museum’s collection database are retrieved and transformed to an XML format conforming to the XML Schema of the FMS initiative.
The data to be published are read from the database through a ‘view’, which helps create the XML format. The view is a queryable interface, a virtual table that results from an SQL query, which may join multiple tables of the database. Through the view, the data are queried so that the rows of the tables are grouped by collection items. For each item, the set of rows is combined into a single XML document.
XML

XML is a markup language for describing data. It is a language created to allow anyone to design the structure of their own documents. An XML document contains text that consists of markup in the form of tags and plain text between them, the latter being just pure information (for example <creator>Alexander Master</creator>). XML tags are not predefined; everyone can define his or her own tags.
FMS Documents
A short presentation is provided in Eero Hyvönen et al.: Cultural Semantic Interoperability on the Web. Case Finnish Museums Online, http://iswc2002.semanticweb.org/posters/hyvonen_a4.pdf; for detailed descriptions see Vilho Raatikka, Eero Hyvönen: Ontology-based Semantic Metadata Validation, and Eero Hyvönen et al.: Semantic Interoperability on the Web. Case Finnish Museums Online. Both texts can be found in: Towards the Semantic Web and Web Services. Proceedings of the XML Finland 2002 Conference, http://www.cs.helsinki.fi/u/eahyvone/xmlfinland2002/ProceedingsXML2002-final.pdf
ITEM_VIEW (columns of the view): item_id, type, subject, iconclass, creator, manuscript, place, year
XML shares the syntax and bracketed tags of the well-known HyperText Markup Language (HTML), but XML serves a different goal. While HTML is used to define the layout of pages on the WWW, XML is used to define the content of documents, for example to specify that an area of text is the name of a creator.
XML allows for creating markup (e.g. <creator>) that seems to carry some semantics. However, for a computer, a tag like <creator> carries as much semantics as a tag like <H1>. A computer simply does not know what a creator is and how the concept creator is related to other concepts (e.g. manuscript). For an XML processor, <H1> and <creator> or <manuscript> are all equally (and totally) meaningless. XML is all about describing data; on its own it does not do anything. There needs to be a processing program that uses the markup to interpret the various elements.
The graphic below illustrates the database rows to XML process as described in the info box. It provides a very simple example of an XML document that describes some data for one of the medieval column miniatures from the Koninklijke Bibliotheek, The Hague, which we were permitted to use for illustrating this Thematic Issue. It includes the Iconclass classification for this image, 71A3421 Eve emerges from Adam’s body (for the hierarchical path of this classification, see the section on ontologies).
Short explanations for the XML document (image5kb78d38i.xml) shown in the graphic below:
A well-formed XML document is one that conforms to the XML syntax rules, of which we would like to highlight the following:
(1) The document should begin with the XML declaration (which, if present, must come first). It defines the XML version and the character encoding used in the document. In the example below we use <?xml version="1.0" encoding="ISO-8859-1"?>, i.e. the document conforms to the 1.0 specification of XML and uses the ISO-8859-1 (Latin-1/West European) character set.
(2a, 10) The XML document must contain a single tag pair to define a root element, in our example <mi:image> </mi:image>.
Database rows (item data from the database rows, grouped by item):
image5kb78d38i, ‘Column Miniature’, ‘Eve emerges from Adam’s body’, ‘71A3421’, ‘Alexander Master’, ‘Historic Bible’, ‘Utrecht’, ‘circa 1430’

XML document (image5kb78d38i.xml) produced by the rows to XML process:
(1) <?xml version="1.0" encoding="ISO-8859-1"?>
(2) <mi:image xmlns:mi="http://www.m-i.org/images"
        image_id="image5kb78d38i">
(3) <mi:type>Column Miniature</mi:type>
(4) <mi:subject>Eve emerges from Adam’s body</mi:subject>
(5) <mi:iconclass>71A3421</mi:iconclass>
(6) <mi:creator>Alexander Master</mi:creator>
(7) <mi:manuscript>Historic Bible</mi:manuscript>
(8) <mi:place>Utrecht</mi:place>
(9) <mi:year>circa 1430</mi:year>
(10) </mi:image>

Graphic 1: Database rows to XML process
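The rows to XML step shown in graphic 1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the FMS implementation: the function name `row_to_xml` and the column order are taken from the graphic for illustration only, and the SQL view itself is not modelled (a row is simply passed in as a tuple).

```python
import xml.etree.ElementTree as ET

MI_NS = "http://www.m-i.org/images"  # namespace of the fictitious m-i.org site
# Column order of the view, as listed in graphic 1 (item_id comes first).
COLUMNS = ["type", "subject", "iconclass", "creator", "manuscript", "place", "year"]


def row_to_xml(item_id, values):
    """Build one <mi:image> document from the values of a single view row."""
    ET.register_namespace("mi", MI_NS)  # serialise with the prefix "mi"
    root = ET.Element(f"{{{MI_NS}}}image", {"image_id": item_id})
    for name, value in zip(COLUMNS, values):
        child = ET.SubElement(root, f"{{{MI_NS}}}{name}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


row = ("image5kb78d38i", "Column Miniature", "Eve emerges from Adam's body",
       "71A3421", "Alexander Master", "Historic Bible", "Utrecht", "circa 1430")
xml_text = row_to_xml(row[0], row[1:])
print(xml_text)
```

The serialiser takes care of the namespace declaration and of closing every tag, so the output is well-formed by construction.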
(2b) Namespace: Since element names in XML are not fixed, name conflicts can occur when different documents use the same names describing different types of elements. To prevent such conflicts, a unique namespace should be defined using a Uniform Resource Identifier (URI). An XML namespace is a collection of names that are used as element types and attribute names (cf. http://www.w3.org/TR/REC-xml-names). The namespace declared in the start tag of our root element is xmlns:mi="http://www.m-i.org/images".
The namespace prefix mi (for medieval images) functions as a placeholder for the namespace name. It needs to show up in all element tags (e.g. <mi:type> </mi:type>).
(2c) image_id="image5kb78d38i": This is the id attribute, which contains the unique identifier of the data source record.
(3-9) All other elements must be within the root element and can themselves have sub-elements (child elements), which must be properly nested within their parent element. Our elements do not have sub-elements.
Other syntax rules are, for example: all start tags must match their end tags and, because XML tags are case sensitive (i.e. the tag <mi:Creator> is different from the tag <mi:creator>), they must also be written with the same case; all elements must have a closing tag; all attribute values must be within quotation marks (e.g. "image5kb78d38i").
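The well-formedness rules above can be checked mechanically: any conforming XML parser rejects a document that breaks them. The following minimal Python sketch uses the parser from the standard library; the helper name `is_well_formed` and the shortened sample document are illustrative assumptions.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the XML parser accepts the text, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


good = ('<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">'
        '<mi:creator>Alexander Master</mi:creator></mi:image>')

# Each variant breaks exactly one of the rules listed above.
bad_case = good.replace("</mi:creator>", "</mi:Creator>")        # tags differ in case
bad_quotes = good.replace('"image5kb78d38i"', "image5kb78d38i")  # unquoted attribute value
two_roots = good + good                                          # more than one root element

for name, text in [("good", good), ("bad_case", bad_case),
                   ("bad_quotes", bad_quotes), ("two_roots", two_roots)]:
    print(name, is_well_formed(text))
```

Note that well-formedness only concerns syntax. Whether the document uses the right elements in the right order is a question of validity against a schema.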
Syntactic Transformation 2: The XML Schema

In order to allow for syntactic harmonisation, the XML documents of the museums should conform to the XML Schema of the FMS initiative. Therefore, the museums use the initiative’s XML Schema when they create their XML documents and validate them against the Schema. If the documents are valid, the process can continue to the semantic level.
XML Schema

The XML Schema defines the building blocks of an XML document, including:
| elements and attributes that can appear in a document
| which elements are child elements, as well as their order and number
| whether an element is empty or can include text
| the data types for elements and attributes
| as well as default and fixed values for elements and attributes

XML with an XML Schema is designed to be self-descriptive. One of the greatest strengths of XML Schema is that it allows for data typing. The most common data types are xs:string, xs:decimal, xs:integer, xs:boolean, xs:date, xs:time. In the example below, which is the XML Schema for the XML document (image5kb78d38i.xml) shown in graphic 1, we only use the data type xs:string. This data type is used for values that contain character strings.
(1) <?xml version="1.0"?>
(2a) <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
(2b) targetNamespace="http://www.m-i.org/images"
(2c) xmlns="http://www.m-i.org/images"
(2d) elementFormDefault="qualified">
(3) <xs:element name="image">
(4) <xs:complexType>
(5) <xs:sequence>
(6) <xs:element name="type" type="xs:string"/>
(7) <xs:element name="subject" type="xs:string"/>
(8) <xs:element name="iconclass" type="xs:string"/>
(9) <xs:element name="creator" type="xs:string"/>
(10) <xs:element name="manuscript" type="xs:string"/>
(11) <xs:element name="place" type="xs:string"/>
(12) <xs:element name="year" type="xs:string"/>
(13) </xs:sequence>
(14) <xs:attribute name="image_id" type="xs:string" use="required"/>
(15) </xs:complexType>
(16) </xs:element>
(17) </xs:schema>

Short explanations:
(1) The XML declaration, which states that the document conforms to the 1.0 specification of XML.
(2a) Determines that the elements and data types that are used to construct the schema come from the W3C’s XML Schema namespace. Consequently, each of the elements and data types in the schema has the prefix xs, which identifies them as belonging to the vocabulary of the XML Schema language rather than the vocabulary of our (fictitious) organisation M-i.org.
(2b) Indicates that the elements defined by this schema come from our http://www.m-i.org/images namespace.
(2c) Is our default namespace.
(2d) Demands that any elements used by an XML document which were declared in this schema must be namespace qualified.
(3, 16) The parent element for the data typing of the image descriptions we provide at http://www.m-i.org/images. It is (4) defined as an xs:complexType, i.e. it contains child elements (6-12), which are (5, 13) surrounded by an xs:sequence element that defines an ordered sequence of these elements.
(6-12) The child elements, which in our example are simple types because they do not contain other elements. They define various elements of our XML documents, e.g. “type”, to be of the data type xs:string.
(14) Furthermore, the Schema determines that for the element xs:element name="image" there is a required attribute image_id of the data type xs:string (for example image5kb78d38i).
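To make the idea of validating a document against the schema concrete, here is a deliberately simplified Python sketch. It is not an XML Schema validator: it only reads the element order from xs:sequence and the required attributes from the schema, and compares them with a document. The function names `expected_structure` and `conforms` are assumptions made for this illustration; a real workflow would use a full XSD processor.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"
MI = "{http://www.m-i.org/images}"
FIELDS = ["type", "subject", "iconclass", "creator", "manuscript", "place", "year"]

# The schema from the text, with the seven child elements generated from FIELDS.
SCHEMA = (
    '<?xml version="1.0"?>'
    '<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"'
    ' targetNamespace="http://www.m-i.org/images"'
    ' xmlns="http://www.m-i.org/images" elementFormDefault="qualified">'
    '<xs:element name="image"><xs:complexType><xs:sequence>'
    + "".join(f'<xs:element name="{n}" type="xs:string"/>' for n in FIELDS)
    + '</xs:sequence>'
    '<xs:attribute name="image_id" type="xs:string" use="required"/>'
    '</xs:complexType></xs:element></xs:schema>'
)

VALUES = ["Column Miniature", "Eve emerges from Adam's body", "71A3421",
          "Alexander Master", "Historic Bible", "Utrecht", "circa 1430"]
DOC = (
    '<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">'
    + "".join(f"<mi:{n}>{v}</mi:{n}>" for n, v in zip(FIELDS, VALUES))
    + "</mi:image>"
)


def expected_structure(schema_text):
    """Read the child-element order and the required attributes from the schema."""
    schema = ET.fromstring(schema_text)
    complex_type = schema.find(f"{XS}element/{XS}complexType")
    children = [e.get("name") for e in complex_type.findall(f"{XS}sequence/{XS}element")]
    required = [a.get("name") for a in complex_type.findall(f"{XS}attribute")
                if a.get("use") == "required"]
    return children, required


def conforms(document_text, schema_text):
    """Check root name, child order and required attributes (nothing more)."""
    children, required = expected_structure(schema_text)
    root = ET.fromstring(document_text)
    actual = [child.tag for child in root]
    return (root.tag == f"{MI}image"
            and actual == [f"{MI}{name}" for name in children]
            and all(name in root.attrib for name in required))


print(conforms(DOC, SCHEMA))
```

Removing the image_id attribute or changing the order of the child elements makes `conforms` return False, which mirrors the role of use="required" and xs:sequence in the schema.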
Major Benefits of XML

In the context of the Semantic Web, XML provides an interoperable syntactical foundation upon which solutions to the issues of representing relationships and meaning can be built. We also want to highlight the many benefits of XML that are adding to its rapid uptake in the first place, and might in the longer term be supportive in realising the Semantic Web vision on a broader scale.
XML is one of the most important standards developments in recent years. It is an international, universal, non-system and non-application specific data exchange standard. XML is international because it employs Unicode. This means that there is no restriction to the western alphabet: Arabic, Chinese, Greek, Hebrew, Thai etc. can be easily integrated.
XML is non-system specific because it is an open standard set by the World Wide Web Consortium (W3C). As such, there is no owner of XML. All the major software suppliers support it; it can be used on any computing platform, from Windows and MacOS to Linux. This makes it easier for organisations to change systems or combine different systems.
XML is also non-application specific, i.e. it can be used in various applications such as data exchange, data harvesting, Web site management etc. XML is gaining ever-wider acceptance in many application domains, including in particular the cultural heritage community.
Bear in mind also that the major collection management software producers have implemented support for XML in their systems, enabling, for example, the integration of data from different collections and their combination over the Web.
XML allows for multi-channel publishing, i.e. with XML it is easy to produce different products or services from digital cultural heritage assets. Once the data are structured in XML, they can be displayed across a variety of media using an associated style sheet that contains the display information.
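The separation of content and display can be sketched as follows. In practice the display rules would live in a style sheet (for example XSLT or CSS); in this hedged Python illustration two small functions, `render_html` and `render_text` (names invented here), stand in for two style sheets applied to the same XML data.

```python
import xml.etree.ElementTree as ET
from html import escape

NS = {"mi": "http://www.m-i.org/images"}
DOC = ('<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">'
       "<mi:subject>Eve emerges from Adam's body</mi:subject>"
       "<mi:creator>Alexander Master</mi:creator>"
       "<mi:place>Utrecht</mi:place>"
       "<mi:year>circa 1430</mi:year>"
       "</mi:image>")


def field(root, name):
    """Return the text of the child element mi:<name>, or '' if it is absent."""
    return root.findtext(f"mi:{name}", default="", namespaces=NS)


def render_html(root):
    """'Style sheet' 1: a Web page fragment."""
    details = ", ".join(escape(field(root, n)) for n in ("creator", "place", "year"))
    return f"<h1>{escape(field(root, 'subject'))}</h1><p>{details}</p>"


def render_text(root):
    """'Style sheet' 2: a one-line caption, e.g. for a small display."""
    details = ", ".join(field(root, n) for n in ("creator", "place", "year"))
    return f"{field(root, 'subject')} ({details})"


root = ET.fromstring(DOC)
print(render_html(root))
print(render_text(root))
```

The XML document is never changed; only the rendering rules differ per channel.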
Finally, XML can be used to create new languages. For example, the Wireless Markup Language (WML), which is used to mark up Internet applications for handheld devices, is written in XML.
Ontological concepts

The goal of the FMS project is to make metadata of the museums’ textiles collections semantically interoperable on the Web. In order to achieve such interoperability, an ontology is being designed that describes the common (lower-level) ontological concepts in this domain of knowledge.4
Ontology

In the Semantic Web architecture, the semantic relationships are not embedded but explicitly represented by an ontology, or rather an interrelated set of ontologies. In fact, the wide array of information residing on the Web and the perceived need to make it more machine-processable have acted as a strong impetus for the development of ontology languages.
Yet what is an ontology? An abstract definition of an ontology is that it describes a formal, shared conceptualisation of a particular domain of interest, for example cultural heritage objects held in art museums. In particular, an ontology allows for constraining, expressing and analysing the intended meaning of the shared vocabulary of concepts and relations in a domain of knowledge.5
If these concepts and relations are formalised to a high degree, the domain has at hand a major building block for developing semantically aware information systems. With Semantic Web technologies, the domain ontology can be made available on the network, cross-referenced with upper-level and other domain ontologies,6 and remote applications or intelligent software agents can refer to it when they interact to provide a certain information service.

4 The available documents (in English) on the FMS initiative state that their ontology is being created using RDF Schema (RDFS). To develop a fully fledged ontology, advanced languages such as DAML+OIL or Web Ontology Language (OWL) would be required.
5 For more elaborate and formal descriptions see Tom Gruber: What is an Ontology? (1995), http://www-ksl.stanford.edu/kst/what-is-an-ontology.html; Nicola Guarino: Ontology-Driven Conceptual Modelling, part 1-3 (2002), http://ontology.ip.rm.cnr.it/Tutorials
The degree of formalisation of concepts and their relations varies considerably between different domains of knowledge. At the lower end, one finds lexicons and simple taxonomies (i.e. ordered classification systems where terms are related hierarchically). At the middle level, one might place thesauri, i.e. controlled vocabularies that are structured to show relationships between terms and concepts and, for example, allow for retrieving them from a database. At the high end of formalisation of knowledge, there are axiomatised logic theories. Such theories include rules to ensure the well-formedness and logical validity of statements expressed in the language of the scientific discipline.
In the cultural heritage sector, a powerful IT-supported example of a hierarchical classification system is Iconclass. This supports the documentation of images, in particular art historical images, by providing a systematic collection of 28,000 ready-made definitions of objects, persons, events, situations and abstract ideas that can be the subject of an image. The definitions consist of an alphanumeric classification code and its textual correlate.7
For example, for the image we have described in XML in graphic 1, the Iconclass definition is 71A3421 Eve emerges from Adam’s body. The Medieval Illustrated Manuscripts Website of the Koninklijke Bibliotheek, The Hague, has an Iconclass Browser in place8 that provides the hierarchical path for this concept in the classification system.
An outstanding example of a controlled vocabulary is the Art & Architecture Thesaurus (AAT), one of the Getty Research Institute’s Vocabulary Databases. It is a structured vocabulary of more than 125,000 terms and other information about concepts that are used for describing fine art, architecture, decorative arts, archival materials and material culture.9
The W3C’s Resource Description Framework (RDF) Schema Specification 1.0, in a section on its scope, mentions concept navigation and states: ‘Thesauri and library classification schemes are well known examples of hierarchical systems for representing subject taxonomies in terms of the relationships between named concepts. The RDF Schema specification provides sufficient resources for creating RDF models that represent the logical structure of thesauri (and other library classification systems).’10
Yet, for realising a full-blown cultural heritage ontology for the Semantic Web, there are currently limitations on both sides. On the one hand, hierarchical classification systems and structured vocabularies do not lend themselves easily to rich inter-linking of conceptual ‘trees’.
A major step further in this direction is the CIDOC object-oriented Conceptual Reference Model (CRM).11 This provides an ontology of 81 classes and 130 properties, which describes in a formal language concepts and relations relevant to the documentation of cultural heritage.
On the other hand, RDF Schema has limitations when it comes to expressing complex ontological relationships. New languages based on description logics are being developed. These include DAML+OIL and the upcoming Web Ontology Language (OWL), which are capable of fully describing ontologies.
Also worth highlighting is that tools for ontology building are proliferating at the present time. These ontology editors need to be carefully assessed, as their capabilities differ considerably.12
Iconclass hierarchical path for 71A3421:
7 Bible
71 Old Testament
71A Genesis from the creation to the expulsion from paradise, and later years of Adam and Eve
71A3 creation of man; the Garden of Eden (Genesis 1:26-2)
71A34 creation of Eve
71A342 Eve is fashioned from Adam’s rib
71A3421 Eve emerges from Adam’s body
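In the path listed above, every broader concept has a code that is a prefix of the narrower one. A minimal Python sketch of this lookup follows. It assumes a plain notation such as 71A3421; Iconclass notations with bracketed text, keys or doubled letters are not handled, and the dictionary holds only the seven entries quoted in this chapter.

```python
# The seven definitions quoted in the text (code -> textual correlate).
ICONCLASS = {
    "7": "Bible",
    "71": "Old Testament",
    "71A": "Genesis from the creation to the expulsion from paradise, "
           "and later years of Adam and Eve",
    "71A3": "creation of man; the Garden of Eden",
    "71A34": "creation of Eve",
    "71A342": "Eve is fashioned from Adam's rib",
    "71A3421": "Eve emerges from Adam's body",
}


def hierarchical_path(notation):
    """Return (code, text) pairs from the broadest concept down to the notation."""
    prefixes = (notation[:length] for length in range(1, len(notation) + 1))
    return [(code, ICONCLASS[code]) for code in prefixes if code in ICONCLASS]


for code, text in hierarchical_path("71A3421"):
    print(code, text)
```

This is what makes a hierarchical code useful for retrieval: a search for 71A34 (creation of Eve) can also return images classified with the narrower code 71A3421.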
6 Upper-level ontologies describe the basic concepts and relationships invoked when information about any domain is expressed in natural language.
7 For in-depth information see the official Iconclass Website, http://www.iconclass.nl
8 Medieval Illustrated Manuscripts Website, http://www.kb.nl/kb/manuscripts/browser. The subject access system for the Website was conceived by Mnemosyne Partners, building on the Iconclass classification system and technologies. See the valuable information they provide at http://www.mnemosyne.org/business/msstempex.html
9 A detailed description of the AAT is provided at http://www.getty.edu/research/tools/vocabulary/aat/about.html
10 http://www.w3.org/TR/2000/CR-rdf-schema-20000327
11 See the information box at the end of the Forum discussion and the sources mentioned in the Semantic Web Terms and Reading List.
12 Michael Denny: Ontology Building: A Survey of Editing Tools (06-11-2002), http://www.xml.com/pub/a/2002/11/06/ontologies.html
Semantic Transformation 1: The RDF Data Model

Progressing towards semantic interoperability, the metadata in the XML documents are now transformed into RDF statements corresponding to the RDF data model. With these so-called RDF ‘triples’, the XML metadata elements are mapped to the RDF classes and properties, which are defined by the RDF Schema of the FMS initiative (see section 3).
‘XML is nothing more than a way to standardize data formats. This is not to underplay XML’s importance. A data-format standard makes all of the more glamorous technologies possible, and RDF is the leading example of the benefit that comes once the data format has been standardized. Many proclaim that RDF is really the XML’s killer app, and with good reason. Despite all this, RDF remains somewhat obscure. This is mainly because, at its core, RDF is very abstract, very dry, and very academic.’
Uche Ogbuji: An introduction to RDF (2000), http://www-106.ibm.com/developerworks/library/w-rdf/?dwzone=xml
RDF Data Model

In order to make Web resources semantically interoperable, we need resources that provide machine-understandable information about themselves. In the Semantic Web architecture, these statements are built by using the Resource Description Framework (RDF).

RDF defines a data model for the statements describing typed relationships between uniquely identified resources. RDF distinguishes between:
| resources: familiar examples are a Web page, electronic document or digital image, but in RDF also entities that are not ‘network retrievable’, e.g. museums, curators or bound medieval manuscripts, can be resources
| properties: these identify a specific aspect, characteristic, attribute or relation used to describe the resource
| statements: these associate a value for a named property with the resource
Hence, RDF provides a model for describing relationships between resources in terms of named properties and values. The RDF data model intrinsically supports only binary relations. Its base element is the ‘triple’, which takes the form of subject, predicate, object: a resource (the subject) is linked to another resource (the object) through an arc labelled with a third resource (the predicate). The semantics of a triple clearly depends on the property used as predicate.
A convenient way to visualise this is to draw nodes for subject and object and an arrow between them for the predicate (see graphic 2). In this labelled directed graph, subject and predicate (property) are Uniform Resource Identifiers (URIs) and the object is either a URI or a literal (which is drawn as a box). Everything in RDF can be represented by a graph with nodes and arcs, and the data model allows for using the same URI as a node and as an arc label.

To represent RDF statements in a machine-processable way, RDF builds on XML. With RDF/XML, a specific XML markup language, RDF information can be represented and exchanged between machines.
Graphic 2: RDF Data Model

Triple 1
Subject: http://www.m-i.org/schemas/images#ColumnMiniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Miniature

Triple 2
Subject: http://www.m-i.org/schemas/images#Miniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Image

With these two triples we state that ColumnMiniature is a subclass of Miniature and that Miniature is a subclass of Image.

The predicate in our statements is the rdfs:subClassOf property, which is predefined in the RDF Schema namespace http://www.w3.org/2000/01/rdf-schema#

The classes Image, Miniature and ColumnMiniature would also need to be defined in our RDF Schema namespace http://www.m-i.org/schemas/images

For details on how to use RDF Schema for defining your domain ontology, see section 3. In a nutshell, RDF Schema is a ‘higher-level’ language which is itself defined using RDF.
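The two triples of graphic 2 can be written down directly as data. The Python sketch below is only an illustration of the subject, predicate, object model; the `Triple` type and the helper `objects` are assumptions made here, as is the use of ‘#’ to join the m-i.org namespace and the class names.

```python
from typing import NamedTuple

RDFS = "http://www.w3.org/2000/01/rdf-schema#"
MI = "http://www.m-i.org/schemas/images#"  # '#' as separator is an assumption


class Triple(NamedTuple):
    subject: str    # always a URI
    predicate: str  # always a URI
    obj: str        # a URI or a literal


graph = [
    Triple(MI + "ColumnMiniature", RDFS + "subClassOf", MI + "Miniature"),
    Triple(MI + "Miniature", RDFS + "subClassOf", MI + "Image"),
]


def objects(triples, subject, predicate):
    """Follow the arcs labelled `predicate` that leave the node `subject`."""
    return [t.obj for t in triples if t.subject == subject and t.predicate == predicate]


# Nodes are all subjects and objects; arc labels are the predicates.
nodes = {t.subject for t in graph} | {t.obj for t in graph}
arcs = {t.predicate for t in graph}
print(sorted(nodes))
print(objects(graph, MI + "ColumnMiniature", RDFS + "subClassOf"))
```

Because Miniature is the object of the first triple and the subject of the second, the two statements join into one connected graph.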
Semantic Transformation 2: Creating and Validating the RDF Statements

In the mapping process, an editor tool (in the FMS case a tool developed by the project team, called Meedio) receives as input the XML documents and assists in transforming them into semantically valid RDF statements (instance descriptions). The tool serves as instance editor, which provides a convenient way of finding and selecting, from the XML metadata elements, correct instance values for a particular property. The editor tool also serves as semantic metadata validator. When the museum cataloguer saves the set of RDF statements corresponding to an XML document, the semantics of these statements are validated against the property constraints of the FMS ontology. The result of a successful mapping and validation process is a unique set of RDF triples called the RDF card.
RDF Mapping Rules

When RDF is used to define the meaning of XML metadata elements, a set of mapping rules is created. A mapping rule is a template of RDF triples where XPath expressions are used to identify the actual element values. XPath is a language for addressing parts of an XML document, and was designed for use by XML processing software (XSLT, XPointer and others).
When applying such a rule to an XML document, the XPath expressions are instantiated with matching element values. If the rule matches, the RDF template evaluates to a set of RDF triples where XPath expressions are substituted by the corresponding values of the XML elements.

For example, by applying the template rule
<image5kb78d38i/type, mi:hasCreator, image5kb78d38i/creator>
to the XML document described in the section on XML, the following result would be obtained:
<Image Miniature, mi:hasCreator, ‘Alexander Master’>
mi:hasCreator is an example of an RDF property. Such properties are explained in section 3, RDF Schema.
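How a template rule is instantiated can be sketched as follows. This is a hedged illustration, not the FMS rule syntax: a rule is represented here as a Python tuple in which entries starting with "./" are treated as XPath expressions, and the function name `apply_rule` is invented. Python's standard library supports only a subset of XPath, which is sufficient for this case.

```python
import xml.etree.ElementTree as ET

NS = {"mi": "http://www.m-i.org/images"}
DOC = ('<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">'
       "<mi:type>Column Miniature</mi:type>"
       "<mi:creator>Alexander Master</mi:creator>"
       "</mi:image>")

# Hypothetical rule format: (subject, predicate, object); "./..." marks an XPath expression.
RULE = ("./mi:type", "mi:hasCreator", "./mi:creator")


def apply_rule(rule, document_text):
    """Replace each XPath expression in the rule by the matching element value.

    Returns None if any expression finds no element, i.e. the rule does not match.
    """
    root = ET.fromstring(document_text)
    triple = []
    for part in rule:
        if part.startswith("./"):
            element = root.find(part, NS)
            if element is None:
                return None
            triple.append(element.text)
        else:
            triple.append(part)  # fixed parts of the template are copied as they are
    return tuple(triple)


print(apply_rule(RULE, DOC))
```

A rule that refers to an element missing from the document, for example ./mi:place in this shortened sample, simply produces no triple.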
Note: Due to the limited space permitted, we do not address issues of term mapping. This is an important aspect of the mapping and validation process carried out in the FMS project. Working with metadata from different museums, the project team needs to deal with partly different terminologies. Their technical solution to synonymous terms (i.e. different terms referring to the same concept) is to attach synonym sets to the FMS ontology classes. With situations where polysemous terms occur (i.e. the same term refers to different concepts), the editor tool cannot cope, and the cataloguer needs to select the correct interpretation.
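The difference between synonymous and polysemous terms can be made concrete with a small Python sketch. All class names and terms below are hypothetical examples, not taken from the FMS ontology, and the functions `candidate_classes` and `map_term` are assumptions for this illustration.

```python
# Hypothetical synonym sets attached to ontology classes (illustrative terms only).
SYNONYM_SETS = {
    "mi:ColumnMiniature": {"column miniature", "miniature"},
    "mi:DecoratedInitial": {"decorated initial", "initial"},
    "mi:Portrait": {"portrait", "miniature"},  # "miniature" is polysemous here
}


def candidate_classes(term):
    """Return every class whose synonym set contains the term."""
    key = term.strip().lower()
    return sorted(cls for cls, terms in SYNONYM_SETS.items() if key in terms)


def map_term(term, choose=None):
    """Map a term automatically if it is unambiguous.

    For a polysemous term the decision is delegated to `choose`, which stands in
    for the cataloguer; without it the term stays unmapped (None).
    """
    candidates = candidate_classes(term)
    if len(candidates) == 1:
        return candidates[0]
    if candidates and choose is not None:
        return choose(candidates)
    return None


print(map_term("Initial"))               # synonym: resolved automatically
print(candidate_classes("miniature"))    # polysemous: two candidate classes
```

Synonyms are resolved by the lookup alone, while the polysemous term yields several candidates and needs a human decision, as described in the note above.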
Semantic Transformation 3: The RDF Schema (RDFS)

The shared ontology for the textiles domain is created by using Resource Description Framework Schema (RDFS). An RDF Schema is a tool for indicating the classes of resources one wants to describe, as well as for defining the properties used to describe those resources. Furthermore, class/sub-class relationships and property/sub-property relationships can be defined. The museums are mapping their metadata to the classes and properties defined by the RDF Schema of the FMS initiative. Thereby they are making the meaning of the metadata explicit and representing them in a harmonised, uniform way.
RDF Schema

In section Semantic Transformation 1, we have described the data model provided by RDF for expressing statements about Web resources. But we also need a vocabulary for the RDF statements, namely classes and properties defined with RDF Schema (RDFS).
In brief, the RDF Schema mechanism provides a pre-defined vocabulary, a basic type system that can be used in creating domain-specific schemas. Its role is to allow for declaring metadata properties (e.g. for ‘type’, ‘subject’ or ‘creator’), to define the classes of resources they may be used with, to restrict possible combinations, and to detect violations of those restrictions.
Defining classes
With RDF Schema (RDFS), Web resources can be defined as instances of one or more classes. In addition, classes can be organised in a hierarchical fashion. As we hold a collection of digital images drawn from illustrated medieval manuscripts, we first need to define a class of things that are images. In RDF Schema, a class is any resource having an rdf:type property whose value is the RDFS-defined resource rdfs:Class.
So, using the basic RDF data model, we define:
mi:Image [resource] rdf:type [property] rdfs:Class [value]
The self-defined prefix mi (for medieval images) stands for the URI reference of our RDF Schema namespace, http://www.m-i.org/schemas/images
In our image collection we have various special kinds of digitised images, such as column miniatures, decorated initials, schematic drawings etc. To distinguish, for example, the miniatures, we first need to define a general class Miniature and subclasses of miniatures, e.g. a subclass ColumnMiniature:
mi:Miniature rdf:type rdfs:Class
mi:ColumnMiniature rdf:type rdfs:Class
Secondly we need to define thatmiColumnMiniature is a subclass of miMiniatureand that miMiniature is a subclass of miImagefor which we use the predefined rdfssubClassOfpropertymiMiniatures rdfssubClassOf miImage miColumnMiniature rdfssubClassOf miMiniature
As the rdfssubClassOf property is transitive thismeans that miColumnMiniature is also implicitly asubclass of miImage
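The transitivity of the subclass relation can be illustrated with a minimal sketch in Python. It assumes the schema statements are held as plain (resource, property, value) tuples in memory; the helper name superclasses and the tuple representation are illustrative assumptions, not part of RDF Schema or of any particular RDF toolkit.

```python
# Minimal sketch: schema statements as (resource, property, value) tuples.
# The tuple representation and the helper below are assumptions made
# for illustration only.
RDF_TYPE = "rdf:type"
RDFS_CLASS = "rdfs:Class"
RDFS_SUBCLASS_OF = "rdfs:subClassOf"

schema = {
    ("mi:Image", RDF_TYPE, RDFS_CLASS),
    ("mi:Miniature", RDF_TYPE, RDFS_CLASS),
    ("mi:ColumnMiniature", RDF_TYPE, RDFS_CLASS),
    ("mi:Miniature", RDFS_SUBCLASS_OF, "mi:Image"),
    ("mi:ColumnMiniature", RDFS_SUBCLASS_OF, "mi:Miniature"),
}


def superclasses(cls, triples):
    """Return every class reachable from cls by following rdfs:subClassOf."""
    found = set()
    pending = [cls]
    while pending:
        current = pending.pop()
        for subject, predicate, value in triples:
            if (subject == current and predicate == RDFS_SUBCLASS_OF
                    and value not in found):
                found.add(value)
                pending.append(value)
    return found


# mi:Image is reached only indirectly, via mi:Miniature.
print(sorted(superclasses("mi:ColumnMiniature", schema)))
# → ['mi:Image', 'mi:Miniature']
```

Only the two explicit subclass statements are stored; the relationship between mi:ColumnMiniature and mi:Image is derived by following the chain.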
Graphic 2 on page 32 visualises this with the nodes and arcs of the basic RDF data model.
Defining properties
In order to make the meaning of our metadata (i.e. 'type') explicit, we need to be capable of declaring specific properties that characterise the classes of things we hold at http://www.m-i.org, e.g. digital images of medieval column miniatures.
Basically, RDF Schema defines properties in terms of the classes of resources to which they apply. This is the role of the rdfs:domain and rdfs:range mechanisms.
rdfs:range
The range constraint defines the class or set of classes whose instances can be values of a particular property. If we want to define the property mi:hasType, we must describe this resource (which we locate at http://www.m-i.org/schemas/images) with an rdf:type property whose value is rdf:Property:
mi:hasType [resource] rdf:type [property] rdf:Property [value]
The following RDF statements indicate that mi:ColumnMiniature is a class, mi:hasType is a property, and RDF statements using the mi:hasType property have instances of mi:ColumnMiniature as values:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:range mi:ColumnMiniature
rdfs:domain
The domain constraint restricts the set of classes whose instances may have a particular property attached to them. If we want to indicate that the property mi:hasType applies to instances of class mi:ColumnMiniature, we would write:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:domain mi:ColumnMiniature
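How domain and range declarations could be used to detect violations can be sketched as follows. This is a simplified illustration in Python under stated assumptions: statements are plain tuples, the instance names (mi:img001, mi:img002, mi:type01) are hypothetical, and the function violations is not part of any RDF library. Standard RDFS reasoners typically read these declarations as grounds for inferring class membership rather than as validation rules; the sketch follows the constraint reading used in this primer.

```python
# Minimal sketch, assuming statements are (resource, property, value) tuples.
# Instance names are hypothetical examples.
RDF_TYPE = "rdf:type"

schema = {
    ("mi:hasType", "rdfs:domain", "mi:ColumnMiniature"),
    ("mi:hasType", "rdfs:range", "mi:ColumnMiniature"),
}

data = {
    ("mi:img001", RDF_TYPE, "mi:ColumnMiniature"),
    ("mi:type01", RDF_TYPE, "mi:ColumnMiniature"),
    ("mi:img001", "mi:hasType", "mi:type01"),  # satisfies domain and range
    ("mi:img002", "mi:hasType", "mi:type01"),  # mi:img002 has no declared class
}


def violations(schema, data):
    """List (constraint, resource, expected class) for each unmet constraint."""
    declared = {}
    for subject, predicate, value in data:
        if predicate == RDF_TYPE:
            declared.setdefault(subject, set()).add(value)

    problems = []
    for subject, predicate, value in sorted(data):
        if predicate == RDF_TYPE:
            continue
        for prop, constraint, cls in sorted(schema):
            if prop != predicate or constraint not in ("rdfs:domain", "rdfs:range"):
                continue
            # Domain constrains the resource, range constrains the value.
            node = subject if constraint == "rdfs:domain" else value
            if cls not in declared.get(node, set()):
                problems.append((constraint, node, cls))
    return problems


print(violations(schema, data))
# → [('rdfs:domain', 'mi:img002', 'mi:ColumnMiniature')]
```

The check considers only directly declared classes; a fuller version would also take subclass relationships into account.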
Benefits of RDF
In a SearchWebServices.com definition of RDF, some benefits of RDF are mentioned:
| 'By providing a consistent framework, RDF will encourage the providing of metadata about Internet resources.
| Because RDF will include a standard syntax for describing and querying data, software that exploits metadata will be easier and faster to produce.
| The standard syntax and query capability will allow applications to exchange information more easily.
| Searchers will get more precise results from searching, based on metadata rather than on indexes derived from full text gathering.
| Intelligent software agents will have more precise data to work with.'13
This is a well-crafted listing of RDF benefits, from provision and exchange of better metadata to agents working with them, hopefully for the benefit of humans. But, as explicitly stated by SearchWebServices.com, these are only potential benefits, i.e. they depend on the level of actual uptake of RDF.
13 whatis.com / searchWebServices.com: Definitions - Resource Description Framework, http://searchwebservices.techtarget.com/sDefinition/0,,sid26_gci213545,00.html (last updated July 27, 2001)
Generating and Using the Knowledge Space
The RDF cards represent the original XML documents at the semantic level. The union of such RDF cards constitutes a knowledge base, which is a harmonised semantic representation of the underlying heterogeneous databases.
However, so far the RDF instance descriptions have not left the museum. The museum has complete control of the information it wants to publish, and it does not need to allow the FMS system access to its internal database system. The RDF data are placed in a public directory on the museum's WWW server.
The Web crawler of the FMS system harvests the instance descriptions from the different museums, and the system combines them into an RDF repository. This repository is a large semantic graph that consists of the shared ontology and metadata.
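The combination step can be pictured as a set union of statements. The following Python sketch is an illustration under stated assumptions, not a description of the FMS implementation: the museum names, the ex: identifiers and the function build_repository are hypothetical, and the harvested descriptions are assumed to be already parsed into tuples.

```python
# Minimal sketch: the repository as the union of the shared ontology and the
# instance descriptions harvested from each museum. All names are hypothetical.
ontology = {
    ("ex:Textile", "rdf:type", "rdfs:Class"),
    ("ex:Rug", "rdfs:subClassOf", "ex:Textile"),
}

harvested = {
    "museum-a": {("ex:rug1", "rdf:type", "ex:Rug")},
    "museum-b": {
        ("ex:rug2", "rdf:type", "ex:Rug"),
        ("ex:rug1", "rdf:type", "ex:Rug"),  # same statement published twice
    },
}


def build_repository(ontology, harvested):
    """Merge the shared ontology and all harvested statements into one graph."""
    repository = set(ontology)
    for statements in harvested.values():
        repository |= set(statements)
    return repository


repository = build_repository(ontology, harvested)
print(len(repository))  # → 4 (the duplicate statement is stored once)
```

Because every museum describes its objects with the classes and properties of the same schema, statements from different sources can be merged without further mapping.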
How does a user now search and navigate in this knowledge space? In the FMS system this is implemented by a server-side software called Ontogator. Based on the semantic graph, this software dynamically generates semantic linkages for the user's Web browser.
One way of using the FMS system is view-based filtering. The user can select classes of resources from the ontology, and the system finds the instances that match the selected class restrictions. By constraining classes (views) further, the collection instance data searched for are eventually found.
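The idea of view-based filtering can be sketched in a few lines of Python. This is only an illustration of the principle, assuming an in-memory set of statements; it does not reproduce how Ontogator works internally, and the ex: class and instance names as well as the functions classes_of and filter_by_views are hypothetical.

```python
# Minimal sketch of view-based filtering over (resource, property, value)
# tuples. All ex: names are hypothetical examples.
RDF_TYPE = "rdf:type"
RDFS_SUBCLASS_OF = "rdfs:subClassOf"

graph = {
    ("ex:Rug", RDFS_SUBCLASS_OF, "ex:Textile"),
    ("ex:Shawl", RDFS_SUBCLASS_OF, "ex:Textile"),
    ("ex:rug1", RDF_TYPE, "ex:Rug"),
    ("ex:rug1", RDF_TYPE, "ex:WoolItem"),
    ("ex:rug2", RDF_TYPE, "ex:Rug"),
    ("ex:shawl1", RDF_TYPE, "ex:Shawl"),
}


def classes_of(instance, triples):
    """Direct classes of an instance plus all their superclasses."""
    found = {v for s, p, v in triples if s == instance and p == RDF_TYPE}
    pending = list(found)
    while pending:
        current = pending.pop()
        for s, p, v in triples:
            if s == current and p == RDFS_SUBCLASS_OF and v not in found:
                found.add(v)
                pending.append(v)
    return found


def filter_by_views(selected, triples):
    """Return the instances that belong to every selected class."""
    instances = {s for s, p, v in triples if p == RDF_TYPE and v != "rdfs:Class"}
    wanted = set(selected)
    return sorted(i for i in instances if wanted <= classes_of(i, triples))


print(filter_by_views(["ex:Textile"], graph))
# → ['ex:rug1', 'ex:rug2', 'ex:shawl1']
print(filter_by_views(["ex:Textile", "ex:WoolItem"], graph))
# → ['ex:rug1']
```

Each additional class selected narrows the result set, which corresponds to constraining the views further.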
The software also supports topic-based navigation by providing semantic links between topics of interest, the creation of which is based on the collection domain ontology and the related metadata of the collection records. This means that the links also provide the user with an impression of the wider context and pragmatics of the objects in the museums' collections.
From human users to software agents
As described in the Finnish Museums on the Semantic Web example, the RDF repository is a large semantic graph that consists of the shared ontology and metadata of the participating museums. Such a repository can be queried, and the results, a set of pointers to the relevant resources, can be accessed using Web browsers. The opportunities provided by a system like the one developed by the FMS initiative (e.g. topic-based navigation) are at present restricted primarily to human users.
The Semantic Web vision includes intelligent software agents which 'understand' semantic relationships between Web resources and seek relevant information as well as perform transactions for humans.14
This software would be capable of autonomous action, i.e. could run without direct human control or constant supervision, and ideally is very flexible in doing this. Characterisations of this flexibility include actions that are 'reactive', 'proactive' and 'social' (see below).
While the basic idea of agents is very intuitive and appealing, the actual theory is complex, the tools are immature, the solutions small and prototype-based. In fact, as a parallel distributed systems technology, agents belong to the most complex class of software technology.
However, this primer will conclude with a summary of what an intelligent software agent is and what such software would generally be capable of doing. This should also serve as an indication of how great the challenge for research and technological development is to make the full Semantic Web vision a reality.
Intelligent Software Agents
The following definitions are taken from Michael Wooldridge's introduction to multiagent systems.15
Agent
'An agent is a computer system capable of autonomous action in some environment.'
Intelligent agent
'An intelligent agent is a computer system capable of flexible autonomous action in some environment.'
Flexible autonomous action
'By flexible autonomous action we mean reactive, proactive, social.'
| Reactivity: 'A reactive system is one that maintains an ongoing interaction with its environment, and responds to changes that occur in it (in time for the response to be useful).'
| Proactiveness: 'An agent serves a purpose and therefore exhibits goal-directed behaviour, including the capacity to recognise opportunities for useful courses of action.'
| Social ability: 'Social ability in agents is the ability to interact with other agents (and possibly humans) via some kind of agent communication language, and perhaps cooperate with others.'
Desirable further properties of agents are:
| Mobility: the ability to move around an electronic network.
| Rationality: an agent will act in such a way that it does not prevent itself from achieving its goals (as far as this is possible with a limited set of beliefs representing its world knowledge).
| Learning: an agent will improve its performance over time.
14 T. Berners-Lee, J. Hendler, O. Lassila: Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html
15 M. Wooldridge: An Introduction to Multiagent Systems. Chichester: Wiley, 2002, and http://www.csc.liv.ac.uk/~mjw/pubs/imas/
Resources
In the primer no references are made to the documents of the World Wide Web Consortium (W3C). All relevant W3C recommendations can be found at http://www.w3c.org
Of the wealth of introductory materials on XML and RDF available on the Web, the following in particular are useful to consult for further details:
http://www.w3schools.com/xml/
http://www.w3schools.com/schema/
http://www.w3.org/TR/rdf-primer/
K. S. Candan, H. Liu, R. Suvarna: Resource Description Framework: Metadata and its Applications. In: SIGKDD Explorations, Vol. 3/1 (2001), 6-19
Pierre-Antoine Champin: RDF Tutorial (2001), http://www710.univ-lyon1.fr/~champin/rdf-tutorial/rdf-tutorial.html
S. Decker, M. P. Mitra, S. Melnik: Framework for the Semantic Web: An RDF Tutorial, http://www.ida.liu.se/~asmpa/courses/sweb/rdf/rdf_tutorial.pdf
[Graphic 3: Set-up of the Finnish Museums on the Semantic Web System. Overview of the system's set-up. Components shown: collection databases 1 to n (relational schemas, DBMS); XML repositories (XML Schema: syntactic interoperability); metadata editor; RDF instances (RDF Schema: semantic interoperability); Web crawler; server with Ontogator software and RDF database holding the semantic graph (knowledge space of shared ontology and metadata); user client (WWW browser) with topic-based navigation and view-based filtering.]
THE DARMSTADT FORUM PARTICIPANTS
Wernher Behrendt, Salzburg Research, Austria
Wernher Behrendt is a Senior Researcher at the Sun Technology and Research Excellence Center at Salzburg Research (Austria), working on multimedia middleware and interoperation issues. He holds an MSc in Cognitive Science from Manchester University and has more than 10 years' experience in near-to-market IT research. From 1989 to 1995 he was a Senior Research Associate in the Informatics Department at Rutherford Appleton Laboratory (UK), working on embedded knowledge based systems in distributed multimedia presentation systems. From 1995 to 1998 he was a Senior Research Associate at Cardiff University (UK), working on interoperation between heterogeneous information systems. Mr Behrendt has held courses in Computer Science and has worked on projects ranging from software engineering methods and quality assurance to legacy system reengineering using migration methods and distributed systems middleware.
E-mail: wernher.behrendt@salzburgresearch.at
Paolo Buonora, Archivio di Stato di Roma, Italy
Paolo Buonora holds a degree in Philosophy from the University of Rome 'La Sapienza' (1976). He worked in the Italian State Archive Administration from 1978, where he was first involved in editing the Guida Generale degli Archivi di Stato italiani. From 1986 he worked in the Soprintendenza archivistica per il Lazio, surveying audiovisual archives, municipal archives; from 1989 to 1991 at the Perugia University, engaged in doctoral research in 'Urban and rural history'; and from 1991 to 1994 again in the Soprintendenza archivistica per il Lazio. After 1994 he worked in the Archivio di Stato di Roma, where he was responsible for the photograph service and several working groups on informatics application in archival documentation. From 1997 until the present time he has planned and directed the Imago II project in the Archivio di Stato di Roma.
See httpwwwasrmarchivbeniculturaliitsidimagoIMAGOIIenhtml
Samuel Cruz-Lara, University of Nancy 2 and LORIA, France
Samuel Cruz-Lara obtained a Master's degree in Computer Science in 1984 (University Henri Poincaré, Nancy 1) and a PhD degree in Computer Science in 1988 (National Polytechnic Institute of Lorraine). The central topic of his PhD thesis was the generation of integrated development environments by using attribute grammars.
He is currently Associate Professor at the University of Nancy 2 (Institute of Technology, Computer Science Department) and permanent Researcher at LORIA (Lorraine Laboratory for Research in Computer Science and its Applications – UMR 7503 – CNRS – INRIA – Universities of Nancy). He is a member of the 'Language and Dialogue' team and has conducted several research activities on distributed software architectures and textual linguistic resources management. He is currently working in the context of distributed architectures and multimedia resources management. Dr Cruz-Lara has participated in several projects, in particular CNRS/SILFIDE and MLIS-ELAN, and he is at present co-leader of the 'Digital Museum' project. This is a joint project between LORIA and the National Chi-Nan University. The 'Digital Museum' project is sponsored by the National Science Council of the Republic of China (Taiwan) and supported by INRIA (France).
Costis Dallas, Critical Publics SA, Greece
Costis Dallas is Chairman and Senior Researcher of Critical Publics (http://www.criticalpublics.com), a group of companies active in the field of strategic communications, creative design and technology. He is currently a Lecturer in the Department of Communication and Mass Media of Panteion University and has over 15 years of research and professional experience in hypermedia applications, human factor issues and cultural information systems. Dr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA, co-founder and Chair of the Multimedia Working Group of the International Council of Museums (CIDOC/ICOM), Head of Documentation and Systems of the Benaki Museum, General Director of the Foundation of the Hellenic World, Special Secretary of the Greek Ministry of Education in charge of libraries, archives and instructional technologies, and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: info@criticalpublics.com
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (http://www.nl.adlibsoft.com), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that he joined Databasix, producer of the ADLIB software. In 1991 Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was renamed later as ADLIB Information Systems.
ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include:
CIMI Dublin Core test bed - ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database, in addition to the 'standard' HTML/XML access.
SPECTRUM-XML - A project of the UK mda, together with a team of software vendors, to produce a schema for exchange of data which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software.
Internet Gelderse Musea (http://www.igem.nl) and Maritiem Digitaal (http://www.maritiemdigitaal.nl) - Two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.
E-mail: bert@nl.adlibsoft.com
Nicola Guarino, Italian National Research Council, Italy
Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991 and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS) and associated editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: Nicola.Guarino@ladseb.pd.cnr.it
Janneke van Kersen, Digital Heritage Association, The Netherlands
Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992 she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing, and more recently she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University.
Since October 1999 Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations such as museums, archives, archaeological organisations, monuments and special collections of libraries in a context-rich and XML-based environment. Interoperability, from both a technical and an organisational perspective, is therefore a major issue. The Association offers access to heritage information to the general public at http://www.cultuurwijzer.nl and customised access for the educational field at http://www.cultuurwijs.nl
E-mail: janneke.vankersen@den.nl
Marco Meli, EDW International, Italy
Marco Meli is CEO and co-founder of EDW International (http://www.edw-international.com), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia, VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meli@edw.it
Paul Miller, UKOLN, UK
Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, http://www.ukoln.ac.uk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK's Further and Higher Education Funding Councils and Resource, the Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI) and the Metadata Working Group of the UK Government's Office of the e-Envoy.
E-mail: P.Miller@ukoln.ac.uk
Frank Nack, CWI, The Netherlands
Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on 'The Application of Video Semantics and Theme Representation for Automated Film Editing' at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document, and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: Frank.Nack@cwi.nl
Franco Niccolucci, Florence University, Italy
Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference.
See also http://www.geog.port.ac.uk/hist-bound/people/niccolucci.htm
Seamus Ross, HATII, University of Glasgow, UK
Dr Seamus Ross is Director of Glasgow University's Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Network) (IST-2001-32706), a European Union funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously he was Assistant Secretary for Information Technology at the British Academy, and before that worked for a company specialising in expert systems and software development as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group's PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: S.Ross@hatii.arts.gla.ac.uk
Andrea Scotti, Institute & Museum for the History of Science, Italy
Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995 he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.
Since 1996 Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza, the National Central Library and under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (01/02/01-30/07/03), funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies.
See http://cweb.inria.fr/Projects/Mesmuses
E-mail: scottian@galileo.imss.firenze.it
DIGICULT PROJECT INFORMATION
DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over the period of 30 months (03/2002-08/2004).
In order to encourage early take-up, DigiCULT produces seven Thematic Issues and three Technology Watch Reports, along with the newsletter DigiCULT.Info.
DigiCULT draws on the results of the strategic study 'Technological Landscapes for Tomorrow's Cultural Economy (DigiCULT)' that was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications), in 2000 and completed in 2001.
Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at http://www.digicult.info
For further information on DigiCULT please contact the team of the project co-ordinator:
Mr Guntram Geser, guntram.geser@salzburgresearch.at, Phone: +43-(0)662-2288-303
Mr John Pereira, john.pereira@salzburgresearch.at, Phone: +43-(0)662-2288-247
Salzburg Research Forschungsgesellschaft
Jakob-Haringer-Str. 5/III
A - 5020 Salzburg, Austria
Phone: +43-(0)662-2288-200
Fax: +43-(0)662-2288-222
http://www.salzburgresearch.at
Project Partner:
HATII - Humanities Advanced Technology and Information Institute, University of Glasgow
http://www.hatii.arts.gla.ac.uk
Contact: Mr Seamus Ross, s.ross@hatii.arts.gla.ac.uk
The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom
DigiCULT Thematic Issue 1 - Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum, held in Barcelona on May 6th, 2002, in the context of the DLM-Conference 2002.
DigiCULT Thematic Issue 2 – Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum, held in Essen, Germany, on September 3rd, 2002, in the context of the AIIM Conference DMS EXPO.
DigiCULT Thematic Issue 3 - Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum, held on January 21st, 2003, at Fraunhofer IPSI, Darmstadt, Germany.
DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum, on Learning Objects, that will take place at the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, on July 2nd, 2003.
IMPRINT
This Thematic Issue is a product of the DigiCULTProject (IST-2001-34898)
Authors:
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy
Images:
Images for this Thematic Issue have been provided by and are reproduced with kind permission of the Koninklijke Bibliotheek – National Library of the Netherlands, The Hague, Netherlands.
Graphics & Layout:
Jan Steindl, Salzburg Research
ISBN 3-902448-00-8
Printed in Austria
© 2003
IMAGES
Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I. Image on p. 5 from Fol. 264r, size 80x75, illuminator: Orosius Master a.o.
Bible Historiale, Paris, c. 1320-1340, Volume I. Image on p. 7 from Fol. 2r, size 45x50, illum.: 'Sub-Fauvel' Master
Historic Bible, Utrecht, c. 1430, Volume I. Images on pp. 8, 10, 14, 28 from Fol. 3r, 3v, 4v, 7v, size 55x85 to 60x85, illum.: Alexander Master o.a.
Psalter Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468. Images on pp. 11, 13: Wednesday Matins, Invitatorium, from Fol. 219r
Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350. Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron) from Fol. 104rb1, 106rb2, 124r, 111ra, size 40x55 to 50x55
Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335. Image on p. 21 from Fol. 4va1, size 45x55
Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460. Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41
Psalter, Normandy, c. 1180. Image on p. 26 from Fol. 3v, size 160x125
Breviary, Cambray (?), c. 1275-1300. Images on pp. 27, 29, 31, 33, 34, 35, 37 from Fol. 211r, size 205x135
© Koninklijke Bibliotheek, The Hague. Used with permission.
POSITION PAPER
By Seamus Ross
Tim Berners-Lee and his colleagues at W3C have recognised that the real benefits of the web-based information revolution will come from enabling the interoperability of content. The current generation of web delivery is, they have argued, designed for human users, who struggle to make effective use of the billions of pages of information currently accessible. When we search for something at the moment, we sometimes discover suitable candidate information, but more often than not this is far from being the case. More than this, the entire process of searching, discovery and use is designed to be driven by humans. When we discover one piece of the puzzle, we need manually to position that information so that it can help us to search out the next piece of the puzzle. We find that Darmstadt is near Frankfurt. Then we find that there are flights from Glasgow to Frankfurt and there is a bus from Frankfurt Airport to Darmstadt. Then I search for timetables, make manual comparisons and decide which times best meet my requirements. In the Shangri-La that is the Semantic Web, my 'agent' would recognise from its regular review of my diary that I needed to be at a meeting in Darmstadt on the 21st of January 2003, and it would search out the options, analyse the timetables, identify the optimum travel arrangements, book my non-smoking hotel accommodation and order the taxi to take me to the airport. (It might even check the weather forecasts and warn me to bring particular types of clothing.) Certainly, to make this happen there has to be a fundamental shift in the way data, information and knowledge are represented on the web.
[Image: Genesis – The Creation: Division of Light and Darkness]
The proliferation of web-based resources makes finding what you are looking for increasingly difficult. According to Internet user studies, in 1996 50% of Internet users reported spending time looking for information without finding it, but by 2002 only about 40% of users ended their 'searching sessions' unsuccessfully. At first glance we might conclude that web discovery tools have improved and/or the information searching skills of users have improved. Over the past seven years the quantity of content has mushroomed, the search tools have become more efficient, developers approach the use of meta-tags more effectively, and anecdotal evidence suggests that the searching techniques of users have become more sophisticated. We should continue to be surprised by the high failure rate and wonder why it remains proportionally so high as the numbers of users have grown to nearly 600 million. In reality there is just too much content available. It is poorly described. It is not interconnected. Search engines themselves are blunt instruments. Most users of the web do not have very mature searching strategies and rarely use even the blunt instruments as effectively as they might. A solution is to make more of the information capable of discovery, interpretation and reuse by automated information processing tools themselves. However, the current ways content is represented on the web make it nearly impossible for machines to search the web meaningfully and effectively – even with the limitations of their skills and tools, humans are better at searching the web than the most powerful of the current generation of agents. The emergence of the Semantic Web would solve this problem.
The web has made us realise the tremendous potential of digital resources and made them widely available. Content as presented on the web currently is mute. By adding descriptive information to content and resources, and representing both the descriptive information and the content in well-defined, consistent and structured ways, 'mechanised agents' could be enabled to use web information 'intelligently'. Tim Berners-Lee, Jim Hendler and many other researchers believe that commercial and public sector institutions are increasingly recognising the benefits of ensuring that their content is adequately represented so that it is visible and discoverable within the context of the Semantic Web.
[Image: Genesis – The Creation: Division of the Waters Above and Below the Firmament]
The Semantic Web will enable the heritage sector to make its information available in meaningful ways to researchers, the general public and even its own curators. The public will be able to plan visits to institutions by, for example, dynamically relating opening times to public transport schedules. They will be able to use information to discover whether or not that vase in the attic or basement is really Ming, as their grandmother claimed, by comparing it to the holdings of heritage institutions across the world. Curators will benefit from the ability to define an exhibition and have the entire process, from the identification of the pieces to be shown in the exhibition to the production of the catalogue and publicity material, automatically handled by their ‘exhibition agents’.
TOWARDS AN INTEROPERABLE SEMANTIC WEB FOR HERITAGE RESOURCES
Delivering the Semantic Web to the heritage sector depends upon (a) the syntactical and semantic mark-up of content, (b) the development of better knowledge analysis and modelling tools, (c) widespread adoption of interoperable knowledge representation languages, and (d) the construction of suitable ontologies. In most of this the heritage sector is lagging behind. We have not yet successfully represented sufficient quantities of our data in ways that make it accessible to human web users, let alone in ways that would make it feasible for ‘mechanised agents’ to reason about it in meaningful ways. ‘Languages for representing data and knowledge are an important aspect of the Semantic Web’ (Klein 2001, 26). The languages that are currently the focus of the most substantial discussion, such as RDF, DAML+OIL and OWL,¹ do not necessarily provide a suitable framework for delivering the Semantic Web. This point has been increasingly argued in the literature, although in practice we still tend to emphasise the possibilities of representation mechanisms such as RDF(S) because it provides a flexible and extensible mechanism to represent metadata.
A debate is raging about which language should be used to represent semantics on the web. Resource Description Framework (RDF), an XML-based mechanism for expressing metadata, has been put forward at the basic level, but there is a growing body of opinion that indicates it does not have the richness that is necessary to make a suitable language. One of its shortcomings is that it cannot support syntax. In response other languages, such as DAML+OIL, have been developed. As an indication of the current levels of flux, in a fundamental paper Patel-Schneider and Siméon from Bell Labs Research remark that ‘…there is a semantic discontinuity at the very bottom of the Semantic Web, interfering with the stated goal of the Semantic Web. If Semantic languages do not respect World-Wide Web data, then how can the Semantic Web be an extension of the World-Wide Web at all?’ (2002a, 147).
The strength of XML is that it does not itself constrain how the data will be interpreted. While XML does not imply a specific interpretation of the data, how the material is marked up does constrain how it can be used. Fallside (2001) has made plain the weaknesses of using DTDs as a way of specifying semantic properties in XML (eXtensible Markup Language). XML Schemas offer a solution to these weaknesses, especially where those weaknesses arise from representational problems. On the other hand, the hierarchical nature of XML does not fit all domains, it ‘does not encode the data’s use and semantics’, and DTDs and XML Schemas do not specify the data’s meaning, although they do specify the names of elements and attributes. Will the Semantic Web produce different levels of sophistication in the representation of data and knowledge in the web-world? If it does, will this create a patchy representation of web information that will make the Semantic Web of limited value?
1 See the ‘Semantic Web Terms and Reading List’ in this Issue.
ONTOLOGIES – THE JEWELS OF THE SEMANTIC WEB
For the Semantic Web to succeed it will require not only modelling languages such as XML, RDF and OWL, but also methodologies for extracting and defining the knowledge that is to be represented. Decades of research and commercial attempts to exploit knowledge-based systems have demonstrated the complexity of knowledge modelling. Until there is such a methodology, the possibilities of XML (or any other technology) as a knowledge representation language will not be achieved.
The success of the Semantic Web will depend heavily upon the creation of suitable ontologies. To avoid adding new variants to definitions, we will follow James Hendler’s definition of ontology as ‘a set of knowledge terms, including the vocabulary, the semantic interconnections, and some simple rules of inference and logic for some particular topic’ (Hendler 2001, 30). One of the major hurdles facing us in building the Semantic Web is the lack of suitable ontologies. Languages such as OWL enable ontologies to represent ‘class taxonomies’ and provide mechanisms to enable their rapid development. For example, concepts and relationships can be established, such as ‘watercolour is a type of painting’ or ‘a necklace is a type of jewellery’. But what about their multilingual capabilities? An ontology may well know that a ‘watercolour is a painting’, but it does not necessarily mean that it knows that an ‘aquarelle is a type of painting’ or that a ‘watercolour is a type of peinture’. In addition, and probably first, we need to consider:
• Can we cost the creation of appropriate ontologies for the heritage sector?
• How can we prioritise the ontologies that are needed (e.g. which ones should the heritage sector develop and which ones will we be able to borrow from other sectors)?
• What heritage-based organisations should focus on ontology creation?
• Ontologies often fail to be interoperable. What solutions are there to this problem, and how can they be made to work effectively?
• Does OWL (the W3C’s Web Ontology Language, a semantic markup language for publishing and sharing ontologies) provide a suitable mechanism for ontology creation for the heritage sector?
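The multilingual gap described above can be made concrete with a small sketch. It is not OWL and does not use any ontology library: the concept identifiers, the ‘type of’ links and the French labels are all assumptions made up for illustration. The point it shows is that ‘aquarelle is a type of painting’ only becomes answerable once labels in each language are attached to the same underlying concept.

```python
# Hypothetical mini-ontology: concept identifiers, "is a type of" links,
# and language-tagged labels. All names here are illustrative assumptions.
SUBCLASS_OF = {
    "watercolour": "painting",
    "necklace": "jewellery",
    "painting": "artwork",
    "jewellery": "artwork",
}

LABELS = {
    "watercolour": {"en": "watercolour", "fr": "aquarelle"},
    "painting": {"en": "painting", "fr": "peinture"},
    "necklace": {"en": "necklace", "fr": "collier"},
    "jewellery": {"en": "jewellery", "fr": "bijoux"},
    "artwork": {"en": "artwork", "fr": "oeuvre d'art"},
}


def concept_for(term):
    """Return the concept id whose label in any language matches the term."""
    wanted = term.strip().lower()
    for concept, labels in LABELS.items():
        if wanted in (label.lower() for label in labels.values()):
            return concept
    return None


def is_type_of(term, broader_term):
    """True if `term` names a concept that is a (transitive) subtype of `broader_term`."""
    concept = concept_for(term)
    target = concept_for(broader_term)
    if concept is None or target is None:
        return False
    # Walk up the "is a type of" chain until it ends or the target is reached.
    while concept in SUBCLASS_OF:
        concept = SUBCLASS_OF[concept]
        if concept == target:
            return True
    return False


print(is_type_of("watercolour", "painting"))  # English label to English label
print(is_type_of("aquarelle", "peinture"))    # French labels, same concepts
print(is_type_of("necklace", "painting"))     # unrelated branches
```

Without the `LABELS` table the second query could not be answered at all, which is the situation the paragraph above warns about.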
Gómez-Pérez and Corcho (2002), in an analysis of ‘Ontology Languages for the Semantic Web’, found that the measure of expressiveness in the current generation of ontology creation languages is a spectrum from XOL, RDF(S), SHOE, OML and OIL to DAML+OIL at the richest end of the scale. Indeed, in their experience, while any of these languages will work for very simple ontologies, any attempt to use a weak language to create a complex ontology will fail.
Proof and trust are emerging as another central issue. How do we know that what our agent has discovered through its trawl of the Semantic Web can be trusted? Even in the case of ontologies, how should we decide whose ontology to trust? This is especially important where two ontologies may conflict with one another. Similarly, we are faced with the difficulties of ensuring and maintaining semantic integrity, and a lack of methods for testing its presence.
LEGITIMISING THE SEMANTIC WEB INVESTMENT
Heflin and Hendler (2001) make the valuable proposal that semantic mark-up should be seen as one aspect of web-page design. This, in their view, would go a long way to ensuring that the costs of this mark-up (and the underlying information analysis that is necessary to make it happen) were met at the appropriate stage in the process of putting material up on the web. However, Heflin and Hendler’s proposal that semantic mark-up should be embedded into web-page design fails to recognise that the fundamental fabric of the web is changing. For this to happen we need a stronger argument for the benefits that such investment will bring to the heritage sector.
Haustein and Pleumann (2002) have noted that the successful development of the World Wide Web benefited from two factors: ‘Participation was simple and the results of effort were immediately visible to the creator’. As they argue, while these two success criteria, best classified in my view as ease of use and instant gratification, were characteristic of the WWW, they are not embedded into the fabric of the Semantic Web. The Semantic Web is hard, and rewards are neither immediate nor assured. While in the long term it may bring tremendous benefits, the near-term take-up will be slow.
At least three other factors contributed to fostering the success of the web. Firstly, the early web developments concentrated on content creation and not on the creation of representation languages. The initial instantiation of HTML was simple, but it worked, and material tagged using it remained accessible. We were not forced to dispose of work that we had ‘webised’ unless we wished to replace its representation with more sophisticated ones. Secondly, the value of the content that we put up increased as more users put up content of their own, because the additional content attracted more users. Thirdly, to benefit you did not need to generate a lot of content; a very little would do, and you could incrementally add more later. Slowly heritage institutions found ways to take advantage of the opportunities offered by the web; there are still many small and medium-sized heritage institutions that have not.
[Illustration: Genesis – The Creation: Division of Sea and Earth; Creation of Trees and Plants]
Indeed, the heritage sector is likely to be left behind because the financial rewards for creating the mark-up necessary to make the Semantic Web a reality are only evident to the commercial sector. There can be little doubt that access to, and understanding of, the heritage would benefit from a world in which the vision of the Semantic Web were realised. But this is not the first information technology for which the benefits were promising. Even very simple strategies, such as the use of databases to enable collection description, have been shown over a period of nearly thirty-five years to bring benefits to heritage sector institutions through better knowledge about, care of, and access to their collections. In the ALM (archives, libraries and museums) sector only libraries can be said to have fully taken advantage of the technology to describe their collections, and even here a close look shows that this has not covered all their holdings, nor every institution. For instance, few libraries in the UK have online catalogues of their pre-1700 items, and almost none have accurately described their photographic holdings at anything deeper than collection level. The same can be said of museums, where descriptions are limited except, of course, at the major institutions. In 1997 a survey in the United Kingdom showed that small and medium-sized institutions were struggling to participate in the computer-based description of their holdings. This was even before they considered putting the output of those holdings online. I would argue that this should hardly be surprising, as the heritage sector has already been left behind in the development of online information in the web-world. Too few institutions have visible content that is actually usable. If the heritage sector is to make a near-term contribution to the development of the Semantic Web, that contribution is going to be very modest. It is very unlikely that developments will be related to reasoning about the heritage in the ways considered by Amann et al. (2002). The ALM sector is more than likely to participate in the development of the Semantic Web through the creation of semantic mark-up of information about access arrangements, such as opening hours and details of facilities. This information is more likely to be useful to the tourism agent described by Tim Berners-Lee in his 2001 Scientific American article. While this may be a very positive way of integrating the heritage into the Semantic Web, it does not maximise the potential benefits.
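To give a sense of how modest this kind of mark-up can be, here is a minimal sketch of access information held as simple subject-predicate-object statements and queried by a hypothetical visitor’s agent. The institutions, the predicate names (`name`, `opens`) and the hours are invented for illustration and do not follow any published vocabulary; a real deployment would presumably express the same facts in RDF.

```python
from datetime import time

# Hypothetical triples (subject, predicate, object); the institution names,
# predicate names and hours are invented for illustration only.
TRIPLES = [
    ("museum:a", "name", "Example City Museum"),
    ("museum:a", "opens", ("saturday", time(10, 0), time(17, 0))),
    ("museum:a", "opens", ("sunday", time(12, 0), time(16, 0))),
    ("museum:b", "name", "Example Archive"),
    ("museum:b", "opens", ("saturday", time(9, 0), time(12, 0))),
]


def open_institutions(triples, day, arrival):
    """Names of institutions whose stated hours cover `arrival` on `day`."""
    names = {s: o for s, p, o in triples if p == "name"}
    found = []
    for subject, predicate, obj in triples:
        if predicate != "opens":
            continue
        open_day, start, end = obj
        if open_day == day and start <= arrival < end:
            found.append(names.get(subject, subject))
    return sorted(found)


# An agent that knows a train arrives at 13:30 on Saturday could ask:
print(open_institutions(TRIPLES, "saturday", time(13, 30)))
```

The query itself is trivial; the hard part, as argued above, is getting every institution to publish even this much in a shared form.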
The days when a curator who wishes to hold an exhibition on the representations of Salome since the 15th century will be able to ‘load’ an agent with the request to identify, select, negotiate the loan of, and arrange the transportation of the key 100 works of art are a long way off. The fundamental descriptions of holdings are not currently available; where they exist they are not online, and they certainly have not been semantically encoded to make them usable by our Salome agent. For those who have worked on Knowledge Representation, the vision of the Semantic Web holds promise. Knowledge Representation is hard, especially if you intend any particular representation to be usable by others, either as a decision-making resource or for research purposes. Efforts in the 1980s and early 1990s, such as those in archaeology, failed. The reasons Knowledge Representation failed to achieve its promise ranged from the poor quality of knowledge extraction strategies, the lack of fundamental representation methodologies, the limited applicability of methods to knowledge domains, and the problems of boundary constraint and creep, to the high costs of developing applications. The Semantic Web could breathe new life into this earlier promise by providing ways to carve up the problem while bringing us immediate successes.
CONCLUSION
Over the next five years the possibilities offered by the Semantic Web will bring little near-term benefit for the heritage sector unless that sector co-ordinates its efforts to ensure that the fundamental building blocks that are necessary for the Semantic Web to be a success are put in place. We need some quick wins. A quick win involves identifying a domain that every institution can be encouraged to represent semantically, and the placing in the public domain of a ‘personalisable agent’ that can take advantage of these semantics. Three factors could underpin a quick win: (a) a narrowly restricted knowledge domain of real public value, (b) an accessible and narrow ontology, and (c) a personalisable tool for processing knowledge.
Ultimately, the same factors that constrain the heritage sector’s ability to take full advantage of the Web will constrain the penetration and pervasiveness of the Semantic Web in the heritage sector. The success of the Semantic Web in the heritage sector depends upon its adopting an XML-based approach and a significant experiment that demonstrates its benefits to the wider community. Even for all its weaknesses, the Semantic Web offers a tantalising solution to the problem of information overload created by the web, and the heritage sector needs to address how it can take advantage of the opportunities it offers.
Bibliography
Amann, B., Beeri, C., Fundulaki, I. and Scholl, M. 2002. Ontology-Based Integration of XML Resources, in I. Horrocks and J. Hendler (eds), The Semantic Web – ISWC 2002. Berlin: Springer, 117-131.
Berners-Lee, T., Hendler, J. and Lassila, O. 2001. The Semantic Web, in Scientific American, May 2001.
Candan, K.S., Liu, H. and Suvarna, R. 2001. Resource Description Framework: Metadata and Its Application, in SIGKDD Explorations 3.1, 6-19. http://www.acm.org/sigs/sigkdd/explorations/issue3-1/candan.pdf
Doan, A., Madhavan, J., Domingos, P. and Halevy, A. 2002. Learning to Map between Ontologies on the Semantic Web, in Proceedings WWW2002, 7-11 May 2002 (Honolulu), 662-673.
Gómez-Pérez, A. and Corcho, O. 2002. Ontology Languages for the Semantic Web, in IEEE Intelligent Systems, January/February 2002, 54-60.
Haustein, S. and Pleumann, J. 2002. Is Participation in the Semantic Web Too Difficult?, in I. Horrocks and J. Hendler (eds), The Semantic Web – ISWC 2002. Berlin: Springer, 448-453.
Heflin, J. and Hendler, J. 2001. A Portrait of the Semantic Web in Action, in IEEE Intelligent Systems, March/April 2001, 54-59.
Hendler, J. 2001. Agents and the Semantic Web, in IEEE Intelligent Systems, March/April 2001, 30-37.
Hendler, J., Berners-Lee, T. and Miller, E. 2002. Integrating Applications on the Semantic Web, in Journal of the Institute of Electrical Engineers of Japan 120.10, 676-680.
Klein, M. 2001. XML, RDF and Relatives, in IEEE Intelligent Systems, March/April 2001, 26-28.
McGuinness, D.L., Fikes, R., Hendler, J. and Stein, L.A. 2002. DAML+OIL: An Ontology Language for the Semantic Web, in IEEE Intelligent Systems, September/October 2002, 72-80.
Patel-Schneider, P. and Siméon, J. 2002a. Building the Semantic Web on XML, in I. Horrocks and J. Hendler (eds), The Semantic Web – ISWC 2002. Berlin: Springer, 147-161. http://www-dbs.cs.uni-sb.de/lehre/ss03/xml-seminar/Material/PS02.pdf
Patel-Schneider, P. and Siméon, J. 2002b. The Yin/Yang Web: XML Syntax and RDF Semantics, in Proceedings WWW2002, 7-11 May 2002 (Honolulu), 443-453.
Renear, A., Dubin, D., Sperberg-McQueen, C.M. and Huitfeldt, C. 2002. Towards a Semantics for XML Markup, in DocEng’02, 8-9 November 2002 (McLean, VA). ACM Publication.
Shah, U., Finin, T., Joshi, A., Cost, R.S. and Mayfield, J. 2002. Information Retrieval on the Semantic Web, in CIKM’02, 4-9 November 2002, 461-468.
DEVELOPMENT OF THE SEMANTIC WEB MUST BEGIN AT THE GRASS ROOTS LEVEL
AN INTERVIEW WITH JANNEKE VAN KERSEN, DUTCH DIGITAL HERITAGE ASSOCIATION, THE NETHERLANDS
By Joost van Kasteren
‘To be successful, the Semantic Web for the cultural heritage sector will have to develop from the grass roots level. A top-down approach, whereby institutions have to squeeze themselves into a certain format, is not going to work.’ Janneke van Kersen has strong views on the initiatives that are currently being undertaken to develop a Semantic Web. ‘They are not going to work for the cultural heritage institutions if you do not take into account the position that they are in. Especially not if the institutions are forced to overhaul their digitisation projects completely.’
Kersen graduated in Art History and did a postgraduate course on Historical Information Processing. Since 1999 she has been a consultant with the Dutch Digital Heritage Association (Vereniging DEN), which supports cultural heritage institutions, large and small, in developing strategies to face the digital future.
The key objectives of the DEN are to assist institutions in digitising and documenting their collections according to high quality standards, and assuring cross-domain and cross-institutional access to heritage information in a context-rich, structured environment. The methods used to realise these objectives are knowledge dissemination, best practice and standardisation. The Association propagates open standards like XML, OAI and Dublin Core (qualified).
The DEN has approximately 60 member institutions, among them most of the large heritage institutions of The Netherlands. It provides access to the databases of the member organisations through the portal http://www.cultuurwijzer.nl. The Cultuurwijzer (‘culture pointer’) to the collections uses the Aqua Browser to search for terms in a non-hierarchical, associative way. Databases can also be accessed through subject fields based on the Dublin Core standard. Research is being carried out into applying the Art and Architecture Thesaurus in a post-coordinative way, using it as an additional search aid.
Kersen: ‘The Dublin Core has some drawbacks, but it is one of the few international standards available for exchange of information. Mapped to the 5 Ws (who, where, what, when and why) it turns out to be a nice tool for interoperability across the databases of heritage institutions. Of course, we have to accept a certain kind of fuzziness and lack of precision compared with domain-specific access at the institutional level.’
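The interview does not say which Dublin Core elements were mapped to which of the five Ws, so the following is only a sketch of how such a mapping might work. The element names (`creator`, `date`, `coverage` and so on) are genuine Dublin Core elements; the grouping under who, where, what, when and why, the two sample records and the `search` helper are assumptions made for illustration.

```python
# Assumed grouping of Dublin Core elements under the five Ws. The element
# names are real Dublin Core elements; the grouping itself is a guess.
FIVE_WS = {
    "who": ["creator", "contributor", "publisher"],
    "where": ["coverage"],
    "what": ["title", "subject", "type"],
    "when": ["date"],
    "why": ["description"],
}


def to_five_ws(record):
    """Regroup a flat Dublin Core record (element -> value) under the five Ws."""
    view = {}
    for w, elements in FIVE_WS.items():
        values = [record[e] for e in elements if e in record]
        if values:
            view[w] = values
    return view


def search(records, w, text):
    """Titles of records whose values under one W contain `text` (case-insensitive)."""
    hits = []
    for record in records:
        values = to_five_ws(record).get(w, [])
        if any(text.lower() in v.lower() for v in values):
            hits.append(record.get("title", "(untitled)"))
    return hits


# Invented records standing in for two institutions' databases.
museum_db = [{"title": "Harbour at dusk", "creator": "A. Painter",
              "date": "1890", "coverage": "Rotterdam"}]
archive_db = [{"title": "Harbour board minutes", "publisher": "City Archive",
               "date": "1891", "coverage": "Rotterdam"}]

print(search(museum_db + archive_db, "where", "rotterdam"))
```

The ‘fuzziness’ Kersen mentions shows up directly: a painter and a publishing archive both count as ‘who’, which is less precise than either institution’s own catalogue but enough to search across both.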
According to Kersen, a real Semantic Web is still a long way off. ‘We simply do not have the tools yet for a meaningful exchange and representation of information. XML and RDF do not provide the interoperability that is needed. On the other hand, I do not believe in developing a fundamental ontology to give meaning to information on the Net. It looks to me like the 18th-century endeavour to write an encyclopaedia that contains all the knowledge in the world. I am afraid it does not work that way. A lot of knowledge, even scientific knowledge, cannot be described in a logical way. Especially in the arts, a lot of “knowledge” is the result of heuristics and associative thinking. Apart from that, there is the practical problem that cultural heritage institutions do not have the money and the staff to describe their collections anew in a way that fits the ontology. That is just too much work.’
Kersen thinks the Semantic Web will grow gradually from the grass roots level onward. Within the Dutch Digital Heritage Association she can point at several initiatives. Considered on their own they might not seem like much, i.e. in relation to the gargantuan task of developing a Semantic Web, but when they are combined a certain pattern begins to emerge.
There is, for instance, a project, SitecheckerRDF, which will look into ways in which RDF can be used for describing content for Web-based delivery. Furthermore, several standardisation projects are running that enable the participating institutions to develop the description of their specific knowledge domain, e.g. graphic domain, religion, art history. The formal and semantic mapping schemes used in these projects will include Dublin Core, Encoded Archival Description (EAD), IMS Learning Resources Metadata Specification, Art & Architecture Thesaurus, as well as the CIDOC reference model. Kersen: ‘At the moment most of the reference terms are developed at the level of institutions, which means their use is limited to a certain domain. Or, to put it another way, every domain is developing its own dialect. Maybe the development of a combined reference scheme will be a step towards a Semantic Web.’
Another important project is the development of a scheme for description at the collection level, in order to offer a clearer and more hierarchical access to heritage collections. This project has its roots in the Dutch project for collection level description, MUSIP (Museum Inventarisation Project). The description scheme will be broadened to make it available for other heritage institutions as well, for example archives.
The description and results of these projects and the programme lines will be made available on the Web site of the Vereniging DEN, http://www.den.nl. As Cultuurwijzer is used as a proof of concept, the results will be directly accessible at http://www.cultuurwijzer.nl. (Kersen kindly invites interested parties to put questions directly to their organisation.)
Projects are always carried out in co-operation with the member organisations. Kersen: ‘We first try it ourselves until we are sure it works, a “proof of concept” you could say. These tests are overseen by a small group of automation experts working for our member organisations. The next step is to test the method on a larger scale with some of our member organisations. A larger working group oversees these tests. Only then is the method released to our member organisations. The advantage is that smaller member organisations can ride on the experience of the larger ones. By taking a step-by-step approach we also enhance the level of commitment. You could say we are providing some order in the information chaos that exists on the Internet. A few small steps along the long road towards the Semantic Web.’
Vereniging Digitaal Erfgoed Nederland: http://www.den.nl
Cultuurwijzer: http://www.cultuurwijzer.nl
DIGICULT’S EXPERT 13 TANGLE WITH THE SEMANTIC WEB
By Michael Steemson
It was really very kind of the Moderator. He asked the cultural heritage experts: ‘What is the message that we should give our scientific writer who will do a write-up of this meeting for the Thematic Issue?’ Their inclinations had not always been clear. But the question focused minds, and they were certain now.
‘I would put my money on the Semantic Web,’ said two of them, not quite in unison. ‘The Semantic Web is a direction, it is like North. You go North, but you never arrive and say “here it is”. This is the Semantic Web,’ said another.
The course of the debate at the Darmstadt DigiCULT Forum had not always been so direct. It had started with the Position Paper’s dismaying thought that ‘the limited understanding of information processing in the heritage sector almost makes the Semantic Web an impossibility to apply’.
It had touched on the semantics of Simeon poetry, art works of the biblical seductress Salome, weather forecasts for the northern English city of York, and the revolutionary theories of 16th-century Italian astronomer Galileo. But the experts agreed, finally, that the cultural heritage sector needed the Semantic Web and that a good deal of education and guidance would be required to make it appreciate that need.
[Illustration: Genesis – The Creation: Birds and Fishes]
The experts numbered 13, lucky for some, at this, the third DigiCULT Forum of the European Union’s technology watchdog for cultural and scientific heritage institutions. In the previous 12 months, other forum groups had discussed authenticity and integrity for digitisation programmes and, later, digital asset management systems. Now the Darmstadt 13 – historians, language and information technology scientists, academics and publishers – were looking even further down the information autobahn to the vision of WWW inventor, Englishman Tim Berners-Lee, who sees a new kind of automated Web that learns and understands each user’s particular requirements and delivers complete, reliable, tested information sets.
In a co-authored May 2001 Scientific American article,¹ Mr Berners-Lee imagines a family facing the horrors of re-scheduling its lives around a mother’s unexpected illness. The sons and daughters rely on Semantic Web ‘agents’, small executable Web files, to search online medical records, hospital bed lists, transport timetables, doctors’ appointment books, road condition reports and home diaries to find treatment, plan travel and re-arrange personal engagements to fit the emergency.
The vision requires huge world-wide investment in time and effort, creating countless ‘ontologies’ containing, perhaps, XML (eXtensible Mark-up Language) and RDF (Resource Description Framework) data to which the electronic ‘agents’ could refer for understanding before applying to specially formatted Web pages for the information.
The dazzling forecast of Berners-Lee et al. is: ‘The Semantic Web will enable machines to comprehend semantic documents and data, not human speech and writings. Properly designed, the Semantic Web can assist the evolution of human knowledge as a whole.’
THE DAZZLING PROSPECTS
Dazzling it is, and the Darmstadt 13 were attracted. But they were not blinded. Moderator Dr Seamus Ross, the Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII), suggested that while the current WWW content got a lot of ‘bang’ for its development dollars, the Semantic Web needed huge, expensive content before it could work well.
Application of the Berners-Lee ideas to cultural heritage use was a long way off, he thought, and wondered: ‘Is there enough benefit from the Semantic Web in the near term to make it a realisable dream 50 years down the road?’
Italian National Research Council Applied Ontologies Laboratory director Nicola Guarino has been working on the subject for 12 years, and he knows the difficulties. He said: ‘This is the ideal view which Tim Berners-Lee has: machines which work for you, your proxy which works for you, performing these dynamic connections for the Web which preserve meaning. It is pretty ambitious, but this is his idea. I would be happier if, rather than using an automatic proxy, we could just let people establish these dynamic connections using their brain and the Web. This is already something that is not easily done.’
Austria’s Wernher Behrendt had encountered other snags. At Salzburg Research, the secretariat for the DigiCULT Forums, he co-ordinates another European Commission IST project, CULTOS (Cultural Units of Learning – Tools and Services). He conceded: ‘There is a 50-year research vision behind the issue of the Semantic Web,’ and went on, ‘but there are incremental steps that, with good utility, can be built in a reasonable time. One of the intellectual challenges is to break the vision into these manageable steps.’
LANGUAGE REPRESENTATION HITCHES
A CULTOS group had, Behrendt explained, taken one of these incremental steps and built an ontology² for digitised works of art. It had encountered problems with language representation, like: ‘Are there tools to support knowledge representation language? Are the users then actually able to work usefully with that? Can we incorporate the multimedia authoring component, where people who have not built the ontology themselves should then use it to combine multimedia assets with each other?’
1 T. Berners-Lee, J. Hendler, O. Lassila: The Semantic Web. In: Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html
2 Ontology, n. 1. Philosophy. The branch of metaphysics that deals with the nature of being. 2. Logic. The set of entities presupposed by a theory. Collins English Dictionary, Third edition, Glasgow 1991.
Cultural Units of Learning – Tools and Services (CULTOS)
CULTOS is an RTD project co-funded by the European Commission under the Information Society Technologies (IST) Programme, which will run until October 2003. The application domain of CULTOS is intertextual studies in literature and arts. The project is developing a multimedia authoring and presentation environment that allows scholars to make the different relationships between cultural works explicit in a way that approximates contextualisation in interpretative processes. The result of the authoring processes is multimedia objects called ‘intertextual cultural threads’. These are based on ‘EMMOs’, a novel type of structured multimedia object containing expert knowledge, conforming to current and emerging standards such as XML, SMIL (with interactive extensions), MPEG-7 and RDF. http://www.cultos.org
The Dutch had doubts, too. Dr Janneke van Kersen is an art historian with Digital Heritage Netherlands (Digitaal Erfgoed Nederland, http://www.den.nl), where an XML-based content management system is being combined with a Resource Description Framework (RDF) to join databases from several cultural heritage institutions. She told the experts: ‘I need to be assured that we will be able to build a layered structure that is equally applicable to each knowledge domain. Furthermore, I think that the cultural heritage sector is too much of a niche market to develop this.’
Her countryman, Dr Frank Nack, from the national research institute for mathematics and computer science, CWI (Centrum voor Wiskunde en Informatica, http://www.cwi.nl) in Amsterdam, works with a multimedia and human computer interaction group. His concern: ‘Our group believes in the Semantic Web, but we needed some mechanisms to structure the information so that various groups can work with it. What we came up with was the belief that you can classify the user at a particular time. But that was simply not good enough.’
The group had found that users change their requirements widely, and these shifts were invisible to a system. ‘Humans can look at material one day and the next day they look at the same stuff differently and describe it differently because they are in a different mood,’ he said. He characterised the problem as: ‘Now I would like to see something for my work, and now I want to be entertained. Which means I would like to access the information differently.’
He had one other worry: webised mixed media. He said: ‘This discussion has been heavily linguistic based, which I can understand because most people do still think of the Web as text driven. The issue of describing various media items that are not text will, I think, very soon become important for the Semantic Web. We had better start thinking about that, too.’
THE GALILEO CONUNDRUM
The Institute and Museum of Science History in Florence, Italy, has tried to create an ontology around the works and sciences of its city’s famous son, the revolutionary astronomer, mathematician and physicist Galileo Galilei (1564-1642).³ But it ran into difficulties when it came to the radical changes in theory that he created.
Institute relational database expert Andrea Scotti told the Forum: ‘Galileo’s scientific theory negates another scientific theory. This negating or development of theories was very difficult to represent in the ontology when dealing with the time factor. That is central to historical documentation, but representational time was not part of the process available to us.’
Dr Costis Dallas, the Athens chairman of the European communication and technology group Critical Publics, thought the Florence museum’s project was very ambitious. The time argument was difficult because ‘of course, it isn’t possible to represent time properly within a relational database’. But there were mechanisms - he mentioned software by the Virginia, US, IT group Telos4 - that better represented issues of time.
Italian National Research Council’s Nicola Guarino chipped in: ‘The CIDOC5 reference models have partial answers to these questions.’
Dr Dallas went on: ‘I do not believe that the whole exercise is futile, but we found that in practice you cannot make a subject language for everybody. It has to be for a community of users. If you provide them with a richer representation, for instance if they can know that this is a person and this person lived in a place, then users will have a much richer experience.’
3 The museum and Web site are rich resources for the life and work of Galileo, http://galileo.imss.firenze.it
4 Telos Corporation, http://www.telos.com, Ashburn, VA, US
5 CIDOC: International Committee for Documentation of the International Council of Museums (ICOM-CIDOC), httpwwwwillpowerinfomybycoukcidocCIDOCe. Forum for the documentation interests of museums and related organisations; one of 25 international committees of the International Council of Museums (ICOM).
CAPITALS AND ACRONYMS
The Forum moved on to a discussion of the science and software behind the Semantic Web. Moderator Seamus Ross started by questioning whether ‘key thinkers’ were ‘missing a fundamental point: that Web pages are dead, that database-driven Web pages are the future, that people are going to stop making Web pages and make databases’.
He recalled that James A. Hendler, co-author with Tim Berners-Lee of the Semantic Web paper and a professor in the Department of Computer Science at the University of Maryland, had recommended semantic representation as part of any Web page. Dr Ross commented: ‘This notion of the Semantic Web is not going to work with these databases. Is this true or not?’
Bert Degenhart-Drenth, the managing director of Netherlands ADLIB Information Systems (httpwwwnladlibsoftcom), thought it was true. But it was a practical problem that would be solved
Open Archives Initiative (OAI) Protocol for Metadata Harvesting
The Open Archives Initiative (OAI) is a mainly US-based group of people and organisations that evolved out of a need to increase access to scholarly publications through interoperable digital repositories. Support for the OAI’s goals comes from the Digital Library Federation, the Coalition for Networked Information, and from an NSF grant. One of its major achievements is an application-independent interoperability framework based on metadata harvesting, the OAI Protocol for Metadata Harvesting.
The OAI Protocol is based on the standard Web protocols HTTP and XML, and employs Dublin Core (unqualified) as metadata standard. Heritage organisations who have systems that support the OAI Protocol can expose metadata about the content in their repository, i.e. allow service providers to harvest the data for services such as search engines. In the OAI Protocol, XML Schema is used at two levels: to define the format of responses to all OAI Protocol requests, and to define the format of metadata streams embedded in the GetRecord and ListRecords responses. In both cases, the goal is to provide a mechanism for data validation. http://www.openarchives.org
The OAI-PMH version 2.0, released in June 2002, can be found at http://www.openarchives.org/OAI/2.0/openarchivesprotocol.htm
See also: John Perkins, A New Way of Making Cultural Information Resources Visible on the Web: Museums and the Open Archives Initiative. Museums and the Web Conference 2001, http://www.archimuse.com/mw2001/papers/perkins
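As a rough illustration of how a service provider might address such a repository, the sketch below builds OAI-PMH request URLs for the ListRecords and GetRecord verbs, asking for unqualified Dublin Core (metadata prefix oai_dc). The base URL, the record identifier and the helper name build_oai_request are hypothetical placeholders chosen for this example, and nothing is sent over the network.

```python
from urllib.parse import urlencode

# Hypothetical repository endpoint; a real institution publishes its own base URL.
BASE_URL = "http://repository.example.org/oai"


def build_oai_request(base_url, verb, **arguments):
    """Return an OAI-PMH request URL for the given verb and its arguments."""
    params = {"verb": verb}
    params.update(arguments)
    return base_url + "?" + urlencode(params)


# Ask for all records, expressed as unqualified Dublin Core.
list_url = build_oai_request(BASE_URL, "ListRecords", metadataPrefix="oai_dc")

# Ask for a single record; the identifier is an invented example.
get_url = build_oai_request(
    BASE_URL,
    "GetRecord",
    identifier="oai:repository.example.org:object-42",
    metadataPrefix="oai_dc",
)

print(list_url)
print(get_url)
```

A harvester would fetch these URLs over HTTP and validate the XML responses against the schemas named in the protocol; that step is left out here to keep the sketch self-contained.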
Open Archives Forum Project, http://www.oaforum.org
The Open Archives Forum Project is a two-year Fifth Framework Programme IST accompanying measure that will run until September 2003. The Forum is building a Web-based database on OAI-related projects, software implementations and services, and supports the information exchange between OAI user communities.
Their surveys provide good insight into the status of uptake of the OAI in Europe. For an overview of the results, see S. Dobratz, B. Matthaei: Open Archives Activities and Experiences in Europe. An Overview by the Open Archives Forum. In: D-Lib Magazine, Vol. 9, No. 1, January 2003, http://www.dlib.org/dlib/january03/dobratz/01dobratz.html
Recently, one of their workshops, ‘Providing Access to Hidden Resources’ (Lisbon, December 2002), targeted the libraries and archives communities. Requirements, standards, best practice and solutions to interoperability problems of these communities were analysed and compared with the features provided by the OAI Protocol for Metadata Harvesting. The tutorial ‘OAI and OAI-PMH for Beginners’ and other presentations can be found at http://www.oaforum.org/workshops/lisb_programme.php
by projects like the Web Services of the Open Archives Initiative (OAI).
Several members spoke of difficulties created by dynamic Web pages generated from ASP databases. Paul Miller, the UK Interoperability Focus for the University of Bath’s UKOLN (formerly the UK Office for Library Networking) project, said the UK’s Office of the e-Envoy6, the agency in charge of Britain’s e-Government programme, drove it with a Web site called Govtalk7. ‘One big ASP database,’ he called it. It carried the Government’s interoperability standard framework with a URL of ‘a great big long string, something, something, dot ASP’. ‘That is how this key piece of legislation is referred to, and next week it will be something else.’
Dr van Kersen thought the new Web Ontology Language, with the transposed acronym OWL, would help. Italian ontology researcher Nicola Guarino discussed the European Commission’s On-To-Knowledge project, its RDF tool Sesame8, and the Ontology Inference Layer (OIL) language.
Dr Ross described the US Defense Department’s DARPA (Defense Advanced Research Projects Agency) Agent Mark-up Language (DAML) programme and its DAML+OIL variant. Mr Degenhart-Drenth highlighted the importance of protocols used in Web Services, such as SOAP (Simple Object Access Protocol) and UDDI (Universal Description, Discovery and Integration), and also pointed to a SPECTRUM XML standard for museums that he helped write. Then there was the moviemakers’ audio-visual search standard, MPEG-7, and the SMIL (Synchronized Multimedia Integration Language). No one mentioned SHOE (Simple HTML Ontology Extensions), which was surprising as the discussion became more and more alphabetic and upper case.
ONTOLOGY TUTORIAL IN 800 WORDS
Nicola Guarino brought the discussion back on track. The ontology expert delivered a fascinating, impromptu 800-word dissertation on ontology genetics.
Ontologies, he said, started because it was realised that controlled vocabularies, which worked well enough for limited periods, needed something extra to make them really useful. They needed clarification of intended meaning. This could be achieved in much the same way as dictionaries did it, by reference to other, more basic terms. This was, he said, the key point.
‘Ontologies can work if the basic terms are really used in a principled way. There is a hidden assumption here that it is indeed possible to express the meaning in terms of a relatively small set of primitive terms.’
He explained further: ‘There are general terms that have a universal meaning. Take the term “part”, for instance, or “set”. Or take temporal relations: “before”. Suppose I have two different periods, the Renaissance and another period, and suppose I say that this period comes “before” the other one; do you exclude the case whereby the two periods overlap, or not? This is just a matter of stipulation; this is a general term that is not domain specific. I can simply stipulate exactly whether the “before” relationship between the two intervals includes the case of overlapping or not. And I can do that by means of axioms. Once you clarify the meaning of the basic terms - topological relations, mereological9 relations, dependence relations, these kinds of things - then you have the basic vocabulary that helps you to introduce more domain-related things. And this is what people are doing in the area of what are called “Foundational Ontologies”, and this is what I am doing. I believe this is the only way to solve the problem of semantic interoperability. So, not just controlled vocabularies, but vocabularies that are formally defined in minimal terms.’
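The point about stipulating ‘before’ can be made concrete with a small sketch. It encodes two alternative stipulations as explicit rules over intervals: one that excludes overlapping periods and one that admits them. The class and function names are invented for illustration, and the year ranges are deliberately rough assumptions rather than scholarly datings; a foundational ontology would state such rules as logical axioms, not as program code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Period:
    """A named time interval; start and end are years, with start <= end."""
    name: str
    start: int
    end: int


def before_strict(a, b):
    """Stipulation 1: 'before' excludes overlap, so a must end before b starts."""
    return a.end < b.start


def before_loose(a, b):
    """Stipulation 2: 'before' admits overlap, so a only has to start and end earlier."""
    return a.start < b.start and a.end < b.end


# Illustrative year ranges only (assumed for the example).
renaissance = Period("Renaissance", 1400, 1600)
baroque = Period("Baroque", 1580, 1750)

print(before_strict(renaissance, baroque))  # False: the two periods overlap
print(before_loose(renaissance, baroque))   # True: overlap is permitted
```

Two systems that both record ‘Renaissance before Baroque’ agree only if they share the same stipulation, which is the interoperability problem the quotation describes.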
Now it was clear, but would it be available to heritage institutions? Dr Ross wanted to know if fundamental ontologies of use to the heritage sector already existed. What would they be? Before any
On-To-Knowledge
On-To-Knowledge is an IST RTD project that was completed in June 2002. The project developed tools and methods for supporting knowledge management in large and distributed organisations. The technical backbone of On-To-Knowledge was the use of ontologies for the various tasks of information integration and mediation. For the project’s many results, see their tools repository, project deliverables and publications at http://www.ontoknowledge.org/pubs.html
See also the On-To-Knowledge book: ‘Towards the Semantic Web: Ontology-driven Knowledge Management’, J. Davies, D. Fensel, F. van Harmelen (eds.), John Wiley, December 2002.
6 Office of the e-Envoy, http://www.e-envoy.gov.uk
7 Govtalk, http://www.govtalk.gov.uk
8 Sesame environment, httpwwwontoknowledgeorgtoolsfactsheetSesamehtml
9 Mereology, n. The formal study of the logical properties of the relation of part and whole. Collins English Dictionary, Third edition, Glasgow, 1991.
development could begin, basic classifications and methodologies would be required to form a foundation for the work.
Italian online publisher Marco Meli, a manager of the EU’s MESMUSES (Metaphors for Science Museums) project, insisted: ‘You need a clear definition in this particular domain. What are the key words, the terms?’
Mr Guarino answered: ‘The key concepts and the key relations.’
THE ELUSIVE SEMANTIC GRAIL
By now, the Darmstadt 13 were beginning to realise they were not having much luck with their search for the Semantic Grail. Their discussion became a little tetchy.
Someone talked about ‘meta-ontology’ and another growled: ‘The term “meta” is abused.’
‘Outdated technical optimism’ was mentioned.
‘What is your alternative?’ someone else wanted to know.
‘I don’t have one.’
‘Is this the way forward?’
MESMUSES - Metaphors for Science Museums
MESMUSES is an IST RTD project that will run until July 2003. It aims at designing a general method and supporting tools to produce knowledge maps for use in self-learning environments of science museums. In the project, a knowledge map is defined as a set of related concepts and facts that is offered to learners with some guidance or suggestions on possible itineraries that they may follow to explore the knowledge space.
The method and tools developed in MESMUSES are being tested and validated by two large science museums, the Cité des Sciences et de l’Industrie in Paris and the Istituto e Museo di Storia della Scienza in Florence, which provide access to their digital catalogues. Both museums are developing knowledge maps and itineraries on different themes in Biology (Genome) and Physics (Galileo and the laws of motion).
Project Web site: http://cweb.inria.fr/Projects/Mesmuses
See also: M. Meli, Knowledge Management: a new challenge for science museums. In: Cultivate Interactive, Issue 9, 7 February 2003, http://www.cultivate-int.org/issue9/mesmuses
‘What part of the problem would that solve?’
‘We will know when we have tried that out.’
‘Here we are approaching a scary field.’
Civility and peace were restored as Frank Nack, the CWI Netherlands scientist, introduced the thought: ‘There are ontologies for art, and they are very old and well crafted. There are very clear rules about why they did what they did, because they have worked on them for a thousand years. What you could suggest is that we strip down to the basics for one field, say art, and apply it to all the other fields we have in cultural heritage: architecture, film, whatever, all working with very different substances.’
Seamus Ross added: ‘So we need one fundamental ontology on which we can build all the others.’
THE LUCK CHANGES
The luck of the 13 was beginning to change. The Athenian heritage informatics expert, Dr Dallas, described work among his company clients on developing ‘an upper ontology’. All the issues the Forum had been discussing - what to do about time, basic concepts, process, agents and so on - were being examined. They were beginning to develop ‘something very much like a thesaurus’, with term expansion that created sub-categories of relationships. He called it ‘generic layering’, a process that could identify the ‘generic grammar’ of relationships within a specific domain - art history, for example.
‘This is useful,’ he said. ‘This way we can create Web systems that present an association of content for users that is meaningful to them. Let’s say, a “cultural” meaning.’
Nicola Guarino went further. He believed that the International Council of Museums’ Committee for Documentation (CIDOC) Conceptual Reference Model (CRM) was the ‘best starting point’ for the heritage community. The CIDOC CRM is the result of 10 years’ work by a standards working group. The model is under review for adoption as an International Organization for Standardization (ISO) publication.
Mr Guarino was enthusiastic: ‘I am not biased on this. I am a reviewer on the CHIOS project that supports this proposal for an ISO standard, and I am amazed by the fact that this standard is more or less principled. On the one hand, the authors really strive to get these principled things and, on the other hand, they have extensively accounted for existing practice. It is the result of a large community of work. It is not perfect, but it really is a starting point for this community.’
He affirmed Dr Ross’s delighted question: ‘So this is an ontology we can borrow?’
And the Italian expert had more to add: ‘The question before was “how can we be sure that the principled things can really solve our problems?” I do not have a crisp answer, but I do have some evidence that even a tiny result on the foundational side has a high pay-off. So you do not need to solve all the foundational issues.’
The distinctions between an object and its role, or individual and classes of items, were delicate but, once understood, could lead to significant data improvement, he said, adding: ‘Take, for instance, the distinction between object and event. This is only one tiny distinction, but it is so fundamental that once you understand it, you can save time in developing your own application ontology. Tiny conceptual progress does have a high pay-off. This is why I believe it is useful.’
There were still one or two doubts, but CULTOS project co-ordinator Wernher Behrendt tidied up with a daring stance: ‘Let me be a heretic for a second. How many of us have an operating system other than Windows? What I am saying is that standardisation often helps. Even if it is not the best standard, it does help get people working together. It is perfectly fair to define a standard for the world now. There will be a lot of discussion, but it will wind down to a few constructs. It is a better method of getting an ontology accepted than having ontologies mushrooming all over the place that must then be integrated.’
The Darmstadt 13 were pleased. They had a model. They had ‘Web Services’ stepping stones to work across. They weren’t going to fall into the trap of insisting just yet that the Semantic Web was important for the heritage sector, but they wanted an education process for unconvinced curators.
They needed someone to make the first ontology move. Seamus Ross suggested asking the J. Paul Getty Trust (http://www.getty.edu). They needed automated tools for testing ontologies. They did not want to be delivered into the entertainment industry, but saw benefit in what University of Florence Associate Professor Franco Niccolucci called ‘cultural entertainment’.
So where would the experts put their money, asked Mr Behrendt: ‘On there not being any benefits in the Semantic Web for the cultural heritage sector, or there being some benefits in building such things, whatever they may be?’
Amsterdamer Frank Nack was in no doubt: ‘It is going to happen. It will probably look very different from how we imagine it right now, but it is going to happen.’ His countryman, ADLIB chief Bert Degenhart-Drenth, thought so too: ‘We have put our money there. All our applications work with XML.’ And Dr van Kersen agreed: ‘I would put my money on the Semantic Web.’
That seemed to make it game, set and match.
CIDOC CRM: ‘The Semantic Glue’
The ‘CIDOC object-oriented Conceptual Reference Model’ (CIDOC CRM) was developed by the ICOM/CIDOC Documentation Standards Group. Since September 2000, the CIDOC CRM has been under development as an ISO standard.
‘The CIDOC CRM is intended to promote a shared understanding of cultural heritage information by providing a common and extensible semantic framework to which any cultural heritage information can be mapped. It is intended to be a common language for domain experts and implementers to formulate requirements for information systems and to serve as a guide for good practice in conceptual modelling. In this way, it can provide the semantic glue needed to mediate between different sources of cultural heritage information, such as that published by museums, libraries and archives.’
http://cidoc.ics.forth.gr/what_is_crm.html
CHIOS - Cultural Heritage Interchange Ontology Standardization project
Since June 2001, the work of the CIDOC CRM Special Interest Group has been supported by CHIOS, a two-year project which receives funding from the Fifth Framework IST Programme. The CHIOS consortium forms an integral part of the CIDOC CRM Special Interest Group which, by organising shared meetings, represents the interests and requirements of the cultural heritage community to the ISO Working Group (ISO/TC46/SC4/WG9).
http://cidoc.ics.forth.gr/chios_iso.html
Genesis - The Creation Stars and Fishes

SEMANTIC WEB TERMS AND READING LIST A-X
Compiled by Guntram Geser and Michael Steemson

In the Summary, we have provided information boxes on projects, but not on the many Semantic Web standards, technologies, etc. mentioned in the Forum discussion. The projects included are only a small fraction of many ongoing activities and are related mainly to the cultural and scientific heritage community. Links to many important Semantic Web development projects can be found at their community portal, http://www.semanticweb.org

The following guide points to resources and readings on terms mentioned in the Forum Summary. It is not intended to provide a comprehensive list of Semantic Web materials. Rather, it represents different entry points and levels to this topic.

Annotation & Authoring
‘How to annotate and have fun’, http://annotation.semanticweb.org/faq/faq2, and other papers. See also http://annotation.semanticweb.org/tools, Institute for Applied Informatics and Formal Description Methods, University of Karlsruhe, Germany.
The Annotea Project, a ‘Live Early Adoption and Demonstration (LEAD)’ project of the World Wide Web Consortium (W3C): a collaboration environment with shared annotations, www.w3.org/2001/Annotea

CIDOC CRM
The CIDOC Conceptual Reference Model was an important reference point in the Forum discussion (see also the information box on the CIDOC CRM in the Summary). CIDOC, the Comité International pour la Documentation, is part of the International Council of Museums (ICOM). Its Web site can be found at http://cidoc.ics.forth.gr
A report on CIDOC’s work on the CRM is provided in ‘The CIDOC Conceptual Reference Model: A Standard for Communicating Cultural Contents’, by Nick Crofts, Martin Doerr and Tony Gill, in Cultivate Interactive, Issue 9, February 2003, http://www.cultivate-int.org/issue9/chios
See also their tutorials at http://cidoc.ics.forth.gr/tutorials.html

DAML - DARPA Agent Markup Language
The Defense Advanced Research Projects Agency (DARPA) is the central research and development organisation for the US Department of Defense. Its DAML programme is developing a language and tools to facilitate Semantic Web concepts, http://www.daml.org
‘Why Use DAML?’, Adam Pease, white paper, Teknowledge, 10 April 2002, http://www.daml.org/2002/04/why.html

DAML+OIL
DAML+OIL is a product of the DARPA Joint United States/European Union ad hoc Agent Markup Language Committee. The committee created a language with the best features of SHOE, DAML, OIL and several other markup approaches. It is a Web ontology language (latest release March 2001) expected to provide a basis for future Web standards for ontologies. See http://www.daml.org/2001/03/daml+oil-index.html
OIL - Ontology Inference Layer
OIL is a proposal for a Web-based representation and inference layer for ontologies. It uses a layered approach to defining a standard ontology language. Each layer adds functionality and complexity to the previous layer. This is done such that machines that can only process a lower layer can still partially understand high-level ontologies. See http://www.ontoknowledge.org/oil
A white paper on OIL functions: ‘An informal description of Standard OIL and Instance OIL’, by the On-To-Knowledge group, led by Department of Computer Science, University of Manchester, UK, 28 November 2000, www.ontoknowledge.org/oil/downl/oil-whitepaper.pdf
Ontologies
Ontology research: Laboratory for Applied Ontology (LOA), Institute of Cognitive Sciences and Technology (ISTC), http://www.ladseb.pd.cnr.it/infor/ontology/ontology.html
Dieter Fensel: Ontologies: A Silver Bullet for Knowledge Management and Electronic Commerce. New York: Springer, 2001.
‘Ontology Infrastructure for the Semantic Web’, includes detailed subject bibliography. WonderWeb Project, Department of Computer Science, Victoria University of Manchester, UK, http://wonderweb.semanticweb.org/deliverables/documents/D15.pdf
Standard Upper Ontology (SUO): An upper ontology for data interoperability, information search and retrieval, automated inferencing and natural language processing. IEEE Standard Upper Ontology (SUO) Working Group, http://suo.ieee.org
OWL - Web Ontology Language
The Web Ontology Language is a semantic markup language for publishing and sharing ontologies on the World Wide Web. OWL is developed as a vocabulary extension of the Resource Description Framework (RDF) and is derived from the DAML+OIL Web Ontology Language. For the development of this language, see the documents of the Web Ontology (WebOnt) Working Group, http://www.w3.org/2001/sw/WebOnt
See also OWL Web Ontology Language Reference (W3C Working Draft, 31 March 2003) at http://www.w3.org/TR/owl-ref and OWL Web Ontology Language Guide (W3C Working Draft, 31 March 2003), http://www.w3.org/TR/owl-guide
For an understanding of the goals, requirements and usage scenarios for a Web ontology language, see ‘Web Ontology Language (OWL) Use Cases and Requirements’ (W3C Working Draft, 31 March 2003), http://www.w3.org/TR/webont-req
RDF - Resource Description Framework
See Cultural Heritage Semantic Web Example & Primer, pp. 32-34.
Semantic Web
‘The Semantic Web’, Tim Berners-Lee with James Hendler and Ora Lassila, Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html
‘Enhanced Science and the Semantic Web’, J.A. Hendler, in Science magazine, Volume 299, Number 5606, 24 January 2003, pp. 520-521.
‘Peer-to-Peer: The Infrastructure for the Semantic Web’, Stanford University. The Semantic Web as the next evolutionary step of the Internet, http://p2p.semanticweb.org
The Semantic Web Community Portal, SemanticWeb.org, currently operated by three research groups: the Onto-Agents and Scalable Knowledge Composition (SKC) Research Group at Stanford University, the Ontobroker-Group at the University of Karlsruhe, Germany, and the Protégé Research Group at Stanford University, http://www.semanticweb.org
The W3C Semantic Web Activity Statement explains the consortium’s plans in the areas of enabling standards (driven by the RDF Core and Web Ontology Working Groups), education and outreach (RDF Interest Group), as well as co-ordination and advanced development, http://www.w3.org/2001/sw/Activity
See also Kim Veltman’s warning of what he sees to be too narrow a definition of the Semantic Web, one that will not allow the historical dimension, the richness of cultural expression, the unique and the diversity of interpretations to be adequately dealt with. Cf. K. Veltman: Challenges for a Semantic Web (July 2002), http://www.cultivate-int.org/issue7/semanticweb
SHOE - Simple HTML Ontology Extensions
SHOE was one of the first ontology-based markup languages developed for use on the World Wide Web. It is a small extension to HTML that allows Web page authors to annotate their Web documents with machine-readable knowledge. See http://www.cs.umd.edu/projects/plus/SHOE
SMIL - Multimedia on the Web
The Synchronized Multimedia Integration Language (SMIL, pronounced ‘smile’) enables authoring of interactive audiovisual presentations. SMIL is typically used for ‘rich media’/multimedia presentations which integrate streaming audio and video with images, text or any other media type. SMIL is an HTML-like language and may be written using a simple text editor. W3C Synchronized Multimedia, http://www.w3.org/AudioVideo
SPECTRUM-XML DTD
SPECTRUM (Standard Procedures for Collections Recording Used in Museums) was created by the mda (http://www.mda.org.uk). It is a guide to good practice for museum documentation that describes procedures for documenting objects and the processes they undergo, as well as the necessary information that needs to be recorded to support the procedures. For SPECTRUM, an XML Document Type Definition has been produced, which serves as a system-neutral interchange format for museum data.
‘SPECTRUM: The UK Museum Documentation Standard’ is available in its second edition, see http://www.mda.org.uk/spectrum.htm
For a description of the creation, structure and deployment of the SPECTRUM-XML DTD, see Bert Degenhart-Drenth: Building on the mda SPECTRUM-XML DTD for Collections Management Data Interchange. Museums and the Web Conference 2001, http://www.archimuse.com/mw2001/papers/degenhart/degenhart.html
Web Services
Although the concept of Web Services is not a new one, the definition of Web Services standards allows the wider use of these services. Currently, three main protocols are used in the context of Web Services: UDDI (Universal Description, Discovery and Integration), a registry system to find resources and Web Services; WSDL (Web Services Description Language), an interface description language; and SOAP (Simple Object Access Protocol), the communication protocol for Web Services. All three protocols are based on XML.
For a description of the many ways in which XML can enhance Web Services, see http://www.w3.org/2002/ws/Activity
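To give a feel for what ‘based on XML’ means for SOAP, the following sketch assembles a minimal SOAP 1.1 envelope with Python’s standard library. Only the envelope namespace is taken from the SOAP 1.1 specification; the service namespace, the SearchObjects operation and its keyword element are hypothetical, invented purely for illustration, and no message is actually sent.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "http://museum.example.org/search"            # hypothetical service namespace

# Serialise the envelope elements with the conventional 'soap' prefix.
ET.register_namespace("soap", SOAP_NS)


def build_search_request(keyword):
    """Return a SOAP envelope (as text) wrapping one hypothetical search call."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    request = ET.SubElement(body, f"{{{APP_NS}}}SearchObjects")
    ET.SubElement(request, f"{{{APP_NS}}}keyword").text = keyword
    return ET.tostring(envelope, encoding="unicode")


message = build_search_request("astrolabe")
print(message)
```

In a full Web Services set-up, a WSDL document would describe the operation and its message format, and a UDDI registry would help a client find the service; both are outside the scope of this sketch.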
World Wide Web
Looking back as well as into the future: ‘Weaving the Web: The Original Design and Ultimate Destiny of the World Wide Web by Its Inventor’, Tim Berners-Lee with Mark Fischetti, Harper San Francisco, 1999.
XML - eXtensible Mark-up Language
See Cultural Heritage Semantic Web Example & Primer, pp. 27-30.
For the work done at W3C within the XML activity, see XML Working Groups, http://www.w3.org/XML; the XML specifications can be found at http://www.w3.org/XML/Core
‘The bane of my existence is doing things that I know the computer could do for me.’ - Dan Connolly, The XML Revolution, October 1998, http://www.nature.com/nature/webmatters/xml
SEMANTIC WEB SHOULD BE BASED ON WELL-FOUNDED ONTOLOGIES
AN INTERVIEW WITH NICOLA GUARINO, LABORATORY OF APPLIED ONTOLOGY, ITALY
By Joost van Kasteren

‘The Semantic Web, as it is advocated by people like Tim Berners-Lee and James Hendler, does not take enough advantage of the experience built up in knowledge engineering and conceptual modelling. There is this anarchistic idea of the Web as a place where everyone can do his or her own thing. I have no problem with that; a lot of people are able to find what they want on the Web. But if you want real interoperability, with search engines that can grasp the intended meaning of information, that approach falls short. To create a real Semantic Web, we have to develop and use well-founded generic ontologies, based on linguistics and logics.’

Nicola Guarino has clear views on the Semantic Web and its development. He is a senior researcher at the Institute for Cognitive Sciences and Technologies in Italy, where he leads the Laboratory for Applied Ontology. Since 1991, he has played an active role in the Artificial Intelligence community in promoting the interdisciplinary study of ontological foundations of knowledge engineering and conceptual modelling. Guarino: ‘In our Laboratory the focus is on content, and not so much on representation. The use of ontologies is unavoidable when referring to content. People do it implicitly all the time when they are communicating and trying to understand each other. If we want machines to understand each other, in other words real interoperability, we need to make these ontologies explicit in an unambiguous way.’

An ontology is a hierarchical description of the relations between concepts in a certain domain, plus an unambiguous description of the concepts themselves. As they are created for a certain domain, ontologies often fail to be interoperable because of the ambiguity that results from the use of the same terms for different concepts (and vice versa) between different domains. The term ‘net’, for instance, has quite a different meaning for Web designers and fishermen. That is why there is a need for well-founded generic ontologies. An example of a term that a generic ontology has to pin down is ‘part’, which can have different meanings both within a domain and between domains. For instance, the violist plays a part in the orchestra. His finger is part of him. Can his finger be part of the orchestra? According to Guarino, this is a genuine ontological problem that can only be solved by giving an unambiguous meaning to the term ‘part’.

Another example cited by Guarino is the term ‘in’. What exactly are you describing when you say the spoon is in the cup? Does it mean that the spoon is totally embedded in the cup, or is it only partly in the cup? Guarino: ‘These examples seem trivial, but if you want real interoperability between different knowledge domains, you will have to prevent the problems that come with the ambiguity of day-to-day language.’

In this respect, Guarino thinks it is a drawback that computer science curricula scarcely ever contain an introduction to ontological foundations of conceptual modelling. ‘Students learn all about Java, HTML and C++, and name all the other languages, and they also learn how to use these. But when they graduate, they hardly know a thing about formal ontology. I really think people should know more about the work on ontology that has been done in philosophy. It is certainly not much harder to acquire than, say, studying differential equations or learning how to use Java.’

It seems as if it is an enormous job to develop well-founded generic ontologies, but it is not as enormous a task as it appears. Guarino: ‘I would say that a few dozen would get you on the way nicely. But you have to take the fundamental route. At the moment, development of the Semantic Web is driven by the need for short-term results. Hence, interoperability is realised by putting the right tags on the information. That is not what I call semantics; that is syntax. XML and RDF are very useful for this, but they fall short when you want to create a real Semantic Web.’

Laboratory of Applied Ontology: http://ontology.ip.rm.cnr.it
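Guarino’s ‘part’ example from the interview (the violist, his finger and the orchestra) can be sketched in a few lines. The sketch assumes one possible way of disambiguating the term, namely splitting it into a component-of relation and a member-of relation and only chaining facts within the same relation. The relation names, the facts and the helper function are illustrative assumptions, not Guarino’s own formalisation.

```python
def holds(relation, x, y):
    """True if x reaches y by following pairs of the given relation only."""
    frontier, seen = [x], set()
    while frontier:
        current = frontier.pop()
        for a, b in relation:
            if a == current and b not in seen:
                if b == y:
                    return True
                seen.add(b)
                frontier.append(b)
    return False


# Two distinct readings of 'part', kept apart (an assumed modelling choice).
component_of = {("finger", "hand"), ("hand", "violist")}
member_of = {("violist", "orchestra")}

print(holds(component_of, "finger", "violist"))   # True: components chain
print(holds(member_of, "violist", "orchestra"))   # True: stated membership
print(holds(member_of, "finger", "orchestra"))    # False: a finger is no member

# Lumping both readings into one vague 'part' yields the unwanted conclusion.
vague_part = component_of | member_of
print(holds(vague_part, "finger", "orchestra"))   # True
```

The last line shows why an ambiguous everyday term causes trouble for machines: once the two readings are merged, the finger becomes ‘part’ of the orchestra.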
The Semantic Web, according to a statement of its well-known advocate Tim Berners-Lee, is a vision of ‘a distributed machine which should function so as to perform socially useful tasks’1. This machine should allow intelligent software agents to understand semantic relationships between Web resources in order to seek relevant information and perform transactions for humans.
Contrasted with the existing human-readable Web, the Semantic Web is envisaged as a Web of machine-readable data that will be based on ‘languages for expressing information in a machine processable form’.² Key to an understanding of the Semantic Web, therefore, is how these languages function: how information is expressed in order that computers can automatically process Web sources and assist in making the Web more useful for humans. The aim of this chapter is to provide an overview of the Semantic Web concept by describing its general architecture, i.e. the interplay of its languages.
The chapter has two interrelated parts. Part 1 describes a Finnish project that strives to build the foundations for the “Finnish Museums on the Semantic Web” (FMS), a future semantic museum portal. This part consists of the information boxes on the following pages, which briefly describe the necessary elements and steps in the set-up of the FMS system. It is recommended to start by reading this description (see also graphic 3 on page 36, which provides an overview of the set-up of the FMS system). It should be helpful in gaining a general understanding of how semantic interoperability of, and new ways of interacting with, semantically marked-up cultural heritage information can be realised.
Part 2, the texts below the information boxes, is a primer that explains terms used in part 1 which represent core elements of the Semantic Web architecture, as well as providing illustrative examples. The explanations are not intended to give in-depth definitions of these elements; such definitions are provided in the relevant W3C specifications. The examples have been kept as simple as possible but build on each other. In this way we will develop a (fictitious) Website, http://www.m-i.org, that provides semantically enhanced access to such marvellous medieval images as the ones we have used to illustrate this Thematic Issue.
How to Make Collection Metadata of Museums Semantically Interoperable on the Web – The “Finnish Museums on the Semantic Web” (FMS)
The Semantic Web concept is visionary, and there are dedicated people, also in the heritage sector, who are trying to make it a reality. In our example, a group of researchers and technology developers who work at the University of Helsinki and the Helsinki Institute for Information Technology are translating the Semantic Web vision for a future semantic museum portal.
The group’s two-year project will run until spring 2004 and is being carried out in co-operation with, and with funding from, major organisations including Espoo City Museum, Helsinki University Museum, National Board of Antiquities, Nokia, TietoEnator and the National Technology Agency (TEKES).
A CULTURAL HERITAGE SEMANTIC WEB EXAMPLE & PRIMER
By Guntram Geser
1 Tim Berners-Lee: Interpretation and Semantics on the Semantic Web (1998), http://www.w3.org/DesignIssues/Interpretation.html
2 Tim Berners-Lee: Semantic Web Road Map (1998), http://www.w3.org/DesignIssues/Semantic.html
3 Robert DuCharme, http://lists.xml.org/archives/xml-dev/200211/msg00190.html
The major goals of the project are to make collection metadata, which stem from heterogeneous databases, semantically interoperable on the Web, and to provide facilities for semantic browsing and searching in the combined knowledge base of the participating museums.
The project’s vision is called “Finnish Museums on the Semantic Web” (FMS), and its architecture allows for all Finnish museums to join in. However, in an approach of starting small but ambitious, the project is at present using the collection databases of two museums, the Espoo City Museum and the National Museum of Finland. Furthermore, the implementation is currently restricted to only one part of the collections: textiles.
In order to reach the FMS system’s goal of making the museums’ metadata semantically interoperable on the Web, the data must be harmonised on the syntactic and semantic level. For this harmonisation the Extensible Markup Language (XML) and the Resource Description Framework (RDF) are being used, of which RDF is the key language for achieving semantic interoperability of the heterogeneous sets of metadata.
RDF and Metadata - A Natural Fit
An observer of the diffusion of the Resource Description Framework (RDF) into various domains, Robert DuCharme, has commented: ‘I still find it a little ironic that while RDF has gotten so much publicity as a technology for warm and fuzzy AI (Artificial Intelligence) pie-in-the-sky technology, it’s gotten most of its traction in the mundane world of metadata.’³
Yet, given the importance of metadata for the Semantic Web vision in general, it does not come as a surprise that metadata of key information communities belong to the first of RDF’s intended uses. RDF seems to gain momentum in particular among the library and other communities that use Dublin Core.
The current W3C RDF Primer (Working Draft, 23 January 2003), edited by Frank Manola and Eric Miller, labels RDF as ‘an ideal representation for Dublin Core information’ and describes Dublin Core as one of their ‘RDF in the field’ examples (cf. http://www.w3.org/TR/rdf-primer/).
At the Dublin Core Metadata Initiative (DCMI), ‘Expressing Simple Dublin Core in RDF/XML’ was announced as a DCMI Recommendation in October 2002, ‘the first in a series of recommendations for encoding Dublin Core metadata using mainstream Web technologies’, i.e. XML/RDF/XHTML. ‘Expressing Qualified Dublin Core in RDF/XML’ is currently a Proposed Recommendation. Cf. http://dublincore.org/groups/architecture/
Syntactic Transformation 1: Creating the XML Documents
In the FMS system the Extensible Markup Language (XML) is used as the data transfer format. This transfer format enables the system to make use of the data originally stored in the museums’ heterogeneous collection databases. Therefore each museum participating in the FMS initiative provides the relevant collection data as an XML document repository.
In a process of syntactic harmonisation, the data from a museum’s collection database are retrieved and transformed to an XML format conforming to the XML Schema of the FMS initiative.
The data to be published are read from the database through a ‘view’, which helps create the XML format. The view is a queryable interface, a virtual table that results from an SQL query which may join multiple tables of the database. Through the view the data are queried so that the rows of the tables are grouped by collection items. For each item the set of rows is combined into a single XML document.
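The rows-to-XML step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the FMS implementation: the two tables (item, origin) and their layout are hypothetical, the view is named item_view after the column list shown in the graphic, and the namespace is the fictitious http://www.m-i.org/images used throughout this primer.

```python
import sqlite3
import xml.etree.ElementTree as ET
from itertools import groupby

MI_NS = "http://www.m-i.org/images"  # fictitious namespace of this primer
ET.register_namespace("mi", MI_NS)

# Hypothetical two-table layout standing in for a museum's collection database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item (item_id TEXT, type TEXT, subject TEXT, iconclass TEXT);
    CREATE TABLE origin (item_id TEXT, creator TEXT, manuscript TEXT,
                         place TEXT, year TEXT);
    INSERT INTO item VALUES ('image5kb78d38i', 'Column Miniature',
        'Eve emerges from Adam''s body', '71A3421');
    INSERT INTO origin VALUES ('image5kb78d38i', 'Alexander Master',
        'Historic Bible', 'Utrecht', 'circa 1430');
    -- The view: a virtual table resulting from a query that joins the tables.
    CREATE VIEW item_view AS
        SELECT i.item_id, i.type, i.subject, i.iconclass,
               o.creator, o.manuscript, o.place, o.year
        FROM item i JOIN origin o ON o.item_id = i.item_id;
""")

FIELDS = ["type", "subject", "iconclass", "creator", "manuscript", "place", "year"]


def rows_to_documents(connection):
    """Return {item_id: XML string}, one XML document per collection item."""
    cursor = connection.execute("SELECT * FROM item_view ORDER BY item_id")
    documents = {}
    # Rows are grouped by collection item; each group becomes one document.
    for item_id, rows in groupby(cursor, key=lambda row: row[0]):
        root = ET.Element(f"{{{MI_NS}}}image", {"image_id": item_id})
        seen = set()  # avoid repeating a value that occurs in several rows
        for row in rows:
            for name, value in zip(FIELDS, row[1:]):
                if value is not None and (name, value) not in seen:
                    seen.add((name, value))
                    ET.SubElement(root, f"{{{MI_NS}}}{name}").text = value
        documents[item_id] = ET.tostring(root, encoding="unicode")
    return documents


docs = rows_to_documents(conn)
print(docs["image5kb78d38i"])
```

The printed document carries the mi prefix and the image_id attribute, matching the structure of the example document discussed in this section.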
XML
XML is a markup language for describing data. It is a language created to allow anyone to design the structure of their own documents. An XML document contains text that consists of markup in the form of tags and plain text between them, the latter being just pure information (for example <creator>Alexander Master</creator>). XML tags are not predefined; everyone can define his or her own tags.
FMS Documents
A short presentation is provided in Eero Hyvönen et al.: Cultural Semantic Interoperability on the Web. Case Finnish Museums Online, http://iswc2002.semanticweb.org/posters/hyvonen_a4.pdf; for detailed descriptions see Vilho Raatikka, Eero Hyvönen: Ontology-based Semantic Metadata Validation, and Hyvönen, Eero et al.: Semantic Interoperability on the Web. Case Finnish Museums Online. Both texts can be found in: Towards the Semantic Web and Web Services. Proceedings of the XML Finland 2002 Conference, http://www.cs.helsinki.fi/u/eahyvone/xmlfinland2002/ProceedingsXML2002-final.pdf
ITEM_VIEW (columns of the database view): item_id, type, subject, iconclass, creator, manuscript, place, year
XML shares the syntax and bracketed tags of the well-known HyperText Markup Language (HTML), but XML serves a different goal. While HTML is used to define the layout of pages on the WWW, XML is used to define the content of documents, for example to specify that an area of text is the name of a creator.
XML allows for creating markup (e.g. <creator>) that seems to carry some semantics. However, for a computer a tag like <creator> carries as much semantics as a tag like <H1>. A computer simply does not know what a creator is and how the concept creator is related to other concepts (e.g. manuscript). For an XML processor <H1> and <creator> or <manuscript> are all equally (and totally) meaningless. XML is all about describing data; on its own it does not do anything. There needs to be a processing program that uses the markup to interpret the various pieces of elements.
The graphic below illustrates the database rows to XML process as described in the info box. It provides a very simple example of an XML document that describes some data for one of the medieval column miniatures from the Koninklijke Bibliotheek, The Hague, which we were permitted to use for illustrating this Thematic Issue. It includes the Iconclass classification for this image: 71A3421 Eve emerges from Adam’s body (for the hierarchical path of this classification see the section on ontologies).
Short explanations for the XML document (image5kb78d38i.xml) shown in the graphic below:
A well-formed XML document is one that conforms to the XML syntax rules, of which we would like to highlight the following:
(1) The document should begin with the XML declaration, which defines the XML version and the character encoding used in the document. In the example below we use <?xml version="1.0" encoding="ISO-8859-1"?>, i.e. the document conforms to the 1.0 specification of XML and uses the ISO-8859-1 (Latin-1/West European) character set.
(2a, 10) The XML document must contain a single tag pair to define a root element, in our example <mi:image> </mi:image>.
Graphic 1: Database rows to XML process
Database rows, grouped by item:
image5kb78d38i, ‘Column Miniature’, ‘Eve emerges from Adam’s body’, ‘71A3421’, ‘Alexander Master’, ‘Historic Bible’, ‘Utrecht’, ‘circa 1430’
XML document with the item data from the database rows (image5kb78d38i.xml):
(1) <?xml version="1.0" encoding="ISO-8859-1"?>
(2) <mi:image xmlns:mi="http://www.m-i.org/images"
      image_id="image5kb78d38i">
(3)   <mi:type>Column Miniature</mi:type>
(4)   <mi:subject>Eve emerges from Adam’s body</mi:subject>
(5)   <mi:iconclass>71A3421</mi:iconclass>
(6)   <mi:creator>Alexander Master</mi:creator>
(7)   <mi:manuscript>Historic Bible</mi:manuscript>
(8)   <mi:place>Utrecht</mi:place>
(9)   <mi:year>circa 1430</mi:year>
(10) </mi:image>
(2b) Namespace: Since element names in XML are not fixed, name conflicts can occur when different documents use the same names describing different types of elements. To prevent such conflicts, a unique namespace should be defined using a Uniform Resource Identifier (URI). An XML namespace is a collection of names that are used as element types and attribute names (cf. http://www.w3.org/TR/REC-xml-names). Our namespace declaration in the start tag of the root element is xmlns:mi="http://www.m-i.org/images".
The namespace prefix mi (for medieval images) functions as a placeholder for the namespace name. It needs to show up in all element tags (e.g. <mi:type> </mi:type>).
(2c) image_id="image5kb78d38i": This is the id attribute, which contains the unique identifier of the data source record.
(3-9) All other elements must be within the root element and can themselves have sub-elements (child elements), which must be properly nested within their parent element. Our elements do not have sub-elements.
Other syntax rules are, for example: all start tags must match their end tags; because XML tags are case sensitive (i.e. the tag <mi:Creator> is different from the tag <mi:creator>), they must also be written with the same case; all elements must have a closing tag; all attribute values must be within quotation marks (e.g. "image5kb78d38i").
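These well-formedness rules can be tried out with any XML parser. The following minimal sketch uses the parser from Python's standard library; the helper name is_well_formed is our own, and the sample is a shortened version of the example document (the XML declaration is left out for brevity).

```python
import xml.etree.ElementTree as ET

GOOD = """<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">
  <mi:creator>Alexander Master</mi:creator>
</mi:image>"""

# End tag written with a different case than the start tag.
BAD_CASE = GOOD.replace("</mi:creator>", "</mi:Creator>")
# Attribute value without quotation marks.
BAD_QUOTES = GOOD.replace('"image5kb78d38i"', "image5kb78d38i")
# Element without a closing tag.
BAD_UNCLOSED = GOOD.replace("</mi:creator>", "")


def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


for label, sample in [("good", GOOD), ("case", BAD_CASE),
                      ("quotes", BAD_QUOTES), ("unclosed", BAD_UNCLOSED)]:
    print(label, is_well_formed(sample))
```

Only the first sample passes; each of the other three breaks exactly one of the rules listed above.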
Syntactic Transformation 2: The XML Schema
In order to allow for syntactic harmonisation, the XML documents of the museums should conform to the XML Schema of the FMS initiative. Therefore the museums use the initiative’s XML Schema when they create their XML documents, validating them against the Schema. If the documents are valid, the process can continue to the semantic level.
XML Schema
The XML Schema defines the building blocks of an XML document, including:
| elements and attributes that can appear in a document
| which elements are child elements, as well as their order and number
| whether an element is empty or can include text
| the data types for elements and attributes
| as well as default and fixed values for elements and attributes
XML with an XML Schema is designed to be self-descriptive. One of the greatest strengths of XML Schema is that it allows for data typing. The most common data types are xs:string, xs:decimal, xs:integer, xs:boolean, xs:date and xs:time. In the example below, which is the XML Schema for the XML document (image5kb78d38i.xml) shown in graphic 1, we only use the data type xs:string. This data type is used for values that contain character strings.
(1) <?xml version="1.0"?>
(2a) <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
(2b)   targetNamespace="http://www.m-i.org/images"
(2c)   xmlns="http://www.m-i.org/images"
(2d)   elementFormDefault="qualified">
(3) <xs:element name="image">
(4)   <xs:complexType>
(5)     <xs:sequence>
(6)       <xs:element name="type" type="xs:string"/>
(7)       <xs:element name="subject" type="xs:string"/>
(8)       <xs:element name="iconclass" type="xs:string"/>
(9)       <xs:element name="creator" type="xs:string"/>
(10)      <xs:element name="manuscript" type="xs:string"/>
(11)      <xs:element name="place" type="xs:string"/>
(12)      <xs:element name="year" type="xs:string"/>
(13)    </xs:sequence>
(14)    <xs:attribute name="image_id" type="xs:string" use="required"/>
(15)  </xs:complexType>
(16) </xs:element>
(17) </xs:schema>
Short explanations
(1) The XML declaration, which states that the document conforms to the 1.0 specification of XML.
(2a) Determines that the elements and data types that are used to construct the schema come from the W3C’s XML Schema namespace. Consequently each of the elements and data types in the schema has the prefix xs, which identifies them as belonging to the vocabulary of the XML Schema language rather than the vocabulary of our (fictitious) organisation M-i.org.
(2b) Indicates that the elements defined by this schema come from our http://www.m-i.org/images namespace.
(2c) Is our default namespace.
(2d) Demands that any elements used by an XML document which were declared in this schema must be namespace qualified.
(3, 16) The parent element for the data typing of the image descriptions we provide at http://www.m-i.org/images. It is (4) defined as an xs:complexType, i.e. it contains child elements (6-12), which are (5, 13) surrounded by an xs:sequence element that defines an ordered sequence of these elements.
(6-12) The child elements, which in our example are simple types because they do not contain other elements. They define various elements of our XML documents, e.g. “type”, to be of the data type xs:string.
(14) Furthermore, the Schema determines that for the element <xs:element name="image"> there is a required attribute image_id of the data type xs:string (for example image5kb78d38i).
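To give a feeling for what validation against such a schema involves, the sketch below re-implements by hand the three constraints the example schema expresses: the root element, the ordered sequence of text-only child elements, and the required image_id attribute. It is an assumption-laden stand-in for illustration only. A real system would hand the schema to an XML Schema validator, and the names conformance_errors and make_doc are hypothetical.

```python
import xml.etree.ElementTree as ET

MI_NS = "http://www.m-i.org/images"  # fictitious namespace of this primer
EXPECTED_CHILDREN = ["type", "subject", "iconclass", "creator",
                     "manuscript", "place", "year"]


def conformance_errors(xml_text):
    """Hand-written checks mirroring the example schema (not a real validator)."""
    root = ET.fromstring(xml_text)
    errors = []
    if root.tag != f"{{{MI_NS}}}image":
        errors.append("root element must be 'image' in the m-i.org namespace")
    if "image_id" not in root.attrib:
        errors.append("required attribute 'image_id' is missing")
    expected = [f"{{{MI_NS}}}{name}" for name in EXPECTED_CHILDREN]
    if [child.tag for child in root] != expected:
        errors.append("child elements must appear once each, in schema order")
    for child in root:
        if len(child) > 0:  # xs:string elements may not contain sub-elements
            errors.append(f"{child.tag} must contain text only")
    return errors


def make_doc(names, with_id=True):
    """Build a small test document with the given child element names."""
    attr = ' image_id="image5kb78d38i"' if with_id else ""
    body = "".join(f"<mi:{name}>x</mi:{name}>" for name in names)
    return f'<mi:image xmlns:mi="{MI_NS}"{attr}>{body}</mi:image>'


print(conformance_errors(make_doc(EXPECTED_CHILDREN)))
print(conformance_errors(make_doc(EXPECTED_CHILDREN[::-1])))
print(conformance_errors(make_doc(EXPECTED_CHILDREN, with_id=False)))
```

The first document yields no errors; the second violates the ordered sequence and the third lacks the required attribute.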
Major Benefits of XML
In the context of the Semantic Web, XML provides an interoperable syntactical foundation upon which solutions to the issues of representing relationships and meaning can be built. We also want to highlight the many benefits of XML that are adding to its rapid uptake in the first place and might in the longer term be supportive in realising the Semantic Web vision on a broader scale.
XML is one of the most important standards developments in recent years. It is an international, universal, non-system and non-application specific data exchange standard. XML is international because it employs Unicode. This means that there is no restriction to the western alphabet, but Arabic, Chinese, Greek, Hebrew, Thai etc. can be easily integrated.
XML is non-system specific because it is an open standard set by the World Wide Web Consortium (W3C). As such there is no owner of XML. All the major software suppliers support it; it can be used on any computing platform from Windows and MacOS to Linux. This makes it easier for organisations to change systems or combine different systems.
XML is also non-application specific, i.e. it can be used in various applications such as data exchange, data harvesting, Web site management etc. XML is gaining ever-wider acceptance in many application domains, including in particular the cultural heritage community.
Bear in mind also that the major collection management software producers have implemented support for XML in their systems, enabling for example the integration of data from different collections and their combination over the Web.
XML allows for multi-channel publishing, i.e. with XML it is easy to produce different products or services from digital cultural heritage assets. Once the data are structured in XML, they can be displayed across a variety of media using an associated style sheet that contains the display information.
Finally, XML can be used to create new languages. For example, the Wireless Markup Language (WML), which is used to mark up Internet applications for handheld devices, is written in XML.
Ontological concepts
The goal of the FMS project is to make metadata of the museums’ textiles collections semantically interoperable on the Web. In order to achieve such interoperability, an ontology is being designed that describes the common (lower-level) ontological concepts in this domain of knowledge.⁴
Ontology
In the Semantic Web architecture the semantic relationships are not embedded but explicitly represented by an ontology, or rather an interrelated set of ontologies. In fact, the wide array of information residing on the Web and the perceived need to make it more machine-processable have acted as a strong impetus for the development of ontology languages.
Yet what is an ontology? An abstract definition of an ontology is that it describes a formal, shared conceptualisation of a particular domain of interest, for example cultural heritage objects held in art museums. In particular, an ontology allows for constraining, expressing and analysing the intended meaning of the shared vocabulary of concepts and relations in a domain of knowledge.⁵
If these concepts and relations are formalised to a high degree, the domain has at hand a major building block for developing semantically aware information systems. With Semantic Web technologies the domain ontology can be made available on the network, cross-referenced with upper-level and other domain ontologies,⁶ and remote applications or intelligent software agents can refer to it when they interact to provide a certain information service.
4 The available documents (in English) on the FMS initiative state that their ontology is being created using RDF Schema (RDFS). To develop a fully fledged ontology, advanced languages such as DAML+OIL or Web Ontology Language (OWL) would be required.
5 For more elaborate and formal descriptions see Tom Gruber: What is an Ontology? (1995), http://www-ksl.stanford.edu/kst/what-is-an-ontology.html; Nicola Guarino: Ontology-Driven Conceptual Modelling, part 1-3 (2002), http://ontology.ip.rm.cnr.it/Tutorials
The degree of formalisation of concepts and their relations varies considerably between different domains of knowledge. At the lower end one finds lexicons and simple taxonomies (i.e. an ordered classification system where terms are related hierarchically). At the middle level one might place thesauri, i.e. controlled vocabularies that are structured to show relationships between terms and concepts and, for example, allow for retrieving them from a database. At the high end of formalisation of knowledge there are axiomatised logic theories. Such theories include rules to ensure the well-formedness and logical validity of statements expressed in the language of the scientific discipline.
In the cultural heritage sector a powerful IT-supported example of a hierarchical classification system is Iconclass. This supports the documentation of images, in particular art historical images, by providing a systematic collection of 28,000 ready-made definitions of objects, persons, events, situations and abstract ideas that can be the subject of an image. The definitions consist of an alphanumeric classification code and its textual correlate.⁷
For example, for the image we have described in XML in graphic 1 the Iconclass definition is 71A3421 Eve emerges from Adam’s body. The Medieval Illustrated Manuscripts Website of the Koninklijke Bibliotheek, The Hague, has an Iconclass Browser in place⁸ that provides the hierarchical path for this concept in the classification system.
An outstanding example of a controlled vocabulary is the Art & Architecture Thesaurus (AAT), one of the Getty Research Institute’s Vocabulary Databases. It is a structured vocabulary of more than 125,000 terms and other information about concepts that are used for describing fine art, architecture, decorative arts, archival materials and material culture.⁹
The W3C’s Resource Description Framework (RDF) Schema Specification 1.0, in a section on its scope, mentions concept navigation and states: ‘Thesauri and library classification schemes are well known examples of hierarchical systems for representing subject taxonomies in terms of the relationships between named concepts. The RDF Schema specification provides sufficient resources for creating RDF models that represent the logical structure of thesauri (and other library classification systems).’¹⁰
Yet for realising a full-blown cultural heritage ontology for the Semantic Web there are currently limitations on both sides. On the one hand, hierarchical classification systems and structured vocabularies do not lend themselves easily to rich inter-linking of conceptual ‘trees’.
A major step further in this direction is the CIDOC object-oriented Conceptual Reference Model (CRM).¹¹ This provides an ontology of 81 classes and 130 properties, which describes in a formal language concepts and relations relevant to the documentation of cultural heritage.
On the other hand, RDF Schema has limitations when it comes to expressing complex ontological relationships. New languages based on description logics are being developed. These include DAML+OIL and the upcoming Web Ontology Language (OWL), which are capable of fully describing ontologies.
Also worth highlighting is that tools for ontology building are proliferating at the present time. These ontology editors need to be carefully assessed, as their capabilities differ considerably.¹²
Iconclass hierarchical path:
7 Bible
71 Old Testament
71A Genesis from the creation to the expulsion from paradise, and later years of Adam and Eve
71A3 creation of man; the Garden of Eden (Genesis 1:26-2)
71A34 creation of Eve
71A342 Eve is fashioned from Adam’s rib
71A3421 Eve emerges from Adam’s body
6 Upper-level ontologies describe the basic concepts and relationships invoked when information about any domain is expressed in natural language.
7 For in-depth information see the official Iconclass Website: http://www.iconclass.nl
8 Medieval Illustrated Manuscripts Website: http://www.kb.nl/kb/manuscripts/browser/ The subject access system for the Website was conceived by Mnemosyne Partners, building on the Iconclass classification system and technologies. See the valuable information they provide at httpwwwmnemosyneorg businessmsstempexhtml
9 A detailed description of the AAT is provided at http://www.getty.edu/research/tools/vocabulary/aat/about.html
10 http://www.w3.org/TR/2000/CR-rdf-schema-20000327
11 See the information box at the end of the Forum discussion and the sources mentioned in the Semantic Web Terms and Reading List.
12 Michael Denny: Ontology Building: A Survey of Editing Tools (06-11-2002), http://www.xml.com/pub/a/2002/11/06/ontologies.html
Semantic Transformation 1: The RDF Data Model
Progressing towards semantic interoperability, the metadata in the XML documents are now transformed into RDF statements corresponding to the RDF data model. With these so-called RDF ‘triples’ the XML metadata elements are mapped to the RDF classes and properties which are defined by the RDF Schema of the FMS initiative (see section 3).
‘XML is nothing more than a way to standardize data formats. This is not to underplay XML’s importance. A data-format standard makes all of the more glamorous technologies possible, and RDF is the leading example of the benefit that comes once the data format has been standardized. Many proclaim that RDF is really the XML’s killer app, and with good reason. Despite all this, RDF remains somewhat obscure. This is mainly because at its core RDF is very abstract, very dry and very academic.’
Uche Ogbuji: An introduction to RDF (2000), http://www-106.ibm.com/developerworks/library/w-rdf/?dwzone=xml
RDF Data Model
In order to make Web resources semantically interoperable, we need resources that provide machine-understandable information about themselves. In the Semantic Web architecture these statements are built by using the Resource Description Framework (RDF).
RDF defines a data model for the statements, describing typed relationships between uniquely identified resources. RDF distinguishes between:
| resources: familiar examples are a Web page, electronic document or digital image, but in RDF also entities that are not ‘network retrievable’, e.g. museums, curators or bound medieval manuscripts, can be resources
| properties: these identify a specific aspect, characteristic, attribute or relation used to describe the resource
| statements: these associate a value for a named property with the resource
Hence RDF provides a model for describing relationships between resources in terms of named properties and values. The RDF data model intrinsically supports only binary relations. Its base element is the ‘triple’, which takes the form of subject, predicate, object: a resource (the subject) is linked to another resource (the object) through an arc labelled with a third resource (the predicate). The semantics of a triple clearly depends on the property used as predicate.
A convenient way to visualise this is to draw nodes for subject and object and an arrow between them for the predicate (see graphic 2). In this labelled, directed graph subject and predicate (property) are Uniform Resource Identifiers (URIs), and the object is either a URI or a literal (which is drawn as a box). Everything in RDF can be represented by a graph with nodes and arcs, and the data model allows for using the same URI as a node and as an arc label. To represent RDF statements in a machine-processable way, RDF builds on XML. With RDF/XML, a specific XML markup language, RDF information can be represented and exchanged between machines.
Graphic 2: RDF Data Model
Triple 1
Subject: http://www.m-i.org/schemas/images#ColumnMiniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Miniature
Triple 2
Subject: http://www.m-i.org/schemas/images#Miniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Image
With these two triples we state that ColumnMiniature is a subclass of Miniature and that Miniature is a subclass of Image.
The predicate in our statements is the rdfs:subClassOf property, which is predefined in the RDF Schema namespace http://www.w3.org/2000/01/rdf-schema#
The classes Image, Miniature and ColumnMiniature would also need to be defined in our RDF Schema namespace http://www.m-i.org/schemas/images
For details on how to use RDF Schema for defining your domain ontology see section 3. In a nutshell, RDF Schema is a ‘higher-level’ language which is itself defined using RDF.
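The data model of graphic 2 is small enough to be written down directly. The sketch below holds triples as plain Python tuples; it is only an illustration of the model, not an RDF library. The '#' separator in our fictitious namespace, the hasCreator statement and the image URI are assumptions added to show an object that is a literal rather than a URI.

```python
from typing import NamedTuple, Union

RDFS = "http://www.w3.org/2000/01/rdf-schema#"
MI = "http://www.m-i.org/schemas/images#"  # '#' separator is an assumption


class Literal(NamedTuple):
    """A plain value, drawn as a box in the graph instead of a node URI."""
    value: str


class Triple(NamedTuple):
    subject: str                 # always a URI
    predicate: str               # always a URI
    obj: Union[str, Literal]     # a URI or a literal


graph = {
    Triple(MI + "ColumnMiniature", RDFS + "subClassOf", MI + "Miniature"),
    Triple(MI + "Miniature", RDFS + "subClassOf", MI + "Image"),
    # Hypothetical statement with a literal as object.
    Triple("http://www.m-i.org/images/image5kb78d38i", MI + "hasCreator",
           Literal("Alexander Master")),
}


def objects(triples, subject, predicate):
    """All objects linked to a subject through arcs labelled with predicate."""
    return {t.obj for t in triples
            if t.subject == subject and t.predicate == predicate}


print(objects(graph, MI + "ColumnMiniature", RDFS + "subClassOf"))
```

Note that the URI for Miniature appears as the object of one triple and the subject of another, which is what links the two arcs into one graph.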
Semantic Transformation 2: Creating and Validating the RDF Statements
In the mapping process an editor tool (in the FMS case a tool developed by the project team, called Meedio) receives as input the XML documents and assists in transforming them into semantically valid RDF statements (instance descriptions). The tool serves as instance editor, which provides a convenient way of finding and selecting from the XML metadata elements correct instance values for a particular property. The editor tool also serves as semantic metadata validator. When the museum cataloguer saves the set of RDF statements corresponding to an XML document, the semantics of these statements are validated against the property constraints of the FMS ontology. The result of a successful mapping and validation process is a unique set of RDF triples, called the RDF card.
RDF Mapping Rules
When RDF is used to define the meaning of XML metadata elements, a set of mapping rules is created. A mapping rule is a template of RDF triples where XPath expressions are used to identify the actual element values. XPath is a language for addressing parts of an XML document and was designed for use in XML parsing software (XSLT, XPointer and others).
When applying such a rule to an XML document, the XPath expressions are instantiated with matching element values. If the rule matches, the RDF template evaluates to a set of RDF triples where XPath expressions are substituted by the corresponding values of the XML elements.
For example, by applying the template rule <image5kb78d38i/type, mi:hasCreator, image5kb78d38i/creator> to the XML document described in the section on XML, the following result would be obtained: <‘Column Miniature’, mi:hasCreator, ‘Alexander Master’>
mi:hasCreator is an example of an RDF property. Such properties are explained in section 3, RDF Schema.
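How such a template rule is instantiated can be sketched as follows. The rule format (a three-part tuple whose entries starting with "./" are XPath expressions) and the function name apply_rule are our own assumptions for illustration; they do not reproduce the rule syntax of the FMS editor tool. The standard library parser used here supports only a subset of XPath, which is enough for simple element paths.

```python
import xml.etree.ElementTree as ET

NS = {"mi": "http://www.m-i.org/images"}

DOCUMENT = """<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">
  <mi:type>Column Miniature</mi:type>
  <mi:creator>Alexander Master</mi:creator>
</mi:image>"""

# A rule is a (subject, predicate, object) template. Entries starting with
# "./" are XPath expressions; anything else is copied into the triple as is.
RULE = ("./mi:type", "mi:hasCreator", "./mi:creator")


def apply_rule(rule, xml_text):
    """Instantiate the template; return None if the rule does not match."""
    root = ET.fromstring(xml_text)
    triple = []
    for part in rule:
        if part.startswith("./"):
            value = root.findtext(part, namespaces=NS)
            if value is None:  # no matching element in this document
                return None
            triple.append(value)
        else:
            triple.append(part)
    return tuple(triple)


print(apply_rule(RULE, DOCUMENT))
```

A rule that refers to an element missing from the document, such as ./mi:place here, simply does not match and produces no triple.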
Note: Due to the limited space permitted, we do not address issues of term mapping. This is an important aspect of the mapping and validation process carried out in the FMS project. Working with metadata from different museums, they need to deal with partly different terminologies. Their technical solution to synonymous terms (i.e. different terms referring to the same concepts) is to attach synonym sets to the FMS ontology classes. With situations where polysemous terms occur (i.e. the same terms refer to different concepts), the editor tool cannot cope and the cataloguer needs to select the correct interpretation.
Semantic Transformation 3: The RDF Schema (RDFS)
The shared ontology for the textiles domain is created by using Resource Description Framework Schema (RDFS). An RDF Schema is a tool for indicating the classes of resources one wants to describe, as well as for defining the properties used to describe those resources. Furthermore, class/sub-class relationships and property/sub-property relationships can be defined. The museums are mapping their metadata to the classes and properties defined by the RDF Schema of the FMS initiative. Thereby they are making the meaning of the metadata explicit and representing them in a harmonised, uniform way.
RDF Schema
In section Semantic Transformation 1 we have described the data model provided by RDF for expressing statements about Web resources. But we also need a vocabulary for the RDF statements, namely classes and properties defined with RDF Schema (RDFS).
In brief, the RDF Schema mechanism provides a pre-defined vocabulary, a basic type system that can be used in creating domain-specific schemas. Its role is to allow for declaring metadata properties (e.g. for ‘type’, ‘subject’ or ‘creator’), to define the classes of resources they may be used with, to restrict possible combinations, and to detect violations of those restrictions.
Defining classes
With RDF Schema (RDFS), Web resources can be defined as instances of one or more classes. In addition, classes can be organised in a hierarchical fashion. As we hold a collection of digital images drawn from illustrated medieval manuscripts, we first need to define a class of things that are images. In RDF Schema a class is any resource having an rdf:type property whose value is the RDFS-defined resource rdfs:Class.
So, using the basic RDF data model, we define:
mi:Image [resource] rdf:type [property] rdfs:Class [value]
The self-defined prefix mi (for medieval images) stands for the URI reference of our RDF Schema namespace http://www.m-i.org/schemas/images
In our image collection we have various special kinds of digitised images, such as column miniatures, decorated initials, schematic drawings etc. To distinguish, for example, the miniatures, first we need to define a general class Miniature and subclasses of miniatures, e.g. a subclass ColumnMiniature:
mi:Miniature rdf:type rdfs:Class
mi:ColumnMiniature rdf:type rdfs:Class
Secondly, we need to define that mi:ColumnMiniature is a subclass of mi:Miniature and that mi:Miniature is a subclass of mi:Image, for which we use the predefined rdfs:subClassOf property:
mi:Miniature rdfs:subClassOf mi:Image
mi:ColumnMiniature rdfs:subClassOf mi:Miniature
As the rdfs:subClassOf property is transitive, this means that mi:ColumnMiniature is also implicitly a subclass of mi:Image.
Graphic 2 on page 32 visualises this with the nodes and arcs of the basic RDF data model.
Defining properties
In order to make the meaning of our metadata (i.e. ‘type’) explicit, we need to be capable of declaring specific properties that characterise the classes of things we hold at http://www.m-i.org, e.g. digital images of medieval column miniatures.
Basically, RDF Schema defines properties in terms of the classes of resources to which they apply. This is the role of the rdfs:domain and rdfs:range mechanisms.
rdfs:range
The range constraint defines the class or set of classes whose instances can be values of a particular property. If we want to define the property mi:hasType, we must describe this resource (which we locate at http://www.m-i.org/schemas/images) with an rdf:type property whose value is rdf:Property:
mi:hasType [resource] rdf:type [property] rdf:Property [value]
The following RDF statements indicate that mi:ColumnMiniature is a class, mi:hasType is a property, and RDF statements using the mi:hasType property have instances of mi:ColumnMiniature as values:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:range mi:ColumnMiniature
rdfs:domain
The domain constraint restricts the set of classes whose instances may have a particular property attached to them. If we want to indicate that the property mi:hasType applies to instances of class mi:ColumnMiniature, we would write:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:domain mi:ColumnMiniature
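How such declarations could be used to detect violations can be sketched as follows. This is a minimal illustration in plain Python that follows the primer's reading of rdfs:domain and rdfs:range as restrictions; the instance names (ex:img1, ex:img2, ex:note1) and the function names are hypothetical and serve only the example.

```python
# Illustrative sketch: check statements against rdfs:domain and rdfs:range.
# Instance names and helper names are assumptions made for this example.

RDF_TYPE = "rdf:type"
RDFS_DOMAIN = "rdfs:domain"
RDFS_RANGE = "rdfs:range"

# Schema statements taken from the primer.
schema = {
    ("mi:ColumnMiniature", RDF_TYPE, "rdfs:Class"),
    ("mi:hasType", RDF_TYPE, "rdf:Property"),
    ("mi:hasType", RDFS_DOMAIN, "mi:ColumnMiniature"),
    ("mi:hasType", RDFS_RANGE, "mi:ColumnMiniature"),
}

# Hypothetical instance data.
data = {
    ("ex:img1", RDF_TYPE, "mi:ColumnMiniature"),
    ("ex:img2", RDF_TYPE, "mi:ColumnMiniature"),
    ("ex:img1", "mi:hasType", "ex:img2"),   # subject and value are both typed
    ("ex:note1", "mi:hasType", "ex:img2"),  # ex:note1 has no type: domain violation
}


def objects(triples, subject, predicate):
    """Return all objects of statements with the given subject and predicate."""
    return {o for s, p, o in triples if s == subject and p == predicate}


def violations(schema, data):
    """List (subject, property, value, kind, expected class) for each violation."""
    problems = []
    for s, p, o in sorted(data):
        if p == RDF_TYPE:
            continue  # typing statements are not checked themselves
        for cls in sorted(objects(schema, p, RDFS_DOMAIN)):
            if cls not in objects(data, s, RDF_TYPE):
                problems.append((s, p, o, "domain", cls))
        for cls in sorted(objects(schema, p, RDFS_RANGE)):
            if cls not in objects(data, o, RDF_TYPE):
                problems.append((s, p, o, "range", cls))
    return problems


for problem in violations(schema, data):
    print(problem)
```

The sketch ignores subclass reasoning for brevity; a fuller check would also accept instances of subclasses of the declared domain or range class.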
Benefits of RDF
In a SearchWebServices.com definition of RDF, some benefits of RDF are mentioned:
| ‘By providing a consistent framework, RDF will encourage the providing of metadata about Internet resources.
| Because RDF will include a standard syntax for describing and querying data, software that exploits metadata will be easier and faster to produce.
| The standard syntax and query capability will allow applications to exchange information more easily.
| Searchers will get more precise results from searching, based on metadata rather than on indexes derived from full text gathering.
| Intelligent software agents will have more precise data to work with.’¹³
This is a well-crafted listing of RDF benefits, from provision and exchange of better metadata to agents working with them, hopefully for the benefit of humans. But, as explicitly stated by SearchWebServices.com, these are only potential benefits, i.e. they depend on the level of actual uptake of RDF.
13 whatis.com, searchWebServices.com Definitions - Resource Description Framework, http://searchwebservices.techtarget.com/sDefinition/0,,sid26_gci213545,00.html (last updated July 27, 2001)
Generating and Using the Knowledge Space
The RDF cards represent the original XML documents at the semantic level. The union of such RDF cards constitutes a knowledge base, which is a harmonised semantic representation of the underlying heterogeneous databases.
However, so far the RDF instance descriptions have not left the museum. The museum has complete control of the information it wants to publish, and it does not need to allow the FMS system access to its internal database system. The RDF data are placed in a public directory on the museum’s WWW server.
The Web crawler of the FMS system harvests the instance descriptions from the different museums, and the system combines them into an RDF repository. This repository is a large semantic graph that consists of the shared ontology and metadata.
How does a user now search and navigate in this knowledge space? In the FMS system this is implemented by a server-side software called Ontogator. Based on the semantic graph, this software dynamically generates semantic linkages for the user’s Web browser.
One way of using the FMS system is view-based filtering. The user can select classes of resources from the ontology, and the system finds the instances that match the selected class restrictions. By constraining classes (views) further, the collection instance data searched for are eventually found.
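The idea of view-based filtering can be illustrated with a short sketch. It is not the Ontogator implementation, which is not described here in detail; the class hierarchy, the views (‘object_type’, ‘place’), the item identifiers and the function names are all hypothetical and chosen only to show how successive class restrictions narrow the result set.

```python
# Illustrative sketch of view-based filtering; all names and data are assumptions.

# Hypothetical ontology fragment: child class -> parent class.
SUBCLASS_OF = {
    "Chair": "Furniture",
    "Table": "Furniture",
    "Helsinki": "Finland",
    "Espoo": "Finland",
}

# Hypothetical harvested instance metadata: item -> {view: class}.
ITEMS = {
    "item-1": {"object_type": "Chair", "place": "Helsinki"},
    "item-2": {"object_type": "Table", "place": "Espoo"},
    "item-3": {"object_type": "Chair", "place": "Espoo"},
}


def is_a(cls, selected):
    """True if `cls` equals `selected` or is one of its (implied) subclasses."""
    while cls is not None:
        if cls == selected:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def filter_items(selections):
    """Return the items matching every selected class, one selection per view."""
    return sorted(
        item
        for item, classes in ITEMS.items()
        if all(
            view in classes and is_a(classes[view], selected)
            for view, selected in selections.items()
        )
    )


print(filter_items({"object_type": "Furniture"}))
# prints ['item-1', 'item-2', 'item-3']
print(filter_items({"object_type": "Chair", "place": "Espoo"}))
# prints ['item-3']
```

Each additional selection acts as a further restriction, so constraining a second view reduces the three furniture items to the single chair located in Espoo.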
The software also supports topic-based navigation by providing semantic links between topics of interest, the creation of which is based on the collection domain ontology and the related metadata of the collection records. This means that the links also provide the user with an impression of the wider context and pragmatics of the objects in the museums’ collections.
From human users to software agents
As described in the Finnish Museums on the Semantic Web example, the RDF repository is a large semantic graph that consists of the shared ontology and metadata of the participating museums. Such a repository can be queried, and the results, a set of pointers to the relevant resources, can be accessed using Web browsers. The opportunities provided by a system like the one developed by the FMS initiative (e.g. topic-based navigation) are at present restricted primarily to human users.
The Semantic Web vision includes intelligent software agents which ‘understand’ semantic relationships between Web resources and seek relevant information as well as perform transactions for humans.¹⁴
This software would be capable of autonomous action, i.e. it could run without direct human control or constant supervision, and ideally would be very flexible in doing this. Characterisations of this flexibility include actions that are ‘reactive’, ‘proactive’ and ‘social’ (see below).
While the basic idea of agents is very intuitive and appealing, the actual theory is complex, the tools are immature, and the solutions are small and prototype-based. In fact, as a parallel distributed systems technology, agents belong to the most complex class of software technology.
However, this primer will conclude with a summary of what an intelligent software agent is and what such software would generally be capable of doing. This should also serve as an indication of how great the challenge for research and technological development is to make the full Semantic Web vision a reality.
Intelligent Software Agents
The following definitions are taken from Michael Wooldridge’s introduction to multiagent systems.¹⁵
Agent
‘An agent is a computer system capable of autonomous action in some environment.’
Intelligent agent
‘An intelligent agent is a computer system capable of flexible autonomous action in some environment.’
Flexible autonomous action
‘By flexible autonomous action we mean reactive, proactive, social.’
| Reactivity: ‘A reactive system is one that maintains an ongoing interaction with its environment, and responds to changes that occur in it (in time for the response to be useful).’
| Proactiveness: ‘An agent serves a purpose and therefore exhibits goal-directed behaviour, including the capacity to recognise opportunities for useful courses of action.’
| Social ability: ‘Social ability in agents is the ability to interact with other agents (and possibly humans) via some kind of agent communication language, and perhaps cooperate with others.’
Desirable further properties of agents are:
| Mobility: the ability to move around an electronic network.
| Rationality: an agent will act in such a way that it does not prevent itself from achieving its goals (as far as this is possible with a limited set of beliefs representing its world knowledge).
| Learning: an agent will improve its performance over time.
14 T. Berners-Lee, J. Hendler, O. Lassila, Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html
15 M. Wooldridge, An Introduction to Multiagent Systems. Chichester: Wiley, 2002, and http://www.csc.liv.ac.uk/~mjw/pubs/imas/
Resources
In the primer no references are made to the documents of the World Wide Web Consortium (W3C). All relevant W3C recommendations can be found at http://www.w3c.org
Of the wealth of introductory materials on XML and RDF available on the Web, the following in particular are useful to consult for further details:
http://www.w3schools.com/xml/
http://www.w3schools.com/schema/
http://www.w3.org/TR/rdf-primer/
K. S. Candan, H. Liu, R. Suvarna: Resource Description Framework: Metadata and its Applications. In: SIGKDD Explorations, Vol. 3.1 (2001), 6-19
Pierre-Antoine Champin: RDF Tutorial (2001), http://www710.univ-lyon1.fr/~champin/rdf-tutorial/rdf-tutorial.html
S. Decker, M. P. Mitra, S. Melnik: Framework for the Semantic Web: An RDF Tutorial, http://www.ida.liu.se/~asmpa/courses/sweb/rdf/rdf_tutorial.pdf
[Graphic 3: Set-up of the Finnish Museums on the Semantic Web System. Diagram title: ‘The Finnish Museums on the Semantic Web: Overview of the System’s Set-up’. Labelled components, from the user side to the museum side: User Client (WWW Browser; topic-based navigation, view-based filtering); Server (Ontogator software); RDF database (semantic graph: knowledge space of shared ontology and metadata); Ontology/RDFS; Web crawler; RDF instances (RDF Schema: semantic interoperability); Metadata Editor; XML repositories (XML Schema: syntactic interoperability); Collection databases 1, 2 ... n (Relational Schemas, DBMS).]
THE DARMSTADT FORUM
PARTICIPANTS

Wernher Behrendt, Salzburg Research, Austria
Wernher Behrendt is a Senior Researcher at the Sun Technology and Research Excellence Center at Salzburg Research (Austria), working on multimedia middleware and interoperation issues. He holds an MSc in Cognitive Science from Manchester University and has more than 10 years’ experience in near-to-market IT research. From 1989 to 1995 he was a Senior Research Associate in the Informatics Department at Rutherford Appleton Laboratory (UK), working on embedded knowledge-based systems in distributed multimedia presentation systems. From 1995 to 1998 he was a Senior Research Associate at Cardiff University (UK), working on interoperation between heterogeneous information systems. Mr Behrendt has held courses in Computer Science and has worked on projects ranging from software engineering methods and quality assurance to legacy system reengineering using migration methods and distributed systems middleware.
E-mail: wernher.behrendt@salzburgresearch.at

Paolo Buonora, Archivio di Stato di Roma, Italy
Paolo Buonora holds a degree in Philosophy from the University of Rome ‘La Sapienza’ (1976). He worked in the Italian State Archive Administration from 1978, where he was first involved in editing the Guida Generale degli Archivi di Stato italiani. From 1986 he worked in the Soprintendenza archivistica per il Lazio, surveying audiovisual archives and municipal archives; from 1989 to 1991 he was at Perugia University, engaged in doctoral research in ‘Urban and rural history’; and from 1991 to 1994 he was again in the Soprintendenza archivistica per il Lazio. After 1994 he worked in the Archivio di Stato di Roma, where he was responsible for the photograph service and several working groups on informatics application in archival documentation. From 1997 until the present time he has planned and directed the Imago II project in the Archivio di Stato di Roma.
See httpwwwasrmarchivbeniculturaliitsidimagoIMAGOIIenhtml

Samuel Cruz-Lara, University of Nancy 2 and LORIA, France
Samuel Cruz-Lara obtained a Master’s degree in Computer Science in 1984 (University Henri Poincaré, Nancy 1) and a PhD degree in Computer Science in 1988 (National Polytechnic Institute of Lorraine). The central topic of his PhD thesis was the generation of integrated development environments by using attribute grammars.
He is currently Associate Professor at the University of Nancy 2 (Institute of Technology, Computer Science Department) and permanent Researcher at LORIA (Lorraine Laboratory for Research in Computer Science and its Applications – UMR 7503 – CNRS – INRIA – Universities of Nancy). He is a member of the ‘Language and Dialogue’ team and has conducted several research activities on distributed software architectures and textual linguistic resources management. He is currently working in the context of distributed architectures and multimedia resources management. Dr Cruz-Lara has participated in several projects, in particular CNRS/SILFIDE and MLIS-ELAN, and he is at present co-leader of the ‘Digital Museum’ project. This is a joint project between LORIA and the National Chi-Nan University. The ‘Digital Museum’ project is sponsored by the National Science Council of the Republic of China (Taiwan) and supported by INRIA (France).

Costis Dallas, Critical Publics SA, Greece
Costis Dallas is Chairman and Senior Researcher of Critical Publics (http://www.criticalpublics.com), a group of companies active in the field of strategic communications, creative design and technology. He is currently a Lecturer in the Department of Communication and Mass Media of Panteion University and has over 15 years of research and professional experience in hypermedia applications, human factor issues and cultural information systems. Dr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA, co-founder and Chair of the Multimedia Working Group of the International Council of Museums (CIDOC/ICOM), Head of Documentation and Systems of the Benaki Museum, General Director of the Foundation of the Hellenic World, Special Secretary of the Greek Ministry of Education in charge of libraries, archives and instructional technologies, and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: info@criticalpublics.com

Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (http://www.nl.adlibsoft.com), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that he joined Databasix, producer of the ADLIB software. In 1991 Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was renamed later as ADLIB Information Systems.
ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include:
CIMI Dublin Core test bed - ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database, in addition to the ‘standard’ HTML/XML access.
SPECTRUM-XML - A project of the UK mda, together with a team of software vendors, to produce a schema for exchange of data which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software.
Internet Gelderse Musea (http://www.igem.nl) and Maritiem Digitaal (http://www.maritiemdigitaal.nl) - Two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.
E-mail: bert@nl.adlibsoft.com

Nicola Guarino, Italian National Research Council, Italy
Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991 and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS) and associated editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: Nicola.Guarino@ladseb.pd.cnr.it

Janneke van Kersen, Digital Heritage Association, The Netherlands
Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992 she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing, and more recently she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University.
Since October 1999 Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations, such as museums, archives, archaeological organisations, monuments and special collections of libraries, in a context-rich and XML-based environment. Interoperability from both a technical and an organisational perspective is therefore a major issue. The Association offers access to heritage information to the general public at http://www.cultuurwijzer.nl and customised access for the educational field at http://www.cultuurwijs.nl
E-mail: jannekevankersen@den.nl

Marco Meli, EDW International, Italy
Marco Meli is CEO and co-founder of EDW International (http://www.edw-international.com), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia and VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meli@edw.it

Paul Miller, UKOLN, UK
Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, http://www.ukoln.ac.uk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK’s Further and Higher Education Funding Councils and Resource: the Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI) and the Metadata Working Group of the UK Government’s Office of the e-Envoy.
E-mail: P.Miller@ukoln.ac.uk

Frank Nack, CWI, The Netherlands
Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on ‘The Application of Video Semantics and Theme Representation for Automated Film Editing’ at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling, and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document, and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: Frank.Nack@cwi.nl

Franco Niccolucci, Florence University, Italy
Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference.
See also http://www.geog.port.ac.uk/hist-bound/people/niccolucci.htm

Seamus Ross, HATII, University of Glasgow, UK
Dr Seamus Ross is Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Network) (IST-2001-32706), a European Union funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously he was Assistant Secretary for Information Technology at the British Academy, and before that worked for a company specialising in expert systems and software development, as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group’s PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: S.Ross@hatii.arts.gla.ac.uk

Andrea Scotti, Institute & Museum for the History of Science, Italy
Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995 he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.
Since 1996 Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza, the National Central Library and under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (01.02.01-30.07.03), funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies.
See http://cweb.inria.fr/Projects/Mesmuses
E-mail: scottian@galileo.imss.firenze.it

DIGICULT PROJECT INFORMATION

DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over the period of 30 months (03/2002-08/2004).
In order to encourage early take-up, DigiCULT produces seven Thematic Issues and three Technology Watch Reports, along with the newsletter DigiCULT.Info.
DigiCULT draws on the results of the strategic study ‘Technological Landscapes for Tomorrow’s Cultural Economy (DigiCULT)’, that was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications) in 2000 and completed in 2001.
Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at http://www.digicult.info
For further information on DigiCULT please contact the team of the project co-ordinator:
Mr Guntram Geser, guntram.geser@salzburgresearch.at, Phone: +43-(0)662-2288-303
Mr John Pereira, john.pereira@salzburgresearch.at, Phone: +43-(0)662-2288-247
Salzburg Research Forschungsgesellschaft
Jakob-Haringer-Str. 5/III
A-5020 Salzburg, Austria
Phone: +43-(0)662-2288-200
Fax: +43-(0)662-2288-222
http://www.salzburgresearch.at
Project Partner:
HATII - Humanities Advanced Technology and Information Institute, University of Glasgow
http://www.hatii.arts.gla.ac.uk
Contact: Mr Seamus Ross, s.ross@hatii.arts.gla.ac.uk
The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom
DigiCULT Thematic Issue 1 - Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum, held in Barcelona on May 6th, 2002, in the context of the DLM-Conference 2002.
DigiCULT Thematic Issue 2 - Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum, held in Essen, Germany, on September 3rd, 2002, in the context of the AIIM Conference @ DMS EXPO.
DigiCULT Thematic Issue 3 - Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum, held on January 21st, 2003, at Fraunhofer IPSI, Darmstadt, Germany.
DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum, on Learning Objects, that will take place at the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, on July 2nd, 2003.
IMPRINT
This Thematic Issue is a product of the DigiCULTProject (IST-2001-34898)
Authors:
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy
Images:
Images for this Thematic Issue have been provided by and are reproduced with kind permission of the Koninklijke Bibliotheek – National Library of the Netherlands, The Hague, Netherlands.
Graphics & Layout:
Jan Steindl, Salzburg Research
ISBN 3-902448-00-8
Printed in Austria
© 2003
IMAGES
Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I. Image on p. 5 from Fol. 264r, size 80x75; illuminator: Orosius Master a.o.
Bible Historiale, Paris, c. 1320-1340, Volume I. Image on p. 7 from Fol. 2r, size 45x50; illum.: ‘Sub-Fauvel’ Master
Historic Bible, Utrecht, c. 1430, Volume I. Images on pp. 8, 10, 14, 28 from Fol. 3r, 3v, 4v, 7v, size 55x85 to 60x85; illum.: Alexander Master o.a.
Psalter, Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468. Images on pp. 11, 13: Wednesday, Matins, Invitatorium, from Fol. 219r
Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350. Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron) from Fol. 104rb1, 106rb2, 124r, 111ra, size 40x55 to 50x55
Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335. Image on p. 21 from Fol. 4va1, size 45x55
Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460. Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41
Psalter, Normandy, c. 1180. Image on p. 26 from Fol. 3v, size 160x125
Breviary, Cambray(?), c. 1275-1300. Images on pp. 27, 29, 31, 33, 34, 35, 37 from Fol. 211r, size 205x135
© Koninklijke Bibliotheek, The Hague. Used with permission.
Genesis – The Creation: Division of the Waters Above and Below the Firmament
mute. By adding descriptive information to content and resources, and representing both the descriptive information and the content in well-defined, consistent and structured ways, ‘mechanised agents’ could be enabled to use web information ‘intelligently’. Tim Berners-Lee, Jim Hendler and many other researchers believe that commercial and public sector institutions are increasingly recognising the benefits of ensuring that their content is adequately represented, so that it is visible and discoverable within the context of the Semantic Web.
The Semantic Web will enable the heritage sector to make its information available in meaningful ways to researchers, the general public and even its own curators. The public will be able to plan visits to institutions by, for example, dynamically relating opening times to public transport schedules. They will be able to use information to discover whether or not that vase in the attic or basement is really Ming, as their grandmother claimed, by comparing it to the holdings of heritage institutions across the world. Curators will benefit from the ability to define an exhibition and have the entire process, from the identification of the pieces to be shown in the exhibition to the production of the catalogue and publicity material, automatically handled by their ‘exhibition agents’.
TOWARDS AN INTEROPERABLE
SEMANTIC WEB FOR HERITAGE
RESOURCES
Delivering the Semantic Web to the heritage sector depends upon (a) the syntactical and semantic mark-up of content, (b) the development of better knowledge analysis and modelling tools, (c) widespread adoption of interoperable knowledge representation languages, and (d) the construction of suitable ontologies. In most of this the heritage sector is lagging behind. We have not yet successfully represented sufficient quantities of our data in ways that make it accessible to human web users, let alone in ways that would make it feasible for ‘mechanised agents’ to reason about in meaningful ways. ‘Languages for representing data and knowledge are an important aspect of the Semantic Web’ (Klein 2001, 26). The languages that are currently the focus of the most substantial discussion, such as RDF, DAML+OIL and OWL¹, do not necessarily provide a suitable framework for delivering the Semantic Web. This point has been increasingly argued in the literature, although in practice we still tend to emphasise the possibilities of representation mechanisms such as RDF(S), because it provides a flexible and extensible mechanism to represent metadata.
A debate is raging about which language should be used to represent semantics on the web. Resource Description Framework (RDF), an XML-based mechanism for expressing metadata, has been put forward at the basic level, but there is a growing body of opinion that indicates it does not have the richness that is necessary to make a suitable language. One of its shortcomings is that it cannot support syntax. In response, other languages, such as DAML+OIL, have been developed. As an indication of the current levels of flux, in a fundamental paper Patel-Schneider and Siméon from Bell Labs Research remark that ‘…there is a semantic discontinuity at the very bottom of the Semantic Web, interfering with the stated goal of the Semantic Web: If Semantic languages do not respect World-Wide Web data, then how can the Semantic Web be an extension of the World-Wide Web at all?’ (2002a, 147).
The strength of XML is that it does not itself constrain how the data will be interpreted. While XML does not imply a specific interpretation of the data, how the material is marked up does constrain how it can be used. Fallside (2001) has made plain the weaknesses of using DTDs as a way of specifying semantic properties in XML (eXtensible Markup Language). XML Schemas offer a solution to these weaknesses, especially where those weaknesses arise from representational problems. On the other hand, the hierarchical nature of XML does not fit all domains, it ‘does not encode the data’s use and semantics’, and DTDs and XML Schemas do not specify the data’s meaning, although they do specify the names of elements and attributes. Will the Semantic Web produce different levels of sophistication in the representation of data and knowledge in the web-world? If it does, will this create a patchy representation of web information that will make the Semantic Web of limited value?
1 See the ‘Semantic Web Terms and Reading List’ in this Issue.
ONTOLOGIES – THE JEWELS OF THE SEMANTIC WEB
For the Semantic Web to succeed it will require not only modelling languages such as XML, RDF and OWL, but also methodologies for extracting and defining the knowledge that is to be represented. Decades of research and commercial attempts to exploit knowledge-based systems have demonstrated the complexity of knowledge modelling. Until there is such a methodology, the possibilities of XML (or any other technology) as a knowledge representation language will not be achieved.
The success of the Semantic Web will depend heavily upon the creation of suitable ontologies. To avoid adding new variants to definitions, we will follow James Hendler’s definition of ontology as ‘a set of knowledge terms, including the vocabulary, the semantic interconnections, and some simple rules of inference and logic for some particular topic’ (Hendler 2001, 30). One of the major hurdles facing us in building the Semantic Web is the lack of suitable ontologies. Languages such as OWL enable ontologies to represent ‘class taxonomies’ and provide mechanisms to enable their rapid development. For example, concepts and relationships can be established such as ‘watercolour is a type of painting’ or ‘a necklace is a type of jewellery’. But what about their multilingual capabilities? An ontology may well know that a ‘watercolour is a painting’, but it does not necessarily mean that it knows that an ‘aquarelle is a type of painting’ or that a ‘watercolour is a type of peinture’. In addition, and probably first, we need to consider:
• Can we cost the creation of appropriate ontologies for the heritage sector?
• How can we prioritise the ontologies that are needed (e.g. which ones should the heritage sector develop and which ones will we be able to borrow from other sectors)?
• What heritage-based organisations should focus on ontology creation?
• Ontologies often fail to be interoperable. What solutions are there to this problem and how can they be made to work effectively?
• Does OWL (W3C’s Web Ontology Language, a semantic markup language for publishing and sharing ontologies) provide a suitable mechanism for ontology creation for the heritage sector?
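The multilingual gap described above can be made concrete with a small sketch. It is not OWL and not any heritage institution's ontology; the `Concept` and `Taxonomy` classes, the language codes and the concept keys are all hypothetical names chosen for illustration. The idea shown is simply that ‘is a type of’ links are recorded between language-neutral concepts, while the words ‘watercolour’, ‘aquarelle’, ‘painting’ and ‘peinture’ are only labels attached to those concepts.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Concept:
    """A language-neutral concept (hypothetical structure for this sketch)."""
    key: str
    labels: Dict[str, str]                             # language code -> label
    broader: List[str] = field(default_factory=list)   # keys of parent concepts


class Taxonomy:
    """A tiny class taxonomy that can be queried by a label in any language."""

    def __init__(self) -> None:
        self._concepts: Dict[str, Concept] = {}
        self._by_label: Dict[str, str] = {}

    def add(self, concept: Concept) -> None:
        self._concepts[concept.key] = concept
        for label in concept.labels.values():
            self._by_label[label.lower()] = concept.key

    def resolve(self, label: str) -> Optional[str]:
        """Return the concept key for a label, or None if the label is unknown."""
        return self._by_label.get(label.lower())

    def is_a(self, narrower_label: str, broader_label: str) -> bool:
        """True if the first label names the second concept or one of its subtypes."""
        start = self.resolve(narrower_label)
        target = self.resolve(broader_label)
        if start is None or target is None:
            return False
        seen, stack = set(), [start]
        while stack:
            key = stack.pop()
            if key == target:
                return True
            if key in seen:
                continue
            seen.add(key)
            concept = self._concepts.get(key)
            if concept is not None:
                stack.extend(concept.broader)
        return False


# Illustrative content taken from the examples in the text.
heritage = Taxonomy()
heritage.add(Concept("painting", {"en": "painting", "fr": "peinture"}))
heritage.add(Concept("watercolour", {"en": "watercolour", "fr": "aquarelle"},
                     broader=["painting"]))
heritage.add(Concept("jewellery", {"en": "jewellery"}))
heritage.add(Concept("necklace", {"en": "necklace"}, broader=["jewellery"]))

print(heritage.is_a("aquarelle", "painting"))
print(heritage.is_a("watercolour", "peinture"))
print(heritage.is_a("necklace", "painting"))
```

With labels separated from concepts in this way, adding a further language is a matter of adding labels, not of restating every ‘is a type of’ relationship. Whether a given ontology language supports this cleanly is exactly the kind of question the list above raises.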
Gómez-Pérez and Corcho (2002), in an analysis of ‘Ontology Languages for the Semantic Web’, found that the measure of expressiveness in the current generation of ontology creation languages is a spectrum from XOL, RDF(S), SHOE, OML and OIL to DAML+OIL at the richest end of the scale. Indeed, in their experience, while any of these languages will work for very simple ontologies, any attempt to use a weak language to create a complex ontology will fail.
Proof and trust is emerging as another central issue. How do we know that what our agent has discovered through its trawl of the Semantic Web can be trusted? Even in the case of ontologies, how should we decide whose ontology to trust? This is especially important where two ontologies may conflict with one another. Similarly, we are faced with the difficulties of ensuring and maintaining semantic integrity and a lack of methods for testing its presence.
LEGITIMISING THE SEMANTIC WEB INVESTMENT
Heflin and Hendler (2001) make the valuable proposal that semantic mark-up should be seen as one aspect of web-page design. This, in their view, would go a long way to ensuring that the costs of this mark-up (and the underlying information analysis that is necessary to make it happen) were met at the appropriate stage of the process of putting material up on the web. However, Heflin and Hendler’s proposal that semantic mark-up should be embedded into web-page design fails to recognise that the fundamental fabric of the web is changing. For this to happen we need a stronger argument for the benefits that such investment will bring to the heritage sector.
Haustein and Pleumann (2002) have noted that the successful development of the World Wide Web benefited from two factors: ‘Participation was simple and the results of effort were immediately visible to the creator.’ As they argue, while these two success criteria, best classified in my view as ease of use and instant gratification, were characteristic of the WWW, they are not embedded into the fabric of the Semantic Web. The Semantic Web is hard and rewards are neither immediate nor assured. While in the long term it may bring tremendous benefits, the near-term take-up will be slow.
At least three other factors contributed to fostering the success of the web. Firstly, the early web developments concentrated on content creation and not on the creation of representation languages. The initial instantiation of HTML was simple, but it worked, and material tagged using it remained accessible. We were not forced to dispose of work that we had ‘webised’ unless we wished to replace its representation with more sophisticated ones. Secondly, the value of the content that we put up increased as more users put up content of their own, because the additional content attracted more users. Thirdly, to benefit you did not need to generate a lot of content; a very little would do, and you could incrementally add more later. Slowly, heritage institutions found ways to take advantage of the opportunities offered by the web; there are still many small and medium-sized heritage institutions that have not.
[Image: Genesis – The Creation: Division of Sea and Earth; Creation of Trees and Plants]
Indeed, the heritage sector is likely to be left behind because the financial rewards for creating the mark-up necessary to make the Semantic Web a reality are only evident to the commercial sector. There can be little doubt that the access to and understanding of the heritage would benefit from a world in which the vision of the Semantic Web were realised. But this is not the first information technology for which the benefits were promising. Even very simple strategies, such as the use of databases to enable collection description, have been shown over a period of nearly thirty-five years to bring benefits to heritage sector institutions through better knowledge about, care of, and access to their collections. In the ALM sector only libraries can be said to have fully taken advantage of the technology to describe their collections, and even here a close look shows that this has not covered all their holdings and not every institution. For instance, few libraries in the UK have online catalogues of their pre-1700 items and almost none have accurately described their photographic holdings at anything deeper than collection level. The same can be said of museums, where descriptions are limited except, of course, at the major institutions. In 1997 a survey in the United Kingdom showed that small and medium-sized institutions were struggling to participate in the computer-based description of their holdings. This was even before they considered putting the output of those holdings online. I would argue that this should hardly be surprising, as the heritage sector has already been left behind in the development of online information in the web-world. Too few institutions have too little visible content that is actually usable. If the heritage sector is to make a near-term contribution to the development of the Semantic Web, it is going to be very moderate. It is very unlikely that developments will be related to reasoning about the heritage in the ways considered by Amann et al. (2002). The ALM sector is more than likely to participate in the development of the Semantic Web through the creation of semantic mark-up of information about access arrangements, such as opening hours and details of facilities. This information is more likely to be useful to the tourism agent described by Tim Berners-Lee in his 2001 Scientific American article. While this may be a very positive way of integrating the heritage into the Semantic Web, it does not maximise the potential benefits.
The days when a curator who wishes to hold an exhibition on the representations of Salome since the 15th century will be able to ‘load’ an agent with the request to identify, select, negotiate the loan of, and arrange the transportation of the key 100 works of art are a long way off. The fundamental descriptions of holdings are not currently available; where they exist they are not online, and they certainly have not been semantically encoded to make them usable by our Salome agent. For those who have worked on Knowledge Representation, the vision of the Semantic Web holds promise. Knowledge Representation is hard, especially if you intend any particular representation to be usable by others, either as a decision-making resource or for research purposes. Efforts in the 1980s and early 1990s, such as those in archaeology, failed. The reasons Knowledge Representation failed to achieve its promise ranged from the poor quality of knowledge extraction strategies, the lack of fundamental representation methodologies, the limited applicability of methods to knowledge domains, and the problems of boundary constraint and creep, to the high costs of developing applications. The Semantic Web could breathe new life into this earlier promise by providing ways to carve up the problem while bringing us immediate successes.
CONCLUSION
Over the next five years the possibilities offered by the Semantic Web will bring little near-term benefit for the heritage sector unless that sector co-ordinates its efforts to ensure that the fundamental building blocks that are necessary for the Semantic Web to be a success are put in place. We need some quick wins. A quick win involves identifying a domain that every institution can be encouraged to represent semantically, and the placing in the public domain of a ‘personalisable agent’ that can take advantage of these semantics. Three factors could underpin a quick win: (a) a narrowly restricted knowledge domain of real public value, (b) an accessible and narrow ontology, and (c) a personalisable tool for processing knowledge.
Ultimately, the same factors that constrain the heritage sector’s ability to take full advantage of the Web will constrain the penetration and pervasiveness of the Semantic Web in the heritage sector. The success of the Semantic Web in the heritage sector depends upon its adopting an XML-based approach and a significant experiment that demonstrates its benefits to the wider community. Even for all its weaknesses, the Semantic Web offers a tantalising solution to the problem of information overload created by the web, and the heritage sector needs to address how it can take advantage of the opportunities it offers.
Bibliography
Amann, B., Beeri, C., Fundulaki, I. and Scholl, M. 2002. Ontology-Based Integration of XML Resources, in I. Horrocks and J. Hendler (eds), The Semantic Web – ISWC 2002, Berlin: Springer, 117-131.
Berners-Lee, T., Hendler, J. and Lassila, O. 2001. The Semantic Web, in Scientific American, May 2001.
Candan, K.S., Liu, H. and Suvarna, R. 2001. Resource Description Framework: Metadata and Its Application, in SIGKDD Explorations 3.1, 6-19. http://www.acm.org/sigs/sigkdd/explorations/issue3-1/candan.pdf
Doan, A., Madhavan, J., Domingos, P. and Halevy, A. 2002. Learning to Map between Ontologies on the Semantic Web, in Proceedings WWW2002, 7-11 May 2002 (Honolulu), 662-673.
Gómez-Pérez, A. and Corcho, O. 2002. Ontology Languages for the Semantic Web, in IEEE Intelligent Systems, January/February 2002, 54-60.
Haustein, S. and Pleumann, J. 2002. Is Participation in the Semantic Web Too Difficult?, in I. Horrocks and J. Hendler (eds), The Semantic Web – ISWC 2002, Berlin: Springer, 448-453.
Heflin, J. and Hendler, J. 2001. A Portrait of the Semantic Web in Action, in IEEE Intelligent Systems, March/April 2001, 54-59.
Hendler, J. 2001. Agents and the Semantic Web, in IEEE Intelligent Systems, March/April 2001, 30-37.
Hendler, J., Berners-Lee, T. and Miller, E. 2002. Integrating Applications on the Semantic Web, in Journal of the Institute of Electrical Engineers of Japan 120.10, 676-680.
Klein, M. 2001. XML, RDF and Relatives, in IEEE Intelligent Systems, March/April 2001, 26-28.
McGuinness, D.L., Fikes, R., Hendler, J. and Stein, L.A. 2002. DAML+OIL: An Ontology Language for the Semantic Web, in IEEE Intelligent Systems, September/October 2002, 72-80.
Patel-Schneider, P. and Siméon, J. 2002a. Building the Semantic Web on XML, in I. Horrocks and J. Hendler (eds), The Semantic Web – ISWC 2002, Berlin: Springer, 147-161. http://www-dbs.cs.uni-sb.de/lehre/ss03/xml-seminar/Material/PS02.pdf
Patel-Schneider, P. and Siméon, J. 2002b. The Yin/Yang Web: XML Syntax and RDF Semantics, in Proceedings WWW2002, 7-11 May 2002 (Honolulu), 443-453.
Renear, A., Dubin, D., Sperberg-McQueen, C.M. and Huitfeldt, C. 2002. Towards a Semantics for XML Markup, in DocEng’02, 8-9 November 2002 (McLean, VA), ACM Publication.
Shah, U., Finin, T., Joshi, A., Cost, R.S. and Mayfield, J. 2002. Information Retrieval on the Semantic Web, in CIKM’02, 4-9 November 2002, 461-468.
DEVELOPMENT OF THE SEMANTIC WEB MUST BEGIN AT THE GRASS ROOTS LEVEL
AN INTERVIEW WITH JANNEKE VAN KERSEN, DUTCH DIGITAL HERITAGE ASSOCIATION, THE NETHERLANDS
By Joost van Kasteren
‘To be successful, the Semantic Web for the cultural heritage sector will have to develop from the grass roots level. A top-down approach, whereby institutions have to squeeze themselves into a certain format, is not going to work.’ Janneke van Kersen has strong views on the initiatives that are currently being undertaken to develop a Semantic Web. ‘They are not going to work for the cultural heritage institutions if you do not take into account the position that they are in. Especially not if the institutions are forced to overhaul their digitisation projects completely.’
Kersen graduated in Art History and did a postgraduate course on Historical Information Processing. Since 1999 she has been a consultant with the Dutch Digital Heritage Association (Vereniging DEN), which supports cultural heritage institutions, large and small, in developing strategies to face the digital future.
The key objectives of the DEN are to assist institutions in digitising and documenting their collections according to high quality standards, and to assure cross-domain and cross-institutional access to heritage information in a context-rich, structured environment. The methods used to realise these objectives are knowledge dissemination, best practice and standardisation. The Association propagates open standards like XML, OAI and Dublin Core (qualified).
The DEN has approximately 60 member institutions, among them most of the large heritage institutions of The Netherlands. It provides access to the databases of the member organisations through the portal http://www.cultuurwijzer.nl. The Cultuurwijzer (culture pointer) to the collections uses the Aqua Browser to search for terms in a non-hierarchical, associative way. Databases can also be accessed through subject fields based on the Dublin Core standard. Research is carried out to apply the Art and Architecture Thesaurus in a post-coordinative way, using it as an additional search aid.
Kersen: ‘The Dublin Core has some drawbacks, but it is one of the few international standards available for exchange of information. Mapped to the 5 Ws (who, where, what, when and why), it turns out to be a nice tool for interoperability across the databases of heritage institutions. Of course, we have to accept a certain kind of fuzziness and lack of precision compared with domain-specific access at the institutional level.’
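The interview does not say how the DEN assigns Dublin Core elements to the five Ws, so the crosswalk below is purely an illustrative assumption, as are the function names and the two invented sample records. It is meant only to show the kind of coarse, cross-institutional view that such a mapping makes possible, and the ‘fuzziness’ it brings with it.

```python
# Hypothetical crosswalk from the five Ws to Dublin Core element names.
# The element names are standard Dublin Core; the grouping is an assumption.
FIVE_WS = {
    "who": ["creator", "contributor", "publisher"],
    "where": ["coverage"],
    "what": ["title", "subject", "type"],
    "when": ["date"],
    "why": ["description"],
}


def to_five_ws(record):
    """Group a flat Dublin Core record (element -> list of values) under the five Ws."""
    view = {}
    for w, elements in FIVE_WS.items():
        values = []
        for element in elements:
            values.extend(record.get(element, []))
        view[w] = values
    return view


def search(records, w, term):
    """Return the records whose values under the given W contain the search term."""
    term = term.lower()
    return [r for r in records
            if any(term in value.lower() for value in to_five_ws(r)[w])]


# Invented sample records from two different kinds of institution.
SAMPLE_RECORDS = [
    {"title": ["Harbour view"], "creator": ["Example Painter"],
     "date": ["1650"], "coverage": ["Amsterdam"], "type": ["painting"]},
    {"title": ["Guild minute book"], "publisher": ["Example Archive"],
     "date": ["1702"], "coverage": ["Utrecht"],
     "description": ["Kept to record guild decisions"]},
]

for hit in search(SAMPLE_RECORDS, "where", "utrecht"):
    print(hit["title"][0])
```

A museum record and an archive record can be searched through the same five questions, but only at the level of detail that the shared elements carry, which is the trade-off Kersen describes.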
According to Kersen, a real Semantic Web is still a long way off. ‘We simply do not have the tools yet for a meaningful exchange and representation of information. XML and RDF do not provide the interoperability that is needed. On the other hand, I do not believe in developing a fundamental ontology to give meaning to information on the Net. It looks to me like the 18th-century endeavour to write an encyclopaedia that contains all the knowledge in the world. I am afraid it does not work that way. A lot of knowledge, even scientific knowledge, cannot be described in a logical way. Especially in the arts, a lot of “knowledge” is the result of heuristics and associative thinking. Apart from that, there is the practical problem that cultural heritage institutions do not have the money and the staff to describe their collections anew in a way that fits the ontology. That is just too much work.’
Kersen thinks the Semantic Web will grow gradually from the grass roots level onward. Within the Dutch Digital Heritage Association she can point at several initiatives. Considered on their own they might not seem like much, i.e. in relation to the gargantuan task of developing a Semantic Web, but when they are combined a certain pattern begins to emerge.
There is, for instance, a project, SitecheckerRDF, which will look into ways in which RDF can be used for describing content for Web-based delivery. Furthermore, several standardisation projects are running that enable the participating institutions to develop the description of their specific knowledge domain, e.g. graphic domain, religion, art history. The formal and semantic mapping schemes used in these projects will include Dublin Core, Encoded Archival Description (EAD), IMS Learning Resources Meta-data Specification, Art & Architecture Thesaurus, as well as the CIDOC reference model. Kersen: ‘At the moment most of the reference terms are developed at the level of institutions, which means their use is limited to a certain domain. Or, to put it another way, every domain is developing its own dialect. Maybe the development of a combined reference scheme will be a step towards a Semantic Web.’
Another important project is the development of a scheme for description at the collection level, in order to offer a clearer and more hierarchical access to heritage collections. This project has its roots in the Dutch project for collection level description, MUSIP (Museum Inventarisation Project). The description scheme will be broadened to make it available for other heritage institutions as well, for example archives.
The description and results of these projects and the programme lines will be made available on the Web site of the Vereniging DEN, http://www.den.nl. As Cultuurwijzer is used as a proof of concept, the results will be directly accessible at http://www.cultuurwijzer.nl. (Kersen kindly invites interested parties to put questions directly to their organisation.)
Projects are always carried out in co-operation with the member organisations. Kersen: ‘We first try it ourselves until we are sure it works, a “proof of concept” you could say. These tests are overseen by a small group of automation experts working for our member organisations. The next step is to test the method on a larger scale with some of our member organisations. A larger working group oversees these tests. Only then is the method released to our member organisations. The advantage is that smaller member organisations can ride on the experience of the larger ones. By taking a step-by-step approach we also enhance the level of commitment. You could say we are providing some order in the information chaos that exists on the Internet. A few small steps along the long road towards the Semantic Web.’
Vereniging Digitaal Erfgoed Nederland: http://www.den.nl
Cultuurwijzer: http://www.cultuurwijzer.nl
DIGICULT’S EXPERT 13 TANGLE WITH THE SEMANTIC WEB
By Michael Steemson
[Image: Genesis – The Creation: Birds and Fishes]
It was really very kind of the Moderator. He asked the cultural heritage experts: ‘What is the message that we should give our scientific writer, who will do a write-up of this meeting for the Thematic Issue?’ Their inclinations had not always been clear. But the question focused minds, and they were certain now.
‘I would put my money on the Semantic Web,’ said two of them, not quite in unison. ‘The Semantic Web is a direction, it is like North. You go North but you never arrive and say “here it is”. This is the Semantic Web,’ said another.
The course of the debate at the Darmstadt DigiCULT Forum had not always been so direct. It had started with the Position Paper’s dismaying thought that ‘the limited understanding of information processing in the heritage sector almost makes the Semantic Web an impossibility to apply’.
It had touched on the semantics of Simeon poetry, art works of the biblical seductress Salome, weather forecasts for the northern English city of York and the revolutionary theories of 16th-century Italian astronomer Galileo. But the experts agreed, finally, that the cultural heritage sector needed the Semantic Web and that a good deal of education and guidance would be required to make it appreciate that need.
The experts numbered 13, lucky for some, at this, the third DigiCULT Forum of the European Union’s technology watchdog for cultural and scientific heritage institutions. In the previous 12 months, other forum groups had discussed authenticity and integrity for digitisation programmes and, later, digital asset management systems. Now the Darmstadt 13 – historians, language and information technology scientists, academics and publishers – were looking even further down the information autobahn, to the vision of WWW inventor, Englishman Tim Berners-Lee, who sees a new kind of automated Web that learns and understands each user’s particular requirements and delivers complete, reliable, tested information sets.
In a co-authored May 2001 Scientific American article¹, Mr Berners-Lee imagines a family facing the horrors of re-scheduling its lives around a mother’s unexpected illness. The sons and daughters rely on Semantic Web ‘agents’, small executable Web files, to search online medical records, hospital bed lists, transport timetables, doctors’ appointment books, road condition reports and home diaries to find treatment, plan travel and re-arrange personal engagements to fit the emergency.
The vision requires huge world-wide investment in time and effort, creating countless ‘ontologies’ containing, perhaps, XML (eXtensible Mark-up Language) and RDF (Resource Description Framework) data to which the electronic ‘agents’ could refer for understanding before applying to specially formatted Web pages for the information.
The Berners-Lee et al. dazzling forecast is: ‘The Semantic Web will enable machines to comprehend semantic documents and data, not human speech and writings. Properly designed, the Semantic Web can assist the evolution of human knowledge as a whole.’
THE DAZZLING PROSPECTS
Dazzling it is, and the Darmstadt 13 were attracted. But they were not blinded. Moderator Dr Seamus Ross, the Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII), suggested that, while the current WWW content got a lot of ‘bang’ for its development dollars, the Semantic Web needed huge, expensive content before it could work well.
Application of the Berners-Lee ideas to cultural heritage use was a long way off, he thought, and wondered: ‘Is there enough benefit from the Semantic Web in the near term to make it a realisable dream 50 years down the road?’
Italian National Research Council Applied Ontologies Laboratory director Nicola Guarino has been working on the subject for 12 years and he knows the difficulties. He said: ‘This is the ideal view which Tim Berners-Lee has: machines which work for you, your proxy which works for you, performing these dynamic connections for the Web which preserve meaning. It is pretty ambitious, but this is his idea. I would be happier if, rather than using an automatic proxy, we could just let people establish these dynamic connections using their brain and the Web. This is already something that is not easily done.’
Austria’s Wernher Behrendt had encountered other snags. At Salzburg Research, the secretariat for the DigiCULT Forums, he co-ordinates another European Commission IST project, CULTOS (Cultural Units of Learning – Tools and Services). He conceded: ‘There is a 50-year research vision behind the issue of the Semantic Web,’ and went on: ‘but there are incremental steps that, with good utility, can be built in a reasonable time. One of the intellectual challenges is to break the vision into these manageable steps.’
LANGUAGE REPRESENTATION HITCHES
A CULTOS group had, Behrendt explained, taken one of these incremental steps and built an ontology² for digitised works of art. It had encountered problems with language representation, like: ‘Are there tools to support knowledge representation language? Are the users then actually able to work usefully with that? Can we incorporate the multimedia authoring component, where people who have not built the ontology themselves should then use it to combine multimedia assets with each other?’
Cultural Units of Learning – Tools and Services (CULTOS)
CULTOS is an RTD project co-funded by the European Commission under the Information Society Technologies (IST) Programme, which will run until October 2003. The application domain of CULTOS is intertextual studies in literature and arts. The project is developing a multimedia authoring and presentation environment that allows scholars to make the different relationships between cultural works explicit in a way that approximates contextualisation in interpretative processes. The result of the authoring processes is multimedia objects called ‘intertextual cultural threads’. These are based on ‘EMMOs’, a novel type of structured multimedia object containing expert knowledge, conforming to current and emerging standards such as XML, SMIL (with interactive extensions), MPEG-7 and RDF. http://www.cultos.org
1 T. Berners-Lee, J. Hendler, O. Lassila: The Semantic Web. In: Scientific American, May 2001. http://www.sciam.com/2001/0501issue/0501berners-lee.html
2 Ontology n. 1. Philosophy. The branch of metaphysics that deals with the nature of being. 2. Logic. The set of entities presupposed by a theory. Collins English Dictionary, Third edition, Glasgow 1991.
The Dutch had doubts, too. Dr Janneke van Kersen is an art historian with Digital Heritage Netherlands (Digitaal Erfgoed Nederland, http://www.den.nl), where an XML-based content management system is being combined with a Resource Description Framework (RDF) to join databases from several cultural heritage institutions. She told the experts: ‘I need to be assured that we will be able to build a layered structure that is equally applicable to each knowledge domain. Furthermore, I think that the cultural heritage sector is too much of a niche market to develop this.’
Her countryman, Dr Frank Nack, from the national research institute for mathematics and computer science, CWI (Centrum voor Wiskunde en Informatica, http://www.cwi.nl) in Amsterdam, works with a multimedia and human computer interaction group. His concern: ‘Our group believes in the Semantic Web, but we needed some mechanisms to structure the information so that various groups can work with it. What we came up with was the belief that you can classify the user at a particular time. But that was simply not good enough.’
The group had found that users change their requirements widely and these shifts were invisible to a system. ‘Humans can look at material one day, and the next day they look at the same stuff differently and describe it differently because they are in a different mood,’ he said. He characterised the problem as: ‘Now I would like to see something for my work, and now I want to be entertained. Which means I would like to access the information differently.’
He had one other worry: Webised mixed media. He said: ‘This discussion has been heavily linguistic based, which I can understand because most people do still think of the Web as text driven. The issue of describing various media items that are not text will, I think, very soon become important for the Semantic Web. We had better start thinking about that too.’
THE GALILEO CONUNDRUM
The Institute and Museum of Science History in Florence, Italy, has tried to create an ontology around the works and sciences of its city’s famous son, the revolutionary astronomer, mathematician and physicist Galileo Galilei (1564-1642)³. But it ran into difficulties when it came to the radical changes in theory that he created.
Institute relational database expert Andrea Scotti told the Forum: ‘Galileo’s scientific theory negates another scientific theory. This negating or development of theories was very difficult to represent in the ontology when dealing with the time factor. That is central to historical documentation, but representational time was not part of the process available to us.’
Dr Costis Dallas, the Athens chairman of the European communication and technology group Critical Publics, thought the Florence museum’s project was very ambitious. The time argument was difficult because, ‘of course, it isn’t possible to represent time properly within a relational database’. But there were mechanisms – he mentioned software by the Virginia, US, IT group Telos⁴ – that better represented issues of time.
Italian National Research Council’s Nicola Guarino chipped in: ‘The CIDOC⁵ reference models have partial answers to these questions.’
Dr Dallas went on: ‘I do not believe that the whole exercise is futile, but we found that, in practice, you cannot make a subject language for everybody. It has to be for a community of users. If you provide them with a richer representation, for instance, if they can know that this is a person and this person lived in a place, then users will have a much richer experience.’
3 The museum and Web site are rich resources for the life and work of Galileo: http://galileo.imss.firenze.it
4 Telos Corporation, http://www.telos.com, Ashburn, VA, US.
5 CIDOC: International Committee for Documentation of the International Council of Museums, http://www.willpowerinfo.myby.co.uk/cidoc/CIDOCe (ICOM-CIDOC). Forum for documentation interests of museums and related organisations, one of 25 international committees of the International Council of Museums (ICOM).
CAPITALS AND ACRONYMS
The Forum moved on to a discussion of the science and software behind the Semantic Web. Moderator Seamus Ross started by questioning whether ‘key thinkers’ were ‘missing a fundamental point: that Web pages are dead, that database-driven Web pages are the future, that people are going to stop making Web pages and make databases’.
He recalled that James A. Hendler, co-author with Tim Berners-Lee in the Semantic Web paper and a professor in the Department of Computer Science at the University of Maryland, had recommended semantic representation as part of any Web pages. Dr Ross commented: ‘This notion of the Semantic Web is not going to work with these databases. Is this true or not?’
Bert Degenhart-Drenth the managing director ofNetherlands ADLIB Information Systems(httpwwwnladlibsoftcom) thought it was trueBut it was a practical problem that would be solved
DigiCULT 17
Open Archives Initiative (OAI) Protocol forMetadata HarvestingThe Open Archives Initiative (OAI) is a mainly US-based group of people and organisations that evolvedout of a need to increase access to scholarly publi-cations through interoperable digital repositoriesSupport for the OAIrsquos goals comes from the DigitalLibrary Federation the Coalition for NetworkedInformation and from a NSF Grant One of its majorachievements is an application-independent inter-operability framework based on metadata harvestingthe OAI Protocol for Metadata HarvestingThe OAI Protocol is based on the standard Webprotocols http and XML and employs Dublin Core(unqualified) as metadata standard Heritage organi-sations who have systems that support the OAIProtocol can expose metadata about the content intheir repository ie allow service providers to harvestthe data for services such as search engines In theOAI Protocol the XML schema is used at two levelsto define the format of responses to all OAI Protocolrequests and to define the format of metadata streamsembedded in the GetRecord and ListRecordsresponses In both cases the goal is to provide amechanism for data validationhttpwwwopenarchivesorgThe OAI-PMH version 20 released in June 2002can be found at httpwwwopenarchivesorgOAI20openarchivesprotocolhtmSee also John PerkinsA New Way of MakingCultural Information Resources Visible on the Web
Museums and the Open Archives Initiative Museumsand the Web Conference 2001httpwwwarchimusecommw2001papersperkins
Open Archives Forum Project httpwwwoaforumorgThe Open Archives Forum Project is a two-yearFifth Framework Programme IST accompanyingmeasure that will run until September 2003TheForum is building a Web-based database on OAI-related projects software implementations andservices and supports the information exchangebetween OAI user communitiesTheir surveys provide good insight into the status ofuptake of the OAI in Europe For an overview of theresults see S Dobratz B Matthaei Open ArchivesActivities and Experiences in EuropeAn Overviewby the Open Archives Forum In D-Lib MagazineVol 9 No 1 January 2003 httpwwwdliborgdlibjanuary03dobratz01dobratzhtmlRecently one of their workshops lsquoProviding Accessto Hidden Resourcesrsquo (Lisbon December 2002)targeted the libraries and archives communitiesRequirements standards best practice and solutionsto interoperability problems of these communitieswere analysed and compared with the featuresprovided by the OAI Protocol for MetadataHarvestingThe Tutorial lsquoOAI and OAI-PMH forBeginnersrsquo and other presentations can be found athttpwwwoaforumorgworkshopslisb_programmephp
by projects like the Web Services of the Open Archives Initiative (OAI).
Several members spoke of difficulties created by dynamic Web pages generated from ASP databases. Paul Miller, the UK Interoperability Focus for the University of Bath’s UKOLN (formerly the UK Office for Library Networking) project, said the UK’s Office of the e-Envoy⁶, the agency in charge of Britain’s e-Government programme, drove it with a Web site called Govtalk⁷. ‘One big ASP database,’ he called it. It carried the Government’s interoperability standard framework with a URL of ‘a great big long string, something something dot ASP. That is how this key piece of legislation is referred to, and next week it will be something else.’
Dr van Kersen thought the new Web Ontology Language, with the transposed acronym OWL, would help. Italian ontology researcher Nicola Guarino discussed the European Commission’s On-To-Knowledge project, its RDF tool Sesame⁸ and the Ontology Inference Layer (OIL) language.
Dr Ross described the US Defense Department’s DARPA (Defense Advanced Research Projects Agency) Agent Markup Language (DAML) programme and its DAML+OIL variant. Mr Degenhart-Drenth highlighted the importance of protocols used in Web Services, such as SOAP (Simple Object Access Protocol) and UDDI (Universal Description, Discovery and Integration), and also pointed to a SPECTRUM XML standard for museums that he helped write. Then there was the moviemakers’ audio-visual search standard, MPEG-7, and SMIL (Synchronized Multimedia Integration Language). No one mentioned SHOE (Simple HTML Ontology Extensions), which was surprising as the discussion became more and more alphabetic and upper case.
ONTOLOGY TUTORIAL IN 800 WORDS
Nicola Guarino brought the discussion back on track. The ontology expert delivered a fascinating, impromptu 800-word dissertation on ontology genetics.
Ontologies, he said, started because it was realised that controlled vocabularies, which worked well enough for limited periods, needed something extra to make them really useful. They needed clarification of intended meaning. This could be achieved in much the same way as dictionaries did it, by reference to other, more basic terms. This was, he said, the key point.
‘Ontologies can work if the basic terms are really used in a principled way. There is a hidden assumption here that it is indeed possible to express the meaning in terms of a relatively small set of primitive terms.’
He explained further: ‘There are general terms that have a universal meaning. Take the term “part”, for instance, or “set”. Or take temporal relations: “before”. Suppose I have two different periods, the Renaissance and another period, and suppose I say that this period comes “before” the other one: do you exclude the case whereby the two periods overlap, or not? This is just a matter of stipulation; this is a general term that is not domain specific. I can simply stipulate exactly whether the “before” relationship between the two intervals includes the case of overlapping or not. And I can do that by means of axioms. Once you clarify the meaning of the basic terms (topological relations, mereological⁹ relations, dependence relations, these kinds of things) then you have the basic vocabulary that helps you to introduce more domain-related things. And this is what people are doing in the area of what are called “Foundational Ontologies”, and this is what I am doing. I believe this is the only way to solve the problem of semantic interoperability. So not just controlled vocabularies, but vocabularies that are formally defined in minimal terms.’
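The point about stipulating ‘before’ by axioms can be illustrated with a small Python sketch. The two definitions below are only two of several possible stipulations, and the year ranges are placeholders chosen for illustration, not historical claims; neither the function names nor the dates come from the Forum discussion.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Period:
    name: str
    start: int  # first year of the period (illustrative value)
    end: int    # last year of the period (illustrative value)

def before_strict(a, b):
    """Stipulation 1: a is before b only if a has ended when b starts."""
    return a.end < b.start

def before_allowing_overlap(a, b):
    """Stipulation 2: a is before b if it starts and ends earlier,
    even when the two periods overlap."""
    return a.start < b.start and a.end < b.end

renaissance = Period("Renaissance", 1400, 1600)
other = Period("another period", 1550, 1700)

# The same pair of periods gets different answers under the two axioms.
print(before_strict(renaissance, other))            # False
print(before_allowing_overlap(renaissance, other))  # True
```

Two systems that both record ‘Renaissance before X’ agree on the word but not necessarily on the facts unless they share one of these definitions, which is the interoperability problem the quotation describes.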
Now it was clear, but would it be available to heritage institutions? Dr Ross wanted to know if fundamental ontologies of use to the heritage sector already existed. What would they be? Before any
On-To-Knowledge
On-To-Knowledge is an IST RTD project that was completed in June 2002. The project developed tools and methods for supporting knowledge management in large and distributed organisations. The technical backbone of On-To-Knowledge was the use of ontologies for the various tasks of information integration and mediation. For the project’s many results, see their tools repository, project deliverables and publications at http://www.ontoknowledge.org/pubs.html
See also the On-To-Knowledge book ‘Towards the Semantic Web: Ontology-driven Knowledge Management’, J. Davies, D. Fensel, F. van Harmelen (eds.), John Wiley, December 2002.
6 Office of the e-Envoy: http://www.e-envoy.gov.uk
7 Govtalk: http://www.govtalk.gov.uk
8 Sesame environment: http://www.ontoknowledge.org/tools/factsheetSesame.html
9 Mereology, n. The formal study of the logical properties of the relation of part and whole. Collins English Dictionary, Third edition, Glasgow 1991.
development could begin, basic classifications and methodologies would be required to form a foundation for the work.
Italian online publisher Marco Meli, a manager of the EU’s MESMUSES (Metaphors for Science Museums) project, insisted: ‘You need a clear definition in this particular domain. What are the key words, the terms?’
Mr Guarino answered: ‘The key concepts and the key relations.’
THE ELUSIVE SEMANTIC GRAIL
By now, the Darmstadt 13 were beginning to realise they were not having much luck with their search for the Semantic Grail. Their discussion became a little tetchy.
Someone talked about ‘meta-ontology’ and another growled: ‘The term “meta” is abused.’
‘Outdated technical optimism’ was mentioned.
‘What is your alternative?’ someone else wanted to know.
‘I don’t have one.’
‘Is this the way forward?’
MESMUSES - Metaphors for Science Museums
MESMUSES is an IST RTD project that will run until July 2003. It aims at designing a general method and supporting tools to produce knowledge maps for use in self-learning environments of science museums. In the project, a knowledge map is defined as a set of related concepts and facts that is offered to learners with some guidance or suggestions on possible itineraries that they may follow to explore the knowledge space.
The method and tools developed in MESMUSES are being tested and validated by two large science museums, the Cité des Sciences et de l’Industrie in Paris and the Istituto e Museo di Storia della Scienza in Florence, which provide access to their digital catalogues. Both museums are developing knowledge maps and itineraries on different themes in Biology (Genome) and Physics (Galileo and the laws of motion).
Project Web site: http://cweb.inria.fr/Projects/Mesmuses
See also M. Meli: Knowledge Management: a new challenge for science museums. In: Cultivate Interactive, Issue 9, 7 February 2003, http://www.cultivate-int.org/issue9/mesmuses
‘What part of the problem would that solve?’
‘We will know when we have tried that out.’
‘Here we are approaching a scary field.’
Civility and peace were restored as Frank Nack, the CWI Netherlands scientist, introduced the thought: ‘There are ontologies for art and they are very old and well crafted. There are very clear rules about why they did what they did, because they have worked on them for a thousand years. What you could suggest is that we strip down to the basics for one field, say art, and apply it to all the other fields we have in cultural heritage: architecture, film, whatever, all working with very different substances.’
Seamus Ross added: ‘So we need one fundamental ontology on which we can build all the others.’
THE LUCK CHANGES
The luck of the 13 was beginning to change. The Athenian heritage informatics expert Dr Dallas described work among his company clients on developing ‘an upper ontology’. All the issues the Forum had been discussing (what to do about time, basic concepts, process, agents and so on) were being examined. They were beginning to develop ‘something very much like a thesaurus’, with term expansion that created sub-categories of relationships. He called it ‘generic layering’, a process that could identify the ‘generic grammar’ of relationships within a specific domain, art history for example.
‘This is useful,’ he said. ‘This way we can create Web systems that present an association of content for users that is meaningful to them. Let’s say a “cultural” meaning.’
Nicola Guarino went further. He believed that the International Council of Museums’ Committee for Documentation (CIDOC) Conceptual Reference Model (CRM) was the ‘best starting point’ for the heritage community. The CIDOC CRM is the result of 10 years’ work by a standards working group. The model is under review for adoption as an International Organization for Standardization (ISO) publication.
Mr Guarino was enthusiastic: ‘I am not biased on this. I am a reviewer on the CHIOS project that supports this proposal for an ISO standard and I am amazed by the fact that this standard is more or less principled. On the one hand, the authors really strive to get these principled things, and on the other hand, they have extensively accounted for existing practice. It is the result of a large community of work. It is not perfect, but it really is a starting point for this community.’
He affirmed Dr Ross’s delighted question: ‘So this is an ontology we can borrow?’
And the Italian expert had more to add: ‘The question before was “how can we be sure that the principled things can really solve our problems?” I do not have a crisp answer, but I do have some evidence that even a tiny result on the foundational side has a high pay-off. So you do not need to solve all the foundational issues.’
The distinctions between an object and its role, or individual and classes of items, were delicate but, once understood, could lead to significant data improvement, he said, adding: ‘Take, for instance, the distinction between object and event. This is only one tiny distinction, but it is so fundamental that once you understand it you can save time in developing your own application ontology. Tiny conceptual progress does have a high pay-off. This is why I believe it is useful.’
There were still one or two doubts, but CULTOS project co-ordinator Wernher Behrendt tidied up with a daring stance: ‘Let me be a heretic for a second. How many of us have an operating system other than Windows? What I am saying is that standardisation often helps. Even if it is not the best standard, it does help get people working together. It is perfectly fair to define a standard for the world now. There will be a lot of discussion, but it will wind down to a few constructs. It is a better method of getting an ontology accepted than having ontologies mushrooming all over the place that must then be integrated.’
The Darmstadt 13 were pleased. They had a model. They had ‘Web Services’ stepping stones to work across. They weren’t going to fall into the trap of insisting just yet that the Semantic Web was important for the heritage sector, but they wanted an education process for unconvinced curators.
They needed someone to make the first ontology move. Seamus Ross suggested asking the J. Paul Getty Trust (http://www.getty.edu). They needed automated tools for testing ontologies. They did not want to be delivered into the entertainment industry, but saw benefit in what University of Florence Associate Professor Franco Niccolucci called ‘cultural entertainment’.
So where would the experts put their money, asked Mr Behrendt: ‘On there not being any benefits in the Semantic Web for the cultural heritage sector, or there being some benefits in building such things, whatever they may be?’
Amsterdamer Frank Nack was in no doubt: ‘It is going to happen. It will probably look very different from how we imagine it right now, but it is going to happen.’ His countryman, ADLIB chief Bert Degenhart-Drenth, thought so too: ‘We have put our money there. All our applications work with XML.’ And Dr van Kersen agreed: ‘I would put my money on the Semantic Web.’
That seemed to make it game, set and match.
CIDOC CRM: ‘The Semantic Glue’
The ‘CIDOC object-oriented Conceptual Reference Model’ (CIDOC CRM) was developed by the ICOM/CIDOC Documentation Standards Group. Since September 2000, the CIDOC CRM has been under development as an ISO standard.
‘The CIDOC CRM is intended to promote a shared understanding of cultural heritage information by providing a common and extensible semantic framework to which any cultural heritage information can be mapped. It is intended to be a common language for domain experts and implementers to formulate requirements for information systems and to serve as a guide for good practice in conceptual modelling. In this way, it can provide the semantic glue needed to mediate between different sources of cultural heritage information, such as that published by museums, libraries and archives.’
http://cidoc.ics.forth.gr/what_is_crm.html
CHIOS - Cultural Heritage Interchange Ontology Standardization project
Since June 2001, the work of the CIDOC CRM Special Interest Group has been supported by CHIOS, a two-year project which receives funding from the Fifth Framework IST Programme. The CHIOS consortium forms an integral part of the CIDOC CRM Special Interest Group which, by organising shared meetings, represents the interests and requirements of the cultural heritage community to the ISO Working Group (ISO/TC46/SC4/WG9).
http://cidoc.ics.forth.gr/chios_iso.html
Annotation & Authoring
‘How to annotate and have fun’, http://annotation.semanticweb.org/faq/faq2 and other papers. See also http://annotation.semanticweb.org/tools. Institute for Applied Informatics and Formal Description Methods, University of Karlsruhe, Germany.
The Annotea Project, a ‘Live Early Adoption and Demonstration (LEAD)’ project of the World Wide Web Consortium (W3C): a collaboration environment with shared annotations. www.w3.org/2001/Annotea
CIDOC CRM
The CIDOC Conceptual Reference Model was an important reference point in the Forum discussion (see also the information box on the CIDOC CRM in the Summary). CIDOC, the Comité International pour la Documentation, is part of the International Council of Museums (ICOM). Its Web site can be found at http://cidoc.ics.forth.gr
A report on CIDOC’s work on the CRM is provided in ‘The CIDOC Conceptual Reference Model: A Standard for Communicating Cultural Contents’ by Nick Crofts, Martin Doerr and Tony Gill, in Cultivate Interactive, Issue 9, February 2003, http://www.cultivate-int.org/issue9/chios
See also their tutorials at http://cidoc.ics.forth.gr/tutorials.html
DAML - DARPA Agent Markup Language
The Defense Advanced Research Projects Agency (DARPA) is the central research and development organisation for the US Department of Defense. Its DAML programme is developing a language and tools to facilitate Semantic Web concepts. http://www.daml.org
‘Why Use DAML?’, Adam Pease, white paper, Teknowledge, 10 April 2002, http://www.daml.org/2002/04/why.html
DAML+OIL
DAML+OIL is a product of the DARPA Joint United States/European Union ad hoc Agent Markup Language Committee. The committee created a language with the best features of SHOE, DAML, OIL and several other markup approaches. It is a Web ontology language (latest release March 2001) expected to provide a basis for future Web standards for ontologies. See http://www.daml.org/2001/03/daml+oil-index.html
Genesis - The Creation Stars and Fishes
SEMANTIC WEB TERMS AND READING LIST A-X
Compiled by Guntram Geser and Michael Steemson
In the Summary we have provided information boxes on projects, but not on the many Semantic Web standards, technologies, etc. mentioned in the Forum discussion. The projects included are only a small fraction of many ongoing activities and are related mainly to the cultural and scientific heritage community. Links to many important Semantic Web development projects can be found at their community portal: http://www.semanticweb.org
The following guide points to resources and readings on terms mentioned in the Forum Summary. It is not intended to provide a comprehensive list of Semantic Web materials. Rather, it represents different entry points and levels to this topic.
OIL - Ontology Inference Layer
OIL is a proposal for a Web-based representation and inference layer for ontologies. It uses a layered approach to defining a standard ontology language. Each layer adds functionality and complexity to the previous layer. This is done such that machines that can only process a lower layer can still partially understand high-level ontologies. See http://www.ontoknowledge.org/oil
A white paper on OIL functions: ‘An informal description of Standard OIL and Instance OIL’, by the On-To-Knowledge group, led by the Department of Computer Science, University of Manchester, UK, 28 November 2000, www.ontoknowledge.org/oil/downl/oil-whitepaper.pdf
Ontologies
Ontology research: Laboratory for Applied Ontology (LOA), Institute of Cognitive Sciences and Technology (ISTC), http://www.ladseb.pd.cnr.it/infor/ontology/ontology.html
Dieter Fensel: Ontologies: A Silver Bullet for Knowledge Management and Electronic Commerce. New York: Springer, 2001.
‘Ontology Infrastructure for the Semantic Web’, includes detailed subject bibliography. WonderWeb Project, Department of Computer Science, Victoria University of Manchester, UK, http://wonderweb.semanticweb.org/deliverables/documents/D15.pdf
Standard Upper Ontology (SUO): An upper ontology for data interoperability, information search and retrieval, automated inferencing and natural language processing. IEEE Standard Upper Ontology (SUO) Working Group, http://suo.ieee.org
OWL - Web Ontology Language
The Web Ontology Language is a semantic markup language for publishing and sharing ontologies on the World Wide Web. OWL is developed as a vocabulary extension of the Resource Description Framework (RDF) and is derived from the DAML+OIL Web Ontology Language. For the development of this language see the documents of the Web Ontology (WebOnt) Working Group, http://www.w3.org/2001/sw/WebOnt
See also OWL Web Ontology Language Reference (W3C Working Draft, 31 March 2003) at http://www.w3.org/TR/owl-ref and OWL Web Ontology Language Guide (W3C Working Draft, 31 March 2003), http://www.w3.org/TR/owl-guide
For an understanding of the goals, requirements and usage scenarios for a Web ontology language see ‘Web Ontology Language (OWL) Use Cases and Requirements’ (W3C Working Draft, 31 March 2003), http://www.w3.org/TR/webont-req
RDF - Resource Description Framework
See Cultural Heritage Semantic Web Example & Primer, pp. 32-34.
Semantic Web
‘The Semantic Web’, Tim Berners-Lee with James Hendler and Ora Lassila, Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html
‘Enhanced Science and the Semantic Web’, J.A. Hendler, in Science magazine, Volume 299, Number 5606, 24 January 2003, pp. 520-521.
‘Peer-to-Peer: The Infrastructure for the Semantic Web’, Stanford University. The Semantic Web as the next evolutionary step of the Internet. http://p2p.semanticweb.org
The Semantic Web Community Portal, SemanticWeb.org, is currently operated by three research groups: the Onto-Agents and Scalable Knowledge Composition (SKC) Research Group at Stanford University, the Ontobroker-Group at the University of Karlsruhe, Germany, and the Protégé Research Group at Stanford University. http://www.semanticweb.org
The W3C Semantic Web Activity Statement explains the consortium’s plans in the areas of enabling standards (driven by the RDF Core and Web Ontology Working Groups), education and outreach (RDF Interest Group), as well as co-ordination and advanced development. http://www.w3.org/2001/sw/Activity
See also Kim Veltman’s warning of what he sees to be too narrow a definition of the Semantic Web, one that will not allow the historical dimension, the richness of cultural expression, the unique and the diversity of interpretations to be adequately dealt with. Cf. K. Veltman: Challenges for a Semantic Web (July 2002), http://www.cultivate-int.org/issue7/semanticweb
SHOE - Simple HTML Ontology Extensions
SHOE was one of the first ontology-based markup languages developed for use on the World Wide Web. It is a small extension to HTML that allows Web page authors to annotate their Web documents with machine-readable knowledge. See http://www.cs.umd.edu/projects/plus/SHOE
SMIL - Multimedia on the Web
The Synchronized Multimedia Integration Language (SMIL, pronounced ‘smile’) enables authoring of interactive audiovisual presentations. SMIL is typically used for ‘rich media’/multimedia presentations which integrate streaming audio and video with images, text or any other media type. SMIL is an HTML-like language and may be written using a simple text editor. W3C Synchronized Multimedia: http://www.w3.org/AudioVideo
SPECTRUM-XML DTD
SPECTRUM (Standard Procedures for Collections Recording Used in Museums) was created by the mda (http://www.mda.org.uk). It is a guide to good practice for museum documentation that describes procedures for documenting objects and the processes they undergo, as well as the information that needs to be recorded to support the procedures. For SPECTRUM, an XML Document Type Definition has been produced which serves as a system-neutral interchange format for museum data.
‘SPECTRUM: The UK Museum Documentation Standard’ is available in its second edition, see http://www.mda.org.uk/spectrum.htm
For a description of the creation, structure and deployment of the SPECTRUM-XML DTD see Bert Degenhart-Drenth: Building on the mda SPECTRUM-XML DTD for Collections Management Data Interchange. Museums and the Web Conference 2001, http://www.archimuse.com/mw2001/papers/degenhart/degenhart.html
Web Services
Although the concept of Web Services is not a new one, the definition of Web Services standards allows the wider use of these services. Currently, three main protocols are used in the context of Web Services: UDDI (Universal Description, Discovery and Integration), a registry system to find resources and Web Services; WSDL (Web Services Description Language), an interface description language; and SOAP (Simple Object Access Protocol), the communication protocol for Web Services. All three protocols are based on XML.
For a description of the many ways in which XML can enhance Web Services see http://www.w3.org/2002/ws/Activity
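Because SOAP messages are ordinary XML, a request can be assembled with any XML library. The Python sketch below builds a SOAP 1.1 envelope for an imaginary collection search. The service namespace, the SearchObjects operation and the keyword element are hypothetical names invented for this example; only the envelope namespace is the one defined by SOAP 1.1.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "http://museum.example.org/search"          # hypothetical service namespace

def build_search_request(keyword):
    """Return a SOAP envelope (as text) asking a fictitious service
    to search its collection for the given keyword."""
    ET.register_namespace("soap", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    request = ET.SubElement(body, f"{{{SERVICE_NS}}}SearchObjects")
    ET.SubElement(request, f"{{{SERVICE_NS}}}keyword").text = keyword
    return ET.tostring(envelope, encoding="unicode")

print(build_search_request("textiles"))
```

In a full Web Services set-up, the operation and element names would be read from the service’s WSDL description, and the service itself might be located through a UDDI registry.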
World Wide Web
Looking back as well as into the future: ‘Weaving the Web: The Original Design and Ultimate Destiny of the World Wide Web by Its Inventor’, Tim Berners-Lee with Mark Fischetti, Harper San Francisco, 1999.
XML - eXtensible Markup Language
See Cultural Heritage Semantic Web Example & Primer, pp. 27-30.
For the work done at W3C within the XML activity see XML Working Groups, http://www.w3.org/XML; the XML specifications can be found at http://www.w3.org/XML/Core
‘The bane of my existence is doing things that I know the computer could do for me.’ - Dan Connolly, The XML Revolution, October 1998, http://www.nature.com/nature/webmatters/xml
‘The Semantic Web, as it is advocated by people like Tim Berners-Lee and James Hendler, does not take enough advantage of the experience built up in knowledge engineering and conceptual modelling. There is this anarchistic idea of the Web as a place where everyone can do his or her own thing. I have no problem with that; a lot of people are able to find what they want on the Web. But if you want real interoperability, with search engines that can grasp the intended meaning of information, that approach falls short. To create a real Semantic Web we have to develop and use well-founded generic ontologies based on linguistics and logics.’
Nicola Guarino has clear views on the Semantic Web and its development. He is a senior researcher at the Institute for Cognitive Sciences and Technologies in Italy, where he leads the Laboratory for Applied Ontology. Since 1991, he has played an active role in the Artificial Intelligence community in promoting the interdisciplinary study of ontological foundations of knowledge engineering and conceptual modelling. Guarino: ‘In our Laboratory the focus is on content and not so much on representation. The use of ontologies is unavoidable when referring to content. People do it implicitly all the time when they are communicating and trying to understand each other. If we want machines to understand each other, in other words real interoperability, we need to make these ontologies explicit in an unambiguous way.’
An ontology is a hierarchical description of the relations between concepts in a certain domain, plus an unambiguous description of the concepts themselves. As they are created for a certain domain, ontologies often fail to be interoperable because of the ambiguity that results from the use of the same terms for different concepts (and vice versa) between different domains. The term ‘net’, for instance, has quite a different meaning for Web designers and fishermen. That is why there is a need for well-founded generic ontologies. An example of a generic term is ‘part’, which can have different meanings both within a domain and between domains. For instance, the violist plays a part in the orchestra. His finger is part of him. Can his finger be part of the orchestra? According to Guarino, this is a genuine ontological problem that can only be solved by giving an unambiguous meaning to the term ‘part’.
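The violist example can be played through in a few lines of Python. The sketch assumes one possible way of disambiguating ‘part’, namely splitting it into a ‘component of’ and a ‘member of’ relation; this split and the relation names are an illustration chosen here, not Guarino’s own formalisation.

```python
# Two deliberately separate relations, stored as (part, whole) pairs.
COMPONENT_OF = {("finger", "violist")}   # physical part of a body
MEMBER_OF = {("violist", "orchestra")}   # member of a group

def transitive_closure(pairs):
    """Add (a, d) whenever (a, b) and (b, d) are present, until stable."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

# One undifferentiated 'part of' relation yields an odd conclusion.
naive_part_of = transitive_closure(COMPONENT_OF | MEMBER_OF)
print(("finger", "orchestra") in naive_part_of)       # True

# Keeping the two senses apart blocks the inference.
typed_component_of = transitive_closure(COMPONENT_OF)
print(("finger", "orchestra") in typed_component_of)  # False
```

A machine that merges both senses into one transitive relation will conclude that the finger is part of the orchestra; whether that is wanted has to be stipulated explicitly, which is the task of a generic ontology.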
Another example cited by Guarino is the term ‘in’. What exactly are you describing when you say the spoon is in the cup? Does it mean that the spoon is totally embedded in the cup, or is it only partly in the cup? Guarino: ‘These examples seem trivial, but if you want real interoperability between different knowledge domains, you will have to prevent the problems that come with the ambiguity of day-to-day language.’
In this respect, Guarino thinks it is a drawback that computer science curricula scarcely ever contain an introduction to ontological foundations of conceptual modelling. ‘Students learn all about Java, HTML and C++, and name all the other languages, and they also learn how to use these. But when they graduate, they hardly know a thing about formal ontology. I really think people should know more about the work on ontology that has been done in philosophy. It is certainly not much harder to acquire than, say, studying differential equations or learning how to use Java.’
It seems as if it is an enormous job to develop well-founded generic ontologies, but it is not as enormous a task as it appears. Guarino: ‘I would say that a few dozen would get you on the way nicely. But you have to take the fundamental route. At the moment, development of the Semantic Web is driven by the need for short-term results. Hence interoperability is realised by putting the right tags on the information. That is not what I call semantics; that is syntax. XML and RDF are very useful for this, but they fall short when you want to create a real Semantic Web.’
Laboratory of Applied Ontology: http://ontology.ip.rm.cnr.it
SEMANTIC WEB SHOULD BE BASED ON WELL-FOUNDED ONTOLOGIES
AN INTERVIEW WITH NICOLA GUARINO, LABORATORY OF APPLIED ONTOLOGY, ITALY
By Joost van Kasteren
The Semantic Web, according to a statement of its well-known advocate Tim Berners-Lee, is a vision of ‘a distributed machine which should function so as to perform socially useful tasks’.¹ This machine should allow intelligent software agents to understand semantic relationships between Web resources in order to seek relevant information and perform transactions for humans.
Contrasted with the existing human-readable Web, the Semantic Web is envisaged as a Web of machine-readable data that will be based on ‘languages for expressing information in a machine processable form’.² Key to an understanding of the Semantic Web, therefore, is how these languages function: how information is expressed in order that computers can automatically process Web sources and assist in making the Web more useful for humans. The aim of this chapter is to provide an overview of the Semantic Web concept by describing its general architecture, i.e. the interplay of its languages.
The chapter has two interrelated parts. Part 1 describes a Finnish project that strives to build the foundations for the “Finnish Museums on the Semantic Web” (FMS), a future semantic museum portal. This part consists of the information boxes on the following pages, which briefly describe the necessary elements and steps in the set-up of the FMS system. It is recommended to start by reading this description (see also graphic 3 on page 36, which provides an overview of the set-up of the FMS system). It should be helpful in gaining a general understanding of how semantic interoperability of, and new ways of interacting with, semantically marked-up cultural heritage information can be realised.
Part 2, the texts below the information boxes, is a primer that explains terms used in part 1 which represent core elements of the Semantic Web architecture, as well as providing illustrative examples. The explanations are not intended to give in-depth definitions of these elements; such definitions are provided in the relevant W3C specifications. The examples have been kept as simple as possible, but build on each other. In this way, we will develop a (fictitious) Website, http://www.m-i.org, that provides semantically enhanced access to such marvellous medieval images as the ones we have used to illustrate this Thematic Issue.
How to Make Collection Metadata of Museums Semantically Interoperable on the Web - The “Finnish Museums on the Semantic Web” (FMS)
The Semantic Web concept is visionary, and there are dedicated people, also in the heritage sector, who are trying to make it a reality. In our example, a group of researchers and technology developers who work at the University of Helsinki and the Helsinki Institute for Information Technology are translating the Semantic Web vision for a future semantic museum portal.
The group’s two-year project will run until spring 2004 and is being carried out in co-operation with, and with funding from, major organisations including
A CULTURAL HERITAGE SEMANTIC WEB EXAMPLE & PRIMER
By Guntram Geser
1 Tim Berners-Lee: Interpretation and Semantics on the Semantic Web (1998), http://www.w3.org/DesignIssues/Interpretation.html
2 Tim Berners-Lee: Semantic Web Road Map (1998), http://www.w3.org/DesignIssues/Semantic.html
3 Robert DuCharme: http://lists.xml.org/archives/xml-dev/200211/msg00190.html
Espoo City Museum, Helsinki University Museum, National Board of Antiquities, Nokia, TietoEnator and the National Technology Agency (TEKES).
The major goals of the project are to make collection metadata, which stem from heterogeneous databases, semantically interoperable on the Web, and to provide facilities for semantic browsing and searching in the combined knowledge base of the participating museums.
The project’s vision is called “Finnish Museums on the Semantic Web” (FMS) and its architecture allows for all Finnish museums to join in. However, in an approach of starting small but ambitious, the project is at present using the collection databases of two museums, the Espoo City Museum and the National Museum of Finland. Furthermore, the implementation is currently restricted to only one part of the collections: textiles.
In order to reach the FMS system’s goal of making the museums’ metadata semantically interoperable on the Web, the data must be harmonised on the syntactic and semantic level. For this harmonisation, the eXtensible Markup Language (XML) and the Resource Description Framework (RDF) are being used, of which RDF is the key language for achieving semantic interoperability of the heterogeneous sets of metadata.
RDF and Metadata - A Natural Fit
An observer of the diffusion of the Resource Description Framework (RDF) into various domains, Robert DuCharme, has commented: ‘I still find it a little ironic that while RDF has gotten so much publicity as a technology for warm and fuzzy AI (Artificial Intelligence) pie-in-the-sky technology, it’s gotten most of its traction in the mundane world of metadata.’³
Yet, given the importance of metadata for the Semantic Web vision in general, it does not come as a surprise that metadata of key information communities belong to the first of RDF’s intended uses. RDF seems to gain momentum in particular among the library and other communities that use Dublin Core.
The current W3C RDF Primer (Working Draft, 23 January 2003), edited by Frank Manola and Eric Miller, labels RDF as ‘an ideal representation for Dublin Core information’ and describes Dublin Core as one of their ‘RDF in the field’ examples. (Cf. http://www.w3.org/TR/rdf-primer)
At the Dublin Core Metadata Initiative (DCMI)lsquoExpressing Simple Dublin Core in RDFXMLrsquo wasannounced as a DCMI Recommendation in October2002 lsquothe first in a series of recommendations forencoding Dublin Core metadata using mainstreamWeb technologiesrsquo ie XMLRDFXHTMLlsquoExpressing Qualified Dublin Core in RDFXMLrsquois currently a Proposed Recommendation Cfhttpdublincoreorggroupsarchitecture
Syntactic Transformation 1: Creating the XML Documents
In the FMS system the eXtensible Markup Language (XML) is used as the data transfer format. This transfer format enables the system to make use of the data originally stored in the museums' heterogeneous collection databases. Therefore, each museum participating in the FMS initiative provides the relevant collection data as an XML document repository.
In a process of syntactic harmonisation, the data from a museum's collection database are retrieved and transformed to an XML format conforming to the XML Schema of the FMS initiative.
The data to be published are read from the database through a 'view', which helps create the XML format. The view is a queryable interface: a virtual table that results from an SQL query which may join multiple tables of the database. Through the view the data are queried so that the rows of the tables are grouped by collection item. For each item the set of rows is combined into a single XML document.
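The rows-to-XML step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the FMS implementation: the in-memory SQLite table `item`, the helper names `demo_connection` and `rows_to_xml`, and the simplification to one view row per item are all hypothetical. The view name ITEM_VIEW, its columns and the sample values follow the example used in this primer.

```python
import sqlite3
import xml.etree.ElementTree as ET

MI = "http://www.m-i.org/images"  # namespace of the (fictitious) example organisation
COLUMNS = ["type", "subject", "iconclass", "creator", "manuscript", "place", "year"]


def demo_connection():
    """Create a hypothetical in-memory database with one item and a view on it."""
    connection = sqlite3.connect(":memory:")
    connection.execute(
        "CREATE TABLE item (item_id TEXT, type TEXT, subject TEXT, iconclass TEXT, "
        "creator TEXT, manuscript TEXT, place TEXT, year TEXT)"
    )
    connection.execute(
        "INSERT INTO item VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
        ("image5kb78d38i", "Column Miniature", "Eve emerges from Adam's body",
         "71A3421", "Alexander Master", "Historic Bible", "Utrecht", "circa 1430"),
    )
    # The view is the queryable interface; a real one might join several tables.
    connection.execute("CREATE VIEW ITEM_VIEW AS SELECT * FROM item")
    return connection


def rows_to_xml(connection):
    """Return a dict mapping each item_id to one XML document (as a string)."""
    ET.register_namespace("mi", MI)
    query = "SELECT item_id, " + ", ".join(COLUMNS) + " FROM ITEM_VIEW ORDER BY item_id"
    documents = {}
    for item_id, *values in connection.execute(query):
        root = ET.Element(f"{{{MI}}}image", {"image_id": item_id})
        for name, value in zip(COLUMNS, values):
            ET.SubElement(root, f"{{{MI}}}{name}").text = value
        documents[item_id] = ET.tostring(root, encoding="unicode")
    return documents
```

Calling `rows_to_xml(demo_connection())` yields one document per item, keyed by its identifier. A view that returns several rows per item would need an extra grouping step before the elements are written.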
XML
XML is a markup language for describing data. It is a language created to allow anyone to design the structure of their own documents. An XML document contains text that consists of markup in the form of tags and plain text between them, the latter being just pure information (for example, <creator>Alexander Master</creator>). XML tags are not predefined; everyone can define his or her own tags.
FMS Documents: A short presentation is provided in Eero Hyvönen et al., Cultural Semantic Interoperability on the Web: Case Finnish Museums Online, http://iswc2002.semanticweb.org/posters/hyvonen_a4.pdf; for detailed descriptions see Vilho Raatikka, Eero Hyvönen, Ontology-based Semantic Metadata Validation, and Hyvönen, Eero et al., Semantic Interoperability on the Web: Case Finnish Museums Online. Both texts can be found in: Towards the Semantic Web and Web Services. Proceedings of the XML Finland 2002 Conference, http://www.cs.helsinki.fi/u/eahyvone/xmlfinland2002/Proceedings/XML2002-final.pdf
ITEM_VIEW (columns of the database view): item_id, type, subject, iconclass, creator, manuscript, place, year
XML shares the syntax and bracketed tags of the well-known HyperText Markup Language (HTML), but XML serves a different goal. While HTML is used to define the layout of pages on the WWW, XML is used to define the content of documents, for example to specify that an area of text is the name of a creator.
XML allows for creating markup (e.g. <creator>) that seems to carry some semantics. However, for a computer a tag like <creator> carries as much semantics as a tag like <H1>. A computer simply does not know what a creator is and how the concept creator is related to other concepts (e.g. manuscript). For an XML processor, <H1> and <creator> or <manuscript> are all equally (and totally) meaningless. XML is all about describing data; on its own it does not do anything. There needs to be a processing program that uses the markup to interpret the various elements.
The graphic below illustrates the database rows to XML process as described in the info box. It provides a very simple example of an XML document that describes some data for one of the medieval column miniatures from the Koninklijke Bibliotheek, The Hague, which we were permitted to use for illustrating this Thematic Issue. It includes the Iconclass classification for this image: 71A3421 Eve emerges from Adam's body (for the hierarchical path of this classification, see the section on ontologies).
Short explanations for the XML document (image5kb78d38i.xml) shown in the graphic below:
A well-formed XML document is one that conforms to the XML syntax rules, of which we would like to highlight the following:
(1) The document should begin with the XML declaration, which defines the XML version and the character encoding used in the document. In the example below we use <?xml version="1.0" encoding="ISO-8859-1"?>, i.e. the document conforms to the 1.0 specification of XML and uses the ISO-8859-1 (Latin-1/West European) character set.
(2a, 10) The XML document must contain a single tag pair to define a root element, in our example <mi:image> </mi:image>.
Database rows (item data from the database, grouped by item):
"image5kb78d38i", "Column Miniature", "Eve emerges from Adam's Body", "71A3421", "Alexander Master", "Historic Bible", "Utrecht", "circa 1430"
Rows-to-XML process
XML document (image5kb78d38i.xml):
(1) <?xml version="1.0" encoding="ISO-8859-1"?>
(2) <mi:image xmlns:mi="http://www.m-i.org/images"
        image_id="image5kb78d38i">
(3)   <mi:type>Column Miniature</mi:type>
(4)   <mi:subject>Eve emerges from Adam's body</mi:subject>
(5)   <mi:iconclass>71A3421</mi:iconclass>
(6)   <mi:creator>Alexander Master</mi:creator>
(7)   <mi:manuscript>Historic Bible</mi:manuscript>
(8)   <mi:place>Utrecht</mi:place>
(9)   <mi:year>circa 1430</mi:year>
(10) </mi:image>
Graphic 1: Database rows to XML process
(2b) Namespace: Since element names in XML are not fixed, name conflicts can occur when different documents use the same names to describe different types of elements. To prevent such conflicts, a unique namespace should be defined using a Uniform Resource Identifier (URI). An XML namespace is a collection of names that are used as element types and attribute names (cf. http://www.w3.org/TR/REC-xml-names/). The namespace declared in the start tag of our root element is xmlns:mi="http://www.m-i.org/images".
The namespace prefix mi (for medieval images) functions as a placeholder for the namespace name. It needs to show up in all element tags (e.g. <mi:type> </mi:type>).
(2c) image_id="image5kb78d38i": This is the id attribute, which contains the unique identifier of the data source record.
(3-9) All other elements must be within the root element and can themselves have sub-elements (child elements), which must be properly nested within their parent element. Our elements do not have sub-elements.
Other syntax rules are, for example: all start tags must match their end tags; because XML tags are case sensitive (i.e. the tag <mi:Creator> is different from the tag <mi:creator>), they must also be written with the same case; all elements must have a closing tag; all attribute values must be within quotation marks (e.g. "image5kb78d38i").
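The syntax rules above can be tried out with any XML parser. The following minimal sketch uses the parser from Python's standard library; the helper name `is_well_formed` and the shortened sample documents are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the standard-library parser accepts the text as XML."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


# A shortened version of the example document (one child element only).
GOOD = (
    '<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">'
    "<mi:creator>Alexander Master</mi:creator>"
    "</mi:image>"
)
# End tag written with a different case than the start tag.
BAD_CASE = GOOD.replace("</mi:creator>", "</mi:Creator>")
# Attribute value without quotation marks.
BAD_QUOTES = GOOD.replace('"image5kb78d38i"', "image5kb78d38i")
# Two root elements instead of a single one.
TWO_ROOTS = GOOD + GOOD
```

Only `GOOD` passes the check; each of the other three strings breaks exactly one of the rules listed above and is rejected by the parser.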
Syntactic Transformation 2: The XML Schema
In order to allow for syntactic harmonisation, the XML documents of the museums should conform to the XML Schema of the FMS initiative. Therefore, when the museums create their XML documents, they validate them against the initiative's XML Schema. If the documents are valid, the process can continue to the semantic level.
XML Schema
The XML Schema defines the building blocks of an XML document, including:
| elements and attributes that can appear in a document
| which elements are child elements, as well as their order and number
| whether an element is empty or can include text
| the data types for elements and attributes
| as well as default and fixed values for elements and attributes
XML with an XML Schema is designed to be self-descriptive. One of the greatest strengths of XML Schema is that it allows for data typing. The most common data types are xs:string, xs:decimal, xs:integer, xs:boolean, xs:date and xs:time. In the example below, which is the XML Schema for the XML document (image5kb78d38i.xml) shown in graphic 1, we only use the data type xs:string. This data type is used for values that contain character strings.
(1) <?xml version="1.0"?>
(2a) <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
(2b)     targetNamespace="http://www.m-i.org/images"
(2c)     xmlns="http://www.m-i.org/images"
(2d)     elementFormDefault="qualified">
(3)   <xs:element name="image">
(4)     <xs:complexType>
(5)       <xs:sequence>
(6)         <xs:element name="type" type="xs:string"/>
(7)         <xs:element name="subject" type="xs:string"/>
(8)         <xs:element name="iconclass" type="xs:string"/>
(9)         <xs:element name="creator" type="xs:string"/>
(10)        <xs:element name="manuscript" type="xs:string"/>
(11)        <xs:element name="place" type="xs:string"/>
(12)        <xs:element name="year" type="xs:string"/>
(13)      </xs:sequence>
(14)      <xs:attribute name="image_id" type="xs:string" use="required"/>
(15)    </xs:complexType>
(16)  </xs:element>
(17) </xs:schema>
Short explanations:
(1) The XML declaration, which states that the document conforms to the 1.0 specification of XML.
(2a) Determines that the elements and data types that are used to construct the schema come from the W3C's XML Schema namespace. Consequently, each of the elements and data types in the schema has the prefix xs:, which identifies them as belonging to the vocabulary of the XML Schema language rather than the vocabulary of our (fictitious) organisation M-i.org.
(2b) Indicates that the elements defined by this schema come from our http://www.m-i.org/images namespace.
(2c) Is our default namespace.
(2d) Demands that any elements used by an XML document which were declared in this schema must be namespace qualified.
(3, 16) The parent element for the data typing of the image descriptions we provide at http://www.m-i.org/images. It is (4) defined as an xs:complexType, i.e. it contains child elements (6-12), which are (5, 13) surrounded by an xs:sequence element that defines an ordered sequence of these elements.
(6-12) The child elements, which in our example are simple types because they do not contain other elements. They define the child elements of our XML documents, e.g. "creator", to be of the data type xs:string.
(14) Furthermore, the Schema determines that for the element xs:element name="image" there is a required attribute image_id of the data type xs:string (for example, image5kb78d38i).
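To make the effect of these declarations tangible, the sketch below hand-codes the three checks that the example schema expresses: the root element, the required image_id attribute, and the ordered sequence of child elements. It is a simplified stand-in written for this illustration, not a real XML Schema validator (Python's standard library does not include one), and the names `SEQUENCE`, `SAMPLE` and `check_image_document` are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = "http://www.m-i.org/images"
# Child elements in the order declared inside xs:sequence.
SEQUENCE = ["type", "subject", "iconclass", "creator", "manuscript", "place", "year"]

SAMPLE = """<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">
  <mi:type>Column Miniature</mi:type>
  <mi:subject>Eve emerges from Adam's body</mi:subject>
  <mi:iconclass>71A3421</mi:iconclass>
  <mi:creator>Alexander Master</mi:creator>
  <mi:manuscript>Historic Bible</mi:manuscript>
  <mi:place>Utrecht</mi:place>
  <mi:year>circa 1430</mi:year>
</mi:image>"""


def check_image_document(text):
    """Return a list of problems; an empty list means all checks passed."""
    try:
        root = ET.fromstring(text)
    except ET.ParseError as err:
        return [f"not well-formed: {err}"]
    problems = []
    if root.tag != f"{{{NS}}}image":
        problems.append("root element must be image in the m-i namespace")
    if "image_id" not in root.attrib:
        problems.append("required attribute image_id is missing")
    found = [child.tag for child in root]
    wanted = [f"{{{NS}}}{name}" for name in SEQUENCE]
    if found != wanted:
        problems.append("child elements do not match the declared sequence")
    return problems
```

For `SAMPLE` the function returns an empty list. Removing the image_id attribute or one of the child elements produces the corresponding message. A production system would rely on a schema-aware validator instead of hand-written checks.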
Major Benefits of XML
In the context of the Semantic Web, XML provides an interoperable syntactical foundation upon which solutions to the issues of representing relationships and meaning can be built. We also want to highlight the many benefits of XML that are adding to its rapid uptake in the first place and might, in the longer term, be supportive in realising the Semantic Web vision on a broader scale.
XML is one of the most important standards developments in recent years. It is an international, universal, non-system and non-application specific data exchange standard. XML is international because it employs Unicode. This means that there is no restriction to the western alphabet; Arabic, Chinese, Greek, Hebrew, Thai etc. can be easily integrated.
XML is non-system specific because it is an open standard set by the World Wide Web Consortium (W3C). As such, there is no owner of XML. All the major software suppliers support it, and it can be used on any computing platform, from Windows and MacOS to Linux. This makes it easier for organisations to change systems or combine different systems.
XML is also non-application specific, i.e. it can be used in various applications such as data exchange, data harvesting, Web site management etc. XML is gaining ever-wider acceptance in many application domains, including in particular the cultural heritage community.
Bear in mind also that the major collection management software producers have implemented support for XML in their systems, enabling, for example, the integration of data from different collections and their combination over the Web.
XML allows for multi-channel publishing, i.e. with XML it is easy to produce different products or services from digital cultural heritage assets. Once the data are structured in XML, they can be displayed across a variety of media using an associated style sheet that contains the display information.
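The idea of multi-channel publishing can be sketched as follows: one XML record, two renderings. In practice a style sheet language such as XSLT would carry the display information; here two small Python functions stand in for the style sheets. The function names and the shortened record are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET
from html import escape

NS = {"mi": "http://www.m-i.org/images"}

RECORD = """<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">
  <mi:subject>Eve emerges from Adam's body</mi:subject>
  <mi:creator>Alexander Master</mi:creator>
</mi:image>"""


def to_html(root):
    """Render the record as a small HTML fragment (e.g. for a Web page)."""
    subject = root.findtext("mi:subject", namespaces=NS)
    creator = root.findtext("mi:creator", namespaces=NS)
    return f"<h1>{escape(subject, quote=False)}</h1><p>{escape(creator, quote=False)}</p>"


def to_text(root):
    """Render the same record as one line of plain text (e.g. for a small display)."""
    subject = root.findtext("mi:subject", namespaces=NS)
    creator = root.findtext("mi:creator", namespaces=NS)
    return f"{subject} ({creator})"
```

Both functions read the same parsed record, for example `to_text(ET.fromstring(RECORD))`; only the presentation differs, while the underlying data stay untouched.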
Finally, XML can be used to create new languages. For example, the Wireless Markup Language (WML), which is used to mark up Internet applications for handheld devices, is written in XML.
Ontological concepts
The goal of the FMS project is to make metadata of the museums' textiles collections semantically interoperable on the Web. In order to achieve such interoperability, an ontology is being designed that describes the common (lower-level) ontological concepts in this domain of knowledge.4
Ontology
In the Semantic Web architecture the semantic relationships are not embedded but explicitly represented by an ontology or, rather, an interrelated set of ontologies. In fact, the wide array of information residing on the Web and the perceived need to make it more machine-processable have acted as a strong impetus for the development of ontology languages.
Yet what is an ontology? An abstract definition of an ontology is that it describes a formal, shared conceptualisation of a particular domain of interest, for example cultural heritage objects held in art museums. In particular, an ontology allows for constraining, expressing and analysing the intended meaning of the shared vocabulary of concepts and relations in a domain of knowledge.5
If these concepts and relations are formalised to a high degree, the domain has at hand a major building block for developing semantically aware information systems. With Semantic Web technologies the domain ontology can be made available on the network, cross-referenced with upper-level and other domain ontologies,6 and remote applications or intelligent software agents can refer to it when they interact to provide a certain information service.
4 The available documents (in English) on the FMS initiative state that their ontology is being created using RDF Schema (RDFS). To develop a fully fledged ontology, advanced languages such as DAML+OIL or Web Ontology Language (OWL) would be required.
5 For more elaborate and formal descriptions see: Tom Gruber, What is an Ontology? (1995), http://www-ksl.stanford.edu/kst/what-is-an-ontology.html; Nicola Guarino, Ontology-Driven Conceptual Modelling, part 1-3 (2002), http://ontology.ip.rm.cnr.it/Tutorials/
The degree of formalisation of concepts and their relations varies considerably between different domains of knowledge. At the lower end one finds lexicons and simple taxonomies (i.e. ordered classification systems where terms are related hierarchically). At the middle level one might place thesauri, i.e. controlled vocabularies that are structured to show relationships between terms and concepts and, for example, allow for retrieving them from a database. At the high end of formalisation of knowledge there are axiomatised logic theories. Such theories include rules to ensure the well-formedness and logical validity of statements expressed in the language of the scientific discipline.
In the cultural heritage sector, a powerful IT-supported example of a hierarchical classification system is Iconclass. This supports the documentation of images, in particular art historical images, by providing a systematic collection of 28,000 ready-made definitions of objects, persons, events, situations and abstract ideas that can be the subject of an image. The definitions consist of an alphanumeric classification code and its textual correlate.7
For example, for the image we have described in XML in graphic 1, the Iconclass definition is 71A3421 Eve emerges from Adam's body. The Medieval Illustrated Manuscripts Website of the Koninklijke Bibliotheek, The Hague, has an Iconclass Browser in place8 that provides the hierarchical path for this concept in the classification system.
An outstanding example of a controlled vocabulary is the Art & Architecture Thesaurus (AAT), one of the Getty Research Institute's Vocabulary Databases. It is a structured vocabulary of more than 125,000 terms and other information about concepts that are used for describing fine art, architecture, decorative arts, archival materials and material culture.9
The W3C's Resource Description Framework (RDF) Schema Specification 1.0, in a section on its scope, mentions concept navigation and states: 'Thesauri and library classification schemes are well known examples of hierarchical systems for representing subject taxonomies in terms of the relationships between named concepts. The RDF Schema specification provides sufficient resources for creating RDF models that represent the logical structure of thesauri (and other library classification systems).'10
Yet for realising a full-blown cultural heritage ontology for the Semantic Web there are currently limitations on both sides. On the one hand, hierarchical classification systems and structured vocabularies do not lend themselves easily to rich inter-linking of conceptual 'trees'.
A major step further in this direction is the CIDOC object-oriented Conceptual Reference Model (CRM).11 This provides an ontology of 81 classes and 130 properties, which describes in a formal language concepts and relations relevant to the documentation of cultural heritage.
On the other hand, RDF Schema has limitations when it comes to expressing complex ontological relationships. New languages based on description logics are being developed. These include DAML+OIL and the upcoming Web Ontology Language (OWL), which are capable of fully describing ontologies.
Also worth highlighting is that tools for ontology building are proliferating at the present time. These ontology editors need to be carefully assessed, as their capabilities differ considerably.12
Iconclass hierarchical path for 71A3421:
7 Bible
71 Old Testament
71A Genesis from the creation to the expulsion from paradise and later years of Adam and Eve
71A3 creation of man; the Garden of Eden (Genesis 1:26-2)
71A34 creation of Eve
71A342 Eve is fashioned from Adam's rib
71A3421 Eve emerges from Adam's body
6 Upper-level ontologies describe the basic concepts and relationships invoked when information about any domain is expressed in natural language.
7 For in-depth information see the official Iconclass Website: http://www.iconclass.nl
8 Medieval Illustrated Manuscripts Website: http://www.kb.nl/kb/manuscripts/browser. The subject access system for the Website was conceived by Mnemosyne Partners, building on the Iconclass classification system and technologies. See the valuable information they provide at http://www.mnemosyne.org/business/msstempex.html
9 A detailed description of the AAT is provided at http://www.getty.edu/research/tools/vocabulary/aat/about.html
10 http://www.w3.org/TR/2000/CR-rdf-schema-20000327
11 See the information box at the end of the Forum discussion and the sources mentioned in the Semantic Web Terms and Reading List.
12 Michael Denny, Ontology Building: A Survey of Editing Tools (06-11-2002), http://www.xml.com/pub/a/2002/11/06/ontologies.html
Semantic Transformation 1: The RDF Data Model
Progressing towards semantic interoperability, the metadata in the XML documents are now transformed into RDF statements corresponding to the RDF data model. With these so-called RDF 'triples', the XML metadata elements are mapped to the RDF classes and properties which are defined by the RDF Schema of the FMS initiative (see section 3).
'XML is nothing more than a way to standardize data formats. This is not to underplay XML's importance. A data-format standard makes all of the more glamorous technologies possible, and RDF is the leading example of the benefit that comes once the data format has been standardized. Many proclaim that RDF is really the XML's killer app, and with good reason. Despite all this, RDF remains somewhat obscure. This is mainly because, at its core, RDF is very abstract, very dry, and very academic.' Uche Ogbuji: An introduction to RDF (2000), http://www-106.ibm.com/developerworks/library/w-rdf/?dwzone=xml
RDF Data Model
In order to make Web resources semantically interoperable, we need resources that provide machine-understandable information about themselves. In the Semantic Web architecture these statements are built by using the Resource Description Framework (RDF).
RDF defines a data model for statements describing typed relationships between uniquely identified resources. RDF distinguishes between:
| resources: familiar examples are a Web page, an electronic document or a digital image, but in RDF entities that are not 'network retrievable', e.g. museums, curators or bound medieval manuscripts, can also be resources
| properties: these identify a specific aspect, characteristic, attribute or relation used to describe the resource
| statements: these associate a value for a named property with the resource
Hence, RDF provides a model for describing relationships between resources in terms of named properties and values. The RDF data model intrinsically supports only binary relations. Its base element is the 'triple', which takes the form of subject, predicate, object: a resource (the subject) is linked to another resource (the object) through an arc labelled with a third resource (the predicate). The semantics of a triple clearly depends on the property used as predicate.
A convenient way to visualise this is to draw nodes for subject and object and an arrow between them for the predicate (see graphic 2). In this labelled, directed graph, subject and predicate (property) are Uniform Resource Identifiers (URIs), and the object is either a URI or a literal (which is drawn as a box). Everything in RDF can be represented by a graph with nodes and arcs, and the data model allows for using the same URI as a node and as an arc label.
To represent RDF statements in a machine-processable way, RDF builds on XML. With RDF/XML, a specific XML markup language, RDF information can be represented and exchanged between machines.
Graphic 2: RDF Data Model
Triple 1
Subject: http://www.m-i.org/schemas/images#ColumnMiniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Miniature
Triple 2
Subject: http://www.m-i.org/schemas/images#Miniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Image
With these two triples we state that ColumnMiniature is a subclass of Miniature and that Miniature is a subclass of Image.
The predicate in our statements is the rdfs:subClassOf property, which is predefined in the RDF Schema namespace http://www.w3.org/2000/01/rdf-schema#.
The classes Image, Miniature and ColumnMiniature would also need to be defined in our RDF Schema namespace http://www.m-i.org/schemas/images.
For details on how to use RDF Schema for defining your domain ontology, see section 3. In a nutshell, RDF Schema is a 'higher-level' language which is itself defined using RDF.
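The two triples of graphic 2 can also be written down as plain data. The sketch below models a triple as a three-part tuple and the graph as a set of triples. It is an illustration only: the `Triple` type and the helper `objects` are hypothetical, and the `#` separator between the namespace of the fictitious organisation and the class names is an assumption.

```python
from typing import NamedTuple

RDFS = "http://www.w3.org/2000/01/rdf-schema#"
MI = "http://www.m-i.org/schemas/images#"  # assumed separator "#"


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# The graph is simply the set of all statements.
GRAPH = {
    Triple(MI + "ColumnMiniature", RDFS + "subClassOf", MI + "Miniature"),
    Triple(MI + "Miniature", RDFS + "subClassOf", MI + "Image"),
}


def objects(graph, subject, predicate):
    """Follow all arcs labelled `predicate` that leave the node `subject`."""
    return {t.obj for t in graph if t.subject == subject and t.predicate == predicate}
```

Asking for `objects(GRAPH, MI + "ColumnMiniature", RDFS + "subClassOf")` follows one arc and returns the Miniature class. The same URI can appear as a subject in one triple and as an object in another, which is what makes the set of triples a graph.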
Semantic Transformation 2: Creating and Validating the RDF Statements
In the mapping process an editor tool (in the FMS case a tool developed by the project team, called Meedio) receives as input the XML documents and assists in transforming them into semantically valid RDF statements (instance descriptions). The tool serves as instance editor, which provides a convenient way of finding and selecting from the XML metadata elements correct instance values for a particular property. The editor tool also serves as semantic metadata validator. When the museum cataloguer saves the set of RDF statements corresponding to an XML document, the semantics of these statements are validated against the property constraints of the FMS ontology. The result of a successful mapping and validation process is a unique set of RDF triples, called the RDF card.
RDF Mapping Rules
When RDF is used to define the meaning of XML metadata elements, a set of mapping rules is created. A mapping rule is a template of RDF triples where XPath expressions are used to identify the actual element values. XPath is a language for addressing parts of an XML document and was designed for use in XML processing software (XSLT, XPointer and others).
When applying such a rule to an XML document, the XPath expressions are instantiated with matching element values. If the rule matches, the RDF template evaluates to a set of RDF triples where XPath expressions are substituted by the corresponding values of the XML elements. For example, by applying the template rule <image5kb78d38i/type, mi:hasCreator, image5kb78d38i/creator> to the XML document described in the section on XML, the following result would be obtained: <Column Miniature, mi:hasCreator, 'Alexander Master'>.
mi:hasCreator is an example of an RDF property. Such properties are explained in section 3, RDF Schema.
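How a mapping rule turns element values into a triple can be sketched as follows. This is a simplified illustration and not the rule syntax of the FMS editor tool: the `xpath:` marker, the names `RULE` and `apply_rule`, and the shortened document are hypothetical. The path expressions use the limited XPath subset supported by Python's standard library.

```python
import xml.etree.ElementTree as ET

NS = {"mi": "http://www.m-i.org/images"}

DOC = """<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">
  <mi:type>Column Miniature</mi:type>
  <mi:creator>Alexander Master</mi:creator>
</mi:image>"""

# A rule is a triple template; slots starting with "xpath:" are looked up
# in the document, all other slots are copied unchanged.
RULE = ("xpath:mi:type", "mi:hasCreator", "xpath:mi:creator")


def apply_rule(rule, root):
    """Instantiate the template; return None if a path finds no element."""
    values = []
    for slot in rule:
        if slot.startswith("xpath:"):
            value = root.findtext(slot[len("xpath:"):], namespaces=NS)
            if value is None:
                return None  # the rule does not match this document
            values.append(value)
        else:
            values.append(slot)
    return tuple(values)
```

Applied to the parsed `DOC`, the rule evaluates to a triple whose subject and object are the text values of the type and creator elements. A document without a creator element yields no triple at all.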
Note: Due to the limited space permitted, we do not address issues of term mapping. This is an important aspect of the mapping and validation process carried out in the FMS project. Working with metadata from different museums, the project team needs to deal with partly different terminologies. Their technical solution to synonymous terms (i.e. different terms referring to the same concept) is to attach synonym sets to the FMS ontology classes. Where polysemous terms occur (i.e. the same term refers to different concepts), the editor tool cannot cope and the cataloguer needs to select the correct interpretation.
Semantic Transformation 3: The RDF Schema (RDFS)
The shared ontology for the textiles domain is created by using Resource Description Framework Schema (RDFS). An RDF Schema is a tool for indicating the classes of resources one wants to describe, as well as for defining the properties used to describe those resources. Furthermore, class/sub-class relationships and property/sub-property relationships can be defined. The museums are mapping their metadata to the classes and properties defined by the RDF Schema of the FMS initiative. Thereby they are making the meaning of the metadata explicit and representing them in a harmonised, uniform way.
RDF Schema
In the section Semantic Transformation 1 we have described the data model provided by RDF for expressing statements about Web resources. But we also need a vocabulary for the RDF statements, namely classes and properties defined with RDF Schema (RDFS).
In brief, the RDF Schema mechanism provides a pre-defined vocabulary, a basic type system that can be used in creating domain-specific schemas. Its role is to allow for declaring metadata properties (e.g. for 'type', 'subject' or 'creator'), to define the classes of resources they may be used with, to restrict possible combinations, and to detect violations of those restrictions.
Defining classes
With RDF Schema (RDFS), Web resources can be defined as instances of one or more classes. In addition, classes can be organised in a hierarchical fashion. As we hold a collection of digital images drawn from illustrated medieval manuscripts, we first need to define a class of things that are images. In RDF Schema a class is any resource having an rdf:type property whose value is the RDFS-defined resource rdfs:Class.
So, using the basic RDF data model, we define:
mi:Image [resource] rdf:type [property] rdfs:Class [value]
The self-defined prefix mi (for medieval images) stands for the URI reference of our RDF Schema namespace http://www.m-i.org/schemas/images
In our image collection we have various special kinds of digitised images such as column miniatures, decorated initials, schematic drawings etc. To distinguish, for example, the miniatures, we first need to define a general class Miniature and subclasses of miniatures, e.g. a subclass ColumnMiniature:
mi:Miniature rdf:type rdfs:Class
mi:ColumnMiniature rdf:type rdfs:Class
Secondly, we need to define that mi:ColumnMiniature is a subclass of mi:Miniature and that mi:Miniature is a subclass of mi:Image, for which we use the predefined rdfs:subClassOf property:
mi:Miniature rdfs:subClassOf mi:Image
mi:ColumnMiniature rdfs:subClassOf mi:Miniature
As the rdfs:subClassOf property is transitive, this means that mi:ColumnMiniature is also implicitly a subclass of mi:Image.
Graphic 2 visualises this with the nodes and arcs of the basic RDF data model.
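The implicit subclass relationship that follows from transitivity can be computed mechanically. The sketch below walks the rdfs:subClassOf statements upwards from a given class. The dictionary layout and the function name `superclasses` are assumptions made for this illustration.

```python
# Direct rdfs:subClassOf statements: class -> set of direct superclasses.
SUB_CLASS_OF = {
    "mi:ColumnMiniature": {"mi:Miniature"},
    "mi:Miniature": {"mi:Image"},
}


def superclasses(cls, sub_class_of):
    """Return every direct and implied superclass of cls."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in sub_class_of.get(current, ()):
            if parent not in seen:  # also guards against cycles
                seen.add(parent)
                stack.append(parent)
    return seen
```

For mi:ColumnMiniature the function returns both mi:Miniature (stated directly) and mi:Image (implied by transitivity), although no statement links mi:ColumnMiniature and mi:Image directly.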
Defining properties
In order to make the meaning of our metadata (e.g. 'type') explicit, we need to be capable of declaring specific properties that characterise the classes of things we hold at http://www.m-i.org, e.g. digital images of medieval column miniatures.
Basically, RDF Schema defines properties in terms of the classes of resources to which they apply. This is the role of the rdfs:domain and rdfs:range mechanisms.
rdfs:range
The range constraint defines the class or set of classes whose instances can be values of a particular property. If we want to define the property mi:hasType, we must describe this resource (which we locate at http://www.m-i.org/schemas/images) with an rdf:type property whose value is rdf:Property:
mi:hasType [resource] rdf:type [property] rdf:Property [value]
The following RDF statements indicate that mi:ColumnMiniature is a class, mi:hasType is a property, and RDF statements using the mi:hasType property have instances of mi:ColumnMiniature as values:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:range mi:ColumnMiniature
rdfs:domain
The domain constraint restricts the set of classes whose instances may have a particular property attached to them. If we want to indicate that the property mi:hasType applies to instances of class mi:ColumnMiniature, we would write:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:domain mi:ColumnMiniature
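Read as constraints, as in the text above, domain and range declarations can be checked against individual statements. The following sketch shows the idea. The function name `violations`, the data layout, and the resources mi:initial42 and mi:DecoratedInitial are hypothetical and serve only to provide a counter-example.

```python
# Property constraints as declared with rdfs:domain (and, optionally, rdfs:range).
CONSTRAINTS = {
    "mi:hasType": {"domain": "mi:ColumnMiniature"},
}

# rdf:type statements: resource -> set of classes it is an instance of.
INSTANCE_OF = {
    "mi:image5kb78d38i": {"mi:ColumnMiniature"},
    "mi:initial42": {"mi:DecoratedInitial"},  # hypothetical counter-example
}


def violations(triple, instance_of, constraints):
    """Return the names of the constraints ("domain", "range") a triple breaks."""
    subject, prop, value = triple
    rule = constraints.get(prop, {})
    problems = []
    if "domain" in rule and rule["domain"] not in instance_of.get(subject, set()):
        problems.append("domain")
    if "range" in rule and rule["range"] not in instance_of.get(value, set()):
        problems.append("range")
    return problems
```

A statement that attaches mi:hasType to mi:image5kb78d38i passes, whereas the same property attached to mi:initial42 is reported as a domain violation. This is the kind of check a semantic metadata validator could run before an RDF card is saved.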
Benefits of RDF
In a SearchWebServices.com definition of RDF some benefits of RDF are mentioned:
| 'By providing a consistent framework, RDF will encourage the providing of metadata about Internet resources.
| Because RDF will include a standard syntax for describing and querying data, software that exploits metadata will be easier and faster to produce.
| The standard syntax and query capability will allow applications to exchange information more easily.
| Searchers will get more precise results from searching, based on metadata rather than on indexes derived from full text gathering.
| Intelligent software agents will have more precise data to work with.'13
This is a well-crafted listing of RDF benefits, from provision and exchange of better metadata to agents working with them, hopefully for the benefit of humans. But, as explicitly stated by SearchWebServices.com, these are only potential benefits, i.e. they depend on the level of actual uptake of RDF.
13 whatis.com/searchWebServices.com Definitions - Resource Description Framework, http://searchwebservices.techtarget.com/sDefinition/0,,sid26_gci213545,00.html (last updated July 27, 2001)
Generating and Using the Knowledge Space
The RDF cards represent the original XML documents at the semantic level. The union of such RDF cards constitutes a knowledge base, which is a harmonised semantic representation of the underlying heterogeneous databases.
However, so far the RDF instance descriptions have not left the museum. The museum has complete control of the information it wants to publish, and it does not need to allow the FMS system access to its internal database system. The RDF data are placed in a public directory on the museum's WWW server.
The Web crawler of the FMS system harvests the instance descriptions from the different museums, and the system combines them into an RDF repository. This repository is a large semantic graph that consists of the shared ontology and metadata.
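Because an RDF graph is a set of triples, combining the harvested RDF cards with the shared ontology amounts to a set union. The sketch below illustrates this under stated assumptions: the museum prefixes, item identifiers and the function name `build_repository` are hypothetical, and the class names are taken from the medieval images example of this primer rather than from the FMS textiles ontology.

```python
# Shared ontology (schema statements).
ONTOLOGY = {
    ("mi:ColumnMiniature", "rdfs:subClassOf", "mi:Miniature"),
    ("mi:Miniature", "rdfs:subClassOf", "mi:Image"),
}

# One RDF card per collection item, as harvested from two hypothetical museums.
CARD_A = {
    ("museumA:item1", "rdf:type", "mi:ColumnMiniature"),
    ("museumA:item1", "mi:hasCreator", "Alexander Master"),
}
CARD_B = {
    ("museumB:item7", "rdf:type", "mi:Miniature"),
}


def build_repository(ontology, cards):
    """Merge the ontology and all harvested cards into one set of triples."""
    repository = set(ontology)
    for card in cards:
        repository.update(card)  # duplicate statements collapse automatically
    return repository
```

`build_repository(ONTOLOGY, [CARD_A, CARD_B])` returns a single graph with five statements. Harvesting the same card twice would not change the result, since a set stores each triple only once.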
How does a user now search and navigate in this knowledge space? In the FMS system this is implemented by server-side software called Ontogator. Based on the semantic graph, this software dynamically generates semantic linkages for the user's Web browser.
One way of using the FMS system is view-based filtering. The user can select classes of resources from the ontology, and the system finds the instances that match the selected class restrictions. By constraining classes (views) further, the user eventually finds the collection instance data searched for.
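View-based filtering can be sketched as a query over the semantic graph: the user selects a class, and the system returns all instances of that class or of any of its subclasses. The code below is a minimal illustration and not the Ontogator implementation. The repository content, the identifiers and the function name `instances_of` are hypothetical.

```python
REPOSITORY = {
    ("mi:ColumnMiniature", "rdfs:subClassOf", "mi:Miniature"),
    ("mi:Miniature", "rdfs:subClassOf", "mi:Image"),
    ("museumA:item1", "rdf:type", "mi:ColumnMiniature"),
    ("museumB:item7", "rdf:type", "mi:Miniature"),
    ("museumB:item9", "rdf:type", "mi:Image"),
}


def instances_of(selected, repository):
    """Return all resources typed with the selected class or one of its subclasses."""
    classes = {selected}
    changed = True
    while changed:  # repeat until no further subclass is discovered
        changed = False
        for subject, predicate, obj in repository:
            if predicate == "rdfs:subClassOf" and obj in classes and subject not in classes:
                classes.add(subject)
                changed = True
    return {s for s, p, o in repository if p == "rdf:type" and o in classes}
```

Selecting mi:Image returns all three items, selecting mi:Miniature narrows the result to two, and selecting mi:ColumnMiniature leaves a single item. This mirrors how constraining the view step by step narrows the set of matching instances.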
The software also supports topic-based navigation by providing semantic links between topics of interest, the creation of which is based on the collection domain ontology and the related metadata of the collection records. This means that the links also provide the user with an impression of the wider context and pragmatics of the objects in the museums' collections.
From human users to software agents
As described in the Finnish Museums on the Semantic Web example, the RDF repository is a large semantic graph that consists of the shared ontology and metadata of the participating museums. Such a repository can be queried, and the results, a set of pointers to the relevant resources, can be accessed using Web browsers. The opportunities provided by a system like the one developed by the FMS initiative (e.g. topic-based navigation) are at present restricted primarily to human users.
The Semantic Web vision includes intelligent software agents which 'understand' semantic relationships between Web resources and seek relevant information as well as perform transactions for humans.14
This software would be capable of autonomous action, i.e. it could run without direct human control or constant supervision, and ideally it would be very flexible in doing this. Characterisations of this flexibility include actions that are 'reactive', 'proactive' and 'social' (see below).
While the basic idea of agents is very intuitive and appealing, the actual theory is complex, the tools are immature, and the solutions are small and prototype-based. In fact, as a parallel, distributed systems technology, agents belong to the most complex class of software technology.
However, this primer will conclude with a summary of what an intelligent software agent is and what such software would generally be capable of doing. This should also serve as an indication of how great the challenge for research and technological development is to make the full Semantic Web vision a reality.
Intelligent Software Agents
The following definitions are taken from Michael Wooldridge’s introduction to multiagent systems.15
Agent
‘An agent is a computer system capable of autonomous action in some environment.’
Intelligent agent
‘An intelligent agent is a computer system capable of flexible autonomous action in some environment.’
Flexible autonomous action
‘By flexible autonomous action we mean reactive, proactive, social.’
| Reactivity: ‘A reactive system is one that maintains an ongoing interaction with its environment, and responds to changes that occur in it (in time for the response to be useful).’
| Proactiveness: ‘An agent serves a purpose and therefore exhibits goal-directed behaviour, including the capacity to recognise opportunities for useful courses of action.’
| Social ability: ‘Social ability in agents is the ability to interact with other agents (and possibly humans) via some kind of agent communication language, and perhaps cooperate with others.’
Desirable further properties of agents are:
| Mobility: the ability to move around an electronic network.
| Rationality: an agent will act in such a way that it does not prevent itself from achieving its goals (as far as this is possible with a limited set of beliefs representing its world knowledge).
| Learning: an agent will improve its performance over time.
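The three characteristics of flexible autonomous action can be made concrete with a toy Python sketch. The `ToyAgent` class, its method names and the message format are hypothetical and deliberately simplistic; real agent platforms and agent communication languages are far richer than this.

```python
# Hypothetical sketch: a toy agent showing reactive, proactive and social
# behaviour. Nothing here corresponds to an actual agent framework.


class ToyAgent:
    def __init__(self, name, goal):
        self.name = name
        self.goal = set(goal)   # resource ids the agent wants to locate
        self.found = set()
        self.outbox = []        # messages addressed to other agents

    def perceive(self, event):
        """Reactivity: respond to a change in the environment."""
        if event in self.goal:
            self.found.add(event)

    def act(self):
        """Proactiveness: pursue the goal without being prompted."""
        missing = sorted(self.goal - self.found)
        if missing:
            # Social ability: ask other agents for the next missing item.
            self.outbox.append({"from": self.name, "request": missing[0]})
        return not missing      # True once the goal has been reached


agent = ToyAgent("curator-agent", goal={"obj1", "obj2"})
agent.perceive("obj1")
done_first = agent.act()    # goal not yet reached, a request is queued
agent.perceive("obj2")
done_second = agent.act()   # goal reached, no further request
```

Mobility, rationality and learning are left out on purpose; each would add considerable machinery, which hints at why agent technology remains difficult to build.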
14 T. Berners-Lee, J. Hendler, O. Lassila: Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html
15 M. Wooldridge: An Introduction to Multiagent Systems. Chichester: Wiley 2002, and http://www.csc.liv.ac.uk/~mjw/pubs/imas
Resources
In the primer, no references are made to the documents of the World Wide Web Consortium (W3C). All relevant W3C recommendations can be found at http://www.w3.org
Of the wealth of introductory materials on XML and RDF available on the Web, the following in particular are useful to consult for further details:
http://www.w3schools.com/xml
http://www.w3schools.com/schema
http://www.w3.org/TR/rdf-primer
K. S. Candan, H. Liu, R. Suvarna: Resource Description Framework: Metadata and its Applications. In: SIGKDD Explorations, Vol. 3:1 (2001), 6-19
Pierre-Antoine Champin: RDF Tutorial (2001), http://www710.univ-lyon1.fr/~champin/rdf-tutorial/rdf-tutorial.html
S. Decker, M. P. Mitra, S. Melnik: Framework for the Semantic Web: An RDF Tutorial, http://www.ida.liu.se/~asmpa/courses/sweb/rdf/rdf_tutorial.pdf
[Graphic 3: Set-up of the Finnish Museums on the Semantic Web System. The overview shows, from top to bottom: the user client (WWW browser with topic-based navigation and view-based filtering); the server with the Ontogator software and the RDF database holding the semantic graph (knowledge space of shared ontology and metadata); the Web crawler; and, for each museum, RDF instances, an XML repository and a collection database (1 to n), with a metadata editor. The layers are labelled RDF Schema (semantic interoperability), XML Schema (syntactic interoperability) and relational schemas (DBMS).]
THE DARMSTADT FORUM PARTICIPANTS
Wernher Behrendt, Salzburg Research, Austria
Wernher Behrendt is a Senior Researcher at the Sun Technology and Research Excellence Center at Salzburg Research (Austria), working on multimedia middleware and interoperation issues. He holds an MSc in Cognitive Science from Manchester University and has more than 10 years’ experience in near-to-market IT research. From 1989 to 1995 he was a Senior Research Associate in the Informatics Department at Rutherford Appleton Laboratory (UK), working on embedded knowledge-based systems in distributed multimedia presentation systems. From 1995 to 1998 he was a Senior Research Associate at Cardiff University (UK), working on interoperation between heterogeneous information systems. Mr Behrendt has held courses in Computer Science and has worked on projects ranging from software engineering methods and quality assurance to legacy system reengineering using migration methods and distributed systems middleware.
E-mail: wernher.behrendt@salzburgresearch.at
Paolo Buonora, Archivio di Stato di Roma, Italy
Paolo Buonora holds a degree in Philosophy from the University of Rome ‘La Sapienza’ (1976). He worked in the Italian State Archive Administration from 1978, where he was first involved in editing the Guida Generale degli Archivi di Stato italiani. From 1986 he worked in the Soprintendenza archivistica per il Lazio, surveying audiovisual archives and municipal archives; from 1989 to 1991 he was at Perugia University, engaged in doctoral research in ‘Urban and rural history’; and from 1991 to 1994 he was again in the Soprintendenza archivistica per il Lazio. After 1994 he worked in the Archivio di Stato di Roma, where he was responsible for the photograph service and several working groups on informatics application in archival documentation. From 1997 until the present time, he has planned and directed the Imago II project in the Archivio di Stato di Roma.
See httpwwwasrmarchivbeniculturaliitsidimagoIMAGOIIenhtml
Samuel Cruz-Lara, University of Nancy 2 and LORIA, France
Samuel Cruz-Lara obtained a Master’s degree in Computer Science in 1984 (University Henri Poincaré, Nancy 1) and a PhD degree in Computer Science in 1988 (National Polytechnic Institute of Lorraine). The central topic of his PhD thesis was the generation of integrated development environments by using attribute grammars. He is currently Associate Professor at the University of Nancy 2 (Institute of Technology, Computer Science Department) and permanent Researcher at LORIA (Lorraine Laboratory for Research in Computer Science and its Applications – UMR 7503 – CNRS – INRIA – Universities of Nancy). He is a member of the ‘Language and Dialogue’ team and has conducted several research activities on distributed software architectures and textual linguistic resources management. He is currently working in the context of distributed architectures and multimedia resources management. Dr Cruz-Lara has participated in several projects, in particular CNRS/SILFIDE and MLIS-ELAN, and he is at present co-leader of the ‘Digital Museum’ project. This is a joint project between LORIA and the National Chi-Nan University. The ‘Digital Museum’ project is sponsored by the National Science Council of the Republic of China (Taiwan) and supported by INRIA (France).
Costis Dallas, Critical Publics SA, Greece
Costis Dallas is Chairman and Senior Researcher of Critical Publics (http://www.criticalpublics.com), a group of companies active in the field of strategic communications, creative design and technology. He is currently a Lecturer in the Department of Communication and Mass Media of Panteion University and has over 15 years of research and professional experience in hypermedia applications, human factor issues and cultural information systems. Dr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA, co-founder and Chair of the Multimedia Working Group of the International Council of Museums (CIDOC/ICOM), Head of Documentation and Systems of the Benaki Museum, General Director of the Foundation of the Hellenic World, Special Secretary of the Greek Ministry of Education in charge of libraries, archives and instructional technologies, and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: info@criticalpublics.com
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (http://www.nl.adlibsoft.com), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that he joined Databasix, producer of the ADLIB software. In 1991 Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was renamed later as ADLIB Information Systems.
ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include:
| CIMI Dublin Core test bed - ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database, in addition to the ‘standard’ HTML/XML access.
| SPECTRUM-XML - A project of the UK mda, together with a team of software vendors, to produce a schema for exchange of data which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software.
| Internet Gelderse Musea (http://www.igem.nl) and Maritiem Digitaal (http://www.maritiemdigitaal.nl) - Two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.
E-mail: bert@nl.adlibsoft.com
Nicola Guarino, Italian National Research Council, Italy
Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991 and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS) and associate editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: Nicola.Guarino@ladseb.pd.cnr.it
Janneke van Kersen, Digital Heritage Association, The Netherlands
Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992 she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing, and more recently she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University.
Since October 1999 van Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata, and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations such as museums, archives, archaeological organisations, monuments and special collections of libraries in a context-rich and XML-based environment. Interoperability from both a technical and an organisational perspective is therefore a major issue. The Association offers access to heritage information to the general public at http://www.cultuurwijzer.nl and customised access for the educational field at http://www.cultuurwijs.nl
E-mail: jannekevankersendennl
Marco Meli, EDW International, Italy
Marco Meli is CEO and co-founder of EDW International (http://www.edw-international.com), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia and VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meli@edw.it
Paul Miller, UKOLN, UK
Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, http://www.ukoln.ac.uk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK’s Further and Higher Education Funding Councils and Resource: The Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI) and the Metadata Working Group of the UK Government’s Office of the e-Envoy.
E-mail: P.Miller@ukoln.ac.uk
Frank Nack, CWI, The Netherlands
Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on ‘The Application of Video Semantics and Theme Representation for Automated Film Editing’ at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling, and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: Frank.Nack@cwi.nl
Franco Niccolucci, Florence University, Italy
Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference.
See also http://www.geog.port.ac.uk/hist-bound/people/niccolucci.htm
Seamus Ross, HATII, University of Glasgow, UK
Dr Seamus Ross is Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Network) (IST-2001-32706), a European Union funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously, he was Assistant Secretary for Information Technology at the British Academy, and before that worked for a company specialising in expert systems and software development, as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group’s PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: S.Ross@hatii.arts.gla.ac.uk
Andrea Scotti, Institute & Museum for the History of Science, Italy
Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995 he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.
Since 1996 Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza and the National Central Library, and under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (01/02/01-30/07/03), funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies.
See http://cweb.inria.fr/Projects/Mesmuses
E-mail: scottian@galileo.imss.firenze.it
DIGICULT PROJECT INFORMATION
DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over the period of 30 months (03/2002-08/2004).
In order to encourage early take-up, DigiCULT produces seven Thematic Issues and three Technology Watch Reports, along with the newsletter DigiCULT.Info.
DigiCULT draws on the results of the strategic study ‘Technological Landscapes for Tomorrow’s Cultural Economy (DigiCULT)’ that was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications), in 2000 and completed in 2001.
Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at http://www.digicult.info
For further information on DigiCULT, please contact the team of the project co-ordinator:
Mr Guntram Geser, guntram.geser@salzburgresearch.at, Phone: +43-(0)662-2288-303
Mr John Pereira, john.pereira@salzburgresearch.at, Phone: +43-(0)662-2288-247
Salzburg Research Forschungsgesellschaft, Jakob-Haringer-Str. 5/III, A - 5020 Salzburg, Austria, Phone: +43-(0)662-2288-200, Fax: +43-(0)662-2288-222, http://www.salzburgresearch.at
Project Partner: HATII - Humanities Advanced Technology and Information Institute, University of Glasgow, http://www.hatii.arts.gla.ac.uk, Contact: Mr Seamus Ross, s.ross@hatii.arts.gla.ac.uk
The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom
DigiCULT Thematic Issue 1 – Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum, held in Barcelona on May 6th, 2002, in the context of the DLM-Conference 2002.
DigiCULT Thematic Issue 2 – Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum, held in Essen, Germany, on September 3rd, 2002, in the context of the AIIM Conference DMS EXPO.
DigiCULT Thematic Issue 3 – Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum, held on January 21st, 2003, at Fraunhofer IPSI, Darmstadt, Germany.
DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum, on Learning Objects, that will take place at the Koninklijke Bibliotheek – National Library of the Netherlands, The Hague, on July 2nd, 2003.
IMPRINT
This Thematic Issue is a product of the DigiCULTProject (IST-2001-34898)
Authors:
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy
Images:
Images for this Thematic Issue have been provided by and are reproduced with kind permission of the Koninklijke Bibliotheek – National Library of the Netherlands, The Hague, Netherlands.
Graphics & Layout:
Jan Steindl, Salzburg Research
ISBN 3-902448-00-8
Printed in Austria
© 2003
IMAGES
Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I. Image on p. 5 from Fol. 264r; size 80x75; illuminator: Orosius Master a.o.
Bible Historiale, Paris, c. 1320-1340, Volume I. Image on p. 7 from Fol. 2r; size 45x50; illum.: ‘Sub-Fauvel’ Master
Historic Bible, Utrecht, c. 1430, Volume I. Images on pp. 8, 10, 14, 28 from Fol. 3r, 3v, 4v, 7v; size 55x85 to 60x85; illum.: Alexander Master o.a.
Psalter Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468. Images on pp. 11, 13: Wednesday, Matins, Invitatorium, from Fol. 219r
Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350. Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron) from Fol. 104rb1, 106rb2, 124r, 111ra; size 40x55 to 50x55
Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335. Image on p. 21 from Fol. 4va1; size 45x55
Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460. Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41
Psalter, Normandy, c. 1180. Image on p. 26 from Fol. 3v; size 160x125
Breviary, Cambray (?), c. 1275-1300. Images on pp. 27, 29, 31, 33, 34, 35, 37 from Fol. 211r; size 205x135
© Koninklijke Bibliotheek, The Hague. Used with permission.
ONTOLOGIES – THE JEWELS OF THE SEMANTIC WEB
For the Semantic Web to succeed, it will require not only modelling languages such as XML, RDF and OWL, but also methodologies for extracting and defining the knowledge that is to be represented. Decades of research and commercial attempts to exploit knowledge-based systems have demonstrated the complexity of knowledge modelling. Until there is such a methodology, the possibilities of XML (or any other technology) as a knowledge representation language will not be achieved.
The success of the Semantic Web will depend heavily upon the creation of suitable ontologies. To avoid adding new variants to definitions, we will follow James Hendler’s definition of ontology as ‘a set of knowledge terms, including the vocabulary, the semantic interconnections and some simple rules of inference and logic for some particular topic’ (Hendler 2001, 30). One of the major hurdles facing us in building the Semantic Web is the lack of suitable ontologies. Languages such as OWL enable ontologies to represent ‘class taxonomies’ and provide mechanisms to enable their rapid development. For example, concepts and relationships can be established, such as ‘watercolour is a type of painting’ or ‘a necklace is a type of jewellery’. But what about their multilingual capabilities? An ontology may well know that a ‘watercolour is a painting’, but it does not necessarily mean that it knows that an ‘aquarelle is a type of painting’ or that a ‘watercolour is a type of peinture’. In addition, and probably first, we need to consider:
| Can we cost the creation of appropriate ontologies for the heritage sector?
| How can we prioritise the ontologies that are needed (e.g. which ones should the heritage sector develop and which ones will we be able to borrow from other sectors)?
| What heritage-based organisations should focus on ontology creation?
| Ontologies often fail to be interoperable. What solutions are there to this problem, and how can they be made to work effectively?
| Does OWL (W3C’s Web Ontology Semantic Markup Language for publishing and sharing ontologies) provide a suitable mechanism for ontology creation for the heritage sector?
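The multilingual gap raised in the watercolour example above can be illustrated with a small Python sketch. The taxonomy, the label table and both functions are hypothetical; they stand in for what an ontology language and its tooling would provide and are not OWL itself.

```python
# Hypothetical sketch: a class taxonomy with multilingual labels.
# A concept without labels in the label table cannot be reached by any term.

SUBCLASS_OF = {"watercolour": "painting", "necklace": "jewellery"}

LABELS = {
    "watercolour": {"en": "watercolour", "fr": "aquarelle"},
    "painting": {"en": "painting", "fr": "peinture"},
}


def concept_for(term):
    """Map a label in any known language to its concept id, if any."""
    for concept, labels in LABELS.items():
        if term in labels.values():
            return concept
    return None


def is_type_of(term, ancestor_term):
    """Check the 'is a type of' relation between two labelled terms."""
    child, ancestor = concept_for(term), concept_for(ancestor_term)
    if child is None or ancestor is None:
        return False
    while child is not None:
        if child == ancestor:
            return True
        child = SUBCLASS_OF.get(child)
    return False


print(is_type_of("aquarelle", "painting"))
print(is_type_of("watercolour", "peinture"))
print(is_type_of("necklace", "jewellery"))
```

The first two checks succeed only because French labels were added by hand; the third fails although the taxonomy contains the relation, because no labels were supplied for those concepts. The class hierarchy alone does not give an ontology its multilingual reach.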
Gómez-Pérez and Corcho (2002), in an analysis of ‘Ontology Languages for the Semantic Web’, found that the measure of expressiveness in the current generation of ontology creation languages is a spectrum from XOL, RDF(S), SHOE, OML and OIL to DAML+OIL at the richest end of the scale. Indeed, in their experience, while any of these languages will work for very simple ontologies, any attempt to use a weak language to create a complex ontology will fail.
Proof and trust is emerging as another central issue. How do we know that what our agent has discovered through its trawl of the Semantic Web can be trusted? Even in the case of ontologies, how should we decide whose ontology to trust? This is especially important where two ontologies may conflict with one another. Similarly, we are faced with the difficulties of ensuring and maintaining semantic integrity, and a lack of methods for testing its presence.
LEGITIMISING THE SEMANTIC WEB INVESTMENT
Heflin and Hendler (2001) make the valuable proposal that semantic mark-up should be seen as one aspect of web-page design. This, in their view, would go a long way to ensuring that the costs of this mark-up (and the underlying information analysis that is necessary to make it happen) were met at the appropriate stage of the process of putting material up on the web. However, Heflin and Hendler’s proposal that semantic mark-up should be embedded into web-page design fails to recognise that the fundamental fabric of the web is changing. For this to happen, we need a stronger argument for the benefits that such investment will bring to the heritage sector.
Haustein and Pleumann (2002) have noted that the successful development of the World Wide Web benefited from two factors: ‘Participation was simple and the results of effort were immediately visible to the creator.’ As they argue, while these two success criteria, best classified in my view as ease of use and instant gratification, were characteristic of the WWW, they are not embedded into the fabric of the Semantic Web. The Semantic Web is hard, and rewards are neither immediate nor assured. While in the long term it may bring tremendous benefits, the near-term take-up will be slow.
At least three other factors contributed to fostering the success of the web. Firstly, the early web developments concentrated on content creation and not on the creation of representation languages. The initial instantiation of HTML was simple, but it worked, and material tagged using it remained accessible. We were not forced to dispose of work that we had ‘webised’ unless we wished to replace its representation with more sophisticated ones. Secondly, the value of the content that we put up increased as more users put up content of their own, because the additional content attracted more users. Thirdly, to benefit you did not need to generate a lot of content; a very little would do, and you could incrementally add more later. Slowly, heritage institutions found ways to take advantage of the opportunities offered by the web, although there are still many small and medium-sized heritage institutions that have not.
[Image: Genesis – The Creation: Division of Sea and Earth; Creation of Trees and Plants]
Indeed the heritage sector is likely to be leftbehind because the financial rewards for creating
the mark-up necessary to make the Semantic Web areality are only evident to the commercial sectorThere can be little doubt that the access to andunderstanding of the heritage would benefit from aworld in which the vision of the Semantic Web wererealised But this is not the first information techno-logy for which the benefits were promising Evenvery simple strategies such as the use of databases toenable collection description have been shown over aperiod of nearly thirty-five years to bring benefits tothe heritage sector institutions through better know-ledge about care of and access to their collectionsIn the ALM sector only libraries can be said to havefully taken advantage of the technology to describetheir collections and even here a close look showsthat this has not covered all their holdings and notevery institution For instance few libraries in theUK have online catalogues of their pre-1700 itemsand almost none have accurately described theirphotographic holdings at anything deeper thancollection levelThe same can be said of museumswhere descriptions are limited except of course atthe major institutions In 1997 a survey in theUnited Kingdom showed that small and medium-sized institutions were struggling to participate in thecomputer-based description of their holdingsThiswas even before they considered putting the outputof those holdings online I would argue that thisshould hardly be surprising as the heritage sectorhas already been left behind in the developmentof online information in the web-worldToo fewinstitutions have too little visible content that isactually usable If the heritage sector is to make anear term contribution to the development of theSemantic Web it is going to be very moderate It isvery unlikely that developments will be related toreasoning about the heritage in the ways consideredby Amann et al (2002)The ALM sector is more
than likely to participate in the development of theSemantic Web through the creation of semanticmark-up of information about access arrangementssuch as opening hours and details of facilitiesThisinformation is more likely to be useful to the tour-ism agent described by Tim Berners-Lee in his 2001Scientific American articleWhile this may be a verypositive way of integrating the heritage into thesemantic web it does not maximise the potentialbenefits
The days when a curator who wishes to hold anexhibition on the representations of Salome since the15th century will be able to lsquoloadrsquo an agent with therequest to identify select negotiate the loan of andarrange the transportation of the key 100 works ofart are a long way offThe fundamental descriptionsof holdings are not currently available where theyexist they are not online and certainly have not beensemantically encoded to make them usable by ourSalome agent For those who have worked onKnowledge Representation the vision of theSemantic Web holds promise Knowledge Repre-sentation is hard especially if you intend any parti-cular representation to be usable by others either as adecision making resource or as for research purposesEfforts in the 1980s and early 1990s such as those in archaeology failedThe reasons KnowledgeRepresentation failed to achieve its promise rangedfrom the poor quality of knowledge extractionstrategies the lack of fundamental representationmethodologies the limited applicability of methodsto knowledge domains the problems of boundaryconstraint and creep to the high costs of developingapplicationsThe Semantic Web could breathe newlife into this earlier promise by providing ways tocarve up the problem while bringing us immediatesuccesses
CONCLUSION
Over the next five years the possibilities offered by the Semantic Web will bring little near-term benefit for the heritage sector unless that sector co-ordinates its efforts to ensure that the fundamental building blocks that are necessary for the Semantic Web to be a success are put in place. We need some quick wins. A quick win involves identifying a domain that every institution can be encouraged to represent semantically, and placing in the public domain a ‘personalisable agent’ that can take advantage of these semantics. Three factors could underpin a quick win: (a) a narrowly restricted knowledge domain of real public value; (b) an accessible and narrow ontology; and (c) a personalisable tool for processing knowledge.
Ultimately, the same factors that constrain the heritage sector’s ability to take full advantage of the Web will constrain the penetration and pervasiveness of the Semantic Web in the heritage sector. The success of the Semantic Web in the heritage sector depends upon its adopting an XML-based approach and upon a significant experiment that demonstrates its benefits to the wider community. Even with all its weaknesses, the Semantic Web offers a tantalising solution to the problem of information overload created by the Web, and the heritage sector needs to address how it can take advantage of the opportunities it offers.
Bibliography
Amann, B., Beeri, C., Fundulaki, I. and Scholl, M. 2002. Ontology-Based Integration of XML Resources. In: I. Horrocks and J. Hendler (eds), The Semantic Web - ISWC 2002. Berlin: Springer, 117-131.
Berners-Lee, T., Hendler, J. and Lassila, O. 2001. The Semantic Web. In: Scientific American, May 2001.
Candan, K.S., Liu, H. and Suvarna, R. 2001. Resource Description Framework: Metadata and Its Application. In: SIGKDD Explorations 3(1), 6-19. http://www.acm.org/sigs/sigkdd/explorations/issue3-1/candan.pdf
Doan, A., Madhavan, J., Domingos, P. and Halevy, A. 2002. Learning to Map between Ontologies on the Semantic Web. In: Proceedings WWW2002, 7-11 May 2002 (Honolulu), 662-673.
Gómez-Pérez, A. and Corcho, O. 2002. Ontology Languages for the Semantic Web. In: IEEE Intelligent Systems, January/February 2002, 54-60.
Haustein, S. and Pleumann, J. 2002. Is Participation in the Semantic Web Too Difficult? In: I. Horrocks and J. Hendler (eds), The Semantic Web - ISWC 2002. Berlin: Springer, 448-453.
Heflin, J. and Hendler, J. 2001. A Portrait of the Semantic Web in Action. In: IEEE Intelligent Systems, March/April 2001, 54-59.
Hendler, J. 2001. Agents and the Semantic Web. In: IEEE Intelligent Systems, March/April 2001, 30-37.
Hendler, J., Berners-Lee, T. and Miller, E. 2002. Integrating Applications on the Semantic Web. In: Journal of the Institute of Electrical Engineers of Japan 120(10), 676-680.
Klein, M. 2001. XML, RDF and Relatives. In: IEEE Intelligent Systems, March/April 2001, 26-28.
McGuinness, D.L., Fikes, R., Hendler, J. and Stein, L.A. 2002. DAML+OIL: An Ontology Language for the Semantic Web. In: IEEE Intelligent Systems, September/October 2002, 72-80.
Patel-Schneider, P. and Siméon, J. 2002a. Building the Semantic Web on XML. In: I. Horrocks and J. Hendler (eds), The Semantic Web - ISWC 2002. Berlin: Springer, 147-161. http://www-dbs.cs.uni-sb.de/lehre/ss03/xml-seminar/Material/PS02.pdf
Patel-Schneider, P. and Siméon, J. 2002b. The Yin/Yang Web: XML Syntax and RDF Semantics. In: Proceedings WWW2002, 7-11 May 2002 (Honolulu), 443-453.
Renear, A., Dubin, D., Sperberg-McQueen, C.M. and Huitfeldt, C. 2002. Towards a Semantics for XML Markup. In: DocEng’02, 8-9 November 2002 (McLean, VA). ACM Publication.
Shah, U., Finin, T., Joshi, A., Cost, R.S. and Mayfield, J. 2002. Information Retrieval on the Semantic Web. In: CIKM’02, 4-9 November 2002, 461-468.
DEVELOPMENT OF THE SEMANTIC WEB MUST BEGIN AT THE GRASS ROOTS LEVEL
AN INTERVIEW WITH JANNEKE VAN KERSEN, DUTCH DIGITAL HERITAGE ASSOCIATION, THE NETHERLANDS
By Joost van Kasteren
‘To be successful, the Semantic Web for the cultural heritage sector will have to develop from the grass roots level. A top-down approach whereby institutions have to squeeze themselves into a certain format is not going to work.’ Janneke van Kersen has strong views on the initiatives that are currently being undertaken to develop a Semantic Web. ‘They are not going to work for the cultural heritage institutions if you do not take into account the position that they are in. Especially not if the institutions are forced to overhaul their digitisation projects completely.’
Kersen graduated in Art History and did a postgraduate course on Historical Information Processing. Since 1999 she has been a consultant with the Dutch Digital Heritage Association (Vereniging DEN), which supports cultural heritage institutions, large and small, in developing strategies to face the digital future.
The key objectives of the DEN are to assist institutions in digitising and documenting their collections according to high quality standards, and to assure cross-domain and cross-institutional access to heritage information in a context-rich, structured environment. The methods used to realise these objectives are knowledge dissemination, best practice and standardisation. The Association promotes open standards like XML, OAI and Dublin Core (qualified).
The DEN has approximately 60 member institutions, among them most of the large heritage institutions of The Netherlands. It provides access to the databases of the member organisations through the portal http://www.cultuurwijzer.nl. The Cultuurwijzer (‘culture pointer’) to the collections uses the Aqua Browser to search for terms in a non-hierarchical, associative way. Databases can also be accessed through subject fields based on the Dublin Core standard. Research is being carried out into applying the Art and Architecture Thesaurus in a post-coordinative way, using it as an additional search aid.
Kersen: ‘The Dublin Core has some drawbacks, but it is one of the few international standards available for exchange of information. Mapped to the 5 Ws (who, where, what, when and why), it turns out to be a nice tool for interoperability across the databases of heritage institutions. Of course, we have to accept a certain kind of fuzziness and lack of precision compared with domain-specific access at the institutional level.’
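How such a mapping supports interoperability can be sketched in a few lines of Python. The grouping of unqualified Dublin Core elements under each of the 5 Ws shown here is an illustrative assumption, not the mapping used by the DEN, and the two records are hypothetical.

```python
# One possible assignment of unqualified Dublin Core elements to the 5 Ws.
# This grouping is an illustrative assumption, not the mapping used by DEN.
FIVE_WS = {
    "who": ["creator", "contributor", "publisher"],
    "where": ["coverage"],
    "what": ["title", "subject", "type"],
    "when": ["date"],
    "why": ["description"],
}


def to_five_ws(record):
    """Group the values of a Dublin Core record (a dict of lists) by question."""
    view = {}
    for question, elements in FIVE_WS.items():
        values = []
        for element in elements:
            values.extend(record.get(element, []))
        view[question] = values
    return view


# Two hypothetical records from different institutions, described differently.
museum_record = {"title": ["Salome"], "creator": ["Unknown painter"], "date": ["1520"]}
archive_record = {"subject": ["Salome"], "coverage": ["Florence"], "date": ["1520"]}

for rec in (museum_record, archive_record):
    print(to_five_ws(rec))
```

Both records answer the question ‘what’ with the same value even though one institution recorded it as a title and the other as a subject. That is the gain in interoperability, and also the fuzziness Kersen mentions: the distinction between title and subject is lost in the shared view.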
According to Kersen a real Semantic Web is still a long way off. ‘We simply do not have the tools yet for a meaningful exchange and representation of information. XML and RDF do not provide the interoperability that is needed. On the other hand, I do not believe in developing a fundamental ontology to give meaning to information on the Net. It looks to me like the 18th-century endeavour to write an encyclopaedia that contains all the knowledge in the world. I am afraid it does not work that way. A lot of knowledge, even scientific knowledge, cannot be described in a logical way. Especially in the arts, a lot of “knowledge” is the result of heuristics and associative thinking. Apart from that, there is the practical problem that cultural heritage institutions do not have the money and the staff to describe their collections anew in a way that fits the ontology. That is just too much work.’
Kersen thinks the Semantic Web will grow gradually from the grass roots level onward. Within the Dutch Digital Heritage Association she can point to several initiatives. Considered on their own they might not seem like much, i.e. in relation to the gargantuan task of developing a Semantic Web, but when they are combined a certain pattern begins to emerge.
There is, for instance, a project, SitecheckerRDF, which will look into ways in which RDF can be used for describing content for Web-based delivery. Furthermore, several standardisation projects are running that enable the participating institutions to develop the description of their specific knowledge domain, e.g. graphic domain, religion, art history. The formal and semantic mapping schemes used in these projects will include Dublin Core, Encoded Archival Description (EAD), IMS Learning Resources Meta-data Specification, Art & Architecture Thesaurus, as well as the CIDOC reference model. Kersen: ‘At the moment most of the reference terms are developed at the level of institutions, which means their use is limited to a certain domain. Or, to put it another way, every domain is developing its own dialect. Maybe the development of a combined reference scheme will be a step towards a Semantic Web.’
Another important project is the development of a scheme for description at the collection level, in order to offer clearer and more hierarchical access to heritage collections. This project has its roots in the Dutch project for collection level description, MUSIP (Museum Inventarisation Project). The description scheme will be broadened to make it available for other heritage institutions as well, for example archives.
The description and results of these projects and the programme lines will be made available on the Web site of the Vereniging DEN, http://www.den.nl. As Cultuurwijzer is used as a proof of concept, the results will be directly accessible at http://www.cultuurwijzer.nl. (Kersen kindly invites interested parties to put questions directly to their organisation.)
Projects are always carried out in co-operation with the member organisations. Kersen: ‘We first try it ourselves until we are sure it works; a “proof of concept”, you could say. These tests are overseen by a small group of automation experts working for our member organisations. The next step is to test the method on a larger scale with some of our member organisations. A larger working group oversees these tests. Only then is the method released to our member organisations. The advantage is that smaller member organisations can ride on the experience of the larger ones. By taking a step-by-step approach we also enhance the level of commitment. You could say we are providing some order in the information chaos that exists on the Internet. A few small steps along the long road towards the Semantic Web.’
Vereniging Digitaal Erfgoed Nederland: http://www.den.nl
Cultuurwijzer: http://www.cultuurwijzer.nl
DIGICULT’S EXPERT 13 TANGLE WITH THE SEMANTIC WEB
By Michael Steemson
Genesis – The Creation Birds and Fishes
It was really very kind of the Moderator. He asked the cultural heritage experts: ‘What is the message that we should give our scientific writer, who will do a write-up of this meeting for the Thematic Issue?’ Their inclinations had not always been clear. But the question focused minds and they were certain now.
‘I would put my money on the Semantic Web,’ said two of them, not quite in unison. ‘The Semantic Web is a direction; it is like North. You go North but you never arrive and say “here it is”. This is the Semantic Web,’ said another.
The course of the debate at the Darmstadt DigiCULT Forum had not always been so direct. It had started with the Position Paper’s dismaying thought that ‘the limited understanding of information processing in the heritage sector almost makes the Semantic Web an impossibility to apply’.
It had touched on the semantics of Simeon poetry, art works of the biblical seductress Salome, weather forecasts for the northern English city of York, and the revolutionary theories of 16th-century Italian astronomer Galileo. But the experts agreed finally that the cultural heritage sector needed the Semantic Web, and that a good deal of education and guidance would be required to make it appreciate that need.
The experts numbered 13, lucky for some, at this, the third DigiCULT Forum of the European Union’s technology watchdog for cultural and scientific heritage institutions. In the previous 12 months other forum groups had discussed authenticity and integrity for digitisation programmes and, later, digital asset management systems. Now the Darmstadt 13 - historians, language and information technology scientists, academics and publishers - were looking even further down the information autobahn to the vision of WWW inventor, Englishman Tim Berners-Lee, who sees a new kind of automated Web that learns and understands each user’s particular requirements and delivers complete, reliable, tested information sets.
In a co-authored May 2001 Scientific American article¹, Mr Berners-Lee imagines a family facing the horrors of re-scheduling its lives around a mother’s unexpected illness. The sons and daughters rely on Semantic Web ‘agents’, small executable Web files, to search online medical records, hospital bed lists, transport timetables, doctors’ appointment books, road condition reports and home diaries to find treatment, plan travel and re-arrange personal engagements to fit the emergency.
The vision requires huge world-wide investment in time and effort, creating countless ‘ontologies’ containing, perhaps, XML (eXtensible Mark-up Language) and RDF (Resource Description Framework) data to which the electronic ‘agents’ could refer for understanding before applying to specially formatted Web pages for the information.
The Berners-Lee et al. dazzling forecast is: ‘The Semantic Web will enable machines to comprehend semantic documents and data, not human speech and writings. Properly designed, the Semantic Web can assist the evolution of human knowledge as a whole.’
THE DAZZLING PROSPECTS
Dazzling it is, and the Darmstadt 13 were attracted. But they were not blinded. Moderator Dr Seamus Ross, the Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII), suggested that while the current WWW content got a lot of ‘bang’ for its development dollars, the Semantic Web needed huge, expensive content before it could work well.
Application of the Berners-Lee ideas to cultural heritage use was a long way off, he thought, and wondered: ‘Is there enough benefit from the Semantic Web in the near term to make it a realisable dream 50 years down the road?’
Italian National Research Council Applied Ontologies Laboratory director Nicola Guarino has been working on the subject for 12 years and he knows the difficulties. He said: ‘This is the ideal view which Tim Berners-Lee has: machines which work for you, your proxy which works for you, performing these dynamic connections for the Web which preserve meaning. It is pretty ambitious, but this is his idea. I would be happier if, rather than using an automatic proxy, we could just let people establish these dynamic connections using their brain and the Web. This is already something that is not easily done.’
Austria’s Wernher Behrendt had encountered other snags. At Salzburg Research, the secretariat for the DigiCULT Forums, he co-ordinates another European Commission IST project, CULTOS (Cultural Units of Learning - Tools and Services). He conceded: ‘There is a 50-year research vision behind the issue of the Semantic Web,’ and went on: ‘but there are incremental steps that, with good utility, can be built in a reasonable time. One of the intellectual challenges is to break the vision into these manageable steps.’
LANGUAGE REPRESENTATION HITCHES
A CULTOS group had, Behrendt explained, taken one of these incremental steps and built an ontology² for digitised works of art. It had encountered problems with language representation, like: ‘Are there tools to support knowledge representation language? Are the users then actually able to work usefully with that? Can we incorporate the multimedia authoring component, where people who have not built the ontology themselves should then use it to combine multimedia assets with each other?’
1 T. Berners-Lee, J. Hendler, O. Lassila: The Semantic Web. In: Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html
2 Ontology n. 1. Philosophy. The branch of metaphysics that deals with the nature of being. 2. Logic. The set of entities presupposed by a theory. Collins English Dictionary, Third edition, Glasgow 1991.
Cultural Units of Learning – Tools and Services (CULTOS)
CULTOS is an RTD project co-funded by the European Commission under the Information Society Technologies (IST) Programme, which will run until October 2003. The application domain of CULTOS is intertextual studies in literature and arts. The project is developing a multimedia authoring and presentation environment that allows scholars to make the different relationships between cultural works explicit in a way that approximates contextualisation in interpretative processes. The result of the authoring processes is multimedia objects called ‘intertextual cultural threads’. These are based on ‘EMMOs’, a novel type of structured multimedia object containing expert knowledge, conforming to current and emerging standards such as XML, SMIL (with interactive extensions), MPEG-7 and RDF.
http://www.cultos.org
The Dutch had doubts, too. Dr Janneke van Kersen is an art historian with Digital Heritage Netherlands (Digitaal Erfgoed Nederland, http://www.den.nl), where an XML-based content management system is being combined with a Resource Description Framework (RDF) to join databases from several cultural heritage institutions. She told the experts: ‘I need to be assured that we will be able to build a layered structure that is equally applicable to each knowledge domain. Furthermore, I think that the cultural heritage sector is too much of a niche market to develop this.’
Her countryman, Dr Frank Nack from the national research institute for mathematics and computer science, CWI (Centrum voor Wiskunde en Informatica, http://www.cwi.nl) in Amsterdam, works with a multimedia and human-computer interaction group. His concern: ‘Our group believes in the Semantic Web, but we needed some mechanisms to structure the information so that various groups can work with it. What we came up with was the belief that you can classify the user at a particular time. But that was simply not good enough.’
The group had found that users change their requirements widely, and these shifts were invisible to a system. ‘Humans can look at material one day, and the next day they look at the same stuff differently and describe it differently because they are in a different mood,’ he said. He characterised the problem as: ‘Now I would like to see something for my work, and now I want to be entertained. Which means I would like to access the information differently.’
He had one other worry: Webised mixed media. He said: ‘This discussion has been heavily linguistic based, which I can understand because most people do still think of the Web as text driven. The issue of describing various media items that are not text will, I think, very soon become important for the Semantic Web. We had better start thinking about that, too.’
THE GALILEO CONUNDRUM
The Institute and Museum of Science History in Florence, Italy, has tried to create an ontology around the works and sciences of its city’s famous son, the revolutionary astronomer, mathematician and physicist Galileo Galilei (1564-1642)³. But it ran into difficulties when it came to the radical changes in theory that he created.
Institute relational database expert Andrea Scotti told the Forum: ‘Galileo’s scientific theory negates another scientific theory. This negating or development of theories was very difficult to represent in the ontology when dealing with the time factor. That is central to historical documentation, but representational time was not part of the process available to us.’
Dr Costis Dallas, the Athens chairman of the European communication and technology group Critical Publics, thought the Florence museum’s project was very ambitious. The time argument was difficult because ‘of course it isn’t possible to represent time properly within a relational database’. But there were mechanisms - he mentioned software by the Virginia, US, IT group Telos⁴ - that better represented issues of time.
Italian National Research Council’s Nicola Guarino chipped in: ‘The CIDOC⁵ reference models have partial answers to these questions.’
Dr Dallas went on: ‘I do not believe that the whole exercise is futile, but we found that in practice you cannot make a subject language for everybody. It has to be for a community of users. If you provide them with a richer representation, for instance if they can know that this is a person and this person lived in a place, then users will have a much richer experience.’
3 The museum and Web site are rich resources for the life and work of Galileo: http://galileo.imss.firenze.it
4 Telos Corporation, Ashburn, VA, US: http://www.telos.com
5 CIDOC: International Committee for Documentation of the International Council of Museums (ICOM-CIDOC), httpwwwwillpowerinfomybycoukcidocCIDOCe. Forum for documentation interests of museums and related organisations; one of 25 international committees of the International Council of Museums (ICOM).
CAPITALS AND ACRONYMS
The Forum moved on to a discussion of the science and software behind the Semantic Web. Moderator Seamus Ross started by questioning whether ‘key thinkers’ were ‘missing a fundamental point: that Web pages are dead, that database-driven Web pages are the future, that people are going to stop making Web pages and make databases’.
He recalled that James A. Hendler, co-author with Tim Berners-Lee of the Semantic Web paper and a professor in the Department of Computer Science at the University of Maryland, had recommended semantic representation as part of any Web pages. Dr Ross commented: ‘This notion of the Semantic Web is not going to work with these databases. Is this true or not?’
Bert Degenhart-Drenth, the managing director of Netherlands ADLIB Information Systems (http://www.nl.adlibsoft.com), thought it was true. But it was a practical problem that would be solved by projects like the Web Services of the Open Archives Initiative (OAI).
Open Archives Initiative (OAI) Protocol for Metadata Harvesting
The Open Archives Initiative (OAI) is a mainly US-based group of people and organisations that evolved out of a need to increase access to scholarly publications through interoperable digital repositories. Support for the OAI’s goals comes from the Digital Library Federation, the Coalition for Networked Information and from an NSF Grant. One of its major achievements is an application-independent interoperability framework based on metadata harvesting, the OAI Protocol for Metadata Harvesting.
The OAI Protocol is based on the Web standards HTTP and XML, and employs Dublin Core (unqualified) as metadata standard. Heritage organisations who have systems that support the OAI Protocol can expose metadata about the content in their repository, i.e. allow service providers to harvest the data for services such as search engines. In the OAI Protocol the XML schema is used at two levels: to define the format of responses to all OAI Protocol requests, and to define the format of metadata streams embedded in the GetRecord and ListRecords responses. In both cases the goal is to provide a mechanism for data validation.
http://www.openarchives.org
The OAI-PMH version 2.0, released in June 2002, can be found at http://www.openarchives.org/OAI/2.0/openarchivesprotocol.htm
See also: John Perkins, A New Way of Making Cultural Information Resources Visible on the Web: Museums and the Open Archives Initiative. Museums and the Web Conference 2001, http://www.archimuse.com/mw2001/papers/perkins
Open Archives Forum Project
http://www.oaforum.org
The Open Archives Forum Project is a two-year Fifth Framework Programme IST accompanying measure that will run until September 2003. The Forum is building a Web-based database on OAI-related projects, software implementations and services, and supports the information exchange between OAI user communities.
Their surveys provide good insight into the status of uptake of the OAI in Europe. For an overview of the results see: S. Dobratz, B. Matthaei, Open Archives Activities and Experiences in Europe: An Overview by the Open Archives Forum. In: D-Lib Magazine, Vol. 9, No. 1, January 2003, http://www.dlib.org/dlib/january03/dobratz/01dobratz.html
Recently one of their workshops, ‘Providing Access to Hidden Resources’ (Lisbon, December 2002), targeted the libraries and archives communities. Requirements, standards, best practice and solutions to interoperability problems of these communities were analysed and compared with the features provided by the OAI Protocol for Metadata Harvesting. The Tutorial ‘OAI and OAI-PMH for Beginners’ and other presentations can be found at http://www.oaforum.org/workshops/lisb_programme.php
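The request-and-response cycle of the OAI Protocol described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a complete harvester: the repository address and the sample record are hypothetical, the sample response is trimmed (a real one also carries elements such as the response date and the echoed request), and no network call is made. The verb `ListRecords`, the `oai_dc` metadata prefix and the XML namespaces are those defined by OAI-PMH 2.0 and unqualified Dublin Core.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespaces defined by OAI-PMH 2.0 and unqualified Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build a ListRecords request for a repository's OAI endpoint."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return base_url + "?" + query


def titles(response_xml):
    """Extract (identifier, title) pairs from a ListRecords response."""
    root = ET.fromstring(response_xml)
    pairs = []
    for record in root.iterfind(".//oai:record", NS):
        identifier = record.findtext("oai:header/oai:identifier", namespaces=NS)
        title = record.findtext(".//dc:title", namespaces=NS)
        pairs.append((identifier, title))
    return pairs


# A trimmed, hypothetical response of the kind a heritage repository might return.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:museum.example.org:42</identifier>
        <datestamp>2003-01-15</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Salome with the head of John the Baptist</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

print(list_records_url("http://museum.example.org/oai"))
print(titles(SAMPLE))
```

A service provider such as a search engine would issue the request over HTTP and feed each response to a function like `titles`. The repository itself may well be database-driven, which is why harvesting was seen as a practical answer to the database problem raised by the Moderator.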
Several members spoke of difficulties created by dynamic Web pages generated from ASP databases. Paul Miller, the UK Interoperability Focus for the University of Bath’s UKOLN (formerly the UK Office for Library Networking) project, said the UK’s Office of the e-Envoy⁶, the agency in charge of Britain’s e-Government programme, drove it with a Web site called Govtalk⁷. ‘One big ASP database,’ he called it. It carried the Government’s interoperability standard framework with a URL of ‘a great big long string, something something dot ASP. That is how this key piece of legislation is referred to, and next week it will be something else.’
Dr van Kersen thought the new Web Ontology Language, with the transposed acronym OWL, would help. Italian ontology researcher Nicola Guarino discussed the European Commission’s On-To-Knowledge project, its RDF tool Sesame⁸ and the Ontology Inference Layer (OIL) language.
Dr Ross described the US Defense Department’s DARPA (Defense Advanced Research Projects Agency) Agent Mark-up Language (DAML) programme and its DAML+OIL variant. Mr Degenhart-Drenth highlighted the importance of protocols used in Web Services such as SOAP (Simple Object Access Protocol) and UDDI (Universal Description, Discovery and Integration), and also pointed to a SPECTRUM XML standard for museums that he helped write. Then there was the moviemakers’ audio-visual search standard MPEG-7, and the SMIL (Synchronized Multimedia Integration Language). No one mentioned SHOE (Simple HTML Ontology Extensions), which was surprising as the discussion became more and more alphabetic and upper case.
ONTOLOGY TUTORIAL IN 800 WORDS
Nicola Guarino brought the discussion back on track. The ontology expert delivered a fascinating, impromptu 800-word dissertation on ontology genetics.
Ontologies, he said, started because it was realised that controlled vocabularies, which worked well enough for limited periods, needed something extra to make them really useful. They needed clarification of intended meaning. This could be achieved in much the same way as dictionaries did it, by reference to other, more basic terms. This was, he said, the key point.
‘Ontologies can work if the basic terms are really used in a principled way. There is a hidden assumption here that it is indeed possible to express the meaning in terms of a relatively small set of primitive terms.’
He explained further: ‘There are general terms that have a universal meaning. Take the term “part”, for instance, or “set”. Or take temporal relations: “before”. Suppose I have two different periods, the Renaissance and another period, and suppose I say that this period comes “before” the other one: do you exclude the case whereby the two periods overlap, or not? This is just a matter of stipulation; this is a general term that is not domain specific. I can simply stipulate exactly whether the “before” relationship between the two intervals includes the case of overlapping or not. And I can do that by means of axioms. Once you clarify the meaning of the basic terms (topological relations, mereological⁹ relations, dependence relations, these kinds of things), then you have the basic vocabulary that helps you to introduce more domain-related things. And this is what people are doing in the area of what are called “Foundational Ontologies”, and this is what I am doing. I believe this is the only way to solve the problem of semantic interoperability. So, not just controlled vocabularies, but vocabularies that are formally defined in minimal terms.’
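The stipulation Mr Guarino describes can be made tangible with a small Python sketch. It states two alternative definitions of ‘before’ for time intervals, one excluding overlap and one permitting it, and shows that the same pair of periods gives different answers under each. The `Period` class, the two function names and the dates are assumptions made for illustration; the dates in particular are rough and not scholarly datings, and the second period is a hypothetical stand-in for the ‘other period’ of the quotation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Period:
    name: str
    start: int  # first year of the period
    end: int    # last year of the period

    def __post_init__(self):
        if self.start > self.end:
            raise ValueError("a period cannot end before it starts")


def before_strict(a, b):
    """Stipulation 1: a is before b only if a has ended when b starts."""
    return a.end < b.start


def before_allowing_overlap(a, b):
    """Stipulation 2: a is before b if it starts and ends earlier, overlap or not."""
    return a.start < b.start and a.end < b.end


# Dates are rough, illustrative assumptions, not scholarly datings.
renaissance = Period("Renaissance", 1400, 1600)
baroque = Period("Baroque", 1580, 1750)

print(before_strict(renaissance, baroque))            # False
print(before_allowing_overlap(renaissance, baroque))  # True
```

Two databases that both record ‘Renaissance before Baroque’ are only interoperable if they agree on which of the two definitions they mean. Writing the definition down formally, as an axiom rather than a Python function, is what distinguishes an ontology from a controlled vocabulary in this account.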
Now it was clear, but would it be available to heritage institutions? Dr Ross wanted to know if fundamental ontologies of use to the heritage sector already existed. What would they be? Before any development could begin, basic classifications and methodologies would be required to form a foundation for the work.
6 Office of the e-Envoy: http://www.e-envoy.gov.uk
7 Govtalk: http://www.govtalk.gov.uk
8 Sesame environment: httpwwwontoknowledgeorg toolsfactsheetSesamehtml
9 Mereology n. The formal study of the logical properties of the relation of part and whole. Collins English Dictionary, Third edition, Glasgow 1991.
On-To-Knowledge
On-To-Knowledge is an IST RTD project that was completed in June 2002. The project developed tools and methods for supporting knowledge management in large and distributed organisations. The technical backbone of On-To-Knowledge was the use of ontologies for the various tasks of information integration and mediation. For the project’s many results see their tools repository, project deliverables and publications at http://www.ontoknowledge.org/pubs.html
See also the On-To-Knowledge book ‘Towards the Semantic Web: Ontology-driven Knowledge Management’, J. Davies, D. Fensel, F. van Harmelen (eds), John Wiley, December 2002.
Italian online publisher Marco Meli, a manager of the EU’s MESMUSES (Metaphors for Science Museums) project, insisted: ‘You need a clear definition in this particular domain. What are the key words, the terms?’
Mr Guarino answered: ‘The key concepts and the key relations.’
THE ELUSIVE SEMANTIC GRAIL
By now the Darmstadt 13 were beginning to realise they were not having much luck with their search for the Semantic Grail. Their discussion became a little tetchy.
Someone talked about ‘meta-ontology’ and another growled: ‘The term “meta” is abused.’
‘Outdated technical optimism’ was mentioned.
‘What is your alternative?’ someone else wanted to know.
‘I don’t have one.’
‘Is this the way forward?’
‘What part of the problem would that solve?’
‘We will know when we have tried that out.’
‘Here we are approaching a scary field.’
MESMUSES - Metaphors for Science Museums
MESMUSES is an IST RTD project that will run until July 2003. It aims at designing a general method and supporting tools to produce knowledge maps for use in self-learning environments of science museums. In the project a knowledge map is defined as a set of related concepts and facts that is offered to learners with some guidance or suggestions on possible itineraries that they may follow to explore the knowledge space.
The method and tools developed in MESMUSES are being tested and validated by two large science museums, the Cité des Sciences et de l’Industrie in Paris and the Istituto e Museo di Storia della Scienza in Florence, which provide access to their digital catalogues. Both museums are developing knowledge maps and itineraries on different themes in Biology (Genome) and Physics (Galileo and the laws of motion).
Project Web site: http://cweb.inria.fr/Projects/Mesmuses
See also: M. Meli, Knowledge Management: a new challenge for science museums. In: Cultivate Interactive, Issue 9, 7 February 2003, http://www.cultivate-int.org/issue9/mesmuses
Civility and peace were restored as Frank Nack, the CWI Netherlands scientist, introduced the thought: ‘There are ontologies for art, and they are very old and well crafted. There are very clear rules about why they did what they did, because they have worked on them for a thousand years. What you could suggest is that we strip down to the basics for one field, say art, and apply it to all the other fields we have in cultural heritage (architecture, film, whatever), all working with very different substances.’
Seamus Ross added: ‘So we need one fundamental ontology on which we can build all the others.’
THE LUCK CHANGES
The luck of the 13 was beginning to change. The Athenian heritage informatics expert, Dr Dallas, described work among his company clients on developing ‘an upper ontology’. All the issues the Forum had been discussing (what to do about time, basic concepts, process, agents and so on) were being examined. They were beginning to develop ‘something very much like a thesaurus’, with term expansion that created sub-categories of relationships. He called it ‘generic layering’, a process that could identify the ‘generic grammar’ of relationships within a specific domain - art history, for example.
‘This is useful,’ he said. ‘This way we can create Web systems that present an association of content for users that is meaningful to them. Let’s say, a “cultural” meaning.’
Nicola Guarino went further. He believed that the International Council of Museums’ Committee for Documentation (CIDOC) Conceptual Reference Model (CRM) was the ‘best starting point’ for the heritage community. The CIDOC CRM is the result of 10 years’ work by a standards working group. The model is under review for adoption as an International Organization for Standardization (ISO) publication.
Mr Guarino was enthusiastic: ‘I am not biased on this. I am a reviewer on the CHIOS project that supports this proposal for an ISO standard, and I am amazed by the fact that this standard is more or less principled. On the one hand, the authors really strive to get these principled things and, on the other hand, they have extensively accounted for existing practice. It is the result of a large community of work. It is not perfect, but it really is a starting point for this community.’
He affirmed Dr Ross’s delighted question: ‘So this is an ontology we can borrow?’
And the Italian expert had more to add: ‘The question before was “how can we be sure that the principled things can really solve our problems?” I do not have a crisp answer, but I do have some evidence that even a tiny result on the foundational side has a high pay-off. So you do not need to solve all the foundational issues.’
The distinctions between an object and its role, or between individuals and classes of items, were delicate but, once understood, could lead to significant data improvement, he said, adding: ‘Take, for instance, the distinction between object and event. This is only one tiny distinction, but it is so fundamental that, once you understand it, you can save time in developing your own application ontology. Tiny conceptual progress does have a high pay-off. This is why I believe it is useful.’
There were still one or two doubts, but CULTOS project co-ordinator Wernher Behrendt tidied up with a daring stance: ‘Let me be a heretic for a second. How many of us have an operating system other than Windows? What I am saying is that standardisation often helps. Even if it is not the best standard, it does help get people working together. It is perfectly fair to define a standard for the world now. There will be a lot of discussion, but it will wind down to a few constructs. It is a better method of getting an ontology accepted than having ontologies mushrooming all over the place that must then be integrated.’
The Darmstadt 13 were pleased. They had a model. They had ‘Web Services’ stepping stones to work across. They weren’t going to fall into the trap of insisting just yet that the Semantic Web was important for the heritage sector, but they wanted an education process for unconvinced curators.
They needed someone to make the first ontology move. Seamus Ross suggested asking the J. Paul Getty Trust (http://www.getty.edu). They needed automated tools for testing ontologies. They did not want to be delivered into the entertainment industry, but saw benefit in what University of Florence Associate Professor Franco Niccolucci called ‘cultural entertainment’.
So where would the experts put their money, asked Mr Behrendt: ‘On there not being any benefits in the Semantic Web for the cultural heritage sector, or there being some benefits in building such things, whatever they may be?’
Amsterdamer Frank Nack was in no doubt: ‘It is going to happen. It will probably look very different from how we imagine it right now, but it is going to happen.’ His countryman, ADLIB chief Bert Degenhart-Drenth, thought so too: ‘We have put our money there. All our applications work with XML.’ And Dr van Kersen agreed: ‘I would put my money on the Semantic Web.’
That seemed to make it game, set and match.
CIDOC CRM: ‘The Semantic Glue’
The ‘CIDOC object-oriented Conceptual Reference Model’ (CIDOC CRM) was developed by the ICOM/CIDOC Documentation Standards Group. Since September 2000, the CIDOC CRM has been under development as an ISO standard.
‘The CIDOC CRM is intended to promote a shared understanding of cultural heritage information by providing a common and extensible semantic framework to which any cultural heritage information can be mapped. It is intended to be a common language for domain experts and implementers to formulate requirements for information systems and to serve as a guide for good practice in conceptual modelling. In this way, it can provide the semantic glue needed to mediate between different sources of cultural heritage information, such as that published by museums, libraries and archives.’
http://cidoc.ics.forth.gr/what_is_crm.html

CHIOS - Cultural Heritage Interchange Ontology Standardization project
Since June 2001, the work of the CIDOC CRM Special Interest Group has been supported by CHIOS, a two-year project which receives funding from the Fifth Framework IST Programme. The CHIOS consortium forms an integral part of the CIDOC CRM Special Interest Group which, by organising shared meetings, represents the interests and requirements of the cultural heritage community to the ISO Working Group (ISO/TC46/SC4/WG9).
http://cidoc.ics.forth.gr/chios_iso.html
Genesis – The Creation: Stars and Fishes

SEMANTIC WEB TERMS AND READING LIST A-X
Compiled by Guntram Geser and Michael Steemson

In the Summary we have provided information boxes on projects, but not on the many Semantic Web standards, technologies, etc. mentioned in the Forum discussion. The projects included are only a small fraction of many ongoing activities and are related mainly to the cultural and scientific heritage community. Links to many important Semantic Web development projects can be found at their community portal, http://www.semanticweb.org
The following guide points to resources and readings on terms mentioned in the Forum Summary. It is not intended to provide a comprehensive list of Semantic Web materials. Rather, it represents different entry points and levels to this topic.

Annotation & Authoring
‘How to annotate and have fun’, http://annotation.semanticweb.org/faq/faq2, and other papers. See also http://annotation.semanticweb.org/tools, Institute for Applied Informatics and Formal Description Methods, University of Karlsruhe, Germany.
The Annotea Project, a ‘Live Early Adoption and Demonstration (LEAD)’ project of the World Wide Web Consortium (W3C): collaboration environment with shared annotations, www.w3.org/2001/Annotea

CIDOC CRM
The CIDOC Conceptual Reference Model was an important reference point in the Forum discussion (see also the information box on the CIDOC CRM in the Summary). CIDOC, the Comité International pour la Documentation, is part of the International Council of Museums (ICOM). Its Web site can be found at http://cidoc.ics.forth.gr
A report on CIDOC’s work on the CRM is provided in ‘The CIDOC Conceptual Reference Model: A Standard for Communicating Cultural Contents’ by Nick Crofts, Martin Doerr and Tony Gill, in Cultivate Interactive, Issue 9, February 2003, http://www.cultivate-int.org/issue9/chios
See also their tutorials at http://cidoc.ics.forth.gr/tutorials.html

DAML - DARPA Agent Markup Language
The Defense Advanced Research Projects Agency (DARPA) is the central research and development organisation for the US Department of Defense. Its DAML programme is developing a language and tools to facilitate Semantic Web concepts, http://www.daml.org
‘Why Use DAML?’, Adam Pease, white paper, Teknowledge, 10 April 2002, http://www.daml.org/2002/04/why.html

DAML+OIL
DAML+OIL is a product of the DARPA Joint United States/European Union ad hoc Agent Markup Language Committee. The committee created a language with the best features of SHOE, DAML, OIL and several other markup approaches. It is a Web ontology language (latest release March 2001) expected to provide a basis for future Web standards for ontologies. See http://www.daml.org/2001/03/daml+oil-index.html
OIL - Ontology Inference Layer
OIL is a proposal for a Web-based representation and inference layer for ontologies. It uses a layered approach to defining a standard ontology language. Each layer adds functionality and complexity to the previous layer. This is done such that machines that can only process a lower layer can still partially understand high-level ontologies. See http://www.ontoknowledge.org/oil
A white paper on OIL functions: ‘An informal description of Standard OIL and Instance OIL’ by the On-To-Knowledge group, led by the Department of Computer Science, University of Manchester, UK, 28 November 2000, www.ontoknowledge.org/oil/downl/oil-whitepaper.pdf

Ontologies
Ontology research: Laboratory for Applied Ontology (LOA), Institute of Cognitive Sciences and Technologies (ISTC), http://www.ladseb.pd.cnr.it/infor/ontology/ontology.html
Dieter Fensel, Ontologies: A Silver Bullet for Knowledge Management and Electronic Commerce. New York: Springer, 2001.
‘Ontology Infrastructure for the Semantic Web’, includes detailed subject bibliography, WonderWeb Project, Department of Computer Science, Victoria University of Manchester, UK, http://wonderweb.semanticweb.org/deliverables/documents/D15.pdf
Standard Upper Ontology (SUO): An upper ontology for data interoperability, information search and retrieval, automated inferencing and natural language processing. IEEE Standard Upper Ontology (SUO) Working Group, http://suo.ieee.org
OWL - Web Ontology Language
The Web Ontology Language is a semantic markup language for publishing and sharing ontologies on the World Wide Web. OWL is developed as a vocabulary extension of the Resource Description Framework (RDF) and is derived from the DAML+OIL Web Ontology Language. For the development of this language see the documents of the Web Ontology (WebOnt) Working Group, http://www.w3.org/2001/sw/WebOnt
See also OWL Web Ontology Language Reference (W3C Working Draft, 31 March 2003) at http://www.w3.org/TR/owl-ref and OWL Web Ontology Language Guide (W3C Working Draft, 31 March 2003), http://www.w3.org/TR/owl-guide
For an understanding of the goals, requirements and usage scenarios for a Web ontology language, see ‘Web Ontology Language (OWL) Use Cases and Requirements’ (W3C Working Draft, 31 March 2003), http://www.w3.org/TR/webont-req

RDF - Resource Description Framework
See Cultural Heritage Semantic Web Example & Primer, pp. 32-34
Semantic Web
‘The Semantic Web’, Tim Berners-Lee with James Hendler and Ora Lassila, Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html
‘Enhanced Science and the Semantic Web’, J.A. Hendler, in Science magazine, Volume 299, Number 5606, 24 January 2003, pp. 520-521
‘Peer-to-Peer: The Infrastructure for the Semantic Web’, Stanford University. The Semantic Web as the next evolutionary step of the Internet, http://p2p.semanticweb.org
The Semantic Web Community Portal, SemanticWeb.org, currently operated by three research groups: the Onto-Agents and Scalable Knowledge Composition (SKC) Research Group at Stanford University, the Ontobroker-Group at the University of Karlsruhe, Germany, and the Protégé Research Group at Stanford University, http://www.semanticweb.org
The W3C Semantic Web Activity Statement explains the consortium’s plans in the areas of enabling standards (driven by the RDF Core and Web Ontology Working Groups), education and outreach (RDF Interest Group), as well as co-ordination and advanced development, http://www.w3.org/2001/sw/Activity
See also Kim Veltman’s warning of what he sees to be too narrow a definition of the Semantic Web, one that will not allow the historical dimension, the richness of cultural expression, the unique and the diversity of interpretations to be adequately dealt with. Cf. K. Veltman, Challenges for a Semantic Web (July 2002), http://www.cultivate-int.org/issue7/semanticweb

SHOE – Simple HTML Ontology Extensions
SHOE was one of the first ontology-based markup languages developed for use on the World Wide Web. It is a small extension to HTML that allows Web page authors to annotate their Web documents with machine-readable knowledge. See http://www.cs.umd.edu/projects/plus/SHOE
SMIL - Multimedia on the Web
The Synchronized Multimedia Integration Language (SMIL, pronounced ‘smile’) enables authoring of interactive audiovisual presentations. SMIL is typically used for ‘rich media’/multimedia presentations which integrate streaming audio and video with images, text or any other media type. SMIL is an HTML-like language and may be written using a simple text editor. W3C Synchronized Multimedia, http://www.w3.org/AudioVideo

SPECTRUM-XML DTD
SPECTRUM (Standard Procedures for Collections Recording Used in Museums) was created by the mda (http://www.mda.org.uk). It is a guide to good practice for museum documentation that describes procedures for documenting objects and the processes they undergo, as well as the necessary information that needs to be recorded to support the procedures. For SPECTRUM, an XML Document Type Definition has been produced which serves as a system-neutral interchange format for museum data.
‘SPECTRUM: The UK Museum Documentation Standard’ is available in its second edition, see http://www.mda.org.uk/spectrum.htm
For a description of the creation, structure and deployment of the SPECTRUM-XML DTD see Bert Degenhart-Drenth, Building on the mda SPECTRUM-XML DTD for Collections Management Data Interchange, Museums and the Web Conference 2001, http://www.archimuse.com/mw2001/papers/degenhart/degenhart.html
Web Services
Although the concept of Web Services is not a new one, the definition of Web Services standards allows the wider use of these services. Currently, three main protocols are used in the context of Web Services: UDDI (Universal Description, Discovery and Integration), a registry system to find resources and Web Services; WSDL (Web Services Description Language), an interface description language; and SOAP (Simple Object Access Protocol), the communication protocol for Web Services. All three protocols are based on XML.
For a description of the many ways in which XML can enhance Web Services, see http://www.w3.org/2002/ws/Activity

World Wide Web
Looking back as well as into the future: ‘Weaving the Web: The Original Design and Ultimate Destiny of the World Wide Web by Its Inventor’, Tim Berners-Lee with Mark Fischetti, Harper San Francisco, 1999

XML – eXtensible Markup Language
See Cultural Heritage Semantic Web Example & Primer, pp. 27-30
For the work done at W3C within the XML activity see XML Working Groups, http://www.w3.org/XML; the XML Specifications can be found at http://www.w3.org/XML/Core
‘The bane of my existence is doing things that I know the computer could do for me.’ – Dan Connolly, The XML Revolution, October 1998, http://www.nature.com/nature/webmatters/xml
SEMANTIC WEB SHOULD BE BASED ON WELL-FOUNDED ONTOLOGIES
AN INTERVIEW WITH NICOLA GUARINO, LABORATORY FOR APPLIED ONTOLOGY, ITALY
By Joost van Kasteren

‘The Semantic Web, as it is advocated by people like Tim Berners-Lee and James Hendler, does not take enough advantage of the experience built up in knowledge engineering and conceptual modelling. There is this anarchistic idea of the Web as a place where everyone can do his or her own thing. I have no problem with that; a lot of people are able to find what they want on the Web. But if you want real interoperability, with search engines that can grasp the intended meaning of information, that approach falls short. To create a real Semantic Web, we have to develop and use well-founded generic ontologies, based on linguistics and logics.’
Nicola Guarino has clear views on the Semantic Web and its development. He is a senior researcher at the Institute of Cognitive Sciences and Technologies in Italy, where he leads the Laboratory for Applied Ontology. Since 1991, he has played an active role in the Artificial Intelligence community in promoting the interdisciplinary study of ontological foundations of knowledge engineering and conceptual modelling. Guarino: ‘In our Laboratory the focus is on content and not so much on representation. The use of ontologies is unavoidable when referring to content. People do it implicitly all the time when they are communicating and trying to understand each other. If we want machines to understand each other, in other words real interoperability, we need to make these ontologies explicit in an unambiguous way.’
An ontology is a hierarchical description of the relations between concepts in a certain domain, plus an unambiguous description of the concepts themselves. As they are created for a certain domain, ontologies often fail to be interoperable because of the ambiguity that results from the use of the same terms for different concepts (and vice versa) between different domains. The term ‘net’, for instance, has quite a different meaning for Web designers and fishermen. That is why there is a need for well-founded generic ontologies. An example of a notion that a generic ontology has to pin down is the term ‘part’, which can have different meanings both within a domain and between domains. For instance, the violist plays a part in the orchestra. His finger is part of him. Can his finger be part of the orchestra? According to Guarino, this is a genuine ontological problem that can only be solved by giving an unambiguous meaning to the term ‘part’.
Another example cited by Guarino is the term ‘in’. What exactly are you describing when you say the spoon is in the cup? Does it mean that the spoon is totally embedded in the cup, or is it only partly in the cup? Guarino: ‘These examples seem trivial, but if you want real interoperability between different knowledge domains you will have to prevent the problems that come with the ambiguity of day-to-day language.’
In this respect, Guarino thinks it is a drawback that computer science curricula scarcely ever contain an introduction to ontological foundations of conceptual modelling. ‘Students learn all about Java, HTML and C++, and name all the other languages, and they also learn how to use these. But when they graduate they hardly know a thing about formal ontology. I really think people should know more about the work on ontology that has been done in philosophy. It is certainly not much harder to acquire than, say, studying differential equations or learning how to use Java.’
Developing well-founded generic ontologies seems an enormous job, but it is not as enormous a task as it appears. Guarino: ‘I would say that a few dozen would get you on the way nicely. But you have to take the fundamental route. At the moment, development of the Semantic Web is driven by the need for short-term results. Hence, interoperability is realised by putting the right tags on the information. That is not what I call semantics; that is syntax. XML and RDF are very useful for this, but they fall short when you want to create a real Semantic Web.’
Laboratory for Applied Ontology: http://ontology.ip.rm.cnr.it
A CULTURAL HERITAGE SEMANTIC WEB EXAMPLE & PRIMER
By Guntram Geser

The Semantic Web, according to a statement of its well-known advocate Tim Berners-Lee, is a vision of ‘a distributed machine which should function so as to perform socially useful tasks’.¹ This machine should allow intelligent software agents to understand semantic relationships between Web resources in order to seek relevant information and perform transactions for humans.
Contrasted with the existing human-readable Web, the Semantic Web is envisaged as a Web of machine-readable data that will be based on ‘languages for expressing information in a machine processable form’.² Key to an understanding of the Semantic Web, therefore, is how these languages function: how information is expressed in order that computers can automatically process Web sources and assist in making the Web more useful for humans. The aim of this chapter is to provide an overview of the Semantic Web concept by describing its general architecture, i.e. the interplay of its languages.
The chapter has two interrelated parts. Part 1 describes a Finnish project that strives to build the foundations for the “Finnish Museums on the Semantic Web” (FMS), a future semantic museum portal. This part consists of the information boxes on the following pages, which briefly describe the necessary elements and steps in the set-up of the FMS system. It is recommended to start by reading this description (see also graphic 3 on page 36, which provides an overview of the set-up of the FMS system). It should be helpful in gaining a general understanding of how semantic interoperability of, and new ways of interacting with, semantically marked-up cultural heritage information can be realised.
Part 2, the texts below the information boxes, is a primer that explains terms used in part 1 which represent core elements of the Semantic Web architecture, as well as providing illustrative examples. The explanations are not intended to give in-depth definitions of these elements; such definitions are provided in the relevant W3C specifications. The examples have been kept as simple as possible, but build on each other. In this way we will develop a (fictitious) Website, http://www.m-i.org, that provides semantically enhanced access to such marvellous medieval images as the ones we have used to illustrate this Thematic Issue.

How to Make Collection Metadata of Museums Semantically Interoperable on the Web – The “Finnish Museums on the Semantic Web” (FMS)
The Semantic Web concept is visionary, and there are dedicated people, also in the heritage sector, who are trying to make it a reality. In our example, a group of researchers and technology developers who work at the University of Helsinki and the Helsinki Institute for Information Technology are translating the Semantic Web vision for a future semantic museum portal.
The group’s two-year project will run until spring 2004 and is being carried out in co-operation with, and with funding from, major organisations including Espoo City Museum, Helsinki University Museum, National Board of Antiquities, Nokia, TietoEnator and the National Technology Agency (TEKES).
The major goals of the project are to make collection metadata which stem from heterogeneous databases semantically interoperable on the Web, and to provide facilities for semantic browsing and searching in the combined knowledge base of the participating museums.
The project’s vision is called “Finnish Museums on the Semantic Web” (FMS), and its architecture allows for all Finnish museums to join in. However, in an approach of starting small but ambitious, the project is at present using the collection databases of two museums, the Espoo City Museum and the National Museum of Finland. Furthermore, the implementation is currently restricted to only one part of the collections - textiles.
In order to reach the FMS system’s goal of making the museums’ metadata semantically interoperable on the Web, the data must be harmonised on the syntactic and semantic level. For this harmonisation, the eXtensible Markup Language (XML) and the Resource Description Framework (RDF) are being used, of which RDF is the key language for achieving semantic interoperability of the heterogeneous sets of metadata.

1 Tim Berners-Lee, Interpretation and Semantics on the Semantic Web (1998), http://www.w3.org/DesignIssues/Interpretation.html
2 Tim Berners-Lee, Semantic Web Road Map (1998), http://www.w3.org/DesignIssues/Semantic.html
3 Robert DuCharme, http://lists.xml.org/archives/xml-dev/200211/msg00190.html
RDF and Metadata - A Natural Fit
An observer of the diffusion of the Resource Description Framework (RDF) into various domains, Robert DuCharme, has commented: ‘I still find it a little ironic that while RDF has gotten so much publicity as a technology for warm and fuzzy AI (Artificial Intelligence) pie-in-the-sky technology, it’s gotten most of its traction in the mundane world of metadata.’³
Yet, given the importance of metadata for the Semantic Web vision in general, it does not come as a surprise that metadata of key information communities belong to the first of RDF’s intended uses. RDF seems to gain momentum in particular among the library and other communities that use Dublin Core.
The current W3C RDF Primer (Working Draft, 23 January 2003), edited by Frank Manola and Eric Miller, labels RDF as ‘an ideal representation for Dublin Core information’ and describes Dublin Core as one of their ‘RDF in the field’ examples (cf. http://www.w3.org/TR/rdf-primer).
At the Dublin Core Metadata Initiative (DCMI), ‘Expressing Simple Dublin Core in RDF/XML’ was announced as a DCMI Recommendation in October 2002, ‘the first in a series of recommendations for encoding Dublin Core metadata using mainstream Web technologies’, i.e. XML/RDF/XHTML. ‘Expressing Qualified Dublin Core in RDF/XML’ is currently a Proposed Recommendation. Cf. http://dublincore.org/groups/architecture
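To make the pairing of RDF and Dublin Core more concrete, the following minimal sketch writes a Simple Dublin Core description in RDF/XML and reads it back as subject-property-value statements. It is an illustration only: the resource address under the fictitious http://www.m-i.org site and the choice of the three Dublin Core elements are assumptions, and the small helper handles just this flat form of RDF/XML, not the full syntax.

```python
import xml.etree.ElementTree as ET

# Namespace names of RDF and of the Dublin Core element set.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"

# Hypothetical record; the rdf:about address is an assumption for illustration.
RECORD = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://www.m-i.org/images/image5kb78d38i">
    <dc:title>Eve emerges from Adam's body</dc:title>
    <dc:creator>Alexander Master</dc:creator>
    <dc:date>circa 1430</dc:date>
  </rdf:Description>
</rdf:RDF>"""


def statements(rdf_xml):
    """Return (subject, property, value) tuples for flat rdf:Description blocks."""
    root = ET.fromstring(rdf_xml)
    result = []
    for description in root.findall(f"{{{RDF}}}Description"):
        subject = description.get(f"{{{RDF}}}about")
        for prop in description:
            # ElementTree writes tags as "{namespace}local"; joining both parts
            # gives the full address of the property.
            property_uri = prop.tag[1:].replace("}", "")
            result.append((subject, property_uri, prop.text))
    return result


if __name__ == "__main__":
    for statement in statements(RECORD):
        print(statement)
```

Each tuple reads as a small sentence: the image (subject) has a creator (property) named Alexander Master (value). Because the property is identified by a shared Dublin Core address rather than by a locally invented tag, two museums using it are describing the same thing.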
Syntactic Transformation 1: Creating the XML Documents
In the FMS system, the eXtensible Markup Language (XML) is used as the data transfer format. This transfer format enables the system to make use of the data originally stored in the museums’ heterogeneous collection databases. Therefore, each museum participating in the FMS initiative provides the relevant collection data as an XML document repository.
In a process of syntactic harmonisation, the data from a museum’s collection database are retrieved and transformed to an XML format conforming to the XML Schema of the FMS initiative.
The data to be published are read from the database through a ‘view’, which helps create the XML format. The view is a queryable interface, a virtual table that results from an SQL query which may join multiple tables of the database. Through the view, the data are queried so that the rows of the tables are grouped by collection items. For each item, the set of rows is combined into a single XML document.
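The database-rows-to-XML step can be sketched roughly as follows. This is not the FMS implementation: the two table names, the in-memory SQLite database and the helper functions are assumptions made for illustration. Only the view columns and the sample values are taken from the example used in this chapter.

```python
import sqlite3
import xml.etree.ElementTree as ET
from itertools import groupby

MI = "http://www.m-i.org/images"      # namespace of the fictitious m-i.org site
ET.register_namespace("mi", MI)
FIELDS = ["type", "subject", "iconclass", "creator", "manuscript", "place", "year"]


def build_database():
    """Create a tiny in-memory database with two tables and a joining view."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE item (item_id TEXT, type TEXT, subject TEXT, iconclass TEXT,
                           manuscript TEXT, place TEXT, year TEXT);
        CREATE TABLE maker (item_id TEXT, creator TEXT);
        INSERT INTO item VALUES ('image5kb78d38i', 'Column Miniature',
            'Eve emerges from Adam''s body', '71A3421', 'Historic Bible',
            'Utrecht', 'circa 1430');
        INSERT INTO maker VALUES ('image5kb78d38i', 'Alexander Master');
        CREATE VIEW item_view AS
            SELECT i.item_id, i.type, i.subject, i.iconclass, m.creator,
                   i.manuscript, i.place, i.year
            FROM item i JOIN maker m ON m.item_id = i.item_id;
    """)
    return db


def items_to_xml(db):
    """Group the view rows by item and build one XML document per item."""
    rows = db.execute("SELECT * FROM item_view ORDER BY item_id").fetchall()
    documents = {}
    for item_id, group in groupby(rows, key=lambda row: row[0]):
        group = list(group)
        root = ET.Element(f"{{{MI}}}image", {"image_id": item_id})
        for position, field in enumerate(FIELDS, start=1):
            values = []
            for row in group:          # keep each distinct value once, in order
                if row[position] not in values:
                    values.append(row[position])
            for value in values:
                ET.SubElement(root, f"{{{MI}}}{field}").text = value
        documents[item_id] = ET.tostring(root, encoding="unicode")
    return documents


if __name__ == "__main__":
    print(items_to_xml(build_database())["image5kb78d38i"])
```

The join in the view may return several rows for one item (for example, one per creator); grouping by item_id is what turns that set of rows into a single document.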
XML
XML is a markup language for describing data. It is a language created to allow anyone to design the structure of their own documents. An XML document contains text that consists of markup in the form of tags and plain text between them, the latter being just pure information (for example, <creator>Alexander Master</creator>). XML tags are not predefined; everyone can define his or her own tags.
FMS Documents
A short presentation is provided in Eero Hyvönen et al., Cultural Semantic Interoperability on the Web: Case Finnish Museums Online, http://iswc2002.semanticweb.org/posters/hyvonen_a4.pdf; for detailed descriptions see Vilho Raatikka, Eero Hyvönen, Ontology-based Semantic Metadata Validation, and Hyvönen, Eero et al., Semantic Interoperability on the Web: Case Finnish Museums Online. Both texts can be found in Towards the Semantic Web and Web Services: Proceedings of the XML Finland 2002 Conference, http://www.cs.helsinki.fi/u/eahyvone/xmlfinland2002/ProceedingsXML2002-final.pdf

ITEM_VIEW (columns of the database view shown in Graphic 1): item_id, type, subject, iconclass, creator, manuscript, place, year
XML shares the syntax and bracketed tags of the well-known HyperText Markup Language (HTML), but XML serves a different goal. While HTML is used to define the layout of pages on the WWW, XML is used to define the content of documents, for example to specify that an area of text is the name of a creator.
XML allows for creating markup (e.g. <creator>) that seems to carry some semantics. However, for a computer, a tag like <creator> carries as much semantics as a tag like <H1>. A computer simply does not know what a creator is and how the concept creator is related to other concepts (e.g. manuscript). For an XML processor, <H1> and <creator> or <manuscript> are all equally (and totally) meaningless. XML is all about describing data; on its own it does not do anything. There needs to be a processing program that uses the markup to interpret the various elements.
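The point that tags are meaningless to the processor itself can be tried out with any XML parser. In the sketch below, which uses the parser from Python's standard library, both tags come back as plain character strings; whatever meaning they have is supplied by the (hypothetical) lookup table of the processing program.

```python
import xml.etree.ElementTree as ET


def describe(xml_text):
    """Parse one element and return what the parser knows: a name and a text."""
    element = ET.fromstring(xml_text)
    return element.tag, element.text


# Hypothetical application knowledge; the parser itself has none of this.
LABELS = {"creator": "Made by", "H1": "Heading"}


def render(xml_text):
    """A 'processing program' that interprets the markup via its own table."""
    tag, text = describe(xml_text)
    return f"{LABELS.get(tag, tag)}: {text}"


if __name__ == "__main__":
    print(describe("<creator>Alexander Master</creator>"))
    print(describe("<H1>Alexander Master</H1>"))
    print(render("<creator>Alexander Master</creator>"))
```

The parser treats the two inputs in exactly the same way. Only the render function, written by someone who knows what a creator is, distinguishes them.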
The graphic below illustrates the database rows to XML process as described in the info box. It provides a very simple example of an XML document that describes some data for one of the medieval column miniatures from the Koninklijke Bibliotheek, The Hague, which we were permitted to use for illustrating this Thematic Issue. It includes the Iconclass classification for this image: 71A3421, Eve emerges from Adam’s body (for the hierarchical path of this classification see the section on ontologies).
Short explanations for the XML document (image5kb78d38i.xml) shown in the graphic below:
A well-formed XML document is one that conforms to the XML syntax rules, of which we would like to highlight the following:
(1) The document should begin with the XML declaration, which defines the XML version and the character encoding used in the document. In the example below we use <?xml version="1.0" encoding="ISO-8859-1"?>, i.e. the document conforms to the 1.0 specification of XML and uses the ISO-8859-1 (Latin-1/West European) character set.
(2a, 10) The XML document must contain a single tag pair to define a root element, in our example <mi:image> ... </mi:image>.
Graphic 1: Database rows to XML process

Database rows (item data from database rows, grouped by item):
'image5kb78d38i', 'Column Miniature', 'Eve emerges from Adam's Body', '71A3421', 'Alexander Master', 'Historic Bible', 'Utrecht', 'circa 1430'

XML document (image5kb78d38i.xml):
(1)  <?xml version="1.0" encoding="ISO-8859-1"?>
(2)  <mi:image xmlns:mi="http://www.m-i.org/images"
         image_id="image5kb78d38i">
(3)    <mi:type>Column Miniature</mi:type>
(4)    <mi:subject>Eve emerges from Adam's body</mi:subject>
(5)    <mi:iconclass>71A3421</mi:iconclass>
(6)    <mi:creator>Alexander Master</mi:creator>
(7)    <mi:manuscript>Historic Bible</mi:manuscript>
(8)    <mi:place>Utrecht</mi:place>
(9)    <mi:year>circa 1430</mi:year>
(10) </mi:image>
(2b) Namespace: Since element names in XML are not fixed, name conflicts can occur when different documents use the same names to describe different types of elements. To prevent such conflicts, a unique namespace should be defined using a Uniform Resource Identifier (URI). An XML namespace is a collection of names that are used as element types and attribute names (cf. http://www.w3.org/TR/REC-xml-names). Our namespace declaration in the start tag of the root element is xmlns:mi="http://www.m-i.org/images".
The namespace prefix mi (for medieval images) functions as a placeholder for the namespace name. It needs to show up in all element tags (e.g. <mi:type> ... </mi:type>).
(2c) image_id="image5kb78d38i": This is the id attribute, which contains the unique identifier of the data source record.
(3-9) All other elements must be within the root element and can themselves have sub-elements (child elements), which must be properly nested within their parent element. Our elements do not have sub-elements.
Other syntax rules are, for example: all start tags must match their end tags and, because XML tags are case sensitive (i.e. the tag <mi:Creator> is different from the tag <mi:creator>), they must also be written with the same case; all elements must have a closing tag; all attribute values must be within quotation marks (e.g. "image5kb78d38i").
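These rules can be checked mechanically, since any XML parser rejects a document that breaks them. The following sketch uses the parser in Python's standard library on a shortened version of the example document; the three broken variants are constructed here purely for illustration.

```python
import xml.etree.ElementTree as ET

# Shortened version of image5kb78d38i.xml with a single child element.
GOOD = """<?xml version="1.0" encoding="ISO-8859-1"?>
<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">
  <mi:creator>Alexander Master</mi:creator>
</mi:image>"""

# Each variant breaks exactly one of the syntax rules listed above.
BAD_CASE = GOOD.replace("</mi:creator>", "</mi:Creator>")      # case mismatch
BAD_QUOTES = GOOD.replace('image_id="image5kb78d38i"',
                          "image_id=image5kb78d38i")            # unquoted value
BAD_CLOSING = GOOD.replace("</mi:image>", "")                   # no closing tag


def is_well_formed(text):
    """Return True if the parser accepts the document, False otherwise."""
    try:
        # Encode as declared in the XML declaration before handing it over.
        ET.fromstring(text.encode("iso-8859-1"))
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    for name, doc in [("good", GOOD), ("case", BAD_CASE),
                      ("quotes", BAD_QUOTES), ("closing", BAD_CLOSING)]:
        print(name, is_well_formed(doc))
```

Well-formedness only says that the syntax is correct. Whether the document contains the right elements in the right order is a separate question, answered by validating it against a schema.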
Syntactic Transformation 2: The XML Schema
In order to allow for syntactic harmonisation, the XML documents of the museums should conform to the XML Schema of the FMS initiative. Therefore, the museums validate the XML documents they create against the initiative’s XML Schema. If the documents are valid, the process can continue to the semantic level.

XML Schema
The XML Schema defines the building blocks of an XML document, including:
- elements and attributes that can appear in a document
- which elements are child elements, as well as their order and number
- whether an element is empty or can include text
- the data types for elements and attributes
- as well as default and fixed values for elements and attributes
XML with an XML Schema is designed to be self-descriptive. One of the greatest strengths of XML Schema is that it allows for data typing. The most common data types are xs:string, xs:decimal, xs:integer, xs:boolean, xs:date and xs:time. In the example below, which is the XML Schema for the XML document (image5kb78d38i.xml) shown in graphic 1, we only use the data type xs:string. This data type is used for values that contain character strings.
(1)  <?xml version="1.0"?>
(2a) <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
(2b)     targetNamespace="http://www.m-i.org/images"
(2c)     xmlns="http://www.m-i.org/images"
(2d)     elementFormDefault="qualified">
(3)    <xs:element name="image">
(4)      <xs:complexType>
(5)        <xs:sequence>
(6)          <xs:element name="type" type="xs:string"/>
(7)          <xs:element name="subject" type="xs:string"/>
(8)          <xs:element name="iconclass" type="xs:string"/>
(9)          <xs:element name="creator" type="xs:string"/>
(10)         <xs:element name="manuscript" type="xs:string"/>
(11)         <xs:element name="place" type="xs:string"/>
(12)         <xs:element name="year" type="xs:string"/>
(13)       </xs:sequence>
(14)       <xs:attribute name="image_id" type="xs:string" use="required"/>
(15)     </xs:complexType>
(16)   </xs:element>
(17) </xs:schema>
Short explanations
(1) The XML declaration, which states that the document conforms to the 1.0 specification of XML.
(2a) Determines that the elements and data types that are used to construct the schema come from the W3C's XML Schema namespace. Consequently, each of the elements and data types in the schema has the prefix xs:, which identifies them as belonging to the vocabulary of the XML Schema language rather than the vocabulary of our (fictitious) organisation M-i.org.
(2b) Indicates that the elements defined by this schema come from our http://www.m-i.org/images namespace.
(2c) Is our default namespace.
(2d) Demands that any elements used by an XML document which were declared in this schema must be namespace qualified.
(3, 16) The parent element for the data typing of the image descriptions we provide at http://www.m-i.org/images. It is (4) defined as an xs:complexType, i.e. it contains child elements (6-12), which are (5, 13) surrounded by an xs:sequence element that defines an ordered sequence of these elements.
(6-12) The child elements, which in our example are simple types because they do not contain other elements. They define various elements of our XML documents, e.g. "type", to be of the data type xs:string.
(14) Furthermore, the Schema determines that for the element xs:element name="image" there is a required attribute image_id of the data type xs:string (for example image5kb78d38i).
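The structural rules explained above (an ordered sequence of child elements plus a required attribute) can be illustrated with a small sketch. It is not a real XML Schema validator: it only reads the sequence and the required attributes out of the schema and compares an instance document against them, using Python's standard library. The instance document and its placeholder values are assumptions made for illustration; only the element names, the image_id and the values 'Miniature', '71A3421' and 'Alexander Master' come from the primer.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"
MI = "{http://www.m-i.org/images}"

SCHEMA = """<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
    targetNamespace="http://www.m-i.org/images"
    xmlns="http://www.m-i.org/images"
    elementFormDefault="qualified">
  <xs:element name="image">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="type" type="xs:string"/>
        <xs:element name="subject" type="xs:string"/>
        <xs:element name="iconclass" type="xs:string"/>
        <xs:element name="creator" type="xs:string"/>
        <xs:element name="manuscript" type="xs:string"/>
        <xs:element name="place" type="xs:string"/>
        <xs:element name="year" type="xs:string"/>
      </xs:sequence>
      <xs:attribute name="image_id" type="xs:string" use="required"/>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

# Hypothetical instance document; "placeholder" values are not from the primer.
DOCUMENT = """<image xmlns="http://www.m-i.org/images" image_id="image5kb78d38i">
  <type>Miniature</type>
  <subject>placeholder</subject>
  <iconclass>71A3421</iconclass>
  <creator>Alexander Master</creator>
  <manuscript>placeholder</manuscript>
  <place>placeholder</place>
  <year>placeholder</year>
</image>"""


def expected_structure(schema_text):
    """Read the declared child sequence and required attributes of <image>."""
    root = ET.fromstring(schema_text)
    ctype = root.find(f"{XS}element[@name='image']").find(f"{XS}complexType")
    children = [e.get("name") for e in ctype.find(f"{XS}sequence")]
    required = [a.get("name") for a in ctype.findall(f"{XS}attribute")
                if a.get("use") == "required"]
    return children, required


def check_document(doc_text, schema_text):
    """Return a list of structural problems (empty if none were found)."""
    children, required = expected_structure(schema_text)
    doc = ET.fromstring(doc_text)
    problems = []
    if doc.tag != MI + "image":
        problems.append("root element is not a namespace-qualified image")
    if [child.tag for child in doc] != [MI + name for name in children]:
        problems.append("child elements do not match the declared sequence")
    for attr in required:
        if attr not in doc.attrib:
            problems.append(f"missing required attribute {attr}")
    return problems


print(check_document(DOCUMENT, SCHEMA))
```

In practice a full validator would be used for this step; the sketch only makes visible what 'conforming to the Schema' means for this particular example.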
Major Benefits of XML
In the context of the Semantic Web, XML provides an interoperable syntactical foundation upon which solutions to the issues of representing relationships and meaning can be built. We also want to highlight the many benefits of XML that are adding to its rapid uptake in the first place, and might in the longer term be supportive in realising the Semantic Web vision on a broader scale.
XML is one of the most important standards developments in recent years. It is an international, universal, non-system and non-application specific data exchange standard. XML is international because it employs Unicode. This means that there is no restriction to the western alphabet; Arabic, Chinese, Greek, Hebrew, Thai, etc. can be easily integrated.
XML is non-system specific because it is an open standard set by the World Wide Web Consortium (W3C). As such, there is no owner of XML. All the major software suppliers support it; it can be used on any computing platform, from Windows and MacOS to Linux. This makes it easier for organisations to change systems or combine different systems.
XML is also non-application specific, i.e. it can be used in various applications such as data exchange, data harvesting, Web site management, etc. XML is gaining ever-wider acceptance in many application domains, including in particular the cultural heritage community.
Bear in mind also that the major collection management software producers have implemented support for XML in their systems, enabling, for example, the integration of data from different collections and their combination over the Web.
XML allows for multi-channel publishing, i.e. with XML it is easy to produce different products or services from digital cultural heritage assets. Once the data are structured in XML, they can be displayed across a variety of media using an associated style sheet that contains the display information.
Finally, XML can be used to create new languages. For example, the Wireless Markup Language (WML), which is used to mark up Internet applications for handheld devices, is written in XML.
Ontological concepts
The goal of the FMS project is to make metadata of the museums' textiles collections semantically interoperable on the Web. In order to achieve such interoperability, an ontology is being designed that describes the common (lower-level) ontological concepts in this domain of knowledge. [4]
Ontology
In the Semantic Web architecture, the semantic relationships are not embedded but explicitly represented by an ontology, or rather an interrelated set of ontologies. In fact, the wide array of information residing on the Web and the perceived need to make it more machine-processable have acted as a strong impetus for the development of ontology languages.
Yet what is an ontology? An abstract definition of an ontology is that it describes a formal, shared conceptualisation of a particular domain of interest, for example cultural heritage objects held in art museums. In particular, an ontology allows for constraining, expressing and analysing the intended meaning of the shared vocabulary of concepts and relations in a domain of knowledge. [5]
If these concepts and relations are formalised to a high degree, the domain has at hand a major building block for developing semantically aware information systems. With Semantic Web technologies, the domain ontology can be made available on the network, cross-referenced with upper-level and other domain ontologies, [6] and remote applications or intelligent software agents can refer to it when they interact to provide a certain information service.
[4] The available documents (in English) on the FMS initiative state that their ontology is being created using RDF Schema (RDFS). To develop a fully fledged ontology, advanced languages such as DAML+OIL or Web Ontology Language (OWL) would be required.
[5] For more elaborate and formal descriptions see: Tom Gruber, What is an Ontology? (1995), http://www-ksl.stanford.edu/kst/what-is-an-ontology.html; Nicola Guarino, Ontology-Driven Conceptual Modelling, part 1-3 (2002), http://ontology.ip.rm.cnr.it/Tutorials
The degree of formalisation of concepts and their relations varies considerably between different domains of knowledge. At the lower end, one finds lexicons and simple taxonomies (i.e. an ordered classification system where terms are related hierarchically). At the middle level, one might place thesauri, i.e. controlled vocabularies that are structured to show relationships between terms and concepts and, for example, allow for retrieving them from a database. At the high end of formalisation of knowledge, there are axiomatised logic theories. Such theories include rules to ensure the well-formedness and logical validity of statements expressed in the language of the scientific discipline.
In the cultural heritage sector, a powerful IT-supported example of a hierarchical classification system is Iconclass. This supports the documentation of images, in particular art historical images, by providing a systematic collection of 28,000 ready-made definitions of objects, persons, events, situations and abstract ideas that can be the subject of an image. The definitions consist of an alphanumeric classification code and its textual correlate. [7]
For example, for the image we have described in XML in graphic 1, the Iconclass definition is: 71A3421 Eve emerges from Adam's body. The Medieval Illustrated Manuscripts Website of the Koninklijke Bibliotheek, The Hague, has an Iconclass Browser in place [8] that provides the hierarchical path for this concept in the classification system.
An outstanding example of a controlled vocabulary is the Art & Architecture Thesaurus (AAT), one of the Getty Research Institute's Vocabulary Databases. It is a structured vocabulary of more than 125,000 terms and other information about concepts that are used for describing fine art, architecture, decorative arts, archival materials and material culture. [9]
The W3C's Resource Description Framework (RDF) Schema Specification 1.0, in a section on its scope, mentions concept navigation and states: 'Thesauri and library classification schemes are well known examples of hierarchical systems for representing subject taxonomies in terms of the relationships between named concepts. The RDF Schema specification provides sufficient resources for creating RDF models that represent the logical structure of thesauri (and other library classification systems).' [10]
Yet for realising a full-blown cultural heritage ontology for the Semantic Web, there are currently limitations on both sides. On the one hand, hierarchical classification systems and structured vocabularies do not lend themselves easily to rich inter-linking of conceptual 'trees'.
A major step further in this direction is the CIDOC object-oriented Conceptual Reference Model (CRM). [11] This provides an ontology of 81 classes and 130 properties, which describes in a formal language concepts and relations relevant to the documentation of cultural heritage.
On the other hand, RDF Schema has limitations when it comes to expressing complex ontological relationships. New languages based on description logics are being developed. These include DAML+OIL and the upcoming Web Ontology Language (OWL), which are capable of fully describing ontologies.
Also worth highlighting is that tools for ontology building are proliferating at the present time. These ontology editors need to be carefully assessed, as their capabilities differ considerably. [12]
7 Bible
  71 Old Testament
    71A Genesis: from the creation to the expulsion from paradise, and later years of Adam and Eve
      71A3 creation of man; the Garden of Eden (Genesis 1:26-2)
        71A34 creation of Eve
          71A342 Eve is fashioned from Adam's rib
            71A3421 Eve emerges from Adam's body
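The hierarchical path shown above follows directly from the notation: for this example, each broader concept is a prefix of the code. A minimal sketch in Python, assuming plain notations like 71A3421 only (Iconclass also has bracketed text, keys and other extensions, which are deliberately not handled here):

```python
# Labels copied from the hierarchy above.
LABELS = {
    "7": "Bible",
    "71": "Old Testament",
    "71A": "Genesis: from the creation to the expulsion from paradise, "
           "and later years of Adam and Eve",
    "71A3": "creation of man; the Garden of Eden (Genesis 1:26-2)",
    "71A34": "creation of Eve",
    "71A342": "Eve is fashioned from Adam's rib",
    "71A3421": "Eve emerges from Adam's body",
}


def iconclass_path(notation):
    """Return the notation's ancestors, broadest first, ending with itself."""
    return [notation[:i] for i in range(1, len(notation) + 1)]


for depth, code in enumerate(iconclass_path("71A3421")):
    print("  " * depth + code, LABELS.get(code, "?"))
```

This is why a simple taxonomy is easy to browse but says little beyond 'broader' and 'narrower': the only relationship encoded in the notation is the position in the tree.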
[6] Upper-level ontologies describe the basic concepts and relationships invoked when information about any domain is expressed in natural language.
[7] For in-depth information see the official Iconclass Website: http://www.iconclass.nl
[8] Medieval Illustrated Manuscripts Website: http://www.kb.nl/kb/manuscripts/browser The subject access system for the Website was conceived by Mnemosyne Partners, building on the Iconclass classification system and technologies. See the valuable information they provide at http://www.mnemosyne.org/business/mss/tempex.html
[9] A detailed description of the AAT is provided at http://www.getty.edu/research/tools/vocabulary/aat/about.html
[10] http://www.w3.org/TR/2000/CR-rdf-schema-20000327
[11] See the information box at the end of the Forum discussion and the sources mentioned in the Semantic Web Terms and Reading List.
[12] Michael Denny, Ontology Building: A Survey of Editing Tools (06-11-2002), http://www.xml.com/pub/a/2002/11/06/ontologies.html
Semantic Transformation 1: The RDF Data Model
Progressing towards semantic interoperability, the metadata in the XML documents are now transformed into RDF statements corresponding to the RDF data model. With these so-called RDF 'triples', the XML metadata elements are mapped to the RDF classes and properties which are defined by the RDF Schema of the FMS initiative (see section 3).
'XML is nothing more than a way to standardize data formats. This is not to underplay XML's importance. A data-format standard makes all of the more glamorous technologies possible, and RDF is the leading example of the benefit that comes once the data format has been standardized. Many proclaim that RDF is really the XML's killer app, and with good reason. Despite all this, RDF remains somewhat obscure. This is mainly because, at its core, RDF is very abstract, very dry and very academic.'
Uche Ogbuji, An introduction to RDF (2000), http://www-106.ibm.com/developerworks/library/w-rdf/?dwzone=xml
RDF Data Model
In order to make Web resources semantically interoperable, we need resources that provide machine-understandable information about themselves. In the Semantic Web architecture, these statements are built by using the Resource Description Framework (RDF).
RDF defines a data model for the statements describing typed relationships between uniquely identified resources. RDF distinguishes between:
| resources: familiar examples are a Web page, electronic document or digital image, but in RDF also entities that are not 'network retrievable', e.g. museums, curators or bound medieval manuscripts, can be resources
| properties: these identify a specific aspect, characteristic, attribute or relation used to describe the resource
| statements: these associate a value for a named property with the resource
Hence, RDF provides a model for describing relationships between resources in terms of named properties and values. The RDF data model intrinsically supports only binary relations. Its base element is the 'triple', which takes the form of subject, predicate, object: a resource (the subject) is linked to another resource (the object) through an arc labelled with a third resource (the predicate). The semantics of a triple clearly depends on the property used as predicate.
A convenient way to visualise this is to draw nodes for subject and object and an arrow between them for the predicate (see graphic 2). In this labelled directed graph, subject and predicate (property) are Uniform Resource Identifiers (URIs), and the object is either a URI or a literal (which is drawn as a box). Everything in RDF can be represented by a graph with nodes and arcs, and the data model allows for using the same URI as a node and as an arc label. To represent RDF statements in a machine-processable way, RDF builds on XML. With RDF/XML, a specific XML markup language, RDF information can be represented and exchanged between machines.
Graphic 2: RDF Data Model
Subject: http://www.m-i.org/schemas/images#ColumnMiniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Miniature
Subject: http://www.m-i.org/schemas/images#Miniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Image
With these two triples we state that ColumnMiniature is a subclass of Miniature and that Miniature is a subclass of Image.
The predicate in our statements is the rdfs:subClassOf property, which is predefined in the RDF Schema namespace http://www.w3.org/2000/01/rdf-schema#
The classes Image, Miniature and ColumnMiniature would also need to be defined in our RDF Schema namespace http://www.m-i.org/schemas/images#
For details on how to use RDF Schema for defining your domain ontology, see section 3. In a nutshell, RDF Schema is a 'higher-level' language which is itself defined using RDF.
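The two statements of graphic 2 can be written down as plain (subject, predicate, object) tuples. The following Python sketch is only an illustration of the data model, not an RDF library; the '#' separator in the m-i.org namespace and the helper names abbreviate and objects are assumptions made for this example.

```python
MI = "http://www.m-i.org/schemas/images#"      # fictitious namespace of the primer
RDFS = "http://www.w3.org/2000/01/rdf-schema#"

# Each statement is one triple: (subject, predicate, object).
triples = {
    (MI + "ColumnMiniature", RDFS + "subClassOf", MI + "Miniature"),
    (MI + "Miniature", RDFS + "subClassOf", MI + "Image"),
}

PREFIXES = {"mi": MI, "rdfs": RDFS}


def abbreviate(uri):
    """Shorten a full URI to prefix:name form for display."""
    for prefix, namespace in PREFIXES.items():
        if uri.startswith(namespace):
            return f"{prefix}:{uri[len(namespace):]}"
    return uri


def objects(subject, predicate):
    """Follow the arcs labelled `predicate` that leave the node `subject`."""
    return {o for s, p, o in triples if s == subject and p == predicate}


for s, p, o in sorted(triples):
    print(abbreviate(s), abbreviate(p), abbreviate(o))
```

Because a set of triples is just a set of arcs, adding a statement never requires changing a schema of columns or nesting, which is what makes graphs from different sources easy to combine later on.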
Semantic Transformation 2: Creating and Validating the RDF Statements
In the mapping process, an editor tool (in the FMS case, a tool developed by the project team called Meedio) receives as input the XML documents and assists in transforming them into semantically valid RDF statements (instance descriptions). The tool serves as instance editor, which provides a convenient way of finding and selecting, from the XML metadata elements, correct instance values for a particular property. The editor tool also serves as semantic metadata validator. When the museum cataloguer saves the set of RDF statements corresponding to an XML document, the semantics of these statements are validated against the property constraints of the FMS ontology. The result of a successful mapping and validation process is a unique set of RDF triples called the RDF card.
RDF Mapping Rules
When RDF is used to define the meaning of XML metadata elements, a set of mapping rules is created. A mapping rule is a template of RDF triples where XPath expressions are used to identify the actual element values. XPath is a language for addressing parts of an XML document and was designed for use in XML parsing software (XSLT, XPointer and others).
When applying such a rule to an XML document, the XPath expressions are instantiated with matching element values. If the rule matches, the RDF template evaluates to a set of RDF triples where XPath expressions are substituted by the corresponding values of the XML elements.
For example, by applying the template rule
<image5kb78d38i/type, mi:hasCreator, image5kb78d38i/creator>
to the XML document described in the section on XML, the following result would be obtained:
<Image Miniature, mi:hasCreator, 'Alexander Master'>
mi:hasCreator is an example of an RDF property. Such properties are explained in section 3, RDF Schema.
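How such a template is instantiated can be sketched in a few lines of Python. This is not the Meedio tool or its rule format, which the primer does not show; the 'xpath:' marker, the function name apply_rule and the shortened instance document are assumptions for illustration. The path expressions use the limited XPath subset supported by Python's standard library.

```python
import xml.etree.ElementTree as ET

NS = {"m": "http://www.m-i.org/images"}

# Shortened, hypothetical instance document (only the two elements the rule needs).
DOCUMENT = """<image xmlns="http://www.m-i.org/images" image_id="image5kb78d38i">
  <type>Image Miniature</type>
  <creator>Alexander Master</creator>
</image>"""

# Triple template: parts marked "xpath:" are looked up in the XML document,
# all other parts are copied into the triple unchanged.
RULE = ("xpath:m:type", "mi:hasCreator", "xpath:m:creator")


def apply_rule(rule, doc_text):
    """Instantiate a triple template; return None if the rule does not match."""
    root = ET.fromstring(doc_text)
    triple = []
    for part in rule:
        if part.startswith("xpath:"):
            value = root.findtext(part[len("xpath:"):], namespaces=NS)
            if value is None:       # element missing: the rule does not match
                return None
            triple.append(value)
        else:
            triple.append(part)
    return tuple(triple)


print(apply_rule(RULE, DOCUMENT))
```

A rule whose path finds no element yields no triple at all, which mirrors the 'if the rule matches' condition described above.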
Note: Due to the limited space permitted, we do not address issues of term mapping. This is an important aspect of the mapping and validation process carried out in the FMS project. Working with metadata from different museums, they need to deal with partly different terminologies. Their technical solution to synonymous terms (i.e. different terms referring to the same concepts) is to attach synonym sets to the FMS ontology classes. With situations where polysemous terms occur (i.e. the same terms refer to different concepts), the editor tool cannot cope, and the cataloguer needs to select the correct interpretation.
Semantic Transformation 3: The RDF Schema (RDFS)
The shared ontology for the textiles domain is created by using Resource Description Framework Schema (RDFS). An RDF Schema is a tool for indicating the classes of resources one wants to describe, as well as for defining the properties used to describe those resources. Furthermore, class/sub-class relationships and property/sub-property relationships can be defined. The museums are mapping their metadata to the classes and properties defined by the RDF Schema of the FMS initiative. Thereby they are making the meaning of the metadata explicit and representing them in a harmonised, uniform way.
RDF Schema
In section Semantic Transformation 1, we have described the data model provided by RDF for expressing statements about Web resources. But we also need a vocabulary for the RDF statements, namely classes and properties defined with RDF Schema (RDFS).
In brief, the RDF Schema mechanism provides a pre-defined vocabulary, a basic type system that can be used in creating domain-specific schemas. Its role is to allow for declaring metadata properties (e.g. for 'type', 'subject' or 'creator'), to define the classes of resources they may be used with, to restrict possible combinations, and to detect violations of those restrictions.
Defining classes
With RDF Schema (RDFS), Web resources can be defined as instances of one or more classes. In addition, classes can be organised in a hierarchical fashion. As we hold a collection of digital images drawn from illustrated medieval manuscripts, we first need to define a class of things that are images. In RDF Schema, a class is any resource having an rdf:type property whose value is the RDFS-defined resource rdfs:Class.
So, using the basic RDF data model, we define:
mi:Image [resource] rdf:type [property] rdfs:Class [value]
The self-defined prefix mi: (for medieval images) stands for the URI reference of our RDF Schema namespace http://www.m-i.org/schemas/images#
In our image collection we have various special kinds of digitised images, such as column miniatures, decorated initials, schematic drawings, etc. To distinguish, for example, the miniatures, we first need to define a general class Miniature and subclasses of miniatures, e.g. a subclass ColumnMiniature:
mi:Miniature rdf:type rdfs:Class
mi:ColumnMiniature rdf:type rdfs:Class
Secondly, we need to define that mi:ColumnMiniature is a subclass of mi:Miniature and that mi:Miniature is a subclass of mi:Image, for which we use the predefined rdfs:subClassOf property:
mi:Miniature rdfs:subClassOf mi:Image
mi:ColumnMiniature rdfs:subClassOf mi:Miniature
As the rdfs:subClassOf property is transitive, this means that mi:ColumnMiniature is also implicitly a subclass of mi:Image.
Graphic 2 visualises this with the nodes and arcs of the basic RDF data model.
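The transitivity of rdfs:subClassOf can be made concrete with a short Python sketch that follows the subclass arcs until no new class is found. The dictionary layout and the function name superclasses are assumptions for this illustration; the class names are those of the example above.

```python
# Direct rdfs:subClassOf statements from the example, keyed by subclass.
SUB_CLASS_OF = {
    "mi:ColumnMiniature": {"mi:Miniature"},
    "mi:Miniature": {"mi:Image"},
}


def superclasses(cls, sub_class_of):
    """Return every class reachable from cls via rdfs:subClassOf arcs."""
    found, todo = set(), [cls]
    while todo:
        for parent in sub_class_of.get(todo.pop(), ()):
            if parent not in found:     # guards against cycles and repeats
                found.add(parent)
                todo.append(parent)
    return found


print(sorted(superclasses("mi:ColumnMiniature", SUB_CLASS_OF)))
```

Only the two direct statements are stored; the fact that a column miniature is also an image is derived when needed, which is the practical benefit of declaring the hierarchy explicitly.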
Defining properties
In order to make the meaning of our metadata (e.g. 'type') explicit, we need to be capable of declaring specific properties that characterise the classes of things we hold at http://www.m-i.org, e.g. digital images of medieval column miniatures.
Basically, RDF Schema defines properties in terms of the classes of resources to which they apply. This is the role of the rdfs:domain and rdfs:range mechanisms.
rdfs:range
The range constraint defines the class or set of classes whose instances can be values of a particular property. If we want to define the property mi:hasType, we must describe this resource (which we locate at http://www.m-i.org/schemas/images) with an rdf:type property whose value is rdf:Property:
mi:hasType [resource] rdf:type [property] rdf:Property [value]
The following RDF statements indicate that mi:ColumnMiniature is a class, mi:hasType is a property, and RDF statements using the mi:hasType property have instances of mi:ColumnMiniature as values:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:range mi:ColumnMiniature
rdfs:domain
The domain constraint restricts the set of classes whose instances may have a particular property attached to them. If we want to indicate that the property mi:hasType applies to instances of class mi:ColumnMiniature, we would write:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:domain mi:ColumnMiniature
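A hedged sketch of how statements could be checked against such range and domain declarations, following the constraint reading used in this primer (later versions of RDF Schema treat domain and range as a basis for inference rather than as validation rules). The instance identifiers with the ex: prefix and the function name violations are hypothetical.

```python
RDF_TYPE, DOMAIN, RANGE = "rdf:type", "rdfs:domain", "rdfs:range"

# Schema statements taken from the rdfs:range example above.
schema = {
    ("mi:ColumnMiniature", RDF_TYPE, "rdfs:Class"),
    ("mi:hasType", RDF_TYPE, "rdf:Property"),
    ("mi:hasType", RANGE, "mi:ColumnMiniature"),
}

# Hypothetical instance data: ex:value1 is typed, ex:value2 is not.
data = {
    ("ex:record1", "mi:hasType", "ex:value1"),
    ("ex:value1", RDF_TYPE, "mi:ColumnMiniature"),
    ("ex:record2", "mi:hasType", "ex:value2"),
}


def violations(schema, data):
    """List statements whose subject/object lacks the declared class."""
    typed = {(s, o) for s, p, o in data if p == RDF_TYPE}
    problems = []
    for s, p, o in sorted(data):
        for prop, constraint, cls in schema:
            if prop != p:
                continue
            if constraint == DOMAIN and (s, cls) not in typed:
                problems.append((s, p, o, "domain", cls))
            if constraint == RANGE and (o, cls) not in typed:
                problems.append((s, p, o, "range", cls))
    return problems


for problem in violations(schema, data):
    print(problem)
```

The check deliberately ignores subclass relationships; a fuller version would accept an instance of any subclass of the declared class.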
Benefits of RDF
In a SearchWebServices.com definition of RDF, some benefits of RDF are mentioned:
| 'By providing a consistent framework, RDF will encourage the providing of metadata about Internet resources.
| Because RDF will include a standard syntax for describing and querying data, software that exploits metadata will be easier and faster to produce.
| The standard syntax and query capability will allow applications to exchange information more easily.
| Searchers will get more precise results from searching, based on metadata rather than on indexes derived from full text gathering.
| Intelligent software agents will have more precise data to work with.' [13]
This is a well-crafted listing of RDF benefits, from provision and exchange of better metadata to agents working with them, hopefully for the benefit of humans. But, as explicitly stated by SearchWebServices.com, these are only potential benefits, i.e. they depend on the level of actual uptake of RDF.
[13] whatis.com, searchWebServices.com Definitions: Resource Description Framework, http://searchwebservices.techtarget.com/sDefinition/0,,sid26_gci213545,00.html (last updated July 27, 2001)
Generating and Using the Knowledge Space
The RDF cards represent the original XML documents at the semantic level. The union of such RDF cards constitutes a knowledge base, which is a harmonised semantic representation of the underlying heterogeneous databases.
However, so far the RDF instance descriptions have not left the museum. The museum has complete control of the information it wants to publish, and it does not need to allow the FMS system access to its internal database system. The RDF data are placed in a public directory on the museum's WWW server.
The Web crawler of the FMS system harvests the instance descriptions from the different museums, and the system combines them into an RDF repository. This repository is a large semantic graph that consists of the shared ontology and metadata.
How does a user now search and navigate in this knowledge space? In the FMS system this is implemented by a server-side software called Ontogator. Based on the semantic graph, this software dynamically generates semantic linkages for the user's Web browser.
One way of using the FMS system is view-based filtering. The user can select classes of resources from the ontology, and the system finds the instances that match the selected class restrictions. By constraining classes (views) further, the collection instance data searched for are eventually found.
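The two steps just described, combining harvested RDF cards into one repository and then filtering instances by a selected class, can be sketched as follows. This is not Ontogator; the card contents, museum prefixes and function names are hypothetical, and the classes are borrowed from the image example of this primer rather than from the FMS textiles ontology.

```python
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"

ontology = {
    ("mi:ColumnMiniature", SUBCLASS, "mi:Miniature"),
    ("mi:Miniature", SUBCLASS, "mi:Image"),
}

# Hypothetical RDF cards harvested from two museums.
card_a = {("museumA:item1", RDF_TYPE, "mi:ColumnMiniature")}
card_b = {("museumB:item7", RDF_TYPE, "mi:Miniature"),
          ("museumB:item9", RDF_TYPE, "mi:Image")}


def build_repository(ontology, cards):
    """Union of the shared ontology and all harvested cards."""
    repository = set(ontology)
    for card in cards:
        repository |= card
    return repository


def subclasses(cls, repository):
    """cls itself plus all its direct and indirect subclasses."""
    result, todo = {cls}, [cls]
    while todo:
        current = todo.pop()
        for s, p, o in repository:
            if p == SUBCLASS and o == current and s not in result:
                result.add(s)
                todo.append(s)
    return result


def instances_of(cls, repository):
    """View-based filter: instances of the class or any of its subclasses."""
    wanted = subclasses(cls, repository)
    return {s for s, p, o in repository if p == RDF_TYPE and o in wanted}


repo = build_repository(ontology, [card_a, card_b])
print(sorted(instances_of("mi:Miniature", repo)))
```

Selecting the narrower class mi:ColumnMiniature returns fewer instances than mi:Miniature, which corresponds to constraining a view further.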
The software also supports topic-based navigation by providing semantic links between topics of interest, the creation of which is based on the collection domain ontology and the related metadata of the collection records. This means that the links also provide the user with an impression of the wider context and pragmatics of the objects in the museums' collections.
From human users to software agents
As described in the Finnish Museums on the Semantic Web example, the RDF repository is a large semantic graph that consists of the shared ontology and metadata of the participating museums. Such a repository can be queried, and the results, a set of pointers to the relevant resources, can be accessed using Web browsers. The opportunities provided by a system like the one developed by the FMS initiative (e.g. topic-based navigation) are at present restricted primarily to human users.
The Semantic Web vision includes intelligent software agents which 'understand' semantic relationships between Web resources and seek relevant information as well as perform transactions for humans. [14]
This software would be capable of autonomous action, i.e. could run without direct human control or constant supervision, and ideally is very flexible in doing this. Characterisations of this flexibility include actions that are 'reactive', 'proactive' and 'social' (see below).
While the basic idea of agents is very intuitive and appealing, the actual theory is complex, the tools are immature, and the solutions small and prototype-based. In fact, as a parallel distributed systems technology, agents belong to the most complex class of software technology.
However, this primer will conclude with a summary of what an intelligent software agent is and what such software would generally be capable of doing. This should also serve as an indication of how great the challenge for research and technological development is to make the full Semantic Web vision a reality.
Intelligent Software Agents
The following definitions are taken from Michael Wooldridge's introduction to multiagent systems. [15]
Agent
'An agent is a computer system capable of autonomous action in some environment.'
Intelligent agent
'An intelligent agent is a computer system capable of flexible autonomous action in some environment.'
Flexible autonomous action
'By flexible autonomous action we mean reactive, proactive, social.'
| Reactivity: 'A reactive system is one that maintains an ongoing interaction with its environment and responds to changes that occur in it (in time for the response to be useful).'
| Proactiveness: 'An agent serves a purpose and therefore exhibits goal-directed behaviour, including the capacity to recognise opportunities for useful courses of action.'
| Social ability: 'Social ability in agents is the ability to interact with other agents (and possibly humans) via some kind of agent communication language, and perhaps cooperate with others.'
Desirable further properties of agents are:
| Mobility: the ability to move around an electronic network
| Rationality: an agent will act in such a way that it does not prevent itself from achieving its goals (as far as this is possible with a limited set of beliefs representing its world knowledge)
| Learning: an agent will improve its performance over time
[14] T. Berners-Lee, J. Hendler, O. Lassila, Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html
[15] M. Wooldridge, An Introduction to Multiagent Systems, Chichester: Wiley, 2002, and http://www.csc.liv.ac.uk/~mjw/pubs/imas/
Resources
In the primer, no references are made to the documents of the World Wide Web Consortium (W3C). All relevant W3C recommendations can be found at http://www.w3c.org
Of the wealth of introductory materials on XML and RDF available on the Web, the following in particular are useful to consult for further details:
http://www.w3schools.com/xml
http://www.w3schools.com/schema
http://www.w3.org/TR/rdf-primer
K. S. Candan, H. Liu, R. Suvarna: Resource Description Framework: Metadata and its Applications. In: SIGKDD Explorations, Vol. 3(1) (2001), 6-19
Pierre-Antoine Champin: RDF Tutorial (2001), http://www710.univ-lyon1.fr/~champin/rdf-tutorial/rdf-tutorial.html
S. Decker, M. P. Mitra, S. Melnik: Framework for the Semantic Web: An RDF Tutorial, http://www.ida.liu.se/~asmpa/courses/sweb/rdf/rdf_tutorial.pdf
Graphic 3: Set-up of the Finnish Museums on the Semantic Web System
The Finnish Museums on the Semantic Web: Overview of the System's Set-up
| User client: WWW browser (topic-based navigation, view-based filtering)
| Server: Ontogator software and RDF database holding the semantic graph (knowledge space of shared ontology and metadata)
| Web crawler: harvests the RDF instances from the museums
| Museums: collection databases 1 to n, each with metadata editor, XML repository and RDF instances
| Levels: RDF Schema (semantic interoperability), XML Schema (syntactic interoperability), relational schemas (DBMS)
THE DARMSTADT FORUM PARTICIPANTS
Wernher Behrendt, Salzburg Research, Austria
Wernher Behrendt is a Senior Researcher at the Sun Technology and Research Excellence Center at Salzburg Research (Austria), working on multimedia middleware and interoperation issues. He holds an MSc in Cognitive Science from Manchester University and has more than 10 years' experience in near-to-market IT research. From 1989 to 1995 he was a Senior Research Associate in the Informatics Department at Rutherford Appleton Laboratory (UK), working on embedded knowledge based systems in distributed multimedia presentation systems. From 1995 to 1998 he was a Senior Research Associate at Cardiff University (UK), working on interoperation between heterogeneous information systems. Mr Behrendt has held courses in Computer Science and has worked on projects ranging from software engineering methods and quality assurance to legacy system reengineering using migration methods and distributed systems middleware.
E-mail: wernher.behrendt@salzburgresearch.at
Paolo Buonora, Archivio di Stato di Roma, Italy
Paolo Buonora holds a degree in Philosophy from the University of Rome 'La Sapienza' (1976). He worked in the Italian State Archive Administration from 1978, where he was first involved in editing the Guida Generale degli Archivi di Stato italiani. From 1986 he worked in the Soprintendenza archivistica per il Lazio, surveying audiovisual archives and municipal archives; from 1989 to 1991 at the Perugia University, engaged in doctoral research in 'Urban and rural history'; and from 1991 to 1994 again in the Soprintendenza archivistica per il Lazio. After 1994 he worked in the Archivio di Stato di Roma, where he was responsible for the photograph service and several working groups on informatics application in archival documentation. From 1997 until the present time he has planned and directed the Imago II project in the Archivio di Stato di Roma.
See http://www.asrm.archivi.beniculturali.it/sid/imago/IMAGOIIen.html
Samuel Cruz-Lara, University of Nancy 2 and LORIA, France
Samuel Cruz-Lara obtained a Master's degree in Computer Science in 1984 (University Henri Poincaré, Nancy 1) and a PhD degree in Computer Science in 1988 (National Polytechnic Institute of Lorraine). The central topic of his PhD thesis was the generation of integrated development environments by using attribute grammars.
He is currently Associate Professor at the University of Nancy 2 (Institute of Technology, Computer Science Department) and permanent Researcher at LORIA (Lorraine Laboratory for Research in Computer Science and its Applications - UMR 7503 - CNRS - INRIA - Universities of Nancy). He is a member of the 'Language and Dialogue' team and has conducted several research activities on distributed software architectures and textual linguistic resources management. He is currently working in the context of distributed architectures and multimedia resources management. Dr Cruz-Lara has participated in several projects, in particular CNRS/SILFIDE and MLIS-ELAN, and he is at present co-leader of the 'Digital Museum' project. This is a joint project between LORIA and the National Chi-Nan University. The 'Digital Museum' project is sponsored by the National Science Council of the Republic of China (Taiwan) and supported by INRIA (France).
Costis Dallas, Critical Publics SA, Greece
Costis Dallas is Chairman and Senior Researcher of Critical Publics (http://www.criticalpublics.com), a group of companies active in the field of strategic communications, creative design and technology. He is currently a Lecturer in the Department of Communication and Mass Media of Panteion University, and has over 15 years of research and professional experience in hypermedia applications, human factor issues and cultural information systems. Dr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA, co-founder and Chair of the Multimedia Working Group of the International Council of Museums (CIDOC/ICOM), Head of Documentation and Systems of the Benaki Museum, General Director of the Foundation of the Hellenic World, Special Secretary of the Greek Ministry of Education in charge of libraries, archives and instructional technologies, and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: info@criticalpublics.com
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (http://www.nl.adlibsoft.com), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that he joined Databasix, producer of the ADLIB software. In 1991 Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was renamed later as ADLIB Information Systems.
ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include:
| CIMI Dublin Core test bed: ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database, in addition to the 'standard' HTML/XML access.
| SPECTRUM-XML: A project of the UK mda together with a team of software vendors to produce a schema for exchange of data which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software.
| Internet Gelderse Musea (http://www.igem.nl) and Maritiem Digitaal (http://www.maritiemdigitaal.nl): Two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.
E-mail: bert@nl.adlibsoft.com
Nicola Guarino, Italian National Research Council, Italy
Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991, and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS) and associate editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: Nicola.Guarino@ladseb.pd.cnr.it
Janneke van Kersen, Digital Heritage Association, The Netherlands
Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992 she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing, and more recently she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University.
Since October 1999 Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations such as museums, archives, archaeological organisations, monuments and special collections of libraries in a context-rich and XML-based environment. Interoperability, from both a technical and an organisational perspective, is therefore a major issue. The Association offers access to heritage information to the general public at http://www.cultuurwijzer.nl and customised access for the educational field at http://www.cultuurwijs.nl.
E-mail: jannekevankersendennl
Marco Meli, EDW International, Italy
Marco Meli is CEO and co-founder of EDW International (http://www.edw-international.com), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia and VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meli@edw.it
Paul Miller, UKOLN, UK
Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, http://www.ukoln.ac.uk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK’s Further and Higher Education Funding Councils and Resource, the Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI) and the Metadata Working Group of the UK Government’s Office of the e-Envoy.
E-mail: P.Miller@ukoln.ac.uk
Frank Nack, CWI, The Netherlands
Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on ‘The Application of Video Semantics and Theme Representation for Automated Film Editing’ at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling, and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document, and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: Frank.Nack@cwi.nl
Franco Niccolucci, Florence University, Italy
Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference.
See also http://www.geog.port.ac.uk/hist-bound/people/niccolucci.htm
Seamus Ross, HATII, University of Glasgow, UK
Dr Seamus Ross is Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Network) (IST-2001-32706), a European Union funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously, he was Assistant Secretary for Information Technology at the British Academy, and before that worked for a company specialising in expert systems and software development as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group’s PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: S.Ross@hatii.arts.gla.ac.uk
Andrea Scotti, Institute & Museum for the History of Science, Italy
Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995 he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.
Since 1996 Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza and the National Central Library, and under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (01/02/01-30/07/03), funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies.
See http://cweb.inria.fr/Projects/Mesmuses
E-mail: scottiangalileoimssfirenzeit
DIGICULT PROJECT INFORMATION
DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over the period of 30 months (03/2002-08/2004).
In order to encourage early take-up, DigiCULT produces seven Thematic Issues and three Technology Watch Reports, along with the newsletter DigiCULT.Info.
DigiCULT draws on the results of the strategic study ‘Technological Landscapes for Tomorrow’s Cultural Economy (DigiCULT)’ that was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications), in 2000 and completed in 2001.
Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at http://www.digicult.info
For further information on DigiCULT, please contact the team of the project co-ordinator:
Mr Guntram Geser, guntram.geser@salzburgresearch.at, Phone: +43-(0)662-2288-303
Mr John Pereira, john.pereira@salzburgresearch.at, Phone: +43-(0)662-2288-247
Salzburg Research Forschungsgesellschaft
Jakob-Haringer-Str. 5/III, A - 5020 Salzburg, Austria
Phone: +43-(0)662-2288-200, Fax: +43-(0)662-2288-222
http://www.salzburgresearch.at
Project Partner:
HATII - Humanities Advanced Technology and Information Institute, University of Glasgow
http://www.hatii.arts.gla.ac.uk
Contact: Mr Seamus Ross, s.ross@hatii.arts.gla.ac.uk
The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom
DigiCULT Thematic Issue 1 - Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum, held in Barcelona on May 6th 2002 in the context of the DLM-Conference 2002.
DigiCULT Thematic Issue 2 - Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum, held in Essen, Germany, on September 3rd 2002 in the context of the AIIM Conference DMS EXPO.
DigiCULT Thematic Issue 3 - Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum, held on January 21st 2003 at Fraunhofer IPSI, Darmstadt, Germany.
DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum on Learning Objects, which will take place at the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, on July 2nd 2003.
IMPRINT
This Thematic Issue is a product of the DigiCULTProject (IST-2001-34898)
Authors:
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy
Images:
Images for this Thematic Issue have been provided by and are reproduced with kind permission of the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, Netherlands.
Graphics & Layout:
Jan Steindl, Salzburg Research
ISBN 3-902448-00-8
Printed in Austria
© 2003
IMAGES
Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I. Image on p. 5 from Fol. 264r, size 80x75; illuminator: Orosius Master a.o.
Bible Historiale, Paris, c. 1320-1340, Volume I. Image on p. 7 from Fol. 2r, size 45x50; illum. ‘Sub-Fauvel’ Master
Historic Bible, Utrecht, c. 1430, Volume I. Images on pp. 8, 10, 14, 28 from Fol. 3r, 3v, 4v, 7v, size 55x85 to 60x85; illum. Alexander Master o.a.
Psalter Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468. Images on pp. 11, 13: Wednesday Matins, Invitatorium, from Fol. 219r
Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350. Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron) from Fol. 104rb1, 106rb2, 124r, 111ra, size 40x55 to 50x55
Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335. Image on p. 21 from Fol. 4va1, size 45x55
Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460. Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41
Psalter, Normandy, c. 1180. Image on p. 26 from Fol. 3v, size 160x125
Breviary, Cambray(?), c. 1275-1300. Images on pp. 27, 29, 31, 33, 34, 35, 37 from Fol. 211r, size 205x135
© Koninklijke Bibliotheek, The Hague. Used with permission.
Genesis – The Creation: Division of Sea and Earth, Creation of Trees and Plants
accessible. We were not forced to dispose of work that we had ‘webised’ unless we wished to replace its representation with more sophisticated ones. Secondly, the value of the content that we put up increased as more users put up content of their own, because the additional content attracted more users. Thirdly, to benefit you did not need to generate a lot of content; a very little would do, and you could incrementally add more later. Slowly, heritage institutions found ways to take advantage of the opportunities offered by the web; there are still many small and medium-sized heritage institutions that have not.
Indeed, the heritage sector is likely to be left behind because the financial rewards for creating the mark-up necessary to make the Semantic Web a reality are only evident to the commercial sector. There can be little doubt that the access to and understanding of the heritage would benefit from a world in which the vision of the Semantic Web were realised. But this is not the first information technology for which the benefits were promising. Even very simple strategies, such as the use of databases to enable collection description, have been shown over a period of nearly thirty-five years to bring benefits to the heritage sector institutions through better knowledge about, care of and access to their collections. In the ALM sector only libraries can be said to have fully taken advantage of the technology to describe their collections, and even here a close look shows that this has not covered all their holdings and not every institution. For instance, few libraries in the UK have online catalogues of their pre-1700 items, and almost none have accurately described their photographic holdings at anything deeper than collection level. The same can be said of museums, where descriptions are limited except, of course, at the major institutions. In 1997 a survey in the United Kingdom showed that small and medium-sized institutions were struggling to participate in the computer-based description of their holdings. This was even before they considered putting the output of those holdings online. I would argue that this should hardly be surprising, as the heritage sector has already been left behind in the development of online information in the web-world. Too few institutions have too little visible content that is actually usable. If the heritage sector is to make a near term contribution to the development of the Semantic Web, it is going to be very moderate. It is very unlikely that developments will be related to reasoning about the heritage in the ways considered by Amann et al. (2002). The ALM sector is more than likely to participate in the development of the Semantic Web through the creation of semantic mark-up of information about access arrangements, such as opening hours and details of facilities. This information is more likely to be useful to the tourism agent described by Tim Berners-Lee in his 2001 Scientific American article. While this may be a very positive way of integrating the heritage into the Semantic Web, it does not maximise the potential benefits.
The days when a curator who wishes to hold an exhibition on the representations of Salome since the 15th century will be able to ‘load’ an agent with the request to identify, select, negotiate the loan of and arrange the transportation of the key 100 works of art are a long way off. The fundamental descriptions of holdings are not currently available; where they exist they are not online, and they certainly have not been semantically encoded to make them usable by our Salome agent. For those who have worked on Knowledge Representation, the vision of the Semantic Web holds promise. Knowledge Representation is hard, especially if you intend any particular representation to be usable by others, either as a decision-making resource or for research purposes. Efforts in the 1980s and early 1990s, such as those in archaeology, failed. The reasons Knowledge Representation failed to achieve its promise ranged from the poor quality of knowledge extraction strategies, the lack of fundamental representation methodologies, the limited applicability of methods to knowledge domains and the problems of boundary constraint and creep, to the high costs of developing applications. The Semantic Web could breathe new life into this earlier promise by providing ways to carve up the problem while bringing us immediate successes.
CONCLUSION
Over the next five years the possibilities offered by the Semantic Web will bring little near term benefit for the heritage sector unless that sector co-ordinates its efforts to ensure that the fundamental building blocks that are necessary for the Semantic Web to be a success are put in place. We need some quick wins. A quick win involves identifying a domain that every institution can be encouraged to represent semantically, and the placing in the public domain of a ‘personalisable agent’ that can take advantage of these semantics. Three factors could underpin a quick win: (a) a narrowly restricted knowledge domain of real public value; (b) an accessible and narrow ontology; and (c) a personalisable tool for processing knowledge.
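To make the three factors concrete, here is a minimal sketch in Python of what such a quick win might look like for opening hours, the access information mentioned earlier. Everything in it is an assumption made for illustration: the institution identifiers, the `opens:<day>` predicates and the `open_on` function are invented, and a real deployment would more likely publish RDF against an agreed vocabulary.

```python
from dataclasses import dataclass

# (b) A deliberately narrow, hypothetical ontology: one predicate per weekday.
WEEKDAYS = ("mon", "tue", "wed", "thu", "fri", "sat", "sun")


@dataclass(frozen=True)
class Triple:
    subject: str    # institution identifier (invented for this sketch)
    predicate: str  # e.g. "opens:sat"
    obj: str        # opening interval as zero-padded "HH:MM-HH:MM"


def open_on(triples, day, at):
    """(c) A tiny 'agent': list the institutions open on `day` at time `at`."""
    if day not in WEEKDAYS:
        raise ValueError(f"unknown weekday: {day}")
    found = []
    for t in triples:
        if t.predicate != f"opens:{day}":
            continue
        start, end = t.obj.split("-")
        # Zero-padded HH:MM strings compare correctly as plain text.
        if start <= at < end:
            found.append(t.subject)
    return sorted(found)


# (a) A restricted domain of public value: when can I visit?
STATEMENTS = [
    Triple("museum-a", "opens:sat", "10:00-17:00"),
    Triple("museum-a", "opens:sun", "12:00-17:00"),
    Triple("archive-b", "opens:sat", "09:00-12:30"),
]

print(open_on(STATEMENTS, "sat", "11:00"))  # -> ['archive-b', 'museum-a']
```

The point of the sketch is the size of the task: each institution contributes a handful of statements, and the agent needs no knowledge of collections at all.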
Ultimately, the same factors that constrain the heritage sector’s ability to take full advantage of the Web will constrain the penetration and pervasiveness of the Semantic Web in the heritage sector. The success of the Semantic Web in the heritage sector depends upon its adopting an XML-based approach and a significant experiment that demonstrates its benefits to the wider community. Even for all its weaknesses, the Semantic Web offers a tantalising solution to the problem of information overload created by the web, and the heritage sector needs to address how it can take advantage of the opportunities it offers.
Bibliography
Amann, B., Beeri, C., Fundulaki, I. and Scholl, M. 2002. Ontology-Based Integration of XML Resources, in I. Horrocks and J. Hendler (eds), The Semantic Web - ISWC 2002, Berlin: Springer, 117-131.
Berners-Lee, T., Hendler, J. and Lassila, O. 2001. The Semantic Web, in Scientific American, May 2001.
Candan, K.S., Liu, H. and Suvarna, R. 2001. Resource Description Framework: Metadata and Its Application, in SIGKDD Explorations 3:1, 6-19. http://www.acm.org/sigs/sigkdd/explorations/issue3-1/candan.pdf
Doan, A., Madhavan, J., Domingos, P. and Halevy, A. 2002. Learning to Map between Ontologies on the Semantic Web, in Proceedings WWW2002, 7-11 May 2002 (Honolulu), 662-673.
Gómez-Pérez, A. and Corcho, O. 2002. Ontology Languages for the Semantic Web, in IEEE Intelligent Systems, January/February 2002, 54-60.
Haustein, S. and Pleumann, J. 2002. Is Participation in the Semantic Web Too Difficult?, in I. Horrocks and J. Hendler (eds), The Semantic Web - ISWC 2002, Berlin: Springer, 448-453.
Heflin, J. and Hendler, J. A Portrait of the Semantic Web in Action, in IEEE Intelligent Systems, March/April 2001, 54-59.
Hendler, J. 2001. Agents and the Semantic Web, in IEEE Intelligent Systems, March/April 2001, 30-37.
Hendler, J., Berners-Lee, T. and Miller, E. 2002. Integrating Applications on the Semantic Web, in Journal of the Institute of Electrical Engineers of Japan 120:10, 676-680.
Klein, M. 2001. XML, RDF and Relatives, in IEEE Intelligent Systems, March/April 2001, 26-28.
McGuinness, D.L., Fikes, R., Hendler, J. and Stein, L.A. 2002. DAML+OIL: An Ontology Language for the Semantic Web, in IEEE Intelligent Systems, September/October 2002, 72-80.
Patel-Schneider, P. and Siméon, J. 2002a. Building the Semantic Web on XML, in I. Horrocks and J. Hendler (eds), The Semantic Web - ISWC 2002, Berlin: Springer, 147-161. http://www-dbs.cs.uni-sb.de/lehre/ss03/xml-seminar/Material/PS02.pdf
Patel-Schneider, P. and Siméon, J. 2002b. The Yin/Yang Web: XML Syntax and RDF Semantics, in Proceedings WWW2002, 7-11 May 2002 (Honolulu), 443-453.
Renear, A., Dubin, D., Sperberg-McQueen, C.M. and Huitfeldt, C. 2002. Towards a Semantics for XML Markup, in DocEng’02, 8-9 November 2002 (McLean, VA), ACM Publication.
Shah, U., Finin, T., Joshi, A., Cost, R.S. and Mayfield, J. 2002. Information Retrieval on the Semantic Web, in CIKM’02, 4-9 November 2002, 461-468.
DEVELOPMENT OF THE SEMANTIC WEB
MUST BEGIN AT THE GRASS ROOTS LEVEL
AN INTERVIEW WITH JANNEKE VAN KERSEN, DUTCH DIGITAL HERITAGE ASSOCIATION, THE NETHERLANDS
By Joost van Kasteren
‘To be successful, the Semantic Web for the cultural heritage sector will have to develop from the grass roots level. A top-down approach whereby institutions have to squeeze themselves into a certain format is not going to work.’ Janneke van Kersen has strong views on the initiatives that are currently being undertaken to develop a Semantic Web. ‘They are not going to work for the cultural heritage institutions if you do not take into account the position that they are in. Especially not if the institutions are forced to overhaul their digitisation projects completely.’
Kersen graduated in Art History and did a postgraduate course on Historical Information Processing. Since 1999 she has been a consultant with the Dutch Digital Heritage Association (Vereniging DEN), which supports cultural heritage institutions, large and small, in developing strategies to face the digital future.
The key objectives of the DEN are to assist institutions in digitising and documenting their collections according to high quality standards, and assuring cross-domain and cross-institutional access to heritage information in a context-rich, structured environment. The methods used to realise these objectives are knowledge dissemination, best practice and standardisation. The Association propagates open standards like XML, OAI and Dublin Core (qualified).
The DEN has approximately 60 member institutions, among them most of the large heritage institutions of The Netherlands. It provides access to the databases of the member organisations through the portal http://www.cultuurwijzer.nl. The Cultuurwijzer (culture pointer) to the collections uses the Aqua Browser to search for terms in a non-hierarchical, associative way. Databases can also be accessed through subject fields based on the Dublin Core standard. Research is being carried out on applying the Art and Architecture Thesaurus in a post-coordinative way, using it as an additional search aid.
Kersen: ‘The Dublin Core has some drawbacks, but it is one of the few international standards available for exchange of information. Mapped to the 5 Ws (who, where, what, when and why), it turns out to be a nice tool for interoperability across the databases of heritage institutions. Of course, we have to accept a certain kind of fuzziness and lack of precision compared with domain-specific access at the institutional level.’
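The interview does not say which Dublin Core elements feed which of the 5 Ws, so the grouping below is an assumption chosen for illustration rather than the DEN’s actual mapping. The element names themselves are standard Dublin Core, and the sample record reuses a manuscript listed in this issue’s image credits. A minimal Python sketch:

```python
# Assumed grouping of Dublin Core elements under the 5 Ws (illustrative only).
FIVE_WS = {
    "who": ("creator", "contributor", "publisher"),
    "where": ("coverage",),
    "what": ("title", "subject", "type"),
    "when": ("date",),
    "why": ("description",),
}


def to_five_ws(record):
    """Regroup a flat Dublin Core record under the 5 Ws.

    Elements missing from the record are skipped, which is where the
    'fuzziness and lack of precision' shows up in practice.
    """
    view = {}
    for w, elements in FIVE_WS.items():
        values = [record[e] for e in elements if e in record]
        if values:
            view[w] = values
    return view


record = {
    "title": "Der Naturen Bloeme",
    "creator": "Jacob van Maerlant",
    "date": "c. 1350",
    "coverage": "Flanders",
}

print(to_five_ws(record))
```

Two institutions with different local schemas can both be searched through the same five questions once each has mapped its own fields to Dublin Core.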
According to Kersen, a real Semantic Web is still a long way off. ‘We simply do not have the tools yet for a meaningful exchange and representation of information. XML and RDF do not provide the interoperability that is needed. On the other hand, I do not believe in developing a fundamental ontology to give meaning to information on the Net. It looks to me like the 18th-century endeavour to write an encyclopaedia that contains all the knowledge in the world. I am afraid it does not work that way. A lot of knowledge, even scientific knowledge, cannot be described in a logical way. Especially in the arts, a lot of “knowledge” is the result of heuristics and associative thinking. Apart from that, there is the practical problem that cultural heritage institutions do not have the money and the staff to describe their collections anew in a way that fits the ontology. That is just too much work.’
Kersen thinks the Semantic Web will grow gradually from the grass roots level onward. Within the Dutch Digital Heritage Association she can point at several initiatives. Considered on their own they might not seem like much, i.e. in relation to the gargantuan task of developing a Semantic Web, but when they are combined a certain pattern begins to emerge.
There is, for instance, a project, SitecheckerRDF, which will look into ways in which RDF can be used for describing content for Web-based delivery. Furthermore, several standardisation projects are running that enable the participating institutions to develop the description of their specific knowledge domain, e.g. graphic domain, religion, art history. The formal and semantic mapping schemes used in these projects will include Dublin Core, Encoded Archival Description (EAD), IMS Learning Resources Metadata Specification, Art & Architecture Thesaurus, as well as the CIDOC reference model. Kersen: ‘At the moment most of the reference terms are developed at the level of institutions, which means their use is limited to a certain domain. Or, to put it another way, every domain is developing its own dialect. Maybe the development of a combined reference scheme will be a step towards a Semantic Web.’
Another important project is the development of a scheme for description at the collection level in order to offer a clearer and more hierarchical access to heritage collections. This project has its roots in the Dutch project for collection level description, MUSIP (Museum Inventarisation Project). The description scheme will be broadened to make it available for other heritage institutions as well, for example archives.
The description and results of these projects and the programme lines will be made available on the Web site of the Vereniging DEN, http://www.den.nl. As Cultuurwijzer is used as a proof of concept, the results will be directly accessible at http://www.cultuurwijzer.nl. (Kersen kindly invites interested parties to put questions directly to their organisation.)
Projects are always carried out in co-operation with the member organisations. Kersen: ‘We first try it ourselves until we are sure it works, a “proof of concept” you could say. These tests are overseen by a small group of automation experts working for our member organisations. The next step is to test the method on a larger scale with some of our member organisations. A larger working group oversees these tests. Only then is the method released to our member organisations. The advantage is that smaller member organisations can ride on the experience of the larger ones. By taking a step-by-step approach we also enhance the level of commitment. You could say we are providing some order in the information chaos that exists on the Internet. A few small steps along the long road towards the Semantic Web.’
Vereniging Digitaal Erfgoed Nederland: http://www.den.nl
Cultuurwijzer: http://www.cultuurwijzer.nl
DIGICULT’S EXPERT 13 TANGLE
WITH THE SEMANTIC WEB
By Michael Steemson
Genesis – The Creation: Birds and Fishes
It was really very kind of the Moderator. He asked the cultural heritage experts: ‘What is the message that we should give our scientific writer who will do a write-up of this meeting for the Thematic Issue?’ Their inclinations had not always been clear. But the question focused minds and they were certain now.
‘I would put my money on the Semantic Web,’ said two of them, not quite in unison. ‘The Semantic Web is a direction, it is like North. You go North but you never arrive and say “here it is”. This is the Semantic Web,’ said another.
The course of the debate at the Darmstadt DigiCULT Forum had not always been so direct. It had started with the Position Paper’s dismaying thought that ‘the limited understanding of information processing in the heritage sector almost makes the Semantic Web an impossibility to apply’.
It had touched on the semantics of Simeon poetry, art works of the biblical seductress Salome, weather forecasts for the northern English city of York and the revolutionary theories of 16th-century Italian astronomer Galileo. But the experts agreed, finally, that the cultural heritage sector needed the Semantic Web and that a good deal of education and guidance would be required to make it appreciate that need.
The experts numbered 13, lucky for some, at this, the third DigiCULT Forum of the European Union’s technology watchdog for cultural and scientific heritage institutions. In the previous 12 months, other forum groups had discussed authenticity and integrity for digitisation programmes and, later, digital asset management systems. Now, the Darmstadt 13 - historians, language and information technology scientists, academics and publishers - were looking even further down the information autobahn to the vision of WWW inventor, Englishman Tim Berners-Lee, who sees a new kind of automated Web that learns and understands each user’s particular requirements and delivers complete, reliable, tested information sets.
In a co-authored May 2001 Scientific American article¹, Mr Berners-Lee imagines a family facing the horrors of re-scheduling its lives around a mother’s unexpected illness. The sons and daughters rely on Semantic Web ‘agents’, small executable Web files, to search online medical records, hospital bed lists, transport timetables, doctors’ appointment books, road condition reports and home diaries to find treatment, plan travel and re-arrange personal engagements to fit the emergency.
The vision requires huge, world-wide investment in time and effort creating countless ‘ontologies’ containing, perhaps, XML (eXtensible Mark-up Language) and RDF (Resource Description Framework) data to which the electronic ‘agents’ could refer for understanding before applying to specially formatted Web pages for the information.
The Berners-Lee et al. dazzling forecast is: ‘The Semantic Web will enable machines to comprehend semantic documents and data, not human speech and writings. Properly designed, the Semantic Web can assist the evolution of human knowledge as a whole.’
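For readers who have not seen the XML and RDF data referred to above, the following Python sketch produces a small RDF/XML description that an agent could read. It is an illustration only: the `describe` helper and the example.org identifier are invented here, while the two namespace URIs are the standard RDF and Dublin Core ones. The sample values come from a manuscript listed in this issue’s image credits.

```python
import xml.etree.ElementTree as ET

# Standard namespace URIs for RDF and the Dublin Core element set.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("rdf", RDF)
ET.register_namespace("dc", DC)


def describe(about, properties):
    """Serialise one resource as RDF/XML with Dublin Core properties.

    `about` identifies the resource; `properties` maps Dublin Core
    element names (e.g. "title") to literal values.
    """
    root = ET.Element(f"{{{RDF}}}RDF")
    desc = ET.SubElement(
        root, f"{{{RDF}}}Description", {f"{{{RDF}}}about": about}
    )
    for name, value in properties.items():
        ET.SubElement(desc, f"{{{DC}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical identifier; values taken from the image credits.
xml_text = describe(
    "http://example.org/manuscripts/psalter",
    {"title": "Psalter", "coverage": "Normandy", "date": "c. 1180"},
)
print(xml_text)
```

The markup itself is small; the investment the article refers to lies in agreeing what the property names mean and in producing such descriptions for every object.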
THE DAZZLING PROSPECTS
Dazzling it is and the Darmstadt 13 were attractedBut they were not blinded Moderator Dr SeamusRoss the Director of Glasgow UniversityrsquosHumanities Advanced Technology and InformationInstitute (HATII) suggested that while the currentWWW content got a lot of lsquobangrsquo for its develop-ment dollars the Semantic Web needed hugeexpensive content before it could work well
Application of the Berners-Lee ideas to culturalheritage use was a long way off he thought andwondered lsquoIs there enough benefit from theSemantic Web in the near term to make it arealisable dream 50 years down the roadrsquo
Italian National Research Council AppliedOntologies Laboratory director Nicola Guarino hasbeen working on the subject for 12 years and heknows the difficulties He said lsquoThis is the ideal viewwhich Tim Berners-Lee has machines which workfor you your proxy which works for you perform-ing these dynamic connections for the Web whichpreserve meaning It is pretty ambitious but this ishis idea I would be happier if rather than using anautomatic proxy we could just let people establishthese dynamic connections using their brain and theWebThis is already something that is not easilydonersquo
Austria’s Wernher Behrendt had encountered other snags. At Salzburg Research, the secretariat for the DigiCULT Forums, he co-ordinates another European Commission IST project, CULTOS (Cultural Units of Learning - Tools and Services). He conceded: ‘There is a 50-year research vision behind the issue of the Semantic Web,’ and went on: ‘but there are incremental steps that, with good utility, can be built in a reasonable time. One of the intellectual challenges is to break the vision into these manageable steps.’
LANGUAGE REPRESENTATION HITCHES
A CULTOS group had, Behrendt explained, taken one of these incremental steps and built an ontology² for digitised works of art. It had encountered problems with language representation, like: ‘Are there tools to support knowledge representation language? Are the users then actually able to work usefully with that? Can we incorporate the multimedia authoring component where people who have not built the
1 T. Berners-Lee, J. Hendler, O. Lassila: The Semantic Web. In: Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html
2 Ontology n. 1. Philosophy. The branch of metaphysics that deals with the nature of being. 2. Logic. The set of entities presupposed by a theory. Collins English Dictionary, Third edition, Glasgow 1991.
Cultural Units of Learning - Tools and Services (CULTOS)
CULTOS is an RTD project co-funded by the European Commission under the Information Society Technologies (IST) Programme, which will run until October 2003. The application domain of CULTOS is intertextual studies in literature and arts. The project is developing a multimedia authoring and presentation environment that allows scholars to make the different relationships between cultural works explicit, in a way that approximates contextualisation in interpretative processes. The result of the authoring processes is multimedia objects called ‘intertextual cultural threads’. These are based on ‘EMMOs’, a novel type of structured multimedia object containing expert knowledge, conforming to current and emerging standards such as XML, SMIL (with interactive extensions), MPEG-7 and RDF.
http://www.cultos.org
ontology themselves should then use it to combine multimedia assets with each other?’
The Dutch had doubts too. Dr Janneke van Kersen is an art historian with Digital Heritage Netherlands (Digitaal Erfgoed Nederland, http://www.den.nl), where an XML-based content management system is being combined with a Resource Description Framework (RDF) to join databases from several cultural heritage institutions. She told the experts: ‘I need to be assured that we will be able to build a layered structure that is equally applicable to each knowledge domain. Furthermore, I think that the cultural heritage sector is too much of a niche market to develop this.’
Her countryman, Dr Frank Nack from the national research institute for mathematics and computer science, CWI (Centrum voor Wiskunde en Informatica, http://www.cwi.nl) in Amsterdam, works with a multimedia and human-computer interaction group. His concern: ‘Our group believes in the Semantic Web, but we needed some mechanisms to structure the information so that various groups can work with it. What we came up with was the belief that you can classify the user at a particular time. But that was simply not good enough.’
The group had found that users change their requirements widely and these shifts were invisible to a system. ‘Humans can look at material one day and the next day they look at the same stuff differently and describe it differently because they are in a different mood,’ he said. He characterised the problem as: ‘Now I would like to see something for my work and now I want to be entertained. Which means I would like to access the information differently.’
He had one other worry: Webised mixed media. He said: ‘This discussion has been heavily linguistic-based, which I can understand because most people do still think of the Web as text-driven. The issue of describing various media items that are not text will, I think, very soon become important for the Semantic Web. We had better start thinking about that too.’
THE GALILEO CONUNDRUM
The Institute and Museum of Science History in Florence, Italy, has tried to create an ontology around the works and sciences of its city’s famous son, the revolutionary astronomer, mathematician and physicist Galileo Galilei (1564-1642).³ But it ran into difficulties when it came to the radical changes in theory that he created.
Institute relational database expert Andrea Scotti told the Forum: ‘Galileo’s scientific theory negates another scientific theory. This negating or development of theories was very difficult to represent in the ontology when dealing with the time factor. That is central to historical documentation, but representational time was not part of the process available to us.’
Dr Costis Dallas, the Athens chairman of the European communication and technology group Critical Publics, thought the Florence museum’s project was very ambitious. The time argument was difficult because ‘of course, it isn’t possible to represent time properly within a relational database’. But there were mechanisms - he mentioned software by the Virginia, US, IT group Telos⁴ - that better represented issues of time.
Italian National Research Council’s Nicola Guarino chipped in: ‘The CIDOC⁵ reference models have partial answers to these questions.’
Dr Dallas went on: ‘I do not believe that the whole exercise is futile, but we found that in practice you cannot make a subject language for everybody. It has to be for a community of users. If you provide them with a richer representation, for instance if they can know that this is a person and this person lived in a place, then users will have a much richer experience.’
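Dr Dallas’s ‘richer representation’ can be pictured with a very small sketch. It is an illustration only, not anything shown at the Forum: the identifiers (`galileo`, `dialogo`, `florence`) and the predicate names (`is_a`, `lived_in`, `created_by`) are hypothetical, and a real system would more likely use RDF tooling than plain Python tuples.

```python
# Each statement is a (subject, predicate, object) triple.
# All identifiers and predicate names below are made up for illustration.
TRIPLES = {
    ("galileo", "is_a", "Person"),
    ("florence", "is_a", "Place"),
    ("galileo", "lived_in", "florence"),
    ("dialogo", "is_a", "Work"),
    ("dialogo", "created_by", "galileo"),
}


def objects(subject, predicate):
    """Return, sorted, every object linked to `subject` by `predicate`."""
    return sorted(o for s, p, o in TRIPLES if s == subject and p == predicate)


def places_linked_to_work(work):
    """Follow work -> creator -> place, keeping only correctly typed nodes."""
    places = []
    for creator in objects(work, "created_by"):
        if "Person" not in objects(creator, "is_a"):
            continue  # the creator is not known to be a person
        for place in objects(creator, "lived_in"):
            if "Place" in objects(place, "is_a"):
                places.append(place)
    return places


if __name__ == "__main__":
    print(places_linked_to_work("dialogo"))
```

Because the system knows that one node is a person and another a place, it can offer a user who is looking at a work the places associated with its author, which a flat keyword index could not do.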
3 The museum and Web site are rich resources for the life and work of Galileo: http://galileo.imss.firenze.it
4 Telos Corporation, http://www.telos.com, Ashburn, VA, US
5 CIDOC: International Committee for Documentation of the International Council of Museums (ICOM-CIDOC), httpwwwwillpowerinfomybycoukcidocCIDOCe. Forum for documentation interests of museums and related organisations; one of 25 international committees of the International Council of Museums (ICOM).
CAPITALS AND ACRONYMS
The Forum moved on to a discussion of the science and software behind the Semantic Web. Moderator Seamus Ross started by questioning whether ‘key thinkers’ were ‘missing a fundamental point: that Web pages are dead, that database-driven Web pages are the future, that people are going to stop making Web pages and make databases’.
He recalled that James A. Hendler, co-author with Tim Berners-Lee of the Semantic Web paper and a professor in the Department of Computer Science at the University of Maryland, had recommended semantic representation as part of any Web pages. Dr Ross commented: ‘This notion of the Semantic Web is not going to work with these databases. Is this true or not?’
Bert Degenhart-Drenth, the managing director of Netherlands ADLIB Information Systems (http://www.nl.adlibsoft.com), thought it was true. But it was a practical problem that would be solved
Open Archives Initiative (OAI) Protocol for Metadata Harvesting
The Open Archives Initiative (OAI) is a mainly US-based group of people and organisations that evolved out of a need to increase access to scholarly publications through interoperable digital repositories. Support for the OAI’s goals comes from the Digital Library Federation, the Coalition for Networked Information and from an NSF grant. One of its major achievements is an application-independent interoperability framework based on metadata harvesting: the OAI Protocol for Metadata Harvesting.
The OAI Protocol is based on the standard Web protocols HTTP and XML, and employs Dublin Core (unqualified) as metadata standard. Heritage organisations that have systems supporting the OAI Protocol can expose metadata about the content in their repository, i.e. allow service providers to harvest the data for services such as search engines. In the OAI Protocol, the XML schema is used at two levels: to define the format of responses to all OAI Protocol requests, and to define the format of metadata streams embedded in the GetRecord and ListRecords responses. In both cases, the goal is to provide a mechanism for data validation.
http://www.openarchives.org
The OAI-PMH version 2.0, released in June 2002, can be found at http://www.openarchives.org/OAI/2.0/openarchivesprotocol.htm
See also John Perkins: A New Way of Making Cultural Information Resources Visible on the Web: Museums and the Open Archives Initiative. Museums and the Web Conference 2001, http://www.archimuse.com/mw2001/papers/perkins
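As a rough illustration of how a service provider would address such a repository, the sketch below builds the request URLs for the two verbs named above. The verbs `ListRecords` and `GetRecord`, and the `metadataPrefix` value `oai_dc` for unqualified Dublin Core, come from the OAI-PMH specification; the base URL and the record identifier are hypothetical placeholders, and no request is actually sent.

```python
from urllib.parse import urlencode

# Hypothetical repository endpoint; a real data provider publishes its own.
BASE_URL = "http://repository.example.org/oai"


def oai_request(verb, **arguments):
    """Build an OAI-PMH request URL: base URL plus verb plus its arguments."""
    query = urlencode({"verb": verb, **arguments})
    return f"{BASE_URL}?{query}"


# Harvest all records as unqualified Dublin Core.
list_url = oai_request("ListRecords", metadataPrefix="oai_dc")

# Fetch a single record; the identifier is a made-up example.
get_url = oai_request(
    "GetRecord",
    identifier="oai:example.org:item-1",
    metadataPrefix="oai_dc",
)

if __name__ == "__main__":
    print(list_url)
    print(get_url)
```

The XML returned for either request would then be validated against the protocol’s schema, and the embedded metadata against the Dublin Core schema, which are the two levels of validation described above.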
Open Archives Forum Project, http://www.oaforum.org
The Open Archives Forum Project is a two-year Fifth Framework Programme IST accompanying measure that will run until September 2003. The Forum is building a Web-based database on OAI-related projects, software implementations and services, and supports the information exchange between OAI user communities.
Their surveys provide good insight into the status of uptake of the OAI in Europe. For an overview of the results, see S. Dobratz, B. Matthaei: Open Archives Activities and Experiences in Europe. An Overview by the Open Archives Forum. In: D-Lib Magazine, Vol. 9, No. 1, January 2003, http://www.dlib.org/dlib/january03/dobratz/01dobratz.html
Recently, one of their workshops, ‘Providing Access to Hidden Resources’ (Lisbon, December 2002), targeted the libraries and archives communities. Requirements, standards, best practice and solutions to interoperability problems of these communities were analysed and compared with the features provided by the OAI Protocol for Metadata Harvesting. The tutorial ‘OAI and OAI-PMH for Beginners’ and other presentations can be found at http://www.oaforum.org/workshops/lisb_programme.php
by projects like the Web Services of the Open Archives Initiative (OAI).
Several members spoke of difficulties created by dynamic Web pages generated from ASP databases. Paul Miller, the UK Interoperability Focus for the University of Bath’s UKOLN (formerly the UK Office for Library Networking) project, said the UK’s
Office of the e-Envoy,⁶ the agency in charge of Britain’s e-Government programme, drove it with a Web site called Govtalk.⁷ ‘One big ASP database,’ he called it. It carried the Government’s interoperability standard framework with a URL of ‘a great big long string, something, something, dot ASP’. ‘That is how this key piece of legislation is referred to, and next week it will be something else.’
Dr van Kersen thought the new Web Ontology Language, with the transposed acronym OWL, would help. Italian ontology researcher Nicola Guarino discussed the European Commission’s On-To-Knowledge project, its RDF tool Sesame⁸ and the Ontology Inference Layer (OIL) language.
Dr Ross described the US Defense Department’s DARPA (Defense Advanced Research Projects Agency) Agent Markup Language (DAML) programme and its DAML+OIL variant. Mr Degenhart-Drenth highlighted the importance of protocols used in Web Services, such as SOAP (Simple Object Access Protocol) and UDDI (Universal Description, Discovery and Integration), and also pointed to a SPECTRUM XML standard for museums that he helped write. Then there was the moviemakers’ audio-visual search standard MPEG-7 and the SMIL (Synchronized Multimedia Integration Language). No one mentioned SHOE (Simple HTML Ontology Extensions), which was surprising as the discussion became more and more alphabetic and upper case.
ONTOLOGY TUTORIAL IN 800 WORDS
Nicola Guarino brought the discussion back on track. The ontology expert delivered a fascinating, impromptu 800-word dissertation on ontology genetics.
Ontologies, he said, started because it was realised that controlled vocabularies, which worked well enough for limited periods, needed something extra to make them really useful. They needed clarification of intended meaning. This could be achieved in much the same way as dictionaries did it, by reference to other, more basic terms. This was, he said, the key point.
‘Ontologies can work if the basic terms are really used in a principled way. There is a hidden assumption here that it is indeed possible to express the meaning in terms of a relatively small set of primitive terms.’
He explained further: ‘There are general terms that have a universal meaning. Take the term “part”, for instance, or “set”. Or take temporal relations: “before”. Suppose I have two different periods, the Renaissance and another period, and suppose I say that this period comes “before” the other one: do you exclude the case whereby the two periods overlap or not? This is just a matter of stipulation; this is a general term that is not domain-specific. I can simply stipulate exactly whether the “before” relationship between the two intervals includes the case of overlapping or not. And I can do that by means of axioms. Once you clarify the meaning of the basic terms, topological relations, mereological⁹ relations, dependence relations, these kinds of things, then you have the basic vocabulary that helps you to introduce more domain-related things. And this is what people are doing in the area of what are called “Foundational Ontologies”, and this is what I am doing. I believe this is the only way to solve the problem of semantic interoperability. So, not just controlled vocabularies, but vocabularies that are formally defined in minimal terms.’
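Mr Guarino’s point about stipulating ‘before’ can be made concrete with a short sketch. It is an illustration under stated assumptions, not his formalisation: periods are modelled as closed intervals of years, the two function names are invented here, and the dates given for the two periods are rough placeholders chosen only so that the periods overlap.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Period:
    """A named period modelled as a closed interval of years."""
    name: str
    start: int
    end: int

    def __post_init__(self):
        if self.start > self.end:
            raise ValueError("a period cannot end before it starts")


def before_strict(a, b):
    """Stipulation 1: 'a before b' excludes overlap (a ends, then b starts)."""
    return a.end < b.start


def before_allowing_overlap(a, b):
    """Stipulation 2: 'a before b' means a starts first and ends first."""
    return a.start < b.start and a.end < b.end


# Placeholder dates, chosen only so that the two periods overlap.
renaissance = Period("Renaissance", 1400, 1600)
baroque = Period("Baroque", 1580, 1750)

if __name__ == "__main__":
    print(before_strict(renaissance, baroque))
    print(before_allowing_overlap(renaissance, baroque))
```

The same two periods satisfy one definition and fail the other. Two catalogues that both record ‘Renaissance before Baroque’ therefore only mean the same thing if they have committed to the same axiom, which is the interoperability problem he describes.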
Now it was clear, but would it be available to heritage institutions? Dr Ross wanted to know if fundamental ontologies of use to the heritage sector already existed. What would they be? Before any
On-To-Knowledge
On-To-Knowledge is an IST RTD project that was completed in June 2002. The project developed tools and methods for supporting knowledge management in large and distributed organisations. The technical backbone of On-To-Knowledge was the use of ontologies for the various tasks of information integration and mediation. For the project’s many results, see their tools repository, project deliverables and publications at http://www.ontoknowledge.org/pubs.html
See also the On-To-Knowledge book ‘Towards the Semantic Web: Ontology-driven Knowledge Management’, J. Davies, D. Fensel, F. van Harmelen (eds.), John Wiley, December 2002.
6 Office of the e-Envoy: http://www.e-envoy.gov.uk
7 Govtalk: http://www.govtalk.gov.uk
8 Sesame environment: httpwwwontoknowledgeorgtoolsfactsheetSesamehtml
9 Mereology n. The formal study of the logical properties of the relation of part and whole. Collins English Dictionary, Third edition, Glasgow 1991.
development could begin, basic classifications and methodologies would be required to form a foundation for the work.
Italian online publisher Marco Meli, a manager of the EU’s MESMUSES (Metaphors for Science Museums) project, insisted: ‘You need a clear definition in this particular domain. What are the key words, the terms?’
Mr Guarino answered: ‘The key concepts and the key relations.’
THE ELUSIVE SEMANTIC GRAIL
By now the Darmstadt 13 were beginning to realise they were not having much luck with their search for the Semantic Grail. Their discussion became a little tetchy.
Someone talked about ‘meta-ontology’ and another growled: ‘The term “meta” is abused.’
‘Outdated technical optimism’ was mentioned.
‘What is your alternative?’ someone else wanted to know.
‘I don’t have one.’
‘Is this the way forward?’
MESMUSES - Metaphors for Science Museums
MESMUSES is an IST RTD project that will run until July 2003. It aims at designing a general method and supporting tools to produce knowledge maps for use in self-learning environments of science museums. In the project, a knowledge map is defined as a set of related concepts and facts that is offered to learners with some guidance or suggestions on possible itineraries that they may follow to explore the knowledge space.
The method and tools developed in MESMUSES are being tested and validated by two large science museums, the Cité des Sciences et de l’Industrie in Paris and the Istituto e Museo di Storia della Scienza in Florence, which provide access to their digital catalogues. Both museums are developing knowledge maps and itineraries on different themes in Biology (Genome) and Physics (Galileo and the laws of motion).
Project Web site: http://cweb.inria.fr/Projects/Mesmuses
See also M. Meli: Knowledge Management: a new challenge for science museums. In: Cultivate Interactive, Issue 9, 7 February 2003, http://www.cultivate-int.org/issue9/mesmuses
‘What part of the problem would that solve?’
‘We will know when we have tried that out.’
‘Here we are approaching a scary field.’
Civility and peace were restored as Frank Nack, the CWI Netherlands scientist, introduced the thought: ‘There are ontologies for art, and they are very old and well crafted. There are very clear rules about why they did what they did, because they have worked on them for a thousand years. What you could suggest is that we strip down to the basics for one field, say art, and apply it to all the other fields we have in cultural heritage, architecture, film, whatever, all working with very different substances.’
Seamus Ross added: ‘So we need one fundamental ontology on which we can build all the others.’
THE LUCK CHANGES
The luck of the 13 was beginning to change. The Athenian heritage informatics expert Dr Dallas described work among his company clients on developing ‘an upper ontology’. All the issues the Forum had been discussing, what to do about time, basic concepts, process, agents and so on, were being examined. They were beginning to develop ‘something very much like a thesaurus’, with term expansion that created sub-categories of relationships. He called it ‘generic layering’, a process that could identify the ‘generic grammar’ of relationships within a specific domain - art history, for example.
‘This is useful,’ he said. ‘This way we can create Web systems that present an association of content for users that is meaningful to them. Let’s say, a “cultural” meaning.’
Nicola Guarino went further. He believed that the International Council of Museums’ Committee for Documentation (CIDOC) Conceptual Reference Model (CRM) was the ‘best starting point’ for the heritage community. The CIDOC CRM is the result of 10 years’ work by a standards working group. The
model is under review for adoption as an International Organization for Standardization (ISO) standard.
Mr Guarino was enthusiastic: ‘I am not biased on this. I am a reviewer on the CHIOS project that supports this proposal for an ISO standard, and I am amazed by the fact that this standard is more or less principled. On the one hand, the authors really strive to get these principled things and, on the other hand, they have extensively accounted for existing practice. It is the result of a large community of work. It is not perfect, but it really is a starting point for this community.’
He affirmed Dr Ross’s delighted question: ‘So this is an ontology we can borrow?’
And the Italian expert had more to add: ‘The question before was “how can we be sure that the principled things can really solve our problems?” I do not have a crisp answer, but I do have some evidence that even a tiny result on the foundational side has a high pay-off. So you do not need to solve all the foundational issues.’
The distinctions between an object and its role, or individual and classes of items, were delicate but, once understood, could lead to significant data improvement, he said, adding: ‘Take, for instance, the distinction between object and event. This is only one tiny distinction, but it is so fundamental that once you understand it, you can save time in developing your own application ontology. Tiny conceptual progress does have a high pay-off. This is why I believe it is useful.’
There were still one or two doubts, but CULTOS project co-ordinator Wernher Behrendt tidied up with a daring stance: ‘Let me be a heretic for a second. How many of us have an operating system other than Windows? What I am saying is that standardisation often helps. Even if it is not the best standard, it does help get people working together. It is perfectly fair to define a standard for the world now. There will be a lot of discussion, but it will wind down to a few constructs. It is a better method of getting an ontology accepted than having ontologies mushrooming all over the place that must then be integrated.’
The Darmstadt 13 were pleased. They had a model. They had ‘Web Services’ stepping stones to work across. They weren’t going to fall into the trap of insisting just yet that the Semantic Web was important for the heritage sector, but they wanted an education process for unconvinced curators.
They needed someone to make the first ontology move. Seamus Ross suggested asking the J. Paul Getty Trust (http://www.getty.edu). They needed automated tools for testing ontologies. They did not want to be delivered into the entertainment industry, but saw benefit in what University of Florence Associate Professor Franco Niccolucci called ‘cultural entertainment’.
So where would the experts put their money, asked Mr Behrendt: ‘On there not being any benefits in the Semantic Web for the cultural heritage sector, or there being some benefits in building such things, whatever they may be?’
Amsterdamer Frank Nack was in no doubt: ‘It is going to happen. It will probably look very different from how we imagine it right now, but it is going to happen.’ His countryman, ADLIB chief Bert Degenhart-Drenth, thought so too: ‘We have put our money there. All our applications work with XML.’ And Dr van Kersen agreed: ‘I would put my money on the Semantic Web.’
That seemed to make it game, set and match.
CIDOC CRM: ‘The Semantic Glue’
The ‘CIDOC object-oriented Conceptual Reference Model’ (CIDOC CRM) was developed by the ICOM/CIDOC Documentation Standards Group. Since September 2000, the CIDOC CRM has been in development as an ISO standard.
‘The CIDOC CRM is intended to promote a shared understanding of cultural heritage information by providing a common and extensible semantic framework to which any cultural heritage information can be mapped. It is intended to be a common language for domain experts and implementers to formulate requirements for information systems and to serve as a guide for good practice in conceptual modelling. In this way, it can provide the semantic glue needed to mediate between different sources of cultural heritage information, such as that published by museums, libraries and archives.’
http://cidoc.ics.forth.gr/what_is_crm.html
CHIOS - Cultural Heritage Interchange Ontology Standardization project
Since June 2001, the work of the CIDOC CRM Special Interest Group has been supported by CHIOS, a two-year project which receives funding from the Fifth Framework IST Programme. The CHIOS consortium forms an integral part of the CIDOC CRM Special Interest Group which, by organising shared meetings, represents the interests and requirements of the cultural heritage community to the ISO Working Group (ISO/TC46/SC4/WG9).
http://cidoc.ics.forth.gr/chios_iso.html
Annotation & Authoring
‘How to annotate and have fun’, httpannotationsemanticweborgfaqfaq2, and other papers. See also http://annotation.semanticweb.org/tools, Institute for Applied Informatics and Formal Description Methods, University of Karlsruhe, Germany.
The Annotea Project, a ‘Live Early Adoption and Demonstration (LEAD)’ project of the World Wide Web Consortium (W3C): a collaboration environment with shared annotations. www.w3.org/2001/Annotea
CIDOC CRM
The CIDOC Conceptual Reference Model was an important reference point in the Forum discussion (see also the information box on the CIDOC CRM in the Summary). CIDOC, the Comité International pour la Documentation, is part of the International Council of Museums (ICOM). Its Web site can be found at http://cidoc.ics.forth.gr
A report on CIDOC’s work on the CRM is provided in ‘The CIDOC Conceptual Reference Model: A Standard for Communicating Cultural Contents’ by Nick Crofts, Martin Doerr and Tony Gill, in Cultivate Interactive, Issue 9, February 2003, http://www.cultivate-int.org/issue9/chios
See also their tutorials at http://cidoc.ics.forth.gr/tutorials.html
DAML - DARPA Agent Markup Language
The Defense Advanced Research Projects Agency (DARPA) is the central research and development organisation for the US Department of Defense. Its DAML programme is developing a language and tools to facilitate Semantic Web concepts. http://www.daml.org
‘Why Use DAML?’, Adam Pease, white paper, Teknowledge, 10 April 2002, http://www.daml.org/2002/04/why.html
DAML+OIL
DAML+OIL is a product of the DARPA Joint United States/European Union ad hoc Agent Markup Language Committee. The committee created a language with the best features of SHOE, DAML, OIL and several other markup approaches. It is a Web ontology language (latest release March 2001) expected to provide a basis for future Web standards for ontologies. See http://www.daml.org/2001/03/daml+oil-index.html
Genesis - The Creation Stars and Fishes
SEMANTIC WEB TERMS AND READING LIST A-X
Compiled by Guntram Geser and Michael Steemson
In the Summary we have provided information boxes on projects, but not on the many Semantic Web standards, technologies etc. mentioned in the Forum discussion. The projects included are only a small fraction of many ongoing activities and are related mainly to the cultural and scientific heritage community. Links to many important Semantic Web development projects can be found at their community portal, http://www.semanticweb.org
The following guide points to resources and readings on terms mentioned in the Forum Summary. It is not intended to provide a comprehensive list of Semantic Web materials. Rather, it represents different entry points and levels to this topic.
OIL - Ontology Inference Layer
OIL is a proposal for a Web-based representation and inference layer for ontologies. It uses a layered approach to defining a standard ontology language. Each layer adds functionality and complexity to the previous layer. This is done such that machines that can only process a lower layer can still partially understand high-level ontologies. See http://www.ontoknowledge.org/oil
A white paper on OIL functions: ‘An informal description of Standard OIL and Instance OIL’, by the On-To-Knowledge group led by the Department of Computer Science, University of Manchester, UK, 28 November 2000, www.ontoknowledge.org/oil/downl/oil-whitepaper.pdf
Ontologies
Ontology research: Laboratory for Applied Ontology (LOA), Institute of Cognitive Sciences and Technology (ISTC), http://www.ladseb.pd.cnr.it/infor/ontology/ontology.html
Dieter Fensel: Ontologies: A Silver Bullet for Knowledge Management and Electronic Commerce. New York: Springer, 2001.
‘Ontology Infrastructure for the Semantic Web’, includes detailed subject bibliography. WonderWeb Project, Department of Computer Science, Victoria University of Manchester, UK, http://wonderweb.semanticweb.org/deliverables/documents/D15.pdf
Standard Upper Ontology (SUO): an upper ontology for data interoperability, information search and retrieval, automated inferencing and natural language processing. IEEE Standard Upper Ontology (SUO) Working Group, http://suo.ieee.org
OWL - Web Ontology Language
The Web Ontology Language is a semantic markup language for publishing and sharing ontologies on the World Wide Web. OWL is developed as a vocabulary extension of the Resource Description Framework (RDF) and is derived from the DAML+OIL Web Ontology Language. For the development of this language, see the documents of the Web Ontology (WebOnt) Working Group, http://www.w3.org/2001/sw/WebOnt
See also OWL Web Ontology Language Reference (W3C Working Draft, 31 March 2003) at http://www.w3.org/TR/owl-ref and OWL Web Ontology Language Guide (W3C Working Draft, 31 March 2003) at http://www.w3.org/TR/owl-guide
For an understanding of the goals, requirements and usage scenarios for a Web ontology language, see ‘Web Ontology Language (OWL) Use Cases and Requirements’ (W3C Working Draft, 31 March 2003), http://www.w3.org/TR/webont-req
RDF - Resource Description Framework
See A Cultural Heritage Semantic Web Example & Primer, pp. 32-34.
Semantic Web
‘The Semantic Web’, Tim Berners-Lee with James Hendler and Ora Lassila, Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html
‘Enhanced Science and the Semantic Web’, J. A. Hendler, in Science magazine, Volume 299, Number 5606, 24 January 2003, pp. 520-521.
‘Peer-to-Peer: The Infrastructure for the Semantic Web’, Stanford University. The Semantic Web as the next evolutionary step of the Internet. http://p2p.semanticweb.org
The Semantic Web Community Portal, SemanticWeb.org, currently operated by three research groups: the Onto-Agents and Scalable Knowledge Composition (SKC) Research Group at Stanford University, the Ontobroker-Group at the University of Karlsruhe, Germany, and the Protégé Research Group at Stanford University. http://www.semanticweb.org
The W3C Semantic Web Activity Statement explains the consortium’s plans in the areas of enabling standards (driven by the RDF Core and Web Ontology Working Groups), education and outreach (RDF Interest Group), as well as co-ordination and advanced development. http://www.w3.org/2001/sw/Activity
See also Kim Veltman’s warning of what he sees to be too narrow a definition of the Semantic Web, one that will not allow the historical dimension, the richness of cultural expression, the unique and the diversity of interpretations to be adequately dealt with. Cf. K. Veltman: Challenges for a Semantic Web (July 2002), http://www.cultivate-int.org/issue7/semanticweb
SHOE - Simple HTML Ontology Extensions
SHOE was one of the first ontology-based markup languages developed for use on the World Wide Web. It is a small extension to HTML that allows Web page authors to annotate their Web documents with machine-readable knowledge. See http://www.cs.umd.edu/projects/plus/SHOE
SMIL - Multimedia on the Web
The Synchronized Multimedia Integration Language (SMIL, pronounced ‘smile’) enables authoring of interactive audiovisual presentations. SMIL is typically used for ‘rich media’/multimedia presentations which integrate streaming audio and video with images, text or any other media type. SMIL is an HTML-like language and may be written using a simple text editor. W3C Synchronized Multimedia, http://www.w3.org/AudioVideo
SPECTRUM-XML DTD
SPECTRUM (Standard Procedures for Collections Recording Used in Museums) was created by the mda (http://www.mda.org.uk). It is a guide to good practice for museum documentation that describes procedures for documenting objects and the processes they undergo, as well as the necessary information that needs to be recorded to support the procedures. For SPECTRUM, an XML Document Type Definition has been produced which serves as a system-neutral interchange format for museum data.
‘SPECTRUM: The UK Museum Documentation Standard’ is available in its second edition, see http://www.mda.org.uk/spectrum.htm
For a description of the creation, structure and deployment of the SPECTRUM-XML DTD, see Bert Degenhart-Drenth: Building on the mda SPECTRUM-XML DTD for Collections Management Data Interchange. Museums and the Web Conference 2001, http://www.archimuse.com/mw2001/papers/degenhart/degenhart.html
Web Services
Although the concept of Web Services is not a new one, the definition of Web Services standards allows the wider use of these services. Currently, three main protocols are used in the context of Web Services: UDDI (Universal Description, Discovery and Integration), a registry system to find resources and Web Services; WSDL (Web Services Description Language), an interface description language; and SOAP (Simple Object Access Protocol), the communication protocol for Web Services. All three protocols are based on XML.
For a description of the many ways in which XML can enhance Web Services, see http://www.w3.org/2002/ws/Activity
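To show what ‘based on XML’ means for the communication protocol, here is a minimal sketch of a SOAP request message built with Python’s standard library. The envelope namespace is the one used by SOAP 1.1; the service namespace, the operation name `SearchCollection` and its `query` element are hypothetical and stand in for whatever a museum’s WSDL description would actually declare.

```python
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace.
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
# Hypothetical namespace for an imaginary museum search service.
SERVICE_NS = "http://museum.example.org/ws"


def build_request(query):
    """Return a SOAP envelope, as text, wrapping one search operation."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    # The operation and its parameter are made-up names for illustration.
    operation = ET.SubElement(body, f"{{{SERVICE_NS}}}SearchCollection")
    term = ET.SubElement(operation, f"{{{SERVICE_NS}}}query")
    term.text = query
    return ET.tostring(envelope, encoding="unicode")


if __name__ == "__main__":
    print(build_request("Galileo telescope"))
```

In a working deployment this text would be posted over HTTP to the service endpoint, which a client could have located through a UDDI registry and whose expected message layout it would have read from the WSDL description.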
World Wide Web
Looking back as well as into the future: ‘Weaving the Web: The Original Design and Ultimate Destiny of the World Wide Web by Its Inventor’, Tim Berners-Lee with Mark Fischetti, Harper San Francisco, 1999.
XML - eXtensible Mark-up Language
See A Cultural Heritage Semantic Web Example & Primer, pp. 27-30.
For the work done at W3C within the XML activity, see XML Working Groups, http://www.w3.org/XML; the XML Specifications can be found at http://www.w3.org/XML/Core
lsquoThe bane of my existence is doing things that Iknow the computer could do for mersquo ndash DanConnollyThe XML Revolution October 1998httpwwwnaturecomnaturewebmattersxml
‘The Semantic Web as it is advocated by people like Tim Berners-Lee and James Hendler does not take enough advantage of the experience built up in knowledge engineering and conceptual modelling. There is this anarchistic idea of the Web as a place where everyone can do his or her own thing. I have no problem with that; a lot of people are able to find what they want on the Web. But if you want real interoperability, with search engines that can grasp the intended meaning of information, that approach falls short. To create a real Semantic Web we have to develop and use well-founded generic ontologies based on linguistics and logics.’
Nicola Guarino has clear views on the Semantic Web and its development. He is a senior researcher at the Institute for Cognitive Sciences and Technologies in Italy, where he leads the Laboratory for Applied Ontology. Since 1991 he has played an active role in the Artificial Intelligence community in promoting the interdisciplinary study of ontological foundations of knowledge engineering and conceptual modelling. Guarino: ‘In our Laboratory the focus is on content and not so much on representation. The use of ontologies is unavoidable when referring to content. People do it implicitly all the time when they are communicating and trying to understand each other. If we want machines to understand each other, in other words real interoperability, we need to make these ontologies explicit in an unambiguous way.’
An ontology is a hierarchical description of the relations between concepts in a certain domain, plus an unambiguous description of the concepts themselves. As they are created for a certain domain, ontologies often fail to be interoperable because of the ambiguity that results from the use of the same terms for different concepts (and vice versa) between different domains. The term ‘net’, for instance, has quite a different meaning for Web designers and fishermen. That is why there is a need for well-founded generic ontologies. An example of a concept that a generic ontology has to pin down is the term ‘part’, which can have different meanings both within a domain and between domains. For instance, the violist plays a part in the orchestra. His finger is part of him. Can his finger be part of the orchestra? According to Guarino this is a genuine ontological problem that can only be solved by giving an unambiguous meaning to the term ‘part’.
Another example cited by Guarino is the term ‘in’. What exactly are you describing when you say the spoon is in the cup? Does it mean that the spoon is totally embedded in the cup, or is it only partly in the cup? Guarino: ‘These examples seem trivial, but if you want real interoperability between different knowledge domains you will have to prevent the problems that come with the ambiguity of day-to-day language.’
In this respect Guarino thinks it is a drawback that computer science curricula scarcely ever contain an introduction to ontological foundations of conceptual modelling. ‘Students learn all about Java, HTML and C++, and name all the other languages, and they also learn how to use these. But when they graduate they hardly know a thing about formal ontology. I really think people should know more about the work on ontology that has been done in philosophy. It is certainly not much harder to acquire than, say, studying differential equations or learning how to use Java.’
Developing well-founded generic ontologies looks like an enormous job, but it is not as large a task as it appears. Guarino: ‘I would say that a few dozen would get you on the way nicely. But you have to take the fundamental route. At the moment, development of the Semantic Web is driven by the need for short-term results. Hence interoperability is realised by putting the right tags on the information. That is not what I call semantics; that is syntax. XML and RDF are very useful for this, but they fall short when you want to create a real Semantic Web.’
Laboratory of Applied Ontology: http://ontology.ip.rm.cnr.it
SEMANTIC WEB SHOULD BE BASED ON WELL-FOUNDED ONTOLOGIES
AN INTERVIEW WITH NICOLA GUARINO, LABORATORY OF APPLIED ONTOLOGY, ITALY
By Joost van Kasteren
A CULTURAL HERITAGE SEMANTIC WEB EXAMPLE & PRIMER
By Guntram Geser
The Semantic Web, according to a statement of its well-known advocate Tim Berners-Lee, is a vision of ‘a distributed machine which should function so as to perform socially useful tasks’.[1] This machine should allow intelligent software agents to understand semantic relationships between Web resources in order to seek relevant information and perform transactions for humans.
Contrasted with the existing human-readable Web, the Semantic Web is envisaged as a Web of machine-readable data that will be based on ‘languages for expressing information in a machine processable form’.[2] Key to an understanding of the Semantic Web, therefore, is how these languages function: how information is expressed in order that computers can automatically process Web sources and assist in making the Web more useful for humans. The aim of this chapter is to provide an overview of the Semantic Web concept by describing its general architecture, i.e. the interplay of its languages.
The chapter has two interrelated parts. Part 1 describes a Finnish project that strives to build the foundations for the “Finnish Museums on the Semantic Web” (FMS), a future semantic museum portal. This part consists of the information boxes on the following pages, which briefly describe the necessary elements and steps in the set-up of the FMS system. It is recommended to start by reading this description (see also graphic 3 on page 36, which provides an overview of the set-up of the FMS system). It should be helpful in gaining a general understanding of how semantic interoperability of, and new ways of interacting with, semantically marked-up cultural heritage information can be realised.
Part 2, the texts below the information boxes, is a primer that explains terms used in part 1 which represent core elements of the Semantic Web architecture, and provides illustrative examples. The explanations are not intended to give in-depth definitions of these elements; such definitions are provided in the relevant W3C specifications. The examples have been kept as simple as possible, but build on each other. In this way we will develop a (fictitious) Website, http://www.m-i.org, that provides semantically enhanced access to such marvellous medieval images as the ones we have used to illustrate this Thematic Issue.
How to Make Collection Metadata of Museums Semantically Interoperable on the Web - The “Finnish Museums on the Semantic Web” (FMS)
The Semantic Web concept is visionary, and there are dedicated people, also in the heritage sector, who are trying to make it a reality. In our example, a group of researchers and technology developers who work at the University of Helsinki and the Helsinki Institute for Information Technology are translating the Semantic Web vision for a future semantic museum portal.
The group’s two-year project will run until spring 2004 and is being carried out in co-operation with, and with funding from, major organisations including Espoo City Museum, Helsinki University Museum, National Board of Antiquities, Nokia, TietoEnator and the National Technology Agency (TEKES).
[1] Tim Berners-Lee: Interpretation and Semantics on the Semantic Web (1998), http://www.w3.org/DesignIssues/Interpretation.html
[2] Tim Berners-Lee: Semantic Web Road Map (1998), http://www.w3.org/DesignIssues/Semantic.html
[3] Robert DuCharme: http://lists.xml.org/archives/xml-dev/200211/msg00190.html
The major goals of the project are to make collection metadata, which stem from heterogeneous databases, semantically interoperable on the Web, and to provide facilities for semantic browsing and searching in the combined knowledge base of the participating museums.
The project’s vision is called “Finnish Museums on the Semantic Web” (FMS), and its architecture allows for all Finnish museums to join in. However, in an approach of starting small but ambitious, the project is at present using the collection databases of two museums, the Espoo City Museum and the National Museum of Finland. Furthermore, the implementation is currently restricted to only one part of the collections: textiles.
In order to reach the FMS system’s goal of making the museums’ metadata semantically interoperable on the Web, the data must be harmonised on the syntactic and semantic level. For this harmonisation the eXtensible Markup Language (XML) and the Resource Description Framework (RDF) are being used, of which RDF is the key language for achieving semantic interoperability of the heterogeneous sets of metadata.
RDF and Metadata - A Natural Fit
An observer of the diffusion of the Resource Description Framework (RDF) into various domains, Robert DuCharme, has commented: ‘I still find it a little ironic that while RDF has gotten so much publicity as a technology for warm and fuzzy AI (Artificial Intelligence) pie-in-the-sky technology, it’s gotten most of its traction in the mundane world of metadata.’[3]
Yet, given the importance of metadata for the Semantic Web vision in general, it does not come as a surprise that metadata of key information communities belong to the first of RDF’s intended uses. RDF seems to gain momentum in particular among the library and other communities that use Dublin Core.
The current W3C RDF Primer (Working Draft, 23 January 2003), edited by Frank Manola and Eric Miller, labels RDF as ‘an ideal representation for Dublin Core information’ and describes Dublin Core as one of its ‘RDF in the field’ examples (cf. http://www.w3.org/TR/rdf-primer/).
At the Dublin Core Metadata Initiative (DCMI), ‘Expressing Simple Dublin Core in RDF/XML’ was announced as a DCMI Recommendation in October 2002, ‘the first in a series of recommendations for encoding Dublin Core metadata using mainstream Web technologies’, i.e. XML/RDF/XHTML. ‘Expressing Qualified Dublin Core in RDF/XML’ is currently a Proposed Recommendation. Cf. http://dublincore.org/groups/architecture/
Syntactic Transformation 1: Creating the XML Documents
In the FMS system, the eXtensible Markup Language (XML) is used as the data transfer format. This transfer format enables the system to make use of the data originally stored in the museums’ heterogeneous collection databases. Therefore each museum participating in the FMS initiative provides the relevant collection data as an XML document repository.
In a process of syntactic harmonisation, the data from a museum’s collection database are retrieved and transformed to an XML format conforming to the XML Schema of the FMS initiative.
The data to be published are read from the database through a ‘view’, which helps create the XML format. The view is a queryable interface: a virtual table that results from an SQL query, which may join multiple tables of the database. Through the view the data are queried so that the rows of the tables are grouped by collection items. For each item, the set of rows is combined into a single XML document.
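The rows-to-XML step described in this box can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the FMS implementation: the table layout, the view name item_view, the helper names and the namespace http://www.m-i.org/images (taken from the fictitious example Website) are all hypothetical, and the view is assumed to return exactly one row per collection item.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical namespace, borrowed from the fictitious www.m-i.org example.
NS = "http://www.m-i.org/images"
FIELDS = ["type", "subject", "iconclass", "creator", "manuscript", "place", "year"]


def build_database():
    """Create a tiny in-memory database with two tables and a joining view."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE items (item_id TEXT PRIMARY KEY, type TEXT, subject TEXT,
                            iconclass TEXT, creator TEXT, manuscript_id INTEGER);
        CREATE TABLE manuscripts (manuscript_id INTEGER PRIMARY KEY,
                                  manuscript TEXT, place TEXT, year TEXT);
        INSERT INTO manuscripts VALUES (1, 'Historic Bible', 'Utrecht', 'circa 1430');
        INSERT INTO items VALUES ('image5kb78d38i', 'Column Miniature',
                                  'Eve emerges from Adam''s body', '71A3421',
                                  'Alexander Master', 1);
        -- The view is a virtual table defined by a query that joins both tables.
        CREATE VIEW item_view AS
            SELECT item_id, type, subject, iconclass, creator,
                   manuscript, place, year
            FROM items JOIN manuscripts USING (manuscript_id);
    """)
    return con


def rows_to_xml(con):
    """Return {item_id: XML string}, one document per collection item."""
    ET.register_namespace("mi", NS)
    con.row_factory = sqlite3.Row
    documents = {}
    for row in con.execute("SELECT * FROM item_view ORDER BY item_id"):
        root = ET.Element(f"{{{NS}}}image", {"image_id": row["item_id"]})
        for field in FIELDS:
            ET.SubElement(root, f"{{{NS}}}{field}").text = row[field]
        documents[row["item_id"]] = ET.tostring(root, encoding="unicode")
    return documents


docs = rows_to_xml(build_database())
print(docs["image5kb78d38i"])
```

Reading through a view keeps the XML export independent of how each museum has laid out its tables: only the view definition has to change from one database to the next.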
XML
XML is a markup language for describing data. It is a language created to allow anyone to design the structure of their own documents. An XML document contains text that consists of markup in the form of tags and plain text between them, the latter being just pure information (for example <creator>Alexander Master</creator>). XML tags are not predefined; everyone can define his or her own tags.
FMS Documents
A short presentation is provided in Eero Hyvönen et al.: Cultural Semantic Interoperability on the Web. Case Finnish Museums Online, http://iswc2002.semanticweb.org/posters/hyvonen_a4.pdf; for detailed descriptions see Vilho Raatikka, Eero Hyvönen: Ontology-based Semantic Metadata Validation, and Hyvönen, Eero, et al.: Semantic Interoperability on the Web. Case Finnish Museums Online. Both texts can be found in: Towards the Semantic Web and Web Services. Proceedings of the XML Finland 2002 Conference, http://www.cs.helsinki.fi/u/eahyvone/xmlfinland2002/ProceedingsXML2002-final.pdf
ITEM_VIEW (columns of the view): item_id, type, subject, iconclass, creator, manuscript, place, year
XML shares the syntax and bracketed tags of the well-known HyperText Markup Language (HTML), but XML serves a different goal. While HTML is used to define the layout of pages on the WWW, XML is used to define the content of documents, for example to specify that an area of text is the name of a creator.
XML allows for creating markup (e.g. <creator>) that seems to carry some semantics. However, for a computer a tag like <creator> carries as much semantics as a tag like <H1>. A computer simply does not know what a creator is and how the concept creator is related to other concepts (e.g. manuscript). For an XML processor, <H1> and <creator> or <manuscript> are all equally (and totally) meaningless. XML is all about describing data; on its own it does not do anything. There needs to be a processing program that uses the markup to interpret the various elements.
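A small sketch may make this point concrete. It uses the XML parser from the Python standard library; the element names follow the text, while the LABELS table and the render function are hypothetical stand-ins for ‘a processing program’.

```python
import xml.etree.ElementTree as ET

doc = "<record><creator>Alexander Master</creator><H1>Alexander Master</H1></record>"

# The parser reports both children in the same way: a name and some text.
# Nothing in the result says that 'creator' denotes a person or 'H1' a heading.
parsed = [(child.tag, child.text) for child in ET.fromstring(doc)]
print(parsed)

# Any 'meaning' lives in the processing program, here a hypothetical lookup table.
LABELS = {"creator": "Made by"}


def render(tag, text):
    """Interpret a tag only if the program has been told what it stands for."""
    return f"{LABELS[tag]}: {text}" if tag in LABELS else text


rendered = [render(tag, text) for tag, text in parsed]
print(rendered)
```

The interpretation of <creator> sits entirely in the lookup table; remove that entry and the element is treated exactly like <H1>.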
The graphic below illustrates the database rows to XML process as described in the info box. It provides a very simple example of an XML document that describes some data for one of the medieval column miniatures from the Koninklijke Bibliotheek, The Hague, which we were permitted to use for illustrating this Thematic Issue. It includes the Iconclass classification for this image: 71A3421, Eve emerges from Adam’s body (for the hierarchical path of this classification see the section on ontologies).
Short explanations for the XML document (image5kb78d38i.xml) shown in the graphic below:
A well-formed XML document is one that conforms to the XML syntax rules, of which we would like to highlight the following:
(1) The document should begin with the XML declaration, which defines the XML version and the character encoding used in the document. In the example below we use <?xml version="1.0" encoding="ISO-8859-1"?>, i.e. the document conforms to the 1.0 specification of XML and uses the ISO-8859-1 (Latin-1/West European) character set.
(2/10 a) The XML document must contain a single tag pair to define a root element, in our example <mi:image> ... </mi:image>.
Graphic 1: Database rows to XML process
Database rows (item data from database rows, grouped by item):
‘image5kb78d38i’, ‘Column Miniature’, ‘Eve emerges from Adam’s Body’, ‘71A3421’, ‘Alexander Master’, ‘Historic Bible’, ‘Utrecht’, ‘circa 1430’
Rows to XML process
XML document (image5kb78d38i.xml):
(1) <?xml version="1.0" encoding="ISO-8859-1"?>
(2) <mi:image xmlns:mi="http://www.m-i.org/images"
    image_id="image5kb78d38i">
(3) <mi:type>Column Miniature</mi:type>
(4) <mi:subject>Eve emerges from Adam's body</mi:subject>
(5) <mi:iconclass>71A3421</mi:iconclass>
(6) <mi:creator>Alexander Master</mi:creator>
(7) <mi:manuscript>Historic Bible</mi:manuscript>
(8) <mi:place>Utrecht</mi:place>
(9) <mi:year>circa 1430</mi:year>
(10) </mi:image>
(2b) Namespace: Since element names in XML are not fixed, name conflicts can occur when different documents use the same names describing different types of elements. To prevent such conflicts, a unique namespace should be defined using a Uniform Resource Identifier (URI). An XML namespace is a collection of names that are used as element types and attribute names (cf. http://www.w3.org/TR/REC-xml-names/). Our namespace declaration in the start tag of the root element is xmlns:mi="http://www.m-i.org/images".
The namespace prefix mi (for medieval images) functions as a placeholder for the namespace name. It needs to show up in all element tags (e.g. <mi:type> ... </mi:type>).
(2c) image_id="image5kb78d38i": This is the id attribute, which contains the unique identifier of the data source record.
(3-9) All other elements must be within the root element and can themselves have sub-elements (child elements), which must be properly nested within their parent element. Our elements do not have sub-elements.
Other syntax rules are, for example: all start tags must match their end tags and, because XML tags are case sensitive (i.e. the tag <mi:Creator> is different from the tag <mi:creator>), they must also be written with the same case; all elements must have a closing tag; all attribute values must be within quotation marks (e.g. "image5kb78d38i").
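The syntax rules listed above can be tried out with any XML parser. The following sketch uses the parser from the Python standard library; the sample documents and the helper is_well_formed are made up for illustration, and the last lines show how the parser resolves the mi prefix to the namespace name it stands for.

```python
import xml.etree.ElementTree as ET

NS_DECL = 'xmlns:mi="http://www.m-i.org/images"'


def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


samples = {
    "good": f'<mi:image {NS_DECL} image_id="x1"><mi:creator>Alexander Master</mi:creator></mi:image>',
    "case mismatch": f'<mi:image {NS_DECL} image_id="x1"><mi:Creator>Alexander Master</mi:creator></mi:image>',
    "missing end tag": f'<mi:image {NS_DECL} image_id="x1"><mi:creator>Alexander Master</mi:image>',
    "unquoted attribute": f'<mi:image {NS_DECL} image_id=x1></mi:image>',
    "two root elements": f'<mi:image {NS_DECL}/><mi:image {NS_DECL}/>',
}

results = {name: is_well_formed(text) for name, text in samples.items()}
for name, ok in results.items():
    print(f"{name}: {'well-formed' if ok else 'rejected'}")

# The parser replaces the prefix by the namespace name it is a placeholder for.
root = ET.fromstring(samples["good"])
print(root.tag)
```

Only the first sample is accepted. Note that such a check says nothing about whether the document uses the right elements; that is the task of the XML Schema.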
Syntactic Transformation 2: The XML Schema
In order to allow for syntactic harmonisation, the XML documents of the museums should conform to the XML Schema of the FMS initiative. Therefore, when the museums create their XML documents, they validate them against the initiative’s XML Schema. If the documents are valid, the process can continue to the semantic level.
XML Schema
The XML Schema defines the building blocks of an XML document, including:
| elements and attributes that can appear in a document
| which elements are child elements, as well as their order and number
| whether an element is empty or can include text
| the data types for elements and attributes
| as well as default and fixed values for elements and attributes
XML with an XML Schema is designed to be self-descriptive. One of the greatest strengths of XML Schema is that it allows for data typing. The most common data types are xs:string, xs:decimal, xs:integer, xs:boolean, xs:date, xs:time. In the example below, which is the XML Schema for the XML document (image5kb78d38i.xml) shown in graphic 1, we only use the data type xs:string. This data type is used for values that contain character strings.
(1) <?xml version="1.0"?>
(2a) <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
(2b) targetNamespace="http://www.m-i.org/images"
(2c) xmlns="http://www.m-i.org/images"
(2d) elementFormDefault="qualified">
(3) <xs:element name="image">
(4) <xs:complexType>
(5) <xs:sequence>
(6) <xs:element name="type" type="xs:string"/>
(7) <xs:element name="subject" type="xs:string"/>
(8) <xs:element name="iconclass" type="xs:string"/>
(9) <xs:element name="creator" type="xs:string"/>
(10) <xs:element name="manuscript" type="xs:string"/>
(11) <xs:element name="place" type="xs:string"/>
(12) <xs:element name="year" type="xs:string"/>
(13) </xs:sequence>
(14) <xs:attribute name="image_id" type="xs:string" use="required"/>
(15) </xs:complexType>
(16) </xs:element>
(17) </xs:schema>
Short explanations
(1) The XML declaration, which states that the document conforms to the 1.0 specification of XML.
(2a) Determines that the elements and data types that are used to construct the schema come from the W3C’s XML Schema namespace. Consequently, each of the elements and data types in the schema has the prefix xs:, which identifies them as belonging to the vocabulary of the XML Schema language rather than the vocabulary of our (fictitious) organisation M-i.org.
(2b) Indicates that the elements defined by this schema come from our http://www.m-i.org/images namespace.
(2c) Is our default namespace.
(2d) Demands that any elements used by an XML document which were declared in this schema must be namespace qualified.
(3/16) The parent element for the data typing of the image descriptions we provide at http://www.m-i.org/images. It is (4) defined as an xs:complexType, i.e. it contains child elements (6-12), which are (5/13) surrounded by an xs:sequence element that defines an ordered sequence of these elements.
(6-12) The child elements, which in our example are simple types because they do not contain other elements. They define various elements of our XML documents, e.g. “creator”, to be of the data type xs:string.
(14) Furthermore, the Schema determines that for the element xs:element name="image" there is a required attribute image_id of the data type xs:string (for example image5kb78d38i).
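What validation against this schema amounts to can be illustrated with a hand-written check. The Python standard library contains no XML Schema processor, so the sketch below is deliberately not a schema validator: it only re-implements, by hand, three constraints of the example schema (the root element, the ordered sequence of children and the required image_id attribute). The function names check_image and make_doc are hypothetical; in practice a schema-aware XML library would do this work directly from the schema file.

```python
import xml.etree.ElementTree as ET

NS = "http://www.m-i.org/images"
# Child elements in the order fixed by xs:sequence in the example schema.
SEQUENCE = ["type", "subject", "iconclass", "creator", "manuscript", "place", "year"]


def check_image(xml_text):
    """Return a list of problems; an empty list means the document passed."""
    root = ET.fromstring(xml_text)
    problems = []
    if root.tag != f"{{{NS}}}image":
        problems.append("root element is not {%s}image" % NS)
    if root.get("image_id") is None:
        problems.append("required attribute image_id is missing")
    expected = [f"{{{NS}}}{name}" for name in SEQUENCE]
    if [child.tag for child in root] != expected:
        problems.append("child elements differ from the declared sequence")
    if any(len(child) for child in root):
        problems.append("a simple-typed element contains child elements")
    return problems


def make_doc(names, image_id='image_id="image5kb78d38i"'):
    """Build a small test document with the given child element names."""
    body = "".join(f"<mi:{n}>x</mi:{n}>" for n in names)
    return f'<mi:image xmlns:mi="{NS}" {image_id}>{body}</mi:image>'


valid = make_doc(SEQUENCE)
swapped = make_doc(["subject", "type"] + SEQUENCE[2:])
no_id = make_doc(SEQUENCE, image_id="")

for label, text in [("valid", valid), ("swapped", swapped), ("no_id", no_id)]:
    print(label, check_image(text))
```

The second and third documents are well-formed XML, yet each is reported with one problem. This is the difference between well-formedness and validity that the two syntactic transformation steps rely on.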
Major Benefits of XML
In the context of the Semantic Web, XML provides an interoperable syntactical foundation upon which solutions to the issues of representing relationships and meaning can be built. We also want to highlight the many benefits of XML that are adding to its rapid uptake in the first place, and might in the longer term be supportive in realising the Semantic Web vision on a broader scale.
XML is one of the most important standards developments in recent years. It is an international, universal, non-system and non-application specific data exchange standard. XML is international because it employs Unicode. This means that there is no restriction to the western alphabet, but Arabic, Chinese, Greek, Hebrew, Thai etc. can be easily integrated.
XML is non-system specific because it is an open standard set by the World Wide Web Consortium (W3C). As such, there is no owner of XML. All the major software suppliers support it; it can be used on any computing platform, from Windows and MacOS to Linux. This makes it easier for organisations to change systems or combine different systems.
XML is also non-application specific, i.e. it can be used in various applications such as data exchange, data harvesting, Web site management etc. XML is gaining ever-wider acceptance in many application domains, including in particular the cultural heritage community.
Bear in mind also that the major collection management software producers have implemented support for XML in their systems, enabling, for example, the integration of data from different collections and their combination over the Web.
XML allows for multi-channel publishing, i.e. with XML it is easy to produce different products or services from digital cultural heritage assets. Once the data are structured in XML, they can be displayed across a variety of media using an associated style sheet that contains the display information.
Finally, XML can be used to create new languages. For example, the Wireless Markup Language (WML), which is used to mark up Internet applications for handheld devices, is written in XML.
Ontological concepts
The goal of the FMS project is to make metadata of the museums’ textiles collections semantically interoperable on the Web. In order to achieve such interoperability, an ontology is being designed that describes the common (lower-level) ontological concepts in this domain of knowledge.[4]
Ontology
In the Semantic Web architecture the semantic relationships are not embedded, but explicitly represented by an ontology, or rather an interrelated set of ontologies. In fact, the wide array of information residing on the Web, and the perceived need to make it more machine-processable, have acted as a strong impetus for the development of ontology languages.
Yet what is an ontology? An abstract definition of an ontology is that it describes a formal, shared conceptualisation of a particular domain of interest, for example cultural heritage objects held in art museums. In particular, an ontology allows for constraining, expressing and analysing the intended meaning of the shared vocabulary of concepts and relations in a domain of knowledge.[5]
If these concepts and relations are formalised to a high degree, the domain has at hand a major building block for developing semantically aware information systems. With Semantic Web technologies, the domain ontology can be made available on the network, cross-referenced with upper-level and other domain ontologies,[6] and remote applications or intelligent software agents can refer to it when they interact to provide a certain information service.
[4] The available documents (in English) on the FMS initiative state that their ontology is being created using RDF Schema (RDFS). To develop a fully fledged ontology, advanced languages such as DAML+OIL or Web Ontology Language (OWL) would be required.
[5] For more elaborate and formal descriptions see Tom Gruber: What is an Ontology? (1995), http://www-ksl.stanford.edu/kst/what-is-an-ontology.html; Nicola Guarino: Ontology-Driven Conceptual Modelling, part 1-3 (2002), http://ontology.ip.rm.cnr.it/Tutorials
The degree of formalisation of concepts and their relations varies considerably between different domains of knowledge. At the lower end one finds lexicons and simple taxonomies (i.e. ordered classification systems where terms are related hierarchically). At the middle level one might place thesauri, i.e. controlled vocabularies that are structured to show relationships between terms and concepts and, for example, allow for retrieving them from a database. At the high end of formalisation of knowledge there are axiomatised logic theories. Such theories include rules to ensure the well-formedness and logical validity of statements expressed in the language of the scientific discipline.
In the cultural heritage sector, a powerful IT-supported example of a hierarchical classification system is Iconclass. This supports the documentation of images, in particular art historical images, by providing a systematic collection of 28,000 ready-made definitions of objects, persons, events, situations and abstract ideas that can be the subject of an image. The definitions consist of an alphanumeric classification code and its textual correlate.[7]
For example, for the image we have described in XML in graphic 1, the Iconclass definition is 71A3421, Eve emerges from Adam’s body. The Medieval Illustrated Manuscripts Website of the Koninklijke Bibliotheek, The Hague, has an Iconclass Browser in place[8] that provides the hierarchical path for this concept in the classification system.
An outstanding example of a controlled vocabulary is the Art & Architecture Thesaurus (AAT), one of the Getty Research Institute’s Vocabulary Databases. It is a structured vocabulary of more than 125,000 terms and other information about concepts that are used for describing fine art, architecture, decorative arts, archival materials and material culture.[9]
The W3C’s Resource Description Framework (RDF) Schema Specification 1.0, in a section on its scope, mentions concept navigation and states: ‘Thesauri and library classification schemes are well known examples of hierarchical systems for representing subject taxonomies in terms of the relationships between named concepts. The RDF Schema specification provides sufficient resources for creating RDF models that represent the logical structure of thesauri (and other library classification systems).’[10]
Yet for realising a full-blown cultural heritage ontology for the Semantic Web, there are currently limitations on both sides. On the one hand, hierarchical classification systems and structured vocabularies do not lend themselves easily to rich inter-linking of conceptual ‘trees’.
A major step further in this direction is the CIDOC object-oriented Conceptual Reference Model (CRM).[11] This provides an ontology of 81 classes and 130 properties, which describes in a formal language concepts and relations relevant to the documentation of cultural heritage.
On the other hand, RDF Schema has limitations when it comes to expressing complex ontological relationships. New languages based on description logics are being developed. These include DAML+OIL and the upcoming Web Ontology Language (OWL), which are capable of fully describing ontologies.
Also worth highlighting is that tools for ontology building are proliferating at the present time. These ontology editors need to be carefully assessed, as their capabilities differ considerably.[12]
7 Bible
71 Old Testament
71A Genesis from the creation to the expulsion from paradise, and later years of Adam and Eve
71A3 creation of man; the Garden of Eden (Genesis 1:26-2)
71A34 creation of Eve
71A342 Eve is fashioned from Adam’s rib
71A3421 Eve emerges from Adam’s body
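In the path above, every level of the hierarchy is simply a longer prefix of the classification code. A minimal sketch of this idea follows; it assumes that one additional character always means one further level, which holds for this example but is not claimed for Iconclass notations in general. The labels are copied from the path above, and the function name hierarchical_path is hypothetical.

```python
# Labels copied from the hierarchical path quoted above; a real system would
# look them up in the Iconclass database instead.
LABELS = {
    "7": "Bible",
    "71": "Old Testament",
    "71A": "Genesis from the creation to the expulsion from paradise, "
           "and later years of Adam and Eve",
    "71A3": "creation of man; the Garden of Eden",
    "71A34": "creation of Eve",
    "71A342": "Eve is fashioned from Adam's rib",
    "71A3421": "Eve emerges from Adam's body",
}


def hierarchical_path(notation):
    """Return every prefix of a notation, from the top-level class down."""
    return [notation[:n] for n in range(1, len(notation) + 1)]


path = hierarchical_path("71A3421")
for code in path:
    print(code, LABELS.get(code, "(no label in this sketch)"))
```

Because broader concepts are prefixes of narrower ones, a search for 71A34 (creation of Eve) can retrieve images classified as 71A3421 by a simple prefix comparison.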
[6] Upper-level ontologies describe the basic concepts and relationships invoked when information about any domain is expressed in natural language.
[7] For in-depth information see the official Iconclass Website, http://www.iconclass.nl
[8] Medieval Illustrated Manuscripts Website, http://www.kb.nl/kb/manuscripts/browser The subject access system for the Website was conceived by Mnemosyne Partners, building on the Iconclass classification system and technologies. See the valuable information they provide at httpwwwmnemosyneorg businessmsstempexhtml
[9] A detailed description of the AAT is provided at http://www.getty.edu/research/tools/vocabulary/aat/about.html
[10] http://www.w3.org/TR/2000/CR-rdf-schema-20000327
[11] See the information box at the end of the Forum discussion and the sources mentioned in the Semantic Web Terms and Reading List.
[12] Michael Denny: Ontology Building: A Survey of Editing Tools (06-11-2002), http://www.xml.com/pub/a/2002/11/06/ontologies.html
Semantic Transformation 1: The RDF Data Model
Progressing towards semantic interoperability, the metadata in the XML documents are now transformed into RDF statements corresponding to the RDF data model. With these so-called RDF ‘triples’, the XML metadata elements are mapped to the RDF classes and properties which are defined by the RDF Schema of the FMS initiative (see section 3).
‘XML is nothing more than a way to standardize data formats. This is not to underplay XML’s importance. A data-format standard makes all of the more glamorous technologies possible, and RDF is the leading example of the benefit that comes once the data format has been standardized. Many proclaim that RDF is really the XML’s killer app, and with good reason. Despite all this, RDF remains somewhat obscure. This is mainly because, at its core, RDF is very abstract, very dry, and very academic.’
Uche Ogbuji: An introduction to RDF (2000), http://www-106.ibm.com/developerworks/library/w-rdf/?dwzone=xml
RDF Data Model
In order to make Web resources semantically interoperable, we need resources that provide machine-understandable information about themselves. In the Semantic Web architecture these statements are built by using the Resource Description Framework (RDF).
RDF defines a data model for the statements, describing typed relationships between uniquely identified resources. RDF distinguishes between:
| resources: familiar examples are a Web page, an electronic document or a digital image, but in RDF also entities that are not ‘network retrievable’, e.g. museums, curators or bound medieval manuscripts, can be resources
| properties: these identify a specific aspect, characteristic, attribute or relation used to describe the resource
| statements: these associate a value for a named property with the resource
Hence RDF provides a model for describing relationships between resources in terms of named properties and values. The RDF data model intrinsically supports only binary relations. Its base element is the ‘triple’, which takes the form of subject, predicate, object: a resource (the subject) is linked to another resource (the object) through an arc labelled with a third resource (the predicate). The semantics of a triple clearly depends on the property used as predicate.
A convenient way to visualise this is to draw nodes for subject and object and an arrow between them for the predicate (see graphic 2). In this labelled directed graph, subject and predicate (property) are Uniform Resource Identifiers (URIs), and the object is either a URI or a literal (which is drawn as a box). Everything in RDF can be represented by a graph with nodes and arcs, and the data model allows for using the same URI as a node and as an arc label. To represent RDF statements in a machine-processable way, RDF builds on XML. With RDF/XML, a specific XML markup language, RDF information can be represented and exchanged between machines.
Graphic 2: RDF Data Model
Triple 1
Subject: http://www.m-i.org/schemas/images#ColumnMiniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Miniature
Triple 2
Subject: http://www.m-i.org/schemas/images#Miniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Image
With these two triples we state that ColumnMiniature is a subclass of Miniature, and that Miniature is a subclass of Image.
The predicate in our statements is the rdfs:subClassOf property, which is predefined in the RDF Schema namespace http://www.w3.org/2000/01/rdf-schema#
The classes Image, Miniature and ColumnMiniature would also need to be defined in our RDF Schema namespace http://www.m-i.org/schemas/images
For details on how to use RDF Schema for defining your domain ontology, see section 3. In a nutshell, RDF Schema is a ‘higher-level’ language which is itself defined using RDF.
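The two triples of graphic 2 can also be written down as plain data. In the sketch below each triple is a Python tuple of three strings; the helper superclasses is hypothetical, and writing the namespace of the fictitious m-i.org vocabulary with a trailing # is an assumption. Since rdfs:subClassOf is transitive, following the arcs from ColumnMiniature upwards also reaches Image, although no triple states this directly.

```python
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
MI = "http://www.m-i.org/schemas/images#"  # assumed form of the fictitious namespace

# Each statement is a (subject, predicate, object) triple.
triples = {
    (MI + "ColumnMiniature", RDFS + "subClassOf", MI + "Miniature"),
    (MI + "Miniature", RDFS + "subClassOf", MI + "Image"),
}


def superclasses(cls, graph):
    """Follow rdfs:subClassOf arcs upwards and collect every class reached."""
    found = []
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for subject, predicate, obj in sorted(graph):
            if (subject == current and predicate == RDFS + "subClassOf"
                    and obj not in found):
                found.append(obj)
                frontier.append(obj)
    return found


ancestors = superclasses(MI + "ColumnMiniature", triples)
print(ancestors)
```

An application that searches for instances of Image could use such a walk to include column miniatures in its results without any museum having to record that relationship explicitly.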
Semantic Transformation 2: Creating and Validating the RDF Statements
In the mapping process, an editor tool (in the FMS case a tool developed by the project team, called Meedio) receives as input the XML documents and assists in transforming them into semantically valid RDF statements (instance descriptions). The tool serves as instance editor, which provides a convenient way of finding and selecting, from the XML metadata elements, correct instance values for a particular property. The editor tool also serves as semantic metadata validator. When the museum cataloguer saves the set of RDF statements corresponding to an XML document, the semantics of these statements are validated against the property constraints of the FMS ontology. The result of a successful mapping and validation process is a unique set of RDF triples, called the RDF card.
RDF Mapping Rules
When RDF is used to define the meaning of XML metadata elements, a set of mapping rules is created. A mapping rule is a template of RDF triples in which XPath expressions are used to identify the actual element values. XPath is a language for addressing parts of an XML document and was designed for use in XML parsing software (XSLT, XPointer and others).
When applying such a rule to an XML document, the XPath expressions are instantiated with matching element values. If the rule matches, the RDF template evaluates to a set of RDF triples in which the XPath expressions are substituted by the corresponding values of the XML elements.
For example, by applying the template rule
<image5kb78d38itype mi:hasCreator image5kb78d38icreator>
to the XML document described in the section on XML, the following result would be obtained:
<Image Miniature mi:hasCreator 'Alexander Master'>
mi:hasCreator is an example of an RDF property. Such properties are explained in section 3, RDF Schema.
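The way a mapping rule is instantiated can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the FMS editor tool: the XML record, its element names and the function apply_rule are hypothetical, and the XPath expressions are limited to the simple child paths that Python's standard xml.etree.ElementTree module supports.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML record; element names and values are illustrative only.
XML_DOC = """
<image id="5kb78d38i">
  <type>Miniature</type>
  <creator>Alexander Master</creator>
</image>
"""

# A mapping rule: a triple template whose subject and object slots hold
# XPath expressions, evaluated relative to the record's root element.
RULE = ("type", "mi:hasCreator", "creator")


def apply_rule(rule, xml_text):
    """Instantiate the template; return [] if an XPath matches nothing."""
    root = ET.fromstring(xml_text)
    subject_path, predicate, object_path = rule
    subject = root.findtext(subject_path)
    obj = root.findtext(object_path)
    if subject is None or obj is None:
        return []  # the rule does not match this document
    return [(subject.strip(), predicate, obj.strip())]


triples = apply_rule(RULE, XML_DOC)
print(triples)
```

A rule whose XPath expressions find no element simply yields no triples, which corresponds to the case where the rule does not match.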
Note: Due to the limited space permitted, we do not address issues of term mapping. This is an important aspect of the mapping and validation process carried out in the FMS project. Working with metadata from different museums, the project team needs to deal with partly different terminologies. Their technical solution to synonymous terms (i.e. different terms referring to the same concepts) is to attach synonym sets to the FMS ontology classes. Where polysemous terms occur (i.e. the same terms refer to different concepts), the editor tool cannot cope, and the cataloguer needs to select the correct interpretation.
Semantic Transformation 3: The RDF Schema (RDFS)
The shared ontology for the textiles domain is created by using Resource Description Framework Schema (RDFS). An RDF Schema is a tool for indicating the classes of resources one wants to describe, as well as for defining the properties used to describe those resources. Furthermore, class/sub-class relationships and property/sub-property relationships can be defined. The museums are mapping their metadata to the classes and properties defined by the RDF Schema of the FMS initiative. Thereby they are making the meaning of the metadata explicit and representing them in a harmonised, uniform way.
RDF Schema
In the section Semantic Transformation 1 we described the data model provided by RDF for expressing statements about Web resources. But we also need a vocabulary for the RDF statements, namely classes and properties defined with RDF Schema (RDFS).
In brief, the RDF Schema mechanism provides a pre-defined vocabulary, a basic type system that can be used in creating domain-specific schemas. Its role is to allow for declaring metadata properties (e.g. for 'type', 'subject' or 'creator'), to define the classes of resources they may be used with, to restrict possible combinations, and to detect violations of those restrictions.
Defining classes
With RDF Schema (RDFS), Web resources can be defined as instances of one or more classes. In addition, classes can be organised in a hierarchical fashion. As we hold a collection of digital images drawn from illustrated medieval manuscripts, we first need to define a class of things that are images. In RDF Schema, a class is any resource having an rdf:type property whose value is the RDFS-defined resource rdfs:Class.
So, using the basic RDF data model, we define:
mi:Image [resource] rdf:type [property] rdfs:Class [value]
The self-defined prefix mi (for medieval images) stands for the URI reference of our RDF Schema namespace, http://www.m-i.org/schemas/images
In our image collection we have various special kinds of digitised images, such as column miniatures, decorated initials, schematic drawings, etc. To distinguish, for example, the miniatures, we first need to define a general class Miniature and subclasses of miniatures, e.g. a subclass ColumnMiniature:
mi:Miniature rdf:type rdfs:Class
mi:ColumnMiniature rdf:type rdfs:Class
Secondly, we need to define that mi:ColumnMiniature is a subclass of mi:Miniature and that mi:Miniature is a subclass of mi:Image, for which we use the predefined rdfs:subClassOf property:
mi:Miniature rdfs:subClassOf mi:Image
mi:ColumnMiniature rdfs:subClassOf mi:Miniature
As the rdfs:subClassOf property is transitive, this means that mi:ColumnMiniature is also implicitly a subclass of mi:Image.
Graphic 2 on page 32 visualises this with the nodes and arcs of the basic RDF data model.
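The effect of transitivity can be made concrete with a small sketch. It assumes the two direct subclass statements from the text and a hypothetical helper, superclasses, that follows rdfs:subClassOf arcs until no new classes turn up; it is not how any particular RDF Schema processor is implemented.

```python
# Direct rdfs:subClassOf statements from the text: class -> direct superclasses.
SUBCLASS_OF = {
    "mi:ColumnMiniature": {"mi:Miniature"},
    "mi:Miniature": {"mi:Image"},
}


def superclasses(cls, direct=SUBCLASS_OF):
    """Return every class reachable from cls via rdfs:subClassOf."""
    found = set()
    stack = [cls]
    while stack:
        for parent in direct.get(stack.pop(), ()):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


print(sorted(superclasses("mi:ColumnMiniature")))
```

Although only two statements are given, mi:Image appears among the superclasses of mi:ColumnMiniature, which is the implicit relationship described above.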
Defining properties
In order to make the meaning of our metadata (i.e. 'type') explicit, we need to be capable of declaring specific properties that characterise the classes of things we hold at http://www.m-i.org, e.g. digital images of medieval column miniatures.
Basically, RDF Schema defines properties in terms of the classes of resources to which they apply. This is the role of the rdfs:domain and rdfs:range mechanisms.
rdfs:range
The range constraint defines the class or set of classes whose instances can be values of a particular property. If we want to define the property mi:hasType, we must describe this resource (which we locate at http://www.m-i.org/schemas/images) with an rdf:type property whose value is rdf:Property:
mi:hasType [resource] rdf:type [property] rdf:Property [value]
The following RDF statements indicate that mi:ColumnMiniature is a class, that mi:hasType is a property, and that RDF statements using the mi:hasType property have instances of mi:ColumnMiniature as values:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:range mi:ColumnMiniature
rdfs:domain
The domain constraint restricts the set of classes whose instances may have a particular property attached to them. If we want to indicate that the property mi:hasType applies to instances of class mi:ColumnMiniature, we would write:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:domain mi:ColumnMiniature
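A validator of the kind mentioned earlier could check statements against such declarations roughly as follows. The sketch treats domain and range as checkable constraints, as this primer does, and ignores subclass relationships for brevity. The resource names (ex:img1 and so on), their types and the function violations are hypothetical.

```python
# Schema declarations from the text: domain and range of mi:hasType.
DOMAIN = {"mi:hasType": "mi:ColumnMiniature"}
RANGE = {"mi:hasType": "mi:ColumnMiniature"}

# Hypothetical rdf:type assignments; the resource names are made up.
TYPES = {
    "ex:img1": "mi:ColumnMiniature",
    "ex:img2": "mi:ColumnMiniature",
    "ex:img3": "mi:Image",
}


def violations(statement, types=TYPES):
    """List which of the domain/range constraints a triple breaks."""
    subject, prop, obj = statement
    problems = []
    if prop in DOMAIN and types.get(subject) != DOMAIN[prop]:
        problems.append("domain")
    if prop in RANGE and types.get(obj) != RANGE[prop]:
        problems.append("range")
    return problems


print(violations(("ex:img1", "mi:hasType", "ex:img2")))
print(violations(("ex:img3", "mi:hasType", "ex:img1")))
```

An empty list means the statement satisfies both constraints; a statement whose subject is only typed as mi:Image is reported as a domain violation.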
Benefits of RDF
In a SearchWebServices.com definition of RDF, some benefits of RDF are mentioned:
| 'By providing a consistent framework, RDF will encourage the providing of metadata about Internet resources.
| Because RDF will include a standard syntax for describing and querying data, software that exploits metadata will be easier and faster to produce.
| The standard syntax and query capability will allow applications to exchange information more easily.
| Searchers will get more precise results from searching, based on metadata rather than on indexes derived from full text gathering.
| Intelligent software agents will have more precise data to work with.' [13]
This is a well-crafted listing of RDF benefits, from provision and exchange of better metadata to agents working with them, hopefully for the benefit of humans. But, as explicitly stated by SearchWebServices.com, these are only potential benefits, i.e. they depend on the level of actual uptake of RDF.
[13] whatis.com / searchWebServices.com: Definitions - Resource Description Framework, http://searchwebservices.techtarget.com/sDefinition/0,,sid26_gci213545,00.html (last updated July 27, 2001)
Generating and Using the Knowledge Space
The RDF cards represent the original XML documents at the semantic level. The union of such RDF cards constitutes a knowledge base, which is a harmonised semantic representation of the underlying heterogeneous databases.
However, so far the RDF instance descriptions have not left the museum. The museum has complete control of the information it wants to publish, and it does not need to allow the FMS system access to its internal database system. The RDF data are placed in a public directory on the museum's WWW server.
The Web crawler of the FMS system harvests the instance descriptions from the different museums, and the system combines them into an RDF repository. This repository is a large semantic graph that consists of the shared ontology and metadata.
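Combining harvested RDF cards into one repository amounts to taking the union of sets of triples. The following sketch shows the idea only; the museum names, the sample cards and the function build_repository are hypothetical, and the real FMS crawler and repository are not described at this level of detail in the source.

```python
# Shared ontology triples (from the text) and hypothetical harvested RDF cards.
ONTOLOGY = {
    ("mi:ColumnMiniature", "rdfs:subClassOf", "mi:Miniature"),
    ("mi:Miniature", "rdfs:subClassOf", "mi:Image"),
}
HARVESTED = {
    "museum_a": [
        {("ex:img1", "rdf:type", "mi:ColumnMiniature"),
         ("ex:img1", "mi:hasCreator", "Alexander Master")},
    ],
    "museum_b": [
        {("ex:img2", "rdf:type", "mi:Miniature")},
        {("ex:img1", "rdf:type", "mi:ColumnMiniature")},  # repeated statement
    ],
}


def build_repository(ontology, cards_by_museum):
    """Union the shared ontology with every harvested RDF card."""
    repository = set(ontology)
    for cards in cards_by_museum.values():
        for card in cards:
            repository |= card
    return repository


repository = build_repository(ONTOLOGY, HARVESTED)
print(len(repository))
```

Because the repository is a set, a statement that is harvested more than once is stored only once.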
How does a user now search and navigate in this knowledge space? In the FMS system this is implemented by a server-side software called Ontogator. Based on the semantic graph, this software dynamically generates semantic linkages for the user's Web browser.
One way of using the FMS system is view-based filtering. The user can select classes of resources from the ontology, and the system finds the instances that match the selected class restrictions. By constraining classes (views) further, the collection instance data searched for are eventually found.
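View-based filtering can be pictured as selecting the instances whose class is the chosen class or one of its subclasses. The sketch below is a simplified stand-in, not the Ontogator algorithm; the instance names and the functions is_a and filter_view are hypothetical.

```python
# Class hierarchy from the text (class -> direct superclass) and
# hypothetical instances with their rdf:type.
SUBCLASS_OF = {
    "mi:ColumnMiniature": "mi:Miniature",
    "mi:Miniature": "mi:Image",
}
INSTANCES = {
    "ex:img1": "mi:ColumnMiniature",
    "ex:img2": "mi:Miniature",
    "ex:img3": "mi:Image",
}


def is_a(cls, selected):
    """True if cls equals the selected class or is a (transitive) subclass."""
    while cls is not None:
        if cls == selected:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def filter_view(selected):
    """Return the instances that match the selected class restriction."""
    return sorted(r for r, c in INSTANCES.items() if is_a(c, selected))


print(filter_view("mi:Image"))
print(filter_view("mi:ColumnMiniature"))
```

Selecting a more specific class narrows the result, which mirrors the way constraining a view further leads the user to the records sought.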
The software also supports topic-based navigation by providing semantic links between topics of interest, the creation of which is based on the collection domain ontology and the related metadata of the collection records. This means that the links also provide the user with an impression of the wider context and pragmatics of the objects in the museums' collections.
From human users to software agents
As described in the Finnish Museums on the Semantic Web example, the RDF repository is a large semantic graph that consists of the shared ontology and metadata of the participating museums. Such a repository can be queried, and the results, a set of pointers to the relevant resources, can be accessed using Web browsers. The opportunities provided by a system like the one developed by the FMS initiative (e.g. topic-based navigation) are at present restricted primarily to human users.
The Semantic Web vision includes intelligent software agents which 'understand' semantic relationships between Web resources and seek relevant information as well as perform transactions for humans. [14]
This software would be capable of autonomous action, i.e. it could run without direct human control or constant supervision, and ideally would be very flexible in doing this. Characterisations of this flexibility include actions that are 'reactive', 'proactive' and 'social' (see below).
While the basic idea of agents is very intuitive and appealing, the actual theory is complex, the tools are immature, and the solutions are small and prototype-based. In fact, as a parallel, distributed systems technology, agents belong to the most complex class of software technology.
However, this primer will conclude with a summary of what an intelligent software agent is and what such software would generally be capable of doing. This should also serve as an indication of how great the challenge for research and technological development is to make the full Semantic Web vision a reality.
Intelligent Software Agents
The following definitions are taken from Michael Wooldridge's introduction to multiagent systems: [15]
Agent
'An agent is a computer system capable of autonomous action in some environment.'
Intelligent agent
'An intelligent agent is a computer system capable of flexible autonomous action in some environment.'
Flexible autonomous action
'By flexible autonomous action, we mean reactive, proactive, social.'
| Reactivity: 'A reactive system is one that maintains an ongoing interaction with its environment, and responds to changes that occur in it (in time for the response to be useful).'
| Proactiveness: 'An agent serves a purpose and therefore exhibits goal-directed behaviour, including the capacity to recognise opportunities for useful courses of action.'
| Social ability: 'Social ability in agents is the ability to interact with other agents (and possibly humans) via some kind of agent communication language, and perhaps cooperate with others.'
Desirable further properties of agents are:
| Mobility: the ability to move around an electronic network.
| Rationality: an agent will act in such a way that it does not prevent itself from achieving its goals (as far as this is possible with a limited set of beliefs representing its world knowledge).
| Learning: an agent will improve its performance over time.
[14] T. Berners-Lee, J. Hendler, O. Lassila, Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html
[15] M. Wooldridge: An Introduction to Multiagent Systems. Chichester: Wiley, 2002, and http://www.csc.liv.ac.uk/~mjw/pubs/imas/
Resources
In the primer, no references are made to the documents of the World Wide Web Consortium (W3C). All relevant W3C recommendations can be found at http://www.w3c.org
Of the wealth of introductory materials on XML and RDF available on the Web, the following in particular are useful to consult for further details:
http://www.w3schools.com/xml/
http://www.w3schools.com/schema/
http://www.w3.org/TR/rdf-primer/
K. S. Candan, H. Liu, R. Suvarna: Resource Description Framework: Metadata and its Applications. In: SIGKDD Explorations, Vol. 3/1 (2001), 6-19.
Pierre-Antoine Champin: RDF Tutorial (2001), http://www710.univ-lyon1.fr/~champin/rdf-tutorial/rdf-tutorial.html
S. Decker, M. P. Mitra, S. Melnik: Framework for the Semantic Web: An RDF Tutorial, http://www.ida.liu.se/~asmpa/courses/sweb/rdf/rdf_tutorial.pdf
Graphic 3: Set-up of the Finnish Museums on the Semantic Web System
The Finnish Museums on the Semantic Web: overview of the system's set-up. Components shown: collection databases 1 to n (relational schemas, DBMS); XML repositories (XML Schema, syntactic interoperability); metadata editor producing RDF instances (RDF Schema and ontology, semantic interoperability); Web crawler; RDF database holding the semantic graph (knowledge space of shared ontology and metadata); server running the Ontogator software; user client (WWW browser) with topic-based navigation and view-based filtering.
THE DARMSTADT FORUM PARTICIPANTS
Wernher Behrendt, Salzburg Research, Austria
Wernher Behrendt is a Senior Researcher at the Sun Technology and Research Excellence Center at Salzburg Research (Austria), working on multimedia middleware and interoperation issues. He holds an MSc in Cognitive Science from Manchester University and has more than 10 years' experience in near-to-market IT research. From 1989 to 1995 he was a Senior Research Associate in the Informatics Department at Rutherford Appleton Laboratory (UK), working on embedded knowledge-based systems in distributed multimedia presentation systems. From 1995 to 1998 he was a Senior Research Associate at Cardiff University (UK), working on interoperation between heterogeneous information systems. Mr Behrendt has held courses in Computer Science and has worked on projects ranging from software engineering methods and quality assurance to legacy system reengineering using migration methods and distributed systems middleware.
E-mail: wernher.behrendt@salzburgresearch.at
Paolo Buonora, Archivio di Stato di Roma, Italy
Paolo Buonora holds a degree in Philosophy from the University of Rome 'La Sapienza' (1976). He worked in the Italian State Archive Administration from 1978, where he was first involved in editing the Guida Generale degli Archivi di Stato italiani. From 1986 he worked in the Soprintendenza archivistica per il Lazio, surveying audiovisual archives and municipal archives; from 1989 to 1991 he was at Perugia University, engaged in doctoral research in 'Urban and rural history'; and from 1991 to 1994 he was again in the Soprintendenza archivistica per il Lazio. After 1994 he worked in the Archivio di Stato di Roma, where he was responsible for the photograph service and several working groups on informatics application in archival documentation. From 1997 until the present time he has planned and directed the Imago II project in the Archivio di Stato di Roma.
See httpwwwasrmarchivbeniculturaliitsidimagoIMAGOIIenhtml
Samuel Cruz-Lara, University of Nancy 2 and LORIA, France
Samuel Cruz-Lara obtained a Master's degree in Computer Science in 1984 (University Henri Poincaré, Nancy 1) and a PhD degree in Computer Science in 1988 (National Polytechnic Institute of Lorraine). The central topic of his PhD thesis was the generation of integrated development environments by using attribute grammars.
He is currently Associate Professor at the University of Nancy 2 (Institute of Technology, Computer Science Department) and permanent Researcher at LORIA (Lorraine Laboratory for Research in Computer Science and its Applications - UMR 7503 - CNRS - INRIA - Universities of Nancy). He is a member of the 'Language and Dialogue' team and has conducted several research activities on distributed software architectures and textual linguistic resources management. He is currently working in the context of distributed architectures and multimedia resources management. Dr Cruz-Lara has participated in several projects, in particular CNRS SILFIDE and MLIS-ELAN, and he is at present co-leader of the 'Digital Museum' project. This is a joint project between LORIA and the National Chi-Nan University. The 'Digital Museum' project is sponsored by the National Science Council of the Republic of China (Taiwan) and supported by INRIA (France).
Costis Dallas, Critical Publics SA, Greece
Costis Dallas is Chairman and Senior Researcher of Critical Publics (http://www.criticalpublics.com), a group of companies active in the field of strategic communications, creative design and technology. He is currently a Lecturer in the Department of Communication and Mass Media of Panteion University and has over 15 years of research and professional experience in hypermedia applications, human factor issues and cultural information systems. Dr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA, co-founder and Chair of the Multimedia Working Group of the International Council of Museums (CIDOC/ICOM), Head of Documentation and Systems of the Benaki Museum, General Director of the Foundation of the Hellenic World, Special Secretary of the Greek Ministry of Education in charge of libraries, archives and instructional technologies, and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: info@criticalpublics.com
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (http://www.nl.adlibsoft.com), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that he joined Databasix, producer of the ADLIB software. In 1991 Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was later renamed ADLIB Information Systems.
ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include:
CIMI Dublin Core test bed - ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database, in addition to the 'standard' HTML/XML access.
SPECTRUM-XML - A project of the UK mda, together with a team of software vendors, to produce a schema for exchange of data which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software.
Internet Gelderse Musea (http://www.igem.nl) and Maritiem Digitaal (http://www.maritiemdigitaal.nl) - Two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.
E-mail: bert@nl.adlibsoft.com
Nicola Guarino, Italian National Research Council, Italy
Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991 and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS) and associate editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: Nicola.Guarino@ladseb.pd.cnr.it
Janneke van Kersen, Digital Heritage Association, The Netherlands
Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992 she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing, and more recently she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University.
Since October 1999 van Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata, and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations, such as museums, archives, archaeological organisations, monuments and special collections of libraries, in a context-rich and XML-based environment. Interoperability, from both a technical and an organisational perspective, is therefore a major issue. The Association offers access to heritage information to the general public at http://www.cultuurwijzer.nl and customised access for the educational field at http://www.cultuurwijs.nl
E-mail: janneke.vankersen@den.nl
Marco Meli, EDW International, Italy
Marco Meli is CEO and co-founder of EDW International (http://www.edw-international.com), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia and VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meli@edw.it
Paul Miller, UKOLN, UK
Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, http://www.ukoln.ac.uk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK's Further and Higher Education Funding Councils and Resource: the Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI) and the Metadata Working Group of the UK Government's Office of the e-Envoy.
E-mail: P.Miller@ukoln.ac.uk
Frank Nack, CWI, The Netherlands
Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on 'The Application of Video Semantics and Theme Representation for Automated Film Editing' at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling, and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document, and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: Frank.Nack@cwi.nl
Franco Niccolucci, Florence University, Italy
Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference.
See also http://www.geog.port.ac.uk/hist-bound/people/niccolucci.htm
Seamus Ross, HATII, University of Glasgow, UK
Dr Seamus Ross is Director of Glasgow University's Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Access Network) (IST-2001-32706), a European Union funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously, he was Assistant Secretary for Information Technology at the British Academy, and before that worked for a company specialising in expert systems and software development, as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group's PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: S.Ross@hatii.arts.gla.ac.uk
Andrea Scotti, Institute & Museum for the History of Science, Italy
Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995 he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.
Since 1996 Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza and the National Central Library, and under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (010201-300703), funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies.
See http://cweb.inria.fr/Projects/Mesmuses
E-mail: scottian@galileo.imss.firenze.it
DIGICULT PROJECT INFORMATION
DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over the period of 30 months (03/2002-08/2004).
In order to encourage early take-up, DigiCULT produces seven Thematic Issues and three Technology Watch Reports, along with the newsletter DigiCULT.Info.
DigiCULT draws on the results of the strategic study 'Technological Landscapes for Tomorrow's Cultural Economy (DigiCULT)', which was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications), in 2000 and completed in 2001.
Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at http://www.digicult.info
For further information on DigiCULT, please contact the team of the project co-ordinator:
Mr Guntram Geser, guntram.geser@salzburgresearch.at, Phone: +43-(0)662-2288-303
Mr John Pereira, john.pereira@salzburgresearch.at, Phone: +43-(0)662-2288-247
Salzburg Research Forschungsgesellschaft, Jakob-Haringer-Str. 5/III, A-5020 Salzburg, Austria
Phone: +43-(0)662-2288-200, Fax: +43-(0)662-2288-222
http://www.salzburgresearch.at
Project Partner:
HATII - Humanities Advanced Technology and Information Institute, University of Glasgow
http://www.hatii.arts.gla.ac.uk
Contact: Mr Seamus Ross, s.ross@hatii.arts.gla.ac.uk
The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom
DigiCULT Thematic Issue 1 - Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum, held in Barcelona on May 6th, 2002, in the context of the DLM-Conference 2002.
DigiCULT Thematic Issue 2 - Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum, held in Essen, Germany, on September 3rd, 2002, in the context of the AIIM Conference @ DMS EXPO.
DigiCULT Thematic Issue 3 - Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum, held on January 21st, 2003, at Fraunhofer IPSI, Darmstadt, Germany.
DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum, on Learning Objects, that will take place at the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, on July 2nd, 2003.
IMPRINT
This Thematic Issue is a product of the DigiCULTProject (IST-2001-34898)
Authors:
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy
Images:
Images for this Thematic Issue have been provided by and are reproduced with kind permission of the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, Netherlands.
Graphics & Layout:
Jan Steindl, Salzburg Research
ISBN 3-902448-00-8
Printed in Austria
© 2003
IMAGES
Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I
Image on p. 5 from Fol. 264r, size 80x75, illuminator: Orosius Master a.o.
Bible Historiale, Paris, c. 1320-1340, Volume I
Image on p. 7 from Fol. 2r, size 45x50, illum.: 'Sub-Fauvel' Master
Historic Bible, Utrecht, c. 1430, Volume I
Images on pp. 8, 10, 14, 28 from Fol. 3r, 3v, 4v, 7v, size 55x85 to 60x85, illum.: Alexander Master o.a.
Psalter Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468
Images on pp. 11, 13: Wednesday Matins, Invitatorium, from Fol. 219r
Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350
Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron) from Fol. 104rb1, 106rb2, 124r, 111ra, size 40x55 to 50x55
Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335
Image on p. 21 from Fol. 4va1, size 45x55
Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460
Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41
Psalter, Normandy, c. 1180
Image on p. 26 from Fol. 3v, size 160x125
Breviary, Cambray (?), c. 1275-1300
Images on pp. 27, 29, 31, 33, 34, 35, 37 from Fol. 211r, size 205x135
© Koninklijke Bibliotheek, The Hague. Used with permission.
CONCLUSION
Over the next five years the possibilities offered by the Semantic Web will bring little near-term benefit for the heritage sector unless that sector co-ordinates its efforts to ensure that the fundamental building blocks that are necessary for the Semantic Web to be a success are put in place. We need some quick wins. A quick win involves identifying a domain that every institution can be encouraged to represent semantically, and the placing in the public domain of a ‘personalisable agent’ that can take advantage of these semantics. Three factors could underpin a quick win: (a) a narrowly restricted knowledge domain of real public value, (b) an accessible and narrow ontology, and (c) a personalisable tool for processing knowledge.
Ultimately, the same factors that constrain the heritage sector’s ability to take full advantage of the Web will constrain the penetration and pervasiveness of the Semantic Web in the heritage sector. The success of the Semantic Web in the heritage sector depends upon its adopting an XML-based approach and a significant experiment that demonstrates its benefits to the wider community. Even for all its weaknesses, the Semantic Web offers a tantalising solution to the problem of information overload created by the Web, and the heritage sector needs to address how it can take advantage of the opportunities it offers.
Amann B Beeri C Fundulaki I and Scholl M 2002Ontology-Based Integration of XML Resources inI Horrocks and J Hendler (eds)The Semantic Web - ISWC 2002 Berlin Springer 117-131
Berners-LeeT Hendler J and Lassila O 2001The Semantic Web in Scientific American May 2001
Candan KS Liu H and Suvarna R 2001 Resource Description Framework Metadata and Its Applicationin SIGKDD Explorations 31 6-19 httpwwwacmorg sigssigkddexplorationsissue3-1candanpdf
DoanA Madhavan J Domingos P and Halevy A 2002Learning to Map between Ontologies on the Semantic Web in Proceedings WWW2002 7-11 May 2002 (Hono-lulu) 662-673
Goacutemez-PeacuterezA and Corcho O 2002 Ontology Languages for the Semantic Web in IEEE Intelligent Systems January February 2002 54-60
Haustein S and Pleumann J 2002 Is Participation in the Semantic Web Too Difficult in I Horrocks and J Hendler (eds)The Semantic Web -ISWC 2002 Berlin Springer448-453
Heflin J and Hendler JA Portrait of the Semantic Web in Action in IEEE Intelligent Systems MarchApril 2001 54-59
Hendler J 2001Agents and the Semantic Web in IEEE
Intelligent Systems MarchApril 2001 30-37Hendler J Berners-LeeT and Miller E 2002 Integrating
Applications on the Semantic Web in Journal of the Institute of Electrical Engineers of Japan 12010 676-680
Klein M 2001 XML RDF and Relatives in IEEE Intelligent Systems MarchApril 2001 26-28
McGuinness D L Fikes R Hendler J and Stein LA 2002DAML+OILAn Ontology Language for the Semantic Webin IEEE Intelligent Systems SeptemberOctober 2002 72-80
Patel-Schneider P and Simeacuteon J 2002a Building the Semantic Web on XML in I Horrocks and JHendler (eds)The Semantic Web ndashISWC 2002 Berlin Springer 147-161httpwww-dbscsuni-sbdelehress03xml-seminar MaterialPS02pdf
Patel-Schneider P and Simeacuteon J 2002bThe YinYang Web XML Syntax and RDF Semanticsrsquo in Proceedings WWW2002 7-11 May 2002 (Honolulu) 443-453
RenearA Dubin D Sperberg-McQueen CM and Huitfeldt C 2002Towards a Semantics for XML Markupin DocEngrsquo02 8-9 November 2002 (McLeanVA)ACM Publication
Shah U FininT JoshiA Cost R S and Mayfield J 2002Information Retrieval on the Semantic Web in CIKMrsquo024-9 November 2002 461-468
Bibliography
DEVELOPMENT OF THE SEMANTIC WEB MUST BEGIN AT THE GRASS ROOTS LEVEL
AN INTERVIEW WITH JANNEKE VAN KERSEN, DUTCH DIGITAL HERITAGE ASSOCIATION, THE NETHERLANDS
By Joost van Kasteren
‘To be successful, the Semantic Web for the cultural heritage sector will have to develop from the grass roots level. A top-down approach whereby institutions have to squeeze themselves into a certain format is not going to work.’ Janneke van Kersen has strong views on the initiatives that are currently being undertaken to develop a Semantic Web. ‘They are not going to work for the cultural heritage institutions if you do not take into account the position that they are in. Especially not if the institutions are forced to overhaul their digitisation projects completely.’
Kersen graduated in Art History and did a postgraduate course on Historical Information Processing. Since 1999 she has been a consultant with the Dutch Digital Heritage Association (Vereniging DEN), which supports cultural heritage institutions, large and small, in developing strategies to face the digital future.
The key objectives of the DEN are to assist institutions in digitising and documenting their collections according to high quality standards, and assuring cross-domain and cross-institutional access to heritage information in a context-rich, structured environment. The methods used to realise these objectives are knowledge dissemination, best practice and standardisation. The Association propagates open standards like XML, OAI and Dublin Core (qualified).
The DEN has approximately 60 member institutions, among them most of the large heritage institutions of The Netherlands. It provides access to the databases of the member organisations through the portal http://www.cultuurwijzer.nl. The Cultuurwijzer (culture pointer) to the collections uses the Aqua Browser to search for terms in a non-hierarchical, associative way. Databases can also be accessed through subject fields based on the Dublin Core standard. Research is carried out to apply the Art and Architecture Thesaurus in a post-coordinative way, using it as an additional search aid.
Kersen: ‘The Dublin Core has some drawbacks, but it is one of the few international standards available for exchange of information. Mapped to the 5 Ws - who, where, what, when and why - it turns out to be a nice tool for interoperability across the databases of heritage institutions. Of course, we have to accept a certain kind of fuzziness and lack of precision compared with domain-specific access at the institutional level.’
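To make the ‘5 Ws’ idea concrete, here is a minimal Python sketch of how flat Dublin Core records from different institutions might be regrouped under who, what, where, when and why, and then searched across databases. The interview does not give the DEN’s actual mapping, so the table below (including the choice of ‘description’ for ‘why’) and both function names are assumptions made purely for illustration; the sample records borrow from the manuscript credits in this issue.

```python
# Illustrative assumption: this is NOT the DEN's real mapping, only one
# plausible way to group unqualified Dublin Core elements under the 5 Ws.
FIVE_WS = {
    "who": ["creator", "contributor", "publisher"],
    "what": ["title", "subject", "type"],
    "where": ["coverage"],
    "when": ["date"],
    "why": ["description"],
}


def to_five_ws(record):
    """Regroup a flat Dublin Core record (element -> list of values)."""
    grouped = {w: [] for w in FIVE_WS}
    for w, elements in FIVE_WS.items():
        for element in elements:
            grouped[w].extend(record.get(element, []))
    return grouped


def search(records, w, term):
    """Return the records whose values under facet `w` contain `term`."""
    term = term.lower()
    return [
        r for r in records
        if any(term in value.lower() for value in to_five_ws(r)[w])
    ]


# Two records as they might come from two different institutional databases.
bestiary = {
    "title": ["Der Naturen Bloeme"],
    "creator": ["Jacob van Maerlant"],
    "coverage": ["Flanders"],
    "date": ["c. 1350"],
}
psalter = {
    "title": ["Psalter"],
    "coverage": ["Normandy"],
    "date": ["c. 1180"],
}

print(to_five_ws(bestiary)["who"])
print(search([bestiary, psalter], "where", "normandy"))
```

The coarse grouping is exactly the ‘fuzziness’ Kersen accepts: a ‘where’ search cannot tell place of production from place depicted, but it does work across every database that exposes Dublin Core.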
According to Kersen, a real Semantic Web is still a long way off. ‘We simply do not have the tools yet for a meaningful exchange and representation of information. XML and RDF do not provide the interoperability that is needed. On the other hand, I do not believe in developing a fundamental ontology to give meaning to information on the Net. It looks to me like the 18th-century endeavour to write an encyclopaedia that contains all the knowledge in the world. I am afraid it does not work that way. A lot of knowledge, even scientific knowledge, cannot be described in a logical way. Especially in the arts, a lot of “knowledge” is the result of heuristics and associative thinking. Apart from that, there is the practical problem that cultural heritage institutions do not have the money and the staff to describe their collections anew in a way that fits the ontology. That is just too much work.’
Kersen thinks the Semantic Web will grow gradually, from the grass roots level onward. Within the Dutch Digital Heritage Association she can point at several initiatives. Considered on their own they might not seem like much, i.e. in relation to the gargantuan task of developing a Semantic Web, but when they are combined a certain pattern begins to emerge.
There is, for instance, a project, SitecheckerRDF, which will look into ways in which RDF can be used for describing content for Web-based delivery. Furthermore, several standardisation projects are running that enable the participating institutions to develop the description of their specific knowledge domain, e.g. graphic domain, religion, art history. The formal and semantic mapping schemes used in these projects will include Dublin Core, Encoded Archival Description (EAD), IMS Learning Resources Meta-data Specification, Art & Architecture Thesaurus, as well as the CIDOC reference model. Kersen: ‘At the moment most of the reference terms are developed at the level of institutions, which means their use is limited to a certain domain. Or, to put it another way, every domain is developing its own dialect. Maybe the development of a combined reference scheme will be a step towards a Semantic Web.’
Another important project is the development of a scheme for description at the collection level, in order to offer a clearer and more hierarchical access to heritage collections. This project has its roots in the Dutch project for collection level description, MUSIP (Museum Inventarisation Project). The description scheme will be broadened to make it available for other heritage institutions as well, for example archives.
The description and results of these projects and the programme lines will be made available on the Web site of the Vereniging DEN, http://www.den.nl. As Cultuurwijzer is used as a proof of concept, the results will be directly accessible at http://www.cultuurwijzer.nl. (Kersen kindly invites interested parties to put questions directly to their organisation.)
Projects are always carried out in co-operation with the member organisations. Kersen: ‘We first try it ourselves until we are sure it works, a “proof of concept” you could say. These tests are overseen by a small group of automation experts working for our member organisations. The next step is to test the method on a larger scale with some of our member organisations. A larger working group oversees these tests. Only then is the method released to our member organisations. The advantage is that smaller member organisations can ride on the experience of the larger ones. By taking a step-by-step approach we also enhance the level of commitment. You could say we are providing some order in the information chaos that exists on the Internet. A few small steps along the long road towards the Semantic Web.’
Vereniging Digitaal Erfgoed Nederland: http://www.den.nl
Cultuurwijzer: http://www.cultuurwijzer.nl
DIGICULT’S EXPERT 13 TANGLE WITH THE SEMANTIC WEB
By Michael Steemson
Genesis - The Creation: Birds and Fishes
It was really very kind of the Moderator. He asked the cultural heritage experts: ‘What is the message that we should give our scientific writer, who will do a write-up of this meeting for the Thematic Issue?’ Their inclinations had not always been clear. But the question focused minds, and they were certain now.
‘I would put my money on the Semantic Web,’ said two of them, not quite in unison. ‘The Semantic Web is a direction; it is like North. You go North, but you never arrive and say “here it is”. This is the Semantic Web,’ said another.
The course of the debate at the Darmstadt DigiCULT Forum had not always been so direct. It had started with the Position Paper’s dismaying thought that ‘the limited understanding of information processing in the heritage sector almost makes the Semantic Web an impossibility to apply’.
It had touched on the semantics of Simeon poetry, art works of the biblical seductress Salome, weather forecasts for the northern English city of York, and the revolutionary theories of 16th-century Italian astronomer Galileo. But the experts agreed, finally, that the cultural heritage sector needed the Semantic Web, and that a good deal of education and guidance would be required to make it appreciate that need.
The experts numbered 13, lucky for some, at this, the third DigiCULT Forum of the European Union’s technology watchdog for cultural and scientific heritage institutions. In the previous 12 months, other forum groups had discussed authenticity and integrity for digitisation programmes and, later, digital asset management systems. Now the Darmstadt 13 - historians, language and information technology scientists, academics and publishers - were looking even further down the information autobahn, to the vision of WWW inventor, Englishman Tim Berners-Lee, who sees a new kind of automated Web that learns and understands each user’s particular requirements and delivers complete, reliable, tested information sets.
In a co-authored May 2001 Scientific American article,¹ Mr Berners-Lee imagines a family facing the horrors of re-scheduling its lives around a mother’s unexpected illness. The sons and daughters rely on Semantic Web ‘agents’, small executable Web files, to search online medical records, hospital bed lists, transport timetables, doctors’ appointment books, road condition reports and home diaries to find treatment, plan travel and re-arrange personal engagements to fit the emergency.
The vision requires huge, world-wide investment in time and effort, creating countless ‘ontologies’ containing perhaps XML (eXtensible Mark-up Language) and RDF (Resource Description Framework) data, to which the electronic ‘agents’ could refer for understanding before applying to specially formatted Web pages for the information.
The Berners-Lee et al. dazzling forecast is: ‘The Semantic Web will enable machines to comprehend semantic documents and data, not human speech and writings. Properly designed, the Semantic Web can assist the evolution of human knowledge as a whole.’
THE DAZZLING PROSPECTS
Dazzling it is, and the Darmstadt 13 were attracted. But they were not blinded. Moderator Dr Seamus Ross, the Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII), suggested that while the current WWW content got a lot of ‘bang’ for its development dollars, the Semantic Web needed huge, expensive content before it could work well.
Application of the Berners-Lee ideas to cultural heritage use was a long way off, he thought, and wondered: ‘Is there enough benefit from the Semantic Web in the near term to make it a realisable dream 50 years down the road?’
Italian National Research Council Applied Ontologies Laboratory director Nicola Guarino has been working on the subject for 12 years, and he knows the difficulties. He said: ‘This is the ideal view which Tim Berners-Lee has: machines which work for you, your proxy which works for you, performing these dynamic connections for the Web which preserve meaning. It is pretty ambitious, but this is his idea. I would be happier if, rather than using an automatic proxy, we could just let people establish these dynamic connections using their brain and the Web. This is already something that is not easily done.’
Austria’s Wernher Behrendt had encountered other snags. At Salzburg Research, the secretariat for the DigiCULT Forums, he co-ordinates another European Commission IST project, CULTOS (Cultural Units of Learning - Tools and Services). He conceded: ‘There is a 50-year research vision behind the issue of the Semantic Web,’ and went on: ‘but there are incremental steps that, with good utility, can be built in a reasonable time. One of the intellectual challenges is to break the vision into these manageable steps.’
LANGUAGE REPRESENTATION HITCHES
A CULTOS group had, Behrendt explained, taken one of these incremental steps and built an ontology² for digitised works of art. It had encountered problems with language representation, like: ‘Are there tools to support knowledge representation language? Are the users then actually able to work usefully with that? Can we incorporate the multimedia authoring component, where people who have not built the ontology themselves should then use it to combine multimedia assets with each other?’
Cultural Units of Learning - Tools and Services (CULTOS)
CULTOS is an RTD project co-funded by the European Commission under the Information Society Technologies (IST) Programme, which will run until October 2003. The application domain of CULTOS is intertextual studies in literature and arts. The project is developing a multimedia authoring and presentation environment that allows scholars to make the different relationships between cultural works explicit, in a way that approximates contextualisation in interpretative processes. The result of the authoring processes is multimedia objects called ‘intertextual cultural threads’. These are based on ‘EMMOs’, a novel type of structured multimedia object containing expert knowledge, conforming to current and emerging standards such as XML, SMIL (with interactive extensions), MPEG-7 and RDF. http://www.cultos.org
1 T. Berners-Lee, J. Hendler, O. Lassila: The Semantic Web. In: Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html
2 Ontology, n. 1. Philosophy. The branch of metaphysics that deals with the nature of being. 2. Logic. The set of entities presupposed by a theory. Collins English Dictionary, Third edition, Glasgow 1991.
The Dutch had doubts, too. Dr Janneke van Kersen is an art historian with Digital Heritage Netherlands (Digitaal Erfgoed Nederland, http://www.den.nl), where an XML-based content management system is being combined with a Resource Description Framework (RDF) to join databases from several cultural heritage institutions. She told the experts: ‘I need to be assured that we will be able to build a layered structure that is equally applicable to each knowledge domain. Furthermore, I think that the cultural heritage sector is too much of a niche market to develop this.’
Her countryman, Dr Frank Nack, from the national research institute for mathematics and computer science, CWI (Centrum voor Wiskunde en Informatica, http://www.cwi.nl) in Amsterdam, works with a multimedia and human computer interaction group. His concern: ‘Our group believes in the Semantic Web, but we needed some mechanisms to structure the information so that various groups can work with it. What we came up with was the belief that you can classify the user at a particular time. But that was simply not good enough.’
The group had found that users change their requirements widely, and these shifts were invisible to a system. ‘Humans can look at material one day and the next day they look at the same stuff differently and describe it differently, because they are in a different mood,’ he said. He characterised the problem as: ‘Now I would like to see something for my work, and now I want to be entertained. Which means I would like to access the information differently.’
He had one other worry: Webised mixed media. He said: ‘This discussion has been heavily linguistic based, which I can understand because most people do still think of the Web as text driven. The issue of describing various media items that are not text will, I think, very soon become important for the Semantic Web. We had better start thinking about that, too.’
THE GALILEO CONUNDRUM
The Institute and Museum of Science History in Florence, Italy, has tried to create an ontology around the works and sciences of its city’s famous son, the revolutionary astronomer, mathematician and physicist Galileo Galilei (1564-1642).³ But it ran into difficulties when it came to the radical changes in theory that he created.
Institute relational database expert Andrea Scotti told the Forum: ‘Galileo’s scientific theory negates another scientific theory. This negating, or development, of theories was very difficult to represent in the ontology when dealing with the time factor. That is central to historical documentation, but representational time was not part of the process available to us.’
Dr Costis Dallas, the Athens chairman of the European communication and technology group Critical Publics, thought the Florence museum’s project was very ambitious. The time argument was difficult because, ‘of course, it isn’t possible to represent time properly within a relational database’. But there were mechanisms - he mentioned software by the Virginia, US, IT group Telos⁴ - that better represented issues of time.
Italian National Research Council’s Nicola Guarino chipped in: ‘The CIDOC⁵ reference models have partial answers to these questions.’
Dr Dallas went on: ‘I do not believe that the whole exercise is futile, but we found that, in practice, you cannot make a subject language for everybody. It has to be for a community of users. If you provide them with a richer representation, for instance if they can know that this is a person and this person lived in a place, then users will have a much richer experience.’
3 The museum and Web site are rich resources for the life and work of Galileo: http://galileo.imss.firenze.it
4 Telos Corporation, http://www.telos.com, Ashburn, VA, US.
5 CIDOC: International Committee for Documentation of the International Council of Museums, http://www.willpowerinfo.myby.co.uk/cidoc/CIDOCe (ICOM-CIDOC). Forum for documentation interests of museums and related organisations; one of 25 international committees of the International Council of Museums (ICOM).
CAPITALS AND ACRONYMS
The Forum moved on to a discussion of the science and software behind the Semantic Web. Moderator Seamus Ross started by questioning whether ‘key thinkers’ were ‘missing a fundamental point: that Web pages are dead, that database-driven Web pages are the future, that people are going to stop making Web pages and make databases’.
He recalled that James A. Hendler, co-author with Tim Berners-Lee in the Semantic Web paper and a professor in the Department of Computer Science at the University of Maryland, had recommended semantic representation as part of any Web pages. Dr Ross commented: ‘This notion of the Semantic Web is not going to work with these databases. Is this true or not?’
Bert Degenhart-Drenth, the managing director of Netherlands ADLIB Information Systems (http://www.nl.adlibsoft.com), thought it was true. But it was a practical problem that would be solved by projects like the Web Services of the Open Archives Initiative (OAI).
Open Archives Initiative (OAI) Protocol for Metadata Harvesting
The Open Archives Initiative (OAI) is a mainly US-based group of people and organisations that evolved out of a need to increase access to scholarly publications through interoperable digital repositories. Support for the OAI’s goals comes from the Digital Library Federation, the Coalition for Networked Information, and from an NSF grant. One of its major achievements is an application-independent interoperability framework based on metadata harvesting, the OAI Protocol for Metadata Harvesting.
The OAI Protocol is based on the standard Web protocols HTTP and XML, and employs Dublin Core (unqualified) as metadata standard. Heritage organisations who have systems that support the OAI Protocol can expose metadata about the content in their repository, i.e. allow service providers to harvest the data for services such as search engines. In the OAI Protocol the XML schema is used at two levels: to define the format of responses to all OAI Protocol requests, and to define the format of metadata streams embedded in the GetRecord and ListRecords responses. In both cases the goal is to provide a mechanism for data validation. http://www.openarchives.org
The OAI-PMH version 2.0, released in June 2002, can be found at http://www.openarchives.org/OAI/2.0/openarchivesprotocol.htm
See also John Perkins: A New Way of Making Cultural Information Resources Visible on the Web: Museums and the Open Archives Initiative. Museums and the Web Conference 2001, http://www.archimuse.com/mw2001/papers/perkins
Open Archives Forum Project, http://www.oaforum.org
The Open Archives Forum Project is a two-year Fifth Framework Programme IST accompanying measure that will run until September 2003. The Forum is building a Web-based database on OAI-related projects, software implementations and services, and supports the information exchange between OAI user communities.
Their surveys provide good insight into the status of uptake of the OAI in Europe. For an overview of the results, see S. Dobratz, B. Matthaei: Open Archives Activities and Experiences in Europe. An Overview by the Open Archives Forum. In: D-Lib Magazine, Vol. 9, No. 1, January 2003, http://www.dlib.org/dlib/january03/dobratz/01dobratz.html
Recently one of their workshops, ‘Providing Access to Hidden Resources’ (Lisbon, December 2002), targeted the libraries and archives communities. Requirements, standards, best practice and solutions to interoperability problems of these communities were analysed and compared with the features provided by the OAI Protocol for Metadata Harvesting. The Tutorial ‘OAI and OAI-PMH for Beginners’ and other presentations can be found at http://www.oaforum.org/workshops/lisb_programme.php
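For readers who want to see what harvesting under the OAI Protocol looks like in practice, the following Python sketch builds a ListRecords request and reads unqualified Dublin Core out of a response. The verb, the metadataPrefix parameter and the three XML namespaces come from OAI-PMH 2.0; the repository address, the record identifier and the function names are invented for illustration, and the sample response is heavily simplified (a real one also carries responseDate and request elements, and may be paged with resumption tokens). No network call is made.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespaces defined by OAI-PMH 2.0 and by Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build the HTTP GET address for a ListRecords request."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return base_url + "?" + query


# Simplified, hand-written response; identifier and repository are made up.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:psalter-1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Psalter</dc:title>
          <dc:coverage>Normandy</dc:coverage>
          <dc:date>c. 1180</dc:date>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def harvest(xml_text):
    """Return (identifier, {element: [values]}) for each harvested record."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.findall("oai:ListRecords/oai:record", NS):
        identifier = rec.findtext("oai:header/oai:identifier", namespaces=NS)
        fields = {}
        for element in rec.find("oai:metadata/oai_dc:dc", NS):
            name = element.tag.split("}", 1)[1]  # strip the "{namespace}" prefix
            fields.setdefault(name, []).append(element.text)
        records.append((identifier, fields))
    return records


print(list_records_url("http://repository.example.org/oai"))
print(harvest(SAMPLE_RESPONSE))
```

A service provider would fetch the address returned by list_records_url and pass the response body to harvest; the institution only has to expose its records, which is why the approach suits database-driven sites.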
Several members spoke of difficulties created by dynamic Web pages generated from ASP databases. Paul Miller, the UK Interoperability Focus for the University of Bath’s UKOLN (formerly the UK Office for Library Networking) project, said the UK’s Office of the e-Envoy,⁶ the agency in charge of Britain’s e-Government programme, drove it with a Web site called Govtalk.⁷ ‘One big ASP database,’ he called it. It carried the Government’s interoperability standard framework with a URL of ‘a great big long string, something, something, dot ASP’. ‘That is how this key piece of legislation is referred to, and next week it will be something else.’
Dr van Kersen thought the new Web Ontology Language, with the transposed acronym OWL, would help. Italian ontology researcher Nicola Guarino discussed the European Commission’s On-To-Knowledge project, its RDF tool, Sesame,⁸ and the Ontology Inference Layer (OIL) language.
Dr Ross described the US Defense Department’s DARPA (Defense Advanced Research Projects Agency) Mark-up Language (DAML) programme and its DAML+OIL variant. Mr Degenhart-Drenth highlighted the importance of protocols used in Web Services, such as SOAP (Simple Object Access Protocol) and UDDI (Universal Description, Discovery and Integration), and also pointed to a SPECTRUM XML standard for museums that he helped write. Then there was the moviemakers’ audio-visual search standard, MPEG-7, and the SMIL (Synchronized Multimedia Integration Language). No one mentioned SHOE (Simple HTML Ontology Extensions), which was surprising, as the discussion became more and more alphabetic and upper case.
ONTOLOGY TUTORIAL IN 800 WORDS
Nicola Guarino brought the discussion back on track. The ontology expert delivered a fascinating, impromptu, 800-word dissertation on ontology genetics.
Ontologies, he said, started because it was realised that controlled vocabularies, which worked well enough for limited periods, needed something extra to make them really useful. They needed clarification of intended meaning. This could be achieved in much the same way as dictionaries did it, by reference to other, more basic terms. This was, he said, the key point.
‘Ontologies can work if the basic terms are really used in a principled way. There is a hidden assumption here that it is indeed possible to express the meaning in terms of a relatively small set of primitive terms.’
He explained further: ‘There are general terms that have a universal meaning. Take the term “part”, for instance, or “set”. Or take temporal relations: “before”. Suppose I have two different periods, the Renaissance and another period, and suppose I say that this period comes “before” the other one: do you exclude the case whereby the two periods overlap or not? This is just a matter of stipulation; this is a general term that is not domain specific. I can simply stipulate exactly whether the “before” relationship between the two intervals includes the case of overlapping or not. And I can do that by means of axioms. Once you clarify the meaning of the basic terms - topological relations, mereological⁹ relations, dependence relations, these kinds of things - then you have the basic vocabulary that helps you to introduce more domain-related things. And this is what people are doing in the area of what are called “Foundational Ontologies”, and this is what I am doing. I believe this is the only way to solve the problem of semantic interoperability. So, not just controlled vocabularies, but vocabularies that are formally defined in minimal terms.’
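Mr Guarino’s ‘before’ example can be made tangible with a few lines of Python. The sketch below writes down two rival stipulations of ‘before’ for time intervals, one that excludes overlap and one that allows it, so that two systems using the same word can be checked against each other. The class, the function names and the period dates are assumptions chosen for illustration (round numbers, not scholarly datings), and the two definitions are only one possible reading of what his axioms might say.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Period:
    name: str
    start: int  # first year of the period
    end: int    # last year of the period


def before_strict(a, b):
    """Stipulation 1: a is over before b begins, so overlap is excluded."""
    return a.end < b.start


def before_or_overlapping(a, b):
    """Stipulation 2: a begins and ends earlier than b, so overlap is allowed."""
    return a.start < b.start and a.end < b.end


# Illustrative dates only.
renaissance = Period("Renaissance", 1400, 1600)
baroque = Period("Baroque", 1580, 1750)

# The same question gets different answers under the two stipulations.
print(before_strict(renaissance, baroque))
print(before_or_overlapping(renaissance, baroque))
```

Two catalogues that both record ‘Renaissance before Baroque’ agree only if they share a stipulation; stating it formally, once, is the job a foundational ontology takes on.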
Now it was clear, but would it be available to heritage institutions? Dr Ross wanted to know if fundamental ontologies of use to the heritage sector already existed. What would they be? Before any development could begin, basic classifications and methodologies would be required to form a foundation for the work.
On-To-Knowledge
On-To-Knowledge is an IST RTD project that was completed in June 2002. The project developed tools and methods for supporting knowledge management in large and distributed organisations. The technical backbone of On-To-Knowledge was the use of ontologies for the various tasks of information integration and mediation. For the project’s many results see their tools repository, project deliverables and publications at http://www.ontoknowledge.org/pubs.html
See also the On-To-Knowledge book: ‘Towards the Semantic Web: Ontology-driven Knowledge Management’, J. Davies, D. Fensel, F. van Harmelen (eds.), John Wiley, December 2002.
6 Office of the e-Envoy: http://www.e-envoy.gov.uk
7 Govtalk: http://www.govtalk.gov.uk
8 Sesame environment: http://www.ontoknowledge.org/tools/factsheet/Sesame.html
9 Mereology, n. The formal study of the logical properties of the relation of part and whole. Collins English Dictionary, Third edition, Glasgow 1991.
Italian online publisher Marco Meli, a manager of the EU’s MESMUSES (Metaphors for Science Museums) project, insisted: ‘You need a clear definition in this particular domain. What are the key words, the terms?’
Mr Guarino answered: ‘The key concepts and the key relations.’
THE ELUSIVE SEMANTIC GRAIL
By now the Darmstadt 13 were beginning to realise they were not having much luck with their search for the Semantic Grail. Their discussion became a little tetchy.
Someone talked about ‘meta-ontology’ and another growled: ‘The term “meta” is abused.’
‘Outdated technical optimism’ was mentioned.
‘What is your alternative?’ someone else wanted to know.
‘I don’t have one.’
‘Is this the way forward?’
‘What part of the problem would that solve?’
‘We will know when we have tried that out.’
‘Here we are approaching a scary field.’
MESMUSES - Metaphors for Science Museums
MESMUSES is an IST RTD project that will run until July 2003. It aims at designing a general method and supporting tools to produce knowledge maps for use in self-learning environments of science museums. In the project, a knowledge map is defined as a set of related concepts and facts that is offered to learners, with some guidance or suggestions on possible itineraries that they may follow to explore the knowledge space.
The method and tools developed in MESMUSES are being tested and validated by two large science museums, the Cité des Sciences et de l’Industrie in Paris and the Istituto e Museo di Storia della Scienza in Florence, which provide access to their digital catalogues. Both museums are developing knowledge maps and itineraries on different themes in Biology (Genome) and Physics (Galileo and the laws of motion).
Project Web site: http://cweb.inria.fr/Projects/Mesmuses
See also M. Meli: Knowledge Management: a new challenge for science museums. In: Cultivate Interactive, Issue 9, 7 February 2003, http://www.cultivate-int.org/issue9/mesmuses
Civility and peace were restored as Frank Nack, the CWI Netherlands scientist, introduced the thought: ‘There are ontologies for art, and they are very old and well crafted. There are very clear rules about why they did what they did, because they have worked on them for a thousand years. What you could suggest is that we strip down to the basics for one field, say art, and apply it to all the other fields we have in cultural heritage: architecture, film, whatever, all working with very different substances.’
Seamus Ross added: ‘So we need one fundamental ontology on which we can build all the others?’
THE LUCK CHANGES
The luck of the 13 was beginning to change. The Athenian heritage informatics expert, Dr Dallas, described work among his company clients on developing ‘an upper ontology’. All the issues the Forum had been discussing, what to do about time, basic concepts, process, agents and so on, were being examined. They were beginning to develop ‘something very much like a thesaurus’, with term expansion that created sub-categories of relationships. He called it ‘generic layering’, a process that could identify the ‘generic grammar’ of relationships within a specific domain - art history, for example.
‘This is useful,’ he said. ‘This way we can create Web systems that present an association of content for users that is meaningful to them. Let’s say a “cultural” meaning.’
Nicola Guarino went further. He believed that the International Council of Museums’ Committee for Documentation (CIDOC) Conceptual Reference Model (CRM) was the ‘best starting point’ for the heritage community. The CIDOC CRM is the result of 10 years’ work by a standards working group. The model is under review for adoption as a standard of the International Organization for Standardization (ISO).
Mr Guarino was enthusiastic: ‘I am not biased on this. I am a reviewer on the CHIOS project that supports this proposal for an ISO standard, and I am amazed by the fact that this standard is more or less principled. On the one hand, the authors really strive to get these principled things and, on the other hand, they have extensively accounted for existing practice. It is the result of a large community of work. It is not perfect, but it really is a starting point for this community.’
He affirmed Dr Ross’s delighted question: ‘So this is an ontology we can borrow?’
And the Italian expert had more to add: ‘The question before was “how can we be sure that the principled things can really solve our problems?” I do not have a crisp answer, but I do have some evidence that even a tiny result on the foundational side has a high pay-off. So you do not need to solve all the foundational issues.’
The distinctions between an object and its role, or between individuals and classes of items, were delicate but, once understood, could lead to significant data improvement, he said, adding: ‘Take, for instance, the distinction between object and event. This is only one tiny distinction, but it is so fundamental that once you understand it you can save time in developing your own application ontology. Tiny conceptual progress does have a high pay-off. This is why I believe it is useful.’
There were still one or two doubts, but CULTOS project co-ordinator Wernher Behrendt tidied up with a daring stance: ‘Let me be a heretic for a second. How many of us have an operating system other than Windows? What I am saying is that standardisation often helps. Even if it is not the best standard, it does help get people working together. It is perfectly fair to define a standard for the world now. There will be a lot of discussion, but it will wind down to a few constructs. It is a better method of getting an ontology accepted than having ontologies mushrooming all over the place that must then be integrated.’
The Darmstadt 13 were pleased. They had a model. They had ‘Web Services’ stepping stones to work across. They weren’t going to fall into the trap of insisting just yet that the Semantic Web was important for the heritage sector, but they wanted an education process for unconvinced curators.
They needed someone to make the first ontology move. Seamus Ross suggested asking the J. Paul Getty Trust (http://www.getty.edu). They needed automated tools for testing ontologies. They did not want to be delivered into the entertainment industry, but saw benefit in what University of Florence Associate Professor Franco Niccolucci called ‘cultural entertainment’.
So where would the experts put their money, asked Mr Behrendt: ‘On there not being any benefits in the Semantic Web for the cultural heritage sector, or there being some benefits in building such things, whatever they may be?’
Amsterdamer Frank Nack was in no doubt: ‘It is going to happen. It will probably look very different from how we imagine it right now, but it is going to happen.’ His countryman, ADLIB chief Bert Degenhart-Drenth, thought so too: ‘We have put our money there. All our applications work with XML.’ And Dr van Kersen agreed: ‘I would put my money on the Semantic Web.’
That seemed to make it game, set and match.
CIDOC CRM: ‘The Semantic Glue’
The ‘CIDOC object-oriented Conceptual Reference Model’ (CIDOC CRM) was developed by the ICOM/CIDOC Documentation Standards Group. Since September 2000 the CIDOC CRM has been under development into an ISO standard.
‘The CIDOC CRM is intended to promote a shared understanding of cultural heritage information by providing a common and extensible semantic framework to which any cultural heritage information can be mapped. It is intended to be a common language for domain experts and implementers to formulate requirements for information systems and to serve as a guide for good practice in conceptual modelling. In this way, it can provide the semantic glue needed to mediate between different sources of cultural heritage information, such as that published by museums, libraries and archives.’
http://cidoc.ics.forth.gr/what_is_crm.html
CHIOS - Cultural Heritage Interchange Ontology Standardization project
Since June 2001 the work of the CIDOC CRM Special Interest Group has been supported by CHIOS, a two-year project which receives funding from the Fifth Framework IST Programme. The CHIOS consortium forms an integral part of the CIDOC CRM Special Interest Group which, by organising shared meetings, represents the interests and requirements of the cultural heritage community to the ISO Working Group (ISO/TC46/SC4/WG9).
http://cidoc.ics.forth.gr/chios_iso.html
Annotation & Authoring
‘How to annotate and have fun’, http://annotation.semanticweb.org/faq/faq2 and other papers. See also http://annotation.semanticweb.org/tools, Institute for Applied Informatics and Formal Description Methods, University of Karlsruhe, Germany.
The Annotea Project, a ‘Live Early Adoption and Demonstration (LEAD)’ project of the World Wide Web Consortium (W3C): collaboration environment with shared annotations. www.w3.org/2001/Annotea
CIDOC CRM
The CIDOC Conceptual Reference Model was an important reference point in the Forum discussion (see also the information box on the CIDOC CRM in the Summary). CIDOC, the Comité International pour la Documentation, is part of the International Council of Museums (ICOM). Its Web site can be found at http://cidoc.ics.forth.gr
A report on CIDOC’s work on the CRM is provided in ‘The CIDOC Conceptual Reference Model: A Standard for Communicating Cultural Contents’ by Nick Crofts, Martin Doerr and Tony Gill, in: Cultivate Interactive, Issue 9, February 2003, http://www.cultivate-int.org/issue9/chios
See also their tutorials at http://cidoc.ics.forth.gr/tutorials.html
DAML - DARPA Agent Markup Language
The Defense Advanced Research Projects Agency (DARPA) is the central research and development organisation for the US Department of Defense. Its DAML programme is developing a language and tools to facilitate Semantic Web concepts. http://www.daml.org
‘Why Use DAML?’, Adam Pease, white paper, Teknowledge, 10 April 2002, http://www.daml.org/2002/04/why.html
DAML+OIL
DAML+OIL is a product of the DARPA Joint United States/European Union ad hoc Agent Markup Language Committee. The committee created a language with the best features of SHOE, DAML, OIL and several other markup approaches. It is a Web ontology language (latest release March 2001) expected to provide a basis for future Web standards for ontologies. See http://www.daml.org/2001/03/daml+oil-index.html
Genesis - The Creation Stars and Fishes
SEMANTIC WEB TERMS AND READING LIST A-X
Compiled by Guntram Geser and Michael Steemson
In the Summary we have provided information boxes on projects, but not on the many Semantic Web standards, technologies etc. mentioned in the Forum discussion. The projects included are only a small fraction of many ongoing activities and are related mainly to the cultural and scientific heritage community. Links to many important Semantic Web development projects can be found at their community portal, http://www.semanticweb.org
The following guide points to resources and readings on terms mentioned in the Forum Summary. It is not intended to provide a comprehensive list of Semantic Web materials. Rather, it represents different entry points and levels to this topic.
OIL - Ontology Inference Layer
OIL is a proposal for a Web-based representation and inference layer for ontologies. It uses a layered approach to defining a standard ontology language. Each layer adds functionality and complexity to the previous layer. This is done such that machines that can only process a lower layer can still partially understand high-level ontologies. See http://www.ontoknowledge.org/oil
A white paper on OIL functions: ‘An informal description of Standard OIL and Instance OIL’, by the On-To-Knowledge group led by the Department of Computer Science, University of Manchester, UK, 28 November 2000, www.ontoknowledge.org/oil/downl/oil-whitepaper.pdf
Ontologies
Ontology research: Laboratory for Applied Ontology (LOA), Institute of Cognitive Sciences and Technology (ISTC), http://www.ladseb.pd.cnr.it/infor/ontology/ontology.html
Dieter Fensel: Ontologies: A Silver Bullet for Knowledge Management and Electronic Commerce. New York: Springer, 2001.
‘Ontology Infrastructure for the Semantic Web’, includes detailed subject bibliography. WonderWeb Project, Department of Computer Science, Victoria University of Manchester, UK, http://wonderweb.semanticweb.org/deliverables/documents/D15.pdf
Standard Upper Ontology (SUO): An upper ontology for data interoperability, information search and retrieval, automated inferencing and natural language processing. IEEE Standard Upper Ontology (SUO) Working Group, http://suo.ieee.org
OWL - Web Ontology Language
The Web Ontology Language is a semantic markup language for publishing and sharing ontologies on the World Wide Web. OWL is developed as a vocabulary extension of the Resource Description Framework (RDF) and is derived from the DAML+OIL Web Ontology Language. For the development of this language, see the documents of the Web Ontology (WebOnt) Working Group, http://www.w3.org/2001/sw/WebOnt
See also OWL Web Ontology Language Reference (W3C Working Draft, 31 March 2003) at http://www.w3.org/TR/owl-ref and OWL Web Ontology Language Guide (W3C Working Draft, 31 March 2003), http://www.w3.org/TR/owl-guide
For an understanding of the goals, requirements and usage scenarios for a Web ontology language, see ‘Web Ontology Language (OWL) Use Cases and Requirements’ (W3C Working Draft, 31 March 2003), http://www.w3.org/TR/webont-req
RDF - Resource Description Framework
See Cultural Heritage Semantic Web Example & Primer, pp. 32-34
Semantic Web
‘The Semantic Web’, Tim Berners-Lee with James Hendler and Ora Lassila, Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html
‘Enhanced Science and the Semantic Web’, J.A. Hendler, in: Science magazine, Volume 299, Number 5606, 24 January 2003, pp. 520-521
‘Peer-to-Peer: The Infrastructure for the Semantic Web’, Stanford University. The Semantic Web as the next evolutionary step of the Internet. http://p2p.semanticweb.org
The Semantic Web Community Portal, SemanticWeb.org, currently operated by three research groups: the Onto-Agents and Scalable Knowledge Composition (SKC) Research Group at Stanford University, the Ontobroker-Group at the University of Karlsruhe, Germany, and the Protégé Research Group at Stanford University. http://www.semanticweb.org
The W3C Semantic Web Activity Statement explains the consortium’s plans in the areas of enabling standards (driven by the RDF Core and Web Ontology Working Groups), education and outreach (RDF Interest Group), as well as co-ordination and advanced development. http://www.w3.org/2001/sw/Activity
See also Kim Veltman’s warning of what he sees to be too narrow a definition of the Semantic Web, one that will not allow the historical dimension, the richness of cultural expression, the unique and the diversity of interpretations to be adequately dealt with. Cf. K. Veltman: Challenges for a Semantic Web (July 2002), http://www.cultivate-int.org/issue7/semanticweb
SHOE - Simple HTML Ontology Extensions
SHOE was one of the first ontology-based markup languages developed for use on the World Wide Web. It is a small extension to HTML that allows Web page authors to annotate their Web documents with machine-readable knowledge. See http://www.cs.umd.edu/projects/plus/SHOE
SMIL - Multimedia on the Web
The Synchronized Multimedia Integration Language (SMIL, pronounced ‘smile’) enables authoring of interactive audiovisual presentations. SMIL is typically used for ‘rich media’/multimedia presentations which integrate streaming audio and video with images, text or any other media type. SMIL is an HTML-like language and may be written using a simple text editor. W3C Synchronized Multimedia: http://www.w3.org/AudioVideo
SPECTRUM-XML DTD
SPECTRUM (Standard Procedures for Collections Recording Used in Museums) was created by the mda (http://www.mda.org.uk). It is a guide to good practice for museum documentation that describes procedures for documenting objects and the processes they undergo, as well as the information that needs to be recorded to support the procedures. For SPECTRUM, an XML Document Type Definition has been produced which serves as a system-neutral interchange format for museum data.
‘SPECTRUM: The UK Museum Documentation Standard’ is available in its second edition, see http://www.mda.org.uk/spectrum.htm
For a description of the creation, structure and deployment of the SPECTRUM-XML DTD, see Bert Degenhart-Drenth: Building on the mda SPECTRUM-XML DTD for Collections Management Data Interchange. Museums and the Web Conference 2001, http://www.archimuse.com/mw2001/papers/degenhart/degenhart.html
Web Services
Although the concept of Web Services is not a new one, the definition of Web Services standards allows the wider use of these services. Currently three main protocols are used in the context of Web Services: UDDI (Universal Description, Discovery and Integration), a registry system to find resources and Web Services; WSDL (Web Services Description Language), an interface description language; and SOAP (Simple Object Access Protocol), the communication protocol for Web Services. All three protocols are based on XML.
For a description of the many ways in which XML can enhance Web Services, see http://www.w3.org/2002/ws/Activity
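To make the role of XML in SOAP more concrete, the following minimal sketch builds a SOAP 1.1 request envelope with Python's standard library. The envelope namespace is the standard SOAP 1.1 one; the service namespace, the operation name findImages and its parameter are hypothetical and serve only as illustration.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "http://www.m-i.org/services"  # hypothetical service namespace

ET.register_namespace("soap", SOAP_NS)
ET.register_namespace("svc", SERVICE_NS)


def build_request(operation, parameters):
    """Wrap one operation call and its parameters in a SOAP envelope."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in parameters.items():
        ET.SubElement(call, f"{{{SERVICE_NS}}}{name}").text = value
    return ET.tostring(envelope, encoding="unicode")


# Hypothetical call: ask a service for images with a given Iconclass code.
request = build_request("findImages", {"iconclass": "71A3421"})
print(request)
```

In a real deployment the available operations and their parameters would be published in a WSDL description, and the service itself could be located through a UDDI registry.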
World Wide Web
Looking back as well as into the future: ‘Weaving the Web: The Original Design and Ultimate Destiny of the World Wide Web by Its Inventor’, Tim Berners-Lee with Mark Fischetti, Harper San Francisco, 1999
XML - eXtensible Markup Language
See Cultural Heritage Semantic Web Example & Primer, pp. 27-30
For the work done at W3C within the XML activity, see XML Working Groups, http://www.w3.org/XML; the XML Specifications can be found at http://www.w3.org/XML/Core
‘The bane of my existence is doing things that I know the computer could do for me.’ - Dan Connolly: The XML Revolution, October 1998, http://www.nature.com/nature/webmatters/xml
‘The Semantic Web as it is advocated by people like Tim Berners-Lee and James Hendler does not take enough advantage of the experience built up in knowledge engineering and conceptual modelling. There is this anarchistic idea of the Web as a place where everyone can do his or her own thing. I have no problem with that; a lot of people are able to find what they want on the Web. But if you want real interoperability, with search engines that can grasp the intended meaning of information, that approach falls short. To create a real Semantic Web we have to develop and use well-founded generic ontologies based on linguistics and logics.’
Nicola Guarino has clear views on the Semantic Web and its development. He is a senior researcher at the Institute for Cognitive Sciences and Technologies in Italy, where he leads the Laboratory for Applied Ontology. Since 1991 he has played an active role in the Artificial Intelligence community in promoting the interdisciplinary study of ontological foundations of knowledge engineering and conceptual modelling. Guarino: ‘In our Laboratory the focus is on content and not so much on representation. The use of ontologies is unavoidable when referring to content. People do it implicitly all the time when they are communicating and trying to understand each other. If we want machines to understand each other, in other words real interoperability, we need to make these ontologies explicit in an unambiguous way.’
An ontology is a hierarchical description of the relations between concepts in a certain domain, plus an unambiguous description of the concepts themselves. As they are created for a certain domain, ontologies often fail to be interoperable because of the ambiguity that results from the use of the same terms for different concepts (and vice versa) in different domains. The term ‘net’, for instance, has quite a different meaning for Web designers and fishermen. That is why there is a need for well-founded generic ontologies. An example of a concept that a generic ontology has to pin down is the term ‘part’, which can have different meanings both within a domain and between domains. For instance, the violist plays a part in the orchestra. His finger is part of him. Can his finger be part of the orchestra? According to Guarino, this is a genuine ontological problem that can only be solved by giving an unambiguous meaning to the term ‘part’.
Another example cited by Guarino is the term ‘in’. What exactly are you describing when you say the spoon is in the cup? Does it mean that the spoon is totally embedded in the cup, or is it only partly in the cup? Guarino: ‘These examples seem trivial, but if you want real interoperability between different knowledge domains you will have to prevent the problems that come with the ambiguity of day-to-day language.’
In this respect, Guarino thinks it is a drawback that computer science curricula scarcely ever contain an introduction to ontological foundations of conceptual modelling: ‘Students learn all about Java, HTML and C++, and name all the other languages, and they also learn how to use these. But when they graduate they hardly know a thing about formal ontology. I really think people should know more about the work on ontology that has been done in philosophy. It is certainly not much harder to acquire than, say, studying differential equations or learning how to use Java.’
It seems as if it is an enormous job to develop well-founded generic ontologies, but it is not as enormous a task as it appears. Guarino: ‘I would say that a few dozen would get you on the way nicely. But you have to take the fundamental route. At the moment, development of the Semantic Web is driven by the need for short-term results. Hence interoperability is realised by putting the right tags on the information. That is not what I call semantics; that is syntax. XML and RDF are very useful for this, but they fall short when you want to create a real Semantic Web.’
Laboratory for Applied Ontology: http://ontology.ip.rm.cnr.it
SEMANTIC WEB SHOULD BE BASED ON WELL-FOUNDED ONTOLOGIES
AN INTERVIEW WITH NICOLA GUARINO, LABORATORY FOR APPLIED ONTOLOGY, ITALY
By Joost van Kasteren
The Semantic Web, according to a statement of its well-known advocate Tim Berners-Lee, is a vision of ‘a distributed machine which should function so as to perform socially useful tasks’ [1]. This machine should allow intelligent software agents to understand semantic relationships between Web resources in order to seek relevant information and perform transactions for humans.
Contrasted with the existing human-readable Web, the Semantic Web is envisaged as a Web of machine-readable data that will be based on ‘languages for expressing information in a machine processable form’ [2]. Key to an understanding of the Semantic Web, therefore, is how these languages function: how information is expressed in order that computers can automatically process Web sources and assist in making the Web more useful for humans. The aim of this chapter is to provide an overview of the Semantic Web concept by describing its general architecture, i.e. the interplay of its languages.
The chapter has two interrelated parts. Part 1 describes a Finnish project that strives to build the foundations for the “Finnish Museums on the Semantic Web” (FMS), a future semantic museum portal. This part consists of the information boxes on the following pages, which briefly describe the necessary elements and steps in the set-up of the FMS system. It is recommended to start by reading this description (see also graphic 3 on page 36, which provides an overview of the set-up of the FMS system). It should be helpful in gaining a general understanding of how semantic interoperability of, and new ways of interacting with, semantically marked-up cultural heritage information can be realised.
Part 2, the texts below the information boxes, is a primer that explains terms used in part 1 which represent core elements of the Semantic Web architecture, and provides illustrative examples. The explanations are not intended to give in-depth definitions of these elements; such definitions are provided in the relevant W3C specifications. The examples have been kept as simple as possible, but build on each other. In this way we will develop a (fictitious) Website, http://www.m-i.org, that provides semantically enhanced access to such marvellous medieval images as the ones we have used to illustrate this Thematic Issue.
How to Make Collection Metadata of Museums Semantically Interoperable on the Web - The “Finnish Museums on the Semantic Web” (FMS)
The Semantic Web concept is visionary, and there are dedicated people, also in the heritage sector, who are trying to make it a reality. In our example, a group of researchers and technology developers who work at the University of Helsinki and the Helsinki Institute for Information Technology are translating the Semantic Web vision for a future semantic museum portal.
The group’s two-year project will run until spring 2004 and is being carried out in co-operation with, and with funding from, major organisations including Espoo City Museum, Helsinki University Museum, National Board of Antiquities, Nokia, TietoEnator and the National Technology Agency (TEKES).
A CULTURAL HERITAGE SEMANTIC WEB EXAMPLE & PRIMER
By Guntram Geser
[1] Tim Berners-Lee: Interpretation and Semantics on the Semantic Web (1998), http://www.w3.org/DesignIssues/Interpretation.html
[2] Tim Berners-Lee: Semantic Web Road Map (1998), http://www.w3.org/DesignIssues/Semantic.html
[3] Robert DuCharme, http://lists.xml.org/archives/xml-dev/200211/msg00190.html
The major goals of the project are to make collection metadata, which stem from heterogeneous databases, semantically interoperable on the Web and to provide facilities for semantic browsing and searching in the combined knowledge base of the participating museums.
The project’s vision is called “Finnish Museums on the Semantic Web” (FMS), and its architecture allows for all Finnish museums to join in. However, in an approach of starting small but ambitious, the project is at present using the collection databases of two museums, the Espoo City Museum and the National Museum of Finland. Furthermore, the implementation is currently restricted to only one part of the collections - textiles.
In order to reach the FMS system’s goal of making the museums’ metadata semantically interoperable on the Web, the data must be harmonised on the syntactic and semantic level. For this harmonisation, the eXtensible Markup Language (XML) and the Resource Description Framework (RDF) are being used, of which RDF is the key language for achieving semantic interoperability of the heterogeneous sets of metadata.
RDF and Metadata - A Natural Fit
An observer of the diffusion of the Resource Description Framework (RDF) into various domains, Robert DuCharme, has commented: ‘I still find it a little ironic that, while RDF has gotten so much publicity as a technology for warm and fuzzy AI (Artificial Intelligence) pie-in-the-sky technology, it’s gotten most of its traction in the mundane world of metadata.’ [3]
Yet, given the importance of metadata for the Semantic Web vision in general, it does not come as a surprise that metadata of key information communities belong to the first of RDF’s intended uses. RDF seems to gain momentum in particular among the library and other communities that use Dublin Core.
The current W3C RDF Primer (Working Draft, 23 January 2003), edited by Frank Manola and Eric Miller, labels RDF as ‘an ideal representation for Dublin Core information’ and describes Dublin Core as one of their ‘RDF in the field’ examples (cf. http://www.w3.org/TR/rdf-primer).
At the Dublin Core Metadata Initiative (DCMI), ‘Expressing Simple Dublin Core in RDF/XML’ was announced as a DCMI Recommendation in October 2002, ‘the first in a series of recommendations for encoding Dublin Core metadata using mainstream Web technologies’, i.e. XML/RDF/XHTML. ‘Expressing Qualified Dublin Core in RDF/XML’ is currently a Proposed Recommendation. Cf. http://dublincore.org/groups/architecture
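As an illustration of what Simple Dublin Core in RDF/XML can look like, the following sketch serialises a few Dublin Core statements about one image using Python's standard library. The namespace URIs are those of RDF and of the Dublin Core Element Set 1.1; the resource URI is an assumption based on the fictitious m-i.org site, and the property values are taken from the sample image record used later in this chapter. It is a minimal sketch, not the encoding used by any particular project.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core Element Set 1.1

ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("dc", DC_NS)


def dublin_core_rdf(resource_uri, properties):
    """Serialise (element, value) pairs as one rdf:Description in RDF/XML."""
    rdf = ET.Element(f"{{{RDF_NS}}}RDF")
    description = ET.SubElement(
        rdf, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": resource_uri}
    )
    for element, value in properties:
        ET.SubElement(description, f"{{{DC_NS}}}{element}").text = value
    return ET.tostring(rdf, encoding="unicode")


# Hypothetical resource URI on the fictitious m-i.org site.
record = dublin_core_rdf(
    "http://www.m-i.org/images/image5kb78d38i",
    [
        ("creator", "Alexander Master"),
        ("subject", "Eve emerges from Adam's body"),
        ("type", "Column Miniature"),
        ("date", "circa 1430"),
    ],
)
print(record)
```

Each dc: element reads as one statement about the resource named in rdf:about, for example "this image has the creator Alexander Master".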
Syntactic Transformation 1: Creating the XML Documents
In the FMS system, the eXtensible Markup Language (XML) is used as the data transfer format. This transfer format enables the system to make use of the data originally stored in the museums’ heterogeneous collection databases. Therefore each museum participating in the FMS initiative provides the relevant collection data as an XML document repository.
In a process of syntactic harmonisation, the data from a museum’s collection database are retrieved and transformed to an XML format conforming to the XML Schema of the FMS initiative.
The data to be published are read from the database through a ‘view’, which helps create the XML format. The view is a queryable interface, a virtual table that results from an SQL query, which may join multiple tables of the database. Through the view, the data are queried so that the rows of the tables are grouped by collection items. For each item, the set of rows is combined into a single XML document.
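The view-to-XML step can be sketched as follows, assuming an in-memory SQLite database. The two-table layout, the table names and the function rows_to_xml are hypothetical (the actual FMS databases are not described here); only the column names and the sample values follow the example record of this chapter.

```python
import sqlite3
import xml.etree.ElementTree as ET
from itertools import groupby

MI_NS = "http://www.m-i.org/images"  # namespace of the fictitious m-i.org site
ET.register_namespace("mi", MI_NS)

# Hypothetical two-table layout standing in for a museum database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item (item_id TEXT PRIMARY KEY, type TEXT, subject TEXT,
                       iconclass TEXT);
    CREATE TABLE origin (item_id TEXT, creator TEXT, manuscript TEXT,
                         place TEXT, year TEXT);
    INSERT INTO item VALUES ('image5kb78d38i', 'Column Miniature',
                             'Eve emerges from Adam''s body', '71A3421');
    INSERT INTO origin VALUES ('image5kb78d38i', 'Alexander Master',
                               'Historic Bible', 'Utrecht', 'circa 1430');
    -- The view: a virtual table defined by an SQL query joining both tables.
    CREATE VIEW item_view AS
        SELECT item_id, type, subject, iconclass,
               creator, manuscript, place, year
        FROM item JOIN origin USING (item_id);
""")


def rows_to_xml(connection):
    """Return {item_id: XML string}, one XML document per collection item."""
    cursor = connection.execute("SELECT * FROM item_view ORDER BY item_id")
    columns = [column[0] for column in cursor.description]
    documents = {}
    # Rows arrive sorted by item_id, so groupby yields one group per item.
    for item_id, rows in groupby(cursor, key=lambda row: row[0]):
        root = ET.Element(f"{{{MI_NS}}}image", {"image_id": item_id})
        seen = set()
        for row in rows:
            for name, value in zip(columns[1:], row[1:]):
                if (name, value) not in seen:  # skip values repeated by the join
                    seen.add((name, value))
                    ET.SubElement(root, f"{{{MI_NS}}}{name}").text = value
        documents[item_id] = ET.tostring(root, encoding="unicode")
    return documents


docs = rows_to_xml(conn)
print(docs["image5kb78d38i"])
```

With this sample data each item yields a single row; a join that returns several rows for one item would contribute any additional distinct values to the same document.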
XML
XML is a markup language for describing data. It is a language created to allow anyone to design the structure of their own documents. An XML document contains text that consists of markup in the form of tags and plain text between them, the latter being just pure information (for example <creator>Alexander Master</creator>). XML tags are not predefined; everyone can define his or her own tags.
FMS Documents
A short presentation is provided in Eero Hyvönen et al.: Cultural Semantic Interoperability on the Web. Case Finnish Museums Online, http://iswc2002.semanticweb.org/posters/hyvonen_a4.pdf; for detailed descriptions see Vilho Raatikka, Eero Hyvönen: Ontology-based Semantic Metadata Validation, and Eero Hyvönen et al.: Semantic Interoperability on the Web. Case Finnish Museums Online. Both texts can be found in: Towards the Semantic Web and Web Services. Proceedings of the XML Finland 2002 Conference, http://www.cs.helsinki.fi/u/eahyvone/xmlfinland2002/ProceedingsXML2002-final.pdf
Database view ITEM_VIEW (columns): item_id, type, subject, iconclass, creator, manuscript, place, year
XML shares the syntax and bracketed tags of the well-known HyperText Markup Language (HTML), but XML serves a different goal. While HTML is used to define the layout of pages on the WWW, XML is used to define the content of documents, for example to specify that an area of text is the name of a creator.
XML allows for creating markup (e.g. <creator>) that seems to carry some semantics. However, for a computer a tag like <creator> carries as much semantics as a tag like <H1>. A computer simply does not know what a creator is and how the concept creator is related to other concepts (e.g. manuscript). For an XML processor, <H1> and <creator> or <manuscript> are all equally (and totally) meaningless. XML is all about describing data; on its own it does not do anything. There needs to be a processing program that uses the markup to interpret the various elements.
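The point can be illustrated with a few lines of Python, using the standard library parser. The record element and the function creators are our own assumptions for this sketch: the parser hands back tag names as plain strings, and it is the surrounding program that decides that one of them denotes a creator.

```python
import xml.etree.ElementTree as ET

doc = "<record><creator>Alexander Master</creator><H1>Alexander Master</H1></record>"
root = ET.fromstring(doc)

# The parser reports names and text; it attaches no meaning to either name.
parsed = [(child.tag, child.text) for child in root]
print(parsed)


def creators(element):
    """Application-level rule (our assumption): <creator> holds a maker's name."""
    return [node.text for node in element.iter("creator")]


# Meaning enters only here, through a rule written by the programmer.
print(creators(root))
```

Renaming <creator> to any other string would make no difference to the parser; only the rule in creators would have to change.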
The graphic below illustrates the database rows to XML process as described in the info box. It provides a very simple example of an XML document that describes some data for one of the medieval column miniatures from the Koninklijke Bibliotheek, The Hague, which we were permitted to use for illustrating this Thematic Issue. It includes the Iconclass classification for this image, 71A3421: Eve emerges from Adam’s body (for the hierarchical path of this classification, see the section on ontologies).
Short explanations for the XML document (image5kb78d38i.xml) shown in the graphic below:
A well-formed XML document is one that conforms to the XML syntax rules, of which we would like to highlight the following:
(1) The document should begin with the XML declaration (optional in XML 1.0, but recommended), which defines the XML version and the character encoding used in the document. In the example below we use <?xml version="1.0" encoding="ISO-8859-1"?>, i.e. the document conforms to the 1.0 specification of XML and uses the ISO-8859-1 (Latin-1/West European) character set.
(2a, 10) The XML document must contain a single tag pair to define a root element, in our example <mi:image> </mi:image>.
Graphic 1: Database rows to XML process
Database rows, grouped by item. Item data from the database rows: image5kb78d38i, ‘Column Miniature’, ‘Eve emerges from Adam’s body’, ‘71A3421’, ‘Alexander Master’, ‘Historic Bible’, ‘Utrecht’, ‘circa 1430’
XML document image5kb78d38i.xml, the result of the rows to XML process:
(1) <?xml version="1.0" encoding="ISO-8859-1"?>
(2) <mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">
(3) <mi:type>Column Miniature</mi:type>
(4) <mi:subject>Eve emerges from Adam's body</mi:subject>
(5) <mi:iconclass>71A3421</mi:iconclass>
(6) <mi:creator>Alexander Master</mi:creator>
(7) <mi:manuscript>Historic Bible</mi:manuscript>
(8) <mi:place>Utrecht</mi:place>
(9) <mi:year>circa 1430</mi:year>
(10) </mi:image>
(2b) Namespace: Since element names in XML are not fixed, name conflicts can occur when different documents use the same names to describe different types of elements. To prevent such conflicts, a unique namespace should be defined using a Uniform Resource Identifier (URI). An XML namespace is a collection of names that are used as element types and attribute names (cf. http://www.w3.org/TR/REC-xml-names). Our namespace declaration in the start tag of the root element is xmlns:mi="http://www.m-i.org/images".
The namespace prefix mi (for medieval images) functions as a placeholder for the namespace name. It needs to show up in all element tags (e.g. <mi:type> </mi:type>).
(2c) image_id="image5kb78d38i": This is the id attribute, which contains the unique identifier of the data source record.
(3-9) All other elements must be within the root element and can themselves have sub-elements (child elements), which must be properly nested within their parent element. Our elements do not have sub-elements.
Other syntax rules are, for example: all start tags must match their end tags and, because XML tags are case sensitive (i.e. the tag <mi:Creator> is different from the tag <mi:creator>), they must also be written with the same case; all elements must have a closing tag; all attribute values must be within quotation marks (e.g. "image5kb78d38i").
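The syntax rules above can be tried out with any XML parser. The following minimal sketch uses Python's standard library: the function is_well_formed is our own helper, and the shortened sample document and its broken variants are constructed for illustration only.

```python
import xml.etree.ElementTree as ET

MI_NS = "http://www.m-i.org/images"  # namespace of the fictitious m-i.org site


def is_well_formed(text):
    """Return True if the XML parser accepts the text, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


good = (
    f'<mi:image xmlns:mi="{MI_NS}" image_id="image5kb78d38i">'
    "<mi:creator>Alexander Master</mi:creator>"
    "</mi:image>"
)
# Each variant breaks exactly one of the rules discussed above.
broken = {
    "case mismatch": good.replace("</mi:creator>", "</mi:Creator>"),
    "unquoted attribute": good.replace('"image5kb78d38i"', "image5kb78d38i"),
    "missing end tag": good.replace("</mi:creator>", ""),
    "two root elements": good + good,
}

print("good:", is_well_formed(good))
for label, text in broken.items():
    print(label + ":", is_well_formed(text))

# The prefix is only a placeholder: a different prefix bound to the same
# namespace URI yields exactly the same expanded element name.
other = (
    f'<img:image xmlns:img="{MI_NS}">'
    "<img:creator>Alexander Master</img:creator>"
    "</img:image>"
)
same_name = ET.fromstring(good)[0].tag == ET.fromstring(other)[0].tag
print("same expanded name:", same_name)
```

The parser accepts only the first document; the last comparison shows that mi and img are interchangeable as long as both are bound to the same namespace URI.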
Syntactic Transformation 2: The XML Schema
In order to allow for syntactic harmonisation, the XML documents of the museums should conform to the XML Schema of the FMS initiative. Therefore the museums use the initiative’s XML Schema when they create their XML documents, and validate the documents against it. If the documents are valid, the process can continue to the semantic level.
XML Schema
The XML Schema defines the building blocks of an XML document, including:
- elements and attributes that can appear in a document
- which elements are child elements, as well as their order and number
- whether an element is empty or can include text
- the data types for elements and attributes
- as well as default and fixed values for elements and attributes
XML with an XML Schema is designed to be self-descriptive. One of the greatest strengths of XML Schema is that it allows for data typing. The most common data types are xs:string, xs:decimal, xs:integer, xs:boolean, xs:date, xs:time. In the example below, which is the XML Schema for the XML document (image5kb78d38i.xml) shown in graphic 1, we only use the data type xs:string. This data type is used for values that contain character strings.
Short explanations(1) The XML declaration which states that thedocument conforms to the 10 specification of XML(2a) Determines that the elements and data types thatare used to construct the schema come from theW3Crsquos XML Schema namespace Consequently each
DigiCULT 29
(1)  <?xml version="1.0"?>
(2a) <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
(2b)     targetNamespace="http://www.m-i.org/images"
(2c)     xmlns="http://www.m-i.org/images"
(2d)     elementFormDefault="qualified">
(3)    <xs:element name="image">
(4)      <xs:complexType>
(5)        <xs:sequence>
(6)          <xs:element name="type" type="xs:string"/>
(7)          <xs:element name="subject" type="xs:string"/>
(8)          <xs:element name="iconclass" type="xs:string"/>
(9)          <xs:element name="creator" type="xs:string"/>
(10)         <xs:element name="manuscript" type="xs:string"/>
(11)         <xs:element name="place" type="xs:string"/>
(12)         <xs:element name="year" type="xs:string"/>
(13)       </xs:sequence>
(14)       <xs:attribute name="image_id" type="xs:string" use="required"/>
(15)     </xs:complexType>
(16)   </xs:element>
(17) </xs:schema>
of the elements and data types in the schema has the prefix xs:, which identifies them as belonging to the vocabulary of the XML Schema language rather than the vocabulary of our (fictitious) organisation M-i.org.
(2b) Indicates that the elements defined by this schema come from our http://www.m-i.org/images namespace.
(2c) Is our default namespace.
(2d) Demands that any elements used by an XML document which were declared in this schema must be namespace qualified.
(3, 16) The parent element for the data typing of the image descriptions we provide at http://www.m-i.org/images. It is (4) defined as an xs:complexType, i.e. it contains child elements (6-12), which are (5, 13) surrounded by an xs:sequence element that defines an ordered sequence of these elements.
(6-12) The child elements, which in our example are simple types because they do not contain other elements. They define various elements of our XML documents, e.g. "creator", to be of the data type xs:string.
(14) Furthermore, the Schema determines that for the element xs:element name="image" there is a required attribute image_id of the data type xs:string (for example image5kb78d38i).
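The constraints explained above can be illustrated with a small hand-written check. This is only a sketch using the Python standard library, not real XML Schema validation (the standard library has no XSD validator): it tests the three things the schema demands of an image document, namely a namespace-qualified root element, the required image_id attribute, and the seven child elements in sequence order. The function name check_image and the sample values marked as hypothetical are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

NS = "http://www.m-i.org/images"  # target namespace, line (2b) of the schema
CHILDREN = ["type", "subject", "iconclass", "creator",
            "manuscript", "place", "year"]  # lines (6)-(12), in sequence order


def check_image(xml_text):
    """Return a list of problems; an empty list means the document passes
    this hand-written check (far weaker than real schema validation)."""
    problems = []
    root = ET.fromstring(xml_text)
    if root.tag != f"{{{NS}}}image":
        problems.append("root element is not a namespace-qualified <image>")
    if "image_id" not in root.attrib:
        problems.append("required attribute image_id is missing")
    found = [child.tag for child in root]
    expected = [f"{{{NS}}}{name}" for name in CHILDREN]
    if found != expected:
        problems.append("child elements do not match the xs:sequence order")
    return problems


# Sample document; values in parentheses are hypothetical placeholders.
SAMPLE = """<image xmlns="http://www.m-i.org/images" image_id="5kb78d38i">
  <type>Miniature</type>
  <subject>(hypothetical subject)</subject>
  <iconclass>71A3421</iconclass>
  <creator>Alexander Master</creator>
  <manuscript>(hypothetical manuscript)</manuscript>
  <place>(hypothetical place)</place>
  <year>(hypothetical year)</year>
</image>"""

print(check_image(SAMPLE))
```

In practice a museum would hand the document and the schema to a validating XML parser; the sketch only makes the individual rules visible.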
Major Benefits of XML
In the context of the Semantic Web, XML provides an interoperable syntactical foundation upon which solutions to the issues of representing relationships and meaning can be built. We also want to highlight the many benefits of XML that are adding to its rapid uptake in the first place, and might in the longer term be supportive in realising the Semantic Web vision on a broader scale.

XML is one of the most important standards developments in recent years. It is an international, universal, non-system and non-application specific data exchange standard. XML is international because it employs Unicode. This means that there is no restriction to the western alphabet: Arabic, Chinese, Greek, Hebrew, Thai, etc. can be easily integrated.

XML is non-system specific because it is an open standard set by the World Wide Web Consortium (W3C). As such there is no owner of XML. All the major software suppliers support it, and it can be used on any computing platform, from Windows and MacOS to Linux. This makes it easier for organisations to change systems or combine different systems.

XML is also non-application specific, i.e. it can be used in various applications such as data exchange, data harvesting, Web site management, etc. XML is gaining ever-wider acceptance in many application domains, including in particular the cultural heritage community.

Bear in mind also that the major collection management software producers have implemented support for XML in their systems, enabling, for example, the integration of data from different collections and their combination over the Web.

XML allows for multi-channel publishing, i.e. with XML it is easy to produce different products or services from digital cultural heritage assets. Once the data are structured in XML, they can be displayed across a variety of media using an associated style sheet that contains the display information.

Finally, XML can be used to create new languages. For example, the Wireless Markup Language (WML), which is used to mark up Internet applications for handheld devices, is written in XML.
Ontological concepts
The goal of the FMS project is to make metadata of the museums' textiles collections semantically interoperable on the Web. In order to achieve such interoperability, an ontology is being designed that describes the common (lower-level) ontological concepts in this domain of knowledge.[4]

Ontology
In the Semantic Web architecture the semantic relationships are not embedded but explicitly represented by an ontology, or rather an interrelated set of ontologies. In fact, the wide array of information residing on the Web and the perceived need to make it more machine-processable have acted as a strong impetus for the development of ontology languages.

Yet what is an ontology? An abstract definition of an ontology is that it describes a formal, shared conceptualisation of a particular domain of interest, for example cultural heritage objects held in art museums. In particular, an ontology allows for constraining, expressing and analysing the intended meaning of the shared vocabulary of concepts and relations in a domain of knowledge.[5]
If these concepts and relations are formalised to a high degree, the domain has at hand a major building block for developing semantically aware information systems. With Semantic Web technologies the domain ontology can be made available on the network, cross-referenced with upper-level and other domain ontologies,[6] and remote applications or intelligent software agents can refer to it when they interact to provide a certain information service.

[4] The available documents (in English) on the FMS initiative state that their ontology is being created using RDF Schema (RDFS). To develop a fully fledged ontology, advanced languages such as DAML+OIL or Web Ontology Language (OWL) would be required.
[5] For more elaborate and formal descriptions see: Tom Gruber, What is an Ontology? (1995), http://www-ksl.stanford.edu/kst/what-is-an-ontology.html; Nicola Guarino, Ontology-Driven Conceptual Modelling, part 1-3 (2002), http://ontology.ip.rm.cnr.it/Tutorials
The degree of formalisation of concepts and their relations varies considerably between different domains of knowledge. At the lower end one finds lexicons and simple taxonomies (i.e. ordered classification systems where terms are related hierarchically). At the middle level one might place thesauri, i.e. controlled vocabularies that are structured to show relationships between terms and concepts and, for example, allow for retrieving them from a database. At the high end of formalisation of knowledge there are axiomatised logic theories. Such theories include rules to ensure the well-formedness and logical validity of statements expressed in the language of the scientific discipline.

In the cultural heritage sector a powerful IT-supported example of a hierarchical classification system is Iconclass. This supports the documentation of images, in particular art historical images, by providing a systematic collection of 28,000 ready-made definitions of objects, persons, events, situations and abstract ideas that can be the subject of an image. The definitions consist of an alphanumeric classification code and its textual correlate.[7]

For example, for the image we have described in XML in graphic 1, the Iconclass definition is: 71A3421 Eve emerges from Adam's body. The Medieval Illustrated Manuscripts Website of the Koninklijke Bibliotheek, The Hague, has an Iconclass Browser in place[8] that provides the hierarchical path for this concept in the classification system.
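The hierarchical path mentioned here can be sketched in a few lines. In the example hierarchy given for 71A3421, every broader concept is a prefix of the narrower code, so the path can be read off the code itself. The sketch below assumes a plain notation without bracketed text or other Iconclass extensions; the function name iconclass_path is hypothetical, and the labels are those listed for this example.

```python
def iconclass_path(code):
    """List a simple Iconclass notation and its ancestors, broadest first."""
    return [code[:n] for n in range(1, len(code) + 1)]


# Textual correlates for the example path.
LABELS = {
    "7": "Bible",
    "71": "Old Testament",
    "71A": "Genesis from the creation to the expulsion from paradise, "
           "and later years of Adam and Eve",
    "71A3": "creation of man; the Garden of Eden",
    "71A34": "creation of Eve",
    "71A342": "Eve is fashioned from Adam's rib",
    "71A3421": "Eve emerges from Adam's body",
}

for step in iconclass_path("71A3421"):
    print(step, LABELS[step])
```

A search for the broader concept 71A34 (creation of Eve) could then match every image whose path contains that code.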
An outstanding example of a controlled vocabulary is the Art & Architecture Thesaurus (AAT), one of the Getty Research Institute's Vocabulary Databases. It is a structured vocabulary of more than 125,000 terms and other information about concepts that are used for describing fine art, architecture, decorative arts, archival materials and material culture.[9]

The W3C's Resource Description Framework (RDF) Schema Specification 1.0, in a section on its scope, mentions concept navigation and states: 'Thesauri and library classification schemes are well known examples of hierarchical systems for representing subject taxonomies in terms of the relationships between named concepts. The RDF Schema specification provides sufficient resources for creating RDF models that represent the logical structure of thesauri (and other library classification systems).'[10]

Yet for realising a full-blown cultural heritage ontology for the Semantic Web there are currently limitations on both sides. On the one hand, hierarchical classification systems and structured vocabularies do not lend themselves easily to rich inter-linking of conceptual 'trees'.

A major step further in this direction is the CIDOC object-oriented Conceptual Reference Model (CRM).[11] This provides an ontology of 81 classes and 130 properties, which describes in a formal language concepts and relations relevant to the documentation of cultural heritage.

On the other hand, RDF Schema has limitations when it comes to expressing complex ontological relationships. New languages based on description logics are being developed. These include DAML+OIL and the upcoming Web Ontology Language (OWL), which are capable of fully describing ontologies.

Also worth highlighting is that tools for ontology building are proliferating at the present time. These ontology editors need to be carefully assessed, as their capabilities differ considerably.[12]
7 Bible
71 Old Testament
71A Genesis from the creation to the expulsion from paradise, and later years of Adam and Eve
71A3 creation of man; the Garden of Eden (Genesis 1:26-2)
71A34 creation of Eve
71A342 Eve is fashioned from Adam's rib
71A3421 Eve emerges from Adam's body
[6] Upper-level ontologies describe the basic concepts and relationships invoked when information about any domain is expressed in natural language.
[7] For in-depth information see the official Iconclass Website: http://www.iconclass.nl
[8] Medieval Illustrated Manuscripts Website: http://www.kb.nl/kb/manuscripts/browser. The subject access system for the Website was conceived by Mnemosyne Partners, building on the Iconclass classification system and technologies. See the valuable information they provide at http://www.mnemosyne.org/businessmsstempex.html
[9] A detailed description of the AAT is provided at http://www.getty.edu/research/tools/vocabulary/aat/about.html
[10] http://www.w3.org/TR/2000/CR-rdf-schema-20000327
[11] See the information box at the end of the Forum discussion and the sources mentioned in the Semantic Web Terms and Reading List.
[12] Michael Denny, Ontology Building: A Survey of Editing Tools (06-11-2002), http://www.xml.com/pub/a/2002/11/06/ontologies.html
Semantic Transformation 1: The RDF Data Model
Progressing towards semantic interoperability, the metadata in the XML documents are now transformed into RDF statements corresponding to the RDF data model. With these so-called RDF 'triples' the XML metadata elements are mapped to the RDF classes and properties which are defined by the RDF Schema of the FMS initiative (see section 3).

'XML is nothing more than a way to standardize data formats. This is not to underplay XML's importance. A data-format standard makes all of the more glamorous technologies possible, and RDF is the leading example of the benefit that comes once the data format has been standardized. Many proclaim that RDF is really the XML's killer app, and with good reason. Despite all this, RDF remains somewhat obscure. This is mainly because, at its core, RDF is very abstract, very dry and very academic.'
Uche Ogbuji, An introduction to RDF (2000), http://www-106.ibm.com/developerworks/library/w-rdf/?dwzone=xml
RDF Data Model
In order to make Web resources semantically interoperable, we need resources that provide machine-understandable information about themselves. In the Semantic Web architecture these statements are built by using the Resource Description Framework (RDF). RDF defines a data model for statements describing typed relationships between uniquely identified resources. RDF distinguishes between:

| resources: familiar examples are a Web page, an electronic document or a digital image, but in RDF entities that are not 'network retrievable', e.g. museums, curators or bound medieval manuscripts, can also be resources

| properties: these identify a specific aspect, characteristic, attribute or relation used to describe the resource

| statements: these associate a value for a named property with the resource

Hence RDF provides a model for describing relationships between resources in terms of named properties and values. The RDF data model intrinsically supports only binary relations. Its base element is the 'triple', which takes the form of subject, predicate, object: a resource (the subject) is linked to another resource (the object) through an arc labelled with a third resource (the predicate). The semantics of a triple clearly depends on the property used as predicate.

A convenient way to visualise this is to draw nodes for subject and object and an arrow between them for the predicate (see graphic 2). In this labelled directed graph, subject and predicate (property) are Uniform Resource Identifiers (URIs), and the object is either a URI or a literal (which is drawn as a box). Everything in RDF can be represented by a graph with nodes and arcs, and the data model allows for using the same URI as a node and as an arc label. To represent RDF statements in a machine-processable way, RDF builds on XML. With RDF/XML, a specific XML markup language, RDF information can be represented and exchanged between machines.
Graphic 2: RDF Data Model

Triple 1
Subject: http://www.m-i.org/schemas/images#ColumnMiniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Miniature

Triple 2
Subject: http://www.m-i.org/schemas/images#Miniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Image

With these two triples we state that ColumnMiniature is a subclass of Miniature and that Miniature is a subclass of Image.

The predicate in our statements is the rdfs:subClassOf property, which is predefined in the RDF Schema namespace http://www.w3.org/2000/01/rdf-schema#

The classes Image, Miniature and ColumnMiniature would also need to be defined in our RDF Schema namespace http://www.m-i.org/schemas/images#

For details on how to use RDF Schema for defining your domain ontology see section 3. In a nutshell, RDF Schema is a 'higher-level' language which is itself defined using RDF.
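As a minimal sketch of the data model, the two triples of graphic 2 can be held as plain (subject, predicate, object) records and queried by pattern. This is not an RDF library, only an illustration of the triple idea; the Triple type and the objects helper are hypothetical names, and the '#' separator in the m-i.org namespace is an assumption.

```python
from typing import NamedTuple

RDFS = "http://www.w3.org/2000/01/rdf-schema#"
MI = "http://www.m-i.org/schemas/images#"  # assumed form of the fictitious namespace


class Triple(NamedTuple):
    subject: str    # always a URI
    predicate: str  # always a URI
    obj: str        # a URI here; in general it may also be a literal


# The directed labelled graph of graphic 2 as a set of triples.
graph = {
    Triple(MI + "ColumnMiniature", RDFS + "subClassOf", MI + "Miniature"),
    Triple(MI + "Miniature", RDFS + "subClassOf", MI + "Image"),
}


def objects(graph, subject, predicate):
    """Follow every arc labelled `predicate` that leaves the node `subject`."""
    return {t.obj for t in graph
            if t.subject == subject and t.predicate == predicate}


print(objects(graph, MI + "Miniature", RDFS + "subClassOf"))
```

Note that MI + "Miniature" appears once as an object and once as a subject, which is what links the two arcs into one graph.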
Semantic Transformation 2: Creating and Validating the RDF Statements
In the mapping process an editor tool (in the FMS case a tool developed by the project team, called Meedio) receives as input the XML documents and assists in transforming them into semantically valid RDF statements (instance descriptions). The tool serves as an instance editor, which provides a convenient way of finding and selecting from the XML metadata elements correct instance values for a particular property. The editor tool also serves as a semantic metadata validator. When the museum cataloguer saves the set of RDF statements corresponding to an XML document, the semantics of these statements are validated against the property constraints of the FMS ontology. The result of a successful mapping and validation process is a unique set of RDF triples called the RDF card.

RDF Mapping Rules
When RDF is used to define the meaning of XML metadata elements, a set of mapping rules is created. A mapping rule is a template of RDF triples where XPath expressions are used to identify the actual element values. XPath is a language for addressing parts of an XML document and was designed for use in XML processing software (XSLT, XPointer and others).

When applying such a rule to an XML document, the XPath expressions are instantiated with matching element values. If the rule matches, the RDF template evaluates to a set of RDF triples where XPath expressions are substituted by the corresponding values of the XML elements.
For example, by applying the template rule
<image5kb78d38i/type mi:hasCreator image5kb78d38i/creator>
to the XML document described in the section on XML, the following result would be obtained:
<Image Miniature mi:hasCreator 'Alexander Master'>

mi:hasCreator is an example of an RDF property. Such properties are explained in section 3, RDF Schema.
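How such a template might be instantiated can be sketched as follows. This is not the Meedio tool, whose internals are not described here; it is a simplified illustration in which a rule is a (subject path, property, object path) template and the two paths are evaluated with the limited XPath support of Python's standard library. The names apply_rule, RULE and DOC are hypothetical.

```python
import xml.etree.ElementTree as ET

# Prefix used inside the path expressions for the documents' namespace.
NSMAP = {"m": "http://www.m-i.org/images"}


def apply_rule(rule, xml_text):
    """Instantiate a (subject_path, property, object_path) template against
    one XML document. Returns a triple, or None if a path matches nothing."""
    subject_path, prop, object_path = rule
    root = ET.fromstring(xml_text)
    subject = root.findtext(subject_path, namespaces=NSMAP)
    obj = root.findtext(object_path, namespaces=NSMAP)
    if subject is None or obj is None:
        return None  # the rule does not match this document
    return (subject, prop, obj)


RULE = ("m:type", "mi:hasCreator", "m:creator")

DOC = """<image xmlns="http://www.m-i.org/images" image_id="5kb78d38i">
  <type>Miniature</type>
  <creator>Alexander Master</creator>
</image>"""

print(apply_rule(RULE, DOC))
```

A document without a creator element simply yields no triple for this rule, which corresponds to the rule not matching.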
Note: Due to the limited space permitted, we do not address issues of term mapping. This is an important aspect of the mapping and validation process carried out in the FMS project. Working with metadata from different museums, they need to deal with partly different terminologies. Their technical solution to synonymous terms (i.e. different terms referring to the same concept) is to attach synonym sets to the FMS ontology classes. With situations where polysemous terms occur (i.e. the same term refers to different concepts) the editor tool cannot cope, and the cataloguer needs to select the correct interpretation.
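The idea of synonym sets attached to ontology classes can be sketched as a simple lookup. Everything in the example is hypothetical: the class names, the terms and the function candidate_classes are invented for illustration and are not taken from the FMS ontology. A term that belongs to exactly one synonym set can be mapped automatically, while a term found in several sets is polysemous and has to be resolved by the cataloguer.

```python
# Hypothetical synonym sets attached to hypothetical ontology classes.
SYNONYM_SETS = {
    "fms:Rug": {"rug", "carpet", "cover"},
    "fms:Blanket": {"blanket", "cover"},
    "fms:Tapestry": {"tapestry", "wall hanging"},
}


def candidate_classes(term, synonym_sets):
    """Return every class whose synonym set contains the term."""
    term = term.strip().lower()
    return sorted(cls for cls, terms in synonym_sets.items() if term in terms)


for term in ("Carpet", "cover", "shawl"):
    classes = candidate_classes(term, SYNONYM_SETS)
    if len(classes) == 1:
        print(term, "-> mapped to", classes[0])
    elif classes:
        print(term, "-> polysemous, cataloguer must choose from", classes)
    else:
        print(term, "-> unknown term")
```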
Semantic Transformation 3: The RDF Schema (RDFS)
The shared ontology for the textiles domain is created by using Resource Description Framework Schema (RDFS). An RDF Schema is a tool for indicating the classes of resources one wants to describe, as well as for defining the properties used to describe those resources. Furthermore, class/sub-class relationships and property/sub-property relationships can be defined. The museums are mapping their metadata to the classes and properties defined by the RDF Schema of the FMS initiative. Thereby they are making the meaning of the metadata explicit and representing them in a harmonised, uniform way.

RDF Schema
In section Semantic Transformation 1 we have described the data model provided by RDF for expressing statements about Web resources. But we also need a vocabulary for the RDF statements, namely classes and properties defined with RDF Schema (RDFS).

In brief, the RDF Schema mechanism provides a pre-defined vocabulary, a basic type system that can be used in creating domain-specific schemas. Its role is to allow for declaring metadata properties (e.g. for 'type', 'subject' or 'creator'), to define the classes of resources they may be used with, to restrict possible combinations, and to detect violations of those restrictions.
Defining classes
With RDF Schema (RDFS), Web resources can be defined as instances of one or more classes. In addition, classes can be organised in a hierarchical fashion. As we hold a collection of digital images drawn from illustrated medieval manuscripts, we first need to define a class of things that are images. In RDF Schema a class is any resource having an rdf:type property whose value is the RDFS-defined resource rdfs:Class.

So, using the basic RDF data model, we define:
mi:Image [resource] rdf:type [property] rdfs:Class [value]
The self-defined prefix mi: (for medieval images) stands for the URI reference of our RDF Schema namespace http://www.m-i.org/schemas/images

In our image collection we have various special kinds of digitised images, such as column miniatures, decorated initials, schematic drawings, etc. To distinguish, for example, the miniatures, first we need to define a general class Miniature and subclasses of miniatures, e.g. a subclass ColumnMiniature:
mi:Miniature rdf:type rdfs:Class
mi:ColumnMiniature rdf:type rdfs:Class

Secondly, we need to define that mi:ColumnMiniature is a subclass of mi:Miniature and that mi:Miniature is a subclass of mi:Image, for which we use the predefined rdfs:subClassOf property:
mi:Miniature rdfs:subClassOf mi:Image
mi:ColumnMiniature rdfs:subClassOf mi:Miniature

As the rdfs:subClassOf property is transitive, this means that mi:ColumnMiniature is also implicitly a subclass of mi:Image.
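The implicit subclass relationship can be made explicit by following the rdfs:subClassOf arcs until no new class turns up. The sketch below is a plain graph traversal over the two statements given above, not an RDFS reasoner; the function name superclasses and the dictionary layout are assumptions made for illustration.

```python
# Direct rdfs:subClassOf statements: class -> set of direct superclasses.
SUBCLASS_OF = {
    "mi:ColumnMiniature": {"mi:Miniature"},
    "mi:Miniature": {"mi:Image"},
}


def superclasses(cls, subclass_of):
    """Return every stated or implied superclass of cls."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in subclass_of.get(current, ()):
            if parent not in seen:  # also guards against cycles
                seen.add(parent)
                stack.append(parent)
    return seen


print(sorted(superclasses("mi:ColumnMiniature", SUBCLASS_OF)))
```

The result contains mi:Image even though no statement links it directly to mi:ColumnMiniature.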
Graphic 2 visualises this with the nodes and arcs of the basic RDF data model.

Defining properties

In order to make the meaning of our metadata (e.g. 'type') explicit, we need to be capable of declaring specific properties that characterise the classes of things we hold at http://www.m-i.org, e.g. digital images of medieval column miniatures.

Basically, RDF Schema defines properties in terms of the classes of resources to which they apply. This is the role of the rdfs:domain and rdfs:range mechanisms.

rdfs:range

The range constraint defines the class or set of classes whose instances can be values of a particular property. If we want to define the property mi:hasType, we must describe this resource (which we locate at http://www.m-i.org/schemas/images) with an rdf:type property whose value is rdf:Property:
mi:hasType [resource] rdf:type [property] rdf:Property [value]

The following RDF statements indicate that mi:ColumnMiniature is a class, mi:hasType is a property, and RDF statements using the mi:hasType property have instances of mi:ColumnMiniature as values:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:range mi:ColumnMiniature

rdfs:domain

The domain constraint restricts the set of classes whose instances may have a particular property attached to them. If we want to indicate that the property mi:hasType applies to instances of class mi:ColumnMiniature, we would write:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:domain mi:ColumnMiniature
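Read as constraints, in the way this section describes them, domain and range can be checked mechanically. The following is a minimal sketch under that reading: the subject of a statement must be an instance of the property's domain class and the value an instance of its range class. The resource names ex:a, ex:b and ex:c and the function check_statement are hypothetical, and subclass reasoning is left out to keep the example short.

```python
# Constraints taken from the statements above.
DOMAIN = {"mi:hasType": "mi:ColumnMiniature"}
RANGE = {"mi:hasType": "mi:ColumnMiniature"}

# Hypothetical resources and the classes they are declared instances of.
TYPES = {
    "ex:a": {"mi:ColumnMiniature"},
    "ex:b": {"mi:ColumnMiniature"},
    "ex:c": {"mi:Image"},
}


def check_statement(subject, prop, value, types, domain, range_):
    """Return the constraint violations of one statement (empty if valid)."""
    problems = []
    if prop in domain and domain[prop] not in types.get(subject, set()):
        problems.append(f"{subject} is not an instance of {domain[prop]}")
    if prop in range_ and range_[prop] not in types.get(value, set()):
        problems.append(f"{value} is not an instance of {range_[prop]}")
    return problems


print(check_statement("ex:a", "mi:hasType", "ex:b", TYPES, DOMAIN, RANGE))
print(check_statement("ex:c", "mi:hasType", "ex:b", TYPES, DOMAIN, RANGE))
```

A validating editor of the kind described earlier could run such a check each time a cataloguer saves a set of statements.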
Benefits of RDF
In a SearchWebServices.com definition of RDF some benefits of RDF are mentioned:
| 'By providing a consistent framework, RDF will encourage the providing of metadata about Internet resources.
| Because RDF will include a standard syntax for describing and querying data, software that exploits metadata will be easier and faster to produce.
| The standard syntax and query capability will allow applications to exchange information more easily.
| Searchers will get more precise results from searching, based on metadata rather than on indexes derived from full text gathering.
| Intelligent software agents will have more precise data to work with.'[13]

This is a well-crafted listing of RDF benefits, from provision and exchange of better metadata to agents working with them, hopefully for the benefit of humans. But, as explicitly stated by SearchWebServices.com, these are only potential benefits, i.e. they depend on the level of actual uptake of RDF.
[13] whatis.com / searchWebServices.com Definitions - Resource Description Framework, http://searchwebservices.techtarget.com/sDefinition/0,,sid26_gci213545,00.html (last updated July 27, 2001)
Generating and Using the Knowledge Space
The RDF cards represent the original XML documents at the semantic level. The union of such RDF cards constitutes a knowledge base, which is a harmonised semantic representation of the underlying heterogeneous databases.

However, so far the RDF instance descriptions have not left the museum. The museum has complete control of the information it wants to publish, and it does not need to allow the FMS system access to its internal database system. The RDF data are placed in a public directory on the museum's WWW server.

The Web crawler of the FMS system harvests the instance descriptions from the different museums, and the system combines them into an RDF repository. This repository is a large semantic graph that consists of the shared ontology and metadata.

How does a user now search and navigate in this knowledge space? In the FMS system this is implemented by a server-side software called Ontogator. Based on the semantic graph, this software dynamically generates semantic linkages for the user's Web browser.

One way of using the FMS system is view-based filtering. The user can select classes of resources from the ontology, and the system finds the instances that match the selected class restrictions. By constraining classes (views) further, the collection instance data searched for are eventually found.
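The combination of RDF cards and the class-based selection described above can be sketched together. This is not the Ontogator software, whose implementation is not described here; it only illustrates the two steps with sets of triples. The museum identifiers, the prefixed names and the functions merge_cards and view_filter are hypothetical.

```python
RDF_TYPE = "rdf:type"

# Shared ontology: class -> set of direct superclasses.
SUBCLASS_OF = {
    "mi:ColumnMiniature": {"mi:Miniature"},
    "mi:Miniature": {"mi:Image"},
}


def merge_cards(*cards):
    """Union the triple sets harvested from several museums into one graph."""
    return set().union(*cards)


def view_filter(selected, graph, subclass_of):
    """Instances typed with the selected class or any of its subclasses."""
    accepted = {selected}
    changed = True
    while changed:  # grow the set until no further subclass is found
        changed = False
        for child, parents in subclass_of.items():
            if child not in accepted and parents & accepted:
                accepted.add(child)
                changed = True
    return {s for (s, p, o) in graph if p == RDF_TYPE and o in accepted}


# Hypothetical RDF cards published by two museums.
card_a = {("museumA:item1", RDF_TYPE, "mi:ColumnMiniature")}
card_b = {("museumB:item7", RDF_TYPE, "mi:Image"),
          ("museumB:item7", "mi:hasCreator", "Alexander Master")}

repository = merge_cards(card_a, card_b)
print(sorted(view_filter("mi:Image", repository, SUBCLASS_OF)))
print(sorted(view_filter("mi:Miniature", repository, SUBCLASS_OF)))
```

Selecting the narrower class mi:Miniature returns fewer instances than mi:Image, which is the effect of constraining a view further.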
The software also supports topic-based navigation by providing semantic links between topics of interest, the creation of which is based on the collection domain ontology and the related metadata of the collection records. This means that the links also provide the user with an impression of the wider context and pragmatics of the objects in the museums' collections.

From human users to software agents
As described in the Finnish Museums on the Semantic Web example, the RDF repository is a large semantic graph that consists of the shared ontology and metadata of the participating museums. Such a repository can be queried, and the results, a set of pointers to the relevant resources, can be accessed using Web browsers. The opportunities provided by a system like the one developed by the FMS initiative (e.g. topic-based navigation) are at present restricted primarily to human users.

The Semantic Web vision includes intelligent software agents which 'understand' semantic relationships between Web resources and seek relevant information as well as perform transactions for humans.[14]

This software would be capable of autonomous action, i.e. could run without direct human control or constant supervision, and ideally is very flexible in doing this. Characterisations of this flexibility include actions that are 'reactive', 'proactive' and 'social' (see below).

While the basic idea of agents is very intuitive and appealing, the actual theory is complex, the tools are immature, and the solutions small and prototype-based. In fact, as a parallel, distributed systems technology, agents belong to the most complex class of software technology.

However, this primer will conclude with a summary of what an intelligent software agent is and what such a software would generally be capable of doing. This should also serve as an indication of how great the challenge for research and technological development is to make the full Semantic Web vision a reality.
Intelligent Software Agents

The following definitions are taken from Michael Wooldridge's introduction to multiagent systems.[15]

Agent

'An agent is a computer system capable of autonomous action in some environment.'

Intelligent agent

'An intelligent agent is a computer system capable of flexible autonomous action in some environment.'

Flexible autonomous action

'By flexible autonomous action we mean reactive, proactive, social.'
| Reactivity: 'A reactive system is one that maintains an ongoing interaction with its environment, and responds to changes that occur in it (in time for the response to be useful).'
| Proactiveness: 'An agent serves a purpose and therefore exhibits goal-directed behaviour, including the capacity to recognise opportunities for useful courses of action.'
| Social ability: 'Social ability in agents is the ability to interact with other agents (and possibly humans) via some kind of agent communication language, and perhaps cooperate with others.'

Desirable further properties of agents are:
| Mobility: the ability to move around an electronic network
| Rationality: an agent will act in such a way that it does not prevent itself from achieving its goals (as far as this is possible with a limited set of beliefs representing its world knowledge)
| Learning: an agent will improve its performance over time
[14] T. Berners-Lee, J. Hendler, O. Lassila, Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html
[15] M. Wooldridge, An Introduction to Multiagent Systems. Chichester: Wiley, 2002, and http://www.csc.liv.ac.uk/~mjw/pubs/imas
Resources
In the primer no references are made to the documents of the World Wide Web Consortium (W3C). All relevant W3C recommendations can be found at http://www.w3c.org

Of the wealth of introductory materials on XML and RDF available on the Web, the following in particular are useful to consult for further details:

http://www.w3schools.com/xml
http://www.w3schools.com/schema
http://www.w3.org/TR/rdf-primer
K. S. Candan, H. Liu, R. Suvarna: Resource Description Framework: Metadata and its Applications. In: SIGKDD Explorations, Vol. 3/1 (2001), 6-19

Pierre-Antoine Champin: RDF Tutorial (2001), http://www710.univ-lyon1.fr/~champin/rdf-tutorial/rdf-tutorial.html

S. Decker, M. P. Mitra, S. Melnik: Framework for the Semantic Web: An RDF Tutorial, http://www.ida.liu.se/~asmpa/courses/sweb/rdf/rdf_tutorial.pdf
Graphic 3: Set-up of the Finnish Museums on the Semantic Web System
The Finnish Museums on the Semantic Web: Overview of the System's Set-up
| User client: WWW browser (topic-based navigation, view-based filtering)
| Server: Ontogator software; RDF database; semantic graph (knowledge space of shared ontology and metadata)
| Web crawler: harvests the RDF instances of the museums
| Interoperability levels: RDF Schema (semantic interoperability); XML Schema (syntactic interoperability); relational schemas (DBMS)
| Museums: collection databases 1, 2 ... n, each with an XML repository, a metadata editor and RDF instances
THE DARMSTADT FORUM PARTICIPANTS

Wernher Behrendt, Salzburg Research, Austria
Wernher Behrendt is a Senior Researcher at the Sun Technology and Research Excellence Center at Salzburg Research (Austria), working on multimedia, middleware and interoperation issues. He holds an MSc in Cognitive Science from Manchester University and has more than 10 years' experience in near-to-market IT research. From 1989 to 1995 he was a Senior Research Associate in the Informatics Department at Rutherford Appleton Laboratory (UK), working on embedded knowledge based systems in distributed multimedia presentation systems. From 1995 to 1998 he was a Senior Research Associate at Cardiff University (UK), working on interoperation between heterogeneous information systems. Mr Behrendt has held courses in Computer Science and has worked on projects ranging from software engineering methods and quality assurance to legacy system reengineering using migration methods and distributed systems middleware.
E-mail: wernher.behrendt@salzburgresearch.at

Paolo Buonora, Archivio di Stato di Roma, Italy
Paolo Buonora holds a degree in Philosophy from the University of Rome 'La Sapienza' (1976). He worked in the Italian State Archive Administration from 1978, where he was first involved in editing the Guida Generale degli Archivi di Stato italiani. From 1986 he worked in the Soprintendenza archivistica per il Lazio, surveying audiovisual archives and municipal archives; from 1989 to 1991 at the Perugia University, engaged in doctoral research in 'Urban and rural history'; and from 1991 to 1994 again in the Soprintendenza archivistica per il Lazio. After 1994 he worked in the Archivio di Stato di Roma, where he was responsible for the photograph service and several working groups on informatics application in archival documentation. From 1997 until the present time he has planned and directed the Imago II project in the Archivio di Stato di Roma.
See httpwwwasrmarchivbeniculturaliitsidimagoIMAGOIIenhtml

Samuel Cruz-Lara, University of Nancy 2 and LORIA, France
Samuel Cruz-Lara obtained a Master's degree in Computer Science in 1984 (University Henri Poincaré, Nancy 1) and a PhD degree in Computer Science in 1988 (National Polytechnic Institute of Lorraine). The central topic of his PhD thesis was the generation of integrated development environments by using attribute grammars.
He is currently Associate Professor at the University of Nancy 2 (Institute of Technology, Computer Science Department) and permanent Researcher at LORIA (Lorraine Laboratory for Research in Computer Science and its Applications - UMR 7503 - CNRS - INRIA - Universities of Nancy). He is a member of the 'Language and Dialogue' team and has conducted several research activities on distributed software architectures and textual linguistic resources management. He is currently working in the context of distributed architectures and multimedia resources management. Dr Cruz-Lara has participated in several projects, in particular CNRS/SILFIDE and MLIS-ELAN, and he is at present co-leader of the 'Digital Museum' project. This is a joint project between LORIA and the National Chi-Nan University. The 'Digital Museum' project is sponsored by the National Science Council of the Republic of China (Taiwan) and supported by INRIA (France).

Costis Dallas, Critical Publics SA, Greece
Costis Dallas is Chairman and Senior Researcher of Critical Publics (http://www.criticalpublics.com), a group of companies active in the field of strategic communications, creative design and technology. He is currently a Lecturer in the Department of Communication and Mass Media of Panteion University, and has over 15 years of research and professional experience in hypermedia applications, human factor issues and cultural information systems. Dr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA, co-founder and Chair of the Multimedia Working Group of the International Council of Museums (CIDOC/ICOM), Head of Documentation and Systems of the Benaki Museum, General Director of the Foundation of the Hellenic World, Special Secretary of the Greek Ministry of Education in charge of libraries, archives and instructional technologies, and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: info@criticalpublics.com
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (http://www.nl.adlibsoft.com), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that he joined Databasix, producer of the ADLIB software. In 1991 Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was renamed later as ADLIB Information Systems.
ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include:
| CIMI Dublin Core test bed - ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database, in addition to the 'standard' HTML/XML access.
| SPECTRUM-XML - A project of the UK mda, together with a team of software vendors, to produce a schema for exchange of data which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software.
| Internet Gelderse Musea (http://www.igem.nl) and Maritiem Digitaal (http://www.maritiemdigitaal.nl) - Two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.
E-mail: bert@nl.adlibsoft.com
Nicola Guarino, Italian National Research Council, Italy
Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991 and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS) and associate editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: Nicola.Guarino@ladseb.pd.cnr.it
Janneke van Kersen, Digital Heritage Association, The Netherlands
Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992 she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing, and more recently she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University.
Since October 1999 Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata, and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations such as museums, archives, archaeological organisations, monuments and special collections of libraries in a context-rich and XML-based environment. Interoperability, from both a technical and an organisational perspective, is therefore a major issue. The Association offers access to heritage information to the general public at http://www.cultuurwijzer.nl and customised access for the educational field at http://www.cultuurwijs.nl.
E-mail: janneke.van.kersen@den.nl
Marco Meli, EDW International, Italy
Marco Meli is CEO and co-founder of EDW International (http://www.edw-international.com), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia and VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meli@edw.it
Paul Miller, UKOLN, UK
Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, http://www.ukoln.ac.uk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK’s Further and Higher Education Funding Councils and Resource: the Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI) and the Metadata Working Group of the UK Government’s Office of the e-Envoy.
E-mail: P.Miller@ukoln.ac.uk
Frank Nack, CWI, The Netherlands
Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on ‘The Application of Video Semantics and Theme Representation for Automated Film Editing’ at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling, and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document, and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: Frank.Nack@cwi.nl
Franco Niccolucci, Florence University, Italy
Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference.
See also: http://www.geog.port.ac.uk/hist-bound/people/niccolucci.htm
Seamus Ross, HATII, University of Glasgow, UK
Dr Seamus Ross is Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Access Network) (IST-2001-32706), a European Union funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously he was Assistant Secretary for Information Technology at the British Academy, and before that worked for a company specialising in expert systems and software development as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group’s PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: S.Ross@hatii.arts.gla.ac.uk
Andrea Scotti, Institute & Museum for the History of Science, Italy
Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995 he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.
Since 1996 Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza and the National Central Library, and under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (01/02/01-30/07/03), funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies.
See: http://cweb.inria.fr/Projects/Mesmuses
E-mail: scottian@galileo.imss.firenze.it
DIGICULT PROJECT INFORMATION
DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over the period of 30 months (03/2002-08/2004).
In order to encourage early take-up, DigiCULT produces seven Thematic Issues and three Technology Watch Reports, along with the newsletter DigiCULT.Info.
DigiCULT draws on the results of the strategic study ‘Technological Landscapes for Tomorrow’s Cultural Economy (DigiCULT)’ that was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications), in 2000 and completed in 2001.
Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at http://www.digicult.info.
For further information on DigiCULT please contact the team of the project co-ordinator:
Mr Guntram Geser, guntram.geser@salzburgresearch.at, Phone: +43-(0)662-2288-303
Mr John Pereira, john.pereira@salzburgresearch.at, Phone: +43-(0)662-2288-247
Salzburg Research Forschungsgesellschaft, Jakob-Haringer-Str. 5/III, A - 5020 Salzburg, Austria
Phone: +43-(0)662-2288-200, Fax: +43-(0)662-2288-222
http://www.salzburgresearch.at
Project Partner:
HATII - Humanities Advanced Technology and Information Institute, University of Glasgow, http://www.hatii.arts.gla.ac.uk
Contact: Mr Seamus Ross, s.ross@hatii.arts.gla.ac.uk
The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom
DigiCULT Thematic Issue 1 - Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum, held in Barcelona on May 6th 2002 in the context of the DLM-Conference 2002.
DigiCULT Thematic Issue 2 - Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum, held in Essen, Germany, on September 3rd 2002 in the context of the AIIM Conference DMS EXPO.
DigiCULT Thematic Issue 3 - Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum, held on January 21st 2003 at Fraunhofer IPSI, Darmstadt, Germany.
DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum on Learning Objects, which will take place at the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, on July 2nd 2003.
IMPRINT
This Thematic Issue is a product of the DigiCULT Project (IST-2001-34898).
Authors:
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy
Images:
Images for this Thematic Issue have been provided by and are reproduced with kind permission of the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, Netherlands.
Graphics & Layout:
Jan Steindl, Salzburg Research
ISBN 3-902448-00-8
Printed in Austria
© 2003
IMAGES
Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I. Image on p. 5 from Fol. 264r, size 80x75; illuminator: Orosius Master a.o.
Bible Historiale, Paris, c. 1320-1340, Volume I. Image on p. 7 from Fol. 2r, size 45x50; illum.: ‘Sub-Fauvel’ Master
Historic Bible, Utrecht, c. 1430, Volume I. Images on pp. 8, 10, 14, 28 from Fol. 3r, 3v, 4v, 7v, size 55x85 to 60x85; illum.: Alexander Master o.a.
Psalter Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468. Images on pp. 11, 13: Wednesday, Matins, Invitatorium, from Fol. 219r
Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350. Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron) from Fol. 104rb1, 106rb2, 124r, 111ra, size 40x55 to 50x55
Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335. Image on p. 21 from Fol. 4va1, size 45x55
Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460. Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41
Psalter, Normandy, c. 1180. Image on p. 26 from Fol. 3v, size 160x125
Breviary, Cambray(?), c. 1275-1300. Images on pp. 27, 29, 31, 33, 34, 35, 37 from Fol. 211r, size 205x135
© Koninklijke Bibliotheek, The Hague. Used with permission.
DEVELOPMENT OF THE SEMANTIC WEB
MUST BEGIN AT THE GRASS ROOTS LEVEL
AN INTERVIEW WITH JANNEKE VAN KERSEN, DUTCH DIGITAL HERITAGE ASSOCIATION, THE NETHERLANDS
By Joost van Kasteren
‘To be successful, the Semantic Web for the cultural heritage sector will have to develop from the grass roots level. A top-down approach whereby institutions have to squeeze themselves into a certain format is not going to work.’ Janneke van Kersen has strong views on the initiatives that are currently being undertaken to develop a Semantic Web. ‘They are not going to work for the cultural heritage institutions if you do not take into account the position that they are in. Especially not if the institutions are forced to overhaul their digitisation projects completely.’
Kersen graduated in Art History and did a postgraduate course on Historical Information Processing. Since 1999 she has been a consultant with the Dutch Digital Heritage Association (Vereniging DEN), which supports cultural heritage institutions, large and small, in developing strategies to face the digital future.
The key objectives of the DEN are to assist institutions in digitising and documenting their collections according to high quality standards, and to assure cross-domain and cross-institutional access to heritage information in a context-rich, structured environment. The methods used to realise these objectives are knowledge dissemination, best practice and standardisation. The Association propagates open standards like XML, OAI and Dublin Core (qualified).
The DEN has approximately 60 member institutions, among them most of the large heritage institutions of The Netherlands. It provides access to the databases of the member organisations through the portal http://www.cultuurwijzer.nl. The Cultuurwijzer (culture pointer) to the collections uses the Aqua Browser to search for terms in a non-hierarchical, associative way. Databases can also be accessed through subject fields based on the Dublin Core standard. Research is being carried out on applying the Art and Architecture Thesaurus in a post-coordinative way, using it as an additional search aid.
Kersen: ‘The Dublin Core has some drawbacks, but it is one of the few international standards available for exchange of information. Mapped to the 5 Ws (who, where, what, when and why), it turns out to be a nice tool for interoperability across the databases of heritage institutions. Of course, we have to accept a certain kind of fuzziness and lack of precision compared with domain-specific access at the institutional level.’
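The idea of mapping Dublin Core to the 5 Ws can be illustrated with a short Python sketch. The interview does not say which element goes under which W, so the assignment below is purely an assumption made for illustration, as are the names FIVE_WS and group_by_five_ws; the sample record borrows one of the manuscripts credited in this issue.

```python
# Hypothetical assignment of unqualified Dublin Core elements to the 5 Ws.
# The grouping is an illustrative guess, not the scheme used by the DEN.
FIVE_WS = {
    "who": ["creator", "contributor", "publisher"],
    "what": ["title", "subject", "type"],
    "where": ["coverage"],   # coverage can also carry temporal information
    "when": ["date"],
    "why": ["description"],  # the loosest fit of the five
}


def group_by_five_ws(record):
    """Regroup a flat Dublin Core record (element -> value) under the 5 Ws."""
    grouped = {w: {} for w in FIVE_WS}
    for w, elements in FIVE_WS.items():
        for element in elements:
            if element in record:
                grouped[w][element] = record[element]
    return grouped


if __name__ == "__main__":
    sample = {
        "title": "Der Naturen Bloeme",
        "creator": "Jacob van Maerlant",
        "date": "c. 1350",
        "coverage": "Flanders",
    }
    for w, values in group_by_five_ws(sample).items():
        print(w, values)
```

Elements missing from a record simply leave their W empty, which is one way the ‘fuzziness and lack of precision’ Kersen mentions shows up in practice.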
According to Kersen, a real Semantic Web is still a long way off. ‘We simply do not have the tools yet for a meaningful exchange and representation of information. XML and RDF do not provide the interoperability that is needed. On the other hand, I do not believe in developing a fundamental ontology to give meaning to information on the Net. It looks to me like the 18th-century endeavour to write an encyclopaedia that contains all the knowledge in the world. I am afraid it does not work that way. A lot of knowledge, even scientific knowledge, cannot be described in a logical way. Especially in the arts, a lot of “knowledge” is the result of heuristics and associative thinking. Apart from that, there is the practical problem that cultural heritage institutions do not have the money and the staff to describe their collections anew in a way that fits the ontology. That is just too much work.’
Kersen thinks the Semantic Web will grow gradually from the grass roots level onward. Within the Dutch Digital Heritage Association she can point at several initiatives. Considered on their own they might not seem like much, i.e. in relation to the gargantuan task of developing a Semantic Web, but when they are combined a certain pattern begins to emerge.
There is, for instance, a project, SitecheckerRDF, which will look into ways in which RDF can be used for describing content for Web-based delivery. Furthermore, several standardisation projects are running that enable the participating institutions to develop the description of their specific knowledge domain, e.g. graphic domain, religion, art history. The formal and semantic mapping schemes used in these projects will include Dublin Core, Encoded Archival Description (EAD), IMS Learning Resources Meta-data Specification, Art & Architecture Thesaurus, as well as the CIDOC reference model. Kersen: ‘At the moment most of the reference terms are developed at the level of institutions, which means their use is limited to a certain domain. Or, to put it another way, every domain is developing its own dialect. Maybe the development of a combined reference scheme will be a step towards a Semantic Web.’
Another important project is the development of a scheme for description at the collection level, in order to offer a clearer and more hierarchical access to heritage collections. This project has its roots in the Dutch project for collection level description, MUSIP (Museum Inventarisation Project). The description scheme will be broadened to make it available for other heritage institutions as well, for example archives.
The description and results of these projects and the programme lines will be made available on the Web site of the Vereniging DEN, http://www.den.nl. As Cultuurwijzer is used as a proof of concept, the results will be directly accessible at http://www.cultuurwijzer.nl. (Kersen kindly invites interested parties to put questions directly to their organisation.)
Projects are always carried out in co-operation with the member organisations. Kersen: ‘We first try it ourselves until we are sure it works, a “proof of concept” you could say. These tests are overseen by a small group of automation experts working for our member organisations. The next step is to test the method on a larger scale with some of our member organisations. A larger working group oversees these tests. Only then is the method released to our member organisations. The advantage is that smaller member organisations can ride on the experience of the larger ones. By taking a step-by-step approach we also enhance the level of commitment. You could say we are providing some order in the information chaos that exists on the Internet. A few small steps along the long road towards the Semantic Web.’
Vereniging Digitaal Erfgoed Nederland: http://www.den.nl
Cultuurwijzer: http://www.cultuurwijzer.nl
DIGICULT’S EXPERT 13 TANGLE
WITH THE SEMANTIC WEB
By Michael Steemson
It was really very kind of the Moderator. He asked the cultural heritage experts: ‘What is the message that we should give our scientific writer, who will do a write-up of this meeting for the Thematic Issue?’ Their inclinations had not always been clear. But the question focused minds and they were certain now.
‘I would put my money on the Semantic Web,’ said two of them, not quite in unison. ‘The Semantic Web is a direction, it is like North. You go North, but you never arrive and say “here it is”. This is the Semantic Web,’ said another.
The course of the debate at the Darmstadt DigiCULT Forum had not always been so direct. It had started with the Position Paper’s dismaying thought that ‘the limited understanding of information processing in the heritage sector almost makes the Semantic Web an impossibility to apply’.
It had touched on the semantics of Simeon poetry, art works of the biblical seductress Salome, weather forecasts for the northern English city of York and the revolutionary theories of 16th-century Italian astronomer Galileo. But the experts agreed, finally, that the cultural heritage sector needed the Semantic Web and that a good deal of education and guidance would be required to make it appreciate that need.
[Image caption] Genesis - The Creation: Birds and Fishes
The experts numbered 13, lucky for some, at this, the third DigiCULT Forum of the European Union’s technology watchdog for cultural and scientific heritage institutions. In the previous 12 months, other forum groups had discussed authenticity and integrity for digitisation programmes and, later, digital asset management systems. Now the Darmstadt 13 - historians, language and information technology scientists, academics and publishers - were looking even further down the information autobahn to the vision of WWW inventor, Englishman Tim Berners-Lee, who sees a new kind of automated Web that learns and understands each user’s particular requirements and delivers complete, reliable, tested information sets.
In a co-authored May 2001 Scientific American article[1], Mr Berners-Lee imagines a family facing the horrors of re-scheduling its lives around a mother’s unexpected illness. The sons and daughters rely on Semantic Web ‘agents’, small executable Web files, to search online medical records, hospital bed lists, transport timetables, doctors’ appointment books, road condition reports and home diaries to find treatment, plan travel and re-arrange personal engagements to fit the emergency.
The vision requires huge world-wide investment in time and effort, creating countless ‘ontologies’ containing, perhaps, XML (eXtensible Mark-up Language) and RDF (Resource Description Framework) data, to which the electronic ‘agents’ could refer for understanding before applying to specially formatted Web pages for the information.
The dazzling forecast of Berners-Lee et al. is: ‘The Semantic Web will enable machines to comprehend semantic documents and data, not human speech and writings. Properly designed, the Semantic Web can assist the evolution of human knowledge as a whole.’
THE DAZZLING PROSPECTS
Dazzling it is, and the Darmstadt 13 were attracted. But they were not blinded. Moderator Dr Seamus Ross, the Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII), suggested that while the current WWW content got a lot of ‘bang’ for its development dollars, the Semantic Web needed huge, expensive content before it could work well.
Application of the Berners-Lee ideas to cultural heritage use was a long way off, he thought, and wondered: ‘Is there enough benefit from the Semantic Web in the near term to make it a realisable dream 50 years down the road?’
Italian National Research Council Applied Ontologies Laboratory director Nicola Guarino has been working on the subject for 12 years and he knows the difficulties. He said: ‘This is the ideal view which Tim Berners-Lee has: machines which work for you, your proxy which works for you, performing these dynamic connections for the Web which preserve meaning. It is pretty ambitious, but this is his idea. I would be happier if, rather than using an automatic proxy, we could just let people establish these dynamic connections using their brain and the Web. This is already something that is not easily done.’
Austria’s Wernher Behrendt had encountered other snags. At Salzburg Research, the secretariat for the DigiCULT Forums, he co-ordinates another European Commission IST project, CULTOS (Cultural Units of Learning - Tools and Services). He conceded: ‘There is a 50-year research vision behind the issue of the Semantic Web,’ and went on, ‘but there are incremental steps that, with good utility, can be built in a reasonable time. One of the intellectual challenges is to break the vision into these manageable steps.’
LANGUAGE REPRESENTATION
HITCHES
A CULTOS group had, Behrendt explained, taken one of these incremental steps and built an ontology[2] for digitised works of art. It had encountered problems with language representation, like: ‘Are there tools to support knowledge representation language? Are the users then actually able to work usefully with that? Can we incorporate the multimedia authoring component, where people who have not built the ontology themselves should then use it to combine multimedia assets with each other?’
Cultural Units of Learning - Tools and Services (CULTOS)
CULTOS is an RTD project co-funded by the European Commission under the Information Society Technologies (IST) Programme, which will run until October 2003. The application domain of CULTOS is intertextual studies in literature and arts. The project is developing a multimedia authoring and presentation environment that allows scholars to make the different relationships between cultural works explicit in a way that approximates contextualisation in interpretative processes. The result of the authoring processes is multimedia objects called ‘intertextual cultural threads’. These are based on ‘EMMOs’, a novel type of structured multimedia object containing expert knowledge, conforming to current and emerging standards such as XML/SMIL (with interactive extensions), MPEG-7 and RDF.
http://www.cultos.org
[1] T. Berners-Lee, J. Hendler, O. Lassila: The Semantic Web. In: Scientific American, May 2001. http://www.sciam.com/2001/0501issue/0501berners-lee.html
[2] Ontology, n. 1. Philosophy. The branch of metaphysics that deals with the nature of being. 2. Logic. The set of entities presupposed by a theory. Collins English Dictionary, Third edition, Glasgow, 1991.
The Dutch had doubts, too. Dr Janneke van Kersen is an art historian with Digital Heritage Netherlands (Digitaal Erfgoed Nederland, http://www.den.nl), where an XML-based content management system is being combined with a Resource Description Framework (RDF) to join databases from several cultural heritage institutions. She told the experts: ‘I need to be assured that we will be able to build a layered structure that is equally applicable to each knowledge domain. Furthermore, I think that the cultural heritage sector is too much of a niche market to develop this.’
Her countryman Dr Frank Nack, from the national research institute for mathematics and computer science CWI (Centrum voor Wiskunde en Informatica, http://www.cwi.nl) in Amsterdam, works with a multimedia and human computer interaction group. His concern: ‘Our group believes in the Semantic Web, but we needed some mechanisms to structure the information so that various groups can work with it. What we came up with was the belief that you can classify the user at a particular time. But that was simply not good enough.’
The group had found that users change their requirements widely and these shifts were invisible to a system. ‘Humans can look at material one day, and the next day they look at the same stuff differently and describe it differently because they are in a different mood,’ he said. He characterised the problem as: ‘Now I would like to see something for my work, and now I want to be entertained. Which means I would like to access the information differently.’
He had one other worry: Webised mixed media. He said: ‘This discussion has been heavily linguistic based, which I can understand because most people do still think of the Web as text driven. The issue of describing various media items that are not text will, I think, very soon become important for the Semantic Web. We had better start thinking about that, too.’
THE GALILEO CONUNDRUM
The Institute and Museum of Science History in Florence, Italy, has tried to create an ontology around the works and sciences of its city’s famous son, the revolutionary astronomer, mathematician and physicist Galileo Galilei (1564-1642)[3]. But it ran into difficulties when it came to the radical changes in theory that he created.
Institute relational database expert Andrea Scotti told the Forum: ‘Galileo’s scientific theory negates another scientific theory. This negating or development of theories was very difficult to represent in the ontology when dealing with the time factor. That is central to historical documentation, but representational time was not part of the process available to us.’
Dr Costis Dallas, the Athens chairman of the European communication and technology group Critical Publics, thought the Florence museum’s project was very ambitious. The time argument was difficult because ‘of course, it isn’t possible to represent time properly within a relational database’. But there were mechanisms - he mentioned software by the Virginia, US, IT group Telos[4] - that better represented issues of time.
Italian National Research Council’s Nicola Guarino chipped in: ‘The CIDOC[5] reference models have partial answers to these questions.’
Dr Dallas went on: ‘I do not believe that the whole exercise is futile, but we found that, in practice, you cannot make a subject language for everybody. It has to be for a community of users. If you provide them with a richer representation, for instance if they can know that this is a person and this person lived in a place, then users will have a much richer experience.’
[3] The museum and Web site are rich resources for the life and work of Galileo. http://galileo.imss.firenze.it
[4] Telos Corporation, http://www.telos.com, Ashburn, VA, US.
[5] CIDOC: International Committee for Documentation of the International Council of Museums (ICOM-CIDOC), http://www.willpowerinfo.myby.co.uk/cidoc/CIDOCe. Forum for documentation interests of museums and related organisations, one of 25 international committees of the International Council of Museums (ICOM).
CAPITALS AND ACRONYMS
The Forum moved on to a discussion of the science and software behind the Semantic Web. Moderator Seamus Ross started by questioning whether ‘key thinkers’ were ‘missing a fundamental point: that Web pages are dead, that database-driven Web pages are the future, that people are going to stop making Web pages and make databases’.
He recalled that James A. Hendler, co-author with Tim Berners-Lee of the Semantic Web paper and a professor in the Department of Computer Science at the University of Maryland, had recommended semantic representation as part of any Web pages. Dr Ross commented: ‘This notion of the Semantic Web is not going to work with these databases. Is this true or not?’
Bert Degenhart-Drenth the managing director ofNetherlands ADLIB Information Systems(httpwwwnladlibsoftcom) thought it was trueBut it was a practical problem that would be solved
DigiCULT 17
Open Archives Initiative (OAI) Protocol for Metadata Harvesting
The Open Archives Initiative (OAI) is a mainly US-based group of people and organisations that evolved out of a need to increase access to scholarly publications through interoperable digital repositories. Support for the OAI’s goals comes from the Digital Library Federation, the Coalition for Networked Information and from an NSF grant. One of its major achievements is an application-independent interoperability framework based on metadata harvesting, the OAI Protocol for Metadata Harvesting.
The OAI Protocol is based on the Web standards HTTP and XML, and employs Dublin Core (unqualified) as its metadata standard. Heritage organisations that have systems supporting the OAI Protocol can expose metadata about the content in their repository, i.e. allow service providers to harvest the data for services such as search engines. In the OAI Protocol, XML Schema is used at two levels: to define the format of responses to all OAI Protocol requests, and to define the format of metadata streams embedded in the GetRecord and ListRecords responses. In both cases the goal is to provide a mechanism for data validation.
http://www.openarchives.org/
The OAI-PMH version 2.0, released in June 2002, can be found at http://www.openarchives.org/OAI/2.0/openarchivesprotocol.htm
See also John Perkins: A New Way of Making Cultural Information Resources Visible on the Web: Museums and the Open Archives Initiative. Museums and the Web Conference 2001, http://www.archimuse.com/mw2001/papers/perkins/
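The harvesting mechanism described in this box can be sketched in a few lines of Python. This is a minimal illustration, not a complete harvester: the repository address and the sample response are invented for the example, while the ListRecords verb, the oai_dc metadata prefix and the two XML namespaces are those of OAI-PMH 2.0 and Dublin Core.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = "http://www.openarchives.org/OAI/2.0/"
DC_NS = "http://purl.org/dc/elements/1.1/"


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build a ListRecords request for unqualified Dublin Core metadata."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


def extract_records(response_xml):
    """Return (identifier, [titles]) for every record in a ListRecords response."""
    root = ET.fromstring(response_xml)
    records = []
    for record in root.iter(f"{{{OAI_NS}}}record"):
        identifier = record.findtext(f"{{{OAI_NS}}}header/{{{OAI_NS}}}identifier")
        titles = [t.text for t in record.iter(f"{{{DC_NS}}}title")]
        records.append((identifier, titles))
    return records


# Invented sample response, trimmed to the elements used above.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Medieval tapestry</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    # Hypothetical repository address; a real harvester would fetch this URL.
    print(list_records_url("http://repository.example.org/oai"))
    print(extract_records(SAMPLE_RESPONSE))
```

A service provider would issue such a request over HTTP and repeat it for each repository it harvests; the sketch parses a canned response so that it runs without network access.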
Open Archives Forum Project
http://www.oaforum.org/
The Open Archives Forum Project is a two-year Fifth Framework Programme IST accompanying measure that will run until September 2003. The Forum is building a Web-based database on OAI-related projects, software implementations and services, and supports the information exchange between OAI user communities.
Their surveys provide good insight into the status of uptake of the OAI in Europe. For an overview of the results see S. Dobratz, B. Matthaei: Open Archives Activities and Experiences in Europe: An Overview by the Open Archives Forum. In: D-Lib Magazine, Vol. 9, No. 1, January 2003, http://www.dlib.org/dlib/january03/dobratz/01dobratz.html
Recently one of their workshops, ‘Providing Access to Hidden Resources’ (Lisbon, December 2002), targeted the libraries and archives communities. Requirements, standards, best practice and solutions to interoperability problems of these communities were analysed and compared with the features provided by the OAI Protocol for Metadata Harvesting. The tutorial ‘OAI and OAI-PMH for Beginners’ and other presentations can be found at http://www.oaforum.org/workshops/lisb_programme.php
by projects like the Web Services of the Open Archives Initiative (OAI).
Several members spoke of difficulties created by dynamic Web pages generated from ASP databases. Paul Miller, the UK Interoperability Focus for the University of Bath’s UKOLN (formerly the UK Office for Library Networking) project, said the UK’s
Office of the e-Envoy [6], the agency in charge of Britain’s e-Government programme, drove it with a Web site called Govtalk [7]. ‘One big ASP database,’ he called it. It carried the Government’s interoperability standard framework with a URL of ‘a great big long string, something, something, dot ASP. That is how this key piece of legislation is referred to, and next week it will be something else.’
Dr van Kersen thought the new Web Ontology Language, with the transposed acronym OWL, would help. Italian ontology researcher Nicola Guarino discussed the European Commission’s On-To-Knowledge project, its RDF tool Sesame [8] and the Ontology Inference Layer (OIL) language.
Dr Ross described the US Defense Department’s DARPA (Defense Advanced Research Projects Agency) Agent Markup Language (DAML) programme and its DAML+OIL variant. Mr Degenhart-Drenth highlighted the importance of protocols used in Web Services, such as SOAP (Simple Object Access Protocol) and UDDI (Universal Description, Discovery and Integration), and also pointed to a SPECTRUM XML standard for museums that he helped write. Then there was the moviemakers’ audiovisual search standard MPEG-7 and SMIL (Synchronized Multimedia Integration Language). No one mentioned SHOE (Simple HTML Ontology Extensions), which was surprising as the discussion became more and more alphabetic and upper case.
ONTOLOGY TUTORIAL IN 800 WORDS
Nicola Guarino brought the discussion back on track. The ontology expert delivered a fascinating, impromptu 800-word dissertation on ontology genetics.
Ontologies, he said, started because it was realised that controlled vocabularies, which worked well enough for limited periods, needed something extra to make them really useful. They needed clarification of intended meaning. This could be achieved in much the same way as dictionaries did it, by reference to other, more basic terms. This was, he said, the key point.
‘Ontologies can work if the basic terms are really used in a principled way. There is a hidden assumption here that it is indeed possible to express the meaning in terms of a relatively small set of primitive terms.’
He explained further: ‘There are general terms that have a universal meaning. Take the term “part”, for instance, or “set”. Or take temporal relations: “before”. Suppose I have two different periods, the Renaissance and another period, and suppose I say that this period comes “before” the other one: do you exclude the case whereby the two periods overlap, or not? This is just a matter of stipulation; this is a general term that is not domain specific. I can simply stipulate exactly whether the “before” relationship between the two intervals includes the case of overlapping or not. And I can do that by means of axioms. Once you clarify the meaning of the basic terms (topological relations, mereological [9] relations, dependence relations, these kinds of things), then you have the basic vocabulary that helps you to introduce more domain-related things. And this is what people are doing in the area of what are called “Foundational Ontologies”, and this is what I am doing. I believe this is the only way to solve the problem of semantic interoperability. So not just controlled vocabularies, but vocabularies that are formally defined in minimal terms.’
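Guarino’s point about stipulating the meaning of ‘before’ can be made concrete with a small Python sketch. It is only an illustration under stated assumptions: the two function names are hypothetical, and the year ranges are rough placeholders chosen so that the periods overlap, not authoritative datings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Period:
    name: str
    start: int  # first year of the period
    end: int    # last year of the period

    def __post_init__(self):
        if self.start > self.end:
            raise ValueError("a period cannot end before it starts")


def before_excluding_overlap(a, b):
    """Stipulation 1: a is 'before' b only if a has ended when b starts."""
    return a.end < b.start


def before_allowing_overlap(a, b):
    """Stipulation 2: a is 'before' b if a merely starts earlier than b."""
    return a.start < b.start


# Placeholder year ranges, chosen only to produce an overlap.
renaissance = Period("Renaissance", 1400, 1600)
baroque = Period("Baroque", 1580, 1750)

if __name__ == "__main__":
    print(before_excluding_overlap(renaissance, baroque))
    print(before_allowing_overlap(renaissance, baroque))
```

The same pair of periods satisfies one definition and fails the other, which is why two systems that both record a ‘before’ relation cannot exchange data safely until the definition is made explicit.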
Now it was clear, but would it be available to heritage institutions? Dr Ross wanted to know if fundamental ontologies of use to the heritage sector already existed. What would they be? Before any
On-To-Knowledge
On-To-Knowledge is an IST RTD project that was completed in June 2002. The project developed tools and methods for supporting knowledge management in large and distributed organisations. The technical backbone of On-To-Knowledge was the use of ontologies for the various tasks of information integration and mediation. For the project’s many results see their tools repository, project deliverables and publications at http://www.ontoknowledge.org/pubs.html
See also the On-To-Knowledge book ‘Towards the Semantic Web: Ontology-driven Knowledge Management’, J. Davies, D. Fensel, F. van Harmelen (eds.), John Wiley, December 2002.
[6] Office of the e-Envoy, http://www.e-envoy.gov.uk
[7] Govtalk, http://www.govtalk.gov.uk
[8] Sesame environment, http://www.ontoknowledge.org/tools/factsheet/Sesame.html
[9] Mereology, n.: The formal study of the logical properties of the relation of part and whole. Collins English Dictionary, Third edition, Glasgow 1991.
development could begin, basic classifications and methodologies would be required to form a foundation for the work.
Italian online publisher Marco Meli, a manager of the EU’s MESMUSES (Metaphors for Science Museums) project, insisted: ‘You need a clear definition in this particular domain. What are the key words, the terms?’
Mr Guarino answered: ‘The key concepts and the key relations.’
THE ELUSIVE SEMANTIC GRAIL
By now the Darmstadt 13 were beginning to realise they were not having much luck with their search for the Semantic Grail. Their discussion became a little tetchy.
Someone talked about ‘meta-ontology’ and another growled: ‘The term “meta” is abused.’
‘Outdated technical optimism’ was mentioned.
‘What is your alternative?’ someone else wanted to know.
‘I don’t have one.’
‘Is this the way forward?’
MESMUSES - Metaphors for Science Museums
MESMUSES is an IST RTD project that will run until July 2003. It aims at designing a general method and supporting tools to produce knowledge maps for use in self-learning environments of science museums. In the project a knowledge map is defined as a set of related concepts and facts that is offered to learners with some guidance or suggestions on possible itineraries that they may follow to explore the knowledge space.
The method and tools developed in MESMUSES are being tested and validated by two large science museums, the Cité des Sciences et de l’Industrie in Paris and the Istituto e Museo di Storia della Scienza in Florence, which provide access to their digital catalogues. Both museums are developing knowledge maps and itineraries on different themes in Biology (Genome) and Physics (Galileo and the laws of motion).
Project Web site: http://cweb.inria.fr/Projects/Mesmuses/
See also M. Meli: Knowledge Management: a new challenge for science museums. In: Cultivate Interactive, Issue 9, 7 February 2003, http://www.cultivate-int.org/issue9/mesmuses/
‘What part of the problem would that solve?’
‘We will know when we have tried that out.’
‘Here we are approaching a scary field.’
Civility and peace were restored as Frank Nack, the CWI Netherlands scientist, introduced the thought: ‘There are ontologies for art and they are very old and well crafted. There are very clear rules about why they did what they did, because they have worked on them for a thousand years. What you could suggest is that we strip down to the basics for one field, say art, and apply it to all the other fields we have in cultural heritage: architecture, film, whatever, all working with very different substances.’
Seamus Ross added: ‘So we need one fundamental ontology on which we can build all the others.’
THE LUCK CHANGES
The luck of the 13 was beginning to change. The Athenian heritage informatics expert Dr Dallas described work among his company clients on developing ‘an upper ontology’. All the issues the Forum had been discussing (what to do about time, basic concepts, process, agents and so on) were being examined. They were beginning to develop ‘something very much like a thesaurus’, with term expansion that created sub-categories of relationships. He called it ‘generic layering’, a process that could identify the ‘generic grammar’ of relationships within a specific domain, art history for example.
‘This is useful,’ he said. ‘This way we can create Web systems that present an association of content for users that is meaningful to them. Let’s say a “cultural” meaning.’
Nicola Guarino went further. He believed that the International Council of Museums’ Committee for Documentation (CIDOC) Conceptual Reference Model (CRM) was the ‘best starting point’ for the heritage community. The CIDOC CRM is the result of 10 years’ work by a standards working group. The
model is under review for adoption as an International Organization for Standardization (ISO) publication.
Mr Guarino was enthusiastic: ‘I am not biased on this. I am a reviewer on the CHIOS project that supports this proposal for an ISO standard, and I am amazed by the fact that this standard is more or less principled. On the one hand the authors really strive to get these principled things, and on the other hand they have extensively accounted for existing practice. It is the result of a large community of work. It is not perfect, but it really is a starting point for this community.’
He affirmed Dr Ross’s delighted question: ‘So this is an ontology we can borrow?’
And the Italian expert had more to add: ‘The question before was “how can we be sure that the principled things can really solve our problems?” I do not have a crisp answer, but I do have some evidence that even a tiny result on the foundational side has a high pay-off. So you do not need to solve all the foundational issues.’
The distinctions between an object and its role, or individual and classes of items, were delicate but, once understood, could lead to significant data improvement, he said, adding: ‘Take for instance the distinction between object and event. This is only one tiny distinction, but it is so fundamental that once you understand it you can save time in developing your own application ontology. Tiny conceptual progress does have a high pay-off. This is why I believe it is useful.’
There were still one or two doubts, but CULTOS project co-ordinator Wernher Behrendt tidied up with a daring stance: ‘Let me be a heretic for a second. How many of us have an operating system other than Windows? What I am saying is that standardisation often helps. Even if it is not the best standard, it does help get people working together. It is perfectly fair to define a standard for the world now. There will be a lot of discussion, but it will wind down to a few constructs. It is a better method of getting an ontology accepted than having ontologies mushrooming all over the place that must then be integrated.’
The Darmstadt 13 were pleased. They had a model. They had ‘Web Services’ stepping stones to work across. They weren’t going to fall into the trap of insisting just yet that the Semantic Web was important for the heritage sector, but they wanted an education process for unconvinced curators.
They needed someone to make the first ontology move. Seamus Ross suggested asking the J. Paul Getty Trust (http://www.getty.edu). They needed automated tools for testing ontologies. They did not want to be delivered into the entertainment industry, but saw benefit in what University of Florence Associate Professor Franco Niccolucci called ‘cultural entertainment’.
So where would the experts put their money, asked Mr Behrendt: ‘On there not being any benefits in the Semantic Web for the cultural heritage sector, or there being some benefits in building such things, whatever they may be?’
Amsterdammer Frank Nack was in no doubt: ‘It is going to happen. It will probably look very different from how we imagine it right now, but it is going to happen.’ His countryman, ADLIB chief Bert Degenhart-Drenth, thought so too: ‘We have put our money there. All our applications work with XML.’ And Dr van Kersen agreed: ‘I would put my money on the Semantic Web.’
That seemed to make it game, set and match.
CIDOC CRM: ‘The Semantic Glue’
The ‘CIDOC object-oriented Conceptual Reference Model’ (CIDOC CRM) was developed by the ICOM/CIDOC Documentation Standards Group. Since September 2000 the CIDOC CRM has been under development as an ISO standard.
‘The CIDOC CRM is intended to promote a shared understanding of cultural heritage information by providing a common and extensible semantic framework to which any cultural heritage information can be mapped. It is intended to be a common language for domain experts and implementers to formulate requirements for information systems and to serve as a guide for good practice in conceptual modelling. In this way, it can provide the semantic glue needed to mediate between different sources of cultural heritage information, such as that published by museums, libraries and archives.’
http://cidoc.ics.forth.gr/what_is_crm.html
CHIOS - Cultural Heritage Interchange Ontology Standardization project
Since June 2001 the work of the CIDOC CRM Special Interest Group has been supported by CHIOS, a two-year project which receives funding from the Fifth Framework IST Programme. The CHIOS consortium forms an integral part of the CIDOC CRM Special Interest Group which, by organising shared meetings, represents the interests and requirements of the cultural heritage community to the ISO Working Group (ISO/TC46/SC4/WG9).
http://cidoc.ics.forth.gr/chios_iso.html
Annotation & Authoring
‘How to annotate and have fun’, http://annotation.semanticweb.org/faq/faq2, and other papers. See also http://annotation.semanticweb.org/tools, Institute for Applied Informatics and Formal Description Methods, University of Karlsruhe, Germany.
The Annotea Project, a ‘Live Early Adoption and Demonstration (LEAD)’ project of the World Wide Web Consortium (W3C): a collaboration environment with shared annotations. www.w3.org/2001/Annotea
CIDOC CRM
The CIDOC Conceptual Reference Model was an important reference point in the Forum discussion (see also the information box on the CIDOC CRM in the Summary). CIDOC, the Comité International pour la Documentation, is part of the International Council of Museums (ICOM). Its Web site can be found at http://cidoc.ics.forth.gr
A report on CIDOC’s work on the CRM is provided in ‘The CIDOC Conceptual Reference Model: A Standard for Communicating Cultural Contents’ by Nick Crofts, Martin Doerr and Tony Gill, in Cultivate Interactive, Issue 9, February 2003, http://www.cultivate-int.org/issue9/chios/
See also their tutorials at http://cidoc.ics.forth.gr/tutorials.html
DAML - DARPA Agent Markup Language
The Defense Advanced Research Projects Agency (DARPA) is the central research and development organisation for the US Department of Defense. Its DAML programme is developing a language and tools to facilitate Semantic Web concepts. http://www.daml.org
‘Why Use DAML?’, Adam Pease, white paper, Teknowledge, 10 April 2002, http://www.daml.org/2002/04/why.html
DAML+OIL
DAML+OIL is a product of the DARPA Joint United States/European Union ad hoc Agent Markup Language Committee. The committee created a language with the best features of SHOE, DAML, OIL and several other markup approaches. It is a Web ontology language (latest release March 2001) expected to provide a basis for future Web standards for ontologies. See http://www.daml.org/2001/03/daml+oil-index.html
Genesis - The Creation, Stars and Fishes
SEMANTIC WEB TERMS AND READING LIST A-X
Compiled by Guntram Geser and Michael Steemson
In the Summary we have provided information boxes on projects, but not on the many Semantic Web standards, technologies etc. mentioned in the Forum discussion. The projects included are only a small fraction of many ongoing activities and are related mainly to the cultural and scientific heritage community. Links to many important Semantic Web development projects can be found at their community portal, http://www.semanticweb.org
The following guide points to resources and readings on terms mentioned in the Forum Summary. It is not intended to provide a comprehensive list of Semantic Web materials. Rather, it represents different entry points and levels to this topic.
OIL - Ontology Inference Layer
OIL is a proposal for a Web-based representation and inference layer for ontologies. It uses a layered approach to defining a standard ontology language. Each layer adds functionality and complexity to the previous layer. This is done such that machines that can only process a lower layer can still partially understand high-level ontologies. See http://www.ontoknowledge.org/oil
A white paper on OIL functions: ‘An informal description of Standard OIL and Instance OIL’ by the On-To-Knowledge group, led by the Department of Computer Science, University of Manchester, UK, 28 November 2000, www.ontoknowledge.org/oil/downl/oil-whitepaper.pdf
Ontologies
Ontology research: Laboratory for Applied Ontology (LOA), Institute of Cognitive Sciences and Technology (ISTC), http://www.ladseb.pd.cnr.it/infor/ontology/ontology.html
Dieter Fensel: Ontologies: A Silver Bullet for Knowledge Management and Electronic Commerce. New York: Springer, 2001.
‘Ontology Infrastructure for the Semantic Web’, includes detailed subject bibliography. WonderWeb Project, Department of Computer Science, Victoria University of Manchester, UK, http://wonderweb.semanticweb.org/deliverables/documents/D15.pdf
Standard Upper Ontology (SUO): An upper ontology for data interoperability, information search and retrieval, automated inferencing and natural language processing. IEEE Standard Upper Ontology (SUO) Working Group, http://suo.ieee.org
OWL - Web Ontology Language
The Web Ontology Language is a semantic markup language for publishing and sharing ontologies on the World Wide Web. OWL is developed as a vocabulary extension of the Resource Description Framework (RDF) and is derived from the DAML+OIL Web Ontology Language. For the development of this language see the documents of the Web Ontology (WebOnt) Working Group, http://www.w3.org/2001/sw/WebOnt/
See also OWL Web Ontology Language Reference (W3C Working Draft, 31 March 2003) at http://www.w3.org/TR/owl-ref/ and OWL Web Ontology Language Guide (W3C Working Draft, 31 March 2003), http://www.w3.org/TR/owl-guide/
For an understanding of the goals, requirements and usage scenarios for a Web ontology language see ‘Web Ontology Language (OWL) Use Cases and Requirements’ (W3C Working Draft, 31 March 2003), http://www.w3.org/TR/webont-req/
RDF - Resource Description Framework
See Cultural Heritage Semantic Web Example & Primer, pp. 32-34.
Semantic Web
‘The Semantic Web’, Tim Berners-Lee with James Hendler and Ora Lassila, Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html
‘Enhanced Science and the Semantic Web’, J. A. Hendler, in Science magazine, Volume 299, Number 5606, 24 January 2003, pp. 520-521.
‘Peer-to-Peer: The Infrastructure for the Semantic Web’, Stanford University. The Semantic Web as the next evolutionary step of the Internet. http://p2p.semanticweb.org
The Semantic Web Community Portal SemanticWeb.org, currently operated by three research groups: the Onto-Agents and Scalable Knowledge Composition (SKC) Research Group at Stanford University, the Ontobroker-Group at the University of Karlsruhe, Germany, and the Protégé Research Group at Stanford University. http://www.semanticweb.org
The W3C Semantic Web Activity Statement explains the consortium’s plans in the areas of enabling standards (driven by the RDF Core and Web Ontology Working Groups), education and outreach (RDF Interest Group), as well as co-ordination and advanced development. http://www.w3.org/2001/sw/Activity
See also Kim Veltman’s warning of what he sees to be too narrow a definition of the Semantic Web, one that will not allow the historical dimension, the richness of cultural expression, the unique and the diversity of interpretations to be adequately dealt with. Cf. K. Veltman: Challenges for a Semantic Web (July 2002), http://www.cultivate-int.org/issue7/semanticweb/
SHOE - Simple HTML Ontology Extensions
SHOE was one of the first ontology-based markup languages developed for use on the World Wide Web. It is a small extension to HTML that allows Web page authors to annotate their Web documents with machine-readable knowledge. See http://www.cs.umd.edu/projects/plus/SHOE/
SMIL - Multimedia on the Web
The Synchronized Multimedia Integration Language (SMIL, pronounced ‘smile’) enables authoring of interactive audiovisual presentations. SMIL is typically used for ‘rich media’/multimedia presentations which integrate streaming audio and video with images, text or any other media type. SMIL is an HTML-like language and may be written using a simple text editor. W3C Synchronized Multimedia: http://www.w3.org/AudioVideo/
SPECTRUM-XML DTD
SPECTRUM (Standard Procedures for Collections Recording Used in Museums) was created by the mda (http://www.mda.org.uk). It is a guide to good practice for museum documentation that describes procedures for documenting objects and the processes they undergo, as well as the information that needs to be recorded to support the procedures. For SPECTRUM, an XML Document Type Definition has been produced which serves as a system-neutral interchange format for museum data.
‘SPECTRUM: The UK Museum Documentation Standard’ is available in its second edition, see http://www.mda.org.uk/spectrum.htm
For a description of the creation, structure and deployment of the SPECTRUM-XML DTD see Bert Degenhart-Drenth: Building on the mda SPECTRUM-XML DTD for Collections Management Data Interchange. Museums and the Web Conference 2001, http://www.archimuse.com/mw2001/papers/degenhart/degenhart.html
Web Services
Although the concept of Web Services is not a new one, the definition of Web Services standards allows the wider use of these services. Currently three main protocols are used in the context of Web Services: UDDI (Universal Description, Discovery and Integration), a registry system to find resources and Web Services; WSDL (Web Services Description Language), an interface description language; and SOAP (Simple Object Access Protocol), the communication protocol for Web Services. All three protocols are based on XML.
For a description of the many ways in which XML can enhance Web Services see http://www.w3.org/2002/ws/Activity
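To show what ‘based on XML’ means in practice for SOAP, the following Python sketch assembles a minimal request message. The envelope namespace is that of SOAP 1.1; the service namespace, the GetObjectRecord operation and the objectId parameter are hypothetical names invented for this example and do not belong to any real museum service.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "urn:example:museum-service"               # hypothetical service namespace


def build_request(object_id):
    """Wrap a hypothetical GetObjectRecord call in a SOAP envelope."""
    ET.register_namespace("soap", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}GetObjectRecord")
    ET.SubElement(call, f"{{{SERVICE_NS}}}objectId").text = object_id
    return ET.tostring(envelope, encoding="unicode")


if __name__ == "__main__":
    print(build_request("obj-42"))
```

In a complete set-up the operation and its parameter would be declared in a WSDL document, and the service itself would be found through a UDDI registry; the sketch covers only the message that travels between client and service.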
World Wide Web
Looking back as well as into the future: ‘Weaving the Web: The Original Design and Ultimate Destiny of the World Wide Web by Its Inventor’, Tim Berners-Lee with Mark Fischetti, Harper San Francisco, 1999.
XML - Extensible Markup Language
See Cultural Heritage Semantic Web Example & Primer, pp. 27-30.
For the work done at W3C within the XML activity see XML Working Groups, http://www.w3.org/XML/; the XML specifications can be found at http://www.w3.org/XML/Core/
‘The bane of my existence is doing things that I know the computer could do for me.’ Dan Connolly, The XML Revolution, October 1998, http://www.nature.com/nature/webmatters/xml/
‘The Semantic Web, as it is advocated by people like Tim Berners-Lee and James Hendler, does not take enough advantage of the experience built up in knowledge engineering and conceptual modelling. There is this anarchistic idea of the Web as a place where everyone can do his or her own thing. I have no problem with that; a lot of people are able to find what they want on the Web. But if you want real interoperability, with search engines that can grasp the intended meaning of information, that approach falls short. To create a real Semantic Web we have to develop and use well-founded generic ontologies based on linguistics and logics.’
Nicola Guarino has clear views on the Semantic Web and its development. He is a senior researcher at the Institute for Cognitive Sciences and Technologies in Italy, where he leads the Laboratory for Applied Ontology. Since 1991 he has played an active role in the Artificial Intelligence community in promoting the interdisciplinary study of ontological foundations of knowledge engineering and conceptual modelling. Guarino: ‘In our Laboratory the focus is on content and not so much on representation. The use of ontologies is unavoidable when referring to content. People do it implicitly all the time when they are communicating and trying to understand each other. If we want machines to understand each other, in other words real interoperability, we need to make these ontologies explicit in an unambiguous way.’
An ontology is a hierarchical description of the relations between concepts in a certain domain, plus an unambiguous description of the concepts themselves. As they are created for a certain domain, ontologies often fail to be interoperable because of the ambiguity that results from the use of the same terms for different concepts (and vice versa) between different domains. The term ‘net’, for instance, has quite a different meaning for Web designers and fishermen. That is why there is a need for well-founded generic ontologies. An example of a term that a generic ontology has to pin down is ‘part’, which can have different meanings both within a domain and between domains. For instance, the violist plays a part in the orchestra. His finger is part of him. Can his finger be part of the orchestra? According to Guarino this is a genuine ontological problem that can only be solved by giving an unambiguous meaning to the term ‘part’.
Another example cited by Guarino is the term ‘in’. What exactly are you describing when you say the spoon is in the cup? Does it mean that the spoon is totally embedded in the cup, or is it only partly in the cup? Guarino: ‘These examples seem trivial, but if you want real interoperability between different knowledge domains you will have to prevent the problems that come with the ambiguity of day-to-day language.’
In this respect Guarino thinks it is a drawback that computer science curricula scarcely ever contain an introduction to ontological foundations of conceptual modelling. ‘Students learn all about Java, HTML and C++, and name all the other languages, and they also learn how to use these. But when they graduate they hardly know a thing about formal ontology. I really think people should know more about the work on ontology that has been done in philosophy. It is certainly not much harder to acquire than, say, studying differential equations or learning how to use Java.’
Developing well-founded generic ontologies seems an enormous job, but the task is smaller than it appears. Guarino: ‘I would say that a few dozen would get you on the way nicely. But you have to take the fundamental route. At the moment, development of the Semantic Web is driven by the need for short-term results. Hence interoperability is realised by putting the right tags on the information. That is not what I call semantics; that is syntax. XML and RDF are very useful for this, but they fall short when you want to create a real Semantic Web.’
Laboratory of Applied Ontology: http://ontology.ip.rm.cnr.it
SEMANTIC WEB SHOULD BE BASED ON WELL-FOUNDED ONTOLOGIES
AN INTERVIEW WITH NICOLA GUARINO, LABORATORY OF APPLIED ONTOLOGY, ITALY
By Joost van Kasteren
The Semantic Web, according to a statement of its well-known advocate Tim Berners-Lee, is a vision of ‘a distributed machine which should function so as to perform socially useful tasks’ [1]. This machine should allow intelligent software agents to understand semantic relationships between Web resources in order to seek relevant information and perform transactions for humans.
Contrasted with the existing human-readable Web, the Semantic Web is envisaged as a Web of machine-readable data that will be based on ‘languages for expressing information in a machine processable form’ [2]. Key to an understanding of the Semantic Web, therefore, is how these languages function: how information is expressed in order that computers can automatically process Web sources and assist in making the Web more useful for humans. The aim of this chapter is to provide an overview of the Semantic Web concept by describing its general architecture, i.e. the interplay of its languages.
The chapter has two interrelated parts. Part 1 describes a Finnish project that strives to build the foundations for the “Finnish Museums on the Semantic Web” (FMS), a future semantic museum portal. This part consists of the information boxes on the following pages, which briefly describe the necessary elements and steps in the set-up of the FMS system. It is recommended to start by reading this description (see also graphic 3 on page 36, which provides an overview of the set-up of the FMS system). It should be helpful in gaining a general understanding of how semantic interoperability of, and new ways of interacting with, semantically marked-up cultural heritage information can be realised.
Part 2, the texts below the information boxes, is a primer that explains terms used in part 1, which represent core elements of the Semantic Web architecture, as well as providing illustrative examples. The explanations are not intended to give in-depth definitions of these elements; such definitions are provided in the relevant W3C specifications. The examples have been kept as simple as possible, but build on each other. In this way we will develop a (fictitious) Web site, http://www.m-i.org, that provides semantically enhanced access to such marvellous medieval images as the ones we have used to illustrate this Thematic Issue.
How to Make Collection Metadata of Museums Semantically Interoperable on the Web - The “Finnish Museums on the Semantic Web” (FMS)
The Semantic Web concept is visionary, and there are dedicated people, also in the heritage sector, who are trying to make it a reality. In our example, a group of researchers and technology developers who work at the University of Helsinki and the Helsinki Institute for Information Technology are translating the Semantic Web vision into a future semantic museum portal.
The group’s two-year project will run until spring 2004 and is being carried out in co-operation with, and with funding from, major organisations including
A CULTURAL HERITAGE SEMANTIC WEB EXAMPLE & PRIMER
By Guntram Geser
[1] Tim Berners-Lee: Interpretation and Semantics on the Semantic Web (1998), http://www.w3.org/DesignIssues/Interpretation.html
[2] Tim Berners-Lee: Semantic Web Road Map (1998), http://www.w3.org/DesignIssues/Semantic.html
[3] Robert DuCharme, http://lists.xml.org/archives/xml-dev/200211/msg00190.html
Espoo City Museum, Helsinki University Museum, National Board of Antiquities, Nokia, TietoEnator and the National Technology Agency (TEKES).
The major goals of the project are to make collection metadata, which stem from heterogeneous databases, semantically interoperable on the Web, and to provide facilities for semantic browsing and searching in the combined knowledge base of the participating museums.
The project’s vision is called “Finnish Museums on the Semantic Web” (FMS), and its architecture allows for all Finnish museums to join in. However, in an approach of starting small but ambitious, the project is at present using the collection databases of two museums, the Espoo City Museum and the National Museum of Finland. Furthermore, the implementation is currently restricted to only one part of the collections: textiles.
In order to reach the FMS system’s goal of making the museums’ metadata semantically interoperable on the Web, the data must be harmonised on the syntactic and semantic level. For this harmonisation the Extensible Markup Language (XML) and the Resource Description Framework (RDF) are being used, of which RDF is the key language for achieving semantic interoperability of the heterogeneous sets of metadata.
RDF and Metadata - A Natural Fit
An observer of the diffusion of the Resource Description Framework (RDF) into various domains, Robert DuCharme, has commented: 'I still find it a little ironic that while RDF has gotten so much publicity as a technology for warm and fuzzy AI (Artificial Intelligence) pie-in-the-sky technology, it's gotten most of its traction in the mundane world of metadata.'3
Yet, given the importance of metadata for the Semantic Web vision in general, it does not come as a surprise that metadata of key information communities belong to the first of RDF's intended uses. RDF seems to gain momentum in particular among the library and other communities that use Dublin Core.
The current W3C RDF Primer (Working Draft, 23 January 2003), edited by Frank Manola and Eric Miller, labels RDF as 'an ideal representation for Dublin Core information' and describes Dublin Core as one of their 'RDF in the field' examples (cf. http://www.w3.org/TR/rdf-primer/).
At the Dublin Core Metadata Initiative (DCMI), 'Expressing Simple Dublin Core in RDF/XML' was announced as a DCMI Recommendation in October 2002, 'the first in a series of recommendations for encoding Dublin Core metadata using mainstream Web technologies', i.e. XML, RDF, XHTML. 'Expressing Qualified Dublin Core in RDF/XML' is currently a Proposed Recommendation. Cf. http://dublincore.org/groups/architecture/
Syntactic Transformation 1: Creating the XML Documents
In the FMS system, the Extensible Markup Language (XML) is used as the data transfer format. This transfer format enables the system to make use of the data originally stored in the museums' heterogeneous collection databases. Therefore, each museum participating in the FMS initiative provides the relevant collection data as an XML document repository.
In a process of syntactic harmonisation, the data from a museum's collection database are retrieved and transformed to an XML format conforming to the XML Schema of the FMS initiative.
The data to be published are read from the database through a 'view', which helps create the XML format. The view is a queryable interface, a virtual table that results from an SQL query which may join multiple tables of the database. Through the view, the data are queried so that the rows of the tables are grouped by collection items. For each item, the set of rows is combined into a single XML document.
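The rows-to-XML step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the FMS implementation: the row layout (item_id, field, value), the sample rows and the function name rows_to_documents are hypothetical, and the namespace is the fictitious http://www.m-i.org/images used in this primer.

```python
import xml.etree.ElementTree as ET
from itertools import groupby

MI_NS = "http://www.m-i.org/images"  # fictitious namespace from the primer
ET.register_namespace("mi", MI_NS)

# Hypothetical rows as a view query might return them: (item_id, field, value).
ROWS = [
    ("image5kb78d38i", "type", "Column Miniature"),
    ("image5kb78d38i", "creator", "Alexander Master"),
    ("image5kb78d38i", "place", "Utrecht"),
]

def rows_to_documents(rows):
    """Group view rows by item and build one XML document (bytes) per item."""
    documents = {}
    # groupby only groups adjacent rows, so sort by item id first
    # (the sort is stable, so the field order within an item is kept).
    ordered = sorted(rows, key=lambda row: row[0])
    for item_id, item_rows in groupby(ordered, key=lambda row: row[0]):
        root = ET.Element(f"{{{MI_NS}}}image", {"image_id": item_id})
        for _, field, value in item_rows:
            ET.SubElement(root, f"{{{MI_NS}}}{field}").text = value
        # A non-UTF-8 target encoding makes tostring() emit an XML declaration.
        documents[item_id] = ET.tostring(root, encoding="ISO-8859-1")
    return documents

docs = rows_to_documents(ROWS)
```

Each entry of docs holds the serialised document for one collection item; a real export would write these to the museum's XML document repository.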
XML
XML is a markup language for describing data. It is a language created to allow anyone to design the structure of their own documents. An XML document contains text that consists of markup in the form of tags and plain text between them, the latter being just pure information (for example <creator>Alexander Master</creator>). XML tags are not predefined; everyone can define his or her own tags.
FMS Documents
A short presentation is provided in Eero Hyvönen et al.: Cultural Semantic Interoperability on the Web. Case Finnish Museums Online, http://iswc2002.semanticweb.org/posters/hyvonen_a4.pdf; for detailed descriptions see Vilho Raatikka, Eero Hyvönen: Ontology-based Semantic Metadata Validation, and Hyvönen, Eero, et al.: Semantic Interoperability on the Web. Case Finnish Museums Online. Both texts can be found in: Towards the Semantic Web and Web Services. Proceedings of the XML Finland 2002 Conference, http://www.cs.helsinki.fi/u/eahyvone/xmlfinland2002/ProceedingsXML2002-final.pdf
[Diagram: the database view ITEM_VIEW with the columns item_id, type, subject, iconclass, creator, manuscript, place, year]
XML shares the syntax and bracketed tags of the well-known HyperText Markup Language (HTML), but XML serves a different goal. While HTML is used to define the layout of pages on the WWW, XML is used to define the content of documents, for example to specify that an area of text is the name of a creator.
XML allows for creating markup (e.g. <creator>) that seems to carry some semantics. However, for a computer a tag like <creator> carries as much semantics as a tag like <H1>. A computer simply does not know what a creator is and how the concept creator is related to other concepts (e.g. manuscript). For an XML processor, <H1> and <creator> or <manuscript> are all equally (and totally) meaningless. XML is all about describing data; on its own it does not do anything. There needs to be a processing program that uses the markup to interpret the various elements.
The graphic below illustrates the database rows to XML process as described in the info box. It provides a very simple example of an XML document that describes some data for one of the medieval column miniatures from the Koninklijke Bibliotheek, The Hague, which we were permitted to use for illustrating this Thematic Issue. It includes the Iconclass classification for this image: 71A3421 Eve emerges from Adam's body (for the hierarchical path of this classification, see the section on ontologies).
Short explanations for the XML document (image5kb78d38i.xml) shown in the graphic below:
A well-formed XML document is one that conforms to the XML syntax rules, of which we would like to highlight the following:
(1) The document should begin with the XML declaration, which defines the XML version and the character encoding used in the document. In the example below we use <?xml version="1.0" encoding="ISO-8859-1"?>, i.e. the document conforms to the 1.0 specification of XML and uses the ISO-8859-1 (Latin-1/West European) character set.
(2a, 10) The XML document must contain a single tag pair to define a root element, in our example <mi:image> </mi:image>.
Database rows grouped by item:
'image5kb78d38i', 'Column Miniature', 'Eve emerges from Adam's Body', '71A3421', 'Alexander Master', 'Historic Bible', 'Utrecht', 'circa 1430'
Rows-to-XML process
XML document with item data from the database rows (image5kb78d38i.xml):
(1) <?xml version="1.0" encoding="ISO-8859-1"?>
(2) <mi:image xmlns:mi="http://www.m-i.org/images"
    image_id="image5kb78d38i">
(3) <mi:type>Column Miniature</mi:type>
(4) <mi:subject>Eve emerges from Adam's body</mi:subject>
(5) <mi:iconclass>71A3421</mi:iconclass>
(6) <mi:creator>Alexander Master</mi:creator>
(7) <mi:manuscript>Historic Bible</mi:manuscript>
(8) <mi:place>Utrecht</mi:place>
(9) <mi:year>circa 1430</mi:year>
(10) </mi:image>
Graphic 1: Database rows to XML process
(2b) Namespace: Since element names in XML are not fixed, name conflicts can occur when different documents use the same names to describe different types of elements. To prevent such conflicts, a unique namespace should be defined using a Uniform Resource Identifier (URI). An XML namespace is a collection of names that are used as element types and attribute names (cf. http://www.w3.org/TR/REC-xml-names). Our namespace declaration in the start tag of the root element is xmlns:mi="http://www.m-i.org/images".
The namespace prefix mi (for medieval images) functions as a placeholder for the namespace name. It needs to show up in all element tags (e.g. <mi:type> </mi:type>).
(2c) image_id="image5kb78d38i": This is the id attribute, which contains the unique identifier of the data source record.
(3-9) All other elements must be within the root element and can themselves have sub-elements (child elements), which must be properly nested within their parent element. Our elements do not have sub-elements.
Other syntax rules are, for example: all start tags must match their end tags and, because XML tags are case sensitive (i.e. the tag <mi:Creator> is different from the tag <mi:creator>), they must also be written with the same case; all elements must have a closing tag; all attribute values must be within quotation marks (e.g. "image5kb78d38i").
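The syntax rules above can be tried out with any XML parser, since a conforming parser must reject documents that are not well-formed. The following sketch uses the parser in Python's standard library; the three sample documents and the helper name is_well_formed are illustrative assumptions, not part of the FMS system.

```python
import xml.etree.ElementTree as ET

WELL_FORMED = b"""<?xml version="1.0" encoding="ISO-8859-1"?>
<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">
  <mi:creator>Alexander Master</mi:creator>
</mi:image>"""

# Start tag and end tag differ in case, so they do not match.
CASE_MISMATCH = b"""<?xml version="1.0" encoding="ISO-8859-1"?>
<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">
  <mi:Creator>Alexander Master</mi:creator>
</mi:image>"""

# The attribute value is not enclosed in quotation marks.
UNQUOTED_ATTRIBUTE = b"""<?xml version="1.0" encoding="ISO-8859-1"?>
<mi:image xmlns:mi="http://www.m-i.org/images" image_id=image5kb78d38i>
  <mi:creator>Alexander Master</mi:creator>
</mi:image>"""

def is_well_formed(document: bytes) -> bool:
    """Return True if the parser accepts the document, False on a syntax error."""
    try:
        ET.fromstring(document)
    except ET.ParseError:
        return False
    return True
```

Well-formedness only concerns syntax; whether a document uses the right elements in the right order is a matter for the XML Schema.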
Syntactic Transformation 2: The XML Schema
In order to allow for syntactic harmonisation, the XML documents of the museums should conform to the XML Schema of the FMS initiative. Therefore, the museums use the initiative's XML Schema when they create their XML documents, validating them against the Schema. If the documents are valid, the process can continue to the semantic level.
XML Schema
The XML Schema defines the building blocks of an XML document, including:
| elements and attributes that can appear in a document
| which elements are child elements, as well as their order and number
| whether an element is empty or can include text
| the data types for elements and attributes
| as well as default and fixed values for elements and attributes
XML with an XML Schema is designed to be self-descriptive. One of the greatest strengths of XML Schema is that it allows for data typing. The most common data types are xs:string, xs:decimal, xs:integer, xs:boolean, xs:date, xs:time. In the example below, which is the XML Schema for the XML document (image5kb78d38i.xml) shown in graphic 1, we only use the data type xs:string. This data type is used for values that contain character strings.
(1) <?xml version="1.0"?>
(2a) <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
(2b) targetNamespace="http://www.m-i.org/images"
(2c) xmlns="http://www.m-i.org/images"
(2d) elementFormDefault="qualified">
(3) <xs:element name="image">
(4) <xs:complexType>
(5) <xs:sequence>
(6) <xs:element name="type" type="xs:string"/>
(7) <xs:element name="subject" type="xs:string"/>
(8) <xs:element name="iconclass" type="xs:string"/>
(9) <xs:element name="creator" type="xs:string"/>
(10) <xs:element name="manuscript" type="xs:string"/>
(11) <xs:element name="place" type="xs:string"/>
(12) <xs:element name="year" type="xs:string"/>
(13) </xs:sequence>
(14) <xs:attribute name="image_id" type="xs:string" use="required"/>
(15) </xs:complexType>
(16) </xs:element>
(17) </xs:schema>
Short explanations
(1) The XML declaration, which states that the document conforms to the 1.0 specification of XML.
(2a) Determines that the elements and data types that are used to construct the schema come from the W3C's XML Schema namespace. Consequently, each of the elements and data types in the schema has the prefix xs, which identifies them as belonging to the vocabulary of the XML Schema language rather than the vocabulary of our (fictitious) organisation M-i.org.
(2b) Indicates that the elements defined by this schema come from our http://www.m-i.org/images namespace.
(2c) Is our default namespace.
(2d) Demands that any elements used by an XML document which were declared in this schema must be namespace qualified.
(3, 16) The parent element for the data typing of the image descriptions we provide at http://www.m-i.org/images. It is (4) defined as an xs:complexType, i.e. it contains child elements (6-12), which are (5, 13) surrounded by an xs:sequence element that defines an ordered sequence of these elements.
(6-12) The child elements, which in our example are simple types because they do not contain other elements. They define various elements of our XML documents, e.g. "type", to be of the data type xs:string.
(14) Furthermore, the Schema determines that for the element xs:element name="image" there is a required attribute image_id of the data type xs:string (for example "image5kb78d38i").
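To make concrete what validation against this schema amounts to, the sketch below re-states its three structural rules by hand: a namespace-qualified root element image, a required image_id attribute, and the seven child elements exactly once in the given order. It is a hand-written stand-in using only Python's standard library, not a real XML Schema validator, and the function name check_against_schema_rules is an assumption made for this illustration.

```python
import xml.etree.ElementTree as ET

MI_NS = "http://www.m-i.org/images"  # fictitious target namespace of the schema
EXPECTED_CHILDREN = ["type", "subject", "iconclass", "creator",
                     "manuscript", "place", "year"]

VALID_DOCUMENT = b"""<?xml version="1.0" encoding="ISO-8859-1"?>
<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">
  <mi:type>Column Miniature</mi:type>
  <mi:subject>Eve emerges from Adam's body</mi:subject>
  <mi:iconclass>71A3421</mi:iconclass>
  <mi:creator>Alexander Master</mi:creator>
  <mi:manuscript>Historic Bible</mi:manuscript>
  <mi:place>Utrecht</mi:place>
  <mi:year>circa 1430</mi:year>
</mi:image>"""

def check_against_schema_rules(document: bytes) -> list:
    """Return a list of rule violations; an empty list means all checks passed."""
    problems = []
    root = ET.fromstring(document)
    if root.tag != f"{{{MI_NS}}}image":
        problems.append("root element must be the namespace-qualified 'image'")
    if root.get("image_id") is None:
        problems.append("required attribute 'image_id' is missing")
    # xs:sequence with default occurrence values: each child once, in order.
    expected_tags = [f"{{{MI_NS}}}{name}" for name in EXPECTED_CHILDREN]
    if [child.tag for child in root] != expected_tags:
        problems.append("child elements must appear exactly once, in schema order")
    return problems
```

In practice one would hand the schema and the document to a schema-aware validator rather than re-implement the rules; the sketch only shows which properties of the document are being checked.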
Major Benefits of XML
In the context of the Semantic Web, XML provides an interoperable syntactical foundation upon which solutions to the issues of representing relationships and meaning can be built. We also want to highlight the many benefits of XML that are adding to its rapid uptake in the first place, and might in the longer term be supportive in realising the Semantic Web vision on a broader scale.
XML is one of the most important standards developments in recent years. It is an international, universal, non-system and non-application specific data exchange standard. XML is international because it employs Unicode. This means that there is no restriction to the western alphabet, but Arabic, Chinese, Greek, Hebrew, Thai etc. can be easily integrated.
XML is non-system specific because it is an open standard set by the World Wide Web Consortium (W3C). As such, there is no owner of XML. All the major software suppliers support it; it can be used on any computing platform, from Windows and MacOS to Linux. This makes it easier for organisations to change systems or combine different systems.
XML is also non-application specific, i.e. it can be used in various applications such as data exchange, data harvesting, Web site management etc. XML is gaining ever-wider acceptance in many application domains, including in particular the cultural heritage community.
Bear in mind also that the major collection management software producers have implemented support for XML in their systems, enabling, for example, the integration of data from different collections and their combination over the Web.
XML allows for multi-channel publishing, i.e. with XML it is easy to produce different products or services from digital cultural heritage assets. Once the data are structured in XML, they can be displayed across a variety of media using an associated style sheet that contains the display information.
Finally, XML can be used to create new languages. For example, the Wireless Markup Language (WML), which is used to mark up Internet applications for handheld devices, is written in XML.
Ontological concepts
The goal of the FMS project is to make metadata of the museums' textiles collections semantically interoperable on the Web. In order to achieve such interoperability, an ontology is being designed that describes the common (lower-level) ontological concepts in this domain of knowledge.4
Ontology
In the Semantic Web architecture, the semantic relationships are not embedded but explicitly represented by an ontology, or rather an interrelated set of ontologies. In fact, the wide array of information residing on the Web and the perceived need to make it more machine-processable have acted as a strong impetus for the development of ontology languages.
Yet what is an ontology? An abstract definition of an ontology is that it describes a formal, shared conceptualisation of a particular domain of interest, for example cultural heritage objects held in art museums. In particular, an ontology allows for constraining, expressing and analysing the intended meaning of the shared vocabulary of concepts and relations in a domain of knowledge.5
If these concepts and relations are formalised to a high degree, the domain has at hand a major building block for developing semantically aware information systems. With Semantic Web technologies, the domain ontology can be made available on the network, cross-referenced with upper-level and other domain ontologies,6 and remote applications or intelligent software agents can refer to it when they interact to provide a certain information service.
4 The available documents (in English) on the FMS initiative state that their ontology is being created using RDF Schema (RDFS). To develop a fully fledged ontology, advanced languages such as DAML+OIL or Web Ontology Language (OWL) would be required.
5 For more elaborate and formal descriptions see Tom Gruber: What is an Ontology? (1995), http://www-ksl.stanford.edu/kst/what-is-an-ontology.html; Nicola Guarino: Ontology-Driven Conceptual Modelling, part 1-3 (2002), http://ontology.ip.rm.cnr.it/Tutorials
The degree of formalisation of concepts and their relations varies considerably between different domains of knowledge. At the lower end one finds lexicons and simple taxonomies (i.e. an ordered classification system where terms are related hierarchically). At the middle level one might place thesauri, i.e. controlled vocabularies that are structured to show relationships between terms and concepts and, for example, allow for retrieving them from a database. At the high end of formalisation of knowledge there are axiomatised logic theories. Such theories include rules to ensure the well-formedness and logical validity of statements expressed in the language of the scientific discipline.
In the cultural heritage sector, a powerful IT-supported example of a hierarchical classification system is Iconclass. This supports the documentation of images, in particular art historical images, by providing a systematic collection of 28,000 ready-made definitions of objects, persons, events, situations and abstract ideas that can be the subject of an image. The definitions consist of an alphanumeric classification code and its textual correlate.7
For example, for the image we have described in XML in graphic 1, the Iconclass definition is 71A3421 Eve emerges from Adam's body. The Medieval Illustrated Manuscripts Website of the Koninklijke Bibliotheek, The Hague, has an Iconclass Browser in place8 that provides the hierarchical path for this concept in the classification system.
An outstanding example of a controlled vocabulary is the Art & Architecture Thesaurus (AAT), one of the Getty Research Institute's Vocabulary Databases. It is a structured vocabulary of more than 125,000 terms and other information about concepts that are used for describing fine art, architecture, decorative arts, archival materials and material culture.9
The W3C's Resource Description Framework (RDF) Schema Specification 1.0, in a section on its scope, mentions concept navigation and states: 'Thesauri and library classification schemes are well known examples of hierarchical systems for representing subject taxonomies in terms of the relationships between named concepts. The RDF Schema specification provides sufficient resources for creating RDF models that represent the logical structure of thesauri (and other library classification systems).'10
Yet for realising a full-blown cultural heritage ontology for the Semantic Web, there are currently limitations on both sides. On the one hand, hierarchical classification systems and structured vocabularies do not lend themselves easily to rich inter-linking of conceptual 'trees'.
A major step further in this direction is the CIDOC object-oriented Conceptual Reference Model (CRM).11 This provides an ontology of 81 classes and 130 properties, which describes in a formal language concepts and relations relevant to the documentation of cultural heritage.
On the other hand, RDF Schema has limitations when it comes to expressing complex ontological relationships. New languages based on description logics are being developed. These include DAML+OIL and the upcoming Web Ontology Language (OWL), which are capable of fully describing ontologies.
Also worth highlighting is that tools for ontology building are proliferating at the present time. These ontology editors need to be carefully assessed, as their capabilities differ considerably.12
7 Bible
71 Old Testament
71A Genesis from the creation to the expulsion from paradise, and later years of Adam and Eve
71A3 creation of man, the Garden of Eden (Genesis 1:26-2)
71A34 creation of Eve
71A342 Eve is fashioned from Adam's rib
71A3421 Eve emerges from Adam's body
6 Upper-level ontologies describe the basic concepts and relationships invoked when information about any domain is expressed in natural language.
7 For in-depth information see the official Iconclass Website, http://www.iconclass.nl
8 Medieval Illustrated Manuscripts Website, http://www.kb.nl/kb/manuscripts/browser. The subject access system for the Website was conceived by Mnemosyne Partners, building on the Iconclass classification system and technologies. See the valuable information they provide at http://www.mnemosyne.org/businessmsstempex.html
9 A detailed description of the AAT is provided at http://www.getty.edu/research/tools/vocabulary/aat/about.html
10 http://www.w3.org/TR/2000/CR-rdf-schema-20000327
11 See the information box at the end of the Forum discussion and the sources mentioned in the Semantic Web Terms and Reading List.
12 Michael Denny: Ontology Building: A Survey of Editing Tools (06-11-2002), http://www.xml.com/pub/a/2002/11/06/ontologies.html
32 DigiCULT
Semantic Transformation 1: The RDF Data Model
Progressing towards semantic interoperability, the metadata in the XML documents are now transformed into RDF statements corresponding to the RDF data model. With these so-called RDF 'triples', the XML metadata elements are mapped to the RDF classes and properties, which are defined by the RDF Schema of the FMS initiative (see section 3).
'XML is nothing more than a way to standardize data formats. This is not to underplay XML's importance. A data-format standard makes all of the more glamorous technologies possible, and RDF is the leading example of the benefit that comes once the data format has been standardized. Many proclaim that RDF is really the XML's killer app, and with good reason. Despite all this, RDF remains somewhat obscure. This is mainly because, at its core, RDF is very abstract, very dry and very academic.'
Uche Ogbuji: An introduction to RDF (2000), http://www-106.ibm.com/developerworks/library/w-rdf/?dwzone=xml
RDF Data Model
In order to make Web resources semantically interoperable, we need resources that provide machine-understandable information about themselves. In the Semantic Web architecture, these statements are built by using the Resource Description Framework (RDF). RDF defines a data model for the statements describing typed relationships between uniquely identified resources. RDF distinguishes between:
| resources: familiar examples are a Web page, electronic document or digital image, but in RDF also entities that are not 'network retrievable', e.g. museums, curators or bound medieval manuscripts, can be resources
| properties: these identify a specific aspect, characteristic, attribute or relation used to describe the resource
| statements: these associate a value for a named property with the resource
Hence, RDF provides a model for describing relationships between resources in terms of named properties and values. The RDF data model intrinsically supports only binary relations. Its base element is the 'triple', which takes the form of subject, predicate, object: a resource (the subject) is linked to another resource (the object) through an arc labelled with a third resource (the predicate). The semantics of a triple clearly depends on the property used as predicate.
A convenient way to visualise this is to draw nodes for subject and object and an arrow between them for the predicate (see graphic 2). In this labelled directed graph, subject and predicate (property) are Uniform Resource Identifiers (URIs), and the object is either a URI or a literal (which is drawn as a box). Everything in RDF can be represented by a graph with nodes and arcs, and the data model allows for using the same URI as a node and as an arc label. To represent RDF statements in a machine-processable way, RDF builds on XML. With RDF/XML, a specific XML markup language, RDF information can be represented and exchanged between machines.
Subject: http://www.m-i.org/schemas/images#ColumnMiniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Miniature
Subject: http://www.m-i.org/schemas/images#Miniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Image
With these two triples we state that ColumnMiniature is a subclass of Miniature and that Miniature is a subclass of Image.
The predicate in our statements is the rdfs:subClassOf property, which is predefined in the RDF Schema namespace http://www.w3.org/2000/01/rdf-schema#
The classes Image, Miniature and ColumnMiniature would also need to be defined in our RDF Schema namespace http://www.m-i.org/schemas/images
For details on how to use RDF Schema for defining your domain ontology, see section 3. In a nutshell, RDF Schema is a 'higher-level' language which is itself defined using RDF.
Graphic 2: RDF Data Model
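The triple model is simple enough to hold in a plain data structure. The sketch below stores the two subclass statements of graphic 2 as (subject, predicate, object) tuples and follows the arcs leaving a node. It is a minimal illustration, not an RDF library: the Triple class and the objects helper are assumptions, and so is the "#" separator between the fictitious m-i.org namespace and the class names.

```python
from typing import NamedTuple

RDFS = "http://www.w3.org/2000/01/rdf-schema#"
MI = "http://www.m-i.org/schemas/images#"  # fictitious; the "#" is an assumption

class Triple(NamedTuple):
    subject: str    # a URI
    predicate: str  # a URI
    obj: str        # a URI or a literal

# A graph is just a set of triples.
GRAPH = {
    Triple(MI + "ColumnMiniature", RDFS + "subClassOf", MI + "Miniature"),
    Triple(MI + "Miniature", RDFS + "subClassOf", MI + "Image"),
}

def objects(graph, subject, predicate):
    """Follow all arcs labelled `predicate` that leave the node `subject`."""
    return {t.obj for t in graph
            if t.subject == subject and t.predicate == predicate}
```

For example, objects(GRAPH, MI + "ColumnMiniature", RDFS + "subClassOf") yields the set containing only the Miniature URI, because the graph holds exactly one such arc.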
Semantic Transformation 2: Creating and Validating the RDF Statements
In the mapping process, an editor tool (in the FMS case a tool developed by the project team, called Meedio) receives as input the XML documents and assists in transforming them into semantically valid RDF statements (instance descriptions). The tool serves as instance editor, which provides a convenient way of finding and selecting from the XML metadata elements correct instance values for a particular property. The editor tool also serves as semantic metadata validator. When the museum cataloguer saves the set of RDF statements corresponding to an XML document, the semantics of these statements are validated against the property constraints of the FMS ontology. The result of a successful mapping and validation process is a unique set of RDF triples, called the RDF card.
RDF Mapping Rules
When RDF is used to define the meaning of XML metadata elements, a set of mapping rules is created. A mapping rule is a template of RDF triples where XPath expressions are used to identify the actual element values. XPath is a language for addressing parts of an XML document and was designed for use in XML parsing software (XSLT, XPointer and others).
When applying such a rule to an XML document, the XPath expressions are instantiated with matching element values. If the rule matches, the RDF template evaluates to a set of RDF triples where XPath expressions are substituted by the corresponding values of the XML elements. For example, by applying the template rule <image5kb78d38i/type mi:hasCreator image5kb78d38i/creator> to the XML document described in the section on XML, the following result would be obtained: <Image Miniature mi:hasCreator 'Alexander Master'>
mi:hasCreator is an example of an RDF property. Such properties are explained in section 3, RDF Schema.
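How a mapping rule is instantiated can be sketched as follows. The rule format used here (a three-part tuple in which entries starting with "./" are path expressions) and the function apply_rule are hypothetical and do not reproduce the syntax of the FMS editor tool; the paths use the limited XPath subset supported by Python's standard library, and the sketch substitutes the literal element values it finds.

```python
import xml.etree.ElementTree as ET

NAMESPACES = {"mi": "http://www.m-i.org/images"}  # fictitious namespace

DOCUMENT = b"""<?xml version="1.0" encoding="ISO-8859-1"?>
<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">
  <mi:type>Column Miniature</mi:type>
  <mi:creator>Alexander Master</mi:creator>
</mi:image>"""

# Hypothetical rule: a triple template whose "./" entries are path expressions.
RULE = ("./mi:type", "mi:hasCreator", "./mi:creator")

def apply_rule(rule, document: bytes):
    """Instantiate the template; return None if a path matches nothing."""
    root = ET.fromstring(document)
    triple = []
    for part in rule:
        if part.startswith("./"):
            value = root.findtext(part, namespaces=NAMESPACES)
            if value is None:
                return None  # the rule does not match this document
            triple.append(value)
        else:
            triple.append(part)  # fixed part of the template, e.g. the property
    return tuple(triple)
```

A rule whose paths find no element simply produces no triple, which mirrors the statement above that the template is only evaluated if the rule matches.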
Note: Due to the limited space permitted, we do not address issues of term mapping. This is an important aspect of the mapping and validation process carried out in the FMS project. Working with metadata from different museums, they need to deal with partly different terminologies. Their technical solution to synonymous terms (i.e. different terms referring to the same concepts) is to attach synonym sets to the FMS ontology classes. With situations where polysemous terms occur (i.e. the same terms refer to different concepts), the editor tool cannot cope and the cataloguer needs to select the correct interpretation.
Semantic Transformation 3: The RDF Schema (RDFS)
The shared ontology for the textiles domain is created by using Resource Description Framework Schema (RDFS). An RDF Schema is a tool for indicating the classes of resources one wants to describe, as well as for defining the properties used to describe those resources. Furthermore, class/sub-class relationships and property/sub-property relationships can be defined. The museums are mapping their metadata to the classes and properties defined by the RDF Schema of the FMS initiative. Thereby they are making the meaning of the metadata explicit and representing them in a harmonised, uniform way.
RDF Schema
In section Semantic Transformation 1 we have described the data model provided by RDF for expressing statements about Web resources. But we also need a vocabulary for the RDF statements, namely classes and properties defined with RDF Schema (RDFS).
In brief, the RDF Schema mechanism provides a pre-defined vocabulary, a basic type system that can be used in creating domain-specific schemas. Its role is to allow for declaring metadata properties (e.g. for 'type', 'subject' or 'creator'), to define the classes of resources they may be used with, to restrict possible combinations, and to detect violations of those restrictions.
Defining classes
With RDF Schema (RDFS), Web resources can be defined as instances of one or more classes. In addition, classes can be organised in a hierarchical fashion. As we hold a collection of digital images drawn from illustrated medieval manuscripts, we first need to define a class of things that are images. In RDF Schema, a class is any resource having an rdf:type property whose value is the RDFS-defined resource rdfs:Class.
So, using the basic RDF data model, we define: mi:Image [resource] rdf:type [property] rdfs:Class [value]. The self-defined prefix mi (for medieval images) stands for the URI reference of our RDF Schema namespace http://www.m-i.org/schemas/images
In our image collection we have various special kinds of digitised images, such as column miniatures, decorated initials, schematic drawings etc. To distinguish, for example, the miniatures, first we need to define a general class Miniature and subclasses of miniatures, e.g. a subclass ColumnMiniature:
mi:Miniature rdf:type rdfs:Class
mi:ColumnMiniature rdf:type rdfs:Class
Secondly, we need to define that mi:ColumnMiniature is a subclass of mi:Miniature and that mi:Miniature is a subclass of mi:Image, for which we use the predefined rdfs:subClassOf property:
mi:Miniature rdfs:subClassOf mi:Image
mi:ColumnMiniature rdfs:subClassOf mi:Miniature
As the rdfs:subClassOf property is transitive, this means that mi:ColumnMiniature is also implicitly a subclass of mi:Image.
Graphic 2 on page 32 visualises this with the nodes and arcs of the basic RDF data model.
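The effect of transitivity can be made explicit with a small piece of code that collects all direct and inferred superclasses of a class. This is a sketch of the inference only; the dictionary layout and the function name superclasses are assumptions made for the illustration.

```python
# Direct rdfs:subClassOf statements: class -> set of direct superclasses.
SUBCLASS_OF = {
    "mi:ColumnMiniature": {"mi:Miniature"},
    "mi:Miniature": {"mi:Image"},
}

def superclasses(cls, subclass_of=SUBCLASS_OF):
    """Return every direct and inferred superclass of `cls`."""
    found = set()
    pending = list(subclass_of.get(cls, ()))
    while pending:
        parent = pending.pop()
        if parent not in found:  # also guards against cycles
            found.add(parent)
            pending.extend(subclass_of.get(parent, ()))
    return found
```

Although only two subclass statements are stored, superclasses("mi:ColumnMiniature") contains both mi:Miniature and mi:Image, which is exactly the implicit statement described above.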
Defining properties
In order to make the meaning of our metadata (i.e. 'type') explicit, we need to be capable of declaring specific properties that characterise the classes of things we hold at http://www.m-i.org, e.g. digital images of medieval column miniatures.
Basically, RDF Schema defines properties in terms of the classes of resources to which they apply. This is the role of the rdfs:domain and rdfs:range mechanisms.
rdfs:range
The range constraint defines the class or set of classes whose instances can be values of a particular property. If we want to define the property mi:hasType, we must describe this resource (which we locate at http://www.m-i.org/schemas/images) with an rdf:type property whose value is rdf:Property:
mi:hasType [resource] rdf:type [property] rdf:Property [value]
The following RDF statements indicate that mi:ColumnMiniature is a class, mi:hasType is a property, and RDF statements using the mi:hasType property have instances of mi:ColumnMiniature as values:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:range mi:ColumnMiniature
rdfs:domain
The domain constraint restricts the set of classes whose instances may have a particular property attached to them. If we want to indicate that the property mi:hasType applies to instances of class mi:ColumnMiniature, we would write:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:domain mi:ColumnMiniature
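Treating domain and range as constraints, as this primer does, a validator can check each statement against the declared classes. The sketch below is a simplified illustration under stated assumptions: the class mi:Artist, the resource names and the function violations are hypothetical, each resource has exactly one class, and subclass relationships are ignored for brevity.

```python
# Hypothetical schema: property -> (rdfs:domain class, rdfs:range class).
PROPERTY_CONSTRAINTS = {
    "mi:hasCreator": ("mi:ColumnMiniature", "mi:Artist"),
}

# Hypothetical instance typing: resource -> class (its rdf:type).
RESOURCE_TYPES = {
    "mi:image5kb78d38i": "mi:ColumnMiniature",
    "mi:AlexanderMaster": "mi:Artist",
    "mi:Utrecht": "mi:Place",
}

def violations(statement):
    """List the domain and range violations of one (subject, property, object)."""
    subject, prop, obj = statement
    if prop not in PROPERTY_CONSTRAINTS:
        return [f"{prop} is not a declared property"]
    domain, range_ = PROPERTY_CONSTRAINTS[prop]
    problems = []
    if RESOURCE_TYPES.get(subject) != domain:
        problems.append(f"{subject} is not an instance of {domain}")
    if RESOURCE_TYPES.get(obj) != range_:
        problems.append(f"{obj} is not an instance of {range_}")
    return problems
```

A statement that names a place as the creator of an image is reported as a range violation, while the statement linking the image to its creator passes with an empty list.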
Benefits of RDF
In a SearchWebServices.com definition of RDF, some benefits of RDF are mentioned:
| 'By providing a consistent framework, RDF will encourage the providing of metadata about Internet resources.
| Because RDF will include a standard syntax for describing and querying data, software that exploits metadata will be easier and faster to produce.
| The standard syntax and query capability will allow applications to exchange information more easily.
| Searchers will get more precise results from searching, based on metadata rather than on indexes derived from full text gathering.
| Intelligent software agents will have more precise data to work with.'13
This is a well-crafted listing of RDF benefits, from provision and exchange of better metadata to agents working with them, hopefully for the benefit of humans. But, as explicitly stated by SearchWebServices.com, these are only potential benefits, i.e. they depend on the level of actual uptake of RDF.
13 whatis.com / searchWebServices.com Definitions - Resource Description Framework, http://searchwebservices.techtarget.com/sDefinition/0,,sid26_gci213545,00.html (last updated July 27, 2001)
Generating and Using the Knowledge Space
The RDF cards represent the original XML documents at the semantic level. The union of such RDF cards constitutes a knowledge base, which is a harmonised semantic representation of the underlying heterogeneous databases.
However, so far the RDF instance descriptions have not left the museum. The museum has complete control of the information it wants to publish, and it does not need to allow the FMS system access to its internal database system. The RDF data are placed in a public directory on the museum's WWW server.
The Web crawler of the FMS system harvests the instance descriptions from the different museums, and the system combines them into an RDF repository. This repository is a large semantic graph that consists of the shared ontology and metadata.
How does a user now search and navigate in this knowledge space? In the FMS system, this is implemented by a server-side software called Ontogator. Based on the semantic graph, this software dynamically generates semantic linkages for the user's Web browser.
One way of using the FMS system is view-based filtering. The user can select classes of resources from the ontology, and the system finds the instances that match the selected class restrictions. By constraining classes (views) further, the collection instance data searched for are eventually found.
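Combining RDF cards into one repository and filtering it by a selected class can be sketched as set operations over triples. This is not how Ontogator is implemented; the textile classes (fms:Clothing, fms:Dress, fms:Coat, fms:Rug), the item identifiers and both function names are hypothetical and serve only to show the principle.

```python
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"

# Hypothetical RDF cards harvested from two museums, plus the shared ontology.
CARD_ESPOO = {("espoo:item1", RDF_TYPE, "fms:Dress")}
CARD_NATIONAL = {("nm:item7", RDF_TYPE, "fms:Coat"),
                 ("nm:item9", RDF_TYPE, "fms:Rug")}
ONTOLOGY = {("fms:Dress", SUBCLASS, "fms:Clothing"),
            ("fms:Coat", SUBCLASS, "fms:Clothing")}

# The repository is the union of the cards and the ontology.
REPOSITORY = CARD_ESPOO | CARD_NATIONAL | ONTOLOGY

def subclasses(graph, cls):
    """Return `cls` together with all of its direct and indirect subclasses."""
    result = {cls}
    changed = True
    while changed:
        changed = False
        for s, p, o in graph:
            if p == SUBCLASS and o in result and s not in result:
                result.add(s)
                changed = True
    return result

def instances_in_view(graph, cls):
    """Find all instances whose class falls under the selected class."""
    wanted = subclasses(graph, cls)
    return {s for s, p, o in graph if p == RDF_TYPE and o in wanted}
```

Selecting the broad class fms:Clothing returns the dress and the coat from both museums; narrowing the view to fms:Coat leaves only the coat, which corresponds to constraining the view further.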
The software also supports topic-based navigation by providing semantic links between topics of interest, the creation of which is based on the collection domain ontology and the related metadata of the collection records. This means that the links also provide the user with an impression of the wider context and pragmatics of the objects in the museums' collections.
From human users to software agents
As described in the Finnish Museums on the Semantic Web example, the RDF repository is a large semantic graph that consists of the shared ontology and metadata of the participating museums. Such a repository can be queried, and the results, a set of pointers to the relevant resources, can be accessed using Web browsers. The opportunities provided by a system like the one developed by the FMS initiative (e.g. topic-based navigation) are at present restricted primarily to human users.
The Semantic Web vision includes intelligent software agents which ‘understand’ semantic relationships between Web resources and seek relevant information, as well as perform transactions for humans.14
This software would be capable of autonomous action, i.e. could run without direct human control or constant supervision, and ideally is very flexible in doing this. Characterisations of this flexibility include actions that are ‘reactive’, ‘proactive’ and ‘social’ (see below).
While the basic idea of agents is very intuitive and appealing, the actual theory is complex, the tools are immature, and the solutions small and prototype-based. In fact, as a parallel distributed systems technology, agents belong to the most complex class of software technology.
However, this primer will conclude with a summary of what an intelligent software agent is and what such software would generally be capable of doing. This should also serve as an indication of how great the challenge for research and technological development is to make the full Semantic Web vision a reality.
Intelligent Software Agents
The following definitions are taken from Michael Wooldridge’s introduction to multiagent systems.15
Agent
‘An agent is a computer system capable of autonomous action in some environment.’
Intelligent agent
‘An intelligent agent is a computer system capable of flexible autonomous action in some environment.’
Flexible autonomous action
‘By flexible autonomous action we mean reactive, proactive, social.’
- Reactivity: ‘A reactive system is one that maintains an ongoing interaction with its environment, and responds to changes that occur in it (in time for the response to be useful).’
- Proactiveness: ‘An agent serves a purpose and therefore exhibits goal-directed behaviour, including the capacity to recognise opportunities for useful courses of action.’
- Social ability: ‘Social ability in agents is the ability to interact with other agents (and possibly humans) via some kind of agent communication language, and perhaps cooperate with others.’
Desirable further properties of agents are:
- Mobility: the ability to move around an electronic network.
- Rationality: an agent will act in such a way that it does not prevent itself from achieving its goals (as far as this is possible with a limited set of beliefs representing its world knowledge).
- Learning: an agent will improve its performance over time.
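The three facets of flexible autonomous action can be made concrete with a deliberately tiny sketch. It is a toy model, not an agent framework: the class name, its methods and the messages are all hypothetical, and real agent systems are far more complex, as the text stresses.

```python
# Toy sketch only: a hypothetical class illustrating the three facets.
from dataclasses import dataclass, field


@dataclass
class ToyAgent:
    name: str
    goal: str
    inbox: list = field(default_factory=list)
    actions: list = field(default_factory=list)

    def step(self, changes):
        """Run one decision cycle without human intervention."""
        if changes:
            # Reactivity: respond to changes observed in the environment.
            for change in changes:
                self.actions.append(("react", change))
        else:
            # Proactiveness: nothing to react to, so work towards the goal.
            self.actions.append(("pursue", self.goal))

    def send(self, other, content):
        """Social ability: pass a message to another agent."""
        other.inbox.append((self.name, content))


finder = ToyAgent("finder", "locate carpet records")
broker = ToyAgent("broker", "answer catalogue queries")
finder.step(["new museum joined"])
finder.step([])
finder.send(broker, "any carpets from around 1900?")
```

Mobility, rationality and learning are left out on purpose; each would need considerably more machinery than a sketch of this size can carry.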
14 T. Berners-Lee, J. Hendler, O. Lassila: Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html
15 M. Wooldridge: An Introduction to Multiagent Systems. Chichester: Wiley, 2002, and http://www.csc.liv.ac.uk/~mjw/pubs/imas/
Resources
In the primer, no references are made to the documents of the World Wide Web Consortium (W3C). All relevant W3C recommendations can be found at http://www.w3.org
Of the wealth of introductory materials on XML and RDF available on the Web, the following in particular are useful to consult for further details:
http://www.w3schools.com/xml/
http://www.w3schools.com/schema/
http://www.w3.org/TR/rdf-primer/
K. S. Candan, H. Liu, R. Suvarna: Resource Description Framework: Metadata and its Applications. In: SIGKDD Explorations, Vol. 3, 1 (2001), 6-19
Pierre-Antoine Champin: RDF Tutorial (2001), http://www710.univ-lyon1.fr/~champin/rdf-tutorial/rdf-tutorial.html
S. Decker, M. P. Mitra, S. Melnik: Framework for the Semantic Web: An RDF Tutorial, http://www.ida.liu.se/~asmpa/courses/sweb/rdf/rdf_tutorial.pdf
Graphic 3: Set-up of the Finnish Museums on the Semantic Web System (overview of the system’s set-up). Components shown: collection databases 1 to n (relational schemas, DBMS); XML repositories (XML Schema: syntactic interoperability); metadata editor; RDF instances (RDF Schema: semantic interoperability); Web crawler; RDF database (semantic graph: knowledge space of shared ontology and metadata); server (Ontogator software); user client (WWW browser: topic-based navigation, view-based filtering).
THE DARMSTADT FORUM PARTICIPANTS

Wernher Behrendt, Salzburg Research, Austria
Wernher Behrendt is a Senior Researcher at the Sun Technology and Research Excellence Center at Salzburg Research (Austria), working on multimedia middleware and interoperation issues. He holds an MSc in Cognitive Science from Manchester University and has more than 10 years’ experience in near-to-market IT research. From 1989 to 1995 he was a Senior Research Associate in the Informatics Department at Rutherford Appleton Laboratory (UK), working on embedded knowledge-based systems in distributed multimedia presentation systems. From 1995 to 1998 he was a Senior Research Associate at Cardiff University (UK), working on interoperation between heterogeneous information systems. Mr Behrendt has held courses in Computer Science and has worked on projects ranging from software engineering methods and quality assurance to legacy system reengineering using migration methods and distributed systems middleware.
E-mail: wernher.behrendt@salzburgresearch.at

Paolo Buonora, Archivio di Stato di Roma, Italy
Paolo Buonora holds a degree in Philosophy from the University of Rome ‘La Sapienza’ (1976). He worked in the Italian State Archive Administration from 1978, where he was first involved in editing the Guida Generale degli Archivi di Stato italiani. From 1986 he worked in the Soprintendenza archivistica per il Lazio, surveying audiovisual archives and municipal archives; from 1989 to 1991 he was at Perugia University, engaged in doctoral research in ‘Urban and rural history’; and from 1991 to 1994 he was again in the Soprintendenza archivistica per il Lazio. After 1994 he worked in the Archivio di Stato di Roma, where he was responsible for the photograph service and several working groups on informatics application in archival documentation. From 1997 until the present time he has planned and directed the Imago II project in the Archivio di Stato di Roma.
See httpwwwasrmarchivbeniculturaliitsidimagoIMAGOIIenhtml

Samuel Cruz-Lara, University of Nancy 2 and LORIA, France
Samuel Cruz-Lara obtained a Master’s degree in Computer Science in 1984 (University Henri Poincaré, Nancy 1) and a PhD degree in Computer Science in 1988 (National Polytechnic Institute of Lorraine). The central topic of his PhD thesis was the generation of integrated development environments by using attribute grammars. He is currently Associate Professor at the University of Nancy 2 (Institute of Technology, Computer Science Department) and permanent Researcher at LORIA (Lorraine Laboratory for Research in Computer Science and its Applications - UMR 7503 - CNRS - INRIA - Universities of Nancy). He is a member of the ‘Language and Dialogue’ team and has conducted several research activities on distributed software architectures and textual linguistic resources management. He is currently working in the context of distributed architectures and multimedia resources management. Dr Cruz-Lara has participated in several projects, in particular CNRS/SILFIDE and MLIS-ELAN, and he is at present co-leader of the ‘Digital Museum’ project. This is a joint project between LORIA and the National Chi-Nan University. The ‘Digital Museum’ project is sponsored by the National Science Council of the Republic of China (Taiwan) and supported by INRIA (France).

Costis Dallas, Critical Publics SA, Greece
Costis Dallas is Chairman and Senior Researcher of Critical Publics (http://www.criticalpublics.com), a group of companies active in the field of strategic communications, creative design and technology. He is currently a Lecturer in the Department of Communication and Mass Media of Panteion University and has over 15 years of research and professional experience in hypermedia applications, human factor issues and cultural information systems. Dr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA, co-founder and Chair of the Multimedia Working Group of the International Council of Museums (CIDOC/ICOM), Head of Documentation and Systems of the Benaki Museum, General Director of the Foundation of the Hellenic World, Special Secretary of the Greek Ministry of Education in charge of libraries, archives and instructional technologies, and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: info@criticalpublics.com

Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (http://www.nl.adlibsoft.com), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that he joined Databasix, producer of the ADLIB software. In 1991 Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was later renamed ADLIB Information Systems.
ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include:
- CIMI Dublin Core test bed: ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database, in addition to the ‘standard’ HTML/XML access.
- SPECTRUM-XML: a project of the UK mda, together with a team of software vendors, to produce a schema for exchange of data which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software.
- Internet Gelderse Musea (http://www.igem.nl) and Maritiem Digitaal (http://www.maritiemdigitaal.nl): two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.
E-mail: bert@nl.adlibsoft.com

Nicola Guarino, Italian National Research Council, Italy
Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991 and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS) and associate editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: Nicola.Guarino@ladseb.pd.cnr.it

Janneke van Kersen, Digital Heritage Association, The Netherlands
Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992 she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing, and more recently she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University.
Since October 1999 Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata, and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations, such as museums, archives, archaeological organisations, monuments and special collections of libraries, in a context-rich and XML-based environment. Interoperability, from both a technical and an organisational perspective, is therefore a major issue. The Association offers access to heritage information to the general public at http://www.cultuurwijzer.nl and customised access for the educational field at http://www.cultuurwijs.nl
E-mail: jannekevankersendennl

Marco Meli, EDW International, Italy
Marco Meli is CEO and co-founder of EDW International (http://www.edw-international.com), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia and VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meli@edw.it

Paul Miller, UKOLN, UK
Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, http://www.ukoln.ac.uk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK’s Further and Higher Education Funding Councils and Resource: the Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI) and the Metadata Working Group of the UK Government’s Office of the e-Envoy.
E-mail: P.Miller@ukoln.ac.uk

Frank Nack, CWI, The Netherlands
Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on ‘The Application of Video Semantics and Theme Representation for Automated Film Editing’ at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document, and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: Frank.Nack@cwi.nl

Franco Niccolucci, Florence University, Italy
Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference.
See also http://www.geog.port.ac.uk/hist-bound/people/niccolucci.htm

Seamus Ross, HATII, University of Glasgow, UK
Dr Seamus Ross is Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Access Network) (IST-2001-32706), a European Union funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously, he was Assistant Secretary for Information Technology at the British Academy, and before that worked for a company specialising in expert systems and software development, as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group’s PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: S.Ross@hatii.arts.gla.ac.uk

Andrea Scotti, Institute & Museum for the History of Science, Italy
Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995 he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.
Since 1996 Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza and the National Central Library, and under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (01/02/01-30/07/03), funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies.
See http://cweb.inria.fr/Projects/Mesmuses
E-mail: scottian@galileo.imss.firenze.it

DIGICULT PROJECT INFORMATION

DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over the period of 30 months (03/2002-08/2004).
In order to encourage early take-up, DigiCULT produces seven Thematic Issues and three Technology Watch Reports, along with the newsletter DigiCULT.Info.
DigiCULT draws on the results of the strategic study ‘Technological Landscapes for Tomorrow’s Cultural Economy (DigiCULT)’ that was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications), in 2000 and completed in 2001.
Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at http://www.digicult.info
For further information on DigiCULT, please contact the team of the project co-ordinator:
Mr Guntram Geser, guntram.geser@salzburgresearch.at, Phone: +43-(0)662-2288-303
Mr John Pereira, john.pereira@salzburgresearch.at, Phone: +43-(0)662-2288-247
Salzburg Research Forschungsgesellschaft, Jakob-Haringer-Str. 5/III, A - 5020 Salzburg, Austria
Phone: +43-(0)662-2288-200, Fax: +43-(0)662-2288-222
http://www.salzburgresearch.at
Project Partner:
HATII - Humanities Advanced Technology and Information Institute, University of Glasgow
http://www.hatii.arts.gla.ac.uk
Contact: Mr Seamus Ross, s.ross@hatii.arts.gla.ac.uk
The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom
DigiCULT Thematic Issue 1 - Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum, held in Barcelona on May 6th, 2002, in the context of the DLM-Conference 2002.
DigiCULT Thematic Issue 2 - Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum, held in Essen, Germany, on September 3rd, 2002, in the context of the AIIM Conference @ DMS EXPO.
DigiCULT Thematic Issue 3 - Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum, held on January 21st, 2003, at Fraunhofer IPSI, Darmstadt, Germany.
DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum, on Learning Objects, that will take place at the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, on July 2nd, 2003.
IMPRINT
This Thematic Issue is a product of the DigiCULT Project (IST-2001-34898).
Authors:
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy
Images:
Images for this Thematic Issue have been provided by and are reproduced with kind permission of the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, Netherlands.
Graphics & Layout:
Jan Steindl, Salzburg Research
ISBN 3-902448-00-8
Printed in Austria. © 2003
IMAGES
Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I. Image on p. 5, from Fol. 264r, size 80x75; illuminator: Orosius Master a.o.
Bible Historiale, Paris, c. 1320-1340, Volume I. Image on p. 7, from Fol. 2r, size 45x50; illum.: ‘Sub-Fauvel’ Master
Historic Bible, Utrecht, c. 1430, Volume I. Images on pp. 8, 10, 14, 28, from Fol. 3r, 3v, 4v, 7v, size 55x85 to 60x85; illum.: Alexander Master o.a.
Psalter, Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468. Images on pp. 11, 13: Wednesday Matins, Invitatorium, from Fol. 219r
Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350. Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron), from Fol. 104rb1, 106rb2, 124r, 111ra, size 40x55 to 50x55
Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335. Image on p. 21, from Fol. 4va1, size 45x55
Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460. Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41
Psalter, Normandy, c. 1180. Image on p. 26, from Fol. 3v, size 160x125
Breviary, Cambray(?), c. 1275-1300. Images on pp. 27, 29, 31, 33, 34, 35, 37, from Fol. 211r, size 205x135
© Koninklijke Bibliotheek, The Hague. Used with permission.
collections anew in a way that fits the ontology. That is just too much work.’
Kersen thinks the Semantic Web will grow gradually, from the grass roots level onward. Within the Dutch Digital Heritage Association she can point at several initiatives. Considered on their own they might not seem like much, i.e. in relation to the gargantuan task of developing a Semantic Web, but when they are combined a certain pattern begins to emerge.
There is, for instance, a project, SitecheckerRDF, which will look into ways in which RDF can be used for describing content for Web-based delivery. Furthermore, several standardisation projects are running that enable the participating institutions to develop the description of their specific knowledge domain, e.g. graphic domain, religion, art history. The formal and semantic mapping schemes used in these projects will include Dublin Core, Encoded Archival Description (EAD), IMS Learning Resources Metadata Specification, Art & Architecture Thesaurus, as well as the CIDOC reference model. Kersen: ‘At the moment most of the reference terms are developed at the level of institutions, which means their use is limited to a certain domain. Or, to put it another way, every domain is developing its own dialect. Maybe the development of a combined reference scheme will be a step towards a Semantic Web.’
Another important project is the development of a scheme for description at the collection level, in order to offer a clearer and more hierarchical access to heritage collections. This project has its roots in the Dutch project for collection level description, MUSIP (Museum Inventarisation Project). The description scheme will be broadened to make it available for other heritage institutions as well, for example archives.
The description and results of these projects and the programme lines will be made available on the Web site of the Vereniging DEN, http://www.den.nl. As Cultuurwijzer is used as a proof of concept, the results will be directly accessible at http://www.cultuurwijzer.nl. (Kersen kindly invites interested parties to put questions directly to their organisation.)
Projects are always carried out in co-operation with the member organisations. Kersen: ‘We first try it ourselves until we are sure it works, a “proof of concept” you could say. These tests are overseen by a small group of automation experts working for our member organisations. The next step is to test the method on a larger scale with some of our member organisations. A larger working group oversees these tests. Only then is the method released to our member organisations. The advantage is that smaller member organisations can ride on the experience of the larger ones. By taking a step-by-step approach we also enhance the level of commitment. You could say we are providing some order in the information chaos that exists on the Internet. A few small steps along the long road towards the Semantic Web.’
Vereniging Digitaal Erfgoed Nederland: http://www.den.nl
Cultuurwijzer: http://www.cultuurwijzer.nl

DIGICULT’S EXPERT 13 TANGLE WITH THE SEMANTIC WEB
By Michael Steemson

[Image: Genesis - The Creation: Birds and Fishes]

It was really very kind of the Moderator. He asked the cultural heritage experts: ‘What is the message that we should give our scientific writer, who will do a write-up of this meeting for the Thematic Issue?’ Their inclinations had not always been clear. But the question focused minds, and they were certain now.
‘I would put my money on the Semantic Web,’ said two of them, not quite in unison. ‘The Semantic Web is a direction; it is like North. You go North, but you never arrive and say: “here it is”. This is the Semantic Web,’ said another.
The course of the debate at the Darmstadt DigiCULT Forum had not always been so direct. It had started with the Position Paper’s dismaying thought that ‘the limited understanding of information processing in the heritage sector almost makes the Semantic Web an impossibility to apply’.
It had touched on the semantics of Simeon poetry, art works of the biblical seductress Salome, weather forecasts for the northern English city of York and the revolutionary theories of 16th-century Italian astronomer Galileo. But the experts agreed, finally, that the cultural heritage sector needed the Semantic Web, and that a good deal of education and guidance would be required to make it appreciate that need.
The experts numbered 13, lucky for some, at this, the third DigiCULT Forum of the European Union’s technology watchdog for cultural and scientific heritage institutions. In the previous 12 months, other forum groups had discussed authenticity and integrity for digitisation programmes and, later, digital asset management systems. Now, the Darmstadt 13 - historians, language and information technology scientists, academics and publishers - were looking even further down the information autobahn, to the vision of WWW inventor, Englishman Tim Berners-Lee, who sees a new kind of automated Web that learns and understands each user’s particular requirements and delivers complete, reliable, tested information sets.
In a co-authored May 2001 Scientific American article,1 Mr Berners-Lee imagines a family facing the horrors of re-scheduling its lives around a mother’s unexpected illness. The sons and daughters rely on Semantic Web ‘agents’, small executable Web files, to search online medical records, hospital bed lists, transport timetables, doctors’ appointment books, road condition reports and home diaries to find treatment, plan travel and re-arrange personal engagements to fit the emergency.
The vision requires huge, world-wide investment in time and effort, creating countless ‘ontologies’ containing, perhaps, XML (eXtensible Mark-up Language) and RDF (Resource Description Framework) data, to which the electronic ‘agents’ could refer for understanding before applying to specially formatted Web pages for the information.
The Berners-Lee et al. dazzling forecast is: ‘The Semantic Web will enable machines to comprehend semantic documents and data, not human speech and writings. Properly designed, the Semantic Web can assist the evolution of human knowledge as a whole.’
THE DAZZLING PROSPECTS
Dazzling it is, and the Darmstadt 13 were attracted. But they were not blinded. Moderator Dr Seamus Ross, the Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII), suggested that while the current WWW content got a lot of ‘bang’ for its development dollars, the Semantic Web needed huge, expensive content before it could work well.
Application of the Berners-Lee ideas to cultural heritage use was a long way off, he thought, and wondered: ‘Is there enough benefit from the Semantic Web in the near term to make it a realisable dream 50 years down the road?’
Italian National Research Council Applied Ontologies Laboratory director Nicola Guarino has been working on the subject for 12 years, and he knows the difficulties. He said: ‘This is the ideal view which Tim Berners-Lee has: machines which work for you, your proxy which works for you, performing these dynamic connections for the Web which preserve meaning. It is pretty ambitious, but this is his idea. I would be happier if, rather than using an automatic proxy, we could just let people establish these dynamic connections using their brain and the Web. This is already something that is not easily done.’
Austria’s Wernher Behrendt had encountered other snags. At Salzburg Research, the secretariat for the DigiCULT Forums, he co-ordinates another European Commission IST project, CULTOS (Cultural Units of Learning - Tools and Services). He conceded: ‘There is a 50-year research vision behind the issue of the Semantic Web,’ and went on: ‘but there are incremental steps that, with good utility, can be built in a reasonable time. One of the intellectual challenges is to break the vision into these manageable steps.’
LANGUAGE REPRESENTATION HITCHES
A CULTOS group had, Behrendt explained, taken one of these incremental steps and built an ontology2 for digitised works of art. It had encountered problems with language representation, like: ‘Are there tools to support knowledge representation language? Are the users then actually able to work usefully with that? Can we incorporate the multimedia authoring component, where people who have not built the ontology themselves should then use it to combine multimedia assets with each other?’
Cultural Units of Learning – Tools and Services (CULTOS)
CULTOS is an RTD project co-funded by the European Commission under the Information Society Technologies (IST) Programme, which will run until October 2003. The application domain of CULTOS is intertextual studies in literature and arts. The project is developing a multimedia authoring and presentation environment that allows scholars to make the different relationships between cultural works explicit in a way that approximates contextualisation in interpretative processes. The result of the authoring processes is multimedia objects called ‘intertextual cultural threads’. These are based on ‘EMMOs’, a novel type of structured multimedia object containing expert knowledge, conforming to current and emerging standards such as XML, SMIL (with interactive extensions), MPEG-7 and RDF.
httpwwwcultosorg
1 T. Berners-Lee, J. Hendler, O. Lassila: The Semantic Web. In: Scientific American, May 2001, httpwwwsciamcom20010501issue0501berners-leehtml
2 Ontology n. 1. Philosophy: The branch of metaphysics that deals with the nature of being. 2. Logic: The set of entities presupposed by a theory. Collins English Dictionary, Third edition, Glasgow, 1991.
The Dutch had doubts, too. Dr Janneke van Kersen is an art historian with Digital Heritage Netherlands (Digitaal Erfgoed Nederland, httpwwwdennl), where an XML-based content management system is being combined with a Resource Description Framework (RDF) to join databases from several cultural heritage institutions. She told the experts: ‘I need to be assured that we will be able to build a layered structure that is equally applicable to each knowledge domain. Furthermore, I think that the cultural heritage sector is too much of a niche market to develop this.’
Her countryman, Dr Frank Nack, from the national research institute for mathematics and computer science, CWI (Centrum voor Wiskunde en Informatica, httpwwwcwinl) in Amsterdam, works with a multimedia and human-computer interaction group. His concern: ‘Our group believes in the Semantic Web, but we needed some mechanisms to structure the information so that various groups can work with it. What we came up with was the belief that you can classify the user at a particular time. But that was simply not good enough.’
The group had found that users change their requirements widely and that these shifts were invisible to a system. ‘Humans can look at material one day, and the next day they look at the same stuff differently and describe it differently because they are in a different mood,’ he said. He characterised the problem as: ‘Now I would like to see something for my work, and now I want to be entertained. Which means I would like to access the information differently.’
He had one other worry: Webised mixed media. He said: ‘This discussion has been heavily linguistic based, which I can understand because most people do still think of the Web as text driven. The issue of describing various media items that are not text will, I think, very soon become important for the Semantic Web. We had better start thinking about that, too.’
THE GALILEO CONUNDRUM
The Institute and Museum of Science History in Florence, Italy, has tried to create an ontology around the works and sciences of its city’s famous son, the revolutionary astronomer, mathematician and physicist Galileo Galilei (1564-1642)3. But it ran into difficulties when it came to the radical changes in theory that he created.
Institute relational database expert Andrea Scotti told the Forum: ‘Galileo’s scientific theory negates another scientific theory. This negating, or development, of theories was very difficult to represent in the ontology when dealing with the time factor. That is central to historical documentation, but representational time was not part of the process available to us.’
Dr Costis Dallas, the Athens chairman of the European communication and technology group Critical Publics, thought the Florence museum’s project was very ambitious. The time argument was difficult because, ‘of course, it isn’t possible to represent time properly within a relational database’. But there were mechanisms - he mentioned software by the Virginia, US, IT group Telos4 - that better represented issues of time.
Italian National Research Council’s Nicola Guarino chipped in: ‘The CIDOC5 reference models have partial answers to these questions.’
Dr Dallas went on: ‘I do not believe that the whole exercise is futile, but we found that, in practice, you cannot make a subject language for everybody. It has to be for a community of users. If you provide them with a richer representation, for instance if they can know that this is a person and this person lived in a place, then users will have a much richer experience.’
3 The museum and Web site are rich resources for the life and work of Galileo: httpgalileoimssfirenzeit
4 Telos Corporation, httpwwwteloscom, Ashburn, VA, US.
5 CIDOC: International Committee for Documentation of the International Council of Museums, httpwwwwillpowerinfomybycoukcidocCIDOCe (ICOM-CIDOC). Forum for documentation interests of museums and related organisations; one of 25 international committees of the International Council of Museums (ICOM).
CAPITALS AND ACRONYMS
The Forum moved on to a discussion of the science and software behind the Semantic Web. Moderator Seamus Ross started by questioning whether ‘key thinkers’ were ‘missing a fundamental point: that Web pages are dead, that database-driven Web pages are the future, that people are going to stop making Web pages and make databases’.
He recalled that James A. Hendler, co-author with Tim Berners-Lee of the Semantic Web paper and a professor in the Department of Computer Science at the University of Maryland, had recommended semantic representation as part of any Web pages. Dr Ross commented: ‘This notion of the Semantic Web is not going to work with these databases. Is this true or not?’
Bert Degenhart-Drenth, the managing director of Netherlands ADLIB Information Systems (httpwwwnladlibsoftcom), thought it was true. But it was a practical problem that would be solved by projects like the Web Services of the Open Archives Initiative (OAI).
Open Archives Initiative (OAI) Protocol for Metadata Harvesting
The Open Archives Initiative (OAI) is a mainly US-based group of people and organisations that evolved out of a need to increase access to scholarly publications through interoperable digital repositories. Support for the OAI’s goals comes from the Digital Library Federation, the Coalition for Networked Information, and from an NSF Grant. One of its major achievements is an application-independent interoperability framework based on metadata harvesting: the OAI Protocol for Metadata Harvesting.
The OAI Protocol is based on the Web standards HTTP and XML, and employs Dublin Core (unqualified) as metadata standard. Heritage organisations who have systems that support the OAI Protocol can expose metadata about the content in their repository, i.e. allow service providers to harvest the data for services such as search engines. In the OAI Protocol, the XML schema is used at two levels: to define the format of responses to all OAI Protocol requests, and to define the format of metadata streams embedded in the GetRecord and ListRecords responses. In both cases, the goal is to provide a mechanism for data validation.
httpwwwopenarchivesorg
The OAI-PMH version 2.0, released in June 2002, can be found at httpwwwopenarchivesorgOAI20openarchivesprotocolhtm
See also: John Perkins: A New Way of Making Cultural Information Resources Visible on the Web: Museums and the Open Archives Initiative. Museums and the Web Conference 2001. httpwwwarchimusecommw2001papersperkins
Open Archives Forum Project
httpwwwoaforumorg
The Open Archives Forum Project is a two-year Fifth Framework Programme IST accompanying measure that will run until September 2003. The Forum is building a Web-based database on OAI-related projects, software implementations and services, and supports the information exchange between OAI user communities.
Their surveys provide good insight into the status of uptake of the OAI in Europe. For an overview of the results, see S. Dobratz, B. Matthaei: Open Archives Activities and Experiences in Europe: An Overview by the Open Archives Forum. In: D-Lib Magazine, Vol. 9, No. 1, January 2003, httpwwwdliborgdlibjanuary03dobratz01dobratzhtml
Recently, one of their workshops, ‘Providing Access to Hidden Resources’ (Lisbon, December 2002), targeted the libraries and archives communities. Requirements, standards, best practice and solutions to interoperability problems of these communities were analysed and compared with the features provided by the OAI Protocol for Metadata Harvesting. The Tutorial ‘OAI and OAI-PMH for Beginners’ and other presentations can be found at httpwwwoaforumorgworkshopslisb_programmephp
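To make the harvesting idea behind the OAI Protocol for Metadata Harvesting a little more concrete, the following is a minimal sketch of how a service provider might compose request URLs. The verbs (ListRecords, GetRecord), the arguments (metadataPrefix, identifier) and the oai_dc prefix for unqualified Dublin Core come from OAI-PMH 2.0; the repository address, the record identifier and the helper name oai_request_url are assumptions made up for illustration, and no request is actually sent.

```python
from urllib.parse import urlencode

# Hypothetical repository endpoint, used only for illustration.
BASE_URL = "http://repository.example.org/oai"


def oai_request_url(base_url, verb, **arguments):
    """Build an OAI-PMH request URL from a verb and its arguments."""
    params = {"verb": verb}
    params.update(arguments)
    return base_url + "?" + urlencode(params)


# Ask for all records, expressed as unqualified Dublin Core.
list_url = oai_request_url(BASE_URL, "ListRecords", metadataPrefix="oai_dc")

# Ask for a single record; the identifier below is invented.
get_url = oai_request_url(
    BASE_URL,
    "GetRecord",
    identifier="oai:repository.example.org:item-1",
    metadataPrefix="oai_dc",
)

print(list_url)
print(get_url)
```

A harvester would fetch such URLs over HTTP and validate the XML responses against the protocol’s schema; that part is left out here.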
Several members spoke of difficulties created by dynamic Web pages generated from ASP databases. Paul Miller, the UK Interoperability Focus for the University of Bath’s UKOLN (formerly the UK Office for Library Networking) project, said the UK’s Office of the e-Envoy6, the agency in charge of Britain’s e-Government programme, drove it with a Web site called Govtalk7. ‘One big ASP database,’ he called it. It carried the Government’s interoperability standard framework with a URL of ‘a great big long string, something, something, dot ASP. That is how this key piece of legislation is referred to, and next week it will be something else.’
Dr van Kersen thought the new Web Ontology Language, with the transposed acronym OWL, would help. Italian ontology researcher Nicola Guarino discussed the European Commission’s On-To-Knowledge project, its RDF tool Sesame8 and the Ontology Inference Layer (OIL) language.
Dr Ross described the US Defense Department’s DARPA (Defense Advanced Research Projects Agency) Agent Markup Language (DAML) programme and its DAML+OIL variant. Mr Degenhart-Drenth highlighted the importance of protocols used in Web Services, such as SOAP (Simple Object Access Protocol) and UDDI (Universal Description, Discovery and Integration), and also pointed to a SPECTRUM XML standard for museums that he helped write. Then there was the moviemakers’ audiovisual search standard, MPEG-7, and the SMIL (Synchronized Multimedia Integration Language). No one mentioned SHOE (Simple HTML Ontology Extensions), which was surprising as the discussion became more and more alphabetic and upper case.
ONTOLOGY TUTORIAL IN 800 WORDS
Nicola Guarino brought the discussion back on track. The ontology expert delivered a fascinating, impromptu 800-word dissertation on ontology genetics.
Ontologies, he said, started because it was realised that controlled vocabularies, which worked well enough for limited periods, needed something extra to make them really useful. They needed clarification of intended meaning. This could be achieved in much the same way as dictionaries did it, by reference to other, more basic terms. This was, he said, the key point.
‘Ontologies can work if the basic terms are really used in a principled way. There is a hidden assumption here that it is indeed possible to express the meaning in terms of a relatively small set of primitive terms.’
He explained further: ‘There are general terms that have a universal meaning. Take the term “part”, for instance, or “set”. Or take temporal relations: “before”. Suppose I have two different periods, the Renaissance and another period, and suppose I say that this period comes “before” the other one: do you exclude the case whereby the two periods overlap, or not? This is just a matter of stipulation; this is a general term that is not domain specific. I can simply stipulate exactly whether the “before” relationship between the two intervals includes the case of overlapping or not. And I can do that by means of axioms. Once you clarify the meaning of the basic terms - topological relations, mereological9 relations, dependence relations, these kinds of things - then you have the basic vocabulary that helps you to introduce more domain-related things. And this is what people are doing in the area of what are called “Foundational Ontologies”, and this is what I am doing. I believe this is the only way to solve the problem of semantic interoperability. So, not just controlled vocabularies, but vocabularies that are formally defined in minimal terms.’
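Guarino’s point about stipulating the meaning of ‘before’ can be sketched in a few lines of code. The example below is only an illustration under stated assumptions: the Period type, the two function names and the year ranges are invented here and are not taken from any ontology discussed at the Forum. It shows two possible axioms for ‘before’, one that excludes overlapping periods and one that admits them, and how the same pair of periods is judged differently under each.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Period:
    name: str
    start: int  # first year of the period
    end: int    # last year of the period


def before_strict(a, b):
    """Stipulation 1: a is 'before' b only if a has ended when b starts."""
    return a.end < b.start


def before_allowing_overlap(a, b):
    """Stipulation 2: a is 'before' b if it starts and ends earlier,
    even when the two periods share some years."""
    return a.start < b.start and a.end < b.end


# Illustrative year ranges only; they are assumptions, not historical claims.
renaissance = Period("Renaissance", 1400, 1600)
baroque = Period("Baroque", 1580, 1750)

print(before_strict(renaissance, baroque))            # False: the periods overlap
print(before_allowing_overlap(renaissance, baroque))  # True under the looser axiom
```

Two systems that exchange the statement ‘Renaissance before Baroque’ will only agree on what it means if they have fixed the same stipulation, which is the interoperability problem the quotation describes.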
Now it was clear, but would it be available to heritage institutions? Dr Ross wanted to know if fundamental ontologies of use to the heritage sector already existed. What would they be? Before any development could begin, basic classifications and methodologies would be required to form a foundation for the work.
On-To-Knowledge
On-To-Knowledge is an IST RTD project that was completed in June 2002. The project developed tools and methods for supporting knowledge management in large and distributed organisations. The technical backbone of On-To-Knowledge was the use of ontologies for the various tasks of information integration and mediation. For the project’s many results, see their tools repository, project deliverables and publications at httpwwwontoknowledgeorgpubshtml
See also the On-To-Knowledge book: ‘Towards the Semantic Web: Ontology-driven Knowledge Management’, J. Davies, D. Fensel, F. van Harmelen (eds), John Wiley, December 2002.
6 Office of the e-Envoy: httpwwwe-envoygovuk
7 Govtalk: httpwwwgovtalkgovuk
8 Sesame environment: httpwwwontoknowledgeorgtoolsfactsheetSesamehtml
9 Mereology n. The formal study of the logical properties of the relation of part and whole. Collins English Dictionary, Third edition, Glasgow, 1991.
Italian online publisher Marco Meli, a manager of the EU’s MESMUSES (Metaphors for Science Museums) project, insisted: ‘You need a clear definition in this particular domain. What are the key words, the terms?’
Mr Guarino answered: ‘The key concepts and the key relations.’
THE ELUSIVE SEMANTIC GRAIL
By now, the Darmstadt 13 were beginning to realise they were not having much luck with their search for the Semantic Grail. Their discussion became a little tetchy.
Someone talked about ‘meta-ontology’ and another growled: ‘The term “meta” is abused.’
‘Outdated technical optimism’ was mentioned.
‘What is your alternative?’ someone else wanted to know.
‘I don’t have one.’
‘Is this the way forward?’
‘What part of the problem would that solve?’
‘We will know when we have tried that out.’
‘Here we are approaching a scary field.’
MESMUSES - Metaphors for Science Museums
MESMUSES is an IST RTD project that will run until July 2003. It aims at designing a general method and supporting tools to produce knowledge maps for use in self-learning environments of science museums. In the project, a knowledge map is defined as a set of related concepts and facts that is offered to learners with some guidance or suggestions on possible itineraries that they may follow to explore the knowledge space.
The method and tools developed in MESMUSES are being tested and validated by two large science museums, the Cité des Sciences et de l’Industrie in Paris and the Istituto e Museo di Storia della Scienza in Florence, which provide access to their digital catalogues. Both museums are developing knowledge maps and itineraries on different themes in Biology (Genome) and Physics (Galileo and the laws of motion).
Project Web site: httpcwebinriafrProjectsMesmuses
See also: M. Meli: Knowledge Management: a new challenge for science museums. In: Cultivate Interactive, Issue 9, 7 February 2003, httpwwwcultivate-intorgissue9mesmuses
Civility and peace were restored as Frank Nack, the CWI Netherlands scientist, introduced the thought: ‘There are ontologies for art, and they are very old and well crafted. There are very clear rules about why they did what they did, because they have worked on them for a thousand years. What you could suggest is that we strip down to the basics for one field, say art, and apply it to all the other fields we have in cultural heritage: architecture, film, whatever, all working with very different substances.’
Seamus Ross added: ‘So we need one fundamental ontology on which we can build all the others.’
THE LUCK CHANGES
The luck of the 13 was beginning to change. The Athenian heritage informatics expert, Dr Dallas, described work among his company clients on developing ‘an upper ontology’. All the issues the Forum had been discussing, what to do about time, basic concepts, process, agents and so on, were being examined. They were beginning to develop ‘something very much like a thesaurus’, with term expansion that created sub-categories of relationships. He called it ‘generic layering’, a process that could identify the ‘generic grammar’ of relationships within a specific domain - art history, for example.
‘This is useful,’ he said. ‘This way we can create Web systems that present an association of content for users that is meaningful to them. Let’s say, a “cultural” meaning.’
Nicola Guarino went further. He believed that the International Council of Museums’ Committee for Documentation (CIDOC) Conceptual Reference Model (CRM) was the ‘best starting point’ for the heritage community. The CIDOC CRM is the result of 10 years’ work by a standards working group. The model is under review for adoption as an International Organization for Standardization (ISO) publication.
Mr Guarino was enthusiastic: ‘I am not biased on this. I am a reviewer on the CHIOS project that supports this proposal for an ISO standard, and I am amazed by the fact that this standard is more or less principled. On the one hand, the authors really strive to get these principled things and, on the other hand, they have extensively accounted for existing practice. It is the result of a large community of work. It is not perfect, but it really is a starting point for this community.’
He affirmed Dr Ross’s delighted question: ‘So this is an ontology we can borrow?’
And the Italian expert had more to add: ‘The question before was, “how can we be sure that the principled things can really solve our problems?” I do not have a crisp answer, but I do have some evidence that even a tiny result on the foundational side has a high pay-off. So you do not need to solve all the foundational issues.’
The distinctions between an object and its role, or individual and classes of items, were delicate but, once understood, could lead to significant data improvement, he said, adding: ‘Take, for instance, the distinction between object and event. This is only one tiny distinction, but it is so fundamental that, once you understand it, you can save time in developing your own application ontology. Tiny conceptual progress does have a high pay-off. This is why I believe it is useful.’
There were still one or two doubts, but CULTOS project co-ordinator Wernher Behrendt tidied up with a daring stance: ‘Let me be a heretic for a second. How many of us have an operating system other than Windows? What I am saying is that standardisation often helps. Even if it is not the best standard, it does help get people working together. It is perfectly fair to define a standard for the world now. There will be a lot of discussion, but it will wind down to a few constructs. It is a better method of getting an ontology accepted than having ontologies mushrooming all over the place that must then be integrated.’
The Darmstadt 13 were pleased. They had a model. They had ‘Web Services’ stepping stones to work across. They weren’t going to fall into the trap of insisting just yet that the Semantic Web was important for the heritage sector, but they wanted an education process for unconvinced curators.
They needed someone to make the first ontology move. Seamus Ross suggested asking the J. Paul Getty Trust (httpwwwgettyedu). They needed automated tools for testing ontologies. They did not want to be delivered into the entertainment industry, but saw benefit in what University of Florence Associate Professor Franco Niccolucci called ‘cultural entertainment’.
So, where would the experts put their money, asked Mr Behrendt: ‘On there not being any benefits in the Semantic Web for the cultural heritage sector, or there being some benefits in building such things, whatever they may be?’
Amsterdamer Frank Nack was in no doubt: ‘It is going to happen. It will probably look very different from how we imagine it right now, but it is going to happen.’ His countryman, ADLIB chief Bert Degenhart-Drenth, thought so, too: ‘We have put our money there. All our applications work with XML.’ And Dr van Kersen agreed: ‘I would put my money on the Semantic Web.’
That seemed to make it game, set and match.
CIDOC CRM: ‘The Semantic Glue’
The ‘CIDOC object-oriented Conceptual Reference Model’ (CIDOC CRM) was developed by the ICOM/CIDOC Documentation Standards Group. Since September 2000, the CIDOC CRM has been under development into an ISO standard.
‘The CIDOC CRM is intended to promote a shared understanding of cultural heritage information by providing a common and extensible semantic framework to which any cultural heritage information can be mapped. It is intended to be a common language for domain experts and implementers to formulate requirements for information systems and to serve as a guide for good practice in conceptual modelling. In this way, it can provide the semantic glue needed to mediate between different sources of cultural heritage information, such as that published by museums, libraries and archives.’
httpcidocicsforthgrwhat_is_crmhtml
CHIOS - Cultural Heritage Interchange Ontology Standardization project
Since June 2001, the work of the CIDOC CRM Special Interest Group has been supported by CHIOS, a two-year project which receives funding from the Fifth Framework IST Programme. The CHIOS consortium forms an integral part of the CIDOC CRM Special Interest Group which, by organising shared meetings, represents the interests and requirements of the cultural heritage community to the ISO Working Group (ISO/TC46/SC4/WG9).
httpcidocicsforthgrchios_isohtml
Genesis – The Creation Stars and Fishes
SEMANTIC WEB TERMS AND READING LIST A-X
Compiled by Guntram Geser and Michael Steemson
In the Summary, we have provided information boxes on projects, but not on the many Semantic Web standards, technologies, etc. mentioned in the Forum discussion. The projects included are only a small fraction of many ongoing activities and are related mainly to the cultural and scientific heritage community. Links to many important Semantic Web development projects can be found at their community portal, httpwwwsemanticweborg
The following guide points to resources and readings on terms mentioned in the Forum Summary. It is not intended to provide a comprehensive list of Semantic Web materials. Rather, it represents different entry points and levels to this topic.
Annotation & Authoring
‘How to annotate and have fun’, httpannotationsemanticweborgfaqfaq2, and other papers. See also httpannotation semanticweborgtools, Institute for Applied Informatics and Formal Description Methods, University of Karlsruhe, Germany.
The Annotea Project, a ‘Live Early Adoption and Demonstration (LEAD)’ project of the World Wide Web Consortium (W3C): collaboration environment with shared annotations. wwww3org2001Annotea
CIDOC CRM
The CIDOC Conceptual Reference Model was an important reference point in the Forum discussion (see also the information box on the CIDOC CRM in the Summary). CIDOC, the Comité International pour la Documentation, is part of the International Council of Museums (ICOM). Its Web site can be found at httpcidocicsforthgr
A report on CIDOC’s work on the CRM is provided in ‘The CIDOC Conceptual Reference Model: A Standard for Communicating Cultural Contents’ by Nick Crofts, Martin Doerr and Tony Gill, in Cultivate Interactive, Issue 9, February 2003, httpwwwcultivate-intorgissue9chios
See also their tutorials at httpcidocicsforthgrtutorialshtml
DAML - DARPA Agent Markup Language
The Defense Advanced Research Projects Agency (DARPA) is the central research and development organisation for the US Department of Defense. Its DAML programme is developing a language and tools to facilitate Semantic Web concepts. httpwwwdamlorg
‘Why Use DAML?’, Adam Pease, white paper, Teknowledge, 10 April 2002, httpwwwdamlorg200204whyhtml
DAML+OIL
DAML+OIL is a product of the DARPA Joint United States/European Union ad hoc Agent Markup Language Committee. The committee created a language with the best features of SHOE, DAML, OIL and several other markup approaches. It is a Web ontology language (latest release March 2001) expected to provide a basis for future Web standards for ontologies. See httpwwwdamlorg200103daml+oil-indexhtml
OIL - Ontology Inference Layer
OIL is a proposal for a Web-based representation and inference layer for ontologies. It uses a layered approach to defining a standard ontology language. Each layer adds functionality and complexity to the previous layer. This is done such that machines that can only process a lower layer can still partially understand high-level ontologies. See httpwwwontoknowledgeorgoil
A white paper on OIL functions: ‘An informal description of Standard OIL and Instance OIL’, by the On-To-Knowledge group led by Department of Computer Science, University of Manchester, UK, 28 November 2000, wwwontoknowledgeorgoildownloil-whitepaperpdf
Ontologies
Ontology research: Laboratory for Applied Ontology (LOA), Institute of Cognitive Sciences and Technology (ISTC), httpwwwladsebpdcnritinforontologyontologyhtml
Dieter Fensel: Ontologies: A Silver Bullet for Knowledge Management and Electronic Commerce. New York: Springer, 2001.
‘Ontology Infrastructure for the Semantic Web’ includes detailed subject bibliography. WonderWeb Project, Department of Computer Science, Victoria University of Manchester, UK, httpwonderwebsemanticweborgdeliverablesdocumentsD15pdf
Standard Upper Ontology (SUO): An upper ontology for data interoperability, information search and retrieval, automated inferencing and natural language processing. IEEE Standard Upper Ontology (SUO) Working Group, httpsuoieeeorg
OWL - Web Ontology Language
The Web Ontology Language is a semantic markup language for publishing and sharing ontologies on the World Wide Web. OWL is developed as a vocabulary extension of the Resource Description Framework (RDF) and is derived from the DAML+OIL Web Ontology Language. For the development of this language, see the documents of the Web Ontology (WebOnt) Working Group, httpwwww3org2001swWebOnt
See also OWL Web Ontology Language Reference (W3C Working Draft, 31 March 2003) at httpwwww3orgTRowl-ref and OWL Web Ontology Language Guide (W3C Working Draft, 31 March 2003), httpwwww3orgTRowl-guide
For an understanding of the goals, requirements and usage scenarios for a Web ontology language, see ‘Web Ontology Language (OWL) Use Cases and Requirements’ (W3C Working Draft, 31 March 2003), httpwwww3orgTRwebont-req
RDF - Resource Description Framework
See Cultural Heritage Semantic Web Example & Primer, pp. 32-34.
Semantic Web
‘The Semantic Web’, Tim Berners-Lee with James Hendler and Ora Lassila, Scientific American, May 2001, httpwwwsciamcom20010501issue0501berners-leehtml
‘Enhanced Science and the Semantic Web’, J.A. Hendler, in Science magazine, Volume 299, Number 5606, 24 January 2003, pp. 520-521.
‘Peer-to-Peer: The Infrastructure for the Semantic Web’, Stanford University. The Semantic Web as the next evolutionary step of the Internet. httpp2psemanticweborg
The Semantic Web Community Portal, SemanticWeb.org, currently operated by three research groups: the Onto-Agents and Scalable Knowledge Composition (SKC) Research Group at Stanford University, the Ontobroker-Group at the University of Karlsruhe, Germany, and the Protégé Research Group at Stanford University. httpwwwsemanticweborg
The W3C Semantic Web Activity Statement explains the consortium’s plans in the areas of enabling standards (driven by the RDF Core and Web Ontology Working Groups), education and outreach (RDF Interest Group), as well as co-ordination and advanced development. httpwwww3org2001swActivity
See also Kim Veltman’s warning of what he sees to be too narrow a definition of the Semantic Web, one that will not allow the historical dimension, the richness of cultural expression, the unique and the diversity of interpretations to be adequately dealt with. Cf. K. Veltman: Challenges for a Semantic Web (July 2002), httpwwwcultivate-intorgissue7semanticweb
SHOE – Simple HTML Ontology Extensions
SHOE was one of the first ontology-based markup languages developed for use on the World Wide Web. It is a small extension to HTML that allows Web page authors to annotate their Web documents with machine-readable knowledge. See httpwwwcsumdeduprojectsplusSHOE
SMIL - Multimedia on the Web
The Synchronized Multimedia Integration Language (SMIL, pronounced ‘smile’) enables authoring of interactive audiovisual presentations. SMIL is typically used for ‘rich media’/multimedia presentations which integrate streaming audio and video with images, text or any other media type. SMIL is an HTML-like language and may be written using a simple text editor. W3C Synchronized Multimedia: httpwwww3orgAudioVideo
SPECTRUM-XML DTD
SPECTRUM (Standard Procedures for Collections Recording Used in Museums) was created by the mda (httpwwwmdaorguk). It is a guide to good practice for museum documentation that describes procedures for documenting objects and the processes they undergo, as well as the necessary information that needs to be recorded to support the procedures. For SPECTRUM, an XML Document Type Definition has been produced, which serves as a system-neutral interchange format for museum data.
‘SPECTRUM: The UK Museum Documentation Standard’ is available in its second edition; see httpwwwmdaorgukspectrumhtm
For a description of the creation, structure and deployment of the SPECTRUM-XML DTD, see Bert Degenhart-Drenth: Building on the mda SPECTRUM-XML DTD for Collections Management Data Interchange. Museums and the Web Conference 2001. httpwwwarchimusecommw2001papersdegenhartdegenharthtml
Web Services
Although the concept of Web Services is not a new one, the definition of Web Services standards allows the wider use of these services. Currently, three main protocols are used in the context of Web Services: UDDI (Universal Description, Discovery and Integration), a registry system to find resources and Web Services; WSDL (Web Services Description Language), an interface description language; and SOAP (Simple Object Access Protocol), the communication protocol for Web Services. All three protocols are based on XML.
For a description of the many ways in which XML can enhance Web Services, see httpwwww3org2002wsActivity
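Because SOAP is described here as the XML-based communication protocol for Web Services, a small sketch of what such a message looks like may help. The envelope namespace below is the one defined by SOAP 1.1; the service namespace, the SearchObjects operation and its creator parameter are purely hypothetical and stand in for whatever a real collection service would define in its WSDL description.

```python
import xml.etree.ElementTree as ET

# Envelope namespace defined by SOAP 1.1.
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
# Hypothetical namespace of an imaginary museum collection service.
SERVICE_NS = "urn:example:collection"


def build_envelope(operation, params, service_ns=SERVICE_NS):
    """Wrap an operation call and its parameters in a SOAP envelope."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{service_ns}}}{operation}")
    for name, value in params.items():
        # Each parameter becomes a child element of the operation element.
        ET.SubElement(call, f"{{{service_ns}}}{name}").text = value
    return ET.tostring(envelope, encoding="unicode")


# The operation and parameter names are invented for illustration.
message = build_envelope("SearchObjects", {"creator": "Galileo"})
print(message)
```

In practice, a client would post this message over HTTP to an address found through a registry such as UDDI; the sketch stops at building the XML.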
World Wide Web
Looking back as well as into the future: ‘Weaving the Web: The Original Design and Ultimate Destiny of the World Wide Web by Its Inventor’, Tim Berners-Lee with Mark Fischetti, Harper San Francisco, 1999.
XML – eXtensible Markup Language
See Cultural Heritage Semantic Web Example & Primer, pp. 27-30.
For the work done at W3C within the XML activity, see XML Working Groups, httpwwww3orgXML; the XML Specifications can be found at httpwwww3orgXMLCore
‘The bane of my existence is doing things that I know the computer could do for me.’ – Dan Connolly, The XML Revolution, October 1998, httpwwwnaturecomnaturewebmattersxml
‘The Semantic Web, as it is advocated by people like Tim Berners-Lee and James Hendler, does not take enough advantage of the experience built up in knowledge engineering and conceptual modelling. There is this anarchistic idea of the Web as a place where everyone can do his or her own thing. I have no problem with that; a lot of people are able to find what they want on the Web. But if you want real interoperability, with search engines that can grasp the intended meaning of information, that approach falls short. To create a real Semantic Web, we have to develop and use well-founded generic ontologies, based on linguistics and logics.’
Nicola Guarino has clear views on the SemanticWeb and its development He is a senior researcher atthe Institute for Cognitive Sciences and Technologiesin Italy where he leads the Laboratory for AppliedOntology Since 1991 he has played an active role inthe Artificial Intelligence community in promotingthe interdisciplinary study of ontological foundationsof knowledge engineering and conceptual modellingGuarino lsquoIn our Laboratory the focus is on contentand not so much on representationThe use ofontologies is unavoidable when referring to contentPeople do it implicitly all the time when they arecommunicating and trying to understand each otherIf we want machines to understand each other inother words real interoperability we need to makethese ontologies explicit in an unambiguous wayrsquo
An ontology is a hierarchical description of the relations between concepts in a certain domain, plus an unambiguous description of the concepts themselves. As they are created for a certain domain, ontologies often fail to be interoperable because of the ambiguity that results from the use of the same terms for different concepts (and vice versa) between different domains. The term 'net', for instance, has quite a different meaning for Web designers and fishermen. That is why there is a need for well-founded generic ontologies. An example of a concept that a generic ontology has to pin down is the term 'part', which can have different meanings both within a domain and between domains. For instance, the violist plays a part in the orchestra. His finger is part of him. Can his finger be part of the orchestra? According to Guarino this is a genuine ontological problem that can only be solved by giving an unambiguous meaning to the term 'part'.
Another example cited by Guarino is the term 'in'. What exactly are you describing when you say the spoon is in the cup? Does it mean that the spoon is totally embedded in the cup, or is it only partly in the cup? Guarino: 'These examples seem trivial, but if you want real interoperability between different knowledge domains, you will have to prevent the problems that come with the ambiguity of day-to-day language.'
In this respect, Guarino thinks it is a drawback that computer science curricula scarcely ever contain an introduction to the ontological foundations of conceptual modelling. 'Students learn all about Java, HTML and C++, and name all the other languages, and they also learn how to use these. But when they graduate, they hardly know a thing about formal ontology. I really think people should know more about the work on ontology that has been done in philosophy. It is certainly not much harder to acquire than, say, studying differential equations or learning how to use Java.'
Developing well-founded generic ontologies seems an enormous job, but it is not as enormous a task as it appears. Guarino: 'I would say that a few dozen would get you on the way nicely. But you have to take the fundamental route. At the moment, development of the Semantic Web is driven by the need for short-term results. Hence interoperability is realised by putting the right tags on the information. That is not what I call semantics; that is syntax. XML and RDF are very useful for this, but they fall short when you want to create a real Semantic Web.'
Laboratory of Applied Ontology: http://ontology.ip.rm.cnr.it
SEMANTIC WEB SHOULD BE BASED ON WELL-FOUNDED ONTOLOGIES
AN INTERVIEW WITH NICOLA GUARINO, LABORATORY OF APPLIED ONTOLOGY, ITALY
By Joost van Kasteren
The Semantic Web, according to a statement of its well-known advocate Tim Berners-Lee, is a vision of 'a distributed machine which should function so as to perform socially useful tasks'1. This machine should allow intelligent software agents to understand semantic relationships between Web resources in order to seek relevant information and perform transactions for humans.
Contrasted with the existing human-readable Web, the Semantic Web is envisaged as a Web of machine-readable data that will be based on 'languages for expressing information in a machine processable form'2. Key to an understanding of the Semantic Web, therefore, is how these languages function: how information is expressed in order that computers can automatically process Web sources and assist in making the Web more useful for humans. The aim of this chapter is to provide an overview of the Semantic Web concept by describing its general architecture, i.e. the interplay of its languages.
The chapter has two interrelated parts. Part 1 describes a Finnish project that strives to build the foundations for the "Finnish Museums on the Semantic Web" (FMS), a future semantic museum portal. This part consists of the information boxes on the following pages, which briefly describe the necessary elements and steps in the set-up of the FMS system. It is recommended to start by reading this description (see also graphic 3 on page 36, which provides an overview of the set-up of the FMS system). It should be helpful in gaining a general understanding of how semantic interoperability of, and new ways of interacting with, semantically marked-up cultural heritage information can be realised.
Part 2, the texts below the information boxes, is a primer that explains terms used in part 1, which represent core elements of the Semantic Web architecture, and provides illustrative examples. The explanations are not intended to give in-depth definitions of these elements; such definitions are provided in the relevant W3C specifications. The examples have been kept as simple as possible, but build on each other. In this way we will develop a (fictitious) Website, http://www.m-i.org, that provides semantically enhanced access to such marvellous medieval images as the ones we have used to illustrate this Thematic Issue.
How to Make Collection Metadata of Museums Semantically Interoperable on the Web - The "Finnish Museums on the Semantic Web" (FMS)
The Semantic Web concept is visionary, and there are dedicated people, also in the heritage sector, who are trying to make it a reality. In our example, a group of researchers and technology developers who work at the University of Helsinki and the Helsinki Institute for Information Technology are translating the Semantic Web vision into a future semantic museum portal.
The group's two-year project will run until spring 2004 and is being carried out in co-operation with, and with funding from, major organisations including
A CULTURAL HERITAGE SEMANTIC WEB EXAMPLE & PRIMER
By Guntram Geser
1 Tim Berners-Lee: Interpretation and Semantics on the Semantic Web (1998), http://www.w3.org/DesignIssues/Interpretation.html
2 Tim Berners-Lee: Semantic Web Road Map (1998), http://www.w3.org/DesignIssues/Semantic.html
3 Robert DuCharme, http://lists.xml.org/archives/xml-dev/200211/msg00190.html
Espoo City Museum, Helsinki University Museum, National Board of Antiquities, Nokia, TietoEnator and the National Technology Agency (TEKES).
The major goals of the project are to make collection metadata, which stem from heterogeneous databases, semantically interoperable on the Web, and to provide facilities for semantic browsing and searching in the combined knowledge base of the participating museums.
The project's vision is called "Finnish Museums on the Semantic Web" (FMS), and its architecture allows for all Finnish museums to join in. However, in an approach of starting small but ambitious, the project is at present using the collection databases of two museums, the Espoo City Museum and the National Museum of Finland. Furthermore, the implementation is currently restricted to only one part of the collections - textiles.
In order to reach the FMS system's goal of making the museums' metadata semantically interoperable on the Web, the data must be harmonised on the syntactic and semantic level. For this harmonisation the eXtensible Markup Language (XML) and the Resource Description Framework (RDF) are being used, of which RDF is the key language for achieving semantic interoperability of the heterogeneous sets of metadata.
RDF and Metadata - A Natural Fit
An observer of the diffusion of the Resource Description Framework (RDF) into various domains, Robert DuCharme, has commented: 'I still find it a little ironic that, while RDF has gotten so much publicity as a technology for warm and fuzzy AI (Artificial Intelligence) pie-in-the-sky technology, it's gotten most of its traction in the mundane world of metadata.'3
Yet, given the importance of metadata for the Semantic Web vision in general, it does not come as a surprise that metadata of key information communities belong to the first of RDF's intended uses. RDF seems to gain momentum in particular among the library and other communities that use Dublin Core.
The current W3C RDF Primer (Working Draft, 23 January 2003), edited by Frank Manola and Eric Miller, labels RDF as 'an ideal representation for Dublin Core information' and describes Dublin Core as one of its 'RDF in the field' examples (cf. http://www.w3.org/TR/rdf-primer).
At the Dublin Core Metadata Initiative (DCMI), 'Expressing Simple Dublin Core in RDF/XML' was announced as a DCMI Recommendation in October 2002, 'the first in a series of recommendations for encoding Dublin Core metadata using mainstream Web technologies', i.e. XML/RDF/XHTML. 'Expressing Qualified Dublin Core in RDF/XML' is currently a Proposed Recommendation. Cf. http://dublincore.org/groups/architecture
Syntactic Transformation 1: Creating the XML Documents
In the FMS system the eXtensible Markup Language (XML) is used as the data transfer format. This transfer format enables the system to make use of the data originally stored in the museums' heterogeneous collection databases. Therefore, each museum participating in the FMS initiative provides the relevant collection data as an XML document repository.
In a process of syntactic harmonisation, the data from a museum's collection database are retrieved and transformed to an XML format conforming to the XML Schema of the FMS initiative.
The data to be published are read from the database through a 'view', which helps create the XML format. The view is a queryable interface, a virtual table that results from an SQL query which may join multiple tables of the database. Through the view the data are queried so that the rows of the tables are grouped by collection items. For each item, the set of rows is combined into a single XML document.
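The rows-to-XML step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the FMS implementation: the row layout (item_id, field, value) is hypothetical, and the namespace is the fictitious one used in this primer.

```python
import xml.etree.ElementTree as ET
from itertools import groupby
from operator import itemgetter

MI_NS = "http://www.m-i.org/images"  # fictitious namespace from this primer

# Hypothetical rows as a database view might return them:
# (item_id, field, value), already ordered by item_id.
rows = [
    ("image5kb78d38i", "type", "Column Miniature"),
    ("image5kb78d38i", "creator", "Alexander Master"),
    ("image5kb78d38i", "place", "Utrecht"),
]

def rows_to_documents(view_rows):
    """Group view rows by item and build one XML element tree per item."""
    ET.register_namespace("mi", MI_NS)
    documents = {}
    for item_id, group in groupby(view_rows, key=itemgetter(0)):
        root = ET.Element(f"{{{MI_NS}}}image", {"image_id": item_id})
        for _, field, value in group:
            child = ET.SubElement(root, f"{{{MI_NS}}}{field}")
            child.text = value
        documents[item_id] = root
    return documents

docs = rows_to_documents(rows)
print(ET.tostring(docs["image5kb78d38i"], encoding="unicode"))
```

Grouping relies on the rows arriving sorted by item, which a view can guarantee with an ORDER BY clause; with unsorted rows, a dictionary keyed by item would be the safer choice.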
XML
XML is a markup language for describing data. It is a language created to allow anyone to design the structure of their own documents. An XML document contains text that consists of markup in the form of tags and plain text between them, the latter being just pure information (for example, <creator>Alexander Master</creator>). XML tags are not predefined; everyone can define his or her own tags.
FMS Documents
A short presentation is provided in Eero Hyvönen et al.: Cultural Semantic Interoperability on the Web. Case Finnish Museums Online, http://iswc2002.semanticweb.org/posters/hyvonen_a4.pdf; for detailed descriptions see Vilho Raatikka, Eero Hyvönen: Ontology-based Semantic Metadata Validation, and Hyvönen, Eero et al.: Semantic Interoperability on the Web. Case Finnish Museums Online. Both texts can be found in: Towards the Semantic Web and Web Services. Proceedings of the XML Finland 2002 Conference, http://www.cs.helsinki.fi/u/eahyvone/xmlfinland2002/ProceedingsXML2002-final.pdf
ITEM_VIEW (columns of the view): item_id, type, subject, iconclass, creator, manuscript, place, year
XML shares the syntax and bracketed tags of the well-known HyperText Markup Language (HTML), but XML serves a different goal. While HTML is used to define the layout of pages on the WWW, XML is used to define the content of documents, for example to specify that an area of text is the name of a creator.
XML allows for creating markup (e.g. <creator>) that seems to carry some semantics. However, for a computer a tag like <creator> carries as much semantics as a tag like <H1>. A computer simply does not know what a creator is and how the concept creator is related to other concepts (e.g. manuscript). For an XML processor, <H1> and <creator> or <manuscript> are all equally (and totally) meaningless. XML is all about describing data; on its own it does not do anything. There needs to be a processing program that uses the markup to interpret the individual elements.
The graphic below illustrates the database rows to XML process as described in the info box. It provides a very simple example of an XML document that describes some data for one of the medieval column miniatures from the Koninklijke Bibliotheek, The Hague, which we were permitted to use for illustrating this Thematic Issue. It includes the Iconclass classification for this image: 71A3421 Eve emerges from Adam's body (for the hierarchical path of this classification, see the section on ontologies).
Short explanations for the XML document (image5kb78d38i.xml) shown in the graphic below:
A well-formed XML document is one that conforms to the XML syntax rules, of which we would like to highlight the following:
(1) The document should begin with the XML declaration, which defines the XML version and the character encoding used in the document. In the example below we use <?xml version="1.0" encoding="ISO-8859-1"?>, i.e. the document conforms to the 1.0 specification of XML and uses the ISO-8859-1 (Latin-1/West European) character set.
(2a, 10) The XML document must contain a single tag pair to define a root element, in our example <mi:image> </mi:image>.
Database rows, grouped by item:
image5kb78d38i, 'Column Miniature', 'Eve emerges from Adam's Body', '71A3421', 'Alexander Master', 'Historic Bible', 'Utrecht', 'circa 1430'
Rows to XML process
XML document with the item data from the database rows (image5kb78d38i.xml):
(1) <?xml version="1.0" encoding="ISO-8859-1"?>
(2) <mi:image xmlns:mi="http://www.m-i.org/images"
    image_id="image5kb78d38i">
(3) <mi:type>Column Miniature</mi:type>
(4) <mi:subject>Eve emerges from Adam's body</mi:subject>
(5) <mi:iconclass>71A3421</mi:iconclass>
(6) <mi:creator>Alexander Master</mi:creator>
(7) <mi:manuscript>Historic Bible</mi:manuscript>
(8) <mi:place>Utrecht</mi:place>
(9) <mi:year>circa 1430</mi:year>
(10) </mi:image>
Graphic 1: Database rows to XML process
(2b) Namespace: Since element names in XML are not fixed, name conflicts can occur when different documents use the same names to describe different types of elements. To prevent such conflicts, a unique namespace should be defined using a Uniform Resource Identifier (URI). An XML namespace is a collection of names that are used as element types and attribute names (cf. http://www.w3.org/TR/REC-xml-names). The namespace declaration in the start tag of our root element is xmlns:mi="http://www.m-i.org/images".
The namespace prefix mi (for medieval images) functions as a placeholder for the namespace name. It needs to show up in all element tags (e.g. <mi:type> </mi:type>).
(2c) image_id="image5kb78d38i": This is the id attribute, which contains the unique identifier of the data source record.
(3-9) All other elements must be within the root element and can themselves have sub-elements (child elements), which must be properly nested within their parent element. Our elements do not have sub-elements.
Other syntax rules are, for example: all start tags must match their end tags and, because XML tags are case-sensitive (i.e. the tag <mi:Creator> is different from the tag <mi:creator>), they must also be written with the same case; all elements must have a closing tag; all attribute values must be within quotation marks (e.g. "image5kb78d38i").
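The well-formedness rules above can be tried out with any XML parser. The following minimal sketch uses the parser from the Python standard library; the sample strings are shortened variants of the document in graphic 1, and the helper name is_well_formed is ours.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

good = ('<mi:image xmlns:mi="http://www.m-i.org/images" '
        'image_id="image5kb78d38i">'
        '<mi:creator>Alexander Master</mi:creator>'
        '</mi:image>')

# Start and end tag differ in case, so they no longer match.
bad_case = good.replace("</mi:creator>", "</mi:Creator>")
# Attribute value without quotation marks.
unquoted = good.replace('"image5kb78d38i"', "image5kb78d38i")
# Prefix used without a namespace declaration.
unbound = "<mi:image/>"

for label, text in [("good", good), ("bad_case", bad_case),
                    ("unquoted", unquoted), ("unbound", unbound)]:
    print(label, is_well_formed(text))

# The parser replaces the prefix by the namespace name it stands for.
print(ET.fromstring(good).tag)
```

Only the first string is accepted. The last line shows why the prefix is merely a placeholder: what the processor keeps is the namespace name together with the local element name.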
Syntactic Transformation 2: The XML Schema
In order to allow for syntactic harmonisation, the XML documents of the museums should conform to the XML Schema of the FMS initiative. The museums therefore validate the XML documents they create against the initiative's XML Schema. If the documents are valid, the process can continue to the semantic level.
XML Schema
The XML Schema defines the building blocks of an XML document, including:
| elements and attributes that can appear in a document
| which elements are child elements, as well as their order and number
| whether an element is empty or can include text
| the data types for elements and attributes
| as well as default and fixed values for elements and attributes
XML with an XML Schema is designed to be self-descriptive. One of the greatest strengths of XML Schema is that it allows for data typing. The most common data types are xs:string, xs:decimal, xs:integer, xs:boolean, xs:date and xs:time. In the example below, which is the XML Schema for the XML document (image5kb78d38i.xml) shown in graphic 1, we only use the data type xs:string. This data type is used for values that contain character strings.
Short explanations
(1) The XML declaration, which states that the document conforms to the 1.0 specification of XML.
(2a) Determines that the elements and data types that are used to construct the schema come from the W3C's XML Schema namespace. Consequently, each
(1) <?xml version="1.0"?>
(2a) <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
(2b) targetNamespace="http://www.m-i.org/images"
(2c) xmlns="http://www.m-i.org/images"
(2d) elementFormDefault="qualified">
(3) <xs:element name="image">
(4) <xs:complexType>
(5) <xs:sequence>
(6) <xs:element name="type" type="xs:string"/>
(7) <xs:element name="subject" type="xs:string"/>
(8) <xs:element name="iconclass" type="xs:string"/>
(9) <xs:element name="creator" type="xs:string"/>
(10) <xs:element name="manuscript" type="xs:string"/>
(11) <xs:element name="place" type="xs:string"/>
(12) <xs:element name="year" type="xs:string"/>
(13) </xs:sequence>
(14) <xs:attribute name="image_id" type="xs:string" use="required"/>
(15) </xs:complexType>
(16) </xs:element>
(17) </xs:schema>
of the elements and data types in the schema has the prefix xs, which identifies them as belonging to the vocabulary of the XML Schema language rather than the vocabulary of our (fictitious) organisation, M-i.org.
(2b) Indicates that the elements defined by this schema come from our http://www.m-i.org/images namespace.
(2c) Is our default namespace.
(2d) Demands that any elements used by an XML document which were declared in this schema must be namespace qualified.
(3, 16) The parent element for the data typing of the image descriptions we provide at http://www.m-i.org/images. It is (4) defined as an xs:complexType, i.e. it contains child elements (6-12), which are (5, 13) surrounded by an xs:sequence element that defines an ordered sequence of these elements.
(6-12) The child elements, which in our example are simple types because they do not contain other elements. They define various elements of our XML documents, e.g. "creator", to be of the data type xs:string.
(14) Furthermore, the Schema determines that for the element xs:element name="image" there is a required attribute image_id of the data type xs:string (for example, image5kb78d38i).
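To make the effect of such a schema tangible, here is a small sketch that checks a document against the three constraints discussed above: the root element, the required attribute and the ordered sequence of child elements. It is a hand-written stand-in, not an XML Schema validator; a real project would hand the schema itself to a validating processor. The function name and messages are ours.

```python
import xml.etree.ElementTree as ET

MI_NS = "http://www.m-i.org/images"
# Child elements in the order fixed by the xs:sequence of the example schema.
EXPECTED_CHILDREN = ["type", "subject", "iconclass", "creator",
                     "manuscript", "place", "year"]

def check_against_example_schema(xml_text):
    """Return a list of problems; an empty list means the checks passed."""
    root = ET.fromstring(xml_text)
    problems = []
    if root.tag != f"{{{MI_NS}}}image":
        problems.append("root element must be image in the m-i.org namespace")
    if root.get("image_id") is None:
        problems.append("required attribute image_id is missing")
    found = [child.tag for child in root]
    wanted = [f"{{{MI_NS}}}{name}" for name in EXPECTED_CHILDREN]
    if found != wanted:
        problems.append("child elements differ from the declared sequence")
    return problems

# Build one conforming document and one that lacks the required attribute.
fields = "".join(f"<mi:{n}>x</mi:{n}>" for n in EXPECTED_CHILDREN)
valid = (f'<mi:image xmlns:mi="{MI_NS}" image_id="image5kb78d38i">'
         f"{fields}</mi:image>")
no_id = valid.replace(' image_id="image5kb78d38i"', "")

print(check_against_example_schema(valid))
print(check_against_example_schema(no_id))
```

Because every child element is typed xs:string, no check on the element content is needed here; with types such as xs:integer or xs:date the values themselves would have to be tested as well.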
Major Benefits of XML
In the context of the Semantic Web, XML provides an interoperable syntactical foundation upon which solutions to the issues of representing relationships and meaning can be built. We also want to highlight the many benefits of XML that are adding to its rapid uptake in the first place, and might in the longer term be supportive in realising the Semantic Web vision on a broader scale.
XML is one of the most important standards developments in recent years. It is an international, universal, non-system and non-application specific data exchange standard. XML is international because it employs Unicode. This means that there is no restriction to the western alphabet; Arabic, Chinese, Greek, Hebrew, Thai, etc. can be easily integrated.
XML is non-system specific because it is an open standard set by the World Wide Web Consortium (W3C). As such, there is no owner of XML. All the major software suppliers support it; it can be used on any computing platform, from Windows and MacOS to Linux. This makes it easier for organisations to change systems or combine different systems.
XML is also non-application specific, i.e. it can be used in various applications such as data exchange, data harvesting, Web site management, etc. XML is gaining ever-wider acceptance in many application domains, including in particular the cultural heritage community.
Bear in mind also that the major collection management software producers have implemented support for XML in their systems, enabling, for example, the integration of data from different collections and their combination over the Web.
XML allows for multi-channel publishing, i.e. with XML it is easy to produce different products or services from digital cultural heritage assets. Once the data are structured in XML, they can be displayed across a variety of media using an associated style sheet that contains the display information.
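The idea of multi-channel publishing can be illustrated with a short sketch in which one XML record feeds two different outputs. The two Python functions below merely stand in for what style sheets would do in practice; their names and the chosen output formats are assumptions made for this example.

```python
import xml.etree.ElementTree as ET
from html import escape

NS = {"mi": "http://www.m-i.org/images"}

record = ET.fromstring(
    '<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">'
    "<mi:subject>Eve emerges from Adam's body</mi:subject>"
    "<mi:creator>Alexander Master</mi:creator>"
    "</mi:image>"
)

def as_html(rec):
    """Channel 1: a small HTML fragment for a Web page."""
    subject = rec.findtext("mi:subject", namespaces=NS)
    creator = rec.findtext("mi:creator", namespaces=NS)
    return f"<h1>{escape(subject)}</h1><p>{escape(creator)}</p>"

def as_caption(rec):
    """Channel 2: a one-line plain-text caption, e.g. for a printed label."""
    subject = rec.findtext("mi:subject", namespaces=NS)
    creator = rec.findtext("mi:creator", namespaces=NS)
    return f"{subject} ({creator})"

print(as_html(record))
print(as_caption(record))
```

The record itself contains no display information at all; everything that differs between the two channels lives in the rendering step.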
Finally, XML can be used to create new languages. For example, the Wireless Markup Language (WML), which is used to mark up Internet applications for handheld devices, is written in XML.
Ontological concepts
The goal of the FMS project is to make metadata of the museums' textiles collections semantically interoperable on the Web. In order to achieve such interoperability, an ontology is being designed that describes the common (lower-level) ontological concepts in this domain of knowledge.4
Ontology
In the Semantic Web architecture the semantic relationships are not embedded, but explicitly represented by an ontology, or rather an interrelated set of ontologies. In fact, the wide array of information residing on the Web and the perceived need to make it more machine-processable have acted as a strong impetus for the development of ontology languages.
Yet what is an ontology? An abstract definition of an ontology is that it describes a formal, shared conceptualisation of a particular domain of interest, for example cultural heritage objects held in art museums. In particular, an ontology allows for constraining, expressing and analysing the intended meaning of the shared vocabulary of concepts and relations in a domain of knowledge.5
If these concepts and relations are formalised to a high degree, the domain has at hand a major building block for developing semantically aware information systems. With Semantic Web technologies the domain ontology can be made available on the network, cross-referenced with upper-level and
4 The available documents (in English) on the FMS initiative state that their ontology is being created using RDF Schema (RDFS). To develop a fully fledged ontology, advanced languages such as DAML+OIL or Web Ontology Language (OWL) would be required.
5 For more elaborate and formal descriptions see Tom Gruber: What is an Ontology? (1995), http://www-ksl.stanford.edu/kst/what-is-an-ontology.html; Nicola Guarino: Ontology-Driven Conceptual Modelling, part 1-3 (2002), http://ontology.ip.rm.cnr.it/Tutorials
other domain ontologies6, and remote applications or intelligent software agents can refer to it when they interact to provide a certain information service.
The degree of formalisation of concepts and their relations varies considerably between different domains of knowledge. At the lower end one finds lexicons and simple taxonomies (i.e. ordered classification systems where terms are related hierarchically). At the middle level one might place thesauri, i.e. controlled vocabularies that are structured to show relationships between terms and concepts and, for example, allow for retrieving them from a database. At the high end of formalisation of knowledge there are axiomatised logic theories. Such theories include rules to ensure the well-formedness and logical validity of statements expressed in the language of the scientific discipline.
In the cultural heritage sector, a powerful IT-supported example of a hierarchical classification system is Iconclass. This supports the documentation of images, in particular art historical images, by providing a systematic collection of 28,000 ready-made definitions of objects, persons, events, situations and abstract ideas that can be the subject of an image. The definitions consist of an alphanumeric classification code and its textual correlate.7
For example, for the image we have described in XML in graphic 1, the Iconclass definition is 71A3421 Eve emerges from Adam's body. The Medieval Illustrated Manuscripts Website of the Koninklijke Bibliotheek, The Hague, has an Iconclass Browser in place8 that provides the hierarchical path for this concept in the classification system.
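For the plain notation used in our example, the hierarchical path can be derived from the code itself: each broader concept is a shorter leading part of the notation (7, 71, 71A and so on). The sketch below relies on this simplification only; it makes no attempt to handle other forms of Iconclass notation, and the function name is ours.

```python
def notation_path(notation):
    """Return every leading part of a notation, broadest concept first."""
    return [notation[:i] for i in range(1, len(notation) + 1)]

print(notation_path("71A3421"))
# → ['7', '71', '71A', '71A3', '71A34', '71A342', '71A3421']
```

A browser such as the one mentioned above would look up the textual correlate for each of these codes, from 7 (Bible) down to 71A3421 (Eve emerges from Adam's body).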
An outstanding example of a controlled vocabulary is the Art & Architecture Thesaurus (AAT), one of the Getty Research Institute's Vocabulary Databases. It is a structured vocabulary of more than 125,000 terms and other information about concepts that are used for describing fine art, architecture, decorative arts, archival materials and material culture.9
The W3C's Resource Description Framework (RDF) Schema Specification 1.0, in a section on its scope, mentions concept navigation and states: 'Thesauri and library classification schemes are well known examples of hierarchical systems for representing subject taxonomies in terms of the relationships between named concepts. The RDF Schema specification provides sufficient resources for creating RDF models that represent the logical structure of thesauri (and other library classification systems).'10
Yet for realising a full-blown cultural heritage ontology for the Semantic Web there are currently limitations on both sides. On the one hand, hierarchical classification systems and structured vocabularies do not lend themselves easily to rich inter-linking of conceptual 'trees'.
A major step further in this direction is the CIDOC object-oriented Conceptual Reference Model (CRM).11 This provides an ontology of 81 classes and 130 properties, which describes in a formal language concepts and relations relevant to the documentation of cultural heritage.
On the other hand, RDF Schema has limitations when it comes to expressing complex ontological relationships. New languages based on description logics are being developed. These include DAML+OIL and the upcoming Web Ontology Language (OWL), which are capable of fully describing ontologies.
Also worth highlighting is that tools for ontology building are proliferating at the present time. These ontology editors need to be carefully assessed, as their capabilities differ considerably.12
7 Bible
71 Old Testament
71A Genesis from the creation to the expulsion from paradise, and later years of Adam and Eve
71A3 creation of man; the Garden of Eden (Genesis 1:26-2)
71A34 creation of Eve
71A342 Eve is fashioned from Adam's rib
71A3421 Eve emerges from Adam's body
6 Upper-level ontologies describe the basic concepts and relationships invoked when information about any domain is expressed in natural language.
7 For in-depth information see the official Iconclass Website, http://www.iconclass.nl
8 Medieval Illustrated Manuscripts Website, http://www.kb.nl/kb/manuscripts/browser. The subject access system for the Website was conceived by Mnemosyne Partners, building on the Iconclass classification system and technologies. See the valuable information they provide at http://www.mnemosyne.org/business/mss/tempex.html
9 A detailed description of the AAT is provided at http://www.getty.edu/research/tools/vocabulary/aat/about.html
10 http://www.w3.org/TR/2000/CR-rdf-schema-20000327
11 See the information box at the end of the Forum discussion and the sources mentioned in the Semantic Web Terms and Reading List.
12 Michael Denny: Ontology Building: A Survey of Editing Tools (06-11-2002), http://www.xml.com/pub/a/2002/11/06/ontologies.html
Semantic Transformation 1: The RDF Data Model
Progressing towards semantic interoperability, the metadata in the XML documents are now transformed into RDF statements corresponding to the RDF data model. With these so-called RDF 'triples', the XML metadata elements are mapped to the RDF classes and properties which are defined by the RDF Schema of the FMS initiative (see section 3).
'XML is nothing more than a way to standardize data formats. This is not to underplay XML's importance. A data-format standard makes all of the more glamorous technologies possible, and RDF is the leading example of the benefit that comes once the data format has been standardized. Many proclaim that RDF is really the XML's killer app, and with good reason. Despite all this, RDF remains somewhat obscure. This is mainly because at its core RDF is very abstract, very dry and very academic.'
Uche Ogbuji: An introduction to RDF (2000), http://www-106.ibm.com/developerworks/library/w-rdf/?dwzone=xml
RDF Data Model
In order to make Web resources semantically interoperable, we need resources that provide machine-understandable information about themselves. In the Semantic Web architecture these statements are built by using the Resource Description Framework (RDF). RDF defines a data model for the statements describing typed relationships between uniquely identified resources. RDF distinguishes between:
| resources: familiar examples are a Web page, electronic document or digital image, but in RDF also entities that are not 'network retrievable', e.g. museums, curators or bound medieval manuscripts, can be resources
| properties: these identify a specific aspect, characteristic, attribute or relation used to describe the resource
| statements: these associate a value for a named property with the resource
Hence, RDF provides a model for describing relationships between resources in terms of named properties and values. The RDF data model intrinsically supports only binary relations. Its base element is the 'triple', which takes the form of subject, predicate, object: a resource (the subject) is linked to another resource (the object) through an arc labelled with a third resource (the predicate). The semantics of a triple clearly depends on the property used as predicate.
A convenient way to visualise this is to draw nodes for subject and object and an arrow between them for the predicate (see graphic 2). In this labelled directed graph, subject and predicate (property) are Uniform Resource Identifiers (URIs), and the object is either a URI or a literal (which is drawn as a box). Everything in RDF can be represented by a graph with nodes and arcs, and the data model allows for using the same URI as a node and as an arc label. To represent RDF statements in a machine-processable way, RDF builds on XML. With RDF/XML, a specific XML markup language, RDF information can be represented and exchanged between machines.
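The triple model is simple enough to be written down directly. The following sketch represents statements as (subject, predicate, object) tuples and distinguishes URIs from literals; the image URI, the '#' separator in the schema namespace and the helper names are assumptions made for illustration, not part of any RDF library.

```python
from typing import NamedTuple

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
MI = "http://www.m-i.org/schemas/images#"  # fictitious; '#' separator assumed

class URI(str):
    """A resource identified by a URI (drawn as a node or an arc label)."""

class Literal(str):
    """A plain value such as a name (drawn as a box)."""

class Triple(NamedTuple):
    subject: URI
    predicate: URI
    obj: object  # either a URI or a Literal

# Hypothetical URI for the image record described in graphic 1.
image = URI("http://www.m-i.org/images/image5kb78d38i")

graph = [
    Triple(image, URI(RDF_TYPE), URI(MI + "ColumnMiniature")),
    Triple(image, URI(MI + "hasCreator"), Literal("Alexander Master")),
]

def objects(triples, subject, predicate):
    """All values that a given property has for a given resource."""
    return [t.obj for t in triples
            if t.subject == subject and t.predicate == predicate]

print(objects(graph, image, URI(MI + "hasCreator")))
```

Every statement has exactly three parts, which reflects the remark above that the data model supports only binary relations.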
Graphic 2: RDF Data Model
Triple 1
Subject: http://www.m-i.org/schemas/images#ColumnMiniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Miniature
Triple 2
Subject: http://www.m-i.org/schemas/images#Miniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Image
With these two triples we state that ColumnMiniature is a subclass of Miniature and that Miniature is a subclass of Image.
The predicate in our statements is the rdfs:subClassOf property, which is predefined in the RDF Schema namespace, http://www.w3.org/2000/01/rdf-schema
The classes Image, Miniature and ColumnMiniature would also need to be defined in our RDF Schema namespace, http://www.m-i.org/schemas/images
For details on how to use RDF Schema for defining your domain ontology, see section 3. In a nutshell, RDF Schema is a 'higher-level' language which is itself defined using RDF.
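Because rdfs:subClassOf is transitive, the two triples of graphic 2 also imply that ColumnMiniature is a subclass of Image. A minimal sketch of how a program could draw that conclusion is given below; the URIs follow the fictitious namespace of this primer (with an assumed '#' separator), and the function is ours, not part of an RDF toolkit.

```python
RDFS_SUBCLASS = "http://www.w3.org/2000/01/rdf-schema#subClassOf"
MI = "http://www.m-i.org/schemas/images#"  # fictitious; '#' separator assumed

triples = {
    (MI + "ColumnMiniature", RDFS_SUBCLASS, MI + "Miniature"),
    (MI + "Miniature", RDFS_SUBCLASS, MI + "Image"),
}

def superclasses(cls, graph):
    """Collect all direct and indirect superclasses of a class."""
    found = set()
    todo = [cls]
    while todo:
        current = todo.pop()
        for subject, predicate, obj in graph:
            if (subject == current and predicate == RDFS_SUBCLASS
                    and obj not in found):
                found.add(obj)
                todo.append(obj)
    return found

print(sorted(superclasses(MI + "ColumnMiniature", triples)))
```

A search for all images could use this to include column miniatures as well, even though no statement says so directly.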
Semantic Transformation 2: Creating and Validating the RDF Statements
In the mapping process, an editor tool (in the FMS case a tool developed by the project team, called Meedio) receives as input the XML documents and assists in transforming them into semantically valid RDF statements (instance descriptions). The tool serves as an instance editor, which provides a convenient way of finding and selecting, from the XML metadata elements, correct instance values for a particular property. The editor tool also serves as a semantic metadata validator. When the museum cataloguer saves the set of RDF statements corresponding to an XML document, the semantics of these statements are validated against the property constraints of the FMS ontology. The result of a successful mapping and validation process is a unique set of RDF triples, called the RDF card.
RDF Mapping Rules
When RDF is used to define the meaning of XML metadata elements, a set of mapping rules is created. A mapping rule is a template of RDF triples where XPath expressions are used to identify the actual element values. XPath is a language for addressing parts of an XML document and was designed for use in XML parsing software (XSLT, XPointer and others).
When applying such a rule to an XML document, the XPath expressions are instantiated with matching element values. If the rule matches, the RDF template evaluates to a set of RDF triples where XPath expressions are substituted by the corresponding values of the XML elements.
For example, by applying the template rule <image5kb78d38i/type, mi:hasCreator, image5kb78d38i/creator> to the XML document described in the section on XML, the following result would be obtained: <Image Miniature, mi:hasCreator, 'Alexander Master'>
mi:hasCreator is an example of an RDF property. Such properties are explained in section 3, RDF Schema.
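How a mapping rule is instantiated can be sketched as follows. The rule format (a tuple of two path expressions around a property name) and the function apply_rule are hypothetical and not taken from the FMS tool; the paths use the limited XPath subset supported by Python's standard library. The sketch returns the raw element values and does not perform the term mapping that turns such values into ontology classes.

```python
import xml.etree.ElementTree as ET

NS = {"mi": "http://www.m-i.org/images"}

document = ET.fromstring(
    '<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">'
    "<mi:type>Column Miniature</mi:type>"
    "<mi:creator>Alexander Master</mi:creator>"
    "</mi:image>"
)

# Hypothetical rule: (path for the subject, property, path for the object).
rule = ("mi:type", "mi:hasCreator", "mi:creator")

def apply_rule(mapping_rule, root):
    """Instantiate the triple template with values found in the document."""
    subject_path, predicate, object_path = mapping_rule
    subject = root.findtext(subject_path, namespaces=NS)
    obj = root.findtext(object_path, namespaces=NS)
    if subject is None or obj is None:
        return []  # the rule does not match this document
    return [(subject, predicate, obj)]

print(apply_rule(rule, document))
```

A rule whose paths find nothing simply yields no triples, so the same rule set can be run over documents that lack some of the elements.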
Note: Due to the limited space permitted, we do not address issues of term mapping. This is an important aspect of the mapping and validation process carried out in the FMS project. Working with metadata from different museums, the project team needs to deal with partly different terminologies. Their technical solution to synonymous terms (i.e. different terms referring to the same concept) is to attach synonym sets to the FMS ontology classes. With situations where polysemous terms occur (i.e. the same term refers to different concepts), the editor tool cannot cope, and the cataloguer needs to select the correct interpretation.
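Although term mapping is not covered here, the difference between the two cases can be shown in a few lines. In this sketch the synonym sets, the class names and the lookup function are all invented for illustration; they are not the FMS vocabulary.

```python
# Hypothetical synonym sets attached to (equally hypothetical) ontology classes.
SYNONYMS = {
    "mi:ColumnMiniature": {"column miniature", "column picture"},
    "mi:DecoratedInitial": {"decorated initial", "initial"},
    "mi:NameInitial": {"initial"},
}

def candidate_classes(term):
    """Return every class whose synonym set contains the term."""
    key = term.strip().lower()
    return sorted(cls for cls, terms in SYNONYMS.items() if key in terms)

print(candidate_classes("Column picture"))  # one class: mapped automatically
print(candidate_classes("initial"))         # two classes: cataloguer decides
```

A synonym resolves to exactly one class, whereas a polysemous term returns several candidates, which is the point at which a human has to choose.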
Semantic Transformation 3: The RDF Schema (RDFS)
The shared ontology for the textiles domain is created by using Resource Description Framework Schema (RDFS). An RDF Schema is a tool for indicating the classes of resources one wants to describe, as well as for defining the properties used to describe those resources. Furthermore, class/sub-class relationships and property/sub-property relationships can be defined. The museums are mapping their metadata to the classes and properties defined by the RDF Schema of the FMS initiative. Thereby they are making the meaning of the metadata explicit and representing them in a harmonised, uniform way.
RDF Schema
In section Semantic Transformation 1 we have described the data model provided by RDF for expressing statements about Web resources. But we also need a vocabulary for the RDF statements, namely classes and properties defined with RDF Schema (RDFS).
In brief, the RDF Schema mechanism provides a pre-defined vocabulary, a basic type system that can be used in creating domain-specific schemas. Its role is to allow for declaring metadata properties (e.g. for ‘type’, ‘subject’ or ‘creator’), to define the classes of resources they may be used with, to restrict possible combinations, and to detect violations of those restrictions.
Defining classes
With RDF Schema (RDFS), Web resources can be defined as instances of one or more classes. In addition, classes can be organised in a hierarchical fashion. As we hold a collection of digital images drawn from illustrated medieval manuscripts, we first need to define a class of things that are images. In RDF Schema, a class is any resource having an rdf:type property whose value is the RDFS-defined resource rdfs:Class.
So, using the basic RDF data model, we define:
mi:Image [resource] rdf:type [property] rdfs:Class [value]
The self-defined prefix mi: (for medieval images) stands for the URI reference of our RDF Schema namespace, http://www.m-i.org/schemas/images
In our image collection we have various special kinds of digitised images, such as column miniatures, decorated initials, schematic drawings, etc. To distinguish, for example, the miniatures, we first need to define a general class Miniature and subclasses of miniatures, e.g. a subclass ColumnMiniature:
mi:Miniature rdf:type rdfs:Class
mi:ColumnMiniature rdf:type rdfs:Class
Secondly, we need to define that mi:ColumnMiniature is a subclass of mi:Miniature and that mi:Miniature is a subclass of mi:Image, for which we use the predefined rdfs:subClassOf property:
mi:Miniature rdfs:subClassOf mi:Image
mi:ColumnMiniature rdfs:subClassOf mi:Miniature
As the rdfs:subClassOf property is transitive, this means that mi:ColumnMiniature is also implicitly a subclass of mi:Image.
Graphic 2 on page 32 visualises this with the nodes and arcs of the basic RDF data model.
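The effect of transitivity can be made concrete with a small Python sketch that derives the implied superclasses from the two subclass statements given above. The function superclasses is a hypothetical helper, not part of any RDF toolkit.

```python
# The two rdfs:subClassOf statements from the text, as (subclass, superclass) pairs.
SUBCLASS_OF = {
    ("mi:ColumnMiniature", "mi:Miniature"),
    ("mi:Miniature", "mi:Image"),
}


def superclasses(cls, subclass_of):
    """Return all direct and implied superclasses of cls."""
    found = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for sub, sup in subclass_of:
            if sub == current and sup not in found:
                found.add(sup)
                frontier.append(sup)  # follow the chain upwards
    return found


print(sorted(superclasses("mi:ColumnMiniature", SUBCLASS_OF)))
```

Although no statement links mi:ColumnMiniature directly to mi:Image, the walk up the hierarchy finds it, which is exactly the implicit subclass relationship described in the text.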
Defining properties
In order to make the meaning of our metadata (i.e. ‘type’) explicit, we need to be capable of declaring specific properties that characterise the classes of things we hold at http://www.m-i.org, e.g. digital images of medieval column miniatures.
Basically, RDF Schema defines properties in terms of the classes of resources to which they apply. This is the role of the rdfs:domain and rdfs:range mechanisms.
rdfs:range
The range constraint defines the class or set of classes whose instances can be values of a particular property. If we want to define the property mi:hasType, we must describe this resource (which we locate at http://www.m-i.org/schemas/images) with an rdf:type property whose value is rdf:Property:
mi:hasType [resource] rdf:type [property] rdf:Property [value]
The following RDF statements indicate that mi:ColumnMiniature is a class, mi:hasType is a property, and RDF statements using the mi:hasType property have instances of mi:ColumnMiniature as values:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:range mi:ColumnMiniature
rdfs:domain
The domain constraint restricts the set of classes whose instances may have a particular property attached to them. If we want to indicate that the property mi:hasType applies to instances of class mi:ColumnMiniature, we would write:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:domain mi:ColumnMiniature
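How a validator might check statements against such domain and range declarations can be sketched in Python as follows. This is an illustration only: the instance identifiers (ex:item1 and so on) and the functions is_instance_of and violations are hypothetical, each class is given at most one superclass for brevity, and domain and range are treated as constraints in the way this primer describes them. The RDF Schema specification itself defines them as rules for inferring the types of resources.

```python
# Class hierarchy: each class maps to its single direct superclass (a simplification).
SUBCLASS_OF = {
    "mi:ColumnMiniature": "mi:Miniature",
    "mi:Miniature": "mi:Image",
}

# Hypothetical instances and their rdf:type.
INSTANCE_TYPES = {
    "ex:item1": "mi:ColumnMiniature",
    "ex:item2": "mi:ColumnMiniature",
    "ex:item3": "mi:Image",
}

# The domain and range declarations given in the text for mi:hasType.
DOMAIN = {"mi:hasType": "mi:ColumnMiniature"}
RANGE = {"mi:hasType": "mi:ColumnMiniature"}


def is_instance_of(resource, cls):
    """True if the resource's type is cls or a (transitive) subclass of cls."""
    current = INSTANCE_TYPES.get(resource)
    seen = set()
    while current is not None and current not in seen:
        if current == cls:
            return True
        seen.add(current)
        current = SUBCLASS_OF.get(current)
    return False


def violations(triple):
    """List the domain and range violations of one (subject, property, object) statement."""
    subject, prop, obj = triple
    problems = []
    if prop in DOMAIN and not is_instance_of(subject, DOMAIN[prop]):
        problems.append(f"{subject} is outside the domain of {prop}")
    if prop in RANGE and not is_instance_of(obj, RANGE[prop]):
        problems.append(f"{obj} is outside the range of {prop}")
    return problems


print(violations(("ex:item1", "mi:hasType", "ex:item2")))
print(violations(("ex:item3", "mi:hasType", "ex:item2")))
```

An empty list means the statement passes. A statement whose subject is only known to be an mi:Image fails the domain check, because mi:Image is more general than the declared domain.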
Benefits of RDF
In a SearchWebServices.com definition of RDF, some benefits of RDF are mentioned:
| ‘By providing a consistent framework, RDF will encourage the providing of metadata about Internet resources.
| Because RDF will include a standard syntax for describing and querying data, software that exploits metadata will be easier and faster to produce.
| The standard syntax and query capability will allow applications to exchange information more easily.
| Searchers will get more precise results from searching, based on metadata rather than on indexes derived from full text gathering.
| Intelligent software agents will have more precise data to work with.’ [13]
This is a well-crafted listing of RDF benefits, from provision and exchange of better metadata to agents working with them, hopefully for the benefit of humans. But, as explicitly stated by SearchWebServices.com, these are only potential benefits, i.e. they depend on the level of actual uptake of RDF.
13 whatis.com / searchWebServices.com, Definitions - Resource Description Framework, http://searchwebservices.techtarget.com/sDefinition/0,,sid26_gci213545,00.html (last updated July 27, 2001)
Generating and Using the Knowledge Space
The RDF cards represent the original XML documents at the semantic level. The union of such RDF cards constitutes a knowledge base, which is a harmonised semantic representation of the underlying heterogeneous databases.
However, so far the RDF instance descriptions have not left the museum. The museum has complete control of the information it wants to publish, and it does not need to allow the FMS system access to its internal database system. The RDF data are placed in a public directory on the museum’s WWW server.
The Web crawler of the FMS system harvests the instance descriptions from the different museums, and the system combines them into an RDF repository. This repository is a large semantic graph that consists of the shared ontology and metadata.
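Conceptually, combining the harvested RDF cards with the shared ontology amounts to taking the union of sets of triples. The Python sketch below illustrates this idea only. The museum identifiers, the sample triples and the function build_repository are hypothetical, and no actual crawling is performed.

```python
# Shared ontology, as a set of (subject, predicate, object) triples.
ONTOLOGY = {
    ("mi:Miniature", "rdfs:subClassOf", "mi:Image"),
}

# Hypothetical RDF cards as they might be harvested from two museums.
CARD_A = {
    ("museumA:obj1", "rdf:type", "mi:Miniature"),
    ("museumA:obj1", "mi:hasCreator", "Alexander Master"),
}
CARD_B = {
    ("museumB:obj7", "rdf:type", "mi:Image"),
    # A statement that repeats the ontology; it is not stored twice.
    ("mi:Miniature", "rdfs:subClassOf", "mi:Image"),
}


def build_repository(ontology, rdf_cards):
    """Merge the ontology and all harvested cards into one graph (a set of triples)."""
    graph = set(ontology)
    for card in rdf_cards:
        graph |= set(card)  # set union removes duplicate statements
    return graph


repository = build_repository(ONTOLOGY, [CARD_A, CARD_B])
print(len(repository))
```

Because a graph is modelled here as a set, a statement contributed by several sources appears only once in the repository.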
How does a user now search and navigate in this knowledge space? In the FMS system, this is implemented by server-side software called Ontogator. Based on the semantic graph, this software dynamically generates semantic linkages for the user’s Web browser.
One way of using the FMS system is view-based filtering. The user can select classes of resources from the ontology, and the system finds the instances that match the selected class restrictions. By constraining the classes (views) further, the user eventually finds the collection instance data searched for.
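View-based filtering can be approximated in a few lines of Python. The sketch below is not the Ontogator implementation: the categories (including the place categories in the ex: namespace), the sample objects and the function filter_by_views are assumptions made for illustration.

```python
# Hypothetical view hierarchies: each category maps to its direct parent.
PARENT = {
    "mi:ColumnMiniature": "mi:Miniature",
    "mi:Miniature": "mi:Image",
    "ex:Utrecht": "ex:Netherlands",
}

# Hypothetical collection objects and the categories assigned to them.
INSTANCES = {
    "obj1": {"mi:ColumnMiniature", "ex:Utrecht"},
    "obj2": {"mi:Miniature", "ex:Paris"},
    "obj3": {"mi:Image", "ex:Utrecht"},
}


def ancestors_and_self(category):
    """Return the category together with everything above it in its view."""
    result = set()
    while category is not None and category not in result:
        result.add(category)
        category = PARENT.get(category)
    return result


def filter_by_views(selected):
    """Return the objects that fall under every selected category."""
    matches = []
    for name, categories in INSTANCES.items():
        expanded = set()
        for category in categories:
            expanded |= ancestors_and_self(category)
        if set(selected) <= expanded:
            matches.append(name)
    return sorted(matches)


print(filter_by_views(["mi:Miniature"]))
print(filter_by_views(["mi:Miniature", "ex:Netherlands"]))
```

Each additional selection narrows the result set, which mirrors the way the user constrains the views step by step.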
The software also supports topic-based navigation by providing semantic links between topics of interest, the creation of which is based on the collection domain ontology and the related metadata of the collection records. This means that the links also provide the user with an impression of the wider context and pragmatics of the objects in the museums’ collections.
From human users to software agents
As described in the Finnish Museums on the Semantic Web example, the RDF repository is a large semantic graph that consists of the shared ontology and the metadata of the participating museums. Such a repository can be queried, and the results, a set of pointers to the relevant resources, can be accessed using Web browsers. The opportunities provided by a system like the one developed by the FMS initiative (e.g. topic-based navigation) are at present restricted primarily to human users.
The Semantic Web vision includes intelligent software agents which ‘understand’ semantic relationships between Web resources and seek relevant information as well as perform transactions for humans. [14]
This software would be capable of autonomous action, i.e. it could run without direct human control or constant supervision, and ideally it would be very flexible in doing this. Characterisations of this flexibility include actions that are ‘reactive’, ‘proactive’ and ‘social’ (see below).
While the basic idea of agents is very intuitive and appealing, the actual theory is complex, the tools are immature, and the solutions are small and prototype-based. In fact, as a parallel, distributed systems technology, agents belong to the most complex class of software technology.
However, this primer will conclude with a summary of what an intelligent software agent is and what such software would generally be capable of doing. This should also serve as an indication of how great the challenge for research and technological development is to make the full Semantic Web vision a reality.
Intelligent Software Agents
The following definitions are taken from Michael Wooldridge’s introduction to multiagent systems: [15]
Agent
‘An agent is a computer system capable of autonomous action in some environment.’
Intelligent agent
‘An intelligent agent is a computer system capable of flexible autonomous action in some environment.’
Flexible autonomous action
‘By flexible autonomous action, we mean reactive, proactive, social.’
| Reactivity: ‘A reactive system is one that maintains an ongoing interaction with its environment and responds to changes that occur in it (in time for the response to be useful).’
| Proactiveness: ‘An agent serves a purpose and therefore exhibits goal-directed behaviour, including the capacity to recognise opportunities for useful courses of action.’
| Social ability: ‘Social ability in agents is the ability to interact with other agents (and possibly humans) via some kind of agent communication language, and perhaps cooperate with others.’
Desirable further properties of agents are:
| Mobility: the ability to move around an electronic network.
| Rationality: an agent will act in such a way that it does not prevent itself from achieving its goals (as far as this is possible with a limited set of beliefs representing its world knowledge).
| Learning: an agent will improve its performance over time.
14 T. Berners-Lee, J. Hendler, O. Lassila, Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html
15 M. Wooldridge, An Introduction to Multiagent Systems, Chichester: Wiley, 2002, and http://www.csc.liv.ac.uk/~mjw/pubs/imas/
Resources
In the primer, no references are made to the documents of the World Wide Web Consortium (W3C). All relevant W3C recommendations can be found at http://www.w3c.org
Of the wealth of introductory materials on XML and RDF available on the Web, the following in particular are useful to consult for further details:
http://www.w3schools.com/xml/
http://www.w3schools.com/schema/
http://www.w3.org/TR/rdf-primer/
K. S. Candan, H. Liu, R. Suvarna: Resource Description Framework: Metadata and its Applications. In: SIGKDD Explorations, Vol. 3/1 (2001), 6-19
Pierre-Antoine Champin: RDF Tutorial (2001), http://www710.univ-lyon1.fr/~champin/rdf-tutorial/rdf-tutorial.html
S. Decker, M. P. Mitra, S. Melnik: Framework for the Semantic Web: An RDF Tutorial, http://www.ida.liu.se/~asmpa/courses/sweb/rdf/rdf_tutorial.pdf
[Graphic 3: Set-up of the Finnish Museums on the Semantic Web System. Title: The Finnish Museums on the Semantic Web, Overview of the System’s Set-up. Elements shown: user client (WWW browser with topic-based navigation and view-based filtering); server (Ontogator software, RDF database); semantic graph (knowledge space of shared ontology and metadata); Web crawler; and, for each museum, a collection database (1, 2 ... n), an XML repository, a metadata editor and RDF instances. Layers: relational schemas (DBMS); XML Schema (syntactic interoperability); RDF Schema (semantic interoperability).]
THE DARMSTADT FORUM PARTICIPANTS
Wernher Behrendt, Salzburg Research, Austria
Wernher Behrendt is a Senior Researcher at the Sun Technology and Research Excellence Center at Salzburg Research (Austria), working on multimedia middleware and interoperation issues. He holds an MSc in Cognitive Science from Manchester University and has more than 10 years’ experience in near-to-market IT research. From 1989 to 1995 he was a Senior Research Associate in the Informatics Department at Rutherford Appleton Laboratory (UK), working on embedded knowledge-based systems in distributed multimedia presentation systems. From 1995 to 1998 he was a Senior Research Associate at Cardiff University (UK), working on interoperation between heterogeneous information systems. Mr Behrendt has held courses in Computer Science and has worked on projects ranging from software engineering methods and quality assurance to legacy system reengineering using migration methods and distributed systems middleware.
E-mail: wernher.behrendt@salzburgresearch.at
Paolo Buonora, Archivio di Stato di Roma, Italy
Paolo Buonora holds a degree in Philosophy from the University of Rome ‘La Sapienza’ (1976). He worked in the Italian State Archive Administration from 1978, where he was first involved in editing the Guida Generale degli Archivi di Stato italiani. From 1986 he worked in the Soprintendenza archivistica per il Lazio, surveying audiovisual archives and municipal archives; from 1989 to 1991 he was at Perugia University, engaged in doctoral research in ‘Urban and rural history’; and from 1991 to 1994 he was again in the Soprintendenza archivistica per il Lazio. After 1994 he worked in the Archivio di Stato di Roma, where he was responsible for the photograph service and several working groups on informatics applications in archival documentation. From 1997 until the present time he has planned and directed the Imago II project in the Archivio di Stato di Roma.
See httpwwwasrmarchivbeniculturaliitsidimagoIMAGOIIenhtml
Samuel Cruz-Lara, University of Nancy 2 and LORIA, France
Samuel Cruz-Lara obtained a Master’s degree in Computer Science in 1984 (University Henri Poincaré, Nancy 1) and a PhD degree in Computer Science in 1988 (National Polytechnic Institute of Lorraine). The central topic of his PhD thesis was the generation of integrated development environments by using attribute grammars.
He is currently Associate Professor at the University of Nancy 2 (Institute of Technology, Computer Science Department) and permanent Researcher at LORIA (Lorraine Laboratory for Research in Computer Science and its Applications – UMR 7503 – CNRS – INRIA – Universities of Nancy). He is a member of the ‘Language and Dialogue’ team and has conducted several research activities on distributed software architectures and textual linguistic resources management. He is currently working in the context of distributed architectures and multimedia resources management. Dr Cruz-Lara has participated in several projects, in particular CNRS/SILFIDE and MLIS-ELAN, and he is at present co-leader of the ‘Digital Museum’ project. This is a joint project between LORIA and the National Chi-Nan University. The ‘Digital Museum’ project is sponsored by the National Science Council of the Republic of China (Taiwan) and supported by INRIA (France).
Costis Dallas, Critical Publics SA, Greece
Costis Dallas is Chairman and Senior Researcher of Critical Publics (http://www.criticalpublics.com), a group of companies active in the field of strategic communications, creative design and technology. He is currently a Lecturer in the Department of Communication and Mass Media of Panteion University and has over 15 years of research and professional experience in hypermedia applications, human factor issues and cultural information systems. Dr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA, co-founder and Chair of the Multimedia Working Group of the International Council of Museums (CIDOC/ICOM), Head of Documentation and Systems of the Benaki Museum, General Director of the Foundation of the Hellenic World, Special Secretary of the Greek Ministry of Education in charge of libraries, archives and instructional technologies, and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: info@criticalpublics.com
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (http://www.nl.adlibsoft.com), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that he joined Databasix, producer of the ADLIB software. In 1991 Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was later renamed ADLIB Information Systems.
ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include:
CIMI Dublin Core test bed - ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database, in addition to the ‘standard’ HTML/XML access.
SPECTRUM-XML - A project of the UK mda, together with a team of software vendors, to produce a schema for exchange of data which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software.
Internet Gelderse Musea (http://www.igem.nl) and Maritiem Digitaal (http://www.maritiemdigitaal.nl) - Two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.
E-mail: bert@nl.adlibsoft.com
Nicola Guarino, Italian National Research Council, Italy
Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991 and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS) and associate editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: Nicola.Guarino@ladseb.pd.cnr.it
Janneke van Kersen, Digital Heritage Association, The Netherlands
Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992 she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing, and more recently she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University.
Since October 1999 van Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata, and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations, such as museums, archives, archaeological organisations, monuments and special collections of libraries, in a context-rich and XML-based environment. Interoperability, from both a technical and an organisational perspective, is therefore a major issue. The Association offers access to heritage information to the general public at http://www.cultuurwijzer.nl and customised access for the educational field at http://www.cultuurwijs.nl
E-mail: jannekevankersen@den.nl
Marco Meli, EDW International, Italy
Marco Meli is CEO and co-founder of EDW International (http://www.edw-international.com), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia and VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meli@edw.it
Paul Miller, UKOLN, UK
Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, http://www.ukoln.ac.uk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK’s Further and Higher Education Funding Councils and Resource, the Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI) and the Metadata Working Group of the UK Government’s Office of the e-Envoy.
E-mail: P.Miller@ukoln.ac.uk
Frank Nack, CWI, UK
Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on ‘The Application of Video Semantics and Theme Representation for Automated Film Editing’ at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling, and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document, and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: Frank.Nack@cwi.nl
Franco Niccolucci, Florence University, Italy
Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference.
See also http://www.geog.port.ac.uk/hist-bound/people/niccolucci.htm
Seamus Ross, HATII, University of Glasgow, UK
Dr Seamus Ross is Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Network) (IST-2001-32706), a European Union funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously he was Assistant Secretary for Information Technology at the British Academy, and before that worked for a company specialising in expert systems and software development, as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group’s PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: S.Ross@hatii.arts.gla.ac.uk
Andrea Scotti, Institute & Museum for the History of Science, Italy
Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995 he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.
Since 1996 Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza and the National Central Library, and under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (010201-300703), funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies.
See http://cweb.inria.fr/Projects/Mesmuses
E-mail: scottian@galileo.imss.firenze.it
DIGICULT PROJECT INFORMATION
DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over a period of 30 months (03/2002-08/2004).
In order to encourage early take-up, DigiCULT produces seven Thematic Issues and three Technology Watch Reports, along with the newsletter DigiCULT.Info.
DigiCULT draws on the results of the strategic study ‘Technological Landscapes for Tomorrow’s Cultural Economy (DigiCULT)’, which was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications) in 2000 and completed in 2001.
Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at http://www.digicult.info
For further information on DigiCULT, please contact the team of the project co-ordinator:
Mr Guntram Geser, guntram.geser@salzburgresearch.at, Phone: +43-(0)662-2288-303
Mr John Pereira, john.pereira@salzburgresearch.at, Phone: +43-(0)662-2288-247
Salzburg Research Forschungsgesellschaft
Jakob-Haringer-Str. 5/III, A - 5020 Salzburg, Austria
Phone: +43-(0)662-2288-200
Fax: +43-(0)662-2288-222
http://www.salzburgresearch.at
Project Partner:
HATII - Humanities Advanced Technology and Information Institute, University of Glasgow
http://www.hatii.arts.gla.ac.uk
Contact: Mr Seamus Ross, s.ross@hatii.arts.gla.ac.uk
The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom
DigiCULT Thematic Issue 1 - Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum, held in Barcelona on May 6th, 2002, in the context of the DLM-Conference 2002.
DigiCULT Thematic Issue 2 – Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum, held in Essen, Germany, on September 3rd, 2002, in the context of the AIIM Conference DMS EXPO.
DigiCULT Thematic Issue 3 - Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum, held on January 21st, 2003, at Fraunhofer IPSI, Darmstadt, Germany.
DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum, on Learning Objects, that will take place at the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, on July 2nd, 2003.
IMPRINT
This Thematic Issue is a product of the DigiCULT Project (IST-2001-34898).
Authors:
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy
Images:
Images for this Thematic Issue have been provided by, and are reproduced with kind permission of, the Koninklijke Bibliotheek – National Library of the Netherlands, The Hague, Netherlands.
Graphics & Layout:
Jan Steindl, Salzburg Research
ISBN 3-902448-00-8
Printed in Austria
© 2003
IMAGES
Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I. Image on p. 5, from Fol. 264r, size 80x75; illuminator: Orosius Master a.o.
Bible Historiale, Paris, c. 1320-1340, Volume I. Image on p. 7, from Fol. 2r, size 45x50; illum.: ‘Sub-Fauvel’ Master
Historic Bible, Utrecht, c. 1430, Volume I. Images on pp. 8, 10, 14, 28, from Fol. 3r, 3v, 4v, 7v, size 55x85 to 60x85; illum.: Alexander Master o.a.
Psalter, Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468. Images on pp. 11, 13: Wednesday Matins, Invitatorium, from Fol. 219r
Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350. Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron), from Fol. 104rb1, 106rb2, 124r, 111ra, size 40x55 to 50x55
Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335. Image on p. 21, from Fol. 4va1, size 45x55
Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460. Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41
Psalter, Normandy, c. 1180. Image on p. 26, from Fol. 3v, size 160x125
Breviary, Cambray (?), c. 1275-1300. Images on pp. 27, 29, 31, 33, 34, 35, 37, from Fol. 211r, size 205x135
© Koninklijke Bibliotheek, The Hague. Used with permission.
DIGICULT’S EXPERT 13 TANGLE WITH THE SEMANTIC WEB
By Michael Steemson
Genesis – The Creation Birds and Fishes
It was really very kind of the Moderator. He asked the cultural heritage experts: ‘What is the message that we should give our scientific writer, who will do a write-up of this meeting for the Thematic Issue?’ Their inclinations had not always been clear. But the question focused minds and they were certain now.
‘I would put my money on the Semantic Web,’ said two of them, not quite in unison. ‘The Semantic Web is a direction; it is like North. You go North but you never arrive and say “here it is”. This is the Semantic Web,’ said another.
The course of the debate at the Darmstadt DigiCULT Forum had not always been so direct. It had started with the Position Paper’s dismaying thought that ‘the limited understanding of information processing in the heritage sector almost makes the Semantic Web an impossibility to apply’.
It had touched on the semantics of Simeon poetry, art works of the biblical seductress Salome, weather forecasts for the northern English city of York and the revolutionary theories of 16th-century Italian astronomer Galileo. But the experts agreed, finally, that the cultural heritage sector needed the Semantic Web and that a good deal of education and guidance would be required to make it appreciate that need.
The experts numbered 13, lucky for some, at this, the third DigiCULT Forum of the European Union’s technology watchdog for cultural and scientific heritage institutions. In the previous 12 months, other forum groups had discussed authenticity and integrity for digitisation programmes and, later, digital asset management systems. Now the Darmstadt 13 - historians, language and information technology scientists, academics and publishers - were looking even further down the information autobahn to the vision of WWW inventor, Englishman Tim Berners-Lee, who sees a new kind of automated Web that learns and understands each user’s particular requirements and delivers complete, reliable, tested information sets.
In a co-authored May 2001 Scientific American article,¹ Mr Berners-Lee imagines a family facing the horrors of re-scheduling its lives around a mother’s unexpected illness. The sons and daughters rely on Semantic Web ‘agents’, small executable Web files, to search online medical records, hospital bed lists, transport timetables, doctors’ appointment books, road condition reports and home diaries to find treatment, plan travel and re-arrange personal engagements to fit the emergency.
The vision requires huge world-wide investment in time and effort, creating countless ‘ontologies’ containing, perhaps, XML (eXtensible Mark-up Language) and RDF (Resource Description Framework) data, to which the electronic ‘agents’ could refer for understanding before applying to specially formatted Web pages for the information.
The Berners-Lee et al. dazzling forecast is: ‘The Semantic Web will enable machines to comprehend semantic documents and data, not human speech and writings. Properly designed, the Semantic Web can assist the evolution of human knowledge as a whole.’
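As a rough illustration of the kind of machine-readable statements such agents would consult, the following minimal Python sketch stores subject-predicate-object statements in the spirit of RDF and answers a simple question by matching patterns against them. It is not RDF syntax and uses no RDF library; the clinic, treatment and place identifiers and the function names are hypothetical, invented here purely for illustration.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


# Hypothetical statements an "agent" might read; not taken from any real data set.
STATEMENTS = [
    Triple("clinic:north", "offers", "treatment:physiotherapy"),
    Triple("clinic:north", "locatedIn", "place:york"),
    Triple("clinic:south", "offers", "treatment:physiotherapy"),
    Triple("clinic:south", "locatedIn", "place:leeds"),
]


def match(statements, subject=None, predicate=None, obj=None):
    """Return every triple that agrees with all fields that are not None."""
    return [
        t for t in statements
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


def providers_in(statements, treatment, place):
    """Find subjects that both offer the treatment and are located in the place."""
    offering = {t.subject for t in match(statements, predicate="offers", obj=treatment)}
    located = {t.subject for t in match(statements, predicate="locatedIn", obj=place)}
    return sorted(offering & located)
```

Calling providers_in(STATEMENTS, "treatment:physiotherapy", "place:york") combines two separate statements about the same subject, which is the sort of small inference the agents in the scenario would have to perform many times over.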
THE DAZZLING PROSPECTS
Dazzling it is, and the Darmstadt 13 were attracted. But they were not blinded. Moderator Dr Seamus Ross, the Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII), suggested that, while the current WWW content got a lot of ‘bang’ for its development dollars, the Semantic Web needed huge, expensive content before it could work well.
Application of the Berners-Lee ideas to cultural heritage use was a long way off, he thought, and wondered: ‘Is there enough benefit from the Semantic Web in the near term to make it a realisable dream 50 years down the road?’
Italian National Research Council Applied Ontologies Laboratory director Nicola Guarino has been working on the subject for 12 years and he knows the difficulties. He said: ‘This is the ideal view which Tim Berners-Lee has: machines which work for you, your proxy which works for you, performing these dynamic connections for the Web which preserve meaning. It is pretty ambitious, but this is his idea. I would be happier if, rather than using an automatic proxy, we could just let people establish these dynamic connections using their brain and the Web. This is already something that is not easily done.’
Austria’s Wernher Behrendt had encountered other snags. At Salzburg Research, the secretariat for the DigiCULT Forums, he co-ordinates another European Commission IST project, CULTOS (Cultural Units of Learning - Tools and Services). He conceded: ‘There is a 50-year research vision behind the issue of the Semantic Web,’ and went on: ‘but there are incremental steps that, with good utility, can be built in a reasonable time. One of the intellectual challenges is to break the vision into these manageable steps.’
LANGUAGE REPRESENTATION HITCHES
A CULTOS group had, Behrendt explained, taken one of these incremental steps and built an ontology² for digitised works of art. It had encountered problems with language representation, like: ‘Are there tools to support knowledge representation language? Are the users then actually able to work usefully with that? Can we incorporate the multimedia authoring component where people who have not built the ontology themselves should then use it to combine multimedia assets with each other?’
1 T. Berners-Lee, J. Hendler, O. Lassila: The Semantic Web. In: Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html
2 Ontology, n. 1. Philosophy: The branch of metaphysics that deals with the nature of being. 2. Logic: The set of entities presupposed by a theory. Collins English Dictionary, Third edition, Glasgow 1991.
Cultural Units of Learning – Tools and Services (CULTOS)
CULTOS is an RTD project co-funded by the European Commission under the Information Society Technologies (IST) Programme, which will run until October 2003. The application domain of CULTOS is intertextual studies in literature and arts. The project is developing a multimedia authoring and presentation environment that allows scholars to make the different relationships between cultural works explicit in a way that approximates contextualisation in interpretative processes. The result of the authoring processes is multimedia objects called ‘intertextual cultural threads’. These are based on ‘EMMOs’, a novel type of structured multimedia object containing expert knowledge, conforming to current and emerging standards such as XML, SMIL (with interactive extensions), MPEG-7 and RDF. http://www.cultos.org
The Dutch had doubts, too. Dr Janneke van Kersen is an art historian with Digital Heritage Netherlands (Digitaal Erfgoed Nederland, http://www.den.nl), where an XML-based content management system is being combined with a Resource Description Framework (RDF) to join databases from several cultural heritage institutions. She told the experts: ‘I need to be assured that we will be able to build a layered structure that is equally applicable to each knowledge domain. Furthermore, I think that the cultural heritage sector is too much of a niche market to develop this.’
Her countryman, Dr Frank Nack, from the national research institute for mathematics and computer science, CWI (Centrum voor Wiskunde en Informatica, http://www.cwi.nl) in Amsterdam, works with a multimedia and human computer interaction group. His concern: ‘Our group believes in the Semantic Web, but we needed some mechanisms to structure the information so that various groups can work with it. What we came up with was the belief that you can classify the user at a particular time. But that was simply not good enough.’
The group had found that users change their requirements widely and these shifts were invisible to a system. ‘Humans can look at material one day and the next day they look at the same stuff differently and describe it differently because they are in a different mood,’ he said. He characterised the problem as: ‘Now I would like to see something for my work and now I want to be entertained. Which means I would like to access the information differently.’
He had one other worry: Webised mixed media. He said: ‘This discussion has been heavily linguistic based, which I can understand because most people do still think of the Web as text driven. The issue of describing various media items that are not text will, I think, very soon become important for the Semantic Web. We had better start thinking about that, too.’
THE GALILEO CONUNDRUM
The Institute and Museum of Science History in Florence, Italy, has tried to create an ontology around the works and sciences of its city’s famous son, the revolutionary astronomer, mathematician and physicist Galileo Galilei (1564-1642).³ But it ran into difficulties when it came to the radical changes in theory that he created.
Institute relational database expert Andrea Scotti told the Forum: ‘Galileo’s scientific theory negates another scientific theory. This negating or development of theories was very difficult to represent in the ontology when dealing with the time factor. That is central to historical documentation, but representational time was not part of the process available to us.’
Dr Costis Dallas, the Athens chairman of the European communication and technology group Critical Publics, thought the Florence museum’s project was very ambitious. The time argument was difficult because ‘of course, it isn’t possible to represent time properly within a relational database’. But there were mechanisms - he mentioned software by the Virginia, US, IT group Telos⁴ - that better represented issues of time.
Italian National Research Council’s Nicola Guarino chipped in: ‘The CIDOC⁵ reference models have partial answers to these questions.’
Dr Dallas went on: ‘I do not believe that the whole exercise is futile, but we found that, in practice, you cannot make a subject language for everybody. It has to be for a community of users. If you provide them with a richer representation, for instance if they can know that this is a person and this person lived in a place, then users will have a much richer experience.’
3 The museum and Web site are rich resources for the life and work of Galileo: http://galileo.imss.firenze.it
4 Telos Corporation, http://www.telos.com, Ashburn, VA, US
5 CIDOC, International Committee for Documentation of the International Council of Museums, httpwwwwillpowerinfomybycoukcidocCIDOCe (ICOM-CIDOC): Forum for documentation interests of museums and related organisations, one of 25 international committees of the International Council of Museums (ICOM).
CAPITALS AND ACRONYMS
The Forum moved on to a discussion of the science and software behind the Semantic Web. Moderator Seamus Ross started by questioning whether ‘key thinkers’ were ‘missing a fundamental point: that Web pages are dead, that database-driven Web pages are the future, that people are going to stop making Web pages and make databases’.
He recalled that James A. Hendler, co-author with Tim Berners-Lee in the Semantic Web paper and a professor in the Department of Computer Science at the University of Maryland, had recommended semantic representation as part of any Web pages. Dr Ross commented: ‘This notion of the Semantic Web is not going to work with these databases. Is this true or not?’
Bert Degenhart-Drenth, the managing director of Netherlands ADLIB Information Systems (http://www.nl.adlibsoft.com), thought it was true. But it was a practical problem that would be solved by projects like the Web Services of the Open Archives Initiative (OAI).
Open Archives Initiative (OAI) Protocol for Metadata Harvesting
The Open Archives Initiative (OAI) is a mainly US-based group of people and organisations that evolved out of a need to increase access to scholarly publications through interoperable digital repositories. Support for the OAI’s goals comes from the Digital Library Federation, the Coalition for Networked Information and from a NSF Grant. One of its major achievements is an application-independent interoperability framework based on metadata harvesting, the OAI Protocol for Metadata Harvesting.
The OAI Protocol is based on the standard Web protocols HTTP and XML and employs Dublin Core (unqualified) as metadata standard. Heritage organisations who have systems that support the OAI Protocol can expose metadata about the content in their repository, i.e. allow service providers to harvest the data for services such as search engines. In the OAI Protocol, the XML schema is used at two levels: to define the format of responses to all OAI Protocol requests, and to define the format of metadata streams embedded in the GetRecord and ListRecords responses. In both cases, the goal is to provide a mechanism for data validation. http://www.openarchives.org
The OAI-PMH version 2.0, released in June 2002, can be found at http://www.openarchives.org/OAI/2.0/openarchivesprotocol.htm
See also: John Perkins, A New Way of Making Cultural Information Resources Visible on the Web: Museums and the Open Archives Initiative. Museums and the Web Conference 2001, http://www.archimuse.com/mw2001/papers/perkins
Open Archives Forum Project, http://www.oaforum.org
The Open Archives Forum Project is a two-year Fifth Framework Programme IST accompanying measure that will run until September 2003. The Forum is building a Web-based database on OAI-related projects, software implementations and services, and supports the information exchange between OAI user communities.
Their surveys provide good insight into the status of uptake of the OAI in Europe. For an overview of the results, see S. Dobratz, B. Matthaei: Open Archives Activities and Experiences in Europe. An Overview by the Open Archives Forum. In: D-Lib Magazine, Vol. 9, No. 1, January 2003, http://www.dlib.org/dlib/january03/dobratz/01dobratz.html
Recently, one of their workshops, ‘Providing Access to Hidden Resources’ (Lisbon, December 2002), targeted the libraries and archives communities. Requirements, standards, best practice and solutions to interoperability problems of these communities were analysed and compared with the features provided by the OAI Protocol for Metadata Harvesting. The Tutorial ‘OAI and OAI-PMH for Beginners’ and other presentations can be found at http://www.oaforum.org/workshops/lisb_programme.php
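The harvesting mechanism of the OAI Protocol for Metadata Harvesting can be sketched in a few lines of Python. The verb ListRecords, the metadataPrefix value oai_dc and the XML namespaces below are those of OAI-PMH 2.0 and unqualified Dublin Core; the repository address, the record identifier and the sample response are hypothetical, and the sketch makes no network request.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = "http://www.openarchives.org/OAI/2.0/"   # OAI-PMH 2.0 response namespace
DC_NS = "http://purl.org/dc/elements/1.1/"        # unqualified Dublin Core elements


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build the request a service provider sends to harvest all records."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


# Hypothetical, heavily shortened response from an imaginary repository.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:ms-1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Psalter</dc:title>
          <dc:date>1180</dc:date>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def harvest_titles(xml_text):
    """Return (identifier, title) pairs found in a ListRecords response."""
    root = ET.fromstring(xml_text)
    pairs = []
    for record in root.iter(f"{{{OAI_NS}}}record"):
        identifier = record.findtext(f"{{{OAI_NS}}}header/{{{OAI_NS}}}identifier")
        title = record.findtext(f".//{{{DC_NS}}}title")
        pairs.append((identifier, title))
    return pairs
```

A real harvester would also have to handle resumption tokens, error responses and schema validation, all of which are left out here.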
Several members spoke of difficulties created by dynamic Web pages generated from ASP databases. Paul Miller, the UK Interoperability Focus for the University of Bath’s UKOLN (formerly the UK Office for Library Networking) project, said the UK’s Office of the e-Envoy,⁶ the agency in charge of Britain’s e-Government programme, drove it with a Web site called Govtalk.⁷ ‘One big ASP database,’ he called it. It carried the Government’s interoperability standard framework with a URL of ‘a great big long string, something, something, dot ASP’. ‘That is how this key piece of legislation is referred to, and next week it will be something else.’
Dr van Kersen thought the new Web Ontology Language, with the transposed acronym OWL, would help. Italian ontology researcher Nicola Guarino discussed the European Commission’s On-To-Knowledge project, its RDF tool Sesame⁸ and the Ontology Inference Layer (OIL) language.
Dr Ross described the DARPA Agent Markup Language (DAML) programme of the US Defense Department’s Defense Advanced Research Projects Agency (DARPA) and its DAML+OIL variant. Mr Degenhart-Drenth highlighted the importance of protocols used in Web Services, such as SOAP (Simple Object Access Protocol) and UDDI (Universal Description, Discovery and Integration), and also pointed to a SPECTRUM XML standard for museums that he helped write. Then there was the moviemakers’ audio-visual search standard MPEG-7 and the SMIL (Synchronized Multimedia Integration Language). No one mentioned SHOE (Simple HTML Ontology Extensions), which was surprising as the discussion became more and more alphabetic and upper case.
ONTOLOGY TUTORIAL IN 800 WORDS
Nicola Guarino brought the discussion back on track. The ontology expert delivered a fascinating, impromptu 800-word dissertation on ontology genetics.
Ontologies, he said, started because it was realised that controlled vocabularies, which worked well enough for limited periods, needed something extra to make them really useful. They needed clarification of intended meaning. This could be achieved in much the same way as dictionaries did it, by reference to other, more basic terms. This was, he said, the key point.
‘Ontologies can work if the basic terms are really used in a principled way. There is a hidden assumption here, that it is indeed possible to express the meaning in terms of a relatively small set of primitive terms.’
He explained further: ‘There are general terms that have a universal meaning. Take the term “part”, for instance, or “set”. Or take temporal relations: “before”. Suppose I have two different periods, the Renaissance and another period, and suppose I say that this period comes “before” the other one: do you exclude the case whereby the two periods overlap or not? This is just a matter of stipulation; this is a general term that is not domain specific. I can simply stipulate exactly whether the “before” relationship between the two intervals includes the case of overlapping or not. And I can do that by means of axioms. Once you clarify the meaning of the basic terms, topological relations, mereological⁹ relations, dependence relations, these kinds of things, then you have the basic vocabulary that helps you to introduce more domain-related things. And this is what people are doing in the area of what are called “Foundational Ontologies”, and this is what I am doing. I believe this is the only way to solve the problem of semantic interoperability. So, not just controlled vocabularies, but vocabularies that are formally defined in minimal terms.’
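Mr Guarino’s point about stipulating the meaning of ‘before’ can be made concrete with a small Python sketch. It gives two alternative definitions of the relation between intervals, one that excludes overlap and one that permits it. The class, the function names and the period dates are assumptions chosen only to show the difference; the dates are round illustrative numbers, not scholarly claims.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Period:
    """A named interval of years; start is assumed to be no later than end."""
    name: str
    start: int
    end: int


def before_strict(a, b):
    """Stipulation 1: a is before b only if a has ended by the time b starts."""
    return a.end <= b.start


def before_loose(a, b):
    """Stipulation 2: a is before b if it starts earlier and ends earlier,
    so the two periods may overlap in between."""
    return a.start < b.start and a.end < b.end


# Illustrative periods with deliberately rough, hypothetical boundaries.
ROMANESQUE = Period("Romanesque", 1000, 1200)
LATE_MIDDLE_AGES = Period("Late Middle Ages", 1300, 1500)
RENAISSANCE = Period("Renaissance", 1400, 1600)
```

Under the first stipulation the overlapping Late Middle Ages do not come ‘before’ the Renaissance; under the second they do. Two systems that exchange data using the same word but different stipulations would disagree without either being wrong, which is why the choice has to be fixed by an explicit axiom.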
Now it was clear, but would it be available to heritage institutions? Dr Ross wanted to know if fundamental ontologies of use to the heritage sector already existed. What would they be? Before any development could begin, basic classifications and methodologies would be required to form a foundation for the work.
6 Office of the e-Envoy, http://www.e-envoy.gov.uk
7 Govtalk, http://www.govtalk.gov.uk
8 Sesame environment, httpwwwontoknowledgeorgtoolsfactsheetSesamehtml
9 Mereology, n. The formal study of the logical properties of the relation of part and whole. Collins English Dictionary, Third edition, Glasgow 1991.
On-To-Knowledge
On-To-Knowledge is an IST RTD project that was completed in June 2002. The project developed tools and methods for supporting knowledge management in large and distributed organisations. The technical backbone of On-To-Knowledge was the use of ontologies for the various tasks of information integration and mediation. For the project’s many results, see their tools repository, project deliverables and publications at http://www.ontoknowledge.org/pubs.html
See also the On-To-Knowledge book: ‘Towards the Semantic Web: Ontology-driven Knowledge Management’, J. Davies, D. Fensel, F. van Harmelen (eds.), John Wiley, December 2002.
Italian online publisher Marco Meli, a manager of the EU’s MESMUSES (Metaphors for Science Museums) project, insisted: ‘You need a clear definition in this particular domain. What are the key words, the terms?’
Mr Guarino answered: ‘The key concepts and the key relations.’
THE ELUSIVE SEMANTIC GRAIL
By now, the Darmstadt 13 were beginning to realise they were not having much luck with their search for the Semantic Grail. Their discussion became a little tetchy.
Someone talked about ‘meta-ontology’ and another growled: ‘The term “meta” is abused.’
‘Outdated technical optimism’ was mentioned.
‘What is your alternative?’ someone else wanted to know.
‘I don’t have one.’
‘Is this the way forward?’
‘What part of the problem would that solve?’
‘We will know when we have tried that out.’
‘Here we are approaching a scary field.’
MESMUSES - Metaphors for Science Museums
MESMUSES is an IST RTD project that will run until July 2003. It aims at designing a general method and supporting tools to produce knowledge maps for use in self-learning environments of science museums. In the project, a knowledge map is defined as a set of related concepts and facts that is offered to learners with some guidance or suggestions on possible itineraries that they may follow to explore the knowledge space.
The method and tools developed in MESMUSES are being tested and validated by two large science museums, the Cité des Sciences et de l’Industrie in Paris and the Istituto e Museo di Storia della Scienza in Florence, which provide access to their digital catalogues. Both museums are developing knowledge maps and itineraries on different themes in Biology (Genome) and Physics (Galileo and the laws of motion).
Project Web site: http://cweb.inria.fr/Projects/Mesmuses
See also: M. Meli, Knowledge Management: a new challenge for science museums. In: Cultivate Interactive, Issue 9, 7 February 2003, http://www.cultivate-int.org/issue9/mesmuses
Civility and peace were restored as Frank Nack, the CWI Netherlands scientist, introduced the thought: ‘There are ontologies for art and they are very old and well crafted. There are very clear rules about why they did what they did, because they have worked on them for a thousand years. What you could suggest is that we strip down to the basics for one field, say art, and apply it to all the other fields we have in cultural heritage, architecture, film, whatever, all working with very different substances.’
Seamus Ross added: ‘So we need one fundamental ontology on which we can build all the others.’
THE LUCK CHANGES
The luck of the 13 was beginning to change. The Athenian heritage informatics expert, Dr Dallas, described work among his company clients on developing ‘an upper ontology’. All the issues the Forum had been discussing, what to do about time, basic concepts, process, agents and so on, were being examined. They were beginning to develop ‘something very much like a thesaurus’, with term expansion that created sub-categories of relationships. He called it ‘generic layering’, a process that could identify the ‘generic grammar’ of relationships within a specific domain - art history, for example.
‘This is useful,’ he said. ‘This way we can create Web systems that present an association of content for users that is meaningful to them. Let’s say, a “cultural” meaning.’
Nicola Guarino went further. He believed that the International Council of Museums’ Committee for Documentation (CIDOC) Conceptual Reference Model (CRM) was the ‘best starting point’ for the heritage community. The CIDOC CRM is the result of 10 years’ work by a standards working group. The model is under review for adoption as an International Organization for Standardization (ISO) publication.
Mr Guarino was enthusiastic: ‘I am not biased on this. I am a reviewer on the CHIOS project that supports this proposal for an ISO standard, and I am amazed by the fact that this standard is more or less principled. On the one hand, the authors really strive to get these principled things, and on the other hand, they have extensively accounted for existing practice. It is the result of a large community of work. It is not perfect, but it really is a starting point for this community.’
He affirmed Dr Ross’s delighted question: ‘So this is an ontology we can borrow?’
And the Italian expert had more to add: ‘The question before was “how can we be sure that the principled things can really solve our problems?” I do not have a crisp answer, but I do have some evidence that even a tiny result on the foundational side has a high pay-off. So you do not need to solve all the foundational issues.’
The distinctions between an object and its role, or individual and classes of items, were delicate but, once understood, could lead to significant data improvement, he said, adding: ‘Take, for instance, the distinction between object and event. This is only one tiny distinction, but it is so fundamental that, once you understand it, you can save time in developing your own application ontology. Tiny conceptual progress does have a high pay-off. This is why I believe it is useful.’
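The object/event distinction can be shown in a short Python sketch. Instead of storing a date and a place as fields of an object record, the object stays a plain thing, while dates and places belong to the events in which it took part. This is only one possible reading of the distinction, not the CIDOC CRM itself; the class names, the identifier and the acquisition event are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class HeritageObject:
    """A persistent thing; it carries no dates or places of its own."""
    identifier: str
    title: str


@dataclass
class Event:
    """Something that happened at a time and place and involved objects."""
    kind: str
    year: int
    place: str
    objects: list = field(default_factory=list)


def history_of(obj, events):
    """Return the events that involved the object, in chronological order."""
    return sorted((e for e in events if obj in e.objects), key=lambda e: e.year)


# Hypothetical example data, listed out of order on purpose.
PSALTER = HeritageObject("ms-1", "Psalter")
EVENTS = [
    Event("acquisition", 1900, "The Hague", [PSALTER]),
    Event("production", 1180, "Normandy", [PSALTER]),
]
```

Because production and acquisition are separate events, a question such as ‘where was this made?’ no longer competes with ‘where is it kept?’ for a single place field, which is the kind of saving in application design that Mr Guarino describes.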
There were still one or two doubts, but CULTOS project co-ordinator Wernher Behrendt tidied up with a daring stance: ‘Let me be a heretic for a second. How many of us have an operating system other than Windows? What I am saying is that standardisation often helps. Even if it is not the best standard, it does help get people working together. It is perfectly fair to define a standard for the world now. There will be a lot of discussion, but it will wind down to a few constructs. It is a better method of getting an ontology accepted than having ontologies mushrooming all over the place that must then be integrated.’
The Darmstadt 13 were pleased. They had a model. They had ‘Web Services’ stepping stones to work across. They weren’t going to fall into the trap of insisting just yet that the Semantic Web was important for the heritage sector, but they wanted an education process for unconvinced curators.
They needed someone to make the first ontology move. Seamus Ross suggested asking the J. Paul Getty Trust (http://www.getty.edu). They needed automated tools for testing ontologies. They did not want to be delivered into the entertainment industry, but saw benefit in what University of Florence Associate Professor Franco Niccolucci called ‘cultural entertainment’.
So where would the experts put their money, asked Mr Behrendt: ‘On there not being any benefits in the Semantic Web for the cultural heritage sector, or there being some benefits in building such things, whatever they may be?’
Amsterdamer Frank Nack was in no doubt: ‘It is going to happen. It will probably look very different from how we imagine it right now, but it is going to happen.’ His countryman, ADLIB chief Bert Degenhart-Drenth, thought so, too: ‘We have put our money there. All our applications work with XML.’ And Dr van Kersen agreed: ‘I would put my money on the Semantic Web.’
That seemed to make it game, set and match.
CIDOC CRM: ‘The Semantic Glue’
The ‘CIDOC object-oriented Conceptual Reference Model’ (CIDOC CRM) was developed by the ICOM/CIDOC Documentation Standards Group. Since September 2000, the CIDOC CRM is being developed into an ISO standard.
‘The CIDOC CRM is intended to promote a shared understanding of cultural heritage information by providing a common and extensible semantic framework to which any cultural heritage information can be mapped. It is intended to be a common language for domain experts and implementers to formulate requirements for information systems and to serve as a guide for good practice in conceptual modelling. In this way, it can provide the semantic glue needed to mediate between different sources of cultural heritage information, such as that published by museums, libraries and archives.’
http://cidoc.ics.forth.gr/what_is_crm.html
CHIOS - Cultural Heritage Interchange Ontology Standardization project
Since June 2001, the work of the CIDOC CRM Special Interest Group has been supported by CHIOS, a two-year project which receives funding from the Fifth Framework IST Programme. The CHIOS consortium forms an integral part of the CIDOC CRM Special Interest Group which, by organising shared meetings, represents the interests and requirements of the cultural heritage community to the ISO Working Group (ISO/TC46/SC4/WG9).
http://cidoc.ics.forth.gr/chios_iso.html
SEMANTIC WEB TERMS AND READING LIST A-X
Compiled by Guntram Geser and Michael Steemson
Genesis – The Creation: Stars and Fishes
In the Summary, we have provided information boxes on projects, but not on the many Semantic Web standards, technologies, etc. mentioned in the Forum discussion. The projects included are only a small fraction of many ongoing activities and are related mainly to the cultural and scientific heritage community. Links to many important Semantic Web development projects can be found at their community portal, http://www.semanticweb.org
The following guide points to resources and readings on terms mentioned in the Forum Summary. It is not intended to provide a comprehensive list of Semantic Web materials. Rather, it represents different entry points and levels to this topic.
Annotation & Authoring
‘How to annotate and have fun’, http://annotation.semanticweb.org/faq/faq2, and other papers. See also http://annotation.semanticweb.org/tools, Institute for Applied Informatics and Formal Description Methods, University of Karlsruhe, Germany.
The Annotea Project, a ‘Live Early Adoption and Demonstration (LEAD)’ project of the World Wide Web Consortium (W3C): collaboration environment with shared annotations, www.w3.org/2001/Annotea
CIDOC CRM
The CIDOC Conceptual Reference Model was an important reference point in the Forum discussion (see also the information box on the CIDOC CRM in the Summary). CIDOC, the Comité International pour la Documentation, is part of the International Council of Museums (ICOM). Its Web site can be found at http://cidoc.ics.forth.gr
A report on CIDOC’s work on the CRM is provided in ‘The CIDOC Conceptual Reference Model: A Standard for Communicating Cultural Contents’ by Nick Crofts, Martin Doerr and Tony Gill, in Cultivate Interactive, Issue 9, February 2003, http://www.cultivate-int.org/issue9/chios
See also their tutorials at http://cidoc.ics.forth.gr/tutorials.html
DAML - DARPA Agent Markup Language
The Defense Advanced Research Projects Agency (DARPA) is the central research and development organisation for the US Department of Defense. Its DAML programme is developing a language and tools to facilitate Semantic Web concepts. http://www.daml.org
‘Why Use DAML?’, Adam Pease, white paper, Teknowledge, 10 April 2002, http://www.daml.org/2002/04/why.html
DAML+OIL
DAML+OIL is a product of the DARPA Joint United States/European Union ad hoc Agent Markup Language Committee. The committee created a language with the best features of SHOE, DAML, OIL and several other markup approaches. It is a Web ontology language (latest release March 2001) expected to provide a basis for future Web standards for ontologies. See http://www.daml.org/2001/03/daml+oil-index.html
OIL - Ontology Inference Layer
OIL is a proposal for a Web-based representation and inference layer for ontologies. It uses a layered approach to defining a standard ontology language. Each layer adds functionality and complexity to the previous layer. This is done such that machines that can only process a lower layer can still partially understand high-level ontologies. See http://www.ontoknowledge.org/oil
A white paper on OIL functions: ‘An informal description of Standard OIL and Instance OIL’ by the On-To-Knowledge group, led by Department of Computer Science, University of Manchester, UK, 28 November 2000, www.ontoknowledge.org/oil/downl/oil-whitepaper.pdf
Ontologies
Ontology research: Laboratory for Applied Ontology (LOA), Institute of Cognitive Sciences and Technology (ISTC), http://www.ladseb.pd.cnr.it/infor/ontology/ontology.html
Dieter Fensel, Ontologies: A Silver Bullet for Knowledge Management and Electronic Commerce. New York: Springer, 2001.
‘Ontology Infrastructure for the Semantic Web’, includes detailed subject bibliography. WonderWeb Project, Department of Computer Science, Victoria University of Manchester, UK, http://wonderweb.semanticweb.org/deliverables/documents/D15.pdf
Standard Upper Ontology (SUO): An upper ontology for data interoperability, information search and retrieval, automated inferencing and natural language processing. IEEE Standard Upper Ontology (SUO) Working Group, http://suo.ieee.org
OWL - Web Ontology Language
The Web Ontology Language is a semantic markup language for publishing and sharing ontologies on the World Wide Web. OWL is developed as a vocabulary extension of the Resource Description Framework (RDF) and is derived from the DAML+OIL Web Ontology Language. For the development of this language, see the documents of the Web Ontology (WebOnt) Working Group, http://www.w3.org/2001/sw/WebOnt
See also OWL Web Ontology Language Reference (W3C Working Draft, 31 March 2003) at http://www.w3.org/TR/owl-ref and OWL Web Ontology Language Guide (W3C Working Draft, 31 March 2003), http://www.w3.org/TR/owl-guide
For an understanding of the goals, requirements and usage scenarios for a Web ontology language, see ‘Web Ontology Language (OWL) Use Cases and Requirements’ (W3C Working Draft, 31 March 2003), http://www.w3.org/TR/webont-req
RDF - Resource Description Framework
See Cultural Heritage Semantic Web Example & Primer, pp. 32-34.
Semantic Web
‘The Semantic Web’, Tim Berners-Lee with James Hendler and Ora Lassila, Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html
‘Enhanced Science and the Semantic Web’, J. A. Hendler, in Science magazine, Volume 299, Number 5606, 24 January 2003, pp. 520-521.
‘Peer-to-Peer: The Infrastructure for the Semantic Web’, Stanford University. The Semantic Web as the next evolutionary step of the Internet. http://p2p.semanticweb.org
The Semantic Web Community Portal, SemanticWeb.org, currently operated by three research groups: the Onto-Agents and Scalable Knowledge Composition (SKC) Research Group at Stanford University, the Ontobroker-Group at the University of Karlsruhe, Germany, and the Protégé Research Group at Stanford University. http://www.semanticweb.org
The W3C Semantic Web Activity Statement explains the consortium’s plans in the areas of enabling standards (driven by the RDF Core and Web Ontology Working Groups), education and outreach (RDF Interest Group), as well as co-ordination and advanced development. http://www.w3.org/2001/sw/Activity
See also Kim Veltman’s warning of what he sees to be too narrow a definition of the Semantic Web, one that will not allow the historical dimension, the richness of cultural expression, the unique and the diversity of interpretations to be adequately dealt with. Cf. K. Veltman, Challenges for a Semantic Web (July 2002), http://www.cultivate-int.org/issue7/semanticweb
SHOE - Simple HTML Ontology Extensions
SHOE was one of the first ontology-based markup languages developed for use on the World Wide Web. It is a small extension to HTML that allows Web page authors to annotate their Web documents with machine-readable knowledge. See http://www.cs.umd.edu/projects/plus/SHOE/
SMIL - Multimedia on the Web
The Synchronized Multimedia Integration Language (SMIL, pronounced ‘smile’) enables authoring of interactive audiovisual presentations. SMIL is typically used for ‘rich media’/multimedia presentations which integrate streaming audio and video with images, text or any other media type. SMIL is an HTML-like language and may be written using a simple text editor. W3C Synchronized Multimedia: http://www.w3.org/AudioVideo/
SPECTRUM-XML DTD
SPECTRUM (Standard Procedures for Collections Recording Used in Museums) was created by the mda (http://www.mda.org.uk). It is a guide to good practice for museum documentation that describes procedures for documenting objects and the processes they undergo, as well as the information that needs to be recorded to support the procedures. For SPECTRUM, an XML Document Type Definition has been produced which serves as a system-neutral interchange format for museum data.
‘SPECTRUM: The UK Museum Documentation Standard’ is available in its second edition; see http://www.mda.org.uk/spectrum.htm
For a description of the creation, structure and deployment of the SPECTRUM-XML DTD, see Bert Degenhart-Drenth, Building on the mda SPECTRUM-XML DTD for Collections Management Data Interchange, Museums and the Web Conference 2001: http://www.archimuse.com/mw2001/papers/degenhart/degenhart.html
Web Services
Although the concept of Web Services is not a new one, the definition of Web Services standards allows the wider use of these services. Currently, three main protocols are used in the context of Web Services: UDDI (Universal Description, Discovery and Integration), a registry system to find resources and Web Services; WSDL (Web Services Description Language), an interface description language; and SOAP (Simple Object Access Protocol), the communication protocol for Web Services. All three protocols are based on XML.
For a description of the many ways in which XML can enhance Web Services, see http://www.w3.org/2002/ws/Activity
World Wide Web
Looking back as well as into the future: ‘Weaving the Web: The Original Design and Ultimate Destiny of the World Wide Web by Its Inventor’, Tim Berners-Lee with Mark Fischetti, Harper San Francisco, 1999.
XML - eXtensible Markup Language
See Cultural Heritage Semantic Web Example & Primer, pp. 27-30.
For the work done at W3C within the XML activity, see XML Working Groups: http://www.w3.org/XML/. The XML specifications can be found at http://www.w3.org/XML/Core/
‘The bane of my existence is doing things that I know the computer could do for me.’ - Dan Connolly, The XML Revolution, October 1998: http://www.nature.com/nature/webmatters/xml
SEMANTIC WEB SHOULD BE BASED ON WELL-FOUNDED ONTOLOGIES
AN INTERVIEW WITH NICOLA GUARINO, LABORATORY FOR APPLIED ONTOLOGY, ITALY
By Joost van Kasteren
‘The Semantic Web, as it is advocated by people like Tim Berners-Lee and James Hendler, does not take enough advantage of the experience built up in knowledge engineering and conceptual modelling. There is this anarchistic idea of the Web as a place where everyone can do his or her own thing. I have no problem with that; a lot of people are able to find what they want on the Web. But if you want real interoperability, with search engines that can grasp the intended meaning of information, that approach falls short. To create a real Semantic Web, we have to develop and use well-founded generic ontologies based on linguistics and logics.’
Nicola Guarino has clear views on the Semantic Web and its development. He is a senior researcher at the Institute for Cognitive Sciences and Technologies in Italy, where he leads the Laboratory for Applied Ontology. Since 1991 he has played an active role in the Artificial Intelligence community in promoting the interdisciplinary study of ontological foundations of knowledge engineering and conceptual modelling. Guarino: ‘In our Laboratory the focus is on content and not so much on representation. The use of ontologies is unavoidable when referring to content. People do it implicitly all the time when they are communicating and trying to understand each other. If we want machines to understand each other, in other words real interoperability, we need to make these ontologies explicit in an unambiguous way.’
An ontology is a hierarchical description of the relations between concepts in a certain domain, plus an unambiguous description of the concepts themselves. As they are created for a certain domain, ontologies often fail to be interoperable because of the ambiguity that results from the use of the same terms for different concepts (and vice versa) between different domains. The term ‘net’, for instance, has quite a different meaning for Web designers and fishermen. That is why there is a need for well-founded generic ontologies. An example of a concept that a generic ontology has to pin down is the term ‘part’, which can have different meanings both within a domain and between domains. For instance, the violist plays a part in the orchestra. His finger is part of him. Can his finger be part of the orchestra? According to Guarino, this is a genuine ontological problem that can only be solved by giving an unambiguous meaning to the term ‘part’.
Another example cited by Guarino is the term ‘in’. What exactly are you describing when you say the spoon is in the cup? Does it mean that the spoon is totally embedded in the cup, or is it only partly in the cup? Guarino: ‘These examples seem trivial, but if you want real interoperability between different knowledge domains, you will have to prevent the problems that come with the ambiguity of day-to-day language.’
In this respect, Guarino thinks it is a drawback that computer science curricula scarcely ever contain an introduction to ontological foundations of conceptual modelling. ‘Students learn all about Java, HTML and C++, and name all the other languages, and they also learn how to use these. But when they graduate they hardly know a thing about formal ontology. I really think people should know more about the work on ontology that has been done in philosophy. It is certainly not much harder to acquire than, say, studying differential equations or learning how to use Java.’
Developing well-founded generic ontologies seems an enormous job, but it is not as enormous a task as it appears. Guarino: ‘I would say that a few dozen would get you on the way nicely. But you have to take the fundamental route. At the moment, development of the Semantic Web is driven by the need for short-term results. Hence, interoperability is realised by putting the right tags on the information. That is not what I call semantics; that is syntax. XML and RDF are very useful for this, but they fall short when you want to create a real Semantic Web.’
Laboratory for Applied Ontology: http://ontology.ip.rm.cnr.it
A CULTURAL HERITAGE SEMANTIC WEB EXAMPLE & PRIMER
By Guntram Geser
The Semantic Web, according to a statement of its well-known advocate Tim Berners-Lee, is a vision of ‘a distributed machine which should function so as to perform socially useful tasks’. [1] This machine should allow intelligent software agents to understand semantic relationships between Web resources in order to seek relevant information and perform transactions for humans.
Contrasted with the existing human-readable Web, the Semantic Web is envisaged as a Web of machine-readable data that will be based on ‘languages for expressing information in a machine processable form’. [2] Key to an understanding of the Semantic Web, therefore, is how these languages function: how information is expressed in order that computers can automatically process Web sources and assist in making the Web more useful for humans. The aim of this chapter is to provide an overview of the Semantic Web concept by describing its general architecture, i.e. the interplay of its languages.
The chapter has two interrelated parts. Part 1 describes a Finnish project that strives to build the foundations for the “Finnish Museums on the Semantic Web” (FMS), a future semantic museum portal. This part consists of the information boxes on the following pages, which briefly describe the necessary elements and steps in the set-up of the FMS system. It is recommended to start by reading this description (see also graphic 3 on page 36, which provides an overview of the set-up of the FMS system). It should be helpful in gaining a general understanding of how semantic interoperability of, and new ways of interacting with, semantically marked-up cultural heritage information can be realised.
Part 2, the texts below the information boxes, is a primer that explains terms used in part 1, which represent core elements of the Semantic Web architecture, as well as providing illustrative examples. The explanations are not intended to give in-depth definitions of these elements; such definitions are provided in the relevant W3C specifications. The examples have been kept as simple as possible, but build on each other. In this way we will develop a (fictitious) Website, http://www.m-i.org, that provides semantically enhanced access to such marvellous medieval images as the ones we have used to illustrate this Thematic Issue.
How to Make Collection Metadata of Museums Semantically Interoperable on the Web - The “Finnish Museums on the Semantic Web” (FMS)
The Semantic Web concept is visionary, and there are dedicated people, also in the heritage sector, who are trying to make it a reality. In our example, a group of researchers and technology developers who work at the University of Helsinki and the Helsinki Institute for Information Technology are translating the Semantic Web vision for a future semantic museum portal.
The group’s two-year project will run until spring 2004 and is being carried out in co-operation with, and with funding from, major organisations including Espoo City Museum, Helsinki University Museum, National Board of Antiquities, Nokia, TietoEnator and the National Technology Agency (TEKES).
[1] Tim Berners-Lee, Interpretation and Semantics on the Semantic Web (1998): http://www.w3.org/DesignIssues/Interpretation.html
[2] Tim Berners-Lee, Semantic Web Road Map (1998): http://www.w3.org/DesignIssues/Semantic.html
[3] Robert DuCharme: http://lists.xml.org/archives/xml-dev/200211/msg00190.html
The major goals of the project are to make collection metadata, which stem from heterogeneous databases, semantically interoperable on the Web, and to provide facilities for semantic browsing and searching in the combined knowledge base of the participating museums.
The project’s vision is called “Finnish Museums on the Semantic Web” (FMS), and its architecture allows for all Finnish museums to join in. However, in an approach of starting small but ambitious, the project is at present using the collection databases of two museums, the Espoo City Museum and the National Museum of Finland. Furthermore, the implementation is currently restricted to only one part of the collections: textiles.
In order to reach the FMS system’s goal of making the museums’ metadata semantically interoperable on the Web, the data must be harmonised on the syntactic and semantic level. For this harmonisation, the eXtensible Markup Language (XML) and the Resource Description Framework (RDF) are being used, of which RDF is the key language for achieving semantic interoperability of the heterogeneous sets of metadata.
RDF and Metadata - A Natural Fit
An observer of the diffusion of the Resource Description Framework (RDF) into various domains, Robert DuCharme, has commented: ‘I still find it a little ironic that while RDF has gotten so much publicity as a technology for warm and fuzzy AI (Artificial Intelligence) pie-in-the-sky technology, it’s gotten most of its traction in the mundane world of metadata.’ [3]
Yet, given the importance of metadata for the Semantic Web vision in general, it does not come as a surprise that metadata of key information communities belong to the first of RDF’s intended uses. RDF seems to gain momentum in particular among the library and other communities that use Dublin Core.
The current W3C RDF Primer (Working Draft, 23 January 2003), edited by Frank Manola and Eric Miller, labels RDF as ‘an ideal representation for Dublin Core information’ and describes Dublin Core as one of its ‘RDF in the field’ examples (cf. http://www.w3.org/TR/rdf-primer/).
At the Dublin Core Metadata Initiative (DCMI), ‘Expressing Simple Dublin Core in RDF/XML’ was announced as a DCMI Recommendation in October 2002, ‘the first in a series of recommendations for encoding Dublin Core metadata using mainstream Web technologies’, i.e. XML/RDF/XHTML. ‘Expressing Qualified Dublin Core in RDF/XML’ is currently a Proposed Recommendation. Cf. http://dublincore.org/groups/architecture/
Syntactic Transformation 1: Creating the XML Documents
In the FMS system, the eXtensible Markup Language (XML) is used as the data transfer format. This transfer format enables the system to make use of the data originally stored in the museums’ heterogeneous collection databases. Therefore, each museum participating in the FMS initiative provides the relevant collection data as an XML document repository.
In a process of syntactic harmonisation, the data from a museum’s collection database are retrieved and transformed to an XML format conforming to the XML Schema of the FMS initiative.
The data to be published are read from the database through a ‘view’, which helps create the XML format. The view is a queryable interface, a virtual table that results from an SQL query which may join multiple tables of the database. Through the view, the data are queried so that the rows of the tables are grouped by collection items. For each item, the set of rows is combined into a single XML document.
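The step from view rows to one XML document per item can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the FMS implementation: the tables item and item_subject, the view item_view and the in-memory SQLite database are hypothetical; only the element names and the sample values come from our medieval image record.

```python
import sqlite3
import xml.etree.ElementTree as ET

MI_NS = "http://www.m-i.org/images"  # namespace of the fictitious m-i.org site
ET.register_namespace("mi", MI_NS)


def build_demo_db():
    """Create an in-memory database with two hypothetical tables and a joining view."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE item (item_id TEXT PRIMARY KEY, type TEXT, creator TEXT);
        CREATE TABLE item_subject (item_id TEXT, subject TEXT, iconclass TEXT);
        INSERT INTO item VALUES
            ('image5kb78d38i', 'Column Miniature', 'Alexander Master');
        INSERT INTO item_subject VALUES
            ('image5kb78d38i', 'Eve emerges from Adam''s body', '71A3421');
        CREATE VIEW item_view AS
            SELECT i.item_id AS item_id, i.type AS type, s.subject AS subject,
                   s.iconclass AS iconclass, i.creator AS creator
            FROM item AS i JOIN item_subject AS s ON s.item_id = i.item_id;
    """)
    return con


def items_to_xml(con):
    """Group the rows of the view by item_id; return one XML element per item."""
    con.row_factory = sqlite3.Row
    docs = {}
    seen = set()  # (item_id, column, value) triples already written
    for row in con.execute("SELECT * FROM item_view ORDER BY item_id"):
        item_id = row["item_id"]
        if item_id not in docs:
            docs[item_id] = ET.Element(f"{{{MI_NS}}}image", {"image_id": item_id})
        for column in row.keys():
            value = row[column]
            key = (item_id, column, value)
            # Skip the identifier, empty cells and values repeated by the join.
            if column == "item_id" or value is None or key in seen:
                continue
            seen.add(key)
            ET.SubElement(docs[item_id], f"{{{MI_NS}}}{column}").text = value
    return docs


docs = items_to_xml(build_demo_db())
print(ET.tostring(docs["image5kb78d38i"], encoding="unicode"))
```

Because a join can return several rows for the same item, the sketch remembers which values it has already written, so that a repeated value such as the creator appears only once in the item's document.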
XML
XML is a markup language for describing data. It is a language created to allow anyone to design the structure of their own documents. An XML document contains text that consists of markup in the form of tags and plain text between them, the latter being just pure information (for example, <creator>Alexander Master</creator>). XML tags are not predefined; everyone can define his or her own tags.
FMS Documents
A short presentation is provided in Eero Hyvönen et al., Cultural Semantic Interoperability on the Web: Case Finnish Museums Online, http://iswc2002.semanticweb.org/posters/hyvonen_a4.pdf. For detailed descriptions, see Vilho Raatikka, Eero Hyvönen, Ontology-based Semantic Metadata Validation, and Eero Hyvönen et al., Semantic Interoperability on the Web: Case Finnish Museums Online. Both texts can be found in Towards the Semantic Web and Web Services, Proceedings of the XML Finland 2002 Conference: http://www.cs.helsinki.fi/u/eahyvone/xmlfinland2002/ProceedingsXML2002-final.pdf
ITEM_VIEW (columns of the view): item_id, type, subject, iconclass, creator, manuscript, place, year
XML shares the syntax and bracketed tags of the well-known HyperText Markup Language (HTML), but XML serves a different goal. While HTML is used to define the layout of pages on the WWW, XML is used to define the content of documents, for example to specify that an area of text is the name of a creator.
XML allows for creating markup (e.g. <creator>) that seems to carry some semantics. However, for a computer, a tag like <creator> carries as much semantics as a tag like <H1>. A computer simply does not know what a creator is and how the concept creator is related to other concepts (e.g. manuscript). For an XML processor, <H1> and <creator> or <manuscript> are all equally (and totally) meaningless. XML is all about describing data; on its own it does not do anything. There needs to be a processing program that uses the markup to interpret the various elements.
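A small Python sketch may make this point concrete. The parser of the standard library hands back a tag name and a text string for <creator> and <H1> alike; any meaning comes from application code, here a hypothetical table of display labels that we invented for the example.

```python
import xml.etree.ElementTree as ET

# Two fragments with the same structure; only the tag name differs.
for source in ("<creator>Alexander Master</creator>", "<H1>Alexander Master</H1>"):
    element = ET.fromstring(source)
    # The parser reports a name and a text string, nothing more.
    print(element.tag, "->", element.text)

# Meaning enters only through application code such as this hypothetical table.
DISPLAY_LABELS = {"creator": "Made by", "manuscript": "Part of manuscript"}


def describe(element):
    """Render an element using the label our application assigns to its tag."""
    label = DISPLAY_LABELS.get(element.tag, "(unknown tag)")
    return f"{label}: {element.text}"


print(describe(ET.fromstring("<creator>Alexander Master</creator>")))
print(describe(ET.fromstring("<H1>Alexander Master</H1>")))
```

The processing program, not the markup, decides that creator is something worth labelling and that H1 is not.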
The graphic below illustrates the database rows to XML process as described in the info box. It provides a very simple example of an XML document that describes some data for one of the medieval column miniatures from the Koninklijke Bibliotheek, The Hague, which we were permitted to use for illustrating this Thematic Issue. It includes the Iconclass classification for this image, 71A3421 Eve emerges from Adam’s body (for the hierarchical path of this classification, see the section on ontologies).
Short explanations for the XML document (image5kb78d38i.xml) shown in the graphic below:
A well-formed XML document is one that conforms to the XML syntax rules, of which we would like to highlight the following:
(1) The document should begin with the XML declaration, which defines the XML version and the character encoding used in the document (in XML 1.0 the declaration is optional but, if present, it must come first). In the example below we use <?xml version="1.0" encoding="ISO-8859-1"?>, i.e. the document conforms to the 1.0 specification of XML and uses the ISO-8859-1 (Latin-1/West European) character set.
(2a/10) The XML document must contain a single tag pair to define a root element, in our example <mi:image> </mi:image>.
Graphic 1: Database rows to XML process
Database rows (grouped by item):
‘image5kb78d38i’, ‘Column Miniature’, ‘Eve emerges from Adam’s Body’, ‘71A3421’, ‘Alexander Master’, ‘Historic Bible’, ‘Utrecht’, ‘circa 1430’
Rows to XML process
XML document (item data from database rows), image5kb78d38i.xml:
(1) <?xml version="1.0" encoding="ISO-8859-1"?>
(2) <mi:image xmlns:mi="http://www.m-i.org/images"
      image_id="image5kb78d38i">
(3) <mi:type>Column Miniature</mi:type>
(4) <mi:subject>Eve emerges from Adam’s body</mi:subject>
(5) <mi:iconclass>71A3421</mi:iconclass>
(6) <mi:creator>Alexander Master</mi:creator>
(7) <mi:manuscript>Historic Bible</mi:manuscript>
(8) <mi:place>Utrecht</mi:place>
(9) <mi:year>circa 1430</mi:year>
(10) </mi:image>
(2b) Namespace: Since element names in XML are not fixed, name conflicts can occur when different documents use the same names describing different types of elements. To prevent such conflicts, a unique namespace should be defined using a Uniform Resource Identifier (URI). An XML namespace is a collection of names that are used as element types and attribute names (cf. http://www.w3.org/TR/REC-xml-names/). Our namespace declaration in the start tag of the root element is xmlns:mi="http://www.m-i.org/images".
The namespace prefix mi (for medieval images) functions as a placeholder for the namespace name. It needs to show up in all element tags (e.g. <mi:type> </mi:type>).
(2c) image_id="image5kb78d38i": This is the id attribute, which contains the unique identifier of the data source record.
(3-9) All other elements must be within the root element and can themselves have sub-elements (child elements), which must be properly nested within their parent element. Our elements do not have sub-elements.
Other syntax rules are, for example: all start tags must match their end tags; because XML tags are case sensitive (i.e. the tag <mi:Creator> is different from the tag <mi:creator>), they must also be written with the same case; all elements must have a closing tag; all attribute values must be within quotation marks (e.g. "image5kb78d38i").
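The rules above can be tried out with the XML parser in Python's standard library. The sketch below is only an illustration: it uses a shortened version of our example document and shows how the parser expands the mi prefix into the namespace name, and how it rejects documents that break the syntax rules.

```python
import xml.etree.ElementTree as ET

# Shortened version of the example document from graphic 1.
GOOD = (
    '<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">'
    "<mi:creator>Alexander Master</mi:creator>"
    "</mi:image>"
)
# Same document, but the end tag differs in case from its start tag.
BAD_CASE = GOOD.replace("</mi:creator>", "</mi:Creator>")
# Same document, but the attribute value is not within quotation marks.
BAD_QUOTES = GOOD.replace('"image5kb78d38i"', "image5kb78d38i")


def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


root = ET.fromstring(GOOD)
# The prefix is replaced by the namespace name it stands for.
print(root.tag)
print(root[0].tag, root.get("image_id"))
print(is_well_formed(GOOD), is_well_formed(BAD_CASE), is_well_formed(BAD_QUOTES))
```

Note that the parser never reports the prefix mi itself; two documents that use different prefixes for the same namespace URI yield identical element names.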
Syntactic Transformation 2: The XML Schema
In order to allow for syntactic harmonisation, the XML documents of the museums should conform to the XML Schema of the FMS initiative. Therefore, when the museums create their XML documents, they validate them against the initiative’s XML Schema. If the documents are valid, the process can continue to the semantic level.
XML Schema
The XML Schema defines the building blocks of an XML document, including:
| elements and attributes that can appear in a document;
| which elements are child elements, as well as their order and number;
| whether an element is empty or can include text;
| the data types for elements and attributes;
| as well as default and fixed values for elements and attributes.
XML with an XML Schema is designed to be self-descriptive. One of the greatest strengths of XML Schema is that it allows for data typing. The most common data types are xs:string, xs:decimal, xs:integer, xs:boolean, xs:date and xs:time. In the example below, which is the XML Schema for the XML document (image5kb78d38i.xml) shown in graphic 1, we only use the data type xs:string. This data type is used for values that contain character strings.
(1) <?xml version="1.0"?>
(2a) <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
(2b) targetNamespace="http://www.m-i.org/images"
(2c) xmlns="http://www.m-i.org/images"
(2d) elementFormDefault="qualified">
(3) <xs:element name="image">
(4) <xs:complexType>
(5) <xs:sequence>
(6) <xs:element name="type" type="xs:string"/>
(7) <xs:element name="subject" type="xs:string"/>
(8) <xs:element name="iconclass" type="xs:string"/>
(9) <xs:element name="creator" type="xs:string"/>
(10) <xs:element name="manuscript" type="xs:string"/>
(11) <xs:element name="place" type="xs:string"/>
(12) <xs:element name="year" type="xs:string"/>
(13) </xs:sequence>
(14) <xs:attribute name="image_id" type="xs:string" use="required"/>
(15) </xs:complexType>
(16) </xs:element>
(17) </xs:schema>
Short explanations:
(1) The XML declaration, which states that the document conforms to the 1.0 specification of XML.
(2a) Determines that the elements and data types that are used to construct the schema come from the W3C’s XML Schema namespace. Consequently, each of the elements and data types in the schema has the prefix xs:, which identifies them as belonging to the vocabulary of the XML Schema language rather than the vocabulary of our (fictitious) organisation, M-i.org.
(2b) Indicates that the elements defined by this schema come from our http://www.m-i.org/images namespace.
(2c) Is our default namespace.
(2d) Demands that any elements used by an XML document which were declared in this schema must be namespace qualified.
(3/16) The parent element for the data typing of the image descriptions we provide at http://www.m-i.org/images. It is (4) defined as an xs:complexType, i.e. it contains child elements (6-12), which are (5/13) surrounded by an xs:sequence element that defines an ordered sequence of these elements.
(6-12) The child elements, which in our example are simple types because they do not contain other elements. They define various elements of our XML documents, e.g. “creator”, to be of the data type xs:string.
(14) Furthermore, the Schema determines that for the element xs:element name="image" there is a required attribute image_id of the data type xs:string (for example, image5kb78d38i).
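What the schema demands of a document can be spelled out in code. Python's standard library does not include an XML Schema validator, so the sketch below does not read the schema itself; it hand-codes the same three constraints (root element, required image_id attribute, fixed sequence of text-only child elements). The function name and the wording of the messages are our own; a real set-up would hand documents and schema to a validating parser instead.

```python
import xml.etree.ElementTree as ET

MI = "http://www.m-i.org/images"
# Child elements in the order fixed by the xs:sequence (schema lines 6-12).
EXPECTED_CHILDREN = ["type", "subject", "iconclass", "creator",
                     "manuscript", "place", "year"]

SAMPLE = """<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">
  <mi:type>Column Miniature</mi:type>
  <mi:subject>Eve emerges from Adam's body</mi:subject>
  <mi:iconclass>71A3421</mi:iconclass>
  <mi:creator>Alexander Master</mi:creator>
  <mi:manuscript>Historic Bible</mi:manuscript>
  <mi:place>Utrecht</mi:place>
  <mi:year>circa 1430</mi:year>
</mi:image>"""


def check_image_document(text):
    """Return a list of problems; an empty list means all checks passed."""
    root = ET.fromstring(text)
    problems = []
    if root.tag != f"{{{MI}}}image":
        problems.append("root element is not image in the m-i.org namespace")
    if root.get("image_id") is None:
        problems.append("required attribute image_id is missing")
    found = [child.tag for child in root]
    wanted = [f"{{{MI}}}{name}" for name in EXPECTED_CHILDREN]
    if found != wanted:
        problems.append("child elements differ from the declared sequence")
    for child in root:
        if len(child) > 0:  # simple types must not contain other elements
            problems.append(f"{child.tag} must contain text only")
    return problems


print(check_image_document(SAMPLE))
```

Removing an element such as mi:place, or changing the order of the elements, makes the comparison with the declared sequence fail, which mirrors the behaviour of xs:sequence with its default of exactly one occurrence per element.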
Major Benefits of XML
In the context of the Semantic Web, XML provides an interoperable syntactical foundation upon which solutions to the issues of representing relationships and meaning can be built. We also want to highlight the many benefits of XML that are adding to its rapid uptake in the first place and might, in the longer term, be supportive in realising the Semantic Web vision on a broader scale.
XML is one of the most important standards developments in recent years. It is an international, universal, non-system and non-application specific data exchange standard. XML is international because it employs Unicode. This means that there is no restriction to the western alphabet; Arabic, Chinese, Greek, Hebrew, Thai, etc. can be easily integrated.
XML is non-system specific because it is an open standard set by the World Wide Web Consortium (W3C). As such, there is no owner of XML. All the major software suppliers support it; it can be used on any computing platform, from Windows and MacOS to Linux. This makes it easier for organisations to change systems or combine different systems.
XML is also non-application specific, i.e. it can be used in various applications such as data exchange, data harvesting, Web site management, etc. XML is gaining ever-wider acceptance in many application domains, including in particular the cultural heritage community.
Bear in mind also that the major collection management software producers have implemented support for XML in their systems, enabling, for example, the integration of data from different collections and their combination over the Web.
XML allows for multi-channel publishing, i.e. with XML it is easy to produce different products or services from digital cultural heritage assets. Once the data are structured in XML, they can be displayed across a variety of media using an associated style sheet that contains the display information.
Finally, XML can be used to create new languages. For example, the Wireless Markup Language (WML), which is used to mark up Internet applications for handheld devices, is written in XML.
Ontological concepts
The goal of the FMS project is to make metadata of the museums’ textiles collections semantically interoperable on the Web. In order to achieve such interoperability, an ontology is being designed that describes the common (lower-level) ontological concepts in this domain of knowledge. [4]
Ontology
In the Semantic Web architecture, the semantic relationships are not embedded but explicitly represented by an ontology, or rather an interrelated set of ontologies. In fact, the wide array of information residing on the Web and the perceived need to make it more machine-processable have acted as a strong impetus for the development of ontology languages.
Yet what is an ontology? An abstract definition of an ontology is that it describes a formal, shared conceptualisation of a particular domain of interest, for example cultural heritage objects held in art museums. In particular, an ontology allows for constraining, expressing and analysing the intended meaning of the shared vocabulary of concepts and relations in a domain of knowledge. [5]
If these concepts and relations are formalised to a high degree, the domain has at hand a major building block for developing semantically aware information systems. With Semantic Web technologies, the domain ontology can be made available on the network, cross-referenced with upper-level and other domain ontologies, [6] and remote applications or intelligent software agents can refer to it when they interact to provide a certain information service.
[4] The available documents (in English) on the FMS initiative state that their ontology is being created using RDF Schema (RDFS). To develop a fully fledged ontology, advanced languages such as DAML+OIL or Web Ontology Language (OWL) would be required.
[5] For more elaborate and formal descriptions, see Tom Gruber, What is an Ontology? (1995): http://www-ksl.stanford.edu/kst/what-is-an-ontology.html; Nicola Guarino, Ontology-Driven Conceptual Modelling, part 1-3 (2002): http://ontology.ip.rm.cnr.it/Tutorials/
The degree of formalisation of concepts and their relations varies considerably between different domains of knowledge. At the lower end, one finds lexicons and simple taxonomies (i.e. ordered classification systems where terms are related hierarchically). At the middle level, one might place thesauri, i.e. controlled vocabularies that are structured to show relationships between terms and concepts and, for example, allow for retrieving them from a database. At the high end of formalisation of knowledge, there are axiomatised logic theories. Such theories include rules to ensure the well-formedness and logical validity of statements expressed in the language of the scientific discipline.
In the cultural heritage sector, a powerful IT-supported example of a hierarchical classification system is Iconclass. This supports the documentation of images, in particular art historical images, by providing a systematic collection of 28,000 ready-made definitions of objects, persons, events, situations and abstract ideas that can be the subject of an image. The definitions consist of an alphanumeric classification code and its textual correlate. [7]
For example, for the image we have described in XML in graphic 1, the Iconclass definition is 71A3421 Eve emerges from Adam’s body. The Medieval Illustrated Manuscripts Website of the Koninklijke Bibliotheek, The Hague, has an Iconclass Browser in place [8] that provides the hierarchical path for this concept in the classification system.
An outstanding example of a controlled vocabulary is the Art & Architecture Thesaurus (AAT), one of the Getty Research Institute’s Vocabulary Databases. It is a structured vocabulary of more than 125,000 terms and other information about concepts that are used for describing fine art, architecture, decorative arts, archival materials and material culture. [9]
The W3C’s Resource Description Framework (RDF) Schema Specification 1.0, in a section on its scope, mentions concept navigation and states: ‘Thesauri and library classification schemes are well known examples of hierarchical systems for representing subject taxonomies in terms of the relationships between named concepts. The RDF Schema specification provides sufficient resources for creating RDF models that represent the logical structure of thesauri (and other library classification systems).’ [10]
Yet for realising a full-blown cultural heritage ontology for the Semantic Web, there are currently limitations on both sides. On the one hand, hierarchical classification systems and structured vocabularies do not lend themselves easily to rich inter-linking of conceptual ‘trees’.
A major step further in this direction is the CIDOC object-oriented Conceptual Reference Model (CRM). [11] This provides an ontology of 81 classes and 130 properties, which describes in a formal language concepts and relations relevant to the documentation of cultural heritage.
On the other hand, RDF Schema has limitations when it comes to expressing complex ontological relationships. New languages based on description logics are being developed. These include DAML+OIL and the upcoming Web Ontology Language (OWL), which are capable of fully describing ontologies.
Also worth highlighting is that tools for ontology building are proliferating at the present time. These ontology editors need to be carefully assessed, as their capabilities differ considerably. [12]
Hierarchical path of the Iconclass classification 71A3421:
7 Bible
71 Old Testament
71A Genesis from the creation to the expulsion from paradise, and later years of Adam and Eve
71A3 creation of man, the Garden of Eden (Genesis 1:26-2)
71A34 creation of Eve
71A342 Eve is fashioned from Adam’s rib
71A3421 Eve emerges from Adam’s body
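In the path shown above, every broader concept carries a code that is a prefix of the narrower one. Under that assumption, the path can be computed from the code alone, as the following Python sketch shows. It is an illustration for this one plain notation only; it is not the Iconclass Browser's method, and fuller Iconclass notations with bracketed text or keys are not handled.

```python
def broader_codes(notation):
    """Return the notation and all of its prefixes, broadest first."""
    return [notation[:length] for length in range(1, len(notation) + 1)]


# Labels copied from the hierarchical path for 71A3421.
LABELS = {
    "7": "Bible",
    "71": "Old Testament",
    "71A": "Genesis from the creation to the expulsion from paradise, "
           "and later years of Adam and Eve",
    "71A3": "creation of man, the Garden of Eden",
    "71A34": "creation of Eve",
    "71A342": "Eve is fashioned from Adam's rib",
    "71A3421": "Eve emerges from Adam's body",
}

for code in broader_codes("71A3421"):
    print(code, LABELS[code])
```

A search for the broader code 71A34 (creation of Eve) can thus retrieve our image simply by testing whether its classification starts with that code.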
[6] Upper-level ontologies describe the basic concepts and relationships invoked when information about any domain is expressed in natural language.
[7] For in-depth information, see the official Iconclass Website: http://www.iconclass.nl
[8] Medieval Illustrated Manuscripts Website: http://www.kb.nl/kb/manuscripts/browser/. The subject access system for the Website was conceived by Mnemosyne Partners, building on the Iconclass classification system and technologies. See the valuable information they provide at http://www.mnemosyne.org/businessmsstempex.html
[9] A detailed description of the AAT is provided at http://www.getty.edu/research/tools/vocabulary/aat/about.html
[10] http://www.w3.org/TR/2000/CR-rdf-schema-20000327
[11] See the information box at the end of the Forum discussion and the sources mentioned in the Semantic Web Terms and Reading List.
[12] Michael Denny, Ontology Building: A Survey of Editing Tools (06-11-2002): http://www.xml.com/pub/a/2002/11/06/ontologies.html
Semantic Transformation 1: The RDF Data Model
Progressing towards semantic interoperability, the metadata in the XML documents are now transformed into RDF statements corresponding to the RDF data model. With these so-called RDF ‘triples’, the XML metadata elements are mapped to the RDF classes and properties which are defined by the RDF Schema of the FMS initiative (see section 3).
‘XML is nothing more than a way to standardize data formats. This is not to underplay XML’s importance. A data-format standard makes all of the more glamorous technologies possible, and RDF is the leading example of the benefit that comes once the data format has been standardized. Many proclaim that RDF is really the XML’s killer app, and with good reason. Despite all this, RDF remains somewhat obscure. This is mainly because at its core, RDF is very abstract, very dry and very academic.’ Uche Ogbuji, An introduction to RDF (2000): http://www-106.ibm.com/developerworks/library/w-rdf/?dwzone=xml
RDF Data ModelIn order to make Web resources semanticallyinteroperable we need resources that providemachine-understandable information about them-selves In the Semantic Web architecture thesestatements are built by using the ResourceDescription Framework (RDF)RDF defines a data model for the statementsdescribing typed relationships between uniquelyidentified sources RDF distinguishes between
| resources: familiar examples are a Web page, an electronic document or a digital image; but in RDF, entities that are not ‘network retrievable’, e.g. museums, curators or bound medieval manuscripts, can also be resources
| properties: these identify a specific aspect, characteristic, attribute or relation used to describe the resource
| statements: these associate a value for a named property with the resource
Hence, RDF provides a model for describing relationships between resources in terms of named properties and values. The RDF data model intrinsically supports only binary relations. Its base element is the ‘triple’, which takes the form of subject, predicate, object: a resource (the subject) is linked to another resource (the object) through an arc labelled with a third resource (the predicate). The semantics of a triple clearly depends on the property used as predicate.
A convenient way to visualise this is to draw nodes for subject and object and an arrow between them for the predicate (see graphic 2). In this labelled directed graph, subject and predicate (property) are Uniform Resource Identifiers (URIs), and the object is either a URI or a literal (which is drawn as a box). Everything in RDF can be represented by a graph with nodes and arcs, and the data model allows for using the same URI as a node and as an arc label. To represent RDF statements in a machine-processable way, RDF builds on XML. With RDF/XML, a specific XML markup language, RDF information can be represented and exchanged between machines.
Graphic 2: RDF Data Model
[Graphic: two triples, each drawn as a subject node and an object node joined by an arc labelled with the predicate. Node and arc labels are URIs in the namespaces http://www.m-i.org/schemas/images (ColumnMiniature, Miniature, Image) and http://www.w3.org/2000/01/rdf-schema# (subClassOf).]
Subject: ColumnMiniature; Predicate: subClassOf; Object: Miniature
Subject: Miniature; Predicate: subClassOf; Object: Image
With these two triples we state that ColumnMiniature is a subclass of Miniature and that Miniature is a subclass of Image.
The predicate in our statements is the rdfs:subClassOf property, which is predefined in the RDF Schema namespace http://www.w3.org/2000/01/rdf-schema#
The classes Image, Miniature and ColumnMiniature would also need to be defined in our RDF Schema namespace http://www.m-i.org/schemas/images
For details on how to use RDF Schema for defining your domain ontology, see section 3. In a nutshell, RDF Schema is a ‘higher-level’ language which is itself defined using RDF.
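The two triples of Graphic 2 can be written down as plain data. The following is a minimal sketch in Python, not the FMS implementation. The `Triple` type and the helper functions are hypothetical names chosen for illustration, and the `#` separator between the m-i.org namespace and the class names is an assumption.

```python
from typing import NamedTuple

# Namespaces named in the text. The '#' separator in MI is an assumption.
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
MI = "http://www.m-i.org/schemas/images#"


class Triple(NamedTuple):
    subject: str    # a URI
    predicate: str  # a URI
    obj: str        # a URI or a literal


# The two statements shown in Graphic 2.
graph = {
    Triple(MI + "ColumnMiniature", RDFS + "subClassOf", MI + "Miniature"),
    Triple(MI + "Miniature", RDFS + "subClassOf", MI + "Image"),
}


def nodes(graph):
    """Every subject and every object is a node of the labelled directed graph."""
    return {t.subject for t in graph} | {t.obj for t in graph}


def arcs_from(graph, subject):
    """Outgoing arcs of a node, as sorted (predicate, object) pairs."""
    return sorted((t.predicate, t.obj) for t in graph if t.subject == subject)
```

Because Miniature is the object of one triple and the subject of the other, the two statements share a node and form one connected graph of three nodes and two arcs.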
Semantic Transformation 2: Creating and Validating the RDF Statements
In the mapping process, an editor tool (in the FMS case, a tool developed by the project team called Meedio) receives the XML documents as input and assists in transforming them into semantically valid RDF statements (instance descriptions). The tool serves as an instance editor, which provides a convenient way of finding and selecting, from the XML metadata elements, correct instance values for a particular property. The editor tool also serves as a semantic metadata validator. When the museum cataloguer saves the set of RDF statements corresponding to an XML document, the semantics of these statements are validated against the property constraints of the FMS ontology. The result of a successful mapping and validation process is a unique set of RDF triples called the RDF card.
RDF Mapping Rules
When RDF is used to define the meaning of XML metadata elements, a set of mapping rules is created. A mapping rule is a template of RDF triples where XPath expressions are used to identify the actual element values. XPath is a language for addressing parts of an XML document and was designed for use in XML parsing software (XSLT, XPointer and others).
When applying such a rule to an XML document, the XPath expressions are instantiated with matching element values. If the rule matches, the RDF template evaluates to a set of RDF triples where XPath expressions are substituted by the corresponding values of the XML elements.
For example, by applying the template rule
<image5kb78d38itype, mi:hasCreator, image5kb78d38icreator>
to the XML document described in the section on XML, the following result would be obtained:
<Image Miniature, mi:hasCreator, ‘Alexander Master’>
mi:hasCreator is an example of an RDF property. Such properties are explained in section 3, RDF Schema.
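How a mapping rule is instantiated can be sketched with Python's standard library. This is an illustration under stated assumptions, not the Meedio tool: the XML record, its element and attribute names, the XPath expressions and the function `apply_rule` are all hypothetical, since the XML document from the section on XML is not reproduced here.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML record; element and attribute names are illustrative only.
XML_RECORD = """
<record>
  <image id="5kb78d38i">
    <type>Miniature</type>
    <creator>Alexander Master</creator>
  </image>
</record>
"""

# A mapping rule: a triple template whose subject and object are XPath
# expressions and whose predicate is a fixed RDF property.
RULE = (
    ".//image[@id='5kb78d38i']/type",
    "mi:hasCreator",
    ".//image[@id='5kb78d38i']/creator",
)


def apply_rule(rule, xml_text):
    """Instantiate the XPath expressions of a rule against one XML document."""
    root = ET.fromstring(xml_text)
    subject_path, predicate, object_path = rule
    subject = root.findtext(subject_path)
    obj = root.findtext(object_path)
    if subject is None or obj is None:
        return []  # the rule does not match this document
    return [(subject.strip(), predicate, obj.strip())]
```

Calling `apply_rule(RULE, XML_RECORD)` yields one triple whose subject and object are the element values found by the two XPath expressions. A rule whose expressions match nothing yields an empty list, which corresponds to the case where the rule does not match.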
Note: Due to the limited space permitted, we do not address issues of term mapping. This is an important aspect of the mapping and validation process carried out in the FMS project. Working with metadata from different museums, the project needs to deal with partly different terminologies. Its technical solution to synonymous terms (i.e. different terms referring to the same concepts) is to attach synonym sets to the FMS ontology classes. Where polysemous terms occur (i.e. the same terms refer to different concepts), the editor tool cannot cope and the cataloguer needs to select the correct interpretation.
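The difference between the two cases in the note can be made concrete with a small sketch. The synonym sets, the class names `mi:DecoratedInitial` and `mi:Monogram`, and both functions are hypothetical and only illustrate the idea of synonym sets attached to ontology classes. They are not taken from the FMS ontology.

```python
# Hypothetical synonym sets attached to ontology classes.
SYNONYMS = {
    "mi:Miniature": {"miniature", "illumination"},
    "mi:DecoratedInitial": {"decorated initial", "initial"},
    "mi:Monogram": {"monogram", "initial"},  # 'initial' is polysemous here
}


def candidate_classes(term):
    """All classes whose synonym set contains the (normalised) term."""
    term = term.strip().lower()
    return sorted(cls for cls, terms in SYNONYMS.items() if term in terms)


def map_term(term):
    """Return the class for an unambiguous term, otherwise None."""
    candidates = candidate_classes(term)
    if len(candidates) == 1:
        return candidates[0]
    return None  # unknown or polysemous: the cataloguer has to decide
```

A synonym such as "illumination" resolves automatically, while a polysemous term such as "initial" produces several candidates and is handed back to the cataloguer.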
Semantic Transformation 3: The RDF Schema (RDFS)
The shared ontology for the textiles domain is created by using Resource Description Framework Schema (RDFS). An RDF Schema is a tool for indicating the classes of resources one wants to describe, as well as for defining the properties used to describe those resources. Furthermore, class/sub-class relationships and property/sub-property relationships can be defined. The museums are mapping their metadata to the classes and properties defined by the RDF Schema of the FMS initiative. Thereby they are making the meaning of the metadata explicit and representing them in a harmonised, uniform way.
RDF Schema
In the section Semantic Transformation 1 we have described the data model provided by RDF for expressing statements about Web resources. But we also need a vocabulary for the RDF statements, namely classes and properties defined with RDF Schema (RDFS).
In brief, the RDF Schema mechanism provides a pre-defined vocabulary, a basic type system that can be used in creating domain-specific schemas. Its role is to allow for declaring metadata properties (e.g. for ‘type’, ‘subject’ or ‘creator’), to define the classes of resources they may be used with, to restrict possible combinations and to detect violations of those restrictions.
Defining classes
With RDF Schema (RDFS), Web resources can be defined as instances of one or more classes. In addition, classes can be organised in a hierarchical fashion. As we hold a collection of digital images drawn from illustrated medieval manuscripts, we first need to define a class of things that are images. In RDF Schema, a class is any resource having an rdf:type property whose value is the RDFS-defined resource rdfs:Class.
So, using the basic RDF data model, we define:
mi:Image [resource] rdf:type [property] rdfs:Class [value]
The self-defined prefix mi (for medieval images) stands for the URI reference of our RDF Schema namespace http://www.m-i.org/schemas/images
In our image collection we have various special kinds of digitised images, such as column miniatures, decorated initials, schematic drawings etc. To distinguish, for example, the miniatures, we first need to define a general class Miniature and subclasses of miniatures, e.g. a subclass ColumnMiniature:
mi:Miniature rdf:type rdfs:Class
mi:ColumnMiniature rdf:type rdfs:Class
Secondly, we need to define that mi:ColumnMiniature is a subclass of mi:Miniature and that mi:Miniature is a subclass of mi:Image, for which we use the predefined rdfs:subClassOf property:
mi:Miniature rdfs:subClassOf mi:Image
mi:ColumnMiniature rdfs:subClassOf mi:Miniature
As the rdfs:subClassOf property is transitive, this means that mi:ColumnMiniature is also implicitly a subclass of mi:Image.
Graphic 2 (RDF Data Model) visualises this with the nodes and arcs of the basic RDF data model.
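The transitivity of the subclass relation can be checked mechanically. The sketch below, in Python, follows the subclass statements upwards from a class and collects every superclass it reaches. The prefixed names are written as plain strings, and the function name `superclasses` is a hypothetical choice for illustration.

```python
# The two subclass statements from the text, as (subclass, superclass) pairs.
SUBCLASS_OF = {
    ("mi:Miniature", "mi:Image"),
    ("mi:ColumnMiniature", "mi:Miniature"),
}


def superclasses(cls, subclass_of=SUBCLASS_OF):
    """All direct and implied superclasses of cls (transitive closure)."""
    found = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for sub, sup in subclass_of:
            if sub == current and sup not in found:
                found.add(sup)
                frontier.append(sup)
    return found
```

For the column miniature class the result contains both the directly stated superclass and the implied one, although only the first of these is stated explicitly in the schema.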
Defining properties
In order to make the meaning of our metadata (i.e. ‘type’) explicit, we need to be capable of declaring specific properties that characterise the classes of things we hold at http://www.m-i.org, e.g. digital images of medieval column miniatures.
Basically, RDF Schema defines properties in terms of the classes of resources to which they apply. This is the role of the rdfs:domain and rdfs:range mechanisms.
rdfs:range
The range constraint defines the class or set of classes whose instances can be values of a particular property. If we want to define the property mi:hasType, we must describe this resource (which we locate at http://www.m-i.org/schemas/images) with an rdf:type property whose value is rdf:Property:
mi:hasType [resource] rdf:type [property] rdf:Property [value]
The following RDF statements indicate that mi:ColumnMiniature is a class, mi:hasType is a property, and RDF statements using the mi:hasType property have instances of mi:ColumnMiniature as values:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:range mi:ColumnMiniature
rdfs:domain
The domain constraint restricts the set of classes whose instances may have a particular property attached to them. If we want to indicate that the property mi:hasType applies to instances of class mi:ColumnMiniature, we would write:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:domain mi:ColumnMiniature
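Read as constraints, domain and range can be used to detect invalid statements. The following Python sketch checks a statement against both declarations made above for the hasType property. The instance names `mi:m1` and `mi:d1`, the dictionaries and the functions are hypothetical, and applying both constraints to the same property is done here only to show the mechanism.

```python
# Class hierarchy and constraints from the text; instance names are hypothetical.
SUBCLASS_OF = {"mi:ColumnMiniature": "mi:Miniature", "mi:Miniature": "mi:Image"}
TYPES = {"mi:m1": "mi:ColumnMiniature", "mi:d1": "mi:Image"}
DOMAIN = {"mi:hasType": "mi:ColumnMiniature"}
RANGE = {"mi:hasType": "mi:ColumnMiniature"}


def is_instance(resource, cls):
    """True if the resource's class is cls or one of its subclasses."""
    current = TYPES.get(resource)
    while current is not None:
        if current == cls:
            return True
        current = SUBCLASS_OF.get(current)
    return False


def violations(statement):
    """List which constraints ('domain', 'range') a statement violates."""
    subject, prop, obj = statement
    problems = []
    if prop in DOMAIN and not is_instance(subject, DOMAIN[prop]):
        problems.append("domain")
    if prop in RANGE and not is_instance(obj, RANGE[prop]):
        problems.append("range")
    return problems
```

A statement whose subject is not a column miniature violates the domain constraint, and one whose value is not a column miniature violates the range constraint. This is the kind of check a semantic metadata validator would perform before saving a set of statements.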
Benefits of RDF
In a SearchWebServices.com definition of RDF, some benefits of RDF are mentioned:
| ‘By providing a consistent framework, RDF will encourage the providing of metadata about Internet resources.
| Because RDF will include a standard syntax for describing and querying data, software that exploits metadata will be easier and faster to produce.
| The standard syntax and query capability will allow applications to exchange information more easily.
| Searchers will get more precise results from searching, based on metadata rather than on indexes derived from full text gathering.
| Intelligent software agents will have more precise data to work with.’13
This is a well-crafted listing of RDF benefits, from provision and exchange of better metadata to agents working with them, hopefully for the benefit of humans. But, as explicitly stated by SearchWebServices.com, these are only potential benefits, i.e. they depend on the level of actual uptake of RDF.
13 whatis.com / searchWebServices.com: Definitions - Resource Description Framework, http://searchwebservices.techtarget.com/sDefinition/0,,sid26_gci213545,00.html (last updated July 27, 2001)
Generating and Using the Knowledge Space
The RDF cards represent the original XML documents at the semantic level. The union of such RDF cards constitutes a knowledge base, which is a harmonised semantic representation of the underlying heterogeneous databases.
However, so far the RDF instance descriptions have not left the museum. The museum has complete control of the information it wants to publish, and it does not need to allow the FMS system access to its internal database system. The RDF data are placed in a public directory on the museum’s WWW server.
The Web crawler of the FMS system harvests the instance descriptions from the different museums, and the system combines them into an RDF repository. This repository is a large semantic graph that consists of the shared ontology and metadata.
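Since each RDF card is a set of triples, combining the harvested cards with the shared ontology amounts to a set union. The Python sketch below illustrates this. The museum keys, the instance identifiers `a:img1` and `b:img7` and the function `build_repository` are hypothetical, and the actual harvesting over the Web is left out.

```python
# Shared ontology (schema-level triples).
ONTOLOGY = {
    ("mi:ColumnMiniature", "rdfs:subClassOf", "mi:Miniature"),
    ("mi:Miniature", "rdfs:subClassOf", "mi:Image"),
}

# Hypothetical RDF cards as harvested from two museums.
CARDS = {
    "museum-a": {
        ("a:img1", "rdf:type", "mi:ColumnMiniature"),
        ("a:img1", "mi:hasCreator", "Alexander Master"),
    },
    "museum-b": {
        ("b:img7", "rdf:type", "mi:Miniature"),
    },
}


def build_repository(ontology, cards):
    """Union of the shared ontology and all harvested RDF cards."""
    repository = set(ontology)
    for card in cards.values():
        repository |= card
    return repository
```

Because triples are identified by their content, a statement harvested twice appears only once in the repository, and the museums' source databases are never touched.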
How does a user now search and navigate in this knowledge space? In the FMS system, this is implemented by a server-side software called Ontogator. Based on the semantic graph, this software dynamically generates semantic linkages for the user’s Web browser.
One way of using the FMS system is view-based filtering. The user can select classes of resources from the ontology, and the system finds the instances that match the selected class restrictions. By constraining classes (views) further, the collection instance data searched for are eventually found.
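View-based filtering can be sketched as a query over the semantic graph: each selected class restricts the result to its instances, including instances of its subclasses, and several selections are intersected. This Python sketch is an illustration only and makes no claim about how Ontogator works internally. The instance identifiers and function names are hypothetical.

```python
# A small repository: ontology triples plus hypothetical instance triples.
TRIPLES = {
    ("mi:ColumnMiniature", "rdfs:subClassOf", "mi:Miniature"),
    ("mi:Miniature", "rdfs:subClassOf", "mi:Image"),
    ("a:img1", "rdf:type", "mi:ColumnMiniature"),
    ("b:img7", "rdf:type", "mi:Miniature"),
    ("b:img9", "rdf:type", "mi:Image"),
}


def subclasses(cls):
    """cls plus every class below it in the rdfs:subClassOf hierarchy."""
    found = {cls}
    changed = True
    while changed:
        changed = False
        for s, p, o in TRIPLES:
            if p == "rdfs:subClassOf" and o in found and s not in found:
                found.add(s)
                changed = True
    return found


def instances_of(cls):
    """Instances typed with cls or with any of its subclasses."""
    wanted = subclasses(cls)
    return {s for s, p, o in TRIPLES if p == "rdf:type" and o in wanted}


def filter_views(selected_classes):
    """Intersect the instance sets of all selected classes."""
    result = None
    for cls in selected_classes:
        matches = instances_of(cls)
        result = matches if result is None else result & matches
    return result or set()
```

Selecting only the image class returns all three instances. Adding the miniature class narrows the result to two, and adding the column miniature class narrows it to one.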
The software also supports topic-based navigation by providing semantic links between topics of interest, the creation of which is based on the collection domain ontology and the related metadata of the collection records. This means that the links also provide the user with an impression of the wider context and pragmatics of the objects in the museums’ collections.
From human users to software agents
As described in the Finnish Museums on the Semantic Web example, the RDF repository is a large semantic graph that consists of the shared ontology and metadata of the participating museums. Such a repository can be queried, and the results, a set of pointers to the relevant resources, can be accessed using Web browsers. The opportunities provided by a system like the one developed by the FMS initiative (e.g. topic-based navigation) are at present restricted primarily to human users.
The Semantic Web vision includes intelligent software agents which ‘understand’ semantic relationships between Web resources and seek relevant information as well as perform transactions for humans.14
This software would be capable of autonomous action, i.e. it could run without direct human control or constant supervision, and ideally would be very flexible in doing this. Characterisations of this flexibility include actions that are ‘reactive’, ‘proactive’ and ‘social’ (see below).
While the basic idea of agents is very intuitive and appealing, the actual theory is complex, the tools are immature, and the solutions are small and prototype-based. In fact, as a parallel distributed systems technology, agents belong to the most complex class of software technology.
However, this primer will conclude with a summary of what an intelligent software agent is and what such software would generally be capable of doing. This should also serve as an indication of how great the challenge for research and technological development is to make the full Semantic Web vision a reality.
Intelligent Software Agents
The following definitions are taken from Michael Wooldridge’s introduction to multiagent systems.15
Agent
‘An agent is a computer system capable of autonomous action in some environment.’
Intelligent agent
‘An intelligent agent is a computer system capable of flexible autonomous action in some environment.’
Flexible autonomous action
‘By flexible autonomous action we mean reactive, proactive, social.’
| Reactivity: ‘A reactive system is one that maintains an ongoing interaction with its environment and responds to changes that occur in it (in time for the response to be useful).’
| Proactiveness: ‘An agent serves a purpose and therefore exhibits goal-directed behaviour, including the capacity to recognise opportunities for useful courses of action.’
| Social ability: ‘Social ability in agents is the ability to interact with other agents (and possibly humans) via some kind of agent communication language, and perhaps cooperate with others.’
Desirable further properties of agents are:
| Mobility: the ability to move around an electronic network
| Rationality: an agent will act in such a way that it does not prevent itself from achieving its goals (as far as this is possible with a limited set of beliefs representing its world knowledge)
| Learning: an agent will improve its performance over time
14 T. Berners-Lee, J. Hendler, O. Lassila: Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html
15 M. Wooldridge: An Introduction to Multiagent Systems. Chichester: Wiley, 2002, and http://www.csc.liv.ac.uk/~mjw/pubs/imas/
Resources
In the primer, no references are made to the documents of the World Wide Web Consortium (W3C). All relevant W3C recommendations can be found at http://www.w3c.org
Of the wealth of introductory materials on XML and RDF available on the Web, the following in particular are useful to consult for further details:
http://www.w3schools.com/xml/
http://www.w3schools.com/schema/
http://www.w3.org/TR/rdf-primer/
K. S. Candan, H. Liu, R. Suvarna: Resource Description Framework: Metadata and its Applications. In: SIGKDD Explorations, Vol. 3/1 (2001), 6-19
Pierre-Antoine Champin: RDF Tutorial (2001), http://www710.univ-lyon1.fr/~champin/rdf-tutorial/rdf-tutorial.html
S. Decker, M. P. Mitra, S. Melnik: Framework for the Semantic Web: An RDF Tutorial, http://www.ida.liu.se/~asmpa/courses/sweb/rdf/rdf_tutorial.pdf
Graphic 3: Set-up of the Finnish Museums on the Semantic Web System
The Finnish Museums on the Semantic Web: Overview of the System’s Set-up
[Graphic showing the layers of the system, from the user down to the museums’ databases:
User client: WWW browser (topic-based navigation, view-based filtering)
Server: Ontogator software
RDF database: semantic graph, i.e. the knowledge space of shared ontology and metadata
Web crawler
RDF Schema (semantic interoperability): RDF instances of each museum
XML Schema (syntactic interoperability): metadata editor and XML repositories
Relational schemas (DBMS): collection database 1, collection database 2, ... collection database n]
THE DARMSTADT FORUM PARTICIPANTS

Wernher Behrendt, Salzburg Research, Austria
Wernher Behrendt is a Senior Researcher at the Sun Technology and Research Excellence Center at Salzburg Research (Austria), working on multimedia middleware and interoperation issues. He holds an MSc in Cognitive Science from Manchester University and has more than 10 years’ experience in near-to-market IT research. From 1989 to 1995 he was a Senior Research Associate in the Informatics Department at Rutherford Appleton Laboratory (UK), working on embedded knowledge based systems in distributed multimedia presentation systems. From 1995 to 1998 he was a Senior Research Associate at Cardiff University (UK), working on interoperation between heterogeneous information systems. Mr Behrendt has held courses in Computer Science and has worked on projects ranging from software engineering methods and quality assurance to legacy system reengineering using migration methods and distributed systems middleware.
E-mail: wernher.behrendt@salzburgresearch.at

Paolo Buonora, Archivio di Stato di Roma, Italy
Paolo Buonora holds a degree in Philosophy from the University of Rome ‘La Sapienza’ (1976). He worked in the Italian State Archive Administration from 1978, where he was first involved in editing the Guida Generale degli Archivi di Stato italiani. From 1986 he worked in the Soprintendenza archivistica per il Lazio, surveying audiovisual archives and municipal archives; from 1989 to 1991 at the Perugia University, engaged in doctoral research in ‘Urban and rural history’; and from 1991 to 1994 again in the Soprintendenza archivistica per il Lazio. After 1994 he worked in the Archivio di Stato di Roma, where he was responsible for the photograph service and several working groups on informatics application in archival documentation. From 1997 until the present time he has planned and directed the Imago II project in the Archivio di Stato di Roma.
See httpwwwasrmarchivbeniculturaliitsidimagoIMAGOIIenhtml

Samuel Cruz-Lara, University of Nancy 2 and LORIA, France
Samuel Cruz-Lara obtained a Master’s degree in Computer Science in 1984 (University Henri Poincaré, Nancy 1) and a PhD degree in Computer Science in 1988 (National Polytechnic Institute of Lorraine). The central topic of his PhD thesis was the generation of integrated development environments by using attribute grammars.
He is currently Associate Professor at the University of Nancy 2 (Institute of Technology, Computer Science Department) and permanent Researcher at LORIA (Lorraine Laboratory for Research in Computer Science and its Applications – UMR 7503 – CNRS – INRIA – Universities of Nancy). He is a member of the ‘Language and Dialogue’ team and has conducted several research activities on distributed software architectures and textual linguistic resources management. He is currently working in the context of distributed architectures and multimedia resources management. Dr Cruz-Lara has participated in several projects, in particular CNRS/SILFIDE and MLIS-ELAN, and he is at present co-leader of the ‘Digital Museum’ project. This is a joint project between LORIA and the National Chi-Nan University. The ‘Digital Museum’ project is sponsored by the National Science Council of the Republic of China (Taiwan) and supported by INRIA (France).

Costis Dallas, Critical Publics SA, Greece
Costis Dallas is Chairman and Senior Researcher of Critical Publics (http://www.criticalpublics.com), a group of companies active in the field of strategic communications, creative design and technology. He is currently a Lecturer in the Department of Communication and Mass Media of Panteion University and has over 15 years of research and professional experience in hypermedia applications, human factor issues and cultural information systems. Dr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA, co-founder and Chair of the Multimedia Working Group of the International Council of Museums (CIDOC/ICOM), Head of Documentation and Systems of the Benaki Museum, General Director of the Foundation of the Hellenic World, Special Secretary of the Greek Ministry of Education in charge of libraries, archives and instructional technologies, and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: info@criticalpublics.com

Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (http://www.nl.adlibsoft.com), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that he joined Databasix, producer of the ADLIB software. In 1991 Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was renamed later as ADLIB Information Systems.
ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include:
CIMI Dublin Core test bed - ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database, in addition to the ‘standard’ HTML/XML access.
SPECTRUM-XML - A project of the UK mda, together with a team of software vendors, to produce a schema for exchange of data which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software.
Internet Gelderse Musea (http://www.igem.nl) and Maritiem Digitaal (http://www.maritiemdigitaal.nl) - Two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.
E-mail: bert@nl.adlibsoft.com

Nicola Guarino, Italian National Research Council, Italy
Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991 and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS) and associate editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: Nicola.Guarino@ladseb.pd.cnr.it

Janneke van Kersen, Digital Heritage Association, The Netherlands
Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992 she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing, and more recently she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University.
Since October 1999 van Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations, such as museums, archives, archaeological organisations, monuments and special collections of libraries, in a context-rich and XML-based environment. Interoperability from both a technical and an organisational perspective is therefore a major issue. The Association offers access to heritage information to the general public at http://www.cultuurwijzer.nl and customised access for the educational field at http://www.cultuurwijs.nl
E-mail: jannekevankersen@den.nl

Marco Meli, EDW International, Italy
Marco Meli is CEO and co-founder of EDW International (http://www.edw-international.com), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia and VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meli@edw.it

Paul Miller, UKOLN, UK
Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, http://www.ukoln.ac.uk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK’s Further and Higher Education Funding Councils and Resource: the Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI) and the Metadata Working Group of the UK Government’s Office of the e-Envoy.
E-mail: P.Miller@ukoln.ac.uk

Frank Nack, CWI, The Netherlands
Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on ‘The Application of Video Semantics and Theme Representation for Automated Film Editing’ at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: Frank.Nack@cwi.nl

Franco Niccolucci, Florence University, Italy
Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference.
See also http://www.geog.port.ac.uk/hist-bound/people/niccolucci.htm

Seamus Ross, HATII, University of Glasgow, UK
Dr Seamus Ross is Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Network) (IST-2001-32706), a European Union funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously he was Assistant Secretary for Information Technology at the British Academy, and before that worked for a company specialising in expert systems and software development, as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group’s PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: S.Ross@hatii.arts.gla.ac.uk

Andrea Scotti, Institute & Museum for the History of Science, Italy
Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995 he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.
Since 1996 Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza and the National Central Library, and under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (010201-300703), funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies.
See http://cweb.inria.fr/Projects/Mesmuses
E-mail: scottian@galileo.imss.firenze.it
DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch thatmonitors and analyses technological developmentsrelevant to and in the cultural and scientific heritagesector over the period of 30 months (032002-082004)
In order to encourage early take up DigiCULTproduces seven Thematic Issues three TechnologyWatch Reports along with the newsletterDigiCULTInfo
DigiCULT draws on the results of the strategicstudy lsquoTechnological Landscapes for TomorrowrsquosCultural Economy (DigiCULT)rsquo that was initiatedby the European Commission DG InformationSociety (Unit D2 Cultural Heritage Applications)in 2000 and completed in 2001
Copies of the DigiCULT Full Report andExecutive Summary can be downloaded or orderedat httpwwwdigicultinfo
For further information on DigiCULT pleasecontact the team of the project co-ordinator
Mr Guntram Geser, guntram.geser@salzburgresearch.at
Phone: +43-(0)662-2288-303
Mr John Pereira, john.pereira@salzburgresearch.at
Phone: +43-(0)662-2288-247
Salzburg Research Forschungsgesellschaft
Jakob-Haringer-Str. 5/III
A-5020 Salzburg, Austria
Phone: +43-(0)662-2288-200
Fax: +43-(0)662-2288-222
http://www.salzburgresearch.at
Project Partner:
HATII - Humanities Advanced Technology and Information Institute, University of Glasgow
http://www.hatii.arts.gla.ac.uk
Contact: Mr Seamus Ross, s.ross@hatii.arts.gla.ac.uk
The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom
DIGICULT PROJECT INFORMATION
DigiCULT Thematic Issue 1 - Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum, held in Barcelona on May 6th, 2002, in the context of the DLM-Conference 2002.
DigiCULT Thematic Issue 2 - Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum, held in Essen, Germany, on September 3rd, 2002, in the context of the AIIM Conference @ DMS EXPO.
DigiCULT Thematic Issue 3 - Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum, held on January 21st, 2003, at Fraunhofer IPSI, Darmstadt, Germany.
DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum on Learning Objects that will take place at the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, on July 2nd, 2003.
IMPRINT
This Thematic Issue is a product of the DigiCULT Project (IST-2001-34898).
Authors:
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy
Images:
Images for this Thematic Issue have been provided by, and are reproduced with kind permission of, the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, Netherlands.
Graphics & Layout:
Jan Steindl, Salzburg Research
ISBN 3-902448-00-8
Printed in Austria
© 2003
IMAGES
Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I. Image on p. 5, from Fol. 264r, size 80x75, illuminator: Orosius Master a.o.
Bible Historiale, Paris, c. 1320-1340, Volume I. Image on p. 7, from Fol. 2r, size 45x50, illum.: ‘Sub-Fauvel’ Master.
Historic Bible, Utrecht, c. 1430, Volume I. Images on pp. 8, 10, 14, 28, from Fol. 3r, 3v, 4v, 7v, size 55x85 to 60x85, illum.: Alexander Master o.a.
Psalter, Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468. Images on pp. 11, 13: Wednesday Matins, Invitatorium, from Fol. 219r.
Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350. Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron), from Fol. 104rb1, 106rb2, 124r, 111ra, size 40x55 to 50x55.
Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335. Image on p. 21, from Fol. 4va1, size 45x55.
Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460. Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41.
Psalter, Normandy, c. 1180. Image on p. 26, from Fol. 3v, size 160x125.
Breviary, Cambray(?), c. 1275-1300. Images on pp. 27, 29, 31, 33, 34, 35, 37, from Fol. 211r, size 205x135.
© Koninklijke Bibliotheek, The Hague. Used with permission.
astronomer Galileo. But the experts agreed, finally, that the cultural heritage sector needed the Semantic Web and that a good deal of education and guidance would be required to make it appreciate that need.
The experts numbered 13, lucky for some, at this, the third DigiCULT Forum of the European Union’s technology watchdog for cultural and scientific heritage institutions. In the previous 12 months, other forum groups had discussed authenticity and integrity for digitisation programmes and, later, digital asset management systems. Now the Darmstadt 13 - historians, language and information technology scientists, academics and publishers - were looking even further down the information autobahn to the vision of WWW inventor, Englishman Tim Berners-Lee, who sees a new kind of automated Web that learns and understands each user’s particular requirements and delivers complete, reliable, tested information sets.
In a co-authored May 2001 Scientific American article¹, Mr Berners-Lee imagines a family facing the horrors of re-scheduling its lives around a mother’s unexpected illness. The sons and daughters rely on Semantic Web ‘agents’, small executable Web files, to search online medical records, hospital bed lists, transport timetables, doctors’ appointment books, road condition reports and home diaries to find treatment, plan travel and re-arrange personal engagements to fit the emergency.
The vision requires huge world-wide investment in time and effort, creating countless ‘ontologies’ containing, perhaps, XML (eXtensible Mark-up Language) and RDF (Resource Description Framework) data to which the electronic ‘agents’ could refer for understanding before applying to specially formatted Web pages for the information.
The Berners-Lee et al. dazzling forecast is: ‘The Semantic Web will enable machines to comprehend semantic documents and data, not human speech and writings. Properly designed, the Semantic Web can assist the evolution of human knowledge as a whole.’
THE DAZZLING PROSPECTS
Dazzling it is, and the Darmstadt 13 were attracted. But they were not blinded. Moderator Dr Seamus Ross, the Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII), suggested that, while the current WWW content got a lot of ‘bang’ for its development dollars, the Semantic Web needed huge, expensive content before it could work well.
Application of the Berners-Lee ideas to cultural heritage use was a long way off, he thought, and wondered: ‘Is there enough benefit from the Semantic Web in the near term to make it a realisable dream 50 years down the road?’
Italian National Research Council Applied Ontologies Laboratory director Nicola Guarino has been working on the subject for 12 years and he knows the difficulties. He said: ‘This is the ideal view which Tim Berners-Lee has: machines which work for you, your proxy which works for you, performing these dynamic connections for the Web which preserve meaning. It is pretty ambitious, but this is his idea. I would be happier if, rather than using an automatic proxy, we could just let people establish these dynamic connections using their brain and the Web. This is already something that is not easily done.’
Austria’s Wernher Behrendt had encountered other snags. At Salzburg Research, the secretariat for the DigiCULT Forums, he co-ordinates another European Commission IST project, CULTOS (Cultural Units of Learning - Tools and Services). He conceded: ‘There is a 50-year research vision behind the issue of the Semantic Web,’ and went on: ‘but there are incremental steps that, with good utility, can be built in a reasonable time. One of the intellectual challenges is to break the vision into these manageable steps.’
LANGUAGE REPRESENTATION HITCHES
A CULTOS group had, Behrendt explained, taken one of these incremental steps and built an ontology² for digitised works of art. It had encountered problems with language representation, like: ‘Are there tools to support knowledge representation language? Are the users then actually able to work usefully with that? Can we incorporate the multimedia authoring component, where people who have not built the
1 T. Berners-Lee, J. Hendler, O. Lassila: The Semantic Web. In: Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html
2 Ontology, n. 1. Philosophy: The branch of metaphysics that deals with the nature of being. 2. Logic: The set of entities presupposed by a theory. Collins English Dictionary, Third edition, Glasgow 1991.
Cultural Units of Learning - Tools and Services (CULTOS)
CULTOS is an RTD project co-funded by the European Commission under the Information Society Technologies (IST) Programme, which will run until October 2003. The application domain of CULTOS is intertextual studies in literature and arts. The project is developing a multimedia authoring and presentation environment that allows scholars to make the different relationships between cultural works explicit in a way that approximates contextualisation in interpretative processes. The result of the authoring processes is multimedia objects called ‘intertextual cultural threads’. These are based on ‘EMMOs’, a novel type of structured multimedia object containing expert knowledge, conforming to current and emerging standards such as XML/SMIL (with interactive extensions), MPEG-7 and RDF.
http://www.cultos.org
ontology themselves should then use it to combine multimedia assets with each other?’
The Dutch had doubts, too. Dr Janneke van Kersen is an art historian with Digital Heritage Netherlands (Digitaal Erfgoed Nederland, http://www.den.nl), where an XML-based content management system is being combined with a Resource Description Framework (RDF) to join databases from several cultural heritage institutions. She told the experts: ‘I need to be assured that we will be able to build a layered structure that is equally applicable to each knowledge domain. Furthermore, I think that the cultural heritage sector is too much of a niche market to develop this.’
Her countryman, Dr Frank Nack, from the national research institute for mathematics and computer science CWI (Centrum voor Wiskunde en Informatica, http://www.cwi.nl) in Amsterdam, works with a multimedia and human computer interaction group. His concern: ‘Our group believes in the Semantic Web, but we needed some mechanisms to structure the information so that various groups can work with it. What we came up with was the belief that you can classify the user at a particular time. But that was simply not good enough.’
The group had found that users change their requirements widely and these shifts were invisible to a system. ‘Humans can look at material one day and the next day they look at the same stuff differently and describe it differently because they are in a different mood,’ he said. He characterised the problem as: ‘Now I would like to see something for my work, and now I want to be entertained. Which means I would like to access the information differently.’
He had one other worry: Webised mixed media. He said: ‘This discussion has been heavily linguistic based, which I can understand because most people do still think of the Web as text driven. The issue of describing various media items that are not text will, I think, very soon become important for the Semantic Web. We had better start thinking about that, too.’
THE GALILEO CONUNDRUM
The Institute and Museum of Science History in Florence, Italy, has tried to create an ontology around the works and sciences of its city’s famous son, the revolutionary astronomer, mathematician and physicist Galileo Galilei (1564-1642)³. But it ran into difficulties when it came to the radical changes in theory that he created.
Institute relational database expert Andrea Scotti told the Forum: ‘Galileo’s scientific theory negates another scientific theory. This negating or development of theories was very difficult to represent in the ontology when dealing with the time factor. That is central to historical documentation, but representational time was not part of the process available to us.’
Dr Costis Dallas, the Athens chairman of the European communication and technology group Critical Publics, thought the Florence museum’s project was very ambitious. The time argument was difficult because ‘of course, it isn’t possible to represent time properly within a relational database’. But there were mechanisms - he mentioned software by the Virginia, US, IT group Telos⁴ - that better represented issues of time.
Italian National Research Council’s Nicola Guarino chipped in: ‘The CIDOC⁵ reference models have partial answers to these questions.’
Dr Dallas went on: ‘I do not believe that the whole exercise is futile, but we found that, in practice, you cannot make a subject language for everybody. It has to be for a community of users. If you provide them with a richer representation, for instance, if they can know that this is a person and this person lived in a place, then users will have a much richer experience.’
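Dr Dallas’s example of a ‘richer representation’ can be sketched in a few lines of Python. This is only an illustration, not a format any Forum participant proposed: the statements are held as subject-predicate-object triples in the spirit of RDF, and every identifier and predicate name (`is_a`, `lived_in`, `document_1` and so on) is a hypothetical label invented for the sketch.

```python
from collections import defaultdict

# Hypothetical statements in subject-predicate-object form.
# All names are illustrative assumptions, not part of any standard.
TRIPLES = [
    ("galileo", "is_a", "Person"),
    ("florence", "is_a", "Place"),
    ("galileo", "lived_in", "florence"),
    ("document_1", "is_a", "Document"),
    ("document_1", "created_by", "galileo"),
]


def describe(subject, triples):
    """Collect every (predicate -> objects) statement made about a subject."""
    facts = defaultdict(list)
    for s, p, o in triples:
        if s == subject:
            facts[p].append(o)
    return dict(facts)


def persons_and_places(triples):
    """Return (person, place) pairs, using the type statements as a filter."""
    types = {s: o for s, p, o in triples if p == "is_a"}
    return [
        (s, o)
        for s, p, o in triples
        if p == "lived_in"
        and types.get(s) == "Person"
        and types.get(o) == "Place"
    ]
```

Because the records state that one thing is a person and another a place, a system can answer a question such as ‘which persons lived where?’ without guessing from free text, which is the kind of gain Dr Dallas describes.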
3 The museum and Web site are rich resources for the life and work of Galileo: http://galileo.imss.firenze.it
4 Telos Corporation, http://www.telos.com, Ashburn, VA, US
5 CIDOC, International Committee for Documentation of the International Council of Museums, http://www.willpowerinfo.myby.co.uk/cidoc/CIDOCe (ICOM-CIDOC): Forum for documentation interests of museums and related organisations, one of 25 international committees of the International Council of Museums (ICOM).
CAPITALS AND ACRONYMS
The Forum moved on to a discussion of the science and software behind the Semantic Web. Moderator Seamus Ross started by questioning whether ‘key thinkers’ were ‘missing a fundamental point: that Web pages are dead, that database-driven Web pages are the future, that people are going to stop making Web pages and make databases’.
He recalled that James A. Hendler, co-author with Tim Berners-Lee of the Semantic Web paper and a professor in the Department of Computer Science at the University of Maryland, had recommended semantic representation as part of any Web pages. Dr Ross commented: ‘This notion of the Semantic Web is not going to work with these databases. Is this true or not?’
Bert Degenhart-Drenth, the managing director of Netherlands ADLIB Information Systems (http://www.nl.adlibsoft.com), thought it was true. But it was a practical problem that would be solved
Open Archives Initiative (OAI) Protocol for Metadata Harvesting
The Open Archives Initiative (OAI) is a mainly US-based group of people and organisations that evolved out of a need to increase access to scholarly publications through interoperable digital repositories. Support for the OAI’s goals comes from the Digital Library Federation, the Coalition for Networked Information and from an NSF Grant. One of its major achievements is an application-independent interoperability framework based on metadata harvesting, the OAI Protocol for Metadata Harvesting.
The OAI Protocol is based on the standard Web protocols HTTP and XML, and employs Dublin Core (unqualified) as metadata standard. Heritage organisations who have systems that support the OAI Protocol can expose metadata about the content in their repository, i.e. allow service providers to harvest the data for services such as search engines. In the OAI Protocol, the XML schema is used at two levels: to define the format of responses to all OAI Protocol requests, and to define the format of metadata streams embedded in the GetRecord and ListRecords responses. In both cases, the goal is to provide a mechanism for data validation.
http://www.openarchives.org
The OAI-PMH version 2.0, released in June 2002, can be found at http://www.openarchives.org/OAI/2.0/openarchivesprotocol.htm
See also John Perkins: A New Way of Making Cultural Information Resources Visible on the Web: Museums and the Open Archives Initiative. Museums and the Web Conference 2001, http://www.archimuse.com/mw2001/papers/perkins
Open Archives Forum Project, http://www.oaforum.org
The Open Archives Forum Project is a two-year Fifth Framework Programme IST accompanying measure that will run until September 2003. The Forum is building a Web-based database on OAI-related projects, software implementations and services, and supports the information exchange between OAI user communities.
Their surveys provide good insight into the status of uptake of the OAI in Europe. For an overview of the results, see S. Dobratz, B. Matthaei: Open Archives Activities and Experiences in Europe. An Overview by the Open Archives Forum. In: D-Lib Magazine, Vol. 9, No. 1, January 2003, http://www.dlib.org/dlib/january03/dobratz/01dobratz.html
Recently, one of their workshops, ‘Providing Access to Hidden Resources’ (Lisbon, December 2002), targeted the libraries and archives communities. Requirements, standards, best practice and solutions to interoperability problems of these communities were analysed and compared with the features provided by the OAI Protocol for Metadata Harvesting. The Tutorial ‘OAI and OAI-PMH for Beginners’ and other presentations can be found at http://www.oaforum.org/workshops/lisb_programme.php
by projects like the Web Services of the OpenArchives Initiative (OAI)
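For readers who want to see what harvesting under the OAI Protocol looks like in practice, the following Python sketch may help. It is a minimal illustration under stated assumptions, not a complete harvester: the repository address is a hypothetical placeholder, the sample response is hand-written and heavily abbreviated, and no network request is made. The `ListRecords` verb, the `oai_dc` metadata prefix and the XML namespaces are those of OAI-PMH 2.0 with unqualified Dublin Core.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Hypothetical repository endpoint; a real one is published by each data provider.
BASE_URL = "http://repository.example.org/oai"

# Namespaces used by OAI-PMH 2.0 responses carrying unqualified Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hand-written, abbreviated stand-in for a ListRecords response (an assumption).
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:ms-1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample manuscript</dc:title>
          <dc:creator>Unknown</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build the HTTP GET address a service provider would request."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


def harvest_titles(xml_text):
    """Extract (identifier, title) pairs from a ListRecords response."""
    root = ET.fromstring(xml_text)
    results = []
    for record in root.findall("oai:ListRecords/oai:record", NS):
        identifier = record.findtext("oai:header/oai:identifier", namespaces=NS)
        title = record.findtext(".//dc:title", namespaces=NS)
        results.append((identifier, title))
    return results
```

A real harvester would also fetch the address over HTTP, validate the response against the XML schema and follow the protocol’s mechanism for paging through long result lists; those steps are left out here to keep the sketch short.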
Several members spoke of difficulties created by dynamic Web pages generated from ASP databases. Paul Miller, the UK Interoperability Focus for the University of Bath’s UKOLN (formerly the UK Office for Library Networking) project, said the UK’s
Office of the e-Envoy⁶, the agency in charge of Britain’s e-Government programme, drove it with a Web site called Govtalk⁷. ‘One big ASP database,’ he called it. It carried the Government’s interoperability standard framework with a URL of ‘... a great big long string, something, something, dot ASP. That is how this key piece of legislation is referred to, and next week it will be something else.’
Dr van Kersen thought the new Web Ontology Language, with the transposed acronym OWL, would help. Italian ontology researcher Nicola Guarino discussed the European Commission’s On-To-Knowledge project, its RDF tool Sesame⁸ and the Ontology Inference Layer (OIL) language.
Dr Ross described the US Defense Department’s DARPA (Defense Advanced Research Projects Agency) Mark-up Language (DAML) programme and its DAML+OIL variant. Mr Degenhart-Drenth highlighted the importance of protocols used in Web Services, such as SOAP (Simple Object Access Protocol) and UDDI (Universal Description, Discovery and Integration), and also pointed to a SPECTRUM XML standard for museums that he helped write. Then there was the moviemakers’ audio-visual search standard, MPEG-7, and the SMIL (Synchronized Multimedia Integration Language). No one mentioned SHOE (Simple HTML Ontology Extensions), which was surprising as the discussion became more and more alphabetic and upper case.
ONTOLOGY TUTORIAL IN 800 WORDS
Nicola Guarino brought the discussion back on track. The ontology expert delivered a fascinating, impromptu, 800-word dissertation on ontology genetics.
Ontologies, he said, started because it was realised that controlled vocabularies, which worked well enough for limited periods, needed something extra to make them really useful. They needed clarification of intended meaning. This could be achieved in much the same way as dictionaries did it, by reference to other, more basic terms. This was, he said, the key point.
‘Ontologies can work if the basic terms are really used in a principled way. There is a hidden assumption here that it is indeed possible to express the meaning in terms of a relatively small set of primitive terms.’
He explained further: ‘There are general terms that have a universal meaning. Take the term “part”, for instance, or “set”. Or take temporal relations: “before”. Suppose I have two different periods, the Renaissance and another period, and suppose I say that this period comes “before” the other one: do you exclude the case whereby the two periods overlap or not? This is just a matter of stipulation; this is a general term that is not domain specific. I can simply stipulate exactly whether the “before” relationship between the two intervals includes the case of overlapping or not. And I can do that by means of axioms. Once you clarify the meaning of the basic terms, topological relations, mereological⁹ relations, dependence relations, these kinds of things, then you have the basic vocabulary that helps you to introduce more domain-related things. And this is what people are doing in the area of what are called “Foundational Ontologies”, and this is what I am doing. I believe this is the only way to solve the problem of semantic interoperability. So, not just controlled vocabularies, but vocabularies that are formally defined in minimal terms.’
Now it was clear, but would it be available to heritage institutions? Dr Ross wanted to know if fundamental ontologies of use to the heritage sector already existed. What would they be? Before any
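Mr Guarino’s point about stipulating the meaning of ‘before’ can be made concrete with a small Python sketch. It is an illustration only: the two functions stand for two alternative stipulations, one excluding overlap and one admitting it, and the periods and their year values are arbitrary placeholders rather than claims about any historical era.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Period:
    """A time interval; start and end are years (placeholder values only)."""
    name: str
    start: int
    end: int


def before_excluding_overlap(a, b):
    """Stipulation 1: a is 'before' b only if a has ended when b starts."""
    return a.end < b.start


def before_including_overlap(a, b):
    """Stipulation 2: a is 'before' b if it starts and ends earlier,
    even when the two periods share some years."""
    return a.start < b.start and a.end < b.end


# Arbitrary illustrative intervals: A overlaps B, and A ends before C begins.
PERIOD_A = Period("Period A", 1400, 1600)
PERIOD_B = Period("Period B", 1550, 1700)
PERIOD_C = Period("Period C", 1650, 1800)
```

With these intervals, Period A counts as ‘before’ Period B under the second stipulation but not under the first, while both agree that Period A is ‘before’ Period C. Two databases that use the same word but different stipulations will therefore disagree on some answers, which is why the choice has to be written down as an axiom.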
On-To-Knowledge
On-To-Knowledge is an IST RTD project that was completed in June 2002. The project developed tools and methods for supporting knowledge management in large and distributed organisations. The technical backbone of On-To-Knowledge was the use of ontologies for the various tasks of information integration and mediation. For the project’s many results, see their tools repository, project deliverables and publications at http://www.ontoknowledge.org/pubs.html
See also the On-To-Knowledge book ‘Towards the Semantic Web: Ontology-driven Knowledge Management’, J. Davies, D. Fensel, F. van Harmelen (eds.), John Wiley, December 2002.
6 Office of the e-Envoy: http://www.e-envoy.gov.uk
7 Govtalk: http://www.govtalk.gov.uk
8 Sesame environment: http://www.ontoknowledge.org/tools/factsheetSesame.html
9 Mereology, n. The formal study of the logical properties of the relation of part and whole. Collins English Dictionary, Third edition, Glasgow 1991.
development could begin, basic classifications and methodologies would be required to form a foundation for the work.
Italian online publisher Marco Meli, a manager of the EU’s MESMUSES (Metaphors for Science Museums) project, insisted: ‘You need a clear definition in this particular domain. What are the key words, the terms?’
Mr Guarino answered: ‘The key concepts and the key relations.’
THE ELUSIVE SEMANTIC GRAIL
By now, the Darmstadt 13 were beginning to realise they were not having much luck with their search for the Semantic Grail. Their discussion became a little tetchy.
Someone talked about ‘meta-ontology’ and another growled: ‘The term “meta” is abused.’
‘Outdated technical optimism’ was mentioned.
‘What is your alternative?’ someone else wanted to know.
‘I don’t have one.’
‘Is this the way forward?’
MESMUSES - Metaphors for Science Museums
MESMUSES is an IST RTD project that will run until July 2003. It aims at designing a general method and supporting tools to produce knowledge maps for use in self-learning environments of science museums. In the project, a knowledge map is defined as a set of related concepts and facts that is offered to learners with some guidance or suggestions on possible itineraries that they may follow to explore the knowledge space.
The method and tools developed in MESMUSES are being tested and validated by two large science museums, the Cité des Sciences et de l’Industrie in Paris and the Istituto e Museo di Storia della Scienza in Florence, which provide access to their digital catalogues. Both museums are developing knowledge maps and itineraries on different themes in Biology (Genome) and Physics (Galileo and the laws of motion).
Project Web site: http://cweb.inria.fr/Projects/Mesmuses
See also M. Meli: Knowledge Management: a new challenge for science museums. In: Cultivate Interactive, Issue 9, 7 February 2003, http://www.cultivate-int.org/issue9/mesmuses
‘What part of the problem would that solve?’
‘We will know when we have tried that out.’
‘Here we are approaching a scary field.’
Civility and peace were restored as Frank Nack, the CWI Netherlands scientist, introduced the thought: ‘There are ontologies for art and they are very old and well crafted. There are very clear rules about why they did what they did, because they have worked on them for a thousand years. What you could suggest is that we strip down to the basics for one field, say art, and apply it to all the other fields we have in cultural heritage, architecture, film, whatever, all working with very different substances.’
Seamus Ross added: ‘So we need one fundamental ontology on which we can build all the others.’
THE LUCK CHANGES
The luck of the 13 was beginning to change. The Athenian heritage informatics expert, Dr Dallas, described work among his company clients on developing ‘an upper ontology’. All the issues the Forum had been discussing, what to do about time, basic concepts, process, agents and so on, were being examined. They were beginning to develop ‘something very much like a thesaurus’ with term expansion that created sub-categories of relationships. He called it ‘generic layering’, a process that could identify the ‘generic grammar’ of relationships within a specific domain - art history, for example.
‘This is useful,’ he said. ‘This way, we can create Web systems that present an association of content for users that is meaningful to them. Let’s say, a “cultural” meaning.’
Nicola Guarino went further. He believed that the International Council of Museums’ Committee for Documentation (CIDOC) Conceptual Reference Model (CRM) was the ‘best starting point’ for the heritage community. The CIDOC CRM is the result of 10 years’ work by a standards working group. The model is under review for adoption as an International Standards Organisation (ISO) publication.
Mr Guarino was enthusiastic: ‘I am not biased on this. I am a reviewer on the CHIOS project that supports this proposal for an ISO standard, and I am amazed by the fact that this standard is more or less principled. On the one hand, the authors really strive to get these principled things and, on the other hand, they have extensively accounted for existing practice. It is the result of a large community of work. It is not perfect, but it really is a starting point for this community.’
He affirmed Dr Ross’s delighted question: ‘So this is an ontology we can borrow?’
And the Italian expert had more to add: ‘The question before was “how can we be sure that the principled things can really solve our problems?” I do not have a crisp answer, but I do have some evidence that even a tiny result on the foundational side has a high pay-off. So you do not need to solve all the foundational issues.’
The distinctions between an object and its role, or individual and classes of items, were delicate but, once understood, could lead to significant data improvement, he said, adding: ‘Take, for instance, the distinction between object and event. This is only one tiny distinction, but it is so fundamental that, once you understand it, you can save time in developing your own application ontology. Tiny conceptual progress does have a high pay-off. This is why I believe it is useful.’
There were still one or two doubts, but CULTOS project co-ordinator Wernher Behrendt tidied up with a daring stance: ‘Let me be a heretic for a second. How many of us have an operating system other than Windows? What I am saying is that standardisation often helps. Even if it is not the best standard, it does help get people working together. It is perfectly fair to define a standard for the world now. There will be a lot of discussion, but it will wind down to a few constructs. It is a better method of getting an ontology accepted than having ontologies mushrooming all over the place that must then be integrated.’
The Darmstadt 13 were pleased. They had a model. They had ‘Web Services’ stepping stones to work across. They weren’t going to fall into the trap of insisting just yet that the Semantic Web was important for the heritage sector, but they wanted an education process for unconvinced curators.
They needed someone to make the first ontology move. Seamus Ross suggested asking the J. Paul Getty Trust (http://www.getty.edu). They needed automated tools for testing ontologies. They did not want to be delivered into the entertainment industry, but saw benefit in what University of Florence Associate Professor Franco Niccolucci called ‘cultural entertainment’.
So where would the experts put their money, asked Mr Behrendt: ‘On there not being any benefits in the Semantic Web for the cultural heritage sector, or there being some benefits in building such things, whatever they may be?’
Amsterdamer Frank Nack was in no doubt: ‘It is going to happen. It will probably look very different from how we imagine it right now, but it is going to happen.’ His countryman, ADLIB chief Bert Degenhart-Drenth, thought so too: ‘We have put our money there. All our applications work with XML ...’ And Dr van Kersen agreed: ‘I would put my money on the Semantic Web.’
That seemed to make it game, set and match.
CIDOC CRM: ‘The Semantic Glue’
The ‘CIDOC object-oriented Conceptual Reference Model’ (CIDOC CRM) was developed by the ICOM/CIDOC Documentation Standards Group. Since September 2000, the CIDOC CRM has been under development as an ISO standard.
‘The CIDOC CRM is intended to promote a shared understanding of cultural heritage information by providing a common and extensible semantic framework to which any cultural heritage information can be mapped. It is intended to be a common language for domain experts and implementers to formulate requirements for information systems and to serve as a guide for good practice in conceptual modelling. In this way, it can provide the semantic glue needed to mediate between different sources of cultural heritage information, such as that published by museums, libraries and archives.’
http://cidoc.ics.forth.gr/what_is_crm.html
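The idea of a common framework ‘to which any cultural heritage information can be mapped’ can be illustrated with a short Python sketch. It is a toy under stated assumptions: the source field names and the museum record are invented, and the common terms (`title`, `creator`, `place`) are hypothetical stand-ins, not actual CIDOC CRM classes or properties, which are considerably richer.

```python
# Two hypothetical source records with institution-specific field names.
MUSEUM_RECORD = {"object_title": "Telescope", "maker": "Unknown", "made_in": "Florence"}
LIBRARY_RECORD = {
    "titel": "Der Naturen Bloeme",
    "auteur": "Jacob van Maerlant",
    "herkomst": "Flanders",
}

# One mapping per source, from local field name to a shared term.
# The shared terms are illustrative assumptions, not CIDOC CRM identifiers.
MUSEUM_MAPPING = {"object_title": "title", "maker": "creator", "made_in": "place"}
LIBRARY_MAPPING = {"titel": "title", "auteur": "creator", "herkomst": "place"}


def to_common(record, mapping):
    """Re-express a source record in the shared vocabulary."""
    return {mapping[field]: value for field, value in record.items() if field in mapping}


def search(records, term, value):
    """Query all mapped records through the shared vocabulary only."""
    return [r for r in records if r.get(term) == value]


# After mapping, one query works across both sources.
MERGED = [
    to_common(MUSEUM_RECORD, MUSEUM_MAPPING),
    to_common(LIBRARY_RECORD, LIBRARY_MAPPING),
]
```

Each institution keeps its own database and only writes a mapping once; the ‘glue’ is the shared set of terms that every mapping points to.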
CHIOS - Cultural Heritage Interchange Ontology Standardization project
Since June 2001, the work of the CIDOC CRM Special Interest Group has been supported by CHIOS, a two-year project which receives funding from the Fifth Framework IST Programme. The CHIOS consortium forms an integral part of the CIDOC CRM Special Interest Group which, by organising shared meetings, represents the interests and requirements of the cultural heritage community to the ISO Working Group (ISO/TC46/SC4/WG9).
http://cidoc.ics.forth.gr/chios_iso.html
Annotation & Authoring
‘How to annotate and have fun’, http://annotation.semanticweb.org/faq/faq2, and other papers. See also http://annotation.semanticweb.org/tools, Institute for Applied Informatics and Formal Description Methods, University of Karlsruhe, Germany.
The Annotea Project, a ‘Live Early Adoption and Demonstration (LEAD)’ project of the World Wide Web Consortium (W3C): collaboration environment with shared annotations. www.w3.org/2001/Annotea
CIDOC CRM
The CIDOC Conceptual Reference Model was an important reference point in the Forum discussion (see also the information box on the CIDOC CRM in the Summary). CIDOC, the Comité International pour la Documentation, is part of the International Council for Museums (ICOM). Its Web site can be found at http://cidoc.ics.forth.gr
A report on CIDOC’s work on the CRM is provided in ‘The CIDOC Conceptual Reference Model: A Standard for Communicating Cultural Contents’ by Nick Crofts, Martin Doerr and Tony Gill, in Cultivate Interactive, Issue 9, February 2003, http://www.cultivate-int.org/issue9/chios
See also their tutorials at http://cidoc.ics.forth.gr/tutorials.html
DAML - DARPA Agent Markup Language
The Defense Advanced Research Projects Agency (DARPA) is the central research and development organisation for the US Department of Defense. Its DAML programme is developing a language and tools to facilitate Semantic Web concepts. http://www.daml.org
‘Why Use DAML?’, Adam Pease, white paper, Teknowledge, 10 April 2002, http://www.daml.org/2002/04/why.html
DAML+OIL
DAML+OIL is a product of the DARPA Joint United States/European Union ad hoc Agent Markup Language Committee. The committee created a language with the best features of SHOE, DAML, OIL and several other markup approaches. It is a Web ontology language (latest release March 2001) expected to provide a basis for future Web standards for ontologies. See http://www.daml.org/2001/03/daml+oil-index.html
Genesis – The Creation: Stars and Fishes
SEMANTIC WEB TERMS
AND READING LIST A-XCompiled by Guntram Geser and Michael Steemson
I n the Summary we have provided informa-tion boxes on projects but not on the manySemantic Web standards technologies etc
mentioned in the Forum discussionThe projectsincluded are only a small fraction of many ongoingactivities and are related mainly to the cultural andscientific heritage community Links to manyimportant Semantic Web development projectscan be found at their community portalhttpwwwsemanticweborg
The following guide points to resources and readings on terms mentioned in the Forum Summary. It is not intended to provide a comprehensive list of Semantic Web materials. Rather, it offers different entry points and levels to this topic.
OIL - Ontology Inference Layer
OIL is a proposal for a Web-based representation and inference layer for ontologies. It uses a layered approach to defining a standard ontology language. Each layer adds functionality and complexity to the previous layer. This is done such that machines that can only process a lower layer can still partially understand high-level ontologies. See http://www.ontoknowledge.org/oil
A white paper on OIL functions: ‘An informal description of Standard OIL and Instance OIL’, by the On-To-Knowledge group led by the Department of Computer Science, University of Manchester, UK, 28 November 2000, www.ontoknowledge.org/oil/downl/oil-whitepaper.pdf
Ontologies
Ontology research: Laboratory for Applied Ontology (LOA), Institute of Cognitive Sciences and Technology (ISTC), http://www.ladseb.pd.cnr.it/infor/ontology/ontology.html
Dieter Fensel: Ontologies: A Silver Bullet for Knowledge Management and Electronic Commerce. New York: Springer, 2001.
‘Ontology Infrastructure for the Semantic Web’, includes detailed subject bibliography. WonderWeb Project, Department of Computer Science, Victoria University of Manchester, UK, http://wonderweb.semanticweb.org/deliverables/documents/D15.pdf
Standard Upper Ontology (SUO): An upper ontology for data interoperability, information search and retrieval, automated inferencing and natural language processing. IEEE Standard Upper Ontology (SUO) Working Group, http://suo.ieee.org
OWL - Web Ontology Language
The Web Ontology Language is a semantic markup language for publishing and sharing ontologies on the World Wide Web. OWL is developed as a vocabulary extension of the Resource Description Framework (RDF) and is derived from the DAML+OIL Web Ontology Language. For the development of this language see the documents of the Web Ontology (WebOnt) Working Group, http://www.w3.org/2001/sw/WebOnt
See also OWL Web Ontology Language Reference (W3C Working Draft, 31 March 2003) at http://www.w3.org/TR/owl-ref and OWL Web Ontology Language Guide (W3C Working Draft, 31 March 2003), http://www.w3.org/TR/owl-guide
For an understanding of the goals, requirements and usage scenarios for a Web ontology language see ‘Web Ontology Language (OWL) Use Cases and Requirements’ (W3C Working Draft, 31 March 2003), http://www.w3.org/TR/webont-req
RDF - Resource Description Framework
See Cultural Heritage Semantic Web Example & Primer, pp. 32-34.
Semantic Web
‘The Semantic Web’, Tim Berners-Lee with James Hendler and Ora Lassila, Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html
‘Enhanced Science and the Semantic Web’, J.A. Hendler, in Science magazine, Volume 299, Number 5606, 24 January 2003, pp. 520-521.
‘Peer-to-Peer: The Infrastructure for the Semantic Web’, Stanford University: The Semantic Web as the next evolutionary step of the Internet, http://p2p.semanticweb.org
The Semantic Web Community Portal, SemanticWeb.org, is currently operated by three research groups: the Onto-Agents and Scalable Knowledge Composition (SKC) Research Group at Stanford University, the Ontobroker-Group at the University of Karlsruhe, Germany, and the Protégé Research Group at Stanford University. http://www.semanticweb.org
The W3C Semantic Web Activity Statement explains the consortium’s plans in the areas of enabling standards (driven by the RDF Core and Web Ontology Working Groups), education and outreach (RDF Interest Group), as well as co-ordination and advanced development. http://www.w3.org/2001/sw/Activity
See also Kim Veltman’s warning of what he sees to be too narrow a definition of the Semantic Web, one that will not allow the historical dimension, the richness of cultural expression, the unique and the diversity of interpretations to be adequately dealt with. Cf. K. Veltman: Challenges for a Semantic Web (July 2002), http://www.cultivate-int.org/issue7/semanticweb
SHOE – Simple HTML Ontology Extensions
SHOE was one of the first ontology-based markup languages developed for use on the World Wide Web. It is a small extension to HTML that allows Web page authors to annotate their Web documents with machine-readable knowledge. See http://www.cs.umd.edu/projects/plus/SHOE
SMIL - Multimedia on the Web
The Synchronized Multimedia Integration Language (SMIL, pronounced ‘smile’) enables authoring of interactive audiovisual presentations. SMIL is typically used for ‘rich media’/multimedia presentations which integrate streaming audio and video with images, text or any other media type. SMIL is an HTML-like language and may be written using a simple text editor. W3C Synchronized Multimedia, http://www.w3.org/AudioVideo
SPECTRUM-XML DTD
SPECTRUM (Standard Procedures for Collections Recording Used in Museums) was created by the mda (http://www.mda.org.uk). It is a guide to good practice for museum documentation that describes procedures for documenting objects and the processes they undergo, as well as the information that needs to be recorded to support the procedures. For SPECTRUM an XML Document Type Definition has been produced, which serves as a system-neutral interchange format for museum data.
‘SPECTRUM: The UK Museum Documentation Standard’ is available in its second edition, see http://www.mda.org.uk/spectrum.htm
For a description of the creation, structure and deployment of the SPECTRUM-XML DTD see Bert Degenhart-Drenth: Building on the mda SPECTRUM-XML DTD for Collections Management Data Interchange. Museums and the Web Conference 2001, http://www.archimuse.com/mw2001/papers/degenhart/degenhart.html
Web Services
Although the concept of Web Services is not a new one, the definition of Web Services standards allows the wider use of these services. Currently, three main protocols are used in the context of Web Services: UDDI (Universal Description, Discovery and Integration), a registry system to find resources and Web Services; WSDL (Web Service Description Language), an interface description language; and SOAP (Simple Object Access Protocol), the communication protocol for Web Services. All three protocols are based on XML.
For a description of the many ways in which XML can enhance Web Services see http://www.w3.org/2002/ws/Activity
World Wide Web
Looking back as well as into the future: ‘Weaving the Web: The Original Design and Ultimate Destiny of the World Wide Web by Its Inventor’, Tim Berners-Lee with Mark Fischetti, Harper San Francisco, 1999.
XML – Extensible Markup Language
See Cultural Heritage Semantic Web Example & Primer, pp. 27-30.
For the work done at W3C within the XML activity see XML Working Groups, http://www.w3.org/XML; the XML Specifications can be found at http://www.w3.org/XML/Core
‘The bane of my existence is doing things that I know the computer could do for me.’ – Dan Connolly, The XML Revolution, October 1998, http://www.nature.com/nature/webmatters/xml
SEMANTIC WEB SHOULD BE BASED ON WELL-FOUNDED ONTOLOGIES
AN INTERVIEW WITH NICOLA GUARINO, LABORATORY OF APPLIED ONTOLOGY, ITALY
By Joost van Kasteren

‘The Semantic Web as it is advocated by people like Tim Berners-Lee and James Hendler does not take enough advantage of the experience built up in knowledge engineering and conceptual modelling. There is this anarchistic idea of the Web as a place where everyone can do his or her own thing. I have no problem with that; a lot of people are able to find what they want on the Web. But if you want real interoperability, with search engines that can grasp the intended meaning of information, that approach falls short. To create a real Semantic Web we have to develop and use well-founded generic ontologies based on linguistics and logics.’

Nicola Guarino has clear views on the Semantic Web and its development. He is a senior researcher at the Institute for Cognitive Sciences and Technologies in Italy, where he leads the Laboratory for Applied Ontology. Since 1991 he has played an active role in the Artificial Intelligence community in promoting the interdisciplinary study of ontological foundations of knowledge engineering and conceptual modelling. Guarino: ‘In our Laboratory the focus is on content and not so much on representation. The use of ontologies is unavoidable when referring to content. People do it implicitly all the time when they are communicating and trying to understand each other. If we want machines to understand each other, in other words real interoperability, we need to make these ontologies explicit in an unambiguous way.’

An ontology is a hierarchical description of the relations between concepts in a certain domain, plus an unambiguous description of the concepts themselves. As they are created for a certain domain, ontologies often fail to be interoperable because of the ambiguity that results from the use of the same terms for different concepts (and vice versa) between different domains. The term ‘net’, for instance, has quite a different meaning for Web designers and fishermen. That is why there is a need for well-founded generic ontologies. An example of a concept that a generic ontology has to pin down is the term ‘part’, which can have different meanings both within a domain and between domains. For instance, the violist plays a part in the orchestra. His finger is part of him. Can his finger be part of the orchestra? According to Guarino, this is a genuine ontological problem that can only be solved by giving an unambiguous meaning to the term ‘part’.

Another example cited by Guarino is the term ‘in’. What exactly are you describing when you say the spoon is in the cup? Does it mean that the spoon is totally embedded in the cup, or is it only partly in the cup? Guarino: ‘These examples seem trivial, but if you want real interoperability between different knowledge domains, you will have to prevent the problems that come with the ambiguity of day-to-day language.’

In this respect Guarino thinks it is a drawback that computer science curricula scarcely ever contain an introduction to ontological foundations of conceptual modelling. ‘Students learn all about Java, HTML and C++, and all the other languages you care to name, and they also learn how to use these. But when they graduate they hardly know a thing about formal ontology. I really think people should know more about the work on ontology that has been done in philosophy. It is certainly not much harder to acquire than, say, studying differential equations or learning how to use Java.’

Developing well-founded generic ontologies seems an enormous job, but the task is smaller than it appears. Guarino: ‘I would say that a few dozen would get you on the way nicely. But you have to take the fundamental route. At the moment, development of the Semantic Web is driven by the need for short-term results. Hence interoperability is realised by putting the right tags on the information. That is not what I call semantics; that is syntax. XML and RDF are very useful for this, but they fall short when you want to create a real Semantic Web.’
Laboratory of Applied Ontology: http://ontology.ip.rm.cnr.it
A CULTURAL HERITAGE SEMANTIC WEB EXAMPLE & PRIMER
By Guntram Geser

The Semantic Web, according to a statement of its well-known advocate Tim Berners-Lee, is a vision of ‘a distributed machine which should function so as to perform socially useful tasks’.[1] This machine should allow intelligent software agents to understand semantic relationships between Web resources in order to seek relevant information and perform transactions for humans.

Contrasted with the existing human-readable Web, the Semantic Web is envisaged as a Web of machine-readable data that will be based on ‘languages for expressing information in a machine processable form’.[2] Key to an understanding of the Semantic Web, therefore, is how these languages function: how information is expressed so that computers can automatically process Web sources and assist in making the Web more useful for humans. The aim of this chapter is to provide an overview of the Semantic Web concept by describing its general architecture, i.e. the interplay of its languages.

The chapter has two interrelated parts. Part 1 describes a Finnish project that strives to build the foundations for the “Finnish Museums on the Semantic Web” (FMS), a future semantic museum portal. This part consists of the information boxes on the following pages, which briefly describe the necessary elements and steps in the set-up of the FMS system. It is recommended to start by reading this description (see also graphic 3 on page 36, which provides an overview of the set-up of the FMS system). It should be helpful in gaining a general understanding of how semantic interoperability of, and new ways of interacting with, semantically marked-up cultural heritage information can be realised.

Part 2, the texts below the information boxes, is a primer that explains terms used in part 1 which represent core elements of the Semantic Web architecture, and provides illustrative examples. The explanations are not intended to give in-depth definitions of these elements; such definitions are provided in the relevant W3C specifications. The examples have been kept as simple as possible, but build on each other. In this way we will develop a (fictitious) Website, http://www.m-i.org, that provides semantically enhanced access to such marvellous medieval images as the ones we have used to illustrate this Thematic Issue.

How to Make Collection Metadata of Museums Semantically Interoperable on the Web – The “Finnish Museums on the Semantic Web” (FMS)

The Semantic Web concept is visionary, and there are dedicated people, also in the heritage sector, who are trying to make it a reality. In our example, a group of researchers and technology developers who work at the University of Helsinki and the Helsinki Institute for Information Technology are translating the Semantic Web vision for a future semantic museum portal.

The group’s two-year project will run until spring 2004 and is being carried out in co-operation with, and with funding from, major organisations including Espoo City Museum, Helsinki University Museum, National Board of Antiquities, Nokia, TietoEnator and the National Technology Agency (TEKES).

[1] Tim Berners-Lee: Interpretation and Semantics on the Semantic Web (1998), http://www.w3.org/DesignIssues/Interpretation.html
[2] Tim Berners-Lee: Semantic Web Road Map (1998), http://www.w3.org/DesignIssues/Semantic.html
[3] Robert DuCharme, http://lists.xml.org/archives/xml-dev/200211/msg00190.html
The major goals of the project are to make collection metadata, which stem from heterogeneous databases, semantically interoperable on the Web, and to provide facilities for semantic browsing and searching in the combined knowledge base of the participating museums.

The project’s vision is called “Finnish Museums on the Semantic Web” (FMS), and its architecture allows for all Finnish museums to join in. However, in an approach of starting small but ambitious, the project is at present using the collection databases of two museums, the Espoo City Museum and the National Museum of Finland. Furthermore, the implementation is currently restricted to only one part of the collections: textiles.

In order to reach the FMS system’s goal of making the museums’ metadata semantically interoperable on the Web, the data must be harmonised on the syntactic and semantic level. For this harmonisation the Extensible Markup Language (XML) and the Resource Description Framework (RDF) are being used, of which RDF is the key language for achieving semantic interoperability of the heterogeneous sets of metadata.
RDF and Metadata - A Natural Fit
An observer of the diffusion of the Resource Description Framework (RDF) into various domains, Robert DuCharme, has commented: ‘I still find it a little ironic that while RDF has gotten so much publicity as a technology for warm and fuzzy AI (Artificial Intelligence) pie-in-the-sky technology, it’s gotten most of its traction in the mundane world of metadata.’[3]

Yet, given the importance of metadata for the Semantic Web vision in general, it does not come as a surprise that metadata of key information communities belong to the first of RDF’s intended uses. RDF seems to gain momentum in particular among the library and other communities that use Dublin Core.

The current W3C RDF Primer (Working Draft, 23 January 2003), edited by Frank Manola and Eric Miller, labels RDF as ‘an ideal representation for Dublin Core information’ and describes Dublin Core as one of their ‘RDF in the field’ examples (cf. http://www.w3.org/TR/rdf-primer).

At the Dublin Core Metadata Initiative (DCMI), ‘Expressing Simple Dublin Core in RDF/XML’ was announced as a DCMI Recommendation in October 2002, ‘the first in a series of recommendations for encoding Dublin Core metadata using mainstream Web technologies’, i.e. XML/RDF/XHTML. ‘Expressing Qualified Dublin Core in RDF/XML’ is currently a Proposed Recommendation. Cf. http://dublincore.org/groups/architecture
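To give a feel for what Simple Dublin Core in RDF/XML looks like, the following minimal Python sketch builds such a record with the standard library. The two namespace URIs are the published RDF and Dublin Core element-set namespaces; the resource URI under the fictitious m-i.org site and the function name `dublin_core_rdf` are assumptions made only for this illustration, not part of the DCMI recommendation.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"


def dublin_core_rdf(resource_uri, properties):
    """Serialise (element name, value) pairs as a simple Dublin Core RDF/XML record."""
    ET.register_namespace("rdf", RDF)
    ET.register_namespace("dc", DC)
    root = ET.Element(f"{{{RDF}}}RDF")
    # One rdf:Description per described resource, identified by rdf:about.
    description = ET.SubElement(
        root, f"{{{RDF}}}Description", {f"{{{RDF}}}about": resource_uri}
    )
    for name, value in properties:
        ET.SubElement(description, f"{{{DC}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical resource URI on the fictitious m-i.org Website.
record = dublin_core_rdf(
    "http://www.m-i.org/images/image5kb78d38i",
    [
        ("title", "Eve emerges from Adam's body"),
        ("creator", "Alexander Master"),
        ("date", "circa 1430"),
    ],
)
```

Printing `record` shows an `rdf:RDF` element containing one `rdf:Description` with `dc:title`, `dc:creator` and `dc:date` children.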
Syntactic Transformation 1: Creating the XML Documents
In the FMS system, the Extensible Markup Language (XML) is used as the data transfer format. This transfer format enables the system to make use of the data originally stored in the museums’ heterogeneous collection databases. Therefore, each museum participating in the FMS initiative provides the relevant collection data as an XML document repository.

In a process of syntactic harmonisation, the data from a museum’s collection database are retrieved and transformed to an XML format conforming to the XML Schema of the FMS initiative.

The data to be published are read from the database through a ‘view’, which helps create the XML format. The view is a queryable interface: a virtual table that results from an SQL query, which may join multiple tables of the database. Through the view the data are queried so that the rows of the tables are grouped by collection item. For each item, the set of rows is combined into a single XML document.
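The view-to-XML step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two source tables, their contents and the function names are hypothetical, SQLite stands in for whatever database a museum actually runs, and the element names follow the fictitious m-i.org example of this primer rather than the real FMS format.

```python
import sqlite3
import xml.etree.ElementTree as ET
from itertools import groupby

MI_NS = "http://www.m-i.org/images"  # fictitious namespace of this primer


def build_demo_database():
    """Create a tiny in-memory database with a view that joins two tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE items (item_id TEXT PRIMARY KEY, type TEXT, subject TEXT,
                            iconclass TEXT, manuscript TEXT, place TEXT, year TEXT);
        CREATE TABLE creators (item_id TEXT, creator TEXT);
        INSERT INTO items VALUES ('image5kb78d38i', 'Column Miniature',
            'Eve emerges from Adam''s body', '71A3421', 'Historic Bible',
            'Utrecht', 'circa 1430');
        INSERT INTO creators VALUES ('image5kb78d38i', 'Alexander Master');
        CREATE VIEW ITEM_VIEW AS
            SELECT i.item_id, i.type, i.subject, i.iconclass, c.creator,
                   i.manuscript, i.place, i.year
            FROM items i JOIN creators c ON c.item_id = i.item_id;
    """)
    return conn


def rows_to_xml_documents(conn):
    """Query the view, group its rows by item and build one XML document per item."""
    ET.register_namespace("mi", MI_NS)
    cur = conn.execute(
        "SELECT item_id, type, subject, iconclass, creator, manuscript, place, year "
        "FROM ITEM_VIEW ORDER BY item_id"
    )
    columns = [d[0] for d in cur.description]
    documents = {}
    for item_id, rows in groupby(cur, key=lambda row: row[0]):
        rows = list(rows)
        root = ET.Element(f"{{{MI_NS}}}image", {"image_id": item_id})
        for index, name in enumerate(columns[1:], start=1):
            seen = []
            for row in rows:
                value = row[index]
                # A join can repeat values across rows; emit each distinct value once.
                if value is not None and value not in seen:
                    seen.append(value)
                    ET.SubElement(root, f"{{{MI_NS}}}{name}").text = value
        documents[item_id] = ET.tostring(root, encoding="unicode")
    return documents


if __name__ == "__main__":
    print(rows_to_xml_documents(build_demo_database())["image5kb78d38i"])
```

Grouping on `item_id` is what turns several joined rows into one document per collection item; with a single row per item, as in the demo data, each column simply becomes one element.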
XML
XML is a markup language for describing data. It is a language created to allow anyone to design the structure of their own documents. An XML document contains text that consists of markup in the form of tags and plain text between them, the latter being just pure information (for example, <creator>Alexander Master</creator>). XML tags are not predefined; everyone can define his or her own tags.
FMS Documents
A short presentation is provided in Eero Hyvönen et al.: Cultural Semantic Interoperability on the Web: Case Finnish Museums Online, http://iswc2002.semanticweb.org/posters/hyvonen_a4.pdf; for detailed descriptions see Vilho Raatikka, Eero Hyvönen: Ontology-based Semantic Metadata Validation, and Hyvönen, Eero et al.: Semantic Interoperability on the Web: Case Finnish Museums Online. Both texts can be found in: Towards the Semantic Web and Web Services. Proceedings of the XML Finland 2002 Conference, http://www.cs.helsinki.fi/u/eahyvone/xmlfinland2002/ProceedingsXML2002-final.pdf
ITEM_VIEW (columns of the database view): item_id, type, subject, iconclass, creator, manuscript, place, year
XML shares the syntax and bracketed tags of the well-known HyperText Markup Language (HTML), but XML serves a different goal. While HTML is used to define the layout of pages on the WWW, XML is used to define the content of documents, for example to specify that an area of text is the name of a creator.

XML allows for creating markup (e.g. <creator>) that seems to carry some semantics. However, for a computer, a tag like <creator> carries as much semantics as a tag like <H1>. A computer simply does not know what a creator is and how the concept creator is related to other concepts (e.g. manuscript). For an XML processor, <H1> and <creator> or <manuscript> are all equally (and totally) meaningless. XML is all about describing data; on its own it does not do anything. There needs to be a processing program that uses the markup to interpret the various elements.
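The point that tags carry no meaning for the parser can be made concrete with a small Python sketch. The record and the `LABELS` table are invented for this illustration; the parser hands back tag names as plain strings, and any interpretation comes from the processing program.

```python
import xml.etree.ElementTree as ET


def describe(xml_text, labels):
    """Return (label, text) pairs; `labels` maps tag names to human meanings."""
    root = ET.fromstring(xml_text)
    return [(labels.get(child.tag, "unknown element"), child.text) for child in root]


# To the parser, <creator> and <H1> are just two element names.
record = "<image><creator>Alexander Master</creator><H1>Alexander Master</H1></image>"

# The meaning lives in this application-side table, not in the markup itself.
LABELS = {"creator": "person who made the image"}

result = describe(record, LABELS)
```

Only the tag that the application has been told about receives an interpretation; the other one is parsed just as happily but stays ‘unknown’.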
The graphic below illustrates the database rows to XML process as described in the info box. It provides a very simple example of an XML document that describes some data for one of the medieval column miniatures from the Koninklijke Bibliotheek, The Hague, which we were permitted to use for illustrating this Thematic Issue. It includes the Iconclass classification for this image: 71A3421 Eve emerges from Adam’s body (for the hierarchical path of this classification see the section on ontologies).

Short explanations for the XML document (image5kb78d38i.xml) shown in the graphic below:

A well-formed XML document is one that conforms to the XML syntax rules, of which we would like to highlight the following:

(1) The document should begin with the XML declaration, which defines the XML version and the character encoding used in the document. (The declaration is optional in XML 1.0, but if present it must come first.) In the example below we use <?xml version="1.0" encoding="ISO-8859-1"?>, i.e. the document conforms to the 1.0 specification of XML and uses the ISO-8859-1 (Latin-1/West European) character set.

(2a, 10) The XML document must contain a single tag pair to define a root element, in our example <mi:image> ... </mi:image>.
Graphic 1: Database rows to XML process

Database rows, grouped by item:
‘image5kb78d38i’, ‘Column Miniature’, ‘Eve emerges from Adam’s Body’, ‘71A3421’, ‘Alexander Master’, ‘Historic Bible’, ‘Utrecht’, ‘circa 1430’

Rows to XML process

XML document, item data from database rows (image5kb78d38i.xml):
(1) <?xml version="1.0" encoding="ISO-8859-1"?>
(2) <mi:image xmlns:mi="http://www.m-i.org/images"
        image_id="image5kb78d38i">
(3)   <mi:type>Column Miniature</mi:type>
(4)   <mi:subject>Eve emerges from Adam's body</mi:subject>
(5)   <mi:iconclass>71A3421</mi:iconclass>
(6)   <mi:creator>Alexander Master</mi:creator>
(7)   <mi:manuscript>Historic Bible</mi:manuscript>
(8)   <mi:place>Utrecht</mi:place>
(9)   <mi:year>circa 1430</mi:year>
(10) </mi:image>
(2b) Namespace: Since element names in XML are not fixed, name conflicts can occur when different documents use the same names to describe different types of elements. To prevent such conflicts, a unique namespace should be defined using a Uniform Resource Identifier (URI). An XML namespace is a collection of names that are used as element types and attribute names (cf. http://www.w3.org/TR/REC-xml-names). Our namespace declaration in the start tag of the root element is xmlns:mi="http://www.m-i.org/images".

The namespace prefix mi (for medieval images) functions as a placeholder for the namespace name. It needs to show up in all element tags (e.g. <mi:type> ... </mi:type>).

(2c) image_id="image5kb78d38i": This is the id attribute, which contains the unique identifier of the data source record.

(3-9) All other elements must be within the root element and can themselves have sub-elements (child elements), which must be properly nested within their parent element. Our elements do not have sub-elements.

Other syntax rules are, for example: all elements must have a closing tag; start tags must match end tags and, because XML tags are case sensitive (i.e. the tag <mi:Creator> is different from the tag <mi:creator>), they must also be written with the same case; all attribute values must be within quotation marks (e.g. "image5kb78d38i").
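The rules above can be tried out with any XML parser. The following Python sketch, using only the standard library, parses the example document, shows how the mi prefix is resolved to the namespace name, and checks that a document with mismatched tag case is rejected. The helper name `is_well_formed` is ours; the document is the fictitious m-i.org example.

```python
import xml.etree.ElementTree as ET

MI = "http://www.m-i.org/images"

DOCUMENT = """<?xml version="1.0" encoding="ISO-8859-1"?>
<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">
  <mi:type>Column Miniature</mi:type>
  <mi:subject>Eve emerges from Adam's body</mi:subject>
  <mi:iconclass>71A3421</mi:iconclass>
  <mi:creator>Alexander Master</mi:creator>
  <mi:manuscript>Historic Bible</mi:manuscript>
  <mi:place>Utrecht</mi:place>
  <mi:year>circa 1430</mi:year>
</mi:image>"""


def is_well_formed(data):
    """Return True if the parser accepts `data` as well-formed XML."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False


# Parse from bytes so that the declared ISO-8859-1 encoding is honoured.
root = ET.fromstring(DOCUMENT.encode("iso-8859-1"))

# The parser replaces the prefix by the namespace name:
# root.tag is "{http://www.m-i.org/images}image".
creator = root.find("mi:creator", {"mi": MI})

# Breaking the case rule: the end tag no longer matches the start tag.
bad = DOCUMENT.replace("</mi:creator>", "</mi:Creator>")
```

Here `is_well_formed(bad.encode("iso-8859-1"))` returns False, as does a document whose attribute value is not quoted, such as `b"<a id=1></a>"`.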
Syntactic Transformation 2: The XML Schema
In order to allow for syntactic harmonisation, the XML documents of the museums should conform to the XML Schema of the FMS initiative. Therefore, the museums use the initiative’s XML Schema when they create their XML documents, and validate the documents against it. If the documents are valid, the process can continue to the semantic level.
XML Schema
The XML Schema defines the building blocks of an XML document, including:
- elements and attributes that can appear in a document;
- which elements are child elements, as well as their order and number;
- whether an element is empty or can include text;
- the data types for elements and attributes;
- as well as default and fixed values for elements and attributes.
XML with an XML Schema is designed to be self-descriptive. One of the greatest strengths of XML Schema is that it allows for data typing. The most common data types are xs:string, xs:decimal, xs:integer, xs:boolean, xs:date and xs:time. In the example below, which is the XML Schema for the XML document (image5kb78d38i.xml) shown in graphic 1, we only use the data type xs:string. This data type is used for values that contain character strings.
(1) <?xml version="1.0"?>
(2a) <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
(2b)     targetNamespace="http://www.m-i.org/images"
(2c)     xmlns="http://www.m-i.org/images"
(2d)     elementFormDefault="qualified">
(3) <xs:element name="image">
(4)   <xs:complexType>
(5)     <xs:sequence>
(6)       <xs:element name="type" type="xs:string"/>
(7)       <xs:element name="subject" type="xs:string"/>
(8)       <xs:element name="iconclass" type="xs:string"/>
(9)       <xs:element name="creator" type="xs:string"/>
(10)      <xs:element name="manuscript" type="xs:string"/>
(11)      <xs:element name="place" type="xs:string"/>
(12)      <xs:element name="year" type="xs:string"/>
(13)    </xs:sequence>
(14)    <xs:attribute name="image_id" type="xs:string" use="required"/>
(15)  </xs:complexType>
(16) </xs:element>
(17) </xs:schema>

Short explanations:
(1) The XML declaration, which states that the document conforms to the 1.0 specification of XML.
(2a) Determines that the elements and data types that are used to construct the schema come from the W3C’s XML Schema namespace. Consequently, each of the elements and data types in the schema has the prefix xs:, which identifies them as belonging to the vocabulary of the XML Schema language rather than the vocabulary of our (fictitious) organisation M-i.org.
(2b) Indicates that the elements defined by this schema come from our http://www.m-i.org/images namespace.
(2c) Is our default namespace.
(2d) Demands that any elements used by an XML document which were declared in this schema must be namespace qualified.
(3, 16) The parent element for the data typing of the image descriptions we provide at http://www.m-i.org/images. It is (4) defined as an xs:complexType, i.e. it contains child elements (6-12), which are (5, 13) surrounded by an xs:sequence element that defines an ordered sequence of these elements.
(6-12) The child elements, which in our example are simple types because they do not contain other elements. They define various elements of our XML documents, e.g. “creator”, to be of the data type xs:string.
(14) Furthermore, the Schema determines that for the element xs:element name="image" there is a required attribute image_id of the data type xs:string (for example, image5kb78d38i).
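What validating a document against this schema involves can be hinted at with a deliberately partial Python check. It is a sketch under clear assumptions, not an XML Schema validator: it reads only the root element name, the declared child sequence and the required attributes from the schema above, ignores data types and occurrence constraints, and the function name `check_against_schema` is invented here. A real project would rely on a schema-aware validator.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

SCHEMA = """<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://www.m-i.org/images"
           xmlns="http://www.m-i.org/images"
           elementFormDefault="qualified">
  <xs:element name="image">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="type" type="xs:string"/>
        <xs:element name="subject" type="xs:string"/>
        <xs:element name="iconclass" type="xs:string"/>
        <xs:element name="creator" type="xs:string"/>
        <xs:element name="manuscript" type="xs:string"/>
        <xs:element name="place" type="xs:string"/>
        <xs:element name="year" type="xs:string"/>
      </xs:sequence>
      <xs:attribute name="image_id" type="xs:string" use="required"/>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

DOCUMENT = """<?xml version="1.0"?>
<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">
  <mi:type>Column Miniature</mi:type>
  <mi:subject>Eve emerges from Adam's body</mi:subject>
  <mi:iconclass>71A3421</mi:iconclass>
  <mi:creator>Alexander Master</mi:creator>
  <mi:manuscript>Historic Bible</mi:manuscript>
  <mi:place>Utrecht</mi:place>
  <mi:year>circa 1430</mi:year>
</mi:image>"""


def check_against_schema(document_text, schema_text):
    """Return a list of problems; an empty list means the partial check passed."""
    schema = ET.fromstring(schema_text)
    ns = schema.get("targetNamespace")
    top = schema.find(f"{XS}element")
    complex_type = top.find(f"{XS}complexType")
    # Expected children, in order, qualified with the target namespace (2d).
    expected = [
        f"{{{ns}}}{e.get('name')}"
        for e in complex_type.findall(f"{XS}sequence/{XS}element")
    ]
    required = [
        a.get("name")
        for a in complex_type.findall(f"{XS}attribute")
        if a.get("use") == "required"
    ]
    root = ET.fromstring(document_text)
    problems = []
    if root.tag != f"{{{ns}}}{top.get('name')}":
        problems.append("unexpected root element")
    if [child.tag for child in root] != expected:
        problems.append("child elements differ from the declared sequence")
    for name in required:
        if name not in root.attrib:
            problems.append(f"missing required attribute {name}")
    return problems
```

For the example document the check returns an empty list; removing the `<mi:place>` element or the `image_id` attribute makes it report a problem.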
Major Benefits of XML
In the context of the Semantic Web, XML provides an interoperable syntactical foundation upon which solutions to the issues of representing relationships and meaning can be built. We also want to highlight the many benefits of XML that are adding to its rapid uptake in the first place, and might in the longer term be supportive in realising the Semantic Web vision on a broader scale.

XML is one of the most important standards developments in recent years. It is an international, universal, non-system and non-application specific data exchange standard. XML is international because it employs Unicode. This means that there is no restriction to the western alphabet, but Arabic, Chinese, Greek, Hebrew, Thai, etc. can be easily integrated.

XML is non-system specific because it is an open standard set by the World Wide Web Consortium (W3C). As such, there is no owner of XML. All the major software suppliers support it; it can be used on any computing platform, from Windows and MacOS to Linux. This makes it easier for organisations to change systems or combine different systems.

XML is also non-application specific, i.e. it can be used in various applications such as data exchange, data harvesting, Web site management, etc. XML is gaining ever-wider acceptance in many application domains, including in particular the cultural heritage community.

Bear in mind also that the major collection management software producers have implemented support for XML in their systems, enabling, for example, the integration of data from different collections and their combination over the Web.

XML allows for multi-channel publishing, i.e. with XML it is easy to produce different products or services from digital cultural heritage assets. Once the data are structured in XML, they can be displayed across a variety of media using an associated style sheet that contains the display information.
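The idea of multi-channel publishing can be sketched as follows. Note the substitution: instead of a style sheet, this Python example uses two small rendering functions over the same XML record, one producing an HTML fragment and one a plain-text line. The record is a shortened version of the fictitious m-i.org example, and the function names are assumptions for this illustration.

```python
import html
import xml.etree.ElementTree as ET

NS = {"mi": "http://www.m-i.org/images"}

RECORD = """<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">
  <mi:subject>Eve emerges from Adam's body</mi:subject>
  <mi:creator>Alexander Master</mi:creator>
  <mi:year>circa 1430</mi:year>
</mi:image>"""


def fields(xml_text):
    """Extract the fields that both output channels need from one XML source."""
    root = ET.fromstring(xml_text)
    return {
        name: root.findtext(f"mi:{name}", namespaces=NS)
        for name in ("subject", "creator", "year")
    }


def to_html(xml_text):
    """Render the record as an HTML fragment for a Web page."""
    f = fields(xml_text)
    return (
        f"<h1>{html.escape(f['subject'])}</h1>"
        f"<p>{html.escape(f['creator'])}, {html.escape(f['year'])}</p>"
    )


def to_text(xml_text):
    """Render the same record as one line of plain text, e.g. for a label."""
    f = fields(xml_text)
    return f"{f['subject']} ({f['creator']}, {f['year']})"
```

The content is stored once; only the presentation logic differs per channel, which is the role a style sheet plays in the set-up described above.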
Finally, XML can be used to create new languages. For example, the Wireless Markup Language (WML), which is used to mark up Internet applications for handheld devices, is written in XML.
Ontological concepts
The goal of the FMS project is to make metadata of the museums’ textiles collections semantically interoperable on the Web. In order to achieve such interoperability, an ontology is being designed that describes the common (lower-level) ontological concepts in this domain of knowledge.[4]
Ontology
In the Semantic Web architecture, the semantic relationships are not embedded but explicitly represented by an ontology, or rather an interrelated set of ontologies. In fact, the wide array of information residing on the Web and the perceived need to make it more machine-processable have acted as a strong impetus for the development of ontology languages.

Yet what is an ontology? An abstract definition of an ontology is that it describes a formal, shared conceptualisation of a particular domain of interest, for example, cultural heritage objects held in art museums. In particular, an ontology allows for constraining, expressing and analysing the intended meaning of the shared vocabulary of concepts and relations in a domain of knowledge.[5]
If these concepts and relations are formalised to a high degree, the domain has at hand a major building block for developing semantically aware information systems. With Semantic Web technologies, the domain ontology can be made available on the network, cross-referenced with upper-level and other domain ontologies,[6] and remote applications or intelligent software agents can refer to it when they interact to provide a certain information service.

[4] The available documents (in English) on the FMS initiative state that their ontology is being created using RDF Schema (RDFS). To develop a fully fledged ontology, advanced languages such as DAML+OIL or Web Ontology Language (OWL) would be required.
[5] For more elaborate and formal descriptions see Tom Gruber: What is an Ontology? (1995), http://www-ksl.stanford.edu/kst/what-is-an-ontology.html; Nicola Guarino: Ontology-Driven Conceptual Modelling, part 1-3 (2002), http://ontology.ip.rm.cnr.it/Tutorials
The degree of formalisation of concepts and their relations varies considerably between different domains of knowledge. At the lower end, one finds lexicons and simple taxonomies (i.e. ordered classification systems in which terms are related hierarchically). At the middle level, one might place thesauri, i.e. controlled vocabularies that are structured to show relationships between terms and concepts and, for example, allow for retrieving them from a database. At the high end of formalisation of knowledge, there are axiomatised logic theories. Such theories include rules to ensure the well-formedness and logical validity of statements expressed in the language of the scientific discipline.

In the cultural heritage sector, a powerful IT-supported example of a hierarchical classification system is Iconclass. This supports the documentation of images, in particular art historical images, by providing a systematic collection of 28,000 ready-made definitions of objects, persons, events, situations and abstract ideas that can be the subject of an image. The definitions consist of an alphanumeric classification code and its textual correlate.[7]

For example, for the image we have described in XML in graphic 1, the Iconclass definition is 71A3421 Eve emerges from Adam’s body. The Medieval Illustrated Manuscripts Website of the Koninklijke Bibliotheek, The Hague, has an Iconclass Browser in place[8] that provides the hierarchical path for this concept in the classification system.
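For the path printed in this issue (7, 71, 71A, 71A3, 71A34, 71A342, 71A3421), each level is simply a prefix of the next, so the hierarchical path can be derived from the code itself. The short Python sketch below makes that assumption explicit; it is an illustration only and does not handle more elaborate Iconclass notations.

```python
def iconclass_path(code):
    """Return the codes from the top-level division down to `code`.

    Assumes that every level of the hierarchy is a one-character extension
    of its parent, as in the path for 71A3421 shown in this issue.
    """
    return [code[:length] for length in range(1, len(code) + 1)]


path = iconclass_path("71A3421")
```

A browser can use such a path to move from ‘Eve emerges from Adam’s body’ up to broader subjects such as the creation of Eve or Genesis, and from there down to sibling concepts.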
An outstanding example of a controlled vocabulary is the Art & Architecture Thesaurus (AAT), one of the Getty Research Institute’s Vocabulary Databases. It is a structured vocabulary of more than 125,000 terms and other information about concepts that are used for describing fine art, architecture, decorative arts, archival materials and material culture.[9]
The W3C’s Resource Description Framework (RDF) Schema Specification 1.0, in a section on its scope, mentions concept navigation and states: ‘Thesauri and library classification schemes are well known examples of hierarchical systems for representing subject taxonomies in terms of the relationships between named concepts. The RDF Schema specification provides sufficient resources for creating RDF models that represent the logical structure of thesauri (and other library classification systems).’[10]
Yet, for realising a full-blown cultural heritage ontology for the Semantic Web, there are currently limitations on both sides. On the one hand, hierarchical classification systems and structured vocabularies do not lend themselves easily to rich inter-linking of conceptual 'trees'.
A major step further in this direction is the CIDOC object-oriented Conceptual Reference Model (CRM). [11] This provides an ontology of 81 classes and 130 properties, which describes in a formal language concepts and relations relevant to the documentation of cultural heritage.
On the other hand, RDF Schema has limitations when it comes to expressing complex ontological relationships. New languages based on description logics are being developed. These include DAML+OIL and the upcoming Web Ontology Language (OWL), which are capable of fully describing ontologies.
Also worth highlighting is that tools for ontology building are proliferating at the present time. These ontology editors need to be carefully assessed, as their capabilities differ considerably. [12]
7 Bible
71 Old Testament
71A Genesis from the creation to the expulsion from paradise, and later years of Adam and Eve
71A3 creation of man; the Garden of Eden (Genesis 1:26-2)
71A34 creation of Eve
71A342 Eve is fashioned from Adam's rib
71A3421 Eve emerges from Adam's body
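The hierarchical path for 71A3421 can be read off the notation itself, since each level in this example extends its parent by one character. The following is a minimal sketch of that observation only; the function name `iconclass_path` is hypothetical, and the sketch assumes plain notations like the one above (notations with additional elements would need extra handling).

```python
def iconclass_path(notation: str) -> list[str]:
    """Return the chain of ancestor notations, from the root down to the code itself.

    Assumption: every level adds exactly one character to its parent,
    as in the 71A3421 example from the text.
    """
    return [notation[:length] for length in range(1, len(notation) + 1)]


path = iconclass_path("71A3421")
for code in path:
    print(code)
```

Each printed code corresponds to one level of the path listed above, from '7 Bible' down to '71A3421'.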
[6] Upper-level ontologies describe the basic concepts and relationships invoked when information about any domain is expressed in natural language.
[7] For in-depth information see the official Iconclass Website, http://www.iconclass.nl
[8] Medieval Illustrated Manuscripts Website, http://www.kb.nl/kb/manuscripts/browser. The subject access system for the Website was conceived by Mnemosyne Partners, building on the Iconclass classification system and technologies. See the valuable information they provide at httpwwwmnemosyneorgbusinessmsstempexhtml
[9] A detailed description of the AAT is provided at http://www.getty.edu/research/tools/vocabulary/aat/about.html
[10] http://www.w3.org/TR/2000/CR-rdf-schema-20000327
[11] See the information box at the end of the Forum discussion and the sources mentioned in the Semantic Web Terms and Reading List.
[12] Michael Denny: Ontology Building: A Survey of Editing Tools (06-11-2002), http://www.xml.com/pub/a/2002/11/06/ontologies.html
Semantic Transformation 1: The RDF Data Model
Progressing towards semantic interoperability, the metadata in the XML documents are now transformed into RDF statements corresponding to the RDF data model. With these so-called RDF 'triples', the XML metadata elements are mapped to the RDF classes and properties which are defined by the RDF Schema of the FMS initiative (see section 3).
'XML is nothing more than a way to standardize data formats. This is not to underplay XML's importance. A data-format standard makes all of the more glamorous technologies possible, and RDF is the leading example of the benefit that comes once the data format has been standardized. Many proclaim that RDF is really the XML's killer app, and with good reason. Despite all this, RDF remains somewhat obscure. This is mainly because, at its core, RDF is very abstract, very dry and very academic.'
Uche Ogbuji: An introduction to RDF (2000), http://www-106.ibm.com/developerworks/library/w-rdf/?dwzone=xml
RDF Data Model
In order to make Web resources semantically interoperable, we need resources that provide machine-understandable information about themselves. In the Semantic Web architecture, these statements are built by using the Resource Description Framework (RDF). RDF defines a data model for statements describing typed relationships between uniquely identified resources. RDF distinguishes between:
| resources: familiar examples are a Web page, an electronic document or a digital image, but in RDF entities that are not 'network retrievable', e.g. museums, curators or bound medieval manuscripts, can also be resources
| properties: these identify a specific aspect, characteristic, attribute or relation used to describe the resource
| statements: these associate a value for a named property with the resource
Hence RDF provides a model for describing relationships between resources in terms of named properties and values. The RDF data model intrinsically supports only binary relations. Its base element is the 'triple', which takes the form of subject, predicate, object: a resource (the subject) is linked to another resource (the object) through an arc labelled with a third resource (the predicate). The semantics of a triple clearly depends on the property used as predicate.
A convenient way to visualise this is to draw nodes for subject and object and an arrow between them for the predicate (see graphic 2). In this labelled directed graph, subject and predicate (property) are Uniform Resource Identifiers (URIs), and the object is either a URI or a literal (which is drawn as a box). Everything in RDF can be represented by a graph with nodes and arcs, and the data model allows for using the same URI as a node and as an arc label. To represent RDF statements in a machine-processable way, RDF builds on XML. With RDF/XML, a specific XML markup language, RDF information can be represented and exchanged between machines.
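The triple model described above can be sketched in a few lines of plain Python, without any RDF library. This is an illustration only: the `URI`, `Literal` and `Triple` types are hypothetical helpers, the `#` at the end of the `MI` namespace is an assumption, and the `rdfs:label` statement is added here merely to show an object that is a literal.

```python
from typing import NamedTuple, Union


class URI(str):
    """A resource identifier (drawn as an oval node or used as an arc label)."""


class Literal(str):
    """A plain value (drawn as a box in the graph)."""


class Triple(NamedTuple):
    subject: URI
    predicate: URI
    obj: Union[URI, Literal]


RDFS = "http://www.w3.org/2000/01/rdf-schema#"
MI = "http://www.m-i.org/schemas/images#"  # assumed form of the example namespace

triples = [
    Triple(URI(MI + "ColumnMiniature"), URI(RDFS + "subClassOf"), URI(MI + "Miniature")),
    Triple(URI(MI + "Miniature"), URI(RDFS + "subClassOf"), URI(MI + "Image")),
    Triple(URI(MI + "Image"), URI(RDFS + "label"), Literal("Image")),
]

# Nodes are all subjects and objects; arcs are labelled with the predicates.
nodes = {t.subject for t in triples} | {t.obj for t in triples}
arc_labels = {t.predicate for t in triples}
literal_nodes = {n for n in nodes if isinstance(n, Literal)}

print(len(nodes), "nodes,", len(arc_labels), "distinct arc labels")
```

Keeping subjects and predicates as URIs and allowing literals only in the object position mirrors the restriction stated in the text.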
Graphic 2: RDF Data Model
[Graph of two triples, each drawn as subject, predicate, object:
mi:ColumnMiniature, rdfs:subClassOf, mi:Miniature
mi:Miniature, rdfs:subClassOf, mi:Image]
With these two triples we state that ColumnMiniature is a subclass of Miniature, and that Miniature is a subclass of Image.
The predicate in our statements is the rdfs:subClassOf property, which is predefined in the RDF Schema namespace http://www.w3.org/2000/01/rdf-schema#
The classes Image, Miniature and ColumnMiniature would also need to be defined in our RDF Schema namespace http://www.m-i.org/schemas/images
For details on how to use RDF Schema for defining your domain ontology, see section 3. In a nutshell, RDF Schema is a 'higher-level' language which is itself defined using RDF.
Semantic Transformation 2: Creating and Validating the RDF Statements
In the mapping process, an editor tool (in the FMS case a tool developed by the project team, called Meedio) receives as input the XML documents and assists in transforming them into semantically valid RDF statements (instance descriptions). The tool serves as instance editor, which provides a convenient way of finding and selecting from the XML metadata elements correct instance values for a particular property. The editor tool also serves as semantic metadata validator. When the museum cataloguer saves the set of RDF statements corresponding to an XML document, the semantics of these statements are validated against the property constraints of the FMS ontology. The result of a successful mapping and validation process is a unique set of RDF triples, called the RDF card.
RDF Mapping Rules
When RDF is used to define the meaning of XML metadata elements, a set of mapping rules is created. A mapping rule is a template of RDF triples where XPath expressions are used to identify the actual element values. XPath is a language for addressing parts of an XML document and was designed for use in XML parsing software (XSLT, XPointer and others).
When applying such a rule to an XML document, the XPath expressions are instantiated with matching element values. If the rule matches, the RDF template evaluates to a set of RDF triples where XPath expressions are substituted by the corresponding values of the XML elements. For example, by applying the template rule
<image5kb78d38itype mi:hasCreator image5kb78d38icreator>
to the XML document described in the section on XML, the following result would be obtained:
<Image Miniature mi:hasCreator 'Alexander Master'>
mi:hasCreator is an example of an RDF property. Such properties are explained in section 3, RDF Schema.
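The way a mapping rule is instantiated can be sketched with Python's standard `xml.etree.ElementTree` module, which supports a limited subset of XPath. This is not the Meedio tool: the XML element names, the `id` attribute, the path expressions and the `apply_rule` helper are all assumptions made for illustration, since the original XML document is not reproduced here.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML record; the real element structure may differ.
XML_DOC = """
<image id="5kb78d38i">
  <type>Miniature</type>
  <creator>Alexander Master</creator>
</image>
"""

# A rule is a triple template: path for the subject value,
# a fixed RDF property, and a path for the object value.
RULE = ("./type", "mi:hasCreator", "./creator")


def apply_rule(root, rule):
    """Instantiate the template; return no triples if the rule does not match."""
    subject_path, predicate, object_path = rule
    subject_value = root.findtext(subject_path)
    object_value = root.findtext(object_path)
    if subject_value is None or object_value is None:
        return []
    return [(subject_value, predicate, object_value)]


root = ET.fromstring(XML_DOC)
result = apply_rule(root, RULE)
print(result)
```

A rule whose paths find no matching element simply yields no triples, which corresponds to the 'if the rule matches' condition in the text.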
Note: Due to the limited space permitted, we do not address issues of term mapping. This is an important aspect of the mapping and validation process carried out in the FMS project. Working with metadata from different museums, the project team needs to deal with partly different terminologies. Their technical solution to synonymous terms (i.e. different terms referring to the same concept) is to attach synonym sets to the FMS ontology classes. The editor tool cannot cope with situations where polysemous terms occur (i.e. the same term refers to different concepts); here the cataloguer needs to select the correct interpretation.
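The synonym-set approach mentioned in the note can be illustrated roughly as follows. The class names, the synonym sets and the `resolve` helper are invented for this sketch and are not taken from the FMS ontology; the optional `choose` callback stands in for the cataloguer's manual decision.

```python
# Hypothetical synonym sets attached to ontology classes.
SYNONYM_SETS = {
    "mi:Miniature": {"miniature", "illumination"},
    "mi:DecoratedInitial": {"decorated initial", "initial"},
    "mi:Monogram": {"monogram", "initial"},  # makes 'initial' polysemous
}


def candidate_classes(term):
    """Return all classes whose synonym set contains the term."""
    key = term.lower()
    return sorted(cls for cls, synonyms in SYNONYM_SETS.items() if key in synonyms)


def resolve(term, choose=None):
    """Map a term to one class; ask the cataloguer when the term is ambiguous."""
    candidates = candidate_classes(term)
    if len(candidates) == 1:
        return candidates[0]
    if len(candidates) > 1 and choose is not None:
        return choose(term, candidates)
    return None  # unknown term, or ambiguous with nobody to decide


print(resolve("Illumination"))
print(candidate_classes("initial"))
```

Synonyms resolve automatically, while a polysemous term returns several candidates and is left unresolved until someone chooses among them.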
Semantic Transformation 3: The RDF Schema (RDFS)
The shared ontology for the textiles domain is created by using Resource Description Framework Schema (RDFS). An RDF Schema is a tool for indicating the classes of resources one wants to describe, as well as for defining the properties used to describe those resources. Furthermore, class/sub-class relationships and property/sub-property relationships can be defined. The museums are mapping their metadata to the classes and properties defined by the RDF Schema of the FMS initiative. Thereby they are making the meaning of the metadata explicit and representing them in a harmonised, uniform way.
RDF Schema
In section Semantic Transformation 1 we have described the data model provided by RDF for expressing statements about Web resources. But we also need a vocabulary for the RDF statements, namely classes and properties defined with RDF Schema (RDFS).
In brief, the RDF Schema mechanism provides a pre-defined vocabulary, a basic type system that can be used in creating domain-specific schemas. Its role is to allow for declaring metadata properties (e.g. for 'type', 'subject' or 'creator'), to define the classes of resources they may be used with, to restrict possible combinations, and to detect violations of those restrictions.
Defining classes
With RDF Schema (RDFS), Web resources can be defined as instances of one or more classes. In addition, classes can be organised in a hierarchical fashion. As we hold a collection of digital images drawn from illustrated medieval manuscripts, we first need to define a class of things that are images. In RDF Schema, a class is any resource having an rdf:type property whose value is the RDFS-defined resource rdfs:Class.
So, using the basic RDF data model, we define:
mi:Image [resource] rdf:type [property] rdfs:Class [value]
The self-defined prefix mi (for medieval images) stands for the URI reference of our RDF Schema namespace http://www.m-i.org/schemas/images
In our image collection we have various special kinds of digitised images, such as column miniatures, decorated initials, schematic drawings, etc. To distinguish, for example, the miniatures, first we need to define a general class Miniature and subclasses of miniatures, e.g. a subclass ColumnMiniature:
mi:Miniature rdf:type rdfs:Class
mi:ColumnMiniature rdf:type rdfs:Class
Secondly, we need to define that mi:ColumnMiniature is a subclass of mi:Miniature and that mi:Miniature is a subclass of mi:Image, for which we use the predefined rdfs:subClassOf property:
mi:Miniature rdfs:subClassOf mi:Image
mi:ColumnMiniature rdfs:subClassOf mi:Miniature
As the rdfs:subClassOf property is transitive, this means that mi:ColumnMiniature is also implicitly a subclass of mi:Image.
Graphic 2 visualises this with the nodes and arcs of the basic RDF data model.
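The implicit subclass relationship that follows from transitivity can be computed by following rdfs:subClassOf arcs until no new classes are found. The sketch below uses prefixed names as plain strings; the `superclasses` function is a hypothetical helper, not part of any RDF toolkit.

```python
# The two explicitly stated subclass relationships from the text.
SUBCLASS_OF = {
    ("mi:ColumnMiniature", "mi:Miniature"),
    ("mi:Miniature", "mi:Image"),
}


def superclasses(cls, edges):
    """Return every class reachable from cls via rdfs:subClassOf arcs."""
    found = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for sub, sup in edges:
            if sub == current and sup not in found:
                found.add(sup)
                frontier.append(sup)
    return found


print(sorted(superclasses("mi:ColumnMiniature", SUBCLASS_OF)))
```

The result for mi:ColumnMiniature contains mi:Image even though no triple states that relationship directly.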
Defining properties
In order to make the meaning of our metadata (i.e. 'type') explicit, we need to be capable of declaring specific properties that characterise the classes of things we hold at http://www.m-i.org, e.g. digital images of medieval column miniatures.
Basically, RDF Schema defines properties in terms of the classes of resources to which they apply. This is the role of the rdfs:domain and rdfs:range mechanisms.
rdfs:range
The range constraint defines the class or set of classes whose instances can be values of a particular property. If we want to define the property mi:hasType, we must describe this resource (which we locate at http://www.m-i.org/schemas/images) with an rdf:type property whose value is rdf:Property:
mi:hasType [resource] rdf:type [property] rdf:Property [value]
The following RDF statements indicate that mi:ColumnMiniature is a class, mi:hasType is a property, and RDF statements using the mi:hasType property have instances of mi:ColumnMiniature as values:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:range mi:ColumnMiniature
rdfs:domain
The domain constraint restricts the set of classes whose instances may have a particular property attached to them. If we want to indicate that the property mi:hasType applies to instances of class mi:ColumnMiniature, we would write:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:domain mi:ColumnMiniature
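Read as constraints, in the sense used in this section, the domain and range declarations above allow a simple check of instance statements. The sketch below is an assumption-laden illustration: the `ex:` instances and their types are invented, the check compares types exactly (no subclass reasoning), and it is not the validator used by the FMS editor tool.

```python
# Constraints as declared in the text for mi:hasType.
DOMAIN = {"mi:hasType": "mi:ColumnMiniature"}
RANGE = {"mi:hasType": "mi:ColumnMiniature"}

# Hypothetical instances and the class each one belongs to.
INSTANCE_TYPES = {
    "ex:miniature1": "mi:ColumnMiniature",
    "ex:miniature2": "mi:ColumnMiniature",
    "ex:initial7": "mi:DecoratedInitial",
}


def violations(statement):
    """List the domain and range constraints a (subject, property, value) breaks."""
    subject, prop, value = statement
    problems = []
    if prop in DOMAIN and INSTANCE_TYPES.get(subject) != DOMAIN[prop]:
        problems.append(f"domain: {subject} is not an instance of {DOMAIN[prop]}")
    if prop in RANGE and INSTANCE_TYPES.get(value) != RANGE[prop]:
        problems.append(f"range: {value} is not an instance of {RANGE[prop]}")
    return problems


ok = violations(("ex:miniature1", "mi:hasType", "ex:miniature2"))
bad = violations(("ex:initial7", "mi:hasType", "ex:miniature2"))
print(ok)
print(bad)
```

The second statement is rejected because its subject belongs to a class outside the declared domain.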
Benefits of RDF
In a SearchWebServices.com definition of RDF, some benefits of RDF are mentioned:
| 'By providing a consistent framework, RDF will encourage the providing of metadata about Internet resources.
| Because RDF will include a standard syntax for describing and querying data, software that exploits metadata will be easier and faster to produce.
| The standard syntax and query capability will allow applications to exchange information more easily.
| Searchers will get more precise results from searching, based on metadata rather than on indexes derived from full text gathering.
| Intelligent software agents will have more precise data to work with.' [13]
This is a well-crafted listing of RDF benefits, from provision and exchange of better metadata to agents working with them, hopefully for the benefit of humans. But, as explicitly stated by SearchWebServices.com, these are only potential benefits, i.e. they depend on the level of actual uptake of RDF.
[13] whatis.com / searchWebServices.com Definitions - Resource Description Framework, http://searchwebservices.techtarget.com/sDefinition/0,,sid26_gci213545,00.html (last updated July 27, 2001)
Generating and Using the Knowledge Space
The RDF cards represent the original XML documents at the semantic level. The union of such RDF cards constitutes a knowledge base, which is a harmonised semantic representation of the underlying heterogeneous databases.
However, so far the RDF instance descriptions have not left the museum. The museum has complete control of the information it wants to publish, and it does not need to allow the FMS system access to its internal database system. The RDF data are placed in a public directory on the museum's WWW server.
The Web crawler of the FMS system harvests the instance descriptions from the different museums, and the system combines them into an RDF repository. This repository is a large semantic graph that consists of the shared ontology and metadata.
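Combining harvested RDF cards into one repository amounts to taking the union of their triples together with the shared ontology. The following sketch leaves out the crawling itself and uses invented example triples; `build_repository` is a hypothetical name and does not describe the FMS implementation.

```python
# Shared ontology statements (invented example).
ONTOLOGY = {("mi:Miniature", "rdfs:subClassOf", "mi:Image")}

# RDF cards as they might be published by two museums (invented examples).
CARD_MUSEUM_A = {
    ("a:item1", "rdf:type", "mi:Miniature"),
    ("a:item1", "mi:hasCreator", "Alexander Master"),
}
CARD_MUSEUM_B = {("b:item9", "rdf:type", "mi:Miniature")}


def build_repository(ontology, cards):
    """Union the shared ontology with all harvested RDF cards."""
    repository = set(ontology)
    for card in cards:
        repository |= card
    return repository


repository = build_repository(ONTOLOGY, [CARD_MUSEUM_A, CARD_MUSEUM_B])
print(len(repository), "triples in the repository")
```

Because triples are kept in a set, a statement published identically by two museums would appear only once in the combined graph.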
How does a user now search and navigate in this knowledge space? In the FMS system this is implemented by a server-side software called Ontogator. Based on the semantic graph, this software dynamically generates semantic linkages for the user's Web browser.
One way of using the FMS system is view-based filtering. The user can select classes of resources from the ontology, and the system finds the instances that match the selected class restrictions. By constraining classes (views) further, the collection instance data searched for are eventually found.
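View-based filtering can be pictured as intersecting class restrictions: each additional selected class narrows the set of matching instances. The sketch below uses invented textile items and classes and a hypothetical `view_filter` function; it does not reflect how Ontogator is actually implemented.

```python
# Invented collection items, each annotated with ontology classes.
INSTANCE_CLASSES = {
    "ex:coat1": {"ex:Wool", "ex:Coat"},
    "ex:coat2": {"ex:Linen", "ex:Coat"},
    "ex:scarf1": {"ex:Wool", "ex:Scarf"},
}


def view_filter(selected_classes):
    """Return the instances that satisfy every selected class restriction."""
    return sorted(
        item
        for item, classes in INSTANCE_CLASSES.items()
        if selected_classes <= classes
    )


print(view_filter({"ex:Wool"}))
print(view_filter({"ex:Wool", "ex:Coat"}))
```

Selecting a second class in another view reduces the two woollen items to the single woollen coat.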
The software also supports topic-based navigation by providing semantic links between topics of interest, the creation of which is based on the collection domain ontology and the related metadata of the collection records. This means that the links also provide the user with an impression of the wider context and pragmatics of the objects in the museums' collections.
From human users to software agents
As described in the Finnish Museums on the Semantic Web example, the RDF repository is a large semantic graph that consists of the shared ontology and metadata of the participating museums. Such a repository can be queried, and the results, a set of pointers to the relevant resources, can be accessed using Web browsers. The opportunities provided by a system like the one developed by the FMS initiative (e.g. topic-based navigation) are at present restricted primarily to human users.
The Semantic Web vision includes intelligent software agents which 'understand' semantic relationships between Web resources and seek relevant information as well as perform transactions for humans. [14]
This software would be capable of autonomous action, i.e. it could run without direct human control or constant supervision, and ideally would be very flexible in doing this. Characterisations of this flexibility include actions that are 'reactive', 'proactive' and 'social' (see below).
While the basic idea of agents is very intuitive and appealing, the actual theory is complex, the tools are immature, the solutions small and prototype-based. In fact, as a parallel distributed systems technology, agents belong to the most complex class of software technology.
However, this primer will conclude with a summary of what an intelligent software agent is and what such software would generally be capable of doing. This should also serve as an indication of how great the challenge for research and technological development is to make the full Semantic Web vision a reality.
Intelligent Software Agents
The following definitions are taken from Michael Wooldridge's introduction to multiagent systems. [15]
Agent
'An agent is a computer system capable of autonomous action in some environment.'
Intelligent agent
'An intelligent agent is a computer system capable of flexible autonomous action in some environment.'
Flexible autonomous action
'By flexible autonomous action we mean reactive, proactive, social.'
| Reactivity: 'A reactive system is one that maintains an ongoing interaction with its environment and responds to changes that occur in it (in time for the response to be useful).'
| Proactiveness: 'An agent serves a purpose and therefore exhibits goal-directed behaviour, including the capacity to recognise opportunities for useful courses of action.'
| Social ability: 'Social ability in agents is the ability to interact with other agents (and possibly humans) via some kind of agent communication language, and perhaps cooperate with others.'
Desirable further properties of agents are:
| Mobility: the ability to move around an electronic network
| Rationality: an agent will act in such a way that it does not prevent itself from achieving its goals (as far as this is possible with a limited set of beliefs representing its world knowledge)
| Learning: an agent will improve its performance over time
[14] T. Berners-Lee, J. Hendler, O. Lassila: Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html
[15] M. Wooldridge: An Introduction to Multiagent Systems. Chichester: Wiley 2002, and http://www.csc.liv.ac.uk/~mjw/pubs/imas/
Resources
In the primer no references are made to the documents of the World Wide Web Consortium (W3C). All relevant W3C recommendations can be found at http://www.w3c.org
Of the wealth of introductory materials on XML and RDF available on the Web, the following in particular are useful to consult for further details:
http://www.w3schools.com/xml/
http://www.w3schools.com/schema/
http://www.w3.org/TR/rdf-primer/
K. S. Candan, H. Liu, R. Suvarna: Resource Description Framework: Metadata and its Applications. In: SIGKDD Explorations, Vol. 3/1 (2001), 6-19
Pierre-Antoine Champin: RDF Tutorial (2001), http://www710.univ-lyon1.fr/~champin/rdf-tutorial/rdf-tutorial.html
S Decker M P Mitra S Melnik: Framework for the Semantic Web: An RDF Tutorial, http://www.ida.liu.se/~asmpa/courses/sweb/rdf/rdf_tutorial.pdf
Graphic 3: Set-up of the Finnish Museums on the Semantic Web System
The Finnish Museums on the Semantic Web: Overview of the System's Set-up
[Diagram. Recoverable labels, from top to bottom: User (client: WWW browser; topic-based navigation, view-based filtering); Server (Ontogator software; RDF database; semantic graph: knowledge space of shared ontology and metadata); Web crawler; RDF Schema (semantic interoperability): RDF instances; XML Schema (syntactic interoperability): Metadata Editor, XML repositories; Relational Schemas (DBMS): Collection databases 1, 2, ... n.]
THE DARMSTADT FORUM
PARTICIPANTS

Wernher Behrendt, Salzburg Research, Austria
Wernher Behrendt is a Senior Researcher at the Sun Technology and Research Excellence Center at Salzburg Research (Austria), working on multimedia middleware and interoperation issues. He holds an MSc in Cognitive Science from Manchester University and has more than 10 years' experience in near-to-market IT research. From 1989 to 1995 he was a Senior Research Associate in the Informatics Department at Rutherford Appleton Laboratory (UK), working on embedded knowledge based systems in distributed multimedia presentation systems. From 1995 to 1998 he was a Senior Research Associate at Cardiff University (UK), working on interoperation between heterogeneous information systems. Mr Behrendt has held courses in Computer Science and has worked on projects ranging from software engineering methods and quality assurance to legacy system reengineering using migration methods and distributed systems middleware.
E-mail: wernherbehrendtsalzburgresearchat

Paolo Buonora, Archivio di Stato di Roma, Italy
Paolo Buonora holds a degree in Philosophy from the University of Rome 'La Sapienza' (1976). He worked in the Italian State Archive Administration from 1978, where he was first involved in editing the Guida Generale degli Archivi di Stato italiani. From 1986 he worked in the Soprintendenza archivistica per il Lazio, surveying audiovisual archives and municipal archives; from 1989 to 1991 he was at the Perugia University, engaged in doctoral research in 'Urban and rural history'; and from 1991 to 1994 he was again in the Soprintendenza archivistica per il Lazio. After 1994 he worked in the Archivio di Stato di Roma, where he was responsible for the photograph service and several working groups on informatics application in archival documentation. From 1997 until the present time he has planned and directed the Imago II project in the Archivio di Stato di Roma.
See httpwwwasrmarchivbeniculturaliitsidimagoIMAGOIIenhtml

Samuel Cruz-Lara, University of Nancy 2 and LORIA, France
Samuel Cruz-Lara obtained a Master's degree in Computer Science in 1984 (University Henri Poincaré, Nancy 1) and a PhD degree in Computer Science in 1988 (National Polytechnic Institute of Lorraine). The central topic of his PhD thesis was the generation of integrated development environments by using attribute grammars.
He is currently Associate Professor at the University of Nancy 2 (Institute of Technology, Computer Science Department) and permanent Researcher at LORIA (Lorraine Laboratory for Research in Computer Science and its Applications - UMR 7503 - CNRS - INRIA - Universities of Nancy). He is a member of the 'Language and Dialogue' team and has conducted several research activities on distributed software architectures and textual linguistic resources management. He is currently working in the context of distributed architectures and multimedia resources management. Dr Cruz-Lara has participated in several projects, in particular CNRS/SILFIDE and MLIS-ELAN, and he is at present co-leader of the 'Digital Museum' project. This is a joint project between LORIA and the National Chi-Nan University. The 'Digital Museum' project is sponsored by the National Science Council of the Republic of China (Taiwan) and supported by INRIA (France).

Costis Dallas, Critical Publics SA, Greece
Costis Dallas is Chairman and Senior Researcher of Critical Publics (http://www.criticalpublics.com), a group of companies active in the field of strategic communications, creative design and technology. He is currently a Lecturer in the Department of Communication and Mass Media of Panteion University and has over 15 years of research and professional experience in hypermedia applications, human factor issues and cultural information systems. Dr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA, co-founder and Chair of the Multimedia Working Group of the International Council of Museums (CIDOC/ICOM), Head of Documentation and Systems of the Benaki Museum, General Director of the Foundation of the Hellenic World, Special Secretary of the Greek Ministry of Education in charge of libraries, archives and instructional technologies, and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: infocriticalpublicscom

Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (http://www.nl.adlibsoft.com), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that he joined Databasix, producer of the ADLIB software. In 1991 Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was renamed later as ADLIB Information Systems.
ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include:
| CIMI Dublin Core test bed: ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database in addition to the 'standard' HTML/XML access.
| SPECTRUM-XML: a project of the UK mda together with a team of software vendors to produce a schema for exchange of data which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software.
| Internet Gelderse Musea (http://www.igem.nl) and Maritiem Digitaal (http://www.maritiemdigitaal.nl): two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.
E-mail: bertnladlibsoftcom

Nicola Guarino, Italian National Research Council, Italy
Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991 and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS) and associated editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: NicolaGuarinoladsebpdcnrit

Janneke van Kersen, Digital Heritage Association, The Netherlands
Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992 she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing, and more recently she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University.
Since October 1999 Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata, and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations, such as museums, archives, archaeological organisations, monuments and special collections of libraries, in a context-rich and XML-based environment. Interoperability from both a technical and an organisational perspective is therefore a major issue. The Association offers access to heritage information to the general public at http://www.cultuurwijzer.nl and customised access for the educational field at http://www.cultuurwijs.nl
E-mail: jannekevankersendennl

Marco Meli, EDW International, Italy
Marco Meli is CEO and co-founder of EDW International (http://www.edw-international.com), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia and VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meliedwit

Paul Miller, UKOLN, UK
Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, http://www.ukoln.ac.uk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK's Further and Higher Education Funding Councils and Resource, the Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI) and the Metadata Working Group of the UK Government's Office of the e-Envoy.
E-mail: PMillerukolnacuk

Frank Nack, CWI, The Netherlands
Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on 'The Application of Video Semantics and Theme Representation for Automated Film Editing' at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling, and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: FrankNackcwinl

Franco Niccolucci, Florence University, Italy
Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference.
See also http://www.geog.port.ac.uk/hist-bound/people/niccolucci.htm
Seamus Ross, HATII, University of Glasgow, UK
Dr Seamus Ross is Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Network) (IST-2001-32706), a European Union funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously, he was Assistant Secretary for Information Technology at the British Academy and, before that, worked for a company specialising in expert systems and software development as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group’s PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: S.Ross@hatii.arts.gla.ac.uk
Andrea Scotti, Institute & Museum for the History of Science, Italy
Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995 he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.
Since 1996, Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza and the National Central Library, and under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (01/02/01-30/07/03), funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies.
See: http://cweb.inria.fr/Projects/Mesmuses
E-mail: scottian@galileo.imss.firenze.it
DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over a period of 30 months (03/2002-08/2004).
In order to encourage early take-up, DigiCULT produces seven Thematic Issues and three Technology Watch Reports, along with the newsletter DigiCULT.Info.
DigiCULT draws on the results of the strategic study ‘Technological Landscapes for Tomorrow’s Cultural Economy (DigiCULT)’, which was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications), in 2000 and completed in 2001.
Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at http://www.digicult.info.
For further information on DigiCULT, please contact the team of the project co-ordinator:
Mr Guntram Geser, guntram.geser@salzburgresearch.at
Phone: +43-(0)662-2288-303
Mr John Pereira, john.pereira@salzburgresearch.at
Phone: +43-(0)662-2288-247
Salzburg Research Forschungsgesellschaft
Jakob-Haringer-Str. 5/III
A-5020 Salzburg, Austria
Phone: +43-(0)662-2288-200
Fax: +43-(0)662-2288-222
http://www.salzburgresearch.at
Project Partner:
HATII - Humanities Advanced Technology and Information Institute, University of Glasgow
http://www.hatii.arts.gla.ac.uk
Contact: Mr Seamus Ross, s.ross@hatii.arts.gla.ac.uk
The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom
DIGICULT PROJECT INFORMATION
DigiCULT Thematic Issue 1 – Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum, held in Barcelona on May 6th, 2002, in the context of the DLM-Conference 2002.
DigiCULT Thematic Issue 2 – Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum, held in Essen, Germany, on September 3rd, 2002, in the context of the AIIM Conference @ DMS EXPO.
DigiCULT Thematic Issue 3 – Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum, held on January 21st, 2003, at Fraunhofer IPSI, Darmstadt, Germany.
DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum on Learning Objects, which will take place at the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, on July 2nd, 2003.
IMPRINT
This Thematic Issue is a product of the DigiCULT Project (IST-2001-34898).
Authors:
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy
Images:
Images for this Thematic Issue have been provided by, and are reproduced with kind permission of, the Koninklijke Bibliotheek – National Library of the Netherlands, The Hague, Netherlands.
Graphics & Layout:
Jan Steindl, Salzburg Research
ISBN 3-902448-00-8
Printed in Austria
© 2003
IMAGES
Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I. Image on p. 5 from Fol. 264r, size 80x75; illuminator: Orosius Master a.o.
Bible Historiale, Paris, c. 1320-1340, Volume I. Image on p. 7 from Fol. 2r, size 45x50; illum.: ‘Sub-Fauvel’ Master.
Historic Bible, Utrecht, c. 1430, Volume I. Images on pp. 8, 10, 14, 28 from Fol. 3r, 3v, 4v, 7v, size 55x85 to 60x85; illum.: Alexander Master o.a.
Psalter Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468. Images on pp. 11, 13: Wednesday Matins, Invitatorium, from Fol. 219r.
Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350. Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron) from Fol. 104rb1, 106rb2, 124r, 111ra, size 40x55 to 50x55.
Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335. Image on p. 21 from Fol. 4va1, size 45x55.
Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460. Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41.
Psalter, Normandy, c. 1180. Image on p. 26 from Fol. 3v, size 160x125.
Breviary, Cambray (?), c. 1275-1300. Images on pp. 27, 29, 31, 33, 34, 35, 37 from Fol. 211r, size 205x135.
© Koninklijke Bibliotheek, The Hague. Used with permission.
Cultural Units of Learning – Tools and Services (CULTOS)
CULTOS is an RTD project co-funded by the European Commission under the Information Society Technologies (IST) Programme, which will run until October 2003. The application domain of CULTOS is intertextual studies in literature and arts. The project is developing a multimedia authoring and presentation environment that allows scholars to make the different relationships between cultural works explicit in a way that approximates contextualisation in interpretative processes. The result of the authoring processes is multimedia objects called ‘intertextual cultural threads’. These are based on ‘EMMOs’, a novel type of structured multimedia object containing expert knowledge, conforming to current and emerging standards such as XML, SMIL (with interactive extensions), MPEG-7 and RDF.
http://www.cultos.org
ontology themselves should then use it to combine multimedia assets with each other.’
The Dutch had doubts, too. Dr Janneke van Kersen is an art historian with Digital Heritage Netherlands (Digitaal Erfgoed Nederland, http://www.den.nl), where an XML-based content management system is being combined with a Resource Description Framework (RDF) to join databases from several cultural heritage institutions. She told the experts: ‘I need to be assured that we will be able to build a layered structure that is equally applicable to each knowledge domain. Furthermore, I think that the cultural heritage sector is too much of a niche market to develop this.’
Her countryman, Dr Frank Nack, from the national research institute for mathematics and computer science, CWI (Centrum voor Wiskunde en Informatica, http://www.cwi.nl) in Amsterdam, works with a multimedia and human-computer interaction group. His concern: ‘Our group believes in the Semantic Web, but we needed some mechanisms to structure the information so that various groups can work with it. What we came up with was the belief that you can classify the user at a particular time. But that was simply not good enough.’
The group had found that users change their requirements widely and these shifts were invisible to a system. ‘Humans can look at material one day and the next day they look at the same stuff differently and describe it differently because they are in a different mood,’ he said. He characterised the problem as: ‘Now I would like to see something for my work, and now I want to be entertained. Which means I would like to access the information differently.’
He had one other worry: Webised mixed media. He said: ‘This discussion has been heavily linguistic based, which I can understand because most people do still think of the Web as text driven. The issue of describing various media items that are not text will, I think, very soon become important for the Semantic Web. We had better start thinking about that, too.’
THE GALILEO CONUNDRUM
The Institute and Museum of Science History in Florence, Italy, has tried to create an ontology around the works and sciences of its city’s famous son, the revolutionary astronomer, mathematician and physicist, Galileo Galilei (1564-1642).³ But it ran into difficulties when it came to the radical changes in theory that he created.
Institute relational database expert Andrea Scotti told the Forum: ‘Galileo’s scientific theory negates another scientific theory. This negating or development of theories was very difficult to represent in the ontology when dealing with the time factor. That is central to historical documentation, but representational time was not part of the process available to us.’
Dr Costis Dallas, the Athens chairman of the European communication and technology group Critical Publics, thought the Florence museum’s project was very ambitious. The time argument was difficult because, ‘of course, it isn’t possible to represent time properly within a relational database’. But there were mechanisms - he mentioned software by the Virginia, US, IT group Telos⁴ - that better represented issues of time.
Italian National Research Council’s Nicola Guarino chipped in: ‘The CIDOC⁵ reference models have partial answers to these questions.’
Dr Dallas went on: ‘I do not believe that the whole exercise is futile, but we found that in practice you cannot make a subject language for everybody. It has to be for a community of users. If you provide them with a richer representation, for instance, if they can know that this is a person and this person lived in a place, then users will have a much richer experience.’
3 The museum and Web site are rich resources for the life and work of Galileo: http://galileo.imss.firenze.it
4 Telos Corporation, http://www.telos.com, Ashburn, VA, US.
5 CIDOC, International Committee for Documentation of the International Council of Museums (ICOM-CIDOC), http://www.willpowerinfo.myby.co.uk/cidoc/CIDOCe. Forum for documentation interests of museums and related organisations; one of 25 international committees of the International Council of Museums (ICOM).
CAPITALS AND ACRONYMS
The Forum moved on to a discussion of the science and software behind the Semantic Web. Moderator Seamus Ross started by questioning whether ‘key thinkers’ were ‘missing a fundamental point: that Web pages are dead, that database-driven Web pages are the future, that people are going to stop making Web pages and make databases’.
He recalled that James A. Hendler, co-author with Tim Berners-Lee of the Semantic Web paper and a professor in the Department of Computer Science at the University of Maryland, had recommended semantic representation as part of any Web pages. Dr Ross commented: ‘This notion of the Semantic Web is not going to work with these databases. Is this true or not?’
Bert Degenhart-Drenth, the managing director of Netherlands ADLIB Information Systems (http://www.nl.adlibsoft.com), thought it was true. But it was a practical problem that would be solved
Open Archives Initiative (OAI) Protocol for Metadata Harvesting
The Open Archives Initiative (OAI) is a mainly US-based group of people and organisations that evolved out of a need to increase access to scholarly publications through interoperable digital repositories. Support for the OAI’s goals comes from the Digital Library Federation, the Coalition for Networked Information, and from an NSF grant. One of its major achievements is an application-independent interoperability framework based on metadata harvesting, the OAI Protocol for Metadata Harvesting.
The OAI Protocol is based on the standard Web protocols HTTP and XML, and employs Dublin Core (unqualified) as its metadata standard. Heritage organisations that have systems supporting the OAI Protocol can expose metadata about the content in their repository, i.e. allow service providers to harvest the data for services such as search engines. In the OAI Protocol, the XML schema is used at two levels: to define the format of responses to all OAI Protocol requests, and to define the format of metadata streams embedded in the GetRecord and ListRecords responses. In both cases, the goal is to provide a mechanism for data validation.
http://www.openarchives.org
The OAI-PMH version 2.0, released in June 2002, can be found at http://www.openarchives.org/OAI/2.0/openarchivesprotocol.htm
See also: John Perkins, A New Way of Making Cultural Information Resources Visible on the Web: Museums and the Open Archives Initiative. Museums and the Web Conference 2001, http://www.archimuse.com/mw2001/papers/perkins
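The harvesting step described in the box can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the repository address and the sample record are invented for the example, and a real harvester would also have to handle resumption tokens, error responses and schema validation. The verb name ListRecords, the metadata prefix oai_dc and the XML namespaces are those of OAI-PMH 2.0 and unqualified Dublin Core.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespaces used by OAI-PMH 2.0 responses and unqualified Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build a ListRecords request URL (base_url is a hypothetical repository)."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"

def harvest_titles(response_xml):
    """Return (identifier, title) pairs found in a ListRecords response."""
    root = ET.fromstring(response_xml)
    pairs = []
    for record in root.iterfind(".//oai:record", NS):
        identifier = record.findtext("oai:header/oai:identifier", namespaces=NS)
        title = record.findtext(".//dc:title", namespaces=NS)
        pairs.append((identifier, title))
    return pairs

# Invented sample response, trimmed to the elements used above.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:museum.example.org:psalter-1180</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Psalter, Normandy, c. 1180</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

print(list_records_url("http://museum.example.org/oai"))
print(harvest_titles(SAMPLE_RESPONSE))
```

A service provider would fetch the URL built by list_records_url over HTTP and pass the response body to harvest_titles; here the response is a fixed string so that the sketch runs without a network connection.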
Open Archives Forum Project
http://www.oaforum.org
The Open Archives Forum Project is a two-year Fifth Framework Programme IST accompanying measure that will run until September 2003. The Forum is building a Web-based database on OAI-related projects, software implementations and services, and supports the information exchange between OAI user communities.
Their surveys provide good insight into the status of uptake of the OAI in Europe. For an overview of the results, see S. Dobratz, B. Matthaei: Open Archives Activities and Experiences in Europe. An Overview by the Open Archives Forum. In: D-Lib Magazine, Vol. 9, No. 1, January 2003, http://www.dlib.org/dlib/january03/dobratz/01dobratz.html
Recently, one of their workshops, ‘Providing Access to Hidden Resources’ (Lisbon, December 2002), targeted the libraries and archives communities. Requirements, standards, best practice and solutions to interoperability problems of these communities were analysed and compared with the features provided by the OAI Protocol for Metadata Harvesting. The Tutorial ‘OAI and OAI-PMH for Beginners’ and other presentations can be found at http://www.oaforum.org/workshops/lisb_programme.php
by projects like the Web Services of the Open Archives Initiative (OAI).
Several members spoke of difficulties created by dynamic Web pages generated from ASP databases. Paul Miller, the UK Interoperability Focus for the University of Bath’s UKOLN (formerly the UK Office for Library Networking) project, said the UK’s
Office of the e-Envoy,⁶ the agency in charge of Britain’s e-Government programme, drove it with a Web site called Govtalk.⁷ ‘One big ASP database,’ he called it. It carried the Government’s interoperability standard framework with a URL of ‘a great big long string, something something dot ASP. That is how this key piece of legislation is referred to, and next week it will be something else.’
Dr van Kersen thought the new Web Ontology Language, with the transposed acronym OWL, would help. Italian ontology researcher Nicola Guarino discussed the European Commission’s On-To-Knowledge project, its RDF tool Sesame,⁸ and the Ontology Inference Layer (OIL) language.
Dr Ross described the US Defense Department’s DARPA (Defense Advanced Research Projects Agency) Mark-up Language (DAML) programme and its DAML+OIL variant. Mr Degenhart-Drenth highlighted the importance of protocols used in Web Services, such as SOAP (Simple Object Access Protocol) and UDDI (Universal Description, Discovery and Integration), and also pointed to a SPECTRUM XML standard for museums that he helped write. Then there was the moviemakers’ audio-visual search standard, MPEG-7, and the SMIL (Synchronized Multimedia Integration Language). No one mentioned SHOE (Simple HTML Ontology Extensions), which was surprising as the discussion became more and more alphabetic and upper case.
ONTOLOGY TUTORIAL IN 800 WORDS
Nicola Guarino brought the discussion back on track. The ontology expert delivered a fascinating, impromptu, 800-word dissertation on ontology genetics.
Ontologies, he said, started because it was realised that controlled vocabularies, which worked well enough for limited periods, needed something extra to make them really useful. They needed clarification of intended meaning. This could be achieved in much the same way as dictionaries did it, by reference to other, more basic terms. This was, he said, the key point.
‘Ontologies can work if the basic terms are really used in a principled way. There is a hidden assumption here that it is indeed possible to express the meaning in terms of a relatively small set of primitive terms.’
He explained further: ‘There are general terms that have a universal meaning. Take the term “part”, for instance, or “set”. Or take temporal relations: “before”. Suppose I have two different periods, the Renaissance and another period, and suppose I say that this period comes “before” the other one: do you exclude the case whereby the two periods overlap, or not? This is just a matter of stipulation; this is a general term that is not domain specific. I can simply stipulate exactly whether the “before” relationship between the two intervals includes the case of overlapping or not. And I can do that by means of axioms. Once you clarify the meaning of the basic terms, topological relations, mereological⁹ relations, dependence relations, these kinds of things, then you have the basic vocabulary that helps you to introduce more domain-related things. And this is what people are doing in the area of what are called “Foundational Ontologies”, and this is what I am doing. I believe this is the only way to solve the problem of semantic interoperability. So, not just controlled vocabularies, but vocabularies that are formally defined in minimal terms.’
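Mr Guarino’s point about stipulating ‘before’ by axiom can be made concrete with a small sketch. The two functions below are two possible stipulations, not definitions taken from any published ontology, and the period names and years are placeholders chosen only so that the intervals overlap.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Period:
    name: str
    start: int  # first year of the period
    end: int    # last year of the period

def before_strict(a, b):
    """Stipulation 1: a is before b only if a has ended when b starts."""
    return a.end < b.start

def before_allowing_overlap(a, b):
    """Stipulation 2: a is before b if a starts first, even if they overlap."""
    return a.start < b.start

# Placeholder dates, not historical claims.
period_a = Period("Period A", 1300, 1600)
period_b = Period("Period B", 1550, 1750)

print(before_strict(period_a, period_b))            # False: the periods overlap
print(before_allowing_overlap(period_a, period_b))  # True: A starts first
```

Two catalogues that both record ‘A before B’ agree only if they share the same stipulation, which is why the choice has to be written down formally rather than left to the reader of the vocabulary.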
Now it was clear, but would it be available to heritage institutions? Dr Ross wanted to know if fundamental ontologies of use to the heritage sector already existed. What would they be? Before any
On-To-Knowledge
On-To-Knowledge is an IST RTD project that was completed in June 2002. The project developed tools and methods for supporting knowledge management in large and distributed organisations. The technical backbone of On-To-Knowledge was the use of ontologies for the various tasks of information integration and mediation. For the project’s many results, see their tools repository, project deliverables and publications at http://www.ontoknowledge.org/pubs.html
See also the On-To-Knowledge book: ‘Towards the Semantic Web: Ontology-driven Knowledge Management’, J. Davies, D. Fensel, F. van Harmelen (eds.), John Wiley, December 2002.
6 Office of the e-Envoy, http://www.e-envoy.gov.uk
7 Govtalk, http://www.govtalk.gov.uk
8 Sesame environment, http://www.ontoknowledge.org/tools/factsheet/Sesame.html
9 Mereology, n.: The formal study of the logical properties of the relation of part and whole. Collins English Dictionary, Third edition, Glasgow 1991.
development could begin, basic classifications and methodologies would be required to form a foundation for the work.
Italian online publisher Marco Meli, a manager of the EU’s MESMUSES (Metaphors for Science Museums) project, insisted: ‘You need a clear definition in this particular domain. What are the key words, the terms?’
Mr Guarino answered: ‘The key concepts and the key relations.’
THE ELUSIVE SEMANTIC GRAIL
By now, the Darmstadt 13 were beginning to realise they were not having much luck with their search for the Semantic Grail. Their discussion became a little tetchy.
Someone talked about ‘meta-ontology’ and another growled: ‘The term “meta” is abused.’
‘Outdated technical optimism’ was mentioned.
‘What is your alternative?’ someone else wanted to know.
‘I don’t have one.’
‘Is this the way forward?’
MESMUSES - Metaphors for Science Museums
MESMUSES is an IST RTD project that will run until July 2003. It aims at designing a general method and supporting tools to produce knowledge maps for use in self-learning environments of science museums. In the project, a knowledge map is defined as a set of related concepts and facts that is offered to learners with some guidance or suggestions on possible itineraries that they may follow to explore the knowledge space.
The method and tools developed in MESMUSES are being tested and validated by two large science museums, the Cité des Sciences et de l’Industrie in Paris and the Istituto e Museo di Storia della Scienza in Florence, which provide access to their digital catalogues. Both museums are developing knowledge maps and itineraries on different themes in Biology (Genome) and Physics (Galileo and the laws of motion).
Project Web site: http://cweb.inria.fr/Projects/Mesmuses
See also: M. Meli, Knowledge Management: a new challenge for science museums. In: Cultivate Interactive, Issue 9, 7 February 2003, http://www.cultivate-int.org/issue9/mesmuses
‘What part of the problem would that solve?’
‘We will know when we have tried that out.’
‘Here we are approaching a scary field.’
Civility and peace were restored as Frank Nack, the CWI Netherlands scientist, introduced the thought: ‘There are ontologies for art and they are very old and well crafted. There are very clear rules about why they did what they did, because they have worked on them for a thousand years. What you could suggest is that we strip down to the basics for one field, say art, and apply it to all the other fields we have in cultural heritage: architecture, film, whatever, all working with very different substances.’
Seamus Ross added: ‘So we need one fundamental ontology on which we can build all the others.’
THE LUCK CHANGES
The luck of the 13 was beginning to change. The Athenian heritage informatics expert, Dr Dallas, described work among his company clients on developing ‘an upper ontology’. All the issues the Forum had been discussing, what to do about time, basic concepts, process, agents and so on, were being examined. They were beginning to develop ‘something very much like a thesaurus’, with term expansion that created sub-categories of relationships. He called it ‘generic layering’, a process that could identify the ‘generic grammar’ of relationships within a specific domain - art history, for example.
‘This is useful,’ he said. ‘This way we can create Web systems that present an association of content for users that is meaningful to them. Let’s say a “cultural” meaning.’
Nicola Guarino went further. He believed that the International Council of Museums’ Committee for Documentation (CIDOC) Conceptual Reference Model (CRM) was the ‘best starting point’ for the heritage community. The CIDOC CRM is the result of 10 years’ work by a standards working group. The
model is under review for adoption as an International Organization for Standardization (ISO) publication.
Mr Guarino was enthusiastic: ‘I am not biased on this. I am a reviewer on the CHIOS project that supports this proposal for an ISO standard, and I am amazed by the fact that this standard is more or less principled. On the one hand, the authors really strive to get these principled things and, on the other hand, they have extensively accounted for existing practice. It is the result of a large community of work. It is not perfect, but it really is a starting point for this community.’
He affirmed Dr Ross’s delighted question: ‘So this is an ontology we can borrow?’
And the Italian expert had more to add: ‘The question before was “how can we be sure that the principled things can really solve our problems?” I do not have a crisp answer, but I do have some evidence that even a tiny result on the foundational side has a high pay-off. So you do not need to solve all the foundational issues.’
The distinctions between an object and its role, or individual and classes of items, were delicate but, once understood, could lead to significant data improvement, he said, adding: ‘Take, for instance, the distinction between object and event. This is only one tiny distinction, but it is so fundamental that once you understand it you can save time in developing your own application ontology. Tiny conceptual progress does have a high pay-off. This is why I believe it is useful.’
There were still one or two doubts, but CULTOS project co-ordinator Wernher Behrendt tidied up with a daring stance: ‘Let me be a heretic for a second. How many of us have an operating system other than Windows? What I am saying is that standardisation often helps. Even if it is not the best standard, it does help get people working together. It is perfectly fair to define a standard for the world now. There will be a lot of discussion, but it will wind down to a few constructs. It is a better method of getting an ontology accepted than having ontologies mushrooming all over the place that must then be integrated.’
The Darmstadt 13 were pleased. They had a model. They had ‘Web Services’ stepping stones to work across. They weren’t going to fall into the trap of insisting just yet that the Semantic Web was important for the heritage sector, but they wanted an education process for unconvinced curators.
They needed someone to make the first ontology move. Seamus Ross suggested asking the J. Paul Getty Trust (http://www.getty.edu). They needed automated tools for testing ontologies. They did not want to be delivered into the entertainment industry, but saw benefit in what University of Florence Associate Professor Franco Niccolucci called ‘cultural entertainment’.
So where would the experts put their money, asked Mr Behrendt: ‘On there not being any benefits in the Semantic Web for the cultural heritage sector, or there being some benefits in building such things, whatever they may be?’
Amsterdamer Frank Nack was in no doubt: ‘It is going to happen. It will probably look very different from how we imagine it right now, but it is going to happen.’ His countryman, ADLIB chief Bert Degenhart-Drenth, thought so, too: ‘We have put our money there. All our applications work with XML.’ And Dr van Kersen agreed: ‘I would put my money on the Semantic Web.’
That seemed to make it game, set and match.
CIDOC CRM: ‘The Semantic Glue’
The ‘CIDOC object-oriented Conceptual Reference Model’ (CIDOC CRM) was developed by the ICOM/CIDOC Documentation Standards Group. Since September 2000, the CIDOC CRM has been under development as an ISO standard.
‘The CIDOC CRM is intended to promote a shared understanding of cultural heritage information by providing a common and extensible semantic framework to which any cultural heritage information can be mapped. It is intended to be a common language for domain experts and implementers to formulate requirements for information systems and to serve as a guide for good practice in conceptual modelling. In this way, it can provide the semantic glue needed to mediate between different sources of cultural heritage information, such as that published by museums, libraries and archives.’
http://cidoc.ics.forth.gr/what_is_crm.html
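The idea of mapping differently structured records onto one shared framework can be sketched as follows. This is a toy illustration, not the CRM itself: the property names has_title and created_by, the field names and the record identifiers are invented for the example, and the real model is considerably richer (it describes, among other things, the events through which objects come into being).

```python
# Hypothetical field-to-property mappings for two institutions.
MUSEUM_MAPPING = {"object_name": "has_title", "maker": "created_by"}
LIBRARY_MAPPING = {"title": "has_title", "author": "created_by"}

def to_common_model(record_id, record, mapping):
    """Translate one local record into (subject, property, value) statements."""
    return {
        (record_id, mapping[field], value)
        for field, value in record.items()
        if field in mapping
    }

def titles_by(statements, creator):
    """Find the titles of all items attributed to a creator, whatever the source."""
    subjects = {s for s, p, v in statements if p == "created_by" and v == creator}
    return sorted(v for s, p, v in statements if p == "has_title" and s in subjects)

# Two records with different local schemas (titles taken from the image list).
museum_record = {"object_name": "Der Naturen Bloeme", "maker": "Jacob van Maerlant"}
library_record = {"title": "Spieghel Historiael", "author": "Jacob van Maerlant"}

statements = (
    to_common_model("museum:1", museum_record, MUSEUM_MAPPING)
    | to_common_model("library:7", library_record, LIBRARY_MAPPING)
)

print(titles_by(statements, "Jacob van Maerlant"))
```

Once both sources are expressed in the shared vocabulary, a single query reaches across them; that mediation between sources is what the quotation calls ‘semantic glue’.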
CHIOS - Cultural Heritage Interchange Ontology Standardization project
Since June 2001, the work of the CIDOC CRM Special Interest Group has been supported by CHIOS, a two-year project which receives funding from the Fifth Framework IST Programme. The CHIOS consortium forms an integral part of the CIDOC CRM Special Interest Group which, by organising shared meetings, represents the interests and requirements of the cultural heritage community to the ISO Working Group (ISO/TC46/SC4/WG9).
http://cidoc.ics.forth.gr/chios_iso.html
Annotation & Authoring
‘How to annotate and have fun’, http://annotation.semanticweb.org/faq/faq2, and other papers. See also http://annotation.semanticweb.org/tools, Institute for Applied Informatics and Formal Description Methods, University of Karlsruhe, Germany.
The Annotea Project, a ‘Live Early Adoption and Demonstration (LEAD)’ project of the World Wide Web Consortium (W3C): collaboration environment with shared annotations. www.w3.org/2001/Annotea
CIDOC CRM
The CIDOC Conceptual Reference Model was an important reference point in the Forum discussion (see also the information box on the CIDOC CRM in the Summary). CIDOC, the Comité International pour la Documentation, is part of the International Council for Museums (ICOM). Its Web site can be found at http://cidoc.ics.forth.gr
A report on CIDOC’s work on the CRM is provided in ‘The CIDOC Conceptual Reference Model: A Standard for Communicating Cultural Contents’ by Nick Crofts, Martin Doerr and Tony Gill, in: Cultivate Interactive, Issue 9, February 2003, http://www.cultivate-int.org/issue9/chios
See also their tutorials at http://cidoc.ics.forth.gr/tutorials.html
DAML - DARPA Agent Markup Language
The Defense Advanced Research Projects Agency (DARPA) is the central research and development organisation for the US Department of Defense. Its DAML programme is developing a language and tools to facilitate Semantic Web concepts. http://www.daml.org
‘Why Use DAML?’, Adam Pease, white paper, Teknowledge, 10 April 2002, http://www.daml.org/2002/04/why.html
DAML+OIL
DAML+OIL is a product of the DARPA Joint United States/European Union ad hoc Agent Markup Language Committee. The committee created a language with the best features of SHOE, DAML, OIL and several other markup approaches. It is a Web ontology language (latest release March 2001) expected to provide a basis for future Web standards for ontologies. See http://www.daml.org/2001/03/daml+oil-index.html
Genesis ndash The Creation Stars and Fishes
SEMANTIC WEB TERMS AND READING LIST A-X
Compiled by Guntram Geser and Michael Steemson
In the Summary, we have provided information boxes on projects, but not on the many Semantic Web standards, technologies, etc. mentioned in the Forum discussion. The projects included are only a small fraction of many ongoing activities, and are related mainly to the cultural and scientific heritage community. Links to many important Semantic Web development projects can be found at their community portal, http://www.semanticweb.org
The following guide points to resources and readings on terms mentioned in the Forum Summary. It is not intended to provide a comprehensive list of Semantic Web materials. Rather, it represents different entry points and levels to this topic.
OIL - Ontology Inference Layer
OIL is a proposal for a Web-based representation and inference layer for ontologies. It uses a layered approach to defining a standard ontology language. Each layer adds functionality and complexity to the previous layer. This is done such that machines that can only process a lower layer can still partially understand high-level ontologies. See http://www.ontoknowledge.org/oil
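The layering principle can be pictured with a short sketch. The layer names and constructs below are invented for the illustration and are not OIL’s actual vocabulary; the point is only that a processor built for a lower layer keeps the statements it recognises and sets the rest aside instead of rejecting the whole ontology.

```python
# Hypothetical layers: each higher layer adds constructs to the one below.
LAYERS = {
    "core": {"subclass_of"},
    "standard": {"subclass_of", "disjoint_with"},
}

# A small ontology that uses a construct from the higher layer.
ONTOLOGY = [
    ("Psalter", "subclass_of", "Manuscript"),
    ("Breviary", "subclass_of", "Manuscript"),
    ("Manuscript", "disjoint_with", "Painting"),
]

def understood(statements, layer):
    """Split statements into those a processor of `layer` can use and the rest."""
    known = LAYERS[layer]
    usable = [s for s in statements if s[1] in known]
    skipped = [s for s in statements if s[1] not in known]
    return usable, skipped

usable, skipped = understood(ONTOLOGY, "core")
print(usable)   # the two subclass statements
print(skipped)  # the disjointness statement, ignored by a core-only processor
```

A core-only processor still learns the class hierarchy; it simply cannot draw the extra conclusions that the disjointness statement would allow.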
A white paper on OIL functions: ‘An informal description of Standard OIL and Instance OIL’, by the On-To-Knowledge group, led by Department of Computer Science, University of Manchester, UK, 28 November 2000, www.ontoknowledge.org/oil/downl/oil-whitepaper.pdf
Ontologies
Ontology research: Laboratory for Applied Ontology (LOA), Institute of Cognitive Sciences and Technology (ISTC), http://www.ladseb.pd.cnr.it/infor/ontology/ontology.html
Dieter Fensel: Ontologies: A Silver Bullet for Knowledge Management and Electronic Commerce. New York: Springer, 2001.
‘Ontology Infrastructure for the Semantic Web’, includes detailed subject bibliography. WonderWeb Project, Department of Computer Science, Victoria University of Manchester, UK, http://wonderweb.semanticweb.org/deliverables/documents/D15.pdf
Standard Upper Ontology (SUO): An upper ontology for data interoperability, information search and retrieval, automated inferencing, and natural language processing. IEEE Standard Upper Ontology (SUO) Working Group, http://suo.ieee.org
OWL - Web Ontology Language
The Web Ontology Language is a semantic markup language for publishing and sharing ontologies on the World Wide Web. OWL is developed as a vocabulary extension of the Resource Description Framework (RDF) and is derived from the DAML+OIL Web Ontology Language. For the development of this language see the documents of the Web Ontology (WebOnt) Working Group, http://www.w3.org/2001/sw/WebOnt
See also OWL Web Ontology Language Reference (W3C Working Draft, 31 March 2003) at http://www.w3.org/TR/owl-ref and OWL Web Ontology Language Guide (W3C Working Draft, 31 March 2003), http://www.w3.org/TR/owl-guide
For an understanding of the goals, requirements and usage scenarios for a Web ontology language see ‘Web Ontology Language (OWL) Use Cases and Requirements’ (W3C Working Draft, 31 March 2003), http://www.w3.org/TR/webont-req
RDF - Resource Description Framework
See Cultural Heritage Semantic Web Example & Primer, pp. 32-34
Semantic Web
‘The Semantic Web’, Tim Berners-Lee with James Hendler and Ora Lassila, Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html
‘Enhanced Science and the Semantic Web’, J.A. Hendler in Science magazine, Volume 299, Number 5606, 24 January 2003, pp. 520-521
‘Peer-to-Peer: The Infrastructure for the Semantic Web’, Stanford University. The Semantic Web as the next evolutionary step of the Internet, http://p2p.semanticweb.org
The Semantic Web Community Portal: SemanticWeb.org, currently operated by three research groups: the Onto-Agents and Scalable Knowledge Composition (SKC) Research Group at Stanford University, the Ontobroker-Group at the University of Karlsruhe, Germany, and the Protégé Research Group at Stanford University, http://www.semanticweb.org
The W3C Semantic Web Activity Statement explains the consortium’s plans in the areas of enabling standards (driven by the RDF Core and Web Ontology Working Groups), education and outreach (RDF Interest Group), as well as co-ordination and advanced development, http://www.w3.org/2001/sw/Activity
See also Kim Veltman’s warning of what he sees to be too narrow a definition of the Semantic Web, one that will not allow the historical dimension, the richness of cultural expression, the unique and the diversity of interpretations to be adequately dealt with. Cf. K. Veltman, Challenges for a Semantic Web (July 2002), http://www.cultivate-int.org/issue7/semanticweb
SHOE - Simple HTML Ontology Extensions
SHOE was one of the first ontology-based markup languages developed for use on the World Wide Web. It is a small extension to HTML that allows Web page authors to annotate their Web documents with machine-readable knowledge. See http://www.cs.umd.edu/projects/plus/SHOE
SMIL - Multimedia on the Web
The Synchronized Multimedia Integration Language (SMIL, pronounced ‘smile’) enables authoring of interactive audiovisual presentations. SMIL is typically used for ‘rich media’/multimedia presentations which integrate streaming audio and video with images, text or any other media type. SMIL is an HTML-like language and may be written using a simple text editor. W3C Synchronized Multimedia, http://www.w3.org/AudioVideo
SPECTRUM-XML DTD
SPECTRUM (Standard Procedures for Collections Recording Used in Museums) was created by the mda (http://www.mda.org.uk). It is a guide to good practice for museum documentation that describes procedures for documenting objects and the processes they undergo, as well as the necessary information that needs to be recorded to support the procedures. For SPECTRUM, an XML Document Type Definition has been produced, which serves as a system-neutral interchange format for museum data.
‘SPECTRUM: The UK Museum Documentation Standard’ is available in its second edition; see http://www.mda.org.uk/spectrum.htm
For a description of the creation, structure and deployment of the SPECTRUM-XML DTD, see Bert Degenhart-Drenth, Building on the mda SPECTRUM-XML DTD for Collections Management Data Interchange, Museums and the Web Conference 2001, http://www.archimuse.com/mw2001/papers/degenhart/degenhart.html
Web Services
Although the concept of Web Services is not a new one, the definition of Web Services standards allows the wider use of these services. Currently three main protocols are used in the context of Web Services: UDDI (Universal Description, Discovery and Integration), a registry system to find resources and Web Services; WSDL (Web Services Description Language), an interface description language; and SOAP (Simple Object Access Protocol), the communication protocol for Web Services. All three protocols are based on XML.
For a description of the many ways in which XML can enhance Web Services, see http://www.w3.org/2002/ws/Activity
World Wide Web
Looking back as well as into the future: ‘Weaving the Web: The Original Design and Ultimate Destiny of the World Wide Web by Its Inventor’, Tim Berners-Lee with Mark Fischetti, Harper San Francisco, 1999
XML - eXtensible Markup Language
See Cultural Heritage Semantic Web Example & Primer, pp. 27-30
For the work done at W3C within the XML activity, see XML Working Groups, http://www.w3.org/XML; the XML Specifications can be found at http://www.w3.org/XML/Core
‘The bane of my existence is doing things that I know the computer could do for me’ - Dan Connolly, The XML Revolution, October 1998, http://www.nature.com/nature/webmatters/xml
SEMANTIC WEB SHOULD BE BASED ON WELL-FOUNDED ONTOLOGIES
AN INTERVIEW WITH NICOLA GUARINO, LABORATORY OF APPLIED ONTOLOGY, ITALY
By Joost van Kasteren
‘The Semantic Web, as it is advocated by people like Tim Berners-Lee and James Hendler, does not take enough advantage of the experience built up in knowledge engineering and conceptual modelling. There is this anarchistic idea of the Web as a place where everyone can do his or her own thing. I have no problem with that; a lot of people are able to find what they want on the Web. But if you want real interoperability, with search engines that can grasp the intended meaning of information, that approach falls short. To create a real Semantic Web we have to develop and use well-founded generic ontologies based on linguistics and logics.’
Nicola Guarino has clear views on the Semantic Web and its development. He is a senior researcher at the Institute for Cognitive Sciences and Technologies in Italy, where he leads the Laboratory for Applied Ontology. Since 1991 he has played an active role in the Artificial Intelligence community in promoting the interdisciplinary study of ontological foundations of knowledge engineering and conceptual modelling. Guarino: ‘In our Laboratory the focus is on content and not so much on representation. The use of ontologies is unavoidable when referring to content. People do it implicitly all the time when they are communicating and trying to understand each other. If we want machines to understand each other, in other words real interoperability, we need to make these ontologies explicit in an unambiguous way.’
An ontology is a hierarchical description of the relations between concepts in a certain domain, plus an unambiguous description of the concepts themselves. As they are created for a certain domain, ontologies often fail to be interoperable because of the ambiguity that results from the use of the same terms for different concepts (and vice versa) between different domains. The term ‘net’, for instance, has quite a different meaning for Web designers and fishermen. That is why there is a need for well-founded generic ontologies. An example of a notion that a generic ontology has to define is the term ‘part’, which can have different meanings both within a domain and between domains. For instance, the violist plays a part in the orchestra. His finger is part of him. Can his finger be part of the orchestra? According to Guarino this is a genuine ontological problem that can only be solved by giving an unambiguous meaning to the term ‘part’.
Another example cited by Guarino is the term ‘in’. What exactly are you describing when you say the spoon is in the cup? Does it mean that the spoon is totally embedded in the cup, or is it only partly in the cup? Guarino: ‘These examples seem trivial, but if you want real interoperability between different knowledge domains, you will have to prevent the problems that come with the ambiguity of day-to-day language.’
In this respect Guarino thinks it is a drawback that computer science curricula scarcely ever contain an introduction to ontological foundations of conceptual modelling. ‘Students learn all about Java, HTML, C++ and all the other languages you can name, and they also learn how to use these. But when they graduate they hardly know a thing about formal ontology. I really think people should know more about the work on ontology that has been done in philosophy. It is certainly not much harder to acquire than, say, studying differential equations or learning how to use Java.’
Developing well-founded generic ontologies seems an enormous job, but the task is smaller than it appears. Guarino: ‘I would say that a few dozen would get you on the way nicely. But you have to take the fundamental route. At the moment, development of the Semantic Web is driven by the need for short-term results. Hence interoperability is realised by putting the right tags on the information. That is not what I call semantics; that is syntax. XML and RDF are very useful for this, but they fall short when you want to create a real Semantic Web.’
Laboratory of Applied Ontology: http://ontology.ip.rm.cnr.it
A CULTURAL HERITAGE SEMANTIC WEB EXAMPLE & PRIMER
By Guntram Geser
The Semantic Web, according to a statement of its well-known advocate Tim Berners-Lee, is a vision of ‘a distributed machine which should function so as to perform socially useful tasks’.1 This machine should allow intelligent software agents to understand semantic relationships between Web resources in order to seek relevant information and perform transactions for humans.
Contrasted with the existing human-readable Web, the Semantic Web is envisaged as a Web of machine-readable data that will be based on ‘languages for expressing information in a machine processable form’.2 Key to an understanding of the Semantic Web, therefore, is how these languages function, how information is expressed in order that computers can automatically process Web sources and assist in making the Web more useful for humans. The aim of this chapter is to provide an overview of the Semantic Web concept by describing its general architecture, i.e. the interplay of its languages.
The chapter has two interrelated parts. Part 1 describes a Finnish project that strives to build the foundations for the “Finnish Museums on the Semantic Web” (FMS), a future semantic museum portal. This part consists of the information boxes on the following pages, which briefly describe the necessary elements and steps in the set-up of the FMS system. It is recommended to start by reading this description (see also graphic 3 on page 36, which provides an overview of the set-up of the FMS system). It should be helpful in gaining a general understanding of how semantic interoperability of, and new ways of interacting with, semantically marked-up cultural heritage information can be realised.
Part 2, the texts below the information boxes, is a primer that explains terms used in part 1 which represent core elements of the Semantic Web architecture, as well as providing illustrative examples. The explanations are not intended to give in-depth definitions of these elements; such definitions are provided in the relevant W3C specifications. The examples have been kept as simple as possible but build on each other. In this way we will develop a (fictitious) Website, http://www.m-i.org, that provides semantically enhanced access to such marvellous medieval images as the ones we have used to illustrate this Thematic Issue.
How to Make Collection Metadata of Museums Semantically Interoperable on the Web - The “Finnish Museums on the Semantic Web” (FMS)
The Semantic Web concept is visionary, and there are dedicated people, also in the heritage sector, who are trying to make it a reality. In our example, a group of researchers and technology developers who work at the University of Helsinki and the Helsinki Institute for Information Technology are translating the Semantic Web vision for a future semantic museum portal.
The group’s two-year project will run until spring 2004 and is being carried out in co-operation with, and with funding from, major organisations including Espoo City Museum, Helsinki University Museum, National Board of Antiquities, Nokia, TietoEnator and the National Technology Agency (TEKES).
1 Tim Berners-Lee, Interpretation and Semantics on the Semantic Web (1998), http://www.w3.org/DesignIssues/Interpretation.html
2 Tim Berners-Lee, Semantic Web Road Map (1998), http://www.w3.org/DesignIssues/Semantic.html
3 Robert DuCharme, http://lists.xml.org/archives/xml-dev/200211/msg00190.html
The major goals of the project are to make collection metadata, which stem from heterogeneous databases, semantically interoperable on the Web and to provide facilities for semantic browsing and searching in the combined knowledge base of the participating museums.
The project’s vision is called “Finnish Museums on the Semantic Web” (FMS), and its architecture allows for all Finnish museums to join in. However, in an approach of starting small but ambitious, the project is at present using the collection databases of two museums, the Espoo City Museum and the National Museum of Finland. Furthermore, the implementation is currently restricted to only one part of the collections: textiles.
In order to reach the FMS system’s goal of making the museums’ metadata semantically interoperable on the Web, the data must be harmonised on the syntactic and semantic level. For this harmonisation the eXtensible Markup Language (XML) and the Resource Description Framework (RDF) are being used, of which RDF is the key language for achieving semantic interoperability of the heterogeneous sets of metadata.
RDF and Metadata - A Natural Fit
An observer of the diffusion of the Resource Description Framework (RDF) into various domains, Robert DuCharme, has commented: ‘I still find it a little ironic that while RDF has gotten so much publicity as a technology for warm and fuzzy AI (Artificial Intelligence) pie-in-the-sky technology, it’s gotten most of its traction in the mundane world of metadata.’3
Yet given the importance of metadata for the Semantic Web vision in general, it does not come as a surprise that metadata of key information communities belong to the first of RDF’s intended uses. RDF seems to gain momentum in particular among the library and other communities that use Dublin Core.
The current W3C RDF Primer (Working Draft, 23 January 2003), edited by Frank Manola and Eric Miller, labels RDF as ‘an ideal representation for Dublin Core information’ and describes Dublin Core as one of their ‘RDF in the field’ examples (Cf. http://www.w3.org/TR/rdf-primer).
At the Dublin Core Metadata Initiative (DCMI), ‘Expressing Simple Dublin Core in RDF/XML’ was announced as a DCMI Recommendation in October 2002, ‘the first in a series of recommendations for encoding Dublin Core metadata using mainstream Web technologies’, i.e. XML/RDF/XHTML. ‘Expressing Qualified Dublin Core in RDF/XML’ is currently a Proposed Recommendation. Cf. http://dublincore.org/groups/architecture
Syntactic Transformation 1: Creating the XML Documents
In the FMS system the eXtensible Markup Language (XML) is used as the data transfer format. This transfer format enables the system to make use of the data originally stored in the museums’ heterogeneous collection databases. Therefore, each museum participating in the FMS initiative provides the relevant collection data as an XML document repository.
In a process of syntactic harmonisation, the data from a museum’s collection database are retrieved and transformed to an XML format conforming to the XML Schema of the FMS initiative.
The data to be published are read from the database through a ‘view’, which helps create the XML format. The view is a queryable interface, a virtual table that results from an SQL query which may join multiple tables of the database. Through the view the data are queried so that the rows of the tables are grouped by collection items. For each item, the set of rows is combined into a single XML document.
XML
XML is a markup language for describing data. It is a language created to allow anyone to design the structure of their own documents. An XML document contains text that consists of markup in the form of tags and plain text between them, the latter being just pure information (for example <creator>Alexander Master</creator>). XML tags are not predefined; everyone can define his or her own tags.
FMS Documents
A short presentation is provided in Eero Hyvönen et al., Cultural Semantic Interoperability on the Web: Case Finnish Museums Online, http://iswc2002.semanticweb.org/posters/hyvonen_a4.pdf; for detailed descriptions see Vilho Raatikka, Eero Hyvönen, Ontology-based Semantic Metadata Validation, and Hyvönen, Eero et al., Semantic Interoperability on the Web: Case Finnish Museums Online. Both texts can be found in Towards the Semantic Web and Web Services, Proceedings of the XML Finland 2002 Conference, http://www.cs.helsinki.fi/u/eahyvone/xmlfinland2002/Proceedings/XML2002-final.pdf
ITEM_VIEW (columns of the view): item_id, type, subject, iconclass, creator, manuscript, place, year
XML shares the syntax and bracketed tags of the well-known HyperText Markup Language (HTML), but XML serves a different goal. While HTML is used to define the layout of pages on the WWW, XML is used to define the content of documents, for example to specify that an area of text is the name of a creator.
XML allows for creating markup (e.g. <creator>) that seems to carry some semantics. However, for a computer a tag like <creator> carries as much semantics as a tag like <H1>. A computer simply does not know what a creator is and how the concept creator is related to other concepts (e.g. manuscript). For an XML processor <H1> and <creator> or <manuscript> are all equally (and totally) meaningless. XML is all about describing data; on its own it does not do anything. There needs to be a processing program that uses the markup to interpret the various elements.
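The point that markup alone carries no meaning can be illustrated with a minimal Python sketch. It uses only the standard library module xml.etree.ElementTree; the helper name describe is our own and is not part of any FMS tooling. The parser returns the same kind of information for both fragments, a tag name and some text, and nothing more.

```python
import xml.etree.ElementTree as ET


def describe(fragment):
    """Return everything a plain XML parser knows about a one-element fragment."""
    element = ET.fromstring(fragment)
    return (element.tag, element.text)


# Two fragments with different tag names but identical structure.
as_creator = describe("<creator>Alexander Master</creator>")
as_heading = describe("<H1>Alexander Master</H1>")

print(as_creator)  # ('creator', 'Alexander Master')
print(as_heading)  # ('H1', 'Alexander Master')
```

Any notion of what a creator is, or how it relates to a manuscript, has to come from a program or an ontology layered on top of the parser.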
The graphic below illustrates the database rows to XML process as described in the info box. It provides a very simple example of an XML document that describes some data for one of the medieval column miniatures from the Koninklijke Bibliotheek, The Hague, which we were permitted to use for illustrating this Thematic Issue. It includes the Iconclass classification for this image: 71A3421, Eve emerges from Adam’s body (for the hierarchical path of this classification see the section on ontologies).
Short explanations for the XML document (image5kb78d38i.xml) shown in the graphic below:
A well-formed XML document is one that conforms to the XML syntax rules, of which we would like to highlight the following:
(1) The document should begin with the XML declaration, which defines the XML version and the character encoding used in the document. In the example below we use <?xml version="1.0" encoding="ISO-8859-1"?>, i.e. the document conforms to the 1.0 specification of XML and uses the ISO-8859-1 (Latin-1/West European) character set.
(2a, 10) The XML document must contain a single tag pair to define a root element, in our example <mi:image> </mi:image>.
Database rows (item data from database rows, grouped by item):
image5kb78d38i, ‘Column Miniature’, ‘Eve emerges from Adam’s Body’, ‘71A3421’, ‘Alexander Master’, ‘Historic Bible’, ‘Utrecht’, ‘circa 1430’
Rows-to-XML process
XML document (image5kb78d38i.xml):
(1) <?xml version="1.0" encoding="ISO-8859-1"?>
(2) <mi:image xmlns:mi="http://www.m-i.org/images"
      image_id="image5kb78d38i">
(3) <mi:type>Column Miniature</mi:type>
(4) <mi:subject>Eve emerges from Adam's body</mi:subject>
(5) <mi:iconclass>71A3421</mi:iconclass>
(6) <mi:creator>Alexander Master</mi:creator>
(7) <mi:manuscript>Historic Bible</mi:manuscript>
(8) <mi:place>Utrecht</mi:place>
(9) <mi:year>circa 1430</mi:year>
(10) </mi:image>
Graphic 1: Database rows to XML process
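The rows-to-XML process of Graphic 1 can be sketched in a few lines of Python using only the standard library (sqlite3 and xml.etree.ElementTree). This is a minimal sketch under stated assumptions, not the FMS implementation: the two tables, the view name item_view and the function names are invented for illustration, and each item is assumed to yield exactly one row in the view, so no grouping of several rows per item is shown.

```python
import sqlite3
import xml.etree.ElementTree as ET

MI_NS = "http://www.m-i.org/images"  # namespace of the fictitious M-i.org site
ET.register_namespace("mi", MI_NS)

# Column order of the (hypothetical) view, after the item_id column.
FIELDS = ["type", "subject", "iconclass", "creator", "manuscript", "place", "year"]


def build_demo_database():
    """Create an in-memory database with two invented tables and a view joining them."""
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        CREATE TABLE item (item_id TEXT PRIMARY KEY, type TEXT, subject TEXT,
                           iconclass TEXT, creator TEXT, manuscript_id INTEGER, year TEXT);
        CREATE TABLE manuscript (manuscript_id INTEGER PRIMARY KEY, title TEXT, place TEXT);
        CREATE VIEW item_view AS
            SELECT i.item_id, i.type, i.subject, i.iconclass, i.creator,
                   m.title AS manuscript, m.place, i.year
            FROM item i JOIN manuscript m ON m.manuscript_id = i.manuscript_id;
        """
    )
    db.execute("INSERT INTO manuscript VALUES (1, 'Historic Bible', 'Utrecht')")
    db.execute(
        "INSERT INTO item VALUES ('image5kb78d38i', 'Column Miniature', "
        "'Eve emerges from Adam''s body', '71A3421', 'Alexander Master', 1, 'circa 1430')"
    )
    return db


def row_to_xml(row):
    """Turn one row of the view into an <mi:image> document, returned as a string."""
    item_id, *values = row
    root = ET.Element(f"{{{MI_NS}}}image", {"image_id": item_id})
    for name, value in zip(FIELDS, values):
        ET.SubElement(root, f"{{{MI_NS}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")


db = build_demo_database()
documents = [row_to_xml(row) for row in db.execute("SELECT * FROM item_view")]
print(documents[0])
```

The view hides the join between the two tables, so the code that writes XML only ever sees one flat row per collection item.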
(2b) Namespace: Since element names in XML are not fixed, name conflicts can occur when different documents use the same names to describe different types of elements. To prevent such conflicts, a unique namespace should be defined using a Uniform Resource Identifier (URI). An XML namespace is a collection of names that are used as element types and attribute names (cf. http://www.w3.org/TR/REC-xml-names). Our namespace declaration in the start tag of the root element is xmlns:mi="http://www.m-i.org/images".
The namespace prefix mi (for medieval images) functions as a placeholder for the namespace name. It needs to show up in all element tags (e.g. <mi:type> </mi:type>).
(2c) image_id="image5kb78d38i": This is the id attribute, which contains the unique identifier of the data source record.
(3-9) All other elements must be within the root element and can themselves have sub-elements (child elements), which must be properly nested within their parent element. Our elements do not have sub-elements.
Other syntax rules are, for example: all start tags must match end tags; because XML tags are case sensitive (i.e. the tag <mi:Creator> is different from the tag <mi:creator>), they must also be written with the same case; all elements must have a closing tag; all attribute values must be within quotation marks (e.g. "image5kb78d38i").
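Several of these syntax rules can be tried out directly with a parser. The following minimal Python sketch relies only on the standard library module xml.etree.ElementTree, whose parser raises ParseError for input that is not well-formed; the helper name is_well_formed and the sample strings are our own.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


good = ('<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">'
        '<mi:creator>Alexander Master</mi:creator></mi:image>')
wrong_case = good.replace("</mi:creator>", "</mi:Creator>")    # end tag differs in case
unquoted = good.replace('"image5kb78d38i"', "image5kb78d38i")  # attribute value not quoted
two_roots = "<a/><b/>"                                         # more than one root element

for label, text in [("good", good), ("wrong_case", wrong_case),
                    ("unquoted", unquoted), ("two_roots", two_roots)]:
    print(label, is_well_formed(text))
```

Only the first string is accepted; each of the others breaks exactly one of the rules listed above.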
Syntactic Transformation 2: The XML Schema
In order to allow for syntactic harmonisation, the XML documents of the museums should conform to the XML Schema of the FMS initiative. Therefore, the museums validate the XML documents they create against the initiative’s XML Schema. If the documents are valid, the process can continue to the semantic level.
XML Schema
The XML Schema defines the building blocks of an XML document, including
- elements and attributes that can appear in a document
- which elements are child elements, as well as their order and number
- whether an element is empty or can include text
- the data types for elements and attributes
- as well as default and fixed values for elements and attributes
XML with an XML Schema is designed to be self-descriptive. One of the greatest strengths of XML Schema is that it allows for data typing. The most common data types are xs:string, xs:decimal, xs:integer, xs:boolean, xs:date, xs:time. In the example below, which is the XML Schema for the XML document (image5kb78d38i.xml) shown in graphic 1, we only use the data type xs:string. This data type is used for values that contain character strings.
(1) <?xml version="1.0"?>
(2a) <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
(2b) targetNamespace="http://www.m-i.org/images"
(2c) xmlns="http://www.m-i.org/images"
(2d) elementFormDefault="qualified">
(3) <xs:element name="image">
(4) <xs:complexType>
(5) <xs:sequence>
(6) <xs:element name="type" type="xs:string"/>
(7) <xs:element name="subject" type="xs:string"/>
(8) <xs:element name="iconclass" type="xs:string"/>
(9) <xs:element name="creator" type="xs:string"/>
(10) <xs:element name="manuscript" type="xs:string"/>
(11) <xs:element name="place" type="xs:string"/>
(12) <xs:element name="year" type="xs:string"/>
(13) </xs:sequence>
(14) <xs:attribute name="image_id" type="xs:string" use="required"/>
(15) </xs:complexType>
(16) </xs:element>
(17) </xs:schema>
Short explanations
(1) The XML declaration, which states that the document conforms to the 1.0 specification of XML.
(2a) Determines that the elements and data types that are used to construct the schema come from the W3C’s XML Schema namespace. Consequently, each of the elements and data types in the schema has the prefix xs, which identifies them as belonging to the vocabulary of the XML Schema language rather than the vocabulary of our (fictitious) organisation M-i.org.
(2b) Indicates that the elements defined by this schema come from our http://www.m-i.org/images namespace.
(2c) Is our default namespace.
(2d) Demands that any elements used by an XML document which were declared in this schema must be namespace qualified.
(3, 16) The parent element for the data typing of the image descriptions we provide at http://www.m-i.org/images. It is (4) defined as an xs:complexType, i.e. it contains child elements (6-12), which are (5, 13) surrounded by an xs:sequence element that defines an ordered sequence of these elements.
(6-12) The child elements, which in our example are simple types because they do not contain other elements. They define various elements of our XML documents, e.g. “creator”, to be of the data type xs:string.
(14) Furthermore, the Schema determines that for the element xs:element name="image" there is a required attribute image_id of the data type xs:string (for example image5kb78d38i).
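To make the constraints in the schema concrete, the sketch below checks them by hand in Python. It is deliberately not an XML Schema validator: the Python standard library does not include one, so the code simply re-implements three of the rules explained above (root element in the M-i.org namespace, required image_id attribute, child elements in the order given by the xs:sequence). The function names are our own.

```python
import xml.etree.ElementTree as ET

MI_NS = "http://www.m-i.org/images"
# Child elements in the order required by the xs:sequence in the schema.
EXPECTED_CHILDREN = ["type", "subject", "iconclass", "creator",
                     "manuscript", "place", "year"]


def check_image_document(text):
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    root = ET.fromstring(text)
    if root.tag != f"{{{MI_NS}}}image":
        problems.append("root element is not image in the M-i.org namespace")
    if root.get("image_id") is None:
        problems.append("required attribute image_id is missing")
    found = [child.tag for child in root]
    wanted = [f"{{{MI_NS}}}{name}" for name in EXPECTED_CHILDREN]
    if found != wanted:
        problems.append("child elements do not match the required sequence")
    return problems


def make_document(names, with_id=True):
    """Build a small test document with the given child element names."""
    attr = ' image_id="image5kb78d38i"' if with_id else ""
    body = "".join(f"<mi:{n}>x</mi:{n}>" for n in names)
    return f'<mi:image xmlns:mi="{MI_NS}"{attr}>{body}</mi:image>'


print(check_image_document(make_document(EXPECTED_CHILDREN)))         # []
print(check_image_document(make_document(EXPECTED_CHILDREN[::-1])))   # wrong order
print(check_image_document(make_document(EXPECTED_CHILDREN, False)))  # no image_id
```

In practice a museum would hand the schema file itself to a validating tool, so that the rules live in one place rather than being repeated in program code.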
Major Benefits of XML
In the context of the Semantic Web, XML provides an interoperable syntactical foundation upon which solutions to the issues of representing relationships and meaning can be built. We also want to highlight the many benefits of XML that are adding to its rapid uptake in the first place and might, in the longer term, be supportive in realising the Semantic Web vision on a broader scale.
XML is one of the most important standards developments in recent years. It is an international, universal, non-system and non-application specific data exchange standard. XML is international because it employs Unicode. This means that there is no restriction to the western alphabet, but Arabic, Chinese, Greek, Hebrew, Thai, etc. can be easily integrated.
XML is non-system specific because it is an open standard set by the World Wide Web Consortium (W3C). As such there is no owner of XML. All the major software suppliers support it; it can be used on any computing platform, from Windows and MacOS to Linux. This makes it easier for organisations to change systems or combine different systems.
XML is also non-application specific, i.e. it can be used in various applications such as data exchange, data harvesting, Web site management, etc. XML is gaining ever-wider acceptance in many application domains, including in particular the cultural heritage community.
Bear in mind also that the major collection management software producers have implemented support for XML in their systems, enabling, for example, the integration of data from different collections and their combination over the Web.
XML allows for multi-channel publishing, i.e. with XML it is easy to produce different products or services from digital cultural heritage assets. Once the data are structured in XML, they can be displayed across a variety of media using an associated style sheet that contains the display information.
Finally, XML can be used to create new languages. For example, the Wireless Markup Language (WML), which is used to mark up Internet applications for handheld devices, is written in XML.
Ontological concepts
The goal of the FMS project is to make metadata of the museums’ textiles collections semantically interoperable on the Web. In order to achieve such interoperability, an ontology is being designed that describes the common (lower-level) ontological concepts in this domain of knowledge.4
Ontology
In the Semantic Web architecture the semantic relationships are not embedded but explicitly represented by an ontology, or rather an interrelated set of ontologies. In fact, the wide array of information residing on the Web and the perceived need to make it more machine-processable have acted as a strong impetus for the development of ontology languages.
Yet what is an ontology? An abstract definition of an ontology is that it describes a formal, shared conceptualisation of a particular domain of interest, for example cultural heritage objects held in art museums. In particular, an ontology allows for constraining, expressing and analysing the intended meaning of the shared vocabulary of concepts and relations in a domain of knowledge.5
If these concepts and relations are formalised to a high degree, the domain has at hand a major building block for developing semantically aware information systems. With Semantic Web technologies the domain ontology can be made available on the network, cross-referenced with upper-level and other domain ontologies,6 and remote applications or intelligent software agents can refer to it when they interact to provide a certain information service.
4 The available documents (in English) on the FMS initiative state that their ontology is being created using RDF Schema (RDFS). To develop a fully fledged ontology, advanced languages such as DAML+OIL or Web Ontology Language (OWL) would be required.
5 For more elaborate and formal descriptions see Tom Gruber, What is an Ontology? (1995), http://www-ksl.stanford.edu/kst/what-is-an-ontology.html; Nicola Guarino, Ontology-Driven Conceptual Modelling, part 1-3 (2002), http://ontology.ip.rm.cnr.it/Tutorials
The degree of formalisation of concepts and their relations varies considerably between different domains of knowledge. At the lower end one finds lexicons and simple taxonomies (i.e. an ordered classification system where terms are related hierarchically). At the middle level one might place thesauri, i.e. controlled vocabularies that are structured to show relationships between terms and concepts and, for example, allow for retrieving them from a database. At the high end of formalisation of knowledge there are axiomatised logic theories. Such theories include rules to ensure the well-formedness and logical validity of statements expressed in the language of the scientific discipline.
In the cultural heritage sector a powerful IT-supported example of a hierarchical classification system is Iconclass. This supports the documentation of images, in particular art historical images, by providing a systematic collection of 28,000 ready-made definitions of objects, persons, events, situations and abstract ideas that can be the subject of an image. The definitions consist of an alphanumeric classification code and its textual correlate.7
For example, for the image we have described in XML in graphic 1, the Iconclass definition is 71A3421, Eve emerges from Adam’s body. The Medieval Illustrated Manuscripts Website of the Koninklijke Bibliotheek, The Hague, has an Iconclass Browser in place8 that provides the hierarchical path for this concept in the classification system.
An outstanding example of a controlled vocabulary is the Art & Architecture Thesaurus (AAT), one of the Getty Research Institute’s Vocabulary Databases. It is a structured vocabulary of more than 125,000 terms and other information about concepts that are used for describing fine art, architecture, decorative arts, archival materials and material culture.9
The W3C’s Resource Description Framework (RDF) Schema Specification 1.0, in a section on its scope, mentions concept navigation and states: ‘Thesauri and library classification schemes are well known examples of hierarchical systems for representing subject taxonomies in terms of the relationships between named concepts. The RDF Schema specification provides sufficient resources for creating RDF models that represent the logical structure of thesauri (and other library classification systems).’10
Yet for realising a full-blown cultural heritage ontology for the Semantic Web, there are currently limitations on both sides. On the one hand, hierarchical classification systems and structured vocabularies do not lend themselves easily to rich inter-linking of conceptual ‘trees’.
A major step further in this direction is the CIDOC object-oriented Conceptual Reference Model (CRM).11 This provides an ontology of 81 classes and 130 properties, which describes in a formal language concepts and relations relevant to the documentation of cultural heritage.
On the other hand, RDF Schema has limitations when it comes to expressing complex ontological relationships. New languages based on description logics are being developed. These include DAML+OIL and the upcoming Web Ontology Language (OWL), which are capable of fully describing ontologies.
Also worth highlighting is that tools for ontology building are proliferating at the present time. These ontology editors need to be carefully assessed, as their capabilities differ considerably.12
7 Bible
71 Old Testament
71A Genesis from the creation to the expulsion from paradise and later years of Adam and Eve
71A3 creation of man; the Garden of Eden (Genesis 1:26-2)
71A34 creation of Eve
71A342 Eve is fashioned from Adam’s rib
71A3421 Eve emerges from Adam’s body
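The hierarchical path listed above can be derived mechanically, because in this example every shorter prefix of the notation is itself the notation of the next broader concept. The following is a minimal Python sketch under that assumption. The prefix rule holds for 71A3421, but full Iconclass notations (with bracketed text or keys) would need more care, and the function name is purely illustrative.

```python
def iconclass_path(notation: str) -> list[str]:
    """Return every ancestor notation of a simple Iconclass code, root first.

    Assumption: each prefix of the code names the next broader concept,
    as in the 71A3421 example. This is a sketch, not the Iconclass rules.
    """
    return [notation[:i] for i in range(1, len(notation) + 1)]


path = iconclass_path("71A3421")
print(" > ".join(path))
```

Each element of the returned list could then be looked up in the classification to retrieve its textual definition.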
6 Upper-level ontologies describe the basic concepts and relationships invoked when information about any domain is expressed in natural language.
7 For in-depth information see the official Iconclass Website: http://www.iconclass.nl
8 Medieval Illustrated Manuscripts Website: http://www.kb.nl/kb/manuscripts browser. The subject access system for the Website was conceived by Mnemosyne Partners, building on the Iconclass classification system and technologies. See the valuable information they provide at httpwwwmnemosyneorg businessmsstempexhtml
9 A detailed description of the AAT is provided at http://www.getty.edu/research/tools/vocabulary/aat/about.html
10 http://www.w3.org/TR/2000/CR-rdf-schema-20000327
11 See the information box at the end of the Forum discussion and the sources mentioned in the Semantic Web Terms and Reading List.
12 Michael Denny: Ontology Building: A Survey of Editing Tools (06-11-2002), http://www.xml.com/pub/a/2002/11/06/ontologies.html
Semantic Transformation 1: The RDF Data Model
Progressing towards semantic interoperability, the metadata in the XML documents are now transformed into RDF statements corresponding to the RDF data model. With these so-called RDF ‘triples’, the XML metadata elements are mapped to the RDF classes and properties which are defined by the RDF Schema of the FMS initiative (see section 3).
‘XML is nothing more than a way to standardize data formats. This is not to underplay XML’s importance. A data-format standard makes all of the more glamorous technologies possible, and RDF is the leading example of the benefit that comes once the data format has been standardized. Many proclaim that RDF is really the XML’s killer app, and with good reason. Despite all this, RDF remains somewhat obscure. This is mainly because, at its core, RDF is very abstract, very dry, and very academic.’
Uche Ogbuji: An introduction to RDF (2000), http://www-106.ibm.com/developerworks/library/w-rdf/?dwzone=xml
RDF Data Model
In order to make Web resources semantically interoperable, we need resources that provide machine-understandable information about themselves. In the Semantic Web architecture these statements are built by using the Resource Description Framework (RDF).
RDF defines a data model for the statements, describing typed relationships between uniquely identified resources. RDF distinguishes between:
| resources: familiar examples are a Web page, an electronic document or a digital image, but in RDF also entities that are not ‘network retrievable’, e.g. museums, curators or bound medieval manuscripts, can be resources;
| properties: these identify a specific aspect, characteristic, attribute or relation used to describe the resource;
| statements: these associate a value for a named property with the resource.
Hence, RDF provides a model for describing relationships between resources in terms of named properties and values. The RDF data model intrinsically supports only binary relations. Its base element is the ‘triple’, which takes the form of subject, predicate, object: a resource (the subject) is linked to another resource (the object) through an arc labelled with a third resource (the predicate). The semantics of a triple clearly depends on the property used as predicate.
A convenient way to visualise this is to draw nodes for subject and object and an arrow between them for the predicate (see graphic 2). In this labelled directed graph, subject and predicate (property) are Uniform Resource Identifiers (URIs), and the object is either a URI or a literal (which is drawn as a box). Everything in RDF can be represented by a graph with nodes and arcs, and the data model allows for using the same URI as a node and as an arc label. To represent RDF statements in a machine-processable way, RDF builds on XML. With RDF/XML, a specific XML markup language, RDF information can be represented and exchanged between machines.
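As a rough illustration of the data model, the two subclass statements shown in graphic 2 can be written down as subject-predicate-object triples in a few lines of Python. This is only a sketch: it is neither RDF/XML nor the FMS implementation. The `Triple` type is hypothetical, and the ‘#’ that joins the example namespace to the local names is an assumption.

```python
from typing import NamedTuple

RDFS = "http://www.w3.org/2000/01/rdf-schema#"
MI = "http://www.m-i.org/schemas/images#"  # example namespace; the '#' is assumed


class Triple(NamedTuple):
    subject: str    # a URI
    predicate: str  # a URI naming the property
    object: str     # a URI or a literal value


graph = {
    Triple(MI + "ColumnMiniature", RDFS + "subClassOf", MI + "Miniature"),
    Triple(MI + "Miniature", RDFS + "subClassOf", MI + "Image"),
}

# Nodes are everything used as subject or object; arcs are the predicates.
nodes = {t.subject for t in graph} | {t.object for t in graph}
arcs = {t.predicate for t in graph}
```

Storing the triples in a set reflects the graph view: a statement is either present or absent, and the order of statements carries no meaning.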
Graphic 2: RDF Data Model

Triple 1
Subject: http://www.m-i.org/schemas/images#ColumnMiniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Miniature

Triple 2
Subject: http://www.m-i.org/schemas/images#Miniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Image

With these two triples we state that ColumnMiniature is a subclass of Miniature, and that Miniature is a subclass of Image.
The predicate in our statements is the rdfs:subClassOf property, which is predefined in the RDF Schema namespace http://www.w3.org/2000/01/rdf-schema#
The classes Image, Miniature and ColumnMiniature would also need to be defined in our RDF Schema namespace http://www.m-i.org/schemas/images#
For details on how to use RDF Schema for defining your domain ontology, see section 3. In a nutshell, RDF Schema is a ‘higher-level’ language which is itself defined using RDF.
Semantic Transformation 2: Creating and Validating the RDF Statements
In the mapping process an editor tool (in the FMS case a tool developed by the project team, called Meedio) receives as input the XML documents and assists in transforming them into semantically valid RDF statements (instance descriptions). The tool serves as an instance editor, which provides a convenient way of finding and selecting from the XML metadata elements correct instance values for a particular property. The editor tool also serves as a semantic metadata validator. When the museum cataloguer saves the set of RDF statements corresponding to an XML document, the semantics of these statements are validated against the property constraints of the FMS ontology. The result of a successful mapping and validation process is a unique set of RDF triples, called the RDF card.
RDF Mapping Rules
When RDF is used to define the meaning of XML metadata elements, a set of mapping rules is created. A mapping rule is a template of RDF triples where XPath expressions are used to identify the actual element values. XPath is a language for addressing parts of an XML document and was designed for use in XML parsing software (XSLT, XPointer and others).
When applying such a rule to an XML document, the XPath expressions are instantiated with matching element values. If the rule matches, the RDF template evaluates to a set of RDF triples where XPath expressions are substituted by the corresponding values of the XML elements.
For example, by applying the template rule
<image5kb78d38itype mi:hasCreator image5kb78d38icreator>
to the XML document described in the section on XML, the following result would be obtained:
<Image Miniature mi:hasCreator ‘Alexander Master’>
mi:hasCreator is an example of an RDF property. Such properties are explained in section 3, RDF Schema.
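The way a mapping rule is instantiated can be sketched in Python with the standard library’s ElementTree, which supports a limited subset of XPath. The XML record, its element names and the rule layout below are assumptions made for illustration. They are not the FMS document format or the Meedio rule syntax.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML record; the element names are assumptions.
XML_DOC = """
<image id="5kb78d38">
  <type>Image Miniature</type>
  <creator>Alexander Master</creator>
</image>
"""

# A rule is a triple template whose subject and object are XPath expressions.
RULE = ("./type", "mi:hasCreator", "./creator")


def apply_rule(root, rule):
    """Instantiate the template; return no triples if an XPath finds nothing."""
    subject_path, predicate, object_path = rule
    subject = root.findtext(subject_path)
    obj = root.findtext(object_path)
    if subject is None or obj is None:
        return []  # the rule does not match this document
    return [(subject.strip(), predicate, obj.strip())]


triples = apply_rule(ET.fromstring(XML_DOC), RULE)
```

A document that lacks one of the addressed elements yields an empty list, which corresponds to a rule that does not match.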
Note: Due to the limited space permitted, we do not address issues of term mapping. This is an important aspect of the mapping and validation process carried out in the FMS project. Working with metadata from different museums, the project team needs to deal with partly different terminologies. Their technical solution to synonymous terms (i.e. different terms referring to the same concepts) is to attach synonym sets to the FMS ontology classes. Where polysemous terms occur (i.e. the same terms refer to different concepts), the editor tool cannot cope and the cataloguer needs to select the correct interpretation.
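The difference between the two cases in the note can be made concrete with a small Python sketch. The class names, terms and function names are hypothetical. The sketch only shows the principle: a synonym maps to one class automatically, while a polysemous term is handed back to the cataloguer.

```python
# Hypothetical synonym sets attached to ontology classes (illustrative terms).
SYNONYMS = {
    "mi:Miniature": {"miniature", "illumination"},
    "mi:Initial": {"initial", "decorated initial"},
    "mi:Drawing": {"drawing", "illumination"},  # 'illumination' is polysemous here
}


def classes_for_term(term: str) -> list[str]:
    """Return all classes whose synonym set contains the term."""
    key = term.strip().lower()
    return sorted(cls for cls, terms in SYNONYMS.items() if key in terms)


def resolve(term: str):
    """Map a term automatically only when exactly one class matches."""
    candidates = classes_for_term(term)
    if len(candidates) == 1:
        return candidates[0]
    return None  # unknown or ambiguous: the cataloguer has to decide
```

In an editor, the list returned by `classes_for_term` could be offered as the set of interpretations from which the cataloguer selects.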
Semantic Transformation 3: The RDF Schema (RDFS)
The shared ontology for the textiles domain is created by using Resource Description Framework Schema (RDFS). An RDF Schema is a tool for indicating the classes of resources one wants to describe, as well as for defining the properties used to describe those resources. Furthermore, class/sub-class relationships and property/sub-property relationships can be defined. The museums are mapping their metadata to the classes and properties defined by the RDF Schema of the FMS initiative. Thereby they are making the meaning of the metadata explicit and representing them in a harmonised, uniform way.
RDF Schema
In section Semantic Transformation 1 we have described the data model provided by RDF for expressing statements about Web resources. But we also need a vocabulary for the RDF statements, namely classes and properties defined with RDF Schema (RDFS).
In brief, the RDF Schema mechanism provides a pre-defined vocabulary, a basic type system that can be used in creating domain-specific schemas. Its role is to allow for declaring metadata properties (e.g. for ‘type’, ‘subject’ or ‘creator’), to define the classes of resources they may be used with, to restrict possible combinations, and to detect violations of those restrictions.
Defining classes
With RDF Schema (RDFS), Web resources can be defined as instances of one or more classes. In addition, classes can be organised in a hierarchical fashion. As we hold a collection of digital images drawn from illustrated medieval manuscripts, we first need to define a class of things that are images. In RDF Schema a class is any resource having an rdf:type property whose value is the RDFS-defined resource rdfs:Class.
So, using the basic RDF data model, we define:
mi:Image [resource] rdf:type [property] rdfs:Class [value]
The self-defined prefix mi: (for medieval images) stands for the URI reference of our RDF Schema namespace http://www.m-i.org/schemas/images
In our image collection we have various special kinds of digitised images, such as column miniatures, decorated initials, schematic drawings, etc. To distinguish, for example, the miniatures, first we need to define a general class Miniature and subclasses of miniatures, e.g. a subclass ColumnMiniature:
mi:Miniature rdf:type rdfs:Class
mi:ColumnMiniature rdf:type rdfs:Class
Secondly, we need to define that mi:ColumnMiniature is a subclass of mi:Miniature and that mi:Miniature is a subclass of mi:Image, for which we use the predefined rdfs:subClassOf property:
mi:Miniature rdfs:subClassOf mi:Image
mi:ColumnMiniature rdfs:subClassOf mi:Miniature
As the rdfs:subClassOf property is transitive, this means that mi:ColumnMiniature is also implicitly a subclass of mi:Image.
Graphic 2 on page 32 visualises this with the nodes and arcs of the basic RDF data model.
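The implicit subclass relationship that follows from transitivity can be computed by following the declared statements step by step. Below is a minimal Python sketch, assuming the two declared subclass statements of the running example are held as simple pairs. The function name and the pair representation are illustrative, not part of RDFS.

```python
# Declared statements: (subclass, superclass).
SUBCLASS_OF = {
    ("mi:ColumnMiniature", "mi:Miniature"),
    ("mi:Miniature", "mi:Image"),
}


def superclasses(cls: str, pairs=SUBCLASS_OF) -> set[str]:
    """Collect all direct and implied superclasses of cls."""
    found: set[str] = set()
    todo = [cls]
    while todo:
        current = todo.pop()
        for sub, sup in pairs:
            if sub == current and sup not in found:
                found.add(sup)
                todo.append(sup)
    return found
```

For mi:ColumnMiniature the result contains both mi:Miniature (declared) and mi:Image (implied), which is exactly the inference described in the text.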
Defining properties
In order to make the meaning of our metadata (i.e. ‘type’) explicit, we need to be capable of declaring specific properties that characterise the classes of things we hold at http://www.m-i.org, e.g. digital images of medieval column miniatures.
Basically, RDF Schema defines properties in terms of the classes of resources to which they apply. This is the role of the rdfs:domain and rdfs:range mechanisms.
rdfs:range
The range constraint defines the class or set of classes whose instances can be values of a particular property. If we want to define the property mi:hasType, we must describe this resource (which we locate at http://www.m-i.org/schemas/images) with an rdf:type property whose value is rdf:Property:
mi:hasType [resource] rdf:type [property] rdf:Property [value]
The following RDF statements indicate that mi:ColumnMiniature is a class, mi:hasType is a property, and RDF statements using the mi:hasType property have instances of mi:ColumnMiniature as values:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:range mi:ColumnMiniature
rdfs:domain
The domain constraint restricts the set of classes whose instances may have a particular property attached to them. If we want to indicate that the property mi:hasType applies to instances of class mi:ColumnMiniature, we would write:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:domain mi:ColumnMiniature
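Read as constraints, in the way this primer describes them, domain and range declarations allow a tool to flag statements that use a property with the wrong kind of resource. The Python sketch below mirrors the declarations above. The instance identifiers, the single-parent class table and the function names are assumptions made for illustration.

```python
SUBCLASS_OF = {"mi:ColumnMiniature": "mi:Miniature", "mi:Miniature": "mi:Image"}
DOMAIN = {"mi:hasType": "mi:ColumnMiniature"}
RANGE = {"mi:hasType": "mi:ColumnMiniature"}

# Hypothetical instances and their declared classes.
TYPE_OF = {"mi:img001": "mi:ColumnMiniature", "mi:img002": "mi:Image"}


def is_a(cls, target):
    """True if cls equals target or is a (transitive) subclass of it."""
    while cls is not None:
        if cls == target:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def violations(subject, prop, obj):
    """List the domain and range constraints that a statement breaks."""
    problems = []
    if prop in DOMAIN and not is_a(TYPE_OF.get(subject), DOMAIN[prop]):
        problems.append(f"domain: {subject} is not a {DOMAIN[prop]}")
    if prop in RANGE and not is_a(TYPE_OF.get(obj), RANGE[prop]):
        problems.append(f"range: {obj} is not a {RANGE[prop]}")
    return problems
```

A statement whose subject is only known to be an mi:Image is reported as a domain violation, because mi:Image is a superclass, not a subclass, of mi:ColumnMiniature.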
Benefits of RDF
In a SearchWebServices.com definition of RDF some benefits of RDF are mentioned:
| ‘By providing a consistent framework, RDF will encourage the providing of metadata about Internet resources.
| Because RDF will include a standard syntax for describing and querying data, software that exploits metadata will be easier and faster to produce.
| The standard syntax and query capability will allow applications to exchange information more easily.
| Searchers will get more precise results from searching, based on metadata rather than on indexes derived from full text gathering.
| Intelligent software agents will have more precise data to work with.’ [13]
This is a well-crafted listing of RDF benefits, from provision and exchange of better metadata to agents working with them, hopefully for the benefit of humans. But, as explicitly stated by SearchWebServices.com, these are only potential benefits, i.e. they depend on the level of actual uptake of RDF.
13 whatis.com / searchWebServices.com Definitions - Resource Description Framework, http://searchwebservices.techtarget.com/sDefinition/0,,sid26_gci213545,00.html (last updated July 27, 2001)
Generating and Using the Knowledge Space
The RDF cards represent the original XML documents at the semantic level. The union of such RDF cards constitutes a knowledge base, which is a harmonised semantic representation of the underlying heterogeneous databases.
However, so far the RDF instance descriptions have not left the museum. The museum has complete control of the information it wants to publish, and it does not need to allow the FMS system access to its internal database system. The RDF data are placed in a public directory on the museum’s WWW server.
The Web crawler of the FMS system harvests the instance descriptions from the different museums, and the system combines them into an RDF repository. This repository is a large semantic graph that consists of the shared ontology and metadata.
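Conceptually, combining the harvested RDF cards with the shared ontology amounts to a set union of triples. A minimal Python sketch follows, with invented example data. The record identifiers, the mi:hasCreator values and the function name are hypothetical, and the sketch says nothing about how the FMS crawler or repository is actually built.

```python
def build_repository(ontology, cards):
    """Union of the shared-ontology triples and all harvested RDF cards."""
    repository = set(ontology)
    for card in cards:
        repository |= set(card)
    return repository


ontology = {("mi:ColumnMiniature", "rdfs:subClassOf", "mi:Miniature")}
museum_a = [("a:rec1", "rdf:type", "mi:ColumnMiniature")]
museum_b = [
    ("b:rec7", "rdf:type", "mi:ColumnMiniature"),
    ("b:rec7", "mi:hasCreator", "Alexander Master"),
]

repository = build_repository(ontology, [museum_a, museum_b])
```

Because the repository is a set, a statement harvested twice is stored only once, and the museums’ own card files remain untouched.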
How does a user now search and navigate in this knowledge space? In the FMS system this is implemented by a server-side software called Ontogator. Based on the semantic graph, this software dynamically generates semantic linkages for the user’s Web browser.
One way of using the FMS system is view-based filtering. The user can select classes of resources from the ontology, and the system finds the instances that match the selected class restrictions. By constraining classes (views) further, the collection instance data searched for are eventually found.
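The principle of view-based filtering can be sketched in Python as a repeated narrowing of a result set. The records, view names and class names below are invented for illustration, and the sketch is not a description of how Ontogator works internally.

```python
# Hypothetical collection records: for each view, the class the record belongs to.
RECORDS = {
    "rec1": {"type": "ColumnMiniature", "period": "15thCentury"},
    "rec2": {"type": "DecoratedInitial", "period": "15thCentury"},
    "rec3": {"type": "ColumnMiniature", "period": "14thCentury"},
}


def filter_records(selection, records=RECORDS):
    """Return the ids of records matching every selected (view, class) pair."""
    return sorted(
        rid
        for rid, classes in records.items()
        if all(classes.get(view) == cls for view, cls in selection.items())
    )
```

Selecting only the period 15thCentury returns rec1 and rec2. Adding the type ColumnMiniature as a second restriction narrows the result to rec1.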
The software also supports topic-based navigation by providing semantic links between topics of interest, the creation of which is based on the collection domain ontology and the related metadata of the collection records. This means that the links also provide the user with an impression of the wider context and pragmatics of the objects in the museums’ collections.
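One simple way to picture such semantic links is to connect records that share a value for the same property. The Python sketch below does this for a handful of invented triples. The linking criterion, the mi:hasSubject property and the record identifiers are assumptions for illustration only, and the actual link generation in the FMS system may differ considerably.

```python
from collections import defaultdict

# Hypothetical instance metadata as (record, property, value) triples.
TRIPLES = [
    ("rec1", "mi:hasCreator", "Alexander Master"),
    ("rec2", "mi:hasCreator", "Alexander Master"),
    ("rec2", "mi:hasSubject", "71A3421"),
    ("rec3", "mi:hasSubject", "71A3421"),
]


def related(record, triples=TRIPLES):
    """Map each related record to the (property, value) pairs it shares."""
    by_value = defaultdict(set)
    for subject, prop, value in triples:
        by_value[(prop, value)].add(subject)
    links = defaultdict(set)
    for shared, subjects in by_value.items():
        if record in subjects:
            for other in subjects - {record}:
                links[other].add(shared)
    return dict(links)
```

For rec2 this yields a link to rec1 (same creator) and a link to rec3 (same subject), each labelled with the reason for the connection.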
From human users to software agents
As described in the Finnish Museums on the Semantic Web example, the RDF repository is a large semantic graph that consists of the shared ontology and metadata of the participating museums. Such a repository can be queried, and the results, a set of pointers to the relevant resources, can be accessed using Web browsers. The opportunities provided by a system like the one developed by the FMS initiative (e.g. topic-based navigation) are at present restricted primarily to human users.
The Semantic Web vision includes intelligent software agents which ‘understand’ semantic relationships between Web resources and seek relevant information as well as perform transactions for humans. [14]
This software would be capable of autonomous action, i.e. could run without direct human control or constant supervision, and ideally is very flexible in doing this. Characterisations of this flexibility include actions that are ‘reactive’, ‘proactive’ and ‘social’ (see below).
While the basic idea of agents is very intuitive and appealing, the actual theory is complex, the tools are immature, the solutions small and prototype-based. In fact, as a parallel distributed systems technology, agents belong to the most complex class of software technology.
However, this primer will conclude with a summary of what an intelligent software agent is and what such software would generally be capable of doing. This should also serve as an indication of how great the challenge for research and technological development is to make the full Semantic Web vision a reality.
Intelligent Software Agents
The following definitions are taken from Michael Wooldridge’s introduction to multiagent systems. [15]
Agent
‘An agent is a computer system capable of autonomous action in some environment.’
Intelligent agent
‘An intelligent agent is a computer system capable of flexible autonomous action in some environment.’
Flexible autonomous action
‘By flexible autonomous action we mean reactive, proactive, social.’
| Reactivity: ‘A reactive system is one that maintains an ongoing interaction with its environment, and responds to changes that occur in it (in time for the response to be useful).’
| Proactiveness: ‘An agent serves a purpose and therefore exhibits goal-directed behaviour, including the capacity to recognise opportunities for useful courses of action.’
| Social ability: ‘Social ability in agents is the ability to interact with other agents (and possibly humans) via some kind of agent communication language, and perhaps cooperate with others.’
Desirable further properties of agents are:
| Mobility: the ability to move around an electronic network.
| Rationality: an agent will act in such a way that it does not prevent itself from achieving its goals (as far as this is possible with a limited set of beliefs representing its world knowledge).
| Learning: an agent will improve its performance over time.
14 T. Berners-Lee, J. Hendler, O. Lassila: Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html
15 M. Wooldridge: An Introduction to Multiagent Systems. Chichester: Wiley, 2002, and http://www.csc.liv.ac.uk/~mjw/pubs/imas/
Resources
In the primer no references are made to the documents of the World Wide Web Consortium (W3C). All relevant W3C recommendations can be found at http://www.w3c.org
Of the wealth of introductory materials on XML and RDF available on the Web, the following in particular are useful to consult for further details:
http://www.w3schools.com/xml/
http://www.w3schools.com/schema/
http://www.w3.org/TR/rdf-primer/
K. S. Candan, H. Liu, R. Suvarna: Resource Description Framework: Metadata and its Applications. In: SIGKDD Explorations, Vol. 3/1 (2001), 6-19
Pierre-Antoine Champin: RDF Tutorial (2001), http://www710.univ-lyon1.fr/~champin/rdf-tutorial/rdf-tutorial.html
S. Decker, M. P. Mitra, S. Melnik: Framework for the Semantic Web: An RDF Tutorial, http://www.ida.liu.se/~asmpa/courses/sweb/rdf/rdf_tutorial.pdf
Graphic 3: Set-up of the Finnish Museums on the Semantic Web System (overview of the system’s set-up). Components shown: user client (WWW browser; topic-based navigation, view-based filtering); server (Ontogator software, RDF database); semantic graph (knowledge space of shared ontology and metadata); Web crawler; for each museum, a collection database (1 to n) with metadata editor, XML repository and RDF instances. Interoperability layers: RDF Schema (semantic interoperability), XML Schema (syntactic interoperability), relational schemas (DBMS).
THE DARMSTADT FORUM PARTICIPANTS

Wernher Behrendt, Salzburg Research, Austria
Wernher Behrendt is a Senior Researcher at the Sun Technology and Research Excellence Center at Salzburg Research (Austria), working on multimedia middleware and interoperation issues. He holds an MSc in Cognitive Science from Manchester University and has more than 10 years’ experience in near-to-market IT research. From 1989 to 1995 he was a Senior Research Associate in the Informatics Department at Rutherford Appleton Laboratory (UK), working on embedded knowledge based systems in distributed multimedia presentation systems. From 1995 to 1998 he was a Senior Research Associate at Cardiff University (UK), working on interoperation between heterogeneous information systems. Mr Behrendt has held courses in Computer Science and has worked on projects ranging from software engineering methods and quality assurance to legacy system reengineering using migration methods and distributed systems middleware.
E-mail: wernher.behrendt@salzburgresearch.at

Paolo Buonora, Archivio di Stato di Roma, Italy
Paolo Buonora holds a degree in Philosophy from the University of Rome ‘La Sapienza’ (1976). He worked in the Italian State Archive Administration from 1978, where he was first involved in editing the Guida Generale degli Archivi di Stato italiani. From 1986 he worked in the Soprintendenza archivistica per il Lazio, surveying audiovisual archives and municipal archives; from 1989 to 1991 he was at Perugia University, engaged in doctoral research in ‘Urban and rural history’; and from 1991 to 1994 he was again in the Soprintendenza archivistica per il Lazio. After 1994 he worked in the Archivio di Stato di Roma, where he was responsible for the photograph service and several working groups on informatics application in archival documentation. From 1997 until the present time he has planned and directed the Imago II project in the Archivio di Stato di Roma.
See httpwwwasrmarchivbeniculturaliitsidimagoIMAGOIIenhtml

Samuel Cruz-Lara, University of Nancy 2 and LORIA, France
Samuel Cruz-Lara obtained a Master’s degree in Computer Science in 1984 (University Henri Poincaré, Nancy 1) and a PhD degree in Computer Science in 1988 (National Polytechnic Institute of Lorraine). The central topic of his PhD thesis was the generation of integrated development environments by using attribute grammars.
He is currently Associate Professor at the University of Nancy 2 (Institute of Technology, Computer Science Department) and permanent Researcher at LORIA (Lorraine Laboratory for Research in Computer Science and its Applications - UMR 7503 - CNRS - INRIA - Universities of Nancy). He is a member of the ‘Language and Dialogue’ team and has conducted several research activities on distributed software architectures and textual linguistic resources management. He is currently working in the context of distributed architectures and multimedia resources management. Dr Cruz-Lara has participated in several projects, in particular CNRS/SILFIDE and MLIS-ELAN, and he is at present co-leader of the ‘Digital Museum’ project. This is a joint project between LORIA and the National Chi-Nan University. The ‘Digital Museum’ project is sponsored by the National Science Council of the Republic of China (Taiwan) and supported by INRIA (France).

Costis Dallas, Critical Publics SA, Greece
Costis Dallas is Chairman and Senior Researcher of Critical Publics (http://www.criticalpublics.com), a group of companies active in the field of strategic communications, creative design and technology. He is currently a Lecturer in the Department of Communication and Mass Media of Panteion University and has over 15 years of research and professional experience in hypermedia applications, human factor issues and cultural information systems. Dr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA, co-founder and Chair of the Multimedia Working Group of the International Council of Museums (CIDOC/ICOM), Head of Documentation and Systems of the Benaki Museum, General Director of the Foundation of the Hellenic World, Special Secretary of the Greek Ministry of Education in charge of libraries, archives and instructional technologies, and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: info@criticalpublics.com
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (http://www.nl.adlibsoft.com), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that he joined Databasix, producer of the ADLIB software. In 1991 Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was renamed later as ADLIB Information Systems.
ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include:
CIMI Dublin Core test bed - ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database, in addition to the ‘standard’ HTML/XML access.
SPECTRUM-XML - A project of the UK mda together with a team of software vendors to produce a schema for exchange of data which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software.
Internet Gelderse Musea (http://www.igem.nl) and Maritiem Digitaal (http://www.maritiemdigitaal.nl) - Two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.
E-mail: bert@nl.adlibsoft.com
Nicola Guarino, Italian National Research Council, Italy
Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991 and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS) and associated editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: Nicola.Guarino@ladseb.pd.cnr.it
Janneke van Kersen, Digital Heritage Association, The Netherlands
Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992 she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing, and more recently she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University.
Since October 1999 Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata, and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations such as museums, archives, archaeological organisations, monuments and special collections of libraries in a context-rich and XML-based environment. Interoperability from both a technical and an organisational perspective is therefore a major issue. The Association offers access to heritage information to the general public at http://www.cultuurwijzer.nl and customised access for the educational field at http://www.cultuurwijs.nl
E-mail: jannekevankersendennl
Marco Meli, EDW International, Italy
Marco Meli is CEO and co-founder of EDW International (http://www.edw-international.com), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia and VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meli@edw.it
Paul Miller, UKOLN, UK
Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, http://www.ukoln.ac.uk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK’s Further and Higher Education Funding Councils and Resource: the Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI) and the Metadata Working Group of the UK Government’s Office of the e-Envoy.
E-mail: P.Miller@ukoln.ac.uk
Frank Nack, CWI, The Netherlands
Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on ‘The Application of Video Semantics and Theme Representation for Automated Film Editing’ at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document, and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: Frank.Nack@cwi.nl
Franco Niccolucci, Florence University, Italy
Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference.
See also http://www.geog.port.ac.uk/hist-bound/people/niccolucci.htm
Seamus Ross, HATII, University of Glasgow, UK
Dr Seamus Ross is Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Network) (IST-2001-32706), a European Union funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously he was Assistant Secretary for Information Technology at the British Academy, and before that worked for a company specialising in expert systems and software development as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group’s PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: S.Ross@hatii.arts.gla.ac.uk
Andrea Scotti, Institute & Museum for the History of Science, Italy
Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995 he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.
Since 1996 Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza and the National Central Library, and under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (01/02/01-30/07/03), funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies. See http://cweb.inria.fr/Projects/Mesmuses
E-mail: scottian@galileo.imss.firenze.it
DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over the period of 30 months (03/2002-08/2004).
In order to encourage early take-up, DigiCULT produces seven Thematic Issues and three Technology Watch Reports, along with the newsletter DigiCULT.Info.
DigiCULT draws on the results of the strategic study ‘Technological Landscapes for Tomorrow’s Cultural Economy (DigiCULT)’ that was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications) in 2000 and completed in 2001.
Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at http://www.digicult.info
For further information on DigiCULT please contact the team of the project co-ordinator:
Mr Guntram Geser, guntram.geser@salzburgresearch.at, Phone: +43-(0)662-2288-303
Mr John Pereira, john.pereira@salzburgresearch.at, Phone: +43-(0)662-2288-247
Salzburg Research Forschungsgesellschaft
Jakob-Haringer-Str. 5/III
A - 5020 Salzburg, Austria
Phone: +43-(0)662-2288-200
Fax: +43-(0)662-2288-222
http://www.salzburgresearch.at
Project Partner:
HATII - Humanities Advanced Technology and Information Institute, University of Glasgow
http://www.hatii.arts.gla.ac.uk
Contact: Mr Seamus Ross, s.ross@hatii.arts.gla.ac.uk
The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom
DIGICULT PROJECT INFORMATION
DigiCULT Thematic Issue 1 - Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum, held in Barcelona on May 6th, 2002, in the context of the DLM-Conference 2002.
DigiCULT Thematic Issue 2 - Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum, held in Essen, Germany, on September 3rd, 2002, in the context of the AIIM Conference @ DMS EXPO.
DigiCULT Thematic Issue 3 - Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum, held on January 21st, 2003, at Fraunhofer IPSI, Darmstadt, Germany.
DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum on Learning Objects that will take place at the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, on July 2nd, 2003.
IMPRINT
This Thematic Issue is a product of the DigiCULT Project (IST-2001-34898).
Authors:
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy
Images:
Images for this Thematic Issue have been provided by and are reproduced with kind permission of the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, Netherlands.
Graphics & Layout:
Jan Steindl, Salzburg Research
ISBN 3-902448-00-8
Printed in Austria
© 2003
IMAGES
Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I
Image on p. 5 from Fol. 264r, size 80x75, illuminator: Orosius Master a.o.
Bible Historiale, Paris, c. 1320-1340, Volume I
Image on p. 7 from Fol. 2r, size 45x50, illum.: ‘Sub-Fauvel’ Master
Historic Bible, Utrecht, c. 1430, Volume I
Images on pp. 8, 10, 14, 28 from Fol. 3r, 3v, 4v, 7v, size 55x85 to 60x85, illum.: Alexander Master o.a.
Psalter Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468
Images on pp. 11, 13: Wednesday Matins, Invitatorium, from Fol. 219r
Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350
Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron) from Fol. 104rb1, 106rb2, 124r, 111ra, size 40x55 to 50x55
Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335
Image on p. 21 from Fol. 4va1, size 45x55
Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460
Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41
Psalter, Normandy, c. 1180
Image on p. 26 from Fol. 3v, size 160x125
Breviary, Cambray(?), c. 1275-1300
Images on pp. 27, 29, 31, 33, 34, 35, 37 from Fol. 211r, size 205x135
© Koninklijke Bibliotheek, The Hague. Used with permission.
CAPITALS AND ACRONYMS
The Forum moved on to a discussion of the science and software behind the Semantic Web. Moderator Seamus Ross started by questioning whether ‘key thinkers’ were ‘missing a fundamental point: that Web pages are dead, that database-driven Web pages are the future, that people are going to stop making Web pages and make databases’.
He recalled that James A. Hendler, co-author with Tim Berners-Lee of the Semantic Web paper and a professor in the Department of Computer Science at the University of Maryland, had recommended semantic representation as part of any Web pages. Dr Ross commented: ‘This notion of the Semantic Web is not going to work with these databases. Is this true or not?’
Bert Degenhart-Drenth, the managing director of Netherlands ADLIB Information Systems (http://www.nl.adlibsoft.com), thought it was true. But it was a practical problem that would be solved
Open Archives Initiative (OAI) Protocol for Metadata Harvesting
The Open Archives Initiative (OAI) is a mainly US-based group of people and organisations that evolved out of a need to increase access to scholarly publications through interoperable digital repositories. Support for the OAI’s goals comes from the Digital Library Federation, the Coalition for Networked Information and from an NSF grant. One of its major achievements is an application-independent interoperability framework based on metadata harvesting: the OAI Protocol for Metadata Harvesting.
The OAI Protocol is based on the standard Web protocols HTTP and XML, and employs Dublin Core (unqualified) as metadata standard. Heritage organisations who have systems that support the OAI Protocol can expose metadata about the content in their repository, i.e. allow service providers to harvest the data for services such as search engines. In the OAI Protocol, XML schema is used at two levels: to define the format of responses to all OAI Protocol requests, and to define the format of metadata streams embedded in the GetRecord and ListRecords responses. In both cases, the goal is to provide a mechanism for data validation. http://www.openarchives.org
The OAI-PMH version 2.0, released in June 2002, can be found at http://www.openarchives.org/OAI/2.0/openarchivesprotocol.htm
See also John Perkins: A New Way of Making Cultural Information Resources Visible on the Web: Museums and the Open Archives Initiative. Museums and the Web Conference 2001. http://www.archimuse.com/mw2001/papers/perkins/
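As a rough illustration of what harvesting looks like in practice, the sketch below builds two OAI-PMH request URLs, one for each of the verbs named in the box (ListRecords and GetRecord), asking for unqualified Dublin Core via the metadata prefix oai_dc. The repository address, the record identifier and the helper name build_request are assumptions made for the example, not part of any real service; no request is actually sent.

```python
from urllib.parse import urlencode

# Hypothetical repository endpoint, used only for illustration.
BASE_URL = "http://repository.example.org/oai"


def build_request(base_url, verb, **arguments):
    """Return an OAI-PMH request URL: base URL plus the verb and its arguments."""
    params = {"verb": verb}
    params.update(arguments)
    return base_url + "?" + urlencode(params)


# Harvest all records as unqualified Dublin Core (metadata prefix "oai_dc").
list_records = build_request(BASE_URL, "ListRecords", metadataPrefix="oai_dc")

# Fetch one record; the identifier below is a made-up example value.
get_record = build_request(
    BASE_URL,
    "GetRecord",
    identifier="oai:repository.example.org:item-1",
    metadataPrefix="oai_dc",
)

print(list_records)
print(get_record)
```

A service provider would issue such requests over HTTP and validate the XML responses against the protocol's schemas before indexing them.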
Open Archives Forum Project
http://www.oaforum.org
The Open Archives Forum Project is a two-year Fifth Framework Programme IST accompanying measure that will run until September 2003. The Forum is building a Web-based database on OAI-related projects, software implementations and services, and supports the information exchange between OAI user communities.
Their surveys provide good insight into the status of uptake of the OAI in Europe. For an overview of the results, see S. Dobratz, B. Matthaei: Open Archives Activities and Experiences in Europe: An Overview by the Open Archives Forum. In: D-Lib Magazine, Vol. 9, No. 1, January 2003, http://www.dlib.org/dlib/january03/dobratz/01dobratz.html
Recently, one of their workshops, ‘Providing Access to Hidden Resources’ (Lisbon, December 2002), targeted the libraries and archives communities. Requirements, standards, best practice and solutions to interoperability problems of these communities were analysed and compared with the features provided by the OAI Protocol for Metadata Harvesting. The tutorial ‘OAI and OAI-PMH for Beginners’ and other presentations can be found at http://www.oaforum.org/workshops/lisb_programme.php
by projects like the Web Services of the Open Archives Initiative (OAI).
Several members spoke of difficulties created by dynamic Web pages generated from ASP databases. Paul Miller, the UK Interoperability Focus for the University of Bath’s UKOLN (formerly the UK Office for Library Networking) project, said the UK’s Office of the e-Envoy[6], the agency in charge of Britain’s e-Government programme, drove it with a Web site called Govtalk[7]. ‘One big ASP database,’ he called it. It carried the Government’s interoperability standard framework with a URL of ‘a great big long string, something, something dot ASP’. ‘That is how this key piece of legislation is referred to, and next week it will be something else.’
Dr van Kersen thought the new Web Ontology Language, with the transposed acronym OWL, would help. Italian ontology researcher Nicola Guarino discussed the European Commission’s On-To-Knowledge project, its RDF tool Sesame[8] and the Ontology Inference Layer (OIL) language.
Dr Ross described the US Defense Department’s DARPA (Defense Advanced Research Projects Agency) Agent Markup Language (DAML) programme and its DAML+OIL variant. Mr Degenhart-Drenth highlighted the importance of protocols used in Web Services, such as SOAP (Simple Object Access Protocol) and UDDI (Universal Description, Discovery and Integration), and also pointed to a SPECTRUM XML standard for museums that he helped write. Then there was the moviemakers’ audio-visual search standard, MPEG-7, and the SMIL (Synchronized Multimedia Integration Language). No one mentioned SHOE (Simple HTML Ontology Extensions), which was surprising as the discussion became more and more alphabetic and upper case.
ONTOLOGY TUTORIAL IN 800 WORDS
Nicola Guarino brought the discussion back on track. The ontology expert delivered a fascinating, impromptu, 800-word dissertation on ontology genetics.
Ontologies, he said, started because it was realised that controlled vocabularies, which worked well enough for limited periods, needed something extra to make them really useful. They needed clarification of intended meaning. This could be achieved in much the same way as dictionaries did it, by reference to other, more basic terms. This was, he said, the key point.
‘Ontologies can work if the basic terms are really used in a principled way. There is a hidden assumption here that it is indeed possible to express the meaning in terms of a relatively small set of primitive terms.’
He explained further: ‘There are general terms that have a universal meaning. Take the term “part”, for instance, or “set”. Or take temporal relations: “before”. Suppose I have two different periods, the Renaissance and another period, and suppose I say that this period comes “before” the other one: do you exclude the case whereby the two periods overlap or not? This is just a matter of stipulation; this is a general term that is not domain specific. I can simply stipulate exactly whether the “before” relationship between the two intervals includes the case of overlapping or not. And I can do that by means of axioms. Once you clarify the meaning of the basic terms, topological relations, mereological[9] relations, dependence relations, these kinds of things, then you have the basic vocabulary that helps you to introduce more domain-related things. And this is what people are doing in the area of what are called “Foundational Ontologies”, and this is what I am doing. I believe this is the only way to solve the problem of semantic interoperability. So, not just controlled vocabularies, but vocabularies that are formally defined in minimal terms.’
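The point about stipulating ‘before’ can be made concrete with a small sketch. It defines two alternative readings of the relation between periods, one that excludes overlap and one that admits it, so that the choice between them is explicit rather than hidden in a word. The class and function names, and the year ranges used for the two periods, are assumptions invented for the illustration; they are not taken from the Forum discussion or from any foundational ontology.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Period:
    """A named time interval given by its first and last year."""
    name: str
    start: int
    end: int


def before_strict(a, b):
    """Stipulation 1: 'a before b' holds only if a ends before b starts (no overlap)."""
    return a.end < b.start


def before_allowing_overlap(a, b):
    """Stipulation 2: 'a before b' holds if a starts before b starts (overlap allowed)."""
    return a.start < b.start


# Illustrative year ranges only; they are assumptions, not historical claims.
earlier = Period("Period A", 1300, 1600)
later = Period("Period B", 1550, 1700)

print(before_strict(earlier, later))
print(before_allowing_overlap(earlier, later))
```

Two catalogues that both record ‘A before B’ can therefore disagree about what follows from it, unless the axiom they rely on is stated.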
Now it was clear, but would it be available to heritage institutions? Dr Ross wanted to know if fundamental ontologies of use to the heritage sector already existed. What would they be? Before any
On-To-Knowledge
On-To-Knowledge is an IST RTD project that was completed in June 2002. The project developed tools and methods for supporting knowledge management in large and distributed organisations. The technical backbone of On-To-Knowledge was the use of ontologies for the various tasks of information integration and mediation. For the project’s many results, see their tools repository, project deliverables and publications at http://www.ontoknowledge.org/pubs.html
See also the On-To-Knowledge book ‘Towards the Semantic Web: Ontology-driven Knowledge Management’, J. Davies, D. Fensel, F. van Harmelen (eds.), John Wiley, December 2002.
[6] Office of the e-Envoy: http://www.e-envoy.gov.uk
[7] Govtalk: http://www.govtalk.gov.uk
[8] Sesame environment: http://www.ontoknowledge.org/tools/factsheetSesame.html
[9] Mereology, n.: The formal study of the logical properties of the relation of part and whole. Collins English Dictionary, Third edition, Glasgow 1991.
development could begin, basic classifications and methodologies would be required to form a foundation for the work.
Italian online publisher Marco Meli, a manager of the EU’s MESMUSES (Metaphors for Science Museums) project, insisted: ‘You need a clear definition in this particular domain. What are the key words, the terms?’
Mr Guarino answered: ‘The key concepts and the key relations.’
THE ELUSIVE SEMANTIC GRAIL
By now, the Darmstadt 13 were beginning to realise they were not having much luck with their search for the Semantic Grail. Their discussion became a little tetchy.
Someone talked about ‘meta-ontology’ and another growled: ‘The term “meta” is abused.’
‘Outdated technical optimism’ was mentioned.
‘What is your alternative?’ someone else wanted to know.
‘I don’t have one.’
‘Is this the way forward?’
MESMUSES - Metaphors for Science Museums
MESMUSES is an IST RTD project that will run until July 2003. It aims at designing a general method and supporting tools to produce knowledge maps for use in self-learning environments of science museums. In the project, a knowledge map is defined as a set of related concepts and facts that is offered to learners, with some guidance or suggestions on possible itineraries that they may follow to explore the knowledge space.
The method and tools developed in MESMUSES are being tested and validated by two large science museums, the Cité des Sciences et de l’Industrie in Paris and the Istituto e Museo di Storia della Scienza in Florence, which provide access to their digital catalogues. Both museums are developing knowledge maps and itineraries on different themes in Biology (Genome) and Physics (Galileo and the laws of motion).
Project Web site: http://cweb.inria.fr/Projects/Mesmuses
See also M. Meli: Knowledge Management: a new challenge for science museums. In: Cultivate Interactive, Issue 9, 7 February 2003, http://www.cultivate-int.org/issue9/mesmuses
‘What part of the problem would that solve?’
‘We will know when we have tried that out.’
‘Here we are approaching a scary field.’
Civility and peace were restored as Frank Nack, the CWI Netherlands scientist, introduced the thought: ‘There are ontologies for art and they are very old and well crafted. There are very clear rules about why they did what they did because they have worked on them for a thousand years. What you could suggest is that we strip down to the basics for one field, say art, and apply it to all the other fields we have in cultural heritage: architecture, film, whatever, all working with very different substances.’
Seamus Ross added: ‘So we need one fundamental ontology on which we can build all the others.’
THE LUCK CHANGES
The luck of the 13 was beginning to change. The Athenian heritage informatics expert, Dr Dallas, described work among his company clients on developing ‘an upper ontology’. All the issues the Forum had been discussing, what to do about time, basic concepts, process, agents and so on, were being examined. They were beginning to develop ‘something very much like a thesaurus’ with term expansion that created sub-categories of relationships. He called it ‘generic layering’, a process that could identify the ‘generic grammar’ of relationships within a specific domain - art history, for example.
‘This is useful,’ he said. ‘This way we can create Web systems that present an association of content for users that is meaningful to them. Let’s say a “cultural” meaning.’
Nicola Guarino went further. He believed that the International Council of Museums’ Committee for Documentation (CIDOC) Conceptual Reference Model (CRM) was the ‘best starting point’ for the heritage community. The CIDOC CRM is the result of 10 years’ work by a standards working group. The model is under review for adoption as an International Standards Organisation (ISO) publication.
Mr Guarino was enthusiastic: ‘I am not biased on this. I am a reviewer on the CHIOS project that supports this proposal for an ISO standard, and I am amazed by the fact that this standard is more or less principled. On the one hand, the authors really strive to get these principled things and, on the other hand, they have extensively accounted for existing practice. It is the result of a large community of work. It is not perfect, but it really is a starting point for this community.’
He affirmed Dr Ross’s delighted question: ‘So this is an ontology we can borrow?’
And the Italian expert had more to add: ‘The question before was “how can we be sure that the principled things can really solve our problems?” I do not have a crisp answer, but I do have some evidence that even a tiny result on the foundational side has a high pay-off. So you do not need to solve all the foundational issues.’
The distinctions between an object and its role, or individual and classes of items, were delicate but, once understood, could lead to significant data improvement, he said, adding: ‘Take, for instance, the distinction between object and event. This is only one tiny distinction, but it is so fundamental that once you understand it you can save time in developing your own application ontology. Tiny conceptual progress does have a high pay-off. This is why I believe it is useful.’
There were still one or two doubts, but CULTOS project co-ordinator Wernher Behrendt tidied up with a daring stance: ‘Let me be a heretic for a second. How many of us have an operating system other than Windows? What I am saying is that standardisation often helps. Even if it is not the best standard, it does help get people working together. It is perfectly fair to define a standard for the world now. There will be a lot of discussion, but it will wind down to a few constructs. It is a better method of getting an ontology accepted than having ontologies mushrooming all over the place that must then be integrated.’
The Darmstadt 13 were pleased. They had a model. They had ‘Web Services’ stepping stones to work across. They weren’t going to fall into the trap of insisting just yet that the Semantic Web was important for the heritage sector, but they wanted an education process for unconvinced curators.
They needed someone to make the first ontology move. Seamus Ross suggested asking the J. Paul Getty Trust (http://www.getty.edu). They needed automated tools for testing ontologies. They did not want to be delivered into the entertainment industry, but saw benefit in what University of Florence Associate Professor Franco Niccolucci called ‘cultural entertainment’.
So where would the experts put their money, asked Mr Behrendt: ‘On there not being any benefits in the Semantic Web for the cultural heritage sector, or there being some benefits in building such things, whatever they may be?’
Amsterdamer Frank Nack was in no doubt: ‘It is going to happen. It will probably look very different from how we imagine it right now, but it is going to happen.’ His countryman, ADLIB chief Bert Degenhart-Drenth, thought so too: ‘We have put our money there. All our applications work with XML.’ And Dr van Kersen agreed: ‘I would put my money on the Semantic Web.’
That seemed to make it game, set and match.
CIDOC CRM: ‘The Semantic Glue’
The ‘CIDOC object-oriented Conceptual Reference Model’ (CIDOC CRM) was developed by the ICOM/CIDOC Documentation Standards Group. Since September 2000, the CIDOC CRM has been under development as an ISO standard.
‘The CIDOC CRM is intended to promote a shared understanding of cultural heritage information by providing a common and extensible semantic framework to which any cultural heritage information can be mapped. It is intended to be a common language for domain experts and implementers to formulate requirements for information systems and to serve as a guide for good practice in conceptual modelling. In this way, it can provide the semantic glue needed to mediate between different sources of cultural heritage information, such as that published by museums, libraries and archives.’
http://cidoc.ics.forth.gr/what_is_crm.html
CHIOS - Cultural Heritage Interchange Ontology Standardization project
Since June 2001, the work of the CIDOC CRM Special Interest Group has been supported by CHIOS, a two-year project which receives funding from the Fifth Framework IST Programme. The CHIOS consortium forms an integral part of the CIDOC CRM Special Interest Group which, by organising shared meetings, represents the interests and requirements of the cultural heritage community to the ISO Working Group (ISO/TC46/SC4/WG9).
http://cidoc.ics.forth.gr/chios_iso.html
Annotation & Authoring
‘How to annotate and have fun’, http://annotation.semanticweb.org/faq/faq2, and other papers. See also http://annotation.semanticweb.org/tools, Institute for Applied Informatics and Formal Description Methods, University of Karlsruhe, Germany.
The Annotea Project, a ‘Live Early Adoption and Demonstration (LEAD)’ project of the World Wide Web Consortium (W3C): a collaboration environment with shared annotations. www.w3.org/2001/Annotea
CIDOC CRM
The CIDOC Conceptual Reference Model was an important reference point in the Forum discussion (see also the information box on the CIDOC CRM in the Summary). CIDOC, the Comité International pour la Documentation, is part of the International Council of Museums (ICOM). Its Web site can be found at http://cidoc.ics.forth.gr
A report on CIDOC’s work on the CRM is provided in ‘The CIDOC Conceptual Reference Model: A Standard for Communicating Cultural Contents’ by Nick Crofts, Martin Doerr and Tony Gill, in Cultivate Interactive, Issue 9, February 2003, http://www.cultivate-int.org/issue9/chios
See also their tutorials at http://cidoc.ics.forth.gr/tutorials.html
DAML - DARPA Agent Markup Language
The Defense Advanced Research Projects Agency (DARPA) is the central research and development organisation for the US Department of Defense. Its DAML programme is developing a language and tools to facilitate Semantic Web concepts. http://www.daml.org
‘Why Use DAML?’, Adam Pease, white paper, Teknowledge, 10 April 2002, http://www.daml.org/2002/04/why.html
DAML+OIL
DAML+OIL is a product of the DARPA Joint United States/European Union ad hoc Agent Markup Language Committee. The committee created a language with the best features of SHOE, DAML, OIL and several other markup approaches. It is a Web ontology language (latest release March 2001) expected to provide a basis for future Web standards for ontologies. See http://www.daml.org/2001/03/daml+oil-index.html
Genesis - The Creation Stars and Fishes
SEMANTIC WEB TERMS AND READING LIST A-X
Compiled by Guntram Geser and Michael Steemson
In the Summary we have provided information boxes on projects, but not on the many Semantic Web standards, technologies etc. mentioned in the Forum discussion. The projects included are only a small fraction of many ongoing activities and are related mainly to the cultural and scientific heritage community. Links to many important Semantic Web development projects can be found at their community portal, http://www.semanticweb.org
The following guide points to resources and readings on terms mentioned in the Forum Summary. It is not intended to provide a comprehensive list of Semantic Web materials. Rather, it represents different entry points and levels to this topic.
OIL - Ontology Inference Layer
OIL is a proposal for a Web-based representation and inference layer for ontologies. It uses a layered approach to defining a standard ontology language. Each layer adds functionality and complexity to the previous layer. This is done such that machines that can only process a lower layer can still partially understand high-level ontologies. See http://www.ontoknowledge.org/oil
A white paper on OIL functions: ‘An informal description of Standard OIL and Instance OIL’, by the On-To-Knowledge group led by the Department of Computer Science, University of Manchester, UK, 28 November 2000, www.ontoknowledge.org/oil/downl/oil-whitepaper.pdf
Ontologies
Ontology research: Laboratory for Applied Ontology (LOA), Institute of Cognitive Sciences and Technology (ISTC), http://www.ladseb.pd.cnr.it/infor/ontology/ontology.html
Dieter Fensel: Ontologies: A Silver Bullet for Knowledge Management and Electronic Commerce. New York: Springer, 2001.
‘Ontology Infrastructure for the Semantic Web’, includes detailed subject bibliography. WonderWeb Project, Department of Computer Science, Victoria University of Manchester, UK, http://wonderweb.semanticweb.org/deliverables/documents/D15.pdf
Standard Upper Ontology (SUO): An upper ontology for data interoperability, information search and retrieval, automated inferencing and natural language processing. IEEE Standard Upper Ontology (SUO) Working Group, http://suo.ieee.org
OWL - Web Ontology Language
The Web Ontology Language is a semantic markup language for publishing and sharing ontologies on the World Wide Web. OWL is developed as a vocabulary extension of the Resource Description Framework (RDF) and is derived from the DAML+OIL Web Ontology Language. For the development of this language, see the documents of the Web Ontology (WebOnt) Working Group, http://www.w3.org/2001/sw/WebOnt
See also OWL Web Ontology Language Reference (W3C Working Draft, 31 March 2003) at http://www.w3.org/TR/owl-ref and OWL Web Ontology Language Guide (W3C Working Draft, 31 March 2003), http://www.w3.org/TR/owl-guide
For an understanding of the goals, requirements and usage scenarios for a Web ontology language, see ‘Web Ontology Language (OWL) Use Cases and Requirements’ (W3C Working Draft, 31 March 2003), http://www.w3.org/TR/webont-req
RDF - Resource Description Framework
See Cultural Heritage Semantic Web Example & Primer, pp. 32-34.
Semantic Web
‘The Semantic Web’, Tim Berners-Lee with James Hendler and Ora Lassila, Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html
‘Enhanced Science and the Semantic Web’, J.A. Hendler, in Science magazine, Volume 299, Number 5606, 24 January 2003, pp. 520-521.
‘Peer-to-Peer: The Infrastructure for the Semantic Web’, Stanford University. The Semantic Web as the next evolutionary step of the Internet. http://p2p.semanticweb.org
The Semantic Web Community Portal: SemanticWeb.org, currently operated by three research groups: the Onto-Agents and Scalable Knowledge Composition (SKC) Research Group at Stanford University, the Ontobroker-Group at the University of Karlsruhe, Germany, and the Protégé Research Group at Stanford University. http://www.semanticweb.org
The W3C Semantic Web Activity Statement explains the consortium’s plans in the areas of enabling standards (driven by the RDF Core and Web Ontology Working Groups), education and outreach (RDF Interest Group), as well as co-ordination and advanced development. http://www.w3.org/2001/sw/Activity
See also Kim Veltman’s warning of what he sees to be too narrow a definition of the Semantic Web, one that will not allow the historical dimension, the richness of cultural expression, the unique and the diversity of interpretations to be adequately dealt with. Cf. K. Veltman: Challenges for a Semantic Web (July 2002), http://www.cultivate-int.org/issue7/semanticweb
SHOE - Simple HTML Ontology Extensions
SHOE was one of the first ontology-based markup languages developed for use on the World Wide Web. It is a small extension to HTML that allows Web page authors to annotate their Web documents with machine-readable knowledge. See http://www.cs.umd.edu/projects/plus/SHOE
SMIL - Multimedia on the Web
The Synchronized Multimedia Integration Language (SMIL, pronounced ‘smile’) enables authoring of interactive audiovisual presentations. SMIL is typically used for ‘rich media’/multimedia presentations which integrate streaming audio and video with images, text or any other media type. SMIL is an HTML-like language and may be written using a simple text editor. W3C Synchronized Multimedia: http://www.w3.org/AudioVideo
SPECTRUM-XML DTD
SPECTRUM (Standard Procedures for Collections Recording Used in Museums) was created by the mda (http://www.mda.org.uk). It is a guide to good practice for museum documentation that describes procedures for documenting objects and the processes they undergo, as well as the necessary information that needs to be recorded to support the procedures. For SPECTRUM, an XML Document Type Definition has been produced, which serves as a system-neutral interchange format for museum data.
‘SPECTRUM: The UK Museum Documentation Standard’ is available in its second edition, see http://www.mda.org.uk/spectrum.htm
For a description of the creation, structure and deployment of the SPECTRUM-XML DTD, see Bert Degenhart-Drenth: Building on the mda SPECTRUM-XML DTD for Collections Management Data Interchange. Museums and the Web Conference 2001. http://www.archimuse.com/mw2001/papers/degenhart/degenhart.html
Web Services
Although the concept of Web Services is not a new one, the definition of Web Services standards allows the wider use of these services. Currently, three main protocols are used in the context of Web Services: UDDI (Universal Description, Discovery and Integration), a registry system to find resources and Web Services; WSDL (Web Services Description Language), an interface description language; and SOAP (Simple Object Access Protocol), the communication protocol for Web Services. All three protocols are based on XML.
For a description of the many ways in which XML can enhance Web Services, see http://www.w3.org/2002/ws/Activity
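To give a feel for what ‘based on XML’ means for SOAP, the following sketch assembles a minimal SOAP 1.1 envelope using only the Python standard library. The envelope namespace is the one defined for SOAP 1.1; everything inside the body (the service namespace, the SearchObjects operation and its keyword element) is hypothetical and stands in for whatever a museum's own WSDL description would define.

```python
import xml.etree.ElementTree as ET

# Namespace of the SOAP 1.1 envelope.
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
# Hypothetical service namespace and operation, invented for this example.
SERVICE_NS = "http://museum.example.org/search"


def build_envelope(keyword):
    """Return a SOAP 1.1 envelope (as a string) wrapping a made-up search request."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element("{%s}Envelope" % SOAP_NS)
    body = ET.SubElement(envelope, "{%s}Body" % SOAP_NS)
    # The payload element names below are assumptions, not a real museum API.
    request = ET.SubElement(body, "{%s}SearchObjects" % SERVICE_NS)
    ET.SubElement(request, "{%s}keyword" % SERVICE_NS).text = keyword
    return ET.tostring(envelope, encoding="unicode")


print(build_envelope("psalter"))
```

In a real deployment the client would read the operation names from the service's WSDL file and might locate the service through a UDDI registry, rather than hard-coding them as here.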
World Wide Web
Looking back as well as into the future: ‘Weaving the Web: The Original Design and Ultimate Destiny of the World Wide Web by Its Inventor’, Tim Berners-Lee with Mark Fischetti, Harper San Francisco, 1999.
XML - eXtensible Markup Language
See Cultural Heritage Semantic Web Example & Primer, pp. 27-30.
For the work done at W3C within the XML activity, see XML Working Groups, http://www.w3.org/XML; the XML Specifications can be found at http://www.w3.org/XML/Core
‘The bane of my existence is doing things that I know the computer could do for me.’ - Dan Connolly, The XML Revolution, October 1998, http://www.nature.com/nature/webmatters/xml
‘The Semantic Web as it is advocated by people like Tim Berners-Lee and James Hendler does not take enough advantage of the experience built up in knowledge engineering and conceptual modelling. There is this anarchistic idea of the Web as a place where everyone can do his or her own thing. I have no problem with that; a lot of people are able to find what they want on the Web. But if you want real interoperability, with search engines that can grasp the intended meaning of information, that approach falls short. To create a real Semantic Web we have to develop and use well-founded generic ontologies, based on linguistics and logics.’
Nicola Guarino has clear views on the Semantic Web and its development. He is a senior researcher at the Institute for Cognitive Sciences and Technologies in Italy, where he leads the Laboratory for Applied Ontology. Since 1991 he has played an active role in the Artificial Intelligence community in promoting the interdisciplinary study of ontological foundations of knowledge engineering and conceptual modelling. Guarino: ‘In our Laboratory the focus is on content and not so much on representation. The use of ontologies is unavoidable when referring to content. People do it implicitly all the time when they are communicating and trying to understand each other. If we want machines to understand each other, in other words real interoperability, we need to make these ontologies explicit in an unambiguous way.’
An ontology is a hierarchical description of therelations between concepts in a certain domain plusan unambiguous description of the conceptsthemselvesAs they are created for a certain domainontologies often fail to be interoperable because ofthe ambiguity that results from the use of the sameterms for different concepts (and vice versa) betweendifferent domainsThe term lsquonetrsquo for instance hasquite a different meaning for Web designers andfishermenThat is why there is a need for well-founded generic ontologiesAn example of a generic
ontology is the term lsquopartrsquo which can have differentmeanings both within a domain and betweendomains For instance the violist plays a part in theorchestra His finger is part of him Can his finger bepart of the orchestra According to Guarino this is agenuine ontological problem that can only be solvedby giving an unambiguous meaning to the term lsquopartrsquo
Another example cited by Guarino is the termlsquoinrsquoWhat exactly are you describing when you saythe spoon is in the cup Does it mean that the spoonis totally embedded in the cup or is it only partly inthe cup Guarino lsquoThese examples seem trivial butif you want real interoperability between differentknowledge domains you will have to prevent theproblems that come with the ambiguity of day-to-day languagersquo
In this respect Guarino thinks it is a drawback thatcomputer science curricula scarcely ever contain anintroduction to ontological foundations of conceptualmodelling lsquoStudents learn all about Java HTML andC++ and name all the other languages and they alsolearn how to use these But when they graduate theyhardly know a thing about formal ontology I reallythink people should know more about the work onontology that has been done in philosophy It iscertainly not much harder to acquire than saystudying differential equations or learning howto use Javarsquo
It seems as if it is an enormous job to developwell-founded generic ontologies but it is not asenormous a task as it appears Guarino lsquoI would saythat a few dozen would get you on the way nicelyBut you have to take the fundamental routeAt themoment development of the Semantic Web is drivenby the need for short-term results Henceinteroperability is realised by putting the right tagson the informationThat is not what I call semanticsthat is syntax XML and RDF are very useful for thisbut they fall short when you want to create a realSemantic WebrsquoLaboratory of Applied Ontology httpontologyiprmcnrit
DigiCULT 25
SEMANTIC WEB SHOULD BE BASED
ON WELL-FOUNDED ONTOLOGIESAN INTERVIEW WITH NICOLA GUARINOLABORATORY OF APPLIED ONTOLOGY ITALY
By Joost van Kasteren
A CULTURAL HERITAGE SEMANTIC WEB EXAMPLE & PRIMER
By Guntram Geser
The Semantic Web, according to a statement of its well-known advocate Tim Berners-Lee, is a vision of ‘a distributed machine which should function so as to perform socially useful tasks’.[1] This machine should allow intelligent software agents to understand semantic relationships between Web resources in order to seek relevant information and perform transactions for humans.
Contrasted with the existing human-readable Web, the Semantic Web is envisaged as a Web of machine-readable data that will be based on ‘languages for expressing information in a machine processable form’.[2] Key to an understanding of the Semantic Web, therefore, is how these languages function: how information is expressed so that computers can automatically process Web sources and assist in making the Web more useful for humans. The aim of this chapter is to provide an overview of the Semantic Web concept by describing its general architecture, i.e. the interplay of its languages.
The chapter has two interrelated parts. Part 1 describes a Finnish project that strives to build the foundations for the “Finnish Museums on the Semantic Web” (FMS), a future semantic museum portal. This part consists of the information boxes on the following pages, which briefly describe the necessary elements and steps in the set-up of the FMS system. It is recommended to start by reading this description (see also graphic 3 on page 36, which provides an overview of the set-up of the FMS system). It should be helpful in gaining a general understanding of how semantic interoperability of, and new ways of interacting with, semantically marked-up cultural heritage information can be realised.
Part 2, the texts below the information boxes, is a primer that explains terms used in part 1 which represent core elements of the Semantic Web architecture, and provides illustrative examples. The explanations are not intended to give in-depth definitions of these elements; such definitions are provided in the relevant W3C specifications. The examples have been kept as simple as possible, but build on each other. In this way we will develop a (fictitious) Website, http://www.m-i.org, that provides semantically enhanced access to such marvellous medieval images as the ones we have used to illustrate this Thematic Issue.
How to Make Collection Metadata of Museums Semantically Interoperable on the Web – The “Finnish Museums on the Semantic Web” (FMS)
The Semantic Web concept is visionary, and there are dedicated people, also in the heritage sector, who are trying to make it a reality. In our example, a group of researchers and technology developers who work at the University of Helsinki and the Helsinki Institute for Information Technology are translating the Semantic Web vision into a future semantic museum portal.
The group’s two-year project will run until spring 2004 and is being carried out in co-operation with, and with funding from, major organisations including Espoo City Museum, Helsinki University Museum, National Board of Antiquities, Nokia, TietoEnator and the National Technology Agency (TEKES).
[1] Tim Berners-Lee: Interpretation and Semantics on the Semantic Web (1998), http://www.w3.org/DesignIssues/Interpretation.html
[2] Tim Berners-Lee: Semantic Web Road Map (1998), http://www.w3.org/DesignIssues/Semantic.html
[3] Robert DuCharme, http://lists.xml.org/archives/xml-dev/200211/msg00190.html
The major goals of the project are to make collection metadata, which stem from heterogeneous databases, semantically interoperable on the Web, and to provide facilities for semantic browsing and searching in the combined knowledge base of the participating museums.
The project’s vision is called “Finnish Museums on the Semantic Web” (FMS), and its architecture allows for all Finnish museums to join in. However, in an approach of starting small but ambitious, the project is at present using the collection databases of two museums, the Espoo City Museum and the National Museum of Finland. Furthermore, the implementation is currently restricted to only one part of the collections: textiles.
In order to reach the FMS system’s goal of making the museums’ metadata semantically interoperable on the Web, the data must be harmonised on the syntactic and semantic level. For this harmonisation, the eXtensible Markup Language (XML) and the Resource Description Framework (RDF) are being used, of which RDF is the key language for achieving semantic interoperability of the heterogeneous sets of metadata.
RDF and Metadata - A Natural Fit
An observer of the diffusion of the Resource Description Framework (RDF) into various domains, Robert DuCharme, has commented: ‘I still find it a little ironic that while RDF has gotten so much publicity as a technology for warm and fuzzy AI (Artificial Intelligence) pie-in-the-sky technology, it’s gotten most of its traction in the mundane world of metadata.’[3]
Yet, given the importance of metadata for the Semantic Web vision in general, it does not come as a surprise that metadata of key information communities belong to the first of RDF’s intended uses. RDF seems to gain momentum in particular among the library and other communities that use Dublin Core.
The current W3C RDF Primer (Working Draft, 23 January 2003), edited by Frank Manola and Eric Miller, labels RDF as ‘an ideal representation for Dublin Core information’ and describes Dublin Core as one of their ‘RDF in the field’ examples (cf. http://www.w3.org/TR/rdf-primer/).
At the Dublin Core Metadata Initiative (DCMI), ‘Expressing Simple Dublin Core in RDF/XML’ was announced as a DCMI Recommendation in October 2002, ‘the first in a series of recommendations for encoding Dublin Core metadata using mainstream Web technologies’, i.e. XML/RDF/XHTML. ‘Expressing Qualified Dublin Core in RDF/XML’ is currently a Proposed Recommendation. Cf. http://dublincore.org/groups/architecture/
Syntactic Transformation 1: Creating the XML Documents
In the FMS system, the eXtensible Markup Language (XML) is used as the data transfer format. This transfer format enables the system to make use of the data originally stored in the museums’ heterogeneous collection databases. Therefore, each museum participating in the FMS initiative provides the relevant collection data as an XML document repository.
In a process of syntactic harmonisation, the data from a museum’s collection database are retrieved and transformed to an XML format conforming to the XML Schema of the FMS initiative.
The data to be published are read from the database through a ‘view’, which helps create the XML format. The view is a queryable interface, a virtual table that results from an SQL query which may join multiple tables of the database. Through the view, the data are queried so that the rows of the tables are grouped by collection items. For each item, the set of rows is combined into a single XML document.
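The view-to-XML step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the FMS implementation: the two tables (items, creators), the view name item_view and the function names are hypothetical, and the sample row is the fictitious m-i.org record used throughout this primer.

```python
import sqlite3
import xml.etree.ElementTree as ET
from itertools import groupby

MI_NS = "http://www.m-i.org/images"  # namespace of the primer's fictitious site
COLUMNS = ["type", "subject", "iconclass", "creator", "manuscript", "place", "year"]


def build_demo_db():
    """Create a tiny in-memory database with a view joining two tables (hypothetical layout)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE items (item_id TEXT, type TEXT, subject TEXT, iconclass TEXT,
                            manuscript TEXT, place TEXT, year TEXT);
        CREATE TABLE creators (item_id TEXT, creator TEXT);
        INSERT INTO items VALUES ('image5kb78d38i', 'Column Miniature',
            'Eve emerges from Adam''s body', '71A3421', 'Historic Bible',
            'Utrecht', 'circa 1430');
        INSERT INTO creators VALUES ('image5kb78d38i', 'Alexander Master');
        CREATE VIEW item_view AS
            SELECT i.item_id, i.type, i.subject, i.iconclass, c.creator,
                   i.manuscript, i.place, i.year
            FROM items i JOIN creators c ON c.item_id = i.item_id;
    """)
    return conn


def rows_to_documents(conn):
    """Group the view's rows by item and combine each group into one XML document."""
    ET.register_namespace("mi", MI_NS)
    query = "SELECT item_id, " + ", ".join(COLUMNS) + " FROM item_view ORDER BY item_id"
    documents = {}
    for item_id, rows in groupby(conn.execute(query), key=lambda row: row[0]):
        rows = list(rows)
        root = ET.Element(f"{{{MI_NS}}}image", {"image_id": item_id})
        for position, column in enumerate(COLUMNS, start=1):
            seen = []
            for row in rows:
                value = row[position]
                # A join can repeat a value on several rows; emit each distinct value once.
                if value is not None and value not in seen:
                    seen.append(value)
                    ET.SubElement(root, f"{{{MI_NS}}}{column}").text = value
        documents[item_id] = ET.tostring(root, encoding="unicode")
    return documents


documents = rows_to_documents(build_demo_db())
print(documents["image5kb78d38i"])
```

An item with two creators would produce two rows in the view, and the sketch would then emit two creator elements inside the same document.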
XML
XML is a markup language for describing data. It is a language created to allow anyone to design the structure of their own documents. An XML document contains text that consists of markup in the form of tags and plain text between them, the latter being just pure information (for example <creator>Alexander Master</creator>). XML tags are not predefined; everyone can define his or her own tags.
FMS Documents
A short presentation is provided in Eero Hyvönen et al.: Cultural Semantic Interoperability on the Web. Case Finnish Museums Online, http://iswc2002.semanticweb.org/posters/hyvonen_a4.pdf; for detailed descriptions see Vilho Raatikka, Eero Hyvönen: Ontology-based Semantic Metadata Validation, and Hyvönen, Eero et al.: Semantic Interoperability on the Web. Case Finnish Museums Online. Both texts can be found in: Towards the Semantic Web and Web Services. Proceedings of the XML Finland 2002 Conference, http://www.cs.helsinki.fi/u/eahyvone/xmlfinland2002/ProceedingsXML2002-final.pdf
ITEM_VIEW (columns of the view): item_id, type, subject, iconclass, creator, manuscript, place, year
XML shares the syntax and bracketed tags of the well-known HyperText Markup Language (HTML), but XML serves a different goal. While HTML is used to define the layout of pages on the WWW, XML is used to define the content of documents, for example to specify that an area of text is the name of a creator.
XML allows for creating markup (e.g. <creator>) that seems to carry some semantics. However, for a computer a tag like <creator> carries as much semantics as a tag like <H1>. A computer simply does not know what a creator is and how the concept creator is related to other concepts (e.g. manuscript). For an XML processor, <H1> and <creator> or <manuscript> are all equally (and totally) meaningless. XML is all about describing data; on its own it does not do anything. There needs to be a processing program that uses the markup to interpret the various elements.
The graphic below illustrates the database rows to XML process as described in the info box. It provides a very simple example of an XML document that describes some data for one of the medieval column miniatures from the Koninklijke Bibliotheek, The Hague, which we were permitted to use for illustrating this Thematic Issue. It includes the Iconclass classification for this image: 71A3421 Eve emerges from Adam’s body (for the hierarchical path of this classification see the section on ontologies).
Short explanations for the XML document (image5kb78d38i.xml) shown in the graphic below:
A well-formed XML document is one that conforms to the XML syntax rules, of which we would like to highlight the following:
(1) The document should begin with the XML declaration, which defines the XML version and the character encoding used in the document. In the example below we use <?xml version="1.0" encoding="ISO-8859-1"?>, i.e. the document conforms to the 1.0 specification of XML and uses the ISO-8859-1 (Latin-1/West European) character set.
(2a, 10) The XML document must contain a single tag pair to define a root element, in our example <mi:image> … </mi:image>.
Database rows grouped by item:
image5kb78d38i, ‘Column Miniature’, ‘Eve emerges from Adam’s body’, ‘71A3421’, ‘Alexander Master’, ‘Historic Bible’, ‘Utrecht’, ‘circa 1430’
Rows-to-XML process: the item data from the database rows are combined into the XML document image5kb78d38i.xml:
(1) <?xml version="1.0" encoding="ISO-8859-1"?>
(2) <mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">
(3) <mi:type>Column Miniature</mi:type>
(4) <mi:subject>Eve emerges from Adam’s body</mi:subject>
(5) <mi:iconclass>71A3421</mi:iconclass>
(6) <mi:creator>Alexander Master</mi:creator>
(7) <mi:manuscript>Historic Bible</mi:manuscript>
(8) <mi:place>Utrecht</mi:place>
(9) <mi:year>circa 1430</mi:year>
(10) </mi:image>
Graphic 1: Database rows to XML process
(2b) Namespace: Since element names in XML are not fixed, name conflicts can occur when different documents use the same names to describe different types of elements. To prevent such conflicts, a unique namespace should be defined using a Uniform Resource Identifier (URI). An XML namespace is a collection of names that are used as element types and attribute names (cf. http://www.w3.org/TR/REC-xml-names/). The namespace declaration in the start tag of our root element is xmlns:mi="http://www.m-i.org/images".
The namespace prefix mi (for medieval images) functions as a placeholder for the namespace name. It needs to show up in all element tags (e.g. <mi:type> … </mi:type>).
(2c) image_id="image5kb78d38i": This is the id attribute, which contains the unique identifier of the data source record.
(3-9) All other elements must be within the root element and can themselves have sub-elements (child elements), which must be properly nested within their parent element. Our elements do not have sub-elements.
Other syntax rules are, for example: all start tags must match their end tags; because XML tags are case sensitive (i.e. the tag <mi:Creator> is different from the tag <mi:creator>), they must also be written with the same case; all elements must have a closing tag; all attribute values must be within quotation marks (e.g. "image5kb78d38i").
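The well-formedness rules listed above can be tried out with any XML parser. The following minimal Python sketch uses the standard library parser; the helper name is_well_formed and the shortened sample documents are assumptions made for illustration. It also shows that a parser expands the mi prefix to the full namespace name.

```python
import xml.etree.ElementTree as ET

# A shortened version of the primer's example document (bytes, so the declared
# ISO-8859-1 encoding is honoured by the parser).
GOOD = b"""<?xml version="1.0" encoding="ISO-8859-1"?>
<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">
  <mi:creator>Alexander Master</mi:creator>
</mi:image>"""

# Start tag and end tag differ in case, so they do not match.
BAD_CASE = (b'<mi:image xmlns:mi="http://www.m-i.org/images">'
            b'<mi:Creator>Alexander Master</mi:creator></mi:image>')

# The attribute value is not within quotation marks.
BAD_QUOTES = (b'<mi:image xmlns:mi="http://www.m-i.org/images" '
              b'image_id=image5kb78d38i></mi:image>')


def is_well_formed(document: bytes) -> bool:
    """Return True if the parser accepts the document, False on a syntax error."""
    try:
        ET.fromstring(document)
    except ET.ParseError:
        return False
    return True


for name, doc in [("GOOD", GOOD), ("BAD_CASE", BAD_CASE), ("BAD_QUOTES", BAD_QUOTES)]:
    print(name, is_well_formed(doc))

# The parser reports the element name together with its namespace name.
print(ET.fromstring(GOOD).tag)
```

Note that such a check only covers syntax; it says nothing about whether the elements are the ones a schema expects.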
Syntactic Transformation 2: The XML Schema
In order to allow for syntactic harmonisation, the XML documents of the museums should conform to the XML Schema of the FMS initiative. Therefore, the museums use the initiative’s XML Schema when they create their XML documents, validating them against the Schema. If the documents are valid, the process can continue to the semantic level.
XML Schema
The XML Schema defines the building blocks of an XML document, including:
| elements and attributes that can appear in a document,
| which elements are child elements, as well as their order and number,
| whether an element is empty or can include text,
| the data types for elements and attributes,
| as well as default and fixed values for elements and attributes.
XML with an XML Schema is designed to be self-descriptive. One of the greatest strengths of XML Schema is that it allows for data typing. The most common data types are xs:string, xs:decimal, xs:integer, xs:boolean, xs:date and xs:time. In the example below, which is the XML Schema for the XML document (image5kb78d38i.xml) shown in graphic 1, we only use the data type xs:string. This data type is used for values that contain character strings.
(1) <?xml version="1.0"?>
(2a) <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
(2b) targetNamespace="http://www.m-i.org/images"
(2c) xmlns="http://www.m-i.org/images"
(2d) elementFormDefault="qualified">
(3) <xs:element name="image">
(4) <xs:complexType>
(5) <xs:sequence>
(6) <xs:element name="type" type="xs:string"/>
(7) <xs:element name="subject" type="xs:string"/>
(8) <xs:element name="iconclass" type="xs:string"/>
(9) <xs:element name="creator" type="xs:string"/>
(10) <xs:element name="manuscript" type="xs:string"/>
(11) <xs:element name="place" type="xs:string"/>
(12) <xs:element name="year" type="xs:string"/>
(13) </xs:sequence>
(14) <xs:attribute name="image_id" type="xs:string" use="required"/>
(15) </xs:complexType>
(16) </xs:element>
(17) </xs:schema>
Short explanations
(1) The XML declaration, which states that the document conforms to the 1.0 specification of XML.
(2a) Determines that the elements and data types that are used to construct the schema come from the W3C’s XML Schema namespace. Consequently, each of the elements and data types in the schema has the prefix xs, which identifies them as belonging to the vocabulary of the XML Schema language rather than the vocabulary of our (fictitious) organisation M-i.org.
(2b) Indicates that the elements defined by this schema come from our http://www.m-i.org/images namespace.
(2c) Is our default namespace.
(2d) Demands that any elements used by an XML document which were declared in this schema must be namespace qualified.
(3, 16) The parent element for the data typing of the image descriptions we provide at http://www.m-i.org/images. It is (4) defined as an xs:complexType, i.e. it contains child elements (6-12), which are (5, 13) surrounded by an xs:sequence element that defines an ordered sequence of these elements.
(6-12) The child elements, which in our example are simple types because they do not contain other elements. They define various elements of our XML documents, e.g. “creator”, to be of the data type xs:string.
(14) Furthermore, the Schema determines that for the element xs:element name="image" there is a required attribute image_id of the data type xs:string (for example "image5kb78d38i").
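To make concrete what the schema above demands of a document, here is a small hand-written check in Python. It is deliberately not an XML Schema validator: it only re-states, by hand and under our own assumptions, three constraints from the example schema (root element, required image_id attribute, ordered sequence of simple child elements). The names check_image_document and make_document are hypothetical. In practice a schema-aware XML library would perform this validation directly against the schema file.

```python
import xml.etree.ElementTree as ET

NS = "http://www.m-i.org/images"
# Child elements in the order declared inside xs:sequence (lines 6-12 of the schema).
EXPECTED_CHILDREN = ["type", "subject", "iconclass", "creator",
                     "manuscript", "place", "year"]


def check_image_document(xml_bytes):
    """Return a list of problems; an empty list means all hand-written checks passed."""
    try:
        root = ET.fromstring(xml_bytes)
    except ET.ParseError as error:
        return [f"not well-formed: {error}"]
    problems = []
    if root.tag != f"{{{NS}}}image":
        problems.append(f"unexpected root element {root.tag}")
    if root.get("image_id") is None:
        problems.append("required attribute image_id is missing")
    found = [child.tag for child in root]
    expected = [f"{{{NS}}}{name}" for name in EXPECTED_CHILDREN]
    if found != expected:
        problems.append("child elements do not match the declared sequence")
    for child in root:
        # The children are simple types, so they may not contain elements themselves.
        if len(child) > 0:
            problems.append(f"{child.tag} must not contain child elements")
    return problems


def make_document(names, image_id="image5kb78d38i"):
    """Build a test document with the given child element names (placeholder text 'x')."""
    attribute = f' image_id="{image_id}"' if image_id else ""
    body = "".join(f"<mi:{name}>x</mi:{name}>" for name in names)
    return f'<mi:image xmlns:mi="{NS}"{attribute}>{body}</mi:image>'.encode("utf-8")


print(check_image_document(make_document(EXPECTED_CHILDREN)))
print(check_image_document(make_document(list(reversed(EXPECTED_CHILDREN)))))
print(check_image_document(make_document(EXPECTED_CHILDREN, image_id=None)))
```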
Major Benefits of XML
In the context of the Semantic Web, XML provides an interoperable syntactical foundation upon which solutions to the issues of representing relationships and meaning can be built. We also want to highlight the many benefits of XML that are adding to its rapid uptake in the first place, and might in the longer term be supportive in realising the Semantic Web vision on a broader scale.
XML is one of the most important standards developments in recent years. It is an international, universal, non-system-specific and non-application-specific data exchange standard. XML is international because it employs Unicode. This means that there is no restriction to the western alphabet, but Arabic, Chinese, Greek, Hebrew, Thai etc. can be easily integrated.
XML is non-system specific because it is an open standard set by the World Wide Web Consortium (W3C). As such, there is no owner of XML. All the major software suppliers support it; it can be used on any computing platform, from Windows and MacOS to Linux. This makes it easier for organisations to change systems or combine different systems.
XML is also non-application specific, i.e. it can be used in various applications such as data exchange, data harvesting, Web site management etc. XML is gaining ever-wider acceptance in many application domains, including in particular the cultural heritage community.
Bear in mind also that the major collection management software producers have implemented support for XML in their systems, enabling, for example, the integration of data from different collections and their combination over the Web.
XML allows for multi-channel publishing, i.e. with XML it is easy to produce different products or services from digital cultural heritage assets. Once the data are structured in XML, they can be displayed across a variety of media using an associated style sheet that contains the display information.
Finally, XML can be used to create new languages. For example, the Wireless Markup Language (WML), which is used to mark up Internet applications for handheld devices, is written in XML.
Ontological concepts
The goal of the FMS project is to make metadata of the museums’ textiles collections semantically interoperable on the Web. In order to achieve such interoperability, an ontology is being designed that describes the common (lower-level) ontological concepts in this domain of knowledge.[4]
Ontology
In the Semantic Web architecture, the semantic relationships are not embedded but explicitly represented by an ontology, or rather an interrelated set of ontologies. In fact, the wide array of information residing on the Web and the perceived need to make it more machine-processable have acted as a strong impetus for the development of ontology languages.
Yet what is an ontology? An abstract definition of an ontology is that it describes a formal, shared conceptualisation of a particular domain of interest, for example cultural heritage objects held in art museums. In particular, an ontology allows for constraining, expressing and analysing the intended meaning of the shared vocabulary of concepts and relations in a domain of knowledge.[5]
If these concepts and relations are formalised to a high degree, the domain has at hand a major building block for developing semantically aware information systems. With Semantic Web technologies, the domain ontology can be made available on the network, cross-referenced with upper-level and other domain ontologies,[6] and remote applications or intelligent software agents can refer to it when they interact to provide a certain information service.
[4] The available documents (in English) on the FMS initiative state that their ontology is being created using RDF Schema (RDFS). To develop a fully fledged ontology, advanced languages such as DAML+OIL or Web Ontology Language (OWL) would be required.
[5] For more elaborate and formal descriptions see Tom Gruber: What is an Ontology? (1995), http://www-ksl.stanford.edu/kst/what-is-an-ontology.html; Nicola Guarino: Ontology-Driven Conceptual Modelling, part 1-3 (2002), http://ontology.ip.rm.cnr.it/Tutorials
The degree of formalisation of concepts and their relations varies considerably between different domains of knowledge. At the lower end one finds lexicons and simple taxonomies (i.e. ordered classification systems in which terms are related hierarchically). At the middle level one might place thesauri, i.e. controlled vocabularies that are structured to show relationships between terms and concepts and, for example, allow for retrieving them from a database. At the high end of formalisation of knowledge there are axiomatised logic theories. Such theories include rules to ensure the well-formedness and logical validity of statements expressed in the language of the scientific discipline.
In the cultural heritage sector, a powerful IT-supported example of a hierarchical classification system is Iconclass. This supports the documentation of images, in particular art historical images, by providing a systematic collection of 28,000 ready-made definitions of objects, persons, events, situations and abstract ideas that can be the subject of an image. The definitions consist of an alphanumeric classification code and its textual correlate.[7]
For example, for the image we have described in XML in graphic 1, the Iconclass definition is 71A3421 Eve emerges from Adam’s body. The Medieval Illustrated Manuscripts Website of the Koninklijke Bibliotheek, The Hague, has an Iconclass Browser in place[8] that provides the hierarchical path for this concept in the classification system.
An outstanding example of a controlled vocabulary is the Art & Architecture Thesaurus (AAT), one of the Getty Research Institute’s Vocabulary Databases. It is a structured vocabulary of more than 125,000 terms and other information about concepts that are used for describing fine art, architecture, decorative arts, archival materials and material culture.[9]
The W3C’s Resource Description Framework (RDF) Schema Specification 1.0, in a section on its scope, mentions concept navigation and states: ‘Thesauri and library classification schemes are well known examples of hierarchical systems for representing subject taxonomies in terms of the relationships between named concepts. The RDF Schema specification provides sufficient resources for creating RDF models that represent the logical structure of thesauri (and other library classification systems).’[10]
Yet for realising a full-blown cultural heritage ontology for the Semantic Web there are currently limitations on both sides. On the one hand, hierarchical classification systems and structured vocabularies do not lend themselves easily to rich inter-linking of conceptual ‘trees’.
A major step further in this direction is the CIDOC object-oriented Conceptual Reference Model (CRM).[11] This provides an ontology of 81 classes and 130 properties, which describes in a formal language concepts and relations relevant to the documentation of cultural heritage.
On the other hand, RDF Schema has limitations when it comes to expressing complex ontological relationships. New languages based on description logics are being developed. These include DAML+OIL and the upcoming Web Ontology Language (OWL), which are capable of fully describing ontologies.
Also worth highlighting is that tools for ontology building are proliferating at the present time. These ontology editors need to be carefully assessed, as their capabilities differ considerably.[12]
7 Bible
71 Old Testament
71A Genesis from the creation to the expulsion from paradise, and later years of Adam and Eve
71A3 creation of man; the Garden of Eden (Genesis 1:26-2)
71A34 creation of Eve
71A342 Eve is fashioned from Adam’s rib
71A3421 Eve emerges from Adam’s body
[6] Upper-level ontologies describe the basic concepts and relationships invoked when information about any domain is expressed in natural language.
[7] For in-depth information see the official Iconclass Website, http://www.iconclass.nl
[8] Medieval Illustrated Manuscripts Website, http://www.kb.nl/kb/manuscripts/browser. The subject access system for the Website was conceived by Mnemosyne Partners, building on the Iconclass classification system and technologies. See the valuable information they provide at http://www.mnemosyne.org/business/msstempex.html
[9] A detailed description of the AAT is provided at http://www.getty.edu/research/tools/vocabulary/aat/about.html
[10] http://www.w3.org/TR/2000/CR-rdf-schema-20000327
[11] See the information box at the end of the Forum discussion and the sources mentioned in the Semantic Web Terms and Reading List.
[12] Michael Denny: Ontology Building: A Survey of Editing Tools (06-11-2002), http://www.xml.com/pub/a/2002/11/06/ontologies.html
Semantic Transformation 1: The RDF Data Model
Progressing towards semantic interoperability, the metadata in the XML documents are now transformed into RDF statements corresponding to the RDF data model. With these so-called RDF ‘triples’, the XML metadata elements are mapped to the RDF classes and properties, which are defined by the RDF Schema of the FMS initiative (see section 3).
‘XML is nothing more than a way to standardize data formats. This is not to underplay XML’s importance. A data-format standard makes all of the more glamorous technologies possible, and RDF is the leading example of the benefit that comes once the data format has been standardized. Many proclaim that RDF is really the XML’s killer app, and with good reason. Despite all this, RDF remains somewhat obscure. This is mainly because at its core RDF is very abstract, very dry, and very academic.’ Uche Ogbuji: An introduction to RDF (2000), http://www-106.ibm.com/developerworks/library/w-rdf/?dwzone=xml
RDF Data Model
In order to make Web resources semantically interoperable, we need resources that provide machine-understandable information about themselves. In the Semantic Web architecture, the statements that carry this information are built by using the Resource Description Framework (RDF). RDF defines a data model for statements describing typed relationships between uniquely identified resources. RDF distinguishes between:
| resources: familiar examples are a Web page, electronic document or digital image, but in RDF also entities that are not ‘network retrievable’, e.g. museums, curators or bound medieval manuscripts, can be resources;
| properties: these identify a specific aspect, characteristic, attribute or relation used to describe the resource;
| statements: these associate a value for a named property with the resource.
Hence RDF provides a model for describing relationships between resources in terms of named properties and values. The RDF data model intrinsically supports only binary relations. Its base element is the ‘triple’, which takes the form of subject, predicate, object: a resource (the subject) is linked to another resource (the object) through an arc labelled with a third resource (the predicate). The semantics of a triple clearly depends on the property used as predicate.
A convenient way to visualise this is to draw nodes for subject and object and an arrow between them for the predicate (see graphic 2). In this labelled directed graph, subject and predicate (property) are Uniform Resource Identifiers (URIs), and the object is either a URI or a literal (which is drawn as a box). Everything in RDF can be represented by a graph with nodes and arcs, and the data model allows for using the same URI as a node and as an arc label. To represent RDF statements in a machine-processable way, RDF builds on XML. With RDF/XML, a specific XML markup language, RDF information can be represented and exchanged between machines.
Graphic 2: RDF Data Model
Triple 1: Subject http://www.m-i.org/schemas/images#ColumnMiniature; Predicate http://www.w3.org/2000/01/rdf-schema#subClassOf; Object http://www.m-i.org/schemas/images#Miniature
Triple 2: Subject http://www.m-i.org/schemas/images#Miniature; Predicate http://www.w3.org/2000/01/rdf-schema#subClassOf; Object http://www.m-i.org/schemas/images#Image
With these two triples we state that ColumnMiniature is a subclass of Miniature, and that Miniature is a subclass of Image.
The predicate in our statements is the rdfs:subClassOf property, which is predefined in the RDF Schema namespace http://www.w3.org/2000/01/rdf-schema#
The classes Image, Miniature and ColumnMiniature would also need to be defined in our RDF Schema namespace http://www.m-i.org/schemas/images#
For details on how to use RDF Schema for defining your domain ontology see section 3. In a nutshell, RDF Schema is a ‘higher-level’ language which is itself defined using RDF.
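The two subclass statements of graphic 2 can be written down as plain (subject, predicate, object) triples. The Python sketch below is an illustration only: the exact form of the fictitious m-i.org namespace (with a trailing #) is an assumption, and the function superclasses is a hypothetical helper. It follows the rdfs:subClassOf arcs to show what a program can conclude from the graph, relying on the fact that rdfs:subClassOf is transitive.

```python
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
MI = "http://www.m-i.org/schemas/images#"  # assumed form of the fictitious namespace

# The graph of graphic 2 as a set of (subject, predicate, object) triples.
triples = {
    (MI + "ColumnMiniature", RDFS + "subClassOf", MI + "Miniature"),
    (MI + "Miniature", RDFS + "subClassOf", MI + "Image"),
}


def superclasses(cls, graph):
    """Collect every class reachable from cls by following rdfs:subClassOf arcs."""
    found = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for subject, predicate, obj in graph:
            if (subject == current
                    and predicate == RDFS + "subClassOf"
                    and obj not in found):
                found.add(obj)
                frontier.append(obj)
    return found


# ColumnMiniature is (indirectly) also a subclass of Image.
for uri in sorted(superclasses(MI + "ColumnMiniature", triples)):
    print(uri)
```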
Semantic Transformation 2 Creating andValidating the RDF StatementsIn the mapping process an editor tool (in the FMScase a tool developed by the project team calledMeedio) receives as input the XML documents andassists in transforming them into semantically validRDF statements (instance descriptions)The toolserves as instance editor which provides a convenientway of finding and selecting from the XML metadataelements correct instance values for a particularpropertyThe editor tool also serves as semanticmetadata validatorWhen the museum cataloguersaves the set of RDF statements corresponding to anXML document the semantics of these statementsare validated against the property constraints of theFMS ontologyThe result of a successful mapping andvalidation process is a unique set of RDF triplescalled the RDF card
RDF Mapping Rules
When RDF is used to define the meaning of XML metadata elements, a set of mapping rules is created. A mapping rule is a template of RDF triples where XPath expressions are used to identify the actual element values. XPath is a language for addressing parts of an XML document and was designed for use in XML parsing software (XSLT, XPointer and others).
When applying such a rule to an XML document, the XPath expressions are instantiated with matching element values. If the rule matches, the RDF template evaluates to a set of RDF triples where XPath expressions are substituted by the corresponding values of the XML elements. For example, by applying the template rule
<image5kb78d38itype, mi:hasCreator, image5kb78d38icreator>
to the XML document described in the section on XML, the following result would be obtained:
<Image Miniature, mi:hasCreator, ‘Alexander Master’>
mi:hasCreator is an example of an RDF property. Such properties are explained in section 3, RDF Schema.
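The substitution step can be sketched in a few lines. This is a minimal illustration, not the FMS tool: the XML record, its element names, the `id` attribute and the exact XPath expressions are all assumptions, since the punctuation of the original rule is not recoverable here. It uses the limited XPath support of Python's standard `xml.etree.ElementTree` module.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML record; element names and the id attribute are assumed.
XML_DOC = """
<images>
  <image id="5kb78d38i">
    <type>Image Miniature</type>
    <creator>Alexander Master</creator>
  </image>
</images>
"""

# A mapping rule as a triple template: XPath for the subject value,
# a fixed RDF property, and XPath for the object value (paths are assumed).
RULE = (
    ".//image[@id='5kb78d38i']/type",
    "mi:hasCreator",
    ".//image[@id='5kb78d38i']/creator",
)


def apply_rule(root, rule):
    """Instantiate the template; return no triples if the rule does not match."""
    subject_path, predicate, object_path = rule
    subject = root.findtext(subject_path)
    obj = root.findtext(object_path)
    if subject is None or obj is None:
        return []
    return [(subject.strip(), predicate, obj.strip())]


root = ET.fromstring(XML_DOC)
print(apply_rule(root, RULE))
```

If either XPath expression finds nothing, the rule simply yields no triples, which corresponds to the "if the rule matches" condition in the text.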
Note: Due to the limited space permitted, we do not address issues of term mapping. This is an important aspect of the mapping and validation process carried out in the FMS project. Working with metadata from different museums, they need to deal with partly different terminologies. Their technical solution to synonymous terms (i.e. different terms referring to the same concepts) is to attach synonym sets to the FMS ontology classes. With situations where polysemous terms occur (i.e. the same terms refer to different concepts), the editor tool cannot cope and the cataloguer needs to select the correct interpretation.
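To make the distinction concrete, the following sketch attaches synonym sets to ontology classes and hands ambiguous terms back to a human. It is only an illustration under stated assumptions: the class names, the terms and the functions `candidate_classes` and `resolve` are invented for this example and do not describe the FMS implementation.

```python
# Hypothetical synonym sets attached to ontology classes.
SYNONYM_SETS = {
    "mi:Miniature": {"miniature", "illumination"},
    "mi:DecoratedInitial": {"initial", "decorated initial"},
    "mi:Signature": {"initial"},  # assumed second sense of "initial"
}


def candidate_classes(term, synonym_sets):
    """Return every class whose synonym set contains the term."""
    key = term.strip().lower()
    return sorted(cls for cls, synonyms in synonym_sets.items() if key in synonyms)


def resolve(term, synonym_sets):
    """Map a term to a class, or return None when a cataloguer must decide."""
    candidates = candidate_classes(term, synonym_sets)
    if len(candidates) == 1:
        return candidates[0]
    return None  # unknown term or polysemous term


print(resolve("Illumination", SYNONYM_SETS))
print(candidate_classes("initial", SYNONYM_SETS))
```

A synonym resolves automatically because it belongs to exactly one class; a polysemous term yields several candidates, so the choice is left to the cataloguer.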
Semantic Transformation 3: The RDF Schema (RDFS)
The shared ontology for the textiles domain is created by using Resource Description Framework Schema (RDFS). An RDF Schema is a tool for indicating the classes of resources one wants to describe as well as for defining the properties used to describe those resources. Furthermore, class/sub-class relationships and property/sub-property relationships can be defined. The museums are mapping their metadata to the classes and properties defined by the RDF Schema of the FMS initiative. Thereby they are making the meaning of the metadata explicit and representing them in a harmonised, uniform way.
RDF Schema
In section Semantic Transformation 1 we have described the data model provided by RDF for expressing statements about Web resources. But we also need a vocabulary for the RDF statements, namely classes and properties defined with RDF Schema (RDFS).
In brief, the RDF Schema mechanism provides a pre-defined vocabulary, a basic type system that can be used in creating domain-specific schemas. Its role is to allow for declaring metadata properties (e.g. for ‘type’, ‘subject’ or ‘creator’), to define the classes of resources they may be used with, to restrict possible combinations, and to detect violations of those restrictions.
Defining classes
With RDF Schema (RDFS), Web resources can be defined as instances of one or more classes. In addition, classes can be organised in a hierarchical fashion. As we hold a collection of digital images drawn from illustrated medieval manuscripts, we first need to define a class of things that are images. In RDF Schema, a class is any resource having an rdf:type property whose value is the RDFS-defined resource rdfs:Class.
So, using the basic RDF data model, we define:
mi:Image [resource] rdf:type [property] rdfs:Class [value]
The self-defined prefix mi: (for medieval images) stands for the URI reference of our RDF Schema namespace http://www.m-i.org/schemas/images#
In our image collection we have various special kinds of digitised images, such as column miniatures, decorated initials, schematic drawings, etc. To distinguish, for example, the miniatures, first we need to define a general class Miniature and subclasses of miniatures, e.g. a subclass ColumnMiniature:
mi:Miniature rdf:type rdfs:Class
mi:ColumnMiniature rdf:type rdfs:Class
Secondly, we need to define that mi:ColumnMiniature is a subclass of mi:Miniature and that mi:Miniature is a subclass of mi:Image, for which we use the predefined rdfs:subClassOf property:
mi:Miniature rdfs:subClassOf mi:Image
mi:ColumnMiniature rdfs:subClassOf mi:Miniature
As the rdfs:subClassOf property is transitive, this means that mi:ColumnMiniature is also implicitly a subclass of mi:Image.
Graphic 2 on page 32 visualises this with the nodes and arcs of the basic RDF data model.
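The effect of transitivity can be shown with a small sketch that derives all implied superclasses from the declared rdfs:subClassOf statements. The function name `superclasses` and the use of prefixed names as plain strings are assumptions for illustration; an RDF toolkit would normally do this reasoning.

```python
# Declared subclass statements as (subclass, superclass) pairs.
SUBCLASS_OF = {
    ("mi:ColumnMiniature", "mi:Miniature"),
    ("mi:Miniature", "mi:Image"),
}


def superclasses(cls, subclass_of):
    """Return every direct and implied superclass of cls."""
    found = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for sub, sup in subclass_of:
            # Follow each declared link once; 'found' also guards against cycles.
            if sub == current and sup not in found:
                found.add(sup)
                stack.append(sup)
    return found


print(sorted(superclasses("mi:ColumnMiniature", SUBCLASS_OF)))
```

Only two statements are declared, yet mi:Image appears among the superclasses of mi:ColumnMiniature, which is the implicit relationship described above.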
Defining properties
In order to make the meaning of our metadata (i.e. ‘type’) explicit, we need to be capable of declaring specific properties that characterise the classes of things we hold at http://www.m-i.org, e.g. digital images of medieval column miniatures.
Basically, RDF Schema defines properties in terms of the classes of resources to which they apply. This is the role of the rdfs:domain and rdfs:range mechanisms.
rdfs:range
The range constraint defines the class or set of classes whose instances can be values of a particular property. If we want to define the property mi:hasType, we must describe this resource (which we locate at http://www.m-i.org/schemas/images#) with an rdf:type property whose value is rdf:Property:
mi:hasType [resource] rdf:type [property] rdf:Property [value]
The following RDF statements indicate that mi:ColumnMiniature is a class, mi:hasType is a property, and RDF statements using the mi:hasType property have instances of mi:ColumnMiniature as values:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:range mi:ColumnMiniature
rdfs:domain
The domain constraint restricts the set of classes whose instances may have a particular property attached to them. If we want to indicate that the property mi:hasType applies to instances of class mi:ColumnMiniature, we would write:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:domain mi:ColumnMiniature
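Read as constraints, the domain and range declarations above can be checked mechanically. The sketch below is a simplified illustration: it uses the mi:hasType declarations from the text, while the instance names (prefixed `ex:`), the class mi:DecoratedInitial and the function `violations` are assumptions. For brevity it compares classes by exact match and ignores subclass relationships.

```python
# Property declarations taken from the statements in the text.
SCHEMA = {
    "mi:hasType": {"domain": "mi:ColumnMiniature", "range": "mi:ColumnMiniature"},
}

# Hypothetical instances and their classes (rdf:type).
TYPES = {
    "ex:miniature1": "mi:ColumnMiniature",
    "ex:miniature2": "mi:ColumnMiniature",
    "ex:initial1": "mi:DecoratedInitial",
}


def violations(statement, schema, types):
    """Return which constraints ('domain', 'range') a statement breaks."""
    subject, prop, obj = statement
    constraints = schema.get(prop, {})
    problems = []
    domain = constraints.get("domain")
    if domain is not None and types.get(subject) != domain:
        problems.append("domain")
    value_class = constraints.get("range")
    if value_class is not None and types.get(obj) != value_class:
        problems.append("range")
    return problems


print(violations(("ex:miniature1", "mi:hasType", "ex:miniature2"), SCHEMA, TYPES))
print(violations(("ex:initial1", "mi:hasType", "ex:miniature2"), SCHEMA, TYPES))
```

A validator of the kind described for the editor tool would presumably run a check like this when the cataloguer saves a set of statements, reporting any statement whose subject or value falls outside the declared classes.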
Benefits of RDF
In a SearchWebServices.com definition of RDF, some benefits of RDF are mentioned:
| ‘By providing a consistent framework, RDF will encourage the providing of metadata about Internet resources.
| Because RDF will include a standard syntax for describing and querying data, software that exploits metadata will be easier and faster to produce.
| The standard syntax and query capability will allow applications to exchange information more easily.
| Searchers will get more precise results from searching, based on metadata rather than on indexes derived from full text gathering.
| Intelligent software agents will have more precise data to work with.’13
This is a well-crafted listing of RDF benefits, from provision and exchange of better metadata to agents working with them, hopefully for the benefit of humans. But, as explicitly stated by SearchWebServices.com, these are only potential benefits, i.e. they depend on the level of actual uptake of RDF.
13 whatis.com / searchWebServices.com: Definitions - Resource Description Framework, http://searchwebservices.techtarget.com/sDefinition/0,,sid26_gci213545,00.html (last updated July 27, 2001)
Generating and Using the Knowledge Space
The RDF cards represent the original XML documents at the semantic level. The union of such RDF cards constitutes a knowledge base, which is a harmonised semantic representation of the underlying heterogeneous databases.
However, so far the RDF instance descriptions have not left the museum. The museum has complete control of the information it wants to publish, and it does not need to allow the FMS system access to its internal database system. The RDF data are placed in a public directory on the museum’s WWW server.
The Web crawler of the FMS system harvests the instance descriptions from the different museums and the system combines them into an RDF repository. This repository is a large semantic graph that consists of the shared ontology and metadata.
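Because an RDF graph is a set of triples, combining harvested RDF cards with the shared ontology amounts to a set union. The sketch below illustrates only that idea; the card contents, the `ex:` instance names and the function `build_repository` are assumptions, and the harvesting itself (fetching files from each museum's public directory) is left out.

```python
# Shared ontology statements (excerpt).
ONTOLOGY = {
    ("mi:ColumnMiniature", "rdfs:subClassOf", "mi:Miniature"),
}

# Hypothetical RDF cards as published by two museums.
CARD_MUSEUM_A = {
    ("ex:a1", "rdf:type", "mi:ColumnMiniature"),
    ("ex:a1", "mi:hasCreator", "Alexander Master"),
}
CARD_MUSEUM_B = {
    ("ex:b1", "rdf:type", "mi:Miniature"),
}


def build_repository(ontology, cards):
    """Merge the ontology and all harvested cards into one set of triples."""
    repository = set(ontology)
    for card in cards:
        repository |= set(card)  # duplicates collapse automatically
    return repository


repository = build_repository(ONTOLOGY, [CARD_MUSEUM_A, CARD_MUSEUM_B])
print(len(repository))
```

Since all cards use the classes and properties of the same ontology, the merged set can be treated as one graph without further translation.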
How does a user now search and navigate in this knowledge space? In the FMS system this is implemented by a server-side software called Ontogator. Based on the semantic graph, this software dynamically generates semantic linkages for the user’s Web browser.
One way of using the FMS system is view-based filtering. The user can select classes of resources from the ontology and the system finds the instances that match the selected class restrictions. By constraining classes (views) further, the collection instance data searched for are eventually found.
The software also supports topic-based navigation by providing semantic links between topics of interest, the creation of which is based on the collection domain ontology and the related metadata of the collection records. This means that the links also provide the user with an impression of the wider context and pragmatics of the objects in the museums’ collections.
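View-based filtering can be pictured as successive narrowing of a result set. The following is a rough sketch under stated assumptions, not a description of how Ontogator works: the instances, the place classes such as `ex:Paris` and the function `filter_by_views` are invented for illustration, and subclass relationships are ignored.

```python
# Hypothetical collection objects, each annotated with classes
# drawn from several views of the ontology (kind of image, place).
INSTANCES = {
    "ex:object1": {"mi:ColumnMiniature", "ex:Paris"},
    "ex:object2": {"mi:ColumnMiniature", "ex:Utrecht"},
    "ex:object3": {"mi:DecoratedInitial", "ex:Paris"},
}


def filter_by_views(instances, selected_classes):
    """Keep the instances that match every selected class restriction."""
    selected = set(selected_classes)
    return sorted(
        name for name, classes in instances.items() if selected <= classes
    )


print(filter_by_views(INSTANCES, ["mi:ColumnMiniature"]))
print(filter_by_views(INSTANCES, ["mi:ColumnMiniature", "ex:Paris"]))
```

Each additional class selected by the user shrinks the list, which is the sense in which constraining the views further eventually leads to the records sought.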
From human users to software agents
As described in the Finnish Museums on the Semantic Web example, the RDF repository is a large semantic graph that consists of the shared ontology and metadata of the participating museums. Such a repository can be queried and the results, a set of pointers to the relevant resources, can be accessed using Web browsers. The opportunities provided by a system like the one developed by the FMS initiative (e.g. topic-based navigation) are at present restricted primarily to human users.
The Semantic Web vision includes intelligent software agents which ‘understand’ semantic relationships between Web resources and seek relevant information as well as perform transactions for humans.14
This software would be capable of autonomous action, i.e. could run without direct human control or constant supervision, and ideally is very flexible in doing this. Characterisations of this flexibility include actions that are ‘reactive’, ‘proactive’ and ‘social’ (see below).
While the basic idea of agents is very intuitive and appealing, the actual theory is complex, the tools are immature, the solutions small and prototype-based. In fact, as a parallel distributed systems technology, agents belong to the most complex class of software technology.
However, this primer will conclude with a summary of what an intelligent software agent is and what such software would generally be capable of doing. This should also serve as an indication of how great the challenge for research and technological development is to make the full Semantic Web vision a reality.
Intelligent Software Agents
The following definitions are taken from Michael Wooldridge’s introduction to multiagent systems.15
Agent
‘An agent is a computer system capable of autonomous action in some environment.’
Intelligent agent
‘An intelligent agent is a computer system capable of flexible autonomous action in some environment.’
Flexible autonomous action
‘By flexible autonomous action we mean reactive, proactive, social.’
| Reactivity: ‘A reactive system is one that maintains an ongoing interaction with its environment, and responds to changes that occur in it (in time for the response to be useful).’
| Proactiveness: ‘An agent serves a purpose, and therefore exhibits goal-directed behaviour, including the capacity to recognise opportunities for useful courses of action.’
| Social ability: ‘Social ability in agents is the ability to interact with other agents (and possibly humans) via some kind of agent communication language, and perhaps cooperate with others.’
Desirable further properties of agents are:
| Mobility: the ability to move around an electronic network.
| Rationality: an agent will act in such a way that it does not prevent itself from achieving its goals (as far as this is possible with a limited set of beliefs representing its world knowledge).
| Learning: an agent will improve its performance over time.
14 T. Berners-Lee, J. Hendler, O. Lassila: Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html
15 M. Wooldridge: An Introduction to Multiagent Systems. Chichester: Wiley, 2002, and http://www.csc.liv.ac.uk/~mjw/pubs/imas/
Resources
In the primer no references are made to the documents of the World Wide Web Consortium (W3C). All relevant W3C recommendations can be found at http://www.w3.org
Of the wealth of introductory materials on XML and RDF available on the Web, the following in particular are useful to consult for further details:
http://www.w3schools.com/xml/
http://www.w3schools.com/schema/
http://www.w3.org/TR/rdf-primer/
K. S. Candan, H. Liu, R. Suvarna: Resource Description Framework: Metadata and its Applications. In: SIGKDD Explorations, Vol. 3/1 (2001), 6-19.
Pierre-Antoine Champin: RDF Tutorial (2001), http://www710.univ-lyon1.fr/~champin/rdf-tutorial/rdf-tutorial.html
S. Decker, M. P. Mitra, S. Melnik: Framework for the Semantic Web: An RDF Tutorial, http://www.ida.liu.se/~asmpa/courses/sweb/rdf/rdf_tutorial.pdf
Graphic 3: Set-up of the Finnish Museums on the Semantic Web System
[Graphic: The Finnish Museums on the Semantic Web, overview of the system’s set-up. Components shown: user client (WWW browser; topic-based navigation, view-based filtering); server (Ontogator software); RDF database holding the semantic graph (knowledge space of shared ontology and metadata); ontology (RDFS); Web crawler collecting RDF instances; for each museum, collection database 1, 2, ... n with XML repository, metadata editor and RDF instances. Layers: relational schemas (DBMS); XML Schema (syntactic interoperability); RDF Schema (semantic interoperability).]
THE DARMSTADT FORUM PARTICIPANTS

Wernher Behrendt, Salzburg Research, Austria
Wernher Behrendt is a Senior Researcher at the Sun Technology and Research Excellence Center at Salzburg Research (Austria), working on multimedia middleware and interoperation issues. He holds an MSc in Cognitive Science from Manchester University and has more than 10 years’ experience in near-to-market IT research. From 1989 to 1995 he was a Senior Research Associate in the Informatics Department at Rutherford Appleton Laboratory (UK), working on embedded knowledge based systems in distributed multimedia presentation systems. From 1995 to 1998 he was a Senior Research Associate at Cardiff University (UK), working on interoperation between heterogeneous information systems. Mr Behrendt has held courses in Computer Science and has worked on projects ranging from software engineering methods and quality assurance to legacy system reengineering using migration methods and distributed systems middleware.
E-mail: wernher.behrendt@salzburgresearch.at

Paolo Buonora, Archivio di Stato di Roma, Italy
Paolo Buonora holds a degree in Philosophy from the University of Rome ‘La Sapienza’ (1976). He worked in the Italian State Archive Administration from 1978, where he was first involved in editing the Guida Generale degli Archivi di Stato italiani. From 1986 he worked in the Soprintendenza archivistica per il Lazio, surveying audiovisual archives and municipal archives; from 1989 to 1991 at the Perugia University, engaged in doctoral research in ‘Urban and rural history’; and from 1991 to 1994 again in the Soprintendenza archivistica per il Lazio. After 1994 he worked in the Archivio di Stato di Roma, where he was responsible for the photograph service and several working groups on informatics application in archival documentation. From 1997 until the present time he has planned and directed the Imago II project in the Archivio di Stato di Roma.
See httpwwwasrmarchivbeniculturaliitsidimagoIMAGOIIenhtml

Samuel Cruz-Lara, University of Nancy 2 and LORIA, France
Samuel Cruz-Lara obtained a Master’s degree in Computer Science in 1984 (University Henri Poincaré, Nancy 1) and a PhD degree in Computer Science in 1988 (National Polytechnic Institute of Lorraine). The central topic of his PhD thesis was the generation of integrated development environments by using attribute grammars.
He is currently Associate Professor at the University of Nancy 2 (Institute of Technology, Computer Science Department) and permanent Researcher at LORIA (Lorraine Laboratory for Research in Computer Science and its Applications – UMR 7503 – CNRS – INRIA – Universities of Nancy). He is a member of the ‘Language and Dialogue’ team and has conducted several research activities on distributed software architectures and textual linguistic resources management. He is currently working in the context of distributed architectures and multimedia resources management. Dr Cruz-Lara has participated in several projects, in particular CNRS/SILFIDE and MLIS-ELAN, and he is at present co-leader of the ‘Digital Museum’ project. This is a joint project between LORIA and the National Chi-Nan University. The ‘Digital Museum’ project is sponsored by the National Science Council of the Republic of China (Taiwan) and supported by INRIA (France).

Costis Dallas, Critical Publics SA, Greece
Costis Dallas is Chairman and Senior Researcher of Critical Publics (http://www.criticalpublics.com), a group of companies active in the field of strategic communications, creative design and technology. He is currently a Lecturer in the Department of Communication and Mass Media of Panteion University and has over 15 years of research and professional experience in hypermedia applications, human factor issues and cultural information systems. Dr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA, co-founder and Chair of the Multimedia Working Group of the International Council of Museums (CIDOC/ICOM), Head of Documentation and Systems of the Benaki Museum, General Director of the Foundation of the Hellenic World, Special Secretary of the Greek Ministry of Education in charge of libraries, archives and instructional technologies, and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: info@criticalpublics.com

Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (http://www.nl.adlibsoft.com), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that he joined Databasix, producer of the ADLIB software. In 1991 Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was renamed later as ADLIB Information Systems.
ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include:
CIMI Dublin Core test bed - ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database in addition to the ‘standard’ HTML/XML access.
SPECTRUM-XML - A project of the UK mda together with a team of software vendors to produce a schema for exchange of data which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software.
Internet Gelderse Musea (http://www.igem.nl) and Maritiem Digitaal (http://www.maritiemdigitaal.nl) - Two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.
E-mail: bert@nl.adlibsoft.com

Nicola Guarino, Italian National Research Council, Italy
Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991 and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS) and associated editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: Nicola.Guarino@ladseb.pd.cnr.it

Janneke van Kersen, Digital Heritage Association, The Netherlands
Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992 she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing, and more recently she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University.
Since October 1999 Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata, and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations such as museums, archives, archaeological organisations, monuments and special collections of libraries in a context-rich and XML-based environment. Interoperability, from both a technical and an organisational perspective, is therefore a major issue. The Association offers access to heritage information to the general public at http://www.cultuurwijzer.nl and customised access for the educational field at http://www.cultuurwijs.nl
E-mail: janneke.vankersen@den.nl

Marco Meli, EDW International, Italy
Marco Meli is CEO and co-founder of EDW International (http://www.edw-international.com), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia, VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meli@edw.it

Paul Miller, UKOLN, UK
Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, http://www.ukoln.ac.uk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK’s Further and Higher Education Funding Councils and Resource, the Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI) and the Metadata Working Group of the UK Government’s Office of the e-Envoy.
E-mail: P.Miller@ukoln.ac.uk

Frank Nack, CWI, UK
Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on ‘The Application of Video Semantics and Theme Representation for Automated Film Editing’ at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document, and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: Frank.Nack@cwi.nl

Franco Niccolucci, Florence University, Italy
Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference.
See also http://www.geog.port.ac.uk/hist-bound/people/niccolucci.htm

Seamus Ross, HATII, University of Glasgow, UK
Dr Seamus Ross is Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Network) (IST-2001-32706), a European Union funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously he was Assistant Secretary for Information Technology at the British Academy, and before that worked for a company specialising in expert systems and software development as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group’s PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: S.Ross@hatii.arts.gla.ac.uk

Andrea Scotti, Institute & Museum for the History of Science, Italy
Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995 he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.
Since 1996 Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza, the National Central Library, and under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (01/02/01-30/07/03), funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies.
See http://cweb.inria.fr/Projects/Mesmuses
E-mail: scottian@galileo.imss.firenze.it
DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over the period of 30 months (03/2002-08/2004).
In order to encourage early take-up, DigiCULT produces seven Thematic Issues and three Technology Watch Reports, along with the newsletter DigiCULT.Info.
DigiCULT draws on the results of the strategic study ‘Technological Landscapes for Tomorrow’s Cultural Economy (DigiCULT)’ that was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications) in 2000 and completed in 2001.
Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at http://www.digicult.info
For further information on DigiCULT please contact the team of the project co-ordinator:
Mr Guntram Geser, guntram.geser@salzburgresearch.at
Phone: +43-(0)662-2288-303
Mr John Pereira, john.pereira@salzburgresearch.at
Phone: +43-(0)662-2288-247
Salzburg Research Forschungsgesellschaft
Jakob-Haringer-Str. 5/III
A - 5020 Salzburg, Austria
Phone: +43-(0)662-2288-200
Fax: +43-(0)662-2288-222
http://www.salzburgresearch.at
Project Partner:
HATII - Humanities Advanced Technology and Information Institute
University of Glasgow
http://www.hatii.arts.gla.ac.uk
Contact: Mr Seamus Ross, s.ross@hatii.arts.gla.ac.uk
The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom
DIGICULT PROJECT INFORMATION
DigiCULT Thematic Issue 1 - Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum held in Barcelona on May 6th, 2002, in the context of the DLM-Conference 2002.
DigiCULT Thematic Issue 2 – Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum held in Essen, Germany, on September 3rd, 2002, in the context of the AIIM Conference DMS EXPO.
DigiCULT Thematic Issue 3 - Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum held on January 21st, 2003, at Fraunhofer IPSI, Darmstadt, Germany.
DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum on Learning Objects that will take place at the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, on July 2nd, 2003.
IMPRINT
This Thematic Issue is a product of the DigiCULTProject (IST-2001-34898)
Authors:
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy
Images:
Images for this Thematic Issue have been provided by and are reproduced with kind permission of the Koninklijke Bibliotheek – National Library of the Netherlands, The Hague, Netherlands.
Graphics & Layout:
Jan Steindl, Salzburg Research
ISBN 3-902448-00-8
Printed in Austria
© 2003
IMAGES
Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I. Image on p. 5 from Fol. 264r, size 80x75; illuminator: Orosius Master a.o.
Bible Historiale, Paris, c. 1320-1340, Volume I. Image on p. 7 from Fol. 2r, size 45x50; illum.: ‘Sub-Fauvel’ Master.
Historic Bible, Utrecht, c. 1430, Volume I. Images on pp. 8, 10, 14, 28 from Fol. 3r, 3v, 4v, 7v, size 55x85 to 60x85; illum.: Alexander Master o.a.
Psalter, Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468. Images on pp. 11, 13: Wednesday Matins, Invitatorium, from Fol. 219r.
Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350. Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron) from Fol. 104rb1, 106rb2, 124r, 111ra, size 40x55 to 50x55.
Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335. Image on p. 21 from Fol. 4va1, size 45x55.
Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460. Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41.
Psalter, Normandy, c. 1180. Image on p. 26 from Fol. 3v, size 160x125.
Breviary, Cambray(?), c. 1275-1300. Images on pp. 27, 29, 31, 33, 34, 35, 37 from Fol. 211r, size 205x135.
© Koninklijke Bibliotheek, The Hague. Used with permission.
Office of the e-Envoy [6], the agency in charge of Britain’s e-Government programme, drove it with a Web site called Govtalk [7]. ‘One big ASP database’, he called it. It carried the Government’s interoperability standard framework with a URL of ‘a great big long string, something, something dot ASP’. ‘That is how this key piece of legislation is referred to, and next week it will be something else.’
Dr van Kersen thought the new Web Ontology Language, with the transposed acronym OWL, would help. Italian ontology researcher Nicola Guarino discussed the European Commission’s On-To-Knowledge project, its RDF tool Sesame [8] and the Ontology Inference Layer (OIL) language.
Dr Ross described the US Defense Department’s DARPA (Defense Advanced Research Projects Agency) Agent Markup Language (DAML) programme and its DAML+OIL variant. Mr Degenhart-Drenth highlighted the importance of protocols used in Web Services, such as SOAP (Simple Object Access Protocol) and UDDI (Universal Description, Discovery and Integration), and also pointed to a SPECTRUM XML standard for museums that he helped write. Then there was the moviemakers’ audio-visual search standard, MPEG-7, and the SMIL (Synchronized Multimedia Integration Language). No one mentioned SHOE (Simple HTML Ontology Extensions), which was surprising as the discussion became more and more alphabetic and upper case.
ONTOLOGY TUTORIAL IN 800 WORDS
Nicola Guarino brought the discussion back on track. The ontology expert delivered a fascinating, impromptu 800-word dissertation on ontology genetics.
Ontologies, he said, started because it was realised that controlled vocabularies, which worked well enough for limited periods, needed something extra to make them really useful. They needed clarification of intended meaning. This could be achieved in much the same way as dictionaries did it, by reference to other, more basic terms. This was, he said, the key point.
‘Ontologies can work if the basic terms are really used in a principled way. There is a hidden assumption here that it is indeed possible to express the meaning in terms of a relatively small set of primitive terms.’
He explained further: ‘There are general terms that have a universal meaning. Take the term “part”, for instance, or “set”. Or take temporal relations: “before”. Suppose I have two different periods, the Renaissance and another period, and suppose I say that this period comes “before” the other one: do you exclude the case whereby the two periods overlap or not? This is just a matter of stipulation; this is a general term that is not domain specific. I can simply stipulate exactly whether the “before” relationship between the two intervals includes the case of overlapping or not. And I can do that by means of axioms. Once you clarify the meaning of the basic terms (topological relations, mereological [9] relations, dependence relations, these kinds of things), then you have the basic vocabulary that helps you to introduce more domain-related things. And this is what people are doing in the area of what are called “Foundational Ontologies”, and this is what I am doing. I believe this is the only way to solve the problem of semantic interoperability. So, not just controlled vocabularies, but vocabularies that are formally defined in minimal terms.’
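The stipulation of ‘before’ described here can be sketched in a few lines of Python. This is only an illustration and not something presented at the Forum: the Period class, the two function names and the rounded year ranges are all assumptions made for the example. It shows that the same pair of periods satisfies one stipulated meaning of ‘before’ and fails the other, which is why the choice has to be written down explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Period:
    """A named time interval given by start and end years (hypothetical values)."""
    name: str
    start: int
    end: int

    def __post_init__(self):
        # Basic axiom: a period cannot end before it starts.
        if self.end < self.start:
            raise ValueError(f"{self.name}: end precedes start")


def strictly_before(a: Period, b: Period) -> bool:
    """Stipulation 1: 'before' excludes overlap; a must end before b starts."""
    return a.end < b.start


def starts_before(a: Period, b: Period) -> bool:
    """Stipulation 2: 'before' admits overlap; a only has to start earlier than b."""
    return a.start < b.start


# Rounded, illustrative year ranges; the real boundaries are a matter of debate.
late_middle_ages = Period("Late Middle Ages", 1300, 1500)
renaissance = Period("Renaissance", 1400, 1600)

print(strictly_before(late_middle_ages, renaissance))  # False: the periods overlap
print(starts_before(late_middle_ages, renaissance))    # True: overlap is allowed
```

Two systems that exchange the statement ‘A comes before B’ will only agree on what follows from it if both have adopted the same one of these two definitions.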
Now it was clear, but would it be available to heritage institutions? Dr Ross wanted to know if fundamental ontologies of use to the heritage sector already existed. What would they be? Before any
On-To-Knowledge
On-To-Knowledge is an IST RTD project that was completed in June 2002. The project developed tools and methods for supporting knowledge management in large and distributed organisations. The technical backbone of On-To-Knowledge was the use of ontologies for the various tasks of information integration and mediation. For the project’s many results, see their tools repository, project deliverables and publications at http://www.ontoknowledge.org/pubs.html
See also the On-To-Knowledge book: ‘Towards the Semantic Web: Ontology-driven Knowledge Management’, J. Davies, D. Fensel, F. van Harmelen (eds.), John Wiley, December 2002.
[6] Office of the e-Envoy: http://www.e-envoy.gov.uk
[7] Govtalk: http://www.govtalk.gov.uk
[8] Sesame environment: http://www.ontoknowledge.org/tools/factsheet/Sesame.html
[9] Mereology, n. The formal study of the logical properties of the relation of part and whole. Collins English Dictionary, Third edition, Glasgow 1991.
development could begin, basic classifications and methodologies would be required to form a foundation for the work.
Italian online publisher Marco Meli, a manager of the EU’s MESMUSES (Metaphors for Science Museums) project, insisted: ‘You need a clear definition in this particular domain. What are the key words, the terms?’
Mr Guarino answered: ‘The key concepts and the key relations.’
THE ELUSIVE SEMANTIC GRAIL
By now, the Darmstadt 13 were beginning to realise they were not having much luck with their search for the Semantic Grail. Their discussion became a little tetchy.
Someone talked about ‘meta-ontology’ and another growled: ‘The term “meta” is abused.’
‘Outdated technical optimism’ was mentioned.
‘What is your alternative?’ someone else wanted to know.
‘I don’t have one.’
‘Is this the way forward?’
MESMUSES - Metaphors for Science Museums
MESMUSES is an IST RTD project that will run until July 2003. It aims at designing a general method and supporting tools to produce knowledge maps for use in self-learning environments of science museums. In the project, a knowledge map is defined as a set of related concepts and facts that is offered to learners with some guidance or suggestions on possible itineraries that they may follow to explore the knowledge space.
The method and tools developed in MESMUSES are being tested and validated by two large science museums, the Cité des Sciences et de l’Industrie in Paris and the Istituto e Museo di Storia della Scienza in Florence, which provide access to their digital catalogues. Both museums are developing knowledge maps and itineraries on different themes in Biology (Genome) and Physics (Galileo and the laws of motion).
Project Web site: http://cweb.inria.fr/Projects/Mesmuses
See also M. Meli: Knowledge Management: a new challenge for science museums. In: Cultivate Interactive, Issue 9, 7 February 2003, http://www.cultivate-int.org/issue9/mesmuses
‘What part of the problem would that solve?’
‘We will know when we have tried that out.’
‘Here we are approaching a scary field.’
Civility and peace were restored as Frank Nack, the CWI Netherlands scientist, introduced the thought: ‘There are ontologies for art and they are very old and well crafted. There are very clear rules about why they did what they did because they have worked on them for a thousand years. What you could suggest is that we strip down to the basics for one field, say art, and apply it to all the other fields we have in cultural heritage: architecture, film, whatever, all working with very different substances.’
Seamus Ross added: ‘So we need one fundamental ontology on which we can build all the others.’
THE LUCK CHANGES
The luck of the 13 was beginning to change. The Athenian heritage informatics expert, Dr Dallas, described work among his company clients on developing ‘an upper ontology’. All the issues the Forum had been discussing (what to do about time, basic concepts, process, agents and so on) were being examined. They were beginning to develop ‘something very much like a thesaurus’ with term expansion that created sub-categories of relationships. He called it ‘generic layering’, a process that could identify the ‘generic grammar’ of relationships within a specific domain - art history, for example.
‘This is useful,’ he said. ‘This way we can create Web systems that present an association of content for users that is meaningful to them. Let’s say a “cultural” meaning.’
Nicola Guarino went further. He believed that the International Council of Museums’ Committee for Documentation (CIDOC) Conceptual Reference Model (CRM) was the ‘best starting point’ for the heritage community. The CIDOC CRM is the result of 10 years’ work by a standards working group. The model is under review for adoption as an International Standards Organisation (ISO) publication.
Mr Guarino was enthusiastic: ‘I am not biased on this. I am a reviewer on the CHIOS project that supports this proposal for an ISO standard and I am amazed by the fact that this standard is more or less principled. On the one hand, the authors really strive to get these principled things and, on the other hand, they have extensively accounted for existing practice. It is the result of a large community of work. It is not perfect, but it really is a starting point for this community.’
He affirmed Dr Ross’s delighted question: ‘So this is an ontology we can borrow?’
And the Italian expert had more to add: ‘The question before was “how can we be sure that the principled things can really solve our problems?” I do not have a crisp answer, but I do have some evidence that even a tiny result on the foundational side has a high pay-off. So you do not need to solve all the foundational issues.’
The distinctions between an object and its role, or individual and classes of items, were delicate but, once understood, could lead to significant data improvement, he said, adding: ‘Take, for instance, the distinction between object and event. This is only one tiny distinction, but it is so fundamental that, once you understand it, you can save time in developing your own application ontology. Tiny conceptual progress does have a high pay-off. This is why I believe it is useful.’
There were still one or two doubts, but CULTOS project co-ordinator Wernher Behrendt tidied up with a daring stance: ‘Let me be a heretic for a second. How many of us have an operating system other than Windows? What I am saying is that standardisation often helps. Even if it is not the best standard, it does help get people working together. It is perfectly fair to define a standard for the world now. There will be a lot of discussion, but it will wind down to a few constructs. It is a better method of getting an ontology accepted than having ontologies mushrooming all over the place that must then be integrated.’
The Darmstadt 13 were pleased. They had a model. They had ‘Web Services’ stepping stones to work across. They weren’t going to fall into the trap of insisting just yet that the Semantic Web was important for the heritage sector, but they wanted an education process for unconvinced curators.
They needed someone to make the first ontology move. Seamus Ross suggested asking the J. Paul Getty Trust (http://www.getty.edu). They needed automated tools for testing ontologies. They did not want to be delivered into the entertainment industry, but saw benefit in what University of Florence Associate Professor Franco Niccolucci called ‘cultural entertainment’.
So, where would the experts put their money, asked Mr Behrendt: ‘On there not being any benefits in the Semantic Web for the cultural heritage sector, or there being some benefits in building such things, whatever they may be?’
Amsterdamer Frank Nack was in no doubt: ‘It is going to happen. It will probably look very different from how we imagine it right now, but it is going to happen.’ His countryman, ADLIB chief Bert Degenhart-Drenth, thought so too: ‘We have put our money there. All our applications work with XML.’ And Dr van Kersen agreed: ‘I would put my money on the Semantic Web.’
That seemed to make it game, set and match.
CIDOC CRM: ‘The Semantic Glue’
The ‘CIDOC object-oriented Conceptual Reference Model’ (CIDOC CRM) was developed by the ICOM/CIDOC Documentation Standards Group. Since September 2000, the CIDOC CRM has been under development into an ISO standard.
‘The CIDOC CRM is intended to promote a shared understanding of cultural heritage information by providing a common and extensible semantic framework to which any cultural heritage information can be mapped. It is intended to be a common language for domain experts and implementers to formulate requirements for information systems and to serve as a guide for good practice in conceptual modelling. In this way, it can provide the semantic glue needed to mediate between different sources of cultural heritage information, such as that published by museums, libraries and archives.’
http://cidoc.ics.forth.gr/what_is_crm.html
CHIOS - Cultural Heritage Interchange Ontology Standardization project
Since June 2001, the work of the CIDOC CRM Special Interest Group has been supported by CHIOS, a two-year project which receives funding from the Fifth Framework IST Programme. The CHIOS consortium forms an integral part of the CIDOC CRM Special Interest Group which, by organising shared meetings, represents the interests and requirements of the cultural heritage community to the ISO Working Group (ISO/TC46/SC4/WG9).
http://cidoc.ics.forth.gr/chios_iso.html
Genesis - The Creation, Stars and Fishes

SEMANTIC WEB TERMS AND READING LIST A-X
Compiled by Guntram Geser and Michael Steemson

In the Summary we have provided information boxes on projects, but not on the many Semantic Web standards, technologies, etc. mentioned in the Forum discussion. The projects included are only a small fraction of many ongoing activities and are related mainly to the cultural and scientific heritage community. Links to many important Semantic Web development projects can be found at their community portal, http://www.semanticweb.org

The following guide points to resources and readings on terms mentioned in the Forum Summary. It is not intended to provide a comprehensive list of Semantic Web materials. Rather, it represents different entry points and levels to this topic.

Annotation & Authoring
‘How to annotate and have fun’, http://annotation.semanticweb.org/faq/faq2, and other papers. See also http://annotation.semanticweb.org/tools, Institute for Applied Informatics and Formal Description Methods, University of Karlsruhe, Germany.

The Annotea Project, a ‘Live Early Adoption and Demonstration (LEAD)’ project of the World Wide Web Consortium (W3C): collaboration environment with shared annotations. www.w3.org/2001/Annotea

CIDOC CRM
The CIDOC Conceptual Reference Model was an important reference point in the Forum discussion (see also the information box on the CIDOC CRM in the Summary). CIDOC, the Comité International pour la Documentation, is part of the International Council for Museums (ICOM). Its Web site can be found at http://cidoc.ics.forth.gr
A report on CIDOC’s work on the CRM is provided in ‘The CIDOC Conceptual Reference Model: A Standard for Communicating Cultural Contents’, by Nick Crofts, Martin Doerr and Tony Gill, in Cultivate Interactive, Issue 9, February 2003, http://www.cultivate-int.org/issue9/chios
See also their tutorials at http://cidoc.ics.forth.gr/tutorials.html

DAML - DARPA Agent Markup Language
The Defense Advanced Research Projects Agency (DARPA) is the central research and development organisation for the US Department of Defense. Its DAML programme is developing a language and tools to facilitate Semantic Web concepts. http://www.daml.org

‘Why Use DAML?’, Adam Pease, white paper, Teknowledge, 10 April 2002, http://www.daml.org/2002/04/why.html

DAML+OIL
DAML+OIL is a product of the DARPA Joint United States/European Union ad hoc Agent Markup Language Committee. The committee created a language with the best features of SHOE, DAML, OIL and several other markup approaches. It is a Web ontology language (latest release March 2001) expected to provide a basis for future Web standards for ontologies. See http://www.daml.org/2001/03/daml+oil-index.html
OIL - Ontology Inference Layer
OIL is a proposal for a Web-based representation and inference layer for ontologies. It uses a layered approach to defining a standard ontology language. Each layer adds functionality and complexity to the previous layer. This is done such that machines that can only process a lower layer can still partially understand high-level ontologies. See http://www.ontoknowledge.org/oil

A white paper on OIL functions: ‘An informal description of Standard OIL and Instance OIL’, by the On-To-Knowledge group led by Department of Computer Science, University of Manchester, UK, 28 November 2000, www.ontoknowledge.org/oil/downl/oil-whitepaper.pdf

Ontologies
Ontology research: Laboratory for Applied Ontology (LOA), Institute of Cognitive Sciences and Technology (ISTC), http://www.ladseb.pd.cnr.it/infor/ontology/ontology.html

Dieter Fensel: Ontologies: A Silver Bullet for Knowledge Management and Electronic Commerce. New York: Springer, 2001.

‘Ontology Infrastructure for the Semantic Web’, includes detailed subject bibliography. WonderWeb Project, Department of Computer Science, Victoria University of Manchester, UK, http://wonderweb.semanticweb.org/deliverables/documents/D15.pdf

Standard Upper Ontology (SUO): An upper ontology for data interoperability, information search and retrieval, automated inferencing and natural language processing. IEEE Standard Upper Ontology (SUO) Working Group, http://suo.ieee.org

OWL - Web Ontology Language
The Web Ontology Language is a semantic markup language for publishing and sharing ontologies on the World Wide Web. OWL is developed as a vocabulary extension of the Resource Description Framework (RDF) and is derived from the DAML+OIL Web Ontology Language. For the development of this language, see the documents of the Web Ontology (WebOnt) Working Group, http://www.w3.org/2001/sw/WebOnt

See also OWL Web Ontology Language Reference (W3C Working Draft, 31 March 2003) at http://www.w3.org/TR/owl-ref and OWL Web Ontology Language Guide (W3C Working Draft, 31 March 2003), http://www.w3.org/TR/owl-guide

For an understanding of the goals, requirements and usage scenarios for a Web ontology language, see ‘Web Ontology Language (OWL) Use Cases and Requirements’ (W3C Working Draft, 31 March 2003), http://www.w3.org/TR/webont-req

RDF - Resource Description Framework
See Cultural Heritage Semantic Web Example & Primer, pp. 32-34.
Semantic Web
‘The Semantic Web’, Tim Berners-Lee with James Hendler and Ora Lassila, Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html

‘Enhanced: Science and the Semantic Web’, J. A. Hendler, in Science magazine, Volume 299, Number 5606, 24 January 2003, pp. 520-521.

‘Peer-to-Peer: The Infrastructure for the Semantic Web’, Stanford University. The Semantic Web as the next evolutionary step of the Internet. http://p2p.semanticweb.org

The Semantic Web Community Portal, SemanticWeb.org, currently operated by three research groups: the Onto-Agents and Scalable Knowledge Composition (SKC) Research Group at Stanford University, the Ontobroker-Group at the University of Karlsruhe, Germany, and the Protégé Research Group at Stanford University. http://www.semanticweb.org

The W3C Semantic Web Activity Statement explains the consortium’s plans in the areas of enabling standards (driven by the RDF Core and Web Ontology Working Groups), education and outreach (RDF Interest Group), as well as co-ordination and advanced development. http://www.w3.org/2001/sw/Activity

See also Kim Veltman’s warning of what he sees to be too narrow a definition of the Semantic Web, one that will not allow the historical dimension, the richness of cultural expression, the unique and the diversity of interpretations to be adequately dealt with. Cf. K. Veltman: Challenges for a Semantic Web (July 2002), http://www.cultivate-int.org/issue7/semanticweb

SHOE - Simple HTML Ontology Extensions
SHOE was one of the first ontology-based markup languages developed for use on the World Wide Web. It is a small extension to HTML that allows Web page authors to annotate their Web documents with machine-readable knowledge. See http://www.cs.umd.edu/projects/plus/SHOE
SMIL - Multimedia on the Web
The Synchronized Multimedia Integration Language (SMIL, pronounced ‘smile’) enables authoring of interactive audiovisual presentations. SMIL is typically used for ‘rich media’/multimedia presentations which integrate streaming audio and video with images, text or any other media type. SMIL is an HTML-like language and may be written using a simple text editor. W3C Synchronized Multimedia: http://www.w3.org/AudioVideo

SPECTRUM-XML DTD
SPECTRUM (Standard Procedures for Collections Recording Used in Museums) was created by the mda (http://www.mda.org.uk). It is a guide to good practice for museum documentation that describes procedures for documenting objects and the processes they undergo, as well as the necessary information that needs to be recorded to support the procedures. For SPECTRUM, an XML Document Type Definition has been produced, which serves as a system-neutral interchange format for museum data.

‘SPECTRUM: The UK Museum Documentation Standard’ is available in its second edition, see http://www.mda.org.uk/spectrum.htm

For a description of the creation, structure and deployment of the SPECTRUM-XML DTD, see Bert Degenhart-Drenth: Building on the mda SPECTRUM-XML DTD for Collections Management Data Interchange. Museums and the Web Conference 2001, http://www.archimuse.com/mw2001/papers/degenhart/degenhart.html
Web Services
Although the concept of Web Services is not a new one, the definition of Web Services standards allows the wider use of these services. Currently, three main protocols are used in the context of Web Services: UDDI (Universal Description, Discovery and Integration), a registry system to find resources and Web Services; WSDL (Web Services Description Language), an interface description language; and SOAP (Simple Object Access Protocol), the communication protocol for Web Services. All three protocols are based on XML.

For a description of the many ways in which XML can enhance Web Services, see http://www.w3.org/2002/ws/Activity

World Wide Web
Looking back as well as into the future: ‘Weaving the Web: The Original Design and Ultimate Destiny of the World Wide Web by Its Inventor’, Tim Berners-Lee with Mark Fischetti, Harper San Francisco, 1999.

XML - eXtensible Markup Language
See Cultural Heritage Semantic Web Example & Primer, pp. 27-30.

For the work done at W3C within the XML activity, see XML Working Groups, http://www.w3.org/XML; the XML Specifications can be found at http://www.w3.org/XML/Core

‘The bane of my existence is doing things that I know the computer could do for me.’ - Dan Connolly, The XML Revolution, October 1998, http://www.nature.com/nature/webmatters/xml
SEMANTIC WEB SHOULD BE BASED ON WELL-FOUNDED ONTOLOGIES
AN INTERVIEW WITH NICOLA GUARINO, LABORATORY OF APPLIED ONTOLOGY, ITALY
By Joost van Kasteren

‘The Semantic Web, as it is advocated by people like Tim Berners-Lee and James Hendler, does not take enough advantage of the experience built up in knowledge engineering and conceptual modelling. There is this anarchistic idea of the Web as a place where everyone can do his or her own thing. I have no problem with that; a lot of people are able to find what they want on the Web. But if you want real interoperability, with search engines that can grasp the intended meaning of information, that approach falls short. To create a real Semantic Web, we have to develop and use well-founded generic ontologies based on linguistics and logics.’

Nicola Guarino has clear views on the Semantic Web and its development. He is a senior researcher at the Institute for Cognitive Sciences and Technologies in Italy, where he leads the Laboratory for Applied Ontology. Since 1991, he has played an active role in the Artificial Intelligence community in promoting the interdisciplinary study of ontological foundations of knowledge engineering and conceptual modelling. Guarino: ‘In our Laboratory the focus is on content and not so much on representation. The use of ontologies is unavoidable when referring to content. People do it implicitly all the time when they are communicating and trying to understand each other. If we want machines to understand each other, in other words real interoperability, we need to make these ontologies explicit in an unambiguous way.’

An ontology is a hierarchical description of the relations between concepts in a certain domain, plus an unambiguous description of the concepts themselves. As they are created for a certain domain, ontologies often fail to be interoperable because of the ambiguity that results from the use of the same terms for different concepts (and vice versa) between different domains. The term ‘net’, for instance, has quite a different meaning for Web designers and fishermen. That is why there is a need for well-founded generic ontologies. An example of a generic ontological term is ‘part’, which can have different meanings both within a domain and between domains. For instance, the violist plays a part in the orchestra. His finger is part of him. Can his finger be part of the orchestra? According to Guarino, this is a genuine ontological problem that can only be solved by giving an unambiguous meaning to the term ‘part’.

Another example cited by Guarino is the term ‘in’. What exactly are you describing when you say the spoon is in the cup? Does it mean that the spoon is totally embedded in the cup, or is it only partly in the cup? Guarino: ‘These examples seem trivial, but if you want real interoperability between different knowledge domains you will have to prevent the problems that come with the ambiguity of day-to-day language.’

In this respect, Guarino thinks it is a drawback that computer science curricula scarcely ever contain an introduction to ontological foundations of conceptual modelling. ‘Students learn all about Java, HTML and C++, and name all the other languages, and they also learn how to use these. But when they graduate they hardly know a thing about formal ontology. I really think people should know more about the work on ontology that has been done in philosophy. It is certainly not much harder to acquire than, say, studying differential equations or learning how to use Java.’

It seems as if it is an enormous job to develop well-founded generic ontologies, but it is not as enormous a task as it appears. Guarino: ‘I would say that a few dozen would get you on the way nicely. But you have to take the fundamental route. At the moment, development of the Semantic Web is driven by the need for short-term results. Hence, interoperability is realised by putting the right tags on the information. That is not what I call semantics; that is syntax. XML and RDF are very useful for this, but they fall short when you want to create a real Semantic Web.’

Laboratory of Applied Ontology: http://ontology.ip.rm.cnr.it
A CULTURAL HERITAGE SEMANTIC WEB EXAMPLE & PRIMER
By Guntram Geser

The Semantic Web, according to a statement of its well-known advocate Tim Berners-Lee, is a vision of ‘a distributed machine which should function so as to perform socially useful tasks’ [1]. This machine should allow intelligent software agents to understand semantic relationships between Web resources in order to seek relevant information and perform transactions for humans.

Contrasted with the existing human-readable Web, the Semantic Web is envisaged as a Web of machine-readable data that will be based on ‘languages for expressing information in a machine processable form’ [2]. Key to an understanding of the Semantic Web, therefore, is how these languages function: how information is expressed in order that computers can automatically process Web sources and assist in making the Web more useful for humans. The aim of this chapter is to provide an overview of the Semantic Web concept by describing its general architecture, i.e. the interplay of its languages.

The chapter has two interrelated parts. Part 1 describes a Finnish project that strives to build the foundations for the “Finnish Museums on the Semantic Web” (FMS), a future semantic museum portal. This part consists of the information boxes on the following pages, which briefly describe the necessary elements and steps in the set-up of the FMS system. It is recommended to start by reading this description (see also graphic 3 on page 36, which provides an overview of the set-up of the FMS system). It should be helpful in gaining a general understanding of how semantic interoperability of, and new ways of interacting with, semantically marked-up cultural heritage information can be realised.

Part 2, the texts below the information boxes, is a primer that explains terms used in part 1 which represent core elements of the Semantic Web architecture, as well as providing illustrative examples. The explanations are not intended to give in-depth definitions of these elements; such definitions are provided in the relevant W3C specifications. The examples have been kept as simple as possible, but build on each other. In this way, we will develop a (fictitious) Website, http://www.m-i.org, that provides semantically enhanced access to such marvellous medieval images as the ones we have used to illustrate this Thematic Issue.

How to Make Collection Metadata of Museums Semantically Interoperable on the Web - The “Finnish Museums on the Semantic Web” (FMS)

The Semantic Web concept is visionary, and there are dedicated people, also in the heritage sector, who are trying to make it a reality. In our example, a group of researchers and technology developers who work at the University of Helsinki and the Helsinki Institute for Information Technology are translating the Semantic Web vision for a future semantic museum portal.

The group’s two-year project will run until spring 2004 and is being carried out in co-operation with, and with funding from, major organisations including Espoo City Museum, Helsinki University Museum, National Board of Antiquities, Nokia, TietoEnator and the National Technology Agency (TEKES).

The major goals of the project are to make collection metadata, which stem from heterogeneous databases, semantically interoperable on the Web, and to provide facilities for semantic browsing and searching in the combined knowledge base of the participating museums.

The project’s vision is called “Finnish Museums on the Semantic Web” (FMS), and its architecture allows for all Finnish museums to join in. However, in an approach of starting small but ambitious, the project is at present using the collection databases of two museums, the Espoo City Museum and the National Museum of Finland. Furthermore, the implementation is currently restricted to only one part of the collections - textiles.

In order to reach the FMS system’s goal of making the museums’ metadata semantically interoperable on the Web, the data must be harmonised on the syntactic and semantic level. For this harmonisation, the eXtensible Markup Language (XML) and the Resource Description Framework (RDF) are being used, of which RDF is the key language for achieving semantic interoperability of the heterogeneous sets of metadata.

[1] Tim Berners-Lee: Interpretation and Semantics on the Semantic Web (1998), http://www.w3.org/DesignIssues/Interpretation.html
[2] Tim Berners-Lee: Semantic Web Road Map (1998), http://www.w3.org/DesignIssues/Semantic.html
[3] Robert DuCharme: http://lists.xml.org/archives/xml-dev/200211/msg00190.html
RDF and Metadata - A Natural FitAn observer of the diffusion of the ResourceDescription Framework (RDF) into variousdomains Robert DuCharme has commentedlsquoI still find it a little ironic that while RDF hasgotten so much publicity as a technology for warmand fuzzy AI (Artificial Intelligence) pie-in-the-skytechnology itrsquos gotten most of its traction in themundane world of metadata3
Yet given the importance of metadata for theSemantic Web vision in general it does not come as
a surprise that metadata of key information com-munities belong to the first of RDFrsquos intended usesRDF seems to gain momentum in particular amongthe library and other communities that use DublinCore
The actual W3C RDF Primer (Working Draft23 January 2003) edited by Frank Manola and EricMiller labels RDF as lsquoan ideal representation forDublin Core informationrsquo and describes Dublin Coreas one of their lsquoRDF in the fieldrsquo examples (Cfhttpwwww3orgTRrdf-primer)
At the Dublin Core Metadata Initiative (DCMI), 'Expressing Simple Dublin Core in RDF/XML' was announced as a DCMI Recommendation in October 2002, 'the first in a series of recommendations for encoding Dublin Core metadata using mainstream Web technologies', i.e. XML/RDF/XHTML. 'Expressing Qualified Dublin Core in RDF/XML' is currently a Proposed Recommendation. Cf. http://dublincore.org/groups/architecture
Syntactic Transformation 1: Creating the XML Documents
In the FMS system the eXtensible Markup Language (XML) is used as the data transfer format. This transfer format enables the system to make use of the data originally stored in the museums' heterogeneous collection databases. Therefore, each museum participating in the FMS initiative provides the relevant collection data as an XML document repository.
In a process of syntactic harmonisation, the data from a museum's collection database are retrieved and transformed to an XML format conforming to the XML Schema of the FMS initiative.
The data to be published are read from the database through a 'view' which helps create the XML format. The view is a queryable interface, a virtual table that results from an SQL query which may join multiple tables of the database. Through the view the data are queried so that the rows of the tables are grouped by collection items. For each item, the set of rows is combined into a single XML document.
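The grouping step described above could be sketched roughly as follows in Python. This is only an illustration under assumptions, not the FMS implementation: the row layout follows the ITEM_VIEW columns of this primer's example, while the function name rows_to_xml and the use of the example namespace http://www.m-i.org/images are hypothetical choices made here.

```python
import xml.etree.ElementTree as ET
from collections import OrderedDict

# Hypothetical namespace, borrowed from the example document in this primer.
MI_NS = "http://www.m-i.org/images"
FIELDS = ["type", "subject", "iconclass", "creator", "manuscript", "place", "year"]


def rows_to_xml(rows):
    """Group view rows by item_id and build one XML element tree per item."""
    grouped = OrderedDict()
    for row in rows:
        grouped.setdefault(row["item_id"], []).append(row)

    documents = {}
    for item_id, item_rows in grouped.items():
        root = ET.Element(f"{{{MI_NS}}}image", {"image_id": item_id})
        for field in FIELDS:
            seen = []
            for row in item_rows:
                value = row.get(field)
                # A join may repeat values across rows; emit each value once.
                if value is not None and value not in seen:
                    seen.append(value)
                    child = ET.SubElement(root, f"{{{MI_NS}}}{field}")
                    child.text = value
        documents[item_id] = root
    return documents


ET.register_namespace("mi", MI_NS)
rows = [
    {"item_id": "image5kb78d38i", "type": "Column Miniature",
     "subject": "Eve emerges from Adam's body", "iconclass": "71A3421",
     "creator": "Alexander Master", "manuscript": "Historic Bible",
     "place": "Utrecht", "year": "circa 1430"},
]
docs = rows_to_xml(rows)
xml_text = ET.tostring(docs["image5kb78d38i"], encoding="unicode")
print(xml_text)
```

In a real setting the rows would come from the database view rather than from a literal list, and the resulting documents would still have to be validated against the shared XML Schema.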
XML
XML is a markup language for describing data. It is a language created to allow anyone to design the structure of their own documents. An XML document contains text that consists of markup in the form of tags and plain text between them, the latter being just pure information (for example <creator>Alexander Master</creator>). XML tags are not predefined; everyone can define his or her own tags.
FMS Documents: A short presentation is provided in Eero Hyvönen et al., Cultural Semantic Interoperability on the Web: Case Finnish Museums Online, tpiswc2002semanticweborg postershyvonen_a4pdf; for detailed descriptions see Vilho Raatikka, Eero Hyvönen, Ontology-based Semantic Metadata Validation, and Hyvönen, Eero et al., Semantic Interoperability on the Web: Case Finnish Museums Online. Both texts can be found in Towards the Semantic Web and Web Services, Proceedings of the XML Finland 2002 Conference, httpwwwcshelsinkifiu eahyvonexmlfinland2002 ProceedingsXML2002-finalpdf
ITEM_VIEW (columns of the database view): item_id, type, subject, iconclass, creator, manuscript, place, year
XML shares the syntax and bracketed tags of the well-known HyperText Markup Language (HTML), but XML serves a different goal. While HTML is used to define the layout of pages on the WWW, XML is used to define the content of documents, for example to specify that an area of text is the name of a creator.
XML allows for creating markup (e.g. <creator>) that seems to carry some semantics. However, for a computer a tag like <creator> carries as much semantics as a tag like <H1>. A computer simply does not know what a creator is and how the concept creator is related to other concepts (e.g. manuscript). For an XML processor, <H1> and <creator> or <manuscript> are all equally (and totally) meaningless. XML is all about describing data; on its own it does not do anything. There needs to be a processing program that uses the markup to interpret the individual elements.
The graphic below illustrates the database rows to XML process as described in the info box. It provides a very simple example of an XML document that describes some data for one of the medieval column miniatures from the Koninklijke Bibliotheek, The Hague, which we were permitted to use for illustrating this Thematic Issue. It includes the Iconclass classification for this image, 71A3421: Eve emerges from Adam's body (for the hierarchical path of this classification see the section on ontologies).
Short explanations for the XML document (image5kb78d38i.xml) shown in the graphic below
A well-formed XML document is one that conforms to the XML syntax rules, of which we would like to highlight the following:
(1) The document should begin with the XML declaration, which defines the XML version and the character encoding used in the document. In the example below we use <?xml version="1.0" encoding="ISO-8859-1"?>, i.e. the document conforms to the 1.0 specification of XML and uses the ISO-8859-1 (Latin-1/West European) character set.
(2a, 10) The XML document must contain a single tag pair to define a root element, in our example <mi:image> ... </mi:image>.
Database rows, grouped by item:
image5kb78d38i, 'Column Miniature', 'Eve emerges from Adam's Body', '71A3421', 'Alexander Master', 'Historic Bible', 'Utrecht', 'circa 1430'

Rows to XML process

XML document with item data from the database rows (image5kb78d38i.xml):
(1) <?xml version="1.0" encoding="ISO-8859-1"?>
(2) <mi:image xmlns:mi="http://www.m-i.org/images"
        image_id="image5kb78d38i">
(3)   <mi:type>Column Miniature</mi:type>
(4)   <mi:subject>Eve emerges from Adam's body</mi:subject>
(5)   <mi:iconclass>71A3421</mi:iconclass>
(6)   <mi:creator>Alexander Master</mi:creator>
(7)   <mi:manuscript>Historic Bible</mi:manuscript>
(8)   <mi:place>Utrecht</mi:place>
(9)   <mi:year>circa 1430</mi:year>
(10) </mi:image>

Graphic 1: Database rows to XML process
(2b) Namespace: Since element names in XML are not fixed, name conflicts can occur when different documents use the same names describing different types of elements. To prevent such conflicts a unique namespace should be defined using a Uniform Resource Identifier (URI). An XML namespace is a collection of names that are used as element types and attribute names (cf. http://www.w3.org/TR/REC-xml-names). Our namespace declaration in the start tag of the root element is xmlns:mi="http://www.m-i.org/images".
The namespace prefix mi (for medieval images) functions as a placeholder for the namespace name. It needs to show up in all element tags (e.g. <mi:type> ... </mi:type>).
(2c) image_id="image5kb78d38i": This is the id attribute, which contains the unique identifier of the data source record.
(3-9) All other elements must be within the root element and can themselves have sub-elements (child elements), which must be properly nested within their parent element. Our elements do not have sub-elements.
Other syntax rules are, for example: all start tags must match their end tags; because XML tags are case sensitive (i.e. the tag <mi:Creator> is different from the tag <mi:creator>), they must also be written with the same case; all elements must have a closing tag; all attribute values must be within quotation marks (e.g. "image5kb78d38i").
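The syntax rules above can be tried out with any XML parser. The following minimal Python sketch uses the standard library parser to test a few strings for well-formedness; the helper name is_well_formed and the shortened sample documents are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


NS_DECL = 'xmlns:mi="http://www.m-i.org/images"'

good = ('<mi:image %s image_id="image5kb78d38i">'
        '<mi:creator>Alexander Master</mi:creator></mi:image>' % NS_DECL)
# Start and end tag differ in case, so they do not match.
bad_case = ('<mi:image %s><mi:Creator>Alexander Master</mi:creator>'
            '</mi:image>' % NS_DECL)
# The attribute value is not enclosed in quotation marks.
bad_attr = '<mi:image %s image_id=image5kb78d38i></mi:image>' % NS_DECL
# The mi:creator element is never closed.
bad_close = '<mi:image %s><mi:creator>Alexander Master</mi:image>' % NS_DECL

for name, text in [("good", good), ("bad_case", bad_case),
                   ("bad_attr", bad_attr), ("bad_close", bad_close)]:
    print(name, is_well_formed(text))
```

Well-formedness only concerns syntax; whether a document uses the agreed elements in the agreed order is a matter of validation against a schema.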
Syntactic Transformation 2: The XML Schema
In order to allow for syntactic harmonisation, the XML documents of the museums should conform to the XML Schema of the FMS initiative. Therefore, when the museums create their XML documents, they validate them against the initiative's XML Schema. If the documents are valid, the process can continue to the semantic level.
XML Schema
The XML Schema defines the building blocks of an XML document, including
| elements and attributes that can appear in a document
| which elements are child elements, as well as their order and number
| whether an element is empty or can include text
| the data types for elements and attributes
| as well as default and fixed values for elements and attributes

XML with an XML Schema is designed to be self-descriptive. One of the greatest strengths of XML Schema is that it allows for data typing. The most common data types are xs:string, xs:decimal, xs:integer, xs:boolean, xs:date, xs:time. In the example below, which is the XML Schema for the XML document (image5kb78d38i.xml) shown in graphic 1, we only use the data type xs:string. This data type is used for values that contain character strings.
(1) <?xml version="1.0"?>
(2a) <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
(2b)   targetNamespace="http://www.m-i.org/images"
(2c)   xmlns="http://www.m-i.org/images"
(2d)   elementFormDefault="qualified">
(3) <xs:element name="image">
(4)  <xs:complexType>
(5)   <xs:sequence>
(6)    <xs:element name="type" type="xs:string"/>
(7)    <xs:element name="subject" type="xs:string"/>
(8)    <xs:element name="iconclass" type="xs:string"/>
(9)    <xs:element name="creator" type="xs:string"/>
(10)   <xs:element name="manuscript" type="xs:string"/>
(11)   <xs:element name="place" type="xs:string"/>
(12)   <xs:element name="year" type="xs:string"/>
(13)  </xs:sequence>
(14)  <xs:attribute name="image_id" type="xs:string" use="required"/>
(15) </xs:complexType>
(16) </xs:element>
(17) </xs:schema>

Short explanations
(1) The XML declaration, which states that the document conforms to the 1.0 specification of XML.
(2a) Determines that the elements and data types that are used to construct the schema come from the W3C's XML Schema namespace. Consequently, each of the elements and data types in the schema has the prefix xs, which identifies them as belonging to the vocabulary of the XML Schema language rather than the vocabulary of our (fictitious) organisation M-i.org.
(2b) Indicates that the elements defined by this schema come from our http://www.m-i.org/images namespace.
(2c) Is our default namespace.
(2d) Demands that any elements used by an XML document which were declared in this schema must be namespace qualified.
(3, 16) The parent element for the data typing of the image descriptions we provide at http://www.m-i.org/images. It is (4) defined as an xs:complexType, i.e. it contains child elements (6-12), which are (5, 13) surrounded by an xs:sequence element that defines an ordered sequence of these elements.
(6-12) The child elements, which in our example are simple types because they do not contain other elements. They define various elements of our XML documents, e.g. "creator", to be of the data type xs:string.
(14) Furthermore, the Schema determines that for the element xs:element name="image" there is a required attribute image_id of the data type xs:string (for example "image5kb78d38i").
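To make the effect of these schema declarations more concrete, the following Python sketch checks a document against the same constraints by hand: the root element, the required image_id attribute, and the ordered sequence of text-only child elements. It is a simplified stand-in written for this illustration, not a real XML Schema validator; the names check_image_document and make_doc are hypothetical, and a production system would use a schema-aware validator instead.

```python
import xml.etree.ElementTree as ET

MI_NS = "http://www.m-i.org/images"
# The order mirrors the xs:sequence in the example schema.
EXPECTED_CHILDREN = ["type", "subject", "iconclass", "creator",
                     "manuscript", "place", "year"]


def check_image_document(xml_text):
    """Return a list of problems; an empty list means all checks passed."""
    root = ET.fromstring(xml_text)
    problems = []
    if root.tag != f"{{{MI_NS}}}image":
        problems.append("root element is not mi:image")
    if root.get("image_id") is None:
        problems.append("required attribute image_id is missing")
    expected_tags = [f"{{{MI_NS}}}{name}" for name in EXPECTED_CHILDREN]
    actual_tags = [child.tag for child in root]
    if actual_tags != expected_tags:
        problems.append("child elements do not match the declared sequence")
    for child in root:
        if len(child) > 0:
            problems.append("child element must contain only text")
    return problems


def make_doc(children, image_id='image_id="image5kb78d38i"'):
    """Build a small test document (hypothetical helper)."""
    body = "".join("<mi:%s>x</mi:%s>" % (n, n) for n in children)
    return '<mi:image xmlns:mi="%s" %s>%s</mi:image>' % (MI_NS, image_id, body)


valid = make_doc(EXPECTED_CHILDREN)
reordered = make_doc(list(reversed(EXPECTED_CHILDREN)))
no_id = make_doc(EXPECTED_CHILDREN, image_id="")

print(check_image_document(valid))
print(check_image_document(reordered))
print(check_image_document(no_id))
```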
Major Benefits of XML
In the context of the Semantic Web, XML provides an interoperable syntactical foundation upon which solutions to the issues of representing relationships and meaning can be built. We also want to highlight the many benefits of XML that are adding to its rapid uptake in the first place and might, in the longer term, be supportive in realising the Semantic Web vision on a broader scale.
XML is one of the most important standards developments in recent years. It is an international, universal, non-system and non-application specific data exchange standard. XML is international because it employs Unicode. This means that there is no restriction to the western alphabet, but Arabic, Chinese, Greek, Hebrew, Thai, etc. can be easily integrated.
XML is non-system specific because it is an open standard set by the World Wide Web Consortium (W3C). As such, there is no owner of XML. All the major software suppliers support it; it can be used on any computing platform, from Windows and MacOS to Linux. This makes it easier for organisations to change systems or combine different systems.
XML is also non-application specific, i.e. it can be used in various applications such as data exchange, data harvesting, Web site management, etc. XML is gaining ever-wider acceptance in many application domains, including in particular the cultural heritage community.
Bear in mind also that the major collection management software producers have implemented support for XML in their systems, enabling, for example, the integration of data from different collections and their combination over the Web.
XML allows for multi-channel publishing, i.e. with XML it is easy to produce different products or services from digital cultural heritage assets. Once the data are structured in XML, they can be displayed across a variety of media using an associated style sheet that contains the display information.
Finally, XML can be used to create new languages. For example, the Wireless Markup Language (WML), which is used to mark up Internet applications for handheld devices, is written in XML.
Ontological concepts
The goal of the FMS project is to make metadata of the museums' textiles collections semantically interoperable on the Web. In order to achieve such interoperability, an ontology is being designed that describes the common (lower-level) ontological concepts in this domain of knowledge.[4]
Ontology
In the Semantic Web architecture the semantic relationships are not embedded but explicitly represented by an ontology or, rather, an interrelated set of ontologies. In fact, the wide array of information residing on the Web and the perceived need to make it more machine-processable have acted as a strong impetus for the development of ontology languages.
Yet what is an ontology? An abstract definition of an ontology is that it describes a formal, shared conceptualisation of a particular domain of interest, for example cultural heritage objects held in art museums. In particular, an ontology allows for constraining, expressing and analysing the intended meaning of the shared vocabulary of concepts and relations in a domain of knowledge.[5]
If these concepts and relations are formalised to a high degree, the domain has at hand a major building block for developing semantically aware information systems. With Semantic Web technologies, the domain ontology can be made available on the network, cross-referenced with upper-level and other domain ontologies,[6] and remote applications or intelligent software agents can refer to it when they interact to provide a certain information service.

[4] The available documents (in English) on the FMS initiative state that their ontology is being created using RDF Schema (RDFS). To develop a fully fledged ontology, advanced languages such as DAML+OIL or Web Ontology Language (OWL) would be required.
[5] For more elaborate and formal descriptions see Tom Gruber, What is an Ontology? (1995), http://www-ksl.stanford.edu/kst/what-is-an-ontology.html; Nicola Guarino, Ontology-Driven Conceptual Modelling, part 1-3 (2002), http://ontology.ip.rm.cnr.it/Tutorials
The degree of formalisation of concepts and their relations varies considerably between different domains of knowledge. At the lower end one finds lexicons and simple taxonomies (i.e. ordered classification systems in which terms are related hierarchically). At the middle level one might place thesauri, i.e. controlled vocabularies that are structured to show relationships between terms and concepts and, for example, allow for retrieving them from a database. At the high end of formalisation of knowledge there are axiomatised logic theories. Such theories include rules to ensure the well-formedness and logical validity of statements expressed in the language of the scientific discipline.
In the cultural heritage sector a powerful IT-supported example of a hierarchical classification system is Iconclass. This supports the documentation of images, in particular art historical images, by providing a systematic collection of 28,000 ready-made definitions of objects, persons, events, situations and abstract ideas that can be the subject of an image. The definitions consist of an alphanumeric classification code and its textual correlate.[7]
For example, for the image we have described in XML in graphic 1, the Iconclass definition is 71A3421: Eve emerges from Adam's body. The Medieval Illustrated Manuscripts Website of the Koninklijke Bibliotheek, The Hague, has an Iconclass Browser in place[8] that provides the hierarchical path for this concept in the classification system.
An outstanding example of a controlled vocabulary is the Art & Architecture Thesaurus (AAT), one of the Getty Research Institute's Vocabulary Databases. It is a structured vocabulary of more than 125,000 terms and other information about concepts that are used for describing fine art, architecture, decorative arts, archival materials and material culture.[9]
The W3C's Resource Description Framework (RDF) Schema Specification 1.0, in a section on its scope, mentions concept navigation and states: 'Thesauri and library classification schemes are well known examples of hierarchical systems for representing subject taxonomies in terms of the relationships between named concepts. The RDF Schema specification provides sufficient resources for creating RDF models that represent the logical structure of thesauri (and other library classification systems).'[10]
Yet for realising a full-blown cultural heritage ontology for the Semantic Web, there are currently limitations on both sides. On the one hand, hierarchical classification systems and structured vocabularies do not lend themselves easily to rich inter-linking of conceptual 'trees'.
A major step further in this direction is the CIDOC object-oriented Conceptual Reference Model (CRM).[11] This provides an ontology of 81 classes and 130 properties, which describes in a formal language concepts and relations relevant to the documentation of cultural heritage.
On the other hand, RDF Schema has limitations when it comes to expressing complex ontological relationships. New languages based on description logics are being developed. These include DAML+OIL and the upcoming Web Ontology Language (OWL), which are capable of fully describing ontologies.
Also worth highlighting is that tools for ontology building are proliferating at the present time. These ontology editors need to be carefully assessed, as their capabilities differ considerably.[12]
Hierarchical path of the Iconclass classification 71A3421:
7 Bible
71 Old Testament
71A Genesis from the creation to the expulsion from paradise, and later years of Adam and Eve
71A3 creation of man; the Garden of Eden (Genesis 1:26-2)
71A34 creation of Eve
71A342 Eve is fashioned from Adam's rib
71A3421 Eve emerges from Adam's body
[6] Upper-level ontologies describe the basic concepts and relationships invoked when information about any domain is expressed in natural language.
[7] For in-depth information see the official Iconclass Website, http://www.iconclass.nl
[8] Medieval Illustrated Manuscripts Website, http://www.kb.nl/kb/manuscripts/browser. The subject access system for the Website was conceived by Mnemosyne Partners, building on the Iconclass classification system and technologies. See the valuable information they provide at httpwwwmnemosyneorg businessmsstempexhtml
[9] A detailed description of the AAT is provided at http://www.getty.edu/research/tools/vocabulary/aat/about.html
[10] httpwwww3orgTR 000CR-rdf-schema-20000327
[11] See the information box at the end of the Forum discussion and the sources mentioned in the Semantic Web Terms and Reading List.
[12] Michael Denny, Ontology Building: A Survey of Editing Tools (06-11-2002), http://www.xml.com/pub/a/2002/11/06/ontologies.html
32 DigiCULT
Semantic Transformation 1: The RDF Data Model
Progressing towards semantic interoperability, the metadata in the XML documents are now transformed into RDF statements corresponding to the RDF data model. With these so-called RDF 'triples', the XML metadata elements are mapped to the RDF classes and properties which are defined by the RDF Schema of the FMS initiative (see section 3).
'XML is nothing more than a way to standardize data formats. This is not to underplay XML's importance. A data-format standard makes all of the more glamorous technologies possible, and RDF is the leading example of the benefit that comes once the data format has been standardized. Many proclaim that RDF is really the XML's killer app, and with good reason. Despite all this, RDF remains somewhat obscure. This is mainly because at its core, RDF is very abstract, very dry, and very academic.'
Uche Ogbuji, An introduction to RDF (2000), http://www-106.ibm.com/developerworks/library/w-rdf/?dwzone=xml
RDF Data Model
In order to make Web resources semantically interoperable, we need resources that provide machine-understandable information about themselves. In the Semantic Web architecture these statements are built by using the Resource Description Framework (RDF).
RDF defines a data model for the statements describing typed relationships between uniquely identified resources. RDF distinguishes between
| resources: familiar examples are a Web page, electronic document or digital image, but in RDF also entities that are not 'network retrievable', e.g. museums, curators or bound medieval manuscripts, can be resources
| properties: these identify a specific aspect, characteristic, attribute or relation used to describe the resource
| statements: these associate a value for a named property with the resource
Hence, RDF provides a model for describing relationships between resources in terms of named properties and values. The RDF data model intrinsically supports only binary relations. Its base element is the 'triple', which takes the form of subject, predicate, object: a resource (the subject) is linked to another resource (the object) through an arc labelled with a third resource (the predicate). The semantics of a triple clearly depends on the property used as predicate.
A convenient way to visualise this is to draw nodes for subject and object and an arrow between them for the predicate (see graphic 2). In this labelled directed graph, subject and predicate (property) are Uniform Resource Identifiers (URIs) and the object is either a URI or a literal (which is drawn as a box). Everything in RDF can be represented by a graph with nodes and arcs, and the data model allows for using the same URI as a node and as an arc label. To represent RDF statements in a machine-processable way, RDF builds on XML. With RDF/XML, a specific XML markup language, RDF information can be represented and exchanged between machines.
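As a rough illustration of the triple model, the following Python sketch stores a few statements as (subject, predicate, object) tuples and queries them. The representation is an assumption chosen for brevity, not an RDF library; the item URI and the hasCreator statement are hypothetical examples, and the m-i.org namespace is the fictitious one used throughout this primer.

```python
from collections import namedtuple

# A triple links a subject to an object through a predicate.
Triple = namedtuple("Triple", ["subject", "predicate", "obj"])
# Objects may be literals instead of URIs; a small wrapper marks them.
Literal = namedtuple("Literal", ["value"])

MI = "http://www.m-i.org/schemas/images#"            # fictitious namespace
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
ITEM = "http://www.m-i.org/images/image5kb78d38i"    # hypothetical item URI

graph = {
    Triple(MI + "ColumnMiniature", RDFS + "subClassOf", MI + "Miniature"),
    Triple(MI + "Miniature", RDFS + "subClassOf", MI + "Image"),
    Triple(ITEM, MI + "hasCreator", Literal("Alexander Master")),
}


def objects(graph, subject, predicate):
    """Return all objects reachable from subject via the given predicate."""
    return {t.obj for t in graph
            if t.subject == subject and t.predicate == predicate}


def is_literal(node):
    """Literals are the 'boxes' of the graph; everything else is a URI."""
    return isinstance(node, Literal)


print(objects(graph, MI + "ColumnMiniature", RDFS + "subClassOf"))
print(objects(graph, ITEM, MI + "hasCreator"))
```

Because every statement has the same three-part shape, statements from different sources can simply be pooled into one set, which is what makes the model attractive for combining metadata.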
Graphic 2: RDF Data Model

Triple 1
Subject: http://www.m-i.org/schemas/images#ColumnMiniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Miniature

Triple 2
Subject: http://www.m-i.org/schemas/images#Miniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Image

With these two triples we state that ColumnMiniature is a subclass of Miniature and that Miniature is a subclass of Image.

The predicate in our statements is the rdfs:subClassOf property, which is predefined in the RDF Schema namespace http://www.w3.org/2000/01/rdf-schema#

The classes Image, Miniature and ColumnMiniature would also need to be defined in our RDF Schema namespace http://www.m-i.org/schemas/images#

For details on how to use RDF Schema for defining your domain ontology, see section 3. In a nutshell, RDF Schema is a 'higher-level' language which is itself defined using RDF.
Semantic Transformation 2: Creating and Validating the RDF Statements
In the mapping process an editor tool (in the FMS case a tool developed by the project team, called Meedio) receives as input the XML documents and assists in transforming them into semantically valid RDF statements (instance descriptions). The tool serves as instance editor, which provides a convenient way of finding and selecting from the XML metadata elements correct instance values for a particular property. The editor tool also serves as semantic metadata validator. When the museum cataloguer saves the set of RDF statements corresponding to an XML document, the semantics of these statements are validated against the property constraints of the FMS ontology. The result of a successful mapping and validation process is a unique set of RDF triples, called the RDF card.
RDF Mapping Rules
When RDF is used to define the meaning of XML metadata elements, a set of mapping rules is created. A mapping rule is a template of RDF triples where XPath expressions are used to identify the actual element values. XPath is a language for addressing parts of an XML document and was designed for use in XML parsing software (XSLT, XPointer and others).
When applying such a rule to an XML document, the XPath expressions are instantiated with matching element values. If the rule matches, the RDF template evaluates to a set of RDF triples where XPath expressions are substituted by the corresponding values of the XML elements.
For example, by applying the template rule
<image5kb78d38i/type, mi:hasCreator, image5kb78d38i/creator>
to the XML document described in the section on XML, the following result would be obtained:
<Image Miniature, mi:hasCreator, 'Alexander Master'>
mi:hasCreator is an example of an RDF property. Such properties are explained in section 3, RDF Schema.
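How such a template rule might be applied can be sketched in a few lines of Python. This is an assumption-laden illustration, not the Meedio tool: the rule format (a tuple of two XPath expressions around a property name), the function apply_rule and the shortened sample document are all hypothetical, and the sketch copies element values literally without any term mapping to ontology classes.

```python
import xml.etree.ElementTree as ET

# Prefix-to-namespace mapping used when evaluating the XPath expressions.
NS = {"mi": "http://www.m-i.org/images"}

XML_DOC = """<mi:image xmlns:mi="http://www.m-i.org/images"
          image_id="image5kb78d38i">
  <mi:type>Column Miniature</mi:type>
  <mi:creator>Alexander Master</mi:creator>
</mi:image>"""

# A rule is a triple template: subject and object are XPath expressions,
# the predicate is the RDF property that relates them.
RULE = ("./mi:type", "mi:hasCreator", "./mi:creator")


def apply_rule(rule, root):
    """Instantiate the template with element values; [] if it does not match."""
    subject_path, predicate, object_path = rule
    subject = root.find(subject_path, NS)
    obj = root.find(object_path, NS)
    if subject is None or obj is None:
        return []
    return [(subject.text, predicate, obj.text)]


triples = apply_rule(RULE, ET.fromstring(XML_DOC))
print(triples)
```

The standard library supports only a subset of XPath; a fuller implementation would need a complete XPath engine and would also have to resolve the extracted values against the classes of the shared ontology.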
Note: Due to the limited space permitted, we do not address issues of term mapping. This is an important aspect of the mapping and validation process carried out in the FMS project. Working with metadata from different museums, the project team needs to deal with partly different terminologies. Their technical solution to synonymous terms (i.e. different terms referring to the same concept) is to attach synonym sets to the FMS ontology classes. Where polysemous terms occur (i.e. the same term refers to different concepts), the editor tool cannot cope and the cataloguer needs to select the correct interpretation.
Semantic Transformation 3: The RDF Schema (RDFS)
The shared ontology for the textiles domain is created by using Resource Description Framework Schema (RDFS). An RDF Schema is a tool for indicating the classes of resources one wants to describe, as well as for defining the properties used to describe those resources. Furthermore, class/sub-class relationships and property/sub-property relationships can be defined. The museums are mapping their metadata to the classes and properties defined by the RDF Schema of the FMS initiative. Thereby they are making the meaning of the metadata explicit and representing them in a harmonised, uniform way.
RDF Schema
In section Semantic Transformation 1 we have described the data model provided by RDF for expressing statements about Web resources. But we also need a vocabulary for the RDF statements, namely classes and properties defined with RDF Schema (RDFS).
In brief, the RDF Schema mechanism provides a pre-defined vocabulary, a basic type system that can be used in creating domain-specific schemas. Its role is to allow for declaring metadata properties (e.g. for 'type', 'subject' or 'creator'), to define the classes of resources they may be used with, to restrict possible combinations, and to detect violations of those restrictions.
Defining classes
With RDF Schema (RDFS), Web resources can be defined as instances of one or more classes. In addition, classes can be organised in a hierarchical fashion. As we hold a collection of digital images drawn from illustrated medieval manuscripts, we first need to define a class of things that are images. In RDF Schema, a class is any resource having an rdf:type property whose value is the RDFS-defined resource rdfs:Class.
So, using the basic RDF data model, we define:
mi:Image [resource] rdf:type [property] rdfs:Class [value]
The self-defined prefix mi (for medieval images) stands for the URI reference of our RDF Schema namespace http://www.m-i.org/schemas/images
In our image collection we have various special kinds of digitised images, such as column miniatures, decorated initials, schematic drawings, etc. To distinguish, for example, the miniatures, first we need to define a general class Miniature and subclasses of miniatures, e.g. a subclass ColumnMiniature:
mi:Miniature rdf:type rdfs:Class
mi:ColumnMiniature rdf:type rdfs:Class
Secondly, we need to define that mi:ColumnMiniature is a subclass of mi:Miniature and that mi:Miniature is a subclass of mi:Image, for which we use the predefined rdfs:subClassOf property:
mi:Miniature rdfs:subClassOf mi:Image
mi:ColumnMiniature rdfs:subClassOf mi:Miniature
As the rdfs:subClassOf property is transitive, this means that mi:ColumnMiniature is also implicitly a subclass of mi:Image.
Graphic 2 on page 32 visualises this with the nodesand arcs of the basic RDF data model
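The transitivity of rdfs:subClassOf can be made tangible with a small Python sketch that derives all implied superclasses from the explicitly stated pairs. The data structure (a set of subclass/superclass pairs using prefixed names) and the function name superclasses are assumptions made for this illustration.

```python
# Explicitly stated rdfs:subClassOf pairs as (subclass, superclass).
SUBCLASS_OF = {
    ("mi:ColumnMiniature", "mi:Miniature"),
    ("mi:Miniature", "mi:Image"),
}


def superclasses(cls, subclass_pairs):
    """Return all stated and inferred superclasses of cls."""
    found = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for sub, sup in subclass_pairs:
            # Follow each subclass arc once; 'found' also guards against cycles.
            if sub == current and sup not in found:
                found.add(sup)
                frontier.append(sup)
    return found


print(superclasses("mi:ColumnMiniature", SUBCLASS_OF))
print(superclasses("mi:Image", SUBCLASS_OF))
```

A query for all instances of mi:Image can use such a closure to include resources that were only typed as mi:ColumnMiniature.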
Defining properties
In order to make the meaning of our metadata (e.g. 'type') explicit, we need to be capable of declaring specific properties that characterise the classes of things we hold at http://www.m-i.org, e.g. digital images of medieval column miniatures.
Basically, RDF Schema defines properties in terms of the classes of resources to which they apply. This is the role of the rdfs:domain and rdfs:range mechanisms.
rdfs:range
The range constraint defines the class or set of classes whose instances can be values of a particular property. If we want to define the property mi:hasType, we must describe this resource (which we locate at http://www.m-i.org/schemas/images) with an rdf:type property whose value is rdf:Property:
mi:hasType [resource] rdf:type [property] rdf:Property [value]
The following RDF statements indicate that mi:ColumnMiniature is a class, mi:hasType is a property, and RDF statements using the mi:hasType property have instances of mi:ColumnMiniature as values:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:range mi:ColumnMiniature
rdfs:domain
The domain constraint restricts the set of classes whose instances may have a particular property attached to them. If we want to indicate that the property mi:hasType applies to instances of class mi:ColumnMiniature, we would write:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:domain mi:ColumnMiniature
Benefits of RDF
In a SearchWebServices.com definition of RDF some benefits of RDF are mentioned:
| 'By providing a consistent framework, RDF will encourage the providing of metadata about Internet resources.
| Because RDF will include a standard syntax for describing and querying data, software that exploits metadata will be easier and faster to produce.
| The standard syntax and query capability will allow applications to exchange information more easily.
| Searchers will get more precise results from searching, based on metadata rather than on indexes derived from full text gathering.
| Intelligent software agents will have more precise data to work with.'[13]
This is a well-crafted listing of RDF benefits, from provision and exchange of better metadata to agents working with them, hopefully for the benefit of humans. But as explicitly stated by SearchWebServices.com, these are only potential benefits, i.e. they depend on the level of actual uptake of RDF.
[13] whatis.com / searchWebServices.com Definitions - Resource Description Framework, http://searchwebservices.techtarget.com/sDefinition/0,,sid26_gci213545,00.html (last updated July 27, 2001)
Generating and Using the Knowledge Space
The RDF cards represent the original XML documents at the semantic level. The union of such RDF cards constitutes a knowledge base, which is a harmonised semantic representation of the underlying heterogeneous databases.
However, so far the RDF instance descriptions have not left the museum. The museum has complete control of the information it wants to publish, and it does not need to allow the FMS system access to its internal database system. The RDF data are placed in a public directory on the museum's WWW server.
The Web crawler of the FMS system harvests the instance descriptions from the different museums, and the system combines them into an RDF repository. This repository is a large semantic graph that consists of the shared ontology and metadata.
How does a user now search and navigate in this knowledge space? In the FMS system this is implemented by server-side software called Ontogator. Based on the semantic graph, this software dynamically generates semantic linkages for the user's Web browser.
One way of using the FMS system is view-based filtering. The user can select classes of resources from the ontology, and the system finds the instances that match the selected class restrictions. By constraining classes (views) further, the collection instance data searched for are eventually found.
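The idea of combining RDF cards into one repository and then filtering it by class can be sketched as follows in Python. This is not how Ontogator is implemented; the triple sets, the museum prefixes a: and b:, the class mi:Initial, the property mi:place and the function names are all hypothetical and serve only to illustrate the principle.

```python
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"

# Shared ontology plus two hypothetical RDF cards from different museums.
ontology = {
    ("mi:ColumnMiniature", SUBCLASS, "mi:Miniature"),
    ("mi:Miniature", SUBCLASS, "mi:Image"),
}
card_museum_a = {
    ("a:item1", RDF_TYPE, "mi:ColumnMiniature"),
    ("a:item1", "mi:place", "Utrecht"),
}
card_museum_b = {
    ("b:item7", RDF_TYPE, "mi:Miniature"),
    ("b:item7", "mi:place", "Paris"),
    ("b:item9", RDF_TYPE, "mi:Initial"),
}

# The repository is simply the union of the ontology and all harvested cards.
repository = ontology | card_museum_a | card_museum_b


def subclasses_of(cls, graph):
    """Return cls together with all of its direct and indirect subclasses."""
    found = {cls}
    changed = True
    while changed:
        changed = False
        for s, p, o in graph:
            if p == SUBCLASS and o in found and s not in found:
                found.add(s)
                changed = True
    return found


def filter_view(graph, cls, constraints=None):
    """Find instances of cls (or a subclass) that match all property values."""
    wanted = subclasses_of(cls, graph)
    hits = {s for s, p, o in graph if p == RDF_TYPE and o in wanted}
    for prop, value in (constraints or {}).items():
        hits = {s for s in hits if (s, prop, value) in graph}
    return hits


print(filter_view(repository, "mi:Miniature"))
print(filter_view(repository, "mi:Miniature", {"mi:place": "Utrecht"}))
```

Each additional constraint narrows the result set, which mirrors the way a user constrains views step by step until the desired collection items are found.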
The software also supports topic-based navigation by providing semantic links between topics of interest, the creation of which is based on the collection domain ontology and the related metadata of the collection records. This means that the links also provide the user with an impression of the wider context and pragmatics of the objects in the museums' collections.
From human users to software agents
As described in the Finnish Museums on the Semantic Web example, the RDF repository is a large semantic graph that consists of the shared ontology and metadata of the participating museums. Such a repository can be queried, and the results, a set of pointers to the relevant resources, can be accessed using Web browsers. The opportunities provided by a system like the one developed by the FMS initiative (e.g. topic-based navigation) are at present restricted primarily to human users.
The Semantic Web vision includes intelligent software agents which 'understand' semantic relationships between Web resources and seek relevant information as well as perform transactions for humans.[14]
This software would be capable of autonomousaction ie could run without direct human controlor constant supervision and ideally is very flexible indoing this Characterisations of this flexibility includeactions that are lsquoreactive lsquoproactiversquo and lsquosocialrsquo (seebelow)
While the basic idea of agents is very intuitive andappealing the actual theory is complex the tools areimmature the solutions small and prototype-basedIn fact as a parallel distributed systems technologyagents belong to the most complex class of softwaretechnology
However this primer will conclude with a sum-mary of what an intelligent software agent is andwhat such a software would generally be capable ofdoingThis should also serve as an indication of howgreat the challenge for research and technologicaldevelopment is to make the full Semantic Web visiona reality
Intelligent Software Agents
The following definitions are taken from Michael Wooldridge’s introduction to multiagent systems.15
Agent
‘An agent is a computer system capable of autonomous action in some environment.’
Intelligent agent
‘An intelligent agent is a computer system capable of flexible autonomous action in some environment.’
Flexible autonomous action
‘By flexible autonomous action we mean reactive, proactive, social.’
| Reactivity: ‘A reactive system is one that maintains an ongoing interaction with its environment and responds to changes that occur in it (in time for the response to be useful).’
| Proactiveness: ‘An agent serves a purpose and therefore exhibits goal-directed behaviour, including the capacity to recognise opportunities for useful courses of action.’
| Social ability: ‘Social ability in agents is the ability to interact with other agents (and possibly humans) via some kind of agent communication language, and perhaps cooperate with others.’
Desirable further properties of agents are:
| Mobility: the ability to move around an electronic network.
| Rationality: an agent will act in such a way that it does not prevent itself from achieving its goals (as far as this is possible with a limited set of beliefs representing its world knowledge).
| Learning: an agent will improve its performance over time.
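The three facets of flexible autonomous action can be made tangible with a deliberately small Python sketch. It is a toy under stated assumptions, far simpler than any real multiagent framework: the class SketchAgent, its fields and the resource names are all hypothetical, ‘the environment’ is reduced to a list of currently visible resources, and ‘communication’ is reduced to appending names to another agent’s inbox.

```python
from typing import Iterable, List, Set


class SketchAgent:
    """Toy agent; every name here is hypothetical and for illustration only."""

    def __init__(self, name: str, wanted: Iterable[str]) -> None:
        self.name = name
        self.wanted: Set[str] = set(wanted)  # resources the agent tries to locate
        self.known: Set[str] = set()         # wanted resources located so far
        self.inbox: List[str] = []           # resource names reported by peers

    def step(self, visible: Iterable[str],
             peers: Iterable["SketchAgent"]) -> List[str]:
        # Social ability (receiving): accept peer reports that match a goal.
        self.known |= self.wanted & set(self.inbox)
        self.inbox.clear()

        # Reactivity: respond to what the environment shows right now.
        discovered = (self.wanted & set(visible)) - self.known
        self.known |= discovered

        # Social ability (sending): tell peers only about first-hand discoveries.
        for peer in peers:
            peer.inbox.extend(sorted(discovered))

        # Proactiveness: return the goals still open, i.e. what to pursue next.
        return sorted(self.wanted - self.known)


alice = SketchAgent("alice", ["r1", "r2"])
bob = SketchAgent("bob", ["r1"])
open_goals_alice = alice.step(visible=["r1", "r9"], peers=[bob])
open_goals_bob = bob.step(visible=[], peers=[alice])
```

In this run the second agent reaches its goal without ever seeing the resource itself, purely through the report of the first. Mobility, rationality and learning are not modelled at all.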
14 T. Berners-Lee, J. Hendler, O. Lassila: Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html
15 M. Wooldridge: An Introduction to Multiagent Systems. Chichester: Wiley, 2002, and http://www.csc.liv.ac.uk/~mjw/pubs/imas/
Resources
In the primer, no references are made to the documents of the World Wide Web Consortium (W3C). All relevant W3C recommendations can be found at http://www.w3c.org
Of the wealth of introductory materials on XML and RDF available on the Web, the following in particular are useful to consult for further details:
http://www.w3schools.com/xml
http://www.w3schools.com/schema
http://www.w3.org/TR/rdf-primer
K. S. Candan, H. Liu, R. Suvarna: Resource Description Framework: Metadata and its Applications. In: SIGKDD Explorations, Vol. 3/1 (2001), 6-19
Pierre-Antoine Champin: RDF Tutorial (2001), http://www710.univ-lyon1.fr/~champin/rdf-tutorial/rdf-tutorial.html
S. Decker, M. P. Mitra, S. Melnik: Framework for the Semantic Web: An RDF Tutorial, http://www.ida.liu.se/~asmpa/courses/sweb/rdf/rdf_tutorial.pdf
Graphic 3: Set-up of the Finnish Museums on the Semantic Web System (The Finnish Museums on the Semantic Web: Overview of the System’s Set-up)
[Diagram not reproduced. Components shown: collection databases 1 to n (relational schemas, DBMS); XML repositories (XML Schema, syntactic interoperability); metadata editor; RDF instances (RDF Schema, semantic interoperability); Web crawler; RDF database holding the semantic graph, i.e. the knowledge space of shared ontology and metadata; server with Ontogator software; user client (WWW browser) offering topic-based navigation and view-based filtering.]
THE DARMSTADT FORUM PARTICIPANTS
Wernher Behrendt, Salzburg Research, Austria
Wernher Behrendt is a Senior Researcher at the Sun Technology and Research Excellence Center at Salzburg Research (Austria), working on multimedia middleware and interoperation issues. He holds an MSc in Cognitive Science from Manchester University and has more than 10 years’ experience in near-to-market IT research. From 1989 to 1995 he was a Senior Research Associate in the Informatics Department at Rutherford Appleton Laboratory (UK), working on embedded knowledge-based systems in distributed multimedia presentation systems. From 1995 to 1998 he was a Senior Research Associate at Cardiff University (UK), working on interoperation between heterogeneous information systems. Mr Behrendt has held courses in Computer Science and has worked on projects ranging from software engineering methods and quality assurance to legacy system reengineering using migration methods and distributed systems middleware.
E-mail: wernher.behrendt@salzburgresearch.at
Paolo Buonora, Archivio di Stato di Roma, Italy
Paolo Buonora holds a degree in Philosophy from the University of Rome ‘La Sapienza’ (1976). He worked in the Italian State Archive Administration from 1978, where he was first involved in editing the Guida Generale degli Archivi di Stato italiani. From 1986 he worked in the Soprintendenza archivistica per il Lazio, surveying audiovisual archives and municipal archives; from 1989 to 1991 he was at Perugia University, engaged in doctoral research in ‘Urban and rural history’; and from 1991 to 1994 he was again in the Soprintendenza archivistica per il Lazio. After 1994 he worked in the Archivio di Stato di Roma, where he was responsible for the photograph service and several working groups on informatics application in archival documentation. From 1997 until the present time he has planned and directed the Imago II project in the Archivio di Stato di Roma.
See httpwwwasrmarchivbeniculturaliitsidimagoIMAGOIIenhtml
Samuel Cruz-Lara, University of Nancy 2 and LORIA, France
Samuel Cruz-Lara obtained a Master’s degree in Computer Science in 1984 (University Henri Poincaré, Nancy 1) and a PhD degree in Computer Science in 1988 (National Polytechnic Institute of Lorraine). The central topic of his PhD thesis was the generation of integrated development environments by using attribute grammars.
He is currently Associate Professor at the University of Nancy 2 (Institute of Technology, Computer Science Department) and permanent Researcher at LORIA (Lorraine Laboratory for Research in Computer Science and its Applications – UMR 7503 – CNRS – INRIA – Universities of Nancy). He is a member of the ‘Language and Dialogue’ team and has conducted several research activities on distributed software architectures and textual linguistic resources management. He is currently working in the context of distributed architectures and multimedia resources management. Dr Cruz-Lara has participated in several projects, in particular CNRS/SILFIDE and MLIS-ELAN, and he is at present co-leader of the ‘Digital Museum’ project. This is a joint project between LORIA and the National Chi-Nan University. The ‘Digital Museum’ project is sponsored by the National Science Council of the Republic of China (Taiwan) and supported by INRIA (France).
Costis Dallas, Critical Publics SA, Greece
Costis Dallas is Chairman and Senior Researcher of Critical Publics (http://www.criticalpublics.com), a group of companies active in the field of strategic communications, creative design and technology. He is currently a Lecturer in the Department of Communication and Mass Media of Panteion University and has over 15 years of research and professional experience in hypermedia applications, human factor issues and cultural information systems. Dr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA, co-founder and Chair of the Multimedia Working Group of the International Council of Museums (CIDOC/ICOM), Head of Documentation and Systems of the Benaki Museum, General Director of the Foundation of the Hellenic World, Special Secretary of the Greek Ministry of Education in charge of libraries, archives and instructional technologies, and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: info@criticalpublics.com
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (http://www.nl.adlibsoft.com), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that he joined Databasix, producer of the ADLIB software. In 1991 Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was later renamed ADLIB Information Systems.
ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include:
CIMI Dublin Core test bed - ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database in addition to the ‘standard’ HTML/XML access.
SPECTRUM-XML - A project of the UK mda together with a team of software vendors to produce a schema for exchange of data which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software.
Internet Gelderse Musea (http://www.igem.nl) and Maritiem Digitaal (http://www.maritiemdigitaal.nl) - Two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.
E-mail: bert@nl.adlibsoft.com
Nicola Guarino, Italian National Research Council, Italy
Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991 and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS) and associate editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: Nicola.Guarino@ladseb.pd.cnr.it
Janneke van Kersen, Digital Heritage Association, The Netherlands
Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992 she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing, and more recently she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University.
Since October 1999 van Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata, and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations such as museums, archives, archaeological organisations, monuments and special collections of libraries in a context-rich and XML-based environment. Interoperability from both a technical and an organisational perspective is therefore a major issue. The Association offers access to heritage information to the general public at http://www.cultuurwijzer.nl and customised access for the educational field at http://www.cultuurwijs.nl
E-mail: jannekevankersen@den.nl
Marco Meli, EDW International, Italy
Marco Meli is CEO and co-founder of EDW International (http://www.edw-international.com), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia and VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meli@edw.it
Paul Miller, UKOLN, UK
Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, http://www.ukoln.ac.uk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK’s Further and Higher Education Funding Councils and Resource: The Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI) and the Metadata Working Group of the UK Government’s Office of the e-Envoy.
E-mail: P.Miller@ukoln.ac.uk
Frank Nack, CWI, The Netherlands
Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on ‘The Application of Video Semantics and Theme Representation for Automated Film Editing’ at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: Frank.Nack@cwi.nl
Franco Niccolucci, Florence University, Italy
Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference.
See also http://www.geog.port.ac.uk/hist-bound/people/niccolucci.htm
Seamus Ross, HATII, University of Glasgow, UK
Dr Seamus Ross is Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Access Network) (IST-2001-32706), a European Union funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously he was Assistant Secretary for Information Technology at the British Academy, and before that worked for a company specialising in expert systems and software development, as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group’s PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: S.Ross@hatii.arts.gla.ac.uk
Andrea Scotti, Institute & Museum for the History of Science, Italy
Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995 he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.
Since 1996 Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza and the National Central Library, under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (01/02/01-30/07/03), funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies.
See http://cweb.inria.fr/Projects/Mesmuses
E-mail: scottian@galileo.imss.firenze.it
DIGICULT PROJECT INFORMATION
DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over a period of 30 months (03/2002-08/2004).
In order to encourage early take-up, DigiCULT produces seven Thematic Issues and three Technology Watch Reports, along with the newsletter DigiCULT.Info.
DigiCULT draws on the results of the strategic study ‘Technological Landscapes for Tomorrow’s Cultural Economy (DigiCULT)’ that was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications), in 2000 and completed in 2001.
Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at http://www.digicult.info
For further information on DigiCULT please contact the team of the project co-ordinator:
Mr Guntram Geser, guntram.geser@salzburgresearch.at, Phone: +43-(0)662-2288-303
Mr John Pereira, john.pereira@salzburgresearch.at, Phone: +43-(0)662-2288-247
Salzburg Research Forschungsgesellschaft, Jakob-Haringer-Str. 5/III, A-5020 Salzburg, Austria
Phone: +43-(0)662-2288-200, Fax: +43-(0)662-2288-222
http://www.salzburgresearch.at
Project Partner: HATII - Humanities Advanced Technology and Information Institute, University of Glasgow, http://www.hatii.arts.gla.ac.uk
Contact: Mr Seamus Ross, s.ross@hatii.arts.gla.ac.uk
The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom
DigiCULT Thematic Issue 1 - Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum, held in Barcelona on May 6th, 2002, in the context of the DLM-Conference 2002.
DigiCULT Thematic Issue 2 - Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum, held in Essen, Germany, on September 3rd, 2002, in the context of the AIIM Conference @ DMS EXPO.
DigiCULT Thematic Issue 3 - Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum, held on January 21st, 2003, at Fraunhofer IPSI, Darmstadt, Germany.
DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum on Learning Objects, which will take place at the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, on July 2nd, 2003.
IMPRINT
This Thematic Issue is a product of the DigiCULT Project (IST-2001-34898).
Authors:
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy
Images:
Images for this Thematic Issue have been provided by and are reproduced with kind permission of the Koninklijke Bibliotheek – National Library of the Netherlands, The Hague, Netherlands.
Graphics & Layout:
Jan Steindl, Salzburg Research
ISBN 3-902448-00-8
Printed in Austria
© 2003
IMAGES
Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I. Image on p. 5, from Fol. 264r, size 80x75; illuminator: Orosius Master a.o.
Bible Historiale, Paris, c. 1320-1340, Volume I. Image on p. 7, from Fol. 2r, size 45x50; illum.: ‘Sub-Fauvel’ Master
Historic Bible, Utrecht, c. 1430, Volume I. Images on pp. 8, 10, 14, 28, from Fol. 3r, 3v, 4v, 7v, size 55x85 to 60x85; illum.: Alexander Master o.a.
Psalter Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468. Images on pp. 11, 13: Wednesday, Matins, Invitatorium, from Fol. 219r
Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350. Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron), from Fol. 104rb1, 106rb2, 124r, 111ra, size 40x55 to 50x55
Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335. Image on p. 21, from Fol. 4va1, size 45x55
Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460. Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41
Psalter, Normandy, c. 1180. Image on p. 26, from Fol. 3v, size 160x125
Breviary, Cambray(?), c. 1275-1300. Images on pp. 27, 29, 31, 33, 34, 35, 37, from Fol. 211r, size 205x135
© Koninklijke Bibliotheek, The Hague. Used with permission.
development could begin, basic classifications and methodologies would be required to form a foundation for the work.
Italian online publisher Marco Meli, a manager of the EU’s MESMUSES (Metaphors for Science Museums) project, insisted: ‘You need a clear definition in this particular domain. What are the key words, the terms?’
Mr Guarino answered: ‘The key concepts and the key relations.’
THE ELUSIVE SEMANTIC GRAIL
By now the Darmstadt 13 were beginning to realise they were not having much luck with their search for the Semantic Grail. Their discussion became a little tetchy.
Someone talked about ‘meta-ontology’ and another growled: ‘The term “meta” is abused.’
‘Outdated technical optimism’ was mentioned.
‘What is your alternative?’ someone else wanted to know.
‘I don’t have one.’
‘Is this the way forward?’
MESMUSES - Metaphors for Science Museums
MESMUSES is an IST RTD project that will run until July 2003. It aims at designing a general method and supporting tools to produce knowledge maps for use in self-learning environments of science museums. In the project, a knowledge map is defined as a set of related concepts and facts that is offered to learners with some guidance or suggestions on possible itineraries that they may follow to explore the knowledge space.
The method and tools developed in MESMUSES are being tested and validated by two large science museums, the Cité des Sciences et de l’Industrie in Paris and the Istituto e Museo di Storia della Scienza in Florence, which provide access to their digital catalogues. Both museums are developing knowledge maps and itineraries on different themes in Biology (Genome) and Physics (Galileo and the laws of motion).
Project Web site: http://cweb.inria.fr/Projects/Mesmuses
See also M. Meli: Knowledge Management: a new challenge for science museums. In: Cultivate Interactive, Issue 9, 7 February 2003, http://www.cultivate-int.org/issue9/mesmuses
‘What part of the problem would that solve?’
‘We will know when we have tried that out.’
‘Here we are approaching a scary field.’
Civility and peace were restored as Frank Nack, the CWI Netherlands scientist, introduced the thought: ‘There are ontologies for art, and they are very old and well crafted. There are very clear rules about why they did what they did, because they have worked on them for a thousand years. What you could suggest is that we strip down to the basics for one field, say art, and apply it to all the other fields we have in cultural heritage: architecture, film, whatever, all working with very different substances.’
Seamus Ross added: ‘So we need one fundamental ontology on which we can build all the others.’
THE LUCK CHANGES
The luck of the 13 was beginning to change. The Athenian heritage informatics expert, Dr Dallas, described work among his company clients on developing ‘an upper ontology’. All the issues the Forum had been discussing (what to do about time, basic concepts, process, agents and so on) were being examined. They were beginning to develop ‘something very much like a thesaurus’, with term expansion that created sub-categories of relationships. He called it ‘generic layering’, a process that could identify the ‘generic grammar’ of relationships within a specific domain - art history, for example.
‘This is useful,’ he said. ‘This way we can create Web systems that present an association of content for users that is meaningful to them. Let’s say, a “cultural” meaning.’
Nicola Guarino went further. He believed that the International Council of Museums’ Committee for Documentation (CIDOC) Conceptual Reference Model (CRM) was the ‘best starting point’ for the heritage community. The CIDOC CRM is the result of 10 years’ work by a standards working group. The model is under review for adoption as an International Standards Organisation (ISO) publication.
Mr Guarino was enthusiastic: ‘I am not biased on this. I am a reviewer on the CHIOS project that supports this proposal for an ISO standard, and I am amazed by the fact that this standard is more or less principled. On the one hand, the authors really strive to get these principled things, and on the other hand they have extensively accounted for existing practice. It is the result of a large community of work. It is not perfect, but it really is a starting point for this community.’
He affirmed Dr Ross’s delighted question: ‘So this is an ontology we can borrow?’
And the Italian expert had more to add: ‘The question before was “how can we be sure that the principled things can really solve our problems?” I do not have a crisp answer, but I do have some evidence that even a tiny result on the foundational side has a high pay-off. So you do not need to solve all the foundational issues.’
The distinctions between an object and its role, or individual and classes of items, were delicate but, once understood, could lead to significant data improvement, he said, adding: ‘Take for instance the distinction between object and event. This is only one tiny distinction, but it is so fundamental that once you understand it, you can save time in developing your own application ontology. Tiny conceptual progress does have a high pay-off. This is why I believe it is useful.’
There were still one or two doubts, but CULTOS project co-ordinator Wernher Behrendt tidied up with a daring stance: ‘Let me be a heretic for a second. How many of us have an operating system other than Windows? What I am saying is that standardisation often helps. Even if it is not the best standard, it does help get people working together. It is perfectly fair to define a standard for the world now. There will be a lot of discussion, but it will wind down to a few constructs. It is a better method of getting an ontology accepted than having ontologies mushrooming all over the place that must then be integrated.’
The Darmstadt 13 were pleased. They had a model. They had ‘Web Services’ stepping stones to work across. They weren’t going to fall into the trap of insisting just yet that the Semantic Web was important for the heritage sector, but they wanted an education process for unconvinced curators.
They needed someone to make the first ontology move. Seamus Ross suggested asking the J. Paul Getty Trust (http://www.getty.edu). They needed automated tools for testing ontologies. They did not want to be delivered into the entertainment industry, but saw benefit in what University of Florence Associate Professor Franco Niccolucci called ‘cultural entertainment’.
So where would the experts put their money, asked Mr Behrendt: ‘On there not being any benefits in the Semantic Web for the cultural heritage sector, or there being some benefits in building such things, whatever they may be?’
Amsterdamer Frank Nack was in no doubt: ‘It is going to happen. It will probably look very different from how we imagine it right now, but it is going to happen.’ His countryman, ADLIB chief Bert Degenhart-Drenth, thought so too: ‘We have put our money there. All our applications work with XML.’ And Dr van Kersen agreed: ‘I would put my money on the Semantic Web.’
That seemed to make it game, set and match.
CIDOC CRM: ‘The Semantic Glue’
The ‘CIDOC object-oriented Conceptual Reference Model’ (CIDOC CRM) was developed by the ICOM/CIDOC Documentation Standards Group. Since September 2000 the CIDOC CRM has been under development as an ISO standard.
‘The CIDOC CRM is intended to promote a shared understanding of cultural heritage information by providing a common and extensible semantic framework to which any cultural heritage information can be mapped. It is intended to be a common language for domain experts and implementers to formulate requirements for information systems and to serve as a guide for good practice in conceptual modelling. In this way, it can provide the semantic glue needed to mediate between different sources of cultural heritage information, such as that published by museums, libraries and archives.’
http://cidoc.ics.forth.gr/what_is_crm.html
CHIOS - Cultural Heritage Interchange Ontology Standardization project
Since June 2001 the work of the CIDOC CRM Special Interest Group has been supported by CHIOS, a two-year project which receives funding from the Fifth Framework IST Programme. The CHIOS consortium forms an integral part of the CIDOC CRM Special Interest Group which, by organising shared meetings, represents the interests and requirements of the cultural heritage community to the ISO Working Group (ISO/TC46/SC4/WG9).
http://cidoc.ics.forth.gr/chios_iso.html
Annotation & Authoring
‘How to annotate and have fun’, http://annotation.semanticweb.org/faq/faq2, and other papers. See also http://annotation.semanticweb.org/tools, Institute for Applied Informatics and Formal Description Methods, University of Karlsruhe, Germany.
The Annotea Project, a ‘Live Early Adoption and Demonstration (LEAD)’ project of the World Wide Web Consortium (W3C): collaboration environment with shared annotations, www.w3.org/2001/Annotea
CIDOC CRM
The CIDOC Conceptual Reference Model was an important reference point in the Forum discussion (see also the information box on the CIDOC CRM in the Summary). CIDOC, the Comité International pour la Documentation, is part of the International Council of Museums (ICOM). Its Web site can be found at http://cidoc.ics.forth.gr
A report on CIDOC’s work on the CRM is provided in ‘The CIDOC Conceptual Reference Model: A Standard for Communicating Cultural Contents’ by Nick Crofts, Martin Doerr and Tony Gill, in Cultivate Interactive, Issue 9, February 2003, http://www.cultivate-int.org/issue9/chios
See also their tutorials at http://cidoc.ics.forth.gr/tutorials.html
DAML - DARPA Agent Markup Language
The Defense Advanced Research Projects Agency (DARPA) is the central research and development organisation for the US Department of Defense. Its DAML programme is developing a language and tools to facilitate Semantic Web concepts. http://www.daml.org
‘Why Use DAML?’, Adam Pease, white paper, Teknowledge, 10 April 2002, http://www.daml.org/2002/04/why.html
DAML+OIL
DAML+OIL is a product of the DARPA Joint United States/European Union ad hoc Agent Markup Language Committee. The committee created a language with the best features of SHOE, DAML, OIL and several other markup approaches. It is a Web ontology language (latest release March 2001) expected to provide a basis for future Web standards for ontologies. See http://www.daml.org/2001/03/daml+oil-index.html
DigiCULT 21
Genesis ndash The Creation Stars and Fishes
SEMANTIC WEB TERMS
AND READING LIST A-XCompiled by Guntram Geser and Michael Steemson
I n the Summary we have provided informa-tion boxes on projects but not on the manySemantic Web standards technologies etc
mentioned in the Forum discussionThe projectsincluded are only a small fraction of many ongoingactivities and are related mainly to the cultural andscientific heritage community Links to manyimportant Semantic Web development projectscan be found at their community portalhttpwwwsemanticweborg
The following guide points to resources andreadings on terms mentioned in the ForumSummary It is not intended to provide acomprehensive list of Semantic Web materialsRather it represents different entry points andlevels to this topic
22 DigiCULT
OIL - Ontology Inference Layer
OIL is a proposal for a Web-based representation and inference layer for ontologies. It uses a layered approach to defining a standard ontology language. Each layer adds functionality and complexity to the previous layer. This is done such that machines that can only process a lower layer can still partially understand high-level ontologies. See http://www.ontoknowledge.org/oil
A white paper on OIL functions: ‘An informal description of Standard OIL and Instance OIL’ by the On-To-Knowledge group, led by the Department of Computer Science, University of Manchester, UK, 28 November 2000, www.ontoknowledge.org/oil/downl/oil-whitepaper.pdf
Ontologies
Ontology research: Laboratory for Applied Ontology (LOA), Institute of Cognitive Sciences and Technology (ISTC), http://www.ladseb.pd.cnr.it/infor/ontology/ontology.html
Dieter Fensel: Ontologies: A Silver Bullet for Knowledge Management and Electronic Commerce. New York: Springer, 2001.
‘Ontology Infrastructure for the Semantic Web’ (includes a detailed subject bibliography), WonderWeb Project, Department of Computer Science, Victoria University of Manchester, UK, http://wonderweb.semanticweb.org/deliverables/documents/D15.pdf
Standard Upper Ontology (SUO): an upper ontology for data interoperability, information search and retrieval, automated inferencing and natural language processing. IEEE Standard Upper Ontology (SUO) Working Group, http://suo.ieee.org
OWL - Web Ontology Language
The Web Ontology Language is a semantic markup language for publishing and sharing ontologies on the World Wide Web. OWL is developed as a vocabulary extension of the Resource Description Framework (RDF) and is derived from the DAML+OIL Web Ontology Language. For the development of this language, see the documents of the Web Ontology (WebOnt) Working Group, http://www.w3.org/2001/sw/WebOnt
See also the OWL Web Ontology Language Reference (W3C Working Draft, 31 March 2003) at http://www.w3.org/TR/owl-ref and the OWL Web Ontology Language Guide (W3C Working Draft, 31 March 2003) at http://www.w3.org/TR/owl-guide
For an understanding of the goals, requirements and usage scenarios for a Web ontology language, see ‘Web Ontology Language (OWL) Use Cases and Requirements’ (W3C Working Draft, 31 March 2003), http://www.w3.org/TR/webont-req
RDF - Resource Description Framework
See Cultural Heritage Semantic Web Example & Primer, pp. 32-34.

Semantic Web
‘The Semantic Web’, Tim Berners-Lee with James Hendler and Ora Lassila, Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html
‘Enhanced Science and the Semantic Web’, J.A. Hendler, in Science magazine, Volume 299, Number 5606, 24 January 2003, pp. 520-521.
‘Peer-to-Peer: The Infrastructure for the Semantic Web’, Stanford University. The Semantic Web as the next evolutionary step of the Internet. http://p2p.semanticweb.org
The Semantic Web Community Portal, SemanticWeb.org, is currently operated by three research groups: the Onto-Agents and Scalable Knowledge Composition (SKC) Research Group at Stanford University, the Ontobroker-Group at the University of Karlsruhe, Germany, and the Protégé Research Group at Stanford University. http://www.semanticweb.org
The W3C Semantic Web Activity Statement explains the consortium’s plans in the areas of enabling standards (driven by the RDF Core and Web Ontology Working Groups), education and outreach (RDF Interest Group), as well as co-ordination and advanced development. http://www.w3.org/2001/sw/Activity
See also Kim Veltman’s warning of what he sees to be too narrow a definition of the Semantic Web, one that will not allow the historical dimension, the richness of cultural expression, the unique and the diversity of interpretations to be adequately dealt with. Cf. K. Veltman: Challenges for a Semantic Web (July 2002), http://www.cultivate-int.org/issue7/semanticweb
SHOE – Simple HTML Ontology Extensions
SHOE was one of the first ontology-based markup languages developed for use on the World Wide Web. It is a small extension to HTML that allows Web page authors to annotate their Web documents with machine-readable knowledge. See http://www.cs.umd.edu/projects/plus/SHOE

SMIL - Multimedia on the Web
The Synchronized Multimedia Integration Language (SMIL, pronounced ‘smile’) enables authoring of interactive audiovisual presentations. SMIL is typically used for ‘rich media’/multimedia presentations which integrate streaming audio and video with images, text or any other media type. SMIL is an HTML-like language and may be written using a simple text editor. W3C Synchronized Multimedia, http://www.w3.org/AudioVideo

SPECTRUM-XML DTD
SPECTRUM (Standard Procedures for Collections Recording Used in Museums) was created by the mda (http://www.mda.org.uk). It is a guide to good practice for museum documentation that describes procedures for documenting objects and the processes they undergo, as well as the information that needs to be recorded to support the procedures. For SPECTRUM, an XML Document Type Definition has been produced which serves as a system-neutral interchange format for museum data.
‘SPECTRUM: The UK Museum Documentation Standard’ is available in its second edition, see http://www.mda.org.uk/spectrum.htm
For a description of the creation, structure and deployment of the SPECTRUM-XML DTD, see Bert Degenhart-Drenth: Building on the mda SPECTRUM-XML DTD for Collections Management Data Interchange. Museums and the Web Conference 2001, http://www.archimuse.com/mw2001/papers/degenhart/degenhart.html
Web Services
Although the concept of Web Services is not a new one, the definition of Web Services standards allows the wider use of these services. Currently three main protocols are used in the context of Web Services: UDDI (Universal Description, Discovery and Integration), a registry system to find resources and Web Services; WSDL (Web Services Description Language), an interface description language; and SOAP (Simple Object Access Protocol), the communication protocol for Web Services. All three protocols are based on XML.
For a description of the many ways in which XML can enhance Web Services, see http://www.w3.org/2002/ws/Activity

World Wide Web
Looking back as well as into the future: ‘Weaving the Web: The Original Design and Ultimate Destiny of the World Wide Web by Its Inventor’, Tim Berners-Lee with Mark Fischetti, Harper San Francisco, 1999.

XML – eXtensible Markup Language
See Cultural Heritage Semantic Web Example & Primer, pp. 27-30.
For the work done at W3C within the XML activity, see XML Working Groups, http://www.w3.org/XML; the XML Specifications can be found at http://www.w3.org/XML/Core
‘The bane of my existence is doing things that I know the computer could do for me.’ – Dan Connolly: The XML Revolution, October 1998, http://www.nature.com/nature/webmatters/xml
SEMANTIC WEB SHOULD BE BASED ON WELL-FOUNDED ONTOLOGIES
AN INTERVIEW WITH NICOLA GUARINO, LABORATORY OF APPLIED ONTOLOGY, ITALY
By Joost van Kasteren

‘The Semantic Web, as it is advocated by people like Tim Berners-Lee and James Hendler, does not take enough advantage of the experience built up in knowledge engineering and conceptual modelling. There is this anarchistic idea of the Web as a place where everyone can do his or her own thing. I have no problem with that; a lot of people are able to find what they want on the Web. But if you want real interoperability, with search engines that can grasp the intended meaning of information, that approach falls short. To create a real Semantic Web, we have to develop and use well-founded generic ontologies based on linguistics and logics.’

Nicola Guarino has clear views on the Semantic Web and its development. He is a senior researcher at the Institute for Cognitive Sciences and Technologies in Italy, where he leads the Laboratory for Applied Ontology. Since 1991 he has played an active role in the Artificial Intelligence community in promoting the interdisciplinary study of ontological foundations of knowledge engineering and conceptual modelling. Guarino: ‘In our Laboratory the focus is on content, and not so much on representation. The use of ontologies is unavoidable when referring to content. People do it implicitly all the time when they are communicating and trying to understand each other. If we want machines to understand each other, in other words real interoperability, we need to make these ontologies explicit in an unambiguous way.’

An ontology is a hierarchical description of the relations between concepts in a certain domain, plus an unambiguous description of the concepts themselves. As they are created for a certain domain, ontologies often fail to be interoperable because of the ambiguity that results from the use of the same terms for different concepts (and vice versa) between different domains. The term ‘net’, for instance, has quite a different meaning for Web designers and fishermen. That is why there is a need for well-founded generic ontologies. An example of a notion that a generic ontology has to settle is the term ‘part’, which can have different meanings both within a domain and between domains. For instance, the violist plays a part in the orchestra. His finger is part of him. Can his finger be part of the orchestra? According to Guarino, this is a genuine ontological problem that can only be solved by giving an unambiguous meaning to the term ‘part’.

Another example cited by Guarino is the term ‘in’. What exactly are you describing when you say the spoon is in the cup? Does it mean that the spoon is totally embedded in the cup, or is it only partly in the cup? Guarino: ‘These examples seem trivial, but if you want real interoperability between different knowledge domains you will have to prevent the problems that come with the ambiguity of day-to-day language.’

In this respect, Guarino thinks it is a drawback that computer science curricula scarcely ever contain an introduction to the ontological foundations of conceptual modelling. ‘Students learn all about Java, HTML and C++, and name all the other languages, and they also learn how to use these. But when they graduate they hardly know a thing about formal ontology. I really think people should know more about the work on ontology that has been done in philosophy. It is certainly not much harder to acquire than, say, studying differential equations or learning how to use Java.’

Developing well-founded generic ontologies seems an enormous job, but the task is not as large as it appears. Guarino: ‘I would say that a few dozen would get you on the way nicely. But you have to take the fundamental route. At the moment, development of the Semantic Web is driven by the need for short-term results. Hence, interoperability is realised by putting the right tags on the information. That is not what I call semantics, that is syntax. XML and RDF are very useful for this, but they fall short when you want to create a real Semantic Web.’

Laboratory of Applied Ontology: http://ontology.ip.rm.cnr.it
A CULTURAL HERITAGE SEMANTIC WEB EXAMPLE & PRIMER
By Guntram Geser

The Semantic Web, according to a statement of its well-known advocate Tim Berners-Lee, is a vision of ‘a distributed machine which should function so as to perform socially useful tasks’.¹ This machine should allow intelligent software agents to understand semantic relationships between Web resources in order to seek relevant information and perform transactions for humans.

Contrasted with the existing human-readable Web, the Semantic Web is envisaged as a Web of machine-readable data that will be based on ‘languages for expressing information in a machine processable form’.² Key to an understanding of the Semantic Web, therefore, is how these languages function: how information is expressed so that computers can automatically process Web sources and assist in making the Web more useful for humans. The aim of this chapter is to provide an overview of the Semantic Web concept by describing its general architecture, i.e. the interplay of its languages.

The chapter has two interrelated parts. Part 1 describes a Finnish project that strives to build the foundations for the “Finnish Museums on the Semantic Web” (FMS), a future semantic museum portal. This part consists of the information boxes on the following pages, which briefly describe the necessary elements and steps in the set-up of the FMS system. It is recommended to start by reading this description (see also graphic 3 on page 36, which provides an overview of the set-up of the FMS system). It should be helpful in gaining a general understanding of how semantic interoperability of, and new ways of interacting with, semantically marked-up cultural heritage information can be realised.

Part 2, the texts below the information boxes, is a primer that explains terms used in part 1 which represent core elements of the Semantic Web architecture, and provides illustrative examples. The explanations are not intended to give in-depth definitions of these elements; such definitions are provided in the relevant W3C specifications. The examples have been kept as simple as possible, but build on each other. In this way we will develop a (fictitious) Website, http://www.m-i.org, that provides semantically enhanced access to such marvellous medieval images as the ones we have used to illustrate this Thematic Issue.

How to Make Collection Metadata of Museums Semantically Interoperable on the Web – The “Finnish Museums on the Semantic Web” (FMS)

The Semantic Web concept is visionary, and there are dedicated people, also in the heritage sector, who are trying to make it a reality. In our example, a group of researchers and technology developers who work at the University of Helsinki and the Helsinki Institute for Information Technology are translating the Semantic Web vision into a future semantic museum portal.

The group’s two-year project will run until spring 2004 and is being carried out in co-operation with, and with funding from, major organisations including Espoo City Museum, Helsinki University Museum, National Board of Antiquities, Nokia, TietoEnator and the National Technology Agency (TEKES).

1 Tim Berners-Lee: Interpretation and Semantics on the Semantic Web (1998), http://www.w3.org/DesignIssues/Interpretation.html
2 Tim Berners-Lee: Semantic Web Road Map (1998), http://www.w3.org/DesignIssues/Semantic.html
3 Robert DuCharme, http://lists.xml.org/archives/xml-dev/200211/msg00190.html
The major goals of the project are to make collection metadata, which stem from heterogeneous databases, semantically interoperable on the Web, and to provide facilities for semantic browsing and searching in the combined knowledge base of the participating museums.

The project’s vision is called “Finnish Museums on the Semantic Web” (FMS), and its architecture allows for all Finnish museums to join in. However, in an approach of starting small but ambitious, the project is at present using the collection databases of two museums, the Espoo City Museum and the National Museum of Finland. Furthermore, the implementation is currently restricted to only one part of the collections: textiles.

In order to reach the FMS system’s goal of making the museums’ metadata semantically interoperable on the Web, the data must be harmonised on the syntactic and semantic level. For this harmonisation, the eXtensible Markup Language (XML) and the Resource Description Framework (RDF) are being used, of which RDF is the key language for achieving semantic interoperability of the heterogeneous sets of metadata.
RDF and Metadata - A Natural Fit
An observer of the diffusion of the Resource Description Framework (RDF) into various domains, Robert DuCharme, has commented: ‘I still find it a little ironic that while RDF has gotten so much publicity as a technology for warm and fuzzy AI (Artificial Intelligence) pie-in-the-sky technology, it’s gotten most of its traction in the mundane world of metadata.’³

Yet, given the importance of metadata for the Semantic Web vision in general, it does not come as a surprise that metadata of key information communities are among the first of RDF’s intended uses. RDF seems to be gaining momentum in particular among the library and other communities that use Dublin Core.

The current W3C RDF Primer (Working Draft, 23 January 2003), edited by Frank Manola and Eric Miller, labels RDF as ‘an ideal representation for Dublin Core information’ and describes Dublin Core as one of its ‘RDF in the field’ examples (cf. http://www.w3.org/TR/rdf-primer).

At the Dublin Core Metadata Initiative (DCMI), ‘Expressing Simple Dublin Core in RDF/XML’ was announced as a DCMI Recommendation in October 2002, ‘the first in a series of recommendations for encoding Dublin Core metadata using mainstream Web technologies’, i.e. XML/RDF/XHTML. ‘Expressing Qualified Dublin Core in RDF/XML’ is currently a Proposed Recommendation. Cf. http://dublincore.org/groups/architecture
Syntactic Transformation 1: Creating the XML Documents
In the FMS system, the eXtensible Markup Language (XML) is used as the data transfer format. This transfer format enables the system to make use of the data originally stored in the museums’ heterogeneous collection databases. Therefore, each museum participating in the FMS initiative provides the relevant collection data as an XML document repository.

In a process of syntactic harmonisation, the data from a museum’s collection database are retrieved and transformed to an XML format conforming to the XML Schema of the FMS initiative.

The data to be published are read from the database through a ‘view’, which helps create the XML format. The view is a queryable interface: a virtual table that results from an SQL query, which may join multiple tables of the database. Through the view, the data are queried so that the rows of the tables are grouped by collection items. For each item, the set of rows is combined into a single XML document.
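The rows-to-XML step described above can be sketched in a few lines of Python. This is a minimal illustration, not the FMS implementation: the table names `item` and `maker`, the view name `item_view`, and the choice of SQLite are assumptions made for the example; the element names and the namespace are those of the fictitious m-i.org example used in this primer.

```python
import sqlite3
import xml.etree.ElementTree as ET
from itertools import groupby

NS = "http://www.m-i.org/images"  # fictitious namespace used in this primer
FIELDS = ["type", "subject", "iconclass", "creator", "manuscript", "place", "year"]


def build_sample_database():
    """Create two hypothetical source tables and a view that joins them."""
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row
    conn.executescript("""
        CREATE TABLE item (item_id TEXT PRIMARY KEY, type TEXT, subject TEXT,
                           iconclass TEXT, manuscript TEXT, place TEXT, year TEXT);
        CREATE TABLE maker (item_id TEXT, creator TEXT);
        INSERT INTO item VALUES ('image5kb78d38i', 'Column Miniature',
            'Eve emerges from Adam''s body', '71A3421', 'Historic Bible',
            'Utrecht', 'circa 1430');
        INSERT INTO maker VALUES ('image5kb78d38i', 'Alexander Master');
        CREATE VIEW item_view AS
            SELECT i.item_id AS item_id, i.type AS type, i.subject AS subject,
                   i.iconclass AS iconclass, m.creator AS creator,
                   i.manuscript AS manuscript, i.place AS place, i.year AS year
            FROM item i JOIN maker m ON m.item_id = i.item_id;
    """)
    return conn


def rows_to_xml(conn):
    """Group the view's rows by item and return {item_id: XML string}."""
    ET.register_namespace("mi", NS)
    rows = conn.execute("SELECT * FROM item_view ORDER BY item_id").fetchall()
    documents = {}
    for item_id, group in groupby(rows, key=lambda row: row["item_id"]):
        group = list(group)
        root = ET.Element(f"{{{NS}}}image", {"image_id": item_id})
        for field in FIELDS:
            # Sketch-level choice: use the value from the first row of the group.
            ET.SubElement(root, f"{{{NS}}}{field}").text = group[0][field]
        documents[item_id] = ET.tostring(root, encoding="unicode")
    return documents


documents = rows_to_xml(build_sample_database())
print(documents["image5kb78d38i"])
```

A production export would also have to decide how to handle items with several rows per field (for example, several creators), which this sketch deliberately leaves out.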
XML
XML is a markup language for describing data. It is a language created to allow anyone to design the structure of their own documents. An XML document contains text that consists of markup in the form of tags and plain text between them, the latter being just pure information (for example, <creator>Alexander Master</creator>). XML tags are not predefined; everyone can define his or her own tags.
FMS Documents
A short presentation is provided in Eero Hyvönen et al.: Cultural Semantic Interoperability on the Web. Case Finnish Museums Online, http://iswc2002.semanticweb.org/posters/hyvonen_a4.pdf. For detailed descriptions, see Vilho Raatikka, Eero Hyvönen: Ontology-based Semantic Metadata Validation, and Hyvönen, Eero et al.: Semantic Interoperability on the Web. Case Finnish Museums Online. Both texts can be found in: Towards the Semantic Web and Web Services. Proceedings of the XML Finland 2002 Conference, http://www.cs.helsinki.fi/u/eahyvone/xmlfinland2002/Proceedings/XML2002-final.pdf
ITEM_VIEW (columns of the view): item_id, type, subject, iconclass, creator, manuscript, place, year
XML shares the syntax and bracketed tags of the well-known HyperText Markup Language (HTML), but XML serves a different goal. While HTML is used to define the layout of pages on the WWW, XML is used to define the content of documents, for example, to specify that an area of text is the name of a creator.

XML allows for creating markup (e.g. <creator>) that seems to carry some semantics. However, for a computer a tag like <creator> carries as much semantics as a tag like <H1>. A computer simply does not know what a creator is and how the concept creator is related to other concepts (e.g. manuscript). For an XML processor, <H1> and <creator> or <manuscript> are all equally (and totally) meaningless. XML is all about describing data; on its own it does not do anything. There needs to be a processing program that uses the markup to interpret the various elements.
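The point that element names are opaque to an XML processor can be illustrated with a short Python sketch. The `MEANING` dictionary is a hypothetical stand-in for the knowledge a processing program must bring along; nothing of the kind is supplied by XML itself.

```python
import xml.etree.ElementTree as ET

doc_a = "<record><creator>Alexander Master</creator></record>"
doc_b = "<record><H1>Alexander Master</H1></record>"


def describe(xml_text):
    """Return (tag, text) pairs: everything a generic parser knows."""
    return [(child.tag, child.text) for child in ET.fromstring(xml_text)]


# Hypothetical application-level mapping: the meaning lives here, not in XML.
MEANING = {"creator": "person who made the item"}

for tag, text in describe(doc_a) + describe(doc_b):
    interpretation = MEANING.get(tag, "no meaning known to this program")
    print(f"{tag}: {text!r} -> {interpretation}")
```

Both documents parse equally well; only the program's own mapping distinguishes a creator from a heading.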
The graphic below illustrates the database rows to XML process as described in the info box. It provides a very simple example of an XML document that describes some data for one of the medieval column miniatures from the Koninklijke Bibliotheek, The Hague, which we were permitted to use for illustrating this Thematic Issue. It includes the Iconclass classification for this image: 71A3421, Eve emerges from Adam’s body (for the hierarchical path of this classification, see the section on ontologies).

Short explanations for the XML document (image5kb78d38i.xml) shown in the graphic below:

A well-formed XML document is one that conforms to the XML syntax rules, of which we would like to highlight the following:

(1) The document should begin with the XML declaration, which defines the XML version and the character encoding used in the document. In the example below we use <?xml version="1.0" encoding="ISO-8859-1"?>, i.e. the document conforms to the 1.0 specification of XML and uses the ISO-8859-1 (Latin-1/West European) character set.

(2a, 10) The XML document must contain a single tag pair to define a root element, in our example <mi:image> ... </mi:image>.
Graphic 1: Database rows to XML process

Database rows (item data from database rows, grouped by item):
image5kb78d38i | ‘Column Miniature’ | ‘Eve emerges from Adam’s body’ | ‘71A3421’ | ‘Alexander Master’ | ‘Historic Bible’ | ‘Utrecht’ | ‘circa 1430’

XML document (image5kb78d38i.xml), the result of the rows to XML process:
(1) <?xml version="1.0" encoding="ISO-8859-1"?>
(2) <mi:image xmlns:mi="http://www.m-i.org/images"
       image_id="image5kb78d38i">
(3) <mi:type>Column Miniature</mi:type>
(4) <mi:subject>Eve emerges from Adam’s body</mi:subject>
(5) <mi:iconclass>71A3421</mi:iconclass>
(6) <mi:creator>Alexander Master</mi:creator>
(7) <mi:manuscript>Historic Bible</mi:manuscript>
(8) <mi:place>Utrecht</mi:place>
(9) <mi:year>circa 1430</mi:year>
(10) </mi:image>
(2b) Namespace: Since element names in XML are not fixed, name conflicts can occur when different documents use the same names to describe different types of elements. To prevent such conflicts, a unique namespace should be defined using a Uniform Resource Identifier (URI). An XML namespace is a collection of names that are used as element types and attribute names (cf. http://www.w3.org/TR/REC-xml-names). Our namespace declaration in the start tag of the root element is xmlns:mi="http://www.m-i.org/images".

The namespace prefix mi (for medieval images) functions as a placeholder for the namespace name. It needs to show up in all element tags (e.g. <mi:type> ... </mi:type>).

(2c) image_id="image5kb78d38i": This is the id attribute, which contains the unique identifier of the data source record.

(3-9) All other elements must be within the root element and can themselves have sub-elements (child elements), which must be properly nested within their parent element. Our elements do not have sub-elements.

Other syntax rules are, for example: all start tags must match their end tags and, because XML tags are case sensitive (i.e. the tag <mi:Creator> is different from the tag <mi:creator>), they must also be written with the same case; all elements must have a closing tag; all attribute values must be within quotation marks (e.g. "image5kb78d38i").
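The well-formedness rules listed above can be tried out with any XML parser. The following Python sketch uses the standard library parser; the helper name `is_well_formed` and the deliberately broken sample documents are illustrative assumptions, while the namespace and identifier come from the primer's fictitious example.

```python
import xml.etree.ElementTree as ET

GOOD = (
    '<?xml version="1.0" encoding="ISO-8859-1"?>'
    '<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">'
    "<mi:creator>Alexander Master</mi:creator>"
    "</mi:image>"
).encode("iso-8859-1")

# Start tag and end tag differ in case, so they do not match.
BAD_CASE = (
    b'<mi:image xmlns:mi="http://www.m-i.org/images">'
    b"<mi:Creator>x</mi:creator></mi:image>"
)
# The attribute value is not enclosed in quotation marks.
BAD_QUOTES = b'<mi:image xmlns:mi="http://www.m-i.org/images" image_id=image5kb78d38i/>'
# The prefix mi is used without a namespace declaration.
NO_NAMESPACE = b"<mi:image><mi:creator>x</mi:creator></mi:image>"


def is_well_formed(data):
    """Return True if the parser accepts the document, False otherwise."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False


for name, data in [("GOOD", GOOD), ("BAD_CASE", BAD_CASE),
                   ("BAD_QUOTES", BAD_QUOTES), ("NO_NAMESPACE", NO_NAMESPACE)]:
    print(name, is_well_formed(data))

# The parser replaces the prefix with the namespace name it stands for.
print(ET.fromstring(GOOD)[0].tag)
```

The last line shows why the prefix is only a placeholder: after parsing, the element is identified by the full namespace URI plus the local name.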
Syntactic Transformation 2: The XML Schema
In order to allow for syntactic harmonisation, the XML documents of the museums should conform to the XML Schema of the FMS initiative. Therefore, the museums validate the XML documents they create against the initiative’s XML Schema. If the documents are valid, the process can continue to the semantic level.

XML Schema
The XML Schema defines the building blocks of an XML document, including:
| elements and attributes that can appear in a document;
| which elements are child elements, as well as their order and number;
| whether an element is empty or can include text;
| the data types for elements and attributes;
| as well as default and fixed values for elements and attributes.
XML with an XML Schema is designed to be self-descriptive. One of the greatest strengths of XML Schema is that it allows for data typing. The most common data types are xs:string, xs:decimal, xs:integer, xs:boolean, xs:date and xs:time. In the example below, which is the XML Schema for the XML document (image5kb78d38i.xml) shown in graphic 1, we only use the data type xs:string. This data type is used for values that contain character strings.
(1) <?xml version="1.0"?>
(2a) <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
(2b) targetNamespace="http://www.m-i.org/images"
(2c) xmlns="http://www.m-i.org/images"
(2d) elementFormDefault="qualified">
(3) <xs:element name="image">
(4) <xs:complexType>
(5) <xs:sequence>
(6) <xs:element name="type" type="xs:string"/>
(7) <xs:element name="subject" type="xs:string"/>
(8) <xs:element name="iconclass" type="xs:string"/>
(9) <xs:element name="creator" type="xs:string"/>
(10) <xs:element name="manuscript" type="xs:string"/>
(11) <xs:element name="place" type="xs:string"/>
(12) <xs:element name="year" type="xs:string"/>
(13) </xs:sequence>
(14) <xs:attribute name="image_id" type="xs:string" use="required"/>
(15) </xs:complexType>
(16) </xs:element>
(17) </xs:schema>

Short explanations
(1) The XML declaration, which states that the document conforms to the 1.0 specification of XML.
(2a) Determines that the elements and data types that are used to construct the schema come from the W3C’s XML Schema namespace. Consequently, each of the elements and data types in the schema has the prefix xs:, which identifies them as belonging to the vocabulary of the XML Schema language rather than the vocabulary of our (fictitious) organisation M-i.org.
(2b) Indicates that the elements defined by this schema come from our http://www.m-i.org/images namespace.
(2c) Is our default namespace.
(2d) Demands that any elements used by an XML document which were declared in this schema must be namespace qualified.
(3, 16) The parent element for the data typing of the image descriptions we provide at http://www.m-i.org/images. It is (4) defined as an xs:complexType, i.e. it contains child elements (6-12), which are (5, 13) surrounded by an xs:sequence element that defines an ordered sequence of these elements.
(6-12) The child elements, which in our example are simple types because they do not contain other elements. They define various elements of our XML documents, e.g. “creator”, to be of the data type xs:string.
(14) Furthermore, the Schema determines that for the element xs:element name="image" there is a required attribute image_id of the data type xs:string (for example, image5kb78d38i).
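What the schema demands of a document can be made concrete with a hand-written check. The Python sketch below is not an XML Schema validator and is not how the FMS initiative validates documents; it only re-implements, under that stated assumption, three constraints from the example schema (root element, required attribute, ordered sequence of text-only children). The function names `check_image` and `make_doc` are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = "http://www.m-i.org/images"  # target namespace of the example schema
EXPECTED_CHILDREN = ["type", "subject", "iconclass", "creator",
                     "manuscript", "place", "year"]


def check_image(xml_bytes):
    """Return a list of problems; an empty list means all checks passed."""
    problems = []
    root = ET.fromstring(xml_bytes)
    if root.tag != f"{{{NS}}}image":
        problems.append("root element is not image in the m-i.org namespace")
    if "image_id" not in root.attrib:
        problems.append("required attribute image_id is missing")
    found = [child.tag for child in root]
    wanted = [f"{{{NS}}}{name}" for name in EXPECTED_CHILDREN]
    if found != wanted:
        problems.append("child elements differ from the required sequence")
    for child in root:
        if len(child) > 0:
            problems.append(f"{child.tag} must be a simple (text-only) element")
    return problems


def make_doc(names, with_id=True):
    """Build a small test document with the given child element names."""
    attr = ' image_id="image5kb78d38i"' if with_id else ""
    body = "".join(f"<mi:{name}>x</mi:{name}>" for name in names)
    return f'<mi:image xmlns:mi="{NS}"{attr}>{body}</mi:image>'.encode("utf-8")


print(check_image(make_doc(EXPECTED_CHILDREN)))
print(check_image(make_doc(list(reversed(EXPECTED_CHILDREN)), with_id=False)))
```

In practice one would hand the schema itself to a schema-aware validator rather than restate its rules in code; the sketch only shows what kind of rules are being enforced.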
Major Benefits of XML
In the context of the Semantic Web, XML provides an interoperable syntactical foundation upon which solutions to the issues of representing relationships and meaning can be built. We also want to highlight the many benefits of XML that are adding to its rapid uptake in the first place, and might in the longer term be supportive in realising the Semantic Web vision on a broader scale.

XML is one of the most important standards developments in recent years. It is an international, universal, non-system and non-application specific data exchange standard. XML is international because it employs Unicode. This means that there is no restriction to the western alphabet; Arabic, Chinese, Greek, Hebrew, Thai etc. can be easily integrated.

XML is non-system specific because it is an open standard set by the World Wide Web Consortium (W3C). As such, there is no owner of XML. All the major software suppliers support it; it can be used on any computing platform, from Windows and MacOS to Linux. This makes it easier for organisations to change systems or combine different systems.

XML is also non-application specific, i.e. it can be used in various applications such as data exchange, data harvesting, Web site management etc. XML is gaining ever-wider acceptance in many application domains, including, in particular, the cultural heritage community.

Bear in mind also that the major collection management software producers have implemented support for XML in their systems, enabling, for example, the integration of data from different collections and their combination over the Web.

XML allows for multi-channel publishing, i.e. with XML it is easy to produce different products or services from digital cultural heritage assets. Once the data are structured in XML, they can be displayed across a variety of media using an associated style sheet that contains the display information.
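Multi-channel publishing can be sketched as one XML source rendered by several independent presentation routines. In the Python example below, the two functions `to_html` and `to_caption` are hypothetical stand-ins for style sheets (in practice this role is typically played by a style sheet language rather than by program code); the source record is an abridged version of the primer's fictitious example.

```python
import xml.etree.ElementTree as ET
from html import escape

NS = {"mi": "http://www.m-i.org/images"}
SOURCE = (
    '<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">'
    "<mi:subject>Eve emerges from Adam's body</mi:subject>"
    "<mi:creator>Alexander Master</mi:creator>"
    "<mi:year>circa 1430</mi:year>"
    "</mi:image>"
)


def to_html(root):
    """Render the record as an HTML fragment for a Web page."""
    subject = escape(root.findtext("mi:subject", namespaces=NS))
    creator = escape(root.findtext("mi:creator", namespaces=NS))
    return f"<h1>{subject}</h1><p>{creator}</p>"


def to_caption(root):
    """Render the same record as a one-line plain-text caption."""
    return "{}, {} ({})".format(
        root.findtext("mi:creator", namespaces=NS),
        root.findtext("mi:subject", namespaces=NS),
        root.findtext("mi:year", namespaces=NS),
    )


record = ET.fromstring(SOURCE)
print(to_html(record))
print(to_caption(record))
```

The data are stored once; each output channel only adds its own display rules.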
Finally, XML can be used to create new languages. For example, the Wireless Markup Language (WML), which is used to mark up Internet applications for handheld devices, is written in XML.
Ontological concepts
The goal of the FMS project is to make metadata of the museums’ textiles collections semantically interoperable on the Web. In order to achieve such interoperability, an ontology is being designed that describes the common (lower-level) ontological concepts in this domain of knowledge.⁴

Ontology
In the Semantic Web architecture, the semantic relationships are not embedded but explicitly represented by an ontology, or rather an interrelated set of ontologies. In fact, the wide array of information residing on the Web and the perceived need to make it more machine-processable have acted as a strong impetus for the development of ontology languages.

Yet what is an ontology? An abstract definition of an ontology is that it describes a formal, shared conceptualisation of a particular domain of interest, for example, cultural heritage objects held in art museums. In particular, an ontology allows for constraining, expressing and analysing the intended meaning of the shared vocabulary of concepts and relations in a domain of knowledge.⁵
If these concepts and relations are formalised to a high degree, the domain has at hand a major building block for developing semantically aware information systems. With Semantic Web technologies, the domain ontology can be made available on the network, cross-referenced with upper-level and other domain ontologies,⁶ and remote applications or intelligent software agents can refer to it when they interact to provide a certain information service.

4 The available documents (in English) on the FMS initiative state that their ontology is being created using RDF Schema (RDFS). To develop a fully fledged ontology, advanced languages such as DAML+OIL or Web Ontology Language (OWL) would be required.
5 For more elaborate and formal descriptions, see Tom Gruber: What is an Ontology? (1995), http://www-ksl.stanford.edu/kst/what-is-an-ontology.html; Nicola Guarino: Ontology-Driven Conceptual Modelling, part 1-3 (2002), http://ontology.ip.rm.cnr.it/Tutorials
The degree of formalisation of concepts and their relations varies considerably between different domains of knowledge. At the lower end one finds lexicons and simple taxonomies (i.e. an ordered classification system where terms are related hierarchically). At the middle level one might place thesauri, i.e. controlled vocabularies that are structured to show relationships between terms and concepts and, for example, allow for retrieving them from a database. At the high end of formalisation of knowledge there are axiomatised logic theories. Such theories include rules to ensure the well-formedness and logical validity of statements expressed in the language of the scientific discipline.
In the cultural heritage sector, a powerful IT-supported example of a hierarchical classification system is Iconclass. This supports the documentation of images, in particular art historical images, by providing a systematic collection of 28,000 ready-made definitions of objects, persons, events, situations and abstract ideas that can be the subject of an image. The definitions consist of an alphanumeric classification code and its textual correlate.[7]
For example, for the image we have described in XML in graphic 1, the Iconclass definition is '71A3421 Eve emerges from Adam's body'. The Medieval Illustrated Manuscripts Website of the Koninklijke Bibliotheek, The Hague, has an Iconclass Browser in place [8] that provides the hierarchical path for this concept in the classification system.
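The hierarchical path for this notation can be sketched in a few lines of Python. This is only an illustration, assuming that each broader concept is a prefix of the code, which is what the path listed for 71A3421 in this primer suggests; full Iconclass notations have further rules that are not handled here. The function name `hierarchical_path` and the `LABELS` table are made up for the example, and the labels are copied from that path.

```python
# Labels copied from the hierarchical path given for 71A3421 in this primer.
LABELS = {
    "7": "Bible",
    "71": "Old Testament",
    "71A": "Genesis from the creation to the expulsion from paradise "
           "and later years of Adam and Eve",
    "71A3": "creation of man, the Garden of Eden",
    "71A34": "creation of Eve",
    "71A342": "Eve is fashioned from Adam's rib",
    "71A3421": "Eve emerges from Adam's body",
}


def hierarchical_path(notation):
    """Return every prefix of the notation, broadest concept first.

    Assumption: one character is added per level, as in the example path.
    """
    return [notation[:i] for i in range(1, len(notation) + 1)]


for code in hierarchical_path("71A3421"):
    # Fall back to a placeholder for codes that are not in the small table.
    print(code, LABELS.get(code, "(label not listed)"))
```

A browser built on such a table only needs the notation itself to show the way from the most general concept down to the specific subject.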
An outstanding example of a controlled vocabulary is the Art & Architecture Thesaurus (AAT), one of the Getty Research Institute's Vocabulary Databases. It is a structured vocabulary of more than 125,000 terms and other information about concepts that are used for describing fine art, architecture, decorative arts, archival materials and material culture.[9]
The W3C's Resource Description Framework (RDF) Schema Specification 1.0, in a section on its scope, mentions concept navigation and states: 'Thesauri and library classification schemes are well known examples of hierarchical systems for representing subject taxonomies in terms of the relationships between named concepts. The RDF Schema specification provides sufficient resources for creating RDF models that represent the logical structure of thesauri (and other library classification systems).'[10]
Yet for realising a full-blown cultural heritage ontology for the Semantic Web, there are currently limitations on both sides. On the one hand, hierarchical classification systems and structured vocabularies do not lend themselves easily to rich inter-linking of conceptual 'trees'.
A major step further in this direction is the CIDOC object-oriented Conceptual Reference Model (CRM).[11] This provides an ontology of 81 classes and 130 properties which describes, in a formal language, concepts and relations relevant to the documentation of cultural heritage.
On the other hand, RDF Schema has limitations when it comes to expressing complex ontological relationships. New languages based on description logics are being developed. These include DAML+OIL and the upcoming Web Ontology Language (OWL), which are capable of fully describing ontologies.
Also worth highlighting is that tools for ontology building are proliferating at the present time. These ontology editors need to be carefully assessed, as their capabilities differ considerably.[12]
7 Bible
71 Old Testament
71A Genesis from the creation to the expulsion from paradise and later years of Adam and Eve
71A3 creation of man, the Garden of Eden (Genesis 1:26-2)
71A34 creation of Eve
71A342 Eve is fashioned from Adam's rib
71A3421 Eve emerges from Adam's body
6 Upper-level ontologies describe the basic concepts and relationships invoked when information about any domain is expressed in natural language.
7 For in-depth information see the official Iconclass Website, http://www.iconclass.nl
8 Medieval Illustrated Manuscripts Website, http://www.kb.nl/kb/manuscripts/browser. The subject access system for the Website was conceived by Mnemosyne Partners, building on the Iconclass classification system and technologies. See the valuable information they provide at httpwwwmnemosyneorg businessmsstempexhtml
9 A detailed description of the AAT is provided at http://www.getty.edu/research/tools/vocabulary/aat/about.html
10 http://www.w3.org/TR/2000/CR-rdf-schema-20000327
11 See the information box at the end of the Forum discussion and the sources mentioned in the Semantic Web Terms and Reading List.
12 Michael Denny, Ontology Building: A Survey of Editing Tools (06-11-2002), http://www.xml.com/pub/a/2002/11/06/ontologies.html
32 DigiCULT
Semantic Transformation 1: The RDF Data Model
Progressing towards semantic interoperability, the metadata in the XML documents are now transformed into RDF statements corresponding to the RDF data model. With these so-called RDF 'triples', the XML metadata elements are mapped to the RDF classes and properties which are defined by the RDF Schema of the FMS initiative (see section 3).
'XML is nothing more than a way to standardize data formats. This is not to underplay XML's importance. A data-format standard makes all of the more glamorous technologies possible, and RDF is the leading example of the benefit that comes once the data format has been standardized. Many proclaim that RDF is really the XML's killer app, and with good reason. Despite all this, RDF remains somewhat obscure. This is mainly because, at its core, RDF is very abstract, very dry and very academic.'
Uche Ogbuji, An introduction to RDF (2000), http://www-106.ibm.com/developerworks/library/w-rdf/?dwzone=xml
RDF Data Model
In order to make Web resources semantically interoperable, we need resources that provide machine-understandable information about themselves. In the Semantic Web architecture these statements are built by using the Resource Description Framework (RDF). RDF defines a data model for the statements, describing typed relationships between uniquely identified resources. RDF distinguishes between:
| resources: familiar examples are a Web page, electronic document or digital image, but in RDF entities that are not 'network retrievable', e.g. museums, curators or bound medieval manuscripts, can also be resources;
| properties: these identify a specific aspect, characteristic, attribute or relation used to describe the resource;
| statements: these associate a value for a named property with the resource.
Hence RDF provides a model for describing relationships between resources in terms of named properties and values. The RDF data model intrinsically supports only binary relations. Its base element is the 'triple', which takes the form of subject, predicate, object: a resource (the subject) is linked to another resource (the object) through an arc labelled with a third resource (the predicate). The semantics of a triple clearly depends on the property used as predicate.
A convenient way to visualise this is to draw nodes for subject and object and an arrow between them for the predicate (see graphic 2). In this labelled directed graph, subject and predicate (property) are Uniform Resource Identifiers (URIs) and the object is either a URI or a literal (which is drawn as a box). Everything in RDF can be represented by a graph with nodes and arcs, and the data model allows for using the same URI as a node and as an arc label. To represent RDF statements in a machine-processable way, RDF builds on XML. With RDF/XML, a specific XML markup language, RDF information can be represented and exchanged between machines.
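The triple model described above can be sketched with plain Python tuples. The sketch below is illustrative only: it writes the statements in the line-based N-Triples notation rather than the RDF/XML syntax mentioned in the text, the '#' separator in the mi namespace is an assumption, and the image URI in the last statement is hypothetical.

```python
from collections import namedtuple

RDFS = "http://www.w3.org/2000/01/rdf-schema#"
MI = "http://www.m-i.org/schemas/images#"  # assumption: '#' ends the namespace

Triple = namedtuple("Triple", "subject predicate object")
Literal = namedtuple("Literal", "value")  # objects that are plain strings

# The two subclass statements of the image collection example.
triples = [
    Triple(MI + "ColumnMiniature", RDFS + "subClassOf", MI + "Miniature"),
    Triple(MI + "Miniature", RDFS + "subClassOf", MI + "Image"),
]

# A statement whose object is a literal; the subject URI is hypothetical.
creator = Triple("http://www.m-i.org/images/example-1",
                 MI + "hasCreator", Literal("Alexander Master"))


def to_ntriples(triple):
    """Serialise one triple as an N-Triples line (simplified escaping)."""
    if isinstance(triple.object, Literal):
        escaped = triple.object.value.replace("\\", "\\\\").replace('"', '\\"')
        obj = '"%s"' % escaped
    else:
        obj = "<%s>" % triple.object
    return "<%s> <%s> %s ." % (triple.subject, triple.predicate, obj)


for t in triples + [creator]:
    print(to_ntriples(t))
```

Subject and predicate are always URIs here, while the object is either a URI or a literal, which mirrors the node, arc and box distinction of the graph drawing.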
Graphic 2: RDF Data Model
Triple 1: Subject ColumnMiniature, Predicate rdfs:subClassOf, Object Miniature
Triple 2: Subject Miniature, Predicate rdfs:subClassOf, Object Image
With these two triples we state that ColumnMiniature is a subclass of Miniature and that Miniature is a subclass of Image.
The predicate in our statements is the rdfs:subClassOf property, which is predefined in the RDF Schema namespace http://www.w3.org/2000/01/rdf-schema#
The classes Image, Miniature and ColumnMiniature would also need to be defined in our RDF Schema namespace http://www.m-i.org/schemas/images
For details on how to use RDF Schema for defining your domain ontology see section 3. In a nutshell, RDF Schema is a 'higher-level' language which is itself defined using RDF.
Semantic Transformation 2: Creating and Validating the RDF Statements
In the mapping process an editor tool (in the FMS case a tool developed by the project team, called Meedio) receives as input the XML documents and assists in transforming them into semantically valid RDF statements (instance descriptions). The tool serves as instance editor, which provides a convenient way of finding and selecting from the XML metadata elements correct instance values for a particular property. The editor tool also serves as semantic metadata validator. When the museum cataloguer saves the set of RDF statements corresponding to an XML document, the semantics of these statements are validated against the property constraints of the FMS ontology. The result of a successful mapping and validation process is a unique set of RDF triples called the RDF card.
RDF Mapping Rules
When RDF is used to define the meaning of XML metadata elements, a set of mapping rules is created. A mapping rule is a template of RDF triples where XPath expressions are used to identify the actual element values. XPath is a language for addressing parts of an XML document and was designed for use in XML parsing software (XSLT, XPointer and others).
When applying such a rule to an XML document, the XPath expressions are instantiated with matching element values. If the rule matches, the RDF template evaluates to a set of RDF triples where XPath expressions are substituted by the corresponding values of the XML elements.
For example, by applying the template rule
<image5kb78d38itype, mi:hasCreator, image5kb78d38icreator>
to the XML document described in the section on XML, the following result would be obtained:
<Image Miniature, mi:hasCreator, 'Alexander Master'>
mi:hasCreator is an example of an RDF property. Such properties are explained in section 3, RDF Schema.
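The idea of a mapping rule as a triple template with XPath slots can be sketched with Python's standard library, whose ElementTree module supports a limited XPath subset. Everything specific in this sketch is an assumption made for illustration: the XML record and its element names, the rule format, and the function `apply_rule` are not taken from the FMS tool.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML record; element names are assumed for this sketch.
XML_DOC = """<image id="example-1">
  <type>Miniature</type>
  <creator>Alexander Master</creator>
</image>"""

# A rule is a triple template: XPath for the subject slot, a fixed
# property name, and XPath for the object slot.
RULE = ("./type", "mi:hasCreator", "./creator")


def apply_rule(rule, root):
    """Instantiate the template; return [] when the rule does not match."""
    subject_path, predicate, object_path = rule
    subject = root.findtext(subject_path)
    obj = root.findtext(object_path)
    if subject is None or obj is None:
        return []
    return [(subject.strip(), predicate, obj.strip())]


root = ET.fromstring(XML_DOC)
result = apply_rule(RULE, root)
print(result)
```

If either XPath expression finds no element, the rule simply produces no triples, which corresponds to a rule that does not match the document.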
Note: Due to the limited space permitted, we do not address issues of term mapping. This is an important aspect of the mapping and validation process carried out in the FMS project. Working with metadata from different museums, they need to deal with partly different terminologies. Their technical solution to synonymous terms (i.e. different terms referring to the same concepts) is to attach synonym sets to the FMS ontology classes. With situations where polysemous terms occur (i.e. the same terms refer to different concepts), the editor tool cannot cope, and the cataloguer needs to select the correct interpretation.
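How synonym sets attached to ontology classes might resolve terms, and where polysemy forces a manual choice, can be sketched as follows. The class names, terms and function names are all hypothetical; the FMS documents do not describe their data structures at this level.

```python
# Hypothetical synonym sets attached to hypothetical ontology classes.
SYNONYM_SETS = {
    "ex:Carpet": {"carpet", "rug"},
    "ex:Sweater": {"sweater", "jumper", "jersey"},
    "ex:KnittedFabric": {"jersey", "knit fabric"},
}


def candidate_classes(term):
    """Return all classes whose synonym set contains the term."""
    key = term.strip().lower()
    return sorted(cls for cls, terms in SYNONYM_SETS.items() if key in terms)


def resolve(term):
    """Return the class for an unambiguous term, or None.

    None covers both unknown and polysemous terms; in the polysemous
    case the cataloguer has to choose among candidate_classes(term).
    """
    candidates = candidate_classes(term)
    return candidates[0] if len(candidates) == 1 else None


print(resolve("rug"))                # synonym of 'carpet'
print(candidate_classes("jersey"))   # polysemous: two candidate classes
```

Synonyms map cleanly onto one class, while a polysemous term yields several candidates and is left for a human decision, as the note describes.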
Semantic Transformation 3: The RDF Schema (RDFS)
The shared ontology for the textiles domain is created by using Resource Description Framework Schema (RDFS). An RDF Schema is a tool for indicating the classes of resources one wants to describe, as well as for defining the properties used to describe those resources. Furthermore, class/sub-class relationships and property/sub-property relationships can be defined. The museums are mapping their metadata to the classes and properties defined by the RDF Schema of the FMS initiative. Thereby they are making the meaning of the metadata explicit and representing them in a harmonised, uniform way.
RDF Schema
In section Semantic Transformation 1 we have described the data model provided by RDF for expressing statements about Web resources. But we also need a vocabulary for the RDF statements, namely classes and properties defined with RDF Schema (RDFS).
In brief, the RDF Schema mechanism provides a pre-defined vocabulary, a basic type system that can be used in creating domain-specific schemas. Its role is to allow for declaring metadata properties (e.g. for 'type', 'subject' or 'creator'), to define the classes of resources they may be used with, to restrict possible combinations, and to detect violations of those restrictions.
Defining classes
With RDF Schema (RDFS), Web resources can be defined as instances of one or more classes. In addition, classes can be organised in a hierarchical fashion. As we hold a collection of digital images drawn from illustrated medieval manuscripts, we first need to define a class of things that are images. In RDF Schema a class is any resource having an rdf:type property whose value is the RDFS-defined resource rdfs:Class.
So, using the basic RDF data model, we define:
mi:Image [resource] rdf:type [property] rdfs:Class [value]
The self-defined prefix mi (for medieval images) stands for the URI reference of our RDF Schema namespace, http://www.m-i.org/schemas/images
In our image collection we have various special kinds of digitised images such as column miniatures, decorated initials, schematic drawings, etc. To distinguish, for example, the miniatures, first we need to define a general class Miniature and subclasses of miniatures, e.g. a subclass ColumnMiniature:
mi:Miniature rdf:type rdfs:Class
mi:ColumnMiniature rdf:type rdfs:Class
Secondly, we need to define that mi:ColumnMiniature is a subclass of mi:Miniature and that mi:Miniature is a subclass of mi:Image, for which we use the predefined rdfs:subClassOf property:
mi:Miniature rdfs:subClassOf mi:Image
mi:ColumnMiniature rdfs:subClassOf mi:Miniature
As the rdfs:subClassOf property is transitive, this means that mi:ColumnMiniature is also implicitly a subclass of mi:Image.
Graphic 2 on page 32 visualises this with the nodes and arcs of the basic RDF data model.
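The effect of transitivity can be made concrete with a short Python sketch that derives all implicit superclasses from the two stated rdfs:subClassOf relationships. The dictionary layout and the function name `superclasses` are choices made for this illustration, not part of RDF Schema.

```python
# Directly stated rdfs:subClassOf relationships (class -> direct superclasses).
SUBCLASS_OF = {
    "mi:ColumnMiniature": {"mi:Miniature"},
    "mi:Miniature": {"mi:Image"},
}


def superclasses(cls, direct=SUBCLASS_OF):
    """Return all direct and implied superclasses of cls."""
    found = set()
    stack = [cls]
    while stack:
        for parent in direct.get(stack.pop(), ()):
            if parent not in found:  # also guards against cycles
                found.add(parent)
                stack.append(parent)
    return found


print(sorted(superclasses("mi:ColumnMiniature")))
```

Only two statements were written down, yet the query for mi:ColumnMiniature also returns mi:Image, which is the implicit subclass relationship mentioned above.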
Defining properties
In order to make the meaning of our metadata (i.e. 'type') explicit, we need to be capable of declaring specific properties that characterise the classes of things we hold at http://www.m-i.org, e.g. digital images of medieval column miniatures.
Basically, RDF Schema defines properties in terms of the classes of resources to which they apply. This is the role of the rdfs:domain and rdfs:range mechanisms.
rdfs:range
The range constraint defines the class or set of classes whose instances can be values of a particular property. If we want to define the property mi:hasType, we must describe this resource (which we locate at http://www.m-i.org/schemas/images) with an rdf:type property whose value is rdf:Property:
mi:hasType [resource] rdf:type [property] rdf:Property [value]
The following RDF statements indicate that mi:ColumnMiniature is a class, mi:hasType is a property, and RDF statements using the mi:hasType property have instances of mi:ColumnMiniature as values:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:range mi:ColumnMiniature
rdfs:domain
The domain constraint restricts the set of classes whose instances may have a particular property attached to them. If we want to indicate that the property mi:hasType applies to instances of class mi:ColumnMiniature, we would write:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:domain mi:ColumnMiniature
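A validator that reads domain and range as constraints, in the way described here, could look roughly like the following Python sketch. The domain and range chosen for mi:hasCreator, the class mi:Artist and the ex: instances are hypothetical; the sketch also assumes single inheritance and one type per resource to stay short.

```python
# Schema facts; mi:Artist and the constraints on mi:hasCreator are hypothetical.
SUBCLASS_OF = {"mi:ColumnMiniature": "mi:Miniature", "mi:Miniature": "mi:Image"}
PROPERTIES = {"mi:hasCreator": {"domain": "mi:Image", "range": "mi:Artist"}}

# Instance typing (rdf:type) for two hypothetical resources.
TYPES = {"ex:eve": "mi:ColumnMiniature", "ex:alexanderMaster": "mi:Artist"}


def is_a(resource, cls):
    """True if the resource's type is cls or one of its subclasses."""
    current = TYPES.get(resource)
    while current is not None:
        if current == cls:
            return True
        current = SUBCLASS_OF.get(current)  # walk up the class hierarchy
    return False


def violations(statement):
    """Return a list of constraint violations for one (s, p, o) statement."""
    subject, prop, obj = statement
    rule = PROPERTIES.get(prop)
    if rule is None:
        return ["unknown property %s" % prop]
    problems = []
    if not is_a(subject, rule["domain"]):
        problems.append("%s is not in the domain %s" % (subject, rule["domain"]))
    if not is_a(obj, rule["range"]):
        problems.append("%s is not in the range %s" % (obj, rule["range"]))
    return problems


print(violations(("ex:eve", "mi:hasCreator", "ex:alexanderMaster")))
print(violations(("ex:alexanderMaster", "mi:hasCreator", "ex:eve")))
```

The first statement passes because a column miniature is, through the subclass chain, an image; the reversed statement breaks both the domain and the range constraint.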
Benefits of RDF
In a SearchWebServices.com definition of RDF some benefits of RDF are mentioned:
| 'By providing a consistent framework, RDF will encourage the providing of metadata about Internet resources.
| Because RDF will include a standard syntax for describing and querying data, software that exploits metadata will be easier and faster to produce.
| The standard syntax and query capability will allow applications to exchange information more easily.
| Searchers will get more precise results from searching, based on metadata rather than on indexes derived from full text gathering.
| Intelligent software agents will have more precise data to work with.'[13]
This is a well-crafted listing of RDF benefits, from provision and exchange of better metadata to agents working with them, hopefully for the benefit of humans. But, as explicitly stated by SearchWebServices.com, these are only potential benefits, i.e. they depend on the level of actual uptake of RDF.
13 whatis.com, searchWebServices.com Definitions - Resource Description Framework, http://searchwebservices.techtarget.com/sDefinition/0,,sid26_gci213545,00.html (last updated July 27, 2001)
Generating and Using the Knowledge Space
The RDF cards represent the original XML documents at the semantic level. The union of such RDF cards constitutes a knowledge base which is a harmonised semantic representation of the underlying heterogeneous databases.
However, so far the RDF instance descriptions have not left the museum. The museum has complete control of the information it wants to publish, and it does not need to allow the FMS system access to its internal database system. The RDF data are placed in a public directory on the museum's WWW server.
The Web crawler of the FMS system harvests the instance descriptions from the different museums, and the system combines them into an RDF repository. This repository is a large semantic graph that consists of the shared ontology and metadata.
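Combining harvested RDF cards into one repository amounts to taking the union of triple sets. The Python sketch below shows this under simplifying assumptions: triples are plain tuples with prefixed names, and the museum identifiers and item names are hypothetical.

```python
# Shared ontology triples (from the image collection example).
ONTOLOGY = {
    ("mi:ColumnMiniature", "rdfs:subClassOf", "mi:Miniature"),
    ("mi:Miniature", "rdfs:subClassOf", "mi:Image"),
}

# Hypothetical RDF cards harvested from two museums.
CARD_A = {
    ("museumA:item1", "rdf:type", "mi:ColumnMiniature"),
    ("museumA:item1", "mi:hasCreator", "Alexander Master"),
}
CARD_B = {
    ("museumB:item7", "rdf:type", "mi:Miniature"),
    # A card may repeat an ontology statement; the union keeps one copy.
    ("mi:Miniature", "rdfs:subClassOf", "mi:Image"),
}


def build_repository(ontology, cards):
    """Merge the shared ontology and all RDF cards into one triple set."""
    repository = set(ontology)
    for card in cards:
        repository |= set(card)
    return repository


repository = build_repository(ONTOLOGY, [CARD_A, CARD_B])
print(len(repository), "triples in the repository")
```

Because a graph is a set of triples, merging needs no special alignment step as long as all museums use the same URIs for the shared classes and properties.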
How does a user now search and navigate in this knowledge space? In the FMS system this is implemented by a server-side software called Ontogator. Based on the semantic graph, this software dynamically generates semantic linkages for the user's Web browser.
One way of using the FMS system is view-based filtering. The user can select classes of resources from the ontology, and the system finds the instances that match the selected class restrictions. By constraining classes (views) further, the collection instance data searched for are eventually found.
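View-based filtering can be pictured as intersecting class restrictions, with each selected class also covering its subclasses. The following Python sketch is not the Ontogator implementation; the textile classes, the item annotations and the function names are hypothetical.

```python
# Hypothetical class hierarchy with two views: object type and material.
SUBCLASS_OF = {
    "ex:Rya": "ex:Carpet",
    "ex:Carpet": "ex:Textile",
    "ex:Wool": "ex:Material",
    "ex:Linen": "ex:Material",
}

# Hypothetical collection items and the classes they are annotated with.
ANNOTATIONS = {
    "museumA:item1": {"ex:Rya", "ex:Wool"},
    "museumA:item2": {"ex:Carpet", "ex:Linen"},
    "museumB:item7": {"ex:Rya", "ex:Linen"},
}


def covers(selected, actual):
    """True if actual equals the selected class or is one of its subclasses."""
    while actual is not None:
        if actual == selected:
            return True
        actual = SUBCLASS_OF.get(actual)
    return False


def view_filter(selected_classes):
    """Return the items that satisfy every selected class restriction."""
    return sorted(
        item for item, classes in ANNOTATIONS.items()
        if all(any(covers(sel, c) for c in classes) for sel in selected_classes)
    )


print(view_filter(["ex:Carpet"]))             # broad view: all three items
print(view_filter(["ex:Carpet", "ex:Wool"]))  # narrowed by a second view
```

Each additional selection narrows the result set, which is the sense in which constraining the views further eventually leads to the instances sought.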
The software also supports topic-based navigation by providing semantic links between topics of interest, the creation of which is based on the collection domain ontology and the related metadata of the collection records. This means that the links also provide the user with an impression of the wider context and pragmatics of the objects in the museums' collections.
From human users to software agents
As described in the Finnish Museums on the Semantic Web example, the RDF repository is a large semantic graph that consists of the shared ontology and metadata of the participating museums. Such a repository can be queried, and the results, a set of pointers to the relevant resources, can be accessed using Web browsers. The opportunities provided by a system like the one developed by the FMS initiative (e.g. topic-based navigation) are at present restricted primarily to human users.
The Semantic Web vision includes intelligent software agents which 'understand' semantic relationships between Web resources and seek relevant information as well as perform transactions for humans.[14]
This software would be capable of autonomous action, i.e. could run without direct human control or constant supervision, and ideally is very flexible in doing this. Characterisations of this flexibility include actions that are 'reactive', 'proactive' and 'social' (see below).
While the basic idea of agents is very intuitive and appealing, the actual theory is complex, the tools are immature, and the solutions small and prototype-based. In fact, as a parallel distributed systems technology, agents belong to the most complex class of software technology.
However, this primer will conclude with a summary of what an intelligent software agent is and what such software would generally be capable of doing. This should also serve as an indication of how great the challenge for research and technological development is to make the full Semantic Web vision a reality.
Intelligent Software Agents
The following definitions are taken from Michael Wooldridge's introduction to multiagent systems.[15]
Agent
'An agent is a computer system capable of autonomous action in some environment.'
Intelligent agent
'An intelligent agent is a computer system capable of flexible autonomous action in some environment.'
Flexible autonomous action
'By flexible autonomous action we mean reactive, proactive, social.'
| Reactivity: 'A reactive system is one that maintains an ongoing interaction with its environment, and responds to changes that occur in it (in time for the response to be useful).'
| Proactiveness: 'An agent serves a purpose and therefore exhibits goal-directed behaviour, including the capacity to recognise opportunities for useful courses of action.'
| Social ability: 'Social ability in agents is the ability to interact with other agents (and possibly humans) via some kind of agent communication language, and perhaps cooperate with others.'
Desirable further properties of agents are:
| Mobility: the ability to move around an electronic network.
| Rationality: an agent will act in such a way that it does not prevent itself from achieving its goals (as far as this is possible with a limited set of beliefs representing its world knowledge).
| Learning: an agent will improve its performance over time.
14 T. Berners-Lee, J. Hendler, O. Lassila, Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html
15 M. Wooldridge, An Introduction to Multiagent Systems, Chichester: Wiley, 2002, and http://www.csc.liv.ac.uk/~mjw/pubs/imas
Resources
In the primer no references are made to the documents of the World Wide Web Consortium (W3C). All relevant W3C recommendations can be found at http://www.w3c.org
Of the wealth of introductory materials on XML and RDF available on the Web, the following in particular are useful to consult for further details:
http://www.w3schools.com/xml
http://www.w3schools.com/schema
http://www.w3.org/TR/rdf-primer
K. S. Candan, H. Liu, R. Suvarna, Resource Description Framework: Metadata and its Applications. In: SIGKDD Explorations, Vol. 3/1 (2001), 6-19
Pierre-Antoine Champin, RDF Tutorial (2001), http://www710.univ-lyon1.fr/~champin/rdf-tutorial/rdf-tutorial.html
S. Decker, M. P. Mitra, S. Melnik, Framework for the Semantic Web: An RDF Tutorial, http://www.ida.liu.se/~asmpa/courses/sweb/rdf/rdf_tutorial.pdf
Graphic 3: Set-up of the Finnish Museums on the Semantic Web System
The Finnish Museums on the Semantic Web: Overview of the System's Set-up
[Diagram. Labelled components: user client (WWW browser; topic-based navigation, view-based filtering); server (Ontogator software; RDF database holding the semantic graph, i.e. the knowledge space of shared ontology and metadata); Web crawler; RDF instances (RDF Schema: semantic interoperability); metadata editor; XML repositories (XML Schema: syntactic interoperability); collection databases 1, 2 ... n (relational schemas, DBMS).]
THE DARMSTADT FORUM
PARTICIPANTS
Wernher Behrendt, Salzburg Research, Austria
Wernher Behrendt is a Senior Researcher at the Sun Technology and Research Excellence Center at Salzburg Research (Austria), working on multimedia middleware and interoperation issues. He holds an MSc in Cognitive Science from Manchester University and has more than 10 years' experience in near-to-market IT research. From 1989 to 1995 he was a Senior Research Associate in the Informatics Department at Rutherford Appleton Laboratory (UK), working on embedded knowledge based systems in distributed multimedia presentation systems. From 1995 to 1998 he was a Senior Research Associate at Cardiff University (UK), working on interoperation between heterogeneous information systems. Mr Behrendt has held courses in Computer Science and has worked on projects ranging from software engineering methods and quality assurance to legacy system reengineering using migration methods and distributed systems middleware.
E-mail: wernher.behrendt@salzburgresearch.at
Paolo Buonora, Archivio di Stato di Roma, Italy
Paolo Buonora holds a degree in Philosophy from the University of Rome 'La Sapienza' (1976). He worked in the Italian State Archive Administration from 1978, where he was first involved in editing the Guida Generale degli Archivi di Stato italiani. From 1986 he worked in the Soprintendenza archivistica per il Lazio, surveying audiovisual archives, municipal archives; from 1989 to 1991 at Perugia University, engaged in doctoral research in 'Urban and rural history'; and from 1991 to 1994 again in the Soprintendenza archivistica per il Lazio. After 1994 he worked in the Archivio di Stato di Roma, where he was responsible for the photograph service and several working groups on informatics application in archival documentation. From 1997 until the present time he has planned and directed the Imago II project in the Archivio di Stato di Roma.
See httpwwwasrmarchivbeniculturaliitsidimagoIMAGOIIenhtml
Samuel Cruz-Lara, University of Nancy 2 and LORIA, France
Samuel Cruz-Lara obtained a Master's degree in Computer Science in 1984 (University Henri Poincaré, Nancy 1) and a PhD degree in Computer Science in 1988 (National Polytechnic Institute of Lorraine). The central topic of his PhD thesis was the generation of integrated development environments by using attribute grammars. He is currently Associate Professor at the University of Nancy 2 (Institute of Technology, Computer Science Department) and permanent Researcher at LORIA (Lorraine Laboratory for Research in Computer Science and its Applications - UMR 7503 - CNRS - INRIA - Universities of Nancy). He is a member of the 'Language and Dialogue' team and has conducted several research activities on distributed software architectures and textual linguistic resources management. He is currently working in the context of distributed architectures and multimedia resources management. Dr Cruz-Lara has participated in several projects, in particular CNRS/SILFIDE and MLIS-ELAN, and he is at present co-leader of the 'Digital Museum' project. This is a joint project between LORIA and the National Chi-Nan University. The 'Digital Museum' project is sponsored by the National Science Council of the Republic of China (Taiwan) and supported by INRIA (France).
Costis Dallas, Critical Publics SA, Greece
Costis Dallas is Chairman and Senior Researcher of Critical Publics (http://www.criticalpublics.com), a group of companies active in the field of strategic communications, creative design and technology. He is currently a Lecturer in the Department of Communication and Mass Media of Panteion University and has over 15 years of research and professional experience in hypermedia applications, human factor issues and cultural information systems. Dr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA, co-founder and Chair of the Multimedia Working Group of the International Council of Museums (CIDOC/ICOM), Head of Documentation and Systems of the Benaki Museum, General Director of the Foundation of the Hellenic World, Special Secretary of the Greek Ministry of Education in charge of libraries, archives and instructional technologies, and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: info@criticalpublics.com
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (http://www.nl.adlibsoft.com), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that he joined Databasix, producer of the ADLIB software. In 1991 Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was renamed later as ADLIB Information Systems.
ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include:
CIMI Dublin Core test bed - ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database in addition to the 'standard' HTML/XML access.
SPECTRUM-XML - A project of the UK mda, together with a team of software vendors, to produce a schema for exchange of data which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software.
Internet Gelderse Musea (http://www.igem.nl) and Maritiem Digitaal (http://www.maritiemdigitaal.nl) - Two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.
E-mail: bert@nl.adlibsoft.com
Nicola Guarino, Italian National Research Council, Italy
Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991, and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS) and associated editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: Nicola.Guarino@ladseb.pd.cnr.it
Janneke van Kersen, Digital Heritage Association, The Netherlands
Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992 she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing, and more recently she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University.
Since October 1999 Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata, and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations such as museums, archives, archaeological organisations, monuments and special collections of libraries in a context-rich and XML-based environment. Interoperability from both a technical and an organisational perspective is therefore a major issue. The Association offers access to heritage information to the general public at http://www.cultuurwijzer.nl and customised access for the educational field at http://www.cultuurwijs.nl
E-mail: janneke.vankersen@den.nl
Marco Meli, EDW International, Italy
Marco Meli is CEO and co-founder of EDW International (http://www.edw-international.com), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia, VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meli@edw.it
Paul Miller, UKOLN, UK
Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, http://www.ukoln.ac.uk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK's Further and Higher Education Funding Councils and Resource: The Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI), and the Metadata Working Group of the UK Government's Office of the e-Envoy.
E-mail: P.Miller@ukoln.ac.uk
Frank Nack, CWI, The Netherlands
Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on 'The Application of Video Semantics and Theme Representation for Automated Film Editing' at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document, and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: FrankNackcwinl
Franco Niccolucci, Florence University, Italy
Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference. See also http://www.geog.port.ac.uk/hist-bound/people/niccolucci.htm
Seamus Ross, HATII, University of Glasgow, UK
Dr Seamus Ross is Director of Glasgow University's Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Access Network) (IST-2001-32706), a European Union funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously he was Assistant Secretary for Information Technology at the British Academy, and before that he worked for a company specialising in expert systems and software development, first as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group's PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: SRosshatiiartsglaacuk
Andrea Scotti, Institute & Museum for the History of Science, Italy
Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995 he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.
Since 1996 Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza and the National Central Library, and under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (010201-300703), funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies. See http://cweb.inria.fr/Projects/Mesmuses
E-mail: scottiangalileoimssfirenzeit
DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over the period of 30 months (03/2002-08/2004).
In order to encourage early take-up, DigiCULT produces seven Thematic Issues and three Technology Watch Reports, along with the newsletter DigiCULT.Info.
DigiCULT draws on the results of the strategic study 'Technological Landscapes for Tomorrow's Cultural Economy (DigiCULT)' that was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications), in 2000 and completed in 2001.
Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at http://www.digicult.info
For further information on DigiCULT please contact the team of the project co-ordinator:
Mr Guntram Geser, guntramgesersalzburgresearchat, Phone: +43-(0)662-2288-303
Mr John Pereira, johnpereirasalzburgresearchat, Phone: +43-(0)662-2288-247
Salzburg Research Forschungsgesellschaft, Jakob-Haringer-Str. 5/III, A - 5020 Salzburg, Austria, Phone: +43-(0)662-2288-200, Fax: +43-(0)662-2288-222, http://www.salzburgresearch.at
Project Partner: HATII - Humanities Advanced Technology and Information Institute, University of Glasgow, http://www.hatii.arts.gla.ac.uk. Contact: Mr Seamus Ross, srosshatiiartsglaacuk
The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom
DIGICULT PROJECT INFORMATION
DigiCULT Thematic Issue 1 - Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum, held in Barcelona on May 6th, 2002, in the context of the DLM-Conference 2002.
DigiCULT Thematic Issue 2 - Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum, held in Essen, Germany, on September 3rd, 2002, in the context of the AIIM Conference @ DMS EXPO.
DigiCULT Thematic Issue 3 - Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum, held on January 21st, 2003, at Fraunhofer IPSI, Darmstadt, Germany.
DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum on Learning Objects, which will take place at the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, on July 2nd, 2003.
IMPRINT
This Thematic Issue is a product of the DigiCULTProject (IST-2001-34898)
Authors:
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy
Images: Images for this Thematic Issue have been provided by and are reproduced with kind permission of the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, Netherlands.
Graphics & Layout: Jan Steindl, Salzburg Research
ISBN 3-902448-00-8
Printed in Austria
© 2003
IMAGES
Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I. Image on p. 5 from Fol. 264r, size 80x75; illuminator: Orosius Master a.o.
Bible Historiale, Paris, c. 1320-1340, Volume I. Image on p. 7 from Fol. 2r, size 45x50; illum.: 'Sub-Fauvel' Master
Historic Bible, Utrecht, c. 1430, Volume I. Images on pp. 8, 10, 14, 28 from Fol. 3r, 3v, 4v, 7v, size 55x85 to 60x85; illum.: Alexander Master o.a.
Psalter, Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468. Images on pp. 11, 13: Wednesday Matins, Invitatorium, from Fol. 219r
Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350. Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron) from Fol. 104rb1, 106rb2, 124r, 111ra, size 40x55 to 50x55
Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335. Image on p. 21 from Fol. 4va1, size 45x55
Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460. Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41
Psalter, Normandy, c. 1180. Image on p. 26 from Fol. 3v, size 160x125
Breviary, Cambray (?), c. 1275-1300. Images on pp. 27, 29, 31, 33, 34, 35, 37 from Fol. 211r, size 205x135
© Koninklijke Bibliotheek, The Hague. Used with permission.
model is under review for adoption as an International Organization for Standardization (ISO) publication.
Mr Guarino was enthusiastic: 'I am not biased on this. I am a reviewer on the CHIOS project that supports this proposal for an ISO standard, and I am amazed by the fact that this standard is more or less principled. On the one hand, the authors really strive to get these principled things and, on the other hand, they have extensively accounted for existing practice. It is the result of a large community of work. It is not perfect, but it really is a starting point for this community.'
He affirmed Dr Ross's delighted question: 'So this is an ontology we can borrow?'
And the Italian expert had more to add: 'The question before was "how can we be sure that the principled things can really solve our problems?" I do not have a crisp answer, but I do have some evidence that even a tiny result on the foundational side has a high pay-off. So you do not need to solve all the foundational issues.'
The distinctions between an object and its role, or individual and classes of items, were delicate but, once understood, could lead to significant data improvement, he said, adding: 'Take, for instance, the distinction between object and event. This is only one tiny distinction, but it is so fundamental that once you understand it, you can save time in developing your own application ontology. Tiny conceptual progress does have a high pay-off. This is why I believe it is useful.'
There were still one or two doubts, but CULTOS project co-ordinator Wernher Behrendt tidied up with a daring stance: 'Let me be a heretic for a second. How many of us have an operating system other than Windows? What I am saying is that standardisation often helps. Even if it is not the best standard, it does help get people working together. It is perfectly fair to define a standard for the world now. There will be a lot of discussion, but it will wind down to a few constructs. It is a better method of getting an ontology accepted than having ontologies mushrooming all over the place that must then be integrated.'
The Darmstadt 13 were pleased. They had a model. They had 'Web Services' stepping stones to work across. They weren't going to fall into the trap of insisting just yet that the Semantic Web was important for the heritage sector, but they wanted an education process for unconvinced curators.
They needed someone to make the first ontology move. Seamus Ross suggested asking the J. Paul Getty Trust (http://www.getty.edu). They needed automated tools for testing ontologies. They did not want to be delivered into the entertainment industry, but saw benefit in what University of Florence Associate Professor Franco Niccolucci called 'cultural entertainment'.
So where would the experts put their money, asked Mr Behrendt: 'On there not being any benefits in the Semantic Web for the cultural heritage sector, or there being some benefits in building such things, whatever they may be?'
Amsterdamer Frank Nack was in no doubt: 'It is going to happen. It will probably look very different from how we imagine it right now, but it is going to happen.' His countryman, ADLIB chief Bert Degenhart-Drenth, thought so too: 'We have put our money there. All our applications work with XML.' And Dr van Kersen agreed: 'I would put my money on the Semantic Web.'
That seemed to make it game, set and match.
CIDOC CRM: 'The Semantic Glue'
The 'CIDOC object-oriented Conceptual Reference Model' (CIDOC CRM) was developed by the ICOM/CIDOC Documentation Standards Group. Since September 2000, the CIDOC CRM has been under development as an ISO standard.
'The CIDOC CRM is intended to promote a shared understanding of cultural heritage information by providing a common and extensible semantic framework to which any cultural heritage information can be mapped. It is intended to be a common language for domain experts and implementers to formulate requirements for information systems and to serve as a guide for good practice in conceptual modelling. In this way, it can provide the semantic glue needed to mediate between different sources of cultural heritage information, such as that published by museums, libraries and archives.'
http://cidoc.ics.forth.gr/what_is_crm.html
CHIOS - Cultural Heritage Interchange Ontology Standardization project
Since June 2001, the work of the CIDOC CRM Special Interest Group has been supported by CHIOS, a two-year project which receives funding from the Fifth Framework IST Programme. The CHIOS consortium forms an integral part of the CIDOC CRM Special Interest Group which, by organising shared meetings, represents the interests and requirements of the cultural heritage community to the ISO Working Group (ISO/TC46/SC4/WG9).
http://cidoc.ics.forth.gr/chios_iso.html
Annotation & Authoring
'How to annotate and have fun', http://annotation.semanticweb.org/faq/faq2, and other papers. See also http://annotation.semanticweb.org/tools, Institute for Applied Informatics and Formal Description Methods, University of Karlsruhe, Germany.
The Annotea Project, a 'Live Early Adoption and Demonstration (LEAD)' project of the World Wide Web Consortium (W3C): a collaboration environment with shared annotations, www.w3.org/2001/Annotea
CIDOC CRM
The CIDOC Conceptual Reference Model was an important reference point in the Forum discussion (see also the information box on the CIDOC CRM in the Summary). CIDOC, the Comité International pour la Documentation, is part of the International Council of Museums (ICOM). Its Web site can be found at http://cidoc.ics.forth.gr
A report on CIDOC's work on the CRM is provided in 'The CIDOC Conceptual Reference Model: A Standard for Communicating Cultural Contents' by Nick Crofts, Martin Doerr and Tony Gill, in Cultivate Interactive, Issue 9, February 2003, http://www.cultivate-int.org/issue9/chios
See also their tutorials at http://cidoc.ics.forth.gr/tutorials.html
DAML - DARPA Agent Markup Language
The Defense Advanced Research Projects Agency (DARPA) is the central research and development organisation for the US Department of Defense. Its DAML programme is developing a language and tools to facilitate Semantic Web concepts. http://www.daml.org
'Why Use DAML?', Adam Pease, white paper, Teknowledge, 10 April 2002, http://www.daml.org/2002/04/why.html
DAML+OIL
DAML+OIL is a product of the DARPA Joint United States/European Union ad hoc Agent Markup Language Committee. The committee created a language with the best features of SHOE, DAML, OIL and several other markup approaches. It is a Web ontology language (latest release March 2001) expected to provide a basis for future Web standards for ontologies. See http://www.daml.org/2001/03/daml+oil-index.html
Genesis - The Creation: Stars and Fishes
SEMANTIC WEB TERMS
AND READING LIST A-X
Compiled by Guntram Geser and Michael Steemson
In the Summary we have provided information boxes on projects, but not on the many Semantic Web standards, technologies etc. mentioned in the Forum discussion. The projects included are only a small fraction of many ongoing activities and are related mainly to the cultural and scientific heritage community. Links to many important Semantic Web development projects can be found at their community portal, http://www.semanticweb.org
The following guide points to resources and readings on terms mentioned in the Forum Summary. It is not intended to provide a comprehensive list of Semantic Web materials. Rather, it represents different entry points and levels to this topic.
OIL - Ontology Inference Layer
OIL is a proposal for a Web-based representation and inference layer for ontologies. It uses a layered approach to defining a standard ontology language. Each layer adds functionality and complexity to the previous layer. This is done such that machines that can only process a lower layer can still partially understand high-level ontologies. See http://www.ontoknowledge.org/oil
A white paper on OIL functions: 'An informal description of Standard OIL and Instance OIL' by the On-To-Knowledge group, led by the Department of Computer Science, University of Manchester, UK, 28 November 2000, www.ontoknowledge.org/oil/downl/oil-whitepaper.pdf
Ontologies
Ontology research: Laboratory for Applied Ontology (LOA), Institute of Cognitive Sciences and Technology (ISTC), http://www.ladseb.pd.cnr.it/infor/ontology/ontology.html
Dieter Fensel: Ontologies: A Silver Bullet for Knowledge Management and Electronic Commerce. New York: Springer, 2001
'Ontology Infrastructure for the Semantic Web', includes detailed subject bibliography. WonderWeb Project, Department of Computer Science, Victoria University of Manchester, UK, http://wonderweb.semanticweb.org/deliverables/documents/D15.pdf
Standard Upper Ontology (SUO): An upper ontology for data interoperability, information search and retrieval, automated inferencing and natural language processing. IEEE Standard Upper Ontology (SUO) Working Group, http://suo.ieee.org
OWL - Web Ontology Language
The Web Ontology Language is a semantic markup language for publishing and sharing ontologies on the World Wide Web. OWL is developed as a vocabulary extension of the Resource Description Framework (RDF) and is derived from the DAML+OIL Web Ontology Language. For the development of this language, see the documents of the Web Ontology (WebOnt) Working Group, http://www.w3.org/2001/sw/WebOnt
See also OWL Web Ontology Language Reference (W3C Working Draft, 31 March 2003) at http://www.w3.org/TR/owl-ref and OWL Web Ontology Language Guide (W3C Working Draft, 31 March 2003), http://www.w3.org/TR/owl-guide
For an understanding of the goals, requirements and usage scenarios for a Web ontology language, see 'Web Ontology Language (OWL) Use Cases and Requirements' (W3C Working Draft, 31 March 2003), http://www.w3.org/TR/webont-req
RDF - Resource Description Framework
See Cultural Heritage Semantic Web Example & Primer, pp. 32-34.
Semantic Web
'The Semantic Web', Tim Berners-Lee with James Hendler and Ora Lassila, Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html
'Enhanced Science and the Semantic Web', J.A. Hendler, in Science magazine, Volume 299, Number 5606, 24 January 2003, pp. 520-521
'Peer-to-Peer: The Infrastructure for the Semantic Web', Stanford University: The Semantic Web as the next evolutionary step of the Internet, http://p2p.semanticweb.org
The Semantic Web Community Portal, SemanticWeb.org, is currently operated by three research groups: the Onto-Agents and Scalable Knowledge Composition (SKC) Research Group at Stanford University, the Ontobroker-Group at the University of Karlsruhe, Germany, and the Protégé Research Group at Stanford University. http://www.semanticweb.org
The W3C Semantic Web Activity Statement explains the consortium's plans in the areas of enabling standards (driven by the RDF Core and Web Ontology Working Groups), education and outreach (RDF Interest Group), as well as co-ordination and advanced development. http://www.w3.org/2001/sw/Activity
See also Kim Veltman's warning of what he sees to be too narrow a definition of the Semantic Web, one that will not allow the historical dimension, the richness of cultural expression, the unique and the diversity of interpretations to be adequately dealt with. Cf. K. Veltman: Challenges for a Semantic Web (July 2002), http://www.cultivate-int.org/issue7/semanticweb
SHOE - Simple HTML Ontology Extensions
SHOE was one of the first ontology-based markup languages developed for use on the World Wide Web. It is a small extension to HTML that allows Web page authors to annotate their Web documents with machine-readable knowledge. See http://www.cs.umd.edu/projects/plus/SHOE
SMIL - Multimedia on the Web
The Synchronized Multimedia Integration Language (SMIL, pronounced 'smile') enables authoring of interactive audiovisual presentations. SMIL is typically used for 'rich media'/multimedia presentations which integrate streaming audio and video with images, text or any other media type. SMIL is an HTML-like language and may be written using a simple text editor. W3C Synchronized Multimedia, http://www.w3.org/AudioVideo
SPECTRUM-XML DTD
SPECTRUM (Standard Procedures for Collections Recording Used in Museums) was created by the mda (http://www.mda.org.uk). It is a guide to good practice for museum documentation that describes procedures for documenting objects and the processes they undergo, as well as the information that needs to be recorded to support the procedures. For SPECTRUM, an XML Document Type Definition has been produced which serves as a system-neutral interchange format for museum data.
'SPECTRUM: The UK Museum Documentation Standard' is available in its second edition, see http://www.mda.org.uk/spectrum.htm
For a description of the creation, structure and deployment of the SPECTRUM-XML DTD, see Bert Degenhart-Drenth: Building on the mda SPECTRUM-XML DTD for Collections Management Data Interchange. Museums and the Web Conference 2001, http://www.archimuse.com/mw2001/papers/degenhart/degenhart.html
Web Services
Although the concept of Web Services is not a new one, the definition of Web Services standards allows the wider use of these services. Currently, three main protocols are used in the context of Web Services: UDDI (Universal Description, Discovery and Integration), a registry system to find resources and Web Services; WSDL (Web Services Description Language), an interface description language; and SOAP (Simple Object Access Protocol), the communication protocol for Web Services. All three protocols are based on XML.
For a description of the many ways in which XML can enhance Web Services, see http://www.w3.org/2002/ws/Activity
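To make the point that these protocols are XML-based more concrete, here is a minimal sketch of a SOAP 1.1 request built with Python's standard library. Only the SOAP envelope namespace is taken from the standard; the service namespace, the SearchObjects operation and its keyword parameter are hypothetical names invented for this illustration and do not belong to any real heritage service.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
EXAMPLE_NS = "http://example.org/museum"  # hypothetical service namespace (assumption)


def build_soap_request(keyword: str) -> str:
    """Return a SOAP 1.1 message calling a hypothetical SearchObjects operation."""
    ET.register_namespace("soap", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    # The operation and its parameter are illustrative only.
    operation = ET.SubElement(body, f"{{{EXAMPLE_NS}}}SearchObjects")
    ET.SubElement(operation, f"{{{EXAMPLE_NS}}}keyword").text = keyword
    return ET.tostring(envelope, encoding="unicode")


request_xml = build_soap_request("textiles")
print(request_xml)
```

In a real deployment the shape of such a message would be dictated by the service's WSDL description, and the service itself might be located through a UDDI registry.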
World Wide Web
Looking back as well as into the future: 'Weaving the Web: The Original Design and Ultimate Destiny of the World Wide Web by Its Inventor', Tim Berners-Lee with Mark Fischetti, Harper San Francisco, 1999
XML - eXtensible Markup Language
See Cultural Heritage Semantic Web Example & Primer, pp. 27-30.
For the work done at W3C within the XML activity, see XML Working Groups, http://www.w3.org/XML; the XML Specifications can be found at http://www.w3.org/XML/Core
'The bane of my existence is doing things that I know the computer could do for me.' - Dan Connolly, The XML Revolution, October 1998, http://www.nature.com/nature/webmatters/xml
'The Semantic Web, as it is advocated by people like Tim Berners-Lee and James Hendler, does not take enough advantage of the experience built up in knowledge engineering and conceptual modelling. There is this anarchistic idea of the Web as a place where everyone can do his or her own thing. I have no problem with that; a lot of people are able to find what they want on the Web. But if you want real interoperability, with search engines that can grasp the intended meaning of information, that approach falls short. To create a real Semantic Web, we have to develop and use well-founded generic ontologies based on linguistics and logics.'
Nicola Guarino has clear views on the Semantic Web and its development. He is a senior researcher at the Institute for Cognitive Sciences and Technologies in Italy, where he leads the Laboratory for Applied Ontology. Since 1991 he has played an active role in the Artificial Intelligence community in promoting the interdisciplinary study of ontological foundations of knowledge engineering and conceptual modelling. Guarino: 'In our Laboratory the focus is on content and not so much on representation. The use of ontologies is unavoidable when referring to content. People do it implicitly all the time when they are communicating and trying to understand each other. If we want machines to understand each other, in other words real interoperability, we need to make these ontologies explicit in an unambiguous way.'
An ontology is a hierarchical description of the relations between concepts in a certain domain, plus an unambiguous description of the concepts themselves. As they are created for a certain domain, ontologies often fail to be interoperable because of the ambiguity that results from the use of the same terms for different concepts (and vice versa) between different domains. The term 'net', for instance, has quite a different meaning for Web designers and fishermen. That is why there is a need for well-founded generic ontologies. An example of what a generic ontology has to pin down is the term 'part', which can have different meanings both within a domain and between domains. For instance, the violist plays a part in the orchestra. His finger is part of him. Can his finger be part of the orchestra? According to Guarino, this is a genuine ontological problem that can only be solved by giving an unambiguous meaning to the term 'part'.
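One way to picture the violist example is to replace the single ambiguous word 'part' with two explicitly named relations and to let a program follow chains of one relation at a time. The sketch below is only an illustration under that assumption; the relation names component_of and member_of and the tiny fact list are invented here and are not taken from Guarino's work or from any published ontology.

```python
from collections import defaultdict

# Each fact is (subject, relation, object). Relation names are illustrative assumptions.
FACTS = [
    ("fingernail", "component_of", "finger"),
    ("finger", "component_of", "violist"),
    ("violist", "member_of", "orchestra"),
]


def holds(subject, relation, obj, facts=FACTS):
    """Check whether obj is reachable from subject through chains of one relation only."""
    direct = defaultdict(set)
    for s, r, o in facts:
        if r == relation:
            direct[s].add(o)
    seen, frontier = set(), [subject]
    while frontier:
        current = frontier.pop()
        for nxt in direct[current] - seen:
            seen.add(nxt)
            frontier.append(nxt)
    return obj in seen


print(holds("fingernail", "component_of", "violist"))  # chain within one relation
print(holds("finger", "component_of", "orchestra"))    # would need to mix relations
print(holds("finger", "member_of", "orchestra"))       # would need to mix relations
```

Because the two senses of 'part' carry different names, the program never concludes that the finger belongs to the orchestra, which is the kind of unwanted inference an unambiguous definition is meant to rule out.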
Another example cited by Guarino is the term 'in'. What exactly are you describing when you say the spoon is in the cup? Does it mean that the spoon is totally embedded in the cup, or is it only partly in the cup? Guarino: 'These examples seem trivial, but if you want real interoperability between different knowledge domains, you will have to prevent the problems that come with the ambiguity of day-to-day language.'
In this respect, Guarino thinks it is a drawback that computer science curricula scarcely ever contain an introduction to ontological foundations of conceptual modelling. 'Students learn all about Java, HTML and C++, and name all the other languages, and they also learn how to use these. But when they graduate, they hardly know a thing about formal ontology. I really think people should know more about the work on ontology that has been done in philosophy. It is certainly not much harder to acquire than, say, studying differential equations or learning how to use Java.'
Developing well-founded generic ontologies seems an enormous job, but it is not as enormous a task as it appears. Guarino: 'I would say that a few dozen would get you on the way nicely. But you have to take the fundamental route. At the moment, development of the Semantic Web is driven by the need for short-term results. Hence, interoperability is realised by putting the right tags on the information. That is not what I call semantics; that is syntax. XML and RDF are very useful for this, but they fall short when you want to create a real Semantic Web.'
Laboratory of Applied Ontology: http://ontology.ip.rm.cnr.it
SEMANTIC WEB SHOULD BE BASED
ON WELL-FOUNDED ONTOLOGIES
AN INTERVIEW WITH NICOLA GUARINO, LABORATORY OF APPLIED ONTOLOGY, ITALY
By Joost van Kasteren
The Semantic Web, according to a statement of its well-known advocate Tim Berners-Lee, is a vision of 'a distributed machine which should function so as to perform socially useful tasks'. [1] This machine should allow intelligent software agents to understand semantic relationships between Web resources, in order to seek relevant information and perform transactions for humans.
Contrasted with the existing human-readable Web, the Semantic Web is envisaged as a Web of machine-readable data that will be based on 'languages for expressing information in a machine processable form'. [2] Key to an understanding of the Semantic Web, therefore, is how these languages function: how information is expressed in order that computers can automatically process Web sources and assist in making the Web more useful for humans. The aim of this chapter is to provide an overview of the Semantic Web concept by describing its general architecture, i.e. the interplay of its languages.
The chapter has two interrelated parts. Part 1 describes a Finnish project that strives to build the foundations for the "Finnish Museums on the Semantic Web" (FMS), a future semantic museum portal. This part consists of the information boxes on the following pages, which briefly describe the necessary elements and steps in the set-up of the FMS system. It is recommended to start by reading this description (see also graphic 3 on page 36, which provides an overview of the set-up of the FMS system). It should be helpful in gaining a general understanding of how semantic interoperability of, and new ways of interacting with, semantically marked-up cultural heritage information can be realised.
Part 2, the texts below the information boxes, is a primer that explains terms used in part 1, which represent core elements of the Semantic Web architecture, as well as providing illustrative examples. The explanations are not intended to give in-depth definitions of these elements; such definitions are provided in the relevant W3C specifications. The examples have been kept as simple as possible, but build on each other. In this way, we will develop a (fictitious) Website, http://www.m-i.org, that provides semantically enhanced access to such marvellous medieval images as the ones we have used to illustrate this Thematic Issue.
How to Make Collection Metadata of Museums Semantically Interoperable on the Web - The "Finnish Museums on the Semantic Web" (FMS)
The Semantic Web concept is visionary, and there are dedicated people, also in the heritage sector, who are trying to make it a reality. In our example, a group of researchers and technology developers who work at the University of Helsinki and the Helsinki Institute for Information Technology are translating the Semantic Web vision for a future semantic museum portal.
The group's two-year project will run until spring 2004 and is being carried out in co-operation with, and with funding from, major organisations including
A CULTURAL HERITAGE SEMANTIC WEB
EXAMPLE & PRIMER
By Guntram Geser
1 Tim Berners-Lee: Interpretation and Semantics on the Semantic Web (1998), http://www.w3.org/DesignIssues/Interpretation.html
2 Tim Berners-Lee: Semantic Web Road Map (1998), http://www.w3.org/DesignIssues/Semantic.html
3 Robert DuCharme, http://lists.xml.org/archives/xml-dev/200211/msg00190.html
Espoo City Museum, Helsinki University Museum, National Board of Antiquities, Nokia, TietoEnator and the National Technology Agency (TEKES).
The major goals of the project are to make collection metadata, which stem from heterogeneous databases, semantically interoperable on the Web, and to provide facilities for semantic browsing and searching in the combined knowledge base of the participating museums.
The project's vision is called "Finnish Museums on the Semantic Web" (FMS), and its architecture allows for all Finnish museums to join in. However, in an approach of starting small but ambitious, the project is at present using the collection databases of two museums, the Espoo City Museum and the National Museum of Finland. Furthermore, the implementation is currently restricted to only one part of the collections: textiles.
In order to reach the FMS system's goal of making the museums' metadata semantically interoperable on the Web, the data must be harmonised on the syntactic and semantic level. For this harmonisation, the eXtensible Markup Language (XML) and the Resource Description Framework (RDF) are being used, of which RDF is the key language for achieving semantic interoperability of the heterogeneous sets of metadata.
RDF and Metadata - A Natural Fit
An observer of the diffusion of the Resource Description Framework (RDF) into various domains, Robert DuCharme, has commented: 'I still find it a little ironic that, while RDF has gotten so much publicity as a technology for warm and fuzzy AI (Artificial Intelligence) pie-in-the-sky technology, it's gotten most of its traction in the mundane world of metadata.' [3]
Yet, given the importance of metadata for the Semantic Web vision in general, it does not come as a surprise that metadata of key information communities belong to the first of RDF's intended uses. RDF seems to gain momentum in particular among the library and other communities that use Dublin Core.
The current W3C RDF Primer (Working Draft, 23 January 2003), edited by Frank Manola and Eric Miller, labels RDF as 'an ideal representation for Dublin Core information' and describes Dublin Core as one of its 'RDF in the field' examples (cf. http://www.w3.org/TR/rdf-primer/).
At the Dublin Core Metadata Initiative (DCMI), 'Expressing Simple Dublin Core in RDF/XML' was announced as a DCMI Recommendation in October 2002, 'the first in a series of recommendations for encoding Dublin Core metadata using mainstream Web technologies', i.e. XML/RDF/XHTML. 'Expressing Qualified Dublin Core in RDF/XML' is currently a Proposed Recommendation. Cf. http://dublincore.org/groups/architecture/
Syntactic Transformation 1: Creating the XML Documents
In the FMS system the eXtensible Markup Language (XML) is used as the data transfer format. This transfer format enables the system to make use of the data originally stored in the museums' heterogeneous collection databases. Therefore, each museum participating in the FMS initiative provides the relevant collection data as an XML document repository.
In a process of syntactic harmonisation, the data from a museum's collection database are retrieved and transformed to an XML format conforming to the XML Schema of the FMS initiative.
The data to be published are read from the database through a 'view', which helps create the XML format. The view is a queryable interface, a virtual table that results from an SQL query which may join multiple tables of the database. Through the view the data are queried so that the rows of the tables are grouped by collection items. For each item, the set of rows is combined into a single XML document.
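The rows to XML step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the FMS implementation: the two tables, their columns and the join are hypothetical, the view name ITEM_VIEW and the element names follow the example used in this primer, and http://www.m-i.org/images is the fictitious namespace of our example organisation.

```python
import sqlite3
import xml.etree.ElementTree as ET
from itertools import groupby

MI_NS = "http://www.m-i.org/images"  # fictitious namespace of the primer's example
FIELDS = ["type", "subject", "iconclass", "creator", "manuscript", "place", "year"]


def build_demo_db():
    """Create an in-memory database with two hypothetical tables and a joining view."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE item (item_id TEXT PRIMARY KEY, type TEXT, subject TEXT, iconclass TEXT);
        CREATE TABLE origin (item_id TEXT, creator TEXT, manuscript TEXT, place TEXT, year TEXT);
        INSERT INTO item VALUES ('image5kb78d38i', 'Column Miniature',
                                 'Eve emerges from Adam''s body', '71A3421');
        INSERT INTO origin VALUES ('image5kb78d38i', 'Alexander Master',
                                   'Historic Bible', 'Utrecht', 'circa 1430');
        CREATE VIEW ITEM_VIEW AS
            SELECT i.item_id, i.type, i.subject, i.iconclass,
                   o.creator, o.manuscript, o.place, o.year
            FROM item i JOIN origin o ON o.item_id = i.item_id;
    """)
    return con


def rows_to_xml(con):
    """Return {item_id: XML string}, combining all view rows of an item into one document."""
    ET.register_namespace("mi", MI_NS)
    query = "SELECT item_id, %s FROM ITEM_VIEW ORDER BY item_id" % ", ".join(FIELDS)
    docs = {}
    # The rows are sorted by item_id, so groupby yields one group per collection item.
    for item_id, rows in groupby(con.execute(query), key=lambda row: row[0]):
        root = ET.Element("{%s}image" % MI_NS, {"image_id": item_id})
        seen = set()  # avoid repeating a value that occurs in several joined rows
        for row in rows:
            for name, value in zip(FIELDS, row[1:]):
                if value is not None and (name, value) not in seen:
                    seen.add((name, value))
                    ET.SubElement(root, "{%s}%s" % (MI_NS, name)).text = value
        docs[item_id] = ET.tostring(root, encoding="unicode")
    return docs
```

Calling rows_to_xml(build_demo_db()) yields one XML string per item identifier. A real export would also have to decide how repeated values (for example several creators) are ordered, which this sketch leaves to the row order of the view.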
XML
XML is a markup language for describing data. It is a language created to allow anyone to design the structure of their own documents. An XML document contains text that consists of markup in the form of tags and plain text between them, the latter being just pure information (for example, <creator>Alexander Master</creator>). XML tags are not predefined; everyone can define his or her own tags.
FMS Documents
A short presentation is provided in Eero Hyvönen et al.: Cultural Semantic Interoperability on the Web: Case Finnish Museums Online, http://iswc2002.semanticweb.org/posters/hyvonen_a4.pdf; for detailed descriptions see Vilho Raatikka, Eero Hyvönen: Ontology-based Semantic Metadata Validation, and Hyvönen, Eero et al.: Semantic Interoperability on the Web: Case Finnish Museums Online. Both texts can be found in: Towards the Semantic Web and Web Services. Proceedings of the XML Finland 2002 Conference, http://www.cs.helsinki.fi/u/eahyvone/xmlfinland2002/ProceedingsXML2002-final.pdf
ITEM_VIEW (columns): item_id, type, subject, iconclass, creator, manuscript, place, year
XML shares the syntax and bracketed tags of the well-known HyperText Markup Language (HTML), but XML serves a different goal. While HTML is used to define the layout of pages on the WWW, XML is used to define the content of documents, for example to specify that an area of text is the name of a creator.
XML allows for creating markup (e.g. <creator>) that seems to carry some semantics. However, for a computer a tag like <creator> carries as much semantics as a tag like <H1>. A computer simply does not know what a creator is and how the concept creator is related to other concepts (e.g. manuscript). For an XML processor, <H1> and <creator> or <manuscript> are all equally (and totally) meaningless. XML is all about describing data; on its own it does not do anything. There needs to be a processing program that uses the markup to interpret the individual elements.
The graphic below illustrates the database rows to XML process as described in the info box. It provides a very simple example of an XML document that describes some data for one of the medieval column miniatures from the Koninklijke Bibliotheek, The Hague, which we were permitted to use for illustrating this Thematic Issue. It includes the Iconclass classification for this image: 71A3421, Eve emerges from Adam's body (for the hierarchical path of this classification see the section on ontologies).
Short explanations for the XML document (image5kb78d38i.xml) shown in the graphic below:
A well-formed XML document is one that conforms to the XML syntax rules, of which we would like to highlight the following:
(1) The document should begin with the XML declaration, which defines the XML version and the character encoding used in the document. In the example below we use <?xml version="1.0" encoding="ISO-8859-1"?>, i.e. the document conforms to the 1.0 specification of XML and uses the ISO-8859-1 (Latin-1/West European) character set.
(2a, 10) The XML document must contain a single tag pair to define a root element, in our example <mi:image> </mi:image>.
Graphic 1: Database rows to XML process

Database rows (item data from database rows, grouped by item):
"image5kb78d38i", "Column Miniature", "Eve emerges from Adam's Body", "71A3421", "Alexander Master", "Historic Bible", "Utrecht", "circa 1430"

XML document (image5kb78d38i.xml), the result of the rows to XML process:
(1) <?xml version="1.0" encoding="ISO-8859-1"?>
(2) <mi:image xmlns:mi="http://www.m-i.org/images"
    image_id="image5kb78d38i">
(3) <mi:type>Column Miniature</mi:type>
(4) <mi:subject>Eve emerges from Adam's body</mi:subject>
(5) <mi:iconclass>71A3421</mi:iconclass>
(6) <mi:creator>Alexander Master</mi:creator>
(7) <mi:manuscript>Historic Bible</mi:manuscript>
(8) <mi:place>Utrecht</mi:place>
(9) <mi:year>circa 1430</mi:year>
(10) </mi:image>
(2b) Namespace: Since element names in XML are not fixed, name conflicts can occur when different documents use the same names for different types of elements. To prevent such conflicts, a unique namespace should be defined using a Uniform Resource Identifier (URI). An XML namespace is a collection of names that are used as element types and attribute names (cf. http://www.w3.org/TR/REC-xml-names). Our namespace declaration in the start tag of the root element is xmlns:mi="http://www.m-i.org/images".
The namespace prefix mi (for medieval images) functions as a placeholder for the namespace name. It needs to show up in all element tags (e.g. <mi:type> </mi:type>).
(2c) image_id="image5kb78d38i": This is the id attribute, which contains the unique identifier of the data source record.
(3-9) All other elements must be within the root element and can themselves have sub-elements (child elements), which must be properly nested within their parent element. Our elements do not have sub-elements.
Other syntax rules are, for example: all start tags must match end tags; because XML tags are case sensitive (i.e. the tag <mi:Creator> is different from the tag <mi:creator>), they must also be written with the same case; all elements must have a closing tag; all attribute values must be within quotation marks (e.g. "image5kb78d38i").
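The syntax rules listed above can be tried out with any XML parser, since a conforming parser must reject a document that is not well-formed. The following minimal sketch uses the parser from the Python standard library; the shortened document and the two deliberately broken variants are our own illustrations.

```python
import xml.etree.ElementTree as ET

GOOD = """<?xml version="1.0" encoding="ISO-8859-1"?>
<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">
  <mi:type>Column Miniature</mi:type>
  <mi:creator>Alexander Master</mi:creator>
</mi:image>"""

# Same data, but the end tag differs in case from its start tag.
BAD_CASE = GOOD.replace("</mi:creator>", "</mi:Creator>")

# Same data, but the attribute value is not within quotation marks.
BAD_QUOTES = GOOD.replace('image_id="image5kb78d38i"', "image_id=image5kb78d38i")


def is_well_formed(text):
    """Return True if the XML parser accepts the document, False otherwise."""
    try:
        # Encode to bytes so that the declared ISO-8859-1 encoding is honoured.
        ET.fromstring(text.encode("iso-8859-1"))
    except ET.ParseError:
        return False
    return True
```

Only the first of the three documents passes the check; the other two violate the rules on matching tags and quoted attribute values respectively.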
Syntactic Transformation 2: The XML Schema
In order to allow for syntactic harmonisation, the XML documents of the museums should conform to the XML Schema of the FMS initiative. Therefore, the museums validate the XML documents they create against the initiative's XML Schema. If the documents are valid, the process can continue to the semantic level.
XML Schema
The XML Schema defines the building blocks of an XML document, including:
| elements and attributes that can appear in a document
| which elements are child elements, as well as their order and number
| whether an element is empty or can include text
| the data types for elements and attributes
| as well as default and fixed values for elements and attributes
XML with an XML Schema is designed to be self-descriptive. One of the greatest strengths of XML Schema is that it allows for data typing. The most common data types are xs:string, xs:decimal, xs:integer, xs:boolean, xs:date and xs:time. In the example below, which is the XML Schema for the XML document (image5kb78d38i.xml) shown in graphic 1, we only use the data type xs:string. This data type is used for values that contain character strings.
(1) <?xml version="1.0"?>
(2a) <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
(2b) targetNamespace="http://www.m-i.org/images"
(2c) xmlns="http://www.m-i.org/images"
(2d) elementFormDefault="qualified">
(3) <xs:element name="image">
(4) <xs:complexType>
(5) <xs:sequence>
(6) <xs:element name="type" type="xs:string"/>
(7) <xs:element name="subject" type="xs:string"/>
(8) <xs:element name="iconclass" type="xs:string"/>
(9) <xs:element name="creator" type="xs:string"/>
(10) <xs:element name="manuscript" type="xs:string"/>
(11) <xs:element name="place" type="xs:string"/>
(12) <xs:element name="year" type="xs:string"/>
(13) </xs:sequence>
(14) <xs:attribute name="image_id" type="xs:string" use="required"/>
(15) </xs:complexType>
(16) </xs:element>
(17) </xs:schema>

Short explanations:
(1) The XML declaration, which states that the document conforms to the 1.0 specification of XML.
(2a) Determines that the elements and data types that are used to construct the schema come from the W3C's XML Schema namespace. Consequently, each of the elements and data types in the schema has the prefix xs, which identifies them as belonging to the vocabulary of the XML Schema language rather than the vocabulary of our (fictitious) organisation M-i.org.
(2b) Indicates that the elements defined by this schema come from our http://www.m-i.org/images namespace.
(2c) Is our default namespace.
(2d) Demands that any elements used by an XML document which were declared in this schema must be namespace qualified.
(3, 16) The parent element for the data typing of the image descriptions we provide at http://www.m-i.org/images. It is (4) defined as an xs:complexType, i.e. it contains child elements (6-12), which are (5, 13) surrounded by an xs:sequence element that defines an ordered sequence of these elements.
(6-12) The child elements, which in our example are simple types because they do not contain other elements. They define various elements of our XML documents, e.g. "type", to be of the data type xs:string.
(14) Furthermore, the Schema determines that for the element xs:element name="image" there is a required attribute image_id of the data type xs:string (for example, image5kb78d38i).
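To make tangible what validation against this schema amounts to, the following sketch checks by hand the same constraints the example schema expresses: the root element, the required image_id attribute, and the ordered sequence of text-only child elements. It is an assumption-laden stand-in written for this primer, not an XML Schema processor; in practice a schema-aware validator would read the schema file itself.

```python
import xml.etree.ElementTree as ET

MI_NS = "http://www.m-i.org/images"  # fictitious target namespace of the example schema
EXPECTED_CHILDREN = ["type", "subject", "iconclass", "creator", "manuscript", "place", "year"]


def check_image_document(xml_text):
    """Return a list of problems; an empty list means all hand-written checks passed."""
    root = ET.fromstring(xml_text)
    problems = []
    # Lines (3, 16) of the schema: the root element is 'image' in the target namespace.
    if root.tag != "{%s}image" % MI_NS:
        problems.append("root element must be 'image' in the target namespace")
    # Line (14): the attribute image_id is required.
    if root.get("image_id") is None:
        problems.append("required attribute 'image_id' is missing")
    # Lines (5-13): each child element occurs exactly once, in the given order.
    expected = ["{%s}%s" % (MI_NS, name) for name in EXPECTED_CHILDREN]
    if [child.tag for child in root] != expected:
        problems.append("child elements must be, in this order: " + ", ".join(EXPECTED_CHILDREN))
    # Lines (6-12): the children are simple string types, so they carry no sub-elements.
    for child in root:
        if len(child) > 0:
            problems.append("element '%s' must contain only text" % child.tag.split("}")[-1])
    return problems
```

A document that passes returns an empty list; a document that lacks the attribute or changes the element order returns one message per violated rule.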
Major Benefits of XML
In the context of the Semantic Web, XML provides an interoperable syntactical foundation upon which solutions to the issues of representing relationships and meaning can be built. We also want to highlight the many benefits of XML that are adding to its rapid uptake in the first place and might, in the longer term, be supportive in realising the Semantic Web vision on a broader scale.
XML is one of the most important standards developments in recent years. It is an international, universal, non-system and non-application specific data exchange standard. XML is international because it employs Unicode. This means that there is no restriction to the western alphabet; Arabic, Chinese, Greek, Hebrew, Thai etc. can be easily integrated.
XML is non-system specific because it is an open standard set by the World Wide Web Consortium (W3C). As such, there is no owner of XML. All the major software suppliers support it; it can be used on any computing platform, from Windows and MacOS to Linux. This makes it easier for organisations to change systems or combine different systems.
XML is also non-application specific, i.e. it can be used in various applications such as data exchange, data harvesting, Web site management etc. XML is gaining ever-wider acceptance in many application domains, including in particular the cultural heritage community.
Bear in mind also that the major collection management software producers have implemented support for XML in their systems, enabling, for example, the integration of data from different collections and their combination over the Web.
XML allows for multi-channel publishing, i.e. with XML it is easy to produce different products or services from digital cultural heritage assets. Once the data are structured in XML, they can be displayed across a variety of media using an associated style sheet that contains the display information.
Finally, XML can be used to create new languages. For example, the Wireless Markup Language (WML), which is used to mark up Internet applications for handheld devices, is written in XML.
Ontological concepts
The goal of the FMS project is to make metadata of the museums' textiles collections semantically interoperable on the Web. In order to achieve such interoperability, an ontology is being designed that describes the common (lower-level) ontological concepts in this domain of knowledge.[4]
Ontology
In the Semantic Web architecture the semantic relationships are not embedded but explicitly represented by an ontology, or rather an interrelated set of ontologies. In fact, the wide array of information residing on the Web and the perceived need to make it more machine-processable have acted as a strong impetus for the development of ontology languages.
Yet what is an ontology? An abstract definition of an ontology is that it describes a formal, shared conceptualisation of a particular domain of interest, for example cultural heritage objects held in art museums. In particular, an ontology allows for constraining, expressing and analysing the intended meaning of the shared vocabulary of concepts and relations in a domain of knowledge.[5]
If these concepts and relations are formalised to a high degree, the domain has at hand a major building block for developing semantically aware information systems. With Semantic Web technologies, the domain ontology can be made available on the network, cross-referenced with upper-level and other domain ontologies,[6] and remote applications or intelligent software agents can refer to it when they interact to provide a certain information service.
[4] The available documents (in English) on the FMS initiative state that their ontology is being created using RDF Schema (RDFS). To develop a fully fledged ontology, advanced languages such as DAML+OIL or Web Ontology Language (OWL) would be required.
[5] For more elaborate and formal descriptions see Tom Gruber: What is an Ontology? (1995), http://www-ksl.stanford.edu/kst/what-is-an-ontology.html; Nicola Guarino: Ontology-Driven Conceptual Modelling, part 1-3 (2002), http://ontology.ip.rm.cnr.it/Tutorials
The degree of formalisation of concepts and their relations varies considerably between different domains of knowledge. At the lower end one finds lexicons and simple taxonomies (i.e. an ordered classification system where terms are related hierarchically). At the middle level one might place thesauri, i.e. controlled vocabularies that are structured to show relationships between terms and concepts and, for example, allow for retrieving them from a database. At the high end of formalisation of knowledge there are axiomatised logic theories. Such theories include rules to ensure the well-formedness and logical validity of statements expressed in the language of the scientific discipline.
In the cultural heritage sector, a powerful IT-supported example of a hierarchical classification system is Iconclass. This supports the documentation of images, in particular art historical images, by providing a systematic collection of 28,000 ready-made definitions of objects, persons, events, situations and abstract ideas that can be the subject of an image. The definitions consist of an alphanumeric classification code and its textual correlate.[7]
For example, for the image we have described in XML in graphic 1, the Iconclass definition is 71A3421, Eve emerges from Adam's body. The Medieval Illustrated Manuscripts Website of the Koninklijke Bibliotheek, The Hague, has an Iconclass Browser in place[8] that provides the hierarchical path for this concept in the classification system.
An outstanding example of a controlled vocabulary is the Art & Architecture Thesaurus (AAT), one of the Getty Research Institute's Vocabulary Databases. It is a structured vocabulary of more than 125,000 terms and other information about concepts that are used for describing fine art, architecture, decorative arts, archival materials and material culture.[9]
The W3C's Resource Description Framework (RDF) Schema Specification 1.0, in a section on its scope, mentions concept navigation and states: 'Thesauri and library classification schemes are well known examples of hierarchical systems for representing subject taxonomies in terms of the relationships between named concepts. The RDF Schema specification provides sufficient resources for creating RDF models that represent the logical structure of thesauri (and other library classification systems).'[10]
Yet for realising a full-blown cultural heritage ontology for the Semantic Web there are currently limitations on both sides. On the one hand, hierarchical classification systems and structured vocabularies do not lend themselves easily to rich inter-linking of conceptual 'trees'.
A major step further in this direction is the CIDOC object-oriented Conceptual Reference Model (CRM).[11] This provides an ontology of 81 classes and 130 properties, which describes in a formal language concepts and relations relevant to the documentation of cultural heritage.
On the other hand, RDF Schema has limitations when it comes to expressing complex ontological relationships. New languages based on description logics are being developed. These include DAML+OIL and the upcoming Web Ontology Language (OWL), which are capable of fully describing ontologies.
Also worth highlighting is that tools for ontology building are proliferating at the present time. These ontology editors need to be carefully assessed, as their capabilities differ considerably.[12]
7 Bible
71 Old Testament
71A Genesis: from the creation to the expulsion from paradise, and later years of Adam and Eve
71A3 creation of man; the Garden of Eden (Genesis 1:26-2)
71A34 creation of Eve
71A342 Eve is fashioned from Adam's rib
71A3421 Eve emerges from Adam's body
[6] Upper-level ontologies describe the basic concepts and relationships invoked when information about any domain is expressed in natural language.
[7] For in-depth information see the official Iconclass Website, http://www.iconclass.nl
[8] Medieval Illustrated Manuscripts Website, http://www.kb.nl/kb/manuscripts/browser. The subject access system for the Website was conceived by Mnemosyne Partners, building on the Iconclass classification system and technologies. See the valuable information they provide at httpwwwmnemosyneorg businessmsstempexhtml
[9] A detailed description of the AAT is provided at http://www.getty.edu/research/tools/vocabulary/aat/about.html
[10] http://www.w3.org/TR/2000/CR-rdf-schema-20000327
[11] See the information box at the end of the Forum discussion and the sources mentioned in the Semantic Web Terms and Reading List.
[12] Michael Denny: Ontology Building: A Survey of Editing Tools (06-11-2002), http://www.xml.com/pub/a/2002/11/06/ontologies.html
32 DigiCULT
Semantic Transformation 1: The RDF Data Model
Progressing towards semantic interoperability, the metadata in the XML documents are now transformed into RDF statements corresponding to the RDF data model. With these so-called RDF 'triples', the XML metadata elements are mapped to the RDF classes and properties which are defined by the RDF Schema of the FMS initiative (see section 3).
'XML is nothing more than a way to standardize data formats. This is not to underplay XML's importance. A data-format standard makes all of the more glamorous technologies possible, and RDF is the leading example of the benefit that comes once the data format has been standardized. Many proclaim that RDF is really the XML's killer app, and with good reason. Despite all this, RDF remains somewhat obscure. This is mainly because, at its core, RDF is very abstract, very dry, and very academic.'
Uche Ogbuji: An introduction to RDF (2000), http://www-106.ibm.com/developerworks/library/w-rdf/?dwzone=xml
RDF Data Model
In order to make Web resources semantically interoperable, we need resources that provide machine-understandable information about themselves. In the Semantic Web architecture these statements are built by using the Resource Description Framework (RDF). RDF defines a data model for the statements, describing typed relationships between uniquely identified resources. RDF distinguishes between:
| resources: familiar examples are a Web page, an electronic document or a digital image, but in RDF also entities that are not 'network retrievable', e.g. museums, curators or bound medieval manuscripts, can be resources
| properties: these identify a specific aspect, characteristic, attribute or relation used to describe the resource
| statements: these associate a value for a named property with the resource
Hence RDF provides a model for describing relationships between resources in terms of named properties and values. The RDF data model intrinsically supports only binary relations. Its base element is the 'triple', which takes the form of subject, predicate, object: a resource (the subject) is linked to another resource (the object) through an arc labelled with a third resource (the predicate). The semantics of a triple clearly depends on the property used as predicate.
A convenient way to visualise this is to draw nodes for subject and object and an arrow between them for the predicate (see graphic 2). In this labelled directed graph, subject and predicate (property) are Uniform Resource Identifiers (URIs), and the object is either a URI or a literal (which is drawn as a box). Everything in RDF can be represented by a graph with nodes and arcs, and the data model allows for using the same URI as a node and as an arc label. To represent RDF statements in a machine-processable way, RDF builds on XML. With RDF/XML, a specific XML markup language, RDF information can be represented and exchanged between machines.
Graphic 2: RDF Data Model

Subject: http://www.m-i.org/schemas/images#ColumnMiniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Miniature

Subject: http://www.m-i.org/schemas/images#Miniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Image

With these two triples we state that ColumnMiniature is a subclass of Miniature and that Miniature is a subclass of Image.
The predicate in our statements is the rdfs:subClassOf property, which is predefined in the RDF Schema namespace http://www.w3.org/2000/01/rdf-schema#
The classes Image, Miniature and ColumnMiniature would also need to be defined in our RDF Schema namespace http://www.m-i.org/schemas/images
For details on how to use RDF Schema for defining your domain ontology, see section 3. In a nutshell, RDF Schema is a 'higher-level' language which is itself defined using RDF.
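For readers who prefer code to diagrams, the two triples of graphic 2 can be held as plain (subject, predicate, object) tuples. The sketch below is an illustration only: the '#' separator in the fictitious m-i.org namespace is our assumption, and the helper functions are hypothetical names, not part of any RDF toolkit. The output format of to_ntriples follows the line-based N-Triples notation for triples that consist of URIs only.

```python
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
MI = "http://www.m-i.org/schemas/images#"  # fictitious namespace; '#' separator assumed

# The two statements of graphic 2 as a set of (subject, predicate, object) tuples.
triples = {
    (MI + "ColumnMiniature", RDFS + "subClassOf", MI + "Miniature"),
    (MI + "Miniature", RDFS + "subClassOf", MI + "Image"),
}


def objects(graph, subject, predicate):
    """Follow the arcs labelled `predicate` that leave the node `subject`."""
    return {o for s, p, o in graph if s == subject and p == predicate}


def to_ntriples(graph):
    """Serialise URI-only triples, one statement per line, in N-Triples notation."""
    return "\n".join("<%s> <%s> <%s> ." % triple for triple in sorted(graph))
```

The node for Miniature appears once as an object and once as a subject, which is what links the two arcs into a single graph.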
Semantic Transformation 2: Creating and Validating the RDF Statements
In the mapping process an editor tool (in the FMS case a tool developed by the project team, called Meedio) receives the XML documents as input and assists in transforming them into semantically valid RDF statements (instance descriptions). The tool serves as instance editor, which provides a convenient way of finding and selecting from the XML metadata elements correct instance values for a particular property. The editor tool also serves as semantic metadata validator. When the museum cataloguer saves the set of RDF statements corresponding to an XML document, the semantics of these statements are validated against the property constraints of the FMS ontology. The result of a successful mapping and validation process is a unique set of RDF triples called the RDF card.
RDF Mapping Rules
When RDF is used to define the meaning of XML metadata elements, a set of mapping rules is created. A mapping rule is a template of RDF triples where XPath expressions are used to identify the actual element values. XPath is a language for addressing parts of an XML document and was designed for use in XML parsing software (XSLT, XPointer and others).
When applying such a rule to an XML document, the XPath expressions are instantiated with matching element values. If the rule matches, the RDF template evaluates to a set of RDF triples where XPath expressions are substituted by the corresponding values of the XML elements.
For example, by applying the template rule
</image5kb78d38i/type, mi:hasCreator, /image5kb78d38i/creator>
to the XML document described in the section on XML, the following result would be obtained:
<Image Miniature, mi:hasCreator, 'Alexander Master'>
mi:hasCreator is an example of an RDF property. Such properties are explained in section 3, RDF Schema.
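The idea of a mapping rule as a triple template can be sketched as follows. The rule format, the Path marker class and the function apply_rule are our own hypothetical constructions, not the format used by the FMS editor tool; the path expressions are evaluated relative to the root element with the limited XPath subset supported by Python's standard library.

```python
import xml.etree.ElementTree as ET

NS = {"mi": "http://www.m-i.org/images"}  # fictitious namespace of the example

DOC = """<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">
  <mi:type>Column Miniature</mi:type>
  <mi:creator>Alexander Master</mi:creator>
</mi:image>"""


class Path(str):
    """Marks a template slot whose value is looked up in the XML document."""


# Template: subject and object are path expressions, the predicate is fixed.
RULE = (Path("mi:type"), "mi:hasCreator", Path("mi:creator"))


def apply_rule(rule, xml_text):
    """Instantiate the template; return None if a path expression finds no element."""
    root = ET.fromstring(xml_text)
    triple = []
    for slot in rule:
        if isinstance(slot, Path):
            value = root.findtext(str(slot), namespaces=NS)
            if value is None:
                return None  # the rule does not match this document
            triple.append(value)
        else:
            triple.append(slot)
    return tuple(triple)
```

Applied to DOC, the rule yields a triple whose subject and object are the text values of the type and creator elements. A document without a creator element leaves the rule unmatched.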
Note: Due to the limited space permitted, we do not address issues of term mapping. This is an important aspect of the mapping and validation process carried out in the FMS project. Working with metadata from different museums, they need to deal with partly different terminologies. Their technical solution to synonymous terms (i.e. different terms referring to the same concepts) is to attach synonym sets to the FMS ontology classes. With situations where polysemous terms occur (i.e. the same terms refer to different concepts), the editor tool cannot cope, and the cataloguer needs to select the correct interpretation.
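The difference between the two cases in the note can be illustrated with a small lookup. The class names, the synonym sets and the functions below are invented for this illustration and are not taken from the FMS ontology: a term that belongs to exactly one synonym set is mapped automatically, whereas a term found in several sets is passed to a chooser that stands in for the cataloguer.

```python
# Hypothetical ontology classes with their attached synonym sets.
SYNONYM_SETS = {
    "fms:Rug": {"rug", "carpet", "floor covering"},
    "fms:JerseyFabric": {"jersey", "knitted fabric"},
    "fms:JerseyGarment": {"jersey", "sweater", "pullover"},
}


def candidate_classes(term):
    """Return all classes whose synonym set contains the term."""
    key = term.strip().lower()
    return sorted(cls for cls, terms in SYNONYM_SETS.items() if key in terms)


def map_term(term, choose=None):
    """Map a museum term to a class; ask `choose` when the term is polysemous."""
    candidates = candidate_classes(term)
    if len(candidates) == 1:
        return candidates[0]  # synonym case: resolved automatically
    if candidates and choose is not None:
        return choose(term, candidates)  # polysemy: a person decides
    return None  # unknown term, or ambiguous with nobody to ask
```

Here 'carpet' resolves to a single class, while 'jersey' has two candidate classes and is only resolved once a chooser function is supplied.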
Semantic Transformation 3: The RDF Schema (RDFS)
The shared ontology for the textiles domain is created by using Resource Description Framework Schema (RDFS). An RDF Schema is a tool for indicating the classes of resources one wants to describe, as well as for defining the properties used to describe those resources. Furthermore, class/sub-class relationships and property/sub-property relationships can be defined. The museums are mapping their metadata to the classes and properties defined by the RDF Schema of the FMS initiative. Thereby they are making the meaning of the metadata explicit and representing them in a harmonised, uniform way.
RDF Schema
In section Semantic Transformation 1 we have described the data model provided by RDF for expressing statements about Web resources. But we also need a vocabulary for the RDF statements, namely classes and properties defined with RDF Schema (RDFS).
In brief, the RDF Schema mechanism provides a pre-defined vocabulary, a basic type system that can be used in creating domain-specific schemas. Its role is to allow for declaring metadata properties (e.g. for 'type', 'subject' or 'creator'), to define the classes of resources they may be used with, to restrict possible combinations, and to detect violations of those restrictions.
Defining classes
With RDF Schema (RDFS), Web resources can be defined as instances of one or more classes. In addition, classes can be organised in a hierarchical fashion. As we hold a collection of digital images drawn from illustrated medieval manuscripts, we first need to define a class of things that are images. In RDF Schema a class is any resource having an rdf:type property whose value is the RDFS-defined resource rdfs:Class.
So, using the basic RDF data model, we define: mi:Image [resource] rdf:type [property] rdfs:Class [value]. The self-defined prefix mi (for medieval images) stands for the URI reference of our RDF Schema namespace http://www.m-i.org/schemas/images
In our image collection we have various special kinds of digitised images, such as column miniatures, decorated initials, schematic drawings etc. To distinguish, for example, the miniatures, first we need to define a general class Miniature and subclasses of miniatures, e.g. a subclass ColumnMiniature:
mi:Miniature rdf:type rdfs:Class
mi:ColumnMiniature rdf:type rdfs:Class
Secondly, we need to define that mi:ColumnMiniature is a subclass of mi:Miniature and that mi:Miniature is a subclass of mi:Image, for which we use the predefined rdfs:subClassOf property:
mi:Miniature rdfs:subClassOf mi:Image
mi:ColumnMiniature rdfs:subClassOf mi:Miniature
As the rdfs:subClassOf property is transitive, this means that mi:ColumnMiniature is also implicitly a subclass of mi:Image.
Graphic 2 on page 32 visualises this with the nodes and arcs of the basic RDF data model.
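The transitivity of rdfs:subClassOf can be made explicit by following the subclass arcs step by step. The sketch below stores the two statements from our example as tuples with prefixed names and collects all direct and indirect superclasses of a class; the function name is our own and not part of any RDF library.

```python
SUBCLASS_OF = "rdfs:subClassOf"

schema = {
    ("mi:Miniature", SUBCLASS_OF, "mi:Image"),
    ("mi:ColumnMiniature", SUBCLASS_OF, "mi:Miniature"),
}


def superclasses(graph, cls):
    """Return every class that `cls` is a direct or indirect subclass of."""
    found = set()
    todo = [cls]
    while todo:
        current = todo.pop()
        for s, p, o in graph:
            # Follow each subclass arc once; the `found` check also guards against cycles.
            if s == current and p == SUBCLASS_OF and o not in found:
                found.add(o)
                todo.append(o)
    return found
```

Although the schema never states it directly, mi:Image is reported as a superclass of mi:ColumnMiniature, because the function reaches it via mi:Miniature.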
Defining properties
In order to make the meaning of our metadata (i.e. 'type') explicit, we need to be capable of declaring specific properties that characterise the classes of things we hold at http://www.m-i.org, e.g. digital images of medieval column miniatures.
Basically, RDF Schema defines properties in terms of the classes of resources to which they apply. This is the role of the rdfs:domain and rdfs:range mechanisms.
rdfs:range
The range constraint defines the class or set of classes whose instances can be values of a particular property. If we want to define the property mi:hasType, we must describe this resource (which we locate at http://www.m-i.org/schemas/images) with an rdf:type property whose value is rdf:Property:
mi:hasType [resource] rdf:type [property] rdf:Property [value]
The following RDF statements indicate that mi:ColumnMiniature is a class, mi:hasType is a property, and RDF statements using the mi:hasType property have instances of mi:ColumnMiniature as values:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:range mi:ColumnMiniature
rdfs:domain
The domain constraint restricts the set of classes whose instances may have a particular property attached to them. If we want to indicate that the property mi:hasType applies to instances of class mi:ColumnMiniature, we would write:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:domain mi:ColumnMiniature
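How domain and range declarations can be used to detect violations may be sketched as below. The sketch treats both as constraints to be checked, in the sense described in this primer; the class mi:Artist, the class mi:Place, the instance names and the domain and range given to mi:hasCreator are hypothetical additions made only for the illustration.

```python
TYPE, SUBCLASS_OF = "rdf:type", "rdfs:subClassOf"
DOMAIN, RANGE = "rdfs:domain", "rdfs:range"

graph = {
    ("mi:Miniature", SUBCLASS_OF, "mi:Image"),
    ("mi:ColumnMiniature", SUBCLASS_OF, "mi:Miniature"),
    ("mi:hasCreator", TYPE, "rdf:Property"),
    ("mi:hasCreator", DOMAIN, "mi:Image"),    # hypothetical constraint
    ("mi:hasCreator", RANGE, "mi:Artist"),    # hypothetical constraint
    ("mi:image5kb78d38i", TYPE, "mi:ColumnMiniature"),
    ("mi:AlexanderMaster", TYPE, "mi:Artist"),
    ("mi:Utrecht", TYPE, "mi:Place"),
}


def classes_of(graph, resource):
    """Direct types of a resource plus all superclasses of those types."""
    result = {o for s, p, o in graph if s == resource and p == TYPE}
    todo = list(result)
    while todo:
        cls = todo.pop()
        for s, p, o in graph:
            if s == cls and p == SUBCLASS_OF and o not in result:
                result.add(o)
                todo.append(o)
    return result


def violations(graph, statement):
    """List the domain and range constraints that a (subject, property, object) breaks."""
    subject, prop, obj = statement
    problems = []
    for s, p, required in sorted(graph):
        if s != prop:
            continue
        if p == DOMAIN and required not in classes_of(graph, subject):
            problems.append("subject %s is not an instance of %s" % (subject, required))
        if p == RANGE and required not in classes_of(graph, obj):
            problems.append("object %s is not an instance of %s" % (obj, required))
    return problems
```

A statement that names mi:AlexanderMaster as creator of the column miniature passes, since a column miniature is also an image through the subclass hierarchy. Naming mi:Utrecht as creator is reported as a range violation.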
Benefits of RDF
In a SearchWebServices.com definition of RDF, some benefits of RDF are mentioned:
| 'By providing a consistent framework, RDF will encourage the providing of metadata about Internet resources.
| Because RDF will include a standard syntax for describing and querying data, software that exploits metadata will be easier and faster to produce.
| The standard syntax and query capability will allow applications to exchange information more easily.
| Searchers will get more precise results from searching, based on metadata rather than on indexes derived from full text gathering.
| Intelligent software agents will have more precise data to work with.'[13]
This is a well-crafted listing of RDF benefits, from provision and exchange of better metadata to agents working with them, hopefully for the benefit of humans. But, as explicitly stated by SearchWebServices.com, these are only potential benefits, i.e. they depend on the level of actual uptake of RDF.
[13] whatis.com / searchWebServices.com Definitions - Resource Description Framework, http://searchwebservices.techtarget.com/sDefinition/0,,sid26_gci213545,00.html (last updated July 27, 2001)
Generating and Using the Knowledge Space
The RDF cards represent the original XML documents at the semantic level. The union of such RDF cards constitutes a knowledge base, which is a harmonised semantic representation of the underlying heterogeneous databases.
However, so far the RDF instance descriptions have not left the museum. The museum has complete control of the information it wants to publish, and it does not need to allow the FMS system access to its internal database system. The RDF data are placed in a public directory on the museum's WWW server.
The Web crawler of the FMS system harvests the instance descriptions from the different museums, and the system combines them into an RDF repository. This repository is a large semantic graph that consists of the shared ontology and metadata.
How does a user now search and navigate in this knowledge space? In the FMS system this is implemented by a server-side software called Ontogator. Based on the semantic graph, this software dynamically generates semantic linkages for the user's Web browser.
One way of using the FMS system is view-based filtering. The user can select classes of resources from the ontology, and the system finds the instances that match the selected class restrictions. By constraining classes (views) further, the collection instance data searched for are eventually found.
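The two steps just described, combining harvested RDF cards into one repository and filtering it by selected classes, can be sketched with plain sets of triples. This is a toy model under our own assumptions, not the Ontogator software: all class names, item identifiers and the two example cards are invented, and an instance matches when it belongs, directly or through the subclass hierarchy, to every selected class.

```python
TYPE, SUBCLASS_OF = "rdf:type", "rdfs:subClassOf"

# Hypothetical shared ontology and two harvested RDF cards.
ontology = {
    ("fms:Textile", SUBCLASS_OF, "fms:Artefact"),
    ("fms:Rug", SUBCLASS_OF, "fms:Textile"),
    ("fms:Tapestry", SUBCLASS_OF, "fms:Textile"),
}
espoo_card = {
    ("espoo:1", TYPE, "fms:Rug"),
    ("espoo:1", TYPE, "fms:WoollenItem"),
}
national_card = {
    ("nm:7", TYPE, "fms:Tapestry"),
    ("nm:7", TYPE, "fms:WoollenItem"),
    ("nm:8", TYPE, "fms:Rug"),
}

# The repository is the union of the shared ontology and all harvested cards.
repository = ontology | espoo_card | national_card


def instances_of(graph, cls):
    """Instances typed with `cls` or with any of its direct or indirect subclasses."""
    classes = {cls}
    changed = True
    while changed:
        changed = False
        for s, p, o in graph:
            if p == SUBCLASS_OF and o in classes and s not in classes:
                classes.add(s)
                changed = True
    return {s for s, p, o in graph if p == TYPE and o in classes}


def filter_by_views(graph, selected_classes):
    """Keep only the instances that satisfy every selected class restriction."""
    result = None
    for cls in selected_classes:
        found = instances_of(graph, cls)
        result = found if result is None else result & found
    return result or set()
```

Selecting only fms:Textile returns the items of both museums, while adding further class restrictions narrows the result step by step, which mirrors the constraining of views described above.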
The software also supports topic-based navigation by providing semantic links between topics of interest, the creation of which is based on the collection domain ontology and the related metadata of the collection records. This means that the links also provide the user with an impression of the wider context and pragmatics of the objects in the museums' collections.
From human users to software agents
As described in the Finnish Museums on the Semantic Web example, the RDF repository is a large semantic graph that consists of the shared ontology and metadata of the participating museums. Such a repository can be queried, and the results, a set of pointers to the relevant resources, can be accessed using Web browsers. The opportunities provided by a system like the one developed by the FMS initiative (e.g. topic-based navigation) are at present restricted primarily to human users.
The Semantic Web vision includes intelligent software agents which 'understand' semantic relationships between Web resources and seek relevant information as well as perform transactions for humans.[14]
This software would be capable of autonomous action, i.e. could run without direct human control or constant supervision, and ideally is very flexible in doing this. Characterisations of this flexibility include actions that are 'reactive', 'proactive' and 'social' (see below).
While the basic idea of agents is very intuitive and appealing, the actual theory is complex, the tools are immature, and the solutions small and prototype-based. In fact, as a parallel distributed systems technology, agents belong to the most complex class of software technology.
However, this primer will conclude with a summary of what an intelligent software agent is and what such software would generally be capable of doing. This should also serve as an indication of how great the challenge for research and technological development is to make the full Semantic Web vision a reality.
Intelligent Software Agents
The following definitions are taken from Michael Wooldridge’s introduction to multiagent systems.[15]
Agent
‘An agent is a computer system capable of autonomous action in some environment.’
Intelligent agent
‘An intelligent agent is a computer system capable of flexible autonomous action in some environment.’
Flexible autonomous action
‘By flexible autonomous action we mean reactive, proactive, social.’
• Reactivity: ‘A reactive system is one that maintains an ongoing interaction with its environment and responds to changes that occur in it (in time for the response to be useful).’
• Proactiveness: ‘An agent serves a purpose and therefore exhibits goal-directed behaviour, including the capacity to recognise opportunities for useful courses of action.’
• Social ability: ‘Social ability in agents is the ability to interact with other agents (and possibly humans) via some kind of agent communication language, and perhaps cooperate with others.’
Desirable further properties of agents are:
• Mobility: the ability to move around an electronic network.
• Rationality: an agent will act in such a way that it does not prevent itself from achieving its goals (as far as this is possible with a limited set of beliefs representing its world knowledge).
• Learning: an agent will improve its performance over time.
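The three facets of flexible autonomous action listed above can be made concrete with a toy sketch. This is an illustration only and not Wooldridge's formal agent model; the class `SketchAgent`, its methods and the sample messages are all hypothetical.

```python
from typing import List, Optional, Tuple


class SketchAgent:
    """Toy agent showing reactive, proactive and social behaviour."""

    def __init__(self, name: str, goal: str) -> None:
        self.name = name
        self.goal = goal                      # topic the agent is looking for
        self.found: List[str] = []            # events relevant to the goal
        self.inbox: List[Tuple[str, str]] = []  # (sender, message) pairs

    def perceive(self, event: str) -> Optional[str]:
        """Reactive: respond to a change in the environment as it occurs."""
        if self.goal in event:
            self.found.append(event)
            return f"{self.name} noted: {event}"
        return None

    def next_action(self) -> str:
        """Proactive: choose an action that serves the goal."""
        return "report" if self.found else f"search for {self.goal}"

    def tell(self, other: "SketchAgent", message: str) -> None:
        """Social: pass a message to another agent."""
        other.inbox.append((self.name, message))


finder = SketchAgent("finder", goal="psalter")
curator = SketchAgent("curator", goal="breviary")

before = finder.next_action()
reaction = finder.perceive("new record: psalter, Normandy")
after = finder.next_action()
finder.tell(curator, "psalter record available")
```

Even this toy version hints at why real agents are hard: a useful agent needs a far richer model of its environment, its goals and the language it shares with other agents.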
14. T. Berners-Lee, J. Hendler, O. Lassila, Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html
15. M. Wooldridge, An Introduction to Multiagent Systems, Chichester: Wiley, 2002, and http://www.csc.liv.ac.uk/~mjw/pubs/imas
Resources
In the primer no references are made to the documents of the World Wide Web Consortium (W3C). All relevant W3C recommendations can be found at http://www.w3c.org
Of the wealth of introductory materials on XML and RDF available on the Web, the following in particular are useful to consult for further details:
http://www.w3schools.com/xml
http://www.w3schools.com/schema
http://www.w3.org/TR/rdf-primer
K. S. Candan, H. Liu, R. Suvarna: Resource Description Framework: Metadata and its Applications. In: SIGKDD Explorations, Vol. 3/1 (2001), 6-19
Pierre-Antoine Champin: RDF Tutorial (2001), http://www710.univ-lyon1.fr/~champin/rdf-tutorial/rdf-tutorial.html
S. Decker, M. P. Mitra, S. Melnik: Framework for the Semantic Web: An RDF Tutorial, http://www.ida.liu.se/~asmpa/courses/sweb/rdf/rdf_tutorial.pdf
[Graphic 3: Set-up of the Finnish Museums on the Semantic Web System. Overview of the system’s set-up. Components shown: user client (WWW browser; topic-based navigation, view-based filtering); server (Ontogator software, RDF database); semantic graph (knowledge space of shared ontology and metadata); Web crawler; and, for each museum, a collection database 1, 2 ... n (relational schemas, DBMS), an XML repository (XML Schema, syntactic interoperability), a metadata editor and RDF instances (RDF Schema, semantic interoperability).]
THE DARMSTADT FORUM PARTICIPANTS
Wernher Behrendt, Salzburg Research, Austria
Wernher Behrendt is a Senior Researcher at the Sun Technology and Research Excellence Center at Salzburg Research (Austria), working on multimedia middleware and interoperation issues. He holds an MSc in Cognitive Science from Manchester University and has more than 10 years’ experience in near-to-market IT research. From 1989 to 1995 he was a Senior Research Associate in the Informatics Department at Rutherford Appleton Laboratory (UK), working on embedded knowledge-based systems in distributed multimedia presentation systems. From 1995 to 1998 he was a Senior Research Associate at Cardiff University (UK), working on interoperation between heterogeneous information systems. Mr Behrendt has held courses in Computer Science and has worked on projects ranging from software engineering methods and quality assurance to legacy system reengineering using migration methods and distributed systems middleware.
E-mail: wernher.behrendt@salzburgresearch.at
Paolo Buonora, Archivio di Stato di Roma, Italy
Paolo Buonora holds a degree in Philosophy from the University of Rome ‘La Sapienza’ (1976). He worked in the Italian State Archive Administration from 1978, where he was first involved in editing the Guida Generale degli Archivi di Stato italiani. From 1986 he worked in the Soprintendenza archivistica per il Lazio, surveying audiovisual archives and municipal archives; from 1989 to 1991 at Perugia University, engaged in doctoral research in ‘Urban and rural history’; and from 1991 to 1994 again in the Soprintendenza archivistica per il Lazio. After 1994 he worked in the Archivio di Stato di Roma, where he was responsible for the photograph service and several working groups on informatics application in archival documentation. From 1997 until the present time he has planned and directed the Imago II project in the Archivio di Stato di Roma.
See httpwwwasrmarchivbeniculturaliitsidimagoIMAGOIIenhtml
Samuel Cruz-Lara, University of Nancy 2 and LORIA, France
Samuel Cruz-Lara obtained a Master’s degree in Computer Science in 1984 (University Henri Poincaré, Nancy 1) and a PhD degree in Computer Science in 1988 (National Polytechnic Institute of Lorraine). The central topic of his PhD thesis was the generation of integrated development environments by using attribute grammars. He is currently Associate Professor at the University of Nancy 2 (Institute of Technology, Computer Science Department) and permanent Researcher at LORIA (Lorraine Laboratory for Research in Computer Science and its Applications – UMR 7503 – CNRS – INRIA – Universities of Nancy). He is a member of the ‘Language and Dialogue’ team and has conducted several research activities on distributed software architectures and textual linguistic resources management. He is currently working in the context of distributed architectures and multimedia resources management. Dr Cruz-Lara has participated in several projects, in particular CNRS/SILFIDE and MLIS-ELAN, and he is at present co-leader of the ‘Digital Museum’ project. This is a joint project between LORIA and the National Chi-Nan University. The ‘Digital Museum’ project is sponsored by the National Science Council of the Republic of China (Taiwan) and supported by INRIA (France).
Costis Dallas, Critical Publics SA, Greece
Costis Dallas is Chairman and Senior Researcher of Critical Publics (http://www.criticalpublics.com), a group of companies active in the field of strategic communications, creative design and technology. He is currently a Lecturer in the Department of Communication and Mass Media of Panteion University and has over 15 years of research and professional experience in hypermedia applications, human factor issues and cultural information systems. Dr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA; co-founder and Chair of the Multimedia Working Group of the International Council of Museums (CIDOC/ICOM); Head of Documentation and Systems of the Benaki Museum; General Director of the Foundation of the Hellenic World; Special Secretary of the Greek Ministry of Education, in charge of libraries, archives and instructional technologies; and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: info@criticalpublics.com
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (http://www.nl.adlibsoft.com), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that he joined Databasix, producer of the ADLIB software. In 1991 Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was later renamed ADLIB Information Systems.
ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include:
CIMI Dublin Core test bed - ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database, in addition to the ‘standard’ HTML/XML access.
SPECTRUM-XML - A project of the UK mda, together with a team of software vendors, to produce a schema for exchange of data which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software.
Internet Gelderse Musea (http://www.igem.nl) and Maritiem Digitaal (http://www.maritiemdigitaal.nl) - Two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.
E-mail: bert@nl.adlibsoft.com
Nicola Guarino, Italian National Research Council, Italy
Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991 and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS) and associate editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: Nicola.Guarino@ladseb.pd.cnr.it
Janneke van Kersen, Digital Heritage Association, The Netherlands
Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992 she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing, and more recently she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University.
Since October 1999 van Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations, such as museums, archives, archaeological organisations, monuments and special collections of libraries, in a context-rich and XML-based environment. Interoperability from both a technical and an organisational perspective is therefore a major issue. The Association offers access to heritage information to the general public at http://www.cultuurwijzer.nl and customised access for the educational field at http://www.cultuurwijs.nl
E-mail: janneke.vankersen@den.nl
Marco Meli, EDW International, Italy
Marco Meli is CEO and co-founder of EDW International (http://www.edw-international.com), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia and VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meli@edw.it
Paul Miller, UKOLN, UK
Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, http://www.ukoln.ac.uk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK’s Further and Higher Education Funding Councils and Resource, the Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI) and the Metadata Working Group of the UK Government’s Office of the e-Envoy.
E-mail: P.Miller@ukoln.ac.uk
Frank Nack, CWI, The Netherlands
Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on ‘The Application of Video Semantics and Theme Representation for Automated Film Editing’ at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling, and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document, and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: Frank.Nack@cwi.nl
Franco Niccolucci, Florence University, Italy
Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference.
See also http://www.geog.port.ac.uk/hist-bound/people/niccolucci.htm
Seamus Ross, HATII, University of Glasgow, UK
Dr Seamus Ross is Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Access Network) (IST-2001-32706), a European Union funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously he was Assistant Secretary for Information Technology at the British Academy, and before that worked for a company specialising in expert systems and software development, as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group’s PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: S.Ross@hatii.arts.gla.ac.uk
Andrea Scotti, Institute & Museum for the History of Science, Italy
Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995 he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.
Since 1996 Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza and the National Central Library, and under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (01/02/01-30/07/03), funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies.
See http://cweb.inria.fr/Projects/Mesmuses
E-mail: scottian@galileo.imss.firenze.it
DIGICULT PROJECT INFORMATION
DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over a period of 30 months (03/2002-08/2004).
In order to encourage early take-up, DigiCULT produces seven Thematic Issues and three Technology Watch Reports, along with the newsletter DigiCULT.Info.
DigiCULT draws on the results of the strategic study ‘Technological Landscapes for Tomorrow’s Cultural Economy (DigiCULT)’, which was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications), in 2000 and completed in 2001.
Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at http://www.digicult.info
For further information on DigiCULT please contact the team of the project co-ordinator:
Mr Guntram Geser, guntram.geser@salzburgresearch.at, Phone: +43-(0)662-2288-303
Mr John Pereira, john.pereira@salzburgresearch.at, Phone: +43-(0)662-2288-247
Salzburg Research Forschungsgesellschaft
Jakob-Haringer-Str. 5/III
A-5020 Salzburg, Austria
Phone: +43-(0)662-2288-200
Fax: +43-(0)662-2288-222
http://www.salzburgresearch.at
Project Partner:
HATII - Humanities Advanced Technology and Information Institute, University of Glasgow
http://www.hatii.arts.gla.ac.uk
Contact: Mr Seamus Ross, s.ross@hatii.arts.gla.ac.uk
The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom
DigiCULT Thematic Issue 1 – Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum, held in Barcelona on May 6th, 2002, in the context of the DLM-Conference 2002.
DigiCULT Thematic Issue 2 – Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum, held in Essen, Germany, on September 3rd, 2002, in the context of the AIIM Conference DMS EXPO.
DigiCULT Thematic Issue 3 – Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum, held on January 21st, 2003, at Fraunhofer IPSI, Darmstadt, Germany.
DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum, on Learning Objects, which will take place at the Koninklijke Bibliotheek – National Library of the Netherlands, The Hague, on July 2nd, 2003.
IMPRINT
This Thematic Issue is a product of the DigiCULT Project (IST-2001-34898).
Authors:
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy
Images:
Images for this Thematic Issue have been provided by and are reproduced with kind permission of the Koninklijke Bibliotheek – National Library of the Netherlands, The Hague, Netherlands.
Graphics & Layout:
Jan Steindl, Salzburg Research
ISBN 3-902448-00-8
Printed in Austria
© 2003
IMAGES
Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I
Image on p. 5, from Fol. 264r, size 80x75; illuminator: Orosius Master a.o.
Bible Historiale, Paris, c. 1320-1340, Volume I
Image on p. 7, from Fol. 2r, size 45x50; illum.: ‘Sub-Fauvel’ Master
Historic Bible, Utrecht, c. 1430, Volume I
Images on pp. 8, 10, 14, 28, from Fol. 3r, 3v, 4v, 7v, size 55x85 to 60x85; illum.: Alexander Master o.a.
Psalter, Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468
Images on pp. 11, 13: Wednesday Matins, Invitatorium, from Fol. 219r
Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350
Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron), from Fol. 104rb1, 106rb2, 124r, 111ra, size 40x55 to 50x55
Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335
Image on p. 21, from Fol. 4va1, size 45x55
Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460
Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41
Psalter, Normandy, c. 1180
Image on p. 26, from Fol. 3v, size 160x125
Breviary, Cambray (?), c. 1275-1300
Images on pp. 27, 29, 31, 33, 34, 35, 37, from Fol. 211r, size 205x135
© Koninklijke Bibliotheek, The Hague. Used with permission.
SEMANTIC WEB TERMS AND READING LIST A-X
Compiled by Guntram Geser and Michael Steemson
[Image caption: Genesis – The Creation Stars and Fishes]
In the Summary we have provided information boxes on projects, but not on the many Semantic Web standards, technologies, etc. mentioned in the Forum discussion. The projects included are only a small fraction of many ongoing activities and are related mainly to the cultural and scientific heritage community. Links to many important Semantic Web development projects can be found at their community portal, http://www.semanticweb.org
The following guide points to resources and readings on terms mentioned in the Forum Summary. It is not intended to provide a comprehensive list of Semantic Web materials. Rather, it represents different entry points and levels to this topic.
Annotation & Authoring
‘How to annotate and have fun’, http://annotation.semanticweb.org/faq/faq2, and other papers. See also http://annotation.semanticweb.org/tools, Institute for Applied Informatics and Formal Description Methods, University of Karlsruhe, Germany.
The Annotea Project, a ‘Live Early Adoption and Demonstration (LEAD)’ project of the World Wide Web Consortium (W3C): collaboration environment with shared annotations, www.w3.org/2001/Annotea
CIDOC CRM
The CIDOC Conceptual Reference Model was an important reference point in the Forum discussion (see also the information box on the CIDOC CRM in the Summary). CIDOC, the Comité International pour la Documentation, is part of the International Council of Museums (ICOM). Its Web site can be found at http://cidoc.ics.forth.gr
A report on CIDOC’s work on the CRM is provided in ‘The CIDOC Conceptual Reference Model: A Standard for Communicating Cultural Contents’ by Nick Crofts, Martin Doerr and Tony Gill, in Cultivate Interactive, Issue 9, February 2003, http://www.cultivate-int.org/issue9/chios. See also their tutorials at http://cidoc.ics.forth.gr/tutorials.html
DAML - DARPA Agent Markup Language
The Defense Advanced Research Projects Agency (DARPA) is the central research and development organisation for the US Department of Defense. Its DAML programme is developing a language and tools to facilitate Semantic Web concepts, http://www.daml.org
‘Why Use DAML?’, Adam Pease, white paper, Teknowledge, 10 April 2002, http://www.daml.org/2002/04/why.html
DAML+OIL
DAML+OIL is a product of the DARPA Joint United States/European Union ad hoc Agent Markup Language Committee. The committee created a language with the best features of SHOE, DAML, OIL and several other markup approaches. It is a Web ontology language (latest release March 2001) expected to provide a basis for future Web standards for ontologies. See http://www.daml.org/2001/03/daml+oil-index.html
OIL - Ontology Inference Layer
OIL is a proposal for a Web-based representation and inference layer for ontologies. It uses a layered approach to defining a standard ontology language. Each layer adds functionality and complexity to the previous layer. This is done such that machines that can only process a lower layer can still partially understand high-level ontologies. See http://www.ontoknowledge.org/oil
A white paper on OIL functions: ‘An informal description of Standard OIL and Instance OIL’, by the On-To-Knowledge group led by the Department of Computer Science, University of Manchester, UK, 28 November 2000, www.ontoknowledge.org/oil/downl/oil-whitepaper.pdf
Ontologies
Ontology research: Laboratory for Applied Ontology (LOA), Institute of Cognitive Sciences and Technology (ISTC), http://www.ladseb.pd.cnr.it/infor/ontology/ontology.html
Dieter Fensel, Ontologies: A Silver Bullet for Knowledge Management and Electronic Commerce, New York: Springer, 2001
‘Ontology Infrastructure for the Semantic Web’, includes detailed subject bibliography; WonderWeb Project, Department of Computer Science, Victoria University of Manchester, UK, http://wonderweb.semanticweb.org/deliverables/documents/D15.pdf
Standard Upper Ontology (SUO): An upper ontology for data interoperability, information search and retrieval, automated inferencing, and natural language processing. IEEE Standard Upper Ontology (SUO) Working Group, http://suo.ieee.org
OWL - Web Ontology Language
The Web Ontology Language is a semantic markup language for publishing and sharing ontologies on the World Wide Web. OWL is developed as a vocabulary extension of the Resource Description Framework (RDF) and is derived from the DAML+OIL Web Ontology Language. For the development of this language see the documents of the Web Ontology (WebOnt) Working Group, http://www.w3.org/2001/sw/WebOnt
See also OWL Web Ontology Language Reference (W3C Working Draft, 31 March 2003) at http://www.w3.org/TR/owl-ref and OWL Web Ontology Language Guide (W3C Working Draft, 31 March 2003) at http://www.w3.org/TR/owl-guide
For an understanding of the goals, requirements and usage scenarios for a Web ontology language see ‘Web Ontology Language (OWL) Use Cases and Requirements’ (W3C Working Draft, 31 March 2003), http://www.w3.org/TR/webont-req
RDF - Resource Description Framework
See Cultural Heritage Semantic Web Example & Primer, pp. 32-34
Semantic Web
‘The Semantic Web’, Tim Berners-Lee with James Hendler and Ora Lassila, Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html
‘Enhanced Science and the Semantic Web’, J. A. Hendler, in Science magazine, Volume 299, Number 5606, 24 January 2003, pp. 520-521
‘Peer-to-Peer: The Infrastructure for the Semantic Web’, Stanford University. The Semantic Web as the next evolutionary step of the Internet, http://p2p.semanticweb.org
The Semantic Web Community Portal, SemanticWeb.org, currently operated by three research groups: the Onto-Agents and Scalable Knowledge Composition (SKC) Research Group at Stanford University, the Ontobroker-Group at the University of Karlsruhe, Germany, and the Protégé Research Group at Stanford University, http://www.semanticweb.org
The W3C Semantic Web Activity Statement explains the consortium’s plans in the areas of enabling standards (driven by the RDF Core and Web Ontology Working Groups), education and outreach (RDF Interest Group), as well as co-ordination and advanced development, http://www.w3.org/2001/sw/Activity
See also Kim Veltman’s warning of what he sees to be too narrow a definition of the Semantic Web, one that will not allow the historical dimension, the richness of cultural expression, the unique and the diversity of interpretations to be adequately dealt with. Cf. K. Veltman, Challenges for a Semantic Web (July 2002), http://www.cultivate-int.org/issue7/semanticweb
SHOE – Simple HTML Ontology Extensions
SHOE was one of the first ontology-based markup languages developed for use on the World Wide Web. It is a small extension to HTML that allows Web page authors to annotate their Web documents with machine-readable knowledge. See http://www.cs.umd.edu/projects/plus/SHOE
SMIL - Multimedia on the Web
The Synchronized Multimedia Integration Language (SMIL, pronounced ‘smile’) enables authoring of interactive audiovisual presentations. SMIL is typically used for ‘rich media’/multimedia presentations which integrate streaming audio and video with images, text or any other media type. SMIL is an HTML-like language and may be written using a simple text editor. W3C Synchronized Multimedia, http://www.w3.org/AudioVideo
SPECTRUM-XML DTD
SPECTRUM (Standard Procedures for Collections Recording Used in Museums) was created by the mda (http://www.mda.org.uk). It is a guide to good practice for museum documentation that describes procedures for documenting objects and the processes they undergo, as well as the information that needs to be recorded to support the procedures. For SPECTRUM an XML Document Type Definition has been produced, which serves as a system-neutral interchange format for museum data.
‘SPECTRUM: The UK Museum Documentation Standard’ is available in its second edition; see http://www.mda.org.uk/spectrum.htm
For a description of the creation, structure and deployment of the SPECTRUM-XML DTD see Bert Degenhart-Drenth, Building on the mda SPECTRUM-XML DTD for Collections Management Data Interchange, Museums and the Web Conference 2001, http://www.archimuse.com/mw2001/papers/degenhart/degenhart.html
Web Services
Although the concept of Web Services is not a new one, the definition of Web Services standards allows the wider use of these services. Currently three main protocols are used in the context of Web Services: UDDI (Universal Description, Discovery and Integration), a registry system to find resources and Web Services; WSDL (Web Services Description Language), an interface description language; and SOAP (Simple Object Access Protocol), the communication protocol for Web Services. All three protocols are based on XML.
For a description of the many ways in which XML can enhance Web Services see http://www.w3.org/2002/ws/Activity
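To make the XML basis of these protocols tangible, the following minimal sketch builds a SOAP 1.1 request envelope with Python's standard library. The envelope namespace is the one defined for SOAP 1.1; the service namespace, the operation name `FindObjects` and the `keyword` parameter are hypothetical and do not refer to any real heritage service.

```python
import xml.etree.ElementTree as ET
from typing import Dict

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "http://example.org/collection"  # hypothetical service namespace


def build_request(operation: str, params: Dict[str, str]) -> str:
    """Wrap an operation call and its parameters in a SOAP envelope."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{SERVICE_NS}}}{name}").text = value
    return ET.tostring(envelope, encoding="unicode")


# Hypothetical call: ask a collection service for objects matching a keyword.
request_xml = build_request("FindObjects", {"keyword": "psalter"})
```

In practice such a message would be sent over HTTP to an endpoint whose operations are described in a WSDL document, and that endpoint could itself be located through a UDDI registry.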
World Wide Web
Looking back as well as into the future: ‘Weaving the Web: The Original Design and Ultimate Destiny of the World Wide Web by Its Inventor’, Tim Berners-Lee with Mark Fischetti, Harper San Francisco, 1999
XML - eXtensible Markup Language
See Cultural Heritage Semantic Web Example & Primer, pp. 27-30.
For the work done at W3C within the XML activity see XML Working Groups, http://www.w3.org/XML/; the XML Specifications can be found at http://www.w3.org/XML/Core/
'The bane of my existence is doing things that I know the computer could do for me' - Dan Connolly, The XML Revolution, October 1998, http://www.nature.com/nature/webmatters/xml/
SEMANTIC WEB SHOULD BE BASED ON WELL-FOUNDED ONTOLOGIES
AN INTERVIEW WITH NICOLA GUARINO, LABORATORY OF APPLIED ONTOLOGY, ITALY
By Joost van Kasteren

'The Semantic Web as it is advocated by people like Tim Berners-Lee and James Hendler does not take enough advantage of the experience built up in knowledge engineering and conceptual modelling. There is this anarchistic idea of the Web as a place where everyone can do his or her own thing. I have no problem with that; a lot of people are able to find what they want on the Web. But if you want real interoperability, with search engines that can grasp the intended meaning of information, that approach falls short. To create a real Semantic Web we have to develop and use well-founded generic ontologies based on linguistics and logics.'

Nicola Guarino has clear views on the Semantic Web and its development. He is a senior researcher at the Institute for Cognitive Sciences and Technologies in Italy, where he leads the Laboratory for Applied Ontology. Since 1991 he has played an active role in the Artificial Intelligence community in promoting the interdisciplinary study of ontological foundations of knowledge engineering and conceptual modelling. Guarino: 'In our Laboratory the focus is on content and not so much on representation. The use of ontologies is unavoidable when referring to content. People do it implicitly all the time when they are communicating and trying to understand each other. If we want machines to understand each other, in other words real interoperability, we need to make these ontologies explicit in an unambiguous way.'

An ontology is a hierarchical description of the relations between concepts in a certain domain, plus an unambiguous description of the concepts themselves. As they are created for a certain domain, ontologies often fail to be interoperable because of the ambiguity that results from the use of the same terms for different concepts (and vice versa) between different domains. The term 'net', for instance, has quite a different meaning for Web designers and fishermen. That is why there is a need for well-founded generic ontologies. An example of what a generic ontology has to settle is the term 'part', which can have different meanings both within a domain and between domains. For instance, the violist plays a part in the orchestra. His finger is part of him. Can his finger be part of the orchestra? According to Guarino this is a genuine ontological problem that can only be solved by giving an unambiguous meaning to the term 'part'.

Another example cited by Guarino is the term 'in'. What exactly are you describing when you say the spoon is in the cup? Does it mean that the spoon is totally embedded in the cup, or is it only partly in the cup? Guarino: 'These examples seem trivial, but if you want real interoperability between different knowledge domains you will have to prevent the problems that come with the ambiguity of day-to-day language.'

In this respect Guarino thinks it is a drawback that computer science curricula scarcely ever contain an introduction to ontological foundations of conceptual modelling: 'Students learn all about Java, HTML and C++, and name all the other languages, and they also learn how to use these. But when they graduate they hardly know a thing about formal ontology. I really think people should know more about the work on ontology that has been done in philosophy. It is certainly not much harder to acquire than, say, studying differential equations or learning how to use Java.'

It seems as if it is an enormous job to develop well-founded generic ontologies, but it is not as enormous a task as it appears. Guarino: 'I would say that a few dozen would get you on the way nicely. But you have to take the fundamental route. At the moment development of the Semantic Web is driven by the need for short-term results. Hence interoperability is realised by putting the right tags on the information. That is not what I call semantics, that is syntax. XML and RDF are very useful for this, but they fall short when you want to create a real Semantic Web.'
Laboratory of Applied Ontology: http://ontology.ip.rm.cnr.it
A CULTURAL HERITAGE SEMANTIC WEB EXAMPLE & PRIMER
By Guntram Geser

The Semantic Web, according to a statement of its well-known advocate Tim Berners-Lee, is a vision of 'a distributed machine which should function so as to perform socially useful tasks'.[1] This machine should allow intelligent software agents to understand semantic relationships between Web resources in order to seek relevant information and perform transactions for humans.

Contrasted with the existing human-readable Web, the Semantic Web is envisaged as a Web of machine-readable data that will be based on 'languages for expressing information in a machine processable form'.[2] Key to an understanding of the Semantic Web, therefore, is how these languages function: how information is expressed in order that computers can automatically process Web sources and assist in making the Web more useful for humans. The aim of this chapter is to provide an overview of the Semantic Web concept by describing its general architecture, i.e. the interplay of its languages.

The chapter has two interrelated parts. Part 1 describes a Finnish project that strives to build the foundations for the "Finnish Museums on the Semantic Web" (FMS), a future semantic museum portal. This part consists of the information boxes on the following pages, which briefly describe the necessary elements and steps in the set-up of the FMS system. It is recommended to start by reading this description (see also graphic 3 on page 36, which provides an overview of the set-up of the FMS system). It should be helpful in gaining a general understanding of how semantic interoperability of, and new ways of interacting with, semantically marked-up cultural heritage information can be realised.

Part 2, the texts below the information boxes, is a primer that explains terms used in part 1 which represent core elements of the Semantic Web architecture, as well as providing illustrative examples. The explanations are not intended to give in-depth definitions of these elements; such definitions are provided in the relevant W3C specifications. The examples have been kept as simple as possible, but build on each other. In this way we will develop a (fictitious) Website, http://www.m-i.org, that provides semantically enhanced access to such marvellous medieval images as the ones we have used to illustrate this Thematic Issue.

How to Make Collection Metadata of Museums Semantically Interoperable on the Web - The "Finnish Museums on the Semantic Web" (FMS)

The Semantic Web concept is visionary, and there are dedicated people, also in the heritage sector, who are trying to make it a reality. In our example, a group of researchers and technology developers who work at the University of Helsinki and the Helsinki Institute for Information Technology are translating the Semantic Web vision for a future semantic museum portal.

The group's two-year project will run until spring 2004 and is being carried out in co-operation with, and with funding from, major organisations including Espoo City Museum, Helsinki University Museum, National Board of Antiquities, Nokia, TietoEnator and the National Technology Agency (TEKES).

[1] Tim Berners-Lee: Interpretation and Semantics on the Semantic Web (1998), http://www.w3.org/DesignIssues/Interpretation.html
[2] Tim Berners-Lee: Semantic Web Road Map (1998), http://www.w3.org/DesignIssues/Semantic.html
[3] Robert DuCharme, http://lists.xml.org/archives/xml-dev/200211/msg00190.html
The major goals of the project are to make collection metadata, which stem from heterogeneous databases, semantically interoperable on the Web, and to provide facilities for semantic browsing and searching in the combined knowledge base of the participating museums.

The project's vision is called "Finnish Museums on the Semantic Web" (FMS), and its architecture allows for all Finnish museums to join in. However, in an approach of starting small but ambitious, the project is at present using the collection databases of two museums, the Espoo City Museum and the National Museum of Finland. Furthermore, the implementation is currently restricted to only one part of the collections: textiles.

In order to reach the FMS system's goal of making the museums' metadata semantically interoperable on the Web, the data must be harmonised on the syntactic and semantic level. For this harmonisation the eXtensible Markup Language (XML) and the Resource Description Framework (RDF) are being used, of which RDF is the key language for achieving semantic interoperability of the heterogeneous sets of metadata.
RDF and Metadata - A Natural Fit
An observer of the diffusion of the Resource Description Framework (RDF) into various domains, Robert DuCharme, has commented: 'I still find it a little ironic that while RDF has gotten so much publicity as a technology for warm and fuzzy AI (Artificial Intelligence) pie-in-the-sky technology, it's gotten most of its traction in the mundane world of metadata.'[3]

Yet, given the importance of metadata for the Semantic Web vision in general, it does not come as a surprise that metadata of key information communities belong to the first of RDF's intended uses. RDF seems to gain momentum in particular among the library and other communities that use Dublin Core.

The current W3C RDF Primer (Working Draft, 23 January 2003), edited by Frank Manola and Eric Miller, labels RDF as 'an ideal representation for Dublin Core information' and describes Dublin Core as one of their 'RDF in the field' examples (cf. http://www.w3.org/TR/rdf-primer/).

At the Dublin Core Metadata Initiative (DCMI), 'Expressing Simple Dublin Core in RDF/XML' was announced as a DCMI Recommendation in October 2002, 'the first in a series of recommendations for encoding Dublin Core metadata using mainstream Web technologies', i.e. XML/RDF/XHTML. 'Expressing Qualified Dublin Core in RDF/XML' is currently a Proposed Recommendation. Cf. http://dublincore.org/groups/architecture/
Syntactic Transformation 1: Creating the XML Documents
In the FMS system the eXtensible Markup Language (XML) is used as the data transfer format. This transfer format enables the system to make use of the data originally stored in the museums' heterogeneous collection databases. Therefore each museum participating in the FMS initiative provides the relevant collection data as an XML document repository.

In a process of syntactic harmonisation the data from a museum's collection database are retrieved and transformed to an XML format conforming to the XML Schema of the FMS initiative.

The data to be published are read from the database through a 'view', which helps create the XML format. The view is a queryable interface, a virtual table that results from an SQL query which may join multiple tables of the database. Through the view the data are queried so that the rows of the tables are grouped by collection items. For each item the set of rows is combined into a single XML document.
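The rows-to-XML step described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the FMS implementation: the tables item and detail, the view name item_view and the un-namespaced element names are hypothetical; only the sample values are taken from the record used in this chapter.

```python
import sqlite3
import xml.etree.ElementTree as ET
from itertools import groupby

# Hypothetical two-table layout; the real museum databases are not described here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item (item_id TEXT PRIMARY KEY, type TEXT);
    CREATE TABLE detail (item_id TEXT, field TEXT, value TEXT);
    INSERT INTO item VALUES ('image5kb78d38i', 'Column Miniature');
    INSERT INTO detail VALUES
        ('image5kb78d38i', 'creator', 'Alexander Master'),
        ('image5kb78d38i', 'place', 'Utrecht'),
        ('image5kb78d38i', 'year', 'circa 1430');
    -- The view is a virtual table defined by a query that joins both tables.
    CREATE VIEW item_view AS
        SELECT i.item_id, i.type, d.field, d.value
        FROM item AS i JOIN detail AS d ON d.item_id = i.item_id;
""")


def item_documents(connection):
    """Yield (item_id, XML string) pairs, one document per collection item."""
    rows = connection.execute(
        "SELECT item_id, type, field, value FROM item_view "
        "ORDER BY item_id, field"
    )
    # Rows arrive sorted by item, so groupby collects each item's set of rows.
    for item_id, group in groupby(rows, key=lambda row: row[0]):
        group = list(group)
        root = ET.Element("image", {"image_id": item_id})
        ET.SubElement(root, "type").text = group[0][1]
        for _, _, field, value in group:
            ET.SubElement(root, field).text = value
        yield item_id, ET.tostring(root, encoding="unicode")


documents = dict(item_documents(conn))
print(documents["image5kb78d38i"])
```

The join produces several rows for the one item; grouping by item_id is what turns that set of rows into a single document.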
XML
XML is a markup language for describing data. It is a language created to allow anyone to design the structure of their own documents. An XML document contains text that consists of markup in the form of tags and plain text between them, the latter being just pure information (for example <creator>Alexander Master</creator>). XML tags are not predefined; everyone can define his or her own tags.
FMS Documents
A short presentation is provided in Eero Hyvönen et al.: Cultural Semantic Interoperability on the Web: Case Finnish Museums Online, http://iswc2002.semanticweb.org/posters/hyvonen_a4.pdf; for detailed descriptions see Vilho Raatikka, Eero Hyvönen: Ontology-based Semantic Metadata Validation, and Hyvönen, Eero et al.: Semantic Interoperability on the Web: Case Finnish Museums Online. Both texts can be found in: Towards the Semantic Web and Web Services. Proceedings of the XML Finland 2002 Conference, http://www.cs.helsinki.fi/u/eahyvone/xmlfinland2002/Proceedings/XML2002-final.pdf
ITEM_VIEW (columns of the view): item_id, type, subject, iconclass, creator, manuscript, place, year
XML shares the syntax and bracketed tags of the well-known HyperText Markup Language (HTML), but XML serves a different goal. While HTML is used to define the layout of pages on the WWW, XML is used to define the content of documents, for example to specify that an area of text is the name of a creator.

XML allows for creating markup (e.g. <creator>) that seems to carry some semantics. However, for a computer a tag like <creator> carries as much semantics as a tag like <H1>. A computer simply does not know what a creator is and how the concept creator is related to other concepts (e.g. manuscript). For an XML processor, <H1> and <creator> or <manuscript> are all equally (and totally) meaningless. XML is all about describing data; on its own it does not do anything. There needs to be a processing program that uses the markup to interpret the various elements.

The graphic below illustrates the database rows to XML process as described in the info box. It provides a very simple example of an XML document that describes some data for one of the medieval column miniatures from the Koninklijke Bibliotheek, The Hague, which we were permitted to use for illustrating this Thematic Issue. It includes the Iconclass classification for this image, 71A3421 Eve emerges from Adam's body (for the hierarchical path of this classification see the section on ontologies).

Short explanations for the XML document (image5kb78d38i.xml) shown in the graphic below:

A well-formed XML document is one that conforms to the XML syntax rules, of which we would like to highlight the following:

(1) The document should begin with the XML declaration, which defines the XML version and the character encoding used in the document. In the example below we use <?xml version="1.0" encoding="ISO-8859-1"?>, i.e. the document conforms to the 1.0 specification of XML and uses the ISO-8859-1 (Latin-1/West European) character set.

(2a, 10) The XML document must contain a single tag pair to define a root element, in our example <mi:image> </mi:image>.
Database rows (item data from database rows, grouped by item):
'image5kb78d38i', 'Column Miniature', 'Eve emerges from Adam's Body', '71A3421', 'Alexander Master', 'Historic Bible', 'Utrecht', 'circa 1430'

Rows to XML process

XML document (image5kb78d38i.xml):
(1)  <?xml version="1.0" encoding="ISO-8859-1"?>
(2)  <mi:image xmlns:mi="http://www.m-i.org/images"
         image_id="image5kb78d38i">
(3)    <mi:type>Column Miniature</mi:type>
(4)    <mi:subject>Eve emerges from Adam's body</mi:subject>
(5)    <mi:iconclass>71A3421</mi:iconclass>
(6)    <mi:creator>Alexander Master</mi:creator>
(7)    <mi:manuscript>Historic Bible</mi:manuscript>
(8)    <mi:place>Utrecht</mi:place>
(9)    <mi:year>circa 1430</mi:year>
(10) </mi:image>

Graphic 1: Database rows to XML process
(2b) Namespace: Since element names in XML are not fixed, name conflicts can occur when different documents use the same names describing different types of elements. To prevent such conflicts a unique namespace should be defined using a Uniform Resource Identifier (URI). An XML namespace is a collection of names that are used as element types and attribute names (cf. http://www.w3.org/TR/REC-xml-names/). The namespace declaration in the start tag of our root element is xmlns:mi="http://www.m-i.org/images".

The namespace prefix mi (for medieval images) functions as a placeholder for the namespace name. It needs to show up in all element tags (e.g. <mi:type> </mi:type>).

(2c) image_id="image5kb78d38i": This is the id attribute, which contains the unique identifier of the data source record.

(3-9) All other elements must be within the root element and can themselves have sub-elements (child elements), which must be properly nested within their parent element. Our elements do not have sub-elements.

Other syntax rules are, for example: all start tags must match end tags; because XML tags are case sensitive (i.e. the tag <mi:Creator> is different from the tag <mi:creator>), they must also be written with the same case; all elements must have a closing tag; all attribute values must be within quotation marks (e.g. "image5kb78d38i").
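The rules above can be tried out with any XML parser. The following sketch uses Python's standard xml.etree.ElementTree module on a shortened version of the sample record; the helper name is_well_formed is introduced here for illustration only. It shows how the parser replaces the prefix mi with the namespace name, and that an end tag written in a different case makes the document ill-formed.

```python
import xml.etree.ElementTree as ET

WELL_FORMED = (
    '<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">'
    "<mi:creator>Alexander Master</mi:creator>"
    "</mi:image>"
)
# Same content, but the end tag differs in case from its start tag.
CASE_MISMATCH = WELL_FORMED.replace("</mi:creator>", "</mi:Creator>")


def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


root = ET.fromstring(WELL_FORMED)
# ElementTree writes the namespace name in braces in place of the prefix.
print(root.tag)
print(root[0].tag)
print(is_well_formed(WELL_FORMED), is_well_formed(CASE_MISMATCH))
```

Well-formedness is only a syntax check: the parser accepts any element names, and says nothing about whether the document contains the elements a particular application expects.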
Syntactic Transformation 2: The XML Schema
In order to allow for syntactic harmonisation, the XML documents of the museums should conform to the XML Schema of the FMS initiative. Therefore the museums use the initiative's XML Schema when they create their XML documents, validating them against the Schema. If the documents are valid, the process can continue to the semantic level.

XML Schema
The XML Schema defines the building blocks of an XML document, including:
| elements and attributes that can appear in a document
| which elements are child elements, as well as their order and number
| whether an element is empty or can include text
| the data types for elements and attributes
| as well as default and fixed values for elements and attributes
XML with an XML Schema is designed to be self-descriptive. One of the greatest strengths of XML Schema is that it allows for data typing. The most common data types are xs:string, xs:decimal, xs:integer, xs:boolean, xs:date and xs:time. In the example below, which is the XML Schema for the XML document (image5kb78d38i.xml) shown in graphic 1, we only use the data type xs:string. This data type is used for values that contain character strings.
(1)  <?xml version="1.0"?>
(2a) <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
(2b)     targetNamespace="http://www.m-i.org/images"
(2c)     xmlns="http://www.m-i.org/images"
(2d)     elementFormDefault="qualified">
(3)    <xs:element name="image">
(4)      <xs:complexType>
(5)        <xs:sequence>
(6)          <xs:element name="type" type="xs:string"/>
(7)          <xs:element name="subject" type="xs:string"/>
(8)          <xs:element name="iconclass" type="xs:string"/>
(9)          <xs:element name="creator" type="xs:string"/>
(10)         <xs:element name="manuscript" type="xs:string"/>
(11)         <xs:element name="place" type="xs:string"/>
(12)         <xs:element name="year" type="xs:string"/>
(13)       </xs:sequence>
(14)       <xs:attribute name="image_id" type="xs:string" use="required"/>
(15)     </xs:complexType>
(16)   </xs:element>
(17) </xs:schema>

Short explanations
(1) The XML declaration, which states that the document conforms to the 1.0 specification of XML.
(2a) Determines that the elements and data types that are used to construct the schema come from the W3C's XML Schema namespace. Consequently each of the elements and data types in the schema has the prefix xs:, which identifies them as belonging to the vocabulary of the XML Schema language rather than the vocabulary of our (fictitious) organisation M-i.org.
(2b) Indicates that the elements defined by this schema come from our http://www.m-i.org/images namespace.
(2c) Is our default namespace.
(2d) Demands that any elements used by an XML document which were declared in this schema must be namespace qualified.
(3, 16) The parent element for the data typing of the image descriptions we provide at http://www.m-i.org/images. It is (4) defined as an xs:complexType, i.e. it contains child elements (6-12), which are (5, 13) surrounded by an xs:sequence element that defines an ordered sequence of these elements.
(6-12) The child elements, which in our example are simple types because they do not contain other elements. They define various elements of our XML documents, e.g. "creator", to be of the data type xs:string.
(14) Furthermore, the Schema determines that for the element xs:element name="image" there is a required attribute image_id of the data type xs:string (for example "image5kb78d38i").
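To make concrete what the sample schema demands of a document, the sketch below re-implements three of its constraints by hand in Python: the namespace-qualified root element, the required image_id attribute, and the ordered sequence of simple child elements. It is an illustration only and not an XML Schema validator (Python's standard library does not include one); the function name check_image_document and the dummy element content are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

NS = "http://www.m-i.org/images"
# Child elements in the order in which the xs:sequence of the sample schema lists them.
EXPECTED_SEQUENCE = ["type", "subject", "iconclass", "creator",
                     "manuscript", "place", "year"]


def check_image_document(text):
    """Return a list of problems; an empty list means all checks passed."""
    root = ET.fromstring(text)
    problems = []
    if root.tag != f"{{{NS}}}image":
        problems.append("root element is not image in the expected namespace")
    if root.get("image_id") is None:
        problems.append("required attribute image_id is missing")
    found = [child.tag for child in root]
    wanted = [f"{{{NS}}}{name}" for name in EXPECTED_SEQUENCE]
    if found != wanted:
        problems.append("child elements do not match the declared sequence")
    if any(len(child) for child in root):
        problems.append("a child element contains sub-elements")
    return problems


# Dummy documents: one that satisfies the checks, one with two children swapped.
VALID = (
    f'<mi:image xmlns:mi="{NS}" image_id="image5kb78d38i">'
    + "".join(f"<mi:{name}>x</mi:{name}>" for name in EXPECTED_SEQUENCE)
    + "</mi:image>"
)
SWAPPED = VALID.replace(
    "<mi:type>x</mi:type><mi:subject>x</mi:subject>",
    "<mi:subject>x</mi:subject><mi:type>x</mi:type>",
)

print(check_image_document(VALID))
print(check_image_document(SWAPPED))
```

A document can be perfectly well-formed and still fail such checks; that difference between well-formedness and validity is what the schema adds.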
Major Benefits of XML
In the context of the Semantic Web, XML provides an interoperable syntactical foundation upon which solutions to the issues of representing relationships and meaning can be built. We also want to highlight the many benefits of XML that are adding to its rapid uptake in the first place, and might in the longer term be supportive in realising the Semantic Web vision on a broader scale.

XML is one of the most important standards developments in recent years. It is an international, universal, non-system and non-application specific data exchange standard. XML is international because it employs Unicode. This means that there is no restriction to the western alphabet; Arabic, Chinese, Greek, Hebrew, Thai etc. can be easily integrated.

XML is non-system specific because it is an open standard set by the World Wide Web Consortium (W3C). As such there is no owner of XML. All the major software suppliers support it, and it can be used on any computing platform, from Windows and MacOS to Linux. This makes it easier for organisations to change systems or combine different systems.

XML is also non-application specific, i.e. it can be used in various applications such as data exchange, data harvesting, Web site management etc. XML is gaining ever-wider acceptance in many application domains, including in particular the cultural heritage community.

Bear in mind also that the major collection management software producers have implemented support for XML in their systems, enabling for example the integration of data from different collections and their combination over the Web.

XML allows for multi-channel publishing, i.e. with XML it is easy to produce different products or services from digital cultural heritage assets. Once the data are structured in XML, they can be displayed across a variety of media using an associated style sheet that contains the display information.

Finally, XML can be used to create new languages. For example, the Wireless Markup Language (WML), which is used to mark up Internet applications for handheld devices, is written in XML.
Ontological concepts
The goal of the FMS project is to make metadata of the museums' textiles collections semantically interoperable on the Web. In order to achieve such interoperability, an ontology is being designed that describes the common (lower-level) ontological concepts in this domain of knowledge.[4]

Ontology
In the Semantic Web architecture the semantic relationships are not embedded, but explicitly represented by an ontology, or rather an interrelated set of ontologies. In fact, the wide array of information residing on the Web and the perceived need to make it more machine-processable have acted as a strong impetus for the development of ontology languages.

Yet what is an ontology? An abstract definition of an ontology is that it describes a formal, shared conceptualisation of a particular domain of interest, for example cultural heritage objects held in art museums. In particular, an ontology allows for constraining, expressing and analysing the intended meaning of the shared vocabulary of concepts and relations in a domain of knowledge.[5]
If these concepts and relations are formalised to a high degree, the domain has at hand a major building block for developing semantically aware information systems. With Semantic Web technologies the domain ontology can be made available on the network, cross-referenced with upper-level and other domain ontologies,[6] and remote applications or intelligent software agents can refer to it when they interact to provide a certain information service.

[4] The available documents (in English) on the FMS initiative state that their ontology is being created using RDF Schema (RDFS). To develop a fully fledged ontology, advanced languages such as DAML+OIL or Web Ontology Language (OWL) would be required.
[5] For more elaborate and formal descriptions see Tom Gruber: What is an Ontology? (1995), http://www-ksl.stanford.edu/kst/what-is-an-ontology.html; Nicola Guarino: Ontology-Driven Conceptual Modelling, part 1-3 (2002), http://ontology.ip.rm.cnr.it/Tutorials/
The degree of formalisation of concepts and their relations varies considerably between different domains of knowledge. At the lower end one finds lexicons and simple taxonomies (i.e. ordered classification systems where terms are related hierarchically). At the middle level one might place thesauri, i.e. controlled vocabularies that are structured to show relationships between terms and concepts and, for example, allow for retrieving them from a database. At the high end of formalisation of knowledge there are axiomatised logic theories. Such theories include rules to ensure the well-formedness and logical validity of statements expressed in the language of the scientific discipline.

In the cultural heritage sector a powerful IT-supported example of a hierarchical classification system is Iconclass. This supports the documentation of images, in particular art historical images, by providing a systematic collection of 28,000 ready-made definitions of objects, persons, events, situations and abstract ideas that can be the subject of an image. The definitions consist of an alphanumeric classification code and its textual correlate.[7]

For example, for the image we have described in XML in graphic 1, the Iconclass definition is 71A3421 Eve emerges from Adam's body. The Medieval Illustrated Manuscripts Website of the Koninklijke Bibliotheek, The Hague, has an Iconclass Browser in place[8] that provides the hierarchical path for this concept in the classification system.

An outstanding example of a controlled vocabulary is the Art & Architecture Thesaurus (AAT), one of the Getty Research Institute's Vocabulary Databases. It is a structured vocabulary of more than 125,000 terms and other information about concepts that are used for describing fine art, architecture, decorative arts, archival materials and material culture.[9]

The W3C's Resource Description Framework (RDF) Schema Specification 1.0, in a section on its scope, mentions concept navigation and states: 'Thesauri and library classification schemes are well known examples of hierarchical systems for representing subject taxonomies in terms of the relationships between named concepts. The RDF Schema specification provides sufficient resources for creating RDF models that represent the logical structure of thesauri (and other library classification systems).'[10]

Yet for realising a full-blown cultural heritage ontology for the Semantic Web there are currently limitations on both sides. On the one hand, hierarchical classification systems and structured vocabularies do not lend themselves easily to rich inter-linking of conceptual 'trees'.

A major step further in this direction is the CIDOC object-oriented Conceptual Reference Model (CRM).[11] This provides an ontology of 81 classes and 130 properties, which describes in a formal language concepts and relations relevant to the documentation of cultural heritage.

On the other hand, RDF Schema has limitations when it comes to expressing complex ontological relationships. New languages based on description logics are being developed. These include DAML+OIL and the upcoming Web Ontology Language (OWL), which are capable of fully describing ontologies.

Also worth highlighting is that tools for ontology building are proliferating at the present time. These ontology editors need to be carefully assessed, as their capabilities differ considerably.[12]
Hierarchical path of the Iconclass definition 71A3421:
7 Bible
71 Old Testament
71A Genesis from the creation to the expulsion from paradise and later years of Adam and Eve
71A3 creation of man; the Garden of Eden (Genesis 1:26-2)
71A34 creation of Eve
71A342 Eve is fashioned from Adam's rib
71A3421 Eve emerges from Adam's body
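In the path above every level extends the code of its parent by one character, so the ancestors of a code can be derived from the code itself. The following Python sketch illustrates this for the sample code; it assumes a plain alphanumeric notation like 71A3421, and the function name hierarchical_path is introduced here for illustration. Iconclass notations that carry additional parts would need more careful handling.

```python
def hierarchical_path(notation):
    """Return the ancestors of a plain notation, broadest first, ending with the code itself."""
    return [notation[:length] for length in range(1, len(notation) + 1)]


# Labels copied from the path of the sample image.
LABELS = {
    "7": "Bible",
    "71": "Old Testament",
    "71A": "Genesis from the creation to the expulsion from paradise",
    "71A3": "creation of man; the Garden of Eden",
    "71A34": "creation of Eve",
    "71A342": "Eve is fashioned from Adam's rib",
    "71A3421": "Eve emerges from Adam's body",
}

for code in hierarchical_path("71A3421"):
    print(code, LABELS[code])
```

Such a derived path is what allows a search for a broad concept (for example 71A, Genesis) to retrieve images classified at any narrower level.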
[6] Upper-level ontologies describe the basic concepts and relationships invoked when information about any domain is expressed in natural language.
[7] For in-depth information see the official Iconclass Website, http://www.iconclass.nl
[8] Medieval Illustrated Manuscripts Website, http://www.kb.nl/kb/manuscripts/browser/. The subject access system for the Website was conceived by Mnemosyne Partners, building on the Iconclass classification system and technologies. See the valuable information they provide at http://www.mnemosyne.org/business/msstempex.html
[9] A detailed description of the AAT is provided at http://www.getty.edu/research/tools/vocabulary/aat/about.html
[10] http://www.w3.org/TR/2000/CR-rdf-schema-20000327/
[11] See the information box at the end of the Forum discussion and the sources mentioned in the Semantic Web Terms and Reading List.
[12] Michael Denny: Ontology Building: A Survey of Editing Tools (06-11-2002), http://www.xml.com/pub/a/2002/11/06/ontologies.html
Semantic Transformation 1: The RDF Data Model
Progressing towards semantic interoperability, the metadata in the XML documents are now transformed into RDF statements corresponding to the RDF data model. With these so-called RDF 'triples' the XML metadata elements are mapped to the RDF classes and properties which are defined by the RDF Schema of the FMS initiative (see section 3).

'XML is nothing more than a way to standardize data formats. This is not to underplay XML's importance. A data-format standard makes all of the more glamorous technologies possible, and RDF is the leading example of the benefit that comes once the data format has been standardized. Many proclaim that RDF is really the XML's killer app, and with good reason. Despite all this, RDF remains somewhat obscure. This is mainly because at its core, RDF is very abstract, very dry, and very academic.'
Uche Ogbuji: An introduction to RDF (2000), http://www-106.ibm.com/developerworks/library/w-rdf/?dwzone=xml
RDF Data Model
In order to make Web resources semantically interoperable, we need resources that provide machine-understandable statements about themselves. In the Semantic Web architecture these statements are built by using the Resource Description Framework (RDF).
RDF defines a data model for statements describing typed relationships between uniquely identified resources. RDF distinguishes between:
| resources: familiar examples are a Web page, electronic document or digital image, but in RDF also entities that are not 'network retrievable', e.g. museums, curators or bound medieval manuscripts, can be resources
| properties: these identify a specific aspect, characteristic, attribute or relation used to describe the resource
| statements: these associate a value for a named property with the resource

Hence RDF provides a model for describing relationships between resources in terms of named properties and values. The RDF data model intrinsically supports only binary relations. Its base element is the 'triple', which takes the form of subject, predicate, object: a resource (the subject) is linked to another resource (the object) through an arc labelled with a third resource (the predicate). The semantics of a triple clearly depends on the property used as predicate.

A convenient way to visualise this is to draw nodes for subject and object and an arrow between them for the predicate (see graphic 2). In this labelled, directed graph subject and predicate (property) are Uniform Resource Identifiers (URIs), and the object is either a URI or a literal (which is drawn as a box). Everything in RDF can be represented by a graph with nodes and arcs, and the data model allows for using the same URI as a node and as an arc label. To represent RDF statements in a machine-processable way, RDF builds on XML. With RDF/XML, a specific XML markup language, RDF information can be represented and exchanged between machines.
Triple 1
Subject: http://www.m-i.org/schemas/images#ColumnMiniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Miniature

Triple 2
Subject: http://www.m-i.org/schemas/images#Miniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Image

With these two triples we state that ColumnMiniature is a subclass of Miniature and that Miniature is a subclass of Image.

The predicate in our statements is the rdfs:subClassOf property, which is predefined in the RDF Schema namespace http://www.w3.org/2000/01/rdf-schema#

The classes Image, Miniature and ColumnMiniature would also need to be defined in our RDF Schema namespace http://www.m-i.org/schemas/images#

For details on how to use RDF Schema for defining your domain ontology see section 3. In a nutshell, RDF Schema is a 'higher-level' language which is itself defined using RDF.

Graphic 2: RDF Data Model
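The two statements of graphic 2 can be written down directly as (subject, predicate, object) tuples. The Python sketch below does so and then follows the rdfs:subClassOf arcs to collect all superclasses of a class, relying on the fact that RDF Schema defines rdfs:subClassOf as transitive. The exact form of the fictitious m-i.org namespace (ending in #) and the function name superclasses are assumptions made for this illustration.

```python
MI = "http://www.m-i.org/schemas/images#"    # assumed form of the fictitious namespace
RDFS = "http://www.w3.org/2000/01/rdf-schema#"

# Each statement is one (subject, predicate, object) triple.
triples = {
    (MI + "ColumnMiniature", RDFS + "subClassOf", MI + "Miniature"),
    (MI + "Miniature", RDFS + "subClassOf", MI + "Image"),
}


def superclasses(cls, graph):
    """Collect every class reachable from cls by following rdfs:subClassOf arcs."""
    found = set()
    pending = [cls]
    while pending:
        current = pending.pop()
        for subject, predicate, obj in graph:
            if (subject == current
                    and predicate == RDFS + "subClassOf"
                    and obj not in found):
                found.add(obj)
                pending.append(obj)
    return found


print(sorted(superclasses(MI + "ColumnMiniature", triples)))
```

Although no triple states it directly, the walk finds that ColumnMiniature is also a subclass of Image; this kind of derived statement is what a shared schema makes available to software.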
Semantic Transformation 2 Creating andValidating the RDF StatementsIn the mapping process an editor tool (in the FMScase a tool developed by the project team calledMeedio) receives as input the XML documents andassists in transforming them into semantically validRDF statements (instance descriptions)The toolserves as instance editor which provides a convenientway of finding and selecting from the XML metadataelements correct instance values for a particularpropertyThe editor tool also serves as semanticmetadata validatorWhen the museum cataloguersaves the set of RDF statements corresponding to anXML document the semantics of these statementsare validated against the property constraints of theFMS ontologyThe result of a successful mapping andvalidation process is a unique set of RDF triplescalled the RDF card
RDF Mapping Rules
When RDF is used to define the meaning of XML metadata elements, a set of mapping rules is created. A mapping rule is a template of RDF triples in which XPath expressions are used to identify the actual element values. XPath is a language for addressing parts of an XML document and was designed for use in XML parsing software (XSLT, XPointer and others).
When applying such a rule to an XML document, the XPath expressions are instantiated with matching element values. If the rule matches, the RDF template evaluates to a set of RDF triples in which the XPath expressions are substituted by the corresponding values of the XML elements.
For example, by applying the template rule
<image5kb78d38itype, mi:hasCreator, image5kb78d38icreator>
to the XML document described in the section on XML, the following result would be obtained:
<Image Miniature, mi:hasCreator, ‘Alexander Master’>
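How such a template is instantiated can be sketched in a few lines. This is an illustration under stated assumptions rather than the Meedio implementation: the XML record, its element names and the `apply_rule` helper are hypothetical, and the XPath expressions use only the subset supported by Python's standard `xml.etree.ElementTree` module.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML record; element names and the id value are assumptions.
XML_DOC = """
<image id="5kb78d38i">
  <type>Miniature</type>
  <creator>Alexander Master</creator>
</image>
"""

# A rule is a triple template: an XPath expression for the subject value,
# a fixed property, and an XPath expression for the object value.
RULE = ("./type", "mi:hasCreator", "./creator")


def apply_rule(xml_text, rule):
    """Instantiate the template; return [] if the rule does not match."""
    root = ET.fromstring(xml_text.strip())
    subject_path, predicate, object_path = rule
    subjects = [e.text for e in root.findall(subject_path) if e.text]
    objects = [e.text for e in root.findall(object_path) if e.text]
    # The rule matches only if both expressions select at least one value.
    return [(s, predicate, o) for s in subjects for o in objects]


if __name__ == "__main__":
    for triple in apply_rule(XML_DOC, RULE):
        print(triple)
```

A record without a creator element yields an empty list, which corresponds to the case where the rule does not match.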
mi:hasCreator is an example of an RDF property. Such properties are explained in section 3, RDF Schema.
Note: Due to the limited space permitted, we do not address issues of term mapping. This is an important aspect of the mapping and validation process carried out in the FMS project. Working with metadata from different museums, the project needs to deal with partly different terminologies. Its technical solution to synonymous terms (i.e. different terms referring to the same concepts) is to attach synonym sets to the FMS ontology classes. Where polysemous terms occur (i.e. the same terms refer to different concepts), the editor tool cannot cope and the cataloguer needs to select the correct interpretation.
Semantic Transformation 3: The RDF Schema (RDFS)
The shared ontology for the textiles domain is created by using Resource Description Framework Schema (RDFS). An RDF Schema is a tool for indicating the classes of resources one wants to describe, as well as for defining the properties used to describe those resources. Furthermore, class/sub-class relationships and property/sub-property relationships can be defined. The museums are mapping their metadata to the classes and properties defined by the RDF Schema of the FMS initiative. Thereby they are making the meaning of the metadata explicit and representing them in a harmonised, uniform way.
RDF Schema
In the section Semantic Transformation 1 we described the data model provided by RDF for expressing statements about Web resources. But we also need a vocabulary for the RDF statements, namely classes and properties defined with RDF Schema (RDFS).
In brief, the RDF Schema mechanism provides a pre-defined vocabulary, a basic type system that can be used in creating domain-specific schemas. Its role is to allow for declaring metadata properties (e.g. for ‘type’, ‘subject’ or ‘creator’), to define the classes of resources they may be used with, to restrict possible combinations, and to detect violations of those restrictions.
Defining classes
With RDF Schema (RDFS), Web resources can be defined as instances of one or more classes. In addition, classes can be organised in a hierarchical fashion. As we hold a collection of digital images drawn from illustrated medieval manuscripts, we first need to define a class of things that are images. In RDF Schema a class is any resource having an rdf:type property whose value is the RDFS-defined resource rdfs:Class.
So, using the basic RDF data model, we define:
mi:Image [resource] rdf:type [property] rdfs:Class [value]
The self-defined prefix mi: (for medieval images) stands for the URI reference of our RDF Schema namespace http://www.m-i.org/schemas/images
In our image collection we have various special kinds of digitised images, such as column miniatures, decorated initials, schematic drawings etc. To distinguish, for example, the miniatures, first we need to define a general class Miniature and subclasses of miniatures, e.g. a subclass ColumnMiniature:
mi:Miniature rdf:type rdfs:Class
mi:ColumnMiniature rdf:type rdfs:Class
Secondly, we need to define that mi:ColumnMiniature is a subclass of mi:Miniature, and that mi:Miniature is a subclass of mi:Image, for which we use the predefined rdfs:subClassOf property:
mi:Miniature rdfs:subClassOf mi:Image
mi:ColumnMiniature rdfs:subClassOf mi:Miniature
As the rdfs:subClassOf property is transitive, this means that mi:ColumnMiniature is also implicitly a subclass of mi:Image.
Graphic 2 visualises this with the nodes and arcs of the basic RDF data model.
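The effect of transitivity can be made concrete with a short sketch. It assumes the statements are held as tuples with prefixed names (an illustrative simplification, not an RDF toolkit API) and walks the rdfs:subClassOf arcs to collect every direct and implied superclass. The function name is hypothetical.

```python
SUBCLASS_OF = "rdfs:subClassOf"

# The two explicit subclass statements from the text, as tuples.
statements = {
    ("mi:ColumnMiniature", SUBCLASS_OF, "mi:Miniature"),
    ("mi:Miniature", SUBCLASS_OF, "mi:Image"),
}


def superclasses(cls, statements):
    """Return all direct and implied superclasses of cls."""
    found = set()
    todo = [cls]
    while todo:
        current = todo.pop()
        for s, p, o in statements:
            # Follow each subclass arc once; the 'found' check also
            # guards against cycles in a badly formed schema.
            if s == current and p == SUBCLASS_OF and o not in found:
                found.add(o)
                todo.append(o)
    return found


if __name__ == "__main__":
    print(sorted(superclasses("mi:ColumnMiniature", statements)))
```

Although only two statements are stored, mi:Image is reported as a superclass of mi:ColumnMiniature, which is the implicit relationship the text describes.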
Defining properties
In order to make the meaning of our metadata (i.e. ‘type’) explicit, we need to be capable of declaring specific properties that characterise the classes of things we hold at http://www.m-i.org, e.g. digital images of medieval column miniatures.
Basically, RDF Schema defines properties in terms of the classes of resources to which they apply. This is the role of the rdfs:domain and rdfs:range mechanisms.
rdfs:range
The range constraint defines the class or set of classes whose instances can be values of a particular property. If we want to define the property mi:hasType, we must describe this resource (which we locate at http://www.m-i.org/schemas/images) with an rdf:type property whose value is rdf:Property:
mi:hasType [resource] rdf:type [property] rdf:Property [value]
The following RDF statements indicate that mi:ColumnMiniature is a class, mi:hasType is a property, and RDF statements using the mi:hasType property have instances of mi:ColumnMiniature as values:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:range mi:ColumnMiniature
rdfs:domain
The domain constraint restricts the set of classes whose instances may have a particular property attached to them. If we want to indicate that the property mi:hasType applies to instances of class mi:ColumnMiniature, we would write:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:domain mi:ColumnMiniature
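A validator of the kind described earlier could use these two declarations to check individual statements. The sketch below follows the text in reading domain and range as constraints to be checked. The instance names (mi:img1 and so on), the `types` table and the `violations` function are hypothetical, and subclass relationships are deliberately ignored to keep the example short.

```python
# Constraints as declared in the text for the mi:hasType property.
constraints = {
    "mi:hasType": {
        "domain": "mi:ColumnMiniature",
        "range": "mi:ColumnMiniature",
    },
}

# Hypothetical instances and the classes they belong to.
types = {
    "mi:img1": {"mi:ColumnMiniature"},
    "mi:img2": {"mi:ColumnMiniature"},
    "mi:img3": {"mi:Image"},
}


def violations(statement, types, constraints):
    """Return a list of constraint violations for one statement."""
    s, p, o = statement
    rule = constraints.get(p)
    if rule is None:
        return [f"unknown property {p}"]
    problems = []
    if rule["domain"] not in types.get(s, set()):
        problems.append(f"{s} is not an instance of {rule['domain']}")
    if rule["range"] not in types.get(o, set()):
        problems.append(f"{o} is not an instance of {rule['range']}")
    return problems


if __name__ == "__main__":
    print(violations(("mi:img1", "mi:hasType", "mi:img2"), types, constraints))
    print(violations(("mi:img3", "mi:hasType", "mi:img2"), types, constraints))
```

The first statement passes both checks. The second is rejected because its subject is only known to be an mi:Image, which lies outside the declared domain.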
Benefits of RDF
In a SearchWebServices.com definition of RDF some benefits of RDF are mentioned:
| ‘By providing a consistent framework, RDF will encourage the providing of metadata about Internet resources.
| Because RDF will include a standard syntax for describing and querying data, software that exploits metadata will be easier and faster to produce.
| The standard syntax and query capability will allow applications to exchange information more easily.
| Searchers will get more precise results from searching, based on metadata rather than on indexes derived from full text gathering.
| Intelligent software agents will have more precise data to work with.’13
This is a well-crafted listing of RDF benefits, from provision and exchange of better metadata to agents working with them, hopefully for the benefit of humans. But, as explicitly stated by SearchWebServices.com, these are only potential benefits, i.e. they depend on the level of actual uptake of RDF.
13 whatis.com, searchWebServices.com: Definitions - Resource Description Framework, http://searchwebservices.techtarget.com/sDefinition/0,,sid26_gci213545,00.html (last updated July 27, 2001)
Generating and Using the Knowledge Space
The RDF cards represent the original XML documents at the semantic level. The union of such RDF cards constitutes a knowledge base, which is a harmonised semantic representation of the underlying heterogeneous databases.
However, so far the RDF instance descriptions have not left the museum. The museum has complete control of the information it wants to publish, and it does not need to allow the FMS system access to its internal database system. The RDF data are placed in a public directory on the museum’s WWW server.
The Web crawler of the FMS system harvests the instance descriptions from the different museums, and the system combines them into an RDF repository. This repository is a large semantic graph that consists of the shared ontology and metadata.
How does a user now search and navigate in this knowledge space? In the FMS system this is implemented by server-side software called Ontogator. Based on the semantic graph, this software dynamically generates semantic linkages for the user’s Web browser.
One way of using the FMS system is view-based filtering. The user can select classes of resources from the ontology, and the system finds the instances that match the selected class restrictions. By constraining classes (views) further, the collection instance data searched for are eventually found.
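The combination of harvested RDF cards with the shared ontology, and the class-based filtering on top of it, can be sketched as follows. This is not the Ontogator implementation: the museum prefixes, item identifiers, the mi:DecoratedInitial class and all function names are assumptions made for illustration, and triples are again held as simple tuples.

```python
TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"

# Shared ontology (mi:DecoratedInitial is a hypothetical class name).
ontology = {
    ("mi:ColumnMiniature", SUBCLASS_OF, "mi:Miniature"),
    ("mi:Miniature", SUBCLASS_OF, "mi:Image"),
    ("mi:DecoratedInitial", SUBCLASS_OF, "mi:Image"),
}

# Hypothetical RDF cards harvested from two museums.
card_museum_a = {
    ("a:item1", TYPE, "mi:ColumnMiniature"),
    ("a:item1", "mi:hasCreator", "Alexander Master"),
}
card_museum_b = {
    ("b:item7", TYPE, "mi:Miniature"),
    ("b:item9", TYPE, "mi:DecoratedInitial"),
}


def build_repository(ontology, cards):
    """Union of the shared ontology and all harvested cards."""
    graph = set(ontology)
    for card in cards:
        graph |= card
    return graph


def subclasses_of(cls, graph):
    """Return cls together with all its direct and implied subclasses."""
    result = {cls}
    todo = [cls]
    while todo:
        current = todo.pop()
        for s, p, o in graph:
            if p == SUBCLASS_OF and o == current and s not in result:
                result.add(s)
                todo.append(s)
    return result


def view(graph, selected_classes):
    """Instances that satisfy every selected class restriction."""
    matches = None
    for cls in selected_classes:
        accepted = subclasses_of(cls, graph)
        instances = {s for s, p, o in graph if p == TYPE and o in accepted}
        matches = instances if matches is None else matches & instances
    return matches or set()


if __name__ == "__main__":
    repo = build_repository(ontology, [card_museum_a, card_museum_b])
    print(sorted(view(repo, ["mi:Image"])))
    print(sorted(view(repo, ["mi:Miniature", "mi:ColumnMiniature"])))
```

Selecting mi:Image returns all three items, while adding further class restrictions narrows the result, mirroring the way a user constrains views until the desired instances remain.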
The software also supports topic-based navigation by providing semantic links between topics of interest, the creation of which is based on the collection domain ontology and the related metadata of the collection records. This means that the links also provide the user with an impression of the wider context and pragmatics of the objects in the museums’ collections.
From human users to software agents
As described in the Finnish Museums on the Semantic Web example, the RDF repository is a large semantic graph that consists of the shared ontology and metadata of the participating museums. Such a repository can be queried, and the results, a set of pointers to the relevant resources, can be accessed using Web browsers. The opportunities provided by a system like the one developed by the FMS initiative (e.g. topic-based navigation) are at present restricted primarily to human users.
The Semantic Web vision includes intelligent software agents which ‘understand’ semantic relationships between Web resources and seek relevant information as well as perform transactions for humans.14
This software would be capable of autonomous action, i.e. could run without direct human control or constant supervision, and ideally is very flexible in doing this. Characterisations of this flexibility include actions that are ‘reactive’, ‘proactive’ and ‘social’ (see below).
While the basic idea of agents is very intuitive and appealing, the actual theory is complex, the tools are immature, and the solutions small and prototype-based. In fact, as a parallel, distributed systems technology, agents belong to the most complex class of software technology.
However, this primer will conclude with a summary of what an intelligent software agent is and what such software would generally be capable of doing. This should also serve as an indication of how great the challenge for research and technological development is to make the full Semantic Web vision a reality.
Intelligent Software Agents
The following definitions are taken from Michael Wooldridge’s introduction to multiagent systems.15
Agent
‘An agent is a computer system capable of autonomous action in some environment.’
Intelligent agent
‘An intelligent agent is a computer system capable of flexible autonomous action in some environment.’
Flexible autonomous action
‘By flexible autonomous action we mean reactive, proactive, social.’
| Reactivity: ‘A reactive system is one that maintains an ongoing interaction with its environment, and responds to changes that occur in it (in time for the response to be useful).’
| Proactiveness: ‘An agent serves a purpose, and therefore exhibits goal-directed behaviour, including the capacity to recognise opportunities for useful courses of action.’
| Social ability: ‘Social ability in agents is the ability to interact with other agents (and possibly humans) via some kind of agent communication language, and perhaps cooperate with others.’
Desirable further properties of agents are:
| Mobility: the ability to move around an electronic network.
| Rationality: an agent will act in such a way that it does not prevent itself from achieving its goals (as far as this is possible with a limited set of beliefs representing its world knowledge).
| Learning: an agent will improve its performance over time.
14 T. Berners-Lee, J. Hendler, O. Lassila: Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html
15 M. Wooldridge: An Introduction to Multiagent Systems. Chichester: Wiley, 2002, and http://www.csc.liv.ac.uk/~mjw/pubs/imas/
Resources
In the primer no references are made to the documents of the World Wide Web Consortium (W3C). All relevant W3C recommendations can be found at http://www.w3c.org
Of the wealth of introductory materials on XML and RDF available on the Web, the following in particular are useful to consult for further details:
http://www.w3schools.com/xml/
http://www.w3schools.com/schema/
http://www.w3.org/TR/rdf-primer/
K. S. Candan, H. Liu, R. Suvarna: Resource Description Framework: Metadata and its Applications. In: SIGKDD Explorations, Vol. 3/1 (2001), 6-19
Pierre-Antoine Champin: RDF Tutorial (2001), http://www710.univ-lyon1.fr/~champin/rdf-tutorial/rdf-tutorial.html
S. Decker, M. P. Mitra, S. Melnik: Framework for the Semantic Web: An RDF Tutorial, http://www.ida.liu.se/~asmpa/courses/sweb/rdf/rdf_tutorial.pdf
Graphic 3: Set-up of the Finnish Museums on the Semantic Web System
[Figure: The Finnish Museums on the Semantic Web, overview of the system’s set-up. User client: WWW browser (topic-based navigation, view-based filtering). Server: Ontogator software, RDF database holding the semantic graph (knowledge space of shared ontology and metadata), Web crawler. Museums: collection databases 1 to n (relational schemas, DBMS), each with an XML repository and RDF instances produced with the metadata editor. Interoperability layers: RDF Schema (semantic interoperability) and XML Schema (syntactic interoperability).]
THE DARMSTADT FORUM PARTICIPANTS

Wernher Behrendt, Salzburg Research, Austria
Wernher Behrendt is a Senior Researcher at the Sun Technology and Research Excellence Center at Salzburg Research (Austria), working on multimedia middleware and interoperation issues. He holds an MSc in Cognitive Science from Manchester University and has more than 10 years’ experience in near-to-market IT research. From 1989 to 1995 he was a Senior Research Associate in the Informatics Department at Rutherford Appleton Laboratory (UK), working on embedded knowledge-based systems in distributed multimedia presentation systems. From 1995 to 1998 he was a Senior Research Associate at Cardiff University (UK), working on interoperation between heterogeneous information systems. Mr Behrendt has held courses in Computer Science and has worked on projects ranging from software engineering methods and quality assurance to legacy system reengineering using migration methods and distributed systems middleware.
E-mail: wernher.behrendt@salzburgresearch.at

Paolo Buonora, Archivio di Stato di Roma, Italy
Paolo Buonora holds a degree in Philosophy from the University of Rome ‘La Sapienza’ (1976). He worked in the Italian State Archive Administration from 1978, where he was first involved in editing the Guida Generale degli Archivi di Stato italiani. From 1986 he worked in the Soprintendenza archivistica per il Lazio, surveying audiovisual archives and municipal archives; from 1989 to 1991 he was at Perugia University, engaged in doctoral research in ‘Urban and rural history’; and from 1991 to 1994 he was again in the Soprintendenza archivistica per il Lazio. After 1994 he worked in the Archivio di Stato di Roma, where he was responsible for the photograph service and several working groups on informatics application in archival documentation. From 1997 until the present time he has planned and directed the Imago II project in the Archivio di Stato di Roma.
See http://www.asrm.archivbeniculturali.it/sid/imago/IMAGOIIen.html

Samuel Cruz-Lara, University of Nancy 2 and LORIA, France
Samuel Cruz-Lara obtained a Master’s degree in Computer Science in 1984 (University Henri Poincaré, Nancy 1) and a PhD degree in Computer Science in 1988 (National Polytechnic Institute of Lorraine). The central topic of his PhD thesis was the generation of integrated development environments by using attribute grammars.
He is currently Associate Professor at the University of Nancy 2 (Institute of Technology, Computer Science Department) and permanent Researcher at LORIA (Lorraine Laboratory for Research in Computer Science and its Applications – UMR 7503 – CNRS – INRIA – Universities of Nancy). He is a member of the ‘Language and Dialogue’ team and has conducted several research activities on distributed software architectures and textual linguistic resources management. He is currently working in the context of distributed architectures and multimedia resources management. Dr Cruz-Lara has participated in several projects, in particular CNRS/SILFIDE and MLIS-ELAN, and he is at present co-leader of the ‘Digital Museum’ project. This is a joint project between LORIA and the National Chi-Nan University. The ‘Digital Museum’ project is sponsored by the National Science Council of the Republic of China (Taiwan) and supported by INRIA (France).

Costis Dallas, Critical Publics SA, Greece
Costis Dallas is Chairman and Senior Researcher of Critical Publics (http://www.criticalpublics.com), a group of companies active in the field of strategic communications, creative design and technology. He is currently a Lecturer in the Department of Communication and Mass Media of Panteion University, and has over 15 years of research and professional experience in hypermedia applications, human factor issues and cultural information systems. Dr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA, co-founder and Chair of the Multimedia Working Group of the International Council of Museums (CIDOC/ICOM), Head of Documentation and Systems of the Benaki Museum, General Director of the Foundation of the Hellenic World, Special Secretary of the Greek Ministry of Education in charge of libraries, archives and instructional technologies, and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: info@criticalpublics.com

Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (http://www.nl.adlibsoft.com), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that he joined Databasix, producer of the ADLIB software. In 1991 Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was later renamed ADLIB Information Systems.
ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include:
| CIMI Dublin Core test bed: ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database in addition to the ‘standard’ HTML/XML access.
| SPECTRUM-XML: a project of the UK mda together with a team of software vendors to produce a schema for exchange of data which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software.
| Internet Gelderse Musea (http://www.igem.nl) and Maritiem Digitaal (http://www.maritiemdigitaal.nl): two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.
E-mail: bert@nl.adlibsoft.com

Nicola Guarino, Italian National Research Council, Italy
Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991, and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS) and associate editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: Nicola.Guarino@ladseb.pd.cnr.it

Janneke van Kersen, Digital Heritage Association, The Netherlands
Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992 she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing, and more recently she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University.
Since October 1999 van Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata, and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations such as museums, archives, archaeological organisations, monuments and special collections of libraries in a context-rich and XML-based environment. Interoperability from both a technical and an organisational perspective is therefore a major issue. The Association offers access to heritage information to the general public at http://www.cultuurwijzer.nl and customised access for the educational field at http://www.cultuurwijs.nl
E-mail: jannekevankersen@den.nl

Marco Meli, EDW International, Italy
Marco Meli is CEO and co-founder of EDW International (http://www.edw-international.com), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia and VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meli@edw.it

Paul Miller, UKOLN, UK
Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, http://www.ukoln.ac.uk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK’s Further and Higher Education Funding Councils and Resource: The Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI) and the Metadata Working Group of the UK Government’s Office of the e-Envoy.
E-mail: P.Miller@ukoln.ac.uk

Frank Nack, CWI, The Netherlands
Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on ‘The Application of Video Semantics and Theme Representation for Automated Film Editing’ at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling, and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document, and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: Frank.Nack@cwi.nl

Franco Niccolucci, Florence University, Italy
Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference.
See also http://www.geog.port.ac.uk/hist-bound/people/niccolucci.htm

Seamus Ross, HATII, University of Glasgow, UK
Dr Seamus Ross is Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Network) (IST-2001-32706), a European Union funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously he was Assistant Secretary for Information Technology at the British Academy, and before that worked for a company specialising in expert systems and software development, as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group’s PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: S.Ross@hatii.arts.gla.ac.uk

Andrea Scotti, Institute & Museum for the History of Science, Italy
Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995 he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.
Since 1996 Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza and the National Central Library, and under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (01/02/01-30/07/03), funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies.
See http://cweb.inria.fr/Projects/Mesmuses
E-mail: scottian@galileo.imss.firenze.it

DIGICULT PROJECT INFORMATION

DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over a period of 30 months (03/2002-08/2004).
In order to encourage early take-up, DigiCULT produces seven Thematic Issues and three Technology Watch Reports, along with the newsletter DigiCULT.Info.
DigiCULT draws on the results of the strategic study ‘Technological Landscapes for Tomorrow’s Cultural Economy (DigiCULT)’, which was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications) in 2000 and completed in 2001.
Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at http://www.digicult.info
For further information on DigiCULT please contact the team of the project co-ordinator:
Mr Guntram Geser, guntram.geser@salzburgresearch.at, Phone: +43-(0)662-2288-303
Mr John Pereira, john.pereira@salzburgresearch.at, Phone: +43-(0)662-2288-247
Salzburg Research Forschungsgesellschaft
Jakob-Haringer-Str. 5/III, A-5020 Salzburg, Austria
Phone: +43-(0)662-2288-200
Fax: +43-(0)662-2288-222
http://www.salzburgresearch.at
Project Partner:
HATII - Humanities Advanced Technology and Information Institute, University of Glasgow
http://www.hatii.arts.gla.ac.uk
Contact: Mr Seamus Ross, s.ross@hatii.arts.gla.ac.uk
The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom
DigiCULT Thematic Issue 1 - Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum, held in Barcelona on May 6th, 2002, in the context of the DLM-Conference 2002.
DigiCULT Thematic Issue 2 - Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum, held in Essen, Germany, on September 3rd, 2002, in the context of the AIIM Conference @ DMS EXPO.
DigiCULT Thematic Issue 3 - Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum, held on January 21st, 2003, at Fraunhofer IPSI, Darmstadt, Germany.
DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum on Learning Objects, which will take place at the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, on July 2nd, 2003.
IMPRINT

This Thematic Issue is a product of the DigiCULT Project (IST-2001-34898).
Authors:
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy
Images:
Images for this Thematic Issue have been provided by, and are reproduced with kind permission of, the Koninklijke Bibliotheek – National Library of the Netherlands, The Hague, Netherlands.
Graphics & Layout:
Jan Steindl, Salzburg Research
ISBN 3-902448-00-8
Printed in Austria
© 2003
IMAGES

Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I. Image on p. 5 from Fol. 264r, size 80x75; illuminator: Orosius Master a.o.
Bible Historiale, Paris, c. 1320-1340, Volume I. Image on p. 7 from Fol. 2r, size 45x50; illum.: ‘Sub-Fauvel’ Master
Historic Bible, Utrecht, c. 1430, Volume I. Images on pp. 8, 10, 14, 28 from Fol. 3r, 3v, 4v, 7v, size 55x85 to 60x85; illum.: Alexander Master o.a.
Psalter Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468. Images on pp. 11, 13: Wednesday Matins, Invitatorium, from Fol. 219r
Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350. Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron) from Fol. 104rb1, 106rb2, 124r, 111ra, size 40x55 to 50x55
Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335. Image on p. 21 from Fol. 4va1, size 45x55
Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460. Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41
Psalter, Normandy, c. 1180. Image on p. 26 from Fol. 3v, size 160x125
Breviary, Cambray (?), c. 1275-1300. Images on pp. 27, 29, 31, 33, 34, 35, 37 from Fol. 211r, size 205x135
© Koninklijke Bibliotheek, The Hague. Used with permission.
OIL - Ontology Inference Layer
OIL is a proposal for a Web-based representation and inference layer for ontologies. It uses a layered approach to defining a standard ontology language. Each layer adds functionality and complexity to the previous layer. This is done such that machines that can only process a lower layer can still partially understand high-level ontologies. See http://www.ontoknowledge.org/oil
A white paper on OIL functions: 'An informal description of Standard OIL and Instance OIL', by the On-To-Knowledge group, led by the Department of Computer Science, University of Manchester, UK, 28 November 2000, www.ontoknowledge.org/oil/downl/oil-whitepaper.pdf
Ontologies
Ontology research: Laboratory for Applied Ontology (LOA), Institute of Cognitive Sciences and Technology (ISTC), http://www.ladseb.pd.cnr.it/infor/ontology/ontology.html
Dieter Fensel: Ontologies: A Silver Bullet for Knowledge Management and Electronic Commerce. New York: Springer, 2001
'Ontology Infrastructure for the Semantic Web' includes a detailed subject bibliography. WonderWeb Project, Department of Computer Science, Victoria University of Manchester, UK, http://wonderweb.semanticweb.org/deliverables/documents/D15.pdf
Standard Upper Ontology (SUO): An upper ontology for data interoperability, information search and retrieval, automated inferencing and natural language processing. IEEE Standard Upper Ontology (SUO) Working Group, http://suo.ieee.org
OWL - Web Ontology Language
The Web Ontology Language is a semantic markup language for publishing and sharing ontologies on the World Wide Web. OWL is developed as a vocabulary extension of the Resource Description Framework (RDF) and is derived from the DAML+OIL Web Ontology Language. For the development of this language, see the documents of the Web Ontology (WebOnt) Working Group, http://www.w3.org/2001/sw/WebOnt
See also OWL Web Ontology Language Reference (W3C Working Draft, 31 March 2003) at http://www.w3.org/TR/owl-ref and OWL Web Ontology Language Guide (W3C Working Draft, 31 March 2003), http://www.w3.org/TR/owl-guide
For an understanding of the goals, requirements and usage scenarios for a Web ontology language, see 'Web Ontology Language (OWL) Use Cases and Requirements' (W3C Working Draft, 31 March 2003), http://www.w3.org/TR/webont-req
RDF - Resource Description Framework
See Cultural Heritage Semantic Web Example & Primer, pp. 32-34
Semantic Web
'The Semantic Web', Tim Berners-Lee with James Hendler and Ora Lassila, Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html
'Enhanced Science and the Semantic Web', J. A. Hendler, in Science magazine, Volume 299, Number 5606, 24 January 2003, pp. 520-521
'Peer-to-Peer: The Infrastructure for the Semantic Web', Stanford University. The Semantic Web as the next evolutionary step of the Internet. http://p2p.semanticweb.org
The Semantic Web Community Portal SemanticWeb.org is currently operated by three research groups: the Onto-Agents and Scalable Knowledge Composition (SKC) Research Group at Stanford University, the Ontobroker-Group at the University of Karlsruhe, Germany, and the Protégé Research Group at Stanford University. http://www.semanticweb.org
The W3C Semantic Web Activity Statement explains the consortium's plans in the areas of enabling standards (driven by the RDF Core and Web Ontology Working Groups), education and outreach (RDF Interest Group), as well as co-ordination and advanced development. http://www.w3.org/2001/sw/Activity
See also Kim Veltman's warning of what he sees to be too narrow a definition of the Semantic Web, one that will not allow the historical dimension, the richness of cultural expression, the unique and the diversity of interpretations to be adequately dealt with. Cf. K. Veltman: Challenges for a Semantic Web (July 2002), http://www.cultivate-int.org/issue7/semanticweb
SHOE - Simple HTML Ontology Extensions
SHOE was one of the first ontology-based markup languages developed for use on the World Wide Web. It is a small extension to HTML that allows Web page authors to annotate their Web documents with machine-readable knowledge. See http://www.cs.umd.edu/projects/plus/SHOE
SMIL - Multimedia on the Web
The Synchronized Multimedia Integration Language (SMIL, pronounced 'smile') enables authoring of interactive audiovisual presentations. SMIL is typically used for 'rich media'/multimedia presentations which integrate streaming audio and video with images, text or any other media type. SMIL is an HTML-like language and may be written using a simple text editor. W3C Synchronized Multimedia, http://www.w3.org/AudioVideo
SPECTRUM-XML DTD
SPECTRUM (Standard Procedures for Collections Recording Used in Museums) was created by the mda (http://www.mda.org.uk). It is a guide to good practice for museum documentation that describes procedures for documenting objects and the processes they undergo, as well as the information that needs to be recorded to support the procedures. For SPECTRUM an XML Document Type Definition has been produced, which serves as a system-neutral interchange format for museum data.
'SPECTRUM: The UK Museum Documentation Standard' is available in its second edition, see http://www.mda.org.uk/spectrum.htm
For a description of the creation, structure and deployment of the SPECTRUM-XML DTD, see Bert Degenhart-Drenth: Building on the mda SPECTRUM-XML DTD for Collections Management Data Interchange. Museums and the Web Conference 2001, http://www.archimuse.com/mw2001/papers/degenhart/degenhart.html
Web Services
Although the concept of Web Services is not a new one, the definition of Web Services standards allows the wider use of these services. Currently three main protocols are used in the context of Web Services: UDDI (Universal Description, Discovery and Integration), a registry system to find resources and Web Services; WSDL (Web Service Description Language), an interface description language; and SOAP (Simple Object Access Protocol), the communication protocol for Web Services. All three protocols are based on XML.
For a description of the many ways in which XML can enhance Web Services, see http://www.w3.org/2002/ws/Activity
World Wide Web
Looking back as well as into the future: 'Weaving the Web: The Original Design and Ultimate Destiny of the World Wide Web by Its Inventor', Tim Berners-Lee with Mark Fischetti, Harper San Francisco, 1999
XML - eXtensible Markup Language
See Cultural Heritage Semantic Web Example & Primer, pp. 27-30
For the work done at W3C within the XML activity, see XML Working Groups, http://www.w3.org/XML; the XML Specifications can be found at http://www.w3.org/XML/Core
'The bane of my existence is doing things that I know the computer could do for me.' - Dan Connolly, The XML Revolution, October 1998, http://www.nature.com/nature/webmatters/xml
'The Semantic Web, as it is advocated by people like Tim Berners-Lee and James Hendler, does not take enough advantage of the experience built up in knowledge engineering and conceptual modelling. There is this anarchistic idea of the Web as a place where everyone can do his or her own thing. I have no problem with that; a lot of people are able to find what they want on the Web. But if you want real interoperability, with search engines that can grasp the intended meaning of information, that approach falls short. To create a real Semantic Web, we have to develop and use well-founded generic ontologies based on linguistics and logics.'

Nicola Guarino has clear views on the Semantic Web and its development. He is a senior researcher at the Institute for Cognitive Sciences and Technologies in Italy, where he leads the Laboratory for Applied Ontology. Since 1991 he has played an active role in the Artificial Intelligence community in promoting the interdisciplinary study of ontological foundations of knowledge engineering and conceptual modelling. Guarino: 'In our Laboratory the focus is on content and not so much on representation. The use of ontologies is unavoidable when referring to content. People do it implicitly all the time when they are communicating and trying to understand each other. If we want machines to understand each other, in other words real interoperability, we need to make these ontologies explicit in an unambiguous way.'

An ontology is a hierarchical description of the relations between concepts in a certain domain, plus an unambiguous description of the concepts themselves. As they are created for a certain domain, ontologies often fail to be interoperable because of the ambiguity that results from the use of the same terms for different concepts (and vice versa) between different domains. The term 'net', for instance, has quite a different meaning for Web designers and fishermen. That is why there is a need for well-founded generic ontologies. An example of a concept that belongs in a generic ontology is the term 'part', which can have different meanings both within a domain and between domains. For instance, the violist plays a part in the orchestra. His finger is part of him. Can his finger be part of the orchestra? According to Guarino, this is a genuine ontological problem that can only be solved by giving an unambiguous meaning to the term 'part'.

Another example cited by Guarino is the term 'in'. What exactly are you describing when you say the spoon is in the cup? Does it mean that the spoon is totally embedded in the cup, or is it only partly in the cup? Guarino: 'These examples seem trivial, but if you want real interoperability between different knowledge domains, you will have to prevent the problems that come with the ambiguity of day-to-day language.'

In this respect, Guarino thinks it is a drawback that computer science curricula scarcely ever contain an introduction to ontological foundations of conceptual modelling. 'Students learn all about Java, HTML and C++, and name all the other languages, and they also learn how to use these. But when they graduate they hardly know a thing about formal ontology. I really think people should know more about the work on ontology that has been done in philosophy. It is certainly not much harder to acquire than, say, studying differential equations or learning how to use Java.'

Developing well-founded generic ontologies seems an enormous job, but it is not as big a task as it appears. Guarino: 'I would say that a few dozen would get you on the way nicely. But you have to take the fundamental route. At the moment, development of the Semantic Web is driven by the need for short-term results. Hence interoperability is realised by putting the right tags on the information. That is not what I call semantics; that is syntax. XML and RDF are very useful for this, but they fall short when you want to create a real Semantic Web.'
Laboratory of Applied Ontology: http://ontology.ip.rm.cnr.it
SEMANTIC WEB SHOULD BE BASED ON WELL-FOUNDED ONTOLOGIES
AN INTERVIEW WITH NICOLA GUARINO, LABORATORY OF APPLIED ONTOLOGY, ITALY
By Joost van Kasteren
The Semantic Web, according to a statement of its well-known advocate Tim Berners-Lee, is a vision of 'a distributed machine which should function so as to perform socially useful tasks'.[1] This machine should allow intelligent software agents to understand semantic relationships between Web resources in order to seek relevant information and perform transactions for humans.

Contrasted with the existing human-readable Web, the Semantic Web is envisaged as a Web of machine-readable data that will be based on 'languages for expressing information in a machine processable form'.[2] Key to an understanding of the Semantic Web, therefore, is how these languages function: how information is expressed in order that computers can automatically process Web sources and assist in making the Web more useful for humans. The aim of this chapter is to provide an overview of the Semantic Web concept by describing its general architecture, i.e. the interplay of its languages.
The chapter has two interrelated parts. Part 1 describes a Finnish project that strives to build the foundations for the "Finnish Museums on the Semantic Web" (FMS), a future semantic museum portal. This part consists of the information boxes on the following pages, which briefly describe the necessary elements and steps in the set-up of the FMS system. It is recommended to start by reading this description (see also graphic 3 on page 36, which provides an overview of the set-up of the FMS system). It should be helpful in gaining a general understanding of how semantic interoperability of, and new ways of interacting with, semantically marked-up cultural heritage information can be realised.

Part 2, the texts below the information boxes, is a primer that explains terms used in part 1 which represent core elements of the Semantic Web architecture, as well as providing illustrative examples. The explanations are not intended to give in-depth definitions of these elements; such definitions are provided in the relevant W3C specifications. The examples have been kept as simple as possible, but build on each other. In this way, we will develop a (fictitious) Website, http://www.m-i.org, that provides semantically enhanced access to such marvellous medieval images as the ones we have used to illustrate this Thematic Issue.

How to Make Collection Metadata of Museums Semantically Interoperable on the Web - The "Finnish Museums on the Semantic Web" (FMS)

The Semantic Web concept is visionary, and there are dedicated people, also in the heritage sector, who are trying to make it a reality. In our example, a group of researchers and technology developers who work at the University of Helsinki and the Helsinki Institute for Information Technology are translating the Semantic Web vision for a future semantic museum portal.

The group's two-year project will run until spring 2004 and is being carried out in co-operation with, and with funding from, major organisations including
A CULTURAL HERITAGE SEMANTIC WEB EXAMPLE & PRIMER
By Guntram Geser
[1] Tim Berners-Lee: Interpretation and Semantics on the Semantic Web (1998), http://www.w3.org/DesignIssues/Interpretation.html
[2] Tim Berners-Lee: Semantic Web Road Map (1998), http://www.w3.org/DesignIssues/Semantic.html
[3] Robert DuCharme, http://lists.xml.org/archives/xml-dev/200211/msg00190.html
Espoo City Museum, Helsinki University Museum, National Board of Antiquities, Nokia, TietoEnator and the National Technology Agency (TEKES).

The major goals of the project are to make collection metadata which stem from heterogeneous databases semantically interoperable on the Web, and to provide facilities for semantic browsing and searching in the combined knowledge base of the participating museums.

The project's vision is called "Finnish Museums on the Semantic Web" (FMS) and its architecture allows for all Finnish museums to join in. However, in an approach of starting small but ambitious, the project is at present using the collection databases of two museums, the Espoo City Museum and the National Museum of Finland. Furthermore, the implementation is currently restricted to only one part of the collections: textiles.
In order to reach the FMS system's goal of making the museums' metadata semantically interoperable on the Web, the data must be harmonised on the syntactic and semantic level. For this harmonisation, the eXtensible Markup Language (XML) and the Resource Description Framework (RDF) are being used, of which RDF is the key language for achieving semantic interoperability of the heterogeneous sets of metadata.
RDF and Metadata - A Natural Fit
An observer of the diffusion of the Resource Description Framework (RDF) into various domains, Robert DuCharme, has commented: 'I still find it a little ironic that while RDF has gotten so much publicity as a technology for warm and fuzzy AI (Artificial Intelligence) pie-in-the-sky technology, it's gotten most of its traction in the mundane world of metadata.'[3]

Yet, given the importance of metadata for the Semantic Web vision in general, it does not come as a surprise that metadata of key information communities belong to the first of RDF's intended uses. RDF seems to gain momentum in particular among the library and other communities that use Dublin Core.
The current W3C RDF Primer (Working Draft, 23 January 2003), edited by Frank Manola and Eric Miller, labels RDF as 'an ideal representation for Dublin Core information' and describes Dublin Core as one of its 'RDF in the field' examples (cf. http://www.w3.org/TR/rdf-primer).

At the Dublin Core Metadata Initiative (DCMI), 'Expressing Simple Dublin Core in RDF/XML' was announced as a DCMI Recommendation in October 2002, 'the first in a series of recommendations for encoding Dublin Core metadata using mainstream Web technologies', i.e. XML/RDF/XHTML. 'Expressing Qualified Dublin Core in RDF/XML' is currently a Proposed Recommendation. Cf. http://dublincore.org/groups/architecture
Syntactic Transformation 1: Creating the XML Documents
In the FMS system, the eXtensible Markup Language (XML) is used as the data transfer format. This transfer format enables the system to make use of the data originally stored in the museums' heterogeneous collection databases. Therefore, each museum participating in the FMS initiative provides the relevant collection data as an XML document repository.

In a process of syntactic harmonisation, the data from a museum's collection database are retrieved and transformed to an XML format conforming to the XML Schema of the FMS initiative.

The data to be published are read from the database through a 'view', which helps create the XML format. The view is a queryable interface: a virtual table that results from an SQL query, which may join multiple tables of the database. Through the view, the data are queried so that the rows of the tables are grouped by collection items. For each item, the set of rows is combined into a single XML document.
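The view-to-XML step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the FMS implementation: the table names (items, manuscripts), the view name item_view and the helper functions are hypothetical, the sketch assumes one view row per collection item, and the element names follow the fictitious m-i.org example used in this primer.

```python
import sqlite3
import xml.etree.ElementTree as ET

MI_NS = "http://www.m-i.org/images"  # the primer's fictitious namespace
FIELDS = ["type", "subject", "iconclass", "creator", "manuscript", "place", "year"]


def demo_connection():
    """Build a tiny in-memory database; all table names are hypothetical."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (item_id TEXT, type TEXT, subject TEXT, "
                 "iconclass TEXT, creator TEXT, manuscript_id INTEGER)")
    conn.execute("CREATE TABLE manuscripts (id INTEGER, title TEXT, place TEXT, year TEXT)")
    conn.execute("INSERT INTO manuscripts VALUES (1, 'Historic Bible', 'Utrecht', 'circa 1430')")
    conn.execute("INSERT INTO items VALUES (?, ?, ?, ?, ?, ?)",
                 ("image5kb78d38i", "Column Miniature", "Eve emerges from Adam's body",
                  "71A3421", "Alexander Master", 1))
    # The view is a virtual table defined by an SQL query that joins two tables.
    conn.execute("CREATE VIEW item_view AS "
                 "SELECT i.item_id, i.type, i.subject, i.iconclass, i.creator, "
                 "m.title, m.place, m.year "
                 "FROM items i JOIN manuscripts m ON m.id = i.manuscript_id")
    return conn


def rows_to_xml(conn):
    """Return {item_id: XML text}, one document per row of the view."""
    ET.register_namespace("mi", MI_NS)
    docs = {}
    # Columns are read by position, in the order the view declares them.
    for item_id, *values in conn.execute("SELECT * FROM item_view"):
        root = ET.Element(f"{{{MI_NS}}}image", {"image_id": item_id})
        for name, value in zip(FIELDS, values):
            ET.SubElement(root, f"{{{MI_NS}}}{name}").text = value
        docs[item_id] = ET.tostring(root, encoding="unicode")
    return docs
```

Calling rows_to_xml(demo_connection()) yields one XML document per item, keyed by its identifier. A database that returns several rows per item would need an additional grouping step before the document is built.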
XML
XML is a markup language for describing data. It is a language created to allow anyone to design the structure of their own documents. An XML document contains text that consists of markup in the form of tags and plain text between them, the latter being just pure information (for example <creator>Alexander Master</creator>). XML tags are not predefined; everyone can define his or her own tags.
FMS Documents
A short presentation is provided in Eero Hyvönen et al.: Cultural Semantic Interoperability on the Web. Case Finnish Museums Online, http://iswc2002.semanticweb.org/posters/hyvonen_a4.pdf; for detailed descriptions see Vilho Raatikka, Eero Hyvönen: Ontology-based Semantic Metadata Validation, and Hyvönen, Eero et al.: Semantic Interoperability on the Web. Case Finnish Museums Online. Both texts can be found in: Towards the Semantic Web and Web Services. Proceedings of the XML Finland 2002 Conference, http://www.cs.helsinki.fi/u/eahyvone/xmlfinland2002/ProceedingsXML2002-final.pdf
ITEM_VIEW (columns of the database view): item_id, type, subject, iconclass, creator, manuscript, place, year
XML shares the syntax and bracketed tags of the well-known HyperText Markup Language (HTML), but XML serves a different goal. While HTML is used to define the layout of pages on the WWW, XML is used to define the content of documents, for example to specify that an area of text is the name of a creator.

XML allows for creating markup (e.g. <creator>) that seems to carry some semantics. However, for a computer a tag like <creator> carries as much semantics as a tag like <H1>. A computer simply does not know what a creator is and how the concept creator is related to other concepts (e.g. manuscript). For an XML processor, <H1> and <creator> or <manuscript> are all equally (and totally) meaningless. XML is all about describing data; on its own it does not do anything. There needs to be a processing program that uses the markup to interpret the various elements.
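A minimal Python sketch may make this point concrete. The parser below hands back tag names as plain strings; any 'meaning' comes from a lookup table in the processing program. The table MEANING and the function interpret are hypothetical and exist only for this illustration.

```python
import xml.etree.ElementTree as ET

DOC = "<record><creator>Alexander Master</creator><H1>Alexander Master</H1></record>"

# The application, not the XML parser, decides what a tag means.
# This mapping is a made-up example of such application knowledge.
MEANING = {"creator": "made by"}


def interpret(xml_text):
    """Label each child element using the application's own lookup table."""
    lines = []
    for child in ET.fromstring(xml_text):
        label = MEANING.get(child.tag, "no known meaning")
        lines.append(f"{child.tag} ({label}): {child.text}")
    return lines
```

For the parser, creator and H1 are both just names. Only the entry in MEANING lets the program treat the first as a statement about who made something.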
The graphic below illustrates the database rows to XML process as described in the info box. It provides a very simple example of an XML document that describes some data for one of the medieval column miniatures from the Koninklijke Bibliotheek, The Hague, which we were permitted to use for illustrating this Thematic Issue. It includes the Iconclass classification for this image: 71A3421 Eve emerges from Adam's body (for the hierarchical path of this classification see the section on ontologies).

Short explanations for the XML document (image5kb78d38i.xml) shown in the graphic below:

A well-formed XML document is one that conforms to the XML syntax rules, of which we would like to highlight the following:

(1) The document should begin with the XML declaration, which defines the XML version and the character encoding used in the document. In the example below we use <?xml version="1.0" encoding="ISO-8859-1"?>, i.e. the document conforms to the 1.0 specification of XML and uses the ISO-8859-1 (Latin-1/West European) character set.

(2a, 10) The XML document must contain a single tag pair to define a root element, in our example <mi:image> </mi:image>.
Database rows (grouped by item):
'image5kb78d38i', 'Column Miniature', 'Eve emerges from Adam's Body', '71A3421', 'Alexander Master', 'Historic Bible', 'Utrecht', 'circa 1430'

Rows to XML process

XML document (item data from database rows): image5kb78d38i.xml

(1) <?xml version="1.0" encoding="ISO-8859-1"?>
(2) <mi:image xmlns:mi="http://www.m-i.org/images"
    image_id="image5kb78d38i">
(3) <mi:type>Column Miniature</mi:type>
(4) <mi:subject>Eve emerges from Adam's body</mi:subject>
(5) <mi:iconclass>71A3421</mi:iconclass>
(6) <mi:creator>Alexander Master</mi:creator>
(7) <mi:manuscript>Historic Bible</mi:manuscript>
(8) <mi:place>Utrecht</mi:place>
(9) <mi:year>circa 1430</mi:year>
(10) </mi:image>

Graphic 1: Database rows to XML process
(2b) Namespace: Since element names in XML are not fixed, name conflicts can occur when different documents use the same names to describe different types of elements. To prevent such conflicts, a unique namespace should be defined using a Uniform Resource Identifier (URI). An XML namespace is a collection of names that are used as element types and attribute names (cf. http://www.w3.org/TR/REC-xml-names). Our namespace declaration in the start tag of the root element is xmlns:mi="http://www.m-i.org/images".

The namespace prefix mi (for medieval images) functions as a placeholder for the namespace name. It needs to show up in all element tags (e.g. <mi:type> </mi:type>).
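That the prefix is only a placeholder can be checked with a namespace-aware parser. The following Python sketch is an illustration, not part of the FMS system; the second prefix (img) and the second URI (http://example.org/other) are invented for the comparison.

```python
import xml.etree.ElementTree as ET

# Same namespace URI, two different prefixes.
A = '<mi:type xmlns:mi="http://www.m-i.org/images">Column Miniature</mi:type>'
B = '<img:type xmlns:img="http://www.m-i.org/images">Column Miniature</img:type>'
# Same prefix, but bound to a different (made-up) namespace URI.
C = '<mi:type xmlns:mi="http://example.org/other">Column Miniature</mi:type>'


def qualified_name(xml_text):
    """Return the element name as the parser sees it: {namespace URI}local-name."""
    return ET.fromstring(xml_text).tag
```

The parser replaces the prefix with the namespace name, so documents A and B contain the same element, while C contains a different one despite the identical prefix.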
(2c) image_id="image5kb78d38i": This is the id attribute, which contains the unique identifier of the data source record.

(3-9) All other elements must be within the root element and can themselves have sub-elements (child elements), which must be properly nested within their parent element. Our elements do not have sub-elements.

Other syntax rules are, for example: all start tags must match their end tags and, because XML tags are case sensitive (i.e. the tag <mi:Creator> is different from the tag <mi:creator>), they must also be written with the same case; all elements must have a closing tag; all attribute values must be within quotation marks (e.g. "image5kb78d38i").
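These syntax rules are enforced by any conforming XML parser, which can be used as a simple well-formedness test. The Python sketch below relies on the standard library parser; the function name is_well_formed and the sample strings are made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


GOOD = '<image image_id="image5kb78d38i"><creator>Alexander Master</creator></image>'
CASE_MISMATCH = "<image><creator>Alexander Master</Creator></image>"
UNQUOTED_ATTRIBUTE = "<image image_id=image5kb78d38i></image>"
MISSING_END_TAG = "<image><creator>Alexander Master</image>"
TWO_ROOTS = "<image></image><image></image>"
```

Only GOOD passes. Each of the other strings breaks exactly one of the rules listed above and is rejected by the parser before any application code sees the data.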
Syntactic Transformation 2: The XML Schema
In order to allow for syntactic harmonisation, the XML documents of the museums should conform to the XML Schema of the FMS initiative. Therefore, the museums create their XML documents according to the initiative's XML Schema and validate them against it. If the documents are valid, the process can continue to the semantic level.
XML Schema
The XML Schema defines the building blocks of an XML document, including:
- elements and attributes that can appear in a document;
- which elements are child elements, as well as their order and number;
- whether an element is empty or can include text;
- the data types for elements and attributes;
- default and fixed values for elements and attributes.
XML with an XML Schema is designed to be self-descriptive. One of the greatest strengths of XML Schema is that it allows for data typing. The most common data types are xs:string, xs:decimal, xs:integer, xs:boolean, xs:date and xs:time. In the example below, which is the XML Schema for the XML document (image5kb78d38i.xml) shown in graphic 1, we only use the data type xs:string. This data type is used for values that contain character strings.
Short explanations
(1) The XML declaration, which states that the document conforms to the 1.0 specification of XML.
(2a) Determines that the elements and data types that are used to construct the schema come from the W3C's XML Schema namespace. Consequently, each
(1) <?xml version="1.0"?>
(2a) <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
(2b) targetNamespace="http://www.m-i.org/images"
(2c) xmlns="http://www.m-i.org/images"
(2d) elementFormDefault="qualified">
(3) <xs:element name="image">
(4) <xs:complexType>
(5) <xs:sequence>
(6) <xs:element name="type" type="xs:string"/>
(7) <xs:element name="subject" type="xs:string"/>
(8) <xs:element name="iconclass" type="xs:string"/>
(9) <xs:element name="creator" type="xs:string"/>
(10) <xs:element name="manuscript" type="xs:string"/>
(11) <xs:element name="place" type="xs:string"/>
(12) <xs:element name="year" type="xs:string"/>
(13) </xs:sequence>
(14) <xs:attribute name="image_id" type="xs:string" use="required"/>
(15) </xs:complexType>
(16) </xs:element>
(17) </xs:schema>
of the elements and data types in the schema has the prefix xs:, which identifies them as belonging to the vocabulary of the XML Schema language rather than the vocabulary of our (fictitious) organisation, M-i.org.
(2b) Indicates that the elements defined by this schema come from our http://www.m-i.org/images namespace.
(2c) Is our default namespace.
(2d) Demands that any elements used by an XML document which were declared in this schema must be namespace qualified.
(3, 16) The parent element for the data typing of the image descriptions we provide at http://www.m-i.org/images. It is (4) defined as an xs:complexType, i.e. it contains child elements (6-12), which are (5, 13) surrounded by an xs:sequence element that defines an ordered sequence of these elements.
(6-12) The child elements, which in our example are simple types because they do not contain other elements. They define various elements of our XML documents, e.g. "creator", to be of the data type xs:string.
(14) Furthermore, the Schema determines that for the element xs:element name="image" there is a required attribute image_id of the data type xs:string (for example "image5kb78d38i").
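What the schema demands of a document can be mimicked by a few hand-written checks. The Python sketch below is not an XML Schema validator (the standard library has none, and a real system would use a schema-aware tool); it only mirrors the constraints explained above for our fictitious m-i.org documents. The function check_image_document and the sample document VALID are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = "http://www.m-i.org/images"
# Order of the child elements as declared inside xs:sequence.
SEQUENCE = ["type", "subject", "iconclass", "creator", "manuscript", "place", "year"]

VALID = (
    '<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">'
    + "".join(f"<mi:{name}>x</mi:{name}>" for name in SEQUENCE)
    + "</mi:image>"
)


def check_image_document(xml_text):
    """Return a list of problems; an empty list means all checks passed."""
    root = ET.fromstring(xml_text)
    problems = []
    if root.tag != f"{{{NS}}}image":
        problems.append("root element is not image in the m-i.org namespace")
    if "image_id" not in root.attrib:
        problems.append("required attribute image_id is missing")
    found = [child.tag for child in root]
    expected = [f"{{{NS}}}{name}" for name in SEQUENCE]
    if found != expected:
        problems.append("child elements differ from the declared sequence")
    for child in root:
        if len(child):  # xs:string elements may not contain child elements
            problems.append(f"{child.tag} must not contain child elements")
    return problems
```

A document that passes returns an empty list. Dropping the image_id attribute or swapping two child elements produces one problem each, which corresponds to the required attribute in line (14) and the ordered sequence in lines (5, 13).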
Major Benefits of XML
In the context of the Semantic Web, XML provides an interoperable syntactical foundation upon which solutions to the issues of representing relationships and meaning can be built. We also want to highlight the many benefits of XML that are adding to its rapid uptake in the first place, and might in the longer term be supportive in realising the Semantic Web vision on a broader scale.

XML is one of the most important standards developments in recent years. It is an international, universal data exchange standard that is neither system-specific nor application-specific. XML is international because it employs Unicode. This means that there is no restriction to the western alphabet; Arabic, Chinese, Greek, Hebrew, Thai etc. can be easily integrated.

XML is non-system specific because it is an open standard set by the World Wide Web Consortium (W3C). As such, there is no owner of XML. All the major software suppliers support it, and it can be used on any computing platform from Windows and MacOS to Linux. This makes it easier for organisations to change systems or combine different systems.

XML is also non-application specific, i.e. it can be used in various applications such as data exchange, data harvesting, Web site management etc. XML is gaining ever-wider acceptance in many application domains, including in particular the cultural heritage community.

Bear in mind also that the major collection management software producers have implemented support for XML in their systems, enabling, for example, the integration of data from different collections and their combination over the Web.

XML allows for multi-channel publishing, i.e. with XML it is easy to produce different products or services from digital cultural heritage assets. Once the data are structured in XML, they can be displayed across a variety of media using an associated style sheet that contains the display information.
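The idea of multi-channel publishing can be sketched as one XML source and several renderings. In the Python illustration below, two small functions stand in for style sheets (in practice a style sheet language would be used); the function names and the chosen output formats are assumptions made only for this example.

```python
import xml.etree.ElementTree as ET
from html import escape

NS = {"mi": "http://www.m-i.org/images"}

DOC = """<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">
  <mi:subject>Eve emerges from Adam's body</mi:subject>
  <mi:creator>Alexander Master</mi:creator>
</mi:image>"""


def fields(xml_text):
    """Extract the two values that both output channels need."""
    root = ET.fromstring(xml_text)
    return (root.findtext("mi:subject", namespaces=NS),
            root.findtext("mi:creator", namespaces=NS))


def to_html(xml_text):
    """Channel 1: a fragment for a Web page."""
    subject, creator = fields(xml_text)
    return f"<h1>{escape(subject)}</h1><p>{escape(creator)}</p>"


def to_text(xml_text):
    """Channel 2: a one-line caption, e.g. for a print label."""
    subject, creator = fields(xml_text)
    return f"{subject} ({creator})"
```

Both outputs are derived from the same structured source, so a correction made once in the XML document reaches every channel.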
Finally, XML can be used to create new languages. For example, the Wireless Markup Language (WML), which is used to mark up Internet applications for handheld devices, is written in XML.
Ontological concepts
The goal of the FMS project is to make metadata of the museums' textiles collections semantically interoperable on the Web. In order to achieve such interoperability, an ontology is being designed that describes the common (lower-level) ontological concepts in this domain of knowledge.[4]

Ontology
In the Semantic Web architecture, the semantic relationships are not embedded but explicitly represented by an ontology, or rather an interrelated set of ontologies. In fact, the wide array of information residing on the Web and the perceived need to make it more machine-processable have acted as a strong impetus for the development of ontology languages.

Yet what is an ontology? An abstract definition of an ontology is that it describes a formal, shared conceptualisation of a particular domain of interest, for example cultural heritage objects held in art museums. In particular, an ontology allows for constraining, expressing and analysing the intended meaning of the shared vocabulary of concepts and relations in a domain of knowledge.[5]

If these concepts and relations are formalised to a high degree, the domain has at hand a major building block for developing semantically aware information systems. With Semantic Web technologies, the domain ontology can be made available on the network, cross-referenced with upper-level and
[4] The available documents (in English) on the FMS initiative state that their ontology is being created using RDF Schema (RDFS). To develop a fully fledged ontology, advanced languages such as DAML+OIL or Web Ontology Language (OWL) would be required.
[5] For more elaborate and formal descriptions see Tom Gruber: What is an Ontology? (1995), http://www-ksl.stanford.edu/kst/what-is-an-ontology.html; Nicola Guarino: Ontology-Driven Conceptual Modelling, part 1-3 (2002), http://ontology.ip.rm.cnr.it/Tutorials
other domain ontologies,[6] and remote applications or intelligent software agents can refer to it when they interact to provide a certain information service.

The degree of formalisation of concepts and their relations varies considerably between different domains of knowledge. At the lower end, one finds lexicons and simple taxonomies (i.e. ordered classification systems where terms are related hierarchically). At the middle level, one might place thesauri, i.e. controlled vocabularies that are structured to show relationships between terms and concepts and, for example, allow for retrieving them from a database. At the high end of formalisation of knowledge, there are axiomatised logic theories. Such theories include rules to ensure the well-formedness and logical validity of statements expressed in the language of the scientific discipline.

In the cultural heritage sector, a powerful IT-supported example of a hierarchical classification system is Iconclass. This supports the documentation of images, in particular art historical images, by providing a systematic collection of 28,000 ready-made definitions of objects, persons, events, situations and abstract ideas that can be the subject of an image. The definitions consist of an alphanumeric classification code and its textual correlate.[7]

For example, for the image we have described in XML in graphic 1, the Iconclass definition is 71A3421 Eve emerges from Adam's body. The Medieval Illustrated Manuscripts Website of the Koninklijke Bibliotheek, The Hague, has an Iconclass Browser in place[8] that provides the hierarchical path for this concept in the classification system.
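For a simple notation such as 71A3421, the hierarchical path can be derived from the code itself. The Python sketch below assumes that every additional character of the code adds one level to the hierarchy; this holds for the example used here, but it is a simplification and does not cover more elaborate Iconclass notations. The function name iconclass_path is hypothetical.

```python
def iconclass_path(notation):
    """Return the codes from the top level down to the notation itself.

    Assumes that each additional character adds one level of hierarchy,
    which is true for simple codes such as 71A3421.
    """
    return [notation[:length] for length in range(1, len(notation) + 1)]


# Textual correlates for the top and bottom of the example path.
LABELS = {
    "7": "Bible",
    "71": "Old Testament",
    "71A34": "creation of Eve",
    "71A3421": "Eve emerges from Adam's body",
}
```

iconclass_path("71A3421") produces seven codes, from 7 (Bible) down to 71A3421, each a broader concept than the next. A search for a broader code such as 71A34 can therefore also find images classified with a more specific one.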
An outstanding example of a controlledvocabulary is the Art amp Architecture Thesaurus(AAT) one of the Getty Research InstitutersquosVocabulary Databases It is a structured vocabularyof more than 125000 terms and other informationabout concepts that are used for describing fine artarchitecture decorative arts archival materials andmaterial culture9
The W3C’s Resource Description Framework (RDF) Schema Specification 1.0, in a section on its scope, mentions concept navigation and states: ‘Thesauri and library classification schemes are well known examples of hierarchical systems for representing subject taxonomies in terms of the relationships between named concepts. The RDF Schema specification provides sufficient resources for creating RDF models that represent the logical structure of thesauri (and other library classification systems).’10
Yet, for realising a full-blown cultural heritage ontology for the Semantic Web, there are currently limitations on both sides. On the one hand, hierarchical classification systems and structured vocabularies do not lend themselves easily to rich inter-linking of conceptual ‘trees’.
A major step further in this direction is the CIDOC object-oriented Conceptual Reference Model (CRM).11 This provides an ontology of 81 classes and 130 properties, which describes in a formal language concepts and relations relevant to the documentation of cultural heritage.
On the other hand, RDF Schema has limitations when it comes to expressing complex ontological relationships. New languages based on description logics are being developed. These include DAML+OIL and the upcoming Web Ontology Language (OWL), which are capable of fully describing ontologies.
Also worth highlighting is that tools for ontology building are proliferating at the present time. These ontology editors need to be carefully assessed, as their capabilities differ considerably.12
7 Bible
71 Old Testament
71A Genesis from the creation to the expulsion from paradise and later years of Adam and Eve
71A3 creation of man, the Garden of Eden (Genesis 1:26-2)
71A34 creation of Eve
71A342 Eve is fashioned from Adam’s rib
71A3421 Eve emerges from Adam’s body
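The hierarchical path in the listing above can be read off the classification code itself, because each broader concept carries a code that is a prefix of the narrower one. The following minimal Python sketch illustrates only this prefix relationship for the example code 71A3421. The function name hierarchical_path is an assumption made for illustration, and the sketch does not claim to cover all features of the Iconclass notation.

```python
def hierarchical_path(code):
    """Return the chain of broader codes for a classification code.

    Assumption for illustration: every leading prefix of the code is
    itself a valid, broader classification code, as in the example path
    for 71A3421 shown above.
    """
    return [code[:length] for length in range(1, len(code) + 1)]


# Broadest concept first, most specific concept last.
path = hierarchical_path("71A3421")
for step in path:
    print(step)
```

A browser such as the one described would then look up the textual correlate for each code on the path.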
6 Upper-level ontologies describe the basic concepts and relationships invoked when information about any domain is expressed in natural language.
7 For in-depth information see the official Iconclass Website: httpwwwiconclassnl
8 Medieval Illustrated Manuscripts Website: httpwwwkbnlkb manuscripts browser. The subject access system for the Website was conceived by Mnemosyne Partners, building on the Iconclass classification system and technologies. See the valuable information they provide at httpwwwmnemosyneorg businessmsstempexhtml
9 A detailed description of the AAT is provided at httpwwwgettyeduresearchtoolsvocabularyaatabouthtml
10 httpwwww3orgTR 000CR-rdf-schema-20000327
11 See the information box at the end of the Forum discussion and the sources mentioned in the Semantic Web Terms and Reading List.
12 Michael Denny, Ontology Building: A Survey of Editing Tools (06-11-2002), httpwwwxmlcompuba20021106ontologieshtml
Semantic Transformation 1: The RDF Data Model
Progressing towards semantic interoperability, the metadata in the XML documents are now transformed into RDF statements corresponding to the RDF data model. With these so-called RDF ‘triples’, the XML metadata elements are mapped to the RDF classes and properties which are defined by the RDF Schema of the FMS initiative (see section 3).
‘XML is nothing more than a way to standardize data formats. This is not to underplay XML’s importance. A data-format standard makes all of the more glamorous technologies possible, and RDF is the leading example of the benefit that comes once the data format has been standardized. Many proclaim that RDF is really the XML’s killer app, and with good reason. Despite all this, RDF remains somewhat obscure. This is mainly because, at its core, RDF is very abstract, very dry and very academic.’
Uche Ogbuji, An introduction to RDF (2000), httpwww-106ibmcomdeveloperworkslibraryw-rdfdwzone=xml
RDF Data Model
In order to make Web resources semantically interoperable, we need resources that provide machine-understandable information about themselves. In the Semantic Web architecture, these statements are built by using the Resource Description Framework (RDF). RDF defines a data model for the statements, describing typed relationships between uniquely identified resources. RDF distinguishes between:
| resources: familiar examples are a Web page, an electronic document or a digital image, but in RDF entities that are not ‘network retrievable’, e.g. museums, curators or bound medieval manuscripts, can also be resources;
| properties: these identify a specific aspect, characteristic, attribute or relation used to describe the resource;
| statements: these associate a value for a named property with the resource.
Hence, RDF provides a model for describing relationships between resources in terms of named properties and values. The RDF data model intrinsically supports only binary relations. Its base element is the ‘triple’, which takes the form of subject, predicate, object: a resource (the subject) is linked to another resource (the object) through an arc labelled with a third resource (the predicate). The semantics of a triple clearly depends on the property used as predicate.
A convenient way to visualise this is to draw nodes for subject and object and an arrow between them for the predicate (see graphic 2). In this labelled directed graph, subject and predicate (property) are Uniform Resource Identifiers (URIs), and the object is either a URI or a literal (which is drawn as a box). Everything in RDF can be represented by a graph with nodes and arcs, and the data model allows for using the same URI as a node and as an arc label. To represent RDF statements in a machine-processable way, RDF builds on XML. With RDF/XML, a specific XML markup language, RDF information can be represented and exchanged between machines.
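The triple structure described above can be made concrete with a few lines of Python. This is only a minimal sketch of the data model, not an RDF library: the class names URI, Literal and Triple are chosen for illustration, the '#' at the end of the namespace strings and the image identifier example-miniature are assumptions, and the two subclass statements are the ones shown in graphic 2.

```python
from dataclasses import dataclass

# Namespace strings; the trailing '#' is an assumption made for this sketch.
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
MI = "http://www.m-i.org/schemas/images#"


@dataclass(frozen=True)
class URI:
    value: str


@dataclass(frozen=True)
class Literal:
    value: str


@dataclass(frozen=True)
class Triple:
    subject: URI      # always a URI
    predicate: URI    # always a URI (the property)
    obj: object       # either a URI or a Literal


# Hypothetical identifier of one digital image, used only for illustration.
image = URI("http://www.m-i.org/images/example-miniature")

graph = {
    Triple(URI(MI + "ColumnMiniature"), URI(RDFS + "subClassOf"), URI(MI + "Miniature")),
    Triple(URI(MI + "Miniature"), URI(RDFS + "subClassOf"), URI(MI + "Image")),
    Triple(image, URI(MI + "hasCreator"), Literal("Alexander Master")),
}


def objects(graph, subject, predicate):
    """Follow all arcs labelled `predicate` that leave the node `subject`."""
    return {t.obj for t in graph if t.subject == subject and t.predicate == predicate}


print(objects(graph, URI(MI + "ColumnMiniature"), URI(RDFS + "subClassOf")))
```

Because a set of triples is simply a set of labelled arcs, merging two descriptions of the same resource amounts to taking the union of their triples.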
[Graphic 2: RDF Data Model. A labelled directed graph showing two triples (subject, predicate, object):
httpwwwm-iorgschemasimagesColumnMiniature, httpwwww3org200001rdf-schemasubClassOf, httpwwwm-iorgimagesschemasMiniature
httpwwwm-iorgimagesschemasMiniature, httpwwww3org200001rdf-schemasubClassOf, httpwwwm-iorgimagesschemasImage]
With these two triples we state that ColumnMiniature is a subclass of Miniature and that Miniature is a subclass of Image.
The predicate in our statements is the rdfs:subClassOf property, which is predefined in the RDF Schema namespace httpwwww3org200001rdf-schema
The classes Image, Miniature and ColumnMiniature would also need to be defined in our RDF Schema namespace httpwwwm-iorgschemasimages
For details on how to use RDF Schema for defining your domain ontology, see section 3. In a nutshell, RDF Schema is a ‘higher-level’ language which is itself defined using RDF.
Semantic Transformation 2: Creating and Validating the RDF Statements
In the mapping process, an editor tool (in the FMS case, a tool developed by the project team, called Meedio) receives as input the XML documents and assists in transforming them into semantically valid RDF statements (instance descriptions). The tool serves as an instance editor, which provides a convenient way of finding and selecting from the XML metadata elements correct instance values for a particular property. The editor tool also serves as a semantic metadata validator. When the museum cataloguer saves the set of RDF statements corresponding to an XML document, the semantics of these statements are validated against the property constraints of the FMS ontology. The result of a successful mapping and validation process is a unique set of RDF triples, called the RDF card.
RDF Mapping Rules
When RDF is used to define the meaning of XML metadata elements, a set of mapping rules is created. A mapping rule is a template of RDF triples in which XPath expressions are used to identify the actual element values. XPath is a language for addressing parts of an XML document and was designed for use in XML parsing software (XSLT, XPointer and others).
When applying such a rule to an XML document, the XPath expressions are instantiated with matching element values. If the rule matches, the RDF template evaluates to a set of RDF triples in which the XPath expressions are substituted by the corresponding values of the XML elements.
For example, by applying the template rule
<image5kb78d38itype, mi:hasCreator, image5kb78d38icreator>
to the XML document described in the section on XML, the following result would be obtained:
<Image Miniature, mi:hasCreator, ‘Alexander Master’>
mi:hasCreator is an example of an RDF property. Such properties are explained in section 3, RDF Schema.
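How a mapping rule of this kind is instantiated can be sketched in Python with the standard library. The XML fragment, its element names and the function apply_rule below are hypothetical and serve only to illustrate the principle; they are not the FMS documents or the Meedio tool. Note also that xml.etree.ElementTree supports only a limited subset of XPath.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML metadata record, invented for this illustration.
XML_DOC = """
<image id="example-1">
  <type>Miniature</type>
  <creator>Alexander Master</creator>
</image>
"""

# A mapping rule as a triple template: the subject and object slots hold
# path expressions, the middle slot holds the RDF property.
RULE = ("./type", "mi:hasCreator", "./creator")


def apply_rule(rule, root):
    """Instantiate the template against one XML record.

    Returns a list with one triple if both path expressions match,
    and an empty list if the rule does not match.
    """
    subject_path, rdf_property, object_path = rule
    subject = root.findtext(subject_path)
    obj = root.findtext(object_path)
    if subject is None or obj is None:
        return []
    return [(subject.strip(), rdf_property, obj.strip())]


root = ET.fromstring(XML_DOC)
triples = apply_rule(RULE, root)
print(triples)
```

A rule whose path expressions find no matching element simply produces no triples, which leaves it to the cataloguer to decide how the record should be described.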
Note: Due to the limited space permitted, we do not address issues of term mapping. This is an important aspect of the mapping and validation process carried out in the FMS project. Working with metadata from different museums, the project team needs to deal with partly different terminologies. Their technical solution to synonymous terms (i.e. different terms referring to the same concept) is to attach synonym sets to the FMS ontology classes. The editor tool cannot cope with situations where polysemous terms occur (i.e. the same term refers to different concepts); here the cataloguer needs to select the correct interpretation.
Semantic Transformation 3: The RDF Schema (RDFS)
The shared ontology for the textiles domain is created by using Resource Description Framework Schema (RDFS). An RDF Schema is a tool for indicating the classes of resources one wants to describe, as well as for defining the properties used to describe those resources. Furthermore, class/sub-class relationships and property/sub-property relationships can be defined. The museums are mapping their metadata to the classes and properties defined by the RDF Schema of the FMS initiative. Thereby they are making the meaning of the metadata explicit and representing them in a harmonised, uniform way.
RDF Schema
In section Semantic Transformation 1, we have described the data model provided by RDF for expressing statements about Web resources. But we also need a vocabulary for the RDF statements, namely classes and properties defined with RDF Schema (RDFS).
In brief, the RDF Schema mechanism provides a pre-defined vocabulary, a basic type system that can be used in creating domain-specific schemas. Its role is to allow for declaring metadata properties (e.g. for ‘type’, ‘subject’ or ‘creator’), to define the classes of resources they may be used with, to restrict possible combinations, and to detect violations of those restrictions.
Defining classes
With RDF Schema (RDFS), Web resources can be defined as instances of one or more classes. In addition, classes can be organised in a hierarchical fashion. As we hold a collection of digital images drawn from illustrated medieval manuscripts, we first need to define a class of things that are images. In RDF Schema, a class is any resource having an rdf:type property whose value is the RDFS-defined resource rdfs:Class.
So, using the basic RDF data model, we define:
mi:Image [resource] rdf:type [property] rdfs:Class [value]
The self-defined prefix mi (for medieval images) stands for the URI reference of our RDF Schema namespace, httpwwwm-iorgschemasimages
In our image collection we have various special kinds of digitised images, such as column miniatures, decorated initials, schematic drawings, etc. To distinguish, for example, the miniatures, we first need to define a general class Miniature and subclasses of miniatures, e.g. a subclass ColumnMiniature:
mi:Miniature rdf:type rdfs:Class
mi:ColumnMiniature rdf:type rdfs:Class
Secondly, we need to define that mi:ColumnMiniature is a subclass of mi:Miniature and that mi:Miniature is a subclass of mi:Image, for which we use the predefined rdfs:subClassOf property:
mi:Miniature rdfs:subClassOf mi:Image
mi:ColumnMiniature rdfs:subClassOf mi:Miniature
As the rdfs:subClassOf property is transitive, this means that mi:ColumnMiniature is also implicitly a subclass of mi:Image.
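The implicit subclass relationship can be derived mechanically. The following Python sketch, in which the dictionary SUBCLASS_OF and the function superclasses are names chosen for illustration, follows the declared rdfs:subClassOf statements upwards until no new classes are found.

```python
# Declared rdfs:subClassOf statements: class -> directly declared superclasses.
SUBCLASS_OF = {
    "mi:ColumnMiniature": {"mi:Miniature"},
    "mi:Miniature": {"mi:Image"},
}


def superclasses(cls, subclass_of):
    """Return all direct and implied superclasses of `cls`."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in subclass_of.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


# mi:Image is never declared as a superclass of mi:ColumnMiniature,
# but it is reached through mi:Miniature.
print(superclasses("mi:ColumnMiniature", SUBCLASS_OF))
```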
Graphic 2 visualises this with the nodes and arcs of the basic RDF data model.
Defining properties
In order to make the meaning of our metadata (i.e. ‘type’) explicit, we need to be capable of declaring specific properties that characterise the classes of things we hold at httpwwwm-iorg, e.g. digital images of medieval column miniatures.
Basically, RDF Schema defines properties in terms of the classes of resources to which they apply. This is the role of the rdfs:domain and rdfs:range mechanisms.
rdfs:range
The range constraint defines the class or set of classes whose instances can be values of a particular property. If we want to define the property mi:hasType, we must describe this resource (which we locate at httpwwwm-iorgschemasimages) with an rdf:type property whose value is rdf:Property:
mi:hasType [resource] rdf:type [property] rdf:Property [value]
The following RDF statements indicate that mi:ColumnMiniature is a class, mi:hasType is a property, and RDF statements using the mi:hasType property have instances of mi:ColumnMiniature as values:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:range mi:ColumnMiniature
rdfs:domain
The domain constraint restricts the set of classes whose instances may have a particular property attached to them. If we want to indicate that the property mi:hasType applies to instances of class mi:ColumnMiniature, we would write:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:domain mi:ColumnMiniature
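Reading domain and range as constraints, as the text does, a statement can be checked against them in a few lines. The Python sketch below is a simplified illustration: the instance identifiers with the ex: prefix, the TYPES table and the function violations are hypothetical, and subclass relationships are deliberately ignored to keep the check short.

```python
# Constraints taken from the statements above.
DOMAIN = {"mi:hasType": "mi:ColumnMiniature"}
RANGE = {"mi:hasType": "mi:ColumnMiniature"}

# Hypothetical instances and their classes, invented for this illustration.
TYPES = {
    "ex:img1": {"mi:ColumnMiniature"},
    "ex:img2": {"mi:ColumnMiniature"},
    "ex:page1": {"mi:Image"},
}


def violations(statement):
    """Return which constraints a (subject, property, object) statement breaks."""
    subject, rdf_property, obj = statement
    problems = []
    if rdf_property in DOMAIN and DOMAIN[rdf_property] not in TYPES.get(subject, set()):
        problems.append("domain")
    if rdf_property in RANGE and RANGE[rdf_property] not in TYPES.get(obj, set()):
        problems.append("range")
    return problems


print(violations(("ex:img1", "mi:hasType", "ex:img2")))   # both constraints satisfied
print(violations(("ex:page1", "mi:hasType", "ex:img1")))  # subject is outside the domain
```

A validator of the kind described for the editor tool would report such violations to the cataloguer before the statements are saved.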
Benefits of RDF
In a SearchWebServices.com definition of RDF, some benefits of RDF are mentioned:
| ‘By providing a consistent framework, RDF will encourage the providing of metadata about Internet resources.
| Because RDF will include a standard syntax for describing and querying data, software that exploits metadata will be easier and faster to produce.
| The standard syntax and query capability will allow applications to exchange information more easily.
| Searchers will get more precise results from searching, based on metadata rather than on indexes derived from full text gathering.
| Intelligent software agents will have more precise data to work with.’13
This is a well-crafted listing of RDF benefits, from provision and exchange of better metadata to agents working with them, hopefully for the benefit of humans. But, as explicitly stated by SearchWebServices.com, these are only potential benefits, i.e. they depend on the level of actual uptake of RDF.
13 whatis.com, searchWebServices.com Definitions - Resource Description Framework, httpsearchwebservicestechtargetcomsDefinition0sid26_gci21354500html (last updated July 27, 2001)
Generating and Using the Knowledge Space
The RDF cards represent the original XML documents at the semantic level. The union of such RDF cards constitutes a knowledge base, which is a harmonised semantic representation of the underlying heterogeneous databases.
However, so far the RDF instance descriptions have not left the museum. The museum has complete control of the information it wants to publish, and it does not need to allow the FMS system access to its internal database system. The RDF data are placed in a public directory on the museum’s WWW server.
The Web crawler of the FMS system harvests the instance descriptions from the different museums, and the system combines them into an RDF repository. This repository is a large semantic graph that consists of the shared ontology and metadata.
How does a user now search and navigate in this knowledge space? In the FMS system, this is implemented by server-side software called Ontogator. Based on the semantic graph, this software dynamically generates semantic linkages for the user’s Web browser.
One way of using the FMS system is view-based filtering. The user can select classes of resources from the ontology, and the system finds the instances that match the selected class restrictions. By constraining the classes (views) further, the user eventually arrives at the collection instance data being searched for.
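The combination of harvested RDF cards and view-based filtering can be sketched as follows. This Python fragment is an illustration of the principle only, not the Ontogator implementation: the item identifiers, the ex: classes and all function names are hypothetical. The repository is the union of the harvested cards, and each selected class narrows the result set further.

```python
# Shared ontology: class -> directly declared superclasses.
SUBCLASS_OF = {
    "mi:ColumnMiniature": {"mi:Miniature"},
    "mi:Miniature": {"mi:Image"},
}

# Hypothetical RDF cards harvested from two museums.
card_museum_a = {
    ("a:item1", "rdf:type", "mi:ColumnMiniature"),
    ("a:item1", "rdf:type", "ex:FifteenthCentury"),
}
card_museum_b = {
    ("b:item7", "rdf:type", "mi:Miniature"),
    ("b:item7", "rdf:type", "ex:FourteenthCentury"),
}

# The repository is the union of all harvested cards.
repository = card_museum_a | card_museum_b


def subclasses_of(cls, subclass_of):
    """Return `cls` together with all of its direct and implied subclasses."""
    result = {cls}
    changed = True
    while changed:
        changed = False
        for child, parents in subclass_of.items():
            if child not in result and parents & result:
                result.add(child)
                changed = True
    return result


def view_filter(repository, selected_classes, subclass_of):
    """Return the instances that match every selected class restriction."""
    matches = None
    for cls in selected_classes:
        wanted = subclasses_of(cls, subclass_of)
        found = {s for (s, p, o) in repository if p == "rdf:type" and o in wanted}
        matches = found if matches is None else matches & found
    return matches if matches is not None else set()


print(view_filter(repository, ["mi:Miniature"], SUBCLASS_OF))
print(view_filter(repository, ["mi:Miniature", "ex:FifteenthCentury"], SUBCLASS_OF))
```

Selecting a general class returns items from both museums, while adding a second restriction narrows the result, which mirrors the stepwise constraining of views described above.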
The software also supports topic-based navigation by providing semantic links between topics of interest, the creation of which is based on the collection domain ontology and the related metadata of the collection records. This means that the links also provide the user with an impression of the wider context and pragmatics of the objects in the museums’ collections.
From human users to software agents
As described in the Finnish Museums on the Semantic Web example, the RDF repository is a large semantic graph that consists of the shared ontology and metadata of the participating museums. Such a repository can be queried, and the results, a set of pointers to the relevant resources, can be accessed using Web browsers. The opportunities provided by a system like the one developed by the FMS initiative (e.g. topic-based navigation) are at present restricted primarily to human users.
The Semantic Web vision includes intelligent software agents which ‘understand’ semantic relationships between Web resources and seek relevant information as well as perform transactions for humans.14
This software would be capable of autonomous action, i.e. it could run without direct human control or constant supervision, and ideally is very flexible in doing this. Characterisations of this flexibility include actions that are ‘reactive’, ‘proactive’ and ‘social’ (see below).
While the basic idea of agents is very intuitive and appealing, the actual theory is complex, the tools are immature, and the solutions are small and prototype-based. In fact, as a parallel, distributed systems technology, agents belong to the most complex class of software technology.
However, this primer will conclude with a summary of what an intelligent software agent is and what such software would generally be capable of doing. This should also serve as an indication of how great the challenge for research and technological development is to make the full Semantic Web vision a reality.
Intelligent Software Agents
The following definitions are taken from Michael Wooldridge’s introduction to multiagent systems.15
Agent
‘An agent is a computer system capable of autonomous action in some environment.’
Intelligent agent
‘An intelligent agent is a computer system capable of flexible autonomous action in some environment.’
Flexible autonomous action
‘By flexible autonomous action we mean reactive, proactive, social.’
| Reactivity: ‘A reactive system is one that maintains an ongoing interaction with its environment and responds to changes that occur in it (in time for the response to be useful).’
| Proactiveness: ‘An agent serves a purpose and therefore exhibits goal-directed behaviour, including the capacity to recognise opportunities for useful courses of action.’
| Social ability: ‘Social ability in agents is the ability to interact with other agents (and possibly humans) via some kind of agent communication language, and perhaps cooperate with others.’
Desirable further properties of agents are:
| Mobility: the ability to move around an electronic network.
| Rationality: an agent will act in such a way that it does not prevent itself from achieving its goals (as far as this is possible with a limited set of beliefs representing its world knowledge).
| Learning: an agent will improve its performance over time.
14 T. Berners-Lee, J. Hendler, O. Lassila, Scientific American, May 2001, httpwwwsciamcom20010501issue0501berners-leehtml
15 M. Wooldridge, An Introduction to Multiagent Systems, Chichester: Wiley, 2002, and httpwwwcsclivacuk~mjwpubsimas
Resources
In the primer, no references are made to the documents of the World Wide Web Consortium (W3C). All relevant W3C recommendations can be found at httpwwww3corg
Of the wealth of introductory materials on XML and RDF available on the Web, the following in particular are useful to consult for further details:
httpwwww3schoolscomxml
httpwwww3schoolscomschema
httpwwww3orgTRrdf-primer
K. S. Candan, H. Liu, R. Suvarna, Resource Description Framework: Metadata and its Applications. In: SIGKDD Explorations, Vol. 3/1 (2001), 6-19
Pierre-Antoine Champin, RDF Tutorial (2001), httpwww710univ-lyon1fr~champinrdf-tutorialrdf-tutorialhtml
S. Decker, M. P. Mitra, S. Melnik, Framework for the Semantic Web: An RDF Tutorial, httpwwwidaliuse~asmpacoursesswebrdfrdf_tutorialpdf
The Finnish Museums on the Semantic Web: Overview of the System’s Set-up
[Graphic 3: Set-up of the Finnish Museums on the Semantic Web System. Components shown: User client (WWW browser, with topic-based navigation and view-based filtering); Server (Ontogator software; RDF database holding the semantic graph, i.e. the knowledge space of shared ontology and metadata); Web crawler; RDF instances (RDF Schema: semantic interoperability); Metadata Editor and XML repositories (XML Schema: syntactic interoperability); Collection databases 1, 2 ... n (relational schemas, DBMS).]
THE DARMSTADT FORUM PARTICIPANTS

Wernher Behrendt, Salzburg Research, Austria
Wernher Behrendt is a Senior Researcher at the Sun Technology and Research Excellence Center at Salzburg Research (Austria), working on multimedia middleware and interoperation issues. He holds an MSc in Cognitive Science from Manchester University and has more than 10 years’ experience in near-to-market IT research. From 1989 to 1995 he was a Senior Research Associate in the Informatics Department at Rutherford Appleton Laboratory (UK), working on embedded knowledge-based systems in distributed multimedia presentation systems. From 1995 to 1998 he was a Senior Research Associate at Cardiff University (UK), working on interoperation between heterogeneous information systems. Mr Behrendt has held courses in Computer Science and has worked on projects ranging from software engineering methods and quality assurance to legacy system reengineering using migration methods and distributed systems middleware.
E-mail: wernherbehrendtsalzburgresearchat

Paolo Buonora, Archivio di Stato di Roma, Italy
Paolo Buonora holds a degree in Philosophy from the University of Rome ‘La Sapienza’ (1976). He worked in the Italian State Archive Administration from 1978, where he was first involved in editing the Guida Generale degli Archivi di Stato italiani. From 1986 he worked in the Soprintendenza archivistica per il Lazio, surveying audiovisual archives and municipal archives; from 1989 to 1991 he was at Perugia University, engaged in doctoral research in ‘Urban and rural history’; and from 1991 to 1994 he worked again in the Soprintendenza archivistica per il Lazio. After 1994 he worked in the Archivio di Stato di Roma, where he was responsible for the photograph service and several working groups on informatics application in archival documentation. From 1997 until the present time he has planned and directed the Imago II project in the Archivio di Stato di Roma.
See httpwwwasrmarchivbeniculturaliitsidimagoIMAGOIIenhtml

Samuel Cruz-Lara, University of Nancy 2 and LORIA, France
Samuel Cruz-Lara obtained a Master’s degree in Computer Science in 1984 (University Henri Poincaré, Nancy 1) and a PhD degree in Computer Science in 1988 (National Polytechnic Institute of Lorraine). The central topic of his PhD thesis was the generation of integrated development environments by using attribute grammars. He is currently Associate Professor at the University of Nancy 2 (Institute of Technology, Computer Science Department) and permanent Researcher at LORIA (Lorraine Laboratory for Research in Computer Science and its Applications – UMR 7503 – CNRS – INRIA – Universities of Nancy). He is a member of the ‘Language and Dialogue’ team and has conducted several research activities on distributed software architectures and textual linguistic resources management. He is currently working in the context of distributed architectures and multimedia resources management. Dr Cruz-Lara has participated in several projects, in particular CNRS/SILFIDE and MLIS-ELAN, and he is at present co-leader of the ‘Digital Museum’ project. This is a joint project between LORIA and the National Chi-Nan University. The ‘Digital Museum’ project is sponsored by the National Science Council of the Republic of China (Taiwan) and supported by INRIA (France).

Costis Dallas, Critical Publics SA, Greece
Costis Dallas is Chairman and Senior Researcher of Critical Publics (httpwwwcriticalpublicscom), a group of companies active in the field of strategic communications, creative design and technology. He is currently a Lecturer in the Department of Communication and Mass Media of Panteion University and has over 15 years of research and professional experience in hypermedia applications, human factor issues and cultural information systems. Dr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA, co-founder and Chair of the Multimedia Working Group of the International Council of Museums (CIDOC/ICOM), Head of Documentation and Systems of the Benaki Museum, General Director of the Foundation of the Hellenic World, Special Secretary of the Greek Ministry of Education in charge of libraries, archives and instructional technologies, and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: infocriticalpublicscom
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (httpwwwnladlibsoftcom), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that he joined Databasix, producer of the ADLIB software. In 1991 Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was renamed later as ADLIB Information Systems. ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include:
CIMI Dublin Core test bed - ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database, in addition to the ‘standard’ HTML/XML access.
SPECTRUM-XML - A project of the UK mda, together with a team of software vendors, to produce a schema for exchange of data which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software.
Internet Gelderse Musea (httpwwwigemnl) and Maritiem Digitaal (httpwwwmaritiemdigitaalnl) - Two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.
E-mail: bertnladlibsoftcom
Nicola Guarino, Italian National Research Council, Italy
Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991 and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS) and associate editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: NicolaGuarinoladsebpdcnrit
Janneke van Kersen, Digital Heritage Association, The Netherlands
Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992, she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing, and more recently she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University. Since October 1999 Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata, and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations such as museums, archives, archaeological organisations, monuments and special collections of libraries in a context-rich and XML-based environment. Interoperability from both a technical and an organisational perspective is therefore a major issue. The Association offers access to heritage information to the general public at httpwwwcultuurwijzernl and customised access for the educational field at httpwwwcultuurwijsnl
E-mail: jannekevankersendennl
Marco Meli, EDW International, Italy
Marco Meli is CEO and co-founder of EDW International (httpwwwedw-internationalcom), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia and VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meliedwit
Paul Miller, UKOLN, UK
Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, httpwwwukolnacuk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK’s Further and Higher Education Funding Councils and Resource, the Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI) and the Metadata Working Group of the UK Government’s Office of the e-Envoy.
E-mail: PMillerukolnacuk
Frank Nack, CWI, The Netherlands
Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on ‘The Application of Video Semantics and Theme Representation for Automated Film Editing’ at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling, and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: FrankNackcwinl
Franco Niccolucci, Florence University, Italy
Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference.
See also httpwwwgeogportacukhist-boundpeopleniccoluccihtm
Seamus Ross, HATII, University of Glasgow, UK
Dr Seamus Ross is Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Network) (IST-2001-32706), a European Union funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously he was Assistant Secretary for Information Technology at the British Academy, and before that worked for a company specialising in expert systems and software development, as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group’s PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: S.Ross@hatii.arts.gla.ac.uk
Andrea Scotti, Institute & Museum for the History of Science, Italy
Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995 he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.
Since 1996 Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza, the National Central Library and under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (01/02/01-30/07/03), funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies.
See http://cweb.inria.fr/Projects/Mesmuses
E-mail: scottian@galileo.imss.firenze.it
DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over the period of 30 months (03/2002-08/2004).
In order to encourage early take-up, DigiCULT produces seven Thematic Issues and three Technology Watch Reports, along with the newsletter DigiCULT.Info.
DigiCULT draws on the results of the strategic study ‘Technological Landscapes for Tomorrow’s Cultural Economy (DigiCULT)’, which was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications), in 2000 and completed in 2001.
Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at http://www.digicult.info.
For further information on DigiCULT, please contact the team of the project co-ordinator:
Mr Guntram Geser, guntram.geser@salzburgresearch.at
Phone: +43-(0)662-2288-303
Mr John Pereira, john.pereira@salzburgresearch.at
Phone: +43-(0)662-2288-247
Salzburg Research Forschungsgesellschaft
Jakob-Haringer-Str. 5/III
A-5020 Salzburg, Austria
Phone: +43-(0)662-2288-200
Fax: +43-(0)662-2288-222
http://www.salzburgresearch.at
Project Partner:
HATII - Humanities Advanced Technology and Information Institute, University of Glasgow
http://www.hatii.arts.gla.ac.uk
Contact: Mr Seamus Ross, s.ross@hatii.arts.gla.ac.uk
The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom
DIGICULT PROJECT INFORMATION
DigiCULT Thematic Issue 1 - Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum, held in Barcelona on May 6th, 2002, in the context of the DLM-Conference 2002.
DigiCULT Thematic Issue 2 – Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum, held in Essen, Germany, on September 3rd, 2002, in the context of the AIIM Conference DMS EXPO.
DigiCULT Thematic Issue 3 - Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum, held on January 21st, 2003, at Fraunhofer IPSI, Darmstadt, Germany.
DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum on Learning Objects, which will take place at the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, on July 2nd, 2003.
IMPRINT
This Thematic Issue is a product of the DigiCULT Project (IST-2001-34898).
Authors:
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy
Images:
Images for this Thematic Issue have been provided by, and are reproduced with kind permission of, the Koninklijke Bibliotheek – National Library of the Netherlands, The Hague, Netherlands.
Graphics & Layout:
Jan Steindl, Salzburg Research
ISBN 3-902448-00-8
Printed in Austria
© 2003
IMAGES
Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I. Image on p. 5 from Fol. 264r, size 80x75; illuminator: Orosius Master a.o.
Bible Historiale, Paris, c. 1320-1340, Volume I. Image on p. 7 from Fol. 2r, size 45x50; illum.: ‘Sub-Fauvel’ Master
Historic Bible, Utrecht, c. 1430, Volume I. Images on pp. 8, 10, 14, 28 from Fol. 3r, 3v, 4v, 7v, size 55x85 to 60x85; illum.: Alexander Master o.a.
Psalter, Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468. Images on pp. 11, 13: Wednesday Matins, Invitatorium, from Fol. 219r
Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350. Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron) from Fol. 104rb1, 106rb2, 124r, 111ra, size 40x55 to 50x55
Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335. Image on p. 21 from Fol. 4va1, size 45x55
Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460. Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41
Psalter, Normandy, c. 1180. Image on p. 26 from Fol. 3v, size 160x125
Breviary, Cambray(?), c. 1275-1300. Images on pp. 27, 29, 31, 33, 34, 35, 37 from Fol. 211r, size 205x135
© Koninklijke Bibliotheek, The Hague. Used with permission.
The Semantic Web Community Portal SemanticWeb.org, currently operated by three research groups: the Onto-Agents and Scalable Knowledge Composition (SKC) Research Group at Stanford University, the Ontobroker-Group at the University of Karlsruhe, Germany, and the Protégé Research Group at Stanford University. http://www.semanticweb.org
The W3C Semantic Web Activity Statement explains the consortium’s plans in the areas of enabling standards (driven by the RDF Core and Web Ontology Working Groups), education and outreach (RDF Interest Group), as well as co-ordination and advanced development. http://www.w3.org/2001/sw/Activity
See also Kim Veltman’s warning of what he sees to be too narrow a definition of the Semantic Web, one that will not allow the historical dimension, the richness of cultural expression, the unique and the diversity of interpretations to be adequately dealt with. Cf. K. Veltman: Challenges for a Semantic Web (July 2002), http://www.cultivate-int.org/issue7/semanticweb/
SHOE – Simple HTML Ontology Extensions
SHOE was one of the first ontology-based markup languages developed for use on the World Wide Web. It is a small extension to HTML that allows Web page authors to annotate their Web documents with machine-readable knowledge. See http://www.cs.umd.edu/projects/plus/SHOE/
SMIL - Multimedia on the Web
The Synchronized Multimedia Integration Language (SMIL, pronounced ‘smile’) enables authoring of interactive audiovisual presentations. SMIL is typically used for ‘rich media’/multimedia presentations which integrate streaming audio and video with images, text or any other media type. SMIL is an HTML-like language and may be written using a simple text editor. W3C Synchronized Multimedia: http://www.w3.org/AudioVideo/
SPECTRUM-XML DTD
SPECTRUM (Standard Procedures for Collections Recording Used in Museums) was created by the mda (http://www.mda.org.uk). It is a guide to good practice for museum documentation that describes procedures for documenting objects and the processes they undergo, as well as the information that needs to be recorded to support the procedures. For SPECTRUM, an XML Document Type Definition has been produced which serves as a system-neutral interchange format for museum data.
‘SPECTRUM: The UK Museum Documentation Standard’ is available in its second edition, see http://www.mda.org.uk/spectrum.htm
For a description of the creation, structure and deployment of the SPECTRUM-XML DTD see Bert Degenhart-Drenth: Building on the mda SPECTRUM-XML DTD for Collections Management Data Interchange. Museums and the Web Conference 2001, http://www.archimuse.com/mw2001/papers/degenhart/degenhart.html
Web Services
Although the concept of Web Services is not a new one, the definition of Web Services standards allows the wider use of these services. Currently, three main protocols are used in the context of Web Services: UDDI (Universal Description, Discovery and Integration), a registry system to find resources and Web Services; WSDL (Web Services Description Language), an interface description language; and SOAP (Simple Object Access Protocol), the communication protocol for Web Services. All three protocols are based on XML.
For a description of the many ways in which XML can enhance Web Services see http://www.w3.org/2002/ws/Activity
World Wide Web
Looking back as well as into the future: ‘Weaving the Web: The Original Design and Ultimate Destiny of the World Wide Web by Its Inventor’, Tim Berners-Lee with Mark Fischetti, Harper San Francisco, 1999.
XML – eXtensible Markup Language
See Cultural Heritage Semantic Web Example & Primer, pp. 27-30.
For the work done at W3C within the XML activity, see XML Working Groups, http://www.w3.org/XML/; the XML Specifications can be found at http://www.w3.org/XML/Core/
‘The bane of my existence is doing things that I know the computer could do for me.’ – Dan Connolly, The XML Revolution, October 1998, http://www.nature.com/nature/webmatters/xml/
‘The Semantic Web as it is advocated by people like Tim Berners-Lee and James Hendler does not take enough advantage of the experience built up in knowledge engineering and conceptual modelling. There is this anarchistic idea of the Web as a place where everyone can do his or her own thing. I have no problem with that; a lot of people are able to find what they want on the Web. But if you want real interoperability, with search engines that can grasp the intended meaning of information, that approach falls short. To create a real Semantic Web we have to develop and use well-founded generic ontologies based on linguistics and logics.’
Nicola Guarino has clear views on the Semantic Web and its development. He is a senior researcher at the Institute for Cognitive Sciences and Technologies in Italy, where he leads the Laboratory for Applied Ontology. Since 1991 he has played an active role in the Artificial Intelligence community in promoting the interdisciplinary study of ontological foundations of knowledge engineering and conceptual modelling. Guarino: ‘In our Laboratory the focus is on content, and not so much on representation. The use of ontologies is unavoidable when referring to content. People do it implicitly all the time when they are communicating and trying to understand each other. If we want machines to understand each other, in other words real interoperability, we need to make these ontologies explicit in an unambiguous way.’
An ontology is a hierarchical description of the relations between concepts in a certain domain, plus an unambiguous description of the concepts themselves. As they are created for a certain domain, ontologies often fail to be interoperable because of the ambiguity that results from the use of the same terms for different concepts (and vice versa) between different domains. The term ‘net’, for instance, has quite a different meaning for Web designers and fishermen. That is why there is a need for well-founded generic ontologies. An example of a concept that calls for a generic ontology is the term ‘part’, which can have different meanings both within a domain and between domains. For instance, the violist plays a part in the orchestra. His finger is part of him. Can his finger be part of the orchestra? According to Guarino, this is a genuine ontological problem that can only be solved by giving an unambiguous meaning to the term ‘part’.
Another example cited by Guarino is the term ‘in’. What exactly are you describing when you say the spoon is in the cup? Does it mean that the spoon is totally embedded in the cup, or is it only partly in the cup? Guarino: ‘These examples seem trivial, but if you want real interoperability between different knowledge domains you will have to prevent the problems that come with the ambiguity of day-to-day language.’
In this respect, Guarino thinks it is a drawback that computer science curricula scarcely ever contain an introduction to ontological foundations of conceptual modelling. ‘Students learn all about Java, HTML and C++, and name all the other languages, and they also learn how to use these. But when they graduate they hardly know a thing about formal ontology. I really think people should know more about the work on ontology that has been done in philosophy. It is certainly not much harder to acquire than, say, studying differential equations or learning how to use Java.’
Developing well-founded generic ontologies may seem an enormous job, but the task is not as large as it appears. Guarino: ‘I would say that a few dozen would get you on the way nicely. But you have to take the fundamental route. At the moment, development of the Semantic Web is driven by the need for short-term results. Hence interoperability is realised by putting the right tags on the information. That is not what I call semantics, that is syntax. XML and RDF are very useful for this, but they fall short when you want to create a real Semantic Web.’
Laboratory for Applied Ontology: http://ontology.ip.rm.cnr.it
SEMANTIC WEB SHOULD BE BASED ON WELL-FOUNDED ONTOLOGIES
AN INTERVIEW WITH NICOLA GUARINO, LABORATORY FOR APPLIED ONTOLOGY, ITALY
By Joost van Kasteren
The Semantic Web, according to a statement of its well-known advocate Tim Berners-Lee, is a vision of ‘a distributed machine which should function so as to perform socially useful tasks’.¹ This machine should allow intelligent software agents to understand semantic relationships between Web resources in order to seek relevant information and perform transactions for humans.
Contrasted with the existing human-readable Web, the Semantic Web is envisaged as a Web of machine-readable data that will be based on ‘languages for expressing information in a machine processable form’.² Key to an understanding of the Semantic Web, therefore, is how these languages function: how information is expressed so that computers can automatically process Web sources and assist in making the Web more useful for humans. The aim of this chapter is to provide an overview of the Semantic Web concept by describing its general architecture, i.e. the interplay of its languages.
The chapter has two interrelated parts. Part 1 describes a Finnish project that strives to build the foundations for the “Finnish Museums on the Semantic Web” (FMS), a future semantic museum portal. This part consists of the information boxes on the following pages, which briefly describe the necessary elements and steps in the set-up of the FMS system. It is recommended to start by reading this description (see also graphic 3 on page 36, which provides an overview of the set-up of the FMS system). It should be helpful in gaining a general understanding of how semantic interoperability of, and new ways of interacting with, semantically marked-up cultural heritage information can be realised.
Part 2, the texts below the information boxes, is a primer that explains terms used in part 1 which represent core elements of the Semantic Web architecture, and provides illustrative examples. The explanations are not intended to give in-depth definitions of these elements; such definitions are provided in the relevant W3C specifications. The examples have been kept as simple as possible but build on each other. In this way we will develop a (fictitious) Website, http://www.m-i.org, that provides semantically enhanced access to such marvellous medieval images as the ones we have used to illustrate this Thematic Issue.
How to Make Collection Metadata of Museums Semantically Interoperable on the Web – The “Finnish Museums on the Semantic Web” (FMS)
The Semantic Web concept is visionary, and there are dedicated people, also in the heritage sector, who are trying to make it a reality. In our example, a group of researchers and technology developers who work at the University of Helsinki and the Helsinki Institute for Information Technology are translating the Semantic Web vision for a future semantic museum portal.
The group’s two-year project will run until spring 2004 and is being carried out in co-operation with, and with funding from, major organisations including
A CULTURAL HERITAGE SEMANTIC WEB EXAMPLE & PRIMER
By Guntram Geser
1 Tim Berners-Lee: Interpretation and Semantics on the Semantic Web (1998), http://www.w3.org/DesignIssues/Interpretation.html
2 Tim Berners-Lee: Semantic Web Road Map (1998), http://www.w3.org/DesignIssues/Semantic.html
3 Robert DuCharme, http://lists.xml.org/archives/xml-dev/200211/msg00190.html
Espoo City Museum, Helsinki University Museum, National Board of Antiquities, Nokia, TietoEnator and the National Technology Agency (TEKES).
The major goals of the project are to make collection metadata, which stem from heterogeneous databases, semantically interoperable on the Web, and to provide facilities for semantic browsing and searching in the combined knowledge base of the participating museums.
The project’s vision is called “Finnish Museums on the Semantic Web” (FMS), and its architecture allows for all Finnish museums to join in. However, in an approach of starting small but ambitious, the project is at present using the collection databases of two museums, the Espoo City Museum and the National Museum of Finland. Furthermore, the implementation is currently restricted to only one part of the collections: textiles.
In order to reach the FMS system’s goal of making the museums’ metadata semantically interoperable on the Web, the data must be harmonised on the syntactic and semantic level. For this harmonisation, the eXtensible Markup Language (XML) and the Resource Description Framework (RDF) are being used, of which RDF is the key language for achieving semantic interoperability of the heterogeneous sets of metadata.
RDF and Metadata - A Natural Fit
An observer of the diffusion of the Resource Description Framework (RDF) into various domains, Robert DuCharme, has commented: ‘I still find it a little ironic that, while RDF has gotten so much publicity as a technology for warm and fuzzy AI (Artificial Intelligence) pie-in-the-sky technology, it’s gotten most of its traction in the mundane world of metadata.’³
Yet, given the importance of metadata for the Semantic Web vision in general, it does not come as a surprise that metadata of key information communities belong to the first of RDF’s intended uses. RDF seems to gain momentum in particular among the library and other communities that use Dublin Core.
The current W3C RDF Primer (Working Draft, 23 January 2003), edited by Frank Manola and Eric Miller, labels RDF as ‘an ideal representation for Dublin Core information’ and describes Dublin Core as one of their ‘RDF in the field’ examples (cf. http://www.w3.org/TR/rdf-primer/).
At the Dublin Core Metadata Initiative (DCMI), ‘Expressing Simple Dublin Core in RDF/XML’ was announced as a DCMI Recommendation in October 2002, ‘the first in a series of recommendations for encoding Dublin Core metadata using mainstream Web technologies’, i.e. XML/RDF/XHTML. ‘Expressing Qualified Dublin Core in RDF/XML’ is currently a Proposed Recommendation. Cf. http://dublincore.org/groups/architecture/
Syntactic Transformation 1: Creating the XML Documents
In the FMS system, the eXtensible Markup Language (XML) is used as the data transfer format. This transfer format enables the system to make use of the data originally stored in the museums’ heterogeneous collection databases. Therefore, each museum participating in the FMS initiative provides the relevant collection data as an XML document repository.
In a process of syntactic harmonisation, the data from a museum’s collection database are retrieved and transformed to an XML format conforming to the XML Schema of the FMS initiative.
The data to be published are read from the database through a ‘view’, which helps create the XML format. The view is a queryable interface, a virtual table that results from an SQL query which may join multiple tables of the database. Through the view, the data are queried so that the rows of the tables are grouped by collection items. For each item, the set of rows is combined into a single XML document.
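The rows-to-XML step described in this box can be sketched in a few lines of Python. This is only an illustration under our own assumptions, not the FMS implementation: the two source tables, the view name item_view and the in-memory SQLite database are hypothetical, and the element names follow the fictitious m-i.org example of this primer.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Namespace of the fictitious m-i.org example used in this primer.
MI_NS = "http://www.m-i.org/images"
ET.register_namespace("mi", MI_NS)

def build_demo_db():
    """Create an in-memory database with two hypothetical source tables
    and a view (a virtual table) that joins them."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE items (item_id TEXT PRIMARY KEY, type TEXT, creator TEXT);
        CREATE TABLE origins (item_id TEXT, place TEXT, year TEXT);
        INSERT INTO items VALUES
            ('image5kb78d38i', 'Column Miniature', 'Alexander Master');
        INSERT INTO origins VALUES ('image5kb78d38i', 'Utrecht', 'circa 1430');
        CREATE VIEW item_view AS
            SELECT i.item_id, i.type, i.creator, o.place, o.year
            FROM items i JOIN origins o ON o.item_id = i.item_id;
    """)
    return db

def item_to_xml(db, item_id):
    """Combine the view rows of one collection item into one XML document."""
    cur = db.execute("SELECT * FROM item_view WHERE item_id = ?", (item_id,))
    columns = [d[0] for d in cur.description]
    root = ET.Element(f"{{{MI_NS}}}image", {"image_id": item_id})
    for row in cur:
        for name, value in zip(columns, row):
            # The item id becomes the attribute; every other column a child.
            if name != "item_id" and value is not None:
                ET.SubElement(root, f"{{{MI_NS}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")

xml_doc = item_to_xml(build_demo_db(), "image5kb78d38i")
print(xml_doc)
```

In this sketch the item has a single row in the view; if the view returned several rows for an item, the loop would simply append further child elements to the same document.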
XML
XML is a markup language for describing data. It is a language created to allow anyone to design the structure of their own documents. An XML document contains text that consists of markup in the form of tags and plain text between them, the latter being just pure information (for example, <creator>Alexander Master</creator>). XML tags are not predefined; everyone can define his or her own tags.
FMS Documents
A short presentation is provided in Eero Hyvönen et al.: Cultural Semantic Interoperability on the Web. Case Finnish Museums Online, http://iswc2002.semanticweb.org/posters/hyvonen_a4.pdf; for detailed descriptions see Vilho Raatikka, Eero Hyvönen: Ontology-based Semantic Metadata Validation, and Hyvönen, Eero et al.: Semantic Interoperability on the Web. Case Finnish Museums Online. Both texts can be found in: Towards the Semantic Web and Web Services. Proceedings of the XML Finland 2002 Conference, http://www.cs.helsinki.fi/u/eahyvone/xmlfinland2002/ProceedingsXML2002-final.pdf
[Diagram: the view ITEM_VIEW with the columns item_id, type, subject, iconclass, creator, manuscript, place, year]
XML shares the syntax and bracketed tags of the well-known HyperText Markup Language (HTML), but XML serves a different goal. While HTML is used to define the layout of pages on the WWW, XML is used to define the content of documents, for example to specify that an area of text is the name of a creator.
XML allows for creating markup (e.g. <creator>) that seems to carry some semantics. However, for a computer a tag like <creator> carries as much semantics as a tag like <H1>. A computer simply does not know what a creator is and how the concept creator is related to other concepts (e.g. manuscript). For an XML processor, <H1> and <creator> or <manuscript> are all equally (and totally) meaningless. XML is all about describing data; on its own it does not do anything. There needs to be a processing program that uses the markup to interpret the various elements.
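A small Python sketch may make this point concrete. A generic XML parser returns the same kind of information for <creator> as for <H1>, namely a name and some text; any meaning has to be supplied by the processing program. The mapping ROLE_OF_TAG below is a hypothetical application rule that we introduce purely for illustration.

```python
import xml.etree.ElementTree as ET

snippet = ("<record><creator>Alexander Master</creator>"
           "<H1>Alexander Master</H1></record>")

def describe(xml_text):
    """List what a generic XML parser knows about each child: name and text."""
    return [(child.tag, child.text) for child in ET.fromstring(xml_text)]

parsed = describe(snippet)
print(parsed)

# Meaning only enters through the processing program. This mapping is a
# hypothetical application rule, not something XML itself provides.
ROLE_OF_TAG = {"creator": "person who made the item"}
roles = [ROLE_OF_TAG.get(tag, "unknown to this application") for tag, _ in parsed]
print(roles)
```

To the parser the two elements differ only in their names; it is the application rule that treats one of them as a statement about a person.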
The graphic below illustrates the database rows to XML process as described in the info box. It provides a very simple example of an XML document that describes some data for one of the medieval column miniatures from the Koninklijke Bibliotheek, The Hague, which we were permitted to use for illustrating this Thematic Issue. It includes the Iconclass classification for this image: 71A3421, Eve emerges from Adam’s body (for the hierarchical path of this classification see the section on ontologies).
Short explanations for the XML document (image5kb78d38i.xml) shown in the graphic below:
A well-formed XML document is one that conforms to the XML syntax rules, of which we would like to highlight the following:
(1) The document should begin with the XML declaration, which defines the XML version and the character encoding used in the document. In the example below we use <?xml version="1.0" encoding="ISO-8859-1"?>, i.e. the document conforms to the 1.0 specification of XML and uses the ISO-8859-1 (Latin-1/West European) character set.
(2a, 10) The XML document must contain a single tag pair to define a root element, in our example <mi:image> … </mi:image>.
Database rows grouped by item:
image5kb78d38i, ‘Column Miniature’, ‘Eve emerges from Adam’s Body’, ‘71A3421’, ‘Alexander Master’, ‘Historic Bible’, ‘Utrecht’, ‘circa 1430’

Rows to XML process

XML document with item data from database rows (image5kb78d38i.xml):
(1)  <?xml version="1.0" encoding="ISO-8859-1"?>
(2)  <mi:image xmlns:mi="http://www.m-i.org/images"
         image_id="image5kb78d38i">
(3)    <mi:type>Column Miniature</mi:type>
(4)    <mi:subject>Eve emerges from Adam’s body</mi:subject>
(5)    <mi:iconclass>71A3421</mi:iconclass>
(6)    <mi:creator>Alexander Master</mi:creator>
(7)    <mi:manuscript>Historic Bible</mi:manuscript>
(8)    <mi:place>Utrecht</mi:place>
(9)    <mi:year>circa 1430</mi:year>
(10) </mi:image>

Graphic 1: Database rows to XML process
(2b) Namespace: Since element names in XML are not fixed, name conflicts can occur when different documents use the same names to describe different types of elements. To prevent such conflicts, a unique namespace should be defined using a Uniform Resource Identifier (URI). An XML namespace is a collection of names that are used as element types and attribute names (cf. http://www.w3.org/TR/REC-xml-names/). Our namespace declaration in the start tag of the root element is xmlns:mi="http://www.m-i.org/images".
The namespace prefix mi (for medieval images) functions as a placeholder for the namespace name. It needs to show up in all element tags (e.g. <mi:type> … </mi:type>).
(2c) image_id="image5kb78d38i": This is the id attribute, which contains the unique identifier of the data source record.
(3-9) All other elements must be within the root element and can themselves have sub-elements (child elements), which must be properly nested within their parent element. Our elements do not have sub-elements.
Other syntax rules are, for example: all start tags must match their end tags; because XML tags are case sensitive (i.e. the tag <mi:Creator> is different from the tag <mi:creator>), they must also be written with the same case; all elements must have a closing tag; all attribute values must be within quotation marks (e.g. "image5kb78d38i").
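These syntax rules can be tried out with any XML parser. The following Python sketch uses the standard library parser and a handful of sample documents that we made up for this purpose; each of the faulty samples breaks exactly one of the rules named above.

```python
import xml.etree.ElementTree as ET

NS_DECL = 'xmlns:mi="http://www.m-i.org/images"'

# One well-formed document and four that each break one syntax rule.
SAMPLES = {
    "well_formed": f'<mi:image {NS_DECL} image_id="image5kb78d38i">'
                   "<mi:creator>Alexander Master</mi:creator></mi:image>",
    "case_mismatch": f"<mi:image {NS_DECL}>"
                     "<mi:Creator>Alexander Master</mi:creator></mi:image>",
    "missing_end_tag": f"<mi:image {NS_DECL}><mi:place>Utrecht</mi:image>",
    "unquoted_attribute": f"<mi:image {NS_DECL} image_id=image5kb78d38i></mi:image>",
    "two_root_elements": f"<mi:image {NS_DECL}></mi:image>"
                         f"<mi:image {NS_DECL}></mi:image>",
}

def is_well_formed(xml_text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

results = {name: is_well_formed(doc) for name, doc in SAMPLES.items()}
for name, ok in results.items():
    print(f"{name}: {'well-formed' if ok else 'rejected'}")
```

A document that fails this test cannot be processed at all; whether a well-formed document also has the expected structure is a separate question, which is the subject of the XML Schema.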
Syntactic Transformation 2: The XML Schema
In order to allow for syntactic harmonisation, the XML documents of the museums should conform to the XML Schema of the FMS initiative. Therefore, when the museums create their XML documents, they validate them against the initiative’s XML Schema. If the documents are valid, the process can continue to the semantic level.
XML Schema
The XML Schema defines the building blocks of an XML document, including:
• elements and attributes that can appear in a document
• which elements are child elements, as well as their order and number
• whether an element is empty or can include text
• the data types for elements and attributes
• as well as default and fixed values for elements and attributes
XML with an XML Schema is designed to be self-descriptive. One of the greatest strengths of XML Schema is that it allows for data typing. The most common data types are xs:string, xs:decimal, xs:integer, xs:boolean, xs:date and xs:time. In the example below, which is the XML Schema for the XML document (image5kb78d38i.xml) shown in graphic 1, we only use the data type xs:string. This data type is used for values that contain character strings.
Short explanations:
(1) The XML declaration, which states that the document conforms to the 1.0 specification of XML.
(2a) Determines that the elements and data types that are used to construct the schema come from the W3C’s XML Schema namespace. Consequently, each
(1)  <?xml version="1.0"?>
(2a) <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
(2b)   targetNamespace="http://www.m-i.org/images"
(2c)   xmlns="http://www.m-i.org/images"
(2d)   elementFormDefault="qualified">
(3)  <xs:element name="image">
(4)    <xs:complexType>
(5)      <xs:sequence>
(6)        <xs:element name="type" type="xs:string"/>
(7)        <xs:element name="subject" type="xs:string"/>
(8)        <xs:element name="iconclass" type="xs:string"/>
(9)        <xs:element name="creator" type="xs:string"/>
(10)       <xs:element name="manuscript" type="xs:string"/>
(11)       <xs:element name="place" type="xs:string"/>
(12)       <xs:element name="year" type="xs:string"/>
(13)     </xs:sequence>
(14)     <xs:attribute name="image_id" type="xs:string" use="required"/>
(15)   </xs:complexType>
(16) </xs:element>
(17) </xs:schema>
of the elements and data types in the schema has the prefix xs:, which identifies them as belonging to the vocabulary of the XML Schema language rather than the vocabulary of our (fictitious) organisation M-i.org.
(2b) Indicates that the elements defined by this schema come from our http://www.m-i.org/images namespace.
(2c) Is our default namespace.
(2d) Demands that any elements used by an XML document which were declared in this schema must be namespace qualified.
(3, 16) The parent element for the data typing of the image descriptions we provide at http://www.m-i.org/images. It is (4) defined as an xs:complexType, i.e. it contains child elements (6-12), which are (5, 13) surrounded by an xs:sequence element that defines an ordered sequence of these elements.
(6-12) The child elements, which in our example are simple types because they do not contain other elements. They define various elements of our XML documents, e.g. “creator”, to be of the data type xs:string.
(14) Furthermore, the Schema determines that for the element <xs:element name="image"> there is a required attribute image_id of the data type xs:string (for example, image5kb78d38i).
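To give an impression of what validation against such a schema amounts to, the following Python sketch checks a document against the constraints of our example schema by hand. Please note the assumption: this is not an XML Schema validator (the Python standard library does not include one), but a hand-written stand-in that mirrors three of the schema's rules, namely the root element, the required image_id attribute and the ordered sequence of text-only child elements. The helper make_doc is hypothetical and only builds test documents.

```python
import xml.etree.ElementTree as ET

MI_NS = "http://www.m-i.org/images"
# Child elements in the order required by the xs:sequence of the example schema.
EXPECTED_CHILDREN = ["type", "subject", "iconclass", "creator",
                     "manuscript", "place", "year"]
FIELDS = [("type", "Column Miniature"),
          ("subject", "Eve emerges from Adam's body"),
          ("iconclass", "71A3421"), ("creator", "Alexander Master"),
          ("manuscript", "Historic Bible"), ("place", "Utrecht"),
          ("year", "circa 1430")]

def make_doc(children, image_id=None):
    """Hypothetical helper: build an mi:image test document as a string."""
    attr = f' image_id="{image_id}"' if image_id else ""
    body = "".join(f"<mi:{n}>{v}</mi:{n}>" for n, v in children)
    return f'<mi:image xmlns:mi="{MI_NS}"{attr}>{body}</mi:image>'

def check_image_document(xml_text):
    """Return a list of problems; an empty list means all checks passed."""
    root = ET.fromstring(xml_text)
    problems = []
    if root.tag != f"{{{MI_NS}}}image":
        problems.append("root element is not image in the m-i.org namespace")
    if root.get("image_id") is None:
        problems.append("required attribute image_id is missing")
    wanted = [f"{{{MI_NS}}}{name}" for name in EXPECTED_CHILDREN]
    if [child.tag for child in root] != wanted:
        problems.append("child elements differ from the required sequence")
    if any(len(child) for child in root):
        problems.append("a child element contains sub-elements, not only text")
    return problems

valid_problems = check_image_document(make_doc(FIELDS, "image5kb78d38i"))
invalid_problems = check_image_document(make_doc(list(reversed(FIELDS))))
print(valid_problems)
print(invalid_problems)
```

The second test document is well-formed XML, yet it is rejected because the attribute is missing and the elements appear in the wrong order. In practice one would hand both the schema and the document to a schema-aware validator rather than code the rules by hand.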
Major Benefits of XML
In the context of the Semantic Web, XML provides an interoperable syntactical foundation upon which solutions to the issues of representing relationships and meaning can be built. We also want to highlight the many benefits of XML that are adding to its rapid uptake in the first place and might, in the longer term, be supportive in realising the Semantic Web vision on a broader scale.
XML is one of the most important standards developments in recent years. It is an international, universal, non-system and non-application specific data exchange standard. XML is international because it employs Unicode. This means that there is no restriction to the western alphabet, but Arabic, Chinese, Greek, Hebrew, Thai etc. can be easily integrated.
XML is non-system specific because it is an open standard set by the World Wide Web Consortium (W3C). As such there is no owner of XML. All the major software suppliers support it; it can be used on any computing platform, from Windows and MacOS to Linux. This makes it easier for organisations to change systems or combine different systems.
XML is also non-application specific, i.e. it can be used in various applications such as data exchange, data harvesting, Web site management etc. XML is gaining ever-wider acceptance in many application domains, including in particular the cultural heritage community.
Bear in mind also that the major collection management software producers have implemented support for XML in their systems, enabling, for example, the integration of data from different collections and their combination over the Web.
XML allows for multi-channel publishing, i.e. with XML it is easy to produce different products or services from digital cultural heritage assets. Once the data are structured in XML, they can be displayed across a variety of media using an associated style sheet that contains the display information.
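The idea of multi-channel publishing can be illustrated with a short Python sketch. We stress that this is a stand-in: in practice the display information would live in a style sheet (for example XSLT or CSS), whereas here two small Python functions, which are our own assumption, play the role of two style sheets and render the same XML source for two different channels.

```python
import xml.etree.ElementTree as ET
from html import escape

MI = "{http://www.m-i.org/images}"
SOURCE = (
    '<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">'
    "<mi:subject>Eve emerges from Adam's body</mi:subject>"
    "<mi:creator>Alexander Master</mi:creator>"
    "<mi:year>circa 1430</mi:year>"
    "</mi:image>"
)

def field(root, name):
    """Return the text of one child element, or an empty string."""
    return root.findtext(MI + name, default="")

def to_html(root):
    """Web channel: a small HTML fragment."""
    subject = escape(field(root, "subject"), quote=False)
    creator = escape(field(root, "creator"), quote=False)
    year = escape(field(root, "year"), quote=False)
    return f"<h2>{subject}</h2><p>{creator}, {year}</p>"

def to_caption(root):
    """Print or handheld channel: a one-line plain-text caption."""
    return (f"{field(root, 'subject')} "
            f"({field(root, 'creator')}, {field(root, 'year')})")

root = ET.fromstring(SOURCE)
html_out = to_html(root)
caption_out = to_caption(root)
print(html_out)
print(caption_out)
```

The XML source is written once; adding a further channel means adding a further rendering rule, not touching the data.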
Finally, XML can be used to create new languages. For example, the Wireless Markup Language (WML), which is used to mark up Internet applications for handheld devices, is written in XML.
Ontological concepts
The goal of the FMS project is to make metadata of the museums’ textiles collections semantically interoperable on the Web. In order to achieve such interoperability, an ontology is being designed that describes the common (lower-level) ontological concepts in this domain of knowledge. [4]
Ontology
In the Semantic Web architecture, the semantic relationships are not embedded but explicitly represented by an ontology, or rather an interrelated set of ontologies. In fact, the wide array of information residing on the Web and the perceived need to make it more machine-processable have acted as a strong impetus for the development of ontology languages.
Yet what is an ontology? An abstract definition of an ontology is that it describes a formal, shared conceptualisation of a particular domain of interest, for example cultural heritage objects held in art museums. In particular, an ontology allows for constraining, expressing and analysing the intended meaning of the shared vocabulary of concepts and relations in a domain of knowledge. [5]
If these concepts and relations are formalised to a high degree, the domain has at hand a major building block for developing semantically aware information systems. With Semantic Web technologies, the domain ontology can be made available on the network, cross-referenced with upper-level and other domain ontologies [6], and remote applications or intelligent software agents can refer to it when they interact to provide a certain information service.
[4] The available documents (in English) on the FMS initiative state that their ontology is being created using RDF Schema (RDFS). To develop a fully fledged ontology, advanced languages such as DAML+OIL or Web Ontology Language (OWL) would be required.
[5] For more elaborate and formal descriptions, see Tom Gruber: What is an Ontology? (1995), http://www-ksl.stanford.edu/kst/what-is-an-ontology.html; Nicola Guarino: Ontology-Driven Conceptual Modelling, part 1-3 (2002), http://ontology.ip.rm.cnr.it/Tutorials
The degree of formalisation of concepts and their relations varies considerably between different domains of knowledge. At the lower end, one finds lexicons and simple taxonomies (i.e. an ordered classification system where terms are related hierarchically). At the middle level, one might place thesauri, i.e. controlled vocabularies that are structured to show relationships between terms and concepts and, for example, allow for retrieving them from a database. At the high end of formalisation of knowledge there are axiomatised logic theories. Such theories include rules to ensure the well-formedness and logical validity of statements expressed in the language of the scientific discipline.
In the cultural heritage sector, a powerful IT-supported example of a hierarchical classification system is Iconclass. This supports the documentation of images, in particular art historical images, by providing a systematic collection of 28,000 ready-made definitions of objects, persons, events, situations and abstract ideas that can be the subject of an image. The definitions consist of an alphanumeric classification code and its textual correlate. [7]
For example, for the image we have described in XML in graphic 1, the Iconclass definition is 71A3421 ‘Eve emerges from Adam’s body’. The Medieval Illustrated Manuscripts Website of the Koninklijke Bibliotheek, The Hague, has an Iconclass Browser in place [8] that provides the hierarchical path for this concept in the classification system.
An outstanding example of a controlled vocabulary is the Art & Architecture Thesaurus (AAT), one of the Getty Research Institute’s Vocabulary Databases. It is a structured vocabulary of more than 125,000 terms and other information about concepts that are used for describing fine art, architecture, decorative arts, archival materials and material culture. [9]
The W3C’s Resource Description Framework (RDF) Schema Specification 1.0, in a section on its scope, mentions concept navigation and states: ‘Thesauri and library classification schemes are well known examples of hierarchical systems for representing subject taxonomies in terms of the relationships between named concepts. The RDF Schema specification provides sufficient resources for creating RDF models that represent the logical structure of thesauri (and other library classification systems).’ [10]
Yet for realising a full-blown cultural heritage ontology for the Semantic Web, there are currently limitations on both sides. On the one hand, hierarchical classification systems and structured vocabularies do not lend themselves easily to rich inter-linking of conceptual ‘trees’.
A major step further in this direction is the CIDOC object-oriented Conceptual Reference Model (CRM). [11] This provides an ontology of 81 classes and 130 properties, which describes in a formal language concepts and relations relevant to the documentation of cultural heritage.
On the other hand, RDF Schema has limitations when it comes to expressing complex ontological relationships. New languages based on description logics are being developed. These include DAML+OIL and the upcoming Web Ontology Language (OWL), which are capable of fully describing ontologies.
Also worth highlighting is that tools for ontology building are proliferating at the present time. These ontology editors need to be carefully assessed, as their capabilities differ considerably. [12]
7 Bible
71 Old Testament
71A Genesis from the creation to the expulsion from paradise, and later years of Adam and Eve
71A3 creation of man, the Garden of Eden (Genesis 1:26-2)
71A34 creation of Eve
71A342 Eve is fashioned from Adam’s rib
71A3421 Eve emerges from Adam’s body
[6] Upper-level ontologies describe the basic concepts and relationships invoked when information about any domain is expressed in natural language.
[7] For in-depth information see the official Iconclass Website: http://www.iconclass.nl
[8] Medieval Illustrated Manuscripts Website: http://www.kb.nl/kb/manuscripts/browser. The subject access system for the Website was conceived by Mnemosyne Partners, building on the Iconclass classification system and technologies. See the valuable information they provide at httpwwwmnemosyneorgbusinessmsstempexhtml
[9] A detailed description of the AAT is provided at http://www.getty.edu/research/tools/vocabulary/aat/about.html
[10] http://www.w3.org/TR/2000/CR-rdf-schema-20000327
[11] See the information box at the end of the Forum discussion and the sources mentioned in the Semantic Web Terms and Reading List.
[12] Michael Denny: Ontology Building: A Survey of Editing Tools (06-11-2002), http://www.xml.com/pub/a/2002/11/06/ontologies.html
Semantic Transformation 1: The RDF Data Model
Progressing towards semantic interoperability, the metadata in the XML documents are now transformed into RDF statements corresponding to the RDF data model. With these so-called RDF ‘triples’, the XML metadata elements are mapped to the RDF classes and properties which are defined by the RDF Schema of the FMS initiative (see section 3).
‘XML is nothing more than a way to standardize data formats. This is not to underplay XML’s importance. A data-format standard makes all of the more glamorous technologies possible, and RDF is the leading example of the benefit that comes once the data format has been standardized. Many proclaim that RDF is really the XML’s killer app, and with good reason. Despite all this, RDF remains somewhat obscure. This is mainly because, at its core, RDF is very abstract, very dry and very academic.’
Uche Ogbuji: An introduction to RDF (2000), http://www-106.ibm.com/developerworks/library/w-rdf/?dwzone=xml
RDF Data Model
In order to make Web resources semantically interoperable, we need resources that provide machine-understandable information about themselves. In the Semantic Web architecture, these statements are built by using the Resource Description Framework (RDF).
RDF defines a data model for the statements, describing typed relationships between uniquely identified resources. RDF distinguishes between
| resources: familiar examples are a Web page, electronic document or digital image, but in RDF also entities that are not ‘network retrievable’, e.g. museums, curators or bound medieval manuscripts, can be resources
| properties: these identify a specific aspect, characteristic, attribute or relation used to describe the resource
| statements: these associate a value for a named property with the resource
Hence RDF provides a model for describing relationships between resources in terms of named properties and values. The RDF data model intrinsically supports only binary relations. Its base element is the ‘triple’, which takes the form of subject, predicate, object: a resource (the subject) is linked to another resource (the object) through an arc labelled with a third resource (the predicate). The semantics of a triple clearly depends on the property used as predicate.
A convenient way to visualise this is to draw nodes for subject and object and an arrow between them for the predicate (see graphic 2). In this labelled directed graph, subject and predicate (property) are Uniform Resource Identifiers (URIs), and the object is either a URI or a literal (which is drawn as a box). Everything in RDF can be represented by a graph with nodes and arcs, and the data model allows for using the same URI as a node and as an arc label. To represent RDF statements in a machine-processable way, RDF builds on XML. With RDF/XML, a specific XML markup language, RDF information can be represented and exchanged between machines.
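As a minimal sketch of the data model just described, the two subclass statements used in this primer can be held as plain subject, predicate, object tuples. The Triple type, the objects() helper and the '#' separator in the mi namespace URI are illustrative assumptions; they are not part of the FMS system or of any RDF library.

```python
from typing import NamedTuple

RDFS = "http://www.w3.org/2000/01/rdf-schema#"
# Assumption: the primer's example namespace, written here with a '#' separator.
MI = "http://www.m-i.org/schemas/images#"


class Triple(NamedTuple):
    """One RDF statement: a binary relation between subject and object."""
    subject: str
    predicate: str
    obj: str


# The two statements visualised in graphic 2.
graph = {
    Triple(MI + "ColumnMiniature", RDFS + "subClassOf", MI + "Miniature"),
    Triple(MI + "Miniature", RDFS + "subClassOf", MI + "Image"),
}


def objects(graph, subject, predicate):
    """Return every object linked to `subject` by an arc labelled `predicate`."""
    return {t.obj for t in graph if t.subject == subject and t.predicate == predicate}


direct_parents = objects(graph, MI + "ColumnMiniature", RDFS + "subClassOf")
```

Because each statement is a complete triple, a set of them is already a labelled directed graph: the subjects and objects are the nodes and the predicates are the arc labels.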
Graphic 2: RDF Data Model
Subject: http://www.m-i.org/schemas/images#ColumnMiniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Miniature
Subject: http://www.m-i.org/schemas/images#Miniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Image
With these two triples we state that ColumnMiniature is a subclass of Miniature and that Miniature is a subclass of Image.
The predicate in our statements is the rdfs:subClassOf property, which is predefined in the RDF Schema namespace, http://www.w3.org/2000/01/rdf-schema#
The classes Image, Miniature and ColumnMiniature would also need to be defined in our RDF Schema namespace, http://www.m-i.org/schemas/images
For details on how to use RDF Schema for defining your domain ontology, see section 3. In a nutshell, RDF Schema is a ‘higher-level’ language which is itself defined using RDF.
Semantic Transformation 2: Creating and Validating the RDF Statements
In the mapping process, an editor tool (in the FMS case a tool developed by the project team, called Meedio) receives as input the XML documents and assists in transforming them into semantically valid RDF statements (instance descriptions). The tool serves as an instance editor, which provides a convenient way of finding and selecting from the XML metadata elements correct instance values for a particular property. The editor tool also serves as a semantic metadata validator. When the museum cataloguer saves the set of RDF statements corresponding to an XML document, the semantics of these statements are validated against the property constraints of the FMS ontology. The result of a successful mapping and validation process is a unique set of RDF triples called the RDF card.
RDF Mapping Rules
When RDF is used to define the meaning of XML metadata elements, a set of mapping rules is created. A mapping rule is a template of RDF triples where XPath expressions are used to identify the actual element values. XPath is a language for addressing parts of an XML document and was designed for use by XML processing technologies (XSLT, XPointer and others).
When applying such a rule to an XML document, the XPath expressions are instantiated with matching element values. If the rule matches, the RDF template evaluates to a set of RDF triples where XPath expressions are substituted by the corresponding values of the XML elements.
For example, by applying the template rule
<image5kb78d38itype, mi:hasCreator, image5kb78d38icreator>
to the XML document described in the section on XML, the following result would be obtained:
<Image Miniature, mi:hasCreator, ‘Alexander Master’>
mi:hasCreator is an example of an RDF property. Such properties are explained in section 3, RDF Schema.
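The mechanics of such a mapping rule can be sketched in a few lines of Python using the limited XPath support of the standard library. Everything concrete here is an assumption made for illustration: the XML record, its element names itype and icreator, the rule layout and the apply_rule() function are hypothetical and do not reproduce the FMS project's Meedio tool or its actual rule syntax.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML metadata record; element names are assumed for illustration.
XML_RECORD = """
<image>
  <itype>Miniature</itype>
  <icreator>Alexander Master</icreator>
</image>
"""

# A mapping rule as a triple template: XPath for the subject value,
# a fixed RDF property, and XPath for the object value.
RULE = ("./itype", "mi:hasCreator", "./icreator")


def apply_rule(rule, record):
    """Instantiate the template; return no triples if an XPath does not match."""
    subject_path, predicate, object_path = rule
    subject = record.findtext(subject_path)
    obj = record.findtext(object_path)
    if subject is None or obj is None:
        return []
    return [(subject, predicate, obj)]


record = ET.fromstring(XML_RECORD)
triples = apply_rule(RULE, record)
unmatched = apply_rule(("./itype", "mi:hasCreator", "./iowner"), record)
```

The rule only produces a triple when both XPath expressions find a value, which mirrors the "if the rule matches" condition in the text.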
Note: Due to the limited space permitted, we do not address issues of term mapping. This is an important aspect of the mapping and validation process carried out in the FMS project. Working with metadata from different museums, they need to deal with partly different terminologies. Their technical solution to synonymous terms (i.e. different terms referring to the same concepts) is to attach synonym sets to the FMS ontology classes. With situations where polysemous terms occur (i.e. the same terms refer to different concepts), the editor tool cannot cope, and the cataloguer needs to select the correct interpretation.
Semantic Transformation 3: The RDF Schema (RDFS)
The shared ontology for the textiles domain is created by using Resource Description Framework Schema (RDFS). An RDF Schema is a tool for indicating the classes of resources one wants to describe, as well as for defining the properties used to describe those resources. Furthermore, class/sub-class relationships and property/sub-property relationships can be defined. The museums are mapping their metadata to the classes and properties defined by the RDF Schema of the FMS initiative. Thereby they are making the meaning of the metadata explicit and representing them in a harmonised, uniform way.
RDF Schema
In section Semantic Transformation 1, we have described the data model provided by RDF for expressing statements about Web resources. But we also need a vocabulary for the RDF statements, namely classes and properties defined with RDF Schema (RDFS).
In brief, the RDF Schema mechanism provides a pre-defined vocabulary, a basic type system that can be used in creating domain-specific schemas. Its role is to allow for declaring metadata properties (e.g. for ‘type’, ‘subject’ or ‘creator’), to define the classes of resources they may be used with, to restrict possible combinations, and to detect violations of those restrictions.
Defining classes
With RDF Schema (RDFS), Web resources can be defined as instances of one or more classes. In addition, classes can be organised in a hierarchical fashion. As we hold a collection of digital images drawn from illustrated medieval manuscripts, we first need to define a class of things that are images. In RDF Schema, a class is any resource having an rdf:type property whose value is the RDFS-defined resource rdfs:Class.
So, using the basic RDF data model, we define:
mi:Image [resource] rdf:type [property] rdfs:Class [value]
The self-defined prefix mi: (for medieval images) stands for the URI reference of our RDF Schema namespace, http://www.m-i.org/schemas/images
In our image collection we have various special kinds of digitised images, such as column miniatures, decorated initials, schematic drawings etc. To distinguish, for example, the miniatures, first we need to define a general class Miniature and subclasses of miniatures, e.g. a subclass ColumnMiniature:
mi:Miniature rdf:type rdfs:Class
mi:ColumnMiniature rdf:type rdfs:Class
Secondly, we need to define that mi:ColumnMiniature is a subclass of mi:Miniature and that mi:Miniature is a subclass of mi:Image, for which we use the predefined rdfs:subClassOf property:
mi:Miniature rdfs:subClassOf mi:Image
mi:ColumnMiniature rdfs:subClassOf mi:Miniature
As the rdfs:subClassOf property is transitive, this means that mi:ColumnMiniature is also implicitly a subclass of mi:Image.
Graphic 2 on page 32 visualises this with the nodes and arcs of the basic RDF data model.
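The effect of transitivity can be made concrete with a short Python sketch that derives the implicit superclasses from the two explicit statements. The dictionary layout and the superclasses() function are assumptions made for illustration, not an RDFS reasoner.

```python
# Explicit rdfs:subClassOf statements from the primer's example:
# class name -> set of directly stated superclasses.
SUBCLASS_OF = {
    "mi:ColumnMiniature": {"mi:Miniature"},
    "mi:Miniature": {"mi:Image"},
}


def superclasses(cls, direct):
    """Follow subClassOf arcs upwards and collect every class reached."""
    found = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in direct.get(current, ()):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


implied = superclasses("mi:ColumnMiniature", SUBCLASS_OF)
```

Here mi:Image appears in the result for mi:ColumnMiniature although no statement links the two classes directly.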
Defining properties
In order to make the meaning of our metadata (i.e. ‘type’) explicit, we need to be capable of declaring specific properties that characterise the classes of things we hold at http://www.m-i.org, e.g. digital images of medieval column miniatures.
Basically, RDF Schema defines properties in terms of the classes of resources to which they apply. This is the role of the rdfs:domain and rdfs:range mechanisms.
rdfs:range
The range constraint defines the class or set of classes whose instances can be values of a particular property. If we want to define the property mi:hasType, we must describe this resource (which we locate at http://www.m-i.org/schemas/images) with an rdf:type property whose value is rdf:Property:
mi:hasType [resource] rdf:type [property] rdf:Property [value]
The following RDF statements indicate that mi:ColumnMiniature is a class, mi:hasType is a property, and RDF statements using the mi:hasType property have instances of mi:ColumnMiniature as values:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:range mi:ColumnMiniature
rdfs:domain
The domain constraint restricts the set of classes whose instances may have a particular property attached to them. If we want to indicate that the property mi:hasType applies to instances of class mi:ColumnMiniature, we would write:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:domain mi:ColumnMiniature
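Read as constraints in the way this primer describes them, domain declarations can be checked mechanically. The following Python sketch shows the idea for rdfs:domain only; the instance identifiers, the ex:Person class and the check_domain() function are hypothetical, and the check ignores subclass relationships for brevity.

```python
# Declared domain of each property (from the statements above).
DOMAIN = {"mi:hasType": "mi:ColumnMiniature"}

# Hypothetical instance data: resource -> the class it is an instance of.
INSTANCE_OF = {
    "ex:image1": "mi:ColumnMiniature",
    "ex:curator1": "ex:Person",
}


def check_domain(statement):
    """Return True if the subject's class satisfies the property's domain."""
    subject, prop, _value = statement
    required = DOMAIN.get(prop)
    if required is None:  # no domain declared, nothing to violate
        return True
    return INSTANCE_OF.get(subject) == required


ok = check_domain(("ex:image1", "mi:hasType", "column miniature"))
violation = check_domain(("ex:curator1", "mi:hasType", "column miniature"))
```

A range check would follow the same pattern, testing the class of the statement's value instead of its subject.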
Benefits of RDF
In a SearchWebServices.com definition of RDF, some benefits of RDF are mentioned:
| ‘By providing a consistent framework, RDF will encourage the providing of metadata about Internet resources.
| Because RDF will include a standard syntax for describing and querying data, software that exploits metadata will be easier and faster to produce.
| The standard syntax and query capability will allow applications to exchange information more easily.
| Searchers will get more precise results from searching, based on metadata rather than on indexes derived from full text gathering.
| Intelligent software agents will have more precise data to work with.’ [13]
This is a well-crafted listing of RDF benefits, from provision and exchange of better metadata to agents working with them, hopefully for the benefit of humans. But as explicitly stated by SearchWebServices.com, these are only potential benefits, i.e. they depend on the level of actual uptake of RDF.
[13] whatis.com, searchWebServices.com Definitions - Resource Description Framework, http://searchwebservices.techtarget.com/sDefinition/0,,sid26_gci213545,00.html (last updated July 27, 2001)
Generating and Using the Knowledge Space
The RDF cards represent the original XML documents at the semantic level. The union of such RDF cards constitutes a knowledge base, which is a harmonised semantic representation of the underlying heterogeneous databases.
However, so far the RDF instance descriptions have not left the museum. The museum has complete control of the information it wants to publish, and it does not need to allow the FMS system access to its internal database system. The RDF data are placed in a public directory on the museum’s WWW server.
The Web crawler of the FMS system harvests the instance descriptions from the different museums, and the system combines them into an RDF repository. This repository is a large semantic graph that consists of the shared ontology and metadata.
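The combination step can be pictured as a set union over triples. The sketch below assumes that each harvested RDF card has already been parsed into a Python set of (subject, predicate, object) tuples; the museum identifiers and the build_repository() function are hypothetical and say nothing about how the FMS crawler is actually implemented.

```python
# Shared ontology statements (a single one here, for brevity).
ONTOLOGY = {("mi:ColumnMiniature", "rdfs:subClassOf", "mi:Miniature")}

# Hypothetical RDF cards harvested from two museums.
CARD_A = {("museumA:item1", "rdf:type", "mi:ColumnMiniature")}
CARD_B = {
    ("museumB:item9", "rdf:type", "mi:ColumnMiniature"),
    ("museumA:item1", "rdf:type", "mi:ColumnMiniature"),  # same statement as in CARD_A
}


def build_repository(ontology, cards):
    """Union the shared ontology with all harvested instance descriptions."""
    repository = set(ontology)
    for card in cards:
        repository |= card
    return repository


repository = build_repository(ONTOLOGY, [CARD_A, CARD_B])
```

Because identical triples collapse into one set element, a statement harvested twice is stored only once in the combined graph.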
How does a user now search and navigate in this knowledge space? In the FMS system, this is implemented by a server-side software called Ontogator. Based on the semantic graph, this software dynamically generates semantic linkages for the user’s Web browser.
One way of using the FMS system is view-based filtering. The user can select classes of resources from the ontology, and the system finds the instances that match the selected class restrictions. By constraining classes (views) further, the collection instance data searched for are eventually found.
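A rough Python sketch of view-based filtering is given below. It reuses the image classes from this primer rather than the FMS textiles ontology, and the instance identifiers, the mi:DecoratedInitial class name and the is_a() and view() functions are all assumptions made for illustration; Ontogator itself is not reproduced here.

```python
# Class hierarchy: class -> its direct superclass.
SUBCLASS_OF = {
    "mi:ColumnMiniature": "mi:Miniature",
    "mi:Miniature": "mi:Image",
    "mi:DecoratedInitial": "mi:Image",
}

# Hypothetical harvested instances: resource -> class.
INSTANCE_OF = {
    "museumA:item1": "mi:ColumnMiniature",
    "museumA:item2": "mi:DecoratedInitial",
    "museumB:item9": "mi:ColumnMiniature",
}


def is_a(cls, selected):
    """True if `cls` equals `selected` or is a (transitive) subclass of it."""
    while cls is not None:
        if cls == selected:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def view(selected):
    """Return the instances that match the selected class restriction."""
    return sorted(i for i, c in INSTANCE_OF.items() if is_a(c, selected))


broad = view("mi:Image")
narrow = view("mi:Miniature")
```

Selecting the more specific class mi:Miniature narrows the result set, which is the sense in which constraining the view further leads the user to the records sought.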
The software also supports topic-based navigation by providing semantic links between topics of interest, the creation of which is based on the collection domain ontology and the related metadata of the collection records. This means that the links also provide the user with an impression of the wider context and pragmatics of the objects in the museums’ collections.
From human users to software agents
As described in the Finnish Museums on the Semantic Web example, the RDF repository is a large semantic graph that consists of the shared ontology and metadata of the participating museums. Such a repository can be queried, and the results, a set of pointers to the relevant resources, can be accessed using Web browsers. The opportunities provided by a system like the one developed by the FMS initiative (e.g. topic-based navigation) are at present restricted primarily to human users.
The Semantic Web vision includes intelligent software agents which ‘understand’ semantic relationships between Web resources and seek relevant information as well as perform transactions for humans. [14]
This software would be capable of autonomous action, i.e. could run without direct human control or constant supervision, and ideally is very flexible in doing this. Characterisations of this flexibility include actions that are ‘reactive’, ‘proactive’ and ‘social’ (see below).
While the basic idea of agents is very intuitive and appealing, the actual theory is complex, the tools are immature, and the solutions small and prototype-based. In fact, as a parallel distributed systems technology, agents belong to the most complex class of software technology.
However, this primer will conclude with a summary of what an intelligent software agent is and what such a software would generally be capable of doing. This should also serve as an indication of how great the challenge for research and technological development is to make the full Semantic Web vision a reality.
Intelligent Software Agents
The following definitions are taken from Michael Wooldridge’s introduction to multiagent systems. [15]
Agent
‘An agent is a computer system capable of autonomous action in some environment.’
Intelligent agent
‘An intelligent agent is a computer system capable of flexible autonomous action in some environment.’
Flexible autonomous action
‘By flexible autonomous action we mean reactive, proactive, social.’
| Reactivity: ‘A reactive system is one that maintains an ongoing interaction with its environment and responds to changes that occur in it (in time for the response to be useful).’
| Proactiveness: ‘An agent serves a purpose and therefore exhibits goal-directed behaviour, including the capacity to recognise opportunities for useful courses of action.’
| Social ability: ‘Social ability in agents is the ability to interact with other agents (and possibly humans) via some kind of agent communication language, and perhaps cooperate with others.’
Desirable further properties of agents are
| Mobility: the ability to move around an electronic network
| Rationality: an agent will act in such a way that it does not prevent itself from achieving its goals (as far as this is possible with a limited set of beliefs representing its world knowledge)
| Learning: an agent will improve its performance over time
[14] T. Berners-Lee, J. Hendler, O. Lassila: Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html
[15] M. Wooldridge: An Introduction to Multiagent Systems. Chichester: Wiley 2002, and http://www.csc.liv.ac.uk/~mjw/pubs/imas
Resources
In the primer, no references are made to the documents of the World Wide Web Consortium (W3C). All relevant W3C recommendations can be found at http://www.w3c.org
Of the wealth of introductory materials on XML and RDF available on the Web, the following in particular are useful to consult for further details:
http://www.w3schools.com/xml
http://www.w3schools.com/schema
http://www.w3.org/TR/rdf-primer
K. S. Candan, H. Liu, R. Suvarna: Resource Description Framework: Metadata and its Applications. In: SIGKDD Explorations, Vol. 3/1 (2001), 6-19
Pierre-Antoine Champin: RDF Tutorial (2001), http://www710.univ-lyon1.fr/~champin/rdf-tutorial/rdf-tutorial.html
S. Decker, M. P. Mitra, S. Melnik: Framework for the Semantic Web: An RDF Tutorial, http://www.ida.liu.se/~asmpa/courses/sweb/rdf/rdf_tutorial.pdf
The Finnish Museums on the Semantic Web: Overview of the System’s Set-up
Graphic 3: Set-up of the Finnish Museums on the Semantic Web System
[Diagram. User client: WWW browser (topic-based navigation, view-based filtering). Server: Ontogator software and RDF database holding the semantic graph, i.e. the knowledge space of shared ontology (RDFS) and metadata. Web crawler harvesting the RDF instances of each museum. Per museum: RDF instances (RDF Schema: semantic interoperability), Metadata Editor and XML repository (XML Schema: syntactic interoperability), Collection databases 1, 2 ... n (relational schemas, DBMS).]
THE DARMSTADT FORUM PARTICIPANTS

Wernher Behrendt, Salzburg Research, Austria
Wernher Behrendt is a Senior Researcher at the Sun Technology and Research Excellence Center at Salzburg Research (Austria), working on multimedia middleware and interoperation issues. He holds an MSc in Cognitive Science from Manchester University and has more than 10 years’ experience in near-to-market IT research. From 1989 to 1995 he was a Senior Research Associate in the Informatics Department at Rutherford Appleton Laboratory (UK), working on embedded knowledge-based systems in distributed multimedia presentation systems. From 1995 to 1998 he was a Senior Research Associate at Cardiff University (UK), working on interoperation between heterogeneous information systems. Mr Behrendt has held courses in Computer Science and has worked on projects ranging from software engineering methods and quality assurance to legacy system reengineering using migration methods and distributed systems middleware.
E-mail: wernher.behrendt@salzburgresearch.at

Paolo Buonora, Archivio di Stato di Roma, Italy
Paolo Buonora holds a degree in Philosophy from the University of Rome ‘La Sapienza’ (1976). He worked in the Italian State Archive Administration from 1978, where he was first involved in editing the Guida Generale degli Archivi di Stato italiani. From 1986 he worked in the Soprintendenza archivistica per il Lazio, surveying audiovisual archives and municipal archives; from 1989 to 1991 he was at Perugia University, engaged in doctoral research in ‘Urban and rural history’; and from 1991 to 1994 he was again in the Soprintendenza archivistica per il Lazio. After 1994 he worked in the Archivio di Stato di Roma, where he was responsible for the photograph service and several working groups on informatics application in archival documentation. From 1997 until the present time he has planned and directed the Imago II project in the Archivio di Stato di Roma.
See httpwwwasrmarchivbeniculturaliitsidimagoIMAGOIIenhtml

Samuel Cruz-Lara, University of Nancy 2 and LORIA, France
Samuel Cruz-Lara obtained a Master’s degree in Computer Science in 1984 (University Henri Poincaré, Nancy 1) and a PhD degree in Computer Science in 1988 (National Polytechnic Institute of Lorraine). The central topic of his PhD thesis was the generation of integrated development environments by using attribute grammars.
He is currently Associate Professor at the University of Nancy 2 (Institute of Technology, Computer Science Department) and permanent Researcher at LORIA (Lorraine Laboratory for Research in Computer Science and its Applications - UMR 7503 - CNRS - INRIA - Universities of Nancy). He is a member of the ‘Language and Dialogue’ team and has conducted several research activities on distributed software architectures and textual linguistic resources management. He is currently working in the context of distributed architectures and multimedia resources management. Dr Cruz-Lara has participated in several projects, in particular CNRS/SILFIDE and MLIS-ELAN, and he is at present co-leader of the ‘Digital Museum’ project. This is a joint project between LORIA and the National Chi-Nan University. The ‘Digital Museum’ project is sponsored by the National Science Council of the Republic of China (Taiwan) and supported by INRIA (France).

Costis Dallas, Critical Publics SA, Greece
Costis Dallas is Chairman and Senior Researcher of Critical Publics (http://www.criticalpublics.com), a group of companies active in the field of strategic communications, creative design and technology. He is currently a Lecturer in the Department of Communication and Mass Media of Panteion University, and has over 15 years of research and professional experience in hypermedia applications, human factor issues and cultural information systems. Dr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA, co-founder and Chair of the Multimedia Working Group of the International Council of Museums (CIDOC/ICOM), Head of Documentation and Systems of the Benaki Museum, General Director of the Foundation of the Hellenic World, Special Secretary of the Greek Ministry of Education in charge of libraries, archives and instructional technologies, and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: info@criticalpublics.com

Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (http://www.nl.adlibsoft.com), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that he joined Databasix, producer of the ADLIB software. In 1991 Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was renamed later as ADLIB Information Systems.
ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include:
| CIMI Dublin Core test bed: ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database, in addition to the ‘standard’ HTML/XML access.
| SPECTRUM-XML: A project of the UK mda, together with a team of software vendors, to produce a schema for exchange of data which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software.
| Internet Gelderse Musea (http://www.igem.nl) and Maritiem Digitaal (http://www.maritiemdigitaal.nl): Two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.
E-mail: bert@nl.adlibsoft.com

Nicola Guarino, Italian National Research Council, Italy
Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991 and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS) and associate editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: Nicola.Guarino@ladseb.pd.cnr.it

Janneke van Kersen, Digital Heritage Association, The Netherlands
Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992 she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing, and more recently she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University.
Since October 1999 Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata, and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations, such as museums, archives, archaeological organisations, monuments and special collections of libraries, in a context-rich and XML-based environment. Interoperability from both a technical and an organisational perspective is therefore a major issue. The Association offers access to heritage information to the general public at http://www.cultuurwijzer.nl and customised access for the educational field at http://www.cultuurwijs.nl
E-mail: jannekevankersendennl
Marco Meli, EDW International, Italy
Marco Meli is CEO and co-founder of EDW International (http://www.edw-international.com), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia and VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meli@edw.it
Paul Miller, UKOLN, UK
Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, http://www.ukoln.ac.uk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK’s Further and Higher Education Funding Councils and Resource, the Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI), and the Metadata Working Group of the UK Government’s Office of the e-Envoy.
E-mail: P.Miller@ukoln.ac.uk
Frank Nack, CWI, The Netherlands
Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on ‘The Application of Video Semantics and Theme Representation for Automated Film Editing’ at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling, and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document, and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: Frank.Nack@cwi.nl
Franco Niccolucci, Florence University, Italy
Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference. See also http://www.geog.port.ac.uk/hist-bound/people/niccolucci.htm
Seamus Ross, HATII, University of Glasgow, UK
Dr Seamus Ross is Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Network) (IST-2001-32706), a European Union funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously, he was Assistant Secretary for Information Technology at the British Academy, and before that worked for a company specialising in expert systems and software development, as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group’s PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: S.Ross@hatii.arts.gla.ac.uk
Andrea Scotti, Institute & Museum for the History of Science, Italy
Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995 he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.
Since 1996, Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza and the National Central Library, and under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (01.02.01-30.07.03), funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies. See http://cweb.inria.fr/Projects/Mesmuses
E-mail: scottian@galileo.imss.firenze.it
DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over a period of 30 months (03/2002-08/2004).
In order to encourage early take-up, DigiCULT produces seven Thematic Issues and three Technology Watch Reports, along with the newsletter DigiCULT.Info.
DigiCULT draws on the results of the strategic study ‘Technological Landscapes for Tomorrow’s Cultural Economy (DigiCULT)’ that was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications), in 2000 and completed in 2001.
Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at http://www.digicult.info
For further information on DigiCULT, please contact the team of the project co-ordinator:
Mr Guntram Geser, guntram.geser@salzburgresearch.at
Phone: +43-(0)662-2288-303
Mr John Pereira, john.pereira@salzburgresearch.at
Phone: +43-(0)662-2288-247
Salzburg Research Forschungsgesellschaft
Jakob-Haringer-Str. 5/III
A-5020 Salzburg, Austria
Phone: +43-(0)662-2288-200
Fax: +43-(0)662-2288-222
http://www.salzburgresearch.at
Project Partner:
HATII - Humanities Advanced Technology and Information Institute
University of Glasgow
http://www.hatii.arts.gla.ac.uk
Contact: Mr Seamus Ross, s.ross@hatii.arts.gla.ac.uk
The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom
DIGICULT PROJECT INFORMATION
DigiCULT Thematic Issue 1 – Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum, held in Barcelona on May 6th, 2002, in the context of the DLM-Conference 2002.
DigiCULT Thematic Issue 2 – Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum, held in Essen, Germany, on September 3rd, 2002, in the context of the AIIM Conference @ DMS EXPO.
DigiCULT Thematic Issue 3 – Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum, held on January 21st, 2003, at Fraunhofer IPSI, Darmstadt, Germany.
DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum, on Learning Objects, that will take place at the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, on July 2nd, 2003.
IMPRINT
This Thematic Issue is a product of the DigiCULT Project (IST-2001-34898).
Authors:
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy
Images:
Images for this Thematic Issue have been provided by and are reproduced with kind permission of the Koninklijke Bibliotheek – National Library of the Netherlands, The Hague, Netherlands.
Graphics & Layout:
Jan Steindl, Salzburg Research
ISBN 3-902448-00-8
Printed in Austria.
© 2003
IMAGES
Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I. Image on p. 5, from Fol. 264r, size 80x75, illuminator: Orosius Master a.o.
Bible Historiale, Paris, c. 1320-1340, Volume I. Image on p. 7, from Fol. 2r, size 45x50, illum.: ‘Sub-Fauvel’ Master
Historic Bible, Utrecht, c. 1430, Volume I. Images on pp. 8, 10, 14, 28, from Fol. 3r, 3v, 4v, 7v, size 55x85 to 60x85, illum.: Alexander Master a.o.
Psalter, Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468. Images on pp. 11, 13: Wednesday Matins, Invitatorium, from Fol. 219r
Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350. Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron), from Fol. 104rb1, 106rb2, 124r, 111ra, size 40x55 to 50x55
Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335. Image on p. 21, from Fol. 4va1, size 45x55
Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460. Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41
Psalter, Normandy, c. 1180. Image on p. 26, from Fol. 3v, size 160x125
Breviary, Cambray (?), c. 1275-1300. Images on pp. 27, 29, 31, 33, 34, 35, 37, from Fol. 211r, size 205x135
© Koninklijke Bibliotheek, The Hague. Used with permission.
Web Services
Although the concept of Web Services is not a new one, the definition of Web Services standards allows the wider use of these services. Currently, three main protocols are used in the context of Web Services: UDDI (Universal Description, Discovery and Integration), a registry system to find resources and Web Services; WSDL (Web Services Description Language), an interface description language; and SOAP (Simple Object Access Protocol), the communication protocol for Web Services. All three protocols are based on XML.
For a description of the many ways in which XML can enhance Web Services, see http://www.w3.org/2002/ws/Activity
World Wide Web
Looking back as well as into the future: ‘Weaving the Web: The Original Design and Ultimate Destiny of the World Wide Web by Its Inventor’, Tim Berners-Lee with Mark Fischetti, Harper San Francisco, 1999.
XML – eXtensible Markup Language
See Cultural Heritage Semantic Web Example & Primer, pp. 27-30.
For the work done at W3C within the XML activity, see XML Working Groups, http://www.w3.org/XML/; the XML Specifications can be found at http://www.w3.org/XML/Core/
‘The bane of my existence is doing things that I know the computer could do for me.’ – Dan Connolly, The XML Revolution, October 1998, http://www.nature.com/nature/webmatters/xml
‘The Semantic Web, as it is advocated by people like Tim Berners-Lee and James Hendler, does not take enough advantage of the experience built up in knowledge engineering and conceptual modelling. There is this anarchistic idea of the Web as a place where everyone can do his or her own thing. I have no problem with that; a lot of people are able to find what they want on the Web. But if you want real interoperability, with search engines that can grasp the intended meaning of information, that approach falls short. To create a real Semantic Web we have to develop and use well-founded generic ontologies based on linguistics and logics.’
Nicola Guarino has clear views on the Semantic Web and its development. He is a senior researcher at the Institute for Cognitive Sciences and Technologies in Italy, where he leads the Laboratory for Applied Ontology. Since 1991 he has played an active role in the Artificial Intelligence community in promoting the interdisciplinary study of ontological foundations of knowledge engineering and conceptual modelling. Guarino: ‘In our Laboratory the focus is on content and not so much on representation. The use of ontologies is unavoidable when referring to content. People do it implicitly all the time when they are communicating and trying to understand each other. If we want machines to understand each other, in other words real interoperability, we need to make these ontologies explicit in an unambiguous way.’
An ontology is a hierarchical description of the relations between concepts in a certain domain, plus an unambiguous description of the concepts themselves. As they are created for a certain domain, ontologies often fail to be interoperable because of the ambiguity that results from the use of the same terms for different concepts (and vice versa) between different domains. The term ‘net’, for instance, has quite a different meaning for Web designers and fishermen. That is why there is a need for well-founded generic ontologies. An example of what a generic ontology has to settle is the term ‘part’, which can have different meanings both within a domain and between domains. For instance, the violist plays a part in the orchestra. His finger is part of him. Can his finger be part of the orchestra? According to Guarino, this is a genuine ontological problem that can only be solved by giving an unambiguous meaning to the term ‘part’.
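The violist example can be made concrete with a few lines of code. The following is only a sketch under stated assumptions: the relation names component_of and member_of are hypothetical labels chosen for illustration, not terms taken from Guarino's work. It shows that a single undifferentiated ‘part’ relation, chained mechanically, concludes that the finger is part of the orchestra, whereas two explicitly distinguished relations do not.

```python
# Facts stated with one undifferentiated "part" relation.
AMBIGUOUS = {("finger", "violist"), ("violist", "orchestra")}

# The same facts with two explicitly named relations (hypothetical names).
TYPED = {
    ("finger", "component_of", "violist"),
    ("violist", "member_of", "orchestra"),
}


def transitive_closure(pairs):
    """Return every (x, z) reachable by chaining (x, y) and (y, z)."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


def closure_for(relation, triples):
    """Chain only the facts that use the given relation."""
    return transitive_closure({(s, o) for (s, r, o) in triples if r == relation})


naive = transitive_closure(AMBIGUOUS)
typed_component = closure_for("component_of", TYPED)
typed_member = closure_for("member_of", TYPED)

print(("finger", "orchestra") in naive)            # ambiguous relation
print(("finger", "orchestra") in typed_component)  # distinguished relation
```

Whether such relations should chain at all is exactly the kind of decision a well-founded ontology is meant to make explicit.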
Another example cited by Guarino is the term ‘in’. What exactly are you describing when you say the spoon is in the cup? Does it mean that the spoon is totally embedded in the cup, or is it only partly in the cup? Guarino: ‘These examples seem trivial, but if you want real interoperability between different knowledge domains, you will have to prevent the problems that come with the ambiguity of day-to-day language.’
In this respect, Guarino thinks it is a drawback that computer science curricula scarcely ever contain an introduction to ontological foundations of conceptual modelling. ‘Students learn all about Java, HTML and C++, and name all the other languages, and they also learn how to use these. But when they graduate they hardly know a thing about formal ontology. I really think people should know more about the work on ontology that has been done in philosophy. It is certainly not much harder to acquire than, say, studying differential equations or learning how to use Java.’
It seems as if it is an enormous job to develop well-founded generic ontologies, but it is not as enormous a task as it appears. Guarino: ‘I would say that a few dozen would get you on the way nicely. But you have to take the fundamental route. At the moment, development of the Semantic Web is driven by the need for short-term results. Hence interoperability is realised by putting the right tags on the information. That is not what I call semantics; that is syntax. XML and RDF are very useful for this, but they fall short when you want to create a real Semantic Web.’
Laboratory of Applied Ontology: http://ontology.ip.rm.cnr.it
SEMANTIC WEB SHOULD BE BASED ON WELL-FOUNDED ONTOLOGIES
AN INTERVIEW WITH NICOLA GUARINO, LABORATORY OF APPLIED ONTOLOGY, ITALY
By Joost van Kasteren
The Semantic Web, according to a statement of its well-known advocate Tim Berners-Lee, is a vision of ‘a distributed machine which should function so as to perform socially useful tasks’.¹ This machine should allow intelligent software agents to understand semantic relationships between Web resources in order to seek relevant information and perform transactions for humans.
Contrasted with the existing human-readable Web, the Semantic Web is envisaged as a Web of machine-readable data that will be based on ‘languages for expressing information in a machine processable form’.² Key to an understanding of the Semantic Web, therefore, is how these languages function: how information is expressed in order that computers can automatically process Web sources and assist in making the Web more useful for humans. The aim of this chapter is to provide an overview of the Semantic Web concept by describing its general architecture, i.e. the interplay of its languages.
The chapter has two interrelated parts. Part 1 describes a Finnish project that strives to build the foundations for the “Finnish Museums on the Semantic Web” (FMS), a future semantic museum portal. This part consists of the information boxes on the following pages, which briefly describe the necessary elements and steps in the set-up of the FMS system. It is recommended to start by reading this description (see also graphic 3 on page 36, which provides an overview of the set-up of the FMS system). It should be helpful in gaining a general understanding of how semantic interoperability of, and new ways of interacting with, semantically marked-up cultural heritage information can be realised.
Part 2, the texts below the information boxes, is a primer that explains terms used in part 1 which represent core elements of the Semantic Web architecture, as well as providing illustrative examples. The explanations are not intended to give in-depth definitions of these elements; such definitions are provided in the relevant W3C specifications. The examples have been kept as simple as possible, but build on each other. In this way we will develop a (fictitious) Website, http://www.m-i.org, that provides semantically enhanced access to such marvellous medieval images as the ones we have used to illustrate this Thematic Issue.
How to Make Collection Metadata of Museums Semantically Interoperable on the Web – The “Finnish Museums on the Semantic Web” (FMS)
The Semantic Web concept is visionary, and there are dedicated people, also in the heritage sector, who are trying to make it a reality. In our example, a group of researchers and technology developers who work at the University of Helsinki and the Helsinki Institute for Information Technology are translating the Semantic Web vision for a future semantic museum portal.
The group’s two-year project will run until spring 2004 and is being carried out in co-operation with, and with funding from, major organisations including
A CULTURAL HERITAGE SEMANTIC WEB EXAMPLE & PRIMER
By Guntram Geser
1 Tim Berners-Lee: Interpretation and Semantics on the Semantic Web (1998), http://www.w3.org/DesignIssues/Interpretation.html
2 Tim Berners-Lee: Semantic Web Road Map (1998), http://www.w3.org/DesignIssues/Semantic.html
3 Robert DuCharme, http://lists.xml.org/archives/xml-dev/200211/msg00190.html
Espoo City Museum, Helsinki University Museum, National Board of Antiquities, Nokia, TietoEnator, and the National Technology Agency (TEKES).
The major goals of the project are to make collection metadata which stem from heterogeneous databases semantically interoperable on the Web, and to provide facilities for semantic browsing and searching in the combined knowledge base of the participating museums.
The project’s vision is called “Finnish Museums on the Semantic Web” (FMS), and its architecture allows for all Finnish museums to join in. However, in an approach of starting small but ambitious, the project is at present using the collection databases of two museums, the Espoo City Museum and the National Museum of Finland. Furthermore, the implementation is currently restricted to only one part of the collections: textiles.
In order to reach the FMS system’s goal of making the museums’ metadata semantically interoperable on the Web, the data must be harmonised on the syntactic and semantic level. For this harmonisation, the eXtensible Markup Language (XML) and the Resource Description Framework (RDF) are being used, of which RDF is the key language for achieving semantic interoperability of the heterogeneous sets of metadata.
RDF and Metadata - A Natural Fit
An observer of the diffusion of the Resource Description Framework (RDF) into various domains, Robert DuCharme, has commented: ‘I still find it a little ironic that, while RDF has gotten so much publicity as a technology for warm and fuzzy AI (Artificial Intelligence) pie-in-the-sky technology, it’s gotten most of its traction in the mundane world of metadata.’³
Yet, given the importance of metadata for the Semantic Web vision in general, it does not come as a surprise that metadata of key information communities belong to the first of RDF’s intended uses. RDF seems to gain momentum in particular among the library and other communities that use Dublin Core.
The current W3C RDF Primer (Working Draft, 23 January 2003), edited by Frank Manola and Eric Miller, labels RDF as ‘an ideal representation for Dublin Core information’ and describes Dublin Core as one of their ‘RDF in the field’ examples. (Cf. http://www.w3.org/TR/rdf-primer/)
At the Dublin Core Metadata Initiative (DCMI), ‘Expressing Simple Dublin Core in RDF/XML’ was announced as a DCMI Recommendation in October 2002, ‘the first in a series of recommendations for encoding Dublin Core metadata using mainstream Web technologies’, i.e. XML/RDF/XHTML. ‘Expressing Qualified Dublin Core in RDF/XML’ is currently a Proposed Recommendation. Cf. http://dublincore.org/groups/architecture/
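To give an impression of what Simple Dublin Core in RDF/XML looks like, the following minimal sketch builds one such description with Python's standard library. The two namespace URIs are the published RDF and Dublin Core element-set namespaces; the resource URI, the helper name dublin_core_rdf and the choice of properties are assumptions made for illustration, using the values of the medieval image example in this primer.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("dc", DC_NS)


def dublin_core_rdf(resource_uri, properties):
    """Build an rdf:RDF element holding one rdf:Description of resource_uri."""
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    description = ET.SubElement(
        root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": resource_uri}
    )
    for name, value in properties:
        # Each Dublin Core element becomes a child such as dc:creator.
        ET.SubElement(description, f"{{{DC_NS}}}{name}").text = value
    return root


record = dublin_core_rdf(
    "http://www.m-i.org/images/image5kb78d38i",  # hypothetical resource URI
    [
        ("title", "Eve emerges from Adam's body"),
        ("creator", "Alexander Master"),
        ("date", "circa 1430"),
    ],
)
rdf_xml = ET.tostring(record, encoding="unicode")
print(rdf_xml)
```

Each child element of rdf:Description states one property of the resource named in rdf:about, which is what makes the record readable as a set of statements rather than as an arbitrary XML tree.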
Syntactic Transformation 1: Creating the XML Documents
In the FMS system, the eXtensible Markup Language (XML) is used as the data transfer format. This transfer format enables the system to make use of the data originally stored in the museums’ heterogeneous collection databases. Therefore, each museum participating in the FMS initiative provides the relevant collection data as an XML document repository.
In a process of syntactic harmonisation, the data from a museum’s collection database are retrieved and transformed to an XML format conforming to the XML Schema of the FMS initiative.
The data to be published are read from the database through a ‘view’, which helps create the XML format. The view is a queryable interface, a virtual table that results from an SQL query which may join multiple tables of the database. Through the view, the data are queried so that the rows of the tables are grouped by collection items. For each item, the set of rows is combined into a single XML document.
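The view-based export described above could look roughly like the following. This is a minimal sketch, assuming an in-memory SQLite database with two hypothetical tables (items, origin) and a view named item_view; the FMS project's actual database layout, view definition and XML format are not given in the text, and namespaces are left out for brevity.

```python
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items  (item_id TEXT, type TEXT, subject TEXT);
    CREATE TABLE origin (item_id TEXT, creator TEXT, place TEXT, year TEXT);
    INSERT INTO items  VALUES ('image5kb78d38i', 'Column Miniature',
                               'Eve emerges from Adam''s body');
    INSERT INTO origin VALUES ('image5kb78d38i', 'Alexander Master',
                               'Utrecht', 'circa 1430');
    -- The view is a virtual table that joins the underlying tables.
    CREATE VIEW item_view AS
        SELECT i.item_id AS item_id, i.type AS type, i.subject AS subject,
               o.creator AS creator, o.place AS place, o.year AS year
        FROM items AS i JOIN origin AS o ON o.item_id = i.item_id;
""")


def rows_to_xml(connection):
    """Group the view's rows by item and build one XML document per item."""
    cursor = connection.execute("SELECT * FROM item_view ORDER BY item_id")
    columns = [description[0] for description in cursor.description]
    roots = {}
    for row in cursor:
        record = dict(zip(columns, row))
        item_id = record.pop("item_id")
        root = roots.setdefault(item_id, ET.Element("image", {"image_id": item_id}))
        for name, value in record.items():
            # Skip values already emitted by an earlier row of the same item.
            already_there = any(c.tag == name and c.text == value for c in root)
            if value is not None and not already_there:
                ET.SubElement(root, name).text = value
    return {item_id: ET.tostring(root, encoding="unicode")
            for item_id, root in roots.items()}


documents = rows_to_xml(conn)
print(documents["image5kb78d38i"])
```

Reading through a view keeps the export code independent of how many tables the museum's database actually uses; only the view definition has to change when the schema does.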
XML
XML is a markup language for describing data. It is a language created to allow anyone to design the structure of their own documents. An XML document contains text that consists of markup in the form of tags and plain text between them, the latter being just pure information (for example, <creator>Alexander Master</creator>). XML tags are not predefined; everyone can define his or her own tags.
4 FMS Documents: A short presentation is provided in Eero Hyvönen et al.: Cultural Semantic Interoperability on the Web. Case Finnish Museums Online, http://iswc2002.semanticweb.org/posters/hyvonen_a4.pdf; for detailed descriptions see Vilho Raatikka, Eero Hyvönen: Ontology-based Semantic Metadata Validation, and Hyvönen, Eero et al.: Semantic Interoperability on the Web. Case Finnish Museums Online. Both texts can be found in: Towards the Semantic Web and Web Services. Proceedings of the XML Finland 2002 Conference, http://www.cs.helsinki.fi/u/eahyvone/xmlfinland2002/Proceedings/XML2002-final.pdf
ITEM_VIEW (columns of the database view): item_id, type, subject, iconclass, creator, manuscript, place, year
XML shares the syntax and bracketed tags of the well-known HyperText Markup Language (HTML), but XML serves a different goal. While HTML is used to define the layout of pages on the WWW, XML is used to define the content of documents, for example, to specify that an area of text is the name of a creator.
XML allows for creating markup (e.g. <creator>) that seems to carry some semantics. However, for a computer a tag like <creator> carries as much semantics as a tag like <H1>. A computer simply does not know what a creator is and how the concept creator is related to other concepts (e.g. manuscript). For an XML processor, <H1> and <creator> or <manuscript> are all equally (and totally) meaningless. XML is all about describing data; on its own it does not do anything. There needs to be a processing program that uses the markup to interpret the various elements.
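The point that a tag carries no meaning by itself can be seen in a few lines. In this hedged sketch the table INTERPRETATION and the function interpret are hypothetical: the parser treats <creator> and <H1> alike, and any ‘meaning’ exists only because the processing program looks the tag name up in a table that a human wrote.

```python
import xml.etree.ElementTree as ET

doc_a = "<record><creator>Alexander Master</creator></record>"
doc_b = "<record><H1>Alexander Master</H1></record>"

# The only "semantics" available: a lookup table inside this program.
INTERPRETATION = {"creator": "person who made the item"}


def interpret(xml_text):
    """Map each child tag to what this particular program knows about it."""
    root = ET.fromstring(xml_text)
    return {
        child.tag: INTERPRETATION.get(child.tag, "unknown to this program")
        for child in root
    }


# Both documents parse equally well; the parser has no preference.
print(interpret(doc_a))
print(interpret(doc_b))
```

A different program with a different table would read the same documents differently, which is why shared, explicit descriptions of such terms are needed for interoperability.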
The graphic below illustrates the database rows to XML process as described in the info box. It provides a very simple example of an XML document that describes some data for one of the medieval column miniatures from the Koninklijke Bibliotheek, The Hague, which we were permitted to use for illustrating this Thematic Issue. It includes the Iconclass classification for this image: 71A3421, Eve emerges from Adam’s body (for the hierarchical path of this classification, see the section on ontologies).
Short explanations for the XML document (image5kb78d38i.xml) shown in the graphic below:
A well-formed XML document is one that conforms to the XML syntax rules, of which we would like to highlight the following:
(1) The document should begin with the XML declaration (optional in XML 1.0, but recommended), which defines the XML version and the character encoding used in the document. In the example below we use <?xml version="1.0" encoding="ISO-8859-1"?>, i.e. the document conforms to the 1.0 specification of XML and uses the ISO-8859-1 (Latin-1/West European) character set.
(2a, 10) The XML document must contain a single tag pair to define a root element; in our example, <mi:image> ... </mi:image>.
Database rows (item data from database rows, grouped by item):
image5kb78d38i, 'Column Miniature', 'Eve emerges from Adam's Body', '71A3421', 'Alexander Master', 'Historic Bible', 'Utrecht', 'circa 1430'

Rows to XML process

XML document (image5kb78d38i.xml):
(1)  <?xml version="1.0" encoding="ISO-8859-1"?>
(2)  <mi:image xmlns:mi="http://www.m-i.org/images"
         image_id="image5kb78d38i">
(3)    <mi:type>Column Miniature</mi:type>
(4)    <mi:subject>Eve emerges from Adam's body</mi:subject>
(5)    <mi:iconclass>71A3421</mi:iconclass>
(6)    <mi:creator>Alexander Master</mi:creator>
(7)    <mi:manuscript>Historic Bible</mi:manuscript>
(8)    <mi:place>Utrecht</mi:place>
(9)    <mi:year>circa 1430</mi:year>
(10) </mi:image>

Graphic 1: Database rows to XML process
(2b) Namespace: Since element names in XML are not fixed, name conflicts can occur when different documents use the same names describing different types of elements. To prevent such conflicts, a unique namespace should be defined using a Uniform Resource Identifier (URI). An XML namespace is a collection of names that are used as element types and attribute names (cf. http://www.w3.org/TR/REC-xml-names/). Our namespace declaration in the start tag of the root element is xmlns:mi="http://www.m-i.org/images", which binds the prefix mi to this namespace.
The namespace prefix mi (for medieval images) functions as a placeholder for the namespace name. It needs to show up in all element tags (e.g. <mi:type> ... </mi:type>).
(2c) image_id="image5kb78d38i": This is the id attribute, which contains the unique identifier of the data source record.
(3-9) All other elements must be within the root element and can themselves have sub-elements (child elements), which must be properly nested within their parent element. Our elements do not have sub-elements.
Other syntax rules are, for example: start tags and end tags must match and, because XML tags are case sensitive (i.e. the tag <mi:Creator> is different from the tag <mi:creator>), they must also be written in the same case; all elements must have a closing tag; all attribute values must be within quotation marks (e.g. "image5kb78d38i").
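The syntax rules listed above can be checked mechanically, since any XML parser rejects a document that is not well-formed. The sketch below uses Python's standard parser; the helper is_well_formed and the shortened test documents are illustrative assumptions. It also shows how the parser resolves the mi prefix to the full namespace name.

```python
import xml.etree.ElementTree as ET

MI_NS = "http://www.m-i.org/images"

GOOD = (
    f'<mi:image xmlns:mi="{MI_NS}" image_id="image5kb78d38i">'
    "<mi:creator>Alexander Master</mi:creator>"
    "</mi:image>"
)
# Same document, but the end tag differs in case from the start tag.
BAD_CASE = GOOD.replace("</mi:creator>", "</mi:Creator>")
# Same document, but the attribute value has lost its quotation marks.
BAD_QUOTES = GOOD.replace('"image5kb78d38i"', "image5kb78d38i")


def is_well_formed(xml_text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


root = ET.fromstring(GOOD)
# The prefix is only a placeholder: the parser stores the namespace name.
creator = root.find(f"{{{MI_NS}}}creator")

print(is_well_formed(GOOD), is_well_formed(BAD_CASE), is_well_formed(BAD_QUOTES))
print(root.tag, creator.text)
```

Well-formedness says nothing about whether the right elements are present; that is the job of validation against a schema.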
Syntactic Transformation 2: The XML Schema
In order to allow for syntactic harmonisation, the XML documents of the museums should conform to the XML Schema of the FMS initiative. Therefore, when the museums create their XML documents, they validate them against the initiative’s XML Schema. If the documents are valid, the process can continue to the semantic level.
XML Schema
The XML Schema defines the building blocks of an XML document, including:
| elements and attributes that can appear in a document;
| which elements are child elements, as well as their order and number;
| whether an element is empty or can include text;
| the data types for elements and attributes;
| as well as default and fixed values for elements and attributes.
XML with an XML Schema is designed to be self-descriptive. One of the greatest strengths of XML Schema is that it allows for data typing. The most common data types are xs:string, xs:decimal, xs:integer, xs:boolean, xs:date and xs:time. In the example below, which is the XML Schema for the XML document (image5kb78d38i.xml) shown in graphic 1, we only use the data type xs:string. This data type is used for values that contain character strings.
(1)   <?xml version="1.0"?>
(2a)  <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
(2b)      targetNamespace="http://www.m-i.org/images"
(2c)      xmlns="http://www.m-i.org/images"
(2d)      elementFormDefault="qualified">
(3)     <xs:element name="image">
(4)       <xs:complexType>
(5)         <xs:sequence>
(6)           <xs:element name="type" type="xs:string"/>
(7)           <xs:element name="subject" type="xs:string"/>
(8)           <xs:element name="iconclass" type="xs:string"/>
(9)           <xs:element name="creator" type="xs:string"/>
(10)          <xs:element name="manuscript" type="xs:string"/>
(11)          <xs:element name="place" type="xs:string"/>
(12)          <xs:element name="year" type="xs:string"/>
(13)        </xs:sequence>
(14)        <xs:attribute name="image_id" type="xs:string" use="required"/>
(15)      </xs:complexType>
(16)    </xs:element>
(17)  </xs:schema>

Short explanations
(1) The XML declaration, which states that the document conforms to the 1.0 specification of XML.
(2a) Determines that the elements and data types that are used to construct the schema come from the W3C’s XML Schema namespace. Consequently, each of the elements and data types in the schema has the prefix xs, which identifies them as belonging to the vocabulary of the XML Schema language rather than the vocabulary of our (fictitious) organisation M-i.org.
(2b) Indicates that the elements defined by this schema come from our http://www.m-i.org/images namespace.
(2c) Is our default namespace.
(2d) Demands that any elements used by an XML document which were declared in this schema must be namespace qualified.
(3, 16) The parent element for the data typing of the image descriptions we provide at http://www.m-i.org/images. It is (4) defined as an xs:complexType, i.e. it contains child elements (6-12), which are (5, 13) surrounded by an xs:sequence element that defines an ordered sequence of these elements.
(6-12) The child elements, which in our example are simple types because they do not contain other elements. They define the various elements of our XML documents, e.g. "creator", to be of the data type xs:string.
(14) Furthermore, the Schema determines that for the element declared by xs:element name="image" there is a required attribute image_id of the data type xs:string (for example, image5kb78d38i).
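To make the constraints expressed by the schema tangible, the sketch below re-states three of them by hand: the root element and its namespace, the required image_id attribute, and the ordered sequence of child elements. It is not an XML Schema validator, and the names check_image_document and build_document are hypothetical; in practice a schema-aware XML library would read the schema itself and perform the validation.

```python
import xml.etree.ElementTree as ET

MI_NS = "http://www.m-i.org/images"
# Order of the child elements as declared inside xs:sequence.
EXPECTED_CHILDREN = ["type", "subject", "iconclass", "creator",
                     "manuscript", "place", "year"]


def check_image_document(xml_text):
    """Return a list of problems; an empty list means the checks passed."""
    root = ET.fromstring(xml_text)
    problems = []
    if root.tag != f"{{{MI_NS}}}image":
        problems.append("root element is not 'image' in the expected namespace")
    if "image_id" not in root.attrib:
        problems.append("required attribute image_id is missing")
    found = [child.tag for child in root]
    wanted = [f"{{{MI_NS}}}{name}" for name in EXPECTED_CHILDREN]
    if found != wanted:
        problems.append("child elements differ from the declared sequence")
    return problems


def build_document(names):
    """Build a small test document with the given child element order."""
    children = "".join(f"<mi:{n}>x</mi:{n}>" for n in names)
    return (f'<mi:image xmlns:mi="{MI_NS}" image_id="image5kb78d38i">'
            f"{children}</mi:image>")


valid = build_document(EXPECTED_CHILDREN)
swapped = build_document(list(reversed(EXPECTED_CHILDREN)))

print(check_image_document(valid))
print(check_image_document(swapped))
```

Because every child is typed as xs:string, no content checks are needed here; stricter types such as xs:date or xs:integer would add checks on the element text as well.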
Major Benefits of XMLIn the context of the Semantic Web XML providesan interoperable syntactical foundation upon whichsolutions to the issues of representing relationshipsand meaning can be builtWe also want to highlightthe many benefits of XML that are adding to itsrapid uptake in the first place and might in thelonger term be supportive in realising the SemanticWeb vision on a broader scale
XML is one of the most important standardsdevelopments in recent years It is an internationaluniversal non-system and non-application specificdata exchange standard XML is internationalbecause it employs UnicodeThis means that thereis no restriction to the western alphabet but ArabicChinese Greek HebrewThai etc can be easilyintegrated
XML is non-system specific because it is an open standard set by the World Wide Web Consortium (W3C). As such, there is no owner of XML. All the major software suppliers support it, and it can be used on any computing platform from Windows and MacOS to Linux. This makes it easier for organisations to change systems or combine different systems.
XML is also non-application specific, i.e. it can be used in various applications such as data exchange, data harvesting, Web site management, etc. XML is gaining ever-wider acceptance in many application domains, including in particular the cultural heritage community.
Bear in mind also that the major collection management software producers have implemented support for XML in their systems, enabling, for example, the integration of data from different collections and their combination over the Web.
XML allows for multi-channel publishing, i.e. with XML it is easy to produce different products or services from digital cultural heritage assets. Once the data are structured in XML, they can be displayed across a variety of media using an associated style sheet that contains the display information.
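The idea of one structured source feeding several outputs can be sketched as follows. This is a minimal illustration, not an XSLT style sheet: two small Python functions stand in for two style sheets, and the record (element names `type` and `creator`, plus the `image_id` attribute) is an assumed sample loosely modelled on the primer's image example.

```python
import html
import xml.etree.ElementTree as ET

# Assumed sample record; the structure is hypothetical.
RECORD = (
    '<image image_id="image5kb78d38i">'
    "<type>Miniature</type>"
    "<creator>Alexander Master</creator>"
    "</image>"
)


def to_plain_text(xml_text: str) -> str:
    """Render the record as 'name: value' lines, e.g. for a text-only channel."""
    root = ET.fromstring(xml_text)
    return "\n".join(f"{child.tag}: {child.text}" for child in root)


def to_html(xml_text: str) -> str:
    """Render the same record as an HTML list, e.g. for a Web page."""
    root = ET.fromstring(xml_text)
    image_id = html.escape(root.get("image_id", ""))
    items = "".join(
        f"<li>{html.escape(child.tag)}: {html.escape(child.text or '')}</li>"
        for child in root
    )
    return f'<ul id="{image_id}">{items}</ul>'


print(to_plain_text(RECORD))
print(to_html(RECORD))
```

The data are written once; only the rendering function changes per channel, which is the role a style sheet plays in a real XML publishing chain.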
Finally, XML can be used to create new languages. For example, the Wireless Markup Language (WML), which is used to mark up Internet applications for handheld devices, is written in XML.
Ontological concepts
The goal of the FMS project is to make metadata of the museums' textiles collections semantically interoperable on the Web. In order to achieve such interoperability, an ontology is being designed that describes the common (lower-level) ontological concepts in this domain of knowledge.[4]
Ontology
In the Semantic Web architecture, the semantic relationships are not embedded but explicitly represented by an ontology or, rather, an interrelated set of ontologies. In fact, the wide array of information residing on the Web and the perceived need to make it more machine-processable have acted as a strong impetus for the development of ontology languages.
Yet what is an ontology? An abstract definition of an ontology is that it describes a formal, shared conceptualisation of a particular domain of interest, for example cultural heritage objects held in art museums. In particular, an ontology allows for constraining, expressing and analysing the intended meaning of the shared vocabulary of concepts and relations in a domain of knowledge.[5]
If these concepts and relations are formalised to a high degree, the domain has at hand a major building block for developing semantically aware information systems. With Semantic Web technologies, the domain ontology can be made available on the network, cross-referenced with upper-level and other domain ontologies,[6] and remote applications or intelligent software agents can refer to it when they interact to provide a certain information service.

[4] The available documents (in English) on the FMS initiative state that their ontology is being created using RDF Schema (RDFS). To develop a fully fledged ontology, advanced languages such as DAML+OIL or Web Ontology Language (OWL) would be required.
[5] For more elaborate and formal descriptions see: Tom Gruber, What is an Ontology? (1995), http://www-ksl.stanford.edu/kst/what-is-an-ontology.html; Nicola Guarino, Ontology-Driven Conceptual Modelling, part 1-3 (2002), http://ontology.ip.rm.cnr.it/Tutorials
The degree of formalisation of concepts and their relations varies considerably between different domains of knowledge. At the lower end, one finds lexicons and simple taxonomies (i.e. ordered classification systems in which terms are related hierarchically). At the middle level, one might place thesauri, i.e. controlled vocabularies that are structured to show relationships between terms and concepts and, for example, allow for retrieving them from a database. At the high end of formalisation of knowledge, there are axiomatised logic theories. Such theories include rules to ensure the well-formedness and logical validity of statements expressed in the language of the scientific discipline.
In the cultural heritage sector, a powerful IT-supported example of a hierarchical classification system is Iconclass. This supports the documentation of images, in particular art historical images, by providing a systematic collection of 28,000 ready-made definitions of objects, persons, events, situations and abstract ideas that can be the subject of an image. The definitions consist of an alphanumeric classification code and its textual correlate.[7]
For example, for the image we have described in XML in graphic 1, the Iconclass definition is: 71A3421 Eve emerges from Adam's body. The Medieval Illustrated Manuscripts Website of the Koninklijke Bibliotheek, The Hague, has an Iconclass Browser in place[8] that provides the hierarchical path for this concept in the classification system.
An outstanding example of a controlled vocabulary is the Art & Architecture Thesaurus (AAT), one of the Getty Research Institute's Vocabulary Databases. It is a structured vocabulary of more than 125,000 terms and other information about concepts that are used for describing fine art, architecture, decorative arts, archival materials and material culture.[9]
The W3C's Resource Description Framework (RDF) Schema Specification 1.0, in a section on its scope, mentions concept navigation and states: 'Thesauri and library classification schemes are well known examples of hierarchical systems for representing subject taxonomies in terms of the relationships between named concepts. The RDF Schema specification provides sufficient resources for creating RDF models that represent the logical structure of thesauri (and other library classification systems).'[10]
Yet for realising a full-blown cultural heritage ontology for the Semantic Web, there are currently limitations on both sides. On the one hand, hierarchical classification systems and structured vocabularies do not lend themselves easily to rich inter-linking of conceptual 'trees'.
A major step further in this direction is the CIDOC object-oriented Conceptual Reference Model (CRM).[11] This provides an ontology of 81 classes and 130 properties, which describes in a formal language concepts and relations relevant to the documentation of cultural heritage.
On the other hand, RDF Schema has limitations when it comes to expressing complex ontological relationships. New languages based on description logics are being developed. These include DAML+OIL and the upcoming Web Ontology Language (OWL), which are capable of fully describing ontologies.
Also worth highlighting is that tools for ontology building are proliferating at the present time. These ontology editors need to be carefully assessed, as their capabilities differ considerably.[12]
Hierarchical path for 71A3421 in Iconclass:
7 Bible
71 Old Testament
71A Genesis from the creation to the expulsion from paradise, and later years of Adam and Eve
71A3 creation of man; the Garden of Eden (Genesis 1:26-2)
71A34 creation of Eve
71A342 Eve is fashioned from Adam's rib
71A3421 Eve emerges from Adam's body
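For a plain notation such as 71A3421, each broader concept in the path is simply a shorter prefix of the code. The following sketch derives the path that way. It is an illustration under that assumption only: the function name `iconclass_path` is hypothetical, and notations that use additional Iconclass devices (for instance bracketed text) would need more careful parsing.

```python
def iconclass_path(notation: str) -> list[str]:
    """Return every prefix of a plain Iconclass notation, broadest first.

    Assumes one character per hierarchy level, which holds for simple
    codes such as '71A3421' but not for every Iconclass notation.
    """
    return [notation[:length] for length in range(1, len(notation) + 1)]


for code in iconclass_path("71A3421"):
    print(code)
```

A browser like the one described could then look up the textual correlate of each prefix to display the full hierarchical path.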
[6] Upper-level ontologies describe the basic concepts and relationships invoked when information about any domain is expressed in natural language.
[7] For in-depth information see the official Iconclass Website: http://www.iconclass.nl
[8] Medieval Illustrated Manuscripts Website: http://www.kb.nl/kb/manuscripts/browser. The subject access system for the Website was conceived by Mnemosyne Partners, building on the Iconclass classification system and technologies. See the valuable information they provide at httpwwwmnemosyneorg businessmsstempexhtml
[9] A detailed description of the AAT is provided at http://www.getty.edu/research/tools/vocabulary/aat/about.html
[10] http://www.w3.org/TR/2000/CR-rdf-schema-20000327
[11] See the information box at the end of the Forum discussion and the sources mentioned in the Semantic Web Terms and Reading List.
[12] Michael Denny, Ontology Building: A Survey of Editing Tools (06-11-2002), http://www.xml.com/pub/a/2002/11/06/ontologies.html
Semantic Transformation 1: The RDF Data Model
Progressing towards semantic interoperability, the metadata in the XML documents are now transformed into RDF statements corresponding to the RDF data model. With these so-called RDF 'triples', the XML metadata elements are mapped to the RDF classes and properties which are defined by the RDF Schema of the FMS initiative (see section 3).
'XML is nothing more than a way to standardize data formats. This is not to underplay XML's importance. A data-format standard makes all of the more glamorous technologies possible, and RDF is the leading example of the benefit that comes once the data format has been standardized. Many proclaim that RDF is really the XML's killer app, and with good reason. Despite all this, RDF remains somewhat obscure. This is mainly because, at its core, RDF is very abstract, very dry and very academic.'
Uche Ogbuji: An introduction to RDF (2000), http://www-106.ibm.com/developerworks/library/w-rdf/?dwzone=xml
RDF Data Model
In order to make Web resources semantically interoperable, we need resources that provide machine-understandable information about themselves. In the Semantic Web architecture, such self-descriptive statements are built by using the Resource Description Framework (RDF).
RDF defines a data model for the statements, describing typed relationships between uniquely identified resources. RDF distinguishes between:
| resources: familiar examples are a Web page, an electronic document or a digital image, but in RDF entities that are not 'network retrievable', e.g. museums, curators or bound medieval manuscripts, can also be resources;
| properties: these identify a specific aspect, characteristic, attribute or relation used to describe the resource;
| statements: these associate a value for a named property with the resource.
Hence, RDF provides a model for describing relationships between resources in terms of named properties and values. The RDF data model intrinsically supports only binary relations. Its base element is the 'triple', which takes the form of subject, predicate, object: a resource (the subject) is linked to another resource (the object) through an arc labelled with a third resource (the predicate). The semantics of a triple clearly depends on the property used as predicate.
A convenient way to visualise this is to draw nodes for subject and object and an arrow between them for the predicate (see graphic 2). In this labelled, directed graph, subject and predicate (property) are Uniform Resource Identifiers (URIs), and the object is either a URI or a literal (which is drawn as a box). Everything in RDF can be represented by a graph with nodes and arcs, and the data model allows for using the same URI as a node and as an arc label.
To represent RDF statements in a machine-processable way, RDF builds on XML. With RDF/XML, a specific XML markup language, RDF information can be represented and exchanged between machines.
Graphic 2: RDF Data Model

Triple 1
Subject: http://www.m-i.org/schemas/images#ColumnMiniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Miniature

Triple 2
Subject: http://www.m-i.org/schemas/images#Miniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Image

With these two triples we state that ColumnMiniature is a subclass of Miniature, and that Miniature is a subclass of Image.
The predicate in our statements is the rdfs:subClassOf property, which is predefined in the RDF Schema namespace http://www.w3.org/2000/01/rdf-schema#
The classes Image, Miniature and ColumnMiniature would also need to be defined in our RDF Schema namespace http://www.m-i.org/schemas/images
For details on how to use RDF Schema for defining your domain ontology, see section 3. In a nutshell, RDF Schema is a 'higher-level' language which is itself defined using RDF.
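The two triples of Graphic 2 can be written down directly as data. The sketch below is a minimal illustration in plain Python rather than an RDF library: the `Triple` type and the `objects` helper are hypothetical names, the `#` separator in the m-i.org namespace is an assumption, and URIs and literals are both simplified to strings.

```python
from typing import NamedTuple

# Namespace prefixes; the '#' separator for the m-i.org namespace is assumed.
MI = "http://www.m-i.org/schemas/images#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"


class Triple(NamedTuple):
    subject: str    # a URI
    predicate: str  # a URI
    obj: str        # a URI or a literal (both plain strings in this sketch)


# A graph is simply a set of triples.
graph = {
    Triple(MI + "ColumnMiniature", RDFS + "subClassOf", MI + "Miniature"),
    Triple(MI + "Miniature", RDFS + "subClassOf", MI + "Image"),
}


def objects(graph, subject, predicate):
    """Follow the arcs labelled `predicate` that leave the node `subject`."""
    return {t.obj for t in graph if t.subject == subject and t.predicate == predicate}


print(objects(graph, MI + "Miniature", RDFS + "subClassOf"))
```

Because a graph is only a set of such triples, graphs from different sources can be merged by set union, which is what makes the model attractive for combining metadata.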
Semantic Transformation 2: Creating and Validating the RDF Statements
In the mapping process, an editor tool (in the FMS case, a tool developed by the project team called Meedio) receives the XML documents as input and assists in transforming them into semantically valid RDF statements (instance descriptions). The tool serves as an instance editor, which provides a convenient way of finding and selecting, from the XML metadata elements, correct instance values for a particular property. The editor tool also serves as a semantic metadata validator. When the museum cataloguer saves the set of RDF statements corresponding to an XML document, the semantics of these statements are validated against the property constraints of the FMS ontology. The result of a successful mapping and validation process is a unique set of RDF triples called the RDF card.
RDF Mapping Rules
When RDF is used to define the meaning of XML metadata elements, a set of mapping rules is created. A mapping rule is a template of RDF triples where XPath expressions are used to identify the actual element values. XPath is a language for addressing parts of an XML document and was designed for use in XML parsing software (XSLT, XPointer and others).
When applying such a rule to an XML document, the XPath expressions are instantiated with matching element values. If the rule matches, the RDF template evaluates to a set of RDF triples where XPath expressions are substituted by the corresponding values of the XML elements.
For example, by applying the template rule
<image5kb78d38i/type mi:hasCreator image5kb78d38i/creator>
to the XML document described in the section on XML, the following result would be obtained:
<Image Miniature mi:hasCreator 'Alexander Master'>
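How such a template could be evaluated is sketched below. This is a simplified stand-in, not the Meedio tool: the XML layout, the `xpath:` marker and the function `apply_rule` are assumptions made for illustration, and the standard library's ElementTree supports only a limited XPath subset.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML document; the element layout is assumed for this sketch.
XML_DOC = """
<images>
  <image image_id="image5kb78d38i">
    <type>Image Miniature</type>
    <creator>Alexander Master</creator>
  </image>
</images>
"""

# A rule is a triple template. Parts marked 'xpath:' are paths evaluated
# relative to each <image> element; other parts are copied unchanged.
RULE = ("xpath:type", "mi:hasCreator", "xpath:creator")


def apply_rule(rule, xml_text):
    """Instantiate the template once per <image>; skip images where a path is missing."""
    root = ET.fromstring(xml_text)
    triples = set()
    for image in root.iter("image"):
        resolved = []
        for part in rule:
            if part.startswith("xpath:"):
                value = image.findtext(part[len("xpath:"):])
                if value is None:
                    break  # the rule does not match this element
                resolved.append(value.strip())
            else:
                resolved.append(part)
        else:
            triples.add(tuple(resolved))
    return triples


print(apply_rule(RULE, XML_DOC))
```

A rule that refers to an element missing from the document yields no triple at all, which mirrors the "if the rule matches" condition described above.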
mi:hasCreator is an example of an RDF property. Such properties are explained in section 3, RDF Schema.
Note: Due to the limited space permitted, we do not address issues of term mapping. This is an important aspect of the mapping and validation process carried out in the FMS project. Working with metadata from different museums, they need to deal with partly different terminologies. Their technical solution to synonymous terms (i.e. different terms referring to the same concepts) is to attach synonym sets to the FMS ontology classes. With situations where polysemous terms occur (i.e. the same terms refer to different concepts), the editor tool cannot cope, and the cataloguer needs to select the correct interpretation.
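The synonym-set approach can be illustrated with a small lookup. This sketch is hypothetical throughout: the class names, the terms and the functions `candidate_classes` and `resolve` are invented for the example and are not taken from the FMS ontology.

```python
# Hypothetical synonym sets attached to ontology classes.
SYNONYM_SETS = {
    "mi:Miniature": {"miniature", "illumination"},
    "mi:DecoratedInitial": {"decorated initial", "initial"},
    "mi:Letter": {"letter", "initial"},  # makes 'initial' polysemous
}


def candidate_classes(term):
    """Return every class whose synonym set contains the term."""
    key = term.strip().lower()
    return sorted(cls for cls, terms in SYNONYM_SETS.items() if key in terms)


def resolve(term):
    """Map a term to a class automatically only when exactly one class matches."""
    matches = candidate_classes(term)
    if len(matches) == 1:
        return matches[0]
    return None  # unknown or polysemous: the cataloguer has to decide


print(resolve("Illumination"))
print(resolve("initial"), candidate_classes("initial"))
```

Synonyms resolve automatically because several terms point to one class; a polysemous term points to several classes, so the sketch returns no answer and leaves the choice to a person.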
Semantic Transformation 3: The RDF Schema (RDFS)
The shared ontology for the textiles domain is created by using Resource Description Framework Schema (RDFS). An RDF Schema is a tool for indicating the classes of resources one wants to describe, as well as for defining the properties used to describe those resources. Furthermore, class/sub-class relationships and property/sub-property relationships can be defined. The museums are mapping their metadata to the classes and properties defined by the RDF Schema of the FMS initiative. Thereby they are making the meaning of the metadata explicit and representing them in a harmonised, uniform way.
RDF Schema
In section Semantic Transformation 1, we have described the data model provided by RDF for expressing statements about Web resources. But we also need a vocabulary for the RDF statements, namely classes and properties defined with RDF Schema (RDFS).
In brief, the RDF Schema mechanism provides a pre-defined vocabulary, a basic type system that can be used in creating domain-specific schemas. Its role is to allow for declaring metadata properties (e.g. for 'type', 'subject' or 'creator'), to define the classes of resources they may be used with, to restrict possible combinations, and to detect violations of those restrictions.
Defining classes
With RDF Schema (RDFS), Web resources can be defined as instances of one or more classes. In addition, classes can be organised in a hierarchical fashion. As we hold a collection of digital images drawn from illustrated medieval manuscripts, we first need to define a class of things that are images. In RDF Schema, a class is any resource having an rdf:type property whose value is the RDFS-defined resource rdfs:Class.
So, using the basic RDF data model, we define:
mi:Image [resource] rdf:type [property] rdfs:Class [value]
The self-defined prefix mi: (for medieval images) stands for the URI reference of our RDF Schema namespace, http://www.m-i.org/schemas/images
In our image collection we have various special kinds of digitised images, such as column miniatures, decorated initials, schematic drawings, etc. To distinguish, for example, the miniatures, we first need to define a general class Miniature and subclasses of miniatures, e.g. a subclass ColumnMiniature:
mi:Miniature rdf:type rdfs:Class
mi:ColumnMiniature rdf:type rdfs:Class
Secondly, we need to define that mi:ColumnMiniature is a subclass of mi:Miniature and that mi:Miniature is a subclass of mi:Image, for which we use the predefined rdfs:subClassOf property:
mi:Miniature rdfs:subClassOf mi:Image
mi:ColumnMiniature rdfs:subClassOf mi:Miniature
As the rdfs:subClassOf property is transitive, this means that mi:ColumnMiniature is also implicitly a subclass of mi:Image.
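The implicit subclass relationship that follows from transitivity can be computed by walking the stated pairs. A minimal sketch, assuming the two subclass statements from the example; the function name `superclasses` is hypothetical.

```python
# Stated rdfs:subClassOf pairs, written as (subclass, superclass).
SUBCLASS_OF = {
    ("mi:ColumnMiniature", "mi:Miniature"),
    ("mi:Miniature", "mi:Image"),
}


def superclasses(cls, pairs):
    """Return all direct and implied superclasses of `cls`."""
    found = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for sub, sup in pairs:
            if sub == current and sup not in found:
                found.add(sup)
                frontier.append(sup)  # keep climbing from the new superclass
    return found


print(superclasses("mi:ColumnMiniature", SUBCLASS_OF))
```

Only two statements were written down, yet the result for mi:ColumnMiniature also contains mi:Image, which is the implied statement described above.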
Graphic 2 on page 32 visualises this with the nodesand arcs of the basic RDF data model
Defining properties
In order to make the meaning of our metadata (i.e. 'type') explicit, we need to be capable of declaring specific properties that characterise the classes of things we hold at http://www.m-i.org, e.g. digital images of medieval column miniatures.
Basically, RDF Schema defines properties in terms of the classes of resources to which they apply. This is the role of the rdfs:domain and rdfs:range mechanisms.
rdfs:range
The range constraint defines the class or set of classes whose instances can be values of a particular property. If we want to define the property mi:hasType, we must describe this resource (which we locate at http://www.m-i.org/schemas/images) with an rdf:type property whose value is rdf:Property:
mi:hasType [resource] rdf:type [property] rdf:Property [value]
The following RDF statements indicate that mi:ColumnMiniature is a class, mi:hasType is a property, and RDF statements using the mi:hasType property have instances of mi:ColumnMiniature as values:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:range mi:ColumnMiniature
rdfs:domain
The domain constraint restricts the set of classes whose instances may have a particular property attached to them. If we want to indicate that the property mi:hasType applies to instances of class mi:ColumnMiniature, we would write:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:domain mi:ColumnMiniature
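Read as constraints, as this primer does, domain and range can be checked mechanically for each statement. The sketch below assumes that reading (later RDFS semantics treat domain and range as inference rules rather than as validation rules). The instance names `ex:img1`, `ex:img2` and `ex:doc1` and the function `violations` are hypothetical, and subclass reasoning is left out for brevity.

```python
# Constraints taken from the example statements for mi:hasType.
CONSTRAINTS = {
    "mi:hasType": {"domain": "mi:ColumnMiniature", "range": "mi:ColumnMiniature"},
}

# Hypothetical instances and the classes they are declared to belong to.
TYPES = {
    "ex:img1": {"mi:ColumnMiniature"},
    "ex:img2": {"mi:ColumnMiniature"},
    "ex:doc1": {"mi:Image"},
}


def violations(triple, types, constraints):
    """Return the constraint kinds ('domain', 'range') that a statement breaks."""
    subject, predicate, obj = triple
    rule = constraints.get(predicate, {})
    problems = []
    if "domain" in rule and rule["domain"] not in types.get(subject, set()):
        problems.append("domain")
    if "range" in rule and rule["range"] not in types.get(obj, set()):
        problems.append("range")
    return problems


print(violations(("ex:img1", "mi:hasType", "ex:img2"), TYPES, CONSTRAINTS))
print(violations(("ex:doc1", "mi:hasType", "ex:img2"), TYPES, CONSTRAINTS))
```

A validator of the kind described for the editor tool would run such a check over every statement before a set of RDF statements is saved.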
Benefits of RDF
In a SearchWebServices.com definition of RDF, some benefits of RDF are mentioned:
| 'By providing a consistent framework, RDF will encourage the providing of metadata about Internet resources.
| Because RDF will include a standard syntax for describing and querying data, software that exploits metadata will be easier and faster to produce.
| The standard syntax and query capability will allow applications to exchange information more easily.
| Searchers will get more precise results from searching, based on metadata rather than on indexes derived from full text gathering.
| Intelligent software agents will have more precise data to work with.'[13]
This is a well-crafted listing of RDF benefits, from provision and exchange of better metadata to agents working with them, hopefully for the benefit of humans. But, as explicitly stated by SearchWebServices.com, these are only potential benefits, i.e. they depend on the level of actual uptake of RDF.
[13] whatis.com / searchWebServices.com Definitions - Resource Description Framework, http://searchwebservices.techtarget.com/sDefinition/0,,sid26_gci213545,00.html (last updated July 27, 2001)
Generating and Using the Knowledge Space
The RDF cards represent the original XML documents at the semantic level. The union of such RDF cards constitutes a knowledge base which is a harmonised semantic representation of the underlying heterogeneous databases.
However, so far the RDF instance descriptions have not left the museum. The museum has complete control of the information it wants to publish, and it does not need to allow the FMS system access to its internal database system. The RDF data are placed in a public directory on the museum's WWW server.
The Web crawler of the FMS system harvests the instance descriptions from the different museums, and the system combines them into an RDF repository. This repository is a large semantic graph that consists of the shared ontology and metadata.
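Combining the harvested descriptions amounts to taking the union of triple sets. The sketch below shows only that combining step, with no actual Web crawling: the museum identifiers, instance names and the function `build_repository` are hypothetical, and the RDF cards are assumed to be already available as sets of triples.

```python
# Shared ontology statements (from the class definitions in the example).
ONTOLOGY = {
    ("mi:Miniature", "rdfs:subClassOf", "mi:Image"),
    ("mi:ColumnMiniature", "rdfs:subClassOf", "mi:Miniature"),
}

# Hypothetical harvest result: one list of RDF cards per museum.
CARDS_BY_MUSEUM = {
    "museum-a": [
        {
            ("ex:a1", "rdf:type", "mi:ColumnMiniature"),
            ("ex:a1", "mi:hasCreator", "Alexander Master"),
        }
    ],
    "museum-b": [
        {("ex:b1", "rdf:type", "mi:Miniature")},
    ],
}


def build_repository(ontology, cards_by_museum):
    """Merge the shared ontology and every RDF card into one set of triples."""
    repository = set(ontology)
    for cards in cards_by_museum.values():
        for card in cards:
            repository |= card
    return repository


repository = build_repository(ONTOLOGY, CARDS_BY_MUSEUM)
print(len(repository))
```

Because every card uses the classes and properties of the same shared ontology, the merged set forms one connected graph rather than a pile of unrelated records.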
How does a user now search and navigate in this knowledge space? In the FMS system, this is implemented by a server-side software called Ontogator. Based on the semantic graph, this software dynamically generates semantic linkages for the user's Web browser.
One way of using the FMS system is view-based filtering. The user can select classes of resources from the ontology, and the system finds the instances that match the selected class restrictions. By constraining classes (views) further, the collection instance data searched for are eventually found.
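View-based filtering can be sketched as a query over such a graph: each selected class contributes the instances typed with it or with one of its subclasses, and further selections narrow the result by intersection. This is an illustrative sketch and not the Ontogator implementation; the repository contents and the names `descendants` and `instances_in_view` are assumptions.

```python
# Hypothetical repository of (subject, predicate, object) triples.
REPOSITORY = {
    ("mi:Miniature", "rdfs:subClassOf", "mi:Image"),
    ("mi:ColumnMiniature", "rdfs:subClassOf", "mi:Miniature"),
    ("ex:img1", "rdf:type", "mi:ColumnMiniature"),
    ("ex:img1", "rdf:type", "ex:FifteenthCentury"),
    ("ex:img2", "rdf:type", "mi:Image"),
}


def descendants(cls, repo):
    """Return `cls` together with all of its direct and indirect subclasses."""
    result = {cls}
    changed = True
    while changed:
        changed = False
        for s, p, o in repo:
            if p == "rdfs:subClassOf" and o in result and s not in result:
                result.add(s)
                changed = True
    return result


def instances_in_view(selected_classes, repo):
    """Return the instances that satisfy every selected class restriction."""
    matches = None
    for cls in selected_classes:
        classes = descendants(cls, repo)
        found = {s for s, p, o in repo if p == "rdf:type" and o in classes}
        matches = found if matches is None else matches & found
    return matches or set()


print(instances_in_view(["mi:Image"], REPOSITORY))
print(instances_in_view(["mi:Image", "ex:FifteenthCentury"], REPOSITORY))
```

Selecting mi:Image alone returns both instances; adding a second view narrows the result, which corresponds to constraining the views further.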
The software also supports topic-based navigation by providing semantic links between topics of interest, the creation of which is based on the collection domain ontology and the related metadata of the collection records. This means that the links also provide the user with an impression of the wider context and pragmatics of the objects in the museums' collections.
From human users to software agents
As described in the Finnish Museums on the Semantic Web example, the RDF repository is a large semantic graph that consists of the shared ontology and metadata of the participating museums. Such a repository can be queried, and the results, a set of pointers to the relevant resources, can be accessed using Web browsers. The opportunities provided by a system like the one developed by the FMS initiative (e.g. topic-based navigation) are at present restricted primarily to human users.
The Semantic Web vision includes intelligent software agents which 'understand' semantic relationships between Web resources and seek relevant information as well as perform transactions for humans.[14]
This software would be capable of autonomous action, i.e. it could run without direct human control or constant supervision, and ideally is very flexible in doing this. Characterisations of this flexibility include actions that are 'reactive', 'proactive' and 'social' (see below).
While the basic idea of agents is very intuitive and appealing, the actual theory is complex, the tools are immature, and the solutions small and prototype-based. In fact, as a parallel distributed systems technology, agents belong to the most complex class of software technology.
However, this primer will conclude with a summary of what an intelligent software agent is and what such software would generally be capable of doing. This should also serve as an indication of how great the challenge for research and technological development is to make the full Semantic Web vision a reality.
Intelligent Software Agents
The following definitions are taken from Michael Wooldridge's introduction to multiagent systems:[15]
Agent
'An agent is a computer system capable of autonomous action in some environment.'
Intelligent agent
'An intelligent agent is a computer system capable of flexible autonomous action in some environment.'
Flexible autonomous action
'By flexible autonomous action, we mean reactive, proactive, social.'
| Reactivity: 'A reactive system is one that maintains an ongoing interaction with its environment, and responds to changes that occur in it (in time for the response to be useful).'
| Proactiveness: 'An agent serves a purpose and therefore exhibits goal-directed behaviour, including the capacity to recognise opportunities for useful courses of action.'
| Social ability: 'Social ability in agents is the ability to interact with other agents (and possibly humans) via some kind of agent communication language, and perhaps cooperate with others.'
Desirable further properties of agents are:
| Mobility: the ability to move around an electronic network.
| Rationality: an agent will act in such a way that it does not prevent itself from achieving its goals (as far as this is possible with a limited set of beliefs representing its world knowledge).
| Learning: an agent will improve its performance over time.
[14] T. Berners-Lee, J. Hendler, O. Lassila, Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html
[15] M. Wooldridge, An Introduction to Multiagent Systems. Chichester: Wiley, 2002, and http://www.csc.liv.ac.uk/~mjw/pubs/imas/
Resources
In the primer, no references are made to the documents of the World Wide Web Consortium (W3C). All relevant W3C recommendations can be found at http://www.w3c.org
Of the wealth of introductory materials on XML and RDF available on the Web, the following in particular are useful to consult for further details:
http://www.w3schools.com/xml/
http://www.w3schools.com/schema/
http://www.w3.org/TR/rdf-primer/
K. S. Candan, H. Liu, R. Suvarna: Resource Description Framework: Metadata and its Applications. In: SIGKDD Explorations, Vol. 3/1 (2001), 6-19.
Pierre-Antoine Champin: RDF Tutorial (2001), http://www710.univ-lyon1.fr/~champin/rdf-tutorial/rdf-tutorial.html
S. Decker, M. P. Mitra, S. Melnik: Framework for the Semantic Web: An RDF Tutorial, http://www.ida.liu.se/~asmpa/courses/sweb/rdf/rdf_tutorial.pdf
Graphic 3: Set-up of the Finnish Museums on the Semantic Web System
The Finnish Museums on the Semantic Web: Overview of the System's Set-up
| User client: WWW browser (topic-based navigation, view-based filtering)
| Server: Ontogator software; RDF database holding the semantic graph (knowledge space of shared ontology and metadata)
| Web crawler: collects the RDF instances from the museums
| Museums (collection database 1, 2, ... n): relational schemas (DBMS), XML repository, Metadata Editor, RDF instances
| Interoperability levels: XML Schema (syntactic interoperability), RDF Schema (semantic interoperability)
THE DARMSTADT FORUM PARTICIPANTS

Wernher Behrendt, Salzburg Research, Austria
Wernher Behrendt is a Senior Researcher at the Sun Technology and Research Excellence Center at Salzburg Research (Austria), working on multimedia middleware and interoperation issues. He holds an MSc in Cognitive Science from Manchester University and has more than 10 years' experience in near-to-market IT research. From 1989 to 1995 he was a Senior Research Associate in the Informatics Department at Rutherford Appleton Laboratory (UK), working on embedded knowledge based systems in distributed multimedia presentation systems. From 1995 to 1998 he was a Senior Research Associate at Cardiff University (UK), working on interoperation between heterogeneous information systems. Mr Behrendt has held courses in Computer Science and has worked on projects ranging from software engineering methods and quality assurance to legacy system reengineering using migration methods and distributed systems middleware.
E-mail: wernher.behrendt@salzburgresearch.at

Paolo Buonora, Archivio di Stato di Roma, Italy
Paolo Buonora holds a degree in Philosophy from the University of Rome 'La Sapienza' (1976). He worked in the Italian State Archive Administration from 1978, where he was first involved in editing the Guida Generale degli Archivi di Stato italiani. From 1986 he worked in the Soprintendenza archivistica per il Lazio, surveying audiovisual archives and municipal archives; from 1989 to 1991 at Perugia University, engaged in doctoral research in 'Urban and rural history'; and from 1991 to 1994 again in the Soprintendenza archivistica per il Lazio. After 1994 he worked in the Archivio di Stato di Roma, where he was responsible for the photograph service and several working groups on informatics application in archival documentation. From 1997 until the present time he has planned and directed the Imago II project in the Archivio di Stato di Roma.
See httpwwwasrmarchivbeniculturaliitsidimagoIMAGOIIenhtml

Samuel Cruz-Lara, University of Nancy 2 and LORIA, France
Samuel Cruz-Lara obtained a Master's degree in Computer Science in 1984 (University Henri Poincaré, Nancy 1) and a PhD degree in Computer Science in 1988 (National Polytechnic Institute of Lorraine). The central topic of his PhD thesis was the generation of integrated development environments by using attribute grammars.
He is currently Associate Professor at the University of Nancy 2 (Institute of Technology, Computer Science Department) and permanent Researcher at LORIA (Lorraine Laboratory for Research in Computer Science and its Applications - UMR 7503 - CNRS - INRIA - Universities of Nancy). He is a member of the 'Language and Dialogue' team and has conducted several research activities on distributed software architectures and textual linguistic resources management. He is currently working in the context of distributed architectures and multimedia resources management. Dr Cruz-Lara has participated in several projects, in particular CNRS/SILFIDE and MLIS-ELAN, and he is at present co-leader of the 'Digital Museum' project. This is a joint project between LORIA and the National Chi-Nan University. The 'Digital Museum' project is sponsored by the National Science Council of the Republic of China (Taiwan) and supported by INRIA (France).

Costis Dallas, Critical Publics SA, Greece
Costis Dallas is Chairman and Senior Researcher of Critical Publics (http://www.criticalpublics.com), a group of companies active in the field of strategic communications, creative design and technology. He is currently a Lecturer in the Department of Communication and Mass Media of Panteion University and has over 15 years of research and professional experience in hypermedia applications, human factor issues and cultural information systems. Dr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA, co-founder and Chair of the Multimedia Working Group of the International Council of Museums (CIDOC/ICOM), Head of Documentation and Systems of the Benaki Museum, General Director of the Foundation of the Hellenic World, Special Secretary of the Greek Ministry of Education in charge of libraries, archives and instructional technologies, and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: info@criticalpublics.com

Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (http://www.nl.adlibsoft.com), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that he joined Databasix, producer of the ADLIB software. In 1991 Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was later renamed ADLIB Information Systems.
ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include the following. CIMI Dublin Core test bed: ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database, in addition to the 'standard' HTML/XML access. SPECTRUM-XML: a project of the UK mda, together with a team of software vendors, to produce a schema for exchange of data which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software. Internet Gelderse Musea (http://www.igem.nl) and Maritiem Digitaal (http://www.maritiemdigitaal.nl): two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.
E-mail: bert@nl.adlibsoft.com

Nicola Guarino, Italian National Research Council, Italy
Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991 and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS) and associated editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: Nicola.Guarino@ladseb.pd.cnr.it
Janneke van Kersen, Digital Heritage Association, The Netherlands
Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992 she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing, and more recently she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University.
Since October 1999 Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations such as museums, archives, archaeological organisations, monuments and special collections of libraries in a context-rich and XML-based environment. Interoperability, from both a technical and an organisational perspective, is therefore a major issue. The Association offers access to heritage information to the general public at http://www.cultuurwijzer.nl and customised access for the educational field at http://www.cultuurwijs.nl.
E-mail: jannekevankersen@den.nl
Marco Meli, EDW International, Italy
Marco Meli is CEO and co-founder of EDW International (http://www.edw-international.com), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia and VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meli@edw.it
Paul Miller, UKOLN, UK
Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, http://www.ukoln.ac.uk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK’s Further and Higher Education Funding Councils and Resource: The Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI) and the Metadata Working Group of the UK Government’s Office of the e-Envoy.
E-mail: P.Miller@ukoln.ac.uk
Frank Nack, CWI, The Netherlands
Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on ‘The Application of Video Semantics and Theme Representation for Automated Film Editing’ at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document, and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: Frank.Nack@cwi.nl
Franco Niccolucci, Florence University, Italy
Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference. See also http://www.geog.port.ac.uk/hist-bound/people/niccolucci.htm
Seamus Ross, HATII, University of Glasgow, UK
Dr Seamus Ross is Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Access Network) (IST-2001-32706), a European Union funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously, he was Assistant Secretary for Information Technology at the British Academy, and before that worked for a company specialising in expert systems and software development, as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group’s PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: S.Ross@hatii.arts.gla.ac.uk
Andrea Scotti, Institute & Museum for the History of Science, Italy
Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995 he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.
Since 1996 Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza, the National Central Library and under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (01.02.01-30.07.03), funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies. See http://cweb.inria.fr/Projects/Mesmuses
E-mail: scottian@galileo.imss.firenze.it
DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over the period of 30 months (03/2002-08/2004).
In order to encourage early take-up, DigiCULT produces seven Thematic Issues and three Technology Watch Reports, along with the newsletter DigiCULT.Info.
DigiCULT draws on the results of the strategic study ‘Technological Landscapes for Tomorrow’s Cultural Economy (DigiCULT)’, which was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications), in 2000 and completed in 2001.
Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at http://www.digicult.info.
For further information on DigiCULT please contact the team of the project co-ordinator:
Mr Guntram Geser, guntram.geser@salzburgresearch.at, Phone: +43-(0)662-2288-303
Mr John Pereira, john.pereira@salzburgresearch.at, Phone: +43-(0)662-2288-247
Salzburg Research Forschungsgesellschaft
Jakob-Haringer-Str. 5/III
A - 5020 Salzburg, Austria
Phone: +43-(0)662-2288-200
Fax: +43-(0)662-2288-222
http://www.salzburgresearch.at
Project Partner:
HATII - Humanities Advanced Technology and Information Institute, University of Glasgow
http://www.hatii.arts.gla.ac.uk
Contact: Mr Seamus Ross, sross@hatii.arts.gla.ac.uk
The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom
DIGICULT PROJECT INFORMATION
DigiCULT Thematic Issue 1 - Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum, held in Barcelona on May 6th, 2002, in the context of the DLM-Conference 2002.
DigiCULT Thematic Issue 2 – Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum, held in Essen, Germany, on September 3rd, 2002, in the context of the AIIM Conference @ DMS EXPO.
DigiCULT Thematic Issue 3 - Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum, held on January 21st, 2003, at Fraunhofer IPSI, Darmstadt, Germany.
DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum on Learning Objects, which will take place at the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, on July 2nd, 2003.
IMPRINT
This Thematic Issue is a product of the DigiCULT Project (IST-2001-34898).
Authors:
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy
Images:
Images for this Thematic Issue have been provided by and are reproduced with kind permission of the Koninklijke Bibliotheek – National Library of the Netherlands, The Hague, Netherlands.
Graphics & Layout:
Jan Steindl, Salzburg Research
ISBN 3-902448-00-8
Printed in Austria
© 2003
IMAGES
Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I. Image on p. 5, from Fol. 264r, size 80x75, illuminator: Orosius Master a.o.
Bible Historiale, Paris, c. 1320-1340, Volume I. Image on p. 7, from Fol. 2r, size 45x50, illum.: ‘Sub-Fauvel’ Master
Historic Bible, Utrecht, c. 1430, Volume I. Images on pp. 8, 10, 14, 28, from Fol. 3r, 3v, 4v, 7v, size 55x85 to 60x85, illum.: Alexander Master o.a.
Psalter Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468. Images on pp. 11, 13: Wednesday Matins, Invitatorium, from Fol. 219r
Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350. Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron), from Fol. 104rb1, 106rb2, 124r, 111ra, size 40x55 to 50x55
Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335. Image on p. 21, from Fol. 4va1, size 45x55
Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460. Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41
Psalter, Normandy, c. 1180. Image on p. 26, from Fol. 3v, size 160x125
Breviary, Cambray(?), c. 1275-1300. Images on pp. 27, 29, 31, 33, 34, 35, 37, from Fol. 211r, size 205x135
© Koninklijke Bibliotheek, The Hague. Used with permission.
‘The Semantic Web, as it is advocated by people like Tim Berners-Lee and James Hendler, does not take enough advantage of the experience built up in knowledge engineering and conceptual modelling. There is this anarchistic idea of the Web as a place where everyone can do his or her own thing. I have no problem with that; a lot of people are able to find what they want on the Web. But if you want real interoperability, with search engines that can grasp the intended meaning of information, that approach falls short. To create a real Semantic Web we have to develop and use well-founded generic ontologies, based on linguistics and logics.’
Nicola Guarino has clear views on the Semantic Web and its development. He is a senior researcher at the Institute for Cognitive Sciences and Technologies in Italy, where he leads the Laboratory for Applied Ontology. Since 1991 he has played an active role in the Artificial Intelligence community in promoting the interdisciplinary study of ontological foundations of knowledge engineering and conceptual modelling. Guarino: ‘In our Laboratory the focus is on content and not so much on representation. The use of ontologies is unavoidable when referring to content. People do it implicitly all the time when they are communicating and trying to understand each other. If we want machines to understand each other, in other words real interoperability, we need to make these ontologies explicit in an unambiguous way.’
An ontology is a hierarchical description of the relations between concepts in a certain domain, plus an unambiguous description of the concepts themselves. As they are created for a certain domain, ontologies often fail to be interoperable because of the ambiguity that results from the use of the same terms for different concepts (and vice versa) between different domains. The term ‘net’, for instance, has quite a different meaning for Web designers and fishermen. That is why there is a need for well-founded generic ontologies. An example of a concept that a generic ontology has to pin down is the term ‘part’, which can have different meanings both within a domain and between domains. For instance, the violist plays a part in the orchestra. His finger is part of him. Can his finger be part of the orchestra? According to Guarino this is a genuine ontological problem that can only be solved by giving an unambiguous meaning to the term ‘part’.
Another example cited by Guarino is the term ‘in’. What exactly are you describing when you say the spoon is in the cup? Does it mean that the spoon is totally embedded in the cup, or is it only partly in the cup? Guarino: ‘These examples seem trivial, but if you want real interoperability between different knowledge domains you will have to prevent the problems that come with the ambiguity of day-to-day language.’
In this respect Guarino thinks it is a drawback that computer science curricula scarcely ever contain an introduction to ontological foundations of conceptual modelling. ‘Students learn all about Java, HTML and C++, and name all the other languages, and they also learn how to use these. But when they graduate they hardly know a thing about formal ontology. I really think people should know more about the work on ontology that has been done in philosophy. It is certainly not much harder to acquire than, say, studying differential equations or learning how to use Java.’
It seems as if it is an enormous job to develop well-founded generic ontologies, but it is not as enormous a task as it appears. Guarino: ‘I would say that a few dozen would get you on the way nicely. But you have to take the fundamental route. At the moment, development of the Semantic Web is driven by the need for short-term results. Hence, interoperability is realised by putting the right tags on the information. That is not what I call semantics; that is syntax. XML and RDF are very useful for this, but they fall short when you want to create a real Semantic Web.’
Laboratory of Applied Ontology: http://ontology.ip.rm.cnr.it
SEMANTIC WEB SHOULD BE BASED ON WELL-FOUNDED ONTOLOGIES
AN INTERVIEW WITH NICOLA GUARINO, LABORATORY OF APPLIED ONTOLOGY, ITALY
By Joost van Kasteren
The Semantic Web, according to a statement of its well-known advocate Tim Berners-Lee, is a vision of ‘a distributed machine which should function so as to perform socially useful tasks’.¹ This machine should allow intelligent software agents to understand semantic relationships between Web resources in order to seek relevant information and perform transactions for humans.
Contrasted with the existing human-readable Web, the Semantic Web is envisaged as a Web of machine-readable data that will be based on ‘languages for expressing information in a machine processable form’.² Key to an understanding of the Semantic Web, therefore, is how these languages function: how information is expressed in order that computers can automatically process Web sources and assist in making the Web more useful for humans. The aim of this chapter is to provide an overview of the Semantic Web concept by describing its general architecture, i.e. the interplay of its languages.
The chapter has two interrelated parts. Part 1 describes a Finnish project that strives to build the foundations for the “Finnish Museums on the Semantic Web” (FMS), a future semantic museum portal. This part consists of the information boxes on the following pages, which briefly describe the necessary elements and steps in the set-up of the FMS system. It is recommended to start by reading this description (see also graphic 3 on page 36, which provides an overview of the set-up of the FMS system). It should be helpful in gaining a general understanding of how semantic interoperability of, and new ways of interacting with, semantically marked-up cultural heritage information can be realised.
Part 2, the texts below the information boxes, is a primer that explains terms used in part 1 which represent core elements of the Semantic Web architecture, as well as providing illustrative examples. The explanations are not intended to give in-depth definitions of these elements; such definitions are provided in the relevant W3C specifications. The examples have been kept as simple as possible but build on each other. In this way we will develop a (fictitious) Website, http://www.m-i.org, that provides semantically enhanced access to such marvellous medieval images as the ones we have used to illustrate this Thematic Issue.
How to Make Collection Metadata of Museums Semantically Interoperable on the Web – The “Finnish Museums on the Semantic Web” (FMS)
The Semantic Web concept is visionary, and there are dedicated people, also in the heritage sector, who are trying to make it a reality. In our example, a group of researchers and technology developers who work at the University of Helsinki and the Helsinki Institute for Information Technology are translating the Semantic Web vision for a future semantic museum portal.
The group’s two-year project will run until spring 2004 and is being carried out in co-operation with, and with funding from, major organisations including Espoo City Museum, Helsinki University Museum, National Board of Antiquities, Nokia, TietoEnator and the National Technology Agency (TEKES).
A CULTURAL HERITAGE SEMANTIC WEB EXAMPLE & PRIMER
By Guntram Geser
1 Tim Berners-Lee: Interpretation and Semantics on the Semantic Web (1998), http://www.w3.org/DesignIssues/Interpretation.html
2 Tim Berners-Lee: Semantic Web Road Map (1998), http://www.w3.org/DesignIssues/Semantic.html
3 Robert DuCharme: http://lists.xml.org/archives/xml-dev/200211/msg00190.html
The major goals of the project are to make collection metadata, which stem from heterogeneous databases, semantically interoperable on the Web, and to provide facilities for semantic browsing and searching in the combined knowledge base of the participating museums.
The project’s vision is called “Finnish Museums on the Semantic Web” (FMS), and its architecture allows for all Finnish museums to join in. However, in an approach of starting small but ambitious, the project is at present using the collection databases of two museums, the Espoo City Museum and the National Museum of Finland. Furthermore, the implementation is currently restricted to only one part of the collections - textiles.
In order to reach the FMS system’s goal of making the museums’ metadata semantically interoperable on the Web, the data must be harmonised on the syntactic and semantic level. For this harmonisation the eXtensible Markup Language (XML) and the Resource Description Framework (RDF) are being used, of which RDF is the key language for achieving semantic interoperability of the heterogeneous sets of metadata.
RDF and Metadata - A Natural Fit
An observer of the diffusion of the Resource Description Framework (RDF) into various domains, Robert DuCharme, has commented: ‘I still find it a little ironic that while RDF has gotten so much publicity as a technology for warm and fuzzy AI (Artificial Intelligence) pie-in-the-sky technology, it’s gotten most of its traction in the mundane world of metadata.’³
Yet, given the importance of metadata for the Semantic Web vision in general, it does not come as a surprise that metadata of key information communities belong to the first of RDF’s intended uses. RDF seems to gain momentum in particular among the library and other communities that use Dublin Core.
The current W3C RDF Primer (Working Draft, 23 January 2003), edited by Frank Manola and Eric Miller, labels RDF as ‘an ideal representation for Dublin Core information’ and describes Dublin Core as one of its ‘RDF in the field’ examples. (Cf. http://www.w3.org/TR/rdf-primer)
At the Dublin Core Metadata Initiative (DCMI), ‘Expressing Simple Dublin Core in RDF/XML’ was announced as a DCMI Recommendation in October 2002, ‘the first in a series of recommendations for encoding Dublin Core metadata using mainstream Web technologies’, i.e. XML/RDF/XHTML. ‘Expressing Qualified Dublin Core in RDF/XML’ is currently a Proposed Recommendation. Cf. http://dublincore.org/groups/architecture
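To give an impression of what Simple Dublin Core expressed in RDF/XML looks like, the following Python sketch builds such a description with the standard library. It is an illustration only, not taken from the DCMI documents: the resource URI under the fictitious m-i.org site, the function name dublin_core_rdf and the choice of three Dublin Core elements are our own assumptions; the two namespace URIs are the standard RDF and Dublin Core ones.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Ask the serialiser to use the conventional prefixes rdf: and dc:.
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("dc", DC_NS)

def dublin_core_rdf(resource_uri, fields):
    """Build an rdf:RDF element describing one resource with Dublin Core elements."""
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    description = ET.SubElement(
        root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": resource_uri}
    )
    for name, value in fields.items():
        ET.SubElement(description, f"{{{DC_NS}}}{name}").text = value
    return root

# Hypothetical resource URI on the fictitious m-i.org site.
record = dublin_core_rdf(
    "http://www.m-i.org/images/image5kb78d38i",
    {
        "title": "Eve emerges from Adam's body",
        "creator": "Alexander Master",
        "date": "circa 1430",
    },
)
rdf_xml = ET.tostring(record, encoding="unicode")
print(rdf_xml)
```

Each Dublin Core element becomes a property of the described resource, which is what makes the same record readable both as plain XML and as a set of RDF statements.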
Syntactic Transformation 1: Creating the XML Documents
In the FMS system the eXtensible Markup Language (XML) is used as the data transfer format. This transfer format enables the system to make use of the data originally stored in the museums’ heterogeneous collection databases. Therefore, each museum participating in the FMS initiative provides the relevant collection data as an XML document repository.
In a process of syntactic harmonisation the data from a museum’s collection database are retrieved and transformed to an XML format conforming to the XML Schema of the FMS initiative.
The data to be published are read from the database through a ‘view’, which helps create the XML format. The view is a queryable interface, a virtual table that results from an SQL query which may join multiple tables of the database. Through the view the data are queried so that the rows of the tables are grouped by collection items. For each item the set of rows is combined into a single XML document.
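The rows-to-XML step described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the FMS implementation: the two tables, the view name item_view and the sample row are invented for the example, and only the column names follow the ITEM_VIEW of graphic 1.

```python
import sqlite3
import xml.etree.ElementTree as ET
from itertools import groupby

# Column names as in the ITEM_VIEW of graphic 1 (item_id is handled separately).
COLUMNS = ["type", "subject", "iconclass", "creator", "manuscript", "place", "year"]

def create_sample_database():
    """Create an in-memory database with two hypothetical tables and a view joining them."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE item (item_id TEXT, type TEXT, subject TEXT, iconclass TEXT, year TEXT);
        CREATE TABLE origin (item_id TEXT, creator TEXT, manuscript TEXT, place TEXT);
        INSERT INTO item VALUES ('image5kb78d38i', 'Column Miniature',
                                 'Eve emerges from Adam''s body', '71A3421', 'circa 1430');
        INSERT INTO origin VALUES ('image5kb78d38i', 'Alexander Master',
                                   'Historic Bible', 'Utrecht');
        CREATE VIEW item_view AS
            SELECT i.item_id, i.type, i.subject, i.iconclass,
                   o.creator, o.manuscript, o.place, i.year
            FROM item AS i JOIN origin AS o ON o.item_id = i.item_id;
    """)
    return conn

def rows_to_documents(conn):
    """Query the view, group its rows by item and build one XML document per item."""
    query = "SELECT item_id, {} FROM item_view ORDER BY item_id".format(", ".join(COLUMNS))
    documents = {}
    for item_id, rows in groupby(conn.execute(query), key=lambda row: row[0]):
        root = ET.Element("image", {"image_id": item_id})
        seen = set()
        for row in rows:
            for name, value in zip(COLUMNS, row[1:]):
                if (name, value) not in seen:  # skip values repeated across joined rows
                    seen.add((name, value))
                    ET.SubElement(root, name).text = value
        documents[item_id] = ET.tostring(root, encoding="unicode")
    return documents

conn = create_sample_database()
documents = rows_to_documents(conn)
print(documents["image5kb78d38i"])
```

The view hides how the museum's tables are organised, so the export code only needs to know the agreed column names.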
XML
XML is a markup language for describing data. It is a language created to allow anyone to design the structure of their own documents. An XML document contains text that consists of markup in the form of tags and plain text between them, the latter being just pure information (for example, <creator>Alexander Master</creator>). XML tags are not predefined; everyone can define his or her own tags.
4 FMS Documents: A short presentation is provided in Eero Hyvönen et al.: Cultural Semantic Interoperability on the Web: Case Finnish Museums Online, http://iswc2002.semanticweb.org/posters/hyvonen_a4.pdf; for detailed descriptions see Vilho Raatikka, Eero Hyvönen: Ontology-based Semantic Metadata Validation, and Hyvönen, Eero et al.: Semantic Interoperability on the Web: Case Finnish Museums Online. Both texts can be found in: Towards the Semantic Web and Web Services. Proceedings of the XML Finland 2002 Conference, http://www.cs.helsinki.fi/u/eahyvone/xmlfinland2002/ProceedingsXML2002-final.pdf
ITEM_VIEW (columns of the view): item_id, type, subject, iconclass, creator, manuscript, place, year
XML shares the syntax and bracketed tags of the well-known HyperText Markup Language (HTML), but XML serves a different goal. While HTML is used to define the layout of pages on the WWW, XML is used to define the content of documents, for example to specify that an area of text is the name of a creator.
XML allows for creating markup (e.g. <creator>) that seems to carry some semantics. However, for a computer a tag like <creator> carries as much semantics as a tag like <H1>. A computer simply does not know what a creator is and how the concept creator is related to other concepts (e.g. manuscript). For an XML processor <H1> and <creator> or <manuscript> are all equally (and totally) meaningless. XML is all about describing data; on its own it does not do anything. There needs to be a processing program that uses the markup to interpret the various elements.
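The point that tag names are meaningless to an XML processor can be made concrete with a small Python sketch. The document fragments, the LABELS table and the function names are our own illustrative assumptions; the parser is the one from the Python standard library.

```python
import xml.etree.ElementTree as ET

def describe(xml_text):
    """List (tag, text) pairs exactly as a generic XML parser sees them."""
    return [
        (element.tag, element.text)
        for element in ET.fromstring(xml_text).iter()
        if element.text and element.text.strip()
    ]

# To the parser both tags are just names attached to a string.
as_creator = describe("<record><creator>Alexander Master</creator></record>")
as_heading = describe("<record><H1>Alexander Master</H1></record>")

# Any meaning is supplied by the processing program, here a hypothetical label table.
LABELS = {"creator": "Created by"}

def render(pairs):
    """Interpret the parsed pairs using the application's own knowledge of tag names."""
    return [f"{LABELS.get(tag, tag)}: {text}" for tag, text in pairs]

print(render(as_creator))
print(render(as_heading))
```

The parser treats both fragments identically; only the LABELS table, which lives in the application and not in the XML, knows that a creator is something special.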
The graphic below illustrates the database rows to XML process as described in the info box. It provides a very simple example of an XML document that describes some data for one of the medieval column miniatures from the Koninklijke Bibliotheek, The Hague, which we were permitted to use for illustrating this Thematic Issue. It includes the Iconclass classification for this image: 71A3421, Eve emerges from Adam’s body (for the hierarchical path of this classification see the section on ontologies).
Short explanations for the XML document (image5kb78d38i.xml) shown in the graphic below:
A well-formed XML document is one that conforms to the XML syntax rules, of which we would like to highlight the following:
(1) The document should begin with the XML declaration, which defines the XML version and the character encoding used in the document (the declaration is optional in XML 1.0, but recommended). In the example below we use <?xml version="1.0" encoding="ISO-8859-1"?>, i.e. the document conforms to the 1.0 specification of XML and uses the ISO-8859-1 (Latin-1/West European) character set.
(2a, 10) The XML document must contain a single tag pair to define a root element, in our example <mi:image> </mi:image>.
Database rows, grouped by item:
‘image5kb78d38i’, ‘Column Miniature’, ‘Eve emerges from Adam’s Body’, ‘71A3421’, ‘Alexander Master’, ‘Historic Bible’, ‘Utrecht’, ‘circa 1430’
Rows to XML process: the item data from the database rows are combined into the XML document image5kb78d38i.xml:
(1) <?xml version="1.0" encoding="ISO-8859-1"?>
(2) <mi:image xmlns:mi="http://www.m-i.org/images"
      image_id="image5kb78d38i">
(3)   <mi:type>Column Miniature</mi:type>
(4)   <mi:subject>Eve emerges from Adam’s body</mi:subject>
(5)   <mi:iconclass>71A3421</mi:iconclass>
(6)   <mi:creator>Alexander Master</mi:creator>
(7)   <mi:manuscript>Historic Bible</mi:manuscript>
(8)   <mi:place>Utrecht</mi:place>
(9)   <mi:year>circa 1430</mi:year>
(10) </mi:image>
Graphic 1: Database rows to XML process
(2b) Namespace: Since element names in XML are not fixed, name conflicts can occur when different documents use the same names to describe different types of elements. To prevent such conflicts, a unique namespace should be defined using a Uniform Resource Identifier (URI). An XML namespace is a collection of names that are used as element types and attribute names (cf. http://www.w3.org/TR/REC-xml-names). Our namespace declaration in the start tag of the root element is xmlns:mi="http://www.m-i.org/images".
The namespace prefix mi (for medieval images) functions as a placeholder for the namespace name. It needs to show up in all element tags (e.g. <mi:type> </mi:type>).
(2c) image_id="image5kb78d38i": This is the id attribute, which contains the unique identifier of the data source record.
(3-9) All other elements must be within the root element and can themselves have sub-elements (child elements), which must be properly nested within their parent element. Our elements do not have sub-elements.
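How the namespace mechanism of (2b) prevents name conflicts can be sketched with the Python standard library. The second vocabulary (http://example.org/other) and its creator value are invented for the example; the m-i.org namespace is the fictitious one used throughout this primer.

```python
import xml.etree.ElementTree as ET

MI_NS = "http://www.m-i.org/images"
OTHER_NS = "http://example.org/other"  # hypothetical second vocabulary

# Two elements share the local name "creator" but belong to different namespaces.
DOC = (
    '<mi:image xmlns:mi="http://www.m-i.org/images" '
    'xmlns:o="http://example.org/other" image_id="image5kb78d38i">'
    "<mi:creator>Alexander Master</mi:creator>"
    "<o:creator>cataloguer 17</o:creator>"
    "</mi:image>"
)

root = ET.fromstring(DOC)
prefixes = {"mi": MI_NS, "o": OTHER_NS}
mi_creator = root.find("mi:creator", prefixes)
other_creator = root.find("o:creator", prefixes)

# The parser replaces each prefix by the full namespace name.
print(root.tag)
print(mi_creator.tag, "->", mi_creator.text)
print(other_creator.tag, "->", other_creator.text)
```

The prefix itself is arbitrary; what distinguishes the two creator elements is the URI that the prefix stands for.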
Other syntax rules are, for example: all start tags must match their end tags; because XML tags are case sensitive (i.e. the tag <mi:Creator> is different from the tag <mi:creator>), they must also be written with the same case; all elements must have a closing tag; all attribute values must be within quotation marks (e.g. "image5kb78d38i").
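These syntax rules are enforced by every conforming XML parser, which can be tried out with a short Python sketch. The helper name is_well_formed and the sample documents are our own; the parser is the one shipped with the Python standard library, and the elements are written without a namespace prefix to keep the samples short.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_bytes):
    """Return True if the parser accepts the document, False if it reports a syntax error."""
    try:
        ET.fromstring(xml_bytes)
        return True
    except ET.ParseError:
        return False

good = (
    b'<?xml version="1.0" encoding="ISO-8859-1"?>'
    b'<image image_id="image5kb78d38i"><creator>Alexander Master</creator></image>'
)
case_mismatch = b"<image><Creator>Alexander Master</creator></image>"  # start and end tag differ in case
unquoted_attribute = b"<image image_id=image5kb78d38i></image>"        # attribute value without quotation marks
two_roots = b"<image/><image/>"                                        # more than one root element

for name, document in [
    ("good", good),
    ("case_mismatch", case_mismatch),
    ("unquoted_attribute", unquoted_attribute),
    ("two_roots", two_roots),
]:
    print(name, is_well_formed(document))
```

Only the first document is accepted; each of the other three violates one of the rules listed above.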
Syntactic Transformation 2: The XML Schema
In order to allow for syntactic harmonisation, the XML documents of the museums should conform to the XML Schema of the FMS initiative. Therefore, the museums use the initiative’s XML Schema when they create their XML documents and validate them against it. If the documents are valid, the process can continue to the semantic level.
XML Schema
The XML Schema defines the building blocks of an XML document, including:
- elements and attributes that can appear in a document
- which elements are child elements, as well as their order and number
- whether an element is empty or can include text
- the data types for elements and attributes
- as well as default and fixed values for elements and attributes
XML with an XML Schema is designed to be self-descriptive. One of the greatest strengths of XML Schema is that it allows for data typing. The most common data types are xs:string, xs:decimal, xs:integer, xs:boolean, xs:date and xs:time. In the example below, which is the XML Schema for the XML document (image5kb78d38i.xml) shown in graphic 1, we only use the data type xs:string. This data type is used for values that contain character strings.
(1) <?xml version="1.0"?>
(2a) <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
(2b)     targetNamespace="http://www.m-i.org/images"
(2c)     xmlns="http://www.m-i.org/images"
(2d)     elementFormDefault="qualified">
(3)   <xs:element name="image">
(4)     <xs:complexType>
(5)       <xs:sequence>
(6)         <xs:element name="type" type="xs:string"/>
(7)         <xs:element name="subject" type="xs:string"/>
(8)         <xs:element name="iconclass" type="xs:string"/>
(9)         <xs:element name="creator" type="xs:string"/>
(10)        <xs:element name="manuscript" type="xs:string"/>
(11)        <xs:element name="place" type="xs:string"/>
(12)        <xs:element name="year" type="xs:string"/>
(13)      </xs:sequence>
(14)      <xs:attribute name="image_id" type="xs:string" use="required"/>
(15)    </xs:complexType>
(16)  </xs:element>
(17) </xs:schema>
Short explanations:
(1) The XML declaration, which states that the document conforms to the 1.0 specification of XML.
(2a) Determines that the elements and data types that are used to construct the schema come from the W3C’s XML Schema namespace. Consequently, each of the elements and data types in the schema has the prefix xs:, which identifies them as belonging to the vocabulary of the XML Schema language rather than the vocabulary of our (fictitious) organisation M-i.org.
(2b) Indicates that the elements defined by this schema come from our http://www.m-i.org/images namespace.
(2c) Is our default namespace.
(2d) Demands that any elements used by an XML document which were declared in this schema must be namespace qualified.
(3, 16) The parent element for the data typing of the image descriptions we provide at http://www.m-i.org/images. It is (4) defined as an xs:complexType, i.e. it contains child elements (6-12), which are (5, 13) surrounded by an xs:sequence element that defines an ordered sequence of these elements.
(6-12) The child elements, which in our example are simple types because they do not contain other elements. They define various elements of our XML documents, e.g. “creator”, to be of the data type xs:string.
(14) Furthermore, the Schema determines that for the element <xs:element name="image"> there is a required attribute image_id of the data type xs:string (for example, image5kb78d38i).
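The constraints that this schema places on a document can be illustrated without a full XML Schema processor. The following Python sketch is only a hand-written stand-in: it checks the same few rules (root element, required image_id attribute, the seven child elements in sequence order, no sub-elements) with the standard library. It is not real XML Schema validation, and the function name check_image is our own; a production system would use a schema-aware validator instead.

```python
import xml.etree.ElementTree as ET

MI_NS = "http://www.m-i.org/images"
# Child elements in the order given by the xs:sequence of the schema.
REQUIRED_CHILDREN = ["type", "subject", "iconclass", "creator", "manuscript", "place", "year"]

def check_image(xml_text):
    """Return a list of problems; an empty list means the document passes these checks."""
    problems = []
    root = ET.fromstring(xml_text)
    if root.tag != f"{{{MI_NS}}}image":
        problems.append("root element must be 'image' in the m-i.org namespace")
    if root.get("image_id") is None:
        problems.append("required attribute image_id is missing")
    found = [child.tag for child in root]
    expected = [f"{{{MI_NS}}}{name}" for name in REQUIRED_CHILDREN]
    if found != expected:
        problems.append("child elements must each appear once, in schema order")
    for child in root:
        if len(child):
            problems.append(f"{child.tag} must not contain sub-elements")
    return problems

valid = (
    '<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">'
    + "".join(f"<mi:{name}>sample</mi:{name}>" for name in REQUIRED_CHILDREN)
    + "</mi:image>"
)
invalid = '<mi:image xmlns:mi="http://www.m-i.org/images"><mi:creator>sample</mi:creator></mi:image>'

print(check_image(valid))
print(check_image(invalid))
```

The second document is well-formed XML but fails two of the checks, which is exactly the difference between well-formedness and validity against a schema.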
Major Benefits of XML
In the context of the Semantic Web, XML provides an interoperable syntactical foundation upon which solutions to the issues of representing relationships and meaning can be built. We also want to highlight the many benefits of XML that are adding to its rapid uptake in the first place and might, in the longer term, be supportive in realising the Semantic Web vision on a broader scale.
XML is one of the most important standards developments in recent years. It is an international, universal, non-system and non-application specific data exchange standard. XML is international because it employs Unicode. This means that there is no restriction to the western alphabet, but Arabic, Chinese, Greek, Hebrew, Thai etc. can be easily integrated.
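What the use of Unicode means in practice can be shown with a brief Python sketch: text in several scripts is written into one XML document and read back unchanged. The element names and the sample words are arbitrary choices for the illustration.

```python
import xml.etree.ElementTree as ET

# Sample text in Greek, Hebrew and Chinese script (arbitrary words).
samples = {"el": "Αθήνα", "he": "שלום", "zh": "中文"}

root = ET.Element("labels")
for language, text in samples.items():
    ET.SubElement(root, "label", {"lang": language}).text = text

# Serialise to UTF-8 encoded bytes, then parse the bytes again.
data = ET.tostring(root, encoding="utf-8")
parsed = {label.get("lang"): label.text for label in ET.fromstring(data)}

print(parsed == samples)
```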
XML is non-system specific because it is an openstandard set by the World Wide Web Consortium(W3C)As such there is no owner of XMLAll themajor software suppliers support it it can be used onany computing platform from Windows and MacOSto LinuxThis makes it easier for organisations tochange systems or combine different systems
XML is also non-application specific, i.e. it can be used in various applications such as data exchange, data harvesting, Web site management etc. XML is gaining ever-wider acceptance in many application domains, including in particular the cultural heritage community.
Bear in mind also that the major collection management software producers have implemented support for XML in their systems, enabling, for example, the integration of data from different collections and their combination over the Web.
XML allows for multi-channel publishing, i.e. with XML it is easy to produce different products or services from digital cultural heritage assets. Once the data are structured in XML, they can be displayed across a variety of media using an associated style sheet that contains the display information.
Finally, XML can be used to create new languages. For example, the Wireless Markup Language (WML), which is used to mark up Internet applications for handheld devices, is written in XML.
Ontological concepts
The goal of the FMS project is to make metadata of the museums’ textiles collections semantically interoperable on the Web. In order to achieve such interoperability, an ontology is being designed that describes the common (lower-level) ontological concepts in this domain of knowledge.⁴
Ontology
In the Semantic Web architecture the semantic relationships are not embedded but explicitly represented by an ontology, or rather an interrelated set of ontologies. In fact, the wide array of information residing on the Web and the perceived need to make it more machine-processable have acted as a strong impetus for the development of ontology languages.
Yet what is an ontology? An abstract definition of an ontology is that it describes a formal, shared conceptualisation of a particular domain of interest, for example cultural heritage objects held in art museums. In particular, an ontology allows for constraining, expressing and analysing the intended meaning of the shared vocabulary of concepts and relations in a domain of knowledge.⁵
If these concepts and relations are formalised to a high degree, the domain has at hand a major building block for developing semantically aware information systems. With Semantic Web technologies, the domain ontology can be made available on the network, cross-referenced with upper-level and other domain ontologies,⁶ and remote applications or intelligent software agents can refer to it when they interact to provide a certain information service.
4 The available documents (in English) on the FMS initiative state that their ontology is being created using RDF Schema (RDFS). To develop a fully fledged ontology, advanced languages such as DAML+OIL or Web Ontology Language (OWL) would be required.
5 For more elaborate and formal descriptions see: Tom Gruber, What is an Ontology? (1995), http://www-ksl.stanford.edu/kst/what-is-an-ontology.html; Nicola Guarino, Ontology-Driven Conceptual Modelling, part 1-3 (2002), http://ontology.ip.rm.cnr.it/Tutorials
The degree of formalisation of concepts and their relations varies considerably between different domains of knowledge. At the lower end one finds lexicons and simple taxonomies (i.e. an ordered classification system where terms are related hierarchically). At the middle level one might place thesauri, i.e. controlled vocabularies that are structured to show relationships between terms and concepts and, for example, allow for retrieving them from a database. At the high end of formalisation of knowledge there are axiomatised logic theories. Such theories include rules to ensure the well-formedness and logical validity of statements expressed in the language of the scientific discipline.
In the cultural heritage sector a powerful IT-supported example of a hierarchical classification system is Iconclass. This supports the documentation of images, in particular art historical images, by providing a systematic collection of 28,000 ready-made definitions of objects, persons, events, situations and abstract ideas that can be the subject of an image. The definitions consist of an alphanumeric classification code and its textual correlate.⁷
For example, for the image we have described in XML in graphic 1, the Iconclass definition is: 71A3421 Eve emerges from Adam’s body. The Medieval Illustrated Manuscripts Website of the Koninklijke Bibliotheek, The Hague, has an Iconclass Browser in place⁸ that provides the hierarchical path for this concept in the classification system.
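How such a hierarchical path can be derived from a notation is sketched below in Python. The labels are those given for this example in the Iconclass box of this primer. The function hierarchical_path is a hypothetical helper, and the assumption that every broader concept is a leading part of the code holds for this example but is a simplification of the full Iconclass notation.

```python
# Codes and labels copied from the Iconclass example in this primer.
ICONCLASS = {
    "7": "Bible",
    "71": "Old Testament",
    "71A": "Genesis from the creation to the expulsion from paradise "
           "and later years of Adam and Eve",
    "71A3": "creation of man, the Garden of Eden",
    "71A34": "creation of Eve",
    "71A342": "Eve is fashioned from Adam's rib",
    "71A3421": "Eve emerges from Adam's body",
}


def hierarchical_path(notation, labels=ICONCLASS):
    """List (code, label) pairs from the broadest concept down to notation."""
    # Every leading part of the code is a candidate broader concept.
    prefixes = (notation[:i] for i in range(1, len(notation) + 1))
    return [(code, labels[code]) for code in prefixes if code in labels]
```

For the notation 71A3421 the function walks from 7 (Bible) down to the definition itself, which is the kind of path a classification browser displays.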
An outstanding example of a controlled vocabulary is the Art & Architecture Thesaurus (AAT), one of the Getty Research Institute’s Vocabulary Databases. It is a structured vocabulary of more than 125,000 terms and other information about concepts that are used for describing fine art, architecture, decorative arts, archival materials and material culture.⁹
The W3C’s Resource Description Framework (RDF) Schema Specification 1.0, in a section on its scope, mentions concept navigation and states: ‘Thesauri and library classification schemes are well known examples of hierarchical systems for representing subject taxonomies in terms of the relationships between named concepts. The RDF Schema specification provides sufficient resources for creating RDF models that represent the logical structure of thesauri (and other library classification systems).’¹⁰
Yet for realising a full-blown cultural heritage ontology for the Semantic Web, there are currently limitations on both sides. On the one hand, hierarchical classification systems and structured vocabularies do not lend themselves easily to rich inter-linking of conceptual ‘trees’.
A major step further in this direction is the CIDOC object-oriented Conceptual Reference Model (CRM).¹¹ This provides an ontology of 81 classes and 130 properties, which describes in a formal language concepts and relations relevant to the documentation of cultural heritage.
On the other hand, RDF Schema has limitations when it comes to expressing complex ontological relationships. New languages based on description logics are being developed. These include DAML+OIL and the upcoming Web Ontology Language (OWL), which are capable of fully describing ontologies.
Also worth highlighting is that tools for ontology building are proliferating at the present time. These ontology editors need to be carefully assessed, as their capabilities differ considerably.¹²
7 Bible
71 Old Testament
71A Genesis from the creation to the expulsion from paradise and later years of Adam and Eve
71A3 creation of man, the Garden of Eden (Genesis 1:26-2)
71A34 creation of Eve
71A342 Eve is fashioned from Adam’s rib
71A3421 Eve emerges from Adam’s body
6 Upper-level ontologies describe the basic concepts and relationships invoked when information about any domain is expressed in natural language.
7 For in-depth information see the official Iconclass Website: http://www.iconclass.nl
8 Medieval Illustrated Manuscripts Website: http://www.kb.nl/kb/manuscripts/browser. The subject access system for the Website was conceived by Mnemosyne Partners, building on the Iconclass classification system and technologies. See the valuable information they provide at httpwwwmnemosyneorg businessmsstempexhtml
9 A detailed description of the AAT is provided at http://www.getty.edu/research/tools/vocabulary/aat/about.html
10 http://www.w3.org/TR/2000/CR-rdf-schema-20000327
11 See the information box at the end of the Forum discussion and the sources mentioned in the Semantic Web Terms and Reading List.
12 Michael Denny: Ontology Building: A Survey of Editing Tools (06-11-2002), http://www.xml.com/pub/a/2002/11/06/ontologies.html
Semantic Transformation 1: The RDF Data Model
Progressing towards semantic interoperability, the metadata in the XML documents are now transformed into RDF statements corresponding to the RDF data model. With these so-called RDF ‘triples’, the XML metadata elements are mapped to the RDF classes and properties which are defined by the RDF Schema of the FMS initiative (see section 3).
‘XML is nothing more than a way to standardize data formats. This is not to underplay XML’s importance. A data-format standard makes all of the more glamorous technologies possible, and RDF is the leading example of the benefit that comes once the data format has been standardized. Many proclaim that RDF is really the XML’s killer app, and with good reason. Despite all this, RDF remains somewhat obscure. This is mainly because, at its core, RDF is very abstract, very dry, and very academic.’
Uche Ogbuji: An introduction to RDF (2000), http://www-106.ibm.com/developerworks/library/w-rdf/?dwzone=xml
RDF Data Model
In order to make Web resources semantically interoperable we need resources that provide machine-understandable information about themselves. In the Semantic Web architecture these statements are built by using the Resource Description Framework (RDF).
RDF defines a data model for the statements describing typed relationships between uniquely identified resources. RDF distinguishes between:
| resources: familiar examples are a Web page, electronic document or digital image, but in RDF also entities that are not ‘network retrievable’, e.g. museums, curators or bound medieval manuscripts, can be resources;
| properties: these identify a specific aspect, characteristic, attribute or relation used to describe the resource;
| statements: these associate a value for a named property with the resource.
Hence RDF provides a model for describing relationships between resources in terms of named properties and values. The RDF data model intrinsically supports only binary relations. Its base element is the ‘triple’, which takes the form of subject, predicate, object: a resource (the subject) is linked to another resource (the object) through an arc labelled with a third resource (the predicate). The semantics of a triple clearly depends on the property used as predicate.
A convenient way to visualise this is to draw nodes for subject and object and an arrow between them for the predicate (see graphic 2). In this labelled directed graph, subject and predicate (property) are Uniform Resource Identifiers (URIs), and the object is either a URI or a literal (which is drawn as a box). Everything in RDF can be represented by a graph with nodes and arcs, and the data model allows for using the same URI as a node and as an arc label. To represent RDF statements in a machine-processable way, RDF builds on XML. With RDF/XML, a specific XML markup language, RDF information can be represented and exchanged between machines.
Graphic 2: RDF Data Model
[Diagram: two subject-predicate-object triples drawn as nodes and arcs. ColumnMiniature is linked to Miniature, and Miniature is linked to Image, each through an arc labelled with the rdfs:subClassOf property.]
With these two triples we state that ColumnMiniature is a subclass of Miniature and that Miniature is a subclass of Image.
The predicate in our statements is the rdfs:subClassOf property, which is predefined in the RDF Schema namespace http://www.w3.org/2000/01/rdf-schema#
The classes Image, Miniature and ColumnMiniature would also need to be defined in our RDF Schema namespace http://www.m-i.org/schemas/images
For details on how to use RDF Schema for defining your domain ontology see section 3. In a nutshell, RDF Schema is a ‘higher-level’ language which is itself defined using RDF.
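The two subclass triples of Graphic 2 can also be written down as plain data, which may help to see how little the RDF data model itself requires. The following Python sketch is an illustration, not part of any RDF toolkit: the Triple type and the objects helper are hypothetical, and the trailing # on our own namespace URI is an assumption.

```python
from typing import NamedTuple

RDFS = "http://www.w3.org/2000/01/rdf-schema#"
# Namespace of our fictitious organisation; the trailing '#' is an assumption.
MI = "http://www.m-i.org/schemas/images#"


class Triple(NamedTuple):
    subject: str    # always a URI
    predicate: str  # always a URI
    object: str     # a URI or a literal


# A graph is simply a set of triples.
graph = {
    Triple(MI + "ColumnMiniature", RDFS + "subClassOf", MI + "Miniature"),
    Triple(MI + "Miniature", RDFS + "subClassOf", MI + "Image"),
}


def objects(graph, subject, predicate):
    """Follow the arcs labelled `predicate` that leave the node `subject`."""
    return {t.object for t in graph
            if t.subject == subject and t.predicate == predicate}
```

Asking for the objects of ColumnMiniature under rdfs:subClassOf returns only Miniature; that Image is a superclass as well has to be inferred, which is a matter for RDF Schema rather than the data model.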
Semantic Transformation 2: Creating and Validating the RDF Statements
In the mapping process an editor tool (in the FMS case a tool developed by the project team, called Meedio) receives as input the XML documents and assists in transforming them into semantically valid RDF statements (instance descriptions). The tool serves as instance editor, which provides a convenient way of finding and selecting from the XML metadata elements correct instance values for a particular property. The editor tool also serves as semantic metadata validator. When the museum cataloguer saves the set of RDF statements corresponding to an XML document, the semantics of these statements are validated against the property constraints of the FMS ontology. The result of a successful mapping and validation process is a unique set of RDF triples, called the RDF card.
RDF Mapping Rules
When RDF is used to define the meaning of XML metadata elements, a set of mapping rules is created. A mapping rule is a template of RDF triples where XPath expressions are used to identify the actual element values. XPath is a language for addressing parts of an XML document and was designed for use in XML parsing software (XSLT, XPointer and others).
When applying such a rule to an XML document, the XPath expressions are instantiated with matching element values. If the rule matches, the RDF template evaluates to a set of RDF triples where XPath expressions are substituted by the corresponding values of the XML elements.
For example, by applying the template rule
<image5kb78d38itype mi:hasCreator image5kb78d38icreator>
to the XML document described in the section on XML, the following result would be obtained:
<Image Miniature mi:hasCreator ‘Alexander Master’>
mi:hasCreator is an example of an RDF property. Such properties are explained in section 3, RDF Schema.
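The way a mapping rule of this kind is instantiated can be sketched in Python. This is not the FMS tool: the record, its element names, the rule format and the function apply_rule are assumptions for illustration, and the standard library used here supports only a small subset of XPath.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML record; the element names are assumptions.
RECORD = """
<image image_id="image5kb78d38i">
  <type>Miniature</type>
  <creator>Alexander Master</creator>
</image>
"""

# A rule is a triple template: a path for the subject value, a fixed
# property name, and a path for the object value.
RULE = ("./type", "mi:hasCreator", "./creator")


def apply_rule(rule, xml_text):
    """Return the instantiated triple, or None if the rule does not match."""
    subject_path, predicate, object_path = rule
    root = ET.fromstring(xml_text)
    subject = root.findtext(subject_path)
    obj = root.findtext(object_path)
    if subject is None or obj is None:
        return None  # one of the paths found no element
    return (subject.strip(), predicate, obj.strip())
```

Applied to the record above, the rule yields the triple (Miniature, mi:hasCreator, Alexander Master); a rule whose paths find nothing produces no triple at all.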
Note: Due to the limited space permitted, we do not address issues of term mapping. This is an important aspect of the mapping and validation process carried out in the FMS project. Working with metadata from different museums, they need to deal with partly different terminologies. Their technical solution to synonymous terms (i.e. different terms referring to the same concepts) is to attach synonym sets to the FMS ontology classes. With situations where polysemous terms occur (i.e. the same terms refer to different concepts), the editor tool cannot cope and the cataloguer needs to select the correct interpretation.
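The difference between the two cases can be made concrete with a short Python sketch. The synonym sets, the class mi:PortraitMiniature and both functions are hypothetical; the sketch only shows why a synonym can be resolved automatically while a polysemous term has to be handed back to the cataloguer.

```python
# Hypothetical synonym sets attached to ontology classes.
SYNONYMS = {
    "mi:Miniature": {"miniature", "illumination"},
    "mi:PortraitMiniature": {"miniature", "portrait miniature"},
}


def candidate_classes(term, synonym_sets=SYNONYMS):
    """Return, sorted, every class whose synonym set contains the term."""
    key = term.strip().lower()
    return sorted(cls for cls, terms in synonym_sets.items() if key in terms)


def resolve(term, synonym_sets=SYNONYMS):
    """Return the class if the term is unambiguous, otherwise None."""
    candidates = candidate_classes(term, synonym_sets)
    # Zero or several candidates: the decision is left to the cataloguer.
    return candidates[0] if len(candidates) == 1 else None
```

Here ‘illumination’ resolves to a single class, whereas ‘miniature’ matches two classes and resolve returns None, leaving the choice to a person.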
Semantic Transformation 3: The RDF Schema (RDFS)
The shared ontology for the textiles domain is created by using Resource Description Framework Schema (RDFS). An RDF Schema is a tool for indicating the classes of resources one wants to describe, as well as for defining the properties used to describe those resources. Furthermore, class/sub-class relationships and property/sub-property relationships can be defined. The museums are mapping their metadata to the classes and properties defined by the RDF Schema of the FMS initiative. Thereby they are making the meaning of the metadata explicit and representing them in a harmonised, uniform way.
RDF Schema
In section Semantic Transformation 1 we have described the data model provided by RDF for expressing statements about Web resources. But we also need a vocabulary for the RDF statements, namely classes and properties defined with RDF Schema (RDFS).
In brief, the RDF Schema mechanism provides a pre-defined vocabulary, a basic type system that can be used in creating domain-specific schemas. Its role is to allow for declaring metadata properties (e.g. for ‘type’, ‘subject’ or ‘creator’), to define the classes of resources they may be used with, to restrict possible combinations, and to detect violations of those restrictions.
Defining classes
With RDF Schema (RDFS), Web resources can be defined as instances of one or more classes. In addition, classes can be organised in a hierarchical fashion. As we hold a collection of digital images drawn from illustrated medieval manuscripts, we first need to define a class of things that are images. In RDF Schema a class is any resource having an rdf:type property whose value is the RDFS-defined resource rdfs:Class.
So, using the basic RDF data model, we define:
mi:Image [resource] rdf:type [property] rdfs:Class [value]
The self-defined prefix mi: (for medieval images) stands for the URI reference of our RDF Schema namespace, http://www.m-i.org/schemas/images
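What ‘stands for’ means here can be shown with a small Python sketch that expands prefixed names into full URI references. The rdf and rdfs namespace URIs are those of the W3C; the trailing # on our own namespace and the helper expand are assumptions for illustration.

```python
# Prefix table; the trailing '#' on the mi namespace is an assumption.
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "mi": "http://www.m-i.org/schemas/images#",
}


def expand(qname, prefixes=PREFIXES):
    """Replace the prefix of a name such as mi:Image by its namespace URI."""
    prefix, _, local = qname.partition(":")
    if not local or prefix not in prefixes:
        raise ValueError(f"cannot expand {qname!r}")
    return prefixes[prefix] + local


# The class definition as a triple of full URI references.
class_definition = tuple(
    expand(name) for name in ("mi:Image", "rdf:type", "rdfs:Class")
)
```

The prefixed form is only a shorthand for readers; what identifies the class, the property and the value are the three expanded URI references.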
In our image collection we have various special kinds of digitised images, such as column miniatures, decorated initials, schematic drawings etc. To distinguish, for example, the miniatures, first we need to define a general class Miniature and subclasses of miniatures, e.g. a subclass ColumnMiniature:
mi:Miniature rdf:type rdfs:Class
mi:ColumnMiniature rdf:type rdfs:Class
Secondly, we need to define that mi:ColumnMiniature is a subclass of mi:Miniature and that mi:Miniature is a subclass of mi:Image, for which we use the predefined rdfs:subClassOf property:
mi:Miniature rdfs:subClassOf mi:Image
mi:ColumnMiniature rdfs:subClassOf mi:Miniature
As the rdfs:subClassOf property is transitive, this means that mi:ColumnMiniature is also implicitly a subclass of mi:Image.
Graphic 2 visualises this with the nodes and arcs of the basic RDF data model.
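The implicit subclass relationship that follows from transitivity can be computed mechanically. The Python sketch below assumes the two subclass statements of our example, written as (subclass, superclass) pairs; the function superclasses is a hypothetical helper, not part of RDF Schema itself.

```python
# The two explicit subclass statements of our example.
SUBCLASS_OF = {
    ("mi:ColumnMiniature", "mi:Miniature"),
    ("mi:Miniature", "mi:Image"),
}


def superclasses(cls, subclass_of=SUBCLASS_OF):
    """Return all stated and inferred superclasses of cls."""
    found = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for sub, sup in subclass_of:
            # Follow each subclass arc once; this also guards against cycles.
            if sub == current and sup not in found:
                found.add(sup)
                frontier.append(sup)
    return found
```

For mi:ColumnMiniature the result contains mi:Miniature, which was stated, and mi:Image, which was not stated but follows from the two statements together.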
Defining properties
In order to make the meaning of our metadata (i.e. ‘type’) explicit, we need to be capable of declaring specific properties that characterise the classes of things we hold at http://www.m-i.org, e.g. digital images of medieval column miniatures.
Basically, RDF Schema defines properties in terms of the classes of resources to which they apply. This is the role of the rdfs:domain and rdfs:range mechanisms.
rdfs:range
The range constraint defines the class or set of classes whose instances can be values of a particular property. If we want to define the property mi:hasType, we must describe this resource (which we locate at http://www.m-i.org/schemas/images) with an rdf:type property whose value is rdf:Property:
mi:hasType [resource] rdf:type [property] rdf:Property [value]
The following RDF statements indicate that mi:ColumnMiniature is a class, mi:hasType is a property, and RDF statements using the mi:hasType property have instances of mi:ColumnMiniature as values:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:range mi:ColumnMiniature
rdfs:domain
The domain constraint restricts the set of classes whose instances may have a particular property attached to them. If we want to indicate that the property mi:hasType applies to instances of class mi:ColumnMiniature, we would write:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:domain mi:ColumnMiniature
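How a validator might use these two constraints is sketched below in Python. The sketch treats domain and range as restrictions to be checked, as this primer does, and ignores subclass reasoning. The schema statements are those of our example; the instance names ex:a and ex:b and all three functions are assumptions for illustration.

```python
RDF_TYPE = "rdf:type"

# Schema statements from the example in the text.
SCHEMA = [
    ("mi:ColumnMiniature", RDF_TYPE, "rdfs:Class"),
    ("mi:hasType", RDF_TYPE, "rdf:Property"),
    ("mi:hasType", "rdfs:domain", "mi:ColumnMiniature"),
    ("mi:hasType", "rdfs:range", "mi:ColumnMiniature"),
]

# Hypothetical instance statements that respect both constraints.
VALID = [
    ("ex:a", RDF_TYPE, "mi:ColumnMiniature"),
    ("ex:b", RDF_TYPE, "mi:ColumnMiniature"),
    ("ex:a", "mi:hasType", "ex:b"),
]


def declared(prop, constraint, schema):
    """Classes named by the rdfs:domain or rdfs:range statements of prop."""
    return {o for s, p, o in schema if s == prop and p == constraint}


def types_of(resource, statements):
    """Classes a resource is stated to be an instance of."""
    return {o for s, p, o in statements if s == resource and p == RDF_TYPE}


def violations(statements, schema=SCHEMA):
    """Return the statements that break a domain or range constraint."""
    bad = []
    for s, p, o in statements:
        if p == RDF_TYPE:
            continue  # type statements are what we check against
        domains = declared(p, "rdfs:domain", schema)
        ranges = declared(p, "rdfs:range", schema)
        domain_ok = not domains or bool(types_of(s, statements) & domains)
        range_ok = not ranges or bool(types_of(o, statements) & ranges)
        if not (domain_ok and range_ok):
            bad.append((s, p, o))
    return bad
```

The statements in VALID pass, while a statement whose subject or value is not a stated instance of mi:ColumnMiniature is reported back, which is the kind of feedback a cataloguer would expect from a semantic metadata validator.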
Benefits of RDF
In a SearchWebServices.com definition of RDF some benefits of RDF are mentioned:
| ‘By providing a consistent framework, RDF will encourage the providing of metadata about Internet resources.
| Because RDF will include a standard syntax for describing and querying data, software that exploits metadata will be easier and faster to produce.
| The standard syntax and query capability will allow applications to exchange information more easily.
| Searchers will get more precise results from searching, based on metadata rather than on indexes derived from full text gathering.
| Intelligent software agents will have more precise data to work with.’¹³
This is a well-crafted listing of RDF benefits, from provision and exchange of better metadata to agents working with them, hopefully for the benefit of humans. But, as explicitly stated by SearchWebServices.com, these are only potential benefits, i.e. they depend on the level of actual uptake of RDF.
13 whatis.com / searchWebServices.com Definitions - Resource Description Framework, http://searchwebservices.techtarget.com/sDefinition/0,,sid26_gci213545,00.html (last updated July 27, 2001)
Generating and Using the Knowledge Space
The RDF cards represent the original XML documents at the semantic level. The union of such RDF cards constitutes a knowledge base, which is a harmonised semantic representation of the underlying heterogeneous databases.
However, so far the RDF instance descriptions have not left the museum. The museum has complete control of the information it wants to publish, and it does not need to allow the FMS system access to its internal database system. The RDF data are placed in a public directory on the museum’s WWW server.
The Web crawler of the FMS system harvests the instance descriptions from the different museums, and the system combines them into an RDF repository. This repository is a large semantic graph that consists of the shared ontology and metadata.
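Combining the harvested cards is, at the level of the data model, a union of sets of triples. The Python sketch below shows this step under stated assumptions: the museum names, the instance data and the function build_repository are hypothetical, and the actual harvesting over the Web is left out.

```python
# Shared ontology (a single statement here) and hypothetical RDF cards.
ONTOLOGY = {("mi:Miniature", "rdfs:subClassOf", "mi:Image")}

CARDS = {
    "museum-a": {("a:obj1", "rdf:type", "mi:Miniature")},
    "museum-b": {
        ("b:obj7", "rdf:type", "mi:Image"),
        ("b:obj7", "mi:hasCreator", "Alexander Master"),
    },
}


def build_repository(ontology, cards):
    """Merge the shared ontology and all harvested cards into one graph."""
    repository = set(ontology)
    provenance = {}  # remembers which museum supplied each statement
    for museum, card in cards.items():
        for triple in card:
            repository.add(triple)
            provenance.setdefault(triple, set()).add(museum)
    return repository, provenance
```

Because the repository is a set, a statement supplied by several museums is stored once, while the provenance table still records every museum that published it.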
How does a user now search and navigate in this knowledge space? In the FMS system this is implemented by a server-side software called Ontogator. Based on the semantic graph, this software dynamically generates semantic linkages for the user’s Web browser.
One way of using the FMS system is view-based filtering. The user can select classes of resources from the ontology, and the system finds the instances that match the selected class restrictions. By constraining classes (views) further, the collection instance data searched for are eventually found.
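The principle of view-based filtering can be sketched in a few lines of Python. This is not the Ontogator software: the graph, the second view ex:French and all function names are assumptions, and the sketch only shows how each selected class narrows the set of matching instances.

```python
# Hypothetical repository: two subclass statements and typed instances.
GRAPH = {
    ("mi:ColumnMiniature", "rdfs:subClassOf", "mi:Miniature"),
    ("mi:Miniature", "rdfs:subClassOf", "mi:Image"),
    ("ex:img1", "rdf:type", "mi:ColumnMiniature"),
    ("ex:img2", "rdf:type", "mi:Image"),
    ("ex:img1", "rdf:type", "ex:French"),
    ("ex:img2", "rdf:type", "ex:French"),
}


def with_subclasses(cls, graph):
    """Return cls together with all of its direct and indirect subclasses."""
    result, frontier = {cls}, [cls]
    while frontier:
        current = frontier.pop()
        for s, p, o in graph:
            if p == "rdfs:subClassOf" and o == current and s not in result:
                result.add(s)
                frontier.append(s)
    return result


def instances_of(cls, graph):
    """Instances typed with cls or with one of its subclasses."""
    classes = with_subclasses(cls, graph)
    return {s for s, p, o in graph if p == "rdf:type" and o in classes}


def view_filter(selected_classes, graph):
    """Instances that match every selected class restriction."""
    result = None
    for cls in selected_classes:
        found = instances_of(cls, graph)
        result = found if result is None else result & found
    return result or set()
```

Selecting mi:Image and ex:French matches both images; narrowing the first view to mi:Miniature leaves only ex:img1, because a column miniature counts as a miniature.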
The software also supports topic-based navigation by providing semantic links between topics of interest, the creation of which is based on the collection domain ontology and the related metadata of the collection records. This means that the links also provide the user with an impression of the wider context and pragmatics of the objects in the museums’ collections.
From human users to software agents
As described in the Finnish Museums on the Semantic Web example, the RDF repository is a large semantic graph that consists of the shared ontology and metadata of the participating museums. Such a repository can be queried, and the results, a set of pointers to the relevant resources, can be accessed using Web browsers. The opportunities provided by a system like the one developed by the FMS initiative (e.g. topic-based navigation) are at present restricted primarily to human users.
The Semantic Web vision includes intelligent software agents which ‘understand’ semantic relationships between Web resources and seek relevant information as well as perform transactions for humans.¹⁴
This software would be capable of autonomous action, i.e. could run without direct human control or constant supervision, and ideally is very flexible in doing this. Characterisations of this flexibility include actions that are ‘reactive’, ‘proactive’ and ‘social’ (see below).
While the basic idea of agents is very intuitive and appealing, the actual theory is complex, the tools are immature, the solutions small and prototype-based. In fact, as a parallel distributed systems technology, agents belong to the most complex class of software technology.
However, this primer will conclude with a summary of what an intelligent software agent is and what such a software would generally be capable of doing. This should also serve as an indication of how great the challenge for research and technological development is to make the full Semantic Web vision a reality.
Intelligent Software Agents
The following definitions are taken from Michael Wooldridge’s introduction to multiagent systems:¹⁵
Agent
‘An agent is a computer system capable of autonomous action in some environment.’
Intelligent agent
‘An intelligent agent is a computer system capable of flexible autonomous action in some environment.’
Flexible autonomous action
‘By flexible autonomous action we mean reactive, proactive, social.’
| Reactivity: ‘A reactive system is one that maintains an ongoing interaction with its environment, and responds to changes that occur in it (in time for the response to be useful).’
| Proactiveness: ‘An agent serves a purpose and therefore exhibits goal-directed behaviour, including the capacity to recognise opportunities for useful courses of action.’
| Social ability: ‘Social ability in agents is the ability to interact with other agents (and possibly humans) via some kind of agent communication language, and perhaps cooperate with others.’
Desirable further properties of agents are:
| Mobility: the ability to move around an electronic network.
| Rationality: an agent will act in such a way that it does not prevent itself from achieving its goals (as far as this is possible with a limited set of beliefs representing its world knowledge).
| Learning: an agent will improve its performance over time.
14 T. Berners-Lee, J. Hendler, O. Lassila: Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html
15 M. Wooldridge: An Introduction to Multiagent Systems. Chichester: Wiley, 2002, and http://www.csc.liv.ac.uk/~mjw/pubs/imas/
Resources
In the primer no references are made to the documents of the World Wide Web Consortium (W3C). All relevant W3C recommendations can be found at http://www.w3c.org
Of the wealth of introductory materials on XML and RDF available on the Web, the following in particular are useful to consult for further details:
http://www.w3schools.com/xml/
http://www.w3schools.com/schema/
http://www.w3.org/TR/rdf-primer/
K. S. Candan, H. Liu, R. Suvarna: Resource Description Framework: Metadata and its Applications. In: SIGKDD Explorations, Vol. 3/1 (2001), 6-19
Pierre-Antoine Champin: RDF Tutorial (2001), http://www710.univ-lyon1.fr/~champin/rdf-tutorial/rdf-tutorial.html
S. Decker, M. P. Mitra, S. Melnik: Framework for the Semantic Web: An RDF Tutorial, http://www.ida.liu.se/~asmpa/courses/sweb/rdf/rdf_tutorial.pdf
The Finnish Museums on the Semantic Web: Overview of the System’s Set-up
Graphic 3: Set-up of the Finnish Museums on the Semantic Web System
[Diagram, from top to bottom: User client (WWW browser; topic-based navigation, view-based filtering); Server (Ontogator software); RDF database (semantic graph: knowledge space of shared ontology and metadata); Web crawler harvesting the RDF instances; and, for each of the collection databases 1, 2 ... n: relational schemas (DBMS), an XML repository (XML Schema: syntactic interoperability), the Metadata Editor, and RDF instances (RDF Schema: semantic interoperability).]
THE DARMSTADT FORUM PARTICIPANTS
Wernher Behrendt, Salzburg Research, Austria
Wernher Behrendt is a Senior Researcher at the Sun Technology and Research Excellence Center at Salzburg Research (Austria), working on multimedia middleware and interoperation issues. He holds an MSc in Cognitive Science from Manchester University and has more than 10 years’ experience in near-to-market IT research. From 1989 to 1995 he was a Senior Research Associate in the Informatics Department at Rutherford Appleton Laboratory (UK), working on embedded knowledge based systems in distributed multimedia presentation systems. From 1995 to 1998 he was a Senior Research Associate at Cardiff University (UK), working on interoperation between heterogeneous information systems. Mr Behrendt has held courses in Computer Science and has worked on projects ranging from software engineering methods and quality assurance to legacy system reengineering using migration methods and distributed systems middleware.
E-mail: wernher.behrendt@salzburgresearch.at
Paolo Buonora, Archivio di Stato di Roma, Italy
Paolo Buonora holds a degree in Philosophy from the University of Rome ‘La Sapienza’ (1976). He worked in the Italian State Archive Administration from 1978, where he was first involved in editing the Guida Generale degli Archivi di Stato italiani. From 1986 he worked in the Soprintendenza archivistica per il Lazio, surveying audiovisual archives, municipal archives; from 1989 to 1991 at the Perugia University, engaged in a doctoral research in ‘Urban and rural history’; and from 1991 to 1994 again in the Soprintendenza archivistica per il Lazio. After 1994 he worked in the Archivio di Stato di Roma, where he was responsible for the photograph service and several working groups on informatics application in archival documentation. From 1997 until the present time he has planned and directed the Imago II project in the Archivio di Stato di Roma.
See httpwwwasrmarchivbeniculturaliitsidimagoIMAGOIIenhtml
Samuel Cruz-Lara, University of Nancy 2 and LORIA, France
Samuel Cruz-Lara obtained a Master’s degree in Computer Science in 1984 (University Henri Poincaré, Nancy 1) and a PhD degree in Computer Science in 1988 (National Polytechnic Institute of Lorraine). The central topic of his PhD thesis was the generation of integrated development environments by using attribute grammars. He is currently Associate Professor at the University of Nancy 2 (Institute of Technology, Computer Science Department) and permanent Researcher at LORIA (Lorraine Laboratory for Research in Computer Science and its Applications - UMR 7503 - CNRS - INRIA - Universities of Nancy). He is a member of the ‘Language and Dialogue’ team and has conducted several research activities on distributed software architectures and textual linguistic resources management. He is currently working in the context of distributed architectures and multimedia resources management. Dr Cruz-Lara has participated in several projects, in particular CNRS/SILFIDE and MLIS-ELAN, and he is at present co-leader of the ‘Digital Museum’ project. This is a joint project between LORIA and the National Chi-Nan University. The ‘Digital Museum’ project is sponsored by the National Science Council of the Republic of China (Taiwan) and supported by INRIA (France).
Costis Dallas, Critical Publics SA, Greece
Costis Dallas is Chairman and Senior Researcher of Critical Publics (http://www.criticalpublics.com), a group of companies active in the field of strategic communications, creative design and technology. He is currently a Lecturer in the Department of Communication and Mass Media of Panteion University and has over 15 years of research and professional experience in hypermedia applications, human factor issues and cultural information systems. Dr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA, co-founder and Chair of the Multimedia Working Group of the International Council of Museums (CIDOC/ICOM), Head of Documentation and Systems of the Benaki Museum, General Director of the Foundation of the Hellenic World, Special Secretary of the Greek Ministry of Education in charge of libraries, archives and instructional technologies, and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: info@criticalpublics.com
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (http://www.nl.adlibsoft.com), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that he joined Databasix, producer of the ADLIB software. In 1991 Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was renamed later as ADLIB Information Systems.
ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include:
CIMI Dublin Core test bed - ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database, in addition to the ‘standard’ HTML/XML access.
SPECTRUM-XML - A project of the UK mda, together with a team of software vendors, to produce a schema for exchange of data which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software.
Internet Gelderse Musea (http://www.igem.nl) and Maritiem Digitaal (http://www.maritiemdigitaal.nl) - Two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.
E-mail: bert@nl.adlibsoft.com
Nicola Guarino, Italian National Research Council, Italy
Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991 and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS) and associated editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: Nicola.Guarino@ladseb.pd.cnr.it
Janneke van Kersen, Digital Heritage Association, The Netherlands
Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992 she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing, and more recently she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University.
Since October 1999 van Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata, and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations such as museums, archives, archaeological organisations, monuments and special collections of libraries in a context-rich and XML-based environment. Interoperability, from both a technical and an organisational perspective, is therefore a major issue. The Association offers access to heritage information to the general public at http://www.cultuurwijzer.nl and customised access for the educational field at http://www.cultuurwijs.nl.
E-mail: janneke.vankersen@den.nl
Marco Meli, EDW International, Italy
Marco Meli is CEO and co-founder of EDW International (http://www.edw-international.com), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia and VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meli@edw.it
Paul Miller, UKOLN, UK
Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, http://www.ukoln.ac.uk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK's Further and Higher Education Funding Councils and Resource: The Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI), and the Metadata Working Group of the UK Government's Office of the e-Envoy.
E-mail: P.Miller@ukoln.ac.uk
Frank Nack, CWI, The Netherlands
Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on 'The Application of Video Semantics and Theme Representation for Automated Film Editing' at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling, and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document, and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: Frank.Nack@cwi.nl
Franco Niccolucci, Florence University, Italy
Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference. See also http://www.geog.port.ac.uk/hist-bound/people/niccolucci.htm
Seamus Ross, HATII, University of Glasgow, UK
Dr Seamus Ross is Director of Glasgow University's Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Access Network) (IST-2001-32706), a European Union funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously, he was Assistant Secretary for Information Technology at the British Academy, and before that worked for a company specialising in expert systems and software development, as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group's PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: S.Ross@hatii.arts.gla.ac.uk
Andrea Scotti, Institute & Museum for the History of Science, Italy
Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995 he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.
Since 1996 Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza and the National Central Library, and under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (010201-300703), funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies. See http://cweb.inria.fr/Projects/Mesmuses
E-mail: scottian@galileo.imss.firenze.it
DIGICULT PROJECT INFORMATION

DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over a period of 30 months (03/2002-08/2004).

In order to encourage early take-up, DigiCULT produces seven Thematic Issues and three Technology Watch Reports, along with the newsletter DigiCULT.Info.

DigiCULT draws on the results of the strategic study 'Technological Landscapes for Tomorrow's Cultural Economy (DigiCULT)' that was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications), in 2000 and completed in 2001.

Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at http://www.digicult.info.

For further information on DigiCULT please contact the team of the project co-ordinator:

Mr Guntram Geser, guntram.geser@salzburgresearch.at, Phone: +43-(0)662-2288-303
Mr John Pereira, john.pereira@salzburgresearch.at, Phone: +43-(0)662-2288-247

Salzburg Research Forschungsgesellschaft
Jakob-Haringer-Str. 5/III, A-5020 Salzburg, Austria
Phone: +43-(0)662-2288-200, Fax: +43-(0)662-2288-222
http://www.salzburgresearch.at

Project Partner:
HATII - Humanities Advanced Technology and Information Institute, University of Glasgow
http://www.hatii.arts.gla.ac.uk
Contact: Mr Seamus Ross, s.ross@hatii.arts.gla.ac.uk

The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom
DigiCULT Thematic Issue 1 - Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum, held in Barcelona on May 6th, 2002, in the context of the DLM-Conference 2002.

DigiCULT Thematic Issue 2 - Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum, held in Essen, Germany, on September 3rd, 2002, in the context of the AIIM Conference DMS EXPO.

DigiCULT Thematic Issue 3 - Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum, held on January 21st, 2003, at Fraunhofer IPSI, Darmstadt, Germany.

DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum on Learning Objects, which will take place at the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, on July 2nd, 2003.
IMPRINT
This Thematic Issue is a product of the DigiCULTProject (IST-2001-34898)
Authors:
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy

Images:
Images for this Thematic Issue have been provided by and are reproduced with kind permission of the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, Netherlands.

Graphics & Layout:
Jan Steindl, Salzburg Research

ISBN 3-902448-00-8
Printed in Austria
© 2003
IMAGES
Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I. Image on p. 5 from Fol. 264r, size 80x75; illuminator: Orosius Master a.o.

Bible Historiale, Paris, c. 1320-1340, Volume I. Image on p. 7 from Fol. 2r, size 45x50; illum.: 'Sub-Fauvel' Master

Historic Bible, Utrecht, c. 1430, Volume I. Images on pp. 8, 10, 14, 28 from Fol. 3r, 3v, 4v, 7v, size 55x85 to 60x85; illum.: Alexander Master o.a.

Psalter Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468. Images on pp. 11, 13: Wednesday Matins, Invitatorium, from Fol. 219r

Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350. Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron) from Fol. 104rb1, 106rb2, 124r, 111ra, size 40x55 to 50x55

Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335. Image on p. 21 from Fol. 4va1, size 45x55

Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460. Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41

Psalter, Normandy, c. 1180. Image on p. 26 from Fol. 3v, size 160x125

Breviary, Cambray(?), c. 1275-1300. Images on pp. 27, 29, 31, 33, 34, 35, 37 from Fol. 211r, size 205x135

© Koninklijke Bibliotheek, The Hague. Used with permission.
A CULTURAL HERITAGE SEMANTIC WEB EXAMPLE & PRIMER
By Guntram Geser

The Semantic Web, according to a statement of its well-known advocate Tim Berners-Lee, is a vision of 'a distributed machine which should function so as to perform socially useful tasks'.¹ This machine should allow intelligent software agents to understand semantic relationships between Web resources in order to seek relevant information and perform transactions for humans.

Contrasted with the existing human-readable Web, the Semantic Web is envisaged as a Web of machine-readable data that will be based on 'languages for expressing information in a machine processable form'.² Key to an understanding of the Semantic Web, therefore, is how these languages function: how information is expressed in order that computers can automatically process Web sources and assist in making the Web more useful for humans. The aim of this chapter is to provide an overview of the Semantic Web concept by describing its general architecture, i.e. the interplay of its languages.

The chapter has two interrelated parts. Part 1 describes a Finnish project that strives to build the foundations for the "Finnish Museums on the Semantic Web" (FMS), a future semantic museum portal. This part consists of the information boxes on the following pages, which briefly describe the necessary elements and steps in the set-up of the FMS system. It is recommended to start by reading this description (see also graphic 3 on page 36, which provides an overview of the set-up of the FMS system). It should be helpful in gaining a general understanding of how semantic interoperability of, and new ways of interacting with, semantically marked-up cultural heritage information can be realised.

Part 2, the texts below the information boxes, is a primer that explains terms used in part 1 which represent core elements of the Semantic Web architecture, as well as providing illustrative examples. The explanations are not intended to give in-depth definitions of these elements; such definitions are provided in the relevant W3C specifications. The examples have been kept as simple as possible, but build on each other. In this way we will develop a (fictitious) Website, http://www.m-i.org, that provides semantically enhanced access to such marvellous medieval images as the ones we have used to illustrate this Thematic Issue.

How to Make Collection Metadata of Museums Semantically Interoperable on the Web – The "Finnish Museums on the Semantic Web" (FMS)

The Semantic Web concept is visionary, and there are dedicated people, also in the heritage sector, who are trying to make it a reality. In our example, a group of researchers and technology developers who work at the University of Helsinki and the Helsinki Institute for Information Technology are translating the Semantic Web vision for a future semantic museum portal.

The group's two-year project will run until spring 2004 and is being carried out in co-operation with, and with funding from, major organisations including Espoo City Museum, Helsinki University Museum, National Board of Antiquities, Nokia, TietoEnator, and the National Technology Agency (TEKES).

1 Tim Berners-Lee: Interpretation and Semantics on the Semantic Web (1998), http://www.w3.org/DesignIssues/Interpretation.html
2 Tim Berners-Lee: Semantic Web Road Map (1998), http://www.w3.org/DesignIssues/Semantic.html
3 Robert DuCharme, http://lists.xml.org/archives/xml-dev/200211/msg00190.html
The major goals of the project are to make collection metadata which stem from heterogeneous databases semantically interoperable on the Web, and to provide facilities for semantic browsing and searching in the combined knowledge base of the participating museums.

The project's vision is called "Finnish Museums on the Semantic Web" (FMS), and its architecture allows for all Finnish museums to join in. However, in an approach of starting small but ambitious, the project is at present using the collection databases of two museums, the Espoo City Museum and the National Museum of Finland. Furthermore, the implementation is currently restricted to only one part of the collections: textiles.

In order to reach the FMS system's goal of making the museums' metadata semantically interoperable on the Web, the data must be harmonised on the syntactic and semantic level. For this harmonisation, the eXtensible Markup Language (XML) and the Resource Description Framework (RDF) are being used, of which RDF is the key language for achieving semantic interoperability of the heterogeneous sets of metadata.
RDF and Metadata - A Natural Fit

An observer of the diffusion of the Resource Description Framework (RDF) into various domains, Robert DuCharme, has commented: 'I still find it a little ironic that while RDF has gotten so much publicity as a technology for warm and fuzzy AI (Artificial Intelligence) pie-in-the-sky technology, it's gotten most of its traction in the mundane world of metadata.'³

Yet, given the importance of metadata for the Semantic Web vision in general, it does not come as a surprise that metadata of key information communities belong to the first of RDF's intended uses. RDF seems to gain momentum in particular among the library and other communities that use Dublin Core.

The current W3C RDF Primer (Working Draft, 23 January 2003), edited by Frank Manola and Eric Miller, labels RDF as 'an ideal representation for Dublin Core information' and describes Dublin Core as one of their 'RDF in the field' examples. (Cf. http://www.w3.org/TR/rdf-primer/)

At the Dublin Core Metadata Initiative (DCMI), 'Expressing Simple Dublin Core in RDF/XML' was announced as a DCMI Recommendation in October 2002, 'the first in a series of recommendations for encoding Dublin Core metadata using mainstream Web technologies', i.e. XML/RDF/XHTML. 'Expressing Qualified Dublin Core in RDF/XML' is currently a Proposed Recommendation. Cf. http://dublincore.org/groups/architecture/
Syntactic Transformation 1: Creating the XML Documents

In the FMS system, the eXtensible Markup Language (XML) is used as the data transfer format. This transfer format enables the system to make use of the data originally stored in the museums' heterogeneous collection databases. Therefore, each museum participating in the FMS initiative provides the relevant collection data as an XML document repository.

In a process of syntactic harmonisation, the data from a museum's collection database are retrieved and transformed to an XML format conforming to the XML Schema of the FMS initiative.

The data to be published are read from the database through a 'view' which helps create the XML format. The view is a queryable interface, a virtual table that results from an SQL query which may join multiple tables of the database. Through the view the data are queried so that the rows of the tables are grouped by collection items. For each item, the set of rows is combined into a single XML document.
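The rows-to-XML step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the FMS implementation: the table layout, the view name item_view, the function item_to_xml and the second subject value are all hypothetical, and an in-memory SQLite database stands in for a museum's collection database.

```python
import sqlite3
import xml.etree.ElementTree as ET
from itertools import groupby

MI_NS = "http://www.m-i.org/images"   # namespace of the fictitious M-i.org site
ET.register_namespace("mi", MI_NS)

conn = sqlite3.connect(":memory:")    # stand-in for a collection database
conn.row_factory = sqlite3.Row
conn.executescript("""
    CREATE TABLE item (item_id TEXT PRIMARY KEY, type TEXT, creator TEXT);
    CREATE TABLE item_subject (item_id TEXT, subject TEXT);
    INSERT INTO item VALUES
        ('image5kb78d38i', 'Column Miniature', 'Alexander Master');
    INSERT INTO item_subject VALUES
        ('image5kb78d38i', 'Eve emerges from Adam''s body'),
        ('image5kb78d38i', 'Creation of Eve');  -- hypothetical second subject
    -- The view: a virtual table defined by a query that joins two tables.
    CREATE VIEW item_view AS
        SELECT i.item_id, i.type, s.subject, i.creator
        FROM item AS i JOIN item_subject AS s ON s.item_id = i.item_id;
""")

def item_to_xml(item_id, rows):
    """Combine all view rows of one collection item into one XML document."""
    root = ET.Element(f"{{{MI_NS}}}image", {"image_id": item_id})
    for column in rows[0].keys():
        if column == "item_id":
            continue
        values = []                   # distinct values, in row order
        for row in rows:
            if row[column] not in values:
                values.append(row[column])
        for value in values:
            ET.SubElement(root, f"{{{MI_NS}}}{column}").text = value
    return ET.tostring(root, encoding="unicode")

# Query through the view, sorted so that rows of the same item are adjacent.
query = "SELECT * FROM item_view ORDER BY item_id"
documents = {
    item_id: item_to_xml(item_id, list(rows))
    for item_id, rows in groupby(conn.execute(query), key=lambda r: r["item_id"])
}
print(documents["image5kb78d38i"])
```

The join yields two rows for the item (one per subject); grouping by item_id folds them into a single document in which the repeated type and creator values appear once.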
XML

XML is a markup language for describing data. It is a language created to allow anyone to design the structure of their own documents. An XML document contains text that consists of markup in the form of tags and plain text between them, the latter being just pure information (for example, <creator>Alexander Master</creator>). XML tags are not predefined; everyone can define his or her own tags.
FMS Documents: A short presentation is provided in Eero Hyvönen et al.: Cultural Semantic Interoperability on the Web: Case Finnish Museums Online, http://iswc2002.semanticweb.org/posters/hyvonen_a4.pdf; for detailed descriptions see Vilho Raatikka, Eero Hyvönen: Ontology-based Semantic Metadata Validation, and Eero Hyvönen et al.: Semantic Interoperability on the Web: Case Finnish Museums Online. Both texts can be found in: Towards the Semantic Web and Web Services. Proceedings of the XML Finland 2002 Conference, http://www.cs.helsinki.fi/u/eahyvone/xmlfinland2002/ProceedingsXML2002-final.pdf
ITEM_VIEW (database view) columns: item_id, type, subject, iconclass, creator, manuscript, place, year
XML shares the syntax and bracketed tags of the well-known HyperText Markup Language (HTML), but XML serves a different goal. While HTML is used to define the layout of pages on the WWW, XML is used to define the content of documents, for example to specify that an area of text is the name of a creator.

XML allows for creating markup (e.g. <creator>) that seems to carry some semantics. However, for a computer, a tag like <creator> carries as much semantics as a tag like <H1>. A computer simply does not know what a creator is and how the concept creator is related to other concepts (e.g. manuscript). For an XML processor, <H1> and <creator> or <manuscript> are all equally (and totally) meaningless. XML is all about describing data; on its own it does not do anything. There needs to be a processing program that uses the markup to interpret the various elements.
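A small Python sketch can make this point concrete. It is an illustration only: the helper names shape and describe and the LABELS table are hypothetical. The parser delivers the same structure whether the tags are called creator and manuscript or b and c; meaning enters only through a program that treats particular names specially.

```python
import xml.etree.ElementTree as ET

record = ET.fromstring(
    "<image><creator>Alexander Master</creator>"
    "<manuscript>Historic Bible</manuscript></image>"
)
# The same data with tag names that suggest nothing to a human reader.
opaque = ET.fromstring(
    "<a><b>Alexander Master</b><c>Historic Bible</c></a>"
)

def shape(element):
    """Reduce a tree to nesting and text, ignoring the tag names."""
    return (element.text or "", [shape(child) for child in element])

# With the names ignored, the parser's view of both documents is identical.
same_structure = shape(record) == shape(opaque)

# A processing program supplies the interpretation of specific names.
LABELS = {"creator": "Illuminated by", "manuscript": "Found in"}

def describe(element):
    return [f"{LABELS[c.tag]}: {c.text}" for c in element if c.tag in LABELS]

print(same_structure)
print(describe(record))   # the program knows these two names
print(describe(opaque))   # nothing to say about <b> and <c>
```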
The graphic below illustrates the database rows to XML process as described in the info box. It provides a very simple example of an XML document that describes some data for one of the medieval column miniatures from the Koninklijke Bibliotheek, The Hague, which we were permitted to use for illustrating this Thematic Issue. It includes the Iconclass classification for this image: 71A3421 Eve emerges from Adam's body (for the hierarchical path of this classification see the section on ontologies).

Short explanations for the XML document (image5kb78d38i.xml) shown in the graphic below:

A well-formed XML document is one that conforms to the XML syntax rules, of which we would like to highlight the following:

(1) The document should begin with the XML declaration, which defines the XML version and the character encoding used in the document. In the example below we use <?xml version="1.0" encoding="ISO-8859-1"?>, i.e. the document conforms to the 1.0 specification of XML and uses the ISO-8859-1 (Latin-1/West European) character set.

(2a, 10) The XML document must contain a single tag pair to define a root element, in our example <mi:image> ... </mi:image>.
Item data from database rows (grouped by item):

  image5kb78d38i, 'Column Miniature', 'Eve emerges from Adams Body', '71A3421', 'Alexander Master', 'Historic Bible', 'Utrecht', 'circa 1430'

Rows to XML process

XML document (image5kb78d38i.xml):

  (1)  <?xml version="1.0" encoding="ISO-8859-1"?>
  (2)  <mi:image xmlns:mi="http://www.m-i.org/images"
                 image_id="image5kb78d38i">
  (3)    <mi:type>Column Miniature</mi:type>
  (4)    <mi:subject>Eve emerges from Adam's body</mi:subject>
  (5)    <mi:iconclass>71A3421</mi:iconclass>
  (6)    <mi:creator>Alexander Master</mi:creator>
  (7)    <mi:manuscript>Historic Bible</mi:manuscript>
  (8)    <mi:place>Utrecht</mi:place>
  (9)    <mi:year>circa 1430</mi:year>
  (10) </mi:image>

Graphic 1: Database rows to XML process
(2b) Namespace: Since element names in XML are not fixed, name conflicts can occur when different documents use the same names to describe different types of elements. To prevent such conflicts, a unique namespace should be defined using a Uniform Resource Identifier (URI). An XML namespace is a collection of names that are used as element types and attribute names (cf. http://www.w3.org/TR/REC-xml-names). Our namespace declaration in the start tag of the root element is xmlns:mi="http://www.m-i.org/images".

The namespace prefix mi (for medieval images) functions as a placeholder for the namespace name. It needs to show up in all element tags (e.g. <mi:type> ... </mi:type>).

(2c) image_id="image5kb78d38i": This is the id attribute, which contains the unique identifier of the data source record.

(3-9) All other elements must be within the root element and can themselves have sub-elements (child elements), which must be properly nested within their parent element. Our elements do not have sub-elements.
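How a namespace prevents name conflicts can be seen with Python's standard XML parser, which expands each prefixed name into the form {namespace URI}local-name. The sketch below is illustrative; the second vocabulary (http://example.org/library) is a hypothetical stand-in for another organisation that also uses an element called type.

```python
import xml.etree.ElementTree as ET

ours = ET.fromstring(
    '<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">'
    "<mi:type>Column Miniature</mi:type></mi:image>"
)
# Hypothetical second vocabulary that also defines an element named "type".
theirs = ET.fromstring(
    '<lib:record xmlns:lib="http://example.org/library">'
    "<lib:type>Book</lib:type></lib:record>"
)

our_type = ours[0].tag      # name of the first child, namespace included
their_type = theirs[0].tag
print(our_type)
print(their_type)

# The prefix is only a placeholder: a different prefix bound to the same
# URI yields exactly the same element name.
renamed = ET.fromstring(
    '<x:image xmlns:x="http://www.m-i.org/images">'
    "<x:type>Column Miniature</x:type></x:image>"
)
print(renamed[0].tag == our_type)
```

The two type elements remain distinct because their namespace URIs differ, while mi:type and x:type denote the same element.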
Other syntax rules are, for example: all start tags must match their end tags, and because XML tags are case sensitive (i.e. the tag <mi:Creator> is different from the tag <mi:creator>) they must also be written with the same case; all elements must have a closing tag; all attribute values must be within quotation marks (e.g. "image5kb78d38i").
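These rules can be tried out with any XML parser, since a conforming parser rejects documents that are not well-formed. The following sketch uses Python's standard library; the helper is_well_formed and the sample snippets are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

samples = {
    "matching tags": "<image><creator>Alexander Master</creator></image>",
    # <Creator> and <creator> are different names, so the tags do not match.
    "case mismatch": "<image><Creator>Alexander Master</creator></image>",
    "missing end tag": "<image><creator>Alexander Master</image>",
    "unquoted attribute": "<image image_id=image5kb78d38i></image>",
    "two root elements": "<image></image><image></image>",
}
results = {name: is_well_formed(text) for name, text in samples.items()}
for name, ok in results.items():
    print(f"{name}: {'well-formed' if ok else 'rejected'}")
```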
Syntactic Transformation 2: The XML Schema

In order to allow for syntactic harmonisation, the XML documents of the museums should conform to the XML Schema of the FMS initiative. Therefore, the museums use the initiative's XML Schema when they create their XML documents, validating them against the Schema. If the documents are valid, the process can continue to the semantic level.

XML Schema

The XML Schema defines the building blocks of an XML document, including:
- elements and attributes that can appear in a document
- which elements are child elements, as well as their order and number
- whether an element is empty or can include text
- the data types for elements and attributes
- as well as default and fixed values for elements and attributes

XML with an XML Schema is designed to be self-descriptive. One of the greatest strengths of XML Schema is that it allows for data typing. The most common data types are xs:string, xs:decimal, xs:integer, xs:boolean, xs:date, xs:time. In the example below, which is the XML Schema for the XML document (image5kb78d38i.xml) shown in graphic 1, we only use the data type xs:string. This data type is used for values that contain character strings.
  (1)  <?xml version="1.0"?>
  (2a) <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
  (2b)     targetNamespace="http://www.m-i.org/images"
  (2c)     xmlns="http://www.m-i.org/images"
  (2d)     elementFormDefault="qualified">
  (3)    <xs:element name="image">
  (4)      <xs:complexType>
  (5)        <xs:sequence>
  (6)          <xs:element name="type" type="xs:string"/>
  (7)          <xs:element name="subject" type="xs:string"/>
  (8)          <xs:element name="iconclass" type="xs:string"/>
  (9)          <xs:element name="creator" type="xs:string"/>
  (10)         <xs:element name="manuscript" type="xs:string"/>
  (11)         <xs:element name="place" type="xs:string"/>
  (12)         <xs:element name="year" type="xs:string"/>
  (13)       </xs:sequence>
  (14)       <xs:attribute name="image_id" type="xs:string" use="required"/>
  (15)     </xs:complexType>
  (16)   </xs:element>
  (17) </xs:schema>

Short explanations:

(1) The XML declaration, which states that the document conforms to the 1.0 specification of XML.

(2a) Determines that the elements and data types that are used to construct the schema come from the W3C's XML Schema namespace. Consequently, each of the elements and data types in the schema has the prefix xs:, which identifies them as belonging to the vocabulary of the XML Schema language rather than the vocabulary of our (fictitious) organisation M-i.org.

(2b) Indicates that the elements defined by this schema come from our http://www.m-i.org/images namespace.

(2c) Is our default namespace.

(2d) Demands that any elements used by an XML document which were declared in this schema must be namespace qualified.

(3, 16) The parent element for the data typing of the image descriptions we provide at http://www.m-i.org/images. It is (4) defined as an xs:complexType, i.e. it contains child elements (6-12), which are (5, 13) surrounded by an xs:sequence element that defines an ordered sequence of these elements.

(6-12) The child elements, which in our example are simple types because they do not contain other elements. They define various elements of our XML documents, e.g. "creator", to be of the data type xs:string.

(14) Furthermore, the Schema determines that for the element xs:element name="image" there is a required attribute image_id of the data type xs:string (for example, "image5kb78d38i").
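What validating against this schema amounts to can be sketched by hand in Python. The block below is not an XML Schema processor, and real validation would use a schema-aware tool; it is a simplified stand-in that checks only the constraints named in the explanations above (the root element and its namespace, the required image_id attribute, the ordered sequence of children, and children without sub-elements). The names check_image, build and VALUES are hypothetical.

```python
import xml.etree.ElementTree as ET

MI_NS = "http://www.m-i.org/images"
# Child elements in the order fixed by the schema's xs:sequence.
SEQUENCE = ["type", "subject", "iconclass", "creator",
            "manuscript", "place", "year"]
VALUES = ["Column Miniature", "Eve emerges from Adam's body", "71A3421",
          "Alexander Master", "Historic Bible", "Utrecht", "circa 1430"]

def check_image(text):
    """Return a list of violations; an empty list means all checks passed."""
    root = ET.fromstring(text)
    problems = []
    if root.tag != f"{{{MI_NS}}}image":
        problems.append("root element is not image in the m-i.org namespace")
    if "image_id" not in root.attrib:
        problems.append("required attribute image_id is missing")
    found = [child.tag for child in root]
    expected = [f"{{{MI_NS}}}{name}" for name in SEQUENCE]
    if found != expected:
        problems.append("child elements differ from the declared sequence")
    if any(len(child) for child in root):
        problems.append("a simple-type element contains child elements")
    return problems

def build(names, image_id=None):
    """Assemble a test document with the given child element order."""
    attr = f' image_id="{image_id}"' if image_id else ""
    body = "".join(f"<mi:{n}>{v}</mi:{n}>" for n, v in zip(names, VALUES))
    return f'<mi:image xmlns:mi="{MI_NS}"{attr}>{body}</mi:image>'

valid = build(SEQUENCE, image_id="image5kb78d38i")
invalid = build(list(reversed(SEQUENCE)))   # wrong order, no image_id

print(check_image(valid))
print(check_image(invalid))
```

A document can be well-formed and still fail these checks, which is the distinction between well-formedness and validity.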
Major Benefits of XML

In the context of the Semantic Web, XML provides an interoperable syntactical foundation upon which solutions to the issues of representing relationships and meaning can be built. We also want to highlight the many benefits of XML that are adding to its rapid uptake in the first place and might, in the longer term, be supportive in realising the Semantic Web vision on a broader scale.

XML is one of the most important standards developments in recent years. It is an international, universal, non-system and non-application specific data exchange standard. XML is international because it employs Unicode. This means that there is no restriction to the western alphabet, but Arabic, Chinese, Greek, Hebrew, Thai, etc. can be easily integrated.

XML is non-system specific because it is an open standard set by the World Wide Web Consortium (W3C). As such, there is no owner of XML. All the major software suppliers support it; it can be used on any computing platform, from Windows and MacOS to Linux. This makes it easier for organisations to change systems or combine different systems.

XML is also non-application specific, i.e. it can be used in various applications such as data exchange, data harvesting, Web site management, etc. XML is gaining ever-wider acceptance in many application domains, including in particular the cultural heritage community.

Bear in mind also that the major collection management software producers have implemented support for XML in their systems, enabling, for example, the integration of data from different collections and their combination over the Web.

XML allows for multi-channel publishing, i.e. with XML it is easy to produce different products or services from digital cultural heritage assets. Once the data are structured in XML, they can be displayed across a variety of media using an associated style sheet that contains the display information.
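The idea of multi-channel publishing can be illustrated with a short Python sketch in which one XML source feeds two outputs. In practice the display rules would live in a style sheet (for example XSLT or CSS); here two plain functions, to_html and to_text, play that role, and both names are hypothetical.

```python
import xml.etree.ElementTree as ET
from html import escape

NS = {"mi": "http://www.m-i.org/images"}
source = ET.fromstring(
    '<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">'
    "<mi:subject>Eve emerges from Adam's body</mi:subject>"
    "<mi:creator>Alexander Master</mi:creator>"
    "<mi:year>circa 1430</mi:year></mi:image>"
)

def field(name):
    """Text of the named child element in the m-i.org namespace."""
    return source.findtext(f"mi:{name}", namespaces=NS)

def to_html():
    """Channel 1: a fragment for a Web page (values escaped for HTML)."""
    return (f"<h1>{escape(field('subject'))}</h1>"
            f"<p>{escape(field('creator'))}, {escape(field('year'))}</p>")

def to_text():
    """Channel 2: a one-line caption, e.g. for a printed list."""
    return f"{field('subject')} ({field('creator')}, {field('year')})"

print(to_html())
print(to_text())
```

The data are stored once; only the presentation rules differ per channel.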
Finally, XML can be used to create new languages. For example, the Wireless Markup Language (WML), which is used to mark up Internet applications for handheld devices, is written in XML.
Ontological concepts

The goal of the FMS project is to make metadata of the museums' textiles collections semantically interoperable on the Web. In order to achieve such interoperability, an ontology is being designed that describes the common (lower-level) ontological concepts in this domain of knowledge.⁴

Ontology

In the Semantic Web architecture, the semantic relationships are not embedded but explicitly represented by an ontology, or rather an interrelated set of ontologies. In fact, the wide array of information residing on the Web and the perceived need to make it more machine-processable have acted as a strong impetus for the development of ontology languages.

Yet, what is an ontology? An abstract definition of an ontology is that it describes a formal, shared conceptualisation of a particular domain of interest, for example cultural heritage objects held in art museums. In particular, an ontology allows for constraining, expressing and analysing the intended meaning of the shared vocabulary of concepts and relations in a domain of knowledge.⁵

If these concepts and relations are formalised to a high degree, the domain has at hand a major building block for developing semantically aware information systems. With Semantic Web technologies, the domain ontology can be made available on the network, cross-referenced with upper-level and other domain ontologies,⁶ and remote applications or intelligent software agents can refer to it when they interact to provide a certain information service.

4 The available documents (in English) on the FMS initiative state that their ontology is being created using RDF Schema (RDFS). To develop a fully fledged ontology, advanced languages such as DAML+OIL or Web Ontology Language (OWL) would be required.
5 For more elaborate and formal descriptions see Tom Gruber: What is an Ontology? (1995), http://www-ksl.stanford.edu/kst/what-is-an-ontology.html; Nicola Guarino: Ontology-Driven Conceptual Modelling, part 1-3 (2002), http://ontology.ip.rm.cnr.it/Tutorials
The degree of formalisation of concepts and theirrelations varies considerably between differentdomains of knowledgeAt the lower end one findslexicons and simple taxonomies (ie an orderedclassification system where terms are relatedhierarchically)At the middle level one might placethesauri ie controlled vocabularies that arestructured to show relationships between terms andconcepts and for example allow for retrieving themfrom a databaseAt the high end of formalisation ofknowledge there are axiomatised logic theories Suchtheories include rules to ensure the well-formednessand logical validity of statements expressed in thelanguage of the scientific discipline
In the cultural heritage sector a powerful IT-supported example of a hierarchical classificationsystem is IconclassThis supports the documentationof images in particular art historical images byproviding a systematic collection of 28000 ready-made definitions of objects persons eventssituations and abstract ideas that can be the subject ofan imageThe definitions consist of an alphanumericclassification code and its textual correlate7
For example, for the image we have described in XML in graphic 1, the Iconclass definition is 71A3421 'Eve emerges from Adam's body'. The Medieval Illustrated Manuscripts Website of the Koninklijke Bibliotheek, The Hague, has an Iconclass Browser in place8 that provides the hierarchical path for this concept in the classification system.
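The hierarchical path of this example can be sketched in a few lines of Python. The sketch assumes that each level of the path is obtained by shortening the code one character at a time; that holds for 71A3421 as listed in this primer, but it is only an illustration and does not cover every form of Iconclass notation. The function name and the label table are ours, not part of Iconclass or of the KB browser.

```python
# Labels copied from the hierarchical path given for this example in the primer.
LABELS = {
    "7": "Bible",
    "71": "Old Testament",
    "71A": "Genesis from the creation to the expulsion from paradise",
    "71A3": "creation of man; the Garden of Eden",
    "71A34": "creation of Eve",
    "71A342": "Eve is fashioned from Adam's rib",
    "71A3421": "Eve emerges from Adam's body",
}


def hierarchical_path(notation):
    """Return every ancestor code of a notation, from the top level down.

    Assumption: one extra character per level, as in the 71A3421 example.
    """
    return [notation[:length] for length in range(1, len(notation) + 1)]


for code in hierarchical_path("71A3421"):
    # Fall back to a placeholder if a level is missing from the small table.
    print(code, LABELS.get(code, "(label not listed)"))
```

Because every broader concept is a prefix of the narrower one, a search for '71A34' can also find all images classified under 'creation of Eve' and its sub-concepts by a simple prefix match.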
An outstanding example of a controlled vocabulary is the Art & Architecture Thesaurus (AAT), one of the Getty Research Institute's Vocabulary Databases. It is a structured vocabulary of more than 125,000 terms and other information about concepts that are used for describing fine art, architecture, decorative arts, archival materials and material culture.9
The W3C's Resource Description Framework (RDF) Schema Specification 1.0, in a section on its scope, mentions concept navigation and states: 'Thesauri and library classification schemes are well known examples of hierarchical systems for representing subject taxonomies in terms of the relationships between named concepts. The RDF Schema specification provides sufficient resources for creating RDF models that represent the logical structure of thesauri (and other library classification systems).'10
Yet for realising a full-blown cultural heritage ontology for the Semantic Web there are currently limitations on both sides. On the one hand, hierarchical classification systems and structured vocabularies do not lend themselves easily to rich inter-linking of conceptual 'trees'.
A major step further in this direction is the CIDOC object-oriented Conceptual Reference Model (CRM).11 This provides an ontology of 81 classes and 130 properties, which describes in a formal language concepts and relations relevant to the documentation of cultural heritage.
On the other hand, RDF Schema has limitations when it comes to expressing complex ontological relationships. New languages based on description logics are being developed. These include DAML+OIL and the upcoming Web Ontology Language (OWL), which are capable of fully describing ontologies.
Also worth highlighting is that tools for ontology building are proliferating at the present time. These ontology editors need to be carefully assessed, as their capabilities differ considerably.12
7 Bible
71 Old Testament
71A Genesis from the creation to the expulsion from paradise, and later years of Adam and Eve
71A3 creation of man; the Garden of Eden (Genesis 1:26-2)
71A34 creation of Eve
71A342 Eve is fashioned from Adam's rib
71A3421 Eve emerges from Adam's body
6 Upper-level ontologies describe the basic concepts and relationships invoked when information about any domain is expressed in natural language.
7 For in-depth information see the official Iconclass Website, http://www.iconclass.nl
8 Medieval Illustrated Manuscripts Website, http://www.kb.nl/kb/manuscripts/browser/ The subject access system for the Website was conceived by Mnemosyne Partners, building on the Iconclass classification system and technologies. See the valuable information they provide at httpwwwmnemosyneorg businessmsstempexhtml
9 A detailed description of the AAT is provided at http://www.getty.edu/research/tools/vocabulary/aat/about.html
10 http://www.w3.org/TR/2000/CR-rdf-schema-20000327
11 See the information box at the end of the Forum discussion and the sources mentioned in the Semantic Web Terms and Reading List.
12 Michael Denny: Ontology Building: A Survey of Editing Tools (06-11-2002), http://www.xml.com/pub/a/2002/11/06/ontologies.html
Semantic Transformation 1: The RDF Data Model
Progressing towards semantic interoperability, the metadata in the XML documents are now transformed into RDF statements corresponding to the RDF data model. With these so-called RDF 'triples', the XML metadata elements are mapped to the RDF classes and properties which are defined by the RDF Schema of the FMS initiative (see section 3).
'XML is nothing more than a way to standardize data formats. This is not to underplay XML's importance. A data-format standard makes all of the more glamorous technologies possible, and RDF is the leading example of the benefit that comes once the data format has been standardized. Many proclaim that RDF is really the XML's killer app, and with good reason. Despite all this, RDF remains somewhat obscure. This is mainly because at its core RDF is very abstract, very dry and very academic.'
Uche Ogbuji: An introduction to RDF (2000), http://www-106.ibm.com/developerworks/library/w-rdf/?dwzone=xml
RDF Data Model
In order to make Web resources semantically interoperable, we need resources that provide machine-understandable information about themselves. In the Semantic Web architecture these statements are built by using the Resource Description Framework (RDF).
RDF defines a data model for the statements, describing typed relationships between uniquely identified resources. RDF distinguishes between:
| resources: familiar examples are a Web page, electronic document or digital image, but in RDF also entities that are not 'network retrievable', e.g. museums, curators or bound medieval manuscripts, can be resources
| properties: these identify a specific aspect, characteristic, attribute or relation used to describe the resource
| statements: these associate a value for a named property with the resource
Hence RDF provides a model for describing relationships between resources in terms of named properties and values. The RDF data model intrinsically supports only binary relations. Its base element is the 'triple', which takes the form of subject, predicate, object: a resource (the subject) is linked to another resource (the object) through an arc labelled with a third resource (the predicate). The semantics of a triple clearly depends on the property used as predicate.
A convenient way to visualise this is to draw nodes for subject and object, and an arrow between them for the predicate (see graphic 2). In this labelled directed graph, subject and predicate (property) are Uniform Resource Identifiers (URIs), and the object is either a URI or a literal (which is drawn as a box). Everything in RDF can be represented by a graph with nodes and arcs, and the data model allows for using the same URI as a node and as an arc label. To represent RDF statements in a machine-processable way, RDF builds on XML. With RDF/XML, a specific XML markup language, RDF information can be represented and exchanged between machines.
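To make the data model concrete, the following minimal Python sketch stores triples as plain tuples, without any RDF library. The class names URI, Literal and Triple are ours, the instance URI for the image is hypothetical, and the '#' at the end of the example namespace is an assumption; the subclass statements are the ones shown in graphic 2.

```python
from typing import NamedTuple, Union

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
MI = "http://www.m-i.org/schemas/images#"  # the primer's example namespace


class URI(str):
    """A resource identified by a URI (usable as a node or as an arc label)."""


class Literal(str):
    """A plain value; in the graph it would be drawn as a box."""


class Triple(NamedTuple):
    subject: URI
    predicate: URI
    obj: Union[URI, Literal]


graph = {
    Triple(URI(MI + "ColumnMiniature"), URI(RDFS + "subClassOf"), URI(MI + "Miniature")),
    Triple(URI(MI + "Miniature"), URI(RDFS + "subClassOf"), URI(MI + "Image")),
    # Hypothetical instance URI, used only to show an object that is a literal.
    Triple(URI("http://www.m-i.org/images/example-1"), URI(MI + "hasCreator"),
           Literal("Alexander Master")),
    # The same URI may appear as a node: here subClassOf is itself described.
    Triple(URI(RDFS + "subClassOf"), URI(RDF + "type"), URI(RDF + "Property")),
}

# Nodes are all subjects plus all objects that are URIs; arcs are the predicates.
nodes = {t.subject for t in graph} | {t.obj for t in graph if isinstance(t.obj, URI)}
arcs = {t.predicate for t in graph}
node_and_arc = nodes & arcs
literals = sorted(t.obj for t in graph if isinstance(t.obj, Literal))
```

In this small graph the only URI used both as a node and as an arc label is rdfs:subClassOf, and the only literal is the creator's name.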
Graphic 2: RDF Data Model
Triple 1: Subject http://www.m-i.org/schemas/images#ColumnMiniature; Predicate http://www.w3.org/2000/01/rdf-schema#subClassOf; Object http://www.m-i.org/schemas/images#Miniature
Triple 2: Subject http://www.m-i.org/schemas/images#Miniature; Predicate http://www.w3.org/2000/01/rdf-schema#subClassOf; Object http://www.m-i.org/schemas/images#Image
With these two triples we state that ColumnMiniature is a subclass of Miniature and that Miniature is a subclass of Image.
The predicate in our statements is the rdfs:subClassOf property, which is predefined in the RDF Schema namespace http://www.w3.org/2000/01/rdf-schema#
The classes Image, Miniature and ColumnMiniature would also need to be defined in our RDF Schema namespace http://www.m-i.org/schemas/images#
For details on how to use RDF Schema for defining your domain ontology see section 3. In a nutshell, RDF Schema is a 'higher-level' language which is itself defined using RDF.
Semantic Transformation 2: Creating and Validating the RDF Statements
In the mapping process an editor tool (in the FMS case a tool developed by the project team, called Meedio) receives as input the XML documents and assists in transforming them into semantically valid RDF statements (instance descriptions). The tool serves as instance editor, which provides a convenient way of finding and selecting from the XML metadata elements correct instance values for a particular property. The editor tool also serves as semantic metadata validator. When the museum cataloguer saves the set of RDF statements corresponding to an XML document, the semantics of these statements are validated against the property constraints of the FMS ontology. The result of a successful mapping and validation process is a unique set of RDF triples, called the RDF card.
RDF Mapping Rules
When RDF is used to define the meaning of XML metadata elements, a set of mapping rules is created. A mapping rule is a template of RDF triples where XPath expressions are used to identify the actual element values. XPath is a language for addressing parts of an XML document and was designed for use in XML parsing software (XSLT, XPointer and others).
When applying such a rule to an XML document, the XPath expressions are instantiated with matching element values. If the rule matches, the RDF template evaluates to a set of RDF triples where XPath expressions are substituted by the corresponding values of the XML elements.
For example, by applying the template rule
<image5kb78d38itype mi:hasCreator image5kb78d38icreator>
to the XML document described in the section on XML, the following result would be obtained:
<Image Miniature mi:hasCreator 'Alexander Master'>
mi:hasCreator is an example of an RDF property. Such properties are explained in section 3, RDF Schema.
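The way such a template rule is applied can be sketched as follows. This is an illustration only, not the FMS editor tool: the XML record, its element names and the function apply_rule are hypothetical, and Python's standard ElementTree module supports only a limited subset of XPath.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML record; the element names are illustrative only.
XML_RECORD = """
<image>
  <type>Miniature</type>
  <creator>Alexander Master</creator>
</image>
"""

# A mapping rule is a triple template: two path expressions and a property.
RULE = ("./type", "mi:hasCreator", "./creator")


def apply_rule(rule, root):
    """Instantiate a triple template against one XML document.

    Returns a set with one triple if both paths match, otherwise an empty set.
    """
    subject_path, prop, object_path = rule
    subject = root.findtext(subject_path)
    obj = root.findtext(object_path)
    if subject is None or obj is None:
        return set()  # the rule does not match this document
    return {(subject, prop, obj)}


root = ET.fromstring(XML_RECORD)
triples = apply_rule(RULE, root)
print(triples)
```

A rule whose path expression finds no element simply produces no triples, so one set of rules can be run over documents that do not all carry the same elements.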
Note: Due to the limited space permitted, we do not address issues of term mapping. This is an important aspect of the mapping and validation process carried out in the FMS project. Working with metadata from different museums, they need to deal with partly different terminologies. Their technical solution to synonymous terms (i.e. different terms referring to the same concepts) is to attach synonym sets to the FMS ontology classes. With situations where polysemous terms occur (i.e. the same terms refer to different concepts), the editor tool cannot cope, and the cataloguer needs to select the correct interpretation.
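The distinction between synonymous and polysemous terms can be illustrated with a small lookup. The classes, terms and function name below are invented for the illustration and are not taken from the FMS ontology; the sketch only shows why one case can be resolved automatically and the other cannot.

```python
# Hypothetical synonym sets attached to (hypothetical) ontology classes.
SYNONYM_SETS = {
    "ex:Illustration": {"miniature", "illumination", "plate"},
    "ex:Tableware": {"dish", "plate"},
}


def candidate_classes(term):
    """Return all classes whose synonym set contains the term."""
    wanted = term.lower()
    return sorted(cls for cls, terms in SYNONYM_SETS.items() if wanted in terms)


def resolve(term):
    """Map a term to a class, or report that a human decision is needed."""
    candidates = candidate_classes(term)
    if len(candidates) == 1:
        return candidates[0]  # synonym: resolved automatically
    if not candidates:
        return None           # unknown term
    raise LookupError(f"'{term}' is polysemous, choose one of {candidates}")
```

Here 'miniature' and 'illumination' both resolve to the same class, whereas 'plate' belongs to two synonym sets, so resolve raises an error and the choice is left to the cataloguer.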
Semantic Transformation 3: The RDF Schema (RDFS)
The shared ontology for the textiles domain is created by using Resource Description Framework Schema (RDFS). An RDF Schema is a tool for indicating the classes of resources one wants to describe, as well as for defining the properties used to describe those resources. Furthermore, class/sub-class relationships and property/sub-property relationships can be defined. The museums are mapping their metadata to the classes and properties defined by the RDF Schema of the FMS initiative. Thereby they are making the meaning of the metadata explicit and representing them in a harmonised, uniform way.
RDF Schema
In section Semantic Transformation 1 we have described the data model provided by RDF for expressing statements about Web resources. But we also need a vocabulary for the RDF statements, namely classes and properties defined with RDF Schema (RDFS).
In brief, the RDF Schema mechanism provides a pre-defined vocabulary, a basic type system that can be used in creating domain-specific schemas. Its role is to allow for declaring metadata properties (e.g. for 'type', 'subject' or 'creator'), to define the classes of resources they may be used with, to restrict possible combinations, and to detect violations of those restrictions.
Defining classes
With RDF Schema (RDFS), Web resources can be defined as instances of one or more classes. In addition, classes can be organised in a hierarchical fashion. As we hold a collection of digital images drawn from illustrated medieval manuscripts, we first need to define a class of things that are images. In RDF Schema a class is any resource having an rdf:type property whose value is the RDFS-defined resource rdfs:Class.
So, using the basic RDF data model, we define:
mi:Image [resource] rdf:type [property] rdfs:Class [value]
The self-defined prefix mi (for medieval images) stands for the URI reference of our RDF Schema namespace http://www.m-i.org/schemas/images#
In our image collection we have various special kinds of digitised images, such as column miniatures, decorated initials, schematic drawings, etc. To distinguish, for example, the miniatures, first we need to define a general class Miniature and subclasses of miniatures, e.g. a subclass ColumnMiniature:
mi:Miniature rdf:type rdfs:Class
mi:ColumnMiniature rdf:type rdfs:Class
Secondly, we need to define that mi:ColumnMiniature is a subclass of mi:Miniature and that mi:Miniature is a subclass of mi:Image, for which we use the predefined rdfs:subClassOf property:
mi:Miniature rdfs:subClassOf mi:Image
mi:ColumnMiniature rdfs:subClassOf mi:Miniature
As the rdfs:subClassOf property is transitive, this means that mi:ColumnMiniature is also implicitly a subclass of mi:Image.
Graphic 2 on page 32 visualises this with the nodes and arcs of the basic RDF data model.
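The implicit subclass relationship follows from the two explicit statements by repeatedly following rdfs:subClassOf arcs. A minimal sketch, assuming the statements are held as simple (subclass, superclass) pairs; the function name is ours:

```python
# The two explicit rdfs:subClassOf statements from the example.
SUBCLASS_OF = {
    ("mi:ColumnMiniature", "mi:Miniature"),
    ("mi:Miniature", "mi:Image"),
}


def superclasses(cls, pairs):
    """Return all direct and implied superclasses of a class."""
    found = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for sub, sup in pairs:
            # Follow each subClassOf arc once; 'found' also guards against cycles.
            if sub == current and sup not in found:
                found.add(sup)
                frontier.append(sup)
    return found


print(sorted(superclasses("mi:ColumnMiniature", SUBCLASS_OF)))
```

For mi:ColumnMiniature this yields both mi:Miniature (stated) and mi:Image (implied), which is what allows a search for images to find column miniatures as well.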
Defining properties
In order to make the meaning of our metadata (i.e. 'type') explicit, we need to be capable of declaring specific properties that characterise the classes of things we hold at http://www.m-i.org, e.g. digital images of medieval column miniatures.
Basically, RDF Schema defines properties in terms of the classes of resources to which they apply. This is the role of the rdfs:domain and rdfs:range mechanisms.
rdfs:range
The range constraint defines the class or set of classes whose instances can be values of a particular property. If we want to define the property mi:hasType, we must describe this resource (which we locate at http://www.m-i.org/schemas/images#) with an rdf:type property whose value is rdf:Property:
mi:hasType [resource] rdf:type [property] rdf:Property [value]
The following RDF statements indicate that mi:ColumnMiniature is a class, mi:hasType is a property, and RDF statements using the mi:hasType property have instances of mi:ColumnMiniature as values:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:range mi:ColumnMiniature
rdfs:domain
The domain constraint restricts the set of classes whose instances may have a particular property attached to them. If we want to indicate that the property mi:hasType applies to instances of class mi:ColumnMiniature, we would write:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:domain mi:ColumnMiniature
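Read as constraints, in the way this primer describes them, the domain and range declarations can be checked mechanically. The following sketch is an assumption-laden illustration: the instance identifiers (ex:img1, ex:img2, ex:museum), the class ex:Museum and the function violations are hypothetical, and the check looks at directly stated types only, without following subclass relationships.

```python
# Constraints as declared in the example: both domain and range of
# mi:hasType are mi:ColumnMiniature.
CONSTRAINTS = {
    "mi:hasType": {"domain": "mi:ColumnMiniature", "range": "mi:ColumnMiniature"},
}

# Hypothetical instances and their directly stated classes.
TYPES = {
    "ex:img1": {"mi:ColumnMiniature"},
    "ex:img2": {"mi:ColumnMiniature"},
    "ex:museum": {"ex:Museum"},
}


def violations(statement, constraints, types):
    """Return the list of constraints ('domain', 'range') a statement breaks."""
    subject, prop, obj = statement
    rule = constraints.get(prop, {})
    problems = []
    if "domain" in rule and rule["domain"] not in types.get(subject, set()):
        problems.append("domain")
    if "range" in rule and rule["range"] not in types.get(obj, set()):
        problems.append("range")
    return problems
```

A statement with an instance of the wrong class as its subject is reported as a domain violation, one with the wrong kind of value as a range violation; a property without declared constraints passes unchecked.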
Benefits of RDF
In a SearchWebServices.com definition of RDF, some benefits of RDF are mentioned:
| 'By providing a consistent framework, RDF will encourage the providing of metadata about Internet resources.
| Because RDF will include a standard syntax for describing and querying data, software that exploits metadata will be easier and faster to produce.
| The standard syntax and query capability will allow applications to exchange information more easily.
| Searchers will get more precise results from searching, based on metadata rather than on indexes derived from full text gathering.
| Intelligent software agents will have more precise data to work with.'13
This is a well-crafted listing of RDF benefits, from provision and exchange of better metadata to agents working with them, hopefully for the benefit of humans. But, as explicitly stated by SearchWebServices.com, these are only potential benefits, i.e. they depend on the level of actual uptake of RDF.
13 whatis.com, searchWebServices.com Definitions - Resource Description Framework, http://searchwebservices.techtarget.com/sDefinition/0,,sid26_gci213545,00.html (last updated July 27, 2001)
Generating and Using the Knowledge Space
The RDF cards represent the original XML documents at the semantic level. The union of such RDF cards constitutes a knowledge base, which is a harmonised semantic representation of the underlying heterogeneous databases.
However, so far the RDF instance descriptions have not left the museum. The museum has complete control of the information it wants to publish, and it does not need to allow the FMS system access to its internal database system. The RDF data are placed in a public directory on the museum's WWW server.
The Web crawler of the FMS system harvests the instance descriptions from the different museums, and the system combines them into an RDF repository. This repository is a large semantic graph that consists of the shared ontology and metadata.
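Conceptually, building the repository is a set union: the triples of the shared ontology plus the triples of every harvested RDF card. The sketch below stands in for the crawler with an in-memory dictionary; the museum names, instance identifiers and function names are hypothetical, and no network access is involved.

```python
# Shared ontology (schema-level triples).
ONTOLOGY = {
    ("mi:ColumnMiniature", "rdfs:subClassOf", "mi:Miniature"),
    ("mi:Miniature", "rdfs:subClassOf", "mi:Image"),
}

# Hypothetical RDF cards as each museum might publish them.
CARDS_BY_MUSEUM = {
    "museum-a": [
        {("ex:a1", "rdf:type", "mi:ColumnMiniature"),
         ("ex:a1", "mi:hasCreator", "Alexander Master")},
    ],
    "museum-b": [
        {("ex:b7", "rdf:type", "mi:Miniature")},
    ],
}


def harvest(cards_by_museum):
    """Stand-in for the Web crawler: yield every published RDF card."""
    for cards in cards_by_museum.values():
        yield from cards


def build_repository(ontology, cards):
    """Combine the shared ontology and all cards into one set of triples."""
    repository = set(ontology)
    for card in cards:
        repository |= card  # set union: identical triples are stored once
    return repository


repository = build_repository(ONTOLOGY, harvest(CARDS_BY_MUSEUM))
```

Because the cards from different museums use the same classes and properties, their triples join into a single graph without any further conversion.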
How does a user now search and navigate in this knowledge space? In the FMS system this is implemented by a server-side software called Ontogator. Based on the semantic graph, this software dynamically generates semantic linkages for the user's Web browser.
One way of using the FMS system is view-based filtering. The user can select classes of resources from the ontology, and the system finds the instances that match the selected class restrictions. By constraining classes (views) further, the collection instance data searched for are eventually found.
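The principle of view-based filtering can be sketched as successive narrowing of a result set. This is not Ontogator's actual algorithm, which the available description does not detail; the records, property names and function below are hypothetical.

```python
# Hypothetical instance descriptions, reduced to property/value pairs.
RECORDS = {
    "ex:item1": {"rdf:type": "mi:ColumnMiniature", "mi:hasCreator": "Alexander Master"},
    "ex:item2": {"rdf:type": "mi:DecoratedInitial", "mi:hasCreator": "Alexander Master"},
    "ex:item3": {"rdf:type": "mi:ColumnMiniature", "mi:hasCreator": "Unknown"},
}


def filter_by_views(records, restrictions):
    """Return the identifiers of all records that satisfy every restriction."""
    return sorted(
        identifier
        for identifier, description in records.items()
        if all(description.get(prop) == value for prop, value in restrictions.items())
    )


# One view selected, then a second one added to narrow the result.
first = filter_by_views(RECORDS, {"rdf:type": "mi:ColumnMiniature"})
second = filter_by_views(
    RECORDS,
    {"rdf:type": "mi:ColumnMiniature", "mi:hasCreator": "Alexander Master"},
)
```

Each additional restriction can only shrink the result, which is why constraining the views further eventually leads to the records searched for.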
The software also supports topic-based navigation by providing semantic links between topics of interest, the creation of which is based on the collection domain ontology and the related metadata of the collection records. This means that the links also provide the user with an impression of the wider context and pragmatics of the objects in the museums' collections.
From human users to software agents
As described in the Finnish Museums on the Semantic Web example, the RDF repository is a large semantic graph that consists of the shared ontology and metadata of the participating museums. Such a repository can be queried, and the results, a set of pointers to the relevant resources, can be accessed using Web browsers. The opportunities provided by a system like the one developed by the FMS initiative (e.g. topic-based navigation) are at present restricted primarily to human users.
The Semantic Web vision includes intelligent software agents which 'understand' semantic relationships between Web resources and seek relevant information as well as perform transactions for humans.14
This software would be capable of autonomous action, i.e. could run without direct human control or constant supervision, and ideally is very flexible in doing this. Characterisations of this flexibility include actions that are 'reactive', 'proactive' and 'social' (see below).
While the basic idea of agents is very intuitive and appealing, the actual theory is complex, the tools are immature, and the solutions small and prototype-based. In fact, as a parallel distributed systems technology, agents belong to the most complex class of software technology.
However, this primer will conclude with a summary of what an intelligent software agent is and what such a software would generally be capable of doing. This should also serve as an indication of how great the challenge for research and technological development is to make the full Semantic Web vision a reality.
Intelligent Software Agents
The following definitions are taken from Michael Wooldridge's introduction to multiagent systems:15
Agent
'An agent is a computer system capable of autonomous action in some environment.'
Intelligent agent
'An intelligent agent is a computer system capable of flexible autonomous action in some environment.'
Flexible autonomous action
'By flexible autonomous action we mean reactive, proactive, social.'
| Reactivity: 'A reactive system is one that maintains an ongoing interaction with its environment, and responds to changes that occur in it (in time for the response to be useful).'
| Proactiveness: 'An agent serves a purpose and therefore exhibits goal-directed behaviour, including the capacity to recognise opportunities for useful courses of action.'
| Social ability: 'Social ability in agents is the ability to interact with other agents (and possibly humans) via some kind of agent communication language, and perhaps cooperate with others.'
Desirable further properties of agents are:
| Mobility: the ability to move around an electronic network
| Rationality: an agent will act in such a way that it does not prevent itself from achieving its goals (as far as this is possible with a limited set of beliefs representing its world knowledge)
| Learning: an agent will improve its performance over time
14 T. Berners-Lee, J. Hendler, O. Lassila, Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html
15 M. Wooldridge: An Introduction to Multiagent Systems. Chichester: Wiley, 2002, and http://www.csc.liv.ac.uk/~mjw/pubs/imas/
Resources
In the primer no references are made to the documents of the World Wide Web Consortium (W3C). All relevant W3C recommendations can be found at http://www.w3c.org
Of the wealth of introductory materials on XML and RDF available on the Web, the following in particular are useful to consult for further details:
http://www.w3schools.com/xml/
http://www.w3schools.com/schema/
http://www.w3.org/TR/rdf-primer/
K. S. Candan, H. Liu, R. Suvarna: Resource Description Framework: Metadata and its Applications. In: SIGKDD Explorations, Vol. 3/1 (2001), 6-19
Pierre-Antoine Champin: RDF Tutorial (2001), http://www710.univ-lyon1.fr/~champin/rdf-tutorial/rdf-tutorial.html
S. Decker, M. P. Mitra, S. Melnik: Framework for the Semantic Web: An RDF Tutorial, http://www.ida.liu.se/~asmpa/courses/sweb/rdf/rdf_tutorial.pdf
The Finnish Museums on the Semantic Web: Overview of the System's Set-up
Graphic 3: Set-up of the Finnish Museums on the Semantic Web System
[Diagram. User client: WWW browser with topic-based navigation and view-based filtering. Server: Ontogator software with an RDF database holding the semantic graph (knowledge space of shared ontology and metadata). A Web crawler collects the RDF instances from the museums. At each museum, collection databases 1 to n (relational schemas, DBMS) feed an XML repository (XML Schema: syntactic interoperability), from which the metadata editor produces RDF instances (RDF Schema: semantic interoperability).]
THE DARMSTADT FORUM PARTICIPANTS
Wernher Behrendt, Salzburg Research, Austria
Wernher Behrendt is a Senior Researcher at the Sun Technology and Research Excellence Center at Salzburg Research (Austria), working on multimedia middleware and interoperation issues. He holds an MSc in Cognitive Science from Manchester University and has more than 10 years' experience in near-to-market IT research. From 1989 to 1995 he was a Senior Research Associate in the Informatics Department at Rutherford Appleton Laboratory (UK), working on embedded knowledge based systems in distributed multimedia presentation systems. From 1995 to 1998 he was a Senior Research Associate at Cardiff University (UK), working on interoperation between heterogeneous information systems. Mr Behrendt has held courses in Computer Science and has worked on projects ranging from software engineering methods and quality assurance to legacy system reengineering using migration methods and distributed systems middleware.
E-mail: wernher.behrendt@salzburgresearch.at
Paolo Buonora, Archivio di Stato di Roma, Italy
Paolo Buonora holds a degree in Philosophy from the University of Rome 'La Sapienza' (1976). He worked in the Italian State Archive Administration from 1978, where he was first involved in editing the Guida Generale degli Archivi di Stato italiani. From 1986 he worked in the Soprintendenza archivistica per il Lazio, surveying audiovisual archives and municipal archives; from 1989 to 1991 at the Perugia University, engaged in a doctoral research in 'Urban and rural history'; and from 1991 to 1994 again in the Soprintendenza archivistica per il Lazio. After 1994 he worked in the Archivio di Stato di Roma, where he was responsible for the photograph service and several working groups on informatics application in archival documentation. From 1997 until the present time he has planned and directed the Imago II project in the Archivio di Stato di Roma.
See httpwwwasrmarchivbeniculturaliitsidimagoIMAGOIIenhtml
Samuel Cruz-Lara, University of Nancy 2 and LORIA, France
Samuel Cruz-Lara obtained a Master's degree in Computer Science in 1984 (University Henri Poincaré, Nancy 1) and a PhD degree in Computer Science in 1988 (National Polytechnic Institute of Lorraine). The central topic of his PhD thesis was the generation of integrated development environments by using attribute grammars.
He is currently Associate Professor at the University of Nancy 2 (Institute of Technology, Computer Science Department) and permanent Researcher at LORIA (Lorraine Laboratory for Research in Computer Science and its Applications – UMR 7503 – CNRS – INRIA – Universities of Nancy). He is a member of the 'Language and Dialogue' team and has conducted several research activities on distributed software architectures and textual linguistic resources management. He is currently working in the context of distributed architectures and multimedia resources management. Dr Cruz-Lara has participated in several projects, in particular CNRS/SILFIDE and MLIS-ELAN, and he is at present co-leader of the 'Digital Museum' project. This is a joint project between LORIA and the National Chi-Nan University. The 'Digital Museum' project is sponsored by the National Science Council of the Republic of China (Taiwan) and supported by INRIA (France).
Costis Dallas, Critical Publics SA, Greece
Costis Dallas is Chairman and Senior Researcher of Critical Publics (http://www.criticalpublics.com), a group of companies active in the field of strategic communications, creative design and technology. He is currently a Lecturer in the Department of Communication and Mass Media of Panteion University, and has over 15 years of research and professional experience in hypermedia applications, human factor issues and cultural information systems. Dr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA, co-founder and Chair of the Multimedia Working Group of the International Council of Museums (CIDOC/ICOM), Head of Documentation and Systems of the Benaki Museum, General Director of the Foundation of the Hellenic World, Special Secretary of the Greek Ministry of Education in charge of libraries, archives and instructional technologies, and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: info@criticalpublics.com
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (http://www.nl.adlibsoft.com), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that he joined Databasix, producer of the ADLIB software. In 1991 Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was renamed later as ADLIB Information Systems.
ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include:
CIMI Dublin Core test bed - ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database, in addition to the 'standard' HTML/XML access.
SPECTRUM-XML - A project of the UK mda, together with a team of software vendors, to produce a schema for exchange of data which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software.
Internet Gelderse Musea (http://www.igem.nl) and Maritiem Digitaal (http://www.maritiemdigitaal.nl) - Two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.
E-mail: bert@nl.adlibsoft.com
Nicola Guarino, Italian National Research Council, Italy
Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991 and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS) and associated editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: Nicola.Guarino@ladseb.pd.cnr.it
Janneke van Kersen, Digital Heritage Association, The Netherlands
Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992 she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing, and more recently she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University.
Since October 1999 Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata, and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations, such as museums, archives, archaeological organisations, monuments and special collections of libraries, in a context-rich and XML-based environment. Interoperability from both a technical and an organisational perspective is therefore a major issue. The Association offers access to heritage information to the general public at http://www.cultuurwijzer.nl and customised access for the educational field at http://www.cultuurwijs.nl
E-mail: jannekevankersendennl
Marco Meli, EDW International, Italy
Marco Meli is CEO and co-founder of EDW International (http://www.edw-international.com), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia and VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meli@edw.it
Paul Miller, UKOLN, UK
Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, http://www.ukoln.ac.uk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK's Further and Higher Education Funding Councils and Resource: The Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI), and the Metadata Working Group of the UK Government's Office of the e-Envoy.
E-mail: P.Miller@ukoln.ac.uk
Frank Nack, CWI, The Netherlands
Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on 'The Application of Video Semantics and Theme Representation for Automated Film Editing' at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling, and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document, and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: Frank.Nack@cwi.nl
Franco Niccolucci, Florence University, Italy
Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference.
See also http://www.geog.port.ac.uk/hist-bound/people/niccolucci.htm
Seamus Ross, HATII, University of Glasgow, UK
Dr Seamus Ross is Director of Glasgow University's Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Network) (IST-2001-32706), a European Union funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously he was Assistant Secretary for Information Technology at the British Academy, and before that worked for a company specialising in expert systems and software development, as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group's PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: S.Ross@hatii.arts.gla.ac.uk
Andrea Scotti, Institute & Museum for the History of Science, Italy
Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995 he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.
Since 1996 Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza and the National Central Library, and under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (010201-300703), funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies.
See http://cweb.inria.fr/Projects/Mesmuses
E-mail: scottian@galileo.imss.firenze.it
DIGICULT PROJECT INFORMATION

DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over the period of 30 months (03/2002-08/2004).

In order to encourage early take-up, DigiCULT produces seven Thematic Issues and three Technology Watch Reports, along with the newsletter DigiCULT.Info.

DigiCULT draws on the results of the strategic study 'Technological Landscapes for Tomorrow's Cultural Economy (DigiCULT)', which was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications), in 2000 and completed in 2001.

Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at http://www.digicult.info

For further information on DigiCULT please contact the team of the project co-ordinator:

Mr Guntram Geser, guntram.geser@salzburgresearch.at, Phone: +43-(0)662-2288-303
Mr John Pereira, john.pereira@salzburgresearch.at, Phone: +43-(0)662-2288-247

Salzburg Research Forschungsgesellschaft
Jakob-Haringer-Str. 5/III, A - 5020 Salzburg, Austria
Phone: +43-(0)662-2288-200, Fax: +43-(0)662-2288-222
http://www.salzburgresearch.at

Project Partner:
HATII - Humanities Advanced Technology and Information Institute, University of Glasgow
http://www.hatii.arts.gla.ac.uk
Contact: Mr Seamus Ross, sross@hatii.arts.gla.ac.uk

The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom
DigiCULT Thematic Issue 1 - Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum, held in Barcelona on May 6th, 2002, in the context of the DLM-Conference 2002.

DigiCULT Thematic Issue 2 - Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum, held in Essen, Germany, on September 3rd, 2002, in the context of the AIIM Conference @ DMS EXPO.

DigiCULT Thematic Issue 3 - Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum, held on January 21st, 2003, at Fraunhofer IPSI, Darmstadt, Germany.

DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum, on Learning Objects, that will take place at the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, on July 2nd, 2003.
IMPRINT
This Thematic Issue is a product of the DigiCULT Project (IST-2001-34898).

Authors:
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy

Images:
Images for this Thematic Issue have been provided by, and are reproduced with kind permission of, the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, Netherlands.

Graphics & Layout:
Jan Steindl, Salzburg Research

ISBN 3-902448-00-8
Printed in Austria
© 2003
IMAGES
Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I. Image on p. 5 from Fol. 264r, size 80x75; illuminator: Orosius Master a.o.

Bible Historiale, Paris, c. 1320-1340, Volume I. Image on p. 7 from Fol. 2r, size 45x50; illum.: 'Sub-Fauvel' Master.

Historic Bible, Utrecht, c. 1430, Volume I. Images on pp. 8, 10, 14, 28 from Fol. 3r, 3v, 4v, 7v, size 55x85 to 60x85; illum.: Alexander Master o.a.

Psalter Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468. Images on pp. 11, 13: Wednesday Matins, Invitatorium, from Fol. 219r.

Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350. Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron) from Fol. 104rb1, 106rb2, 124r, 111ra, size 40x55 to 50x55.

Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335. Image on p. 21 from Fol. 4va1, size 45x55.

Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460. Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41.

Psalter, Normandy, c. 1180. Image on p. 26 from Fol. 3v, size 160x125.

Breviary, Cambray(?), c. 1275-1300. Images on pp. 27, 29, 31, 33, 34, 35, 37 from Fol. 211r, size 205x135.

© Koninklijke Bibliotheek, The Hague. Used with permission.
Espoo City Museum, Helsinki University Museum, National Board of Antiquities, Nokia, TietoEnator and the National Technology Agency (TEKES).
The major goals of the project are to make collection metadata, which stem from heterogeneous databases, semantically interoperable on the Web, and to provide facilities for semantic browsing and searching in the combined knowledge base of the participating museums.

The project's vision is called "Finnish Museums on the Semantic Web" (FMS), and its architecture allows for all Finnish museums to join in. However, in an approach of starting small but ambitious, the project is at present using the collection databases of two museums, the Espoo City Museum and the National Museum of Finland. Furthermore, the implementation is currently restricted to only one part of the collections: textiles.

In order to reach the FMS system's goal of making the museums' metadata semantically interoperable on the Web, the data must be harmonised on the syntactic and the semantic level. For this harmonisation the Extensible Markup Language (XML) and the Resource Description Framework (RDF) are being used, of which RDF is the key language for achieving semantic interoperability of the heterogeneous sets of metadata.
RDF and Metadata - A Natural Fit
An observer of the diffusion of the Resource Description Framework (RDF) into various domains, Robert DuCharme, has commented: 'I still find it a little ironic that, while RDF has gotten so much publicity as a technology for warm and fuzzy AI (Artificial Intelligence) pie-in-the-sky technology, it's gotten most of its traction in the mundane world of metadata.'[3]
Yet, given the importance of metadata for the Semantic Web vision in general, it does not come as a surprise that metadata of key information communities belong to the first of RDF's intended uses. RDF seems to gain momentum in particular among the library and other communities that use Dublin Core.
The current W3C RDF Primer (Working Draft, 23 January 2003), edited by Frank Manola and Eric Miller, labels RDF as 'an ideal representation for Dublin Core information' and describes Dublin Core as one of its 'RDF in the field' examples (cf. http://www.w3.org/TR/rdf-primer/).
At the Dublin Core Metadata Initiative (DCMI), 'Expressing Simple Dublin Core in RDF/XML' was announced as a DCMI Recommendation in October 2002, 'the first in a series of recommendations for encoding Dublin Core metadata using mainstream Web technologies', i.e. XML/RDF/XHTML. 'Expressing Qualified Dublin Core in RDF/XML' is currently a Proposed Recommendation. Cf. http://dublincore.org/groups/architecture/
Syntactic Transformation 1: Creating the XML Documents
In the FMS system the Extensible Markup Language (XML) is used as the data transfer format. This transfer format enables the system to make use of the data originally stored in the museums' heterogeneous collection databases. Therefore, each museum participating in the FMS initiative provides the relevant collection data as an XML document repository.
In a process of syntactic harmonisation, the data from a museum's collection database are retrieved and transformed to an XML format conforming to the XML Schema of the FMS initiative.

The data to be published are read from the database through a 'view', which helps create the XML format. The view is a queryable interface: a virtual table that results from an SQL query, which may join multiple tables of the database. Through the view the data are queried so that the rows of the tables are grouped by collection item. For each item, the set of rows is combined into a single XML document.
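The view-based export described above can be sketched in a few lines. This is a minimal illustration, not the FMS implementation: the table layout, the view name `item_view`, the helper names and the in-memory SQLite database are all assumptions, and the namespace is the fictitious m-i.org one used in this primer.

```python
import sqlite3
import xml.etree.ElementTree as ET
from itertools import groupby

# Fictitious namespace from the primer's example; "mi" = medieval images.
MI_NS = "http://www.m-i.org/images"
ET.register_namespace("mi", MI_NS)

FIELDS = ["type", "subject", "iconclass", "creator", "manuscript", "place", "year"]


def build_demo_database():
    """Create a hypothetical two-table database plus a view that joins them."""
    con = sqlite3.connect(":memory:")
    con.row_factory = sqlite3.Row
    con.executescript("""
        CREATE TABLE manuscripts (manuscript_id INTEGER PRIMARY KEY,
                                  manuscript TEXT, place TEXT, year TEXT);
        CREATE TABLE items (item_id TEXT, type TEXT, subject TEXT,
                            iconclass TEXT, creator TEXT, manuscript_id INTEGER);
        INSERT INTO manuscripts VALUES (1, 'Historic Bible', 'Utrecht', 'circa 1430');
        INSERT INTO items VALUES ('image5kb78d38i', 'Column Miniature',
                                  'Eve emerges from Adam''s body', '71A3421',
                                  'Alexander Master', 1);
        CREATE VIEW item_view AS
            SELECT item_id, type, subject, iconclass, creator,
                   manuscript, place, year
            FROM items JOIN manuscripts USING (manuscript_id);
    """)
    return con


def rows_to_xml(con):
    """Group the view's rows by item and build one XML document per item."""
    rows = con.execute("SELECT * FROM item_view ORDER BY item_id").fetchall()
    documents = {}
    for item_id, group in groupby(rows, key=lambda row: row["item_id"]):
        group = list(group)
        root = ET.Element(f"{{{MI_NS}}}image", {"image_id": item_id})
        for field in FIELDS:
            # An item may span several rows; emit each distinct value once.
            values = []
            for row in group:
                if row[field] not in values:
                    values.append(row[field])
            for value in values:
                ET.SubElement(root, f"{{{MI_NS}}}{field}").text = value
        documents[item_id] = ET.tostring(root, encoding="unicode")
    return documents
```

Calling `rows_to_xml(build_demo_database())` returns a dictionary with one serialised document per item identifier. Grouping by `item_id` is what allows an item that occupies several view rows (for instance, one row per subject) to end up in a single document.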
XML
XML is a markup language for describing data. It is a language created to allow anyone to design the structure of their own documents. An XML document contains text that consists of markup in the form of tags and plain text between them, the latter being just pure information (for example, <creator>Alexander Master</creator>). XML tags are not predefined; everyone can define his or her own tags.
FMS Documents: A short presentation is provided in Eero Hyvönen et al.: Cultural Semantic Interoperability on the Web. Case Finnish Museums Online, http://iswc2002.semanticweb.org/posters/hyvonen_a4.pdf; for detailed descriptions see Vilho Raatikka, Eero Hyvönen: Ontology-based Semantic Metadata Validation, and Hyvönen, Eero et al.: Semantic Interoperability on the Web. Case Finnish Museums Online. Both texts can be found in: Towards the Semantic Web and Web Services. Proceedings of the XML Finland 2002 Conference, http://www.cs.helsinki.fi/u/eahyvone/xmlfinland2002/Proceedings/XML2002-final.pdf
ITEM_VIEW (columns of the database view): item_id, type, subject, iconclass, creator, manuscript, place, year
XML shares the syntax and bracketed tags of the well-known HyperText Markup Language (HTML), but XML serves a different goal. While HTML is used to define the layout of pages on the WWW, XML is used to define the content of documents, for example to specify that an area of text is the name of a creator.

XML allows for creating markup (e.g. <creator>) that seems to carry some semantics. However, for a computer a tag like <creator> carries as much semantics as a tag like <H1>. A computer simply does not know what a creator is and how the concept creator is related to other concepts (e.g. manuscript). For an XML processor, <H1> and <creator> or <manuscript> are all equally (and totally) meaningless. XML is all about describing data; on its own it does not do anything. There needs to be a processing program that uses the markup to interpret the various elements.
The graphic below illustrates the database rows to XML process as described in the info box. It provides a very simple example of an XML document that describes some data for one of the medieval column miniatures from the Koninklijke Bibliotheek, The Hague, which we were permitted to use for illustrating this Thematic Issue. It includes the Iconclass classification for this image: 71A3421 Eve emerges from Adam's body (for the hierarchical path of this classification see the section on ontologies).

Short explanations for the XML document (image5kb78d38i.xml) shown in the graphic below:

A well-formed XML document is one that conforms to the XML syntax rules, of which we would like to highlight the following:
(1) The document should begin with the XML declaration (optional in XML 1.0, but recommended), which defines the XML version and the character encoding used in the document. In the example below we use <?xml version="1.0" encoding="ISO-8859-1"?>, i.e. the document conforms to the 1.0 specification of XML and uses the ISO-8859-1 (Latin-1/West European) character set.
(2a, 10) The XML document must contain a single tag pair to define a root element, in our example <mi:image> ... </mi:image>.
Graphic 1: Database rows to XML process

Item data from database rows (database rows grouped by item):
image5kb78d38i, 'Column Miniature', 'Eve emerges from Adam's body', '71A3421', 'Alexander Master', 'Historic Bible', 'Utrecht', 'circa 1430'

Rows to XML process

XML document (image5kb78d38i.xml):
(1)  <?xml version="1.0" encoding="ISO-8859-1"?>
(2)  <mi:image xmlns:mi="http://www.m-i.org/images"
         image_id="image5kb78d38i">
(3)    <mi:type>Column Miniature</mi:type>
(4)    <mi:subject>Eve emerges from Adam's body</mi:subject>
(5)    <mi:iconclass>71A3421</mi:iconclass>
(6)    <mi:creator>Alexander Master</mi:creator>
(7)    <mi:manuscript>Historic Bible</mi:manuscript>
(8)    <mi:place>Utrecht</mi:place>
(9)    <mi:year>circa 1430</mi:year>
(10) </mi:image>
(2b) Namespace: Since element names in XML are not fixed, name conflicts can occur when different documents use the same names for different types of elements. To prevent such conflicts, a unique namespace should be defined using a Uniform Resource Identifier (URI). An XML namespace is a collection of names that are used as element types and attribute names (cf. http://www.w3.org/TR/REC-xml-names). Our namespace declaration in the start tag of the root element is xmlns:mi="http://www.m-i.org/images".

The namespace prefix mi (for medieval images) functions as a placeholder for the namespace name. It needs to show up in all element tags (e.g. <mi:type> ... </mi:type>).
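To illustrate how namespaces keep identically named elements apart, the following sketch parses a small record in which two vocabularies both define a `creator` element. The second namespace (`http://example.org/software`), its `sw` prefix and the record itself are invented for this illustration.

```python
import xml.etree.ElementTree as ET

# Two vocabularies both use the local name "creator"; only the
# namespace URI tells them apart. The "sw" namespace is hypothetical.
DOC = """<?xml version="1.0"?>
<record xmlns:mi="http://www.m-i.org/images"
        xmlns:sw="http://example.org/software">
  <mi:creator>Alexander Master</mi:creator>
  <sw:creator>export-tool 1.2</sw:creator>
</record>"""


def creators_by_namespace(xml_text):
    """Map each namespace URI to the text of its 'creator' element."""
    root = ET.fromstring(xml_text)
    result = {}
    for child in root:
        # ElementTree expands a prefixed name to "{namespace-uri}localname".
        uri, _, local = child.tag[1:].partition("}")
        if local == "creator":
            result[uri] = child.text
    return result
```

The parser never sees the prefixes `mi` or `sw` as part of the name; it works with the full namespace URI, which is why the prefix is only a placeholder.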
(2c) image_id="image5kb78d38i": This is the id attribute, which contains the unique identifier of the data source record.

(3-9) All other elements must be within the root element and can themselves have sub-elements (child elements), which must be properly nested within their parent element. Our elements do not have sub-elements.
Other syntax rules are, for example: all start tags must match their end tags; because XML tags are case sensitive (i.e. the tag <mi:Creator> is different from the tag <mi:creator>), they must also be written with the same case; all elements must have a closing tag; all attribute values must be within quotation marks (e.g. "image5kb78d38i").
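These rules can be tried out with any XML parser, since a conforming parser rejects documents that are not well-formed. The sketch below uses Python's standard library parser; the helper name `is_well_formed` and the sample strings are illustrative only.

```python
import xml.etree.ElementTree as ET

NS_DECL = 'xmlns:mi="http://www.m-i.org/images"'


def is_well_formed(xml_text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# A well-formed document with a single root element.
GOOD = (f'<mi:image {NS_DECL} image_id="image5kb78d38i">'
        "<mi:creator>Alexander Master</mi:creator></mi:image>")

# Start and end tag differ in case.
CASE_MISMATCH = (f"<mi:image {NS_DECL}>"
                 "<mi:Creator>Alexander Master</mi:creator></mi:image>")

# The creator element is never closed.
UNCLOSED = f"<mi:image {NS_DECL}><mi:creator>Alexander Master</mi:image>"

# The attribute value is not within quotation marks.
UNQUOTED = f"<mi:image {NS_DECL} image_id=image5kb78d38i></mi:image>"

# Two root elements instead of one.
TWO_ROOTS = "<a></a><b></b>"
```

Only `GOOD` passes the check; each of the other strings violates exactly one of the rules listed above.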
Syntactic Transformation 2: The XML Schema
In order to allow for syntactic harmonisation, the XML documents of the museums should conform to the XML Schema of the FMS initiative. Therefore, the museums use the initiative's XML Schema when they create their XML documents, validating them against the Schema. If the documents are valid, the process can continue to the semantic level.
XML Schema
The XML Schema defines the building blocks of an XML document, including:
| elements and attributes that can appear in a document;
| which elements are child elements, as well as their order and number;
| whether an element is empty or can include text;
| the data types for elements and attributes;
| as well as default and fixed values for elements and attributes.
XML with an XML Schema is designed to be self-descriptive. One of the greatest strengths of XML Schema is that it allows for data typing. The most common data types are xs:string, xs:decimal, xs:integer, xs:boolean, xs:date and xs:time. In the example below, which is the XML Schema for the XML document (image5kb78d38i.xml) shown in graphic 1, we only use the data type xs:string. This data type is used for values that contain character strings.
(1)  <?xml version="1.0"?>
(2a) <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
(2b)     targetNamespace="http://www.m-i.org/images"
(2c)     xmlns="http://www.m-i.org/images"
(2d)     elementFormDefault="qualified">
(3)    <xs:element name="image">
(4)      <xs:complexType>
(5)        <xs:sequence>
(6)          <xs:element name="type" type="xs:string"/>
(7)          <xs:element name="subject" type="xs:string"/>
(8)          <xs:element name="iconclass" type="xs:string"/>
(9)          <xs:element name="creator" type="xs:string"/>
(10)         <xs:element name="manuscript" type="xs:string"/>
(11)         <xs:element name="place" type="xs:string"/>
(12)         <xs:element name="year" type="xs:string"/>
(13)       </xs:sequence>
(14)       <xs:attribute name="image_id" type="xs:string" use="required"/>
(15)     </xs:complexType>
(16)   </xs:element>
(17) </xs:schema>

Short explanations
(1) The XML declaration, which states that the document conforms to the 1.0 specification of XML.
(2a) Determines that the elements and data types that are used to construct the schema come from the W3C's XML Schema namespace. Consequently, each of the elements and data types in the schema has the prefix xs:, which identifies them as belonging to the vocabulary of the XML Schema language rather than the vocabulary of our (fictitious) organisation M-i.org.
(2b) Indicates that the elements defined by this schema come from our http://www.m-i.org/images namespace.
(2c) Is our default namespace.
(2d) Demands that any elements used by an XML document which were declared in this schema must be namespace qualified.
(3, 16) The parent element for the data typing of the image descriptions we provide at http://www.m-i.org/images. It is (4) defined as an xs:complexType, i.e. it contains child elements (6-12), which are (5, 13) surrounded by an xs:sequence element that defines an ordered sequence of these elements.
(6-12) The child elements, which in our example are simple types because they do not contain other elements. They define the various elements of our XML documents, e.g. "creator", to be of the data type xs:string.
(14) Furthermore, the Schema determines that for the element <xs:element name="image"> there is a required attribute image_id of type="xs:string" (for example, image5kb78d38i).
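The constraints this schema expresses can be made concrete with a small hand-written check. Note the assumptions: this is not an XML Schema validator (a schema-aware processor would read the schema itself), and the function name and error messages are invented here. It merely re-states the schema's three rules, namely a root element `image` in the m-i.org namespace, a required `image_id` attribute, and an ordered sequence of seven text-only child elements.

```python
import xml.etree.ElementTree as ET

MI_NS = "http://www.m-i.org/images"
# Order taken from the xs:sequence in lines (6)-(12) of the schema.
EXPECTED_CHILDREN = ["type", "subject", "iconclass", "creator",
                     "manuscript", "place", "year"]

VALID = (
    '<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">'
    "<mi:type>Column Miniature</mi:type>"
    "<mi:subject>Eve emerges from Adam's body</mi:subject>"
    "<mi:iconclass>71A3421</mi:iconclass>"
    "<mi:creator>Alexander Master</mi:creator>"
    "<mi:manuscript>Historic Bible</mi:manuscript>"
    "<mi:place>Utrecht</mi:place><mi:year>circa 1430</mi:year>"
    "</mi:image>"
)


def check_against_schema_rules(xml_text):
    """Return a list of violations; an empty list means the checks passed."""
    problems = []
    root = ET.fromstring(xml_text)
    if root.tag != f"{{{MI_NS}}}image":
        problems.append("root element must be 'image' in the m-i.org namespace")
    if root.get("image_id") is None:
        problems.append("required attribute image_id is missing")
    expected = [f"{{{MI_NS}}}{name}" for name in EXPECTED_CHILDREN]
    if [child.tag for child in root] != expected:
        problems.append("child elements must follow the xs:sequence order exactly")
    for child in root:
        if len(child) > 0:
            problems.append(f"{child.tag} must contain only text")
    return problems
```

A document that passes returns an empty list; dropping the `image_id` attribute or swapping two child elements each produces one reported violation.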
Major Benefits of XML
In the context of the Semantic Web, XML provides an interoperable syntactical foundation upon which solutions to the issues of representing relationships and meaning can be built. We also want to highlight the many benefits of XML that are adding to its rapid uptake in the first place, and might in the longer term be supportive in realising the Semantic Web vision on a broader scale.

XML is one of the most important standards developments in recent years. It is an international, universal, non-system and non-application specific data exchange standard. XML is international because it employs Unicode. This means that there is no restriction to the western alphabet; Arabic, Chinese, Greek, Hebrew, Thai, etc. can be easily integrated.

XML is non-system specific because it is an open standard set by the World Wide Web Consortium (W3C). As such, there is no owner of XML. All the major software suppliers support it, and it can be used on any computing platform, from Windows and MacOS to Linux. This makes it easier for organisations to change systems or combine different systems.

XML is also non-application specific, i.e. it can be used in various applications such as data exchange, data harvesting, Web site management, etc. XML is gaining ever-wider acceptance in many application domains, including in particular the cultural heritage community.

Bear in mind also that the major collection management software producers have implemented support for XML in their systems, enabling, for example, the integration of data from different collections and their combination over the Web.

XML allows for multi-channel publishing, i.e. with XML it is easy to produce different products or services from digital cultural heritage assets. Once the data are structured in XML, they can be displayed across a variety of media using an associated style sheet that contains the display information.
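The idea of multi-channel publishing can be sketched as one XML record rendered in two different ways. In practice the display rules would normally live in a style sheet (for example XSLT or CSS); here two small Python functions stand in for the style sheets, and the record, function names and output layouts are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from html import escape

MI = "{http://www.m-i.org/images}"
RECORD = (
    '<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">'
    "<mi:subject>Eve emerges from Adam's body</mi:subject>"
    "<mi:creator>Alexander Master</mi:creator>"
    "<mi:year>circa 1430</mi:year>"
    "</mi:image>"
)


def fields(xml_text):
    """Read the record once into a plain dictionary keyed by local name."""
    root = ET.fromstring(xml_text)
    return {child.tag.replace(MI, ""): child.text for child in root}


def render_html(xml_text):
    """Channel 1: a Web page fragment."""
    f = fields(xml_text)
    return (f"<h1>{escape(f['subject'])}</h1>"
            f"<p>{escape(f['creator'])}, {escape(f['year'])}</p>")


def render_caption(xml_text):
    """Channel 2: a one-line caption, e.g. for a printed catalogue."""
    f = fields(xml_text)
    return f"{f['subject']} ({f['creator']}, {f['year']})"
```

The data are stored once; only the rendering function changes per channel, which is the separation of content and presentation that the style sheet approach relies on.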
Finally, XML can be used to create new languages. For example, the Wireless Markup Language (WML), which is used to mark up Internet applications for handheld devices, is written in XML.
Ontological concepts
The goal of the FMS project is to make metadata of the museums' textiles collections semantically interoperable on the Web. In order to achieve such interoperability, an ontology is being designed that describes the common (lower-level) ontological concepts in this domain of knowledge.[4]

Ontology
In the Semantic Web architecture the semantic relationships are not embedded but explicitly represented by an ontology, or rather an interrelated set of ontologies. In fact, the wide array of information residing on the Web, and the perceived need to make it more machine-processable, have acted as a strong impetus for the development of ontology languages.

Yet what is an ontology? An abstract definition of an ontology is that it describes a formal, shared conceptualisation of a particular domain of interest, for example cultural heritage objects held in art museums. In particular, an ontology allows for constraining, expressing and analysing the intended meaning of the shared vocabulary of concepts and relations in a domain of knowledge.[5]
If these concepts and relations are formalised to a high degree, the domain has at hand a major building block for developing semantically aware information systems. With Semantic Web technologies, the domain ontology can be made available on the network, cross-referenced with upper-level and other domain ontologies,[6] and remote applications or intelligent software agents can refer to it when they interact to provide a certain information service.

[4] The available documents (in English) on the FMS initiative state that their ontology is being created using RDF Schema (RDFS). To develop a fully fledged ontology, advanced languages such as DAML+OIL or Web Ontology Language (OWL) would be required.
[5] For more elaborate and formal descriptions see Tom Gruber: What is an Ontology? (1995), http://www-ksl.stanford.edu/kst/what-is-an-ontology.html; Nicola Guarino: Ontology-Driven Conceptual Modelling, part 1-3 (2002), http://ontology.ip.rm.cnr.it/Tutorials
The degree of formalisation of concepts and their relations varies considerably between different domains of knowledge. At the lower end one finds lexicons and simple taxonomies (i.e. an ordered classification system where terms are related hierarchically). At the middle level one might place thesauri, i.e. controlled vocabularies that are structured to show relationships between terms and concepts and, for example, allow for retrieving them from a database. At the high end of formalisation of knowledge there are axiomatised logic theories. Such theories include rules to ensure the well-formedness and logical validity of statements expressed in the language of the scientific discipline.

In the cultural heritage sector, a powerful IT-supported example of a hierarchical classification system is Iconclass. This supports the documentation of images, in particular art historical images, by providing a systematic collection of 28,000 ready-made definitions of objects, persons, events, situations and abstract ideas that can be the subject of an image. The definitions consist of an alphanumeric classification code and its textual correlate.[7]

For example, for the image we have described in XML in graphic 1, the Iconclass definition is 71A3421 Eve emerges from Adam's body. The Medieval Illustrated Manuscripts Website of the Koninklijke Bibliotheek, The Hague, has an Iconclass Browser in place[8] that provides the hierarchical path for this concept in the classification system.
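For this particular concept, the hierarchical path can be derived from the classification code itself, because each broader concept in the path carries a code that is a prefix of the narrower one. The sketch below relies on that observation only for the seven levels quoted in this primer; it is not a statement about Iconclass notation in general, and the function name is illustrative.

```python
# The seven levels of the path for 71A3421 as given in this primer.
ICONCLASS = {
    "7": "Bible",
    "71": "Old Testament",
    "71A": ("Genesis from the creation to the expulsion from paradise "
            "and later years of Adam and Eve"),
    "71A3": "creation of man, the Garden of Eden",
    "71A34": "creation of Eve",
    "71A342": "Eve is fashioned from Adam's rib",
    "71A3421": "Eve emerges from Adam's body",
}


def hierarchical_path(notation, definitions=ICONCLASS):
    """Return [(code, label), ...] from the broadest concept down to `notation`."""
    path = []
    for length in range(1, len(notation) + 1):
        prefix = notation[:length]
        if prefix in definitions:
            path.append((prefix, definitions[prefix]))
    return path
```

Calling `hierarchical_path("71A3421")` walks from "Bible" down to "Eve emerges from Adam's body", which is the kind of path a classification browser displays.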
An outstanding example of a controlled vocabulary is the Art & Architecture Thesaurus (AAT), one of the Getty Research Institute's Vocabulary Databases. It is a structured vocabulary of more than 125,000 terms and other information about concepts that are used for describing fine art, architecture, decorative arts, archival materials and material culture.[9]
The W3C's Resource Description Framework (RDF) Schema Specification 1.0, in a section on its scope, mentions concept navigation and states: 'Thesauri and library classification schemes are well known examples of hierarchical systems for representing subject taxonomies in terms of the relationships between named concepts. The RDF Schema specification provides sufficient resources for creating RDF models that represent the logical structure of thesauri (and other library classification systems).'[10]
Yet for realising a full-blown cultural heritage ontology for the Semantic Web there are currently limitations on both sides. On the one hand, hierarchical classification systems and structured vocabularies do not lend themselves easily to rich inter-linking of conceptual 'trees'.

A major step further in this direction is the CIDOC object-oriented Conceptual Reference Model (CRM).[11] This provides an ontology of 81 classes and 130 properties, which describes in a formal language concepts and relations relevant to the documentation of cultural heritage.

On the other hand, RDF Schema has limitations when it comes to expressing complex ontological relationships. New languages based on description logics are being developed. These include DAML+OIL and the upcoming Web Ontology Language (OWL), which are capable of fully describing ontologies.

Also worth highlighting is that tools for ontology building are proliferating at the present time. These ontology editors need to be carefully assessed, as their capabilities differ considerably.[12]
Hierarchical path of the Iconclass classification 71A3421:
7 Bible
71 Old Testament
71A Genesis from the creation to the expulsion from paradise and later years of Adam and Eve
71A3 creation of man, the Garden of Eden (Genesis 1:26-2)
71A34 creation of Eve
71A342 Eve is fashioned from Adam's rib
71A3421 Eve emerges from Adam's body
[6] Upper-level ontologies describe the basic concepts and relationships invoked when information about any domain is expressed in natural language.
[7] For in-depth information see the official Iconclass Website: http://www.iconclass.nl
[8] Medieval Illustrated Manuscripts Website: http://www.kb.nl/kb/manuscripts/browser. The subject access system for the Website was conceived by Mnemosyne Partners, building on the Iconclass classification system and technologies. See the valuable information they provide at http://www.mnemosyne.org/business/msstempex.html
[9] A detailed description of the AAT is provided at http://www.getty.edu/research/tools/vocabulary/aat/about.html
[10] http://www.w3.org/TR/2000/CR-rdf-schema-20000327
[11] See the information box at the end of the Forum discussion and the sources mentioned in the Semantic Web Terms and Reading List.
[12] Michael Denny: Ontology Building: A Survey of Editing Tools (06-11-2002), http://www.xml.com/pub/a/2002/11/06/ontologies.html
Semantic Transformation 1: The RDF Data Model
Progressing towards semantic interoperability, the metadata in the XML documents are now transformed into RDF statements corresponding to the RDF data model. With these so-called RDF 'triples', the XML metadata elements are mapped to the RDF classes and properties which are defined by the RDF Schema of the FMS initiative (see section 3).

'XML is nothing more than a way to standardize data formats. This is not to underplay XML's importance. A data-format standard makes all of the more glamorous technologies possible, and RDF is the leading example of the benefit that comes once the data format has been standardized. Many proclaim that RDF is really the XML's killer app, and with good reason. Despite all this, RDF remains somewhat obscure. This is mainly because at its core RDF is very abstract, very dry and very academic.'
Uche Ogbuji: An introduction to RDF (2000), http://www-106.ibm.com/developerworks/library/w-rdf/?dwzone=xml
RDF Data Model
In order to make Web resources semantically interoperable, we need resources that provide machine-understandable information about themselves. In the Semantic Web architecture these statements are built by using the Resource Description Framework (RDF).
RDF defines a data model for the statements, describing typed relationships between uniquely identified resources. RDF distinguishes between:

| resources: familiar examples are a Web page, an electronic document or a digital image, but in RDF also entities that are not 'network retrievable', e.g. museums, curators or bound medieval manuscripts, can be resources;

| properties: these identify a specific aspect, characteristic, attribute or relation used to describe the resource;

| statements: these associate a value for a named property with the resource.

Hence RDF provides a model for describing relationships between resources in terms of named properties and values. The RDF data model intrinsically supports only binary relations. Its base element is the 'triple', which takes the form of subject, predicate, object: a resource (the subject) is linked to another resource (the object) through an arc labelled with a third resource (the predicate). The semantics of a triple clearly depends on the property used as predicate.

A convenient way to visualise this is to draw nodes for subject and object and an arrow between them for the predicate (see graphic 2). In this labelled directed graph, subject and predicate (property) are Uniform Resource Identifiers (URIs), and the object is either a URI or a literal (which is drawn as a box). Everything in RDF can be represented by a graph with nodes and arcs, and the data model allows for using the same URI as a node and as an arc label. To represent RDF statements in a machine-processable way, RDF builds on XML. With RDF/XML, a specific XML markup language, RDF information can be represented and exchanged between machines.
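The triple structure described above can be modelled directly as a small data type. The sketch below follows the description in this box (subject and predicate are URIs, the object is a URI or a literal). The class names, the image URI, and the `hasPlace` property are hypothetical; `hasCreator` is the example property used later in this primer.

```python
from typing import NamedTuple, Union


class URI(str):
    """A resource identifier (a node or an arc label in the graph)."""


class Literal(str):
    """A plain value (drawn as a box); only allowed in object position."""


class Triple(NamedTuple):
    subject: URI
    predicate: URI
    obj: Union[URI, Literal]


def make_triple(subject, predicate, obj):
    """Build a triple, enforcing which positions may hold a literal."""
    if not isinstance(subject, URI) or not isinstance(predicate, URI):
        raise TypeError("subject and predicate must be URIs")
    if not isinstance(obj, (URI, Literal)):
        raise TypeError("object must be a URI or a literal")
    return Triple(subject, predicate, obj)


MI = "http://www.m-i.org/schemas/images#"
image = URI("http://www.m-i.org/images/image5kb78d38i")  # hypothetical URI

# A graph is simply a set of triples.
graph = {
    make_triple(image, URI(MI + "hasCreator"), Literal("Alexander Master")),
    make_triple(image, URI(MI + "hasPlace"), Literal("Utrecht")),
}


def objects(graph, subject, predicate):
    """Return all values linked to `subject` through `predicate`."""
    return sorted(t.obj for t in graph
                  if t.subject == subject and t.predicate == predicate)
```

Because every statement has the same three-part shape, statements from different sources can be merged by taking the union of their sets, which is one reason the model suits combining metadata from several museums.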
Graphic 2: RDF Data Model
Triple 1
Subject: http://www.m-i.org/schemas/images#ColumnMiniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Miniature
Triple 2
Subject: http://www.m-i.org/schemas/images#Miniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Image
With these two triples we state that ColumnMiniature is a subclass of Miniature and that Miniature is a subclass of Image.
The predicate in our statements is the rdfs:subClassOf property, which is predefined in the RDF Schema namespace http://www.w3.org/2000/01/rdf-schema#
The classes Image, Miniature and ColumnMiniature would also need to be defined in our RDF Schema namespace http://www.m-i.org/schemas/images#
For details on how to use RDF Schema for defining your domain ontology see section 3. In a nutshell, RDF Schema is a ‘higher-level’ language which is itself defined using RDF.
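The two triples of Graphic 2 can be written down directly as data. The following is a minimal sketch in plain Python, not an RDF library and not part of the FMS tooling. It assumes the example namespace reads http://www.m-i.org/schemas/images# (reconstructed from the primer's text); the helper names `objects` and `nodes` are invented for illustration.

```python
# The RDFS namespace is the W3C one; MI is the primer's example namespace
# (its exact spelling is an assumption of this sketch).
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
MI = "http://www.m-i.org/schemas/images#"

# Each statement is a (subject, predicate, object) tuple of URIs.
triples = {
    (MI + "ColumnMiniature", RDFS + "subClassOf", MI + "Miniature"),
    (MI + "Miniature", RDFS + "subClassOf", MI + "Image"),
}


def objects(graph, subject, predicate):
    """Return every object linked to `subject` by an arc labelled `predicate`."""
    return {o for s, p, o in graph if s == subject and p == predicate}


def nodes(graph):
    """Return all nodes of the graph, i.e. every subject and every object."""
    return {s for s, _, _ in graph} | {o for _, _, o in graph}


if __name__ == "__main__":
    print(objects(triples, MI + "ColumnMiniature", RDFS + "subClassOf"))
    print(sorted(nodes(triples)))
```

Storing the statements as a set of tuples mirrors the data model: the graph is nothing more than the collection of its arcs, and the same URI may appear as a node in one triple and as an arc label in another.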
Semantic Transformation 2: Creating and Validating the RDF Statements
In the mapping process an editor tool (in the FMS case a tool developed by the project team, called Meedio) receives the XML documents as input and assists in transforming them into semantically valid RDF statements (instance descriptions). The tool serves as an instance editor, which provides a convenient way of finding and selecting, from the XML metadata elements, correct instance values for a particular property. The editor tool also serves as a semantic metadata validator. When the museum cataloguer saves the set of RDF statements corresponding to an XML document, the semantics of these statements are validated against the property constraints of the FMS ontology. The result of a successful mapping and validation process is a unique set of RDF triples, called the RDF card.
RDF Mapping Rules
When RDF is used to define the meaning of XML metadata elements, a set of mapping rules is created. A mapping rule is a template of RDF triples where XPath expressions are used to identify the actual element values. XPath is a language for addressing parts of an XML document and was designed for use in XML parsing software (XSLT, XPointer and others).
When applying such a rule to an XML document, the XPath expressions are instantiated with matching element values. If the rule matches, the RDF template evaluates to a set of RDF triples where XPath expressions are substituted by the corresponding values of the XML elements. For example, by applying the template rule <image/5kb78d38i/type, mi:hasCreator, image/5kb78d38i/creator> to the XML document described in the section on XML, the following result would be obtained: <Image Miniature, mi:hasCreator, ‘Alexander Master’>
mi:hasCreator is an example of an RDF property. Such properties are explained in section 3, RDF Schema.
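How a mapping rule of this kind is instantiated can be sketched with Python's standard `xml.etree.ElementTree` module, which supports a small subset of XPath. This is an illustration under assumptions, not the Meedio implementation: the XML record, the `XPath` marker class and the `apply_rule` function are hypothetical, and the rule paths are simplified to relative paths.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML metadata record; the primer's actual document is not shown here.
XML_RECORD = """
<image id="5kb78d38i">
  <type>Miniature</type>
  <creator>Alexander Master</creator>
</image>
""".strip()


class XPath(str):
    """Marks a template slot that must be filled from the XML document."""


# Triple template: subject and object are looked up, the predicate is fixed.
RULE = (XPath("./type"), "mi:hasCreator", XPath("./creator"))


def apply_rule(rule, root):
    """Instantiate a triple template; return None if any path has no match."""
    triple = []
    for part in rule:
        if isinstance(part, XPath):
            value = root.findtext(part)
            if value is None:
                return None  # the rule does not match this document
            triple.append(value.strip())
        else:
            triple.append(part)
    return tuple(triple)


root = ET.fromstring(XML_RECORD)
result = apply_rule(RULE, root)

if __name__ == "__main__":
    print(result)
```

A rule whose path finds no element yields no triple at all, which corresponds to the primer's condition "if the rule matches".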
Note: Due to the limited space permitted, we do not address issues of term mapping. This is an important aspect of the mapping and validation process carried out in the FMS project. Working with metadata from different museums, the project team needs to deal with partly different terminologies. Their technical solution to synonymous terms (i.e. different terms referring to the same concept) is to attach synonym sets to the FMS ontology classes. Where polysemous terms occur (i.e. the same term refers to different concepts), the editor tool cannot cope and the cataloguer needs to select the correct interpretation.
Semantic Transformation 3: The RDF Schema (RDFS)
The shared ontology for the textiles domain is created by using Resource Description Framework Schema (RDFS). An RDF Schema is a tool for indicating the classes of resources one wants to describe as well as for defining the properties used to describe those resources. Furthermore, class/sub-class relationships and property/sub-property relationships can be defined. The museums are mapping their metadata to the classes and properties defined by the RDF Schema of the FMS initiative. Thereby they are making the meaning of the metadata explicit and representing them in a harmonised, uniform way.
RDF Schema
In section Semantic Transformation 1 we have described the data model provided by RDF for expressing statements about Web resources. But we also need a vocabulary for the RDF statements, namely classes and properties defined with RDF Schema (RDFS).
In brief, the RDF Schema mechanism provides a pre-defined vocabulary, a basic type system that can be used in creating domain-specific schemas. Its role is to allow for declaring metadata properties (e.g. for ‘type’, ‘subject’ or ‘creator’), to define the classes of resources they may be used with, to restrict possible combinations, and to detect violations of those restrictions.
Defining classes
With RDF Schema (RDFS), Web resources can be defined as instances of one or more classes. In addition, classes can be organised in a hierarchical fashion. As we hold a collection of digital images drawn from illustrated medieval manuscripts, we first need to define a class of things that are images. In RDF Schema a class is any resource having an rdf:type property whose value is the RDFS-defined resource rdfs:Class.
So, using the basic RDF data model, we define:
mi:Image [resource] rdf:type [property] rdfs:Class [value]
The self-defined prefix mi: (for medieval images) stands for the URI reference of our RDF Schema namespace http://www.m-i.org/schemas/images#
In our image collection we have various special kinds of digitised images, such as column miniatures, decorated initials, schematic drawings, etc. To distinguish, for example, the miniatures, first we need to define a general class Miniature and subclasses of miniatures, e.g. a subclass ColumnMiniature:
mi:Miniature rdf:type rdfs:Class
mi:ColumnMiniature rdf:type rdfs:Class
Secondly, we need to define that mi:ColumnMiniature is a subclass of mi:Miniature and that mi:Miniature is a subclass of mi:Image, for which we use the predefined rdfs:subClassOf property:
mi:Miniature rdfs:subClassOf mi:Image
mi:ColumnMiniature rdfs:subClassOf mi:Miniature
As the rdfs:subClassOf property is transitive, this means that mi:ColumnMiniature is also implicitly a subclass of mi:Image.
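The implicit statement follows mechanically from the two explicit ones. A minimal sketch of that inference in Python is given below; the dictionary layout and the function name `superclasses` are choices made for this illustration and are not prescribed by RDF Schema.

```python
# Direct (explicitly stated) rdfs:subClassOf statements from the primer.
SUBCLASS_OF = {
    "mi:ColumnMiniature": {"mi:Miniature"},
    "mi:Miniature": {"mi:Image"},
}


def superclasses(cls, direct=SUBCLASS_OF):
    """Return all classes reachable from `cls` via rdfs:subClassOf arcs."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in direct.get(current, ()):
            if parent not in seen:  # also guards against cycles
                seen.add(parent)
                stack.append(parent)
    return seen


if __name__ == "__main__":
    print(sorted(superclasses("mi:ColumnMiniature")))
```

Following the arcs until no new class is found yields both the stated superclass mi:Miniature and the implied superclass mi:Image.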
Graphic 2 visualises this with the nodes and arcs of the basic RDF data model.
Defining properties
In order to make the meaning of our metadata (i.e. ‘type’) explicit, we need to be capable of declaring specific properties that characterise the classes of things we hold at http://www.m-i.org, e.g. digital images of medieval column miniatures.
Basically, RDF Schema defines properties in terms of the classes of resources to which they apply. This is the role of the rdfs:domain and rdfs:range mechanisms.
rdfs:range
The range constraint defines the class or set of classes whose instances can be values of a particular property. If we want to define the property mi:hasType, we must describe this resource (which we locate at http://www.m-i.org/schemas/images#) with an rdf:type property whose value is rdf:Property:
mi:hasType [resource] rdf:type [property] rdf:Property [value]
The following RDF statements indicate that mi:ColumnMiniature is a class, mi:hasType is a property, and RDF statements using the mi:hasType property have instances of mi:ColumnMiniature as values:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:range mi:ColumnMiniature
rdfs:domain
The domain constraint restricts the set of classes whose instances may have a particular property attached to them. If we want to indicate that the property mi:hasType applies to instances of class mi:ColumnMiniature, we would write:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:domain mi:ColumnMiniature
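Read as constraints, as the primer does, the domain and range declarations allow a statement to be checked before it is accepted. The sketch below shows one way such a check might look. It is not the FMS validator: the instance identifiers with the `ex:` prefix, the `TYPES` table and the function `violations` are hypothetical.

```python
# Constraints as stated in the primer for mi:hasType.
DOMAIN = {"mi:hasType": {"mi:ColumnMiniature"}}
RANGE = {"mi:hasType": {"mi:ColumnMiniature"}}

# Hypothetical instance typing (rdf:type statements); ex: names are invented.
TYPES = {
    "ex:image1": {"mi:ColumnMiniature"},
    "ex:image2": {"mi:ColumnMiniature"},
    "ex:curator": {"ex:Person"},
}


def violations(triple, types=TYPES):
    """Return a list of constraint violations for one (s, p, o) statement."""
    subject, prop, value = triple
    problems = []
    allowed_subjects = DOMAIN.get(prop)
    if allowed_subjects and not (types.get(subject, set()) & allowed_subjects):
        problems.append(f"domain: {subject} is not an instance of {sorted(allowed_subjects)}")
    allowed_values = RANGE.get(prop)
    if allowed_values and not (types.get(value, set()) & allowed_values):
        problems.append(f"range: {value} is not an instance of {sorted(allowed_values)}")
    return problems


if __name__ == "__main__":
    print(violations(("ex:image1", "mi:hasType", "ex:image2")))
    print(violations(("ex:curator", "mi:hasType", "ex:image2")))
```

A property without a declared domain or range is left unconstrained in this sketch, so statements using it always pass.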
Benefits of RDF
In a SearchWebServices.com definition of RDF some benefits of RDF are mentioned:
| ‘By providing a consistent framework, RDF will encourage the providing of metadata about Internet resources.
| Because RDF will include a standard syntax for describing and querying data, software that exploits metadata will be easier and faster to produce.
| The standard syntax and query capability will allow applications to exchange information more easily.
| Searchers will get more precise results from searching, based on metadata rather than on indexes derived from full text gathering.
| Intelligent software agents will have more precise data to work with.’¹³
This is a well-crafted listing of RDF benefits, from provision and exchange of better metadata to agents working with them, hopefully for the benefit of humans. But, as explicitly stated by SearchWebServices.com, these are only potential benefits, i.e. they depend on the level of actual uptake of RDF.
13 whatis.com / searchWebServices.com: Definitions - Resource Description Framework, http://searchwebservices.techtarget.com/sDefinition/0,,sid26_gci213545,00.html (last updated July 27, 2001)
Generating and Using the Knowledge Space
The RDF cards represent the original XML documents at the semantic level. The union of such RDF cards constitutes a knowledge base, which is a harmonised semantic representation of the underlying heterogeneous databases.
However, so far the RDF instance descriptions have not left the museum. The museum has complete control of the information it wants to publish, and it does not need to allow the FMS system access to its internal database system. The RDF data are placed in a public directory on the museum’s WWW server.
The Web crawler of the FMS system harvests the instance descriptions from the different museums and the system combines them into an RDF repository. This repository is a large semantic graph that consists of the shared ontology and metadata.
How does a user now search and navigate in this knowledge space? In the FMS system this is implemented by server-side software called Ontogator. Based on the semantic graph, this software dynamically generates semantic linkages for the user’s Web browser.
One way of using the FMS system is view-based filtering. The user can select classes of resources from the ontology and the system finds the instances that match the selected class restrictions. By constraining classes (views) further, the collection instance data searched for are eventually found.
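The idea of combining RDF cards into one repository and then narrowing the result set by successive restrictions can be sketched in a few lines of Python. The primer does not describe how Ontogator is implemented, so this is only an illustration: the two cards, the `a:` and `b:` item identifiers and the function `filter_view` are invented for the example.

```python
# Hypothetical RDF cards harvested from two museums.
card_museum_a = {
    ("a:item1", "rdf:type", "mi:ColumnMiniature"),
    ("a:item1", "mi:hasCreator", "Alexander Master"),
}
card_museum_b = {
    ("b:item7", "rdf:type", "mi:ColumnMiniature"),
    ("b:item7", "mi:hasCreator", "Orosius Master"),
    ("b:item9", "rdf:type", "mi:Image"),
}

# The repository is simply the union of all harvested triples.
repository = card_museum_a | card_museum_b


def filter_view(graph, selections):
    """Return the subjects that satisfy every (predicate, value) restriction."""
    matching = {s for s, _, _ in graph}  # start from all described instances
    for predicate, value in selections.items():
        hits = {s for s, p, o in graph if p == predicate and o == value}
        matching &= hits  # each further restriction narrows the view
    return matching


if __name__ == "__main__":
    print(sorted(filter_view(repository, {"rdf:type": "mi:ColumnMiniature"})))
```

Adding a second restriction, for example on mi:hasCreator, reduces the two column miniatures to a single item, which is the narrowing behaviour described above.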
The software also supports topic-based navigation by providing semantic links between topics of interest, the creation of which is based on the collection domain ontology and the related metadata of the collection records. This means that the links also provide the user with an impression of the wider context and pragmatics of the objects in the museums’ collections.
From human users to software agents
As described in the Finnish Museums on the Semantic Web example, the RDF repository is a large semantic graph that consists of the shared ontology and metadata of the participating museums. Such a repository can be queried, and the results, a set of pointers to the relevant resources, can be accessed using Web browsers. The opportunities provided by a system like the one developed by the FMS initiative (e.g. topic-based navigation) are at present restricted primarily to human users.
The Semantic Web vision includes intelligent software agents which ‘understand’ semantic relationships between Web resources and seek relevant information as well as perform transactions for humans.¹⁴
This software would be capable of autonomous action, i.e. could run without direct human control or constant supervision, and ideally is very flexible in doing this. Characterisations of this flexibility include actions that are ‘reactive’, ‘proactive’ and ‘social’ (see below).
While the basic idea of agents is very intuitive and appealing, the actual theory is complex, the tools are immature, and the solutions are small and prototype-based. In fact, as a parallel distributed systems technology, agents belong to the most complex class of software technology.
However, this primer will conclude with a summary of what an intelligent software agent is and what such software would generally be capable of doing. This should also serve as an indication of how great the challenge for research and technological development is to make the full Semantic Web vision a reality.
Intelligent Software Agents
The following definitions are taken from Michael Wooldridge’s introduction to multiagent systems.¹⁵
Agent
‘An agent is a computer system capable of autonomous action in some environment.’
Intelligent agent
‘An intelligent agent is a computer system capable of flexible autonomous action in some environment.’
Flexible autonomous action
‘By flexible autonomous action we mean reactive, proactive, social.’
| Reactivity: ‘A reactive system is one that maintains an ongoing interaction with its environment, and responds to changes that occur in it (in time for the response to be useful).’
| Proactiveness: ‘An agent serves a purpose and therefore exhibits goal-directed behaviour, including the capacity to recognise opportunities for useful courses of action.’
| Social ability: ‘Social ability in agents is the ability to interact with other agents (and possibly humans) via some kind of agent communication language, and perhaps cooperate with others.’
Desirable further properties of agents are:
| Mobility: the ability to move around an electronic network.
| Rationality: an agent will act in such a way that it does not prevent itself from achieving its goals (as far as this is possible with a limited set of beliefs representing its world knowledge).
| Learning: an agent will improve its performance over time.
14 T. Berners-Lee, J. Hendler, O. Lassila: Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html
15 M. Wooldridge: An Introduction to Multiagent Systems. Chichester: Wiley, 2002, and http://www.csc.liv.ac.uk/~mjw/pubs/imas/
Resources
In the primer no references are made to the documents of the World Wide Web Consortium (W3C). All relevant W3C recommendations can be found at http://www.w3c.org
Of the wealth of introductory materials on XML and RDF available on the Web, the following in particular are useful to consult for further details:
http://www.w3schools.com/xml/
http://www.w3schools.com/schema/
http://www.w3.org/TR/rdf-primer/
K. S. Candan, H. Liu, R. Suvarna: Resource Description Framework: Metadata and its Applications. In: SIGKDD Explorations, Vol. 3/1 (2001), 6-19
Pierre-Antoine Champin: RDF Tutorial (2001), http://www710.univ-lyon1.fr/~champin/rdf-tutorial/rdf-tutorial.html
S. Decker, M. P. Mitra, S. Melnik: Framework for the Semantic Web: An RDF Tutorial, http://www.ida.liu.se/~asmpa/courses/sweb/rdf/rdf_tutorial.pdf
Graphic 3: Set-up of the Finnish Museums on the Semantic Web System
The Finnish Museums on the Semantic Web: Overview of the System’s Set-up
User client: WWW browser (topic-based navigation, view-based filtering)
Server: Ontogator software; RDF database; semantic graph (knowledge space of shared ontology and metadata)
Web crawler: harvests the RDF instances
Museums (collection databases 1, 2, ... n): collection database, XML repository, Metadata Editor, RDF instances
Layers: RDF Schema (semantic interoperability); XML Schema (syntactic interoperability); relational schemas (DBMS)
THE DARMSTADT FORUM PARTICIPANTS
Wernher Behrendt, Salzburg Research, Austria
Wernher Behrendt is a Senior Researcher at the Sun Technology and Research Excellence Center at Salzburg Research (Austria), working on multimedia middleware and interoperation issues. He holds an MSc in Cognitive Science from Manchester University and has more than 10 years’ experience in near-to-market IT research. From 1989 to 1995 he was a Senior Research Associate in the Informatics Department at Rutherford Appleton Laboratory (UK), working on embedded knowledge based systems in distributed multimedia presentation systems. From 1995 to 1998 he was a Senior Research Associate at Cardiff University (UK), working on interoperation between heterogeneous information systems. Mr Behrendt has held courses in Computer Science and has worked on projects ranging from software engineering methods and quality assurance to legacy system reengineering using migration methods and distributed systems middleware.
E-mail: wernher.behrendt@salzburgresearch.at
Paolo Buonora, Archivio di Stato di Roma, Italy
Paolo Buonora holds a degree in Philosophy from the University of Rome ‘La Sapienza’ (1976). He worked in the Italian State Archive Administration from 1978, where he was first involved in editing the Guida Generale degli Archivi di Stato italiani. From 1986 he worked in the Soprintendenza archivistica per il Lazio, surveying audiovisual archives and municipal archives; from 1989 to 1991 he was at Perugia University, engaged in doctoral research in ‘Urban and rural history’; and from 1991 to 1994 he was again in the Soprintendenza archivistica per il Lazio. After 1994 he worked in the Archivio di Stato di Roma, where he was responsible for the photograph service and several working groups on informatics application in archival documentation. From 1997 until the present time he has planned and directed the Imago II project in the Archivio di Stato di Roma.
See httpwwwasrmarchivbeniculturaliitsidimagoIMAGOIIenhtml
Samuel Cruz-Lara, University of Nancy 2 and LORIA, France
Samuel Cruz-Lara obtained a Master’s degree in Computer Science in 1984 (University Henri Poincaré, Nancy 1) and a PhD degree in Computer Science in 1988 (National Polytechnic Institute of Lorraine). The central topic of his PhD thesis was the generation of integrated development environments by using attribute grammars. He is currently Associate Professor at the University of Nancy 2 (Institute of Technology, Computer Science Department) and permanent Researcher at LORIA (Lorraine Laboratory for Research in Computer Science and its Applications - UMR 7503 - CNRS - INRIA - Universities of Nancy). He is a member of the ‘Language and Dialogue’ team and has conducted several research activities on distributed software architectures and textual linguistic resources management. He is currently working in the context of distributed architectures and multimedia resources management. Dr Cruz-Lara has participated in several projects, in particular CNRS/SILFIDE and MLIS-ELAN, and he is at present co-leader of the ‘Digital Museum’ project. This is a joint project between LORIA and the National Chi-Nan University. The ‘Digital Museum’ project is sponsored by the National Science Council of the Republic of China (Taiwan) and supported by INRIA (France).
Costis Dallas, Critical Publics SA, Greece
Costis Dallas is Chairman and Senior Researcher of Critical Publics (http://www.criticalpublics.com), a group of companies active in the field of strategic communications, creative design and technology. He is currently a Lecturer in the Department of Communication and Mass Media of Panteion University, and has over 15 years of research and professional experience in hypermedia applications, human factor issues and cultural information systems. Dr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA, co-founder and Chair of the Multimedia Working Group of the International Council of Museums (CIDOC/ICOM), Head of Documentation and Systems of the Benaki Museum, General Director of the Foundation of the Hellenic World, Special Secretary of the Greek Ministry of Education in charge of libraries, archives and instructional technologies, and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: info@criticalpublics.com
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (http://www.nl.adlibsoft.com), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that he joined Databasix, producer of the ADLIB software. In 1991 Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was renamed later as ADLIB Information Systems. ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include:
CIMI Dublin Core test bed - ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database, in addition to the ‘standard’ HTML/XML access.
SPECTRUM-XML - A project of the UK mda, together with a team of software vendors, to produce a schema for exchange of data which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software.
Internet Gelderse Musea (http://www.igem.nl) and Maritiem Digitaal (http://www.maritiemdigitaal.nl) - Two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.
E-mail: bert@nl.adlibsoft.com
Nicola Guarino, Italian National Research Council, Italy
Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991, and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS) and associate editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: Nicola.Guarino@ladseb.pd.cnr.it
Janneke van Kersen, Digital Heritage Association, The Netherlands
Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992 she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing, and more recently she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University. Since October 1999 van Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata, and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations such as museums, archives, archaeological organisations, monuments and special collections of libraries in a context-rich and XML-based environment. Interoperability, from both a technical and an organisational perspective, is therefore a major issue. The Association offers access to heritage information to the general public at http://www.cultuurwijzer.nl and customised access for the educational field at http://www.cultuurwijs.nl
E-mail: janneke.van.kersen@den.nl
Marco Meli, EDW International, Italy
Marco Meli is CEO and co-founder of EDW International (http://www.edw-international.com), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia and VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meli@edw.it
Paul Miller, UKOLN, UK
Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, http://www.ukoln.ac.uk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK’s Further and Higher Education Funding Councils and Resource, the Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI), and the Metadata Working Group of the UK Government’s Office of the e-Envoy.
E-mail: P.Miller@ukoln.ac.uk
Frank Nack, CWI, UK
Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on ‘The Application of Video Semantics and Theme Representation for Automated Film Editing’ at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling, and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document, and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: Frank.Nack@cwi.nl
Franco Niccolucci, Florence University, Italy
Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference.
See also http://www.geog.port.ac.uk/hist-bound/people/niccolucci.htm
Seamus Ross, HATII, University of Glasgow, UK
Dr Seamus Ross is Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Network) (IST-2001-32706), a European Union funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously, he was Assistant Secretary for Information Technology at the British Academy, and before that worked for a company specialising in expert systems and software development as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group’s PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: S.Ross@hatii.arts.gla.ac.uk
Andrea Scotti, Institute & Museum for the History of Science, Italy
Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995 he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.
Since 1996 Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza and the National Central Library, and under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (01/02/01-30/07/03), funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies.
See http://cweb.inria.fr/Projects/Mesmuses
E-mail: scottian@galileo.imss.firenze.it
DIGICULT PROJECT INFORMATION
DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over the period of 30 months (03/2002-08/2004).
In order to encourage early take-up, DigiCULT produces seven Thematic Issues and three Technology Watch Reports, along with the newsletter DigiCULT.Info.
DigiCULT draws on the results of the strategic study ‘Technological Landscapes for Tomorrow’s Cultural Economy (DigiCULT)’, which was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications), in 2000 and completed in 2001.
Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at http://www.digicult.info
For further information on DigiCULT please contact the team of the project co-ordinator:
Mr Guntram Geser, guntram.geser@salzburgresearch.at, Phone: +43-(0)662-2288-303
Mr John Pereira, john.pereira@salzburgresearch.at, Phone: +43-(0)662-2288-247
Salzburg Research Forschungsgesellschaft
Jakob-Haringer-Str. 5/III, A-5020 Salzburg, Austria
Phone: +43-(0)662-2288-200
Fax: +43-(0)662-2288-222
http://www.salzburgresearch.at
Project Partner:
HATII - Humanities Advanced Technology and Information Institute, University of Glasgow, http://www.hatii.arts.gla.ac.uk
Contact: Mr Seamus Ross, s.ross@hatii.arts.gla.ac.uk
The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom
DigiCULT Thematic Issue 1 - Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum, held in Barcelona on May 6th, 2002, in the context of the DLM-Conference 2002.
DigiCULT Thematic Issue 2 - Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum, held in Essen, Germany, on September 3rd, 2002, in the context of the AIIM Conference DMS EXPO.
DigiCULT Thematic Issue 3 - Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum, held on January 21st, 2003, at Fraunhofer IPSI, Darmstadt, Germany.
DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum, on Learning Objects, that will take place at the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, on July 2nd, 2003.
IMPRINT
This Thematic Issue is a product of the DigiCULT Project (IST-2001-34898).
Authors:
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy
Images:
Images for this Thematic Issue have been provided by, and are reproduced with kind permission of, the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, Netherlands.
Graphics & Layout:
Jan Steindl, Salzburg Research
ISBN 3-902448-00-8
Printed in Austria. © 2003
IMAGES
Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I. Image on p. 5 from Fol. 264r, size 80x75, illuminator: Orosius Master a.o.
Bible Historiale, Paris, c. 1320-1340, Volume I. Image on p. 7 from Fol. 2r, size 45x50, illum.: 'Sub-Fauvel' Master
Historic Bible, Utrecht, c. 1430, Volume I. Images on pp. 8, 10, 14, 28 from Fol. 3r, 3v, 4v, 7v, size 55x85 to 60x85, illum.: Alexander Master a.o.
Psalter, Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468. Images on pp. 11, 13: Wednesday Matins, Invitatorium, from Fol. 219r
Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350. Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron) from Fol. 104rb1, 106rb2, 124r, 111ra, size 40x55 to 50x55
Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335. Image on p. 21 from Fol. 4va1, size 45x55
Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460. Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41
Psalter, Normandy, c. 1180. Image on p. 26 from Fol. 3v, size 160x125
Breviary, Cambray(?), c. 1275-1300. Images on pp. 27, 29, 31, 33, 34, 35, 37 from Fol. 211r, size 205x135
© Koninklijke Bibliotheek, The Hague. Used with permission.
ITEM_VIEW (database view; columns): item_id, type, subject, iconclass, creator, manuscript, place, year
XML shares the syntax and bracketed tags of the well-known HyperText Markup Language (HTML), but XML serves a different goal. While HTML is used to define the layout of pages on the WWW, XML is used to define the content of documents, for example to specify that an area of text is the name of a creator.
XML allows for creating markup (e.g. <creator>) that seems to carry some semantics. However, for a computer a tag like <creator> carries as much semantics as a tag like <H1>. A computer simply does not know what a creator is and how the concept creator is related to other concepts (e.g. manuscript). For an XML processor, <H1> and <creator> or <manuscript> are all equally (and totally) meaningless. XML is all about describing data; on its own it does not do anything. There needs to be a processing program that uses the markup to interpret the various pieces of data.
The graphic below illustrates the database rows to XML process as described in the info box. It provides a very simple example of an XML document that describes some data for one of the medieval column miniatures from the Koninklijke Bibliotheek, The Hague, which we were permitted to use for illustrating this Thematic Issue. It includes the Iconclass classification for this image: 71A3421, Eve emerges from Adam's body (for the hierarchical path of this classification see the section on ontologies).
Short explanations for the XML document (image5kb78d38i.xml) shown in the graphic below:
A well-formed XML document is one that conforms to the XML syntax rules, of which we would like to highlight the following:
(1) The document must begin with the XML declaration, which defines the XML version and the character encoding used in the document. In the example below we use <?xml version="1.0" encoding="ISO-8859-1"?>, i.e. the document conforms to the 1.0 specification of XML and uses the ISO-8859-1 (Latin-1/West European) character set.
(2/10 a) The XML document must contain a single tag pair to define a root element, in our example <mi:image> </mi:image>.
Graphic 1: Database rows to XML process

Database rows (item data from database rows, grouped by item):
image5kb78d38i, 'Column Miniature', 'Eve emerges from Adam's Body', '71A3421', 'Alexander Master', 'Historic Bible', 'Utrecht', 'circa 1430'

Rows to XML process

XML document (image5kb78d38i.xml):
(1) <?xml version="1.0" encoding="ISO-8859-1"?>
(2) <mi:image xmlns:mi="http://www.m-i.org/images"
      image_id="image5kb78d38i">
(3)   <mi:type>Column Miniature</mi:type>
(4)   <mi:subject>Eve emerges from Adam's body</mi:subject>
(5)   <mi:iconclass>71A3421</mi:iconclass>
(6)   <mi:creator>Alexander Master</mi:creator>
(7)   <mi:manuscript>Historic Bible</mi:manuscript>
(8)   <mi:place>Utrecht</mi:place>
(9)   <mi:year>circa 1430</mi:year>
(10) </mi:image>
(2b) Namespace: Since element names in XML are not fixed, name conflicts can occur when different documents use the same names to describe different types of elements. To prevent such conflicts, a unique namespace should be defined using a Uniform Resource Identifier (URI). An XML namespace is a collection of names that are used as element types and attribute names (cf. http://www.w3.org/TR/REC-xml-names). Our namespace, declared in the start tag of the root element, is xmlns:mi="http://www.m-i.org/images".
The namespace prefix mi (for medieval images) functions as a placeholder for the namespace name. It needs to show up in all element tags (e.g. <mi:type> </mi:type>).
(2c) image_id="image5kb78d38i": This is the id attribute, which contains the unique identifier of the data source record.
(3-9) All other elements must be within the root element and can themselves have sub-elements (child elements), which must be properly nested within their parent element. Our elements do not have sub-elements.
Other syntax rules are, for example: all start tags must match their end tags, and because XML tags are case sensitive (i.e. the tag <mi:Creator> is different from the tag <mi:creator>), they must also be written with the same case; all elements must have a closing tag; all attribute values must be within quotation marks (e.g. "image5kb78d38i").
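The syntax rules above can be tried out with any XML parser. The following is a minimal sketch using Python's standard library; the shortened sample document and the helper name is_well_formed are our own illustrative assumptions, not part of the FMS tooling.

```python
import xml.etree.ElementTree as ET

# Shortened version of the example document (illustrative only).
WELL_FORMED = """<?xml version="1.0" encoding="ISO-8859-1"?>
<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">
  <mi:type>Column Miniature</mi:type>
  <mi:creator>Alexander Master</mi:creator>
</mi:image>"""

# Same document, but the end tag is written with a different case.
CASE_MISMATCH = WELL_FORMED.replace("</mi:creator>", "</mi:Creator>")

# Same document, but the attribute value is not within quotation marks.
UNQUOTED = WELL_FORMED.replace('image_id="image5kb78d38i"',
                               "image_id=image5kb78d38i")


def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        # Parse bytes so the parser honours the declared encoding.
        ET.fromstring(text.encode("iso-8859-1"))
        return True
    except ET.ParseError:
        return False


for name, doc in [("well-formed", WELL_FORMED),
                  ("case mismatch", CASE_MISMATCH),
                  ("unquoted attribute", UNQUOTED)]:
    print(name, "->", is_well_formed(doc))
```

Note that well-formedness says nothing about whether the element names are the ones a community has agreed on; that is the task of the schema discussed next.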
Syntactic Transformation 2: The XML Schema
In order to allow for syntactic harmonisation, the XML documents of the museums should conform to the XML Schema of the FMS initiative. Therefore, when the museums create their XML documents, they validate them against the initiative's XML Schema. If the documents are valid, the process can continue to the semantic level.
XML Schema
The XML Schema defines the building blocks of an XML document, including:
| elements and attributes that can appear in a document
| which elements are child elements, as well as their order and number
| whether an element is empty or can include text
| the data types for elements and attributes
| as well as default and fixed values for elements and attributes
XML with an XML Schema is designed to be self-descriptive. One of the greatest strengths of XML Schema is that it allows for data typing. The most common data types are xs:string, xs:decimal, xs:integer, xs:boolean, xs:date and xs:time. In the example below, which is the XML Schema for the XML document (image5kb78d38i.xml) shown in graphic 1, we only use the data type xs:string. This data type is used for values that contain character strings.
(1) <?xml version="1.0"?>
(2a) <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
(2b)   targetNamespace="http://www.m-i.org/images"
(2c)   xmlns="http://www.m-i.org/images"
(2d)   elementFormDefault="qualified">
(3)   <xs:element name="image">
(4)     <xs:complexType>
(5)       <xs:sequence>
(6)         <xs:element name="type" type="xs:string"/>
(7)         <xs:element name="subject" type="xs:string"/>
(8)         <xs:element name="iconclass" type="xs:string"/>
(9)         <xs:element name="creator" type="xs:string"/>
(10)        <xs:element name="manuscript" type="xs:string"/>
(11)        <xs:element name="place" type="xs:string"/>
(12)        <xs:element name="year" type="xs:string"/>
(13)      </xs:sequence>
(14)      <xs:attribute name="image_id" type="xs:string" use="required"/>
(15)    </xs:complexType>
(16)  </xs:element>
(17) </xs:schema>

Short explanations:
(1) The XML declaration, which states that the document conforms to the 1.0 specification of XML.
(2a) Determines that the elements and data types that are used to construct the schema come from the W3C's XML Schema namespace. Consequently, each of the elements and data types in the schema has the prefix xs:, which identifies them as belonging to the vocabulary of the XML Schema language rather than the vocabulary of our (fictitious) organisation M-i.org.
(2b) Indicates that the elements defined by this schema come from our http://www.m-i.org/images namespace.
(2c) Is our default namespace.
(2d) Demands that any elements used by an XML document which were declared in this schema must be namespace qualified.
(3/16) The parent element for the data typing of the image descriptions we provide at http://www.m-i.org/images. It is (4) defined as an xs:complexType, i.e. it contains child elements (6-12), which are (5/13) surrounded by an xs:sequence element that defines an ordered sequence of these elements.
(6-12) The child elements, which in our example are simple types because they do not contain other elements. They define various elements of our XML documents, e.g. "type", to be of the data type xs:string.
(14) Furthermore, the Schema determines that for the element xs:element name="image" there is a required attribute image_id of the data type xs:string (for example image5kb78d38i).
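To make concrete what the schema demands of a document, here is a hand-written sketch in Python that checks the same three things: the root element, the required image_id attribute and the ordered sequence of child elements. It is not an XML Schema validator (a schema-aware processor would be used in practice), and the function name check_image_document is our own assumption.

```python
import xml.etree.ElementTree as ET

NS = "http://www.m-i.org/images"
# Child elements in the order declared inside xs:sequence.
EXPECTED_CHILDREN = ["type", "subject", "iconclass", "creator",
                     "manuscript", "place", "year"]

GOOD = b"""<?xml version="1.0" encoding="ISO-8859-1"?>
<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">
  <mi:type>Column Miniature</mi:type>
  <mi:subject>Eve emerges from Adam's body</mi:subject>
  <mi:iconclass>71A3421</mi:iconclass>
  <mi:creator>Alexander Master</mi:creator>
  <mi:manuscript>Historic Bible</mi:manuscript>
  <mi:place>Utrecht</mi:place>
  <mi:year>circa 1430</mi:year>
</mi:image>"""

# Variant without the required attribute.
NO_ID = GOOD.replace(b' image_id="image5kb78d38i"', b"")


def check_image_document(xml_bytes):
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    root = ET.fromstring(xml_bytes)
    # ElementTree writes qualified names as {namespace}local-name.
    if root.tag != "{%s}image" % NS:
        problems.append("root element is not image in the expected namespace")
    if "image_id" not in root.attrib:
        problems.append("required attribute image_id is missing")
    found = [child.tag for child in root]
    wanted = ["{%s}%s" % (NS, name) for name in EXPECTED_CHILDREN]
    if found != wanted:
        problems.append("child elements differ from the declared sequence")
    return problems


print(check_image_document(GOOD))
print(check_image_document(NO_ID))
```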
Major Benefits of XML
In the context of the Semantic Web, XML provides an interoperable syntactical foundation upon which solutions to the issues of representing relationships and meaning can be built. We also want to highlight the many benefits of XML that have driven its rapid uptake in the first place and might, in the longer term, support the realisation of the Semantic Web vision on a broader scale.
XML is one of the most important standards developments in recent years. It is an international, universal, non-system and non-application specific data exchange standard. XML is international because it employs Unicode. This means that there is no restriction to the western alphabet; Arabic, Chinese, Greek, Hebrew, Thai, etc. can be easily integrated.
XML is non-system specific because it is an open standard set by the World Wide Web Consortium (W3C). As such, there is no owner of XML. All the major software suppliers support it, and it can be used on any computing platform, from Windows and MacOS to Linux. This makes it easier for organisations to change systems or combine different systems.
XML is also non-application specific, i.e. it can be used in various applications such as data exchange, data harvesting, Web site management, etc. XML is gaining ever-wider acceptance in many application domains, including in particular the cultural heritage community.
Bear in mind also that the major collection management software producers have implemented support for XML in their systems, enabling, for example, the integration of data from different collections and their combination over the Web.
XML allows for multi-channel publishing, i.e. with XML it is easy to produce different products or services from digital cultural heritage assets. Once the data are structured in XML, they can be displayed across a variety of media using an associated style sheet that contains the display information.
Finally, XML can be used to create new languages. For example, the Wireless Markup Language (WML), which is used to mark up Internet applications for handheld devices, is written in XML.
Ontological concepts
The goal of the FMS project is to make metadata of the museums' textiles collections semantically interoperable on the Web. In order to achieve such interoperability, an ontology is being designed that describes the common (lower-level) ontological concepts in this domain of knowledge.[4]
Ontology
In the Semantic Web architecture, the semantic relationships are not embedded but explicitly represented by an ontology, or rather an interrelated set of ontologies. In fact, the wide array of information residing on the Web and the perceived need to make it more machine-processable have acted as a strong impetus for the development of ontology languages.
Yet what is an ontology? An abstract definition of an ontology is that it describes a formal, shared conceptualisation of a particular domain of interest, for example cultural heritage objects held in art museums. In particular, an ontology allows for constraining, expressing and analysing the intended meaning of the shared vocabulary of concepts and relations in a domain of knowledge.[5]
If these concepts and relations are formalised to a high degree, the domain has at hand a major building block for developing semantically aware information systems. With Semantic Web technologies, the domain ontology can be made available on the network and cross-referenced with upper-level and other domain ontologies,[6] and remote applications or intelligent software agents can refer to it when they interact to provide a certain information service.
[4] The available documents (in English) on the FMS initiative state that their ontology is being created using RDF Schema (RDFS). To develop a fully fledged ontology, advanced languages such as DAML+OIL or Web Ontology Language (OWL) would be required.
[5] For more elaborate and formal descriptions see: Tom Gruber, What is an Ontology? (1995), http://www-ksl.stanford.edu/kst/what-is-an-ontology.html; Nicola Guarino, Ontology-Driven Conceptual Modelling, part 1-3 (2002), http://ontology.ip.rm.cnr.it/Tutorials
The degree of formalisation of concepts and their relations varies considerably between different domains of knowledge. At the lower end one finds lexicons and simple taxonomies (i.e. ordered classification systems in which terms are related hierarchically). At the middle level one might place thesauri, i.e. controlled vocabularies that are structured to show relationships between terms and concepts and, for example, allow for retrieving them from a database. At the high end of formalisation of knowledge there are axiomatised logic theories. Such theories include rules to ensure the well-formedness and logical validity of statements expressed in the language of the scientific discipline.
In the cultural heritage sector, a powerful IT-supported example of a hierarchical classification system is Iconclass. This supports the documentation of images, in particular art historical images, by providing a systematic collection of 28,000 ready-made definitions of objects, persons, events, situations and abstract ideas that can be the subject of an image. The definitions consist of an alphanumeric classification code and its textual correlate.[7]
For example, for the image we have described in XML in graphic 1, the Iconclass definition is 71A3421, Eve emerges from Adam's body. The Medieval Illustrated Manuscripts Website of the Koninklijke Bibliotheek, The Hague, has an Iconclass Browser in place[8] that provides the hierarchical path for this concept in the classification system.
An outstanding example of a controlled vocabulary is the Art & Architecture Thesaurus (AAT), one of the Getty Research Institute's Vocabulary Databases. It is a structured vocabulary of more than 125,000 terms and other information about concepts that are used for describing fine art, architecture, decorative arts, archival materials and material culture.[9]
The W3C's Resource Description Framework (RDF) Schema Specification 1.0, in a section on its scope, mentions concept navigation and states: 'Thesauri and library classification schemes are well known examples of hierarchical systems for representing subject taxonomies in terms of the relationships between named concepts. The RDF Schema specification provides sufficient resources for creating RDF models that represent the logical structure of thesauri (and other library classification systems).'[10]
Yet for realising a full-blown cultural heritage ontology for the Semantic Web, there are currently limitations on both sides. On the one hand, hierarchical classification systems and structured vocabularies do not lend themselves easily to rich inter-linking of conceptual 'trees'.
A major step further in this direction is the CIDOC object-oriented Conceptual Reference Model (CRM).[11] This provides an ontology of 81 classes and 130 properties, which describes in a formal language concepts and relations relevant to the documentation of cultural heritage.
On the other hand, RDF Schema has limitations when it comes to expressing complex ontological relationships. New languages based on description logics are being developed. These include DAML+OIL and the upcoming Web Ontology Language (OWL), which are capable of fully describing ontologies.
Also worth highlighting is that tools for ontology building are proliferating at the present time. These ontology editors need to be carefully assessed, as their capabilities differ considerably.[12]
Hierarchical path of the Iconclass classification 71A3421:
7 Bible
71 Old Testament
71A Genesis: from the creation to the expulsion from paradise, and later years of Adam and Eve
71A3 creation of man; the Garden of Eden (Genesis 1:26-2)
71A34 creation of Eve
71A342 Eve is fashioned from Adam's rib
71A3421 Eve emerges from Adam's body
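In the path listed for 71A3421, each broader concept is a prefix of the classification code. Under that assumption (which holds for this plain code; Iconclass notations with additional bracketed or keyed parts are not covered), the path can be derived mechanically. The function name iconclass_path is our own; the labels are the ones given above.

```python
# Labels as listed in the hierarchical path above.
LABELS = {
    "7": "Bible",
    "71": "Old Testament",
    "71A34": "creation of Eve",
    "71A342": "Eve is fashioned from Adam's rib",
    "71A3421": "Eve emerges from Adam's body",
}


def iconclass_path(code):
    """Return all prefixes of a plain Iconclass code, broadest first."""
    return [code[:length] for length in range(1, len(code) + 1)]


for step in iconclass_path("71A3421"):
    # Not every level is in our small label table.
    print(step, LABELS.get(step, ""))
```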
[6] Upper-level ontologies describe the basic concepts and relationships invoked when information about any domain is expressed in natural language.
[7] For in-depth information see the official Iconclass Website: http://www.iconclass.nl
[8] Medieval Illustrated Manuscripts Website: http://www.kb.nl/kb/manuscripts/browser. The subject access system for the Website was conceived by Mnemosyne Partners, building on the Iconclass classification system and technologies. See the valuable information they provide at httpwwwmnemosyneorgbusinessmsstempexhtml
[9] A detailed description of the AAT is provided at http://www.getty.edu/research/tools/vocabulary/aat/about.html
[10] http://www.w3.org/TR/2000/CR-rdf-schema-20000327
[11] See the information box at the end of the Forum discussion and the sources mentioned in the Semantic Web Terms and Reading List.
[12] Michael Denny, Ontology Building: A Survey of Editing Tools (06-11-2002), http://www.xml.com/pub/a/2002/11/06/ontologies.html
Semantic Transformation 1: The RDF Data Model
Progressing towards semantic interoperability, the metadata in the XML documents are now transformed into RDF statements corresponding to the RDF data model. With these so-called RDF 'triples', the XML metadata elements are mapped to the RDF classes and properties which are defined by the RDF Schema of the FMS initiative (see section 3).
'XML is nothing more than a way to standardize data formats. This is not to underplay XML's importance. A data-format standard makes all of the more glamorous technologies possible, and RDF is the leading example of the benefit that comes once the data format has been standardized. Many proclaim that RDF is really the XML's killer app, and with good reason. Despite all this, RDF remains somewhat obscure. This is mainly because, at its core, RDF is very abstract, very dry, and very academic.' Uche Ogbuji, An introduction to RDF (2000), http://www-106.ibm.com/developerworks/library/w-rdf/?dwzone=xml
RDF Data Model
In order to make Web resources semantically interoperable, we need resources that provide machine-understandable information about themselves. In the Semantic Web architecture these statements are built by using the Resource Description Framework (RDF).
RDF defines a data model for statements describing typed relationships between uniquely identified resources. RDF distinguishes between:
| resources: familiar examples are a Web page, an electronic document or a digital image, but in RDF also entities that are not 'network retrievable', e.g. museums, curators or bound medieval manuscripts, can be resources
| properties: these identify a specific aspect, characteristic, attribute or relation used to describe the resource
| statements: these associate a value for a named property with the resource
Hence, RDF provides a model for describing relationships between resources in terms of named properties and values. The RDF data model intrinsically supports only binary relations. Its base element is the 'triple', which takes the form of subject, predicate, object: a resource (the subject) is linked to another resource (the object) through an arc labelled with a third resource (the predicate). The semantics of a triple clearly depends on the property used as predicate.
A convenient way to visualise this is to draw nodes for subject and object and an arrow between them for the predicate (see graphic 2). In this labelled, directed graph, subject and predicate (property) are Uniform Resource Identifiers (URIs), and the object is either a URI or a literal (which is drawn as a box). Everything in RDF can be represented by a graph with nodes and arcs, and the data model allows for using the same URI as a node and as an arc label. To represent RDF statements in a machine-processable way, RDF builds on XML. With RDF/XML, a specific XML markup language, RDF information can be represented and exchanged between machines.
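As a minimal sketch of this data model, a triple can be held in a small record and a graph in a list of such records. The namespace URI, the property partOf and the resource HistoricBible below are hypothetical names chosen for illustration; a real application would use an RDF library rather than this hand-made structure.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str                  # URI of the resource being described
    predicate: str                # URI of the property (the arc label)
    obj: str                      # URI of another resource, or a literal
    obj_is_literal: bool = False  # True when obj is a literal value


# Hypothetical namespace for our fictitious organisation.
MI = "http://www.m-i.org/schemas/images#"

graph = [
    # Object is a literal (drawn as a box in the graph notation).
    Triple(MI + "image5kb78d38i", MI + "hasCreator", "Alexander Master", True),
    # Object is another resource (drawn as a node).
    Triple(MI + "image5kb78d38i", MI + "partOf", MI + "HistoricBible"),
]


def properties_of(subject, triples):
    """Collect the property/value pairs stated about one subject."""
    return {t.predicate: t.obj for t in triples if t.subject == subject}


print(properties_of(MI + "image5kb78d38i", graph))
```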
Graphic 2: RDF Data Model

Triple 1
Subject: http://www.m-i.org/schemas/images#ColumnMiniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Miniature

Triple 2
Subject: http://www.m-i.org/schemas/images#Miniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Image

With these two triples we state that ColumnMiniature is a subclass of Miniature and that Miniature is a subclass of Image.
The predicate in our statements is the rdfs:subClassOf property, which is predefined in the RDF Schema namespace http://www.w3.org/2000/01/rdf-schema#
The classes Image, Miniature and ColumnMiniature would also need to be defined in our RDF Schema namespace http://www.m-i.org/schemas/images
For details on how to use RDF Schema for defining your domain ontology, see section 3. In a nutshell, RDF Schema is a 'higher-level' language which is itself defined using RDF.
Semantic Transformation 2: Creating and Validating the RDF Statements
In the mapping process, an editor tool (in the FMS case a tool developed by the project team, called Meedio) receives the XML documents as input and assists in transforming them into semantically valid RDF statements (instance descriptions). The tool serves as an instance editor, which provides a convenient way of finding and selecting from the XML metadata elements correct instance values for a particular property. The editor tool also serves as a semantic metadata validator. When the museum cataloguer saves the set of RDF statements corresponding to an XML document, the semantics of these statements are validated against the property constraints of the FMS ontology. The result of a successful mapping and validation process is a unique set of RDF triples, called the RDF card.
RDF Mapping Rules
When RDF is used to define the meaning of XML metadata elements, a set of mapping rules is created. A mapping rule is a template of RDF triples in which XPath expressions are used to identify the actual element values. XPath is a language for addressing parts of an XML document and was designed for use in XML processing software (XSLT, XPointer and others).
When such a rule is applied to an XML document, the XPath expressions are instantiated with matching element values. If the rule matches, the RDF template evaluates to a set of RDF triples in which the XPath expressions are substituted by the corresponding values of the XML elements.
For example, by applying the template rule <image5kb78d38i/type mi:hasCreator image5kb78d38i/creator> to the XML document described in the section on XML, the following result would be obtained: <Image Miniature mi:hasCreator 'Alexander Master'>
mi:hasCreator is an example of an RDF property. Such properties are explained in section 3, RDF Schema.
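How such a rule might be applied can be sketched in a few lines of Python. The rule format (a tuple of subject path, property, object path) and the function apply_rule are our own assumptions and not the format used by the FMS editor tool; note also that ElementTree supports only a subset of XPath. The subject is simply whatever value the type element holds in the document.

```python
import xml.etree.ElementTree as ET

# Prefix-to-namespace map used when evaluating the path expressions.
NS = {"mi": "http://www.m-i.org/images"}

DOC = b"""<?xml version="1.0" encoding="ISO-8859-1"?>
<mi:image xmlns:mi="http://www.m-i.org/images" image_id="image5kb78d38i">
  <mi:type>Column Miniature</mi:type>
  <mi:creator>Alexander Master</mi:creator>
</mi:image>"""

# Hypothetical rule: (path for the subject, property, path for the object).
RULE = ("mi:type", "mi:hasCreator", "mi:creator")


def apply_rule(root, rule):
    """Instantiate one triple template against one XML document."""
    subject_path, prop, object_path = rule
    subject = root.findtext(subject_path, namespaces=NS)
    value = root.findtext(object_path, namespaces=NS)
    if subject is None or value is None:
        return []  # the rule does not match this document
    return [(subject, prop, value)]


root = ET.fromstring(DOC)
print(apply_rule(root, RULE))
```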
Note: Due to the limited space permitted, we do not address issues of term mapping. This is an important aspect of the mapping and validation process carried out in the FMS project. Working with metadata from different museums, the project team needs to deal with partly different terminologies. Their technical solution to synonymous terms (i.e. different terms referring to the same concept) is to attach synonym sets to the FMS ontology classes. With situations where polysemous terms occur (i.e. the same term refers to different concepts), the editor tool cannot cope, and the cataloguer needs to select the correct interpretation.
Semantic Transformation 3: The RDF Schema (RDFS)
The shared ontology for the textiles domain is created by using Resource Description Framework Schema (RDFS). An RDF Schema is a tool for indicating the classes of resources one wants to describe, as well as for defining the properties used to describe those resources. Furthermore, class/sub-class relationships and property/sub-property relationships can be defined. The museums are mapping their metadata to the classes and properties defined by the RDF Schema of the FMS initiative. Thereby they are making the meaning of the metadata explicit and representing them in a harmonised, uniform way.
RDF Schema
In the section Semantic Transformation 1 we have described the data model provided by RDF for expressing statements about Web resources. But we also need a vocabulary for the RDF statements, namely classes and properties defined with RDF Schema (RDFS).
In brief, the RDF Schema mechanism provides a pre-defined vocabulary, a basic type system that can be used in creating domain-specific schemas. Its role is to allow for declaring metadata properties (e.g. for 'type', 'subject' or 'creator'), to define the classes of resources they may be used with, to restrict possible combinations, and to detect violations of those restrictions.
Defining classes
With RDF Schema (RDFS), Web resources can be defined as instances of one or more classes. In addition, classes can be organised in a hierarchical fashion. As we hold a collection of digital images drawn from illustrated medieval manuscripts, we first need to define a class of things that are images. In RDF Schema, a class is any resource having an rdf:type property whose value is the RDFS-defined resource rdfs:Class.
So, using the basic RDF data model, we define:
mi:Image [resource] rdf:type [property] rdfs:Class [value]
The self-defined prefix mi (for medieval images) stands for the URI reference of our RDF Schema namespace, http://www.m-i.org/schemas/images
In our image collection we have various special kinds of digitised images, such as column miniatures, decorated initials, schematic drawings, etc. To distinguish, for example, the miniatures, we first need to define a general class Miniature and subclasses of miniatures, e.g. a subclass ColumnMiniature:
mi:Miniature rdf:type rdfs:Class
mi:ColumnMiniature rdf:type rdfs:Class
Secondly, we need to define that mi:ColumnMiniature is a subclass of mi:Miniature and that mi:Miniature is a subclass of mi:Image, for which we use the predefined rdfs:subClassOf property:
mi:Miniature rdfs:subClassOf mi:Image
mi:ColumnMiniature rdfs:subClassOf mi:Miniature
As the rdfs:subClassOf property is transitive, this means that mi:ColumnMiniature is also implicitly a subclass of mi:Image.
Graphic 2 visualises this with the nodes and arcs of the basic RDF data model.
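The transitivity of rdfs:subClassOf can be made explicit with a short sketch that follows the stated subclass links upwards. The dictionary representation and the function name superclasses are our own assumptions for illustration.

```python
# Directly stated subclass relationships: class -> list of parent classes.
SUB_CLASS_OF = {
    "mi:ColumnMiniature": ["mi:Miniature"],
    "mi:Miniature": ["mi:Image"],
}


def superclasses(cls, sub_class_of):
    """Return all direct and implied superclasses of cls, nearest first."""
    seen = []
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in sub_class_of.get(current, ()):
            if parent not in seen:
                seen.append(parent)
                stack.append(parent)
    return seen


# mi:Image is never stated directly for mi:ColumnMiniature; it is implied.
print(superclasses("mi:ColumnMiniature", SUB_CLASS_OF))
```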
Defining properties
In order to make the meaning of our metadata (e.g. 'type') explicit, we need to be capable of declaring specific properties that characterise the classes of things we hold at http://www.m-i.org, e.g. digital images of medieval column miniatures.
Basically, RDF Schema defines properties in terms of the classes of resources to which they apply. This is the role of the rdfs:domain and rdfs:range mechanisms.
rdfs:range
The range constraint defines the class or set of classes whose instances can be values of a particular property. If we want to define the property mi:hasType, we must describe this resource (which we locate at http://www.m-i.org/schemas/images) with an rdf:type property whose value is rdf:Property:
mi:hasType [resource] rdf:type [property] rdf:Property [value]
The following RDF statements indicate that mi:ColumnMiniature is a class, mi:hasType is a property, and RDF statements using the mi:hasType property have instances of mi:ColumnMiniature as values:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:range mi:ColumnMiniature
rdfs:domain
The domain constraint restricts the set of classes whose instances may have a particular property attached to them. If we want to indicate that the property mi:hasType applies to instances of class mi:ColumnMiniature, we would write:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:domain mi:ColumnMiniature
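Treating domain and range as constraints, as described above, a statement can be checked against the schema roughly as follows. This is only a sketch: the classes mi:Creator and mi:Place, the instance names and the function violations are hypothetical, and subclass reasoning is deliberately left out.

```python
# Hypothetical schema fragment: each property with its domain and range.
SCHEMA = {
    "mi:hasCreator": {"domain": "mi:Image", "range": "mi:Creator"},
}

# Hypothetical instance data: resource -> class it is an instance of.
TYPES = {
    "mi:image5kb78d38i": "mi:Image",
    "mi:AlexanderMaster": "mi:Creator",
    "mi:Utrecht": "mi:Place",
}


def violations(subject, prop, obj):
    """Return which constraints ('domain', 'range') a statement violates."""
    rule = SCHEMA[prop]
    problems = []
    if TYPES.get(subject) != rule["domain"]:
        problems.append("domain")
    if TYPES.get(obj) != rule["range"]:
        problems.append("range")
    return problems


# A creator as value satisfies the range; a place does not.
print(violations("mi:image5kb78d38i", "mi:hasCreator", "mi:AlexanderMaster"))
print(violations("mi:image5kb78d38i", "mi:hasCreator", "mi:Utrecht"))
```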
Benefits of RDF
In a SearchWebServices.com definition of RDF, some benefits of RDF are mentioned:
| 'By providing a consistent framework, RDF will encourage the providing of metadata about Internet resources.
| Because RDF will include a standard syntax for describing and querying data, software that exploits metadata will be easier and faster to produce.
| The standard syntax and query capability will allow applications to exchange information more easily.
| Searchers will get more precise results from searching, based on metadata rather than on indexes derived from full text gathering.
| Intelligent software agents will have more precise data to work with.'[13]
This is a well-crafted listing of RDF benefits, from the provision and exchange of better metadata to agents working with them, hopefully for the benefit of humans. But, as explicitly stated by SearchWebServices.com, these are only potential benefits, i.e. they depend on the level of actual uptake of RDF.
[13] whatis.com / searchWebServices.com Definitions - Resource Description Framework, http://searchwebservices.techtarget.com/sDefinition/0,,sid26_gci213545,00.html (last updated July 27, 2001)
Generating and Using the Knowledge Space
The RDF cards represent the original XML documents at the semantic level. The union of such RDF cards constitutes a knowledge base, which is a harmonised semantic representation of the underlying heterogeneous databases.
However, so far the RDF instance descriptions have not left the museum. The museum has complete control of the information it wants to publish, and it does not need to allow the FMS system access to its internal database system. The RDF data are placed in a public directory on the museum's WWW server.
The Web crawler of the FMS system harvests the instance descriptions from the different museums, and the system combines them into an RDF repository. This repository is a large semantic graph that consists of the shared ontology and metadata.
How does a user now search and navigate in this knowledge space? In the FMS system this is implemented by server-side software called Ontogator. Based on the semantic graph, this software dynamically generates semantic linkages for the user's Web browser.
One way of using the FMS system is view-based filtering. The user can select classes of resources from the ontology, and the system finds the instances that match the selected class restrictions. By constraining classes (views) further, the user eventually arrives at the collection instance data searched for.
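The idea of combining harvested RDF cards into one repository and then filtering by class can be sketched as follows. This is not how Ontogator is implemented; the museum identifiers, the tuple representation of triples and the function names are our own assumptions.

```python
RDF_TYPE = "rdf:type"
SUB_CLASS_OF = "rdfs:subClassOf"

# Shared ontology: the class hierarchy used in this primer.
ontology = {
    ("mi:ColumnMiniature", SUB_CLASS_OF, "mi:Miniature"),
    ("mi:Miniature", SUB_CLASS_OF, "mi:Image"),
}

# Hypothetical RDF cards harvested from two museums.
card_a = {("museumA:img1", RDF_TYPE, "mi:ColumnMiniature")}
card_b = {("museumB:img7", RDF_TYPE, "mi:Miniature"),
          ("museumB:img8", RDF_TYPE, "mi:Image")}


def build_repository(ontology, cards):
    """The repository is the union of the ontology and all RDF cards."""
    return set(ontology).union(*cards)


def subclasses(graph, cls):
    """Return cls together with all its direct and indirect subclasses."""
    found = {cls}
    changed = True
    while changed:
        changed = False
        for s, p, o in graph:
            if p == SUB_CLASS_OF and o in found and s not in found:
                found.add(s)
                changed = True
    return found


def instances_of(graph, cls):
    """View-based filter: instances of cls or of any of its subclasses."""
    wanted = subclasses(graph, cls)
    return sorted(s for s, p, o in graph if p == RDF_TYPE and o in wanted)


repository = build_repository(ontology, [card_a, card_b])
print(instances_of(repository, "mi:Miniature"))
```

Selecting the broader class mi:Image returns all three instances, while the narrower class mi:ColumnMiniature returns only one; this is the narrowing effect of constraining views further.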
The software also supports topic-based navigation by providing semantic links between topics of interest, the creation of which is based on the collection domain ontology and the related metadata of the collection records. This means that the links also provide the user with an impression of the wider context and pragmatics of the objects in the museums' collections.
From human users to software agents
As described in the Finnish Museums on the Semantic Web example, the RDF repository is a large semantic graph that consists of the shared ontology and metadata of the participating museums. Such a repository can be queried, and the results, a set of pointers to the relevant resources, can be accessed using Web browsers. The opportunities provided by a system like the one developed by the FMS initiative (e.g. topic-based navigation) are at present restricted primarily to human users.
The Semantic Web vision includes intelligent software agents which 'understand' semantic relationships between Web resources and seek relevant information as well as perform transactions for humans.[14]
This software would be capable of autonomous action, i.e. could run without direct human control or constant supervision, and ideally would be very flexible in doing this. Characterisations of this flexibility include actions that are 'reactive', 'proactive' and 'social' (see below).
While the basic idea of agents is very intuitive and appealing, the actual theory is complex, the tools are immature, and the solutions are small and prototype-based. In fact, as a parallel, distributed systems technology, agents belong to the most complex class of software technology.
However, this primer will conclude with a summary of what an intelligent software agent is and what such software would generally be capable of doing. This should also serve as an indication of how great the challenge for research and technological development is to make the full Semantic Web vision a reality.
Intelligent Software Agents
The following definitions are taken from Michael Wooldridge’s introduction to multiagent systems.15
Agent
‘An agent is a computer system capable of autonomous action in some environment.’
Intelligent agent
‘An intelligent agent is a computer system capable of flexible autonomous action in some environment.’
Flexible autonomous action
‘By flexible autonomous action we mean reactive, proactive, social.’
| Reactivity: ‘A reactive system is one that maintains an ongoing interaction with its environment and responds to changes that occur in it (in time for the response to be useful).’
| Proactiveness: ‘An agent serves a purpose and therefore exhibits goal-directed behaviour, including the capacity to recognise opportunities for useful courses of action.’
| Social ability: ‘Social ability in agents is the ability to interact with other agents (and possibly humans) via some kind of agent communication language and perhaps cooperate with others.’
Desirable further properties of agents are:
| Mobility: the ability to move around an electronic network.
| Rationality: an agent will act in such a way that it does not prevent itself from achieving its goals (as far as this is possible with a limited set of beliefs representing its world knowledge).
| Learning: an agent will improve its performance over time.
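The three aspects of flexible autonomous action can be illustrated with a deliberately tiny toy. The sketch below is not taken from Wooldridge; the class `CollectorAgent`, its methods and the message format are all hypothetical, and real agent platforms are far more elaborate.

```python
class CollectorAgent:
    """Toy agent that gathers resource URIs on one topic (hypothetical design)."""

    def __init__(self, name, topic, goal_size):
        self.name = name
        self.topic = topic
        self.goal_size = goal_size
        self.found = []   # URIs collected so far
        self.inbox = []   # messages received from other agents

    def goal_reached(self):
        return len(self.found) >= self.goal_size

    def perceive(self, record):
        # Reactivity: respond to a change in the environment (a new record).
        if record["topic"] == self.topic and record["uri"] not in self.found:
            self.found.append(record["uri"])

    def step(self, peers):
        # Proactiveness: while the goal is unmet, take the initiative.
        if self.goal_reached():
            return
        for peer in peers:
            # Social ability: ask other agents, using a very crude "message".
            peer.inbox.append({"sender": self, "topic": self.topic})

    def handle_messages(self):
        # Answer requests by passing on matching URIs to the sender.
        for msg in self.inbox:
            if msg["topic"] == self.topic:
                for uri in self.found:
                    msg["sender"].perceive({"topic": self.topic, "uri": uri})
        self.inbox.clear()


a = CollectorAgent("a", "textiles", goal_size=2)
b = CollectorAgent("b", "textiles", goal_size=1)
a.perceive({"topic": "textiles", "uri": "http://museum-a.example/obj/1"})
b.perceive({"topic": "textiles", "uri": "http://museum-b.example/obj/7"})
a.step([b])          # a has not reached its goal, so it asks b
b.handle_messages()  # b answers with what it knows
```

Mobility, rationality and learning are not modelled here; the point is only to show how the three definitions map onto distinct pieces of behaviour.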
14 T. Berners-Lee, J. Hendler, O. Lassila: Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html
15 M. Wooldridge: An Introduction to Multiagent Systems. Chichester: Wiley 2002, and http://www.csc.liv.ac.uk/~mjw/pubs/imas/
Resources
In the primer no references are made to the documents of the World Wide Web Consortium (W3C). All relevant W3C recommendations can be found at http://www.w3.org
Of the wealth of introductory materials on XML and RDF available on the Web, the following in particular are useful to consult for further details:
http://www.w3schools.com/xml/
http://www.w3schools.com/schema/
http://www.w3.org/TR/rdf-primer/
K. S. Candan, H. Liu, R. Suvarna: Resource Description Framework: Metadata and its Applications. In: SIGKDD Explorations, Vol. 3, 1 (2001), 6-19
Pierre-Antoine Champin: RDF Tutorial (2001), http://www710.univ-lyon1.fr/~champin/rdf-tutorial/rdf-tutorial.html
S. Decker, M. P. Mitra, S. Melnik: Framework for the Semantic Web: An RDF Tutorial, http://www.ida.liu.se/~asmpa/courses/sweb/rdf/rdf_tutorial.pdf
[Graphic 3: Set-up of the Finnish Museums on the Semantic Web System (overview of the system’s set-up). Components shown: collection databases 1 to n (relational schemas, DBMS); XML repositories (XML Schema: syntactic interoperability); metadata editor and RDF instances (RDF Schema: semantic interoperability); Web crawler; RDF database holding the semantic graph (knowledge space of shared ontology and metadata); server with Ontogator software; user client (WWW browser) offering topic-based navigation and view-based filtering.]
THE DARMSTADT FORUM PARTICIPANTS
Wernher Behrendt, Salzburg Research, Austria
Wernher Behrendt is a Senior Researcher at the Sun Technology and Research Excellence Center at Salzburg Research (Austria), working on multimedia middleware and interoperation issues. He holds an MSc in Cognitive Science from Manchester University and has more than 10 years’ experience in near-to-market IT research. From 1989 to 1995 he was a Senior Research Associate in the Informatics Department at Rutherford Appleton Laboratory (UK), working on embedded knowledge-based systems in distributed multimedia presentation systems. From 1995 to 1998 he was a Senior Research Associate at Cardiff University (UK), working on interoperation between heterogeneous information systems. Mr Behrendt has held courses in Computer Science and has worked on projects ranging from software engineering methods and quality assurance to legacy system reengineering using migration methods and distributed systems middleware.
E-mail: wernher.behrendt@salzburgresearch.at
Paolo Buonora, Archivio di Stato di Roma, Italy
Paolo Buonora holds a degree in Philosophy from the University of Rome ‘La Sapienza’ (1976). He worked in the Italian State Archive Administration from 1978, where he was first involved in editing the Guida Generale degli Archivi di Stato italiani. From 1986 he worked in the Soprintendenza archivistica per il Lazio, surveying audiovisual archives and municipal archives; from 1989 to 1991 at the Perugia University, engaged in doctoral research in ‘Urban and rural history’; and from 1991 to 1994 again in the Soprintendenza archivistica per il Lazio. After 1994 he worked in the Archivio di Stato di Roma, where he was responsible for the photograph service and several working groups on informatics application in archival documentation. From 1997 until the present time he has planned and directed the Imago II project in the Archivio di Stato di Roma.
See httpwwwasrmarchivbeniculturaliitsidimagoIMAGOIIenhtml
Samuel Cruz-Lara, University of Nancy 2 and LORIA, France
Samuel Cruz-Lara obtained a Master’s degree in Computer Science in 1984 (University Henri Poincaré, Nancy 1) and a PhD degree in Computer Science in 1988 (National Polytechnic Institute of Lorraine). The central topic of his PhD thesis was the generation of integrated development environments by using attribute grammars.
He is currently Associate Professor at the University of Nancy 2 (Institute of Technology, Computer Science Department) and permanent Researcher at LORIA (Lorraine Laboratory for Research in Computer Science and its Applications - UMR 7503 - CNRS - INRIA - Universities of Nancy). He is a member of the ‘Language and Dialogue’ team and has conducted several research activities on distributed software architectures and textual linguistic resources management. He is currently working in the context of distributed architectures and multimedia resources management. Dr Cruz-Lara has participated in several projects, in particular CNRS/SILFIDE and MLIS-ELAN, and he is at present co-leader of the ‘Digital Museum’ project. This is a joint project between LORIA and the National Chi-Nan University. The ‘Digital Museum’ project is sponsored by the National Science Council of the Republic of China (Taiwan) and supported by INRIA (France).
Costis Dallas, Critical Publics SA, Greece
Costis Dallas is Chairman and Senior Researcher of Critical Publics (http://www.criticalpublics.com), a group of companies active in the field of strategic communications, creative design and technology. He is currently a Lecturer in the Department of Communication and Mass Media of Panteion University, and has over 15 years of research and professional experience in hypermedia applications, human factor issues and cultural information systems. Dr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA, co-founder and Chair of the Multimedia Working Group of the International Council of Museums (CIDOC/ICOM), Head of Documentation and Systems of the Benaki Museum, General Director of the Foundation of the Hellenic World, Special Secretary of the Greek Ministry of Education in charge of libraries, archives and instructional technologies, and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: info@criticalpublics.com
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (http://www.nl.adlibsoft.com), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that he joined Databasix, producer of the ADLIB software. In 1991 Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was renamed later as ADLIB Information Systems.
ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include:
CIMI Dublin Core test bed - ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database in addition to the ‘standard’ HTML/XML access.
SPECTRUM-XML - A project of the UK mda together with a team of software vendors to produce a schema for exchange of data which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software.
Internet Gelderse Musea (http://www.igem.nl) and Maritiem Digitaal (http://www.maritiemdigitaal.nl) - Two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.
E-mail: bert@nl.adlibsoft.com
Nicola Guarino, Italian National Research Council, Italy
Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991, and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS) and associate editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: Nicola.Guarino@ladseb.pd.cnr.it
Janneke van Kersen, Digital Heritage Association, The Netherlands
Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992 she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing, and more recently she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University.
Since October 1999 Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations such as museums, archives, archaeological organisations, monuments and special collections of libraries in a context-rich and XML-based environment. Interoperability from both a technical and an organisational perspective is therefore a major issue. The Association offers access to heritage information to the general public at http://www.cultuurwijzer.nl and customised access for the educational field at http://www.cultuurwijs.nl
E-mail: jannekevankersendennl
Marco Meli, EDW International, Italy
Marco Meli is CEO and co-founder of EDW International (http://www.edw-international.com), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia, VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996, and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meli@edw.it
Paul Miller, UKOLN, UK
Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, http://www.ukoln.ac.uk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK’s Further and Higher Education Funding Councils and Resource, the Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI), and the Metadata Working Group of the UK Government’s Office of the e-Envoy.
E-mail: P.Miller@ukoln.ac.uk
Frank Nack, CWI, The Netherlands
Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on ‘The Application of Video Semantics and Theme Representation for Automated Film Editing’ at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling, and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document, and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: Frank.Nack@cwi.nl
Franco Niccolucci, Florence University, Italy
Franco Niccolucci has a background in mathematics and computer science, and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference.
See also http://www.geog.port.ac.uk/hist-bound/people/niccolucci.htm
Seamus Ross, HATII, University of Glasgow, UK
Dr Seamus Ross is Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Network) (IST-2001-32706), a European Union funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously he was Assistant Secretary for Information Technology at the British Academy, and before that worked for a company specialising in expert systems and software development as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group’s PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: S.Ross@hatii.arts.gla.ac.uk
Andrea Scotti, Institute & Museum for the History of Science, Italy
Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995 he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.
Since 1996 Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza, the National Central Library and under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (01.02.01-30.07.03), funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies.
See http://cweb.inria.fr/Projects/Mesmuses
E-mail: scottian@galileo.imss.firenze.it
DIGICULT PROJECT INFORMATION
DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over the period of 30 months (03/2002-08/2004).
In order to encourage early take-up, DigiCULT produces seven Thematic Issues and three Technology Watch Reports, along with the newsletter DigiCULT.Info.
DigiCULT draws on the results of the strategic study ‘Technological Landscapes for Tomorrow’s Cultural Economy (DigiCULT)’ that was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications), in 2000 and completed in 2001.
Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at http://www.digicult.info
For further information on DigiCULT please contact the team of the project co-ordinator:
Mr Guntram Geser, guntram.geser@salzburgresearch.at, Phone: +43-(0)662-2288-303
Mr John Pereira, john.pereira@salzburgresearch.at, Phone: +43-(0)662-2288-247
Salzburg Research Forschungsgesellschaft, Jakob-Haringer-Str. 5/III, A - 5020 Salzburg, Austria
Phone: +43-(0)662-2288-200, Fax: +43-(0)662-2288-222
http://www.salzburgresearch.at
Project Partner:
HATII - Humanities Advanced Technology and Information Institute, University of Glasgow, http://www.hatii.arts.gla.ac.uk
Contact: Mr Seamus Ross, s.ross@hatii.arts.gla.ac.uk
The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom
DigiCULT Thematic Issue 1 - Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum held in Barcelona on May 6th 2002, in the context of the DLM-Conference 2002.
DigiCULT Thematic Issue 2 - Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum held in Essen, Germany, on September 3rd 2002, in the context of the AIIM Conference DMS EXPO.
DigiCULT Thematic Issue 3 - Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum held on January 21st 2003 at Fraunhofer IPSI, Darmstadt, Germany.
DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum on Learning Objects that will take place at the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, on July 2nd 2003.
IMPRINT
This Thematic Issue is a product of the DigiCULT Project (IST-2001-34898).
Authors:
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy
Images:
Images for this Thematic Issue have been provided by and are reproduced with kind permission of the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, Netherlands.
Graphics & Layout:
Jan Steindl, Salzburg Research
ISBN 3-902448-00-8
Printed in Austria
© 2003
IMAGES
Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I. Image on p. 5, from Fol. 264r; size 80x75; illuminator: Orosius Master a.o.
Bible Historiale, Paris, c. 1320-1340, Volume I. Image on p. 7, from Fol. 2r; size 45x50; illum.: ‘Sub-Fauvel’ Master
Historic Bible, Utrecht, c. 1430, Volume I. Images on pp. 8, 10, 14, 28, from Fol. 3r, 3v, 4v, 7v; size 55x85 to 60x85; illum.: Alexander Master o.a.
Psalter, Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468. Images on pp. 11, 13: Wednesday Matins, Invitatorium, from Fol. 219r
Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350. Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron), from Fol. 104rb1, 106rb2, 124r, 111ra; size 40x55 to 50x55
Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335. Image on p. 21, from Fol. 4va1; size 45x55
Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460. Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41
Psalter, Normandy, c. 1180. Image on p. 26, from Fol. 3v; size 160x125
Breviary, Cambray(?), c. 1275-1300. Images on pp. 27, 29, 31, 33, 34, 35, 37, from Fol. 211r; size 205x135
© Koninklijke Bibliotheek, The Hague. Used with permission.
(2b) Namespace: Since element names in XML are not fixed, name conflicts can occur when different documents use the same names describing different types of elements. To prevent such conflicts, a unique namespace should be defined using a Uniform Resource Identifier (URI). An XML namespace is a collection of names that are used as element types and attribute names (cf. http://www.w3.org/TR/REC-xml-names). Our namespace declaration in the start tag of the root element is xmlns:mi="http://www.m-i.org/images"
The namespace prefix mi (for medieval images) functions as a placeholder for the namespace name. It needs to show up in all element tags (e.g. <mi:type></mi:type>).
(2c) image_id="image5kb78d38i": This is the id attribute, which contains the unique identifier of the data source record.
(3-9) All other elements must be within the root element and can themselves have sub-elements (child elements), which must be properly nested within their parent element. Our elements do not have sub-elements.
Other syntax rules are, for example: all start tags must match end tags; because XML tags are case sensitive (i.e. the tag <mi:Creator> is different from the tag <mi:creator>), they must also be written with the same case; all elements must have a closing tag; all attribute values must be within quotation marks (e.g. "image5kb78d38i").
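The namespace and well-formedness rules above can be checked with any XML parser. The following minimal sketch uses Python's standard `xml.etree.ElementTree`; the element names and the namespace URI follow the primer, while the element values and the helper `is_well_formed` are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

MI = "http://www.m-i.org/images"  # namespace name used in the primer

# Element values are illustrative; only the names follow the primer.
GOOD = (
    '<?xml version="1.0"?>'
    f'<mi:image xmlns:mi="{MI}" image_id="image5kb78d38i">'
    '<mi:iconclass>71A3421</mi:iconclass>'
    '<mi:creator>unknown</mi:creator>'
    '</mi:image>'
)

# Two deliberate violations of the syntax rules:
BAD_CASE = GOOD.replace("</mi:creator>", "</mi:Creator>")  # case mismatch
BAD_QUOTES = GOOD.replace('image_id="image5kb78d38i"',
                          'image_id=image5kb78d38i')       # unquoted value


def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


root = ET.fromstring(GOOD)
# The parser replaces the prefix "mi" by the full namespace name.
creator = root.find("mi:creator", {"mi": MI})
```

After parsing, `root.tag` carries the namespace name in braces rather than the prefix, which shows that the prefix really is only a placeholder.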
Syntactic Transformation 2: The XML Schema
In order to allow for syntactic harmonisation, the XML documents of the museums should conform to the XML Schema of the FMS initiative. Therefore, the museums use the initiative’s XML Schema when they create their XML documents, validating them against the Schema. If the documents are valid, the process can continue to the semantic level.
XML Schema
The XML Schema defines the building blocks of an XML document, including:
| elements and attributes that can appear in a document
| which elements are child elements, as well as their order and number
| whether an element is empty or can include text
| the data types for elements and attributes
| as well as default and fixed values for elements and attributes
XML with an XML Schema is designed to be self-descriptive. One of the greatest strengths of XML Schema is that it allows for data typing. The most common data types are xs:string, xs:decimal, xs:integer, xs:boolean, xs:date, xs:time. In the example below, which is the XML Schema for the XML document (image5kb78d38i.xml) shown in graphic 1, we only use the data type xs:string. This data type is used for values that contain character strings.
(1) <?xml version="1.0"?>
(2a) <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
(2b) targetNamespace="http://www.m-i.org/images"
(2c) xmlns="http://www.m-i.org/images"
(2d) elementFormDefault="qualified">
(3) <xs:element name="image">
(4) <xs:complexType>
(5) <xs:sequence>
(6) <xs:element name="type" type="xs:string"/>
(7) <xs:element name="subject" type="xs:string"/>
(8) <xs:element name="iconclass" type="xs:string"/>
(9) <xs:element name="creator" type="xs:string"/>
(10) <xs:element name="manuscript" type="xs:string"/>
(11) <xs:element name="place" type="xs:string"/>
(12) <xs:element name="year" type="xs:string"/>
(13) </xs:sequence>
(14) <xs:attribute name="image_id" type="xs:string" use="required"/>
(15) </xs:complexType>
(16) </xs:element>
(17) </xs:schema>
Short explanations
(1) The XML declaration, which states that the document conforms to the 1.0 specification of XML.
(2a) Determines that the elements and data types that are used to construct the schema come from the W3C’s XML Schema namespace. Consequently, each of the elements and data types in the schema has the prefix xs:, which identifies them as belonging to the vocabulary of the XML Schema language rather than the vocabulary of our (fictitious) organisation M-i.org.
(2b) Indicates that the elements defined by this schema come from our http://www.m-i.org/images namespace.
(2c) Is our default namespace.
(2d) Demands that any elements used by an XML document which were declared in this schema must be namespace qualified.
(3/16) The parent element for the data typing of the image descriptions we provide at http://www.m-i.org/images. It is (4) defined as an xs:complexType, i.e. it contains child elements (6-12), which are (5/13) surrounded by an xs:sequence element that defines an ordered sequence of these elements.
(6-12) The child elements, which in our example are simple types because they do not contain other elements. They define various elements of our XML documents, e.g. "type", to be of the data type xs:string.
(14) Furthermore, the Schema determines that for the element xs:element name="image" there is a required attribute image_id of the data type xs:string (for example "image5kb78d38i").
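To illustrate what validating a document against this schema involves, here is a minimal sketch in Python. It is not a real XML Schema validator (the standard library has none); it only reads the three things this particular schema prescribes, namely the root element name, the ordered sequence of child elements and the required attributes, and checks a document against them. The functions `read_schema`, `conforms` and `make_doc`, and the placeholder element values, are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

SCHEMA = """<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
    targetNamespace="http://www.m-i.org/images"
    xmlns="http://www.m-i.org/images"
    elementFormDefault="qualified">
  <xs:element name="image">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="type" type="xs:string"/>
        <xs:element name="subject" type="xs:string"/>
        <xs:element name="iconclass" type="xs:string"/>
        <xs:element name="creator" type="xs:string"/>
        <xs:element name="manuscript" type="xs:string"/>
        <xs:element name="place" type="xs:string"/>
        <xs:element name="year" type="xs:string"/>
      </xs:sequence>
      <xs:attribute name="image_id" type="xs:string" use="required"/>
    </xs:complexType>
  </xs:element>
</xs:schema>"""


def read_schema(schema_text):
    """Extract namespace, root name, child order and required attributes."""
    root = ET.fromstring(schema_text)
    top = root.find(f"{XS}element")
    ctype = top.find(f"{XS}complexType")
    children = [e.get("name")
                for e in ctype.find(f"{XS}sequence").findall(f"{XS}element")]
    required = [a.get("name") for a in ctype.findall(f"{XS}attribute")
                if a.get("use") == "required"]
    return root.get("targetNamespace"), top.get("name"), children, required


def conforms(doc_text, schema_text):
    """Simplified check: root name, required attributes, child order."""
    ns, root_name, children, required = read_schema(schema_text)
    try:
        doc = ET.fromstring(doc_text)
    except ET.ParseError:
        return False  # not even well-formed
    if doc.tag != f"{{{ns}}}{root_name}":
        return False
    if any(doc.get(attr) is None for attr in required):
        return False
    return [child.tag for child in doc] == [f"{{{ns}}}{n}" for n in children]


FIELDS = ["type", "subject", "iconclass", "creator",
          "manuscript", "place", "year"]


def make_doc(names, with_id=True):
    """Build a test document with placeholder values (hypothetical helper)."""
    attr = ' image_id="image5kb78d38i"' if with_id else ""
    body = "".join(f"<mi:{n}>value</mi:{n}>" for n in names)
    return f'<mi:image xmlns:mi="http://www.m-i.org/images"{attr}>{body}</mi:image>'
```

A production set-up would use a full schema validator instead, which also checks data types, occurrence constraints and the many other features of XML Schema ignored here.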
Major Benefits of XML
In the context of the Semantic Web, XML provides an interoperable syntactical foundation upon which solutions to the issues of representing relationships and meaning can be built. We also want to highlight the many benefits of XML that are adding to its rapid uptake in the first place, and might in the longer term be supportive in realising the Semantic Web vision on a broader scale.
XML is one of the most important standards developments in recent years. It is an international, universal, non-system and non-application specific data exchange standard. XML is international because it employs Unicode. This means that there is no restriction to the western alphabet, but Arabic, Chinese, Greek, Hebrew, Thai, etc. can be easily integrated.
XML is non-system specific because it is an open standard set by the World Wide Web Consortium (W3C). As such, there is no owner of XML. All the major software suppliers support it; it can be used on any computing platform, from Windows and MacOS to Linux. This makes it easier for organisations to change systems or combine different systems.
XML is also non-application specific, i.e. it can be used in various applications such as data exchange, data harvesting, Web site management, etc. XML is gaining ever-wider acceptance in many application domains, including in particular the cultural heritage community.
Bear in mind also that the major collection management software producers have implemented support for XML in their systems, enabling, for example, the integration of data from different collections and their combination over the Web.
XML allows for multi-channel publishing, i.e. with XML it is easy to produce different products or services from digital cultural heritage assets. Once the data are structured in XML, they can be displayed across a variety of media using an associated style sheet that contains the display information.
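The principle of multi-channel publishing, one structured source and several presentations, can be sketched as follows. In practice the display rules would live in style sheets (for example XSLT or CSS); here two small Python functions stand in for them. The functions `to_html` and `to_text` are hypothetical, and the record is a simplified, namespace-free variant of the primer's image description.

```python
import xml.etree.ElementTree as ET
from html import escape

# Simplified record without namespace; values taken from the primer's example.
RECORD = (
    '<image image_id="image5kb78d38i">'
    "<subject>Eve emerges from Adam's body</subject>"
    "<iconclass>71A3421</iconclass>"
    "</image>"
)


def to_html(xml_text):
    """Channel 1: render the record as an HTML list for a Web page."""
    root = ET.fromstring(xml_text)
    image_id = escape(root.get("image_id", ""))
    items = "".join(
        f"<li><b>{escape(child.tag)}</b>: {escape(child.text or '')}</li>"
        for child in root
    )
    return f'<ul id="{image_id}">{items}</ul>'


def to_text(xml_text):
    """Channel 2: render the same record as plain text lines."""
    root = ET.fromstring(xml_text)
    return "\n".join(f"{child.tag}: {child.text or ''}" for child in root)
```

The record itself is never altered; only the rendering function changes per channel, which is the role the style sheet plays in the text above.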
Finally, XML can be used to create new languages. For example, the Wireless Markup Language (WML), which is used to mark up Internet applications for handheld devices, is written in XML.
Ontological concepts
The goal of the FMS project is to make metadata of the museums’ textiles collections semantically interoperable on the Web. In order to achieve such interoperability, an ontology is being designed that describes the common (lower-level) ontological concepts in this domain of knowledge.4
Ontology
In the Semantic Web architecture, the semantic relationships are not embedded, but explicitly represented by an ontology, or rather an interrelated set of ontologies. In fact, the wide array of information residing on the Web and the perceived need to make it more machine-processable have acted as a strong impetus for the development of ontology languages.
Yet, what is an ontology? An abstract definition of an ontology is that it describes a formal, shared conceptualisation of a particular domain of interest, for example, cultural heritage objects held in art museums. In particular, an ontology allows for constraining, expressing and analysing the intended meaning of the shared vocabulary of concepts and relations in a domain of knowledge.5
If these concepts and relations are formalised to a high degree, the domain has at hand a major building block for developing semantically aware information systems. With Semantic Web technologies, the domain ontology can be made available on the network, cross-referenced with upper-level and other domain ontologies,6 and remote applications or intelligent software agents can refer to it when they interact to provide a certain information service.
4 The available documents (in English) on the FMS initiative state that their ontology is being created using RDF Schema (RDFS). To develop a fully fledged ontology, advanced languages such as DAML+OIL or Web Ontology Language (OWL) would be required.
5 For more elaborate and formal descriptions see: Tom Gruber: What is an Ontology? (1995), http://www-ksl.stanford.edu/kst/what-is-an-ontology.html; Nicola Guarino: Ontology-Driven Conceptual Modelling, part 1-3 (2002), http://ontology.ip.rm.cnr.it/Tutorials
The degree of formalisation of concepts and their relations varies considerably between different domains of knowledge. At the lower end one finds lexicons and simple taxonomies (i.e. an ordered classification system where terms are related hierarchically). At the middle level one might place thesauri, i.e. controlled vocabularies that are structured to show relationships between terms and concepts and, for example, allow for retrieving them from a database. At the high end of formalisation of knowledge there are axiomatised logic theories. Such theories include rules to ensure the well-formedness and logical validity of statements expressed in the language of the scientific discipline.
In the cultural heritage sector, a powerful IT-supported example of a hierarchical classification system is Iconclass. This supports the documentation of images, in particular art historical images, by providing a systematic collection of 28,000 ready-made definitions of objects, persons, events, situations and abstract ideas that can be the subject of an image. The definitions consist of an alphanumeric classification code and its textual correlate.7
For example, for the image we have described in XML in graphic 1, the Iconclass definition is: 71A3421 Eve emerges from Adam’s body. The Medieval Illustrated Manuscripts Website of the Koninklijke Bibliotheek, The Hague, has an Iconclass Browser in place8 that provides the hierarchical path for this concept in the classification system.
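How a hierarchical path can be derived from such a code may be sketched as follows. The sketch assumes, as a simplification, that every successively longer prefix of the notation names one level of the hierarchy; real Iconclass notations have further features that are ignored here. The function names are hypothetical, and only labels mentioned in this primer are filled in.

```python
def hierarchical_path(notation):
    """Return all successively longer prefixes of a classification code.

    Simplifying assumption: each additional character adds one level.
    """
    return [notation[:i] for i in range(1, len(notation) + 1)]


# Labels as far as they are given in this primer; other levels stay unlabelled.
LABELS = {
    "7": "Bible",
    "71": "Old Testament",
    "71A3421": "Eve emerges from Adam's body",
}


def labelled_path(notation, labels):
    """Pair each level of the path with its label, or None if unknown."""
    return [(code, labels.get(code)) for code in hierarchical_path(notation)]


path = hierarchical_path("71A3421")
```

For the example code this yields seven levels, from the main division "7" down to the full notation "71A3421".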
An outstanding example of a controlled vocabulary is the Art & Architecture Thesaurus (AAT), one of the Getty Research Institute’s Vocabulary Databases. It is a structured vocabulary of more than 125,000 terms and other information about concepts that are used for describing fine art, architecture, decorative arts, archival materials and material culture.9
The W3C’s Resource Description Framework (RDF) Schema Specification, in a section on its scope, mentions concept navigation and states: ‘Thesauri and library classification schemes are well known examples of hierarchical systems for representing subject taxonomies in terms of the relationships between named concepts. The RDF Schema specification provides sufficient resources for creating RDF models that represent the logical structure of thesauri (and other library classification systems).’10
Yet, for realising a full-blown cultural heritage ontology for the Semantic Web, there are currently limitations on both sides. On the one hand, hierarchical classification systems and structured vocabularies do not lend themselves easily to rich inter-linking of conceptual ‘trees’.
A major step further in this direction is the CIDOC object-oriented Conceptual Reference Model (CRM).11 This provides an ontology of 81 classes and 130 properties which describes, in a formal language, concepts and relations relevant to the documentation of cultural heritage.
On the other hand, RDF Schema has limitations when it comes to expressing complex ontological relationships. New languages based on description logics are being developed. These include DAML+OIL and the upcoming Web Ontology Language (OWL), which are capable of fully describing ontologies.
Also worth highlighting is that tools for ontology building are proliferating at the present time. These ontology editors need to be carefully assessed, as their capabilities differ considerably.12
7 Bible
71 Old Testament
71A Genesis from the creation to the expulsion from paradise, and later years of Adam and Eve
71A3 creation of man, the Garden of Eden (Genesis 1:26-2)
71A34 creation of Eve
71A342 Eve is fashioned from Adam’s rib
71A3421 Eve emerges from Adam’s body
6 Upper-level ontologies describe the basic concepts and relationships invoked when information about any domain is expressed in natural language.
7 For in-depth information see the official Iconclass Website, http://www.iconclass.nl
8 Medieval Illustrated Manuscripts Website, http://www.kb.nl/kb/manuscripts/browser. The subject access system for the Website was conceived by Mnemosyne Partners, building on the Iconclass classification system and technologies. See the valuable information they provide at httpwwwmnemosyneorgbusinessmsstempexhtml
9 A detailed description of the AAT is provided at http://www.getty.edu/research/tools/vocabulary/aat/about.html
10 http://www.w3.org/TR/2000/CR-rdf-schema-20000327
11 See the information box at the end of the Forum discussion and the sources mentioned in the Semantic Web Terms and Reading List.
12 Michael Denny: Ontology Building: A Survey of Editing Tools (06-11-2002), http://www.xml.com/pub/a/2002/11/06/ontologies.html
Semantic Transformation 1: The RDF Data Model
Progressing towards semantic interoperability, the metadata in the XML documents are now transformed into RDF statements corresponding to the RDF data model. With these so-called RDF ‘triples’, the XML metadata elements are mapped to the RDF classes and properties which are defined by the RDF Schema of the FMS initiative (see section 3).
‘XML is nothing more than a way to standardize data formats. This is not to underplay XML’s importance. A data-format standard makes all of the more glamorous technologies possible, and RDF is the leading example of the benefit that comes once the data format has been standardized. Many proclaim that RDF is really the XML’s killer app, and with good reason. Despite all this, RDF remains somewhat obscure. This is mainly because, at its core, RDF is very abstract, very dry and very academic.’
Uche Ogbuji: An introduction to RDF (2000), http://www-106.ibm.com/developerworks/library/w-rdf/?dwzone=xml
RDF Data Model
In order to make Web resources semantically interoperable, we need resources that provide machine-understandable information about themselves. In the Semantic Web architecture these statements are built by using the Resource Description Framework (RDF). RDF defines a data model for the statements, describing typed relationships between uniquely identified resources. RDF distinguishes between:
| resources: familiar examples are a Web page, electronic document or digital image, but in RDF entities that are not ‘network retrievable’, e.g. museums, curators or bound medieval manuscripts, can also be resources;
| properties: these identify a specific aspect, characteristic, attribute or relation used to describe the resource;
| statements: these associate a value for a named property with the resource.
Hence, RDF provides a model for describing relationships between resources in terms of named properties and values. The RDF data model intrinsically supports only binary relations. Its base element is the ‘triple’, which takes the form of subject, predicate, object: a resource (the subject) is linked to another resource (the object) through an arc labelled with a third resource (the predicate). The semantics of a triple clearly depends on the property used as predicate.
A convenient way to visualise this is to draw nodes for subject and object and an arrow between them for the predicate (see graphic 2). In this labelled directed graph, subject and predicate (property) are Uniform Resource Identifiers (URIs) and the object is either a URI or a literal (which is drawn as a box). Everything in RDF can be represented by a graph with nodes and arcs, and the data model allows for using the same URI as a node and as an arc label. To represent RDF statements in a machine-processable way, RDF builds on XML. With RDF/XML, a specific XML markup language, RDF information can be represented and exchanged between machines.
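The data model just described can be sketched in a few lines of Python, without any RDF library. The type and function names below are illustrative only, the ‘#’ separator in the m-i.org namespace is an assumption, and the line-based output merely resembles the N-Triples notation (literal escaping is not handled).

```python
from typing import NamedTuple, Union


class URI(str):
    """A resource identified by a URI (a node or an arc label)."""


class Literal(str):
    """A literal value (drawn as a box in the graph)."""


class Triple(NamedTuple):
    subject: URI
    predicate: URI
    obj: Union[URI, Literal]


def to_line(triple: Triple) -> str:
    """Render one statement as a line; literals are quoted, URIs bracketed."""
    if isinstance(triple.obj, Literal):
        obj = f'"{triple.obj}"'
    else:
        obj = f"<{triple.obj}>"
    return f"<{triple.subject}> <{triple.predicate}> {obj} ."


MI = "http://www.m-i.org/schemas/images#"  # '#' separator is an assumption
RDFS = "http://www.w3.org/2000/01/rdf-schema#"

# A graph is simply a set of triples.
graph = {
    Triple(URI(MI + "ColumnMiniature"), URI(RDFS + "subClassOf"), URI(MI + "Miniature")),
    Triple(URI(MI + "Miniature"), URI(RDFS + "subClassOf"), URI(MI + "Image")),
    Triple(URI(MI + "Image"), URI(RDFS + "label"), Literal("Image")),
}

for triple in sorted(graph):
    print(to_line(triple))
```

Keeping the graph as a set means that stating the same triple twice adds nothing, which matches the idea of a graph of nodes and arcs.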
Graphic 2: RDF Data Model
[Two triples drawn as nodes and arcs, each labelled Subject, Predicate, Object: ColumnMiniature, rdfs:subClassOf, Miniature; and Miniature, rdfs:subClassOf, Image.]
With these two triples we state that ColumnMiniature is a subclass of Miniature and that Miniature is a subclass of Image.
The predicate in our statements is the rdfs:subClassOf property, which is predefined in the RDF Schema namespace http://www.w3.org/2000/01/rdf-schema#
The classes Image, Miniature and ColumnMiniature would also need to be defined in our RDF Schema namespace http://www.m-i.org/schemas/images
For details on how to use RDF Schema for defining your domain ontology see section 3. In a nutshell, RDF Schema is a ‘higher-level’ language which is itself defined using RDF.
Semantic Transformation 2: Creating and Validating the RDF Statements
In the mapping process, an editor tool (in the FMS case a tool developed by the project team, called Meedio) receives the XML documents as input and assists in transforming them into semantically valid RDF statements (instance descriptions). The tool serves as an instance editor, which provides a convenient way of finding and selecting from the XML metadata elements correct instance values for a particular property. The editor tool also serves as a semantic metadata validator. When the museum cataloguer saves the set of RDF statements corresponding to an XML document, the semantics of these statements are validated against the property constraints of the FMS ontology. The result of a successful mapping and validation process is a unique set of RDF triples called the RDF card.
RDF Mapping Rules
When RDF is used to define the meaning of XML metadata elements, a set of mapping rules is created. A mapping rule is a template of RDF triples where XPath expressions are used to identify the actual element values. XPath is a language for addressing parts of an XML document and was designed for use in XML parsing software (XSLT, XPointer and others).
When applying such a rule to an XML document, the XPath expressions are instantiated with matching element values. If the rule matches, the RDF template evaluates to a set of RDF triples where XPath expressions are substituted by the corresponding values of the XML elements.
For example, by applying the template rule
<image5kb78d38itype, mi:hasCreator, image5kb78d38icreator>
to the XML document described in the section on XML, the following result would be obtained:
<Image Miniature, mi:hasCreator, ‘Alexander Master’>
mi:hasCreator is an example of an RDF property. Such properties are explained in section 3, RDF Schema.
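How a mapping rule is instantiated can be illustrated with Python's standard xml.etree.ElementTree module, which supports a subset of XPath. This is a simplified sketch and not the FMS implementation: the XML record, its element and attribute names and the exact XPath expressions are assumptions made for the illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML record; element and attribute names are assumptions.
XML_DOC = """
<collection>
  <image id="5kb78d38i">
    <type>Miniature</type>
    <creator>Alexander Master</creator>
  </image>
</collection>
"""

# A mapping rule: XPath for the subject value, a fixed predicate,
# and XPath for the object value.
RULE = (
    ".//image[@id='5kb78d38i']/type",
    "mi:hasCreator",
    ".//image[@id='5kb78d38i']/creator",
)


def apply_rule(root, rule):
    """Instantiate the rule's XPath expressions against one XML document."""
    subject_path, predicate, object_path = rule
    subject = root.find(subject_path)
    obj = root.find(object_path)
    if subject is None or obj is None:
        return []  # the rule does not match this document
    return [(subject.text, predicate, obj.text)]


root = ET.fromstring(XML_DOC)
triples = apply_rule(root, RULE)
print(triples)
```

A document in which either expression finds nothing yields no triples, which corresponds to a rule that does not match.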
Note: Due to the limited space permitted, we do not address issues of term mapping. This is an important aspect of the mapping and validation process carried out in the FMS project. Working with metadata from different museums, the project team needs to deal with partly different terminologies. Their technical solution to synonymous terms (i.e. different terms referring to the same concept) is to attach synonym sets to the FMS ontology classes. The editor tool cannot cope with situations where polysemous terms occur (i.e. the same term refers to different concepts); here the cataloguer needs to select the correct interpretation.
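The difference between the two cases can be made concrete with a small Python sketch. The classes, terms and function names are invented for the illustration and are not taken from the FMS ontology: a term that belongs to exactly one synonym set is resolved automatically, while an unknown or polysemous term is handed back to the cataloguer.

```python
# Hypothetical synonym sets attached to ontology classes.
SYNONYM_SETS = {
    "ex:Drawing": {"drawing", "sketch"},
    "ex:Plan": {"plan", "drawing"},
    "ex:Miniature": {"miniature", "illumination"},
}


def candidate_classes(term, synonym_sets):
    """Return every class whose synonym set contains the term."""
    normalised = term.strip().lower()
    return sorted(cls for cls, terms in synonym_sets.items() if normalised in terms)


def resolve(term, synonym_sets):
    """Return the single matching class, or None if a person must decide."""
    candidates = candidate_classes(term, synonym_sets)
    if len(candidates) == 1:
        return candidates[0]
    return None  # unknown or polysemous term


print(resolve("Illumination", SYNONYM_SETS))
print(candidate_classes("drawing", SYNONYM_SETS))
```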
Semantic Transformation 3: The RDF Schema (RDFS)
The shared ontology for the textiles domain is created by using Resource Description Framework Schema (RDFS). An RDF Schema is a tool for indicating the classes of resources one wants to describe, as well as for defining the properties used to describe those resources. Furthermore, class/sub-class relationships and property/sub-property relationships can be defined. The museums are mapping their metadata to the classes and properties defined by the RDF Schema of the FMS initiative. Thereby they are making the meaning of the metadata explicit and representing them in a harmonised, uniform way.
RDF Schema
In section Semantic Transformation 1 we have described the data model provided by RDF for expressing statements about Web resources. But we also need a vocabulary for the RDF statements, namely classes and properties defined with RDF Schema (RDFS).
In brief, the RDF Schema mechanism provides a pre-defined vocabulary, a basic type system that can be used in creating domain-specific schemas. Its role is to allow for declaring metadata properties (e.g. for ‘type’, ‘subject’ or ‘creator’), to define the classes of resources they may be used with, to restrict possible combinations, and to detect violations of those restrictions.
Defining classes
With RDF Schema (RDFS), Web resources can be defined as instances of one or more classes. In addition, classes can be organised in a hierarchical fashion. As we hold a collection of digital images drawn from illustrated medieval manuscripts, we first need to define a class of things that are images. In RDF Schema a class is any resource having an rdf:type property whose value is the RDFS-defined resource rdfs:Class.
So, using the basic RDF data model, we define:
mi:Image [resource] rdf:type [property] rdfs:Class [value]
The self-defined prefix mi (for medieval images) stands for the URI reference of our RDF Schema namespace, http://www.m-i.org/schemas/images
In our image collection we have various special kinds of digitised images such as column miniatures, decorated initials, schematic drawings, etc. To distinguish, for example, the miniatures, we first need to define a general class Miniature and subclasses of miniatures, e.g. a subclass ColumnMiniature:
mi:Miniature rdf:type rdfs:Class
mi:ColumnMiniature rdf:type rdfs:Class
Secondly, we need to define that mi:ColumnMiniature is a subclass of mi:Miniature and that mi:Miniature is a subclass of mi:Image, for which we use the predefined rdfs:subClassOf property:
mi:Miniature rdfs:subClassOf mi:Image
mi:ColumnMiniature rdfs:subClassOf mi:Miniature
As the rdfs:subClassOf property is transitive, this means that mi:ColumnMiniature is also implicitly a subclass of mi:Image.
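The implicit subclass relationship can be derived mechanically by following the stated rdfs:subClassOf links upwards. A minimal Python sketch, in which the dictionary layout and the function name are illustrative choices:

```python
# Stated subclass relationships: class -> direct superclasses.
SUBCLASS_OF = {
    "mi:ColumnMiniature": {"mi:Miniature"},
    "mi:Miniature": {"mi:Image"},
}


def superclasses(cls, subclass_of):
    """Return all stated and implied superclasses of cls."""
    found = set()
    to_visit = [cls]
    while to_visit:
        current = to_visit.pop()
        for parent in subclass_of.get(current, ()):
            if parent not in found:
                found.add(parent)
                to_visit.append(parent)
    return found


print(sorted(superclasses("mi:ColumnMiniature", SUBCLASS_OF)))
```

Only the two direct relationships are stored; mi:Image appears among the superclasses of mi:ColumnMiniature because the links are followed transitively.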
Graphic 2 on page 32 visualises this with the nodes and arcs of the basic RDF data model.
Defining properties
In order to make the meaning of our metadata (i.e. ‘type’) explicit, we need to be capable of declaring specific properties that characterise the classes of things we hold at http://www.m-i.org, e.g. digital images of medieval column miniatures.
Basically, RDF Schema defines properties in terms of the classes of resources to which they apply. This is the role of the rdfs:domain and rdfs:range mechanisms.
rdfs:range
The range constraint defines the class or set of classes whose instances can be values of a particular property. If we want to define the property mi:hasType, we must describe this resource (which we locate at http://www.m-i.org/schemas/images) with an rdf:type property whose value is rdf:Property:
mi:hasType [resource] rdf:type [property] rdf:Property [value]
The following RDF statements indicate that mi:ColumnMiniature is a class, mi:hasType is a property, and RDF statements using the mi:hasType property have instances of mi:ColumnMiniature as values:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:range mi:ColumnMiniature
rdfs:domain
The domain constraint restricts the set of classes whose instances may have a particular property attached to them. If we want to indicate that the property mi:hasType applies to instances of class mi:ColumnMiniature, we would write:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:domain mi:ColumnMiniature
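Checking a statement against such constraints can be sketched as follows in Python. The sketch takes the two constraints on mi:hasType stated above at face value and treats them as restrictions to be checked, as the text does; the instance identifiers (ex:...) and the class mi:DecoratedInitial are hypothetical, and subclass reasoning is deliberately left out.

```python
# Constraints as stated in the text for the mi:hasType property.
DOMAIN = {"mi:hasType": "mi:ColumnMiniature"}
RANGE = {"mi:hasType": "mi:ColumnMiniature"}

# Hypothetical instance data: instance -> class.
TYPES = {
    "ex:img1": "mi:ColumnMiniature",
    "ex:img2": "mi:ColumnMiniature",
    "ex:init1": "mi:DecoratedInitial",
}


def violations(triple, types):
    """List the domain and range constraints that a statement breaks."""
    subject, predicate, obj = triple
    problems = []
    if predicate in DOMAIN and types.get(subject) != DOMAIN[predicate]:
        problems.append(f"domain: {subject} is not a {DOMAIN[predicate]}")
    if predicate in RANGE and types.get(obj) != RANGE[predicate]:
        problems.append(f"range: {obj} is not a {RANGE[predicate]}")
    return problems


print(violations(("ex:img1", "mi:hasType", "ex:img2"), TYPES))
print(violations(("ex:init1", "mi:hasType", "ex:img2"), TYPES))
```

A validator of the kind described for the editor tool would run a check like this on every statement before an RDF card is saved; an empty list means that no constraint is broken.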
Benefits of RDF
In a SearchWebServices.com definition of RDF some benefits of RDF are mentioned:
| ‘By providing a consistent framework, RDF will encourage the providing of metadata about Internet resources.
| Because RDF will include a standard syntax for describing and querying data, software that exploits metadata will be easier and faster to produce.
| The standard syntax and query capability will allow applications to exchange information more easily.
| Searchers will get more precise results from searching, based on metadata rather than on indexes derived from full text gathering.
| Intelligent software agents will have more precise data to work with.’13
This is a well-crafted listing of RDF benefits, from provision and exchange of better metadata to agents working with them, hopefully for the benefit of humans. But, as explicitly stated by SearchWebServices.com, these are only potential benefits, i.e. they depend on the level of actual uptake of RDF.
13 whatis.com, searchWebServices.com: Definitions - Resource Description Framework, http://searchwebservices.techtarget.com/sDefinition/0,,sid26_gci213545,00.html (last updated July 27, 2001)
Generating and Using the Knowledge Space
The RDF cards represent the original XML documents at the semantic level. The union of such RDF cards constitutes a knowledge base which is a harmonised semantic representation of the underlying heterogeneous databases.
However, so far the RDF instance descriptions have not left the museum. The museum has complete control of the information it wants to publish and it does not need to allow the FMS system access to its internal database system. The RDF data are placed in a public directory on the museum’s WWW server.
The Web crawler of the FMS system harvests the instance descriptions from the different museums and the system combines them into an RDF repository. This repository is a large semantic graph that consists of the shared ontology and metadata.
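If each RDF card is held as a set of triples, combining the harvested cards with the shared ontology amounts to a set union. The following Python sketch uses invented sample data; the identifiers (ex:...) and the contents of the cards are hypothetical.

```python
def build_repository(ontology, cards):
    """Combine the shared ontology and all harvested cards into one graph."""
    repository = set(ontology)
    for card in cards:
        repository |= set(card)  # duplicate triples collapse automatically
    return repository


# Hypothetical sample data.
ONTOLOGY = {("mi:Miniature", "rdfs:subClassOf", "mi:Image")}
CARD_MUSEUM_A = {
    ("ex:a1", "rdf:type", "mi:Miniature"),
    ("ex:a1", "mi:hasCreator", "Alexander Master"),
}
CARD_MUSEUM_B = {
    ("ex:b1", "rdf:type", "mi:Image"),
    ("mi:Miniature", "rdfs:subClassOf", "mi:Image"),  # repeats the ontology
}

repository = build_repository(ONTOLOGY, [CARD_MUSEUM_A, CARD_MUSEUM_B])
print(len(repository))
```

Because all cards use the classes and properties of the same schema, their triples connect into a single graph without any further conversion.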
How does a user now search and navigate in this knowledge space? In the FMS system this is implemented by server-side software called Ontogator. Based on the semantic graph, this software dynamically generates semantic linkages for the user’s Web browser.
One way of using the FMS system is view-based filtering. The user can select classes of resources from the ontology, and the system finds the instances that match the selected class restrictions. By constraining classes (views) further, the collection instance data searched for are eventually found.
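The principle of view-based filtering can be sketched in Python as follows. This illustrates the idea only and is not the Ontogator implementation: the repository contents, the identifiers (ex:...) and the function names are hypothetical. Each selected class matches its own instances and those of its subclasses, and every additional selection narrows the result.

```python
# Hypothetical repository: ontology triples plus instance descriptions.
REPOSITORY = {
    ("mi:ColumnMiniature", "rdfs:subClassOf", "mi:Miniature"),
    ("mi:Miniature", "rdfs:subClassOf", "mi:Image"),
    ("ex:img1", "rdf:type", "mi:ColumnMiniature"),
    ("ex:img2", "rdf:type", "mi:Miniature"),
    ("ex:img3", "rdf:type", "mi:Image"),
    ("ex:img1", "rdf:type", "ex:FifteenthCentury"),
    ("ex:img3", "rdf:type", "ex:FifteenthCentury"),
}


def instances_of(selected_class, repository):
    """Instances of the selected class or of any of its subclasses."""
    classes = {selected_class}
    changed = True
    while changed:  # collect subclasses until nothing new is found
        changed = False
        for subject, predicate, obj in repository:
            if predicate == "rdfs:subClassOf" and obj in classes and subject not in classes:
                classes.add(subject)
                changed = True
    return {s for s, p, o in repository if p == "rdf:type" and o in classes}


def view_filter(selected_classes, repository):
    """Instances that satisfy every selected class restriction."""
    result = None
    for cls in selected_classes:
        matches = instances_of(cls, repository)
        result = matches if result is None else result & matches
    return result or set()


print(sorted(view_filter(["mi:Miniature"], REPOSITORY)))
print(sorted(view_filter(["mi:Miniature", "ex:FifteenthCentury"], REPOSITORY)))
```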
The software also supports topic-based navigation by providing semantic links between topics of interest, the creation of which is based on the collection domain ontology and the related metadata of the collection records. This means that the links also provide the user with an impression of the wider context and pragmatics of the objects in the museums’ collections.
From human users to software agents
As described in the Finnish Museums on the Semantic Web example, the RDF repository is a large semantic graph that consists of the shared ontology and metadata of the participating museums. Such a repository can be queried, and the results, a set of pointers to the relevant resources, can be accessed using Web browsers. The opportunities provided by a system like the one developed by the FMS initiative (e.g. topic-based navigation) are at present restricted primarily to human users.
The Semantic Web vision includes intelligent software agents which ‘understand’ semantic relationships between Web resources and seek relevant information as well as perform transactions for humans.14
This software would be capable of autonomous action, i.e. could run without direct human control or constant supervision, and ideally is very flexible in doing this. Characterisations of this flexibility include actions that are ‘reactive’, ‘proactive’ and ‘social’ (see below).
While the basic idea of agents is very intuitive and appealing, the actual theory is complex, the tools are immature, and the solutions are small and prototype-based. In fact, as a parallel distributed systems technology, agents belong to the most complex class of software technology.
However, this primer will conclude with a summary of what an intelligent software agent is and what such software would generally be capable of doing. This should also serve as an indication of how great the challenge for research and technological development is to make the full Semantic Web vision a reality.
Intelligent Software Agents
The following definitions are taken from Michael Wooldridge’s introduction to multiagent systems.15
Agent
‘An agent is a computer system capable of autonomous action in some environment.’
Intelligent agent
‘An intelligent agent is a computer system capable of flexible autonomous action in some environment.’
Flexible autonomous action
‘By flexible autonomous action we mean reactive, proactive, social.’
| Reactivity: ‘A reactive system is one that maintains an ongoing interaction with its environment, and responds to changes that occur in it (in time for the response to be useful).’
| Proactiveness: ‘An agent serves a purpose and therefore exhibits goal-directed behaviour, including the capacity to recognise opportunities for useful courses of action.’
| Social ability: ‘Social ability in agents is the ability to interact with other agents (and possibly humans) via some kind of agent communication language, and perhaps cooperate with others.’
Desirable further properties of agents are:
| Mobility: the ability to move around an electronic network.
| Rationality: an agent will act in such a way that it does not prevent itself from achieving its goals (as far as this is possible with a limited set of beliefs representing its world knowledge).
| Learning: an agent will improve its performance over time.
14 T. Berners-Lee, J. Hendler, O. Lassila: Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html
15 M. Wooldridge: An Introduction to Multiagent Systems. Chichester: Wiley 2002, and http://www.csc.liv.ac.uk/~mjw/pubs/imas
Resources
In the primer no references are made to the documents of the World Wide Web Consortium (W3C). All relevant W3C recommendations can be found at http://www.w3c.org
Of the wealth of introductory materials on XML and RDF available on the Web, the following in particular are useful to consult for further details:
http://www.w3schools.com/xml
http://www.w3schools.com/schema
http://www.w3.org/TR/rdf-primer
K. S. Candan, H. Liu, R. Suvarna: Resource Description Framework: Metadata and its Applications. In: SIGKDD Explorations, Vol. 3, No. 1 (2001), 6-19
Pierre-Antoine Champin: RDF Tutorial (2001), http://www710.univ-lyon1.fr/~champin/rdf-tutorial/rdf-tutorial.html
S. Decker, M. P. Mitra, S. Melnik: Framework for the Semantic Web: An RDF Tutorial, http://www.ida.liu.se/~asmpa/courses/sweb/rdf/rdf_tutorial.pdf
Graphic 3: Set-up of the Finnish Museums on the Semantic Web System
[Diagram: The Finnish Museums on the Semantic Web, overview of the system’s set-up. Labels, from the user down to the museums: User client (WWW browser; topic-based navigation, view-based filtering); Server (Ontogator software; RDF database; semantic graph: knowledge space of shared ontology and metadata); Web crawler; and, for each of the collection databases 1, 2 ... n: RDF instances (RDF Schema: semantic interoperability), Metadata Editor, XML repository (XML Schema: syntactic interoperability), collection database (relational schemas, DBMS).]
THE DARMSTADT FORUM
PARTICIPANTS
Wernher Behrendt, Salzburg Research, Austria
Wernher Behrendt is a Senior Researcher at the Sun Technology and Research Excellence Center at Salzburg Research (Austria), working on multimedia middleware and interoperation issues. He holds an MSc in Cognitive Science from Manchester University and has more than 10 years’ experience in near-to-market IT research. From 1989 to 1995 he was a Senior Research Associate in the Informatics Department at Rutherford Appleton Laboratory (UK), working on embedded knowledge based systems in distributed multimedia presentation systems. From 1995 to 1998 he was a Senior Research Associate at Cardiff University (UK), working on interoperation between heterogeneous information systems. Mr Behrendt has held courses in Computer Science and has worked on projects ranging from software engineering methods and quality assurance to legacy system reengineering using migration methods and distributed systems middleware.
E-mail: wernher.behrendt@salzburgresearch.at
Paolo Buonora, Archivio di Stato di Roma, Italy
Paolo Buonora holds a degree in Philosophy from the University of Rome ‘La Sapienza’ (1976). He worked in the Italian State Archive Administration from 1978, where he was first involved in editing the Guida Generale degli Archivi di Stato italiani. From 1986 he worked in the Soprintendenza archivistica per il Lazio, surveying audiovisual archives and municipal archives; from 1989 to 1991 he was at Perugia University, engaged in doctoral research in ‘Urban and rural history’; and from 1991 to 1994 he was again in the Soprintendenza archivistica per il Lazio. After 1994 he worked in the Archivio di Stato di Roma, where he was responsible for the photograph service and several working groups on informatics application in archival documentation. From 1997 until the present time he has planned and directed the Imago II project in the Archivio di Stato di Roma.
See httpwwwasrmarchivbeniculturaliitsidimagoIMAGOIIenhtml
Samuel Cruz-Lara, University of Nancy 2 and LORIA, France
Samuel Cruz-Lara obtained a Master’s degree in Computer Science in 1984 (University Henri Poincaré, Nancy 1) and a PhD degree in Computer Science in 1988 (National Polytechnic Institute of Lorraine). The central topic of his PhD thesis was the generation of integrated development environments by using attribute grammars.
He is currently Associate Professor at the University of Nancy 2 (Institute of Technology, Computer Science Department) and permanent Researcher at LORIA (Lorraine Laboratory for Research in Computer Science and its Applications – UMR 7503 – CNRS – INRIA – Universities of Nancy). He is a member of the ‘Language and Dialogue’ team and has conducted several research activities on distributed software architectures and textual linguistic resources management. He is currently working in the context of distributed architectures and multimedia resources management. Dr Cruz-Lara has participated in several projects, in particular CNRS/SILFIDE and MLIS-ELAN, and he is at present co-leader of the ‘Digital Museum’ project. This is a joint project between LORIA and the National Chi-Nan University. The ‘Digital Museum’ project is sponsored by the National Science Council of the Republic of China (Taiwan) and supported by INRIA (France).
Costis Dallas, Critical Publics SA, Greece
Costis Dallas is Chairman and Senior Researcher of Critical Publics (http://www.criticalpublics.com), a group of companies active in the field of strategic communications, creative design and technology. He is currently a Lecturer in the Department of Communication and Mass Media of Panteion University and has over 15 years of research and professional experience in hypermedia applications, human factor issues and cultural information systems. Dr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA, co-founder and Chair of the Multimedia Working Group of the International Council of Museums (CIDOC/ICOM), Head of Documentation and Systems of the Benaki Museum, General Director of the Foundation of the Hellenic World, Special Secretary of the Greek Ministry of Education in charge of libraries, archives and instructional technologies, and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: info@criticalpublics.com
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (http://www.nl.adlibsoft.com), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that he joined Databasix, producer of the ADLIB software. In 1991 Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was later renamed ADLIB Information Systems.
ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1,000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include:
CIMI Dublin Core test bed - ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database in addition to the ‘standard’ HTML/XML access.
SPECTRUM-XML - A project of the UK mda together with a team of software vendors to produce a schema for exchange of data which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software.
Internet Gelderse Musea (http://www.igem.nl) and Maritiem Digitaal (http://www.maritiemdigitaal.nl) - Two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.
E-mail: bert@nl.adlibsoft.com
Nicola Guarino, Italian National Research Council, Italy
Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991 and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS) and associate editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: Nicola.Guarino@ladseb.pd.cnr.it
Janneke van Kersen, Digital Heritage Association, The Netherlands
Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992 she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing, and more recently she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University.
Since October 1999 Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata, and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations such as museums, archives, archaeological organisations, monuments and special collections of libraries in a context-rich and XML-based environment. Interoperability from both a technical and an organisational perspective is therefore a major issue. The Association offers access to heritage information to the general public at http://www.cultuurwijzer.nl and customised access for the educational field at http://www.cultuurwijs.nl
E-mail: jannekevankersendennl
Marco Meli, EDW International, Italy
Marco Meli is CEO and co-founder of EDW International (http://www.edw-international.com), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia and VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meli@edw.it
Paul Miller, UKOLN, UK
Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, http://www.ukoln.ac.uk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK’s Further and Higher Education Funding Councils and Resource, the Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI) and the Metadata Working Group of the UK Government’s Office of the e-Envoy.
E-mail: P.Miller@ukoln.ac.uk
Frank Nack, CWI, The Netherlands
Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on ‘The Application of Video Semantics and Theme Representation for Automated Film Editing’ at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document, and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: Frank.Nack@cwi.nl
Franco Niccolucci, Florence University, Italy
Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference.
See also http://www.geog.port.ac.uk/hist-bound/people/niccolucci.htm
Seamus Ross, HATII, University of Glasgow, UK
Dr Seamus Ross is Director of Glasgow University's Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Access Network) (IST-2001-32706), a European Union funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously, he was Assistant Secretary for Information Technology at the British Academy, and before that worked for a company specialising in expert systems and software development, as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group's PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: S.Ross@hatii.arts.gla.ac.uk
Andrea Scotti, Institute & Museum for the History of Science, Italy
Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995 he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.
Since 1996 Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza and the National Central Library, and under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (010201-300703), funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies.
See http://cweb.inria.fr/Projects/Mesmuses
E-mail: scottian@galileo.imss.firenze.it
DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over the period of 30 months (03/2002-08/2004).
In order to encourage early take-up, DigiCULT produces seven Thematic Issues and three Technology Watch Reports, along with the newsletter DigiCULT.Info.
DigiCULT draws on the results of the strategic study 'Technological Landscapes for Tomorrow's Cultural Economy (DigiCULT)' that was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications) in 2000 and completed in 2001.
Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at http://www.digicult.info.
For further information on DigiCULT, please contact the team of the project co-ordinator:
Mr Guntram Geser, guntram.geser@salzburgresearch.at, Phone: +43-(0)662-2288-303
Mr John Pereira, john.pereira@salzburgresearch.at, Phone: +43-(0)662-2288-247
Salzburg Research Forschungsgesellschaft
Jakob-Haringer-Str. 5/III
A-5020 Salzburg, Austria
Phone: +43-(0)662-2288-200
Fax: +43-(0)662-2288-222
http://www.salzburgresearch.at
Project Partner:
HATII - Humanities Advanced Technology and Information Institute, University of Glasgow
http://www.hatii.arts.gla.ac.uk
Contact: Mr Seamus Ross, s.ross@hatii.arts.gla.ac.uk
The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom
DIGICULT PROJECT INFORMATION
DigiCULT Thematic Issue 1 - Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum, held in Barcelona on May 6th, 2002, in the context of the DLM-Conference 2002.
DigiCULT Thematic Issue 2 - Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum, held in Essen, Germany, on September 3rd, 2002, in the context of the AIIM Conference @ DMS EXPO.
DigiCULT Thematic Issue 3 - Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum, held on January 21st, 2003, at Fraunhofer IPSI, Darmstadt, Germany.
DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum on Learning Objects, which will take place at the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, on July 2nd, 2003.
IMPRINT
This Thematic Issue is a product of the DigiCULTProject (IST-2001-34898)
Authors:
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy
Images:
Images for this Thematic Issue have been provided by and are reproduced with kind permission of the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, Netherlands.
Graphics & Layout:
Jan Steindl, Salzburg Research
ISBN 3-902448-00-8
Printed in Austria
© 2003
IMAGES
Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I. Image on p. 5 from Fol. 264r, size 80x75; illuminator: Orosius Master a.o.
Bible Historiale, Paris, c. 1320-1340, Volume I. Image on p. 7 from Fol. 2r, size 45x50; illum.: 'Sub-Fauvel' Master
Historic Bible, Utrecht, c. 1430, Volume I. Images on pp. 8, 10, 14, 28 from Fol. 3r, 3v, 4v, 7v, size 55x85 to 60x85; illum.: Alexander Master o.a.
Psalter Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468. Images on pp. 11, 13: Wednesday Matins, Invitatorium, from Fol. 219r
Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350. Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron) from Fol. 104rb1, 106rb2, 124r, 111ra, size 40x55 to 50x55
Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335. Image on p. 21 from Fol. 4va1, size 45x55
Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460. Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41
Psalter, Normandy, c. 1180. Image on p. 26 from Fol. 3v, size 160x125
Breviary, Cambray(?), c. 1275-1300. Images on pp. 27, 29, 31, 33, 34, 35, 37 from Fol. 211r, size 205x135
© Koninklijke Bibliotheek, The Hague. Used with permission.
of the elements and data types in the schema has the prefix xs:, which identifies them as belonging to the vocabulary of the XML Schema language rather than the vocabulary of our (fictitious) organisation M-i.org.
(2b) Indicates that the elements defined by this schema come from our http://www.m-i.org/images namespace.
(2c) Is our default namespace.
(2d) Demands that any elements used by an XML document which were declared in this schema must be namespace qualified.
(3, 16) The parent element for the data typing of the image descriptions we provide at http://www.m-i.org/images. It is (4) defined as an xs:complexType, i.e. it contains child elements (6-12), which are (5, 13) surrounded by an xs:sequence element that defines an ordered sequence of these elements.
(6-12) The child elements, which in our example are simple types because they do not contain other elements. They define various elements of our XML documents, e.g. "image", to be of the data type xs:string.
(14) Furthermore, the Schema determines that for the element xs:element name="image" there is a required attribute image_id of the data type xs:string (for example, image5kb78d38i).
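The annotations above can be made concrete with a minimal sketch: a schema with the structure they describe, inspected with Python's standard library. Only the namespace, the xs:complexType and xs:sequence nesting, the xs:string typing and the required image_id attribute are taken from the text. The child element names type and creator are assumptions made for illustration, and the schema below is not the original listing.

```python
import xml.etree.ElementTree as ET

NS = {"xs": "http://www.w3.org/2001/XMLSchema"}

# Hypothetical schema text: the child element names are illustrative assumptions.
SCHEMA = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://www.m-i.org/images"
           xmlns="http://www.m-i.org/images"
           elementFormDefault="qualified">
  <xs:element name="image">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="type" type="xs:string"/>
        <xs:element name="creator" type="xs:string"/>
      </xs:sequence>
      <xs:attribute name="image_id" type="xs:string" use="required"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""


def describe_schema(text):
    """Summarise the parts of the schema that the annotations refer to."""
    root = ET.fromstring(text)
    parent = root.find("xs:element", NS)
    children = parent.findall("xs:complexType/xs:sequence/xs:element", NS)
    attributes = parent.findall("xs:complexType/xs:attribute", NS)
    return {
        "target_namespace": root.get("targetNamespace"),
        "qualified": root.get("elementFormDefault") == "qualified",
        "parent": parent.get("name"),
        "children": [(c.get("name"), c.get("type")) for c in children],
        "required_attributes": [
            a.get("name") for a in attributes if a.get("use") == "required"
        ],
    }


summary = describe_schema(SCHEMA)
```

The sketch only reads the schema. Validating documents against it would need an XML Schema processor, which the Python standard library does not include.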
Major Benefits of XML
In the context of the Semantic Web, XML provides an interoperable syntactical foundation upon which solutions to the issues of representing relationships and meaning can be built. We also want to highlight the many benefits of XML that are adding to its rapid uptake in the first place and might, in the longer term, be supportive in realising the Semantic Web vision on a broader scale.
XML is one of the most important standards developments in recent years. It is an international, universal, non-system and non-application specific data exchange standard. XML is international because it employs Unicode. This means that there is no restriction to the western alphabet: Arabic, Chinese, Greek, Hebrew, Thai etc. can be easily integrated.
XML is non-system specific because it is an open standard set by the World Wide Web Consortium (W3C). As such, there is no owner of XML. All the major software suppliers support it, and it can be used on any computing platform, from Windows and MacOS to Linux. This makes it easier for organisations to change systems or combine different systems.
XML is also non-application specific, i.e. it can be used in various applications such as data exchange, data harvesting, Web site management etc. XML is gaining ever-wider acceptance in many application domains, including in particular the cultural heritage community.
Bear in mind also that the major collection management software producers have implemented support for XML in their systems, enabling, for example, the integration of data from different collections and their combination over the Web.
XML allows for multi-channel publishing, i.e. with XML it is easy to produce different products or services from digital cultural heritage assets. Once the data are structured in XML, they can be displayed across a variety of media using an associated style sheet that contains the display information.
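As a rough illustration of multi-channel publishing, the sketch below renders one XML record for two channels. The two Python functions stand in for style sheets (in practice this role is usually played by XSLT or CSS), and the record with its element names is a hypothetical example, not data from the project.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical record; element names are assumptions for illustration.
RECORD = (
    "<image image_id='5kb78d38i'>"
    "<type>Miniature</type>"
    "<creator>Alexander Master</creator>"
    "</image>"
)


def to_html(record):
    """'Style sheet' for a Web page: a definition list of field names and values."""
    rows = "".join(
        f"<dt>{escape(child.tag)}</dt><dd>{escape(child.text or '')}</dd>"
        for child in record
    )
    return f"<dl>{rows}</dl>"


def to_text(record):
    """'Style sheet' for a plain-text channel: one 'field: value' line per element."""
    return "\n".join(f"{child.tag}: {child.text or ''}" for child in record)


record = ET.fromstring(RECORD)
html_view = to_html(record)
text_view = to_text(record)
```

The point of the design is that the record itself carries no display information, so adding a further channel means adding a further rendering function and leaving the data untouched.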
Finally, XML can be used to create new languages. For example, the Wireless Markup Language (WML), which is used to mark up Internet applications for handheld devices, is written in XML.
Ontological concepts
The goal of the FMS project is to make metadata of the museums' textiles collections semantically interoperable on the Web. In order to achieve such interoperability, an ontology is being designed that describes the common (lower-level) ontological concepts in this domain of knowledge.[4]
Ontology
In the Semantic Web architecture, the semantic relationships are not embedded but explicitly represented by an ontology, or rather an interrelated set of ontologies. In fact, the wide array of information residing on the Web and the perceived need to make it more machine-processable have acted as a strong impetus for the development of ontology languages.
Yet what is an ontology? An abstract definition of an ontology is that it describes a formal, shared conceptualisation of a particular domain of interest, for example, cultural heritage objects held in art museums. In particular, an ontology allows for constraining, expressing and analysing the intended meaning of the shared vocabulary of concepts and relations in a domain of knowledge.[5]
If these concepts and relations are formalised to a high degree, the domain has at hand a major building block for developing semantically aware information systems. With Semantic Web technologies, the domain ontology can be made available on the network, cross-referenced with upper-level and other domain ontologies,[6] and remote applications or intelligent software agents can refer to it when they interact to provide a certain information service.
[4] The available documents (in English) on the FMS initiative state that their ontology is being created using RDF Schema (RDFS). To develop a fully fledged ontology, advanced languages such as DAML+OIL or Web Ontology Language (OWL) would be required.
[5] For more elaborate and formal descriptions see Tom Gruber: What is an Ontology? (1995), http://www-ksl.stanford.edu/kst/what-is-an-ontology.html; Nicola Guarino: Ontology-Driven Conceptual Modelling, part 1-3 (2002), http://ontology.ip.rm.cnr.it/Tutorials
The degree of formalisation of concepts and their relations varies considerably between different domains of knowledge. At the lower end one finds lexicons and simple taxonomies (i.e. an ordered classification system where terms are related hierarchically). At the middle level one might place thesauri, i.e. controlled vocabularies that are structured to show relationships between terms and concepts and, for example, allow for retrieving them from a database. At the high end of formalisation of knowledge there are axiomatised logic theories. Such theories include rules to ensure the well-formedness and logical validity of statements expressed in the language of the scientific discipline.
In the cultural heritage sector, a powerful IT-supported example of a hierarchical classification system is Iconclass. This supports the documentation of images, in particular art historical images, by providing a systematic collection of 28,000 ready-made definitions of objects, persons, events, situations and abstract ideas that can be the subject of an image. The definitions consist of an alphanumeric classification code and its textual correlate.[7]
For example, for the image we have described in XML in graphic 1, the Iconclass definition is: 71A3421 Eve emerges from Adam's body. The Medieval Illustrated Manuscripts Website of the Koninklijke Bibliotheek, The Hague, has an Iconclass Browser in place [8] that provides the hierarchical path for this concept in the classification system.
An outstanding example of a controlled vocabulary is the Art & Architecture Thesaurus (AAT), one of the Getty Research Institute's Vocabulary Databases. It is a structured vocabulary of more than 125,000 terms and other information about concepts that are used for describing fine art, architecture, decorative arts, archival materials and material culture.[9]
The W3C's Resource Description Framework (RDF) Schema Specification 1.0, in a section on its scope, mentions concept navigation and states: 'Thesauri and library classification schemes are well known examples of hierarchical systems for representing subject taxonomies in terms of the relationships between named concepts. The RDF Schema specification provides sufficient resources for creating RDF models that represent the logical structure of thesauri (and other library classification systems).'[10]
Yet for realising a full-blown cultural heritage ontology for the Semantic Web there are currently limitations on both sides. On the one hand, hierarchical classification systems and structured vocabularies do not lend themselves easily to rich inter-linking of conceptual 'trees'.
A major step further in this direction is the CIDOC object-oriented Conceptual Reference Model (CRM).[11] This provides an ontology of 81 classes and 130 properties, which describes in a formal language concepts and relations relevant to the documentation of cultural heritage.
On the other hand, RDF Schema has limitations when it comes to expressing complex ontological relationships. New languages based on description logics are being developed. These include DAML+OIL and the upcoming Web Ontology Language (OWL), which are capable of fully describing ontologies.
Also worth highlighting is that tools for ontology building are proliferating at the present time. These ontology editors need to be carefully assessed, as their capabilities differ considerably.[12]
7 Bible
71 Old Testament
71A Genesis from the creation to the expulsion from paradise, and later years of Adam and Eve
71A3 creation of man, the Garden of Eden (Genesis 1:26-2)
71A34 creation of Eve
71A342 Eve is fashioned from Adam's rib
71A3421 Eve emerges from Adam's body
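In the hierarchy listed above, each broader concept's code is a prefix of the narrower one. A minimal sketch of how a browser could derive the hierarchical path from a code follows, assuming a plain notation like the one in this example. Notations that use Iconclass features such as bracketed text or keys would need a real parser, and the function name is invented for illustration.

```python
def iconclass_path(notation):
    """Return the codes from the broadest concept down to the notation itself.

    Simplifying assumption: every additional character opens one more level,
    which holds for plain notations such as '71A3421'.
    """
    return [notation[:end] for end in range(1, len(notation) + 1)]


# Textual correlates copied from the hierarchy shown in this primer.
LABELS = {
    "7": "Bible",
    "71": "Old Testament",
    "71A34": "creation of Eve",
    "71A342": "Eve is fashioned from Adam's rib",
    "71A3421": "Eve emerges from Adam's body",
}

path = iconclass_path("71A3421")
labelled_path = [(code, LABELS.get(code, "")) for code in path]
```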
[6] Upper-level ontologies describe the basic concepts and relationships invoked when information about any domain is expressed in natural language.
[7] For in-depth information see the official Iconclass Website: http://www.iconclass.nl
[8] Medieval Illustrated Manuscripts Website: http://www.kb.nl/kb/manuscripts/browser. The subject access system for the Website was conceived by Mnemosyne Partners, building on the Iconclass classification system and technologies. See the valuable information they provide at httpwwwmnemosyneorg businessmsstempexhtml
[9] A detailed description of the AAT is provided at http://www.getty.edu/research/tools/vocabulary/aat/about.html
[10] http://www.w3.org/TR/2000/CR-rdf-schema-20000327
[11] See the information box at the end of the Forum discussion and the sources mentioned in the Semantic Web Terms and Reading List.
[12] Michael Denny: Ontology Building: A Survey of Editing Tools (06-11-2002), http://www.xml.com/pub/a/2002/11/06/ontologies.html
Semantic Transformation 1: The RDF Data Model
Progressing towards semantic interoperability, the metadata in the XML documents are now transformed into RDF statements corresponding to the RDF data model. With these so-called RDF 'triples', the XML metadata elements are mapped to the RDF classes and properties which are defined by the RDF Schema of the FMS initiative (see section 3).
'XML is nothing more than a way to standardize data formats. This is not to underplay XML's importance. A data-format standard makes all of the more glamorous technologies possible, and RDF is the leading example of the benefit that comes once the data format has been standardized. Many proclaim that RDF is really the XML's killer app, and with good reason. Despite all this, RDF remains somewhat obscure. This is mainly because, at its core, RDF is very abstract, very dry and very academic.'
Uche Ogbuji: An introduction to RDF (2000), http://www-106.ibm.com/developerworks/library/w-rdf/?dwzone=xml
RDF Data Model
In order to make Web resources semantically interoperable, we need resources that provide machine-understandable information about themselves. In the Semantic Web architecture, these statements are built by using the Resource Description Framework (RDF).
RDF defines a data model for the statements describing typed relationships between uniquely identified resources. RDF distinguishes between:
| resources: familiar examples are a Web page, electronic document or digital image, but in RDF also entities that are not 'network retrievable', e.g. museums, curators or bound medieval manuscripts, can be resources;
| properties: these identify a specific aspect, characteristic, attribute or relation used to describe the resource;
| statements: these associate a value for a named property with the resource.
Hence, RDF provides a model for describing relationships between resources in terms of named properties and values. The RDF data model intrinsically supports only binary relations. Its base element is the 'triple', which takes the form of subject, predicate, object: a resource (the subject) is linked to another resource (the object) through an arc labelled with a third resource (the predicate). The semantics of a triple clearly depends on the property used as predicate.
A convenient way to visualise this is to draw nodes for subject and object and an arrow between them for the predicate (see graphic 2). In this labelled directed graph, subject and predicate (property) are Uniform Resource Identifiers (URIs), and the object is either a URI or a literal (which is drawn as a box). Everything in RDF can be represented by a graph with nodes and arcs, and the data model allows for using the same URI as a node and as an arc label. To represent RDF statements in a machine-processable way, RDF builds on XML. With RDF/XML, a specific XML markup language, RDF information can be represented and exchanged between machines.
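The data model described above can be sketched in a few lines of Python. This is an illustration of the subject, predicate, object structure only, not an RDF library. The '#' separator in the m-i.org namespace and the hasCreator statement are assumptions made for the example.

```python
from typing import NamedTuple, Union


class URI(str):
    """A resource identified by a URI (drawn as an oval node or an arc label)."""


class Literal(str):
    """A plain value such as a name (drawn as a box)."""


class Triple(NamedTuple):
    subject: URI
    predicate: URI
    obj: Union[URI, Literal]


# Namespaces: the RDF Schema namespace is the W3C one; the '#' separator
# for the fictitious m-i.org namespace is an assumption.
MI = "http://www.m-i.org/schemas/images#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"

graph = [
    Triple(URI(MI + "ColumnMiniature"), URI(RDFS + "subClassOf"), URI(MI + "Miniature")),
    Triple(URI(MI + "Miniature"), URI(RDFS + "subClassOf"), URI(MI + "Image")),
    # Hypothetical instance statement with a literal as object.
    Triple(URI(MI + "image5kb78d38i"), URI(MI + "hasCreator"), Literal("Alexander Master")),
]


def nodes(triples):
    """Subjects and objects form the nodes of the labelled directed graph."""
    found = set()
    for triple in triples:
        found.add(triple.subject)
        found.add(triple.obj)
    return found


def arc_labels(triples):
    """Predicates label the arcs between nodes."""
    return {triple.predicate for triple in triples}


literal_nodes = [node for node in nodes(graph) if isinstance(node, Literal)]
```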
Graphic 2: RDF Data Model
Subject: http://www.m-i.org/schemas/images#ColumnMiniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Miniature
Subject: http://www.m-i.org/schemas/images#Miniature
Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf
Object: http://www.m-i.org/schemas/images#Image
With these two triples we state that ColumnMiniature is a subclass of Miniature, and that Miniature is a subclass of Image.
The predicate in our statements is the rdfs:subClassOf property, which is predefined in the RDF Schema namespace http://www.w3.org/2000/01/rdf-schema#.
The classes Image, Miniature and ColumnMiniature would also need to be defined in our RDF Schema namespace http://www.m-i.org/schemas/images.
For details on how to use RDF Schema for defining your domain ontology see section 3. In a nutshell, RDF Schema is a 'higher-level' language which is itself defined using RDF.
Semantic Transformation 2: Creating and Validating the RDF Statements
In the mapping process, an editor tool (in the FMS case a tool developed by the project team, called Meedio) receives as input the XML documents and assists in transforming them into semantically valid RDF statements (instance descriptions). The tool serves as instance editor, which provides a convenient way of finding and selecting from the XML metadata elements correct instance values for a particular property. The editor tool also serves as semantic metadata validator. When the museum cataloguer saves the set of RDF statements corresponding to an XML document, the semantics of these statements are validated against the property constraints of the FMS ontology. The result of a successful mapping and validation process is a unique set of RDF triples, called the RDF card.
RDF Mapping Rules
When RDF is used to define the meaning of XML metadata elements, a set of mapping rules is created. A mapping rule is a template of RDF triples where XPath expressions are used to identify the actual element values. XPath is a language for addressing parts of an XML document and was designed for use in XML parsing software (XSLT, XPointer and others).
When applying such a rule to an XML document, the XPath expressions are instantiated with matching element values. If the rule matches, the RDF template evaluates to a set of RDF triples in which the XPath expressions are substituted by the corresponding values of the XML elements.
For example, by applying the template rule
<image5kb78d38itype mi:hasCreator image5kb78d38icreator>
to the XML document described in the section on XML, the following result would be obtained:
<Image Miniature mi:hasCreator 'Alexander Master'>
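The mechanism can be sketched as follows. This is not the FMS editor tool: the XML record, the element names and the helper names are assumptions, and Python's ElementTree supports only a subset of XPath. The sketch shows a template whose XPath slots are filled from a document, yielding a triple when every slot matches and nothing otherwise.

```python
import xml.etree.ElementTree as ET


class XPath(str):
    """Marks a template slot whose value must be looked up in the XML document."""


def apply_rule(template, element):
    """Instantiate one (subject, predicate, object) template against an element.

    Returns a list with one triple, or an empty list if any XPath slot
    finds no matching element (the rule does not match).
    """
    triple = []
    for slot in template:
        if isinstance(slot, XPath):
            value = element.findtext(slot)
            if value is None:
                return []
            triple.append(value)
        else:
            triple.append(slot)
    return [tuple(triple)]


# Hypothetical record standing in for the XML document of the example.
DOCUMENT = ET.fromstring(
    "<image image_id='5kb78d38i'>"
    "<type>Miniature</type>"
    "<creator>Alexander Master</creator>"
    "</image>"
)

RULE = (XPath("type"), "mi:hasCreator", XPath("creator"))
result = apply_rule(RULE, DOCUMENT)
unmatched = apply_rule((XPath("type"), "mi:hasDate", XPath("date")), DOCUMENT)
```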
mi:hasCreator is an example of an RDF property. Such properties are explained in section 3, RDF Schema.
Note: Due to the limited space permitted, we do not address issues of term mapping. This is an important aspect of the mapping and validation process carried out in the FMS project. Working with metadata from different museums, they need to deal with partly different terminologies. Their technical solution to synonymous terms (i.e. different terms referring to the same concepts) is to attach synonym sets to the FMS ontology classes. With situations where polysemous terms occur (i.e. the same terms refer to different concepts), the editor tool cannot cope and the cataloguer needs to select the correct interpretation.
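A small sketch may help to picture the synonym-set approach and why polysemous terms need a human decision. The class names, the terms and the function names below are invented for illustration and are not taken from the FMS ontology.

```python
# Hypothetical synonym sets attached to ontology classes.
SYNONYM_SETS = {
    "mi:Miniature": {"miniature", "illumination"},
    "mi:Initial": {"initial", "decorated initial", "illumination"},
}


def classes_for(term, synonym_sets=SYNONYM_SETS):
    """Return every class whose synonym set contains the term."""
    wanted = term.strip().lower()
    return sorted(cls for cls, terms in synonym_sets.items() if wanted in terms)


def resolve(term, choose=None):
    """Map a term to a class.

    A term found in exactly one synonym set is resolved automatically.
    A polysemous term (several candidate classes) is passed to `choose`,
    which stands in for the cataloguer's selection; without it, or if the
    term is unknown, None is returned.
    """
    candidates = classes_for(term)
    if len(candidates) == 1:
        return candidates[0]
    if candidates and choose is not None:
        return choose(candidates)
    return None
```

For instance, resolve("Miniature") settles on a class by itself, whereas resolve("illumination") stays unresolved until a selection function is supplied.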
Semantic Transformation 3: The RDF Schema (RDFS)
The shared ontology for the textiles domain is created by using Resource Description Framework Schema (RDFS). An RDF Schema is a tool for indicating the classes of resources one wants to describe, as well as for defining the properties used to describe those resources. Furthermore, class/sub-class relationships and property/sub-property relationships can be defined. The museums are mapping their metadata to the classes and properties defined by the RDF Schema of the FMS initiative. Thereby they are making the meaning of the metadata explicit and representing them in a harmonised, uniform way.
RDF Schema
In section Semantic Transformation 1, we have described the data model provided by RDF for expressing statements about Web resources. But we also need a vocabulary for the RDF statements, namely classes and properties defined with RDF Schema (RDFS).
In brief, the RDF Schema mechanism provides a pre-defined vocabulary, a basic type system that can be used in creating domain-specific schemas. Its role is to allow for declaring metadata properties (e.g. for 'type', 'subject' or 'creator'), to define the classes of resources they may be used with, to restrict possible combinations and to detect violations of those restrictions.
Defining classes
With RDF Schema (RDFS), Web resources can be defined as instances of one or more classes. In addition, classes can be organised in a hierarchical fashion. As we hold a collection of digital images drawn from illustrated medieval manuscripts, we first need to define a class of things that are images. In RDF Schema, a class is any resource having an rdf:type property whose value is the RDFS-defined resource rdfs:Class.
So, using the basic RDF data model, we define:
mi:Image [resource] rdf:type [property] rdfs:Class [value]
The self-defined prefix mi: (for medieval images) stands for the URI reference of our RDF Schema namespace, http://www.m-i.org/schemas/images.
In our image collection we have various special kinds of digitised images, such as column miniatures, decorated initials, schematic drawings etc. To distinguish, for example, the miniatures, first we need to define a general class Miniature and subclasses of miniatures, e.g. a subclass ColumnMiniature:
mi:Miniature rdf:type rdfs:Class
mi:ColumnMiniature rdf:type rdfs:Class
Secondly, we need to define that mi:ColumnMiniature is a subclass of mi:Miniature and that mi:Miniature is a subclass of mi:Image, for which we use the predefined rdfs:subClassOf property:
mi:Miniature rdfs:subClassOf mi:Image
mi:ColumnMiniature rdfs:subClassOf mi:Miniature
As the rdfs:subClassOf property is transitive, this means that mi:ColumnMiniature is also implicitly a subclass of mi:Image.
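The effect of transitivity can be sketched by following the subclass statements upwards until no new class turns up. The dictionary below holds the two subclass statements of this example; the function name is an assumption made for illustration.

```python
# Direct rdfs:subClassOf statements from the example: class -> direct superclasses.
SUB_CLASS_OF = {
    "mi:ColumnMiniature": {"mi:Miniature"},
    "mi:Miniature": {"mi:Image"},
}


def superclasses(cls, sub_class_of=SUB_CLASS_OF):
    """Return all direct and implied superclasses of a class."""
    found = set()
    pending = [cls]
    while pending:
        current = pending.pop()
        for parent in sub_class_of.get(current, ()):
            if parent not in found:
                found.add(parent)
                pending.append(parent)
    return found


implied = superclasses("mi:ColumnMiniature") - SUB_CLASS_OF["mi:ColumnMiniature"]
```

Here implied holds the one relationship that was never stated directly: mi:ColumnMiniature is a subclass of mi:Image.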
Graphic 2 visualises this with the nodes and arcs of the basic RDF data model.
Defining properties
In order to make the meaning of our metadata (i.e. 'type') explicit, we need to be capable of declaring specific properties that characterise the classes of things we hold at http://www.m-i.org, e.g. digital images of medieval column miniatures.
Basically, RDF Schema defines properties in terms of the classes of resources to which they apply. This is the role of the rdfs:domain and rdfs:range mechanisms.
rdfs:range
The range constraint defines the class or set of classes whose instances can be values of a particular property. If we want to define the property mi:hasType, we must describe this resource (which we locate at http://www.m-i.org/schemas/images) with an rdf:type property whose value is rdf:Property:
mi:hasType [resource] rdf:type [property] rdf:Property [value]
The following RDF statements indicate that mi:ColumnMiniature is a class, mi:hasType is a property, and RDF statements using the mi:hasType property have instances of mi:ColumnMiniature as values:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:range mi:ColumnMiniature
rdfs:domain
The domain constraint restricts the set of classes whose instances may have a particular property attached to them. If we want to indicate that the property mi:hasType applies to instances of class mi:ColumnMiniature, we would write:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:domain mi:ColumnMiniature
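Read as constraints, in the way this primer presents them, domain and range can be checked mechanically. The sketch below uses the mi:hasType declarations from the text; the instance names, their types and the function name are hypothetical, and subclass relationships are deliberately ignored to keep the example short.

```python
# Property declarations taken from the statements above.
SCHEMA = {
    "mi:hasType": {"domain": "mi:ColumnMiniature", "range": "mi:ColumnMiniature"},
}

# Hypothetical instances and the class each one belongs to.
TYPES = {
    "mi:image1": "mi:ColumnMiniature",
    "mi:image2": "mi:ColumnMiniature",
    "mi:initial7": "mi:DecoratedInitial",
}


def violations(statement, schema=SCHEMA, types=TYPES):
    """Return a list of constraint violations for one (subject, property, object)."""
    subject, prop, obj = statement
    rule = schema.get(prop)
    if rule is None:
        return [f"{prop} is not a declared property"]
    problems = []
    if types.get(subject) != rule["domain"]:
        problems.append(f"{subject} is not an instance of {rule['domain']} (domain)")
    if types.get(obj) != rule["range"]:
        problems.append(f"{obj} is not an instance of {rule['range']} (range)")
    return problems
```

A statement such as ("mi:initial7", "mi:hasType", "mi:image2") is reported with one domain violation, while a statement between the two column miniatures passes.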
Benefits of RDF
In a SearchWebServices.com definition of RDF, some benefits of RDF are mentioned:
| 'By providing a consistent framework, RDF will encourage the providing of metadata about Internet resources.
| Because RDF will include a standard syntax for describing and querying data, software that exploits metadata will be easier and faster to produce.
| The standard syntax and query capability will allow applications to exchange information more easily.
| Searchers will get more precise results from searching, based on metadata rather than on indexes derived from full text gathering.
| Intelligent software agents will have more precise data to work with.'[13]
This is a well-crafted listing of RDF benefits, from provision and exchange of better metadata to agents working with them, hopefully for the benefit of humans. But, as explicitly stated by SearchWebServices.com, these are only potential benefits, i.e. they depend on the level of actual uptake of RDF.
[13] whatis.com / searchWebServices.com Definitions - Resource Description Framework, http://searchwebservices.techtarget.com/sDefinition/0,,sid26_gci213545,00.html (last updated July 27, 2001)
Generating and Using the Knowledge Space
The RDF cards represent the original XML documents at the semantic level. The union of such RDF cards constitutes a knowledge base which is a harmonised semantic representation of the underlying heterogeneous databases.
However, so far the RDF instance descriptions have not left the museum. The museum has complete control of the information it wants to publish, and it does not need to allow the FMS system access to its internal database system. The RDF data are placed in a public directory on the museum's WWW server.
The Web crawler of the FMS system harvests the instance descriptions from the different museums, and the system combines them into an RDF repository. This repository is a large semantic graph that consists of the shared ontology and metadata.
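Because an RDF card is a set of triples, combining harvested cards amounts to a set union with the shared ontology. The following sketch illustrates that idea only; the museum data and names are invented, and the actual FMS crawler and repository are not shown in this primer.

```python
# Shared ontology statements (from the example schema).
ONTOLOGY = {
    ("mi:ColumnMiniature", "rdfs:subClassOf", "mi:Miniature"),
    ("mi:Miniature", "rdfs:subClassOf", "mi:Image"),
}

# Hypothetical RDF cards, as published by two museums.
MUSEUM_A_CARDS = [
    {("a:img1", "rdf:type", "mi:ColumnMiniature"),
     ("a:img1", "mi:hasCreator", "Alexander Master")},
]
MUSEUM_B_CARDS = [
    {("b:img9", "rdf:type", "mi:Miniature")},
    # The same statement harvested twice is stored only once.
    {("b:img9", "rdf:type", "mi:Miniature"),
     ("b:img9", "mi:hasCreator", "Orosius Master")},
]


def build_repository(ontology, *harvests):
    """Union of the shared ontology and all harvested RDF cards."""
    repository = set(ontology)
    for cards in harvests:
        for card in cards:
            repository |= set(card)
    return repository


repository = build_repository(ONTOLOGY, MUSEUM_A_CARDS, MUSEUM_B_CARDS)
```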
How does a user now search and navigate in this knowledge space? In the FMS system, this is implemented by a server-side software called Ontogator. Based on the semantic graph, this software dynamically generates semantic linkages for the user's Web browser.
One way of using the FMS system is view-based filtering. The user can select classes of resources from the ontology, and the system finds the instances that match the selected class restrictions. By constraining classes (views) further, the collection instance data searched for are eventually found.
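View-based filtering can be pictured as set operations on the semantic graph: each selected class yields the instances of that class and of its subclasses, and each further selection narrows the result. The sketch below is a simplified illustration with invented data and function names, not the Ontogator implementation.

```python
TYPE = "rdf:type"
SUB = "rdfs:subClassOf"

# Hypothetical repository content; mi:FrenchWork is an invented second view.
TRIPLES = {
    ("mi:ColumnMiniature", SUB, "mi:Miniature"),
    ("mi:Miniature", SUB, "mi:Image"),
    ("img1", TYPE, "mi:ColumnMiniature"),
    ("img1", TYPE, "mi:FrenchWork"),
    ("img2", TYPE, "mi:Miniature"),
    ("img3", TYPE, "mi:Image"),
}


def instances_of(cls, triples):
    """Instances typed with the class or with any of its (indirect) subclasses."""
    classes = {cls}
    changed = True
    while changed:
        changed = False
        for subject, predicate, obj in triples:
            if predicate == SUB and obj in classes and subject not in classes:
                classes.add(subject)
                changed = True
    return {s for s, p, o in triples if p == TYPE and o in classes}


def view_filter(selected_classes, triples):
    """Instances matching every selected class restriction."""
    result = None
    for cls in selected_classes:
        found = instances_of(cls, triples)
        result = found if result is None else result & found
    return result or set()
```

Selecting mi:Miniature alone returns img1 and img2; adding mi:FrenchWork as a second view narrows the result to img1.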
The software also supports topic-based navigation by providing semantic links between topics of interest, the creation of which is based on the collection domain ontology and the related metadata of the collection records. This means that the links also provide the user with an impression of the wider context and pragmatics of the objects in the museums' collections.
From human users to software agents
As described in the Finnish Museums on the Semantic Web example, the RDF repository is a large semantic graph that consists of the shared ontology and metadata of the participating museums. Such a repository can be queried, and the results, a set of pointers to the relevant resources, can be accessed using Web browsers. The opportunities provided by a system like the one developed by the FMS initiative (e.g. topic-based navigation) are at present restricted primarily to human users.
The Semantic Web vision includes intelligent software agents which 'understand' semantic relationships between Web resources and seek relevant information as well as perform transactions for humans.[14]
This software would be capable of autonomous action, i.e. could run without direct human control or constant supervision, and ideally is very flexible in doing this. Characterisations of this flexibility include actions that are 'reactive', 'proactive' and 'social' (see below).
While the basic idea of agents is very intuitive and appealing, the actual theory is complex, the tools are immature, and the solutions small and prototype-based. In fact, as a parallel distributed systems technology, agents belong to the most complex class of software technology.
However, this primer will conclude with a summary of what an intelligent software agent is and what such a software would generally be capable of doing. This should also serve as an indication of how great the challenge for research and technological development is to make the full Semantic Web vision a reality.
Intelligent Software Agents
The following definitions are taken from Michael Wooldridge's introduction to multiagent systems.15
Agent
'An agent is a computer system capable of autonomous action in some environment.'
Intelligent agent
'An intelligent agent is a computer system capable of flexible autonomous action in some environment.'
Flexible autonomous action
'By flexible autonomous action we mean reactive, proactive, social.'
| Reactivity: 'A reactive system is one that maintains an ongoing interaction with its environment, and responds to changes that occur in it (in time for the response to be useful).'
| Proactiveness: 'An agent serves a purpose and therefore exhibits goal-directed behaviour, including the capacity to recognise opportunities for useful courses of action.'
| Social ability: 'Social ability in agents is the ability to interact with other agents (and possibly humans) via some kind of agent communication language, and perhaps cooperate with others.'
Desirable further properties of agents are:
| Mobility: the ability to move around an electronic network.
| Rationality: an agent will act in such a way that it does not prevent itself from achieving its goals (as far as this is possible with a limited set of beliefs representing its world knowledge).
| Learning: an agent will improve its performance over time.
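To make the three characteristics of flexible autonomous action more tangible, here is a deliberately small sketch. It is an illustration only, not taken from Wooldridge: the class `Agent`, its methods and the message format are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    name: str
    goal: str                                   # topic the agent is looking for
    found: list = field(default_factory=list)   # resources matching the goal
    outbox: list = field(default_factory=list)  # messages sent to other agents

    def perceive(self, event):
        """Reactivity: respond to a change in the environment as it occurs."""
        if self.goal in event["topics"]:
            self.found.append(event["resource"])

    def step(self, peers):
        """Proactiveness and social ability: while the goal is unmet, ask peers."""
        if not self.found:
            for peer in peers:
                self.outbox.append((peer.name, "query:" + self.goal))


seeker = Agent("seeker", "miniature")
helper = Agent("helper", "psalter")
seeker.step([helper])  # goal unmet, so the agent takes the initiative
seeker.perceive({"resource": "obj-7", "topics": ["miniature", "bible"]})
seeker.step([helper])  # goal met, so no further queries are sent
```

A real agent would additionally need an agent communication language, a model of its beliefs and some form of planning, which is where the complexity mentioned above comes from.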
14 T. Berners-Lee, J. Hendler, O. Lassila, Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html
15 M. Wooldridge: An Introduction to Multiagent Systems. Chichester: Wiley, 2002, and http://www.csc.liv.ac.uk/~mjw/pubs/imas/
Resources
In the primer, no references are made to the documents of the World Wide Web Consortium (W3C). All relevant W3C recommendations can be found at http://www.w3.org
Of the wealth of introductory materials on XML and RDF available on the Web, the following in particular are useful to consult for further details:
http://www.w3schools.com/xml/
http://www.w3schools.com/schema/
http://www.w3.org/TR/rdf-primer/
K. S. Candan, H. Liu, R. Suvarna: Resource Description Framework: Metadata and its Applications. In: SIGKDD Explorations, Vol. 3, No. 1 (2001), 6-19
Pierre-Antoine Champin: RDF Tutorial (2001), http://www710.univ-lyon1.fr/~champin/rdf-tutorial/rdf-tutorial.html
S. Decker, M. P. Mitra, S. Melnik: Framework for the Semantic Web: An RDF Tutorial, httpwwwidaliuse~asmpacoursesswebrdfrdf_tutorialpdf
The Finnish Museums on the Semantic Web: Overview of the System's Set-up
[Figure. Labelled components: User client (WWW browser; topic-based navigation, view-based filtering); Server (Ontogator software, RDF database); Semantic graph (knowledge space of shared ontology and metadata); RDF Schema (semantic interoperability); XML Schema (syntactic interoperability); Relational schemas (DBMS); Web crawler; Metadata Editor; XML repositories; RDF instances; Collection databases 1, 2, ... n]
Graphic 3: Set-up of the Finnish Museums on the Semantic Web System
THE DARMSTADT FORUM PARTICIPANTS

Wernher Behrendt, Salzburg Research, Austria
Wernher Behrendt is a Senior Researcher at the Sun Technology and Research Excellence Center at Salzburg Research (Austria), working on multimedia middleware and interoperation issues. He holds an MSc in Cognitive Science from Manchester University and has more than 10 years' experience in near-to-market IT research. From 1989 to 1995 he was a Senior Research Associate in the Informatics Department at Rutherford Appleton Laboratory (UK), working on embedded knowledge-based systems in distributed multimedia presentation systems. From 1995 to 1998 he was a Senior Research Associate at Cardiff University (UK), working on interoperation between heterogeneous information systems. Mr Behrendt has held courses in Computer Science and has worked on projects ranging from software engineering methods and quality assurance to legacy system reengineering using migration methods and distributed systems middleware.
E-mail: wernherbehrendtsalzburgresearchat

Paolo Buonora, Archivio di Stato di Roma, Italy
Paolo Buonora holds a degree in Philosophy from the University of Rome 'La Sapienza' (1976). He worked in the Italian State Archive Administration from 1978, where he was first involved in editing the Guida Generale degli Archivi di Stato italiani. From 1986 he worked in the Soprintendenza archivistica per il Lazio, surveying audiovisual archives and municipal archives; from 1989 to 1991 he was at Perugia University, engaged in doctoral research in 'Urban and rural history'; and from 1991 to 1994 he was again in the Soprintendenza archivistica per il Lazio. After 1994 he worked in the Archivio di Stato di Roma, where he was responsible for the photograph service and several working groups on informatics applications in archival documentation. From 1997 until the present time he has planned and directed the Imago II project in the Archivio di Stato di Roma.
See httpwwwasrmarchivbeniculturaliitsidimagoIMAGOIIenhtml

Samuel Cruz-Lara, University of Nancy 2 and LORIA, France
Samuel Cruz-Lara obtained a Master's degree in Computer Science in 1984 (University Henri Poincaré, Nancy 1) and a PhD degree in Computer Science in 1988 (National Polytechnic Institute of Lorraine). The central topic of his PhD thesis was the generation of integrated development environments by using attribute grammars.
He is currently Associate Professor at the University of Nancy 2 (Institute of Technology, Computer Science Department) and permanent Researcher at LORIA (Lorraine Laboratory for Research in Computer Science and its Applications - UMR 7503 - CNRS - INRIA - Universities of Nancy). He is a member of the 'Language and Dialogue' team and has conducted several research activities on distributed software architectures and textual linguistic resources management. He is currently working in the context of distributed architectures and multimedia resources management. Dr Cruz-Lara has participated in several projects, in particular CNRS/SILFIDE and MLIS-ELAN, and he is at present co-leader of the 'Digital Museum' project. This is a joint project between LORIA and the National Chi-Nan University. The 'Digital Museum' project is sponsored by the National Science Council of the Republic of China (Taiwan) and supported by INRIA (France).

Costis Dallas, Critical Publics SA, Greece
Costis Dallas is Chairman and Senior Researcher of Critical Publics (http://www.criticalpublics.com), a group of companies active in the field of strategic communications, creative design and technology. He is currently a Lecturer in the Department of Communication and Mass Media of Panteion University and has over 15 years of research and professional experience in hypermedia applications, human factor issues and cultural information systems. Dr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA, co-founder and Chair of the Multimedia Working Group of the International Council of Museums (CIDOC/ICOM), Head of Documentation and Systems of the Benaki Museum, General Director of the Foundation of the Hellenic World, Special Secretary of the Greek Ministry of Education in charge of libraries, archives and instructional technologies, and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: infocriticalpublicscom

Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (http://www.nl.adlibsoft.com), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that he joined Databasix, producer of the ADLIB software. In 1991 Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was later renamed ADLIB Information Systems.
ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include:
CIMI Dublin Core test bed - ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database, in addition to the 'standard' HTML/XML access.
SPECTRUM-XML - A project of the UK mda, together with a team of software vendors, to produce a schema for exchange of data which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software.
Internet Gelderse Musea (http://www.igem.nl) and Maritiem Digitaal (http://www.maritiemdigitaal.nl) - Two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.
E-mail: bertnladlibsoftcom

Nicola Guarino, Italian National Research Council, Italy
Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991 and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS) and associate editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: NicolaGuarinoladsebpdcnrit

Janneke van Kersen, Digital Heritage Association, The Netherlands
Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992 she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing, and more recently she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University.
Since October 1999 van Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata, and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations such as museums, archives, archaeological organisations, monuments and special collections of libraries in a context-rich and XML-based environment. Interoperability, from both a technical and an organisational perspective, is therefore a major issue. The Association offers access to heritage information to the general public at http://www.cultuurwijzer.nl and customised access for the educational field at http://www.cultuurwijs.nl
E-mail: jannekevankersendennl

Marco Meli, EDW International, Italy
Marco Meli is CEO and co-founder of EDW International (http://www.edw-international.com), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia and VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meliedwit

Paul Miller, UKOLN, UK
Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, http://www.ukoln.ac.uk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK's Further and Higher Education Funding Councils and by Resource: The Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI) and the Metadata Working Group of the UK Government's Office of the e-Envoy.
E-mail: PMillerukolnacuk

Frank Nack, CWI, The Netherlands
Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on 'The Application of Video Semantics and Theme Representation for Automated Film Editing' at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling, and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document, and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: FrankNackcwinl

Franco Niccolucci, Florence University, Italy
Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference.
See also httpwwwgeogportacukhist-boundpeopleniccoluccihtm

Seamus Ross, HATII, University of Glasgow, UK
Dr Seamus Ross is Director of Glasgow University's Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Access Network) (IST-2001-32706), a European Union funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously he was Assistant Secretary for Information Technology at the British Academy, and before that worked for a company specialising in expert systems and software development, as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group's PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: SRosshatiiartsglaacuk

Andrea Scotti, Institute & Museum for the History of Science, Italy
Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995 he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.
Since 1996 Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza and the National Central Library, and under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (010201-300703), funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies.
See httpcwebinriafrProjectsMesmuses
E-mail: scottiangalileoimssfirenzeit

DIGICULT PROJECT INFORMATION

DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over a period of 30 months (03/2002-08/2004).
In order to encourage early take-up, DigiCULT produces seven Thematic Issues and three Technology Watch Reports, along with the newsletter DigiCULT.Info.
DigiCULT draws on the results of the strategic study 'Technological Landscapes for Tomorrow's Cultural Economy (DigiCULT)', which was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications), in 2000 and completed in 2001.
Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at http://www.digicult.info
For further information on DigiCULT please contact the team of the project co-ordinator:
Mr Guntram Geser, guntramgesersalzburgresearchat, Phone: +43-(0)662-2288-303
Mr John Pereira, johnpereirasalzburgresearchat, Phone: +43-(0)662-2288-247
Salzburg Research Forschungsgesellschaft
Jakob-Haringer-Str. 5/III
A - 5020 Salzburg, Austria
Phone: +43-(0)662-2288-200
Fax: +43-(0)662-2288-222
http://www.salzburgresearch.at
Project Partner:
HATII - Humanities Advanced Technology and Information Institute, University of Glasgow
http://www.hatii.arts.gla.ac.uk
Contact: Mr Seamus Ross, srosshatiiartsglaacuk
The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom
DigiCULT Thematic Issue 1 - Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum, held in Barcelona on May 6th, 2002, in the context of the DLM-Conference 2002.
DigiCULT Thematic Issue 2 - Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum, held in Essen, Germany, on September 3rd, 2002, in the context of the AIIM Conference @ DMS EXPO.
DigiCULT Thematic Issue 3 - Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum, held on January 21st, 2003, at Fraunhofer IPSI, Darmstadt, Germany.
DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum, on Learning Objects, which will take place at the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, on July 2nd, 2003.
IMPRINT
This Thematic Issue is a product of the DigiCULT Project (IST-2001-34898).
Authors:
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy
Images:
Images for this Thematic Issue have been provided by, and are reproduced with kind permission of, the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, Netherlands.
Graphics & Layout:
Jan Steindl, Salzburg Research
ISBN 3-902448-00-8
Printed in Austria
© 2003
IMAGES
Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I. Image on p. 5 from Fol. 264r; size 80x75; illuminator: Orosius Master a.o.
Bible Historiale, Paris, c. 1320-1340, Volume I. Image on p. 7 from Fol. 2r; size 45x50; illum.: 'Sub-Fauvel' Master
Historic Bible, Utrecht, c. 1430, Volume I. Images on pp. 8, 10, 14, 28 from Fol. 3r, 3v, 4v, 7v; size 55x85 to 60x85; illum.: Alexander Master a.o.
Psalter, Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468. Images on pp. 11, 13: Wednesday Matins, Invitatorium, from Fol. 219r
Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350. Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron) from Fol. 104rb1, 106rb2, 124r, 111ra; size 40x55 to 50x55
Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335. Image on p. 21 from Fol. 4va1; size 45x55
Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460. Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41
Psalter, Normandy, c. 1180. Image on p. 26 from Fol. 3v; size 160x125
Breviary, Cambray(?), c. 1275-1300. Images on pp. 27, 29, 31, 33, 34, 35, 37 from Fol. 211r; size 205x135
© Koninklijke Bibliotheek, The Hague. Used with permission.
other domain ontologies,6 and remote applications or intelligent software agents can refer to it when they interact to provide a certain information service.
The degree of formalisation of concepts and their relations varies considerably between different domains of knowledge. At the lower end one finds lexicons and simple taxonomies (i.e. ordered classification systems in which terms are related hierarchically). At the middle level one might place thesauri, i.e. controlled vocabularies that are structured to show relationships between terms and concepts and that, for example, allow for retrieving them from a database. At the high end of formalisation of knowledge there are axiomatised logic theories. Such theories include rules to ensure the well-formedness and logical validity of statements expressed in the language of the scientific discipline.
In the cultural heritage sector, a powerful IT-supported example of a hierarchical classification system is Iconclass. This supports the documentation of images, in particular art historical images, by providing a systematic collection of 28,000 ready-made definitions of objects, persons, events, situations and abstract ideas that can be the subject of an image. The definitions consist of an alphanumeric classification code and its textual correlate.7
For example, for the image we have described in XML in graphic 1, the Iconclass definition is 71A3421: Eve emerges from Adam's body. The Medieval Illustrated Manuscripts Website of the Koninklijke Bibliotheek, The Hague, has an Iconclass Browser in place8 that provides the hierarchical path for this concept in the classification system.
An outstanding example of a controlled vocabulary is the Art & Architecture Thesaurus (AAT), one of the Getty Research Institute's Vocabulary Databases. It is a structured vocabulary of more than 125,000 terms and other information about concepts that are used for describing fine art, architecture, decorative arts, archival materials and material culture.9
The W3C's Resource Description Framework (RDF) Schema Specification 1.0, in a section on its scope, mentions concept navigation and states: 'Thesauri and library classification schemes are well known examples of hierarchical systems for representing subject taxonomies in terms of the relationships between named concepts. The RDF Schema specification provides sufficient resources for creating RDF models that represent the logical structure of thesauri (and other library classification systems).'10
Yet for realising a full-blown cultural heritage ontology for the Semantic Web there are currently limitations on both sides. On the one hand, hierarchical classification systems and structured vocabularies do not lend themselves easily to rich inter-linking of conceptual 'trees'.
A major step further in this direction is the CIDOC object-oriented Conceptual Reference Model (CRM).11 This provides an ontology of 81 classes and 130 properties, which describes in a formal language the concepts and relations relevant to the documentation of cultural heritage.
On the other hand, RDF Schema has limitations when it comes to expressing complex ontological relationships. New languages based on description logics are being developed. These include DAML+OIL and the upcoming Web Ontology Language (OWL), which are capable of fully describing ontologies.
Also worth highlighting is that tools for ontology building are proliferating at the present time. These ontology editors need to be carefully assessed, as their capabilities differ considerably.12
7 Bible
71 Old Testament
71A Genesis from the creation to the expulsion from paradise, and later years of Adam and Eve
71A3 creation of man; the Garden of Eden (Genesis 1:26-2)
71A34 creation of Eve
71A342 Eve is fashioned from Adam's rib
71A3421 Eve emerges from Adam's body
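
The hierarchical path listed above can be read directly off the notation: each broader concept is a prefix of the narrower one. The following sketch illustrates this under a simplifying assumption, namely that the notation is a plain code without bracketed text or keys. The function name `iconclass_path` is hypothetical and not part of any Iconclass tool.

```python
# Labels copied from the path shown above.
LABELS = {
    "7": "Bible",
    "71": "Old Testament",
    "71A34": "creation of Eve",
    "71A342": "Eve is fashioned from Adam's rib",
    "71A3421": "Eve emerges from Adam's body",
}


def iconclass_path(notation):
    """Return the chain of broader notations, from the top division down."""
    return [notation[:i] for i in range(1, len(notation) + 1)]


path = iconclass_path("71A3421")
for code in path:
    # Codes without a label in the small table above are printed bare.
    print(code, LABELS.get(code, ""))
```

Because the hierarchy is encoded in the notation itself, a browser can offer broader concepts for any code without a separate lookup of parent-child relations.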
6 Upper-level ontologies describe the basic concepts and relationships invoked when information about any domain is expressed in natural language.
7 For in-depth information see the official Iconclass Website: http://www.iconclass.nl
8 Medieval Illustrated Manuscripts Website: httpwwwkbnlkbmanuscripts browser. The subject access system for the Website was conceived by Mnemosyne Partners, building on the Iconclass classification system and technologies. See the valuable information they provide at httpwwwmnemosyneorgbusinessmsstempexhtml
9 A detailed description of the AAT is provided at http://www.getty.edu/research/tools/vocabulary/aat/about.html
10 http://www.w3.org/TR/2000/CR-rdf-schema-20000327
11 See the information box at the end of the Forum discussion and the sources mentioned in the Semantic Web Terms and Reading List.
12 Michael Denny: Ontology Building: A Survey of Editing Tools (06-11-2002), http://www.xml.com/pub/a/2002/11/06/ontologies.html
Semantic Transformation 1: The RDF Data Model
Progressing towards semantic interoperability, the metadata in the XML documents are now transformed into RDF statements corresponding to the RDF data model. With these so-called RDF 'triples', the XML metadata elements are mapped to the RDF classes and properties which are defined by the RDF Schema of the FMS initiative (see section 3).
'XML is nothing more than a way to standardize data formats. This is not to underplay XML's importance. A data-format standard makes all of the more glamorous technologies possible, and RDF is the leading example of the benefit that comes once the data format has been standardized. Many proclaim that RDF is really the XML's killer app, and with good reason. Despite all this, RDF remains somewhat obscure. This is mainly because at its core, RDF is very abstract, very dry, and very academic.'
Uche Ogbuji: An introduction to RDF (2000), http://www-106.ibm.com/developerworks/library/w-rdf/?dwzone=xml
RDF Data Model
In order to make Web resources semantically interoperable, we need resources that provide machine-understandable information about themselves. In the Semantic Web architecture these statements are built by using the Resource Description Framework (RDF).
RDF defines a data model for statements describing typed relationships between uniquely identified resources. RDF distinguishes between:
| resources: familiar examples are a Web page, an electronic document or a digital image, but in RDF entities that are not 'network retrievable', e.g. museums, curators or bound medieval manuscripts, can also be resources;
| properties: these identify a specific aspect, characteristic, attribute or relation used to describe the resource;
| statements: these associate a value for a named property with the resource.
Hence RDF provides a model for describing relationships between resources in terms of named properties and values. The RDF data model intrinsically supports only binary relations. Its base element is the 'triple', which takes the form subject, predicate, object: a resource (the subject) is linked to another resource (the object) through an arc labelled with a third resource (the predicate). The semantics of a triple clearly depends on the property used as predicate.
A convenient way to visualise this is to draw nodes for subject and object and an arrow between them for the predicate (see graphic 2). In this labelled directed graph, subject and predicate (property) are Uniform Resource Identifiers (URIs), and the object is either a URI or a literal (which is drawn as a box). Everything in RDF can be represented by a graph with nodes and arcs, and the data model allows for using the same URI as a node and as an arc label.
To represent RDF statements in a machine-processable way, RDF builds on XML. With RDF/XML, a specific XML markup language, RDF information can be represented and exchanged between machines.
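
The rules of the data model described above (subject and predicate are URIs, the object is a URI or a literal) can be written down in a few lines. This is a minimal sketch for illustration, not an RDF library. The class names `URI`, `Literal` and `Triple`, the helper `make_triple` and the `http://example.org/` namespace are all assumptions, and blank nodes are ignored.

```python
from typing import NamedTuple, Union


class URI(str):
    """A resource identifier (a node or an arc label in the graph)."""


class Literal(str):
    """A plain value (drawn as a box); it may only appear as an object."""


class Triple(NamedTuple):
    subject: URI
    predicate: URI
    object: Union[URI, Literal]


def make_triple(subject, predicate, obj):
    """Build a triple, enforcing the constraints of the data model."""
    if not isinstance(subject, URI) or not isinstance(predicate, URI):
        raise TypeError("subject and predicate must be URIs")
    if not isinstance(obj, (URI, Literal)):
        raise TypeError("object must be a URI or a literal")
    return Triple(subject, predicate, obj)


EX = "http://example.org/"  # hypothetical namespace, for illustration only
statement = make_triple(
    URI(EX + "miniature1"),
    URI(EX + "hasCreator"),
    Literal("Alexander Master"),
)
```

A set of such triples is already the labelled directed graph: every distinct subject or object is a node, and every triple is one arc.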
[Graphic 2 shows two triples as labelled directed graphs:
Subject: ColumnMiniature; Predicate: rdfs:subClassOf; Object: Miniature
Subject: Miniature; Predicate: rdfs:subClassOf; Object: Image]
With these two triples we state that ColumnMiniature is a subclass of Miniature, and that Miniature is a subclass of Image.
The predicate in our statements is the rdfs:subClassOf property, which is predefined in the RDF Schema namespace http://www.w3.org/2000/01/rdf-schema#
The classes Image, Miniature and ColumnMiniature would also need to be defined in our RDF Schema namespace http://www.m-i.org/schemas/images
For details on how to use RDF Schema for defining your domain ontology, see section 3. In a nutshell, RDF Schema is a 'higher-level' language which is itself defined using RDF.
Graphic 2: RDF Data Model
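
The two subclass statements of graphic 2 can be written as plain triples, and following the rdfs:subClassOf arcs then yields every superclass of a class. The sketch below is illustrative only. The `#` separator after the images namespace is an assumption, and `superclasses` is a hypothetical helper rather than part of any RDF toolkit.

```python
RDFS_SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"
MI = "http://www.m-i.org/schemas/images#"  # the '#' separator is an assumption

TRIPLES = [
    (MI + "ColumnMiniature", RDFS_SUBCLASS_OF, MI + "Miniature"),
    (MI + "Miniature", RDFS_SUBCLASS_OF, MI + "Image"),
]


def superclasses(cls, triples):
    """Collect every class reachable from cls via rdfs:subClassOf arcs."""
    found = []
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for subject, predicate, obj in triples:
            if subject == current and predicate == RDFS_SUBCLASS_OF and obj not in found:
                found.append(obj)
                frontier.append(obj)
    return found


ancestors = superclasses(MI + "ColumnMiniature", TRIPLES)
```

Here `ancestors` contains both Miniature and Image, although no single triple states that a ColumnMiniature is an Image. That conclusion follows from chaining the two statements.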
Semantic Transformation 2 Creating andValidating the RDF StatementsIn the mapping process an editor tool (in the FMScase a tool developed by the project team calledMeedio) receives as input the XML documents andassists in transforming them into semantically validRDF statements (instance descriptions)The toolserves as instance editor which provides a convenientway of finding and selecting from the XML metadataelements correct instance values for a particularpropertyThe editor tool also serves as semanticmetadata validatorWhen the museum cataloguersaves the set of RDF statements corresponding to anXML document the semantics of these statementsare validated against the property constraints of theFMS ontologyThe result of a successful mapping andvalidation process is a unique set of RDF triplescalled the RDF card
RDF Mapping RulesWhen RDF is used to define the meaning of XMLmetadata elements a set of mapping rules is createdA mapping rule is a template of RDF triples whereXPath expressions are used to identify the actualelement values XPath is a language for addressingparts of an XML document and was designed foruse in XML parsing software (XSLT XPointer andothers)
When applying such a rule to an XML documentthe XPath expressions are instantiated with matchingelement values If the rule matches the RDF temp-late evaluates to a set of RDF triples where XPathexpressions are substituted by the correspondingvalues of the XML elementsFor example by applying the template rule ltimage5kb78d38itype mihasCreatorimage5kb78d38icreatorgtto the XML document described in the sectionon XML the following result would be obtainedltImage Miniature mihasCreator lsquoAlexanderMasterlsquogt
mi:hasCreator is an example of an RDF property. Such properties are explained in section 3, RDF Schema.
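The way a mapping rule is instantiated can be illustrated with a minimal sketch in Python, using only the standard library. The XML record, its element names and the rule layout are assumptions made for illustration; they are not the FMS project's actual record format or the Meedio tool's rule syntax.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML metadata record; the element names are illustrative only.
XML_RECORD = """
<image id="5kb78d38i">
  <type>Miniature</type>
  <creator>Alexander Master</creator>
</image>
"""

# A mapping rule is a triple template: subject and object are XPath
# expressions evaluated against the record, the predicate is a fixed property.
RULE = ("./type", "mi:hasCreator", "./creator")


def apply_rule(rule, xml_text):
    """Return the triples produced by one rule, or [] if the rule does not match."""
    subject_path, predicate, object_path = rule
    root = ET.fromstring(xml_text)
    subject = root.findtext(subject_path)
    obj = root.findtext(object_path)
    if subject is None or obj is None:
        return []  # at least one XPath expression found no element
    return [(subject, predicate, obj)]


triples = apply_rule(RULE, XML_RECORD)
print(triples)
```

Note that ElementTree supports only a subset of XPath; a full XPath engine would be needed for more complex expressions.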
Note: Due to the limited space permitted, we do not address issues of term mapping. This is an important aspect of the mapping and validation process carried out in the FMS project. Working with metadata from different museums, the project team needs to deal with partly different terminologies. Their technical solution to synonymous terms (i.e. different terms referring to the same concepts) is to attach synonym sets to the FMS ontology classes. The editor tool cannot cope with polysemous terms (i.e. the same terms referring to different concepts); in such cases the cataloguer needs to select the correct interpretation.
Semantic Transformation 3: The RDF Schema (RDFS)
The shared ontology for the textiles domain is created by using Resource Description Framework Schema (RDFS). An RDF Schema is a tool for indicating the classes of resources one wants to describe, as well as for defining the properties used to describe those resources. Furthermore, class/sub-class relationships and property/sub-property relationships can be defined. The museums are mapping their metadata to the classes and properties defined by the RDF Schema of the FMS initiative. Thereby they are making the meaning of the metadata explicit and representing them in a harmonised, uniform way.
RDF Schema
In section Semantic Transformation 1 we have described the data model provided by RDF for expressing statements about Web resources. But we also need a vocabulary for the RDF statements, namely classes and properties defined with RDF Schema (RDFS).
In brief, the RDF Schema mechanism provides a pre-defined vocabulary, a basic type system that can be used in creating domain-specific schemas. Its role is to allow for declaring metadata properties (e.g. for ‘type’, ‘subject’ or ‘creator’), to define the classes of resources they may be used with, to restrict possible combinations, and to detect violations of those restrictions.
Defining classes
With RDF Schema (RDFS), Web resources can be defined as instances of one or more classes. In addition, classes can be organised in a hierarchical fashion. As we hold a collection of digital images drawn from illustrated medieval manuscripts, we first need to define a class of things that are images. In RDF Schema, a class is any resource having an rdf:type property whose value is the RDFS-defined resource rdfs:Class.
So, using the basic RDF data model, we define:
mi:Image [resource] rdf:type [property] rdfs:Class [value]
The self-defined prefix mi (for medieval images) stands for the URI reference of our RDF Schema namespace, http://www.m-i.org/schemas/images
In our image collection we have various special kinds of digitised images, such as column miniatures, decorated initials, schematic drawings, etc. To distinguish, for example, the miniatures, we first need to define a general class Miniature and subclasses of miniatures, e.g. a subclass ColumnMiniature:
mi:Miniature rdf:type rdfs:Class
mi:ColumnMiniature rdf:type rdfs:Class
Secondly, we need to define that mi:ColumnMiniature is a subclass of mi:Miniature, and that mi:Miniature is a subclass of mi:Image, for which we use the predefined rdfs:subClassOf property:
mi:Miniature rdfs:subClassOf mi:Image
mi:ColumnMiniature rdfs:subClassOf mi:Miniature
As the rdfs:subClassOf property is transitive, this means that mi:ColumnMiniature is also implicitly a subclass of mi:Image.
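The effect of transitivity can be made concrete with a small Python sketch that derives all implied superclasses from the two explicit rdfs:subClassOf statements. The function name and the representation of statements as pairs of prefixed names are assumptions chosen for illustration.

```python
# The two explicit rdfs:subClassOf statements from the text, as (subclass, superclass) pairs.
SUBCLASS_OF = {
    ("mi:Miniature", "mi:Image"),
    ("mi:ColumnMiniature", "mi:Miniature"),
}


def superclasses(cls, subclass_of):
    """Return every direct and inferred superclass of cls."""
    found = set()
    pending = [cls]
    while pending:
        current = pending.pop()
        for sub, sup in subclass_of:
            if sub == current and sup not in found:
                found.add(sup)
                pending.append(sup)  # follow the chain upwards
    return found


print(superclasses("mi:ColumnMiniature", SUBCLASS_OF))
```

Here mi:Image appears among the superclasses of mi:ColumnMiniature although no statement links the two directly.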
Graphic 2 on page 32 visualises this with the nodes and arcs of the basic RDF data model.
Defining properties
In order to make the meaning of our metadata (i.e. ‘type’) explicit, we need to be capable of declaring specific properties that characterise the classes of things we hold at http://www.m-i.org, e.g. digital images of medieval column miniatures.
Basically, RDF Schema defines properties in terms of the classes of resources to which they apply. This is the role of the rdfs:domain and rdfs:range mechanisms.
rdfs:range
The range constraint defines the class or set of classes whose instances can be values of a particular property. If we want to define the property mi:hasType, we must describe this resource (which we locate at http://www.m-i.org/schemas/images) with an rdf:type property whose value is rdf:Property:
mi:hasType [resource] rdf:type [property] rdf:Property [value]
The following RDF statements indicate that mi:ColumnMiniature is a class, mi:hasType is a property, and RDF statements using the mi:hasType property have instances of mi:ColumnMiniature as values:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:range mi:ColumnMiniature
rdfs:domain
The domain constraint restricts the set of classes whose instances may have a particular property attached to them. If we want to indicate that the property mi:hasType applies to instances of class mi:ColumnMiniature, we would write:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:domain mi:ColumnMiniature
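How a validator might check statements against domain and range constraints can be sketched as follows. This is a minimal Python illustration that follows the primer's reading of rdfs:domain and rdfs:range as constraints (the W3C RDFS semantics also use them for inferring types). The class mi:Illuminator, the instances and all function names are hypothetical and do not come from the FMS ontology.

```python
# Hypothetical schema: each class maps to its direct superclass.
SUBCLASS_OF = {"mi:ColumnMiniature": "mi:Miniature", "mi:Miniature": "mi:Image"}

# Hypothetical property declaration with its domain and range.
PROPERTIES = {"mi:hasCreator": {"domain": "mi:Image", "range": "mi:Illuminator"}}

# Hypothetical instance data: each resource maps to its rdf:type.
INSTANCE_TYPES = {
    "mi:img42": "mi:ColumnMiniature",
    "mi:alexanderMaster": "mi:Illuminator",
}


def is_a(cls, expected):
    """True if cls equals expected or is a (transitive) subclass of it."""
    while cls is not None:
        if cls == expected:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def violations(triple):
    """Return a list of constraint violations for one (subject, property, object) statement."""
    subject, prop, obj = triple
    constraint = PROPERTIES.get(prop)
    if constraint is None:
        return [f"unknown property {prop}"]
    problems = []
    if not is_a(INSTANCE_TYPES.get(subject), constraint["domain"]):
        problems.append(f"{subject} is not in the domain of {prop}")
    if not is_a(INSTANCE_TYPES.get(obj), constraint["range"]):
        problems.append(f"{obj} is not in the range of {prop}")
    return problems


print(violations(("mi:img42", "mi:hasCreator", "mi:alexanderMaster")))
print(violations(("mi:alexanderMaster", "mi:hasCreator", "mi:img42")))
```

The first statement passes because a column miniature is, through the subclass chain, also an image; the second violates both constraints.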
Benefits of RDF
In a SearchWebServices.com definition of RDF, some benefits of RDF are mentioned:
| ‘By providing a consistent framework, RDF will encourage the providing of metadata about Internet resources.
| Because RDF will include a standard syntax for describing and querying data, software that exploits metadata will be easier and faster to produce.
| The standard syntax and query capability will allow applications to exchange information more easily.
| Searchers will get more precise results from searching, based on metadata rather than on indexes derived from full text gathering.
| Intelligent software agents will have more precise data to work with.’13
This is a well-crafted listing of RDF benefits, from provision and exchange of better metadata to agents working with them, hopefully for the benefit of humans. But, as explicitly stated by SearchWebServices.com, these are only potential benefits, i.e. they depend on the level of actual uptake of RDF.
13 whatis.com / searchWebServices.com: Definitions - Resource Description Framework, http://searchwebservices.techtarget.com/sDefinition/0,,sid26_gci213545,00.html (last updated July 27, 2001)
Generating and Using the Knowledge Space
The RDF cards represent the original XML documents at the semantic level. The union of such RDF cards constitutes a knowledge base, which is a harmonised semantic representation of the underlying heterogeneous databases.
However, so far the RDF instance descriptions have not left the museum. The museum has complete control of the information it wants to publish, and it does not need to allow the FMS system access to its internal database system. The RDF data are placed in a public directory on the museum’s WWW server.
The Web crawler of the FMS system harvests the instance descriptions from the different museums, and the system combines them into an RDF repository. This repository is a large semantic graph that consists of the shared ontology and metadata.
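The combination step can be pictured as a set union of triples. The following Python sketch assumes that each harvested RDF card is already available as a set of triples; the museum names, item identifiers and the function name are invented for illustration and say nothing about how the FMS crawler or repository is actually implemented.

```python
# Shared ontology statements (a small excerpt, as prefixed names).
ONTOLOGY = {
    ("mi:Miniature", "rdfs:subClassOf", "mi:Image"),
    ("mi:ColumnMiniature", "rdfs:subClassOf", "mi:Miniature"),
}

# Hypothetical harvested RDF cards, grouped by the museum that published them.
CARDS_BY_MUSEUM = {
    "museum-a": [{("a:item1", "rdf:type", "mi:ColumnMiniature")}],
    "museum-b": [
        {("b:item7", "rdf:type", "mi:Miniature")},
        {("b:item9", "rdf:type", "mi:Image")},
    ],
}


def build_repository(ontology, cards_by_museum):
    """Union the shared ontology with all harvested cards into one graph."""
    repository = set(ontology)
    for cards in cards_by_museum.values():
        for card in cards:
            repository |= card  # identical triples are stored only once
    return repository


repository = build_repository(ONTOLOGY, CARDS_BY_MUSEUM)
print(len(repository))
```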
How does a user now search and navigate in this knowledge space? In the FMS system this is implemented by server-side software called Ontogator. Based on the semantic graph, this software dynamically generates semantic linkages for the user’s Web browser.
One way of using the FMS system is view-based filtering. The user can select classes of resources from the ontology, and the system finds the instances that match the selected class restrictions. By constraining classes (views) further, the collection instance data searched for are eventually found.
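The principle of view-based filtering can be sketched in a few lines of Python: selecting a class returns the instances of that class and of all its subclasses, so choosing a narrower class narrows the result. The repository contents and function names are assumptions for illustration; this is not a description of how Ontogator works internally.

```python
# Hypothetical repository: shared ontology plus instance metadata.
REPOSITORY = {
    ("mi:Miniature", "rdfs:subClassOf", "mi:Image"),
    ("mi:ColumnMiniature", "rdfs:subClassOf", "mi:Miniature"),
    ("a:item1", "rdf:type", "mi:ColumnMiniature"),
    ("b:item7", "rdf:type", "mi:Miniature"),
    ("b:item9", "rdf:type", "mi:Image"),
}


def subclasses(cls, graph):
    """Return cls together with all of its direct and inferred subclasses."""
    result = {cls}
    changed = True
    while changed:
        changed = False
        for subject, predicate, obj in graph:
            if predicate == "rdfs:subClassOf" and obj in result and subject not in result:
                result.add(subject)
                changed = True
    return result


def instances_of(cls, graph):
    """Return all instances matching the selected class (view)."""
    wanted = subclasses(cls, graph)
    return {s for s, p, o in graph if p == "rdf:type" and o in wanted}


print(instances_of("mi:Image", REPOSITORY))
print(instances_of("mi:ColumnMiniature", REPOSITORY))
```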
The software also supports topic-based navigation by providing semantic links between topics of interest, the creation of which is based on the collection domain ontology and the related metadata of the collection records. This means that the links also provide the user with an impression of the wider context and pragmatics of the objects in the museums’ collections.
From human users to software agents
As described in the Finnish Museums on the Semantic Web example, the RDF repository is a large semantic graph that consists of the shared ontology and metadata of the participating museums. Such a repository can be queried, and the results, a set of pointers to the relevant resources, can be accessed using Web browsers. The opportunities provided by a system like the one developed by the FMS initiative (e.g. topic-based navigation) are at present restricted primarily to human users.
The Semantic Web vision includes intelligent software agents which ‘understand’ semantic relationships between Web resources and seek relevant information, as well as perform transactions for humans.14
This software would be capable of autonomous action, i.e. could run without direct human control or constant supervision, and ideally is very flexible in doing this. Characterisations of this flexibility include actions that are ‘reactive’, ‘proactive’ and ‘social’ (see below).
While the basic idea of agents is very intuitive and appealing, the actual theory is complex, the tools are immature, and the solutions small and prototype-based. In fact, as a parallel distributed systems technology, agents belong to the most complex class of software technology.
However, this primer will conclude with a summary of what an intelligent software agent is and what such software would generally be capable of doing. This should also serve as an indication of how great the challenge for research and technological development is to make the full Semantic Web vision a reality.
Intelligent Software Agents
The following definitions are taken from Michael Wooldridge’s introduction to multiagent systems.15
Agent
‘An agent is a computer system capable of autonomous action in some environment.’
Intelligent agent
‘An intelligent agent is a computer system capable of flexible autonomous action in some environment.’
Flexible autonomous action
‘By flexible autonomous action, we mean reactive, proactive, social.’
| Reactivity: ‘A reactive system is one that maintains an ongoing interaction with its environment, and responds to changes that occur in it (in time for the response to be useful).’
| Proactiveness: ‘An agent serves a purpose and therefore exhibits goal-directed behaviour, including the capacity to recognise opportunities for useful courses of action.’
| Social ability: ‘Social ability in agents is the ability to interact with other agents (and possibly humans) via some kind of agent communication language, and perhaps cooperate with others.’
Desirable further properties of agents are:
| Mobility: the ability to move around an electronic network.
| Rationality: an agent will act in such a way that it does not prevent itself from achieving its goals (as far as this is possible with a limited set of beliefs representing its world knowledge).
| Learning: an agent will improve its performance over time.
14 T. Berners-Lee, J. Hendler, O. Lassila: Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html
15 M. Wooldridge: An Introduction to Multiagent Systems. Chichester: Wiley, 2002, and http://www.csc.liv.ac.uk/~mjw/pubs/imas/
Resources
In the primer no references are made to the documents of the World Wide Web Consortium (W3C). All relevant W3C recommendations can be found at http://www.w3c.org
Of the wealth of introductory materials on XML and RDF available on the Web, the following in particular are useful to consult for further details:
http://www.w3schools.com/xml
http://www.w3schools.com/schema
http://www.w3.org/TR/rdf-primer
K. S. Candan, H. Liu, R. Suvarna: Resource Description Framework: Metadata and its Applications. In: SIGKDD Explorations, Vol. 3/1 (2001), 6-19
Pierre-Antoine Champin: RDF Tutorial (2001), http://www710.univ-lyon1.fr/~champin/rdf-tutorial/rdf-tutorial.html
S. Decker, M. P. Mitra, S. Melnik: Framework for the Semantic Web: An RDF Tutorial, http://www.ida.liu.se/~asmpa/courses/sweb/rdf/rdf_tutorial.pdf
Graphic 3: Set-up of the Finnish Museums on the Semantic Web System
[Diagram: The Finnish Museums on the Semantic Web, overview of the system’s set-up. User client: WWW browser (topic-based navigation, view-based filtering). Server: Ontogator software and RDF database holding the semantic graph (knowledge space of shared ontology and metadata). A Web crawler harvests the RDF instances from the museums. Museum side, for collection databases 1, 2 ... n: relational schemas (DBMS); XML repository, based on XML Schema (syntactic interoperability); metadata editor; RDF instances, based on the ontology in RDF Schema (semantic interoperability).]
THE DARMSTADT FORUM PARTICIPANTS
Wernher Behrendt, Salzburg Research, Austria
Wernher Behrendt is a Senior Researcher at the Sun Technology and Research Excellence Center at Salzburg Research (Austria), working on multimedia middleware and interoperation issues. He holds an MSc in Cognitive Science from Manchester University and has more than 10 years’ experience in near-to-market IT research. From 1989 to 1995 he was a Senior Research Associate in the Informatics Department at Rutherford Appleton Laboratory (UK), working on embedded knowledge-based systems in distributed multimedia presentation systems. From 1995 to 1998 he was a Senior Research Associate at Cardiff University (UK), working on interoperation between heterogeneous information systems. Mr Behrendt has held courses in Computer Science and has worked on projects ranging from software engineering methods and quality assurance to legacy system reengineering using migration methods and distributed systems middleware.
E-mail: wernher.behrendt@salzburgresearch.at
Paolo Buonora, Archivio di Stato di Roma, Italy
Paolo Buonora holds a degree in Philosophy from the University of Rome ‘La Sapienza’ (1976). He worked in the Italian State Archive Administration from 1978, where he was first involved in editing the Guida Generale degli Archivi di Stato italiani. From 1986 he worked in the Soprintendenza archivistica per il Lazio, surveying audiovisual archives and municipal archives; from 1989 to 1991 he was at Perugia University, engaged in doctoral research in ‘Urban and rural history’; and from 1991 to 1994 he was again in the Soprintendenza archivistica per il Lazio. After 1994 he worked in the Archivio di Stato di Roma, where he was responsible for the photograph service and several working groups on informatics application in archival documentation. From 1997 until the present time he has planned and directed the Imago II project in the Archivio di Stato di Roma.
See httpwwwasrmarchivbeniculturaliitsidimagoIMAGOIIenhtml
Samuel Cruz-Lara, University of Nancy 2 and LORIA, France
Samuel Cruz-Lara obtained a Master’s degree in Computer Science in 1984 (University Henri Poincaré, Nancy 1) and a PhD degree in Computer Science in 1988 (National Polytechnic Institute of Lorraine). The central topic of his PhD thesis was the generation of integrated development environments by using attribute grammars.
He is currently Associate Professor at the University of Nancy 2 (Institute of Technology, Computer Science Department) and permanent Researcher at LORIA (Lorraine Laboratory for Research in Computer Science and its Applications – UMR 7503 – CNRS – INRIA – Universities of Nancy). He is a member of the ‘Language and Dialogue’ team and has conducted several research activities on distributed software architectures and textual linguistic resources management. He is currently working in the context of distributed architectures and multimedia resources management. Dr Cruz-Lara has participated in several projects, in particular CNRS/SILFIDE and MLIS-ELAN, and he is at present co-leader of the ‘Digital Museum’ project. This is a joint project between LORIA and the National Chi-Nan University. The ‘Digital Museum’ project is sponsored by the National Science Council of the Republic of China (Taiwan) and supported by INRIA (France).
Costis Dallas, Critical Publics SA, Greece
Costis Dallas is Chairman and Senior Researcher of Critical Publics (http://www.criticalpublics.com), a group of companies active in the field of strategic communications, creative design and technology. He is currently a Lecturer in the Department of Communication and Mass Media of Panteion University, and has over 15 years of research and professional experience in hypermedia applications, human factor issues and cultural information systems. Dr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA, co-founder and Chair of the Multimedia Working Group of the International Council of Museums (CIDOC/ICOM), Head of Documentation and Systems of the Benaki Museum, General Director of the Foundation of the Hellenic World, Special Secretary of the Greek Ministry of Education in charge of libraries, archives and instructional technologies, and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: info@criticalpublics.com
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (http://www.nl.adlibsoft.com), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that he joined Databasix, producer of the ADLIB software. In 1991 Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was renamed later as ADLIB Information Systems.
ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include:
| CIMI Dublin Core test bed - ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database, in addition to the ‘standard’ HTML/XML access.
| SPECTRUM-XML - A project of the UK mda, together with a team of software vendors, to produce a schema for exchange of data which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software.
| Internet Gelderse Musea (http://www.igem.nl) and Maritiem Digitaal (http://www.maritiemdigitaal.nl) - Two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.
E-mail: bert@nl.adlibsoft.com
Nicola Guarino, Italian National Research Council, Italy
Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991, and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS) and associate editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: Nicola.Guarino@ladseb.pd.cnr.it
Janneke van Kersen, Digital Heritage Association, The Netherlands
Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992 she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing, and more recently she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University.
Since October 1999 van Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata, and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations, such as museums, archives, archaeological organisations, monuments and special collections of libraries, in a context-rich and XML-based environment. Interoperability, from both a technical and an organisational perspective, is therefore a major issue. The Association offers access to heritage information to the general public at http://www.cultuurwijzer.nl and customised access for the educational field at http://www.cultuurwijs.nl
E-mail: janneke.van.kersen@den.nl
Marco Meli, EDW International, Italy
Marco Meli is CEO and co-founder of EDW International (http://www.edw-international.com), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia and VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meli@edw.it
Paul Miller, UKOLN, UK
Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, http://www.ukoln.ac.uk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK’s Further and Higher Education Funding Councils and Resource, the Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI) and the Metadata Working Group of the UK Government’s Office of the e-Envoy.
E-mail: P.Miller@ukoln.ac.uk
Frank Nack, CWI, The Netherlands
Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on ‘The Application of Video Semantics and Theme Representation for Automated Film Editing’ at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling, and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document, and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: Frank.Nack@cwi.nl
Franco Niccolucci, Florence University, Italy
Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference.
See also http://www.geog.port.ac.uk/hist-bound/people/niccolucci.htm
Seamus Ross, HATII, University of Glasgow, UK
Dr Seamus Ross is Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Network) (IST-2001-32706), a European Union funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously, he was Assistant Secretary for Information Technology at the British Academy, and before that worked for a company specialising in expert systems and software development, as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group’s PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: S.Ross@hatii.arts.gla.ac.uk
Andrea Scotti, Institute & Museum for the History of Science, Italy
Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995 he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.
Since 1996 Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza and the National Central Library, and under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (01/02/01-30/07/03), funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies.
See http://cweb.inria.fr/Projects/Mesmuses
E-mail: scottian@galileo.imss.firenze.it
DIGICULT PROJECT INFORMATION
DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over a period of 30 months (03/2002-08/2004).
In order to encourage early take-up, DigiCULT produces seven Thematic Issues and three Technology Watch Reports, along with the newsletter DigiCULT.Info.
DigiCULT draws on the results of the strategic study ‘Technological Landscapes for Tomorrow’s Cultural Economy (DigiCULT)’ that was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications), in 2000 and completed in 2001.
Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at http://www.digicult.info
For further information on DigiCULT please contact the team of the project co-ordinator:
Mr Guntram Geser, guntram.geser@salzburgresearch.at, Phone: +43-(0)662-2288-303
Mr John Pereira, john.pereira@salzburgresearch.at, Phone: +43-(0)662-2288-247
Salzburg Research Forschungsgesellschaft
Jakob-Haringer-Str. 5/III, A-5020 Salzburg, Austria
Phone: +43-(0)662-2288-200
Fax: +43-(0)662-2288-222
http://www.salzburgresearch.at
Project Partner:
HATII - Humanities Advanced Technology and Information Institute, University of Glasgow, http://www.hatii.arts.gla.ac.uk
Contact: Mr Seamus Ross, s.ross@hatii.arts.gla.ac.uk
The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom
DigiCULT Thematic Issue 1 - Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum, held in Barcelona on May 6th, 2002, in the context of the DLM-Conference 2002.
DigiCULT Thematic Issue 2 - Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum, held in Essen, Germany, on September 3rd, 2002, in the context of the AIIM Conference DMS EXPO.
DigiCULT Thematic Issue 3 - Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum, held on January 21st, 2003, at Fraunhofer IPSI, Darmstadt, Germany.
DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum, on Learning Objects, that will take place at the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, on July 2nd, 2003.
IMPRINT
This Thematic Issue is a product of the DigiCULT Project (IST-2001-34898).
Authors:
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy
Images:
Images for this Thematic Issue have been provided by, and are reproduced with kind permission of, the Koninklijke Bibliotheek – National Library of the Netherlands, The Hague, Netherlands.
Graphics & Layout:
Jan Steindl, Salzburg Research
ISBN 3-902448-00-8
Printed in Austria
© 2003
IMAGES
Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I. Image on p. 5, from Fol. 264r, size 80x75, illuminator: Orosius Master a.o.
Bible Historiale, Paris, c. 1320-1340, Volume I. Image on p. 7, from Fol. 2r, size 45x50, illum.: ‘Sub-Fauvel’ Master
Historic Bible, Utrecht, c. 1430, Volume I. Images on pp. 8, 10, 14, 28, from Fol. 3r, 3v, 4v, 7v, size 55x85 to 60x85, illum.: Alexander Master o.a.
Psalter, Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468. Images on pp. 11, 13: Wednesday Matins, Invitatorium, from Fol. 219r
Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350. Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron), from Fol. 104rb1, 106rb2, 124r, 111ra, size 40x55 to 50x55
Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335. Image on p. 21, from Fol. 4va1, size 45x55
Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460. Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41
Psalter, Normandy, c. 1180. Image on p. 26, from Fol. 3v, size 160x125
Breviary, Cambray(?), c. 1275-1300. Images on pp. 27, 29, 31, 33, 34, 35, 37, from Fol. 211r, size 205x135
© Koninklijke Bibliotheek, The Hague. Used with permission.
32 DigiCULT
Semantic Transformation 1: The RDF Data Model
Progressing towards semantic interoperability, the metadata in the XML documents are now transformed into RDF statements corresponding to the RDF data model. With these so-called RDF ‘triples’, the XML metadata elements are mapped to the RDF classes and properties which are defined by the RDF Schema of the FMS initiative (see section 3).
‘XML is nothing more than a way to standardize data formats. This is not to underplay XML’s importance. A data-format standard makes all of the more glamorous technologies possible, and RDF is the leading example of the benefit that comes once the data format has been standardized. Many proclaim that RDF is really the XML’s killer app, and with good reason. Despite all this, RDF remains somewhat obscure. This is mainly because, at its core, RDF is very abstract, very dry, and very academic.’
Uche Ogbuji: An introduction to RDF (2000), http://www-106.ibm.com/developerworks/library/w-rdf/?dwzone=xml
RDF Data Model
In order to make Web resources semantically interoperable, we need resources that provide machine-understandable information about themselves. In the Semantic Web architecture, these statements are built by using the Resource Description Framework (RDF).
RDF defines a data model for the statements describing typed relationships between uniquely identified resources. RDF distinguishes between:
| resources: familiar examples are a Web page, electronic document or digital image; but in RDF also entities that are not ‘network retrievable’, e.g. museums, curators or bound medieval manuscripts, can be resources;
| properties: these identify a specific aspect, characteristic, attribute or relation used to describe the resource;
| statements: these associate a value for a named property with the resource.
Hence, RDF provides a model for describing relationships between resources in terms of named properties and values. The RDF data model intrinsically supports only binary relations. Its base element is the ‘triple’, which takes the form of subject, predicate, object: a resource (the subject) is linked to another resource (the object) through an arc labelled with a third resource (the predicate). The semantics of a triple clearly depends on the property used as predicate.
A convenient way to visualise this is to draw nodes for subject and object and an arrow between them for the predicate (see Graphic 2). In this labelled directed graph, subject and predicate (property) are Uniform Resource Identifiers (URIs), and the object is either a URI or a literal (which is drawn as a box). Everything in RDF can be represented by a graph with nodes and arcs, and the data model allows for using the same URI as a node and as an arc label. To represent RDF statements in a machine-processable way, RDF builds on XML. With RDF/XML, a specific XML markup language, RDF information can be represented and exchanged between machines.
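The data model described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any RDF toolkit is implemented: the class names URI, Literal and Triple and the helper make_triple are invented for this sketch, and the '#' separator in the mi namespace is assumed. The graph holds the two statements of Graphic 2 plus one statement that uses a property URI as a node.

```python
from typing import NamedTuple, Union

# Namespace URIs as given in the primer; the '#' separator for MI is an assumption.
MI = "http://www.m-i.org/schemas/images#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"


class URI(str):
    """A resource identified by a URI (a node or an arc label in the graph)."""


class Literal(str):
    """A plain literal value (drawn as a box); only allowed as an object."""


class Triple(NamedTuple):
    subject: URI
    predicate: URI
    object: Union[URI, Literal]


def make_triple(subject, predicate, obj):
    """Build a triple, enforcing that subject and predicate are URIs."""
    if not isinstance(subject, URI) or not isinstance(predicate, URI):
        raise TypeError("subject and predicate must be URIs")
    if not isinstance(obj, (URI, Literal)):
        raise TypeError("object must be a URI or a literal")
    return Triple(subject, predicate, obj)


# The two statements shown in Graphic 2.
graph = {
    make_triple(URI(MI + "ColumnMiniature"), URI(RDFS + "subClassOf"), URI(MI + "Miniature")),
    make_triple(URI(MI + "Miniature"), URI(RDFS + "subClassOf"), URI(MI + "Image")),
}

# The same URI may be an arc label in one triple and a node in another.
graph.add(make_triple(URI(RDFS + "subClassOf"), URI(RDFS + "label"), Literal("subClassOf")))
```

Storing the graph as a plain set of triples mirrors the statement that everything in RDF can be represented by nodes and arcs; a real application would more likely use an RDF library and RDF/XML serialisation.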
Graphic 2: RDF Data Model
[Graphic: two RDF triples drawn as a labelled directed graph]
Triple 1. Subject: http://www.m-i.org/schemas/images#ColumnMiniature; Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf; Object: http://www.m-i.org/schemas/images#Miniature
Triple 2. Subject: http://www.m-i.org/schemas/images#Miniature; Predicate: http://www.w3.org/2000/01/rdf-schema#subClassOf; Object: http://www.m-i.org/schemas/images#Image
With these two triples we state that ColumnMiniature is a subclass of Miniature and that Miniature is a subclass of Image.
The predicate in our statements is the rdfs:subClassOf property, which is predefined in the RDF Schema namespace http://www.w3.org/2000/01/rdf-schema#
The classes Image, Miniature and ColumnMiniature would also need to be defined in our RDF Schema namespace http://www.m-i.org/schemas/images
For details on how to use RDF Schema for defining your domain ontology, see section 3. In a nutshell, RDF Schema is a ‘higher-level’ language which is itself defined using RDF.
Semantic Transformation 2: Creating and Validating the RDF Statements
In the mapping process, an editor tool (in the FMS case, a tool developed by the project team called Meedio) receives the XML documents as input and assists in transforming them into semantically valid RDF statements (instance descriptions). The tool serves as an instance editor, which provides a convenient way of finding and selecting, from the XML metadata elements, correct instance values for a particular property. The editor tool also serves as a semantic metadata validator. When the museum cataloguer saves the set of RDF statements corresponding to an XML document, the semantics of these statements are validated against the property constraints of the FMS ontology. The result of a successful mapping and validation process is a unique set of RDF triples called the RDF card.
RDF Mapping Rules
When RDF is used to define the meaning of XML metadata elements, a set of mapping rules is created. A mapping rule is a template of RDF triples where XPath expressions are used to identify the actual element values. XPath is a language for addressing parts of an XML document and was designed for use in XML parsing software (XSLT, XPointer and others).
When applying such a rule to an XML document, the XPath expressions are instantiated with matching element values. If the rule matches, the RDF template evaluates to a set of RDF triples where XPath expressions are substituted by the corresponding values of the XML elements.
For example, by applying the template rule
<image5kb78d38itype, mi:hasCreator, image5kb78d38icreator>
to the XML document described in the section on XML, the following result would be obtained:
<Image Miniature, mi:hasCreator, ‘Alexander Master’>
mi:hasCreator is an example of an RDF property. Such properties are explained in section 3, RDF Schema.
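How such a template might be instantiated can be sketched with Python's standard XML library. Everything specific here is an assumption made for illustration: the XML record, its element and attribute names, the dictionary form of the rule and the function apply_rule are hypothetical and do not reproduce the Meedio tool or the FMS rule syntax.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML record; element and attribute names are assumptions.
XML_DOC = """
<image id="5kb78d38i">
  <type>Miniature</type>
  <creator>Alexander Master</creator>
</image>
"""

# A mapping rule: a triple template whose subject and object slots are
# XPath expressions (ElementTree supports a limited XPath subset).
RULE = {"subject": "./type", "predicate": "mi:hasCreator", "object": "./creator"}


def apply_rule(rule, xml_text):
    """Instantiate the template; return a set of triples (empty if no match)."""
    root = ET.fromstring(xml_text)
    subject = root.findtext(rule["subject"])
    obj = root.findtext(rule["object"])
    if subject is None or obj is None:
        return set()  # the rule does not match this document
    return {(subject.strip(), rule["predicate"], obj.strip())}


triples = apply_rule(RULE, XML_DOC)
```

A document that lacks one of the addressed elements yields no triples, which corresponds to the case where the rule does not match.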
Note: Due to the limited space permitted, we do not address issues of term mapping. This is an important aspect of the mapping and validation process carried out in the FMS project. Working with metadata from different museums, the project team needs to deal with partly different terminologies. Their technical solution to synonymous terms (i.e. different terms referring to the same concept) is to attach synonym sets to the FMS ontology classes. Where polysemous terms occur (i.e. the same term refers to different concepts), the editor tool cannot cope, and the cataloguer needs to select the correct interpretation.
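The difference between the two cases in the note can be made concrete with a small sketch. The synonym sets, class names and function names below are invented for illustration and say nothing about the actual FMS ontology: a term found in exactly one synonym set is resolved automatically, while a term found in several sets (or in none) is left to the cataloguer.

```python
# Hypothetical synonym sets attached to ontology classes (illustrative terms only).
SYNONYMS = {
    "mi:Miniature": {"miniature", "illumination"},
    "mi:Initial": {"initial", "capital"},
    "mi:Column": {"column", "capital"},  # 'capital' is polysemous in this toy data
}


def candidate_classes(term):
    """Return every class whose synonym set contains the term."""
    term = term.strip().lower()
    return sorted(cls for cls, words in SYNONYMS.items() if term in words)


def resolve(term):
    """Map a term to a class, or return None if a human decision is needed."""
    candidates = candidate_classes(term)
    if len(candidates) == 1:
        return candidates[0]
    return None  # unknown or polysemous: the cataloguer has to choose
```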
Semantic Transformation 3: The RDF Schema (RDFS)
The shared ontology for the textiles domain is created by using Resource Description Framework Schema (RDFS). An RDF Schema is a tool for indicating the classes of resources one wants to describe, as well as for defining the properties used to describe those resources. Furthermore, class/sub-class relationships and property/sub-property relationships can be defined. The museums are mapping their metadata to the classes and properties defined by the RDF Schema of the FMS initiative. Thereby they are making the meaning of the metadata explicit and representing them in a harmonised, uniform way.
RDF Schema
In section Semantic Transformation 1, we have described the data model provided by RDF for expressing statements about Web resources. But we also need a vocabulary for the RDF statements, namely classes and properties defined with RDF Schema (RDFS).
In brief, the RDF Schema mechanism provides a pre-defined vocabulary, a basic type system that can be used in creating domain-specific schemas. Its role is to allow for declaring metadata properties (e.g. for ‘type’, ‘subject’ or ‘creator’), to define the classes of resources they may be used with, to restrict possible combinations, and to detect violations of those restrictions.
Defining classes
With RDF Schema (RDFS), Web resources can be defined as instances of one or more classes. In addition, classes can be organised in a hierarchical fashion. As we hold a collection of digital images drawn from illustrated medieval manuscripts, we first need to define a class of things that are images. In RDF Schema, a class is any resource having an rdf:type property whose value is the RDFS-defined resource rdfs:Class.
So, using the basic RDF data model, we define:
mi:Image [resource] rdf:type [property] rdfs:Class [value]
The self-defined prefix mi (for medieval images) stands for the URI reference of our RDF Schema namespace, http://www.m-i.org/schemas/images
In our image collection we have various special kinds of digitised images, such as column miniatures, decorated initials, schematic drawings, etc. To distinguish, for example, the miniatures, first we need to define a general class Miniature and subclasses of miniatures, e.g. a subclass ColumnMiniature:
mi:Miniature rdf:type rdfs:Class
mi:ColumnMiniature rdf:type rdfs:Class
Secondly, we need to define that mi:ColumnMiniature is a subclass of mi:Miniature and that mi:Miniature is a subclass of mi:Image, for which we use the predefined rdfs:subClassOf property:
mi:Miniature rdfs:subClassOf mi:Image
mi:ColumnMiniature rdfs:subClassOf mi:Miniature
As the rdfs:subClassOf property is transitive, this means that mi:ColumnMiniature is also implicitly a subclass of mi:Image.
Graphic 2 visualises this with the nodes and arcs of the basic RDF data model.
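The effect of transitivity can be checked with a short sketch that derives the implicit superclasses from the two direct rdfs:subClassOf statements. The function names and the pair representation are assumptions made for this illustration only.

```python
# Direct rdfs:subClassOf statements from the text, as (subclass, superclass) pairs.
SUBCLASS_OF = {
    ("mi:ColumnMiniature", "mi:Miniature"),
    ("mi:Miniature", "mi:Image"),
}


def superclasses(cls, statements=SUBCLASS_OF):
    """Return all direct and implied superclasses of cls."""
    found = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for sub, sup in statements:
            if sub == current and sup not in found:
                found.add(sup)
                frontier.append(sup)
    return found


def is_subclass_of(cls, other, statements=SUBCLASS_OF):
    """True if cls is other or a (possibly implicit) subclass of it."""
    return cls == other or other in superclasses(cls, statements)
```

With these statements, superclasses("mi:ColumnMiniature") contains both mi:Miniature and mi:Image, although only the first relation was stated directly.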
Defining properties
In order to make the meaning of our metadata (i.e. ‘type’) explicit, we need to be capable of declaring specific properties that characterise the classes of things we hold at http://www.m-i.org, e.g. digital images of medieval column miniatures.
Basically, RDF Schema defines properties in terms of the classes of resources to which they apply. This is the role of the rdfs:domain and rdfs:range mechanisms.
rdfs:range
The range constraint defines the class or set of classes whose instances can be values of a particular property. If we want to define the property mi:hasType, we must describe this resource (which we locate at http://www.m-i.org/schemas/images) with an rdf:type property whose value is rdf:Property:
mi:hasType [resource] rdf:type [property] rdf:Property [value]
The following RDF statements indicate that mi:ColumnMiniature is a class, mi:hasType is a property, and RDF statements using the mi:hasType property have instances of mi:ColumnMiniature as values:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:range mi:ColumnMiniature
rdfs:domain
The domain constraint restricts the set of classes whose instances may have a particular property attached to them. If we want to indicate that the property mi:hasType applies to instances of class mi:ColumnMiniature, we would write:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:domain mi:ColumnMiniature
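A validator of the kind mentioned under Semantic Transformation 2 could check statements against such declarations roughly as follows. This sketch treats rdfs:domain and rdfs:range as constraints, as the text does. The instance data, the function name and the dictionary layout are hypothetical, and the domain class is set to mi:Image here (an assumption, differing from the example above) purely so that the two checks are distinguishable.

```python
# Schema: the range statement follows the text; the domain class is changed
# to mi:Image here only so that the two checks differ (an assumption).
DOMAIN = {"mi:hasType": {"mi:Image"}}
RANGE = {"mi:hasType": {"mi:ColumnMiniature"}}

# Hypothetical instance data: resource -> set of classes it is an instance of.
TYPES = {
    "ex:fol3r-scan": {"mi:Image"},
    "ex:kind-column": {"mi:ColumnMiniature"},
    "ex:curator": set(),
}


def violations(statement, types=TYPES, domain=DOMAIN, range_=RANGE):
    """Return a list of constraint violations for one (s, p, o) statement."""
    subject, prop, obj = statement
    problems = []
    allowed_subjects = domain.get(prop)
    if allowed_subjects and not (types.get(subject, set()) & allowed_subjects):
        problems.append(f"{subject} is outside the domain of {prop}")
    allowed_values = range_.get(prop)
    if allowed_values and not (types.get(obj, set()) & allowed_values):
        problems.append(f"{obj} is outside the range of {prop}")
    return problems
```

A statement passes when the returned list is empty; a statement whose subject and value both have the wrong type yields two entries.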
Benefits of RDF
In a SearchWebServices.com definition of RDF, some benefits of RDF are mentioned:
| ‘By providing a consistent framework, RDF will encourage the providing of metadata about Internet resources.
| Because RDF will include a standard syntax for describing and querying data, software that exploits metadata will be easier and faster to produce.
| The standard syntax and query capability will allow applications to exchange information more easily.
| Searchers will get more precise results from searching, based on metadata rather than on indexes derived from full text gathering.
| Intelligent software agents will have more precise data to work with.’13
This is a well-crafted listing of RDF benefits, from provision and exchange of better metadata to agents working with them, hopefully for the benefit of humans. But, as explicitly stated by SearchWebServices.com, these are only potential benefits, i.e. they depend on the level of actual uptake of RDF.
13 whatis.com / searchWebServices.com: Definitions - Resource Description Framework, http://searchwebservices.techtarget.com/sDefinition/0,,sid26_gci213545,00.html (last updated July 27, 2001)
Generating and Using the Knowledge Space
The RDF cards represent the original XML documents at the semantic level. The union of such RDF cards constitutes a knowledge base which is a harmonised semantic representation of the underlying heterogeneous databases.
However, so far the RDF instance descriptions have not left the museum. The museum has complete control of the information it wants to publish, and it does not need to allow the FMS system access to its internal database system. The RDF data are placed in a public directory on the museum’s WWW server.
The Web crawler of the FMS system harvests the instance descriptions from the different museums, and the system combines them into an RDF repository. This repository is a large semantic graph that consists of the shared ontology and metadata.
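Because each RDF card is a set of triples, combining harvested cards with the shared ontology amounts to a set union. The sketch below illustrates only this step; the museum data, identifiers and the function build_repository are hypothetical, and the actual harvesting over HTTP is left out.

```python
# Shared ontology statements (from the primer's running example).
ONTOLOGY = {
    ("mi:ColumnMiniature", "rdfs:subClassOf", "mi:Miniature"),
    ("mi:Miniature", "rdfs:subClassOf", "mi:Image"),
}

# Hypothetical RDF cards as harvested from two museums' public directories.
MUSEUM_A_CARDS = [
    {("a:img1", "rdf:type", "mi:ColumnMiniature"),
     ("a:img1", "mi:hasCreator", "Alexander Master")},
]
MUSEUM_B_CARDS = [
    {("b:img7", "rdf:type", "mi:Miniature")},
    {("b:img7", "rdf:type", "mi:Miniature")},  # the same card harvested twice
]


def build_repository(ontology, *card_collections):
    """Union the ontology and all harvested cards into one set of triples."""
    repository = set(ontology)
    for cards in card_collections:
        for card in cards:
            repository |= card
    return repository


repository = build_repository(ONTOLOGY, MUSEUM_A_CARDS, MUSEUM_B_CARDS)
```

Using a set means that a card harvested twice adds nothing new, so repeated crawls do not inflate the repository.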
How does a user now search and navigate in this knowledge space? In the FMS system this is implemented by a server-side software called Ontogator. Based on the semantic graph, this software dynamically generates semantic linkages for the user’s Web browser.
One way of using the FMS system is view-based filtering. The user can select classes of resources from the ontology, and the system finds the instances that match the selected class restrictions. By constraining classes (views) further, the collection instance data searched for are eventually found.
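The principle of view-based filtering can be illustrated with a simplified sketch: each additional class selected narrows the result set. This is not Ontogator's algorithm; the instance data and the function filter_by_views are assumptions made for this example.

```python
# Hypothetical repository content: instance -> set of ontology classes it belongs to.
INSTANCES = {
    "a:img1": {"mi:Image", "mi:Miniature", "mi:ColumnMiniature"},
    "a:img2": {"mi:Image", "mi:Initial"},
    "b:img7": {"mi:Image", "mi:Miniature"},
}


def filter_by_views(selected_classes, instances=INSTANCES):
    """Return instances that belong to every selected class, sorted by name."""
    wanted = set(selected_classes)
    return sorted(name for name, classes in instances.items() if wanted <= classes)
```

Selecting only mi:Image returns all three instances, adding mi:Miniature leaves two, and selecting mi:ColumnMiniature leaves one.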
The software also supports topic-based navigation by providing semantic links between topics of interest, the creation of which is based on the collection domain ontology and the related metadata of the collection records. This means that the links also provide the user with an impression of the wider context and pragmatics of the objects in the museums’ collections.
From human users to software agents
As described in the Finnish Museums on the Semantic Web example, the RDF repository is a large semantic graph that consists of the shared ontology and metadata of the participating museums. Such a repository can be queried, and the results, a set of pointers to the relevant resources, can be accessed using Web browsers. The opportunities provided by a system like the one developed by the FMS initiative (e.g. topic-based navigation) are at present restricted primarily to human users.
The Semantic Web vision includes intelligent software agents which ‘understand’ semantic relationships between Web resources and seek relevant information as well as perform transactions for humans.14
This software would be capable of autonomous action, i.e. could run without direct human control or constant supervision, and ideally is very flexible in doing this. Characterisations of this flexibility include actions that are ‘reactive’, ‘proactive’ and ‘social’ (see below).
While the basic idea of agents is very intuitive and appealing, the actual theory is complex, the tools are immature, and the solutions small and prototype-based. In fact, as a parallel distributed systems technology, agents belong to the most complex class of software technology.
However, this primer will conclude with a summary of what an intelligent software agent is and what such software would generally be capable of doing. This should also serve as an indication of how great the challenge for research and technological development is to make the full Semantic Web vision a reality.
Intelligent Software Agents
The following definitions are taken from Michael Wooldridge’s introduction to multiagent systems.15
Agent
‘An agent is a computer system capable of autonomous action in some environment.’
Intelligent agent
‘An intelligent agent is a computer system capable of flexible autonomous action in some environment.’
Flexible autonomous action
‘By flexible autonomous action we mean reactive, proactive, social.’
| Reactivity: ‘A reactive system is one that maintains an ongoing interaction with its environment, and responds to changes that occur in it (in time for the response to be useful).’
| Proactiveness: ‘An agent serves a purpose and therefore exhibits goal-directed behaviour, including the capacity to recognise opportunities for useful courses of action.’
| Social ability: ‘Social ability in agents is the ability to interact with other agents (and possibly humans) via some kind of agent communication language, and perhaps cooperate with others.’
Desirable further properties of agents are:
| Mobility: the ability to move around an electronic network;
| Rationality: an agent will act in such a way that it does not prevent itself from achieving its goals (as far as this is possible with a limited set of beliefs representing its world knowledge);
| Learning: an agent will improve its performance over time.
14 T. Berners-Lee, J. Hendler, O. Lassila, Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html
15 M. Wooldridge, An Introduction to Multiagent Systems, Chichester: Wiley, 2002, and http://www.csc.liv.ac.uk/~mjw/pubs/imas/
Resources
In the primer, no references are made to the documents of the World Wide Web Consortium (W3C). All relevant W3C recommendations can be found at http://www.w3.org
Of the wealth of introductory materials on XML and RDF available on the Web, the following in particular are useful to consult for further details:
http://www.w3schools.com/xml/
http://www.w3schools.com/schema/
http://www.w3.org/TR/rdf-primer/
K. S. Candan, H. Liu, R. Suvarna, Resource Description Framework: Metadata and its Applications. In: SIGKDD Explorations, Vol. 3/1 (2001), 6-19
Pierre-Antoine Champin, RDF Tutorial (2001), http://www710.univ-lyon1.fr/~champin/rdf-tutorial/rdf-tutorial.html
S. Decker, M. P. Mitra, S. Melnik, Framework for the Semantic Web: An RDF Tutorial, http://www.ida.liu.se/~asmpa/courses/sweb/rdf/rdf_tutorial.pdf
Graphic 3: Set-up of the Finnish Museums on the Semantic Web System
The Finnish Museums on the Semantic Web: Overview of the System’s Set-up
[Graphic: system architecture, from the user level down to the museum databases]
User: client WWW browser (topic-based navigation, view-based filtering)
Server: Ontogator software; RDF database; semantic graph (knowledge space of shared ontology and metadata); Web crawler
RDF Schema (semantic interoperability): Metadata Editor; RDF instances for each museum
XML Schema (syntactic interoperability): XML repository for each museum
Relational Schemas (DBMS): Collection database 1, Collection database 2, ... Collection database n
THE DARMSTADT FORUM PARTICIPANTS
Wernher Behrendt, Salzburg Research, Austria
Wernher Behrendt is a Senior Researcher at the Sun Technology and Research Excellence Center at Salzburg Research (Austria), working on multimedia middleware and interoperation issues. He holds an MSc in Cognitive Science from Manchester University and has more than 10 years’ experience in near-to-market IT research. From 1989 to 1995 he was a Senior Research Associate in the Informatics Department at Rutherford Appleton Laboratory (UK), working on embedded knowledge based systems in distributed multimedia presentation systems. From 1995 to 1998 he was a Senior Research Associate at Cardiff University (UK), working on interoperation between heterogeneous information systems. Mr Behrendt has held courses in Computer Science and has worked on projects ranging from software engineering methods and quality assurance to legacy system reengineering using migration methods and distributed systems middleware.
E-mail: wernher.behrendt@salzburgresearch.at
Paolo Buonora, Archivio di Stato di Roma, Italy
Paolo Buonora holds a degree in Philosophy from the University of Rome ‘La Sapienza’ (1976). He worked in the Italian State Archive Administration from 1978, where he was first involved in editing the Guida Generale degli Archivi di Stato italiani. From 1986 he worked in the Soprintendenza archivistica per il Lazio, surveying audiovisual archives and municipal archives; from 1989 to 1991 he was at Perugia University, engaged in doctoral research in ‘Urban and rural history’; and from 1991 to 1994 he was again in the Soprintendenza archivistica per il Lazio. After 1994 he worked in the Archivio di Stato di Roma, where he was responsible for the photograph service and several working groups on informatics application in archival documentation. From 1997 until the present time he has planned and directed the Imago II project in the Archivio di Stato di Roma.
See httpwwwasrmarchivbeniculturaliitsidimagoIMAGOIIenhtml
Samuel Cruz-Lara, University of Nancy 2 and LORIA, France
Samuel Cruz-Lara obtained a Master’s degree in Computer Science in 1984 (University Henri Poincaré, Nancy 1) and a PhD degree in Computer Science in 1988 (National Polytechnic Institute of Lorraine). The central topic of his PhD thesis was the generation of integrated development environments by using attribute grammars.
He is currently Associate Professor at the University of Nancy 2 (Institute of Technology, Computer Science Department) and permanent Researcher at LORIA (Lorraine Laboratory for Research in Computer Science and its Applications - UMR 7503 - CNRS - INRIA - Universities of Nancy). He is a member of the ‘Language and Dialogue’ team and has conducted several research activities on distributed software architectures and textual linguistic resources management. He is currently working in the context of distributed architectures and multimedia resources management. Dr Cruz-Lara has participated in several projects, in particular CNRS/SILFIDE and MLIS-ELAN, and he is at present co-leader of the ‘Digital Museum’ project. This is a joint project between LORIA and the National Chi-Nan University. The ‘Digital Museum’ project is sponsored by the National Science Council of the Republic of China (Taiwan) and supported by INRIA (France).
Costis Dallas, Critical Publics SA, Greece
Costis Dallas is Chairman and Senior Researcher of Critical Publics (http://www.criticalpublics.com), a group of companies active in the field of strategic communications, creative design and technology. He is currently a Lecturer in the Department of Communication and Mass Media of Panteion University, and has over 15 years of research and professional experience in hypermedia applications, human factor issues and cultural information systems. Dr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA, co-founder and Chair of the Multimedia Working Group of the International Council of Museums (CIDOC/ICOM), Head of Documentation and Systems of the Benaki Museum, General Director of the Foundation of the Hellenic World, Special Secretary of the Greek Ministry of Education in charge of libraries, archives and instructional technologies, and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: info@criticalpublics.com
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (http://www.nl.adlibsoft.com), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that he joined Databasix, producer of the ADLIB software. In 1991 Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was renamed later as ADLIB Information Systems.
ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include:
CIMI Dublin Core test bed - ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database, in addition to the ‘standard’ HTML/XML access.
SPECTRUM-XML - A project of the UK mda, together with a team of software vendors, to produce a schema for exchange of data which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software.
Internet Gelderse Musea (http://www.igem.nl) and Maritiem Digitaal (http://www.maritiemdigitaal.nl) - Two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.
E-mail: bert@nl.adlibsoft.com
Nicola Guarino, Italian National Research Council, Italy
Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991, and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS) and associated editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: Nicola.Guarino@ladseb.pd.cnr.it
Janneke van Kersen, Digital Heritage Association, The Netherlands
Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992 she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing, and more recently she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University.
Since October 1999 van Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata, and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations such as museums, archives, archaeological organisations, monuments and special collections of libraries in a context-rich and XML-based environment. Interoperability from both a technical and an organisational perspective is therefore a major issue. The Association offers access to heritage information to the general public at http://www.cultuurwijzer.nl and customised access for the educational field at http://www.cultuurwijs.nl
E-mail: janneke.vankersen@den.nl
Marco Meli, EDW International, Italy
Marco Meli is CEO and co-founder of EDW International (http://www.edw-international.com), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia and VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meli@edw.it
Paul Miller, UKOLN, UK
Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, http://www.ukoln.ac.uk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK’s Further and Higher Education Funding Councils and Resource, the Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI), and the Metadata Working Group of the UK Government’s Office of the e-Envoy.
E-mail: P.Miller@ukoln.ac.uk
Frank Nack, CWI, The Netherlands
Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on ‘The Application of Video Semantics and Theme Representation for Automated Film Editing’ at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling, and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document, and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: Frank.Nack@cwi.nl
Franco Niccolucci, Florence University, Italy
Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference.
See also http://www.geog.port.ac.uk/hist-bound/people/niccolucci.htm
Seamus Ross, HATII, University of Glasgow, UK
Dr Seamus Ross is Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Network) (IST-2001-32706), a European Union funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously he was Assistant Secretary for Information Technology at the British Academy, and before that worked for a company specialising in expert systems and software development as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group’s PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: S.Ross@hatii.arts.gla.ac.uk
Andrea Scotti, Institute & Museum for the History of Science, Italy
Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995 he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.
Since 1996 Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza and the National Central Library, and under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (01/02/01-30/07/03), funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies.
See http://cweb.inria.fr/Projects/Mesmuses
E-mail: scottian@galileo.imss.firenze.it
DIGICULT PROJECT INFORMATION
DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over the period of 30 months (03/2002-08/2004).
In order to encourage early take-up, DigiCULT produces seven Thematic Issues and three Technology Watch Reports, along with the newsletter DigiCULT.Info.
DigiCULT draws on the results of the strategic study ‘Technological Landscapes for Tomorrow’s Cultural Economy (DigiCULT)’ that was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications) in 2000 and completed in 2001.
Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at http://www.digicult.info
For further information on DigiCULT, please contact the team of the project co-ordinator:
Mr Guntram Geser, guntram.geser@salzburgresearch.at, Phone: +43-(0)662-2288-303
Mr John Pereira, john.pereira@salzburgresearch.at, Phone: +43-(0)662-2288-247
Salzburg Research Forschungsgesellschaft, Jakob-Haringer-Str. 5/III, A - 5020 Salzburg, Austria
Phone: +43-(0)662-2288-200, Fax: +43-(0)662-2288-222
http://www.salzburgresearch.at
Project Partner:
HATII - Humanities Advanced Technology and Information Institute, University of Glasgow
http://www.hatii.arts.gla.ac.uk
Contact: Mr Seamus Ross, s.ross@hatii.arts.gla.ac.uk
The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom
DigiCULT Thematic Issue 1 - Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum, held in Barcelona on May 6th, 2002, in the context of the DLM-Conference 2002.
DigiCULT Thematic Issue 2 - Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum, held in Essen, Germany, on September 3rd, 2002, in the context of the AIIM Conference DMS EXPO.
DigiCULT Thematic Issue 3 - Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum, held on January 21st, 2003, at Fraunhofer IPSI, Darmstadt, Germany.
DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum, on Learning Objects, that will take place at the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, on July 2nd, 2003.
IMPRINT
This Thematic Issue is a product of the DigiCULT Project (IST-2001-34898).

Authors:
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy

Images:
Images for this Thematic Issue have been provided by and are reproduced with kind permission of the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, Netherlands.

Graphics & Layout:
Jan Steindl, Salzburg Research

ISBN 3-902448-00-8
Printed in Austria
© 2003
IMAGES
Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I
Image on p. 5 from Fol. 264r, size 80x75
illuminator: Orosius Master a.o.

Bible Historiale, Paris, c. 1320-1340, Volume I
Image on p. 7 from Fol. 2r, size 45x50
illum.: 'Sub-Fauvel' Master

Historic Bible, Utrecht, c. 1430, Volume I
Images on pp. 8, 10, 14, 28 from Fol. 3r, 3v, 4v, 7v, size 55x85 to 60x85
illum.: Alexander Master o.a.

Psalter Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468
Images on pp. 11, 13: Wednesday Matins, Invitatorium, from Fol. 219r

Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350
Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron) from Fol. 104rb1, 106rb2, 124r, 111ra, size 40x55 to 50x55

Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335
Image on p. 21 from Fol. 4va1, size 45x55

Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460
Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41

Psalter, Normandy, c. 1180
Image on p. 26 from Fol. 3v, size 160x125

Breviary, Cambray (?), c. 1275-1300
Images on pp. 27, 29, 31, 33, 34, 35, 37 from Fol. 211r, size 205x135

© Koninklijke Bibliotheek, The Hague. Used with permission.
Semantic Transformation 2: Creating and Validating the RDF Statements
In the mapping process, an editor tool (in the FMS case, a tool developed by the project team called Meedio) receives the XML documents as input and assists in transforming them into semantically valid RDF statements (instance descriptions). The tool serves as an instance editor, which provides a convenient way of finding and selecting, from the XML metadata elements, correct instance values for a particular property. The editor tool also serves as a semantic metadata validator. When the museum cataloguer saves the set of RDF statements corresponding to an XML document, the semantics of these statements are validated against the property constraints of the FMS ontology. The result of a successful mapping and validation process is a unique set of RDF triples called the RDF card.
RDF Mapping Rules
When RDF is used to define the meaning of XML metadata elements, a set of mapping rules is created. A mapping rule is a template of RDF triples in which XPath expressions are used to identify the actual element values. XPath is a language for addressing parts of an XML document and was designed for use by XML technologies such as XSLT and XPointer.
When such a rule is applied to an XML document, the XPath expressions are instantiated with matching element values. If the rule matches, the RDF template evaluates to a set of RDF triples in which the XPath expressions are substituted by the corresponding values of the XML elements. For example, by applying the template rule
<image5kb78d38itype, mi:hasCreator, image5kb78d38icreator>
to the XML document described in the section on XML, the following result would be obtained:
<Image Miniature, mi:hasCreator, 'Alexander Master'>
mi:hasCreator is an example of an RDF property. Such properties are explained in section 3, RDF Schema.
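The rule application described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Meedio tool: the XML layout, the id attribute, the XPath strings and the function name apply_rule are all hypothetical, since the original XML document and the exact rule syntax are not reproduced here.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML metadata record (layout assumed for illustration only).
XML_DOC = """<images>
  <image id="5kb78d38i">
    <type>Miniature</type>
    <creator>Alexander Master</creator>
  </image>
</images>"""

# A mapping rule as a triple template: two XPath expressions and one RDF property.
RULE = (
    "image[@id='5kb78d38i']/type",     # XPath yielding the subject value
    "mi:hasCreator",                   # RDF property, used as-is
    "image[@id='5kb78d38i']/creator",  # XPath yielding the object value
)


def apply_rule(root, rule):
    """Instantiate the template; return an RDF triple, or None if the rule does not match."""
    subject_path, prop, object_path = rule
    subject = root.find(subject_path)
    obj = root.find(object_path)
    if subject is None or obj is None:
        return None  # the rule does not match this document
    return (subject.text, prop, obj.text)


triple = apply_rule(ET.fromstring(XML_DOC), RULE)
```

A rule that finds no matching elements yields no triple, which mirrors the 'if the rule matches' condition in the text.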
Note: Due to the limited space permitted, we do not address issues of term mapping. This is an important aspect of the mapping and validation process carried out in the FMS project. Working with metadata from different museums, the project team needs to deal with partly different terminologies. Their technical solution to synonymous terms (i.e. different terms referring to the same concept) is to attach synonym sets to the FMS ontology classes. Where polysemous terms occur (i.e. the same term refers to different concepts), the editor tool cannot cope, and the cataloguer needs to select the correct interpretation.
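The distinction between synonymous and polysemous terms can be made concrete with a small Python sketch. It assumes a simple in-memory table of synonym sets; the class names, the terms and the functions candidate_classes and map_term are hypothetical and do not come from the FMS ontology.

```python
# Hypothetical synonym sets attached to ontology classes (illustrative terms only).
SYNONYM_SETS = {
    "mi:Miniature": {"miniature", "illumination"},
    "mi:Initial": {"initial", "decorated initial"},
    "mi:Drawing": {"drawing", "illumination"},  # "illumination" is deliberately ambiguous
}


def candidate_classes(term):
    """Return every ontology class whose synonym set contains the term."""
    needle = term.strip().lower()
    return sorted(cls for cls, terms in SYNONYM_SETS.items() if needle in terms)


def map_term(term):
    """Map a term automatically only when exactly one class matches it."""
    candidates = candidate_classes(term)
    if len(candidates) == 1:
        return candidates[0]
    return None  # unknown or polysemous term: the cataloguer has to decide
```

Synonyms resolve to a single class automatically, whereas a term found in more than one synonym set is handed back for a human decision, as described in the note.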
Semantic Transformation 3: The RDF Schema (RDFS)
The shared ontology for the textiles domain is created by using Resource Description Framework Schema (RDFS). An RDF Schema is a tool for indicating the classes of resources one wants to describe, as well as for defining the properties used to describe those resources. Furthermore, class/sub-class relationships and property/sub-property relationships can be defined. The museums are mapping their metadata to the classes and properties defined by the RDF Schema of the FMS initiative. Thereby they are making the meaning of the metadata explicit and representing them in a harmonised, uniform way.
RDF Schema
In section Semantic Transformation 1 we described the data model provided by RDF for expressing statements about Web resources. But we also need a vocabulary for the RDF statements, namely classes and properties defined with RDF Schema (RDFS).
In brief, the RDF Schema mechanism provides a pre-defined vocabulary, a basic type system that can be used in creating domain-specific schemas. Its role is to allow for declaring metadata properties (e.g. for 'type', 'subject' or 'creator'), to define the classes of resources they may be used with, to restrict possible combinations, and to detect violations of those restrictions.
Defining classes
With RDF Schema (RDFS), Web resources can be defined as instances of one or more classes. In addition, classes can be organised in a hierarchical fashion. As we hold a collection of digital images drawn from illustrated medieval manuscripts, we first need to define a class of things that are images. In RDF Schema, a class is any resource having an rdf:type property whose value is the RDFS-defined resource rdfs:Class.
So, using the basic RDF data model, we define:
mi:Image [resource] rdf:type [property] rdfs:Class [value]
The self-defined prefix mi (for medieval images) stands for the URI reference of our RDF Schema namespace, http://www.m-i.org/schemas/images
In our image collection we have various special kinds of digitised images, such as column miniatures, decorated initials, schematic drawings, etc. To distinguish, for example, the miniatures, we first need to define a general class Miniature and subclasses of miniatures, e.g. a subclass ColumnMiniature:
mi:Miniature rdf:type rdfs:Class
mi:ColumnMiniature rdf:type rdfs:Class
Secondly, we need to define that mi:ColumnMiniature is a subclass of mi:Miniature and that mi:Miniature is a subclass of mi:Image, for which we use the predefined rdfs:subClassOf property:
mi:Miniature rdfs:subClassOf mi:Image
mi:ColumnMiniature rdfs:subClassOf mi:Miniature
As the rdfs:subClassOf property is transitive, this means that mi:ColumnMiniature is also implicitly a subclass of mi:Image.
Graphic 2 on page 32 visualises this with the nodes and arcs of the basic RDF data model.
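The transitivity of rdfs:subClassOf can be illustrated with a short Python sketch. It assumes, for simplicity, that every class has at most one direct superclass (RDFS itself allows several); the dictionary layout and the function names are hypothetical.

```python
# Direct rdfs:subClassOf statements from the example schema (child -> parent).
# Assumption: at most one direct superclass per class.
SUBCLASS_OF = {
    "mi:Miniature": "mi:Image",
    "mi:ColumnMiniature": "mi:Miniature",
}


def superclasses(cls, subclass_of=SUBCLASS_OF):
    """Return all direct and inferred superclasses of cls, nearest first."""
    result = []
    seen = {cls}  # guards against accidental cycles in the schema
    current = subclass_of.get(cls)
    while current is not None and current not in seen:
        result.append(current)
        seen.add(current)
        current = subclass_of.get(current)
    return result


def is_subclass_of(cls, ancestor):
    """True if cls is a direct or implicit (transitive) subclass of ancestor."""
    return ancestor in superclasses(cls)
```

Although no statement links mi:ColumnMiniature directly to mi:Image, following the chain of direct statements recovers that implicit relationship.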
Defining properties
In order to make the meaning of our metadata (i.e. 'type') explicit, we need to be capable of declaring specific properties that characterise the classes of things we hold at http://www.m-i.org, e.g. digital images of medieval column miniatures.
Basically, RDF Schema defines properties in terms of the classes of resources to which they apply. This is the role of the rdfs:domain and rdfs:range mechanisms.
rdfs:range
The range constraint defines the class or set of classes whose instances can be values of a particular property. If we want to define the property mi:hasType, we must describe this resource (which we locate at http://www.m-i.org/schemas/images) with an rdf:type property whose value is rdf:Property:
mi:hasType [resource] rdf:type [property] rdf:Property [value]
The following RDF statements indicate that mi:ColumnMiniature is a class, that mi:hasType is a property, and that RDF statements using the mi:hasType property have instances of mi:ColumnMiniature as values:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:range mi:ColumnMiniature
rdfs:domain
The domain constraint restricts the set of classes whose instances may have a particular property attached to them. If we want to indicate that the property mi:hasType applies to instances of class mi:ColumnMiniature, we would write:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:domain mi:ColumnMiniature
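How a validator might use these two constraints can be sketched in Python. The sketch follows the constraint-checking reading used in this primer and mirrors the example schema, in which mi:hasType has mi:ColumnMiniature as both domain and range. The instance identifiers (ex:...), the data layout and the function name violations are hypothetical, and subclass inference is left out for brevity.

```python
# Property constraints taken from the example statements above.
SCHEMA = {
    "mi:hasType": {"domain": "mi:ColumnMiniature", "range": "mi:ColumnMiniature"},
}

# Hypothetical instance data: resource -> set of classes it is an instance of.
INSTANCE_TYPES = {
    "ex:img1": {"mi:ColumnMiniature"},
    "ex:img2": {"mi:ColumnMiniature"},
    "ex:initial7": {"mi:Initial"},
}


def violations(triple, instance_types=INSTANCE_TYPES, schema=SCHEMA):
    """Return a list of constraint violations for one (subject, property, value) triple."""
    subject, prop, value = triple
    constraints = schema.get(prop)
    if constraints is None:
        return [f"unknown property {prop}"]
    problems = []
    # Domain: the subject must be an instance of the declared domain class.
    if constraints["domain"] not in instance_types.get(subject, set()):
        problems.append(f"{subject} is not an instance of {constraints['domain']}")
    # Range: the value must be an instance of the declared range class.
    if constraints["range"] not in instance_types.get(value, set()):
        problems.append(f"{value} is not an instance of {constraints['range']}")
    return problems
```

An empty list means the statement satisfies both constraints; any entry would be reported to the cataloguer before the RDF card is saved.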
Benefits of RDF
In a SearchWebServices.com definition of RDF, some benefits of RDF are mentioned:
| 'By providing a consistent framework, RDF will encourage the providing of metadata about Internet resources.
| Because RDF will include a standard syntax for describing and querying data, software that exploits metadata will be easier and faster to produce.
| The standard syntax and query capability will allow applications to exchange information more easily.
| Searchers will get more precise results from searching, based on metadata rather than on indexes derived from full text gathering.
| Intelligent software agents will have more precise data to work with.'13
This is a well-crafted listing of RDF benefits, from provision and exchange of better metadata to agents working with them, hopefully for the benefit of humans. But, as explicitly stated by SearchWebServices.com, these are only potential benefits, i.e. they depend on the level of actual uptake of RDF.
13 whatis.com / searchWebServices.com: Definitions - Resource Description Framework, http://searchwebservices.techtarget.com/sDefinition/0,,sid26_gci213545,00.html (last updated July 27, 2001)
Generating and Using the Knowledge Space
The RDF cards represent the original XML documents at the semantic level. The union of such RDF cards constitutes a knowledge base, which is a harmonised semantic representation of the underlying heterogeneous databases.
However, so far the RDF instance descriptions have not left the museum. The museum has complete control of the information it wants to publish, and it does not need to allow the FMS system access to its internal database system. The RDF data are placed in a public directory on the museum's WWW server.
The Web crawler of the FMS system harvests the instance descriptions from the different museums, and the system combines them into an RDF repository. This repository is a large semantic graph that consists of the shared ontology and metadata.
How does a user now search and navigate in this knowledge space? In the FMS system this is implemented by server-side software called Ontogator. Based on the semantic graph, this software dynamically generates semantic linkages for the user's Web browser.
One way of using the FMS system is view-based filtering. The user can select classes of resources from the ontology, and the system finds the instances that match the selected class restrictions. By constraining classes (views) further, the collection instance data searched for are eventually found.
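The idea of combining RDF cards into one repository and filtering it by class can be sketched in Python. This is not the Ontogator algorithm, only a minimal illustration: the triples, the museum prefixes (a:, b:), the class mi:Text and all function names are assumptions made for the example.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"

# Shared ontology and two hypothetical RDF cards harvested from two museums.
ONTOLOGY = {
    ("mi:Miniature", SUBCLASS_OF, "mi:Image"),
    ("mi:ColumnMiniature", SUBCLASS_OF, "mi:Miniature"),
}
CARD_MUSEUM_A = {("a:img1", RDF_TYPE, "mi:ColumnMiniature")}
CARD_MUSEUM_B = {("b:img9", RDF_TYPE, "mi:Miniature"), ("b:doc2", RDF_TYPE, "mi:Text")}


def build_repository(ontology, *cards):
    """The repository is the union of the ontology and all harvested RDF cards."""
    repo = set(ontology)
    for card in cards:
        repo |= card
    return repo


def subclasses_of(repo, cls):
    """Return cls plus every class that is directly or transitively a subclass of it."""
    found = {cls}
    changed = True
    while changed:  # repeat until no new subclass is discovered
        changed = False
        for s, p, o in repo:
            if p == SUBCLASS_OF and o in found and s not in found:
                found.add(s)
                changed = True
    return found


def filter_by_view(repo, cls):
    """Return the instances whose class is cls or one of its subclasses."""
    classes = subclasses_of(repo, cls)
    return sorted(s for s, p, o in repo if p == RDF_TYPE and o in classes)


REPOSITORY = build_repository(ONTOLOGY, CARD_MUSEUM_A, CARD_MUSEUM_B)
```

Selecting the broad view mi:Image returns instances from both museums, and narrowing the view to mi:ColumnMiniature reduces the result set, which is the constraining step described above.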
The software also supports topic-based navigation by providing semantic links between topics of interest, the creation of which is based on the collection domain ontology and the related metadata of the collection records. This means that the links also provide the user with an impression of the wider context and pragmatics of the objects in the museums' collections.
From human users to software agents
As described in the Finnish Museums on the Semantic Web example, the RDF repository is a large semantic graph that consists of the shared ontology and metadata of the participating museums. Such a repository can be queried, and the results, a set of pointers to the relevant resources, can be accessed using Web browsers. The opportunities provided by a system like the one developed by the FMS initiative (e.g. topic-based navigation) are at present restricted primarily to human users.
The Semantic Web vision includes intelligent software agents which 'understand' semantic relationships between Web resources and seek relevant information, as well as perform transactions for humans.14
This software would be capable of autonomous action, i.e. could run without direct human control or constant supervision, and ideally is very flexible in doing this. Characterisations of this flexibility include actions that are 'reactive', 'proactive' and 'social' (see below).
While the basic idea of agents is very intuitive and appealing, the actual theory is complex, the tools are immature, and the solutions small and prototype-based. In fact, as a parallel, distributed systems technology, agents belong to the most complex class of software technology.
However, this primer will conclude with a summary of what an intelligent software agent is and what such software would generally be capable of doing. This should also serve as an indication of how great the challenge for research and technological development is to make the full Semantic Web vision a reality.
Intelligent Software Agents
The following definitions are taken from Michael Wooldridge's introduction to multiagent systems.15
Agent
'An agent is a computer system capable of autonomous action in some environment.'
Intelligent agent
'An intelligent agent is a computer system capable of flexible autonomous action in some environment.'
Flexible autonomous action
'By flexible autonomous action we mean reactive, proactive, social.'
| Reactivity: 'A reactive system is one that maintains an ongoing interaction with its environment, and responds to changes that occur in it (in time for the response to be useful).'
| Proactiveness: 'An agent serves a purpose and therefore exhibits goal-directed behaviour, including the capacity to recognise opportunities for useful courses of action.'
| Social ability: 'Social ability in agents is the ability to interact with other agents (and possibly humans) via some kind of agent communication language, and perhaps cooperate with others.'
Desirable further properties of agents are:
| Mobility: the ability to move around an electronic network.
| Rationality: an agent will act in such a way that it does not prevent itself from achieving its goals (as far as this is possible with a limited set of beliefs representing its world knowledge).
| Learning: an agent will improve its performance over time.
14 T. Berners-Lee, J. Hendler, O. Lassila, Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html
15 M. Wooldridge, An Introduction to Multiagent Systems, Chichester: Wiley, 2002, and http://www.csc.liv.ac.uk/~mjw/pubs/imas/
Resources
In the primer, no references are made to the documents of the World Wide Web Consortium (W3C). All relevant W3C recommendations can be found at http://www.w3c.org
Of the wealth of introductory materials on XML and RDF available on the Web, the following in particular are useful to consult for further details:
http://www.w3schools.com/xml
http://www.w3schools.com/schema
http://www.w3.org/TR/rdf-primer
K. S. Candan, H. Liu, R. Suvarna: Resource Description Framework: Metadata and its Applications. In: SIGKDD Explorations, Vol. 3/1 (2001), 6-19
Pierre-Antoine Champin: RDF Tutorial (2001), http://www710.univ-lyon1.fr/~champin/rdf-tutorial/rdf-tutorial.html
S. Decker, M. P. Mitra, S. Melnik: Framework for the Semantic Web: An RDF Tutorial, http://www.ida.liu.se/~asmpa/courses/sweb/rdf/rdf_tutorial.pdf
The Finnish Museums on the Semantic Web: Overview of the System's Set-up
[Graphic 3: Set-up of the Finnish Museums on the Semantic Web System. Labelled components: collection databases 1 to n (relational schemas, DBMS); XML repositories (XML Schema: syntactic interoperability); metadata editor; RDF instances (RDF Schema, ontology/RDFS: semantic interoperability); Web crawler; RDF database on the server with the Ontogator software; semantic graph (knowledge space of shared ontology and metadata); user client (WWW browser) with topic-based navigation and view-based filtering.]
THE DARMSTADT FORUM PARTICIPANTS

Wernher Behrendt, Salzburg Research, Austria
Wernher Behrendt is a Senior Researcher at the Sun Technology and Research Excellence Center at Salzburg Research (Austria), working on multimedia middleware and interoperation issues. He holds an MSc in Cognitive Science from Manchester University and has more than 10 years' experience in near-to-market IT research. From 1989 to 1995 he was a Senior Research Associate in the Informatics Department at Rutherford Appleton Laboratory (UK), working on embedded knowledge-based systems in distributed multimedia presentation systems. From 1995 to 1998 he was a Senior Research Associate at Cardiff University (UK), working on interoperation between heterogeneous information systems. Mr Behrendt has held courses in Computer Science and has worked on projects ranging from software engineering methods and quality assurance to legacy system reengineering using migration methods and distributed systems middleware.
E-mail: wernher.behrendt@salzburgresearch.at

Paolo Buonora, Archivio di Stato di Roma, Italy
Paolo Buonora holds a degree in Philosophy from the University of Rome 'La Sapienza' (1976). He worked in the Italian State Archive Administration from 1978, where he was first involved in editing the Guida Generale degli Archivi di Stato italiani. From 1986 he worked in the Soprintendenza archivistica per il Lazio, surveying audiovisual archives and municipal archives; from 1989 to 1991 he was at the Perugia University, engaged in doctoral research in 'Urban and rural history'; and from 1991 to 1994 he was again in the Soprintendenza archivistica per il Lazio. After 1994 he worked in the Archivio di Stato di Roma, where he was responsible for the photograph service and several working groups on informatics application in archival documentation. From 1997 until the present time he has planned and directed the Imago II project in the Archivio di Stato di Roma.
See httpwwwasrmarchivbeniculturaliitsidimagoIMAGOIIenhtml

Samuel Cruz-Lara, University of Nancy 2 and LORIA, France
Samuel Cruz-Lara obtained a Master's degree in Computer Science in 1984 (University Henri Poincaré, Nancy 1) and a PhD degree in Computer Science in 1988 (National Polytechnic Institute of Lorraine). The central topic of his PhD thesis was the generation of integrated development environments by using attribute grammars.
He is currently Associate Professor at the University of Nancy 2 (Institute of Technology, Computer Science Department) and permanent Researcher at LORIA (Lorraine Laboratory for Research in Computer Science and its Applications - UMR 7503 - CNRS - INRIA - Universities of Nancy). He is a member of the 'Language and Dialogue' team and has conducted several research activities on distributed software architectures and textual linguistic resources management. He is currently working in the context of distributed architectures and multimedia resources management. Dr Cruz-Lara has participated in several projects, in particular CNRS/SILFIDE and MLIS-ELAN, and he is at present co-leader of the 'Digital Museum' project. This is a joint project between LORIA and the National Chi-Nan University. The 'Digital Museum' project is sponsored by the National Science Council of the Republic of China (Taiwan) and supported by INRIA (France).

Costis Dallas, Critical Publics SA, Greece
Costis Dallas is Chairman and Senior Researcher of Critical Publics (http://www.criticalpublics.com), a group of companies active in the field of strategic communications, creative design and technology. He is currently a Lecturer in the Department of Communication and Mass Media of Panteion University and has over 15 years of research and professional experience in hypermedia applications, human factor issues and cultural information systems. Dr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA, co-founder and Chair of the Multimedia Working Group of the International Council of Museums (CIDOC/ICOM), Head of Documentation and Systems of the Benaki Museum, General Director of the Foundation of the Hellenic World, Special Secretary of the Greek Ministry of Education in charge of libraries, archives and instructional technologies, and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: info@criticalpublics.com

Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (http://www.nl.adlibsoft.com), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that he joined Databasix, producer of the ADLIB software. In 1991 Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was later renamed ADLIB Information Systems.
ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include:
CIMI Dublin Core test bed - ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database, in addition to the 'standard' HTML/XML access.
SPECTRUM-XML - A project of the UK mda together with a team of software vendors to produce a schema for exchange of data which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software.
Internet Gelderse Musea (http://www.igem.nl) and Maritiem Digitaal (http://www.maritiemdigitaal.nl) - Two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.
E-mail: bert@nl.adlibsoft.com

Nicola Guarino, Italian National Research Council, Italy
Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991 and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS) and associate editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: Nicola.Guarino@ladseb.pd.cnr.it

Janneke van Kersen, Digital Heritage Association, The Netherlands
Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992 she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing, and more recently she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University.
Since October 1999 Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata, and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations, such as museums, archives, archaeological organisations, monuments and special collections of libraries, in a context-rich and XML-based environment. Interoperability from both a technical and an organisational perspective is therefore a major issue. The Association offers access to heritage information to the general public at http://www.cultuurwijzer.nl and customised access for the educational field at http://www.cultuurwijs.nl
E-mail: jannekevankersen@den.nl

Marco Meli, EDW International, Italy
Marco Meli is CEO and co-founder of EDW International (http://www.edw-international.com), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia, VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meli@edw.it

Paul Miller, UKOLN, UK
Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, http://www.ukoln.ac.uk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK's Further and Higher Education Funding Councils and Resource: the Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI) and the Metadata Working Group of the UK Government's Office of the e-Envoy.
E-mail: P.Miller@ukoln.ac.uk

Frank Nack, CWI, The Netherlands
Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on 'The Application of Video Semantics and Theme Representation for Automated Film Editing' at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling, and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: Frank.Nack@cwi.nl

Franco Niccolucci, Florence University, Italy
Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference.
See also http://www.geog.port.ac.uk/hist-bound/people/niccolucci.htm

Seamus Ross, HATII, University of Glasgow, UK
Dr Seamus Ross is Director of Glasgow University's Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Network) (IST-2001-32706), a European Union funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously, he was Assistant Secretary for Information Technology at the British Academy, and before that he worked for a company specialising in expert systems and software development, as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group's PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: S.Ross@hatii.arts.gla.ac.uk
Andrea Scotti Institute amp Museum for theHistory of Science ItalyAndrea Scotti graduated from the University ofBologna Department of Philosophy in 1983 From1985 to 1995 he carried out several research projectsinvolving the cataloguing of scientific manuscripts inIsrael (Hebrew University of Jerusalem) Czecho-slovakia (Karol University of Prague) Hungary(Szecheny National Library of Budapest) andGermany (Institut fuumlr Geschichte der Natur-wissenschaften Munich) His numerous researchactivities concentrate on software and programmingfor library databases
Since 1996 Scotti has been Director of theGeneral Digital Catalogue of the ScientificManuscripts located at the Central National Libraryin Florence developed in co-operation with theIstituto e Museo di Storia della Scienza the NationalCentral Library and under the auspices of the ItalianMinistry for Culture He is in charge of the workthe Institute and Museum of the History of Sciencecarries out for the MESMUSES project (010201-300703) funded by the European Commissionunder the Information Society Technologies (IST)ProgrammeThe project aims at designing andexperimenting with metaphors for organisingstructuring and presenting thescientific and technical know-ledge offered to the publicimplementing Semantic WebtechnologiesSee httpcwebinriafrProjectsMesmusesE-mailscottiangalileoimssfirenzeit
DIGICULT PROJECT INFORMATION

DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over the period of 30 months (03/2002-08/2004).

In order to encourage early take-up, DigiCULT produces seven Thematic Issues and three Technology Watch Reports, along with the newsletter DigiCULT.Info.

DigiCULT draws on the results of the strategic study ‘Technological Landscapes for Tomorrow’s Cultural Economy (DigiCULT)’, which was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications), in 2000 and completed in 2001.

Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at http://www.digicult.info

For further information on DigiCULT please contact the team of the project co-ordinator:

Mr Guntram Geser, guntram.geser@salzburgresearch.at
Phone: +43-(0)662-2288-303

Mr John Pereira, john.pereira@salzburgresearch.at
Phone: +43-(0)662-2288-247

Salzburg Research Forschungsgesellschaft
Jakob-Haringer-Str. 5/III
A-5020 Salzburg, Austria
Phone: +43-(0)662-2288-200
Fax: +43-(0)662-2288-222
http://www.salzburgresearch.at

Project Partner:
HATII - Humanities Advanced Technology and Information Institute, University of Glasgow
http://www.hatii.arts.gla.ac.uk
Contact: Mr Seamus Ross, s.ross@hatii.arts.gla.ac.uk

The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom

DigiCULT Thematic Issue 1 - Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum, held in Barcelona on May 6th, 2002, in the context of the DLM-Conference 2002.

DigiCULT Thematic Issue 2 - Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum, held in Essen, Germany, on September 3rd, 2002, in the context of the AIIM Conference @ DMS EXPO.

DigiCULT Thematic Issue 3 - Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum, held on January 21st, 2003, at Fraunhofer IPSI, Darmstadt, Germany.

DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum on Learning Objects, which will take place at the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, on July 2nd, 2003.
IMPRINT

This Thematic Issue is a product of the DigiCULT Project (IST-2001-34898).

Authors:
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy

Images:
Images for this Thematic Issue have been provided by, and are reproduced with kind permission of, the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, Netherlands.

Graphics & Layout:
Jan Steindl, Salzburg Research

ISBN 3-902448-00-8
Printed in Austria
© 2003
IMAGES

Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I
Image on p. 5, from Fol. 264r, size 80x75; illuminator: Orosius Master a.o.

Bible Historiale, Paris, c. 1320-1340, Volume I
Image on p. 7, from Fol. 2r, size 45x50; illum.: ‘Sub-Fauvel’ Master

Historic Bible, Utrecht, c. 1430, Volume I
Images on pp. 8, 10, 14, 28, from Fol. 3r, 3v, 4v, 7v, size 55x85 to 60x85; illum.: Alexander Master o.a.

Psalter Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468
Images on pp. 11, 13: Wednesday Matins, Invitatorium, from Fol. 219r

Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350
Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron), from Fol. 104rb1, 106rb2, 124r, 111ra, size 40x55 to 50x55

Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335
Image on p. 21, from Fol. 4va1, size 45x55

Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460
Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41

Psalter, Normandy, c. 1180
Image on p. 26, from Fol. 3v, size 160x125

Breviary, Cambray (?), c. 1275-1300
Images on pp. 27, 29, 31, 33, 34, 35, 37, from Fol. 211r, size 205x135

© Koninklijke Bibliotheek, The Hague. Used with permission.
drawn from illustrated medieval manuscripts, we first need to define a class of things that are images. In RDF Schema, a class is any resource having an rdf:type property whose value is the RDFS-defined resource rdfs:Class.

So, using the basic RDF data model, we define:
mi:Image [resource] rdf:type [property] rdfs:Class [value]
The self-defined prefix mi: (for medieval images) stands for the URI reference of our RDF Schema namespace, http://www.m-i.org/schemas/images

In our image collection we have various special kinds of digitised images, such as column miniatures, decorated initials, schematic drawings, etc. To distinguish, for example, the miniatures, we first need to define a general class Miniature and subclasses of miniatures, e.g. a subclass ColumnMiniature:
mi:Miniature rdf:type rdfs:Class
mi:ColumnMiniature rdf:type rdfs:Class

Secondly, we need to define that mi:ColumnMiniature is a subclass of mi:Miniature, and that mi:Miniature is a subclass of mi:Image, for which we use the predefined rdfs:subClassOf property:
mi:Miniature rdfs:subClassOf mi:Image
mi:ColumnMiniature rdfs:subClassOf mi:Miniature

As the rdfs:subClassOf property is transitive, this means that mi:ColumnMiniature is also implicitly a subclass of mi:Image.
Graphic 2 on page 32 visualises this with the nodesand arcs of the basic RDF data model
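The class hierarchy above can be tried out in a few lines of Python. The following is only a minimal sketch: it assumes the statements are held as plain (resource, property, value) tuples rather than in a real RDF toolkit, and the function name superclasses is a hypothetical helper introduced for illustration. It shows how the transitivity of rdfs:subClassOf makes mi:Image an implicit superclass of mi:ColumnMiniature.

```python
# The primer's class statements as (resource, property, value) tuples.
TRIPLES = {
    ("mi:Image", "rdf:type", "rdfs:Class"),
    ("mi:Miniature", "rdf:type", "rdfs:Class"),
    ("mi:ColumnMiniature", "rdf:type", "rdfs:Class"),
    ("mi:Miniature", "rdfs:subClassOf", "mi:Image"),
    ("mi:ColumnMiniature", "rdfs:subClassOf", "mi:Miniature"),
}


def superclasses(cls, triples):
    """Return every class that cls is a subclass of, directly or implicitly."""
    found = set()
    pending = [cls]
    while pending:
        current = pending.pop()
        for subject, prop, value in triples:
            is_sub = subject == current and prop == "rdfs:subClassOf"
            if is_sub and value not in found:
                found.add(value)
                pending.append(value)  # follow the chain upwards
    return found


if __name__ == "__main__":
    print(sorted(superclasses("mi:ColumnMiniature", TRIPLES)))
```

Following the subclass chain upwards, rather than storing the implicit statement, keeps the schema as small as the one written out in the primer.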
Defining properties
In order to make the meaning of our metadata (i.e. ‘type’) explicit, we need to be capable of declaring specific properties that characterise the classes of things we hold at http://www.m-i.org, e.g. digital images of medieval column miniatures.

Basically, RDF Schema defines properties in terms of the classes of resources to which they apply. This is the role of the rdfs:domain and rdfs:range mechanisms.

rdfs:range

The range constraint defines the class or set of classes whose instances can be values of a particular property. If we want to define the property mi:hasType, we must describe this resource (which we locate at http://www.m-i.org/schemas/images) with an rdf:type property whose value is rdf:Property:
mi:hasType [resource] rdf:type [property] rdf:Property [value]

The following RDF statements indicate that mi:ColumnMiniature is a class, mi:hasType is a property, and RDF statements using the mi:hasType property have instances of mi:ColumnMiniature as values:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:range mi:ColumnMiniature

rdfs:domain

The domain constraint restricts the set of classes whose instances may have a particular property attached to them. If we want to indicate that the property mi:hasType applies to instances of class mi:ColumnMiniature, we would write:
mi:ColumnMiniature rdf:type rdfs:Class
mi:hasType rdf:type rdf:Property
mi:hasType rdfs:domain mi:ColumnMiniature
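To make the two mechanisms concrete, the following Python sketch checks a single statement against the declared domain and range. It is an illustration under stated assumptions: the instance resources ex:img1 and ex:img2 and the helper names declared, types_of and satisfies are hypothetical, subclass reasoning is left out, and domain and range are treated as checks, following the wording of this primer (RDF Schema itself describes them as a basis for inferring types).

```python
# Schema statements from the primer, as (resource, property, value) tuples.
SCHEMA = {
    ("mi:ColumnMiniature", "rdf:type", "rdfs:Class"),
    ("mi:hasType", "rdf:type", "rdf:Property"),
    ("mi:hasType", "rdfs:domain", "mi:ColumnMiniature"),
    ("mi:hasType", "rdfs:range", "mi:ColumnMiniature"),
}

# Hypothetical instance data: two resources typed as column miniatures.
INSTANCES = {
    ("ex:img1", "rdf:type", "mi:ColumnMiniature"),
    ("ex:img2", "rdf:type", "mi:ColumnMiniature"),
}


def declared(prop, constraint, schema):
    """Classes named by rdfs:domain or rdfs:range for a property."""
    return {v for s, p, v in schema if s == prop and p == constraint}


def types_of(resource, instances):
    """Classes a resource is explicitly typed with."""
    return {v for s, p, v in instances if s == resource and p == "rdf:type"}


def satisfies(statement, schema, instances):
    """True if subject and value fit the declared domain and range."""
    subject, prop, value = statement
    domain = declared(prop, "rdfs:domain", schema)
    range_ = declared(prop, "rdfs:range", schema)
    domain_ok = not domain or bool(domain & types_of(subject, instances))
    range_ok = not range_ or bool(range_ & types_of(value, instances))
    return domain_ok and range_ok


if __name__ == "__main__":
    print(satisfies(("ex:img1", "mi:hasType", "ex:img2"), SCHEMA, INSTANCES))
    print(satisfies(("ex:img1", "mi:hasType", "ex:other"), SCHEMA, INSTANCES))
```

A property with no declared domain or range passes the check, which mirrors the fact that the schema places no restriction on it.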
Benefits of RDF

In a SearchWebServices.com definition of RDF, some benefits of RDF are mentioned:
| ‘By providing a consistent framework, RDF will encourage the providing of metadata about Internet resources.
| Because RDF will include a standard syntax for describing and querying data, software that exploits metadata will be easier and faster to produce.
| The standard syntax and query capability will allow applications to exchange information more easily.
| Searchers will get more precise results from searching, based on metadata rather than on indexes derived from full text gathering.
| Intelligent software agents will have more precise data to work with.’13

This is a well-crafted listing of RDF benefits, from provision and exchange of better metadata to agents working with them, hopefully for the benefit of humans. But, as explicitly stated by SearchWebServices.com, these are only potential benefits, i.e. they depend on the level of actual uptake of RDF.

13 whatis.com / searchWebServices.com: Definitions - Resource Description Framework, http://searchwebservices.techtarget.com/sDefinition/0,,sid26_gci213545,00.html (last updated July 27, 2001)
Generating and Using the Knowledge Space

The RDF cards represent the original XML documents at the semantic level. The union of such RDF cards constitutes a knowledge base, which is a harmonised semantic representation of the underlying heterogeneous databases.

However, so far the RDF instance descriptions have not left the museum. The museum has complete control of the information it wants to publish, and it does not need to allow the FMS system access to its internal database system. The RDF data are placed in a public directory on the museum’s WWW server.

The Web crawler of the FMS system harvests the instance descriptions from the different museums, and the system combines them into an RDF repository. This repository is a large semantic graph that consists of the shared ontology and metadata.

How does a user now search and navigate in this knowledge space? In the FMS system this is implemented by server-side software called Ontogator. Based on the semantic graph, this software dynamically generates semantic linkages for the user’s Web browser.

One way of using the FMS system is view-based filtering. The user can select classes of resources from the ontology, and the system finds the instances that match the selected class restrictions. By constraining classes (views) further, the collection instance data searched for are eventually found.

The software also supports topic-based navigation by providing semantic links between topics of interest, the creation of which is based on the collection domain ontology and the related metadata of the collection records. This means that the links also provide the user with an impression of the wider context and pragmatics of the objects in the museums’ collections.
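The combination of harvested RDF cards into one repository, and view-based filtering over it, can be sketched in Python as follows. This is not the Ontogator implementation, whose internals are not described here; it is a toy model under assumptions. The ontology classes (ex:Furniture, ex:Chair, ex:Table), the property ex:madeIn, the museum data and the function names build_repository, subclasses and view_filter are all hypothetical.

```python
# Hypothetical shared ontology and harvested RDF cards from two museums.
ONTOLOGY = {
    ("ex:Chair", "rdfs:subClassOf", "ex:Furniture"),
    ("ex:Table", "rdfs:subClassOf", "ex:Furniture"),
}
MUSEUM_A = {
    ("a:item1", "rdf:type", "ex:Chair"),
    ("a:item1", "ex:madeIn", "ex:Helsinki"),
}
MUSEUM_B = {
    ("b:item7", "rdf:type", "ex:Table"),
    ("b:item7", "ex:madeIn", "ex:Turku"),
    ("b:item9", "rdf:type", "ex:Chair"),
    ("b:item9", "ex:madeIn", "ex:Turku"),
}


def build_repository(ontology, *cards):
    """Union of the shared ontology and all harvested instance data."""
    repository = set(ontology)
    for card in cards:
        repository |= card
    return repository


def subclasses(cls, repository):
    """The class itself plus all its direct and indirect subclasses."""
    result = {cls}
    changed = True
    while changed:
        changed = False
        for s, p, o in repository:
            if p == "rdfs:subClassOf" and o in result and s not in result:
                result.add(s)
                changed = True
    return result


def view_filter(repository, wanted_class, restrictions=None):
    """Instances of wanted_class that also match every property restriction."""
    classes = subclasses(wanted_class, repository)
    hits = {s for s, p, o in repository if p == "rdf:type" and o in classes}
    for prop, value in (restrictions or {}).items():
        hits = {s for s in hits if (s, prop, value) in repository}
    return hits


if __name__ == "__main__":
    repo = build_repository(ONTOLOGY, MUSEUM_A, MUSEUM_B)
    print(sorted(view_filter(repo, "ex:Furniture")))
    print(sorted(view_filter(repo, "ex:Chair", {"ex:madeIn": "ex:Turku"})))
```

Each added restriction narrows the result set, which corresponds to constraining the views further until the collection records searched for are found.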
From human users to software agents

As described in the Finnish Museums on the Semantic Web example, the RDF repository is a large semantic graph that consists of the shared ontology and metadata of the participating museums. Such a repository can be queried, and the results, a set of pointers to the relevant resources, can be accessed using Web browsers. The opportunities provided by a system like the one developed by the FMS initiative (e.g. topic-based navigation) are at present restricted primarily to human users.

The Semantic Web vision includes intelligent software agents which ‘understand’ semantic relationships between Web resources and seek relevant information as well as perform transactions for humans.14

This software would be capable of autonomous action, i.e. could run without direct human control or constant supervision, and ideally is very flexible in doing this. Characterisations of this flexibility include actions that are ‘reactive’, ‘proactive’ and ‘social’ (see below).

While the basic idea of agents is very intuitive and appealing, the actual theory is complex, the tools are immature, and the solutions small and prototype-based. In fact, as a parallel, distributed systems technology, agents belong to the most complex class of software technology.

However, this primer will conclude with a summary of what an intelligent software agent is and what such software would generally be capable of doing. This should also serve as an indication of how great the challenge for research and technological development is to make the full Semantic Web vision a reality.
Intelligent Software Agents
The following definitions are taken from Michael Wooldridge’s introduction to multiagent systems.15

Agent

‘An agent is a computer system capable of autonomous action in some environment.’

Intelligent agent

‘An intelligent agent is a computer system capable of flexible autonomous action in some environment.’

Flexible autonomous action

‘By flexible autonomous action we mean reactive, proactive, social.’
| Reactivity: ‘A reactive system is one that maintains an ongoing interaction with its environment, and responds to changes that occur in it (in time for the response to be useful).’
| Proactiveness: ‘An agent serves a purpose and therefore exhibits goal-directed behaviour, including the capacity to recognise opportunities for useful courses of action.’
| Social ability: ‘Social ability in agents is the ability to interact with other agents (and possibly humans) via some kind of agent communication language, and perhaps cooperate with others.’

Desirable further properties of agents are:
| Mobility: the ability to move around an electronic network.
| Rationality: an agent will act in such a way that it does not prevent itself from achieving its goals (as far as this is possible with a limited set of beliefs representing its world knowledge).
| Learning: an agent will improve its performance over time.
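The three characteristics of flexible autonomous action can be made tangible with a deliberately small Python sketch. It is a toy illustration only and says nothing about how real agent platforms or agent communication languages work. The class SketchAgent, its methods and the resource identifiers are hypothetical.

```python
class SketchAgent:
    """Toy agent whose goal is to find resources of one class."""

    def __init__(self, name, goal_class):
        self.name = name
        self.goal_class = goal_class
        self.found = []   # resources matching the goal so far
        self.inbox = []   # messages received from other agents

    def perceive(self, event):
        """Reactivity: respond to a change in the environment."""
        uri, resource_class = event
        if resource_class == self.goal_class and uri not in self.found:
            self.found.append(uri)

    def next_action(self):
        """Proactiveness: choose the action that serves the goal."""
        return "report" if self.found else "search"

    def tell(self, other):
        """Social ability: pass findings on to another agent."""
        other.inbox.append((self.name, list(self.found)))


if __name__ == "__main__":
    seeker = SketchAgent("seeker", "mi:ColumnMiniature")
    curator = SketchAgent("curator", "mi:Image")
    seeker.perceive(("ex:img1", "mi:ColumnMiniature"))
    seeker.perceive(("ex:doc1", "ex:Text"))
    seeker.tell(curator)
    print(seeker.next_action(), curator.inbox)
```

Mobility, rationality and learning are left out on purpose: each would need considerably more machinery than such a sketch can carry.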
14 T. Berners-Lee, J. Hendler, O. Lassila: Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html

15 M. Wooldridge: An Introduction to Multiagent Systems. Chichester: Wiley, 2002, and http://www.csc.liv.ac.uk/~mjw/pubs/imas/
Resources

In the primer no references are made to the documents of the World Wide Web Consortium (W3C). All relevant W3C recommendations can be found at http://www.w3.org

Of the wealth of introductory materials on XML and RDF available on the Web, the following in particular are useful to consult for further details:

http://www.w3schools.com/xml/
http://www.w3schools.com/schema/
http://www.w3.org/TR/rdf-primer/

K. S. Candan, H. Liu, R. Suvarna: Resource Description Framework: Metadata and its Applications. In: SIGKDD Explorations, Vol. 3, No. 1 (2001), 6-19.

Pierre-Antoine Champin: RDF Tutorial (2001), http://www710.univ-lyon1.fr/~champin/rdf-tutorial/rdf-tutorial.html

S. Decker, M. P. Mitra, S. Melnik: Framework for the Semantic Web: An RDF Tutorial, http://www.ida.liu.se/~asmpa/courses/sweb/rdf/rdf_tutorial.pdf
The Finnish Museums on the Semantic Web: Overview of the System’s Set-up

[Graphic 3: Set-up of the Finnish Museums on the Semantic Web System. Diagram labels: Collection databases 1 to n (Relational Schemas, DBMS); XML repositories (XML Schema: syntactic interoperability); Metadata Editor; RDF instances (RDF Schema: semantic interoperability; Ontology/RDFS); Web crawler; RDF database (Semantic graph: knowledge space of shared ontology and metadata); Server: Ontogator software; User client: WWW browser (topic-based navigation, view-based filtering).]
THE DARMSTADT FORUM PARTICIPANTS

Wernher Behrendt, Salzburg Research, Austria
Wernher Behrendt is a Senior Researcher at the Sun Technology and Research Excellence Center at Salzburg Research (Austria), working on multimedia middleware and interoperation issues. He holds an MSc in Cognitive Science from Manchester University and has more than 10 years’ experience in near-to-market IT research. From 1989 to 1995 he was a Senior Research Associate in the Informatics Department at Rutherford Appleton Laboratory (UK), working on embedded knowledge-based systems in distributed multimedia presentation systems. From 1995 to 1998 he was a Senior Research Associate at Cardiff University (UK), working on interoperation between heterogeneous information systems. Mr Behrendt has held courses in Computer Science and has worked on projects ranging from software engineering methods and quality assurance to legacy system reengineering using migration methods and distributed systems middleware.
E-mail: wernher.behrendt@salzburgresearch.at

Paolo Buonora, Archivio di Stato di Roma, Italy
Paolo Buonora holds a degree in Philosophy from the University of Rome ‘La Sapienza’ (1976). He worked in the Italian State Archive Administration from 1978, where he was first involved in editing the Guida Generale degli Archivi di Stato italiani. From 1986 he worked in the Soprintendenza archivistica per il Lazio, surveying audiovisual archives and municipal archives; from 1989 to 1991 at the Perugia University, engaged in doctoral research in ‘Urban and rural history’; and from 1991 to 1994 again in the Soprintendenza archivistica per il Lazio. After 1994 he worked in the Archivio di Stato di Roma, where he was responsible for the photograph service and several working groups on informatics application in archival documentation. From 1997 until the present time he has planned and directed the Imago II project in the Archivio di Stato di Roma.
See httpwwwasrmarchivbeniculturaliitsidimagoIMAGOIIenhtml

Samuel Cruz-Lara, University of Nancy 2 and LORIA, France
Samuel Cruz-Lara obtained a Master’s degree in Computer Science in 1984 (University Henri Poincaré, Nancy 1) and a PhD degree in Computer Science in 1988 (National Polytechnic Institute of Lorraine). The central topic of his PhD thesis was the generation of integrated development environments by using attribute grammars.
He is currently Associate Professor at the University of Nancy 2 (Institute of Technology, Computer Science Department) and permanent Researcher at LORIA (Lorraine Laboratory for Research in Computer Science and its Applications - UMR 7503 - CNRS - INRIA - Universities of Nancy). He is a member of the ‘Language and Dialogue’ team and has conducted several research activities on distributed software architectures and textual linguistic resources management. He is currently working in the context of distributed architectures and multimedia resources management. Dr Cruz-Lara has participated in several projects, in particular CNRS/SILFIDE and MLIS-ELAN, and he is at present co-leader of the ‘Digital Museum’ project. This is a joint project between LORIA and the National Chi-Nan University. The ‘Digital Museum’ project is sponsored by the National Science Council of the Republic of China (Taiwan) and supported by INRIA (France).

Costis Dallas, Critical Publics SA, Greece
Costis Dallas is Chairman and Senior Researcher of Critical Publics (http://www.criticalpublics.com), a group of companies active in the field of strategic communications, creative design and technology. He is currently a Lecturer in the Department of Communication and Mass Media of Panteion University, and has over 15 years of research and professional experience in hypermedia applications, human factor issues and cultural information systems. Dr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA, co-founder and Chair of the Multimedia Working Group of the International Council of Museums (CIDOC/ICOM), Head of Documentation and Systems of the Benaki Museum, General Director of the Foundation of the Hellenic World, Special Secretary of the Greek Ministry of Education in charge of libraries, archives and instructional technologies, and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: info@criticalpublics.com
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (http://www.nl.adlibsoft.com), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that he joined Databasix, producer of the ADLIB software. In 1991 Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was renamed later as ADLIB Information Systems.
ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include:
CIMI Dublin Core test bed - ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database, in addition to the ‘standard’ HTML/XML access.
SPECTRUM-XML - A project of the UK mda, together with a team of software vendors, to produce a schema for exchange of data which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software.
Internet Gelderse Musea (http://www.igem.nl) and Maritiem Digitaal (http://www.maritiemdigitaal.nl) - Two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.
E-mail: bert@nl.adlibsoft.com
Nicola Guarino, Italian National Research Council, Italy
Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991, and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS) and associate editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: Nicola.Guarino@ladseb.pd.cnr.it
Janneke van Kersen, Digital Heritage Association, The Netherlands
Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992, she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing, and more recently she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University.
Since October 1999, Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations, such as museums, archives, archaeological organisations, monuments and special collections of libraries, in a context-rich and XML-based environment. Interoperability, from both a technical and an organisational perspective, is therefore a major issue. The Association offers access to heritage information to the general public at http://www.cultuurwijzer.nl and customised access for the educational field at http://www.cultuurwijs.nl
E-mail: janneke.vankersen@den.nl
Marco Meli, EDW International, Italy
Marco Meli is CEO and co-founder of EDW International (http://www.edw-international.com), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia and VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meli@edw.it
Paul Miller, UKOLN, UK
Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, http://www.ukoln.ac.uk/). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK’s Further and Higher Education Funding Councils and Resource: the Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI) and the Metadata Working Group of the UK Government’s Office of the e-Envoy.
E-mail: P.Miller@ukoln.ac.uk
Frank Nack, CWI, The Netherlands
Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on ‘The Application of Video Semantics and Theme Representation for Automated Film Editing’ at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling, and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document, and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: Frank.Nack@cwi.nl
Franco Niccolucci, Florence University, Italy
Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference.
See also http://www.geog.port.ac.uk/hist-bound/people/niccolucci.htm
Seamus Ross HATIIUniversity of Glasgow UKDr Seamus Ross is Director of Glasgow UniversityrsquosHumanities Advanced Technology and InformationInstitute (HATII) He is also Director of ERPANET(Electronic Resource Preservation and Network)(IST-2001-32706) a European Union fundedaccompanying measure to enhance the preservationof cultural heritage and scientific digital objectsPreviously he was Assistant Secretary for InformationTechnology at the British Academy and before thatworked for a company specialising in expert systemsand software development as a software engineer andthen in management He researches lectures andpublishes widely on information technology anddigital preservation Dr Ross acts as ICT advisor tothe Heritage Lottery Fund and is a monitor for anumber of large ICT-based projects in the UK He isa member of a number of international organisationsincluding the DLM-Monitoring Committee of the
European Commission the Research LibrariesGrouprsquos PRESERV Working Group on PreservationIssues of Metadata and InterPARES (as well as Co-Chair of its European Team)E-mail SRosshatiiartsglaacuk
Andrea Scotti Institute amp Museum for theHistory of Science ItalyAndrea Scotti graduated from the University ofBologna Department of Philosophy in 1983 From1985 to 1995 he carried out several research projectsinvolving the cataloguing of scientific manuscripts inIsrael (Hebrew University of Jerusalem) Czecho-slovakia (Karol University of Prague) Hungary(Szecheny National Library of Budapest) andGermany (Institut fuumlr Geschichte der Natur-wissenschaften Munich) His numerous researchactivities concentrate on software and programmingfor library databases
Since 1996 Scotti has been Director of theGeneral Digital Catalogue of the ScientificManuscripts located at the Central National Libraryin Florence developed in co-operation with theIstituto e Museo di Storia della Scienza the NationalCentral Library and under the auspices of the ItalianMinistry for Culture He is in charge of the workthe Institute and Museum of the History of Sciencecarries out for the MESMUSES project (010201-300703) funded by the European Commissionunder the Information Society Technologies (IST)ProgrammeThe project aims at designing andexperimenting with metaphors for organisingstructuring and presenting thescientific and technical know-ledge offered to the publicimplementing Semantic WebtechnologiesSee httpcwebinriafrProjectsMesmusesE-mailscottiangalileoimssfirenzeit
DigiCULT 41
42 DigiCULT
DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch thatmonitors and analyses technological developmentsrelevant to and in the cultural and scientific heritagesector over the period of 30 months (032002-082004)
In order to encourage early take up DigiCULTproduces seven Thematic Issues three TechnologyWatch Reports along with the newsletterDigiCULTInfo
DigiCULT draws on the results of the strategicstudy lsquoTechnological Landscapes for TomorrowrsquosCultural Economy (DigiCULT)rsquo that was initiatedby the European Commission DG InformationSociety (Unit D2 Cultural Heritage Applications)in 2000 and completed in 2001
Copies of the DigiCULT Full Report andExecutive Summary can be downloaded or orderedat httpwwwdigicultinfo
For further information on DigiCULT pleasecontact the team of the project co-ordinator
Mr Guntram GeserguntramgesersalzburgresearchatPhone +43-(0)662-2288-303
Mr John Pereira johnpereirasalzburgresearchatPhone +43-(0)662-2288-247
Salzburg Research ForschungsgesellschaftJakob-Haringer-Str 5IIIA - 5020 Salzburg Austria Phone +43-(0)662-2288-200Fax +43-(0)662-2288-222httpwwwsalzburgresearchat
Project PartnerHATII - Humanities Advanced Technology andInformation InstituteUniversity of Glasgow httpwwwhatiiartsglaacukContact Mr Seamus Ross srosshatiiartsglaacuk
The members of the Steering Committeeof DigiCULT arePhilippe Avenier Ministegravere de la culture et de lacommunication FrancePaolo BuonoraArchivio di Stato di Roma ItalyCostis Dallas Critical Publics SA GreeceBert Degenhart-DrenthADLIB Information SystemsBVThe NetherlandsPaul Fiander BBC Information amp ArchivesUnited KingdomPeter Holm Lindgaard Library Manager DenmarkErich J Neuhold Fraunhofer IPSI GermanyBruce Royan Concurrent ComputingUnited Kingdom
DIGICULT PROJECT INFORMATION
DigiCULT Thematic Issue 1 - Integrity andAuthenticity of Digital Cultural Heritage Objectsbuilds on the first DigiCULT Forum held inBarcelona on May 6th 2002 in the context of theDLM-Conference 2002
DigiCULT Thematic Issue 2 ndash Digital AssetManagement Systems for the Cultural and ScientificHeritage Sector builds on the second DigiCULTForum held in Essen Germany on September 3rd2002 in the context of the AIIM Conference DMS EXPO
DigiCULT Thematic Issue 3 - Towards a SemanticWeb for Heritage Resources builds on the thirdDigiCULT Forum held on January 21st 2003 atFraunhofer IPSI Darmstadt Germany
DigiCULT Thematic Issue 4 will follow thefourth DigiCULT Forum on Learning Objectsthat will take place at the Koninklijke Bibliotheek -National Library of the NetherlandsThe Hagueon July 2nd 2003
IMPRINT
This Thematic Issue is a product of the DigiCULT Project (IST-2001-34898).
Authors:
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy
Images: Images for this Thematic Issue have been provided by and are reproduced with kind permission of the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, Netherlands.
Graphics & Layout: Jan Steindl, Salzburg Research
ISBN 3-902448-00-8. Printed in Austria. © 2003
IMAGES
Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I. Image on p. 5 from Fol. 264r, size 80x75; illuminator: Orosius Master a.o.
Bible Historiale, Paris, c. 1320-1340, Volume I. Image on p. 7 from Fol. 2r, size 45x50; illum.: ‘Sub-Fauvel’ Master
Historic Bible, Utrecht, c. 1430, Volume I. Images on pp. 8, 10, 14, 28 from Fol. 3r, 3v, 4v, 7v, size 55x85 to 60x85; illum.: Alexander Master o.a.
Psalter, Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468. Images on pp. 11, 13: Wednesday Matins, Invitatorium, from Fol. 219r
Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350. Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron) from Fol. 104rb1, 106rb2, 124r, 111ra, size 40x55 to 50x55
Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335. Image on p. 21 from Fol. 4va1, size 45x55
Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460. Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41
Psalter, Normandy, c. 1180. Image on p. 26 from Fol. 3v, size 160x125
Breviary, Cambray(?), c. 1275-1300. Images on pp. 27, 29, 31, 33, 34, 35, 37 from Fol. 211r, size 205x135
© Koninklijke Bibliotheek, The Hague. Used with permission.
Generating and Using the Knowledge Space
The RDF cards represent the original XML documents at the semantic level. The union of such RDF cards constitutes a knowledge base, which is a harmonised semantic representation of the underlying heterogeneous databases.
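To make the idea of an RDF card more concrete, the following is a minimal sketch in Python, not the FMS implementation. It assumes a hypothetical XML record layout, a hypothetical `fms:` vocabulary and an `example.org` base address, and it represents a card simply as a set of (subject, predicate, object) triples, so that the knowledge base is literally the union of the cards.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML export of one collection record (element names are assumptions).
RECORD_XML = """
<record id="m1-0042">
  <type>Chair</type>
  <material>Birch</material>
  <place>Helsinki</place>
</record>
"""

# Hypothetical mapping from XML element names to ontology properties.
PROPERTY_MAP = {
    "type": "rdf:type",
    "material": "fms:madeOf",
    "place": "fms:placeOfManufacture",
}


def xml_to_card(xml_text, base="http://example.org/museum1/"):
    """Turn one XML record into a card: a set of (subject, predicate, object) triples."""
    root = ET.fromstring(xml_text)
    subject = base + root.attrib["id"]
    card = set()
    for child in root:
        predicate = PROPERTY_MAP.get(child.tag)
        if predicate is not None and child.text:
            # Values are mapped to ontology terms instead of staying free-text strings.
            card.add((subject, predicate, "fms:" + child.text.strip()))
    return card


def knowledge_base(cards):
    """The knowledge base is the union of all cards."""
    kb = set()
    for card in cards:
        kb |= card
    return kb


card = xml_to_card(RECORD_XML)
kb = knowledge_base([card])
```

Because every museum maps its own element names onto the same shared properties, records from differently structured databases end up in one comparable form.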
However, so far the RDF instance descriptions have not left the museum. The museum has complete control of the information it wants to publish, and it does not need to allow the FMS system access to its internal database system. The RDF data are placed in a public directory on the museum’s WWW server.
The Web crawler of the FMS system harvests the instance descriptions from the different museums, and the system combines them into an RDF repository. This repository is a large semantic graph that consists of the shared ontology and metadata.
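The harvesting step can be sketched as follows. This is an illustration under stated assumptions, not the FMS crawler: the museum addresses, the `fms:` class names and the `fetch` function are hypothetical, and the Web is replaced by an in-memory dictionary so that the example runs without network access.

```python
# Hypothetical stand-in for the Web: public address -> triples published there.
PUBLISHED = {
    "http://museum-a.example/rdf/": {("a:1", "rdf:type", "fms:Chair")},
    "http://museum-b.example/rdf/": {("b:7", "rdf:type", "fms:Table")},
}

# Hypothetical fragment of the shared ontology.
SHARED_ONTOLOGY = {
    ("fms:Chair", "rdfs:subClassOf", "fms:Furniture"),
    ("fms:Table", "rdfs:subClassOf", "fms:Furniture"),
}


def fetch(url):
    """Stand-in for an HTTP request; a real crawler would download and parse RDF here."""
    return PUBLISHED.get(url, set())


def harvest(urls, ontology):
    """Combine the shared ontology and all harvested instance data into one graph."""
    repository = set(ontology)
    sources = {}
    for url in urls:
        triples = fetch(url)
        repository |= triples
        for triple in triples:
            sources[triple] = url  # remember which museum published each statement
    return repository, sources


repository, sources = harvest(sorted(PUBLISHED), SHARED_ONTOLOGY)
```

Keeping the source of each statement is a design choice of this sketch; it reflects the point that the museums remain the publishers of their own data.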
How does a user now search and navigate in this knowledge space? In the FMS system, this is implemented by server-side software called Ontogator. Based on the semantic graph, this software dynamically generates semantic linkages for the user’s Web browser.
One way of using the FMS system is view-based filtering. The user can select classes of resources from the ontology, and the system finds the instances that match the selected class restrictions. By constraining classes (views) further, the collection instance data searched for are eventually found.
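View-based filtering can be illustrated with a small sketch. The class hierarchy, the two views (`type` and `place`) and the object identifiers below are invented for the example and are not taken from the FMS ontology; the point is only that selecting a class in a view also matches instances of its subclasses, and that each further selection narrows the result.

```python
# Hypothetical class hierarchy: child -> parent.
SUBCLASS_OF = {
    "Chair": "Furniture",
    "Table": "Furniture",
    "Helsinki": "Finland",
    "Turku": "Finland",
}

# Hypothetical instance data: object id -> value per view.
INSTANCES = {
    "obj1": {"type": "Chair", "place": "Helsinki"},
    "obj2": {"type": "Table", "place": "Turku"},
    "obj3": {"type": "Chair", "place": "Turku"},
}


def is_a(cls, ancestor):
    """True if cls equals ancestor or lies below it in the hierarchy."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def filter_by_views(instances, selections):
    """Return ids of instances that satisfy the class selected in every view."""
    return sorted(
        oid
        for oid, props in instances.items()
        if all(is_a(props.get(view), cls) for view, cls in selections.items())
    )


broad = filter_by_views(INSTANCES, {"type": "Furniture"})
narrower = filter_by_views(INSTANCES, {"type": "Chair", "place": "Finland"})
narrowest = filter_by_views(INSTANCES, {"type": "Chair", "place": "Turku"})
```

Each additional or more specific selection can only shrink the result set, which is how repeated constraining eventually leads to the records sought.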
The software also supports topic-based navigation by providing semantic links between topics of interest, the creation of which is based on the collection domain ontology and the related metadata of the collection records. This means that the links also provide the user with an impression of the wider context and pragmatics of the objects in the museums’ collections.
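One simple way to derive such links, shown here only as a sketch, is to connect objects that share a property value in the metadata. The property names (`madeBy`, `usedIn`) and the data are hypothetical, and the actual link rules used by Ontogator are not described in this primer.

```python
from collections import defaultdict

# Hypothetical metadata statements: (object, property, value).
TRIPLES = [
    ("obj1", "madeBy", "Artek"),
    ("obj2", "madeBy", "Artek"),
    ("obj2", "usedIn", "Sauna"),
    ("obj3", "usedIn", "Sauna"),
]


def semantic_links(triples, start):
    """Group other objects by the (property, value) pair they share with `start`."""
    by_pair = defaultdict(set)
    for subject, prop, value in triples:
        by_pair[(prop, value)].add(subject)
    links = {}
    for subject, prop, value in triples:
        if subject == start:
            others = sorted(by_pair[(prop, value)] - {start})
            if others:
                links[(prop, value)] = others
    return links


links_for_obj2 = semantic_links(TRIPLES, "obj2")
```

Labelling each link with the shared property is what lets a browser present it as a topic ("made by the same manufacturer", "used in the same setting") rather than as an unexplained cross-reference.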
From human users to software agents
As described in the Finnish Museums on the Semantic Web example, the RDF repository is a large semantic graph that consists of the shared ontology and metadata of the participating museums. Such a repository can be queried, and the results, a set of pointers to the relevant resources, can be accessed using Web browsers. The opportunities provided by a system like the one developed by the FMS initiative (e.g. topic-based navigation) are at present restricted primarily to human users.
The Semantic Web vision includes intelligent software agents which ‘understand’ semantic relationships between Web resources and seek relevant information as well as perform transactions for humans.14
This software would be capable of autonomous action, i.e. could run without direct human control or constant supervision, and ideally is very flexible in doing this. Characterisations of this flexibility include actions that are ‘reactive’, ‘proactive’ and ‘social’ (see below).
While the basic idea of agents is very intuitive and appealing, the actual theory is complex, the tools are immature, and the solutions are small and prototype-based. In fact, as a parallel distributed systems technology, agents belong to the most complex class of software technology.
However, this primer will conclude with a summary of what an intelligent software agent is and what such software would generally be capable of doing. This should also serve as an indication of how great the challenge for research and technological development is to make the full Semantic Web vision a reality.
Intelligent Software Agents
The following definitions are taken from Michael Wooldridge’s introduction to multiagent systems:15
Agent
‘An agent is a computer system capable of autonomous action in some environment.’
Intelligent agent
‘An intelligent agent is a computer system capable of flexible autonomous action in some environment.’
Flexible autonomous action
‘By flexible autonomous action we mean reactive, proactive, social.’
| Reactivity: ‘A reactive system is one that maintains an ongoing interaction with its environment, and responds to changes that occur in it (in time for the response to be useful).’
| Proactiveness: ‘An agent serves a purpose and therefore exhibits goal-directed behaviour, including the capacity to recognise opportunities for useful courses of action.’
| Social ability: ‘Social ability in agents is the ability to interact with other agents (and possibly humans) via some kind of agent communication language, and perhaps cooperate with others.’
Desirable further properties of agents are:
| Mobility: the ability to move around an electronic network.
| Rationality: an agent will act in such a way that it does not prevent itself from achieving its goals (as far as this is possible with a limited set of beliefs representing its world knowledge).
| Learning: an agent will improve its performance over time.
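The three aspects of flexible autonomous action can be illustrated with a toy sketch. It is not taken from Wooldridge’s book; the class `CollectionAgent`, its event format and its message tuples are assumptions made for this example, and real agent platforms are far more elaborate.

```python
class CollectionAgent:
    """Toy agent that looks for collection records on a set of topics."""

    def __init__(self, goal_topics):
        self.goal_topics = set(goal_topics)  # the purpose the agent serves
        self.covered = set()                 # topics already found
        self.outbox = []                     # messages intended for other agents

    def perceive(self, event):
        """Reactive: respond to a change in the environment (a newly published record)."""
        topic = event["topic"]
        if topic in self.goal_topics and topic not in self.covered:
            self.covered.add(topic)
            # Social: inform other agents about the relevant record.
            self.outbox.append(("inform", event["id"]))

    def next_action(self):
        """Proactive: choose an action that brings the agent closer to its goal."""
        missing = sorted(self.goal_topics - self.covered)
        return ("search", missing[0]) if missing else ("idle", None)


agent = CollectionAgent(["psalter", "breviary"])
agent.perceive({"id": "rec-1", "topic": "psalter"})
action = agent.next_action()
```

Mobility, rationality and learning are deliberately left out; the sketch only shows how reacting to events, pursuing a goal and messaging other agents differ from one another.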
14 T. Berners-Lee, J. Hendler, O. Lassila: Scientific American, May 2001, http://www.sciam.com/2001/0501issue/0501berners-lee.html
15 M. Wooldridge: An Introduction to Multiagent Systems. Chichester: Wiley, 2002, and http://www.csc.liv.ac.uk/~mjw/pubs/imas
Resources
In the primer, no references are made to the documents of the World Wide Web Consortium (W3C). All relevant W3C recommendations can be found at http://www.w3c.org
Of the wealth of introductory materials on XML and RDF available on the Web, the following in particular are useful to consult for further details:
http://www.w3schools.com/xml
http://www.w3schools.com/schema
http://www.w3.org/TR/rdf-primer
K. S. Candan, H. Liu, R. Suvarna: Resource Description Framework: Metadata and its Applications. In: SIGKDD Explorations, Vol. 3/1 (2001), 6-19
Pierre-Antoine Champin: RDF Tutorial (2001), http://www710.univ-lyon1.fr/~champin/rdf-tutorial/rdf-tutorial.html
S. Decker, M. P. Mitra, S. Melnik: Framework for the Semantic Web: An RDF Tutorial, http://www.ida.liu.se/~asmpa/courses/sweb/rdf/rdf_tutorial.pdf
The Finnish Museums on the Semantic Web: Overview of the System’s Set-up
[Graphic 3: Set-up of the Finnish Museums on the Semantic Web System. Components shown: user client (WWW browser; topic-based navigation, view-based filtering); server (Ontogator software); RDF database (semantic graph: knowledge space of shared ontology and metadata); Web crawler; metadata editor; and, for each participating museum 1 to n, RDF instances (RDF Schema: semantic interoperability), an XML repository (XML Schema: syntactic interoperability) and a collection database (relational schemas, DBMS).]
THE DARMSTADT FORUM PARTICIPANTS
Wernher Behrendt, Salzburg Research, Austria
Wernher Behrendt is a Senior Researcher at the Sun Technology and Research Excellence Center at Salzburg Research (Austria), working on multimedia middleware and interoperation issues. He holds an MSc in Cognitive Science from Manchester University and has more than 10 years’ experience in near-to-market IT research. From 1989 to 1995, he was a Senior Research Associate in the Informatics Department at Rutherford Appleton Laboratory (UK), working on embedded knowledge-based systems in distributed multimedia presentation systems. From 1995 to 1998, he was a Senior Research Associate at Cardiff University (UK), working on interoperation between heterogeneous information systems. Mr Behrendt has held courses in Computer Science and has worked on projects ranging from software engineering methods and quality assurance to legacy system reengineering using migration methods and distributed systems middleware.
E-mail: wernher.behrendt@salzburgresearch.at
Paolo Buonora, Archivio di Stato di Roma, Italy
Paolo Buonora holds a degree in Philosophy from the University of Rome ‘La Sapienza’ (1976). He worked in the Italian State Archive Administration from 1978, where he was first involved in editing the Guida Generale degli Archivi di Stato italiani. From 1986 he worked in the Soprintendenza archivistica per il Lazio, surveying audiovisual archives and municipal archives; from 1989 to 1991 he was at Perugia University, engaged in doctoral research in ‘Urban and rural history’; and from 1991 to 1994 he was again in the Soprintendenza archivistica per il Lazio. After 1994 he worked in the Archivio di Stato di Roma, where he was responsible for the photograph service and several working groups on informatics application in archival documentation. From 1997 until the present time, he has planned and directed the Imago II project in the Archivio di Stato di Roma.
See httpwwwasrmarchivbeniculturaliitsidimagoIMAGOIIenhtml
Samuel Cruz-Lara, University of Nancy 2 and LORIA, France
Samuel Cruz-Lara obtained a Master’s degree in Computer Science in 1984 (University Henri Poincaré, Nancy 1) and a PhD degree in Computer Science in 1988 (National Polytechnic Institute of Lorraine). The central topic of his PhD thesis was the generation of integrated development environments by using attribute grammars.
He is currently Associate Professor at the University of Nancy 2 (Institute of Technology, Computer Science Department) and permanent Researcher at LORIA (Lorraine Laboratory for Research in Computer Science and its Applications - UMR 7503 - CNRS - INRIA - Universities of Nancy). He is a member of the ‘Language and Dialogue’ team and has conducted several research activities on distributed software architectures and textual linguistic resources management. He is currently working in the context of distributed architectures and multimedia resources management. Dr Cruz-Lara has participated in several projects, in particular CNRS/SILFIDE and MLIS-ELAN, and he is at present co-leader of the ‘Digital Museum’ project. This is a joint project between LORIA and the National Chi-Nan University. The ‘Digital Museum’ project is sponsored by the National Science Council of the Republic of China (Taiwan) and supported by INRIA (France).
Costis Dallas, Critical Publics SA, Greece
Costis Dallas is Chairman and Senior Researcher of Critical Publics (http://www.criticalpublics.com), a group of companies active in the field of strategic communications, creative design and technology. He is currently a Lecturer in the Department of Communication and Mass Media of Panteion University and has over 15 years of research and professional experience in hypermedia applications, human factor issues and cultural information systems. Dr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA, co-founder and Chair of the Multimedia Working Group of the International Council of Museums (CIDOC/ICOM), Head of Documentation and Systems of the Benaki Museum, General Director of the Foundation of the Hellenic World, Special Secretary of the Greek Ministry of Education in charge of libraries, archives and instructional technologies, and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: info@criticalpublics.com
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (http://www.nl.adlibsoft.com), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that he joined Databasix, producer of the ADLIB software. In 1991, Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was renamed later as ADLIB Information Systems.
ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include:
CIMI Dublin Core test bed - ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database, in addition to the ‘standard’ HTML/XML access.
SPECTRUM-XML - A project of the UK mda together with a team of software vendors to produce a schema for exchange of data which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software.
Internet Gelderse Musea (http://www.igem.nl) and Maritiem Digitaal (http://www.maritiemdigitaal.nl) - Two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.
E-mail: bert@nl.adlibsoft.com
Nicola Guarino, Italian National Research Council, Italy
Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991 and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS) and associate editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: Nicola.Guarino@ladseb.pd.cnr.it
Janneke van Kersen, Digital Heritage Association, The Netherlands
Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992, she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing, and more recently she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University.
Since October 1999, Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata, and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations, such as museums, archives, archaeological organisations, monuments and special collections of libraries, in a context-rich and XML-based environment. Interoperability from both a technical and an organisational perspective is therefore a major issue. The Association offers access to heritage information to the general public at http://www.cultuurwijzer.nl and customised access for the educational field at http://www.cultuurwijs.nl
E-mail: jannekevankersendennl
Marco Meli, EDW International, Italy
Marco Meli is CEO and co-founder of EDW International (http://www.edw-international.com), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia and VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meli@edw.it
Paul Miller, UKOLN, UK
Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, http://www.ukoln.ac.uk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK’s Further and Higher Education Funding Councils and Resource: the Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI), and the Metadata Working Group of the UK Government’s Office of the e-Envoy.
E-mail: P.Miller@ukoln.ac.uk
Frank Nack, CWI, The Netherlands
Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on ‘The Application of Video Semantics and Theme Representation for Automated Film Editing’ at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling, and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document, and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: Frank.Nack@cwi.nl
Franco Niccolucci, Florence University, Italy
Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference.
See also http://www.geog.port.ac.uk/hist-bound/people/niccolucci.htm
Seamus Ross, HATII, University of Glasgow, UK
Dr Seamus Ross is Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Access Network) (IST-2001-32706), a European Union funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously, he was Assistant Secretary for Information Technology at the British Academy, and before that he worked for a company specialising in expert systems and software development, as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group’s PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: S.Ross@hatii.arts.gla.ac.uk
Andrea Scotti, Institute & Museum for the History of Science, Italy
Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995 he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.
Since 1996, Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza and the National Central Library, and under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (010201-300703), funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies.
See http://cweb.inria.fr/Projects/Mesmuses
E-mail: scottiangalileoimssfirenzeit
DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over the period of 30 months (03/2002-08/2004).
In order to encourage early take-up, DigiCULT produces seven Thematic Issues and three Technology Watch Reports, along with the newsletter DigiCULT.Info.
ResourcesIn the primer no references are made to thedocuments of the World Wide Web Consortium(W3C)All relevant W3C recommendations canbe found at httpwwww3corg
Of the wealth of introductory materials on XMLand RDF available on the Web the following inparticular are useful to consult for further details
httpwwww3schoolscomxmlhttpwwww3schoolscomschemahttpwwww3orgTRrdf-primer
XML repository
OntologyRDFS
K S Candan H Liu R Suvarna ResourceDescription Framework Metadata and itsApplications In SIGKDD ExplorationsVol 31 (2001) 6-19
Pierre-Antoine Champin RDF Tutorial (2001)httpwww710univ-lyon1fr~champinrdf-tutorialrdf-tutorialhtml
S Decker M P Mitra S Melnik Framework forthe Semantic WebAn RDF Tutorialhttpwwwidaliuse~asmpacoursesswebrdfrdf_tutorialpdf
36 DigiCULT
The Finnish Museums on the Semantic Web Overview of the Systemrsquos Set-up
User ClientWWW BrowserTopic-based navigationView-based filtering
Server Ontogator software
RDF database
Semantic graphKnowledge space ofshared ontology andmetadata
RDF Schema Semanticinteroperability
XML Schema Syntacticinteroperability
Relational SchemasDBMS
Web crawler
RDF instances
Collectiondatabase 1
Metadata Editor XML repository XML repository
RDF instances RDF instances
Collectiondatabase 2
Collectiondatabase n
Graphic 3 Set-up of the Finnish Museums on the Semantic Web System
38 DigiCULT
Wernher Behrendt Salzburg Research Austria Wernher Behrendt is a Senior Researcher at theSun Technology and Research Excellence Center atSalzburg Research (Austria) working on multimediamiddleware and interoperation issues He holds anMSc in Cognitive Science from Manchester Uni-versity and has more than 10 yearsrsquo experience innear-to-market IT research From 1989 to 1995 hewas a Senior Research Associate in the InformaticsDepartment at Rutherford Appleton Laboratory(UK) working on embedded knowledge basedsystems in distributed multimedia presentationsystems From 1995 to 1998 he was a SeniorResearch Associate at Cardiff University (UK)working on interoperation between heterogeneousinformation systems Mr Behrendt has held coursesin Computer Science and has worked on projectsranging from software engineering methods andquality assurance to legacy system reengineeringusing migration methods and distributed systemsmiddlewareE-mail wernherbehrendtsalzburgresearchat
Paolo BuonoraArchivio di Stato di Roma ItalyPaolo Buonora holds a degree in Philosophy fromthe University of Rome lsquoLa Sapienzarsquo (1976) Heworked in the Italian State Archive Administrationfrom 1978 where he was first involved in editing theGuida Generale degli Archivi di Stato italiani From1986 he worked in the Soprintendenza archivisticaper il Lazio surveying audiovisual archives muni-cipal archives from 1989 to 1991 at the PerugiaUniversity engaged in a doctoral research in lsquoUrbanand rural historyrsquo and from 1991 to 1994 again inthe Soprintendenza archivistica per il LazioAfter1994 he worked in the Archivio di Stato di Romawhere he was responsible for the photograph ser-vice and several working groups on informaticsapplication in archival documentation From 1997until the present time he has planned and directedthe Imago II project in the Archivio di Stato diRomaSee httpwwwasrmarchivbeniculturaliitsidimagoIMAGOIIenhtml
Samuel Cruz-Lara University of Nancy 2and LORIA FranceSamuel Cruz-Lara obtained a Masterrsquos degree inComputer Science in 1984 (University HenriPoincareacute Nancy 1) and a PhD degree in ComputerScience in 1988 (National Polytechnic Institute ofLorraine)The central topic of his PhD thesis was thegeneration of integrated development environmentsby using attribute grammarsHe is currently Associate Professor at the Universityof Nancy 2 (Institute of Technology ComputerScience Department) and permanent Researcher atLORIA (Lorraine Laboratory for Research in Com-puter Science and its Applications ndash UMR 7503 ndashCNRS ndash INRIA ndash Universities of Nancy) He is amember of the lsquoLanguage and Dialoguersquo team and hasconducted several research activities on distributedsoftware architectures and textual linguistic resourcesmanagement He is currently working in the contextof distributed architectures and multimedia resourcesmanagement Dr Cruz-Lara has participated in severalprojects in particular CNRSSILFIDE and MLIS-ELAN and he is at present co-leader of the lsquoDigitalMuseumrsquo projectThis is a joint project betweenLORIA and the National Chi-Nan UniversityThelsquoDigital Museumrsquo project is sponsored by theNational Science Council of the Republic of China(Taiwan) and supported by INRIA (France)
THE DARMSTADT FORUM
PARTICIPANTS

Costis Dallas, Critical Publics SA, Greece
Costis Dallas is Chairman and Senior Researcher of Critical Publics (http://www.criticalpublics.com), a group of companies active in the field of strategic communications, creative design and technology. He is currently a Lecturer in the Department of Communication and Mass Media of Panteion University, and has over 15 years of research and professional experience in hypermedia applications, human factor issues and cultural information systems. Dr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA, co-founder and Chair of the Multimedia Working Group of the International Council of Museums (CIDOC/ICOM), Head of Documentation and Systems of the Benaki Museum, General Director of the Foundation of the Hellenic World, Special Secretary of the Greek Ministry of Education in charge of libraries, archives and instructional technologies, and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: info@criticalpublics.com
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (http://www.nl.adlibsoft.com), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that he joined Databasix, producer of the ADLIB software. In 1991 Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was renamed later as ADLIB Information Systems.
ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include:
CIMI Dublin Core test bed - ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database in addition to the ‘standard’ HTML/XML access.
SPECTRUM-XML - A project of the UK mda together with a team of software vendors to produce a schema for exchange of data which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software.
Internet Gelderse Musea (http://www.igem.nl) and Maritiem Digitaal (http://www.maritiemdigitaal.nl) - Two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.
E-mail: bert@nl.adlibsoft.com
Nicola Guarino, Italian National Research Council, Italy
Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991, and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS), and associated editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: Nicola.Guarino@ladseb.pd.cnr.it
Janneke van Kersen, Digital Heritage Association, The Netherlands
Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992 she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing, and more recently she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University.
Since October 1999, Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata, and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations such as museums, archives, archaeological organisations, monuments and special collections of libraries in a context-rich and XML-based environment. Interoperability from both a technical and an organisational perspective is therefore a major issue. The Association offers access to heritage information to the general public at http://www.cultuurwijzer.nl and customised access for the educational field at http://www.cultuurwijs.nl.
E-mail: janneke.van.kersen@den.nl
Marco Meli, EDW International, Italy
Marco Meli is CEO and co-founder of EDW International (http://www.edw-international.com), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia, VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meli@edw.it
Paul Miller, UKOLN, UK
Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, http://www.ukoln.ac.uk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK’s Further and Higher Education Funding Councils and Resource: the Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI), and the Metadata Working Group of the UK Government’s Office of the e-Envoy.
E-mail: P.Miller@ukoln.ac.uk
Frank Nack, CWI, The Netherlands
Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on ‘The Application of Video Semantics and Theme Representation for Automated Film Editing’ at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling, and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document, and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: Frank.Nack@cwi.nl
Franco Niccolucci, Florence University, Italy
Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference.
See also http://www.geog.port.ac.uk/hist-bound/people/niccolucci.htm
Seamus Ross, HATII, University of Glasgow, UK
Dr Seamus Ross is Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Network) (IST-2001-32706), a European Union funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously, he was Assistant Secretary for Information Technology at the British Academy, and before that worked for a company specialising in expert systems and software development as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group’s PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: S.Ross@hatii.arts.gla.ac.uk
Andrea Scotti, Institute & Museum for the History of Science, Italy
Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995 he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.
Since 1996, Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza, the National Central Library and under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (01/02/01-30/07/03), funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies.
See http://cweb.inria.fr/Projects/Mesmuses
E-mail: scottian@galileo.imss.firenze.it
DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over the period of 30 months (03/2002-08/2004).

In order to encourage early take up, DigiCULT produces seven Thematic Issues, three Technology Watch Reports, along with the newsletter DigiCULT.Info.

DigiCULT draws on the results of the strategic study ‘Technological Landscapes for Tomorrow’s Cultural Economy (DigiCULT)’ that was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications), in 2000 and completed in 2001.

Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at http://www.digicult.info.
For further information on DigiCULT please contact the team of the project co-ordinator:

Mr Guntram Geser, guntram.geser@salzburgresearch.at
Phone: +43-(0)662-2288-303

Mr John Pereira, john.pereira@salzburgresearch.at
Phone: +43-(0)662-2288-247

Salzburg Research Forschungsgesellschaft
Jakob-Haringer-Str. 5/III
A - 5020 Salzburg, Austria
Phone: +43-(0)662-2288-200
Fax: +43-(0)662-2288-222
http://www.salzburgresearch.at

Project Partner:
HATII - Humanities Advanced Technology and Information Institute
University of Glasgow
http://www.hatii.arts.gla.ac.uk
Contact: Mr Seamus Ross, s.ross@hatii.arts.gla.ac.uk

The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom
DIGICULT PROJECT INFORMATION
DigiCULT Thematic Issue 1 - Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum held in Barcelona on May 6th, 2002, in the context of the DLM-Conference 2002.

DigiCULT Thematic Issue 2 - Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum held in Essen, Germany, on September 3rd, 2002, in the context of the AIIM Conference DMS EXPO.

DigiCULT Thematic Issue 3 - Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum held on January 21st, 2003, at Fraunhofer IPSI, Darmstadt, Germany.

DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum on Learning Objects that will take place at the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, on July 2nd, 2003.
IMPRINT
This Thematic Issue is a product of the DigiCULT Project (IST-2001-34898).

Authors:
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy

Images:
Images for this Thematic Issue have been provided by and are reproduced with kind permission of the Koninklijke Bibliotheek – National Library of the Netherlands, The Hague, Netherlands.

Graphics & Layout:
Jan Steindl, Salzburg Research

ISBN 3-902448-00-8
Printed in Austria
© 2003
IMAGES
Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I
Image on p. 5 from Fol. 264r, size 80x75, illuminator: Orosius Master a.o.

Bible Historiale, Paris, c. 1320-1340, Volume I
Image on p. 7 from Fol. 2r, size 45x50, illum.: ‘Sub-Fauvel’ Master

Historic Bible, Utrecht, c. 1430, Volume I
Images on pp. 8, 10, 14, 28 from Fol. 3r, 3v, 4v, 7v, size 55x85 to 60x85, illum.: Alexander Master o.a.

Psalter, Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468
Images on pp. 11, 13: Wednesday Matins, Invitatorium, from Fol. 219r

Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350
Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron) from Fol. 104rb1, 106rb2, 124r, 111ra, size 40x55 to 50x55

Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335
Image on p. 21 from Fol. 4va1, size 45x55

Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460
Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41

Psalter, Normandy, c. 1180
Image on p. 26 from Fol. 3v, size 160x125

Breviary, Cambray(?), c. 1275-1300
Images on pp. 27, 29, 31, 33, 34, 35, 37 from Fol. 211r, size 205x135

© Koninklijke Bibliotheek, The Hague. Used with permission.
Wernher Behrendt, Salzburg Research, Austria
Wernher Behrendt is a Senior Researcher at the Sun Technology and Research Excellence Center at Salzburg Research (Austria), working on multimedia middleware and interoperation issues. He holds an MSc in Cognitive Science from Manchester University and has more than 10 years’ experience in near-to-market IT research. From 1989 to 1995, he was a Senior Research Associate in the Informatics Department at Rutherford Appleton Laboratory (UK), working on embedded knowledge based systems in distributed multimedia presentation systems. From 1995 to 1998, he was a Senior Research Associate at Cardiff University (UK), working on interoperation between heterogeneous information systems. Mr Behrendt has held courses in Computer Science and has worked on projects ranging from software engineering methods and quality assurance, to legacy system reengineering using migration methods and distributed systems middleware.
E-mail: wernher.behrendt@salzburgresearch.at

Paolo Buonora, Archivio di Stato di Roma, Italy
Paolo Buonora holds a degree in Philosophy from the University of Rome ‘La Sapienza’ (1976). He worked in the Italian State Archive Administration from 1978, where he was first involved in editing the Guida Generale degli Archivi di Stato italiani. From 1986 he worked in the Soprintendenza archivistica per il Lazio, surveying audiovisual archives, municipal archives; from 1989 to 1991 at the Perugia University, engaged in a doctoral research in ‘Urban and rural history’; and from 1991 to 1994 again in the Soprintendenza archivistica per il Lazio. After 1994 he worked in the Archivio di Stato di Roma, where he was responsible for the photograph service and several working groups on informatics application in archival documentation. From 1997 until the present time he has planned and directed the Imago II project in the Archivio di Stato di Roma.
See httpwwwasrmarchivbeniculturaliitsidimagoIMAGOIIenhtml
Samuel Cruz-Lara University of Nancy 2and LORIA FranceSamuel Cruz-Lara obtained a Masterrsquos degree inComputer Science in 1984 (University HenriPoincareacute Nancy 1) and a PhD degree in ComputerScience in 1988 (National Polytechnic Institute ofLorraine)The central topic of his PhD thesis was thegeneration of integrated development environmentsby using attribute grammarsHe is currently Associate Professor at the Universityof Nancy 2 (Institute of Technology ComputerScience Department) and permanent Researcher atLORIA (Lorraine Laboratory for Research in Com-puter Science and its Applications ndash UMR 7503 ndashCNRS ndash INRIA ndash Universities of Nancy) He is amember of the lsquoLanguage and Dialoguersquo team and hasconducted several research activities on distributedsoftware architectures and textual linguistic resourcesmanagement He is currently working in the contextof distributed architectures and multimedia resourcesmanagement Dr Cruz-Lara has participated in severalprojects in particular CNRSSILFIDE and MLIS-ELAN and he is at present co-leader of the lsquoDigitalMuseumrsquo projectThis is a joint project betweenLORIA and the National Chi-Nan UniversityThelsquoDigital Museumrsquo project is sponsored by theNational Science Council of the Republic of China(Taiwan) and supported by INRIA (France)
Costis Dallas Critical Publics SA GreeceCostis Dallas is Chairman and Senior Researcher ofCritical Publics (httpwwwcriticalpublicscom) agroup of companies active in the field of strategiccommunications creative design and technologyHe is currently a Lecturer in the Department ofCommunication and Mass Media of PanteionUniversity and has over 15 years of research andprofessional experience in hypermedia applicationshuman factor issues and cultural information systemsDr Dallas has been co-founder and Executive Vice-President of ISP Hellas Online SA co-founder andChair of the Multimedia Working Group of theInternational Council of Museums (CIDOCICOM)Head of Documentation and Systems of the Benaki
THE DARMSTADT FORUM
PARTICIPANTS
Museum General Director of the Foundation of theHellenic World Special Secretary of the GreekMinistry of Education in charge of libraries archivesand instructional technologies and Special Advisor tothe Greek Foreign Minister on cultural andinformation technology issuesE-mail infocriticalpublicscom
Bert Degenhart-Drenth ADLIB InformationSystems BVThe NetherlandsBert Degenhart-Drenth is the founder and generalmanager of ADLIB Information Systems BV(httpwwwnladlibsoftcom) a leading companyin the field of library museum and archive auto-mationAlthough his background is in electronicengineering he became involved in a museumautomation project as early as 1983 Degenhart-Drenth worked for the MARDOC foundation inRotterdam for three years setting up one of thefirst integrated museum automation systems in TheNetherlandsAfter that he joined Databasix producerof the ADLIB software In 1991 Degenhart-Drenthled a management buy-out to form DatabasixInformation Systems which was renamed later asADLIB Information SystemsADLIB Information Systems is now the marketleader in museum automation in the Benelux regionand has more than 1000 customers in the librariesmuseums and archives fieldADLIB is a CIMImember and has as such been involved in the CIMIZ3950 and Dublin Core test beds Degenhart-Drenth has been a core member in the developmentof the Spectrum-XML schema and collaborates inthe EMII-DCF project Relevant ADLIB projectsinclude CIMI Dublin Core test bed - ADLIB hoststhe CIMI Dublin Core test bed and has developeda database application for thisThe Open ArchivesInitiative Protocol version 10 is now available forthis database in addition to the lsquostandardrsquo HTMLXML access SPECTRUM-XML - A project of theUK mda together with a team of software vendorsto produce a schema for exchange of data whichcontains information elements from the UKSpectrum standardThis will be implemented in
the ADLIB Museum software Internet GelderseMusea (httpwwwigemnl) and Maritiem Digitaal(httpwwwmaritiemdigitaalnl)Two Web-basedprojects that make the data from multiple museums(including their library data) available on the Webbased on a three-tier implementation with XML asthe data exchange mechanism E-mailbertnladlibsoftcom
Nicola Guarino Italian National ResearchCouncil ItalyNicola Guarino is a senior researcher at the Institutefor Cognitive Sciences and Technologies of the ItalianNational Research Council where he leads theLaboratory for Applied Ontology He graduated inElectrical Engineering from the University of Padovain 1978 He has been active in the ontology fieldsince 1991 and has played a leading role in theAI community in promoting the study of theontological foundations of knowledge engineeringand conceptual modelling under an interdisciplinaryapproach His current research activities involveformal ontology ontology design knowledge sharingand integration and ontology-based metadatastandardisation He is general chairman of theInternational Conference on Formal Ontology inInformation Systems (FOIS) and associated editorof the International Journal of Human-ComputerStudies He has published more than 60 papers inscientific journals books and conference proceedingsand has been guest editor of three special issues ofscientific journals related to formal ontology andinformation systems He is involved in variousprojects related to ontologies and the SemanticWeb including WonderWeb and OntoWebE-mail NicolaGuarinoladsebpdcnrit
Janneke van Kersen Digital HeritageAssociationThe NetherlandsJanneke van Kersen graduated in Art History andtook part in a postgraduate programme in HistoricalInformation ProcessingAfter her graduation in 1992she worked in the field of Humanities andComputing She held several posts at Utrecht
DigiCULT 39
40 DigiCULT
University as a teacher in the Department ofHumanities and Computing and more recently shewas responsible for the realisation of computer-aidedapplications for the Department of Art HistoryFurthermore she worked at the Netherlands HistoricData Archive and taught in the postgraduateprogramme on Historical Information Processingat Leiden UniversitySince October 1999 Kersen has been working asa consultant with the Dutch Digital HeritageAssociationThe consulting focuses on digitisationstandardisation metadata and education amp ICTThemain goal of the organisation is to provide access todistributed databases of cultural heritage organi-sations such as museums archives archaeologicalorganisations monuments and special collectionsof libraries in a context-rich and XML-basedenvironment Interoperability from both a technicaland an organisational perspective is therefore a majorissueThe Association offers access to heritageinformation to the general public athttpwwwcultuurwijzernl and customisedaccess for the educational field athttpwwwcultuurwijsnlE-mail jannekevankersendennl
Marco Meli EDW International ItalyMarco Meli is CEO and co-founder of EDWInternational (httpwwwedw-internationalcom)Milan Italy a company providing leading corporatepublishing and content management applicationsbased on XML and related standards Meli has longexperience in contentdocument management and inmultimedia creation and production He is a memberof the Organisation Group of XML Italia VP ofSGML UG Italia He has given a number of presen-tations at SGML and XML related conferences inItaly Europe and the US Meli is also editor of acolumn on new facets of publishing in Graphicusmagazine He has been involved in cultural projectssince 1996 and acts as a reviewer of InformationSociety Technologies (IST) projects for the EuropeanCommissionE-mail meliedwit
Paul Miller UKOLN UKDr Miller holds the post of Interoperability Focus atUKOLN (UK Office for Library and InformationNetworking httpwwwukolnacuk)Interoperability Focus is jointly funded by the JointInformation Systems Committee (JISC) of the UKrsquosFurther and Higher Education Funding Councils andResource the Council for Museums Archives andLibrariesThe post is responsible for exploringpublicising and mobilising the benefits and practiceof effective interoperability across diverse informationsectors including libraries and the cultural heritageand archival communities Dr Miller sits on a numberof relevant committees including the ExecutiveCommittee of the CIMI Consortium the AdvisoryBoard of the Dublin Core Metadata Initiative(DCMI) and the Metadata Working Group ofthe UK Governmentrsquos Office of the endashEnvoyE-mail PMillerukolnacuk
Frank Nack CWI UKDr Frank Nack is a senior researcher at CWIcurrently working within the Multimedia andHuman-Computer Interaction group He obtainedhis PhD on lsquoThe Application of Video Semantics andTheme Representation for Automated Film Editingrsquoat Lancaster University UKThe main thrust of hisresearch is on video representation digital videoproduction multimedia systems that enhance humancommunication and creativity interactive storytellingand media-networked oriented agent technology Heis an active member of the MPEG-7 standardisationgroup where he served as editor of the Context andObjectives Document and the Requirements Docu-ment and chaired the MPEG-7 DDL developmentgroup He is on the editorial board of IEEEMultimedia where he edits the Media ImpactcolumnE-mail FrankNackcwinl
Franco Niccolucci Florence University ItalyFranco Niccolucci has a background in mathematicsand computer science and is at present a professor atthe University of Florence where he lectures in the
Faculty of Architecture and at the School ofArchaeology and at the University of Basilicatawhere he lectures in the Faculty of CulturalHeritage He is also director of the Laboratoryfor Virtual Archaeology and Digital Culture in thePrato campus of the University of Florence He isa member of several professional associations and isthe Italian representative on the internationalSteering Committee of CAA the association forComputer Applications to Archaeology His interestsconcern virtual archaeology and the use of multi-media in cultural heritage communication thesubjects of the two most recent books edited by himHis present research deals with virtual reality modelsthe use of XML for archaeological and historic datamanagement and in general the impact of infor-matics on archaeological theory and method Hehas been a member of the scientific committees ofthe most recent international conferences on thesesubjects and will chair the 2004 CAA InternationalConference See alsohttpwwwgeogportacukhist-boundpeopleniccoluccihtm
Seamus Ross HATIIUniversity of Glasgow UKDr Seamus Ross is Director of Glasgow UniversityrsquosHumanities Advanced Technology and InformationInstitute (HATII) He is also Director of ERPANET(Electronic Resource Preservation and Network)(IST-2001-32706) a European Union fundedaccompanying measure to enhance the preservationof cultural heritage and scientific digital objectsPreviously he was Assistant Secretary for InformationTechnology at the British Academy and before thatworked for a company specialising in expert systemsand software development as a software engineer andthen in management He researches lectures andpublishes widely on information technology anddigital preservation Dr Ross acts as ICT advisor tothe Heritage Lottery Fund and is a monitor for anumber of large ICT-based projects in the UK He isa member of a number of international organisationsincluding the DLM-Monitoring Committee of the
European Commission the Research LibrariesGrouprsquos PRESERV Working Group on PreservationIssues of Metadata and InterPARES (as well as Co-Chair of its European Team)E-mail SRosshatiiartsglaacuk
Andrea Scotti Institute amp Museum for theHistory of Science ItalyAndrea Scotti graduated from the University ofBologna Department of Philosophy in 1983 From1985 to 1995 he carried out several research projectsinvolving the cataloguing of scientific manuscripts inIsrael (Hebrew University of Jerusalem) Czecho-slovakia (Karol University of Prague) Hungary(Szecheny National Library of Budapest) andGermany (Institut fuumlr Geschichte der Natur-wissenschaften Munich) His numerous researchactivities concentrate on software and programmingfor library databases
Since 1996 Scotti has been Director of theGeneral Digital Catalogue of the ScientificManuscripts located at the Central National Libraryin Florence developed in co-operation with theIstituto e Museo di Storia della Scienza the NationalCentral Library and under the auspices of the ItalianMinistry for Culture He is in charge of the workthe Institute and Museum of the History of Sciencecarries out for the MESMUSES project (010201-300703) funded by the European Commissionunder the Information Society Technologies (IST)ProgrammeThe project aims at designing andexperimenting with metaphors for organisingstructuring and presenting thescientific and technical know-ledge offered to the publicimplementing Semantic WebtechnologiesSee httpcwebinriafrProjectsMesmusesE-mailscottiangalileoimssfirenzeit
DigiCULT 41
42 DigiCULT
DIGICULT PROJECT INFORMATION
DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over the period of 30 months (03/2002-08/2004).
In order to encourage early take-up, DigiCULT produces seven Thematic Issues and three Technology Watch Reports, along with the newsletter DigiCULT.Info.
DigiCULT draws on the results of the strategic study ‘Technological Landscapes for Tomorrow’s Cultural Economy (DigiCULT)’ that was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications), in 2000 and completed in 2001.
Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at http://www.digicult.info
For further information on DigiCULT, please contact the team of the project co-ordinator:
Mr Guntram Geser, guntram.geser@salzburgresearch.at, Phone: +43-(0)662-2288-303
Mr John Pereira, john.pereira@salzburgresearch.at, Phone: +43-(0)662-2288-247
Salzburg Research Forschungsgesellschaft
Jakob-Haringer-Str. 5/III, A-5020 Salzburg, Austria
Phone: +43-(0)662-2288-200
Fax: +43-(0)662-2288-222
http://www.salzburgresearch.at
Project Partner:
HATII - Humanities Advanced Technology and Information Institute, University of Glasgow
http://www.hatii.arts.gla.ac.uk
Contact: Mr Seamus Ross, s.ross@hatii.arts.gla.ac.uk
The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom
DigiCULT Thematic Issue 1 - Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum, held in Barcelona on May 6th, 2002, in the context of the DLM-Conference 2002.
DigiCULT Thematic Issue 2 – Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum, held in Essen, Germany, on September 3rd, 2002, in the context of the AIIM Conference @ DMS EXPO.
DigiCULT Thematic Issue 3 - Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum, held on January 21st, 2003, at Fraunhofer IPSI, Darmstadt, Germany.
DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum, on Learning Objects, that will take place at the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, on July 2nd, 2003.
IMPRINT
This Thematic Issue is a product of the DigiCULT Project (IST-2001-34898).
Authors:
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy
Images:
Images for this Thematic Issue have been provided by and are reproduced with kind permission of the Koninklijke Bibliotheek – National Library of the Netherlands, The Hague, Netherlands.
Graphics & Layout:
Jan Steindl, Salzburg Research
ISBN 3-902448-00-8
Printed in Austria
© 2003
IMAGES
Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I. Image on p. 5 from Fol. 264r, size 80x75; illuminator: Orosius Master a.o.
Bible Historiale, Paris, c. 1320-1340, Volume I. Image on p. 7 from Fol. 2r, size 45x50; illum.: ‘Sub-Fauvel’ Master
Historic Bible, Utrecht, c. 1430, Volume I. Images on pp. 8, 10, 14, 28 from Fol. 3r, 3v, 4v, 7v, size 55x85 to 60x85; illum.: Alexander Master o.a.
Psalter Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468. Images on pp. 11, 13: Wednesday Matins, Invitatorium, from Fol. 219r
Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350. Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron) from Fol. 104rb1, 106rb2, 124r, 111ra, size 40x55 to 50x55
Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335. Image on p. 21 from Fol. 4va1, size 45x55
Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460. Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41
Psalter, Normandy, c. 1180. Image on p. 26 from Fol. 3v, size 160x125
Breviary, Cambray(?), c. 1275-1300. Images on pp. 27, 29, 31, 33, 34, 35, 37 from Fol. 211r, size 205x135
© Koninklijke Bibliotheek, The Hague. Used with permission.
Towards a Semantic Web for
Heritage Resources
Thematic Issue 3 May 2003
DigiCULT Consortium
www.digicult.info ISBN 3-902448-00-8
Museum, General Director of the Foundation of the Hellenic World, Special Secretary of the Greek Ministry of Education in charge of libraries, archives and instructional technologies, and Special Advisor to the Greek Foreign Minister on cultural and information technology issues.
E-mail: info@criticalpublics.com
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Bert Degenhart-Drenth is the founder and general manager of ADLIB Information Systems BV (http://www.nl.adlibsoft.com), a leading company in the field of library, museum and archive automation. Although his background is in electronic engineering, he became involved in a museum automation project as early as 1983. Degenhart-Drenth worked for the MARDOC foundation in Rotterdam for three years, setting up one of the first integrated museum automation systems in The Netherlands. After that he joined Databasix, producer of the ADLIB software. In 1991, Degenhart-Drenth led a management buy-out to form Databasix Information Systems, which was later renamed ADLIB Information Systems.
ADLIB Information Systems is now the market leader in museum automation in the Benelux region and has more than 1000 customers in the libraries, museums and archives field. ADLIB is a CIMI member and has as such been involved in the CIMI Z39.50 and Dublin Core test beds. Degenhart-Drenth has been a core member in the development of the Spectrum-XML schema and collaborates in the EMII-DCF project. Relevant ADLIB projects include:
CIMI Dublin Core test bed - ADLIB hosts the CIMI Dublin Core test bed and has developed a database application for this. The Open Archives Initiative Protocol version 1.0 is now available for this database in addition to the ‘standard’ HTML/XML access.
SPECTRUM-XML - A project of the UK mda together with a team of software vendors to produce a schema for exchange of data which contains information elements from the UK Spectrum standard. This will be implemented in the ADLIB Museum software.
Internet Gelderse Musea (http://www.igem.nl) and Maritiem Digitaal (http://www.maritiemdigitaal.nl) - Two Web-based projects that make the data from multiple museums (including their library data) available on the Web, based on a three-tier implementation with XML as the data exchange mechanism.
E-mail: bert@nl.adlibsoft.com
Nicola Guarino, Italian National Research Council, Italy
Nicola Guarino is a senior researcher at the Institute for Cognitive Sciences and Technologies of the Italian National Research Council, where he leads the Laboratory for Applied Ontology. He graduated in Electrical Engineering from the University of Padova in 1978. He has been active in the ontology field since 1991, and has played a leading role in the AI community in promoting the study of the ontological foundations of knowledge engineering and conceptual modelling under an interdisciplinary approach. His current research activities involve formal ontology, ontology design, knowledge sharing and integration, and ontology-based metadata standardisation. He is general chairman of the International Conference on Formal Ontology in Information Systems (FOIS) and associate editor of the International Journal of Human-Computer Studies. He has published more than 60 papers in scientific journals, books and conference proceedings, and has been guest editor of three special issues of scientific journals related to formal ontology and information systems. He is involved in various projects related to ontologies and the Semantic Web, including WonderWeb and OntoWeb.
E-mail: Nicola.Guarino@ladseb.pd.cnr.it
Janneke van Kersen, Digital Heritage Association, The Netherlands
Janneke van Kersen graduated in Art History and took part in a postgraduate programme in Historical Information Processing. After her graduation in 1992, she worked in the field of Humanities and Computing. She held several posts at Utrecht University as a teacher in the Department of Humanities and Computing, and more recently she was responsible for the realisation of computer-aided applications for the Department of Art History. Furthermore, she worked at the Netherlands Historic Data Archive and taught in the postgraduate programme on Historical Information Processing at Leiden University.
Since October 1999, Kersen has been working as a consultant with the Dutch Digital Heritage Association. The consulting focuses on digitisation, standardisation, metadata, and education & ICT. The main goal of the organisation is to provide access to distributed databases of cultural heritage organisations, such as museums, archives, archaeological organisations, monuments and special collections of libraries, in a context-rich and XML-based environment. Interoperability from both a technical and an organisational perspective is therefore a major issue. The Association offers access to heritage information to the general public at http://www.cultuurwijzer.nl and customised access for the educational field at http://www.cultuurwijs.nl
E-mail: janneke.van.kersen@den.nl
Marco Meli, EDW International, Italy
Marco Meli is CEO and co-founder of EDW International (http://www.edw-international.com), Milan, Italy, a company providing leading corporate publishing and content management applications based on XML and related standards. Meli has long experience in content/document management and in multimedia creation and production. He is a member of the Organisation Group of XML Italia and VP of SGML UG Italia. He has given a number of presentations at SGML and XML related conferences in Italy, Europe and the US. Meli is also editor of a column on new facets of publishing in Graphicus magazine. He has been involved in cultural projects since 1996 and acts as a reviewer of Information Society Technologies (IST) projects for the European Commission.
E-mail: meli@edw.it
Paul Miller, UKOLN, UK
Dr Miller holds the post of Interoperability Focus at UKOLN (UK Office for Library and Information Networking, http://www.ukoln.ac.uk). Interoperability Focus is jointly funded by the Joint Information Systems Committee (JISC) of the UK’s Further and Higher Education Funding Councils and Resource: The Council for Museums, Archives and Libraries. The post is responsible for exploring, publicising and mobilising the benefits and practice of effective interoperability across diverse information sectors, including libraries and the cultural heritage and archival communities. Dr Miller sits on a number of relevant committees, including the Executive Committee of the CIMI Consortium, the Advisory Board of the Dublin Core Metadata Initiative (DCMI), and the Metadata Working Group of the UK Government’s Office of the e-Envoy.
E-mail: P.Miller@ukoln.ac.uk
Frank Nack, CWI, The Netherlands
Dr Frank Nack is a senior researcher at CWI, currently working within the Multimedia and Human-Computer Interaction group. He obtained his PhD on ‘The Application of Video Semantics and Theme Representation for Automated Film Editing’ at Lancaster University, UK. The main thrust of his research is on video representation, digital video production, multimedia systems that enhance human communication and creativity, interactive storytelling, and media-networked oriented agent technology. He is an active member of the MPEG-7 standardisation group, where he served as editor of the Context and Objectives Document and the Requirements Document, and chaired the MPEG-7 DDL development group. He is on the editorial board of IEEE Multimedia, where he edits the Media Impact column.
E-mail: Frank.Nack@cwi.nl
Franco Niccolucci, Florence University, Italy
Franco Niccolucci has a background in mathematics and computer science and is at present a professor at the University of Florence, where he lectures in the Faculty of Architecture and at the School of Archaeology, and at the University of Basilicata, where he lectures in the Faculty of Cultural Heritage. He is also director of the Laboratory for Virtual Archaeology and Digital Culture in the Prato campus of the University of Florence. He is a member of several professional associations and is the Italian representative on the international Steering Committee of CAA, the association for Computer Applications to Archaeology. His interests concern virtual archaeology and the use of multimedia in cultural heritage communication, the subjects of the two most recent books edited by him. His present research deals with virtual reality models, the use of XML for archaeological and historic data management and, in general, the impact of informatics on archaeological theory and method. He has been a member of the scientific committees of the most recent international conferences on these subjects and will chair the 2004 CAA International Conference. See also http://www.geog.port.ac.uk/hist-bound/people/niccolucci.htm
Seamus Ross, HATII, University of Glasgow, UK
Dr Seamus Ross is Director of Glasgow University’s Humanities Advanced Technology and Information Institute (HATII). He is also Director of ERPANET (Electronic Resource Preservation and Access Network) (IST-2001-32706), a European Union funded accompanying measure to enhance the preservation of cultural heritage and scientific digital objects. Previously, he was Assistant Secretary for Information Technology at the British Academy, and before that worked for a company specialising in expert systems and software development, as a software engineer and then in management. He researches, lectures and publishes widely on information technology and digital preservation. Dr Ross acts as ICT advisor to the Heritage Lottery Fund and is a monitor for a number of large ICT-based projects in the UK. He is a member of a number of international organisations, including the DLM-Monitoring Committee of the European Commission, the Research Libraries Group’s PRESERV Working Group on Preservation Issues of Metadata, and InterPARES (as well as Co-Chair of its European Team).
E-mail: S.Ross@hatii.arts.gla.ac.uk
Andrea Scotti, Institute & Museum for the History of Science, Italy
Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995 he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.
Since 1996, Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza, the National Central Library and under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (01/02/01-30/07/03), funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies. See http://cweb.inria.fr/Projects/Mesmuses
E-mail: scottian@galileo.imss.firenze.it
Since 1996 Scotti has been Director of theGeneral Digital Catalogue of the ScientificManuscripts located at the Central National Libraryin Florence developed in co-operation with theIstituto e Museo di Storia della Scienza the NationalCentral Library and under the auspices of the ItalianMinistry for Culture He is in charge of the workthe Institute and Museum of the History of Sciencecarries out for the MESMUSES project (010201-300703) funded by the European Commissionunder the Information Society Technologies (IST)ProgrammeThe project aims at designing andexperimenting with metaphors for organisingstructuring and presenting thescientific and technical know-ledge offered to the publicimplementing Semantic WebtechnologiesSee httpcwebinriafrProjectsMesmusesE-mailscottiangalileoimssfirenzeit
DigiCULT 41
42 DigiCULT
DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch thatmonitors and analyses technological developmentsrelevant to and in the cultural and scientific heritagesector over the period of 30 months (032002-082004)
In order to encourage early take up DigiCULTproduces seven Thematic Issues three TechnologyWatch Reports along with the newsletterDigiCULTInfo
DigiCULT draws on the results of the strategicstudy lsquoTechnological Landscapes for TomorrowrsquosCultural Economy (DigiCULT)rsquo that was initiatedby the European Commission DG InformationSociety (Unit D2 Cultural Heritage Applications)in 2000 and completed in 2001
Copies of the DigiCULT Full Report andExecutive Summary can be downloaded or orderedat httpwwwdigicultinfo
For further information on DigiCULT pleasecontact the team of the project co-ordinator
Mr Guntram GeserguntramgesersalzburgresearchatPhone +43-(0)662-2288-303
Mr John Pereira johnpereirasalzburgresearchatPhone +43-(0)662-2288-247
Salzburg Research ForschungsgesellschaftJakob-Haringer-Str 5IIIA - 5020 Salzburg Austria Phone +43-(0)662-2288-200Fax +43-(0)662-2288-222httpwwwsalzburgresearchat
Project PartnerHATII - Humanities Advanced Technology andInformation InstituteUniversity of Glasgow httpwwwhatiiartsglaacukContact Mr Seamus Ross srosshatiiartsglaacuk
The members of the Steering Committeeof DigiCULT arePhilippe Avenier Ministegravere de la culture et de lacommunication FrancePaolo BuonoraArchivio di Stato di Roma ItalyCostis Dallas Critical Publics SA GreeceBert Degenhart-DrenthADLIB Information SystemsBVThe NetherlandsPaul Fiander BBC Information amp ArchivesUnited KingdomPeter Holm Lindgaard Library Manager DenmarkErich J Neuhold Fraunhofer IPSI GermanyBruce Royan Concurrent ComputingUnited Kingdom
DIGICULT PROJECT INFORMATION
DigiCULT Thematic Issue 1 - Integrity andAuthenticity of Digital Cultural Heritage Objectsbuilds on the first DigiCULT Forum held inBarcelona on May 6th 2002 in the context of theDLM-Conference 2002
DigiCULT Thematic Issue 2 ndash Digital AssetManagement Systems for the Cultural and ScientificHeritage Sector builds on the second DigiCULTForum held in Essen Germany on September 3rd2002 in the context of the AIIM Conference DMS EXPO
DigiCULT Thematic Issue 3 - Towards a SemanticWeb for Heritage Resources builds on the thirdDigiCULT Forum held on January 21st 2003 atFraunhofer IPSI Darmstadt Germany
DigiCULT Thematic Issue 4 will follow thefourth DigiCULT Forum on Learning Objectsthat will take place at the Koninklijke Bibliotheek -National Library of the NetherlandsThe Hagueon July 2nd 2003
IMPRINT
This Thematic Issue is a product of the DigiCULTProject (IST-2001-34898)
AuthorsGuntram Geser Salzburg ResearchJoost van Kasteren JournalistSeamus Ross University of Glasgow HATII Michael Steemson Caldeson Consultancy
ImagesImages for this Thematic Issue have been providedby and are reproduced with kind permission of theKoninklijke Bibliotheek ndash National Library of theNetherlandsThe Hague Netherlands
Graphics amp LayoutJan Steindl Salzburg Research
ISBN 3-902448-00-8Printed in Austriacopy 2003
DigiCULT 43
44 DigiCULT
IMAGES
Augustine La Citeacute de Dieu (Book 1-10)Paris c 1400-1410Volume IImage on p 5 from Fol 264r size 80x75illuminator Orosius Master ao
Bible Historiale Paris c 1320-1340Volume IImage on p 7 from Fol 2r size 45x50illum lsquoSub-Fauvelrsquo Master
Historic Bible Utrecht c 1430Volume IImages on pp 8 10 14 28from Fol 3r 3v 4v 7v size 55x85 to 60x85illumAlexander Master oa
Psalter Breviary of St Bridget Den BoschMonastery Marienwater Bridgettines 1468Images on pp 11 13Wednesday Matins Invitatoriumfrom Fol 219r
Jacob van Maerlant Der Naturen BloemeFlanders c 1350Images on pp 15 (Cerilius) 17 (Fastaleon)18 (Draco) 19 (Zitiron)from Fol 104rb1 106rb2 124r 111rasize 40x55 to 50x55
Jacob van Maerlant Spieghel HistoriaelWest Flanders c 1325-1335Image on p 21 from Fol 4va1size 45x55
Lambert of St Omer Liber FloridusLille and Ninove 1460Images of Signs of the Zodiacon pp 22 23 24 38 39 40 41
Psalter Normandy c 1180Image on p 26 from Fol 3vsize 160x125
Breviary Cambray() c 1275-1300Images on pp 27 29 31 33 34 35 37from Fol 211r size 205x135
copyKoninklijke BibliotheekThe HagueUsed with permission
Towards a Semantic Web for
Heritage Resources
Thematic Issue 3 May 2003
DigiCULT Consortium
wwwdigicultinfo ISBN 3-902448-00-8
Faculty of Architecture and at the School ofArchaeology and at the University of Basilicatawhere he lectures in the Faculty of CulturalHeritage He is also director of the Laboratoryfor Virtual Archaeology and Digital Culture in thePrato campus of the University of Florence He isa member of several professional associations and isthe Italian representative on the internationalSteering Committee of CAA the association forComputer Applications to Archaeology His interestsconcern virtual archaeology and the use of multi-media in cultural heritage communication thesubjects of the two most recent books edited by himHis present research deals with virtual reality modelsthe use of XML for archaeological and historic datamanagement and in general the impact of infor-matics on archaeological theory and method Hehas been a member of the scientific committees ofthe most recent international conferences on thesesubjects and will chair the 2004 CAA InternationalConference See alsohttpwwwgeogportacukhist-boundpeopleniccoluccihtm
Seamus Ross HATIIUniversity of Glasgow UKDr Seamus Ross is Director of Glasgow UniversityrsquosHumanities Advanced Technology and InformationInstitute (HATII) He is also Director of ERPANET(Electronic Resource Preservation and Network)(IST-2001-32706) a European Union fundedaccompanying measure to enhance the preservationof cultural heritage and scientific digital objectsPreviously he was Assistant Secretary for InformationTechnology at the British Academy and before thatworked for a company specialising in expert systemsand software development as a software engineer andthen in management He researches lectures andpublishes widely on information technology anddigital preservation Dr Ross acts as ICT advisor tothe Heritage Lottery Fund and is a monitor for anumber of large ICT-based projects in the UK He isa member of a number of international organisationsincluding the DLM-Monitoring Committee of the
European Commission the Research LibrariesGrouprsquos PRESERV Working Group on PreservationIssues of Metadata and InterPARES (as well as Co-Chair of its European Team)E-mail SRosshatiiartsglaacuk
Andrea Scotti, Institute & Museum for the History of Science, Italy
Andrea Scotti graduated from the University of Bologna, Department of Philosophy, in 1983. From 1985 to 1995 he carried out several research projects involving the cataloguing of scientific manuscripts in Israel (Hebrew University of Jerusalem), Czechoslovakia (Karol University of Prague), Hungary (Szecheny National Library of Budapest) and Germany (Institut für Geschichte der Naturwissenschaften, Munich). His numerous research activities concentrate on software and programming for library databases.
Since 1996, Scotti has been Director of the General Digital Catalogue of the Scientific Manuscripts located at the Central National Library in Florence, developed in co-operation with the Istituto e Museo di Storia della Scienza and the National Central Library, and under the auspices of the Italian Ministry for Culture. He is in charge of the work the Institute and Museum of the History of Science carries out for the MESMUSES project (01/02/01-30/07/03), funded by the European Commission under the Information Society Technologies (IST) Programme. The project aims at designing and experimenting with metaphors for organising, structuring and presenting the scientific and technical knowledge offered to the public, implementing Semantic Web technologies.
See: http://cweb.inria.fr/Projects/Mesmuses
E-mail: scottian@galileo.imss.firenze.it
DIGICULT PROJECT INFORMATION
DigiCULT is an IST Support Measure (IST-2001-34898) to establish a regular technology watch that monitors and analyses technological developments relevant to and in the cultural and scientific heritage sector over the period of 30 months (03/2002-08/2004).
In order to encourage early take-up, DigiCULT produces seven Thematic Issues and three Technology Watch Reports, along with the newsletter DigiCULT.Info.
DigiCULT draws on the results of the strategic study 'Technological Landscapes for Tomorrow's Cultural Economy (DigiCULT)', which was initiated by the European Commission, DG Information Society (Unit D2: Cultural Heritage Applications), in 2000 and completed in 2001.
Copies of the DigiCULT Full Report and Executive Summary can be downloaded or ordered at http://www.digicult.info.
For further information on DigiCULT, please contact the team of the project co-ordinator:
Mr Guntram Geser, guntram.geser@salzburgresearch.at, Phone: +43-(0)662-2288-303
Mr John Pereira, john.pereira@salzburgresearch.at, Phone: +43-(0)662-2288-247
Salzburg Research Forschungsgesellschaft
Jakob-Haringer-Str. 5/III, A - 5020 Salzburg, Austria
Phone: +43-(0)662-2288-200
Fax: +43-(0)662-2288-222
http://www.salzburgresearch.at
Project Partner:
HATII - Humanities Advanced Technology and Information Institute, University of Glasgow
http://www.hatii.arts.gla.ac.uk
Contact: Mr Seamus Ross, s.ross@hatii.arts.gla.ac.uk
The members of the Steering Committee of DigiCULT are:
Philippe Avenier, Ministère de la culture et de la communication, France
Paolo Buonora, Archivio di Stato di Roma, Italy
Costis Dallas, Critical Publics SA, Greece
Bert Degenhart-Drenth, ADLIB Information Systems BV, The Netherlands
Paul Fiander, BBC Information & Archives, United Kingdom
Peter Holm Lindgaard, Library Manager, Denmark
Erich J. Neuhold, Fraunhofer IPSI, Germany
Bruce Royan, Concurrent Computing, United Kingdom
DigiCULT Thematic Issue 1 - Integrity and Authenticity of Digital Cultural Heritage Objects builds on the first DigiCULT Forum, held in Barcelona on May 6th, 2002, in the context of the DLM-Conference 2002.
DigiCULT Thematic Issue 2 - Digital Asset Management Systems for the Cultural and Scientific Heritage Sector builds on the second DigiCULT Forum, held in Essen, Germany, on September 3rd, 2002, in the context of the AIIM Conference @ DMS EXPO.
DigiCULT Thematic Issue 3 - Towards a Semantic Web for Heritage Resources builds on the third DigiCULT Forum, held on January 21st, 2003, at Fraunhofer IPSI, Darmstadt, Germany.
DigiCULT Thematic Issue 4 will follow the fourth DigiCULT Forum on Learning Objects, which will take place at the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, on July 2nd, 2003.
IMPRINT
This Thematic Issue is a product of the DigiCULT Project (IST-2001-34898).
Authors:
Guntram Geser, Salzburg Research
Joost van Kasteren, Journalist
Seamus Ross, University of Glasgow, HATII
Michael Steemson, Caldeson Consultancy
Images:
Images for this Thematic Issue have been provided by, and are reproduced with kind permission of, the Koninklijke Bibliotheek - National Library of the Netherlands, The Hague, Netherlands.
Graphics & Layout:
Jan Steindl, Salzburg Research
ISBN 3-902448-00-8
Printed in Austria
© 2003
IMAGES
Augustine, La Cité de Dieu (Book 1-10), Paris, c. 1400-1410, Volume I. Image on p. 5, from Fol. 264r; size 80x75; illuminator: Orosius Master a.o.
Bible Historiale, Paris, c. 1320-1340, Volume I. Image on p. 7, from Fol. 2r; size 45x50; illum.: 'Sub-Fauvel' Master
Historic Bible, Utrecht, c. 1430, Volume I. Images on pp. 8, 10, 14, 28, from Fol. 3r, 3v, 4v, 7v; size 55x85 to 60x85; illum.: Alexander Master o.a.
Psalter Breviary of St Bridget, Den Bosch, Monastery Marienwater, Bridgettines, 1468. Images on pp. 11, 13, Wednesday Matins, Invitatorium, from Fol. 219r
Jacob van Maerlant, Der Naturen Bloeme, Flanders, c. 1350. Images on pp. 15 (Cerilius), 17 (Fastaleon), 18 (Draco), 19 (Zitiron), from Fol. 104rb1, 106rb2, 124r, 111ra; size 40x55 to 50x55
Jacob van Maerlant, Spieghel Historiael, West Flanders, c. 1325-1335. Image on p. 21, from Fol. 4va1; size 45x55
Lambert of St Omer, Liber Floridus, Lille and Ninove, 1460. Images of Signs of the Zodiac on pp. 22, 23, 24, 38, 39, 40, 41
Psalter, Normandy, c. 1180. Image on p. 26, from Fol. 3v; size 160x125
Breviary, Cambray(?), c. 1275-1300. Images on pp. 27, 29, 31, 33, 34, 35, 37, from Fol. 211r; size 205x135
© Koninklijke Bibliotheek, The Hague. Used with permission.