Towards a semantic web
Philip Hider
This talk
The Semantic Web vision
Scenarios
Standards
Semantic Web & RDA
Web 1.0, 2.0, 3.0
Internet to WWW (Web 1.0)
Web 1.0 allows people to navigate the Internet easily, through hyperlinks
Web 2.0 allows people to collaborate more on the Web
Web 3.0 allows computers to find and use the data contained in Web documents
Web 3.0 = the Semantic Web vision
The Semantic Web vision
It will allow computers to make sense of the content of Web documents, so that they can find and use this data independently
Basis of SW already developed, with standards such as XML and RDF
Like Web 1.0, it represents a bottom-up, distributed approach
How would it work?
Computers would be able to identify and ‘understand’ particular data in a Web document according to the metadata associated with that data
Metadata could be inside or outside the document
Computers (agents) would then be able to relate that data to other data in other documents (or the same document) according to specified schemas, ontologies and rules
They could then independently integrate data and process information according to tasks set by their human users
A Semantic Web scenario
User asks ‘Trip Agent’ to purchase the ‘best’ deal for a trip to New Zealand with date range x, family members y, time of day z, etc.
‘Trip agent’ searches the Web for flights and accommodation, and is able to look up databases and specify conditions according to what it ‘knows’ about user’s preferences
Semantic Web scenario
Agent is able to ‘understand’ the deals available on different websites by integrating data from different sources, e.g. looking up geographic information systems (how far from the sea, shops, etc.), weather forecasts, family members’ calendars, etc., and ultimately suggesting the optimal combination of flight, hotel, tours, etc.
Another scenario
User asks if the latest Stephen King book is available in a nearby library, can’t remember what it’s called
‘Library Agent’ searches the Web for nearby libraries with books by ‘Stephen King’, finds a few different Stephen Kings, confirms with user which Stephen King, then identifies the latest novel via the official Stephen King website, but chooses the second-nearest library (by car) which holds it because of availability/format/library opening hours, etc.
What do SW agents need?
Information about the data, i.e. metadata, in a machine-readable format
Including a shared understanding of the structure of that metadata and its relationship to other knowledge structures (ontologies)
Some clever programming
Standards for the Semantic Web
Resource Description Framework
Uniform Resource Identifiers
XML
Unicode
Schemas (such as XML schemas)
Ontologies written in e.g. OWL
Rules written in RIF, etc.
SPARQL
Resource Description Framework
W3C standard
A model used to structure resource descriptions
Can be used to structure data about any kind of resource: could be a book, or a car, or a flight ticket, or an experiment, etc.
Based on ‘triples’, i.e.
Resource – Property – Value
(Subject – Predicate – Object)
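The triple model can be sketched in a few lines of Python. This is only an illustration using plain tuples rather than an RDF library; the resource URI (`http://example.org/books/1`) and the literal values are made-up assumptions, and the helper `values_of` is a hypothetical name introduced here.

```python
from typing import List, NamedTuple


class Triple(NamedTuple):
    subject: str    # the resource being described
    predicate: str  # the property
    obj: str        # the value


# Dublin Core element namespace, used here to name the properties.
DC = "http://purl.org/dc/elements/1.1/"

# Hypothetical resource URI, for illustration only.
BOOK = "http://example.org/books/1"

triples = [
    Triple(BOOK, DC + "title", "An Example Book"),
    Triple(BOOK, DC + "creator", "A. N. Author"),
    Triple(BOOK, DC + "subject", "Apples"),
]


def values_of(graph: List[Triple], subject: str, predicate: str) -> List[str]:
    """Return every value recorded for one property of one resource."""
    return [t.obj for t in graph
            if t.subject == subject and t.predicate == predicate]


print(values_of(triples, BOOK, DC + "title"))
```

Because every statement has the same three-part shape, descriptions of a book, a car, or a flight ticket can all sit in the same list and be queried the same way.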
Uniform Resource Identifiers
For example, URLs
And ISBNs
People don’t have them yet
OCLC working on ‘work identifiers’
Properties and some values are referenced as part of particular schemas, ontologies, etc.
eXtensible Markup Language (XML)
Another W3C standard
More flexible than HTML, XHTML
Can be used to encode any data
Data can be in the same Web document or another document
Can be used to express RDF, i.e. RDF/XML
RDF/XML basis for metadata structures such as schemas and ontologies
Schemas
Standardised structures of resource description that define property elements in a taxonomic way
Mostly based on a particular domain, e.g. pertaining to bibliographic data, or geospatial data, or flight booking data, or used car data, etc.
Schemas
Two main groups of schemas: XML schemas and RDFS (RDF schemas)
Superseding Document Type Definitions (DTDs)
Specific well-known schemas include
Dublin Core
ONIX
RSS
Some metadata encoded in RDF/XML
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://en.wikipedia.org/wiki/Tony_Benn">
    <dc:title>Tony Benn</dc:title>
    <dc:publisher>Wikipedia</dc:publisher>
    <foaf:primaryTopic>
      <foaf:Person>
        <foaf:name>Tony Benn</foaf:name>
      </foaf:Person>
    </foaf:primaryTopic>
  </rdf:Description>
</rdf:RDF>
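To show how a program might read such a record, here is a minimal sketch that pulls the statements out of the Tony Benn example with Python's standard `xml.etree.ElementTree` module. This is an assumption-laden shortcut: ElementTree is a generic XML parser, not an RDF parser, so the code only handles this particular document shape. A real agent would more likely use a dedicated RDF library.

```python
import xml.etree.ElementTree as ET

# The RDF/XML record from the slide, reproduced as a string.
RDF_XML = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://en.wikipedia.org/wiki/Tony_Benn">
    <dc:title>Tony Benn</dc:title>
    <dc:publisher>Wikipedia</dc:publisher>
    <foaf:primaryTopic>
      <foaf:Person>
        <foaf:name>Tony Benn</foaf:name>
      </foaf:Person>
    </foaf:primaryTopic>
  </rdf:Description>
</rdf:RDF>"""

# Prefix-to-namespace map, matching the xmlns declarations above.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

root = ET.fromstring(RDF_XML)
description = root.find("rdf:Description", NS)

# The subject of the statements is the rdf:about attribute.
subject = description.get("{%s}about" % NS["rdf"])

# Each child element is a property; its text is the value.
title = description.findtext("dc:title", namespaces=NS)
publisher = description.findtext("dc:publisher", namespaces=NS)
topic_name = description.findtext(
    "foaf:primaryTopic/foaf:Person/foaf:name", namespaces=NS
)

print(subject, title, publisher, topic_name)
```

The namespace URIs are what let a program tell that `dc:title` here means the Dublin Core title element rather than any element that happens to be called "title".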
Ontologies
More sophisticated than schemas, formalising more complex relationships between elements
Also usually domain-specific
Use extra languages, such as OWL, on top of RDF/XML etc.
Ontologies give more scope for agents to be ‘clever’
Dublin Core can be expressed as an ontology or a schema
What about MARC?
MARC files are rather flat and do not readily define relationships between elements
But can be expressed as an XML schema, i.e. MARCXML
MODS is a lite version of MARCXML
Mappings between MARCXML and other schemas (e.g. DC)
Mappings
Lots of them!
Between different schemas, ontologies, languages, etc.
AKA crosswalks
By UKOLN, LC, OCLC, etc.
The more standards and adaptations, the more crosswalks
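At its simplest, a crosswalk is a lookup table from the elements of one schema to those of another. The sketch below is a deliberately simplified, illustrative MARC-to-Dublin-Core mapping covering four fields; it is not a published crosswalk, and the record data and the `crosswalk` function name are assumptions made for the example.

```python
# Simplified, illustrative mapping from MARC field (plus subfield where
# needed) to unqualified Dublin Core element. Real crosswalks cover many
# more fields and conditions.
MARC_TO_DC = {
    "245": "title",
    "260b": "publisher",
    "650": "subject",
    "020": "identifier",
}


def crosswalk(marc_record):
    """Convert a list of (field, value) pairs into a dict of DC elements."""
    dc_record = {}
    for field, value in marc_record:
        element = MARC_TO_DC.get(field)
        if element is None:
            continue  # no mapping defined, so this data is dropped
        dc_record.setdefault(element, []).append(value)
    return dc_record


# Hypothetical record, for illustration only.
marc_record = [
    ("245", "An Example Book"),
    ("260b", "Example Press"),
    ("650", "Apples"),
    ("650", "Orchards"),
    ("300", "200 p."),  # physical description: not in the mapping above
]

dc_record = crosswalk(marc_record)
print(dc_record)
```

The unmapped 300 field shows the usual cost of a crosswalk: whatever the target schema has no element for is lost in translation.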
Value sets
Resource – Property – Value
Schemas and ontologies may point to particular value sets, e.g.
Book A – has a subject called – Apples (DCterms:LCSH)
where Apples is a value in the set of values known as LCSH
In other words, they may point to controlled vocabularies
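A rough sketch of what pointing to a value set means in practice: the property names a vocabulary, and a value is acceptable only if it belongs to that vocabulary. The three-term set below is a tiny stand-in rather than an extract of LCSH, and `VALUE_SETS` and `is_valid` are hypothetical names for this example.

```python
# Tiny stand-in for a controlled vocabulary; the real LCSH is far larger.
LCSH_SAMPLE = frozenset({"Apples", "Orchards", "Fruit"})

# Each encoding scheme label points at the set of values it allows.
VALUE_SETS = {"DCterms:LCSH": LCSH_SAMPLE}


def is_valid(scheme: str, value: str) -> bool:
    """True if the value belongs to the vocabulary named by the scheme."""
    vocabulary = VALUE_SETS.get(scheme)
    return vocabulary is not None and value in vocabulary


print(is_valid("DCterms:LCSH", "Apples"))  # a term in the sample set
print(is_valid("DCterms:LCSH", "Appels"))  # a misspelling, so not in the set
```

An agent that knows which vocabulary a value comes from can match subjects across catalogues by term rather than by guessing from free text.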
SKOS
Simple Knowledge Organization System
SW standard for expressing controlled vocabularies such as subject thesauri
http://www.w3.org/2004/02/skos
Might promote use of LCSH, etc.
Semantic Web & cataloguing
More sophisticated use of library catalogues if they can be understood by Semantic Web agents
Library resources more likely to be used in conjunction with non-library web resources
SW about agents using cataloguing, not replacing cataloguing
Semantic Web & RDA
RDA is therefore aligning itself with DC and RDF
RDA elements mapped to DC, ONIX, etc.
DCMI/RDA Task Group
RDA-DC application profile
http://dublincore.org/dcmirdataskgroup
Prospects for SW
Examples of Semantic Web developments:
http://www.w3.org/2001/sw/sweo/public/UseCases
A lot of standards now in place, technology not so much of an issue
With RDA, bibliographic domain ripe for SW take-up
Pre-SW library work
Post-SW library work
Thank you.