semantic web
Presented to GASP - The Portuguese Group of Software Architects (November 2008)
Semantic Web: Architecture and Technology
António Cruz, Software Architect
Agenda
• Vision
• Architecture
• Technology
Vision
“I have a dream for the Web [in which computers] become capable of analyzing all the data on the Web – the content, links, and transactions between people and computers. A Semantic Web, which should make this possible, has yet to emerge, but when it does, the day-to-day mechanisms of trade, bureaucracy and our daily lives will be handled by machines talking to machines. The intelligent agents people have touted for ages will finally materialize”.
Tim Berners-Lee, 1999
What is Semantics?
• Semantics is the study of meaning in communication (Wikipedia).
What is the Semantic Web?
• It is an evolving extension of the World Wide Web in which web content can be expressed not only in natural language, but also in a form that can be understood, interpreted and used by software agents.
• The agents find each other and interchange and interpret data through ontologies.
• Some elements of the semantic web vision are yet to be implemented or realized.
What is an Ontology?
• It is the study of the nature of being, existence, or reality in general and of its basic categories and their relations, with particular emphasis on determining what entities exist or can be said to exist, and how these can be grouped and related within an ontology (typically, a hierarchy subdivided according to similarities and differences).
“Anyone can say anything about anything”
• The main power of Semantic Web languages is that anyone can create one, simply by publishing some RDF that describes a set of URIs, what they do, and how they should be used.
• The power you get from publishing your information in RDF is that, once it is published in the public domain, it can be repurposed (used for other things) much more easily. It's do-it-yourself data management.
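A minimal sketch of the "anyone can say anything about anything" idea, assuming RDF statements are modeled as plain (subject, predicate, object) tuples. The example.org URIs and the two publishers are hypothetical; the FOAF property URIs are the only real vocabulary used. It shows that statements published independently about the same URI can be merged with a simple set union.

```python
# Illustration of the RDF data model only; this is not an RDF library.
# Each statement is a (subject, predicate, object) tuple.

ALICE = "http://example.org/people/alice"  # hypothetical URI
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"
FOAF_KNOWS = "http://xmlns.com/foaf/0.1/knows"

# Two hypothetical publishers describe the same URI independently.
publisher_a = {
    (ALICE, FOAF_NAME, "Alice"),
}
publisher_b = {
    (ALICE, FOAF_KNOWS, "http://example.org/people/bob"),
    (ALICE, FOAF_NAME, "Alice"),  # overlaps with publisher_a
}


def merge(*graphs):
    """Union several sets of triples; duplicate statements collapse."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged


def describe(graph, subject):
    """Return the sorted (predicate, object) pairs stated about a subject."""
    return sorted((p, o) for s, p, o in graph if s == subject)


merged = merge(publisher_a, publisher_b)
for predicate, obj in describe(merged, ALICE):
    print(predicate, "->", obj)
```

Because every statement names its subject by URI, reuse needs no schema negotiation between the two publishers, which is the convenience the slide refers to.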
Why use RDF?
• You should use RDF because if you do, then we can reuse your data much more conveniently.
or
• The point is that you incur costs now so that others can benefit later.
Architecture
Top-Down & Bottom-Up
• Top-down: we start from the top and work our way down, using natural language processors to read existing Web documents and extract semantic metadata.
• Bottom-up: we start from the bottom and work our way to the top by using a method like embedding RDF into Web documents to supply user agents with metadata. We are already seeing this type of action being taken by bloggers and other content creators.
Pros & Cons
• “Achieving powerful reasoning with reasonable complexity” is the ultimate goal.
• Browse versus search is a radical increase in the trust we put in link infrastructure, and in the degree of power derived from that link structure. Browse says the people making the ontology, the people doing the categorization, have the responsibility to organize the world in advance.
• The search paradigm says the reverse. It says nobody gets to tell you in advance what it is you need. Search says that, at the moment that you are looking for it, we will do our best to service it based on this link structure, because we believe we can build a world where we don't need the hierarchy to coexist with the link structure.
Semantic Web vs. semantic web
Philosophy
• Semantic Web: Build a common data format for expressing the meaning of data. Use ontologies to help machines understand web content.
• semantic web: Humans first, machines second. Encode existing Web content with special tags.
Language
• Semantic Web: RDF, RDFS, OWL
• semantic web: Based on XHTML tags: microformats
Format
• Semantic Web: Must be well-formed RDF documents
• semantic web: Anything goes, as long as it's XHTML
Semantics
• Semantic Web: Defined by the underlying ontology model (e.g., OWL)
• semantic web: Loosely defined. No formal semantic model.
Examples
• Semantic Web: FOAF, OWL-S, OWL-Time
• semantic web: XFN (social network), hCard (contact), hReview (opinions), rel-tag (tagging)
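A rough sketch of how the lowercase "semantic web" approach can be consumed, assuming the metadata lives in the rel attribute of ordinary links, as XFN and rel-tag do. The sample markup, the example.org URLs, and the names RelLinkParser and extract_rel_links are made up for illustration; only Python's standard html.parser module is used.

```python
from html.parser import HTMLParser


class RelLinkParser(HTMLParser):
    """Collect (rel value, href) pairs from <a> elements."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        rel = attributes.get("rel")
        if href and rel:
            # rel may hold several space-separated values, e.g. "friend met".
            for value in rel.split():
                self.links.append((value, href))


# Hypothetical blog snippet: one XFN link and one rel-tag link.
SAMPLE = """
<p>Posted by <a href="http://example.org/alice" rel="friend met">Alice</a>
in <a href="http://example.org/tags/semweb" rel="tag">semweb</a>.</p>
"""


def extract_rel_links(markup):
    """Parse markup and return every (rel value, href) pair found."""
    parser = RelLinkParser()
    parser.feed(markup)
    parser.close()
    return parser.links


links = extract_rel_links(SAMPLE)
for rel_value, href in links:
    print(rel_value, href)
```

Nothing here relies on a formal semantic model: the meaning of "friend" or "tag" is a convention shared by publishers and tools, which matches the "loosely defined" entry in the comparison.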
Semantic Web Stack
Knowledge Reference Model
Trust and Proof
• Applications on the Semantic Web will depend on context generally to let people know whether or not they trust the data.
• What happens when there is a party that we know, but we don't know how to verify that certain RDF data came from them? That's where digital signatures come in.
The Linking Open Data dataset cloud
(in WSRI - http://webscience.org/)
Technology
RDF Sample
It’s All About The Context
• US citizens are people
• The First Amendment covers the rights of US citizens
• Nike is protected by the First Amendment
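One way to see why context matters is to let a naive rule engine chain these statements. The sketch below is an assumption made for illustration: the slide only lists the three statements, and the rule "whoever the First Amendment protects is a US citizen" is added here on purpose as the kind of context-free leap a machine could make. All identifiers are hypothetical.

```python
# Facts as (subject, predicate, object) tuples, paraphrasing the slide.
facts = {
    ("USCitizen", "subClassOf", "Person"),
    ("Nike", "protectedBy", "FirstAmendment"),
}


def apply_rules(known):
    """Forward-chain two rules until no new statements appear."""
    derived = set(known)
    changed = True
    while changed:
        changed = False
        new = set()
        for s, p, o in derived:
            # Assumed rule (not on the slide): being protected by the
            # First Amendment is taken to imply being a US citizen.
            if p == "protectedBy" and o == "FirstAmendment":
                new.add((s, "type", "USCitizen"))
            # Standard subclass rule: members of a class belong to
            # its superclasses too.
            if p == "type":
                for s2, p2, o2 in derived:
                    if p2 == "subClassOf" and s2 == o:
                        new.add((s, "type", o2))
        if not new <= derived:
            derived |= new
            changed = True
    return derived


closure = apply_rules(facts)
print(("Nike", "type", "Person") in closure)
```

The engine concludes that Nike is a person, and nothing in the data says in what sense, if any, that conclusion should be read.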
The value of Syllogisms
• The people working on the Semantic Web greatly overestimate the value of deductive reasoning (a persistent theme in Artificial Intelligence projects generally.) The great popularizer of this error was Arthur Conan Doyle, whose Sherlock Holmes stories have done more damage to people's understanding of human intelligence than anyone other than Rene Descartes. Doyle has convinced generations of readers that what seriously smart people do when they think is to arrive at inevitable conclusions by linking antecedent facts. As Holmes famously put it "when you have eliminated the impossible, whatever remains, however improbable, must be the truth."
Metadata
• Metadata describes a worldview.
• There are different kinds of worlds.
• Not every world is mappable onto another.
A Simple SPARQL Query
• Data:
<http://example.org/book/book1> <http://purl.org/dc/elements/1.1/title> "SPARQL Tutorial" .
• Query:
SELECT ?title
WHERE { <http://example.org/book/book1> <http://purl.org/dc/elements/1.1/title> ?title . }
• Result: "SPARQL Tutorial"
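To make the matching step concrete, here is a toy sketch in plain Python, assuming triples are tuples of strings and any pattern term starting with "?" is a variable. It is not a SPARQL engine and does not distinguish URIs from literals; the helper names is_var and match_pattern are hypothetical.

```python
DC_TITLE = "http://purl.org/dc/elements/1.1/title"
BOOK1 = "http://example.org/book/book1"

# The single triple from the slide's data.
data = [(BOOK1, DC_TITLE, "SPARQL Tutorial")]


def is_var(term):
    """Pattern terms starting with '?' are treated as variables."""
    return term.startswith("?")


def match_pattern(triples, pattern):
    """Yield one {variable: value} binding per triple matching the pattern."""
    for triple in triples:
        binding = {}
        for term, value in zip(pattern, triple):
            if is_var(term):
                # A variable used twice must bind to the same value.
                if binding.get(term, value) != value:
                    break
                binding[term] = value
            elif term != value:
                break
        else:
            yield binding


# Equivalent of: SELECT ?title WHERE { <book1> <dc:title> ?title . }
results = list(match_pattern(data, (BOOK1, DC_TITLE, "?title")))
for row in results:
    print(row["?title"])
```

The query is simply the data triple with one position replaced by a variable, so answering it amounts to finding the triples that agree on the fixed positions.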
Demo
Top-Bottom
Q&A