the semantic web

44
1 cs236607

Upload: jerome-chapman

Post on 30-Dec-2015

26 views

Category:

Documents


1 download

DESCRIPTION

The Semantic Web. The Need. “Most of the Web's content today is designed for humans to read, not for computer programs to manipulate meaningfully.” Berners-Lee, T, Hendler , J & Lassila , O ‘The semantic web’, Scientific American , May 2001. Semantic Processing. - PowerPoint PPT Presentation

TRANSCRIPT

1cs236607

The Need

“Most of the Web's content today is designed for humans to read, not for computer programs to manipulate meaningfully.”

Berners-Lee, T, Hendler, J & Lassila, O ‘The semantic web’, Scientific American, May 2001

2cs236607

Semantic ProcessingWe want to be able to pose complex search

tasks that use the semantics of pieces of information, e.g.,

I want to purchase a DVDof “Dore the Explorer” at a price lower than 10$.Is such a CD available atamazon.com?

3cs236607

Current search agents are not suitable for such task

4cs236607

Current Solution

Use “intelligent” agents

The Semantic-Web Approach

Content is machine-understandable by being bound to some formal description of itself (i.e. metadata)

5cs236607

GoalsWeb of data - provides common data

representation framework to facilitate integrating multiple sources to draw new conclusions

Increase the utility of information by connecting it to its definitions and to its context

More efficient information access and analysis

cs236607 6

ApplicationsAgents that search the Web and retrieve

valuable information to the end userWeb services that publish their informationPrograms that try to integrate data of

different web services and to produce new results or draw new conclusions from the integrated data

cs236607 7

Ontologies & Inference Engines

“For the semantic web to function, computers must have access to structured collections of information and sets of inference rules that they can use to conduct automated reasoning.”

Berners-Lee, T, Hendler, J & Lassila, O ‘The semantic web’, Scientific American, May 2001

cs236607 8

The Four Building Blocks

1. XML2. RDF3. Ontologies4. Agents

cs236607 9

XML

“XML allows users to add arbitrary structure to their documents but says nothing about what the structures mean”

cs236607 10

RDF –Resource Description FrameworkMeaning encoded in sets of ‘triples’: entities

have properties which have valuesEntities, properties and values all have

distinct URIs

cs236607 11

“imagine that we have access to a variety of databases with information about people, including their addresses. If we want to find people living in a specific zip code, we need to know which fields in each database represent names and which represent zip codes. RDF can specify that "(field 5 in database A) (is a field of type) (zip code)," using URIs rather than phrases for each term.” Berners-Lee, T, Hendler, J & Lassila, O ‘The semantic web’, Scientific American, May 2001a

OntologiesDatabase A and Database B may use

different fields to contain ‘zip code’Ontologies sort this outOntology = ‘a document or file that

formally defines the relations among terms’

Ontologies for the web normally haveA taxonomyA set of inference rules

cs236607 12

Agents

“Agent based computing appears to be the appropriate paradigm to work in a complex world with multiple ontologies, fragments and multiple inferencing engines.”

Stork, Hans-Georg and Mastroddi, Franco, Semantic Web Technologies - a New Action Line in the European Commission’s IST Programme, 2001

cs236607 13

The Power of Agents - Integration“The real power of the Semantic Web will be realized when people create many programs that collect Web content from diverse sources, process the information and exchange the results with other programs. The effectiveness of such software agents will increase exponentially as more machine-readable Web content and automated services (including other agents) become available.”

Berners-Lee, T, Hendler, J & Lassila, O ‘The semantic web’, Scientific American, May 2001

cs236607 14

‘Ambient Intelligence’

“In the next step, the Semantic Web will break out of the virtual realm and extend into our physical world. URIs can point to anything, including physical entities, which means we can use the RDF language to describe devices such as cell phones and TVs.”Berners-Lee, T, Hendler, J & Lassila, O ‘The semantic web’, Scientific American, May 2001

cs236607 15

Resource Description Framework

cs236607 16

17

What is RDF?A part of the semantic-Web activityRDF is a general-purpose language for

representing information on the WebSpecifically, objects and relationships

Designed to allow computer applications to process data based on its semanticsRather than displaying data to humans (as

opposed to RSS)

An RDF document is actually a labeled graph that is represented in XMLThe specific language is called RDF/XMLW3C recommendation (Feb. 2004)

RDF Data Consists of TripletsRDF data is a set of statementsEach statement is a triplet (Resource,

Property, Value)Sometimes we refer to a triplet using the

terminology of (Subject, Predicate, Object)

cs236607 18

The author of http://www.cs.technion.ac.il/kanza/myPage

.html is Yaron Kanza

Resource (Subject): myPage.htmlProperty (Predicate): author Value (Object): Yaron Kanza

19

RDF Data

Subject Objectpredicate

The basic element: Triple (labeled edge)

Person#845

#1002

address

postalCode

6941Haifa

city

Herzel

street

RDF document: edge-labeled graph

Statement

20

page.html

John Smith

John’s Home Page

DC:Creator

DC:Title

<?xml version=“1.0”?><rdf:RDF

xmlns:rdf=“http://www.w3.org/TR/WD-rdf-syntax#”xmlns:dc=“http://purl.org/metadata/dublin_core#”>

<rdf:Description about=“page.html”> <dc:Creator>John Smith</dc:Creator> <dc:Title>John’s Home Page</dc:Title> </rdf:Description>

</rdf:RDF>

21

page.html

John SmithJohn’s Home Page [email protected]

dc:Title

dc:Creator

Name Email

. . .<Description about=“page.html”> <dc:Creator> <Description> <corp:Name>John Smith</corp:Name> <corp:Email>[email protected]</corp:Email> </Description> </dc:Creator> <dc:Title>John’s Home Page</dc:Title></Description>

</RDF>

ContainersGroups of things: <bag> <seq>

<alt><bag>: unordered list; duplicates

allowed<seq>: ordered list; duplicates

allowed<alt>: list of alternatives; one will

be selectedcs236607 22

23

A set of fifteen basic properties for describing generalized Web resources

The “obvious” mapping of Dublin Core properties into RDF properties has not yet been approved by the Dublin Core initiative, but is generally a good example

Dublin Core

24

“Title”: the name given to the resource“Creator”: the person or organization

primarily responsible for the resource“Subject”: what the resource is about“Description”: a description of the content“Publisher”: the person or organization

responsible for making the resource available“Contributor”: someone who has provided

content to the resource other than the creator“Date”: date of creation or publication

Dublin Core

25

“Type”: type of resource, such as home page, technical report, novel, photograph…

“Format”: data format of the resource“Identifier”: URL, ISBN number, …“Source”: another resource that this resource is derived

from“Language”: the language of the content“Relation”: another resource and its relationship to this

one“Coverage”: the portion of time or space described by

this resource (atlases, histories, etc.)“Rights”: the intellectual property rights adhering to

this resource, or a pointer to them

Dublin Core

26

Containers: bags, sequences, alternativesaboutEach, aboutEachPrefixReification (higher order statements)Namespaces and Vocabularies

Advanced RDF

27

Manually from HTML or “user domain XML”With special assisting tools – like Protégé,

Reggie, DC-dot, RDF for XMLIdeally – with some automated procedure

from HTML/XML documentsCan we use XSLT there?

Creating RDF documents

cs236607 28

RDF SchemaRDF Schema (RDFS) enriches the data model

of RDF, adding vocabulary and associated semantics forClasses and subclassesProperties and sub-propertiesTyping of properties

Support for describing simple ontologiesAdds an object-oriented flavorBut with a logic-oriented approach and using

“open world” semanticscs236607 29

30

Not an XML Schema!A “companion” specification for RDF specClass, Type, subClassOf,domain, rangeMisc: label, comment, isDefinedBy,etc.

RDF Schema

cs236607 31

<rdf:RDFxmlns:rdf= "http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"xml:base= "http://www.animals.fake/animals#">

<rdf:Description rdf:ID="animal"> <rdf:type rdf:resource="http://www.w3.org/2000/01/rdf-schema#Class"/></rdf:Description>

<rdf:Description rdf:ID="horse"> <rdf:type rdf:resource="http://www.w3.org/2000/01/rdf-schema#Class"/> <rdfs:subClassOf rdf:resource="#animal"/></rdf:Description>

</rdf:RDF>Horse is defined as subclass of animal

Example

Example<?xml version="1.0"?><rdf:RDF

xmlns:rdf= "http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#" xml:base= "http://www.animals.fake/animals#"><rdfs:Class rdf:ID="animal" /><rdfs:Class rdf:ID="horse">

<rdfs:subClassOf rdf:resource="#animal"/> </rdfs:Class>

</rdf:RDF>

cs236607 32

Abbreviated version. Works because an RDFS class is an RDF resource.

Use rdfs:Class instead of rdfDescription and drop the rdf:type information

RDF and RDF Schema

<rdf:RDF xmlns:g=“http://schema.org/gen” xmlns:u=“http://schema.org/univ”> <u:Chair rdf:ID=“john”> <g:name>John Smith</g:name> </u:Chair></rdf:RDF>

<rdfs:Property rdf:ID=“name”> <rdfs:domain rdf:resource=“Person”></rdfs:Property>

<rdfs:Class rdf:ID=“Chair”> <rdfs:subclassOf rdf:resource= “http://schema.org/gen#Person”></rdfs:Class>

u:Chair

John Smith

rdf:typeg:name

g:Person

g:name

rdfs:Class rdfs:Property

rdf:typerdf:type

rdf:type

rdfs:subclassOf

rdfs:domain

cs236607 34

RDF Schema is LimitedWe cannot express facts such as

Two classes are disjointBuild a class that is the union of two classesCardinality restrictionScope of propertiesProvide relationships between properties, such

as transitive, unique, inverse

cs236607 35

OWLA Web ontology language that is more

expressive than RDF and RDF SchemaWritten in XML on top of RDFUsing OWL we want to provide exact

descriptions of items and the relationships between them

Basically, built upon Description Logics

cs236607 36

SPARQL Protocol and RDF Query Language

cs236607 37

SPARQLSPARQL =

Query Language + Protocol + XML Results Format

• Access and query RDF graphs

• Product of the RDF Data Access Working Group

• We will only provide some examples and will not go over the entire definition of the language

cs236607 38

39

SPARQL QueryPREFIX dc: <http://purl.org/dc/elements/1.1/>

SELECT ?title2WHERE{ ?doc dc:title "SPARQL at speed" . ?doc dc:creator ?c . ?docOther dc:creator ?c . ?docOther dc:title ?title2

}

• On an abstracts/papers database:“Find other papers by the authors of a given paper.”

40

SPARQL QueryPREFIX dc: <http://purl.org/dc/elements/1.1/>PREFIX foaf: <http://xmlns.com/foaf/0.1/>PREFIX shop: <http://example/shop#>

SELECT ?titleWHERE{ ?doc dc:title ?title . FILTER regex(?title, "SPARQL") . ?doc dc:creator ?c . ?c foaf:name ?name . OPTIONAL { ?doc shop:price ?price }}

• “Find books with ‘SPARQL’ in the title. Get the authors’ name and the price (if available).”

• Multiple vocabularies

41

Inference• An RDF graph may be backed by inference

− OWL, RDFS, application, rules

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>SELECT ?typeWHERE{ ?x rdf:type ?type .}

:x rdf:type :C .:C rdfs:subClassOf :D .

--------| type |========| :C || :D |--------

42

Another ExamplePREFIX dc: <http://purl.org/dc/elements/1.1/>PREFIX ldap: <http://ldap.hp.com/people#>PREFIX foaf:

SELECT ?name ?email{ ?doc dc:title ?title . FILTER regex(?title, “SPARQL”) . ?doc dc:creator ?reseacher . ?researcher ldap:email ?email . ?researcher ldap:name ?name}

• “Find the name and email addresses of authors of a paper about SPARQL”

Linkshttp://www.w3.org/RDF/http://www.w3.org/TR/2004/REC-rdf-primer-

20040210/http://www.w3.org/TR/rdf-schema/http://www.w3.org/TR/2004/REC-owl-ref-

20040210/

cs236607 43

SPARQL LinksJena: Java and .Net Semantic Web

Frameworkhttp://jena.sourceforge.net/

SPARQL Queryhttp://jena.sourceforge.net/ARQ

SPARQL Protocolhttp://www.joseki.org

SquirrelRDF: Access legacy SQL:http://jena.sourceforge.net/SquirrelRDF

cs236607 44