The Semantic Web

Post on 22-Dec-2015


Page 1: The Semantic Web. Schedule for this evening Review of the survey – Summary. Discussion if wanted Some other ways to move content from place to place –

The Semantic Web

Schedule for this evening
• Review of the survey
  – Summary. Discussion if wanted
• Some other ways to move content from place to place
  – FTP
  – OAI-PMH
• Then, the Semantic Web
  – An introduction to things to come

Survey
• Summary on Word document
• Responses and any comments

Other ways to move materials in the Internet
• FTP – File Transfer Protocol
  – One of the oldest of the Internet protocols
  – Originally, command line interface
  – Now, many GUI versions
• Host must run an FTP server that listens on port 21 (the default control port; port 20 is used for data transfer in active mode)
• Client requests a session, user logs in, issues a sequence of commands including get and put.

Brief demonstration
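
The session described above (connect, log in, get, put) can be sketched with Python's standard ftplib module. This is only an illustration: the host, user name, password, and file names are placeholders, and the helper names download_bytes, upload_bytes, and fetch_once are made up for this sketch.

```python
import io
from ftplib import FTP


def download_bytes(ftp, remote_name: str) -> bytes:
    """Equivalent of the command-line 'get': read a remote file into memory."""
    buffer = io.BytesIO()
    ftp.retrbinary(f"RETR {remote_name}", buffer.write)
    return buffer.getvalue()


def upload_bytes(ftp, remote_name: str, data: bytes) -> None:
    """Equivalent of the command-line 'put': store bytes as a remote file."""
    ftp.storbinary(f"STOR {remote_name}", io.BytesIO(data))


def fetch_once(host: str, user: str, password: str, remote_name: str) -> bytes:
    """Open a session, log in, get one file, and close the session."""
    with FTP(host) as ftp:          # connects when a host is given
        ftp.login(user, password)   # the user logs in
        return download_bytes(ftp, remote_name)
```

The two helpers take an already-open connection so that several get and put commands can share one session, which mirrors how a user works at the command line.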

Open Archives Initiative
• Generally oriented toward sharing information about resources in collections accessible on the Internet
• There is a protocol for sharing
  – Based on XML so we will look at that first

Semantic Web
• Semantics refers to meaning.
• The semantic web aims to have enough information about a resource available that a program can use resources as if the program could understand what the resources are.
  – Of course, the program does not really “understand” in the human sense.
  – However, if it has enough information, it can follow rules and behave in ways that are consistent with understanding what it is working with.

Markup
• HTML is a markup language
  – not the first, by any means
• Tags in HTML give clues to the reader (browser or other program) about what to do in displaying or presenting the marked text.
  – emphasize, make stand out (like a title or section head), break
  – Some allowance for meta tags
• HTML has been stretched beyond its original design

XML
• Simplified version of SGML
  – Language for defining languages (markup languages)
  – XHTML is HTML reformulated as an XML language
  – XML allows you to make up your own descriptive language

Metadata
• Critical part of the description of content and resources
• What does metadata look like?
• Metadata is data about data
  – Information about a resource, encoded in the resource or associated with the resource.
• The language of metadata: XML
  – eXtensible Markup Language

XML
• XML is a markup language
• XML describes features
• There is no standard set of XML tags; you define your own
• Use XML to create a resource type
• Separately develop software to interact with the data described by the XML codes.

Source: tutorial at w3schools.com

XML rules
• Easy rules, but very strict
• First line is the version and character set used:
  – <?xml version="1.0" encoding="ISO-8859-1"?>
• The rest is user defined tags
• Every opening tag must have a matching closing tag

Element naming
• XML elements must follow these naming rules:
  – Names can contain letters, numbers, and other characters
  – Names must not start with a number or punctuation character
  – Names must not start with the letters xml (or XML or Xml ..)
  – Names cannot contain spaces
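
The naming rules above can be checked mechanically. The sketch below is a deliberately simplified checker, assuming ASCII names only and a letter as the first character, so it is stricter than the XML specification itself. The function name is_valid_element_name is made up for this illustration.

```python
import re

# Simplified assumption: a letter first, then letters, digits, '.', '_' or '-'.
# Spaces are rejected because they are not in the allowed character class.
_NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9._-]*")


def is_valid_element_name(name: str) -> bool:
    """Apply the slide's naming rules to a candidate element name."""
    if name.lower().startswith("xml"):
        return False  # names must not start with xml, XML, Xml, ...
    return _NAME_PATTERN.fullmatch(name) is not None
```

For example, recipe-title and ingredient-list from the recipe example pass, while a name with a leading digit or an embedded space does not.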

Elements and attributes
• Use elements to describe data
• Use attributes to present information that is not part of the data
  – For example, the file type or some other information that would be useful in processing the data, but is not part of the data.

Repeating elements
• In a DTD content model, naming an element means it appears exactly once.
• Name+ means it appears one or more times.
• Name* means it appears 0 or more times.
• Name? means it appears 0 or one time.

Parts of an XML document
• Elements
  – The components of an XML document
  – Some contain other parts, some are empty
    • Ex in HTML: “br” or “table”; in XML: “ingredient”
• Attributes
  – Information about elements, not data
    • Ex in HTML: “src=”; in XML: “scale=”
• Entities
  – Special characters or strings with pre-assigned meaning
    • Ex in HTML: &nbsp; for non-breaking space
• PCDATA
  – Parsed character data: text that will be parsed and interpreted by the reader. Tags and entities will be expanded and used in presentation.
• CDATA
  – Character data: text that will not be parsed and interpreted. It will be displayed exactly as provided.

The HTML examples are familiar; the XML examples are made up – dependent on the specific XML scheme used

Using XML - an example

Define the fields of a recipe collection:

<?xml version="1.0" encoding="ISO-8859-1"?>
<recipe>
  <recipe-title> </recipe-title>
  <ingredient-list>
    <ingredient>
      <ingredient-amount> </ingredient-amount>
      <ingredient-name> </ingredient-name>
    </ingredient>
  </ingredient-list>
  <directions></directions>
</recipe>

ISO 8859 is a family of character sets; ISO-8859-1 is Latin-1.
See http://www.bbsinc.com/iso8859.html

Processing the XML data
• How do we know what to do with the information in an XML file?
  – Document Type Definition (DTD)
    • Put in the same file as the data -- immediate reference
    • Put a reference to an external description
    • Provides the definition of the legitimate content for each element

Document Type Definition

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE recipe [
<!ELEMENT recipe (recipe-title, ingredient-list, directions)>
<!ELEMENT recipe-title (#PCDATA)>
<!ELEMENT ingredient-list (ingredient*)>
<!ELEMENT ingredient (ingredient-amount, ingredient-name)>
<!ELEMENT ingredient-amount (#PCDATA)>
<!ELEMENT ingredient-name (#PCDATA)>
<!ELEMENT directions (#PCDATA)>
]>

ingredient*: repeat 0 or more times

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE recipe SYSTEM "recipe.dtd">
<recipe>
  <recipe-title>Meringue cookies</recipe-title>
  <ingredient-list>
    <ingredient>
      <ingredient-amount>3</ingredient-amount>
      <ingredient-name>egg whites</ingredient-name>
    </ingredient>
    <ingredient>
      <ingredient-amount>1 cup</ingredient-amount>
      <ingredient-name>sugar</ingredient-name>
    </ingredient>
    <ingredient>
      <ingredient-amount>1 teaspoon</ingredient-amount>
      <ingredient-name>vanilla</ingredient-name>
    </ingredient>
    <ingredient>
      <ingredient-amount>2 cups</ingredient-amount>
      <ingredient-name>mini chocolate chips</ingredient-name>
    </ingredient>
  </ingredient-list>
  <directions>Beat the egg whites until stiff. Stir in sugar, then vanilla. Gently fold in chocolate chips. Place in warm oven at 200 degrees for an hour. Alternatively, place in an oven at 350 degrees. Turn oven off and leave overnight.</directions>
</recipe>

Not the way that I want to see a recipe in a magazine!

What could we do with a large collection of such entries?

How would we get the information entered into a collection?

External reference to DTD: the <!DOCTYPE recipe SYSTEM "recipe.dtd"> line
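
One answer to "what could we do with such entries" is to let a program pull the fields back out. The sketch below uses Python's standard xml.etree.ElementTree on a shortened copy of the recipe. The XML declaration and DOCTYPE line are left out to keep the sample small, and the function name list_ingredients is made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the recipe, using the same element names as the slide.
RECIPE_XML = """
<recipe>
  <recipe-title>Meringue cookies</recipe-title>
  <ingredient-list>
    <ingredient>
      <ingredient-amount>3</ingredient-amount>
      <ingredient-name>egg whites</ingredient-name>
    </ingredient>
    <ingredient>
      <ingredient-amount>1 cup</ingredient-amount>
      <ingredient-name>sugar</ingredient-name>
    </ingredient>
  </ingredient-list>
  <directions>Beat the egg whites until stiff. Stir in sugar.</directions>
</recipe>
"""


def list_ingredients(xml_text: str):
    """Return the recipe title and a list of (amount, name) pairs."""
    root = ET.fromstring(xml_text)
    title = root.findtext("recipe-title", default="").strip()
    pairs = []
    for ingredient in root.iter("ingredient"):
        amount = ingredient.findtext("ingredient-amount", default="").strip()
        name = ingredient.findtext("ingredient-name", default="").strip()
        pairs.append((amount, name))
    return title, pairs
```

Because every recipe in a collection would use the same tags, the same few lines could build a shopping list or search for all recipes that use a given ingredient.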

Spot Check
• Design an XML schema for an application of your choice. Keep it simple.
• Examples -- address book, TV program listing, DVD collection, …
• Work in pairs and discuss your choice and your solution

Another example
• A paper with content encoded with XML: http://tecfaseed.unige.ch/staf18/modules/ePBL/uploads/proj3/paper81.xml
• First few lines:

<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml-stylesheet href="ePBLpaper11.css" type="text/css"?>
<?xml-stylesheet href="ePBLpaper11.xsl" type="text/xsl"?>
<!DOCTYPE paper SYSTEM "ePBLpaper11.dtd">
<paper id="proj3">
  <info>
    <title>Standards E-learning and their possible support for a rich pedagogic approach in a 'Integrated Learning' context</title>
    <authors>
      <author>
        <firstname>Rodolophe</firstname>
        <familyname>Borer</familyname>
        <homepageurl>http://tecfa.unige.ch/perso/staf/borer/</homepageurl>
        <email/>
      </author>
    </authors>

"ePBLpaper11.dtd" shown on next slide

<?xml version="1.0" encoding="ISO-8859-1" ?>
<!-- _________ _____________________ -->
<!-- ePBL-project DTD for student project management & specification -->
<!-- Copyright: (2004) [email protected] -->
<!-- http://tecfa.unige.ch/~paraskev/ -->
<!-- Daniel K. Schneider -->
<!-- http://tecfa.unige.ch/tecfa-people/schneider.html-->
<!-- Created: 13/11/2002 (based on EVA_pm grammar) -->
<!-- Updated: 07/05/2004 -->
<!-- VERSIONS -->
<!-- v1.1 Adaptations to use with Morphon xml editor and addition of IDs-->
<!-- ____________________ -->
<!-- _ ENTITY DECLARATIONS ______ -->
<!ENTITY % foreign-dtd SYSTEM "ibtwsh6_ePBL.dtd">
%foreign-dtd;
<!ENTITY % id "id ID #IMPLIED">
<!-- ______ MAIN ELEMENT _________ -->
<!ELEMENT project (name, authors, date, updated, goal, state-of-the-art, research-development-questions, methodology, workpackages ) >
<!ELEMENT name (#PCDATA )>
<!ELEMENT date (#PCDATA )>
<!ELEMENT authors (#PCDATA )>
<!ELEMENT updated (#PCDATA )>
<!ELEMENT goal (title, description )>
<!ELEMENT state-of-the-art %vert.model;>
<!ATTLIST state-of-the-art %id;>
<!ELEMENT research-development-questions (question )+>
<!ELEMENT question (title, description )>
<!ELEMENT methodology %vert.model;>
<!ATTLIST methodology %id;>
<!ELEMENT workpackages (workpackage )+>
<!ELEMENT workpackage (planning, objectives, deliverables )>
<!ATTLIST workpackage %id;>
<!ELEMENT objectives (objective )+>
<!ELEMENT objective (title, description )>
<!ELEMENT deliverables (deliverable )+>
<!ELEMENT deliverable (url, title, description )>
<!ELEMENT url (#PCDATA )>
<!ELEMENT planning (from, to, progress )>
<!ELEMENT from (#PCDATA )>
<!ELEMENT to (#PCDATA )>
<!ELEMENT progress (#PCDATA )>
<!-- ________________________ -->
<!ELEMENT title (#PCDATA )>
<!ATTLIST title %id;>
<!ELEMENT description %vert.model;>
<!-- _______________________ -->

Source: http://tecfa.unige.ch/staf/staf-j/vuilleum/staf18/p6/ (no longer there)

Resource sharing
• On your projects, you had to go looking for the materials that you need
• You look at the site, see what is there, consider how it could be used in your project.
• On a large scale, that does not work so well.
• It would be nice to query a site and ask what is there that might be of interest to us.

Distributed Resources, Multiple Services

[Diagram: one service provider (search, browse, compare, etc.) drawing on five data providers]

One service provider gathers information about data and uses it to provide services

Open Archives Initiative (OAI)
• Web-based
  – Uses HTTP to communicate between sites
• Centralized server
  – Services provided from a site that has already gathered the information it needs for those services from a distributed collection of sites.


OAI PMH

• Interoperability through Metadata Exchange

• The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) is a low-barrier mechanism for repository interoperability. Data Providers are repositories that expose structured metadata via OAI-PMH. Service Providers then make OAI-PMH service requests to harvest that metadata. OAI-PMH is a set of six verbs or services that are invoked within HTTP.

http://www.openarchives.org/pmh/

OAI PMH verbs
• Identify
• ListMetadataFormats
• ListSets
• ListIdentifiers
• ListRecords
• GetRecord

Open Archives Initiative Protocol for Metadata Harvesting -- OAI-PMH

[Diagram: a Harvester (Service Provider) sends an HTTP request carrying an OAI verb to a Repository (Metadata Provider); the Repository returns an HTTP response in XML. The OAI interface is implemented as CGI, ASP, PHP, or other.]

• OAI PMH defines an interface between the Harvester and any number of Repositories
• Any system may serve as a harvester, repository, or both

OAI-PMH components
• Service Providers and Data Providers
• Requests and Responses

http://www.oaforum.org/tutorial/english/page3.htm#section3

Records
• Metadata of a resource.
• Three parts
  – Header (required)
    • Identifier (required: 1 only)
    • Datestamp (required: 1 only)
    • setSpec elements (optional: 0, 1, or more)
    • Status attribute for deleted item
  – Metadata (required)
    • XML encoded metadata with root tag, namespace
    • Repositories must support Dublin Core, other formats optional
  – “About” statement (optional)
    • Rights statements
    • Provenance statements

Dublin Core elements
see: http://dublincore.org/documents/dces/

• Title
• Creator: entity primarily responsible for making content of the resource
• Subject - C
• Description
• Publisher: entity making the resource available
• Contributor: contributor to content of the resource
• Date: ex. YYYY-MM-DD
• Type - C: ex. collection, dataset, event, image
• Format - C: what is needed to display or operate the resource
• Identifier: unambiguous ID
• Source
• Language: standards RFC 3066, ISO639
• Relation: ref. to related resource
• Coverage - C: space, time, jurisdiction
• Rights: Rights Management information

C = controlled vocabulary recommended.
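
To see what a Dublin Core description looks like as XML, the sketch below builds one with Python's standard xml.etree.ElementTree. The namespace URI is the Dublin Core element set namespace. The wrapper element name metadata, the function name build_dc_description, and the field values in the usage note are placeholders made up for this illustration.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core element set namespace


def build_dc_description(fields):
    """Build an XML fragment from (element name, value) pairs.

    'metadata' is a hypothetical wrapper element used only in this sketch.
    """
    ET.register_namespace("dc", DC_NS)  # serialize with the usual dc: prefix
    root = ET.Element("metadata")
    for name, value in fields:
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return root
```

A call such as build_dc_description([("title", "Meringue cookies"), ("date", "2011-11-13")]) yields elements named dc:title and dc:date when serialized with ET.tostring.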

Identifiers
• Globally unique identifier
• Valid URI
  – Examples
    • oai:<archiveId>:<recordId>
    • oai:etd.vt.edu:etd-1234567890
  – Must resolve to one item
    • No duplicates
    • No reuse of previously used identifiers
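
The oai:<archiveId>:<recordId> pattern is easy to take apart in code. A minimal sketch, assuming only the three-part layout shown above (the function name parse_oai_identifier is made up for this illustration):

```python
def parse_oai_identifier(identifier: str):
    """Split 'oai:<archiveId>:<recordId>' into (archiveId, recordId)."""
    # Split on the first two colons only, so a record id may itself contain ':'.
    parts = identifier.split(":", 2)
    if len(parts) != 3 or parts[0] != "oai" or not parts[1] or not parts[2]:
        raise ValueError(f"not an oai identifier: {identifier!r}")
    return parts[1], parts[2]
```

A harvester could use the archive part to group records by the repository they came from.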

Datestamps
• Date of last modification of a record
  – Used only for harvesting (meta metadata?)
• Mandatory for each item in the repository
• Two levels of granularity possible
  – YYYY-MM-DD
  – YYYY-MM-DDThh:mm:ssZ
    • T separates the date from the time; Z marks the time as UTC (GMT), which is required
• Allows harvesting incrementally -- get only what is new since last visit
  – Accessed by arguments from and until
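
A harvester that remembers when it last visited needs to turn that moment into a from argument at one of the two granularities. A minimal sketch using Python's standard datetime module (the function name oai_datestamp and the granularity labels "day" and "second" are made up for this illustration):

```python
from datetime import datetime, timezone


def oai_datestamp(moment: datetime, granularity: str = "day") -> str:
    """Format a timezone-aware datetime as a UTC datestamp."""
    if moment.tzinfo is None:
        raise ValueError("moment must be timezone-aware")
    utc = moment.astimezone(timezone.utc)  # datestamps are expressed in UTC
    if granularity == "day":
        return utc.strftime("%Y-%m-%d")
    if granularity == "second":
        return utc.strftime("%Y-%m-%dT%H:%M:%SZ")
    raise ValueError(f"unknown granularity: {granularity!r}")
```

The result of the last successful harvest, formatted this way, becomes the from value of the next request.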

The OAI-PMH verbs
• Each requests a specific response from a data repository
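
Each verb is sent as an ordinary HTTP query string. The sketch below assembles such a request URL with Python's standard urllib.parse. The function name build_request_url is made up for this illustration, and the base URL in the usage note is only the placeholder used elsewhere on these slides.

```python
from urllib.parse import urlencode

# The six OAI-PMH verbs.
VERBS = {
    "Identify",
    "ListMetadataFormats",
    "ListSets",
    "ListIdentifiers",
    "ListRecords",
    "GetRecord",
}


def build_request_url(base_url: str, verb: str, arguments=None) -> str:
    """Return base_url with the verb and its arguments as a query string."""
    if verb not in VERBS:
        raise ValueError(f"unknown OAI-PMH verb: {verb!r}")
    query = {"verb": verb}
    query.update(arguments or {})  # a dict, since 'from' is a Python keyword
    return f"{base_url}?{urlencode(query)}"
```

For instance, build_request_url("http://archive.org/oai-script", "ListRecords", {"metadataPrefix": "oai_dc", "set": "biology"}) gives the ListRecords request used as an example later in the deck.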

Identify
• Function: Description of the archive
• Example: http://www.language-archives.org/cgi-bin/olaca3.pl?verb=Identify
• Parameters: none
• Errors/exceptions:
  – badArgument (there should not be any)
• Response format:

Element             Example                                Ordinality ‡
repositoryName      My Archive                             1
baseURL             http://archive.org/oai                 1
protocolVersion     2.0                                    1
earliestDatestamp   1999-01-01                             1
deletedRecord       no, transient, persistent              1
granularity         YYYY-MM-DD, YYYY-MM-DDThh:mm:ssZ       1
adminEmail          [email protected]                      +
compression         deflate, compress                      *
description         oai-identifier, eprints, friends, …    *

‡ Ordinality: 1 = mandatory, 1 only; + = mandatory, 1 or more; * = optional, 0 or more

Actual response from
http://www.language-archives.org/cgi-bin/olaca3.pl?verb=Identify

<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/ http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd">
  <responseDate>2011-11-13T02:01:52Z</responseDate>
  <request verb="Identify">http://www.language-archives.org/cgi-bin/olaca3.pl</request>
  <Identify>
    <repositoryName>OLAC Aggregator</repositoryName>
    <baseURL>http://www.language-archives.org/cgi-bin/olaca3.pl</baseURL>
    <protocolVersion>2.0</protocolVersion>
    <adminEmail>[email protected]</adminEmail>
    <earliestDatestamp>1900-01-01</earliestDatestamp>
    <deletedRecord>no</deletedRecord>
    <granularity>YYYY-MM-DD</granularity>
    <!-- maybe later <compression>identity</compression> -->
    <description>
      <oai-identifier xmlns="http://www.openarchives.org/OAI/2.0/oai-identifier"
                      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                      xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai-identifier http://www.openarchives.org/OAI/2.0/oai-identifier.xsd">
        <scheme>oai</scheme>
        <repositoryIdentifier>OLACA.language-archives.org</repositoryIdentifier>
        <delimiter>:</delimiter>
        <sampleIdentifier>oai:ethnologue.com:aaa</sampleIdentifier>
      </oai-identifier>
    </description>
    <description>
      <olac-archive xmlns="http://www.language-archives.org/OLAC/1.1/olac-archive"
                    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                    type="institutional"
                    xsi:schemaLocation="http://www.language-archives.org/OLAC/1.1/olac-archive http://www.language-archives.org/OLAC/1.1/olac-archive.xsd"
                    currentAsOf="2011-10-31">
        <archiveURL>http://www.language-archives.org/archive_records/</archiveURL>
        <participant name="Steven Bird" role="Curator" email="[email protected]"/>
        <participant name="Gary Simons" role="Curator" email="[email protected]"/>
        <participant name="Haejoong Lee" role="Administrator" email="[email protected]"/>
        <institution>Open Language Archives Community</institution>
        <institutionURL>http://www.language-archives.org/</institutionURL>
        <shortLocation>Philadelphia, U.S.A.</shortLocation>
        <location/>
        <synopsis>This repository contains all records from OLAC-registered archives. It is intended to be used by services which do not want to harvest individual OLAC archives.</synopsis>
        <access>Metadata may be used only subject to the access permissions given by the individual archives.</access>
      </olac-archive>
    </description>
  </Identify>
</OAI-PMH>
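
A harvester reads a response like this one with an XML parser, not by eye. The sketch below pulls a few fields from a shortened copy of the response with Python's standard xml.etree.ElementTree. The function name summarize_identify is made up for this illustration, and the sample keeps only some of the elements shown above.

```python
import xml.etree.ElementTree as ET

# Every OAI-PMH element lives in this namespace, so lookups must name it.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2011-11-13T02:01:52Z</responseDate>
  <request verb="Identify">http://www.language-archives.org/cgi-bin/olaca3.pl</request>
  <Identify>
    <repositoryName>OLAC Aggregator</repositoryName>
    <protocolVersion>2.0</protocolVersion>
    <earliestDatestamp>1900-01-01</earliestDatestamp>
    <deletedRecord>no</deletedRecord>
    <granularity>YYYY-MM-DD</granularity>
  </Identify>
</OAI-PMH>"""


def summarize_identify(xml_text: str) -> dict:
    """Return selected fields of an Identify response as a dictionary."""
    root = ET.fromstring(xml_text)
    identify = root.find("oai:Identify", OAI_NS)
    if identify is None:
        raise ValueError("response has no Identify element")
    wanted = ["repositoryName", "protocolVersion", "deletedRecord", "granularity"]
    return {
        name: identify.findtext(f"oai:{name}", namespaces=OAI_NS)
        for name in wanted
    }
```

The granularity field tells the harvester which datestamp format it may use in later from and until arguments.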

ListMetadataFormats
• Function: retrieve available metadata formats from archive
• Example: archive.org/oai-script?verb=ListMetadataFormats&identifier=oai:HUBerlin.de:3000218
• Parameters: identifier (optional)
• Errors/exceptions:
  – badArgument
  – idDoesNotExist
  – noMetadataFormats

Response to http://www.language-archives.org/cgi-bin/olaca3.pl?verb=ListMetadataFormats

<OAI-PMH xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/ http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd">
  <responseDate>2006-10-17T01:58:06Z</responseDate>
  <request verb="ListMetadataFormats">http://www.language-archives.org/cgi-bin/olaca3.pl</request>
  <ListMetadataFormats>
    <metadataFormat>
      <metadataPrefix>olac</metadataPrefix>
      <schema>http://www.language-archives.org/OLAC/1.0/olac.xsd</schema>
      <metadataNamespace>http://www.language-archives.org/OLAC/1.0/</metadataNamespace>
    </metadataFormat>
    <metadataFormat>
      <metadataPrefix>olac_display</metadataPrefix>
      <schema>http://www.language-archives.org/OLAC/1.0/olac.xsd</schema>
      <metadataNamespace>http://www.language-archives.org/OLAC/1.0/</metadataNamespace>
    </metadataFormat>
    <metadataFormat>
      <metadataPrefix>oai_dc</metadataPrefix>
      <schema>http://www.openarchives.org/OAI/2.0/oai_dc.xsd</schema>
      <metadataNamespace>http://www.openarchives.org/OAI/2.0/oai_dc/</metadataNamespace>
    </metadataFormat>
  </ListMetadataFormats>
</OAI-PMH>

ListSets
• Function: retrieve set structure of a repository
• Example: archive.org/oai-script?verb=ListSets
• Parameters: resumptionToken (exclusive)
• Errors/exceptions:
  – badArgument
  – badResumptionToken
  – noSetHierarchy

Sets are optional and are used to divide a repository into separate units that will be of interest to different harvesters.

ListIdentifiers
• Function: abbreviated form of ListRecords, retrieve only headers
• Example: archive.org/oai-script?verb=ListIdentifiers&metadataPrefix=oai_dc&from=2002-12-01
• Parameters:
  – from (optional)
  – until (optional)
  – metadataPrefix (required)
  – set (optional)
  – resumptionToken (exclusive)
• Errors/exceptions:
  – badArgument
  – badResumptionToken
  – cannotDisseminateFormat
  – noRecordsMatch
  – noSetHierarchy

ListRecords
• Function: harvest records from a repository
• Example: archive.org/oai-script?verb=ListRecords&metadataPrefix=oai_dc&set=biology
• Parameters:
  – from (optional)
  – until (optional)
  – metadataPrefix (required)
  – set (optional)
  – resumptionToken (exclusive)
• Errors/exceptions:
  – badArgument
  – badResumptionToken
  – cannotDisseminateFormat
  – noRecordsMatch
  – noSetHierarchy
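
The resumptionToken parameter is "exclusive" because a long result list comes back in pieces: each incomplete response carries a token, and the follow-up request sends the verb plus that token and nothing else. A minimal harvesting loop might look like the sketch below. The fetch argument is a hypothetical callable, supplied by the caller, that takes a dictionary of request arguments and returns the response XML as text. The function name harvest_identifiers is also made up for this illustration.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict, Iterator, Optional

OAI = "{http://www.openarchives.org/OAI/2.0/}"  # namespace in ElementTree form


def harvest_identifiers(
    fetch: Callable[[Dict[str, str]], str],
    arguments: Optional[Dict[str, str]] = None,
    metadata_prefix: str = "oai_dc",
) -> Iterator[str]:
    """Yield the identifier of every record, following resumption tokens."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    params.update(arguments or {})  # optional from, until, set
    while True:
        root = ET.fromstring(fetch(params))
        for header in root.iter(OAI + "header"):
            yield header.findtext(OAI + "identifier")
        token = root.findtext(f".//{OAI}resumptionToken")
        if token is None or not token.strip():
            break  # a missing or empty token means the list is complete
        # Exclusive: the token replaces every argument except the verb.
        params = {"verb": "ListRecords", "resumptionToken": token.strip()}
```

Passing the fetch function in keeps the loop independent of how the HTTP request is made, so it can be exercised with canned responses before it is pointed at a real repository.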

GetRecord
• Function: retrieve an individual metadata record from a repository
• Example: archive.org/oai-script?verb=GetRecord&identifier=oai:HUBerlin.de:3000218&metadataPrefix=oai_dc
• Parameters:
  – identifier (required)
  – metadataPrefix (required)
• Errors/exceptions:
  – badArgument
  – cannotDisseminateFormat
  – idDoesNotExist

Interoperability
• The goal: communication, without human intervention, between information sources
  – Books that “talk to each other”
    • Live links for references
    • Knowledge of how to find relevant resources when needed
    • Ability to query other information locations

Protocols
• Precise rules for interactions between independent processes
  – Format of the messages
    • Both structure and content
  – Specified behavior in response to specific messages
• Many ways to accomplish the same result, but both sides must have the same understanding of the rules of engagement.

Spot Check
• Make up a protocol
• Suppose we wanted a kind of command and control protocol so that a master site could cause a satellite site to clear the screen that is displayed to the web.
• We want the response to be prompt
• We want the satellite site to confirm receipt of the command and to notify the master when the site screen has been cleared.
• It should be possible to accomplish this with messages between the two sites and an action at the satellite site.

The Semantic Web
• Some of these slides come from Lee Giles
  – Who, in turn, credits Jim Hendler, Carl Lagoze, Jayavel Shanmugasundaram, Sara Cohen, Jonathan Mamou, Yaron Kanza, Mark Sapossnek, Yehoshua Sagiv, Frank van Harmelen

Beyond XML
• Building with XML, new languages have emerged to describe
  – Content, and things in general
  – Relationships between things
  – Attributes (characteristics) of things
• The semantic web requires that things be described in sufficient detail that autonomous processes can discover useful things and use them properly

Motivation for the Semantic Web
• Search engines
  – concepts, not keywords
  – semantic narrowing/widening of queries
• Shopbots
  – semantic interchange, not screenscraping
• E-commerce
  – Negotiation, catalogue mapping, personalization
• Web Services
  – Need semantic characterizations to find them
• Navigation
  – by semantic proximity, not hardwired links
• .....

Example
• Try these queries with Google:
  – Distance between Paris and Madrid
    • Google returns: Distance between Madrid spain and Paris france
      www.mapcrow.info/Distance_between_Madrid_SP_and_Paris_FR.html
      COORDINATES +. TOTAL DISTANCE. Madrid, SP, -3.6833 40.4000. Paris, FR, 2.3333 48.8667. Miles: 654.57. Kilometers: 1053.40. Bearing: NE. Madrid, SPAIN ...
  – (The) Largest city of France
    • Google returns: France – Largest City: Paris
  – (The) Largest city of Spain
    • Google returns: Spain – Largest City: Madrid
• Now, try these with Google:
  – Distance between largest city of France and largest city of Spain
  – Distance between “largest city of France” and “largest city of Spain”
  – And worst, Distance between “the largest city of France” and “the largest city of Spain”
    • No result returned by Google!
    • Actually now shows a link to several versions of these slides!


http://www.w3.org/DesignIssues/diagrams/sw-stack-2005.png

Semantic Web Stack

RDF and OWL
• Resource Description Framework (RDF)
• Web Ontology Language (OWL)

So why not just use XML?
• No agreement on:
  – structure
    • is country an object? class? attribute? relation? something else?
    • what does nesting mean?
  – vocabulary
    • is country the same as nation?

<country name="Netherlands">
  <capital name="Amsterdam">
    <areacode>020</areacode>
  </capital>
</country>

<nation>
  <name>Netherlands</name>
  <capital>Amsterdam</capital>
  <capital_areacode>020</capital_areacode>
</nation>

● Are the above XML documents the same?
● Do they convey the same information?
● Is that information machine-accessible?
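
One way to feel the problem is to write the code that reads both documents. In the sketch below (Python's standard xml.etree.ElementTree; the function names are made up for this illustration), each document needs its own hand-written reader, because nothing in the XML itself says that country means nation or that the name attribute corresponds to the name element.

```python
import xml.etree.ElementTree as ET

COUNTRY_DOC = """<country name="Netherlands">
  <capital name="Amsterdam">
    <areacode>020</areacode>
  </capital>
</country>"""

NATION_DOC = """<nation>
  <name>Netherlands</name>
  <capital>Amsterdam</capital>
  <capital_areacode>020</capital_areacode>
</nation>"""


def read_country_style(xml_text: str) -> dict:
    """Reader for the first layout: names in attributes, area code nested."""
    root = ET.fromstring(xml_text)
    capital = root.find("capital")
    return {
        "name": root.get("name"),
        "capital": capital.get("name"),
        "areacode": capital.findtext("areacode").strip(),
    }


def read_nation_style(xml_text: str) -> dict:
    """Reader for the second layout: everything in flat child elements."""
    root = ET.fromstring(xml_text)
    return {
        "name": root.findtext("name").strip(),
        "capital": root.findtext("capital").strip(),
        "areacode": root.findtext("capital_areacode").strip(),
    }
```

Both readers return the same dictionary, but only because a person who understood both layouts wrote the mapping by hand. That human step is what the semantic web languages aim to make unnecessary.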

“2nd aim of Semantic Web”: Data integration
  – Unstructured and semi-structured sources (document collections, message traffic, web pages, ...)
  – Structured data without an explicit data schema (non-local databases, data tables, charts and reports, ...)
  – Non-text collections (image, video, sound, ...)
  – Streams of data (sensors, programs, services)

Must specify the structure of data resources ...

Page 58: The Semantic Web. Schedule for this evening Review of the survey – Summary. Discussion if wanted Some other ways to move content from place to place –

2nd aim of Semantic Web: Data integration

... so a processor can tell how the "attributes" and "values" are related
– What is required vs. optional?
– How many values for a particular attribute?
– What attributes are keys for other attributes?
– Which attributes are necessarily related to other attributes and in what way?
– How do the attributes (and values) in one data source map to attributes and values describing another source?
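A minimal sketch of what answering some of these questions in machine-readable form might look like. The `AttributeSpec` class, the `SCHEMA` table and the `NATION_TO_SCHEMA` mapping are all hypothetical, invented for this illustration, and the cardinality limits are arbitrary. Real schema languages are far richer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttributeSpec:
    required: bool = True   # required vs. optional
    max_values: int = 1     # how many values one record may carry


# Hypothetical target schema for a "country" record.
SCHEMA = {
    "name": AttributeSpec(required=True, max_values=1),
    "capital": AttributeSpec(required=True, max_values=1),
    "areacode": AttributeSpec(required=False, max_values=3),
}

# Hypothetical mapping from another source's attribute names to ours.
NATION_TO_SCHEMA = {
    "name": "name",
    "capital": "capital",
    "capital_areacode": "areacode",
}


def translate(record, mapping):
    """Rename the attributes of a record using an explicit mapping."""
    return {mapping[attr]: values for attr, values in record.items() if attr in mapping}


def violations(record, schema):
    """List every way a record (attribute -> list of values) breaks the schema."""
    problems = []
    for attr, spec in schema.items():
        values = record.get(attr, [])
        if spec.required and not values:
            problems.append(f"missing required attribute: {attr}")
        if len(values) > spec.max_values:
            problems.append(f"too many values for {attr}")
    for attr in record:
        if attr not in schema:
            problems.append(f"unknown attribute: {attr}")
    return problems


nation_record = {
    "name": ["Netherlands"],
    "capital": ["Amsterdam"],
    "capital_areacode": ["020"],
}
translated = translate(nation_record, NATION_TO_SCHEMA)

print(violations(translated, SCHEMA))
print(violations({"name": ["Netherlands"]}, SCHEMA))
```

Once the structure and the mapping are written down as data, a generic processor can check and merge records from both sources without source-specific code.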

Page 59:

Stack of languages
• XML:
  – Surface syntax, no semantics
• XML Schema:
  – Describes structure of XML documents
• RDF:
  – Datamodel for “relations” between “things”
• RDF Schema (RDFS):
  – RDF Vocabulary Definition Language
• OWL:
  – A more expressive Vocabulary Definition Language

Page 60:

Semantic web languages today
• Today there are three semantic web languages
  – RDF – Resource Description Framework http://www.w3.org/RDF/
  – DAML+OIL – DARPA Agent Markup Language http://www.daml.org/ (deprecated)
  – OWL – Web Ontology Language http://www.w3.org/2001/sw/
    • OWL Lite
    • OWL DL
    • OWL Full

Page 61:

RDF is the first Semantic Web language

[Diagram: the RDF Data Model shown in three forms]
• Graph: good for human viewing
• XML encoding: good for machine processing
  <rdf:RDF ……..> <….> <….> </rdf:RDF>
• Triples: good for reasoning
  stmt(docInst, rdf_type, Document)
  stmt(personInst, rdf_type, Person)
  stmt(inroomInst, rdf_type, InRoom)
  stmt(personInst, holding, docInst)
  stmt(inroomInst, person, personInst)

RDF is a simple language for building graph-based representations
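As a rough illustration of why the triple form suits machines, the five `stmt(...)` statements from this slide can be held as plain tuples and queried with a few lines of Python. The helper names `build_graph` and `objects` are assumptions made for this sketch, not part of any RDF library.

```python
from collections import defaultdict

# The five statements from the slide, as (subject, predicate, object) tuples.
TRIPLES = [
    ("docInst", "rdf_type", "Document"),
    ("personInst", "rdf_type", "Person"),
    ("inroomInst", "rdf_type", "InRoom"),
    ("personInst", "holding", "docInst"),
    ("inroomInst", "person", "personInst"),
]


def build_graph(triples):
    """Graph view: each subject maps to its outgoing (predicate, object) arcs."""
    graph = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
    return graph


def objects(triples, subject, predicate):
    """Triple view: every object o such that (subject, predicate, o) is stated."""
    return [o for s, p, o in triples if s == subject and p == predicate]


graph = build_graph(TRIPLES)

# What is personInst holding, and what kind of thing is it?
held = objects(TRIPLES, "personInst", "holding")
held_types = [t for item in held for t in objects(TRIPLES, item, "rdf_type")]

print(graph["personInst"])
print(held, held_types)
```

The same statements serve both views: `graph` is the labelled graph a person would draw, and `objects` answers questions by pattern-matching over the flat list.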

Page 62:

The RDF Data Model
• An RDF document is an unordered collection of statements, each with a subject, predicate and object (aka triples)
• A triple can be thought of as a labelled arc in a graph
• Statements describe properties of web resources
• A resource is any object that can be pointed to by a URI:
  – a document, a picture, a paragraph on the Web, …
  – E.g., http://umbc.edu/~ypeng/F07671.html
  – a book in the library, a real person (?)
  – isbn://5031-4444-3333
  – …
• Properties themselves are also resources (URIs)
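Two of these points, that statements are unordered and that properties are themselves URIs, can be checked in a short sketch. The `RDF_TYPE` URI is the real `rdf:type` property; the `http://example.org/terms#` namespace and the `Document` and `mentions` terms under it are hypothetical, chosen only for illustration.

```python
from urllib.parse import urlparse

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/terms#"  # hypothetical vocabulary namespace

PAGE = "http://umbc.edu/~ypeng/F07671.html"
BOOK = "isbn://5031-4444-3333"

statements = [
    (PAGE, RDF_TYPE, EX + "Document"),
    (PAGE, EX + "mentions", BOOK),  # hypothetical statement, for illustration
]

# An RDF document is an unordered collection: order of statements is irrelevant.
doc_a = frozenset(statements)
doc_b = frozenset(reversed(statements))


def is_uri(text):
    """Very loose check: a URI has a scheme such as http or isbn."""
    return urlparse(text).scheme != ""


same_document = doc_a == doc_b
predicates_are_uris = all(is_uri(p) for _, p, _ in doc_a)

print(same_document, predicates_are_uris)
```

Because the predicate `EX + "mentions"` is a URI, it could appear as the subject of further statements that describe the property itself, which is what RDF Schema builds on.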

Page 63:
Page 64:

RDF without a Schema
• Object -> Attribute -> Value triples
• objects are web-resources
• Value is again an Object:
• triples can be linked
• data-model = graph

[Diagram: pers05 --Author-of--> ISBN... ; ISBN... --Publ-by--> MIT ; linked: pers05 --Author-of--> ISBN... --Publ-by--> MIT]
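The "triples can be linked" point can be sketched as a path walk over the two triples in the diagram. The `follow` helper is hypothetical, and the string `"ISBN..."` is kept exactly as the slide abbreviates it rather than replaced with a real identifier.

```python
# The two triples from the diagram; "ISBN..." is the slide's own abbreviation.
TRIPLES = {
    ("pers05", "Author-of", "ISBN..."),
    ("ISBN...", "Publ-by", "MIT"),
}


def follow(triples, start, path):
    """Walk a sequence of predicates from a start node; return the nodes reached."""
    frontier = {start}
    for predicate in path:
        frontier = {o for s, p, o in triples if s in frontier and p == predicate}
    return frontier


# Because the value of the first triple is the object of the second,
# the two statements chain into one path through the graph.
publishers = follow(TRIPLES, "pers05", ["Author-of", "Publ-by"])
print(publishers)
```

No schema is consulted here: the walk works purely on the shape of the graph, which is both the strength and the limit of RDF without a schema.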

Page 65:

What does RDF Schema add?
• Defines vocabulary for RDF
• Organizes this vocabulary in a typed hierarchy
• Class, subClassOf, type
• Property, subPropertyOf
• domain, range

[Diagram: Author and Reader are each subClassOf Person; the property communicatesTo has a domain and a range; Lynda and Frank are each linked by type to a class and are connected by communicatesTo]
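A rough sketch of what the schema vocabulary buys: with `subClassOf`, `domain` and `range` declared, types can be inferred instead of stated. The code below applies four of the standard RDFS entailment patterns (domain, range, subclass membership, subclass transitivity) until nothing new appears. It assumes that `communicatesTo` has domain Author and range Reader and that Frank communicates to Lynda; the flattened diagram does not confirm which way round these go, so treat that assignment as a guess. The short names stand in for full URIs.

```python
TYPE, SUBCLASS, DOMAIN, RANGE = "type", "subClassOf", "domain", "range"

SCHEMA = {
    ("Author", SUBCLASS, "Person"),
    ("Reader", SUBCLASS, "Person"),
    ("communicatesTo", DOMAIN, "Author"),  # assumed assignment
    ("communicatesTo", RANGE, "Reader"),   # assumed assignment
}

# Assumed direction of the arc; note that no type statements are given.
DATA = {("Frank", "communicatesTo", "Lynda")}


def rdfs_closure(triples):
    """Apply domain, range and subClassOf rules until a fixed point is reached."""
    inferred = set(triples)
    while True:
        new = set()
        for s, p, o in inferred:
            for s2, p2, o2 in inferred:
                if p2 == DOMAIN and s2 == p:
                    new.add((s, TYPE, o2))        # subject gets the domain class
                if p2 == RANGE and s2 == p:
                    new.add((o, TYPE, o2))        # object gets the range class
                if p == TYPE and p2 == SUBCLASS and s2 == o:
                    new.add((s, TYPE, o2))        # instances move up the hierarchy
                if p == SUBCLASS and p2 == SUBCLASS and s2 == o:
                    new.add((s, SUBCLASS, o2))    # subClassOf is transitive
        if new <= inferred:
            return inferred
        inferred |= new


closure = rdfs_closure(SCHEMA | DATA)
for triple in sorted(t for t in closure if t[1] == TYPE):
    print(triple)
```

From a single data statement the closure derives that Frank is an Author, Lynda is a Reader, and both are Persons. This nested-loop version is quadratic per pass and is meant only to show the rules, not to scale.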

Page 66:

Which Semantic Web?
• Version 1: "Semantic Web as Web of Data" (TBL)
• recipe: expose databases on the web, use XML, RDF, integrate
• metadata from:
  – expressing DB schema semantics in machine interpretable ways
• enable integration and unexpected re-use
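One way to picture the "expose databases" recipe is a sketch that turns each row of a relational table into triples, using the primary key to mint a subject URI and each column name to mint a property URI. The table, its contents, the `http://example.org/` base and the `rows_to_triples` function are all hypothetical; this is an illustration of the idea, not a standard mapping.

```python
import sqlite3

BASE = "http://example.org/"  # hypothetical namespace for minted URIs

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE country (id INTEGER PRIMARY KEY, name TEXT, capital TEXT)")
conn.execute("INSERT INTO country VALUES (1, 'Netherlands', 'Amsterdam')")


def rows_to_triples(conn, table, key):
    """Emit one triple per non-key, non-NULL cell. `table` must be a trusted name."""
    cursor = conn.execute(f"SELECT * FROM {table}")
    columns = [description[0] for description in cursor.description]
    triples = []
    for row in cursor:
        record = dict(zip(columns, row))
        subject = f"{BASE}{table}/{record[key]}"
        for column, value in record.items():
            if column != key and value is not None:
                triples.append((subject, f"{BASE}{table}#{column}", value))
    return triples


triples = rows_to_triples(conn, "country", "id")
for triple in triples:
    print(triple)
```

The column names become shared, addressable properties, which is the step that makes the database schema's meaning available to programs outside the database.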

Page 67:

Which Semantic Web?
• Version 2: “Enrichment of the current Web”
• recipe: annotate, classify, index
• metadata from:
  – automatically producing markup: named-entity recognition, concept extraction, tagging, etc.
• enable personalization, search, browse, ...

Page 68:

Which Semantic Web?
• Version 1: “Semantic Web as Web of Data”
• Version 2: “Enrichment of the current Web”

Different use-cases
Different techniques
Different users

Page 69:

The Evolving Web

[Figure: layers of the Web building toward a "Web of Knowledge", on a timeline marked 1990, 2000, 2010, moving from DOCUMENTS to DATA/PROGRAMS]
• HyperText Markup Language, HyperText Transfer Protocol: foundation of the current Web
• Resource Description Framework, eXtensible Markup Language: self-describing documents
• Proof, logic and ontology languages: shared terms/terminology, machine-machine communication

Berners-Lee, Hendler; Nature, 2001