Swoogle Tutorial (Part I: Swoogle R & D)

A brief introduction to Swoogle
An overview of Swoogle research
A summary of Swoogle development

Presented by eBiquity Lab, CSEE, UMBC


Post on 18-Jan-2018



1. Introduction

  Motivation
  Swoogle in the Semantic Web
  Glossary
  Swoogle Architecture


Motivation

(Google + Web) has made us all smarter.
Something similar is needed by people and software agents for information on the Semantic Web.


The Role of Swoogle in Semantic Web

[Diagram. Boxes: Semantic Web Services; Semantic Web data (SW data service, database, (Web) document, RDF document); Software Agents, Applications; Directory/Digest Service (Service Finder; Data Finder: Swoogle). Arrow labels: uses, digests, searches.]


Concepts Explained

[Diagram: an ontology document (SWO) and an instance document (SWI), both SWDs.

SWO http://xmlns.com/foaf/1.0/ (terms)
  foaf:Person  rdf:type         rdfs:Class     (Class)
  foaf:Person  rdfs:subClassOf  wordNet:Agent
  foaf:mbox    rdf:type         rdf:Property   (Property)
  foaf:mbox    rdfs:domain      foaf:Person

SWI (individuals)
  http://foo.com/foaf.rdf#finin  rdf:type   foaf:Person   (Individual)
  http://foo.com/foaf.rdf#finin  foaf:mbox  [email protected]]

NOTE: Qualified Names (QNames) are used to shorten well-known namespaces as follows:

rdf: => http://www.w3.org/1999/02/22-rdf-syntax-ns#
rdfs: => http://www.w3.org/2000/01/rdf-schema#
foaf: => http://xmlns.com/foaf/1.0/
wordNet: => http://xmlns.com/wordnet/1.6/
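As a rough illustration of the QName shorthand, the sketch below expands a prefixed name into a full URI reference. The prefix table is copied from the note on this slide; the function name `expand_qname` is hypothetical and not part of Swoogle.

```python
# Prefix table taken from the slide's note. The rdfs namespace is written with
# its trailing '#', as in the W3C RDF Schema namespace.
NAMESPACES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "foaf": "http://xmlns.com/foaf/1.0/",
    "wordNet": "http://xmlns.com/wordnet/1.6/",
}


def expand_qname(qname, namespaces=NAMESPACES):
    """Turn 'prefix:local' into a full URI reference (hypothetical helper)."""
    prefix, sep, local = qname.partition(":")
    if not sep or prefix not in namespaces:
        raise ValueError(f"unknown or missing prefix in {qname!r}")
    # A QName is simply the namespace URI concatenated with the local name.
    return namespaces[prefix] + local


print(expand_qname("foaf:Person"))
print(expand_qname("rdf:type"))
```

Because the expansion is plain concatenation, two QNames denote the same term only if both the namespace and the local name match exactly.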


Glossary

Document
  A Semantic Web Document (SWD) is an online document written in Semantic Web languages (i.e. RDF and OWL).
  An ontology document (SWO) is an SWD that contains mostly term definitions (i.e. classes and properties). It corresponds to the T-Box in Description Logic.
  An instance document (SWI or SWDB) is an SWD that contains mostly class individuals. It corresponds to the A-Box in Description Logic.

Term
  A term is a non-anonymous RDF resource that is the URI reference of either a class or a property.
  Example: foaf:Person rdf:type rdfs:Class

Individual
  An individual is a non-anonymous RDF resource that is the URI reference of a class member.
  Example: http://.../foaf.rdf#finin rdf:type foaf:Person

In Swoogle, a document D is a valid SWD if and only if JENA* correctly parses D and produces at least one triple.

*JENA is a Java framework for writing Semantic Web applications. http://www.hpl.hp.com/semweb/jena2.htm
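The glossary definitions can be sketched in a few lines. This is only an illustration under stated assumptions: parsing (done by JENA in Swoogle) is taken as already finished, so a document is represented as a list of (subject, predicate, object) tuples written with QNames, and the simple majority rule used for "contains mostly" is an assumption, not Swoogle's actual criterion. All function names are hypothetical.

```python
RDF_TYPE = "rdf:type"
# Types whose instances are terms (classes or properties).
TERM_TYPES = {"rdfs:Class", "owl:Class", "rdf:Property"}


def is_valid_swd(triples):
    """Glossary rule: a parsed document is an SWD if it yields at least one triple."""
    return len(triples) > 0


def classify_swd(triples):
    """Label a document 'SWO' or 'SWI' by what its typed resources mostly are.

    The majority rule is an assumption made for this sketch.
    """
    terms, individuals = set(), set()
    for subject, predicate, obj in triples:
        if predicate != RDF_TYPE:
            continue
        if obj in TERM_TYPES:
            terms.add(subject)        # class or property definition
        else:
            individuals.add(subject)  # member of some class
    if not terms and not individuals:
        return "undecided"
    return "SWO" if len(terms) >= len(individuals) else "SWI"


ontology = [
    ("foaf:Person", "rdf:type", "rdfs:Class"),
    ("foaf:mbox", "rdf:type", "rdf:Property"),
    ("foaf:mbox", "rdfs:domain", "foaf:Person"),
]
instance = [("http://foo.com/foaf.rdf#finin", "rdf:type", "foaf:Person")]

print(classify_swd(ontology), classify_swd(instance), is_valid_swd([]))
```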


Swoogle Architecture

[Diagram. Stages: SWD discovery, metadata creation, data analysis, interface. Components: The Web, Candidate URLs, Web Crawler, SWD Reader, SWD Cache, SWD Metadata, IR analyzer, SWD analyzer, Web Server, Web Service, Agent Service.]


2. Swoogle Research

  Discovery
  Digest
  Search & Navigation
  Rank
  Statistics


Discovery -- research

Discovering URLs of possible SWDs automatically
  Google crawler
  Focused crawler
  Semantic Web crawler, e.g. scutter

Revisiting URLs


Discovery -- results

Crawler performance
  The Google crawler is the best.
  The focused crawler needs to be improved.

Verified pure SWDs are only about 1/3 of discovered URLs (336,358 of 1,178,610).
Some NSWDs contain an embedded RDF graph.

Crawler          SWD      SWD %  NSWD     NSWD %  Undecided  TOTAL
Focused Crawler  1,465    7%     10,580   52%     8,292      20,337
google crawler   273,023  36%    369,371  49%     110,794    753,188
swd_crawler      61,870   15%    285,506  70%     57,709     405,085
TOTAL            336,358         665,457          176,795    1,178,610

Source: Swoogle (2005-Jan-05)
  SELECT `discovered_by`, sum(isRDF), sum(1-isRDF), count(*)
  FROM `digest_url`
  WHERE 1
  GROUP BY discovered_by
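The percentages in the crawler table can be re-derived from the raw counts. The short sketch below does that arithmetic; the counts are copied from the table, while the variable and function names are only illustrative.

```python
# (SWD, NSWD, Undecided) counts per crawler, copied from the table.
COUNTS = {
    "Focused Crawler": (1465, 10580, 8292),
    "google crawler": (273023, 369371, 110794),
    "swd_crawler": (61870, 285506, 57709),
}


def shares(swd, nswd, undecided):
    """Return (SWD %, NSWD %, total URLs) for one crawler, rounded to whole percent."""
    total = swd + nswd + undecided
    return round(100 * swd / total), round(100 * nswd / total), total


for name, row in COUNTS.items():
    swd_pct, nswd_pct, total = shares(*row)
    print(f"{name}: SWD {swd_pct}%, NSWD {nswd_pct}%, total {total:,}")

# Overall share of verified SWDs among all discovered URLs.
overall_swd = sum(row[0] for row in COUNTS.values())
overall_total = sum(sum(row) for row in COUNTS.values())
print(f"overall SWD share: {100 * overall_swd / overall_total:.1f}%")
```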


Digest -- research

Document metadata
  Annotative
    General metadata
    SWD metadata
    Ontology metadata
  Inter-document relations
  Document-term relations

Term metadata
  Term definition
  Inter-term relation
    Class-property bond (C-P bond): rdfs:domain
    Property-class bond (P-C bond): rdfs:range
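A minimal sketch of how C-P and P-C bonds could be collected from rdfs:domain and rdfs:range statements. It assumes triples are available as (subject, predicate, object) tuples written with QNames; the function name and the foaf:knows triple are illustrative and do not come from the slides.

```python
from collections import defaultdict


def extract_bonds(triples):
    """Collect C-P bonds (via rdfs:domain) and P-C bonds (via rdfs:range)."""
    cp_bonds = defaultdict(set)  # class -> properties that name it as rdfs:domain
    pc_bonds = defaultdict(set)  # property -> classes named as its rdfs:range
    for subject, predicate, obj in triples:
        if predicate == "rdfs:domain":
            cp_bonds[obj].add(subject)
        elif predicate == "rdfs:range":
            pc_bonds[subject].add(obj)
    return dict(cp_bonds), dict(pc_bonds)


triples = [
    ("foaf:mbox", "rdfs:domain", "foaf:Person"),
    ("foaf:name", "rdfs:domain", "foaf:Person"),
    ("foaf:knows", "rdfs:range", "foaf:Person"),  # illustrative, not from the slides
]
cp, pc = extract_bonds(triples)
print(cp)
print(pc)
```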


Document Metadata

Web document metadata
  When/how discovered/fetched
  Suffix of URL
  Last modified time
  Document size

SWD metadata
  Language features
    OWL species
    RDF encoding
  Statistical features
    # of defined/used terms
    # of declared/used namespaces
    Ontology Ratio
    Ontology Rank
  Ontology annotation
    Label
    Version
    Comment

Relations
  Links to other SWDs
    Imported SWDs
    Referenced SWDs
    Extended SWDs
    Prior version
  Links to terms
    Classes/properties defined
    Classes/properties used


Digest “Time” Ontology (document view)

Demo2(a)


Document-Term Relation

[Diagram: how SWDs relate to terms and individuals.
  http://xmlns.com/foaf/1.0/ contains foaf:Person (rdf:type rdfs:Class, rdfs:subClassOf wordNet:Agent) and foaf:mbox (rdf:type rdf:Property, rdfs:domain foaf:Person). Labels: defined Class, defined Property.
  http://www.cs.umbc.edu/~finin/foaf.rdf contains a resource with rdf:type foaf:Person and foaf:mbox [email protected]. Labels: populated Class, populated Property.
  http://foo.com/foaf.rdf contains http://foo.com/foaf.rdf#finin with rdf:type foaf:Person and foaf:mbox [email protected]. Label: defined Individual.]
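The relation labels on this slide (defined/populated class and property, defined individual) can be derived mechanically from a document's triples. The sketch below is one possible reading, not Swoogle's implementation: triples are QName tuples, any predicate outside the rdf:, rdfs: and owl: vocabularies counts as a populated property, and the mailbox value is a placeholder because it is obscured in the transcript.

```python
META_CLASSES = {"rdfs:Class", "owl:Class"}
META_PROPERTIES = {"rdf:Property"}
LANGUAGE_PREFIXES = ("rdf:", "rdfs:", "owl:")  # assumed "language" vocabularies


def document_term_relations(triples):
    """Group a document's terms and individuals by relation label (sketch)."""
    relations = {
        "defined Class": set(),
        "defined Property": set(),
        "populated Class": set(),
        "populated Property": set(),
        "defined Individual": set(),
    }
    for subject, predicate, obj in triples:
        if predicate == "rdf:type" and obj in META_CLASSES:
            relations["defined Class"].add(subject)
        elif predicate == "rdf:type" and obj in META_PROPERTIES:
            relations["defined Property"].add(subject)
        elif predicate == "rdf:type":
            # A typed resource populates its class and is itself an individual.
            relations["populated Class"].add(obj)
            relations["defined Individual"].add(subject)
        elif not predicate.startswith(LANGUAGE_PREFIXES):
            relations["populated Property"].add(predicate)
    return relations


ontology_doc = [
    ("foaf:Person", "rdf:type", "rdfs:Class"),
    ("foaf:Person", "rdfs:subClassOf", "wordNet:Agent"),
    ("foaf:mbox", "rdf:type", "rdf:Property"),
    ("foaf:mbox", "rdfs:domain", "foaf:Person"),
]
instance_doc = [
    ("http://foo.com/foaf.rdf#finin", "rdf:type", "foaf:Person"),
    ("http://foo.com/foaf.rdf#finin", "foaf:mbox", "<mailbox placeholder>"),
]

onto_rel = document_term_relations(ontology_doc)
inst_rel = document_term_relations(instance_doc)
print(onto_rel["defined Class"], inst_rel["populated Property"])
```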


Digest “Time” Ontology (term view)

Demo2(b)


Term Metadata

Term Definition
• rdfs:subClassOf -- foaf:Agent
• rdfs:label -- “Person”

C-P bond (from SWI)
• foaf:name
• dc:title

C-P bond (from SWO)
• foaf:mbox
• foaf:name

[Diagram: sources of the metadata for foaf:Person.
  Onto 1: foaf:name rdfs:domain foaf:Person; foaf:mbox rdfs:domain foaf:Person.
  Onto 2: foaf:Person rdf:type owl:Class; rdfs:label “Person”; rdfs:subClassOf foaf:Agent.
  SWD3: a resource with rdf:type foaf:Person and foaf:name “Tim Finin”.]


Digest Term “Person”

Demo4


Term Distribution (grouped by local name)

Case-insensitive
  Name          656
  Person        399
  Title         349
  Location      334
  Description   288
  Date          257
  Type          242
  country       236
  Address       212
  organization  186
  total         72502

Case-sensitive
   1  name          560      11  source    129
   2  Person        357      12  email     125
   3  title         292      13  Book      124
   4  description   242      14  address   121
   5  location      213      15  Event     117
   6  type          196      16  Location  114
   7  date          173      17  author    111
   8  value         154      18  Animal    111
   9  Organization  134      19  Country   104
  10  country       130      20  language  103
  total 76827
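A small sketch of how such a distribution might be computed: take each term's local name and count it with and without case folding. The rule for splitting a URIref (after the last '#', otherwise after the last '/') is an assumption, and the sample URIrefs under example.org are made up for illustration.

```python
from collections import Counter


def local_name(uriref):
    """Return the part after the last '#', or failing that the last '/' (assumed rule)."""
    for separator in ("#", "/"):
        if separator in uriref:
            return uriref.rsplit(separator, 1)[1]
    return uriref


def distribution(urirefs, case_sensitive=True):
    """Count terms grouped by local name."""
    names = (local_name(u) for u in urirefs)
    if not case_sensitive:
        names = (n.lower() for n in names)
    return Counter(names)


# Hypothetical sample terms.
terms = [
    "http://xmlns.com/foaf/1.0/Person",
    "http://example.org/a#person",
    "http://example.org/b#Person",
    "http://example.org/a#name",
]
print(distribution(terms, case_sensitive=True))
print(distribution(terms, case_sensitive=False))
```

Folding case merges "Person" and "person" into one group, which is why the case-insensitive list has fewer, larger groups than the case-sensitive one.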


Digest -- result

Ontological Term Distribution (populated, defined)

type      Pop.  Def.  # terms  % of terms  # populated  % of populated
class     0     1     83,602   88%         0            0%
class     1     0     3,954    4%          1,002,961    13%
class     1     1     7,065    7%          6,483,485    87%
class     total       94,621               7,486,446
property  0     1     42,853   73%         0            0%
property  1     0     8,312    14%         2,438,455    6%
property  1     1     7,836    13%         36,899,842   94%
property  total       59,001               39,338,297

Source: Swoogle (2005-Jan-05)
  SELECT res_type, sign(cnt_instance_populate>0), sign(cnt_swd_def>0), count(*), sum(cnt_instance_populate)
  FROM `digest_term`
  WHERE 1
  GROUP BY res_type, sign(cnt_instance_populate>0), sign(cnt_swd_def>0)


Search & Navigation -- research

The Semantic Web is not the Web

Search service
  Document search – RDF document is not free text
  Term search – URIref and compound local name

Navigation service
  The RDF graph – Typed links
  The web of RDF documents – Few hyperlinks
  The social network of agents – trust & provenance


Find “Time” Ontology

We can use a set of keywords to search for an ontology. For example, “time, before, after” are basic concepts of a “Time” ontology.

Demo1


Find Term “Person”

Demo3

Not capitalized! URIref is case sensitive!


Current Swoogle Navigation Model

A URIref refers to
  A term, i.e. instance of RDFS class/property
  An individual, i.e. populated terms

A SWD could be
  SWO: term definition
  SWI: individuals

Observations
  RDF resources are semantically linked in the RDF graph
  SWDs are poorly linked due to the absence of an explicit hyperlink concept
  Ontologies are more interesting

Approach
  Build inter-document relations
  Rational surfing model

[Diagram labels: SWOs, SWIs, HTML documents, Images, Audio files, Video files]


Semantic Web Navigation Model (new)

[Diagram. Node types: URL, URIref, Resource, RDF Document, Ontology, Namespace.
  Relation labels: populatesClass, populatesProperty, refersClass, refersProperty, definesClass, definesProperty, rdfsOntology, owldlOntology, owl:imports, owl:priorVersion, owl:backwardCompatibleWith, owl:incompatibleWith, rdfs:seeAlso, rdfs:isDefinedBy, isDefinedBy, isUsedBy, usesNamespace, rdfs:subClassOf, sameNamespace, sameLocalname.
  Entry points: Term Search, Document Search, RDF Graph Navigation …]


Ranking -- research

Ranking method: PageRank variation

Surfing model            What to rank    Scope         Idea
Rational surfing model   SWD             Semantic Web  Summarize inter-document relations as EX, TM, IM, PV
Plain Graph Model        Resource        RDF graph     RDF graph is browsed as a weighted directed graph
RDFS-based Model         Resource        RDF graph     RDF graph is browsed only with RDFS semantics
SW navigation model      Resource & SWD  Semantic Web  Assume Swoogle is used in navigation
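To make the "PageRank variation" idea concrete, here is a minimal sketch of a weighted PageRank over documents linked by typed relations (EX, TM, IM, PV). It is not Swoogle's algorithm: the relation weights, the damping factor, the iteration count and the handling of documents without outgoing links are all assumptions chosen for illustration.

```python
# Hypothetical relation weights; the slides do not give the values Swoogle uses.
WEIGHTS = {"EX": 3.0, "TM": 1.0, "IM": 2.0, "PV": 1.0}


def weighted_pagerank(links, weights=WEIGHTS, damping=0.85, iterations=50):
    """Rank documents from (source, target, relation) links by power iteration."""
    nodes = sorted({n for s, t, _ in links for n in (s, t)})
    count = len(nodes)
    # Total outgoing weight per document, used to normalise link probabilities.
    out_weight = {n: 0.0 for n in nodes}
    for source, _, relation in links:
        out_weight[source] += weights[relation]
    rank = {n: 1.0 / count for n in nodes}
    for _ in range(iterations):
        # Rank held by documents without outgoing links is spread evenly.
        dangling = sum(rank[n] for n in nodes if out_weight[n] == 0.0)
        base = (1.0 - damping) / count + damping * dangling / count
        new_rank = {n: base for n in nodes}
        for source, target, relation in links:
            share = weights[relation] / out_weight[source]
            new_rank[target] += damping * rank[source] * share
        rank = new_rank
    return rank


FOAF = "http://xmlns.com/foaf/1.0/"
WORDNET = "http://xmlns.com/wordnet/1.6/"
links = [
    ("http://www.cs.umbc.edu/~finin/foaf.rdf", FOAF, "TM"),
    ("http://foo.com/foaf.rdf", FOAF, "TM"),
    (FOAF, WORDNET, "EX"),
]
ranks = weighted_pagerank(links)
for url, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{score:.3f}  {url}")
```

In this toy graph the ontologies receive rank from the instance documents that use their terms, which mirrors the observation that well-known ontologies end up highly ranked.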


Ranking with Rational Surfing Model: An Example

[Diagram: four SWDs linked by inter-document relations.
  http://www.cs.umbc.edu/~finin/foaf.rdf: a resource with rdf:type foaf:Person and foaf:mbox [email protected]
  http://xmlns.com/foaf/1.0/: foaf:Person rdf:type rdfs:Class; foaf:Person rdfs:subClassOf wordNet:Person
  http://www.w3.org/2000/01/rdf-schema: rdfs:subClassOf rdf:type rdf:Property
  http://xmlns.com/wordnet/1.6/: wordNet:Person rdf:type rdfs:Class; wordNet:Person rdfs:subClassOf wordNet:Individual
  Link labels: TM (three links), EX (one link).]


Demo6: Swoogle's top 10

This report is generated dynamically from the latest data and takes 5 to 10 seconds.

Swoogle uses a PageRank-like algorithm to rank Semantic Web documents. Well-known ontologies are highly ranked.


Statistics -- research

Summarize the dataset collected by Swoogle

Swoogle Watch
  Swoogle Today
  Distribution of visited URLs
  Document discovery log
  Term discovery log

Semantic Web Watch
  SWD distribution by last-modified month
  SWD distribution by website
  SWD distribution by suffix

Ontology Watch
  Term (class/property) usage
  Namespace usage


Demo5(a): Swoogle Today


Demo5(b): Swoogle Statistics

[Chart labels: FOAF, Trustix, W3C, Stanford]


Demo5(c): Swoogle Statistics


Miscellaneous

Submit URL for focused crawler

Swoogle Web Service (Delivered in Sept.)
  http://swoogle.umbc.edu/webservice/
  Search document
  Search term
  Term digest


Demo7: Submit URL for focused crawler

If you can’t find your ontologies in Swoogle, they may not have been indexed by Swoogle yet. Please submit their URLs to increase their visibility.

[Screenshot callouts: “From site map”, “When your query fails”]


3. Summary

  Summary
  Current Status


Summary

Swoogle (Mar, 2004)
  Automated SWD discovery
  SWD metadata creation and search
  Ontology rank (rational surfer model)
  Swoogle watch
  Web interface

Swoogle2 (Sep, 2004)
  Ontology dictionary
  Swoogle statistics
  Web service interface (WSDL)
  Bag of URIref IR search

Swoogle3
  Better discovery & revisit strategies
  Better navigation models
  Semantic web dataset
  Index instance data
  More metadata (ontology mapping)
  Better web service interfaces

[Timeline axis: 2004, 2005]


Current Status

Swoogle Watch reported (Jan 6, 2005)
  46.7 M triples
  336 K SWDs: 4 K ontologies
  153 K terms: 94 K classes & 59 K properties

Ongoing work
  Research
    Self-adaptive SWD Discovery
    Efficient SWD digest and RDF Graph Abstract
    Semantic Web navigation model
  Engineering
    Enhancing Web Service interface