Stanford DB Seminar, October 20, 2000
Web, Semantics, OIL and FUEL: Semantic Interoperability and learning on the Web
by Amit Sheth
Director, Large-Scale Distributed Information Systems Lab.
University of Georgia, Athens, GA USA
http://lsdis.cs.uga.edu
Founder/Chairman, Taalee, Inc.
http://www.taalee.com
Special thanks, Digital Library project team at LSDIS
Semantics: “meaning or relationship of meanings, or
relating to meaning …” (Webster), meaning and use of data
(Information System)
Semantic Web: “The Web of data (and connections) with
meaning in the sense that a computer program can learn
enough about what the data means to process it. . . .
. . . Imagine what computers can understand when there is
a vast tangle of interconnected terms and data that can
automatically be followed.” (Tim Berners-Lee, Weaving the Web, 1999)
• “A Web in which machine reasoning will be ubiquitous and devastatingly powerful.”
• “A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful
mixture.”
• “A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities.”
A personal definition of the Semantic Web: the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods.
• Markups/Standards: DAML: Semantic Annotations and Directory; DSML: Directory (and, of course, XML, RDF, namespaces)
• Commercialization 1 (Oingo): Taxonomy – Ontology and Semantic Techniques
• Commercialization 2 (Taalee): Knowledge-base (Taxonomy, Domain Modeling, Entities and Relationships) and Semantic Techniques
• Research (Digital Earth at UGA): Complex Relationships and “deep semantics”
1. Create an Agent Mark-Up Language (DAML) built upon XML that allows users to provide machine-readable semantic annotations for specific communities of interest.
2. Create tools that embed DAML markup on to web pages and other information sources in a manner that is transparent and beneficial to the users.
3. Use these tools to build up, instantiate, operate, and test sets of agent-based programs that markup and use DAML.
4. 5. 6. ….applications
allow semantic interoperability at the level we currently have syntactic interoperability in XML
DARPA Agent Mark Up Language (DAML)
Program Manager: Professor James Hendler
http://dtsn.darpa.mil/iso/programtemp.asp?mode=347
<Title> DAML
<subtitle> an Example </subtitle> </Title>
<USE-ONTOLOGY ID="PPT-ontology" VERSION="1.0" PREFIX="PP" URL="http://iwp.darpa.mil/ppt..html">
<CATEGORY NAME="pp.presentation" FOR="http://iwp.darpa.mil/jhendler/agents.html">
<RELATION-VALUE POS1="Agents" POS2="/madhan">
<ONTOLOGY ID="powerpoint-ontology" VERSION="1.0" DESCRIPTION="formal model for powerpoint presentations">
<DEF-CATEGORY NAME="Title" ISA="Pres-Feature">
<DEF-CATEGORY NAME="Subtitle" ISA="Pres-Feature">
<DEF-RELATION NAME="title-of" SHORT="was written by">
<DEF-ARG POS=1 TYPE="presentation">
<DEF-ARG POS=2 TYPE="presenter">
Source : http://www.darpa.mil/iso/DAML/
Objects on the Web can, in principle, be marked up (manually or automatically) to include the following information:
• Descriptions of data they contain (DBs)
• Descriptions of functions they provide (Code)
• Descriptions of data they can provide (Sensors)
Example of searching on DAML-centric semantic Web
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Value of Information
Directory; Structure; Table of Contents
Targeting
Search; Syntax; Index
Semantics results in deep understanding of content, resulting in more relevant and timely match with the
information needs and targeting.
Semantics; Entity+Rel+Events;Meaning with Context
• Oingo Ontology – ODP based(?), the database of millions of concepts and relationships that powers Oingo's semantic technology
• Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
• Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
• Oingo Lingua - the language of meaning used to state intent. The basis for intelligent interaction
• Assets catalogued are Web sites or Web pages.
Taalee WorldModel™: Domain Models (metadata of domain-media-business attributes, types), Ontologies, Entities, Relationships, Automated “Experts”, Reference Data (Live Encyclopedia), Mappings
Taalee Distributed Intelligent Agent Infrastructure: push/pull/scheduled agents for fresh extraction
Taalee Metabase of A/V assets
Taalee Semantic Engine™ with contextual reasoning
Taalee Semantic Engine
WorldModel™
Extractor Agents
WorldModel: Understanding of content, profiles, targeting needs
Automatic Extraction Agents: Expert driven value addition
Metabase
Metabase: Rapidly growing A/V aggregation
Semantic Personalization
Semantic Cataloging
Semantic Search
Semantic Targeting
Semantic Directory
Semantic Categorization
Virage Search on football touchdown
Jimmy Smith Interview Part Seven
Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline...
Brian Griese Interview Part Four
Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline...
Metadata from Typical Cataloging of Football Assets
Taalee Metadata on Football Assets
Rich Media Reference Page
Baltimore 31, Pit 24
http://www.nfl.com
Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder to extend the Ravens’ lead to 31-24 in the third quarter.
League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Quandry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
Simply the most precise and freshest A/V search
Context and Domain Specific Attributes; Uniform Metadata for Content from Multiple Sources; Can be sorted by any field
Delightful, relevant information, exceptional targeting opportunity
Looking ahead
TO:
Information requests
Content search
Semantic retrieval
Interpretation
Knowledge creation
Knowledge sharing
FROM:
Browsing
Lexical search
Data exchange
Data retrieval
Evolving targets and approaches in integrating data and information (a personal perspective)
Generation I (1980s): Mermaid, DDTS, Multibase, MRDSM, ADDS, IISS, Omnibase, ...
Generation II (1990s): InfoSleuth, KMed, DL-I projects, Infoscopes, HERMES, SIMS, Garlic, TSIMMIS, Harvest, RUFUS, ...
VisualHarness, InfoHarness
Generation III (1997...): DL-II/DARPA/KA2 projects, OntoBroker, ..., Taalee, Observer, ADEPT, InfoQuilt
Terminology (and language) transparency
Domain modeling (entities with domain specific
attributes) and complex relationships
Comprehensive metadata management
Context-sensitive information processing
Semantic correlation
enablers of the emerging concepts
Digital Earth Prototype System at UGA
Develop a Digital Earth Modeling System
Answer requests for collection of information from distributed resources
Develop a supportive learning environment for undergraduate geography students
A Digital Library Scenario VOLCANOES ACTIVITY
Some volcanoes are more active than others, and a few
are in a state of permanent eruption, at least for the
geological present. Volcanoes may become quiescent
(dormant) for months or years. The danger to life posed by
active volcanoes is not limited to eruption of molten rock or
showers of ash and cinders.
Mudflows that melt ice and
snow on the volcano's flanks
are equally hazardous*.
* Encarta® 98 Desk Encyclopedia © & 1996-97 Microsoft Corporation. All rights reserved.
Pu'u'O'o, Hawaii
A sample information request:
Find information on volcanoes in St. Helens and how
they affect the environment.
Some of the ontologies involved in processing this information request are:
• Ontology for GIS Datasets;
• Ontology for Natural Disasters;
• Ontology for Volcanoes;
• Ontology for Environment;
TRY HERE THIS AND OTHER CONCEPT DEMOS
A Digital Library Scenario VOLCANOES ACTIVITY
“An iscape is an information request that supports learning and semantic interoperability (about Digital Earth)”
(ADEPT at UGA)
Iscape working definition
Iscapes are useful to understand geographical phenomena, typically involving relationships between them
Iscapes are created by instructors using an iscape specification framework
Iscapes are run by students while learning about Digital Earth
Iscapes creation framework fits in the ADEPT agent-based architecture prototype
Iscapes in the context of digital earth (ADEPT)
Iscape specification framework
Information Landscape
Ontologies
Relationships
Learning/What-if
Operations/Simulation
Presentation
Creation
Information Landscapes
A modular specification framework to represent information landscapes
Specifications of complex information requests over multiple ontologies
Specification of relationships, including “affects”
Enabling user-configurable parameters
Enabling operations including simulations
A graphical toolkit for easy creation of iscapes
Information Landscapes
Learning paradigm for students
Uses embedded ontological terms and iscapes
Metadata framework
Models spatial, temporal and theme-based metadata
Uses FGDC and Dublin Core standards to represent domain-independent metadata
Relations
Given a set X, a relation is some property that
may or may not hold between one member of
X and a member of another set
Various relationships:
“equals”, “less_than”, “is_a”, “is_part_of”, “like”
Semantic Relations
Most of these relations are hierarchical or similarity based
These are not powerful enough for our task of semantic interoperability between domains like Geography
In these domains, we have a natural “affects” relation between the ontologies
Semantic Relations
How does A affect B?
A, in its entirety or by a set of its components, induces some changes or properties on a set of components of B
Design of “affects”
How do volcanoes affect the environment?
VOLCANO: LOCATION, ASH RAIN, PYROCLASTIC FLOW
ENVIRON.: LOCATION, PEOPLE, ATMOSPHERE, PLANT, BUILDING
Relationship labels: DESTROYS, COOLS, DESTROYS, KILLS
[Area (Pyroclastic Flow) INTERSECT Area (Plant)]
=> [Pyroclastic Flow destroys Plant]
[Size (Ash Particles) < 2] => [Ash Rain cools Atmosphere]
[Pyroclastic Flow destroys Plant] and [Ash Rain cools Atmosphere]
=>[Volcano affects Environment]
(x | x ∈ A_SC) and (y | y ∈ B_SC) [ FN(x) operator FN(y) ]* => [ A_SC relation B_SC ]
[ A_SC relation B_SC ]* => A affects B
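The rule pattern above can be sketched in a few lines of Python. This is only an illustrative sketch, not the ADEPT implementation: the `ComponentRule` class, the attribute names (`area`, `particle_size`), the grid-cell representation of areas, and the reading of `*` as "at least one component-level relation holds" are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical representation: a component is a dict of attribute values.
Component = Dict[str, object]


@dataclass
class ComponentRule:
    """One rule of the form [FN(x) operator FN(y)] => [x relation y]."""
    source: str      # component of ontology A, e.g. "Pyroclastic Flow"
    target: str      # component of ontology B, e.g. "Plant"
    relation: str    # e.g. "destroys"
    condition: Callable[[Component, Component], bool]


def areas_intersect(x: Component, y: Component) -> bool:
    # Assumption: an area is a collection of grid cells; INTERSECT means overlap.
    return bool(set(x["area"]) & set(y["area"]))


RULES = [
    ComponentRule("Pyroclastic Flow", "Plant", "destroys", areas_intersect),
    ComponentRule("Ash Rain", "Atmosphere", "cools",
                  lambda x, y: x["particle_size"] < 2),
]


def derive_relations(rules: List[ComponentRule],
                     a_components: Dict[str, Component],
                     b_components: Dict[str, Component]) -> List[str]:
    """Return every component-level relation whose condition holds."""
    derived = []
    for rule in rules:
        x = a_components.get(rule.source)
        y = b_components.get(rule.target)
        if x is not None and y is not None and rule.condition(x, y):
            derived.append(f"{rule.source} {rule.relation} {rule.target}")
    return derived


volcano = {
    "Pyroclastic Flow": {"area": [(1, 1), (1, 2)]},
    "Ash Rain": {"particle_size": 1.5},
}
environment = {
    "Plant": {"area": [(1, 2), (3, 3)]},
    "Atmosphere": {},
}

derived = derive_relations(RULES, volcano, environment)
# Assumed reading of "*": A affects B if at least one relation was derived.
volcano_affects_environment = bool(derived)
```

Keeping the conditions as plain callables mirrors the slide's idea that the operator sits behind a standard interface, so an imprecise or fuzzy match could be swapped in without changing the rule structure.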
Design of “affects”
Mapping Functions
How do volcanoes affect the environment?
[ Location (Volcano) = Location (Environment) ]
Enclosing function provides a standard interface to the operator
Operator does imprecise or fuzzy match
Achieves Geo-spatial interoperability
Mapping Functions
How do volcanoes affect the environment?
[ Time (Volcano) = Time (Environment) ]
Matches, with a tolerance depending on the granularity of values
Tolerance different for different entities; Specified default; Can be user-defined
Achieves temporal interoperability
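A tolerance-based temporal match of this kind might look like the sketch below. The granularity names and the default tolerances are hypothetical values chosen for illustration; the slides only state that a default exists, that it varies by entity and granularity, and that users can override it.

```python
from datetime import date
from typing import Optional

# Hypothetical default tolerances, in days, keyed by the granularity of the values.
DEFAULT_TOLERANCE_DAYS = {"day": 1, "month": 31, "year": 366}


def times_match(t_a: date, t_b: date, granularity: str = "year",
                tolerance_days: Optional[int] = None) -> bool:
    """Imprecise '=' between two dates.

    The default tolerance comes from the granularity of the values;
    a user-defined tolerance, when given, takes precedence.
    """
    if tolerance_days is None:
        tolerance_days = DEFAULT_TOLERANCE_DAYS[granularity]
    return abs((t_a - t_b).days) <= tolerance_days


eruption = date(1980, 5, 18)      # Time (Volcano)
observation = date(1980, 9, 1)    # Time (Environment), 106 days later

same_year = times_match(eruption, observation, "year")
same_month = times_match(eruption, observation, "month")
user_defined = times_match(eruption, observation, "month", tolerance_days=120)
```

The same shape would work for the location match, with a distance function in place of the date difference.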
Operations
Powerful mechanism for studying geographical domains and other complex phenomena
Input parameters can be changed to support learning
E.g., statistical operations, numerical analysis, simulation modeling, etc.
Clarke’s Urban Growth Model (UGM)
Demonstrates the utility of integrating existing historic maps
with remotely sensed data and related geographic information
to dynamically map urban land characteristics for large
metropolitan areas.
San Francisco Bay Area prediction of urban extent in 2100
Domain of Learning – URBAN DYNAMICS
Digital Earth Prototype: run-time architecture overview
RELATE
Correlation Agent
Planning Agent
User Agent
Wrapped Resource Agent
Ontology Agent
Broker
Cost Model
Web Wrapper
Simulation Database Wrapper
ADEPT Metabase
Metabase Resource Agent
Simulation Resource Agent
Semantic Web: Possible Evolution
HTML XML
XHTML SMIL RDF
Declarative Languages
DAML-O, OIL
FUEL – User defined/supplied operators, functions, computations
OIL,FUEL
FUEL as OIL Extension?
• class-def • subclass-of • slot-def • subslot-of • domain • range
• class-expressions
• AND, OR, NOT
• slot-constraints
• has-value, value-type • cardinality
• slot-properties • trans, symm
RDF(S) FUEL
• Framework for mapping data/formats
• user-defined operators, e.g., affects, simulations
OIL
Semantic Web can be a basis of handling information
overload and provide semantic interoperability
Step wise enrichment -- starting with constrained and
well understood language (such as based on Description
Logic), let us explore how we can support richer/deeper
semantics for enabling complex decision making and
learning involving heterogeneous digital media on the
Global Information Infrastructure
The Promise of the Web with Semantics….
“Humankind has not woven the web of life. We are but one thread within it. Whatever we do to the web, we do to ourselves. All things connect.” – Chief Seattle, 1854
amit@taalee.com – http://www.taalee.com
amit@cs.uga.edu – http://lsdis.cs.uga.edu
Further reading
http://www.semanticweb.org
http://www.daml.org
http://lsdis.cs.uga.edu/~adept
“DAML could take search to a new level”, http://www.zdnet.com/pcweek/stories/news/0,4153,2432538,00.html
V. Kashyap and A. Sheth, Information Brokering, Kluwer Academic Publishers, 2000.
Tim Berners-Lee, Weaving the Web, Harper, 1999.
Editorial writing by Ramesh Jain in IEEE Multimedia. Gio’s papers. OIL ….
For additional details on Information Brokering Architecture: Realizing Semantic Information Brokering and Semantic Web, ITC-IRST/University of Trento Seminar Series on Perspectives on Agents: Theories and Technologies, April 27, 2000, Trento, Italy. http://lsdis.cs.uga.edu/~adept/presenta.html
For additional details on ISCAPE specification and execution: Project Overview and Detailed Presentation at http://lsdis.cs.uga.edu/~adept/presenta.html
Demonstrations at: http://lsdis.cs.uga.edu/~adept
Iscape specification using XML
<?xml version="1.0"?>
<!DOCTYPE IscapeCollection SYSTEM "IscapeCollection.dtd">
<!-- A template collection for all iscapes -->
<!-- All iscapes -->
<IscapeCollection>
  <!-- An iscape specification for how stratovolcanoes affect the environment -->
  <Iscape>
    <!-- Identifying this iscape -->
    <Name>How do stratovolcanoes affect the environment</Name>
    <Description>An iscape using the affects relationship</Description>
    <!-- All ontologies which participate -->
    <Ontologies>
      <Ontology>Volcano</Ontology>
      <Ontology>Environment</Ontology>
    </Ontologies>
    <!-- Operations involved -->
    <Operation>
      <Relation>Affects</Relation>
    </Operation>
    <!-- Constraints on ontologies -->
    <OntologicalConstraints>
      <Constraint>Volcano morphology is stratovolcano</Constraint>
      <Constraint>Volcano start year is 1950</Constraint>
    </OntologicalConstraints>
    <!-- Metadata to present in the result -->
    <Presentation>Volcano and Environment Metadata</Presentation>
    <!-- What can the student configure -->
    <Student>
      <Config>Location of Environment</Config>
    </Student>
  </Iscape>
  <!-- This iscape ends -->
  <!-- Next iscape starts -->
  <Iscape>
    …
    …
  </Iscape>
</IscapeCollection>
<!-- Iscape collection ends here -->
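As a rough illustration of how an agent might read such a specification, the sketch below parses a cleaned-up copy of the volcano iscape with Python's standard `xml.etree.ElementTree`. The `load_iscapes` function and the dictionary keys it returns are hypothetical, and the DOCTYPE line is left out of the sample for brevity.

```python
import xml.etree.ElementTree as ET

# A well-formed copy of the iscape from the slides (DOCTYPE omitted).
ISCAPE_XML = """<?xml version="1.0"?>
<IscapeCollection>
  <Iscape>
    <Name>How do stratovolcanoes affect the environment</Name>
    <Description>An iscape using the affects relationship</Description>
    <Ontologies>
      <Ontology>Volcano</Ontology>
      <Ontology>Environment</Ontology>
    </Ontologies>
    <Operation>
      <Relation>Affects</Relation>
    </Operation>
    <OntologicalConstraints>
      <Constraint>Volcano morphology is stratovolcano</Constraint>
      <Constraint>Volcano start year is 1950</Constraint>
    </OntologicalConstraints>
    <Presentation>Volcano and Environment Metadata</Presentation>
    <Student>
      <Config>Location of Environment</Config>
    </Student>
  </Iscape>
</IscapeCollection>
"""


def _texts(parent, path):
    """Stripped text of every element under `parent` that matches `path`."""
    return [(el.text or "").strip() for el in parent.findall(path)]


def load_iscapes(xml_text):
    """Turn an IscapeCollection document into a list of plain dictionaries."""
    root = ET.fromstring(xml_text)
    iscapes = []
    for iscape in root.findall("Iscape"):
        iscapes.append({
            "name": (iscape.findtext("Name") or "").strip(),
            "description": (iscape.findtext("Description") or "").strip(),
            "ontologies": _texts(iscape, "Ontologies/Ontology"),
            "relations": _texts(iscape, "Operation/Relation"),
            "constraints": _texts(iscape, "OntologicalConstraints/Constraint"),
            "presentation": (iscape.findtext("Presentation") or "").strip(),
            "student_config": _texts(iscape, "Student/Config"),
        })
    return iscapes


iscapes = load_iscapes(ISCAPE_XML)
```

The names stored here (relation, constraint, presentation, configuration) are the keys that the separate Relations, Ontological Constraints, Presentation and Student templates are correlated on.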
Relations
<?xml version="1.0"?>
<!DOCTYPE Relations SYSTEM "Relations.dtd">
<!-- Template collection of all relations in the system -->
<Relations>
  <!-- Relation specification starts here -->
  <Relation>
    <!-- Information to correlate with base iscape -->
    <Name>Affects</Name>
    <!-- Ontologies involved -->
    <OntologyA>Volcano</OntologyA>
    <OntologyB>Environment</OntologyB>
    <!-- All operators -->
    <OperatorSet>
      <!-- Specification has value and mapping conditions -->
      <ValueCondition>
        <OntologyName>Environment</OntologyName>
        <Attribute>Damage</Attribute>
        <ValOperator>GREATERTHANEQUALS</ValOperator>
        <Value>10000</Value>
        <Type>Integer</Type>
      </ValueCondition>
      <MappingCondition>
        <FunctionA>Area</FunctionA>
        <ElementA>Volcano</ElementA>
        <Operator>EQUALS</Operator>
        <FunctionB>Area</FunctionB>
        <ElementB>Environment</ElementB>
      </MappingCondition>
    </OperatorSet>
    <!-- End of all operators -->
  </Relation>
  <!-- End of this relation specification -->
</Relations>
<!-- End of relation collection -->
Ontological Constraints
<?xml version="1.0"?>
<!DOCTYPE OntologicalConstraints SYSTEM "OntologicalConstraints.dtd">
<!-- Template to specify ontological constraints -->
<!-- A collection of ontological constraints for all iscapes -->
<OntologicalConstraints>
  <!-- A constraint on this iscape -->
  <Constraint>
    <IscapeID>Volcano-Env</IscapeID>
    <Name>Volcano morphology is stratovolcano</Name>
    <LHSOntology>Volcano</LHSOntology>
    <LHSAttribute>Morphology</LHSAttribute>
    <Operator>LIKE</Operator>
    <Type>String</Type>
    <RHSValue>Stratovolcano</RHSValue>
  </Constraint>
</OntologicalConstraints>
<!-- Collection of ontological constraints ends here -->
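To make the constraint template concrete, here is a minimal sketch of how one `<Constraint>` element could be evaluated against a metadata record. The operator semantics are assumptions: `LIKE` is treated as a case-insensitive substring match, and the `OPERATORS` and `CASTS` tables are hypothetical helpers, not part of the ADEPT system.

```python
import xml.etree.ElementTree as ET

CONSTRAINT_XML = """<Constraint>
  <IscapeID>Volcano-Env</IscapeID>
  <Name>Volcano morphology is stratovolcano</Name>
  <LHSOntology>Volcano</LHSOntology>
  <LHSAttribute>Morphology</LHSAttribute>
  <Operator>LIKE</Operator>
  <Type>String</Type>
  <RHSValue>Stratovolcano</RHSValue>
</Constraint>"""

# Assumed operator semantics; LIKE is read as a case-insensitive substring test.
OPERATORS = {
    "LIKE": lambda lhs, rhs: str(rhs).lower() in str(lhs).lower(),
    "EQUALS": lambda lhs, rhs: lhs == rhs,
    "GREATERTHANEQUALS": lambda lhs, rhs: lhs >= rhs,
}

# Maps the <Type> element to a Python conversion function.
CASTS = {"String": str, "Integer": int}


def constraint_holds(constraint, record):
    """Check one <Constraint> element against a dict of attribute values."""
    attribute = constraint.findtext("LHSAttribute").strip()
    operator = OPERATORS[constraint.findtext("Operator").strip()]
    cast = CASTS[constraint.findtext("Type").strip()]
    rhs = cast(constraint.findtext("RHSValue").strip())
    return operator(cast(record[attribute]), rhs)


constraint = ET.fromstring(CONSTRAINT_XML)
st_helens = {"Morphology": "Stratovolcano"}      # illustrative record
shield = {"Morphology": "Shield volcano"}        # illustrative record

st_helens_ok = constraint_holds(constraint, st_helens)
shield_ok = constraint_holds(constraint, shield)
```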
Presentation
<?xml version="1.0"?>
<!DOCTYPE Presentation SYSTEM "Presentation.dtd">
<!-- Template for presentation attributes -->
<!-- All presentation attributes are embedded here -->
<Presentation>
  <!-- Presentation attributes for this iscape -->
  <IncludeThese>
    <IscapeID>Volcano-Env</IscapeID>
    <Name>Volcano and Environment Metadata</Name>
    <Include>
      <Ontology>Volcano</Ontology>
      <Attribute>TectonicSetting</Attribute>
    </Include>
    <Include>
      <Ontology>Volcano</Ontology>
      <Attribute>EndYear</Attribute>
    </Include>
  </IncludeThese>
</Presentation>
<!-- Presentation attributes end here -->
Student
<!DOCTYPE Student SYSTEM "Student.dtd">
<!-- Template for student configurable attributes -->
<!-- All parameters which can be configured by a student -->
<UserConfigurable>
  <!-- Configuration for a particular iscape -->
  <Config>
    <!-- Correlating information -->
    <Name>Location of environment</Name>
    <!-- The parameters which are configurable -->
    <Parameter>
      <Ontology>Environment</Ontology>
      <Attribute>LocationName</Attribute>
      <DisplayName>Configure Location</DisplayName>
      <Value>Hawaii</Value>
      <Value>Kilauea</Value>
    </Parameter>
  </Config>
  <!-- Configuration for this iscape ends here -->
</UserConfigurable>
<!-- End of all student configurable parameters -->
Receives the result collections from each of the resource agents
Correlates the results on the basis of information provided in the iscape and the query plan generated by the planning agent
Performs data cleaning operations, merges the results into a uniform result set, and passes it on to the user agent
Responsible for performing operations, if specified in the iscape
The correlation agent
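The merging step can be pictured with a small sketch. Everything here is assumed for illustration: the `correlate` function, the idea of joining on a single shared key, the whitespace and case normalization standing in for "data cleaning", and the sample records.

```python
def correlate(collections, key):
    """Merge per-agent result collections into one uniform result set.

    `collections` maps an agent name to a list of records (dicts).
    Records are joined on `key` after a simple cleaning step; the first
    value seen for a field is kept.
    """
    merged = {}
    for _agent, records in collections.items():
        for record in records:
            clean_key = str(record[key]).strip()
            row = merged.setdefault(clean_key.lower(), {key: clean_key})
            for field, value in record.items():
                if field != key and value is not None:
                    row.setdefault(field, value)
    # Sort by the normalized key so the output order is deterministic.
    return [merged[k] for k in sorted(merged)]


# Hypothetical result collections from two resource agents.
results = correlate(
    {
        "metabase_agent": [{"location": "Hawaii ", "volcano": "Kilauea"}],
        "web_agent": [
            {"location": "hawaii", "damage": 12000},
            {"location": "Washington", "volcano": "St. Helens"},
        ],
    },
    key="location",
)
```

In the architecture described on the slides, the join conditions would come from the iscape and the planning agent's query plan rather than from a single fixed key.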
Realizing Semantic Information Brokering and Semantic Web in summary
Text, Structured Databases | Data | Syntax, System | Federated DB
Semi-structured | Metadata | Structural, Schematic | Mediator, Federated IS
Visual, Scientific/Eng. | Knowledge | Semantic | Knowledge Mgmt., Information Brokering/Mediator, Cooperative IS
Popular Alternative perspective/approach: Linguistics, IR, AI
Graduate students in a College of Geography have a final
project in which a case study is proposed. In the case,
they are supposed to help a City Council make
decisions on the planning of a new landfill. This is a
hands-on learning exercise through interaction
with a Digital Earth, and the starting
point would be to find the best
location for the landfill*.
Tacoma Landfill
* This scenario comes in support of one of the suggestions for
Digital Earth scenarios sampled by the “First Inter-Agency Digital
Earth Working Group”, an effort on behalf of NASA’s inter-agency
Digital Earth Program.
Taking advantage of the Web for learning
A high-level information request would be:
Find a landfill site for a new landfill near the source of the wastes.
The earthquakes’ impacts must be evaluated.
by definition; by semantics; by synonymy
A first-cut refinement leads us to the following information request:
Find a proper soil in sites not subject to flooding or high groundwater levels for a new landfill near the industrial zone.
Liquefaction phenomenon cannot occur.
An example scenario of learning on the Web
Adding on-the-fly user constraints while processing the information request:
Retrieve satellite images in 12-meter resolution or higher,
looking for soils with permeability rate < 10 (silty clay loam)
for a new landfill
whose distance from the city industrial park is less than 5km.
Using the images’ coordinates, forecast seismic activity up to
moderate magnitude (5 - 5.9, Richter scale) in the pointed areas.
domain specific metadata; correlation among multiple ontologies; return results in multiple media (in this case, images and a simulation)
An example scenario of learning on the Web
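The on-the-fly user constraints in this request amount to a filter over domain-specific metadata. The sketch below is illustrative only: the `CandidateSite` fields and the sample values are made up, and "12-meter resolution or higher" is read as a ground resolution of 12 m or finer.

```python
from dataclasses import dataclass


@dataclass
class CandidateSite:
    """Hypothetical metadata record for one candidate landfill site."""
    name: str
    image_resolution_m: float              # ground resolution of the satellite image
    soil_permeability: float               # permeability rate, units as in the request
    distance_to_industrial_park_km: float


def satisfies_user_constraints(site: CandidateSite,
                               max_resolution_m: float = 12.0,
                               max_permeability: float = 10.0,
                               max_distance_km: float = 5.0) -> bool:
    """Apply the three user constraints from the information request."""
    return (site.image_resolution_m <= max_resolution_m    # 12 m or finer
            and site.soil_permeability < max_permeability
            and site.distance_to_industrial_park_km < max_distance_km)


candidates = [
    CandidateSite("Site A", 8.0, 6.5, 3.2),
    CandidateSite("Site B", 30.0, 6.5, 3.2),    # image too coarse
    CandidateSite("Site C", 8.0, 14.0, 3.2),    # soil too permeable
    CandidateSite("Site D", 8.0, 6.5, 7.5),     # too far from the industrial park
]

selected = [site.name for site in candidates if satisfies_user_constraints(site)]
```

The seismic forecast in the second half of the request would be a simulation run on the coordinates of the selected sites, which is outside the scope of this filter.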
Partial sample ontologies for semantic information brokering:
LAND USE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL
LAND (SITE): CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE
WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC., SOLID, SEWAGE; shredding, magnetic separation, screening, washing
NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI
Relationship label: causes (four edges)
An example scenario of learning on the Web
A sample result (depending on information providers) could be:
OrbView-4’s stereo imaging capacity providing 3-D terrain images
Hyperspectral data will be valuable for identifying material types
images source: http://www.orbimage.com
5km
industrial zone
identified landfill site
The students now have the information requested for
helping the City Council in the planning of the new landfill
An example scenario of learning on the Web