Page 1: CSE-291: Ontologies in Data Integration

Department of Computer Science & Engineering
University of California, San Diego
Spring 2003

Bertram Ludäscher
[email protected] (@SDSC.EDU)

Page 2: Outline

• Wrapping up last week
• What is a representation?
• [Thesauri, Topic Maps]
• Predicate Logic Primer
• Description logics
• [RDF & RDF Schema]
• [F-logic]
• Topic Selection

Special thanks:
• Alexander Maedche, Steffen Staab:
– ECAI’2002 Tutorial on Ontologies

Page 3: Ontologies … For What?

• Lack of a shared understanding leads to poor communication
=> People, organizations and software systems must communicate between and among themselves
• Disparate modeling paradigms, languages and software tools limit
=> Interoperability
=> Knowledge sharing & reuse
[Uschold, Gruninger, 96]

Page 4: Origin and History (I)

• Ontology: a philosophical discipline, the branch of philosophy that deals with the nature and the organisation of reality
• Science of Being (Aristotle, Metaphysics, IV, 1)
• Tries to answer the questions:
– What is being?
– What are the features common to all beings?

Page 5: Origin and History (II)

• Humans require words (or at least symbols) to communicate efficiently. The mapping of words to things is only indirectly possible. We do it by creating concepts that refer to things.
• The relation between symbols and things has been described in the form of the meaning triangle:

[Figure: meaning triangle relating the symbol “Jaguar”, the concept, and the thing]

[Ogden, Richards, 1923]

Page 6: Origin and History (III)

• In recent years ontologies have become a hot topic of interest.
• Here, an ontology refers to an engineering artifact:
– It is constituted by a specific vocabulary used to describe a certain reality, plus
– a set of explicit assumptions regarding the intended meaning of the vocabulary.
• Thus, ontologies describe a formal partial specification of a specific domain:
– Shared understanding of a domain of interest
– Formal and machine-executable model of a domain of interest

Page 7: Human and machine communication (I)

[Figure: Human Agent 1 and Human Agent 2 exchange a symbol (“JAGUAR”), e.g. via natural language; Machine Agent 1 and Machine Agent 2 exchange symbols, e.g. via protocols. Labels: Things, Symbol, Concept (meaning triangle), internal models, formal models, formal semantics. The agents (HA1, HA2, MA1, MA2) commit to an ontology description of a specific domain, e.g. animals.]

[Maedche et al., 2002]

Page 8: Ontology & Natural Language

• It is important to emphasize that there is an m:n relationship between words and concepts
• This means practically:
– different words may refer to the same concept
– a word may refer to several concepts
• Ontology languages should provide means for making this difference explicit.

Page 9: Example

Ontology: C = {c1, c2, c3}, R = {r1}, HC(c2, c1), r1(c2, c3)

[Figure: graph with nodes c1 (person), c2 (employee), c3 (organisation), an edge HC(c2, c1) between employee and person, and an edge r1(c2, c3), “works at”, between employee and organisation]

Lexicon: LC = {person, employee, organisation}, LR = {works at}

F(person) = c1, F(employee) = c2, F(organisation) = c3,
G(works at) = r1
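The ontology and lexicon of this example can be written down directly as data. The following is a minimal Python sketch, assuming that HC(c2, c1) reads "c2 is a sub-concept of c1" and that F and G are stored as sets of pairs so that an m:n relationship between words and concepts can be expressed. The helper names `concepts_of`, `words_for`, `superconcepts` and the extra word "staff member" are illustrative and do not appear on the slides.

```python
# Core ontology from the slide: concepts C, relations R, concept hierarchy HC.
C = {"c1", "c2", "c3"}
R = {"r1"}
HC = {("c2", "c1")}             # assumed reading: c2 is a sub-concept of c1
REL = {"r1": {("c2", "c3")}}    # r1(c2, c3)

# Lexicon: words for concepts (LC) and relations (LR), linked by F and G.
# F and G are sets of (word, id) pairs so that the mapping may be m:n.
LC = {"person", "employee", "organisation"}
LR = {"works at"}
F = {("person", "c1"), ("employee", "c2"), ("organisation", "c3")}
G = {("works at", "r1")}


def concepts_of(word, f=F):
    """All concepts a word may refer to."""
    return {c for (w, c) in f if w == word}


def words_for(concept, f=F):
    """All words that refer to a concept."""
    return {w for (w, c) in f if c == concept}


def superconcepts(concept, hc=HC):
    """All concepts reachable upwards from a concept via HC."""
    found, frontier = set(), {concept}
    while frontier:
        frontier = {sup for (sub, sup) in hc if sub in frontier} - found
        found |= frontier
    return found


# Hypothetical extension (not on the slide): a second word for concept c2.
F_extended = F | {("staff member", "c2")}
```

With `F_extended`, `words_for("c2", F_extended)` returns two words for one concept, which is the "different words may refer to the same concept" case.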

Page 10: Ontology vs. Knowledge Bases

• There is no clear separation between ontology and knowledge base
• Example:

[Figure labels: person, Ann, medication, Aspirin; Aspirin, pill-1, pill-2; relations: cured-with, taken-aspirins (twice)]

• Often it remains a modeling decision whether something is modeled as a concept or as an instance. In many applications meta-modeling means are required.

Page 11: Types of Ontologies (I) [Guarino, 98]

• Top-level ontologies describe very general concepts like space, time, event, which are independent of a particular problem or domain. It seems reasonable to have unified top-level ontologies for large communities of users.
• Domain ontologies describe the vocabulary related to a generic domain by specializing the concepts introduced in the top-level ontology.
• Task ontologies describe the vocabulary related to a generic task or activity by specializing the top-level ontologies.
• Application ontologies are the most specific ontologies. Concepts in application ontologies often correspond to roles played by domain entities while performing a certain activity.

Page 12: Ontologies and their Relatives (I)

• There are many relatives around:
– Controlled vocabularies, thesauri and classification systems available in the WWW, see http://www.lub.lu.se/metadata/subject-help.html
  • Classification Systems (e.g. UNSPSC, Library Science, etc.)
  • Thesauri (e.g. Art & Architecture, Agrovoc, etc.)
– Lexical Semantic Nets
  • WordNet, see http://www.cogsci.princeton.edu/~wn/
  • EuroWordNet, see http://www.hum.uva.nl/~ewn/
– Topic Maps, http://www.topicmaps.org (e.g. used within knowledge management applications)
• In general it is difficult to find the border line!

Page 13: Ontologies and their Relatives (II)

[Figure labels, in order:]
• Catalog / ID
• Terms / Glossary
• Thesauri
• Informal Is-a
• Formal Is-a
• Formal Instance
• Frames
• Value Restrictions
• General logical constraints
• Axioms, Disjoint, Inverse Relations, ...

Page 14: Some Ontologies (and Friends) in Action

(coming soon to a project near you)

Page 15: GEON Architecture

[Figure labels: Rocky Mountains, Midatlantic Region]

Page 16: SMART (Meta)data I: Logical Data Views

Source: NADAM Team (Boyan Brodaric et al.)

Adoption of a standard (meta)data model => wrap data sets into unified virtual views

Page 17: SMART Metadata II: Multihierarchical Rock Classification for “Thematic Queries” (GSC), or: Taxonomies are not only for biologists ...

[Figure labels: Composition, Genesis, Fabric, Texture]

“smart discovery & querying” via multiple, independent concept hierarchies (controlled vocabularies)
• data at different description levels can be found and processed

Page 18: SMART Metadata III: Source Contextualization & Ontology Refinement

Biomedical Informatics Research Network, http://nbirn.net

Focused GEON ontology working meeting last week ... (GEON, SCEC/KR, GSC, ESRI)

Page 19: EcoCyc

Page 20: Gene Ontology http://www.geneontology.org

“a dynamic controlled vocabulary that can be applied to all eukaryotes”

Built by the community for the community.

• Three organising principles: Molecular function, Biological process, Cellular component
• Isa and Part of taxonomy – but not good!
• ~10,000 concepts
• Lightweight ontology, poor semantic rigour.
• Ok when small and used for annotation. Obstacle when large, evolving and used for mining.

Page 21: Controlled vocabulary

• AGROVOC: Agricultural Vocabulary

Page 22: Thesauri

• AAT: Art & Architecture Thesaurus

Page 23: Ontologies - Some Examples

• General purpose ontologies:
– WordNet / EuroWordNet, http://www.cogsci.princeton.edu/~wn
– The Upper Cyc Ontology, http://www.cyc.com/cyc-2-1/index.html
– IEEE Standard Upper Ontology, http://suo.ieee.org/
• Domain and application-specific ontologies:
– RDF Site Summary RSS, http://groups.yahoo.com/group/rss-dev/files/schema.rdf
– UMLS, http://www.nlm.nih.gov/research/umls/
– KA2 / Science Ontology, http://ontobroker.semanticweb.org/ontos/ka2.html
– RETSINA Calendaring Agent, http://ilrt.org/discovery/2001/06/schemas/ical-full/hybrid.rdf
– AIFB Web Page Ontology, http://ontobroker.semanticweb.org/ontos/aifb.html
– Web-KB Ontology, http://www-2.cs.cmu.edu/afs/cs.cmu.edu/project/theo-11/www/wwkb/
– Dublin Core, http://dublincore.org/
• Meta-Ontologies
– Semantic Translation, http://www.ecimf.org/contrib/onto/ST/index.html
– RDFT, http://www.cs.vu.nl/~borys/RDFT/0.27/RDFT.rdfs
– Evolution Ontology, http://kaon.semanticweb.org/examples/Evolution.rdfs
• Ontologies in a wider sense
– Agrovoc, http://www.fao.org/agrovoc/
– Art and Architecture, http://www.getty.edu/research/tools/vocabulary/aat/
– UNSPSC, http://eccma.org/unspsc/
– DTD standardizations, e.g. HR-XML, http://www.hr-xml.org/

Page 24: Ontology Representation

What is a “representation”?

[Figure: meaning triangle with the symbol “Jaguar” and the concept]

Page 25: Ontology Representation Languages

• Machines need communication with formal content to restrict meaning
• What makes a language “formal”?
– model theory (1st order predicate logic)
– proof theory (Gentzen calculus)
But also:
– conventions (e.g. Java)

Page 26: What makes a language suitable?

For machine communication
• model theory
• proof theory
• tractability
• strong conventions of use
• human readable names

For human communication
• strong conventions of use
• human readable names
• “natural” primitives

Page 27: Representation Paradigms (incomplete)

• Ontologies
• Topic Maps
• extended ER model
• Thesauri
• Predicate Logics / Description Logics
• Semantic Nets
• Taxonomies

Page 28: Thesaurus

Page 29: Thesauri

Example:
[Figure: terms Fruit, Orange, Apfelsine (German) and Vegetable, connected by edges labeled similarTo, synonymWith and NarrowerTerm]

- Well known in library science
- cf. terminologies / classifications (Dewey)
- Graph with labeled edges (similar, nt, bt, synonym)
- Fixed set of edge labels (aka relations)
- no instances
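A thesaurus in this sense is a graph over terms with a fixed set of edge labels. The following is a minimal Python sketch of that structure. The class name `Thesaurus`, the choice to store inverse edges, and the direction of the three example edges are assumptions read off the flattened figure, not something the slides specify.

```python
from collections import defaultdict

# Fixed set of edge labels; bt (broader term) is treated as the inverse of nt.
LABELS = {"similar", "nt", "bt", "synonym"}
INVERSE = {"nt": "bt", "bt": "nt", "similar": "similar", "synonym": "synonym"}


class Thesaurus:
    """Terms connected by labeled edges; there are no instances."""

    def __init__(self):
        self.edges = defaultdict(set)   # (term, label) -> set of terms

    def add(self, source, label, target):
        if label not in LABELS:
            raise ValueError(f"unknown edge label: {label}")
        self.edges[(source, label)].add(target)
        # Store the inverse edge so both directions can be looked up.
        self.edges[(target, INVERSE[label])].add(source)

    def related(self, term, label):
        return set(self.edges.get((term, label), set()))


t = Thesaurus()
t.add("Fruit", "nt", "Orange")            # Orange is a narrower term of Fruit
t.add("Orange", "synonym", "Apfelsine")   # German synonym
t.add("Fruit", "similar", "Vegetable")
```

Because the label set is closed, an attempt to add something like an instance-of edge is rejected, which mirrors the "fixed set of edge labels" and "no instances" points.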

Page 30: [no extractable text]

Page 31: Topic Maps are ...

• Standardized: ISO/IEC 13250:2000
– ISO standard published Jan. 2000
– enabling standard to describe knowledge structures, electronic indices, classification schemes, ...
• Web enabled:
– XML Topic Maps (XTM) are ready to use
• Designed to:
– manage the info glut
– build valuable information networks above any kind of resources / data objects
– enable the structuring of unstructured information

Page 32: Back-of-the-Book Index “British Virgin Islands”

Gorda Sound see North Sound
Little Dix Bay .................... 89
North Sound ....................... 90
Road Harbour see also Road Town ... 73
Road Town ...................... 69,71
Spanish Town ................... 81,82
Tortola ........................... 67
Virgin Gorda ...................... 77

Page 33: Back-of-the-Book Index “British Virgin Islands”

[Same index as on Page 32; label: Topics]

Page 34: Back-of-the-Book Index “British Virgin Islands”

[Same index as on Page 32; label: Occurrences]

Page 35: Back-of-the-Book Index “British Virgin Islands”

[Same index as on Page 32; label: Different topic classes]

Page 36: Back-of-the-Book Index “British Virgin Islands”

[Same index as on Page 32; label: Different occurrence classes]

Page 37: Back-of-the-Book Index “British Virgin Islands”

[Same index as on Page 32; label: Multiple topic names]

Page 38: Back-of-the-Book Index “British Virgin Islands”

[Same index as on Page 32; label: Association]
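One way to read the index as a topic map: the entries are topics, the page numbers are occurrences, a “see” reference gives a second name for the same topic, and a “see also” reference is an association between two topics. The following is a minimal Python sketch under that reading. The `Topic` class, the topic ids and the helper functions are illustrative; they are not part of ISO/IEC 13250 or XTM.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    names: set                                        # all names of the topic
    occurrences: list = field(default_factory=list)   # page numbers


# Topics and occurrences taken from the index entries.
topics = {
    "little-dix-bay": Topic({"Little Dix Bay"}, [89]),
    # "Gorda Sound see North Sound": two names, one topic.
    "north-sound": Topic({"North Sound", "Gorda Sound"}, [90]),
    "road-harbour": Topic({"Road Harbour"}, [73]),
    "road-town": Topic({"Road Town"}, [69, 71]),
    "spanish-town": Topic({"Spanish Town"}, [81, 82]),
    "tortola": Topic({"Tortola"}, [67]),
    "virgin-gorda": Topic({"Virgin Gorda"}, [77]),
}

# "Road Harbour see also Road Town": an association between two topics.
associations = [("see-also", "road-harbour", "road-town")]


def find(name):
    """Look up a topic id by any of its names."""
    for topic_id, topic in topics.items():
        if name in topic.names:
            return topic_id
    return None


def associated(topic_id):
    """Topics linked to topic_id by any association, in either direction."""
    linked = set()
    for _kind, a, b in associations:
        if a == topic_id:
            linked.add(b)
        elif b == topic_id:
            linked.add(a)
    return linked
```

Looking up either "Gorda Sound" or "North Sound" leads to the same topic and therefore the same occurrences, which is what the “see” reference expresses in the printed index.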

Page 39: Topics – Computerized Subjects

[Figure, three layers:
Resources: SurfBVI, BVI Welcome, CaribNet
Topics (each standing for a subject): Little Dix Bay, Tortola, Road Town, Virgin Gorda, North Sound, Road Harbour, Spanish Town
Topic classes: Bay, Island, Town]

Page 40: Occurrences

[Figure:
Resources: SurfBVI, BVI Welcome, CaribNet
Topics: Little Dix Bay, Tortola, Road Town, Virgin Gorda, North Sound, Road Harbour, Spanish Town
Occurrences: links between topics and resources, each labeled Image, Map or Article
Occurrence classes: Image, Map, Article]

Page 41: Occurrences

[Figure: as on Page 40, without the per-link labels. Occurrence classes: Image, Map, Article]

Page 42: Associations

[Figure:
Topics: Little Dix Bay, Tortola, Road Town, Virgin Gorda, North Sound, Road Harbour, Spanish Town
Associations: links between topics, each labeled Vicinity, Part-Whole or Geo Containment
Association classes: Geo Containment, Vicinity, Part-Whole]

Page 43: Associations

[Figure: as on Page 42, without the per-link labels. Association classes: Geo Containment, Vicinity, Part-Whole]

Page 44: Class Hierarchies

[Figure:
Topics: Little Dix Bay, Tortola, Road Town, Virgin Gorda, North Sound, Road Harbour, Spanish Town
Topic classes: Bay, Island, Town]

Page 45: Class Hierarchies

[Figure:
Topics: Little Dix Bay, Tortola, Road Town, Virgin Gorda, North Sound, Road Harbour, Spanish Town
Super-classes: Bay, Island, Town
Sub-classes: Bay for swimming, Anchor bay, Land, Capital, Suburb]

Page 46: Scopes

[Figure: a topic map with names in English and German.
Topics: Brit. Virgin Islands / Brit. Jungferninseln, Caribbean / Karibik, Great Britain / Großbritannien
Occurrence classes: Image / Bild, Map / Karte, Article / Artikel
Resources: SurfBVI, BVI Welcome, CaribNet
Association classes: Geo Containment / Geo Umschließung, Political Dependency / Politische Abhängigkeit]

Page 47: Scopes

[Figure: as on Page 46, with the label “Scopes” added]

Page 48: Scopes

[Figure: as on Page 46, with scopes for names: English, Deutsch]

Page 49: Scopes

[Figure: as on Page 48, plus scopes for occurrences: Public, Confidential]

Page 50: Scopes

[Figure: as on Page 49, plus scopes for associations: Geography, Politics]

Page 51: Scope Examples: English, Public, Politics

[Figure: the same topic map with English names only.
Topics: Brit. Virgin Islands, Caribbean, Great Britain
Association classes: Geo Containment, Political Dependency
Occurrence classes: Image, Map, Article
Resources: SurfBVI, BVI Welcome, CaribNet
Scopes: Names: English, Deutsch; Occurrences: Public, Confidential; Associations: Geography, Politics]
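Scopes can be read as filters: a name, occurrence or association is only visible in the scopes it is tagged with. The following is a minimal Python sketch for name scopes only. The tuple layout, the topic ids and the function `names_in_scope` are hypothetical; the names and the scope labels `English` and `Deutsch` are taken from the slides.

```python
# Each topic name is tagged with the scope in which it is valid.
NAMES = [
    ("bvi", "Brit. Virgin Islands", "English"),
    ("bvi", "Brit. Jungferninseln", "Deutsch"),
    ("caribbean", "Caribbean", "English"),
    ("caribbean", "Karibik", "Deutsch"),
    ("gb", "Great Britain", "English"),
    ("gb", "Großbritannien", "Deutsch"),
]


def names_in_scope(scope, names=NAMES):
    """Return {topic id: name} restricted to one scope."""
    return {topic: name for topic, name, s in names if s == scope}
```

Selecting the scope `English` yields the three English names, matching the English-only view of the scope example; occurrence and association scopes (Public, Confidential, Geography, Politics) could be filtered the same way.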

Page 52: In-/Semi-formal approaches: Topic Maps, Thesauri

Advantages
• Capture a lot of modeling experience
• Intuitive
• Interesting primitives that are not available in other approaches (TM)

Disadvantages
• No characterization independent of a particular implementation
• May be misinterpreted (TM) / few primitives (Thesauri)

Page 53: Common errors about ontology representation languages

AI people’s errors
• “it is good if it is formal”
• “it is good if someone with a logic background may easily use it”
• “it is good if the language allows everything”

Engineers’ errors
• “it works in my application, thus it is good”
• “who needs formality anyway?”
• “it did not work when I looked at it 10 years ago”

Page 54: Review/Introduction: (Classical) First-order [Predicate] Logic

Short: FO or PL1

Page 55: But first: Propositional Logic: Syntax

• propositions (no internal structure) can be assigned a truth-value:
– either true or false (classical 2-valued logic: tertium non datur)
• Logical symbols:
– conjunction: ∧, disjunction: ∨, negation: ¬,
– implication: →, equivalence: ↔, parentheses: ( )
• Non-logical symbols:
– propositional variables p, q, r, ...
– signature: set of propositional variables Σ = {p, q, r, ...}
• Formation rules for well-formed formulas (wff)
– an atomic formula (propositional variable) is a formula
– if F, G are formulas, so are: F ∧ G, F ∨ G, ¬F, F → G, F ↔ G, (F)

propositional logic <logic> (or "propositional calculus"): A system of symbolic logic using symbols to stand for whole propositions and logical connectives. Propositional logic only considers whether a proposition is true or false. In contrast to predicate logic, it does not consider the internal structure of propositions. http://wombat.doc.ic.ac.uk/foldoc/foldoc.cgi?propositional+logic

Propositional Logic: Semantics

• An interpretation I over a signature Σ is a mapping
– I: Σ → {true, false}, associating a truth value to every propositional variable

• Truth tables describe how to extend I from Σ to composite formulas (Boolean Algebra):
– F ∧ G, F ∨ G, ¬F, F → G, F ↔ G
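
The extension of an interpretation from propositional variables to composite formulas can be sketched in Python. The nested-tuple encoding of formulas and the function name `evaluate` are assumptions made for this illustration; they are not notation from the slides.

```python
# Formulas as nested tuples (an illustrative encoding, not from the slides):
#   "p"                                  propositional variable
#   ("not", F)                           negation
#   ("and", F, G), ("or", F, G),
#   ("implies", F, G), ("iff", F, G)     binary connectives

def evaluate(formula, interp):
    """Extend `interp` (dict: variable -> bool) to a composite formula."""
    if isinstance(formula, str):
        return interp[formula]
    op = formula[0]
    if op == "not":
        return not evaluate(formula[1], interp)
    left = evaluate(formula[1], interp)
    right = evaluate(formula[2], interp)
    if op == "and":
        return left and right
    if op == "or":
        return left or right
    if op == "implies":
        return (not left) or right
    if op == "iff":
        return left == right
    raise ValueError(f"unknown connective: {op!r}")


I = {"p": True, "q": False}
print(evaluate(("implies", "p", "q"), I))       # False: true premise, false conclusion
print(evaluate(("or", ("not", "p"), "q"), I))   # False: same truth table as p -> q
print(evaluate(("iff", ("not", "p"), "q"), I))  # True: both sides are false
```

Each branch of `evaluate` corresponds to one row pattern of the usual truth table for that connective.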

Boolean Algebra, Truth Tables

http://wombat.doc.ic.ac.uk/foldoc/foldoc.cgi?two-valued+logic

Syntax of First-Order Logic (FO)

• Logical symbols: ∧, ∨, ¬, →, ↔, ( ), ∀ (“for all”), ∃ (“exists”), ...

• Non-logical symbols: A FO signature Σ consists of
– constant symbols: a, b, c, ...
– function symbols: f, g, ...
– predicate (relation) symbols: p, q, r, ...
– function and predicate symbols have an associated arity; we can write, e.g., p/3, f/2 to denote the ternary predicate p and the function f with two arguments

• First-order variables: x, y, ...

• Formation rules for terms:
– constants and variables are terms
– if t_1,...,t_k are terms and f is a k-ary function symbol, then f(t_1,...,t_k) is a term

Syntax of First-Order Logic (FO)

• Formation rules for formulas:
– if t_1,...,t_k are terms and p/k is a predicate symbol (of arity k), then p(t_1,...,t_k) is an atomic formula (short: atom)
• all variable occurrences in p(t_1,...,t_k) are free

– if F, G are formulas and x is a variable, then the following are formulas:

– F ∧ G, F ∨ G, ¬F, F → G, F ↔ G, (F), ∀x: F (“for all x: F(x,...) is true”), ∃x: F (“there exists x such that F(x,...) is true”)

– the occurrences of a variable x within the scope of a quantifier are called bound occurrences.
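
The distinction between free and bound occurrences can be made concrete with a small recursive function. This is a minimal sketch: the tuple encoding (`("var", ...)`, `("fn", ...)`, `("atom", ...)`) and the names `term_vars` and `free_vars` are assumptions chosen for the example, and the predicates p and q are hypothetical.

```python
# Terms:    ("var", "x")  or  ("fn", name, arg_1, ..., arg_k)   (constants: k = 0)
# Formulas: ("atom", pred, t_1, ..., t_k), ("not", F),
#           ("and"|"or"|"implies"|"iff", F, G), ("forall"|"exists", "x", F)

def term_vars(t):
    """Set of variable names occurring in a term."""
    if t[0] == "var":
        return {t[1]}
    return set().union(*(term_vars(a) for a in t[2:]))

def free_vars(f):
    """Set of variables with at least one free occurrence in formula f."""
    op = f[0]
    if op == "atom":
        # every variable occurrence in an atom is free
        return set().union(*(term_vars(t) for t in f[2:]))
    if op == "not":
        return free_vars(f[1])
    if op in ("and", "or", "implies", "iff"):
        return free_vars(f[1]) | free_vars(f[2])
    if op in ("forall", "exists"):
        # the quantifier binds its variable inside its scope
        return free_vars(f[2]) - {f[1]}
    raise ValueError(f"unknown formula constructor: {op!r}")


x = ("var", "x")
body = ("implies", ("atom", "malePerson", x), ("atom", "person", x))
print(free_vars(body))                    # {'x'}
print(free_vars(("forall", "x", body)))   # set(): a sentence, no free variables

# x occurs free in p(x) and bound in "exists x: q(x)"
mixed = ("and", ("atom", "p", x), ("exists", "x", ("atom", "q", x)))
print(free_vars(mixed))                   # {'x'}
```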

Examples

∀x malePerson(x) → person(x).

malePerson(bill).

child(marriage(bill,hillary),chelsea).

Variable: x

Constants (0-ary function symbols): bill/0, hillary/0, chelsea/0

Function symbols: marriage/2

Predicate symbols: malePerson/1, person/1, child/2
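
The formation rules for terms and atoms can be checked mechanically against this signature. The sketch below assumes a tuple encoding (`("marriage", "bill", "hillary")` for marriage(bill,hillary)); the function names `is_term` and `is_atom` are illustrative and not part of the slides.

```python
# Signature taken from the example: symbol name -> arity.
FUNCTIONS = {"bill": 0, "hillary": 0, "chelsea": 0, "marriage": 2}
PREDICATES = {"malePerson": 1, "person": 1, "child": 2}
VARIABLES = {"x"}

def is_term(t):
    """Variables and constants are terms; f(t_1,...,t_k) is a term if f is k-ary."""
    if isinstance(t, str):
        return t in VARIABLES or FUNCTIONS.get(t) == 0
    if isinstance(t, tuple) and t and isinstance(t[0], str):
        name, args = t[0], t[1:]
        return FUNCTIONS.get(name) == len(args) and all(is_term(a) for a in args)
    return False

def is_atom(a):
    """p(t_1,...,t_k) is an atom if p is a k-ary predicate and each t_i is a term."""
    if not (isinstance(a, tuple) and a and isinstance(a[0], str)):
        return False
    name, args = a[0], a[1:]
    return PREDICATES.get(name) == len(args) and all(is_term(t) for t in args)


print(is_atom(("child", ("marriage", "bill", "hillary"), "chelsea")))  # True
print(is_atom(("child", "bill")))       # False: child has arity 2
print(is_term(("marriage", "bill")))    # False: marriage has arity 2
```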

Semantics of Predicate Logic

• Let D be a non-empty domain (a.k.a. domain of discourse, universe). A structure is a pair ℐ = (D, I), with an interpretation I that maps ...
– each constant c to an element I(c) ∈ D
– each predicate symbol p/k to a k-ary relation I(p) ⊆ D^k,
– each function symbol f/k to a k-ary function I(f): D^k → D

• Given a structure ℐ and a set of variables X, a valuation is a mapping val: X → D, used to evaluate terms and formulas over a given FO signature Σ
– with this: term evaluation val(t) yields a domain element, and formula evaluation val(F) yields a truth value

Example

Formula F = ∀x malePerson(x) → person(x).

Domain D = {b, h, c, d, e}

Let’s pick an interpretation I:
I(bill) = b, I(hillary) = h, I(chelsea) = c

I(person) = {b, h, c}

I(malePerson) = {b}

Under this I, the formula F evaluates to true.

• If we choose I’ like I but I’(malePerson) = {b,d}, then F evaluates to false

• Thus, I is a model of F, while I’ is not:
– I |= F     I’ |=/= F
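
Because D is finite, the claim can be checked by enumerating the domain. This is a minimal sketch; the function name `holds` and the representation of unary relations as Python sets are assumptions made for the illustration.

```python
D = {"b", "h", "c", "d", "e"}

def holds(male_person, person):
    """Evaluate  forall x: malePerson(x) -> person(x)  over the finite domain D."""
    return all((x not in male_person) or (x in person) for x in D)

# Interpretation I from the example
person_I = {"b", "h", "c"}
male_I = {"b"}

# Interpretation I': like I, except for malePerson
male_I_prime = {"b", "d"}

print(holds(male_I, person_I))        # True:  I is a model of F
print(holds(male_I_prime, person_I))  # False: d is a malePerson but not a person
```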

FO Semantics (cont’d)

• F entails G (G is a logical consequence of F) if every model of F is also a model of G: F |= G

• F is consistent or satisfiable if it has at least one model

• F is valid or a tautology if every interpretation of F is a model

Proof Theory:

Let F, G, ... be FO sentences (no free variables).

Then the following are equivalent:

1. F_1, ..., F_k |= G

2. F_1 ∧ ... ∧ F_k → G is valid

3. F_1 ∧ ... ∧ F_k ∧ ¬G is unsatisfiable (inconsistent)
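
The three equivalent conditions can be observed directly in the propositional case, where all interpretations can be enumerated. The sketch below is restricted to propositional logic on purpose: for full first-order logic no such exhaustive enumeration exists in general. The helper names and the modus ponens example (F_1 = p, F_2 = p → q, G = q) are assumptions chosen for illustration.

```python
from itertools import product

def interpretations(variables):
    """Yield every truth assignment over the given propositional variables."""
    for values in product([True, False], repeat=len(variables)):
        yield dict(zip(variables, values))

def entails(premises, conclusion, variables):
    """F_1, ..., F_k |= G: every model of all premises is a model of G."""
    return all(conclusion(i)
               for i in interpretations(variables)
               if all(p(i) for p in premises))

def is_valid(formula, variables):
    return all(formula(i) for i in interpretations(variables))

def is_satisfiable(formula, variables):
    return any(formula(i) for i in interpretations(variables))


V = ["p", "q"]
F1 = lambda i: i["p"]                    # p
F2 = lambda i: (not i["p"]) or i["q"]    # p -> q
G = lambda i: i["q"]                     # q

cond1 = entails([F1, F2], G, V)
cond2 = is_valid(lambda i: (not (F1(i) and F2(i))) or G(i), V)
cond3 = not is_satisfiable(lambda i: F1(i) and F2(i) and not G(i), V)
print(cond1, cond2, cond3)   # True True True
```

Condition 3 is the form that refutation-based procedures such as resolution and tableaux work with: to show entailment, add the negated goal and search for a contradiction.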

Proof Theory

• A calculus is a formal proof system to establish
– F_1, ..., F_k |= G

• via formal (syntactic) derivations
– F_1, ..., F_k |– ... |– G, where the “|–” denotes allowed proof steps

• Examples:
– Hilbert Calculus, Gentzen Calculus, Tableaux Calculus, Natural Deduction, Resolution, ...

• First-order logic is “semi-decidable”:
– the set of valid sentences is recursively enumerable, but not recursive (decidable)

• Some inference engines:
– http://www.semanticweb.org/inference.html

Description Logics: Decidable Fragments of FO

(aka terminological logics, member of concept languages)

Formalism for Ontologies: Description Logic

• DL definition of “Happy Father” (Example from Ian Horrocks, U Manchester, UK)

Description Logic Statements as Rules

• Another syntax: first-order logic in rule form (implicit quantifiers):

happyFather(X) ←
    man(X), child(X,C1), child(X,C2), blue(C1), green(C2),
    not ( child(X,C3), poorunhappyChild(C3) ).

poorunhappyChild(C) ←
    not rich(C), not happy(C).

• Note:
– the direction “→” is implicit here (*sigh*)
– see, e.g., Clark’s completion in Logic Programming

Description Logics

• Terminological Knowledge (TBox)
– Concept Definition (naming of concepts):
– Axiom (constraining of concepts):
=> a mediator's “glue knowledge source”

• Assertional Knowledge (ABox)
– the marked neuron in image 27
=> the concrete instances/individuals of the concepts/classes that your sources export

Querying vs. Reasoning

• Querying:
– given a DB instance I (= logic interpretation), evaluate a query expression φ (e.g. SQL, FO formula, Prolog program, ...)
– boolean query: check if I |= φ (i.e., if I is a model of φ)
– (ternary) query: { (X, Y, Z) | I |= φ(X,Y,Z) }
=> check happyFathers in a given database

• Reasoning:
– check if I |= φ implies I |= ψ for all databases I,
– i.e., if φ => ψ
– undecidable for FO, F-logic, etc.
– Description Logics are decidable fragments
  • concept subsumption, concept hierarchy, classification
  • semantic tableaux, resolution, specialized algorithms
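
The querying side (checking happyFathers in a given database) can be sketched as a direct evaluation of the rule form of the definition over one fixed instance. All facts below are hypothetical and invented for the illustration, and negation is read as negation-as-failure over the finite database, which is an assumption about the intended rule semantics.

```python
# Hypothetical database instance I (every fact here is made up for the example).
man = {"adam", "bob"}
child = {("adam", "c1"), ("adam", "c2"), ("bob", "c3"), ("bob", "c4")}
blue = {"c1", "c3"}
green = {"c2", "c4"}
rich = {"c1", "c3"}
happy = {"c2"}

def poor_unhappy_child(c):
    # poorunhappyChild(C) <- not rich(C), not happy(C).
    return c not in rich and c not in happy

def happy_father(x):
    # happyFather(X) <- man(X), child(X,C1), child(X,C2), blue(C1), green(C2),
    #                   not ( child(X,C3), poorunhappyChild(C3) ).
    kids = {c for (parent, c) in child if parent == x}
    return (x in man
            and any(c in blue for c in kids)
            and any(c in green for c in kids)
            and not any(poor_unhappy_child(c) for c in kids))

# Unary query: { X | I |= happyFather(X) }
answers = sorted(x for x in man if happy_father(x))
print(answers)   # ['adam']: bob's child c4 is neither rich nor happy
```

This answers a question about one database only. The reasoning task asks instead whether an implication between formulas holds in every database, which cannot be settled by evaluating a single instance.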

Formalizing Glue Knowledge: Domain Map for SYNAPSE and NCMIR

Domain Map = labeled graph with concepts ("classes") and roles ("associations")
• additional semantics: expressed as logic rules

Domain Map (DM)

Purkinje cells and Pyramidal cells have dendrites that have higher-order branches that contain spines. Dendritic spines are ion (calcium) regulating components. Spines have ion binding proteins. Neurotransmission involves ionic activity (release). Ion-binding proteins control ion activity (propagation) in a cell. Ion-regulating components of cells affect ionic activity (release).

Domain Expert Knowledge

DM in Description Logic

Source Contextualization & DM Refinement

In addition to registering (“hanging off”) data relative to existing concepts, a source may also refine the mediator’s domain map...

sources can register new concepts at the mediator ...