Logics for Data and Knowledge Representation The DERA methodology for the development of domain ontologies Feroz Farazi Originally by Fausto Giunchiglia and Biswanath Dutta Modified by Feroz Farazi

Upload: russell-thomas

Post on 26-Dec-2015


Logics for Data and Knowledge Representation

The DERA methodology for the development of domain ontologies

Feroz Farazi

Originally by Fausto Giunchiglia and Biswanath Dutta
Modified by Feroz Farazi


Knowledge Representation (KR)
• Abstraction of the world via models of a particular domain or problem, which allow automatic reasoning and interpretation
• Fundamental goal: to represent knowledge in a manner that facilitates inferring new knowledge (i.e., drawing conclusions) from already known facts, possibly encoded in a knowledge base


Knowledge Representation Properties
According to (Crawford & Kuipers, 1990), a knowledge representation system must have:
• a reasonably compact syntax
• a well-defined semantics, so that one can say precisely what is being represented
• sufficient expressive power to represent human knowledge
• an efficient, powerful and understandable reasoning mechanism
• support for building large knowledge bases


Knowledge Representation Issues
KR issues:
• How do people represent knowledge?
• What is the nature of knowledge?
• Do we have a domain-specific schema or a generic, domain-independent schema?
• How expressive does it need to be?


Ontology
• A “formal, explicit specification of a shared conceptualisation” [T. R. Gruber, 1993]
• Models a domain through a shared vocabulary that defines objects and/or concepts together with their properties and relations
• A structural framework for organizing information, used as a form of KR in fields such as AI, SW, library science, information architecture, etc.
• Can also be used as a language resource


Ontology Properties
Some of the ontological properties are:
• Extendable
• Reusable
• Flexible
• Robust
• …


Domain
• An area of knowledge or field of study that we are interested in or that we are communicating about
• Examples:
  Computer science, Artificial Intelligence, Soft computing, Social networks, …
  Library science, Mathematics, Physics, Chemistry, Agriculture, Geography, …
  Music, Movie, Sculpture, Painting, …
  Food, Wine, Cheese, …
  Space, …


Domain
• A domain can be decomposed into several constituents, each of which denotes a different aspect of its entities
• An example from the Space domain: by region, by body of water, by landform, by populated places, by administrative division, by land, by agricultural land, by facility, by altitude, by climate, …
• Each of these aspects is called a facet


Facet
• A hierarchy of homogeneous terms describing an aspect of the domain, where each term in the hierarchy denotes a different concept
• E.g., body of water (e.g., river, lake, pond, canal), landform (e.g., mountain, hill, ridge), facility (e.g., house, hut, farmhouse, hotel, resort), etc.
• Other examples: language facet (e.g., English, Hindi, Italian), property facet, author facet, religion facet (e.g., Christian, Hindu, Muslim), commodity facet, etc.


DERA
• A facet-based knowledge organization framework
• Independent of any specific domain
• Allows building domain-specific ontologies
• Maps to Description Logic (DL)
  logically sound
  decidable
• Developed by the UniTn KnowDive group


DERA Surface Structure
At the surface level, DERA has the following components:
• D – Domain
• E – Entity
• R – Relation
• A – Attribute

Domain (D)
A DERA domain is a tuple
D = <E, R, A>


Entity (E)
An elementary component that consists of entity classes and their instances, having either perceptual correlates or only conceptual existence in the domain in context. It can be represented as a pair
E = <C, E'>
where
• C = a set of entity classes or concepts representing the entities
• E' = a set of entities (also called objects, instances or individuals), possibly real-world named entities, that are the instantiations of C


Entity (E)
Entity classes (C):
• Represent the essence of the domain under consideration
• Consist of the core classes representing a domain in context
• E.g., in the context of the Space domain: Mountain, Hill, Lake, River, Canal, Province, City, Hotel, ...


Entity (E)
Entities (E'):
• The real-world named entities
• Representations of real-world entities
• E.g., The Himalaya, Monte Bondone, Lake Garda, Trento, Povo, Hotel America, ...


Entity (E)

[Figure: an example from the Space domain]


Relation (R)
An elementary component that consists of classes representing relations between entities
R = <{r}>
• {r} is a set of relations
• A relation r is a link between two entities (E')
• It builds a semantic relation between the entities
• E.g., some spatial relations from the Space domain: near, adjacent, inside, before, center, sideways, etc.
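As a rough illustration, a relation r can be stored as a labelled link between two entities of E'. The Python sketch below is one possible encoding, not part of DERA itself: the `Relation` class, the `related` helper and the two sample links are assumptions made for this example.

```python
from typing import NamedTuple


class Relation(NamedTuple):
    """A link r between two entities of E' (hypothetical representation)."""
    source: str   # name of the first entity
    name: str     # relation label, e.g. "inside" or "near"
    target: str   # name of the second entity


# Illustrative links only; the slides list relation names, not these facts.
R = {
    Relation("Povo", "inside", "Trento"),
    Relation("Monte Bondone", "near", "Trento"),
}


def related(relations, source, name):
    """Return, sorted, every entity that `source` is linked to via `name`."""
    return sorted(r.target for r in relations
                  if r.source == source and r.name == name)


print(related(R, "Povo", "inside"))
```

Keeping the relation name as plain data means new relation classes (e.g. functional relations) can be added without changing the structure.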


Attribute (A)
An elementary component that consists of classes expressing the characteristics of entities
A = <A', C>
where A' is a set of datatype attributes and C is a set of descriptive attributes
• An attribute is any property, or any qualitative, quantitative or descriptive measure, of an entity
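The tuples D = <E, R, A>, E = <C, E'> and A = <A', C> can be read as nested records. The following is a minimal Python sketch under that reading; the class and field names are hypothetical, and the class assigned to each sample instance is an assumption made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """E = <C, E'>: entity classes plus their instances."""
    classes: set = field(default_factory=set)       # C
    instances: dict = field(default_factory=dict)   # E': instance name -> class name


@dataclass
class Attribute:
    """A = <A', C>: datatype and descriptive attributes."""
    datatype: set = field(default_factory=set)      # A'
    descriptive: set = field(default_factory=set)   # C (descriptive attributes)


@dataclass
class Domain:
    """D = <E, R, A>."""
    name: str
    entity: Entity
    relations: set        # R = <{r}>, here just the relation names
    attribute: Attribute


space = Domain(
    name="Space",
    entity=Entity(
        classes={"Mountain", "Lake", "City"},
        # Which class each instance belongs to is assumed for this example.
        instances={"Monte Bondone": "Mountain",
                   "Lake Garda": "Lake",
                   "Trento": "City"},
    ),
    relations={"near", "adjacent", "inside"},
    attribute=Attribute(
        datatype={"latitude", "longitude", "altitude"},
        descriptive={"natural resource", "architectural style"},
    ),
)

# Every instance in E' should instantiate some class in C.
assert all(c in space.entity.classes for c in space.entity.instances.values())
```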


Attribute (A) (contd…)
Datatype attributes (A'):
• Include the attribute classes that account for the quality or quantity of an entity within a domain
• E.g.,
  latitude, longitude (of a place): 45° N, 18° S
  altitude (of a mountain): 8000 ft, 2400 m; high, low
  depth (of a lake): 100 ft, 20 m; deep, shallow


Attribute (A) (contd…)
Descriptive attributes (C):
• Include the attribute classes that describe the entities of the domain under consideration
• A value can consist of a single string (single-valued) or a set of strings (multivalued)
• E.g.,
  natural resource (of a place): coal, natural gas, oil, …
  architectural style (of a castle): {Classical architecture, Greek architecture, Roman architecture, Bauhaus, etc.}
  history (of a place): …
  climbing route (to a mountain): …
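Since a descriptive attribute value may be a single string or a set of strings, code that consumes such values may find it convenient to normalise both cases to a set. A small Python sketch of that idea follows; the `as_value_set` helper and the two sample records are hypothetical.

```python
def as_value_set(value):
    """Normalise a descriptive attribute value to a set of strings."""
    if isinstance(value, str):
        return {value}      # single-valued: wrap the one string
    return set(value)       # multivalued: already a collection of strings


# Hypothetical records; the values are examples only.
place = {"natural resource": {"coal", "natural gas", "oil"}}
castle = {"architectural style": "Roman architecture"}

print(as_value_set(castle["architectural style"]))
print(sorted(as_value_set(place["natural resource"])))
```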


Mapping from DERA to DL
• Entity classes (C) -> Concepts
• Relations (R) -> Roles
• Datatype attributes (A') -> Roles
• Descriptive attributes (C) -> Roles
• Entities (E') -> Individuals
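The mapping is a plain lookup from the kind of DERA component to a DL construct. The Python sketch below shows it as a table; the `DL` enum, the `to_dl` helper and the sample vocabulary are assumptions introduced for illustration, with terms taken from the Space domain examples.

```python
from enum import Enum


class DL(Enum):
    CONCEPT = "concept"
    ROLE = "role"
    INDIVIDUAL = "individual"


# DERA component kind -> DL construct, following the mapping on this slide.
DERA_TO_DL = {
    "entity class": DL.CONCEPT,
    "relation": DL.ROLE,
    "datatype attribute": DL.ROLE,
    "descriptive attribute": DL.ROLE,
    "entity": DL.INDIVIDUAL,
}


def to_dl(term, kind):
    """Return (term, DL construct) for a DERA term of the given kind."""
    return term, DERA_TO_DL[kind]


# Hypothetical vocabulary for the example.
vocabulary = [
    ("Lake", "entity class"),
    ("Lake Garda", "entity"),
    ("inside", "relation"),
    ("depth", "datatype attribute"),
]

for term, kind in vocabulary:
    name, construct = to_dl(term, kind)
    print(f"{name}: {construct.value}")
```

Note that relations and both kinds of attributes all land on roles, so the distinction between them lives in DERA rather than in the DL encoding.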


Methodology
Step 1: Identification of the atomic concepts
Step 2: Analysis (per genus et differentiam)
Step 3: Synthesis
Step 4: Standardization
Step 5: Ordering

Following the above steps leads to the creation of a set of facets, which together constitute a faceted representation scheme for a domain


Ontological Principles
• Relevance (e.g., breed is a more realistic characteristic than grade for classifying the universe of cows)
• Ascertainability (e.g., flowing body of water)
• Permanence (e.g., spring: a natural flow of ground water)
• Exhaustiveness (e.g., to classify the universe of people, we need both male and female)
• Exclusiveness (e.g., age and date of birth both produce the same divisions)
• Context (e.g., bank: the bank of a river, or a building of a financial institution)
  Important: helps in reducing homographs
• Currency (e.g., metro station vs. subway station)
• Reticence (e.g., minority author)
• Ordering
  Important: ordering carries semantics, as it provides implicit relations between the coordinate terms


Identification of the atomic concepts
Sources of the concepts:
• WordNet
• GeoNames
• TGN
• Literature


Identification of the atomic concepts
Some of the relevant sub-trees in WordNet are:
• location
• artifact, artefact
• body of water, water
• geological formation, formation
• land, ground, soil
• land, dry land, earth, ground, solid ground, terra firma

Note: not all the nodes in these sub-trees need to be part of the Space domain. For example, descendants of artifact such as article, anachronism and block are not.


Analysis

Mountain
• a well-defined elevated land
• formed by geological formation (where geological formation is a natural phenomenon)
• altitude in general >500m

Hill
• a well-defined elevated land
• formed by geological formation (where geological formation is a natural phenomenon)
• altitude in general <500m

Stream
• a body of water
• a flowing body of water
• no fixed boundary
• confined within a bed and stream banks

River
• a body of water
• a flowing body of water
• no fixed boundary
• confined within a bed and stream banks
• larger than a brook
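Analysis per genus et differentiam compares the characteristics of concepts: the shared ones point to a common genus, the remaining ones distinguish the concepts. A minimal Python sketch using the characteristics listed for Mountain, Hill, Stream and River follows; the set encoding and the `genus` and `differentia` helpers are assumptions made for illustration.

```python
# Characteristics recorded during analysis, transcribed from the slide.
CHARACTERISTICS = {
    "Mountain": {"well-defined elevated land",
                 "formed by geological formation",
                 "altitude in general >500m"},
    "Hill": {"well-defined elevated land",
             "formed by geological formation",
             "altitude in general <500m"},
    "Stream": {"body of water",
               "flowing body of water",
               "no fixed boundary",
               "confined within a bed and stream banks"},
    "River": {"body of water",
              "flowing body of water",
              "no fixed boundary",
              "confined within a bed and stream banks",
              "larger than a brook"},
}


def genus(a, b):
    """Characteristics shared by both concepts (what groups them together)."""
    return CHARACTERISTICS[a] & CHARACTERISTICS[b]


def differentia(a, b):
    """Characteristics of `a` that `b` lacks (what tells `a` apart from `b`)."""
    return CHARACTERISTICS[a] - CHARACTERISTICS[b]


print(differentia("Mountain", "Hill"))   # the altitude characteristic
print(differentia("River", "Stream"))    # the size characteristic
print(genus("Mountain", "River"))        # empty: no shared characteristics
```

An empty genus, as for Mountain and River, suggests the two concepts belong to different facets.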


Synthesis

Body of water
  Flowing body of water
    Stream
      Brook
      River
  Stagnant body of water
    Pond

Landform
  Natural depression
    Oceanic depression
      Oceanic valley
      Oceanic trough
    Continental depression
      Trough
      Valley
  Natural elevation
    Oceanic elevation
      Seamount
      Submarine hill
    Continental elevation
      Hill
      Mountain

* Each term above has a gloss and is linked to synonymous terms in the knowledge base
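The result of synthesis is a set of hierarchies, which can be held as nested mappings and traversed from a facet root down to any term. The Python sketch below encodes the two facets shown here; the nesting follows one reading of the flattened slide (for instance, Brook and River under Stream), and the `path_to` helper is hypothetical.

```python
# Each key is a term; each value holds its narrower terms.
FACETS = {
    "Body of water": {
        "Flowing body of water": {"Stream": {"Brook": {}, "River": {}}},
        "Stagnant body of water": {"Pond": {}},
    },
    "Landform": {
        "Natural depression": {
            "Oceanic depression": {"Oceanic valley": {}, "Oceanic trough": {}},
            "Continental depression": {"Trough": {}, "Valley": {}},
        },
        "Natural elevation": {
            "Oceanic elevation": {"Seamount": {}, "Submarine hill": {}},
            "Continental elevation": {"Hill": {}, "Mountain": {}},
        },
    },
}


def path_to(term, tree, prefix=()):
    """Return the path from a facet root down to `term`, or None if absent."""
    for node, children in tree.items():
        path = prefix + (node,)
        if node == term:
            return path
        found = path_to(term, children, path)
        if found:
            return found
    return None


print(" > ".join(path_to("River", FACETS)))
```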


Facets and sub-facets

Space [Domain]
  by geographical feature [Entity class]
    by water formation
    by land formation
    by land
    by administrative division
    …
  by relations [Relation]
    spatial relation: direction, internal, external, longitudinal, sideways, etc.
    functional relation (e.g., primary inflow, primary outflow)
    …
  by attribute
    [Datatype attribute]
      latitude
      longitude
      dimension
      …
    [Descriptive attribute]
      Natural resource
      Architectural style
      Time zone
      ph
      History
      …


Log-in: http://uk.disi.unitn.it/resources/html/UKDomain.html


References
F. Giunchiglia and B. Dutta. DERA: A Faceted Knowledge Organization Framework. Technical report, KnowDive, DISI, University of Trento, 2010.
B. Dutta, F. Giunchiglia and V. Maltese. A facet-based methodology for geo-spatial modelling. GEOS, 2011.
J. M. Crawford and B. Kuipers. ALL: Formalizing Access Limited Reasoning. In Principles of Semantic Networks: Explorations in the Representation of Knowledge, Morgan Kaufmann, 299-330, 1990.
S. R. Ranganathan. Prolegomena to Library Classification. Asia Publishing House, 1967.
T. R. Gruber. A translation approach to portable ontology specifications. Knowledge Acquisition, 5(2):199-220, 1993.