Post on 02-Jun-2020


Knowledge representation, A.Y. 2019/2020

KR in a nutshell

• The field of AI dedicated to representing information about the world in a form that a computer system can use to solve complex tasks, such as diagnosing a medical condition or having a dialog in a natural language.

More about KR

● KR is as old as human reflection on what humans do, and how to improve or share it

● KR involves mutual dependencies between human cognition, language/data, and the world

● KR makes patterns of representation (e.g. ontologies) and reasoning (logics) emerge

● In recent times, known as KRR: knowledge representation and reasoning

● K is more than information, because reasoning makes it possible to infer new information out of knowledge

K genes?

Protein FOXP2 at the origin of language evolution in humans between 300k and 50k years ago?

K neurons?

Imitation-recognition-anticipation mechanisms provided by mirror neurons that activate both when doing and perceiving actions?

K neurons?

Language is hard to “pack”, reasoning is left to cognitive means

Need for schemas

● Need for more sophisticated, task-oriented KR: judicial courts, money and commerce, political spin doctors

● Axial age (from about 500BC to 400AD)

● Emergence in different parts of the world

● Greek philosophers and sophists, Indian grammarians, Chinese political scientists

● Evolution in the High Middle Ages: Islamic post-Aristotelian scientists, etc.

K visualization

● Beyond (natural) language, what K?

● Trees, paths, maps, tables, frames, wheels, ...

Trees: Porphyry’s trees

● A table of the coordination and subordination of genera and species, built by starting from the most general genus and descending to the lowest species by successive dichotomies (for example, substance is divided into corporeal and incorporeal, the corporeal into animate and inanimate, the animate into sensitive and insensitive, etc.). The logical construction, translated into a figurative scheme, takes the shape of the bifurcations of a tree.

Porphyry’s trees at work

Generic trees

● Porphyry’s tree operates by distinction

● Each idea is distinguished in two opposite varieties, e.g. Animal: Rational/Irrational

● A generic tree operates by arbitrary branching criteria, e.g. in the Encyclopédie, L’art de penser branches from La Logique; La Morale branches from La Volonté

Decision trees

● A decision tree operates by decision criteria, and assumes a direction

● e.g. Color:Red; Size:Medium → Apple
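A decision tree of this kind can be sketched in a few lines of Python. This is only an illustration: the path Color:Red; Size:Medium → Apple comes from the slide, while the other branches and leaf labels are assumptions added to give the tree more than one path.

```python
# A decision tree as nested dicts: inner nodes test one attribute,
# leaves are plain strings (the decision).
# Only the Red/Medium -> Apple path is from the slide; the rest is made up.
TREE = {
    "attribute": "Color",
    "branches": {
        "Red": {
            "attribute": "Size",
            "branches": {"Medium": "Apple", "Small": "Cherry"},
        },
        "Yellow": "Banana",
    },
}


def decide(tree, item):
    """Walk from the root to a leaf, following the item's attribute values."""
    node = tree
    while isinstance(node, dict):
        value = item[node["attribute"]]
        node = node["branches"].get(value)  # None if no branch matches
    return node


print(decide(TREE, {"Color": "Red", "Size": "Medium"}))  # Apple
```

The traversal always goes from root to leaf, which is the "direction" the tree assumes.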

Syntactic trees

● A syntactic tree depends on a grammar that is used to parse a sentence

● S → NP → NN: Protein

Maps: semantic networks

● A semantic network, or frame network, is a knowledge base that represents semantic relations between concepts in a network

Maps: concept maps

● A concept map is a type of graphic organiser used to help one organise and represent knowledge of a subject.

● Concept maps begin with a main idea (or concept) and then branch out to show how that main idea can be broken down into specific topics.

Rationale

● Semantic networks and concept maps try to create a concept language from linguistic terms and sentences

● Typically, nouns are represented as nodes, verbs/prepositions as links

● e.g. (water) – changes – (state), but just “typically”
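As a rough sketch of the node/link idea, a semantic network can be stored as a list of labelled, directed links. Only the (water) – changes – (state) link is from the slide; the other two links and the function name `neighbours` are hypothetical.

```python
from collections import defaultdict

# Labelled, directed links: (source node, link label, target node).
# The first link is from the slide; the other two are illustrative.
LINKS = [
    ("water", "changes", "state"),
    ("ice", "is a state of", "water"),
    ("steam", "is a state of", "water"),
]

# Index the links by their source node for quick lookup.
outgoing = defaultdict(list)
for source, label, target in LINKS:
    outgoing[source].append((label, target))


def neighbours(node):
    """Return the (label, target) pairs leaving a node."""
    return outgoing.get(node, [])


print(neighbours("water"))  # [('changes', 'state')]
```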

Maps: E-R diagrams

● An E-R diagram describes interrelated things of interest in a specific domain of knowledge.

● A basic ER model is composed of entity types (which classify the things of interest) and specifies relationships that can exist between entities (instances of those entity types).

Maps: activity diagrams

● Activity Diagrams describe how activities are coordinated to provide a service which can be at different levels of abstraction.

● Typically, an event needs to be achieved by some operations, particularly where the operation is intended to achieve a number of different things that require coordination.

Rationale

● ER and activity diagrams are data modelling patterns, which impose some restrictions on the freedom of concept maps

● e.g. (vehicle) –[1] fitted [n]– (options)

● (some) logical background! (cf. next classes)

Ontology

● Historically, ontology, listed as part of metaphysics, is the philosophical study of the nature of being, becoming, existence, or reality, as well as the basic categories of being and their relations.

● Ontology deals with questions concerning what entities exist or can be said to exist, and how such entities can be grouped, related within a hierarchy, and subdivided according to similarities and differences.

Ontology in AI

● While the term ontology has been rather confined to the philosophical sphere in the recent past, it is now gaining a specific role in a variety of fields of Computer Science, such as Artificial Intelligence, Computational Linguistics, Database Theory, and the Semantic Web.

● In Computer Science the term loses part of its metaphysical background. While keeping a general expectation that the features of the model in an ontology should closely resemble the real world, it refers to a formal model consisting of a set of types, properties, and relationship types aimed at modelling objects in a certain domain or in the world.

Philosophy Vs. Computer Science

● Philosophy: Ontology is the philosophical study of being. More broadly, it studies concepts that directly relate to being, in particular becoming, existence, reality, as well as the basic categories of being and their relations.

● Computer Science: a formal schema that provides classes (types of entities) and properties (relations among entities) and makes it possible to model the knowledge within a certain domain of interest

Ontology: some definitions

● Tom Gruber. A translation approach to portable ontology specifications. “an ontology is a formal, explicit specification of a shared conceptualisation. An ontology is a description (like a formal specification of a program) of the concepts and relationships that can formally exist for an agent or a community of agents”

● Nicola Guarino. Formal Ontology and Information Systems. “[a conceptualization] contains many “world structures”, one for each world. It has both extensional and intensional components”

Example of ontology

T-Box (i.e. terminological component)

A-Box (i.e. assertional component)

• Share common understanding over a domain

• Enable reuse of domain knowledge

Representing knowledge on the Web

The Web as a graph

● To a computer, then, the web is a flat, boring world devoid of meaning

The traditional web

<p> … 600 content standards, the BioSharing registry (<a href="https://biosharing.org/"> https://biosharing.org/ </a>) can be of use as it describes the standards in detail, including versions where applicable. </p>

https://www.nature.com/articles/sdata201618

<https://www.nature.com/articles/sdata201618> – @href – <https://biosharing.org/>
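What a machine actually sees in such a page can be sketched with Python's standard `html.parser` module. The HTML string below is a shortened version of the paragraph above; the class name `LinkCollector` and the `"href"` edge label are illustrative choices, not part of the original slides.

```python
from html.parser import HTMLParser

PAGE = "https://www.nature.com/articles/sdata201618"
HTML = (
    '<p>the BioSharing registry (<a href="https://biosharing.org/">'
    "https://biosharing.org/</a>) can be of use</p>"
)


class LinkCollector(HTMLParser):
    """Collect the target of every <a href="..."> in a page."""

    def __init__(self):
        super().__init__()
        self.targets = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.targets.append(value)


collector = LinkCollector()
collector.feed(HTML)
# All a machine gets is "PAGE links to target": the edge itself says
# nothing about why the page points there.
edges = [(PAGE, "href", target) for target in collector.targets]
print(edges)
```

Every edge carries the same generic label, which is what makes the traditional web graph "flat" from the machine's point of view.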

Links are meaningful

● This is a pity, as in fact documents on the web describe real objects and imaginary concepts, and give particular relationships between them.

The semantic web

● Adding semantics to the web involves two things: allowing documents which have information in machine-readable forms, and allowing links to be created with relationship values.

Resource Description Framework (RDF)

RDF

● RDF is a data model (sometimes it is improperly called a “language”)

● It is based on subject-predicate-object triples, called statements

● “Umberto Eco is author of The name of the rose” can be expressed through an RDF statement assigning to
● “Umberto Eco” the role of subject
● “is author of” the role of predicate
● “The name of the rose” the role of object

RDF key concepts

● Resource → an object we want to talk about, identified by an IRI, e.g. <https://www.nature.com/articles/sdata201618>

● Property → a special type of resource, used to describe relations between resources. It is identified by an IRI, e.g. <http://purl.org/spar/cito/citesAsPotentialSolution>

● Statements → they assert properties between resources. Each statement is a subject-predicate-object triple, where the subject is a resource, the predicate is a property and the object is either a resource or a literal (e.g. a string)

● RDF Graph → a set of RDF statements
● A file that contains RDF statements represents an RDF graph
● The same IRI appearing in different graphs refers to the same resource

● Triplestore → a database built for storing and retrieving RDF statements (it can contain one or more RDF graphs)
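These concepts can be mimicked with plain Python sets, as a minimal sketch rather than a real RDF library. The two resource IRIs and the CiTO property IRI appear in the slides; connecting them in one statement, the `dcterms:title` statement, its literal value, and the `Literal` class are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Literal:
    """A literal value, kept distinct from IRIs (which are plain strings)."""
    value: str


ARTICLE = "https://www.nature.com/articles/sdata201618"
CITES = "http://purl.org/spar/cito/citesAsPotentialSolution"
TITLE = "http://purl.org/dc/terms/title"

# Two tiny RDF graphs, each a set of (subject, predicate, object) triples.
graph_a = {(ARTICLE, CITES, "https://biosharing.org/")}
graph_b = {(ARTICLE, TITLE, Literal("An example title"))}  # placeholder title

# Merging graphs is set union: both statements now describe one resource,
# because the subject IRI is the same in both graphs.
merged = graph_a | graph_b
about_article = {t for t in merged if t[0] == ARTICLE}
print(len(about_article))  # 2
```

A triplestore plays the role of `merged` here, with indexing and a query language on top.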

RDF graphical representation

● One of the possible representations of RDF statements, which is very intuitive for humans, is that of semantic networks. A semantic network is, in effect, a graph where the subjects and objects of RDF statements are represented by nodes, while the directed edge linking them represents the predicate

● This graphical notation can be used to declare semantic relations explicitly

RDF serialised as Turtle

● Turtle is a particular syntax to express RDF statements in a simple and intuitive way

● Turtle is based on the three components of a statement, followed by a “.”

● For instance, the example introduced in the previous slide can be expressed as follows:
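A minimal sketch of the "three components followed by a dot" rule, written as a small Python helper. The function `to_turtle` and the `example.org` IRIs are hypothetical, and the statement used (Umberto Eco as author of The name of the rose) is assumed to be the running example; string escaping inside literals is not handled.

```python
def to_turtle(subject, predicate, obj, is_literal=False):
    """Render one statement in Turtle: three components followed by ' .'"""
    rendered_obj = f'"{obj}"' if is_literal else f"<{obj}>"
    return f"<{subject}> <{predicate}> {rendered_obj} ."


# Hypothetical IRIs under example.org, used only for illustration.
line = to_turtle(
    "http://example.org/umberto-eco",
    "http://example.org/isAuthorOf",
    "http://example.org/the-name-of-the-rose",
)
print(line)
```

Full IRIs in angle brackets are always valid in Turtle; prefixed names such as `myont:Donald_Duck` are a shorthand that requires a matching `@prefix` declaration.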

Resource Description Framework (Schema)

● RDF Schema (Resource Description Framework Schema, variously abbreviated as RDFS, RDF(S), RDF-S, or RDF/S) is a set of classes with certain properties using the RDF extensible knowledge representation data model, providing basic elements for the description of ontologies, otherwise called RDF vocabularies, intended to structure RDF resources.

RDF(s) Basic patterns in First-Order Logic (FOL)

● Class membership
● ⊢ Donald_Duck ∈ Duck() ⇄ myont:Donald_Duck rdf:type myont:Duck

● Assertions
● ⊢ nephewOf(Donald_Duck, Uncle_Scrooge) ⇄ myont:Donald_Duck myont:nephewOf myont:Uncle_Scrooge
● ⊢ label(Donald_Duck, Paperino) ⇄ myont:Donald_Duck rdfs:label “Paperino”

● Taxonomies
● ⊢ Old_duck ⊑ Duck ⇄ myont:Old_duck rdfs:subClassOf myont:Duck

RDF(s) Basic patterns in First-Order Logic (FOL)

● Taxonomical relations for predicates
● ⊢ nephewOf ⊑ relative ⇄ myont:nephewOf rdfs:subPropertyOf myont:relative

● Interpretation is similar to predicate logic, but:
● rdfs:subClassOf and rdfs:subPropertyOf are reflexive (not like implication)
● IRIs (e.g. myont:Donald_Duck) and literals (e.g. “Paperino”) are disjoint kinds of RDF terms

● These patterns are part of RDF 1.1 semantics
● RDF 1.1 semantics is a W3C recommendation (http://www.w3.org/TR/rdf11-mt/)
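How the class-membership and taxonomy patterns combine can be sketched with a naive forward-chaining loop. This is not a full RDFS entailment engine: it applies only two rules (transitivity of rdfs:subClassOf and inheritance of types), and the fact that Uncle_Scrooge is an Old_duck is an assumption added so that there is something to infer.

```python
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"

# Triples from the slides, plus one assumed fact (Uncle_Scrooge is an
# Old_duck) so that the inheritance rule has something to derive.
triples = {
    ("myont:Donald_Duck", RDF_TYPE, "myont:Duck"),
    ("myont:Old_duck", SUBCLASS, "myont:Duck"),
    ("myont:Uncle_Scrooge", RDF_TYPE, "myont:Old_duck"),
}


def rdfs_closure(graph):
    """Apply two RDFS rules until nothing new is derived:
    subClassOf is transitive, and instances inherit superclasses."""
    inferred = set(graph)
    while True:
        new = set()
        for s, p, o in inferred:
            for s2, p2, o2 in inferred:
                if o != s2 or p2 != SUBCLASS:
                    continue
                if p == SUBCLASS:      # A ⊑ B, B ⊑ C  =>  A ⊑ C
                    new.add((s, SUBCLASS, o2))
                elif p == RDF_TYPE:    # x ∈ A, A ⊑ B  =>  x ∈ B
                    new.add((s, RDF_TYPE, o2))
        if new <= inferred:
            return inferred
        inferred |= new


closed = rdfs_closure(triples)
print(("myont:Uncle_Scrooge", RDF_TYPE, "myont:Duck") in closed)  # True
```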

Liberality of RDF

● You can declare any fact in RDF, even “unusual”, “counterintuitive”, “abnormal”, or “invalid” ones
● :Wile rdf:type :blonde , :brunette
● :ACME :fires :ACME
● :ACME :fires :mammal
● :ACME rdf:type owl:Class

● Liberality is great, but has costs: no way to make a machine detect invalid or unintended facts, or undesired inferences

OWL constraints

● In order to limit (and to guide) the design of ontologies, OWL (Web Ontology Language) restricts the expressivity of RDF

● Not every triple is allowed any more, only those that respect the constraints (patterns) of OWL formal semantics

● Undesired facts will then be detected, if the design of the ontology reflects the conceptualisation of the users

● That’s why we need good design

OWL example

Formal semantics

● ACME is a service and services are facilities. Service is a class. Facilities are social structures
● ⊢ :ACME ∈ :Service
● ⊢ :Service ⊑ :Facility
● ⊢ :Facility ⊑ :SocialStructure

● Wile is a person, persons are not services
● ⊢ :Wile ∈ :Person
● ⊢ :Person ⊓ :Service ≡ ∅

OWL core patterns

● Inheritance
● ⊢ :Service ⊑ :SocialStructure

● Consistency check
● ⊬ :ACME ∈ owl:Class
● ⊬ :ACME ∈ :Person
● ⊬ :Wile ∈ :Service

● Coherence
● ⊬ :Perservice ⊑ :Person ⊓ :Service

● Domain, ranges, inverses, subproperties
● ⊢ :subscribesTo ⊑ :Person × :Service
● ⊢ :subscribesTo ⊑ :signsContractWith
● ⊢ :isSubscribedBy ≡ :subscribesTo⁻

OWL core patterns (contd)

● Materialisation
● ⊢ :subscribesTo(:Wile, :ACME) → :isSubscribedBy(:ACME, :Wile)
● ⊢ :subscribesTo(:Wile, :ACME) → :signsContractWith(:Wile, :ACME)

● Classification
● ⊢ :subscribesTo(:Beep, :ACME) → :Beep ∈ :Person
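Materialisation and classification can be sketched as simple rule applications over a set of triples. The axioms below are the ones listed in the slides (inverse, subproperty, domain and range of :subscribesTo); encoding them as Python dictionaries and applying them in a single pass is a simplification, not how an OWL reasoner works.

```python
# Axioms from the slides, written as lookup tables.
INVERSE = {":subscribesTo": ":isSubscribedBy"}
SUPER_PROPERTY = {":subscribesTo": ":signsContractWith"}
DOMAIN = {":subscribesTo": ":Person"}
RANGE = {":subscribesTo": ":Service"}


def materialise(facts):
    """One pass of inverse, subproperty, domain and range rules."""
    derived = set(facts)
    for s, p, o in facts:
        if p in INVERSE:
            derived.add((o, INVERSE[p], s))
        if p in SUPER_PROPERTY:
            derived.add((s, SUPER_PROPERTY[p], o))
        if p in DOMAIN:  # classification of the subject
            derived.add((s, "rdf:type", DOMAIN[p]))
        if p in RANGE:  # classification of the object
            derived.add((o, "rdf:type", RANGE[p]))
    return derived


facts = {(":Beep", ":subscribesTo", ":ACME")}
result = materialise(facts)
print((":Beep", "rdf:type", ":Person") in result)  # True
```

Nothing was ever asserted about :Beep being a person; the type is inferred from the domain of the property.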

Open World Assumption

● Since the Web is an open world, a statement that is not explicitly ruled out by our axioms cannot be excluded. To obtain an “integrity check” we therefore have to add a new axiom, e.g. a disjointness axiom: ⊢ :ACME :subscribesTo :ACME → :ACME ∈ :Person

● i.e. this generates an inconsistency, since Person and Service are disjoint classes

● Note that domains and ranges are not constraints, therefore we cannot use them as integrity checks

● Integrity checks that go beyond consistency checking need extra expressivity
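The interplay between domain inference and disjointness can be sketched as follows. This is a toy check under stated assumptions: the function `inconsistencies` and its data layout are hypothetical, and only the domain of :subscribesTo and the Person/Service disjointness from the slides are encoded.

```python
DOMAIN = {":subscribesTo": ":Person"}
DISJOINT = {frozenset({":Person", ":Service"})}


def inconsistencies(type_facts, property_facts, disjoint=DISJOINT):
    """Infer types from domains, then report individuals that end up
    in two classes declared disjoint."""
    types = {}
    for individual, cls in type_facts:
        types.setdefault(individual, set()).add(cls)
    for s, p, _o in property_facts:
        if p in DOMAIN:  # the domain axiom infers, it does not reject
            types.setdefault(s, set()).add(DOMAIN[p])
    return {
        individual
        for individual, classes in types.items()
        for pair in disjoint
        if pair <= classes
    }


type_facts = {(":ACME", ":Service")}
property_facts = {(":ACME", ":subscribesTo", ":ACME")}
print(inconsistencies(type_facts, property_facts))  # {':ACME'}
```

Calling the function with `disjoint=set()` returns an empty set: without the disjointness axiom, :ACME is silently classified as both a Service and a Person and nothing is flagged.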

Boolean class patterns

● Let’s use General Class Inclusion:

<ClassConstructor> ⊑ <Class ⊓ ClassConstructor>

● Mammals that are also aquatic organisms
● Aquatic mammals, fishes or crustaceans
● Non-fishes
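Under the usual set-based reading of class expressions, the three Boolean patterns correspond to intersection (⊓), union (⊔) and complement (¬). The sketch below uses Python sets; the individuals and class memberships are invented for illustration.

```python
# A toy interpretation: each class name denotes a set of individuals.
# All individuals and memberships here are made up for illustration.
domain = {"dolphin", "cow", "tuna", "lobster", "seaweed"}
Mammal = {"dolphin", "cow"}
AquaticOrganism = {"dolphin", "tuna", "lobster", "seaweed"}
Fish = {"tuna"}
Crustacean = {"lobster"}

# Mammals that are also aquatic organisms: Mammal ⊓ AquaticOrganism
aquatic_mammals = Mammal & AquaticOrganism
# Aquatic mammals, fishes or crustaceans: a union (⊔) of three classes
swimmers = aquatic_mammals | Fish | Crustacean
# Non-fishes: the complement (¬Fish) relative to the whole domain
non_fishes = domain - Fish

print(aquatic_mammals)  # {'dolphin'}
```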

Relational class constructors (GCI patterns)

● Things that only live in a marine habitat: ⊢ ∀ livesIn.MarineHabitat ⊑ Thing

● Things that live in at least one marine habitat: ⊢ ∃ livesIn.MarineHabitat ⊑ Thing

● Things that live in the Indian Ocean: ⊢ ∀ livesIn.{IndianOcean} ⊑ Thing

● Things that live in either the Indian or Pacific Ocean: ⊢ ∀ livesIn.{IndianOcean, PacificOcean} ⊑ Thing
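The difference between the universal (∀) and existential (∃) constructors can be sketched over a toy model. The individuals, places and helper names (`places`, `some`, `only`) are all hypothetical; the point is only to show which individuals each constructor selects.

```python
# livesIn as a set of (individual, place) pairs; all data is made up.
lives_in = {
    ("clownfish", "IndianOcean"),
    ("clownfish", "PacificOcean"),
    ("seal", "PacificOcean"),
    ("seal", "Beach"),
    ("camel", "Desert"),
}
MarineHabitat = {"IndianOcean", "PacificOcean"}
individuals = {"clownfish", "seal", "camel", "rock"}


def places(x):
    """All places that x is related to through livesIn."""
    return {place for who, place in lives_in if who == x}


def some(fillers):  # ∃ livesIn.C: at least one place is in C
    return {x for x in individuals if places(x) & fillers}


def only(fillers):  # ∀ livesIn.C: every place (if any) is in C
    return {x for x in individuals if places(x) <= fillers}


print(sorted(some(MarineHabitat)))  # ['clownfish', 'seal']
print(sorted(only(MarineHabitat)))  # ['clownfish', 'rock']
```

Two things are worth noticing in this toy model: the universal constructor also selects `rock`, which lives nowhere (the condition holds vacuously), and `only({"IndianOcean"})` does not select `clownfish`, because it also lives in the Pacific Ocean.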

Questions
