How to Build an Ontology

Barry Smith, http://ontology.buffalo.edu/smith

Post on 18-Jan-2016

Page 1: How to Build an Ontology

1

How to Build an Ontology

Barry Smith

http://ontology.buffalo.edu/smith

Page 2: How to Build an Ontology

2

Mission of the NCBO

To create software and support services for science-based ontology development and use in the biomedical domain

Science-based = ontologies for support of scientific research (taken as encompassing evidence-based medicine)

Science-based = using the scientific method as part of the process of ontology development and testing

Page 3: How to Build an Ontology

5

Scientific ontologies have special features

Every term in a scientific ontology must be such that the developers of the ontology believe it to refer to some entity on the basis of the best current evidence

Page 4: How to Build an Ontology

6

For scientific ontologies

reusability is crucial

compatibility with neighboring scientific ontologies is crucial

it should not be too easy to add new terms to an ontology

we want to introduce these features in clinical medicine ...

Page 5: How to Build an Ontology

10

An Ontological Square

Upper-level integrating ontologies

Domain ontologies

Page 6: How to Build an Ontology

11

An Ontological Square

Upper-level integrating ontologies

Domain ontologies

Ontologies in support of science

Administrative ontologies

Page 7: How to Build an Ontology

12

An Ontological Square

Upper-level integrating ontologies

Domain ontologies

Ontologies in support of science

BFO (Basic Formal Ontology)

DOLCE

SNOMED

SwissProt

FMA

Administrative ontologies (for e-commerce, etc.)

FOAF top level:

person, topic, document, primary topic ...

Amazon.com ontology

Library of Congress Catalog

Page 8: How to Build an Ontology

13

Problem of ensuring sensible cooperation in a massively interdisciplinary community

concept, type, instance, model, representation, data

Page 9: How to Build an Ontology

14

RetailPrice hasA Denomination InstanceOf Dollar (p. 101)

SI-Unit instanceof System-of-Units (p. 40)

from Handbook of Ontology (Semantic Web approach)

Page 10: How to Build an Ontology

15

from: Ontological Engineering (Semantic Web approach)

location =def. a spatial point identified by a name (p. 12)

arrivalPlace =def. a journey ends at a location (p. 13)

facet =def. ternary relation that holds between a frame, a slot, and the facet (p. 51)

Page 11: How to Build an Ontology

16

Entity =def

anything which exists, including things and processes, functions and qualities, beliefs and actions, documents and software (Levels 1, 2 and 3)

Page 12: How to Build an Ontology

17

First basic distinction

universal vs. instance

(science text vs. diary)

(man vs. Maximilian)

Page 13: How to Build an Ontology

18

Instances databases

For scientific ontologies

it is generalizations that are important = universals, types, kinds, species

Page 14: How to Build an Ontology

19

A 515287 DC3300 Dust Collector Fan

B 521683 Gilmer Belt

C 521682 Motor Drive Belt

Catalog vs. inventory

Page 15: How to Build an Ontology

20

Catalog vs. inventory

Page 16: How to Build an Ontology

Catalog of Universals/Types

Page 17: How to Build an Ontology

22

Ontology Universals Instances

Page 18: How to Build an Ontology

23

Ontology = A Representation of Universals

Page 19: How to Build an Ontology

24

Ontology = A representation of universals

Each node of an ontology consists of:

• preferred term (aka term)

• term identifier (TUI, aka CUI)

• synonyms

• definition, glosses, comments
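The parts of a node listed above can be pictured as a small record type. This is only a sketch: the class name `OntologyNode`, the field names, the identifier format and the example synonym are assumptions made for illustration, not part of any particular ontology format.

```python
from dataclasses import dataclass, field


@dataclass
class OntologyNode:
    """One node of an ontology, holding the parts listed on the slide."""
    preferred_term: str                                  # preferred term
    identifier: str                                      # term identifier
    synonyms: list[str] = field(default_factory=list)    # synonyms
    definition: str = ""                                 # definition
    comments: list[str] = field(default_factory=list)    # glosses, comments


# Hypothetical node: the ID scheme "EX:0000001" and the synonym are invented.
lung = OntologyNode(
    preferred_term="lung",
    identifier="EX:0000001",
    synonyms=["pulmo"],
    comments=["lung is_a anatomical structure"],
)
print(lung.preferred_term, lung.identifier)
```

Keeping the identifier separate from the preferred term means the term can later be reworded without breaking references to the node.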

Page 20: How to Build an Ontology

25

An ontology is a representation of universals

We learn about universals in reality from looking at the results of scientific experiments in the form of scientific theories – which describe not what is particular in reality but what is general

Page 21: How to Build an Ontology

[Figure: hierarchy of universals (substance, organism, animal, mammal, cat, siamese, frog) with a leaf class marked, drawn above their instances]

Page 22: How to Build an Ontology

27

Domain =def

a portion of reality that forms the subject-matter of a single science or technology or mode of study or administrative practice ...;

proteomics

HIV

epidemiology

Page 23: How to Build an Ontology

28

Representation =def

an image, idea, map, picture, name or description ... of some entity or entities.

Page 24: How to Build an Ontology

29

Ontologies are representational artifacts

comparable to science texts

Page 25: How to Build an Ontology

33

[Figure: The Periodic Table]

Page 26: How to Build an Ontology

34

Ontologies are here

Page 27: How to Build an Ontology

35

or here

Page 28: How to Build an Ontology

36

What do ontologies represent?

Page 29: How to Build an Ontology

37

Ontologies do not represent concepts in people’s heads

Page 30: How to Build an Ontology

38

They represent universals in reality

Page 31: How to Build an Ontology

39

“leg” is not the name of a concept

concepts do not stand in part_of, connectedness, causes, treats ... relations to each other

Page 32: How to Build an Ontology

A 515287 DC3300 Dust Collector Fan

B 521683 Gilmer Belt

C 521682 Motor Drive Belt

instances

universals

Page 33: How to Build an Ontology

41

Inventory vs. Catalog: Two kinds of composite representational artifacts

Databases represent instances

Ontologies represent universals
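The catalog/inventory contrast can be sketched in a few lines. The part names come from the parts list shown in the slides; treating the six-digit numbers as catalog keys is an assumption, and the class names and the inventory serial numbers are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Universal:
    """A catalog entry: a type of thing, e.g. 'Gilmer Belt'."""
    name: str


@dataclass(frozen=True)
class Instance:
    """An inventory entry: one particular item that instantiates a universal."""
    serial: str
    instance_of: Universal


# Catalog (the ontology side): types only.
catalog = {
    "515287": Universal("DC3300 Dust Collector Fan"),
    "521683": Universal("Gilmer Belt"),
    "521682": Universal("Motor Drive Belt"),
}

# Inventory (the database side): particular items; serials are made up.
inventory = [
    Instance("belt-0001", catalog["521683"]),
    Instance("belt-0002", catalog["521683"]),
    Instance("fan-0001", catalog["515287"]),
]


def count_by_universal(items):
    """Count how many inventory items instantiate each universal."""
    counts = {}
    for item in items:
        name = item.instance_of.name
        counts[name] = counts.get(name, 0) + 1
    return counts


print(count_by_universal(inventory))
```

The catalog stays the same when items are bought or sold; only the inventory changes.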

Page 34: How to Build an Ontology

42

How do we know which general terms designate universals?

Roughly: terms used by scientists to designate entities about which we have a plurality of different kinds of testable proposition

(cell, electron ...)

Page 35: How to Build an Ontology

43

Problem: fiat demarcations

male over 30 years of age with family history of diabetes

abnormal curvature of spine

participant in trial #2030

Page 36: How to Build an Ontology

44

Problem: roles

fist

patient

FDA-approved drug

Page 37: How to Build an Ontology

45

Administrative ontologies often need to go beyond universals

Fall on stairs or ladders in water transport injuring occupant of small boat, unpowered

Railway accident involving collision with rolling stock and injuring pedal cyclist

Nontraffic accident involving motor-driven snow vehicle injuring pedestrian

Page 38: How to Build an Ontology

46

universals vs. classes

universals

{a,b,c,...} classes

Page 39: How to Build an Ontology

47

Class =def

a maximal collection of particulars determined by a general term (‘cell’, ‘electron’, ‘restaurant in Palo Alto’, ‘Italian’)

the class A = the collection of all particulars x for which ‘x is A’ is true

Page 40: How to Build an Ontology

48

Problem

The same general term can be used to refer both to universals and to collections of particulars. Consider:

HIV is an infectious retrovirus

HIV is spreading very rapidly through Asia

Page 41: How to Build an Ontology

49

universals vs. classes

universals

{c,d,e,...} classes

Page 42: How to Build an Ontology

50

Extension =def

The extension of a universal A is the class: instance of the universal A

(it is the class of A’s instances)

(the class of all entities to which the term ‘A’ applies)
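A minimal sketch of the extension of a universal as a set of particulars. It assumes, for simplicity, that each particular is recorded against a single universal; the function name and the particulars are invented for illustration.

```python
def extension(universal, instance_of):
    """Return the class of all particulars that instantiate `universal`.

    `instance_of` maps each particular to the universal it instantiates.
    """
    return {p for p, u in instance_of.items() if u == universal}


# Hypothetical particulars.
instance_of = {
    "cell_1": "cell",
    "cell_2": "cell",
    "electron_1": "electron",
}

print(sorted(extension("cell", instance_of)))
```

The universal is the string `"cell"`; its extension is the set. The two are different kinds of thing, which is the point of the slide.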

Page 43: How to Build an Ontology

51

universals vs. classes

universals

defined classes

Page 44: How to Build an Ontology

52

universals vs. classes

universals

populations, ...

Page 45: How to Build an Ontology

53

Defined class =def

a class defined by a general term which does not designate a universal

the class of all diabetic patients in Leipzig on 4 June 1952
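One way to picture a defined class is as the set picked out by an arbitrary predicate over particulars, with no requirement that the predicate correspond to a universal. The record type and the data below are invented for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class PatientRecord:
    """A hypothetical record of one person on one day."""
    name: str
    city: str
    day: date
    diabetic: bool


def defined_class(records, predicate):
    """Collect the particulars picked out by a general term (a predicate)."""
    return {r.name for r in records if predicate(r)}


# Invented records.
records = [
    PatientRecord("p1", "Leipzig", date(1952, 6, 4), True),
    PatientRecord("p2", "Leipzig", date(1952, 6, 4), False),
    PatientRecord("p3", "Dresden", date(1952, 6, 4), True),
]

diabetic_in_leipzig = defined_class(
    records,
    lambda r: r.diabetic and r.city == "Leipzig" and r.day == date(1952, 6, 4),
)
print(diabetic_in_leipzig)
```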

Page 46: How to Build an Ontology

54

OWL is a good representation of defined classes

• sibling of Finnish spy

• member of Abba aged > 50 years

Page 47: How to Build an Ontology

55

Terminology =def.

a representational artifact whose representational units are natural language terms (with IDs, synonyms, comments, etc.) which are intended to designate universals together with defined classes.

Page 48: How to Build an Ontology

56

universals, classes, concepts

universals

defined classes

‘concepts’

Page 49: How to Build an Ontology

57

universals < defined classes < ‘concepts’

‘concepts’ which do not correspond to defined classes:

‘Surgical or other procedure not carried out because of patient's decision’

‘Absent nipple’

Page 50: How to Build an Ontology

58

(Scientific) Ontology =def.

a representational artifact whose representational units (which may be drawn from a natural or from some formalized language) are intended to represent

1. universals in reality

2. those relations between these universals which obtain universally (= for all instances)

lung is_a anatomical structure

lobe of lung part_of lung

Page 51: How to Build an Ontology

59

Part II

How to Build an Ontology

Page 52: How to Build an Ontology

60

How to build an ontology

work with scientists to create an initial top-level classification

find ~50 most commonly used terms corresponding to universals in reality

arrange these terms into an informal is_a hierarchy according to this Universality principle

A is_a B =def. every instance of A is an instance of B

fill in missing terms to give a complete hierarchy

(leave it to domain scientists to populate the lower levels of the hierarchy)
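The Universality principle can be checked mechanically once some instance data is available. The sketch below assumes a single-parent, acyclic hierarchy stored as a child-to-parent mapping; the ordering of the terms (taken from the organism example in these slides) and the instance names are assumptions.

```python
# Informal is_a hierarchy as child -> parent (assumed ordering).
IS_A = {
    "siamese": "cat",
    "cat": "mammal",
    "mammal": "animal",
    "animal": "organism",
}


def ancestors(term, is_a=IS_A):
    """Return every term reachable from `term` by following is_a upwards."""
    found = []
    while term in is_a:
        term = is_a[term]
        found.append(term)
    return found


def violations(asserted_types, is_a=IS_A):
    """List (instance, child, parent) triples that break the principle.

    `asserted_types` maps each instance to the set of universals it is
    recorded as instantiating. A is_a B requires every instance of A to be
    recorded as an instance of B as well.
    """
    broken = []
    for instance, types in asserted_types.items():
        for child in types:
            for parent in ancestors(child, is_a):
                if parent not in types:
                    broken.append((instance, child, parent))
    return broken


# Invented instance data: "felix" is missing the type "organism".
data = {
    "tibbles": {"cat", "mammal", "animal", "organism"},
    "felix": {"cat", "mammal", "animal"},
}
print(sorted(violations(data)))
```

A non-empty result means either the instance data is incomplete or an is_a link in the hierarchy is wrong; deciding which is a job for the domain scientists.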

Page 53: How to Build an Ontology

61

Principle of Low Hanging Fruit

Include even absolutely trivial assertions (assertions you know to be universally true)

pneumococcal virus is_a virus

Computers need to be led by the hand

Page 54: How to Build an Ontology

62

MeSH

MeSH Descriptors
  Index Medicus Descriptor
    Anthropology, Education, Sociology and Social Phenomena (MeSH Category)
      Social Sciences
        Political Systems
          National Socialism

National Socialism is_a Political Systems

National Socialism is_a Anthropology ...

Page 55: How to Build an Ontology

63

Principle

Use singular nouns

Terms in ontologies represent universals

Page 56: How to Build an Ontology

64

Goal: Each term in an ontology represents exactly one universal

there are universals also of collectivities:

population

complex of cells

Page 57: How to Build an Ontology

65

the use-mention confusion

Conceptual Entities =Def.

An organizational header for concepts representing mostly abstract entities.

swimming is healthy and has eight letters

Page 58: How to Build an Ontology

66

Principle

Avoid confusing words with things

Avoid confusing concepts in our minds with entities in reality

Recommendation: avoid the word ‘concept’ entirely

Page 59: How to Build an Ontology

67

Trialbank

‘information’ = def. ‘a written or spoken designation of a concept’

Page 60: How to Build an Ontology

68

‘Heparin therapy’ is an instance of ‘written or spoken designation of a concept’

What are the problems here?

1. misuse of quotation marks

2. confusion of instances and universals

3. confusion of concept and reality

Trialbank

Page 61: How to Build an Ontology

69

Plant Ontology

cell = def. plant cell, consisting of protoplast and cell wall; ...

Page 62: How to Build an Ontology

70

Principle

For the sake of interoperability with other ontologies, do not give special meanings to terms with established general meanings

(Don’t use ‘cell’ when you mean ‘plant cell’)

Page 63: How to Build an Ontology

71

ICNP: International Classification for Nursing Practice

water =def. a type of Nursing Phenomenon of Physical Environment with the specific characteristics: clear liquid compound of hydrogen and oxygen that is essential for most plant and animal life influencing life and development of human beings.

Page 64: How to Build an Ontology

72

Principle

Supply definitions wherever possible

(both human-understandable natural language definitions, and equivalent formal definitions)

Page 65: How to Build an Ontology

73

Principle

Each term should have at most one definition*

*which may have both natural-language and formal versions

Page 66: How to Build an Ontology

74

The Problem of Circularity

A Person = def. A person with an identity document

cell = def. plant cell, consisting of protoplast and cell wall; ...

Page 67: How to Build an Ontology

75

Principle

Avoid circular definitions

(The term defined should not appear in its own definition)
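A rough automatic check for the simplest kind of circularity is to look for the defined term inside its own definition. This is a sketch: the function name is invented, and the check only catches literal reuse of the term, not inflected forms or circles that run through several definitions.

```python
import re


def is_circular(term, definition):
    """Return True if the defined term occurs as a whole word or phrase
    in its own definition (case-insensitive)."""
    pattern = r"\b" + re.escape(term.lower()) + r"\b"
    return re.search(pattern, definition.lower()) is not None


# The first two definitions are the ones quoted on the circularity slide.
print(is_circular("person", "A person with an identity document"))
print(is_circular("cell", "plant cell, consisting of protoplast and cell wall"))
# A stand-in definition that does not reuse its own term.
print(is_circular("lung", "an anatomical structure"))
```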

Page 68: How to Build an Ontology

76

HL7

‘stopping a medication’ = def.

change of state in the record of a Substance Administration Act from Active to Aborted

Page 69: How to Build an Ontology

77

Principle

A definition should use terms which are easier to understand than the term defined

(HL7 creates a topsy turvy world, in which simple things are made difficult)

Page 70: How to Build an Ontology

78

Principle

Use Aristotelian definitions

An A is a B which C’s.
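The pattern "An A is a B which Cs" has three slots: the term, its genus and its differentia. A minimal sketch of that structure follows; the class name, field names and the example definition are illustrative and not taken from any ontology.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AristotelianDefinition:
    """'An A is a B which Cs': term A, genus B, differentia C."""
    term: str          # A, the term being defined
    genus: str         # B, the immediate is_a parent
    differentia: str   # C, what marks out As among the Bs

    def render(self):
        return f"{self.term} =def. a {self.genus} which {self.differentia}"


# Illustrative content only.
plant_cell = AristotelianDefinition(
    term="plant cell",
    genus="cell",
    differentia="is part of a plant",
)
print(plant_cell.render())
```

Because the genus slot holds exactly one parent, definitions of this form fit naturally with a hierarchy in which each term has a single is_a parent.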

Page 71: How to Build an Ontology

79

Principle

Do not seek to define everything

Page 72: How to Build an Ontology

80

In every ontology

some terms and some relations are primitive = they cannot be defined (on pain of infinite regress)

Examples of primitive relations:

identity

instance_of

Page 73: How to Build an Ontology

83

Rules for formatting terms

• Avoid abbreviations even when it is clear in context what they mean (‘breast’ for ‘breast tumor’)

• Avoid acronyms

• Avoid mass terms (‘tissue’, ‘brain mapping’, ‘clinical research’ ...)

• Treat each term ‘A’ in an ontology as shorthand for a term of the form ‘the universal A’

Page 74: How to Build an Ontology

84

Univocity

Terms should have the same meanings on every occasion of use.

(They should refer to the same universals)

Basic ontological relations such as is_a and part_of should be used in the same way by all ontologies

Page 75: How to Build an Ontology

85

Universality

Ontologies should include only those relational assertions which hold universally

pneumococcal virus causes pneumonia

Page 76: How to Build an Ontology

86

Universality

Often, order will matter:

We can assert

adult transformation_of child

but not

child transforms_into adult

Page 77: How to Build an Ontology

87

Universality

viral pneumonia caused by virus

but not

virus causes pneumonia

pneumococcal virus causes pneumonia

Page 78: How to Build an Ontology

88

Universality

protocol-design earlier_than results analysis

but not

results analysis later_than protocol-design
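Why order matters can be seen by reading "A R B" as: every instance of A stands in R to some instance of B. That all-some reading is an assumption about how "holds universally" is meant here, and the instance data below is invented.

```python
def holds_universally(a_instances, b_instances, related):
    """All-some reading: every instance of A is related to some instance of B.

    `related` is a set of (a, b) pairs recording instance-level relations.
    """
    return all(
        any((a, b) in related for b in b_instances)
        for a in a_instances
    )


# Invented data: three children, only two of whom became adults.
children = {"child_1", "child_2", "child_3"}
adults = {"adult_1", "adult_2"}

transformation_of = {("adult_1", "child_1"), ("adult_2", "child_2")}
transforms_into = {(c, a) for (a, c) in transformation_of}

# adult transformation_of child: holds for every adult.
print(holds_universally(adults, children, transformation_of))
# child transforms_into adult: fails for child_3.
print(holds_universally(children, adults, transforms_into))
```

The same pair of universals supports an assertion in one direction but not in the other, because not every child goes on to become an adult.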

Page 79: How to Build an Ontology

89

Positivity

Complements of universals are not themselves universals.

Terms such as

non-mammal

non-membrane

other metalworker in New Zealand

do not designate universals in reality

Page 80: How to Build an Ontology

90

Ontology of universals ≠ logic of terms

There are no conjunctive and disjunctive universals:

anatomic structure, system, or substance

musculoskeletal and connective tissue disorder

rheumatism, excluding the back

Page 81: How to Build an Ontology

91

Objectivity

Which universals exist in reality is not a function of our knowledge.

Terms such as

unknown

unclassified

unlocalized

arthropathies not otherwise specified

do not designate universals in reality.
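The Positivity and Objectivity rules, together with the ban on conjunctive and disjunctive universals, lend themselves to a simple term linter. The patterns below are heuristics assembled from the examples on these slides; the rule names and the function are invented, and a match is only a prompt for human review, since legitimate terms can contain "and" or "or".

```python
import re

# Heuristic patterns; not an exhaustive test.
SUSPECT_PATTERNS = {
    "positivity": re.compile(r"^(non-|other\b)", re.IGNORECASE),
    "conjunction/disjunction": re.compile(
        r"\b(and|or|excluding)\b", re.IGNORECASE
    ),
    "objectivity": re.compile(
        r"\b(unknown|unclassified|unlocalized|not otherwise specified)\b",
        re.IGNORECASE,
    ),
}


def lint_term(term):
    """Return the names of the rules that `term` appears to violate."""
    return [
        rule for rule, pattern in SUSPECT_PATTERNS.items()
        if pattern.search(term)
    ]


for term in [
    "non-mammal",
    "anatomic structure, system, or substance",
    "arthropathies not otherwise specified",
    "lobe of lung",
]:
    print(term, "->", lint_term(term))
```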

Page 82: How to Build an Ontology

92

Keep Epistemology Separate from Ontology

If you want to say that

We do not know where A’s are located

do not invent a new class of

A’s with unknown locations

(A well-constructed ontology should grow linearly; it should not need to delete classes or relations because of increases in knowledge)

Page 83: How to Build an Ontology

93

If you want to say

I surmise that this is a case of pneumonia

do not invent a new class of surmised pneumonias

Keep Sentences Separate from Terms

Page 84: How to Build an Ontology

94

Single Inheritance

No kind in a classificatory hierarchy should have more than one is_a parent on the immediate higher level
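Violations of single inheritance are easy to find automatically. The sketch below assumes the hierarchy is available as (child, parent) pairs and uses the blue car example from these slides; the function name is invented.

```python
from collections import defaultdict


def multiple_parents(is_a_pairs):
    """Return {term: sorted parents} for terms with more than one is_a parent."""
    parents = defaultdict(set)
    for child, parent in is_a_pairs:
        parents[child].add(parent)
    return {t: sorted(p) for t, p in parents.items() if len(p) > 1}


pairs = [
    ("car", "thing"),
    ("blue thing", "thing"),
    ("blue car", "car"),
    ("blue car", "blue thing"),
]
print(multiple_parents(pairs))
```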

Page 85: How to Build an Ontology

95

Multiple Inheritance

[Figure: thing at the top; car and blue thing below it; blue car at the bottom, linked by is_a to both car and blue thing]

Page 86: How to Build an Ontology

96

Multiple Inheritance

is a source of errors

encourages laziness

serves as obstacle to integration with neighboring ontologies

hampers use of Aristotelian methodology for defining terms

hampers use of statistical search tools

Page 87: How to Build an Ontology

97

Multiple Inheritance

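[Figure: thing at the top; car and blue thing below it; blue car at the bottom, linked to car and blue thing by two distinct relations, is_a1 and is_a2]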

Page 88: How to Build an Ontology

98

is_a Overloading

The success of ontology alignment demands that ontological relations (is_a, part_of, ...) have the same meanings in the different ontologies to be aligned.

Page 89: How to Build an Ontology

99

Compositionality

The meanings of compound terms should be determined

1. by the meanings of component terms

together with

2. the rules governing syntax

Page 90: How to Build an Ontology

100

Why do we need rules/standards for good ontology?

Ontologies must be intelligible both to humans (for annotation and curation) and to machines (for reasoning and error-checking): the lack of rules for classification leads to human error and blocks automatic reasoning and error-checking

Intuitive rules facilitate training of curators and annotators

Common rules allow alignment with other ontologies

Page 91: How to Build an Ontology

ontologies are legends for cartoons

Page 92: How to Build an Ontology

102

Randomized controlled trials

http://rctbank.ucsf.edu/ontology/outline/index.htm

Page 93: How to Build an Ontology

103

Top-Level Class Hierarchy for RCT

Root
Secondary-study
Trial-details
Trial
Concept
• Generic-concept
• Population-concept
• Protocol-concept
• Design-concept
• Outcome-concept
• Administrative-concept
• Intervention-concept

Page 94: How to Build an Ontology

104

Trial Details

Root
Secondary-study
Trial-details
• Erratum
• Publication-details
• Trial-entry-details
• Administrative-details
  – Secondary-administrative-details
  – Primary-administrative-details
    » Executed-administrative-details
    » Intended-administrative-details
• Conclusion-details
• Background-details
  – Intended-background-details
  – Executed-background-details
• Stopping-details
• Retraction-details
• Correction-details
• Fraud-details

Page 95: How to Build an Ontology

105

Top-Most Class Hierarchy for RCT

Root
Secondary-study
Trial-details
Trial
Concept
• Generic-concept
• Population-concept
• Protocol-concept
• Design-concept
• Outcome-concept
• Administrative-concept
• Intervention-concept

Page 96: How to Build an Ontology

106

Concept
• Generic-concept
  – Term-information
  – Time-entity
  – Rule-concept
  – Situation
• Population-concept
  – Subgroup
  – Recruitment-flowchart
  – Population
  – Recruitment
  – Site-enrollment
• Protocol-concept
  – Follow-up-compliance
  – Follow-up-activity
  – Follow-up
  – Protocol-change
  – Treatment-assignment
  – Protocol
  – Reason
  – Outcomes-followup
  – Secondary-study-protocol

Page 97: How to Build an Ontology

107

Concept
• Design-concept
  – Survival-analysis-and-results
  – Statistical-analysis-and-results
  – Sample-size-calculation
  – Trial-design
  – Hypothesis-concept
  – Study-objective
  – Study-monitoring
  – Regression-analysis-and-results
  – Stopping-rule
• Outcome-concept
  – Special-variable-information
  – Outcome-assessment
  – Miscellaneous-outcome-entity
  – Result-entity
  – Outcome-value-entity
  – Outcome

Page 98: How to Build an Ontology

108

Concept
• Administrative-concept
  – Publication-concept
  – Study-site
  – Person
  – Ethics
  – Study-committee
  – Funder
  – Institution
  – Registry-id
• Intervention-concept
  – Blinding-concept
  – Compliance-details
  – Intervention-step
  – Intervention-arm
  – Co-intervention
  – Intervention
  – Compliance-result
  – Intervention-logic

Page 99: How to Build an Ontology

109

Top-Level Class Hierarchy for RCT

Root
Secondary-study
Trial-details
Trial
Concept
• Generic-concept
• Population-concept
• Protocol-concept
• Design-concept
• Outcome-concept
• Administrative-concept
• Intervention-concept

Page 100: How to Build an Ontology

110

What the top level should look like

Page 101: How to Build an Ontology

111

Two kinds of entities

occurrents (processes, events, happenings)

continuants (objects, qualities, states...)

Page 102: How to Build an Ontology

112

Continuants (aka endurants)
• have continuous existence in time
• preserve their identity through change
• exist in toto whenever they exist at all

Occurrents (aka processes)
• have temporal parts
• unfold themselves in successive phases
• exist only in their phases

Page 103: How to Build an Ontology

113

You are a continuant

Your life is an occurrent

You are 3-dimensional

Your life is 4-dimensional

Page 104: How to Build an Ontology

114

Dependent entities

require independent continuants as their bearers

There is no run without a runner

There is no grin without a cat

Page 105: How to Build an Ontology

115

Dependent vs. independent continuants

Independent continuants (organisms, buildings, environments)

Dependent continuants (quality, shape, role, propensity, function, status, power, right)

Page 106: How to Build an Ontology

116

All occurrents are dependent entities

They are dependent on those independent continuants which are their participants (agents, patients, media ...)

Page 107: How to Build an Ontology

117

BFO Top-Level Ontology

Continuant
• Independent Continuant
• Dependent Continuant

Occurrent (always dependent on one or more independent continuants)
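The top-level distinctions can be mirrored in a small class hierarchy in which dependence is enforced by construction. This is a sketch for illustration, not BFO's own formalization: the class `Process` and the field names `bearer` and `participants` are assumptions.

```python
from dataclasses import dataclass


class Entity:
    """Anything which exists."""


class Continuant(Entity):
    """Exists in toto whenever it exists at all."""


class Occurrent(Entity):
    """Unfolds in time and has temporal parts."""


@dataclass
class IndependentContinuant(Continuant):
    name: str


@dataclass
class DependentContinuant(Continuant):
    name: str
    bearer: IndependentContinuant   # there is no grin without a cat


@dataclass
class Process(Occurrent):
    name: str
    participants: list              # independent continuants taking part

    def __post_init__(self):
        # There is no run without a runner: require at least one participant.
        if not self.participants:
            raise ValueError(f"occurrent {self.name!r} needs a participant")


cat = IndependentContinuant("cat")
grin = DependentContinuant("grin", bearer=cat)
runner = IndependentContinuant("runner")
run = Process("run", participants=[runner])
print(grin.bearer.name, [p.name for p in run.participants])
```

A dependent continuant cannot be created without naming its bearer, and a process without participants is rejected, which is the dependence stated on the preceding slides.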

Page 108: How to Build an Ontology

118

= A representation of top-level types

Continuant Occurrent

Independent Continuant

Dependent Continuant

cell component

biological process

molecular function

Page 109: How to Build an Ontology

119

Top-Level Ontology

Continuant Occurrent

Independent Continuant

Dependent Continuant

Functioning

Side-Effect, Stochastic Process, ...

Function

Page 110: How to Build an Ontology

120

Top-Level Ontology

Continuant Occurrent

Independent Continuant

Dependent Continuant

Functioning Side-Effect, Stochastic Process, ...

Function

Page 111: How to Build an Ontology

121

Top-Level Ontology

Continuant Occurrent

Independent Continuant

Dependent Continuant

Quality Function Spatial Region

Functioning Side-Effect, Stochastic Process, ...

instances (in space and time)

Page 114: How to Build an Ontology

124

CTO will be part of OBI

Ontology of Biomedical Investigations

http://obi.sourceforge.net

which is in turn part of the OBO Foundry

http://obofoundry.org

Page 120: How to Build an Ontology

132

Top-Level Class Hierarchy for RCT

Root
Secondary-study
Trial-details
Trial
Concept
• Generic-concept
• Population-concept
• Protocol-concept
• Design-concept
• Outcome-concept
• Administrative-concept
• Intervention-concept

Page 121: How to Build an Ontology

133

Amended Top-Level Class Hierarchy for RCT

Entity
Continuant
• Population
• Protocol
• Design
Occurrent
• Trial
• Secondary-study
• Intervention
?? Trial-details
?? Outcome-concept
?? Administrative-concept

Page 122: How to Build an Ontology

134

Concept
• Generic-concept
  – Term-information
  – Time-entity
  – Rule-concept
    » Clinical-rule
      Exclusion-rule
      Inclusion-rule
    » Rule-entity
      Recursive-rule
      Base-rule
    » Ethnicity-language-rule
    » Age-gender-rule
    » Situation

Page 125: How to Build an Ontology

137

Concept
• Protocol-concept
  – Follow-up-compliance
  – Follow-up-activity
  – Follow-up
  – Protocol-change
  – Treatment-assignment
  – Protocol
  – Reason
  – Outcomes-followup
  – Secondary-study-protocol

Page 126: How to Build an Ontology

138

Amended Top-Level Class Hierarchy for RCT

Entity
Continuant
Protocol
• Secondary-study-protocol
Reason
Occurrent
• Treatment-assignment
• Follow-up
  – Follow-up-activity
  – Outcomes-follow-up
• Protocol-change

Page 127: How to Build an Ontology

139

Concept
• Population-concept
  – Subgroup
  – Recruitment-flowchart
  – Population
  – Recruitment
  – Site-enrollment

Page 128: How to Build an Ontology

140

Amended Top-Level Class Hierarchy for RCT

Entity
Continuant
Protocol
• Secondary-study-protocol
Recruitment-flowchart
Reason
Population
• Subgroup
Occurrent
• Priors
  – Recruitment
  – Site-enrollment
  – Treatment-assignment
• Follow-up
  – Follow-up-activity
  – Outcomes-follow-up
• Protocol-change

Page 129: How to Build an Ontology

141

Concept
• Administrative-concept
  – Publication-concept
  – Study-site
  – Person
  – Ethics
  – Study-committee
  – Funder
  – Institution
  – Registry-id

Page 130: How to Build an Ontology

142

Continuant
• Information object
  – Publication
  – Registry-ID
• Study-site
• Person
• Institution
  – Study-committee
  – Funder
??? Ethics

Page 131: How to Build an Ontology

143

Concept
• Intervention-concept
  – Blinding-concept
  – Compliance-details
  – Intervention-step
  – Intervention-arm
  – Co-intervention
  – Intervention
  – Compliance-result
  – Intervention-logic

Page 132: How to Build an Ontology

144

Occurrent
• Intervention
  – Blinding
  – Intervention-step
  – Intervention-arm
  – Co-intervention
• ??? Intervention-logic
• ??? Compliance-result
• ??? Compliance-details

Page 133: How to Build an Ontology

167

END