introduction to anatomy ontology building

Introduction to anatomy ontology building. David Osumi-Sutherland, FlyBase (www.flybase.org), Virtual Fly Brain (www.virtualflybrain.org)


Post on 23-Feb-2016


Page 1: Introduction to anatomy ontology building

+Introduction to anatomy ontology building
David Osumi-Sutherland
FlyBase (www.flybase.org)
Virtual Fly Brain (www.virtualflybrain.org)

Page 2: Introduction to anatomy ontology building

+Take home messages
• An ontology is a classification.
• There are lots of useful ways to classify stuff.
• Maintaining multiple classification schemes by hand is impractical, so you should automate it.
• Everybody makes mistakes, so you should get the computer to find errors for you.
• Re-use other people’s work where possible:
  • import class hierarchies
  • use common patterns
• Cautionary note: formal languages have limitations. Don’t expect to be able to express everything!

Page 3: Introduction to anatomy ontology building

+What is an ontology ?

A set of defined, inter-related terms to use in annotation/metadata/knowledge bases.

A classification

A query-able store of (scientific) knowledge that uses logical inference.

Page 4: Introduction to anatomy ontology building

+What is an ontology ?

A set of defined, inter-related terms to use in annotation/metadata/knowledge bases.

A classification

A query-able store of (scientific) knowledge that uses logical inference.

[Slide diagram: “depends on” arrows link the three descriptions above.]

Page 5: Introduction to anatomy ontology building

+What (use) is an ontology?

• A set of defined, inter-related terms to use in annotation.
• Annotation of papers; specimens; gene expression; phenotype…
• Use of common annotation terms across multiple databases allows easy integration.
• Relations between terms allow annotations to be grouped in scientifically meaningful ways.
• This requires an ontology to be an accurate and scientifically meaningful classification and store of scientific knowledge.

Page 6: Introduction to anatomy ontology building

+What is an ontology ?

A classification
There are lots of scientifically useful ways to classify a bit of anatomy:
• its parts and their arrangement
• its relation to other structures (what is it part of, connected to, adjacent to, overlapping?)
• its shape
• its function
• its developmental origins
• its species or clade
• its evolutionary history?

Page 7: Introduction to anatomy ontology building

+What is an ontology ?

The scientific knowledge an ontology contains can make the reasons for classification explicit. e.g.

Any sense organ that functions in the detection of smell is an olfactory sense organ

All large basiconic sensilla of the antenna function in detection of smell

Therefore all large basiconic sensilla of the antenna are olfactory sense organs

Page 8: Introduction to anatomy ontology building

+Virtual Fly Brain Demo

Page 9: Introduction to anatomy ontology building

+Why ontology development is like software or database development
Ideal case:
• maintainable: basic maintenance (e.g. correcting simple errors) is easy
• scalable: grow your project as large as you need without breaking
• extensible: easy to add new functionality without breaking existing functionality
• integrate-able: can integrate easily with work of others, so you don’t have to solve all problems yourself

Page 10: Introduction to anatomy ontology building

+Why ontology development is like software or database development
Ideal case: future editors can build on your work
• maintainable (by multiple editors): basic maintenance (e.g. correcting simple errors) is easy
• scalable (by multiple editors): grow your project as large as you need without breaking
• extensible (by multiple editors): easy to add new functionality without breaking existing functionality
• integrate-able: can integrate easily with work of others, so you don’t have to solve all problems yourself

Page 11: Introduction to anatomy ontology building

+How not to build ontologies- The trap

A small, simple ontology or program with one developer can get away with practices that a large one cannot.
Given:
• shallow, single inheritance classification (each class has 0-1 superclasses)
• very few relationship types
• < 1000 terms
it is feasible to:
• have little annotation/documentation
• have no automated error checking
• have no automated classification
• keep redundancy to a minimum by hand

Page 12: Introduction to anatomy ontology building

+How not to build ontologies- The trap

Small, simple ontologies and programs have a habit of growing large and complicated.
• Users demand lots more terms for annotation.
• Users demand multiple axes of classification. There is no scientific reason to favor one over another.
• Users demand, and editors favor, multiple relationship types to record information they believe scientifically important.
• Editors/coders move on and someone else has to continue their work. Is the documentation mainly in the old developer’s head?

Page 13: Introduction to anatomy ontology building

+How not to build ontologies- The trap

Worst case scenario – the tangled pit of misery: Difficult, perhaps impossible to maintain or extend

Tangled, convoluted, redundant structure with little or no documentation or annotation.

Editing tends to inadvertently break previous functionality.

Little or no error checking means you don't even notice when you break stuff. Users find out later.

Even you can't easily edit what you built 6 months ago without getting confused and making a mess.

Page 14: Introduction to anatomy ontology building

+Avoiding tangled pits of misery

There are no perfect answers, but these might help:
• good annotation and documentation;
• good, consistent style;
• avoidance of redundancy;
• let the computer keep track of things for you;
• modularity;
• automate a consistent set of tests of existing functionality (JUnit / consistency);
• constant testing during development;
• design patterns.

Page 15: Introduction to anatomy ontology building

+Good Practice 1:Good annotation and documentation

Clear textual definitions with references:
• ensure accurate manual annotation
• make assertions of scientific fact trace-able
• serve as documentation for future ontology developers

Also useful to record, for users and future developers:
• experimental evidence for assertions of scientific fact
• notes on confusing or conflicting usage of terms
• reasons for design choices/compromises

Page 16: Introduction to anatomy ontology building

+Options for formalization

OWL
• W3C standard
• Decidable (OWL DL)
• Big open source community of tool developers
• Multiple fast reasoners, getting better all the time
• Easy to read syntax: OWL Manchester syntax (OWL MS)

OBO
• Best thought of as a subset of OWL, with which it is increasingly integrated
• Limited community of tool developers
• Easy(ish) to read syntax

Common Logic
• Very powerful, but easy to come up with solutions that can’t be usefully reasoned with.

Page 17: Introduction to anatomy ontology building

+Relationships are the formalized part of a definition.

The criteria for class membership are recorded using textual definitions, at least some elements of which are formalized as relationships.

name: insect wing
def: “A membranous dorsal appendage of the meso- or metathorax that functions in flight.” [Snodgrass, 1935]
is_a: appendage
relationship: part_of thoracic segment
relationship: has_function_in flight
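To make the split between free text and formalized relationships concrete, here is a minimal Python sketch that reads the formal tags of the stanza above into a dictionary. It assumes a simplified `tag: value` layout with term names rather than the identifiers a real OBO file would use; the function name `parse_stanza` is hypothetical and this is not a full OBO parser.

```python
# Simplified stanza modelled on the slide; real OBO files use IDs and more tags.
STANZA = """\
name: insect wing
is_a: appendage
relationship: part_of thoracic segment
relationship: has_function_in flight
"""


def parse_stanza(text):
    """Split a simplified OBO-like stanza into plain tags and relationships."""
    term = {"relationships": []}
    for line in text.splitlines():
        tag, _, value = line.partition(": ")
        if tag == "relationship":
            # The first token is the relation, the rest names the target class.
            relation, _, target = value.partition(" ")
            term["relationships"].append((relation, target))
        elif tag:
            term[tag] = value
    return term


wing = parse_stanza(STANZA)
```

The resulting `wing["relationships"]` list holds the machine-readable part of the definition, which is the part a reasoner can work with.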

Page 18: Introduction to anatomy ontology building

+Classification is transitive

If A SubClassOf* B and B SubClassOf C, then A SubClassOf C.
All members of class A are members of class C. So, the definition of class C must apply to class A.

* OWL (MS) SubClassOf ≅ OBO is_a

Page 19: Introduction to anatomy ontology building

+Classification is transitive
‘material anatomical entity’
  <- is_a ‘sense organ’
    <- is_a sensillum
      <- is_a ‘olfactory sensillum’
        <- is_a ‘antennal basiconic sensillum’

‘material anatomical entity’: “… has mass.”
‘sense organ’: “… functions in the detection of a stimulus involved in sensory perception.”
sensillum: “A sense organ consisting of a small cluster of cells of various types.”
‘olfactory sensillum’: “… functions in the detection of smell.”

* OWL (MS) SubClassOf ≅ OBO is_a
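The transitivity of is_a can be sketched in a few lines of Python. This is only an illustration using the class names from the hierarchy above, stored as a hypothetical child-to-parents dictionary; it is not how any particular reasoner is implemented.

```python
# Asserted is_a links only (child -> direct parents), following the slide.
IS_A = {
    "antennal basiconic sensillum": ["olfactory sensillum"],
    "olfactory sensillum": ["sensillum"],
    "sensillum": ["sense organ"],
    "sense organ": ["material anatomical entity"],
}


def superclasses(cls, is_a=IS_A):
    """Return every direct and inferred superclass of cls."""
    found = set()
    stack = list(is_a.get(cls, []))
    while stack:
        parent = stack.pop()
        if parent not in found:
            found.add(parent)
            stack.extend(is_a.get(parent, []))
    return found


inferred = superclasses("antennal basiconic sensillum")
```

Every class in `inferred` contributes its definition to ‘antennal basiconic sensillum’, so an error high in the hierarchy is inherited by everything below it.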

Page 20: Introduction to anatomy ontology building

+Class-class relationships are quantified
Class:class relationships are many to many. Does the relation apply to all or just some of the class? We specify this with quantifiers:
• ∀: for all, all, only, every
• ∃: there exists, some

Cautionary note: modeling knowledge as class hierarchies defined with quantified logic is extremely useful but limited. Don’t expect to be able to use it for everything you know! The expressivity of OWL is more limited still.

Page 21: Introduction to anatomy ontology building

+Relationships specify necessary conditions for class membership
Being part of an insect thorax is a necessary condition of being in the class ‘insect leg’.

English:
  All insect legs are part of some (type of) insect thorax
OBO (quantifiers hidden):
  name: insect leg
  relationship: part_of thorax
OWL (MS):
  ‘insect leg’ SubClassOf part_of some thorax
PL:
  ∀x [leg(x) → ∃y (thorax(y) ∧ part_of(x, y))] *

* ignoring time argument from OBO RO 2005

Page 22: Introduction to anatomy ontology building

+Classification is transitive
If A SubClassOf* B and B SubClassOf C, then A SubClassOf C.
All members of class A are members of class C. So, the definition of class C must apply to class A.

* OWL (MS) SubClassOf ≅ OBO is_a

(all) leg part_of some thorax
‘front leg’ SubClassOf leg
therefore (all) ‘front leg’ part_of some thorax
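The inheritance step in the leg example can be sketched as follows. This is a toy illustration with a hypothetical single-parent dictionary; a real ontology allows several parents per class and a reasoner does considerably more.

```python
# Asserted axioms from the slide: one is_a link and one relationship.
IS_A = {"front leg": "leg"}  # child -> parent (single parent for brevity)
RELATIONSHIPS = {"leg": [("part_of", "thorax")]}


def inherited_relationships(cls):
    """Collect relationships asserted on cls and on all of its ancestors."""
    rels = []
    while cls is not None:
        rels.extend(RELATIONSHIPS.get(cls, []))
        cls = IS_A.get(cls)
    return rels


front_leg_rels = inherited_relationships("front leg")
```

Nothing was asserted on ‘front leg’ directly, yet it ends up with `("part_of", "thorax")` because every front leg is a leg.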

Page 23: Introduction to anatomy ontology building

+Directionality and quantifiers

True: all ‘insect wing’ part_of some ‘insect thorax’

False: all ‘insect thorax’ has_part some ‘insect wing’

True: all ‘claw’ connected_to some ‘tarsal segment’

False: all ‘tarsal segment’ connected_to some claw
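Why the inverse statement fails can be checked on a small set of individuals. The sketch below assumes hypothetical instance data (two wings on one thorax, plus a second thorax from a wingless insect) and tests the "all X relation some Y" pattern directly.

```python
# Hypothetical individuals and their classes.
TYPES = {
    "wing1": "insect wing",
    "wing2": "insect wing",
    "thoraxA": "insect thorax",  # thorax of a winged insect
    "thoraxB": "insect thorax",  # thorax of a wingless insect
}
PART_OF = {("wing1", "thoraxA"), ("wing2", "thoraxA")}
# has_part is the inverse of part_of at the level of individuals.
HAS_PART = {(whole, part) for part, whole in PART_OF}


def all_some(subject_type, object_type, pairs):
    """True if every subject_type individual is related to some object_type individual."""
    subjects = [i for i, t in TYPES.items() if t == subject_type]
    return all(
        any((s, o) in pairs and TYPES[o] == object_type for o in TYPES)
        for s in subjects
    )


wings_ok = all_some("insect wing", "insect thorax", PART_OF)
thoraxes_ok = all_some("insect thorax", "insect wing", HAS_PART)
```

Here `wings_ok` is true and `thoraxes_ok` is false: a single wingless thorax is enough to falsify the "all thorax has_part some wing" statement, even though has_part and part_of are inverses between individuals.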

Page 24: Introduction to anatomy ontology building

+Manually maintaining an ontology with multiple classification schemes is impractical

It is difficult to keep track of multiple classification chains to:
• ensure completeness;
• avoid redundancy;
• avoid introducing error due to inheritance of classification criteria from a distant ancestor.

Page 25: Introduction to anatomy ontology building

+Automating multiple classification
The scientific knowledge an ontology contains can make the reasons for classification explicit, e.g.:
• Any sense organ that functions in the detection of smell is an olfactory sense organ.
• All large basiconic sensilla of the antenna function in detection of smell.
• Therefore all large basiconic sensilla of the antenna are olfactory sense organs.

Page 26: Introduction to anatomy ontology building

+Automating multiple classification
We can specify that some set of necessary conditions for class membership are sufficient to determine class membership.

English:
  Any sense organ that functions in the detection of smell is an olfactory sense organ
OWL (MS):
  ‘olfactory sense organ’ EquivalentTo: ‘sense organ’ that has_function_in some ‘detection of chemical stimulus involved in sensory perception of smell’
OBO:
  name: olfactory sense organ
  intersection_of: sense organ
  intersection_of: has_function_in ‘detection of chemical stimulus involved in sensory perception of smell’

Page 27: Introduction to anatomy ontology building

+Automating multiple classification.

‘olfactory sense organ’ EquivalentTo: ‘sense organ’ that has_function_in some ‘detection of chemical stimulus involved in sensory perception of smell’

‘large basiconic sensillum of antenna’
  SubClassOf: ‘sense organ’
  SubClassOf: has_function_in some ‘detection of chemical stimulus involved in sensory perception of smell’

Reasoner concludes:
  ‘large basiconic sensillum of antenna’ SubClassOf ‘olfactory sense organ’

Keene & Waddell, 2007
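The inference above can be imitated with a toy classifier in Python. This is a heavily simplified sketch, not a description of how OWL reasoners work: it assumes hypothetical dictionaries for the axioms, matches genus and differentia by exact name, and ignores inherited superclasses and subclasses of the filler.

```python
GO_SMELL = "detection of chemical stimulus involved in sensory perception of smell"

# Defined classes: name -> (genus, set of (relation, filler) differentia).
EQUIVALENT_TO = {
    "olfactory sense organ": ("sense organ", {("has_function_in", GO_SMELL)}),
}

# Asserted necessary conditions for ordinary classes.
ASSERTED = {
    "large basiconic sensillum of antenna": {
        "is_a": {"sense organ"},
        "rels": {("has_function_in", GO_SMELL)},
    },
}


def infer_superclasses(cls):
    """Return defined classes whose sufficient conditions cls satisfies."""
    axioms = ASSERTED[cls]
    return {
        name
        for name, (genus, differentia) in EQUIVALENT_TO.items()
        if genus in axioms["is_a"] and differentia <= axioms["rels"]
    }


inferred = infer_superclasses("large basiconic sensillum of antenna")
```

The editor only records what is known about the sensillum; the link to ‘olfactory sense organ’ is derived, so it cannot be forgotten or go stale.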

Page 28: Introduction to anatomy ontology building

+Use other people’s work to build your classification Gene Ontology classification of sensory processes:

Page 29: Introduction to anatomy ontology building

+Automating multiple classification.

Page 30: Introduction to anatomy ontology building

+Some extra OWL expressivity

In OWL we can also specify number (cardinality):
  (all) insect SubClassOf has_component exactly 6 leg

Page 31: Introduction to anatomy ontology building

+Error checking is essential – everybody makes mistakes

Some classes don’t have instances in common. Nothing can be an oak tree and a fruit fly; an anatomical structure and a biological process. We say that such classes are disjoint

Declaring classes to be disjoint allows reasoners to find contradictions. This is especially powerful when combined with domain and range constraints.

This is your main means of error checking. Use it extensively. It also speeds up some reasoners.

Page 32: Introduction to anatomy ontology building

+Error checking - domain and range constraints
‘cortisol secretion’ SubClassOf ‘endocrine hormone secretion’ SubClassOf process
‘adrenal gland’ SubClassOf ‘endocrine gland’ SubClassOf structure
structure DisjointWith process (nothing can be both a structure, e.g. adrenal gland, and a process, e.g. cortisol secretion)

has_function_in
  domain: structure*
  range: process*
If x has_function_in y then x must be a structure and y must be a process.

Now if I mistakenly add: ‘cortisol secretion’ has_function_in some ‘adrenal gland’
Inconsistency: ‘cortisol secretion’ SubClassOf structure and process

* more strictly, structure = continuant; process = occurrent
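The check described above can be sketched in Python. This is a simplified illustration with hypothetical dictionaries for the axioms; it treats the domain as an extra type on the subject and the range as an extra type on the object, then looks for any class that ends up under both members of a disjoint pair.

```python
IS_A = {
    "cortisol secretion": "endocrine hormone secretion",
    "endocrine hormone secretion": "process",
    "adrenal gland": "endocrine gland",
    "endocrine gland": "structure",
}
DISJOINT = [("structure", "process")]
DOMAIN_RANGE = {"has_function_in": ("structure", "process")}


def ancestors(cls):
    """All superclasses reachable through is_a (single parent for brevity)."""
    out = set()
    while cls in IS_A:
        cls = IS_A[cls]
        out.add(cls)
    return out


def check(subject, relation, obj):
    """Return (class, disjoint pair) violations for: subject relation some obj."""
    domain, range_ = DOMAIN_RANGE[relation]
    subject_types = ancestors(subject) | {subject, domain}
    object_types = ancestors(obj) | {obj, range_}
    return [
        (who, pair)
        for who, types in ((subject, subject_types), (obj, object_types))
        for pair in DISJOINT
        if set(pair) <= types
    ]


good = check("adrenal gland", "has_function_in", "cortisol secretion")
bad = check("cortisol secretion", "has_function_in", "adrenal gland")
```

The correct axiom yields an empty list, while the mistaken one reports that ‘cortisol secretion’ would have to be both a structure and a process.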


Page 34: Introduction to anatomy ontology building

+Reasoner assisted error checking by eye

Keep an eye on classification inferred by the reasoner.

Protégé shows inferred classification and inherited relationships – keep an eye on these

Page 35: Introduction to anatomy ontology building

+Reasoner assisted error checking by eye

Run some test queries – do they give the answers you expect?

Page 36: Introduction to anatomy ontology building

+Mereology
part_of is transitive:
  If A part_of B part_of C part_of D, then A part_of D.

overlaps is not transitive:
  If A overlaps B overlaps C, then A may or may not overlap C.

[Slide diagrams: regions labelled A, B, C and D]
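Both claims can be checked with a toy spatial model in Python. The sketch assumes regions are represented as sets of hypothetical grid cells, with part_of as the subset relation and overlaps as a non-empty intersection; this is an illustration, not a definition taken from any relations ontology.

```python
def part_of(x, y):
    """x is part of y if every cell of x is also a cell of y."""
    return x <= y


def overlaps(x, y):
    """x overlaps y if they share at least one cell."""
    return bool(x & y)


# Nested regions: part_of chains through to the outermost region.
inner, middle, outer = {2}, {2, 3}, {1, 2, 3, 4}

# A row of three regions: the neighbours overlap, the ends do not.
a, b, c = {1, 2}, {2, 3}, {3, 4}

chain_holds = part_of(inner, middle) and part_of(middle, outer) and part_of(inner, outer)
ends_overlap = overlaps(a, c)
```

With these regions `chain_holds` is true, while `a` overlaps `b` and `b` overlaps `c` without `a` overlapping `c`.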

Page 37: Introduction to anatomy ontology building

+Transitivity of part_of

Given:
  (All) ‘insect coxa’ part_of some ‘insect leg’
  (All) ‘insect leg’ part_of some ‘insect thoracic segment’
  (All) ‘insect thoracic segment’ part_of some ‘insect thorax’
Then:
  (All) ‘insect coxa’ part_of some ‘insect thorax’

Page 38: Introduction to anatomy ontology building

+Automating partonomy

As for classification, maintaining multiple overlapping part hierarchies by hand is hard.
There is some scope for auto-populating partonomies, e.g.:

English:
  Any anatomical structure that functions in endocrine hormone secretion is part of some endocrine system
OWL:
  (‘anatomical structure’ that has_function_in some ‘endocrine hormone secretion’) SubClassOf (part_of some ‘endocrine system’)
OBO:
  name: endocrine system component
  intersection_of: anatomical structure
  intersection_of: has_function_in ‘endocrine hormone secretion’
  relationship: part_of endocrine system
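The rule above can be sketched as a small Python function that decides which classes should receive an inferred part_of link to ‘endocrine system’. The class names and dictionaries are hypothetical example data, and the subclass test is a simple single-parent walk rather than real reasoning.

```python
# Hypothetical asserted axioms (child -> parent, class -> function).
IS_A = {
    "adrenal gland": "anatomical structure",
    "wing": "anatomical structure",
    "cortisol secretion": "endocrine hormone secretion",
}
HAS_FUNCTION_IN = {"adrenal gland": "cortisol secretion", "wing": "flight"}


def is_subclass(cls, target):
    """True if cls is target or has target among its is_a ancestors."""
    while cls is not None:
        if cls == target:
            return True
        cls = IS_A.get(cls)
    return False


def inferred_endocrine_parts():
    """Classes matching: 'anatomical structure' that has_function_in some 'endocrine hormone secretion'."""
    return {
        cls
        for cls, function in HAS_FUNCTION_IN.items()
        if is_subclass(cls, "anatomical structure")
        and is_subclass(function, "endocrine hormone secretion")
    }


endocrine_parts = inferred_endocrine_parts()
```

Only ‘adrenal gland’ qualifies here, because its function is a kind of endocrine hormone secretion; no part_of link had to be written by hand.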

Page 39: Introduction to anatomy ontology building

+Declaring spatial disjointness provides error checking for partonomy

In OWL: part_of some X DisjointWith part_of some Y
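A minimal Python sketch of this kind of check follows, combining the transitivity of part_of with one declared spatial disjointness. The partonomy is hypothetical example data with one deliberate mistake, and the function names are illustrative only.

```python
# Asserted part_of links (part -> direct wholes); the last entry is a deliberate error.
PART_OF = {
    "insect coxa": {"insect leg"},
    "insect leg": {"insect thorax"},
    "insect antenna": {"insect head"},
    "arista": {"insect antenna", "insect leg"},
}
# Nothing can be part of both members of a pair.
SPATIALLY_DISJOINT = [{"insect head", "insect thorax"}]


def all_wholes(cls):
    """Every structure cls is part of, following part_of transitively."""
    found, stack = set(), list(PART_OF.get(cls, ()))
    while stack:
        whole = stack.pop()
        if whole not in found:
            found.add(whole)
            stack.extend(PART_OF.get(whole, ()))
    return found


def partonomy_errors():
    """Classes that end up inside two spatially disjoint structures."""
    return sorted(
        cls
        for cls in PART_OF
        if any(pair <= all_wholes(cls) for pair in SPATIALLY_DISJOINT)
    )


errors = partonomy_errors()
```

The mistaken link is only detectable after following part_of upwards: the arista is never directly asserted to be part of the head or the thorax.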

Page 40: Introduction to anatomy ontology building

+Reasoning with overlap

A overlaps B if and only if there exists some X such that X part_of A and X part_of B.

Rules:
  If X part_of A then X overlaps A
  If A has_part X then A overlaps X

Relation hierarchy:
  part_of * overlaps
  has_part * overlaps
In OWL (MS) * = SubPropertyOf; in OBO * = is_a

[Slide diagrams: regions A and B with a shared part X]

Page 41: Introduction to anatomy ontology building

+Reasoning with overlap

More rules:
  If A has_part X and X part_of B then A overlaps B
  If C has_part A and A overlaps B then C overlaps B
  If B overlaps A and A part_of C then B overlaps C

In OWL (MS):
  has_part o part_of -> overlaps
In OBO:
  name: overlaps
  holds_over_chain: has_part part_of

[Slide diagrams: regions A, B and C with a shared part X]
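The sub-property rules and the has_part o part_of chain can be applied mechanically. The Python sketch below works on hypothetical pairs of region names and covers only those rules plus symmetry of overlaps; it does not iterate the remaining rules to a fixed point.

```python
# Hypothetical asserted facts between regions.
HAS_PART = {("A", "X")}
PART_OF = {("X", "B"), ("A", "C")}


def inferred_overlaps(has_part, part_of):
    """Derive overlaps pairs from the sub-property rules and the property chain."""
    # part_of and has_part are both sub-properties of overlaps.
    overlaps = set(part_of) | set(has_part)
    # Chain: A has_part X and X part_of B implies A overlaps B.
    overlaps |= {(a, b) for a, x in has_part for y, b in part_of if x == y}
    # Sharing a part is symmetric, so overlaps holds in both directions.
    overlaps |= {(b, a) for a, b in overlaps}
    return overlaps


overlap_pairs = inferred_overlaps(HAS_PART, PART_OF)
```

The pair `("A", "B")` appears in the result although no overlap between A and B was asserted; it follows from the shared part X.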

Page 42: Introduction to anatomy ontology building

+

Image - Greg Jefferis

Keene & Waddell, 2007

Page 43: Introduction to anatomy ontology building

+Shortcut relations

In OWL, we can write compound class expressions:
  ‘antennal lobe projection neuron’ has_part some (soma that part_of some ‘antennal lobe cortex’)

But these can quickly get long and verbose:
  ‘DL1 adPN’ has_part some (postsynaptic membrane (GO) that part_of some (synapse (GO) that part_of some ‘DL1 glomerulus’))

Page 44: Introduction to anatomy ontology building

+Shortcut relations
Shortcut relations stand in for compound class expressions:
  ‘DL1 adPN’ has_part some (postsynaptic membrane (GO) that part_of some (synapse (GO) that part_of some ‘DL1 glomerulus’))
becomes
  ‘DL1 adPN’ has_postsynaptic_terminal_in some ‘DL1 glomerulus’

Can be expanded if detail needed.

Provides rigorous documentation of meaning.
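Expanding a shortcut relation amounts to filling a template. The Python sketch below assumes a hypothetical `EXPANSIONS` table keyed by relation name and produces the long-form class expression as a string; the way real tools store and apply such expansions may differ.

```python
# Hypothetical expansion templates keyed by shortcut relation name.
EXPANSIONS = {
    "has_postsynaptic_terminal_in": (
        "has_part some ('postsynaptic membrane' that part_of some "
        "(synapse that part_of some {filler}))"
    ),
}


def expand(relation, filler):
    """Return the compound class expression a shortcut relation stands in for."""
    return EXPANSIONS[relation].format(filler=f"'{filler}'")


expanded = expand("has_postsynaptic_terminal_in", "DL1 glomerulus")
```

Keeping the template in one place means the short form stays readable for editors while the full meaning remains documented and recoverable.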

Page 45: Introduction to anatomy ontology building

+Where to start?

• Make a flat list of the terms you need and list the types of classification you want to use to link them together.
• Has someone already formalized this type of classification? If so, use their pattern. If not, draft some formalizations yourself:
  • Are any simplifications justifiable, or likely to be too misleading?
  • DON’T FORMALIZE FOR THE SAKE OF IT! Some classifications are hard to formalize well, or may be best left to human judgment.
• Import upper classifications and relations.
• Import classifications to root for all foreign terms used.
• Work with ontologists to formally define relations where possible. But don’t let this become a road block!

Page 46: Introduction to anatomy ontology building

+Technical issues

Imports:
• Importing whole ontologies is easy in both OBO and OWL.
• But importing large ontologies is impractical in both.
• Generating simple slices of OBO ontologies is easy (have Perl scripts, happy to share).
• Generating slices of OWL ontologies: some tools exist (Ontofox), but they still need work.
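A simple slice of this kind keeps the foreign terms you use plus their classification up to the root. The Python sketch below illustrates the idea on a hypothetical external hierarchy with placeholder term names; it is not the Perl scripts or Ontofox mentioned above.

```python
# Hypothetical external ontology: child -> direct parents.
FOREIGN_IS_A = {
    "D": ["C"],
    "C": ["B"],
    "B": ["root"],
    "E": ["B"],
    "F": ["root"],
}


def slice_to_root(terms, is_a):
    """Keep the requested terms and every ancestor on the way to the root."""
    keep, stack = set(), list(terms)
    while stack:
        term = stack.pop()
        if term not in keep:
            keep.add(term)
            stack.extend(is_a.get(term, []))
    # Return only the is_a links needed to connect the kept terms.
    return {term: is_a.get(term, []) for term in keep}


module = slice_to_root(["D"], FOREIGN_IS_A)
```

The slice for `D` contains `D`, `C`, `B` and `root` but leaves out the unrelated branches `E` and `F`, which keeps the import small.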

Page 47: Introduction to anatomy ontology building

+Developing nested ontologies

[Slide diagram labels: CARO; VAO; present TAO; modularized ontology]

Page 48: Introduction to anatomy ontology building

+Resources

• CARO: upper ontology. New version being prepared, out soon.
  • Some standard patterns using qualities.
• FUNCARO: provides standard patterns for representing function using CARO + GO.
• ro.owl: new home for OBO relations, particularly shortcut relations.
  • Imports fundamental relations from BFO (Basic Formal Ontology).

Page 49: Introduction to anatomy ontology building

+Multiple classification

There are lots of scientifically useful ways to classify a bit of anatomy:
• its parts and their arrangement
• its relation to other structures (what is it part of, connected to, adjacent to, overlapping?)
• its shape
• its function
• its developmental origins
• its species or clade
• its evolutionary history?

Page 50: Introduction to anatomy ontology building

+
type of classification | relation | object of relation
what parts does it have? | has_part; has_component (for counts) | anatomical entity
what is it part of? | part_of | anatomical entity
quality (e.g. shape) | has_quality | PATO term
function | has_function_in; capable_of (?) | GO; perhaps behavior ontologies?
developmental origin | develops_from | anatomical entity
developmental fate | develops_into | anatomical entity
connectivity (e.g. muscle/tendon to bone) | connected_to | anatomical entity
evolutionary origin | derived_by_descent_from ?; homologous_to ? | anatomical entity
species/clade/taxon | in_taxon ? | species/clade/taxon

Page 51: Introduction to anatomy ontology building

+Avoiding tangled pits of misery

There are no perfect answers, but these might help:

You do this by hand:
• good annotation and documentation;
• good, consistent style.

Automated classification and consistency checking gives:
• avoidance of redundancy;
• the computer keeps track of things for you;
• automation of a consistent set of tests of existing functionality (JUnit / consistency);
• constant testing during development.

Importing useful slices of other ontologies gives you:
• modularity.

Upper ontologies give you:
• design patterns.

Page 52: Introduction to anatomy ontology building

+Take home messages
• An ontology is a classification.
• There are lots of useful ways to classify stuff.
• Maintaining multiple classification schemes by hand is impractical, so you should automate it.
• Everybody makes mistakes:
  • let the computer find errors for you
  • use the reasoner to test as you build
• Re-use other people’s work where possible:
  • import class hierarchies
  • use common patterns
• Cautionary note: formal languages have limitations. Don’t expect to be able to express everything!

Page 53: Introduction to anatomy ontology building

+Acknowledgments

Virtual Fly Brain - Michael Ashburner, Cahir O’Kane, Douglas Armstrong, Simon Reeve, Nestor Milyaev

FlyBase
HAO: Andy Deans, Matt Yoder, Jim Balhoff
Chris Mungall, LBL Berkeley
Melissa Haendel, eagle-I
Alan Ruttenberg, SUNY Buffalo
Barry Smith, SUNY Buffalo
Robert Stevens (Co-ode; OWL-API), Manchester University
BBSRC (grant award BB/G02247X/1)

Page 54: Introduction to anatomy ontology building

+

Page 55: Introduction to anatomy ontology building

+Drosophila anatomy ontology as an example
Circa 2006: tangled pit of misery.
• 6500 terms; 6% with definitions, many of them not suitable (give example).
• Sufficient inconsistency that it was not reliable for grouping terms / reasoning (give examples).
• Sufficiently incomplete that most queries/groupings missed very many terms (use mechanosensory bristle, or something similar, as example).
• Editing a nightmare: unclear what the original reasons were for relationships.
• For any term, not clear what relations already inferred or how to

Page 56: Introduction to anatomy ontology building

+

class | 2008 (0% inferred) | 2011 (100% inferred)
sense organ | 835 | 759
chemosensory organ | 14 | 96
gustatory organ | 0 | 49
olfactory organ | 0 | 37