uberon lausanne-2012

50
Uberon – a multi-species ontology for phenomics and evo-devo analyses Chris Mungall, LBNL Melissa Haendel, OHSU Lausanne Feb 2012

Upload: chris-mungall

Post on 11-May-2015


TRANSCRIPT

Page 1: Uberon lausanne-2012

Uberon – a multi-species ontology for phenomics and evo-devo analyses

Chris Mungall, LBNL
Melissa Haendel, OHSU

Lausanne Feb 2012

Page 2: Uberon lausanne-2012

Outline

• Introduction to Bio-Ontologies
  – Ontologies for data analysis and data integration
  – Anatomy ontology re-use vs variation in nature
• Uberon
  – Integration with species anatomy ontologies
  – Interoperation with non-anatomy ontologies
  – Reasoning and validation
  – Handling taxonomic variation
  – Applications
• Homology
• Conclusions

Page 3: Uberon lausanne-2012

[Figure: ontology graph of lung-related classes. Nodes: lung, thoracic cavity organ, thoracic cavity, lung alveolus, organ system, respiratory system, lung bud, respiratory primordium. Edge types: develops_from, part_of, is_a (SubClassOf).]

Ontologies abstract over repeated patterns in nature

Page 4: Uberon lausanne-2012

[Figure: the lung ontology graph from Page 3, repeated. Edge types: develops_from, part_of, is_a (SubClassOf).]

Logical semantics: the difference between ontologies and graphs

x instance-of lung
⇒ x instance-of ‘thoracic cavity organ’

Page 5: Uberon lausanne-2012

[Figure: the lung ontology graph from Page 3, repeated. Edge types: develops_from, part_of, is_a (SubClassOf).]

Logical semantics: the difference between ontologies and graphs

x instance-of lung
⇒ exists y: y instance-of ‘lung bud’ and x develops-from y

Page 6: Uberon lausanne-2012

[Figure: the lung ontology graph from Page 3, repeated, with the gene Plunc attached by an ‘expressed in’ edge and a further ‘expressed in’ edge marked (inferred). Edge types: develops_from, part_of, is_a (SubClassOf).]

Formal semantics allows for more precise queries

x expressed in y & y part of z
⇒ x expressed in z

x expressed ubiquitously in y & y part of z
⇒ x expressed ubiquitously in z ✗
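The two rules above differ only in whether a relation may be propagated up the part_of hierarchy. A minimal Python sketch of that distinction, assuming a toy part_of chain and an illustrative annotation of Plunc to lung; the names `PART_OF`, `PROPAGATES_OVER_PART_OF` and `infer` are hypothetical and not part of any Uberon tooling:

```python
# Toy part_of edges (child -> parent); assumed for illustration only.
PART_OF = {
    "lung alveolus": "lung",
    "lung": "respiratory system",
}

# Relations that may be propagated up part_of. 'expressed ubiquitously in'
# is deliberately absent: ubiquity in a part says nothing about the whole.
PROPAGATES_OVER_PART_OF = {"expressed in"}


def ancestors_by_part_of(structure):
    """Yield every structure that `structure` is transitively part_of."""
    while structure in PART_OF:
        structure = PART_OF[structure]
        yield structure


def infer(gene, relation, structure):
    """Return the asserted triple followed by any inferred triples."""
    triples = [(gene, relation, structure)]
    if relation in PROPAGATES_OVER_PART_OF:
        triples += [(gene, relation, s) for s in ancestors_by_part_of(structure)]
    return triples


print(infer("Plunc", "expressed in", "lung"))
print(infer("Plunc", "expressed ubiquitously in", "lung"))
```

A real reasoner would derive this from the declared semantics of each relation rather than from a hand-written whitelist.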

Page 7: Uberon lausanne-2012

Ontology Languages

• Web Ontology Language (OWL)
  – Standard set of logical constructs for building an ontology
  – Many syntaxes
    • OWL-RDF/XML
    • OWL-XML
    • Manchester
  – Many reasoners
• OBO-Format
  – Currently formalized by mapping to a subset of OWL
    • can be treated as another OWL syntax

Page 8: Uberon lausanne-2012

Many perspectives, many ontologies

[Figure: domains covered by different ontologies: chemical entities, proteins, cells, cell anatomy, tissues, gross anatomy, nervous system, phenotypes, clinical disorders, processes, physiological processes, development, reactions, cellular processes, behavior, evolutionary characters.]

Page 9: Uberon lausanne-2012

The problem: Data Silos

[Figure: the lung as represented separately in GO, MPO, MA, FMA and EHDAA2. Class labels shown include lung, lung alveolus, lung bud, respiratory primordium, pharyngeal region, respiratory system, organ system, lower respiratory tract, alveolar sac, pulmonary acinus, pleural sac, thoracic cavity, thoracic cavity organ, lobular organ, parenchymatous organ, solid organ, respiratory gaseous exchange, respiratory system process, multicellular organismal process, abnormal lung morphology, abnormal respiratory system morphology, abnormal pulmonary acinus morphology and abnormal pulmonary alveolus morphology. Edge types: develops_from, part_of, is_a (SubClassOf), surrounded_by.]

Page 10: Uberon lausanne-2012

The OBO Foundry

• Avoid silo-ization via ontologies that are
  – open
  – documented
  – reusable
  – interoperable
  – built according to shared principles
  – reuse core relations and patterns
• Problem:
  – How do we re-use in the presence of variability?

http://obofoundry.org

Page 11: Uberon lausanne-2012

Ontologies built for one species will not work for others

http://fme.biostr.washington.edu:8080/FME/index.html

http://ccm.ucdavis.edu/bcancercd/22/mouse_figure.html

Page 12: Uberon lausanne-2012

Generalization leads to complexity

[Figure: class hierarchy with nodes cell, nucleate cell, enucleate cell, erythrocyte, nucleate erythrocyte, enucleate erythrocyte, human erythrocyte, zebrafish erythrocyte.]

Variables:
V : Variability of entities in domain
P : Logical Precision of queries, TP/(TP+FP*c)
L : “Latticeyness” of class hierarchy (‘exception hierarchy’)

Hypothesis:
L = kPV

Page 13: Uberon lausanne-2012

Anatomy Ontology Menagerie

• Mouse:
  – MA (adult)
  – EMAP / EMAPA (embryonic)
• Human
  – FMA (adult)
  – EHDAA2 (CS1-CS20)
• Amphibian
  – AAO
  – XAO
• Fish
  – ZFA
  – TAO
• Nematode
  – WBbt
• Arthropod
  – FBbt (Drosophila)
  – HAO
  – Arthropod anatomy ontology

Reduced taxonomic scope = Reduced complexity

Contrast to: Gene Ontology (GO) (all kingdoms of life)

Historically little coordination

Page 14: Uberon lausanne-2012

Sept 2011

Page 15: Uberon lausanne-2012

Semantic Similarity of Phenotypes

"Linking Human Diseases to Animal Models Using Ontology-Based Phenotype Annotation." PLoS Biol 7(11): e1000247. doi:10.1371/journal.pbio.1000247 Washington NL, Haendel MA, Mungall CJ, Ashburner M, Westerfield M, Lewis SE

FMA+PATO MP ZFA+PATO FBbt+PATO

Page 16: Uberon lausanne-2012

The problem with mappings

Class A | Class B | In Bioportal? | Useful?
FMA extensor retinaculum of wrist | MA retina | Yes | No
FMA portion of blood | MA blood | No | Yes
ZFA Macula | MA macula | Yes | No
ZFA aortic arch | MA arch of aorta | Yes | Dubious
ZFA hypophysis | MA pituitary | No | Yes
FMA tibia | FBbt tibia | Yes | No
FMA colon | GAZ Colón, Panama | Yes | No
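The table suggests why purely lexical mappings mislead in both directions. The sketch below is a deliberately naive exact-label matcher applied to four of the rows above; it is an assumption for illustration and not a description of how BioPortal actually computes its mappings. The function names are hypothetical.

```python
def normalize(label):
    """Lower-case and collapse whitespace; a deliberately naive normalizer."""
    return " ".join(label.lower().split())


def naive_match(label_a, label_b):
    """Treat two classes as the same if their normalized labels are equal."""
    return normalize(label_a) == normalize(label_b)


# (class A label, class B label, is the mapping actually useful?)
CANDIDATES = [
    ("tibia", "tibia", False),            # FMA bone vs FBbt insect leg segment
    ("Macula", "macula", False),          # ZFA vs MA
    ("portion of blood", "blood", True),  # FMA vs MA
    ("hypophysis", "pituitary", True),    # ZFA vs MA
]

for a, b, useful in CANDIDATES:
    print(f"{a!r} ~ {b!r}: matched={naive_match(a, b)}, useful={useful}")
```

In this toy run every lexical match is a false positive and every useful mapping is missed, which is the motivation for curated grouping classes.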

Page 17: Uberon lausanne-2012

Our solution (2008-2009)

• Create grouping classes for mappings
  – Used our own software for entity matching
  – Manually split/merge in OboEdit using curator knowledge
  – Internal joke name: Uberon
• Used in phenotype analysis
  – Washington et al
  – http://owlsim.org
• We kept on tweaking
  – Used for GO logical definitions
  – Used in cell ontology
  – Used to clarify and align existing AOs
  – Integrated logic-based methods
  – We got criticized, we got better
• Fast forward to 2012…

Page 18: Uberon lausanne-2012

Uberon in 2012

• Size:
  – >6500 classes
  – >19000 relationships (50 relations)
  – >2000 logical definitions
• Scope:
  – Metazoa
    • vertebrate bias, in particular mammals
• Availability
  – many versions, in obo and owl
    • http://uberon.org
  – Source version is obo, compiled to owl using Oort
• What does it look like?

Page 19: Uberon lausanne-2012

[Figure: Uberon classes and their links to other ontologies. Class labels shown: anatomical structure, organ, organ part, respiration organ, lung, lung bud, lung primordium, respiratory primordium, foregut, endoderm, endoderm of foregut, alveolus, alveolus of lung, alveolar sac, pulmonary acinus, swim bladder. Classes from other ontologies: FMA:lung, MA:lung, MA:lung alveolus, FMA: pulmonary alveolus, EHDAA:lung bud, GO: respiratory gaseous exchange, NCBITaxon: Mammalia, NCBITaxon:Actinopterygii. Edge types: is_a (SubClassOf), is_a (taxon equivalent), part_of, develops_from, capable_of, only_in_taxon.]

Uberon classes generalize species-specific ones, and connect to other ontologies via a variety of relations

Page 20: Uberon lausanne-2012

Inter-ontology bridging axioms

• Equivalence axioms:
  – lung (FMA) EquivalentTo lung (Ubr) and ‘part of’ some NCBITaxon_9606
  – lung (MA) EquivalentTo lung (Ubr) and ‘part of’ some NCBITaxon_10090
• Subclass axioms:
  – lung (EMAPA) SubClassOf lung (Ubr)
• Axioms are maintained as xrefs
  – Translated to full axioms in obo2owl translation (header tags)
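The expansion from xref to bridging axiom can be sketched as a lookup on the xref prefix. This is a rough illustration only: the `BRIDGE_RULES` table and `bridge_axiom` function are hypothetical stand-ins, and the real obo2owl header-tag syntax is not reproduced here. The two taxon identifiers are the ones given on the slide.

```python
# Hypothetical rule table: xref prefix -> (axiom type, taxon restriction).
# This stands in for the ontology header tags; it is not their real syntax.
BRIDGE_RULES = {
    "FMA": ("EquivalentTo", "NCBITaxon_9606"),
    "MA": ("EquivalentTo", "NCBITaxon_10090"),
    "EMAPA": ("SubClassOf", None),
}


def bridge_axiom(uberon_label, xref):
    """Expand one xref such as 'FMA:lung' into a Manchester-style axiom string."""
    prefix, _, local = xref.partition(":")
    axiom_type, taxon = BRIDGE_RULES[prefix]
    rhs = f"{uberon_label} (Ubr)"
    if taxon is not None:
        rhs += f" and 'part of' some {taxon}"
    return f"{local} ({prefix}) {axiom_type} {rhs}"


for xref in ["FMA:lung", "MA:lung", "EMAPA:lung"]:
    print(bridge_axiom("lung", xref))
```

Keeping the axioms as xrefs plus a small rule table means curators edit one simple field while the logical form is generated mechanically.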

Page 21: Uberon lausanne-2012

Import closure of

Uberon ‘collector’ ontologies

Page 22: Uberon lausanne-2012

Different ontology modules

ontology | contents
basic | simple relationships
uberon | main ontology
merged | main ontology + links to GO, CL, NCBITaxon, NBO

taxon collected merged basic
metazoan ✔ ✔
vertebrate ✔ ✔
amniote ✔
aves ✔
euarchontoglires ✔

Page 23: Uberon lausanne-2012

Collected vs Merged

somite(Ubr)

somite(ZFA)

somite 1(ZFA)

somite(Ubr)

[includes ZFA axioms as GCIs]

somite 1(ZFA)

Page 24: Uberon lausanne-2012

Logical definitions in GO using Uberon

GO:notochord formation: The formation of the notochord from the chordamesoderm. The notochord is composed of large cells packed within a firm connective tissue sheath and is found in all chordates at the ventral surface of the neural tube. In vertebrates, the notochord contributes to the vertebral column.

Cross-Product Extensions of the Gene Ontology Journal of Biomedical Informatics 2010. Christopher J. Mungall and Michael Bada and Tanya Z. Berardini and Jennifer Deegan and Amelia Ireland and Midori A. Harris and David P. Hill and Jane Lomax

Page 25: Uberon lausanne-2012

Uberon and phenotype ontologies

[Figure: phenotype classes linked through Uberon. Nodes: MA:blood vessel, MA: retina, UBERON: retinal blood vessel, MP:abnormal retinal blood vessel morphology, HP: Central retinal artery vascular tortuosity, FMA:central retinal artery. Edge types: inheres_in, is_a.]

Page 26: Uberon lausanne-2012

Logical definitions in CL using Uberon

[Figure: nodes UBERON: trachea, UBERON: epithelium, CL: tracheal epithelial cell, CL: epithelial cell. Edge types: is_a, part_of.]

Uberon trachea: A trachea held open by up to 20 C-shaped rings of cartilage. The trachea is the portion of the airway that attaches to the bronchi as it branches.

Terrence Meehan, Anna Maria Masci, Amina Abdulla, Lindsay Cowell, Judith Blake, C J Mungall, Alexander Diehl (2011) Logical Development of the Cell Ontology, 6. In BMC Bioinformatics 12 (1)

Page 27: Uberon lausanne-2012

Uberon logical definitions represent functional, developmental, spatial, etc., axes of classification

Logical definitions in Uberon using external ontologies

[Figure: nodes UBERON: trachea, UBERON: respiratory airway, UBERON: respiratory system, CL: tracheal epithelial cell, CL: epithelial cell, GO: respiratory gaseous exchange. Edge types: is_a, part_of, capable_of.]

Page 28: Uberon lausanne-2012

J Deegan, E Dimmer, C J Mungall (2010) Formalization of taxon-based constraints to detect inconsistencies in annotation and ontology development, 530. In BMC bioinformatics 11 (1).

Uberon taxon constraints

[Figure: nodes UBERON: trachea, UBERON: respiratory airway, UBERON: respiratory system, CL: tracheal epithelial cell, CL: epithelial cell, GO: respiratory gaseous exchange, Vertebrata. Edge types: is_a, part_of, capable_of, only_in_taxon.]

Page 29: Uberon lausanne-2012

Axioms encoded in OWL provide explicit semantics

Page 30: Uberon lausanne-2012

Uberon iterative development cycle

Text matching
  – Stem and synonym matching

Curation
  – manual adding of new classes
  – obsoletion, merging, splitting

Reasoning
  • Keep axioms that are consistent across AOs
  • automated consistency checks for disjointness violations

Page 31: Uberon lausanne-2012

Developmental Biology, Scott Gilbert, 6th ed.

Using reasoners to detect errors

[Figure: fruit fly FBbt ‘tibia’ and human FMA ‘tibia’ shown alongside UBERON: tibia. Nodes: UBERON: tibia, UBERON: bone, Vertebrata, Drosophila melanogaster, Homo sapiens. Edge types: is_a, part_of, only_in_taxon, disjoint with.]
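One plausible reading of this figure is that a lexical mapping places both tibia classes under UBERON: tibia, and a taxon constraint inherited from UBERON: bone then exposes the fly class as an error. The sketch below hand-codes that reading as toy dictionaries; the edges are an assumption about the figure, and the check only imitates what an OWL reasoner would derive from the axioms.

```python
# Assumed reading of the figure; all edges here are toy data.
IS_A = {
    "FBbt:tibia": "UBERON:tibia",
    "FMA:tibia": "UBERON:tibia",
    "UBERON:tibia": "UBERON:bone",
}
ONLY_IN_TAXON = {"UBERON:bone": "Vertebrata"}
PART_OF_TAXON = {
    "FBbt:tibia": "Drosophila melanogaster",
    "FMA:tibia": "Homo sapiens",
}
TAXON_IS_A = {"Homo sapiens": "Vertebrata"}
DISJOINT_TAXA = {frozenset({"Drosophila melanogaster", "Vertebrata"})}


def closure(start, edges):
    """Return `start` plus everything reachable by following `edges`."""
    seen = [start]
    while seen[-1] in edges:
        seen.append(edges[seen[-1]])
    return seen


def violations(cls):
    """List inherited taxon constraints that contradict the class's own taxon."""
    own_taxa = closure(PART_OF_TAXON[cls], TAXON_IS_A)
    found = []
    for ancestor in closure(cls, IS_A):
        required = ONLY_IN_TAXON.get(ancestor)
        if required is None:
            continue
        if any(frozenset({t, required}) in DISJOINT_TAXA for t in own_taxa):
            found.append((ancestor, required))
    return found


print(violations("FBbt:tibia"))
print(violations("FMA:tibia"))
```

The fly class inherits the Vertebrata-only constraint and is flagged, while the human class passes.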

Page 32: Uberon lausanne-2012

Spatial disjointness axioms

• Example:
  – (part_of some midbrain) DisjointWith (part_of some hindbrain)
• Note: part_of implies all parts are part of
  – Brain spatial axioms derived from ABA
  – Used to find problems in existing mouse ontologies
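A spatial disjointness axiom of this kind can be checked by computing, for each class, everything it is transitively part_of and looking for a disjoint pair among the results. The sketch below assumes a toy hierarchy; ‘structure X’ is a hypothetical, deliberately misplaced class, and the names `PART_OF`, `wholes` and `disjointness_violations` are illustrative.

```python
# Toy part_of edges (child -> parents). 'structure X' is a hypothetical,
# deliberately wrong entry placed under both midbrain and hindbrain.
PART_OF = {
    "midbrain": ["brain"],
    "hindbrain": ["brain"],
    "midbrain tegmentum": ["midbrain"],
    "pons": ["hindbrain"],
    "structure X": ["midbrain tegmentum", "pons"],
}

# Each pair (A, B) means: (part_of some A) DisjointWith (part_of some B).
SPATIALLY_DISJOINT = [("midbrain", "hindbrain")]


def wholes(structure):
    """All structures that `structure` is transitively part_of."""
    result = set()
    stack = list(PART_OF.get(structure, []))
    while stack:
        parent = stack.pop()
        if parent not in result:
            result.add(parent)
            stack.extend(PART_OF.get(parent, []))
    return result


def disjointness_violations():
    """Return (structure, A, B) where structure is part of both A and B."""
    found = []
    for structure in PART_OF:
        containing = wholes(structure)
        for a, b in SPATIALLY_DISJOINT:
            if a in containing and b in containing:
                found.append((structure, a, b))
    return found


print(disjointness_violations())
```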

Page 33: Uberon lausanne-2012

Differences in bone and bone tissue representation

Ontology alignment

Page 34: Uberon lausanne-2012

Using Uberon for alignment facilitates identification of missing classes

Ontology alignment

Page 35: Uberon lausanne-2012

Managing variation: named subtypes

• ‘mammary gland’ part of some ‘female thoracic region’
  – humans ✔
  – other mammals ✗
• Solution:
  – mammary gland
    • thoracic mammary gland
    • abdominal mammary gland
    • inguinal mammary gland

Page 36: Uberon lausanne-2012

Managing variation: general axioms

• adenohypophysis develops from some ‘Rathke’s pouch’
  – tetrapoda ✔
  – teleost ✗
• Named subtypes solution
  – ‘Rathke’s pouch-derived adenohypophysis’
    • ugly!
• Alternative:
  – use anonymous classes / OWL general axioms:
    • (adenohypophysis and part of some tetrapoda) develops from some ‘Rathke’s pouch’

The adenohypophysis has different developmental origins in different species: while in most basal fish and tetrapods the adenohypophyseal anlage invaginates to form Rathke’s pouch, in teleost fish the adenohypophyseal placode does not invaginate but rather maintains its initial organization, forming a solid structure in the head.
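The effect of a taxon-scoped general axiom can be sketched as a filter: an axiom applies to an organism only if its taxon scope appears in the organism's lineage. The taxonomy below is heavily simplified and the names `TAXON_PARENT`, `GENERAL_AXIOMS` and `applicable` are hypothetical; the single axiom is the one from the slide.

```python
# Toy taxonomy (child -> parent); heavily simplified for illustration.
TAXON_PARENT = {
    "Mus musculus": "Tetrapoda",
    "Danio rerio": "Teleostei",
}

# Each general axiom: (class, taxon scope, relation, filler).
GENERAL_AXIOMS = [
    ("adenohypophysis", "Tetrapoda", "develops_from", "Rathke's pouch"),
]


def taxon_lineage(taxon):
    """Return the taxon followed by all of its ancestors."""
    lineage = [taxon]
    while lineage[-1] in TAXON_PARENT:
        lineage.append(TAXON_PARENT[lineage[-1]])
    return lineage


def applicable(cls, taxon):
    """Relations that hold for `cls` in organisms of the given taxon."""
    lineage = taxon_lineage(taxon)
    return [
        (relation, filler)
        for c, scope, relation, filler in GENERAL_AXIOMS
        if c == cls and scope in lineage
    ]


print(applicable("adenohypophysis", "Mus musculus"))
print(applicable("adenohypophysis", "Danio rerio"))
```

No extra named class is needed: the same ‘adenohypophysis’ class simply gains or loses the relationship depending on taxon.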

Page 37: Uberon lausanne-2012

Pharyngeal derivatives

• Pharyngeal pouches 1-5
  – dorsal and ventral parts
• Give rise to different structures in different clades
  – E.g.
    • parathyroid from ventral pouch 3 & 4 in many vertebrates
    • in humans, from dorsal pouches 3 and 4
  – Kardong, Vertebrates
• All encoded in Uberon using general axioms

Page 38: Uberon lausanne-2012

A logic for developmental relationships

• Most AOs use a single generic develops from relationship
  – FBbt distinguishes between direct and transitive development
  – EHDAA2 includes ‘develops in’
• Different structures give different contributions
  – E.g. neural crest
  – Modeled explicitly in EHDAA2
    • develops from relationships at very specific leaf nodes
• Relation composition
  – has_part o develops_from -> has_developmental_contribution_from
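The relation composition rule reads: if W has_part P and P develops_from O, then W has_developmental_contribution_from O. A minimal sketch of applying that single rule to sets of pairs, with placeholder structure names W, P and O that are purely hypothetical:

```python
def compose(has_part, develops_from):
    """Apply has_part o develops_from -> has_developmental_contribution_from."""
    inferred = set()
    for whole, part in has_part:
        for source, origin in develops_from:
            if source == part:
                inferred.add((whole, origin))
    return inferred


# Hypothetical facts: structure W has a part P, and P develops from O.
HAS_PART = {("W", "P")}
DEVELOPS_FROM = {("P", "O")}

print(compose(HAS_PART, DEVELOPS_FROM))  # {('W', 'O')}
```

This lets a develops from relationship asserted at a specific leaf node surface as a contribution to the larger structures that contain it.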

Credit: Osumi-Sutherland, Haendel and Bard

Page 39: Uberon lausanne-2012

Provenance for relations

[Term]
id: UBERON:0005562
name: thymus primordium
…
relationship: has_developmental_contribution_from UBERON:0010023 {gci_relation="part_of", gci_filler="NCBITaxon:7778", notes="Elasmobranchii", source="ISBN10:0073040584-table13.1"} ! dorsal part of pharyngeal pouch 2

OWL:
(‘thymus primordium’ and part_of some NCBITaxon_7778) SubClassOf has_developmental_contribution_from some ‘dorsal part of pharyngeal pouch 2’
Annotations: source "ISBN10:0073040584-table13.1"
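The translation from the qualified relationship tag to a taxon-scoped axiom can be sketched with a small parser. This is an illustration under assumptions, not the obo2owl implementation: the regular expression handles only the simple shape of the line shown on this slide, the function names are hypothetical, and the output writes an explicit `some`, following the usual reading of OBO relationship tags as existential restrictions.

```python
import re

LINE = (
    'relationship: has_developmental_contribution_from UBERON:0010023 '
    '{gci_relation="part_of", gci_filler="NCBITaxon:7778", '
    'notes="Elasmobranchii", source="ISBN10:0073040584-table13.1"} '
    '! dorsal part of pharyngeal pouch 2'
)

# relation, target ID, optional {qualifier block}, optional '! label' comment.
PATTERN = re.compile(
    r'^relationship:\s+(?P<rel>\S+)\s+(?P<target>\S+)\s*'
    r'(?:\{(?P<quals>[^}]*)\})?\s*(?:!\s*(?P<label>.*))?$'
)


def parse_relationship(line):
    """Split a relationship line into relation, target, qualifiers and label."""
    m = PATTERN.match(line)
    if m is None:
        raise ValueError(f"not a relationship line: {line!r}")
    quals = dict(re.findall(r'(\w+)="([^"]*)"', m.group("quals") or ""))
    label = (m.group("label") or "").strip()
    return m.group("rel"), m.group("target"), quals, label


def to_gci(subject_label, line):
    """Render the taxon-scoped axiom in the Manchester-like style of the slide."""
    rel, target, quals, label = parse_relationship(line)
    filler = quals["gci_filler"].replace(":", "_")
    return (
        f"('{subject_label}' and {quals['gci_relation']} some {filler}) "
        f"SubClassOf {rel} some '{label or target}'"
    )


print(to_gci("thymus primordium", LINE))
```

The `source` qualifier is kept in the parsed dictionary, so it can be attached to the generated axiom as an annotation.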

Page 40: Uberon lausanne-2012

Use of Uberon enhances species-specific ontologies

• Many ontologies lack develops from relationships
  – mouse
    • MA ✗
    • EMAPA ✗
  – human
    • FMA ✗
    • EHDAA2 ✔
    • SNOMED-CT ✗
• These can be enhanced by the develops from relationships in Uberon
  – E.g. find all pharyngeal arch derivatives
• Combine with Bgee expression data for powerful queries
  – E.g. compare gene expression patterns for pharyngeal arch derivatives
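A query of this kind amounts to a transitive walk over develops from edges, followed by a join against expression calls. The sketch below uses placeholder data throughout: the ‘structure’ and ‘gene’ names are hypothetical, and `EXPRESSION` merely stands in for Bgee-style data rather than reflecting its actual format or API.

```python
from collections import defaultdict

# Hypothetical develops_from edges (derivative, precursor). The "structure"
# names are placeholders, not curated Uberon content.
DEVELOPS_FROM = [
    ("structure A", "pharyngeal arch 1"),
    ("structure B", "structure A"),
    ("structure C", "some other primordium"),
]

# Hypothetical expression calls (gene, structure), standing in for Bgee data.
EXPRESSION = [("gene1", "structure B"), ("gene2", "structure C")]


def derivatives(precursor):
    """Return every structure that transitively develops from `precursor`."""
    children = defaultdict(set)
    for derived, source in DEVELOPS_FROM:
        children[source].add(derived)
    found, stack = set(), [precursor]
    while stack:
        for derived in children[stack.pop()]:
            if derived not in found:
                found.add(derived)
                stack.append(derived)
    return found


def genes_in_derivatives(precursor):
    """Genes with an expression call in any derivative of `precursor`."""
    targets = derivatives(precursor)
    return sorted({gene for gene, structure in EXPRESSION if structure in targets})


print(sorted(derivatives("pharyngeal arch 1")))
print(genes_in_derivatives("pharyngeal arch 1"))
```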

Page 41: Uberon lausanne-2012

Use of Uberon as building block for other ontologies

• Basic science
  – CL
  – GO
  – NBO (behavior)
  – Phenotype (MP, HP)
• Applications
  – OBI
  – eagle-i

Page 42: Uberon lausanne-2012

Applications of Uberon in bioinformatics analyses

• Crucial lynchpin in a number of phenotype analyses
  – Washington, Haendel et al
  – Mousefinder
  – Phenomenet
• Expression analyses
  – FANTOM5

Page 43: Uberon lausanne-2012

Uberon and homology

• Uberon classes do not need to be homologous
• We try to state necessary and sufficient conditions for all classes
  – Genus: parent class
  – Differentia may be any mix of:
    • Locational
    • Histological
    • Structural
    • Functional
    • Developmental
    • Or homology!
• This is essentially essentialist
  – ‘essentialist’ may make evo-devo folks uncomfortable, but it’s how most ontologies work

Page 44: Uberon lausanne-2012

Eyes

• Eye: organ and has function in GO: visual perception
  – Compound eye: has part ommatidia
  – Camera-type eye: equivalent to vHOG eye
    • vertebrate-type*
    • cephalopod-type*

*Not yet in ontology

Page 45: Uberon lausanne-2012

adrenal gland – interrenal gland

• Single class in vHOG
• Distinct classes in Uberon
  – Score highly on semantic similarity measures due to has_part relationships to cell types
• Homology can be handled separately
• Open question:
  – interrenal gland vs bodies?
  – Homology at the level of gland or cortex?

Page 46: Uberon lausanne-2012

Using Uberon and vHOG together

• UberHOG?

Page 47: Uberon lausanne-2012

Using Uberon and vHOG together

• vHUG?

Page 48: Uberon lausanne-2012

Proposal

• Separation of concerns
  – essentialist definitions
  – homology relationships
• Create ‘homology knowledgebase’
  – Statements anchored to Uberon classes
    • E.g.
      – lung (Ubr) has property: homologous, has_evidence …
      – head kidney + bone marrow, has property: homologous, has_evidence …
  – Use homology ontology
  – Contributions from vHOG and Phenoscape
• Automatically aggregate for powerful queries

Page 49: Uberon lausanne-2012

Conclusions

Anatomy ontologies have been developed independently and do not integrate well without additional help

• Uberon generalizes over species-specific anatomy classes

• Includes detailed anatomical knowledge via a variety of relationships

• designed for reasoning

• Highly interconnected with other ontologies

• Homology is largely separated

• Growing number of applications

• For more info:

• http://uberon.org

http://genomebiology.com/2012/13/1/R5

Page 50: Uberon lausanne-2012

Acknowledgments

• Uberon
  – Melissa Haendel
  – George Gkoutos
  – Carlo Torniai
  – Suzanna Lewis
• Ontologies
  – Jonathan Bard (EHDAA2)
  – Terry Meehan (CL)
  – Alex Diehl (CL)
  – Terry Hayamizu (MA/CL)
  – Onard Mejino (FMA)
  – David Hill (GO)
  – David Osumi-Sutherland (FBbt/CARO)
  – Paul Schofield (MPATH)
  – Wasila Dahdul (TAO/VAO)
  – Paula Mabee (TAO/VAO)
  – Erik Segerdell (XAO)
  – Monte Westerfield (ZFA)
  – Cynthia Smith (MP)
  – Maryann Martone (NIF)
  – Frederic Bastian (vHOG)
  – Marc Robinson-Rechavi (vHOG)
• Contributions
  – Alan Ruttenberg
  – Rob Hoehndorf
  – Wacek Kusnierczyk
  – Harry Hochheiser