Principles of (Biomedical) Ontology Design
Barry Smith, Department of Philosophy, University at Buffalo; National Center for Biomedical Ontology (http://ncbo.us)
A methodology for building and evaluating ontologies, applied thus far in the biomedical domain.
1
Principles of (Biomedical) Ontology Design
Barry Smith
Department of Philosophy, University at Buffalo
National Center for Biomedical Ontology (http://ncbo.us)
2
A methodology for building and evaluating ontologies
applied thus far in the biomedical domain to:
– FMA
– GO + other OBO Ontologies
– NCI Thesaurus
– UMLS Semantic Network
– FuGO
– SNOMED
– ICF (International Classification of Functioning, Disability and Health)
– BirnLex, RadioLex, Neuronames
– ISO Terminology Standards
– HL7-RIM
3
Some Examples
4
Foundational Model of Anatomy (FMA)
Pro
– Clear statement of scope: structural human anatomy, at all levels of granularity, from the whole organism to the biological macromolecule.
– Powerful treatment of definitions
– Single inheritance is_a hierarchy
Con
– Some unfortunate artifacts in the ontology deriving from its specific computer representation (Protégé)
5
[Figure: fragment of the FMA graph with is_a and part_of edges. Nodes: Anatomical Structure, Anatomical Space, Organ, Organ Cavity, Organ Part, Organ Subdivision, Organ Component, Organ Cavity Subdivision, Tissue, Serous Sac, Serous Sac Cavity, Serous Sac Cavity Subdivision, Pleural Sac, Pleural Cavity, Pleura (Wall of Sac), Parietal Pleura, Visceral Pleura, Mediastinal Pleura, Mesothelium of Pleura, Interlobar recess]
6
FMA follows formal rules for Aristotelian definitions
When A is_a B, the definition of ‘A’ takes the form:
an A =Def. a B which Cs ...
a human being =Def. an animal which is rational
7
Examples
Cell =Def. an anatomical structure which consists of cytoplasm surrounded by a plasma membrane
8
The FMA regimentation
– brings the advantage that circular definitions are avoided
– each definition reflects the position in the hierarchy to which a defined term belongs
– the position of a term within the hierarchy enriches its own definition by incorporating automatically the definitions of all the terms above it.
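A minimal Python sketch of this regimentation, assuming each definition is stored as a genus plus a differentia. The `Definition` class, the `expand` helper and the placeholder differentia for 'animal' are illustrative assumptions and not part of the FMA; the 'human being' and 'cell' entries follow the slides.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Definition:
    term: str
    genus: Optional[str]  # the B in "an A =Def. a B which Cs"; None at the root
    differentia: str      # the "which Cs" part


# Entries for 'human being' and 'cell' follow the slides; the differentia
# given for 'animal' is a placeholder assumption for illustration only.
DEFS: Dict[str, Definition] = {
    d.term: d
    for d in [
        Definition("human being", "animal", "is rational"),
        Definition("animal", None, "(placeholder differentia)"),
        Definition(
            "cell",
            "anatomical structure",
            "consists of cytoplasm surrounded by a plasma membrane",
        ),
    ]
}


def expand(term: str, defs: Dict[str, Definition]) -> List[str]:
    """Collect the differentia of `term` and of every defined genus above it."""
    chain: List[str] = []
    seen = set()
    current: Optional[str] = term
    while current is not None and current in defs:
        if current in seen:  # a term reached again through its own genus chain
            raise ValueError(f"circular definition involving {current!r}")
        seen.add(current)
        definition = defs[current]
        chain.append(f"{definition.term}: {definition.differentia}")
        current = definition.genus
    return chain
```

Calling `expand("human being", DEFS)` walks from the term up through its genus, which is one way to read the claim that a term's position in the hierarchy enriches its definition. The walk stops at any genus that has no entry of its own, such as 'anatomical structure' here.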
9
The FMA regimentation
The entire information content of the FMA’s term hierarchy can be translated very cleanly into a computer representation
But the definitions encapsulate this information in a modular form which is of maximal advantage to human beings
10
The FMA regimentation ensures intelligibility of definitions
The terms used in a definition should be simpler (more intelligible) than the term to be defined; otherwise the definition provides no assistance
– to human understanding
– to machine processing
11
FMA
organized in a graph-theoretical structure involving two sorts of links or edges:
is-a (= is a subtype of) (pleural sac is-a serous sac)
part-of (cervical vertebra part-of vertebral column)
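The two-edge graph structure can be sketched as follows. This is a small illustration under stated assumptions: the `OntologyGraph` class and its method names are invented here, and the edge from 'serous sac' to 'organ' is assumed for the example; only the first two edges come from the slide.

```python
from collections import defaultdict
from typing import Dict, Set, Tuple


class OntologyGraph:
    """Directed graph whose edges are labelled 'is_a' or 'part_of'."""

    def __init__(self) -> None:
        self._edges: Dict[Tuple[str, str], Set[str]] = defaultdict(set)

    def add(self, source: str, relation: str, target: str) -> None:
        if relation not in ("is_a", "part_of"):
            raise ValueError(f"unsupported relation: {relation}")
        self._edges[(relation, source)].add(target)

    def closure(self, node: str, relation: str) -> Set[str]:
        """Every node reachable from `node` along edges of one relation."""
        found: Set[str] = set()
        stack = [node]
        while stack:
            for target in self._edges.get((relation, stack.pop()), set()):
                if target not in found:
                    found.add(target)
                    stack.append(target)
        return found


fma = OntologyGraph()
fma.add("pleural sac", "is_a", "serous sac")                 # from the slide
fma.add("cervical vertebra", "part_of", "vertebral column")  # from the slide
fma.add("serous sac", "is_a", "organ")                       # assumed for illustration
```

Keeping the relation as part of the edge key means the is_a closure and the part_of closure are computed separately, so a query about subtypes never wanders along part_of edges.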
12
[Figure: the same FMA graph fragment as on slide 5, with is_a and part_of edges connecting Pleural Sac, Pleural Cavity, Pleura (Wall of Sac), Parietal Pleura, Visceral Pleura, Mediastinal Pleura and their parent types up to Anatomical Structure and Anatomical Space]
13
at every level of granularity
14
The FMA is a Structural Anatomy
Plasma membrane =Def. a cell part that surrounds the cytoplasm
15
The Gene Ontology
Pro
– Open Source
– Cross-Species
– Impressive annotation resource
– Impressive policies for maintenance
– Has recognized the need for reform
16
The Gene Ontology
Con
– Poor formal architecture (Mk I.)
– Poor support for automatic reasoning and error-checking
– No cross-ontology relations
– Not (yet) transgranular
17
GO:0019836 hemolysis of red blood cells
=Def. The processes by which an organism effects hemolysis ...
X =Def. the Y of X
This sort of definition is worse than circular
18
Gene Ontology now adopting structured definitions built out of genus and differentiae
Species =Def. Genus + Differentiae
neuron cell differentiation =Def. differentiation by which a cell acquires features of a neuron
19
National Cancer Institute Thesaurus (NCIT)
Pro
– NCIT is open source
– NCIT has broad coverage
– NCIT has some formal structure (OWL-DL)
– NCIT has realized the errors of its ways
Con
– Full of errors (many inherited from UMLS)
– Bad realization of formal structure
20
Goals of NCIT
to make use of current terminology best practices to relate relevant concepts to one another in a formal structure, e.g. to support automatic reasoning;
21
Formal Definitions
of 37,261 nodes, 33,720 remain formally undefined
Thus only a small portion of the NCIT ontology (3,541 nodes, about 9.5%) can be used for purposes of automatic classification and error-checking
22
Verbal Definitions
About half the NCIT terms are assigned verbal definitions for human use
Unfortunately some are assigned more than one
23
Disease Progression
Definition1: Cancer that continues to grow or spread.
Definition2: Increase in the size of a tumor or spread of cancer in the body.
Definition3: The worsening of a disease over time.
24
Cancer
a process (of getting better or worse)
an object (which can grow and spread)
occurrent vs. continuant
25
Disease
Definition1: A disease is any abnormal condition of the body or mind that causes discomfort, dysfunction, or distress to the person affected or those in contact with the person. ...
Definition2: A definite pathologic process with a characteristic set of signs and symptoms. ...
26
Confuses definitions with descriptions
Tuberculosis =Def. A chronic, recurrent infection caused by the bacterium Mycobacterium tuberculosis. Tuberculosis (TB) may affect almost any tissue or organ of the body with the lungs being the most common site of infection. The clinical stages of TB are primary or initial infection, latent or dormant infection, and recrudescent or adult-type TB. Ninety to 95% of primary TB infections may go unrecognized. Histopathologically, tissue lesions consist of granulomas which usually undergo central caseation necrosis. Local symptoms of TB vary according to the part affected; acute symptoms include hectic fever, sweats, and emaciation; serious complications include granulomatous erosion of pulmonary bronchi associated with hemoptysis. If untreated, progressive TB may be associated with a high degree of mortality. This infection is frequently observed in immunocompromised individuals with AIDS or a history of illicit IV drug use.
28
A better definition
Tuberculosis
Definition: A chronic, recurrent infection caused by the bacterium Mycobacterium tuberculosis.
29
Duratec, Lactobutyrin, Stilbene Aldehyde
are classified by the NCIT as Unclassified Drugs and Chemicals
30
NCIT recognizes three disjoint classes of plants
– Vascular Plant
– Non-vascular Plant
– Other Plant
31
and three kinds of cells
Abnormal Cell is a top-level class (thus not subsumed by Cell)
Normal Cell is a subclass of Microanatomy.
Cell is a subclass of Other Anatomic Concept (so that cells themselves are concepts)
32
NCIT as now constituted will block automatic reasoning
Neither Normal Cells nor Abnormal Cells are Cells within the context of the NCIT
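The blocked inference can be illustrated with a short subsumption check. This is a sketch that encodes only the three parent links stated on the slides; the `NCIT_PARENT` table and the `is_subsumed_by` helper are names invented here, and the parents of 'Microanatomy' and 'Other Anatomic Concept' are left unstated because the slides do not give them.

```python
from typing import Dict, Optional

# Parent links as the slides describe the NCIT; None marks a top-level class.
NCIT_PARENT: Dict[str, Optional[str]] = {
    "Abnormal Cell": None,
    "Normal Cell": "Microanatomy",
    "Cell": "Other Anatomic Concept",
}


def is_subsumed_by(term: str, ancestor: str,
                   parent: Dict[str, Optional[str]]) -> bool:
    """Walk single-inheritance parent links upward looking for `ancestor`."""
    current = parent.get(term)
    while current is not None:
        if current == ancestor:
            return True
        current = parent.get(current)  # unknown parents end the walk
    return False
```

With these links, a query for everything annotated with a subtype of Cell would return neither normal nor abnormal cells, because no is_a path from either class reaches Cell.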
33
UMLS Semantic Network
Alexa McCray, “An upper level ontology for the biomedical domain”. Comp Functional Genomics 2003; 4: 80-84.
34
UMLS Semantic Network
Pros
– Broad coverage; no multiple inheritance
Cons
– Incoherent use of ‘conceptual entities’ (e.g. the digestive system as a conceptual part of the organism)
35
UMLS Semantic Network
Edges in the graph represent merely “possible significant relations”:
– Bacterium causes Experimental Model of Disease
– Experimental Model of Disease affects Fungus
– Experimental Model of Disease is_a Pathologic Function
36
a hodgepodge of ‘concepts’
37
location_of
Tissue location_of Mental or Behavioral Dysfunction
Fungus location_of Vitamin
38
Fungus location_of Vitamin
Every instance of fungus is located in some vitamin?
Every instance of fungus is located in every vitamin?
Some instances of fungus are located in some vitamins?
Some instances of vitamin have instances of fungi located in them?
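The candidate readings above can be told apart by evaluating them over instance data. The sketch below is illustrative only: the `readings` function, the instance names and the `located_in` pairs are assumptions made for the example. The third and fourth questions are both purely existential, so they are evaluated as one "some-some" reading here.

```python
from typing import Dict, Set, Tuple


def readings(fungi: Set[str], vitamins: Set[str],
             located_in: Set[Tuple[str, str]]) -> Dict[str, bool]:
    """Evaluate three instance-level readings of one type-level link."""
    return {
        # every fungus instance is located in at least one vitamin instance
        "all-some": all(any((f, v) in located_in for v in vitamins) for f in fungi),
        # every fungus instance is located in every vitamin instance
        "all-all": all((f, v) in located_in for f in fungi for v in vitamins),
        # at least one fungus instance is located in at least one vitamin instance
        "some-some": any((f, v) in located_in for f in fungi for v in vitamins),
    }


# Hypothetical instance data: only f1 is located in v1.
result = readings({"f1", "f2"}, {"v1", "v2"}, {("f1", "v1")})
```

On this data the readings disagree, which is the point of the slide: a type-level edge carries no usable meaning until it is fixed which quantified statement about instances it stands for.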
39
what are the nodes in this graph?
40
UMLS Semantic Network
A is_a B =Def. A is narrower in meaning than B
A disrupts B
A contained_in B
41
UMLS Semantic Network
Drug Delivery Device contains Clinical Drug
Drug Delivery Device narrower_in_meaning_than Manufactured Object
42
General Ontological Overview
43
Good ontologies require:
– Consistent use of terms, supported by logically coherent (non-circular) definitions, in equivalent human-readable and computable formats
– Coherent shared treatment of relations to allow cascading inference both within and between ontologies
44
Three fundamental dichotomies
– continuants vs. occurrents
– dependent vs. independent
– types vs. instances
ONTOLOGIES ARE REPRESENTATIONS OF TYPES
45
ONTOLOGIES ARE REPRESENTATIONS OF TYPES
aka kinds, universals, categories, species, genera, ...
46
Molecules, cell components, organisms are independent continuants which have functions
Functions are dependent continuants which become realized through special sorts of processes we call functionings
Processes (occurrents) include: functionings, side-effects, stochastic processes
47
Continuants (aka endurants)
– have continuous existence in time
– preserve their identity through change
– exist in toto whenever they exist at all
Occurrents (aka processes)
– have temporal parts
– unfold themselves in successive phases
– exist only in their phases
48
You are a continuant
Your life is an occurrent
You are 3-dimensional
Your life is 4-dimensional
49
Dependent entities
require independent continuants as their bearers
There is no grin without a cat
50
Dependent vs. independent continuants
Independent continuants (organisms, cells, molecules, environments)
Dependent continuants (qualities, shapes, roles, propensities, functions)
51
All occurrents are dependent entities
They are dependent on those independent continuants which are their participants (agents, patients, media ...)
52
Top-Level Ontology
[Diagram: Continuant, divided into Independent Continuant and Dependent Continuant; Occurrent (always dependent on one or more independent continuants)]
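The dependence constraints in this top-level division can be sketched as a small class hierarchy. This is an illustrative encoding, not a published implementation: the class names mirror the diagram labels, while the constructor checks and the cat, grin and life examples are assumptions based on the preceding slides.

```python
class Entity:
    def __init__(self, name: str) -> None:
        self.name = name


class Continuant(Entity):
    pass


class IndependentContinuant(Continuant):
    pass


class DependentContinuant(Continuant):
    """Needs an independent continuant as its bearer."""

    def __init__(self, name: str, bearer: IndependentContinuant) -> None:
        if not isinstance(bearer, IndependentContinuant):
            raise TypeError("a dependent continuant needs an independent bearer")
        super().__init__(name)
        self.bearer = bearer


class Occurrent(Entity):
    """Depends on the independent continuants that participate in it."""

    def __init__(self, name: str, participants: list) -> None:
        if not participants or not all(
            isinstance(p, IndependentContinuant) for p in participants
        ):
            raise ValueError("an occurrent needs independent continuants as participants")
        super().__init__(name)
        self.participants = list(participants)


cat = IndependentContinuant("cat")
grin = DependentContinuant("grin", bearer=cat)  # no grin without a cat
life = Occurrent("life of the cat", [cat])
```

Putting the checks in the constructors makes the dependence claims executable: a grin cannot be created without its cat, and an occurrent cannot be created without at least one participant.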
53
= A representation of top-level types
[Diagram: Continuant (Independent Continuant, Dependent Continuant) and Occurrent, with the labels cell component, biological process, molecular function]
54
Top-Level Ontology
[Diagram: Continuant (Independent Continuant, Dependent Continuant) and Occurrent, with the labels Function, Functioning, Side-Effect, Stochastic Process, ...]
56
Top-Level Ontology
[Diagram: Continuant (Independent Continuant, Dependent Continuant) and Occurrent; below them Quality, Function, Spatial Region and Functioning, Side-Effect, Stochastic Process, ...; at the bottom, instances (in space and time)]
57
Smith B, Ceusters W, Kumar A, Rosse C. On Carcinomas and Other Pathological Entities, Comp Functional Genomics, Apr. 2006
58
everything here is an independent continuant
59
Functions, etc.
Some dependent continuants are realizable
– expression of a gene
– application of a therapy
– course of a disease
– execution of an algorithm
– realization of a protocol
60
Functions vs Functionings
the function of your heart = to pump blood in your body
this function is realized in processes of pumping blood
not all functions are realized (consider the function of this sperm ...)
61
The OBO Foundry
62
High quality shared ontologies build communities
General trend on the part of NIH, FDA and other bodies to consolidate ontology-based standards for the communication and processing of biomedical data.
caBIG / NECTAR / BIRN / BRIDG / OBO ...
63
Responses to this trend
Old style: UMLS (Unified Medical Language System) – rooted in faithfulness to the ways language is used by different medical communities
New style: OBO Foundry – pre-emptive regimentation of language, structure and format
64
Two strategies for creating terminologies and database schemas
Ad hoc creation by each clinical or research community
vs.
Pre-established reference ontologies upon which specific local applications can draw
65
We know that high-quality ontologies can help
in creating better mappings between human and model organism phenotypes
S Zhang, O Bodenreider, “Alignment of Multiple Ontologies of Anatomy: Deriving Indirect Mappings from Direct Mappings to a Reference Ontology”, AMIA 2005
66
The solution
The OBO Foundry
http://ontology.buffalo.edu/obofoundry
67
The OBO Foundry
Goals:
– to create the conditions for a step-by-step evolution towards robust gold standard reference ontologies in the biomedical domain
– to introduce some of the features of scientific peer review into biomedical ontology development
68
The OBO Foundry
Goal:
to create controlled vocabularies for use by clinical trial banks, clinical guidelines bodies, scientific journals, ...
69
The OBO Foundry
A subset of OBO ontologies whose developers agree in advance to accept a common set of principles designed to assure
– intelligibility to biologist curators, annotators, users
– formal robustness
– stability
– compatibility
– interoperability
– support for logic-based reasoning
70
The OBO Foundry
– OBO-UBO / Ontology of Biomedical Reality
– OBO Relation Ontology
– Gene Ontology
– Sequence Ontology
– RNA Ontology
– PATO Phenotype Ontology
– FuGO Functional Genomics Investigation Ontology
– Mk. II NCI Thesaurus
– FMA (?)
71
A reference ontology
is analogous to a scientific theory; it seeks to optimize representational adequacy to its subject matter to the maximal degree that is compatible with the constraints of computational usefulness.
72
An application ontology
is comparable to an engineering artifact such as a software tool. It is constructed for a specific practical purpose.
Examples:
– NCIT
– FuGO Functional Genomics Investigation Ontology
73
Reference Ontology vs. Application Ontology
Currently application ontologies are often built afresh for each new task, commonly introducing not only idiosyncrasies of format or logic, but also simplifications or distortions of their subject matter.
To solve this problem, application ontology development should always take place against the background of a formally robust reference ontology framework
74
The OBO Foundry
CRITERIA
http://ontology.buffalo.edu/obofoundry
75
– The ontology is open and available to be used by all.
– The developers of the ontology agree in advance to collaborate with developers of other OBO Foundry ontologies where domains overlap.
– The ontology is in, or can be instantiated in, a common formal language.
– The ontology possesses a unique identifier space within OBO.
– The ontology provider has procedures for identifying distinct successive versions.
76
– The ontology has a clearly specified and clearly delineated content.
– The ontology includes textual definitions for all terms.
– The ontology is well-documented.
– The ontology has a plurality of independent users.
– The ontology uses relations which are unambiguously defined following the pattern of definitions laid down in the OBO Relation Ontology.
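Two of these criteria lend themselves to a mechanical check: the unique identifier space and the textual definition for every term. The sketch below assumes a hypothetical in-memory layout, a dict from term identifier to definition text with identifiers of the form `PREFIX:number`; the `check_criteria` function and the sample terms are invented for illustration and do not reflect any actual OBO Foundry tooling.

```python
from typing import Dict, List


def check_criteria(prefix: str, terms: Dict[str, str]) -> List[str]:
    """Return human-readable problems; an empty list means both checks pass.

    `terms` maps a term identifier to its textual definition.
    """
    problems = []
    for term_id, definition in sorted(terms.items()):
        if not term_id.startswith(prefix + ":"):
            problems.append(f"{term_id}: outside identifier space {prefix}")
        if not definition.strip():
            problems.append(f"{term_id}: missing textual definition")
    return problems


# Hypothetical sample ontology with one undefined term and one foreign identifier.
sample = {
    "EX:0001": "a B which Cs",
    "EX:0002": "",
    "ZZ:0003": "something defined elsewhere",
}
report = check_criteria("EX", sample)
```

The remaining criteria (openness, collaboration, documentation, plurality of users) concern policy and community practice and are not something a script of this kind can verify.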
77
The OBO Foundry
CRITERIA
Further criteria will be added over time in order to bring about a gradual improvement in the quality of the ontologies in the Foundry
78
Advantages of the methodology of shared coherently defined definitions
– promotes quality assurance (better coding)
– guarantees automatic reasoning across ontologies and across data at different granularities
– yields direct connection to temporally indexed instance data
79
Rules for Good Ontologies
80
A basic distinction
type vs. instance
science text vs. clinical document
‘man’ vs. ‘Michael’
81
Instances are not represented in an ontology
For ontology, it is the scientific generalizations that are important
(but instances must still be taken into account)
82
A 515287 DC3300 Dust Collector Fan
B 521683 Gilmer Belt
C 521682 Motor Drive Belt
83
Ontology
Types
Instances
84
Ontology = A Representation of Types
85
Ontology = A Representation of Types
Each node of an ontology consists of:
• preferred term (aka term)
• term identifier (TUI, aka CUI)
• synonyms
• definition, glosses, comments
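The node structure listed above maps naturally onto a small record type. This is a sketch under assumptions: the `Node` dataclass, the `build_index` helper, the identifier `EX:0000001` and the synonym are all invented for illustration; only the definition text is taken from the FMA example quoted earlier in the talk.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    preferred_term: str
    identifier: str
    synonyms: List[str] = field(default_factory=list)
    definition: str = ""
    comments: List[str] = field(default_factory=list)


def build_index(nodes: List[Node]) -> Dict[str, Node]:
    """Map every preferred term and synonym (lower-cased) to its node."""
    index: Dict[str, Node] = {}
    for node in nodes:
        for label in [node.preferred_term, *node.synonyms]:
            index[label.lower()] = node
    return index


# Identifier and synonym are invented for illustration; the definition is
# the FMA-style definition of 'cell' quoted earlier in the talk.
cell = Node(
    preferred_term="cell",
    identifier="EX:0000001",
    synonyms=["cellula"],
    definition="an anatomical structure which consists of cytoplasm "
               "surrounded by a plasma membrane",
)
index = build_index([cell])
```

Indexing synonyms alongside the preferred term is what lets a search for either label land on the same node and hence the same identifier.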
86
Ontology = A Representation of Types
Nodes in an ontology are connected by relations:
primarily: is_a (= is subtype of) and part_of
designed to support search, reasoning and annotation
87
[Figure: a tree of types (substance, organism, animal, mammal, cat, siamese, frog) drawn above a layer of instances]
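The relation between the type tree and the instance layer can be sketched as follows. The shape of the tree is an assumption reconstructed from the figure labels (in particular, placing 'frog' directly under 'animal'), and the `IS_A` table and `types_of` helper are names invented for this example.

```python
from typing import Dict, List, Optional

# is_a links between types; the exact tree shape is assumed from the figure labels.
IS_A: Dict[str, Optional[str]] = {
    "siamese": "cat",
    "cat": "mammal",
    "mammal": "animal",
    "frog": "animal",
    "animal": "organism",
    "organism": "substance",
    "substance": None,
}


def types_of(most_specific_type: str) -> List[str]:
    """All types an instance falls under, given its most specific type."""
    chain: List[str] = []
    current: Optional[str] = most_specific_type
    while current is not None:
        chain.append(current)
        current = IS_A[current]
    return chain
```

An individual Siamese cat is recorded once, against its most specific type, and its membership in cat, mammal, animal and so on follows from the is_a links rather than being stored per instance.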
88
Motivation: To capture reality
Inferences and decisions we make are based upon what we know of reality.
An ontology is a computable representation of biological reality, which is designed to enable a computer to reason over the data we collect about this reality in (some of) the ways that we do.
89
Concepts
Biomedical ontology integration will never be achieved through integration of meanings or concepts
The problem is precisely that different user communities use different concepts
Concepts are in your head and will change as your understanding changes
90
Concepts
Ontologies represent types: not concepts, meanings, ideas ...
Types exist, with their instances, in objective reality
– including types of image, of imaging process, of brain region, of clinical procedure, etc.
91
Rules on types
Don’t confuse types with words
Don’t confuse types with concepts
Don’t confuse types with ways of getting to know types
Don’t confuse types with ways of talking about types
Don’t confuse types with data about types
92
Some other simple rules for high quality ontologies
93
Univocity
Terms should have the same meanings on every occasion of use.
They should refer to the same kinds of entities in reality.
Basic ontological relations such as is_a and part_of should be used in the same way by all ontologies.
94
Positivity
Complements of types are not themselves types.
Hence terms such as
non-mammal
non-membrane
other metalworker in New Zealand
do not designate types in reality.
95
Ontology of types logic of terms
There are no conjunctive and disjunctive types:
anatomic structure, system, or substance
musculoskeletal and connective tissue disorder
rheumatism, excluding the back
96
Objectivity
Which types exist in reality is not a function of our knowledge.
Terms such as
unknown
unclassified
unlocalized
arthropathies not otherwise specified
do not designate types in reality.
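The rules on positivity, conjunctive and disjunctive terms, and objectivity lend themselves to a simple term check. The sketch below is a rough heuristic, not an established tool: the rule names, the regular expressions, and the function `lint_term` are assumptions made for illustration, and a keyword match of this kind will produce false positives that a curator must review.

```python
import re

# Heuristic patterns derived from the example terms on the slides.
# Each entry is (rule name, pattern); both are illustrative assumptions.
RULES = [
    ("positivity", re.compile(r"^(non-|other\b)", re.IGNORECASE)),
    ("conjunctive/disjunctive", re.compile(r"\b(and|or|excluding)\b", re.IGNORECASE)),
    ("objectivity", re.compile(
        r"\b(unknown|unclassified|unlocalized|not otherwise specified)\b",
        re.IGNORECASE)),
]


def lint_term(term: str) -> list[str]:
    """Return the names of the rules that a candidate term appears to violate."""
    return [name for name, pattern in RULES if pattern.search(term)]


for candidate in [
    "non-mammal",
    "other metalworker in New Zealand",
    "anatomic structure, system, or substance",
    "rheumatism, excluding the back",
    "arthropathies not otherwise specified",
    "cell nucleus",
]:
    print(f"{candidate!r}: {lint_term(candidate) or 'no rule triggered'}")
```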
97
Keep Epistemology Separate from Ontology
If you want to say that
We do not know where A’s are located
do not invent a new class of A’s with unknown locations
(A well-constructed ontology should grow linearly; it should not need to delete classes or relations because of increases in knowledge)
98
Syntactic Separateness
Do not confuse sentences with terms
If you want to say
I surmise that this is a case of pneumonia
do not invent a new class of surmised pneumonias
99
Single Inheritance
No kind in a classificatory hierarchy should have more than one is_a parent on the immediate higher level
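The single inheritance rule can be checked mechanically over a list of is_a edges. The following is a minimal sketch; the function name `multiple_inheritance_violations` and the edge representation as (child, parent) pairs are assumptions made for the example. The sample edges use the thing / car / blue thing / blue car labels from the Multiple Inheritance slides.

```python
from collections import defaultdict


def multiple_inheritance_violations(is_a_edges):
    """Map each type with more than one is_a parent to its sorted parents."""
    parents = defaultdict(set)
    for child, parent in is_a_edges:
        parents[child].add(parent)
    return {child: sorted(ps) for child, ps in parents.items() if len(ps) > 1}


# Edges as (child, parent) pairs; the arrangement is an assumed reading
# of the slide's diagram.
edges = [
    ("car", "thing"),
    ("blue thing", "thing"),
    ("blue car", "car"),
    ("blue car", "blue thing"),
]

print(multiple_inheritance_violations(edges))
```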
100
Multiple Inheritance
[Figure: hierarchy with nodes thing, car, blue thing, blue car; both edges are labeled is_a]
101
Multiple Inheritance
is a source of errors
encourages laziness
serves as obstacle to integration with neighboring ontologies
hampers use of Aristotelian methodology for defining terms
102
Multiple Inheritance
[Figure: the same hierarchy with nodes thing, car, blue thing, blue car; the two edges are now labeled is_a1 and is_a2]
103
is_a Overloading
The success of ontology alignment demands that ontological relations (is_a, part_of, ...) have the same meanings in the different ontologies to be aligned.
104
Example: is_a is pressed into service by the GO to express location
is-located-at and similar relations are expressed by creating special compound terms using:
site of …
… within …
… in …
extrinsic to …
yielding associated errors
105
e.g. errors with ‘within’
lytic vacuole within a protein storage vacuole
lytic vacuole within a protein storage vacuole is-a protein storage vacuole
Compare:
embryo within a uterus is-a uterus
106
similar problems with part_of
extrinsic to membrane part_of membrane
107
Compositionality
The meanings of compound terms should be determined
1. by the meanings of component terms
together with
2. the rules governing syntax
108
Why do we need rules/standards for good ontology?
Ontologies must be intelligible both to humans (for annotation and curation) and to machines (for reasoning and error-checking): the lack of rules for classification leads to human error and blocks automatic reasoning and error-checking
Intuitive rules facilitate training of curators and annotators
Common rules allow alignment with other ontologies
109
OBO Relation Ontology
110
First step
Alignment of OBO Foundry ontologies through a common system of formally defined relations in the OBO Relation Ontology
See “Relations in Biomedical Ontologies”, Genome Biology, Apr. 2005
111
Judith Blake:
“The use of bio-ontologies … ensures consistency of data curation, supports extensive data integration, and enables robust exchange of information between heterogeneous informatics systems. .. ontologies … formally define relationships between the concepts.”
112
"Gene Ontology: Tool for the Unification of Biology"
an ontology "comprises a set of well-defined terms with well-defined relationships" (Ashburner et al., 2000, p. 27)
113
is_a (sensu UMLS)
A is_a B =def
‘A’ is narrower in meaning than ‘B’
grows out of the heritage of dictionaries
(which ignore the basic distinction between types and instances)
114
is_a
congenital absent nipple is_a nipple
cancer documentation is_a cancer
disease prevention is_a disease
Nazism is_a social science
115
is_a (sensu logic)
A is_a B =def
For all x, if x instance_of A then x instance_of B
cell division is_a biological process
adult is_a child ???
116
Two kinds of entities
occurrents (processes, events, happenings)
cell division, ovulation, death
continuants (objects, qualities, ...)
cell, ovum, organism, temperature of organism, ...
117
is_a (for occurrents)
A is_a B =def
For all x, if x instance_of A then x instance_of B
cell division is_a biological process
118
is_a (for continuants)
A is_a B =def
For all x, t if x instance_of A at t then x instance_of B at t
abnormal cell is_a cell
adult human is_a human
but not: adult is_a child
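The time-indexed definition of is_a for continuants can be tested against instance data. The sketch below assumes instantiation facts are stored as (instance, type, time) triples; the function name `is_a_holds` and the sample data about a hypothetical individual are assumptions for illustration. It shows why adult is_a child fails once time is taken into account: the same individual is a child at one time and an adult at another, but never both at the same time.

```python
def is_a_holds(instantiations, a, b):
    """Check: for all x, t, if x instance_of a at t then x instance_of b at t.

    `instantiations` is an iterable of (instance, type, time) triples.
    Note that the check is vacuously true when `a` has no recorded instances.
    """
    facts = set(instantiations)
    return all((x, b, t) in facts for (x, typ, t) in facts if typ == a)


# Hypothetical instance data: one individual observed at two times.
facts = {
    ("mary", "child", 1),
    ("mary", "human", 1),
    ("mary", "adult", 2),
    ("mary", "human", 2),
}

print(is_a_holds(facts, "adult", "human"))  # every adult-at-t is a human-at-t
print(is_a_holds(facts, "adult", "child"))  # fails: mary is not a child at t=2
```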
119
Part_of as a relation between types is more problematic than is standardly supposed
heart part_of human being ?
human heart part_of human being ?
human being has_part human testis ?
human testis part_of human being ?
120
two kinds of parthood
1. between instances:
Mary’s heart part_of Mary
this nucleus part_of this cell
2. between types:
human heart part_of human
cell nucleus part_of cell
121
Definition of part_of as a relation between types
A part_of B =Def all instances of A are instance-level parts of some instance of B
ALL–SOME STRUCTURE
122
part_of (for occurrents)
A part_of B =Def
For all x, if x instance_of A then there is some y, y instance_of B and x part_of y
where ‘part_of’ is the instance-level part relation
123
part_of (for continuants)
A part_of B =def.
For all x, t if x instance_of A at t then there is some y, y instance_of B at t and x part_of y
where ‘part_of’ is the instance-level part relation
ALL-SOME STRUCTURE
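The all-some definition of type-level part_of for continuants can be evaluated over instance data. The sketch below makes two assumptions that go beyond the slide: instantiation facts are (instance, type, time) triples, and instance-level parthood is also time-indexed, stored as (part, whole, time) triples. The function name `type_part_of` and the sample individuals are hypothetical.

```python
def type_part_of(instance_of, inst_part_of, a, b):
    """All-some check: every instance of `a` at t is part of some `b` at t.

    instance_of:  set of (instance, type, time) triples
    inst_part_of: set of (part, whole, time) triples (instance-level part_of)
    """
    for (x, typ, t) in instance_of:
        if typ != a:
            continue
        wholes = {y for (p, y, when) in inst_part_of if p == x and when == t}
        if not any((y, b, t) in instance_of for y in wholes):
            return False
    return True


# Hypothetical instance data at a single time point.
instance_of = {
    ("heart1", "human heart", 1),
    ("mary", "human", 1),
    ("nucleus1", "cell nucleus", 1),
    ("cell1", "cell", 1),
    ("amoeba1", "cell", 1),   # a free-living cell, part of no human
}
inst_part_of = {
    ("heart1", "mary", 1),
    ("nucleus1", "cell1", 1),
    ("cell1", "mary", 1),
}

print(type_part_of(instance_of, inst_part_of, "human heart", "human"))
print(type_part_of(instance_of, inst_part_of, "cell", "human"))
```

The second call fails because a single counterexample (the free-living cell) is enough to defeat a universal assertion, which is the point of requiring the all-some form.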
124
How to use the OBO Relation Ontology
Ontologies are representations of types and of the relations between types
The definitions of these relations involve reference to times and instances, but these references are washed out when we get to the assertions (edges) in the ontology
But curators should still be aware of the underlying definitions when formulating such assertions
125
part_of (for occurrents)
A part_of B =Def
For all x, if x instance_of A then there is some y, y instance_of B and x part_of y
where ‘part_of’ is the instance-level part relation
126
A part_of B, B part_of C ...
The all-some structure of such definitions allows cascading of inferences (true path rule)
(i) within ontologies
(ii) between ontologies
(iii) between ontologies and repositories of instance-data
127
Strengthened true path rule
Whichever A you choose, the instance of B of which it is a part will be included in some C, which will include as part also the A with which you began
The same principle applies to the other relations in the OBO-RO:
located_at, transformation_of, derived_from, adjacent_to, etc.
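The cascading of inferences along part_of chains can be sketched as a transitive closure over type-level edges. This is a minimal illustration, assuming that each asserted edge already has the all-some form (which is what licenses the chaining) and that edges are stored as (A, B) pairs meaning A part_of B. The function name `part_of_closure` is hypothetical.

```python
def part_of_closure(edges):
    """Return asserted part_of edges plus every edge inferable by chaining."""
    closure = set(edges)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                # From a part_of b and b part_of d, infer a part_of d.
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


# A part_of B, B part_of C, as on the slide; C part_of D extends the chain.
asserted = {("A", "B"), ("B", "C"), ("C", "D")}
inferred = part_of_closure(asserted) - asserted
print(sorted(inferred))
```

The nested loop is quadratic per pass and is chosen for readability; for a large ontology a graph traversal per node would be the more usual design.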
128
Kinds of relations
Between types:
– is_a, part_of, ...
Between an instance and a type:
– this explosion instance_of the type explosion
Between instances:
– Mary’s heart part_of Mary
129
In every ontology
some terms and some relations are primitive = they cannot be defined (on pain of infinite regress)
Examples of primitive relations:
– identity
– instantiation
– (instance-level) part_of
– (instance-level) continuous_with
130
Fiat and bona fide boundaries
131
Continuity, Attachment, Adjacency
132
everything here is an independent continuant
133
structures vs. formations = bona fide vs. fiat boundaries
134
Modes of Connection
The body is a highly connected entity.
Exceptions: cells floating free in blood.
135
Modes of Connection
Modes of connection:
attached_to (muscle to bone)
synapsed_with (nerve to nerve, nerve to muscle)
continuous_with (= share a fiat boundary)
136
[Figure labels: articular eminence; articular (glenoid) fossa; ANTERIOR]
Attachment, location, containment
137
Containment involves relation to a hole or cavity
1: cavity
2: tunnel, conduit (artery)
3: mouth; a snail’s shell
138
Fiat vs. Bona Fide Boundaries
Fiat boundary Physical boundary
139
Double Hole Structure
Medium (filling the environing hole)
Tenant (occupying the central hole)
Retainer (a boundary of some surrounding structure)
140
head of condyle
neck of condyle
fossa
fiat boundary
THE TEMPOROMANDIBULAR JOINT
141
continuous_with
(a relation between instances which share a fiat boundary)
is always symmetric:
if x continuous_with y, then y continuous_with x
142
continuous_with
(relation between types)
A continuous_with B =Def.
for all x, if x instance_of A then there is some y such that y instance_of B and x continuous_with y
143
continuous_with (as a relation between types) is not always symmetric
Consider lymph node and lymphatic vessel:
Each lymph node is continuous with some lymphatic vessel, but there are lymphatic vessels (e.g. lymphs and lymphatic trunks) which are not continuous with any lymph nodes
144
Adjacent_to as a relation between types is not symmetric
Consider
seminal vesicle adjacent_to urinary bladder
Not: urinary bladder adjacent_to seminal vesicle
145
instance level
this nucleus is adjacent to this cytoplasm
implies:
this cytoplasm is adjacent to this nucleus
type level
nucleus adjacent_to cytoplasm
Not: cytoplasm adjacent_to nucleus
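The contrast between instance-level symmetry and type-level asymmetry can be made concrete with a small example. The sketch below omits time indexing for brevity, which is a simplification of the definitions for continuants. The function names and the sample individuals (one seminal vesicle, two urinary bladders) are hypothetical.

```python
def symmetric_closure(pairs):
    """Make an instance-level relation symmetric by adding each reversed pair."""
    return set(pairs) | {(y, x) for (x, y) in pairs}


def is_symmetric(pairs):
    return all((y, x) in pairs for (x, y) in pairs)


def type_level_holds(instance_of, inst_rel, a, b):
    """All-some check: every instance of `a` is related to some instance of `b`."""
    b_instances = {y for (y, typ) in instance_of if typ == b}
    return all(
        any((x, y) in inst_rel for y in b_instances)
        for (x, typ) in instance_of
        if typ == a
    )


# Hypothetical data: ub2 belongs to a body with no seminal vesicle.
instance_of = {
    ("sv1", "seminal vesicle"),
    ("ub1", "urinary bladder"),
    ("ub2", "urinary bladder"),
}
adjacent = symmetric_closure({("sv1", "ub1")})

print(is_symmetric(adjacent))
print(type_level_holds(instance_of, adjacent, "seminal vesicle", "urinary bladder"))
print(type_level_holds(instance_of, adjacent, "urinary bladder", "seminal vesicle"))
```

The instance-level relation is symmetric by construction, yet the type-level assertion holds in only one direction, because the all-some form quantifies over every instance of the first type.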
146
Applications
Expectations of symmetry e.g. for protein-protein interactions may hold only at the instance level
if A interacts with B, it does not follow that B interacts with A
if A is expressed simultaneously with B, it does not follow that B is expressed simultaneously with A
147
[Figure: transformation_of. The same instance c is shown under type C at time t and under type C1 at time t1, along a time axis. Examples shown: pre-RNA, mature RNA; adult, child]
148
transformation_of
A transformation_of B =Def. Every instance of A was at some earlier time an instance of B
adult transformation_of child
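The definition of transformation_of can also be checked against time-indexed instance data. The sketch assumes (instance, type, time) triples with comparable numeric times; the function name `transformation_of` and the sample individual are hypothetical. It illustrates why the assertion holds in one direction only: every adult was earlier a child, but a child need not have been, or ever become, an adult.

```python
def transformation_of(instance_of, a, b):
    """Check: every instance of `a` at t was an instance of `b` at some t' < t."""
    facts = set(instance_of)
    for (x, typ, t) in facts:
        if typ != a:
            continue
        was_b_earlier = any(
            x2 == x and typ2 == b and t2 < t for (x2, typ2, t2) in facts
        )
        if not was_b_earlier:
            return False
    return True


# Hypothetical data: one individual observed at three times.
facts = {
    ("mary", "child", 1),
    ("mary", "adult", 2),
    ("mary", "adult", 3),
}

print(transformation_of(facts, "adult", "child"))
print(transformation_of(facts, "child", "adult"))
```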
149
[Figure: tumor development. Labels: C, c at t, c at t1; C1]
150
[Figure: derives_from. Labels: C, c at t; C1, c1 at t1; C', c' at t; time; instances. Example: zygote derives_from ovum, sperm]
151
two continuants fuse to form a new continuant
[Figure: fusion. Labels: C, c at t; C1, c1 at t1; C', c' at t]
152
one initial continuant is replaced by two successor continuants
[Figure: fission. Labels: C, c at t; C1, c1 at t1; C2, c1 at t1]
153
one continuant detaches itself from an initial continuant, which itself continues to exist
[Figure: budding. Labels: C, c at t, c at t1; C1, c1 at t]
154
one continuant absorbs a second continuant while itself continuing to exist
[Figure: capture. Labels: C, c at t, c at t1; C', c' at t]
155
A suite of defined relations between types
Foundational:   is_a, part_of
Spatial:        located_in, contained_in, adjacent_to
Temporal:       transformation_of, derives_from, preceded_by
Participation:  has_participant, has_agent
156
To be added to the Relation Ontology
lacks (between an instance and a type, e.g. this fly lacks wings)
dependent_on (between a dependent entity and its carrier or bearer)
quality_of (between a dependent and an independent continuant)
functioning_of (between a process and an independent continuant)
157
Low Hanging Fruit
Ontologies should include only those relational assertions which hold universally (= have the ALL-SOME form)
Often, order will matter here:
We can include
adult transformation_of child
but not
child transforms_into adult
158
The Gene Ontology
159
GO’s three ontologies
molecular functions
cellular components
biological processes
160
When a gene is identified
three types of questions need to be addressed:
1. Where is it located in the cell?
2. What functions does it have on the molecular level?
3. To what biological processes do these functions contribute?
161
Three granularities:
Cellular (for components)
Molecular (for functions)
Organ + organism (for processes)
162
GO has cells
but it does not include terms for molecules or organisms within any of its three ontologies
except e.g. GO:0018995 host =Def. Any organism in which another organism spends part or all of its life cycle
163
Are the relations between functions and processes a matter of granularity?
Molecular activities are the ‘building blocks’ of biological processes ?
But they are not allowed to be represented in GO as parts of biological processes
164
GO’s three ontologies
molecular functions
cellular components
biological processes
165
What does “function” mean?
an entity has a biological function if and only if it is part of an organism and has a disposition to act reliably in such a way as to contribute to the organism’s survival
the function is this disposition
166
Improved version
an entity has a biological function if and only if it is part of an organism and has a disposition to act reliably in such a way as to contribute to the organism’s realization of the canonical life plan for an organism of that type
167
This canonical life plan might include
canonical embryological development
canonical growth
canonical reproduction
canonical aging
canonical death
168
The function of the heart is to pump blood
Not every activity (process) in an organism is the exercise of a function – there are
– malfunctionings
– side-effects (heart beating)
– accidents (external interference)
– background stochastic activity
169
Kidney
170
Nephron
171
Functional Segments
172
Functions
173
Functions
This is a screwdriver
This is a good screwdriver
This is a broken screwdriver
This is a heart
This is a healthy heart
This is an unhealthy heart
174
Functions are associated with certain characteristic process shapes
Screwdriver: rotates and simultaneously moves forward, while transferring torque from hand and arm to screw
Heart: performs a contracting movement inwards and an expanding movement outwards
175
Not functioning at all
leads to death, modulo
internal factors:
plasticity
redundancy (2 kidneys)
criticality of the system involved
external factors:
prosthesis (dialysis machines, oxygen tent)
special environments
assistance from other organisms
176
What clinical medicine is for
to eliminate malfunctioning by fixing broken body parts
(or to prevent the appearance of malfunctioning by intervening e.g. at the molecular level)
177
Hypothesis: there are no ‘bad’ functions
It is not the function of an oncogene to cause cancer
Oncogenes were in every case proto-oncogenes with functions of their own
They become oncogenes because of bad (non-prototypical) environments
178
Is there an exception for molecular functions?
Does this apply only to functions on biological levels of granularity (= levels of granularity coarser than the molecule) ?
If pathology is the deviation from (normal) functioning, does it make sense to talk of a pathological molecule?
179
Is there an exception for molecular functions?
A molecular function is a propensity of a gene product instance to perform actions on the molecular level of granularity.
Hypothesis 1: these actions must be reliably such as to contribute to biological processes.
Hypothesis 2: these actions must be reliably such as to contribute to the organism’s realization of the canonical life plan for an organism of that type.
180
The Gene Ontology
is a canonical ontology – it represents only what is normal in the realm of molecular functioning
181
The GO is a canonical representation
“The Gene Ontology is a computational representation of the ways in which gene products normally function in the biological realm”
Nucl. Acids Res. 2006: 34.
182
The FMA is a canonical representation
It is a computational representation of types and relations between types deduced from the qualitative observations of the normal human body, which have been refined and sanctioned by successive generations of anatomists and presented in textbooks and atlases of structural anatomy.
183
The importance of pathways (successive causality)
Each stage in the history of a disease presupposes the earlier stages
Therefore need to reason across time, tracking the order of events in time, using relations such as derives_from, transformation_of ...
Need pathway ontologies on every level of granularity
184
The importance of granularity (simultaneous causality)
Networks are continuants
At any given time there are networks existing in the organism at different levels of granularity
Changes in one cause simultaneous changes in all the others
(Compare Boyle’s law: a rise in temperature causes a simultaneous increase in pressure)
185
The Granularity Gulf
most existing data-sources are of fixed, single granularity
many (all?) clinical phenomena cross granularities
Therefore need to reason across time, tracking the order of events in time
186
GO’s three ontologies
molecular function
cellular component
biological process
dependent
independent
187
GO’s three ontologies
molecular function
cellular component
organism-level
biological process
cellular process
188
molecular function
molecule
cellular process
cellular component
organism-level
biological process
organism
Normalization of Granular Levels
189
molecule
cellular component
molecular function
cellular function
organism-level biological function
organism
molecular process
cellular process
organism-level biological process
190
molecule
cellular component
molecular function
cellular function
organism-level biological function
organism
molecular process
cellular process
organism-level biological process
functioning functioning functioning
191
molecule
cellular component
molecular function
cellular function
organism-level biological function
organism
molecular process
cellular process
organism-level process
functionings functionings functionings
molecular location
cellular location
organism-level location
192
The GO is a canonical representation
“The Gene Ontology is a computational representation of the ways in which gene products normally function in the biological realm”
Nucl. Acids Res. 2006: 34.
193
molecule
cellular component
molecular function
cellular function
organism-level biological function
organism
molecular process
cellular process
organism-level process
functionings functionings functionings
everything here is typical
194
The Methodology of Annotations
Scientific curators use experimental observations reported in the biomedical literature to link gene products with GO terms in annotations.
The gene annotations taken together yield a slowly growing computer-interpretable map of biological reality.
The process of annotating literature also leads to improvements and extensions of the ontology, which institutes a virtuous cycle of improvement in the quality and reach of both future annotations and the ontology itself.
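As a rough illustration of how annotations taken together form a computer-interpretable map, the sketch below stores each annotation as a link between a gene product type, an ontology term, and the literature source of the observation. The `Annotation` record, the `build_map` helper, and all identifiers are hypothetical placeholders; this is not the GO annotation file format.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class Annotation:
    gene_product: str  # hypothetical gene product type name
    term: str          # hypothetical ontology term label
    source: str        # placeholder for the literature reference

def build_map(annotations):
    """Group ontology terms by the gene product they have been linked to."""
    by_product = defaultdict(set)
    for annotation in annotations:
        by_product[annotation.gene_product].add(annotation.term)
    return dict(by_product)

# Placeholder curation results, one record per reported observation.
annotations = [
    Annotation("geneProductA", "TERM:binding", "ref-1"),
    Annotation("geneProductA", "TERM:catalysis", "ref-2"),
    Annotation("geneProductB", "TERM:binding", "ref-3"),
]
annotation_map = build_map(annotations)
```

Each new record extends `annotation_map`, which is one simple way to picture the map growing slowly as curators work through the literature.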
195
When we annotate the record of an experiment
we use terms representing types to capture what we learn about:
– this experiment (instance), performed here and now, in this laboratory
– the instances experimented upon
These instances are typical = they are representatives of types
– of experiment (described in FuGO)
– of gene product molecules, molecular functions, cellular components, biological processes (described in GO)
196
Experimental records
document a variety of instances (particular real-world examples or cases), ranging from instances of gene products (including individual molecules) to instances of biochemical processes, molecular functions, and cellular locations
197
Experimental records
provide evidence that gene products of given types have molecular functions of given types by documenting occurrences in the real world that involve corresponding instances of functioning.
They document the existence of real-world molecules that have the potential to execute (carry out, realize, perform) the types of molecular functions that are involved in these occurrences.
198
Glossary
Instance: A particular entity in spatio-temporal reality.
Type: A general kind instantiated by an open-ended totality of instances which share certain qualities and propensities in common of the sort that can be documented in scientific literature
199
Glossary
Gene product instance: A molecule that is generated by the expression of a DNA sequence and which plays some significant role in the biology of the organism.
Gene product type: A type of gene product instance.
200
Glossary
Biological process instance (aka “occurrence”): A change or complex of changes on the level of granularity of the cell or organism, mediated by one or more gene products.
Biological process type: A type of biological process instance.
201
Glossary
Cellular component instance: A part of a cell, including cellular structures, macromolecular complexes and spatial locations identified in relation to the cell
Cellular component type: A type of cellular component.
202
Glossary
Molecular function instance: The propensity of a gene product instance to perform actions, such as catalysis or binding, on the molecular level of granularity.
Molecular function type: A type of molecular function instance.
203
Glossary
Molecular function execution instance (aka “functioning”): A process instance on the molecular level of granularity that is the result of the action of a gene product instance.
Molecular function execution type: A type of molecular function execution instance (aka “a type of functioning”)
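The glossary distinctions can be sketched as a small data model, assuming a type is a named kind, a molecular function instance is a propensity borne by a gene product instance, and a functioning is a process instance that realizes such a propensity. All class and variable names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Type:
    name: str  # a general kind, instantiated by many instances

@dataclass
class GeneProductInstance:
    instance_of: Type  # a particular molecule

@dataclass
class MolecularFunctionInstance:
    instance_of: Type
    bearer: GeneProductInstance  # the molecule that has this propensity

@dataclass
class Functioning:
    instance_of: Type
    realizes: MolecularFunctionInstance  # the propensity being executed

def is_realized(function, functionings):
    """True if some recorded functioning executes this function instance."""
    return any(f.realizes is function for f in functionings)

# Hypothetical types and instances.
kinase = Type("hypothetical kinase")
catalysis = Type("hypothetical catalysis function")
catalysis_execution = Type("hypothetical catalysis functioning")

molecule_1 = GeneProductInstance(kinase)
molecule_2 = GeneProductInstance(kinase)
function_1 = MolecularFunctionInstance(catalysis, molecule_1)
function_2 = MolecularFunctionInstance(catalysis, molecule_2)
observed = [Functioning(catalysis_execution, function_1)]
```

In this sketch `function_2` exists although no functioning realizes it, which mirrors the point that a function is a propensity and may never be executed.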
204
Should ‘activity’ be dropped from Molecular Function terms?
Pro:
Functions are never activities (they are propensities)
Many functions are never realized
The current remedy is ugly
The current remedy is not universally acceptable (structural constituent of bone)
Con:
Much renaming work would be needed to advance clarity
205
Should the Molecular Function ontology be renamed?
Pro
Could keep ‘activity’
Functionings are observable, functions are not
The GO is interested precisely in functionings (not in side effects, malfunctionings, accidents, stochastic processes)
The GO is interested in how functionings contribute to biological processes
206
Should the Molecular Function ontology be renamed?
Biological science is marked precisely by the dominance of the functional orientation (cf. classifications of functions in neuroscience)
Conclusion
Keep ‘Molecular Function’, drop ‘activity’; rename terms where necessary; but in such a way as to avoid double counting of both molecular functions and molecular functionings
207
What will be the structure of the OBO Foundry?
208
molecule
cellular component
molecular function
cellular function
organism-level biological function
organism
molecular process
cellular process
organism-level process
functionings functionings functionings
molecular location
cellular locations
organism-level locations
209
cell (types)
molecular function (GO)
species
molecular process
cellular anatomy
anatomy (fly, fish, human...)
cellular physiology
organism-level physiology
ChEBI, Sequence, RNA ...
210
cell (types)
molecular function (GO)
species
molecular process
cellular anatomy
anatomy (fly, fish, human...)
cellular physiology
organism-level physiology
ChEBI, Sequence, RNA ...
normal (functionings)
211
pathophysiology (disease)
pathoanatomy (fly, fish, human ...)
pathological (malfunctionings)
212
cell (types)
molecular function (GO)
species
molecular process
cellular anatomy (GO)
anatomy (fly, fish, human...)
cellular physiology
organism-level physiology
ChEBI, Sequence, RNA ...
pathophysiology (disease)
pathoanatomy (fly, fish, human ...)
213
cell (types)
molecular function (GO)
species
molecular process
cellular anatomy
anatomy (fly, fish, human...)
cellular physiology
organism-level physiology
ChEBI, Sequence, RNA ...
pathophysiology (disease)
pathoanatomy (fly, fish, human ...)
phenotype
214
cell (types)
molecular function (GO)
species
molecular process
cellular anatomy
anatomy (fly, fish, human...)
cellular physiology
organism-level physiology
ChEBI, Sequence, RNA ...
pathophysiology (disease)
pathoanatomy (fly, fish, human ...)
phenotype
investigation (FuGO)
215
End