Post on 30-Dec-2015
The Ontology of the Gene Ontology
Barry Smith, http://ifomis.de
Jennifer Williams, http://ontologyworks.com
Steffen Schulze-Kremer, http://ifomis.de
http:// ifomis.de 2
The Prime Directive
As the right of each sentient species to live in accordance with its normal cultural evolution is considered sacred, no Star Fleet personnel may interfere with the healthy development of alien life and culture. Such interference includes the introduction of superior knowledge, strength, or technology to a world whose society is incapable of handling such advantages wisely.
The Bioinformatics Prime Directive
no computer scientist may interfere with the information resources provided by biologists
The Story of GONG
Computer scientists develop
browsers,
query-interfaces,
tools for statistical analysis or for cross-ontology mapping
which take the biological information as something inviolable
IFOMIS: Renegade StarTroop
Institute for Formal Ontology and Medical Information Science
Faculty of Medicine
University of Leipzig
http://ifomis.de
The Gene Statistic
The Gene Ontology
GO: the Gene Ontology
3 large telephone directories of standardized designations for gene functions and products
designed to cover the whole of biology
model for
fungal ontology,
plant ontology,
drosophila ontology,
etc.
Primary aim of GO
not rigorous definition and principled classification
but rather: providing a practically useful framework for keeping track of the biological annotations that are applied to gene products
Thesis: GO can realize its goal more adequately (and avoid many coding errors) by taking ontology (especially the logic of classifications and definitions) seriously
GO: the Gene Ontology
GO divided into 3 separate hierarchies each organized via is_a and part_of
Problems with is_a
A is_a B = every instance of A is an instance of B
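The instance-level reading of is_a stated above can be sketched in a few lines. This is only an illustration under stated assumptions: the class names and instance identifiers are invented toy data, and `instances` and `is_a` are hypothetical names, not part of GO or any GO tool.

```python
from typing import Dict, Set

# Hypothetical toy data: each class name maps to the set of its instances.
instances: Dict[str, Set[str]] = {
    "vacuole": {"v1", "v2", "v3"},
    "storage vacuole": {"v1", "v2"},
    "cell": {"c1"},
}

def is_a(a: str, b: str) -> bool:
    """True when every instance of class a is also an instance of class b."""
    return instances[a] <= instances[b]
```

On this reading the relation is a subset test, so it is directional: `is_a("storage vacuole", "vacuole")` holds in the toy data while the converse does not.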
Problems with is_a
Holliday junction helicase complex is_a
unlocalized
protein storage vacuole is_a
vacuole (sensu Streptophyta)
Problems with part_of
‘part_of’ = ‘can be part of’ (flagellum part_of cell)
‘part_of’ = ‘is sometimes part of’ (replication fork part_of the nucleoplasm)
‘part_of’ = ‘is included as a sublist in’
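The gap between the readings listed above can be made concrete with a small sketch. It assumes invented observations for two abstract classes A and B; `observed_whole`, `always_part_of` and `sometimes_part_of` are hypothetical names used only for illustration.

```python
from typing import Dict, Optional

# Hypothetical observations: each instance of class A is mapped to the
# instance of class B it is part of, or to None when it is part of no B.
observed_whole: Dict[str, Optional[str]] = {"a1": "b1", "a2": "b1", "a3": None}

def always_part_of(obs: Dict[str, Optional[str]]) -> bool:
    """Strict reading: every instance of A is part of some instance of B."""
    return all(whole is not None for whole in obs.values())

def sometimes_part_of(obs: Dict[str, Optional[str]]) -> bool:
    """Weak reading: at least one instance of A is part of some instance of B."""
    return any(whole is not None for whole in obs.values())
```

With the toy data the weak reading holds and the strict one fails, which is why a single 'part_of' label covering both leaves it unclear what may be inferred.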
GO divided into three disjoint term hierarchies
cellular component ontology
molecular function ontology
biological process ontology
flagellum, chromosome, cell
ice nucleation, binding, protein stabilization
glycolysis, death
three separate hierarchies
= no is_a and no part_of relations defined between them
PUZZLE: How are the classes in the three separate hierarchies linked together?
cellular component ontology
molecular function ontology
biological process ontology
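The separateness of the three hierarchies can be stated as a checkable property: no is_a or part_of edge joins terms from different hierarchies. A minimal sketch, assuming a hypothetical edge list and namespace table built from the example terms on the previous slide (the data layout and the function name are illustrative assumptions, not GO's file format):

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str, str]  # (child term, relation, parent term)

# Hypothetical assignment of example terms to the three hierarchies.
namespace: Dict[str, str] = {
    "flagellum": "cellular_component",
    "cell": "cellular_component",
    "binding": "molecular_function",
    "glycolysis": "biological_process",
}

edges: List[Edge] = [("flagellum", "part_of", "cell")]

def cross_hierarchy_edges(edges: List[Edge], namespace: Dict[str, str]) -> List[Edge]:
    """Return every edge whose two terms live in different hierarchies."""
    return [e for e in edges if namespace[e[0]] != namespace[e[2]]]
```

An empty result means the hierarchies are disjoint in the sense of the slide, which is exactly what leaves the puzzle of how functions, processes and components are linked.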
Component
Component is easy to understand:
A component is a 3-dimensional entity which endures through time
Process
Process is easy to understand:
A process is an occurrent entity = an entity which unfolds itself through time in successive temporal parts
What is a function?
Definition of «Function»
UMLS Semantic Network:
Functional Concept =df A concept which is of interest because it pertains to the carrying out of a process or activity.
GO:
Molecular Function =df the action characteristic of a gene product.
How are the 3 ontologies related?
Function = “the action characteristic of a gene product”
Process = “phenomenon marked by changes that lead to a particular result, mediated by one or more gene products”
NO PART-WHOLE RELATIONS BETWEEN FUNCTION AND PROCESS ONTOLOGIES
The True Story about Process and Function
A process is an occurrent entity
A component is a continuant entity
The True Story about Function and Process
A process is an occurrent entity
A component is an independent continuant entity
There are also dependent continuant entities:
qualities, roles, dispositions, powers …
and functions
The function of your heart is: to pump blood
This function endures through time and gets exercised.
This function exists even when it is not being exercised
The exercise of a function is a process
Functions exist even when they are not being expressed
Functions exist even when there is no functioning
Constituent-Process-Function
Processes depend on constituents
Processes realize functions
Constituents have functions
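The three relations above can be sketched as a small data model. This is one possible encoding, not the authors' formalization: the class and field names (`Constituent`, `Function`, `Process`, `bearer`, `realizes`, `participants`) are assumptions chosen for illustration, and the heart example is taken from the earlier slide.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Constituent:
    """Independent continuant: endures through time."""
    name: str

@dataclass
class Function:
    """Dependent continuant: exists only as the function of some bearer."""
    description: str
    bearer: Constituent

@dataclass
class Process:
    """Occurrent: unfolds in time, realizes a function, depends on constituents."""
    label: str
    realizes: Function
    participants: List[Constituent] = field(default_factory=list)

heart = Constituent("heart")
pump = Function("to pump blood", bearer=heart)

# No process has occurred yet, but the function already exists with its bearer.
processes: List[Process] = []
function_exists_while_dormant = pump.bearer is heart and not processes

# The exercise of the function is a separate entity: a process.
processes.append(Process("heartbeat", realizes=pump, participants=[heart]))
```

Keeping `Function` and `Process` as distinct types is the point of the design: a dormant function is representable, and each exercise of it is recorded as its own process.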
Dependent continuants are realized through occurrent processes
the exercise of a function
the performance of a role
the execution of a plan
the application of a therapy
the realization of a disposition
the course of a disease
GO:
“A biological process is accomplished via one or more ordered assemblies of molecular functions.”
But no:
“GO molecular functions are occurrent rather than continuant. The terminology we've used to date is, I agree, confusing but the activities described in the molecular function ontology are events -- they represent the function as it is exercised rather than the potential to exercise that function.”
“The definitions you cite are certainly inconsistent with this at the moment, but this is a temporary situation. … true path violations … do crop up fairly regularly, but are always fixed.”
Confusion of Function and Activity
If function = activity (= functioning)
how can GO deal with dormant/suppressed functions?
How can GO deal with the relation of expression which involves a function and its exercise?
A step towards clarity
In March 2003 (nearly) all nodes in the Molecular Function ontology (except the root) had ‘activity’ added to their names
Function = activity
How does ‘process’ relate to ‘activity’?
GO’s answer
“A biological process is accomplished via one or more ordered assemblies of molecular functions.”
BUT: there are no part-whole relations across ontologies
Result: constant coding errors resulting from lack of clear principles as concerns what the basic notions of ‘function’ and ‘process’ mean
Examples of GO Molecular Functions
anti-coagulant activity (defined as: “a substance that retards or prevents coagulation”)
enzyme activity (defined as: “a substance that catalyzes”)
structural molecule (defined as: “the action of a molecule that contributes to structural integrity”)
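Mismatches of this kind between a term's name and the category of its definition can be flagged mechanically. The sketch below is a deliberately crude lexical heuristic, offered as an assumption about how such a check might look: it only tests whether a name ends in "activity" and whether the definition opens with "a substance" or "the action". The function name `category_mismatch` is hypothetical.

```python
from typing import Dict

# Names and definitions as quoted on the slide above.
terms: Dict[str, str] = {
    "anti-coagulant activity": "a substance that retards or prevents coagulation",
    "enzyme activity": "a substance that catalyzes",
    "structural molecule": "the action of a molecule that contributes to structural integrity",
}

def category_mismatch(name: str, definition: str) -> bool:
    """Flag names and definitions that fall into different categories."""
    names_activity = name.endswith("activity")
    text = definition.lower()
    defines_substance = text.startswith("a substance")
    defines_action = text.startswith("the action")
    # An activity defined as a substance, or a non-activity defined as an action.
    return (names_activity and defines_substance) or (
        not names_activity and defines_action
    )

flagged = [name for name, definition in terms.items() if category_mismatch(name, definition)]
```

All three examples are flagged. A real check would need more than string prefixes, but even this shows that the inconsistency is detectable without expert biological knowledge.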
GO:0005199: structural constituent of cell wall
Definition: The action of a molecule that contributes to the structural integrity of a cell wall.
This definition confuses constituents with actions, which GO includes in its function ontology.
extracellular matrix structural constituent +
puparial glue (sensu Diptera)
structural constituent of bone
structural constituent of chorion (sensu Insecta)
structural constituent of chromatin
structural constituent of cuticle +
structural constituent of cytoskeleton
structural constituent of epidermis +
structural constituent of eye lens
structural constituent of muscle
structural constituent of myelin sheath
structural constituent of nuclear pore
structural constituent of peritrophic membrane (sensu Insecta)
structural constituent of ribosome
structural constituent of tooth enamel
structural constituent of vitelline membrane (sensu Insecta)
Problems caused by lack of intuitive formal understandings of its basic ontological terms
The need for expert knowledge places severe obstacles in the way of using GO as a basis for computer applications
computers do not have access to expert biological knowledge
As GO increases in size and scope
it will “be increasingly difficult to maintain the semantic consistency we desire without software tools that perform consistency checks and controlled updates”.
The addition of each new term will require the curator to understand the entire structure of GO in order to avoid redundancy and to ensure that all appropriate linkages are made with other terms.
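One example of the kind of software consistency check called for here is a test that the is_a links contain no cycle. The following is a minimal sketch under stated assumptions: the hierarchy is given as a hypothetical dictionary from each term to its list of is_a parents, and `find_is_a_cycle` is an illustrative name, not an existing GO tool.

```python
from typing import Dict, List

def find_is_a_cycle(parents: Dict[str, List[str]]) -> List[str]:
    """Return one cycle of terms if the is_a graph contains one, else []."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    state: Dict[str, int] = {term: WHITE for term in parents}
    path: List[str] = []

    def visit(term: str) -> List[str]:
        state[term] = GREY
        path.append(term)
        for parent in parents.get(term, []):
            if state.get(parent, WHITE) == GREY:
                # Parent is already on the current path: a cycle is closed.
                return path[path.index(parent):] + [parent]
            if state.get(parent, WHITE) == WHITE:
                found = visit(parent)
                if found:
                    return found
        path.pop()
        state[term] = BLACK
        return []

    for term in list(parents):
        if state[term] == WHITE:
            found = visit(term)
            if found:
                return found
    return []
```

A curator adding a term could run such a check on the whole graph after each update instead of having to hold the entire structure of GO in mind.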
Benefits of the GO Approach
1) Work on populating GO could start immediately, without its authors needing to solve some of the intricate problems which face ontologies when formalized as logical theories.
2) Populating GO does not require the completion of complex protocols of formally determined steps but can be done intuitively by the expert biologist.
3) There are few formal constraints standing in the way of easy incorporation of existing controlled vocabularies from the biological domain.
Drawbacks
1) It is unclear what kinds of reasoning are permissible on the basis of GO’s hierarchies.
2) The rationale of GO’s subclassifications is unclear.
3) No procedures are offered by which GO can be validated.
4) There are insufficient rules for determining how to recognize whether a given concept is or is not present in GO.
GO DOES NOT COMPUTE
Solution:
Rebuild from scratch before it is too late
MANGO