enhancing ontologies through annotations training course olivier bodenreider lister hill national...
Post on 21-Dec-2015
218 views
TRANSCRIPT
![Page 1: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/1.jpg)
Enhancing OntologiesThrough Annotations
Training CourseTraining Course
Olivier BodenreiderOlivier Bodenreider
Lister Hill National CenterLister Hill National Centerfor Biomedical Communicationsfor Biomedical CommunicationsBethesda, Maryland - USABethesda, Maryland - USA
May 22, 2006
Schloß Dagstuhl
Biomedical OntologyBiomedical Ontology
inin
![Page 2: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/2.jpg)
2 Lister Hill National Center for Biomedical CommunicationsLister Hill National Center for Biomedical Communications
OutlineOutline
Dependence relations in MeSHand co-occurrence in MEDLINE
Identifying associative relationsin the Gene Ontology
Linking the Gene Ontologyto other biological ontologies: GO-ChEBI
![Page 3: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/3.jpg)
Using Dependence Relations in MeSHas a Framework for the Analysis
of Disease Information in Medline
![Page 4: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/4.jpg)
4 Lister Hill National Center for Biomedical CommunicationsLister Hill National Center for Biomedical Communications
AcknowledgmentsAcknowledgments
Lowell VizenorLowell VizenorNational Library of National Library of Medicine, USAMedicine, USA
![Page 5: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/5.jpg)
5 Lister Hill National Center for Biomedical CommunicationsLister Hill National Center for Biomedical Communications
Relations among biomedical entities Relations among biomedical entities
Symbolic relationsSymbolic relations Represented in biomedical terminologies/ontologiesRepresented in biomedical terminologies/ontologies Explicit semanticsExplicit semantics
Hierarchical (Hierarchical (isaisa, , part ofpart of)) Associative (Associative (location oflocation of, , causescauses, …), …)
Statistical relationsStatistical relations Represented in textRepresented in text
Among lexical items (entity recognition)Among lexical items (entity recognition) AnnotationsAnnotations
No explicit semanticsNo explicit semantics
![Page 6: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/6.jpg)
6 Lister Hill National Center for Biomedical CommunicationsLister Hill National Center for Biomedical Communications
Example Example Viral meningitisViral meningitis
Viral meningitis
CNS disease
isa
Infectious disease
isa
Meningeslocated in
Viruscaused by
Anitviral agentstreated byHerpesviridaeInfections
T-Lymphocytes
18
9
![Page 7: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/7.jpg)
7 Lister Hill National Center for Biomedical CommunicationsLister Hill National Center for Biomedical Communications
Statistical relationsStatistical relations
Crucial for text mining applicationsCrucial for text mining applications Entity recognitionEntity recognition Frequency of co-occurrenceFrequency of co-occurrence
No semanticsNo semantics Frequency of co-occurrence used as an indicator Frequency of co-occurrence used as an indicator
of the salience of the relationof the salience of the relation
![Page 8: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/8.jpg)
8 Lister Hill National Center for Biomedical CommunicationsLister Hill National Center for Biomedical Communications
An example from MEDLINEAn example from MEDLINE
Hurwitz JL, Korngold R, Doherty PC.Specific and nonspecific T-cell recruitment in viral meningitis: possible implications for autoimmunity. Cell Immunol. 1983 Mar;76(2):397-401.
Specific and nonspecific T-cell invasion into cerebrospinal fluid has been investigated in the nonfatal viral meningoencephalitis induced by intracerebral inoculation of mice with vaccinia virus. At the peak of the inflammatory process on Day 7 approximately 5 to 10% of the Lyt 2+ T cells present are apparently specific for vaccinia virus. Concurrently, in mice primed previously with influenza virus, 0.5 to 1.0% of the appropriate T-cell set located in cerebrospinal fluid is reactive to influenza-infected target cells. This vaccinia virus-induced inflammatory exudate may thus contain as many as 500 influenza-immune memory T cells. These findings are discussed from the aspect that such nonspecific T-cell invasion into the central nervous system during aseptic viral meningitis could result in exposure of potentially brain-reactive T cells to central nervous system components. PMID: 6601524
BrainBrain/immunology/immunology Cytotoxicity, Immunologic Cytotoxicity, Immunologic Exudates and TransudatesExudates and Transudates/cytology/cytology Exudates and TransudatesExudates and Transudates
/immunology/immunology Meningitis, ViralMeningitis, Viral/immunology/immunology** T-LymphocytesT-Lymphocytes/immunology/immunology** Vaccinia virus Vaccinia virus
Animals Animals Humans Humans Mice Mice Research Support, Non-U.S. Gov't Research Support, Non-U.S. Gov't Research Support, U.S. Gov't, Research Support, U.S. Gov't,
P.H.S. P.H.S.
![Page 9: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/9.jpg)
9 Lister Hill National Center for Biomedical CommunicationsLister Hill National Center for Biomedical Communications
Ontological analysisOntological analysis
Formal ontological distinctionFormal ontological distinction Dependence relationsDependence relations
Every instance of a class is related tosome instances of another class
A is ontologically dependent on Bif and only if A exists then B exists
Contingent relationsContingent relations Only some instances of a class are related to
some instances of another class
CNS disease CNS
Viral meningitis Virus
aspirin headache
![Page 10: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/10.jpg)
10 Lister Hill National Center for Biomedical CommunicationsLister Hill National Center for Biomedical Communications
Statistical vs. ontologicalStatistical vs. ontological
Can we use formal ontology to help analyze statistical relations?
What is the relation between ontological and statistical relations?
Hypothesis: Correspondence between
Dependence relations(ontological)
High frequencies of co-occurrence (statistical) Dependence relations ↔ Systematically high
frequencies of co-occurrence
![Page 11: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/11.jpg)
11 Lister Hill National Center for Biomedical CommunicationsLister Hill National Center for Biomedical Communications
More formal-ontological distinctionsMore formal-ontological distinctions
OxygenB-lymphocyteMitral valve
Streptococcus
Continuants Occurrents
continue to existthrough time
unfold through timein successive phases
Dependentcontinuants
Independentcontinuants
require theexistence of any other entity
in order to exist?yes no
Oxygen transportGlucose metabolismEchocardiography
Oxygen transporter
![Page 12: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/12.jpg)
12 Lister Hill National Center for Biomedical CommunicationsLister Hill National Center for Biomedical Communications
Application to diseasesApplication to diseases
Diseases are (mostly) processes, i.e., occurrentsDiseases are (mostly) processes, i.e., occurrents Diseases are dependent entitiesDiseases are dependent entities Diseases depend on independent continuantsDiseases depend on independent continuants
Anatomical structuresAnatomical structures Classification by “location” (body system)Classification by “location” (body system)
Agents (pathogens)Agents (pathogens) Classification by etiologyClassification by etiology
![Page 13: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/13.jpg)
13 Lister Hill National Center for Biomedical CommunicationsLister Hill National Center for Biomedical Communications
Participation relationParticipation relation
Participation relations are dependence relationsParticipation relations are dependence relations Between processes and biomedical continuantsBetween processes and biomedical continuants Passive participation: Passive participation: has_participanthas_participant
Viral meningitis Viral meningitis has_participanthas_participant Meninges Meninges
Active participation: Active participation: has_agenthas_agent Viral meningitis Viral meningitis has_agenthas_agent Virus Virus
Defined at the instance levelDefined at the instance levelbut can be adapted at the class levelbut can be adapted at the class level
![Page 14: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/14.jpg)
14 Lister Hill National Center for Biomedical CommunicationsLister Hill National Center for Biomedical Communications
Statistical relationsStatistical relations
Independent eventsIndependent events P(E1 ∩ E2) = P(E1) . P(E2)
Tests of independenceTests of independence χχ2 2 testtest GG2 2 test (likelihood ratio test)test (likelihood ratio test)
![Page 15: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/15.jpg)
15 Lister Hill National Center for Biomedical CommunicationsLister Hill National Center for Biomedical Communications
ObjectivesObjectives
Analyze dependence relations in MeSH and to compare them to statistical relations obtained from co-occurrence data
Restricted to the relations between disease categories and other categories of biomedical interest
Hypothesis: Co-occurrence relations between diseases and other categories
Highest proportion for the dependent category, systematically across diseases
Smaller proportions for other non-dependent categories
![Page 16: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/16.jpg)
Materials
![Page 17: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/17.jpg)
17 Lister Hill National Center for Biomedical CommunicationsLister Hill National Center for Biomedical Communications
Medical Subject Headings (MeSH)Medical Subject Headings (MeSH)
Controlled vocabulary used to index MEDLINEControlled vocabulary used to index MEDLINE 22,658 descriptors (2004 version)22,658 descriptors (2004 version) 16 tree-like hierarchies16 tree-like hierarchies
AnatomyAnatomy OrganismsOrganisms DiseasesDiseases ……
![Page 18: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/18.jpg)
18 Lister Hill National Center for Biomedical CommunicationsLister Hill National Center for Biomedical Communications
MEDLINEMEDLINE
385,491 citations (year 2004)385,491 citations (year 2004) Indexed with 20,085 distinct MeSH descriptors
Restrictions Starred descriptors only (3.5 / citation, on average)Starred descriptors only (3.5 / citation, on average) Frequency of co-occurrence Frequency of co-occurrence ≥≥ 10 10 Associations between diseases and other categoriesAssociations between diseases and other categories
![Page 19: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/19.jpg)
Methods and Results
![Page 20: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/20.jpg)
![Page 21: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/21.jpg)
21 Lister Hill National Center for Biomedical CommunicationsLister Hill National Center for Biomedical Communications
Identifying dependence relationsIdentifying dependence relations
Manual examination of the 23 top-level disease Manual examination of the 23 top-level disease categories [C tree]categories [C tree] ExceptionsExceptions
Pathological conditions, signs and symptomsPathological conditions, signs and symptoms (C23) (C23)
AdditionsAdditions Mental disordersMental disorders (F03) (F03)
Identify the categories in (active or passive) Identify the categories in (active or passive) participation relation with the processparticipation relation with the process
![Page 22: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/22.jpg)
22 Lister Hill National Center for Biomedical CommunicationsLister Hill National Center for Biomedical Communications
Identifying dependence relations Identifying dependence relations ResultsResults
has_participanthas_participant
![Page 23: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/23.jpg)
23 Lister Hill National Center for Biomedical CommunicationsLister Hill National Center for Biomedical Communications
Identifying dependence relations Identifying dependence relations ResultsResults
has_agenthas_agent
![Page 24: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/24.jpg)
24 Lister Hill National Center for Biomedical CommunicationsLister Hill National Center for Biomedical Communications
Identifying statistical relationsIdentifying statistical relations
Aggregation at the level of top-level categoriesAggregation at the level of top-level categories
Viral meningitis (C02, C10)
Meninges (A08)
(C02, A08)and
(C10, A08)
![Page 25: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/25.jpg)
25 Lister Hill National Center for Biomedical CommunicationsLister Hill National Center for Biomedical Communications
Identifying statistical relationsIdentifying statistical relations
Contingency tableContingency table
Testing independenceTesting independence GG2 2 test (likelihood ratio test)test (likelihood ratio test)
YesYes NoNo
YesYes nnABAB nnAbAb
NoNo nnaBaB nnabab
Indexed withterm B
Indexed withterm A
![Page 26: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/26.jpg)
26 Lister Hill National Center for Biomedical CommunicationsLister Hill National Center for Biomedical Communications
Identifying statistical relations Identifying statistical relations ResultsResults
Quantitative resultsQuantitative results 25,376 pairs of co-occurring descriptors25,376 pairs of co-occurring descriptors All but 68 of these statistically significant (All but 68 of these statistically significant (GG22 test) test) 7,896 pairs with frequency of co-occurrence 7,896 pairs with frequency of co-occurrence ≥≥ 10 10 6,525 between diseases and other categories6,525 between diseases and other categories
![Page 27: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/27.jpg)
27 Lister Hill National Center for Biomedical CommunicationsLister Hill National Center for Biomedical Communications
Identifying statistical relations Identifying statistical relations ResultsResults
23 disease top-level categories
88 o
ther
top-
leve
l cat
egor
ies
45.75
![Page 28: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/28.jpg)
28 Lister Hill National Center for Biomedical CommunicationsLister Hill National Center for Biomedical Communications
Identifying statistical relations Identifying statistical relations ResultsResults
Qualitative results (1)Qualitative results (1) Generally one top-level category of the Generally one top-level category of the AnatomyAnatomy and and
OrganismsOrganisms trees accounting for the highest frequency trees accounting for the highest frequency of co-occurrence for a given diseaseof co-occurrence for a given disease
Cardiovascular Diseases Cardiovascular System
ExceptionsExceptions Neoplasms [C04] Congenital, Hereditary, and Neonatal Diseases and
Abnormalities [C16] Endocrine Diseases [C19] Immunologic Diseases [C20]
![Page 29: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/29.jpg)
29 Lister Hill National Center for Biomedical CommunicationsLister Hill National Center for Biomedical Communications
Identifying statistical relations Identifying statistical relations ResultsResults
Qualitative results (2)Qualitative results (2) Most Anatomy and Organisms categories are
preferentially associated with one disease category Cardiovascular System Cardiovascular Diseases
Categories other than Categories other than Anatomy and Organisms tend not tend not to be associated with to be associated with one particular disease category (contingent rather than dependent relations)
Pathological Conditions, Signs and Symptoms [C23] Amino Acids, Peptides, and Proteins [D12] Diagnosis [E01] Therapeutics [E02] Surgical Procedures, Operative [E04]
![Page 30: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/30.jpg)
Discussion
![Page 31: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/31.jpg)
31 Lister Hill National Center for Biomedical CommunicationsLister Hill National Center for Biomedical Communications
ApplicationsApplications
To semantic miningTo semantic mining Formal ontological analysis of relations provides a Formal ontological analysis of relations provides a
useful framework for elucidating statistical associationsuseful framework for elucidating statistical associations
To terminology creation and maintenanceTo terminology creation and maintenance Most terminologies do not represent trans-ontological Most terminologies do not represent trans-ontological
relations explicitlyrelations explicitly Concepts in dependence relation should not be Concepts in dependence relation should not be
modified independently of each othermodified independently of each other
![Page 32: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/32.jpg)
32 Lister Hill National Center for Biomedical CommunicationsLister Hill National Center for Biomedical Communications
SummarySummary
We have studied statistical associations between We have studied statistical associations between MeSH terms co-occurring in MEDLINE citationsMeSH terms co-occurring in MEDLINE citations
We have shown that the ontological relation of We have shown that the ontological relation of dependence is generally corroborated by a strong, dependence is generally corroborated by a strong, systematic statistical associationsystematic statistical association
These techniquesThese techniques Provide a framework for semantic mining of diseasesProvide a framework for semantic mining of diseases Can help maintain terminologiesCan help maintain terminologies
![Page 33: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/33.jpg)
33 Lister Hill National Center for Biomedical CommunicationsLister Hill National Center for Biomedical Communications
ReferencesReferences
Vizenor L, Bodenreider O. Vizenor L, Bodenreider O. Using dependence relations in MeSH as a Using dependence relations in MeSH as a framework for the analysis of disease information in Medlineframework for the analysis of disease information in Medline. . Proceedings of the Second International Symposium on Semantic Proceedings of the Second International Symposium on Semantic Mining in Biomedicine (SMBM-2006) 2006:76-83.Mining in Biomedicine (SMBM-2006) 2006:76-83.http://mor.nlm.nih.gov/pubs/pdf/2006-smbm-lv.pdfhttp://mor.nlm.nih.gov/pubs/pdf/2006-smbm-lv.pdf
![Page 34: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/34.jpg)
Non-lexical Approaches to IdentifyingAssociative Relations in the Gene Ontology
![Page 35: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/35.jpg)
35 Lister Hill National Center for Biomedical CommunicationsLister Hill National Center for Biomedical Communications
AcknowledgmentsAcknowledgments
Marc AubryMarc AubryUMR 6061 CNRS,UMR 6061 CNRS,Rennes, FranceRennes, France
Anita BurgunAnita BurgunUniversity of University of Rennes, FranceRennes, France
![Page 36: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/36.jpg)
36 Lister Hill National Center for Biomedical CommunicationsLister Hill National Center for Biomedical Communications
Gene OntologyGene Ontology
Annotate gene productsAnnotate gene products CoverageCoverage
Molecular functionsMolecular functions Cellular componentsCellular components Biological processesBiological processes
Explicit relations to other terms within the same Explicit relations to other terms within the same hierarchyhierarchy
No (explicit) relationsNo (explicit) relations To terms across hierarchiesTo terms across hierarchies To concepts from other biological ontologiesTo concepts from other biological ontologies
![Page 37: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/37.jpg)
37 Lister Hill National Center for Biomedical CommunicationsLister Hill National Center for Biomedical Communications
Gene OntologyGene Ontology
Molecularfunctions
Cellularcomponents
Biologicalprocesses
BP: metal ion transportMF: metal ion transporter activity
![Page 38: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/38.jpg)
38 Lister Hill National Center for Biomedical CommunicationsLister Hill National Center for Biomedical Communications
MotivationMotivation
Richer ontologyRicher ontology Beyond hierarchiesBeyond hierarchies
Easier to maintainEasier to maintain Explicit dependence relationsExplicit dependence relations
More consistent annotationsMore consistent annotations Quality assurance Quality assurance Assisted curationAssisted curation
![Page 39: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/39.jpg)
39 Lister Hill National Center for Biomedical CommunicationsLister Hill National Center for Biomedical Communications
Related workRelated work
Ontologizing GOOntologizing GO GONGGONG
Identifying relations among GO terms across hierarchiesIdentifying relations among GO terms across hierarchies Lexical approachLexical approach Non-lexical approachesNon-lexical approaches
Identifying relations between GO terms and OBO termsIdentifying relations between GO terms and OBO terms ChEBIChEBI
Representing relations among GO terms and between GO Representing relations among GO terms and between GO terms and OBO termsterms and OBO terms ObolObol
See also:See also:
[Ogren & al., PSB 2004-2005]
[Bodenreider & al., PSB 2005]
[Wroe & al., PSB 2003]
[Mungall, CFG 2005]
[Burgun & al., SMBM 2005]
[Bada & al., 2004], [Kumar & al., 2004], [Dolan & al., 2005]
![Page 40: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/40.jpg)
40 Lister Hill National Center for Biomedical CommunicationsLister Hill National Center for Biomedical Communications
GO and annotation databasesGO and annotation databases
5 model organisms5 model organisms FlyBaseFlyBase GOA-HumanGOA-Human MGIMGI SGDSGD WormBaseWormBase
Brca1Brca1 GO:0000793GO:0000793 IDAIDA
Brca1Brca1 GO:0003677GO:0003677 IEAIEA
Brca1Brca1 GO:0003684GO:0003684 IDAIDA
Brca1Brca1 GO:0003723GO:0003723 ISSISS
Brca1Brca1 GO:0004553GO:0004553 ISSISS
Brca1Brca1 GO:0005515GO:0005515 ISS, TASISS, TAS
Brca1Brca1 GO:0005622GO:0005622 IEAIEA
Brca1Brca1 GO:0005634GO:0005634 IDA, ISSIDA, ISS
Brca1Brca1 ……270,000 gene-term associations270,000 gene-term associations
![Page 41: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/41.jpg)
41 Lister Hill National Center for Biomedical CommunicationsLister Hill National Center for Biomedical Communications
Three non-lexical approachesThree non-lexical approaches
All based on annotation databasesAll based on annotation databases
Similarity in the vector space modelSimilarity in the vector space model
Statistical analysis of co-occurring GO termsStatistical analysis of co-occurring GO terms
Association rule miningAssociation rule mining
![Page 42: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/42.jpg)
42 Lister Hill National Center for Biomedical CommunicationsLister Hill National Center for Biomedical Communications
Similarity in the vector space modelSimilarity in the vector space model
Annotationdatabase
Annotationdatabase
GO
term
s
Genesg1 g2 … gn
t1
t2
…
tn
GO terms
Gen
es
g1
g2
…
gn
t1 t2 … tn
![Page 43: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/43.jpg)
43 Lister Hill National Center for Biomedical CommunicationsLister Hill National Center for Biomedical Communications
Similarity in the vector space modelSimilarity in the vector space model
GO terms
GO
term
s t1
t2
…
tn
t1 t2 … tn
Similaritymatrix
Similaritymatrix
Sim(ti,tj) = ti · tj
GO
term
s
Genesg1 g2 … gn
t1
t2
…
tn
tj
ti
![Page 44: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/44.jpg)
44 Lister Hill National Center for Biomedical CommunicationsLister Hill National Center for Biomedical Communications
Analysis of co-occurring GO termsAnalysis of co-occurring GO terms
Annotationdatabase
Annotationdatabase
GO terms
Gen
es
g1
g2
…
gn
t1 t2 … tn
g2
t2 t7 t9
… t3 t7 t9
g5
t5
tt22-t-t77 11
tt22-t-t99 11
tt77-t-t99 22
……
tt55 11
tt77 22
tt99 22
……
![Page 45: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/45.jpg)
45 Lister Hill National Center for Biomedical CommunicationsLister Hill National Center for Biomedical Communications
Analysis of co-occurring GO termsAnalysis of co-occurring GO terms
Statistical analysis: test independenceStatistical analysis: test independence Likelihood ratio test (GLikelihood ratio test (G22)) Chi-square test (Pearson’s Chi-square test (Pearson’s χχ22))
Example from GOA (Example from GOA (22,72022,720 annotations) annotations) C0006955 [BP]C0006955 [BP] Freq. = Freq. = 588588 C0008009 [MF]C0008009 [MF] Freq. = Freq. = 5353
presentpresent absentabsent TotalTotal
presentpresent 4646 542542 588588
absentabsent 77 21,58321,583 22,13222,132
totaltotal 5353 22,12522,125 22,72022,720
GO:0008009 immune responseimmune response
GO:0006955
chemokinechemokineactivityactivity
Co-oc. = Co-oc. = 4646
GG22 = 298.7 = 298.7p < 0.000p < 0.000
![Page 46: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/46.jpg)
46 Lister Hill National Center for Biomedical CommunicationsLister Hill National Center for Biomedical Communications
Association rule miningAssociation rule mining
Annotationdatabase
Annotationdatabase
GO terms
Gen
es
g1
g2
…
gn
t1 t2 … tn
g2
t2 t7 t9
transaction
apriori• Rules: t1 => t2
• Confidence: > .9• Support: .05
![Page 47: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/47.jpg)
47 Lister Hill National Center for Biomedical CommunicationsLister Hill National Center for Biomedical Communications
Examples of associationsExamples of associations
Vector space modelVector space model MF: ice binding BP: response to freezing
Co-occurring terms MF: chromatin binding CC: nuclear chromatin
Association rule mining MF: carboxypeptidase A activity BP: peptolysis and peptidolysis
![Page 48: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/48.jpg)
48 Lister Hill National Center for Biomedical CommunicationsLister Hill National Center for Biomedical Communications
Associations identifiedAssociations identified
VSM COC ARM LEX
MF-CC 499 893 362 917
MF-BP 3057 1628 577 2523
CC-BP 760 1047 329 2053
Total 4316 3568 1268 5493
7665 by at least one approach
![Page 49: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/49.jpg)
49 Lister Hill National Center for Biomedical CommunicationsLister Hill National Center for Biomedical Communications
Associations identified Associations identified VSMVSM
VSM COC ARM LEX
MF-CC 499 893 362 917
MF-BP 3057 1628 577 2523
CC-BP 760 1047 329 2053
Total 4316 3568 1268 5493
MF: ice bindingBP: response to freezing
![Page 50: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/50.jpg)
50 Lister Hill National Center for Biomedical CommunicationsLister Hill National Center for Biomedical Communications
Associations identified Associations identified COCCOC
VSM COC ARM LEX
MF-CC 499 893 362 917
MF-BP 3057 1628 577 2523
CC-BP 760 1047 329 2053
Total 4316 3568 1268 5493
MF: chromatin bindingCC: nuclear chromatin
![Page 51: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/51.jpg)
51 Lister Hill National Center for Biomedical CommunicationsLister Hill National Center for Biomedical Communications
Associations identified Associations identified ARMARM
VSM COC ARM LEX
MF-CC 499 893 362 917
MF-BP 3057 1628 577 2523
CC-BP 760 1047 329 2053
Total 4316 3568 1268 5493
MF: carboxypeptidase A activityBP: peptolysis and peptidolysis
![Page 52: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/52.jpg)
52 Lister Hill National Center for Biomedical CommunicationsLister Hill National Center for Biomedical Communications
Limited overlap among approachesLimited overlap among approaches
Lexical vs. non-lexicalLexical vs. non-lexical Among non-lexicalAmong non-lexical
7665 5493
230
VSM COC
ARM
3689 2587
453
305 309
121201
![Page 53: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/53.jpg)
53 Lister Hill National Center for Biomedical CommunicationsLister Hill National Center for Biomedical Communications
ReferencesReferences
Bodenreider O, Aubry M, Burgun A. Bodenreider O, Aubry M, Burgun A. Non-lexical approaches to Non-lexical approaches to identifying associative relations in the Gene Ontologyidentifying associative relations in the Gene Ontology. In: Altman RB, . In: Altman RB, Dunker AK, Hunter L, Jung TA, Klein TE, editors. Pacific Dunker AK, Hunter L, Jung TA, Klein TE, editors. Pacific Symposium on Biocomputing 2005: World Scientific; 2005. p. 91-Symposium on Biocomputing 2005: World Scientific; 2005. p. 91-102.102.http://mor.nlm.nih.gov/pubs/pdf/2005-psb-ob.pdfhttp://mor.nlm.nih.gov/pubs/pdf/2005-psb-ob.pdf
![Page 54: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/54.jpg)
Linking the Gene Ontologyto other biological ontologies
![Page 55: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/55.jpg)
55 Lister Hill National Center for Biomedical CommunicationsLister Hill National Center for Biomedical Communications
AcknowledgmentsAcknowledgments
Anita BurgunAnita BurgunUniversity of University of Rennes, FranceRennes, France
![Page 56: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/56.jpg)
56 Lister Hill National Center for Biomedical CommunicationsLister Hill National Center for Biomedical Communications
Related domainsRelated domains
OrganismsOrganisms Cell typesCell types Physical entitiesPhysical entities
Gross anatomyGross anatomy MoleculesMolecules
FunctionsFunctions Organism functionsOrganism functions Cell functionsCell functions
PathologyPathology
cytosolic ribosome (sensu Eukaryota)
T-cell activation
brain development
transferrin receptor activity
visual perception
T-cell activation
regulation of blood pressure
![Page 57: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/57.jpg)
57 Lister Hill National Center for Biomedical CommunicationsLister Hill National Center for Biomedical Communications
GO and other domainsGO and other domains
[adapted from B. Smith]
Physical Physical entityentity
FunctionFunction ProcessProcess
OrganismOrganismGross Gross anatomyanatomy
Organism Organism functionsfunctions
Organism Organism processesprocesses
CellCellCellular Cellular componentscomponents
Cellular Cellular functionsfunctions
Cellular Cellular processesprocesses
MoleculeMolecule MoleculesMoleculesMolecular Molecular functionsfunctions
Molecular Molecular processesprocesses
![Page 58: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/58.jpg)
58 Lister Hill National Center for Biomedical CommunicationsLister Hill National Center for Biomedical Communications
GO and other domains (revisited)GO and other domains (revisited)
[adapted from B. Smith]
ResolutionResolutionPhysical Physical wholewhole
PhysicalPhysicalpartpart
FunctionFunction ProcessProcess
OrganismOrganismOrganism Organism componentscomponents
Organism Organism functionsfunctions
Organism Organism processesprocesses
CellCellCellular Cellular componentscomponents
Cellular Cellular functionsfunctions
Cellular Cellular processesprocesses
MoleculeMoleculeMolecular Molecular componentscomponents
Molecular Molecular functionsfunctions
Molecular Molecular processesprocesses
![Page 59: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/59.jpg)
59 Lister Hill National Center for Biomedical CommunicationsLister Hill National Center for Biomedical Communications
Biological ontologies (OBO)Biological ontologies (OBO)
![Page 60: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/60.jpg)
60 Lister Hill National Center for Biomedical CommunicationsLister Hill National Center for Biomedical Communications
GO and other domains (revisited)GO and other domains (revisited)
[adapted from B. Smith]
ResolutionResolutionPhysical Physical wholewhole
PhysicalPhysicalpartpart
FunctionFunction ProcessProcess
OrganismOrganismOrganism Organism componentscomponents
Organism Organism functionsfunctions
Organism Organism processesprocesses
CellCellCellular Cellular componentscomponents
Cellular Cellular functionsfunctions
Cellular Cellular processesprocesses
MoleculeMoleculeMolecular Molecular componentscomponents
Molecular Molecular functionsfunctions
Molecular Molecular processesprocesses
![Page 61: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/61.jpg)
61 Lister Hill National Center for Biomedical CommunicationsLister Hill National Center for Biomedical Communications
Integrating biological ontologiesIntegrating biological ontologies
Cellularcomponents
Cellularcomponents
Molecularfunctions
Molecularfunctions
MoleculeMolecule
Biologocalprocesses
Biologocalprocesses
OrganismOrganism
Other OBOontologies
Other OBOontologies
CellCell
![Page 62: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/62.jpg)
Linking GO to ChEBI
![Page 63: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/63.jpg)
63 Lister Hill National Center for Biomedical CommunicationsLister Hill National Center for Biomedical Communications
ChEBIChEBI
Member of the OBO familyMember of the OBO family Ontology ofOntology of
ChChemical emical EEntities of ntities of BBiological iological IInterestnterest AtomAtom MoleculeMolecule IonIon RadicalRadical
10,516 entities10,516 entities 27,097 terms27,097 terms [Dec. 22, 2004]
![Page 64: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/64.jpg)
64 Lister Hill National Center for Biomedical CommunicationsLister Hill National Center for Biomedical Communications
MethodsMethods
Every ChEBI term searched in every GO termEvery ChEBI term searched in every GO term Maximize precisionMaximize precision
Ignored ChEBI terms of 3 characters or lessIgnored ChEBI terms of 3 characters or less Proper substringProper substring
Maximize recallMaximize recall Case insensitive matchesCase insensitive matches Normalized ChEBI namesNormalized ChEBI names
(generated singular forms from plurals)(generated singular forms from plurals)
![Page 65: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/65.jpg)
65 Lister Hill National Center for Biomedical CommunicationsLister Hill National Center for Biomedical Communications
ExamplesExamples
iron [CHEBI:18248]
uronic acid [CHEBI:27252]
carbon [CHEBI:27594]
BP iron ion transport [GO:0006826]
MF iron superoxide dismutase activity [GO:0008382]
CC vanadium-iron nitrogenase complex [GO:0016613]
BP response to carbon dioxide [GO:0010037]
MF carbon-carbon lyase activity [GO:0016830]
BP uronic acid metabolism [GO:0006063]
MF uronic acid transporter activity [GO:0015133]
![Page 66: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/66.jpg)
66 Lister Hill National Center for Biomedical CommunicationsLister Hill National Center for Biomedical Communications
Quantitative resultsQuantitative results
2,700 ChEBI entities 2,700 ChEBI entities (27%) identified in some (27%) identified in some GO termGO term
9,431 GO terms (55%) 9,431 GO terms (55%) include some ChEBI include some ChEBI entity in their namesentity in their names
10,516 entities 17,250 terms
20,497 associations
![Page 67: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/67.jpg)
67 Lister Hill National Center for Biomedical CommunicationsLister Hill National Center for Biomedical Communications
GeneralizationGeneralization
MFCC BP
CHEBI
FMA
Cell types
REX FIX
MousePathology
cellmembraneviscosity
enzymaticreaction
Leydic cell tumor
irondeposition
cerebellar aplasia
![Page 68: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/68.jpg)
Conclusions
![Page 69: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/69.jpg)
69 Lister Hill National Center for Biomedical CommunicationsLister Hill National Center for Biomedical Communications
Conclusions (1)Conclusions (1)
Links across OBO ontologies need to be made Links across OBO ontologies need to be made explicitexplicit Between GO terms across GO hierarchiesBetween GO terms across GO hierarchies Between GO terms and OBO termsBetween GO terms and OBO terms Between terms across OBO ontologiesBetween terms across OBO ontologies
Automatic approachesAutomatic approaches Effective (GO-GO, GO-ChEBI)Effective (GO-GO, GO-ChEBI) At least to bootstrap the processAt least to bootstrap the process Needs to be refinedNeeds to be refined
![Page 70: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/70.jpg)
70 Lister Hill National Center for Biomedical CommunicationsLister Hill National Center for Biomedical Communications
Conclusions (2)Conclusions (2)
Affordable relationsAffordable relations Computer-intensive, not labor-intensiveComputer-intensive, not labor-intensive
Methods must be combinedMethods must be combined Cross-validationCross-validation Redundancy as a surrogate for reliabilityRedundancy as a surrogate for reliability Relations identified specifically by one approachRelations identified specifically by one approach
False positivesFalse positives Specific strength of a particular methodSpecific strength of a particular method
Requires (some) manual curationRequires (some) manual curation Biologists must be involvedBiologists must be involved
![Page 71: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/71.jpg)
71 Lister Hill National Center for Biomedical CommunicationsLister Hill National Center for Biomedical Communications
ReferencesReferences
Burgun A, Bodenreider O. Burgun A, Bodenreider O. An ontology of chemical entities helps An ontology of chemical entities helps identify dependence relations among Gene Ontology termsidentify dependence relations among Gene Ontology terms. . Proceedings of the First International Symposium on Semantic Mining Proceedings of the First International Symposium on Semantic Mining in Biomedicine (SMBM-2005)in Biomedicine (SMBM-2005)Electronic proceedings: CEUR-WS/Vol-148Electronic proceedings: CEUR-WS/Vol-148http://mor.nlm.nih.gov/pubs/pdf/2005-smbm-ab.pdfhttp://mor.nlm.nih.gov/pubs/pdf/2005-smbm-ab.pdf
![Page 72: Enhancing Ontologies Through Annotations Training Course Olivier Bodenreider Lister Hill National Center for Biomedical Communications Bethesda, Maryland](https://reader035.vdocuments.us/reader035/viewer/2022062714/56649d615503460f94a435bc/html5/thumbnails/72.jpg)
MedicalOntologyResearch
Olivier BodenreiderOlivier Bodenreider
Lister Hill National CenterLister Hill National Centerfor Biomedical Communicationsfor Biomedical CommunicationsBethesda, Maryland - USABethesda, Maryland - USA
Contact:Contact:Web:Web:
[email protected]@nlm.nih.govmor.nlm.nih.govmor.nlm.nih.gov