Chemical Ontologies
I533 Seminar
21.Feb.2006
Kent Holaday
Overview
• Define Ontology
• Knowledge Representations
• Applications of Ontologies
• Chemical Ontologies
Ontology Defined
Merriam-Webster Online Dictionary
1 : a branch of metaphysics concerned with the nature and relations of being
2 : a particular theory about the nature of being or the kinds of existents
Ontology Defined (cont.)
Google Definitions on the web
• An ontology is a controlled vocabulary that describes objects and the relations between them in a formal way, and has a grammar for using the vocabulary terms to express something meaningful within a specified domain of interest.
Source: members.optusnet.com.au/~webindexing/Webbook2Ed/glossary.htm
• Ontology is the newest label attached to some KOSs. Ontologies are being developed as specific concept models by the Knowledge Management community. They can represent complex relationships between objects, and include the rules and axioms missing from semantic networks. Ontologies that describe knowledge in a specific area are often connected with systems for data mining and knowledge management.
Source: www.und.nodak.edu/dept/library/Departments/abc/SACSEM-SemInGlossary.htm
Knowledge Representations
INFORMAL
• Tagging
• Folksonomies
Examples
• del.icio.us
• flickr
• Yahoo My Web 2.0
FORMAL
• Lists
• Thesauri
• Taxonomies
• Ontologies
http://www.biowisdom.com/ontology/faq_q1.htm
Examples
• IUPAC, MeSH, LCSH, XML schema/DTD
IUPAC Nomenclature
• Compendium of Chemical Terminology
carbon
Element number 6 of the periodic table of elements (electronic ground state 1s² 2s² 2p²). For a description of the various types of carbon as a solid the term carbon should be used only in combination with an additional noun or a clarifying adjective.
See also amorphous carbon, carbon fibres, carbon material, glasslike carbon, graphitic carbon, non-graphitic carbon, pyrolytic carbon.
1995, 67, 479
• Nomenclature of Inorganic Compounds
• Nomenclature of Organic Chemistry
http://www.iupac.org/publications/books/seriestitles/nomenclature.html
MeSH Tree Structures
2006
1. Anatomy [A]
2. Organisms [B]
3. Diseases [C]
4. Chemicals and Drugs [D]
   o Inorganic Chemicals [D01] +
   o Organic Chemicals [D02] +
   o Heterocyclic Compounds [D03] +
   o Polycyclic Compounds [D04] +
   o Macromolecular Substances [D05] +
   o Hormones, Hormone Substitutes, and Hormone Antagonists [D06] +
   o Enzymes and Coenzymes [D08] +
   o Carbohydrates [D09] +
   o Lipids [D10] +
   o Amino Acids, Peptides, and Proteins [D12] +
   o Nucleic Acids, Nucleotides, and Nucleosides [D13] +
   o Complex Mixtures [D20] +
   o Biological Factors [D23] +
   o Biomedical and Dental Materials [D25] +
   o Pharmaceutical Preparations [D26] +
   o Chemical Actions and Uses [D27] +
5. Analytical, Diagnostic and Therapeutic Techniques and Equipment [E]
6. Psychiatry and Psychology [F]
7. Biological Sciences [G]
8. Physical Sciences [H]
9. Anthropology, Education, Sociology and Social Phenomena [I]
10. Technology and Food and Beverages [J]
11. Humanities [K]
12. Information Science [L]
13. Persons [M]
14. Health Care [N]
15. Publication Characteristics [V]
16. Geographic Locations [Z]
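The bracketed codes in the tree can be read as hierarchical keys: the first letter names the top-level category and the first three characters name the subcategory. A minimal Python sketch of that reading follows. The dictionaries hold only a handful of entries copied from the listing, and the function name `path_for` is hypothetical, not part of any MeSH tool.

```python
# A few entries copied from the 2006 MeSH listing; not the full tree.
CATEGORIES = {
    "A": "Anatomy",
    "B": "Organisms",
    "C": "Diseases",
    "D": "Chemicals and Drugs",
}
SUBCATEGORIES = {
    "D01": "Inorganic Chemicals",
    "D02": "Organic Chemicals",
    "D03": "Heterocyclic Compounds",
}


def path_for(code):
    """Return the names from the top-level category down to the given code.

    Assumes codes shaped like "D" or "D03" (hypothetical helper).
    """
    path = [CATEGORIES[code[0]]]
    if len(code) > 1:
        path.append(SUBCATEGORIES[code[:3]])
    return path


print(path_for("D03"))
```

Because the code itself encodes the position in the tree, a prefix match is enough to find everything filed under a broader heading.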
Caffeine
CAS: 58-08-2
C8H10N4O2
Synonyms:
• 3,7-dihydro-1,3,7-trimethyl-1H-purine-2,6-dione
• Methyltheobromin
• guaranine
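One way to see why a controlled identifier helps: every synonym on this slide can be mapped to the single CAS number, and the CAS number itself carries a check digit that can be verified. The sketch below is illustrative only. The `SYNONYMS` table and the function names are assumptions made for this example; the check-digit rule (weight the digits 1, 2, 3, ... from the right, sum, take modulo 10) is the standard CAS Registry Number scheme.

```python
def cas_check_digit_ok(cas):
    """Validate the final check digit of a CAS Registry Number string."""
    digits = cas.replace("-", "")
    body, check = digits[:-1], int(digits[-1])
    # Rightmost body digit has weight 1, the next weight 2, and so on.
    total = sum(int(d) * w for w, d in enumerate(reversed(body), start=1))
    return total % 10 == check


# Hypothetical lookup table built from the names on the slide.
SYNONYMS = {
    "caffeine": "58-08-2",
    "3,7-dihydro-1,3,7-trimethyl-1h-purine-2,6-dione": "58-08-2",
    "methyltheobromin": "58-08-2",
    "guaranine": "58-08-2",
}


def resolve(name):
    """Map a name to its CAS number, ignoring case and outer whitespace."""
    return SYNONYMS.get(name.strip().lower())


print(resolve("Guaranine"), cas_check_digit_ok("58-08-2"))
```

A lookup like this only covers names someone has entered by hand, which is the gap an ontology with formal relationships is meant to narrow.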
Visual vs Linguistic
Source: Krallinger, M. et al. (2005) Text-mining approaches in molecular biology and biomedicine. DDT 10(6) 440
Data Sources
Structured
• Medline
• SwissProt
• ChemID Plus
• Medline Plus
• Chemical Abstracts
• NCBI databases
• Misc. databases
Unstructured
• Text documents
• Journal articles
• Lab notebooks
• Web pages
• Database BLOBs
• Email
Source: Gardner, S. (2005) Ontologies and semantic data integration. DDT 10(14) 1004
Semantic Web
Figure 1: The Semantic Web "layer cake" as presented by Tim Berners-Lee.
Source: Hendler, J. (2001) Agents and the semantic web. http://www.cs.umd.edu/users/hendler/AgentWeb.html
Chemical Ontology
• Describe chemical objects and relationships
• Enable the search across multiple data sources
• Bridge some of the graphical versus linguistic representations
Fragment of Chemical Ontology
grouped_by_chemistry
• molecules > organic molecules > heterocyclic compounds > bridged-ring heterocyclic compounds > morphinans > morphine
• two-ring heterocyclic compounds > isoquinolines > isoquinoline alkaloids > morphinans > morphine
[Figure: chemical structures of morphine and morphinan, linked by an IsA relationship]
Source: Ennis, M. (2004) ChEBI A Dictionary of Chemical Entities with an Associated Ontology. SOFG-2, Philadelphia, October 23-26 2004
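The two classification paths for morphine can be treated as a small directed graph of IsA links, which is what lets a search for a broad class also return its members. The following Python sketch assumes each term in a path IsA the term listed before it; that reading of the slide, the `IS_A` dictionary, and the function names are all assumptions made for illustration, not ChEBI's implementation.

```python
# Child -> parents, read off the two paths on the slide (assumed direction).
IS_A = {
    "morphine": ["morphinans"],
    "morphinans": ["bridged-ring heterocyclic compounds", "isoquinoline alkaloids"],
    "bridged-ring heterocyclic compounds": ["heterocyclic compounds"],
    "heterocyclic compounds": ["organic molecules"],
    "organic molecules": ["molecules"],
    "isoquinoline alkaloids": ["isoquinolines"],
    "isoquinolines": ["two-ring heterocyclic compounds"],
}


def ancestors(term):
    """Collect every class reachable from term by following IsA links."""
    seen = set()
    stack = [term]
    while stack:
        for parent in IS_A.get(stack.pop(), []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def is_a(term, cls):
    """True if term is a direct or inherited member of cls."""
    return cls in ancestors(term)


print(is_a("morphine", "molecules"), is_a("morphine", "isoquinolines"))
```

Note that morphinans has two parents here, so the structure is a graph with multiple inheritance rather than a strict tree.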
ChEBI: What is it?
• Chemical Entities of Biological Interest – an EBI database/dictionary of ‘biochemical compounds’ and other chemical entities of biochemical interest with an associated ontology
• ChEBI’s goal is to provide standard terminology of (bio)chemical compounds that should finally be used in biological databases
Source: Ennis, M. (2004) ChEBI A Dictionary of Chemical Entities with an Associated Ontology. SOFG-2, Philadelphia, October 23-26 2004
Relationships in ChEBI ontology
To be implemented…
• IsPartOf - group to molecule; group to group; group to class
• IsEnantiomerOf - molecule to molecule; cycles allowed
• IsTautomerOf - molecule to molecule; cycles allowed
• IsConjugateBaseOf/IsConjugateAcidOf - molecule to molecule (e.g. anion to acid)
• IsParentHydrideOf - molecule to molecule (later?)
Current
• IsA : inherited from Chemical Ontology - class to class; instance to class
Source: Ennis, M. (2004) ChEBI A Dictionary of Chemical Entities with an Associated Ontology. SOFG-2, Philadelphia, October 23-26 2004
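The relationship types listed on this slide differ in shape: IsEnantiomerOf and IsTautomerOf hold in both directions (hence "cycles allowed"), while IsConjugateBaseOf and IsConjugateAcidOf are inverses of each other. A rough Python sketch of a triple store that respects those properties is shown below. The `Ontology` class, its methods, and the example compounds (alanine, acetic acid) are illustrative assumptions and do not come from the slides or from ChEBI's software.

```python
# Relations that hold in both directions.
SYMMETRIC = {"is_enantiomer_of", "is_tautomer_of"}
# Relations whose reverse direction has a different name.
INVERSE = {
    "is_conjugate_base_of": "is_conjugate_acid_of",
    "is_conjugate_acid_of": "is_conjugate_base_of",
}


class Ontology:
    """A tiny in-memory store of (subject, relation, object) triples."""

    def __init__(self):
        self.triples = set()

    def add(self, subject, relation, obj):
        self.triples.add((subject, relation, obj))
        if relation in SYMMETRIC:
            self.triples.add((obj, relation, subject))
        elif relation in INVERSE:
            self.triples.add((obj, INVERSE[relation], subject))

    def related(self, subject, relation):
        return {o for s, r, o in self.triples if s == subject and r == relation}


onto = Ontology()
onto.add("L-alanine", "is_a", "L-amino acid")          # illustrative entity
onto.add("L-alanine", "is_enantiomer_of", "D-alanine")
onto.add("acetate", "is_conjugate_base_of", "acetic acid")

print(onto.related("D-alanine", "is_enantiomer_of"))
print(onto.related("acetic acid", "is_conjugate_acid_of"))
```

Storing the reverse triple at insert time keeps queries simple; the alternative is to store each fact once and expand symmetric and inverse relations at query time.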
[Figure: amino acid structures in neutral and charged forms, grouped under the classes L-Amino acid, D-Amino acid and Amino acid. Fragment labels: CO2H, OH, =O, NH2, CO2¯. Relationship legend: is_a, is_part_of, is_enantiomer_of, is_conjugate_base_of, is_tautomer_of]
Source: Ennis, M. (2004) ChEBI A Dictionary of Chemical Entities with an Associated Ontology. SOFG-2, Philadelphia, October 23-26 2004
Source: http://www.cse.buffalo.edu/~rapaport/663/F03/ontology.html