W3C Life Science Ontology Issues
Session on Triples and Ontologies
Why the Tower of Babel Exists
  Biomedical science is a bottom up enterprise
    Efficiency of competitive systems
    Multiple independent discovery
    External enabling technology and knowledge
  Tension between dissemination and control
    Fundamental desire to be cited
    Fundamental need to control intellectual property
  Implicit citation through nomenclature
    If you are using my name, you are citing my discovery
Name Space Collisions
  Molecular biology has an extraordinarily complex vocabulary
    Many terms with highly specific meanings that are used rarely
      Plasmid, pUC13, M13, cosmid, fosmid, YAC, BAC, PAC, …
      All cloning vectors, each with specific properties and uses
    High information content per word
  Compression through acronyms => collisions across domains
    PCR
      Polymerase Chain Reaction
      Historically, MeSH indexed PCR as an abbreviation for “premature contraction” in cardiology
      Phosphocreatine in metabolism and physiology
  Specific definitions with high information content
    Association
      Generally a rather vague relationship
      In statistical genetics, a precisely defined criterion implying that specific tests for significance have been met
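The PCR collision above can be made concrete with a small lookup keyed by both acronym and domain. This is only a minimal sketch: the table holds just the three expansions named on the slide, the function names are hypothetical, and the "molecular biology" label for Polymerase Chain Reaction is an assumption, since the slide does not name that domain.

```python
# Hypothetical lookup table built only from the PCR examples on the slide.
# The "molecular biology" domain label is an assumption, not from the slide.
EXPANSIONS = {
    ("PCR", "molecular biology"): "Polymerase Chain Reaction",
    ("PCR", "cardiology"): "premature contraction",
    ("PCR", "metabolism and physiology"): "Phosphocreatine",
}


def candidates(acronym):
    """Return every known expansion of an acronym, keyed by domain."""
    return {
        domain: text
        for (name, domain), text in EXPANSIONS.items()
        if name == acronym
    }


def expand(acronym, domain):
    """Return the expansion for one domain, or None if the pair is unknown."""
    return EXPANSIONS.get((acronym, domain))


if __name__ == "__main__":
    # Without a domain the acronym is ambiguous: three candidates come back.
    print(candidates("PCR"))
    # With a domain the collision disappears.
    print(expand("PCR", "cardiology"))
```

The point of the sketch is that the acronym alone is not a usable key; some notion of domain (or namespace) has to travel with it.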
Biomedical Text is Not “Well Classifiable”
  Classifiable domain
    Well defined robust classes
    Class definitions ~robust to algorithms and metrics
  Poorly classifiable domains
    Class boundaries not clear, class definitions not robust
    Really just saying the best classification is one big class
  Biomedical text is a web, not a collection of well defined domain specific corpora
    Is an article about P53 molecular biology, gene expression regulation or cancer biology?
Probabilistic Nature of Biomedical Knowledge
  Bayes rule
    I know what I have observed
    I can only probabilistically rank hypotheses
  Understanding evolves as more data becomes available
  Language links to understanding
    As the understanding evolves, the meaning of the language evolves
  Ask 3 biologists to define a gene and you will get 5 definitions and 2 dissenting opinions
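The idea of probabilistically ranking hypotheses from what has been observed can be sketched with Bayes' rule, where the posterior for each hypothesis is proportional to its prior times the likelihood of the observation. The function name, the hypothesis labels, and all of the numbers below are illustrative assumptions, not values from the talk.

```python
def rank_hypotheses(priors, likelihoods):
    """Rank hypotheses by posterior probability using Bayes' rule.

    priors:      dict mapping hypothesis -> P(hypothesis)
    likelihoods: dict mapping hypothesis -> P(observation | hypothesis)
    Returns a list of (hypothesis, posterior) pairs, most probable first.
    """
    unnormalized = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("observation is impossible under every hypothesis")
    posteriors = {h: value / total for h, value in unnormalized.items()}
    return sorted(posteriors.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    priors = {"A": 0.5, "B": 0.3, "C": 0.2}
    likelihoods = {"A": 0.1, "B": 0.5, "C": 0.5}
    for hypothesis, posterior in rank_hypotheses(priors, likelihoods):
        print(hypothesis, round(posterior, 3))
```

Feeding the posteriors back in as the next round's priors is one way to picture understanding evolving as more data becomes available.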
Questions for Ontologies Session
  How to represent probabilistic concepts and meanings with logically precise standards?
  How do we associate the appropriate domain specific ontology(ies) with the text we are analyzing?
  How do we create sustainable merges across evolving domain specific ontologies?