W3C Life Science Ontology Issues Session on Triples and Ontologies

Upload: philomena-pope

Post on 30-Dec-2015

Page 1: W3C Life Science Ontology Issues Session on Triples and Ontologies

W3C Life Science Ontology Issues

Session on Triples and Ontologies

Why the Tower of Babel Exists

Biomedical science is a bottom-up enterprise
  Efficiency of competitive systems
  Multiple independent discovery
  External enabling technology and knowledge

Tension between dissemination and control
  Fundamental desire to be cited
  Fundamental need to control intellectual property

Implicit citation through nomenclature
  If you are using my name, you are citing my discovery

Name Space Collisions

Molecular biology has an extraordinarily complex vocabulary
  Many terms with highly specific meanings that are used rarely
    Plasmid, pUC13, M13, cosmid, fosmid, YAC, BAC, PAC, …
    All cloning vectors, each with specific properties and uses
  High information content per word

Compression through acronyms => collisions across domains
  PCR
    Polymerase Chain Reaction
    Historically, MeSH indexed PCR as an abbreviation for “premature contraction” in cardiology
    Phosphocreatine in metabolism and physiology

Specific definitions with high information content
  Association
    Generally a rather vague relationship
    In statistical genetics, a precisely defined criterion implying that specific tests for significance have been met
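The PCR collision can be made concrete with a small lookup keyed by domain. This is only a minimal sketch: the three expansions come from the slide, while the `SENSES` table, the domain labels, and the `expand` and `collides` helpers are hypothetical names introduced for illustration.

```python
# Hypothetical sense inventory. The expansions are taken from the slide;
# the domain labels used as keys are illustrative assumptions.
SENSES = {
    "PCR": {
        "molecular biology": "polymerase chain reaction",
        "cardiology": "premature contraction",
        "metabolism and physiology": "phosphocreatine",
    }
}


def expand(acronym, domain=None):
    """Return the expansion for a known domain, or every candidate
    (sorted) when no domain is given."""
    senses = SENSES.get(acronym.upper(), {})
    if domain is None:
        return sorted(senses.values())
    return senses.get(domain)


def collides(acronym):
    """True when the acronym has more than one recorded expansion."""
    return len(SENSES.get(acronym.upper(), {})) > 1


if __name__ == "__main__":
    print(expand("PCR", "cardiology"))
    print(expand("PCR"))
    print(collides("PCR"))
```

Without a domain the lookup can only return the full candidate list, which is the collision the slide describes: the acronym alone does not carry enough information to pick a meaning.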

Biomedical Text is Not “Well Classifiable”

Classifiable domain
  Well-defined, robust classes
  Class definitions ~robust to algorithms and metrics

Poorly classifiable domains
  Class boundaries not clear, class definitions not robust
  Really just saying the best classification is one big class

Biomedical text is a web, not a collection of well-defined, domain-specific corpora
  Is an article about P53 molecular biology, gene expression regulation, or cancer biology?

Probabilistic Nature of Biomedical Knowledge

Bayes rule
  I know what I have observed
  I can only probabilistically rank hypotheses
  Understanding evolves as more data becomes available

Language links to understanding
  As the understanding evolves, the meaning of the language evolves

Ask 3 biologists to define a gene and you will get 5 definitions and 2 dissenting opinions
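The Bayes rule point, that hypotheses can only be ranked and that the ranking shifts as data arrives, can be sketched in a few lines. This is an illustration under stated assumptions, not anything from the slides: the hypothesis names `H1` and `H2`, the priors, and the likelihood values are all made up, and `posterior` and `rank` are hypothetical helper names.

```python
def posterior(priors, likelihoods):
    """Bayes rule over a finite hypothesis set:
    P(H | D) = P(D | H) * P(H) / sum over H' of P(D | H') * P(H')."""
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    evidence = sum(joint.values())
    return {h: p / evidence for h, p in joint.items()}


def rank(beliefs):
    """Hypotheses ordered from most to least probable."""
    return sorted(beliefs, key=beliefs.get, reverse=True)


if __name__ == "__main__":
    # Illustrative numbers only: equal priors over two competing hypotheses.
    beliefs = {"H1": 0.5, "H2": 0.5}

    # First observation is more likely under H1.
    beliefs = posterior(beliefs, {"H1": 0.8, "H2": 0.2})
    print(rank(beliefs), beliefs)

    # A later observation is far more likely under H2.
    beliefs = posterior(beliefs, {"H1": 0.1, "H2": 0.9})
    print(rank(beliefs), beliefs)
```

Each posterior becomes the prior for the next observation, so the ranking after the second update reverses the first one. Neither hypothesis is ever proven or eliminated; it is only re-ranked.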

Questions for Ontologies Session

How to represent probabilistic concepts and meanings with logically precise standards?

How do we associate the appropriate domain-specific ontology(ies) with the text we are analyzing?

How do we create sustainable merges across evolving domain-specific ontologies?