NIF: A COMPREHENSIVE ONTOLOGY FOR NEUROSCIENCE & PRACTICAL GUIDE FOR DATA-ONTOLOGY INTEGRATION

Maryann E. MARTONE, Fahim IMAM, Anita BANDROWSKI, Stephen LARSON, Giorgio ASCOLI, Gordon SHEPHERD, Jeffrey S. GRETHE, Amarnath GUPTA

Univ. of California, San Diego, CA; George Mason Univ., Fairfax, VA; Yale Univ., New Haven, CT

February 8, 2011

Funded in part by the NIH Neuroscience Blueprint HHSN271200800035C via NIDA.

NEUROSCIENCE INFORMATION FRAMEWORK

NIFSTD Ontologies, neuinfo.org



Page 2:

NIF last year

Page 3:

NIF today

• ~30M data records from 68 databases; the NIF Registry (3,600 software tools, databases, etc.); plus full-text publication search
– Focus of development now is on integration of data with literature
– Better search of data (SKOS?)
• Annotation of data, now automated, will become slightly more manual (we will assert the contents of columns that match parts of ontologies)

Page 4:

NIF: DISCOVER AND UTILIZE WEB-BASED NEUROSCIENCE RESOURCES

A portal for finding and using neuroscience resources

A consistent framework for describing resources

Provides simultaneous search of multiple types of information, organized by category

NIFSTD Ontology, a critical component, enables concept-based search

UCSD, Yale, Cal Tech, George Mason, Harvard MGH

Supported by NIH Blueprint

Page 5:

NIF ‘dips’ into the lexicon for general searches such as ‘cerebellum’ or ‘ontology’, where users can contribute knowledge; data are brought into the lexicon by an application that calls our web services.

Page 6:

NIF today

• Ontology-based search
– Search requires all search terms: synonyms, acronyms, lexical variation
– Added gene: searches; other prefixed searches are coming (toxin:, drug:)
– Application logic: a string that matches multiple ontology terms brings back all of them (e.g., striatum and caudate putamen)
– Collapse duplicate classes by bridge files: same-as relationship (Fahim)
– Heavy use of defined classes (GABAergic neuron, hippocampal neuron, drug of abuse, etc.)
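The application logic above (expand a search string with synonyms, and return every ontology term the string matches) can be sketched roughly as follows. This is a minimal illustration, not NIF's implementation: the term IDs, the synonym lists, and the function names `build_index` and `expand_query` are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical mini-lexicon: term ID -> (preferred label, synonyms).
# IDs and synonym lists are illustrative, not actual NIFSTD content.
LEXICON = {
    "ex:0001": ("Striatum", ["Caudate putamen", "Neostriatum"]),
    "ex:0002": ("Caudoputamen", ["Caudate putamen", "Striatum"]),
    "ex:0003": ("Cerebellum", []),
}


def build_index(lexicon):
    """Map each lower-cased label or synonym to the set of term IDs using it."""
    index = defaultdict(set)
    for term_id, (label, synonyms) in lexicon.items():
        for name in [label, *synonyms]:
            index[name.lower()].add(term_id)
    return index


def expand_query(query, lexicon):
    """Return {term_id: all names to search for} for every matching term."""
    index = build_index(lexicon)
    matches = index.get(query.strip().lower(), set())
    return {
        term_id: sorted({lexicon[term_id][0], *lexicon[term_id][1]})
        for term_id in sorted(matches)
    }


# A string that matches several terms brings back all of them.
expanded = expand_query("striatum", LEXICON)
for term_id, names in expanded.items():
    print(term_id, "->", " OR ".join(names))
```

In this sketch an unmatched string returns an empty dictionary, so a caller could fall back to plain keyword search.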

Page 7:

One problem

[Figure labels: NIF, LAMHDI, Bioportal]

Page 8:

Cow example?

• Description of nose vs. tail: which is more valid?
• Should they point to the same entity?
• Is a mapping file the right place to keep the knowledge that class A is related to class B, or should we assert sameness with MIREOT?

Page 9:

NIF STANDARD ONTOLOGIES (NIFSTD)

• Set of modular ontologies
– Covering neuroscience-relevant terminologies
– Comprehensive: 50,000+ distinct concepts + synonyms
• Expressed in OWL-DL language
• Closely follows OBO community best practices
– As long as they seem practical
• Avoids duplication of efforts
– Standardized to the same upper-level ontologies, e.g., Basic Formal Ontology (BFO), OBO Relations Ontology (OBO-RO), Phenotypic Qualities Ontology (PATO)
– Relies on existing community ontologies, e.g., CHEBI, GO, PRO, OBI, etc.
• Modules cover orthogonal domains, e.g., Brain Regions, Cells, Molecules, Subcellular parts, Diseases, Nervous system functions, etc.

Bill Bug et al.

Page 10:

ABOUT ONTOLOGY

• “Explicit specification of a conceptualization” - Tom Gruber
• Organizing the concepts involved in a domain into a hierarchy, and
• Precisely specifying how the concepts are ‘related’ to each other (i.e., logical axioms)
• Explicit knowledge is asserted, but implicit logical consequences can be inferred
– A powerful feature of an ontology

Page 11:

ONTOLOGY – ASSERTED KNOWLEDGE

Class name: Cerebellum Purkinje cell
Asserted necessary conditions:
1. Is a ‘Neuron’
2. Its soma lies within ‘Purkinje cell layer of cerebellar cortex’
3. It has ‘Projection neuron role’
4. It uses ‘GABA’ as a neurotransmitter
5. It has ‘Spiny dendrite quality’

Class name | Asserted defining (necessary & sufficient) expression
Cerebellum neuron | Is a ‘Neuron’ whose soma lies in any part of the ‘Cerebellum’ or ‘Cerebellar cortex’
Principal neuron | Is a ‘Neuron’ which has ‘Projection neuron role’, i.e., a neuron whose axon projects out of the brain region in which its soma lies
GABAergic neuron | Is a ‘Neuron’ that uses ‘GABA’ as a neurotransmitter
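To make the step from asserted conditions to inferred classification concrete, here is a small sketch that checks the Purkinje cell's asserted conditions against the three defining expressions. It is plain Python standing in for an OWL reasoner, under stated assumptions: the property names (`is_a`, `soma_located_in`, `has_role`, `has_neurotransmitter`, `has_quality`) and the `PART_OF` chain are hypothetical simplifications, not NIFSTD identifiers.

```python
# Asserted necessary conditions for one class (property names are hypothetical).
ASSERTED = {
    "Cerebellum Purkinje cell": {
        ("is_a", "Neuron"),
        ("soma_located_in", "Purkinje cell layer of cerebellar cortex"),
        ("has_role", "Projection neuron role"),
        ("has_neurotransmitter", "GABA"),
        ("has_quality", "Spiny dendrite quality"),
    },
}

# Assumed part-of chain, needed to see that the layer lies within the cerebellum.
PART_OF = {
    "Purkinje cell layer of cerebellar cortex": "Cerebellar cortex",
    "Cerebellar cortex": "Cerebellum",
}


def region_and_ancestors(region):
    """Yield a region followed by everything it is transitively part of."""
    while region is not None:
        yield region
        region = PART_OF.get(region)


def soma_in(facts, targets):
    """True if any asserted soma location is, or is part of, a target region."""
    return any(
        prop == "soma_located_in" and bool(targets & set(region_and_ancestors(value)))
        for prop, value in facts
    )


NEURON = ("is_a", "Neuron")

# Defining (necessary and sufficient) expressions as predicates over the facts.
DEFINED = {
    "Cerebellum neuron": lambda f: NEURON in f
    and soma_in(f, {"Cerebellum", "Cerebellar cortex"}),
    "Principal neuron": lambda f: NEURON in f
    and ("has_role", "Projection neuron role") in f,
    "GABAergic neuron": lambda f: NEURON in f
    and ("has_neurotransmitter", "GABA") in f,
}


def infer_superclasses(class_name):
    """Return the defined classes whose expression the asserted facts satisfy."""
    facts = ASSERTED[class_name]
    return sorted(name for name, test in DEFINED.items() if test(facts))


inferred = infer_superclasses("Cerebellum Purkinje cell")
print(inferred)
```

None of the three superclasses is asserted for the Purkinje cell; each follows from its conditions, which is the inference feature the slides describe.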

Page 12:

NIFSTD CURRENT VERSION

• Key feature: includes a set of useful defined concepts that yield inferred classifications of the asserted concepts

Page 13:

NIFSTD BRIDGE FILES

[Figure: NIFSTD core modules (NIF-Molecule, NIF-Anatomy, NIF-Cell, NIF-Subcellular) and bridge files (NIF-Neuron-BrainRegion-Bridge.owl, NIF-Neuron-NT-Bridge.owl)]

• Cross-module relations among classes are assigned in a separate bridging module.
• Allows people to assert their own restrictions in a different bridge file without worrying about the NIF-specific view of the restrictions on core modules.
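The bridge-file idea can be illustrated with a toy model in which each module is a set of (subject, relation, object) triples. This is only a sketch: the triples, the relation names, and the `MY_BRIDGE` module are invented for illustration, and real bridge files are OWL documents rather than Python sets.

```python
# Core modules declare classes only; no relation crosses a module boundary.
# All triples below are simplified, hypothetical content.
NIF_CELL = {("Cerebellum Purkinje cell", "subClassOf", "Neuron")}
NIF_ANATOMY = {("Purkinje cell layer of cerebellar cortex", "subClassOf", "Brain region")}
NIF_MOLECULE = {("GABA", "subClassOf", "Molecule")}

# Bridge modules hold the cross-module restrictions (NIF's view).
NEURON_BRAINREGION_BRIDGE = {
    ("Cerebellum Purkinje cell", "soma_located_in",
     "Purkinje cell layer of cerebellar cortex"),
}
NEURON_NT_BRIDGE = {("Cerebellum Purkinje cell", "has_neurotransmitter", "GABA")}

# A third party's own bridge, asserting a different restriction on the same cores.
MY_BRIDGE = {("Cerebellum Purkinje cell", "expresses", "GABA")}


def load(*modules):
    """Merge any number of modules into one set of triples."""
    merged = set()
    for module in modules:
        merged |= module
    return merged


def cross_module_relations(triples):
    """Return every triple that is not a plain class declaration."""
    return {t for t in triples if t[1] != "subClassOf"}


core = load(NIF_CELL, NIF_ANATOMY, NIF_MOLECULE)
nif_view = load(core, NEURON_BRAINREGION_BRIDGE, NEURON_NT_BRIDGE)
custom_view = load(core, MY_BRIDGE)

print(len(cross_module_relations(core)))      # core alone carries no restrictions
print(sorted(cross_module_relations(nif_view)))
print(sorted(cross_module_relations(custom_view)))
```

Because the core sets never change, the NIF view and the custom view can coexist over the same modules.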

Page 14:

CONCEPT-BASED SEARCH

• Search Google: GABAergic neuron
• Search NIF: GABAergic neuron
– NIF automatically searches for types of GABAergic neurons

[Figure: Types of GABAergic neurons]
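One way to picture "automatically searches for types of GABAergic neurons" is a walk down the class hierarchy that collects every subclass of the query term. The sketch below assumes a tiny hand-written hierarchy; the edges and the class "Cerebellum GABAergic neuron" are illustrative, and in NIFSTD many of these links would be inferred rather than listed.

```python
from collections import defaultdict

# Hypothetical (child, parent) subclass edges.
SUBCLASS_OF = [
    ("GABAergic neuron", "Neuron"),
    ("Cerebellum GABAergic neuron", "GABAergic neuron"),
    ("Cerebellum Purkinje cell", "Cerebellum GABAergic neuron"),
    ("Glutamatergic neuron", "Neuron"),
]


def descendants(root, edges):
    """Return all direct and indirect subclasses of root."""
    children = defaultdict(list)
    for child, parent in edges:
        children[parent].append(child)
    seen, stack = set(), [root]
    while stack:
        node = stack.pop()
        for child in children[node]:
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def concept_query(term, edges):
    """Expand a search term into the term itself plus all of its subtypes."""
    return [term, *sorted(descendants(term, edges))]


search_terms = concept_query("GABAergic neuron", SUBCLASS_OF)
print(" OR ".join(f'"{t}"' for t in search_terms))
```

A plain keyword search would use only the first item of `search_terms`; the concept-based search uses the whole list.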

Page 15:

NIFSTD AND NEUROLEX WIKI

• Semantic wiki platform
• Provides simple forms for structured knowledge
• Can add concepts, properties
• Generate hierarchies without having to learn complicated ontology tools
• Good teaching tool for principles behind ontologies
• Community can contribute
– Each term gets its own unique ID

Stephen D. Larson et al.

Page 16:

ACCESS TO SHARED ONTOLOGIES

• NIFSTD is available as
– OWL format: http://ontology.neuinfo.org
– RDF and SPARQL endpoint
• Specific contents through web services
– http://ontology.neuinfo.org/ontoquest.html
• Available through NCBO Bioportal
– Repository of biomedical ontologies
– 199 ontologies including NIFSTD
– Provides annotation and mapping services
– http://bioportal.bioontology.org/
• INCF Program on Ontologies for Neural Structure
– Neuronal Registry Task Force: description of neural properties
– Structural Lexicon: description of structures across scales
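For readers unfamiliar with SPARQL endpoints, the sketch below builds a subclass query and the GET URL that the standard SPARQL protocol would use. The slides do not give the endpoint address or any class IRI, so `ENDPOINT` and `CLASS_IRI` are placeholders that must be replaced; nothing is sent over the network here.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Placeholders: the real endpoint URL and class IRI are not given in the slides.
ENDPOINT = "http://example.org/sparql"
CLASS_IRI = "http://example.org/ontology#GABAergic_neuron"


def subclass_query(class_iri):
    """Build a SPARQL query listing direct subclasses of a class, with labels."""
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT ?sub ?label WHERE {\n"
        f"  ?sub rdfs:subClassOf <{class_iri}> .\n"
        "  OPTIONAL { ?sub rdfs:label ?label }\n"
        "}"
    )


def request_url(endpoint, query):
    """Encode the query as the 'query' parameter of a GET request."""
    return endpoint + "?" + urlencode({"query": query})


query = subclass_query(CLASS_IRI)
url = request_url(ENDPOINT, query)
print(query)
print(url)
```

The resulting URL could then be fetched with any HTTP client once a real endpoint is substituted.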

Page 17:

NIFSTD EXTERNAL COMMUNITY SOURCES

Domain | External Source | Import/Adapt | NIFSTD Module
Organism taxonomy | NCBI Taxonomy, GBIF, ITIS, IMSR, Jackson Labs mouse catalog; specifically the taxonomy of model organisms in common use by neuroscientists | Adapt | NIF-Organism
Molecules | IUPHAR ion channels and receptors, Sequence Ontology (SO); pending: NCBI, NCBI Entrez Protein, NCBI RefSeq, NCBI Homologene; NIDA drug lists, ChEBI, and Protein Ontology (PRO) | Adapt IUPHAR; import PRO | NIF-Molecule, NIF-Chemical
Sub-cellular | Sub-cellular Anatomy Ontology (SAO). Extracted cell parts and subcellular structures from SAO-CORE. Soon to be importing GO Cellular Component with mapping | Import | NIF-Subcellular
Cell | CCDB, NeuronDB, NeuroMorpho.org terminologies; pending: OBO Cell Ontology | Adapt | NIF-Cell
Gross Anatomy | NeuroNames extended by including terms from BIRN, SumsDB, BrainMap.org, etc.; multi-scale representation of nervous system macroscopic anatomy | Adapt | NIF-GrossAnatomy
Nervous system function | Sensory, Behavior, Cognition terms from NIF, BIRN, BrainMap.org, MeSH, and UMLS | Adapt | NIF-Function
Nervous system dysfunction | Nervous system disease from MeSH, NINDS terminology; pending: OMIM | Adapt/Import | NIF-Dysfunction
Phenotypic qualities | PATO, imported as part of the OBO Foundry core | Import | NIF-Quality
Investigation: reagents | Overlaps with molecules above, especially RefSeq for mRNA, ChEBI, Sequence Ontology; pending: Protein Ontology | Import | NIF-Investigation
Investigation: instruments, protocols, plans | Based on Ontology for Biomedical Investigations (OBI) to include entities for biomaterial transformations, assays, data collection, data transformations | Adapt | NIF-Investigation
Investigation: resource type | NIF, OBI, NITRC, Biomedical Resource Ontology (BRO) | Adapt | NIF-Resource
Biological Process | Gene Ontology’s (GO) biological process in whole | Import | NIF-BioProcess

Page 18:

NIFSTD AND DOID COLLABORATION

• So far
– Overlaps were detected and mappings were carefully curated
– Included a bridging module that asserts equivalencies between NIF-Dysfunction and DOID
• We could MIREOT DOID classes as well
• Drawback was losing NIF’s annotation properties
• Having the bridging module allowed us to have contents from both ontologies and to keep the mappings as well (did the same with NIF-Subcellular and GO Cellular Component)
• Collaborating on Mental Disorder - Addiction/Substance-related disorder with the DOID group
• Taking a look at Barry Smith’s paper on foundations for a realist ontology of mental disease
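A rough sketch of what a bridging module of equivalences buys: classes declared equivalent can be collapsed into one group for search, while each ontology's own annotations stay attached to its own term. The identifiers and annotations below are made up for illustration, and the union-find grouping is a generic technique, not the method described in the slides.

```python
# Hypothetical equivalence assertions, as a bridging module might hold them.
BRIDGE = [
    ("nif:D1", "doid:X1"),
    ("nif:D2", "doid:X2"),
    ("doid:X2", "mesh:M2"),
]

# Hypothetical per-term annotations kept by each source ontology.
ANNOTATIONS = {
    "nif:D1": {"label": "Disorder one", "nif_curator_note": "reviewed"},
    "doid:X1": {"label": "Disorder 1", "definition": "An example definition."},
}


def collapse_equivalents(equivalences):
    """Group terms connected by equivalence assertions (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in equivalences:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Use the alphabetically smaller ID as the group representative.
            parent[max(root_a, root_b)] = min(root_a, root_b)

    groups = {}
    for term in list(parent):
        groups.setdefault(find(term), set()).add(term)
    return groups


def merged_annotations(group, annotations):
    """Collect annotations from every member without overwriting any of them."""
    return {term: annotations[term] for term in sorted(group) if term in annotations}


groups = collapse_equivalents(BRIDGE)
for representative, members in sorted(groups.items()):
    print(representative, sorted(members))
print(merged_annotations(groups["doid:X1"], ANNOTATIONS))
```

Keeping the equivalences in `BRIDGE` rather than rewriting either term list mirrors the point above: contents from both ontologies survive, and the mappings remain inspectable.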

Page 19:

WORKING TO INCORPORATE COMMUNITY

• NeuroPsyGrid – http://www.neuropsygrid.org
• NDAR Autism Ontology – http://ndar.nih.gov
• Disease Phenotype Ontology – http://openccdb.org/wiki/index.php/Disease_Ontology
• Cognitive Paradigm Ontology (CogPO) – http://wiki.cogpo.org
• Neural ElectroMagnetic Ontologies (NEMO) – http://nemo.nic.uoregon.edu

Page 20:

SUMMARY AND CONCLUSIONS

• The NIF project with NIFSTD is an example of how ontologies can be used to enhance search and data integration across diverse resources
• NIFSTD continues to create an increasingly rich knowledge base for neuroscience, integrating with other life science communities
• NIF encourages resource providers to use community ontologies

Page 21:

Some questions:

• If someone asserts sameness, should that be treated differently by others?
– How would we know? Should there be a tool that would search these assertions?
• Can a lexicon be used as a set of base classes for use in ontology building?
– We took this approach with nervous system cells by adding properties, then asserted hierarchies:
• GABAergic neuron
• Cerebellum neuron
• Intrinsic neuron

Page 22:

Even more questions?

• If a term has no definition, then should it exist in the lexicon?

• Do tests belong in ontologies?

Page 23:

• NIFSTD Ontologies: http://ontology.neuinfo.org
• NeuroLex Wiki: http://neurolex.org
• Neuroscience Information Framework (NIF): http://neuinfo.org