ONTOLOGIES FOR BIG DATA
Asiyah Yu Lin, M.D., M.S., Ph.D.
My profile
<location>USA</location><work>postdoctoral training</work><work:company_type>University</work:company_type><work:has_title>research fellow</work:has_title><bioinformatics>ontology development</bioinformatics><bioinformatics>social network analysis</bioinformatics><bioinformatics>ontology applying data analysis</bioinformatics>
Postdoc training: Ontologies
<location>Japan</location><work>institution</work><work:company_type>non profit organization</work:company_type><work:has_title>bioinformatician</work:has_title><bioinformatics>454 sequence assembly</bioinformatics><bioinformatics>non model organism sequence analysis</bioinformatics>
Bioinformatician: NGS
<location>Japan</location><education:has_degree>Ph.D</education:has_degree><education:has_major>medical informatics</education:has_major><bioinformatics>ontology</bioinformatics><bioinformatics>data integration</bioinformatics><bioinformatics>biological pathway analysis</bioinformatics>
Ph.D. in Medical Informatics
<location>China</location><work>industry</work><work:company_type>start_up IT</work:company_type><work:has_title>content manager</work:has_title><work:has_title>project manager</work:has_title><IT_skill>web site building</IT_skill><IT_skill>relational database</IT_skill>
Content Manager & Project Manager
<location>China</location><education>Medical School </education><education:has_degree>master</education:has_degree><education:has_major>molecular immunology</education:has_major><bioinformatics>sequencing</bioinformatics><bioinformatics>protein 3D simulation</bioinformatics>
Master in Molecular Immunology
<location>China</location><education>Medical School</education><education:has_degree>bachelor</education:has_degree><education:has_major>Pediatrics</education:has_major>
M.D. in Pediatrics
Agenda
Introduction: ontologies, semantic web and big data
Selected projects:
1. Informed Consent Ontology (ICO)
2. miRNA and Aging Ontology (MIAGO)
3. Ontology of Drug Neuropathy Adverse Event (ODNAE)
4. LINCS-BD2K
5. mebdo (Medicare and Census big data project)
SOCR Data Dashboard
Conclusion
What is ontology?
Ontologies are a form of knowledge representation: structural frameworks that organize terms hierarchically and define relations between terms within a domain.
1. A hierarchical vocabulary: class, subclass, instance
2. Defined relations between terms to interlink the whole system
3. Constraints and logical definitions
4. Explicit specification of a conceptualization (Gruber, 1993)
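The class-subclass-instance hierarchy described here can be sketched in a few lines of Python. This is a minimal illustration, not how any of the ontologies in this talk are implemented: the term labels are borrowed from the ODNAE slides, while the exact parent links, the `adverse event` root, and the `case_001` instance are assumptions made for the example.

```python
# Illustrative is_a hierarchy (child -> parent). Labels come from the ODNAE
# slides; the parent links and the root term are assumptions for this sketch.
IS_A = {
    "bupropion-associated neuropathy AE": "drug-associated neuropathy AE",
    "drug-associated neuropathy AE": "neuropathy AE",
    "neuropathy AE": "adverse event",
}

# Hypothetical instance, typed with its most specific class.
INSTANCE_OF = {"case_001": "bupropion-associated neuropathy AE"}


def ancestors(term):
    """Return all superclasses of `term`, nearest first, by following is_a."""
    result = []
    while term in IS_A:
        term = IS_A[term]
        result.append(term)
    return result


def is_instance_of(individual, cls):
    """True if the individual belongs to `cls` directly or through a superclass."""
    direct = INSTANCE_OF[individual]
    return cls == direct or cls in ancestors(direct)
```

Because is_a is transitive, `is_instance_of("case_001", "neuropathy AE")` holds even though the instance was only typed with the most specific class.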
Why ontology?
Knowledge management
• RDF, RDFS, OWL
Natural language processing
• Linguistic ontology: WordNet
E-commerce
Intelligent information integration
Knowledge acquisition and discovery
Database design and integration
Medical decision making agent
Linked Open Data, Semantic Web
Semantic Web Layer Cake
RDF: simple triples, graph-based queries, supports very large amounts of data
Bill -has_address- Location A
OWL: significantly more expressive language, strong axioms, inference capabilities, consistency verification, but can be rather slow
Bill -has_address- Location A
Inverse relation: Location A -is_address_of- Bill
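The difference between storing a triple and inferring its inverse can be sketched in plain Python. This is a toy model under stated assumptions, not an RDF or OWL implementation: the `INVERSE_OF` table stands in for an OWL inverse-property axiom, and `infer_inverses` and `query` are hypothetical helper names.

```python
# Triples are (subject, predicate, object) tuples; names mirror the slide example.
triples = {("Bill", "has_address", "Location A")}

# Hypothetical declaration playing the role of an OWL inverse-property axiom.
INVERSE_OF = {"has_address": "is_address_of", "is_address_of": "has_address"}


def infer_inverses(facts, inverse_of):
    """Return the facts plus every triple implied by an inverse declaration."""
    inferred = set(facts)
    for s, p, o in facts:
        if p in inverse_of:
            inferred.add((o, inverse_of[p], s))
    return inferred


def query(facts, s=None, p=None, o=None):
    """Match triples against a pattern; None acts as a wildcard."""
    return sorted(
        t for t in facts
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


closed = infer_inverses(triples, INVERSE_OF)
```

Querying the asserted triples for facts about `Location A` as subject returns nothing; querying `closed` returns the inferred `is_address_of` triple. That extra inference step is the kind of work that makes OWL reasoning more expensive than plain triple lookup.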
SELECTED PROJECTS
1. Informed Consent Ontology (ICO)
2. miRNA and Aging Ontology (MIAGO)
3. Adverse event analysis: Ontology of Drug Neuropathy Adverse Event (ODNAE)
4. LINCS-BD2K
5. mebdo (Medicare and Census big data project)
SOCR Data Dashboard
Informed Consent Ontology (ICO)
ICBO 2014 poster
The power of reasoning
miRNA and Aging Ontology (MIAGO)
Database (in revision)
ODNAE: Linking knowledge together
[Figure: ODNAE design pattern. (A) general pattern; (B) bupropion example.]
(A) Terms: drug-associated neuropathy AE (ODNAE); neuropathy AE (OAE_0000418); drug administration (OAE_0000011); a drug (DrON, linked to RxNORM, NDFRT); drug product (DrON_00000005); chemical element (ChEBI); drug role in mechanism of action (NDFRT); biological process (GO); human (NCBITaxon_9606); a quality (e.g., age) (PATO)
(B) Terms: bupropion (Aplezin, Wellbutrin, Zyban, Budeprion SR, Buproban, Forfivo XL)-associated neuropathy AE (ODNAE_0000043); neuropathy AE (OAE_0000418); drug administration (OAE_0000011); Bupropion Oral Tablet (DRON_00026665); drug product (DrON_00000005); bupropion (CHEBI_3219); Dopamine Uptake Inhibitors [MoA] (N0000000114); negative regulation of dopamine uptake (GO_0051585); negative regulation of neurotransmitter uptake (GO_0051581); human (NCBITaxon_9606); age (PATO_0000011)
Relations in both panels: is_a, preceded_by, has_specified_input, has_proper_part, has_role, is_realized_in, occurs in, has participant, has_quality
ODNAE results: knowledge base of 215 neuropathy AE drugs
• Related AEs and 20 AE types
• Related chemical compounds
• 139 Mode of Action
[Figure: panels (A) and (B), count charts]
ICBO 2015 VDOS workshop
What’s missing in ODNAE
Only 13 GO biological processes were mapped to some MoA. Holistic analytic methods are needed to understand the mechanism.
We need more…
1. LINCS-BD2K
2. SOCR DASHBOARD
University of Miami Computational LINCS Center LINCS Data Coordinating Center http://lifeKB.org
BD2K LINCS Data Coordination and Integration Center http://lincs-dcic.org/
NIH LINCS Program
Library of Integrated Network-based Cellular Signatures
Drug and Gene Knockdown Followed by Genome-Wide Expression
KO and Mutant Genes and their Disease Phenotypes
Drug and Knockdown Effects on Cell Viability
Transcription Factors and Histone Modifications Profiled by ChIP-Seq
Protein-Protein Interactions and Cell- or Metabolic-Pathways
Gene Expression from Patient Cohorts with Genomics and Clinical Outcome
Data
Drugs and Toxic Chemicals that Cause Adverse Events
Bi-Partite Relationships Between Data Types
Representations: Networks, Bi-partite Graphs, Gene-Set Libraries, Hierarchical Trees
Data types: Drugs, Side Effects, Genes, Diseases, Proteins, Signatures, Patient Tumors, Cancer Cell Lines, Tissues, Mutations, Mouse Phenotypes
Data integration and systems modeling
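As a rough illustration of how a bi-partite relationship between two data types relates to a gene-set library, the sketch below groups edges by their left-hand node and compares two of the resulting sets. The drug and gene names are placeholders, not LINCS data, and `to_gene_set_library` and `jaccard` are hypothetical helpers written for this example.

```python
from collections import defaultdict

# Hypothetical drug-gene edges of a bi-partite graph (placeholder names only).
edges = [
    ("drug_A", "GENE1"),
    ("drug_A", "GENE2"),
    ("drug_B", "GENE2"),
    ("drug_B", "GENE3"),
]


def to_gene_set_library(pairs):
    """Group bi-partite edges by their left node: one gene set per drug."""
    library = defaultdict(set)
    for left, right in pairs:
        library[left].add(right)
    return dict(library)


def jaccard(a, b):
    """Overlap of two sets as |intersection| / |union| (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


library = to_gene_set_library(edges)
overlap = jaccard(library["drug_A"], library["drug_B"])
```

Once each drug has a gene set, pairwise overlaps such as `overlap` can serve as edge weights in a drug-drug network, which is one simple way to move between the representations listed on this slide.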
SOCR Analytics Dashboard
Statistics Online Computational Resource
Provides graphical querying, navigation and exploration of multivariate associations in complex heterogeneous datasets.
Integrates dispersed multi-source data and serves the merged information through human and machine interfaces in a secure, scalable manner.
http://socr.umich.edu/HTML5/Dashboard/
Conclusion:
1. Ontologies are important components for Big Data integration and manipulation.
2. Reusing ontologies will enable seamless integration with other resources.
3. However, ontologies cannot solve all the problems in the biomedical world; they are tools to support science.
4. Formalized ontologies can be used by humans and automated systems as a basis for communication and data exchange (such as RDF data).
5. Ontology-based applications may go beyond reasoning alone and use statistical analyses (enrichment), semantic similarity, network analysis, graph algorithms, clustering, etc.
6. Much more remains to be explored in the big-data era.