Bio-ontology: A case study
Seminar 3
Presented by: Anwesha Bhattacharya
3rd Semester, 2013-2015
MSLIS, DRTC
Seminar coordinator: Dr. Biswanath Dutta
CONTENTS:
• Ontology
• Goals of Ontology
• Application of ontologies
• Bioinformatics
• Importance of Bioinformatics
• Need of ontology in bioinformatics
• Bioinformatics taxonomy
• Library of bio-ontologies
• Relations used in Cancer ontologies
• Types of relations for disease ontology
• Limitations
• Conclusion & future prospects
20/11/14
Elements of ontology
An ontology is most often conceptualized as comprising three main elements:
(1) a set of knowledge objects;
(2) a set of relations that form associations (relationships) between the knowledge objects;
(3) a set of axioms that provides rules and constraints for the relationships (e.g. if A is next to B, then B is next to A).
Source: A Framework for Understanding and Classifying Ontology Applications, Mike Uschold & Robert Jasper
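The three elements can be illustrated with a minimal Python sketch. The class name `TinyOntology`, the objects `A` and `B`, and the relation name `next_to` are hypothetical choices made for illustration; only the symmetry axiom (if A is next to B, then B is next to A) comes from the slide.

```python
from dataclasses import dataclass, field


@dataclass
class TinyOntology:
    """Hypothetical toy structure: objects, relations and one kind of axiom."""
    objects: set = field(default_factory=set)    # (1) knowledge objects
    relations: set = field(default_factory=set)  # (2) (subject, relation, object) triples
    symmetric: set = field(default_factory=set)  # (3) relation names declared symmetric

    def add(self, subject, relation, obj):
        """Record a relationship and register both objects."""
        self.objects.update({subject, obj})
        self.relations.add((subject, relation, obj))

    def holds(self, subject, relation, obj):
        """Check a relationship, applying the symmetry axiom where declared."""
        if (subject, relation, obj) in self.relations:
            return True
        return relation in self.symmetric and (obj, relation, subject) in self.relations


onto = TinyOntology(symmetric={"next_to"})
onto.add("A", "next_to", "B")
print(onto.holds("B", "next_to", "A"))  # True, inferred from the symmetry axiom
```

Real ontology languages express such axioms declaratively; the sketch only shows how an axiom constrains what can be concluded from the stored relations.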
Applications of Ontology
General applications
• Communication
– Between people (may be informal)
– Between agents (formal ontologies)
• Inter-operability
• Representing and storing data (e.g., DB schema)
• To analyze domain knowledge
• Knowledge sharing within and between domains
• To make domain assumptions explicit
• To share common understanding of the structure of information among people or software agents.
• Classification and organization of data resources
• Establishing contacts
• Systems Engineering Benefits:
– Re-Usability
– Search and retrieval
– Reliability
– Specification
– Maintenance
– Knowledge acquisition
Bioinformatics
Bioinformatics
Bioinformatics is the application of information technology to the field of biology.
The term bioinformatics was coined by Paulien Hogeweg and Ben Hesper in 1970 for the study of informatic processes in biotic systems.
Bioinformatics is an interdisciplinary field that develops and improves on methods for storing, retrieving, organizing and analyzing biological data.
Relation between ontologies, biology, computer science and philosophy
Source: Schulze-Kremer, S. (2002). Ontologies for molecular biology and bioinformatics. In Silico Biology 2, 0017.
Importance of Bioinformatics
Why Bioinformatics?
Bioinformatics techniques such as image and signal processing allow extraction of useful results from large amounts of raw data in the field of biology.
In the field of genetics and genomics, it aids in sequencing and annotating genomes and their observed mutations.
It plays a role in the textual mining of biological literature.
• Ultimate goals:
i) Uncover the wealth of biological information hidden in the mass of sequence, structure, literature and other biological data.
ii) Obtain a clearer insight into the fundamental biology of organisms and use this information to enhance the standard of life for mankind.
Why Bioinformatics? (contd...)
It plays a role in the analysis of gene and protein expression and regulation.
It supports the development of biological and gene ontologies to organize and query biological data.
It aids in the simulation and modeling of DNA, RNA, and protein structures as well as molecular interactions.
It helps analyze and catalogue biological pathways.
Bioinformatics can be used in various fields, as given below:
• Molecular medicine
• Gene therapy
• Antibiotic resistance
• Drug development
• Biotechnology
• Forensic analysis of microbes
• Evolutionary studies
• Waste cleanup
• Crop improvement
• Development of drought-resistant varieties
• Insect resistance
• Improved nutritional quality
• Veterinary Science
• Climate change studies
• Bio-weapon creation
Source: Stevens, R., Goble, C. A. and Bechhofer, S. (2000). Ontology-based knowledge representation for bioinformatics. Briefings in Bioinformatics. Vol. 1(4): 398-414.
Why do ontologies play an important role in bioinformatics?
• Create standards
• Interoperability
• Exploring large data sets – Use in investigating gene function.
• Mapping knowledge domains – Creating an ontology network that allows a user working in one area to take advantage of knowledge from a related area.
Growth of bio-ontology papers
[Chart: number of articles per year, 2003 to 2013; vertical axis from 0 to 450]
Source: Number of articles on “bio-ontology/ies” in PubMed/MEDLINE as of 15.9.2014
Bio-ontology Timeline (1992-2006)
Bio-ontology timeline (2009-2012)
[Chart: recoverable labels are LogMap, Mery-B and Granatum; vertical axis from 0 to 14]
Classification of bioinformatics ontologies
Bioinformatics
• Anatomy (45)
  – Human
  – Animal
  – Microbes
  – Plant
• Health (91)
  – Disease
  – Drug
• Cell biology (11)
  – Processes
    • Metabolism
    • DNA repair
  – Structures
    • Cell wall
    • Cell membrane
    • Nucleus
    • Mitochondria
• Biochemistry (44)
  – Biological processes
  – Carbohydrates
  – Lipids
  – Proteins
  – Nucleic acids
• Genomics & Proteomics (18)
  – Genetics
  – Protein structure
• Immunology (6)
Note: There are 168 other ontologies which are not included here due to some discrepancies.
Library of bio-ontologies
Ontology category        Number of ontologies
Anatomy                  45
Health                   91
Cell biology             11
Biochemistry             44
Genomics & Proteomics    18
Immunology               6
Others                   168
Total                    383
Source: http://bioportal.bioontology.org/ontologies
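The share of each category in the library can be worked out from these counts. The following is a small sketch that only uses the numbers reported in the table; the variable names are arbitrary.

```python
# Counts copied from the table (BioPortal ontologies as grouped in this seminar).
counts = {
    "Anatomy": 45,
    "Health": 91,
    "Cell biology": 11,
    "Biochemistry": 44,
    "Genomics & Proteomics": 18,
    "Immunology": 6,
    "Others": 168,
}

total = sum(counts.values())  # should match the reported total of 383

# Percentage share of each category, largest first.
shares = {name: 100 * n / total for name, n in counts.items()}
for name, pct in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<24}{counts[name]:>5}{pct:>8.1f}%")
```

Among the classified categories, Health is the largest group, at a little under a quarter of all 383 ontologies.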
Bio-ontology distribution
[Chart: Anatomy 45; Health 91; Cell biology 11; Biochemistry 44; Genomics & Proteomics 18; Immunology 6]
Ontologies on Health
[Chart, Health: Others (e.g. FHHO, IMMDIS, OPE, OVAE, etc.) 51; Diseases 31; Drug 9]
[Chart, Diseases: Cancer 7; Diseases 31]
Why is cancer research important?
Cancer is a leading cause of death worldwide, which makes it an important research focus.
Cancer research is crucial to improving the prevention, detection and treatment of cancers.
Cancer research will benefit the next generation of cancer patients, and it is also extremely important for patients being treated today.
Cancer ontologies would aid in exploring new avenues that contribute to cancer research.
Names of cancer ontologies and their descriptions
• Neomark Oral Cancer Ontology, version 4 (NEOMARK4): Ontology that describes the medical information necessary for early detection of oral cancer recurrence, extracted from the NeoMark Project.
• Neomark Oral Cancer Ontology, version 3: Ontology that describes the medical information necessary for early detection of oral cancer recurrence, extracted from the NeoMark Project.
• Cancer Chemoprevention Ontology (CANCO): A vocabulary that is able to describe and semantically interconnect the different paradigms of the cancer chemoprevention domain.
• National Cancer Institute Thesaurus (NCIT): A vocabulary for clinical care, translational and basic research, public information and administrative activities.
• Cancer Research and Management ACGT Master Ontology (ACGT-MO): Intended to represent the domain of cancer research and management in a computationally tractable manner.
• Upper-Level Cancer Ontology (CANONT): Provides an upper-level ontology for cancer.
• Breast Cancer Grading Ontology: Assigns a grade to a tumor starting from the 3 criteria of the Nottingham Grading System (NGS).
Motivations: Cancer Ontologies
Names of the projects and their descriptions
• A Social Collaborative Working Space Semantically Interlinking Biomedical Researchers, Knowledge and Data for the Design and Execution of In-Silico Models and Experiments in Cancer Chemoprevention.
• Cancer Bench-to-Bedside (caB2B): caB2B is a tool that enables querying for cancer-related information hosted anywhere on caGrid. It allows for Web-based queries, which can be stored for later re-use. caB2B users can semantically search and retrieve information from the NCBO Resource Index.
• Cancer Genome Anatomy Project (CGAP): The NCI's Cancer Genome Anatomy Project sought to determine the gene expression profiles of normal, precancer, and cancer cells, leading eventually to improved detection, diagnosis, and treatment for the patient. Resources generated by the CGAP initiative are available to the broad cancer community.
Motivation: Cancer ontologies (Contd…)
• Ontologies provide a powerful mechanism for making conceptual information about cancer biology computationally available.
• Ontologies provide a mechanism by which conceptual information can be attached to the current flood of cancer data, and thereby help turn data into useful knowledge.
• A standard vocabulary for cancer ontologies needs to be developed as per requirements.
• Because the information in the cancer ontologies is heterogeneous, it is important to identify what they have in common.
• Ontologies aid in mining diseases, methods and treatments from the biological literature.
Relations used in different cancer ontologies
Ontological relations and their utility
Relationships (also known as relations) between objects in an ontology specify how objects are related to other objects.
Typically a relation is of a particular type (or class) that specifies in what sense one object is related to another object in the ontology.
Much of the power of ontologies comes from the ability to describe relations. Together, the set of relations describes the semantics of the domain.
We mainly study the binary relations between the objects (here, diseases, treatments, methods, etc.).
In this context, studying the relations in the cancer ontologies would help in mining the diseases, methods and treatments that are yet to be extracted from the literature.
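A binary relation of this kind can be sketched as a (subject, relation, object) triple. The triples below are illustrative placeholders, not statements taken from any of the cancer ontologies, and the helper `objects_related_to` is likewise hypothetical.

```python
from collections import defaultdict

# Illustrative triples only; entity and relation names are placeholders (assumption).
triples = [
    ("oral_cancer", "located_in", "oral_cavity"),
    ("oral_cavity", "part_of", "head_and_neck"),
    ("drug_X", "treats", "oral_cancer"),
]

# Index triples by relation type so each relation can be queried on its own.
by_relation = defaultdict(list)
for subject, relation, obj in triples:
    by_relation[relation].append((subject, obj))


def objects_related_to(subject, relation):
    """Return every object linked to `subject` through `relation`."""
    return [o for s, o in by_relation.get(relation, []) if s == subject]


print(objects_related_to("oral_cancer", "located_in"))  # ['oral_cavity']
```

Grouping by relation type mirrors the idea that the type of a relation states in what sense two objects are connected.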
National Cancer Institute Thesaurus (NCIT)
Breast cancer grading ontology
Neomark Oral Cancer Ontology, version 4
Cancer Chemoprevention Ontology (CANCO)
Ontologies compared: Neomark Oral Cancer Ontology, version 4; Breast cancer grading ontology; Cancer Chemoprevention Ontology; National Cancer Institute Thesaurus
Relations that deal with spatial relations:
• Neomark Oral Cancer Ontology, version 4: adjacent_to, contained_in, located_in, location_of
• Breast cancer grading ontology: has_anatomical_entity, has_gland, has_tissue
• Cancer Chemoprevention Ontology: containsOrgan, containsTarget, ‘has disease location’
• National Cancer Institute Thesaurus: Disease_Has_Normal_Cell_Origin, Disease_Has_Normal_Tissue_Origin, Gene_Has_Physical_Location, Gene_Has_Chromosomal_Location
Other relations:
• derives_from; ‘contain molecule’; naturalVsSynthetic (indicates whether a Source is Natural or Synthetic); hasSource (Chemopreventive Agent with the sources where it is available or from where it originates)
• ‘related to disease’; all relations starting with Disease
• part_of, proper_part_of, improper_part_of, integral_part_of
• part_of; containAssay (Study that a Bioassay is part of)
• part_of, proper_part_of, improper_part_of, integral_part_of
• ‘induce prevent’, ‘interact with’
• Chemical_Or_Drug_Affects_Abnormal_Cell, Chemical_Or_Drug_Affects_Cell_Type_Or_Tissue, Chemical_Or_Drug_Affects_Gene_Product
• hasBiologicalMechanism; Biological_Process
• hasTarget (target of biological mechanism in order to prevent cancer)
• hasTarget
Types of relations for disease ontology
Relations
associated      patient      inherence
initiator       includes     risk factor
parthood        excludes     transformation
origin          affects      spatial relation
abnormality     effect       participants
agent           role         constituent
treatment       result
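One way to picture this abstraction is a lookup from ontology-specific relation names to the general relation types listed here. The particular assignments in the sketch are assumptions made for illustration (the slides do not state which specific relation falls under which type), and `abstract_type` and `group_by_type` are hypothetical helpers.

```python
# Assumed, illustrative mapping from specific relation names (as they appear in
# the cancer ontologies) to the general relation types; not taken from the slides.
ABSTRACT_TYPE = {
    "part_of": "parthood",
    "proper_part_of": "parthood",
    "located_in": "spatial relation",
    "adjacent_to": "spatial relation",
    "derives_from": "origin",
    "Disease_Has_Normal_Cell_Origin": "origin",
    "Chemical_Or_Drug_Affects_Gene_Product": "affects",
}


def abstract_type(relation):
    """Return the general relation type, or None when no mapping is assumed."""
    return ABSTRACT_TYPE.get(relation)


def group_by_type(relations):
    """Group specific relation names under their assumed general type."""
    groups = {}
    for rel in relations:
        groups.setdefault(abstract_type(rel), []).append(rel)
    return groups


print(group_by_type(["part_of", "located_in", "hasTarget"]))
```

Relations that end up under `None` are the ones for which no general type has been assigned yet, which is where further study of the list would be needed.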
Discussions
• The listing of relations for disease ontologies is at a preliminary stage.
• Whether the list is exhaustive is not known; this needs in-depth research.
• The study attempts to provide an abstraction of relations that would aid in developing an upper-level ontology for diseases in general.
• No specific relations have been mentioned as of now.
• The relations can be specified based on the purpose of the ontology.
• Using the relational ontologies, more entities can be identified.
Limitations
• Time constraint
• Domain knowledge
• Only 4 cancer ontologies were studied
• All disease ontologies need to be considered to build a common framework
• Vast scope of the domain
Conclusions & Future prospects
• Biologically connected objects can be explored.
• Study of relations can aid in discovering the unexplored entities.
• Importance of studying the objects rather than the value of the property of the objects.
• Work can be initiated on building a common framework for all the disease ontologies to be built in future.
• A common framework for all ontologies on diseases would accelerate the progress of future ontologists.
References
• Schulze-Kremer, S. (2002). Ontologies for molecular biology and bioinformatics. In Silico Biology 2, 0017.
• Stevens, R., Goble, C. A. and Bechhofer, S. (2000). Ontology-based knowledge representation for bioinformatics. Briefings in Bioinformatics. Vol. 1(4): 398-414.
• http://www.bioinformatics.kmutt.ac.th/download/seminar/bif04/Alisa_PPT1.pdf
• http://informatics.sdsu.edu/bioinformatics/
• Bodenreider, O. and Stevens, R. (2006). Bio-ontologies: current trends and future directions.
• Karp, P. D. (2000). An ontology for biological function based on molecular interactions.
• Bard, J. B. L. and Rhee, S. Y. (2004). Ontologies in biology: design, applications and future challenges. Nature Reviews Genetics. Vol. 5: 213-222.
• http://www.w3.org/wiki/Semantic_Bioinformatics
• Uschold, M. and Jasper, R. A Framework for Understanding and Classifying Ontology Applications. http://www.cs.man.ac.uk/~horrocks/Teaching/cs646/Papers/uschold99.pdf
• http://protege.stanford.edu/publications/ontology_development/ontology101-noy-mcguinness.html
Thank you