genomics2 phenomics complete
![Page 1: Genomics2 Phenomics Complete](https://reader031.vdocuments.us/reader031/viewer/2022022203/5872ee1f1a28abfa548b7ab3/html5/thumbnails/1.jpg)
Genomics to Phenomics: The Complex Journey in Big Data Biomedicine
Asoke K Talukder, PhD
InterpretOmics, Bangalore
Indian Society of Human Genetics, 41st Annual Meeting,
Sankara Nethralaya, Chennai, 3-5 March, 2016
![Page 2: Genomics2 Phenomics Complete](https://reader031.vdocuments.us/reader031/viewer/2022022203/5872ee1f1a28abfa548b7ab3/html5/thumbnails/2.jpg)
Acknowledgement
• Organizing Committee, ISHG2016
• Authors and agencies making their research articles and data available in the open domain and on the Internet
• Authors of Open source Software
• NCBI, NIH, Wikipedia, Google & other Internet sites that believe in Bhikshu Economy by making their contents open in the Cloud
2March 3-5, 2016
![Page 3: Genomics2 Phenomics Complete](https://reader031.vdocuments.us/reader031/viewer/2022022203/5872ee1f1a28abfa548b7ab3/html5/thumbnails/3.jpg)
Hunting the “Dwarfing” Gene?
Palm Oil – Activate the Dwarfing Gene (Genomics)
Teak – Repress the Dwarfing Gene (Genomics)
![Page 4: Genomics2 Phenomics Complete](https://reader031.vdocuments.us/reader031/viewer/2022022203/5872ee1f1a28abfa548b7ab3/html5/thumbnails/4.jpg)
The Human Genome – Decoding the Book of Life
A Milestone for Humanity – the Human genome
Human Genome Completed, 26 June, 2000
J Craig Venter, Bill Clinton, Francis Collins
![Page 5: Genomics2 Phenomics Complete](https://reader031.vdocuments.us/reader031/viewer/2022022203/5872ee1f1a28abfa548b7ab3/html5/thumbnails/5.jpg)
Trillion-Dollar Science to Trillion-Dollar Industry
![Page 6: Genomics2 Phenomics Complete](https://reader031.vdocuments.us/reader031/viewer/2022022203/5872ee1f1a28abfa548b7ab3/html5/thumbnails/6.jpg)
The relationship between the number of stem cell divisions in the lifetime of a given tissue and the lifetime risk of cancer in that tissue
Reference: Cristian Tomasetti and Bert Vogelstein, Science, 2 Jan 2015;347:78-81
Reference: Norbert Stefan, et al, Divergent associations of height with cardiometabolic disease and cancer: epidemiology, pathophysiology, and global implications. The Lancet Diabetes & Endocrinology, 2016; DOI:
![Page 7: Genomics2 Phenomics Complete](https://reader031.vdocuments.us/reader031/viewer/2022022203/5872ee1f1a28abfa548b7ab3/html5/thumbnails/7.jpg)
Reduction Vs Integration
Genomics (System)
(Genetics)
Talukder AK, Genomics 3.0, Big Data Analytics, Springer, 2015
![Page 8: Genomics2 Phenomics Complete](https://reader031.vdocuments.us/reader031/viewer/2022022203/5872ee1f1a28abfa548b7ab3/html5/thumbnails/8.jpg)
Evidence Based Science (Biology & Medicine)
| Genetics | Genomics |
| --- | --- |
| Confirmatory | Exploratory |
| Hypothesis Driven | Hypothesis Creating |
| Component | Holistic |
| Biology | Statistical Data Mining |
![Page 9: Genomics2 Phenomics Complete](https://reader031.vdocuments.us/reader031/viewer/2022022203/5872ee1f1a28abfa548b7ab3/html5/thumbnails/9.jpg)
Big Data in Biomedicine
The 7 Vs of Genomic Big Data
• Volume is defined in terms of the physical volume of the data that need to be online, like giga-byte (10^9), tera-byte (10^12), peta-byte (10^15) or exa-byte (10^18) or even beyond.
• Velocity is about the data-retrieval time or the time taken to service a request. Velocity is also measured through the rate of change of the data volume.
• Variety relates to heterogeneous types of data like text, structured, unstructured, video, audio etcetera.
• Veracity is another dimension to measure data reliability - the ability of an organization to trust the data and be able to confidently use it to make crucial decisions.
• Vexing covers the effectiveness of the algorithm. The algorithm needs to be designed to ensure that data processing time is close to linear and the algorithm does not have any bias; irrespective of the volume of the data, the algorithm is able to process the data in reasonable time.
• Variability is the scale of data. Data in biology are multi-scale, ranging from sub-atomic ions at picometers, through macromolecules, cells and tissues, to a population spread over thousands of kilometers.
• Value is the final actionable insight or the functional knowledge. The same mutation in a gene may have a different effect depending on the population or the environmental factors.
Reference: Talukder AK, Genomics 3.0: Big Data in Precision Medicine, Big Data Analytics, Springer, LNCS9498, 2015
![Page 10: Genomics2 Phenomics Complete](https://reader031.vdocuments.us/reader031/viewer/2022022203/5872ee1f1a28abfa548b7ab3/html5/thumbnails/10.jpg)
21st Century Biomedicine is a Multi-Scale Challenge
Figure labels:
Levels (from the molecular scale up): Genome, Transcriptome, Proteome, Cellular Structure and Function, Tissue Structure and Function, Organ Structure and Function, Patient
Length scales: nm, nm~μm, μm~mm, mm~cm, 1 m
Time scales: ns, ns-μs, μs-s, s~hour, hour~day, years
Processes: Molecular events (e.g. ion-channel gating), Diffusion and cell signaling, Motility, Mitosis, Protein turnover, Human lifetime
Axis: Genomics to Phenomics
![Page 11: Genomics2 Phenomics Complete](https://reader031.vdocuments.us/reader031/viewer/2022022203/5872ee1f1a28abfa548b7ab3/html5/thumbnails/11.jpg)
‘OMICS’ (High-throughput) Big Data Domains
Figure labels: GWAS, Population Genetics, Microarray, Systems Biology, Phenomics, ChIP-Seq, DNA-Seq, RNA-Seq, Exome-Seq, Repli-Seq, Small RNA-Seq, Metabolic Networks, Proteomics, Metagenomics
![Page 12: Genomics2 Phenomics Complete](https://reader031.vdocuments.us/reader031/viewer/2022022203/5872ee1f1a28abfa548b7ab3/html5/thumbnails/12.jpg)
Multi Omics Big Data
Reference: Talukder AK, Genomics 3.0: Big Data in Precision Medicine, Big Data Analytics, Springer, LNCS9498, 2015
![Page 13: Genomics2 Phenomics Complete](https://reader031.vdocuments.us/reader031/viewer/2022022203/5872ee1f1a28abfa548b7ab3/html5/thumbnails/13.jpg)
Hypothesis Creating Multi-Omics Big Data Analytics
Figure labels: Genomic Big Data; Statistics (Exploratory Data Analysis); Phenomic & Environmental Knowledgebase; Systems Biology
Reference: Talukder AK, Genomics 3.0: Big Data in Precision Medicine, Big Data Analytics, Springer, LNCS9498, 2015
![Page 14: Genomics2 Phenomics Complete](https://reader031.vdocuments.us/reader031/viewer/2022022203/5872ee1f1a28abfa548b7ab3/html5/thumbnails/14.jpg)
Lung Cancer: A Multi-Omics Multi-Scale Big Data Case Study using iOMICS Pipelines
We took a lung squamous cell carcinoma study and reanalysed its data using iOMICS pipelines to uncover new knowledge
The reanalysis covers an 18-year longitudinal clinical study of lung squamous cell carcinoma (SCC), published as PMID: 25189482.
The data consist of omics data for 93 tumor patients and 16 healthy individuals:
• DNA level genotype data: 64 tumor samples, 373,398 DNA sites
• RNA level gene expression data: 109 samples, 20,117 genes
• Clinical data: general and clinical information (where applicable) for all 109 individuals in the study. Survival information was also available in the form of overall and disease-recurrence-free survival
![Page 15: Genomics2 Phenomics Complete](https://reader031.vdocuments.us/reader031/viewer/2022022203/5872ee1f1a28abfa548b7ab3/html5/thumbnails/15.jpg)
Multi-Omics Based Multi-Scale Analytics Framework
Data from patient is integrated with existing knowledgebases using a 3 step analysis framework
Top-down Exploratory Data Analysis: Analysis of experimental data for molecular information such as DNA mutations and gene expression
Multi-scale Integrative Analysis: Integration of molecular scale data such as DNA, RNA level results for mechanistic modeling
Bottom-up Integrative and Network Analysis: Integration of experimental data analysis results with existing knowledgebases for generalizability and improved quality results
Results from the framework can be used to power clinical decision support systems for treatment strategies and drug design
Reference: Agarwal M, Adhil M, Talukder AK, Multi-Omics Multi-Scale Big data Analytics for Cancer Genomics, Big Data Analytics, Springer, LNCS9498, 2015
![Page 16: Genomics2 Phenomics Complete](https://reader031.vdocuments.us/reader031/viewer/2022022203/5872ee1f1a28abfa548b7ab3/html5/thumbnails/16.jpg)
Patient Stratification
Integration of gene expression with recurrence free survival of patients
Recurrence free survival known for 87 samples
Used Cox regression to model survival time as a function of gene expression
Stratified patients into 3 response groups:
Good, Average and Poor prognosis
Aim: Markers of Patient Survival
Survival Probability Curves for Stratified Prognosis Groups
Top Significant Genes separating Poor and Average prognosis are: EIF5A, SCEL, ABCA11P, VAV2
Top Significant Genes separating Good and Average prognosis are: SLC7A11, G6PD, ALDH3A1, NQO1, SOST
Top Significant Genes separating Good and Poor prognosis are: SCEL, VAV2, PPP1R26, ZNF77, EIF5A
Reference: Agarwal M, Adhil M, Talukder AK, Multi-Omics Multi-Scale Big data Analytics for Cancer Genomics, Big Data Analytics, Springer, LNCS9498, 2015
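The slide reports survival probability curves for the stratified prognosis groups but does not say how the curves were estimated. As a minimal sketch, assuming the common Kaplan-Meier estimator (an assumption, not a statement about the iOMICS pipeline), the survival curve for one group can be computed as below. The function name `kaplan_meier` and the toy follow-up data are hypothetical. The Cox regression step named on the slide is not reproduced here.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate for one patient group.

    times  -- follow-up time per patient (any consistent unit)
    events -- 1 if recurrence was observed at that time, 0 if censored
    Returns a list of (time, survival_probability) at each event time.
    """
    leaving = Counter(times)  # patients leaving the risk set at each time
    observed = Counter(t for t, e in zip(times, events) if e)
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = observed.get(t, 0)
        if d:
            # Multiply by the fraction of at-risk patients surviving time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Patients with an event or censoring at t are no longer at risk.
        at_risk -= leaving[t]
    return curve


# Made-up follow-up data for one hypothetical prognosis group.
toy_times = [2, 3, 3, 5, 8]
toy_events = [1, 1, 0, 1, 0]
toy_curve = kaplan_meier(toy_times, toy_events)
```

Running the same function once per prognosis group (Good, Average, Poor) would give one step curve per group, which is the kind of plot the slide describes.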
![Page 17: Genomics2 Phenomics Complete](https://reader031.vdocuments.us/reader031/viewer/2022022203/5872ee1f1a28abfa548b7ab3/html5/thumbnails/17.jpg)
Phenotype Based Patient Stratification
© InterpretOmics
Data: Lung Squamous Cell Carcinoma with Basaloid Histology Study (PubMed-ID: 25189482) Clinical Data.
Overall Survivability: The basaloid tumor samples have poor overall survival (OS) compared to the other samples (Fig 1 in the original paper)
Recurrence Free Survivability: Basaloid tumors show distinctly poor recurrence-free survival (RFS) compared to other samples
Age factor: Patients diagnosed at an age of 53 or less showed better prognosis compared to those diagnosed later
Adjuvant Radiotherapy Factor: Patients who did not receive adjuvant radiotherapy (Age ≤ 53) show better overall survival, compared to those who did
Unique Findings:
- The basaloid subtypes showed distinctly poor prognosis compared to the other samples
- Adjuvant radiotherapy is not very effective for improving patient survival in these cases
- For patients diagnosed before 53 years of age, administration of adjuvant radiotherapy is associated with worse long-term overall survival
Aim: Markers of Patient Survival
![Page 18: Genomics2 Phenomics Complete](https://reader031.vdocuments.us/reader031/viewer/2022022203/5872ee1f1a28abfa548b7ab3/html5/thumbnails/18.jpg)
Differential Gene Expression
Key differentially expressed protein-coding genes between the 2 cancer subtypes were identified
106 differentially expressed genes were identified based on the filtering criteria
Key differentially expressed genes were: KLHL23, IVL, MPZL2, KCNK6, SPRR3, ELL2, MALL, RPRD1A, ZNF124
p-value criteria ≤ 0.0001 and absolute log fold change > 0.6
Aim: Basaloid vs. SCC Molecular Comparison
Reference: Agarwal M, Adhil M, Talukder AK, Multi-Omics Multi-Scale Big data Analytics for Cancer Genomics, Big Data Analytics, Springer, LNCS9498, 2015
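The filtering criteria on this slide (p-value ≤ 0.0001 and absolute log fold change > 0.6) can be sketched as a simple filter over per-gene test results. The record layout, the function name `filter_differential_genes` and the placeholder gene names are assumptions for illustration. The statistical test that produced the p-values is not specified in the source.

```python
def filter_differential_genes(results, p_max=1e-4, min_abs_log_fc=0.6):
    """Keep genes passing both thresholds quoted on the slide.

    results -- iterable of dicts with keys "gene", "p_value", "log_fc"
               (this record layout is an assumption)
    """
    return [
        r["gene"]
        for r in results
        if r["p_value"] <= p_max and abs(r["log_fc"]) > min_abs_log_fc
    ]


# Made-up records with placeholder gene names.
toy_results = [
    {"gene": "GENE_A", "p_value": 5e-6, "log_fc": 1.2},   # passes both
    {"gene": "GENE_B", "p_value": 5e-6, "log_fc": -0.9},  # passes (down-regulated)
    {"gene": "GENE_C", "p_value": 5e-6, "log_fc": 0.3},   # fold change too small
    {"gene": "GENE_D", "p_value": 0.01, "log_fc": 2.0},   # p-value too large
]
selected = filter_differential_genes(toy_results)
```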
![Page 19: Genomics2 Phenomics Complete](https://reader031.vdocuments.us/reader031/viewer/2022022203/5872ee1f1a28abfa548b7ab3/html5/thumbnails/19.jpg)
Mutation Association with Cancer
Identified DNA sequence sites with different genotypes between the two lung carcinoma subtypes (basaloid and SCC)
After linkage analysis and filtering, the 373,398 sites were reduced to 735 disease type associated DNA loci
These mapped to 558 unique genes
Aim: Basaloid vs. SCC Molecular Comparison
Karyotype Plot for Mutation Locations across Chromosomes
p-value criteria ≤ 0.001 and odds ratio criteria ≥ 3
Reference: Agarwal M, Adhil M, Talukder AK, Multi-Omics Multi-Scale Big data Analytics for Cancer Genomics, Big Data Analytics, Springer, LNCS9498, 2015
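The slide filters sites on a p-value (≤ 0.001) and an odds ratio (≥ 3) but does not name the association test. As a sketch under the assumption of a 2x2 genotype-by-subtype table tested with a two-sided Fisher exact test, the two quantities can be computed with the standard library alone. The function names and the toy counts are hypothetical.

```python
from math import comb


def odds_ratio(a, b, c, d):
    """Odds ratio for the table [[a, b], [c, d]]; undefined when b * c == 0."""
    return (a * d) / (b * c)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the table [[a, b], [c, d]].

    Rows: basaloid / SCC samples; columns: variant present / absent
    (this layout is an assumption for illustration).
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(a + b + c + d, col1)

    def prob(x):
        # Hypergeometric probability of x variant carriers in row 1.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_observed = prob(a)
    low, high = max(0, col1 - row2), min(row1, col1)
    # Sum all tables at least as extreme (no more probable) as the observed one.
    return sum(
        p for p in (prob(x) for x in range(low, high + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# Made-up counts for one site: 14 of 15 basaloid vs 3 of 19 SCC samples carry it.
toy_or = odds_ratio(14, 1, 3, 16)
toy_p = fisher_exact_two_sided(14, 1, 3, 16)
site_is_associated = toy_p <= 0.001 and toy_or >= 3
```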
![Page 20: Genomics2 Phenomics Complete](https://reader031.vdocuments.us/reader031/viewer/2022022203/5872ee1f1a28abfa548b7ab3/html5/thumbnails/20.jpg)
Functional Characterization
Aim: Basaloid vs. SCC Molecular Comparison
We characterized the 558 mutated genes identified from the DNA level analysis using XomPathways
Results indicated the key pathways differentiating the tumor subtypes such as cell signaling and adhesion
p-value criteria ≤ 0.001 for pathway enrichment
Reference: Agarwal M, Adhil M, Talukder AK, Multi-Omics Multi-Scale Big data Analytics for Cancer Genomics, Big Data Analytics, Springer, LNCS9498, 2015
Pathway-Pathway Network
Gene-Gene Network
Genes-Pathway Bipartite Network
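The pathway enrichment threshold (p-value ≤ 0.001) is commonly evaluated with a hypergeometric over-representation test. Whether XomPathways uses this exact test is not stated in the source, so the sketch below is an assumption. The 558 gene count comes from the slide. The universe size, pathway size and overlap are made-up numbers, and the function name is hypothetical.

```python
from math import comb


def enrichment_p_value(universe, pathway_size, hits, overlap):
    """P(X >= overlap) when `hits` genes are drawn from `universe` genes,
    of which `pathway_size` belong to the pathway (hypergeometric upper tail).
    """
    upper = min(pathway_size, hits)
    favourable = sum(
        comb(pathway_size, k) * comb(universe - pathway_size, hits - k)
        for k in range(overlap, upper + 1)
    )
    return favourable / comb(universe, hits)


# 558 mutated genes (from the slide); the other three numbers are made up.
toy_p = enrichment_p_value(universe=20000, pathway_size=100, hits=558, overlap=12)
pathway_is_enriched = toy_p <= 0.001
```

With these toy numbers only about three pathway genes would be expected by chance, so an overlap of 12 passes the threshold comfortably.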
![Page 21: Genomics2 Phenomics Complete](https://reader031.vdocuments.us/reader031/viewer/2022022203/5872ee1f1a28abfa548b7ab3/html5/thumbnails/21.jpg)
Multi-Scale Integrative Biology (Expression QTL)
Aim: Basaloid vs. SCC Molecular Comparison
The DNA level variants identified for the basaloid histology comparisons were compared with Gene Expression levels to view the effect of mutations from the DNA to RNA level
Expression levels of some of the associated genes were altered with a large fold change
Interesting genes include: CLCA2, CENPF, SHROOM3, ELL2, ATP10B, CASC15, TIAM2, PROX1, EYA1, C10orf54, HOXC9, SCEL, BCL2, FUT3, YPEL1, PATZ1, CAV2
Reference: Agarwal M, Adhil M, Talukder AK, Multi-Omics Multi-Scale Big data Analytics for Cancer Genomics, Big Data Analytics, Springer, LNCS9498, 2015
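The DNA-to-RNA comparison described on this slide can be illustrated by grouping samples on genotype and measuring the expression fold change of the associated gene. This is a minimal sketch only: the genotype coding (number of variant alleles), the function name `expression_log2_fold_change` and the toy values are assumptions, and the source does not describe the exact eQTL model used.

```python
from math import log2
from statistics import mean


def expression_log2_fold_change(genotypes, expression):
    """log2 ratio of mean expression in variant carriers vs non-carriers.

    genotypes  -- variant allele count per sample (0, 1 or 2)
    expression -- expression level of the associated gene per sample
    """
    carriers = [e for g, e in zip(genotypes, expression) if g > 0]
    non_carriers = [e for g, e in zip(genotypes, expression) if g == 0]
    return log2(mean(carriers) / mean(non_carriers))


# Made-up data: carriers express the gene at four times the reference level.
toy_genotypes = [0, 0, 1, 1, 2]
toy_expression = [10, 10, 40, 40, 40]
toy_lfc = expression_log2_fold_change(toy_genotypes, toy_expression)
```

A positive value corresponds to up-regulation in carriers and a negative value to down-regulation.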
![Page 22: Genomics2 Phenomics Complete](https://reader031.vdocuments.us/reader031/viewer/2022022203/5872ee1f1a28abfa548b7ab3/html5/thumbnails/22.jpg)
Multi-Scale Integration
Mutations associated with expression level changes were identified
These were associated with up or down-regulation of gene expression
Genes-Mutations Integration
Reference: Agarwal M, Adhil M, Talukder AK, Multi-Omics Multi-Scale Big data Analytics for Cancer Genomics, Big Data Analytics, Springer, LNCS9498, 2015
![Page 23: Genomics2 Phenomics Complete](https://reader031.vdocuments.us/reader031/viewer/2022022203/5872ee1f1a28abfa548b7ab3/html5/thumbnails/23.jpg)
Functional Characterization
Aim: Basaloid vs. SCC Molecular Comparison
Functional enrichment of the top differentially expressed genes between tumor and normal samples highlights the key pathways involved: the pathways and processes of epidermal and epithelial cell differentiation
Together, the functional analysis results show that the primary differences between the basaloid and SCC subtypes are associated with tissue structure
This is consistent with histology based distinction between the two subtypes
Genes-Biological Processes Bipartite Network
Reference: Agarwal M, Adhil M, Talukder AK, Multi-Omics Multi-Scale Big data Analytics for Cancer Genomics, Big Data Analytics, Springer, LNCS9498, 2015
![Page 24: Genomics2 Phenomics Complete](https://reader031.vdocuments.us/reader031/viewer/2022022203/5872ee1f1a28abfa548b7ab3/html5/thumbnails/24.jpg)
Metabolic and Biochemical Reactions Integration
Aim: Identification of Potential Drug Targets
Genomic level alterations translate into protein and metabolism changes, which finally affect phenotype at a cellular and tissue level
Using expression data, metabolic network models were constructed for healthy and lung cancer samples
Recon X was taken as a reference genome scale model
Genes associated with maximum metabolic alterations can serve as effective targets
Carbohydrate Metabolism Pathways
Image source: Khazaei, T., McGuigan, A., Mahadevan, R.: Ensemble modeling of cancer metabolism. Frontiers in physiology 3 (2012)
Reference: Agarwal M, Adhil M, Talukder AK, Multi-Omics Multi-Scale Big data Analytics for Cancer Genomics, Big Data Analytics, Springer, LNCS9498, 2015
![Page 25: Genomics2 Phenomics Complete](https://reader031.vdocuments.us/reader031/viewer/2022022203/5872ee1f1a28abfa548b7ab3/html5/thumbnails/25.jpg)
Solve Constraint-Based Differential Equations
Aim: Identification of Potential Drug Targets
Three-step process
Step I: Model initiation using constraint based modeling
Cancer state optimized for maximum growth
Healthy state optimized for maximum energy production
Step II: Identification of highly altered reactions and associated genes
Step III: Extension of gene list to include first degree PPI interactions as potential targets
Reference: Agarwal M, Adhil M, Talukder AK, Multi-Omics Multi-Scale Big data Analytics for Cancer Genomics, Big Data Analytics, Springer, LNCS9498, 2015
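Step I can be written down in the usual flux balance form. This is the textbook formulation of constraint-based modeling, given here as an assumption, since the slide does not state the exact equations used. Here x is the vector of metabolite concentrations, S the stoichiometric matrix of the reference model, v the vector of reaction fluxes, and c the objective weights (growth for the cancer state, energy production for the healthy state). The bounds on v are assumed to come from the expression data.

```latex
\frac{d\mathbf{x}}{dt} = S\,\mathbf{v} = \mathbf{0}
\qquad\Longrightarrow\qquad
\max_{\mathbf{v}} \; \mathbf{c}^{\top}\mathbf{v}
\quad \text{subject to} \quad
S\,\mathbf{v} = \mathbf{0}, \;\;
\mathbf{v}^{\min} \le \mathbf{v} \le \mathbf{v}^{\max}
```

Setting the differential equations to steady state turns the problem into a linear program. Solving it once per state and comparing the two flux vectors reaction by reaction is one way to obtain the highly altered reactions of Step II.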
![Page 26: Genomics2 Phenomics Complete](https://reader031.vdocuments.us/reader031/viewer/2022022203/5872ee1f1a28abfa548b7ab3/html5/thumbnails/26.jpg)
Metabolic and Protein Network Integration
Aim: Identification of Potential Drug Targets
Identified Metabolic Reactions Network
Protein-protein interactions for an identified gene (EIF1B)
Reference: Agarwal M, Adhil M, Talukder AK, Multi-Omics Multi-Scale Big data Analytics for Cancer Genomics, Big Data Analytics, Springer, LNCS9498, 2015
![Page 27: Genomics2 Phenomics Complete](https://reader031.vdocuments.us/reader031/viewer/2022022203/5872ee1f1a28abfa548b7ab3/html5/thumbnails/27.jpg)
Systems Biology and the Small Molecule Targets
Aim: Identification of Potential Drug Targets
Potential targets were identified as genes with large association with altered reactions
A high degree in the human protein interaction network for these genes indicates that targeting them will affect more pathways and may be toxic to the cell
Identified potential drug targets include: NME2, GSR, YWHAZ, TGM2, JAM2, STAT3, TIMP2, RHOB, GIT2 and TK1
Reference: Agarwal M, Adhil M, Talukder AK, Multi-Omics Multi-Scale Big data Analytics for Cancer Genomics, Big Data Analytics, Springer, LNCS9498, 2015
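The first-degree extension of the gene list and the degree check mentioned on these slides can be sketched on a small interaction graph. The edge list, the placeholder node names and the function name `extend_with_first_degree` are hypothetical, and no real protein interactions are implied.

```python
def extend_with_first_degree(seed_genes, ppi_edges):
    """Add direct interaction partners of the seed genes.

    seed_genes -- genes associated with highly altered reactions
    ppi_edges  -- iterable of undirected (protein_a, protein_b) pairs
    Returns (extended_gene_set, degree_of_each_gene_in_that_set).
    """
    neighbours = {}
    for a, b in ppi_edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    extended = set(seed_genes)
    for gene in seed_genes:
        extended |= neighbours.get(gene, set())

    # Degree = number of distinct interaction partners in the network.
    degree = {gene: len(neighbours.get(gene, ())) for gene in extended}
    return extended, degree


# Made-up network with placeholder names.
toy_edges = [
    ("SEED_1", "P1"), ("SEED_1", "P2"),
    ("SEED_2", "P2"), ("P2", "P3"), ("P4", "P5"),
]
targets, degree = extend_with_first_degree(["SEED_1", "SEED_2"], toy_edges)
```

Sorting the candidates by `degree` gives a rough way to flag hub proteins, which the slide warns may be toxic targets because they touch more pathways.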
![Page 28: Genomics2 Phenomics Complete](https://reader031.vdocuments.us/reader031/viewer/2022022203/5872ee1f1a28abfa548b7ab3/html5/thumbnails/28.jpg)
Conclusions from the Cancer/iOMICS Case Study
Molecular differences between basaloid and SCC lung carcinoma subtypes:
Based on DNA and RNA level comparisons, we were able to identify genes involved in the differentiation of the two cancer subtypes.
We tracked the mutations in genes such as SHROOM3, PROX1, CLCA2 etc. to gene expression alterations.
The molecular level differences between the two subtypes were able to predict the cellular and tissue level differences seen between the subtypes
Molecular states associated with poor patient survival:
Identified genes involved in poor patient survival probabilities such as VAV2, EIF5A, SCEL etc.
Identified a hidden molecular subtype within the pure basaloid subgroup, having particularly poor prognosis
Identification of potential drug targets:
Based on the translation of gene expression to metabolic fluxes, we identified key altered metabolic pathways, reactions and associated genes which are putative drug targets
All analysis results were validated using extensive bibliomic data
![Page 29: Genomics2 Phenomics Complete](https://reader031.vdocuments.us/reader031/viewer/2022022203/5872ee1f1a28abfa548b7ab3/html5/thumbnails/29.jpg)
Omnia Knowledgebase & Clinical Decision Support System
Patient Specific Survival for breast cancer based on the patient age, sex, grade and stage. There are 2,613 individuals with breast cancer of age group 45-49, from SEER within Omnia
• For adjuvant therapeutic intervention A+B, the overall QALYs (Quality-Adjusted Life Years) are around 8 years and the cost per QALY is ₹2,00,000, giving a likely disease burden of ~₹16,00,000 for 8 years of life.
• For drug A, the overall QALYs are around 6 years and the cost per QALY is ₹80,000, giving a likely burden of ~₹4,80,000 for 6 years of life.
• Using this prognostic information, an informed decision can be made by considering the QALYs and the total cancer burden.
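The burden figures above follow from multiplying the QALYs by the cost per QALY. A short sketch of that arithmetic, with the amounts formatted in Indian digit grouping, is given below. The function names `total_burden` and `format_inr` are hypothetical, and the numbers are the ones quoted on the slide.

```python
def total_burden(qalys, cost_per_qaly):
    """Total cost over the gained quality-adjusted life years."""
    return qalys * cost_per_qaly


def format_inr(amount):
    """Format a whole rupee amount with Indian grouping (e.g. 16,00,000)."""
    digits = str(int(amount))
    if len(digits) <= 3:
        return "₹" + digits
    head, tail = digits[:-3], digits[-3:]
    groups = []
    while len(head) > 2:
        groups.insert(0, head[-2:])
        head = head[:-2]
    groups.insert(0, head)
    return "₹" + ",".join(groups + [tail])


burden_a_plus_b = total_burden(8, 200_000)  # intervention A+B
burden_a = total_burden(6, 80_000)          # drug A
print(format_inr(burden_a_plus_b))  # ₹16,00,000
print(format_inr(burden_a))         # ₹4,80,000
```

The comparison is then 8 QALYs at about ₹16,00,000 against 6 QALYs at about ₹4,80,000, which is the trade-off the slide asks the decision maker to weigh.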
Drugs with detailed description report for breast cancer type chr16_g.69373414T>C (NIP7)
Omnia contains curated Multi-Omics data (Variation, Expression, GO, Pathway, Drug, and Pharmacogenomics) along with subjects’ clinical data such as Demographics, Environmental, Phenotype and other attributes like HGNC, OMIM, UMLS, ICD10, SEER, and MeSH terms. Currently, Omnia contains more than 200,000 Variations, 100 Genomic experiments and 5000 Curated papers for Genotype-Phenotype relationships.
Reference: Adhil M, Talukder AK, Gandham S, Agarwal M, CuraEx: Clinical Expert System Using Big data for Precision Medicine, Big Data Analytics, Springer, LNCS9498, 2015
![Page 30: Genomics2 Phenomics Complete](https://reader031.vdocuments.us/reader031/viewer/2022022203/5872ee1f1a28abfa548b7ab3/html5/thumbnails/30.jpg)
iOMICS – the MultiOmics Platform
![Page 31: Genomics2 Phenomics Complete](https://reader031.vdocuments.us/reader031/viewer/2022022203/5872ee1f1a28abfa548b7ab3/html5/thumbnails/31.jpg)
iOMICS App Store
![Page 32: Genomics2 Phenomics Complete](https://reader031.vdocuments.us/reader031/viewer/2022022203/5872ee1f1a28abfa548b7ab3/html5/thumbnails/32.jpg)
Enterprises Disrupting Biomedical Industries
InterpretOmics (http://www.interpretomics.co): Revolutionizing Genomics through Big data Multi-Scale Multi-Omics Solutions
Singapore Life Sciences: Transforming Life Sciences and Precision Medicine
Applied Genetics Diagnostics (http://www.appgendx.com): The Next Generation Healthsciences company offering Genetic Diagnostic Services
![Page 33: Genomics2 Phenomics Complete](https://reader031.vdocuments.us/reader031/viewer/2022022203/5872ee1f1a28abfa548b7ab3/html5/thumbnails/33.jpg)
Some of Our Collaborators/Customers
JNCASR
![Page 34: Genomics2 Phenomics Complete](https://reader031.vdocuments.us/reader031/viewer/2022022203/5872ee1f1a28abfa548b7ab3/html5/thumbnails/34.jpg)
iOMICS Accelerate Your Biomedical Research – Making it Quicker, Reliable, and Affordable
InterpretOmics
Office: Shezan Lavelle, 5th Floor, #15 Walton Road, Bengaluru 560001
Sequencing Center: #329, 7th Main, HAL 2nd Stage, Indiranagar, Bengaluru 560008
Phone: +91(80)46623800