IntOGen & Gitools

IntOGen & Gitools: integration, visualization and data-mining of multidimensional oncogenomic data. Christian Pérez-Llamas, Master student, Biomedical Genomics, GRIB-UPF, April 2010

Upload: christianperez

Post on 25-May-2015


DESCRIPTION

The amount of oncogenomic data available has grown steadily in recent years, and more is to come. The main challenges the scientific community faces, and will keep facing, are integrating this data to extract new knowledge and visualizing the analysis results intuitively. Two complementary but independent tools for the analysis of oncogenomic data are presented here: IntOGen and Gitools.

IntOGen is a framework that collects public oncogenomic data and integrates it in different ways. Its main purpose is to identify genes that are consistently altered (up- or down-regulated) across many samples in a specific experiment, and then to combine all experiments from the same cancer type to obtain a p-value for each gene and cancer type. The same principle can be applied to gene modules, or sets, which are groups of genes that share a biological property (module analysis). IntOGen has a web page where the user can explore the datasets included in the database, from individual genes in all cancer types to different experiments, or gene modules (GO terms, KEGG pathways or user-defined groups of genes) across all the experiments.

Gitools is a desktop framework, also developed by the lab, for the analysis and visualization of genomic data. It supports several input formats (all plain text), and data can also be imported from BioMart, so everything stored there can be used directly in Gitools. There is also an IntOGen data importer, so users can download matrices or oncomodules at different levels (experiments or combined results) and use them directly. At present it performs a limited number of analyses (enrichment analysis, correlations, combination of results...), but it is built in a modular fashion and can easily be expanded to include more matrix-based statistical tests. It allows flexible exploration of the data and the creation of figures for papers directly from the tool, which can be exported in many different formats.

Two case studies illustrate the combined usefulness of these tools, aiming to answer two main questions: "what biological processes are enriched in genes significantly up-regulated in cancer?" and "what is the correlation between different tumour types for the pattern of up-regulated genes?". Real applications of these tools are also presented, from both published and unpublished research, stressing that they can be used not only in oncogenomics projects but also in studies of evolution and global gene regulation. In the near future Gitools will incorporate new analyses, such as GSEA and clustering, and connections with the R statistical framework. IntOGen will soon have a BioMart-compatible interface, which will make the data even more easily available.

TRANSCRIPT

Page 1: IntOGen & Gitools

IntOGen & Gitools

integration, visualization and data-mining of multidimensional oncogenomic data

Christian Pérez-Llamas
Master student

Biomedical Genomics
GRIB-UPF
April 2010

Page 2: IntOGen & Gitools

Outline
● Introduction
● Case study
● Real projects
● Conclusions
● Future work

Page 3: IntOGen & Gitools

Outline
● Introduction
● Case study
● Real projects
● Conclusions
● Future work

Page 4: IntOGen & Gitools

Gundem et al., Nature Methods 2010

Page 5: IntOGen & Gitools

Identification of cancer related genes

STEP 1: identification of driver alterations
[Figure: for each experiment (exp. 1, exp. 2, exp. 3, ..., exp. n), a samples × genes matrix marking each gene as altered or not altered]

STEP 2: combination of experiments
[Figure: the experiments of cancer type A are combined into one corrected p-value per gene; colour scale labelled "0.05 10"]

International Classification of Diseases from the World Health Organization
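
The slide states that per-experiment results are combined into one p-value per gene and cancer type, but it does not name the combination method. As a minimal sketch only, the block below uses Stouffer's weighted Z-method, which is an assumption made for illustration; the function name `combine_pvalues_stouffer` and the optional weights are hypothetical and not taken from IntOGen.

```python
from math import sqrt
from statistics import NormalDist


def combine_pvalues_stouffer(pvalues, weights=None):
    """Combine one-sided p-values for one gene across experiments.

    Assumption: Stouffer's weighted Z-method. The slides do not say
    which combination method IntOGen actually uses.
    """
    if not pvalues:
        raise ValueError("need at least one p-value")
    if weights is None:
        weights = [1.0] * len(pvalues)
    if len(weights) != len(pvalues):
        raise ValueError("weights and pvalues must have the same length")

    norm = NormalDist()
    tiny = 1e-12  # keep p strictly inside (0, 1) so inv_cdf is defined
    z_scores = []
    for p in pvalues:
        clipped = min(max(p, tiny), 1.0 - tiny)
        # A small p-value maps to a large positive z-score.
        z_scores.append(-norm.inv_cdf(clipped))

    weighted_sum = sum(w * z for w, z in zip(weights, z_scores))
    z_combined = weighted_sum / sqrt(sum(w * w for w in weights))
    # Convert the combined z-score back to a one-sided p-value.
    return norm.cdf(-z_combined)
```

Two experiments that each give a modest p-value for the same gene produce a smaller combined p-value, which is the behaviour the "combination of experiments" step relies on. The combined values would still need a multiple-testing correction to obtain the corrected p-value shown on the slide.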

Page 6: IntOGen & Gitools

Identification of modules significantly altered in cancer

Page 7: IntOGen & Gitools

www.intogen.org

Page 8: IntOGen & Gitools
Page 9: IntOGen & Gitools
Page 10: IntOGen & Gitools
Page 11: IntOGen & Gitools
Page 12: IntOGen & Gitools
Page 13: IntOGen & Gitools

www.gitools.org

Page 14: IntOGen & Gitools

Data · Analysis · Browse · Export

Page 15: IntOGen & Gitools

Many File Formats Supported

TSV, CDM, BDM, GMX, GMT, TCM

Data · Analysis · Browse · Export
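
Of the formats listed on this slide, GMT is a plain-text gene-set format in which each line holds a set name, a description, and then the member genes, all separated by tabs. The sketch below reads that layout into a dictionary of modules. It follows the widely used GMT convention; whether Gitools accepts exactly this variant is an assumption, and `parse_gmt` and the example identifiers are hypothetical names used only for illustration.

```python
def parse_gmt(lines):
    """Parse GMT-style lines into {module_name: set_of_gene_ids}.

    Expected layout per line (tab-separated):
        name <TAB> description <TAB> gene1 <TAB> gene2 ...
    """
    modules = {}
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        if len(fields) < 2:
            raise ValueError(f"malformed GMT line: {line!r}")
        name = fields[0]
        # Field 1 is the free-text description; genes start at field 2.
        modules[name] = {gene for gene in fields[2:] if gene}
    return modules


# Illustrative input only; the module names and gene ids are made up.
example_lines = [
    "module_A\tfirst example module\tgene1\tgene2\n",
    "\n",
    "module_B\tsecond example module\tgene1\n",
]
example_modules = parse_gmt(example_lines)
```

Because the function accepts any iterable of lines, an open file handle can be passed in directly.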

Page 16: IntOGen & Gitools

Import data from:

Data
● Genes significantly altered
● Modules of genes significantly altered

Levels
● Experiments
● Combinations

Alterations
● Upregulation
● Downregulation
● Gain
● Loss

Marts
● International Cancer Genome Consortium

Data · Analysis · Browse · Export

Page 17: IntOGen & Gitools

Data · Analysis · Browse · Export

Page 18: IntOGen & Gitools

Data · Analysis · Browse · Export

Page 19: IntOGen & Gitools

Data · Analysis · Browse · Export

Page 20: IntOGen & Gitools

Outline
● Introduction
● Case study
● Real projects
● Conclusions
● Future work

Page 21: IntOGen & Gitools

Case study

● What biological processes are enriched in genes significantly up-regulated in cancer?

● What is the correlation between different tumour types for the pattern of up-regulated genes?

Page 22: IntOGen & Gitools

Retrieving data for the analysis

• Biological Process

Page 23: IntOGen & Gitools

Importing data from IntOGen

Page 24: IntOGen & Gitools

Importing data from IntOGen

Page 25: IntOGen & Gitools

Importing data from IntOGen

Page 26: IntOGen & Gitools

Importing data from IntOGen

Page 27: IntOGen & Gitools

Importing data from IntOGen

Page 28: IntOGen & Gitools

Importing data from IntOGen

Page 29: IntOGen & Gitools

Importing data from IntOGen

Page 30: IntOGen & Gitools

Importing data from IntOGen

Page 31: IntOGen & Gitools

Importing modules from Ensembl

Page 32: IntOGen & Gitools

Importing modules from Ensembl

Page 33: IntOGen & Gitools

Importing modules from Ensembl

Page 34: IntOGen & Gitools

Importing modules from Ensembl

Page 35: IntOGen & Gitools

Importing modules from Ensembl

Page 36: IntOGen & Gitools

Importing modules from Ensembl

Page 37: IntOGen & Gitools

Importing modules from Ensembl

Page 38: IntOGen & Gitools

Importing modules from Ensembl

Page 39: IntOGen & Gitools

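Enrichment analysis

STEP 1: Transform to 1 the p-values < 0.05
[Figure: genes × tumor type i matrix of p-values, converted to a binary matrix]

STEP 2: Enrichment analysis
[Figure: the binary genes × tumor type i matrix is tested against biological modules (GO Biological processes, a genes × modules matrix); the result is one p-value per module and tumor type, colour scale labelled "0.05 10"]

Annotated genes in module M:

$X_i \sim \mathrm{Bin}(p_i)$

$H_0: p_m = p_i$

$H_1: p_m > p_i$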
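
The two steps on this slide can be sketched as follows: binarize the gene p-values of one tumor type at 0.05, then test whether the genes annotated to a module contain more 1s than the background proportion would predict. This is a minimal sketch under stated assumptions: the background proportion p_i is estimated from all genes of the tumor type, the test is a one-sided binomial upper tail, and the function names are hypothetical rather than Gitools API.

```python
from math import comb


def binarize(pvalues, threshold=0.05):
    """STEP 1: map each gene's p-value to 1 if p < threshold, else 0."""
    return {gene: 1 if p < threshold else 0 for gene, p in pvalues.items()}


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Bin(n, p), computed exactly."""
    return sum(comb(n, j) * p**j * (1.0 - p) ** (n - j) for j in range(k, n + 1))


def enrichment_pvalue(significant, module_genes):
    """STEP 2: one-sided binomial test for one module in one tumor type.

    significant  : dict gene -> 0/1 (output of binarize)
    module_genes : set of genes annotated to the module

    Assumption: the background proportion p_i is the fraction of 1s
    among all genes; H1 is that the module's proportion p_m is larger.
    """
    background = sum(significant.values()) / len(significant)
    annotated = [g for g in module_genes if g in significant]
    n = len(annotated)                          # annotated genes in module M
    k = sum(significant[g] for g in annotated)  # of which significant
    return binomial_upper_tail(k, n, background)
```

Running `enrichment_pvalue` for every module and every tumor type fills the modules × tumor type matrix of p-values that the slide shows as the result of STEP 2.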

Page 40: IntOGen & Gitools

Enrichment analysis

Page 41: IntOGen & Gitools

Enrichment analysis

Page 42: IntOGen & Gitools

Enrichment analysis

Page 43: IntOGen & Gitools

Enrichment analysis

Page 44: IntOGen & Gitools

Enrichment analysis

Page 45: IntOGen & Gitools

Enrichment analysis

Page 46: IntOGen & Gitools

Enrichment analysis

Page 47: IntOGen & Gitools

Enrichment analysis

Page 48: IntOGen & Gitools

Correlations

Page 49: IntOGen & Gitools

Correlations

Page 50: IntOGen & Gitools

Correlations

Page 51: IntOGen & Gitools

Correlations

Page 52: IntOGen & Gitools

Correlations
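
The correlation slides compare tumor types by their pattern of up-regulated genes. As a minimal sketch, the block below computes a pairwise correlation matrix between tumor-type columns that share the same gene order. The slides do not say which coefficient is used, so Pearson correlation is an assumption here, and `pearson` and `correlation_matrix` are hypothetical helper names.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("vectors must have the same length (>= 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Pairwise correlations between tumor-type columns.

    columns: dict tumor_type -> list of per-gene values, with every
    list following the same gene order.
    """
    names = list(columns)
    return {(a, b): pearson(columns[a], columns[b]) for a in names for b in names}
```

The resulting dictionary is keyed by pairs of tumor types, so it maps directly onto a square heatmap with tumor types on both axes.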

Page 53: IntOGen & Gitools

Outline
● Introduction
● Case study
● Real projects
● Conclusions
● Future work

Page 54: IntOGen & Gitools

Real projects

● RBP2 function

● Functional protein divergence

● Study of altered regulatory programs in cancer

● Stress response genes and transition into increased malignant states

● Comparison of alteration patterns among tumor types

Functional Enrichment of RBP2 targets at different time points of differentiation

Lopez-Bigas et al., Molecular Cell 2008

RBP2

Page 55: IntOGen & Gitools

Real projects

● RBP2 function

● Functional protein divergence

● Study of altered regulatory programs in cancer

● Stress response genes and transition into increased malignant states

● Comparison of alteration patterns among tumor types

Lopez-Bigas et al., Genome Biology 2008

Page 56: IntOGen & Gitools

Outline
● Introduction
● Case study
● Real projects
● Conclusions
● Future work

Page 57: IntOGen & Gitools

Conclusions

● IntOGen is a novel framework for oncogenomics data integration

● IntOGen.org is a discovery tool for cancer researchers

● Gitools main features are:
    ● Interactive heatmap
    ● Import from Biomart
    ● Import from IntOGen
    ● Command line option

Page 58: IntOGen & Gitools

Future work
● Biomart compatible interface for IntOGen
● Implement more analyses:
    ● GSEA
    ● Clustering
    ● Modules hierarchy aware enrichment like GOstats
    ● Connection with R
● Implement more editors:
    ● Table and modules editor

Page 59: IntOGen & Gitools

Acknowledgements

Nuria López-Bigas

Gunes Gundem

Jordi Deu-Pons

Khademul Islam

Michael Schroeder

Alba Jené-Sanz

Xavier Rafael

Remember to visit
www.intogen.org
www.gitools.org