introduction to functional analysis -...

Post on 24-Sep-2019

7 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

Daniel Rico, PhD. drico@cnio.es

::: Introduction to Functional Analysis

Course on Functional Analysis

Madrid, November 6th, 2012.

::: Schedule.

1.  Biological (Functional) Databases 2.  Threshold-based and threshold free methods 3.  Threshold-based example 1: FatiGO. 4.  Threshold-based example 2: DAVID.

Arabidopsis thaliana

Homo sapiens

Mus musculus

Rattus norvegicus

Drosophila melanogaster

Caenorhabditis elegans

Saccharmoyces cerevisae

Gallus gallus

Danio rerio

HGNC symbol

EMBL acc

RefSeq

PDB

Protein Id

IPI….

Genes IDs

Gene Ontology Biological Process Molecular Function Cellular Component

UniProt/Swiss-Prot

UniProtKB/TrEMBL

Ensembl IDs

EntrezGene

Affymetrix

Agilent

KEGG pathways

Regulatory elements miRNA

CisRed

Transcription Factor Binding Sites

REACTOME pathways

InterPro Motifs

Bioentities from literature:

Diseases terms Chemical terms

Gene Expression in tissues

Keywords Swissprot

Biological databases

Gene Ontology CONSORTIUM h"p://www.geneontology.org

•  The objective of GO is to provide controlled vocabularies for the description of the molecular function, biological process and cellular component of gene products.

•  These terms are to be used as attributes of gene products by

collaborating databases, facilitating uniform queries across them.

•  The controlled vocabularies of terms are structured  

GO structure The three categories of GO Molecular Function

the tasks performed by individual gene products; examples are transcription factor and DNA helicase

Biological Process

broad biological goals, such as mitosis or purine metabolism, that are accomplished by ordered assemblies of molecular functions

Cellular Component

subcellular structures, locations, and macromolecular complexes; examples include nucleus, telomere, and origin recognition complex

GO tree structure

IS_A relation

PART_OF relation

Pathways  

What  is  a  Biological  pathway?  

ª  an  ordered   series  of  molecular   events   that   leads   to   the   crea:on  new  molecular  product,  or  a  change  in  a  cellular  state  or  process.      

Boehringer  Mannheim/Roche  "Biochemical  Pathways"  wall  chart  (1968)  

h6p://www.genome.ad.jp/kegg/pathway.html  

h6p://www.biocarta.com/genes/index.asp  

h6p://www.reactome.org/  

h6p://www.pathwaycommons.org  

h6p://www.whichgenes.org/  

::: Schedule.

1.  Biological (Functional) Databases 2.  Threshold-based and threshold free methods 3.  Threshold-based example 1: FatiGO. 4.  Threshold-based example 2: DAVID.

The two-steps approach

•  Genes of interest are selected using the experimental value.

•  Selected genes are compared to the background.

Threshold-based functional analysis

Study the enrichment in functional

terms in groups of genes defined by the experimental value.

FatiGO

GOminer

DAVID

Marmite

Threshold-free functional analysis

Select genes taking into account

their functional properties.

FatiScan

GSEA

MarmiteScan

•  Under a systems biology

perspective. •  Detect blocks of functionally

related genes.

Class1        Class2  

6est  cut-­‐off  

FDR<0.05  

FDR<0.05  

Biological  meaning?  

Threshold-based functional analysis  

ES/NES  staOsOc  

-­‐  

+  

Class1        Class2  

Gene    Set  1  

6est  cut-­‐off  

Gene    Set  2  

Gene    Set  3  

Gene  set  3  enriched  in  Class  2  

Gene  set  2  enriched  in  Class  1  

Threshold-free functional analysis  

Threshold-­‐based  funcOonal  analysis  

 Comparing  a  gene  list  of  interest  with  

a  background  gene  list  

6est  cut-­‐off  

FDR<0.05  

FDR<0.05  

List  1  

List  2  (background)  

Class1        Class2  

List  1b  /  List  2b  

Testing the distribution of GO terms among two lists of genes

Biosynthesis 60% Biosynthesis 20%

Sporulation 20% Sporulation 20%

List 1

List 2

(background)

Genes in List 1 are significantly enriched in biosynthesis, but not in sporulation.

Are this two groups of genes

carrying out different

biological roles?

8 4 No biosynthesis 2 6 Biosynthesis B A

§  List1: genes of interest (they are significantly over- or under-expressed when two classes of experiments are compared, co-located in the chromosomes, etc.)

§  List2:the background (typically the rest of genes). §  Select suitable database, Run...

List2

Remove genes repeated in list1

Remove genes repeated between

both lists

Remove genes repeated in list2

Extract functional

terms

Comparing groups of genes

List1 “clean” List1

“clean” List2

DAVID/BABELOMICS

GO

KEGG Interpro

KW Bioentities

Gene Expression

TF ...

011000101010101001 ...... 11001010 ........... 010001010 ........... 0110001010 ........... 1111001111...............

Matrix of functional

terms Fisher´s test/

EASE test

Adjust p-value by FDR

Significant functional

terms

DAVID  

http://david.abcc.ncifcrf.gov/

Click in here

Explore the different annotations you wish to use

Try Clustering, Chart and Table... what are the differences?

::: How the functional profiling should never be done

 It  is  not  uncommon  to  find  the  following  asserOon  in  papers  and  talks:  “then  we  examined  our  set  of  genes  selected  in  this  way  (whatever)  and  we  discover  that  65%  of  them  were  related  to  metabolism,  so  we  can  conclude  that  our  experiment  acOvates  metabolism  genes”.    

AnnotaOon  is  not  a  funcOonal    enrichment  analysis  result!!!  

::: BioGPS: A nice tool to explore gene annotation !!

http://biogps.org/

FaOGO  

http://babelomics.bioinfo.cipf.es/

http://bioinfo.cipf.es/babelomicstutorial/enrichment_analysis

More  examples  in:  

http://www.reactome.org/ReactomeGWT/entrypoint.html#PathwayAnalysisDataUploadPage http://www.reactome.org/userguide/Usersguide.html#Pathway_Analysis

New  tool  integrated  in  REACTOME  

Many  of  these  slides  have  been  taken  and  adapted  from  originals  by:    Gonzalo  Gómez    Babelomics,  DAVID  and  Reactome  teams.      

ACKNOWLEDGEMENTS  

top related