netbiosig2012 eriksonnhammer

Erik Sonnhammer Stockholm Bioinformatics Centre Science for Life Laboratory Dept. Biochemistry and Biophysics Stockholm University Comparative Interactomics with FunCoup 2.0

Upload: alexander-pico

Post on 10-May-2015


TRANSCRIPT

Page 1: NetBioSIG2012 eriksonnhammer

Erik Sonnhammer

Stockholm Bioinformatics Centre, Science for Life Laboratory

Dept. Biochemistry and Biophysics, Stockholm University

Comparative Interactomics with FunCoup 2.0

Page 2: NetBioSIG2012 eriksonnhammer

How to map the human interactome?

• Genes: ~22000
• Interactions: 100000-300000?
• Known direct interactions: ~74000 (IntAct)
• Experiments have high false negative and false positive rates.
• → Most interactions need to be inferred combinatorially

Page 3: NetBioSIG2012 eriksonnhammer

FunCoup: Predicting Functional Coupling Between Genes/Proteins Using Genomics Data and Orthology

• Alexeyenko et al., NAR 40:D821 (2012)

• Alexeyenko & Sonnhammer, Genome Research 19:1107 (2009)

Page 4: NetBioSIG2012 eriksonnhammer

FunCoup

[Figure: evidence types integrated by FunCoup]

• Protein-protein interactions
• Co-expression patterns
• Phylogenetic profiles
• Domain interactions
• Shared transcription factor binding
• Shared miRNA targeting
• Subcellular co-localisation
• Genetic interactions
• Orthology (other organisms)

Page 5: NetBioSIG2012 eriksonnhammer

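Naïve Bayesian training

[Figure: a continuous variable is split into discrete categories, and the links in each category are scored against the reference datasets]

• Continuous variable
• Discrete categories
• Extract links
• Test against positive and "negative" reference datasets
• Calculate enrichment as likelihood ratio = P(+) / P(-)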
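The training steps on this slide can be sketched roughly as follows. This is a minimal illustration, not FunCoup's actual code: the bin edges, the pseudocount, the toy scores, and the function names are all assumptions made for the example.

```python
from bisect import bisect_right
from collections import Counter


def bin_index(value, edges):
    """Map a continuous value to a discrete bin, given sorted inner bin edges."""
    return bisect_right(edges, value)


def likelihood_ratios(pos_values, neg_values, edges, pseudocount=1.0):
    """Return P(bin | positive set) / P(bin | negative set) for every bin.

    The pseudocount (an assumption of this sketch) avoids division by zero
    when a bin is empty in one of the reference sets.
    """
    n_bins = len(edges) + 1
    pos_counts = Counter(bin_index(v, edges) for v in pos_values)
    neg_counts = Counter(bin_index(v, edges) for v in neg_values)
    pos_total = len(pos_values) + pseudocount * n_bins
    neg_total = len(neg_values) + pseudocount * n_bins
    ratios = []
    for b in range(n_bins):
        p_pos = (pos_counts[b] + pseudocount) / pos_total
        p_neg = (neg_counts[b] + pseudocount) / neg_total
        ratios.append(p_pos / p_neg)
    return ratios


# Toy data: scores of gene pairs in the positive and "negative" reference sets.
edges = [0.0, 0.6]                      # hypothetical bin edges
pos = [0.7, 0.8, 0.9, 0.65, 0.3]
neg = [-0.5, -0.2, 0.1, 0.2, 0.7]
ratios = likelihood_ratios(pos, neg, edges)
```

A ratio above 1 means the bin is enriched among positive reference pairs; taking its logarithm gives the per-bin score that is summed at prediction time.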

Page 6: NetBioSIG2012 eriksonnhammer

FunCoup prediction of 1 link

[Figure: each raw data set is converted to its own Bayesian LLR score]

Raw data → Bayesian LLR score (one per data set)

Sum of LLR scores → Confidence value pfc

Page 7: NetBioSIG2012 eriksonnhammer

Naïve Bayesian training

• Training:
– Learn log likelihood ratios (LLRs) for each individual evidence bin
– When predicting, sum all the LLRs to a full Bayesian score (FBS).

\[ FBS(\varepsilon) = \sum_{i=1}^{|\varepsilon|} \log \frac{P(E_{ij} \mid FC)}{P(E_{ij})} \]

FC: Functional coupling
ε: Set of evidences
E_ij: Evidence i, bin j
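As a rough sketch of the prediction step, the FBS for one gene pair is the sum of the LLRs of the bins its evidence falls into. The evidence names, the probabilities, the use of the natural logarithm, and the choice to let missing evidence contribute 0 are assumptions of this example, not details stated on the slide.

```python
import math

# Hypothetical per-bin probabilities for two evidence types.
p_given_fc = {"MEX_human": [0.2, 0.3, 0.5], "PPI_human": [0.6, 0.4]}
p_background = {"MEX_human": [0.5, 0.3, 0.2], "PPI_human": [0.9, 0.1]}

# LLR for evidence i, bin j: log(P(E_ij | FC) / P(E_ij)).
llr_table = {
    name: [math.log(a / b) for a, b in zip(p_given_fc[name], p_background[name])]
    for name in p_given_fc
}


def full_bayesian_score(observed_bins, llr_table):
    """Sum the LLRs of the observed bins; evidence without data adds nothing."""
    return sum(
        llr_table[name][b]
        for name, b in observed_bins.items()
        if b is not None
    )


# One gene pair: falls in bin 2 of the co-expression data and bin 1 of the PPI data.
fbs = full_bayesian_score({"MEX_human": 2, "PPI_human": 1}, llr_table)
fbs_partial = full_bayesian_score({"MEX_human": 2, "PPI_human": None}, llr_table)
```

Because the score is additive, each evidence type's contribution can be inspected on its own.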

Page 8: NetBioSIG2012 eriksonnhammer

4 training datasets → 4 different types of functional coupling

• Metabolic pathway (KEGG)

• Signalling pathway (KEGG)

• Physical protein-protein interaction

• Complex member

Page 9: NetBioSIG2012 eriksonnhammer

FunCoup training

[Figure: INPUT DATA: evidence types MEX, MIR, SCL, PPI, PEX, PHP, TFB and DOM per species (Human, Mouse, Rat, Fly, Worm, Yeast, Plant), scale 10^3 to 10^7. TRAINING SETS: FC-PI, FC-CM, FC-ML and FC-SL per species, scale 5000 to 25000. Both are combined in the BAYESIAN FRAMEWORK.]

Page 10: NetBioSIG2012 eriksonnhammer

Raw data metrics on CDC2 – KPNB1

• Fly MEX (Li and White, 2003): PLC = 0.42
• Rat MEX (Di Giovanni et al., 2004): PLC = 0.48
• Mouse SCL (UniProt, ESLDB): WMI = 0.04
• Mouse MEX (Zapala et al., 2005): PLC = 0.70
• Mouse MEX (Su et al., 2004): PLC = -0.01
• Mouse MEX (Siddiqui et al., 2005): PLC = 0.56
• Mouse MEX (Hutton et al., 2004): PLC = 0.61
• Human PPI (IntAct, HPRD, BIND): PPI score = 0.17
• Human MEX (Su et al., 2004): PLC = 0.60
• …

FC-PI model: FBS_PI = 0+0-0.6+1.2-0.4+0.2+1.2+6.3+1.4… = 11.2

FC-CM model, FC-SL model, FC-ML model:

ΣSL = 0+0-0.6+1.2-0.4+0.2+1.2+6.8+1.4 = 7.9
ΣSL = 0+0-0.6+1.2-0.4+0.2+1.2+6.8+1.4 = 5.8
ΣSL = 0+0-0.6+1.2-0.4+0.2+1.2+6.8+1.4 = 5.5

(pfc scores)

Page 11: NetBioSIG2012 eriksonnhammer

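FBS score and pfc confidence

\[ pfc(\varepsilon) = \frac{P(FC) \prod_{i=1}^{|\varepsilon|} P(E_{ij} \mid FC)}{P(FC) \prod_{i=1}^{|\varepsilon|} P(E_{ij} \mid FC) + \prod_{i=1}^{|\varepsilon|} P(E_{ij})} \]

\[ FBS(\varepsilon) = \sum_{i=1}^{|\varepsilon|} \log \frac{P(E_{ij} \mid FC)}{P(E_{ij})} \]

FC: Functional coupling
ε: Set of evidences
E_ij: Evidence i, bin j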
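Dividing the numerator and denominator of the pfc expression on this slide by the product of the P(E_ij) terms gives pfc = P(FC)·exp(FBS) / (P(FC)·exp(FBS) + 1), provided the logarithm in FBS is the natural one (an assumption; the slide does not state the base). The sketch below checks that both forms agree on toy numbers; the prior value and probabilities are made up for illustration.

```python
import math


def pfc_from_fbs(fbs, prior_fc):
    """Convert a full Bayesian score into a confidence, assuming natural-log LLRs."""
    weighted = prior_fc * math.exp(fbs)
    return weighted / (weighted + 1.0)


def pfc_from_probabilities(p_given_fc, p_background, prior_fc):
    """Evaluate the pfc expression directly from the per-evidence probabilities."""
    numerator = prior_fc * math.prod(p_given_fc)
    return numerator / (numerator + math.prod(p_background))


# Toy values: two pieces of evidence and a hypothetical prior P(FC).
p_given_fc = [0.5, 0.4]
p_background = [0.2, 0.1]
prior_fc = 0.01

fbs = sum(math.log(a / b) for a, b in zip(p_given_fc, p_background))
pfc_a = pfc_from_fbs(fbs, prior_fc)
pfc_b = pfc_from_probabilities(p_given_fc, p_background, prior_fc)
```

For very large scores `math.exp` can overflow, so a production version would need a numerically safer formulation; this sketch only illustrates the algebra.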

Page 12: NetBioSIG2012 eriksonnhammer

The total human FunCoup 2.0 network

[Figure: Nr of links (axis from 0 to 5,000,000) at confidence cutoffs 0.1, 0.25 and 0.75]

Page 13: NetBioSIG2012 eriksonnhammer

Nr of links at pfc cutoffs

[Figure: # links (axis from 0 to 10,000,000) against pfc cutoff (0 to 0.95 in steps of 0.05), one curve per species: H. sapiens, M. musculus, R. norvegicus, C. familiaris, D. rerio, C. intestinalis, D. melanogaster, C. elegans, G. gallus, A. thaliana]
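A curve like the one on this slide can be produced by counting, for each cutoff, the links whose pfc is at least that value. The snippet below is a small sketch with made-up pfc values; treating the cutoff as inclusive is an assumption.

```python
from bisect import bisect_left


def links_at_cutoffs(pfc_values, cutoffs):
    """Count links with pfc >= cutoff for every cutoff, using one sort."""
    ordered = sorted(pfc_values)
    total = len(ordered)
    return {c: total - bisect_left(ordered, c) for c in cutoffs}


# Toy pfc values for six predicted links.
toy_pfc = [0.05, 0.12, 0.3, 0.8, 0.95, 0.25]
counts = links_at_cutoffs(toy_pfc, [0.1, 0.25, 0.75])
```

The counts can only stay equal or decrease as the cutoff rises, which is why each species curve falls from left to right.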

Page 14: NetBioSIG2012 eriksonnhammer

Comparison to STRING

• FunCoup on average 75% larger (based on all links)

[Figure: number of links (axis from 0 to 5,000,000) in FunCoup 2.0 and STRING 9.0 for A. thaliana, C. elegans, C. familiaris, C. intestinalis, D. melanogaster, D. rerio, G. gallus, H. sapiens, M. musculus, R. norvegicus and S. cerevisiae]

Page 15: NetBioSIG2012 eriksonnhammer

Support from species and evidence type

MEX: mRNA co-expression

PHP: phylogenetic profile similarity

PPI: protein–protein interaction

SCL: sub-cellular co-localization

MIR: co-miRNA regulation by shared miRNA targeting

DOM: domain interactions

PEX: protein co-expression

TFB: shared transcription factor binding

GIN: genetic interaction profile similarity

Page 16: NetBioSIG2012 eriksonnhammer

Validation: Recovering cancer pathways

• 36 signalling links in RTK/RAS/PI(3)K, p53, and RB signalling pathways (TCGARN, Science 2008).

• FunCoup predicted 29 of 36 links.

• 25 more links found.

Page 17: NetBioSIG2012 eriksonnhammer

Independent validation:Recovering tumour mutation sets

• Lists of genes co-mutated in glioblastoma tumours (The Cancer Genome Atlas).

• 6 of 9 lists (>= 10 genes) enriched (p < 10^-3) with internal FunCoup connections compared to random networks (preserving degree distribution).
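One common way to build random networks that preserve the degree distribution is repeated double-edge swapping; whether FunCoup's validation used exactly this procedure is not stated on the slide, so the sketch below is an assumption. The toy network, gene names, and parameter values are illustrative only.

```python
import random


def internal_links(edges, genes):
    """Count links whose two endpoints both belong to the gene list."""
    members = set(genes)
    return sum(1 for a, b in edges if a in members and b in members)


def degree_preserving_shuffle(edges, n_swaps, rng):
    """Rewire by swapping endpoints of random edge pairs; node degrees stay fixed."""
    edge_list = [tuple(e) for e in edges]
    present = {frozenset(e) for e in edge_list}
    done = attempts = 0
    while done < n_swaps and attempts < 100 * n_swaps:
        attempts += 1
        i, j = rng.sample(range(len(edge_list)), 2)
        a, b = edge_list[i]
        c, d = edge_list[j]
        if rng.random() < 0.5:
            c, d = d, c
        if len({a, b, c, d}) < 4:
            continue  # swap would create a self-loop
        new1, new2 = frozenset((a, d)), frozenset((c, b))
        if new1 in present or new2 in present:
            continue  # swap would create a duplicate link
        present.difference_update({frozenset((a, b)), frozenset((c, d))})
        present.update({new1, new2})
        edge_list[i], edge_list[j] = (a, d), (c, b)
        done += 1
    return edge_list


def enrichment_p_value(edges, genes, n_random=200, seed=0):
    """Empirical p-value: share of random networks with at least as many internal links."""
    rng = random.Random(seed)
    observed = internal_links(edges, genes)
    hits = 0
    for _ in range(n_random):
        shuffled = degree_preserving_shuffle(edges, 10 * len(edges), rng)
        if internal_links(shuffled, genes) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_random + 1)


# Toy network: a 4-gene clique attached to a 20-node chain.
clique = [("g1", "g2"), ("g1", "g3"), ("g1", "g4"),
          ("g2", "g3"), ("g2", "g4"), ("g3", "g4")]
chain = [(f"x{i}", f"x{i + 1}") for i in range(19)]
toy_edges = clique + chain + [("g1", "x0"), ("g4", "x10")]
observed, p_value = enrichment_p_value(toy_edges, ["g1", "g2", "g3", "g4"], n_random=50, seed=1)
```

With this estimator the smallest reachable p-value is 1 / (n_random + 1), so resolving p < 10^-3 requires at least 1000 random networks.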

Page 18: NetBioSIG2012 eriksonnhammer

FunCoup applications

• Cross-talk between groups
• Find novel interactions
• Find network modules
• Extend pathways
• Find novel disease genes

Page 19: NetBioSIG2012 eriksonnhammer

http://FunCoup.sbc.su.se

ASPM - Abnormal spindle-like microcephaly-associated protein

ASPM

Page 20: NetBioSIG2012 eriksonnhammer
Page 21: NetBioSIG2012 eriksonnhammer

Data details

Page 22: NetBioSIG2012 eriksonnhammer

Klammer M, Roopra S, Sonnhammer EL. "jSquid: a Java applet for graphical on-line network exploration". Bioinformatics 2008, 24:1467

Page 23: NetBioSIG2012 eriksonnhammer
Page 24: NetBioSIG2012 eriksonnhammer
Page 25: NetBioSIG2012 eriksonnhammer
Page 26: NetBioSIG2012 eriksonnhammer

Comparative interactomics

New in FunCoup 2.0 – ensures true conservation

Page 27: NetBioSIG2012 eriksonnhammer
Page 28: NetBioSIG2012 eriksonnhammer

Human presenilin in worm

Page 29: NetBioSIG2012 eriksonnhammer
Page 30: NetBioSIG2012 eriksonnhammer

RNA-polymerase II subunits: yeast-all

Page 31: NetBioSIG2012 eriksonnhammer

Comparative interactomics: Applications

• Hypothesis testing
– Is a given pathway/complex conserved in another species?

• New discoveries
– Finding ortholog pairs with conserved functional coupling: very strong evidence for functional conservation
– Can also find conservation that is not strictly 4-way: