reconstructing dynamic regulatory networks in …02710/lectures/dremreg.pdfmethods for...

29
Reconstructing dynamic regulatory networks in multiple species 02-710 Computational Genomics

Upload: others

Post on 12-Aug-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Reconstructing dynamic regulatory networks in …02710/Lectures/dremReg.pdfMethods for reconstructing networks in cells Amit et al Science 2009 Gerstein et al Science 2010 SLT2 CRH1

Reconstructing dynamic regulatory

networks in multiple species

02-710

Computational Genomics

Page 2: Reconstructing dynamic regulatory networks in …02710/Lectures/dremReg.pdfMethods for reconstructing networks in cells Amit et al Science 2009 Gerstein et al Science 2010 SLT2 CRH1

Methods for reconstructing

networks in cells

Amit et al

Science 2009

Gerstein et al Science 2010

SLT2

CRH1

YPS3 YPS1

SLR3

Pe’er et al Recomb 2001, Segal

et al Nature Genetics 2003

Page 3: Reconstructing dynamic regulatory networks in …02710/Lectures/dremReg.pdfMethods for reconstructing networks in cells Amit et al Science 2009 Gerstein et al Science 2010 SLT2 CRH1

Key problem: Most high-

throughput data is static

DNA

motif CHIP-chip

PPI microarray

Static data sources Time-series measurements

Time

Page 4: Reconstructing dynamic regulatory networks in …02710/Lectures/dremReg.pdfMethods for reconstructing networks in cells Amit et al Science 2009 Gerstein et al Science 2010 SLT2 CRH1

Method: Integrating time series

expression and static protein-DNA

interaction data

Page 5: Reconstructing dynamic regulatory networks in …02710/Lectures/dremReg.pdfMethods for reconstructing networks in cells Amit et al Science 2009 Gerstein et al Science 2010 SLT2 CRH1

TF C

time

Expression

Level

Model Structure

time

1

0.1

0.9

1

0.95

0.05

Expression

Level

Time Series Expression Data Static TF-DNA Binding Data

IOHMM Model

TF A

TF B

TF D

?

?

a b

c d

Page 6: Reconstructing dynamic regulatory networks in …02710/Lectures/dremReg.pdfMethods for reconstructing networks in cells Amit et al Science 2009 Gerstein et al Science 2010 SLT2 CRH1

Things are a bit more

complicated: Real data

Page 7: Reconstructing dynamic regulatory networks in …02710/Lectures/dremReg.pdfMethods for reconstructing networks in cells Amit et al Science 2009 Gerstein et al Science 2010 SLT2 CRH1

A Hidden Markov Model

T

t

tt

n

i

T

t

tt iHiHpiHiOpOHL2

1

1 1

))(|)(())(|)(();,(

Hidden States

Observed outputs

(expression levels)

t=0 t=1 t=2 t=3

H0 H1 H2 H3

O0 O1 O2 O3

Schliep et al Bioinformatics 2003

1

Page 8: Reconstructing dynamic regulatory networks in …02710/Lectures/dremReg.pdfMethods for reconstructing networks in cells Amit et al Science 2009 Gerstein et al Science 2010 SLT2 CRH1

Sum over

all genes

Sum over

all paths Q

Product over all

Gaussian emission

density values on path

Product over all transition probabilities on path

Input – Output Hidden

Markov Model Input (Static transcription factor-

gene interactions)

Hidden States Variables (We constrain transitions

between states to form a

tree structure)

Output State Variables (Gaussian distribution for

expression values)

Ig

t=0 t=1 t=2 t=3

H0 H1 H2 H3

O0 O1 O2 O3

Bengio and Frasconi, NIPS 1995

Log Likelihood

Page 9: Reconstructing dynamic regulatory networks in …02710/Lectures/dremReg.pdfMethods for reconstructing networks in cells Amit et al Science 2009 Gerstein et al Science 2010 SLT2 CRH1

Results: Yeast response

pathways

Page 10: Reconstructing dynamic regulatory networks in …02710/Lectures/dremReg.pdfMethods for reconstructing networks in cells Amit et al Science 2009 Gerstein et al Science 2010 SLT2 CRH1

Application to AA starvation in

yeast with condition specific data Expression data: Gasch et al Mol. Bio. Cell. 2000,

Chip-chip data: Harbison et al Nature 2004

Ernst et al Nature MSB 2007

Page 11: Reconstructing dynamic regulatory networks in …02710/Lectures/dremReg.pdfMethods for reconstructing networks in cells Amit et al Science 2009 Gerstein et al Science 2010 SLT2 CRH1

Application to AA starvation in

yeast with condition specific data

Amino acid transport 10-9

Ribosome biogenesis and assembly 4*10-21

Protein biosynthesis 3*10-72

Cellular Carbohydrate Metabolism 4*10-13

Nitrogen Compound Metabolism 5*10-23

Nucleotide Biosynthesis 2*10-12

Page 12: Reconstructing dynamic regulatory networks in …02710/Lectures/dremReg.pdfMethods for reconstructing networks in cells Amit et al Science 2009 Gerstein et al Science 2010 SLT2 CRH1

Application to AA starvation in

yeast with general binding

and motif data

new predictions

Page 13: Reconstructing dynamic regulatory networks in …02710/Lectures/dremReg.pdfMethods for reconstructing networks in cells Amit et al Science 2009 Gerstein et al Science 2010 SLT2 CRH1

Validating predicted interactions

for Ino4 Ino4 Occupancy in Gene Promoter Region at 0h and

4h

0

2

4

6

8

10

12

YDR497C YNL169C YGR196C YHR123W

Gene

Inte

ns

ity

SCD

AA Starvation

Page 14: Reconstructing dynamic regulatory networks in …02710/Lectures/dremReg.pdfMethods for reconstructing networks in cells Amit et al Science 2009 Gerstein et al Science 2010 SLT2 CRH1

Validating predicted interactions

for Ino4 AA vs. SCD Binding for Ino4 Bound Genes on

Response Path (Repeat 1)

0

0.5

1

1.5

2

2.5

3

3.5

4

4.5

5

0 1 2 3 4 5

SCD Repeat 1: -log base 10 p-value

AA

Rep

eat

1:

-lo

g b

ase 1

0

p-v

alu

e

• Ino4 regulates phospholipid biosynthesis

• Many genes in the path are known lipid metabolism genes (GO p-value

6*10-5).

• May be connected to the need of membrane used for the

autophagocytosis process which regulates equilibrium between proteins

and the diminishing set of amino acids

Page 15: Reconstructing dynamic regulatory networks in …02710/Lectures/dremReg.pdfMethods for reconstructing networks in cells Amit et al Science 2009 Gerstein et al Science 2010 SLT2 CRH1

1 2

3

4

5

6 7 8

9

Stress and hormone response

MSB 2007, PLoS

Comp. Bio. 2008,

eLife 2013

MSB 2011

IRF7

Fly development Science 2010

Genome Research 2010,

PLoS ONE 2011, Nature

Immunology 2013, ISMB

2013

Immune response

Stem cells differentiation

Page 16: Reconstructing dynamic regulatory networks in …02710/Lectures/dremReg.pdfMethods for reconstructing networks in cells Amit et al Science 2009 Gerstein et al Science 2010 SLT2 CRH1

microRNAs

Page 17: Reconstructing dynamic regulatory networks in …02710/Lectures/dremReg.pdfMethods for reconstructing networks in cells Amit et al Science 2009 Gerstein et al Science 2010 SLT2 CRH1

• Pearson-correlation (Cheng et al. 2008,

Huang et al. 2011)

• Regression with graphical models,

GenMiR++ (Huang et al. 2007)

expre

ssio

n

time

miRNA-target detection using expression data

expre

ssio

n

time

global association no association

miRNA

mRNA

time 1-2 time 2-3 time 3-4

miR1

mRNA1

expre

ssio

n

time

time-specific association

time is not implicitely modeled

no prediction of time-specific regulation

Disadvantages

methods often assume linear correlation

joint modeling of TF regulation seldom possible

Page 18: Reconstructing dynamic regulatory networks in …02710/Lectures/dremReg.pdfMethods for reconstructing networks in cells Amit et al Science 2009 Gerstein et al Science 2010 SLT2 CRH1

- Incorporate miRNA expression ratios with logistic function to generate

dynamic input map for miRNAs

-Enforce positivity constraints for miRNAs coefficients in the logistic

regression model (convex optimization, still global optimum)

Incorporating miRNAs into DREM

Page 19: Reconstructing dynamic regulatory networks in …02710/Lectures/dremReg.pdfMethods for reconstructing networks in cells Amit et al Science 2009 Gerstein et al Science 2010 SLT2 CRH1

Dynamic regulaotry models for

lung development in mice

- Down regulated miRNA

- Up regulated miRNA

Page 20: Reconstructing dynamic regulatory networks in …02710/Lectures/dremReg.pdfMethods for reconstructing networks in cells Amit et al Science 2009 Gerstein et al Science 2010 SLT2 CRH1

Validating miRNAs controling Week

1 – Week 2 transition miRNA sign. paths

opposite

direction

corrected

enrichment

p-values

miR-125a-5p A,B,C+E 0.0108,0.00008,0.0

3872

miR-337-5p B,C+E 0.0332, 0.00008

miR-467c D < 10-6

miR-466a-3p D 0.05152

miR-466d-3p D 0.03904

miR-30d H 0.01456

miR-30a - -

miR-23b - -

Page 21: Reconstructing dynamic regulatory networks in …02710/Lectures/dremReg.pdfMethods for reconstructing networks in cells Amit et al Science 2009 Gerstein et al Science 2010 SLT2 CRH1

From development to disease Analysis of significant miRNAs in

patients with Idiopathic Pulmonary

Fibrosis (IPF)

10 control

vs 10 IPF

(TissueBank

)

28

control

vs 33

IPF

(Tissue

Bank)

142 control vs

162 IPF

(LGRC)

bleomycin

at 14d ( 3

replicates)

Liu et al

data J. Ex.

Med. 2010 miR-125a down - - down miR-30a down down down down miR-30d down down down down miR-467c - - - - miR-337 - up up - miR-466a-

3p - - - down

miR-466d-

3p - - - down

Schulz et al PNAS 2013

Page 22: Reconstructing dynamic regulatory networks in …02710/Lectures/dremReg.pdfMethods for reconstructing networks in cells Amit et al Science 2009 Gerstein et al Science 2010 SLT2 CRH1

Analysis of significant miRNAs in

patients with Idiopathic Pulmonary

Fibrosis (IPF)

10 control

vs 10 IPF

(TissueBank

)

28

control

vs 33

IPF

(Tissue

Bank)

142 control vs

162 IPF

(LGRC)

bleomycin

at 14d ( 3

replicates)

Liu et al

data J. Ex.

Med. 2010 miR-125a down - - down miR-30a down down down down miR-30d down down down down miR-467c - - - - miR-337 - up up - miR-466a-

3p - - - down

miR-466d-

3p - - - down

Schulz et al PNAS 2013

From development to disease

Development

Disease

Page 23: Reconstructing dynamic regulatory networks in …02710/Lectures/dremReg.pdfMethods for reconstructing networks in cells Amit et al Science 2009 Gerstein et al Science 2010 SLT2 CRH1

E. coli

Page 24: Reconstructing dynamic regulatory networks in …02710/Lectures/dremReg.pdfMethods for reconstructing networks in cells Amit et al Science 2009 Gerstein et al Science 2010 SLT2 CRH1

Escherichia coli

Regulatory Network

• Comprehensive genome wide binding data is currently not available for

most transcription factors

• Key transcription factor activate some genes and repress others

– Direction of interaction should also be part of the input to DREM

• 25% of genes have at least one known regulator based on curated

small scale experiments

– No confirmed negative data

• Expression and motif data also available

E. coli Image from Wikipedia

Page 25: Reconstructing dynamic regulatory networks in …02710/Lectures/dremReg.pdfMethods for reconstructing networks in cells Amit et al Science 2009 Gerstein et al Science 2010 SLT2 CRH1

Using unlabeled data helps

supervised learning Consider setting:

• Set X of instances drawn from unknown distribution P(X)

• Wish to learn target function f: X Y

Given:

• iid labeled examples

• iid unlabeled examples

Determine:

Page 26: Reconstructing dynamic regulatory networks in …02710/Lectures/dremReg.pdfMethods for reconstructing networks in cells Amit et al Science 2009 Gerstein et al Science 2010 SLT2 CRH1

Binary Logistic

Regression Classifier 3-Way Logistic

Regression Classifier

P(activated)+P(repressed) P(regulated)

Motif Exp 1 Exp 2 … Exp p Label

Gene 1 8.0 1.2 -0.5 … 0.4 activated

Gene 2 6.2 -0.4 1.0 … 2.0 repressed

Gene 3 7.0 -0.8 1.2 3.2 unknown

… … … … … …

Gene N 2.2 0.4 1.4 … -1.4 unknown

Binary Logistic

Regression Classifier “Meta” Classifier

P(regulated)

Self-training rule to change „unknown‟ labels to either activated or repressed

P(regulated) > 2 x (#activated + #repressed)/N

SEmi-supervised REgulatory Network

Discoverer (SEREND)

Expression Data from (Faith et al, PloS Biology 2007); Curated Interactions from EcoCyc; Motifs from Regulon DB

Ernst et al., PLoS Comput Biol, 2008

Page 27: Reconstructing dynamic regulatory networks in …02710/Lectures/dremReg.pdfMethods for reconstructing networks in cells Amit et al Science 2009 Gerstein et al Science 2010 SLT2 CRH1

0

No Self-Training Input Labels

+ +

? + ?

+ + + + + ? ? ? +

? +

+ ? ?

? ?

? ?

?

?

?

?

?

? ?

?

?

? ?

?

? ?

?

?

?

? ?

?

? ?

? ?

Labels for Final

Classification

+ +

0 + 0

+ + + + + 0 0 0 +

0 +

+ 0

0 0

0 0

0

0

0

0

0

0 0

0

0

0 0

0

0 0

0

0

0

0 0

0

0 0

0 0

Input Labels

+ +

? + ?

+ + + + + ? ? ? +

? +

+ ? ?

? ?

? ?

?

?

?

?

?

? ?

?

?

? ?

?

? ?

?

?

?

? ?

?

? ?

? ?

After Self-Training

+ +

+ + +

+ + + + + + + + +

+ +

+ ? ?

? ?

? ?

?

?

?

?

?

? ?

?

?

? ?

?

? ?

?

?

?

? ?

?

? ?

? ?

+ +

+ + +

+ + + + + + + + +

+ +

+ 0 0

0 0

0 0

0

0

0

0

0

0

0

0

0 0

0

0 0

0

0

0

0 0

0

0 0

0 0

Labels for Final

Classification

With Self-Training

+ regulated

? unknown

0 unregulated

Legend

The Self-Training Step

0

Page 28: Reconstructing dynamic regulatory networks in …02710/Lectures/dremReg.pdfMethods for reconstructing networks in cells Amit et al Science 2009 Gerstein et al Science 2010 SLT2 CRH1

Aerobic-anaerobic shift response in

E. coli

2 1

6

5 4 3

9 7 8

1 2

3

4

5

6 7

8

9

Time

Expression Level

(log base 10)

1 activator;

-1 repressor

Beg et al PNAS 2007, Ernst et al Plos Comp. Bio. 2008

Page 29: Reconstructing dynamic regulatory networks in …02710/Lectures/dremReg.pdfMethods for reconstructing networks in cells Amit et al Science 2009 Gerstein et al Science 2010 SLT2 CRH1

DREM is useful, but several

questions remain … Who controls the

master regulators?

Who controls the

master regulators?