
Regulatory Network (Part II)

11/05/07

Methods

• Linear
  – PCA (Raychaudhuri et al. 2000)
  – NIR (Gardner et al. 2003)

• Nonlinear
  – Bayesian network (Friedman et al. 2000; Friedman 2004)

Cell-cycle network

Data (Spellman et al. 1998)

• 76 arrays

• 7 time points

• 6177 yeast genes

• 800 cell-cycle related genes identified

PCA

Raychaudhuri et al. 2000


The PCA components identify the dominant modes of variation.
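A minimal sketch of how the dominant modes can be computed, assuming a genes x arrays expression matrix and using an SVD of the centered data. The function name `pca` and the synthetic matrix are illustrative assumptions, not the analysis of Raychaudhuri et al.

```python
import numpy as np

def pca(expr, n_components=2):
    """PCA of a genes x arrays matrix via SVD of the column-centered data."""
    centered = expr - expr.mean(axis=0, keepdims=True)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    explained = s**2 / np.sum(s**2)            # fraction of variance per component
    scores = u[:, :n_components] * s[:n_components]
    return vt[:n_components], scores, explained[:n_components]

# Synthetic example (assumption): 50 genes sharing one temporal pattern plus small noise.
rng = np.random.default_rng(0)
t = np.linspace(0, 2 * np.pi, 7)               # 7 hypothetical time points
expr = np.outer(rng.normal(size=50), np.sin(t)) + 0.01 * rng.normal(size=(50, 7))
components, scores, explained = pca(expr)
```

Here `components[0]` is the dominant expression pattern across arrays and `scores[:, 0]` gives each gene's weight on it.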

Limitations of PCA

• Does not directly associate regulators with their target genes.

• Alternatively, PCA can be interpreted as assuming a fully connected network: the expression of each gene is a linear combination of the expression of all other genes.

NIR

Idea: The dynamics of gene activities can be approximated by

$$\frac{dx}{dt} = Ax + u$$

where $u$ is the perturbation. When gene expression levels approximately reach steady state,

$$Ax = -u$$

NIR

• Solve for A in

$$A_{N \times N} \, x_{N \times M} = -u_{N \times M}$$

where N = #genes and M = #perturbations.

• This is unidentifiable since M << N.

• Add the constraint that there are at most k connections for any given gene (k < M).

• For each row, use multiple regression to find a linear combination of k genes so that the least-squares error is minimal.
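The row-by-row regression can be sketched as follows. This is a toy version under stated assumptions: it enumerates every k-gene subset exhaustively (feasible only for small N), the data are synthetic, and the name `nir_sketch` is hypothetical rather than the published NIR implementation.

```python
import itertools
import numpy as np

def nir_sketch(X, U, k):
    """Estimate A (N x N) with at most k nonzeros per row from A X = -U."""
    n_genes = X.shape[0]
    A = np.zeros((n_genes, n_genes))
    for i in range(n_genes):
        best_sse, best_idx, best_coef = np.inf, None, None
        for idx in itertools.combinations(range(n_genes), k):
            idx = list(idx)
            design = X[idx, :].T                    # M x k, one row per perturbation
            coef, *_ = np.linalg.lstsq(design, -U[i], rcond=None)
            resid = design @ coef + U[i]
            sse = float(resid @ resid)              # least-squares error for this subset
            if sse < best_sse:
                best_sse, best_idx, best_coef = sse, idx, coef
        A[i, best_idx] = best_coef
    return A

# Synthetic check (assumption): a sparse A_true with 2 connections per gene.
rng = np.random.default_rng(1)
N, M, k = 5, 4, 2
A_true = np.zeros((N, N))
for i in range(N):
    A_true[i, i] = -1.0
    A_true[i, (i + 1) % N] = 0.5
X = rng.normal(size=(N, M))                         # steady-state expression, N x M
U = -A_true @ X                                     # perturbations consistent with A X = -U
A_hat = nir_sketch(X, U, k)
```

With noise-free data and k < M, the true k-gene subset gives zero residual, so `A_hat` matches `A_true` in this toy case.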

Application of NIR

[Figure: known E. coli SOS pathway, with edges marked as activation or repression.]

Application of NIR

[Figure: regression coefficients.]

Limitations of NIR

• The true dynamics are nonlinear.

• The choice of k is ad hoc.

• The steady-state approximation does not apply to oscillatory genes.

Bayesian network

Directed acyclic graph (DAG)

• Nodes: random variables

• Edges: direct effect --- conditional dependency

Friedman 2004

An example

[Figure: DAG over the nodes Earthquake, Burglary, Radio, Alarm, and Call.]

This is not a Bayesian network

[Figure: example graphs over the nodes A, B, C and A, B, C, D, E.]

Tree: a special kind of DAG

Each node has at most one parent node (the root has none).
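The two definitions can be checked mechanically. The sketch below assumes a graph is given as a list of (parent, child) pairs without duplicates; `is_dag` and `is_tree` are illustrative names, and the example graphs are hypothetical rather than the ones drawn on the slides.

```python
def is_dag(nodes, edges):
    """Kahn's algorithm: the graph is acyclic iff every node can be removed in topological order."""
    indegree = {v: 0 for v in nodes}
    for _, child in edges:
        indegree[child] += 1
    ready = [v for v in nodes if indegree[v] == 0]
    removed = 0
    while ready:
        node = ready.pop()
        removed += 1
        for parent, child in edges:
            if parent == node:
                indegree[child] -= 1
                if indegree[child] == 0:
                    ready.append(child)
    return removed == len(nodes)

def is_tree(nodes, edges):
    """A DAG with a single root in which every node has at most one parent."""
    parent_count = {v: 0 for v in nodes}
    for _, child in edges:
        parent_count[child] += 1
    roots = [v for v in nodes if parent_count[v] == 0]
    return is_dag(nodes, edges) and len(roots) == 1 and max(parent_count.values()) <= 1

alarm = [("E", "A"), ("B", "A"), ("A", "C")]   # A has two parents: a DAG, not a tree
cycle = [("A", "B"), ("B", "C"), ("C", "A")]   # directed cycle: not a DAG
chain = [("A", "B"), ("B", "C"), ("B", "D")]   # single root, one parent each: a tree
```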

Advantages

• Intuitive --- popular among biologists

• Graph structure is easy to interpret

• Well-established probabilistic tools for DAG models.

• Supports all the features of probabilistic learning
  – Model selection criteria
  – Handling of missing data

Known Structure, Complete Data

[Figure (Nir Friedman): the learner takes data records over (E, B, A), e.g. <Y,N,N>, <Y,N,Y>, <N,N,Y>, <N,Y,Y>, ..., <N,Y,Y>, together with the network E → A ← B whose table P(A | E, B) is unknown ("?" in every entry). It outputs the same network with an estimated table; the rows shown are (.9, .1), (.7, .3), (.99, .01), (.8, .2).]

• Network structure is specified
  – Inducer needs to estimate parameters

• Data does not contain missing values

Unknown Structure, Complete Data

[Figure (Nir Friedman): the learner takes data records over (E, B, A), e.g. <Y,N,N>, <Y,N,Y>, <N,N,Y>, <N,Y,Y>, ..., <N,Y,Y>, with both the arcs and the table P(A | E, B) unknown ("?" in every entry). It outputs the network E → A ← B with an estimated table; the rows shown are (.9, .1), (.7, .3), (.99, .01), (.8, .2).]

• Network structure is not specified
  – Inducer needs to select arcs & estimate parameters

• Data does not contain missing values

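Learning parameters

[Figure: network with arcs E → A, B → A, and A → C.]

• Training data has the form:

$$D = \begin{bmatrix} E[1] & B[1] & A[1] & C[1] \\ \vdots & \vdots & \vdots & \vdots \\ E[M] & B[M] & A[M] & C[M] \end{bmatrix}$$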

Likelihood Function

• Assume i.i.d. samples

• Likelihood function is

$$L(\theta : D) = \prod_m P(E[m], B[m], A[m], C[m] : \theta)$$

Likelihood Function

• By definition of network, we get

$$L(\theta : D) = \prod_m P(E[m], B[m], A[m], C[m] : \theta) = \prod_m P(E[m] : \theta) \, P(B[m] : \theta) \, P(A[m] \mid B[m], E[m] : \theta) \, P(C[m] \mid A[m] : \theta)$$

Likelihood Function

• Rewriting terms, we get

$$L(\theta : D) = \prod_m P(E[m], B[m], A[m], C[m] : \theta) = \prod_m P(E[m] : \theta) \cdot \prod_m P(B[m] : \theta) \cdot \prod_m P(A[m] \mid B[m], E[m] : \theta) \cdot \prod_m P(C[m] \mid A[m] : \theta)$$

General Bayesian Networks

Generalization for any Bayesian network:

$$L(\theta : D) = \prod_m P(x_1[m], \ldots, x_n[m] : \theta) = \prod_i \prod_m P(x_i[m] \mid Pa_i[m] : \theta_i) = \prod_i L_i(\theta_i : D)$$

Parameters can be estimated independently!
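Because the likelihood decomposes by node, each conditional table can be fitted on its own. A minimal sketch, assuming discrete variables and maximum-likelihood estimation by counting; `estimate_cpt` is a hypothetical helper, and the records are only the five tuples visible on the earlier slide (the elided ones are not reconstructed).

```python
from collections import Counter

def estimate_cpt(records, child, parents):
    """Maximum-likelihood estimate of P(child | parents) from complete data."""
    joint = Counter()
    parent_totals = Counter()
    for r in records:
        pa = tuple(r[p] for p in parents)
        joint[(pa, r[child])] += 1
        parent_totals[pa] += 1
    return {(pa, val): n / parent_totals[pa] for (pa, val), n in joint.items()}

tuples = [("Y", "N", "N"), ("Y", "N", "Y"), ("N", "N", "Y"), ("N", "Y", "Y"), ("N", "Y", "Y")]
records = [dict(E=e, B=b, A=a) for e, b, a in tuples]
cpt = estimate_cpt(records, "A", ["E", "B"])   # keys: ((E value, B value), A value)
```

Parent configurations that never occur in the data get no entry, which is one reason a prior over parameters is useful.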

Bayesian Inference

• Represent uncertainty about parameters using a joint probability distribution over parameters and data

• Using Bayes rule

$$P(\theta \mid x[1], \ldots, x[M]) = \frac{P(x[1], \ldots, x[M] \mid \theta) \, P(\theta)}{P(x[1], \ldots, x[M])}$$

• Common prior distributions:
  – Dirichlet (discrete)
  – Normal (continuous)
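For a discrete variable with a Dirichlet prior, Bayes rule has a closed form: the posterior is again Dirichlet, with the observed counts added to the prior pseudo-counts. A minimal sketch with hypothetical function names and made-up counts:

```python
def dirichlet_posterior(counts, alpha):
    """Posterior Dirichlet parameters: prior pseudo-counts plus observed counts."""
    return [a + n for a, n in zip(alpha, counts)]

def posterior_mean(params):
    """Expected value of each outcome probability under a Dirichlet."""
    total = sum(params)
    return [p / total for p in params]

counts = [3, 0]            # outcome 0 seen three times, outcome 1 never
alpha = [1, 1]             # uniform prior (assumption)
posterior = dirichlet_posterior(counts, alpha)
mean = posterior_mean(posterior)
```

The maximum-likelihood estimate here would be (1.0, 0.0); the posterior mean keeps some probability on the unseen outcome.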

Why Struggle for Accurate Structure?

[Figure: three networks over Earthquake, Burglary, Alarm Set, and Sound: the true network, one with an added arc, and one with a missing arc.]

Adding an arc
• Increases the number of parameters to be estimated
• Wrong assumptions about domain structure

Missing an arc
• Cannot be compensated for by fitting parameters
• Wrong assumptions about domain structure

Score based Learning

Define scoring function that evaluates how well a structure matches the data

[Figure: data records over (E, B, A), e.g. <Y,N,N>, <Y,Y,Y>, <N,N,Y>, <N,Y,Y>, ..., <N,Y,Y>, and three candidate structures over E, B, A with scores S(G1) = 10, S(G2) = 1.5, S(G3) = 0.01.]

Search for a structure that maximizes the score

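Structure Score

Likelihood score:

$$L(G : D) = P(D \mid G, \hat{\theta}_G)$$

where $\hat{\theta}_G$ are the maximum-likelihood parameters for G.

Bayesian score:
– Average over all possible parameter values

$$P(D \mid G) = \int P(D \mid G, \theta) \, P(\theta \mid G) \, d\theta$$

Here $P(D \mid G)$ is the marginal likelihood, $P(D \mid G, \theta)$ is the likelihood, and $P(\theta \mid G)$ is the prior over parameters.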
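The two scores can be compared on a toy case. The sketch below assumes binary variables coded 0/1 and a uniform Beta(1,1) prior on every table row, for which the integral has a closed form; a structure is a dict mapping each node to its parent list. All names and data are illustrative.

```python
import math
from collections import Counter

def family_counts(data, child, parents):
    """Counts of the child's values, one Counter per observed parent configuration."""
    counts = {}
    for row in data:
        key = tuple(row[p] for p in parents)
        counts.setdefault(key, Counter())[row[child]] += 1
    return counts.values()

def log_likelihood_score(data, structure):
    """log P(D | G, theta_hat) with maximum-likelihood parameters."""
    total = 0.0
    for child, parents in structure.items():
        for c in family_counts(data, child, parents):
            n = sum(c.values())
            total += sum(k * math.log(k / n) for k in c.values())
    return total

def log_bayesian_score(data, structure):
    """log P(D | G) for binary nodes with a Beta(1,1) prior on each table row."""
    total = 0.0
    for child, parents in structure.items():
        for c in family_counts(data, child, parents):
            n0, n1 = c[0], c[1]
            # integral of theta^n1 * (1 - theta)^n0 over [0, 1] = n0! n1! / (n0 + n1 + 1)!
            total += math.lgamma(n0 + 1) + math.lgamma(n1 + 1) - math.lgamma(n0 + n1 + 2)
    return total

g_empty = {0: [], 1: []}        # X and Y unconnected
g_arc = {0: [], 1: [0]}         # X -> Y
independent = [(x, y) for x in (0, 1) for y in (0, 1) for _ in range(5)]
dependent = [(0, 0)] * 10 + [(1, 1)] * 10
```

On the independent data the likelihood score cannot tell the two structures apart, while the Bayesian score prefers the sparser one; on the dependent data both prefer the arc.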

Search for Optimal Network Structure

• Start with a given network
  – empty network
  – best tree
  – a random network

• At each iteration
  – Evaluate all possible changes
  – Apply change based on score

• Stop when no modification improves score

• Typical operations:
  – Add C → D
  – Delete C → E
  – Reverse C → E

[Figure: a network over S, C, E, D and the three networks obtained by applying each operation.]


• Typical operations (add C → D, delete C → E, reverse C → E) change the score only locally. For adding C → D:

$$\Delta\text{score} = S(\{C, E\} \to D) - S(\{E\} \to D)$$

At each iteration, only the site being updated needs to be rescored!
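A compact sketch of this greedy search, assuming binary variables indexed 0..n-1, a decomposable score with a Beta(1,1) prior per table row, and a start from the empty network. Only the families whose parent set changes are rescored. Function names and data are hypothetical.

```python
import math
from itertools import product

def family_score(data, child, parents):
    """Log marginal likelihood of one binary node given its parents, Beta(1,1) prior."""
    counts = {}
    for row in data:
        key = tuple(row[p] for p in parents)
        counts.setdefault(key, [0, 0])[row[child]] += 1
    return sum(math.lgamma(n0 + 1) + math.lgamma(n1 + 1) - math.lgamma(n0 + n1 + 2)
               for n0, n1 in counts.values())

def is_acyclic(n, edges):
    """Kahn's algorithm on edges given as (parent, child) pairs."""
    indeg = [0] * n
    for _, v in edges:
        indeg[v] += 1
    stack = [v for v in range(n) if indeg[v] == 0]
    seen = 0
    while stack:
        u = stack.pop()
        seen += 1
        for a, b in edges:
            if a == u:
                indeg[b] -= 1
                if indeg[b] == 0:
                    stack.append(b)
    return seen == n

def parents_of(edges, v):
    return sorted(u for u, w in edges if w == v)

def neighbors(n, edges):
    """Yield every DAG that is one add, delete, or reverse away."""
    for u, v in product(range(n), repeat=2):
        if u == v:
            continue
        if (u, v) in edges:
            yield edges - {(u, v)}                   # delete u -> v
            cand = (edges - {(u, v)}) | {(v, u)}     # reverse u -> v
        elif (v, u) not in edges:
            cand = edges | {(u, v)}                  # add u -> v
        else:
            continue
        if is_acyclic(n, cand):
            yield cand

def hill_climb(data, n):
    edges = frozenset()                              # start from the empty network
    fam = {v: family_score(data, v, []) for v in range(n)}
    while True:
        best_delta, best = 1e-9, None
        for cand in neighbors(n, edges):
            changed = {v for _, v in cand ^ edges}   # nodes whose parent set changed
            delta = sum(family_score(data, v, parents_of(cand, v)) - fam[v] for v in changed)
            if delta > best_delta:
                best_delta, best = delta, cand
        if best is None:                             # no modification improves the score
            return set(edges)
        for v in {v for _, v in best ^ edges}:
            fam[v] = family_score(data, v, parents_of(best, v))
        edges = best

# Toy data (assumption): variables 0 and 1 always agree, variable 2 is independent.
data = [(x, x, z) for x in (0, 1) for z in (0, 1) for _ in range(10)]
learned = hill_climb(data, 3)
```

On this toy data the search connects variables 0 and 1 and leaves variable 2 isolated; the direction of the learned arc is not determined by the data.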

Structure Discovery

Task: Discover structural properties
– Is there a direct connection between X & Y?
– Does X separate two “subsystems”?
– Does X causally affect Y?

Example: scientific data mining
– Disease properties and symptoms
– Interactions between the expression of genes

Discovering Structure

[Figure: several candidate networks over E, R, B, A, C, each weighted by P(G | D).]

– There may be many high scoring models
– Answer should not be based on any single model
– Want to average over many models

$$P(G \mid D) = \frac{P(D \mid G) \, P(G)}{P(D)}$$
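Averaging over models can be illustrated on two binary variables, where all three DAGs can be enumerated. This sketch assumes a uniform prior P(G) and a Beta(1,1) prior on each table row; the names and data are hypothetical.

```python
import math

def family_score(data, child, parents):
    """Log marginal likelihood of one binary node given its parents, Beta(1,1) prior."""
    counts = {}
    for row in data:
        key = tuple(row[p] for p in parents)
        counts.setdefault(key, [0, 0])[row[child]] += 1
    return sum(math.lgamma(n0 + 1) + math.lgamma(n1 + 1) - math.lgamma(n0 + n1 + 2)
               for n0, n1 in counts.values())

def structure_posterior(data, structures):
    """P(G | D) under a uniform prior over the listed structures."""
    log_scores = {name: sum(family_score(data, c, ps) for c, ps in g.items())
                  for name, g in structures.items()}
    top = max(log_scores.values())                   # subtract the max for numerical stability
    weights = {name: math.exp(s - top) for name, s in log_scores.items()}
    z = sum(weights.values())                        # plays the role of P(D)
    return {name: w / z for name, w in weights.items()}

structures = {
    "none": {0: [], 1: []},
    "X->Y": {0: [], 1: [0]},
    "Y->X": {0: [1], 1: []},
}
data = [(0, 0)] * 10 + [(1, 1)] * 10
post = structure_posterior(data, structures)
p_edge = post["X->Y"] + post["Y->X"]                 # posterior of the feature "X and Y are connected"
```

The feature "X and Y are connected" gets high posterior probability even though neither single direction does.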

Cell-cycle network

Friedman et al 2000

Limitations of Bayesian networks

• Computationally costly
  – Identifying the globally optimal network structure is an NP-hard problem

• Heuristic approaches may be trapped in local maxima.

• Choosing a prior distribution over DAGs is tricky.

• In practice, the approach has failed to recover network structures more difficult than the cell-cycle network.

Equivalence of graphs

• When two DAGs can represent the same set of conditional independence assertions, we say that these DAGs are equivalent

[Figure: a pair of DAGs involving Y and Z.]

• Are these graphs equivalent?

[Figure: a pair of DAGs over X, Y, and Z.]

• Are these graphs equivalent?

Therefore, the exact graph is unidentifiable!
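Equivalence can be tested with a standard criterion that is not stated on the slides: two DAGs over the same nodes are equivalent exactly when they share the same skeleton (undirected edges) and the same v-structures (a → c ← b with a and b not adjacent). A minimal sketch with hypothetical example graphs:

```python
from itertools import combinations

def skeleton(edges):
    """Undirected version of the edge set."""
    return {frozenset(e) for e in edges}

def v_structures(edges):
    """Triples (a, child, b) where a -> child <- b and a, b are not adjacent."""
    skel = skeleton(edges)
    parents = {}
    for u, v in edges:
        parents.setdefault(v, set()).add(u)
    found = set()
    for child, pas in parents.items():
        for a, b in combinations(sorted(pas), 2):
            if frozenset((a, b)) not in skel:
                found.add((a, child, b))
    return found

def markov_equivalent(edges1, edges2):
    """Assumes both graphs are DAGs over the same node set."""
    return skeleton(edges1) == skeleton(edges2) and v_structures(edges1) == v_structures(edges2)

chain = [("X", "Y"), ("Y", "Z")]         # X -> Y -> Z
reversed_chain = [("Z", "Y"), ("Y", "X")]  # X <- Y <- Z
collider = [("X", "Y"), ("Z", "Y")]      # X -> Y <- Z
```

The two chains encode the same independence (X and Z independent given Y), so data alone cannot distinguish them; the collider encodes a different one.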

Reading List

• Raychaudhuri et al. 2000
  – Applied PCA to analyze gene expression

• Gardner et al. 2003
  – Developed NIR to find regulatory networks

• Friedman et al. 2000
  – Applied Bayesian networks to analyze the cell-cycle network

• Friedman 2004
  – Review of probabilistic graphical models

Acknowledgement

Some of the slides were obtained from Nir Friedman.
