Post on 19-Dec-2015
Regulatory Network (Part II)
11/05/07
Methods
• Linear
– PCA (Raychaudhuri et al. 2000)
– NIR (Gardner et al. 2003)
• Nonlinear
– Bayesian network (Friedman et al. 2000; Friedman 2004)
Cell-cycle network
Data (Spellman et al. 1998)
• 76 arrays
• 7 time points
• 6177 yeast genes
• 800 cell-cycle related genes identified
PCA
Raychaudhuri et al. 2000
The PCA components identify the dominant modes of variation.
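As a minimal sketch of how the dominant modes of variation can be computed, the following applies PCA through a singular value decomposition. The function name `pca`, the genes x arrays layout of `expr`, and the synthetic data are assumptions made for illustration; they are not taken from Raychaudhuri et al.

```python
import numpy as np

def pca(expr, n_components=2):
    """PCA of a (genes x arrays) matrix via SVD of the column-centered data."""
    centered = expr - expr.mean(axis=0)            # center each array (column)
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)            # fraction of variance per component
    components = vt[:n_components]                 # rows: modes of variation across arrays
    scores = centered @ components.T               # each gene projected onto the modes
    return components, scores, explained[:n_components]

# Synthetic stand-in for expression data: one oscillating pattern plus small noise.
rng = np.random.default_rng(0)
pattern = np.sin(np.linspace(0.0, 2.0 * np.pi, 7))
expr = np.outer(rng.normal(size=200), pattern) + 0.01 * rng.normal(size=(200, 7))
components, scores, explained = pca(expr)
```

Here the first component should align with the planted pattern, which is the sense in which PCA "identifies" a dominant mode.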
Limitations of PCA
• Does not directly associate regulators with their target genes.
• Alternatively, PCA can be interpreted as assuming a fully connected network, in which the expression of each gene is regulated by a linear combination of all other genes.
NIR
Idea: The dynamics of gene activities can be approximated by the linear model
$$\frac{dx}{dt} = Ax + u$$
where u is the perturbation. After a perturbation, gene expression levels approximately reach steady state, so that
$$Ax = -u$$
NIR
• Solve for A in
$$A_{N \times N}\, x_{N \times M} = -u_{N \times M}$$
where N = #genes and M = #perturbations.
• This is unidentifiable since M << N.
• Add the constraint that there are at most k connections for any given gene (k < M).
• For each row, use multiple regression to find a linear combination of k genes so that the least-squares error is minimal.
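The row-by-row regression can be sketched as follows. This is a brute-force illustration that tries every subset of k genes, which is only feasible for small N; it is not claimed to be the search strategy of Gardner et al. The names `nir_row` and `nir`, the matrix `A_true`, and the noise-free synthetic data are hypothetical.

```python
import itertools
import numpy as np

def nir_row(X, u_i, k):
    """Row a_i with at most k nonzeros minimizing ||a_i X + u_i||^2 (steady state: A X = -U)."""
    n_genes = X.shape[0]
    best_err, best_row = np.inf, np.zeros(n_genes)
    for subset in itertools.combinations(range(n_genes), k):
        idx = list(subset)
        # Multiple regression: X[idx].T @ w ~ -u_i, one equation per perturbation.
        w, *_ = np.linalg.lstsq(X[idx].T, -u_i, rcond=None)
        err = np.sum((X[idx].T @ w + u_i) ** 2)
        if err < best_err:
            best_err = err
            best_row = np.zeros(n_genes)
            best_row[idx] = w
    return best_row

def nir(X, U, k):
    """Estimate the N x N matrix A one row (one target gene) at a time."""
    return np.vstack([nir_row(X, U[i], k) for i in range(X.shape[0])])

# Hypothetical sparse network with two connections per gene.
A_true = np.array([[-1.0, 0.5, 0.0, 0.0, 0.0],
                   [0.0, -1.0, 0.8, 0.0, 0.0],
                   [0.0, 0.0, -1.0, -0.6, 0.0],
                   [0.0, 0.0, 0.0, -1.0, 0.4],
                   [0.7, 0.0, 0.0, 0.0, -1.0]])
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 4))      # N = 5 genes, M = 4 perturbations
U = -A_true @ X                  # perturbations consistent with A X = -U
A_hat = nir(X, U, k=2)
```

With M < N the unconstrained system has many solutions; the sparsity constraint k < M is what makes each row's regression overdetermined.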
Application of NIR
[Figure: known E. coli SOS pathway, with edges marked as activation or repression]
Application of NIR
[Figure: regression coefficients]
Limitations of NIR
• The true dynamics are nonlinear.
• The choice of k is ad hoc.
• The steady-state approximation does not apply to oscillatory genes.
Bayesian network
Directed acyclic graph (DAG)
• Nodes: random variables
• Edges: direct effect --- conditional dependency
Friedman 2004
An example
[Figure: DAG over the nodes Earthquake, Burglary, Radio, Alarm, and Call]
This is not a Bayesian network
[Figure: graph over the nodes A, B, C]
[Figure: graph over the nodes A, B, C, D, E]
Tree: a special kind of DAG
Each node has at most one parent node (the root has none).
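The two structural conditions above (no directed cycles; at most one parent per node) can be checked mechanically. The sketch below assumes a graph is stored as a dictionary mapping every node to the set of its parents; the function names and the specific arcs in the examples are illustrative assumptions, not read off the slide figures.

```python
def is_dag(parents):
    """True if the graph has no directed cycle.

    `parents` maps every node to the set of its parents (all nodes must be keys).
    Repeatedly strips nodes that have no remaining parents; a cycle leaves none to strip.
    """
    remaining = {node: set(ps) for node, ps in parents.items()}
    while remaining:
        roots = [n for n, ps in remaining.items() if not ps]
        if not roots:
            return False                 # every remaining node has a parent: cycle
        for r in roots:
            del remaining[r]
        for ps in remaining.values():
            ps.difference_update(roots)
    return True

def is_tree(parents):
    """DAG in which each node has at most one parent (this also admits forests)."""
    return is_dag(parents) and all(len(ps) <= 1 for ps in parents.values())

# Arcs assumed for illustration.
alarm = {"Earthquake": set(), "Burglary": set(), "Radio": {"Earthquake"},
         "Alarm": {"Earthquake", "Burglary"}, "Call": {"Alarm"}}
cycle = {"A": {"C"}, "B": {"A"}, "C": {"B"}}
tree = {"A": set(), "B": {"A"}, "C": {"B"}, "D": {"B"}, "E": {"C"}}
```

In this encoding `alarm` is a DAG but not a tree, because Alarm has two parents, while `cycle` fails the DAG test outright.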
Advantage
• Intuitive --- popular among biologists
• Graph structure is easy to interpret
• Well-established probabilistic tools for DAG models.
• Support all the features for probabilistic learning
– Model selection criteria
– Handling of missing data
Known Structure, complete data
[Figure: a Learner takes data records over (E, B, A), e.g. <Y,N,N>, <Y,N,Y>, <N,N,Y>, <N,Y,Y>, ..., <N,Y,Y>, together with the network E → A ← B whose table P(A | E,B) is unknown (all entries "?"), and outputs the same network with an estimated table P(A | E,B) (entries shown: .9 .1, .7 .3, .99 .01, .8 .2)]
• Network structure is specified
– Inducer needs to estimate parameters
• Data does not contain missing values
(Nir Friedman)
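With the structure fixed and no missing values, the parameters can be estimated by counting. The sketch below computes relative-frequency estimates of P(A | E,B) from the five records visible on the slide (the elided records are simply left out); the function name `estimate_cpt` is hypothetical.

```python
from collections import Counter

def estimate_cpt(records):
    """Maximum-likelihood estimate of P(A | E, B) from (E, B, A) records."""
    joint = Counter((e, b, a) for e, b, a in records)      # counts of (E, B, A)
    parent = Counter((e, b) for e, b, _ in records)        # counts of (E, B)
    return {(e, b, a): n / parent[(e, b)] for (e, b, a), n in joint.items()}

records = [("Y", "N", "N"), ("Y", "N", "Y"), ("N", "N", "Y"),
           ("N", "Y", "Y"), ("N", "Y", "Y")]
cpt = estimate_cpt(records)
```

Combinations that never occur in the data are absent from `cpt`, which is one reason to prefer a prior-based estimate when data are scarce.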
Unknown Structure, Complete Data
[Figure: a Learner takes data records over (E, B, A), e.g. <Y,N,N>, <Y,N,Y>, <N,N,Y>, <N,Y,Y>, ..., <N,Y,Y>, and outputs both the arcs among E, B, A and an estimated table P(A | E,B) (entries shown: .9 .1, .7 .3, .99 .01, .8 .2)]
• Network structure is not specified
– Inducer needs to select arcs & estimate parameters
• Data does not contain missing values
(Nir Friedman)
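Learning parameters
[Figure: network over E, B, A, C with arcs E → A, B → A, A → C]
• Training data has the form:
$$D = \begin{bmatrix} E[1] & B[1] & A[1] & C[1] \\ \vdots & \vdots & \vdots & \vdots \\ E[M] & B[M] & A[M] & C[M] \end{bmatrix}$$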
Likelihood Function
• Assume i.i.d. samples
• Likelihood function is
$$L(\Theta : D) = \prod_m P(E[m], B[m], A[m], C[m] : \Theta)$$
Likelihood Function
• By definition of network, we get
$$L(\Theta : D) = \prod_m P(E[m], B[m], A[m], C[m] : \Theta) = \prod_m P(E[m] : \Theta)\, P(B[m] : \Theta)\, P(A[m] \mid B[m], E[m] : \Theta)\, P(C[m] \mid A[m] : \Theta)$$
Likelihood Function
• Rewriting terms, we get
$$L(\Theta : D) = \prod_m P(E[m], B[m], A[m], C[m] : \Theta) = \prod_m P(E[m] : \Theta) \cdot \prod_m P(B[m] : \Theta) \cdot \prod_m P(A[m] \mid B[m], E[m] : \Theta) \cdot \prod_m P(C[m] \mid A[m] : \Theta)$$
General Bayesian Networks
Generalization for any Bayesian network:
$$L(\Theta : D) = \prod_m P(x_1[m], \ldots, x_n[m] : \Theta) = \prod_i \prod_m P(x_i[m] \mid Pa_i[m] : \Theta_i) = \prod_i L_i(\Theta_i : D)$$
Parameters can be estimated independently!
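The decomposition into per-node terms can be made concrete: the log-likelihood of a network is a sum of one term per family (a node and its parents), each computed from that family's counts alone. The sketch below assumes discrete data stored as a list of dictionaries and a structure given as a mapping from each node to a tuple of parents; all names and the four toy records are hypothetical.

```python
import math
from collections import Counter

def family_loglik(data, child, parents):
    """log L_i at its maximum: sum over samples of log P(x_i | Pa_i) with frequency estimates."""
    joint = Counter((tuple(row[p] for p in parents), row[child]) for row in data)
    marg = Counter(tuple(row[p] for p in parents) for row in data)
    return sum(n * math.log(n / marg[pa]) for (pa, _), n in joint.items())

def loglik(data, structure):
    """Network log-likelihood as a sum of independent per-family terms."""
    return sum(family_loglik(data, child, pars) for child, pars in structure.items())

rows = [dict(E=0, B=0, A=0, C=0), dict(E=0, B=1, A=1, C=1),
        dict(E=1, B=0, A=1, C=1), dict(E=0, B=0, A=0, C=1)]
network = {"E": (), "B": (), "A": ("E", "B"), "C": ("A",)}
empty = {"E": (), "B": (), "A": (), "C": ()}
ll_network = loglik(rows, network)
ll_empty = loglik(rows, empty)
```

Because each term depends only on its own family, changing the parents of one node leaves every other term untouched.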
Bayesian Inference
• Represent uncertainty about parameters using a probability distribution over parameters, data
• Using Bayes rule
$$P(\theta \mid x[1], \ldots, x[M]) = \frac{P(x[1], \ldots, x[M] \mid \theta)\, P(\theta)}{P(x[1], \ldots, x[M])}$$
• Common prior distributions:
– Dirichlet (discrete)
– Normal (continuous)
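For the discrete case, a Dirichlet prior is conjugate to the multinomial likelihood, so the posterior is again Dirichlet with the observed counts added to the prior parameters. A minimal sketch of the resulting posterior-mean estimate follows; the function name and the example counts are illustrative assumptions.

```python
def dirichlet_posterior_mean(counts, alpha):
    """Posterior mean of multinomial parameters under a Dirichlet(alpha) prior.

    The posterior is Dirichlet(alpha + counts), whose mean is
    (alpha_k + n_k) / sum_j (alpha_j + n_j).
    """
    total = sum(counts) + sum(alpha)
    return [(n + a) / total for n, a in zip(counts, alpha)]

counts = [3, 1]                                      # e.g. three "yes", one "no"
posterior_mean = dirichlet_posterior_mean(counts, alpha=[1.0, 1.0])
mle = [n / sum(counts) for n in counts]              # [0.75, 0.25] for comparison
```

The prior pulls the estimate toward uniform, and its influence fades as the counts grow.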
Why Struggle for Accurate Structure?
[Figure: three networks over Earthquake, Alarm Set, Burglary, and Sound: the true network, one with an added arc, and one with a missing arc]
Adding an arc
• Increases the number of parameters to be estimated
• Wrong assumptions about domain structure
Missing an arc
• Cannot be compensated for by fitting parameters
• Wrong assumptions about domain structure
Score based Learning
Define a scoring function that evaluates how well a structure matches the data
[Figure: data records over (E, B, A), e.g. <Y,N,N>, <Y,Y,Y>, <N,N,Y>, <N,Y,Y>, ..., <N,Y,Y>, and three candidate structures G1, G2, G3 over E, B, A with scores S(G1) = 10, S(G2) = 1.5, S(G3) = 0.01]
Search for a structure that maximizes the score
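Structure Score
Likelihood score:
$$L(G : D) = P(D \mid G, \hat{\theta}_G)$$
where $\hat{\theta}_G$ are the max likelihood params.
Bayesian score:
– Average over all possible parameter values
$$P(D \mid G) = \int P(D \mid G, \theta)\, P(\theta \mid G)\, d\theta$$
Here $P(D \mid G)$ is the marginal likelihood, $P(D \mid G, \theta)$ is the likelihood, and $P(\theta \mid G)$ is the prior over parameters.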
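For discrete variables with Dirichlet priors, the integral in the Bayesian score has a closed form built from gamma functions. The sketch below assumes a symmetric Dirichlet prior with hyperparameter `alpha` for every parent configuration and independent parameters across families; these choices, the function names, and the toy data are assumptions for illustration.

```python
import math
from collections import Counter, defaultdict

def log_marginal_family(data, child, parents, arity, alpha=1.0):
    """log of the marginal likelihood of one family under symmetric Dirichlet(alpha) priors.

    `arity` is the number of states of `child`. Parent configurations that
    never occur contribute log 1 = 0, so only observed ones are visited.
    """
    counts = defaultdict(Counter)
    for row in data:
        counts[tuple(row[p] for p in parents)][row[child]] += 1
    score = 0.0
    for child_counts in counts.values():
        n = sum(child_counts.values())
        score += math.lgamma(arity * alpha) - math.lgamma(arity * alpha + n)
        for n_s in child_counts.values():
            score += math.lgamma(alpha + n_s) - math.lgamma(alpha)
    return score

def log_bayes_score(data, structure, arity):
    """log P(D | G): sum of family terms, assuming independent parameters per family."""
    return sum(log_marginal_family(data, c, ps, arity[c]) for c, ps in structure.items())

# Toy data in which B always copies A.
data = [{"A": v, "B": v} for v in (0, 1) for _ in range(4)]
arity = {"A": 2, "B": 2}
with_arc = log_bayes_score(data, {"A": (), "B": ("A",)}, arity)
without_arc = log_bayes_score(data, {"A": (), "B": ()}, arity)
```

Unlike the likelihood score, this score does not automatically favor denser graphs: extra parents are rewarded only when the data support them.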
Search for Optimal Network Structure
• Start with a given network
– empty network
– best tree
– a random network
• At each iteration
– Evaluate all possible changes
– Apply change based on score
• Stop when no modification improves score
• Typical operations:
[Figure: starting from a network over S, C, E, D, the operations Add C → D, Delete C → E, and Reverse C → E]
Search for Optimal Network Structure
• Typical operations:
[Figure: starting from a network over S, C, E, D, the operations Add C → D, Delete C → E, and Reverse C → E]
Δscore = S({C,E} → D) − S({E} → D)
Search for Optimal Network Structure
At each iteration, only the site that is being updated needs to be rescored!
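A greedy search of this kind can be sketched as below. It starts from the empty network, tries every single-arc addition, deletion, and reversal, and rescores only the families whose parent sets changed. The BIC-style family score for binary variables is a stand-in chosen for brevity, not the score used in the slides; all function names and the toy data are hypothetical.

```python
import itertools
import math
from collections import Counter

def family_score(data, child, parents):
    """BIC-style score of one family, assuming binary variables (illustrative choice)."""
    parents = tuple(sorted(parents))
    joint = Counter((tuple(r[p] for p in parents), r[child]) for r in data)
    marg = Counter(tuple(r[p] for p in parents) for r in data)
    loglik = sum(n * math.log(n / marg[pa]) for (pa, _), n in joint.items())
    return loglik - 0.5 * math.log(len(data)) * 2 ** len(parents)

def has_cycle(parents):
    remaining = {v: set(ps) for v, ps in parents.items()}
    while remaining:
        roots = [v for v, ps in remaining.items() if not ps]
        if not roots:
            return True
        for v in roots:
            del remaining[v]
        for ps in remaining.values():
            ps.difference_update(roots)
    return False

def neighbors(parents):
    """Yield (candidate, changed_nodes) for each single-arc add, delete, or reverse."""
    for u, v in itertools.permutations(parents, 2):
        new = {n: set(ps) for n, ps in parents.items()}
        if u in parents[v]:
            new[v].discard(u)                       # delete u -> v
            yield new, (v,)
            rev = {n: set(ps) for n, ps in new.items()}
            rev[u].add(v)                           # reverse u -> v into v -> u
            yield rev, (u, v)
        elif v not in parents[u]:
            new[v].add(u)                           # add u -> v
            yield new, (v,)

def hill_climb(data, variables):
    parents = {v: set() for v in variables}         # start from the empty network
    while True:
        best_delta, best = 1e-9, None
        for cand, changed in neighbors(parents):
            if has_cycle(cand):
                continue
            # Only the families that changed need to be rescored.
            delta = sum(family_score(data, v, cand[v]) - family_score(data, v, parents[v])
                        for v in changed)
            if delta > best_delta:
                best_delta, best = delta, cand
        if best is None:
            return parents                          # no modification improves the score
        parents = best

# Toy data: B copies A, and C is independent of both.
data = [{"A": a, "B": a, "C": c} for a in (0, 1) for c in (0, 1) for _ in range(10)]
result = hill_climb(data, ["A", "B", "C"])
arcs = {(p, c) for c, ps in result.items() for p in ps}
```

The search is expected to stop with a single arc between A and B; which direction it picks is arbitrary here, since both directions score the same.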
Structure Discovery
Task: Discover structural properties
– Is there a direct connection between X & Y?
– Does X separate between two "subsystems"?
– Does X causally affect Y?
Example: scientific data mining
– Disease properties and symptoms
– Interactions between the expression of genes
Discovering Structure
[Figure: posterior P(G|D) over several candidate networks on the nodes E, R, B, A, C]
– There may be many high scoring models
– Answer should not be based on any single model
– Want to average over many models
$$P(G \mid D) = \frac{P(D \mid G)\, P(G)}{P(D)}$$
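Averaging over models can be sketched as follows: turn the scores of a set of candidate graphs into posterior weights, then sum the weights of the graphs that contain a feature of interest, such as an arc. Normalizing over only the listed graphs approximates P(D), which strictly sums over all DAGs. The function names, the three candidate graphs, and their log marginal likelihoods are hypothetical.

```python
import math

def graph_posterior(log_marginals, log_priors=None):
    """P(G | D) proportional to P(D | G) P(G), normalized over the listed graphs only."""
    if log_priors is None:
        log_priors = [0.0] * len(log_marginals)      # uniform prior over the candidates
    logs = [lm + lp for lm, lp in zip(log_marginals, log_priors)]
    top = max(logs)                                  # subtract the max for numerical stability
    weights = [math.exp(v - top) for v in logs]
    total = sum(weights)
    return [w / total for w in weights]

def edge_posterior(graphs, posterior, edge):
    """Posterior probability that `edge` is present, averaged over the candidates."""
    return sum(p for g, p in zip(graphs, posterior) if edge in g)

# Each graph is a set of (parent, child) arcs; values are made up.
graphs = [{("E", "A"), ("B", "A")}, {("E", "A")}, set()]
log_marginals = [math.log(0.6), math.log(0.3), math.log(0.1)]
posterior = graph_posterior(log_marginals)
p_e_to_a = edge_posterior(graphs, posterior, ("E", "A"))
p_b_to_a = edge_posterior(graphs, posterior, ("B", "A"))
```

An arc that appears in most high-scoring graphs gets a high posterior even when no single graph dominates.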
Cell-cycle network
Friedman et al. 2000
Limitations of Bayesian network
• Computationally costly
– Identifying the globally optimal network structure is an NP-hard problem
• Heuristic approaches may be trapped in local maxima.
• Prior distribution for DAGs is tricky.
• In practice, the method has so far failed to recover network structures more difficult than that of the cell-cycle data.
Equivalence of graphs
• When two DAGs can represent the same set of conditional independence assertions, we say that these DAGs are equivalent
[Figure: two DAGs over the nodes Y and Z]
• Are these graphs equivalent?
[Figure: two DAGs over the nodes X, Y, and Z]
• Are these graphs equivalent?
Therefore, the exact graph is unidentifiable!
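Equivalence can be tested with a standard criterion that the slides do not state: two DAGs over the same nodes encode the same conditional independence assertions exactly when they share the same skeleton (arcs with direction ignored) and the same v-structures (colliders x → z ← y with x and y not adjacent). The sketch below assumes each DAG is given as a set of (parent, child) arcs; the example graphs are hypothetical and need not match the figures.

```python
import itertools

def skeleton(arcs):
    """Undirected version of the graph: each arc as an unordered pair."""
    return {frozenset(a) for a in arcs}

def v_structures(arcs):
    """Colliders x -> z <- y in which x and y are not adjacent."""
    skel = skeleton(arcs)
    parents = {}
    for p, c in arcs:
        parents.setdefault(c, set()).add(p)
    found = set()
    for child, ps in parents.items():
        for x, y in itertools.combinations(sorted(ps), 2):
            if frozenset((x, y)) not in skel:
                found.add((x, child, y))
    return found

def equivalent(arcs1, arcs2):
    """Same skeleton and same v-structures (assumes both DAGs use the same node set)."""
    return (skeleton(arcs1) == skeleton(arcs2)
            and v_structures(arcs1) == v_structures(arcs2))

fork = {("X", "Y"), ("X", "Z")}        # Y <- X -> Z
chain = {("Y", "X"), ("X", "Z")}       # Y -> X -> Z
collider = {("Y", "X"), ("Z", "X")}    # Y -> X <- Z
```

Under this test Y → Z and Z → Y are equivalent, as are the fork and the chain, while the collider is not equivalent to either; data alone cannot orient arcs within an equivalence class.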
Reading List
• Raychaudhuri et al. 2000
– Applied PCA to analyze gene expression.
• Gardner et al. 2003
– Developed NIR to find regulatory networks.
• Friedman et al. 2000
– Applied Bayesian networks to analyze the cell-cycle network.
• Friedman 2004
– Review of probabilistic graphical models.
Acknowledgement
Some of the slides are obtained from Nir Friedman.