The importance of enzymes and their occurrences: from the perspective of a network
W.C. Liu (1), W.H. Lin (1), S.T. Yang (1), F. Jordan (2), A.J. Davis (3) and M.J. Hwang (1).
(1) Institute of Biomedical Sciences, Academia Sinica, Taiwan. (2) Collegium Budapest, Hungary. (3) Max Planck Institute for Chemical Ecology, Germany.
Introduction
What is a metabolic network?
From http://www.genome.jp/kegg/
Nodes: compounds
Links: enzyme-catalyzed reactions
The textbook version, as defined by humans!
Introduction
What is a metabolic network?
The sum of all chemical transformations in a cell.
Enzyme-catalyzed reactions.
Reversible and irreversible reactions.
All living things have one!
So, metabolic networks consist of compounds (i.e. substrates and products) and enzymes.
Introduction
Why do we study metabolic networks?
Obviously, they are popular and fashionable topics!
Global properties and connectivity (Jeong et al 2000, Nature, 407:650)
Network organization and modules: (Ravasz et al 2002, Science, 297:1551)
All living organisms rely on metabolism, thus they are important!
Introduction
Why do we study metabolic networks?
Of course, we have better reasons than those two!
Metabolic pathways are chosen because they are relatively complete and have more data available than other molecular networks.
Introduction
Why do we study metabolic networks?
Within the same metabolic pathway, some enzymes are present in some organisms but absent in others.
[Figure: panels for E. coli and B. halodurans. From http://www.genome.jp/kegg/]
[Figure: Distribution of enzymes in 228 species of bacteria. x-axis: enzymes by ranked ID (1 to 33); y-axis: occurrence among the bacterial species (0 to 200).]
Introduction
Why do we study metabolic networks?
Introduction
Questions
Within the same metabolic pathway, why do enzymes occur differentially across organisms?
Can this difference be explained by network topology?
Are some enzymes topologically more important than others from the perspective of an enzyme network?
Methodology
A metabolic network consists of several sub-networks.
As a first step, we focus on one smaller and better known component, glycolysis, then expand from there.
We focus on organisms that are comparable with each other, i.e. bacteria, because a large amount of data is available for them.
Methodology
Basic outline:
1. From the KEGG website (http://www.genome.jp/kegg/), we determine for each enzyme how many bacterial species have it. This count is the enzyme's frequency of occurrence.
2. From the KEGG website, we extract information on glycolysis for 228 bacterial species to construct a reference enzyme network.
A reference network is thus the combination (union) of the networks of all 228 bacterial species.
We assume that the reference network contains all of the biologically possible nodes and links in the 228 bacterial species.
3. Determine the topological importance (to be defined later) of each enzyme.
4. Analyze the results from steps 1 and 3: test whether topological importance is associated with frequency of occurrence.
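Steps 1 and 2 of the outline can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the input format (one set of enzyme links per species), the function name build_reference and the toy data are all assumptions, and no KEGG access is shown.

```python
from collections import Counter

def build_reference(species_networks):
    """Combine per-species enzyme networks into one reference network.

    species_networks (hypothetical format) maps a species name to a set of
    undirected enzyme links, each link a frozenset of two enzyme IDs.
    Returns the union of all links and, per enzyme, the number of species
    in which it appears (its frequency of occurrence).
    """
    reference_links = set()
    occurrence = Counter()
    for links in species_networks.values():
        reference_links |= links
        enzymes_here = set()
        for link in links:
            enzymes_here |= link
        # Count each enzyme once per species. Enzymes without any link
        # are not seen by this sketch.
        occurrence.update(enzymes_here)
    return reference_links, occurrence

# Toy data, invented for illustration only.
toy = {
    "species_a": {frozenset({"E1", "E2"}), frozenset({"E2", "E3"})},
    "species_b": {frozenset({"E1", "E2"})},
    "species_c": {frozenset({"E2", "E3"}), frozenset({"E3", "E4"})},
}
reference_links, occurrence = build_reference(toy)
print(len(reference_links), dict(occurrence))
```

In this toy case E2 occurs in all three species, while E4 occurs in only one.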
Methodology
Details:
How to define a link between enzymes?
If two enzymes are involved in successive reaction steps, then a link is placed between those two enzymes.
Consider a hypothetical reaction:
A --E1--> B --E2--> C
This gives the enzyme link: E1 -- E2
Assumption: we ignore link directions.
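The link rule above can be sketched in a few lines. The reaction format (substrate, enzyme, product) and the function name enzyme_links are assumptions made for this example. Two enzymes are linked when the product of one is the substrate of the other, and direction is dropped.

```python
from itertools import product

def enzyme_links(reactions):
    """Derive undirected enzyme links from a list of reactions.

    reactions: list of (substrate, enzyme, product) tuples (assumed format).
    """
    produced_by = {}  # compound -> enzymes that produce it
    consumed_by = {}  # compound -> enzymes that consume it
    for substrate, enzyme, product_ in reactions:
        consumed_by.setdefault(substrate, set()).add(enzyme)
        produced_by.setdefault(product_, set()).add(enzyme)

    links = set()
    for compound, producers in produced_by.items():
        consumers = consumed_by.get(compound, ())
        for e1, e2 in product(producers, consumers):
            if e1 != e2:
                # frozenset ignores link direction
                links.add(frozenset({e1, e2}))
    return links

# The hypothetical reaction from the slide: A -> B via E1, B -> C via E2.
links = enzyme_links([("A", "E1", "B"), ("B", "E2", "C")])
print(links)
```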
Methodology
Details:
How to define importance or topological properties?
Several network indices:
Degree or connectivity,
Closeness centrality,
Betweenness centrality.
Methodology
Details:
Degree, D
The number of direct neighbors
[Figure: example network with nodes labeled D=1, D=4 and D=2.]
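As a small sketch, degree can be counted directly from a list of links. The toy network below is invented so that it contains nodes with D=1, D=4 and D=2. It is not the network in the figure.

```python
def degrees(links):
    """Return the number of direct neighbors of every node.

    links: iterable of (node_a, node_b) pairs, undirected, no duplicates.
    """
    degree = {}
    for a, b in links:
        degree[a] = degree.get(a, 0) + 1
        degree[b] = degree.get(b, 0) + 1
    return degree

# Toy network: a hub with four neighbors, one of which has a second link.
toy_links = [("hub", "a"), ("hub", "b"), ("hub", "c"), ("hub", "d"), ("d", "e")]
degree = degrees(toy_links)
print(degree)
```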
Methodology
Details:
Closeness centrality, CC (a simplified version)
CC_i measures how short the shortest paths are from node i to all other nodes:
$$CC_i = \sum_{j=1}^{N} d_{ij}$$
where i ≠ j, d_ij is the length of the shortest path between nodes i and j in the network, and
N is the number of nodes.
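The simplified closeness above (the sum of shortest path lengths from node i to every other node) can be sketched with a breadth-first search. Under this definition a smaller value means a more central node. The adjacency-dictionary format and the toy chain of four enzymes are assumptions, and the sketch assumes a connected, unweighted network.

```python
from collections import deque

def bfs_distances(adjacency, source):
    """Shortest path length from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist

def closeness(adjacency):
    """Sum of shortest path lengths from each node to all other nodes."""
    return {node: sum(bfs_distances(adjacency, node).values())
            for node in adjacency}

# Toy chain: E1 - E2 - E3 - E4
chain = {"E1": ["E2"], "E2": ["E1", "E3"], "E3": ["E2", "E4"], "E4": ["E3"]}
cc = closeness(chain)
print(cc)
```

The two middle enzymes of the chain get the smaller sums, i.e. they are the closest to everything else.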
Methodology
Details:
Betweenness centrality, BC
BC_i measures how often node i occurs on the shortest paths between pairs of other nodes:
$$BC_i = \frac{2 \sum_{j<k} g_{jk}(i) / g_{jk}}{(N-1)(N-2)}$$
where i ≠ j and i ≠ k, g_jk is the number of equally short shortest paths between nodes j and k,
g_jk(i) is the number of these shortest paths to which node i is incident, and
N is the number of nodes.
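The betweenness definition above can be sketched directly: for every pair j, k that excludes i, take the fraction of shortest paths passing through i, sum these fractions, and scale by 2/((N-1)(N-2)). This is a plain, unoptimized illustration on an assumed adjacency-dictionary format. It needs at least three nodes and is not necessarily how the authors computed their values.

```python
from collections import deque
from itertools import combinations

def shortest_path_counts(adjacency, source):
    """BFS giving distance and number of shortest paths from source."""
    dist = {source: 0}
    count = {source: 1}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                count[neighbor] = 0
                queue.append(neighbor)
            if dist[neighbor] == dist[node] + 1:
                # every shortest path to node extends to neighbor
                count[neighbor] += count[node]
    return dist, count

def betweenness(adjacency):
    nodes = list(adjacency)
    n = len(nodes)
    info = {v: shortest_path_counts(adjacency, v) for v in nodes}
    result = {}
    for i in nodes:
        dist_i, count_i = info[i]
        total = 0.0
        for j, k in combinations(nodes, 2):
            if i in (j, k):
                continue
            dist_j, count_j = info[j]
            if k not in dist_j or i not in dist_j or k not in dist_i:
                continue  # unreachable pair or node
            if dist_j[i] + dist_i[k] == dist_j[k]:
                # g_jk(i) = paths j..i times paths i..k; g_jk = count_j[k]
                total += count_j[i] * count_i[k] / count_j[k]
        result[i] = 2 * total / ((n - 1) * (n - 2))
    return result

# Toy chain: E1 - E2 - E3 - E4
chain = {"E1": ["E2"], "E2": ["E1", "E3"], "E3": ["E2", "E4"], "E4": ["E3"]}
bc = betweenness(chain)
print(bc)
```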
Methodology
Details:
1. Rank enzymes according to topological importance values and frequency of occurrence.
2. Rank correlation test.
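The slides do not name the rank correlation used. The sketch below assumes Spearman's coefficient, computed as the Pearson correlation of the ranks, with tied values sharing their average rank. The significance test is left out. In practice a library routine such as scipy.stats.spearmanr returns both the coefficient and a p-value.

```python
def average_ranks(values):
    """Rank values from 1 upward; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda idx: values[idx])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for idx in order[pos:end + 1]:
            ranks[idx] = mean_rank
        pos = end + 1
    return ranks

def pearson(x, y):
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    spread_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (spread_x * spread_y)

def spearman(x, y):
    """Spearman rank correlation (assumed choice of rank test)."""
    return pearson(average_ranks(x), average_ranks(y))

# Invented values: frequency of occurrence and a centrality for 5 enzymes.
frequency = [200, 150, 150, 40, 10]
centrality = [0.9, 0.7, 0.5, 0.2, 0.3]
r_rank = spearman(frequency, centrality)
print(round(r_rank, 3))
```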
Results
Frequency of occurrence:
We searched the KEGG website for enzymes involved in glycolysis for 228 bacterial species.
[Figure: frequency of occurrence of each glycolysis enzyme. x-axis: enzymes by ranked ID (1 to 33); y-axis: occurrence among the bacterial species (0 to 200).]
Results
Enzyme network for glycolysis:
33 nodes (N_enzyme), 61 links (L_enzyme)
Results
Enzyme Network: Topological Importance: Degree
[Figure: degree of each enzyme. x-axis: rank (1 to 33); y-axis: degree (0 to 8).]
Results
Enzyme Network: Topological Importance: Closeness Centrality
[Figure: closeness centrality of each enzyme. x-axis: rank (1 to 33); y-axis: closeness centrality (120 to 220).]
Results
Enzyme Network: Topological Importance: Betweenness Centrality
[Figure: betweenness centrality of each enzyme. x-axis: rank (1 to 33); y-axis: betweenness centrality (0 to 250).]
Results
Enzyme Network: Topological Importance: Comparison with random networks
Can these topological properties tell us about the structure of the glycolysis enzyme network, i.e. is it different from random networks?
We expect the importance values to be more homogeneous in a random network.
For the enzyme network, we calculate the mean and variance of each of the three properties above.
We construct 1000 random networks of the same size as our enzyme network and calculate the same means and variances for each of them.
We then ask how well the random networks reproduce the means and variances of the three topological properties of our enzyme network.
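A minimal sketch of this comparison for degree is given below. The slides do not specify the random model, so the sketch assumes "same size" means the same number of nodes (33) and links (61), with links drawn uniformly at random without repetition. The use of population variance is also an assumption. Such random networks can be disconnected, which would need handling before computing the two centralities.

```python
import random
from itertools import combinations
from statistics import pvariance

def random_network(n_nodes, n_links, rng):
    """Draw n_links distinct undirected links uniformly at random."""
    all_pairs = list(combinations(range(n_nodes), 2))
    return rng.sample(all_pairs, n_links)

def degree_variance(n_nodes, links):
    """Population variance of node degree (assumed variance definition)."""
    degree = [0] * n_nodes
    for a, b in links:
        degree[a] += 1
        degree[b] += 1
    return pvariance(degree)

def fraction_at_least(random_values, observed):
    """Share of random networks whose value is at least the observed one."""
    return sum(v >= observed for v in random_values) / len(random_values)

rng = random.Random(0)  # fixed seed so the sketch is repeatable
variances = [degree_variance(33, random_network(33, 61, rng))
             for _ in range(1000)]
print(min(variances), max(variances))
```

Because every network here has the same numbers of nodes and links, the mean degree is always 2 x 61 / 33, so only the variance can differ between networks. An observed variance would be compared against the random distribution with fraction_at_least.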
Results
Enzyme Network: Topological Importance: Comparison with random networks
Degree can not discriminate our enzyme network from random ones, while both centrality properties can!
[Figure: two panels, Degree and Betweenness centrality. x-axis: variance (0 to 8 for degree; 500 to 9000 with a broken axis for betweenness centrality).]
Results
Enzyme Network: Analysis: Rank correlation between frequency of occurrence and topological importance values
Frequency of occurrence vs Degree
r_rank = 0.14, p = 0.42
[Figure: scatter plot. x-axis: Rank(Frequency of occurrence); y-axis: Rank(Degree).]
Results
Enzyme Network: Analysis: Rank correlation between frequency of occurrence and topological importance values
Frequency of occurrence vs Closeness centrality
r_rank = -0.56, p = 0.0009
[Figure: scatter plot. x-axis: Rank(Frequency of occurrence); y-axis: Rank(Closeness centrality).]
Results
Enzyme Network: Analysis: Rank correlation between frequency of occurrence and topological importance values
Frequency of occurrence vs Betweenness centrality
r_rank = 0.63, p = 0.0003
[Figure: scatter plot. x-axis: Rank(Frequency of occurrence); y-axis: Rank(Betweenness centrality).]
What have we learned so far?
Degree (connectivity) is not very useful, as it considers only very local information.
In the enzyme network, the other, semi-global measures of topological importance seem to correlate with the frequency of occurrence.
Our conclusions are purely topological and so far hold only for glycolysis in 228 bacterial species.
We are aware of the simplicity of our models and assumptions.
We need to look at other metabolic pathways, and ultimately at the whole metabolic network.
From glycolysis to carbohydrate metabolism
339 enzymes, 1106 interactions
Blue nodes are those involved in glycolysis
The carbohydrate enzyme network
From glycolysis to carbohydrate metabolism
As in the previous section, closeness and betweenness centralities are good topological properties for showing that our enzyme network is far from random.
Furthermore, here degree can do the same!
[Figure: two panels. Panel 1 x-axis: variance (6 to 64, broken axis). Panel 2 axis label: degree.]
From glycolysis to carbohydrate metabolism
Rank correlations between frequency of occurrence (F.O.C.) and topological properties

                        Glycolysis enzyme network      Carbohydrate enzyme network
F.O.C. vs Degree        r_rank = 0.14, p = 0.42        r_rank = 0.39, p < 0.01
F.O.C. vs Close. Cen.   r_rank = -0.56, p = 0.0009     r_rank = -0.26, p < 0.01
F.O.C. vs Betw. Cen.    r_rank = 0.63, p = 0.0003      r_rank = 0.42, p < 0.01
From glycolysis to carbohydrate metabolism
In contrast to our initial findings for glycolysis, degree here seems to be a good network property that correlates with the frequency of occurrence.
Why is degree important? Probably because of preferential attachment during network evolution.
Closeness and betweenness centralities still behave in the same manner as before, but to a lesser extent.
Thus, the size or scale of the network might be influential.
From glycolysis to carbohydrate metabolism
Looking at the glycolysis enzyme network from the perspective of carbohydrate metabolism.
The carbohydrate enzyme network
Glycolysis is no longer a closed system.
We identify enzymes that are involved in glycolysis, and have their topological properties determined while treating glycolysis as a sub-network connected to the whole carbohydrate metabolism.
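The difference between the two treatments can be illustrated with degree, the simplest of the three properties. The sketch below compares each node's degree when the sub-network is isolated (only links inside the sub-network count) with its degree when embedded in the larger network (all links count). The toy network and node names are invented. The same comparison applies to the two centralities, at the cost of recomputing shortest paths on each network.

```python
def degree_isolated(links, sub_nodes):
    """Degree counting only links with both ends inside sub_nodes."""
    degree = {node: 0 for node in sub_nodes}
    for a, b in links:
        if a in sub_nodes and b in sub_nodes:
            degree[a] += 1
            degree[b] += 1
    return degree

def degree_embedded(links, sub_nodes):
    """Degree of sub_nodes counting every link in the larger network."""
    degree = {node: 0 for node in sub_nodes}
    for a, b in links:
        if a in sub_nodes:
            degree[a] += 1
        if b in sub_nodes:
            degree[b] += 1
    return degree

# Toy larger network: G* stand for sub-network enzymes, X* for the rest.
full_links = [("G1", "G2"), ("G2", "G3"), ("G3", "X1"),
              ("X1", "G1"), ("G2", "X2")]
sub = {"G1", "G2", "G3"}
isolated = degree_isolated(full_links, sub)
embedded = degree_embedded(full_links, sub)
print(isolated, embedded)
```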
From glycolysis to carbohydrate metabolism
Rank correlations between frequency of occurrence (F.O.C.) and topological properties

                        Glycolysis enzyme network      Glycolysis enzyme network (Sub)
F.O.C. vs Degree        r_rank = 0.14, p = 0.42        r_rank = 0.08, p = 0.65
F.O.C. vs Close. Cen.   r_rank = -0.56, p = 0.0009     r_rank = -0.22, p = 0.21
F.O.C. vs Betw. Cen.    r_rank = 0.63, p = 0.0003      r_rank = 0.31, p = 0.08

Sub: as a sub-network of the carbohydrate enzyme network
From glycolysis to carbohydrate metabolism
Thus, our results depend on whether the glycolysis enzyme network is treated as a closed, isolated system or as part of the larger carbohydrate enzyme network.
One explanation is that an enzyme might appear in different pathways, such that distant enzymes will be brought in the vicinity of each other in the network; or an enzyme might be important topologically in one pathway, but less so in others.
Should the same enzyme appearing in different pathways be treated as the same node? Further investigation is required.
Conclusion
Topological properties might play a role in the conservation of different enzymes in different bacterial species.
Betweenness centrality is probably an important property! It might identify enzymes that occupy network positions through which metabolites can be converted into one another efficiently, and also redundant (by-pass) enzymes that occupy more central positions in the network.
Because of the size/scale issue, we need to expand our enzyme network to the whole metabolic network.
Further investigation is required to examine the relationship between larger networks and their components (i.e. sub-networks).
Thank you for your attention