The importance of enzymes and their occurrences: from the perspective of a network

W.C. Liu¹, W.H. Lin¹, S.T. Yang¹, F. Jordan², A.J. Davis³ and M.J. Hwang¹

¹ Institute of Biomedical Sciences, Academia Sinica, Taiwan. ² Collegium Budapest, Hungary. ³ Max Planck Institute for Chemical Ecology, Germany.

Introduction

What is a metabolic network?

From http://www.genome.jp/kegg/

Nodes: compounds

Links: enzyme-catalyzed reactions

The textbook version, as defined by humans!

Introduction

What is a metabolic network?

The sum of all chemical transformations in a cell.

Enzyme-catalyzed reactions.

Reversible and irreversible reactions.

All living things have one!

So, metabolic networks consist of compounds (i.e. substrates and products) and enzymes.

Introduction

Why do we study metabolic networks?

Obviously, they are popular and fashionable topics!

Global properties and connectivity (Jeong et al. 2000, Nature, 407:650)

Network organization and modules (Ravasz et al. 2002, Science, 297:1551)

All living organisms rely on metabolism, so metabolic networks are important!

Introduction


Of course, we have better reasons than those two!

Metabolic pathways are chosen for their relative completeness and for the amount of data available, compared with other molecular networks.

Introduction


Within the same metabolic pathway, some enzymes are present in some organisms but absent in others.

E. coli B. halodurans

From http://www.genome.jp/kegg/

[Figure: Distribution of enzymes in 228 species of bacteria. x-axis: enzymes by ranked ID (1 to 33); y-axis: occurrence among the bacterial species (0 to 200).]

Introduction

Questions

Within the same metabolic pathway, why do enzymes occur in some organisms but not in others?

Can this difference be explained by network topology?

Are some enzymes topologically more important than others when the pathway is viewed as an enzyme network?

Methodology

A metabolic network consists of several sub-networks.

As a first step, we focus on one smaller and better known component, glycolysis, then expand from there.

We focus on organisms that are comparable with each other, i.e. bacteria, for which a large amount of data is available.

Methodology

Basic outline:

1. From the KEGG website (http://www.genome.jp/kegg/), for each enzyme we determine how many bacterial species have that enzyme. This count is the enzyme's frequency of occurrence.

2. From the KEGG website we extract information on glycolysis for 228 bacterial species to construct a reference enzyme network.

A reference network is thus the union of the networks of all 228 bacterial species.

We assume that the reference network contains all of the biologically possible nodes and links in the 228 bacterial species.

3. Determine the topological importance (to be defined later) of enzymes.

4. Analyze the results from 1 and 3: test whether topological importance is associated with frequency of occurrence.
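Steps 1 and 2 of the outline can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the species names, enzyme labels and links are made up, the data are typed in by hand rather than retrieved from KEGG, and an enzyme is counted as present in a species if it appears in any of that species' links.

```python
from collections import Counter

# Hypothetical per-species data: species -> set of enzyme-enzyme links.
# All names and links below are invented for illustration only.
species_links = {
    "species_A": {("E1", "E2"), ("E2", "E3")},
    "species_B": {("E1", "E2"), ("E2", "E4")},
    "species_C": {("E2", "E3"), ("E3", "E4")},
}


def enzymes_of(links):
    """Return the set of enzymes that appear in any link."""
    return {enzyme for link in links for enzyme in link}


def frequency_of_occurrence(species_links):
    """Count, for each enzyme, the number of species that have it."""
    counts = Counter()
    for links in species_links.values():
        counts.update(enzymes_of(links))
    return dict(counts)


def reference_network(species_links):
    """Union of all species networks; each link is stored undirected."""
    union = set()
    for links in species_links.values():
        union.update(frozenset(link) for link in links)
    return union


freq = frequency_of_occurrence(species_links)
reference = reference_network(species_links)
```

Storing each link as a `frozenset` means that (E1, E2) and (E2, E1) count as the same link, which matches the decision to ignore link directions.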

Methodology

Details:

How to define a link between enzymes?

If two enzymes are involved in successive reaction steps, then a link is placed between those two enzymes.

Consider a hypothetical reaction:

A --E1--> B --E2--> C

which gives the enzyme link:

E1 -- E2

Assumption: we ignore link directions.
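The link rule can be written as a small Python sketch. It assumes that "successive reaction steps" means the product of one reaction is the substrate of the next; the reaction list and the function name `enzyme_links` are hypothetical and only serve the illustration.

```python
from itertools import product

# Hypothetical reactions written as (substrate, enzyme, product).
reactions = [
    ("A", "E1", "B"),
    ("B", "E2", "C"),
    ("C", "E3", "D"),
]


def enzyme_links(reactions):
    """Link two enzymes when the product of one is the substrate of the other."""
    links = set()
    for (_, enz1, prod1), (sub2, enz2, _) in product(reactions, repeat=2):
        if enz1 != enz2 and prod1 == sub2:
            # frozenset makes the link undirected: direction is ignored.
            links.add(frozenset((enz1, enz2)))
    return links


links = enzyme_links(reactions)
```

For the three reactions above, E1 is linked to E2 and E2 to E3, but E1 and E3 are not linked because their reactions are not successive.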

Methodology

Details:

How to define importance or topological properties?

Several network indices:

Degree or connectivity,

Closeness centrality,

Betweenness centrality.

Methodology

Details:

Degree, D

The number of direct neighbors

[Figure: example network with nodes labelled D=1, D=4 and D=2.]
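As a minimal sketch, degree can be computed by counting the distinct neighbours of each node. The small network below is hypothetical and was chosen so that it contains nodes with D=1, D=4 and D=2.

```python
from collections import defaultdict

# Hypothetical undirected enzyme links.
links = [("E1", "E2"), ("E2", "E3"), ("E2", "E4"), ("E2", "E5"), ("E3", "E4")]


def degrees(links):
    """Return the number of direct neighbours of every node."""
    neighbours = defaultdict(set)
    for a, b in links:
        neighbours[a].add(b)
        neighbours[b].add(a)
    return {node: len(adjacent) for node, adjacent in neighbours.items()}


degree = degrees(links)
```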

Methodology

Details:

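Closeness centrality, C (a simplified version)

C_i measures how short the shortest paths are from node i to all other nodes:

\[ C_i = \sum_{j=1}^{N} d_{ij} \]

where i ≠ j, d_ij is the length of the shortest path between nodes i and j in the network, and N is the number of nodes.

Because this simplified version sums distances, a smaller C_i indicates a more central node.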
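A minimal Python sketch of this index follows. It assumes that the simplified closeness is the sum of shortest-path lengths from node i to every other node, and that the network is connected and unweighted, so a breadth-first search gives the distances. The example links and function names are hypothetical.

```python
from collections import deque


def adjacency(links):
    """Build an undirected adjacency map from a list of links."""
    adj = {}
    for a, b in links:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def shortest_path_lengths(adj, source):
    """Breadth-first search: distance from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adj[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def closeness(links):
    """Sum of shortest-path lengths from each node to all other nodes."""
    adj = adjacency(links)
    return {node: sum(shortest_path_lengths(adj, node).values()) for node in adj}


# Hypothetical chain of four enzymes: E1 - E2 - E3 - E4.
c = closeness([("E1", "E2"), ("E2", "E3"), ("E3", "E4")])
```

In the chain, the two middle enzymes get the smaller sums, so under this definition a low value marks a central position.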

Methodology

Details:

Betweenness centrality, B

B_i measures how often node i occurs on the shortest paths between pairs of other nodes:

\[ B_i = \frac{2 \sum_{j<k} g_{jk}(i) / g_{jk}}{(N-1)(N-2)} \]

where i ≠ j, k; g_jk is the number of equally short shortest paths between nodes j and k; g_jk(i) is the number of these shortest paths that pass through node i; and N is the number of nodes.
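The following sketch computes betweenness by brute force over all node pairs, assuming the normalised form 2 Σ_{j<k} g_jk(i)/g_jk divided by (N-1)(N-2). It also assumes a connected, unweighted, undirected network with at least three nodes. It is written for clarity, not speed, and the function names are hypothetical.

```python
from collections import deque
from itertools import combinations


def adjacency(links):
    """Build an undirected adjacency map from a list of links."""
    adj = {}
    for a, b in links:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def bfs_counts(adj, source):
    """Distance and number of shortest paths from source to every node."""
    dist = {source: 0}
    n_paths = {source: 1}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adj[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                n_paths[neighbour] = 0
                queue.append(neighbour)
            if dist[neighbour] == dist[node] + 1:
                n_paths[neighbour] += n_paths[node]
    return dist, n_paths


def betweenness(links):
    """Normalised betweenness of every node (network assumed connected)."""
    adj = adjacency(links)
    nodes = sorted(adj)
    n = len(nodes)
    info = {v: bfs_counts(adj, v) for v in nodes}
    result = {}
    for i in nodes:
        dist_i, paths_i = info[i]
        total = 0.0
        for j, k in combinations(nodes, 2):
            if i in (j, k):
                continue
            dist_j, paths_j = info[j]
            # Node i lies on a shortest j-k path only if the distances add up.
            if dist_j[i] + dist_i[k] == dist_j[k]:
                total += paths_j[i] * paths_i[k] / paths_j[k]
        result[i] = 2 * total / ((n - 1) * (n - 2))
    return result


# Hypothetical star network: E2 in the centre, three leaves.
star = betweenness([("E2", "E1"), ("E2", "E3"), ("E2", "E4")])
# Hypothetical four-node ring: two equally short paths between opposite nodes.
ring = betweenness([("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")])
```

In the star, every shortest path between leaves passes through the centre, so the centre reaches the maximum value of 1 and the leaves score 0. In the ring, each node carries half of the paths between its two neighbours.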

Methodology

Details:

1. Rank enzymes according to topological importance values and frequency of occurrence.

2. Rank correlation test.
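The ranking and correlation steps can be illustrated as follows. The slides do not say which rank correlation was used, so this sketch assumes Spearman's coefficient (the Pearson correlation of the ranks, with tied values given their average rank). The input numbers are invented, and no p-value is computed.

```python
def average_ranks(values):
    """Rank values from 1 (smallest); tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda idx: values[idx])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for idx in order[pos:end + 1]:
            ranks[idx] = mean_rank
        pos = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical values for four enzymes.
frequency = [200, 150, 90, 30]
importance = [40.0, 55.0, 10.0, 2.0]
r_rank = spearman(frequency, importance)
```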

Results

Frequency of occurrence:

We searched the KEGG website for enzymes involved in glycolysis for 228 bacterial species.

[Figure: occurrence of each glycolysis enzyme among the bacterial species. x-axis: enzymes by ranked ID (1 to 33); y-axis: occurrence (0 to 200).]

Results

Enzyme network for glycolysis:

33 nodes (N_enzyme), 61 links (L_enzyme)

Results

Enzyme Network: Topological Importance: Degree

[Figure: degree of each enzyme. x-axis: rank (1 to 33); y-axis: degree (0 to 8).]

Results

Enzyme Network: Topological Importance: Closeness Centrality

[Figure: closeness centrality of each enzyme. x-axis: rank (1 to 33); y-axis: closeness centrality (120 to 220).]

Results

Enzyme Network: Topological Importance: Betweenness Centrality

[Figure: betweenness centrality of each enzyme. x-axis: rank (1 to 33); y-axis: betweenness centrality (0 to 250).]

Results

Enzyme Network: Topological Importance: Comparison with random networks

Can these topological properties tell us something about the structure of the glycolysis enzyme network, i.e. is it different from a random network?

We expect the importance values to be more homogeneous in a random network.

For the enzyme network, we calculate the mean and variance of each of the three properties above.

We construct 1000 random networks of the same size as our enzyme network. For each random network, we calculate the mean and variance of the same three properties.

We then ask how well the random networks reproduce the means and variances of the three topological properties of our enzyme network.
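The comparison can be sketched for degree as below. The slides do not describe how the random networks were generated, so this sketch assumes that "the same size" means the same number of nodes and links (33 and 61 for the glycolysis enzyme network) and that links are placed uniformly at random. The observed variance passed in is a placeholder, not a value from the study, and the function names are hypothetical.

```python
import random
from itertools import combinations
from statistics import pvariance


def random_links(n_nodes, n_links, rng):
    """Draw n_links distinct undirected links uniformly at random."""
    all_pairs = list(combinations(range(n_nodes), 2))
    return rng.sample(all_pairs, n_links)


def degree_variance(n_nodes, links):
    """Population variance of the node degrees."""
    degree = [0] * n_nodes
    for a, b in links:
        degree[a] += 1
        degree[b] += 1
    return pvariance(degree)


def fraction_at_least(observed, n_nodes, n_links, n_random=1000, seed=0):
    """Fraction of random networks whose degree variance is >= observed."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_random):
        links = random_links(n_nodes, n_links, rng)
        if degree_variance(n_nodes, links) >= observed:
            hits += 1
    return hits / n_random


# Placeholder observed variance (hypothetical, not taken from the study).
fraction = fraction_at_least(observed=3.0, n_nodes=33, n_links=61)
```

A fraction close to 0 or 1 would suggest that the random networks rarely reproduce the observed variance. The same loop could be repeated with closeness or betweenness in place of degree, although randomly placed links may leave a network disconnected, which those two indices would need to handle.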

Results

Enzyme Network: Topological Importance: Comparison with random networks

Degree cannot discriminate our enzyme network from random ones, while both centrality properties can!

[Figure: distributions of variance across the random networks. Panel "Degree": variance on the x-axis (0 to 8), vertical axis 0 to 250. Panel "Betweenness centrality": variance on the x-axis (500 to 2500 and 7000 to 9000), vertical axis 0 to 200.]

Results

Enzyme Network: Analysis: Rank correlation between frequency of occurrence and topological importance values

Frequency of occurrence vs Degree

r_rank = 0.14, p = 0.42

[Figure: scatter plot of Rank(Degree) against Rank(Frequency of occurrence).]

Results

Enzyme Network: Analysis: Rank correlation between frequency of occurrence and topological importance values

Frequency of occurrence vs Closeness centrality

r_rank = -0.56, p = 0.0009

[Figure: scatter plot of Rank(Closeness centrality) against rank of frequency of occurrence.]

Results

Enzyme Network: Analysis: Rank correlation between frequency of occurrence and topological importance values

Frequency of occurrence vs Betweenness centrality

r_rank = 0.63, p = 0.0003

[Figure: scatter plot of Rank(Betweenness centrality) against rank of frequency of occurrence.]

What have we learned so far?

Degree (connectivity) is not very useful here, as it only considers very local information.

In the enzyme network, the semi-global measures of topological importance (closeness and betweenness centrality) do seem to correlate with the frequency of occurrence.

Our conclusions are drawn only from a topological point of view, and so far hold only for glycolysis in 228 bacterial species.

We are aware of the simplicity of our models and assumptions.

We need to look at other metabolic pathways, and ultimately at the whole metabolic network.

From glycolysis to carbohydrate metabolism

339 enzymes, 1106 interactions

Blue nodes are those involved in glycolysis

The carbohydrate enzyme network


As in the previous section, closeness and betweenness centralities are good topological properties for showing that our enzyme network is far from random.

Furthermore, degree now does so too!

[Figure: Degree. One panel with variance on the x-axis (6 to 10 and 60 to 64), vertical axis 0 to 400; one panel with degree on the x-axis (1 to 51), vertical axis 0 to 40.]


Rank correlations between frequency of occurrence (F.O.C.) and topological properties

                        Glycolysis enzyme network     Carbohydrate enzyme network
F.O.C. vs Degree        r_rank = 0.14, p = 0.42       r_rank = 0.39, p < 0.01
F.O.C. vs Close. Cen.   r_rank = -0.56, p = 0.0009    r_rank = -0.26, p < 0.01
F.O.C. vs Betw. Cen.    r_rank = 0.63, p = 0.0003     r_rank = 0.42, p < 0.01


In contrast to our trial findings for glycolysis, degree here seems to be a good network property that correlates with the frequency of occurrence.

Why is degree important? Probably because of preferential attachment during network evolution.

Closeness and betweenness centralities still behave in the same manner as before, but to a lesser extent.

Thus, the size or scale of the network might be influential.


Looking at the glycolysis enzyme network from the perspective of carbohydrate metabolism.

The carbohydrate enzyme network

Glycolysis is no longer a closed system.

We identify the enzymes involved in glycolysis and determine their topological properties while treating glycolysis as a sub-network connected to the rest of carbohydrate metabolism.
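The difference between the two treatments can be illustrated with degree, the simplest of the three indices. In this hypothetical sketch, nodes G1 to G3 stand for glycolysis enzymes and X1, X2 for other carbohydrate enzymes; all names and links are invented. The same node can receive different values depending on whether links that leave the sub-network are counted.

```python
def degrees(links, nodes_of_interest):
    """Degree of each node of interest, counting every link supplied."""
    degree = {node: 0 for node in nodes_of_interest}
    for a, b in links:
        if a in degree:
            degree[a] += 1
        if b in degree:
            degree[b] += 1
    return degree


# Hypothetical whole network and the subset of glycolysis enzymes.
whole_links = [("G1", "G2"), ("G2", "G3"), ("G3", "X1"), ("X1", "G1"), ("X1", "X2")]
glycolysis = {"G1", "G2", "G3"}

# Closed system: keep only links whose two ends are both in glycolysis.
isolated_links = [(a, b) for a, b in whole_links if a in glycolysis and b in glycolysis]

isolated = degrees(isolated_links, glycolysis)   # glycolysis on its own
embedded = degrees(whole_links, glycolysis)      # glycolysis within the whole network
```

Here G1 and G3 gain a neighbour once the surrounding network is included, while G2 is unchanged. Closeness and betweenness can shift in the same way, since shortest paths may run through enzymes outside glycolysis.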


Rank correlations between frequency of occurrence (F.O.C.) and topological properties

                        Glycolysis enzyme network     Glycolysis enzyme network (Sub)
F.O.C. vs Degree        r_rank = 0.14, p = 0.42       r_rank = 0.08, p = 0.65
F.O.C. vs Close. Cen.   r_rank = -0.56, p = 0.0009    r_rank = -0.22, p = 0.21
F.O.C. vs Betw. Cen.    r_rank = 0.63, p = 0.0003     r_rank = 0.31, p = 0.08

Sub: as a sub-network of the carbohydrate enzyme network


Thus, our results depend on whether the glycolysis enzyme network is treated as a closed, isolated system or as part of the larger carbohydrate enzyme network.

One explanation is that an enzyme might appear in different pathways, so that otherwise distant enzymes are brought close to each other in the network; or an enzyme might be topologically important in one pathway but less so in others.

Should the same enzyme appearing in different pathways be treated as the same node? Further investigation is required.

Conclusion

Topological properties might play a role in the conservation of different enzymes in different bacterial species.

Betweenness centrality is probably an important property! It might identify enzymes that occupy network positions from which metabolites can be converted into one another efficiently, and also identify redundant (by-pass) enzymes that occupy more central positions in the network.

Because of the size/scale issue, we need to expand our enzyme network to the whole metabolic network.

Further investigations are required to examine the relationship between larger networks and their components (i.e. sub-networks).

Thank you for your attention
