a new method for assembling metabolic networks, with...

J. theor. Biol. (2001) 208, 361}382doi:10.1006/jtbi.2000.2225, available online at http://www.idealibrary.com on

A New Method for Assembling Metabolic Networks,with Application to the Krebs Citric Acid Cycle

JAY E. MITTENTHAL*-, BERTRANDCLARKE?, THOMAS G. WADDELLA AND GLENN FAWCETTE

*Department of Cell and Structural Biology, ;niversity of Illinois, 601 S. Goodwin St., ;rbana,I¸ 61801, ;.S.A., -Department of Statistics, ;niversity of British Columbia, ¸eonard S. Klink Building,<ancouver, BC, Canada <6¹ 1Z2, ADepartment of Chemistry, ;niversity of ¹ennessee at Chattanooga,

Chattanooga, ¹N 37403, ;.S.A. and ECaracal Consulting Inc., d106-1670=est 8th Ave,<ancouver, BC, Canada <6J 1<4

(Received on 19 April 2000, Accepted in revised form on 28 October 2000)

To understand why a molecular network has a particular connectivity one can generate anensemble of alternative networks, all of which meet the same performance criteria as the realnetwork. We have generated alternatives to the Krebs cycle, allowing group transfers andB12

-mediated shifts that were excluded in previous work. Our algorithm does not usea reaction list, but determines the reactants and products in generic reactions. It generatesnetworks in order of increasing number of reaction steps. We "nd that alternatives to theKrebs cycle are very likely to be cycles. Many of the alternatives produce toxic or unstablecompounds and use group transfer reactions, which have unfavorable consequences. Althoughalternatives are better than the Krebs cycle in some respects, the Krebs cycle has the mostfavorable combination of traits.

( 2001 Academic Press

1. Introduction

To characterize a molecular network onespeci"es its topology*the connectivity amongreactions*and its kinetics*the values of rateconstants that govern the time-dependence ofconcentrations. To understand why a molecularnetwork has a particular topology one can gener-ate an ensemble of alternative networks that meetthe same performance criteria as the real networkbut that have diverse topologies. The alternativesmay be useful for exploring the evolution or theoptimality of real networks, or for designing net-works to new speci"cations. The alternatives mayalso be useful to experimenters for generating

-Author to whom correspondence should be addressed.

0022}5193/01/030361#22 $35.00/0

hypotheses about unknown aspects of a networkunder investigation. Here we present a newmethod for generating an ensemble of metabolicnetworks and use it to study the optimality of theKrebs citric acid cycle.

1.1. ALGORITHMS FOR ASSEMBLING NETWORKS

Algorithms to assemble networks from reac-tions have been proposed and used for metabol-ism and signal transduction (Mavrovouniotiset al., 1990; Happel et al., 1990; Mittenthal, 1996;Nun8 o et al., 1997). These algorithms require ex-plicit lists of allowed compounds and reactions.The algorithms can accomodate local con-straints}restrictions on individual compoundsor reactions. For example, a compound may be

( 2001 Academic Press

362 J. E. MITTENTHAL E¹ A¸.

required, allowed, or excluded as an input, out-put, or intermediate of networks that the algo-rithm generates.

In addition to local constraints there may alsobe global constraints, such as minimizing thenumber of reaction steps in the network or thenumber of kinds of enzymes used. The precedingalgorithms have to generate all possible networksthat meet the local constraints, even if most ofthese networks are far from meeting the globalconstraints. So, it is desirable to develop algo-rithms that are responsive to global as well aslocal constraints. Furthermore, the compoundsand reactions in molecular networks other thanmetabolism are often incompletely known. Itwould be useful to construct networks with gen-eric reactions, in which the operation of the algo-rithm places constraints on reactants, products,and topology. With this motivation we haveimplemented a new algorithm that uses genericreactions rather than a reaction list, and thatgenerates networks with fewer reaction stepsearlier. We evaluate the alternative networksusing performance criteria.

1.2. PERFORMANCE CRITERIA FOR THE KREBS CYCLE

We have used our method to generate alterna-tives to the Krebs cycle. Previous research hasspeci"ed some performance criteria that theKrebs cycle probably meets. The number of reac-tion steps is likely to be minimal, because a net-work with fewer steps tends to use less of thegenome's limited coding capacity (Kirkwoodet al., 1986), to use less of the limited concentrationof proteins in the cytoplasm (Brown, 1991), andto have a greater #ux for given concentrations ofthe input and output metabolites (MeleH ndez-Hevia et al., 1994). Given these concentrations,the rate at which the network produces ATP(adenosine 5@-triphosphate) is likely to be maxi-mal. Heinrich et al. (1997) and MeleH ndez-Heviaet al. (1997) have shown that in a linear pathwaythe rate of producing ATP is maximal if reactionsnear the beginning of the pathway are exergonicand those near the end are endergonic, as inglycolysis. This expectation seems reasonable fora cyclic pathway, taking the reaction joining thesubstrate to a feeder from the cycle as the "rststep in the pathway.

The Krebs cycle produces energy for cellularmetabolism, but also makes carboxylic acids thatare substrates for the biosynthesis of amino acids.In such a multifunctional pathway, reuse of reac-tion steps in meeting multiple constraints is animportant criterion of performance. Reuse tendsto reduce the number of steps needed to meet allthe constraints, and increases the selection pres-sure for retaining steps.

1.3. ASSESSING THE OPTIMALITY OF

METABOLIC PATHWAYS

To "nd the best alternatives in a large en-semble of alternative networks, there are severalapproaches. One can use evolutionary computa-tion to mutate and select alternative networks, ashas been done for glycolysis (Heinrich et al., 1999;Stephani et al., 1999). One can classify the pos-sible networks, list the possible mechanisms forimplementing each class of network, and evaluatethe chemical feasibility and biological plausibilityof each mechanism. This approach has been usedto demonstrate the optimality of the pentosephosphate pathway (MeleH ndez-Hevia, 1990;MeleH ndez-Hevia et al., 1994; Mittenthal et al.,1998), glycolysis (Heinrich et al., 1997; MeleH ndez-Hevia et al., 1997; Stephani & Heinrich, 1998).Using classi"cation and evaluation, MeleH ndez-Hevia et al. (1996) concluded that the Krebs cycleis optimal in having the fewest steps and thegreatest yield of ATP. We have generated a largeensemble of alternatives to the Krebs cycle, ex-panding the set of allowed reactions to includereactions not admitted by MeleH ndez-Hevia et al.(1996)*rearrangements mediated by vitamin B

12and group transfer reactions.

2. Overview of the Method

Here we summarize our method. The appendixprovides details.

2.1. CONSTRAINTS ON STOICHIOMETRY

AND COMPOUNDS

The method constructs alternatives to ametabolic network that mediates speci"c overallreactions, which are expressed as stoichiometricconstraints. We assume that any alternative tothe Krebs cycle mediates three overall reactions:

ASSEMBLING ALTERNATIVE NETWORKS 363

(I) Convert pyruvate to three CO2

molecules.That is, convert a three-carbon compound tothree one-carbon compounds: 3P1#1#1.(II) Convert two pyruvates to 2-ketoglutarateand CO

2: 3#3P5#1. (III) Convert two

pyruvates to oxalacetate and two CO2:

3#3P4#1#1. Reactions I}III use pyruvateto produce energy, and to produce four- and"ve-carbon skeletons for the biosynthesis ofamino acids.

2.2. GENERIC REACTIONS AND C-NETS

Our procedure does not use a list of reactions,but instead uses generic reactions (g-reactions)with two inputs and two outputs. The inputs andoutputs are unspeci"ed initially, but become in-creasingly characterized through the procedure.The rationale for using a g-reaction as the ele-mentary process is that it represents a binaryinteraction. In general, any set of interactionsamong molecules is decomposable to a networkof binary interactions.

A g-reaction may represent a cleavage; then ithas one null input, so it converts one input to twooutputs. Or, a g-reaction may represent an addi-tion; then it has a null output, so it converts twoinputs to one output. If both inputs and both

FIG. 1. C-nets from the Krebs cycle that meet stoichiometricI (left), and in a simpli"ed form for I (right), II, and III. The num2"acetate; 3"pyruvate; 4"succinate, fumarate, malate, orRecurrents for 1-carbon compounds are not shown in this and

outputs are non-null, the g-reaction is a grouptransfer reaction of the form

A#B}GPA}G#B.

A network of g-reactions with speci"ed connect-ivity and with the number of carbon atoms ineach compound determined will be called aC-net. Figure 1 shows the C-nets for the portionsof the Krebs cycle that meet constraints I}III.Note that the C-net for constraint I has a recur-rent compound*a compound used early in thenetwork that must be produced later or suppliedexogenously. Each recurrent in a network formsa cycle.

2.3. CONSTRUCTING AN ENSEMBLE OF C-NETS

FOR EACH STOICHIOMETRIC CONSTRAINT

The method constructs an ensemble of C-netsthat are compatible with each stoichiometricconstraint, as follows.

(1) Let N be the number of g-reactions. In-crease N, starting with the smallest N that canmeet the stoichiometric constraint. For each N,determine all possible connectivities amongg-reactions.

constraints I}III. Generic reactions are shown explicitly forbers represent the number of carbon atoms in compounds:oxaloacetate; 5"a-ketoglutarate; 6"citrate or isocitrate.subsequent "gures.

FIG. 2. Paranet for the Krebs cycle. Numbers designatecompounds as in Fig. 1. An asterisk ( * ) preceding a com-pound means that there are alternative sources for it; anasterisk following a compound signi"es alternative sinksconsuming it. The 4 with a * preceding it may be synthesizedfrom 3#1, or obtained from a recurrent 4. The 4 and5 followed by a * may be used as shown, or pass intopathways that are not shown.


(2) For each connectivity, assign inputs andoutputs from the stoichiometric constraint toparticular g-reactions in all possible ways. Set allremaining inputs and outputs of g-reactions tozero.

(3) In each of the resulting networks, deter-mine the number of carbon atoms in the inputsand outputs for each g-reaction.

(4) In the preceding steps, avoid the followingconditions in any g-reaction:

(4A) The inputs must not be the same as theoutputs, because such a g-reaction does not ac-complish any transformation.

(4A1) One input must not be the same as oneoutput, because the other input must then be thesame as the other output.

(4A2) (0, 0) and (0, 1) are not allowed as inputsor outputs.

(4B) One input must not be greater than thesum of both outputs, or one output greater thanthe sum of both inputs. Such g-reactions wouldviolate the conservation of carbon atoms.

(5) Remove all but one representative of eachset of isomorphic C-nets. To de"ne the isomor-phism of two C-nets, let us regard a C-net as a setof g-reactions, each with its own label. Two C-nets are isomorphic if and only if the g-reactionsin one of them can be relabeled to coincide withthe g-reactions in the other.

(6) Remove futile cycles in C-nets. A futilecycle is a subset of the reactions of a connectedC-net that has the null reaction, 0P0, as its netreaction. Appendix A, Section A.1.6 gives ourmethod of "nding futile cycles. Section 3.1 givesour rationale for removing futile cycles.

2.4. CONSTRUCTING AN ENSEMBLE OF

PARANETS COMPATIBLE WITH

ALL STOICHIOMETRIC CONSTRAINTS

The above procedure produces a sequenceof C-nets for each stoichiometric constraint,

ordered by increasing N. If there are severalconstraints, the next step is to produce a se-quence of networks that meet all the constraints,again in order of increasing N. We do this bylooking at the intersections of C-nets producedfrom the individual constraints. When we com-bine these C-nets the resulting networks are notC-nets, strictly speaking; we call them paranets.In a paranet, a given compound can be used intwo or more di!erent ways, depending on theconstraint the paranet is meeting. Figure 2 showsthe paranet for the Krebs cycle.

The paranets generated so far are C-paranets;they are assembled from C-nets, and theyrepresent each compound only by the numberof carbons in its skeleton. By hand, we elab-orated the best C-paranets into realisticnetworks of biochemical compounds*R-paranets.

2.5. EVALUATING PARANETS

Criteria are needed for choosing which of themany C-paranets to convert to R-paranets. Pre-vious studies of metabolic networks, cited above,suggest that better paranets will tend to havefewer g-reactions and to reuse g-reactions inmeeting more than one stoichiometric constraint.Our measure of reuse in a paranet is

P"

total number of reactions in all networks of the paranet(number of constraints)]N

.


P ranges from 1/(number of constraints) if allnetworks of the paranet are disjoint, to one if allthe networks are identical.

To choose C-paranets for conversion to R-paranets we used the quality Q"N/P, becausefavorable paranets have small N and/or large P.For a C-paranet the quality is Q

C"N

C/P

C.

A better paranet has a smaller value of QC; it has

fewer reactions and/or uses more reactions thathelp to meet at least two of the stoichiometricconstraints.

2.6. CONVERTING A C- TO AN R-PARANET

The g-reactions of a C-paranet redistribute thecarbon atoms in the reacting metabolites. Each ofthese metabolites is only speci"ed by the numberof carbon atoms it contains. Conversion of a C-to an R-paranet proceeds through the followingstages, here as in our work on the pentose phos-phate pathway (Mittenthal et al., 1998).

(a) For each metabolite of every g-reaction ina C-paranet, specify the arrangement of carbonatoms as a carbon skeleton. In a paranet twometabolites with the same number of carbonatoms can be assigned di!erent carbon skeletons.

(b) Each g-reaction then mediates a trans-formation of carbon skeletons. Choose an en-zyme compatible with this transformation, andassign the functional groups required for opera-tion of the enzyme to the relevant carbon atoms.If the same g-reaction occurs more than once ina paranet, di!erent exemplars of it can use di!er-ent enzymes and functional groups.

(c) Typically, additional reactions must thenbe added to the network. In general, a sequenceof these connector steps converts the output ofone g-reaction to the input of another or toa "nal product, without altering carbon skel-etons. Thus, two or more metabolites with thesame carbon skeleton can have di!erent patternsof functional groups, in a sequence of connectorsteps.

(d) Some functional groups may not yet bespeci"ed at this stage. We assigned these func-tional groups so as to minimize the total numberof functional group changes, compatible with theassignment of enzymes and with constraints oninput and output metabolites of the R-paranet.

Through these stages we converted C- to R-paranets*networks of realistic reactions amongmetabolites with all functional groups speci"ed.We used our knowledge of enzymes that changecarbon skeletons by cleavage, addition, or grouptransfer reactions, and that modify functionalgroups without changing a carbon skeleton.In implementing this process for the Krebs cycleour aims were those of MeleH ndez-Hevia et al.(1996):

(1) Use pyruvate as the three-carbon input,CO

2as the one-carbon output, oxalacetate as the

four-carbon output, and 2-ketoglutarate as the"ve-carbon output.

(2) Use few steps: try to minimize the numberof steps in the R-paranet, N

R, by the choice of

enzymes for g-reactions and by minimizing thenumber of changes in functional groups betweeng-reactions. Note, in counting steps we regarda compound that does not leave the active site ofan enzyme as occurring within a single step. Forexample, in the Krebs cycle, citratePaconi-tatePisocitrate. Aconitate does not leavethe active site; it is an intermediate in themechanism of the aconitase reaction, which wetreat as one step (Zubay, 1998, p. 257). In anotherexample from the Krebs cycle, succinylCoAPsuccinate involves succinyl acyl phos-phate as an intermediate that does not leave theactive site.

(3) Reactions should be plausible in terms oforganic chemistry and biochemistry.

(4) Use fragments of known biochemical path-ways to go between compounds, where possible.

(5) Avoid toxic, unstable, and reactive com-pounds. Toxic compounds included acetoin, for-maldehyde, and acetaldehyde. Some compounds,such as 1,2 diketones, are unstable and may de-compose to toxic compounds.

(6) Avoid compounds for which processingrequires many steps. Where possible, avoidhydroxylations, which require NADH(nicotinamide adenine dinucleotide, reducedform) and so rob the cell of three ATPs, andwhich involve free radicals. (Compounds withmethyl groups require many steps and hy-droxylations, in pathways with oxaloacetate anda-ketoglutarate.)


We allowed two classes of reactions not admit-ted by MeleH ndez-Hevia et al. (1996):

(7) Allow use of vitamin B12

for 1,2-H,OHshifts and 1,2-H,COOH shifts.

(8) Allow group transfer reactions. These areorganically feasible, and they occur in the meta-bolism of carbohydrates but not of carboxylicacids.

In cases where a compound had six or sevencarbon atoms, many alternative choices for thetopology of the carbon skeleton and the arrange-ment of functional groups were often possible.Our choices were reasonable but may not havebeen optimal, in terms of minimizing number ofsteps, maximizing stability of compounds, and soforth.

With these considerations, a C-paranet withN

Cg-reactions and metric P

Cproduces a most

favorable R-paranet with NR

g-reactions, reuseindex P

R, and quality Q

R"N

R/P

R.

3. Results

With a computer program that implementedthe algorithms in Appendix A, we initially ob-tained a list of the best 100 C-paranets that had"ve or fewer g-reactions and six or fewer carbonatoms per compound. Many of the resulting R-paranets had unfavorable properties. Conse-quently, we excluded some classes of C-paranets,as discussed below, and examined the bestallowed 100 C-paranets.

3.1. CLASSES OF PARANETS EXCLUDED

A recurrent compound is a compound gener-ated within a metabolic network but required inan earlier reaction to run the network. The avail-able concentration of a recurrent limits the #uxthrough the network. A network is less likely toevolve, the more recurrents it requires, becausean adequate source for each recurrent mustevolve with the network. We recognized recur-rents using the algorithm described in AppendixA, and discarded C-paranets with two or morerecurrents.

Many of the best 100 paranets had two ormore group transfer reactions. Group transfer

reactions within monosaccharides, such as trans-aldolase and transketolase mediate, are very easyand convenient. However, similar reactions onnon-carbohydrates are clumsy at best, and theylead to very inconvenient structures and longerpathways. In this work, some group transferg-reactions could not be converted to reason-able biochemical reactions, given reasonablesubstrates. Among these group transferg-reactions are 3#1P2#2, 2#2P3#1,5#3P4#4, and 5#1P4#2. Some grouptransfer reactions produce compounds thatare toxic or unstable, or that have methyl groups.Some produce compounds with di$cult-to-manipulate structures or with branching struct-ures, elimination of which requires multisteprearrangements. Furthermore, group transfersare not oxidations, and so will not yieldenergy. In view of these drawbacks, we ignoredall C-paranets with two or more grouptransfer reactions. The majority of suchC-paranets in the best 100 also had two or morerecurrents.

Futile cycles can be added to C-nets, and thusto paranets, in an in"nite variety of ways. Weeliminated futile cycles to keep the resultsmanageable. For the constraints of six or fewerg-reactions, six or fewer carbon atoms percompound, 0 or 1 recurrents, and 0 or 1 grouptransfer reactions, eliminating futile cyclesreduced the estimated number of C-paranets tobe generated by a factor of 470, from 1.2]1010to 2.5]107.

Because we excluded futile cycles, the forwardand reverse reactions of a reversible reaction arenot both represented explicitly within any of theC-nets that we generated. However, a reactionin an R-net made from a C-net can be reversible.The reversible reaction runs in either theforward or the reverse direction, as the concen-trations of reactants determine through massaction. Operating in this way, according tothermodynamics, a reversible reaction is not afutile cycle in which one reaction undoes whatanother does.

Because we ignored C-nets with futile cycles,we did not generate some R-nets that lack futilecycles. For example, MeleH ndez-Hevia et al. (1996)presented an alternative to the Krebs cycle intheir Fig. 6. The C-paranet for this alternative has

2


a futile cycle:

2 2'6!6(

4 4

!2'4!4(

2

11

3!3(2

So, our computation did not generate this C-net.

3.2. THE BEST 100 R-PARANETS WITH

ZERO OR ONE GROUP TRANSFER REACTION AND

ZERO OR ONE RECURRENT

After we discarded all C-paranets with two ormore group transfer reactions or with two ormore recurrents, there were 4.5]107 allowed C-paranets. All used six or fewer g-reactions, andused compounds with seven or fewer carbonatoms. We examined the best 100 of these al-lowed C-paranets. Among the "rst 50 the poorerones, d12}d50, had one group transfer reac-tion. Five of these 39 C-paranets could not beconverted to R-paranets because they had anunsuitable group transfer reaction. Four of the 39had no recurrent, and so were not cycles. Of thesefour, two could not be converted, one producedan unstable 2,3 diketone, and one produced acet-alydehyde. For the 34 (out of 39) C-paranets thatcould be converted, the best R-paranet had 11 ormore steps (in some cases, more than 20), andQ

R'14.4. Many of these R-paranets produced

toxic compounds and/or had a poor energy yield.The poor quality of d12}d50, which had agroup transfer reaction, led us to ignore the re-maining 49 paranets with a group transfer reac-tion, which had even larger Q

C. These were

d51}d81 with six g-reactions, and d83}d100with "ve g-reactions.

Thus, we focus on C-paranets d1}d11 andd82, which are shown in Fig. 3. Paranet d8is the Krebs cycle. Paranets d1}d5, d7, andd9}d11 resemble the Krebs cycle in adding twocompounds to give a larger one, and then cleav-ing 1- or 2-carbon fragments in a sequence ofreactions. Each of the four C-paranets with onegroup transfer reaction (d1}d4) is a condensedversion of a C-paranet with no group transferreaction (d7, d8, d11, d9, respectively). d2 isthe condensed form of the Krebs cycle; the pair ofreactions 4#2P6 and 6P5#1 is condensed to

the group transfer reaction 4#2P5#1. Suchcondensation reduces Q

C, and so improves the

ranking of the condensed C-paranet relative to itsuncondensed partner. (Appendix A Section 3provides a proof of this statement.) However,such improvement does not occur when conden-sation deletes a 5-carbon compound that metconstraint II in the uncondensed paranet, as isthe case for d5 and d10 in Fig. 3. An additionalreaction then has to be added to the condensedparanet to meet this constraint, thereby increas-ing its Q

C.

For paranets d1}d11 and d82, Table 1 sum-marizes the characteristics N

C, N

R, P

C, P

R, Q

C,

and QR. The step ratio N

R/N

Cis the average

number of steps in the R-paranet needed to im-plement a g-reaction and to make the transitionto a neighboring g-reaction. Table 1 also sum-marizes the energy yield for the R-paranets, mea-sured as the number of molecules produced orconsumed throughout an R-paranet, for FADH

2(#avin adenine dinucleotide, reduced form),NADH, ATP, and GTP (guanosine 5@-triphos-phate).

The only R-paranets with QR

less than or equalto 14.4 are d2, d5, d7, d9, d82, and theKrebs cycle, d8. These competitors equal orexceed the Krebs cycle in some attributes, butnone is as good overall. All "ve of these R-para-nets join two smaller compounds to make alarger one that is converted to 2-ketoglutarate byreactions that include a B

12-mediated rearrange-

ment.

f d2 (Fig. 4) is the condensed form of the Krebscycle. It illustrates the reduction in perfor-mance that typically accompanies condensa-tion. Cleavage of pyruvate produces the toxicchemical acetaldehyde, which is used ina group transfer reaction to produce a "ve-carbon compound. d2 has 11 steps and a stepratio of 2.75, despite the condensation from5 to 4 g-reactions. Its Q

R"13.4, and it has the

same energy yield as the Krebs cycle.f d5 (Fig. 5) resembles the Krebs cycle, but

condenses acetyl CoA with pyruvate ratherthan oxaloacetate. Note that the aconitase-likereaction is counted as one step. d5 has 11 steps,a step ratio of 2.75 and Q

R"14.0. Its energy

yield (1 FADH , 4 NADH, 0 ATP, 1 GTP) is

FIG. 3. Twelve C-paranets with zero or one group transfer reaction and zero or one recurrent. The numbers in theg-reactions represent the number of carbon atoms in compounds, but do not stand for particular compounds. The d labels(e.g. d5) designate the ranking of a C-paranet according to its value of N

C/P

C. The symbol 8 designates a pair of C-paranets

in which the left-hand one is a condensed version of the right-hand one.


TABLE 1Properties of the best 11 paranets and d82. ¹he paranets are listed in the order in which they aredisplayed in Fig. 3. For a C-paranet, N

Cis the number of g-reactions and P

Cis the performance metric. ¹he

most favorable R-paranet from that C-paranet has NR

g-reactions and metric PR. Q

C"N

C/P

C;

QR"N

R/P

R.

Paranet d NC

NR

NR/N

CPC

PR

QC

QR

FADH2

NADH ATP GTP Chemistry

Four pairs of the best C-paranetsd1 3 14 4.7 0.67 0.57 4.5 24.5 1 4 0 1 Acetoin; B

12d7 4 10 2.5 0.75 0.77 5.3 13.0 1 4 0 1 Poor Keq

; B12

d2 4 11 2.8 0.83 0.82 4.8 13.4 1 4 !1 1 Acetaldehyde; B12d8 5 10 2.0 0.87 0.8 5.8 12.5 1 4 !1 1 Krebs cycle

d3 4 18 4.5 0.83 0.89 4.8 20.2 1 3 !2 0 Acetaldehyded11 5 18 3.6 0.87 0.89 5.8 20.2 1 3 !2 0

d4 4 12 3.0 0.83 0.81 4.8 14.9 1 4 !1 2 B12d9 5 11 2.2 0.87 0.82 5.8 13.4 1 4 !1 2 Poor K

eq

Condensed form lacks a 5 to satisfy constraint IId5 4 11 2.8 0.75 0.79 5.3 14.0 1 4 !1 2 Krebs-like; B

12d10 5 18 3.6 0.87 0.91 5.8 19.8 1 4 !1 0 Glyoxylic acid

No recurrentsd6 4 11 2.8 0.75 0.73 5.3 15.1 0 1 !2 0 Carboxylations

¹he only acceptable paranet among d50}d100d82 5 9 1.8 0.73 0.78 6.8 11.6 0 3 !2 0 Poor K

eq; B

12

FIG. 4. R-paranet d2. In Figs 4}8, if a sequence of reac-tions is the same as in the Krebs cycle, an arrow designateseach reaction, but only the "rst and last compounds in thesequence are named or displayed. FIG. 5. R-paranet d5.


FIG. 6. R-paranet d7.

FIG. 7 R-paranet d9.

FIG. 8 R-paranet d82.


better than that of the Krebs cycle, in that d5does not use an ATP to carboxylate pyruvateto oxaloacetate.

d7, d9, and d82 begin with an aldol additionthat has an unfavorable equilibrium constant.

f d7 (Fig. 6) has an aldol addition of twopyruvates to make a 6-carbon compound,which undergoes rearrangement and decar-boxylation to make 2-ketoglutarate. Its con-densed form is d1. d7 has ten steps, as doesthe Krebs cycle, but it has four g-reactions anda step ratio of 2.5. Its Q

R"13.0, and its energy

yield is better than the Krebs cycle, as in d5.f d9 (Fig. 7) has an aldol addition of pyruvate

and oxaloacetate to make a 7-carbon com-pound, which undergoes an aconitase-likerearrangement and two decarboxylations tomake 2-ketoglutarate. Its condensed form isd4. d9 has 11 steps, "ve g-reactions, a stepratio of 2.2, Q

R"13.4, and the same energy

yield as the Krebs cycle.f d82 (Fig. 8), as d7, converts two pyruvates to

2-ketoglutarate. The "ve-carbon skeleton is hy-droxylated, oxidized, and cleaved to pyruvateand oxalic acid*a very toxic compound. Itscondensed form is d13. d82 has nine steps,"ve g-reactions, a step ratio of 1.8, andQ

R"11.6*parameters better than the Krebs

cycle. However, its energy yield (3 NADH,!2 ATP, no FADH

2or GTP) is poorer than

that of the Krebs cycle.

The other six of the best 11 R-paranets(d1, 3, 4, 6, 10, 11) have 12 or more steps,Q

R'14.4, and a step ratio of at least 2.4. Al-

though d1 is the smallest C-paranet, and theonly one with three g-reactions, its R-paranetrequires 14 steps and its Q

R"24.5, re#ecting

a low overlap in the usage of reactions for meet-ing the three constraints (P

R"0.57). d6 has no

recurrents. It requires two carboxylations, whichuse energy.


3.2. ARE THERE OTHER GOOD ALTERNATIVES

TO THE KREBS CYCLE?

In focusing on the best 100 of the 4.5]107allowed C-paranets, we obviously selected asmall sample. Might there be good alternatives tothe Krebs cycle among the remaining paranets?Several arguments suggest that this possibilityis very unlikely. We can estimate an upperbound on Q

Cfor good alternatives, using

the relation

QC"Q

R(N

C/N

R) (P

R/P

C).

Potentially good alternative R-paranets haveQ

R)14.4. The maximum value for N

C/N

Ris

found from the minimum of the step ratio,N

R/N

C, which is unlikely to be less than its value

in d82, 1.8. The maximum of PR/P

Cis found by

taking the maximum value for PR, 1, and the

minimum value for PC, 1/3. Thus, the largest

possible value for PR/P

Cis 3. Therefore, a good

alternative is unlikely to occur among the manyC-paranets with Q

Cgreater than 24 (14.4]3/1.8).

A realistic upper bound on QC

for good alterna-tives is probably much lower.

It is also unlikely that a good alternative hasmore than "ve g-reactions. A paranet withN

C"6 is likely to have N

R'10.8, because the

minimum step ratio that we observed is 1.8. IfNR'10.8, P

Rwould have to be '0.75 to get

QR(14.4. A step ratio as small as 1.8 is extreme-

ly rare, and a PR

as large as 0.75 occurs infre-quently except among the paranets in Table 1. Bythe same argument, a paranet with N

C*7 is

likely to have NR'12.6, and so is unlikely to be

a good alternative.Among the paranets with "ve or fewer g-reac-

tions, the paranets with one group transfer reac-tion are less likely to be good alternatives thanthose with no group transfer reactions. Recall theadverse e!ect of a group transfer reaction in thebest 50 allowed paranets: d12}d50 allhad a group transfer reaction, and nonewas a good alternative. d1}d4 had grouptransfer reactions, and were condensed versionsof paranets without group transfer reactions.In these four pairs of reactions, the condensedparanet has a Q

R*the Q

Rfor the uncondensed

paranet.

These considerations led us to examine thebest 50 paranets of the 6]104 with no grouptransfer reactions, six or fewer g-reactions, andseven or fewer carbon atoms per compound.Among these, the best eight are d5}d11 andd82 discussed above. The next "ve paranets hadfour or "ve g-reactions; with Q

R's ranging from

19.5 to 33.4 they are not good alternatives. Thenext 34 paranets had six g-reactions and so werenot considered. The remaining three paranetshad "ve g-reactions; with Q

R's ranging from 22.0

to 37.4 they are not good alternatives. Thesearguments and results strengthen our con-clusions that the only alternatives to the Krebscycle worth considering are d2, 5, 7, 9 and 82.The Krebs cycle is better overall than any of theseparanets.

4. Discussion

4.1. IS THE KREBS CYCLE OPTIMAL?

We have examined favorable alternatives tothe Krebs cycle and found none that surpasses itin overall performance, though some are better incertain respects. The Krebs cycle uses no grouptransfer reactions, and so avoids their unfavor-able associations. It does not use vitamin B

12,

and it produces no toxic compounds, unlikesome favorable alternatives. With the exceptionof d82, the Krebs cycle has the minimum num-ber of steps (ten), the best value for the qualityindex N

R/P

R(12.5), and the lowest value for the

step ratio NR/N

C(2.0). With respect to these para-

meters the Krebs cycle is better than its closestcompetitor, d5 (Fig. 5). d5 resembles the Krebscycle, but adds acetyl CoA to pyruvate ratherthan oxalacetate. d5 has a slightly betterstoichiometric energy yield than the Krebs cycle,but most alternatives have a poorer yield. Thepro"le of free energy changes along the pathwaywill favor a greater rate of ATP production in theKrebs cycle than in many alternatives. In thefollowing sections, we consider some of theseissues in more detail.

4.1.1. Economy

The Krebs cycle performs well in reusing stepsto meet di!erent constraints; its reuse index, P

R,

is relatively high. This reuse is a form of economy.


Economy is also evident in the step ratio,which measures the number of steps that changefunctional groups without changing carbon skel-etons, per step that changes carbon skeletons.For the Krebs cycle the step ratio is 2.0, nearlythe minimum among the alternatives we exam-ined. This minimality also obtains for thenon-oxidative pentose phosphate pathway, forwhich Mittenthal et al. (1998) generated alterna-tives. Values of N

R/N

Cfor the most favorable

alternatives to that pathway can be calculatedfrom their Table 4, where the "rst column givesN

Cand the last gives N

R. Among those alterna-

tives the real pentose phosphate pathway has theminimum step ratio, 13/7"1.86. Thus, the stepratio is very low in the Krebs cycle and thepentose phosphate pathway, relative to alterna-tives that meet the same constraints. This may bethe case in other metabolic pathways.

Note that the ratio NR/N

Coverestimates the

step ratio. We have regarded a change in thenumber of carbon atoms in a compound asa change in its carbon skeleton, but we haveneglected the rearrangements of carbon atomsthat occur in the Krebs cycle and in many alter-natives to it. Directly counting, in an R-paranet,the number of steps that change functional groupswithout changing carbon skeletons, per step thatchanges carbon skeletons, will give a lower num-ber than N

R/N

Cif carbon rearrangements are

included among the skeleton-changing steps.

4.1.2. Constraints and Cycles

We imposed three stoichiometric constraintson alternatives to the Krebs cycle, requiring thatthey produce energy and carboxylic acids thatcan be converted to amino acids. However, theKrebs cycle is embedded in a larger network ofreactions. Glycolysis, anaplerosis, and catabol-ism of fatty acids and amino acids provide fuel forthe Krebs cycle. The Krebs cycle provides precur-sors for the synthesis of carbohydrates, aminoacids, purine and pyrimidine nucleotides, andporphyrins. We limited our investigation of alter-natives to the Krebs cycle in order to addressa manageable problem. By requiring pyruvate,oxalacetate, and 2-ketoglutarate we have takenmany of the fueling reactions into account, sincethese a keto-acids link directly to the corresponding

amino acids and to glycolysis. It was unclearhow to formulate analogous constraints for someof the larger networks that include the Krebscycle.

Evidently, the Krebs cycle is the hub for ahub-and-spoke network that can interconvertseveral key metabolites. Pathways between thehub and the key metabolites are the spokes; someof these are partially joined to form branchingpathways. We expect hub-and-spoke organiza-tion in a network that interconverts several keymetabolites because this organization is optimalin a simpli"ed model of metabolism (Mittenthalet al., 1993). It seems plausible that a hub for thebiosynthesis and degradation of several keymetabolites might contain a cycle of reactions.However, for the three constraints we used, wefound non-cyclic alternatives. Baldwin & Krebs(1981) suggested that a cycle is a more e$cientway to oxidize acetate while extracting energythan non-cyclic alternatives. They consideredonly alternative networks involving glycollate. Inagreement with their proposal, the non-cyclic al-ternatives that we found perform less well thanthe Krebs cycle.

4.1.3. Aldol Addition

In a linear pathway the rate of ATP produc-tion is maximal if reactions near the beginning ofthe pathway are exergonic, promoting #ux in theforward direction, and reactions near the end areendergonic (Heinrich et al., 1997; MeleH ndez-Hevia et al., 1997). The Krebs cycle and alterna-tive d5 have this pattern, but other alternativesthat we examined do not. As Heinrich et al. (1997)noted, two exergonic steps occur early in theKrebs cycle*an aldol addition coupled tothioester hydrolysis in the citrate}synthase reac-tion, and the isocitrate}dehydrogenase reaction.R-paranet d5 also uses an aldol additioncoupled to irreversible thioester hydrolysis. How-ever, d7, d9, and d82 do not have early exer-gonic steps; these three alternatives all begin withan aldol addition that has an unfavorable equi-librium. The latter steps in the Krebs cycle and ind5, d7, and d9*succinyl CoA synthetase, suc-cinic dehydrogenase, furnarase, and malate dehy-drogenase*each have DG in the cell near zero(Garrett & Grisham, 1995).


4.1.4. <itamin B12

In the Krebs cycle, as in some of the bestalternatives, an H,OH shift follows an aldol addi-tion. Aconitase catalyses an H,OH shift in theKrebs cycle and in d9, whereas B

12mediates the

shift in other good alternatives. In alternative d5shifts occur in both ways: an aconitase-like reac-tion makes an H,OH shift, and B

12then mediates

an H,COO~ shift. The di!erences between theseprocesses of rearrangement are striking. An acon-itase-like reaction does an easy dehydration, in-volving an acidic alpha proton, to give a stable,conjugated ene dione. Then, a favorable Michaeladdition of water to the C"C gives a resonancestabilized anion, all at the active site. (We treat thisprocess as one step.) No coenzyme is required. Bycontrast, in a B

12shift the complicated B

12coen-

zyme generates free radical intermediates and co-balt complexes of di!erent oxidation levels.

Free radical intermediates are acceptable foranaerobes with short life cycles, but are less toler-able for aerobes with longer life cycles. Most ofthe roughly 15 reactions known to require B

12occur in a few bacterial species and perform spe-cialized fermentations. B

12shifts do not occur in

primary energy-producing metabolism. In mam-malian metabolism, B

12is used to a signi"cant

extent only in the oxidation of odd-C fatty acids,which are rare, and for synthesizing methioninefrom homocysteine (Mathews et al., 2000).

4.1.5. Evolutionary Considerations

The Krebs cycle may have evolved in anaer-obic prokaryotes as two branches. The oxidativebranch from pyruvate to 2-ketoglutarate couldbe used for biosynthesis of glutamate. The reduc-tive branch from oxalacetate to succinate wouldhave oxidized NADH produced in glycolysis, re-generating NAD` without trapping pyruvate aslactate, and would have provided succinate forbiosyntheses. In this scenario, 2-ketoglutarate de-hydrogenase evolved later, allowing transforma-tion of 2-ketoglutarate to succinyl-CoA andproducing an oxidative cycle. With the evolutionof photosynthesis, photoreduced compoundswere available to drive a reductive citric acidcycle that "xes CO

2(Weitzman, 1985; Gest, 1987;

Buchanan & Arnon, 1990). An alternative scen-ario posits the early evolution of a reductive citric

acid cycle and its later cooption for oxidation(WaK chtershaK user, 1990; Morowitz, 1999).

The best alternative paranets that wefound*d2, d5, d7, and d9*could have evol-ved as two branches because they have the samekind of design as the Krebs cycle. In these para-nets the reactions in the latter half of the cyclecould have evolved from reduction of oxalacetateto succinate, and the reactions in the earlier halfoxidize pyruvate to 2-ketoglutarate. However,d82 could not have evolved in an analogousway, because its latter half could not have re-duced oxaloacetate to succinate. Moreover, itproduces oxalic acid, which is very toxic. d82 isa poor competitor; it has an initial aldol additionwith an unfavorable equilibrium, a B

12shift, an

energy-robbing hydroxylation, and a poor en-ergy yield.

Historically, might networks that used B12

have been competitors for the Krebs cycle?Morowitz (1999) proposed that metabolism evol-ved through the sequential addition of shells toa core (shell A) which consisted of the Krebscycle, glycolysis, and fatty acid synthesis. Theamination of 2-ketoglutarate was the gateway toshell B, the synthesis of most amino acids. In shellC, sulfur was incorporated into cysteine and me-thionine. The gateways to shell D, ring closureand synthesis of nitrogen and dinitrogen hetero-cycles, gave access to purines, pyrimidines, andmany cofactors, including B

12. This scenario sug-

gests that compounds in shell D evolved afterenzymes (derived from shell B) and were nota part of pre-biotic chemistry. However, Ksanderet al. (1987) showed that a pre-biotic synthesis ofa corrin template for B

12synthesis is feasible,

by synthesizing uroporphyrinogen III fromglutamine nitrile under anaerobic conditions.Methanogenic archaebacteria that make B

12date to about 3.8 billion years ago (see Scott,1993), so there has been ample time for competi-tion between the Krebs cycle and alternativesthat use B

12.

4.2. OUR METHOD

Our method in this work arose from the ideathat a list of allowed reactions is not essentialfor constructing an ensemble of alternativemetabolic networks. One can use stoichiometric


constraints on the networks' performance, andgeneric reactions without compounds speci"ed,to "nd possible network topologies and stoichio-metries of reactions, and so to generate anensemble of networks that transform carbonskeletons*C-nets. We ranked C-nets accordingto their reuse of reactions to meet more than onestoichiometric constraint. We then converted thebest C-nets to realistic networks involving fullyspeci"ed compounds*R-nets*by hand, choos-ing carbon skeletons and adding functionalgroups. As Mittenthal et al. (1998) showed for thepentose phosphate pathway, functional groupscan be added automatically to C-nets, using theprinciple that the functional groups change min-imally as the network transforms starting react-ants to "nal products. This principle is a specialcase of the principle of minimum structurechange in chemical reaction networks, knownfrom the 19th century (see Temkin et al., 1996,pp. 25}27). It seems likely that the latter principlecould be used to automate the transformationand rearrangement of carbon skeletons in theconversion of C-nets to R-nets.

Our method may be more widely useful. Webelieve that knowledge of initial and "nal com-pounds can be used with generic reactions todesign molecular networks other than those inmetabolism. In metabolic networks enzymescatalyse transformations of metabolites, but theenzymes are not transformed. However, in othernetworks macromolecules catalyse the modi"ca-tion of other macromolecules. Macromoleculesmay also associate with, and dissociate from eachother without covalent modi"cation. Such mac-romolecular networks mediate most of the pro-cesses in a cell, including signal transduction,gene regulation, the movement of processive en-zymes along "lamentous macromolecules, andthe #ow of molecules through channels in mem-branes. Because macromolecular networks arecurrently under intensive investigation, it wouldbe useful to have a method for assembling themfrom reactions that were characterized to variousextents, from generic to fully speci"ed. A suitablemethod might be an extension of our approach.

B. Clarke gratefully acknowledges funding fromNSERC Operating Grant OGP-0138122. The authorswould like to express their gratitude to Gary Catt of

Caracle Consulting Inc. for his generosity while guid-ing us through the programming labyrinth.

REFERENCES

BALDWIN, J. E. & KREBS, H. (1981). The evolution of meta-bolic cycles. Nature 291, 381}382.

BROWN, G. C. (1991). Total cell protein concentration as anevolutionary constraint on the metabolic control distribu-tion in cells. J. theor. Biol. 153, 195}203.

BUCHANAN, B. B. & ARNON, D. I. (1990). A reverse Krebscycle in photosynthesis: consensus at last. Photosynth. Res.24, 47}53.

GARRETT, R. H. & GRISHAM, C. M. (1995). Biochemistry,pp. 604}605. New York: Saunders.

GEST, H. (1987). Evolutionary roots of the citric acid cycle inprokaryotes. Biochem. Soc. Symp. 54, 3}16.

HAPPEL, J., SELLERS, P. H. & OTAROD, M. (1990). Mechan-istic study of chemical reaction systems. Ind. Eng. Chem.Res. 29, 1057}1064.

HEINRICH, R., MONTERO, F., KLIPP, E., WADDELL, T. G.& MELED NDEZ-HEVIA, E. (1997). Theoretical approaches tothe evolutionary optimization of glycolysis. Thermo-dynamic and kinetic constraints. Eur. J. Biochem. 243,191}201.

HEINRICH, R., MELED NDEZ-HEVIA, E., MONTERO, F., NUN3 O,J. C., STEPHANI, A. & WADDELL, T. G. (1999). Thestructural design of glycolysis: an evolutionary approach.Biochem. Soc. ¹rans. 27, 294}298.

KIRKWOOD, T. B. L., ROSENBERGER, R. F. & GALAS, D. J.(1986). Accuracy in Molecular Processes. New York:Chapman & Hall.

KSANDER, G., BOLD, G., LATTMAN, R., LEHMANN, C.,FRUG H, T., YIANG, Y.-B., INOMATA, K., BUSER, H.-P.,SCHREIBER, J., ZASS, E. & ESCHENMOSER, A. (1987).Chemie der a-Aminonitrile. 1. Mitteilung. Einleitung undWege zu Uroporphyrinogen-octanitrilen. Helv. Chim. Acta70, 1115}1172.

MATHEWS, C. K., VAN HOLDE, K. E. & AHERN, K. G. (2000).Biochemistry, 3rd Edn, New York: Addison-Wesley.

MAVROVOUNIOTIS, M. L., STEPHANOPOULOS, G. &STEPHANOPOULOS,G. (1990). Computer-aided synthesis ofbiochemical pathways. Biotechnol. Bioeng. 36, 1119}1132.

MELED NDEZ-HEVIA, E. (1990). The game of the pentose phos-phate cycle: a mathematical approach to study the optim-ization in design of metabolic pathways during evolution.Biomed. Biochim. Acta 49, 903}916.

MELED NDEZ-HEVIA, E., WADDELL, T. G. & MONTERO, F.(1994). Optimization of metabolism: the evolution of meta-bolic pathways toward simplicity through the game of thepentose phosphate cycle. J. theor. Biol. 166, 201}220.

MELED NDEZ-HEVIA, E., WADDELL, T. G. & CASCANTE, M.(1996). The puzzle of the Krebs citric acid cycle: assemblingthe pieces of chemically feasible reactions, and opportun-ism in the design of metabolic pathways during evolution.J. Mol. Evol. 43, 293}303.

MELED NDEZ-HEVIA, E., WADDELL, T. G., HEINRICH, R.& MONTERO, F. (1997). Theoretical approaches to theevolutionary optimization of glycolysis. Chemical analysis.Eur. J. Biochem. 244, 527}543.

MITTENTHAL, J. E. (1996). An algorithm to assemble path-ways from processes. In: Paci,c Symposium on Biocomput-ing +97 (Altman, R. B., Dunker, A. K., Hunter, L. & Klein,T. E., eds), pp. 292}303. Singapore: World Scienti"c.


MITTENTHAL, J. E., CLARKE, B. & LEVINTHAL, M. (1993).Designing bacteria. In: ¹hinking About Biology (Stein,W. D. & Varela, F. J., eds), Lecture Notes, Vol. III, SantaFe Institute Studies in the Sciences of Complexity, pp.65}103. Reading, MA: Addison-Wesley.

MITTENTHAL, J. E., YUAN, A., CLARKE, B. & SCHEELINE, A.(1998). Designing metabolism: alternative connectivitiesfor the pentose phosphate pathway. Bull. Math. Biol. 60,815}856.

MOROWITZ, H. (1999). A theory of biochemical organiza-tion, metabolic pathways, and evolution. Complexity 4,39}53.

NUN3 O, J. C., SAD NCHEZ-VALDENEBRO, I., PED REZ-IRATXETA, C.& MELED NDEZ-HEVIA, E. (1997). Network organization ofcell metabolism: monosaccharide interconversion. Bio-chem. J. 324, 103}111.

SCOTT, A. I. (1993). How nature synthesizes vitamin B12*a

survey of the last four billion years. Angew. Chem. Int. Ed.Engl. 32, 1223}1243.

STEPHANI, A. & HEINRICH, R. (1998). Kinetic andthermodynamic principles determining the structuraldesign of ATP-producing systems. Bull. Math. Biol. 60,505}543.

STEPHANI, A., NUN3 O, J. C. & HEINRICH, R. (1999). Optimalstoichiometric designs of ATP-producing systems as deter-mined by an evolutionary algorithm. J. theor. Biol. 199,45}61.

TEMKIN, O. N., ZEIGARNIK, A. V. & BONCHEV, D.(1996). Chemical Reaction Networks. Boca Raton, FL:CRC Press.

WAG CHTERSHAG USER, G. (1990). Evolution of the "rst meta-bolic cycles. Proc. Natl Acad. Sci. ;.S.A. 87, 200}204.

WEITZMAN, P. D. J. (1985). Evolution in the citric acid cycle.FEMS Symp. 29, 253}275.

ZUBAY, G. (1998). Biochemistry, 4th Edn. Boston:WC Brown.

APPENDIX A

This appendix provides additional informationabout the algorithms we used. Our program toimplement these algorithms is available on re-quest from Jay Mittenthal ([email protected])or Bertrand Clarke ([email protected]).

Recall that a generic reaction (g-reaction) hasas many as two non-zero inputs and twonon-zero outputs. These inputs and outputs areunspeci"ed initially but become increasinglycharacterized through the procedure. A g-reac-tion may have one null input, so it converts oneinput to two outputs, or a null output, so itconverts two inputs to one output. We want tolink g-reactions to form networks that representa sequence of carbon skeleton-changing reac-tions, so the inputs and outputs of each g-reac-tion will be positive integers representing thenumber of carbons in a compound. The resulting

network must be designed so that its operationwill satisfy a pre-speci"ed stoichiometricconstraint.

The overall objective of the following algo-rithm is to produce a sequence of paranets, inorder of increasing size. Each of these paranetssatis"es a "nite collection of stoichiometric con-straints. The algorithm has two phases: in phase1, we construct for each constraint an ensemble ofC-nets that meet it. In phase 2, we constructparanets by combining C-nets, one from eachensemble.

A.1. Phase 1: Construction of an Ensemblefor a Constraint

A.1.1. DEFINITIONS

The single constraint of interest can bewritten as

k+i/1

niC(i)P

k{+i/1

n@iC (i).

The notation C(i) represents any input or outputcompound containing i carbons. The number ofinputs with i carbons is n

iand the number of

outputs with i carbons is n@i. The maximum num-

ber of carbons is k in an input compound and isk@ in an output compound. Thus, conservation ofcarbons implies +n

iC(i)"+n @

iC(i).

We distinguish between speci"ed and unspeci-"ed inputs and outputs. We say an input (output)of a g-reaction is speci"ed if it is one of thek inputs (k@ outputs) of the constraint. The otherinputs and outputs of the g-reactions are unspeci-"ed initially. They may be null (zero carbons)inputs or outputs. A non-null unspeci"ed inputor output is called internal. Our procedure willconnect each internal output to an internalinput. This connection is called a link. So, thenumber of links equals the number of internalinputs or equivalently the number of internaloutputs.

Let N be the number of g-reactions in aC-net. Each of these g-reactions has two inputtips and two output tips. For each g-reactionthere is a conservation of carbon (cc) equation,because the total number of carbon atomsis the same for the inputs and outputs. For


example, the cc equation is a#b"c#d for theg-reaction

a cX

b d

This g-reaction can also be represented by aC-net table:

in out.

X: a b c d

Now, for N g-reactions there are 2N input tipsand 2N output tips. Both of these numbers of tipsare sums of three terms. For input tips 2N is thesum of the speci"ed, internal, and null inputs.Correspondingly, for output tips, 2N is the sumof speci"ed, internal and null outputs.

A.1.2. BOUNDS ON THE NUMBER OF G-REACTIONS

AND NUMBER OF LINKS

For the given constraint, we "rst identify aminimal value of N, N

min, and then give the min-

imal and maximal number of links for eachN*N

min. We denote the number of links in

a C-net by ¸.To "nd N

minnote that the g-reactions must

provide enough input tips for the number ofspeci"ed inputs, and enough output tips for thenumber of speci"ed outputs. That is, the smallestpossible value of N is N

min"1/2 max(v+n

iw,

v+n@iw). For example, in the constraint

3P1#1#1, +ni"n

3"1 and +n@

i"n@

i"3, so

Nmin

"2.We require that each C-net be connected.

A connected C-net with N g-reactions must haveat least N!1 links between the g-reactions. Thatis, ¸

min"N!1. Note that there will not, in gen-

eral, be a connected C-net with the minimal N.(Consider the constraint 4P1#1#1#1, whichhas N

min"2. All four output tips are used for

speci"ed outputs, so none remain to form links.)For a given N there is also an upper bound on

¸: ¸max

"2N!max(+ni, +n@

i). This is so be-

cause the number of links that can form is thelesser of the number of unspeci"ed input tips and

the number of unspeci"ed output tips, which is

minA2N!+ni, 2N!+n@

iB

"2N!maxA+ni, +n@

iB .

Thus, we have bounds for ¸ in terms of N:

N!1"¸min

)¸)¸max

"2N!maxA+ni, +n@

iB.

Restricting the maximum number of group trans-fer reactions, N

gtr max, decreases the maximum

number of links. This is so because each grouptransfer reaction has two inputs and two outputs,so it can form more links than other g-reactions,which have two inputs and one output or twooutputs and one input. Using a lower ¸

maxappro-

priate to the limit on Ngtr max

avoids much need-less computation.

Here we describe the upper bound on the num-ber of links. First, in the absence of extra informa-tion the upper bound on ¸ is ¸

absmax"

2N!max(+ni, +n@

i). However, we can improve

this bound by using an upper bound on thenumber of group transfer reactions, the numberof input constraints and the number of outputconstraints. This continues to be a worst-caseupper bound, equal to or larger than the actualnumber of links in all cases.

To "nd an upper bound on the number oflinks, it is enough to show how to distribute theminimum number of nulls optimally over theavailable input and output locations.

First, since a group transfer reaction has twoinputs and two outputs it has no null inputs oroutputs. Thus, if N

gtr maxis less than N, there must

be some null inputs or outputs, one null for eachof the N!N

gtr maxnon-group transfer reactions.

Second, we can reduce this number of nulls bythe absolute magnitude of the di!erence betweenthe number of input molecules and the number ofoutput molecules because this is the number of


nulls that can be placed without a!ecting thenumber of links.

Third, we subtract half the number of nullsbecause there are half as many links as there arenulls.

Finally, we get an upper bound on the numberof links of the form

maxlinks"(2*[N!Ngtr max

])

!D input}molecules!output

}molecules D

!ceil(nulls/2).

Here the factor of 2 in the "rst term on the rightacts as if each of the three-tipped g-reactionsactually has four tips, as in the formula for¸absmax

above. The number of nulls in the lastterm is the number of blanks left after assigningthe nulls that are required to have the samenumber of inputs as outputs (middle terms).

To clarify the reasoning behind this result con-sider a simpli"ed example. Suppose we have "vereactions (with ten input tips and ten output tips),three inputs and one output. We can draw this as

Inputs: K K K i i i i i i i (max links"7),

Outputs: K o o o o o o o o o,

where K indicates a molecule. We can add twonulls (labeled E, empty) without reducing themaximal number of links (connections from i's too's) because the number of nulls that can beadded without e!ect is the di!erence between thenumber of input molecules and the number ofoutput molecules. This gives

Inputs: K K K i i i i i i i (max links"7),

Outputs: K E E o o o o o o o.

Adding any one null reduces the number of linksby one, but having added one null adding asecond has no e!ect because we have alreadyremoved one end of the link. Thus, we get:

Inputs: K K KE i i i i i i (max links"6),

Outputs: K E E E o o o o o o.

It is seen that because links join inputs and out-puts, the optimal distribution of nulls assigns thesame number to inputs as outputs, leading to halfas many links as there are places for nulls.

A.1.3. PROCEDURE FOR SOLUTION IN PHASE 1

Loop over N, starting with Nmin

. For each N,loop over ¸, proceeding from ¸

minto ¸

max. For

each N and ¸ construct an ensemble of C-netsthrough the following four construction steps.

1. Put in links in all possible ways. Remove allisomorphic duplicates. Label both ends ofa link with the same letter. The letter repres-ents the number of carbon atoms in thecompounds at the link. Use a di!erent letterfor each link.

2. Put in speci"ed inputs and outputs in allpossible ways. Remove all isomorphic du-plicates.

3. Set all remaining input and output tips tozero.

4. This algorithm avoids most, but not all,isomorphism. The remaining isomorphismis removed by a duplicate-removal processwhich considers two C-nets to be identical ifthey have the same number of reactions,and if the reactions of one can be permutedin such a way that all the carbon valuesmatch those of the second. The actual con-nections of links are not compared.

At any of these stages some of the C-nets may beunacceptable for the reasons indicated in the dis-quali"er rules (4) of Section 2.3.

The preceding steps of construction label everytip of every g-reaction with the number of car-bons in the compound at the tip. The number ofcompounds at some input and output tips arespeci"ed from the constraint, and other tips arenull. The remaining tips are internal; they partici-pate in links. The number of carbons in the com-pounds at these tips can be determined by solvingthe set of N cc equations.

5. Write the N cc equations and simplify themto "nd a unique solution or a class of solu-tions (if they exist). If there is a class ofsolutions, it can be expressed in terms ofparameters equal in number to the number


of links minus the number of linearly inde-pendent cc equations.

A.1.4. EXAMPLE OF PROCEDURE FOR SOLUTION

IN PHASE 1: GLOBAL REACTION I, 3#1#1#1

Here, Nmin

"2 as noted above. For N"Nmin

,¸min

"¸max

"1 so there is only one case*that ofone link*to be considered. We proceed throughthe construction steps. The "rst step is to put inthe link between the two g-reactions. Then, thereis only one way to put in the outputs and there

are three ways to put in the inputs:

3 1 1X X

a!a 1

1 1X X

3 a!a 1

1 3 1X X

a!a 1

The "rst two of these C-nets are isomorphic, andthe third one is ruled out by D2. Setting all othertips to zero, the resulting C-net can be displayedin three forms. In the third form, X

1and X

2are

g-reactions.

3 1 0 1X X

0 a!a 1

13( 1

a!a (

1

in outX

1: 3 0 a 1

X2: a 0 1 1

From any of these forms we get two cc equations:

3"a#1,

a"1#1.

So, there is a unique solution with a"2. Here weintroduced one parameter and got one indepen-dent solution as expected.

For the case N"3, the class of solutions ismuch larger. We have ¸

min"N!1"2 and

¸max

"2N!max(1,3)"3, and for each of thetwo values of ¸ we get many candidate C-nets toexamine. It can be veri"ed that for N"3 and¸"2 all of the candidates violate at least one ofthe disquali"er rules, so that none of the candi-dates is viable.

For N"3 and ¸"3, we illustrate the applica-tion of each construction step to a subset of theC-nets generated by the previous constructionstep. Putting in the "rst two links, which arerequired to get a connected C-net, gives the fol-lowing three cases:

Case A: Case B: Case C:

in out in out in out.

X1: a X

1: a X

1: a b

X2: a b X

2: b X

2: a

X3: b X

3: a b X

3: b

In each of these C-nets the third link cannotconnect an input tip to an output tip within thesame X. A free input tip on any of the 3 X's canconnect to a free output tip on any other X. Ifa X has two free input tips, or two free outputtips, one of the tips can be ignored in making thethird link. (If the ignored tip were used, it wouldmake networks isomorphous to those using theother tip.)

In Case A, each of the X's has at least one freeinput tip and one free output tip. Hence, the thirdlink can be added by taking the 3 X's two ata time, in 3!/(2!1!)"3 ways. For each of theseways one can connect the input tip of the "rstX to the output tip of the second X, or vice versa.Thus, Case A gives six arrangements of threelinks, of which two are

d1 in outX

1: a c

X2: a b

X3: b c

d2 in outX

1: c a

X2: a b

X3: b c

.

In two of the four arrangements not shown, bothoutputs of one g-reaction are linked to both in-puts of another g-reaction. E!ectively, these twog-reactions are functioning as a single g-reaction,reducing the e!ective N of the C-net to 2. Hence,


we do not need to consider these C-nets further,because they can be elaborated from C-nets withN"2. However, it is worth noting that suchC-nets can be biologically important; forexample, in the pentose phosphate pathway twog-reactions are linked in this way.

Putting in inputs, outputs, and zeros for thesecond of these C-net tables gives the followingthree C-net tables:

d2.1 in outX

1: c 3 a 1

X2: a 0 b 1

X3: b 0 c 1

d2.2 in outX

1: c 0 a 1

X2: a 3 b 1

X3: b 0 c 1

d2.3 in outX

1: c 0 a 1

X2: a 0 b 1

X3: b 3 c 1

Note that the C-nets derived from thesetables cannot be interconverted by permutingthe variables a, b, and c because these variables

represent links at non-isomorphic locations inthe C-net.

Table A.1 implies three cc equations of whichtwo are linearly independent:

c#3"a#1,

a"b#1,

b"c#1.

With b as a parameter, the solutions area"b#1, b"b, c"b!1. For various reasons,b"0, 1, 2 do not yield viable C-nets: whenb"0, c"!1, but negative numbers of carbon

atoms are not allowed. When b"1, c"0, so notransformation occurs in the third reaction. Thesystem degenerates to two reactions, a case thatwould have been studied under N"2. Whenb"2, c"1 and a"3, so no transformation oc-curs in the "rst reaction; again the system degen-erates to two reactions. As regards larger valuesof b, the upper limit on the size of compounds is6 carbon atoms. So, b must be 3, 4 or 5. This is

a family of C-nets which can be represented ingeneral on the left and for b"5 on the right,thus,

3 1X

b!1 b#11

!b#1(b

1!b(

b!1

3 1X

4 61

!6(5

1!5(

4

The C-net on the right is the portion ofC-paranet d4 that meets stoichiometricconstraint I, 3P1#1#1, as comparison withFig. 3 shows.

A.1.5. RECURRENTS

Intuitively, a feed in a C-net is a link from anoutput of an earlier-occurring reaction to an in-put of a later-occurring one. A recurrent is a linkfrom an output of a later-occurring reaction to aninput of an earlier-occurring one. However, thesede"nitions are not satisfactory, in that earlier,later, and the apparent numbers of feeds andrecurrents depend on the order of reactions in


a diagram of the C-net, and the diagram can bedrawn in various ways.

The link between recurrents and the sequenceof reactions arises because each recurrent is ina cycle through the C-net. A cycle is a sequence ofcompounds that ends at the compound where itstarts, and has no other repeated compound. Theidenti"cation of a cycle considers only the pathfrom one compound to the next, not the g-reactions in which the compounds participate.Since any compound in a cycle can be a recur-rent, it is necessary to specify a way to chooserecurrents; we do this next.

We operationalize "nding the minimumnumber of recurrents over the connectivities ofa C-net by drawing the C-net with the g-reactionsin a horizontal line. Number the reactions arbit-rarily from 1 to N, the number of reactions in theC-net. For each link between two g-reactions,tabulate which of the g-reactions must precedethe other one in order that the link be a feedrather than a recurrent. Call this relation theprecedence for the link. A subset of the precedences

will be called incompatible if the precedences inthe subset cannot all be realized in any sequenceof its g-reactions. That is, the links in an incom-patible subset cannot all be feeds, so that one ofthese links must be a recurrent. Two incompat-ible subsets are independent, i.e. disjoint, if theydo not share any precedences. The number ofindependent incompatible subsets is the min-imum number of recurrents.

As an example, consider the following C-net.Letters designate links between g-reactions. Thesmall numbers to the right of the X's designatereactions:

3 5X

1a"4 d"2

3 a"4X

2b"1

c"3 1X

3e"2

d"2 b"1X

4e"2 c"3

It is seen there are two recurrents because there are exactly two disjoint cycles. One cycle passesleftward (lw) from g-reaction X

4to X

3, and then rightward (rw) from X

3to X

4:

lw fromX

4to X

3

rw fromX

3to X

4

c"3 outfrom X

4

P c"3 into X

3

P e"2 outfrom X

3

P e"2 inPto X

4

c"3 outfrom X

4.

The other cycle passes between X4, X

2, and X

1:

lw fromX

4to X

2

lw fromX

2to X

1

rw fromX

1to X

4

b"1 outfrom X

4

Pb"1 into X

2

Pa"4 outfrom X

2

Pa"4 into X

1

Pd"2 outfrom X

1

Pd"2 into X

4

Pb"1 outfrom X

4

.

Note that each cycle has at least one edge in eachdirection. The cycles can be represented moreconveniently as a precedence table:

link precedencea reaction 2 before reaction 1b 4 before 2c 4 before 3d 1 before 4e 3 before 4.


Evidently, the precedences for links c and e are anincompatible subset, as are the precedences forlinks a, b, and d. These two subsets are indepen-dent, so there are two disjoint cycles. The preced-ence relations re#ect the ordering within a cycleas a function of the labeling of the g-reactions.The linear sequence of reactions 4, 3, 2, 1 displaysthe minimum number of recurrents, which is two;in this sequence the recurrent links are d and e.

A.1.6. FUTILE CYCLES

As noted in Section 2.3, a futile cycle is a subsetof the reactions of a connected C-net that has thenull reaction, 0P0, as its net reaction. Withina futile cycle each output of one reaction is thesame as an input of another. We search for futilecycles by ignoring the connections among reac-tions, looking at each subset of reactions, andlooking for its net reaction. A subset is futile if, foreach type of molecule (number of carbons) occur-ring in the subset, the number of molecules pro-duced as outputs is the same as the number ofmolecules consumed as inputs. A C-net is futile ifany of its subsets are futile.

A futile subset of size 1 is a reaction in whichthe outputs are the same as the inputs. A futilesubset of size 2 has a reaction and its reverse,such as

7 6X

1

6 7X

1.

Here is a futile subset of size 3:

5 2X

3 6

1X

6 5

2X

1 3.

We ignored C-nets containing futile cycles.

A.2. Phase 2: Construction of an Ensemble ofParanets, Each of Which Meets a Collection

of Constraints

A.2.1. ALGORITHM FOR CONSTRUCTING

THE ENSEMBLE OF PARANETS

In phase one we get a "nite collection of C-netsfor each N and ¸. These can be ordered in a

sequence. Taken together, all of these sequencescomprise an ensemble of C-nets that meet astoichiometric constraint. Consequently, we canregard the entire output of phase 1 as a sequenceof C-nets.

We will construct paranets from the sequencesthat meet the several constraints of interest. Re-call that a paranet is a kind of network producedby identifying the largest possible overlapsamong "nitely many networks. We consider thecase of three constraints, since that is the numberof constraints that we assume the Krebs cyclemeets; other numbers of constraints can betreated similarly. Consider the table

constraint C-net

1 G11

, G12

, G13

,22 G

21, G

22, G

23,2

3 G31

, G32

, G33

,2

in which Gij

is the j-th C-net in the orderedsequence for constraint i. We proceed by usingtriples of the form (G

1i, G

2j, G

3k), ordered by the

sum of the number of X's in the three C-nets. Webreak ties arbitrarily. (Thus, we obtain all pos-sible sums of the numbers of X's in C-nets, onefrom each row of the table, and order these fromsmallest to largest.)

Fix a triple (G1i

, G2j

, G3k

). For each triplethere are three pairs of C-nets. Consider one suchpair*say, (G

1i, G

2j). The overlap between

G1i

and G2j

is the g-reactions shared betweenthem. It is well known that the overlap can berepresented as a "nite collection of connectedcomponents. If there is only one connectedcomponent, possibly the case of greatest interest,then we regard it as a module. It is a modulein the sense that it is used in both C-nets,which meet two constraints. For the g-reactionswithin a module, we require the connectivity oftheir occurrence in meeting the two constraintsto be isomorphic. In fact, our program onlysearches for overlaps, ignoring their connectivity.We have found empirically that overlaps ofone g-reaction are typical in smaller C-nets.Disconnected overlaps, possibly with non-isomorphic connectivities, only appear for largerC-nets.


A.2.2. ALGORITHM FOR FINDING THE OVERLAPS

To "nd the overlap between two C-nets, sayG

1and G

2, represent them by their C-net tables.

Convert each table to dictionary order. That is,within the inputs and within the outputs of eachg-reaction put the smaller entry "rst. Then,within each table, put the rows in dictionaryorder. Now, it is enough to go through the rowsof each table to "nd the cases where they match.Each case is a g-reaction in the overlap.

Note that the C-net tables we described inSection A.1.4 include the connectivity of the C-net,which the present algorithm ignores. If we want tocharacterize patterns of connectivity in the overlapsthen the present algorithm must be elaborated todetermine whether matching rows have isomorphicconnectivities. Strict module structure would re-quire them to be isomorphic. However, it is possibleto have two C-nets that share several g-reactionswith non-isomorphic connectivities. In such cases,it may or may not be possible to modify the con-nectivities to make them isomorphic.

A.3. E4ect of Condensing two g-Reactions intoone, on the Value of N

C/P

Cfor a Paranet

If a g-reaction that joins two compounds toform one is followed by a g-reaction that cleavesthe "rst output into two other compounds, thetwo g-reactions can be condensed into one grouptransfer reaction. Here we show the cases inwhich such a condensation reduces the qualityindex Q

C"N

C/P

Cfor C-paranets of interest,

thereby improving the ranking of the condensedparanet relative to its uncondensed partner.

There are two cases: "rst, suppose the initialg-reaction of the pair to be condensed producesa 5-carbon compound that meets constraint II(3#3P5#1), or a 4-carbon compound thatmeets constraint III (3#3P4#1#1). Thencondensing the two g-reactions deletes the 5- or4-carbon compound. This condensation does notreduce Q

Cbecause an additional reaction must be

added to the condensed paranet to meet con-straint II or III, thereby increasing its Q

C.

Second, suppose the initial g-reaction does notproduce a compound that meets a constraint. Now,every C-net in the paranet must either include orexclude both g-reactions (in sequence), or the grouptransfer reaction derived from them. In this second

case, let N be the number of reactions in the con-densed paranet, and let S be the total number ofreactions in all three C-nets derived from thecondensed paranet. We see that S satis"esN#2)S)3N: the lower bound obtains becauseeach g-reaction in the paranet must be used in atleast one of the three C-nets, and connectedness ofthe paranet requires that at least two of the threepossible pairs of C-nets must share at least oneg-reaction; hence, a lower bound on S is N#2. Theupper bound of 3N on S is achieved when all threeC-nets are identical and use all of the g-reactions.Now, let K be the number of C-nets derived fromthe condensed C-paranet that include the grouptransfer reaction. K may be 1, 2, or 3.

Now, we show that Quncond

/Qcond

'1*that is,the condensed paranet is better*for K, S, andN of interest. Observe that

Quncond

/Qcond

"(N#1/N)Pcond

/Puncond

"(N#1/N) (S/3N) (3(N#1)/S#K).

This follows by de"nition of the Q's in termsof the P's and the substitutions P

cond"S/3N,

Puncond

"(S#K)/[3(N#1)]. The latter followsbecause the uncondensed paranet has one moreg-reaction than the condensed paranet, and hasS#K as the total number of reactions in all threeC-nets derived from the condensed paranet.

To "nd the values of S for which the ratioQ

uncond/Q

condis greater than 1, we rearrange the

above expression to get

S'NMK/[2#(1/N)]N.

We set K"3 as the worst-case scenario. Now,for N"4 we get S'12/2.333"5.14 and thelower bound on S when N"4 is N#2"6.When N"5, we get S'15/2.2"6.81 and thelower bound on S when N"5 is N#2"7.Clearly, when the lower bound on S satis"es theabove inequality in the worst-case scenario fora given N, all other values of S also satisfy it.Thus, for N"4 and 5, Q

uncond/Q

condis greater

than 1 for all allowed values of S. However, whenN"6, S must be greater than 18/2.166"8.3 butthe lower bound on S is N#2"8, so the boundfails. For N*6 there will be, in general, values ofS for which the ratio Q

uncond/Q

cond(1.

a new method for assembling metabolic networks, with...

Documents