metaflux in pathway tools - sri international · a .sol file: a summary of the solution found as a...

50
Overview of MetaFlux Introduction to Flux Balance Analysis Submitting an FBA Input File Genes and Reactions Knockout Introd MetaFlux in Pathway Tools Mario Latendresse Markus Krummenacker SRI International Oct 17 – 18, 2013

Upload: duongnhi

Post on 29-May-2019

215 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: MetaFlux in Pathway Tools - SRI International · A .sol file: a summary of the solution found as a text file. ... A .dat file: an omics data file for the Pathway Tools Cellular

Overview of MetaFlux Introduction to Flux Balance Analysis Submitting an FBA Input File Genes and Reactions Knockout Introduction to Development Mode Generating a Model The MILP Formulation Development Mode The Weights and Gap: Fine Control

MetaFlux in Pathway Tools

Mario LatendresseMarkus Krummenacker

SRI International

Oct 17 – 18, 2013

Page 2: MetaFlux in Pathway Tools - SRI International · A .sol file: a summary of the solution found as a text file. ... A .dat file: an omics data file for the Pathway Tools Cellular

Overview of MetaFlux Introduction to Flux Balance Analysis Submitting an FBA Input File Genes and Reactions Knockout Introduction to Development Mode Generating a Model The MILP Formulation Development Mode The Weights and Gap: Fine Control

Outline

1 Overview of MetaFlux2 Introduction to Flux Balance Analysis

Standard LP Formulation3 Submitting an FBA Input File

Output Produced4 Genes and Reactions Knockout5 Introduction to Development Mode6 Generating a Model

Single and Multiple Gap-Filling7 The MILP Formulation

User Input: Fixed and Try Sets, Weights8 Development Mode

Methodology9 The Weights and Gap: Fine Control

Page 3: MetaFlux in Pathway Tools - SRI International · A .sol file: a summary of the solution found as a text file. ... A .dat file: an omics data file for the Pathway Tools Cellular

Overview of MetaFlux Introduction to Flux Balance Analysis Submitting an FBA Input File Genes and Reactions Knockout Introduction to Development Mode Generating a Model The MILP Formulation Development Mode The Weights and Gap: Fine Control

The FBA Tool in Pathway Tools

1 The FBA Tool, MetaFlux, was introduced in version 15.0 ofPathway Tools (Feb 2011)

2 MetaFlux has three modes: solving, development, andgene knockout

3 Solving mode: compute the fluxes of reactions to producethe biomass

4 Development mode: trying different biomass, nutrients,secretions, and reactions to create a model

5 Gene knockout: deactivating gene(s) from the model andsee the effect on growth (testing a model)

Page 4: MetaFlux in Pathway Tools - SRI International · A .sol file: a summary of the solution found as a text file. ... A .dat file: an omics data file for the Pathway Tools Cellular

Overview of MetaFlux Introduction to Flux Balance Analysis Submitting an FBA Input File Genes and Reactions Knockout Introduction to Development Mode Generating a Model The MILP Formulation Development Mode The Weights and Gap: Fine Control

What is a Flux of a Reaction?

Fluxes are rate of reactions typically expressed as mmolper gram dry weight per hour, denoted mmol/gDW/hrOther units could be used (eg, mol instead of mmol)The FBA tool does not assume any unit for the fluxes asthis is not needed to get valid resultsSolving an FBA model gives the fluxes of all reactions thatare needed to create a non-zero flux for the biomass (setof metabolites necessary for growth)

Page 5: MetaFlux in Pathway Tools - SRI International · A .sol file: a summary of the solution found as a text file. ... A .dat file: an omics data file for the Pathway Tools Cellular

Overview of MetaFlux Introduction to Flux Balance Analysis Submitting an FBA Input File Genes and Reactions Knockout Introduction to Development Mode Generating a Model The MILP Formulation Development Mode The Weights and Gap: Fine Control

Creating an FBA Model vs Solving an FBA Model (1)

Creating a Flux Balance Analysis (FBA) ModelCreating an FBA model consists in curating the description ofan organism such that it represents as accurately as possiblethe in vivo reaction fluxes under certain conditions.

Solving an FBA ModelSolving an FBA model computes the reaction fluxes undercertain conditions. The model could be infeasible: no biomassproduced.

Gene KnockoutA gene knockout deactivates the reactions catalyzed by a geneand solving such a model. We can verify an FBA model bycomparing the results (growth/no growth) of a gene deletionwith experimental data.

Page 6: MetaFlux in Pathway Tools - SRI International · A .sol file: a summary of the solution found as a text file. ... A .dat file: an omics data file for the Pathway Tools Cellular

Overview of MetaFlux Introduction to Flux Balance Analysis Submitting an FBA Input File Genes and Reactions Knockout Introduction to Development Mode Generating a Model The MILP Formulation Development Mode The Weights and Gap: Fine Control

Creating an FBA Model vs Solving an FBA Model (2)

Page 7: MetaFlux in Pathway Tools - SRI International · A .sol file: a summary of the solution found as a text file. ... A .dat file: an omics data file for the Pathway Tools Cellular

Overview of MetaFlux Introduction to Flux Balance Analysis Submitting an FBA Input File Genes and Reactions Knockout Introduction to Development Mode Generating a Model The MILP Formulation Development Mode The Weights and Gap: Fine Control

What is Flux Balance Analysis (FBA)?

Computing Fluxes of Reactions for Organism GrowthGiven a network of biochemical reactions, nutrients andsecretions, assign a flux (a numerical value) to every reaction toproduce a set of biomass metabolites for growth. Maximize thebiomass.

Main AssumptionsThe system is in a steady state (metabolite concentrationsdo not vary)Regulation is ignoredCofactors are ignoredCompartments are not completely taken care ofSome transport reactions must be explicitly specified (e.g.,ATP synthase)

Page 8: MetaFlux in Pathway Tools - SRI International · A .sol file: a summary of the solution found as a text file. ... A .dat file: an omics data file for the Pathway Tools Cellular

Overview of MetaFlux Introduction to Flux Balance Analysis Submitting an FBA Input File Genes and Reactions Knockout Introduction to Development Mode Generating a Model The MILP Formulation Development Mode The Weights and Gap: Fine Control

Standard LP Formulation

Standard FBA Mathematical Formulation

Main Formulation

Max bbiomass

Sijv = 0where S is the stoichiometric matrix

S is a matrix where each row represents a metabolite and eachcolumn represents a reaction.bbiomass is the flux for the biomass reaction.v is a vector of variables representing the fluxes (real numbers).Reactions must be mass balanced.

Page 9: MetaFlux in Pathway Tools - SRI International · A .sol file: a summary of the solution found as a text file. ... A .dat file: an omics data file for the Pathway Tools Cellular

Overview of MetaFlux Introduction to Flux Balance Analysis Submitting an FBA Input File Genes and Reactions Knockout Introduction to Development Mode Generating a Model The MILP Formulation Development Mode The Weights and Gap: Fine Control

Standard LP Formulation

An Example of an LP Formulation

The Reactions, Biomass, Nutrients C, D, E

R1 : A + 2B → BiomassR2 : C + D → 2AR3 : D + E → B

The Linear Program (LP)

A : 2R2 − R1 = 0B : R3 − 2R1 = 0C : RC − R2 = 0D : RD − R2 − R3 = 0E : RE − R3 = 0

Maximize R10 ≤ RC ,RD,RE ≤ 100

Page 10: MetaFlux in Pathway Tools - SRI International · A .sol file: a summary of the solution found as a text file. ... A .dat file: an omics data file for the Pathway Tools Cellular

Overview of MetaFlux Introduction to Flux Balance Analysis Submitting an FBA Input File Genes and Reactions Knockout Introduction to Development Mode Generating a Model The MILP Formulation Development Mode The Weights and Gap: Fine Control

Standard LP Formulation

The Biomass Reaction

It is a virtual reaction representing a set of metabolites to beproduced to enable growth.

c1B1 + c2B2 + · · ·+ cnBn → Biomass

The coefficients ci are integers (positive or negative).That reaction is in the S matrix (as any other reaction).Note: all Bi metabolites must be produced by some otherreactions in the organism to satisfy the LP formulationwith a non-zero biomass.

Page 11: MetaFlux in Pathway Tools - SRI International · A .sol file: a summary of the solution found as a text file. ... A .dat file: an omics data file for the Pathway Tools Cellular

Overview of MetaFlux Introduction to Flux Balance Analysis Submitting an FBA Input File Genes and Reactions Knockout Introduction to Development Mode Generating a Model The MILP Formulation Development Mode The Weights and Gap: Fine Control

Standard LP Formulation

Solving the FBA Formulation

Solving such a formulation is done by a LinearProgramming (LP) solverThere are many open source and commercial LP solvers:CPLEX, GLPK, SCIP, Gurobi, and morePathway Tools 15.0 uses SCIPEven with thousands of reactions, typical FBA/LPformulation can be solved in a few seconds

Page 12: MetaFlux in Pathway Tools - SRI International · A .sol file: a summary of the solution found as a text file. ... A .dat file: an omics data file for the Pathway Tools Cellular

Overview of MetaFlux Introduction to Flux Balance Analysis Submitting an FBA Input File Genes and Reactions Knockout Introduction to Development Mode Generating a Model The MILP Formulation Development Mode The Weights and Gap: Fine Control

Syntax of the MetaFlux Input File

The input to MetaFlux is a text file (a .fba file)Metabolites are specified as list of names or frame-ids oruse all-compounds

Reactions are specified as frame-ids (common) or reactionequations (rare)The keyword metab-all is the set of all metabolicreactions in the PGDBThe keyword metacyc-metab-all is the set of allmetabolic reactions in MetaCycAlso, keywords transport-all andmetacyc-transport-all

Full documentation in the FBA Chapter of the PathwayTools’ User GuideLet’s look at the bsubcyc1.fba file

Page 13: MetaFlux in Pathway Tools - SRI International · A .sol file: a summary of the solution found as a text file. ... A .dat file: an omics data file for the Pathway Tools Cellular

Overview of MetaFlux Introduction to Flux Balance Analysis Submitting an FBA Input File Genes and Reactions Knockout Introduction to Development Mode Generating a Model The MILP Formulation Development Mode The Weights and Gap: Fine Control

A graphical user interface (GUI) is used to submit an FBAinput fileThe output files (e.g., .sol) are displayed as a text file viaa browserThe Cellular Omics Viewer can be invoked via one click toshow the resulting fluxesThe invocation of the MetaFlux GUI is under the ToolsmenuGUI demo using the bsubcyc1.fba file

Page 14: MetaFlux in Pathway Tools - SRI International · A .sol file: a summary of the solution found as a text file. ... A .dat file: an omics data file for the Pathway Tools Cellular

Overview of MetaFlux Introduction to Flux Balance Analysis Submitting an FBA Input File Genes and Reactions Knockout Introduction to Development Mode Generating a Model The MILP Formulation Development Mode The Weights and Gap: Fine Control

Output Produced

Four Files Generated

Four files are generated when submitting a model.

A .lp file: the input to the SCIP solverA .log file: a trace of the output of the generation phaseand of the SCIP solver. It contains the quality of thesolution, the reactions that were filtered out (e.g.unbalanced or might be unbalanced).A .sol file: a summary of the solution found as a text file.This is the file to look at first. Contains metabolitesproduced, nutrients and secretions used, added reactions,fluxes for all reactions, and reactions with zero flux.A .dat file: an omics data file for the Pathway ToolsCellular Overview. Contains only reactions with non-zeroflux. Gap-filled reactions are not included.

Page 15: MetaFlux in Pathway Tools - SRI International · A .sol file: a summary of the solution found as a text file. ... A .dat file: an omics data file for the Pathway Tools Cellular

Overview of MetaFlux Introduction to Flux Balance Analysis Submitting an FBA Input File Genes and Reactions Knockout Introduction to Development Mode Generating a Model The MILP Formulation Development Mode The Weights and Gap: Fine Control

Output Produced

The Solution File

The solution files list the:

fixed- and try-biomass metabolites that could be producedfixed- and try-nutrients usedfixed- and try-secretions producedadded reactions from MetaCyc (reversed or not)added reversed reactions from the PGDBreactions with their non-zero fluxremaining reactions that have zero flux

Page 16: MetaFlux in Pathway Tools - SRI International · A .sol file: a summary of the solution found as a text file. ... A .dat file: an omics data file for the Pathway Tools Cellular

Overview of MetaFlux Introduction to Flux Balance Analysis Submitting an FBA Input File Genes and Reactions Knockout Introduction to Development Mode Generating a Model The MILP Formulation Development Mode The Weights and Gap: Fine Control

Output Produced

The .log File

Contains warnings and possibly error messages aboutunbalanced and instantiated reactionsEach reaction that is unbalanced or might be unbalancedis not included in the model. The list of these reactions aregiven in that file.The process of instantiation of reactions is summarize inthat fileMore on instantiation of reactions in Markus’ presentation

Page 17: MetaFlux in Pathway Tools - SRI International · A .sol file: a summary of the solution found as a text file. ... A .dat file: an omics data file for the Pathway Tools Cellular

Overview of MetaFlux Introduction to Flux Balance Analysis Submitting an FBA Input File Genes and Reactions Knockout Introduction to Development Mode Generating a Model The MILP Formulation Development Mode The Weights and Gap: Fine Control

Output Produced

The .dat File

Each reaction having a flux is listed in that fileThe file can be used as is by the Cellular Omics Viewer ofPathway ToolsThe file can be used on the Web or the Desktop modeThe Cellular Omics Viewer is directly accessible from theMetaFlux GUI

Page 18: MetaFlux in Pathway Tools - SRI International · A .sol file: a summary of the solution found as a text file. ... A .dat file: an omics data file for the Pathway Tools Cellular

Overview of MetaFlux Introduction to Flux Balance Analysis Submitting an FBA Input File Genes and Reactions Knockout Introduction to Development Mode Generating a Model The MILP Formulation Development Mode The Weights and Gap: Fine Control

Output Produced

The .lp File

The .lp file is the input of the solver. The completespecification of the model is in that file.The file has three parts: the objective function (maximizeobj), the constraints (Subject to), the list of variableswith their lower and upper bounds (bounds).There are comments in that fileEach metabolite produces one constraint: the constraintcontains all reactions where that metabolite is eitherproduced or consumed. Each constraint says thatproduction and consumption of a specific metabolite mustbalanceFor MILP, the objective function can be very large (over10000 terms); for LP, this is a single term

Page 19: MetaFlux in Pathway Tools - SRI International · A .sol file: a summary of the solution found as a text file. ... A .dat file: an omics data file for the Pathway Tools Cellular

Overview of MetaFlux Introduction to Flux Balance Analysis Submitting an FBA Input File Genes and Reactions Knockout Introduction to Development Mode Generating a Model The MILP Formulation Development Mode The Weights and Gap: Fine Control

Testing a Model Using Genes Knockout

Knocking Out One GeneKnocking out a gene means to deactivate the reactionscatalyzed by that geneIsozymes are taken into account

Multiple KockoutsMore than one gene might be knocked out simultaneously

Batch KnockoutsTypically, MetaFlux is used to run a batch of gene knockouts(e.g., all genes)

Page 20: MetaFlux in Pathway Tools - SRI International · A .sol file: a summary of the solution found as a text file. ... A .dat file: an omics data file for the Pathway Tools Cellular

Overview of MetaFlux Introduction to Flux Balance Analysis Submitting an FBA Input File Genes and Reactions Knockout Introduction to Development Mode Generating a Model The MILP Formulation Development Mode The Weights and Gap: Fine Control

Knockout Parameters in FBA Input File (1)

The parameter knockout-genes gives the genes (namesor frame ids) to knockout, not necessarily simultaneouslyWe can specify metab-genes for all genes that areinvolved in metabolic reactions of the PGDBThe parameter knockout-nb-genes gives the number ofgenes to knockout from the set knockout-genesIf knockout-nb-genes is 1, it is a single-gene knockout,if it is 2, it is a double-gene knockout, and so onNote that a double-gene knockout for all genes is over 500thousands knockout experiments if you have, say, 1000genes

Page 21: MetaFlux in Pathway Tools - SRI International · A .sol file: a summary of the solution found as a text file. ... A .dat file: an omics data file for the Pathway Tools Cellular

Overview of MetaFlux Introduction to Flux Balance Analysis Submitting an FBA Input File Genes and Reactions Knockout Introduction to Development Mode Generating a Model The MILP Formulation Development Mode The Weights and Gap: Fine Control

Knockout Parameters in FBA Input File (2)

Parameter knockout-summary-only controls thenumber of .sol files generatedIf it is yes then only one .sol is generated summarizingthe gene knockout solutionsIf it is no then a .sol file is generated for each geneknockout run, plus the summaryEach file has the suffix ’knockout-n’ where n is an integerstarting with 0Saying no could generate thousands of .sol files whenusing metab-genes for knockout-genes

Page 22: MetaFlux in Pathway Tools - SRI International · A .sol file: a summary of the solution found as a text file. ... A .dat file: an omics data file for the Pathway Tools Cellular

Overview of MetaFlux Introduction to Flux Balance Analysis Submitting an FBA Input File Genes and Reactions Knockout Introduction to Development Mode Generating a Model The MILP Formulation Development Mode The Weights and Gap: Fine Control

Knockout Parameters in FBA Input File (3)

The parameter knockout-reactions can be used tospecify explicitly the reactions to deactivate withoutspecifying genesThe parameter knockout-nb-reactions is used tospecify the number of reactions to deactivatesimultaneouslyThese two parameters can be combined with theparameters for knocking out genes

Page 23: MetaFlux in Pathway Tools - SRI International · A .sol file: a summary of the solution found as a text file. ... A .dat file: an omics data file for the Pathway Tools Cellular

Overview of MetaFlux Introduction to Flux Balance Analysis Submitting an FBA Input File Genes and Reactions Knockout Introduction to Development Mode Generating a Model The MILP Formulation Development Mode The Weights and Gap: Fine Control

Knockout Parameters in FBA Input File (4)

Examples of gene knockout run on EcoCyc for1 A few genes: cysN, cysD, gltX2 All metabolic genes with summary solution file only (takes

about one minute)3 All metabolic genes with all solution files generated (takes

more than one minute)

Page 24: MetaFlux in Pathway Tools - SRI International · A .sol file: a summary of the solution found as a text file. ... A .dat file: an omics data file for the Pathway Tools Cellular

Overview of MetaFlux Introduction to Flux Balance Analysis Submitting an FBA Input File Genes and Reactions Knockout Introduction to Development Mode Generating a Model The MILP Formulation Development Mode The Weights and Gap: Fine Control

Minimal Nutrient Sets

Definition of a Minimal Nutrient SetA minimal nutrient set is a sufficient set of nutrients to havegrowth and from which any nutrient removed results in nogrowth. It is likely that there are many minimal nutrient sets, ofvarious sizes, for one organism and one biomass reaction.

Minimum Set of NutrientsA minimal nutrient set is not necessarily the smallest set ofnutrients. A minimum set of nutrients is a minimal set havingthe smallest number of nutrients among all minimal sets.

Page 25: MetaFlux in Pathway Tools - SRI International · A .sol file: a summary of the solution found as a text file. ... A .dat file: an omics data file for the Pathway Tools Cellular

Overview of MetaFlux Introduction to Flux Balance Analysis Submitting an FBA Input File Genes and Reactions Knockout Introduction to Development Mode Generating a Model The MILP Formulation Development Mode The Weights and Gap: Fine Control

Verifying Minimal Sets for EcoCyc

Let’s assume a simple biomass reaction, and the currentset of reactions in EcoCyc: we will not add reactions toEcoCycWe can infer from biological knowledge some small sets ofnutrientsWe will need to find out which secretions are necessary touse these minimal sets

Page 26: MetaFlux in Pathway Tools - SRI International · A .sol file: a summary of the solution found as a text file. ... A .dat file: an omics data file for the Pathway Tools Cellular

Overview of MetaFlux Introduction to Flux Balance Analysis Submitting an FBA Input File Genes and Reactions Knockout Introduction to Development Mode Generating a Model The MILP Formulation Development Mode The Weights and Gap: Fine Control

Finding the Right Secretions

The try-secretions parameter can specify a set ofsecretions to try. We can specify all-compounds to trythem all.The try-secretions-weight should be negative, say−10, so that it cost something to add a secretion to themodelMetaFlux will report which secretions are needed to havegrowth, given the biomass reaction, the nutrients, and theset of reactions of EcoCycMetaFlux could have no feasible solution, that is no sets ofsecretions that it can add to generate growth

Page 27: MetaFlux in Pathway Tools - SRI International · A .sol file: a summary of the solution found as a text file. ... A .dat file: an omics data file for the Pathway Tools Cellular

Overview of MetaFlux Introduction to Flux Balance Analysis Submitting an FBA Input File Genes and Reactions Knockout Introduction to Development Mode Generating a Model The MILP Formulation Development Mode The Weights and Gap: Fine Control

Infeasible Formulation

FBA/LP formulation is infeasible: no solution or the onlysolution is a zero flux for the biomassInfeasibility is likely due to some metabolites in thebiomass that cannot be producedThis might be due to missing reactions, nutrients,secretions, or a combination of theseGap-filling proposes model modifications of minimalcost

Page 28: MetaFlux in Pathway Tools - SRI International · A .sol file: a summary of the solution found as a text file. ... A .dat file: an omics data file for the Pathway Tools Cellular

Overview of MetaFlux Introduction to Flux Balance Analysis Submitting an FBA Input File Genes and Reactions Knockout Introduction to Development Mode Generating a Model The MILP Formulation Development Mode The Weights and Gap: Fine Control

If Infeasible with All Secretions Tried

Assuming that even with all secretions tried, no feasiblesolution is foundWe could find out which subset of biomass metabolites canbe producedSpecify all the biomass metabolites as try-biomassSpecify a large positive weight (a gain) fortry-biomass-weight, say 1000MetaFlux can apply multiple gap-filling simultaneously

Page 29: MetaFlux in Pathway Tools - SRI International · A .sol file: a summary of the solution found as a text file. ... A .dat file: an omics data file for the Pathway Tools Cellular

Overview of MetaFlux Introduction to Flux Balance Analysis Submitting an FBA Input File Genes and Reactions Knockout Introduction to Development Mode Generating a Model The MILP Formulation Development Mode The Weights and Gap: Fine Control

Controlling the Weight Parameters

A weight is a positive or negative integerA positive weight is a gain, whereas a negative weight is acostMetaFlux always try to maximize an objective function: ifsomething (e.g. secretions) has a cost, MetaFluxminimizes their use, if it has a gain, it maximizes their useor production (biomass)The absolute value of a weight is not very important, butthe relative weight values are importantConsider a gain of 1000 for one biomass metabolite vs thecost of 10 (weight -10) for a secretions. Consider a gain of10 for one biomass. What is the difference in possiblesolutions?Let’s look at a .lp MILP file to better see what ismathematically happening

Page 30: MetaFlux in Pathway Tools - SRI International · A .sol file: a summary of the solution found as a text file. ... A .dat file: an omics data file for the Pathway Tools Cellular

Overview of MetaFlux Introduction to Flux Balance Analysis Submitting an FBA Input File Genes and Reactions Knockout Introduction to Development Mode Generating a Model The MILP Formulation Development Mode The Weights and Gap: Fine Control

Single and Multiple Gap-Filling

Single and Multiple Gap-Filling

Typically "Gap-Filling" Means "Completing the ReactionNetwork"

Gap-filling adds reactions from a reference database (e.g.,MetaCyc) to the FBA model to produce missing biomassModel might still be infeasible due to a lack of reactions inMetaCyc, or lack of nutrients, or secretions

Solution: Gap-Filling Extended to Important MetabolitesNutrients, secretions, and biomass metabolites can also beadded or removed. For biomass metabolites, we try to includeas many as possible while still getting a feasible solution.

Page 31: MetaFlux in Pathway Tools - SRI International · A .sol file: a summary of the solution found as a text file. ... A .dat file: an omics data file for the Pathway Tools Cellular

Overview of MetaFlux Introduction to Flux Balance Analysis Submitting an FBA Input File Genes and Reactions Knockout Introduction to Development Mode Generating a Model The MILP Formulation Development Mode The Weights and Gap: Fine Control

Single and Multiple Gap-Filling

Multiple Gap-Filling

Multiple Gap-FillingMultiple gap-filling is done on reactions, nutrients, secretions,and biomass metabolites at the same time.

ObjectiveTry to add as many biomass metabolites as possible by addinga minimum number of nutrients, secretions, and reactions; andstill get a feasible solution.

UsageSpeeds curation of a PGDB. It is a technique to complete aPGDB to do standard FBA analysis.

Our multiple gap-filling extends the reaction gap-filling ideadeveloped by Costas Maranas.

Page 32: MetaFlux in Pathway Tools - SRI International · A .sol file: a summary of the solution found as a text file. ... A .dat file: an omics data file for the Pathway Tools Cellular

Overview of MetaFlux Introduction to Flux Balance Analysis Submitting an FBA Input File Genes and Reactions Knockout Introduction to Development Mode Generating a Model The MILP Formulation Development Mode The Weights and Gap: Fine Control

Linear Programming Becomes Mixed-Integer Linear Programming(MILP)

The LP formulation becomes a Mixed Integer Linear Program(MILP): binary variables control the addition of reactions,nutrients, biomass, and secretions.A constraint to control the flux ri of a reaction with a binaryvariable si :

ri − si1000 ≤ 0

When si is 1, the reaction ri can have a non-zero flux. And thatsi add a cost or gain in the objective function to maximize.The biomass, secretions and nutrients can be converted intovirtual reactions.Each biomass metabolite is controlled by one reaction: nomore major constraint as in the LP formulation.

Page 33: MetaFlux in Pathway Tools - SRI International · A .sol file: a summary of the solution found as a text file. ... A .dat file: an omics data file for the Pathway Tools Cellular

Overview of MetaFlux Introduction to Flux Balance Analysis Submitting an FBA Input File Genes and Reactions Knockout Introduction to Development Mode Generating a Model The MILP Formulation Development Mode The Weights and Gap: Fine Control

User Input: Fixed and Try Sets, Weights

Fixed Sets for Multiple Gap-filling

The user provides fixed sets of reactions and metabolites “at nocost or gain”.

Set of fixed reactions to use at no cost: typically allmetabolic reactions of the PGDB are usedSets of nutrient and secreted metabolites that can be usedat no costSet of metabolites that must be produced in the biomassand for which no gain is givenAny or all of these sets might be emptyIt is recommended to start with an empty set of fixedbiomass metabolites.

Page 34: MetaFlux in Pathway Tools - SRI International · A .sol file: a summary of the solution found as a text file. ... A .dat file: an omics data file for the Pathway Tools Cellular

Overview of MetaFlux Introduction to Flux Balance Analysis Submitting an FBA Input File Genes and Reactions Knockout Introduction to Development Mode Generating a Model The MILP Formulation Development Mode The Weights and Gap: Fine Control

User Input: Fixed and Try Sets, Weights

Try-Sets and Weights for Multiple Gap-filling

The user provides four try-sets and weights to control thegeneration of the model.

Set of reactions to try to add at a cost: typically allmetabolic reactions of MetaCycSets of nutrients, secretions and biomass metabolites totry to add to the modelWeights, as integers for gain and cost, for the reactions,nutrients, secretions and biomass metabolitesTypically, adding a biomass metabolite is a gain, butadding a reaction or a nutrient is a cost. We have differentweights for different type of reactions (e.g., spontaneous, inthe taxonomic range, etc.)

Page 35: MetaFlux in Pathway Tools - SRI International · A .sol file: a summary of the solution found as a text file. ... A .dat file: an omics data file for the Pathway Tools Cellular

Overview of MetaFlux Introduction to Flux Balance Analysis Submitting an FBA Input File Genes and Reactions Knockout Introduction to Development Mode Generating a Model The MILP Formulation Development Mode The Weights and Gap: Fine Control

User Input: Fixed and Try Sets, Weights

The Objective Function for Multiple Gap-Filling (1)

The objective function to maximize is∑i

wtRi +∑

i

woRi +∑

i

wuRi +∑

i

wsRi +∑

i

wr Ri +∑

i

wrmRi∑i

wbBi +∑

i

wsSi +∑

i

wnNi −∑

j

Fj

where wt ,wo,wu,ws,wr ,wrm are weights for reactionsin taxonomic range, outside taxonomic range,unknown taxonomic range, spontaneous, reversed from PGDB,and reversed from Metacycwhere wb,ws,wn are weights for biomass, secretions, and nutrientswhere Bi ,Ri ,Si ,Ni are binary variableswhere Fj are the fluxes of reactions.

Page 36: MetaFlux in Pathway Tools - SRI International · A .sol file: a summary of the solution found as a text file. ... A .dat file: an omics data file for the Pathway Tools Cellular

Overview of MetaFlux Introduction to Flux Balance Analysis Submitting an FBA Input File Genes and Reactions Knockout Introduction to Development Mode Generating a Model The MILP Formulation Development Mode The Weights and Gap: Fine Control

User Input: Fixed and Try Sets, Weights

The Objective Function for Multiple Gap-Filling (2)

Different weights wt ,wo,wu,ws,wr ,wrm for reactionsThe wt are for reactions from MetaCyc being in thetaxonomic range of the PGDB whereas wo are forreactions outside the taxonomic range and wu for reactionsof unknown taxonomic rangeThe weight ws for spontaneous reaction should be a lownegative number and not zero to avoid bringing them all inthe modelThe wr are for reversed reactions from the PGDB that arenot reversibleSimilarly for wrm for reversed reactions from MetaCycThe term −

∑j Fj forces removal of the high-flux loops

At that stage, the real fluxes of the reactions are notoptimized. Solving the generated FBA model would givethe optimized fluxes.

Page 37: MetaFlux in Pathway Tools - SRI International · A .sol file: a summary of the solution found as a text file. ... A .dat file: an omics data file for the Pathway Tools Cellular

Overview of MetaFlux Introduction to Flux Balance Analysis Submitting an FBA Input File Genes and Reactions Knockout Introduction to Development Mode Generating a Model The MILP Formulation Development Mode The Weights and Gap: Fine Control

User Input: Fixed and Try Sets, Weights

Mixed Integer Linear Programming (MILP)

A MILP formulation is typically more difficult to solveexactly than a LP formulation due to the integer and binaryvariables. Essentially, the integer and binary variablesrequire the solver to try to solve many (e.g., thousands) ofLP cases.The solver might take forever to find the optimal solutionWe typically set a time limit to the solver, say 5 minutesMILP solvers vary widely in their performance andcapabilities

Page 38: MetaFlux in Pathway Tools - SRI International · A .sol file: a summary of the solution found as a text file. ... A .dat file: an omics data file for the Pathway Tools Cellular

Overview of MetaFlux Introduction to Flux Balance Analysis Submitting an FBA Input File Genes and Reactions Knockout Introduction to Development Mode Generating a Model The MILP Formulation Development Mode The Weights and Gap: Fine Control

Methodology

The Weights: Costs and Gains

Typical WeightsAdding a biomass metabolite to the model is a gain.Adding any reaction, secretion, or nutrient has a cost.That corresponds to the usual goal: generating as manybiomass metabolites as possible with the minimum numberof nutrients, secretions, and added reactions

VariationsBut other scenarios are useful: use as many nutrients andsecretions as possible

Selecting the Right Weights for ReactionsThere are many different weights for the reactions: taxonomicrange, reversed, and more

Page 39: MetaFlux in Pathway Tools - SRI International · A .sol file: a summary of the solution found as a text file. ... A .dat file: an omics data file for the Pathway Tools Cellular

Overview of MetaFlux Introduction to Flux Balance Analysis Submitting an FBA Input File Genes and Reactions Knockout Introduction to Development Mode Generating a Model The MILP Formulation Development Mode The Weights and Gap: Fine Control

Methodology

Balancing Out Costs and Gains (1)

Example: 1000 for a biomass metabolite, -10 for areaction, -2 for a nutrient, -1 for a secretionThe solver could add as many as 100 reactions to produceone biomass metaboliteThe solver would not add 101 reactions since there wouldbe a net lostSetting the gain at 106 for one biomass metabolite wouldcertainly add as many reactions as possible to produceevery biomass metabolite since MetaCyc will have lessthan 100,000 reactions for the foreseeable future

Page 40: MetaFlux in Pathway Tools - SRI International · A .sol file: a summary of the solution found as a text file. ... A .dat file: an omics data file for the Pathway Tools Cellular

Overview of MetaFlux Introduction to Flux Balance Analysis Submitting an FBA Input File Genes and Reactions Knockout Introduction to Development Mode Generating a Model The MILP Formulation Development Mode The Weights and Gap: Fine Control

Methodology

Balancing Out Costs and Gains (2)

The ratio between the weight (cost) for a reaction and anutrient should also be wisely selectedExample: -10 for a reaction and -6 for a nutrient wouldbring in some reactions capable of replacing any twonutrients. This is typically undesirable. But you might beinterested to see if this is possible.

Page 41: MetaFlux in Pathway Tools - SRI International · A .sol file: a summary of the solution found as a text file. ... A .dat file: an omics data file for the Pathway Tools Cellular

Overview of MetaFlux Introduction to Flux Balance Analysis Submitting an FBA Input File Genes and Reactions Knockout Introduction to Development Mode Generating a Model The MILP Formulation Development Mode The Weights and Gap: Fine Control

Methodology

The Reaction Weights

The basic weight for a reaction from MetaCycoutside the taxonomic range of the PGDB is given bytry-reactions-weightin the taxonomic range of the PGDB is given bytry-reactions-in-taxa-weightof unknown taxonomic range is given bytry-reactions-unknown-taxa-weight

A reversed reaction from MetaCyc is added with theadditional weight given bytry-reactions-reverse-try-weight. This is anadditional weight to the basic weight.A reversed reaction from the PGDB, but reversed, is addedwith weight given by try-reactions-reverse-weight

Page 42: MetaFlux in Pathway Tools - SRI International · A .sol file: a summary of the solution found as a text file. ... A .dat file: an omics data file for the Pathway Tools Cellular

Overview of MetaFlux Introduction to Flux Balance Analysis Submitting an FBA Input File Genes and Reactions Knockout Introduction to Development Mode Generating a Model The MILP Formulation Development Mode The Weights and Gap: Fine Control

Methodology

Suggested Simple .fba Settings

The entire biomass reaction is specified as a list ofmetabolites in the try-biomass section. No metabolitesspecified in the biomass fixed set.Similarly for nutrients and secretions: all metabolites arespecified in the try-setsParameter try-biomass-weight is set to a high value, say10000. All other weights are negative with small values(−1 to −100).No coefficients for the metabolites (biomass, nutrients, orsecretions)The reaction section says metab-all. The try-reactionssection is empty.We have try-add-reverse-rxns: noExecute this .fba file and see how many try-biomassmetabolites can be produced

Page 43: MetaFlux in Pathway Tools - SRI International · A .sol file: a summary of the solution found as a text file. ... A .dat file: an omics data file for the Pathway Tools Cellular

Overview of MetaFlux Introduction to Flux Balance Analysis Submitting an FBA Input File Genes and Reactions Knockout Introduction to Development Mode Generating a Model The MILP Formulation Development Mode The Weights and Gap: Fine Control

Methodology

Settings When not all Biomass is Produced

You will rarely get all the try biomass metabolites producedthe first timeIn this case, add the keyword metacyc-metab-all tosection try-reactions

ExecuteMore biomass metabolites could be produced withsuggested reactions to addAnalyze the suggested reactions to add: are they in thesame taxonomic range as the organism of the PGDB, dothey form a pathway, etc.?If the number of suggested reactions to add isoverwhelming, decrease the list of metabolites to try in thebiomass reaction

Page 44: MetaFlux in Pathway Tools - SRI International · A .sol file: a summary of the solution found as a text file. ... A .dat file: an omics data file for the Pathway Tools Cellular

Overview of MetaFlux Introduction to Flux Balance Analysis Submitting an FBA Input File Genes and Reactions Knockout Introduction to Development Mode Generating a Model The MILP Formulation Development Mode The Weights and Gap: Fine Control

Methodology

Other Suggested .fba Settings

If the previous settings do not provide more biomassproduced, add try-add-reverse-rxns:yes andtry-add-reverse-try-rxns:yes

You could also try to only try reversing the reactions of yourPGDB without considering the MetaCyc reactionsMake sure the list of metabolites to secrete is not toosmall: add secretions, to the try-secretion set, that youthink might be missingExecuteReally add, to your PGDB, some of the suggestedreactions to add that you think are indeed missingReally change the directionality of some of the reactions assuggested by MetaFlux

Page 45: MetaFlux in Pathway Tools - SRI International · A .sol file: a summary of the solution found as a text file. ... A .dat file: an omics data file for the Pathway Tools Cellular

Overview of MetaFlux Introduction to Flux Balance Analysis Submitting an FBA Input File Genes and Reactions Knockout Introduction to Development Mode Generating a Model The MILP Formulation Development Mode The Weights and Gap: Fine Control

Methodology

Controlling Which Reactions that Can Be Added

Most of the reactions are added from the set given bytry-reactions

Reversed reactions from MetaCyc might be added only iftry-add-reverse-try-rxns is yesReversed reactions from the PGDB might be added only iftry-add-reverse-rxns is yes

Page 46: MetaFlux in Pathway Tools - SRI International · A .sol file: a summary of the solution found as a text file. ... A .dat file: an omics data file for the Pathway Tools Cellular

Overview of MetaFlux Introduction to Flux Balance Analysis Submitting an FBA Input File Genes and Reactions Knockout Introduction to Development Mode Generating a Model The MILP Formulation Development Mode The Weights and Gap: Fine Control

Methodology

Move Try Metabolites into Fixed Sets

When it is clear that some biomass metabolites can beproduced, they can be listed in the biomass section, thatis as fixed biomass metabolitesSimilarly for secretions and nutrients, they shouldeventually be listed as fixed metabolites in the .fba file

Page 47: MetaFlux in Pathway Tools - SRI International · A .sol file: a summary of the solution found as a text file. ... A .dat file: an omics data file for the Pathway Tools Cellular

Overview of MetaFlux Introduction to Flux Balance Analysis Submitting an FBA Input File Genes and Reactions Knockout Introduction to Development Mode Generating a Model The MILP Formulation Development Mode The Weights and Gap: Fine Control

Methodology

Iterative Tuning

The preceding suggestions will have to be repeated until asatisfactory set of biomass metabolites is producedIt will require a careful analysis of the suggested reactionsto addTypically, this process requires from several days to severalweeks of work

FBA ModelThe goal of this entire process is to get only fixed sets so that itdescribes a desired FBA model

At some point it might be useful to change the weights toincrease the speed of the solver

Page 48: MetaFlux in Pathway Tools - SRI International · A .sol file: a summary of the solution found as a text file. ... A .dat file: an omics data file for the Pathway Tools Cellular

Overview of MetaFlux Introduction to Flux Balance Analysis Submitting an FBA Input File Genes and Reactions Knockout Introduction to Development Mode Generating a Model The MILP Formulation Development Mode The Weights and Gap: Fine Control

Methodology

Checking a Model

A complete FBA model will require proper coefficients forthe biomass reactionThese coefficients could come from in vivo experimentsThe solving of a model will verify that the biomass flux issimilar to in vivo experimentsThe completed model can also be verified against geneknockout experiments

Page 49: MetaFlux in Pathway Tools - SRI International · A .sol file: a summary of the solution found as a text file. ... A .dat file: an omics data file for the Pathway Tools Cellular

Overview of MetaFlux Introduction to Flux Balance Analysis Submitting an FBA Input File Genes and Reactions Knockout Introduction to Development Mode Generating a Model The MILP Formulation Development Mode The Weights and Gap: Fine Control

Suboptimal Solutions

If the SCIP solver terminates due to the time limit, asuboptimal solution has been found.An optimal solution is sometimes needed to have thecorrect reactions to add.The quality of the solution is given by the "gap" percentagegiven in the terminal output alongside all output of theSCIP solver.This gap value must be interpreted based on the weightsused. Typically, we expect a gap of less than 5% to call itgood enough.In the first stages of generating a model, we limit the solverto 5 minutes. At some steps it might useful to increase it:20 minutes, 30 minutes, one hour, or more.

Page 50: MetaFlux in Pathway Tools - SRI International · A .sol file: a summary of the solution found as a text file. ... A .dat file: an omics data file for the Pathway Tools Cellular

Overview of MetaFlux Introduction to Flux Balance Analysis Submitting an FBA Input File Genes and Reactions Knockout Introduction to Development Mode Generating a Model The MILP Formulation Development Mode The Weights and Gap: Fine Control

How To Interpret the Gap

It is simpler to analyze the difference between the last twovalues under the headings "Dualbound" and "Primalbound"of the SCIP outputLet’s assume the gap is not 0%. And the suboptimalsolution suggests three reactions to add.Assume that Primalbound is 319900, Dualbound is320000. The difference is 100.Assuming that all the costs for adding a reaction is 50, itmeans that it is possible that SCIP will find an optimalsolution removing two suggested reactions to addIf you doubt that the suggested reactions to add arenecessary, rerun with more time for the solver