fast-sl: an efficient algorithm to identify synthetic lethals in metabolic networks
TRANSCRIPT
Fast-SL: An efficient algorithm to identify synthetic lethals in metabolic networks
Karthik Raman
Department of Biotechnology
Indian Institute of Technology Madras
https://home.iitm.ac.in/kraman/lab/
2015 NNMCB National Meeting
December 27, 2015
Introduction Fast-SL Results Conclusions
Genome-Scale Metabolic Networks (GSMNs)
▶ GSMNs account for the functions of all the known metabolic genes in an organism
▶ Constructed primarily from the genome sequence with annotations from enzyme and pathway databases
▶ 100+ GSMNs are presently available
Karthik Raman Fast-SL: An efficient algorithm to identify synthetic lethals in metabolic networks 1 / 24
What can GSMNs tell us?
McCloskey D et al (2013) Molecular Systems Biology 9:661
[Figure, after McCloskey et al (2013): uses of the E. coli genome-scale reconstruction across 248 total studies, in six categories: metabolic engineering (68 studies, 27.4%); model-driven discovery (18 studies, 7.3%); studies of evolutionary processes (19 studies, 7.7%); interspecies interaction (7 studies, 2.8%); prediction of cellular phenotypes; and analysis of biological network properties. The last two categories account for 64 studies (25.8%) and 72 studies (29.0%).]
What can GSMNs tell us?
▶ Predict potential drug targets, by identifying essential and synthetic lethal genes
Editor’s Choice: Identification of potential drug targets in Salmonella enterica sv. Typhimurium using metabolic modelling and experimental validation
Hassan B. Hartman,1 David A. Fell,1 Sergio Rossell,23
Peter Ruhdal Jensen,2 Martin J. Woodward,3 Lotte Thorndahl,4
Lotte Jelsbak,4 John Elmerdahl Olsen,4 Anu Raghunathan,54
Simon Daefler5 and Mark G. Poolman1
Correspondence
Mark G. Poolman
1Department of Medical and Biological Sciences, Oxford Brookes University, Gipsy Lane, Headington, Oxford OX3 0BP, UK
2Department of Systems Biology, Technical University of Denmark, Lyngby, Denmark
3Department of Food and Nutritional Sciences, University of Reading, Reading, UK
4Department of Veterinary Disease Biology, University of Copenhagen, Copenhagen, Denmark
5Department of Infectious Diseases, Mount Sinai School of Medicine, New York, NY, USA
What are Synthetic Lethals?
Synthetic lethal gene (or reaction) sets are sets of genes where only the simultaneous removal of all genes in the set abolishes growth:
[Figure: four genotypes of a cell carrying the genes abc and pqr: wild-type, Δabc, Δpqr and the double deletion ΔabcΔpqr]
The concept of synthetic lethality can be extended to higher orders, e.g. triplets
Why Identify Synthetic Lethals?
▶ Synthetic lethals find applications in
▶ Understanding gene function and functional associations¹
▶ Combinatorial drug targets against pathogens²
▶ Cancer therapy³
¹Ooi SLL et al (2006) Trends Genet 22:56–63
²Hsu KC et al (2013) PLoS Comput Biol 9:e1003127
³Kaelin WG (2005) Nat Rev Cancer 5:689–698
How to Identify Synthetic Lethals?
▶ Yeast synthetic lethals have been identified experimentally using yeast synthetic genetic arrays¹,²
▶ Previous in silico approaches have built on the framework of Flux Balance Analysis, and are therefore restricted to metabolic genes
¹Tong AHY et al (2001) Science 294:2364–2368
²Tong AHY et al (2004) Science 303:808–813
What is Flux Balance Analysis?
▶ Effective constraint-based method to study genome-scale metabolic networks¹
▶ The mass balance constraints in a system of reactions can be represented by a system of linear equations involving reaction fluxes at steady state
▶ The system is under-determined, so we compute the flux distribution that maximises biomass; mathematically, this is a linear programming problem
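\[
\begin{aligned}
\max \quad & v_{\mathrm{bio}} && \text{(the biomass flux)} \\
\text{s.t.} \quad & \sum_{j} s_{ij} v_j = 0 && \forall i \in M \ \text{(set of metabolites)} \\
& LB_j \le v_j \le UB_j && \forall j \in J \ \text{(set of reactions)}
\end{aligned}
\]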
¹Varma A & Palsson BO (1994) Applied and Environmental Microbiology 60:3724–3731
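As a minimal sketch of the LP above, the following solves FBA for a small toy network with SciPy's `linprog`. The six-reaction network, the reaction names and the helper `fba` are hypothetical, chosen only for illustration; they are not taken from the talk.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network (an assumption for illustration, not from the talk).
# Rows are metabolites A, B, C; columns are reactions:
#   0 EX_A: -> A    1 R1: A -> B    2 R2: A -> C
#   3 R3: C -> B    4 BIO: B ->     5 WASTE: A ->
S = np.array([
    [1, -1, -1,  0,  0, -1],  # A
    [0,  1,  0,  1, -1,  0],  # B
    [0,  0,  1, -1,  0,  0],  # C
], dtype=float)
BOUNDS = [(0, 10)] + [(0, 1000)] * 5  # (LB_j, UB_j); uptake capped at 10
BIO = 4  # index of the biomass reaction


def fba(S, bounds, bio):
    """Maximise the biomass flux subject to S v = 0 and LB <= v <= UB."""
    c = np.zeros(S.shape[1])
    c[bio] = -1.0  # linprog minimises, so maximise v_bio by minimising -v_bio
    res = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]),
                  bounds=bounds, method="highs")
    if res.status != 0:
        raise RuntimeError(res.message)
    return -res.fun, res.x


growth, v = fba(S, BOUNDS, BIO)
print(f"maximum biomass flux: {growth:.3f}")
```

In this toy network the uptake bound is the only limiting constraint, so the optimum routes all of the uptake to biomass.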
Geometrical interpretation of FBA
Orth JD et al (2010) Nature Biotechnology 28:245–248
[Figure: in flux space (axes v1, v2, v3), the unconstrained solution space is narrowed by the constraints (1) Sv = 0 and (2) a_i < v_i < b_i to the allowable solution space; optimization (maximize Z) then selects the optimal solution]
Flux Balance Analysis
▶ FBA has been shown to predict phenotypes accurately following various genetic perturbations¹,²
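▶ To delete reaction k, set vk = 0 and repeat the simulation; for a set of deleted reactions D ⊂ J:
\[
\begin{aligned}
\max \quad & v_{\mathrm{bio}} \\
\text{s.t.} \quad & \sum_{j} s_{ij} v_j = 0 && \forall i \in M \\
& LB_j \le v_j \le UB_j && \forall j \in J \\
& v_d = 0 && \forall d \in D \subset J
\end{aligned}
\]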
▶ FBA can also reliably predict synthetic lethal genes in metabolic networks of organisms such as yeast³
¹Edwards JS & Palsson BO (2000) BMC Bioinformatics 1:1
²Famili I et al (2003) PNAS 100:13134–13139
³Harrison R et al (2007) PNAS 104:2307–2312
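The deletion constraint can be sketched in code by clamping the bounds of each deleted reaction to zero before re-solving the LP. The toy network, the reaction names and the helper `growth_after_deletion` below are hypothetical and serve only to illustrate the idea.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network (illustrative assumption, not from the talk).
# Rows: metabolites A, B, C. Columns: the reactions listed in NAMES.
NAMES = ["EX_A", "R1", "R2", "R3", "BIO", "WASTE"]
S = np.array([
    [1, -1, -1,  0,  0, -1],  # A: made by EX_A; used by R1, R2, WASTE
    [0,  1,  0,  1, -1,  0],  # B: made by R1, R3; used by BIO
    [0,  0,  1, -1,  0,  0],  # C: made by R2; used by R3
], dtype=float)
BOUNDS = [(0, 10)] + [(0, 1000)] * 5
BIO = 4


def growth_after_deletion(deleted):
    """Maximum biomass flux with v_d = 0 for every reaction index d in `deleted`."""
    bounds = [(0.0, 0.0) if j in deleted else BOUNDS[j]
              for j in range(len(BOUNDS))]
    c = np.zeros(len(BOUNDS))
    c[BIO] = -1.0  # maximise v_bio
    res = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]),
                  bounds=bounds, method="highs")
    return -res.fun if res.status == 0 else 0.0


single = {NAMES[k]: growth_after_deletion({k}) for k in range(len(NAMES))}
double = growth_after_deletion({1, 2})  # delete R1 and R2 together

for name, g in single.items():
    print(f"delete {name}: growth = {abs(g):.1f}")
print(f"delete R1 + R2: growth = {abs(double):.1f}")
```

Here R1 and R2 are individually dispensable because each route to B can stand in for the other, but deleting both abolishes growth, which makes them a synthetic lethal pair in this toy example.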
Identifying Synthetic Lethals
Brute Force/Exhaustive Enumeration
▶ Single lethals are easier to identify
  ▶ Solve one optimisation problem for each gene deletion (genotype)
▶ Synthetic lethals are more difficult to identify
  ▶ Combinatorial explosion
  ▶ e.g. \(\binom{1000}{3} \approx 170\) million simulations!
  ▶ Quickly becomes infeasible for larger organisms …
  ▶ However, simulations are independent and can be easily parallelised on a computer cluster¹
¹Deutscher D et al (2006) Nature Genetics 38:993–8
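The scale of the combinatorial explosion is easy to check. The short sketch below counts the deletion sets of each order for a network of 1000 reactions, the round figure used on the slide; the variable names are illustrative.

```python
from math import comb

n_reactions = 1000  # round figure used on the slide

# Number of distinct deletion sets (and hence LPs) of each order
counts = {k: comb(n_reactions, k) for k in (1, 2, 3, 4)}

for k, n in counts.items():
    print(f"order {k}: {n:,} deletion sets")
```

The order-3 count is 166,167,000, which the slide rounds to about 170 million; each further order multiplies the count by a factor of a few hundred.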
Identifying Synthetic Lethals
Bi-Level Mixed Integer Linear Programming Problem
▶ SL-Finder¹ poses the synthetic lethal identification problem elegantly as a bi-level MILP
▶ Synthetic lethal double and triple reaction deletions have been reported for E. coli
▶ However, the MILP problems become increasingly difficult to solve
  ▶ Time taken, on a workstation, was ≈ 6.75 days, for the E. coli iAF1260 model
▶ MCSEnumerator is another MILP-based method, which runs even faster²
¹Suthers PF et al (2009) Molecular Systems Biology 5:301
²von Kamp A & Klamt S (2014) PLoS Computational Biology 10:e1003378
An Alternate Approach: Fast-SL
Pratapa A et al (2015) Bioinformatics 31:3299–3305
▶ Heavily prunes search space for synthetic lethals, and
▶ Exhaustively iterates through remaining (far fewer) combinations
▶ We successively compute:
  ▶ Jsl, the set of single lethal reactions,
  ▶ Jdl ⊂ J × J, the set of synthetic lethal reaction pairs, and
  ▶ Jtl ⊂ J × J × J, the set of synthetic lethal reaction triplets
▶ Central idea: We use FBA to compute a flux distribution, corresponding to maximum growth rate, while minimising the sum of absolute values of the fluxes, i.e. the ℓ1-norm of the flux vector: the ‘minimal norm’ solution of the FBA LP problem
Fast-SL: Eliminating Non-Lethal Sets
\[
\begin{aligned}
\max \quad & v_{\mathrm{bio}} && (1) \\
\text{s.t.} \quad & S v = 0 && (2) \\
& LB_j \le v_j \le UB_j \quad \forall j \in J && (3)
\end{aligned}
\]
▶ Identify a flux distribution which obeys the constraints of FBA (2), (3) and also sustains maximum growth (1) (sparse!)
▶ The set of reactions that carry a non-zero flux in this solution is Jnz
▶ How does this help?
Fast-SL Massively Prunes Search Space for Synthetic Lethals
▶ If a reaction j carries zero flux in the minimal norm solution (j ∉ Jnz), which is constrained to support growth, it cannot be lethal: the same flux distribution remains feasible after j is deleted
  ⇒ The set of all single lethals (Jsl) is contained entirely in Jnz
▶ If a pair of reactions i, j both carry zero flux in the minimal norm solution (i, j ∉ Jnz), they cannot be a synthetic lethal pair
  ⇒ There are no synthetic lethal pairs in which both reactions lie outside Jnz
▶ All synthetic lethal pairs lie in the narrow ‘red region’ of J × J (drawn to scale for E. coli)
[Figure: J × J, with each axis partitioned into Jsl, Jnz and J − Jnz, and the ‘red region’ highlighted]
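The pruning argument can be sketched end to end on a toy network. The code below is an illustration under stated assumptions, not the authors' implementation: the network, the helper names (`max_growth`, `nonzero_set`) and the lethality cutoff of 1% of wild-type growth are all hypothetical. It searches for lethal pairs only among reactions that carry flux, re-applying the idea after each first deletion, and compares the result with brute-force enumeration.

```python
from itertools import combinations

import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network (illustrative assumption, not from the talk).
# Reactions: 0 EX_A (-> A), 1 R1 (A -> B), 2 R2 (A -> C),
#            3 R3 (C -> B), 4 BIO (B ->), 5 WASTE (A ->)
S = np.array([
    [1, -1, -1,  0,  0, -1],  # A
    [0,  1,  0,  1, -1,  0],  # B
    [0,  0,  1, -1,  0,  0],  # C
], dtype=float)
BOUNDS = [(0, 10)] + [(0, 1000)] * 5
BIO, N, TOL = 4, 6, 1e-6


def _bounds(deleted):
    """Flux bounds with every deleted reaction clamped to zero."""
    return [(0.0, 0.0) if j in deleted else BOUNDS[j] for j in range(N)]


def max_growth(deleted=()):
    """FBA: maximum biomass flux with the given reactions deleted."""
    c = np.zeros(N)
    c[BIO] = -1.0
    res = linprog(c, A_eq=S, b_eq=np.zeros(3), bounds=_bounds(deleted),
                  method="highs")
    return -res.fun if res.status == 0 else 0.0


def nonzero_set(deleted=()):
    """Reactions carrying flux in an l1-minimal solution at maximum growth."""
    b = _bounds(deleted)
    if BIO not in deleted:
        # pin growth at its maximum (tiny slack for numerical safety)
        b[BIO] = (max(max_growth(deleted) - 1e-8, 0.0), BOUNDS[BIO][1])
    # variables are [v, t] with t_j >= |v_j|; minimise sum(t)
    eye = np.eye(N)
    res = linprog(np.concatenate([np.zeros(N), np.ones(N)]),
                  A_ub=np.block([[eye, -eye], [-eye, -eye]]),
                  b_ub=np.zeros(2 * N),
                  A_eq=np.hstack([S, np.zeros_like(S)]), b_eq=np.zeros(3),
                  bounds=b + [(0, None)] * N, method="highs")
    return {j for j in range(N) if abs(res.x[j]) > TOL}


cutoff = 0.01 * max_growth()  # assumed lethality threshold

# Pruned search: single lethals can only lie inside Jnz
jnz = nonzero_set()
single = {j for j in jnz if max_growth((j,)) < cutoff}

# Pairs: first reaction from Jnz, second from the Jnz of the single deletion
pairs = set()
for i in sorted(jnz - single):
    for j in sorted(nonzero_set((i,)) - single - {i}):
        if max_growth((i, j)) < cutoff:
            pairs.add(tuple(sorted((i, j))))

# Brute force over all pairs of reactions that are not single lethals
rest = [j for j in range(N) if max_growth((j,)) >= cutoff]
pairs_bf = {p for p in combinations(rest, 2) if max_growth(p) < cutoff}

print("Jnz:", sorted(jnz), "single lethals:", sorted(single))
print("pairs (pruned):", sorted(pairs), "pairs (brute force):", sorted(pairs_bf))
```

On this toy network the pruned search tests two candidate pairs where brute force tests six, and both return the same lethal pairs.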
Fast-SL Achieves Massive Speedups
▶ Even in the narrow red region, further gains are made by re-applying the idea
▶ The gains are even more substantial for higher order lethals:
[Figure: J × J, with each axis partitioned into Jsl, Jnz and J − Jnz]

Order       Exhaustive LPs   LPs solved after eliminating non-lethal sets   Reduction in search space
Single      2.05 × 10^3      393                                            ≈ 5-fold
Double      1.57 × 10^6      7,779                                          ≈ 200-fold
Triple      9.27 × 10^8      432,487                                        ≈ 2100-fold
Quadruple   4.10 × 10^11     4.53 × 10^7                                    ≈ 9050-fold
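The fold reductions follow directly from the two LP counts in each row. A short check, with the numbers copied from the table above and illustrative variable names:

```python
# (exhaustive LPs, LPs solved after pruning), copied from the table above
table = {
    "Single": (2.05e3, 393),
    "Double": (1.57e6, 7_779),
    "Triple": (9.27e8, 432_487),
    "Quadruple": (4.10e11, 4.53e7),
}

# Reduction in search space = exhaustive count / pruned count
fold = {order: full / pruned for order, (full, pruned) in table.items()}

for order, f in fold.items():
    print(f"{order}: {f:,.0f}-fold")
```

The ratios come out at roughly 5, 200, 2100 and 9050, matching the last column.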
Fast-SL: Minimum Norm Solution
▶ The smaller the set of non-zero reactions, Jnz, the fewer LPs need to be solved to identify lethal sets
▶ The minimal ℓ0-norm solution of the FBA LP problem is the sparsest solution
▶ However, it requires solving an MILP problem
▶ We use the ℓ1-norm solution instead
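\[
\begin{aligned}
\min \quad & \sum_{j} |v_j| \\
\text{s.t.} \quad & \sum_{j} s_{ij} v_j = 0 && \forall i \in M \\
& LB_j \le v_j \le UB_j && \forall j \in J \\
& v_{\mathrm{bio}} = v_{\mathrm{bio,max}}
\end{aligned}
\]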
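The absolute values mean the problem is not yet in standard LP form. One standard linearisation, sketched here with auxiliary variables \(t_j\) that are an assumption of this note and not part of the talk, bounds each flux from both sides:

\[
\begin{aligned}
\min \quad & \sum_{j} t_j \\
\text{s.t.} \quad & \sum_{j} s_{ij} v_j = 0 && \forall i \in M \\
& LB_j \le v_j \le UB_j && \forall j \in J \\
& -t_j \le v_j \le t_j && \forall j \in J \\
& v_{\mathrm{bio}} = v_{\mathrm{bio,max}}
\end{aligned}
\]

Since the objective pushes every \(t_j\) down onto its bound, \(t_j = |v_j|\) at the optimum, so the two problems have the same minimum and the same optimal fluxes.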
Fast-SL Achieves 4x Speedup over MCSEnumerator
▶ Fast-SL can also be parallelised, leading to further speed-ups
▶ Fast-SL achieves ≈ 4x speed-up over the MCSEnumerator method¹ for the E. coli iAF1260 model for higher order reaction deletions
▶ Results obtained using Fast-SL match precisely with exhaustive enumeration of gene deletions
▶ A similar approach can be used to identify lethal gene sets by incorporating gene–reaction rules

Order of SLs   No. of SLs   CPU time, MCSEnumerator (12 cores)   CPU time, Fast-SL (6 cores)   Speed-up
Single         278          11 s                                 2.8 s                         ≈ 8x
Double         96           39.1 s                               17.2 s                        ≈ 4x
Triple         247          16.8 min                             8.5 min                       ≈ 4x
Quadruple      402          18.5 h                               9.3 h                         ≈ 4x
¹von Kamp A & Klamt S (2014) PLoS Computational Biology 10:e1003378
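The two methods were timed on different numbers of cores, so the raw times are not directly comparable. One reading that reproduces the quoted speed-ups is a comparison of core-seconds (time × cores); this is an assumption made here, not something the slide states. A short check under that assumption:

```python
# ((MCSEnumerator seconds, cores), (Fast-SL seconds, cores)), from the table above
runs = {
    "Single": ((11.0, 12), (2.8, 6)),
    "Double": ((39.1, 12), (17.2, 6)),
    "Triple": ((16.8 * 60, 12), (8.5 * 60, 6)),
    "Quadruple": ((18.5 * 3600, 12), (9.3 * 3600, 6)),
}


def core_seconds(seconds, cores):
    """Total compute spent, assuming all cores are busy for the whole run."""
    return seconds * cores


# Assumed definition: ratio of core-seconds, not of wall-clock times
speedup = {
    order: core_seconds(*mcs) / core_seconds(*fast)
    for order, (mcs, fast) in runs.items()
}

for order, s in speedup.items():
    print(f"{order}: {s:.1f}x")
```

Under this reading the ratios are about 7.9, 4.5, 4.0 and 4.0, in line with the ≈ 8x and ≈ 4x entries in the table.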
Synthetic Lethal Gene Deletions
▶ Most previous algorithms only computed synthetic lethal reaction deletions
  ▶ Not easily modified for computing gene deletions
▶ We extended our algorithm to gene deletions by using the gene–reaction mapping
▶ Fast-SL formulation identified 75 new gene triplets in E. coli that were not identified previously
▶ We have also identified synthetic lethal gene and reaction sets, up to quadruplets, for other pathogenic organisms such as Salmonella Typhimurium, Mycobacterium tuberculosis, Staphylococcus aureus and Neisseria meningitidis
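A gene–reaction mapping can be sketched as Boolean rules: isozymes are joined by "or", subunits of a complex by "and". The rule encoding, the gene and reaction names and the helpers `is_active` and `reactions_lost` below are hypothetical, meant only to show how a set of gene deletions translates into the reaction deletions that FBA then evaluates.

```python
def is_active(rule, deleted_genes):
    """Evaluate a rule: a gene name, or a tuple ("and" | "or", sub-rule, ...)."""
    if isinstance(rule, str):
        return rule not in deleted_genes
    op, *parts = rule
    results = [is_active(part, deleted_genes) for part in parts]
    return all(results) if op == "and" else any(results)


def reactions_lost(gprs, deleted_genes):
    """Reactions whose rule can no longer be satisfied after the gene deletions."""
    return {rxn for rxn, rule in gprs.items()
            if not is_active(rule, deleted_genes)}


# Hypothetical gene-reaction rules
GPRS = {
    "R1": ("or", "g1", "g2"),                  # isozymes: either gene suffices
    "R2": ("and", "g3", "g4"),                 # complex: both subunits needed
    "R3": "g1",                                # single gene
    "R4": ("or", "g5", ("and", "g1", "g6")),   # nested rule
}

print(reactions_lost(GPRS, {"g1"}))
print(reactions_lost(GPRS, {"g1", "g2"}))
print(reactions_lost(GPRS, {"g3"}))
```

A single gene deletion can remove several reactions at once, and a reaction with isozymes survives until all of its genes are deleted, which is why gene-level lethal sets differ from reaction-level ones.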
Missing Biomass Precursors in E. coli
▶ Gene/reaction lethality results from the organism’s inability to produce one or more of the biomass precursors
▶ Most triple and quadruple gene deletions affect mechanisms involved in ATP production
[Figure: bar chart of missing biomass precursors; vertical axis from 0% to 50%]
Reiterates critical role played by co-factors and ATP in cellular metabolism!
Synthetic Lethals Illustrate Complex Metabolic Dependencies
▶ atpB, cydA, gap
  ▶ ATP synthase, cytochrome D ubiquinol oxidase and glyceraldehyde 3-phosphate dehydrogenase
  ▶ Perhaps bring about their effect by disabling both substrate-level and oxidative phosphorylation
▶ eno, pps, sdhA/B/C
  ▶ Enolase, PEP synthase and succinate dehydrogenase subunits
  ▶ Seem to bring about their effect by affecting production of phosphoenolpyruvate and consequently disabling OXPHOS
Combinatorial Drug Targets
▶ Only a few combinatorial deletions abolish growth in silico
  ▶ Re-emphasises the robust nature of the metabolic networks in both M. tuberculosis and S. Typhimurium
▶ 28 triplets and 20 doublets in M. tuberculosis have no homologues in human
▶ 21 triplets and 39 doublets in S. Typhimurium have no homologues
▶ Some of these may be interesting drug targets
Limitations
▶ Metabolic models considered here do not account for regulation or other functions of proteins
  ▶ The method can identify synthetic lethals only in metabolism
▶ Any inadequacies/gaps in the metabolic model will affect the results, e.g. some isozymes may not have been characterised yet
  ▶ Lethality results can be useful to refine the metabolic model
Summary
▶ Synthetic lethals are difficult to identify computationally: combinatorial explosion of possibilities
▶ Previous approaches have used FBA to exhaustively search the entire space, or pose the problem as a bi-level MILP
▶ Our algorithm, Fast-SL, circumvents the complexities of previous approaches, through a massive reduction of search space, exploiting the minimal norm solution of FBA
▶ For E. coli, the reduction in search space is ≈ 4000-fold for synthetic lethal triplets!
▶ Ours is also the first method that systematically evaluates gene deletions
▶ Our results agree exactly with exhaustive enumeration
▶ Fast-SL finds application in identifying functional associations and combinatorial drug targets
Acknowledgments
▶ Aditya Pratapa
▶ Dr. Shankar Balachandran
▶ High Performance Computing Facility, IIT Madras
▶ Funding: Department of Biotechnology, Government of India; IIT Madras; nVidia