Fast-SL: An efficient algorithm to identify synthetic lethals in metabolic networks

Karthik Raman
Department of Biotechnology
Indian Institute of Technology Madras
https://home.iitm.ac.in/kraman/lab/

2015 NNMCB National Meeting
December 27, 2015


Introduction Fast-SL Results Conclusions

Genome-Scale Metabolic Networks (GSMNs)

▶ GSMNs account for the functions of all the known metabolic genes in an organism

▶ Constructed primarily from the genome sequence with annotations from enzyme and pathway databases

▶ 100+ GSMNs are presently available

Karthik Raman Fast-SL: An efficient algorithm to identify synthetic lethals in metabolic networks 1 / 24


What can GSMNs tell us?
McCloskey D et al (2013) Molecular Systems Biology 9:661

[Figure: uses of the E. coli reconstruction across 248 total studies: metabolic engineering (68 studies, 27.4%), studies of evolutionary processes (19 studies, 7.7%), model-driven discovery (18 studies, 7.3%), interspecies interaction (7 studies, 2.8%), and prediction of cellular phenotypes and analysis of biological network properties (64 and 72 studies; 25.8% and 29.0%)]


What can GSMNs tell us?

▶ Predict potential drug targets, by identifying essential and synthetic lethal genes

[Figure: title page of the Editor's Choice paper "Identification of potential drug targets in Salmonella enterica sv. Typhimurium using metabolic modelling and experimental validation" by Hassan B. Hartman, David A. Fell, Sergio Rossell, Peter Ruhdal Jensen, Martin J. Woodward, Lotte Thorndahl, Lotte Jelsbak, John Elmerdahl Olsen, Anu Raghunathan, Simon Daefler and Mark G. Poolman]


What are Synthetic Lethals?

Synthetic lethal gene (or reaction) sets are sets of genes where only the simultaneous removal of all genes in the set abolishes growth:

[Figure: four genotypes built from Gene abc and Gene pqr: wild type, Δabc, Δpqr, and the double deletion ΔabcΔpqr]

The concept of synthetic lethality can be extended to higher orders, e.g. triplets


Why Identify Synthetic Lethals?

▶ Synthetic lethals find applications in
  ▶ Understanding gene function and functional associations¹
  ▶ Combinatorial drug targets against pathogens²
  ▶ Cancer therapy³

¹Ooi SLL et al (2006) Trends Genet 22:56–63
²Hsu KC et al (2013) PLoS Comput Biol 9:e1003127
³Kaelin WG (2005) Nat Rev Cancer 5:689–698


How to Identify Synthetic Lethals?

▶ Yeast synthetic lethals have been identified experimentally using yeast synthetic genetic arrays¹,²

▶ Previous in silico approaches have built on the framework of Flux Balance Analysis, and are therefore restricted to metabolic genes

¹Tong AHY et al (2001) Science 294:2364–2368
²Tong AHY et al (2004) Science 303:808–813


What is Flux Balance Analysis?

▶ Effective constraint-based method to study genome-scale metabolic networks¹

▶ The mass balance constraints in a system of reactions can be represented by a system of linear equations involving reaction fluxes at steady state

▶ The system is under-determined, so we compute the flux distribution that maximises biomass: mathematically, this is a linear programming problem

$$
\begin{aligned}
\max\ & v_{\mathrm{bio}} && \text{(the biomass flux)} \\
\text{s.t.}\ & \sum_j s_{ij} v_j = 0 && \forall i \in M \ \text{(set of metabolites)} \\
& LB_j \le v_j \le UB_j && \forall j \in J \ \text{(set of reactions)}
\end{aligned}
$$

¹Varma A & Palsson BO (1994) Applied and Environmental Microbiology 60:3724–3731
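The LP above can be tried on a small example. The sketch below is not from the talk: the five-reaction network, its bounds and the names `S`, `LB`, `UB`, `BIO` and `fba` are all hypothetical, and SciPy's `linprog` stands in for whatever LP solver is actually used.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network (an assumption for illustration only).
# Rows: metabolites A, B, C. Columns: reactions R0..R4.
#   R0: -> A (uptake)   R1: A -> B   R2: A -> C   R3: C -> B   R4: B -> (biomass)
S = np.array([
    [1.0, -1.0, -1.0,  0.0,  0.0],  # A
    [0.0,  1.0,  0.0,  1.0, -1.0],  # B
    [0.0,  0.0,  1.0, -1.0,  0.0],  # C
])
LB = np.zeros(5)                                       # every reaction irreversible
UB = np.array([10.0, 1000.0, 1000.0, 1000.0, 1000.0])  # uptake capped at 10
BIO = 4                                                # index of the biomass reaction


def fba(S, lb, ub, bio):
    """Maximise v[bio] subject to S v = 0 and lb <= v <= ub."""
    c = np.zeros(S.shape[1])
    c[bio] = -1.0  # linprog minimises, so minimise -v_bio
    res = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]),
                  bounds=list(zip(lb, ub)), method="highs")
    if res.status != 0:
        raise RuntimeError(res.message)
    return -res.fun, res.x


v_bio_max, v = fba(S, LB, UB, BIO)
print(f"maximum biomass flux: {v_bio_max:.3f}")
print("steady state holds:", np.allclose(S @ v, 0.0, atol=1e-6))
```

In this toy network the biomass flux is limited by the uptake cap, and the split between the direct route (R1) and the detour (R2, R3) is not unique, which is the under-determination the slide refers to.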


Geometrical interpretation of FBA
Orth JD et al (2010) Nature Biotechnology 28:245–248

[Figure: in flux coordinates (v1, v2, v3), the unconstrained solution space is reduced by the constraints (1) Sv = 0 and (2) a_i < v_i < b_i to an allowable solution space; optimisation (maximise Z) then selects an optimal solution within it]


Flux Balance Analysis

▶ FBA has been shown to predict phenotypes accurately following various genetic perturbations¹,²

▶ To delete a set of reactions D ⊂ J, set v_d = 0 for every d ∈ D and repeat the simulation:

$$
\begin{aligned}
\max\ & v_{\mathrm{bio}} \\
\text{s.t.}\ & \sum_j s_{ij} v_j = 0 && \forall i \in M \\
& LB_j \le v_j \le UB_j && \forall j \in J \\
& v_d = 0 && \forall d \in D \subset J
\end{aligned}
$$

▶ FBA can also reliably predict synthetic lethal genes in metabolic networks of organisms such as yeast³

¹Edwards JS & Palsson BO (2000) BMC Bioinformatics 1:1
²Famili I et al (2003) PNAS 100:13134–13139
³Harrison R et al (2007) PNAS 104:2307–2312
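A deletion amounts to pinning both bounds of the deleted reactions to zero before re-solving. The sketch below illustrates this on a hypothetical five-reaction network; the network and the helper `max_growth` are assumptions made for illustration, not part of the talk.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network: R0: -> A, R1: A -> B, R2: A -> C, R3: C -> B, R4: B -> biomass
S = np.array([
    [1.0, -1.0, -1.0,  0.0,  0.0],  # A
    [0.0,  1.0,  0.0,  1.0, -1.0],  # B
    [0.0,  0.0,  1.0, -1.0,  0.0],  # C
])
LB = np.zeros(5)
UB = np.array([10.0, 1000.0, 1000.0, 1000.0, 1000.0])
BIO = 4


def max_growth(deleted=()):
    """Maximum biomass flux after forcing v_d = 0 for every d in `deleted`."""
    lb, ub = LB.copy(), UB.copy()
    for d in deleted:
        lb[d] = ub[d] = 0.0          # the extra constraint v_d = 0
    c = np.zeros(S.shape[1])
    c[BIO] = -1.0                    # linprog minimises
    res = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]),
                  bounds=list(zip(lb, ub)), method="highs")
    if res.status != 0:
        raise RuntimeError(res.message)
    return abs(res.fun)              # abs() also tidies a possible -0.0


wild_type = max_growth()
single = {j: max_growth((j,)) for j in range(S.shape[1])}
pair_1_2 = max_growth((1, 2))

for j, growth in single.items():
    print(f"delete R{j}: growth = {growth:.1f}")
print(f"delete R1 and R2: growth = {pair_1_2:.1f}")
```

Here R1 and R2 are individually dispensable because each route from A to B can replace the other, but deleting both leaves no route to biomass, so the pair behaves as a synthetic lethal in this toy setting.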


Identifying Synthetic Lethals: Brute Force/Exhaustive Enumeration

▶ Single lethals are easier to identify
  ▶ Solve one optimisation problem for each gene deletion (genotype)

▶ Synthetic lethals are more difficult to identify
  ▶ Combinatorial explosion
  ▶ e.g. $\binom{1000}{3} \approx 170$ million simulations!

▶ Quickly becomes infeasible for larger organisms …
  ▶ However, simulations are independent and can be easily parallelised on a computer cluster¹

¹Deutscher D et al (2006) Nature Genetics 38:993–8
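The size of the exhaustive search is a binomial coefficient, which is easy to check. The snippet below uses the slide's illustrative figure of 1000 reactions; the variable names are ad hoc.

```python
import math

n_reactions = 1000  # illustrative network size, as on the slide

# Number of deletion sets of each order that brute force would have to simulate.
counts = {order: math.comb(n_reactions, order) for order in (1, 2, 3, 4)}
for order, count in counts.items():
    print(f"order {order}: {count:,} simulations")

triples = counts[3]
```

The triple count comes to 166,167,000, which the slide rounds to about 170 million; each further order multiplies the count by a factor of a few hundred.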


Identifying Synthetic Lethals: Bi-Level Mixed Integer Linear Programming Problem

▶ SL-Finder¹ poses the synthetic lethal identification problem elegantly as a bi-level MILP

▶ Synthetic lethal double and triple reaction deletions have been reported for E. coli

▶ However, the MILP problems become increasingly difficult to solve
  ▶ Time taken, on a workstation, was ≈ 6.75 days, for the E. coli iAF1260 model
  ▶ MCSEnumerator is another MILP-based method, which runs even faster²

¹Suthers PF et al (2009) Molecular Systems Biology 5:301
²von Kamp A & Klamt S (2014) PLoS Computational Biology 10:e1003378

Is there a way to surmount the complexity of exhaustive enumeration and bi-level MILP?

An Alternate Approach: Fast-SL
Pratapa A et al (2015) Bioinformatics 31:3299–3305

▶ Heavily prunes search space for synthetic lethals, and
▶ Exhaustively iterates through remaining (much fewer) combinations
▶ We successively compute:
  ▶ Jsl, the set of single lethal reactions,
  ▶ Jdl ⊂ J × J, the set of synthetic lethal reaction pairs, and
  ▶ Jtl ⊂ J³, the set of synthetic lethal reaction triplets

▶ Central idea: We use FBA to compute a flux distribution, corresponding to maximum growth rate, while minimising the sum of absolute values of the fluxes, i.e. the ℓ1-norm of the flux vector: the ‘minimal norm’ solution of the FBA LP problem


Fast-SL: Eliminating Non-Lethal Sets

$$
\begin{aligned}
\max\ & v_{\mathrm{bio}} && (1) \\
\text{s.t.}\ & S v = 0 && (2) \\
& LB_j \le v_j \le UB_j \quad \forall j \in J && (3)
\end{aligned}
$$

▶ Identify a flux distribution which obeys the constraints of FBA (2), (3) and also sustains maximum growth (1) (sparse!)
▶ The set of reactions that carry a non-zero flux in this solution is Jnz
▶ How does this help?



Fast-SL Massively Prunes Search Space for Synthetic Lethals

▶ If a reaction j carries zero flux in the minimal norm solution (j ∉ Jnz), which is constrained to support growth, it cannot be lethal: the same flux distribution remains feasible, and still supports growth, after j is removed

⇒ The set of all single lethals (Jsl) is contained entirely in Jnz

▶ If a pair of reactions i, j carry zero flux in the minimal norm solution (i, j ∉ Jnz), they cannot be a synthetic lethal pair

⇒ There are no synthetic lethal pairs in which both reactions lie outside Jnz

▶ All synthetic lethal pairs lie in the narrow ‘red region’ of J × J (drawn to scale for E. coli)

[Figure: the J × J grid split into Jnz and J − Jnz, with Jsl inside Jnz; the ‘red region’ marks the pairs that still need to be tested]
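The pruning argument can be sketched on a toy network. This is an illustration of the idea only, not the authors' MATLAB implementation: the network, the 1% growth cutoff, the zero-flux tolerance and all function names are assumptions, and because every toy reaction is irreversible the ℓ1-norm reduces to a plain sum of fluxes.

```python
from itertools import combinations

import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network: R0: -> A, R1: A -> B, R2: A -> C, R3: C -> B, R4: B -> biomass
S = np.array([
    [1.0, -1.0, -1.0,  0.0,  0.0],
    [0.0,  1.0,  0.0,  1.0, -1.0],
    [0.0,  0.0,  1.0, -1.0,  0.0],
])
LB = np.zeros(5)
UB = np.array([10.0, 1000.0, 1000.0, 1000.0, 1000.0])
BIO, N = 4, 5
TOL = 1e-6      # assumed threshold below which a flux counts as zero
CUTOFF = 0.01   # assumed: lethal if growth < 1% of wild type


def solve(c, deleted, bio_floor=0.0):
    lb, ub = LB.copy(), UB.copy()
    for d in deleted:
        lb[d] = ub[d] = 0.0
    lb[BIO] = max(lb[BIO], bio_floor)
    res = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]),
                  bounds=list(zip(lb, ub)), method="highs")
    if res.status != 0:
        raise RuntimeError(res.message)
    return res


def max_growth(deleted=()):
    c = np.zeros(N)
    c[BIO] = -1.0
    return abs(solve(c, deleted).fun)


def nonzero_set(deleted=()):
    """Reactions carrying flux in a minimal-norm solution at (near) maximum growth."""
    floor = max_growth(deleted) * (1 - 1e-9)   # tiny slack guards against round-off
    v = solve(np.ones(N), deleted, bio_floor=floor).x  # sum of fluxes = l1-norm here
    return {j for j in range(N) if abs(v[j]) > TOL}


wild_type = max_growth()
lethal = lambda deleted: max_growth(deleted) < CUTOFF * wild_type

J_nz = nonzero_set()
J_sl = {j for j in J_nz if lethal((j,))}        # singles: only J_nz is tested

J_dl, lps_pruned = set(), 0
for i in sorted(J_nz - J_sl):
    J_nz_i = nonzero_set((i,))                  # re-apply the idea with i deleted
    for j in sorted(J_nz_i - J_sl - {i}):
        lps_pruned += 1
        if lethal((i, j)):
            J_dl.add(frozenset((i, j)))

# Exhaustive reference for comparison
all_sl = {j for j in range(N) if lethal((j,))}
rest = [j for j in range(N) if j not in all_sl]
exhaustive_dl = {frozenset(p) for p in combinations(rest, 2) if lethal(p)}

print("J_nz:", sorted(J_nz), "J_sl:", sorted(J_sl))
print("pairs:", sorted(tuple(sorted(p)) for p in J_dl), "| pair LPs tested:", lps_pruned)
```

On this toy network the pruned search tests two pairs instead of three and recovers the same lethal sets as the exhaustive scan; the saving is tiny here and only becomes dramatic at genome scale.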


Fast-SL Achieves Massive Speedups

▶ Even in the narrow red region, further gains are made by re-applying the idea

▶ The gains are even more substantial for higher order lethals:

Order        Exhaustive LPs    LPs solved after eliminating    Reduction in
                               non-lethal sets                 search space
Single       2.05 × 10³        393                             ≈ 5 fold
Double       1.57 × 10⁶        7,779                           ≈ 200 fold
Triple       9.27 × 10⁸        432,487                         ≈ 2100 fold
Quadruple    4.10 × 10¹¹       4.53 × 10⁷                      ≈ 9050 fold
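The reduction column is simply the ratio of the two LP counts, which can be recomputed from the figures in the table. The dictionary layout below is ad hoc.

```python
# (exhaustive LPs, LPs solved after eliminating non-lethal sets), copied from the table
lp_counts = {
    "Single":    (2.05e3, 393),
    "Double":    (1.57e6, 7_779),
    "Triple":    (9.27e8, 432_487),
    "Quadruple": (4.10e11, 4.53e7),
}

fold = {order: exhaustive / pruned for order, (exhaustive, pruned) in lp_counts.items()}
for order, ratio in fold.items():
    print(f"{order:<10} {ratio:10.1f}-fold fewer LPs")
```

The ratios come out at roughly 5, 200, 2100 and 9050, matching the rounded values in the table.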


Fast-SL: Minimum Norm Solution

▶ The smaller the set of non-zero reactions, Jnz, the fewer LPs need to be solved to identify lethal sets

▶ The minimum ℓ0-norm solution of the FBA LP problem is the sparsest solution

▶ However, it requires solving an MILP problem

▶ We use the ℓ1-norm solution instead

$$
\begin{aligned}
\min\ & \sum_j |v_j| \\
\text{s.t.}\ & \sum_j s_{ij} v_j = 0 && \forall i \in M \\
& LB_j \le v_j \le UB_j && \forall j \in J \\
& v_{\mathrm{bio}} = v_{\mathrm{bio,max}}
\end{aligned}
$$
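The absolute values can be handled inside an ordinary LP by adding one auxiliary variable t_j ≥ |v_j| per reaction and minimising Σ t_j. The sketch below does this on a hypothetical network that contains a reversible reaction and a futile loop; the network, the names and the tiny slack on the growth constraint are assumptions for illustration.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network with one reversible reaction:
#   R0: -> A   R1: B <-> A (negative flux means A -> B)   R2: A -> C   R3: C -> B   R4: B -> biomass
S = np.array([
    [1.0,  1.0, -1.0,  0.0,  0.0],  # A
    [0.0, -1.0,  0.0,  1.0, -1.0],  # B
    [0.0,  0.0,  1.0, -1.0,  0.0],  # C
])
LB = np.array([0.0, -1000.0, 0.0, 0.0, 0.0])
UB = np.array([10.0, 1000.0, 1000.0, 1000.0, 1000.0])
BIO = 4
n = S.shape[1]

# Step 1: ordinary FBA to obtain v_bio,max
c_fba = np.zeros(n)
c_fba[BIO] = -1.0
fba = linprog(c_fba, A_eq=S, b_eq=np.zeros(S.shape[0]),
              bounds=list(zip(LB, UB)), method="highs")
v_bio_max = -fba.fun

# Step 2: minimise sum(t) over x = [v, t] with -t <= v <= t
I = np.eye(n)
c = np.concatenate([np.zeros(n), np.ones(n)])
A_ub = np.block([[I, -I], [-I, -I]])          # v - t <= 0 and -v - t <= 0
b_ub = np.zeros(2 * n)
A_eq = np.hstack([S, np.zeros_like(S)])       # S v = 0; t does not enter
b_eq = np.zeros(S.shape[0])
bounds = [(LB[j], UB[j]) for j in range(n)] + [(0.0, None)] * n
# Hold growth at its maximum; the tiny slack only guards against round-off.
bounds[BIO] = (v_bio_max * (1 - 1e-9), UB[BIO])

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
              bounds=bounds, method="highs")
v_min = res.x[:n]
J_nz = {j for j in range(n) if abs(v_min[j]) > 1e-6}

print("minimal-norm fluxes:", np.round(v_min, 3))
print("J_nz:", sorted(J_nz))
```

The plain FBA solution may route arbitrary flux around the R1, R2, R3 loop; the ℓ1 objective removes that loop and keeps only the direct route, leaving three reactions with non-zero flux in this toy case.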


Fast-SL Achieves 4x Speedup over MCSEnumerator

▶ Fast-SL can also be parallelised, leading to further speed-ups
▶ Fast-SL achieves ≈ 4x speed-up over the MCSEnumerator method¹ for the E. coli iAF1260 model for higher order reaction deletions
▶ Results obtained using Fast-SL match precisely with exhaustive enumeration of gene deletions
▶ Similar approach can be used to identify lethal gene sets by incorporating gene–reaction rules

Order of SLs    No. of SLs    CPU time, MCSEnumerator    CPU time, Fast-SL    Speed-up
                              (using 12 cores)           (using 6 cores)
Single          278           11 s                       2.8 s                ≈ 8x
Double          96            39.1 s                     17.2 s               ≈ 4x
Triple          247           16.8 min                   8.5 min              ≈ 4x
Quadruple       402           18.5 h                     9.3 h                ≈ 4x

¹von Kamp A & Klamt S (2014) PLoS Computational Biology 10:e1003378
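The Speed-up column is larger than the plain ratio of the two time columns. One reading that reproduces most of it, offered here as an assumption and not something the slide states, is that the times are scaled by the number of cores used (12 versus 6) before being compared.

```python
# Times from the table, converted to seconds
times = {
    "Single":    (11.0, 2.8),
    "Double":    (39.1, 17.2),
    "Triple":    (16.8 * 60, 8.5 * 60),
    "Quadruple": (18.5 * 3600, 9.3 * 3600),
}
MCS_CORES, FASTSL_CORES = 12, 6   # core counts given in the column headers

raw = {k: mcs / fast for k, (mcs, fast) in times.items()}
# Assumed normalisation: compare time x cores (core-seconds) for each method
per_core = {k: (mcs * MCS_CORES) / (fast * FASTSL_CORES) for k, (mcs, fast) in times.items()}

for k in times:
    print(f"{k:<10} raw {raw[k]:.2f}x   core-normalised {per_core[k]:.2f}x")
```

Under this reading the single, triple and quadruple rows come out close to the quoted 8x, 4x and 4x, while the double row comes out nearer 4.5x.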


Synthetic Lethal Gene Deletions

▶ Most previous algorithms only computed synthetic lethal reaction deletions
  ▶ Not easily modified for computing gene deletions
▶ We extended our algorithm to gene deletions by using the gene–reaction mapping
▶ Fast-SL formulation identified 75 new gene triplets in E. coli that were not identified previously
▶ We have also identified synthetic lethal gene and reaction sets, up to quadruplets, for other pathogenic organisms such as Salmonella Typhimurium, Mycobacterium tuberculosis, Staphylococcus aureus and Neisseria meningitidis
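Moving from reactions to genes requires evaluating each reaction's gene–reaction rule under a given set of gene deletions; reactions whose rule evaluates to false are then switched off in the FBA problem. The sketch below shows one simple way to do that evaluation. The rules, gene names and nested-tuple encoding are hypothetical and do not reflect how the Fast-SL code stores them.

```python
# Hypothetical gene-reaction rules, written as nested ("or" | "and", operand, ...) tuples.
RULES = {
    "R1": ("or", "g1", "g2"),                  # isozymes: either gene suffices
    "R2": ("and", "g3", "g4"),                 # enzyme complex: both genes required
    "R3": ("or", "g5", ("and", "g1", "g6")),   # mixed rule
}


def is_active(rule, deleted):
    """True if the rule can still be satisfied once `deleted` genes are removed."""
    if isinstance(rule, str):                  # a single gene
        return rule not in deleted
    op, *operands = rule
    results = [is_active(operand, deleted) for operand in operands]
    return any(results) if op == "or" else all(results)


def blocked_reactions(deleted_genes):
    """Reactions whose flux would be fixed to zero for this gene deletion set."""
    deleted = set(deleted_genes)
    return sorted(r for r, rule in RULES.items() if not is_active(rule, deleted))


print(blocked_reactions({"g1"}))          # isozyme g2 keeps R1 active
print(blocked_reactions({"g1", "g2"}))    # both isozymes gone
print(blocked_reactions({"g3"}))          # one subunit of a complex is enough
print(blocked_reactions({"g1", "g5"}))    # affects R3 but not R1
```

This also shows why gene and reaction lethality differ: a single lethal reaction catalysed by two isozymes appears as a synthetic lethal gene pair, while one gene shared by several reactions can block all of them at once.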


Missing Biomass Precursors in E. coli

▶ Gene/reaction lethality is a result of the organism’s inability to produce one or more of the biomass precursors

▶ Most triple and quadruple gene deletions affect mechanisms involved in ATP production

[Figure: bar chart of missing biomass precursors; percentage axis from 0% to 50%]

Reiterates critical role played by co-factors and ATP in cellular metabolism!


Synthetic Lethals Illustrate Complex Metabolic Dependencies

▶ atpB, cydA, gap
  ▶ ATP synthase, cytochrome D ubiquinol oxidase and glyceraldehyde 3-phosphate dehydrogenase
  ▶ Perhaps bring about their effect by disabling both substrate-level and oxidative phosphorylation

▶ eno, pps, sdhA/B/C
  ▶ Enolase, PEP synthase and succinate dehydrogenase subunits
  ▶ Seem to bring about their effect by affecting production of phosphoenolpyruvate and consequently disabling OXPHOS


Combinatorial Drug Targets

▶ Only a few combinatorial deletions abolish growth in silico
  ▶ Re-emphasises the robust nature of the metabolic networks in both M. tuberculosis and S. Typhimurium
▶ 28 triplets and 20 doublets in M. tuberculosis have no homologues in human
▶ 21 triplets and 39 doublets in S. Typhimurium have no homologues in human
▶ Some of these may be interesting drug targets


Limitations

▶ Metabolic models considered here do not account for regulation or other functions of proteins
  ▶ The method can identify synthetic lethals only in metabolism
▶ Any inadequacies/gaps in the metabolic model will affect the results, e.g. some isozymes may not have been characterised yet
  ▶ Lethality results can be useful to refine the metabolic model


Summary

▶ Synthetic lethals are difficult to identify computationally: combinatorial explosion of possibilities

▶ Previous approaches have used FBA to exhaustively search the entire space, or pose the problem as a bi-level MILP

▶ Our algorithm, Fast-SL, circumvents the complexities of previous approaches, through a massive reduction of search space, exploiting the minimal norm solution of FBA

▶ For E. coli, the reduction in search space is ≈ 4000-fold for synthetic lethal triplets!

▶ Ours is also the first method that systematically evaluates gene deletions

▶ Our results agree exactly with exhaustive enumeration

▶ Fast-SL finds application in identifying functional associations and combinatorial drug targets


Acknowledgments

▶ Aditya Pratapa
▶ Dr. Shankar Balachandran
▶ High Performance Computing Facility IIT Madras
▶ Funding: Department of Biotechnology, Government of India; IIT Madras; nVidia


Thank you!

MATLAB implementation of Fast-SL is available for download from:

https://github.com/RamanLab/FastSL
