protein targeting by functional linkage of non-homologous proteins with examples from m....

Upload: kelly-elfreda-lester

Post on 18-Dec-2015

Page 1: Protein Targeting by Functional Linkage of Non-Homologous Proteins with examples from M. tuberculosis Genome-wide functional linkage map Structural Genomics

Protein Targeting by Functional Linkage of Non-Homologous Proteins

with examples from M. tuberculosis

[Figure: genome-wide functional linkage map; axes TB Gene A and TB Gene B, each 0 to 4000]

Structural Genomics of Complexes:

Identifying subunits of complexes by analyzing co-evolution of non-homologous proteins, from genome-wide functional linkage maps

Page 3

Limitations of Relying Entirely on Homology-Based Targeting

• Many (most?) proteins function in complexes made up of non-homologous proteins

• Some (many?) proteins are crystallizable only with their functional partners

This suggests that targeting non-homologous, functionally linked proteins may offer a useful shortcut to learning protein structures and functions

Page 4

Identifying Subunits of Protein Complexes by Analyzing the Co-evolution of Non-homologous Proteins

Structural Genomics of Protein Complexes

Page 5

4 Methods to Infer Non-Homologous Protein Pairs that have Co-evolved and hence are Functionally Linked

• Rosetta Stone: protein fusion

• Phylogenetic Profile: protein co-occurrence

• Gene neighbor: constant separation

• Operon: small separation

[Diagram labels: A, A′, B, B′]
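The phylogenetic profile idea (protein co-occurrence) can be sketched in a few lines. This is only an illustration: the protein names, genome names, and the helper `shared_genomes` are hypothetical and do not come from the talk.

```python
# Hypothetical phylogenetic profiles: protein -> set of genomes with a homolog.
profiles = {
    "protA": {"g1", "g2", "g4", "g5"},
    "protB": {"g1", "g2", "g4", "g5"},
    "protC": {"g3"},
}


def shared_genomes(a, b, profiles):
    """Return the number of genomes in which both proteins have a homolog."""
    return len(profiles[a] & profiles[b])


# protA and protB always occur together, so they are candidates for a
# functional linkage; protA and protC never co-occur.
print(shared_genomes("protA", "protB", profiles))
print(shared_genomes("protA", "protC", profiles))
```

A count of shared genomes by itself says little; it becomes informative once it is compared with what chance alone would produce for proteins with that many homologs.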

Page 6

Figure 7. M. Strong, T. Graeber et al.

Page 7

Whole Genome Functional Linkage Map (RS, PP, GN, OP methods for TB)

[Figure: linkage map; axes TB Gene A and TB Gene B, each 0 to 4000]

Classical graphical representation of protein functional linkages

Research of Michael Strong and Morgan Beeby

Requiring 2 or more functional linkages: 1,865 genes make 9,766 linkages
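One reading of "requiring 2 or more functional linkages" is that a gene pair is kept only when at least two of the four methods support it. A minimal sketch under that assumption follows; the function name `confident_linkages`, the input format, and the gene names are hypothetical.

```python
from collections import defaultdict


def confident_linkages(linkages, min_methods=2):
    """Keep gene pairs supported by at least `min_methods` distinct methods.

    `linkages` is an iterable of (gene_a, gene_b, method) tuples.
    """
    support = defaultdict(set)
    for gene_a, gene_b, method in linkages:
        pair = tuple(sorted((gene_a, gene_b)))  # treat A-B and B-A as one pair
        support[pair].add(method)
    return {
        pair: methods
        for pair, methods in support.items()
        if len(methods) >= min_methods
    }


# Hypothetical predictions from the RS, PP and GN methods.
example = [
    ("geneX", "geneY", "RS"),
    ("geneY", "geneX", "PP"),
    ("geneX", "geneZ", "GN"),
]
kept = confident_linkages(example)
print(kept)
```

Only the geneX/geneY pair survives here, because it is the only pair predicted by two different methods.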

Functional Linkages Between Genes of M. tuberculosis

Page 8

Hierarchical Clustering of the Combined Genome-Wide Linkage Map for M. Tb. Reveals Complexes and Pathways

[Figure: two linkage maps; axes TB Gene A and TB Gene B, each 0 to 5000]

Genome-wide functional linkage map based on 4 methods:

Clustered linkage map showing complexes and pathways:

Cluster similar linkage patterns

Each cluster is a complex or pathway
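The idea of grouping genes with similar linkage patterns can be sketched as follows. The talk does not state which similarity measure or linkage rule was used, so this example swaps in a simple single-linkage agglomerative pass with a Jaccard similarity cutoff; the function names, the threshold, and the toy genes are all hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets (1.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def cluster_by_linkage_pattern(partners, threshold=0.5):
    """Single-linkage clustering of genes by linkage-pattern similarity.

    `partners` maps gene -> set of genes it is linked to. A gene's pattern
    includes the gene itself so that mutually linked genes look alike.
    """
    patterns = {g: set(p) | {g} for g, p in partners.items()}
    clusters = [[g] for g in sorted(patterns)]
    merged = True
    while merged:
        merged = False
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                similar = any(
                    jaccard(patterns[a], patterns[b]) >= threshold
                    for a in clusters[i]
                    for b in clusters[j]
                )
                if similar:
                    # Merge cluster j into cluster i and restart the scan.
                    clusters[i] = sorted(clusters[i] + clusters[j])
                    del clusters[j]
                    merged = True
                    break
            if merged:
                break
    return clusters


# Toy linkage map with two groups that are linked only internally.
toy = {
    "a1": {"a2", "a3"},
    "a2": {"a1", "a3"},
    "a3": {"a1", "a2"},
    "b1": {"b2"},
    "b2": {"b1"},
}
print(cluster_by_linkage_pattern(toy))
```

On the toy input the two groups separate cleanly, which is the behavior the slide describes: each cluster would then be inspected as a candidate complex or pathway.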

Page 9

Detoxification
Polyketide and non-ribosomal peptide synthesis
Energy Metabolism, oxidoreductase
Deg. of Fatty Acids
Virulence
Energy Metabolism, oxidoreductase
Amino acid Biosynthesis
Energy Metab. Respiration Aerobic
Lipid Biosynthesis
Degradation of Fatty Acids
Amino Acid Biosynthesis (Branched)
Synthesis and Modif. of Macromolecules, rpl, rpm, rps
Biosynthesis of Cofactors, Prosthetic groups
Purine, Pyrimidine nucleotide biosynthesis
Novel Group
Sugar Metabolism
Aromatic Amino Acid Biosynthesis
Energy Metabolism, Anaerobic Respiration
Two component systems
Cell Envelope
Cytochrome P450
Chaperones
Biosynthesis of cofactors
Cell Envelope, Cell Division
Transport/Binding Proteins
Energy Metabolism TCA
Broad Regulatory, Serine Threonine Protein Kinase
Cell Envelope, Murein Sacculus and Peptidoglycan
Transport/Binding Proteins Cations
Energy Metabolism, ATP Proton Motive force

Fig 4. M. Strong, T. Graeber et al.

Page 10

Quantitative Assessment of Inferred Protein Complexes

Page 11

Calculating Probabilities of Co-evolution

Phylogenetic Profile, Rosetta Stone:

$$P(k \mid n, m, N) = \frac{\binom{n}{k}\binom{N-n}{m-k}}{\binom{N}{m}}$$

N = number of fully sequenced genomes
n = number of homologs of protein A
m = number of homologs of protein B
k = number of genomes shared in common

Gene Neighbor:

$$P_m(\le X) = 1 - P_m(> X) = X \sum_{k=0}^{m-1} \frac{(-\ln X)^k}{k!}$$

X = fractional separation of genes

Operon:

$$P(n) = 1 - e^{-n}$$

n = intergenic separation
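Two of these probabilities can be evaluated directly. The sketch below assumes that the phylogenetic profile and Rosetta Stone formula is the hypergeometric distribution over N, n, m and k as defined on this slide, and that the gene neighbor formula is the cumulative distribution of a product of m independent, uniformly distributed fractional separations. The function names are hypothetical, and the meaning of m in the gene neighbor case (the number of genomes contributing a separation) is an assumption, since the slide does not define it for that method.

```python
from math import comb, factorial, log


def phylo_profile_prob(k, n, m, N):
    """Hypergeometric probability of exactly k shared genomes by chance."""
    return comb(n, k) * comb(N - n, m - k) / comb(N, m)


def phylo_profile_pvalue(k, n, m, N):
    """Chance probability of k or more shared genomes."""
    return sum(phylo_profile_prob(j, n, m, N) for j in range(k, min(n, m) + 1))


def gene_neighbor_prob(X, m):
    """P_m(<= X) = X * sum_{k=0}^{m-1} (-ln X)^k / k!, for 0 < X <= 1."""
    return X * sum((-log(X)) ** k / factorial(k) for k in range(m))


# Two proteins, each with homologs in 4 of 5 genomes, sharing all 4 genomes.
print(phylo_profile_pvalue(4, 4, 4, 5))

# A single genome (m = 1) reduces to the uniform case, P = X.
print(gene_neighbor_prob(0.5, 1))
```

A small value from either function means the observed co-occurrence or proximity is unlikely to arise by chance, which is what makes the pair a candidate functional linkage.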

Page 12

Combining Inferences of Co-Evolution from 4 Methods

We use a Bayesian approach to combine the probabilities from the four methods to arrive at a single probability that two proteins co-evolve:

$$O_{\mathrm{post}} = \frac{P(\mathrm{pos})}{P(\mathrm{neg})} \prod_{i=1}^{4} \frac{P(f_i \mid \mathrm{pos})}{P(f_i \mid \mathrm{neg})}$$

where positive pairs are proteins with common pathway annotation and negative pairs are proteins with different annotation
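The combination step amounts to multiplying prior odds by one likelihood ratio per method. The sketch below shows that arithmetic only; the function names, the prior, and the four likelihood ratios are hypothetical values chosen for illustration, not numbers from the talk.

```python
from math import prod


def posterior_odds(likelihood_ratios, prior_pos):
    """Posterior odds = prior odds * product of per-method likelihood ratios.

    Each ratio is P(f_i | pos) / P(f_i | neg) for one method's evidence f_i.
    """
    prior_odds = prior_pos / (1.0 - prior_pos)
    return prior_odds * prod(likelihood_ratios)


def odds_to_probability(odds):
    """Convert odds O into a probability O / (1 + O)."""
    return odds / (1.0 + odds)


# Hypothetical likelihood ratios for the RS, PP, GN and OP methods.
ratios = [4.0, 2.0, 1.0, 0.5]
odds = posterior_odds(ratios, prior_pos=0.2)
print(odds, odds_to_probability(odds))
```

Multiplying the ratios treats the four methods as conditionally independent given the pair's class, which is the usual simplification in this kind of combination.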

Page 13

Benchmarking this Approach Against Known Complexes

Ecocyc: Karp et al. NAR, 30, 56 (2002)

True positive interactions are between subunits of known complexes and false positive ones are between subunits of different complexes.

[Figure: ROC plot; x-axis: Fraction of False Positives (0 to 0.009); y-axis: Fraction of True Positives (0 to 0.4); a "Random" reference line is shown]

For high confidence links, we find 1/3 of true interactions with only 1/1000 of the false positive ones
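Each point on such a plot comes from one confidence cutoff. The sketch below computes a single point, assuming each predicted pair carries a score and a label saying whether both proteins belong to one known complex; the function name `roc_point` and the scored pairs are hypothetical.

```python
def roc_point(scored_pairs, threshold):
    """Return (false positive fraction, true positive fraction) at a cutoff.

    `scored_pairs` is a list of (score, is_true) tuples, where is_true marks
    pairs whose proteins are subunits of one known complex.
    """
    positives = sum(1 for _, is_true in scored_pairs if is_true)
    negatives = len(scored_pairs) - positives
    tp = sum(1 for score, is_true in scored_pairs if is_true and score >= threshold)
    fp = sum(1 for score, is_true in scored_pairs if not is_true and score >= threshold)
    return fp / negatives, tp / positives


# Hypothetical scored pairs: three true interactions and four false ones.
pairs = [
    (0.9, True), (0.8, True), (0.4, True),
    (0.7, False), (0.2, False), (0.1, False), (0.05, False),
]
print(roc_point(pairs, 0.75))
print(roc_point(pairs, 0.5))
```

Sweeping the threshold from high to low traces the curve; the high-confidence region of the plot corresponds to the strictest cutoffs.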

Page 15

Example Complex: NADH Dehydrogenase I

11 of 13 subunits detected

3 false positives
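These counts can be turned into two summary fractions. The terms recall and precision are not used in the talk; they are introduced here only as a convenient way to read the numbers.

```python
detected = 11          # subunits of NADH dehydrogenase I that were found
total_subunits = 13    # subunits in the known complex
false_positives = 3    # predicted members that are not subunits

recall = detected / total_subunits
precision = detected / (detected + false_positives)
print(f"recall = {recall:.3f}, precision = {precision:.3f}")
```

So roughly 85% of the known subunits are recovered, and roughly 79% of the proteins assigned to the complex are genuine subunits.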

Page 16

Functional Linkages Among Cytochrome Oxidase Genes

[Diagram: linkage graph with nodes CtaB, CtaC, CtaD, CtaE]

Functional linkages relate all 3 components of cytochrome oxidase complex and also CtaB, the cytochrome oxidase assembly factor

These genes are at four different chromosomal locations

Membrane proteins linked to soluble proteins

Page 17

From Inferred Protein Complexes to their Structures

Page 18

PE, PE-PGRS, and PPE Proteins in M. tuberculosis

38 PE proteins; 61 PE-PGRS proteins; 68 PPE proteins

Together comprise about 5% of the genome

No function is known, but some appear to be membrane bound

No structure is known: always insoluble when expressed

Goal: use functional linkages to predict a complex between a PE and a PPE protein, express the complex, and determine its structure

Research of Shuishu Wang and Michael Strong

The Problem of PE and PPE Proteins in M. tb

Page 19

Construction of a co-expression vector to test for protein-protein interactions (Mike Strong)

pET 29b(+)

Labeled elements: T7 promoter, lac oper., two RBS, gene A, gene B, Thrombin site, His tag

Restriction sites: NdeI, HindIII, KpnI, NcoI

transcription: polycistronic mRNA

translation: protein A, protein B (with His tag)

[Diagram: protein A and protein B (with His tag) shown for two cases: "If proteins interact (protein-protein interaction)" and "If proteins do not interact"]

Page 20

When co-expressed, the PE and PPE proteins, inferred to interact, do form a soluble complex, Mr = 35,200

Sedimentation equilibrium experiments: Rv2430c + Rv2431c fraction 49, in 20 mM HEPES, 150 mM NaCl, pH 7.8

Concentration: OD280 0.7, 0.45, 0.15

Expected Mr:

Rv2431c (PE): 10,687 (10,563.12 from Mass Spec)

Rv2430c + His tag (PPE): 24,072 (23,895.00 from Mass Spec)

Possibly suggests a 1:1 complex between these two proteins
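The 1:1 suggestion can be checked by comparing the measured Mr with the sums expected for a few simple stoichiometries. The sketch uses the expected masses quoted on this slide; the alternative stoichiometries are added here only for comparison and are not discussed in the talk.

```python
pe_mr = 10687      # Rv2431c (PE), expected Mr
ppe_mr = 24072     # Rv2430c + His tag (PPE), expected Mr
observed = 35200   # Mr from sedimentation equilibrium

# Expected Mr for a few candidate PE:PPE stoichiometries.
candidates = {
    "1:1": pe_mr + ppe_mr,
    "2:1 (PE:PPE)": 2 * pe_mr + ppe_mr,
    "1:2 (PE:PPE)": pe_mr + 2 * ppe_mr,
}

best = min(candidates, key=lambda name: abs(candidates[name] - observed))
for name, mr in candidates.items():
    print(f"{name}: expected {mr}, difference {abs(mr - observed)}")
print("closest:", best)
```

The 1:1 sum of 34,759 lies within about 450 of the measured 35,200, while the other two candidates differ by more than 10,000.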

Page 21

Crystallization Trials of the Complex Between PE Protein Rv2431c and PPE Protein Rv2430c

Page 26

Summary

Many functional linkages are revealed from genomic data (high coverage)

Known subunits of E. coli complexes can be identified with high accuracy from functional linkages

Clustered genome-wide functional maps can reveal and organize information on complexes (and pathways)

A protein complex suitable for structural studies has been revealed from functional linkages

The procedures for identifying and producing protein complexes can be adapted for high throughput

Page 27

Protein Interactions in M. tb.
Analysis of M. tb. Genome
Michael Strong, Debnath Pal, Sulmin Kim

Whole Genome Interaction Maps
Michael Strong, Tom Graeber, Huiying Li, Matteo Pellegrini

Methods of Inferring Interactions
Edward Marcotte, Matteo Pellegrini, Todd Yeates, Michael Thompson

PI of Tb Structural Genomics Consortium
Tom Terwilliger