“the instructions for assembling every organism on the planet--slugs and sequoias, peacocks and...


Post on 20-Dec-2015

Page 1:

“The instructions for assembling every organism on the planet--slugs and sequoias, peacocks and parasites, whales and wasps--are all specified in DNA sequences that can be translated into digital information and stored in a computer for analysis. As a consequence of this revolution, biology in the 21st century is rapidly becoming an information science...

...hypotheses will arise as often in silico as in vitro.”

Eric Lander, Science 287 (5459), 1777-1782

Page 2:
Page 3:

The Problem …analysis of native state assembly of proteins.

• Protein function and folding are highly cooperative processes,

• Amino acids that interact in these processes can be close, or relatively distant in the 1° structure,

– identifying interacting residues in active sites, or identifying interacting residues that yield discrete 3° structure is difficult,

– these interactions are not obvious by scanning primary sequence.

Page 4:

A Partial Solution

• Mutational analysis,

– clone the gene,

– alter the DNA sequence that codes for specific residues,

– express the gene,

– check for function or conformational fidelity.

Labor intensive. Doesn’t indicate residue interactions.

Page 5:

A Better Solution: Thermodynamic Mutant Cycling Analysis

More later…but briefly…

Double mutation analysis,

– used to determine if two different residues (or peptide fragments) interact.

Labor intensive.

Presently impossible to accomplish on a large scale.

Page 6:

A Bioinformatic Alternative… let Evolution do the dirty work.

Multiple Sequence Alignment (globins)

Page 7:

“Entropy” in a MSA…the key to this paper.

• Think of amino acids as parts of a system that follows the rules of thermodynamics,

– if there were no constraints, amino acid frequency and distribution would tend to randomness,

– however, natural selection constrains primary sequence in living systems.
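To make the "entropy" idea concrete, here is a minimal sketch that computes the Shannon entropy of a single MSA column. The two example columns are hypothetical, and Shannon entropy is used only as a familiar stand-in for "randomness"; it is not necessarily the quantity used in the paper.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of the residues in one alignment column."""
    counts = Counter(column)
    total = len(column)
    # Sum p * log2(1/p) over the residues that actually occur in the column.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


# Hypothetical columns, one residue per aligned sequence.
conserved = "GGGGGGGG"   # strongly constrained position
variable = "ACDEFGHI"    # eight different residues, no apparent constraint

print(column_entropy(conserved))  # 0.0
print(column_entropy(variable))   # 3.0
```

A constrained column has low entropy; an unconstrained one drifts toward the maximum for the number of residues observed.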

Page 8:

MSA and Entropy

Page 9:

Genomic Sequences
DNA Sequence: Reagent for the 21st Century

Page 10:

PDZ Domains (n = 274)

“Model” Protein Domain Family

• Evolutionarily conserved, especially in tertiary structure,

– Cα atoms: root mean square deviation = 1.4 angstroms*,

• More diverged in sequence homology,

– averaging 24% AA sequence similarity.

• *Four high resolution crystal structures of distantly related family members.

Post synaptic density protein (PSD95), Drosophila disc large tumor suppressor (DlgA), and Zonula occludens-1 protein (zo-1)

Page 11:

Structural Classification of Proteins: domains

Google: SCOP

Pfam

PDZ domains are found in diverse signaling proteins in bacteria, yeasts, plants, insects and vertebrates. They bind either the carboxyl-terminal sequences of proteins or internal peptide sequences.

PDZ domains consist of 80 to 90 amino acids comprising six beta-strands (betaA to betaF) and two alpha-helices, A and B, compactly arranged in a globular structure. Peptide binding of the ligand takes place in an elongated surface groove as an antiparallel beta-strand interacts with the betaB strand and the B helix. The structure of PDZ domains allows binding to a free carboxylate group at the end of a peptide through a carboxylate-binding loop between the betaA and betaB strands.

Page 12:

Don’t Sweat the Formulas!

…English Translation: a measure of conservation can be made by comparing the frequency of amino acids in the column of a MSA, to a randomly filled column…

…expressed as a change in free energy.
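The "English translation" above can be sketched in a few lines. This example scores a column by its relative entropy (Kullback-Leibler divergence) against a background distribution. That measure is a simplified stand-in chosen for illustration, not the formula from the paper, and the uniform background is a hypothetical placeholder for real database frequencies.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def conservation_score(column, background):
    """Relative entropy (in nats) of a column versus background frequencies.

    Stand-in for a statistical conservation measure: 0 means the column
    looks like the background, larger values mean stronger conservation.
    Gap characters are not handled in this sketch.
    """
    counts = Counter(column)
    total = len(column)
    score = 0.0
    for residue, n in counts.items():
        p = n / total              # observed frequency in the column
        q = background[residue]    # expected frequency in a "random" column
        score += p * math.log(p / q)
    return score


# Hypothetical background: all 20 amino acids equally likely.
uniform = {aa: 1 / 20 for aa in AMINO_ACIDS}

print(conservation_score("HHHHHHHH", uniform))    # fully conserved column
print(conservation_score(AMINO_ACIDS, uniform))   # column matching the background
```

In practice the background would come from a large protein database, such as the one plotted in black in Figure 1A.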

Page 13:

Figure 1A

Black: amino acid frequency in a database of 36,498 proteins.

Gray: amino acid frequency in a database of 274 PDZ domains.

Page 14:

PDZ domain AA 76

• AA 76 is known to be important in determining ligand specificity,

-S/T-X-V/I-COO⁻ or -F/Y-X-V/A-COO⁻

Antepenultimate AA in the ligand.

Page 15:

Figure 1B, C: PDZ MSA ΔGstat

Highly conserved.

Poorly conserved.

ΔGstat = 3.83 kT*

ΔGstat = 0.1 kT*

Page 16:

Figure 1D (labeled positions: 76, 99)

Page 17:

Figure 1E, F

Page 18:

Coupled Sites?

…English Translation: you change the MSA by removing a subset of peptides that have similar (or identical) amino acids in a specific column…

…if the amino acid in the original column interacts with another part of the peptide, you might expect to see a change in ΔGstat (ΔΔGstat) in another column of the new MSA.

Page 19:

Perturbing the MSA…extract subsets of low-entropy alignments.

Re-calculate ΔGstat in the new MSA and look for columns where ΔGstat changed.

Page 20:

AA 76

– pulled out all of the peptides that had a histidine at AA 76 in the MSA to form the perturbed sub-alignment,

• Calculated the change in ΔGstat (ΔΔGstat) at all positions.
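The perturbation step can be sketched as follows, taking "perturbation" to mean keeping only the sequences that carry a chosen residue at a chosen column (an assumption about how the subset is formed). The toy alignment, the uniform background, and the relative-entropy score are all hypothetical stand-ins rather than the paper's data or formula.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
UNIFORM = {aa: 1 / 20 for aa in AMINO_ACIDS}  # hypothetical background


def conservation_score(column, background):
    """Relative entropy (nats) of a column versus background frequencies."""
    counts = Counter(column)
    total = len(column)
    return sum((n / total) * math.log((n / total) / background[res])
               for res, n in counts.items())


def columns(msa):
    """Transpose a list of equal-length aligned sequences into columns."""
    return ["".join(col) for col in zip(*msa)]


def perturbation_shift(msa, site, residue, background):
    """Per-column change in score after restricting the MSA to the
    sequences that have `residue` at column `site` (0-based)."""
    sub = [seq for seq in msa if seq[site] == residue]
    if not sub:
        raise ValueError("no sequence carries that residue at that site")
    full = [conservation_score(c, background) for c in columns(msa)]
    pert = [conservation_score(c, background) for c in columns(sub)]
    return [p - f for f, p in zip(full, pert)]


# Toy alignment: column 1 tracks column 0 (H goes with A, Y goes with G);
# column 2 varies independently of column 0.
toy_msa = ["HAK", "HAR", "YGK", "YGR"]

for index, shift in enumerate(perturbation_shift(toy_msa, 0, "H", UNIFORM)):
    print(index, round(shift, 3))
```

In the toy alignment, the column that co-varies with the perturbed site shows a nonzero shift, while the independent column does not move. That is the signature the slides describe as a coupled site.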

Page 21:

Figure 2B (labels: AA 76, AA 34, AA 63)

Page 22:

Figure 2C-F

33, 34, 80, 84: local

29, 26: other side of ligand binding

66, 57, 51: unexpected

Page 23:

In Silico, So Far, So What?

Show me the money...

Page 24:

Fig. 3

Statistical ΔΔGstat

Experimental ΔΔGstat

H76Y

Page 25:

FRET: Förster Resonance Energy Transfer

Page 26:

Mutant Cycling Analysis (General)…with FRET (Förster Resonance Energy Transfer)

ratio m1

ratio m1:m2

If not equal, then sites are coupled.

Please Note: this was a general presentation, see slide 31 for the manner used in this paper.

Page 27:

Fig. 3

?

Fig. D: What is it, why is it included?

Page 28:

Figure 4

Attempt to map connectivity through the peptide.

Also performed analysis on POZ domain.

Page 29:

Conclusion

With growing sequence data from evolutionarily distant genomes, the mapping of energetic connectivity for many fold families should be a realistic goal.

Page 30:

Figure 4

Page 31:

Coupling Coefficient (Mutant Cycling Analysis)

$$\text{coupling coefficient} = \frac{k(\text{wt:wt}) \times k(\text{mut:mut})}{k(\text{wt:mut}) \times k(\text{mut:wt})}$$

...if there is no coupling, then the coupling coefficient would approach unity.
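As a quick arithmetic check of the coupling coefficient, here is a minimal sketch. The function name, argument names, and all numeric values are hypothetical; only the ratio itself comes from the slide.

```python
def coupling_coefficient(k_wt_wt, k_wt_mut, k_mut_wt, k_mut_mut):
    """Ratio (k(wt:wt) * k(mut:mut)) / (k(wt:mut) * k(mut:wt)).

    Each argument is the measured constant for one corner of the
    double-mutant cycle. A value near 1 indicates no coupling.
    """
    return (k_wt_wt * k_mut_mut) / (k_wt_mut * k_mut_wt)


# Hypothetical independent sites: each mutation scales k by a fixed factor
# (x2 and x4), so the double mutant is simply x8 and the ratio is 1.
print(coupling_coefficient(1.0, 2.0, 4.0, 8.0))

# Hypothetical coupled sites: the double mutant deviates from the product
# of the two single-mutant effects, so the ratio moves away from 1.
print(coupling_coefficient(1.0, 2.0, 4.0, 16.0))
```

Taking the logarithm of the coefficient gives a quantity that is zero when the sites are independent, which is often a more convenient scale for comparison.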