A genome-wide perspective on translation of proteins
Dec 2012, Regulatory Genomics. Lecturer: Prof. Yitzhak Pilpel. Teaching assistant: Idan Frumkin. Submit Sunday at midnight.
The Central Dogma of Molecular Biology: Expressing the genome
DNA → mRNA → Protein
[Figure labels: inactive DNA, RNA]
http://esg-www.mit.edu:8001/esgbio/pge/lac.html
In the presence of Lactose
The Lac Operon (Jacob and Monod)
Catabolism (breakdown of molecules, e.g. lactose):
• Gene is ON when substrate is present
• Gene is OFF when substrate is absent
Anabolism (synthesis of molecules, e.g. amino acids):
• Gene is ON when the end product is absent
• Gene is OFF when the end product is present
The basic logic of metabolic control
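The two regulatory rules can be written as a small truth table. This is a minimal sketch: the function name `gene_is_on` and the pathway labels are hypothetical, chosen only for illustration.

```python
def gene_is_on(pathway: str, molecule_present: bool) -> bool:
    """Return True if the pathway's genes are expressed.

    pathway: "catabolic" (e.g. lactose breakdown) or "anabolic"
    (e.g. amino acid synthesis). These labels are illustrative assumptions.
    molecule_present: the substrate (catabolic) or end product (anabolic).
    """
    if pathway == "catabolic":
        # Breakdown enzymes are only needed when there is something to break down.
        return molecule_present
    if pathway == "anabolic":
        # Synthesis enzymes are only needed when the molecule is missing.
        return not molecule_present
    raise ValueError(f"unknown pathway: {pathway}")


for pathway in ("catabolic", "anabolic"):
    for present in (True, False):
        state = "ON" if gene_is_on(pathway, present) else "OFF"
        print(f"{pathway:9s} molecule present={present!s:5s} -> gene {state}")
```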
A combined transcription-translation control switch: the attenuation mechanism
Charles Yanofsky
The trp operon in E. coli
A negative control at the transcription level (similar to, yet different from, the lac operon)
How not to make too much tryptophan?
• A fail-safe mechanism complements transcription control
• At the translation level!
The up-stream ORF structure of the trp operon
A uORF
Mutual palindromes
1-2 are complementary
2-3 are complementary
3-4 are complementary
The various palindromic pairings
• High Trp: 1-2 and 3-4 pair. Transcription terminator!
• Low Trp: 2-3 pair. Not a terminator!
The structure of the Attenuation switch
[Figure labels: ribosome, RNA pol]
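The switch can be sketched as a two-state decision. This is a toy model, not a simulation: it assumes the leader peptide (the uORF) is being translated, and the detail of where the ribosome stalls follows the textbook description of the mechanism rather than anything spelled out on the slide. The function name and return keys are hypothetical.

```python
def attenuation_outcome(trp_high: bool) -> dict:
    """Toy model of the trp attenuator, assuming the uORF is translated."""
    if trp_high:
        # Charged tRNA-Trp is plentiful: the ribosome does not stall on the
        # Trp codons of the uORF, so regions 3 and 4 pair with each other.
        hairpin = "3-4"
    else:
        # The ribosome stalls on the Trp codons in region 1, which leaves
        # region 2 free to pair with region 3.
        hairpin = "2-3"
    # Only the 3-4 hairpin acts as a transcription terminator.
    terminates = hairpin == "3-4"
    return {"hairpin": hairpin, "transcription_terminates": terminates}


for trp_high in (True, False):
    label = "High Trp" if trp_high else "Low Trp"
    print(label, attenuation_outcome(trp_high))
```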
Could that be implemented in eukaryotes as well?
• No! It requires coupled transcription and translation (in eukaryotes, transcription takes place in the nucleus and translation in the cytoplasm)
Where does translation take place?
Spatial organization of the flow of genetic information in bacteria (Llopis Nature 2010)
[Figure legend: DNA, mRNA, Protein]
Translation consists of initiation, elongation and termination
[Figure labels: 5’, 3’, STOP, codon, anti-codon]
The dynamics of translation
The ribosome reads a nucleotide sequence and produces an amino acid sequence based on the genetic code
Some important properties of the code
• The code is (almost) universal
• There are 61 amino acid codons, and 3 STOP codons
• The code is “redundant”: many amino acids have more than one codon
• The genetic code is optimal with respect to many properties, such as error tolerance
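The counts in the list above can be checked against the standard genetic code. The sketch below builds the table from the usual compact encoding (bases in the order T, C, A, G; `*` marks a STOP codon) and uses the DNA alphabet, so T stands in for U. The variable names are illustrative.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon, with codons ordered
# TTT, TTC, TTA, TTG, TCT, ... GGG ('*' marks a STOP codon).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

sense = {codon: aa for codon, aa in CODON_TABLE.items() if aa != "*"}
stops = sorted(codon for codon, aa in CODON_TABLE.items() if aa == "*")
degeneracy = Counter(sense.values())  # number of codons per amino acid

print(len(sense), len(stops), len(degeneracy))  # 61 3 20
print(stops)                                    # ['TAA', 'TAG', 'TGA']
print(degeneracy["S"], degeneracy["C"], degeneracy["M"])  # 6 2 1
```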
The tRNA: the generic form, a specific form, in 3D
Aminoacyl tRNA synthetase: the really “smart” part
20 amino acids, 61 codons, 20 aminoacyl tRNA synthetases
Error rate: 1/10,000-1/100,000 (in vitro; higher in vivo)
The 20 canonical amino acids
Possible mechanisms of translational regulation
• Optimality of ribosomal attachment site
• mRNA secondary structure
• Codon usage
Multiple codons for the same amino acid
            C1   C2   C3   C4   C5   C6
Serine:     UCU  UCC  UCA  UCG  AGC  AGU
Cysteine:   UGU  UGC
Methionine: AUG
STOP:       UAA  UAG  UGA
[Figure: a hypothetical protein (G T R Y E C Q A S F D) encoded with different choices of codon C1 or C2 at each position]
For a hypothetical protein of 300 amino acids with two codons each, there are 2^300 possible nucleotide sequences
These variants code for the same protein, and are thus considered “synonymous”.
Indeed, evolution would easily exchange between them
But are they all really equivalent?
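To get a feeling for the size of 2^300, the count can be computed directly as a product of the number of codon choices at each position. A minimal sketch; the helper name `synonymous_count` is hypothetical.

```python
import math


def synonymous_count(codons_per_residue):
    """Number of nucleotide sequences encoding the same protein:
    the product of the number of codon choices at each position."""
    return math.prod(codons_per_residue)


# Hypothetical protein from the slide: 300 residues, two codons each.
n = synonymous_count([2] * 300)
print(n == 2 ** 300)                       # True
print(f"about 10^{math.log10(n):.1f}")     # about 10^90.3
```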
The codon bias in genomes
Two potential types of sources for codon bias
• Mutation pattern (neutral)
• Selection
The effect of (or on?) GC content
[Figure: nucleotide composition and codon bias in coding and inter-genic regions]
Inter-genic composition (esp. in bacteria) explains codon bias
[Diagram labels: mutation pressure, selection, amino acid composition]
Selection of codons might affect:
• Accuracy
• Throughput
• Costs
• Folding
• RNA structure
A simple model for translation efficiency
mRNA (5’ to 3’): AAA CCA GAA UCG AAG …
tRNA amount:     8   2   5   4   1     Average: 4

AA   Codon  Amount
Lys  AAA    8
Asn  AAC    6
Lys  AAG    1
Asn  AAU
Thr  ACA
Thr  ACC
..
Phe  UUU
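The simple model amounts to averaging the tRNA amounts of the codons along the message. A minimal sketch using the five values shown on the slide; the dictionary and function names are illustrative.

```python
# tRNA amounts for the five codons shown on the slide.
TRNA_AMOUNT = {"AAA": 8, "CCA": 2, "GAA": 5, "UCG": 4, "AAG": 1}


def simple_efficiency(codons):
    """Arithmetic mean of the tRNA amounts of the codons in a message."""
    amounts = [TRNA_AMOUNT[codon] for codon in codons]
    return sum(amounts) / len(amounts)


print(simple_efficiency(["AAA", "CCA", "GAA", "UCG", "AAG"]))  # 4.0
```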
The same protein can be encoded in many ways…
amino acid sequence: MPKSNFRFGE
ATG
ATG CCT
ATG CCC
ATG CCA
ATG CCG
[Figure: the alternative proline codons ranked as most efficient, intermediate efficiency or least efficient, according to the relative concentration of tRNA in the cell]
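How many ways are there to encode MPKSNFRFGE? A minimal sketch that multiplies the number of synonymous codons at each position, using the codon counts of the standard genetic code. The names are illustrative, and the codons are written in the DNA alphabet as on the slide.

```python
import math

# Number of codons per amino acid in the standard genetic code
# (one-letter amino acid codes; the values sum to the 61 sense codons).
CODONS_PER_AA = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_encodings(protein: str) -> int:
    """Number of distinct coding sequences for a protein."""
    return math.prod(CODONS_PER_AA[aa] for aa in protein)


# The first two residues, M and P: one Met codon times four Pro codons.
for pro_codon in ("CCT", "CCC", "CCA", "CCG"):
    print("ATG" + pro_codon)

print(count_encodings("MP"))          # 4
print(count_encodings("MPKSNFRFGE"))  # 18432
```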
Scoring coding sequences for efficiency in translation
Codon:            ATC CCA AAA TCG AAT …
tRNA gene copies: 10  10  7   2   6
Coding sequence translation efficiency score = (geometric) average of all tRNA gene copy numbers
[Figure labels: efficient, intermediate, non-efficient]
(dos Reis et al. Nucleic Acids Res, 2004)
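The score for the five codons on the slide can be worked out as a geometric mean of their tRNA gene copy numbers. A minimal sketch; the function name is hypothetical, and the mean is computed in log space to avoid overflow on long sequences.

```python
import math

# tRNA gene copy numbers for the five codons shown on the slide.
TRNA_GENE_COPIES = {"ATC": 10, "CCA": 10, "AAA": 7, "TCG": 2, "AAT": 6}


def geometric_mean_score(codons):
    """Geometric mean of tRNA gene copy numbers along a coding sequence."""
    logs = [math.log(TRNA_GENE_COPIES[codon]) for codon in codons]
    return math.exp(sum(logs) / len(logs))


score = geometric_mean_score(["ATC", "CCA", "AAA", "TCG", "AAT"])
print(round(score, 2))  # 6.09
```

Unlike the arithmetic mean, the geometric mean is pulled down strongly by a single rare codon, which fits the idea that one slow codon can hold up the ribosome.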
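\[
W_i = \sum_{j=1}^{n_i} (1 - s_{ij})\, \mathrm{tRNA}_{ij}
\]
\[
w_i = \begin{cases} W_i / W_{\max} & \text{if } W_i \neq 0 \\ w_{\mathrm{mean}} & \text{else} \end{cases}
\]
\[
\mathrm{tAI}_g = \left( \prod_{k=1}^{\ell_g} w_{i_k g} \right)^{1/\ell_g}
\]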
dos Reis et al. NAR 2004
The tRNA Adaptation Index (tAI)
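The three steps of the index can be sketched in a few lines. Here W_i sums, over the tRNAs that read codon i, the gene copy number weighted by (1 - s_ij), where s_ij penalizes wobble pairing; w_i rescales by the maximum; and tAI is the geometric mean of w over the codons of a gene. Everything numeric below is hypothetical: the s values, the empty tRNA set for CCC, and the choice of the arithmetic mean of the nonzero w values for w_mean are assumptions, not the published parameters.

```python
import math


def absolute_adaptiveness(trnas):
    """W_i for one codon. trnas: list of (gene_copy_number, s_ij) pairs."""
    return sum((1.0 - s) * copies for copies, s in trnas)


def relative_adaptiveness(W):
    """w_i = W_i / W_max, or w_mean when W_i is zero.

    Assumption: w_mean is the arithmetic mean of the nonzero w values.
    """
    w_max = max(W.values())
    nonzero = [value / w_max for value in W.values() if value != 0]
    w_mean = sum(nonzero) / len(nonzero)
    return {c: (value / w_max if value != 0 else w_mean) for c, value in W.items()}


def tai(codons, w):
    """Geometric mean of w over the codons of a gene."""
    logs = [math.log(w[codon]) for codon in codons]
    return math.exp(sum(logs) / len(logs))


# Hypothetical tRNA sets: (gene copy number, s_ij). s = 0 means perfect pairing.
HYPOTHETICAL_TRNAS = {
    "ATC": [(10, 0.0)], "CCA": [(10, 0.0)], "AAA": [(7, 0.0)],
    "TCG": [(2, 0.0)], "AAT": [(6, 0.5)],  # read only through wobble
    "CCC": [],                             # no tRNA: W = 0, so w = w_mean
}

W = {codon: absolute_adaptiveness(t) for codon, t in HYPOTHETICAL_TRNAS.items()}
w = relative_adaptiveness(W)
gene_tai = tai(["ATC", "CCA", "AAA", "TCG", "AAT"], w)
print(w)
print(gene_tai)
```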
A simple model for translation efficiency: wobble interaction
ATC CCA AAA TCG AAT …
Correlation of tAI with experimentally determined protein levels: r=0.63
[Figure: measured protein abundance vs. predicted translation efficiency]
(Ghaemmaghami et al. Nature 2003)
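The correlation coefficient itself is simple to compute. The sketch below implements Pearson's r from its definition; the tAI and abundance values are hypothetical placeholders, not the measured data, and taking the logarithm of abundance is an assumption about how such data are usually compared.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical values for five genes (NOT real measurements).
hypothetical_tai = [0.25, 0.30, 0.42, 0.55, 0.61]
hypothetical_abundance = [900, 4000, 2500, 60000, 30000]  # molecules per cell

log_abundance = [math.log10(a) for a in hypothetical_abundance]
print(pearson_r(hypothetical_tai, log_abundance))
```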
The correlation is quite high, but why not even higher?
• The limitations of the model
• tRNA gene copy numbers (a proxy for actual tRNA levels)
• The model only captures elongation
• Differences in mRNA levels
• Proteins are also degraded at different rates
\[
\mathrm{tAI}_g = \left( \prod_{k=1}^{\ell_g} w_{i_k g} \right)^{1/\ell_g}
\]
The effective number of codons (Nc) - a measure of overall synonymous codon usage bias
Gene1: highly biased synonymous codon usage (Nc=20)
AA    Codon  Codon count
...
Gly   GGT    0
Gly   GGC    12
Gly   GGA    0
Gly   GGG    0
...

Gene2: no bias in synonymous codon usage (Nc≥61)
AA    Codon  Codon count
...
Gly   GGT    3
Gly   GGC    3
Gly   GGA    3
Gly   GGG    3
...
Wright, F. (1990). "The 'effective number of codons' used in a gene." Gene 87(1): 23-9.
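Wright's measure can be sketched in two steps: a homozygosity F for each amino acid, F = (n Σp² - 1)/(n - 1), and then Nc = 2 + 9/F2 + 1/F3 + 5/F4 + 3/F6, where F2, F3, F4 and F6 are the average F values of the two-, three-, four- and six-codon amino acids. The Gly counts below are a reading of the slide's example (the extracted digits are ambiguous, so treat them as an assumption), and the function names are hypothetical.

```python
def homozygosity(counts):
    """Wright's F for one amino acid, from its synonymous codon counts."""
    n = sum(counts)
    if n < 2:
        raise ValueError("need at least two codons to estimate F")
    sum_p2 = sum((c / n) ** 2 for c in counts)
    return (n * sum_p2 - 1) / (n - 1)


def effective_number_of_codons(f2, f3, f4, f6):
    """Nc from the average F of each degeneracy class (standard code:
    2 single-codon, 9 two-fold, 1 three-fold, 5 four-fold, 3 six-fold)."""
    return 2 + 9 / f2 + 1 / f3 + 5 / f4 + 3 / f6


# Gly example, counts assumed from the slide: one codon used vs. all equal.
print(homozygosity([0, 12, 0, 0]))  # 1.0: a single effective codon
print(homozygosity([3, 3, 3, 3]))   # 2/11: 1/F = 5.5 effective codons

# Extreme cases: one codon per amino acid, and perfectly even usage.
print(effective_number_of_codons(1, 1, 1, 1))              # 20.0
print(effective_number_of_codons(1 / 2, 1 / 3, 1 / 4, 1 / 6))
```

With small samples the even-usage estimate of 1/F can exceed the real number of codons (5.5 for four Gly codons here), which is one way Nc can come out above 61.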
Codon usage bias is correlated with translation efficiency
r=-0.79 (p<0.001)
But not in all species (e.g. A. gossypii)
r=-0.48 (p=0.218)
Species        r      p
S. cerevisiae  -0.79  <0.001
S. bayanus     -0.73  <0.001
C. glabrata    -0.79  <0.001
A. gossypii    -0.48  0.218
D. hansenii    -0.75  <0.001
C. albicans    -0.65  0.005
Y. lipolytica  -0.84  <0.001
S. pombe       -0.66  <0.001
Translational selection acts in some but not all species (e.g. debate on human…)
Correlation does not imply causality!!
r=0.63
[Figure: measured protein abundance vs. predicted translation efficiency]
(Ghaemmaghami et al. Nature 2003)
Evolutionary
Physiological
Z