RNA Structure Prediction
RNA Structure Basics
The RNA ‘Rules’
Programs and Predictions
BIO520 Bioinformatics Jim Lund
Assigned reading: Ch. 6 from Bioinformatics: A Practical Guide to the Analysis of Genes and Proteins, 3rd Ed. by Baxevanis and Ouellette.
RNA classes
• mRNA - messenger RNA.
• tRNA - transfer RNA, small (~80 bases) sequences which bring amino acids to the ribosome.
• rRNA - ribosomal RNA, RNA + proteins = ribosome.
• viral RNA (ssRNA, dsRNA viruses)
• miRNA: translational/transcriptional gene silencing.
• snoRNA, snRNA: splicing, RNA base modification
• Transfer-messenger RNAs (tmRNA), small cytosolic RNAs (scRNA), guide RNAs (gRNA)
• and more…
RNA structures
• 1°
– Sequence (and modifications)
• 2°
– Base pairing
• 3°
– Overall structure, non-Watson-Crick pairs
– Experimental structures: tRNA, ribosome
RNA Tertiary Structure, tRNA
[Figure: yeast phenylalanine tRNA, 1.93 Å resolution; labels: anticodon loop, 3’ (aminoacyl) end CCA]
rRNA small subunit, X. laevis
2° RNA structures
• Watson-Crick pairing -> helices
• Loop regions
– Hairpin loops
– Internal loops
– Bulge loops
– Multibranch loops
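One common way to write down a 2° structure is dot-bracket notation: matched parentheses mark paired bases and dots mark unpaired ones. The notation, the function names, and the example string below are assumptions added for illustration, not something taken from the slides. A minimal sketch that recovers the base pairs and picks out hairpin loops:

```python
def parse_dot_bracket(structure):
    """Return a sorted list of (i, j) base-pair positions (0-based)."""
    stack, pairs = [], []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % i)
            pairs.append((stack.pop(), i))
        elif ch != ".":
            raise ValueError("unexpected character %r" % ch)
    if stack:
        raise ValueError("unbalanced '(' at position %d" % stack[-1])
    return sorted(pairs)


def hairpin_loops(structure):
    """Return (i, j, size) for every pair that encloses only unpaired bases."""
    loops = []
    for i, j in parse_dot_bracket(structure):
        if set(structure[i + 1:j]) <= {"."}:
            loops.append((i, j, j - i - 1))
    return loops


# Hypothetical example: a stem with a one-base bulge and a six-base hairpin loop.
example = "(((.((......)))))"
print(parse_dot_bracket(example))
print(hairpin_loops(example))  # [(5, 12, 6)]
```

Internal, bulge, and multibranch loops can be found the same way by looking at what lies between consecutive pairs, but that is left out to keep the sketch short.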
RNA Modifications
Covalent modifications, especially in tRNA
– rU -> rT, rΨ, rD, rS4U
– rC -> 3-CH3-C, 5-CH3-C
– rA -> I, 6-CH3-A, 6-isopentenyl-A
– rG -> 7-CH3-G, Q, Y
Nucleosides Nucleotides 1999 Jun-Jul;18(6-7):1579-81
RNA Base pairing
• G-C: three hydrogen bonds
• A-U: two hydrogen bonds
• G-U (wobble pair): two hydrogen bonds, weaker than the Watson-Crick pairs
RNA structure energetics
• The number of GC versus AU and GU base pairs.
– GC pairs have more hydrogen bonds and form more stable structures.
• Number of base pairs in a stem region.
– Longer stems result in more bonds.
• Number of bases in a hairpin loop region.
– Formation of loops with more than 10 or fewer than 5 bases requires more energy.
• Number of unpaired bases (interior loops or bulges).
– Unpaired bases decrease the stability of the structure.
2° Structure
5’      3’
   G--C
   G--C
   C--G
   A  |
   U--A
   G--C
  A    A
  A    A
   A  A
“The Rules”
• Base Pairs -- Good
– G:C better than A:U -- and local sequence matters!
• Bulges, Loops -- Bad
• Many small interactions -> Stable Structure
• Only predict “Canonical Interactions”
Base Pairs/Stacks
A=U stacked on A=U (AA/UU): ΔG = -1.2 kcal/mole
A=U stacked on U=A (AU/UA): ΔG = -1.6 kcal/mole
Base Pairing/Stacking
Stack (top strand/bottom strand)     ΔG (kcal/mole)
AA/UU                                -1.2
AU/UA or UA/AU                       -1.6
AG/UC, AC/UG, CA/GU, GA/CU           -2.1
CC/GG                                -4.8
CG/GC                                -3.0
GC/CG                                -4.3
GU/UG                                -0.3
XG/YU, GX/UY                          0
Bloomfield, Crothers, Tinoco, Physical Chemistry of Nucleic Acids
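The stacking values above can be added up along a stem to estimate how much a helix contributes. The sketch below is illustrative only. It assumes the top strand is read 5'->3' and the bottom strand 3'->5', that a stack read from the opposite strand has the same energy, and that any stack not listed (including the X/Y rows) contributes 0. The function names are made up for this example.

```python
# Stacking free energies in kcal/mole, copied from the table above.
# Key: (top strand 5'->3', bottom strand 3'->5').
STACK_DG = {
    ("AA", "UU"): -1.2,
    ("AU", "UA"): -1.6, ("UA", "AU"): -1.6,
    ("AG", "UC"): -2.1, ("AC", "UG"): -2.1,
    ("CA", "GU"): -2.1, ("GA", "CU"): -2.1,
    ("CC", "GG"): -4.8,
    ("CG", "GC"): -3.0,
    ("GC", "CG"): -4.3,
    ("GU", "UG"): -0.3,
}


def stack_energy(top, bottom):
    """Energy of one stack; also tries the same stack read from the other strand."""
    if (top, bottom) in STACK_DG:
        return STACK_DG[(top, bottom)]
    flipped = (bottom[::-1], top[::-1])
    # Assumption: stacks missing from the table contribute 0.
    return STACK_DG.get(flipped, 0.0)


def helix_energy(strand5, strand3):
    """Sum the stacks of a helix; strand5[i] is paired with strand3[i]."""
    if len(strand5) != len(strand3):
        raise ValueError("both strands of the helix must have the same length")
    return sum(
        stack_energy(strand5[i:i + 2], strand3[i:i + 2])
        for i in range(len(strand5) - 1)
    )


# The G--C, G--C, C--G stem: stacks GG/CC (-4.8) and GC/CG (-4.3).
print(round(helix_energy("GGC", "CCG"), 1))  # -9.1
```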
Hairpin Loops (GC closure)
N=3 +8
N=4,5 +5
N=6,7 +4
N=8,9 +5
N>=10 6+0.9 ln(N/10)
• Tertiary Interactions!
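The hairpin loop table translates directly into a small lookup function. This is a sketch under stated assumptions: the values are taken to be in kcal/mole like the stacking energies, N is the number of unpaired bases in the loop, and loops smaller than N=3 are rejected because the table does not list them. The function name is hypothetical.

```python
import math


def hairpin_loop_energy(n):
    """Destabilizing energy of a GC-closed hairpin loop with n unpaired bases."""
    if n < 3:
        raise ValueError("the table starts at N=3")
    if n == 3:
        return 8.0
    if n <= 5:
        return 5.0
    if n <= 7:
        return 4.0
    if n <= 9:
        return 5.0
    # N >= 10: logarithmic growth with loop size.
    return 6.0 + 0.9 * math.log(n / 10)


for size in (3, 6, 10, 20):
    print(size, round(hairpin_loop_energy(size), 2))
```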
Internal Loops
G-X-C / C-X-G   0
N=2-6 +2
N=7 +3
N>=8 3+0.9 ln(N/7)
5’      3’
   G--C
   G--C
   C--G
   A  G
   G  A
   A  C
   U--A
   G--C
   U--A
   G--C
Single-Strand Bulges
5’      3’
   G--C
   G--C
   C--G
   A  |
   G  |
   A  |
   U--A
   G--C
   U--A
   G--C
N=1 +3
N=2-3 +4
N=4-7 +5
N>=8 6+0.9 ln(N/8)
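The internal loop and bulge tables share one shape: a few fixed steps followed by a logarithmic tail. A table-driven sketch can cover both. The names below are hypothetical, the values are assumed to be in kcal/mole, and N is assumed to count all unpaired bases in the loop. The G-X-C / C-X-G entry is not modeled.

```python
import math

# Each table: smallest N listed, (largest N, penalty) steps, then the
# (constant, reference size) of the tail "constant + 0.9 ln(N / reference)".
INTERNAL_LOOP = {"min": 2, "steps": [(6, 2.0), (7, 3.0)], "tail": (3.0, 7)}
BULGE = {"min": 1, "steps": [(1, 3.0), (3, 4.0), (7, 5.0)], "tail": (6.0, 8)}


def loop_penalty(n, table):
    """Look up the destabilizing energy for a loop of n unpaired bases."""
    if n < table["min"]:
        raise ValueError("loop size below the smallest table entry")
    for largest_n, penalty in table["steps"]:
        if n <= largest_n:
            return penalty
    constant, reference = table["tail"]
    return constant + 0.9 * math.log(n / reference)


# The bulge drawn above has three unpaired bases (A, G, A).
print(loop_penalty(3, BULGE))          # 4.0
# An internal loop with three unpaired bases on each strand, counted as N=6.
print(loop_penalty(6, INTERNAL_LOOP))  # 2.0
```

Keeping the numbers in data rather than in branches makes it easy to swap in a different parameter set without touching the lookup logic.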
Prediction Programs
• Mfold (M. Zuker)
– 2° structure
• RNAstructure/OligoWalk
– 2° structure, oligo/RNA target interactions
• alifold
– 2° structure constrained by multiple alignment.
• Pfold
– 2° structure guided by rules derived from known tRNA/rRNA structures
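To give a feel for how 2° structure prediction can work, here is a sketch of Nussinov-style dynamic programming, which simply maximizes the number of canonical base pairs. This is a deliberately simpler technique than the energy minimization Mfold performs and is not the algorithm used by any of the programs listed. The function name and the `min_loop` parameter (minimum unpaired bases in a hairpin) are assumptions for this example.

```python
def max_base_pairs(seq, min_loop=3):
    """Maximum number of nested G-C, A-U, or G-U pairs that seq can form."""
    allowed = {("G", "C"), ("C", "G"), ("A", "U"),
               ("U", "A"), ("G", "U"), ("U", "G")}
    n = len(seq)
    if n == 0:
        return 0
    # best[i][j] = maximum pair count for the subsequence seq[i..j].
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case 1: base j stays unpaired
            # case 2: base j pairs with some k, leaving >= min_loop bases between
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in allowed:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]


print(max_base_pairs("GGGAAAUCC"))  # 3
```

Counting pairs ignores stacking and loop penalties, so the result is only a rough upper bound on pairing, not a predicted fold.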
Prediction Programs
• Mfold (GCG)
– M. Zuker
• Mfold input to Plotfold
– Non-graphic output: -G option
– Graphics outputs
  • Squiggles
  • Mountains
  • Circles
  • Domes
  • Energy plots
Squiggles
[Figure: squiggles plot with base positions 1, 20, 40, 60 and the CCA-3’OH end labeled]
DOMES, MOUNTAIN, CIRCLES
MFOLD Structure Family
• Optimal & Suboptimal structures
– Can ask for multiple structures
• Energy increment and “window size” increment.
• View individually.
• How variable are the structures?
– Energy Plots
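The idea of an energy increment can be sketched as a filter: keep every candidate structure whose energy lies within a chosen increment of the optimum. This is not Mfold's implementation, and the "window size" criterion is not modeled. The function name, the dot-bracket strings, and the energies are made-up values for illustration only.

```python
def within_energy_increment(structures, increment):
    """Keep (structure, energy) tuples within `increment` of the lowest energy.

    Energies are assumed to be in kcal/mole; lower means more stable.
    """
    if not structures:
        return []
    best = min(energy for _, energy in structures)
    kept = [(s, e) for s, e in structures if e <= best + increment]
    return sorted(kept, key=lambda item: item[1])


# Hypothetical candidates; none of these numbers come from a real fold.
candidates = [
    ("(((......)))", -4.2),
    ("((((....))))", -5.0),
    ("............", 0.0),
]
for structure, energy in within_energy_increment(candidates, 1.0):
    print(structure, energy)
```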
ENERGY PLOT
P-Num Plot
Prediction Quality
Forces in RNA folds
• Complementary molecular surfaces
• Bridging cations
• Pseudoknotting
• “Kinetic traps” in folding
– NOT always 2° first!
Annu Rev Biophys Biomol Struct 1999;28:57-73
Proc Natl Acad Sci U S A 1998 Sep 29;95(20):11555-60
RNA Structure Probing
• Physical methods
– X-ray diffraction, NMR
• Enzymatic methods
– S1, RNases (find ss and ds regions).
• Chemical modification
– DMS…
• Mutagenesis
– G:C => C:C => C:G
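The mutagenesis test G:C => C:C => C:G works by breaking a proposed pair with one mutation and restoring it with a second, compensatory one. A small sketch of that logic follows. The sequence, the positions, and the function names are hypothetical and chosen only so that positions 0 and 16 form a G:C pair.

```python
CANONICAL = {("G", "C"), ("C", "G"), ("A", "U"),
             ("U", "A"), ("G", "U"), ("U", "G")}


def can_pair(a, b):
    """True if bases a and b form a G-C, A-U, or G-U pair."""
    return (a, b) in CANONICAL


def mutate(seq, pos, base):
    """Return a copy of seq with the base at pos replaced."""
    return seq[:pos] + base + seq[pos + 1:]


# Hypothetical hairpin; positions 0 and 16 are proposed to pair.
wild_type = "GGCAUGAAAAAACAGCC"
i, j = 0, 16
single = mutate(wild_type, i, "C")   # G:C => C:C, pair disrupted
double = mutate(single, j, "G")      # C:C => C:G, pair restored

for name, seq in [("wild type", wild_type), ("single", single), ("double", double)]:
    status = "pairs" if can_pair(seq[i], seq[j]) else "does not pair"
    print(name, seq[i] + ":" + seq[j], status)
```

If function is lost in the single mutant and recovered in the double mutant, that supports the proposed base pair.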
Ribozymes
• Naturally occurring
– RNase P
– Group I introns
– Group II introns
– snRNA in the spliceosome
• Artificial
– Engineered/evolved in the lab from natural ribozymes to have new substrate RNA.
– Cleave mRNA, drug-like action
• miRNA/siRNA
– Translational/transcriptional gene silencing
Published by AAAS
T. A. Lincoln et al., Science 323, 1229-1232 (2009)
Cross-replicating RNA enzymes