
Page 1: CSE 527 Lecture 17, 11/24/04 - University of Washington

CSE 527 Lecture 17, 11/24/04

RNA Secondary Structure Prediction

Page 2:

Outline

• What is it

• How is it Represented

• Why is it important

• Examples

• Approaches

Page 3:

RNA Structure

• Primary Structure: Sequence

• Secondary Structure: Pairing

• Tertiary Structure: 3D shape

Page 4:

RNA Pairing

• Watson-Crick Pairing

• C - G ~ 3 kcal/mole

• A - U ~ 2 kcal/mole

• “Wobble Pair” G - U ~ 1 kcal/mole

• Non-canonical Pairs (esp. if modified)
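
The pairing rules above can be captured in a small lookup table. This is a minimal sketch, assuming the slide's rough magnitudes are treated as stabilizing (negative) energies; the names `PAIR_ENERGY` and `pair_energy` are hypothetical, and non-canonical pairs are simply treated as "does not pair".

```python
# Approximate pair stabilities from the slide, in kcal/mole.
# Sign convention (an assumption): more negative means more stable.
PAIR_ENERGY = {
    frozenset("CG"): -3.0,  # Watson-Crick
    frozenset("AU"): -2.0,  # Watson-Crick
    frozenset("GU"): -1.0,  # "wobble" pair
}


def pair_energy(x, y):
    """Return the approximate energy of pairing bases x and y, or None if they do not pair."""
    return PAIR_ENERGY.get(frozenset((x, y)))
```

For example, `pair_energy("U", "G")` looks up the wobble pair regardless of argument order, and `pair_energy("A", "A")` returns `None`.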

Page 5:

A tRNA 3D Structure

Page 6:

tRNA - Alt. Representations

[Figure: alternative representations of a tRNA, with the anticodon loop and the 3’ and 5’ ends labeled]


Page 8:

Why?

• RNAs fold, and function

• Nature uses what works

Page 9:

Importance

• Ribozymes (RNA Enzymes)

• Retroviruses

• Effects on transcription, translation, splicing...

• Functional RNAs: rRNA, tRNA, snRNA, snoRNA, micro RNA, RNAi, riboswitches, regulatory elements in 3’ & 5’ UTRs, ...

Page 10:

[Figure: drawing of an example RNA secondary structure; the individual nucleotide labels (A, C, G, U) cannot be recovered as a sequence from the transcript]

Page 11:

RNA Pairing

• Watson-Crick Pairing

• C - G ~ 3 kcal/mole

• A - U ~ 2 kcal/mole

• “Wobble Pair” G - U ~ 1 kcal/mole

• Non-canonical Pairs (esp. if modified)

Page 12:

Definitions

• Sequence 5’ $r_1 r_2 r_3 \ldots r_n$ 3’ in {A, C, G, U}

• A Secondary Structure is a set of pairs i•j s.t.

1. i < j-4

2. if i•j & i’•j’ are two pairs with i ≤ i’, then

A. i = i’ & j = j’, or

B. j < i’, or

C. i < i’ < j’ < j

(B, C: the first pair precedes the 2nd, or the 2nd is nested within the first. No “pseudoknots.”)
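
The two conditions in the definition can be checked mechanically. The following is a minimal sketch, not part of the lecture; the function name `is_secondary_structure` and the representation of a structure as a collection of `(i, j)` index tuples are assumptions made for illustration.

```python
def is_secondary_structure(pairs, min_gap=4):
    """Check conditions 1 and 2 of the definition for a collection of (i, j) pairs."""
    pairs = sorted(set(pairs))  # duplicates are case A; sorting gives i <= i'
    # Condition 1: every pair satisfies i < j - 4.
    for i, j in pairs:
        if not i < j - min_gap:
            return False
    # Condition 2: any two distinct pairs must be "precedes" (B) or "nested" (C).
    for a in range(len(pairs)):
        i, j = pairs[a]
        for b in range(a + 1, len(pairs)):
            ip, jp = pairs[b]
            precedes = j < ip
            nested = i < ip < jp < j
            if not (precedes or nested):
                return False  # shared endpoint or crossing pairs (pseudoknot)
    return True
```

With this check, `[(1, 20), (2, 19)]` (nested) and `[(1, 10), (11, 20)]` (precedes) are accepted, while the crossing pairs `[(1, 12), (6, 20)]` are rejected as a pseudoknot.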

Page 13:

[Figure: examples of the three pair arrangements, labeled Nested, Precedes, and Pseudoknot]

Page 14:

A Pseudoknot

[Figure: text drawing of a pseudoknot; the two-dimensional layout was lost in extraction. Strand fragments in reading order: A-C; 3’ - A-G-G-C-U U; U-C-C-G-A-G-G-G; C-C-C - 5’; U-C-U-C]

Page 15:

Approaches to Structure Prediction

• Maximum Pairing
+ works on single sequences
+ simple
- too inaccurate

• Minimum Energy
+ works on single sequences
- ignores pseudoknots
- only finds “optimal” fold

• Partition Function
+ finds all folds
- ignores pseudoknots

Page 16:

Approaches, II

• Comparative sequence analysis
+ handles all pairings (incl. pseudoknots)
- requires several (many?) aligned, appropriately diverged sequences

• Stochastic Context-free Grammars
Roughly combines min energy & comparative, but no pseudoknots

• Physical experiments (x-ray crystallography, NMR)

Page 17:

Nussinov: Max Pairing

• B(i,j) = # pairs in optimal pairing of $r_i \ldots r_j$

• B(i,j) = 0 for all i, j with i ≥ j-4; otherwise

• B(i,j) = max of:

1. B(i+1,j)

2. B(i,j-1)

3. B(i+1,j-1) + (if $r_i$ pairs with $r_j$ then 1 else 0)

4. max { B(i,k)+B(k+1,j) | i < k < j }

Time: $O(n^3)$
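
The recurrence above can be filled in bottom-up by increasing span j-i. This is a minimal sketch under stated assumptions: indices are 0-based, only the Watson-Crick and G-U wobble pairs count as pairing, and the names `can_pair` and `nussinov` are illustrative rather than taken from the lecture.

```python
def can_pair(x, y):
    """True for the Watson-Crick pairs and the G-U wobble pair."""
    return {x, y} in ({"A", "U"}, {"C", "G"}, {"G", "U"})


def nussinov(seq):
    """Return B(0, n-1): the maximum number of pairs in seq."""
    n = len(seq)
    B = [[0] * n for _ in range(n)]      # B(i,j) = 0 whenever i >= j-4
    for span in range(5, n):             # span = j - i; fill short spans first
        for i in range(n - span):
            j = i + span
            best = max(
                B[i + 1][j],                                          # case 1
                B[i][j - 1],                                          # case 2
                B[i + 1][j - 1] + (1 if can_pair(seq[i], seq[j]) else 0),  # case 3
            )
            for k in range(i + 1, j):                                 # case 4
                best = max(best, B[i][k] + B[k + 1][j])
            B[i][j] = best
    return B[0][n - 1] if n else 0
```

The three nested loops over span, i, and k give the cubic running time stated on the slide. Recovering the pairs themselves would need a traceback through `B`, which this sketch omits.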

Page 18:

“Optimal pairing of $r_i \ldots r_j$”: several (overlapping, but exhaustive) possibilities

1. $r_i$ is unpaired; look at best way to pair $r_{i+1} \ldots r_j$

2. $r_j$ is unpaired; look at best way to pair $r_i \ldots r_{j-1}$

3. They pair with each other, so 1 + best $r_{i+1} \ldots r_{j-1}$

4. They pair, but not to each other; i pairs with k for some i < k < j; so look at best $r_i \ldots r_k$ + best $r_{k+1} \ldots r_j$ (don’t need to look at other k; why?)

[Figure: diagrams of the four cases, labeled with positions i, i+1, k, k+1, j-1, and j]

Page 19:

Pair-based Energy Minimization

• E(i,j) = energy of pairs in optimal pairing of $r_i \ldots r_j$

• E(i,j) = 0 for all i, j with i ≥ j-4; otherwise

• E(i,j) = min of:

• E(i+1,j)

• E(i,j-1)

• E(i+1,j-1) + e($r_i$, $r_j$), where e($r_i$, $r_j$) is the energy of one pair

• min { E(i,k)+E(k+1,j) | i < k < j }

Time: $O(n^3)$
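
The same table-filling scheme works with energies in place of pair counts. A minimal sketch, assuming that a span too short to contain a pair has energy 0, that the rough magnitudes from the RNA Pairing slide are used as negative (stabilizing) values for e, and that bases which cannot pair get infinite energy; the names `E_PAIR`, `e`, and `min_pair_energy` are hypothetical.

```python
import math

# Per-pair energies in kcal/mole (sign convention is an assumption).
E_PAIR = {frozenset("CG"): -3.0, frozenset("AU"): -2.0, frozenset("GU"): -1.0}


def e(x, y):
    """Energy of one pair; infinite if x and y cannot pair."""
    return E_PAIR.get(frozenset((x, y)), math.inf)


def min_pair_energy(seq):
    """Return E(0, n-1): the minimum total pair energy for seq."""
    n = len(seq)
    E = [[0.0] * n for _ in range(n)]    # spans with i >= j-4 contribute 0
    for span in range(5, n):
        for i in range(n - span):
            j = i + span
            best = min(
                E[i + 1][j],
                E[i][j - 1],
                E[i + 1][j - 1] + e(seq[i], seq[j]),
            )
            for k in range(i + 1, j):
                best = min(best, E[i][k] + E[k + 1][j])
            E[i][j] = best
    return E[0][n - 1] if n else 0.0
```

Because leaving a base unpaired always costs 0 in this model, the result is never positive.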

Page 20:

Loop-based Energy Minimization

• Detailed experiments show it’s more accurate to model based on loops, rather than just pairs

• Loop types

1. Hairpin loop

2. Stack

3. Bulge

4. Interior loop

5. Multiloop

[Figure: secondary structure drawing with the five loop types marked 1 through 5]

Page 21:

Base Pairs and Stacking

[Figure: bases labeled thymine, cytosine, adenine, uracil, and guanine]

Page 22:
Page 23:

Loop Examples

Page 24:

Zuker: Loop-based Energy, I

• W(i,j) = energy of optimal pairing of $r_i \ldots r_j$

• V(i,j) = as above, but forcing pair i•j

• W(i,j) = V(i,j) = ∞ for all i, j with i ≥ j-4

• W(i,j) = min(W(i+1,j), W(i,j-1), V(i,j), min { W(i,k)+W(k+1,j) | i < k < j } )

Page 25:

Zuker: Loop-based Energy, II

• V(i,j) = min(eh(i,j), es(i,j)+V(i+1,j-1), VBI(i,j), VM(i,j))

• VM(i,j) = min { W(i+1,k)+W(k+1,j-1) | i < k < j-1 }

• VBI(i,j) = min { ebi(i,j,i’,j’) + V(i’, j’) | i < i’ < j’ < j & i’-i+j-j’ > 2 }

The four terms of V(i,j) are, in order: hairpin, stack, bulge/interior, multi-loop.

Time: $O(n^4)$; $O(n^3)$ possible if ebi(.) is “nice”
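
The W/V recurrences can be sketched with memoized recursion. This is an illustration under loud assumptions, not Zuker's parameterization: the energy functions `eh`, `es`, and `ebi` return made-up toy constants, the flat multiloop penalty `MULTI` is an addition not present on the slides, and the multiloop term splits only the interior i+1 ... j-1 of the forced pair. With W = ∞ on short spans, W here is the best energy among folds containing at least one pair. The recursion is only suitable for short sequences.

```python
import math
from functools import lru_cache

PAIRS = {"AU", "UA", "CG", "GC", "GU", "UG"}


# Toy energy terms (hypothetical values, not measured parameters).
def eh(i, j):            # hairpin loop closed by pair i.j
    return 4.0


def es(i, j):            # pair i.j stacked on pair (i+1).(j-1)
    return -2.0


def ebi(i, j, ip, jp):   # bulge/interior loop between pairs i.j and ip.jp
    return 3.0 + 0.5 * ((ip - i - 1) + (j - jp - 1))


MULTI = 5.0              # flat multiloop penalty (an added assumption)


def zuker_energy(seq):
    """Return W(0, n-1); math.inf means no fold with a pair exists."""
    n = len(seq)

    @lru_cache(maxsize=None)
    def V(i, j):
        if i >= j - 4 or seq[i] + seq[j] not in PAIRS:
            return math.inf
        best = min(eh(i, j), es(i, j) + V(i + 1, j - 1))
        for ip in range(i + 1, j):                  # VBI: bulge/interior
            for jp in range(ip + 5, j):
                if (ip - i) + (j - jp) > 2:
                    best = min(best, ebi(i, j, ip, jp) + V(ip, jp))
        for k in range(i + 1, j - 1):               # VM: multiloop
            best = min(best, MULTI + W(i + 1, k) + W(k + 1, j - 1))
        return best

    @lru_cache(maxsize=None)
    def W(i, j):
        if i >= j - 4:
            return math.inf
        best = min(W(i + 1, j), W(i, j - 1), V(i, j))
        for k in range(i + 1, j):
            best = min(best, W(i, k) + W(k + 1, j))
        return best

    return W(0, n - 1) if n else math.inf
```

The double loop over (i’, j’) inside V is what makes the overall running time quartic; restricting or restructuring that search is where the cubic variant comes from.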

Page 26:

Suboptimal Energy

• There are always alternate folds with near-optimal energies. Thermodynamics predicts that populations of identical molecules will exist in different folds; individual molecules even flicker among different folds

• Zuker’s algorithm can be modified to find suboptimal folds

• McCaskill gives a more elaborate dynamic programming algorithm calculating the “partition function,” which defines the probability distribution over all these states.
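
To make the role of the partition function concrete: if the energies of a set of folds were listed explicitly, each fold's probability would be its Boltzmann weight divided by the sum of all weights. The sketch below is not McCaskill's dynamic program (which avoids enumerating folds); it only shows the distribution being defined. The function name, the example energies, and the default temperature of 310.15 K are assumptions.

```python
import math

R = 1.987e-3  # gas constant in kcal/(mol*K)


def boltzmann_probabilities(energies, temperature=310.15):
    """Map fold energies (kcal/mole) to probabilities exp(-E/RT) / Z."""
    rt = R * temperature
    weights = [math.exp(-energy / rt) for energy in energies]
    z = sum(weights)                 # the partition function over the listed folds
    return [w / z for w in weights]


# Hypothetical energies for an optimal fold and two near-optimal alternatives.
probs = boltzmann_probabilities([-10.0, -9.5, -9.0])
```

Folds with energies close to the optimum keep a non-negligible share of the probability, which is the point of the first bullet above.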

Page 27:

Two competing secondary structures for the Leptomonas collosoma spliced leader mRNA.

Page 28:

Example of suboptimal folding

Black dots: pairs in opt fold

Colored dots: pairs in folds 2-5% worse than optimal fold

Page 29:

A “Mountain” diagram