RNA Pseudoknots

RNA structure prediction and comparison

RNA Secondary Structures

■ Nesting convention: for all base pairs $(i,j)$ and $(k,l)$ with $i < k$:

  $i < k < l < j$ or $i < j < k < l$

■ Prerequisite for Dynamic Programming
■ Pseudoknot: $i < k < j < l$
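
The nesting convention above can be checked mechanically. The following is a minimal sketch, assuming a structure is given as a list of base pairs over 1-based positions; the function name `classify_pairs` and the example structures are illustrative and not part of the original slides.

```python
from itertools import combinations

def classify_pairs(pairs):
    """Return 'nested' if no two base pairs cross, else 'pseudoknotted'."""
    # Normalise every pair to (i, j) with i < j and sort by opening position,
    # so that for any two pairs taken in order we have i <= k.
    norm = sorted((min(p), max(p)) for p in pairs)
    for (i, j), (k, l) in combinations(norm, 2):
        # Crossing condition from the slide: i < k < j < l.
        if i < k < j < l:
            return "pseudoknotted"
    return "nested"

hairpin = [(1, 10), (2, 9), (3, 8)]            # all pairs nested
h_type = [(1, 10), (2, 9), (6, 15), (7, 14)]   # second helix crosses the first

print(classify_pairs(hairpin))
print(classify_pairs(h_type))
```

The pairwise test is quadratic in the number of base pairs, which is adequate for illustration.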

Overview

■ Pseudoknot topologies
■ Pseudoknot examples
■ Pseudoknot algorithms:
• PKNOTS
• pknotsRG

Pseudoknot topologies

■ H-type pseudoknots
■ Simple recursive pseudoknot
■ Planar pseudoknots
■ Complex non-planar knots

H-type pseudoknot and simple recursive pseudoknot

Planar pseudoknots

■ Bi-secondary structures
■ Superposition of two disjoint secondary structures (without knots)
■ Allows for chained pseudoknots
■ Book thickness (page number) = 2
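
Whether a structure is bi-secondary can be tested by trying to distribute its base pairs over two pages so that no two pairs on the same page cross. The sketch below does this by 2-colouring the conflict graph (one node per base pair, one edge per crossing) with a breadth-first search. The function names `crosses` and `two_page_assignment` are assumptions made for this illustration and do not come from the slides.

```python
from collections import deque
from itertools import combinations

def crosses(p, q):
    """True if base pairs p and q cross (i < k < j < l after ordering)."""
    (i, j), (k, l) = sorted([p, q])
    return i < k < j < l

def two_page_assignment(pairs):
    """Return a page (0 or 1) for every base pair, or None if two pages
    are not enough, i.e. the structure is not bi-secondary."""
    pairs = [tuple(sorted(p)) for p in pairs]
    n = len(pairs)
    # Conflict graph: an edge joins two base pairs that cross.
    adj = [[] for _ in range(n)]
    for a, b in combinations(range(n), 2):
        if crosses(pairs[a], pairs[b]):
            adj[a].append(b)
            adj[b].append(a)
    page = [None] * n
    for start in range(n):
        if page[start] is not None:
            continue
        page[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if page[v] is None:
                    page[v] = 1 - page[u]   # crossing pairs go on opposite pages
                    queue.append(v)
                elif page[v] == page[u]:
                    return None             # odd cycle: needs a third page
    return page

h_type = [(1, 10), (2, 9), (6, 15), (7, 14)]
three_way = [(1, 4), (2, 5), (3, 6)]        # three mutually crossing pairs

print(two_page_assignment(h_type))
print(two_page_assignment(three_way))
```

The H-type example fits on two pages, while three mutually crossing pairs form a triangle in the conflict graph and therefore need a book thickness greater than 2.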

Non-planar pseudoknots

■ Crossing lines in the plane
■ Book thickness > 2
■ Few known biological examples
■ In general, this class does not coincide with the classes that prediction algorithms can handle

Pseudoknotted basepairs in vivo

RNA              Average # bp   % pseudoknot
RNase P          116            14.4
Group I intron   96             6.2
SRP RNA          69             1.9
SSU rRNA         402            1.4
tRNA             21             0

Mathews et al. (1999) JMB 288

Biological functions by example

■ Local pseudoknots on mRNA:
• Signals for frameshifting, readthrough
• Replication control
■ Ribozymes (catalytically active pseudoknots):
• Hepatitis delta virus
• Group I introns (self-splicing)
■ Telomerase RNA

-1 Ribosomal Frameshifting

Alam et al. (1999) PNAS 96

The Torsional Resistance Model

Hepatitis delta virus ribozyme

■ Circular genome
■ Rolling-circle replication → multimeric genomes → self-cleavage into genome-length pieces
■ Fastest known self-cleaving ribozyme

HDV ribozyme

Ke et al. (2004) Nature 429

Computational prediction

■ Proof: “General pseudoknot prediction in energy-based models is NP-complete”
■ Solutions:
• Sacrifice optimality: heuristics (e.g. genetic algorithms, stochastic simulations)
• Restrict energy model: maximum weighted matching (MWM)
• Restrict class of predictable pseudoknots

Algorithm: PKNOTS

■ First DP algorithm for pseudoknot prediction
■ Extension of standard energy-based folding algorithms
■ Allows for a wide class of pseudoknots
■ Best known and widely used
■ Rivas & Eddy, 1999

Graphical notation of recursions

■ wx: best folding between positions i and j
■ vx: best folding between positions i and j, given that i and j pair
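
As a toy illustration of how the two tables relate, the sketch below fills wx and vx for a pseudoknot-free fold. It is not the PKNOTS recursion: it maximizes the number of base pairs instead of minimizing free energy, it has no gap matrices, and the names `can_pair`, `fold` and `min_loop` are assumptions introduced here.

```python
def can_pair(a, b):
    """Watson-Crick and G-U wobble pairs."""
    return a + b in {"AU", "UA", "GC", "CG", "GU", "UG"}

def fold(seq, min_loop=3):
    """Fill wx and vx for seq (0-based, inclusive indices).

    wx[i][j]: maximum number of base pairs in seq[i..j]
    vx[i][j]: the same, given that i and j pair (-inf if they cannot)
    """
    n = len(seq)
    neg_inf = float("-inf")
    wx = [[0] * n for _ in range(n)]
    vx = [[neg_inf] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            if can_pair(seq[i], seq[j]):
                # i and j pair; everything strictly inside folds independently.
                vx[i][j] = 1 + wx[i + 1][j - 1]
            # wx: i and j pair, or i unpaired, or j unpaired, or a bifurcation.
            best = max(vx[i][j], wx[i + 1][j], wx[i][j - 1])
            for k in range(i + 1, j):
                best = max(best, wx[i][k] + wx[k + 1][j])
            wx[i][j] = best
    return wx, vx

wx, vx = fold("GGGAAAUCCC")
print(wx[0][9], vx[0][9])
```

In an energy-based algorithm the `max` becomes a `min` over loop energies, but the dependency of wx on vx stays the same.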

Graphical notation of recursions (continued)

[Figure: graphical recursion diagrams for wx and vx]

Pseudoknot construction with gap matrices

Gap matrices

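PKNOTS recursions (part 1)

[Figure: recursion diagrams for wx and vx]

PKNOTS recursions (part 2)

[Figure: recursion diagrams for the gap matrices vhx and yhx]

PKNOTS recursions (part 3)

[Figure: recursion diagrams for the gap matrices zhx and whx]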

PKNOTS and non-planar knots

■ Solvable: [figure]
■ Not solvable: [figure]

Conclusion: PKNOTS

■ Ambiguous!
■ Analysis: $O(n^4)$ space and $O(n^6)$ time
■ Limit: ~150 bases
■ How complex can the folds of sequences shorter than 150 bases be?
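
To see why the practical limit sits at roughly 150 bases, the asymptotic bounds can be turned into rough numbers. This is an order-of-magnitude sketch only: it ignores constant factors, and the 4 bytes per table entry as well as the names `pknots_cost` and `bytes_per_entry` are assumptions made for this illustration.

```python
def pknots_cost(n, bytes_per_entry=4):
    """Rough size of an n^4 table and an n^6 step count (constants ignored)."""
    entries = n ** 4                 # O(n^4) space
    steps = n ** 6                   # O(n^6) time
    return entries, entries * bytes_per_entry, steps

for n in (50, 100, 150, 300):
    entries, table_bytes, steps = pknots_cost(n)
    print(f"n={n:4d}  entries={entries:.2e}  "
          f"memory={table_bytes / 1e9:.2f} GB  steps={steps:.2e}")
```

Under these assumptions, doubling the sequence length multiplies the memory by 16 and the step count by 64.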

pknotsRG

■ Idea: improve runtime by restricting the class of pseudoknots
■ Dynamic programming algorithm
■ Implements a newer energy model

Simple (recursive) pseudoknot

■ Two helices (a-a’, b-b’) and three loops (u, v, w)
■ Recursive: (pseudoknotted) structures in loops possible

Boundaries of a pseudoknot

■ 8 moving boundaries: $O(n^8)$ time

Canonization Rule 1

■ $|a| = |a'|$ and $|b| = |b'|$
■ $f = l - (e - i)$
■ $h = j - (g - k)$
• Implies no bulges in pseudoknot stems
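
Rule 1 removes two of the eight boundaries, because f and h follow from the others. The sketch below assumes half-open segments, so that helix a has length e - i, a’ has length l - f, b has length g - k and b’ has length j - h. That convention and the function name `partner_boundaries` are assumptions, since the slides do not spell them out.

```python
def partner_boundaries(i, e, k, g, l, j):
    """Derive f and h from Rule 1 (equal helix lengths, no bulges)."""
    f = l - (e - i)   # |a'| = l - f must equal |a| = e - i
    h = j - (g - k)   # |b'| = j - h must equal |b| = g - k
    # Sanity check: the segments a, u, b, v, a', w, b' must appear in this order.
    if not (i < e <= k < g <= f < l <= h < j):
        raise ValueError("boundaries do not describe a simple pseudoknot")
    return f, h

print(partner_boundaries(i=0, e=3, k=5, g=8, l=12, j=16))
```

In the example both helices have length 3, and the loops u, v and w have lengths 2, 1 and 1.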

Canonization Rule 2

■ Helices a-a’ and b-b’ have maximal extent
■ maxhel(i, j): length of maximal helix from i to j
■ $e = i + \mathrm{maxhel}(i, l)$
■ $g = k + \mathrm{maxhel}(k, j)$
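
A table of maximal helix lengths can be precomputed in quadratic time. The sketch below is one possible reading of maxhel, assuming 0-based inclusive indices, uninterrupted stacking inward from the pair (i, j), and a minimum hairpin loop of three bases. The names `maxhel_table`, `PAIRS` and `min_loop` are illustrative and are not taken from pknotsRG.

```python
PAIRS = {"AU", "UA", "GC", "CG", "GU", "UG"}

def maxhel_table(seq, min_loop=3):
    """maxhel[i][j]: number of stacked pairs (i, j), (i+1, j-1), ...
    that can be formed without interruption, 0 if i and j cannot pair."""
    n = len(seq)
    maxhel = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            if seq[i] + seq[j] in PAIRS:
                # Extend the helix that starts one position further inward.
                maxhel[i][j] = 1 + maxhel[i + 1][j - 1]
    return maxhel

seq = "GGGAAAUCCC"
maxhel = maxhel_table(seq)
i, l = 0, 9
e = i + maxhel[i][l]     # Rule 2: first position after helix a
print(maxhel[i][l], e)
```

With such a table, each lookup of e and g in Rule 2 costs constant time.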

Canonization Rule 3

■ Resolve possible overlap of maximal helices.
■ For each (i, j), two moving boundaries (k, l) are left
■ $O(n^4)$ time and $O(n^2)$ space
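
The effect of canonization on the search space can be illustrated by counting how many boundary combinations remain. The sketch below counts strictly increasing position tuples only. It is a rough proxy for the work per sequence and not a runtime measurement, and the name `boundary_choices` is an assumption.

```python
from math import comb

def boundary_choices(n, moving):
    """Number of ways to pick `moving` strictly increasing positions out of n."""
    return comb(n, moving)

n = 100
unrestricted = boundary_choices(n, 8)   # 8 moving boundaries, before canonization
canonical = boundary_choices(n, 4)      # i, k, l, j after Rules 1 to 3
print(unrestricted, canonical, unrestricted // canonical)
```

The counts grow like $n^8$ and $n^4$ respectively, which mirrors the time bounds stated on the slides.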

Conclusion: pknotsRG

■ Limit raised to over 800 nucleotides
■ Predicts many known biological structures

Pseudoknots: open problems

■ More pseudoknots need to be discovered
■ Identify relevant classes
■ Almost no knowledge about energy parameters