Progress toward Predicting Viral RNA Structure from Sequence: How Parallel Computing can Help Solve the RNA Folding Problem Susan J. Schroeder University of Oklahoma October 7, 2008


Page 1: Progress toward Predicting Viral RNA Structure from Sequence: How Parallel Computing can Help Solve the RNA Folding Problem Susan J. Schroeder University

Progress toward Predicting Viral RNA Structure from Sequence:

How Parallel Computing can Help Solve the RNA Folding Problem

Susan J. Schroeder

University of Oklahoma

October 7, 2008

Page 2

“We finished the genome map, now we can’t figure out how to fold it!”

Science (1989) 243, p. 786

Page 3

[Figure: chemical structure of an RNA strand, with atom-numbered diagrams of the bases G, C, U, and A]

RNA

Page 4

Sequence Structure Function

Guanine Riboswitch

Batey, R. et al., 2004, Nature vol. 432, p. 412

5’GGACAUAUAAUCGCGUGGAUAUGGCACGCAAGUUUCUACCGGGCACCGUAAAUGUCCGACUAUGUCCA

5’GCGGAUUUAG2MCUCAGUDHUDHGGGAGAGCGM2CCAGAC0MUGOMAAGYAUPSC5MUGGAGG7MUCC5MUGUGU5MUPSCGA1MUCCACAGAAUUCGACCA

tRNA

Page 5

RNA Folding Problem

• Folding a polymer with negative charge

• Watson-Crick base pairing

• Hierarchical folding: 1° → 2° → 3°

Figure from Dill & Chan (1997) Nat. Struct. Biol. vol. 4, pp. 10-19

Page 6

AAUUGCGGGAAAGGGGUCAACAGCCGUUCAGUACCAAGUCUCAGGGGAAACUUUGAGAUGGCCUUGCAAAGGGUAUGGUAAUAAGCUGACGGACAUGGUCCUAACCACGCAGCCAAGUCCUAAGUCAACAGAUCUUCUGUUGAUAUGGAUGCAGUUCA

[Figure: the sequence above shown as primary (1°), secondary (2°), and tertiary (3°) structure, with helices P4, P5, P5a, P5b, P5c, P6, P6a, and P6b labeled and nucleotides numbered 120 to 260]

Cate et al. 1996 Science 273:1678

Waring & Davies 1984 Gene 28:277

Page 7

Nussinov & Jacobson 1980 PNAS v 77, no. 11, p 6310 Fig. 1

RNA Secondary Structure has Helices and Loops

Page 8

An example of phylogenetic alignment and structure prediction for RNase P

Frank et al. 2000 RNA v 6 p1895

Page 9

How Dynamic Programming Algorithms Calculate RNA Secondary Structure

• Stochastic context-free grammar defines possible base pairs as 1 of 4 possible cases

• Recursion statement finds maximum value for each small subset of RNA sequence

• Fill an array with scores for each substructure

• Traceback through the array to find the lowest free energy structure

• O(N²) memory storage

• O(N³) runtime
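The fill and traceback steps listed above can be sketched in a few lines. This is only a minimal Nussinov-style illustration that maximizes the number of base pairs instead of minimizing free energy, so it is not the energy model used by production folding programs. The function names, the `MIN_LOOP` value, and the set of allowed pairs are assumptions made for the example.

```python
# Assumed pairing rules: Watson-Crick pairs plus G-U wobble pairs.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
MIN_LOOP = 3  # assumed minimum number of unpaired bases inside a hairpin


def fill(seq):
    """Fill an N x N array of best pair counts: O(N^2) memory, O(N^3) time."""
    n = len(seq)
    score = [[0] * n for _ in range(n)]
    for span in range(MIN_LOOP + 1, n):
        for i in range(n - span):
            j = i + span
            # Cases 1 and 2: nucleotide i or nucleotide j is unpaired.
            best = max(score[i + 1][j], score[i][j - 1])
            # Case 3: i pairs with j.
            if (seq[i], seq[j]) in PAIRS:
                best = max(best, score[i + 1][j - 1] + 1)
            # Case 4: the interval splits into two independent substructures.
            for k in range(i + 1, j):
                best = max(best, score[i][k] + score[k + 1][j])
            score[i][j] = best
    return score


def traceback(seq, score):
    """Walk back through the array and emit one optimal dot-bracket string."""
    n = len(seq)
    structure = ["."] * n
    stack = [(0, n - 1)] if n else []
    while stack:
        i, j = stack.pop()
        if j - i <= MIN_LOOP:
            continue  # interval too short to hold a pair
        if score[i][j] == score[i + 1][j]:
            stack.append((i + 1, j))
        elif score[i][j] == score[i][j - 1]:
            stack.append((i, j - 1))
        elif (seq[i], seq[j]) in PAIRS and score[i][j] == score[i + 1][j - 1] + 1:
            structure[i], structure[j] = "(", ")"
            stack.append((i + 1, j - 1))
        else:
            for k in range(i + 1, j):
                if score[i][j] == score[i][k] + score[k + 1][j]:
                    stack.append((i, k))
                    stack.append((k + 1, j))
                    break
    return "".join(structure)


def fold(seq):
    """Return one structure with the maximum number of base pairs."""
    return traceback(seq, fill(seq))
```

Calling `fold("GGGAAAUCCC")` returns a dot-bracket string for a small hairpin. The triple loop in `fill` is where the O(N³) runtime comes from, and the `score` array is the O(N²) storage.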

Page 10

Websites for Folding Algorithms to Predict RNA 2°

MFOLD http://www.bioinfo.rpi.edu/applications/mfold

Vienna package http://www.tbi.univie.ac.at/~ivo/RNA

RNAStructure http://rna.urmc.rochester.edu

SFOLD http://sfold.wadsworth.org

PKNOTS http://selab.wustl.edu

STAR4.4 http://biology.leidenuniv.nl/~batenburg/STRAbout.html

Page 11

Wuchty 2003, Nucl. Acids Res. v 31, p 1115 Fig. 7

Gruber et al. 2008, Nucl. Acids Res. v 36, p. W73 Fig. 1

tree graph merged landscape

dots & parentheses

Representations of RNA Secondary Structure
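As a small illustration of how two of these representations relate, the sketch below converts a dots-and-parentheses string into a pair table and then into a nested tree of base pairs. The function names and the tuple layout `(i, j, children)` are assumptions chosen for the example, not a format defined by any of the cited programs.

```python
def pair_table(dot_bracket):
    """Return partner[i] = index paired with i, or -1 if i is unpaired."""
    partner = [-1] * len(dot_bracket)
    open_positions = []
    for i, symbol in enumerate(dot_bracket):
        if symbol == "(":
            open_positions.append(i)
        elif symbol == ")":
            if not open_positions:
                raise ValueError(f"unmatched ')' at position {i}")
            j = open_positions.pop()
            partner[i], partner[j] = j, i
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r} at position {i}")
    if open_positions:
        raise ValueError(f"unmatched '(' at position {open_positions[-1]}")
    return partner


def to_tree(dot_bracket):
    """Return the base pairs as nested (i, j, children) tuples."""
    partner = pair_table(dot_bracket)

    def children(start, end):
        nodes = []
        i = start
        while i < end:
            j = partner[i]
            if j > i:
                # Pair (i, j) encloses everything strictly between i and j.
                nodes.append((i, j, children(i + 1, j)))
                i = j + 1
            else:
                i += 1  # unpaired nucleotide: not a node in this tree
        return nodes

    return children(0, len(dot_bracket))
```

For example, `to_tree("((..))")` yields one outer pair that contains one inner pair, which is the same nesting a tree graph drawing would show.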

Page 12

RNAStructure Predicts Secondary Structure Well

RNA                 Lowest ΔG° Structure    Best Suboptimal Structure
average             73%                     87%
Group II introns    88%                     94%
tRNA                87%                     97%
5S rRNA             74%                     96%
Group I introns     69%                     84%
SRP RNA             66%                     88%
RNase P             63%                     76%
23S rRNA            55%                     61%
  (as domains)      (74%)                   (88%)
16S rRNA            44%                     54%
  (as domains)      (61%)                   (76%)

Mathews et al. (2004) Proc. Natl. Acad. Sci. vol. 101, pp. 7287-7292

Page 13

Mathews, 2006, J. Mol. Biol. V359 p. 528 Fig. 2

Mfold and RNAStructure Sample Suboptimal Structures

Page 14

How Wuchty’s Algorithm is like a Tree

Page 15

STMV RNA folding problem

• Crystal structure to 1.8 Å resolution (Larson et al., 1998)

• 59% of the 1,058 RNA nucleotides are in helices

• RNA is icosahedrally averaged

• Identity of nucleotides in helices remains obscure

• Structure of 41% of the RNA remains unknown

Figure reproduced from VIPER website, Reddy et al. 2001

Page 16

Current model for STMV RNA

Larson, S. & McPherson, A. Current Opinion in Structural Biology, vol. 11, p. 61.

Page 17

Nussinov algorithms for maximizing matches and blocks

Nussinov et al. 1978 SIAM v35, p. 71,78 Fig. 1, 5

Page 18

Combinatorial Search of STMV RNA

• Locate potential helical structures between pairing bases i and j

• Assemble non-overlapping potential helices (i,j) with (p>j,q) or (k>i+l, k+l<q<j)

• Nested searching identifies “helices within helices”

• Over 144,000 perfect 6-pair helices, but no possible simultaneous combination of 30 helices in STMV RNA
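The first two search steps can be sketched as follows. This is an illustrative reading of the slide, not the search code used in the talk: the names `find_helices` and `compatible`, the inclusion of G-U pairs among "perfect" pairs, and the minimum loop size of 3 are all assumptions.

```python
# Assumed pairing rules: Watson-Crick pairs plus G-U wobble pairs.
CANONICAL = {"AU", "UA", "GC", "CG", "GU", "UG"}


def find_helices(seq, length=6, min_loop=3):
    """Return (i, j) for every run of `length` stacked pairs.

    A helix (i, j) pairs nucleotide i+t with nucleotide j-t for
    t = 0 .. length-1, leaving at least `min_loop` bases inside.
    """
    n = len(seq)
    helices = []
    for i in range(n):
        for j in range(i + 2 * length + min_loop - 1, n):
            if all(seq[i + t] + seq[j - t] in CANONICAL for t in range(length)):
                helices.append((i, j))
    return helices


def compatible(a, b, length=6):
    """True if two helices are side by side or nested, never overlapping."""
    (i, j), (p, q) = sorted([a, b])
    if p > j:
        return True  # second helix lies entirely 3' of the first
    # Otherwise the second helix must sit inside the loop closed by the first.
    return p >= i + length and q <= j - length
```

A combinatorial search would then look for large sets of helices that are pairwise `compatible`. The number of candidate sets grows very quickly with sequence length, which is what makes the search expensive.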

Page 19

Many more possible structures contain 30 imperfect helices in the STMV sequence

Page 20

Chemical modification data restrains possible base pairing

[Figure: atom-numbered U and A bases, with the duplex below repeated six times]

5’AAGAUGUAAACCAGGA3’

3’CUGCA AA GGUCCU5’

Page 21

Including Results from Chemical Probing Improves Secondary Structure Prediction

Mathews et al. (2004) Proc. Natl. Acad. Sci. 101, 7290 Figure 1

LOWEST 26.3%, BEST 86.8%

Folded with constraints from in vivo chemical modification:

LOWEST 86.8%, BEST 97.4%

E. coli 5S rRNA

Page 22

3 Restraints Can Change the Lowest Energy Fold

Native: -341.1 kcal/mol

A527, A532, A537 restrained to be single stranded: -341.0 kcal/mol
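One way to see how a few single-stranded restraints can change the predicted fold is to add them to a toy dynamic programming recursion. The sketch below counts base pairs rather than computing free energies in kcal/mol, so it cannot reproduce the numbers on this slide. The function name `max_pairs`, the `unpaired` argument, and the pairing rules are assumptions for illustration.

```python
# Assumed pairing rules: Watson-Crick pairs plus G-U wobble pairs.
CANONICAL = {"AU", "UA", "GC", "CG", "GU", "UG"}


def max_pairs(seq, unpaired=frozenset(), min_loop=3):
    """Maximum number of nested base pairs, with some positions forced unpaired.

    `unpaired` holds 0-based positions that chemical modification data
    indicate are single stranded; those positions may not pair.
    """
    n = len(seq)
    if n == 0:
        return 0
    score = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = max(score[i + 1][j], score[i][j - 1])
            can_pair = (
                i not in unpaired
                and j not in unpaired
                and seq[i] + seq[j] in CANONICAL
            )
            if can_pair:
                best = max(best, score[i + 1][j - 1] + 1)
            for k in range(i + 1, j):
                best = max(best, score[i][k] + score[k + 1][j])
            score[i][j] = best
    return score[0][n - 1]
```

Comparing `max_pairs(seq)` with `max_pairs(seq, unpaired={...})` shows how the best achievable score drops, or how a different structure becomes optimal, once modified nucleotides are excluded from pairing.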

Page 23

[Figure: landscape diagram with labels ∆G MFE, Wuchty, Zuker, Chemical data, Future data, # Helices>1mm, # Helices,1 mm, and Goal]

Free Energy Landscape of STMV RNA

Page 24

How can Parallel Computing Help Solve the RNA Folding Problem?

• Utilize tree structure of RNA secondary structure prediction

• Expand range of free energies that can be computed for an RNA free energy landscape

• Explore more possible RNA structures
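A rough sketch of the first bullet: the choices for the first nucleotide (unpaired, or paired with some partner k) form independent branches of the structure tree, so each branch can be handed to a separate worker. The example below counts all nested structures of a short sequence that way. The function names and pairing rules are assumptions, and Python threads are used only to keep the sketch portable; a real speedup would need processes or a cluster framework such as MPI.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import lru_cache

# Assumed pairing rules: Watson-Crick pairs plus G-U wobble pairs.
CANONICAL = {"AU", "UA", "GC", "CG", "GU", "UG"}
MIN_LOOP = 3  # assumed minimum number of unpaired bases inside a hairpin


def count_structures(seq):
    """Count every nested structure of seq, including the open chain."""

    @lru_cache(maxsize=None)
    def count(i, j):
        if j - i < MIN_LOOP + 1:
            return 1  # too short for any pair: only the open chain
        total = count(i + 1, j)  # branch: nucleotide i unpaired
        for k in range(i + MIN_LOOP + 1, j + 1):
            if seq[i] + seq[k] in CANONICAL:
                # branch: i pairs with k, splitting the interval in two
                total += count(i + 1, k - 1) * count(k + 1, j)
        return total

    return count(0, len(seq) - 1)


def branch_count(seq, k):
    """Count structures in one top-level branch of the tree.

    k is None when nucleotide 0 is unpaired, otherwise its pairing partner.
    """
    if k is None:
        return count_structures(seq[1:])
    return count_structures(seq[1:k]) * count_structures(seq[k + 1:])


def parallel_count(seq, workers=4):
    """Sum the independent top-level branches using a pool of workers."""
    branches = [None] + [
        k for k in range(MIN_LOOP + 1, len(seq)) if seq[0] + seq[k] in CANONICAL
    ]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(lambda k: branch_count(seq, k), branches))
```

`parallel_count(seq)` and `count_structures(seq)` should agree for any sequence, which gives a simple check that the branches were split correctly. The recursion depth grows with sequence length, so this sketch is meant for short test sequences only.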

Page 25

Acknowledgements

• Deb Mathews, University of California, Riverside

• David Mathews, University of Rochester

• Jeanmarie Verchot-Lubicz, Oklahoma State University

• Cal Lemke, Oklahoma University greenhouses

• Lab Members: Xiaobo Gu, Steven Harris, Koree Clanton-Arrowood, Brina Gendhar, Shelly Sedberry, Brian Doherty, Ted Gibbons, Sean Lavelle, John McGurk, Becky Myers, Mai-Thao Nguyen, Samantha Seaton, Jon Stone

• Funding