lecture 11

http://cs273a.stanford.edu [Bejerano Fall10/11] 1


Lecture 11

HW1 Feedback (ours)

(Upcoming Project – discuss Wed)

Non-Coding RNAs

Halfway Feedback (yours)


“non coding” RNAs


Central Dogma of Biology:


RNA is an Active Player:

reverse transcription

long ncRNA

What is ncRNA?

• Non-coding RNA (ncRNA) is an RNA that functions without being translated to a protein.

• Known roles for ncRNAs:

– RNA catalyzes excision/ligation in introns.

– RNA catalyzes the maturation of tRNA.

– RNA catalyzes peptide bond formation.

– RNA is a required subunit in telomerase.

– RNA plays roles in immunity and development (RNAi).

– RNA plays a role in dosage compensation.

– RNA plays a role in carbon storage.

– RNA is a major subunit in the SRP, which is important in protein trafficking.

– RNA guides RNA modification.

– It is thought that in the beginning there was an RNA World, where RNA was both the information carrier and the active molecule.

AAUUGCGGGAAAGGGGUCAACAGCCGUUCAGUACCAAGUCUCAGGGGAAACUUUGAGAUGGCCUUGCAAAGGGUAUGGUAAUAAGCUGACGGACAUGGUCCUAACCACGCAGCCAAGUCCUAAGUCAACAGAUCUUCUGUUGAUAUGGAUGCAGUUCA

RNA Folds into (Secondary and) 3D Structures

[Figure: secondary structure diagram of the sequence above, residues numbered 120 to 260, with helices P4, P5, P5a, P5b, P5c, P6, P6a and P6b labeled]

Cate, et al. (Cech & Doudna). (1996) Science 273: 1678.

Waring & Davies. (1984) Gene 28: 277.

We would like to predict them from sequence.

RNA structure rules

• Canonical basepairs:

– Watson-Crick basepairs: G-C, A-U

– Wobble basepair: G-U

• Stacks: continuous nested basepairs (energetically favorable).

• Non-basepaired loops:

– Hairpin loop.

– Bulge.

– Internal loop.

– Multiloop.

• Pseudo-knots

Bafna 1

RNA structure: Basics

• Key: RNA is single-stranded. Think of a string over 4 letters: A, C, G, and U.

• The complementary bases form pairs.

• Base-pairing defines a secondary structure.

The base-pairing is usually non-crossing.
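As a rough illustration of what "non-crossing" means, the sketch below checks a list of base pairs for crossings. Two pairs (i, j) and (k, l) cross when i < k < j < l, which is the pattern a pseudoknot produces. The function name and the 0-based position convention are assumptions made for this example, not part of the lecture.

```python
def is_non_crossing(pairs):
    """Return True if no two base pairs (i, j) and (k, l) interleave.

    `pairs` is a list of (i, j) tuples of 0-based positions with i < j.
    Nested (i < k < l < j) and side-by-side (i < j < k < l) pairs are fine;
    interleaved pairs (i < k < j < l) are crossings, i.e. pseudoknot-like.
    """
    ordered = sorted(pairs)
    for a, (i, j) in enumerate(ordered):
        for k, l in ordered[a + 1:]:
            # After sorting, i <= k, so only one interleaving order can occur.
            if i < k < j < l:
                return False
    return True


nested = [(0, 8), (1, 7), (2, 6)]      # a simple stem: nested pairs
side_by_side = [(0, 3), (4, 7)]        # two separate hairpins
crossing = [(0, 5), (3, 8)]            # interleaved pairs
```

Here `is_non_crossing(nested)` and `is_non_crossing(side_by_side)` hold, while `is_non_crossing(crossing)` does not.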

Ab initio structure prediction: lots of Dynamic Programming

• Maximizing the number of base pairs (Nussinov et al, 1978)

simple model: every base pair (i, j) scores 1
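A minimal sketch of base-pair maximization by dynamic programming, in the spirit of Nussinov et al. It assumes the canonical and wobble pairs listed earlier, one point per pair, and no pseudoknots. The function names and the `min_loop` parameter (the minimum number of unpaired bases enclosed by a pair) are choices made for this example, not details from the lecture.

```python
def can_pair(x, y):
    """Watson-Crick (G-C, A-U) or wobble (G-U) pair."""
    return {x, y} in ({"G", "C"}, {"A", "U"}, {"G", "U"})


def nussinov(seq, min_loop=0):
    """Maximum number of nested (non-crossing) base pairs in seq."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0
    # best[i][j] = max pairs within seq[i..j]; entries with i >= j stay 0.
    best = [[0] * n for _ in range(n)]
    for span in range(1, n):
        for i in range(n - span):
            j = i + span
            score = max(best[i + 1][j],   # position i left unpaired
                        best[i][j - 1])   # position j left unpaired
            if j - i > min_loop and can_pair(seq[i], seq[j]):
                score = max(score, best[i + 1][j - 1] + 1)  # i pairs with j
            for k in range(i + 1, j):     # split into two independent parts
                score = max(score, best[i][k] + best[k + 1][j])
            best[i][j] = score
    return best[0][n - 1]
```

For example, `nussinov("GGGAAACCC")` finds the three nested G-C pairs of a simple hairpin. The three nested loops make this O(n^3) in time, which is the cost the non-crossing assumption buys.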

Pseudoknots drastically increase computational complexity

Nearest Neighbor Model for RNA Secondary Structure Free Energy at 37 °C:

[Figure: hairpin drawn from the strands CGUUUGGGUU and CACAAACG, annotated with free energy increments in kcal/mol: -2.0, -2.1, -0.9, -0.9, -1.8, -1.6, +5.0]

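$$\Delta G_{\text{helix}} = \Delta G_{\mathrm{CG/GC}} + \Delta G_{\mathrm{GU/CA}} + 2\,\Delta G_{\mathrm{UU/AA}} + \Delta G_{\mathrm{UG/AC}}$$

$$= -2.0 - 2.1 + 2 \times (-0.9) - 1.8 = -7.7\ \text{kcal/mol}$$

$$\Delta G_{\text{hairpin loop}} = \Delta G_{\text{initiation (6 nucleotides)}} + \Delta G_{\text{mismatch, GG/CA}} = 5.0 - 1.6 = 3.4\ \text{kcal/mol}$$

$$\Delta G_{\text{total}} = \Delta G_{\text{hairpin loop}} + \Delta G_{\text{helix}} = 3.4 - 7.7 = -4.3\ \text{kcal/mol}$$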
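The arithmetic above can be reproduced with a few lines of code. The sketch below uses only the energy increments quoted on this slide; the dictionary keys and function names are labels invented for the example, and the table is nowhere near a full nearest-neighbor parameter set.

```python
# Stacking increments (kcal/mol) as quoted on the slide; keys are ad hoc labels.
STACK_ENERGY = {
    "CG/GC": -2.0,
    "GU/CA": -2.1,
    "UU/AA": -0.9,
    "UG/AC": -1.8,
}
HAIRPIN_INITIATION_6NT = 5.0    # loop initiation for a 6-nucleotide hairpin
TERMINAL_MISMATCH_GG_CA = -1.6  # first mismatch stacked on the closing pair


def helix_energy(stacks):
    """Sum the stacking increments along the helix."""
    return sum(STACK_ENERGY[s] for s in stacks)


def hairpin_energy():
    """Loop initiation plus terminal mismatch."""
    return HAIRPIN_INITIATION_6NT + TERMINAL_MISMATCH_GG_CA


# The UU/AA stack occurs twice in this helix.
helix = helix_energy(["CG/GC", "GU/CA", "UU/AA", "UU/AA", "UG/AC"])
total = hairpin_energy() + helix
print(f"helix {helix:.1f}, hairpin {hairpin_energy():.1f}, total {total:.1f} kcal/mol")
```

The point of the nearest-neighbor model is visible in the code: the total is a sum of local terms, each depending only on adjacent pairs or on a single loop.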

Mathews, Disney, Childs, Schroeder, Zuker, & Turner. 2004. PNAS 101: 7287.

Zuker’s algorithm MFOLD: computing loop dependent energies


Energy Landscape of Real & Inferred Structures


Unfortunately…

– Random DNA (with high GC content) often folds into low-energy structures.

– What other signals determine non-coding genes?


Evolution to the Rescue


[Figure: derivation tree with internal nodes S and L and nucleotide leaves]

S → aSu | uSa | gSc | cSg | L

L → aL | cL | a | c

• Each derivation tree corresponds to a structure.

Stochastic context-free grammar (SCFG)


S → aSu

S → cSg

S → gSc

S → uSa

S → a

S → c

S → g

S → u

S → SS

1. A CFG

S ⇒ aSu

⇒ acSgu

⇒ accSggu

⇒ accuSaggu

⇒ accuSSaggu

⇒ accugScSaggu

⇒ accuggSccSaggu

⇒ accuggaccSaggu

⇒ accuggacccSgaggu

⇒ accuggacccuSagaggu

⇒ accuggacccuuagaggu

2. A derivation of “accuggacccuuagaggu”

3. Corresponding structure
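The derivation above can be replayed mechanically to see how a derivation maps to a structure. In the sketch below each rule is applied to the leftmost S, as in the slide. Pair rules such as S → aSu emit a matched "(" and ")", and single-base rules emit ".". The dot-bracket output and all function and variable names are conventions chosen for this example. The grammar carries no probabilities here, so this is the plain CFG rather than the stochastic version.

```python
PAIR_RULES = {"aSu", "cSg", "gSc", "uSa"}
SINGLE_RULES = {"a", "c", "g", "u"}


def derive(rules):
    """Apply each rule to the leftmost S; return (sequence, dot-bracket)."""
    # Each entry is (symbol, structure mark); "S" marks a nonterminal.
    form = [("S", "S")]
    for rule in rules:
        idx = next(k for k, (sym, _) in enumerate(form) if sym == "S")
        if rule == "SS":
            new = [("S", "S"), ("S", "S")]
        elif rule in SINGLE_RULES:
            new = [(rule, ".")]
        elif rule in PAIR_RULES:
            new = [(rule[0], "("), ("S", "S"), (rule[2], ")")]
        else:
            raise ValueError(f"not a rule of this grammar: {rule}")
        form[idx:idx + 1] = new
    if any(sym == "S" for sym, _ in form):
        raise ValueError("derivation is incomplete")
    sequence = "".join(sym for sym, _ in form)
    structure = "".join(mark for _, mark in form)
    return sequence, structure


# Right-hand sides used in the slide's derivation, in order.
SLIDE_RULES = ["aSu", "cSg", "cSg", "uSa", "SS",
               "gSc", "gSc", "a", "cSg", "uSa", "u"]
sequence, structure = derive(SLIDE_RULES)
print(sequence)   # accuggacccuuagaggu
print(structure)  # ((((((.))((.))))))
```

The outer four pairs form a stem, and the S → SS step splits its interior into two small hairpins. A different derivation of the same string would give a different structure.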

Stochastic context-free grammar (cont’)


MicroRNA

Genomic context

known miRNAs in human

intergenic intronic

polycistronic

monocistronic

tRNA

tRNA Activity


Human specific accelerated evolution

[Figure labels: Chimp, Human, rapid change, conserved]

Human Accelerated Regions

Human-specific substitutions in conserved sequences

[Pollard, K. et al., Nature, 2006] [Beniaminov, A. et al., RNA, 2008]

[Figure labels: Human (derived), Chimp (ancestral), rapid change, conserved]

HAR1:

• Novel ncRNA.

• Co-expressed in Cajal-Retzius cells with reelin.

• Similar expression in human, chimp, rhesus.

• 18 unique human substitutions leading to novel conformation.

• All weak (AT) to strong (GC).
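To make the weak-to-strong observation concrete, the sketch below classifies substitutions between an ancestral and a derived sequence that are already aligned. The two short sequences are invented toy data, not HAR1, and the function name and category labels are assumptions made for this example.

```python
WEAK = set("AT")    # A-T pairs: two hydrogen bonds
STRONG = set("GC")  # G-C pairs: three hydrogen bonds


def classify_substitutions(ancestral, derived):
    """Count weak-to-strong, strong-to-weak and other substitutions.

    Both inputs must be gap-free DNA strings of equal length (pre-aligned).
    """
    if len(ancestral) != len(derived):
        raise ValueError("sequences must be aligned to the same length")
    counts = {"weak_to_strong": 0, "strong_to_weak": 0, "other": 0}
    for a, d in zip(ancestral.upper(), derived.upper()):
        if a == d:
            continue
        if a in WEAK and d in STRONG:
            counts["weak_to_strong"] += 1
        elif a in STRONG and d in WEAK:
            counts["strong_to_weak"] += 1
        else:
            counts["other"] += 1
    return counts


# Toy example (made-up sequences): three A/T -> G/C changes.
toy_ancestral = "ATTAGCAT"
toy_derived = "GTCAGCGT"
toy_counts = classify_substitutions(toy_ancestral, toy_derived)
```

On the toy pair, all three differences fall into the `weak_to_strong` category, the same pattern the slide reports for the HAR1 substitutions.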


Other Non Coding Transcripts


mRNA


EST

lincRNAs (long intergenic non coding RNAs)


X chromosome inactivation in mammals

[Figure labels: XX, XY, X]

Dosage compensation

Xist – X inactive-specific transcript

Avner and Heard, Nat. Rev. Genetics 2001 2(1):59-67


Microarrays, Next Gen(eration) Sequencing etc.


End Results


Transcripts, transcripts everywhere

Human Genome

Transcribed (Tx)

Tx from both strands

Leaky tx?

Functional?

Or are they?


Halfway Feedback