Lecture 15: HW2 Feedback Ultraconservation

http://cs273a.stanford.edu [Bejerano Fall10/11] 1

Page 1: Lecture 15: HW2 Feedback Ultraconservation

Page 2: Lecture 15: HW2 Feedback Ultraconservation

Lecture 15:

HW2 Feedback

Ultraconservation

Page 3: Lecture 15: HW2 Feedback Ultraconservation

[Slide background: a long human DNA sequence (GGTGCCAGGGAAAGGGCAGGAGG...)]

Ultraconserved Elements in the Human Genome: The Hip & The Hype

Dept. of Developmental Biology, Dept. of Computer Science

Stanford University

Gill Bejerano

Page 4: Lecture 15: HW2 Feedback Ultraconservation

Sequence Conservation implies Function

(but which function/s?...)

human

mouse

mammalian ancestor

...CTTTGCGA-TGAGTAGCATCTACTATTT...

...ACGTGGGACTGACTA-CATCGACTACGA...

functional region!

Comparative Genomics of related species highlights:

Page 5: Lecture 15: HW2 Feedback Ultraconservation

Human Genome: 3 x 10^9 letters

Human Genome full of Conserved Non-Coding Elements

1.5% known function

>50% junk

3x more functional DNA than known!

compare to other species

>5% human genome functional

~10^6 genomic loci do not code for protein

What do they do then?

Page 6: Lecture 15: HW2 Feedback Ultraconservation

Conserved elements in the Human Genome

[Figure: %id distribution of all human-mouse alignments vs. human-mouse ancestral repeats alignment]

Difference: 5% of Human Genome

[Mouse consortium, Nature 2002]

selection

85%id on average

Page 7: Lecture 15: HW2 Feedback Ultraconservation

Conserved elements in the Human Genome

[Figure: %id distribution of all human-mouse alignments vs. human-mouse ancestral repeats alignment]

Difference: 5% of Human Genome

selection

85%id on average

Ultraconservation

Page 8: Lecture 15: HW2 Feedback Ultraconservation

Typical DNA Conservation levels

Conserved elements between human and mouse are on average 85% identical. [mouse consortium, 2002]

(dot = base identical to human)

Page 9: Lecture 15: HW2 Feedback Ultraconservation

Ultraconserved Elements

[Bejerano et al., Science 2004]

fish

481 elements perfectly conserved (100%id) over 200bp or more between human, mouse and rat.

using 2 vs. 3 species
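The definition above (100%id over 200bp or more across human, mouse and rat) can be sketched as a scan for long runs of identical, ungapped columns in a multiple alignment. This is a minimal illustration only, not the pipeline used in the paper; the function name `identical_runs` and the input layout (a list of equal-length aligned strings with `-` for gaps) are assumptions made for the example.

```python
def identical_runs(aligned, min_len=200):
    """Return (start, end) half-open column ranges in which every aligned
    sequence carries the same base, with no gap or N, for >= min_len columns."""
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("aligned sequences must have equal length")
    runs = []
    start = None
    # Iterate one column past the end so a run touching the end is closed.
    for i in range(length + 1):
        same = (
            i < length
            and aligned[0][i] not in "-Nn"
            and all(s[i].upper() == aligned[0][i].upper() for s in aligned)
        )
        if same and start is None:
            start = i                      # a new identical run begins
        elif not same and start is not None:
            if i - start >= min_len:       # keep only sufficiently long runs
                runs.append((start, i))
            start = None
    return runs


# Toy example with a short threshold (hypothetical sequences).
human = "ACGTACGTAC"
mouse = "ACGTACGAAC"
rat   = "ACGTACGTAC"
print(identical_runs([human, mouse, rat], min_len=5))
```

Passing two sequences instead of three shows how the set of qualifying runs can only shrink or stay the same as species are added.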

Page 10: Lecture 15: HW2 Feedback Ultraconservation

Contamination

Page 11: Lecture 15: HW2 Feedback Ultraconservation

What exactly is an Ultraconserved Element?

Aha!!

using 3 vs. 43 species

Page 12: Lecture 15: HW2 Feedback Ultraconservation

Ultraconservation as a Phenomenon

Few species More and more species

Hmmm….

Page 13: Lecture 15: HW2 Feedback Ultraconservation

Ultraconserved Elements: Why?

Hundreds of long genomic regions are identical between amniotes: they must have rejected many different changes.

But... all functions we understand in our genome are encoded using redundant codes.

[Figure: sequence (seq.) encodings compared for CDS, ncRNA and TFBS]
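For the CDS case, the redundancy mentioned above is the degeneracy of the standard genetic code: most amino acids are encoded by several codons, so many different DNA sequences yield the same peptide. The sketch below counts them; the helper names `translate` and `synonymous_count` are made up for this illustration.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W"
         "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR"
         "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a DNA string codon by codon (trailing partial codon ignored)."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def synonymous_count(peptide):
    """Number of distinct DNA sequences that encode the given peptide."""
    degeneracy = Counter(CODON_TABLE.values())   # codons per amino acid
    total = 1
    for aa in peptide:
        total *= degeneracy[aa]
    return total


print(translate("ATGCTGTCT"), translate("ATGTTAAGC"))  # two DNAs, one peptide
print(synonymous_count("MLS"))
```

A 200bp stretch of coding sequence could therefore change at many positions without altering the protein, which is why perfect identity is not explained by coding constraint alone.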

Page 14: Lecture 15: HW2 Feedback Ultraconservation

Conserved elements in the Human Genome

[Figure: %id distribution of all human-mouse alignments vs. human-mouse ancestral repeats alignment]

Difference: 5% of Human Genome

selection

85%id on average

Ultraconservation

Why did I look at the tail?

Page 15: Lecture 15: HW2 Feedback Ultraconservation

...ACGTACGACTGACTAGCATCGACTACGA........TCTGACTAGCATCGACTACGA...

DNA Replication is Imperfect

It’s imperfect on all scales: small, medium and large.

In particular it begets novel functional entities:

...ACGTACGACTGACTAGCATCGACTACGA...

...ACGTACGACTGACTAGCATCGACTACGA........TCTGACTAGCATCGACTACGA...

functional | junk

regional duplication

functional | functional

functional divergence

functional’’ | functional’

Protein & RNA gene families come to life this way. What else does?

Page 16: Lecture 15: HW2 Feedback Ultraconservation

Computational Approach I

Group them into paralog families of human functional regions of common origins:

• Annotated members induce function on all.
• Examine core, substitutions in family.
• Test for “guilt by association”.

[Bejerano et al., ISMB 2004]

human DNA: .....ACGTGCATGACTGACTAGCATCAGACGACTAC..GATAATACGCTACGACTAGCTAC.....

...TGACTAGCATCGACTAC..GATAATACGAC... ...CATCGACTAC..GATAATACGACGGTTGGT...AC T

~400bp
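The grouping step can be sketched as single-linkage clustering: any two elements judged similar end up in the same family. The sketch assumes the similar pairs have already been computed elsewhere (for instance by an alignment tool); the function name `paralog_families` and the element identifiers are hypothetical, and this is not presented as the method of the ISMB 2004 paper.

```python
def paralog_families(elements, similar_pairs):
    """Group elements into families via union-find over similar pairs.

    elements      -- iterable of element identifiers
    similar_pairs -- iterable of (a, b) pairs judged to share a common origin
    Returns a list of families (sorted lists) with at least two members.
    """
    elements = list(elements)
    parent = {e: e for e in elements}

    def find(x):
        # Walk to the root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in similar_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a          # merge the two families

    families = {}
    for e in elements:
        families.setdefault(find(e), []).append(e)
    # Singletons are the elements that "appear unique".
    return [sorted(members) for members in families.values() if len(members) > 1]


elements = ["e1", "e2", "e3", "e4", "e5"]
pairs = [("e1", "e2"), ("e2", "e3")]
print(paralog_families(elements, pairs))
```

Once families exist, an annotation on any one member can be proposed for the rest, which is the "guilt by association" test in the list above.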

Page 17: Lecture 15: HW2 Feedback Ultraconservation

Functional Annotation by Families

[Bejerano et al., ISMB 2004]

After removing all annotated regions, and more, from the top 5% of the human genome:

700,000 elements, covering 3.5% of the human genome

Puzzling News: 96% of the 700,000 appear unique(!)

Good News: We still find 12,027 families

novel putative ncRNAs, cis-regulatory elements, etc.

Page 18: Lecture 15: HW2 Feedback Ultraconservation

human

mouse

rat

related genes, related elements (75%id over 200bp)

same element, 96%id over 200bp

same element, 95%id over 200bp

Computational Approach II

Classical Biological approach: experiment to understand these regions

Computational approach: how many regions like this or “better” are there?

Page 19: Lecture 15: HW2 Feedback Ultraconservation

Out popped the Ultraconserved Elements

Puzzling News: 96% of the 700,000 conserved non-coding elements appear unique(!)

Same with Ultras

Page 20: Lecture 15: HW2 Feedback Ultraconservation

What could ultras be doing?

• exonic
• non
• possibly

Page 21: Lecture 15: HW2 Feedback Ultraconservation

Associating distal peaks in a gene-based context is statistically inappropriate

Gene transcription start site

Ultraconserved Element

Ontology term (e.g. ‘development’)

N = 8 genes in genome

K = 3 genes annotated with the ontology term

n = 3 genes selected by proximal peaks

k = 2 selected genes annotated with the ontology term

P = Pr(k ≥ 2 | n = 3, K = 3, N = 8)

1. Set gene regulatory domain.

2. Associate Ultras with genes.

3. Per ontology term, count annotated genes selected.

4. Rank terms by enrichment hypergeometric p-value.
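Steps 3 and 4 can be sketched with a plain hypergeometric tail probability. The example takes the counts as listed on the slide (N = 8, K = 3, n = 3, k = 2); the function names and the per-term counts in the ranking example are hypothetical, and this is not the GREAT implementation.

```python
from math import comb


def hypergeom_tail(k, n, K, N):
    """P(X >= k): chance of seeing at least k annotated genes when n genes
    are selected without replacement from N genes, K of which are annotated."""
    upper = min(n, K)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


def rank_terms(term_counts, n, N):
    """Rank ontology terms by enrichment p-value (smallest first).

    term_counts maps term -> (K, k): genes annotated genome-wide, and
    annotated genes among the n selected.
    """
    scored = [(hypergeom_tail(k, n, K, N), term) for term, (K, k) in term_counts.items()]
    return sorted(scored)


# Slide counts: 16 of the 56 possible gene selections contain >= 2 annotated genes.
print(hypergeom_tail(2, 3, 3, 8))

# Hypothetical terms, same genome and selection size.
print(rank_terms({"development": (3, 2), "metabolism": (5, 2)}, n=3, N=8))
```

The test treats genes as the sampling unit, which is exactly the assumption the slide title calls inappropriate once distal elements are involved.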

Evolved into

http://great.stanford.edu/

Page 22: Lecture 15: HW2 Feedback Ultraconservation

Enrichment Association of Ultraconserved Elements

[Figure: enrichment results, panels labeled "Exonic Ultras" and "Non-exonic Ultras"]

Page 23: Lecture 15: HW2 Feedback Ultraconservation

Ultras are Functional

Back in 2004 we hypothesized:

481 ultraconserved elements

exonic subset – post-transcriptional regulation [Ni et al., Genes Dev.; Lareau et al., Nature, 2007]

“nonexonic” subset – transcriptional regulators [Pennacchio et al., Nature, 2006]

Page 24: Lecture 15: HW2 Feedback Ultraconservation

Ultraconserved Non-coding RNA

[Calin et al., Cancer Cell, 2007]

miRNA complementarity

About 1/3 of all ultras are expressed.

Some are predicted to provide microRNA targets.

A few are anti-correlated with miRNA expression levels.

A few even act as oncogenes.

Page 25: Lecture 15: HW2 Feedback Ultraconservation

Ultras are Under Strong Human Selection

Ultra DAF NonSyn DAF

[Katzman et al., Science, 2007]

Mutational cold spots? NO. Rare (new) mutations are introduced to the population.

Fierce purifying selection? YES. Very few of these get anywhere near fixation.

[Figure: derived allele example: chimp A; humans G, A, A, A]
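The DAF (derived allele frequency) compared on this slide can be sketched for a single site as follows. The sketch assumes the chimp allele is taken as the ancestral state, which is a common convention but an assumption here; the function name is made up for the example.

```python
def derived_allele_frequency(ancestral, human_alleles):
    """Fraction of sampled human chromosomes carrying a non-ancestral allele.

    ancestral     -- allele assumed ancestral (here: the chimp allele)
    human_alleles -- alleles observed at the site in the human sample
    """
    if not human_alleles:
        raise ValueError("need at least one sampled human allele")
    derived = sum(1 for allele in human_alleles if allele != ancestral)
    return derived / len(human_alleles)


# The site drawn on the slide: chimp A, humans G, A, A, A.
print(derived_allele_frequency("A", ["G", "A", "A", "A"]))
```

Sites under purifying selection are expected to show derived alleles mostly at low frequencies, since few of them drift upward toward fixation.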

Page 26: Lecture 15: HW2 Feedback Ultraconservation

Touch an Ultra And You - DIY

[Ahituv et al., PLoS Biology, 2007]

Page 27: Lecture 15: HW2 Feedback Ultraconservation

What can’t we measure in the lab?

Pr(fixation | N_e, s) = (1 - e^{-s}) / (1 - e^{-2 N_e s})

N_e is population size, s selective dis/advantage. Both of which are VERY wrong in the lab.
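To get a feel for why lab conditions mislead, the fixation probability can be evaluated for a few values of N_e and s. The sketch assumes the slide's formula is the diffusion approximation in the form Pr(fixation | N_e, s) = (1 - e^{-s}) / (1 - e^{-2 N_e s}); that reading of the garbled slide, the function name, and the example values are all assumptions.

```python
import math


def fixation_probability(n_e, s):
    """Pr(fixation | N_e, s) = (1 - e^{-s}) / (1 - e^{-2 N_e s}).

    Assumed form of the slide's formula; s > 0 is advantageous,
    s < 0 deleterious, s = 0 neutral.
    """
    if s == 0:
        return 1.0 / (2 * n_e)          # neutral limit of the formula
    exponent = -2.0 * n_e * s
    if exponent > 700:                  # e^exponent would overflow a float;
        return 0.0                      # the probability is effectively zero
    # expm1(x) = e^x - 1 keeps precision for small |s|; the two sign flips cancel.
    return math.expm1(-s) / math.expm1(exponent)


for n_e in (100, 10_000):               # hypothetical small vs. large population
    for s in (-0.01, 0.0, 0.01):
        print(n_e, s, fixation_probability(n_e, s))
```

Under this form, a mildly deleterious change that fixes at a noticeable rate in a small population becomes vanishingly unlikely to fix in a large one.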

Page 28: Lecture 15: HW2 Feedback Ultraconservation

So it can happen – but does it FIX?

tDNA element

mouse

Page 29: Lecture 15: HW2 Feedback Ultraconservation

Count Fraction Lost, Binned by %id

[Figure: species tree (human, macaque, dog, mouse, rat) against time t; a 100bp sliding window over the alignment is binned by %id, tallying count_all and count_hole per bin]
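The counting scheme named on this slide (100bp sliding window, count_all, count_hole, bin by %id) can be sketched as below. The input layout is an assumption: `ingroup` holds equal-length aligned sequences (for example human, macaque, dog) and `rodent_present` marks, per column, whether rodent sequence aligns there. Treating a window as a "hole" only when no column is covered in rodents is also an assumption of this sketch, not a detail taken from the slides.

```python
from collections import defaultdict


def percent_identity(seqs, start, end):
    """Percent of columns in [start, end) where all sequences agree, ungapped."""
    same = sum(
        1 for i in range(start, end)
        if seqs[0][i] != "-" and all(s[i] == seqs[0][i] for s in seqs)
    )
    return 100.0 * same / (end - start)


def fraction_lost_by_bin(ingroup, rodent_present, window=100, step=1, bin_width=5):
    """Return {bin: (count_all, count_hole, fraction_lost)} over sliding windows."""
    counts = defaultdict(lambda: [0, 0])         # bin -> [count_all, count_hole]
    length = len(ingroup[0])
    for start in range(0, length - window + 1, step):
        end = start + window
        pct = percent_identity(ingroup, start, end)
        bin_floor = int(pct // bin_width) * bin_width
        counts[bin_floor][0] += 1                # every window counts toward count_all
        if not any(rodent_present[start:end]):   # whole window missing in rodents
            counts[bin_floor][1] += 1
    return {b: (a, h, h / a) for b, (a, h) in sorted(counts.items())}


# Toy data with a 4-column window (hypothetical sequences).
ingroup = ["ACGTACGTACGT", "ACGTACGTACGT", "ACGTACGAACGT"]
rodent_present = [True] * 4 + [False] * 4 + [True] * 4
print(fraction_lost_by_bin(ingroup, rodent_present, window=4, step=4, bin_width=25))
```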

Page 30: Lecture 15: HW2 Feedback Ultraconservation

Quite Some Time Later

[McLean & Bejerano, Genome Res., 2008]

Page 31: Lecture 15: HW2 Feedback Ultraconservation

Ultras are Fiercely Retained through Evolution

Ultras are >300 fold more persistent than neutral DNA (25% deleted)

the genomic deletion is

100%id primates-dog: 1,691,090bp

rodents deleted: 1,447bp (0.086%)

Pr(fixation | N_e, s) = (1 - e^{-s}) / (1 - e^{-2 N_e s})

Page 32: Lecture 15: HW2 Feedback Ultraconservation

How special are the Ultras?

selection

Ultraconservation

Page 33: Lecture 15: HW2 Feedback Ultraconservation

Ultraconservation as a Phenomenon

Few species More and more species

Hmmm….

We do not see a bump in the curve

Page 34: Lecture 15: HW2 Feedback Ultraconservation

Ultraconserved Elements: What do we know?

• Excessive sequence conservation exists.
• Set is heterogeneous from a functional perspective.
• Four can be KO-ed with no clear phenotype.
• Yet, the set is under extreme selection in natural populations, both for mutations and deletions.
• Most ultras have deep orthology, and no paralogy.
• One ultra comes from a mobile element co-option event.
• Others may have come from similar events.
• Ultras appear to be the tip of a continuum, not a unique peak.

Page 35: Lecture 15: HW2 Feedback Ultraconservation

Ultraconserved Elements: What we don’t know

• What maintains so much conservation?
