
Page 1: HUMAN GENOME PROJECT 101 Human Genome Program, U.S. Department of Energy, Genomics and Its Impact on Medicine and Society: A 2001 Primer, 2001

HUMAN GENOME PROJECT 101


Human Genome Program, U.S. Department of Energy, Genomics and Its Impact on Medicine and Society: A 2001 Primer, 2001


Human Genome Project

Begun in 1990, the U.S. Human Genome Project is a 13-year effort coordinated by the U.S. Department of Energy and the National Institutes of Health. The project originally was planned to last 15 years, but effective resource and technological advances have accelerated the expected completion date to 2003.

HGP goals are to:

■ identify all the approximately 35,000* genes in human DNA,
■ determine the sequences of the 3 billion chemical base pairs that make up human DNA,
■ store this information in databases,
■ improve tools for data analysis,
■ transfer related technologies to the private sector, and
■ address the ethical, legal, and social issues (ELSI) that may arise from the project.


Human Genome Data

• Derived from the Human Genome Project

• sequence freeze date in anticipation of data release: 22 July 2000

• Release of First Draft Sequence of Human Genome:

Nature 409 (6822), 15 February 2001

Science 291 (5507), 16 February 2001

• Release of “Complete” Draft Sequence of Human Genome: April 2003


Fine Structure of Human Genomic DNA

(Figure labels: gene, intergenic region, exons, introns, interspersed repeats, tandem repeats)

ACGTTGTGTCGCTGATTAGCTAGACCAAGATAGTTCGCTATAGGCTATAGCGATATAACCCAGGGGGGATATATTAGGAGGAGAGATATAGGATAGATTACATGTGATATATAGGAGAGAGAATATATAAGAGAGAGAGAGATTTTTTCTCCTGGTAAAAAGCTCGCTTAGGATTGCGCTAGATG


The Human Genome

3.2 billion nucleotides

How many genes?


>100,000 < 40,000


But think of all our traits, Jimbo!

Ours?! Are you of my species?


Get lost, punk!

Ouch!


The Human Genome

ACGTTGTGTCGCTGATTAGCTAGACCAAGATAGTTCGCTATAGGCTATAGCGATATAACCCAGGGGGGATACGCWHENISAGENEAGENETATTAGGAGGAGAGATATAGGATAGATTACATGTGATATATAGGAGAGAGAATATATAAGAGAGAGAGAGATTTTTTCTCCTGGTAAAAAGCTCGCTTAGGATTGCGC

Comparative Genomics (Alignment)

Gene Prediction

Experimental Discovery (Genetics)


Alignment


CTCGCTGACTCAATCGGATTATGCTAGTCG

GCCCCCCCCCCCCTGAGTCAGGGGGGCTCGCTGCTGTGCTG

TGACTCAATCGGATTATGCTAGTCG

ATAGCCTAATAGCTGACTCAATCGGATTATGCTAGTCG

ATTTTTTTGACTCAATCGGATTA

CGGGGTGACTCAATCGGA

AAAAATATATTGACTCAATCGGATTATGCTAGTCG

GTCGTAGCTTGACTCAATCGGATTATGCTAGTCG

TCATATGACTCAATCGGATTATGCTAGTCG
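Most of the reads on this slide share a common stretch, which is what allows them to be stacked into an alignment. The following is a minimal Python sketch of that idea, not a real alignment algorithm: it assumes a known shared anchor (`TGACTCAATCGGA`, chosen by eye from the reads above) and simply left-pads each read so the anchor lands in the same column. The function name `align_on_anchor` and the anchor choice are illustrative assumptions.

```python
def align_on_anchor(reads, anchor):
    """Left-pad each read so the first occurrence of `anchor` starts in the same column.

    Reads that do not contain the anchor are returned separately, unaligned.
    This is an illustrative shortcut, not a general alignment method.
    """
    hits = [(read, read.find(anchor)) for read in reads]
    placed = [(read, pos) for read, pos in hits if pos >= 0]
    unplaced = [read for read, pos in hits if pos < 0]
    # The anchor column is the largest offset seen among the placed reads.
    column = max((pos for _, pos in placed), default=0)
    aligned = [" " * (column - pos) + read for read, pos in placed]
    return aligned, unplaced


# Reads copied from the slide.
READS = [
    "CTCGCTGACTCAATCGGATTATGCTAGTCG",
    "GCCCCCCCCCCCCTGAGTCAGGGGGGCTCGCTGCTGTGCTG",
    "TGACTCAATCGGATTATGCTAGTCG",
    "ATAGCCTAATAGCTGACTCAATCGGATTATGCTAGTCG",
    "ATTTTTTTGACTCAATCGGATTA",
    "CGGGGTGACTCAATCGGA",
    "AAAAATATATTGACTCAATCGGATTATGCTAGTCG",
    "GTCGTAGCTTGACTCAATCGGATTATGCTAGTCG",
    "TCATATGACTCAATCGGATTATGCTAGTCG",
]
ANCHOR = "TGACTCAATCGGA"  # assumption: shared stretch picked by eye

aligned, unplaced = align_on_anchor(READS, ANCHOR)
for row in aligned:
    print(row)
```

Reads without the anchor are set aside rather than forced into the stack; a real aligner would score partial matches instead.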


CTCGCTGACTCAATCGGATTATGCTAGTCG

GCCCCCCCCCCCCTGAGTCAGGGGGGCTCGCTGCTGTGCTG

TGACTCAATCGGATTATGCTAGTCG

ATAGCCTAATAGCTGACTCAATCGGATTATGCTAGTCG

ATTTTTTTGACTCAATCGGATTA

CGGGGTGACTCAATCGGA

AAAAATATATTGACTCAATCGGATTATGCTAGTCG

GTCGTAGCTTGACGGAATCGGATTATGCTAGTCG

TCATATGACTCAATCGGATTATGCTAGTCG
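One read in this version of the alignment differs from its counterpart on the previous slide. A minimal Python sketch for locating such differences, assuming the two reads are already aligned and of equal length; treating the earlier read as the "reference" is an assumption made for illustration, and `mismatches` is a hypothetical helper name.

```python
def mismatches(reference, read):
    """Return (position, reference_base, read_base) for every differing column.

    Assumes the two sequences are already aligned and have equal length.
    """
    if len(reference) != len(read):
        raise ValueError("sequences must be the same length")
    return [
        (i, ref_base, read_base)
        for i, (ref_base, read_base) in enumerate(zip(reference, read))
        if ref_base != read_base
    ]


# The same read as it appears on the two alignment slides.
REFERENCE = "GTCGTAGCTTGACTCAATCGGATTATGCTAGTCG"  # assumption: earlier slide as reference
VARIANT = "GTCGTAGCTTGACGGAATCGGATTATGCTAGTCG"

for position, ref_base, read_base in mismatches(REFERENCE, VARIANT):
    print(f"position {position}: {ref_base} -> {read_base}")
```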


Gene Prediction


TTCGCTATAGGCTATAGCGATATAACCCAGGGGGGATACGCTATTAGGAGGAGAGAATATAAAGGATAGATTACATGTGATATATGGAGAGAGAATATATAAGAGAGAGAGAGATTTTTTCTCCTGGTAAAAAGCTCGCTTATGGATTGCGCTTCGCTATAGGCTATAGCGATATAACCCAGGGGGGATACGCTATTAGGAGGAGAGATATAGGATAGATTACATGTGATATATAGGAGAGAGAATATATAAGAGAGAGAGAGATTTTTTCTCCTGGTAAAAAGCTCGCTTAGGATTGCGCTTCGCTATAGGCTATGCGATATAACCCAGGGGGGATACGCTATTAGGAGGAGAGATATAGGATAGATTACATGTGATATATAGGAGAGAGAATATATAAGAGAGAGAGAGATTTTTTCTCCTGGTAAAAAGCTCGCTTAGGATTGCGCTTCGCTATAGGCTATAGCGATATGACCCAGGGGGGATACGCTATTAGGAGGAGAGATATAGGATAGATTACATGTGATATATAGGAGAGAGAATATATAAGAGAGAGAGAGATTTTTTCTCCTGGTAAAAAGCTCGCTTAGGATTGCGCTTCGCTATAGGCTATAGCGATATAACCCAGGGGGGATATGATATTAGGAGGAGAGATATAGGATAGATTACATGTGATATATAGGAGAGAGAAATAATATAAGAGAGAGAGATTTTTTCTCCTGGTAAAAAGCTCGCTTAGGATTGCGC


Gene Prediction Algorithms
based on consensus nucleotide sequences of

• TATA boxes and start codons
• stop codons
• splice junctions
• CpG islands

(Figure labels: gene, intergenic region, exons, introns, interspersed repeats, tandem repeats)
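The signals listed on this slide can each be searched for with simple string scans. The sketch below is a toy illustration in Python, not any particular gene-prediction program: it looks for the common TATA-box consensus `TATA[AT]A[AT]`, finds open reading frames from an `ATG` start codon to the first in-frame stop codon, and computes the observed/expected CpG ratio often used to flag CpG islands. Splice-junction detection is omitted for brevity. All function names and the toy sequence are assumptions made for this example.

```python
import re

STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_tata_boxes(seq):
    """Start positions of matches to the TATA-box consensus TATA[AT]A[AT]."""
    return [m.start() for m in re.finditer(r"TATA[AT]A[AT]", seq)]


def find_orfs(seq, min_codons=3):
    """Return (start, end) for each ATG followed by an in-frame stop codon.

    `min_codons` counts codons before the stop, including the ATG itself.
    `end` is exclusive and includes the stop codon.
    """
    orfs = []
    for m in re.finditer("ATG", seq):
        start = m.start()
        for i in range(start, len(seq) - 2, 3):
            if seq[i:i + 3] in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                break  # the first in-frame stop ends this reading frame
    return orfs


def cpg_obs_exp(seq):
    """Observed/expected CpG ratio: (#CG * length) / (#C * #G)."""
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c * g)


# Toy sequence invented for this example: a TATA box, then a short ORF.
TOY = "GCTATAAAAGCGCCATGAAACCCGGGTAAGC"

print("TATA boxes at:", find_tata_boxes(TOY))
print("ORFs:", [(s, e, TOY[s:e]) for s, e in find_orfs(TOY)])
print("CpG obs/exp:", round(cpg_obs_exp(TOY), 2))
```

Real predictors combine many such weak signals statistically; any one of them alone produces many false positives in a genome-sized sequence.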


Comparative Gross Results from Model Genome Projects


Humans have about 35,000 genes!

You were right.

So what’s new!


Human Genes


Surprising Findings = !!

• !! Only 35,000 genes
• most genes in euchromatin
• GC/AT patchiness
• !! Gene density higher & intron size smaller in GC-rich patches
• !! 1.4% translated, 28% transcribed
• !! Origins of genes


Some Origins of Human Genes

• Most from distant evolutionary past (basic metabolism, transcription, translation, replication fixed since appearance of bacteria and yeast)
• Only 94/1278 families vertebrate-specific
• 740 are nonprotein-encoding RNA genes
• many derive from partial genomes of viruses and virus-like elements (genomic fossils)
• some acquired directly from bacteria (rather than by evolution from bacteria)


Genomic Fossils


Genomic Fossils (also known as Molecular Fossils)

• interspersed repeats

• generated by integration of transposable elements or retrotransposable RNAs

• active contemporary modifier of some vertebrate genomes (mouse)

• formerly active modifier of human genome

• some as prevalent as 1.5 million copies


Alu Elements
Type of Short Interspersed Nuclear Element (SINE)

• transcribed by RNA polymerase III
• 3’ oligo dA-rich tail
• found only in primates
• 1,500,000 copies
• derived from 7SL RNA gene
• dimer-like structure
• most retroposition occurred 40 mya

(Figure: structure of an Alu element, 5’ to 3’. Labels: direct repeats, RNA polymerase III promoter with A and B boxes, A-rich region, AAAn tail, A/T-rich region and 3’-UTR, 31 bp, 50-300 bp.)


Reverse Transcription
Essential for Retroposition and Proliferation of Retroelements

• Converts primed RNAs into cDNAs
• catalyzed by RNA-dependent DNA pol (reverse transcriptase)
• pol encoded by retroviruses and active LINEs

Retroviral genomic RNA

Alu RNA

LINE RNA
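At the sequence level, reverse transcription produces a DNA strand complementary to the RNA template. A minimal Python sketch of that base-pairing rule, written 5’ to 3’; it models only first-strand cDNA synthesis and ignores priming and the enzyme itself. The function name `reverse_transcribe` and the example input are assumptions for illustration.

```python
# RNA base -> complementary DNA base
RNA_TO_CDNA = {"A": "T", "U": "A", "G": "C", "C": "G"}


def reverse_transcribe(rna):
    """Return the first-strand cDNA (5'->3') complementary to an RNA given 5'->3'."""
    try:
        # The cDNA is antiparallel to its template, so read the RNA backwards.
        return "".join(RNA_TO_CDNA[base] for base in reversed(rna.upper()))
    except KeyError as err:
        raise ValueError(f"not an RNA base: {err.args[0]!r}") from None


print(reverse_transcribe("AUGC"))
```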


Alu Elements as Genomic Fossils

Alu Subfamily Structure (millions of years)

Oldest [J]: Jo, Jb (65)
Intermediate [S]: S (50), Sq, Sp, Sx, Sc, Sg
Youngest [Y]: Y (25), Yb8, Ya5, Ya8

450,000 copies

50,000 copies


Alu Subfamily Structure

PS [J]: Primate-specific. Abundant in all primates.
65-70 mya: Early Prosimian (strepsirhini)

AS [S]: Anthropoid-specific (haplorhini). One mutation different from PS.
50-60 mya

CS [S]: Catarrhine-specific. Nine mutations arising.
30-40 mya: Platyrrhines (FN) (Marmoset); Catarrhine (DFN) (Macaque)

(Image: macaque)

HS [Y]: Human-specific. Five or more additional mutations.
20-25 mya: Almost exclusively Hominids


Master Gene Model of Retroposition
P. Deininger, M. Batzer, Trends in Genetics 8:307, 1992

(Figure: copy number plotted against time (m.y.). Labeled steps: 1. Amplification, 2. Master mutation. Elements drawn 5’ to 3’.)
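The two labeled steps, amplification and master mutation, can be illustrated with a toy simulation. This is a deliberately simplified Python sketch, not the model as published: it assumes a single master element, a fixed number of copies per time step, and copies that never mutate or amplify on their own. Each copy inherits whatever diagnostic mutations the master carries when the copy is made, which yields nested subfamilies. The function name, parameters, and mutation labels such as `m2` are all hypothetical.

```python
from collections import Counter


def simulate_master_gene(steps, copies_per_step, mutation_steps):
    """Return the diagnostic-mutation set carried by every copy produced.

    steps           -- number of time steps to simulate
    copies_per_step -- copies the master produces per step (1. amplification)
    mutation_steps  -- steps at which the master gains a mutation (2. master mutation)
    """
    master = frozenset()
    copies = []
    for t in range(steps):
        if t in mutation_steps:
            master = master | {f"m{t}"}  # the master acquires a new mutation
        # Every copy made now carries the master's current mutation set.
        copies.extend([master] * copies_per_step)
    return copies


copies = simulate_master_gene(steps=6, copies_per_step=2, mutation_steps={2, 4})
subfamilies = Counter(copies)
for mutations, count in sorted(subfamilies.items(), key=lambda kv: len(kv[0])):
    print(sorted(mutations), count)
```

In this sketch the older subfamilies carry a subset of the mutations found in younger ones, the same nesting seen in the J, S, and Y subfamilies.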


ALU INSERTIONS AND DISEASE

LOCUS | DISTRIBUTION | SUBFAMILY | DISEASE | REFERENCE
BRCA2 | de novo | Y | Breast cancer | Miki et al, 1996
Mlvi-2 | de novo (somatic?) | Ya5 | Associated with leukemia | Economou-Pachnis and Tsichlis, 1985
NF1 | de novo | Ya5 | Neurofibromatosis | Wallace et al, 1991
APC | Familial | Yb8 | Hereditary desmoid disease | Halling et al, 1997
PROGINS | about 50% | Ya5 | Linked with ovarian carcinoma | Rowe et al, 1995
Btk | Familial | Y | X-linked agammaglobulinaemia | Lester et al, 1997
IL2RG | Familial | Ya5 | XSCID | Lester et al, 1997
Cholinesterase | one Japanese family | Yb8 | Cholinesterase deficiency | Muratani et al, 1991
CaR | familial | Ya4 | Hypocalciuric hypercalcemia and neonatal severe hyperparathyroidism | Janicic et al, 1995
C1 inhibitor | de novo | Y | Complement deficiency | Stoppa Lyonnet et al, 1990
ACE | about 50% | Ya5 | Linked with protection from heart disease | Cambien et al, 1992
Factor IX | a grandparent | Ya5 | Hemophilia | Vidaud et al, 1993
2 x FGFR2 | De novo | Ya5 | Apert’s Syndrome | Oldridge et al, 1997
GK | ? | Sx | Glycerol kinase deficiency | McCabe et al, (personal comm.)


What’s New About Old Fossils?
In the Human Genome

• Comprise nearly 50% of genome
• 50% more Alu elements than were predicted by molecular biology
• scarce in highly-regulated regions (detrimental?)
• enriched in GC regions (beneficial?)
• little activity, but little scouring
• occur frequently within exons
• contribute to formation of genes encoding novel proteins


The Human Genome: FEATURES

3.2 billion bases

28% transcribed

<1.4% encodes protein

50% repeats

Only ~35,000 genes!

not many modern protein families
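The percentages on this slide are easier to appreciate as base counts. A short Python sketch of the arithmetic, using only the slide's own figures (3.2 billion bases, 28%, 1.4%, 50%, about 35,000 genes). Since the slide gives protein-coding sequence as less than 1.4%, the coding numbers are upper bounds. Variable and function names are illustrative.

```python
GENOME_BASES = 3_200_000_000  # 3.2 billion bases
GENE_COUNT = 35_000           # approximate gene count from the slide


def bases_for(per_mille):
    """Bases corresponding to a share of the genome given in parts per thousand."""
    return GENOME_BASES * per_mille // 1000


transcribed = bases_for(280)        # 28% transcribed
protein_coding_max = bases_for(14)  # <1.4% encodes protein (upper bound)
repeats = bases_for(500)            # 50% repeats

# Upper bound on average protein-coding bases per gene.
coding_per_gene_max = protein_coding_max // GENE_COUNT

print(f"transcribed:       {transcribed:,} bases")
print(f"protein-coding:   <{protein_coding_max:,} bases")
print(f"repeats:           {repeats:,} bases")
print(f"coding per gene:  <{coding_per_gene_max:,} bases on average")
```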


Humans have about 35,000 genes!

Well, then… How can you explain human complexity?