Next Now-Generation Genomics: methods and applications for modern disease research
Aaron J. Mackey, Ph.D. [email protected] Center for Public Health Genomics. Wednesday, October 7th, 2009. BIMS 853 Special Topics in Cardiovascular Research.
![Page 1: Next Now -Generation Genomics: methods and applications for modern disease research](https://reader035.vdocuments.us/reader035/viewer/2022081603/56814790550346895db4c19a/html5/thumbnails/1.jpg)
Aaron J. Mackey, [email protected]
Center for Public Health Genomics
Wednesday, October 7th, 2009
BIMS 853 Special Topics in Cardiovascular Research
![Page 2: Next Now -Generation Genomics: methods and applications for modern disease research](https://reader035.vdocuments.us/reader035/viewer/2022081603/56814790550346895db4c19a/html5/thumbnails/2.jpg)
source: Francis Ouellette, OICR
“omic” Disease Research
![Page 3: Next Now -Generation Genomics: methods and applications for modern disease research](https://reader035.vdocuments.us/reader035/viewer/2022081603/56814790550346895db4c19a/html5/thumbnails/3.jpg)
source: Francis Ouellette, OICR
![Page 4: Next Now -Generation Genomics: methods and applications for modern disease research](https://reader035.vdocuments.us/reader035/viewer/2022081603/56814790550346895db4c19a/html5/thumbnails/4.jpg)
Basics of the “old” technology
• Clone the DNA.
• Generate a ladder of labeled (colored) molecules that are different by 1 nucleotide.
• Separate mixture on some matrix.
• Detect fluorochrome by laser.
• Interpret peaks as string of DNA.
• Strings are 500 to 1,000 letters long.
• 1 machine generates 57,000 nucleotides/run.
• Assemble all strings into a genome.
source: Francis Ouellette, OICR
![Page 5: Next Now -Generation Genomics: methods and applications for modern disease research](https://reader035.vdocuments.us/reader035/viewer/2022081603/56814790550346895db4c19a/html5/thumbnails/5.jpg)
Basics of the “new” technology
• Get DNA.
• Attach it to something.
• Extend and amplify signal with some color scheme.
• Detect fluorochrome by microscopy.
• Interpret series of spots as short strings of DNA.
• Strings are 30-300 letters long.
• Multiple images are interpreted as 0.4 to 1.2 GB/run (1,200,000,000 letters/day).
• Map or align strings to one or many genomes.
source: Francis Ouellette, OICR
![Page 6: Next Now -Generation Genomics: methods and applications for modern disease research](https://reader035.vdocuments.us/reader035/viewer/2022081603/56814790550346895db4c19a/html5/thumbnails/6.jpg)
Differences between platforms:
• Nanotechnology used.
• Resolution of the image analysis.
• Chemistry and enzymology.
• Signal-to-noise detection in the software.
• Software/images/file size/pipeline.
• Cost $$$
source: Francis Ouellette, OICR
![Page 7: Next Now -Generation Genomics: methods and applications for modern disease research](https://reader035.vdocuments.us/reader035/viewer/2022081603/56814790550346895db4c19a/html5/thumbnails/7.jpg)
Genome size: 3000 Mb (3 Gb)

| | 3730 | 454 FLX | Solexa |
|---|---|---|---|
| Req'd coverage | 6 | 12 | 25 |
| bp/read | 600 | 250 | 32 |
| Reads/run | 96 | 400,000 | 40,000,000 |
| bp/run | 57,600 | 100,000,000 | 1,280,000,000 |
| # runs req'd | 312,500 | 360 | 59 |
| Cost per run | $48 | $6,800 | $9,300 |
| Total cost | $15,000,000 | $2,448,000 | $544,922 |

Adapted from Richard Wilson, School of Medicine, Washington University, “Sequencing the Cancer Genome” http://tinyurl.com/5f3alk
source: Francis Ouellette, OICR
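The arithmetic behind the table above can be reproduced in a few lines. This is a minimal sketch using only the numbers on the slide; the function and variable names are illustrative, not from the original deck. It assumes runs required = genome size × coverage ÷ bp per run, and total cost = runs × cost per run (without rounding runs up first, which is what appears to match the Solexa total).

```python
import math

GENOME_BP = 3_000_000_000  # 3000 Mb, as stated on the slide

# (platform, bp per read, reads per run, required coverage, cost per run in $)
PLATFORMS = [
    ("3730", 600, 96, 6, 48),
    ("454 FLX", 250, 400_000, 12, 6_800),
    ("Solexa", 32, 40_000_000, 25, 9_300),
]


def runs_and_cost(bp_per_read, reads_per_run, coverage, cost_per_run):
    """Return (bp per run, fractional runs required, total cost in $)."""
    bp_per_run = bp_per_read * reads_per_run
    runs = GENOME_BP * coverage / bp_per_run
    return bp_per_run, runs, runs * cost_per_run


for name, *params in PLATFORMS:
    bp_per_run, runs, total = runs_and_cost(*params)
    # Whole runs are reported rounded up; cost uses the fractional run count.
    print(f"{name:8s} bp/run={bp_per_run:>13,} "
          f"runs={math.ceil(runs):>7,} total=${total:,.0f}")
```

Changing the coverage or read length in `PLATFORMS` shows how quickly the run count, and therefore the cost, responds to either parameter.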
![Page 8: Next Now -Generation Genomics: methods and applications for modern disease research](https://reader035.vdocuments.us/reader035/viewer/2022081603/56814790550346895db4c19a/html5/thumbnails/8.jpg)
NGS technologies
• Roche/454 Life Sciences
• Illumina (Solexa)
• ABI SOLiD
• Helicos
• Complete Genomics
• Pacific Biosciences
• Polonator
![Page 9: Next Now -Generation Genomics: methods and applications for modern disease research](https://reader035.vdocuments.us/reader035/viewer/2022081603/56814790550346895db4c19a/html5/thumbnails/9.jpg)
Roche/454 pyrosequencing
![Page 10: Next Now -Generation Genomics: methods and applications for modern disease research](https://reader035.vdocuments.us/reader035/viewer/2022081603/56814790550346895db4c19a/html5/thumbnails/10.jpg)
454 flowgram
454 has difficulty quantizing luminescence of long homopolymers; problem gets worse with homopolymer length
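A toy base caller makes the homopolymer problem concrete. This is a sketch, not the 454 software: it assumes a cyclic flow order of T, A, C, G, treats each flow's signal as roughly proportional to the number of bases incorporated, and rounds to the nearest integer. The signal values are invented for illustration.

```python
FLOW_ORDER = "TACG"  # assumed cyclic nucleotide flow order


def call_bases(flow_values, flow_order=FLOW_ORDER):
    """Round each flow's signal to the nearest integer homopolymer length."""
    seq = []
    for i, signal in enumerate(flow_values):
        base = flow_order[i % len(flow_order)]
        n = int(signal + 0.5)  # nearest-integer call for non-negative signals
        seq.append(base * n)
    return "".join(seq)


# Invented signals: the last flow is a run of five G's.
clean = [1.02, 0.05, 0.98, 2.03, 0.04, 1.01, 0.00, 4.96]
# Same read, but the long-homopolymer signal has drifted by about 10%.
noisy = [1.02, 0.05, 0.98, 2.03, 0.04, 1.01, 0.00, 4.45]

print(call_bases(clean))  # TCGGAGGGGG
print(call_bases(noisy))  # TCGGAGGGG (one G lost)
```

The same relative noise that is harmless on a 1-base signal (1.0 vs. 0.9) is enough to flip a 5-base signal to a 4-base call, which is why the error shows up as insertions and deletions in long homopolymers.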
![Page 11: Next Now -Generation Genomics: methods and applications for modern disease research](https://reader035.vdocuments.us/reader035/viewer/2022081603/56814790550346895db4c19a/html5/thumbnails/11.jpg)
Roche/454
• first commercially available NGS platform
• long reads (most 100-500 bp; soon 1000 bp)
• paired-end module available
• relatively expensive runs
• homopolymer error rate is high
• common uses: metagenomics, bacterial genome (re)sequencing
• James Watson’s genome done entirely on 454
• UVA Biology Dept. has one (Martin Wu)
![Page 12: Next Now -Generation Genomics: methods and applications for modern disease research](https://reader035.vdocuments.us/reader035/viewer/2022081603/56814790550346895db4c19a/html5/thumbnails/12.jpg)
Illumina (Solexa)
• 75 bp reads, PE
• 150-250 bp fragments
• 8 lanes per flowcell
• ~3 Gbp per lane
• < 5% error rate
• available at UVA BRF DNA Core
![Page 13: Next Now -Generation Genomics: methods and applications for modern disease research](https://reader035.vdocuments.us/reader035/viewer/2022081603/56814790550346895db4c19a/html5/thumbnails/13.jpg)
ABI SOLiD
![Page 14: Next Now -Generation Genomics: methods and applications for modern disease research](https://reader035.vdocuments.us/reader035/viewer/2022081603/56814790550346895db4c19a/html5/thumbnails/14.jpg)
![Page 15: Next Now -Generation Genomics: methods and applications for modern disease research](https://reader035.vdocuments.us/reader035/viewer/2022081603/56814790550346895db4c19a/html5/thumbnails/15.jpg)
SOLiD “color space”
![Page 16: Next Now -Generation Genomics: methods and applications for modern disease research](https://reader035.vdocuments.us/reader035/viewer/2022081603/56814790550346895db4c19a/html5/thumbnails/16.jpg)
ABI SOLiD
• short reads (~35 bp)
• cheapest cost/base
• high fidelity reads (easy to detect errors)
• Common uses: SNP discovery
• 1000 Genomes Project
• with PET libraries, all applications within reach …
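The “color space” idea and the “easy to detect errors” claim can be sketched with the standard two-base encoding, in which each color describes a pair of adjacent bases. With A, C, G, T mapped to 0 to 3, the color is the XOR of the two base codes. The sequences below are invented for illustration, and the helper names are not from the slides.

```python
BASE_TO_BITS = {"A": 0, "C": 1, "G": 2, "T": 3}
BITS_TO_BASE = "ACGT"


def encode(seq):
    """Return (first base, colors): one color per adjacent base pair."""
    colors = [BASE_TO_BITS[a] ^ BASE_TO_BITS[b] for a, b in zip(seq, seq[1:])]
    return seq[0], colors


def decode(first_base, colors):
    """Rebuild the base sequence from a known first base and the colors."""
    bases = [first_base]
    for c in colors:
        bases.append(BITS_TO_BASE[BASE_TO_BITS[bases[-1]] ^ c])
    return "".join(bases)


reference = "ATGGCA"
snp_read = "ATGTCA"  # a true single-base difference at position 4

print(encode(reference))  # ('A', [3, 1, 0, 3, 1])
print(encode(snp_read))   # ('A', [3, 1, 1, 2, 1]): two adjacent colors change

# A single miscalled color (third color 0 -> 1) with no real variant:
print(decode("A", [3, 1, 1, 3, 1]))  # ATGTAC: every base after the error shifts
```

A real SNP changes two adjacent colors in a consistent way, while a lone color change cannot correspond to a single-base substitution, so it can be flagged as a likely measurement error when reads are aligned in color space.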
![Page 17: Next Now -Generation Genomics: methods and applications for modern disease research](https://reader035.vdocuments.us/reader035/viewer/2022081603/56814790550346895db4c19a/html5/thumbnails/17.jpg)
Comparing Sequencers
| | Roche (454) | Illumina | SOLiD |
|---|---|---|---|
| Chemistry | Pyrosequencing | Polymerase-based | Ligation-based |
| Amplification | Emulsion PCR | Bridge Amp | Emulsion PCR |
| Paired ends/sep | Yes/3 kb | Yes/200 bp | Yes/3 kb |
| Mb/run | 100 Mb | 1300 Mb | 3000 Mb |
| Time/run | 7 h | 4 days | 5 days |
| Read length | 250 bp | 32-40 bp | 35 bp |
| Cost per run (total) | $8439 | $8950 | $17447 |
| Cost per Mb | $84.39 | $5.97 | $5.81 |
source: Stefan Bekiranov, UVA
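As a quick check, cost per Mb can be recomputed from the Mb/run and cost-per-run rows of the comparison. This sketch uses only the figures listed there; the names are illustrative. The 454 and SOLiD rows reproduce to within a cent, while the Illumina row does not: $8950 over 1300 Mb is about $6.88/Mb, and the listed $5.97 would correspond to roughly 1500 Mb/run. The source does not say which figure is intended.

```python
# platform: (Mb per run, cost per run in $, listed cost per Mb in $)
COMPARISON = {
    "Roche (454)": (100, 8439, 84.39),
    "Illumina": (1300, 8950, 5.97),
    "SOLiD": (3000, 17447, 5.81),
}


def cost_per_mb(mb_per_run, cost_per_run):
    """Cost per megabase, in dollars."""
    return cost_per_run / mb_per_run


for name, (mb, cost, listed) in COMPARISON.items():
    computed = cost_per_mb(mb, cost)
    implied_mb = cost / listed  # Mb/run implied by the listed $/Mb
    print(f"{name:12s} computed=${computed:6.2f}/Mb "
          f"listed=${listed:6.2f}/Mb implied Mb/run={implied_mb:7.0f}")
```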
![Page 18: Next Now -Generation Genomics: methods and applications for modern disease research](https://reader035.vdocuments.us/reader035/viewer/2022081603/56814790550346895db4c19a/html5/thumbnails/18.jpg)
Other NGS platforms
• Helicos (Stephen Quake, Stanford)
  – single molecules on slide
  – like Illumina, but no PCR, greater density
• Complete Genomics
  – sequencing factory
  – 10K human genomes/year, $10K each
• Pacific Biosciences – SMRT
  – DNA polymerase bound to laser/camera hookup
  – records a movie of DNA replication with fluorescent dNTPs as single strand moves through nanopore
• Polonator (Shendure and Church)
  – homebrew, $200K flowcell+laser machine
  – allows custom chemistry protocols
![Page 19: Next Now -Generation Genomics: methods and applications for modern disease research](https://reader035.vdocuments.us/reader035/viewer/2022081603/56814790550346895db4c19a/html5/thumbnails/19.jpg)
NGS applications
• genome (re)sequencing
  – de novo genomes: 454 in Bact, small Euks
  – SNP discovery and genotyping (barcoded pools)
  – targeted, “deep” gene resequencing
  – metagenomics
• structural/copy-number variation
  – Tumor genome SV/CNV: Illumina/PET
• epigenomics – last week’s seminar
• RNA-seq: now-generation transcriptomics
• ChIP-seq: now-generation DNA-binding
![Page 20: Next Now -Generation Genomics: methods and applications for modern disease research](https://reader035.vdocuments.us/reader035/viewer/2022081603/56814790550346895db4c19a/html5/thumbnails/20.jpg)
RNA-seq: RNA abundance
![Page 21: Next Now -Generation Genomics: methods and applications for modern disease research](https://reader035.vdocuments.us/reader035/viewer/2022081603/56814790550346895db4c19a/html5/thumbnails/21.jpg)
![Page 22: Next Now -Generation Genomics: methods and applications for modern disease research](https://reader035.vdocuments.us/reader035/viewer/2022081603/56814790550346895db4c19a/html5/thumbnails/22.jpg)
RNA-seq: alternative splicing
![Page 23: Next Now -Generation Genomics: methods and applications for modern disease research](https://reader035.vdocuments.us/reader035/viewer/2022081603/56814790550346895db4c19a/html5/thumbnails/23.jpg)
RNA-seq
• “unbiased” digital measure of abundance
  – residual PCR artifacts? Helicos says “yes”
• larger dynamic range than microarray
  – depends on sequencing depth (cost)
• ability to see alt./edited transcripts
  – multiple AS sites confounded; 454?
• Total RNA vs. cDNA
  – 3’ end bias of cDNA
  – non-polyA transcripts in total RNA
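To make “digital measure of abundance” and the dependence of dynamic range on depth concrete, here is a minimal sketch using RPKM (reads per kilobase of transcript per million mapped reads). The slides do not name a normalization, so the choice of RPKM is an assumption, and all counts and lengths below are invented.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


# Same read count, different transcript lengths: the shorter one is more abundant.
print(rpkm(500, 2000, 10_000_000))  # 25.0
print(rpkm(500, 1000, 10_000_000))  # 50.0

# The lowest measurable abundance is set by a single read, so it falls
# as depth (and therefore cost) rises.
for depth in (10_000_000, 40_000_000):
    floor = rpkm(1, 1000, depth)
    print(f"depth={depth:,} reads: one read on a 1 kb transcript = {floor} RPKM")
```

Quadrupling the number of mapped reads lowers the detection floor by the same factor, which is the sense in which dynamic range is bought with sequencing depth.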
![Page 24: Next Now -Generation Genomics: methods and applications for modern disease research](https://reader035.vdocuments.us/reader035/viewer/2022081603/56814790550346895db4c19a/html5/thumbnails/24.jpg)
ChIP-seq: protein-DNA binding
![Page 25: Next Now -Generation Genomics: methods and applications for modern disease research](https://reader035.vdocuments.us/reader035/viewer/2022081603/56814790550346895db4c19a/html5/thumbnails/25.jpg)
![Page 26: Next Now -Generation Genomics: methods and applications for modern disease research](https://reader035.vdocuments.us/reader035/viewer/2022081603/56814790550346895db4c19a/html5/thumbnails/26.jpg)
![Page 27: Next Now -Generation Genomics: methods and applications for modern disease research](https://reader035.vdocuments.us/reader035/viewer/2022081603/56814790550346895db4c19a/html5/thumbnails/27.jpg)
PET: Paired End Tag libraries
![Page 28: Next Now -Generation Genomics: methods and applications for modern disease research](https://reader035.vdocuments.us/reader035/viewer/2022081603/56814790550346895db4c19a/html5/thumbnails/28.jpg)
PET applications
![Page 29: Next Now -Generation Genomics: methods and applications for modern disease research](https://reader035.vdocuments.us/reader035/viewer/2022081603/56814790550346895db4c19a/html5/thumbnails/29.jpg)
some things I didn’t get to talk about much:
• personal genome sequencing/medicine
• microbial metagenomics
• ENCODE/modENCODE projects
• HapMap project
• human 1000 Genomes Project (1KGP)
• targeted- and/or deep-resequencing
• microRNAs, piRNAs, ncRNAs, …
• SVs and CNVs (cancer)
• read alignment issues (“mapability”)