introduction to ngs - course list- dtu health...
TRANSCRIPT
Introduction to NGS
Gisle VestergaardAssociate Professor
Section of BioinformaticsTechnical University of Denmark
DTU Health Technology Bioinformatics
DTU Sundhedsteknologi3. juni 2019 Introduction to NGS
Menu• Why are you all here today?• Why NGS?• How?• Why do we need bioinformatics for NGS?
3
DTU Sundhedsteknologi3. juni 2019 Introduction to NGS
Life science data deluge
4
• Massive unstructured data from several areas DNA, patient journals, proteomics, imaging, ...
• Impacts Industry, Environment, Health
• Societal grand challenges• Cheap sequencing technologies
results in explosion of DNA data
DTU Sundhedsteknologi3. juni 2019 Introduction to NGS 5
Reading the order of bases in DNA fragments
DNA sequencing
DTU Sundhedsteknologi3. juni 2019 Introduction to NGS
Why NGS?
Transforming how we are doing biological science (and bioinformatics)
By
allowing experiments that could not have beendone before, and perform experiments much faster
6
DTU Sundhedsteknologi3. juni 2019 Introduction to NGS
How ?
by producing massive amounts of sequence data, really fast
7
DTU Sundhedsteknologi3. juni 2019 Introduction to NGS
First generation: Sanger• Fragment DNA• Clone into plasmid and amplify• DNA polymerase and only 1 primer• Sequence using dNTP + labelled ddNTPs (stops reaction)• Early method used radioactive labels, later came fluorescent tags• Run capillary electrophoresis/gel and “read” DNA code• Low output, long reads (~800-1200 nt), high quality
8
DTU Sundhedsteknologi3. juni 2019 Introduction to NGS
1st generation to NGS
• 454 Life Sciences. Bought by Roche 2007. Shut down 2013
• Illumina is currently cheapest per GB• Long-read sequencing outputs about 100 GB
9
DTU Sundhedsteknologi3. juni 2019 Introduction to NGS
Sequencing costs
• Computer speed and storagecapacity is doubling every18 months and this rate is steady
• DNA sequence data is doubling faster thancomputer speeds!
10
DTU Sundhedsteknologi3. juni 2019 Introduction to NGS
Human sequencing
• First draft genome of human in 2001, final 2004• Estimated costs $3 billion, time 13 years
• Today:–1000-2000$ for one genome–A couple of days!
11
DTU Sundhedsteknologi3. juni 2019 Introduction to NGS
Storage and analysis• Cost of sequencing is comparable to cost of storage and
analysis• One Illumina HiSeq X Ten system: 18,000 human genomes
per year!• A standard human (30-40x) whole-genome sequencing exp.
would create 150 Gb of data
12
DTU Sundhedsteknologi3. juni 2019 Introduction to NGS
Distributed data production
• World wide >900 centers• >60 Pb pr year (2014)• Data transfer and storage
becomes an issue
13
DTU Sundhedsteknologi3. juni 2019 Introduction to NGS
The X Genomes projects
• Human population projects–1000 genomes project (2500 individuals)–Genomics England (100k individuals)–US Precisions medicine (1 million individuals)
• 100K pathogens project, Earth Microbiome project, Cancer genome project, Plants and animals, Insects,…
14
DTU Sundhedsteknologi3. juni 2019 Introduction to NGS
NGS in the clinic• Diagnostics of patients (+ fetus)• Focused treatment of cancer patients• Sequencing of bacterial isolates• Country-wide projects:• UK, US, Qatar, Finland, China, …• DK: Danish regions want to sequence 100k individuals
15
DTU Sundhedsteknologi3. juni 2019 Introduction to NGS
Personalized medicine
• Personalized medicine initiative in DK• Up to 70% of medicine does not work!• Sequence 100,000 patients on hospitals• Use extensive registry data• Current: 100 mill DKK (estimated 2 bill DKK)• National Genomics Institute
16
DTU Sundhedsteknologi3. juni 2019 Introduction to NGS
NGS & bioinformatics
• Extreme data size causes problems• Just transferring and storing the data• Standard comparisons fail (N*N)• Standard/old tools can not be used• Think in fast and parallel programs
17
DTU Sundhedsteknologi3. juni 2019 Introduction to NGS
Novel technology allows novel questions
18
DTU Sundhedsteknologi3. juni 2019 Introduction to NGS
What can we use it for?
• Whole genome re-sequencing• Population genomics• Diagnostics • Cancer genomics• Ancient genomes• Metagenomics• RNA sequencing• Single cell sequencing• Genomic Epidemiology• anything with DNA
19
DTU Sundhedsteknologi3. juni 2019 Introduction to NGS
Main pitfall of NGS• Finding a needle in a haystack is difficult…especially if you do not know that you are
looking for a needle
20
DTU Sundhedsteknologi3. juni 2019 Introduction to NGS
Summary
• Bioinformatic knowledge is needed to handle the amount of data• Sequencing technology keeps getting better, both in wet and dry
labs• Metagenomic data does no longer just describe genetic potential
21
DTU Sundhedsteknologi3. juni 2019 Introduction to NGS
How it works
22