leicester - unifedocente.unife.it/.../iontorrent_ferrara_14042014_i.pdf · ion personal genome...
October 2010
Leicester
Phylogeographically informative Haplotypes on the Autosomes and X chromosome (PHAX): blocks showing complete linkage disequilibrium, with evidence that no historical recombination events have occurred.
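One common way to look for evidence of historical recombination between two biallelic SNPs is the four-gamete test: if all four allele combinations are observed across a set of haplotypes, recombination (or recurrent mutation) must have occurred at some point. The sketch below is illustrative only. The haplotypes are made up, and the four-gamete test is not necessarily the criterion used to define PHAX regions.

```python
from itertools import combinations


def four_gamete_violations(haplotypes):
    """Return the SNP pairs (i, j) at which all four gametes are observed.

    haplotypes: list of equal-length strings of '0'/'1', one per chromosome.
    An empty result means no pair of sites shows evidence of recombination.
    """
    n_sites = len(haplotypes[0])
    violations = []
    for i, j in combinations(range(n_sites), 2):
        gametes = {(h[i], h[j]) for h in haplotypes}
        if len(gametes) == 4:
            violations.append((i, j))
    return violations


# Hypothetical data: a block without recombination, and a recombined pair.
block = ["000", "011", "111"]
recombined = ["00", "01", "10", "11"]

print(four_gamete_violations(block))
print(four_gamete_violations(recombined))
```

A region passes when the list of violating pairs is empty for every population examined.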
[Figure: inheritance of a random region and of a PHAX region on the X chromosome, passed between male and female carriers over successive generations. X = recombination event.]
X and Y chromosomes
PhD project
Map labels: Yoruba; Hungarians; Irish; English; Orcadians; Danish; Dutch; Norwegians; French; (CEU); Spanish; Turkish; Palestinians.
12 populations of 20 male individuals each. Yoruba (YRI) in orange, Middle East in green and Europe in blue.
PhD project
PHAX ID  Type        Length (bp)  No of haplotypes defined by SNPs  No of SNPs (CEU population)
5574     non-coding  38,101       7                                 10
8913     non-coding  5,986        5                                 5
3115     non-coding  4,983        5                                 6
3 regions (different length) / 240 samples
The general aim of my project is to describe genetic diversity and to infer human population history in Western Europe.
PhD project
• Sanger sequencing?
    • 800-900 bp maximum length per sequence;
    • 49,000 bp per sample;
    • 55-65 sequencing reactions per sample;
    • 13,200-15,600 sequencing reactions for the whole dataset.
• SNP typing approach?
    • Ascertainment bias;
    • Rare and private alleles.
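The Sanger figures above follow from simple arithmetic, sketched below. The region lengths, the 240 samples and the 55-65 reactions per sample come from the slides. The no-overlap minimum is an added illustration; the upper end of the slide's range presumably allows for overlap between adjacent reads, which is an assumption here.

```python
import math

REGION_LENGTHS_BP = {"5574": 38_101, "8913": 5_986, "3115": 4_983}
N_SAMPLES = 240                   # 3 regions typed in 240 samples
REACTIONS_PER_SAMPLE = (55, 65)   # range quoted on the slide

# Total sequence per sample (the slide rounds this to 49,000 bp).
total_bp = sum(REGION_LENGTHS_BP.values())

# Minimum number of reads needed with no overlap between reads at all.
min_reads_900 = math.ceil(total_bp / 900)
min_reads_800 = math.ceil(total_bp / 800)

# Whole-dataset totals for the slide's per-sample range.
dataset_low = N_SAMPLES * REACTIONS_PER_SAMPLE[0]
dataset_high = N_SAMPLES * REACTIONS_PER_SAMPLE[1]

print(total_bp, min_reads_900, min_reads_800)
print(f"{dataset_low:,}-{dataset_high:,} reactions for the whole dataset")
```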
Future work
• Set up a multiplex reaction to type microsatellites;
• Develop SNP typing approach;
• ...and/or sequencing approach;
29/09/2011
Ion Torrent PGM platform
PhD project
Ion Torrent PGM platform ...why?
• Fast run time (< 3 hours);
• Barcoded samples;
• 200 bp read-length library kit;
• Ion 316 chip (200 Mb expected output);
• Up to 40 samples per run;
• Cost-effective for small-scale projects;
• The machine was in the lab on the other side of the corridor!
Outline
• Introduction;
• Ion Torrent platform – how does it work?;
• Library preparation;
• NGS data analysis;
• Pros and cons.
Amplicon sequencing project
Introduction
Jonathan M. Rothberg
June 2000 – 454 Life Sciences;
November 2006 – one million base pairs of Neanderthal genome;
2007 – Roche Diagnostics buys 454 Life Sciences (US$154.9 million);
2007 – Ion Torrent Inc.;
February 2010 – Ion Torrent platform (PGM);
August 2010 – Life Technologies buys Ion Torrent Inc. (US$375 million + US$350 million);
25th June 2013 – Jonathan M. Rothberg resigns from his position with Life Technologies;
2014???
Introduction
Ion Personal Genome Machine (PGM)
Ion 314™ Chip: 30-50 Mb
Ion 316™ Chip: 300-600 Mb
Ion 318™ Chip: 600 Mb-1 Gb
Ion Proton
Ion PI™ Chip: up to 10 Gb
Ion PII™ Chip
Introduction - Application
Introduction
NGS Glossary
Read – Base pair information of a given length from a DNA or cDNA fragment contained in a sequencing library. Different sequencing platforms are capable of generating different read lengths.
Single End Read – The sequence of the DNA is obtained from the 5’ end of only one strand of the insert. These reads are typically expressed as 1x “y”, where “y” is the length of the read in base pairs (e.g. 1x100bp, 1x200bp).
Output – The amount of data produced by one sequencing run with a specific chip.
Depth of Coverage – The number of reads that span a given DNA sequence of interest. This is commonly expressed as “Yx”, where “Y” is the number of reads and “x” is the unit of the depth-of-coverage metric (e.g. 5x, 10x, 20x, 100x).
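To make the depth-of-coverage definition concrete, the sketch below counts how many reads span each position of a short region. The read coordinates are invented for illustration, and aligned reads are assumed to be given as 0-based, half-open (start, end) intervals.

```python
def depth_per_position(region_length, reads):
    """Count the reads covering each position of a region.

    reads: list of (start, end) intervals, 0-based and half-open.
    """
    depth = [0] * region_length
    for start, end in reads:
        for pos in range(max(start, 0), min(end, region_length)):
            depth[pos] += 1
    return depth


# Hypothetical alignment of three reads on a 10 bp region.
reads = [(0, 6), (2, 8), (4, 10)]
depth = depth_per_position(10, reads)
mean_depth = sum(depth) / len(depth)

print(depth)
print(f"mean depth: {mean_depth}x")
```

Positions 4 and 5 are spanned by all three reads, so their depth is 3x, while the ends of the region are covered only once.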
[Figure: depth-of-coverage examples at 7x, 9x and 11x]
Chip              Run time (200-base reads)  Run time (400-base reads)  Output (200-base reads)  Output (400-base reads)  Expected reads
Ion 314™ Chip v2  2.3 hr                     3.7 hr                     30-50 Mb                 60-100 Mb                400-550 thousand
Ion 316™ Chip v2  3.0 hr                     4.9 hr                     300-600 Mb               600 Mb-1 Gb              2-3 million
Ion 318™ Chip v2  4.4 hr                     7.3 hr                     600 Mb-1 Gb              1.2-2 Gb                 4-5.5 million
Ion PI™ Chip: run time 2-4 hr; output 10 Gb; 60-80 million reads (single figures, not split by read length)
Introduction
Output depends on the read length
Ion Torrent platform – how does it work?
• First “post-light” sequencing technology;
• Semiconductor technology;
• Differs from other NGS technologies in that no modified nucleotides or optics are used;
Ion 316™ Chip
Jason Gagliano Wake Forest University NanoMedica LLC
Ion Torrent platform – how does it work?
DNA → ions → sequence
Nucleotides flow sequentially over the Ion chip;
One sensor per well per sequencing reaction;
Direct detection of natural DNA extension;
Millions of sequencing reactions per chip.
Ion Torrent platform – how does it work?
Signal calling for one well
T, A, C, G
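A much simplified sketch of signal calling for one well: nucleotides are flowed in a fixed cycle (the T, A, C, G order shown above), and the signal measured in each flow is roughly proportional to the number of bases incorporated, so rounding it gives the homopolymer length. The signal values are invented, and real base calling also involves normalisation and phase correction, which are left out here.

```python
FLOW_ORDER = "TACG"  # flow cycle shown on the slide


def call_bases(signals, flow_order=FLOW_ORDER):
    """Convert per-flow signal intensities for one well into a base sequence.

    Each signal is rounded to the nearest whole number of incorporations;
    a signal near 0 means the flowed nucleotide was not incorporated.
    """
    bases = []
    for flow_index, signal in enumerate(signals):
        nucleotide = flow_order[flow_index % len(flow_order)]
        homopolymer_length = max(0, round(signal))
        bases.append(nucleotide * homopolymer_length)
    return "".join(bases)


# Hypothetical normalised signals for eight flows (two T-A-C-G cycles).
signals = [1.03, 0.05, 2.10, 0.90, 0.02, 1.10, 0.00, 0.10]
print(call_bases(signals))
```

The third flow (C) gives a signal close to 2, which is read as the homopolymer CC; this rounding step is where homopolymer-length errors can arise when signals are noisy.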
Amplicon sequencing project
PHAX ID  Type        Length (bp)
5574     non-coding  38,101
8913     non-coding  5,986
3115     non-coding  4,983
3 regions (different length) / 240 samples
Ion 316™ Chip
200 Mb
How many samples per run?
Coverage: ~100x
Total length of DNA sequence per sample: ~50,000 bp
# samples per run = 200,000,000 / (50,000 x 100) = 40 samples per run with Ion 316 chip and 200 bp read-length kit
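The same calculation as a small helper, so that other chips or coverage targets can be tried. The function name is hypothetical; the inputs are the slide's figures (200 Mb expected output, ~50,000 bp per sample, ~100x coverage), and the result ignores losses such as off-target or low-quality reads.

```python
def samples_per_run(chip_output_bp, target_bp_per_sample, coverage):
    """Number of whole samples that fit on one chip at the requested coverage."""
    bp_needed_per_sample = target_bp_per_sample * coverage
    return chip_output_bp // bp_needed_per_sample


n_samples = samples_per_run(
    chip_output_bp=200_000_000,   # Ion 316 chip, 200 bp read-length kit
    target_bp_per_sample=50_000,  # three PHAX regions, rounded up
    coverage=100,
)
print(n_samples)
```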
Amplicon sequencing project
Single-end (SE) or paired-end (PE) sequencing.
fragment             ========================================
fragment + adapters  ~~~========================================~~~
SE read              --------->
PE reads             R1--------->                  <---------R2
unknown gap                      ..................
Ion Torrent technology: each read spans the whole fragment, so the fragment must be no longer than the read length.
fragment             =======================
fragment + adapters  ~~~=======================~~~
SE read              ------------------------------------>
Amplicon sequencing project – Library preparation
DNA source and quantity → library prep → template preparation → sequencing
Amplicon sequencing project – Library preparation
DNA source and quantity
• whole genome
• exome
• custom enrichment
• amplicon seq
• RNA
library prep
• shearing
• size selection I
• adapter ligation and barcoding
• amplification
• library pool + size selection II
• quantification
template preparation
• clonal amplification by emPCR
• template-positive ISPs enrichment
sequencing
Amplicon sequencing project – AmpliconSeq
AmpliconSeq = sequencing of amplicons. For each sample, PCR of:
PHAX ID  Type        Length (bp)  Amplicons
5574     non-coding  38,101       split in eight amplicons of ~5 kb each
8913     non-coding  5,986        one amplicon
3115     non-coding  4,983        one amplicon
Total: ten amplicons for each sample
Amplicon sequencing project – AmpliconSeq
[Gel image: lanes loaded with 2 µl of stock and 2 µl of 1:5 dilution; size markers at 6000 and 5000]
Pool the amplicons (equimolar concentration) for each sample.
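Equimolar pooling means adding the same number of molecules of each amplicon, so longer or more dilute amplicons need a larger volume. A minimal sketch follows, assuming the usual approximation of 660 g/mol per base pair of double-stranded DNA. The amplicon names, lengths and concentrations are hypothetical, not measurements from this project.

```python
AVG_BP_MASS = 660.0  # g/mol per base pair of dsDNA (standard approximation)


def molarity_nM(conc_ng_per_ul, length_bp):
    """Convert a mass concentration (ng/µl) into a molar one (nM)."""
    return conc_ng_per_ul * 1e6 / (AVG_BP_MASS * length_bp)


def equimolar_volumes(amplicons, fmol_each):
    """Volume (µl) of each amplicon that contains `fmol_each` femtomoles.

    amplicons: dict mapping name -> (length_bp, conc_ng_per_ul).
    Uses the identity 1 nM == 1 fmol/µl.
    """
    volumes = {}
    for name, (length_bp, conc) in amplicons.items():
        volumes[name] = fmol_each / molarity_nM(conc, length_bp)
    return volumes


# Hypothetical amplicons: (length in bp, concentration in ng/µl).
amplicons = {"amplicon_A": (5000, 33.0), "amplicon_B": (6000, 19.8)}
volumes = equimolar_volumes(amplicons, fmol_each=50)

for name, vol in volumes.items():
    print(f"{name}: {vol:.1f} µl")
```

Here amplicon_A is at 10 nM and amplicon_B at 5 nM, so twice the volume of amplicon_B is needed to contribute the same 50 fmol.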
Amplicon sequencing project – Shearing
Two main methods are used:
• Sonication: hydrodynamic shearing using acoustic energy; bubbles form in solution and break the DNA when they collapse;
• Enzymatic reaction: enzymes randomly cut the dsDNA in a time-dependent manner.
We need fragments as short as...? The read length! E.g. 200 bp read length → 200 bp fragments.
modified from Mardis 2008 Annu Rev Genomics Hum Genet 9:387-402
Amplicon sequencing project – Size selection I
Original protocol: one size selection for each sample on the gel. Modified protocol: magnetic beads.
Amplicon sequencing project – Size selection I
Depending on the ratio of beads solution to DNA solution, fragments of different lengths will attach to the beads.
Amplicon sequencing project – Size selection I
DNA solution: 100 µl
0.8x beads solution = 80 µl: fragments greater than 350 bp will attach to the beads.
[Figure labels: 100 µl (buffer), 80 µl (beads); size ladder with bands at 766, 500, 350, 300, 250, 200, 150, 100, 75, 50 and 25]
Amplicon sequencing project – Size selection I
+ 0.4x beads solution = 40 µl (0.8x already in solution) = 1.2x overall
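The bead volumes in this two-step size selection are ratios of the original DNA solution volume. The sketch below reproduces the slide's numbers (100 µl sample, 0.8x then a further 0.4x); the function name is hypothetical, and the fragment-length cut-offs themselves depend on the bead product and are not modelled.

```python
def bead_volume_ul(sample_volume_ul, ratio):
    """Volume of beads solution for a given beads-to-sample ratio."""
    return round(sample_volume_ul * ratio, 3)


sample_ul = 100

# Step 1: 0.8x beads bind the long fragments (> 350 bp on the slide).
first_addition = bead_volume_ul(sample_ul, 0.8)

# Step 2: a further 0.4x is added; 0.8x is already in solution.
second_addition = bead_volume_ul(sample_ul, 0.4)

# Overall beads-to-sample ratio after both additions.
overall_ratio = round((first_addition + second_addition) / sample_ul, 3)

print(first_addition, second_addition, overall_ratio)
```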
Amplicon sequencing project – Size selection I
+ 86 bp = 286 bp
[Figure panels: before size selection I (only shearing); after size selection I]
Amplicon sequencing project – Adapter ligation and barcoding
• Adapters: 32-43 bp fragments that contain primer sites for amplification and are needed to link the fragment to the support (slide, bead);
• Barcodes/indexes: 10-13 bp fragments that carry a unique sequence; they are used to distinguish samples run on the same chip;
• Adapters and barcodes are combined into one fragment.
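Because each library carries its own barcode, reads from a pooled run can be sorted back into samples by inspecting the start of each read. The sketch below does this with exact prefix matching. The barcode and read sequences are made up for illustration, and real demultiplexing software usually tolerates some sequencing errors in the barcode.

```python
def demultiplex(reads, barcodes):
    """Assign reads to samples by their leading barcode and trim it off.

    barcodes: dict mapping sample name -> barcode sequence.
    Returns (reads grouped by sample, reads with no matching barcode).
    """
    by_sample = {sample: [] for sample in barcodes}
    unassigned = []
    for read in reads:
        for sample, barcode in barcodes.items():
            if read.startswith(barcode):
                by_sample[sample].append(read[len(barcode):])
                break
        else:
            # No barcode matched this read.
            unassigned.append(read)
    return by_sample, unassigned


# Hypothetical 10 bp barcodes and reads.
barcodes = {"sample_1": "ACGTACGTAC", "sample_2": "TGCATGCATG"}
reads = ["ACGTACGTACGGATTC", "TGCATGCATGCCTTAA", "NNNNNNNNNNAAAA"]

by_sample, unassigned = demultiplex(reads, barcodes)
print(by_sample)
print(unassigned)
```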
Amplicon sequencing project – Amplification
• Amplification: needed to increase the concentration of the fragments that have successfully incorporated adapters.
modified from Mardis 2008 Annu Rev Genomics Hum Genet 9:387-402
Not always needed: if you have enough material you can skip to the quantification step.
Amplicon sequencing project – Library pool + size selection II
40 samples → 40 libraries with 40 different barcodes
Library pool (40 libraries together)
Amplicon sequencing project – Library pool + size selection II
[Gel image: library pool (40 samples); size markers at 350, 300, 250 and 200]
Amplicon sequencing project – Quantification
Dilution series: 1:2, 1:4, 1:8, 1:16 (Replicate 1 and Replicate 2)
Stock concentration → dilution to 26 pM
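The quantification step can be sketched as two calculations: back-calculating the stock concentration from a dilution series, then working out how to dilute the stock to the 26 pM target with C1·V1 = C2·V2. The measured concentrations and the 100 µl final volume below are hypothetical; only the dilution factors and the 26 pM target come from the slide.

```python
def stock_from_dilutions(measured_pM):
    """Mean stock concentration back-calculated from a dilution series.

    measured_pM: dict mapping dilution factor (2 for 1:2) -> measured pM.
    """
    estimates = [factor * conc for factor, conc in measured_pM.items()]
    return sum(estimates) / len(estimates)


def dilution_volumes(stock_pM, target_pM, final_volume_ul):
    """Volumes of stock and diluent (µl) from C1*V1 = C2*V2."""
    if target_pM > stock_pM:
        raise ValueError("target concentration exceeds stock concentration")
    stock_ul = target_pM * final_volume_ul / stock_pM
    return stock_ul, final_volume_ul - stock_ul


# Hypothetical measurements for one replicate of the series.
measured = {2: 130.0, 4: 65.0, 8: 32.5, 16: 16.25}
stock_pM = stock_from_dilutions(measured)

stock_ul, diluent_ul = dilution_volumes(stock_pM, target_pM=26, final_volume_ul=100)
print(stock_pM, stock_ul, diluent_ul)
```

Averaging the back-calculated values across dilutions (and across the two replicates) is one simple way to reduce the effect of a single poor measurement.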
Amplicon sequencing project – Template preparation
modified from Metzker ML 2010 Nature Reviews Genetics
[Figure labels: A adapter; barcode; DNA fragment to be sequenced; P1 adapter]
Amplicon sequencing project – Template preparation
Template-positive ISPs enrichment
Amplicon sequencing project – Sequencing
Ion 316™ Chip
+ Sequencing primer
+ Sequencing polymerase
[Figure: nucleotides (C, T, A) flowing over the template; H+ is released when a nucleotide is incorporated]
Ion Torrent summary
Useful papers:
Rothberg JM et al., 2011, Nature – Ion Torrent paper;
Quail MA et al., 2012, BMC Genomics – Comparison between different NGS platforms;
Metzker ML, 2010, Nature Reviews Genetics – Sequencing technologies;
Mardis ER, 2008, Annu. Rev. Genomics Hum. Genet. – Next-generation DNA sequencing methods;
Lam H et al., 2012, Nature Biotechnology – HugeSeq pipeline.
Ion Torrent wet-lab summary
Introduction
Platforms: Ion Torrent PGM (PGM 314, PGM 316, PGM 318); Ion Torrent Proton (Proton I, Proton II)
Chip type     PGM 314       PGM 316     PGM 318       Proton I  Proton II (~July-2013)
# of sensors  1.3M          6.3M        11M           165M      660M
Total output  10-40 Mb      100-400 Mb  ~1 Gb         ~10 Gb    ~100 Gb
Run time      1-2 hrs       1-2 hrs     1-2 hrs       2.5 hrs   2.5 hrs
Read length   up to 400 bp  ~200 bp     up to 400 bp  ~200 bp   ~200 bp
Total reads   up to 0.6M    up to 3M    up to 6M      60-80M    240-330M