Torrents of sequence

Pierpaolo Maisano Delser

pm244@le.ac.uk

Ferrara, 14th April 2014

October 2010

Leicester

Phylogeographically informative Haplotypes on the Autosomes and X chromosome (PHAX): blocks showing complete linkage disequilibrium with evidence for no historical recombination events.

[Figure: a random region and a PHAX region on the X chromosome followed through successive generations of male and female carriers. X = recombination event.]

X and Y chromosomes

PhD project

[Map labels: Yoruba, Hungarians, Irish, English, Orcadians, Danish, Dutch, Norwegians, French, (CEU), Spanish, Turkish, Palestinians]

12 populations of 20 male individuals each. Yoruba (YRI) in orange, Middle East in green and Europe in blue.

PhD project

PHAX ID   Type         Length (bp)   No of haplotypes defined by SNPs   No of SNPs (CEU population)
5574      non-coding   38,101        7                                  10
8913      non-coding   5,986         5                                  5
3115      non-coding   4,983         5                                  6

3 regions (different length) / 240 samples

The general aim of my project is to describe genetic diversity and to infer human population history in Western Europe.

PhD project

• Sanger sequencing?
  • 800-900 bp maximum length per sequence;
  • 49,000 bp per sample;
  • 55-65 sequencing reactions per sample;
  • 13,200-15,600 sequencing reactions for the whole dataset.

• SNP typing approach?
  • Ascertainment bias;
  • Rare and private alleles.
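The Sanger estimate above can be checked with a few lines of Python. This is a minimal sketch: the function name `sanger_reactions` is made up for illustration, and it assumes reads tile the region end to end with no overlap.

```python
import math

def sanger_reactions(region_bp, read_bp, n_samples):
    """Return (reactions per sample, reactions for the whole dataset)."""
    # Assumption: reads tile the region with no overlap between them.
    per_sample = math.ceil(region_bp / read_bp)
    return per_sample, per_sample * n_samples

for read_bp in (900, 800):
    per_sample, total = sanger_reactions(49_000, read_bp, 240)
    print(f"{read_bp} bp reads: {per_sample} reactions/sample, {total} in total")
```

Without overlap this gives 55-62 reactions per sample; the upper figure of 65 on the slide presumably leaves room for overlap between adjacent reads.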

Future work

• Set up a multiplex reaction to type microsatellites;

• Develop SNP typing approach;

• ...and/or sequencing approach;

29/09/2011

Ion Torrent PGM platform

PhD project

Ion Torrent PGM platform... why?

• Fast run time (< 3 hours);

• Barcoded samples;

• 200 bp read-length library kit;

• Ion 316 chip (200 Mb expected output);

• Up to 40 samples per run;

• Cost-effective for small-scale projects;

• The machine was in the lab on the other side of the corridor!

Outline

• Introduction;

• Ion Torrent platform – how does it work?;

• Library preparation;

• NGS data analysis;

• Pros and cons.

Amplicon sequencing project

Introduction

Jonathan M. Rothberg

June 2000 – 454 Life Sciences;

November 2006 – one million base pairs of Neanderthal genome;

2007 – Roche Diagnostics buys 454 Life Sciences (US$154.9 million);

2007 – Ion Torrent Inc.;

February 2010 – Ion Torrent platform (PGM);

August 2010 – Life Technologies buys Ion Torrent Inc. (US$375 million + up to US$350 million in milestone payments);

25th June 2013 – Jonathan M. Rothberg resigns from his position at Life Technologies;

2014???

Introduction

Ion Personal Genome Machine (PGM)

• Ion 314™ Chip: 30-50 Mb
• Ion 316™ Chip: 300-600 Mb
• Ion 318™ Chip: 600 Mb-1 Gb

Ion Proton

• Ion PI™ Chip: up to 10 Gb
• Ion PII™ Chip

Introduction - Application

Introduction

NGS Glossary

• Read – Base pair information of a given length from a DNA or cDNA fragment contained in a sequencing library. Different sequencing platforms are capable of generating different read lengths.

• Single End Read – The sequence of the DNA is obtained from the 5’ end of only one strand of the insert. These reads are typically expressed as 1x “y”, where “y” is the length of the read in base pairs (e.g. 1x100 bp, 1x200 bp).

• Output – The amount of data produced by one sequencing run with a specific chip.

• Depth of Coverage – The number of reads that span a given DNA sequence of interest. This is commonly expressed as “Yx”, where “Y” is the number of reads and “x” is the unit reflecting the depth of coverage metric (e.g. 5x, 10x, 20x, 100x).

[Figure: depth of coverage illustrated at 7x, 9x and 11x.]
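The depth-of-coverage definition can be made concrete with a small sketch. The function `depth_per_base` and the toy read positions are hypothetical. Reads are given as 0-based, half-open (start, end) intervals on the region of interest.

```python
def depth_per_base(region_length, reads):
    """Count how many reads span each position of the region."""
    depth = [0] * region_length
    for start, end in reads:
        # Clip each read to the region before counting it.
        for pos in range(max(start, 0), min(end, region_length)):
            depth[pos] += 1
    return depth

# Toy example: three reads on a 10 bp region.
reads = [(0, 5), (2, 8), (4, 10)]
depth = depth_per_base(10, reads)
mean_depth = sum(depth) / len(depth)
print(depth, f"mean depth = {mean_depth}x")
```

Position 4 is spanned by all three reads, so its depth is 3x. The mean over the region is what is usually quoted as "the" coverage.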

Chip               Expected run time (200-base / 400-base reads)   Expected output (200-base / 400-base reads)   Expected reads
Ion 314™ Chip v2   2.3 hr / 3.7 hr                                 30-50 Mb / 60-100 Mb                          400-550 thousand
Ion 316™ Chip v2   3.0 hr / 4.9 hr                                 300-600 Mb / 600 Mb-1 Gb                      2-3 million
Ion 318™ Chip v2   4.4 hr / 7.3 hr                                 600 Mb-1 Gb / 1.2-2 Gb                        4-5.5 million
Ion PI™ Chip       2-4 hr                                          10 Gb                                         60-80 million

Introduction

Output depends on the read length
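A rough way to see why output depends on read length: output is approximately the number of reads times the read length. The sketch below uses a hypothetical helper, `expected_output_mb`, and assumes every read reaches the full nominal length. Real runs do not, so this is an upper bound.

```python
def expected_output_mb(reads_millions, read_length_bp):
    """Upper-bound output in Mb: million reads x bases per read."""
    return reads_millions * read_length_bp

# Ion 316 chip, 2-3 million reads (figures from the table above).
for read_length in (200, 400):
    low = expected_output_mb(2, read_length)
    high = expected_output_mb(3, read_length)
    print(f"{read_length}-base reads: {low}-{high} Mb")
```

The quoted ranges (300-600 Mb and 600 Mb-1 Gb) sit at or below these idealised values, as expected when some reads are shorter than the nominal length.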

Ion Torrent platform – how does it work?

• First “post-light” sequencing technology;

• Semiconductor technology;

• Differs from other NGS technologies in that no modified nucleotides or optics are used;

Ion 316™ Chip

Jason Gagliano Wake Forest University NanoMedica LLC

DNA → ions → sequence

• Nucleotides flow sequentially over the Ion chip;

• One sensor per well per sequencing reaction;

• Direct detection of natural DNA extension;

• Millions of sequencing reactions per chip.

Signal calling for one well

T, A, C, G
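Signal calling for one well can be illustrated with an idealised simulation. This is a sketch, not the instrument's actual base caller. It assumes a fixed T, A, C, G flow order and a signal exactly proportional to the number of bases incorporated in each flow. The names `flowgram` and `call_bases` are made up for the example.

```python
def flowgram(synthesised, flow_order="TACG", max_flows=400):
    """Return [(flowed nucleotide, bases incorporated)] for one well.

    `synthesised` is the strand being built by the polymerase.
    """
    signals = []
    pos = 0
    flow = 0
    while pos < len(synthesised) and flow < max_flows:
        nucleotide = flow_order[flow % len(flow_order)]
        incorporated = 0
        # A homopolymer run is incorporated within a single flow.
        while pos < len(synthesised) and synthesised[pos] == nucleotide:
            incorporated += 1
            pos += 1
        signals.append((nucleotide, incorporated))
        flow += 1
    return signals

def call_bases(signals):
    """Turn an idealised flowgram back into a base sequence."""
    return "".join(nucleotide * count for nucleotide, count in signals)

signals = flowgram("TTACGG")
print(signals)
print(call_bases(signals))
```

A flow that matches no base gives a zero signal, and a run such as TT gives a double signal in one flow.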

Amplicon sequencing project

PHAX ID   Type         Length (bp)
5574      non-coding   38,101
8913      non-coding   5,986
3115      non-coding   4,983

3 regions (different length) / 240 samples

Ion 316™ Chip: 200 Mb

How many samples per run?

• Coverage: ~100x
• Total length of DNA sequence per sample: ~50,000 bp
• Samples per run = 200,000,000 / (50,000 x 100) = 40 samples per run with the Ion 316 chip and the 200 bp read-length kit
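The same calculation as a small function, so other chips or coverage targets can be tried. A minimal sketch: `samples_per_run` is a hypothetical helper, and it assumes the output is shared evenly between samples.

```python
def samples_per_run(chip_output_bp, bp_per_sample, coverage):
    """Whole samples that fit in one run at the requested coverage."""
    # Assumption: chip output is split evenly across barcoded samples.
    return chip_output_bp // (bp_per_sample * coverage)

# Ion 316 chip (200 Mb), ~50,000 bp per sample, ~100x coverage.
print(samples_per_run(200_000_000, 50_000, 100))
```

In practice barcoded libraries are never perfectly balanced, so the real per-sample coverage varies around the target.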

fragment              ========================================
fragment + adaptors   ~~~========================================~~~
SE read               --------->
PE reads              R1--------->                  <---------R2
unknown gap                       ..................

Amplicon sequencing project

Single-end (SE) or paired-end (PE) sequencing.

Ion Torrent technology: each read spans the whole fragment, so the fragment must be no longer than the read length.

fragment              =======================
fragment + adaptors   ~~~=======================~~~
SE read               ------------------------------------>

Amplicon sequencing project – Library preparation

DNA source and quantity → library prep → template preparation → sequencing

Amplicon sequencing project – Library preparation

DNA source and quantity
• whole genome
• exome
• custom enrichment
• amplicon seq
• RNA

library prep
• shearing
• size selection I
• adapter ligation and barcoding
• amplification
• library pool + size selection II
• quantification

template preparation
• clonal amplification by emPCR
• template-positive ISPs enrichment

sequencing

Amplicon sequencing project – AmpliconSeq

AmpliconSeq = sequencing of amplicons. For each sample, PCR of:

PHAX ID   Type         Length (bp)   Amplicons
5574      non-coding   38,101        split into eight amplicons of ~5 kb each
8913      non-coding   5,986         one amplicon
3115      non-coding   4,983         one amplicon

Total: ten amplicons for each sample

[Gel image: lanes loaded with 2 μl of stock and 2 μl of 1:5 dilution; ladder bands marked at 6000 and 5000 bp.]

Pool the amplicons (equimolar concentration) for each sample.
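Equimolar pooling means adding the same number of molecules of each amplicon, not the same mass. Below is a minimal sketch using the standard approximation of 660 g/mol per base pair of double-stranded DNA. The function names, amplicon names and concentrations are invented for illustration.

```python
AVG_MASS_PER_BP = 660  # g/mol per base pair of dsDNA (standard approximation)

def conc_nM(ng_per_ul, length_bp):
    """Convert a mass concentration (ng/μl) to a molar one (nM = fmol/μl)."""
    return ng_per_ul * 1e6 / (AVG_MASS_PER_BP * length_bp)

def pooling_volumes(amplicons, fmol_each):
    """Volume (μl) of each amplicon that contains `fmol_each` femtomoles."""
    return {
        name: fmol_each / conc_nM(ng_per_ul, length_bp)
        for name, (ng_per_ul, length_bp) in amplicons.items()
    }

# Hypothetical measurements: name -> (ng/μl, length in bp).
amplicons = {"amplicon_1": (33.0, 5000), "amplicon_2": (19.8, 6000)}
for name, volume in pooling_volumes(amplicons, fmol_each=100).items():
    print(f"{name}: {volume:.1f} μl")
```

A longer amplicon at the same ng/μl contains fewer molecules, so it needs a larger volume to contribute the same number of copies.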

Amplicon sequencing project – Shearing

Two main methods are used:

• Sonication: hydrodynamic shearing using acoustic energy; bubbles form in solution and break the DNA when they collapse;

• Enzymatic reaction: enzymes randomly cut the dsDNA in a time-dependent manner.

We need fragments as short as...? The read length! E.g. 200 bp read length → 200 bp fragments.

modified from Mardis 2008 Annu Rev Genomics Hum Genet 9:387-402

Amplicon sequencing project – Size selection I

Original protocol: one size selection for each sample on the gel. Modified protocol: magnetic beads.

Amplicon sequencing project – Size selection I

Depending on the ratio of bead solution to DNA solution, fragments of different lengths will bind to the beads.

Amplicon sequencing project – Size selection I

DNA solution: 100 μl

0.8x bead solution = 80 μl: fragments greater than 350 bp will bind to the beads.

[Figure: size ladder with bands at 766, 500, 350, 300, 250, 200, 150, 100, 75, 50 and 25 bp; 100 μl of DNA solution plus 80 μl of bead buffer.]

Amplicon sequencing project – Size selection I

+ 0.4x bead solution = 40 μl (0.8x already in solution) = 1.2x overall
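The two bead additions can be worked out from the target ratios. This sketch follows the convention used on the slides, where both ratios are expressed relative to the original DNA solution volume. `bead_volumes` is a hypothetical helper.

```python
def bead_volumes(sample_ul, first_ratio, final_ratio):
    """Return (first addition, second addition) of bead solution in μl.

    Both ratios are relative to the starting sample volume.
    """
    if final_ratio < first_ratio:
        raise ValueError("final ratio must not be lower than the first ratio")
    first_ul = sample_ul * first_ratio
    second_ul = sample_ul * (final_ratio - first_ratio)
    return round(first_ul, 2), round(second_ul, 2)

# 100 μl of DNA solution: 0.8x first, then up to 1.2x overall.
print(bead_volumes(100, 0.8, 1.2))
```

The first addition binds the long fragments, which are discarded with the beads. The second addition binds the target range, which is kept.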

Amplicon sequencing project – Size selection I

200 bp + 86 bp = 286 bp

[Figure: fragment size profiles before size selection I (shearing only) and after size selection I.]

Amplicon sequencing project – Adapter ligation and barcoding

• Adapters: 32-43 bp fragments that contain primer sites for amplification and are needed to attach the fragment to the support (slide, bead);

• Barcodes/indexes: 10-13 bp fragments that carry a unique sequence; they are used to distinguish samples run on the same chip;

• Adapters and barcodes are combined into one fragment.
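How barcodes separate samples after the run can be shown with a toy demultiplexer. This is a sketch under simplifying assumptions: each read is assumed to begin with its barcode, matching is exact, and the barcode sequences and sample names are invented.

```python
def demultiplex(reads, barcodes):
    """Assign reads to samples by their leading barcode.

    barcodes: dict mapping barcode sequence -> sample name.
    Returns (reads per sample with the barcode trimmed, unassigned reads).
    """
    by_sample = {sample: [] for sample in barcodes.values()}
    unassigned = []
    for read in reads:
        for barcode, sample in barcodes.items():
            if read.startswith(barcode):
                by_sample[sample].append(read[len(barcode):])
                break
        else:
            # No barcode matched exactly.
            unassigned.append(read)
    return by_sample, unassigned

# Hypothetical 10-base barcodes and reads.
barcodes = {"ACGTACGTAC": "sample_1", "TGCATGCATG": "sample_2"}
reads = ["ACGTACGTACGGGTTT", "TGCATGCATGAAACCC", "NNNNNNNNNNGGGG"]
print(demultiplex(reads, barcodes))
```

Production pipelines usually tolerate a sequencing error or two in the barcode; exact matching is used here only to keep the idea visible.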

Amplicon sequencing project – Amplification

• Amplification: needed to increase the concentration of the fragments that were successfully ligated to adapters.

modified from Mardis 2008 Annu Rev Genomics Hum Genet 9:387-402

Not always needed: if you have enough material, you can skip to the quantification step.

Amplicon sequencing project – Library pool + size selection II

40 samples → 40 libraries with 40 different barcodes

Library pool (40 libraries together)

[Gel image: size selection II on the library pool (40 samples); ladder bands at 350, 300, 250 and 200 bp.]

Amplicon sequencing project – Quantification

[Figure: dilution series (1:2, 1:4, 1:8, 1:16) measured in two replicates.]

Stock concentration → dilution to 26 pM
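Once the stock concentration is known, the dilution to 26 pM follows from C1·V1 = C2·V2. A minimal sketch: `dilute_to` is a hypothetical helper, and the stock concentration and final volume in the example are invented.

```python
def dilute_to(stock_pM, target_pM, final_volume_ul):
    """Return (μl of stock, μl of diluent) needed to reach target_pM."""
    if stock_pM < target_pM:
        raise ValueError("stock is already below the target concentration")
    # C1 * V1 = C2 * V2  ->  V1 = C2 * V2 / C1
    stock_ul = target_pM * final_volume_ul / stock_pM
    return stock_ul, final_volume_ul - stock_ul

# Hypothetical library pool at 260 pM, 100 μl needed at 26 pM.
stock_ul, diluent_ul = dilute_to(260, 26, 100)
print(f"{stock_ul:.1f} μl stock + {diluent_ul:.1f} μl diluent")
```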

Amplicon sequencing project – Template preparation

[Figure: library fragment layout: A adapter + barcode, DNA fragment to be sequenced, P1 adapter. Modified from Metzker ML 2010 Nature Reviews Genetics.]

Amplicon sequencing project – Template preparation

Template-positive ISPs enrichment

Amplicon sequencing project – Sequencing

Ion 316™ Chip + sequencing primer + sequencing polymerase

[Figure: nucleotides flowing over a well; H+ is released when a nucleotide is incorporated (shown for C and A).]

Ion Torrent summary

Useful papers:

Rothberg JM et al., 2011, Nature – Ion Torrent paper;

Quail MA et al., 2012, BMC Genomics – Comparison between different NGS platforms;

Metzker ML, 2010, Nature Reviews Genetics – Sequencing technologies;

Mardis ER, 2008, Annu. Rev. Genomics Hum. Genet. – Next-generation DNA sequencing methods;

Lam H et al., 2012, Nature Biotechnology – HugeSeq pipeline.

Ion Torrent wet-lab summary

Introduction

               Ion Torrent PGM                         Ion Torrent Proton
Chip type      PGM 314       PGM 316     PGM 318       Proton I   Proton II (~July-2013)
# of sensors   1.3M          6.3M        11M           165M       660M
Total output   10-40 Mb      100-400 Mb  ~1 Gb         ~10 Gb     ~100 Gb
Run time       1-2 hrs       1-2 hrs     1-2 hrs       2.5 hrs    2.5 hrs
Read length    up to 400 bp  ~200 bp     up to 400 bp  ~200 bp    ~200 bp
Total reads    up to 0.6M    up to 3M    up to 6M      60-80M     240-330M
