Update on Next-Generation Sequencing
Mick Watson
Next-generation sequencing
• Ultra-high throughput
• Characterised by
– Many millions of…
– … short reads
• Dominated by 3 companies
– Roche 454 FLX pyrosequencing
– Illumina / Solexa (SBS)
– LifeTech SOLiD / Ion Torrent
SEQUENCING – THE BIG BOYS
Roche 454
Adapters added to single-stranded DNA
Each DNA molecule attached to a single bead
Emulsion PCR – molecule amplified to several million per bead
PicoTiterPlate – ~29 µm wells
Each bead into a single well
Nucleotides washed across the plate, emitting a flash as the base is incorporated
Image analysis – each spot represents a flash of light; a base being incorporated
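The flash-per-flow readout described above can be sketched in Python. This is an illustrative simulation only: the flow order "TACG", the function name simulate_flowgram and the example read are assumptions for the sketch, not taken from the slides. In each flow one nucleotide is washed over the well, and every consecutive matching base is incorporated at once, so the signal scales with the length of the run.

```python
def simulate_flowgram(read, flow_order="TACG", max_flows=400):
    """Return a list of (nucleotide, signal) pairs, one per flow.

    The signal is the number of bases incorporated in that flow:
    0 if the flowed nucleotide does not match the next base of the
    read, otherwise the length of the matching run of identical bases.
    """
    flows = []
    pos = 0
    for i in range(max_flows):
        if pos >= len(read):
            break  # the whole read has been synthesised
        nuc = flow_order[i % len(flow_order)]
        run = 0
        while pos < len(read) and read[pos] == nuc:
            run += 1
            pos += 1
        flows.append((nuc, run))
    return flows


# Hypothetical example read, chosen to contain runs of 2 and 3 bases.
flowgram = simulate_flowgram("TTAGGGC")
for nuc, signal in flowgram:
    print(nuc, signal)
```

A signal of 3 in the G flow means three Gs were incorporated in a single flash, which is why the brightness of each flash, not just its presence, has to be measured accurately.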
Roche 454
• Complex emulsion PCR library prep
– Read lengths average 400-500bp
– Typically 1 million reads per run
– Run takes 10 hours
– 400Mb output
• FLX+ out now
• Read length average 700bp
– 700Mb output
• Both have paired-end
• Homopolymer problems
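The output figures above follow from reads per run multiplied by mean read length. A minimal sketch of that arithmetic, where the function name run_output_mb is hypothetical and the FLX+ read count of 1 million is an assumption (the slide gives only its read length and output):

```python
def run_output_mb(reads_per_run, mean_read_length_bp):
    """Approximate run output in megabases (Mb)."""
    return reads_per_run * mean_read_length_bp / 1_000_000


# FLX: 1 million reads at the lower and upper end of the 400-500bp range.
flx_low = run_output_mb(1_000_000, 400)
flx_high = run_output_mb(1_000_000, 500)

# FLX+: read count assumed to be 1 million, as for FLX.
flx_plus = run_output_mb(1_000_000, 700)

print(flx_low, flx_high, flx_plus)
```

With these inputs the lower FLX estimate matches the quoted 400Mb, and the FLX+ estimate matches the quoted 700Mb.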
454: Homopolymer problems
• 454 suffers from homopolymer problems
• Can introduce frameshifts etc
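To illustrate why a miscalled homopolymer length causes a frameshift, the sketch below locates homopolymer runs in a made-up sequence and compares the codons before and after one base is dropped from a run. The sequences, function names and the min_len threshold are all hypothetical, chosen only for the example.

```python
from itertools import groupby


def homopolymer_runs(seq, min_len=3):
    """Return (start, base, length) for each run of at least min_len identical bases."""
    runs = []
    pos = 0
    for base, group in groupby(seq):
        n = len(list(group))
        if n >= min_len:
            runs.append((pos, base, n))
        pos += n
    return runs


def codons(seq):
    """Split a sequence into complete triplets, dropping any trailing bases."""
    return [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]


true_seq = "ATGAAAAGGCTTTGA"   # hypothetical sequence with an AAAA run
miscalled = "ATGAAAGGCTTTGA"   # same sequence with the run called as AAA

print(homopolymer_runs(true_seq))
print(codons(true_seq))
print(codons(miscalled))
```

Every codon downstream of the miscalled run changes, and the final stop codon is lost, even though only a single base was undercalled.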
ABI SOLiD
• More complex system – “colour space”
DNA attached to beads
Beads attached to slide
• Bases are incorporated as di-nucleotides
• Four colours represent many different di-nucleotides
• After each round, the primer is moved back to position n-1
• Therefore each base is sampled twice
• Therefore with the first base, and a colour chart, you can work out the sequence of bases from the colours
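The decoding step in the last bullet can be sketched as follows. The slide does not show the colour chart, so this example assumes the commonly described two-base encoding in which the bases are numbered A=0, C=1, G=2, T=3 and the colour of a di-nucleotide is the XOR of its two base numbers. The function names and the example sequence are hypothetical.

```python
BASES = "ACGT"  # assumed numbering: A=0, C=1, G=2, T=3


def encode(seq):
    """Return (first_base, colours): one colour per adjacent pair of bases."""
    colours = [BASES.index(a) ^ BASES.index(b) for a, b in zip(seq, seq[1:])]
    return seq[0], colours


def decode(first_base, colours):
    """Rebuild the base sequence from the known first base and the colours."""
    seq = [first_base]
    for colour in colours:
        previous = BASES.index(seq[-1])
        seq.append(BASES[previous ^ colour])
    return "".join(seq)


first, colours = encode("ATGGC")
print(first, colours)
print(decode(first, colours))
```

Because each base is decoded from the one before it, a single wrong colour changes every base after it, so reads are usually compared in colour space rather than translated to bases first.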
ABI SOLiD
• Reads are “colourspace”
– Colours indicate incorporation of dinucleotide
• Current machine is 5500xl
• Read lengths are 35bp, 60bp or 75bp
– 75 bp (fragment)
– 75 bp x 35 bp (paired-end)
– Up to 60 bp x 60 bp (mate-paired)
• 4.8 billion paired-end reads per run
• A run takes 7 days
Illumina Solexa technology
Fluorescently labelled nucleotides are added
Laser captures image to determine first base
Fluorescently dyed nucleotides are added
Laser captures image to determine second base
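The cycle-by-cycle readout above, one image and one base per cycle, can be sketched as picking the brightest of four channels for a cluster in each cycle. This is a deliberately naive illustration: the intensity values and the function name call_bases are made up, and real base callers also correct for effects such as signal crosstalk between channels.

```python
CHANNELS = "ACGT"


def call_bases(intensities):
    """Call one base per cycle for a single cluster.

    intensities is a list with one (A, C, G, T) signal tuple per cycle;
    the base whose channel is brightest in that cycle is called.
    """
    calls = []
    for cycle in intensities:
        brightest = max(range(len(CHANNELS)), key=lambda i: cycle[i])
        calls.append(CHANNELS[brightest])
    return "".join(calls)


# Hypothetical intensities for three cycles of one cluster.
cluster = [
    (0.90, 0.10, 0.05, 0.10),  # cycle 1: A channel brightest
    (0.10, 0.10, 0.80, 0.20),  # cycle 2: G channel brightest
    (0.00, 0.10, 0.10, 0.95),  # cycle 3: T channel brightest
]
read = call_bases(cluster)
print(read)
```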
Illumina GA IIx
• Read lengths are 36, 57, 78, 101 and 150bp
• Paired-end available
• 30M reads per lane
• 9Gb per lane
• 8 lanes
• 14 days for 150bp paired-end
Illumina HiSeq 2000
• Read lengths are 35, 50, 75 and 100bp
• Paired-end available
• 150M reads per lane
• 30Gb per lane
• 16 lanes (2*8)
• 9 days for one flow cell
• 11 days for two
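The per-lane yields quoted for both Illumina machines are consistent with reads per lane multiplied by the longest read length and by two for paired-end. A minimal sketch of that arithmetic, assuming "reads per lane" counts read pairs and that the quoted yields refer to the longest paired-end run; the function name lane_yield_gb is hypothetical:

```python
def lane_yield_gb(reads_per_lane, read_length_bp, paired=True):
    """Approximate yield of one lane in gigabases (Gb)."""
    ends = 2 if paired else 1
    return reads_per_lane * read_length_bp * ends / 1e9


ga_lane = lane_yield_gb(30_000_000, 150)      # GA IIx, 150bp paired-end
hiseq_lane = lane_yield_gb(150_000_000, 100)  # HiSeq 2000, 100bp paired-end

# Totals derived from the lane counts on the slides (8 and 16 lanes).
ga_total = ga_lane * 8
hiseq_total = hiseq_lane * 16

print(ga_lane, hiseq_lane, ga_total, hiseq_total)
```

The totals are simple multiplication of the slide figures, not vendor specifications.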