© 2011 Illumina, Inc. All rights reserved.
Illumina, illuminaDx, Solexa, Making Sense Out of Life, Oligator, Sentrix, GoldenGate, GoldenGate Indexing, DASL, BeadArray, Array of Arrays, Infinium, BeadXpress, VeraCode, IntelliHyb, iSelect,
CSPro, GenomeStudio, Genetic Energy, HiSeq, HiScan, TruSeq, Eco, MiSeq and Nextera are registered trademarks or trademarks of Illumina, Inc. All other brands and names contained herein are the
property of their respective owners.
Illumina’s Suite of Targeted Resequencing Solutions
Colin Baron
Sr. Product Manager
Sequencing Applications
2
TruSeq™ Sample Prep Solutions Integrated workflow from sample to analyzed data
TruSeq Chemistry Clustering & Sequencing
TruSeq DNA: Simple, scalable, and cost effective
TruSeq RNA: Optimized, gel-free, low input
TruSeq Small RNA: High-throughput miRNA discovery & profiling
TruSeq Custom & Exome Enrichment: Lowest cost and most scalable targeted resequencing
Nextera: Low input, fast
3
TruSeq Targeted Resequencing The simplest and most scalable targeted resequencing solutions
TruSeq Exome Enrichment: Targets = 100,000s
TruSeq Custom Enrichment: Targets = 1000s
Multiplexed Amplicons: Targets = 100s
PCR Amplicons: Targets = 10s
4
TruSeq Targeted Resequencing A broad suite of tools for discovery or validation experiments
Option | Amount of sequence | Best for | Availability
TruSeq Exome Enrichment | ~62 Mb | Mendelian disease: case-control exome studies, rarer variants, causal variants, exome-wide linkage analysis | Now!
TruSeq Custom Enrichment | ~1 to ~10 Mb | GWAS follow-up: validation of variants, variant discovery, pathways | Mid-2011
TruSeq Custom Amplicon | Sub-500 Kb | Amplicon sequencing: high-throughput CE experiments, ultra-deep sequencing, variant discovery, screening | 2H2011
Nextera + PCR Amplicons | 100’s of bp targets | Amplicon sequencing: ultra-deep sequencing, validation, screening, CE replacement | Now!
5
Exome Sequencing Science Magazine’s “Top Breakthroughs of 2010”
Quantum motion machine
Synthetic Biology
Neanderthal Genome
HIV Prophylaxis
Exome Sequencing/Rare Disease Genes
Molecular Dynamics Simulations
Quantum Simulator
Next-Generation Genomics
RNA Reprogramming
The Return of the Rat
Published on December 16, 2010
6
Exome Approach Success Evident by the Number of Publications
Dramatic increase in number of exome publications over the past 3 years
Major focus has been on study of Mendelian Disease
– 2008: 1 publication
– 2009: 6 publications
– 2010: 66 publications
Huge increase in number of variants
7
TruSeq Exome Enrichment Pre-enrichment pooling and comprehensive coverage for the most
cost-effective exome
Most comprehensive exome solution
– High coverage uniformity
– Lowest DNA input
Plate-based processing for up to
96 samples
– Simple & scalable workflow
Pre-enrichment pooling of up to 6 samples
– Reduced hands-on time
– Decreased costs
– Gel-free protocol
Integrated with TruSeq DNA
Sample Prep Kits
– Optimized workflow
– Internal QC controls
Launch of TruSeq Exome
8
TruSeq Exome Enrichment Kit Most up-to-date and comprehensive exome available
Only empirically tested
probes are included
9
TruSeq Exome Enrichment Workflow Three-day assay with <4 hours hands-on time
TruSeq DNA Sample Prep
(1 µg starting input)
*PCR, cluster generation &
sequencing
* 2 successive rounds of enrichment
10
Internal Quality Controls For TruSeq exome enrichment
Internal controls for sample prep
and exome enrichment with full
software support
Library prep controls:
– Enzymatic activity of End Repair
– A-Tailing
– Ligation reactions
CTO (Custom Target Oligo)
– Set of specialized probes (150) in
capture pool targeting non-
polymorphic regions across high,
med, and low GC classes
– Also target known homo- &
heterozygous SNPs
CTE1 – Control Target End-repair 1
CTE2 – Control Target End-repair 2
CTA – Control Target A-tailing
CTL – Control Target Ligation
CTO – Custom Target Oligo
11
Highest Coverage Uniformity
Coverage uniformity for 6-plex sample pooling*
*HiSeq 2000 run data
% bases covered
>80% of targeted bases covered at
0.2x of the mean coverage
12
High On-Target Enrichment
Enrichment rates* for 6-plex sample pooling
*Percentage of reads mapping to target from total reads
HiSeq 2000 run data. +/- 150bp includes flanking up- & down-stream regions of target
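The enrichment rate defined in the footnote above (reads mapping to target out of total reads, optionally counting a +/- 150bp flank) can be sketched as follows. This is only an illustration: the function name, the example coordinates, and the simplification of each read to a single mapped position are assumptions, not part of the TruSeq analysis software.

```python
def on_target_fraction(read_positions, targets, flank=0):
    """Fraction of reads whose mapped position lies in any target region.

    read_positions: list of (chrom, pos) tuples, one per read (hypothetical
                    simplification: each read is reduced to one position).
    targets:        list of (chrom, start, end) tuples, end exclusive.
    flank:          bases added up- and downstream of each target.
    """
    if not read_positions:
        return 0.0
    hits = 0
    for chrom, pos in read_positions:
        if any(c == chrom and start - flank <= pos < end + flank
               for c, start, end in targets):
            hits += 1
    return hits / len(read_positions)


# Made-up example data, for illustration only.
targets = [("chr1", 1000, 1200), ("chr2", 5000, 5300)]
reads = [("chr1", 1050), ("chr1", 1300), ("chr2", 4900), ("chr2", 9000)]

print(f"on target:            {on_target_fraction(reads, targets):.0%}")
print(f"on target +/- 150bp:  {on_target_fraction(reads, targets, flank=150):.0%}")
```

Counting the flanks raises the rate because reads that land just outside a probe still cover useful sequence next to the target.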
13
Strategically designed to be larger than other commonly used methods
– Reduced cost: fewer probes; savings passed on to customer
– Better coverage uniformity; can tolerate more variance in fragment size
– Fewer issues with problematic regions, e.g. GC content
– Sum length of ~340K x 95-mer probes is only ~32 Mb, yet the enrichment actually targets
62Mb of the human genome, or 117.5Mb if the 150 bases up/down stream are taken
into account
Probe-fragment-library design Fully optimized with the most efficient design
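The probe arithmetic on this slide can be checked with a few lines. The numbers are the approximate figures quoted above; the variable names are illustrative only.

```python
PROBE_COUNT = 340_000          # ~340K probes (approximate, from the slide)
PROBE_LENGTH_BP = 95           # 95-mer probes
TARGET_MB = 62.0               # targeted sequence
TARGET_WITH_FLANKS_MB = 117.5  # including 150 bases up/down stream

# Total length of all probes, in megabases.
probe_sum_mb = PROBE_COUNT * PROBE_LENGTH_BP / 1e6

print(f"Sum of probe lengths: {probe_sum_mb:.1f} Mb")
print(f"Target per probe base: {TARGET_MB / probe_sum_mb:.1f}x")
print(f"Target incl. flanks per probe base: {TARGET_WITH_FLANKS_MB / probe_sum_mb:.1f}x")
```

The ratios show how much more sequence is targeted than is physically covered by probes, which is the point of using library fragments longer than the probes.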
14
Multiplexed sample enrichment A huge time and cost savings
Developed and optimized for 6 samples per enrichment reaction
– Huge increase in throughput
– Massively reduced FTE time
No impact on ability to call variants
15
Normalized coverage plots Determine amount of sequencing needed to achieve desired coverage
Desired Coverage (e.g. 90% of bases covered at 10x) / Mean Norm. Coverage (e.g. 0.2) = mean sequencing coverage needed
16
Calculating sequence amount needed
Breakdown of TruSeq Exome Costs*
Item | 200Gb | 600Gb
Library Prep | $54 | $54
Enrichment | $300 | $300
Combo total | $354 | $354
Cluster Gen | $259 | $87
Sequencing (2 x 100 bp) | $359 | $120
Total Per Exome | $972 | $507
*50x coverage per sample, processed on HiSeq at 2 x 100bp; list price.
PF Gb = (Sum Regions Size x Coverage / Enrichment %) ÷ Alignment rate
E.g. 50x TruSeq Exome: (62Mb x 50 / 0.65) / .90 = 5.3Gb
2.4 exomes / lane at 200Gb = 38
7.2 exomes / lane at 600Gb = 115
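A minimal sketch of the PF Gb calculation for the 50x TruSeq Exome example, using the slide's inputs (62Mb target, 65% enrichment, 90% alignment rate). The function names are illustrative, and the figure of 16 lanes per run is an assumption (two 8-lane flow cells on a HiSeq 2000) that is not stated on the slide.

```python
def required_pf_gb(target_mb, coverage, enrichment, alignment_rate):
    """Passing-filter gigabases needed per sample.

    PF Gb = (sum of region sizes x coverage / enrichment) / alignment rate
    """
    return target_mb * coverage / enrichment / alignment_rate / 1000


def exomes_per_lane(run_output_gb, gb_per_exome, lanes_per_run=16):
    """Exomes that fit in one lane; lanes_per_run=16 is an assumption."""
    return run_output_gb / lanes_per_run / gb_per_exome


pf_gb = required_pf_gb(target_mb=62, coverage=50, enrichment=0.65, alignment_rate=0.90)
print(f"PF Gb per 50x exome: {pf_gb:.1f}")

for run_output_gb in (200, 600):
    per_lane = exomes_per_lane(run_output_gb, pf_gb)
    print(f"{run_output_gb}Gb run: {per_lane:.1f} exomes / lane, "
          f"{per_lane * 16:.0f} exomes / run")
```

The unrounded results differ slightly from the slide's 600Gb figures, which appear to be scaled from the rounded 2.4 exomes / lane value.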
17
Alignment and variant calling with CASAVA
Target statistics, graphs
Enrichment %, read distribution, coverage,
normalized coverage plots, controls
GenomeStudio integration for visualization
TruSeq Exome Data Analysis
18
Coming Soon! TruSeq Custom Enrichment Kits Same proven technology as in the TruSeq Exome Enrichment Kits
Target 1–10 Mb of DNA per sample
– Highest enrichment efficiency and
coverage uniformity
Leverages Illumina’s expertise in oligo
production
– Interactive online design software
– High coverage of targeted regions
Pre-enrichment Sample pooling
– Up to 12 samples per enrichment
reaction
– Reduced hands-on time; increased
throughput
Integrated with TruSeq DNA Sample
Prep Kits
– Fully optimized workflow
– Most cost-effective solution available
Early access
program has begun!
First custom order
expected summer
2011!
19
Coming Soon! TruSeq Custom Amplicon Sequencing Highly multiplexed, targeted amplicon resequencing
*Based on example study of 96 samples and 384 targets
Fully customized target probes and capture
– Based upon GoldenGate® Technology
Interactive probe design and ordering
– Streamlined user interface
– Rapid probe turnaround
Rapid & economical amplicon sequencing
– Up to 384 amplicons per sample
– Plate-based processing; 96 samples per
plate
– Assay time < 8 hours
– No additional hardware requirements
Up to 10× more cost effective than CE*
21
TruSeq Custom Amplicon Assay Time 96 samples & 384 targets: from DNA to called variants in ~2 days
Timeline: 8am – Day 1, 2pm – Day 1, 5pm – Day 2
Hybridization Setup: oligos, universal reagents
Assay Biochemistry: extension & ligation, PCR
Library Normalization: create pooled library, normalize
Cluster Gen & Sequencing: pre-kitted sequencing reagents
Real-time Analysis: alignments, variant calling
<8 hr assay with <3 hr hands-on time
No fragmentation required
No gel purification steps
No additional hardware
22
Simplest workflow and most convenient TRS offering One stop shop for entire targeted resequencing workflow
Fully integrated, end-to-end
solution including probe design,
sample prep, enrichment,
sequencing and data analysis
Multiplexed sample enrichment (up
to 12)
Master-mixed formulations and
plate-based processing for up to
96 samples
Internal quality controls for each
assay step from library prep
through enrichment with full
software support
Sales | Training | Sample Prep | Enrichment | Cluster gen | Seq | Analysis | Support
23
SIMPLE, FAST LIBRARY PREP IN
LESS THAN 2 HOURS
Transposon mediated library preparation
Closed tube DNA fragmentation
Ultra-low input requirements (50 ng)
ENABLES A RANGE OF CE AND NGS
APPLICATIONS
VALIDATED BY LEADING RESEARCHERS
Epicentre Nextera Technology for Library Prep Single Tube, Rapid Library Prep
26
TruSeq Sample Prep Kits for RNA & DNA
Master-mixed formulations & gel-free
RNA protocol
Universal adapter design with embedded index
Plate-based processing up to 96 samples;
volumes optimized for liquid handling
Low price, all-inclusive kit
Internal quality controls
Convenient, one-stop shop
Economical large-scale studies
Flexible Design—one kit for single,
paired-end, and mate paired reads
Robust indexing solution
High-throughput and automation
friendly
Simple workflow with minimal
pipetting and clean-up steps
Sample prep success monitoring
with software support
27
A Sequencer for Every Need. Every Budget. Every Lab.
HiScanSQ: Two proven technologies. One powerful platform.
GAIIx: The most widely cited platform, now at half the price.
MiSeq: My Samples. My Study. MiSeq
HiSeq 1000: Powerful. Flexible. Scalable.
HiSeq 2000: Redefining the trajectory of sequencing.
28
Questions?
29
Using coverage uniformity curves to calculate output needed
Inputs:
• Size of Exome (no. of targeted bases) = 62Mb
• Enrichment efficiency (fraction of reads on target) = 65%
• Uniformity = 80% of targeted bases at 0.2x mean coverage
• Desired minimum % of bases at specified coverage = 80% at 10x
• Normalized mean coverage = ???
Mean sequencing coverage can be calculated from normalized coverage
plots
(normalized mean coverage) x (mean sequencing coverage) = 10x
(0.2) x (mean sequencing coverage) = 10x; therefore,
Mean sequencing coverage = 10x / 0.2 = 50x
The total amount of sequence can be calculated as follows:
• Total amount of target sequence = 62Mb
• Mean sequencing coverage = 50x
• Enrichment efficiency (fraction of reads on target) = 65%
(62Mb) x (50x) / (0.65) = 4.8Gb
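The two calculations on this slide can be written as a short sketch. The inputs are the values listed above; the function names are illustrative only.

```python
def mean_coverage_needed(desired_coverage, normalized_coverage):
    """Mean sequencing coverage so that the bases at `normalized_coverage`
    of the mean reach `desired_coverage`."""
    return desired_coverage / normalized_coverage


def total_sequence_gb(target_mb, mean_coverage, enrichment):
    """Total sequence (Gb) needed, given the fraction of reads on target."""
    return target_mb * mean_coverage / enrichment / 1000


# 80% of bases at 10x, with 80% of bases covered at 0.2x of the mean.
mean_cov = mean_coverage_needed(desired_coverage=10, normalized_coverage=0.2)
total_gb = total_sequence_gb(target_mb=62, mean_coverage=mean_cov, enrichment=0.65)

print(f"Mean sequencing coverage: {mean_cov:.0f}x")
print(f"Total sequence needed:    {total_gb:.1f} Gb")
```

This version stops at the fraction of reads on target, as the slide does; it does not divide by an alignment rate.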
30
Optimizing Coverage for Targeted Resequencing Determining the optimal amount of sequencing to achieve a desired coverage level
TruSeq exome product
specification: >80% of bases
covered at 0.2× mean coverage
– If average/mean coverage is
100x, then >80% of bases are
covered at 20x (20 reads per
base)
– If average/mean coverage is 50x,
then >80% of bases are covered
at 10x (10 reads per base)
Optimize coverage by leveraging:
– TruSeq Exome Scripts
– Mean normalized coverage
curves
Determine number of target bases
Calculate fraction of reads on target
Determine mean sequencing coverage with normalized coverage plots
Calculate required amount of sequencing data
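Read in the forward direction, the product specification gives the coverage that more than 80% of bases should reach for a chosen mean coverage. A small sketch, with illustrative names, reproduces the two examples above:

```python
UNIFORMITY_FRACTION = 0.80  # >80% of targeted bases (from the specification)
NORMALIZED_COVERAGE = 0.2   # covered at 0.2x of the mean coverage


def coverage_floor(mean_coverage, normalized_coverage=NORMALIZED_COVERAGE):
    """Coverage reached by the bases that meet the uniformity specification."""
    return mean_coverage * normalized_coverage


for mean in (100, 50):
    print(f"mean {mean}x: >{UNIFORMITY_FRACTION:.0%} of bases "
          f"at {coverage_floor(mean):.0f}x or more")
```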
31
TruSeq Exome Enrichment Kits – Pricing
Catalog Number | Product | Reactions per kit | Samples per kit (6-plex) | Kit Price | Price per sample
FC-121-1008 | TruSeq Exome Enrichment Kit – 8 | 8 | 48 | $14,400 | $300
FC-121-1024 | TruSeq Exome Enrichment Kit – 24 | 24 | 144 | $39,600 | $275
FC-121-1048 | TruSeq Exome Enrichment Kit – 48 | 48 | 288 | $72,000 | $250
FC-121-1096 | TruSeq Exome Enrichment Kit – 96 | 96 | 576 | $129,600 | $225
FC-121-1192 | TruSeq Exome Enrichment Kit – 192 | 192 | 1152 | $230,400 | $200
FC-121-1480 | TruSeq Exome Enrichment Kit – 480 | 480 | 2880 | $504,000 | $175
FC-121-1960 | TruSeq Exome Enrichment Kit – 960 | 960 | 5760 | $864,000 | $150
Orders from Nov 3, 2010; Shipping from Nov 22
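The per-sample prices follow from the kit price, the number of reactions, and 6-plex pooling. The sketch below recomputes them from the table values; the names are illustrative.

```python
SAMPLES_PER_REACTION = 6  # 6-plex pre-enrichment pooling

# (reactions per kit, kit price in USD), taken from the pricing table.
KITS = [
    (8, 14_400), (24, 39_600), (48, 72_000), (96, 129_600),
    (192, 230_400), (480, 504_000), (960, 864_000),
]


def price_per_sample(reactions, kit_price, plex=SAMPLES_PER_REACTION):
    """Kit price divided by the number of samples the kit can process."""
    return kit_price / (reactions * plex)


for reactions, kit_price in KITS:
    samples = reactions * SAMPLES_PER_REACTION
    print(f"{reactions:>4} reactions: {samples:>5} samples, "
          f"${price_per_sample(reactions, kit_price):.0f} per sample")
```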
33
TruSeq Exome Data Analysis Overview of outputs from TruSeq Exome Scripts―data visualization & QC
GC control probes
– Probes selected to have low, medium
or high GC content
– Independent of the probes targeting
the exome
Read distribution
– Shows the number of reads around the
center of the targeted regions
Coverage & mean coverage levels
– Shows the fraction of targeted bases
that are covered at a given coverage
level
– Compare data on the same scale from
runs that have different mean coverage
levels
Control SNPs
– Targeted at known SNPs that are not
in any targeted region
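The normalized coverage output described above can be approximated as follows: divide each coverage threshold by the mean so that runs with different mean coverage share one scale. This is a sketch of the idea only, not the TruSeq Exome Scripts; the function name and the per-base depths are made up.

```python
def normalized_coverage_curve(depths, thresholds):
    """For each normalized threshold t, return the fraction of targeted
    bases whose depth is at least t times the mean depth."""
    if not depths:
        raise ValueError("depths must not be empty")
    mean_depth = sum(depths) / len(depths)
    return {
        t: sum(d >= t * mean_depth for d in depths) / len(depths)
        for t in thresholds
    }


# Hypothetical per-base depths over ten targeted bases (mean depth = 40).
depths = [0, 5, 10, 20, 40, 45, 50, 60, 70, 100]
curve = normalized_coverage_curve(depths, [0.2, 0.5, 1.0])

for threshold, fraction in curve.items():
    print(f"{fraction:.0%} of bases covered at {threshold}x of the mean")
```

Reading the curve at 0.2 gives the quantity used in the uniformity specification.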
34
TruSeq Exome Data Analysis Leverage GenomeStudio® software for simple reporting on exome data
BED file seamlessly defines
targeted regions
‘Regions’ table allows for easy
selection of targets
Navigate regions and view
annotation data in
GenomeStudio’s viewer
View multiple samples in single
table or IGV window
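Because the targeted regions are defined in a BED file, the total number of targeted bases can be summed directly from it. A minimal sketch, assuming a standard BED layout (chromosome, 0-based start, exclusive end in the first three columns) and non-overlapping regions; the function name and the example records are hypothetical.

```python
import io


def total_target_bases(bed_lines):
    """Sum region sizes (end - start) over BED records.

    Skips blank lines, comments, and track/browser header lines.
    Overlapping regions would be counted twice.
    """
    total = 0
    for line in bed_lines:
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        _chrom, start, end = line.split()[:3]
        total += int(end) - int(start)
    return total


# Made-up two-region example; a real file would be opened with open(path).
example_bed = io.StringIO(
    "chr1\t1000\t1200\texon_a\n"
    "chr2\t5000\t5300\texon_b\n"
)
print(total_target_bases(example_bed))
```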