Find more extraction protocol recommendations for your sample type, plus guidance on DNA storage and contaminants: community.nanoporetech.com/extraction_methods
Find out more about size selection methods for long-read sequencing: community.nanoporetech.com/extraction_methods
EXTRACTION: obtaining high molecular weight DNA
LIBRARY PREPARATION: selecting a kit
Resolving structural variants with long-read nanopore sequencing
Structural variants (SVs), defined as variants spanning 50 bp or more, account for ten times as many variant bases as single nucleotide polymorphisms (SNPs) in the human genome [1]. With known causative effects in an extensive range of both normal and aberrant phenotypes, the need to comprehensively characterise SVs is becoming increasingly clear. With Oxford Nanopore, long-read sequencing of native DNA greatly improves the accuracy of detecting even the largest SVs, including those in regions inaccessible to other technologies [2].
Here, we present a simple workflow for an effective whole-genome SV survey from a human blood sample, using the PromethION™ sequencing device.
[Figure: bases sequenced against read length (kb, 0 to 100) for four extraction methods: QIAGEN Gentra Puregene, Circulomics, QIAamp DNA Blood Midi Kit, QIAGEN Genomic-tip]
[Figure: library preparation workflow, 60 min: high molecular weight gDNA, optional fragmentation or size selection, end-prep and nick repair, ligation of sequencing adapters, loading]
Selecting a suitable extraction method is often a trade-off between input requirements, expected fragment lengths, lab experience and hands-on time. To maximise the volume of data at long read lengths, we recommend the QIAGEN Gentra Puregene Blood Kit.
There is no upper read length limit in nanopore sequencing, with reads routinely spanning tens or hundreds of kilobases and the current record spanning over 2 megabases. Fragmentation is optional: unfragmented DNA offers a simple workflow, but shearing and size selection can improve read N50. Internal testing has yielded good results from the Circulomics Short Read Eliminator Kit for size selection, and the Diagenode Megaruptor 3 for shearing.
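Read N50, the metric mentioned above, can be computed directly from a list of read lengths. The following is a minimal sketch, assuming the usual definition (the length at which reads of that length or longer contain at least half of all sequenced bases); the function name and the example lengths are illustrative and do not come from any Oxford Nanopore tool.

```python
def read_n50(lengths):
    """Return the read N50 of a collection of read lengths (in bases).

    Assumed definition: the length L such that reads of length >= L
    together contain at least half of all sequenced bases.
    """
    if not lengths:
        raise ValueError("no read lengths given")
    total_bases = sum(lengths)
    running = 0
    # Walk from the longest read down, accumulating bases until half is reached.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total_bases:
            return length


if __name__ == "__main__":
    # Hypothetical read lengths, for illustration only.
    example_lengths = [2_000, 5_000, 8_000, 20_000, 45_000]
    print("read N50:", read_n50(example_lengths))
```

Because N50 is weighted by bases rather than by read count, removing short fragments through size selection raises it even when the longest reads are unchanged.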
To prepare gDNA for sequencing, we recommend the Ligation Sequencing Kit (SQK-LSK109), which provides the greatest throughput and control over read lengths. Gentle pipetting with wide-bore tips can also help preserve long DNA strands.
Find out more about sample prep, including rapid and multiplexing options: nanoporetech.com/products/kits
WORKFLOW: STRUCTURAL VARIATION
References:
1. E. E. Eichler, 2019. NEJM. DOI: 10.1056/NEJMra1809315
2. M. T. W. Ebbert et al. 2019. Genome Biol. DOI: 10.1186/s13059-019-1707-2
Oxford Nanopore Technologies, the wheel icon, EPI2ME and PromethION are registered trademarks of Oxford Nanopore Technologies in various countries. All other brands and names contained are the property of their respective owners. © 2019 Oxford Nanopore Technologies.
Oxford Nanopore Technologies products are currently for research use only. WF_1042(EN)_V1_01Oct2019
Twitter: @nanopore
www.nanoporetech.com
For a human whole-genome survey, generating sufficient data for ~30x depth of coverage is a good starting point; this can be obtained by sequencing on a single PromethION Flow Cell for 72 hours. Throughput is maximised by a nuclease flush and addition of fresh library every 24 hours. For best metrics, we recommend increasing this depth to >45x; for a more cost-effective, light-touch approach, it could be lowered to 15x.
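To relate these depth targets to sequencing yield, mean depth of coverage can be approximated as total sequenced bases divided by genome size. The sketch below assumes a haploid human genome of roughly 3.1 Gb; that figure and the function names are assumptions for illustration, not values taken from this workflow.

```python
# Assumption: approximate haploid human genome size in base pairs.
HUMAN_GENOME_BP = 3.1e9


def required_yield_gb(depth, genome_bp=HUMAN_GENOME_BP):
    """Gigabases of sequence needed to reach a given mean depth of coverage."""
    return depth * genome_bp / 1e9


def mean_depth(total_bases, genome_bp=HUMAN_GENOME_BP):
    """Mean depth of coverage obtained from a given number of sequenced bases."""
    return total_bases / genome_bp


if __name__ == "__main__":
    # The three depth targets discussed in the text.
    for depth in (15, 30, 45):
        print(f"{depth}x needs roughly {required_yield_gb(depth):.1f} Gb")
```

This ignores reads that fail QC or do not align, so the yield actually needed is likely to be somewhat higher than the estimate.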
The PromethION is available in configurations of 24 or 48 individually addressable flow cells, so a run on a single flow cell leaves the rest of the device free for other experiments to be started and stopped as needed. Basecalling can be carried out in real time on the device, making use of the powerful onboard compute.
*SRE + MRIII: Circulomics Short Read Eliminator Kit + Diagenode Megaruptor 3
Find out more about PromethION: nanoporetech.com/products/promethion
Find out more about data analysis solutions: nanoporetech.com/nanopore-sequencing-data-analysis
SEQUENCING: generating high yields on the PromethION
ANALYSIS: calling SVs without command line
To call variants in your long-read data, we recommend pipeline-structural-variation, available on GitHub. The pipeline takes the FASTQ files produced by onboard basecalling, aligns them to a provided FASTA reference genome, calls insertions, deletions and duplications >50 bp, and outputs a VCF file and QC report. Our SV pipeline tutorial provides step-by-step instructions for the full process, including visualisation. View the tutorial here: community.nanoporetech.com/knowledge/bioinformatics
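As an illustration of the kind of filtering applied to SV calls, the sketch below keeps only insertions, deletions and duplications at or above a minimum length in a VCF file. It is not the code used by pipeline-structural-variation: it assumes each record carries `SVTYPE` and `SVLEN` keys in its INFO column, as SV callers commonly emit, and the function names and the inclusive 50 bp threshold are illustrative choices.

```python
CALLED_TYPES = {"INS", "DEL", "DUP"}  # SV types named in the text
MIN_SV_LENGTH = 50                    # assumption: threshold in bp, inclusive


def parse_info(info_field):
    """Turn a VCF INFO string such as 'SVTYPE=DEL;SVLEN=-120' into a dict."""
    info = {}
    for item in info_field.split(";"):
        key, sep, value = item.partition("=")
        info[key] = value if sep else True  # flag entries have no value
    return info


def keep_record(line, min_length=MIN_SV_LENGTH, types=CALLED_TYPES):
    """Return True if a VCF data line is an SV of a wanted type and size."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 8:  # malformed or empty line: INFO column is missing
        return False
    info = parse_info(fields[7])
    try:
        # Deletions are often reported with a negative SVLEN, hence abs().
        sv_length = abs(int(info.get("SVLEN", 0)))
    except (TypeError, ValueError):
        return False
    return info.get("SVTYPE") in types and sv_length >= min_length


def filter_vcf_lines(lines, min_length=MIN_SV_LENGTH, types=CALLED_TYPES):
    """Yield header lines unchanged and only the data lines that pass."""
    for line in lines:
        if line.startswith("#") or keep_record(line, min_length, types):
            yield line
```

A typical use would be to open the input VCF, pass the file object to `filter_vcf_lines`, and write the yielded lines to a new file; the file names are left to the user.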
For those wishing to avoid command line, our cloud-based EPI2ME™ SV workflow provides fully automated human whole-genome SV analysis based on the same pipeline, generating a VCF file.
Find out more about nanopore sequencing service providers: nanoporetech.com/services/providers
Find out more at: nanoporetech.com/applications/structural-variation
[Figure: analysis pipeline: FASTQ input file, QC & filter reads, align reads (minimap2), call variants (sniffles), filter variants, VCF output file]
[Figure: bases sequenced against read length (kb, 0 to 100) for three preparations: unfragmented, g-TUBE, SRE + MRIII*]