LETTER
When DNA gets in the way: A cautionary note for DNA contamination in extracellular RNA-seq studies
Jasper Verwilt (a,b,1), Wim Trypsteen (a,b), Ruben Van Paemel (a,b), Katleen De Preter (a,b), Maria D. Giraldez (c,d), Pieter Mestdagh (a,b), and Jo Vandesompele (a,b)
With great interest, we read the paper by Zhou et al. (1) describing a methodology that enables extracellular RNA sequencing (exRNA-seq) from extremely low input (Small Input Liquid Volume Extracellular RNA Sequencing [SILVER-seq]). We were intrigued by the high number of detected genes compared to our previous studies (2, 3) and noticed low reproducibility. We hypothesized that these observations could originate from substantial DNA contamination. Therefore, we reanalyzed the SILVER-seq data (4) to determine the extent of DNA signal in the sequencing reads.
First, we analyzed the fraction of reads mapping to the different genomic regions. We noticed that these fractions closely resembled the distributions in the genome (Fig. 1A). Specifically, fewer than 5% of the reads mapped to exonic regions, while our own exRNA-seq data (3) showed an average of 35% exonic reads. Second, we analyzed reads mapping to spliced sequences, expecting them to be relatively abundant in RNA. However, we found that reads mapping to spliced sequences made up only 0.22% of the total uniquely mapped reads, whereas, in our own RNA-seq data, they represented 17.8%, about 81-fold higher (Fig. 1B). Third, we generated copy number profiles for a female patient with breast cancer (SRR9094442) and a healthy male control (SRR9094547). The cancer patient’s profile showed a pattern with clear copy number changes (e.g., chromosomes 5, 11, and 20), a result typically found using cell-free DNA data (Fig. 2A). The profile of the male control was almost flat, with chromosomes X and Y showing half the copy number levels of the autosomes (Fig. 2B), in line with the expectations of a normal control’s cell-free DNA. Finally, strandedness assessment of the SILVER-seq reads could not unambiguously confirm that the data come from RNA (Fig. 1C). This means that either the library preparation method does not preserve strand orientation of the fragments (which is not specified in the paper) or that the data are predominantly coming from DNA. In an attempt to use only reads that must originate from RNA, we looked at exRNA genes with reads mapping over splice junctions and with transcripts per million higher than 5, as recommended by the authors (1). A median of only 560 genes per sample remained after filtering, or 44 times lower than reported.
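The splice-read comparison can be made concrete with a minimal sketch. This is not the authors' code (which is in the GitHub repository cited in the Data Availability Statement). It assumes the alignments come from a splice-aware aligner that records junctions with the SAM CIGAR operation `N` (skipped reference region). The helper names `is_spliced` and `splice_fraction` and the toy CIGAR strings are hypothetical. The last lines recompute the fold difference from the two percentages reported in the letter.

```python
import re

# One CIGAR element: a length followed by a SAM operation code.
CIGAR_ELEMENT = re.compile(r"(\d+)([MIDNSHP=X])")


def is_spliced(cigar: str) -> bool:
    """Return True if the alignment skips a reference region (op 'N')."""
    return any(op == "N" for _, op in CIGAR_ELEMENT.findall(cigar))


def splice_fraction(cigars) -> float:
    """Fraction of alignments whose CIGAR contains at least one 'N'."""
    cigars = list(cigars)
    if not cigars:
        raise ValueError("no alignments given")
    return sum(is_spliced(c) for c in cigars) / len(cigars)


# Hypothetical toy input: four uniquely mapped reads, one spanning a junction.
toy_cigars = ["75M", "30M1200N45M", "75M", "10S65M"]
toy_fraction = splice_fraction(toy_cigars)
print(f"toy splice fraction: {toy_fraction:.2f}")

# Fold difference between the splice fractions reported in the letter:
# 17.8% in the authors' own RNA-seq data versus 0.22% in SILVER-seq.
fold = 0.178 / 0.0022
print(f"fold difference: {fold:.1f}")  # about 81-fold, as stated in the text
```

On real data, the CIGAR strings would be read from the alignment file, restricted to uniquely mapped reads as in the letter.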
Our reanalyses present evidence that the majority of the SILVER-seq data are derived from DNA, rather than exRNA. Although the authors performed a DNase treatment aimed at preventing this issue (1), no quality control was performed to verify its efficacy. We hypothesize that the amount of cell-free DNA was too high or that inhibitors present in serum precluded efficient enzymatic DNA removal. Moreover, the authors did not perform any data analysis evaluating the presence of DNA signal in their sequencing data, such as the ones reported here. Importantly, we emphasize that our observations do not undermine the potential utility of SILVER-seq. Our letter aims to serve as a reminder of the current limitations of RNA-seq workflows on biofluids and as a plea for extensive quality control of RNA-seq data in general.
Data Availability Statement
The code used for data analysis is available on GitHub at https://github.com/jasperverwilt/SILVER-Seq_comment (5).
(a) Department of Biomolecular Medicine, Ghent University, 9000 Ghent, Belgium; (b) OncoRNALab, Cancer Research Institute Ghent, 9000 Ghent, Belgium; (c) Digestive Diseases Unit, Virgen del Rocio University Hospital, 41013 Seville, Spain; and (d) OncoDigest Group, Institute of Biomedicine of Seville (IBiS), 41013 Seville, Spain
Author contributions: J. Verwilt, W.T., R.V.P., P.M., and J. Vandesompele designed research; J. Verwilt and R.V.P. performed research; J. Verwilt and R.V.P. analyzed data; and J. Verwilt, W.T., R.V.P., K.D.P., M.D.G., P.M., and J. Vandesompele wrote the paper.
The authors declare no competing interest.
Published under the PNAS license.
(1) To whom correspondence may be addressed. Email: [email protected].
18934–18936 | PNAS | August 11, 2020 | vol. 117 | no. 32 www.pnas.org/cgi/doi/10.1073/pnas.2001675117
[Fig. 1 plots. Each panel compares plasma RNA-seq data from (3) with SILVER-seq serum; the y axis is the mapping ratio. (A) Exonic, intronic, and intergenic fractions; the printed average exonic fractions are 0.3541 (plasma) and 0.0471 (SILVER-seq). (B) Nonsplice and splice fractions: 0.8220 and 0.1780 (plasma); 0.9978 and 0.0022 (SILVER-seq). (C) Failed, same strand, and different strand fractions.]
Fig. 1. Regional coverage, splice read fractions, and strandedness of the data. (A) Fractions of reads mapping to exonic, intronic, and intergenic regions. The average fractions and average number of reads are printed. The bottom and top dashed blue lines indicate the fraction of base pairs classified as exonic (0.0427) and intronic/intergenic (0.479) in the genome. These numbers are the fractions of reads expected to map to exonic and intronic/intergenic regions, respectively, if the reads originated from random locations in the genome. (B) Fraction of reads mapping to splice and nonsplice regions. The average fractions are printed. (C) Strandedness of the data. The “same strand” fraction is expected to be 1 for stranded data and 0.5 for unstranded data. The average fractions are printed.
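The genome baseline in panel A suggests a simple check: divide the observed exonic read fraction by the exonic fraction of the genome. The sketch below is illustrative and not taken from the authors' repository. The function name `exonic_enrichment` is hypothetical, and the two observed values are the average exonic fractions printed in Fig. 1A. A ratio near 1 is what reads drawn from random genomic locations would give, whereas RNA-derived reads should be well above 1.

```python
# Fraction of genomic base pairs classified as exonic (Fig. 1 caption).
GENOME_EXONIC_FRACTION = 0.0427


def exonic_enrichment(observed: float,
                      expected: float = GENOME_EXONIC_FRACTION) -> float:
    """Ratio of the observed exonic read fraction to the genome baseline."""
    if not 0.0 <= observed <= 1.0:
        raise ValueError("observed must be a fraction between 0 and 1")
    if expected <= 0.0:
        raise ValueError("expected must be positive")
    return observed / expected


# Average exonic fractions as printed in Fig. 1A.
silver_seq = exonic_enrichment(0.0471)  # SILVER-seq serum
plasma_rna = exonic_enrichment(0.3541)  # plasma RNA-seq data from (3)

print(f"SILVER-seq exonic enrichment: {silver_seq:.2f}")
print(f"plasma RNA-seq exonic enrichment: {plasma_rna:.2f}")
```

With these inputs the SILVER-seq ratio stays close to 1 while the plasma RNA-seq ratio is several-fold higher, which is the contrast the panel is drawn to show.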
1 Z. Zhou et al., Extracellular RNA in a single droplet of human serum reflects physiologic and disease states. Proc. Natl. Acad. Sci. U.S.A. 116, 19200–19208 (2019).
2 E. Hulstaert et al., Charting extracellular transcriptomes in The Human Biofluid RNA Atlas. bioRxiv, 10.1101/823369 (5 November 2019).
3 C. Everaert et al., Performance assessment of total RNA sequencing of human biofluids and extracellular vesicles. Sci. Rep. 9, 17574 (2019).
4 Z. Zhou, S. Zhong, Data from “Extracellular RNA in a single droplet of human serum reflects physiologic and disease states”. Gene Expression Omnibus. https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE131512. Accessed 8 January 2020.
5 J. Verwilt, R. Van Paemel, Code for “When DNA gets in the way: A cautionary note for DNA contamination in extracellular RNA-seq studies”. GitHub. https://github.com/jasperverwilt/SILVER-Seq_comment. Deposited 28 January 2020.
Fig. 2. Copy number profiles generated from SILVER-seq data. Yellow segments indicate a lower copy number, and green segments indicate a higher copy number. (A) Copy number profile of a female breast cancer patient. (B) Copy number profile of a healthy male.