A Content-Addressable DNA Database with Learned Sequence Encodings
Kendall Stewart¹, Yuan-Jyue Chen², David Ward¹, Xiaomeng Liu¹, Georg Seelig¹, Karin Strauss², and Luis Ceze¹
¹ University of Washington ² Microsoft Research
Search by Image
Bus
Traditional Storage
CPU
Database
❌
Von Neumann Bottleneck
DNA Data Storage
Input File → Full Data Encoding → Sequences
Strand layout: Primer, Payload, Primer
ATCGA…TCGAT…GATAC
ATCGA…GATCT…GATAC
ATCGA…TGACA…GATAC
ATCGA…GTGTA…GATAC
Organick et al., Nat. Bio. 2018
Can we probe the payloads?
Robust encoding requires randomization,
redundancy, ECC, segmentation…
Content-Addressable Storage
Input File → Content-Based Encoding → Metadata Sequence
ATCGA…GGACGGAATAC
Strand regions: File ID (Payload), Content-Based Probe Region
Input File → Feature Extraction → Feature Vector [0.1, 0.2, -0.3, 0.8, …] → Sequence Encoding → Metadata Sequence
ATCGA…GGACGGAATAC
Strand regions: File ID (Payload), Content-Based Probe Region
Focus of this talk
“Address” in Feature Space
Feature Space
[Figure: Feature Space (2D Projection)]
Vector Quantization
[Figure: Feature Space (2D Projection) with clusters labeled ATCGA…, GATCG…, TGTAT…, GCTAT…]
Reif & LaBean, DNA 6 (2000)
[Figure: points labeled GCTAT, GCTAT, GCTAT, GCTAT, CGATA, CGAGA, ATCGA, CGAGA, TGTAT, GATCG]
Neighbors in different clusters
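The point about neighbors landing in different clusters can be illustrated with a minimal sketch. The 2D coordinates and the two centroids are invented for illustration and do not come from the talk; the sequence labels only reuse cluster codes shown on the slide.

```python
import math

# Hypothetical cluster centroids in a 2D feature space, each labeled with
# a DNA code. Coordinates are invented for illustration only.
CENTROIDS = {
    "ATCGA": (0.0, 0.0),
    "GATCG": (1.0, 0.0),
}

def quantize(point):
    """Return the code of the centroid nearest to `point` (Euclidean)."""
    return min(CENTROIDS, key=lambda code: math.dist(point, CENTROIDS[code]))

# Two points that are close together but straddle the cluster boundary.
p = (0.49, 0.0)
q = (0.51, 0.0)
# A point far from p that still falls in p's cluster.
r = (-0.4, 0.0)

codes = {name: quantize(pt) for name, pt in (("p", p), ("q", q), ("r", r))}
neighbor_gap = math.dist(p, q)      # small distance, different codes
same_cluster_gap = math.dist(p, r)  # large distance, same code
```

Here `p` and `q` are nearly identical in feature space yet receive unrelated codes, while the much more distant `r` shares a code with `p`.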
Similarity Preserving Encoding
[Figure: points labeled GCTTT, GTTAT, GCTAA, CCTAT, CGATA, CGCGA, ATCGA, CGAGA, TGTAT, GATCG]
Naive Encoding: [0.1, 0.7, 0.3, 0.8, …]
[0.00, 0.25) = A, [0.25, 0.50) = T, [0.50, 0.75) = C, [0.75, 1.00] = G
ACTG?
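The naive binning can be sketched as follows. The function name `naive_encode` is hypothetical, and the sketch assumes every feature value lies in [0, 1], as the bins on the slide imply.

```python
def naive_encode(features):
    """Map each feature value in [0, 1] to a base using four equal bins."""
    bases = "ATCG"  # bin order from the slide: A, T, C, G
    out = []
    for x in features:
        if not 0.0 <= x <= 1.0:
            raise ValueError(f"feature {x} is outside [0, 1]")
        # int(x * 4) selects the bin; min() folds x == 1.0 into the last bin.
        out.append(bases[min(int(x * 4), 3)])
    return "".join(out)

print(naive_encode([0.1, 0.7, 0.3, 0.8]))  # ACTG
```

One possible reading of the slide's question mark: values 0.24 and 0.26 differ by only 0.02 but map to different bases (A and T), so closeness of feature values does not carry over to the sequences.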
Semantic Hashing
Adapted from Salakhutdinov et al. 2007
Images → Binary Addresses
Similar in Feature Space (Euclidean) → Similar in Address Space (Hamming)
Images → DNA Sequences
Similar in Feature Space (Euclidean) → Similar in Sequence Space (Hybridization Yield)
Training a neural network efficiently requires differentiable operations.
NUPACK calculation of hybridization yield is not differentiable!
Approximating Yield
AGTC
AGAC
[Figure: one-hot encodings of AGTC and AGAC; axes: base (A, T, C, G) and position (0 to 3)]
$\text{cosine distance}(u, v) = 1 - \dfrac{u \cdot v}{\lVert u \rVert \, \lVert v \rVert}$
Mean Cosine Distance = (0 + 0 + 1 + 0) / 4 = 0.25
Equal to Hamming Distance (normalized by sequence length)
when representations are “one-hot”
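The equality can be checked with a short sketch. The helper names are hypothetical, and the A, T, C, G ordering of the one-hot vectors follows the slide.

```python
import math

BASES = "ATCG"

def one_hot(seq):
    """One 4-element one-hot vector per position, in A, T, C, G order."""
    return [[1.0 if b == base else 0.0 for base in BASES] for b in seq]

def cosine_distance(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def mean_cosine_distance(x, y):
    """Average the per-position cosine distance between two encodings."""
    return sum(cosine_distance(u, v) for u, v in zip(x, y)) / len(x)

def normalized_hamming(s, t):
    """Fraction of positions at which the two sequences differ."""
    return sum(a != b for a, b in zip(s, t)) / len(s)

d_cos = mean_cosine_distance(one_hot("AGTC"), one_hot("AGAC"))
d_ham = normalized_hamming("AGTC", "AGAC")
print(d_cos, d_ham)  # 0.25 0.25
```

Matching one-hot vectors have cosine distance 0 and mismatching ones are orthogonal with distance 1, so the mean counts mismatches per position.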
[Figure: neural network output over bases (A, T, C, G) at positions 0, 1, 2, 3, …, 27, 28, 29, 30, decoded as AGTC…CAGC]
Neural network outputs are not exactly one-hot
But the approximation is good enough!
[Figure: Random Pairs of Neural Net Outputs]
Sequence: ATGCCT vs. ACGGCT, Hamming Distance: 0.33
One-Hot Encoding: Mean Cosine Distance: 0.33
Neural Net Output: Mean Cosine Distance: 0.29
Closing the Loop
[Figure: two neural network outputs over bases (A, T, C, G) at positions 0, 1, 2, 3, …, 27, 28, 29, 30, decoded as AGTC…CAGC and GTAC…GGTA]
Mean Cosine Distance
Approximate Yield
Images Similar? (Distance < 0.2)
Cross-Entropy Loss
Gradients
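A rough sketch of the loss for one image pair, under stated assumptions: the slides do not give the form of the approximate yield function, so `approx_yield` below is an invented sigmoid with made-up parameters, and the 0.2 threshold is assumed to apply to the distance between the two images' feature vectors. Gradients are estimated by finite differences here; a real training setup would use an autodiff framework.

```python
import math

def approx_yield(mean_cos_dist, steepness=20.0, midpoint=0.3):
    """Hypothetical smooth stand-in for hybridization yield.

    Assumption: yield falls from ~1 to ~0 as sequence distance grows.
    The sigmoid shape and both parameters are invented for illustration.
    """
    return 1.0 / (1.0 + math.exp(steepness * (mean_cos_dist - midpoint)))

def similarity_label(feature_distance, threshold=0.2):
    """1.0 if the two images count as similar (distance < 0.2), else 0.0."""
    return 1.0 if feature_distance < threshold else 0.0

def cross_entropy(label, predicted, eps=1e-12):
    """Binary cross-entropy between the label and the approximate yield."""
    p = min(max(predicted, eps), 1.0 - eps)
    return -(label * math.log(p) + (1.0 - label) * math.log(1.0 - p))

def pair_loss(mean_cos_dist, feature_distance):
    return cross_entropy(similarity_label(feature_distance),
                         approx_yield(mean_cos_dist))

# Similar images (feature distance 0.1): close sequences give a low loss,
# distant sequences a high one.
loss_close = pair_loss(mean_cos_dist=0.05, feature_distance=0.1)
loss_far = pair_loss(mean_cos_dist=0.60, feature_distance=0.1)

# Every step is smooth in mean_cos_dist, so a gradient exists; here it is
# estimated by central finite differences instead of backpropagation.
h = 1e-6
grad = (pair_loss(0.60 + h, 0.1) - pair_loss(0.60 - h, 0.1)) / (2 * h)
```

With this stand-in, `grad` is positive for a similar pair whose sequences are far apart, so a gradient step would pull the two sequence encodings closer together.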
Database Strand Design
Domains: FP, d(T), IP, f(T), RP (lengths shown: 18 nt, 18 nt, 19 nt, 30 nt, 5 nt)
Image 209
Annotations: Allows Sequencing, Allows Sequencing, Allows Protection
Double-stranded region prevents interference
Query Strand Design
Database strand domains: FP, d(T), IP, f(T), RP (lengths shown: 18 nt, 18 nt, 19 nt, 30 nt, 5 nt)
Query strand domains: f(Q)*, RP[:6]*, B (lengths shown: 30 nt, 6 nt)
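A minimal sketch of how the binding part of a query strand might be derived, assuming that `*` denotes the Watson-Crick complement, that `RP[:6]` means the first six nucleotides of RP, and that all strands are written 5'→3'. The function names and the placeholder sequences are hypothetical, and the `B` domain is left out because the slide does not define it.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def reverse_complement(seq):
    """Watson-Crick complement of `seq`, read in the opposite direction."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def query_binding_region(f_q, rp, rp_prefix_len=6):
    """Sequence complementary to f(Q) followed by the first 6 nt of RP."""
    return reverse_complement(f_q + rp[:rp_prefix_len])

# Placeholder sequences, far shorter than the 30 nt feature region.
f_q = "AAC"
rp = "GGGTTTAAA"
print(query_binding_region(f_q, rp))  # AAACCCGTT
```

Written 5'→3', the complement of RP[:6] comes first and the complement of f(Q) second, which mirrors the left-to-right order on the slide if the query is drawn antiparallel beneath the database strand.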
Dataset Construction
[Figure: Queries and Targets]
For each T out of 100 target images: FP d(T) IP f(T) RP
For each Q out of 10 query images: f(Q)* RP[:6]* B
Experimental Results
[Figure panels: Ideal Data; Chance Retrieval; Query Image; All Queries]
R²: -0.30, R²: 0.64
Thank you!