High-Throughput Subset Matching on
Commodity GPU-Based Systems
Daniele Rogora∗ Michele Papalini$ Koorosh Khazaei∗
Alessandro Margara% Antonio Carzaniga∗ Gianpaolo Cugola%
presented by
Daniele Rogora
% Politecnico di Milano, Milano, Italy
∗ Università della Svizzera italiana, Lugano, Switzerland
$ Cisco Systems, Paris, France
EuroSys 2017
Subset Match
Useful in many scenarios
Social networks, Twitter
Data Center management
Service brokering
Cloud 3.0
Example
Subscribers   Tag Set
...
Daniele       {#football, #acmilan}
              {#politics, #Italy}
Antonio       {#politics, #USA}
              {#chomsky}
...
Incoming tag sets:
{#politics, #USA, #Italy}
{#acmilan, #closing, #news, #football}
Tagsets Representation
Representation of tagsets with Bloom filters
a bitvector of size m
k independent hash functions h1, ..., hk
hi : Tags → {1, ..., m}
Example: (k = 2, m = 10)
D = {politics, Italy, USA}
[Figure: h1 and h2 map each tag of D to positions in a bit vector numbered 1 to 10; five of the ten bits end up set.]
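The encoding step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slides do not say which hash functions are used, so the salted SHA-256 below is a stand-in, and the helper names `bit_positions`, `encode` and `is_subset` are hypothetical. The bits it produces will therefore differ from the ones drawn on the slide.

```python
import hashlib

def bit_positions(tag, m, k):
    """Return the k bit positions (1..m) that a tag maps to."""
    positions = []
    for i in range(k):
        # Assumption: salting the tag with i stands in for k independent hash functions.
        digest = hashlib.sha256(f"{i}:{tag}".encode("utf-8")).digest()
        positions.append(int.from_bytes(digest[:8], "big") % m + 1)
    return positions

def encode(tags, m=10, k=2):
    """Encode a tag set as an m-bit Bloom filter, written as a string of 0s and 1s."""
    bits = ["0"] * m
    for tag in tags:
        for pos in bit_positions(tag, m, k):
            bits[pos - 1] = "1"
    return "".join(bits)

def is_subset(a, b):
    """True if every bit set in filter a is also set in filter b."""
    return all(y == "1" for x, y in zip(a, b) if x == "1")

d = encode({"politics", "Italy", "USA"})
s = encode({"politics", "Italy"})
print(d, s, is_subset(s, d))
```

Because a tag always sets the same bits, the filter of a subset of D only has bits that are also set in the filter of D. The converse does not hold: a bitwise subset may be a Bloom-filter false positive.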
Example
Subscribers   Bit String
...
k1            1001101000
              0010010011
k2            1001000011
              0000101000
...
Incoming bit string:
1011010011
Model
Tagset table
Bit String    Keys
1000100000    k2
1010000100    k4, k2
0110100000    k3
0011100010    k6, k2
0010101000    k5, k2
0000100100    k2
Query stream
0110101100
Output
match:         k2, k3, k5, k2
match-unique:  k2, k3, k5
The stream of filters is intense: 6k queries/s
The database is huge: 212M tag sets
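The two outputs on this slide can be reproduced with a short sketch. It assumes bit strings are compared as integers and that a linear scan is acceptable for six rows. The function names `match` and `match_unique` mirror the slide's labels but are otherwise hypothetical.

```python
# Tagset table from the slide: (bit string, keys).
TAGSET_TABLE = [
    ("1000100000", ["k2"]),
    ("1010000100", ["k4", "k2"]),
    ("0110100000", ["k3"]),
    ("0011100010", ["k6", "k2"]),
    ("0010101000", ["k5", "k2"]),
    ("0000100100", ["k2"]),
]

def is_subset(entry, query):
    """True if every bit set in entry is also set in query."""
    a, b = int(entry, 2), int(query, 2)
    return a & b == a

def match(query, table=TAGSET_TABLE):
    """Keys of every table entry that is a subset of the query, with duplicates."""
    keys = []
    for bits, entry_keys in table:
        if is_subset(bits, query):
            keys.extend(entry_keys)
    return keys

def match_unique(query, table=TAGSET_TABLE):
    """Same as match, but each key is reported once."""
    return set(match(query, table))

print(match("0110101100"))
print(sorted(match_unique("0110101100")))
```

Three rows are subsets of the query 0110101100 (0110100000, 0010101000 and 0000100100), which is why k2 appears twice in the plain match output.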
A Complex Problem
                                  database size
system                            20M       40M       212M
MongoDB                           -         -         -
GPU-only, plain                   0.40      0.20      0.04
GPU-only, plain with batching     11.50     6.30      1.20
CPU-only, fast prefix tree        21.10     14.00     4.30
CPU-only, state-of-the-art ICN    27.60     17.40     -
CPU-only, TagMatch                3.90      3.40      0.68
TagMatch                          268.80    144.40    35.30
(throughput: thousand queries per second)
Rivest, 1976
TagMatch
First Approach: using GPUs
[Figure: a GPU kernel made of Block 0, Block 1, ..., Block n runs over the tagset table s0, s1, ..., sn; thread i handles entry si.]
One query q per kernel:
    thread i
        if (si ⊆ q)
            results.add(q)
A batch of queries q0, q1, ..., q255 per kernel:
    thread i
        for (q ∈ q0 . . . q255)
            if (si ⊆ q)
                results.add(q)
CPU: launch kernel
CPU: merge matches with keys (results + key table)
This is not fast enough
                                  database size
system                            20M       40M       212M
GPU-only, plain                   0.40      0.20      0.04
GPU-only, plain with batching     11.50     6.30      1.20
(throughput: thousand queries per second)
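The batched kernel and the CPU merge step can be emulated on the CPU to make the data flow concrete. This is a sketch, not the authors' CUDA code: the outer loop plays the role of the GPU threads, the result is recorded as (query index, tagset index) pairs (an assumption, since the slide only shows `results.add(q)`), and the table reuses the small example from the Model slide.

```python
TAGSETS = [0b1000100000, 0b1010000100, 0b0110100000,
           0b0011100010, 0b0010101000, 0b0000100100]
KEY_TABLE = [["k2"], ["k4", "k2"], ["k3"], ["k6", "k2"], ["k5", "k2"], ["k2"]]

def kernel(tagsets, batch):
    """Emulate the batched kernel: one outer iteration per GPU thread i."""
    results = []
    for i, s in enumerate(tagsets):        # thread i owns table entry s_i
        for j, q in enumerate(batch):      # loop over the batched queries
            if s & q == s:                 # s_i is a subset of q
                results.append((j, i))     # (query index, tagset index)
    return results

def merge(results, key_table, batch_size):
    """CPU step: join the kernel's pairs with the key table, per query."""
    per_query = [[] for _ in range(batch_size)]
    for j, i in results:
        per_query[j].extend(key_table[i])
    return per_query

batch = [0b0110101100, 0b0001011100]
pairs = kernel(TAGSETS, batch)
keys = merge(pairs, KEY_TABLE, len(batch))
print(pairs, keys)
```

Every thread still scans the whole batch against its entry, so the work grows with the product of table size and batch size. Batching only amortizes the cost of launching the kernel.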
Partitioning
lots of filters share many bits...
we could filter out many filters efficiently and quickly...
Bit String    Keys
1000100000    k2
1010100100    k4, k2
0110100000    k3
0011000010    k6, k2
0011101000    k5, k2
0001100100    k2
Query: 0001011100
and we can do that efficiently on the CPU, while preserving batches
Model
input queries (stream):
  {@POTUS, energy, policy}
  {@Chomsky, education}
  {@ggreenwald, NSA}⋆
  ...
Bloom-filter encoding (⋆ marks a "unique" query):
  q1  = 010101···11
  q2  = 011111···01
  q3⋆ = 001110···11
  ...
pre-process (CPU), partition table:
  0    none
  1    010001···01 → P1
  2    001100···00 → P2
       001010···11 → P3
       001011···01 → P4
  3    000101···10 → P5
  ...
  191  ...
batches:
  batch1 (P1): ..., q2
  batch2 (P2): ..., q2, q3
  batch3 (P3): ..., q1, q3
  ...
subset match (GPU), tagset table:
  P1   011011···01 ↔ 1
       010101···11 ↔ 2
       010101···01 ↔ 3
       ...
  P2   001101···10 ↔ 62
       001101···01 ↔ 63
       001100···11 ↔ 64
       ...
  ...
results:
  results1: (q2, 1), (q2, 3), ...
  results2: (q2, 63), (q3, 71), ...
  results3: (q1, 324), (q3, 99), ...
  ...
key lookup/reduce (CPU), key table:
  1  → k1, k2
  3  → k2, k6, k8
  ...
  63 → k5, k8, k13
  ...
merge (CPU), results (stream):
  q1  → k3, k13, ...
  q2  → k1, k2, k2, k6, k8, k5, k8, k13, ...
  q3⋆ → k9, k3, k37, k3, k7, ...
  ...
Partitioning
Max size: 3
Initial table, a single partition:
P   Bit String
0   1000100000
    1010000100
    0110100000
    0011100010
    0010101000
    0001101101
    0000110100
    0000110001
    0000010110
    0000001110
After the first split:
P   Bit String
0   1010000100
    0001101101
    0000110100
    0000010110
    0000001110
1   1000100000
    0110100000
    0011100010
    0010101000
    0000110001
After the second split, with the mask of each partition:
P   Mask          Bit String
0   0000100100    0001101101
                  0000110100
1   0000000100    1010000100
                  0000010110
                  0000001110
2   0010100000    0110100000
                  0011100010
                  0010101000
3   0000100000    1000100000
                  0000110001
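The splits shown on these slides can be reproduced with a simple recursive procedure. The pivot rule below is an assumption, since the slides do not state it: split on the bit that is set in closest to half of the strings, leftmost bit on ties, and put the strings that have the bit first. With that rule and the slide's maximum size of 3, the sketch yields the same four partitions, and taking the bitwise AND of each partition yields the same masks. The names `split` and `mask` are hypothetical.

```python
def split(strings, max_size):
    """Recursively split bit strings until each partition has at most max_size entries."""
    n = len(strings)
    if n <= max_size:
        return [strings]
    width = len(strings[0])
    counts = [sum(s[b] == "1" for s in strings) for b in range(width)]
    # Assumed pivot rule: the bit set in closest to half of the strings, leftmost on ties.
    pivot = min(range(width), key=lambda b: (abs(2 * counts[b] - n), b))
    if counts[pivot] in (0, n):
        return [strings]  # all strings are identical, so no bit can split them
    ones = [s for s in strings if s[pivot] == "1"]
    zeros = [s for s in strings if s[pivot] == "0"]
    return split(ones, max_size) + split(zeros, max_size)

def mask(partition):
    """Bits shared by every string of a partition (bitwise AND)."""
    width = len(partition[0])
    return "".join("1" if all(s[b] == "1" for s in partition) else "0"
                   for b in range(width))

TABLE = ["1000100000", "1010000100", "0110100000", "0011100010", "0010101000",
         "0001101101", "0000110100", "0000110001", "0000010110", "0000001110"]

partitions = split(TABLE, max_size=3)
for number, part in enumerate(partitions):
    print(number, mask(part), part)
```

A query can skip a whole partition whenever the partition's mask is not a subset of the query, because every string in the partition contains all the mask bits.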
Pre Process
front
end
1st bit Mask...
.
.
.
2 0010100000 → P2
40000100100 → P0
0000100000 → P3
7 0000000100 → P1
.
.
....
thread poolfooooo
partition
queues
P0
P1
P2
P3
Pn
GPU
handlers
GPUscheduler
![Page 56: High-Throughput Subset Matching on Commodity GPU-Based … · Subset Match Useful in many scenarios Social networks, Twitter Data Center management Service brokering 2/30](https://reader033.vdocuments.us/reader033/viewer/2022050206/5f58f7a83bc36368ed6d3967/html5/thumbnails/56.jpg)
Pipeline animation (same diagram as the previous slide):
q0, q1 and q2 arrive in turn; each is appended to the queues of several partitions
one queue holds q0 q1 q2 and is flushed to the GPU
the other queues still hold q1 and q2
Timeout expired! The remaining queues are flushed
15 / 30
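The queue behaviour in this animation (flush when a batch is complete, or when a timeout expires) can be sketched as below. This is an illustration under assumptions: the class name `PartitionQueue`, the `on_flush` callback, and the `poll` method are hypothetical, and the capacity is a parameter (the kernel slides show batches q0 . . . q255, which suggests 256).

```python
import time

class PartitionQueue:
    """Collects queries for one partition and hands them over in batches."""

    def __init__(self, name, capacity, timeout_s, on_flush, clock=time.monotonic):
        self.name = name
        self.capacity = capacity      # batch size that triggers an immediate flush
        self.timeout_s = timeout_s    # maximum age of the oldest queued query
        self.on_flush = on_flush      # callback standing in for the GPU handler
        self.clock = clock
        self.batch = []
        self.oldest = None            # arrival time of the oldest queued query

    def add(self, query):
        if not self.batch:
            self.oldest = self.clock()
        self.batch.append(query)
        if len(self.batch) >= self.capacity:
            self.flush()

    def poll(self):
        """Called periodically; flushes a partial batch once it is too old."""
        if self.batch and self.clock() - self.oldest >= self.timeout_s:
            self.flush()

    def flush(self):
        batch, self.batch, self.oldest = self.batch, [], None
        self.on_flush(self.name, batch)
```

A larger timeout lets batches fill up, which favours throughput; a smaller one bounds how long a query can wait in a rarely used partition.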
![Page 62: High-Throughput Subset Matching on Commodity GPU-Based … · Subset Match Useful in many scenarios Social networks, Twitter Data Center management Service brokering 2/30](https://reader033.vdocuments.us/reader033/viewer/2022050206/5f58f7a83bc36368ed6d3967/html5/thumbnails/62.jpg)
Optimization
16 / 30
![Page 63: High-Throughput Subset Matching on Commodity GPU-Based … · Subset Match Useful in many scenarios Social networks, Twitter Data Center management Service brokering 2/30](https://reader033.vdocuments.us/reader033/viewer/2022050206/5f58f7a83bc36368ed6d3967/html5/thumbnails/63.jpg)
GPU Optimization
Kernel input: q0 q1 q2 q3 q4 . . . q255
Blocks: Block 0, Block 1, Block 2, . . . , Block n
Tagset table: s0, s1, s2, . . . , sn−2, sn−1, sn
17 / 30
![Page 64: High-Throughput Subset Matching on Commodity GPU-Based … · Subset Match Useful in many scenarios Social networks, Twitter Data Center management Service brokering 2/30](https://reader033.vdocuments.us/reader033/viewer/2022050206/5f58f7a83bc36368ed6d3967/html5/thumbnails/64.jpg)
GPU Optimization
Kernel q0 q1 q2 q3 q4 . . . q255
Block 0
t255 | 1110010100
. . . | . . .
t2 | 1110100000
t1 | 1110110000
t0 | 1110110110
Block 1
t255 | 0011101101
. . . | . . .
t2 | 0101101011
t1 | 0110001110
t0 | 0110010110
17 / 30
![Page 67: High-Throughput Subset Matching on Commodity GPU-Based … · Subset Match Useful in many scenarios Social networks, Twitter Data Center management Service brokering 2/30](https://reader033.vdocuments.us/reader033/viewer/2022050206/5f58f7a83bc36368ed6d3967/html5/thumbnails/67.jpg)
GPU Optimization: Phase 1
Kernel input: q0 q1 q2 q3 q4 . . . q255
Block: Thread 0 computes the common prefix; Thread 1, Thread 2, Thread 3, . . . , Thread n are idle
first = 1110110110
last = 1110010100
first ⊕ last = 0000100010
common prefix = 1110000000
17 / 30
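The Phase 1 arithmetic can be reproduced with a few integer operations. This sketch assumes, as the block listing suggests, that the tagsets of a block are sorted, so the prefix shared by the first and last entries is shared by all entries in between; the function name `common_prefix` is hypothetical.

```python
def common_prefix(first: int, last: int, width: int) -> int:
    """Leading bits on which first and last agree, padded with zeros to width bits."""
    diff = first ^ last                   # set bits mark positions where they differ
    keep = width - diff.bit_length()      # number of leading bits that agree
    mask = ((1 << keep) - 1) << (width - keep)
    return first & mask

# Values from the slide.
FIRST = 0b1110110110
LAST = 0b1110010100
PREFIX = common_prefix(FIRST, LAST, 10)
print(format(FIRST ^ LAST, "010b"))  # 0000100010
print(format(PREFIX, "010b"))        # 1110000000
```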
![Page 68: High-Throughput Subset Matching on Commodity GPU-Based … · Subset Match Useful in many scenarios Social networks, Twitter Data Center management Service brokering 2/30](https://reader033.vdocuments.us/reader033/viewer/2022050206/5f58f7a83bc36368ed6d3967/html5/thumbnails/68.jpg)
GPU Optimization: Phase 2
Kernel input: q0 q1 q2 q3 q4 . . . q255
common prefix = 1110000000
Block: thread i tests one query (Thread 0: prefix ⊆ q0?, Thread 1: prefix ⊆ q1?, Thread 2: prefix ⊆ q2?, Thread 3: prefix ⊆ q3?, . . . , Thread n: prefix ⊆ qn?)
Outcome in the example: Thread 0 ✓, Thread 1 ✓, Thread 2 ✗, Thread 3 ✓, Thread n ?
Queries that pass the test are collected in Q = q1 q3 q21 q0 q200 q177
17 / 30
![Page 70: High-Throughput Subset Matching on Commodity GPU-Based … · Subset Match Useful in many scenarios Social networks, Twitter Data Center management Service brokering 2/30](https://reader033.vdocuments.us/reader033/viewer/2022050206/5f58f7a83bc36368ed6d3967/html5/thumbnails/70.jpg)
GPU Optimization: Phase 3
Kernel input: q0 q1 q2 q3 q4 . . . q255
common prefix = 1110000000
Q = q1 q3 q21 q0 q200 q177
Block: every thread (Thread 0, Thread 1, Thread 2, Thread 3, . . . , Thread n) runs the same loop over Q, with f the tagset held by that thread:
    for (qi ∈ Q)
        if (f ⊆ qi)
            results.add(qi)
17 / 30
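The three phases can be put together in a sequential simulation of one block. This is a sketch, not the CUDA kernel: the list comprehension and loops stand in for what the threads of a block would do in parallel, `f` is assumed to be the tagset held by each thread, and `match_block` is a hypothetical name. The tagsets are those of Block 0 on the earlier slide; the queries are made up for the example.

```python
def is_subset(a: int, b: int) -> bool:
    """True when every bit set in a is also set in b."""
    return a & ~b == 0

def match_block(tagsets, queries, width):
    """Simulate one block; tagsets must be sorted, as in the block listing."""
    # Phase 1: one thread derives the prefix shared by all tagsets of the block.
    diff = tagsets[0] ^ tagsets[-1]
    keep = width - diff.bit_length()
    prefix = tagsets[0] & (((1 << keep) - 1) << (width - keep))
    # Phase 2: one query per thread; keep only queries that contain the prefix.
    candidates = [i for i, q in enumerate(queries) if is_subset(prefix, q)]
    # Phase 3: one tagset per thread; test it against the surviving queries only.
    results = []
    for t, f in enumerate(tagsets):
        for i in candidates:
            if is_subset(f, queries[i]):
                results.append((t, i))   # (tagset index, query index)
    return prefix, candidates, results

TAGSETS = [0b1110110110, 0b1110110000, 0b1110100000, 0b1110010100]
QUERIES = [0b1111110110, 0b0110110110, 0b1110100001]
PREFIX, CANDIDATES, RESULTS = match_block(TAGSETS, QUERIES, 10)
```

The prefix test in Phase 2 is safe because a tagset can only be a subset of a query that also contains every bit of the tagset's prefix, so discarded queries cannot match any tagset in the block.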
![Page 71: High-Throughput Subset Matching on Commodity GPU-Based … · Subset Match Useful in many scenarios Social networks, Twitter Data Center management Service brokering 2/30](https://reader033.vdocuments.us/reader033/viewer/2022050206/5f58f7a83bc36368ed6d3967/html5/thumbnails/71.jpg)
Workflow Optimization
18 / 30
![Page 72: High-Throughput Subset Matching on Commodity GPU-Based … · Subset Match Useful in many scenarios Social networks, Twitter Data Center management Service brokering 2/30](https://reader033.vdocuments.us/reader033/viewer/2022050206/5f58f7a83bc36368ed6d3967/html5/thumbnails/72.jpg)
Workflow Optimization: two copies
GPU result buffer: Size = 3, Data = q7, q21, q1
CPU result buffer: Size and Data, filled in by two separate copies
Steps: run kernel → copy res size → sync → copy res data → sync → process res
18 / 30
![Page 79: High-Throughput Subset Matching on Commodity GPU-Based … · Subset Match Useful in many scenarios Social networks, Twitter Data Center management Service brokering 2/30](https://reader033.vdocuments.us/reader033/viewer/2022050206/5f58f7a83bc36368ed6d3967/html5/thumbnails/79.jpg)
Workflow Optimization: one copy
GPU result buffer: Size, Data
CPU result buffer: Size, Data
Steps: run kernel → copy all res → sync → process res
18 / 30
![Page 80: High-Throughput Subset Matching on Commodity GPU-Based … · Subset Match Useful in many scenarios Social networks, Twitter Data Center management Service brokering 2/30](https://reader033.vdocuments.us/reader033/viewer/2022050206/5f58f7a83bc36368ed6d3967/html5/thumbnails/80.jpg)
Workflow Optimization: two result buffers
GPU: two Size/Data result buffers; CPU: two Size/Data result buffers
Example contents during the animation: Size 2, Data q207, q17; Size 3, Data q7, q21, q1; Size 4, Data q87, q12, q1, q5
Steps, repeated for consecutive kernel runs: run kernel → copy res → sync → process res
18 / 30
![Page 90: High-Throughput Subset Matching on Commodity GPU-Based … · Subset Match Useful in many scenarios Social networks, Twitter Data Center management Service brokering 2/30](https://reader033.vdocuments.us/reader033/viewer/2022050206/5f58f7a83bc36368ed6d3967/html5/thumbnails/90.jpg)
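Workflow Optimization: comparison of the three timelines
Two copies: run kernel → copy res size → sync → copy res data → sync → process res
One copy: run kernel → copy all res → sync → process res
Two result buffers: run kernel → copy res → sync → process res, once per buffer, for two consecutive kernel runs
18 / 30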
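The step from two copies to one can be pictured as a result buffer that carries its own size in a header, so that a single transfer brings back everything the CPU needs. The sketch below only simulates the layout on the host in Python. The 32-bit little-endian encoding, the `CAPACITY` constant, and the function names are assumptions for illustration; the slides do not show how TagMatch lays out or sizes its buffers.

```python
import struct

CAPACITY = 256                    # hypothetical maximum number of result ids per run
HEADER = struct.Struct("<I")      # one 32-bit word holding the result count

def pack_results(ids):
    """Producer side (simulated): count first, then the ids, in one fixed-size buffer."""
    buf = bytearray(HEADER.size + 4 * CAPACITY)
    HEADER.pack_into(buf, 0, len(ids))
    struct.pack_into(f"<{len(ids)}I", buf, HEADER.size, *ids)
    return bytes(buf)

def unpack_results(buf):
    """Consumer side: read the count from the header, then only that many ids."""
    (count,) = HEADER.unpack_from(buf, 0)
    return list(struct.unpack_from(f"<{count}I", buf, HEADER.size))

# The example from the slides: Size = 3, Data = q7, q21, q1.
BUFFER = pack_results([7, 21, 1])
```

The trade-off in this layout is that one transfer and one synchronization replace two of each, at the price of possibly copying unused capacity.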
![Page 91: High-Throughput Subset Matching on Commodity GPU-Based … · Subset Match Useful in many scenarios Social networks, Twitter Data Center management Service brokering 2/30](https://reader033.vdocuments.us/reader033/viewer/2022050206/5f58f7a83bc36368ed6d3967/html5/thumbnails/91.jpg)
Evaluation
19 / 30
![Page 92: High-Throughput Subset Matching on Commodity GPU-Based … · Subset Match Useful in many scenarios Social networks, Twitter Data Center management Service brokering 2/30](https://reader033.vdocuments.us/reader033/viewer/2022050206/5f58f7a83bc36368ed6d3967/html5/thumbnails/92.jpg)
Evaluation
A single machine
24 physical (48 virtual) CPU cores
2 NVIDIA Titan X GPUs
19 / 30
![Page 93: High-Throughput Subset Matching on Commodity GPU-Based … · Subset Match Useful in many scenarios Social networks, Twitter Data Center management Service brokering 2/30](https://reader033.vdocuments.us/reader033/viewer/2022050206/5f58f7a83bc36368ed6d3967/html5/thumbnails/93.jpg)
Scalability
Does it scale with bigger databases?
[Plot: throughput (thousand queries/s; axis ticks 1, 10, 100) versus database size (20% to 100% of the full Twitter database). Series: TagMatch, match; TagMatch, match-unique; prefix tree, match; prefix tree, match-unique.]
20 / 30
![Page 97: High-Throughput Subset Matching on Commodity GPU-Based … · Subset Match Useful in many scenarios Social networks, Twitter Data Center management Service brokering 2/30](https://reader033.vdocuments.us/reader033/viewer/2022050206/5f58f7a83bc36368ed6d3967/html5/thumbnails/97.jpg)
Threads
Does it scale with bigger machines?
[Plot: throughput (thousand queries/s; 0 to 50) versus number of threads (8 to 48). Series: TagMatch, match; TagMatch, match-unique; prefix tree, match; prefix tree, match-unique. Annotation: GPU limit!]
21 / 30
![Page 100: High-Throughput Subset Matching on Commodity GPU-Based … · Subset Match Useful in many scenarios Social networks, Twitter Data Center management Service brokering 2/30](https://reader033.vdocuments.us/reader033/viewer/2022050206/5f58f7a83bc36368ed6d3967/html5/thumbnails/100.jpg)
Latency
Does batching kill latency?
[Plot: latency (s; 0 to 4) versus timeout (200, 400, 600, 800 ms, and no limit). Shown for each timeout: 1%, 25%, median, 75% and 99% percentiles, and maximum.]
22 / 30
![Page 102: High-Throughput Subset Matching on Commodity GPU-Based … · Subset Match Useful in many scenarios Social networks, Twitter Data Center management Service brokering 2/30](https://reader033.vdocuments.us/reader033/viewer/2022050206/5f58f7a83bc36368ed6d3967/html5/thumbnails/102.jpg)
Memory usage
How much memory does it need?
[Plot: memory usage (GB; axis ticks 5 to 30) versus database size (0% to 100% of the full Twitter database). Series: GPU, I/O buffers; GPU, tagset table; Host.]
23 / 30
![Page 104: High-Throughput Subset Matching on Commodity GPU-Based … · Subset Match Useful in many scenarios Social networks, Twitter Data Center management Service brokering 2/30](https://reader033.vdocuments.us/reader033/viewer/2022050206/5f58f7a83bc36368ed6d3967/html5/thumbnails/104.jpg)
Conclusion
subset matching
◮ computationally complex
◮ highly parallelizable
TagMatch
◮ implements an efficient CPU/GPU pipeline
https://github.com/carzaniga/TagMatch
24 / 30
![Page 110: High-Throughput Subset Matching on Commodity GPU-Based … · Subset Match Useful in many scenarios Social networks, Twitter Data Center management Service brokering 2/30](https://reader033.vdocuments.us/reader033/viewer/2022050206/5f58f7a83bc36368ed6d3967/html5/thumbnails/110.jpg)
Partition size
[Plot: throughput (thousand queries/s; 0 to 40) versus MAXP, the maximum size of partitions (thousands; 0 to 900). Series: match; match-unique.]
26 / 30
![Page 111: High-Throughput Subset Matching on Commodity GPU-Based … · Subset Match Useful in many scenarios Social networks, Twitter Data Center management Service brokering 2/30](https://reader033.vdocuments.us/reader033/viewer/2022050206/5f58f7a83bc36368ed6d3967/html5/thumbnails/111.jpg)
MongoDB
[Plot: throughput (queries/s; axis ticks 10^-1 to 10^6) versus number of tags per query (4 to 10). Series: TagMatch 1M, TagMatch 3M, TagMatch 5M, MongoDB 1M, MongoDB 3M, MongoDB 5M.]
27 / 30
![Page 112: High-Throughput Subset Matching on Commodity GPU-Based … · Subset Match Useful in many scenarios Social networks, Twitter Data Center management Service brokering 2/30](https://reader033.vdocuments.us/reader033/viewer/2022050206/5f58f7a83bc36368ed6d3967/html5/thumbnails/112.jpg)
Partitioning time
[Plot: time (s; 0 to 50) versus database size (10% to 100% of the full Twitter database). Series: balanced partitioning.]
28 / 30
![Page 113: High-Throughput Subset Matching on Commodity GPU-Based … · Subset Match Useful in many scenarios Social networks, Twitter Data Center management Service brokering 2/30](https://reader033.vdocuments.us/reader033/viewer/2022050206/5f58f7a83bc36368ed6d3967/html5/thumbnails/113.jpg)
More tags
[Plot 1: throughput (thousand queries/s; axis ticks 0.1 to 1000) versus number of additional tags per query (0 to 9). Series: TagMatch; prefix tree.]
[Plot 2: output throughput (thousand keys/s; axis ticks 100 to 100000) versus number of additional tags per query (0 to 9). Series: TagMatch; prefix tree.]
29 / 30
![Page 114: High-Throughput Subset Matching on Commodity GPU-Based … · Subset Match Useful in many scenarios Social networks, Twitter Data Center management Service brokering 2/30](https://reader033.vdocuments.us/reader033/viewer/2022050206/5f58f7a83bc36368ed6d3967/html5/thumbnails/114.jpg)
Descriptors Representation
Representation of tagsets with Bloom filters
a bitvector of size m
k independent hash functions h1, . . . , hk
hi : Tags → {1, . . . , m}
Example (k = 2, m = 10): D = {politics, Italy, USA}, hashed by h1 and h2, sets 5 of the 10 bits
Concretely, in our implementation: m = 192, k = 7
False positives: testing $S_1 \subseteq S_2$ with Bloom filters gives a false positive with probability $\left(1 - e^{-k|S_2|/m}\right)^{k|S_1 \setminus S_2|}$
For example, when $|S_2| = 10$ and $|S_1 \setminus S_2| = 3$, we have a false positive with probability of about $10^{-11}$
30 / 30
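The encoding and the false-positive estimate on this slide can be tried out with a short sketch. The parameters m = 192 and k = 7 come from the slide; everything else is an assumption for illustration: the k hash functions are simulated with salted SHA-256 (the slide does not say which hash functions TagMatch uses), bit positions run from 0 to m − 1 rather than 1 to m, and the function names are hypothetical.

```python
import hashlib
import math

M, K = 192, 7   # filter width and number of hash functions, as on the slide

def bloom(tags, m=M, k=K):
    """Encode a set of tags as an m-bit integer, setting up to k bits per tag."""
    bits = 0
    for tag in tags:
        for i in range(k):
            digest = hashlib.sha256(f"{i}:{tag}".encode()).digest()
            bits |= 1 << (int.from_bytes(digest[:8], "big") % m)
    return bits

def maybe_subset(a: int, b: int) -> bool:
    """Bitwise subset test on filters: never a false negative, rarely a false positive."""
    return a & ~b == 0

def false_positive_probability(size_s2, extra, m=M, k=K):
    """(1 - e^(-k|S2|/m)) ** (k|S1 \\ S2|), the estimate given on the slide."""
    return (1 - math.exp(-k * size_s2 / m)) ** (k * extra)

S1 = bloom({"politics", "Italy"})
S2 = bloom({"politics", "Italy", "USA"})
P = false_positive_probability(10, 3)   # the slide's example: |S2| = 10, |S1 \ S2| = 3
```

With these inputs `P` evaluates to roughly 1.5e-11, in line with the 10^-11 quoted on the slide, and the subset test reduces to one AND-NOT over the filter bits.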