![Page 1: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/1.jpg)
Parallel DBs
![Page 2: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/2.jpg)
Why Scale?Scan of 1 PB at 300MB/s (SATA r2 Limit)
![Page 3: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/3.jpg)
Why Scale Up?Scan of 1 PB at 300MB/s (SATA r2 Limit)
~1 Hour
![Page 4: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/4.jpg)
Why Scale Up?Scan of 1 PB at 300MB/s (SATA r2 Limit)
~1 Hour
…(x1000)
~3.5 Seconds
![Page 5: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/5.jpg)
Data Parallelism
A A A CBA
Replication Partitioning
![Page 6: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/6.jpg)
Operator Parallelism• Pipeline Parallelism: A task breaks down into
stages; each machine processes one stage.
• Partition Parallelism: Many machines doing the same thing to different pieces of data.
SequentialOperation
SequentialOperation
SequentialOperation
SequentialOperationSequential
OperationSequentialOperation
![Page 7: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/7.jpg)
Types of Parallelism
• Both types of parallelism are natural in a database management system.
SequentialOperationSequential
OperationSequentialOperation
SequentialOperationSequential
OperationSequentialOperation
SequentialOperationSequential
OperationSequentialOperation
SELECT SUM(…) FROM Table WHERE …
LOAD SELECT AGG Combine
SequentialOperation
![Page 8: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/8.jpg)
DBMSes: The First || Success Story
• Every major DBMS vendor has a || version.
• Reasons for success:
• Bulk Processing (Partition ||-ism).
• Natural Pipelining in RA plan.
• Users don’t need to think in ||.
![Page 9: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/9.jpg)
Types of Speedup
• Speed-up ||-ism
• More resources = proportionally less time spent.
• Scale-up ||-ism
• More resources = proportionally more data processed.
# of Nodes
Resp
on
se T
ime
# of Nodes
Thro
ug
hp
ut
![Page 10: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/10.jpg)
Parallelism Models
CPU
Memory
Disk
![Page 11: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/11.jpg)
Parallelism Models
CPU
Memory
Disk
…
How do the nodes communicate?
![Page 12: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/12.jpg)
Parallelism Models
CPU
Memory
Disk
…
Option 1: “Shared Memory” available to all CPUs
e.g., a Multi-Core/Multi-CPU System
![Page 13: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/13.jpg)
Parallelism Models
CPU
Memory
Disk
…
Used by most AMD servers
Option 2: Non-Uniform Memory Access.
![Page 14: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/14.jpg)
Parallelism Models
CPU
Memory
Disk
…
Each node interacts with a “disk” on the network.
Option 3: “Shared Disk” available to all CPUs
![Page 15: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/15.jpg)
Parallelism Models
CPU
Memory
Disk
…
Examples include MPP, Map/Reduce. Often used as basis for other abstractions.
Option 4: “Shared Nothing” in which all communication is explicit.
![Page 16: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/16.jpg)
Parallelizing
OLAP - Parallel Queries
OLTP - Parallel Updates
![Page 17: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/17.jpg)
Parallelizing
OLAP - Parallel Queries
OLTP - Parallel Updates
![Page 18: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/18.jpg)
Parallelism & Distribution
• Distribute the Data
• Redundancy
• Faster access
• Parallelize the Computation
• Scale up (compute faster)
• Scale out (bigger data)
![Page 19: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/19.jpg)
Operator Parallelism
• General Concept: Break task into individual units of computation.
• Challenge: How much data does each unit of computation need?
• Challenge: How much data transfer is needed to allow the unit of computation?
Same challenges arise in Multicore, CUDA programming.
![Page 20: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/20.jpg)
Parallel Data Flow
AA
No Parallelism
AA
![Page 21: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/21.jpg)
Parallel Data Flow
AA AA11 NN
N-Way Parallelism
![Page 22: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/22.jpg)
Parallel Data Flow
AA AA11 NN
BB BB11 NN
Chaining Parallel Operators
???
![Page 23: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/23.jpg)
Parallel Data Flow
AA AA11 NN
BB BB11 NN
One-to-One Data Flow (“Map”)
![Page 24: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/24.jpg)
Parallel Data Flow
AA AA11 NN
BB BB11 NN
One-to-One Data Flow
![Page 25: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/25.jpg)
Parallel Data Flow
AA AA11 NN
BB BB11 NN
Many-to-Many Data Flow
Extreme 1All-to-All
All nodes sendall records to
all downstreamnodes
Extreme 2Partition
Each recordgoes to exactly
one downstreamnode
![Page 26: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/26.jpg)
Parallel Data Flow
AA AA11 NN
BBBB
Many-to-One Data Flow (“Reduce/Fold”)
![Page 27: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/27.jpg)
Parallel Operators
Select Project Union (bag)
What is a logical “unit of computation”?
Is there a data dependency between units?
(1 tuple)
(no)
![Page 28: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/28.jpg)
Parallel Operators
Select Project Union (bag)
AA AA11 NN
1/N Tuples 1/N Tuples
![Page 29: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/29.jpg)
Parallel Aggregates
Algebraic: Bounded-size intermediate state(Sum, Count, Avg, Min, Max)
Holistic: Unbounded-size intermediate state(Median, Mode/Top-K Count, Count-Distinct;
Not Distribution-Friendly)
![Page 30: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/30.jpg)
AA
Fan-In Aggregation
AA11 NN
BBSUMSUM
![Page 31: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/31.jpg)
Fan-In Aggregation
AA11 AA22 AA33 AA44 AA55 AA66 AA77 AA88
SUMSUM 8 Messages
![Page 32: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/32.jpg)
Fan-In Aggregation
AA11 AA22 AA33 AA44 AA55 AA66 AA77 AA88
SUMSUM 4 Messages
SUMSUM11 SUMSUM
22 SUMSUM33 SUMSUM
44
2 Messages(each)
![Page 33: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/33.jpg)
Fan-In Aggregation
AA11 AA22 AA33 AA44 AA55 AA66 AA77 AA88
SUMSUM 2 Messages
SUMSUM11 SUMSUM
22 SUMSUM33 SUMSUM
44
2 Messages(each)
SUM’SUM’11 SUM’SUM’
22
![Page 34: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/34.jpg)
Fan-In Aggregation
If Each Node Performs K Units of Work…(K Messages)
How Many Rounds of Computation Are Needed?
LogK(N)
![Page 35: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/35.jpg)
Fan-In AggregationComponents
Combine(Intermediate1, …, IntermediateN)= Intermediate
<SUM1, COUNT1> … <SUM⊗ ⊗ N, COUNTN> = <SUM1+…+SUMN, COUNT1+…+COUNTN>
Compute(Intermediate) = Aggregate
Compute(<SUM, COUNT>) = SUM / COUNT
![Page 36: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/36.jpg)
Parallel Joins
FOR i IN 1 to N FOR j IN 1 to K JOIN(Block i of R, Block j of S)
One Unit of Computation
PartitionPartition
![Page 37: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/37.jpg)
Parallel Joins
Block 1 of R⋈
Block 1 of S
N P
art
itio
ns
of
R
K Partitions of SBlock 1 of R
⋈Block K of S
Block N of R⋈
Block K of S
Block N of R⋈
Block 1 of S
K
K
N N
![Page 38: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/38.jpg)
1⋈11⋈1
Parallel Joins
2⋈12⋈1 N⋈KN⋈K
R[1] R[2] R[N]… S[1] S[2] S[K]…
UNION
![Page 39: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/39.jpg)
Parallel Joins
How much data needs to be transferred?
How many “units of computation” do we create?
![Page 40: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/40.jpg)
Parallel Joins
What if we partitioned “intelligently”?
![Page 41: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/41.jpg)
Parallel JoinsHash(R.B)%4
0
1
2
3
√ √ √ √
R ⋈B S: Which Partitions of S Join w/ Bucket 0 of R?
Hash
(S.B
)%4
0 1 2 3
X X X
√
√
√
![Page 42: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/42.jpg)
Parallel JoinsR.BB<25
25≤B<50
50≤B<75
75≤B
R ⋈R.B < S.B S: Which Partitions of S Can Produce Output?
S.B
B<25 25≤B<50 50≤B<75 75≤B
√
√
√
√ √ √
√ √
√
√
X
X X
XXX
![Page 43: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/43.jpg)
Distributing the Work
S
⋈B
R
Let’s start simple… what can we do with no partitions?
R and S may be any RA expression…
![Page 44: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/44.jpg)
Distributing the Work
S
⋈B
RNode 1
No Parallelism!
![Page 45: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/45.jpg)
Distributing the Work
S
⋈B
RNode 2Node 1
Node 3
Lots of Data Transfer!
All of Rand
All of Sget sent!
![Page 46: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/46.jpg)
Distributing the Work
S
⋈B
RNode 2Node 1
All of Rget sent
Better! We can guess whether R or S is smaller.
![Page 47: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/47.jpg)
Distributing the WorkWhat can we do if R is partitioned?
R2
⋈B
SR1
⋈B
U
![Page 48: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/48.jpg)
Distributing the WorkThere are lots of partitioning strategies, but this one is interesting….
R2
⋈B
SR1
⋈B
U
Node 2 Node 3Node 1
![Page 49: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/49.jpg)
R2
⋈B
S1R1
⋈B
U
Distributing the Work… it can be used as a model for partitioning S…
Node 2 Node 3Node 1
![Page 50: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/50.jpg)
R2
⋈B
S2R1
⋈B
U
Distributing the Work… it can be used as a model for partitioning S…
Node 2 Node 3Node 1
![Page 51: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/51.jpg)
R2
⋈B
SR1
⋈B
U
Distributing the Work…and neatly captures the data transfer issue.
Node 2 Node 3Node 1
![Page 52: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/52.jpg)
Distributing the Work
Si joins with R1,R2,…,RN locally.
So let’s use it:
Goal: Minimize amount of data sent from Rk to Si
Solution 1: Use a partitioning strategy
Solution 2: “Hints” to figure out what Rk should send
![Page 53: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/53.jpg)
Sending Hints
Node 1 Node 2
Rk Si
Rk ⋈B SiThe naive approach…
![Page 54: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/54.jpg)
Sending Hints
Node 1 Node 2
Rk Si
Rk ⋈B SiThe naive approach…
Send me Rk
![Page 55: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/55.jpg)
Sending Hints
Node 1 Node 2
Rk Si
Rk ⋈B SiThe naive approach…
Rk
![Page 56: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/56.jpg)
Sending Hints
Node 1 Node 2
Rk Si
Rk ⋈B SiThe smarter approach…
πB( ) Si
![Page 57: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/57.jpg)
Sending Hints
Node 1 Node 2
Rk Si
Rk ⋈B SiThe smarter approach…
πB( ) Si
⋈ πB( ) Rk Si
![Page 58: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/58.jpg)
Sending Hints
Node 1 Node 2
Rk ⋈B SiThe smarter approach…
<1,A><2,B><2,C><3,D><4,E>
<2,X><3,Y><6,Y>
![Page 59: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/59.jpg)
Sending Hints
Node 1 Node 2
Rk ⋈B SiThe smarter approach…
<2,X>Send me rows with a ‘B’ of 2,3, or 6
<3,Y><1,A><2,B><2,C><3,D><4,E>
<6,Y>
Send me rows with a ‘B’ of 2,3, or 6
![Page 60: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/60.jpg)
Sending Hints
Node 1 Node 2
Rk ⋈B SiThe smarter approach…
<1,A><2,B><2,C><3,D>
<2,X><3,Y>
<4,E> This is called a semi-join.<6,Y>
<2,B><2,C><3,D>
Send me rows with a ‘B’ of 2,3, or 6
Send me rows with a ‘B’ of 2,3, or 6
![Page 61: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/61.jpg)
Sending Hints
Now Node 1 sends as little data as possible…
… but Node 2 needs to send a lot of data.
Can we do better?
![Page 62: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/62.jpg)
Sending Hints
Node 1 Node 2
Rk ⋈B Si
<1,A><2,B><2,C><3,D>
<2,X>
<4,E>
Strategy 1: Parity Bits
100
01
0<6,Y>0
![Page 63: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/63.jpg)
Sending Hints
Node 1 Node 2
Rk ⋈B Si
<1,A><2,B><2,C><3,D>
<2,X>
<4,E>
Strategy 1: Parity Bits
100
01
0<6,Y>0
Send me data with a parity bit of ‘0’
![Page 64: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/64.jpg)
Sending Hints
Node 1 Node 2
Rk ⋈B Si
<1,A><2,B><2,C><3,D>
<2,X>
<4,E>
Strategy 1: Parity Bit
100
01
0
Node 1 sending too much is ok!(Node 2 still needs to compute ⋈B)
<6,Y>0
Problem: One parity bit is too little
Send me data with a parity bit of ‘0’
<2,B><2,C><4,E>
![Page 65: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/65.jpg)
Sending Hints
Node 1 Node 2
Rk ⋈B Si
<1,A><2,B><2,C><3,D>
<2,X>
<4,E>
Strategy 2: Parity Bits
011010
0011
1011<3,Y>
<6,Y>10
Problem: Almost as much data as πB
<2,B><2,C><3,D>
Send me data with parity bits 10 or 11
![Page 66: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/66.jpg)
Sending Hints
Can we summarize the parity bits?
![Page 67: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/67.jpg)
Bloom Filters
AliceBob
CarolDave
![Page 68: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/68.jpg)
Bloom Filters
BloomFilter
BloomFilter
AliceBob
CarolDave
![Page 69: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/69.jpg)
Bloom Filters
BloomFilter
BloomFilter
AliceBob
CarolDave
Is Alice part of the set? Is Alice part of the set?
Is Eve part of the set? Is Eve part of the set?
Is Fred part of the set? Is Fred part of the set?
YesYes
NoNo
YesYesBloom Filter GuaranteeTest definitely returns Yes if the element is in the set
Test usually returns No if the element is not in the set
![Page 70: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/70.jpg)
Bloom Filters
A Bloom Filter is a bit vector
M - # of bits in the bit vector
K - # of hash functions
For ONE key (or record): For i between 0 and K: bitvector[ hashi (key) % M ] = 1
Each bit vector has ~K bits set
![Page 71: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/71.jpg)
Bloom Filters
00101010
01010110
10000110
01001100
Key 1
Key 2
Key 3
Key 4
Filters are combined by Bitwise-OR
e.g. (Key 1 | Key 2)
= 01111110
How do we test for inclusion?(Key & Filter) == Key?
(Key 1 & S) = 00101010(Key 3 & S) = 00000110(Key 4 & S) = 01001100
X√
False Positive√
![Page 72: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/72.jpg)
Sending Hints
Node 1 Node 2
Rk ⋈B Si
<1,A><2,B><2,C><3,D>
<2,X>
<4,E>
Strategy 3: Bloom Filters
<3,Y><6,Y>
![Page 73: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/73.jpg)
Sending Hints
Node 1 Node 2
Rk ⋈B Si
<1,A><2,B><2,C><3,D>
<2,X>
<4,E>
Strategy 3: Bloom Filters
<3,Y><6,Y>
Send me rows with a ‘B’ in the bloom filtersummarizing the set {2,3,6}
![Page 74: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/74.jpg)
Sending Hints
Node 1 Node 2
Rk ⋈B Si
<1,A><2,B><2,C><3,D>
<2,X>
<4,E>
Strategy 3: Bloom Filters
<3,Y><6,Y>
This is called a bloom-join.
Send me rows with a ‘B’ in the bloom filtersummarizing the set {2,3,6}
<2,B><2,C><3,D><4,E>
![Page 75: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/75.jpg)
Bloom Filters
Probability that 1 bit is set by 1 hash fn
1/m
![Page 76: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/76.jpg)
Bloom Filters
Probability that 1 bit is not set by 1 hash fn
1/m1 -
![Page 77: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/77.jpg)
Bloom Filters
Probability that 1 bit is not set by k hash fns
1/m1 -( )k
![Page 78: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/78.jpg)
Bloom FiltersProbability that 1 bit is not set by k hash fns
for n records
1/m1 -( )kn
So for an arbitrary record, what is the probabilitythat all of its bits will be set?
![Page 79: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/79.jpg)
Bloom FiltersProbability that 1 bit is set by k hash fns
for n records
1/m1 -( )kn1 -
![Page 80: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/80.jpg)
Bloom FiltersProbability that all k bits are set by k hash fns
for n records
1/m1 -( )kn1 -( )k≈
-kn/m(1- e )≈ k
![Page 81: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/81.jpg)
Bloom FiltersMinimal P[collision]
m/n = 10
m/n = 5
m/n = 20 m/n = 30
Minimal P[collision] is at k ≈ c ∙ m/n
![Page 82: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/82.jpg)
Bloom Filters
k ≈ c ∙ m/n
≈ cn
m is linearly related to n (for a fixed k)
km
![Page 83: Parallel DBs - University at BuffaloParallel DBs. Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ... • Natural Pipelining](https://reader030.vdocuments.us/reader030/viewer/2022041021/5ed1004cec13a664af27d55d/html5/thumbnails/83.jpg)
Bloom Join• Node 2 Computes Bloom Filter for Local
Records
• Node 2 Sends Bloom Filter to Node 1
• Node 1 Matches Local Records Against Bloom Filter
• Node 1 Sends Matched Records to Node 2
• Superset of “useful” records
• Node 2 Performs Join Locally