TRANSCRIPT
![Page 1: GPU-Accelerated Analytics on your Data Lake....Query 1 Query 2 Query 3 Query 4 Query 5 QUERIES NDS 1 1 5 5 46 6 1 8 8 3 72 1 14 14.9 12.2. Query 1 @blazingdb Query 1 NDS ... • String](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fc3569b99333974250e9fb2/html5/thumbnails/1.jpg)
GPU-Accelerated Analytics on your Data Lake.
![Page 2: GPU-Accelerated Analytics on your Data Lake....Query 1 Query 2 Query 3 Query 4 Query 5 QUERIES NDS 1 1 5 5 46 6 1 8 8 3 72 1 14 14.9 12.2. Query 1 @blazingdb Query 1 NDS ... • String](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fc3569b99333974250e9fb2/html5/thumbnails/2.jpg)
Data Lake
@blazingdb
![Page 3: GPU-Accelerated Analytics on your Data Lake....Query 1 Query 2 Query 3 Query 4 Query 5 QUERIES NDS 1 1 5 5 46 6 1 8 8 3 72 1 14 14.9 12.2. Query 1 @blazingdb Query 1 NDS ... • String](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fc3569b99333974250e9fb2/html5/thumbnails/3.jpg)
Data Swamp
![Page 4: GPU-Accelerated Analytics on your Data Lake....Query 1 Query 2 Query 3 Query 4 Query 5 QUERIES NDS 1 1 5 5 46 6 1 8 8 3 72 1 14 14.9 12.2. Query 1 @blazingdb Query 1 NDS ... • String](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fc3569b99333974250e9fb2/html5/thumbnails/4.jpg)
ETL Hell
[Diagram: DATA LAKE]
![Page 5: GPU-Accelerated Analytics on your Data Lake....Query 1 Query 2 Query 3 Query 4 Query 5 QUERIES NDS 1 1 5 5 46 6 1 8 8 3 72 1 14 14.9 12.2. Query 1 @blazingdb Query 1 NDS ... • String](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fc3569b99333974250e9fb2/html5/thumbnails/5.jpg)
COMMON DATA LAYER
![Page 6: GPU-Accelerated Analytics on your Data Lake....Query 1 Query 2 Query 3 Query 4 Query 5 QUERIES NDS 1 1 5 5 46 6 1 8 8 3 72 1 14 14.9 12.2. Query 1 @blazingdb Query 1 NDS ... • String](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fc3569b99333974250e9fb2/html5/thumbnails/6.jpg)
Simplify Data Storage
SCHEMA
METADATA
DATA
![Page 7: GPU-Accelerated Analytics on your Data Lake....Query 1 Query 2 Query 3 Query 4 Query 5 QUERIES NDS 1 1 5 5 46 6 1 8 8 3 72 1 14 14.9 12.2. Query 1 @blazingdb Query 1 NDS ... • String](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fc3569b99333974250e9fb2/html5/thumbnails/7.jpg)
SQL Warehouse on Data Lake
![Page 8: GPU-Accelerated Analytics on your Data Lake....Query 1 Query 2 Query 3 Query 4 Query 5 QUERIES NDS 1 1 5 5 46 6 1 8 8 3 72 1 14 14.9 12.2. Query 1 @blazingdb Query 1 NDS ... • String](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fc3569b99333974250e9fb2/html5/thumbnails/8.jpg)
BlazingDB – How it works
• Compression/Decompression
• Filtering (Predicate Pushdown)
• Aggregations
• Transformations
• Joins
• Sorting/Ordering
DATA LAKE
• RAM Cache (Hot)
• Disk Cache (Medium)
• HDD
• SSD
Local Disk
HDFS
AWS S3
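The slide names a hot RAM cache and a medium disk cache sitting in front of the data lake, but it does not show how a read moves through them. The sketch below is one possible reading of that layout, not BlazingDB's implementation: the class `TieredCache`, the `fetch_from_lake` callback, the `ram_capacity` parameter, and the oldest-first eviction rule are all hypothetical, and a plain dictionary stands in for the local disk.

```python
from typing import Callable, Dict, Tuple


class TieredCache:
    """Hypothetical two-tier cache in front of a slow data-lake read."""

    def __init__(self, fetch_from_lake: Callable[[str], bytes], ram_capacity: int = 2):
        self.fetch_from_lake = fetch_from_lake  # stand-in for an HDFS / S3 / local disk read
        self.ram: Dict[str, bytes] = {}         # hot tier
        self.disk: Dict[str, bytes] = {}        # medium tier (a dict stands in for local disk)
        self.ram_capacity = ram_capacity

    def get(self, key: str) -> Tuple[bytes, str]:
        """Return (data, tier), where tier says where the data was found."""
        if key in self.ram:
            return self.ram[key], "hot"
        if key in self.disk:
            data, tier = self.disk[key], "medium"
        else:
            data, tier = self.fetch_from_lake(key), "cold"
            self.disk[key] = data               # keep a copy on the medium tier
        self._promote(key, data)
        return data, tier

    def _promote(self, key: str, data: bytes) -> None:
        """Copy data into RAM, evicting the oldest entry when RAM is full."""
        if len(self.ram) >= self.ram_capacity:
            oldest = next(iter(self.ram))       # dicts preserve insertion order
            del self.ram[oldest]
        self.ram[key] = data


cache = TieredCache(lambda key: key.encode(), ram_capacity=2)
first = cache.get("a")    # read from the data lake
second = cache.get("a")   # served from RAM
cache.get("b")
cache.get("c")            # RAM is full, so "a" is evicted from RAM
third = cache.get("a")    # still on the disk tier
```

Under these assumptions the three reads of `"a"` come back as cold, hot, and medium in that order, which mirrors the Cold / Medium (Disk cache only) / Hot labels used in the demo charts.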
![Page 9: GPU-Accelerated Analytics on your Data Lake....Query 1 Query 2 Query 3 Query 4 Query 5 QUERIES NDS 1 1 5 5 46 6 1 8 8 3 72 1 14 14.9 12.2. Query 1 @blazingdb Query 1 NDS ... • String](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fc3569b99333974250e9fb2/html5/thumbnails/9.jpg)
BlazingDB Multi-nodal Cluster
![Page 10: GPU-Accelerated Analytics on your Data Lake....Query 1 Query 2 Query 3 Query 4 Query 5 QUERIES NDS 1 1 5 5 46 6 1 8 8 3 72 1 14 14.9 12.2. Query 1 @blazingdb Query 1 NDS ... • String](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fc3569b99333974250e9fb2/html5/thumbnails/10.jpg)
Shared Data Architecture
DATA LAKE
![Page 11: GPU-Accelerated Analytics on your Data Lake....Query 1 Query 2 Query 3 Query 4 Query 5 QUERIES NDS 1 1 5 5 46 6 1 8 8 3 72 1 14 14.9 12.2. Query 1 @blazingdb Query 1 NDS ... • String](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fc3569b99333974250e9fb2/html5/thumbnails/11.jpg)
The Nays
• No Vendor Lock-in
• No Consistency Management
• No BlazingDB-Specific ETL
• No Duplication
• No Ingest
![Page 12: GPU-Accelerated Analytics on your Data Lake....Query 1 Query 2 Query 3 Query 4 Query 5 QUERIES NDS 1 1 5 5 46 6 1 8 8 3 72 1 14 14.9 12.2. Query 1 @blazingdb Query 1 NDS ... • String](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fc3569b99333974250e9fb2/html5/thumbnails/12.jpg)
The Yays
• High Concurrency
• Data Sharing (Across Clusters And Other Tools)
• Multi-Terabyte Queries
• Scalable, On Demand Data Warehouse
• Incredibly Fast SQL
![Page 13: GPU-Accelerated Analytics on your Data Lake....Query 1 Query 2 Query 3 Query 4 Query 5 QUERIES NDS 1 1 5 5 46 6 1 8 8 3 72 1 14 14.9 12.2. Query 1 @blazingdb Query 1 NDS ... • String](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fc3569b99333974250e9fb2/html5/thumbnails/13.jpg)
DEMO
![Page 14: GPU-Accelerated Analytics on your Data Lake....Query 1 Query 2 Query 3 Query 4 Query 5 QUERIES NDS 1 1 5 5 46 6 1 8 8 3 72 1 14 14.9 12.2. Query 1 @blazingdb Query 1 NDS ... • String](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fc3569b99333974250e9fb2/html5/thumbnails/14.jpg)
Demo - Architecture
HDFS on Azure
Azure GPU Servers
• NC24 V1
• 4 Servers
![Page 15: GPU-Accelerated Analytics on your Data Lake....Query 1 Query 2 Query 3 Query 4 Query 5 QUERIES NDS 1 1 5 5 46 6 1 8 8 3 72 1 14 14.9 12.2. Query 1 @blazingdb Query 1 NDS ... • String](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fc3569b99333974250e9fb2/html5/thumbnails/15.jpg)
Queries: BlazingDB 4 Node Query times (Lower is better)
[Chart: bars for Query 1 to Query 5 (QUERIES) in SECONDS; series: Cold, Medium (Disk cache only), Hot. Bar labels in extraction order: 142.1, 281.1, 380.5, 135.5, 46, 73.6, 154.1, 251.8, 73.8, 46.3, 72, 63.1, 14, 12.2, 14.9]
![Page 16: GPU-Accelerated Analytics on your Data Lake....Query 1 Query 2 Query 3 Query 4 Query 5 QUERIES NDS 1 1 5 5 46 6 1 8 8 3 72 1 14 14.9 12.2. Query 1 @blazingdb Query 1 NDS ... • String](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fc3569b99333974250e9fb2/html5/thumbnails/16.jpg)
Query 1
[Chart: Query 1 times in SECONDS for Cold, Medium (Disk cache only), and Hot runs]
select l_returnflag, l_linestatus,
  sum(l_quantity) as sum_qty,
  sum(l_extendedprice) as sum_base_price,
  sum(l_extendedprice*(1-l_discount)) as sum_disc_price,
  sum(l_extendedprice*(1-l_discount)*(1+l_tax)) as sum_charge,
  avg(l_quantity) as avg_qty,
  avg(l_extendedprice) as avg_price,
  avg(l_discount) as avg_disc,
  count(l_quantity) as count_order
from lineitem
where l_shipdate <= '1995-06-01'
group by l_returnflag, l_linestatus
order by l_returnflag, l_linestatus;
Query 1 Data Points
• 6 billion row table
• Many aggregations/transformations
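To make the shape of Query 1's result concrete, here is a small sketch that runs a trimmed version of the query against an in-memory SQLite table using Python's standard `sqlite3` module. SQLite is only a stand-in for illustration; the four sample rows are invented, the column types are assumptions, and nothing here reflects BlazingDB's engine or the 6 billion row table from the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    create table lineitem (
        l_returnflag text, l_linestatus text, l_quantity real,
        l_extendedprice real, l_discount real, l_tax real, l_shipdate text
    )
""")

# Invented sample rows; the last one falls outside the shipdate filter.
sample_rows = [
    ("A", "F", 10, 100.0, 0.50, 0.0, "1994-01-10"),
    ("A", "F", 30, 300.0, 0.00, 0.0, "1995-05-20"),
    ("N", "O", 5, 50.0, 0.25, 0.0, "1995-06-01"),
    ("N", "O", 7, 70.0, 0.00, 0.0, "1996-01-01"),
]
conn.executemany("insert into lineitem values (?, ?, ?, ?, ?, ?, ?)", sample_rows)

# A trimmed version of Query 1: one filter, one group by, several aggregates.
query1 = """
    select l_returnflag, l_linestatus,
           sum(l_quantity) as sum_qty,
           sum(l_extendedprice) as sum_base_price,
           sum(l_extendedprice * (1 - l_discount)) as sum_disc_price,
           avg(l_quantity) as avg_qty,
           count(l_quantity) as count_order
    from lineitem
    where l_shipdate <= '1995-06-01'
    group by l_returnflag, l_linestatus
    order by l_returnflag, l_linestatus
"""
result = conn.execute(query1).fetchall()
for row in result:
    print(row)
```

The query returns one row per (l_returnflag, l_linestatus) pair, so the output stays tiny no matter how large the scanned table is; the cost lies in the scan and the aggregation, which is the work the slide highlights.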
![Page 17: GPU-Accelerated Analytics on your Data Lake....Query 1 Query 2 Query 3 Query 4 Query 5 QUERIES NDS 1 1 5 5 46 6 1 8 8 3 72 1 14 14.9 12.2. Query 1 @blazingdb Query 1 NDS ... • String](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fc3569b99333974250e9fb2/html5/thumbnails/17.jpg)
Query 2
[Chart: Query 2 times in SECONDS for Cold, Medium (Disk cache only), and Hot runs]
select lineitem.l_orderkey,
  sum(lineitem.l_extendedprice*(1-lineitem.l_discount)) as revenue,
  orders.o_orderdate, orders.o_shippriority
from customer
inner join orders on customer.c_custkey = orders.o_custkey
inner join lineitem on lineitem.l_orderkey = orders.o_orderkey
where customer.c_mktsegment = 'BUILDING'
and orders.o_orderdate < '1995-03-15'
and lineitem.l_shipdate > '1995-03-15'
group by lineitem.l_orderkey, orders.o_orderdate, orders.o_shippriority
order by revenue desc, orders.o_orderdate;
Query 2 Data Points
• Join 6B rows to 1.5B rows to 150M rows
• Many aggregations/transformations
• Order (sorting)
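The join, filter, aggregate, and sort steps of Query 2 can be traced on a handful of rows. The sketch below runs the query as written on the slide against in-memory SQLite tables through Python's `sqlite3` module. SQLite stands in for the real engine, the table definitions keep only the columns the query touches, and every sample row is invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table customer (c_custkey integer, c_mktsegment text);
    create table orders (o_orderkey integer, o_custkey integer,
                         o_orderdate text, o_shippriority integer);
    create table lineitem (l_orderkey integer, l_extendedprice real,
                           l_discount real, l_shipdate text);
""")

# Invented sample data.
conn.executemany("insert into customer values (?, ?)",
                 [(1, "BUILDING"), (2, "MACHINERY")])
conn.executemany("insert into orders values (?, ?, ?, ?)", [
    (10, 1, "1995-03-01", 0),
    (11, 1, "1995-03-20", 0),   # ordered too late, filtered out
    (12, 2, "1995-03-01", 0),   # wrong market segment, filtered out
    (13, 1, "1995-02-01", 0),
])
conn.executemany("insert into lineitem values (?, ?, ?, ?)", [
    (10, 100.0, 0.50, "1995-03-20"),
    (10, 200.0, 0.00, "1995-03-25"),
    (10, 999.0, 0.00, "1995-03-10"),  # shipped too early, filtered out
    (11, 50.0, 0.00, "1995-04-01"),
    (12, 70.0, 0.00, "1995-04-01"),
    (13, 400.0, 0.25, "1995-03-16"),
])

query2 = """
    select lineitem.l_orderkey,
           sum(lineitem.l_extendedprice * (1 - lineitem.l_discount)) as revenue,
           orders.o_orderdate, orders.o_shippriority
    from customer
    inner join orders on customer.c_custkey = orders.o_custkey
    inner join lineitem on lineitem.l_orderkey = orders.o_orderkey
    where customer.c_mktsegment = 'BUILDING'
      and orders.o_orderdate < '1995-03-15'
      and lineitem.l_shipdate > '1995-03-15'
    group by lineitem.l_orderkey, orders.o_orderdate, orders.o_shippriority
    order by revenue desc, orders.o_orderdate
"""
result = conn.execute(query2).fetchall()
for row in result:
    print(row)
```

Only orders 10 and 13 survive all three filters, and the final sort puts the higher-revenue order first.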
![Page 18: GPU-Accelerated Analytics on your Data Lake....Query 1 Query 2 Query 3 Query 4 Query 5 QUERIES NDS 1 1 5 5 46 6 1 8 8 3 72 1 14 14.9 12.2. Query 1 @blazingdb Query 1 NDS ... • String](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fc3569b99333974250e9fb2/html5/thumbnails/18.jpg)
Query 3
[Chart: Query 3 times in SECONDS for Cold, Medium (Disk cache only), and Hot runs]
select nation.name,
  sum(lineitem.l_extendedprice * (1 - lineitem.l_discount)) as revenue
from customer
inner join orders on customer.c_custkey = orders.o_custkey
inner join lineitem on lineitem.l_orderkey = orders.o_orderkey
inner join supplier on lineitem.l_suppkey = supplier.s_suppkey
inner join nation on supplier.s_nationkey = nation.nation_key
inner join region on nation.region_key = region.r_regionkey
where supplier.s_nationkey = nation.nation_key
and region.r_name = 'ASIA'
and orders.o_orderdate >= '19940101'
and orders.o_orderdate < '19950101'
group by nation.name
order by revenue desc
Query 3 Data Points
• Join 6B rows to 1.5B rows to 150M rows (and many small joins)
• Multiple aggregations/transformations
• Order (sorting)
![Page 19: GPU-Accelerated Analytics on your Data Lake....Query 1 Query 2 Query 3 Query 4 Query 5 QUERIES NDS 1 1 5 5 46 6 1 8 8 3 72 1 14 14.9 12.2. Query 1 @blazingdb Query 1 NDS ... • String](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fc3569b99333974250e9fb2/html5/thumbnails/19.jpg)
Query 4
[Chart: Query 4 times in SECONDS for Cold, Medium (Disk cache only), and Hot runs]
select sum(l_extendedprice) as sum_exprice,
sum(l_discount) as sum_discount
from lineitem
where l_shipdate >= '19940101'
and l_shipdate < '19950101'
and l_discount >= 0.05 and l_discount <= 0.07
and l_quantity < 24
Query 4 Data Points
• 6B row table
• Multiple aggregations/transformations
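Query 4 is dominated by range filters on a single large table, which is where the "Filtering (Predicate Pushdown)" item from the earlier slide matters most. The deck does not say how BlazingDB pushes predicates down, so the sketch below shows one common form of the idea as an assumption: each storage chunk carries min/max metadata for `l_shipdate`, and chunks whose range cannot match the filter are skipped without reading their rows. The `Chunk` class, the `scan_with_pushdown` function, and the sample data are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Chunk:
    """Hypothetical storage chunk with min/max metadata for l_shipdate."""
    min_shipdate: str
    max_shipdate: str
    rows: List[Dict]


def scan_with_pushdown(chunks: List[Chunk], lo: str, hi: str) -> Tuple[float, int]:
    """Sum l_extendedprice for rows with lo <= l_shipdate < hi.

    Returns (total, chunks_read). Dates are 'YYYYMMDD' strings, so plain
    string comparison orders them correctly.
    """
    total = 0.0
    chunks_read = 0
    for chunk in chunks:
        # Skip the chunk when its date range cannot overlap [lo, hi).
        if chunk.max_shipdate < lo or chunk.min_shipdate >= hi:
            continue
        chunks_read += 1
        for row in chunk.rows:
            if lo <= row["l_shipdate"] < hi:
                total += row["l_extendedprice"]
    return total, chunks_read


# Invented sample data: four chunks, two of which overlap 1994.
chunks = [
    Chunk("19930101", "19931231", [{"l_shipdate": "19930615", "l_extendedprice": 10.0}]),
    Chunk("19931201", "19940215", [{"l_shipdate": "19931215", "l_extendedprice": 20.0},
                                   {"l_shipdate": "19940110", "l_extendedprice": 30.0}]),
    Chunk("19940301", "19941130", [{"l_shipdate": "19940505", "l_extendedprice": 40.0}]),
    Chunk("19950101", "19951231", [{"l_shipdate": "19950101", "l_extendedprice": 50.0}]),
]
total, chunks_read = scan_with_pushdown(chunks, "19940101", "19950101")
print(total, chunks_read)
```

With this data only two of the four chunks are read. The row-level filter still runs inside the chunks that are read, because a chunk whose range overlaps the filter can contain rows outside it.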
![Page 20: GPU-Accelerated Analytics on your Data Lake....Query 1 Query 2 Query 3 Query 4 Query 5 QUERIES NDS 1 1 5 5 46 6 1 8 8 3 72 1 14 14.9 12.2. Query 1 @blazingdb Query 1 NDS ... • String](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fc3569b99333974250e9fb2/html5/thumbnails/20.jpg)
Query 5
[Chart: Query 5 times in SECONDS for Cold, Medium (Disk cache only), and Hot runs]
select supplier.s_acctbal, supplier.s_suppkey, nation.name,
  part.p_partkey, part.p_mfgr, supplier.s_address, supplier.s_phone,
  supplier.s_comment
from supplier
inner join partsupp on supplier.s_suppkey = partsupp.ps_suppkey
inner join nation on supplier.s_nationkey = nation.nation_key
inner join region on nation.region_key = region.r_regionkey
inner join part on part.p_partkey = partsupp.ps_partkey
where part.p_size = 15
and part.p_type in ('ECONOMY ANODIZED BRASS', 'ECONOMY BRUSHED BRASS',
  'ECONOMY BURNISHED BRASS', 'ECONOMY PLATED BRASS', 'ECONOMY POLISHED BRASS',
  'LARGE ANODIZED BRASS', 'LARGE BRUSHED BRASS', 'LARGE BURNISHED BRASS',
  'LARGE PLATED BRASS', 'LARGE POLISHED BRASS',
  'SMALL ANODIZED BRASS', 'SMALL BRUSHED BRASS', 'SMALL BURNISHED BRASS',
  'SMALL PLATED BRASS', 'SMALL POLISHED BRASS',
  'STANDARD ANODIZED BRASS', 'STANDARD BRUSHED BRASS', 'STANDARD BURNISHED BRASS',
  'STANDARD PLATED BRASS', 'STANDARD POLISHED BRASS')
and region.r_name = 'EUROPE'
order by supplier.s_acctbal desc, supplier.s_suppkey, nation.name,
  part.p_partkey
Query 5 Data Points
• Join multiple tables
• Many aggregations/transformations
• String comparisons
![Page 21: GPU-Accelerated Analytics on your Data Lake....Query 1 Query 2 Query 3 Query 4 Query 5 QUERIES NDS 1 1 5 5 46 6 1 8 8 3 72 1 14 14.9 12.2. Query 1 @blazingdb Query 1 NDS ... • String](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fc3569b99333974250e9fb2/html5/thumbnails/21.jpg)
Data Pipeline
GPU Data Frame
Apache Arrow
Common Data Layer
INGEST
STORAGE (Data Lake)
Coming Soon
![Page 22: GPU-Accelerated Analytics on your Data Lake....Query 1 Query 2 Query 3 Query 4 Query 5 QUERIES NDS 1 1 5 5 46 6 1 8 8 3 72 1 14 14.9 12.2. Query 1 @blazingdb Query 1 NDS ... • String](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fc3569b99333974250e9fb2/html5/thumbnails/22.jpg)
Questions?
![Page 23: GPU-Accelerated Analytics on your Data Lake....Query 1 Query 2 Query 3 Query 4 Query 5 QUERIES NDS 1 1 5 5 46 6 1 8 8 3 72 1 14 14.9 12.2. Query 1 @blazingdb Query 1 NDS ... • String](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fc3569b99333974250e9fb2/html5/thumbnails/23.jpg)