Copyright © 2016, Oracle and/or its affiliates. All rights reserved. |
Accelerating Apache Arrow and Quartet FS ActivePivot with SPARC Software in Silicon CON6383
Allen Whipple, Managing Director, New York, Quartet FS
Amir Javanshir, Principal Software Engineer, ISV Engineering, Oracle
Sanjay Rao, Senior Software Engineer, ISV Engineering, Oracle
Safe Harbor Statement
The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.
Program Agenda
1. Data Analytics Accelerator (DAX): Overview and OpenDAX API
2. JDK 8 with DAX
3. Quartet FS: Return on experience
4. Apache Arrow
5. Getting started
The rise of In-Memory Computing
• Data analytics is increasingly relying on in-memory approaches to deliver better time to insight
• The key to delivering performance for data intensive, in-memory analytics is streaming and processing as much data as possible from memory
• Accommodating as much data as possible in memory is essential
• Oracle's DAX technology, built into SPARC processors, provides almost an order-of-magnitude speed-up for this type of analytics work on compressed data
• We will show you how to use this technology to achieve insight faster, using less hardware
Making Analytics available to all
Introducing Accelerators into Oracle’s SPARC Processors
• Breakthrough chip design:
– 2X-3X Memory Bandwidth
– 32 cores and 256 threads at 4.13 GHz
– Large 64MB L3 Cache
– Up to 2TB physical memory per processor
– 3x IO Bandwidth over prior generations
• Data Analytics Acceleration (DAX)
– Offloads processing for lower core usage
– DB12c Query Acceleration
– Open APIs for Customers, Partners and Developers
– Early adopters seeing amazing results
Software in Silicon Technology
[Figure: SPARC M7 & S7 with integrated offload. Memory is read at full bandwidth, SQL work is offloaded from the cores to DAX, and crypto accelerators are built in.]
Analytics Accelerator Engine
[Figure: M7 Query Engine. Four parallel pipelines, each with Decompress, Unpack/Alignment, Scan/Filter/Join and Result Format/Encode stages, share local SRAM and are connected to the data input queues and data output queues over the on-chip network.]
• Data Analytics Accelerators (DAX) built on chip
– Independently process streams of data placed in system memory
– Decompress Data simultaneously
– Frees cores to run other applications, such as OLTP
– Reduces cache pollution
– 8 (M7) or 4 (S7) accelerator units per chip, with 4 pipelines each
– Benefits big data analysis, machine learning algorithms and Oracle Database In-Memory
M7 DAX Operations
• Scan - Scans an array for elements that match (or are greater than or less than) an input value, and returns a bit vector with a bit set for each match
• Select - Selects elements from an array based on a bit vector. Input: bit vector; output: the elements whose bit is 1
• Translate - In-set operation: given an input set of integers, determines how many of them are also present in another set
• Extract - Decompression. Supported formats include RLE and N-gram compression
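To make the Scan and Select descriptions concrete, here is a minimal plain-Java sketch of what the two operations compute. It is a software reference for the semantics only, not the libdax API: the class and method names (`DaxOpsSketch`, `scanEquals`, `select`) are hypothetical, and the sample arrays are made up.

```java
import java.util.Arrays;
import java.util.BitSet;

// Plain-Java reference semantics for two of the operations listed above.
// This is NOT the libdax API: every name in this class is hypothetical.
class DaxOpsSketch {

    // Scan: set bit i when values[i] equals the input value.
    static BitSet scanEquals(int[] values, int target) {
        BitSet matches = new BitSet(values.length);
        for (int i = 0; i < values.length; i++) {
            if (values[i] == target) {
                matches.set(i);
            }
        }
        return matches;
    }

    // Select: keep the elements whose bit is set in the bit vector.
    static int[] select(int[] values, BitSet bits) {
        return bits.stream()
                   .filter(i -> i < values.length)
                   .map(i -> values[i])
                   .toArray();
    }

    public static void main(String[] args) {
        int[] region = {3, 6, 5, 3, 1, 9};        // made-up dictionary indexes
        int[] amount = {10, 20, 30, 40, 50, 60};  // made-up values in the same rows
        BitSet hits = scanEquals(region, 3);
        System.out.println(hits);                                  // positions of the matching rows
        System.out.println(Arrays.toString(select(amount, hits))); // values at those positions
    }
}
```

The bit vector produced by a scan on one column is the natural input to a select on another, which is why the two operations are usually chained.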
DAX Open APIs for Partners and Application Developers
• Oracle DB12c In-Memory Query Acceleration and Inline Decompression
– Available in Oracle Database 12.1.0.2 BP13 + Oracle Solaris 11.3
• Solaris 11.3 APIs
– libdax APIs support the hardware capabilities, hide the details and handle the limitations
– libvector APIs extend this with JNI, Python and SQL bindings
• Implementing Java Streams with DAX acceleration
• Implementing DAX with Partners and FOSS (Apache Spark)
Adding DAX to JDK 8 Lambda Streams
• The Lambda Streams API introduced in JDK 8 is a natural fit to expose DAX functionality to accelerate existing and future Java based analytics applications and frameworks
• Implemented as a standalone JDK 8 library
• Successfully offloads the integer stream filter, allMatch, anyMatch, noneMatch and count functions to DAX
• Existing code is offloaded under the hood by the JVM, giving a significant performance boost while using dramatically fewer compute resources
• Release as part of JDK (OpenJDK project)
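For reference, the stream functions named above look like this in ordinary JDK 8 code. The sketch uses only the standard `java.util.stream.IntStream` API; nothing in it is DAX-specific, and the offload library itself is not shown. The class name and the sample readings are made up for illustration.

```java
import java.util.stream.IntStream;

// Standard JDK 8 stream code; nothing here is DAX-specific.
class TemperatureQueries {

    // filter + count: how many readings exceed the threshold.
    static long countAbove(int[] readings, int threshold) {
        return IntStream.of(readings).filter(t -> t > threshold).count();
    }

    // allMatch: true when every reading is below the limit.
    static boolean allBelow(int[] readings, int limit) {
        return IntStream.of(readings).allMatch(t -> t < limit);
    }

    public static void main(String[] args) {
        int[] fahrenheit = {72, 91, 88, 95, 64, 99};  // made-up sample data
        System.out.println(countAbove(fahrenheit, 90));                      // 3
        System.out.println(allBelow(fahrenheit, 100));                       // true
        System.out.println(IntStream.of(fahrenheit).anyMatch(t -> t > 98));  // true
        System.out.println(IntStream.of(fahrenheit).noneMatch(t -> t < 0));  // true
    }
}
```

Because the predicate-plus-terminal-operation shape is so regular, code written this way is the kind an offloading library could recognise without source changes.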
Automatic Acceleration of Java Analytics
• Use cases: SQL-style Java, e.g. Top-N integers, outlier detection, cube building, KNN algorithm, weather analysis
• Example results: Weather Data Query (using 10 Million data points)
– Query 1: Number of times Temperature crossed 90F ~ 10X faster on DAX
– Query 2: Test if Temperature always less than 100F ~ 20X faster on DAX
• Executed on SPARC S7-2 running Solaris 12 & JDK 8 Standalone Library
[Figure: Execution Time (ms), No Offload vs. DAX Offload, for Weather Query 2, Weather Query 1, Top-N Integers, Percentile Calculator and Outlier Detection; y-axis from 0 to 2000 ms]
Workload | Baseline run time (ms) | Trinity run time (ms) | Speedup |
------------ | ------- | ------- | ------- |
Weather Analysis Query 2 | 88 | 4 | 21.8X |
Weather Analysis Query 1 | 129 | 12 | 10.8X |
Top-N Integer | 243 | 59 | 4.1X |
Percentile Calculator | 1860 | 468 | 4.0X |
Outlier detection | 683 | 188 | 3.6X |
Query stages: Predicate (> 95 F), Map (filter), Reduce (count)
www.quartetfs.com
Solving the operational decision-making needs of business users working in time-sensitive and data-intensive environments
About Quartet FS
Established in 2005 by the founders of Summit Systems
6 offices: London, New York, Paris, Singapore, Hong Kong, Sydney
• 75+ employees
• 80+ implementations
• 50+ client organisations
• 10+ ISV & SaaS partners
ActivePivot: An Open Aggregation and Calculation Framework
ActivePivot is a pure Java in-memory analytics database management system that:
• Aggregates data incrementally and in real time
• Executes complex computations based on your business logic
• Supports on-demand “what-if” analysis on real-time data
Processing stages, from heterogeneous data sources to the user interface:
• Pre-processing: data enrichment, pre-calculations, custom rules
• Aggregation: incremental updates, on-the-fly aggregation
• Post-processing: computes complex measures, reacts to real-time streaming, includes user-specific behaviour
• User interface: what-if analysis, intuitive exploration, alerts
Typical ActivePivot Analytics Use Cases
• Finance
• Pricing
• Supply Chain
ActiveDax Prototype
• Written in C
• Simulates Quartet FS ActivePivot’s in-memory computation engine
• Generates an in-memory table from a set of input parameters: X columns storing dictionary indexes (unsigned integers)
• Generates data randomly
• Runs N parallel threads simulating users
• Each user runs M queries sequentially
• Each query is randomly generated (= value or != value)
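The harness described above can be sketched in a few dozen lines. The actual prototype is written in C and calls DAX; the version below is a hypothetical Java re-creation of the core-only path, with assumed names (`ScanHarnessSketch`, `randomTable`, `runQuery`, `runUsers`) and assumed parameters (seed, cardinality), meant only to show the shape of the workload.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical Java re-creation of the harness idea; the real prototype is written in C.
class ScanHarnessSketch {

    // X columns of random dictionary indexes in [0, cardinality), stored column by column.
    static int[][] randomTable(int cols, int rows, int cardinality, long seed) {
        Random rnd = new Random(seed);
        int[][] table = new int[cols][rows];
        for (int[] column : table) {
            for (int r = 0; r < rows; r++) {
                column[r] = rnd.nextInt(cardinality);
            }
        }
        return table;
    }

    // One query on one column: count rows with "= value", or "!= value" when negate is true.
    static long runQuery(int[] column, int value, boolean negate) {
        long hits = 0;
        for (int v : column) {
            if ((v == value) != negate) {
                hits++;
            }
        }
        return hits;
    }

    // N user threads, each running M random queries one after another.
    // Returns the total number of queries executed.
    static long runUsers(int[][] table, int users, int queriesPerUser, int cardinality) {
        ExecutorService pool = Executors.newFixedThreadPool(users);
        try {
            List<Future<Long>> pending = new ArrayList<>();
            for (int u = 0; u < users; u++) {
                final long seed = u;
                Callable<Long> user = () -> {
                    Random rnd = new Random(seed);
                    long executed = 0;
                    for (int q = 0; q < queriesPerUser; q++) {
                        int[] column = table[rnd.nextInt(table.length)];
                        runQuery(column, rnd.nextInt(cardinality), rnd.nextBoolean());
                        executed++;
                    }
                    return executed;
                };
                pending.add(pool.submit(user));
            }
            long total = 0;
            for (Future<Long> f : pending) {
                total += f.get();
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[][] table = randomTable(12, 100_000, 256, 42L);
        long start = System.nanoTime();
        long queries = runUsers(table, 8, 50, 256);
        double seconds = (System.nanoTime() - start) / 1e9;
        System.out.printf("%d queries, %.1f queries per second%n", queries, queries / seconds);
    }
}
```

Varying the thread count passed to `runUsers` gives the kind of queries-per-second curve a core-only baseline would produce.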
Scan in-memory table representing indexes
Line Nb | Col 1 | Col 2 | Col 3 | Col 4 | Val 1 | Val 2 | Val 3 |
------------ | ------- | ------- | ------- | ------- | ------- | ------- | ------- |
1 | 3 | 6 | 162 | 32 | 10.59 | 15.30 | 0.26 |
2 | 6 | 5 | 69 | 28 | 0.76 | 13.69 | 0.99 |
3 | 5 | 3 | 32 | 31 | 55.61 | 16.56 | 0.42 |
4 | 3 | 2 | 187 | 22 | -28.01 | 12.71 | 0.82 |
5 | 1 | 0 | 60 | 30 | 66.98 | 8.29 | 0.17 |
6 | 9 | 1 | 28 | 43 | 51.67 | 9.66 | 0.86 |
7 | 0 | 3 | 128 | 13 | 2.71 | 9.12 | 0.93 |
8 | 1 | 6 | 185 | 30 | -10.03 | 12.96 | 0.59 |
9 | 3 | 2 | 173 | 49 | 29.27 | 13.29 | 0.66 |
10 | 5 | 0 | 133 | 38 | -5.70 | 10.14 | 0.24 |
-- List of Search Criteria for each Scan column:
Criteria 1: [ = 1] (index to dictionary)
Criteria 2: [ != 1]
Criteria 3: [ != 57]
Criteria 4: [ = 30]
-- Final Logical Operation: AND
-- Final DAX and No Dax Bit Vectors:
00001001 00000000 00000000
00001001 00000000 00000000
Queries - Matching Matrices - Matching Vectors: 1 / 1 / 1
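The example on this slide can be replayed in plain Java. The sketch below hard-codes the four scan columns of the ten rows shown, applies the four criteria, and combines them with AND. It is a core-only illustration with hypothetical names (`CriteriaScanSketch`, `scan`), not the prototype's code.

```java
// Replays the four criteria on the ten rows shown in the table, combined with AND.
// Names are hypothetical; the real prototype is written in C.
class CriteriaScanSketch {

    // Col 1 to Col 4 of the ten rows, stored column by column.
    static final int[][] COLUMNS = {
        {3, 6, 5, 3, 1, 9, 0, 1, 3, 5},
        {6, 5, 3, 2, 0, 1, 3, 6, 2, 0},
        {162, 69, 32, 187, 60, 28, 128, 185, 173, 133},
        {32, 28, 31, 22, 30, 43, 13, 30, 49, 38}
    };

    // Criteria 1..4: [= 1], [!= 1], [!= 57], [= 30].
    static final int[] VALUES = {1, 1, 57, 30};
    static final boolean[] NEGATED = {false, true, true, false};

    // Returns one character per row: '1' when all four criteria hold.
    static String scan() {
        int rows = COLUMNS[0].length;
        StringBuilder bits = new StringBuilder();
        for (int r = 0; r < rows; r++) {
            boolean match = true;
            for (int c = 0; c < COLUMNS.length; c++) {
                boolean equal = COLUMNS[c][r] == VALUES[c];
                match &= (equal != NEGATED[c]);
            }
            bits.append(match ? '1' : '0');
        }
        return bits.toString();
    }

    public static void main(String[] args) {
        System.out.println(scan()); // prints 0000100100
    }
}
```

Rows 5 and 8 are the only matches, which agrees with the first byte of the bit vectors on the slide (00001001).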
DAX vs. Core only
Table scan: Total queries per second
[Figure: Scan Results, 15M Lines, 12 Scan Cols. Queries per second (y-axis, 0 to 500) versus Threads (1, 2, 4, 8, 16, 32, 64, 96, 128, 192). Series: DAX 8 bits (Qps), DAX 16 bits (Qps), DAX 32 bits (Qps), Core (Qps).]
DAX vs. Core only
Performance enhancement
[Figure: DAX Enhancement vs. Core only, 15M Lines, 12 Scan Cols. Improvement factor ("x times better", y-axis, 0 to 25) versus Threads (1, 2, 4, 8, 16, 32, 64, 96, 128, 192). Series: 8 bits, 16 bits, 32 bits.]
DAX vs. Core only
Scan compressed data
[Figure: Scan Results, 15M Lines, 12 Scan Cols, 16 bit Integers. Queries per second per thread (y-axis, 0 to 14) versus Number of Threads (1, 2, 4, 8, 16, 32, 64, 96, 128, 192). Series: DAX 16 bits, DAX 16 bits Comp, Core.]
DAX vs. Core only
CPU usage: Total CPU Instructions (Billions)
Threads | DAX | DAX Compressed | Core Only |
------------ | ------- | ------- | ------- |
1 Thread | 1.55 | 3.97 | 98.42 |
Apache Arrow - Introduction
• New top-level Apache Software Foundation project (announced Feb 2016)
• Solves
- Re-formatting of data for cross-system communication
- Overhead of accessing data
- Multiple copies of data in memory
• Backed by key developers of 13 major open source projects, including HBase, Impala, Kudu, Parquet, Phoenix, Spark and Storm
• Fast
- Takes advantage of SIMD operations and makes better use of the CPU cache
- Columnar memory layout permits O(1) random access for data retrieval
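The O(1) random access claim follows from storing each column as a contiguous buffer of fixed-width slots. The sketch below illustrates the idea in plain Java with a values buffer plus a one-bit-per-slot validity bitmap. It does not use the Arrow Java library; the class name `ColumnBufferSketch` and its methods are hypothetical, and the layout is a simplified assumption rather than the full Arrow format.

```java
import java.nio.ByteBuffer;

// Simplified illustration of a fixed-width columnar buffer; not the Arrow library.
class ColumnBufferSketch {
    private final ByteBuffer values;  // 4 bytes per slot, contiguous
    private final byte[] validity;    // 1 bit per slot: 1 means a value is present

    ColumnBufferSketch(int capacity) {
        values = ByteBuffer.allocate(capacity * Integer.BYTES);
        validity = new byte[(capacity + 7) / 8];
    }

    // Writes a value at the slot and marks the slot as present.
    void set(int index, int value) {
        values.putInt(index * Integer.BYTES, value);
        validity[index >> 3] |= (byte) (1 << (index & 7));
    }

    boolean isNull(int index) {
        return (validity[index >> 3] & (1 << (index & 7))) == 0;
    }

    // O(1): the byte offset is computed directly from the index.
    int get(int index) {
        return values.getInt(index * Integer.BYTES);
    }

    public static void main(String[] args) {
        ColumnBufferSketch column = new ColumnBufferSketch(10);
        column.set(7, 42);
        System.out.println(column.get(7));     // 42
        System.out.println(column.isNull(3));  // true, slot 3 was never written
    }
}
```

A contiguous buffer like this is also what lets a SIMD unit or an accelerator stream through the column without chasing pointers.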
Arrow – Memory Layout
The only system with in-memory columnar data supporting complex, dynamic schemas and a common data layer
Arrow – Vector SCAN Operation
• Written in Java
• Generates in-memory columnar Arrow vectors
• Each Arrow vector stores the sequence of values of an individual column
• Test scenario:
– Generates test data in two Arrow vectors
– Scans the indexed data in both vectors and checks, through the accessor, whether they hold the same data (vector1 == vector2)
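The comparison in the test scenario can be sketched without the Arrow library by letting two `int[]` arrays stand in for the two Arrow vectors. This is the core-only form of the check; the names `VectorCompareSketch`, `equalPositions` and `sameData` are hypothetical, and the real test reads the values through the Arrow accessor instead of array indexing.

```java
import java.util.BitSet;

// int[] arrays stand in for the two Arrow vectors; names are hypothetical.
class VectorCompareSketch {

    // Bit i is set when both vectors hold the same value at position i.
    static BitSet equalPositions(int[] vector1, int[] vector2) {
        int n = Math.min(vector1.length, vector2.length);
        BitSet equal = new BitSet(n);
        for (int i = 0; i < n; i++) {
            if (vector1[i] == vector2[i]) {
                equal.set(i);
            }
        }
        return equal;
    }

    // True when the vectors have the same length and match at every position.
    static boolean sameData(int[] vector1, int[] vector2) {
        return vector1.length == vector2.length
            && equalPositions(vector1, vector2).cardinality() == vector1.length;
    }

    public static void main(String[] args) {
        int[] a = {4, 8, 15, 16};
        int[] b = {4, 8, 23, 16};
        System.out.println(equalPositions(a, b)); // positions where the vectors agree
        System.out.println(sameData(a, b));       // false
    }
}
```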
Apache Arrow with SPARC M7 DAX
Linear Performance with DAX at 3x lower CPU
Configuration: SPARC M7-1, 7 cores, 80 GB memory, Oracle Solaris 11.3 with Java 1.8.0
[Figure: Lower CPU Utilization (1/3rd) with DAX Offload. CPU utilization in % (y-axis, 0 to 60) versus data set size (65 Million, 125 Million, 256 Million, 512 Million, 800 Million, 1 Billion).]
[Figure: 2X to 6X Speedup with DAX. Time in seconds (y-axis, log scale from 0.0078125 to 64) versus data set size.]
Data set | DAX Time (Sec) | NODAX Time (Sec) |
------------ | ------- | ------- |
1 Million | 0.0144 | 0.0836 |
2 Million | 0.029 | 0.121 |
6 Million | 0.127 | 0.369 |
65 Million | 1.62 | 3.3 |
125 Million | 4.17 | 7.93 |
256 Million | 6.466 | 13.56 |
512 Million | 13.074 | 28.726 |
800 Million | 20.332 | 39.823 |
1 Billion | 29.051 | 60.637 |
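The speedup range quoted on this slide can be checked directly from the reported timings. The short program below divides the NODAX time by the DAX time for each data set; the timing values are copied from the slide, while the class and method names are only illustrative.

```java
// Computes NODAX / DAX speedup ratios from the timings reported on the slide.
class ArrowSpeedup {
    static final String[] DATA_SET = {
        "1 Million", "2 Million", "6 Million", "65 Million", "125 Million",
        "256 Million", "512 Million", "800 Million", "1 Billion"
    };
    static final double[] DAX_SEC =
        {0.0144, 0.029, 0.127, 1.62, 4.17, 6.466, 13.074, 20.332, 29.051};
    static final double[] NODAX_SEC =
        {0.0836, 0.121, 0.369, 3.3, 7.93, 13.56, 28.726, 39.823, 60.637};

    // Speedup of the DAX run over the core-only run for data set i.
    static double speedup(int i) {
        return NODAX_SEC[i] / DAX_SEC[i];
    }

    public static void main(String[] args) {
        for (int i = 0; i < DATA_SET.length; i++) {
            System.out.println(String.format("%-12s %.1fX", DATA_SET[i], speedup(i)));
        }
    }
}
```

The ratios run from roughly 5.8X on the smallest data set to about 2X on the largest ones, consistent with the "2X to 6X" headline.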
Oracle Software in Silicon Developer Cloud
Free access for universities, researchers, customers and partners
• Access M7 DAX Zones with Solaris 11.3
• Prebuilt templates to extend and customize
– Open APIs, libraries, man pages, headers
– Code examples and use cases
– Example Integration for Apache Spark
• White papers and Animation Demo
• Simple online click-through license
Available now at: http://SWiSdev.Oracle.com/DAX/
After OpenWorld
What | Topic |
------------ | ------- |
Web Page | Oracle Software in Silicon Cloud |
Video | Software in Silicon Concept |
Article | What Is the SPARC M7 Data Analytics Accelerator? |
Data sheet | Oracle SPARC S7-2 Server Data sheet |
Oracle.com/goto/swis