Exploiting Informix In-Memory and TimeSeries Technology to Accelerate NoSQL Analytics on Intel® Xeon® Servers
Sandor Szabo – IBM
Martin Fuerderer – IBM
Jantz Tran – Intel®
Agenda
• Goal: Development of a NoSQL Benchmark
• Informix Warehouse Accelerator
• NoSQL Workload Development
  • Dataset
  • Queries
• Test Results on Intel® Xeon® server platform
• Intel® Xeon® E7 server platform overview
Workload: Benchmark goals...
• Goal: showcase Intel® and IBM technology
• Results already reported on POPS, TPC-DS, ...
• A NoSQL benchmark was wanted, but no good standard was found
• Since water is an issue in California, let us use data from ecosystems and the environment
• We defined our own benchmark
• Includes: In-Memory, NoSQL, TimeSeries
Workload: Why Informix Warehouse Accelerator (IWA)
• In-memory database
• Scales with the number of processors
• Massive parallelism
• Multi-core and vector-optimised algorithms
• Need a reproducible environment
• No disk I/O, only CPU operations
• Can scale with new Intel processor designs
• Behaves well with hyper-threading
IBM Informix Warehouse Accelerator
IWA technology innovations provide extreme speed for fast business decisions:
• Intelligent frequency partitioning (common values vs. rare values, by number of occurrences)
• 64-bit Intel/AMD processors
• Terabytes of RAM
• Predicate evaluation on compressed data
• SIMD
• No need for aggregate tables
• Row and column store
• Compression
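The slides name frequency partitioning and predicate evaluation on compressed data without detailing them. As a rough illustration of the general idea only, the sketch below dictionary-encodes a column so that the most common values receive the smallest codes, then evaluates an equality predicate directly on the codes. The function names and the sample column are hypothetical, and this is not IWA's actual encoding.

```python
from collections import Counter

def build_dictionary(values):
    """Assign integer codes by descending frequency (common values get small codes)."""
    counts = Counter(values)
    # Sort by frequency (descending), then by value for a deterministic order.
    ordered = sorted(counts, key=lambda v: (-counts[v], v))
    return {value: code for code, value in enumerate(ordered)}

def encode(values, dictionary):
    """Replace each value with its integer code."""
    return [dictionary[v] for v in values]

def count_equal(codes, dictionary, literal):
    """Evaluate `column = literal` on the encoded column without decoding any row."""
    target = dictionary.get(literal)
    if target is None:
        return 0  # literal never occurs in the column
    return sum(1 for c in codes if c == target)

# Hypothetical sample column, not taken from the benchmark data.
states = ["CA", "CA", "NV", "CA", "OR", "CA", "NV"]
dictionary = build_dictionary(states)
codes = encode(states, dictionary)
matches = count_equal(codes, dictionary, "NV")
```

The literal is translated to its code once, so the scan compares small integers only. That kind of comparison is what makes SIMD evaluation attractive, although the sketch itself is scalar.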
IBM Informix Warehouse Accelerator (IWA)
Powerful Hybrid Database Platform
[Diagram: An analytic query arrives at the Informix database server (most Unix/Linux 64-bit platforms; on-disk, optionally compressed, relational row-based database; extreme-performance transactions). The query optimizer decides "Accelerate query?". If yes, the query goes over TCP/IP to the Informix Warehouse Accelerator (Linux on Intel/AMD 64-bit; in-memory compressed columnar database partitions; bulk loader; query processor; extreme-performance analytics), and the results are returned. If no, the Informix database server answers the query itself.]
You can use IWA’s In-Memory Analytics to Speed Up queries on…
Workload Overview
Workload: Data
• Use real data on rivers and streams from the U.S. Geological Survey at www.usgs.gov: gage height and flow
• 15-minute interval values over 5 years from 847 measurement sites
• Extend this data to span about 100 years by adding small percentages
• → >2 billion data records
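A quick back-of-the-envelope check of the data volume can be sketched as follows. It assumes every site reports one record per 15-minute interval with no gaps and uses 365-day years, so it gives an upper bound rather than the actual record count; the function name is illustrative.

```python
def max_records(sites, interval_minutes, years, days_per_year=365):
    """Upper bound on records if every site reports at every interval."""
    readings_per_day = (24 * 60) // interval_minutes
    return sites * readings_per_day * days_per_year * years

five_years = max_records(847, 15, 5)       # original USGS span
hundred_years = max_records(847, 15, 100)  # extended span
```

Under these assumptions the bound for 100 years is just under 3 billion records, which is in line with the ">2 billion" figure above if some sites have gaps or shorter histories (the slides do not say).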
Workload: Data
Example: San Joaquin River, CA
Satellite image: Google
Workload: Data
• Dimension table of measurement sites
• Fact table with >2 billion measurement values (as 847 TimeSeries)
• TimeSeries load data is in an external table and has JSON format:
  {"gage_height":4.120, "flow":0.510}
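To make the record format concrete, here is a minimal sketch that parses one such JSON document into its two measurements. Only the sample document comes from the slide; the function name is hypothetical.

```python
import json

def parse_reading(text):
    """Return (gage_height, flow) from one JSON sensor document."""
    doc = json.loads(text)
    return float(doc["gage_height"]), float(doc["flow"])

sample = '{"gage_height":4.120, "flow":0.510}'
gage_height, flow = parse_reading(sample)
```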
Workload: Data
Fact table TimeSeries structure:
[Diagram: each fact table row holds a site_id CHAR(20) and a regular TIMESERIES, drawn as an "initial" element followed by a sequence of sensordata BSON elements. The same row layout repeats for each site.]
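In a regular time series the elements are evenly spaced, so an element's timestamp can be derived from its position rather than stored with it. The toy class below sketches that idea in plain Python. It is not Informix's storage format, and the class name, the origin date, and all readings after the first are made up for illustration.

```python
from datetime import datetime, timedelta

class RegularSeries:
    """Toy regular time series: timestamps are implied by position."""

    def __init__(self, origin, interval):
        self.origin = origin      # timestamp of element 0 (hypothetical)
        self.interval = interval  # fixed spacing between elements
        self.values = []          # one sensor document per interval

    def append(self, sensordata):
        self.values.append(sensordata)

    def timestamp_of(self, index):
        """Timestamp implied by an element's position."""
        return self.origin + index * self.interval

    def value_at(self, when):
        """Element covering the interval that contains `when`."""
        index = (when - self.origin) // self.interval
        if not 0 <= index < len(self.values):
            raise KeyError(when)
        return self.values[index]

series = RegularSeries(datetime(2010, 1, 1), timedelta(minutes=15))
for doc in ({"gage_height": 4.12, "flow": 0.51},
            {"gage_height": 4.15, "flow": 0.53},
            {"gage_height": 4.11, "flow": 0.50}):
    series.append(doc)
```

Looking up a reading is then an index computation instead of a search over stored timestamps.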
Workload: Queries
• Join dimension table with fact table
• Typical aggregations: AVG, MAX, MIN of measurements
• GROUP BY measurement site
• ORDER BY
• → Full fact table scan necessary
Workload: Query example
    select d1.id, d1.name,
           min(f1.v1) as min_gage,
           max(f1.v1) as max_gage,
           avg(f1.v1) as avg_gage,
           min(f1.v2) as min_discharge,
           max(f1.v2) as max_discharge,
           avg(f1.v2) as avg_discharge
    from v_tstable_j f1, site_dim d1
    where f1.id = d1.id
    group by d1.id, d1.name
    order by d1.id
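The shape of this query can be tried out on a laptop with SQLite standing in for Informix. In the sketch below, v_tstable_j is an ordinary table rather than a view over TimeSeries data, the column types are guesses, and the four fact rows are invented, so it only demonstrates the join and the per-site aggregation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE site_dim (id TEXT PRIMARY KEY, name TEXT);
    -- Stand-in for the view over the TimeSeries fact table.
    CREATE TABLE v_tstable_j (id TEXT, v1 REAL, v2 REAL);
    INSERT INTO site_dim VALUES ('A', 'Site A'), ('B', 'Site B');
    INSERT INTO v_tstable_j VALUES
        ('A', 4.0, 0.5), ('A', 6.0, 1.5),
        ('B', 2.0, 3.0), ('B', 2.0, 5.0);
""")

QUERY = """
    SELECT d1.id, d1.name,
           MIN(f1.v1) AS min_gage, MAX(f1.v1) AS max_gage, AVG(f1.v1) AS avg_gage,
           MIN(f1.v2) AS min_discharge, MAX(f1.v2) AS max_discharge,
           AVG(f1.v2) AS avg_discharge
    FROM v_tstable_j f1, site_dim d1
    WHERE f1.id = d1.id
    GROUP BY d1.id, d1.name
    ORDER BY d1.id
"""
rows = conn.execute(QUERY).fetchall()
```

Each result row carries the site key and name followed by the six aggregates. There is no selective predicate on the fact table, so every fact row contributes to the result.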
Test Results
Test Environment
• Intel® Xeon® E7-4890 v2 (2.8GHz, 15 cores/30 threads per CPU, 37.5MB LLC)
• 1TB DDR3 memory (1333 MHz)
• 2x Intel® 910 PCIe SSDs
• Informix 12.1
• Goals: testing scaling of workload by core count and by dataset size
IWA CPU Scaling – 4GB DB
[Chart: Query 1 - 4GB DB CPU Scaling. Run time (s), axis 0 to 350, versus CPU cores (15, 30, 45, 60); series HT Off and HT On.]
[Chart: Query 2 - 4GB DB CPU Scaling. Run time (s), axis 0 to 160, versus CPU cores (15, 30, 45, 60); series HT Off and HT On.]
Query 1: 15 to 60 core scaling (HT), 3.0x
Query 2: 15 to 60 core scaling (HT), 2.5x
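One way to read these speedups is as parallel efficiency: the measured speedup divided by the ideal speedup for the same increase in cores. The small helper below is an illustrative calculation on the two figures reported for the 4GB database, not part of the benchmark.

```python
def parallel_efficiency(speedup, cores_before, cores_after):
    """Measured speedup as a fraction of the ideal (linear) speedup."""
    ideal = cores_after / cores_before
    return speedup / ideal

q1_4gb = parallel_efficiency(3.0, 15, 60)
q2_4gb = parallel_efficiency(2.5, 15, 60)
```

Going from 15 to 60 cores is a 4x increase, so 3.0x and 2.5x correspond to 75% and 62.5% of linear scaling.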
IWA CPU Scaling – 160GB DB
[Chart: Query 1 - 160GB DB CPU Scaling. Run time (s), axis 0 to 12000, versus CPU cores (15, 30, 45, 60); series HT Off and HT On.]
[Chart: Query 2 - 160GB DB CPU Scaling. Run time (s), axis 0 to 6000, versus CPU cores (15, 30, 45, 60); series HT Off and HT On.]
Query 1: 15 to 60 core scaling (HT), 3.3x
Query 2: 15 to 60 core scaling (HT), 3.2x
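As a rough interpretation aid, Amdahl's law can be inverted to estimate what fraction of the 15-core run time would have to scale perfectly to produce these speedups. This is a simplification layered on the reported numbers: it treats the 15-core run as the baseline and ignores hyper-threading, memory bandwidth, and NUMA effects, so the fractions are indicative only.

```python
def amdahl_speedup(parallel_fraction, scale):
    """Speedup when the parallel part of the work gets `scale` times the resources."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / scale)

def parallel_fraction(speedup, scale):
    """Invert Amdahl's law: the fraction p that yields `speedup` at `scale`."""
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / scale)

scale = 60 / 15                       # 4x more cores
p_q1 = parallel_fraction(3.3, scale)  # Query 1, 160GB DB
p_q2 = parallel_fraction(3.2, scale)  # Query 2, 160GB DB
```

Under this model both queries behave as if a little over 90% of the 15-core run time scales with the added cores.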
Intel® Xeon® E7 Overview
Tick-Tock Development Model: Sustained Microprocessor Leadership
• Intel® Core™ Microarchitecture
  • TOCK (new microarchitecture): Xeon® 7300, 65nm
  • TICK (new process technology): Xeon® 7400, 45nm
• Intel® Microarchitecture Codename Nehalem
  • TOCK (new microarchitecture): Xeon® 7500, 45nm
  • TICK (new process technology): Xeon® E7-4800 (WSM-EX), 32nm
• Intel® Microarchitecture Codename Sandy Bridge
  • TOCK (new microarchitecture): 32nm
  • TICK (new process technology): Xeon® E7-4800 v2 (IVB-EX), 22nm
• Intel® Microarchitecture Codename Haswell
  • TOCK (new microarchitecture): 22nm
  • TICK (new process technology): Future, 14nm
Huge Leap in Performance and Capabilities
All products, computer systems, dates and figures specified are preliminary based on current expectations, and are subject to change without notice.
Intel® Xeon® Processor E7-8800/4800/2800 v2 Product Families
[Diagram: Intel® Xeon® Processor E7-4800 v2 Product Family sockets]
• Integrated PCI Express* 3.0: up to 32 lanes per socket
• Up to 3 DIMMs per channel (up to 24 DDR3 1600MHz DIMMs per socket)
• Up to 4 Intel® C102/C104 Scalable Memory Buffers per socket
• Up to 37.5MB shared cache
• Up to 50% more cores and up to 25% more cache for up to 2x average top-bin performance increase¹
• New Advanced Reliability features for improved system uptime and data integrity
• Highest memory capacity for data-demanding, transaction-intensive workloads
• Improved security with Intel® Secure Key and Intel® OS Guard for additional HW embedded security
Results were derived using simulations run on an architecture simulator or model. Any difference in system hardware or software design or configuration may affect actual performance. Intel product plans in this presentation do not constitute Intel plan of record product roadmaps. Please contact your Intel representative to obtain Intel’s current plan of record product roadmaps. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance
1 Results have been simulated and are provided for informational purposes only. Compared to previous generation.
Intel® Xeon® Processor E7 Family: Significant Generational Improvements

| Feature | Intel® Xeon® processor E7-8800/4800/2800 product families (code name Westmere EX) | Intel® Xeon® processor E7-8800/4800/2800 v2 product families (code name Ivy Bridge EX) |
| --- | --- | --- |
| Process Technology | 32nm | 22nm |
| Cores / Threads | Up to 10 / 20 per socket | Up to 15 / 30 per socket |
| L3 Cache Size | Up to 30MB | Up to 37.5MB |
| Memory Capacity | Up to 16 DIMMs per socket; 32GB max DIMM density; up to 2TB in 4S / up to 4TB in 8S | Up to 24 DIMMs per socket; 64GB max DIMM density; up to 6TB in 4S / up to 12TB in 8S |
| Max Memory Speed | Up to 1066MHz | Up to 1600MHz |
| I/O Bandwidth | Up to 72 lanes PCIe* 2.0 (dual IOH) | Up to 32 integrated PCIe* 3.0 lanes per socket |
| Intel® QPI Bandwidth | Up to 4 x 6.4 GT/s per socket | Up to 3 x 8.0 GT/s per socket |
| RAS | Advanced | Previous Gen + eMCA Gen 1, MCA Recovery – Execution Path, MCA – IO, PCIe* LER |
| Platform Technologies | Intel® Turbo Boost Technology, Intel® TXT, Intel® Dynamic Power, Intel® VT-x, Intel® VT-d, Intel® I/OAT/CB3 Technology, Intel® Node Manager, TPM 1.2 and more | Previous Gen + Intel® Secure Key + Intel® OS Guard + Intel® Integrated I/O + Intel® Direct Data I/O + Node Manager 2.0 + Intel® AVX + APICv |
* Other names and brands may be claimed as the property of others.
Intel® Xeon® Processor E7-8800/4800/2800 v2 Product Families: Scalability to Handle Any Datacenter Workload
[Diagram: 2S, 4S, and 8S configurations, plus larger (>8S) systems built with a 3rd-party (OEM, non-Intel) Node Controller (XNC) and OEM interconnect. Components shown: Intel® Xeon® Processor E7 family, memory with Intel® C102/C104 Scalable Memory Buffer, Intel® C602J Chipset, Intel® QuickPath Interconnect, LAN.]
Memory Platform Overview: Breakthrough Performance & Capacity
[Diagram: Intel® Xeon® Processor E7-4800 v2 Product Family socket with up to 37.5MB shared cache, up to 8 DDR3 channels, up to 4 Intel® C102/C104 Scalable Memory Buffers per socket, 1 to 6 DIMMs per buffer, and up to 24 DIMMs per socket]
Large Memory Capacity
• 8 DDR3 channels per socket
• Up to 24 DDR3 DIMMs per socket
• Supports up to 64GB DDR3 LR-DIMM
• Up to 6TB in a 4S platform, 12TB in an 8S platform¹
POR Speeds, Memory Controller Modes
• Up to 1600MHz DDR3 speeds
• Intel® SMI Gen 2: Up to 2.66 GT/s
• Memory Controller can support 2 modes
  • Performance Mode (higher I/O, B/W)
  • Lockstep Mode (highest DDR3 speeds)
Power (Target)
• Active (Rack): Up to 9W @2.66 GT/s
• Idle: Up to 2.5W
1 Memory capacity possible by populating all (96 for 4S; 192 for 8S) DIMMs with 64GB DDR3 LR-DIMMs
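The capacity figures in this footnote follow directly from the per-socket limits quoted on the slide, as the short calculation below shows. The function name is illustrative, and 1 TB is taken as 1024 GB.

```python
def max_memory(sockets, dimms_per_socket=24, dimm_gb=64):
    """Return (total DIMM count, capacity in TB) for a fully populated platform."""
    dimms = sockets * dimms_per_socket
    return dimms, dimms * dimm_gb / 1024

four_socket = max_memory(4)
eight_socket = max_memory(8)

# Both routes to the per-socket DIMM limit agree:
dimms_via_buffers = 4 * 6   # 4 memory buffers x up to 6 DIMMs each
dimms_via_channels = 8 * 3  # 8 DDR3 channels x up to 3 DIMMs each
```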
Conclusion
• IWA and TimeSeries can provide fast in-memory analytics for NoSQL data
• The NoSQL workload under development takes advantage of the scalability of the Intel® Xeon® E7 platform: scaling from 15 to 60 cores (HT) gave speedups of 2.5x to 3.3x
• Next steps:
  1. Continue workload development
  2. Test at scale factors
Questions?