database management system trends in dbms - db2 perspective.pdf · database management system...
TRANSCRIPT
© 2011 IBM Corporation© 2013 IBM Corporation
Database Management System TrendsIBM DB2 Perspective
Namik HrleIBM Distinguished [email protected]
November, 2013
2© 2011 IBM CorporationIBM Confidential© 2013 IBM Corporation
© Copyright IBM Corporation 2013. All rights reserved.U.S. Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.
IBM’s statements regarding its plans, directions, and intent are subject to change or withdrawal without notice at IBM’s sole discretion. Information regarding potential future products is intended to outline our general product direction and it should not be relied on in making a purchasing decision. The information mentioned regarding potential future products is not a commitment, promise, or legal obligation to deliver any material, code or functionality. Information about potential future products may not be incorporated into any contract. The development, release, and timing of any future features or functionality described for our products remains at our sole discretion.
Disclaimer
IBM, the IBM logo, ibm.com, DB2, and DB2 for z/OS are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries, or both. If these and other IBM trademarked terms are marked on their first occurrence in this information with a trademark symbol (® or ™), these symbols indicate U.S. registered or common law trademarks owned by IBM at the time this information was published. Such trademarks may also be registered or common law trademarks in other countries. A current list of IBM trademarks is available on the Web at “Copyright and trademark information” at www.ibm.com/legal/copytrade.shtmlOther company, product, or service names may be trademarks or service marks of others.
3© 2011 IBM CorporationIBM Confidential© 2013 IBM Corporation
Agenda
• Business and Technology Drivers
• IBM DB2 Technology
• IBM DB2 Analytics Accelerator
4© 2011 IBM CorporationIBM Confidential© 2013 IBM Corporation
Traditional Systems Landscape
OLTP Staging Area ODS EDW Data Marts
ETL ETL ETL ETL
5© 2011 IBM CorporationIBM Confidential© 2013 IBM Corporation
Traditional Systems Landscape
OLTP Staging Area ODS EDW Data Marts
ETL ETL ETL ETL
6© 2011 IBM CorporationIBM Confidential© 2013 IBM Corporation
Traditional Systems Landscape
OLTP Staging Area ODS EDW Data Marts
ETL ETL ETL ETL
Applications
operational analytical
7© 2011 IBM CorporationIBM Confidential© 2013 IBM Corporation
Traditional Systems Landscape
OLTP Staging Area ODS EDW Data Marts
ETL ETL ETL ETL
Negative ramifications:Negative ramifications:
• Complexity➔ both in systems management and in applications
• Difficulties in supporting real time analytics • Inability to match ever more demanding SLA
requirements• High total cost of ownership
Applications
operational analytical
8© 2011 IBM CorporationIBM Confidential© 2013 IBM Corporation
Traditional Systems Landscape
OLTP Staging Area ODS EDW Data Marts
ETL ETL ETL ETL
Historical reasons:Historical reasons:
• Different access patterns➔ impact on performance
• EDW as the data integration hub➔ again, impact on performance
• Different life-cycle characteristics➔ and again, impact on performance
• Different Service Level Agreements (SLA)➔ Lack of broadly available workload management capabilities➔ Choice of lower cost-of-acquisition offerings
Negative ramifications:Negative ramifications:
• Complexity➔ both in systems management and in applications
• Difficulties in supporting real time analytics • Inability to match ever more demanding SLA
requirements• High total cost of ownership
Applications
operational analytical
9© 2011 IBM CorporationIBM Confidential© 2013 IBM Corporation
Road to Visionary Systems Landscape
OLTP Staging Area ODS EDW Data Marts
ELT ELT ELT ELT
BenefitsBenefits➔ Uniform policies and procedures for security, HA,
DR, monitoring, same tools, same skills, ...➔ Efficient data movement within the system, often
not involving network (ELT vs. ETL)
Applications
operational analytical
➔ Uniform access to any data for types of applications➔ Opportunity to remove, i.e. consolidate some of
the layers, ultimately leading to a single database
10© 2011 IBM CorporationIBM Confidential© 2013 IBM Corporation
Data
Visionary Systems Landscape
Applications
operational analytical
BenefitsBenefits➔ Uniform policies and procedures for security, HA,
DR, monitoring, same tools, same skills, ...➔ Efficient data movement within the system, often
not involving network (ELT vs. ETL)➔ Uniform access to any data for types of applications➔ Opportunity to remove, i.e. consolidate some of the
layers, ultimately leading to a single database
11© 2011 IBM CorporationIBM Confidential© 2013 IBM Corporation
Data
Visionary Systems Landscape
Applications
operational analytical
ChallengesChallenges➔ Mixed workload management capabilities➔ Ensuring continuous availability, security and
reliability➔ Providing seamless scale-up and scale-out➔ Providing universal processing capabilities to
deliver best performance for both transactional and analytical workloads without the need for excessive tuning
BenefitsBenefits➔ Uniform policies and procedures for security, HA,
DR, monitoring, same tools, same skills, ...➔ Efficient data movement within the system, often
not involving network (ELT vs. ETL)➔ Uniform access to any data for types of applications➔ Opportunity to remove, i.e. consolidate some of the
layers, ultimately leading to a single database
12© 2011 IBM CorporationIBM Confidential© 2013 IBM Corporation
Data
Visionary Systems Landscape
Applications
operational analytical
ChallengesChallenges➔ Mixed workload management capabilities➔ Ensuring continuous availability, security and
reliability➔ Providing seamless scale-up and scale-out➔ Providing universal processing capabilities to
deliver best performance for both transactional and analytical workloads without the need for excessive tuning
BenefitsBenefits➔ Uniform policies and procedures for security, HA,
DR, monitoring, same tools, same skills, ...➔ Efficient data movement within the system, often
not involving network (ELT vs. ETL)➔ Uniform access to any data for types of applications➔ Opportunity to remove, i.e. consolidate some of the
layers, ultimately leading to a single database
ApproachesApproaches➔ Large RAM
'In-memory' databases
➔ Massively parallel processing Large number of sockets, cores, servers Vector processing
➔ Hardware acceleration through special purpose processors FPGA, GPU, ...
➔ Columnar stores➔ Appliances
© 2011 IBM Corporation
IBM Confidential© 2013 IBM Corporation
13
Re-inventing In-Memory Computing
● The benefits of “in-memory” processing have been known since the onset of IT itself
➔ The fastest I/O is no I/O➔ Many software components already support practically unlimited data cache sizes➔ The limiting factor has been the cost
● The DIMM price per GB has decreased by 9.4 times since 2007➔ However, there are signs that the bottom of a down cycle might have been
reached
● Supporting very large memory is not the same as supporting in-memory computing ➔ A genuine in-memory computing product must be designed with assumption that
all (or most) of the data will be in memory at any point in time➔ Most of the traditional database management systems do not satisfy this condition
● The key promises:
➔ lightning performance without tuning➔ eliminating all the data redundancy that is traditionally created to deliver
acceptable performance
© 2011 IBM Corporation
IBM Confidential© 2013 IBM Corporation
14
In-Memory Database Challenges
● Database Management System is a state-full resource and non-volatile storage is realistically unavoidable for fast and reliable recovery➔ Many wrongly conclude that an ‘in-memory’ database does not require any disk storage
If there is no disk, the database would not be recoverable Writing logs to disk happens at commit Writing data to disk happens periodically (checkpoints, savepoints, ...)
➔ For example, SAP's HANA requires much more disk storage than the real memory
● 'Data must fit in memory' is a major limitation ➔ particularly challenging for enterprise data warehouses➔ incompatible with Big Data requirements
● Scale out is typically based on shared-nothing architecture ➔ Does not scale well for workloads that do not adhere to data-to-node affinity
● DRAM is still much more expensive than disk
● New comers will need to address many non-performance quality of service characteristics
● Eliminating data redundancy is not a realistic goal due to integration hub aspects of enterprise data warehouses
15© 2011 IBM CorporationIBM Confidential© 2013 IBM Corporation
SELECT SELECT * * FROM T WHERE C1 = unique keyFROM T WHERE C1 = unique key SELECT AVG(C3) FROM TSELECT AVG(C3) FROM T
Column StoreColumn Store
Row StoreRow Store
Row-oriented vs. Column-oriented Data Store Model
C1 C2 C3 C4 C5C1 C2 C3 C4 C5 C1 C2 C3 C4 C5C1 C2 C3 C4 C5R 1
R 2
R 3
R 4
C 1 C 2 C 3 C 4 C 5 C 1 C 2 C 3 C 4 C 5
R 1
R 2
R 3
R 4
SELECT SELECT * * FROM T WHERE C1 = unique keyFROM T WHERE C1 = unique key
R 1
R 2
R 3
R 4
R 1
R 2
R 3
R 4
SELECT AVG(C3) FROM TSELECT AVG(C3) FROM T
R 1
R 2
R 3
R 4
16© 2011 IBM CorporationIBM Confidential© 2013 IBM Corporation
Agenda
• Business and Technology Drivers
• IBM DB2 Technology
• IBM DB2 Analytics Accelerator
IBM Internal Use
17© 2011 IBM CorporationIBM Confidential© 2013 IBM Corporation
Data
Visionary Systems Landscape
Applications
operational analytical
ChallengesChallenges➔ Mixed workload management capabilities➔ Ensuring continuous availability, security and
reliability➔ Providing seamless scale-up and scale-out➔ Providing universal processing capabilities to
deliver best performance for both transactional and analytical workloads without the need for excessive tuning
BenefitsBenefits➔ Uniform policies and procedures for security, HA,
DR, monitoring, same tools, same skills, ...➔ Efficient data movement within the system, often
not involving network (ELT vs. ETL)➔ Uniform access to any data for types of applications➔ Opportunity to remove, i.e. consolidate some of the
layers, ultimately leading to a single database
ApproachesApproaches➔ Large RAM
'In-memory' databases
➔ Massively parallel processing Large number of sockets, cores, servers Vector processing
➔ Hardware acceleration through special purpose processors FPGA, GPU, ...
➔ Columnar stores➔ Appliances
18© 2011 IBM CorporationIBM Confidential© 2013 IBM Corporation
Data
Visionary Systems Landscape
Applications
ChallengesChallenges➔ Mixed workload management capabilities➔ Ensuring continuous availability, security and
reliability➔ Providing seamless scale-up and scale-out➔ Providing universal processing capabilities to
deliver best performance for both transactional and analytical workloads without the need for excessive tuning
Building on proven technology baseBuilding on proven technology base➔ DB2 (both z/OS and LUW) already provide
superior technology to address most of the challenges
➔ The remaining challenge is addressed by adding special purpose processing component for analytical workloads✔ DB2 for z/OS: IBM DB2 Analytics AcceleratorIBM DB2 Analytics Accelerator✔ DB2 for LUW: BLUBLU
operational analytical
BenefitsBenefits➔ Uniform policies and procedures for security, HA,
DR, monitoring, same tools, same skills, ...➔ Efficient data movement within the system, often
not involving network (ELT vs. ETL)➔ Uniform access to any data for types of applications➔ Opportunity to remove, i.e. consolidate some of the
layers, ultimately leading to a single database
ApproachesApproaches➔ Large RAM
'In-memory' databases
➔ Massively parallel processing Large number of sockets, cores, servers Vector processing
➔ Hardware acceleration through special purpose processors FPGA, GPU, ...
➔ Columnar stores➔ Appliances
19© 2011 IBM CorporationIBM Confidential© 2013 IBM Corporation
DB2 for z/OS Approach: Hybrid Database Management System
IBM DB2IBM DB2AnalyticsAnalytics
AcceleratorAccelerator
Applications DBA Tools, z/OS Console, ...
. . .. . .
Operation Interfaces(e.g. DB2 Commands)
Application Interfaces(standard SQL dialects)
DB2
LogLogManagerManager
IRLMIRLMBufferBuffer
ManagerManagerDataData
ManagerManager
20© 2011 IBM CorporationIBM Confidential© 2013 IBM Corporation
DB2 for LUW Approach: BLU Acceleration
§ New innovative technology for analytic queriesNew innovative technology for analytic queries• Dynamic in-memory technology loads terabytes of data in RAM instead of hard disks. This
streamlines query workloads even when data sets exceed the size of the memory.• Columnar store scans and locates the most relevant data based on columns instead of rows, resulting
in faster processing.• New run-time engine exploits cache-aware memory management and parallel vector processing
providing multi-core and multiple data parallelism (SIMD) and allowing you to analyze data in parallel over different processor sockets and cores
• Actionable compression enables data to be analyzedin compressed format and results in further storage reduction
• Data skipping skips unnecessary processing of irrelevant orduplicate data, loading only the information that needs to beanalyzed.
§ Revolution by EvolutionRevolution by Evolution• Built directly into the DB2 kernel• BLU tables can coexists with traditional row tables,
in same schema, tablespaces, bufferpools• Query any combination of BLU or row data• Memory-optimized (not “in-memory”)
§ Value : Order-of-magnitude benefits in …Value : Order-of-magnitude benefits in …• Performance• Storage savings• Time to value
21© 2011 IBM CorporationIBM Confidential© 2013 IBM Corporation
DB2 Family
• Different code bases …– DB2 for z/OS is written in PL/X and runs on z/OS
– DB2 for LUW is written in C and runs on multiple operating systems
• … but very close cooperation between development teams
• Common application interfaces– Common SQL interface
• DB2 SQL Language Council ensures consistency is maintained• DB2 SQL Reference for Cross Platform Development (over 1200 pages)
– Starburst optimizer
– pureXML
– System z Parallel Sysplex and LUW pureScale
– Bitemporal data
– Row and column access control
– …
• On-going work on uniform database administration tasks– Tools like Optim Data Studio
22© 2011 IBM CorporationIBM Confidential© 2013 IBM Corporation
Agenda
• Business and Technology Drivers
• IBM DB2 Technology
• IBM DB2 Analytics Accelerator➔ Built on DB2 - The Industrial Strength DBMS➔ Architecture➔ Customer References➔ Powered by PureData for Analytics
23© 2011 IBM CorporationIBM Confidential© 2013 IBM Corporation
Synergy with System z
• Capacity on Demand and backup and recovery solutions lets you be more responsive to your needs, frees your staff up to do more important work
• "Shared data" database environment and synergies with z/OS means data is more available
• Robust z/OS – allows database serving without interruption, even in the event of an operating system function error
• System z Philosophy: The more errors prevented at the hardware and microcode levels – the less impact on applications, operations, and end users
• Highest availability on the planet➔ Continuous availability➔ Non-disruptive upgrades of hardware, operating system, applications and database systems➔ Comprehensive multi-site disaster recovery
• Unmatched end-to-end security from logon through data encryption
• System-level mixed workload management with full resource utilization➔ Special component named the Workload Manager manages all resources➔ 100% utilization, 24 hours a day➔ Most cost effective SLA
• The most cost effective platform to manage and maintain
24© 2011 IBM CorporationIBM Confidential© 2013 IBM Corporation
DB2 and zEnterprise EC12
§ Faster CPU – 1.25x compared to z196 − 20-28% CPU reduction measured with DB2 OLTP workloads− 25% reduction measured with DB2 query and utilities
workloads− Less compression overhead with DB2 data (1-15%)
§ 50% More System Capacity to help consolidation − Excellent synergy with DB2 10 scalability
§ New Features DB2 11 plans to exploit− FLASH memory and pageable 1MB frames− 2GB frame support drive additional CPU savings− DB2 code backed by large frames for CPU reductions− Enhanced prefetch instruction for CPU reductions
§ Transactional Memory provides further possibilities for performance gains
25© 2011 IBM CorporationIBM Confidential© 2013 IBM Corporation
DB2 11 Major Themes
• Performance Improvements• Improving efficiency, reducing costs, no application changes• 0-5% for OLTP, 5-15% for update intensive batch• 5-20% for query workloads• Less overhead for data de-compression• Exploitation of new zEC12 hardware features
• Continuous Availability Features• Improved autonomics which reduces costs and improves availability• Making online changes without affecting applications• Online REORG improvements, less disruption• DROP COLUMN, online change of partition limit keys• Extended log record addressing capacity (1 yottabyte)• BIND/REBIND, DDL break into persistent threads
• Enhanced business analytics• Faster, more efficient performance for query workloads• Temporal and SQLPL enhancements• Transparent archiving• SQL improvements and IDAA enhancements
• Simpler, faster DB2 version upgrades• No application changes required for DB2 upgrade• Access path stability improvements• Product stability: support pre GA customer production
26© 2011 IBM CorporationIBM Confidential© 2013 IBM Corporation
DB2 for z/OS and Distributed BigData
DB2 is providing the connectors and the DB capability to allow DB2 applications to easily and efficiently access data in Hadoop
•New user-defined functions
•New generic table UDF capability
JAQL_Submit
HDFS_Read is a user-defined table function to read a file in Hadoop file system.The output schema is determined at query time.JAQL_Submit is a user-defined scalar function to submit a JAQL script to BigInsight
HDFS_Read
27© 2011 IBM CorporationIBM Confidential© 2013 IBM Corporation
Agenda
• Business and Technology Drivers
• IBM DB2 Technology
• IBM DB2 Analytics Accelerator
➔ Built on DB2 - The Industrial Strength DBMS➔ Architecture➔ Customer References➔ Powered by PureData for Analytics
28© 2011 IBM CorporationIBM Confidential© 2013 IBM Corporation
DB2 Components
Applications DBA Tools, z/OS Console, ...Operation Interfaces
(e.g. DB2 Commands)Application Interfaces
(standard SQL dialects)
DB2
. . . LogManager
IRLMBuffer
ManagerData
Manager
29© 2011 IBM CorporationIBM Confidential© 2013 IBM Corporation
IBM DB2 Analytics Accelerator as a Virtual DB2 Component
Accelerator
Applications DBA Tools, z/OS Console, ...
. . .
Operation Interfaces(e.g. DB2 Commands)
Application Interfaces(standard SQL dialects)
DB2
LogManager
IRLMBuffer
ManagerData
Manager
30© 2011 IBM CorporationIBM Confidential© 2013 IBM Corporation
DB2 Becomes a Hybrid Database Management System
IBM DB2IBM DB2AnalyticsAnalytics
AcceleratorAccelerator
Applications DBA Tools, z/OS Console, ...
. . .. . .
Operation Interfaces(e.g. DB2 Commands)
Application Interfaces(standard SQL dialects)
DB2
LogLogManagerManager
IRLMIRLMBufferBuffer
ManagerManagerDataData
ManagerManager
31© 2011 IBM CorporationIBM Confidential© 2013 IBM Corporation
Connectivity Options
Multiple DB2 systems can connect to a single accelerator
A single DB2 system can connect to multiple accelerator
DB2 Accelerator
• residing in the same LPAR• residing in different LPARs• residing in different CECs• being independent (non-data sharing)• belonging to the same data sharing group• belonging to different data sharing groups
Multiple DB2 systems can connect to multiple accelerator
DB2
DB2 DB2
Accelerator
Accelerator
DB2Accelerator Accelerator
Full fl exibility for DB2 systems:Policy based workload managementPolicy based workload managementBetter utilization of accelerator resourcesBetter utilization of accelerator resources
ScalabilityScalabilityHigh availabilityHigh availability
32© 2011 IBM CorporationIBM Confidential© 2013 IBM Corporation
DB2 for z/OS: Query Execution Process Flow
DB2 for z/OS
Optimizer
IDAA
Application
ApplicationInterface
Queries executed with IDAA
Queries executed without IDAA
Heartbeat (IDAA availability and performance indicators)
Query execution run-time for queries that cannot be or should
not be off-loaded to IDAA
SPU
CPU FPGA
Memory
SPU
CPU FPGA
Memory
SPU
CPU FPGA
Memory
SPU
CPU FPGA
Memory
SM
P H
ost
Heartbeat
IDA
A D
RD
A R
eque
stor
33© 2011 IBM CorporationIBM Confidential© 2013 IBM Corporation
Data Synchronization Options
Synchronization options Use cases, characteristics and requirements
Full table refresh
The entire content of a database table is refreshed for accelerator processing
§ Existing ETL process replaces entire table
§ Multiple sources or complex transformations
§ Smaller, un-partitioned tables
§ Reporting based on consistent snapshot
§ Need for refresh automatically detected
Table partition refresh
For a partitioned database table, selected partitions can be refreshed for accelerator processing
§ Optimization for partitioned warehouse tables, typically appending changes “at the end”
§ More efficient than full table refresh for larger tables
§ Reporting based on consistent snapshot
§ Need for refresh automatically detected
Incremental update
Log-based capturing of changes and propagation to IDAA with low latency (typically few minutes)
§ Scattered updates after “bulk” load
§ Reporting on continuously updated data (e.g., an ODS), considering most recent changes
§ More efficient for smaller updates than full table refresh
§ Applications can request reporting on committed data only
34© 2011 IBM CorporationIBM Confidential© 2013 IBM Corporation
High Availability Configuration
System zDB2 for z/OS
IDAA 1
Tab 1
Tab 3
Tab 2
IDAA 2
Tab 1
Tab 3
Tab 2
Tab 1
Tab 2
Tab 3Tab 4
Tab 5
35© 2011 IBM CorporationIBM Confidential© 2013 IBM Corporation
Disaster Recovery Configuration Example: Prior to Disaster
System z
DB2
CF
IDAA 1
Member 1
Tab 1
Tab 3
Tab 2
System z
DB2
CF
Member 2
IDAA 2
Tab 1
Tab 3
Tab 2
Tab 1
Tab 2
Tab 3Tab 4
Tab 5 Tab 1
Tab 2
Tab 3Tab 4
Tab 5synchronous replication
Site A Site B
36© 2011 IBM CorporationIBM Confidential© 2013 IBM Corporation
System z
DB2
CF
IDAA 1
Member 1
Tab 1
Tab 3
Tab 2
System z
DB2
CF
Member 2
IDAA 2
Tab 1
Tab 3
Tab 2
Tab 1
Tab 2
Tab 3Tab 4
Tab 5 Tab 1
Tab 2
Tab 3Tab 4
Tab 5synchronous replication
Site A Site B
Disaster Recovery Configuration Example: Disaster Happens
37© 2011 IBM CorporationIBM Confidential© 2013 IBM Corporation
System z
DB2
CF
IDAA 1
Member 1
Tab 1
Tab 3
Tab 2
System z
DB2
CF
Member 2
IDAA 2
Tab 1
Tab 3
Tab 2
Tab 1
Tab 2
Tab 3Tab 4
Tab 5 Tab 1
Tab 2
Tab 3Tab 4
Tab 5
Site A Site B
Disaster Recovery Configuration Example: After Disaster
38© 2011 IBM CorporationIBM Confidential© 2013 IBM Corporation
High Performance Storage SaverMajor saving of host disk space for historical data
Year Year -7Year -2 Year -3 Year -4 Year -5Year -1
Historical Data
Current Data
One Quarter = 3.57% of 7 years of data
One Month = 1.12% of 7 years of data
One month = 2.78% of 3 years of data
4Q 4Q
1Q
2Q
3Q
4Q
1Q
2Q
3Q
4Q
1Q
2Q
3Q
4Q
1Q
2Q
3Q
4Q
1Q
2Q
3Q
1Q
2Q
3Q
4Q
1Q
2Q
3Q
39© 2011 IBM CorporationIBM Confidential© 2013 IBM Corporation
High Performance Storage SaverStoring historical data in accelerator only
Accelerator
Part #1Query from Application
Or
No longer present on DB2 Storage
Part #1 Part #2 Part #3 Part #4 Part #5 Part #6 Part #7
DB2
Active Historical
§ Time-partitioned tables where:– only the recent partitions are used in a transactional context (frequent data
changes, short running queries) – the entire table is used for analytics (data intensive, complex queries).
§ High Performance Storage Saver’s “Archive” Process:– Data is loaded into Accelerator if not already loaded– Automatically takes Image Copy of each partition to be archived– Automatically remove data from DB2 archived tablespace partitions– DBA starts archived partitions as read-only
40© 20 1 1 IBM CorporationIBM Confidential© 20 1 3 IBM Corporation
Data Residency to Match Query Types
DB2Table A
AcceleratorTable A
Applications
DB2Table A
SQL
§ Transactional onlyTransactional only § Active data onlyActive data only
§ Historical data onlyHistorical data only
§ Active & historical dataActive & historical data
§ Mixed workloadMixed workload
§ Mixed workloadMixed workload
§ Active data onlyActive data only
AcceleratorTable AActive & Historical
DB2 Table A
Active
Query Types
41© 2011 IBM CorporationIBM Confidential© 2013 IBM Corporation
Agenda
• Business and Technology Drivers
• IBM DB2 Technology
• IBM DB2 Analytics Accelerator
➔ Built on DB2 - The Industrial Strength DBMS➔ Architecture➔ Customer References➔ Powered by PureData for Analytics
42© 2011 IBM CorporationIBM Confidential© 2013 IBM Corporation
Major US Healthcare Insurance Company
The company provides a range of insurance products and related services for 10s of millions of members. The network includes 100,000s of doctors, 1000s of hospitals and nearly a million other healthcare professionals.
Customer Benefits•Enables the company to meet stringent on-time reporting requirements with the solution’s incremental update feature
•Anticipates a significant reduction in storage costs with the data server’s high-performance storage saver feature
•Processes some queries thousands of times faster, reducing query times from nearly 3 hours to 6 seconds
Business challengeThe changing healthcare landscape drove the company to ensure it could manage a massive influx of data and the mounting reporting requirements as the Affordable Care Act ushers tens of millions of new customers into the insurance market.
The DB2 Analytics Accelerator The DB2 Analytics Accelerator greatly exceeded our greatly exceeded our expectations. The first time we expectations. The first time we ran our very resource-intensive ran our very resource-intensive queries on the solution, queries on the solution, queries which had historically queries which had historically taken hours to run, they ran in taken hours to run, they ran in seconds.seconds.
- Systems Engineering Manager
Customer References
43© 2011 IBM CorporationIBM Confidential© 2013 IBM Corporation
Example:
Customer Table~ 5 Billion Rows
300 Mixed Workload Queries
Times Faster
Query
Total Rows
Reviewed
Total Rows
Returned Hours Sec(s) Hours Sec(s)Query 1 2,813,571 853,320 2:39 9,540 0.0 5 1,908Query 2 2,813,571 585,780 2:16 8,220 0.0 5 1,644Query 3 8,260,214 274 1:16 4,560 0.0 6 760Query 4 2,813,571 601,197 1:08 4,080 0.0 5 816Query 5 3,422,765 508 0:57 4,080 0.0 70 58Query 6 4,290,648 165 0:53 3,180 0.0 6 530Query 7 361,521 58,236 0:51 3,120 0.0 4 780Query 8 3,425.29 724 0:44 2,640 0.0 2 1,320Query 9 4,130,107 137 0:42 2,520 0.1 193 13
DB2 Only DB2 with
IDAA
270 queries continue to execute in DB2
returning results in seconds or sub-
seconds
30 complex, expensive queries got routed to IDAA and reduced elapsed time and CPU cost by orders of magnitude.
Customer References
44© 2011 IBM CorporationIBM Confidential© 2013 IBM Corporation
Large European Insurance Company
Business challenge:With roughly 2.5 billion transactions in the company’s financial data store, fast and accurate analysis is essential for setting the right premiums. To improve access to claims data across its multiple international locations, the company needs to increase system availability, optimize workloads, speed queries and accelerate the generation of claims reports run by internal business users.
Solution:Deploy the IBM zEnterprise System with DB2 for z/OS to process all data loads from a central location, and IBM DB2 Analytics Accelerator to deliver faster responses to individual analytic queries.
Benefits§ Speeds report generation by as much as 70 percent through
faster query response time, and improves staff efficiency by centralizing data on a single platform
§ Reduces processing costs and CPU consumption by routing eligible workloads to the accelerator
§ Increases satisfaction among internal business users by delivering a comprehensive overview of claims transactions that integrates operational data with advanced analytics
We were surprised by the We were surprised by the performance gain IBM DB2 performance gain IBM DB2 Analytics Accelerator Analytics Accelerator provided, as well as its ability provided, as well as its ability to further boost the capacity of to further boost the capacity of our IBM zEnterprise System.our IBM zEnterprise System.
- Director of Operations
Customer References
45© 2011 IBM CorporationIBM Confidential© 2013 IBM Corporation
Major Italian BankBusiness challenge:One of the largest banks in Italy. Employing 1,000s of people and generating annual revenue of 100s of millions, the group provides banking, insurance and asset management services from more than 1000 branches across the country. They wanted to meet their growth objectives by identifying customer demand for new products or services, then adapting their offerings to win the new business. The challenge was extracting actionable insight from its big data, as the size of its databases made queries from business users frustratingly slow.
Solution:The bank created the “big data project” – an initiative to develop the infrastructure to support the analytics requirements of the business. As a first step, the bank implemented IBM DB2 Analytics Accelerator on its existing IBM System z mainframes. The DB2 Analytics Accelerator inherits all of the benefits of System z – including security, performance and scalability
Benefits• Offers rapid time-to-insight for 1,000 business users – informing the
development of new products, services and strategies.• Enables the bank to match its offering with customer demand –
driving business growth in line with corporate objectives.• Creates a platform for future innovation, including data mining from
IBM SPSS and marketing management from IBM Campaign.
Being a leader in the Being a leader in the banking industry requires a banking industry requires a strong commercial offering strong commercial offering that meets fast-evolving that meets fast-evolving customer expectations. To customer expectations. To understand what your understand what your customers want, you need customers want, you need an excellent grasp of your an excellent grasp of your business data, and to business data, and to develop new products and develop new products and services, you need the services, you need the ability to deliver those ability to deliver those insights rapidly to the right insights rapidly to the right people in the business.people in the business.
- Chief Information Officer
Customer References
Video LinkVideo Link
46© 2011 IBM CorporationIBM Confidential© 2013 IBM Corporation
The IBM DB2 for z/OS is a secure and highly The IBM DB2 for z/OS is a secure and highly
available repository for the bank's data. High-available repository for the bank's data. High-
performance specialty processors have performance specialty processors have
significantly improved query response timessignificantly improved query response times
as compared to our previous solution. The new as compared to our previous solution. The new
zEnterprise hybrid technology is zEnterprise hybrid technology is highly scalable highly scalable
and flexibleand flexible which means that our users are now which means that our users are now
able to access the information they need more able to access the information they need more
quickly.quickly.
– Chief Information Officer
Benefits:• Less time for tuning of SQL statements • No data base maintenance – define
tables/refresh data• Faster, more agile development• Coexistence of OLTP and DWH databases on
same LPAR• CPU saving because of redirecting execution to
IBM DB2 Analytics Accelerator
SQL DB2 on z196
Stand-alone Netezza
Exadata DB2 with DB2 Analytics Accelerator
Query 1 00:01:50 00:00:04 00:00:09 00:00:03
Query 2 00:75:31 00:00:09 00:00:39 00:00:04
Query 3 00:00:46 00:00:05 00:00:13 00:00:02
Business challenge:Experienced performance issues with its data warehouse. Required to supply financial activity reports to European Central Bank (ECB) by 9 am every business day. Performance issues were seriously hindering bank’s ability to meet this objective.The bank needed a technology solution that would address and eliminate performance issues and enable timely financial reporting to support compliance requirements.
Large Central European Bank
Customer References
47© 2011 IBM CorporationIBM Confidential© 2013 IBM Corporation
Agenda
• Business and Technology Drivers
• IBM DB2 Technology
• IBM DB2 Analytics Accelerator
➔ Built on DB2 - The Industrial Strength DBMS➔ IBM DB2 Analytics Accelerator Architecture➔ Customer References➔ Powered by PureData for Analytics
48© 2011 IBM Corporation
IBM Confidential
© 2013 IBM Corporation
§ User Data Capacity: 192 TB*§ Data Scan Speed: 478 TB/hr* § Power Requirements: 7.5 kW
§ Cooling Requirements: 27,000 BTU/hr
* 4X compression assumed
Scales from ½ Rack to 4 Racks
12 Disk Enclosures§ 288 600 GB SAS2 Drives
➢ 240 User Data, 14 S-Blade➢ 34 Spare
§ RAID 1 Mirroring
2 Hosts (Active-Passive)§ 2 6-Core Intel 3.46 GHz CPUs§ 7x300 GB SAS Drives§ Red Hat Linux 6 64-bit
7 PureData for Analytics S-Blades™§ 2 Intel 8 Core 2+ GHz CPUs§ 2 8-Engine Xilinx Virtex-6 FPGAs§ 128 GB RAM + 8 GB slice buffer§ Linux 64-bit Kernel
Powered by PureData System for Analytics N2001
49© 2011 IBM Corporation
IBM Confidential
© 2013 IBM Corporation
• HX5 Blade• 128 GB RAM• 16 Intel cores
• BPE4 Side Car• 16 GB RAM• 16 Virtex-6 FPGA cores• SAS Controller
N2001 Snippet-BladeTM (S-Blade) Components
Netezza DB Accelerator
IBM BladeCenter Server
50© 2011 IBM Corporation
IBM Confidential
© 2013 IBM Corporation
N2001: Speed Through Taking Most of Streaming Capabilities
FPGA CoreCPU Core
DecompressProjectRestrictVisibility
Complex ∑Joins, Aggs, etc.
S-BladeTable Cache
DB2 for z/OS130 MB/s
1300 MB/s
1000 MB/s1000 MB/s
4x compressionassumed
130 MB/s
65 MB/s
2.5 drives per core
325 MB/s
51© 2011 IBM Corporation
IBM Confidential
© 2013 IBM Corporation
N1001 N2001
Blade type HS22 HX-5
CPU sockets & cores per blade 2 x 4 Core Intel CPUs 2 x 8 Core Intel CPUs
# Disks96 x 3.5” / 1 TB SAS(92 Active)
288 x 2.5” / 600GB SAS2(240 Active)
Raw Capacity 96 TB 172.8 TB
Total Disk Bandwidth ~11 GB/s ~32 GB/s
S-Blades per Rack (cores) 14 (112) 7 (112)
S-Blade Memory 24 GB 128 GB
Rack Configurations ¼, ½, 1, 1 ½, 2, 3, … 10 ½, 1, 2, 4
FPGA Cores / Blade8(2 x 4 Engine Xilinx FPGA)
16(2 x 8 Engine Xilinx Virtex 6 FPGA)
User Data / Rack(assuming 4x compression)
128 TB 192 TB
IBM PureData System for Analytics Models Comparison
52© 2011 IBM CorporationIBM Confidential© 2013 IBM Corporation
IBM DB2 Analytics Accelerator Supports All Models
N1001 Models 002 005 010 015 025 030 040 060 080 100
Cabinets ¼ ½ 1 1 ½ 2 3 4 6 8 10
S-Blades 3 6 12 18 24 36 48 72 96 120
Processing Units 24 48 96 144 192 288 384 576 768 960
Capacity (TB) 8 16 32 48 64 96 128 192 256 320
Effective Capacity (TB)* 32 64 128 192 256 384 512 768 1024 1280
N2001 Models 005 010 025 040
Cabinets 1/2 1 2 4
S-Blades 4 7 14 28
Processing Units 64 112 224 448
Capacity (TB) 24 48 96 192
Effective Capacity (TB)* 96 192 384 768
Capacity = User data spaceEffective Capacity = User data space with compression (4x compression assumed)
53© 2011 IBM CorporationIBM Confidential© 2013 IBM Corporation
Agenda
• Business and Technology Drivers
• IBM DB2 Technology
• IBM DB2 Analytics Accelerator
➔ Built on DB2 - The Industrial Strength DBMS➔ Architecture➔ Customer References➔ Powered by PureData for Analytics➔ Strategy and Roadmap
© 2013 IBM Corporation54
DB2 for z/OSDB2 for z/OS
Strategy
QueryQueryAcceleratorAccelerator
StorageStorageSaverSaver
ELTELTAcceleratorAccelerator
OLTPOLTP
Enable DB2 transition into a truly universal DBMS that provides best characteristics for both OLTP and analytical workloads.
AdvancedAdvancedAnalyticsAnalytics
■ Complement DB2's industry leading transactional processing capabilities
■ Provide specialized access path for data intensive queries
■ Enable real and near-real time analytics processing
■ Execute transparently to the applications■ Operate as an integral part of DB2 and System z■ Reusing industry leading PDA's query and
analytics capabilities and take advantage of future enhancements
■ Extend query acceleration to new, innovative usage cases, such as:
– in-database transformations– advanced analytical capabilities– multi-temperature and storage saving solutions
■ Ultimately allow consolidation and unification of transactional and analytical data stores
© 2013 IBM Corporation55
Roadmap
QueryQueryAcceleratorAccelerator
StorageStorageSaverSaver
ELTELTAcceleratorAccelerator
UnifiedUnifiedStoreStore
AdvancedAdvancedAnalyticsAnalytics
PDA technology evolutionIm
provements of existin
g features
enhancing current capabilities
enabling more query acceleration
increasing IDAA transparency
supporting new use cases
V1
V2V3
V4
56© 2011 IBM CorporationIBM Confidential© 2013 IBM Corporation
Fast Evolution of IBM DB2 Analytics Accelerator
• Version 1➔ IBM Smart Analytics Optimizer➔ In-memory, column-store, multi-core and SIMD algorithms➔ Discontinued and replaced by IBM DB2 Analytics Accelerator
• Version 2➔ New name: IBM DB2 Analytics Accelerator➔ Incorporates Netezza query engine➔ Preserves key V1 value propositions and adds many more
• Version 3➔ Better performance, more capacity➔ Incremental update➔ High Performance Storage Server
• Version 4➔ Much broader acceleration opportunities➔ More enterprise features
Nov 2010Nov 2010
Nov 2011Nov 2011
Nov 2012Nov 2012
Nov 2013Nov 2013
© 2013 IBM Corporation57
IDAA V3 Highlights
Generally available since November 2012
(1) – features retrofitted to V2
■ Propagating DB2 changes to the accelerator as they happen: Incremental Update■ Reducing disk storage cost by archiving data in the accelerator and maintaining the
excellent performance for analytical queries: High Performance Storage Saver■ Workload Manager integration■ Automatic detection of needs to refresh data in the accelerator■ More query routing control for applications (all, eligible)■ More query offload (e.g. DB2 OLAP functions)■ Speeding-up data refresh and reducing associated CPU cost on System z (1)
■ Accelerating in-database transformation (1)
■ Enhancing high availability and scaling out (1)
■ Improving performance of queries that generate very large result sets (1)
■ Supporting multi-byte EBCDIC data encoding (phase 1) (1)
■ Increasing capacity to more than 1 petabyte (1)
■ Support for SAP workloads (1)
© 2013 IBM Corporation58
IDAA V3 Highlights
Additions since GA
■ Additional query engine: PureData System for Analytics N2001■ Support for Netezza operating system 7■ Further reduction of CPU time associated with IDAA load process
– Up to 30%
– Enhancements in DFSMS BSAM routines managing data on the USS pipes
– z/OS PTFs:• z/OS V1.12 UA68971• z/OS V1.13 UA68972• z/OS V2.1 UA68973
■ Multiple time zones in the same accelerator ■ Limited support for LOCAL DATE setting■ Support for BITAND and TIMESTAMPDIFF functions■ Support for DECFLOAT when used as implicit cast
– e.g. when comparing different data types
■ Enhancements to incremental update
© 2013 IBM Corporation59
Version 4 at a Glance
More Query Acceleration Enhanced Capabilities Improved Transparency
Static SQLGreatly improved scalability of Incremental Update
Automatic workload balancing with multiple accelerators
DB2 11 (2) Better performance of Incremental Update New RTS 'last-changed-at' timestamp (2)
Multi-row fetch from local applicationsImproved performance for large result sets (2) Automated NZKit installation
EBCDIC and Unicode in the same DB2 system and accelerator
Better access control for HPSS archived partitions
Built-in Restore for HPSS
HPSS archiving to multiple accelerators Protection for image copies created by HPSS archiving process
Extending WLM support to local applications Profile controlled special registers (2)
Rich system scope monitoringImproved continuous operations for Incremental Update
Reporting prospective CPU cost and elapsed time savingsSeparation of duties for accelerator system administration operations
Loading from flat file or image copy (1)
Loading in parallel to DB2 and accelerator (1)
Loading data as of any past point in time (1)
Loading data to accelerator only (1)
E n a b l i n g n e w u s e c a s e s
(1) – delivered by a separate tool(2) – DB2 11 only