Download - Informix IWA: Architectural options
Architectural Options with IWA
Keshav MurthyIBM Informix Development
• Data Warehouse query Performance without Perspiration
• Consistent query performance without tuning efforts.
• More questions, faster answers, better data driven decisions & business insights
• SKECHERS: Acceleration from 60x to 1400x – average acceleration of 450x
Motivation
Informix Database Server
Informix warehouse Accelerator
BI Applications
Step 1. Install, configure,start Informix
Step 2. Install, configure,start Accelerator
Step 3. Connect Studio to Informix & add accelerator
Step 4. Design, validate, Deploy Data mart
Step 5. Load data to accelerator
Ready for Queries
IBM Smart Analytics Studio
Step 1
Step 2
Step 3
Step 4
Step 5
Ready
Informix Ultimate Warehouse edition
4
INTEL/IWA: Breakthrough technologies for
performance
1
2
34
5
6
7 1
2
34
5
6
7
1. Large memory support64-bit computing; System X with MAX5 supports up to 6TB on a single SMP box; Up to 640GB on each node of blade center. IWA: Compress large dataset and keep it in memory; totally avoid IO.
7. Multi-core, multi-node environmentNehalem has 8 cores and Westmere 10 cores. This trend is expected to continue. IWA: Parallelize the scan, join, group operations. Keep copies of dimensions to avoid cross-node synchronization.
4. Virtualization PerformanceLower overhead: Core micro-architecture enhancements, EPT, VPID, and End-to-End HW assist IWA: Helps informix and IWA to seemlessly run and perform in virtualized environment.
5. Hyperthreading 2x logical processors; increases processor throughput and overall performance of threaded software. IWA: Does not exploit this since the software is written to avoid pipeline flushing.
3. Frequency PartitioningIWA: Enabler for the effective parallel access of the compressed data for scanning. Horizontal and Vertical Partition Elimination.
2. Large on-chip CacheL1 cache 64KB per core, L2 cache is 256KB per core and L3 cache is about 4-12 MB. Additional Translation lookaside buffer (TLB). IWA: New algorithms to avoid pipeline flushing and cache hash tables in L2/L3 cache
6. Single Instruction Multiple DataSpecialized instructions for manipulating 128-bit data simultaneously. IWA: Compresses the data into deep columnar fashion optimized to exploit SIMD. Used in parallel predicate evaluation in scans.
Store_sales data martStore_sales data mart
Options Optimized Platforms
Source: BI Research, 2013
Use Case Application ExampleReal-Time Monitoring & Analytics In-line fraud detection to reduce
financial losses caused bystolen credit cards
Near-Real-TimeAnalytics Next best customer offer to the channel to increase customer satisfaction & reduce churn
Data Integration Hub Collect and manage all sales-related detailed data (POS, web, supply chain) for down stream analysis
Analytics Accelerator Offload & boost the performance of selected financial analyses to increase satisfaction/retention of key clients
New LOB Analytic Application Manage & monitor spot buying onweb advertising exchanges
Investigative ComputingPlatform
Evaluate the effectiveness ofdifferent social computing channels
Starting Point
Source: BI Research, 2013
• Ten different machine configurations• Five ways to sync data• Just combining the two options above
Informix Warehouse AcceleratorDeployment Options
Ingredients1. Computer hardware2. Informix3. IWA
• This is typically part of a larger IT system and workflow
• We’ll be focusing on options for each of these components.
Ingredients: Computer Hardware
• Computer hardware– Single SMP system – Multiple SMP systems– Single Cluster systems– Multiple cluster systems
Ingredients: Computer Hardware
• Single SMP system– Informix and IWA running on the same system– Should be a high-memory system.
• IBM System X with MAX5 can go up to 3TB with DIMMs up to 16GB.
• http://www-03.ibm.com/systems/data/flash/systemx/hardware/ddr3//
– The machine has to be based on Intel Xeon with SSE
– Informix can be running OLTP or OLAP workload– Limit the number of CPU VPs and the number of
cores for IWA– Set the SHMTOTAL and memory for IWA
Ingredients: Computer Hardware
• Multiple SMP systems– Informix and IWA running on the separate
systems– IWA machine should be an Intel Xeon based
processor with high-memory.– Informix machine can be:
• Linux on Intel• Linux or AIX on Power• Solaris on Intel or Sparc• HP/UX on Itanium
– Data is transferred from Informix instance to IWA.
Ingredients: Computer Hardware
• Computer hardware– Single SMP system – Multiple SMP systems– Single Cluster systems– Multiple cluster systems
Let’s discuss Informix and IWA on cluster systems.i.e., Informix MACH11IWA on a multi-node cluster.
16
Informix Database Server
Informix warehouse Accelerator
BI Applications
Step 1. Install, configure,start Informix
Step 2. Install, configure,start Accelerator
Step 3. Connect Studio to Informix & add accelerator
Step 4. Design, validate, Deploy Data mart
Step 5. Load data to accelerator
Ready for Queries
IBM Smart Analytics Studio
Step 1
Step 2
Step 3
Step 4
Step 5
Ready
Informix Warehouse Accelerator – In 11.70.FC4Informix Warehouse Accelerator – In 11.70.FC4
Design DM by workload analysis or manually
Deployed datamart
Datamart Deleted
Datamart in USE
Datamart Disabled
Partition based refresh
Trickle feed refresh
Deploy
Load
Drop
DisableFull Load/ Enable
Drop
Complete view of Data mart state transitions.
Background• Prior to 11.70.FC5, adding accelerator, create, deploy, load,
enable, disable datamart, accelerating queries – are all operations officially supported only on Standard server or Primary node of MACH11/HA environment.
• We estimate about 50% of Informix customers use HDR secondary servers and growing number of customers use MACH11 (SDS secondary) configurations and RSS nodes. MACH11 is the Informix scale out solution.
• IWA itself supports a scale out solution (on a cluster) starting with 11.70.FC4.
• Reasons to support MACH11 and IWA together.– This feature will enable partitioning a cluster or HA group between OLTP
and BI workload.– This feature will give help to off-load the expensive LOAD functionality to
secondary servers– We have customers now requesting support for HDR secondary to IWA
19
Informix Primary
Informix warehouse Accelerator
BI Applications
Step 1. Install, configure,start Informix
Step 2. Install, configure,start Accelerator
Step 3. Connect Studio to Informix & add accelerator
Step 4. Design, validate, Deploy Data mart from Primary, SDS, HDR, RSS
Step 5. Add IWA to sqlhostsLoad data to Accelerator from any node.
Ready for Queries
IBM Smart Analytics Studio
Step 1
Step 3
Step 4
Step 5
Ready
Informix Warehouse Accelerator – 11.70.FC5. MACH11 SupportInformix Warehouse Accelerator – 11.70.FC5. MACH11 Support
Informix
SDS1
Informix
SDS2
Informix
HDR
Secondary
Informix
RSS
Step 2
1. Machine Summary
1. Informix can be in any of the following
2. IWA can be running in any of the following
• Single SMP system
• Multiple SMP systems
• Single Cluster systems
• Multiple cluster systems
3. You can mix and match for scale-out (performance), high availability, application evolution, migration and any number of reasons.
1. Hardware Configuration
• All in one – single system – All on Linux on Intel
• Multiple systems
• Homogeneous systems – Linux on Intel
• Heterogeneous systems
• Informix on Linux on Intel/Power, AIX on Power, Solaris on Intel/Sparc
• IWA Linux on Intel
• Informix on Cluster and IWA on single node
• Informix on single system and IWA on cluster
• Informix on cluster and IWA on cluster
• Informix on combination of Cluster and smp system; IWA on cluster or smp system
• Single/multiple informix to single/multiple iwa.
1. Informix Topology
• Informix single node.
• Informix Primary+SDS
• Informix Primary + HDR + RSS
• Informix Primary +SDS + RSS
• Informix Flexible grid
1. IWA Topology
• Single node
• Multiple single node systems
• Single cluster
• Multiple clusters
• combo of single/multi
• Features
• Informix Warehousing
• IWA Acceleration
• Multiple data marts with same definition
• MACH11 support
• Heterogeneous platform support
• Data sync – refresh mart
• Data sync – trickle feed
• Data mart – External table
• Data mart Timeseries acceleration
Informix Database Server
BI Applications
Step 1. Create the Sales-Mart and load it. Sales is the facttable -- range partitioned.
Step 2. Load jobs update the fact table “sales”Only updates existing partition
Step 3. Identify the partition,execute dropPartMart().
Step 4. for same partition,execute loadPartMart().
Ready for Queries
IBM Smart Analytics Studio or stored procedures or command line toolStep 1
Step 4
Step 2
Step 3
Ready
Case 1: Partition refresh: Updates to existing Partitions
Sales-Mart
salescustomer
stores
IWA
OLTP Apps
partitioned fact table
SQL Script: call Stored procedure
Modified partition
INSERT, UPDATE, DELETE
Informix Database Server
BI Applications
Step 1. Create the Sales-Mart and load it. Sales is the facttable -- range partitioned.
Need to move the Time window to next range.
Step 2. DETACH operationa. Execute dropPartMart()b. DETACH the partition
Step 3. ATTACH operationa. ATTACH the partitionb. Execute loadPartMart()
Ready for Queries
IBM Smart Analytics Studio or stored procedures or command line toolStep 1
Step 3Step 2
Ready
Case 2: Partition refresh: Time Cyclic data management
Sales-Mart
salescustomer
stores
IWA
OLTP Apps
partitioned fact table
Move the window.
Design DM by workload analysis
or manually
Deployed datamart
Datamart DeletedDatamart in USE
Datamart Disabled
Partition based refresh
Trickle feed refresh
Deploy
Load
Drop
Disable
Enable
Drop
Data Refresh: RefreshMart Implementation :new stored procedure :
ifx_refreshMart(
'accelerator_name',
'data_mart_name',
'locking_mode',
NULL);
locking_mode is optional : can be NULL
4th parameter : not used as of now
if used while new functionality “trickleFeed” is active :
ifx_refreshMart() will not refresh fact tables for which trickleFeed is active.
Data Refresh: RefreshMart :
granularity based on table partitions
data mart remains available for query acceleration
single call of stored procedure for ease of use
control of execution remains with administrator
handles all data changes, including fragment operations
data consistency via lock mode parameter
prerequisite :sysadmin database accessible for administrator
Informix Database Server
Step 1. Create the Sales-Mart and load it. Sales is the factTable, customer and stores Dimension tables.
Step 2 Setup tricklefeed bycalling ifx_setupTrickleFeed
Step 3. Let application roll.Do the inserts on fact andUpdates on any dimensions.
Step 4. As the applicationsruns, the reports see newData updated on IWA
IBM Smart Analytics Studio or stored procedures or command line toolStep 1
Step 3
Step 2
Data Refresh: Scenario for Real-time trickle feed.
Sales-Mart
salescustomer
stores
IWA OLTP Apps
fact table
Setup the trickle feed
Run the application
Step 4
Reports & BI Apps
Data Refresh: Trickle feed (cont.)
insert intofact_table ...
fact table
data row trigger
dimension table1
data row
accelerator
data mart
data row
Dbschedulertask
ifx_loadPartMart()
ifx_refeshMart()
data row
dimension table2
data row
User interface:
ifx_setupTrickleFeed( 'accelerator_name', 'data_mart_name', buffertime)
accelerator_nameThe name of the accelerator that contains the data mart.data_mart_nameThe name of the data mart.buffertime An integer that represents the time interval between refreshes and whether dimension tables are refreshed.
Examples:
execute procedure ifx_setupTrickleFeed('salesacc', ‘partsmart', 60);
execute procedure ifx_setupTrickleFeed('salesacc', 'carmart', -300);
Trickle feed (cont.)
Deep dive into interval and rolling window table partitioning in IBM Informix
Keshava Murthy IBM [email protected]
IBM’s statements regarding its plans, directions, and intent are subject to change or withdrawal without notice at IBM’s sole discretion.
Information regarding potential future products is intended to outline our general product direction and it should not be relied on in making a purchasing decision.
The information mentioned regarding potential future products is not a commitment, promise, or legal obligation to deliver any material, code or functionality. Information about potential future products may not be incorporated into any contract. The development, release, and timing of any future features or functionality described for our products remains at our sole discretion.
Please Note:
Performance is based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual throughput or performance that any user will experience will vary depending upon many factors, including considerations such as the amount of multiprogramming in the user's job stream, the I/O configuration, the storage configuration, and the workload processed. Therefore, no assurance can be given that an individual user will achieve results similar to those stated here.
04/12/23 34
Availability. References in this presentation to IBM products, programs, or services do not imply that they will be available in all countries in which IBM operates.
The workshops, sessions and materials have been prepared by IBM or the session speakers and reflect their own views. They are provided for informational purposes only, and are neither intended to, nor shall have the effect of being, legal or other guidance or advice to any participant. While efforts were made to verify the completeness and accuracy of the information contained in this presentation, it is provided AS-IS without warranty of any kind, express or implied. IBM shall not be responsible for any damages arising out of the use of, or otherwise related to, this presentation or any other materials. Nothing contained in this presentation is intended to, nor shall have the effect of, creating any warranties or representations from IBM or its suppliers or licensors, or altering the terms and conditions of the applicable license agreement governing the use of IBM software.
Acknowledgements and Disclaimers:
Acknowledgements & Disclaimers:
© Copyright IBM Corporation 2013. All rights reserved.
– U.S. Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.
– Please update paragraph below for the particular product or family brand trademarks you mention such as WebSphere, DB2, Maximo, Clearcase, Lotus, etc
IBM, the IBM logo, ibm.com, [IBM Brand, if trademarked], and [IBM Product, if trademarked] are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries, or both. If these and other IBM trademarked terms are marked on their first occurrence in this information with a trademark symbol (® or ™), these symbols indicate U.S. registered or common law trademarks owned by IBM at the time this information was published. Such trademarks may also be registered or common law trademarks in other countries. A current list of IBM trademarks is available on the Web at “Copyright and trademark information” at www.ibm.com/legal/copytrade.shtml
If you have mentioned trademarks that are not from IBM, please update and add the following lines:
[Insert any special 3rd party trademark names/attributions here]
Other company, product, or service names may be trademarks or service marks of others.
Do you have a great presentation topic that you’d like to share? •We’re looking for dynamic, innovative and thought-provoking sessions•Whether your proposal aims at sharpening skills, sharing best practices, or presenting new ideas and groundbreaking concepts, all proposals are welcome•Visit the conference website to learn more
The Call for Speakers closes April 30! Hurry to submit your session!
Sign Up! Informix Usability Sandbox!Help shape the future of Informix.
Influence Informix usability and functionality.Share your experiences and feedback.
Usability Sandbox sessions in Santa Fe 3April 22-24th, between 9am and 5pm
Sign-up at the IBM Information Table or find Justin McDavid.
*The first 20 participants will get a free IBM t-shirt!
Informix RFE (Request For Enhancement) Process
As Simple as 1, 2, 31. Submit from the IM RFE site – simply complete the RFE form and click Submit when ready
Many fields will be auto-filled as a convenience for you Note that fields with the ‘key’ field e.g. Company Name and Business Justification will be
kept private for confidentiality purposes Provide as much detail as possible in the Description, Use Case, and Business
Justification fields to help the IBM team understand your requirement
2. View via Watchlist Lists all the RFEs that you’re interested in Simple to add an RFE via Search
3. Subscribe to email notifications Specify ‘Opting in for email notifications’ Notified when any change occurs to any RFE on your watch list
YouTube: http://www.ibm.com/developerworks/rfe/execute?use_case=tutorials#tut2YouTube: http://www.ibm.com/developerworks/rfe/execute?use_case=tutorials#tut2
Give it a shot! http://www.ibm.com/developerworks/rfe/
Backup
ar·chi·tec·ture /ärkitekCHər/
Noun• The art or practice of designing and constructing
buildings.• The style of a building with regard to a specific
period, place, or culture.
op·tion Noun
A benefit in the form of an option given by a company to an employee to buy stock in the company at a discount or at a stated fixed price.Surely, were neither discussing buildings or stock
options.
A system architecture or systems architecture is the conceptual model that defines the structure, behavior, and more views of a system.
An architecture description is a formal description and representation of a system, organized in a way that supports reasoning about the structures of the system, which comprise system components, the externally visible properties of those components, the relationships (e.g. the behavior) between them, and provides a plan from which products can be procured, and systems developed, that will work together to implement the overall system.
A system architecture or systems architecture is the conceptual model that defines the structure, behavior, and more views of a system.
An architecture description is a formal description and representation of a system, organized in a way that supports reasoning about the structures of the system, which comprise system components, the externally visible properties of those components, the relationships (e.g. the behavior) between them, and provides a plan from which products can be procured, and systems developed, that will work together to implement the overall system.