IBM Information management integration and STG Smart Analytics Igor G. Gonchar IBM STG High End Systems Solutions Architect for RCIS

Upload: doandat

Post on 25-May-2018


© 2012 IBM Corporation 2

Clients Need to Establish a Repeatable Delivery System for Information:

The data supply chain

High-Level Architecture Components:

Analyze

Integrate

Transactional & Collaborative Applications

Manage Business Analytics Applications

External Information Sources

Complex Queries

Streaming Data

Advanced Analytics

Master Data

Structured & Unstructured

Content

Data

Data Warehouses

Governance

Data Quality Management

Security & Privacy

Lifecycle Management

3rd Parties

Social Data

Archiving & Retention

Process & Collaboration

Data Models

Data Mining

Predictive Modelling

Consume

Content Repository

Real-time analytics


Delivering trusted information across your entire information supply chain

Analyze

Integrate

Transactional & Collaborative Applications

Manage

Business Analytics Applications

External Information Sources

Cubes

Streams

Big Data

Master Data

Content

Data

Streaming Information

Govern: Quality, Security & Privacy, Lifecycle

Data Warehouses

Standards


IBM Solutions for Information Governance

GOVERN: Quality, Security & Privacy, Lifecycle

InfoSphere Information Server

InfoSphere Optim & FileNet

InfoSphere Guardium

MANAGE, INTEGRATE, ANALYZE

DB2, Informix

FileNet

solidDB

InfoSphere MDM

Netezza & InfoSphere Warehouse

Cognos, InfoSphere Warehouse

InfoSphere Streams

InfoSphere BigInsights

InfoSphere Information Server

InfoSphere Foundation Tools & Industry Models

Standards


Set of extensive Industry Data Models

Business Terms

● Business Terms define industry concepts in plain business language, with no modeling or abstraction involved. Business Terms have a set of properties and are organized by Business Categories. Clearly defined business terms help standardization within a company. The mapping to the data models makes it possible to create a common enterprise-wide picture of the data requirements and to transform these requirements into IT data structures.

Analytical Requirements

● Analytical Requirements are a high-level grouping of the business information that the enterprise needs and uses to express business Measures along axes of analysis, which are named Dimensions. They allow business users to fully articulate the requirements for a piece of analysis using their own business terminology. The Analytical Requirements are the basis for building the enterprise data models used to develop the IT assets that deliver the required analysis to the business users.

Supportive Glossary

● A Supportive Glossary is a grouping of terms incorporating any terminology originating from an internal or external source. It is used to support data structures such as regulatory reports (e.g. Basel II, IAS/IFRS, Solvency II), industry standards (e.g. ACORD, HIPAA, SEPA, SEC US GAAP, FpML, MISMO), business architecture standards (e.g. EPP), vendor interfaces (e.g. SAS, Fair Isaac, Sendero, Oracle Financials), or legacy source systems (e.g. Loans systems, Underwriting systems).

IBM Data Models: Business Terms, Supportive Glossary, Analytical Requirements, Atomic Warehouse Model, Dimensional Warehouse Model

Atomic Warehouse Model

● The Atomic Warehouse Model is a design level data model that represents the enterprise-wide repository of atomic data used for informational processing. This includes the historization of the value changes of business information that may vary over time, and of which the business wants to keep track for analytical purposes.

Dimensional Warehouse Model

● The Dimensional Warehouse Model is the enterprise-wide repository for analytical data. It contains star schema style dimensional data structures organized around fact entities that support the Analytical Requirements. The Dimensional Warehouse Model can be accessed directly by analytical tools or queries, or its content may be easily distributed to specific downstream data marts, if any.
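The star-schema idea described above can be illustrated with a minimal sketch. It is not taken from the IBM models: the table and column names (dim_customer, dim_date, fact_sales, amount) are hypothetical, and an in-memory SQLite database from the Python standard library stands in for a real warehouse. A fact table holds the measure, and queries aggregate it along the dimension tables.

```python
import sqlite3


def build_star_schema():
    """Create a tiny in-memory star schema: one fact table, two dimensions.

    All table and column names here are hypothetical illustrations.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, segment TEXT);
        CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER);
        CREATE TABLE fact_sales (
            customer_id INTEGER REFERENCES dim_customer(customer_id),
            date_id INTEGER REFERENCES dim_date(date_id),
            amount REAL
        );
    """)
    conn.executemany("INSERT INTO dim_customer VALUES (?, ?)",
                     [(1, "Retail"), (2, "Corporate")])
    conn.executemany("INSERT INTO dim_date VALUES (?, ?)",
                     [(10, 2011), (11, 2012)])
    conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)",
                     [(1, 10, 100.0), (1, 11, 50.0), (2, 11, 200.0)])
    return conn


def sales_by_segment_and_year(conn):
    """Aggregate the measure (amount) along two dimensions."""
    return conn.execute("""
        SELECT c.segment, d.year, SUM(f.amount)
        FROM fact_sales f
        JOIN dim_customer c ON c.customer_id = f.customer_id
        JOIN dim_date d ON d.date_id = f.date_id
        GROUP BY c.segment, d.year
        ORDER BY c.segment, d.year
    """).fetchall()
```

Calling `sales_by_segment_and_year(build_star_schema())` returns one row per (segment, year) combination, which is the kind of result an analytical tool or a downstream data mart would consume.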


[Diagram: the information supply chain architecture shown earlier, alongside the IBM Data Models (Business Terms, Supportive Glossary, Analytical Requirements, Atomic Warehouse Model, Dimensional Warehouse Model)]

Set of extensive Industry Data Models

● The Models represent the integrated Design time basis for Data Warehouse Deployment

● The model content is provided to the IBM InfoSphere Runtime infrastructure in an ordered manner

● The Analytical Requirements are used to drive the specification of the Business Analytics requirements

● The Supportive Glossary records the upstream external sources as well as any other external Regulatory obligations

● The Central Data Warehouse is generated from either the Atomic Warehouse Model or the Dimensional Warehouse Model

● The Data Mart structure development is driven from the Dimensional Warehouse Model


An integrated tooling platform addressing Business Term management and associated Data Model development

[Diagram: the information supply chain architecture shown earlier, alongside the model components (Business Terms, Supportive Glossary, Analytical Requirements, Atomic Warehouse Model, Dimensional Warehouse Model)]

● ER Data Models are managed natively in IDA. This enables Data Model development to leverage the normal benefits of IDA (team support, integration with Cognos, DB2, Netezza, RSA, etc.)

● All of the Business-related content (Business Terms, Analytical Requirements and Supportive Glossaries) is managed by the Business in InfoSphere Business Glossary

● Enables the deployment of the terms to a larger Business Audience and leverages the management, stewardship, etc. of Business Glossary

● Integration between IBG and IDA is done via the standard IDA plugin provided by IBG

● Enables Modellers to view, map to and develop using a synchronized copy of the Components in BG

InfoSphere Data Architect (IDA)

IBG Plugin for IDA

Read-only view of IBG Terms for IDA Users

InfoSphere Business Glossary (IBG): Business Terms, Supportive Glossary, Analytical Requirements


Dimension traceability to Atomic Warehouse Model

Traceability from Dimensions in Dimensional Model to Atomic Warehouse Model

• IDA Dependencies used

• Traceability from the entities and attributes


Dimensional Model in Cognos

The Dimensional Models export directly to the Cognos Framework via the IDA/Cognos bridge

All Facts, Measures and Dimensions defined in IDA are maintained during the export

The Star Schemas defined in the Dimensional Model form the basis of Packages in Cognos Framework Manager

These in turn can be exported to the Cognos server for report generation

IDA to Cognos


Foundation Tools & Beyond

Simplification & Content: reduces project time, risk and cost!

Metadata Server

Data Architect: Import & Enhance Industry Model

Business Glossary: Define Business Requirement & Glossary

Discovery: Find Data Relationships & Transformation Rules

Information Analyzer: Assess, Monitor, Manage Data Quality Rules

FastTrack: Map Sources to Target Model

DataStage & QualityStage: Generate Logic to Load Warehouse

Cognos: Deliver Reports

Establish Platform

Create Business Objects

Populates

Links


12/2/2013

IBM Smart Analytics System P&X Standard Configuration

InfoSphere Warehouse

Cubing Services

Cognos 10.2 BI

ELT

Operational Source Systems Structured/ Unstructured Data

Data Warehouse

System P or X

Implementation Services and AVP

DB2

DB2 Utilities Suite: Image Copy, LOAD, UNLOAD, REORG, etc.

SPSS Modeler


Incremental Update

ELT or ETL

Table or Partition Update

Change Data Capture

Incremental Update

OLTP Application

Data Warehouse

Analytics Accelerator

Synchronizing data to lower data latency from days to minutes/seconds
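The incremental-update idea on this slide can be sketched in a few lines of Python. This is an illustration only, not the behaviour of any IBM Change Data Capture product: the change-record format `(operation, key, row)` and the dictionary that stands in for a warehouse table are assumptions made for the example. Instead of reloading the whole table, only the captured changes are replayed against the target in the order they occurred.

```python
def apply_changes(target, changes):
    """Replay captured changes against a target table, in order.

    target  : dict mapping primary key -> row (a stand-in for a warehouse table)
    changes : iterable of (operation, key, row) tuples; this record format
              is a hypothetical one chosen for the illustration
    """
    for operation, key, row in changes:
        if operation in ("insert", "update"):
            # An upsert keeps the replay idempotent for repeated inserts.
            target[key] = row
        elif operation == "delete":
            target.pop(key, None)
        else:
            raise ValueError(f"unknown operation: {operation}")
    return target
```

For example, starting from `{1: {"balance": 100}}` and applying an update to key 1, an insert of key 2 and a delete of key 1 leaves only the row for key 2. The cost is proportional to the number of changes rather than to the table size, which is what allows latency to fall from days to minutes or seconds.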


IBM Warehousing & Analytics: Offering Positioning… “It Depends”

CUSTOMER Preferences

1. High performance analytic queries and real-time transactions are both required

2. Power Systems, Linux/x series, or System z platform

3. Consistent use of DB2 across IT environment

IBM Netezza

(Appliances)

IBM Smart Analytics System (Optimized systems)

IBM InfoSphere Warehouse (Custom configurations)

CUSTOMER Preferences

1. High performance analytic queries without DBA tuning

2. No storage administration

3. Fastest possible deployment


Platforms: System z, Power Systems, System x, Systems Storage

System z

Strategic Objectives

Drive growth through new workloads: consolidation, analytics, and hybrid computing (zBX)

Expand client base through competitive takeouts and focus on new clients in Growth Markets

Power Systems

Strategic Objectives

Aggressively continue to gain share from HP and Oracle/Sun with Power migration programs

Establish Power as premier platform to execute Cloud, Analytics, and Smarter Planet hypergrowth

System x

Strategic Objectives

Drive IBM Stack growth, integrated with Cloud, Analytics, Smarter Planet

Drive improved x86 value capture model

Systems Storage

Strategic Objectives

Drive differentiation through storage efficiency and data protection

Execute brand transformation plans


Simplicity, Flexibility, Choice: IBM Data Warehouse & Analytics Solutions

IBM InfoSphere Warehouse: Custom Solution (Flexibility)

IBM Smart Analytics System: Flexible Integrated System (The right mix of simplicity and flexibility)

IBM Netezza: True Appliance (Simplicity)

Warehouse Accelerators

Information Management Portfolio (Information Server, MDM, Streams, etc.)


IBM Smart Analytics System: Transparent modular architecture

Choose the way that your data warehouse solution develops. Simply start with a foundation and add modules as you require.

Foundation: Start with a single Foundation Module, the common starting point.

Scalability and Failover: For additional data handling capacity, number of users or failover functionality, add nodes.

BI and Analytics: InfoSphere Warehouse and Cognos BI modules.

Core Warehouse Modules: Foundation Module (1 Module) + Data Module (1 to x Modules) + User Module (0 to y Modules) + Failover Module (0 or x/5 Modules)

Application Modules: Warehouse Applications Module, Business Intelligence Module


Data Warehouse

Data Mart ODS

PureData

DS8870

Analytics Accelerator

DB2

ETL/ELT

Operational Source Systems

Or AIX

Or z/OS

Organized for simplicity and functionality


Centralized Control of Decision Information

Fast, Consistent, Easily Managed Information

ETL/ELT

DB2

Data Warehouse

Data Mart 2

Data Mart 3

WEB Applications

Analytic Applications

Business Performance Applications

Centrally managed

Consistent information

Easy to access

Easy to update

Fast business recovery

Simplified administration

Maximize business value from resources

Analytics Accelerator

Data Studio


IBM PureData System

System for Transactions

Pattern-based database deployment in minutes, not hours [1]

Handles more than 100 databases on one system [2]

System for Analytics (powered by Netezza technology)

10-100x faster than traditional custom systems [4]

20x greater concurrency and throughput for tactical queries than previous Netezza technology [5]

System for Operational Analytics

Continuous ingest of operational data

Handles 1000+ concurrent operational queries [3]

Up to 10x storage savings with adaptive compression [6]

1. Based on IBM internal tests and system design for normal operation under expected typical workload. Individual results may vary.

2. Based on one large configuration.

3. Based on IBM internal tests of prior generation system, and on system design for normal operations under expected typical workload. Individual results may vary.

4. Based on IBM customers' reported results. "Traditional custom systems" refers to systems that are not professionally pre-built, pre-tested and optimized. Individual results may vary.

5. Based on IBM internal performance benchmarking.

6. Based on client testing in the DB2 10 Early Access Program.


IBM Smart Analytics Advantages

SI by You: Models + Cleansing + ETL + MDM + Data Warehouse + BI

SI by IBM: Models (BDW) + Cleansing (InfoSphere) + ETL (InfoSphere) + MDM (IBM MDM Server) + Data Warehouse (Smart Analytics Server) + BI (Cognos)

Unified Infrastructure Benefits:

Decreased risk by 53%

Improved business alignment by 83%

Improved time to value by 75%

Reduced project staffing by 90%

IBM RESEARCH/ANALYST REPORTS


The IBM Big Data Platform

Velocity, Variety & Volume

Data-At-Rest, Data-In-Motion

Hadoop, MPP Data Warehouse, Stream Computing, Information Integration

InfoSphere BigInsights: Hadoop-based low latency analytics for variety and volume

Netezza High Capacity Appliance: Queryable Archive for Structured Data

Netezza 1000: BI + Ad Hoc Analytics on Structured Data

Smart Analytics System: Operational Analytics on Structured Data

Informix Timeseries: Time-structured analytics

InfoSphere Warehouse: Large volume structured data analytics

InfoSphere Streams: Low Latency Analytics for streaming data

InfoSphere Information Server: High volume data integration and transformation

Apache Hadoop: open source framework for the distributed processing of large data sets across clusters of computers using a simple programming model

Big Data Concepts and Hardware Considerations


The IBM Big Data Platform


Integrate and manage the full variety, velocity and volume of data

Apply advanced analytics to information in its native form

Visualize all available data for ad-hoc analysis

Development environment for building new analytic applications

Workload optimization and scheduling

Security and Governance


IBM’s Value: Complementary Analytics

Traditional Approach: Structured, analytical, logical

Structured, Repeatable, Linear

Monthly sales reports; Profitability analysis; Customer surveys

Data Warehouse

Traditional Sources (Structured, Repeatable, Linear): Internal App Data; Transaction Data; ERP data; Mainframe Data; OLTP System Data

New Approach: Creative, holistic thought, intuition

Unstructured, Exploratory, Iterative

Brand sentiment; Product strategy; Maximum asset utilization

Hadoop, Streams

New Sources (Unstructured, Exploratory, Iterative): Web Logs; Social Data; Text Data: emails; Sensor data: images; RFID

Enterprise Integration


The Big Data Ecosystem: Interoperability is Key

Streaming Data

Traditional Warehouse

Data Warehouse

Analytics on Data at Rest

Analytics on Structured Data

Analytics on Data In-Motion

InfoSphere BigInsights

InfoSphere Streams

Traditional / Relational Data Sources

Non-Traditional / Non-Relational Data Sources

Internet-Scale Data Sets


On a Smarter Planet, technology innovation redefines industries

Trading

Traffic Control

Fraud Prevention

Law Enforcement


Netezza and Industry Models

● Industry strength of DW models plays to typical Netezza vertical approach

● Use Netezza as the basis for any Dimensional structures generated from traditional Data Warehouse models

● Enables models to be deployed to leverage the traditional Netezza Strengths

● Aligns with typical usage/topology for Netezza

● Generate DDL from IDA and customize Distribution clause to run in Netezza
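The last bullet can be sketched as a small post-processing step. The helper below is hypothetical and is not an IDA feature: it assumes the generated DDL is available as a plain CREATE TABLE string, and it appends a Netezza `DISTRIBUTE ON` clause naming the chosen distribution columns, falling back to `DISTRIBUTE ON RANDOM` when none are given. The table and column names in the usage example are made up.

```python
def add_distribution_clause(ddl, columns=None):
    """Append a Netezza distribution clause to a CREATE TABLE statement.

    ddl     : a single CREATE TABLE statement, with or without a trailing ';'
    columns : list of distribution column names; None or empty means
              random (round-robin) distribution
    This helper is a hypothetical illustration, not part of IDA or Netezza.
    """
    body = ddl.strip().rstrip(";").rstrip()
    if not body.upper().startswith("CREATE TABLE"):
        raise ValueError("expected a CREATE TABLE statement")
    if columns:
        clause = "DISTRIBUTE ON (" + ", ".join(columns) + ")"
    else:
        clause = "DISTRIBUTE ON RANDOM"
    return body + "\n" + clause + ";"
```

For instance, passing `"CREATE TABLE fact_sales (customer_id INTEGER, amount NUMERIC(12,2));"` with `["customer_id"]` yields the same statement followed by `DISTRIBUTE ON (customer_id);`. Choosing a distribution column that is commonly used in joins is the usual motivation for customizing this clause per table.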


Where Does a Data Warehouse Fit in the IT Environment?

Content

Structured Data

Analyze Integrate

Govern

Master Data

Data

Transactional & Collaborative Applications

Manage

Streaming Information

Business Analytic Applications

Streams

Big Data

Data Warehouses

External Information Sources

www

Quality

Lifecycle Management

Security & Privacy


Netezza DW environment


The Business Solution templates can be deployed directly onto a Netezza DW environment

Profitability

Risk Management

Regulatory Compliance

Data Sources

● The corporation chooses the structures required to address its specific business needs from the 145 pre-defined Business Solution Templates

● Parallel projects can select from different areas to ensure consistency of reporting across the enterprise


The BDW Business Solution templates can be deployed in a “Conformed Dimension” configuration, all on a Netezza DW environment

Profitability

Risk Management

Regulatory Compliance

Conformed Dimension Layer

Data Sources

● Different Reporting areas can share a “Conformed Dimension Layer”

● Ensures consistency of Dimensional structures such as “Customer”, “Product”, “Time” across the enterprise

● This means that a Financial Institution can build up a cross-enterprise dimensional data warehouse over time in small, business-focused, bite-sized chunks … all on Netezza!
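A conformed dimension layer can be illustrated with a minimal sketch, assuming two hypothetical reporting areas (profitability and risk) that share one customer dimension. The table names, column names and figures are invented for the example, and an in-memory SQLite database stands in for the warehouse. Because both fact tables join to the same dimension table, both reports group by an identical definition of "Customer".

```python
import sqlite3

# Allow-list of (fact table, measure) pairs; names are hypothetical.
ALLOWED_REPORTS = {("fact_profitability", "profit"), ("fact_risk", "exposure")}


def build_conformed_layer():
    """Two fact tables from different reporting areas share dim_customer."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE fact_profitability (customer_id INTEGER, profit REAL);
        CREATE TABLE fact_risk (customer_id INTEGER, exposure REAL);
        INSERT INTO dim_customer VALUES (1, 'Acme'), (2, 'Globex');
        INSERT INTO fact_profitability VALUES (1, 10.0), (2, 4.0), (1, 5.0);
        INSERT INTO fact_risk VALUES (1, 300.0), (2, 120.0);
    """)
    return conn


def report_by_customer(conn, fact_table, measure):
    """Sum a measure per customer through the shared dimension."""
    if (fact_table, measure) not in ALLOWED_REPORTS:
        raise ValueError("unknown fact table or measure")
    # Identifiers are interpolated only after the allow-list check above.
    sql = (
        f"SELECT c.name, SUM(f.{measure}) "
        f"FROM {fact_table} f "
        "JOIN dim_customer c ON c.customer_id = f.customer_id "
        "GROUP BY c.name ORDER BY c.name"
    )
    return conn.execute(sql).fetchall()
```

A new reporting area can be added later as another fact table that joins to the same `dim_customer`, which is how the warehouse can grow in small increments without the dimensions drifting apart.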


Netezza client base examples

Digital Media

Financial Services

Government

Health & Life Sciences

Retail / Consumer Products

Telecom

Other