data warehousing and business intelligence - the...

99
Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates. Oracle Data Warehousing and Business Intelligence - The Complete Story Andreas Katsaris Arisant, LLC

Upload: hoangkhanh

Post on 28-Apr-2018

225 views

Category:

Documents


7 download

TRANSCRIPT

Page 1: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

Oracle Data Warehousing and

Business Intelligence -

The Complete Story Andreas Katsaris

Arisant, LLC

Page 2: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

The complete picture…

Siebel CRM

Oracle EBS PeopleSoft SAP/R3

Other Sources

Oracle Data Integrator

E-LT Metadata E-LT Agent

Oracle BI Enterprise Data

Warehouse

Oracle BI Suite EE

Oracle BI Server

Oracle BI Presentation Server

Answers Interactive Dashboards Publisher Delivers

Bulk E-LT

Page 3: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

Agenda •  Warehouse definition

–  Why Bother? –  Life before Integrated warehousing and BI –  Terminology

•  Warehouse design aspects and considerations –  Methodology –  Dimensional Modeling –  Top Most powerful and underutilized database features

•  Oracle Business Intelligence (OBI) –  OBI Components and descriptions

•  Oracle Data Integrator (ODI) –  ETL Strategy –  Architecture

Page 4: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

Data Warehouse Mission

• Gather, integrate and reconcile operational, decision support, and external data

• Provide meaningful, accessible, consistent, and easy to understand business information to enterprise users

• Act as a single integrated source of data for processing information

Page 5: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

What is a Data Warehouse?

• a relational database designed for query and analysis

• contains historical data derived from transaction data, but it can include data from other sources

• enables an organization to consolidate data from several sources

Page 6: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

What is a Data Warehouse?

• Components of a warehouse environment – extraction, transformation, and loading (ETL)

engine – online analytical processing (OLAP) solution – client analysis tools/Data Mining – other applications that manage the delivery of

data to enterprise users

Page 7: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

Data Warehouse Methodology •  Methodology

–  Iterative/Agile process, avoid waterfall – ‘big bang’ approach –  Start with one Subject Area and one target user group

•  Scoping Phase -  Identify key players (business sponsors, stakeholders, users) -  Build and validate the business case

•  Show ROI benefits -  Support for the following types of functionality

-  Activities that cannot be done at all today, but should be -  Activities that are accomplished today by manually synthesizing

data from reports, file extracts, and other sources •  Evaluate candidate technologies for the ETL, data warehouse and front-

end tools –  Today we are confident that we can do all this with Oracle

technologies!

Page 8: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

Data Warehouse Architectures

• Data Warehouse Architecture (Basic) • Data Warehouse Architecture (with Staging

Area) • Data Warehouse Architecture (with Staging

Area and Data Marts)

Page 9: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

Data Warehouse Architecture (Staging Area and Data Marts)

End Users

Data Sources:Databases,flat files spreadsheets, XML …

Warehouse Staging Area

Extraction

Data Mart

Transformation & Load

Transformation & Load

KNOWLEDGE

INFORMATION DATA Managed

Reporting

Ad-Hoc Queries

Balanced Scorecards

BU

S. INTELLIG

ENC

E DELIVERY

Notification

Analysis

Dashboards

Page 10: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

Ultimate Goal • Timely, Accurate, Fast delivery of knowledge

Star Schema with

Mviews (rewrite, refresh) Star Schema

3NF

OLAP cubes

(rewrite, refresh)

Better Performance

Page 11: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

Top Most powerful and underutilized database features

•  Data Sourcing

•  ETL at database level

•  Maintenance

•  Performance

Page 12: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

Page 13: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

Data Sourcing -  Replication -  Transportable Tablespaces -  Streams -  CDC (Change Data Capture) -  External Tables (access to flat files) -  Data Pump -  SqlLoader (access to flat files)

Page 14: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

Replication

• Snapshot logs on source system • Snapshot technology can be extended to

produce/preserve exact row changes • Resistance from source system owners

– usually complain about performance impact – space requirements for snapshot logs

Page 15: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

Transportable Tablespaces

•  copying of datafiles (tablespaces in read only mode) • move both table and index data avoids index rebuilds • Cross platform support • When to consider:

–  Lots of source system changes –  Required tables can be placed in specific tablespace(s) –  No timestamps so incremental extraction not possible –  Other options not allowed (streams, replication, CDC)

Page 16: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

Streams •  routes published information to subscribed destinations •  only changes to desired objects are captured •  Architecture

–  Capture –  Staging/Propagation –  Consumption –  Parallel processing of log files

•  Non-intrusive –  No triggers or snapshot logs –  uses redo log/archive log

•  Custom Transformations possible •  Access via Oracle-supplied PL/SQL packages or Enterprise

Manager Console •  Golden Gate acquisition

Page 17: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

Change Data Capture •  simplifies the process of identifying changed data since the

last extraction (incremental extraction) •  architecture is based on publisher-subscriber model •  synchronous (part of transaction)

–  change data is captured via triggers and stored inside the database in change tables

•  asynchronous (not part of transaction) –  lightweight Oracle Streams application –  Changes extracted from the log files

•  Captured data made available to the target systems in a controlled manner (subscription window), using database views

Page 18: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

External Tables •  Files stored outside the database • Read only in 9i – Read/Write in 10g(CTAS) –

compressed and encrypted in 11g • DML/index creation not allowed • Can be read with SQL as if they were tables

– May be joined with database tables – Filtering allowed with WHERE clause

• Can be used to load staging area or act as a staging area themselves

Page 19: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

Data Pump •  Next generation export/import tools •  Various interfaces

–  expdp / impdp –  Web based GUI via Database Control –  DBMS_DATAPUMP

•  Jobs (exports and loads) are interruptible and resumable •  Parallelism •  Fine-grain object selection •  Allows data movement via db links!

–  Works in a pipelined fashion •  Compressed and encrypted in 11g

Page 20: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

• Resumable Operations • PL/SQL

– Direct Path Operations – Transformation and Loading functions – BULK operations

ETL at the database level

Page 21: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

Resumable Operations •  Instance level

–  init.ora: RESUMABLE_TIMEOUT = 3600 •  Session level

–  ALTER SESSION ENABLE RESUMABLE <TIMEOUT secs>; –  ALTER SESSION DISABLE RESUMABLE;

•  Used for: –  DDL

•  CREATE INDEX, CTAS … –  DML

•  INSERT/UPDATE

•  Detection: –  Look in DBA_RESUMABLE where STATUS='SUSPENDED'

Page 22: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

PL/SQL • Easy to write, flexible, fast deployment • Extraction/Loading using Direct Path

Operations –  CTAS operations with NOLOGGING –  INSERT … SELECT with APPEND and PARALLEL hints –  Ideal for incremental extractions (e.g. timestamp based)

• Transformations/Loading –  Pipelined functions, MERGE, Multitable INSERT, INSERT

FIRST, pivot, myriad of Analytical functions

Page 23: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

PL/SQL

• BULK operations – BULK COLLECT

• bulk binds with SELECT statements • fetch into user defined array or PL/SQL table

– FORALL • bulk binds with INSERT, UPDATE, and DELETE

statements – RETURNING INTO

• Retrieving DML Results into a Collection

Page 24: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

DML Error Logging •  Rows causing errors are loaded in an error log table •  Alternative to using PL/SQL with the SAVE EXCEPTIONS

clause BEGIN DBMS_ERRLOG.CREATE_ERROR_LOG('SALES_FACT'); END; /

INSERT /*+ APPEND */ INTO sales_fact SELECT * FROM <source tables> LOG ERRORS REJECT LIMIT UNLIMITED ;

Page 25: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

• Maintenance – DML/DDL Operations using :

• Direct Path Operations (NOLOGGING, append hint)

• Parallelism • Partitioning • ILM (Oracle ILM tool) • Advanced Compression

Maintenance

Page 26: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

Performance • Dimensional modeling/star schemas • Partitioning • Parallelism • Deferred/disabled constraints •  Index Management (unuse/rebuilt) • Advanced compression (test first) • Materialized Views • OLAP Cubes/Cube Organized Materialized Views • Data Caching

Page 27: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

Star Schema •  FACT table

–  ‘skinny’ and large –  Contains numeric measurements and FKs to dimensions –  FK columns – bitmap indexes –  Consider partitioning and archiving aspects

•  Dimensions –  Denormalized – small and ‘wide’ –  Contain business attributes relating to the measurements

•  Avoid snowflake if possible

Page 28: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

Star Schema

FACT SALES

DIMENSION 1 REGION

DIMENSION 2 PRODUCT

DIMENSION 4 ENTITY

DIMENSION 3 TIME

•  Intuitive to end-users •  Mirrors the way people think about the business •  High Query Performance

Page 29: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

Direct Path Operations • NOLOGGING option, APPEND hint • CTAS … NOLOGGING •  INSERT /*+ append */ INTO … SELECT …

– exclusive locks on the table (no parallel streams, use PARALLEL hint instead)

– No enabled FKs (hint ignored) –  Index maintenance deferred to end to direct

path insert operation • CREATE INDEX …. NOLOGGING;

Page 30: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

Parallelism • Queries requiring large table scans, joins, or

partitioned index scans • Creation of large indexes • Creation of large tables (including

materialized views) • Bulk inserts, updates, merges, and deletes • Understand related init.ora parms and set

appropriately

Page 31: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

Parallelism •  Types:

–  Parallel Query –  Parallel DDL –  Parallel DML

•  Implementation: –  via a hint

/*+ PARALLEL(<table name>,<degree of parallelism>) */

–  at the object definition level • CREATE/ALTER TABLE finance_trans PARALLEL 4; • CREATE/ALTER INDEX … PARALLEL 2;

Page 32: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

Partitioning • Goal

–  Rolling window operations –  Increased performance

•  Types –  Range Partitioning –  Hash Partitioning –  List Partitioning –  Composite Partitioning

•  FACT tables are candidates for partitioning

Page 33: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

Partitioning • Partition Pruning (a.k.a partition elimination)

–  evaluate WHERE clause to eliminate unneeded partitions

–  Reduces unnecessary I/O –  Improved query performance

• Partition operations to facilitate warehouse loads –  Load multiple partitions in parallel –  Load stand alone table - EXCHANGE partition –  Index maintenance at the partition level –  Archive old data using EXCHANGE partition

Page 34: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

Partitioning- 11g enhancements • Partition Advisor • New composite combinations

–  list/range, range/range, list/hash, list/list •  Interval Partitioning

–  Automatic creation of range-based partitions • REF Partitioning

–  Partition detail table based on the master-table key • Virtual-Column Based Partitioning

–  Partition based on an expression

Page 35: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

Interval Partitioning • Minimizes periodic partition maintenance (No need to

create new partitions) • Partition segments allocated as soon as new data

arrives • Local indexes are created and maintained as well • Requires at least one range partition • Range key value determines the range high point • Partitioning key can only be a single column, and either

DATE or NUMBER datatype

CREATE TABLE sales(…) PARTITION BY RANGE (sales_date) INTERVAL(NUMTOYMINTERVAL(1, 'MONTH')) ( PARTITION p1 VALUES LESS THAN (TO_DATE('1-2-2006', 'DD-MM-YYYY')) );

Page 36: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

REF Partitioning •  Related tables benefit from same partitioning strategy •  Partition a table based on a column in another table •  Partition key inherited via FK/PK relationship •  Avoids having the partition key on the child table

CREATE TABLE orders /* parent table */ ( order_id /* PK */…..order_date /* partition key*/… …CONSTRAINT orders_pk PRIMARY KEY(order_id) ) PARTITION BY RANGE(order_date)…

CREATE TABLE order_items /* child table */ ( order_id /* FK */… CONSTRAINT order_items_fk ..FOREIGN KEY(order_id) REFERENCES orders(order_id) ) PARTITION BY REFERENCE (order_items_fk);

Page 37: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

Virtual-Column Based Partitioning • Add expression based virtual (meta data) columns • Virtual column used as the partition key

CREATE TABLE sales ( sales_num NUMBER(12), rep_name VARCHAR2(20), sales_code NUMBER(2) GENERATED ALWAYS AS TO_NUMBER (SUBSTR(TO_CHAR(sales_num),1,2)) ) PARTITION BY LIST (sales_code)

Page 38: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

Advanced Compression •  Table Compression

–  Table Scan Performance: 2x faster –  Storage Savings: 2x smaller –  DML Performance: 5% slower

CREATE TABLE SALES_FACT (…) COMPRESS FOR ALL OPERATIONS;

•  RMAN Compression –  ~40% faster than compressed backups in 10g –  Slightly better compression ratio than in 10g

RMAN> CONFIGURE COMPRESSION ALGORITHM ‘zlib’; RMAN> backup as COMPRESSED BACKUPSET database archivelog all;

•  Data Pump Compression expdp hr FULL=y DUMPFILE=dpump_dir:full.dmp COMPRESS;

Page 39: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

Advanced Compression

Page 40: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

ILM – Information Lifecycle Management

• Why Bother – Compliance – Performance – Cost – Data Maintenance

5% - most active

35% - less active

60% - historical

Page 41: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

ILM – Information Lifecycle Management

• Implementation of different tiers of storage • Consider Oracle ILM Assistant (free!)

– Leverages Oracle Partitioning – Uses Lifecycle Definitions – Calculates storage costs & savings – Simulates the impact of partitioning on a table – Advises how to partition a table – Generates scripts to move data when required

Page 42: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

SQL Query Result Cache

• Caching of query results or PL/SQL function calls • DML/DDL against dependent database objects

invalidates cache • Candidate queries

–  access many, many rows –  return few rows (small result set) –  executed many times

Page 43: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

SQL Query Result Cache •  result_cache_mode init.ora parameter

–  AUTO (optimizer uses repetitive executions to determine if query will be cached)

–  MANUAL (need use /*+ RESULT_CACHE */ hint in queries) –  FORCE (All results are stored in cache)

•  result_cache_max_size init.ora parameter –  default is dependent on other memory settings –  (0.25% of memory_target or 0.5% of sga_target or 1% of

shared_pool_size) –  0 disables result cache –  never >75% of shared pool (built-in restriction)

Page 44: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

Optimizing Querying

• Star Queries • Materialized Views • Cube Organized Materialized Views • Index Merge • Bitmap Join Indexes

Page 45: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

Star Queries •  A join between a fact table and a number of dimension

tables •  Require the existence of bitmap indexes on FK columns on

the FACT table •  retrieve exactly the necessary rows from the fact table using n merged bitmap indexes

•  join this result set to the dimension tables •  Implementation:

–  star_transformation_enabled=true|false|temp_disable (system or session modifiable)

–  hints for ‘stubborn’ queries •  /*+ STAR_TRANSFORMATION */ •  /*+ FACT */

Page 46: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

Materialized Views •  Pre-aggregation of data •  Changes to the underlying tables reflected in mview

–  Fast/Complete refresh –  Refresh can be on demand or scheduled

•  Can be partitioned and indexed like a table •  Query Rewrite

–  /*+ REWRITE_OR_ERROR */ hint –new in 10g (ORA-30393) –  DBMS_MVIEW.EXPLAIN_REWRITE() to see why rewrite failed –  $ORACLE_HOME/rdbms/admin/utlxrw.sql for REWRITE_TABLE (explanation)

•  DBMS_ADVISOR package –  APIs to recommend materialized views and indexes –  DBMS_ADVISOR.TUNE_MVIEW()

•  API to tune existing materialized views and indexes •  Tuned version in USER_TUNE_MVIEW

Page 47: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

Materialized Views • Query rewrite makes mviews transparent to end-users

Base tables Snapshot logs

INSERT/

UPDATE/

DELETE

Query (report, ETL code, etc)

Create materialized view … enable

query rewrite …

Select ….

Without query rewrite

With query rewrite

Refresh mode ?

Page 48: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

Cubes and Cube Organized MViews

• Query Options – Query star schema (fact+dimensions) – Build mviews on star schema

• Too many combinations

– Build cubes

Page 49: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

Cubes and Cube Organized MViews

• Stored in special db areas called Analytic Workspaces (which are stored in BLOBs)

• Manipulated via Analytic Workspaces Manager tool • 11g- Best of both worlds

–  Rewrite and refresh features of regular MVs –  Performance benefits of OLAP cubes –  Expose OLAP cube as a relational object access via SQL –  CUBE_TABLE function searches cube using SQL

Page 50: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

Cubes and Cube Organized MViews

SQL> explain plan for 2 select * from table(cube_table('GLOBAL.PRICE_CUBE'));

PLAN_TABLE_OUTPUT --------------------------------------------------------------------------------------- Plan hash value: 3184667476

-------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 2000 | 195K| 29 (0)| 00:00:01 | | 1 | CUBE SCAN PARTIAL OUTER| PRICE_CUBE | 2000 | 195K| 29 (0)| 00:00:01 | --------------------------------------------------------------------------------------

•  Query cube as if it was a relational object

• Refresh using DBMS_MVIEW.REFRESH

Page 51: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

Index Merge

• Merge two separate indexes • Avoid creating a new concatenated index • Implementation via a hint

–  /*+ index_join(table_name,index1, index2) */

Page 52: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

Index Merge

SQL> select count(*) from FINANCE_TRANS where COMPANY_KEY between 1 and 1000 and CHRGE_CODE

= 'BASIC';

Elapsed: 00:03:44.03

Execution Plan

----------------------------------------------------------

0 SELECT STATEMENT Optimizer=CHOOSE (Cost=5147 Card=1 Bytes=11 )

1 0 SORT (AGGREGATE)

2 1 TABLE ACCESS (BY INDEX ROWID) OF ‘FINANCE_TRANS' (Cost=5147 Card=10004 Bytes=110044)

3 2 INDEX (RANGE SCAN) OF 'FINANCE_TRANS_IX2' (NON-UNIQUE) (Cost=51 Card=100035)

Page 53: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

Index Merge

SQL> select /*+ index_join(FINANCE_TRANS, FINANCE_TRANS _IX2,TEMP1) */ count(*) from FINANCE_TRANS where COMPANY_KEY between 1 and 1000 and CHRGE_CODE='BASIC';

Elapsed: 00:00:01.73

Execution Plan

----------------------------------------------------------

0 SELECT STATEMENT Optimizer=CHOOSE (Cost=2037 Card=1 Bytes=11)

1 0 SORT (AGGREGATE)

2 1 VIEW OF 'index$_join$_001' (Cost=2037 Card=10004 Bytes=110044)

3 2 HASH JOIN

4 3 INDEX (RANGE SCAN) OF ' FINANCE_TRANS _IX2' (NON-UN IQUE) (Cost=1811 Card=10004 Bytes=110044)

5 3 INDEX (RANGE SCAN) OF 'TEMP1' (NON-UNIQUE) (Cost=1811 Card=10004 Bytes=110044)

Page 54: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

Bitmap Join Indexes •  An index build on a table using columns from another table(s)! •  Index contains the data to support a join query •  Allows the query to retrieve the data from the index rather than

referencing the join tables

•  Good for query performance, bad for DML operations –  Consider drop/recreate

CREATE BITMAP INDEX temp_bixj1 ON BUDGET_FACT

(LEDGER_DIM.ledger_name)

FROM BUDGET_FACT, LEDGER_DIM

WHERE LEDGER_DIM.ledger_key = BUDGET_FACT.ledger_key

NOLOGGING

TABLESPACE INDEX_1;

Page 55: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

Bitmap Join Indexes -- Without bitmap join index

SQL> SELECT D.ledger_name, SUM(F.base_amount) FROM BUDGET_FACT F, LEDGER_DIM D

WHERE D.ledger_key = F.ledger_key AND D.ledger_name = ‘Internal Spending’

GROUP BY D.ledger_name;

Elapsed: 00:00:07.58 Execution Plan

0 SELECT STATEMENT Optimizer=CHOOSE (Cost=4667 Card=1 Bytes=29 )

1 0 SORT (GROUP BY NOSORT) (Cost=4667 Card=1 Bytes=29)

2 1 HASH JOIN (Cost=4667 Card=1577907 Bytes=45759303)

3 2 VIEW OF 'index$_join$_002' (Cost=97 Card=6125 Bytes=128625) 4 3 HASH JOIN

5 4 INDEX (RANGE SCAN) OF ' LEDGER_DIM_IX4' (NON-UNIQUE ) (Cost=6 Card=6125 Bytes=128625)

6 4 INDEX (FAST FULL SCAN) OF 'LEDGER_DIM_PK' (UNIQUE)(Cost=6 Card=6125 Bytes=128625)

7 2 TABLE ACCESS (FULL) OF ‘BUDGET_FACT' (Cost=2631 Card=8092591 Bytes=64740728)

Page 56: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

Bitmap Join Indexes -- Add bitmap join index TEMP_BIXJ1

Elapsed: 00:00:00.72

Execution Plan

----------------------------------------------------------

0 SELECT STATEMENT Optimizer=CHOOSE (Cost=1875 Card=1 Bytes=29 )

1 0 SORT (GROUP BY NOSORT) (Cost=1875 Card=1 Bytes=29)

2 1 HASH JOIN (Cost=1875 Card=1577907 Bytes=45759303)

3 2 VIEW OF 'index$_join$_002' (Cost=97 Card=6125 Bytes=128625)

4 3 HASH JOIN

5 4 INDEX (RANGE SCAN) OF ‘LEDGER_DIM_IX4' (NON-UNIQUE) (Cost=6 Card=6125 Bytes=128625)

6 4 INDEX (FAST FULL SCAN) OF ‘LEDGER_DIM_PK' (UNIQUE)(Cost=6 Card=6125 Bytes=128625)

7 2 TABLE ACCESS (BY INDEX ROWID) OF ‘BUDGET_FACT' (Cost=1712 Card=8092591 Bytes=64740728)

8 7 BITMAP CONVERSION (TO ROWIDS)

9 8 BITMAP INDEX (SINGLE VALUE) OF 'TEMP_BIXJ1'

Page 57: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

Index Management •  Why is it a big deal in a warehouse? •  Index maintenance during DML

–  Benchmark loads with/without indexes present –  Specific attention to bitmap index maintenance –  Consider invalidate/rebuild or drop/recreate –  Partition level maintenance provides granular control –  Provide APIs (stored procs) to ETL application

•  Virtual Indexes! – undocumented feature –  Not intended for standalone usage –  Part of Tuning Pack’s Virtual index Wizard –  Allows CBO to evaluate a new index without having to create it!

•  Index monitoring (watch out from DBMS_STATS)

Page 58: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

Index Management - Case Study – Loading in non partitioned table

• 1.2 billion existing records and 3 indexes • Load 100 million records • Last 30% of records require some lookups on first

70% of loaded records (1 of the indexes is needed)

– Slow load performance –  Indexes can be dropped to improve loading

• Need to be recreated • Not available for other queries

Page 59: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

Index Management - Case Study

No partitioning

Page 60: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

Index Management - Case Study • The solution

– Partition the table and indexes –  Invalidate specific index partitions (mark

unusable) • No inline index maintenance while data is loading • Rebuild required index partition to help last part of the

load (granular control) – Rebuild remaining index partitions – Other partitions and related indexes available for

querying while loading new data

Page 61: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

Unusable partitioned Indexes: No inline index maintenance

Usable partitioned Indexes: Rebuilt by the application using stored procedures after a load has been completed

Index Management - Case Study Partitioned table and indexes

P2 P3 P4 P5 P1 P6

Page 62: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

Constraint Management • Do I need foreign keys in a warehouse?

– My code takes care of data integrity –  Our data is clean

•  Trace to see what happens while loading • Consider /*+ append */ hint behavior • Consider some of these options

–  Deferred constraints • check that constraint is satisfied only at commit time • Useful when loading in no particular order

–  Disable/enable, validate/novalidate –  Enable in parallel!

• Metalink Note:124848.1

Page 63: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

Statistics Management • Be careful what/when you analyze • Manage statistics on ‘big’ tables

–  Do I really need to analyze all the tables? –  Determine frequency/timing, estimate sample –  Consider ‘faking’ statistics –  Consider providing APIs to ETL application

• Ability to analyze before/after certain loads

•  Index monitoring – Watch out from dbms_stats.gather_table_stats and

dbms_stats.gather_index_stats behavior • Procedures mark analyzed indexes as USED!

Page 64: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

Oracle Business Intelligence EE

Page 65: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

From this…

Page 66: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

To this…

Page 67: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

Gartner Magic Quadrant for BI Platforms 2008 2009

Page 68: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

Traditional, non-Integrated Solution

Mining Engine

- Analysis

OLAP Engine

-Analytic Apps

Reporting Engine

Portal

ETL

ETL

Page 69: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

N0n-Integrated Solution • Siloed reports, multiple islands of information • No knowledge sharing • Duplication of effort and reports • Security issues • Different formats and tools, local data copies • Access and distribution is problematic

Page 70: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

Oracle’s Approach

• Integrated BI Database –  BI, Data Mining and OLAP functions in the database

• Integrated BI Tools – Common BI Technology Platform

• Integrated Analytic Applications –  Industry Specific pre-build Apps

Page 71: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

Comprehensive DW and BI stack (source - Oracle)

Oracle OLAP Oracle Data Mining Oracle Partitioning OWB/ODI

Oracle BI Suite Standard Edition Oracle BI Suite Enterprise Edition

Oracle BI Applications

Oracle Daily Business Intelligence

Oracle Corporate Performance Management

PeopleSoft Enterprise Performance Management

Oracle Real-Time Decisions

Page 72: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

Pipeline Analysis

Triangulated Forecasting

Sales Team Effectiveness

Up-sell / Cross-sell

Discounting Analysis

Lead Conversion

Receivables / Payables Analysis

Customer Profitability

Product Profitability

Regulatory Compliance

Expense Management

Cash Flow Analysis

Churn Propensity

Customer Satisfaction

Resolution Rates

Service Rep Effectiveness

Service Cost Analysis

Service Trends

Campaign Scorecard

Response Rates

Product Propensity

Loyalty and Attrition

Market Basket Analysis

Campaign ROI

Supplier Performance

Inventory Analysis

Procurement Cycle Times

Inventory Availability

Employee Expenses

BOM Analysis

Employee Productivity

Compensation Analysis

Compliance Reporting

Workforce Profile

Turnover Trends

Return on Human Capital

Oracle BI Applications (source - Oracle)

Sales Analytics

Financial Analytics

Workforce Analytics

Supply Chain Analytics

Service & Contact Center

Analytics Marketing Analytics

Travel & Trans

Auto Comms. & Media

Complex Mfg.

Consumer Sector

Energy Financial Services

High Tech

Insurance & Health

Life Sciences

Public Sector

Other Operational & Analytic Sources Prebuilt adapters: Siebel Oracle SAP PeopleSoft

Page 73: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

OBI Apps (source - Oracle)

Page 74: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

OBI Suite Offerings Enterprise Edition Plus (OBI EE Plus) •  BI Server •  BI Answers •  BI Interactive Dashboards •  BI Publisher (a.k.a XML Publisher) •  BI Delivers •  BI Disconnected Analytics •  MS Office Add-In •  Hyperion Source: http://www.oracle.com/technology/products/bi/enterprise-edition.html

Page 75: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

OBIEE Answers –Ad Hoc Querying

Page 76: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

OBIEE Interactive Dashboards

Page 77: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

BI Publisher – “Pixel-Perfect” Reporting

Page 78: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

BI Server Overview •  Integrates information from various sources

–  Native RDBMS support for Oracle, SQL Server, DB2 and Teradata •  Sophisticated data access, aggregation, and calculation

engine •  Three separate layers

–  Physical –  Business Model –  Presentation

•  Mapping of physical data schemas (tables & joins) to a logical business model

•  Security services •  Represent multiple physical data sources as a single,

simplified data structure to end user tools

Page 79: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

BI Server Overview – Admin Tool

Page 80: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

BI Server – Physical Layer

• exact depiction of the target physical objects

• mapping of physical data schemas (tables, joins) to a logical business model

Page 81: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

•  Layer of abstraction that sits on top of the physical data –  Simplifies complex database structures

•  ‘firewall’ between users and the mechanics of the physical data access layer

•  Drill-Paths –  e.g. Year->Qtr->Month->Week->Day

•  Derived Business Measures or Calculations –  e.g. support for time-series analysis, moving

averages, weighted averages, rolling sums, cumulative calculations, etc.

BI Server – Business Model Layer

Customer

Page 82: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

• Sits on top of Business Model • Can build many views into the

Business Model: – simplified Presentation for

end-users – complex Presentation for

power-users/report writers (complex calculations, etc.)

BI Server – Presentation Layer

Page 83: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

Oracle Data Integrator (ODI) •  Sunopsis Acquisition •  ‘Oracle friendly’ architecture (leverages RDBMS) •  Data Movement and Transformation from

Multiple Sources to Heterogeneous Targets •  Key Differentiators

–  Transformations leverage RDBMS –  Declarative Design (automatically generates the Data

Flow whatever the sources and target DB) –  Knowledge Modules

Page 84: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

Oracle Data Integrator Architecture •  Java design-time environment

–  Runs on any platform –  Thin client for browsing

Metadata •  Java runtime environment

–  Runs on any platform –  Orchestrates the execution

of data flows •  Metadata repository

–  Pluggable on many RDBMS –  Ready for deployment –  Modular and extensible

metadata

Design-Time

Metadata Management

Runtime

Agent

Data Flow Conductor

Service Interfaces and Developer APIs

User Interfaces

Thin Client

Data Flow Generator

Knowledge Module Interpreter

Knowledge Modules

Master Repository

Work Repositories

Runtime Repositories

Data Flow Generator

Runtime Session

Interpreter

Data Flow

Operator

Designer

Page 85: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

Terminology •  ETL/ELT projects are designed in the Designer tool •  Transformations in ODI are defined in objects called Interfaces •  Interfaces are stored into Projects •  Interfaces are sequenced in a Package that will be ultimately

compiled into a Scenario for production execution

Page 86: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

Interface •  An Interface will define

–  Where the data is sent to (the Target) –  Where the data is coming from (the Sources) –  How the data is transformed from the Source format to the Target

format (the Mappings) –  How the data is physically transferred (the path taken) from the

sources to the target (the data Flow) •  Source and target are defined using Metadata imported from the

databases and other systems •  Mappings are expressed in SQL •  Flows are defined in Templates called Knowledge Modules (KMs) •  An interface may have more than one source but only populates a

single target •  To populate several targets, you need several interfaces

Page 87: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

Staging Area • A separate, dedicated area in an RDBMS where ODI creates its temporary objects and executes some of the transformation rules • By default, ODI sets the staging area on the target data server

-  can be on the source or a 3rd RDBMS or the Sunopsis Memory Engine - cannot be placed on non relational systems (Flat files, ESBs, etc.)

Page 88: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

Metadata • ODI is strongly based on the relational paradigm •  In ODI, data are handled through tabular structures

defined as datastores • Datastores are used for all type of “real” data

structures: database tables, flat files, XML files, JMS messages, LDAP trees, etc

•  The definition of these datastores (the metadata) will be used in the tool to design the data integration processes.

• Defining the datastores is the starting point of any data integration project

Page 89: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

Two Methods for Reverse Engineering •  Standard reverse-engineering

–  Uses JDBC connectivity features to retrieve metadata, then writes it to the ODI repository

•  Customized reverse-engineering –  Read metadata from the application/database system

repository, then writes these metadata in the ODI repository

–  Uses a technology-specific strategy, implemented in a Reverse-engineering Knowledge Module (RKM)

Page 90: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

Knowledge Modules - KMs •  Knowledge Modules are templates of code that define

integration patterns and their implementation •  They are usually written to follow Data Integration best

practices, but can be adapted and modified for project specific requirements

•  Example steps of a KM: •  When loading data from a heterogeneous environment:

1.  create a staging table 2.  to load the data, use SQL loader 3.  create the CONTROL file for SQL loader 4.  when finished with the integration, remove the CONTROL file and

the staging table

Page 91: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

•  A Knowledge Module is made of steps

•  Each step has a name and a template for the code to be generated

Page 92: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

Journalize Read from CDC

Source

Load From Sources to

Staging

Check Constraints before

Load

Integrate Transform and Move

to Targets

Service Expose Data and Transformation

Services

Reverse Engineer Metadata

Reverse

Journalize

Load

Check

Integrate Services

Pluggable Knowledge Modules Architecture

CDC

Sources

Staging Tables

Error Tables

Target Tables

WS

WS W

S

SAP/R3

Siebel

Log Miner

DB2 Journals

SQL Server Triggers

Oracle DBLink

DB2 Exp/Imp

JMS Queues Check MS Excel

Check Sybase

Oracle SQL*Loader

TPump/ Multiload

Type II SCD

Oracle Merge

Siebel EIM Schema

Oracle Web Services

DB2 Web Services

Sample out-of-the-box Knowledge Modules

Knowledge Modules Hot-Pluggable: Modular, Flexible, Extensible

Page 93: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

KM Types KM Type Description

CKM Check Checks data in a datastore or during an integration process.

IKM Integration Uses a given strategy to populate the target datastore from the staging area.

LKM Loading Assembles data from source datastores to the staging area.

RKM Reverse-engineering

Retrieves the structure of a data model from a database. Only needed for customized reverse-engineering.

JKM Journalizing Sets up a system for Changed Data Capture to reduce the amount of data that needs to be processed.

Inte

rface

s Mo

dels

SKM Web Services Defines the code that will be generated to create Data Web Services (Exposing data as a web service)

Page 94: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

Target (Oracle)

Staging Area

Staging Area on Target-Example ETL

Source (Sybase)

ORDERS

LINES

CORRECTIONS File

SALES C$_0

C$_1

I$_SALES

1

2

3

5

Extract/Join/Transform

Extract/Transform

Join/Transform

Transform & Integrate

Page 95: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

Target (Oracle)

Staging Area Source (Sybase)

ORDERS

LINES

CORRECTIONS File

SALES C$_0

C$_1

I$_SALES

LKM_1

LKM SQL to Oracle

LKM_2

LKM File to Oracle (SQLLDR)

IKM_1

IKM Oracle Incremental Update

IKM_1

IKM Oracle Incremental Update

Staging Area on Target-Example ETL

Page 96: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

Requirements

• To run an interface, you need at least the following: – A target table – An Integration Knowledge Module – A Loading Knowledge Module if there is a remote

source

Page 97: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

Code Generation •  When we ask ODI to Execute the transformations, ODI will

generate the necessary code for the execution (usually SQL code)

•  The code is stored in the repository •  The execution details are available in the Operator

Interface: –  Statistics about the jobs (duration, number of records

processed, inserted, updated, deleted) –  Actual code that was generated and executed by the

database –  Error codes and error messages returned by the

databases if any

Page 98: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

ODI Design-Time Environment ODI Runtime Environment

Development Development Servers and Applications

Production Production Servers and Applications

User Interfaces

Overview: 6 steps to Production 1.  Retrieve/Enrich metadata 2.  Design transformations 3.  Orchestrate data flows

4.  Generate/Deploy data flows 5.  Monitor executions 6.  Analyze impact / data lineage

Agent Data Flow Conductor

CRM

Legacy

ERP

Data Warehouse

Files / XML

User Interfaces

Administrators

Designers

ESB

Design-time Repositories Design-time

Repositories

Agent Data Flow Conductor

CRM

Legacy

ERP

Data Warehouse

Files / XML

Operator

Metadata Navigator

ESB

Runtime Repository

Page 99: Data Warehousing and Business Intelligence - The …arisant.com/wp-content/uploads/Data-Warehousing-and-Business... · Oracle Data Warehousing and Business Intelligence - The Complete

Copyright © 2009 Arisant LLC. Oracle, JD Edwards, and PeopleSoft are registered trademarks of Oracle and/or its affiliates.

Q&A • Thank you for attending •  If you have follow-up questions I will be here for the rest of the day or can be contacted by email [email protected]