TRANSCRIPT
Golden Gate Autoconfigurator enables an agile approach to feeding data lakes
Hannover, 2015-03-10
Peter Birwe
Dr. Dierk Hahn
Copyright © Capgemini 2012. All Rights Reserved
3 2015_03_10_ORACLE_DWH_Golden Gate Autoconfigurator_V2.pptx
Agenda
Capgemini
New Architectures in the Age of Big Data
Stage Architecture for Applications
Synchronization using ORACLE Golden Gate
Industrialized Connection of Applications to a Stage
Lessons Learned
Capgemini successfully uses and expands its global Business Information Management (BIM) service line
Capgemini's global reach, with operations in more than 44 countries and a dedicated focus on Business Information Management backed by more than 8,000 experienced employees, enables comprehensive end-to-end services, including in cutting-edge and innovative fields.
With its global Intelligent Enterprise method, Capgemini has a uniquely integrated approach to describing far-reaching information strategies. Industry-specific BIM solution building blocks underline Capgemini's in-depth expertise.
With its Rightshore® approach, Capgemini has a market-leading framework for developing and operating BIM solutions. A key component is the BIM Center of Excellence in India with more than 2,000 employees.
Capgemini maintains long-standing partnerships with all major BI software vendors in order to develop the optimal solution for each customer's needs.
BIM: a strong global unit!
Canada
United States
Mexico
Brazil
Argentina
Morocco
Australia
China
India
Netherlands
Poland
Spain
Sweden
Switzerland
UK
Austria
Finland
France
Italy
Germany
Norway
[Partner awards; the vendor logos are not part of the transcript]
Global Partner
Outstanding Collaboration Award 2011
Most valuable partner award 2010
Innovation award 2009 & 2010
Epic award for Contribution Revenue 2011 & 2012
Diamond Partner
BI Specialized Partner
Global Applications Partner of the Year Award 2012
IIG President's award for Customer Satisfaction 2012
Software's Most Innovative Alliance Partner of the Year 2011
Pinnacle Awards: 2 in 2012, 1 in 2011, 2 in 2010
Services Partner of the Year award 2012
Trends in data warehousing and business intelligence influence DWH architectures
In-memory databases
NoSQL / Big Data
Agile BI
Appliances
Cloud BI
BI as a self-service
Column-based databases
Sandboxing analytics
Social media analytics
DWH consolidation
Industrialization
Data Lake
Perspectives of different stakeholders on the BIM reference architecture
Stakeholder: Perspective
Business: Business view
IT development: Technical view
Operations: Infrastructure view
Our reference architecture describes the essential BI and BDA services independent from vendors
[Figure: BIM reference architecture, from data sources to end users. Recoverable labels, grouped by layer:]
Operational applications and processes
Data sources: operational IT systems (package-based apps, custom-built apps); reference and market data (D&B, S&P, …); informal data; Big Data (sensor logs, social feeds, clickstream, server logs, audio, video, image, documents, …); structured and unstructured data
Source interfaces: IDOC, File, XML, SQL, ODBC, JDBC, BAPI
Data integration: extraction, transformation, data quality, load; Enterprise Service Bus; data streaming; data virtualization; complex event processing; text processing / NLP
Data management: staging area, operational data store, data warehouse, data marts, master data hubs (customers, products, …), file system, NoSQL, Hive, MR, R, data streams
Access interfaces: SQL, MDX, xQuery, PMML, ODBO, XML/A, BAPI, ODBC, JDBC
Information delivery: BI frontend tools (reporting, analysis, advanced analytics, alerting, dashboarding, scorecarding, planning, data exploration); BI applications for LOBs and verticals (risk management, compliance, performance management, relationship management); BI platform (portal, search, collaboration, personalization, semantic access layer, security, scheduling, SDK, (web) services)
End users and roles: data steward, author, recipient, administrator
Cross-cutting services: metadata management (metadata, CWMI), process management, system management, privacy and security
Application-specific data provisioning leads to many necessary changes to cover reporting demands
[Figure: Source 1, Source 2 and Source 3 are connected to DWH 1 and Lake 2 through DB copies, a Rep-DB, a Temp system, CDC and streams; each path is subject to agile changes]
Real situation
Source systems enforce different options for data extraction
Temporary systems (often leftover legacy systems) are used as data sources
Change Data Capture (CDC) is employed at different locations, using multiple and sometimes performance-intensive methods
Report changes have an impact on multiple source and temp systems and procedures
Administration know-how is dispersed
The central synchronization stage provides a unique data source with a common CDC area
[Figure: Source 1, Source 2 and Source 3 are synchronized (Sync) with real-time data into the Sync Stage, which provides CDC and a Central Archive and feeds DWH 1 and Lake 2 in an agile/industrialized way]
Unique stage
Data is centralized independently of the application
Synchronization is the task of the stage
The modern stage serves as a central and cost-effective data source for subsequent systems
Change Data Capture (CDC) has a strict definition and is optimized for performance
Integration of new source systems is very fast
Administration know-how is centralized
A central archive becomes possible
License costs are bundled on the stage
A performant and unique CDC area is provided for all applications
[Figure: OGG feeds a Replication Area and a CDC Area; operations shown: Insert, Update, Delete]
CDC out of the box
All data entries (REPL and CDC) receive an additional timestamp and source information
REPL represents a selected online synchronization of the source system
CDC only inserts all data changes (very fast)
The operation flag shows the original data change and allows filtering and different handling in subsequent systems
Equal treatment for all applications
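The difference between the two areas can be illustrated with a small sketch. It is not taken from the presentation: the field names `op`, `load_ts` and `source`, and the flag values "I", "U" and "D", are assumptions chosen for illustration. The REPL table keeps a 1:1 image of the source, while the CDC table only ever receives inserts, one per change, each tagged with the operation flag.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Change:
    op: str      # assumed flag values: "I" (insert), "U" (update), "D" (delete)
    key: int     # primary key of the source row
    row: dict    # column values after the change (empty for deletes)
    source: str  # name of the source system


def apply_repl(repl: dict, change: Change, load_ts: datetime) -> None:
    """REPL area: keep a 1:1 image of the source table."""
    if change.op == "D":
        repl.pop(change.key, None)
    else:
        repl[change.key] = {**change.row, "load_ts": load_ts, "source": change.source}


def apply_cdc(cdc: list, change: Change, load_ts: datetime) -> None:
    """CDC area: every change becomes a new row, tagged with its operation flag."""
    cdc.append({"key": change.key, **change.row, "op": change.op,
                "load_ts": load_ts, "source": change.source})


changes = [
    Change("I", 1, {"name": "Alice"}, "Source 1"),
    Change("U", 1, {"name": "Alicia"}, "Source 1"),
    Change("I", 2, {"name": "Bob"}, "Source 1"),
    Change("D", 2, {}, "Source 1"),
]

repl_table: dict = {}
cdc_table: list = []
for change in changes:
    now = datetime.now(timezone.utc)
    apply_repl(repl_table, change, now)
    apply_cdc(cdc_table, change, now)

# A subsequent system can filter on the operation flag, e.g. to handle deletes separately.
deleted_keys = [entry["key"] for entry in cdc_table if entry["op"] == "D"]
```

After the four changes the REPL table holds only the current state of row 1, whereas the CDC table holds the full history of all four operations.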
Oracle Golden Gate *
[Figure: log-based changed data flows from Oracle & non-Oracle databases, legacy systems and a message bus through Oracle Golden Gate to the targets below]
Use cases shown: Zero Downtime Upgrade & Migration; Query & Report Offloading; Data Synchronization within the Enterprise; Real-time BI, Operational Reporting, MDM; Event Driven Architecture, SOA; High Availability / Disaster Recovery
Targets shown: New DB/HW/OS/APP; Fully Active Distributed DB; Reporting Database; Data Warehouse; ODS; Data Integrator; Message Bus; Global Data Centers
The customer decided to use OGG synchronization for data transfer to its corporate DWH.
No application service should be impacted.
Low-Impact, Real-Time Data Integration & Transactional Replication
*Source: ORACLE PRESENTATION
How Oracle Golden Gate works*
Extract: committed transactions are captured (and can be filtered) by reading the transaction logs.
Trail: stages and queues data for routing.
Pump: distributes data for routing to target(s).
Route: data is compressed and encrypted for routing to target(s).
Replicat: applies data with transaction integrity, transforming the data as required.
[Figure: Source (Oracle & non-Oracle databases) → Extract → Trail Files → Pump → LAN / WAN / Internet over TCP/IP → Trail Files → Replicat → Target (Oracle & non-Oracle databases); the bi-directional path is possible but not used]
OGG also supports Downstream Capture for less impact on the source system
*Source: ORACLE PRESENTATION
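The stages above can be pictured with a toy model. This is only a sketch of the data flow, not of Oracle Golden Gate itself: the log record layout and the function names are assumptions, and compression, encryption and transformations are left out. It shows why rolled-back work never reaches the target: Extract passes on only the operations of committed transactions.

```python
from collections import deque

# Hypothetical log records: (transaction id, operation, payload).
# COMMIT and ROLLBACK records end a transaction.
redo_log = [
    (1, "INSERT", {"id": 10}),
    (2, "INSERT", {"id": 20}),
    (1, "COMMIT", None),
    (2, "ROLLBACK", None),
    (3, "UPDATE", {"id": 10}),
    (3, "COMMIT", None),
]


def extract(log):
    """Yield the operations of committed transactions only, in commit order."""
    open_tx = {}
    for tx_id, op, payload in log:
        if op == "COMMIT":
            yield from open_tx.pop(tx_id, [])
        elif op == "ROLLBACK":
            open_tx.pop(tx_id, None)
        else:
            open_tx.setdefault(tx_id, []).append((tx_id, op, payload))


def pump(local_trail, remote_trails):
    """Drain the local trail and copy every record to each remote trail."""
    while local_trail:
        record = local_trail.popleft()
        for trail in remote_trails:
            trail.append(record)


def replicat(trail, target):
    """Apply the trail records to the target in order."""
    while trail:
        target.append(trail.popleft())


local_trail = deque(extract(redo_log))
remote_trail = deque()
pump(local_trail, [remote_trail])

target = []
replicat(remote_trail, target)
```

Passing two remote trails to `pump` would model a setup in which the same captured data is delivered to two independent apply processes.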
Standard replication architecture for a staging area with Change Data Capture
[Figure: Source Database (Source Schema, Redo Log) → Mining DB → Extract / Collect → Datapump CDC Area and Datapump Repl. Area → Replicat / Apply CDC Area and Replicat / Apply Repl. Area → Target Database with Staging (Repl Area, CDC Area) and Interface Area; the OGG environment runs on the target database. Legend: Deliverable, Initial Load / Data Migration, SYNC]
Architecture features
Almost no source performance impact when using a mining DB
The Collect process is configured with additional parameters so that the trail files contain all update information
Datapumps duplicate the trail file for REPL and CDC
CDC and REPL areas are independent
The CDC apply process automatically writes complete update information into the CDC tables
Primary key updates are included because all information is already in the trail file
Fast CDC apply (only inserts)
Low time lag for REPL tables
Higher reliability because of the separation of REPL and CDC apply processes
Access grants are managed by the Interface area
Advantages of the synchronous stage within a real application landscape
Real application landscape: advantages
Impact
Minimal loss of performance
No modification of the source
No active components on the source besides backup
Technique
Most DBMS can be integrated
CDC is provided for all source systems
OGG guarantees data completeness
Operation and administration
Central OGG platform unifies operations
Focus for administration
Administrators are used to deployments and migrations
Budget
Only one target DB to be (OGG) licensed
Central budgeting
Central administration costs
The synchronization architecture needs a coordinated configuration of OGG and the target database
[Figure: the replication architecture (Source Database with Source Schema and Redo Log, Mining DB, Extract / Collect, Datapumps, Replicat / Apply processes, Target Database with Staging, Repl Area, CDC Area and Interface Area), annotated with the deployment artifacts: Supplemental logging, Parameters, Table Lists, Mappings, DDL, Migration, Data fill, SYNC]
Deployment configuration
The source database is configured to deliver all synchronization data
Data of selected tables is collected
Data pumps guarantee the delivery of the right data
The apply processes map the source data to the target database and incorporate the CDC and metadata
The DDLs create the needed stage
Migration scenarios can be used
All deployment files are aligned and lead to harmonious and safe operation without deployment risks
[Figure: the OGG Configurator within the replication architecture. Input from the application, once per application release: the source DB schema with definitions, delivered as source_definition.csv (DEF). The configurator writes and reads the DEF, reads the TableList config and the config (modules, attribute exclude/extension, where-clauses) into working tables and, via SQL commands, delivers staging DDLs, OGG lists and mappings, and admin params]
Automated generation
The complete deployment is generated at the press of a button
The table selection is taken from the configuration
Per-table options are incorporated
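A minimal sketch of such a generation step follows. It is an illustration only, not the configurator described here. Assumed for the example: the CSV layout of the definition export, the schema names `APP` and `STG`, the `_CL` postfix for CDC tables, and the added columns `LOAD_TS`, `SRC_SYSTEM` and `OP_FLAG`. The mapping lines use only the basic `MAP ..., TARGET ...;` form. Further apply parameters (for example the insert-only behaviour of the CDC apply) are not shown and would have to be taken from the Oracle GoldenGate reference.

```python
import csv
import io

# Assumed layout of the definition export: one row per column (table, column, data type).
SOURCE_DEFINITION_CSV = """table,column,type
CUSTOMER,ID,NUMBER
CUSTOMER,NAME,VARCHAR2(100)
ORDERS,ID,NUMBER
ORDERS,CUSTOMER_ID,NUMBER
AUDIT_LOG,ID,NUMBER
"""

TABLE_LIST = ["CUSTOMER", "ORDERS"]  # tables selected for synchronization
SOURCE_SCHEMA = "APP"                 # hypothetical schema names
STAGE_SCHEMA = "STG"
SYNC_COLUMNS = [("LOAD_TS", "TIMESTAMP"), ("SRC_SYSTEM", "VARCHAR2(30)")]
CDC_COLUMNS = SYNC_COLUMNS + [("OP_FLAG", "CHAR(1)")]


def read_definitions(text):
    """Group the column definitions by table name."""
    tables = {}
    for row in csv.DictReader(io.StringIO(text)):
        tables.setdefault(row["table"], []).append((row["column"], row["type"]))
    return tables


def create_table(name, columns):
    """Render a CREATE TABLE statement for the staging schema."""
    body = ",\n".join(f"  {col} {typ}" for col, typ in columns)
    return f"CREATE TABLE {STAGE_SCHEMA}.{name} (\n{body}\n);"


def generate(definitions, table_list):
    """Return staging DDLs plus separate mapping lists for the REPL and CDC apply."""
    out = {"ddl": [], "repl_map": [], "cdc_map": []}
    for table in table_list:
        columns = definitions[table]
        out["ddl"].append(create_table(table, columns + SYNC_COLUMNS))
        out["ddl"].append(create_table(f"{table}_CL", columns + CDC_COLUMNS))
        out["repl_map"].append(
            f"MAP {SOURCE_SCHEMA}.{table}, TARGET {STAGE_SCHEMA}.{table};")
        out["cdc_map"].append(
            f"MAP {SOURCE_SCHEMA}.{table}, TARGET {STAGE_SCHEMA}.{table}_CL;")
    return out


deployment = generate(read_definitions(SOURCE_DEFINITION_CSV), TABLE_LIST)
```

Because DDLs and mappings are derived from the same definition and the same table list, they cannot drift apart, which is the point of generating all deployment files in one step.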
The OGG configuration can be parameterized by many options and powerful modules
Business requirements and IT parameters are applied automatically using easy configuration
Option: exclude columns
Exclude specific columns from interface views, separately for REPL (1:1 replicated tables) and CDC
Option: data selection
Add a where clause to interface views to filter entries (separately for REPL and CDC)
Modules
AddCL: add a CDC table and a corresponding view (use for change tracking)
AddSyncInfo: add synchronization info (load timestamp and commit sequence number) to tables (e.g. for incremental loading)
ExcludeColumnGlobalFromView: hide all columns matching a globally defined name pattern (e.g. for hiding private data)
ExcludeBlobsFromMapping: columns with data type BLOB are not synchronized (use to save space and performance on incompatible or non-reportable information)
ExcludeBlobsFromView: hide BLOB columns in the interface view
GenerateCTAS (create table as select)
Input: table names and options
A table list serves as input, listing all tables to be included in staging and interface views.
Short names for long table names (>27 letters) are defined for overall compatibility when used with postfixes.
Each option and module can be configured/selected individually for every table in the table list.
Global parameters are set for some options or defaults
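How per-table options and modules could shape an interface view is sketched below. The module names are taken from the slide. Everything else is assumed for illustration: the structure of the table list entry, the global pattern `*_PRIVATE`, the `_V` postfix and the example table. The sketch handles a single view, whereas the options above are set separately for REPL and CDC. The 27-character threshold presumably leaves room for a short postfix within the 30-character identifier limit of older Oracle releases.

```python
import fnmatch

GLOBAL_EXCLUDE_PATTERNS = ["*_PRIVATE"]  # hypothetical pattern for ExcludeColumnGlobalFromView
MAX_NAME_LENGTH = 27                      # longer names need a short name so postfixes still fit

# Hypothetical table list entry: per-table options and selected modules.
entry = {
    "table": "CUSTOMER_CONTACT_HISTORY_ARCHIVE",
    "short_name": "CUST_CONTACT_HIST_ARCH",
    "columns": [("ID", "NUMBER"), ("NOTE", "VARCHAR2(400)"),
                ("PHOTO", "BLOB"), ("PHONE_PRIVATE", "VARCHAR2(30)")],
    "exclude_columns": ["NOTE"],
    "where": "ID > 0",
    "modules": {"ExcludeColumnGlobalFromView", "ExcludeBlobsFromView"},
}


def effective_name(entry):
    """Use the short name when the table name is too long for postfixes."""
    name = entry["table"]
    if len(name) <= MAX_NAME_LENGTH:
        return name
    short = entry.get("short_name")
    if not short or len(short) > MAX_NAME_LENGTH:
        raise ValueError(
            f"{name}: a short name of at most {MAX_NAME_LENGTH} characters is required")
    return short


def visible_columns(entry):
    """Apply the column-related options and modules to the column list."""
    modules = entry.get("modules", set())
    result = []
    for column, data_type in entry["columns"]:
        if column in entry.get("exclude_columns", []):
            continue
        if "ExcludeBlobsFromView" in modules and data_type.upper() == "BLOB":
            continue
        if "ExcludeColumnGlobalFromView" in modules and any(
                fnmatch.fnmatchcase(column, pattern)
                for pattern in GLOBAL_EXCLUDE_PATTERNS):
            continue
        result.append(column)
    return result


def interface_view(entry, postfix="_V"):
    """Render the interface view with column exclusions and the optional where clause."""
    name = effective_name(entry)
    sql = (f"CREATE VIEW {name}{postfix} AS "
           f"SELECT {', '.join(visible_columns(entry))} FROM {name}")
    if entry.get("where"):
        sql += f" WHERE {entry['where']}"
    return sql


view_sql = interface_view(entry)
```

In this example only the `ID` column survives: `NOTE` is excluded per table, `PHOTO` by the BLOB module and `PHONE_PRIVATE` by the global pattern.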
The release process is to be integrated into the application releases if data model changes are made
Release notes
The data models of the source and the target are connected
The release process is perfectly aligned
Synchronization database maintenance can be done after the application migration, on work days
Synchronization is maintained by means of stored log files
Migration can be applied directly or by partially reloading updated data/tables
No "Sunday" work on the target DB
As the source is normally under a high SLA, migration must be planned with budget considerations in mind
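One way to picture the release step is a comparison of the definition snapshots of two application releases. The sketch below is an assumption-laden illustration, not the procedure used in the project. In particular, the rule that added columns are migrated directly while every other change triggers a reload of the table is a hypothetical rule of thumb.

```python
def diff_models(old, new):
    """Compare two {table: {column: type}} snapshots and list the changes."""
    changes = []
    for table in sorted(new.keys() - old.keys()):
        changes.append(("new_table", table, None))
    for table in sorted(old.keys() - new.keys()):
        changes.append(("dropped_table", table, None))
    for table in sorted(new.keys() & old.keys()):
        for column in sorted(new[table].keys() - old[table].keys()):
            changes.append(("new_column", table, column))
        for column in sorted(old[table].keys() - new[table].keys()):
            changes.append(("dropped_column", table, column))
        for column in sorted(new[table].keys() & old[table].keys()):
            if new[table][column] != old[table][column]:
                changes.append(("changed_type", table, column))
    return changes


def plan(changes):
    """Hypothetical rule: added columns migrate directly, anything else reloads the table."""
    actions = {}
    for kind, table, _ in changes:
        if kind == "new_column" and actions.get(table) != "reload":
            actions[table] = "direct"
        else:
            actions[table] = "reload"
    return actions


release_1 = {
    "CUSTOMER": {"ID": "NUMBER", "NAME": "VARCHAR2(100)"},
    "ORDERS": {"ID": "NUMBER"},
}
release_2 = {
    "CUSTOMER": {"ID": "NUMBER", "NAME": "VARCHAR2(200)"},
    "ORDERS": {"ID": "NUMBER", "STATUS": "VARCHAR2(10)"},
    "INVOICE": {"ID": "NUMBER"},
}

model_changes = diff_models(release_1, release_2)
migration_plan = plan(model_changes)
```

Tables that do not appear in the plan are unaffected by the release and keep synchronizing without any maintenance.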
Complex environments can be maintained by a small team and contain many different deployments
[Figure: example of a running environment. Recoverable labels: Source-Server 1 (Appl.-DB) with Source 1 (INT 5), Source 2 (INT 4), Source 3 (DEV) and TableList 1 to 3; Source-Server 2 (Appl.-DB) with Source 1 to 3 and a TableList; Logging; Target Server with Mining DB; processes COLLECT, DATA-PUMP and APPLY; process pairs INT REPL / INT CL, DEV REPL / DEV CL, PAR REPL / PAR CL and MM REPL / MM CL; CL FRI; Staging INT; SE (JIT), SE (PAR), CG, MM; legend: Schema, Trail-File, Process]
Lessons learned: The new "Insight and Data" approach covers the real requirements of customers
Customer requirements: real landscape, short response time, quality, budget
Lessons learned
Reporting data can be collected even from existing legacy applications
Changes to applications or new applications cause no disruption
Data collection costs almost no application performance
The synchronization architecture allows real-time reporting
The data lake collects all data and is prepared for upcoming requirements
Industrialized development generates quick deployments
Application changes can be integrated at the click of a button
Industrialized development is short and cost-efficient
Centralized architecture avoids double licenses
Simple administration saves budget
Industrialized development guarantees *error free* deployments
Deployment content is driven by requirements
Simple administration avoids manual errors
Contact information
Peter Birwe
Head of Delivery BI&DWH
BIM Germany
Wanheimerstrasse 68
40468 Düsseldorf
Phone: + 49 211 56623 160
E-Mail: [email protected]
Dr. Dierk Hahn
Senior Managing Consultant
BIM Germany
Wanheimerstrasse 68
40468 Düsseldorf
Phone: + 49 211 56623 187
E-Mail: [email protected]
www.capgemini.com
About Capgemini
With more than 120,000 people in 40 countries, Capgemini is one
of the world's foremost providers of consulting, technology and
outsourcing services. The Group reported 2011 global revenues
of EUR 9.7 billion.
Together with its clients, Capgemini creates and delivers
business and technology solutions that fit their needs and drive
the results they want. A deeply multicultural organization,
Capgemini has developed its own way of working, the
Collaborative Business Experience™, and draws on Rightshore®,
its worldwide delivery model.
Rightshore® is a trademark belonging to Capgemini
The information contained in this presentation is proprietary.
Copyright © 2012 Capgemini. All rights reserved.