50_128-498

Upload: ishtiaq-khan

Post on 23-Feb-2018


TRANSCRIPT

  • 7/24/2019 50_128-498

    1/27

    Bridging the Big Data Divide with Oracle Data Integration

    Milomir Vojvodic,

    Business Development Manager, EMEA DIS

  • 7/24/2019 50_128-498

    2/27

  • 7/24/2019 50_128-498

    3/27

    Information Architectures Today:

    Decisions based on transactional data

    Video and Images

    Machine-Generated

    Social Data

    Information Architectures Today: Decisions based on all your data

    transactions, applications, structure

    Diverse Data Sets

  • 7/24/2019 50_128-498

    4/27

    Architecture Principles and Best Practices

    Oracle Data Integration Solutions for Big Data

  • 7/24/2019 50_128-498

    5/27

    Transaction Data

    Advanced Analytics

    Visual Discovery

    DBMS (OLTP)

    Master & Ref Data

    Structured

    Data Warehouse

    Text Analytics and Search

    Reporting

    Dashboards

    Real-Time

    Machine-Generated

    Social Media

    Text, Image, Video, Audio

    Key-Value Data Store

    Unstructured

    Semi-structured

    Hadoop Cluster with MapReduce

    Alerting

    In-Database Analytics

    EPM, BI Applications

    DB Replication

    ETL/ELT

    CDC

    ODS

    Data Marts

    Streaming (CEP Engine)

    Capture, Store/Process, Integrate, Organize, Analyze

    Message-Based

    Integrated Architecture

  • 7/24/2019 50_128-498

    6/27

    Integrate Big Data with DW and Transa

    Data Stores

    Oracle Big Data Appliance

    Oracle Exadata

    Stream, Acquire, Organize, Analyze & Visualize

    Load from big data processing into your data warehouse for further analysis

    Access your customer information while you process through your big data in order t

  • 7/24/2019 50_128-498

    7/27

    Relational and Non-Relational

    Application Sources

    Legacy Sources

    Complete a

    approach to integration

    Maximum p

    lower cost ouse, and rel

    Certified forto deliver fa

    Oracle cust

    80% lowe

    Five times

    70% redu

    Oracle Enterprise Data Quality

    Oracle Data Integrator

    Oracle GoldenGate

    Oracle Data Integration Solutions

    http://hadoop.apache.org/
  • 7/24/2019 50_128-498

    8/27

    Architecture Principles and Best Practices: DB Replica and CDC within Data Integration Layer

  • 7/24/2019 50_128-498

    9/27

    Target DB

    OGG

    Source DB

    What is Oracle GoldenGate?

  • 7/24/2019 50_128-498

    10/27

    Target DB

    OGG

    Source DB

    First OGG Differentiator

    Accessing transaction logs directly

    Second OGG Differentiator

    Moving only committed transactions

    What is Oracle GoldenGate?

  • 7/24/2019 50_128-498

    11/27

    New DB/HW/OS/APP

    Fully Active Distributed DB

    Reporting Database and/or D

    Data Warehouse

    OGG

    OGG

    OGG ADG

    OGG

    Zero Downtime Migrations & Upgrades

    Active/Active DB Deployment

    Disaster Recovery, Reporting Database

    DW Synchronization

    Migrations & Consolidations

    Oracle DIS Use Cases - OGG

  • 7/24/2019 50_128-498

    12/27

    [Chart] Time required for the end-of-day procedure, in hours, Year 1 to Year 5

    [Chart] Number of CPUs required for the same performance*, Year 1 to Year 5 (labels: Primary Site; Disaster Recovery, Test and Development)

    [Chart] Estimated cost (license**), in millions of dollars

    The end-of-day procedure currently utilizes the server CPU by 40-50% and the IO by 90%. The IO is probably the bottleneck.

    The required number of CPUs can be doubled.

    Daily load time can reach 5 days with the current HW.

    OGG is Log Based Replica

  • 7/24/2019 50_128-498

    13/27

    Begin TX 1, Insert TX 1, Begin TX 2, Update TX 1, Insert TX 2, Commit TX 2, Begin TX 3, Insert TX 3, Begin TX 4, Commit TX 3, Delete TX 4

    Begin TX 2, Insert TX 2, Commit TX 2, Begin TX 3, Insert TX 3, Commit TX 3

    Begin TX 2, Insert TX 2, Commit TX 2

    Capture

    Checkpoint

    Pump

    Checkpoint

    OGG Moves Only Committed Transactions
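
The first sequence on this slide interleaves four transactions, and the second contains only the two that committed. A minimal sketch of that filtering idea follows. It is an illustration only, not Oracle GoldenGate code: the function name `committed_only`, the tuple layout, and the handling of a `Rollback` record are assumptions made for the example.

```python
# Illustrative sketch only; not Oracle GoldenGate code or its API.
# Each log record is a hypothetical (operation, transaction_id) tuple.

SLIDE_LOG = [
    ("Begin", "TX 1"), ("Insert", "TX 1"),
    ("Begin", "TX 2"), ("Update", "TX 1"),
    ("Insert", "TX 2"), ("Commit", "TX 2"),
    ("Begin", "TX 3"), ("Insert", "TX 3"),
    ("Begin", "TX 4"), ("Commit", "TX 3"),
    ("Delete", "TX 4"),
]


def committed_only(log):
    """Return the records of committed transactions, in commit order."""
    pending = {}   # transaction_id -> records buffered so far
    output = []
    for operation, tx in log:
        if operation == "Commit":
            buffered = pending.pop(tx, None)
            if buffered is not None:
                # Emit the whole transaction only once its commit is seen.
                output.extend(buffered)
                output.append((operation, tx))
        elif operation == "Rollback":
            # Assumed record type: discard everything buffered for tx.
            pending.pop(tx, None)
        else:
            pending.setdefault(tx, []).append((operation, tx))
    return output


if __name__ == "__main__":
    for operation, tx in committed_only(SLIDE_LOG):
        print(f"{operation}, {tx}")
```

TX 1 and TX 4 never commit in the sample log, so their records stay buffered and are never emitted.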

  • 7/24/2019 50_128-498

    14/27

    Architecture Principles and Best Practices: ETL and Data Quality within Data Integration Layer

  • 7/24/2019 50_128-498

    15/27

    OLTP & ODS Systems

    Data Warehouse, Data Mart

    Oracle, PeopleSoft, Siebel, SAP

    Custom Apps

    Files, Excel, XML

    Custom Reporting

    Packaged Applications

    Business Intelligence

    Analytics

    Data Federation

    Data Warehousing

    Custom Data Marts

    Data Silos

    SQL, Java

    Batch Scripts

    Data Hubs

    Data Migration

    Data Replication

    ODI is centralizing all ETL Development

  • 7/24/2019 50_128-498

    16/27

    OLTP & ODS Systems

    Data Warehouse, Data Mart

    Oracle, PeopleSoft, Siebel, SAP

    Custom Apps

    Files, Excel, XML

    Custom Reporting

    Packaged Applications

    Business Intelligence

    Analytics

    Oracle Data Integrator

    ODI is centralizing all ETL Development

  • 7/24/2019 50_128-498

    17/27

    Reverse: Engineer Metadata

    Journalize: Read from CDC Source

    Load: From Sources to Staging

    Check: Constraints before Load

    Integrate: Transform and Move to Targets

    Service: Expose Data and Transformation Services

    CDC

    Sources

    Staging Tables

    Error Tables

    Target Tables


    SAP/R3

    Siebel

    Log Miner

    DB2 Journals

    SQL Server Triggers

    Oracle DBLink

    DB2 Exp/Imp

    JMS Queues

    Check MS Excel

    Check Sybase

    Oracle SQL*Loader

    TPump/Multiload

    Type II SCD

    Oracle Merge

    Siebel EIM Schema

    Oracle Web Services

    DB2 Web Services

    Sample out-of-the-box Knowledge Modules
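
The Load, Check, and Integrate steps named on this slide (sources to staging tables, constraint check into error tables, then move to target tables) can be sketched with an in-memory SQLite database. This is a hedged illustration of the flow, not ODI-generated code: the table names, the single "missing zip" rule, and the function `run_flow` are all assumptions.

```python
# Illustrative sketch of a load -> check -> integrate flow.
# Table names, columns and the constraint rule are hypothetical.
import sqlite3


def run_flow(rows):
    """Load rows into staging, route bad rows to errors, good rows to target."""
    con = sqlite3.connect(":memory:")
    con.executescript(
        """
        CREATE TABLE staging (customer_id TEXT, zip TEXT);
        CREATE TABLE errors  (customer_id TEXT, zip TEXT, reason TEXT);
        CREATE TABLE target  (customer_id TEXT PRIMARY KEY, zip TEXT NOT NULL);
        """
    )
    # Load: copy source rows into the staging table.
    con.executemany("INSERT INTO staging VALUES (?, ?)", rows)
    # Check: rows that would violate the target constraint go to the error table.
    con.execute(
        "INSERT INTO errors "
        "SELECT customer_id, zip, 'missing zip' FROM staging "
        "WHERE zip IS NULL OR zip = ''"
    )
    # Integrate: only rows that passed the check reach the target table.
    con.execute(
        "INSERT INTO target "
        "SELECT customer_id, zip FROM staging "
        "WHERE zip IS NOT NULL AND zip <> ''"
    )
    con.commit()
    target = con.execute(
        "SELECT customer_id, zip FROM target ORDER BY customer_id"
    ).fetchall()
    errors = con.execute(
        "SELECT customer_id, reason FROM errors ORDER BY customer_id"
    ).fetchall()
    con.close()
    return target, errors


if __name__ == "__main__":
    sample = [("AD23298", "22031-4001"), ("CB27843", None), ("VS38611", "18704")]
    print(run_flow(sample))
```

All three statements run inside the database engine, which is the general idea behind doing the transformation after the load (E-LT) rather than on a separate staging server.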

    Benefits

    ODI Knowledge Modules

    ODI Declarative Design

    Define What You Want

    Define How: Built-in

    AutomGD

    ODI E-LT

    Staging Server

    OGG

    Second ODI Differentiator

    ODI Declarative Design and ODI Knowledge Modules

    for reusing already written low-level SQL code

    Why is ODI different?
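
The split between "define what you want" and "define how" can be illustrated with a reusable SQL template that is filled in from a declarative mapping. The sketch below is not ODI's Knowledge Module format or API; the template text, the mapping layout, and the names `INSERT_SELECT_TEMPLATE` and `generate_sql` are assumptions chosen to show the idea of reusing already written SQL.

```python
# Illustrative sketch only; not the ODI Knowledge Module format or API.
from string import Template

# The "how": a reusable, already written SQL pattern (hypothetical template).
INSERT_SELECT_TEMPLATE = Template(
    "INSERT INTO $target ($columns)\n"
    "SELECT $expressions\n"
    "FROM $source"
)


def generate_sql(template, mapping):
    """Combine a declarative mapping (the "what") with a template (the "how")."""
    target_columns = list(mapping["columns"])
    source_expressions = [mapping["columns"][name] for name in target_columns]
    return template.substitute(
        target=mapping["target"],
        source=mapping["source"],
        columns=", ".join(target_columns),
        expressions=", ".join(source_expressions),
    )


if __name__ == "__main__":
    # The "what": target columns and the source expressions that feed them.
    customer_mapping = {
        "source": "stg_customer",
        "target": "dw_customer",
        "columns": {"customer_id": "cust_id", "full_name": "UPPER(name)"},
    }
    print(generate_sql(INSERT_SELECT_TEMPLATE, customer_mapping))
```

The same template can be reused for any other mapping, so the SQL pattern is written once and only the mapping changes.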

  • 7/24/2019 50_128-498

    18/27

    New DB/HW/OS/APP

    Fully Active Distributed DB

    Reporting Database and/or D

    Data Warehouse

    OGG

    OGG

    OGG ADG

    OGG

    Zero Downtime Migrations & Upgrades

    Active/Active High Availability

    Query Off-Loading and Disaster Recovery

    BI & DW Synchronization and Loading

    Migrations & Consolidations

    ODI

    ODI

    EDQ

    EDQ

    Oracle DIS Use Cases - ODI and EDQ

  • 7/24/2019 50_128-498

    19/27

    19 | 2011 Oracle Corporation

    Customer ID Customer Name Address 1 Address 2 City State Zip Country Birth D

    AD23298 Mr Peter Mayhew 9407 Main St Fairfax VA 22031-4001 USA 02/23/6

    VS38611 Dr Ellen Van Der Heijde 144 E Grove St Kingston PA 18704 US 07/12/5

    DC18223 Jalila Abdul-Alim (Do Not Call) 4548 Pennsylvania Ave Apt 205 Kansas City MO 64111-3349 USA 02/23/6

    CO9387A Tayside Computers Inc. 4912 E 41st N Idaho Falls ID 83401 USA 31/03/2

    TZ35019 Mr Zachary P Jahn 98-1731 Ipuala Loop Aiea Hawaii 96701 1710 United States 06/12/8

    CB27843 Mrs Edith Y Baba Junior Baba Real Est. Corp. 209 Stony Point Trl Webster NY USA 11/17/1

    OX80306 Andrew & Mary Baxter 14 Oxbridge Way Milfrod NH 03055-4614 US 05/28/6

    JP70210 Mr RJ & Mrs FB MacDonald 57 Hadleigh Close Westlea Swindon SN5 9BZ MA - USA -

    RD48107 Mr Andy Baxter 14 Oxbridge Wy Milford NH 3056 USA 01/01/0

    Inconsistent formats

    Abbreviations (often ambiguous)

    Attrib

    m

    Compound Names

    Embedded Additional Information

    Mixed Business & Personal Names

    Multiple Names

    Mis-Fielded Data

    Erroneous Data

    International Date Formats

    Default or Dummy Data

    Why Do We Need Data Quality?

  • 7/24/2019 50_128-498

    20/27


    Product data is much more variable and unpredictable than other da

    10hp motor 115V Yoke mount

    mtr, ac(115) 10 horsepower 115volts

    MOT-10,115V, 48YZ,YOKE

    This 10hp yoke mounted motor is rated for 115V with a 5 year warranty

    10 Caballos, Motor, 115 Voltios

    TEAO HP = 10.0 1725RPM 115V 48YZ YOKE MTR

    Motor, TEAO, 1725 RPM, 48YZ, 15 Voltios, Montaje de Yugo, hp = 10

    Item Motor

    Classification 261016

    Power 10 hors

    Voltage 115

    Mounting Yoke

    Why Do We Need Data Quality?
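
To make the variability concrete, the sketch below pulls the power and voltage attributes out of several of the free-text descriptions on this slide using regular expressions. It is a simplified illustration, not how Oracle Enterprise Data Quality parses product data; the patterns and the function name `extract_attributes` are assumptions, and they only cover the unit spellings that appear in the sample lines.

```python
# Illustrative sketch; the patterns cover only the spellings seen on the slide.
import re

# "10hp", "10 horsepower", "10 Caballos"
POWER_AFTER_NUMBER = re.compile(
    r"(\d+(?:\.\d+)?)\s*(?:hp|horsepower|caballos)\b", re.IGNORECASE
)
# "HP = 10.0"
POWER_AFTER_LABEL = re.compile(r"\bhp\s*=\s*(\d+(?:\.\d+)?)", re.IGNORECASE)
# "115V", "115volts", "115 Voltios"
VOLTAGE = re.compile(r"(\d+)\s*(?:v|volts|voltios)\b", re.IGNORECASE)


def extract_attributes(description):
    """Return power (hp) and voltage found in a description, or None if absent."""
    power = POWER_AFTER_NUMBER.search(description) or POWER_AFTER_LABEL.search(
        description
    )
    voltage = VOLTAGE.search(description)
    return {
        "power_hp": float(power.group(1)) if power else None,
        "voltage": int(voltage.group(1)) if voltage else None,
    }


if __name__ == "__main__":
    samples = [
        "10hp motor 115V Yoke mount",
        "mtr, ac(115) 10 horsepower 115volts",
        "MOT-10,115V, 48YZ,YOKE",
        "10 Caballos, Motor, 115 Voltios",
        "TEAO HP = 10.0 1725RPM 115V 48YZ YOKE MTR",
    ]
    for text in samples:
        print(text, "->", extract_attributes(text))
```

The part-number style line ("MOT-10,115V, 48YZ,YOKE") yields a voltage but no power, because nothing in the text marks the 10 as horsepower. Cases like this are why simple pattern matching is not enough for product data.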

  • 7/24/2019 50_128-498

    21/27

    Profile, Audit, Transform, Parse, Cleanse, Standardize, Match

    One Unified Solution

    Oracle Enterprise Data Quality
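
Two of the listed steps, Standardize and Match, can be sketched on the customer records shown earlier in the deck, where the Baxter household appears twice with "Way" and "Wy" and with "US" and "USA". The code is a hedged illustration, not Oracle Enterprise Data Quality logic: the alias tables, the choice of match key, and all function names are assumptions.

```python
# Illustrative sketch; alias tables and the match key are assumptions.
import re

COUNTRY_ALIASES = {"us": "USA", "usa": "USA", "united states": "USA"}
# Note: "st" can also mean "Saint"; a real rule set would need context.
STREET_ABBREVIATIONS = {"wy": "way", "st": "street", "ave": "avenue"}


def standardize_country(value):
    """Map known spellings of a country to one canonical value."""
    cleaned = value.strip()
    return COUNTRY_ALIASES.get(cleaned.lower(), cleaned)


def standardize_street(value):
    """Lowercase, drop punctuation and expand known abbreviations."""
    words = re.findall(r"[a-z0-9]+", value.lower())
    return " ".join(STREET_ABBREVIATIONS.get(word, word) for word in words)


def match_key(record):
    """Build a comparison key from standardized fields."""
    return (
        standardize_street(record["address"]),
        record["state"].strip().upper(),
        standardize_country(record["country"]),
    )


def find_matches(records):
    """Group customer IDs whose standardized keys are identical."""
    groups = {}
    for record in records:
        groups.setdefault(match_key(record), []).append(record["customer_id"])
    return [ids for ids in groups.values() if len(ids) > 1]


if __name__ == "__main__":
    customers = [
        {"customer_id": "AD23298", "address": "9407 Main St", "state": "VA", "country": "USA"},
        {"customer_id": "OX80306", "address": "14 Oxbridge Way", "state": "NH", "country": "US"},
        {"customer_id": "RD48107", "address": "14 Oxbridge Wy", "state": "NH", "country": "USA"},
    ]
    print(find_matches(customers))
```

The city is left out of the key on purpose, since one of the two records misspells it; a fuller approach would use fuzzy comparison instead of exact keys.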

  • 7/24/2019 50_128-498

    22/27

    2012 Oracle Corporation Proprietary and Confidential

    300 Berry #1210 SF California

    300

    Berry St

    Unit 1210

    San Francisco

    CA

    94158-1670

    PremiseNumber

    ThoroughfareName

    SubPremise

    Locality

    AdministrativeArea

    PostCode

    300

    Berry

    #1210

    SF

    California

    Parse Validate

    Step 1: Extract pieces of the address

    Step 2: Check the pieces against the information ... Global Knowledge Repository

    to complete and find the correct abbreviations

    Step 3: Change charac... transliterate, if necessary

    Step 4: Find Location

    Latitude 37.775

    Longitude -122.395

    EDQ Address Verification
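
The Parse and Validate stages on this slide can be sketched for the sample address "300 Berry #1210 SF California". This is an illustration under stated assumptions, not the EDQ Address Verification implementation: the regular expression only fits addresses shaped like the sample, and the `REFERENCE` dictionary is a hypothetical one-entry stand-in for the reference repository named on the slide.

```python
# Illustrative sketch; the pattern and the reference data are hypothetical.
import re

# Parse: split the free-text address into the field names used on the slide.
ADDRESS_PATTERN = re.compile(
    r"^(?P<PremiseNumber>\d+)\s+"
    r"(?P<ThoroughfareName>.+?)\s+"
    r"(?P<SubPremise>#\w+)\s+"
    r"(?P<Locality>\S+)\s+"
    r"(?P<AdministrativeArea>.+)$"
)

# Validate: a one-entry stand-in for a reference repository lookup.
REFERENCE = {
    ("300", "Berry", "SF"): {
        "ThoroughfareName": "Berry St",
        "Locality": "San Francisco",
        "AdministrativeArea": "CA",
        "PostCode": "94158-1670",
    }
}


def parse_address(text):
    """Return the parsed fields, or None if the text does not fit the pattern."""
    match = ADDRESS_PATTERN.match(text.strip())
    return match.groupdict() if match else None


def validate_address(parsed):
    """Complete and correct parsed fields from the reference data, if known."""
    key = (parsed["PremiseNumber"], parsed["ThoroughfareName"], parsed["Locality"])
    reference = REFERENCE.get(key)
    if reference is None:
        return None
    validated = dict(parsed)
    validated.update(reference)
    validated["SubPremise"] = "Unit " + parsed["SubPremise"].lstrip("#")
    return validated


if __name__ == "__main__":
    parsed = parse_address("300 Berry #1210 SF California")
    print(parsed)
    print(validate_address(parsed))
```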

  • 7/24/2019 50_128-498

    23/27


    Architecture Principles and Best Practices: Oracle Data Integrator for Big Data

  • 7/24/2019 50_128-498

    24/27

    ODI for Big Data: Heterogeneous Integration to Hadoop Environments

    Transforms via MapReduce

    Loads

    Oracle Data Integrator

    Supports Ha

    Easy to congenerating M

  • 7/24/2019 50_128-498

    25/27

    ODI for Big Data to Oracle: Optimized Integration to Oracle Exadata

    Oracle Database, Oracle Exadata

    Transforms via MapReduce

    Loads

    Activates

    Oracle Loader for Hadoop

    Oracle Data Integrator

    Oracle Big Data Connectors

    Hadoop Cluster

    Oracle Big Data Appliance

  • 7/24/2019 50_128-498

    26/27

    Oracle Data Integrator for Big Data

    Simplifies creation of Hadoop and MapReduce ... productivity

    Integrates big data heterogeneously via industry ... Hadoop, MapReduce, Hive, NoSQL, HDFS

    Unifies integration tooling across unstructured and structured data

    Optimizes loading of big data to Oracle Exadata ... Data Connectors

    Engineered for running on and integrating with ... Appliance via Big Data Connectors

    Putting Together the Unique Advantages
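
For readers unfamiliar with the MapReduce model that the deck says ODI generates code for, the sketch below shows the three phases (map, shuffle, reduce) in plain single-process Python. It does not use Hadoop or any ODI interface; the function names and the sample records are assumptions for illustration.

```python
# Illustrative single-process sketch of the MapReduce model; no Hadoop involved.
from collections import defaultdict


def map_phase(records, mapper):
    """Apply the mapper to every record and yield its (key, value) pairs."""
    for record in records:
        yield from mapper(record)


def shuffle(pairs):
    """Group all values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, reducer):
    """Apply the reducer to each key and its list of values."""
    return {key: reducer(key, values) for key, values in groups.items()}


def count_by_source(records):
    """Count records per source type using the three phases above."""
    def mapper(record):
        yield record["source"], 1

    def reducer(_key, values):
        return sum(values)

    return reduce_phase(shuffle(map_phase(records, mapper)), reducer)


if __name__ == "__main__":
    events = [
        {"source": "social", "text": "great product"},
        {"source": "machine", "text": "sensor=42"},
        {"source": "social", "text": "slow delivery"},
    ]
    print(count_by_source(events))
```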

  • 7/24/2019 50_128-498

    27/27