Oracle GoldenGate and Apache Kafka: A Deep Dive Into Real-Time Data Streaming

Post on 09-Jan-2017

Category:

Data & Analytics

TRANSCRIPT

  • info@rittmanmead.com www.rittmanmead.com @rittmanmead

    Michael Rainey | Collaborate 16

    Oracle GoldenGate and Apache Kafka: A Deep Dive Into Real-Time Data Streaming

  • Introduction

    Michael Rainey - Data Integration Practice Lead - America

    - Oracle Data Integration expertise
    - Blog: http://ritt.md/mRainey
    - Oracle ACE
    - @mRainey

  • About Rittman Mead

    World's leading specialist partner for technical excellence, solutions delivery and innovation in Oracle Data Integration, Business Intelligence, Analytics and Big Data

    Providing our customers targeted expertise; we are a company that doesn't try to do everything, only what we excel at

    70+ consultants worldwide, including 1 Oracle ACE Director and 3 Oracle ACEs, offering training courses, global services, and consulting

    Founded on the values of collaboration, learning, integrity and getting things done

    Unlock the potential of your organization's data

    Comprehensive service portfolio designed to support the full lifecycle of any analytics solution

  • Visual Redesign, Business User Training, Ongoing Support, Engagement Toolkit

    Average user adoption for BI platforms is below 25%

    Rittman Mead's User Engagement Service can help

    More info: http://ritt.md/ue

  • Today's New Data Challenge

  • Data Integration Today

  • Typical Example - Marketing

    Financial data stored in RDBMS

    Social media data, web logs, Google Analytics, etc., all in various formats

    Bring it all together for analysis

    Marketing campaign effect on sales

  • Relational Data Replication - Oracle GoldenGate

  • Oracle GoldenGate for Big Data (Then)

  • Oracle GoldenGate for Big Data (Now)

  • Streaming Data - Apache Kafka

    Publish-subscribe messaging rethought as a distributed commit log
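
The "distributed commit log" idea can be sketched in a few lines of Python. This is a toy, single-process model and not Kafka's API: the class `CommitLog` and its methods are hypothetical names chosen for illustration. It only shows that messages are appended at increasing offsets and that each subscriber reads from its own offset, so one consumer's progress does not affect another's.

```python
class CommitLog:
    """Toy append-only log; a stand-in for a single topic partition."""

    def __init__(self):
        self._records = []

    def append(self, message):
        """Append a message and return the offset it was written at."""
        self._records.append(message)
        return len(self._records) - 1

    def read(self, offset, max_records=10):
        """Return up to max_records messages starting at the given offset."""
        return self._records[offset:offset + max_records]


log = CommitLog()
for event in ["insert", "update", "delete"]:
    log.append(event)

# Two independent subscribers, each tracking its own position in the log.
offsets = {"reporting": 0, "alerting": 2}
reporting_batch = log.read(offsets["reporting"])
alerting_batch = log.read(offsets["alerting"])
```

Because the log is never modified in place, a new subscriber could start at offset 0 and replay everything that is still retained.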

  • Kafka - How is it used?

    Pure Event Streams

    System Metrics

    Derived Streams

    Hadoop Data Loads / Data Publishing

    Application Logs

    Database Changes

    - Log Compaction
    - Data cleansing
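
As a rough illustration of what log compaction means for a stream of database changes, the sketch below keeps only the most recent value per key. The function `compact` and the sample keys are hypothetical, and a value of `None` is assumed to act as a delete marker. Real Kafka compaction runs in the background on older log segments and retains delete markers for a while, so this shows only the end state.

```python
def compact(records):
    """Keep only the latest value for each key, ordered by last write.

    A value of None is treated as a delete marker and dropped from the result.
    """
    latest = {}
    for key, value in records:
        latest.pop(key, None)  # re-insert so ordering follows the last write
        latest[key] = value
    return [(key, value) for key, value in latest.items() if value is not None]


# Hypothetical change stream keyed by a row identifier.
changes = [
    ("session:1", "RUNNING"),
    ("session:2", "RUNNING"),
    ("session:1", "DONE"),
    ("session:2", None),
]
compacted = compact(changes)
```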

  • Let's Jump Right In

    An example near and dear to my heart: one single view of the Oracle Data Integrator logs!

    - Oracle Data Integrator session logs stored in the repository
    - ODI Agent logs are text-based log files
    - To see the full picture of your ODI environment, they must be combined

  • http://www.oracle.com/technetwork/database/bigdata-appliance/downloads/index.html

  • Oracle GoldenGate for Big Data - we talked about this

  • ODI Agent Logs to Kafka via Logstash

  • Extract from the ODI Repository with GoldenGate 12c

    Prepare the database

    Set up GoldenGate for Oracle Database

    - Install and configure

    Set up Manager, Extract and Pump parameter files

    Add Extract and Pump process groups

    Start!

  • Prepare the Database for GoldenGate Extract - OGG User

  • Prepare the Database for GoldenGate Extract - Logging

  • Add Table Supplemental Logging

  • GoldenGate Manager Parameter File - Source

  • GoldenGate Extract Parameter File

  • GoldenGate Pump Parameter File

  • Add Extract and Pump Process Groups

  • Stream ODI Agent Logs to Kafka via Logstash

    Application log processing is a standard use for Kafka

    - Many approaches to extract logs

    Logstash

    - Part of the Elastic (formerly ELK) stack
    - Robin Moffatt blogged about it: http://ritt.md/kafka-elk
    - Producer configuration for Kafka

  • Logstash to Kafka - Setup and Startup

    Start up Zookeeper

    - Already installed on Big Data Lite

    Set Kafka server.properties

    - Broker ID
    - Number of partitions
    - Log retention period
    - Zookeeper connection

    Start Kafka
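
The four server.properties settings listed above map to the broker keys `broker.id`, `num.partitions`, `log.retention.hours` and `zookeeper.connect`. The Python sketch below renders them in the key=value format that file uses. The values are assumptions for a single-broker sandbox and do not come from the slides; the helper `render_properties` is a hypothetical name.

```python
# Assumed values for a single-broker sandbox; adjust for your environment.
settings = {
    "broker.id": 0,                         # unique ID of this broker
    "num.partitions": 1,                    # default partition count for new topics
    "log.retention.hours": 168,             # keep messages for 7 days
    "zookeeper.connect": "localhost:2181",  # Zookeeper host:port
}


def render_properties(props):
    """Render a dict as key=value lines, the format used by server.properties."""
    return "\n".join(f"{key}={value}" for key, value in props.items())


server_properties = render_properties(settings)
print(server_properties)
```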

  • Set Up Logstash Configuration File

    Configuration File

    Start Logstash

  • ODI Agent Logs to Kafka!

    Start the Kafka Console Consumer - delivered with Kafka

    Start the ODI Agent and... messages!

  • GoldenGate Transactions to Kafka

  • Oracle GoldenGate for Big Data

    Kafka is one of many handlers

    - HDFS, HBase, Flume

    Pluggable Formatters

    - Convert trail file transactions to alternate format
    - Avro, delimited text, JSON, XML

    Metadata Provider

    - Handles mapping of source to target columns that differ in structure/name
    - Similar to SOURCEDEFS file in GoldenGate
    - Avro or Hive

  • Oracle GoldenGate for Big Data - Kafka Handler

    Standard GoldenGate Extract / Pump processes

    - We just set this up

    Replicat parameter file & process group

    Kafka Handler configuration

    Kafka Producer properties

    - Note: Kafka 0.8.2.0 & 0.8.2.1 are certified with GoldenGate
    - But I've heard 0.9.0+ works

  • GoldenGate and Kafka - Prerequisites

    Zookeeper & Kafka up and running

    Add topic to broker up front vs dynamically

    Kafka Handler must have access to broker server

    Kafka libraries must match Kafka version

  • GoldenGate and Kafka - Replicat Parameters

  • GoldenGate and Kafka - Kafka Handler Properties

    Properties allow communication between the GoldenGate adapter and Kafka

  • GoldenGate and Kafka - Kafka Handler Properties

    gg.handlerlist = kafkahandler

    gg.handler.kafkahandler.type = kafka

    gg.handler.kafkahandler.KafkaProducerConfigFile = kafka_producer.properties

    gg.handler.kafkahandler.TopicName = odirepo

    - Kafka topic name

    gg.handler.kafkahandler.format = json

    - Pluggable Formatter
    - Avro recommended for Kafka

    gg.handler.kafkahandler.BlockingSend = true|false

    gg.handler.kafkahandler.includeTokens = true|false

    gg.handler.kafkahandler.mode = tx

    - Transaction vs Operation mode
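
The difference between the two modes can be sketched as follows, on the understanding that operation mode sends one Kafka message per change record while transaction mode combines all operations of a source transaction into one message. This is an illustrative Python model, not GoldenGate code: `to_messages`, the transaction IDs and the operation strings are all hypothetical.

```python
from itertools import groupby

# Hypothetical change records: (transaction id, operation description).
operations = [
    ("tx1", "INSERT row A"),
    ("tx1", "UPDATE row A"),
    ("tx2", "INSERT row B"),
]


def to_messages(records, mode):
    """Group change records into messages for the given handler mode.

    mode "op": one message per operation.
    mode "tx": one message per source transaction.
    """
    if mode == "op":
        return [[op] for _, op in records]
    if mode == "tx":
        # groupby merges adjacent records only, which assumes a transaction's
        # operations arrive together, as they would from a trail file.
        return [[op for _, op in group]
                for _, group in groupby(records, key=lambda rec: rec[0])]
    raise ValueError(f"unknown mode: {mode}")


op_messages = to_messages(operations, "op")
tx_messages = to_messages(operations, "tx")
```

Transaction mode produces fewer, larger messages, which is worth keeping in mind when sizing the producer's request limits.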

  • GoldenGate and Kafka - Kafka Handler Properties

    goldengate.userexit.timestamp = utc

    goldengate.userexit.writers = javawriter

    javawriter.stats.display = TRUE

    javawriter.stats.full = TRUE

    gg.log = log4j

    gg.log.level = INFO

    gg.report.time = 30sec

    gg.classpath = dirprm/:/u01/kafka/kafka_2.10-0.8.2.1/libs/*:

    - Location of the Kafka libraries

    javawriter.bootoptions = -Xmx512m -Xms32m -Djava.class.path=ggjava/ggjava.jar
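
Since the Kafka libraries on gg.classpath need to match the Kafka version in use, a quick sanity check can help. The Python sketch below pulls the version out of the directory name in the classpath shown above and compares it with the versions this deck names as certified. It assumes the directory follows the `kafka_<scala version>-<kafka version>` naming pattern; the helper name is hypothetical.

```python
import re

# Versions named as certified with GoldenGate in this deck.
CERTIFIED = {"0.8.2.0", "0.8.2.1"}


def kafka_version_from_classpath(classpath):
    """Extract the Kafka version from a kafka_<scala>-<version> directory name.

    Returns None when no such directory name is present.
    """
    match = re.search(r"kafka_\d+\.\d+-(\d+(?:\.\d+)+)", classpath)
    return match.group(1) if match else None


classpath = "dirprm/:/u01/kafka/kafka_2.10-0.8.2.1/libs/*:"
version = kafka_version_from_classpath(classpath)
is_certified = version in CERTIFIED
```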

  • GoldenGate and Kafka - Kafka Producer Configuration

    Access to the Kafka producer configuration parameters

    More on this later

  • GoldenGate and Kafka - Startup

    Create a topic in Kafka

    Add Replicat process group to GoldenGate on target

    Start Kafka console consumer

    Start GoldenGate Extract/Pump on source, Replicat on target

  • GoldenGate and Kafka Integration Complete!

  • Schemas

    Schema automatically created

    - Stored in /dirdef directory
    - Based on gg.handler.kafkahandler.format setting

  • GoldenGate Big Data Adapter Challenges

    GoldenGate could be a single point of failure

    - Kafka is a fault-tolerant, distributed system

    Source transactions may end up larger than expected

    - max.request.size

    Need for speed?

    - batch.size
    - linger.ms
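
To make the size concern concrete, here is a small Python sketch that checks a payload against `max.request.size`. The numbers are what I understand to be the producer defaults for the Kafka versions discussed here (treat them as assumptions and check the documentation for your version), and the check ignores per-record overhead, so it is an approximation. The function name and sample payloads are hypothetical.

```python
# Assumed producer defaults: sizes in bytes, linger in milliseconds.
producer_config = {
    "max.request.size": 1048576,  # largest request the producer will send
    "batch.size": 16384,          # per-partition batch buffer size
    "linger.ms": 0,               # how long to wait for a batch to fill
}


def fits_in_request(payload, config):
    """Return True if the payload is within the producer's max.request.size."""
    return len(payload) <= config["max.request.size"]


small_tx = b"x" * 10_000      # a typical small transaction payload
large_tx = b"x" * 2_000_000   # a bulk update sent as one message

small_ok = fits_in_request(small_tx, producer_config)
large_ok = fits_in_request(large_tx, producer_config)
```

A large source transaction sent as a single message in transaction mode is the case most likely to exceed the limit. Raising `batch.size` and `linger.ms` trades a little latency for larger, more efficient batches.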

  • Why GoldenGate with Kafka?

    GoldenGate

    - is non-invasive
    - has checkpoints for recovery
    - moves data quickly

  • In conclusion

    The new data challenge, not quite as challenging with Kafka

    - Kafka as the raw data reservoir?

  • Questions?

    Websites

    - kafka.apache.org
    - rittmanmead.com/blog

    Contact

    - info@rittmanmead.com
    - michael.rainey@rittmanmead.com

    Twitter

    - @rittmanmead
    - @apachekafka
    - @mRainey
