Oracle GoldenGate and Apache Kafka: A Deep Dive Into Real-Time Data Streaming

Post on 09-Jan-2017


info@rittmanmead.com www.rittmanmead.com @rittmanmead

Michael Rainey | Collaborate 16

Oracle GoldenGate and Apache Kafka: A Deep Dive Into Real-Time Data Streaming


Introduction

• Michael Rainey - Data Integration Practice Lead - America
  - Oracle Data Integration expertise
  - Blog: http://ritt.md/mRainey
  - Oracle ACE
  - Twitter: @mRainey

About Rittman Mead


• World’s leading specialist partner for technical excellence, solutions delivery and innovation in Oracle Data Integration, Business Intelligence, Analytics and Big Data

• Providing our customers targeted expertise; we are a company that doesn’t try to do everything… only what we excel at

• 70+ consultants worldwide including 1 Oracle ACE Director and 3 Oracle ACEs, offering training courses, global services, and consulting

• Founded on the values of collaboration, learning, integrity and getting things done

Unlock the potential of your organization’s data

• Comprehensive service portfolio designed to support the full lifecycle of any analytics solution


• Visual Redesign
• Business User Training
• Ongoing Support
• Engagement Toolkit

Average user adoption for BI platforms is below 25%

Rittman Mead’s User Engagement Service can help

More info: http://ritt.md/ue


Today’s New Data Challenge


Data Integration Today


Typical Example - Marketing

• Financial data stored in RDBMS
• Social media data, web logs, Google Analytics, etc., all in various formats
• Bring it all together for analysis
  ‣ Marketing campaign effect on sales

Relational Data Replication - Oracle GoldenGate


Oracle GoldenGate for Big Data (Then)


Oracle GoldenGate for Big Data (Now)


Streaming Data - Apache Kafka


“Publish-subscribe messaging rethought as a distributed commit log”


Streaming Data - Apache Kafka


Kafka - How is it used?

• Pure Event Streams
• System Metrics
• Derived Streams
• Hadoop Data Loads / Data Publishing
• Application Logs
• Database Changes
  - Log Compaction
  - Data cleansing
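Log compaction, listed above under Database Changes, keeps at least the most recent record for each key. The following is a minimal Python sketch of that idea only, not of Kafka's actual implementation (which, among other things, retains delete markers for a period before dropping them). The `compact` function and the sample keys are hypothetical.

```python
def compact(log):
    """Keep only the latest value per key, ordered by the offset of the last write.

    `log` is a list of (key, value) pairs in offset order. A value of None
    models a delete marker (tombstone) and removes the key from the result.
    """
    latest = {}
    for offset, (key, value) in enumerate(log):
        # Later writes overwrite earlier ones for the same key.
        latest[key] = (offset, value)
    survivors = sorted(latest.items(), key=lambda item: item[1][0])
    return [(key, value) for key, (_offset, value) in survivors if value is not None]


# Illustrative change stream; the keys and values are made up.
changes = [
    ("order:1", "NEW"),
    ("order:2", "NEW"),
    ("order:1", "SHIPPED"),
    ("order:2", None),  # tombstone: order 2 was deleted
]
compacted = compact(changes)
```

For a stream of database changes this means a consumer reading the compacted topic from the start still sees the current state of every surviving row.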

Let’s Jump Right In

• An example…near and dear to my heart
  One single view of the Oracle Data Integrator logs!
  - Oracle Data Integrator session logs stored in the repository
  - ODI Agent logs are text-based log files
  - To see the full picture of your ODI environment, they must be combined
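What "combined" means here can be sketched as merging two time-ordered streams into one timeline. The Python below is only an illustration under stated assumptions: the tuple layout, the `combine` helper, and the sample events are hypothetical and do not reflect the real ODI repository schema or agent log format.

```python
import heapq


def combine(session_events, agent_events):
    """Merge two event streams, each already sorted by timestamp, into one timeline."""
    return list(heapq.merge(session_events, agent_events, key=lambda event: event[0]))


# Hypothetical events as (ISO timestamp, source, message) tuples.
session_events = [
    ("2016-04-10T10:00:01", "repository", "session 101 started"),
    ("2016-04-10T10:00:09", "repository", "session 101 finished"),
]
agent_events = [
    ("2016-04-10T10:00:00", "agent", "agent received execution request"),
    ("2016-04-10T10:00:05", "agent", "agent connected to work repository"),
]
timeline = combine(session_events, agent_events)
```

In the architecture described in this talk, both streams land in Kafka first; the merge above only shows why a shared timestamp makes the single view possible.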

oracle.com/technetwork/database/bigdata-appliance/downloads


Oracle GoldenGate for Big Data - we talked about this…


ODI Agent Logs to Kafka via Logstash


Extract from the ODI Repository with GoldenGate 12c

• Prepare the database
• Set up GoldenGate for Oracle Database
  - Install and configure
• Set up Manager, Extract and Pump parameter files
• Add Extract and Pump process groups
• Start!

Prepare the Database for GoldenGate Extract - OGG User


Prepare the Database for GoldenGate Extract - Logging


Add Table Supplemental Logging


GoldenGate Manager Parameter File - Source


GoldenGate Extract Parameter File


GoldenGate Pump Parameter File


Add Extract and Pump Process Groups


Stream ODI Agent Logs to Kafka via Logstash

• Application log processing is a standard use for Kafka
  - Many approaches to extract logs
• Logstash
  - Part of the Elastic (formerly ELK) stack
  - Robin Moffatt blogged -> http://ritt.md/kafka-elk
  - Producer configuration for Kafka

Logstash to Kafka - Setup and Startup

• Start up Zookeeper
  - Already installed on Big Data Lite
• Set Kafka server.properties
  - Broker ID
  - Number of partitions
  - Log retention period
  - Zookeeper connection
• Start Kafka
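The four server.properties settings listed above correspond to the broker properties `broker.id`, `num.partitions`, `log.retention.hours` and `zookeeper.connect`. The Python sketch below renders them in properties-file form. The values mirror the sample file shipped with Kafka and are assumptions, not the settings used in the talk; `render_properties` is a hypothetical helper.

```python
def render_properties(settings):
    """Render a dict as key=value lines, the format used by server.properties."""
    return "".join(f"{key}={value}\n" for key, value in settings.items())


# Assumed values for a single-broker sandbox; adjust for a real environment.
server_settings = {
    "broker.id": 0,                          # Broker ID
    "num.partitions": 1,                     # Default number of partitions per topic
    "log.retention.hours": 168,              # Log retention period (7 days)
    "zookeeper.connect": "localhost:2181",   # Zookeeper connection
}
server_properties = render_properties(server_settings)
```

Each broker in a cluster needs its own `broker.id`, so that value is the one to change first when more brokers are added.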

Setup Logstash Configuration File


• Configuration File

• Start Logstash


ODI Agent Logs to Kafka!


• Start the Kafka Console Consumer - delivered with Kafka

• Start the ODI Agent and…messages!


GoldenGate Transactions to Kafka


Oracle GoldenGate for Big Data

• Kafka one of many handlers
  - HDFS, HBase, Flume
• Pluggable Formatters
  - Convert trail file transactions to alternate format
  - Avro, delimited text, JSON, XML
• Metadata Provider
  - Handles mapping of source to target columns that differ in structure/name
  - Similar to SOURCEDEF file in GoldenGate
  - Avro or Hive

Oracle GoldenGate for Big Data - Kafka Handler

• Standard GoldenGate Extract / Pump processes
  - We just set this up
• Replicat parameter file & process group
• Kafka Handler configuration
• Kafka Producer properties
  - Note: Kafka 0.8.2.0 & 0.8.2.1 are certified with GoldenGate
    • But, I’ve heard 0.9.0+ works…

GoldenGate and Kafka…Prerequisites

• Zookeeper & Kafka up and running
• Add topic to broker up front vs dynamically
• Kafka Handler must have access to broker server
• Kafka libraries must match Kafka version

GoldenGate and Kafka…Replicat Parameters


GoldenGate and Kafka…Kafka Handler Properties


• Properties allow communication between the GoldenGate adapter and Kafka


• gg.handlerlist = kafkahandler
• gg.handler.kafkahandler.type = kafka
• gg.handler.kafkahandler.KafkaProducerConfigFile = kafka_producer.properties
• gg.handler.kafkahandler.TopicName = odirepo
  - Kafka topic name
• gg.handler.kafkahandler.format = json
  - Pluggable Formatter
  - Avro recommended for Kafka…
• gg.handler.kafkahandler.BlockingSend = true|false
• gg.handler.kafkahandler.includeTokens = true|false
• gg.handler.kafkahandler.mode = tx
  - Transaction vs Operation mode


• goldengate.userexit.timestamp = utc
• goldengate.userexit.writers = javawriter
• javawriter.stats.display = TRUE
• javawriter.stats.full = TRUE
• gg.log = log4j
• gg.log.level = INFO
• gg.report.time = 30sec
• gg.classpath = dirprm/:/u01/kafka/kafka_2.10-0.8.2.1/libs/*:
  - Location of the Kafka libraries
• javawriter.bootoptions = -Xmx512m -Xms32m -Djava.class.path=ggjava/ggjava.jar

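The `gg.handler.kafkahandler.mode` property switches between transaction and operation mode. As I understand the handler's documented behaviour, operation mode sends one Kafka message per source operation, while transaction mode concatenates the operations of a source transaction into a single message. The Python sketch below models only that grouping; `build_messages` and the sample payloads are hypothetical and are not the handler's real code or output format.

```python
from itertools import groupby


def build_messages(operations, mode):
    """Turn trail operations into message payloads.

    `operations` is a list of (transaction_id, payload) tuples in trail order.
    """
    if mode == "op":
        # Operation mode: every operation becomes its own message.
        return [payload for _txn, payload in operations]
    if mode == "tx":
        # Transaction mode: consecutive operations sharing a transaction id
        # are concatenated into one message.
        return [
            "".join(payload for _txn, payload in group)
            for _txn, group in groupby(operations, key=lambda op: op[0])
        ]
    raise ValueError(f"unknown mode: {mode!r}")


# Made-up payloads standing in for formatted trail records.
trail = [
    (1, '{"op":"I","id":10}'),
    (1, '{"op":"U","id":10}'),
    (2, '{"op":"D","id":11}'),
]
per_operation = build_messages(trail, "op")
per_transaction = build_messages(trail, "tx")
```

The trade-off is message count against message size: transaction mode produces fewer messages, but a large source transaction yields a correspondingly large message.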

GoldenGate and Kafka…Kafka Producer Configuration


• Access to the Kafka producer configuration parameters

More on this later


GoldenGate and Kafka…Startup

• Create a topic in Kafka

• Add Replicat process group to GoldenGate on target

• Start Kafka console consumer

• Start GoldenGate extract/pump on source, replicat on target


GoldenGate and Kafka Integration Complete!


Schemas…

• Schema automatically created
  - Stored in <ogg_home>/dirdef directory
  - Based on gg.handler.kafkahandler.format setting

GoldenGate Big Data Adapter Challenges

• GoldenGate could be a single point of failure
  - Kafka is a fault-tolerant, distributed system
• Source transactions may end up larger than expected
  - max.request.size
• Need for speed?
  - batch.size
  - linger.ms
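The point about oversized transactions can be made concrete with a rough size check against `max.request.size`, whose producer default is 1048576 bytes. The Python sketch below is an approximation: a real producer request also carries record and protocol overhead, so the true limit is reached slightly earlier. The `oversized` helper and the sample messages are hypothetical.

```python
DEFAULT_MAX_REQUEST_SIZE = 1048576  # Kafka producer default for max.request.size, in bytes


def oversized(messages, max_request_size=DEFAULT_MAX_REQUEST_SIZE):
    """Return the indexes of messages whose UTF-8 size exceeds the limit."""
    return [
        index
        for index, message in enumerate(messages)
        if len(message.encode("utf-8")) > max_request_size
    ]


# A small message and one standing in for a very large source transaction.
messages = ["x" * 100, "y" * 2_000_000]
too_big = oversized(messages)
```

Raising `max.request.size` in the producer properties file handles the large case; `batch.size` and `linger.ms` then trade a little latency for throughput by letting the producer group records before sending.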

Why GoldenGate with Kafka?

• GoldenGate…
  - is non-invasive
  - has checkpoints for recovery
  - moves data quickly

In conclusion

• The new data challenge, not quite as challenging with Kafka
  - Kafka as the raw data reservoir?

Questions?


• Websites
  - kafka.apache.org
  - rittmanmead.com/blog
• Contact
  - info@rittmanmead.com
  - michael.rainey@rittmanmead.com
• Twitter
  - @rittmanmead
  - @apachekafka
  - @mRainey
