A Gentle Introduction of Kafka and Storm
Drew Nelson
{Percona University | Raleigh}
{Open Software Integrators} { www.osintegrators.com} {@osintegrators}
Open Software Integrators
• Open Software Integrators is a Big Data consulting and services company specializing in Hadoop, Cassandra, MongoDB and other NoSQL technologies. OSI focuses on executive strategy, initial install, design and implementation.
• Founded January 2008 by Andrew C. Oliver
• Based in downtown Durham, NC
• Partnered with Hortonworks, MongoDB, DataStax, Cloudera, Couchbase, Cloudbees & Neo Technology
Kafka and Storm
A Gentle Introduction
• What are Kafka and Storm?
• What can they be used for?
• What do they excel at?
What is Apache Kafka?
Kafka is a distributed, partitioned, replicated commit log service.
The Commit Log
An append-only, immutable sequence of records ordered by time.
[Figure: the commit log, running from the first record to the next record written]
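
The commit log idea can be sketched in a few lines of Python. This is a toy in-memory model, not Kafka's storage format; the `CommitLog` class and its `append` and `read` methods are names made up for this illustration. It only shows the two properties from the slide: records are appended at the end, and existing records are never changed, so any reader can replay from any offset.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class CommitLog:
    """Toy append-only log (hypothetical class, for illustration only)."""

    _records: List[bytes] = field(default_factory=list)

    def append(self, record: bytes) -> int:
        """Add a record at the end of the log and return its offset."""
        self._records.append(record)
        return len(self._records) - 1

    def read(self, offset: int, max_records: int = 10) -> List[Tuple[int, bytes]]:
        """Return up to max_records (offset, record) pairs starting at offset."""
        chunk = self._records[offset:offset + max_records]
        return [(offset + i, rec) for i, rec in enumerate(chunk)]


log = CommitLog()
first = log.append(b"first record")
second = log.append(b"next written record")

# Reading does not consume anything, so the same offsets can be replayed later.
replayed = log.read(0)
```

Because there is no update or delete method, the order of offsets is the order in which records were written.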
Kafka is:
● fast
● durable
● distributed
● scalable
Kafka abstractions
● Topic: feeds of messages in categories
● Broker: a host running Kafka
● Producer: a process that publishes messages
● Consumer: a process that pulls messages
● Partition: portion of a topic’s stream of messages
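
To see how the five abstractions fit together, here is a minimal in-memory sketch. It does not use any Kafka client library; the classes `Topic`, `Broker`, `Producer` and `Consumer`, the topic name `page-views`, and the CRC32-based choice of partition are all assumptions made for the example (real Kafka clients use their own partitioner and talk to brokers over the network).

```python
import zlib
from typing import Dict, List, Tuple

Message = Tuple[str, str]  # (key, value)


class Topic:
    """A named feed split into partitions; each partition is an append-only list."""

    def __init__(self, name: str, num_partitions: int) -> None:
        self.name = name
        self.partitions: List[List[Message]] = [[] for _ in range(num_partitions)]


class Broker:
    """Stands in for a host running Kafka; here it only holds topics in memory."""

    def __init__(self) -> None:
        self.topics: Dict[str, Topic] = {}

    def create_topic(self, name: str, num_partitions: int) -> Topic:
        self.topics[name] = Topic(name, num_partitions)
        return self.topics[name]


class Producer:
    """Publishes messages; the same key always lands in the same partition."""

    def __init__(self, broker: Broker) -> None:
        self.broker = broker

    def send(self, topic_name: str, key: str, value: str) -> Tuple[int, int]:
        topic = self.broker.topics[topic_name]
        # Assumption: CRC32 is a stand-in hash, not Kafka's actual partitioner.
        partition = zlib.crc32(key.encode("utf-8")) % len(topic.partitions)
        topic.partitions[partition].append((key, value))
        return partition, len(topic.partitions[partition]) - 1


class Consumer:
    """Pulls messages and remembers its own offset for each partition."""

    def __init__(self, broker: Broker, topic_name: str) -> None:
        self.topic = broker.topics[topic_name]
        self.offsets = [0] * len(self.topic.partitions)

    def poll(self, partition: int, max_messages: int = 10) -> List[Message]:
        start = self.offsets[partition]
        batch = self.topic.partitions[partition][start:start + max_messages]
        self.offsets[partition] += len(batch)
        return batch


broker = Broker()
broker.create_topic("page-views", num_partitions=3)
producer = Producer(broker)
p1, o1 = producer.send("page-views", "user-1", "/home")
p2, o2 = producer.send("page-views", "user-1", "/cart")

consumer = Consumer(broker, "page-views")
batch = consumer.poll(p1)
```

The point of the design is that the consumer pulls at its own pace and tracks its own position, while the partition itself is never modified by reads.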
What Kafka is used for:
Enterprise-grade event streaming
What Kafka is not good at:
Doing anything other than being a commit log.
What is Apache Storm?
Storm is a distributed, real-time computation system.
Stream processing
● AKA Event Sourcing
● Command and Query Responsibility Segregation
● Complex Event Processing
● etc.
Several processes fall into the domain of stream processing.
What Storm does
● Simple API
● Guaranteed data processing
● Fault tolerant
● Scalable
● Usable with any language
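
"Guaranteed data processing" can be pictured as a source that keeps every emitted message until it is acknowledged, and replays it if processing fails. The sketch below is a plain-Python simulation of that idea, not Storm code; `ReplayingSpout` and its method names are hypothetical, chosen to mirror the ack/fail callbacks a Storm spout receives. The guarantee shown is at-least-once: a failed message is processed again, so it may arrive out of its original order.

```python
from collections import deque
from typing import Deque, Dict, List, Optional, Set, Tuple


class ReplayingSpout:
    """Hypothetical source that holds each message until it is acked."""

    def __init__(self, messages: List[str]) -> None:
        self._queue: Deque[Tuple[int, str]] = deque(enumerate(messages))
        self.pending: Dict[int, str] = {}

    def next_tuple(self) -> Optional[Tuple[int, str]]:
        """Emit the next message, remembering it as pending."""
        if not self._queue:
            return None
        msg_id, payload = self._queue.popleft()
        self.pending[msg_id] = payload
        return msg_id, payload

    def ack(self, msg_id: int) -> None:
        """Processing succeeded: forget the message."""
        self.pending.pop(msg_id, None)

    def fail(self, msg_id: int) -> None:
        """Processing failed: put the message back on the queue for replay."""
        self._queue.append((msg_id, self.pending.pop(msg_id)))


spout = ReplayingSpout(["a", "b", "c"])
failed_once: Set[int] = set()
processed: List[str] = []

while True:
    item = spout.next_tuple()
    if item is None:
        break
    msg_id, payload = item
    # Simulate a downstream failure the first time "b" is seen.
    if payload == "b" and msg_id not in failed_once:
        failed_once.add(msg_id)
        spout.fail(msg_id)
    else:
        processed.append(payload)
        spout.ack(msg_id)
```

Every message ends up in `processed` even though one attempt failed, which is the behaviour the bullet describes.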
Storm abstractions
Three abstractions:
● Spouts
● Bolts
● Topology
[Figure: a topology in which spouts feed a graph of bolts]
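
The three abstractions can be mimicked with ordinary Python generators. This is a single-process sketch, not the Storm API: `sentence_spout`, `split_bolt`, `CountBolt` and `run_topology` are names invented for the example, and the topology here is a simple chain, whereas a Storm topology is a graph that runs in parallel across a cluster.

```python
from collections import Counter
from typing import Callable, Iterable, Iterator, List, Tuple


def sentence_spout() -> Iterator[str]:
    """Spout: the source of the stream. A real spout would not run out of data."""
    yield from [
        "kafka feeds storm",
        "storm processes streams",
        "kafka stores streams",
    ]


def split_bolt(stream: Iterable[str]) -> Iterator[str]:
    """Bolt: consumes sentences and emits one tuple per word."""
    for sentence in stream:
        yield from sentence.split()


class CountBolt:
    """Bolt with state: keeps a running count for each word it sees."""

    def __init__(self) -> None:
        self.counts: Counter = Counter()

    def process(self, stream: Iterable[str]) -> Iterator[Tuple[str, int]]:
        for word in stream:
            self.counts[word] += 1
            yield word, self.counts[word]


def run_topology(spout: Callable[[], Iterator], bolts: List[Callable]) -> list:
    """Topology: wires the spout's output through each bolt in turn."""
    stream = spout()
    for bolt in bolts:
        stream = bolt(stream)
    return list(stream)


counter = CountBolt()
emitted = run_topology(sentence_spout, [split_bolt, counter.process])
```

After the run, `counter.counts` holds the word totals and `emitted` holds every intermediate (word, running count) tuple the last bolt produced.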
Storm processes
Processes:
● UI
● Nimbus
● Supervisor
● Worker
[Figure: Storm cluster layout showing the Web UI, Nimbus, Zookeeper, and two Supervisors, each with two Workers]
Storm parallelism model
● Worker process
● Executors
● Tasks
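
The three levels nest: a worker process runs executors (threads), and each executor runs one or more tasks. The sketch below lays that nesting out as a dictionary. The `assign` function and its round-robin placement are assumptions for illustration only; Storm's real scheduler decides placement itself and is not modelled here.

```python
from typing import Dict, List


def assign(num_workers: int, num_executors: int, num_tasks: int) -> Dict[int, Dict[int, List[int]]]:
    """Return {worker: {executor: [tasks]}} using simple round-robin placement.

    Hypothetical helper: it only illustrates the nesting of the three levels.
    """
    if num_workers < 1 or num_executors < 1 or num_tasks < num_executors:
        raise ValueError("need at least one worker, one executor, and one task per executor")
    layout: Dict[int, Dict[int, List[int]]] = {w: {} for w in range(num_workers)}
    for executor in range(num_executors):
        worker = executor % num_workers
        # Each executor takes every num_executors-th task.
        layout[worker][executor] = [
            t for t in range(num_tasks) if t % num_executors == executor
        ]
    return layout


layout = assign(num_workers=2, num_executors=4, num_tasks=8)
all_tasks = sorted(t for execs in layout.values() for tasks in execs.values() for t in tasks)
```

With two workers, four executors and eight tasks, each worker ends up with two executors and each executor with two tasks.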
Use Case: Security
Security customer analytics platform
● Pulling data from customer sites
● Placing data in a SQL database
● Performing analysis to spot anomalous traffic
● Pushing results back to the client to block traffic sources
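
The talk does not say how the anomaly analysis worked, so the following is only a guess at the general shape of such a step: count requests per source inside a sliding time window and flag any source that exceeds a limit. The function name `find_noisy_sources`, the window length, the threshold and the sample addresses are all invented for the example.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Iterable, List, Tuple


def find_noisy_sources(
    events: Iterable[Tuple[float, str]],
    window_seconds: float = 10.0,
    max_requests: int = 3,
) -> List[str]:
    """Flag sources with more than max_requests events inside any sliding window.

    Hypothetical rule for illustration; events are (timestamp, source) pairs
    assumed to arrive in time order.
    """
    recent: Dict[str, Deque[float]] = defaultdict(deque)
    flagged: List[str] = []
    for timestamp, source in events:
        times = recent[source]
        times.append(timestamp)
        # Drop timestamps that have fallen out of the window.
        while times and timestamp - times[0] > window_seconds:
            times.popleft()
        if len(times) > max_requests and source not in flagged:
            flagged.append(source)
    return flagged


events = [
    (0.0, "10.0.0.1"),
    (1.0, "10.0.0.2"),
    (2.0, "10.0.0.1"),
    (3.0, "10.0.0.1"),
    (4.0, "10.0.0.1"),
    (30.0, "10.0.0.2"),
]
blocked = find_noisy_sources(events)
```

Because each event is handled as it arrives, a check like this fits a streaming pipeline, in contrast to loading everything into a database first and analysing it in batches.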
Original system, mean turnaround time: 4.5 hours
Storm / Kafka solution, maximum processing time: 2.6 seconds
Links
Kafka: http://kafka.apache.org/
Storm: http://storm.apache.org/