2015-04-15 | Apache Kafka (Vienna Scala User Group)
Dominik Gruber, @the_dom
Scala Vienna User Group – April 15, 2015
Apache Kafka
Dominik Gruber • @the_dom • Apache Kafka
Apache Kafka
• Originally developed by LinkedIn
• Open-sourced in 2011
• Written in Scala
• Clients for every popular language
• Version 0.8.2.1 (current at the time of the talk)
• http://kafka.apache.org
Users
• Everyone, really …
• LinkedIn, Yahoo!, Twitter, Netflix, Square, Spotify, Pinterest, Uber, Goldman Sachs, Tumblr, PayPal, Box, Airbnb, Mozilla, Cisco, Foursquare, …
• https://cwiki.apache.org/confluence/display/KAFKA/Powered+By
Apache Kafka
“A high throughput distributed messaging system.”
Apache Kafka
“Apache Kafka is publish-subscribe messaging rethought as a distributed commit log.”
Apache Kafka
“Kafka is a distributed, partitioned, replicated commit log service.”
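The "commit log" in these descriptions can be pictured as an append-only sequence in which every record gets a monotonically increasing offset, and each subscriber keeps its own read position. The following is only a toy in-memory sketch of that idea; the `CommitLog` class and its methods are hypothetical names for illustration, not Kafka's API or implementation.

```python
class CommitLog:
    """Toy in-memory append-only log (hypothetical, not Kafka's API)."""

    def __init__(self):
        self._records = []

    def append(self, record):
        """Append a record and return its offset (its position in the log)."""
        self._records.append(record)
        return len(self._records) - 1

    def read(self, offset, max_records=10):
        """Return up to max_records records starting at the given offset."""
        return self._records[offset:offset + max_records]


log = CommitLog()
offsets = [log.append(message) for message in ("a", "b", "c")]

# Two independent subscribers, each tracking its own position in the log.
fast_position, slow_position = 0, 0

fast_batch = log.read(fast_position)
fast_position += len(fast_batch)

slow_batch = log.read(slow_position, max_records=1)
slow_position += len(slow_batch)
```

Because reading never removes anything from the log, the slow subscriber can catch up later simply by reading again from its own position.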
Claims
• Fast
• Scalable
• Durable
• Distributed by Design
Claims
• Fast
  • A single Kafka broker can handle hundreds of megabytes of reads and writes per second from thousands of clients.
• Scalable
• Durable
• Distributed by Design
Claims
• Fast
• Scalable
  • Data streams are partitioned and spread over a cluster of machines to allow data streams larger than the capability of any single machine (…)
• Durable
• Distributed by Design
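One common way to spread a stream over partitions is to hash each message key and take the result modulo the partition count, so that all messages with the same key land in the same partition. The sketch below illustrates that general technique only; `partition_for` and `assign` are hypothetical helpers, and `zlib.crc32` is a stand-in hash chosen for the example, not the hash function Kafka's partitioners use.

```python
import zlib


def partition_for(key: bytes, num_partitions: int) -> int:
    """Map a key to a partition index (CRC32 is an assumption for this sketch)."""
    return zlib.crc32(key) % num_partitions


def assign(keys, num_partitions):
    """Group keys by the partition they hash to."""
    partitions = {p: [] for p in range(num_partitions)}
    for key in keys:
        partitions[partition_for(key, num_partitions)].append(key)
    return partitions


keys = [f"user-{i}".encode() for i in range(1000)]
partitions = assign(keys, num_partitions=4)

# Each partition could live on a different machine; no single machine
# has to hold or serve the whole stream.
sizes = {p: len(members) for p, members in partitions.items()}
```

Note that changing the partition count in a scheme like this changes where keys map, which is one reason the number of partitions is worth choosing deliberately.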
Claims
• Fast
• Scalable
• Durable
  • Messages are persisted on disk and replicated within the cluster to prevent data loss. Each broker can handle terabytes of messages without performance impact.
• Distributed by Design
Claims
• Fast
• Scalable
• Durable
• Distributed by Design
  • Kafka has a modern cluster-centric design that offers strong durability and fault-tolerance guarantees.
Design
http://kafka.apache.org/documentation.html
Design
“The performance of linear writes on a JBOD configuration with six 7200rpm SATA RAID-5
array is about 600MB/sec but the performance of random writes is only about 100k/sec—a
difference of over 6000X.”
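The "over 6000X" figure can be checked with a line of arithmetic. The sketch below assumes that "100k/sec" means 100 kilobytes per second and that both figures use binary units (1 MB = 1024 KB); neither assumption is stated in the quote. With decimal units the ratio would be exactly 6000.

```python
# Assumption: binary units, and "100k/sec" read as 100 KB/sec.
KB = 1024
MB = 1024 * KB

linear_bytes_per_sec = 600 * MB   # sequential (linear) writes
random_bytes_per_sec = 100 * KB   # random writes

ratio = linear_bytes_per_sec / random_bytes_per_sec
print(f"linear writes are about {ratio:.0f}x faster")
```

The gap is the motivation for a log-structured design: appending to the end of a file keeps disk access sequential.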
Design
http://kafka.apache.org/documentation.html
Use Cases
“We designed Kafka to be able to act as a unified platform for handling all the real-time data feeds a large company might have.”
Use Cases
• Messaging
• Website Activity Tracking
• Metrics
• Log Aggregation
• Stream Processing
• Event Sourcing
• Commit Log
Demo
Q & A
Further reading
• http://engineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should-know-about-real-time-datas-unifying
• http://blog.confluent.io/2015/04/07/hands-free-kafka-replication-a-lesson-in-operational-simplicity
• http://www.slideshare.net/wangxia5/netflix-kafka
• https://metamarkets.com/2015/simplicity-stability-and-transparency-how-samza-makes-data-integration-a-breeze