Apache Kafka at LinkedIn
Jay Kreps
Introduction to Apache Kafka
The Plan
1. What is Apache Kafka?
2. Kafka and Data Integration
3. Kafka and Stream Processing
Apache Kafka
A brief history of Apache Kafka
Characteristics
• Scalability of a filesystem
  – Hundreds of MB/sec/server throughput
  – Many TB per server
• Guarantees of a database
  – Messages strictly ordered
  – All data persistent
• Distributed by default
  – Replication
  – Partitioning model
Kafka is about logs
What is a log?
Logs: pub/sub done right
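To make the log idea concrete, here is a minimal in-memory sketch, not Kafka's implementation. The `Log` class and the subscriber names are hypothetical. It shows the two properties the slides rely on: records are only ever appended and are addressed by offset, and each subscriber tracks its own position independently.

```python
class Log:
    """Append-only sequence of records, each identified by its offset."""

    def __init__(self):
        self._records = []

    def append(self, record):
        """Add a record at the end of the log and return its offset."""
        self._records.append(record)
        return len(self._records) - 1

    def read(self, offset, max_records=10):
        """Return up to max_records records starting at the given offset."""
        return self._records[offset:offset + max_records]


log = Log()
for event in ["signup", "click", "purchase"]:
    log.append(event)

# Hypothetical subscribers: each keeps its own offset, so a slow reader
# does not hold back a fast one and nothing is deleted when it is read.
positions = {"search-indexer": 0, "hadoop-loader": 2}
seen = {name: log.read(offset) for name, offset in positions.items()}
```

Because reading never removes data, adding another subscriber is just another offset into the same log.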
Partitioning
Nodes Host Many Partitions
Producers Balance Load
Consumers Divide Up Partitions
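The partitioning slides can be sketched roughly as follows. This is an illustration under assumptions, not Kafka's actual partitioner or consumer assignment protocol: the partition count, the CRC32 hash, the round-robin assignment, and the function names are all chosen here for the example.

```python
import zlib

NUM_PARTITIONS = 6  # hypothetical partition count for one topic


def partition_for(key, num_partitions=NUM_PARTITIONS):
    """Producer side: map a message key to a partition with a stable hash."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def assign_partitions(consumers, num_partitions=NUM_PARTITIONS):
    """Consumer side: split the partitions round-robin across a group."""
    ordered = sorted(consumers)
    assignment = {consumer: [] for consumer in ordered}
    for partition in range(num_partitions):
        owner = ordered[partition % len(ordered)]
        assignment[owner].append(partition)
    return assignment


# Messages with the same key always land in the same partition,
# which is what preserves their relative order.
same_partition = partition_for("user-42") == partition_for("user-42")

# Each partition is owned by exactly one consumer in the group.
assignment = assign_partitions(["consumer-a", "consumer-b", "consumer-c"])
```

Each partition has a single owner within the group, so consumers can read in parallel without coordinating on individual messages.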
End-to-End
Kafka At LinkedIn
• 175 TB of in-flight log data per colo
• Replicated to each datacenter
• Tens of thousands of data producers
• Thousands of consumers
• 7 million messages written/sec
• 35 million messages read/sec
• Hadoop integration
Performance
• Producer (3x replication):
  – Async: 786,980 records/sec (75.1 MB/sec)
  – Sync: 421,823 records/sec (40.2 MB/sec)
• Consumer:
  – 940,521 records/sec (89.7 MB/sec)
• End-to-end latency:
  – 2 ms (median)
  – 14 ms (99.9th percentile)
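As a quick sanity check on the throughput figures, the records/sec and MB/sec numbers can be combined to estimate the record size they imply. The sketch below assumes that "MB" on the slide means 2**20 bytes; that interpretation is an assumption, not something the slide states.

```python
MIB = 1024 * 1024  # assumption: the slide's "MB" means 2**20 bytes

# (records per second, MB per second) as reported on the slide
results = {
    "producer async": (786_980, 75.1),
    "producer sync": (421_823, 40.2),
    "consumer": (940_521, 89.7),
}

# Implied average record size = bytes per second / records per second
bytes_per_record = {
    name: mb_per_sec * MIB / records_per_sec
    for name, (records_per_sec, mb_per_sec) in results.items()
}

for name, size in bytes_per_record.items():
    print(f"{name}: {size:.1f} bytes/record")
```

Under that assumption all three figures work out to roughly 100 bytes per record, so the numbers describe small messages rather than large payloads.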
Data Integration
Maslow’s Hierarchy for Data
New Types of Data
• Database data
  – Users, products, orders, etc.
• Events
  – Clicks, impressions, pageviews, etc.
• Application metrics
  – CPU usage, requests/sec
• Application logs
  – Service calls, errors
New Types of Systems
• Live Stores
  – Voldemort
  – Espresso
  – Graph
  – OLAP
  – Search
  – InGraphs
• Offline
  – Hadoop
  – Teradata
Bad
Good
Example: User views job
Comparing Data Transfer Mechanisms
Stream Processing
Stream processing is a generalization of batch processing
Stream Processing = Logs + Jobs
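One way to read "Logs + Jobs" is that a job is code that reads an input log from a saved position, transforms each record, and appends results to an output log. The sketch below uses plain Python lists as stand-ins for logs. The `run_job` function, the `only_job_pages` filter, and the sample events are hypothetical, and this is not the Samza API.

```python
def run_job(input_log, output_log, checkpoint, transform):
    """Process records at or after the checkpoint; return the new checkpoint."""
    for offset in range(checkpoint, len(input_log)):
        result = transform(input_log[offset])
        if result is not None:  # None means the record is filtered out
            output_log.append(result)
    return len(input_log)


def only_job_pages(event):
    """Hypothetical transform: keep only views of job pages."""
    return event if event["page"].startswith("/jobs/") else None


page_views = [
    {"user": "a", "page": "/jobs/1"},
    {"user": "b", "page": "/home"},
]
job_views = []

# First run processes everything written so far.
checkpoint = run_job(page_views, job_views, 0, only_job_pages)

# Later input is picked up from the checkpoint, not from the start.
page_views.append({"user": "c", "page": "/jobs/2"})
checkpoint = run_job(page_views, job_views, checkpoint, only_job_pages)
```

Run once over a fixed input this behaves like a batch job; run repeatedly from the checkpoint it behaves like a stream job, which is the sense in which stream processing generalizes batch processing.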
Examples
• Monitoring
• Security
• Content processing
• Recommendations
• Newsfeed
• ETL
Frameworks Can Help
Samza Architecture
Log-centric Architecture
Kafka: http://kafka.apache.org
Samza: http://samza.incubator.apache.org
Log Blog: http://linkd.in/199iMwY
Benchmark: http://t.co/40fkKJvanx
Me: http://www.linkedin.com/in/jaykreps
@jaykreps