kafka short

Post on 27-Jan-2015

TRANSCRIPT

1. Kafka for BigData Processing. Yanai Franchi, Tikal

2. Find Hot Places
3. (no text in transcript)
4. Let's Develop Gogobot Checkins Heat-Map (diagram labels: gogobot checkin; Heat Map Service)
5. Key Notes
   • Collector Service: collects checkins as text addresses. We need to use a GeoLocation Service.
   • Upon elapsed interval, the last locations list will be displayed as a Heat-Map in the GUI.
   • Web-scale service: 10Ks of checkins/second all over the world (imaginary, but let's do it for the exercise).
6. Heat-Map Context (diagram labels: Text-Address Checkins; Heat-Map Service; Gogobot System; Gogobot Micro Service, three instances; Geo Location Service; Get-GeoCode(Address); Heat-Map; Last Interval Locations)
7. Tons of Addresses Arriving Every Second
8. First Reaction...
9. (diagram labels: Checkin HTTP Reactor; Checkins Topic; Storm Heat-Map Topology; Hotzones Topic; Web App; Push via WebSocket; Publish Checkins; HDFS; Checkin HTTP Firehose)
10. (no text in transcript)
11. They are all good, but not for all use cases
12. Kafka: a little introduction
13. (no text in transcript)
14. Why?
15. LinkedIn Original Architecture
16. (no text in transcript)
17. What LinkedIn Wanted...
18. Looks Familiar: Use Messaging (e.g. JMS, RabbitMQ)
19-22. (no text in transcript)
23. It Didn't Scale...
24. Paradigm Change: Do NOT track message consumption
25-27. (no text in transcript)
28. Stateless Broker & Doesn't Fear the File System
29. Topics: logical collections of partitions (the physical files). A broker contains some of the partitions for a topic.
30. A partition is consumed by exactly one consumer in each group
31. Distributed & Fault-Tolerant
32-44. Cluster diagram shown across successive slides (labels: Broker 1, Broker 2, Broker 3, ZooKeeper, Consumer 1, Consumer 2, Producer 1, Producer 2). Slides 33-38 also show Broker 4; slides 39-41 show three brokers again; slides 42-44 no longer show Consumer 2.
45. Performance Benchmark: 1 Broker, 1 Producer, 1 Consumer
46-47. (no text in transcript)
48. LinkedIn Kafka Performance (2012)
   • 8 nodes per datacenter
   • ~20 GB RAM available for Kafka
   • 6 TB storage, RAID 10, basic SATA drives
   • 10 billion messages/day
   • Sustained peak: 172,000 messages/second written, 950,000 messages/second read
   • 367 topics
   • 40 real-time consumers
   • Many ad hoc consumers
   • 9.5 TB log retained (~6 days)
   • End-to-end delivery time: a few seconds
49. Thanks
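
The consumption model from slides 24-30 (the broker does not track what was consumed, a topic is a set of partitions, and each partition is read by exactly one consumer in a group) can be sketched with a toy in-memory model. This is only an illustration under stated assumptions, not Kafka's client API: the names `Topic`, `assign`, `poll` and `group_offsets` are hypothetical, the CRC32 key hash and round-robin assignment are stand-ins for whatever Kafka actually does, and the "checkins" topic with its addresses is made-up sample data.

```python
from zlib import crc32


class Topic:
    """Toy in-memory topic: one append-only list per partition."""

    def __init__(self, name, num_partitions):
        self.name = name
        self.partitions = [[] for _ in range(num_partitions)]

    def append(self, key, value):
        # Pick a partition by hashing the key (illustrative choice only).
        p = crc32(key.encode("utf-8")) % len(self.partitions)
        self.partitions[p].append(value)
        return p, len(self.partitions[p]) - 1  # (partition, offset)

    def read(self, partition, offset, max_messages=100):
        # No per-consumer state here: the caller says where to start reading.
        return self.partitions[partition][offset:offset + max_messages]


def assign(num_partitions, consumers):
    """Round-robin: each partition is owned by exactly one consumer of the group."""
    ordered = sorted(consumers)
    owners = {c: [] for c in ordered}
    for p in range(num_partitions):
        owners[ordered[p % len(ordered)]].append(p)
    return owners


def poll(topic, owned_partitions, offsets):
    """Read unread messages from the owned partitions and advance the offsets."""
    batch = []
    for p in owned_partitions:
        start = offsets.get(p, 0)
        messages = topic.read(p, start)
        offsets[p] = start + len(messages)
        batch.extend(messages)
    return batch


topic = Topic("checkins", num_partitions=4)
for i in range(8):
    topic.append(f"user-{i}", f"address-{i}")

group_offsets = {}  # kept on the consumer-group side, not inside Topic
assignment = assign(4, ["consumer-1", "consumer-2"])
first = []
for consumer, owned in assignment.items():
    first.extend(poll(topic, owned, group_offsets))

# Consumer 2 leaves (as in slides 42-44): the remaining consumer takes every partition.
assignment = assign(4, ["consumer-1"])
topic.append("user-8", "address-8")
second = poll(topic, assignment["consumer-1"], group_offsets)

print(sorted(first))
print(second)  # ['address-8']
```

Because reading never deletes anything, calling `topic.read(p, 0)` after the polls above still returns every message in partition `p`. Keeping the position with the consumers lets the toy broker stay stateless, and a consumer that takes over a partition simply resumes from the group's stored offset.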