Kafka Audit - Kafka Meetup - January 27th, 2015

Post on 15-Jul-2015

TRANSCRIPT

<p>Kafka Audit. January 27th, 2015, LinkedIn Meetup. Speaker notes: SRE at LinkedIn; what I do.</p>
<p>Kafka + LinkedIn. Speaker notes: Kafka Audit is what this talk is about; how Kafka is used at LinkedIn.</p>
<p>Basic message. Diagram: Producer, Kafka Cluster Local; message Mp. Mp = {Plain old Kafka message}. Speaker notes: producer tier vs. local tier.</p>
<p>Add aggregate. Diagram: Producer, Kafka Cluster Local, Kafka Cluster Aggregate; message Mp between tiers. Mp = Plain old Kafka message. Speaker notes: producer tier, local tier, aggregate tier. Why?</p>
<p>Add data center. Diagram: Datacenter A and Datacenter B, each with a Kafka Cluster Local and a Kafka Cluster Aggregate; Producer; message Mp. Speaker notes: add another datacenter! It is added to explain why the aggregate tier exists. The next slide removes Datacenter B.</p>
<p>Diagram: Producer, Kafka Cluster Local, Kafka Cluster Aggregate; message Mp. Speaker notes: quick transition to adding the aggregate offline tier.</p>
<p>Aggregate offline. Diagram: Producer, Kafka Cluster Local, Kafka Cluster Aggregate, Kafka Cluster Aggregate; message Mp between tiers. Speaker notes: say all the tiers! The aggregate offline cluster is in a new data center.</p>
<p>Hadoop. Diagram: Producer, Kafka Cluster Local, Kafka Cluster Aggregate, Kafka Cluster Aggregate, Offline processing; message Mp between tiers. Speaker notes: offline processing such as Hadoop; mention the pipeline to deploy data. Pause here; the next slide is about audit messages.</p>
<p>Audit messages. Diagram: Producer, Kafka Cluster; message Ma. Ma = {Plain old Kafka message, Producer creation timestamp, Producer identification string}. Speaker notes: make a few changes to get Audit working. Use a special producer that adds special data to every message.</p>
<p>Monitoring message. Diagram: Producer, Kafka Cluster; messages Ma and Mm. Ma = {Plain old Kafka message, Producer creation timestamp, Producer identification string}. Mm = {Count of messages, The topic this count is for, Tier identification string, Time bucket interval}. Speaker notes: periodically, we send another message. It contains information relevant to the time interval: the count of messages.</p>
<p>Overview of tiers with audit. Diagram: Producer, Kafka Cluster Local, Kafka Cluster Aggregate, Kafka Cluster Aggregate, Offline processing; messages Ma and Mm. Ma = Message with audit data. Mm = Monitoring message.</p>
<p>Single audit consumer. Diagram: the same tiers, plus one Audit Consumer.</p>
<p>Multiple audit consumers. Diagram: the same tiers, plus three Audit Consumers.</p>
<p>Audit app. Diagram: the same tiers and Audit Consumers, plus the Audit App.</p>
<p>UI and DB. Diagram: the same tiers, Audit Consumers and Audit App, plus REST API, Audit MySQL and Audit UI.</p>
<p>Audit interface table. Audit UI, count per tier (for each topic and time window): Producer 123; Local 123; Aggregate 123; Aggregate Offline 123.</p>
<p>Lost messages. Audit UI, count per tier (for each topic and time window): Producer 123; Local 123; Aggregate 119; Aggregate Offline 119. We lost 4 messages between local and aggregate!</p>
<p>Caveats. Audit consumers need to consume everything. Intermediate tiers are tough to drill down into.</p>
<p>Questions? users@kafka.apache.org, https://kafka.apache.org/, irc://irc.freenode.net/#apache-kafka. Many folks on the mailing list know the details of how Kafka Audit works. Speaker notes: people in this room wrote Kafka Audit.</p>
<p>Late message resolution. Speaker notes: this tends to confuse most people.</p>
<p>Late message resolution. Counts per time bucket for the tiers Producer, Local, Aggregate, Aggregate, Hadoop, with time marks at 10:00, 10:10, 10:20, 10:30, 10:40 and a "current time" marker. First bucket: 341, 341, 341, 341, 341. Second bucket: 352, 299, 299, 299, 299. Third bucket: 337, 337, 337, 337, 337. Fourth bucket: 326, 326, 326, 326, 326. From the 10:10 to 10:20 time bucket, 53 messages were lost from the producer to the Kafka local cluster. Unhealthy!</p>
<p>A monitoring message comes in with the missing messages. Second bucket: 352, 299+53, 299, 299, 299; all other buckets unchanged. Another message Mm arrives later with the missing count of 53!</p>
<p>Resolution. Second bucket: 352, 352, 352, 352, 352; all other buckets unchanged. All time periods match after the arrival of the late Mm message. Healthy state now.</p>
<p>Monitoring message overview. The producer timestamp determines the time bucket the message is placed into, so bucket placement is deterministic. Mm = {Count of messages, The topic this count is for, Tier identification string, Time bucket interval}.</p>
<p>Transport time. Speaker notes: describe what transport time is!</p>
<p>Architecture. Diagram: Producer, Kafka Cluster Local, Kafka Cluster Aggregate, Kafka Cluster Aggregate, three Audit Consumers; message Ma between tiers and into the Audit Consumers; the Audit Consumers emit Tt to Metrics (e.g. RRDs). Tt = {Time Ma seen by audit consumer, Topic name}.</p>
<p>How computed. Tt = {Time seen by audit consumer, Topic name}. Tt can be sampled; there is no need to emit it for all messages. Tt[time] = - Ma[time]</p>
<p>Caveats. Depends on the Audit Consumer lag. Producer batching can skew timestamps.</p>
<p>Schema resolution.</p>
<p>Schema example. What is a schema? { "type":"record", "name":"User", "fields":[ { "name":"name", "type":"string" }, { "name":"favorite_number", "type":[ "int", "null" ] } ]} Every message should be formatted to a schema!</p>
<p>Schema registry. A REST API to go from schema to ID, and ID to schema. Schema ID = hash(Raw Schema). Schema Registry Database columns: Registration Timestamp, Schema ID, Raw Schema. History of registrations is maintained.</p>
<p>Registration flow. Diagram: Producer, Schema Registry, Kafka. 1. Producer registers schema. 2. Registry returns schema ID (hash of schema). 3. Schema ID prepended to all Kafka messages. Ms = {} + Mall</p>
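The late message resolution slides can be illustrated with a small sketch. It is not the Kafka Audit implementation; it only assumes that monitoring messages (Mm) carry the four fields listed on the slides, that time buckets are 10 minutes wide as in the 10:10 to 10:20 example, and that the audit app sums counts per topic, bucket and tier. The names `BUCKET_SECONDS`, `bucket_start`, `AuditApp` and the dictionary keys are hypothetical and chosen for illustration only.

```python
from collections import defaultdict

# Assumption: 10-minute buckets, matching the 10:10 to 10:20 example on the slides.
BUCKET_SECONDS = 600


def bucket_start(producer_timestamp: int) -> int:
    """Map a producer creation timestamp (epoch seconds) to the start of its bucket.

    Only the producer timestamp is used, so every tier places the same
    message in the same bucket regardless of when the tier sees it.
    """
    return producer_timestamp - producer_timestamp % BUCKET_SECONDS


class AuditApp:
    """Hypothetical stand-in for the audit app: sums Mm counts per (topic, bucket, tier)."""

    def __init__(self) -> None:
        self.counts = defaultdict(int)

    def on_monitoring_message(self, mm: dict) -> None:
        # A late Mm for an old bucket is simply added to that bucket's total.
        self.counts[(mm["topic"], mm["bucket"], mm["tier"])] += mm["count"]

    def missing(self, topic: str, bucket: int, upstream: str, downstream: str) -> int:
        """Messages counted at the upstream tier but not yet at the downstream tier."""
        return (self.counts[(topic, bucket, upstream)]
                - self.counts[(topic, bucket, downstream)])


if __name__ == "__main__":
    app = AuditApp()
    bucket = bucket_start(1422353700)  # arbitrary example timestamp

    app.on_monitoring_message(
        {"topic": "example", "bucket": bucket, "tier": "producer", "count": 352})
    app.on_monitoring_message(
        {"topic": "example", "bucket": bucket, "tier": "local", "count": 299})
    print(app.missing("example", bucket, "producer", "local"))  # 53, unhealthy

    # A late monitoring message arrives with the missing count.
    app.on_monitoring_message(
        {"topic": "example", "bucket": bucket, "tier": "local", "count": 53})
    print(app.missing("example", bucket, "producer", "local"))  # 0, healthy
```

Because the bucket is derived from the producer timestamp alone, a count that arrives late still lands in the bucket it belongs to, which is why the 10:10 to 10:20 row recovers from 299 to 352 once the late Mm is processed.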
