Kafka overview and use cases

SITE RELIABILITY ENGINEERING ©2016 LinkedIn Corporation. All Rights Reserved. Kafka - Overview Indrajeet Kumar Site Reliability Engineer at LinkedIn

Upload: indrajeet-kumar

Posted on 26-Jan-2017

Category: Technology


TRANSCRIPT

Page 1: Kafka overview and use cases

Kafka - Overview

Indrajeet Kumar, Site Reliability Engineer at LinkedIn

Page 2: Kafka overview and use cases

So what is it?

It is a high-throughput, low-latency messaging system

Page 3: Kafka overview and use cases

And who uses it?

Page 4: Kafka overview and use cases

What for?

• Messaging
• Website Activity Tracking
• Metrics
• Log Aggregation
• Stream Processing
• For fun ;)

Page 5: Kafka overview and use cases

So how does it work?

▪ Components
  – Producer
  – Broker
    ▪ Topic
    ▪ Partition
  – Consumer
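
How the components listed on this slide relate can be sketched with a toy in-memory model. This is not Kafka's API: the `Broker` class, its methods, and the `page-views` topic name are hypothetical, chosen only to show a broker holding topics, a topic split into partitions, a producer appending, and a consumer reading from an offset.

```python
class Broker:
    """Toy broker: each topic is a list of partitions, each partition an append-only list."""

    def __init__(self):
        self.topics = {}

    def create_topic(self, name, num_partitions):
        self.topics[name] = [[] for _ in range(num_partitions)]

    def append(self, topic, partition, message):
        log = self.topics[topic][partition]
        log.append(message)
        return len(log) - 1  # offset of the message just written

    def read(self, topic, partition, offset):
        # Return every message at or after the given offset.
        return self.topics[topic][partition][offset:]


broker = Broker()
broker.create_topic("page-views", num_partitions=2)

# Producer side: write messages to explicit partitions.
broker.append("page-views", 0, "user=1 url=/home")
broker.append("page-views", 1, "user=2 url=/jobs")
last_offset = broker.append("page-views", 0, "user=1 url=/feed")

# Consumer side: read partition 0 from the beginning.
messages = broker.read("page-views", 0, offset=0)
print(last_offset, messages)
```

Reading does not remove anything from the partition in this sketch, so several consumers can read the same data independently.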

Page 6: Kafka overview and use cases

Broker

[Diagram: three producers, brokers B1 and B2 holding partitions P1 and P2 and their replicas (R), and three consumers]

Page 7: Kafka overview and use cases

The consumer

[Diagram: a consumer (C1, C2) reading from brokers B1, B2 and B3, which hold partitions P1, P2 and P3 and a replica P1R]
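
If C1 and C2 in the diagram are two instances sharing the work of one consumer (an assumption; the slide does not say so), the partitions have to be divided between them. A minimal sketch of one way to do that, round-robin, follows. The function name `assign_partitions` is hypothetical, and real Kafka clients offer several assignment strategies that may give a different split.

```python
def assign_partitions(partitions, consumers):
    """Spread partitions over consumers round-robin; each partition gets exactly one owner."""
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(sorted(partitions)):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


assignment = assign_partitions(["P1", "P2", "P3"], ["C1", "C2"])
print(assignment)  # {'C1': ['P1', 'P3'], 'C2': ['P2']}
```

Because every partition has a single owner, messages within one partition are read in order by one instance.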

Page 8: Kafka overview and use cases

The Producer

[Diagram: a producer writing to partitions P1 and P2 on brokers B1, B2 and B3, which hold partitions P1, P2 and P3 and a replica P1R]
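
The diagram shows a producer writing to more than one partition, so something has to decide which partition each message goes to. One common approach is to hash a message key. The sketch below assumes string keys and uses CRC32 as a stand-in hash; `choose_partition` is a hypothetical helper, not the function a real Kafka client uses.

```python
import zlib


def choose_partition(key, num_partitions):
    """Map a message key to a partition index; the same key always gives the same index."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


# Every message for "member-42" lands in the same partition of a 3-partition topic.
first = choose_partition("member-42", 3)
second = choose_partition("member-42", 3)
print(first == second)  # True
```

Keeping one key in one partition preserves the relative order of that key's messages.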

Page 9: Kafka overview and use cases

Attributes of a Kafka Cluster

▪ Durable
▪ Scalable
▪ Low Latency
▪ Finite Retention
▪ No single point of failure
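
"Finite Retention" means messages are kept for a bounded window and then discarded, whether or not anyone has read them. A minimal sketch of time-based retention follows, assuming each log entry carries a timestamp in seconds. The `prune` function is hypothetical, and real brokers discard whole log segments rather than individual messages.

```python
def prune(log, now, retention_seconds):
    """Keep only (timestamp, message) entries that fall inside the retention window."""
    cutoff = now - retention_seconds
    return [(ts, msg) for ts, msg in log if ts >= cutoff]


log = [(0, "a"), (50, "b"), (100, "c")]
kept = prune(log, now=120, retention_seconds=60)
print(kept)  # [(100, 'c')]
```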

Page 10: Kafka overview and use cases

Kafka At LinkedIn

▪ Multiple Datacenters, Multiple Clusters
▪ Mirroring between clusters
▪ Message Types
  – Metrics
  – Tracking
  – Queuing
▪ Data transport from applications to Hadoop, and back

Page 11: Kafka overview and use cases

Some numbers!

▪ 1800+ Broker machines
▪ 79K+ Topics
▪ 1.1M+ Partitions

▪ 1.3 Trillion messages per day
▪ 330 Terabytes in/day
▪ 1.2 Petabytes out/day

▪ Peak load for a single cluster
  – 2 million messages/sec
  – 4.7 Gigabits/sec inbound
  – 15 Gigabits/sec outbound

Page 12: Kafka overview and use cases

Questions
