samza at linkedin: taking stream processing to the next level

39
Apache Samza: Taking stream processing to the next level Martin Kleppmann — @martinkl

Upload: martin-kleppmann

Post on 21-Apr-2017

3.306 views

Category:

Engineering


3 download

TRANSCRIPT

Apache Samza

Apache Samza: Taking stream processingto the next levelMartin Kleppmann @martinkl

Martin KleppmannHacker, designer, inventor, entrepreneur

Co-founded two startups, Rapportive LinkedInCommitter on Avro & Samza ApacheWriting book on data-intensive apps OReillymartinkl.com | @martinkl

Apache Kafka

Apache Samza

Apache KafkaApache Samza

Credit: Jason Walsh on Flickrhttps://www.flickr.com/photos/22882695@N00/2477241427/Credit: Lucas Richarz on Flickrhttps://www.flickr.com/photos/22057420@N00/5690285549/

Things we would like to do(better)

Provide timely, relevant updates to your newsfeed

Update search results with new information as it appears

Real-time analysis of logs and metrics

Tools?Response latencyKafka & SamzaMilliseconds to minutesLoosely coupledRESTSynchronousClosely coupledHours to daysLoosely coupled

9

Service 1Kafkaevents/messagesAnalyticsCache maintenanceNotificationssubscribesubscribesubscribepublishpublishService 2

Publish / subscribeEvent / message = something happenedTracking:User x clicked y at time zData change:Key x, old value y, set to new value zLogging:Service x threw exception y in request zMetrics:Machine x had free memory y at time z

Many independent consumersHigh throughput (millions msgs/sec)Fairly low latency (a few ms)

Kafka at LinkedIn350+ Kafka brokers8,000+ topics140,000+ Partitions

278 Billion messages/day49 TB/day in176 TB/day out

Peak Load4.4 Million messages per second6 Gigabits/sec Inbound21 Gigabits/sec Outbound

public interface StreamTask {void process(IncomingMessageEnvelope envelope,MessageCollector collector,TaskCoordinator coordinator);}Samza API: processing messagesgetKey(), getMsg()commit(), shutdown()sendMsg(topic, key, value)

Also provide interfaces for windowing tasks that are called specific amounts of timeAlso provide methods for initialization, configuration, etc.Checkpointing is handled behind the scenes13

Familiar ideas from MR/Pig/Cascading/Filterrecords matching conditionMaprecord func(record)Jointwo/more datasets by keyGrouprecords with the same value in fieldAggregate records within the same groupPipejob 1s output job 2s input

MapReduce assumes fixed dataset.Can we adapt this to unbounded streams?

Operations on streamsFilterrecords matching condition easyMap record func(record) easyJointwo/more datasets by keywithin time window, need bufferGrouprecords with the same value in fieldwithin time window, need bufferAggregaterecords within the same group ok when do you emit result?Pipejob 1s output job 2s input ok but what about faults?

Stateful stream processing(join, group, aggregate)

Joining streams requires stateUser goes to lunch click long after impressionQueue backlog click before impressionWindow joinJoin and aggregateClick-through rate

Key-valuestoreAd impressionsAd clicks

Remote state or local state?Samza job partition 0Samza job partition 1

e.g. Cassandra, MongoDB, 100-500k msg/sec/node100-500k msg/sec/node1-5k queries/sec??

Remote state or local state?Samza job partition 0Samza job partition 1

LocalLevelDB/RocksDBLocalLevelDB/RocksDB

Another example: Newsfeed & followingUser 138 followed user 582User 463 followed user 536User 582 posted: Im at Berlin Buzzwords and it rocksUser 507 unfollowed user 115User 536 posted: Nice weather today, going for a walkUser 981 followed user 575

Expected output: inbox (newsfeed) for each user

Newsfeed & followingFan out messages to followersDelivered messages

582 => [ 138, 721, ]Follow/unfollow eventsPosted messagesUser 582 posted: Im at BerlinBuzzwords and it rocksUser 138 followed user 582Notify user 138: {User 582 posted:Im at Berlin Buzzwords and it rocks}Push notifications etc.

Local state:

Bring computation and datatogether in one place

Fault tolerance

KafkaKafkaYARN NodeManager

YARN NodeManager

YARNRMSamza Container

Samza Container

Samza Container

Samza Container

Machine 1Machine 2

Task

Task

Task

Task

Task

Task

Task

TaskBOOM

KafkaKafkaYARN NodeManager

YARN NodeManager

YARNRMSamza Container

Samza Container

Samza Container

Samza Container

Machine 1Machine 2

Task

Task

Task

Task

Task

Task

Task

TaskBOOM

KafkaYARN NodeManager

Machine 3

YARN NodeManager

Samza Container

Samza Container

KafkaYARN NodeManager

Samza Container

Samza Container

Machine 2

Task

Task

Task

TaskKafkaMachine 3

Task

Task

Task

Task

Fault-tolerant local stateSamza job partition 0Samza job partition 1

LocalLevelDB/RocksDBLocalLevelDB/RocksDBDurable changelogreplicate writes

YARN NodeManager

Samza Container

Samza Container

KafkaYARN NodeManager

Samza Container

Samza Container

Machine 2

Task

Task

Task

TaskMachine 3

Task

Task

Task

Task

Kafka

Samzas fault-tolerant local stateEmbedded key-value: very fastMachine dies local key-value store is lost

Solution: replicate all writes to Kafka!Machine dies restart on another machineRestore key-value store from changelogChangelog compaction in the background (Kafka 0.8.1)

When things go slow

Owned byTeam XTeam YTeam ZCascades of jobsJob 1Stream BStream AJob 2Job 3Job 4Job 5

Consumer goes slowBackpressureQueue upDrop dataOther jobs grindto a halt Run out ofmemory Spill to diskOh wait Kafka does this anyway!

No thanks

Job 1Stream BStream AJob 2Job 3Job 4Job 5Job 1 outputJob 2 outputJob 3 output

Samza always writesjob output to KafkaMapReduce always writesjob output to HDFS

Every job output is a named streamOpen: Anyone can consume itRobust: If a consumer goes slow, nobody else is affectedDurable: Tolerates machine failureDebuggable: Just look at itScalable: Clean interface between teamsClean: loose coupling between jobs

ProblemSolutionNeed to buffer job outputfor downstream consumersWrite it to Kafka!Need to make local statestore fault-tolerantWrite it to Kafka!Need to checkpoint jobstate for recoveryWrite it to Kafka!Recap

Apache Kafka

Apache Samzakafka.apache.orgsamza.incubator.apache.org

Hello Samza (try Samza in 5 mins)

Thank you!Samza:Getting started:samza.incubator.apache.orgUnderlying thinking:bit.ly/jay_on_logsStart contributing:bit.ly/samza_newbie_issues

Me:Twitter:@martinklBlog:martinkl.com