Real-time Analytics with Netty, Apache Kafka and Storm
Case study with “lambda architecture”
http://nguyentantrieu.info
Update: 07/06/2013





Agenda
1. Overview Architecture
2. Log HTTP-Handler and producer: Netty 4
3. Kafka 0.8 (Stream Data Log Storage)
4. Storm Analytics Cluster


Overview System Architecture


Concept Flow


Concept Flow

JavaScript Tracking

Mobile SDK

Http Log Server

Kafka


S2 HTTP Log Server
Netty framework 4


Netty.io

Netty is a non-blocking I/O (NIO) client-server framework for developing Java network applications such as protocol servers and clients. Its asynchronous, event-driven framework and tools simplify network programming, for example writing TCP and UDP socket servers. Netty includes an implementation of the reactor pattern.

http://en.wikipedia.org/wiki/Netty_(software)
http://nguyentantrieu.info/blog/backend-system-with-netty-io
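The reactor pattern mentioned above can be illustrated without Netty itself. The following is a minimal sketch in Python, using only the standard library: one thread waits on a selector and dispatches each ready socket to its registered callback. The function name `run_reactor`, the upper-casing "protocol", and the socket pair are all hypothetical and chosen for illustration; this is not Netty's API and not the S2 log server's code.

```python
import selectors
import socket


def run_reactor(messages):
    """Echo each message back upper-cased through a single-threaded reactor loop.

    Illustrative only: the names and the toy protocol are assumptions.
    Messages must be non-empty strings.
    """
    sel = selectors.DefaultSelector()
    received = []

    # A connected socket pair stands in for a client and a server connection.
    client, server = socket.socketpair()
    client.setblocking(False)
    server.setblocking(False)

    def on_server_readable(sock):
        # Handler for the "server" side: transform and reply.
        data = sock.recv(4096)
        if data:
            sock.sendall(data.upper())

    def on_client_readable(sock):
        # Handler for the "client" side: collect the reply.
        data = sock.recv(4096)
        if data:
            received.append(data.decode("utf-8"))

    # Register a handler per socket; the selector is the event demultiplexer.
    sel.register(server, selectors.EVENT_READ, on_server_readable)
    sel.register(client, selectors.EVENT_READ, on_client_readable)

    try:
        for msg in messages:
            client.sendall(msg.encode("utf-8"))
            target = len(received) + 1
            # Reactor loop: wait for readiness events, dispatch to handlers.
            while len(received) < target:
                events = sel.select(timeout=1.0)
                if not events:
                    raise TimeoutError("no I/O event within 1 second")
                for key, _mask in events:
                    key.data(key.fileobj)
    finally:
        sel.unregister(server)
        sel.unregister(client)
        client.close()
        server.close()
        sel.close()
    return received
```

No thread ever blocks on a single connection here: the loop only reacts to sockets the selector reports as ready, which is the idea Netty builds on with its event loops and channel handlers.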


Apache Kafka (version 0.8)
https://cwiki.apache.org/confluence/display/KAFKA/Index


In Production
● Clustering (4 nodes)
● Partitions
  ○ user-activity: 24 partitions
● Producer and consumer are replication-aware
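To make the partition count concrete: keyed messages are typically assigned to a partition by hashing the key modulo the number of partitions, so all events with the same key land in the same partition. The sketch below illustrates that idea for the 24 partitions of `user-activity`. It uses `zlib.crc32` as a deterministic stand-in hash (an assumption, not Kafka's actual hash function), and the key format `user-N` is hypothetical.

```python
import zlib
from collections import Counter

NUM_PARTITIONS = 24  # from the slide: user-activity has 24 partitions


def partition_for(key, num_partitions=NUM_PARTITIONS):
    """Map a message key to a partition index in [0, num_partitions).

    crc32 is a stand-in hash chosen for determinism; it is an assumption,
    not the hash Kafka itself uses.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def spread(keys, num_partitions=NUM_PARTITIONS):
    """Count how many of the given keys fall into each partition."""
    return Counter(partition_for(k, num_partitions) for k in keys)


if __name__ == "__main__":
    # Hypothetical user IDs used as message keys.
    counts = spread("user-%d" % i for i in range(1000))
    for partition in sorted(counts):
        print(partition, counts[partition])
```

If the 24 partitions are spread evenly over the 4 nodes, each node leads 6 of them; the actual assignment in this cluster is not stated in the slides.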


Core Producer Class (S2 HTTP Log Server)


Storm Analytics Cluster


The Storm Topology

Kafka Cluster
topic: user-activity

Tokenizer Bolt

Parser Bolt

Aggregate Bolt

Redis Statistics Bolt

Save DWH Bolt

Raw Data

Kafka Consumer Spout
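The data flow through the components listed above can be sketched as a chain of plain Python generators. This is not Storm code and not the author's topology: it assumes the bolts are chained in the order spout, tokenizer, parser, aggregate, with parsed records saved to the warehouse and counts written to Redis. The tab-separated log format (`timestamp`, `user`, `event`) is hypothetical, and a `Counter` and a list stand in for Redis and the DWH.

```python
from collections import Counter


def kafka_consumer_spout(raw_lines):
    """Stand-in for the Kafka Consumer Spout: emits raw log lines one by one."""
    for line in raw_lines:
        yield line


def tokenizer_bolt(lines):
    """Split each raw line into fields (assumed tab-separated)."""
    for line in lines:
        yield line.rstrip("\n").split("\t")


def parser_bolt(token_lists):
    """Turn field lists into records; drop lines that do not have 3 fields."""
    for tokens in token_lists:
        if len(tokens) != 3:
            continue  # malformed line under the assumed format
        timestamp, user, event = tokens
        yield {"ts": timestamp, "user": user, "event": event}


def run_topology(raw_lines):
    """Run the assumed chain and return (event counts, saved records)."""
    redis_stats = Counter()  # stand-in for the Redis Statistics Bolt's store
    warehouse = []           # stand-in for the Save DWH Bolt's target

    records = parser_bolt(tokenizer_bolt(kafka_consumer_spout(raw_lines)))
    for record in records:
        warehouse.append(record)           # Save DWH Bolt
        redis_stats[record["event"]] += 1  # Aggregate Bolt feeding Redis
    return dict(redis_stats), warehouse
```

For example, feeding two `pageview` lines, one `click` line and one malformed line yields counts of 2 and 1 and three saved records. In a real Storm cluster each stage would run as parallel tasks connected by stream groupings rather than as one in-process loop.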