JStorm Introduction -- [email protected], Alibaba

Upload: longda-feng

Post on 04-Aug-2015

Agenda

• Difference with Storm
• Plan
• Current Stats

What is JStorm?

• Storm reimplemented in Java
  – More powerful features
  – More stable
  – Faster

Who are we?

• The JStorm team was among the earliest users of Storm in China.
  – Storm 0.5.1/0.5.4/0.6.0/0.6.2/0.7.0/0.7.1
  – JStorm 0.7.1/0.9.0/0.9.1/0.9.2/0.9.3/…
• Our duties
  – Application development
  – JStorm system development
  – JStorm system operation

Why start JStorm?

• The Storm community is not as active as we expected
  – Tailored for enterprise environments
  – Fixed critical bugs in Storm
  – Provided professional technical support, improved app development pace
  – Reduced operational cost

Evolution speed

• Many requirements drive us to move faster
  – Released 11 versions in 2014
  – See https://github.com/alibaba/jstorm/releases

JStorm history

• Design started on 2012/02/07
• First version, 0.7.1, released on 2013/04/30

Who is using JStorm?

• Most of the major Chinese companies

How big in Alibaba?

• More than 3000 servers
• More than 3 trillion messages per day
• More than 300 topologies

User Scenario

• Alibaba 11.11 live room
  – Trade amount/count
  – PV/UV
  – All kinds of KPIs
• The peak volume of messages processed by JStorm during the 11.11 and 12.12 shopping festivals is ten times the peak volume on a normal day.

User Scenario

• Real-time ad recommendation
  – Analyze user actions, then recommend products

User Scenario

• Log analysis
  – Compute all kinds of KPIs
  – Monitoring
  – Smart customer service
  – Tlog/EagleEye

User Scenario

• Real-time data sync pipeline
  – DB
  – Log
  – Message

Why stable?

• Three examinations every year
  – 11/11
  – 12/12
  – Spring Festival red packet war
  – Ten-fold throughput in peak periods

Improve stability

• Nimbus HA
• Resource isolation with cgroups
• Fixed bugs when running on Hadoop YARN
• Monitoring of every phase of a tuple
• Tuned GC parameters
• Graceful worker shutdown
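
The "graceful worker shutdown" item can be illustrated with plain JDK executors. This is only a minimal sketch of the general pattern (stop accepting work, let in-flight tasks finish, force-stop after a deadline), not JStorm's actual worker code. The class name GracefulShutdownSketch, the method shutdownGracefully, and the timeout value are assumptions made for illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; names are illustrative and not taken from JStorm.
final class GracefulShutdownSketch {

    /**
     * Stops accepting new tasks, waits up to timeoutMs for queued and running
     * tasks to finish, and interrupts whatever is left after the deadline.
     * Returns true only if everything drained before the deadline.
     */
    static boolean shutdownGracefully(ExecutorService pool, long timeoutMs) {
        pool.shutdown(); // reject new tasks; already submitted tasks keep running
        try {
            if (pool.awaitTermination(timeoutMs, TimeUnit.MILLISECONDS)) {
                return true;
            }
            pool.shutdownNow(); // deadline passed: interrupt remaining tasks
            return false;
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            return false;
        }
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        AtomicInteger processed = new AtomicInteger();
        for (int i = 0; i < 10; i++) {
            pool.execute(() -> processed.incrementAndGet());
        }
        boolean clean = shutdownGracefully(pool, 5_000);
        System.out.println("clean=" + clean + ", processed=" + processed.get());
    }
}
```

The point of the two-step shutdown is that tuples already handed to the worker are processed rather than dropped, while the deadline keeps a stuck task from blocking the restart indefinitely.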

Faster

• 6 servers (24 cores/98 GB)
• 18 spouts/18 bolts/18 ackers

[Figure: Throughput vs workers. x-axis: workers (0 to 60); y-axis: poll tuples/10s (0 to 12,000,000); series: jstorm, storm. Data labels in the chart: 6243680, 6830500, 5595900, 5474180, 3379800 and 9280598, 10818815, 9065965, 6819139, 5610201.]

Why faster?

• Dedicated deserializing thread
• Dedicated ack/fail thread in the spout
• Avoids CPU spin-waiting
• Better-tuned sampling logic
• Better-tuned acking framework
• Better-tuned GC
• Better Netty RPC framework
• Reduced memory copying with ZeroMQ
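
Two of the items above, a dedicated deserializing thread and avoiding CPU spin-waiting, can be sketched with JDK blocking queues. This is an illustrative sketch under stated assumptions, not JStorm's implementation: the class DeserializerSketch, the UTF-8 string "deserialization", and the poison-pill stop signal are all hypothetical stand-ins.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch; names are illustrative and not taken from JStorm.
final class DeserializerSketch {
    // Sentinel compared by identity; tells the deserializing thread to stop.
    private static final byte[] POISON = new byte[0];

    /** Decodes all payloads on a dedicated thread and returns them in arrival order. */
    static List<String> run(List<byte[]> payloads) {
        BlockingQueue<byte[]> raw = new LinkedBlockingQueue<>();
        BlockingQueue<String> decoded = new LinkedBlockingQueue<>();

        Thread deserializer = new Thread(() -> {
            try {
                while (true) {
                    // take() parks the thread while the queue is empty,
                    // so no CPU is burned in a busy-wait loop.
                    byte[] bytes = raw.take();
                    if (bytes == POISON) {
                        return;
                    }
                    decoded.put(new String(bytes, StandardCharsets.UTF_8));
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "deserializer");
        deserializer.start();

        try {
            // The receiving side only enqueues raw bytes; decoding happens elsewhere.
            for (byte[] payload : payloads) {
                raw.put(payload);
            }
            raw.put(POISON);
            deserializer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the deserializer", e);
        }

        List<String> out = new ArrayList<>();
        decoded.drainTo(out);
        return out;
    }

    public static void main(String[] args) {
        List<byte[]> payloads = new ArrayList<>();
        payloads.add("click".getBytes(StandardCharsets.UTF_8));
        payloads.add("order".getBytes(StandardCharsets.UTF_8));
        System.out.println(run(payloads)); // prints [click, order]
    }
}
```

Moving decoding off the receiving thread keeps network reads from stalling on CPU-heavy work, and the blocking take() trades a little wake-up latency for idle cores when no tuples are arriving.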

More features

• More powerful scheduler
• More powerful metrics system
• Classloader support
• More convenient Web UI/LogView
• Sync mode for the Netty RPC framework
• New transaction programming mode
• Self-adaptive speed

More details

• More than 100 improvements
  – https://github.com/alibaba/JStorm/blob/master/history.md

What can we bring?

• Make evolution faster
  – Full-time developers
  – Full-time testers
  – Hundreds of applications that can test new features quickly
  – A Java core will bring in more developers

Import new plugin

• Provide a programming framework like Trident

SQL Engine

• In about a year, we may open-source our SQL engine

Port Spark’s features

• We are going to port some Spark features to our system.