services computing technology and system lab ares a high … · 2019-08-12 · services computing...

22
Services Computing Technology and System Lab Cluster and Grid Computing Lab a High Performance and Fault-tolerant Distributed Stream Processing System Ares Changfu Lin [email protected] Joint work with JingJing Zhan, Hanhua Chen, Jie Tan & Hai Jin {zjj, chen, tjmaster, hjin}@hust.edu.cn Cluster and Grid Computing Lab Services Computing Technology and System Lab School of Compute Science and Technology Huazhong University of Science and Technology, Wuhan, 430074, China http://grid.hust.edu.cn/

Upload: others

Post on 22-May-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Services Computing Technology and System Lab Ares a High … · 2019-08-12 · Services Computing Technology and System Lab. Cluster and Grid Computing Lab. a High Performance and

Services Computing Technology and System Lab

Cluster and Grid Computing Lab

a High Performance and Fault-tolerant Distributed Stream Processing SystemAres

Changfu [email protected]

Joint work with JingJing Zhan, Hanhua Chen, Jie Tan & Hai Jin{zjj, chen, tjmaster, hjin}@hust.edu.cn

Cluster and Grid Computing LabServices Computing Technology and System LabSchool of Compute Science and TechnologyHuazhong University of Science and Technology, Wuhan, 430074, Chinahttp://grid.hust.edu.cn/

Page 2: Services Computing Technology and System Lab Ares a High … · 2019-08-12 · Services Computing Technology and System Lab. Cluster and Grid Computing Lab. a High Performance and

01

Real-time Stream Processing

Cluster and Grid Computing LabServices Computing Technology and System Lab Ares

Use CasesE-commerce RecommendationAnomaly Detection

Ecosystem

Page 3: Services Computing Technology and System Lab Ares a High … · 2019-08-12 · Services Computing Technology and System Lab. Cluster and Grid Computing Lab. a High Performance and

02

Real-time Stream Processing

Cluster and Grid Computing LabServices Computing Technology and System Lab Ares

Use CasesE-commerce RecommendationAnomaly Detection

Requirements

High Availability

Low LatencyExtract value from data streams in real-time

Failures are unavailable for long-time running applications

Page 4: Services Computing Technology and System Lab Ares a High … · 2019-08-12 · Services Computing Technology and System Lab. Cluster and Grid Computing Lab. a High Performance and

03

Low Latency Vs. High Availability

Cluster and Grid Computing LabServices Computing Technology and System Lab Ares

Rack 1

Switch 1

Rack 3

Switch 3

Rack 2

Switch 2

Net

wor

k To

polo

gy

1

2

O1

3

4

O2

5

6

O3

8

O4

7

App

licat

ion

Topo

logy

Page 5: Services Computing Technology and System Lab Ares a High … · 2019-08-12 · Services Computing Technology and System Lab. Cluster and Grid Computing Lab. a High Performance and

04

Low Latency Vs. High Availability

Cluster and Grid Computing LabServices Computing Technology and System Lab Ares

Rack 1

Switch 1

Rack 3

Switch 3

Rack 2

Switch 2

Net

wor

k To

polo

gy

1

2

O1

3

4

O2

5

6

O3

8

O4

7

App

licat

ion

Topo

logy

Low Latency

1 5

Node A

Latency

DEBS’13, CIKM’14, ICDCS’14, Middleware’15, INFOCOM’16

Elaborated task allocation schemes

Co-locate upstream and downstream task pairs

Page 6: Services Computing Technology and System Lab Ares a High … · 2019-08-12 · Services Computing Technology and System Lab. Cluster and Grid Computing Lab. a High Performance and

05

Low Latency Vs. High Availability

Cluster and Grid Computing LabServices Computing Technology and System Lab Ares

Rack 1

Switch 1

Rack 3

Switch 3

Rack 2

Switch 2

Net

wor

k To

polo

gy

1

2

O1

3

4

O2

5

6

O3

8

O4

7

App

licat

ion

Topo

logy

Low Latency

1 5

Node A

Latency

Recovery

DEBS’13, CIKM’14, ICDCS’14, Middleware’15, INFOCOM’16

Elaborated task allocation schemes

downstream task must wait upstream task

Rack 1

1 5

5 8

3

5

Cascaded waiting

5

8

1 3 4Co-locate upstream and downstream task pairs

Page 7: Services Computing Technology and System Lab Ares a High … · 2019-08-12 · Services Computing Technology and System Lab. Cluster and Grid Computing Lab. a High Performance and

06

Challenge: Exploit Task Dependency

Cluster and Grid Computing LabServices Computing Technology and System Lab Ares

Rack 1

Switch 1

Rack 3

Switch 3

Rack 2

Switch 2

Net

wor

k To

polo

gy

1

2

O1

3

4

O2

5

6

O3

8

O4

7

App

licat

ion

Topo

logy

Recovery

Latency1

Rack A

5

Rack B

1 5

Node A

Latency

Recovery

High Availability

Best trade-off between low latency and high availability via exploiting task dependency?

Low Latency

Page 8: Services Computing Technology and System Lab Ares a High … · 2019-08-12 · Services Computing Technology and System Lab. Cluster and Grid Computing Lab. a High Performance and

07

Ares’s Stream Latency Model

Cluster and Grid Computing LabServices Computing Technology and System Lab Ares

Idea: Divide the application topology into multiple source-sink paths

1

2

O1

3

4

O2

5

6

O3

8

O4

7

Page 9: Services Computing Technology and System Lab Ares a High … · 2019-08-12 · Services Computing Technology and System Lab. Cluster and Grid Computing Lab. a High Performance and

08

Ares’s Stream Latency Model

Cluster and Grid Computing LabServices Computing Technology and System Lab Ares

Idea: Divide the application topology into multiple source-sink paths

Source

Sink

# source-sink path2 ∗ 3 + 2 ∗ 3 = 12

1

2

O1

3

4

O2

5

6

O3

8

O4

7

Page 10: Services Computing Technology and System Lab Ares a High … · 2019-08-12 · Services Computing Technology and System Lab. Cluster and Grid Computing Lab. a High Performance and

09

Ares’s Stream Latency Model

Cluster and Grid Computing LabServices Computing Technology and System Lab Ares

Idea: Divide the application topology into multiple source-sink paths

Source

Sink

latency(1→5→8)= 0.1 + 0.4 + 0.5 + (0.3 + 0.2) = 1.4

1 5 8

A B Ct1

t5

t8

0.1 0.3

0.5

0.2

0.3

0.4 0.7

0.2 0.5

Node A Node B Node C

Processing Time Transferring Time

A

B

C

0.3

0.1

0.2

85

# source-sink path2 ∗ 3 + 2 ∗ 3 = 12

1

1 ( )| | p P

latency pP ∈∑

1

2

O1

3

4

O2

5

6

O3

8

O4

7

Page 11: Services Computing Technology and System Lab Ares a High … · 2019-08-12 · Services Computing Technology and System Lab. Cluster and Grid Computing Lab. a High Performance and

10

Ares’s Stream Recovery Model

Cluster and Grid Computing LabServices Computing Technology and System Lab Ares

Rack 1

Switch 1

Rack 3

Switch 3

Rack 2

Switch 2

Idea: Exploit the dependency between upstream and downstream tasks for the rack failure

Task dependency is a main challenge for failure recovery.[NSDI’16,StreamScope]

Page 12: Services Computing Technology and System Lab Ares a High … · 2019-08-12 · Services Computing Technology and System Lab. Cluster and Grid Computing Lab. a High Performance and

11

Ares’s Stream Recovery Model

Cluster and Grid Computing LabServices Computing Technology and System Lab Ares

Rack 1

Switch 1

Rack 3

Switch 3

Rack 2

Switch 2

Idea: Exploit the dependency between upstream and downstream tasks for the rack failure

1 5

1 6

3

5 recovery(rack 3)= 𝑐𝑐(𝑤𝑤15 +𝑤𝑤16 +𝑤𝑤35 + 𝑤𝑤58)

𝑤𝑤58 =4

12= 0.33

Task dependency is a main challenge for failure recovery.[NSDI’16,StreamScope]

1

2

O1

3

4

O2

5

6

O3

8

O4

7

The recovery time is proportional to the sum of weights of task pairs

1 ( )| | r R

recovery rR ∈∑

Page 13: Services Computing Technology and System Lab Ares a High … · 2019-08-12 · Services Computing Technology and System Lab. Cluster and Grid Computing Lab. a High Performance and

12

Fault Tolerant Scheduler (FTS) Problem

Cluster and Grid Computing LabServices Computing Technology and System Lab Ares

Definition: Given an application topology G(V, E), a task set T, a node set N, a rack set R, and a network topology ψ, find a task allocation π that minimizes the allocation cost u(π).

1 1( ) ( ) (1 ) ( )| | | |p P r R

u W latency p W recovery rP R

π∈ ∈

= + −∑ ∑

, ( ) ( )( ) ( , ) ( , )

ij ij i j

t t ij i j ij ijt T e E e E

t d wψ π ψ π

µ π α λ π β π π γ∈ ∈ ∈ =

= + +∑ ∑ ∑

processing cost transferring cost recovering cost

Gen

eral

izat

ion

Page 14: Services Computing Technology and System Lab Ares a High … · 2019-08-12 · Services Computing Technology and System Lab. Cluster and Grid Computing Lab. a High Performance and

13

The Nirvana algorithm

Cluster and Grid Computing LabServices Computing Technology and System Lab Ares

Think like a player: When we can’t solve a problem from a Holisticperspective, why not solve it from an individual perspective.

Observation: the allocation decision of a task only depends on the decisions of its neighbor tasks

Game Theory Players

Strategies

Cost Function

FTS Game The task set T

The node set N

Individual allocation cost 𝜃𝜃𝑡𝑡(𝜋𝜋)

, ( ) ( )( ) ( , ) ( , )

ij ij i j

t t ij i j ij ijt T e E e E

t d wψ π ψ π

µ π α λ π β π π γ∈ ∈ ∈ =

= + +∑ ∑ ∑

( ) ( ), ( ) ( )

1 1( ) ( , ) ( , )2 2

i t

t t t ij i t it iti neighbor t i neighbor t

t d wψ π ψ π

θ π α λ π β π π γ∈ ∈ =

= + +∑ ∑

( ) ( )tt T

µ π θ π∈

=∑

Page 15: Services Computing Technology and System Lab Ares a High … · 2019-08-12 · Services Computing Technology and System Lab. Cluster and Grid Computing Lab. a High Performance and

14

The Nirvana algorithm

Cluster and Grid Computing LabServices Computing Technology and System Lab Ares

Think like a player: When we can’t solve a problem from a holisticperspective, why not solve it from an individual perspective.

Observation: the allocation decision of a task only depends on the decisions of its neighbor tasks

?

Page 16: Services Computing Technology and System Lab Ares a High … · 2019-08-12 · Services Computing Technology and System Lab. Cluster and Grid Computing Lab. a High Performance and

15

Theoretical Analysis Results

Cluster and Grid Computing LabServices Computing Technology and System Lab Ares

Theorem 1There exists Nash equilibrium for the FTS game.

Theorem 2Our design achieves a 2-approximation ratio for the FTS problem.

Theorem 3The upper bound of the number of rounds to converge to the Nash

equilibrium for the FTS game is h(X+Y+Z).

Theorem 4The computation complexity of our design is 𝑂𝑂(2𝜉𝜉|𝑁𝑁|( 𝑇𝑇 + 2|𝐸𝐸|)),

where 𝜉𝜉 denotes the number of rounds to converge to Nash equilibriumfor the FTS game.

Page 17: Services Computing Technology and System Lab Ares a High … · 2019-08-12 · Services Computing Technology and System Lab. Cluster and Grid Computing Lab. a High Performance and

16

The Ares Architecture

Cluster and Grid Computing LabServices Computing Technology and System Lab Ares

Sche

dulin

g So

lutio

n

Database

Coordinator

Worker Node

ExecutorExecutor

Executor

Worker process

Supervisor

Load Monitor

Worker Node

ExecutorExecutor

Executor

Worker process

Supervisor

Load Monitor

Worker Node

ExecutorExecutor

Executor

Worker process

Supervisor

Load Monitor

...

Master Node

Custom Scheduler

Nirv

ana-

base

dCo

ntro

l

Nirvana Agent

Dist

ribut

ed S

tream

Pro

cess

ing

Syste

m

Page 18: Services Computing Technology and System Lab Ares a High … · 2019-08-12 · Services Computing Technology and System Lab. Cluster and Grid Computing Lab. a High Performance and

17

Evaluation

Cluster and Grid Computing LabServices Computing Technology and System Lab Ares

Setups30 node cluster, implement Storm 16 cores & 64GB DDR3 Intel Xeon server1Gbps Ethernet interfacebaseline: R-Storm[Middleware’15]

ApplicationWord Count Application: Follow the setting of Heron[SIGMOD’15]

Join Application: TPC-H dataset

MetricsThroughputAverage Tuple Processing TimeAverage Rack Recovery Time

Page 19: Services Computing Technology and System Lab Ares a High … · 2019-08-12 · Services Computing Technology and System Lab. Cluster and Grid Computing Lab. a High Performance and

18

Is Processing Latency Better?

Cluster and Grid Computing LabServices Computing Technology and System Lab Ares

Average Reduction of 50.2%

Page 20: Services Computing Technology and System Lab Ares a High … · 2019-08-12 · Services Computing Technology and System Lab. Cluster and Grid Computing Lab. a High Performance and

19

Is Recovery Time Better?

Cluster and Grid Computing LabServices Computing Technology and System Lab Ares

Average Reduction of 48.9%

Page 21: Services Computing Technology and System Lab Ares a High … · 2019-08-12 · Services Computing Technology and System Lab. Cluster and Grid Computing Lab. a High Performance and

20

Is Throughput Better?

Cluster and Grid Computing LabServices Computing Technology and System Lab Ares

Average Improvement of 2.24×

Page 22: Services Computing Technology and System Lab Ares a High … · 2019-08-12 · Services Computing Technology and System Lab. Cluster and Grid Computing Lab. a High Performance and

21

Summary

Cluster and Grid Computing LabServices Computing Technology and System Lab Ares

The fault tolerant scheduling problem

The Nirvana algorithm

Implementation of Ares on top of Storm

Based on best-response dynamics

The stream latency modelThe stream recovery model

Thank you! Any question?