Applying Control Theory to Stream Processing Systems
Wei Xu ([email protected]), Bill Kramer ([email protected]), Joe Hellerstein (hellers@us.ibm.com)

Post on 22-Dec-2015



Description of the system

TCQ: complex internal structure

[Diagram labels: Data Source, Input Buffer]

TCQ drops tuples silently if result queue is full

Why do we need control?
• Data source does not provide accurate data rate

[Plot: desired load vs. actual load; x-axis: time (ms, ×10^5); y-axis: number of tuples per sec]

Why do we need control?
• TCQ node drops tuples when the result queue fills up

[Diagram labels: Source, Buffer, TCQ, Result Q]
[Three plots over time (ms, ×10^5): (1) source data rate and end-to-end drop rate, in tuples per sec; (2) output rate of buffer, in tuples per sec; (3) free space (KB, ×10^5)]

Control Problems

• Provide an accurate data source
– Get the actual data rate

• Regulate queue length on the TCQ node
– Prevent dropping tuples
– Maximize throughput (and adapt when a disturbance happens)

System with Control

[Diagram labels: Output Rate Controller, Controlled Data Source, Queue Length Monitor]

The Control Architecture

P Controller

PI Controller

Comment (Joseph L Hellerstein): I don't understand the direction of the arrows for the reference, error, and control inputs.
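The slides name the two controller types without spelling out their control laws. As a minimal sketch, the textbook discrete-time forms are shown below, run against the first-order model y(k+1) = a y(k) + b u(k) that appears in the backup slides. The class names, the gains, and the plant parameters a = 0.5 and b = 0.4 are illustrative assumptions, not values from the talk.

```python
class PController:
    """Proportional control: u(k) = kp * e(k), with e(k) = reference - measured."""

    def __init__(self, kp):
        self.kp = kp

    def update(self, reference, measured):
        return self.kp * (reference - measured)


class PIController:
    """Proportional-integral control: u(k) = kp * e(k) + ki * (sum of e(0..k))."""

    def __init__(self, kp, ki):
        self.kp = kp
        self.ki = ki
        self.error_sum = 0.0

    def update(self, reference, measured):
        error = reference - measured
        self.error_sum += error  # accumulated error is the integral term
        return self.kp * error + self.ki * self.error_sum


def simulate(controller, a, b, reference, steps):
    """Run the assumed plant y(k+1) = a*y(k) + b*u(k) in closed loop; return the final y."""
    y = 0.0
    for _ in range(steps):
        u = controller.update(reference, y)
        y = a * y + b * u
    return y


# Hypothetical plant and gains, chosen only so that both loops are stable.
p_final = simulate(PController(kp=1.0), a=0.5, b=0.4, reference=1000.0, steps=200)
pi_final = simulate(PIController(kp=0.5, ki=0.5), a=0.5, b=0.4, reference=1000.0, steps=200)
```

With these assumed numbers the P loop settles below the reference (a steady-state error), while the integral term in the PI loop drives the error to zero.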

Result – An accurate data source

[Two plots, titled "P Controller with Pre-compensation" and "PI Controller"; each shows desired load and actual load; x-axis: time (ms, ×10^5); y-axis: tuples per sec]

Comment (Joseph L Hellerstein): But it's a P controller with precompensation.
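A P controller on its own usually leaves a steady-state error. One standard remedy, sketched here, is precompensation: scale the reference by a constant so that the closed loop's steady-state output equals the desired value. The sketch assumes the first-order model y(k+1) = a y(k) + b u(k) from the backup slides. For that model the closed loop is y(k+1) = (a - b kp) y(k) + b kp N r, so the steady-state output equals r when N = (1 - a + b kp) / (b kp). The function names and numeric values are hypothetical; the slides do not give the factor actually used.

```python
def precompensation_gain(a, b, kp):
    """Reference scaling N that removes the steady-state error of a P loop
    around the assumed plant y(k+1) = a*y(k) + b*u(k)."""
    return (1.0 - a + b * kp) / (b * kp)


def simulate_p_with_precompensation(a, b, kp, reference, steps):
    """Closed loop with u(k) = kp * (N * reference - y(k)); returns the final y."""
    scale = precompensation_gain(a, b, kp)
    y = 0.0
    for _ in range(steps):
        u = kp * (scale * reference - y)
        y = a * y + b * u
    return y


# Hypothetical plant (a=0.5, b=0.4) and gain (kp=1.0), for illustration only.
final_load = simulate_p_with_precompensation(0.5, 0.4, 1.0, reference=1000.0, steps=200)
```

The scaling is only as good as the model: if a or b drifts, the steady-state error returns, which is the usual argument for adding integral action instead.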

Result – regulating queue length

[Diagram labels: Source, Buffer, TCQ, Result Q]
[Three plots over time (ms, ×10^5): (1) source data rate and end-to-end drop rate, in tuples per sec; (2) output rate of buffer, in tuples per sec; (3) free space (KB, ×10^5)]

Comment (Joseph L Hellerstein): Can you explain the spikes?

Result – Under CPU Contention

[Diagram labels: Source, Buffer, TCQ, Result Q]
[Three plots over time (ms, ×10^5): (1) source data rate and end-to-end drop rate, in tuples per sec; (2) output rate of buffer, in tuples per sec; (3) free space (KB, ×10^5)]

Comment (Joseph L Hellerstein): It would be good to go back to the control diagram to show how CPU contention relates to disturbances.
Comment (Joseph L Hellerstein): I'm not sure I fully understand what happened here.

Why is theory useful?
• One of my implementations ... What happened?

[Diagram labels: Source, Buffer, TCQ, Result Q]
[Three plots over time (ms, ×10^5): (1) source data rate and end-to-end drop rate, in tuples per sec; (2) output rate of buffer, in tuples per sec; (3) free space (KB, ×10^5)]

What is going on?

[Diagram labels: Desired Queue Length, Queue Length Controller, Data Rate to TCQ, Controlled Output Thread (Code Reuse), Actual Queue Length]

Theory meets reality

[Plot: output y from simulation; x-axis: time; y-axis: queue length]

Tricky part of parameter estimation

[Scatter plot; x-axis: u, number of tuples per sec (×10^5); y-axis: y, free space on queue (×10^5)]

Model evaluation – Making the system operate in desired range

Non-linear range

Easy for data source, but queue length ...

[Two plots of free space (KB, ×10^5) over time (ms, ×10^5)]
[Scatter plot; x-axis: u, number of tuples per sec (×10^5); y-axis: y, free space on queue (×10^5)]
[Panel titles: Free Space; Data rate vs free space]

Comment (Joseph L Hellerstein): Need a plot that shows why this is non-linear due to the threshold effects.
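One way to see where threshold effects could come from is a toy queue whose free space is clamped between zero (queue full, tuples dropped) and its capacity (queue empty). The sketch below is an assumption-laden illustration, not the TCQ implementation: the function names, the per-step drain rate, and the abstract units are all hypothetical. Between the two limits the free space responds linearly to the input rate; at either limit it stops responding, which a linear model cannot capture.

```python
def next_free_space(free_space, inflow, drain, capacity):
    """One step of a toy queue: free space shrinks with inflow, grows with drain,
    and is clamped to the range [0, capacity]."""
    unclamped = free_space - inflow + drain
    return max(0, min(capacity, unclamped))


def simulate_free_space(inflow, drain, capacity, start, steps):
    """Trace of free space for a constant inflow (hypothetical units per step)."""
    free = start
    trace = []
    for _ in range(steps):
        free = next_free_space(free, inflow, drain, capacity)
        trace.append(free)
    return trace


# Inflow above the drain rate: free space falls linearly, then sticks at 0.
overloaded = simulate_free_space(inflow=500, drain=200, capacity=1000, start=500, steps=3)
# Inflow below the drain rate: free space rises linearly, then sticks at capacity.
underloaded = simulate_free_space(inflow=100, drain=200, capacity=1000, start=800, steps=3)
```

Model estimation data gathered while the queue sits at either limit would show the input changing with no change in the output, so experiments need to keep the queue inside the linear range.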

Settling Time and Overshoot Matter

A lot of small disturbances in a Java program
Incremental garbage collection

[Two plots, titled "P Controller" and "PI Controller"; each shows desired load and actual load; x-axis: time (ms, ×10^5); y-axis: number of tuples per sec]
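Settling time and overshoot can be read off a recorded step response. The helper functions below are a minimal sketch of one common way to compute them; the function names, the 2% tolerance band, and the sample data are assumptions made for illustration and are not measurements from the talk. The sketch also assumes a positive reference.

```python
def overshoot(response, reference):
    """Maximum overshoot as a fraction of the reference (0.0 if never exceeded)."""
    peak = max(response)
    return max(0.0, (peak - reference) / reference)


def settling_index(response, reference, tolerance=0.02):
    """Index of the first sample after which the response stays inside the
    tolerance band around the reference; None if it has not settled by the end."""
    band = tolerance * abs(reference)
    settled_from = None
    for k, value in enumerate(response):
        if abs(value - reference) <= band:
            if settled_from is None:
                settled_from = k
        else:
            settled_from = None  # left the band, so start over
    return settled_from


# Made-up step response toward a reference of 1000 tuples per sec.
samples = [0, 600, 1100, 1040, 1010, 1000, 1000]
peak_fraction = overshoot(samples, 1000)
settled_at = settling_index(samples, 1000)
```

Multiplying the settling index by the controller's sample period gives the settling time in milliseconds. With frequent small disturbances such as garbage collection pauses, a loop that settles slowly may never reach the band before the next disturbance arrives.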

Conclusion

• Advantages of feedback control
– Make system more robust under disturbance
– Treat complex systems as black boxes
• Cope with the system characteristics instead of having to change it
– Encourage reporting system statistics
– Implementation is easy and has theoretical guarantees

Future Work

• Load balancer

• Smaller sample time to reduce disturbance caused by Java GC?

• Controller on scheduling of system shared by multiple streams

Backup Slides

Outline

• Problems and Motivation

• Controller design

• Result

• Discussion

Description of the System

Revised

[Diagram labels: Data Source, Tuple Blocks, Input Buffer, Load Splitter, Routing Logic, Tuples, TCQ Node (two nodes), Queue length]

Operation of Load Splitter
1. Arriving blocks wait in Input Buffer
2. Tuples are routed to balance TCQ queue lengths
3. Stop routing if queue length is too large to avoid tuple discards
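The three steps above can be sketched as a small routing loop. This is a hypothetical illustration only: the function names, the shortest-queue policy, and the single `max_queue_length` threshold are assumptions, since the slide does not say how "balance" or "too large" are defined in the real load splitter.

```python
from collections import deque


def pick_node(queue_lengths, max_queue_length):
    """Return the index of the shortest TCQ queue, or None if even that
    queue has reached the threshold (step 3: stop routing)."""
    shortest = min(range(len(queue_lengths)), key=lambda i: queue_lengths[i])
    if queue_lengths[shortest] >= max_queue_length:
        return None
    return shortest


def route(input_buffer, queue_lengths, max_queue_length):
    """Route tuples from waiting blocks (step 1) to the shortest queue (step 2).

    input_buffer is a deque of blocks, each block a deque of tuples.
    queue_lengths is updated in place. Returns a list of (tuple, node) pairs.
    """
    assignments = []
    while input_buffer:
        block = input_buffer[0]
        while block:
            node = pick_node(queue_lengths, max_queue_length)
            if node is None:
                return assignments  # unrouted tuples keep waiting in the buffer
            assignments.append((block.popleft(), node))
            queue_lengths[node] += 1
        input_buffer.popleft()  # block fully routed
    return assignments
```

In this sketch, tuples that cannot be routed stay in the input buffer rather than being sent to a full queue, which is what prevents the silent discards described earlier in the slides.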

Compare to Open Loop Control

We know y(k), and we know what we want y(k+1) to be. Use the transfer function to solve for u(k).

(Expected result: accuracy and disturbance; to be done)
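For the first-order model y(k+1) = a y(k) + b u(k) used in these slides, solving for the input gives u(k) = (y_target - a y(k)) / b. The sketch below works through that arithmetic; the function names and the numeric values of a, b, and the loads are hypothetical and chosen only for illustration.

```python
def open_loop_input(a, b, y_now, y_target):
    """Solve y_target = a*y_now + b*u for u (requires b != 0)."""
    return (y_target - a * y_now) / b


def plant_step(a, b, y_now, u):
    """One step of the assumed model y(k+1) = a*y(k) + b*u(k)."""
    return a * y_now + b * u


# Hypothetical model: a = 0.5, b = 0.4; current load 800, wanted load 1000.
u = open_loop_input(0.5, 0.4, 800.0, 1000.0)
on_model = plant_step(0.5, 0.4, 800.0, u)   # plant matches the model
off_model = plant_step(0.5, 0.3, 800.0, u)  # plant's real b differs from the model
```

When the plant matches the model, the computed input reaches the target in one step. When the real parameter differs, as in the second call, the output misses the target and nothing in an open-loop scheme corrects it, which is the point of comparison with feedback control.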

Estimation of the transfer function

[Scatter plot; x-axis: u, desired tuples per sec; y-axis: y, actual tuples per sec; both axes range from -3000 to 3000]

y(k+1) = a y(k) + b u(k)

Regression

Comment (Joseph L Hellerstein): Make sure to explain the experimental setup.
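The parameters a and b of y(k+1) = a y(k) + b u(k) can be estimated from logged inputs and outputs by ordinary least squares. The sketch below solves the two-by-two normal equations directly; the function name and the synthetic data are assumptions for illustration, and the slides do not state which regression tool was actually used.

```python
def fit_first_order(y, u):
    """Least-squares estimate of (a, b) in y(k+1) = a*y(k) + b*u(k).

    y must hold one more sample than u: y[k+1] is the output observed
    after applying u[k] at output y[k].
    """
    s_yy = s_yu = s_uu = s_ny = s_nu = 0.0
    for k in range(len(u)):
        s_yy += y[k] * y[k]
        s_yu += y[k] * u[k]
        s_uu += u[k] * u[k]
        s_ny += y[k + 1] * y[k]
        s_nu += y[k + 1] * u[k]
    det = s_yy * s_uu - s_yu * s_yu
    if det == 0.0:
        raise ValueError("inputs do not excite the system enough to identify a and b")
    a = (s_ny * s_uu - s_nu * s_yu) / det
    b = (s_yy * s_nu - s_yu * s_ny) / det
    return a, b


# Synthetic, noise-free data from a hypothetical plant with a = 0.5, b = 0.4.
inputs = [100.0, 300.0, 200.0, 500.0, 400.0, 100.0, 600.0]
outputs = [0.0]
for value in inputs:
    outputs.append(0.5 * outputs[-1] + 0.4 * value)

a_hat, b_hat = fit_first_order(outputs, inputs)
```

On real measurements the fit will not be exact, and the input signal needs to vary enough, within the linear operating range, for the two parameters to be distinguishable.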

Tricky part of parameter estimation

Model evaluation – A data rate that makes it operate in the linear range

[Three plots over time (ms, ×10^5): (1) desired load, actual load, end drop, and TCQ drop, in number of tuples per sec; (2) block response time; (3) free space on result queue (×10^5)]