Supporting Dynamic Migration in Tightly Coupled Grid Applications
Liang Chen, Qian Zhu, Gagan Agrawal
Computer Science & Engineering, The Ohio State University


Post on 13-Dec-2015


1

Supporting Dynamic Migration in Tightly Coupled Grid Applications

Liang Chen
Qian Zhu
Gagan Agrawal

Computer Science & Engineering
The Ohio State University

2

Introduction - Motivation
– Grid resources vary frequently
– Tightly coupled applications in the Grid
  • vs. bag-of-tasks applications
  • Pipelined applications and streaming applications
  • Features:
    – Dependencies
    – Long-running
    – Large volumes of data transferred between tasks (stages)
– Dynamically allocating new resources and migrating applications to the new resources improves performance

weili lin
The Grid was first developed to enable resource sharing within far-flung scientific collaborations, such as collaborative visualization of large scientific datasets and distributed computing for highly computationally demanding data analysis. Just as the WWW began as a technology for scientific cooperation and was adopted by e-business, people expect the same trajectory for Grid technologies.
weili lin
Resources in different data formats; resources on different platforms; different kinds of resources, such as storage resources, software, data, and the like. Some sharing relationships are transient, for example because of resource upgrades.

3

Introduction - Challenges
– Checkpointing is a classic method to support dynamic migration
  • Take a snapshot of the system's running state
  • Transmit it to a remote site
  • Restore the execution context and restart processes
– Pros of checkpointing
  • May be transparent to applications
– Cons of checkpointing
  • Platform dependent
  • Inefficient

4

Introduction - Our approach
– Typical processing structure of a data streaming application:

    ...
    while (true) {
        read_data_from_streams();
        process_data();
        accumulate_intermediate_results();
        reset_auxiliary_structures();
    }
    ...

– Our approach is based on the Light-weight Summary Structure (LSS)
  • The data structure that stores summary information is the light-weight summary structure
  • All other data structures are auxiliary structures

5

Introduction - Contribution
• Proposed the notion of LSS, which enables efficient process migration
• Implemented application migration using LSS in the GATES middleware
• Designed a dynamic resource allocation algorithm for pipeline processing on streaming data
• Demonstrated an architecture for resource monitoring and allocation
• Extensively evaluated the LSS implementation using 3 data stream applications

6

Middleware System Architecture

• Features of data streams
  – Data arrive continuously
  – Volume is enormous, so data must be processed online
  – Data need to be processed in real time
  – Data sources could be distributed
• Needs for processing distributed data streams
  – A middleware running in the Grid
  – Allocates Grid resources
  – Provides a self-adaptation function

7

Middleware System Architecture

– GATES (Grid-Based AdapTive Execution on Streams) middleware
  • Uses Globus Toolkit 3.0, built on OGSA
  • Allows users to specify their algorithms implemented in Java
  • Takes care of plugging user-defined algorithms into the system and running them in the Grid
  • Applications need to be broken down into a number of pipelined stages
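To make "pipelined stages" concrete, here is a minimal sketch in plain Java, not GATES code. The class name, the sentinel value, and the squaring and summing stages are all assumptions made for illustration. Stage A feeds Stage B through a queue, and Stage B feeds Stage C through a second queue.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.IntUnaryOperator;

// Illustrative three-stage pipeline; none of these names come from GATES.
final class PipelineSketch {
    // Sentinel marking the end of the stream (an assumption of this sketch).
    private static final int END_OF_STREAM = Integer.MIN_VALUE;

    // A middle stage: take from the upstream queue, transform, put downstream.
    static Thread stage(BlockingQueue<Integer> in, BlockingQueue<Integer> out,
                        IntUnaryOperator transform) {
        return new Thread(() -> {
            try {
                while (true) {
                    int item = in.take();
                    if (item == END_OF_STREAM) {
                        out.put(END_OF_STREAM); // propagate shutdown downstream
                        return;
                    }
                    out.put(transform.applyAsInt(item));
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
    }

    // Stage A emits 1..n, Stage B squares each item, Stage C sums the results.
    static long runPipeline(int n) {
        BlockingQueue<Integer> aToB = new LinkedBlockingQueue<>();
        BlockingQueue<Integer> bToC = new LinkedBlockingQueue<>();
        Thread stageB = stage(aToB, bToC, x -> x * x);
        stageB.start();
        try {
            for (int i = 1; i <= n; i++) {   // Stage A: data source
                aToB.put(i);
            }
            aToB.put(END_OF_STREAM);
            long sum = 0;                    // Stage C: final consumer
            for (int item = bToC.take(); item != END_OF_STREAM; item = bToC.take()) {
                sum += item;
            }
            stageB.join();
            return sum;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("pipeline interrupted", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(runPipeline(4)); // prints 30
    }
}
```

In this sketch Stages A and C run on the calling thread for brevity; in a Grid deployment each stage would run on its own node.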

8

Middleware System Architecture

[Figure: an application's pipelined stages (Stage A, Stage B, Stage C) shown with GATES services A, B, and C. Legend: GATES services; stages of an application; queues between Grid services; buffers for applications.]

9

System Architecture and Design (GATES API Functions)

    public class SecondStage implements StreamProcessing {
        …
        void work(buffer in, buffer out) {
            …
            while (true) {
                data = GATES.getFromInputBuffer(in);
                interResults = processing(data);
                GATES.putToOutputBuffer(out, interResults);
            }
        }
    }

10

Roadmap
• Introduction
  ‒ Motivation for tightly coupled applications in Grid
  ‒ Challenge and our approach
• Middleware System Overview
  ‒ Introduce the system architecture and design
• Implementing Dynamic Migration Using LSS
  ‒ Light-weight summary structure (LSS) and its example
  ‒ Advantages of utilizing LSS
  ‒ LSS implementation detail
  ‒ Architecture of dynamic resource allocation scheme
• Evaluation
  ‒ Three distributed data stream applications
  ‒ Memory usage of LSS
  ‒ Efficient migration by using LSS
  ‒ Processing accuracy and LSS migration
• Related work
• Conclusion

11

Light-weight Summary Structure (LSS) & its Example
• LSS is a data structure that stores summary information of the processing
• Auxiliary structures are all other data structures
• Example: an application calculates the average value of all integers in a stream
  – Two stages:
    • the first is the data source
    • the second calculates the sum and counts the number of integers; ave = sum/count
  – The LSS would be the sum and the count
  – Auxiliary structures would be the loop index and other temporary variables
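The averaging example can be sketched as follows. This is plain Java written for illustration, not taken from the talk; the class and method names are assumptions. The fields of `AverageLss` are the LSS, and everything local to `processBatch` is auxiliary.

```java
import java.util.List;

final class AverageStage {
    // Second stage: fold one batch from the stream into the LSS.
    static void processBatch(List<Integer> batch, AverageLss lss) {
        long batchSum = 0;                        // auxiliary structure
        for (int i = 0; i < batch.size(); i++) {  // loop index: auxiliary too
            batchSum += batch.get(i);
        }
        lss.sum += batchSum;                      // accumulate into the LSS
        lss.count += batch.size();
        // batchSum and i are discarded here; only the LSS carries state forward.
    }

    public static void main(String[] args) {
        AverageLss lss = new AverageLss();
        processBatch(List.of(1, 2, 3), lss);
        processBatch(List.of(4, 5, 6), lss);
        System.out.println("average = " + lss.average()); // prints average = 3.5
    }
}

// Summary information: the only state that must survive a migration.
final class AverageLss {
    long sum;    // running sum of all integers seen so far
    long count;  // number of integers seen so far

    double average() {
        return count == 0 ? 0.0 : (double) sum / count;
    }
}
```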

12

Advantages of using LSS

• Efficient: only the LSS is migrated
  – Only "sum" and "count" migrate
• Does not impact the accuracy of processing
• Supports migration across heterogeneous platforms
  – "sum" and "count" are logical structures
• Reduces application developers' effort in making applications capable of migration
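One way to see why logical structures migrate across heterogeneous platforms is to encode them in a fixed wire format. The sketch below is an assumption for illustration, not the GATES transfer mechanism, which these slides do not describe. It writes "sum" and "count" as two big-endian 64-bit integers using the standard `java.nio.ByteBuffer`.

```java
import java.nio.ByteBuffer;

final class LssTransferDemo {
    public static void main(String[] args) {
        byte[] wire = new SumCountLss(21L, 6L).toBytes();   // what the old node would send
        SumCountLss restored = SumCountLss.fromBytes(wire); // what the new node would rebuild
        System.out.println(wire.length + " bytes, sum=" + restored.sum
                + ", count=" + restored.count);
    }
}

// Hypothetical LSS holding only the logical values "sum" and "count".
final class SumCountLss {
    final long sum;
    final long count;

    SumCountLss(long sum, long count) {
        this.sum = sum;
        this.count = count;
    }

    // ByteBuffer defaults to big-endian order regardless of the host platform,
    // so the encoding does not depend on the sender's memory layout.
    byte[] toBytes() {
        return ByteBuffer.allocate(2 * Long.BYTES).putLong(sum).putLong(count).array();
    }

    static SumCountLss fromBytes(byte[] wire) {
        ByteBuffer buffer = ByteBuffer.wrap(wire);
        long sum = buffer.getLong();
        long count = buffer.getLong();
        return new SumCountLss(sum, count);
    }
}
```

The whole migrated state here is 16 bytes. A checkpoint of the same process would also carry the stack, the heap, and platform-specific context.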

13

An Example of LSS
• LSS can be used to support dynamic migration
  – GATES provides an API function to allocate memory for the LSS
  – An application stores summary information in the LSS
  – Only the LSS is transmitted to a new node, at the end of the loop
  – The LSS is restored at the new node

14

Application using LSS

    public class SecondStage implements StreamProcessing {
        …
        void work(buffer in, buffer out) {
            while (true) {
                data = GATES.getFromInputBuffer(in);
                interResults = processing(data);
                GATES.putToOutputBuffer(out, interResults);
            }
        }
    }

Slide callouts on the code (steps added for LSS):
• LSS = get an LSS from GATES
• Accumulate Inter-Results to the LSS
• Reset all auxiliary structures
• Inform GATES that migration could be executed
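The control flow implied by these steps can be simulated without the middleware. In the sketch below every name is hypothetical: the `BooleanSupplier` stands in for whatever check the middleware offers, and copying two fields stands in for shipping the LSS to a new node. It is meant to show that stopping only at the end of an iteration, when the auxiliary structures are already reset, leaves the final result unchanged.

```java
import java.util.ArrayDeque;
import java.util.List;
import java.util.Queue;
import java.util.function.BooleanSupplier;

final class MigratableStage {
    // Processes batches until the stream is empty or a migration is requested.
    // Returns true if it stopped because a migration was requested.
    static boolean work(Queue<int[]> in, RunningLss lss, BooleanSupplier migrationNeeded) {
        while (!in.isEmpty()) {
            int[] batch = in.poll();
            long batchSum = 0;             // auxiliary structure
            for (int value : batch) {
                batchSum += value;
            }
            lss.sum += batchSum;           // accumulate intermediate results into the LSS
            lss.count += batch.length;
            // End of iteration: auxiliary state is gone, so this is a safe point.
            if (migrationNeeded.getAsBoolean()) {
                return true;               // the caller ships only the LSS
            }
        }
        return false;
    }

    static Queue<int[]> sampleStream() {
        return new ArrayDeque<>(List.of(new int[] {1, 2}, new int[] {3, 4}, new int[] {5, 6}));
    }

    public static void main(String[] args) {
        Queue<int[]> stream = sampleStream();
        RunningLss onOldNode = new RunningLss();
        work(stream, onOldNode, () -> true);   // migration requested after the first batch

        RunningLss onNewNode = new RunningLss(); // "restore" the LSS on the new node
        onNewNode.sum = onOldNode.sum;
        onNewNode.count = onOldNode.count;
        work(stream, onNewNode, () -> false);  // finish the stream there

        System.out.println(onNewNode.sum + " / " + onNewNode.count); // prints 21 / 6
    }
}

// Hypothetical LSS for a running sum and count.
final class RunningLss {
    long sum;
    long count;
}
```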

15

Implementation Detail

16

Architecture of dynamic resource allocation scheme

• Using Information Service to collect resource information

• Apply dynamic resource allocation algorithm

• Advise and assist GATES services to migrate
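The slides do not spell out the allocation algorithm, so the following is only an illustrative stand-in for the collect, decide, advise flow. The capacity map, the node names, and the `minGain` threshold are all assumptions. It recommends a migration target only when another node offers enough additional capacity.

```java
import java.util.Map;
import java.util.Optional;

// Illustrative decision step; this is not the algorithm from the talk.
final class AllocationSketch {
    // availableCapacity: node name -> free capacity in [0, 1], as an
    // information service might report it (assumed format).
    static Optional<String> advise(String currentNode,
                                   Map<String, Double> availableCapacity,
                                   double minGain) {
        double currentCapacity = availableCapacity.getOrDefault(currentNode, 0.0);
        String best = null;
        double bestCapacity = currentCapacity;
        for (Map.Entry<String, Double> entry : availableCapacity.entrySet()) {
            if (entry.getValue() > bestCapacity) {
                best = entry.getKey();
                bestCapacity = entry.getValue();
            }
        }
        // Migrate only if the gain is large enough to be worth the move.
        if (best != null && bestCapacity - currentCapacity >= minGain) {
            return Optional.of(best);
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Map<String, Double> capacity = Map.of("nodeA", 0.2, "nodeB", 0.9, "nodeC", 0.5);
        System.out.println(advise("nodeA", capacity, 0.3)); // prints Optional[nodeB]
    }
}
```

The threshold reflects one plausible design concern: a migration has a cost, so small capacity differences should not trigger one.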

17

Roadmap
• Introduction
  ‒ Motivation for tightly coupled applications in Grid
  ‒ Challenge and our approach
• Middleware System Overview
  ‒ Introduce the system architecture and design
• Implementing Dynamic Migration Using LSS
  ‒ Light-weight summary structure (LSS) and its example
  ‒ How applications utilize LSS
  ‒ LSS implementation detail
  ‒ Architecture of dynamic resource allocation scheme
• Evaluation
  ‒ Three distributed data stream applications
  ‒ Memory usage of LSS
  ‒ Efficient migration by using LSS
  ‒ Processing accuracy and LSS migration
• Related work
• Conclusion

18

Experimental Evaluation

• Evaluation
  – Three applications
    • Counting sample
      – LSS stores the intermediate top M frequently occurring numbers
    • Clustream, clustering data points in streams
      – LSS stores micro-clusters computed at the second stage
    • Dist-Freq-Counting, finding frequent itemsets in distributed streams
      – LSS stores unprocessed itemsets

19

Experimental Evaluation

• Memory usage of LSS

20

Experimental Evaluation

• Memory usage of LSS

21

Experimental Evaluation

• Migration using LSS is efficient

22

Experimental Evaluation

• Migration using LSS is efficient

23

Experimental Evaluation

• Migration using LSS is efficient

24

Experimental Evaluation

• Migration using LSS is efficient

25

Experimental Evaluation

• Benefits of migration in a dynamic environment

26

Experimental Evaluation

• Benefits of migration in a dynamic environment

27

Experimental Evaluation

• LSS migration does not impact processing accuracy
  – The counting sample application was used
  – The average accuracy of the processing results was 97.28% for the non-migration version and 97.51% for the migration version

28

Related work
• Condor
• XCATS
• Charm++
• User Level Processes (ULP)
• Migratable PVM
• Piranha

29

Related Work
• Middleware for data stream processing
  – DataCutter, Stampede
  – Differences: run in a cluster, no self-adaptation, not specifically for real-time processing
• Continuous query systems
  – STREAM, dQUOB, TelegraphCQ, NiagaraCQ
  – Differences: centralized, no adaptation support
• Distributed continuous query systems
  – Aurora*, Medusa, Borealis
  – Differences: continuous queries, not in a Grid environment
• In-network aggregation in sensor networks
• Stream-based overlay networks

30

Conclusion
• LSS enables efficient migration for distributed data stream applications
• The main observations from our experiments
  – LSS enables efficient process migration; the size of the process state is reduced by 30-120 times
  – LSS introduces a very small overhead
  – Migration significantly improves the performance of long-running applications
  – Our migration scheme does not impact the accuracy of the processing

31

Questions?

32

Implementing Dynamic Migration Using LSS

    ...
    while (true) {
        ...
        // check if migration is needed
        if (GATES.ifMigrationNeeded()) {
            GATES.migrate(lss);
            break;
        }
    }

Code running at the remote computing node

33

Architecture of Dynamic Resource Allocation Scheme