
Towards a Virtual Cluster Over Multiple Physical Clusters Using Overlay Network

PRAGMA20, 2-4 March 2011

Kei Kokubo, Yuki Fujiwara, Kohei Ichikawa, Susumu Date (Osaka University)

Adrian Ho, Jason Haga (University of California, San Diego)

Background

PRAGMA Grid test-bed:
› Shares clusters managed by multiple sites, realizing a large-scale computational environment.
› Expected to serve as a platform for computationally intensive applications: highly independent processes that can be distributed, e.g., docking simulations.

http://www.rocksclusters.org/rocks-register/

[Diagram: clusters at Sites A, B, and C (OS: Debian, lib: glibc2.0 / OS: Redhat, lib: glibc3.0 / OS: Redhat, lib: glibc2.0) combined into one large-scale Grid environment]

Virtual cluster
A virtualized cluster composed of virtual machines (VMs).
› Builds a private computational environment that can be customized for users.
› Relatively easy to deploy on a single physical cluster by utilizing cluster-building tools.

[Diagram: computers at a site hosting VMs with user-customized environments (OS: Debian, lib: glibc2.0 / OS: Redhat, lib: glibc3.0), connected by a virtual local network on top of the local network (LAN)]

Rocks: developed by UCSD
Rocks is installed on clusters at sites in the PRAGMA test-bed.

Rocks virtual cluster:
1. A virtual cluster is allocated a VLAN ID and a network.
2. Virtual compute nodes are automatically installed via network boot technology (PXE boot).

[Diagram: within one physical cluster, a virtual frontend node (eth0/eth1) and virtual compute nodes (eth0) PXE-boot over VLAN 2; this requires layer 2 (LAN) communication, which is not available across a WAN]
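For reference, the Rocks Xen Roll drives this with a few commands. A minimal sketch, assuming Rocks 5.x conventions; the arguments and generated host names are illustrative and vary by version:

  # allocate a virtual cluster (Rocks assigns a VLAN ID and private network)
  rocks add cluster ip=<frontend-public-ip> num-computes=2
  # boot the virtual frontend, then the virtual compute nodes install via PXE
  rocks start host vm frontend-0-0-0
  rocks start host vm hosted-vm-0-0-0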

Issue: with Rocks alone, it is difficult to build a virtual cluster over multiple clusters at Grid sites, because the required layer 2 communication does not cross the WAN.

Our Goal
Develop a system that can build a virtual cluster over multiple clusters at Grid sites for computationally intensive applications.

Our Approach
› Focus on Rocks
› Seamlessly integrate the N2N overlay network with Rocks

[Diagram: a Rocks virtual cluster spanning Rocks cluster A (Site A) and Rocks cluster B (Site B), layered on an N2N overlay network over the physical network]

N2N: overlay network technology developed by the ntop project in Italy
1. Creates an encrypted layer 2 overlay network using a P2P protocol.
2. Can establish a layer 2 network spanning multiple sites. › Utilizes a TAP virtual network interface (VNIC).
3. Divides overlay networks in a manner similar to VLAN IDs. › Community name (see the command sketch below)

[Diagram: N2N VNICs at Sites A and B (MAC addresses 13:14:15:16:18:26 and 11:22:33:44:55:66) joined through their physical NICs, LANs, and the WAN into one N2N overlay network identified by a community name (network ID)]

Virtual cluster construction (1/3)
MVC Controller (MVC: Multi-site Virtual Cluster)
Step 1: Register multiple Rocks clusters as resources for a virtual cluster.

  rocks add mvc Site A:Site B

[Diagram: the MVC Controller (MVC Database, Overlay Network Constructor, Resource Manager, VM Manager) on the frontend node at Site A coordinating with the Resource Manager and Rocks database on the frontend node at Site B over the WAN]

Virtual cluster construction (2/3)
MVC Controller (MVC: Multi-site Virtual Cluster)
Step 2: The Overlay Network Constructor builds a layer 2 overlay network for each virtual cluster, using the cluster name as the cluster ID, as sketched below.

[Diagram: N2N VNICs created on the frontend and compute nodes at Sites A and B, joined across the LANs and the WAN into an N2N overlay network identified by the cluster name]
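Purely as an illustration of what this step automates (these are not the project's actual scripts), the constructor's job amounts to running an N2N edge on each participating node with the cluster name as the community:

  # on every physical node hosting part of virtual cluster "mycluster"
  edge -d n2n-mvc0 -a 192.168.10.1 -c mycluster -k sharedkey -l supernode.example.org:7654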


Virtual cluster construction (3/3)
MVC Controller (MVC: Multi-site Virtual Cluster)
Step 3: Seamlessly connect the virtual frontend node and the virtual compute nodes to the N2N overlay network (see the bridging sketch below).

  rocks start host vm overlay frontend
  rocks start host vm overlay compute nodeA Site=A
  rocks start host vm overlay compute nodeB Site=B

[Diagram: the virtual frontend (eth0/eth1) at Site A and virtual compute nodes (eth0) at Sites A and B attach to N2N VNICs, so the compute nodes PXE-boot from the virtual frontend across the WAN]

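A minimal sketch of the plumbing this step implies, assuming Xen-style Linux bridging (the bridge and interface names are illustrative, not the project's actual scripts):

  # bridge the N2N TAP VNIC on the physical host
  brctl addbr n2nbr
  brctl addif n2nbr n2n-mvc0
  # attach the VM's virtual interface (vif) to the same bridge, so the
  # guest's eth0 traffic, including its PXE broadcast, enters the overlay
  brctl addif n2nbr vif1.0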

Features of our virtual cluster solution

[Diagram: the completed virtual cluster: a virtual frontend at Site A and virtual compute nodes at Sites A and B, joined by the N2N overlay network (cluster name as cluster ID) into one virtual LAN]

  $ qsub -np $NSLOTS app.sh
  $ mpirun -np 2 app.mpi

The virtual cluster can be used in the same way as a Rocks virtual cluster at a local site.
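For example, with the SGE scheduler that Rocks ships, a job script along these lines would run the MPI application across the sites; the parallel environment name and file names are assumptions:

  #!/bin/bash
  # app.sh - illustrative SGE job script
  #$ -cwd          # run from the submission directory
  #$ -pe mpi 8     # request 8 slots ("mpi" PE name is an assumption)
  mpirun -np $NSLOTS app.mpi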

Experiment Environment

1. Verify the possibility of building a virtual cluster over multiple Rocks clusters.
2. Evaluate calculation performance for a computationally intensive application.

[Testbed: the frontend nodes of Rocks clusters A and B, each with 4 compute nodes, connected through a 1 Gbps switch and a WAN emulator between Site A and Site B; see the traffic-control sketch below]

Each node: OS: CentOS 5.4 (Rocks 5.4); CPU: Intel Xeon 2.27 GHz x 2 (16 cores); Memory: 12 GB; Network: 1 Gbps; 4 compute nodes per cluster.
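The WAN emulator's role can be reproduced with standard Linux traffic control; a minimal sketch, assuming the emulator forwards between two NICs and shaping is applied on the egress interface (interface names and values illustrative):

  # add 100 ms of one-way delay
  tc qdisc add dev eth1 root handle 1: netem delay 100ms
  # additionally cap bandwidth at 30 Mbit/s
  tc qdisc add dev eth1 parent 1: handle 2: tbf rate 30mbit burst 32kbit latency 400ms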

Experiment (Possibility of Building)

1. Verified that a virtual cluster over clusters A and B can be built through the N2N overlay network.
2. Verified the possibility of building a virtual cluster in a WAN environment.
› Changed the latency at the WAN emulator: 0 ms, 20 ms, 60 ms, 100 ms, 140 ms.
› Measured the install time for the 4 virtual compute nodes (about 1.0 GB of packages to install).

Install time for virtual compute nodes vs. latency at the WAN emulator:

  Latency   Install/configuration (s)   Package download (s)   Total (s)
  0 ms      513                          179                    692
  20 ms     578                          787                   1365
  60 ms     603                         2025                   2628
  100 ms    626                         2473                   3099
  140 ms    648                         2873                   3521

Verified that virtual compute nodes can be installed over the WAN: a virtual cluster over multiple Rocks clusters can be built even if the Rocks clusters are in a WAN environment.

Experiment (Calculation Performance)

Measured the execution time of a computationally intensive application.
› DOCK 6.2 (sample program): 30 compounds docked against a protein, divided among 8 processes; there is little communication between the 8 processes.
› Changed the latency and bandwidth at the WAN emulator: 20 ms, 60 ms, 100 ms, 140 ms / 500 Mbps, 100 Mbps, 30 Mbps.

Execution time vs. latency at the WAN emulator (bandwidth 1000 Mbps):

  Latency (ms)           0    20    60   100   140
  Execution time (s)    63    63    64    64    67

Execution time vs. bandwidth at the WAN emulator:

  Bandwidth (Mbps)    1000   500   100    30
  Execution time (s)    63    63    63    63

The effect on performance is small even if the latency is high and the bandwidth is narrow.

Conclusion and Future Work

Conclusion
• We have designed and prototyped a virtual cluster solution over multiple clusters at Grid sites.
• N2N is seamlessly integrated with Rocks.
• Verified that the calculation performance of a distributed application scales even in a WAN environment.

Future work
1. Manage multiple virtual clusters deployed by multiple users.
2. Shorten the install time of virtual compute nodes: improve the performance of the N2N overlay network, and set up a package cache repository per site.

Requirements for our virtual cluster solution

• Rocks with the Xen Roll
• N2N
  › RPM package installation (for edge nodes and a supernode)
  › Open some ports for N2N (see the sketch below)
• Install the MVC Controller
  › Composed of some new Python scripts
  › Provides original rocks commands (still under development)
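A minimal sketch of that node preparation; the package file name and UDP port are assumptions (7654 matches the supernode port used in the earlier example):

  # install the N2N RPM on each edge node and the supernode host
  rpm -ivh n2n-*.rpm
  # allow N2N's UDP traffic through the firewall
  iptables -A INPUT -p udp --dport 7654 -j ACCEPT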

Thank you for your attention!

Fin