Post on 21-Dec-2015

PLuSH – Mesh Tree

Fast and Robust Wide-Area Remote Execution

Mikhail Afanasyev ‧ Jose Garcia ‧ Brian Lum


Introduction

PlanetLab is an open platform for developing, deploying and accessing planetary-scale services

It allows remote execution on nodes around the world

Extremely useful in developing new network technologies

Introduction ‧ Mesh Tree ‧ Additional


PlanetLab

Current distribution of 534 nodes over 253 sites


The Motivation

Performance is abysmal

Currently, the user must make an SSH connection to each node

As the number of nodes grows, the overhead of establishing SSH connections becomes more significant

Not all nodes can reach one another directly


The Motivation

Flaky Control

Controlling large sets of remote processes is difficult

Example: C-c (Ctrl-C) will result in either remote processes being killed or straggler processes remaining


PLuSH

Suppose that someone wants to test a new network application and decides to run the code on 100 machines

First, we must be able to determine a list of target nodes on which to run the experiment. Nodes can be picked depending on factors including CPU load, bandwidth, latency, etc.

Next, the code must be deployed

The code must be started simultaneously on all machines

Once the code is running, we must be able to monitor the progress and collect statistics
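
The node-selection step above can be sketched as a filter-and-sort over per-node measurements. This is only an illustration, not PLuSH's actual selection logic: the `NodeStats` fields, the `pick_nodes` function, the load threshold, and the `example.org` hostnames are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NodeStats:
    """Hypothetical per-node measurements (not a PLuSH data structure)."""
    hostname: str
    cpu_load: float        # load average reported by the node
    bandwidth_mbps: float  # available bandwidth
    latency_ms: float      # round-trip time from the controller


def pick_nodes(candidates: List[NodeStats], count: int,
               max_load: float = 2.0) -> List[str]:
    """Drop overloaded nodes, then prefer low latency and high bandwidth."""
    usable = [n for n in candidates if n.cpu_load <= max_load]
    usable.sort(key=lambda n: (n.latency_ms, -n.bandwidth_mbps))
    return [n.hostname for n in usable[:count]]


# Made-up measurements for four hypothetical nodes.
CANDIDATES = [
    NodeStats("node1.example.org", 0.5, 90.0, 40.0),
    NodeStats("node2.example.org", 5.0, 100.0, 10.0),  # overloaded
    NodeStats("node3.example.org", 1.0, 50.0, 20.0),
    NodeStats("node4.example.org", 0.2, 80.0, 20.0),
]

if __name__ == "__main__":
    print(pick_nodes(CANDIDATES, count=2))
```

Filtering before sorting keeps an overloaded node out of the experiment even when it has the best latency, which is the case for `node2.example.org` here.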


PLuSH

One of the core parts of PLuSH is the Mesh interface

Mesh abstracts the underlying overlay

Mesh uses a rough list of hosts to construct an overlay communication mesh by using the host directory to query host names and authentication information.


Weaving the Mesh

1. SSH authentication forwarding

2. Building the tree

3. Adding robustness

Mesh Tree: SSH Forwarding ‧ Building the Tree ‧ Robustness


How SSH Works

Agent listens on the “agent socket”, which is a Unix domain socket. Agent has the private key.

SSH (on the agent’s side) makes a connection from Home PC to SSHd (daemon) on PL1, which has the public key.

SSHd sends a challenge to SSH. SSH connects to the Agent socket and gives the challenge to Agent. Agent uses the private key to make a response and forwards it to SSH, which forwards it to SSHd.

[Diagram: Challenge and Response messages passed between Agent, SSH, and SSHd; SSHd holds the public key]
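
The challenge/response exchange described above can be sketched with a toy RSA key. This follows the simplified picture on the slide, not the SSH wire protocol: the class names `Agent`, `SshClient`, and `Sshd` and the tiny key are assumptions for illustration, and in-process method calls stand in for the Unix domain socket and the network connection.

```python
import hashlib
import os

# Toy RSA key built from two Mersenne primes. Far too small for real use;
# it only illustrates the private-key / public-key split.
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q                          # public modulus (known to SSHd)
E = 65537                          # public exponent (known to SSHd)
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (Python 3.8+)


def digest(challenge: bytes) -> int:
    """Hash the challenge into an integer smaller than the modulus."""
    return int.from_bytes(hashlib.sha256(challenge).digest(), "big") % N


class Agent:
    """Holds the private key and answers challenges (hypothetical class)."""

    def __init__(self, private_exponent: int) -> None:
        self._d = private_exponent

    def respond(self, challenge: bytes) -> int:
        return pow(digest(challenge), self._d, N)


class SshClient:
    """Runs on Home PC; passes each challenge to the agent and back."""

    def __init__(self, agent: Agent) -> None:
        self._agent = agent

    def handle_challenge(self, challenge: bytes) -> int:
        return self._agent.respond(challenge)


class Sshd:
    """Runs on PL1; knows only the public key (N, E)."""

    def authenticate(self, client: SshClient) -> bool:
        challenge = os.urandom(32)
        response = client.handle_challenge(challenge)
        return pow(response, E, N) == digest(challenge)


if __name__ == "__main__":
    print(Sshd().authenticate(SshClient(Agent(D))))
```

The private exponent never leaves `Agent`; `SshClient` only relays bytes, which is the property that makes forwarding the exchange possible.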


Our World with PlanetLab

The slice breaks up PL1, so SSH cannot forward to SSHd

[Diagram: Challenge, Response, and public key labels for the PlanetLab setup]


Our Solution

Client’s STDIN and STDOUT are connected to SSHd

[Diagram: Challenge and Response messages relayed through SSH Mesh (“SSH Mesh: Challenge”, “SSH Mesh: Response”); public key]
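
One way to picture the relaying shown in the diagram is a chain of forwarders, each passing a challenge upstream toward the agent and the response back down. This is a sketch under assumptions, not the PLuSH implementation: `make_relay`, `home_agent`, and the node names are hypothetical, and a SHA-256 hash stands in for the agent's private-key operation.

```python
import hashlib
from typing import Callable, List

# A responder takes a challenge and returns a response.
Responder = Callable[[bytes], bytes]


def home_agent(challenge: bytes) -> bytes:
    """Stand-in for the agent's private-key operation (assumption)."""
    return hashlib.sha256(b"private-key-stand-in" + challenge).digest()


def make_relay(name: str, upstream: Responder, log: List[str]) -> Responder:
    """Build a mesh node that forwards challenges upstream unchanged."""

    def relay(challenge: bytes) -> bytes:
        log.append(f"{name}: challenge")
        response = upstream(challenge)  # one hop closer to the agent
        log.append(f"{name}: response")
        return response

    return relay


def build_chain(names: List[str], log: List[str]) -> Responder:
    """Chain relays so the first name sits next to the agent."""
    responder: Responder = home_agent
    for name in names:
        responder = make_relay(name, responder, log)
    return responder


if __name__ == "__main__":
    events: List[str] = []
    far_end = build_chain(["pl1", "pl2"], events)
    far_end(b"nonce")
    print(events)
```

Because every hop forwards the bytes unchanged, the node at the far end receives the same response the agent would have produced for a direct connection.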


Building the Tree

There is rudimentary support for a tree

We implemented multiple tree-building algorithms

Trees can be built using SSH tree and Macedon
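
The slides do not say which tree-building algorithms were implemented. As one simple possibility, the sketch below arranges a host list into a tree with a fixed fanout, so that each node only opens connections to its own children. The `build_tree` and `depth` functions, the fanout value, and the host names are assumptions for illustration.

```python
from typing import Dict, List


def build_tree(hosts: List[str], fanout: int = 2) -> Dict[str, List[str]]:
    """Map each host to its children; hosts[0] is the root.

    Hosts are placed level by level, so the host at index i has the
    host at index (i - 1) // fanout as its parent.
    """
    if fanout < 1:
        raise ValueError("fanout must be at least 1")
    children: Dict[str, List[str]] = {host: [] for host in hosts}
    for index in range(1, len(hosts)):
        parent = hosts[(index - 1) // fanout]
        children[parent].append(hosts[index])
    return children


def depth(children: Dict[str, List[str]], root: str) -> int:
    """Number of hops from the root to the deepest host."""
    if not children[root]:
        return 0
    return 1 + max(depth(children, child) for child in children[root])


if __name__ == "__main__":
    tree = build_tree(["home", "pl1", "pl2", "pl3", "pl4"], fanout=2)
    print(tree)
    print(depth(tree, "home"))
```

With a flat layout the root opens one connection per host; with a fanout of 2 it opens only two, and the remaining connections are made by interior nodes.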


Macedon vs. SSH Mesh

Macedon

Advantages

Support for many protocols

Does not spend time decrypting and encrypting

Disadvantages

Can be easily hijacked

Heavy program

SSH Mesh

Advantages

Requires minimal client software

Provides protection against both sniffing and hijacking

Disadvantages

Spends time decrypting and encrypting


Adding Robustness

The Forwarding mechanism

Forwards SSH connection

Allows us to change root so that we can detach the experiment controller

Allows us to recover from failures in the root
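
Changing the root can be pictured as reversing the parent links on the path between the new root and the old one. The sketch below shows that idea on a parent map; it is not taken from the PLuSH code, and the `reroot` function and node names are assumptions for illustration.

```python
from typing import Dict, Optional

# Each node maps to its parent; the root maps to None.
ParentMap = Dict[str, Optional[str]]


def reroot(parent: ParentMap, new_root: str) -> ParentMap:
    """Return a copy of the tree with new_root as the root.

    Only links on the path from new_root up to the old root are
    reversed; every other node keeps its current parent.
    """
    result = dict(parent)
    previous: Optional[str] = None
    node: Optional[str] = new_root
    while node is not None:
        next_node = parent[node]  # read the original link before changing it
        result[node] = previous
        previous, node = node, next_node
    return result


if __name__ == "__main__":
    tree: ParentMap = {"home": None, "pl1": "home", "pl2": "home", "pl3": "pl1"}
    print(reroot(tree, "pl1"))
```

After rerooting at `pl1`, the controller on `home` is an ordinary leaf, so detaching it no longer disconnects the rest of the tree.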


Comparing the trees

[Figure: “Mesh Latency” plot of percent done (0% to 100%) against latency (0 to 8 sec) for SSH-flat and SSH-tree]


Additional Tools

Debugging Tool

Deploys required files to all necessary nodes

Opens multiple simultaneous connections for very high speeds

Stops runaway processes

Macedon Testing Tool

Shows raw data for underlying Macedon communication networks

Controls Macedon networks from console

Uses self-developed Perl Macedon bindings


Future Research Work

Comparison between more Mesh overlay algorithms

Evaluate the performance difference between SSH Mesh and Macedon


Additional Thanks

We would like to thank the following people for their help:

Chris Tuttle

Jeannie Albrecht

Chip Killian


Conclusion

Remote execution in PlanetLab through basic SSH connections is neither scalable nor robust

We have implemented a solution that improves both scalability and robustness

SSH forwarding mechanism

Building the tree

Adding robustness
