TRANSCLOUD: Design Considerations for a High-Performance Cloud Architecture Across Multiple Administrative Domains

Rick McGeer, HP Labs

For the TransCloud Team: HP Labs, UC San Diego, University of Victoria, Northwestern University, University of Amsterdam, TU-Kaiserslautern, Princeton University, PlanetWorks, PlanetLab, GENI, G-Lab, DFN, NLR, GLIF

Sponsored by the National Science Foundation

August 1, 2010



Page 2

Introduction – TransCloud

• TransCloud: A Cloud Where Services Migrate, Anytime, Anywhere, In a World Where Distance Is Eliminated
  – Joint Project Between GENICloud, iGENI, G-Lab
  – GENICloud Provides Seamless Interoperation of Cloud Resources Across N-Sites, N-Administrative Domains
  – iGENI Optimizes Private Networks of Intelligent Devices
  – G-Lab contributes networking and advanced cloud resources

Sponsored by the National Science Foundation, November 3, 2010

Page 3

Context 1: Seamless Computation Services Available Anytime, Anywhere

• “The Cloud” offers the prospect of ubiquitous information and services… BUT…
  – Performance of Cloud Services Highly Dependent On Location
    • Of End-User, Applications, Middle Processes, Network Topology
    • Of Cloud Data, Compute Processes, Storage, etc.
• Why?
  – Performance of Legacy Protocols Highly Dependent on Latency
• Therefore:
  – If the Clouds Are Too Far Away, Performance Will Be Very Severely Restricted
• Ergo
  – Clouds Need To Be Close To Service Sites OR
  – Networks (And Clouds) Must Be Designed To Eliminate Distance
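
The latency dependence claimed above can be illustrated with the standard bound for window-based transports: a sender that keeps at most one window of data in flight per round trip cannot exceed window / RTT. The sketch below is not from the slides; the 64 KiB window and the three round-trip times are assumed values chosen only for illustration.

```python
def max_window_limited_throughput_bps(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound, in bits per second, for a sender limited to one window per round trip."""
    if rtt_seconds <= 0:
        raise ValueError("rtt_seconds must be positive")
    return window_bytes * 8 / rtt_seconds


# Assumption: a 64 KiB window, i.e. TCP without window scaling.
WINDOW_BYTES = 64 * 1024

# Assumption: illustrative round-trip times, not measurements from TransCloud.
ILLUSTRATIVE_RTTS_MS = [("same metro", 2), ("cross-continent", 80), ("intercontinental", 160)]

for label, rtt_ms in ILLUSTRATIVE_RTTS_MS:
    bps = max_window_limited_throughput_bps(WINDOW_BYTES, rtt_ms / 1000)
    print(f"{label:>17}: RTT {rtt_ms:>3} ms -> at most {bps / 1e6:8.1f} Mb/s")
```

Under these assumed numbers the bound drops from about 262 Mb/s at 2 ms to about 3.3 Mb/s at 160 ms, regardless of link capacity, which is the sense in which distance restricts performance for legacy protocols.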

Page 4

Context 2: Living With Legacy Protocols Over Commodity Internet vs Creating Alternatives

• Legacy Is There For a Reason
  – Compatibility
  – Fairness
  – Congestion Avoidance
• Therefore: Distributed Cloud
  – Minimal Latencies Over Legacy Internet To Anywhere/Everywhere
• Therefore: Private Internal Networks
  – Eliminate Latency Dependence Internally
  – Use Aggressive Internal Transport/Application Protocols
    • TIA-1039, Reliable Blast UDP, Lambda RAM
    • Flow Control Enabled

Page 5

Context 3: No Cloud Lives Everywhere

• Clusters are much easier to build than points-of-presence
• Most commercial clouds today have only a few sites
• Therefore: cloud service providers want to run services across multiple clouds
  – Need a cloud standard that offers identical interfaces over multiple domains
• Inspiration: the web
  – Standard protocol for sending documents
  – Standard document format
  – Permission and access control on a site-by-site, page-by-page basis

Page 6

Context 4: General Considerations

• Major Cloud Use Case: Big Data, Distributed Collection, Must Live With Available Networks
  – Smart Cities
  – Sensor Nets
• Best Case: Create Private Network
  – Owning Optical Fiber
  – Create High Performance Wireless Point-to-Point Links
• Many Data Intensive Science Projects, Including
  – High Energy Physics (e.g., LHCNet, Science Data Network, I-WIRE)
  – Atmospheric Sensing Apparatus
  – Ocean Observing (e.g., Project Neptune)
  – Distributed Radio and Optical Telescopes
  – Telemedicine

Page 7

Premise: Compute Where Data Lives!

• Computation is Ubiquitous and Easy To Obtain
• Programs Are Small and Easy to Transmit
• Most Programs Reduce Data
• Often Data Is Large and Challenging To Transmit
  – E.g., Jim Gray distributing SDSS by sending computers by FedEx!
• Solution: Send Programs to Data
• Requires
  – High-performance, low-latency network
  – Common APIs and operating environments
  – Lightweight, user-based federation
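
A back-of-the-envelope comparison makes the premise concrete. The sketch below is not from the slides; the dataset, program, and result sizes and the link speed are all assumed values, and the calculation ignores protocol overhead and latency.

```python
def transfer_seconds(size_bytes: float, link_bits_per_second: float) -> float:
    """Idealized time to push size_bytes over a link, ignoring overhead and latency."""
    return size_bytes * 8 / link_bits_per_second


# All four values below are assumptions chosen for illustration only.
DATASET_BYTES = 10e12   # 10 TB of raw data held at a remote site
PROGRAM_BYTES = 10e6    # 10 MB analysis program
RESULT_BYTES = 100e6    # 100 MB of reduced output
LINK_BPS = 1e9          # 1 Gb/s wide-area link

# Option A: move the data to the program.
move_data_seconds = transfer_seconds(DATASET_BYTES, LINK_BPS)

# Option B: send the program to the data, then ship back only the reduced result.
move_program_seconds = transfer_seconds(PROGRAM_BYTES + RESULT_BYTES, LINK_BPS)

print(f"Move data to program: {move_data_seconds / 3600:.1f} hours")
print(f"Move program to data: {move_program_seconds:.2f} seconds")
```

With these assumed sizes, shipping the data takes roughly 22 hours while shipping the program and returning the result takes under a second; the gap depends entirely on how strongly the program reduces the data.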

Page 8

What do we need to make this work?

• Advanced Networking and Caching
  – Firm guarantees on bandwidth and latency on a per-application basis
  – Application support at Layer 3 and Layer 2
  – Means: Private Network where possible
• Access to platforms wherever data lives
  – But data lives everywhere!
  – No organization has Points of Presence (PoPs) everywhere
  – Need for an individual to be able to make arrangements with a cloud service provider, anywhere, efficiently, minimal overhead
  – Common form of identity
  – Common identity not required
  – Common AUP not required

Page 9

What do we need to make this work?

• Ability to instantiate and run a program anywhere
  – Common API at each level of the stack
  – IaaS/NaaS (VM/VN Creation)
  – PaaS (guaranteed OS/Programming environment)
  – QaaS (Standard Query/Data Management API)
• Easy, Standard Naming Scheme
  – I need to know the name of my VMs, logins, store, etc. without asking

Page 10

Solution – TransCloud

• Introducing TransCloud Prototype
  – An Early Instantiation of the Architecture
  – A Distributed Environment That Enables Component and Interoperability Evaluation
  – A Testbed On Which Early Experimental Research Can Be Conducted
  – An Environment That Can Be Used To Explain/Showcase New Innovative Architecture/Concepts Through Demonstrations

Page 11

TransCloud Today

Approx 40 nodes at 4 sites, 10 Gb/s connectivity

Page 12

TransCloud Today

• Sites at
  – HP Labs, Palo Alto
  – UC San Diego
  – Northwestern
  – Kaiserslautern
• Tomorrow (literally!)
  – Amsterdam
• Connectivity provided by:
  – CAVEWave, StarLight, NetherLight, DFN, National Lambda Rail, Global Lambda Integrated Facility

Page 13

Demo

(code by Chris Pearson and Chris Matthews, University of Victoria; data store from Paul Muller (Kaiserslautern) and Michael Zink (U Mass))

Page 14

TransCloud Demonstration

• Multi-site query example
  – Internet data repository (packet traces)
    • Kaiserslautern, Germany (thanks to Paul Muller)
    • UC San Diego (thanks to Michael Zink)
  – Run an analysis job at each site
  – Transmit the results back to HP Labs
  – Run summary job at HPL
• What’s being demonstrated?
  – Ability to run multi-site job
  – Sending programs to data
  – Prototype of analysis of coming world of sensors
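
The shape of the multi-site query can be sketched in a few lines. This is a toy stand-in, not the demo code (which ran distributed Hadoop/Pig): the function names `analysis_job` and `summary_job`, the record layout, and the sample traces are all hypothetical, and the per-site jobs run here in one process rather than at remote sites.

```python
from collections import Counter


def analysis_job(packet_trace):
    """Per-site job: reduce raw trace records to per-protocol packet counts."""
    return Counter(record["protocol"] for record in packet_trace)


def summary_job(site_results):
    """Job at the collecting site: merge the small per-site results into one total."""
    total = Counter()
    for counts in site_results.values():
        total.update(counts)
    return total


# Hypothetical stand-ins for the packet traces stored at each site.
SITE_TRACES = {
    "kaiserslautern": [{"protocol": "tcp"}, {"protocol": "udp"}, {"protocol": "tcp"}],
    "ucsd": [{"protocol": "tcp"}, {"protocol": "icmp"}],
}

# In the demo each analysis job runs where its trace lives; only the reduced
# counts would cross the wide-area network.
site_results = {site: analysis_job(trace) for site, trace in SITE_TRACES.items()}
summary = summary_job(site_results)
print(dict(summary))
```

The design point is that `site_results` is much smaller than the traces, so only the reduced output needs to travel to the site running the summary job.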

Page 15

TransCloud Demonstration

[Figure: data-flow diagram. Labels: Reduction program, Reduction Job 1, Reduction Job 2, Reduction Result, Merge Job, Final Result.]

Page 16

TransCloud Distributed Query

Page 17

Introduction – TransCloud

• Several Basic TransCloud Concepts
  – High Performance Highly Distributed Cloud Architecture Allowing Processes Across Multiple Administrative Domains Integrated With Dynamic Networking (GENI)
  – Scalable Lightweight Federation Processes
  – Services Are Based On Processes That Can Be Executed Anywhere World-Wide (Location Independent)
  – Top Level Services Can Be Accessed Via Public Internet
  – Core Processes and Data Streams Leverage Sophisticated Communication Services, Not Merely “Best Effort” Commodity Internet

Page 18

TransCloud Distributed Query Demo

Page 19

Introduction – TransCloud

• TransCloud Architectural Components
  – High Level APIs
  – A High Performance General Programming Environment
  – A Wide Area Programming Environment Integrated With Query Systems Resource Management Frameworks, Including Cluster, VM and Network Resource Management
  – High Levels of Virtualization Based on VMs and Network Abstractions

Page 20

TransCloud Equals

• IaaS Based on Slice-Based Federation Architecture (GENI/FIRE Standard)
  – Current instantiation: MyPLC over Eucalyptus
  – Want: ports to OpenStack, etc.
• Identity: X.509 certificates and ssh keys
  – TransCloud sites agree to accept these as forms of identity
  – Which to accept is up to the site
• Standard DNS Infrastructure
  – <instanceName>.<sliceName>.<siteName>.<authorityName>.trans-cloud.net: experiment interface
    e.g. hadoop22.queryTest.hplabs.genicloud.trans-cloud.net
  – <siteName>.<authorityName>.trans-cloud.org: admin interface
    e.g. hplabs.genicloud.trans-cloud.org
  – Each authority does its own DNS.
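
The naming scheme above is regular enough to build and parse mechanically, which is what lets a user know a VM's name without asking. The helper names below (`experiment_hostname`, `admin_hostname`, `parse_experiment_hostname`) are hypothetical and not part of any TransCloud tool; only the two name patterns and the example hostnames come from the slide.

```python
from typing import NamedTuple

EXPERIMENT_DOMAIN = "trans-cloud.net"  # experiment interface, per the slide
ADMIN_DOMAIN = "trans-cloud.org"       # admin interface, per the slide


class InstanceName(NamedTuple):
    instance: str
    slice_name: str
    site: str
    authority: str


def experiment_hostname(instance: str, slice_name: str, site: str, authority: str) -> str:
    """Build <instanceName>.<sliceName>.<siteName>.<authorityName>.trans-cloud.net."""
    return ".".join([instance, slice_name, site, authority, EXPERIMENT_DOMAIN])


def admin_hostname(site: str, authority: str) -> str:
    """Build <siteName>.<authorityName>.trans-cloud.org."""
    return ".".join([site, authority, ADMIN_DOMAIN])


def parse_experiment_hostname(hostname: str) -> InstanceName:
    """Split an experiment-interface hostname back into its four components."""
    suffix = "." + EXPERIMENT_DOMAIN
    if not hostname.endswith(suffix):
        raise ValueError(f"not an experiment hostname: {hostname!r}")
    labels = hostname[: -len(suffix)].split(".")
    if len(labels) != 4 or not all(labels):
        raise ValueError(f"expected four non-empty labels before {suffix!r}: {hostname!r}")
    return InstanceName(*labels)


print(experiment_hostname("hadoop22", "queryTest", "hplabs", "genicloud"))
print(admin_hostname("hplabs", "genicloud"))
print(parse_experiment_hostname("hadoop22.queryTest.hplabs.genicloud.trans-cloud.net"))
```

Because each authority runs its own DNS, a scheme like this only fixes the shape of the names; resolution stays with whichever authority owns the corresponding zone.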

Page 21

TransCloud Equals..

• Experimental QaaS (Distributed Hadoop/Pig)
• User-done PaaS (some stock images, but the usual tools for building your own…)

Page 22

Integration with GENI

• Programmer and User Interface to Cluster Control is MyPLC
  – Cluster version of PlanetLab control interface
  – Used for a number of clusters worldwide, including VICI project in US
• Mechanics of cluster control done by Eucalyptus
  – Single Eucalyptus user: MyPLC
  – Users log in to MyPLC, issue directives, MyPLC effectuates by issuing appropriate Eucalyptus commands

Page 23

TransCloud Architecture

[Figure: layered architecture diagram. Labels: Distributed Pig; Distributed Hadoop; NaCl; RePy; Slice Federation Architecture; GENI; Eucalyptus; Flow Primitives; 1039/RBUDP…]

Page 24

TransCloud Distributed Query

Page 25

Getting Hacked!

• On April 15 (about) we were attacked by the Romanian Black Hats
  – Stock VM had a privileged user with a guessable password
  – Came with the VM…
  – Attack was a worm attack to recruit bots for botnets
  – We were alerted when a third-party site saw worm probes coming from us
• Solution: shut it down, fix it, bring it up
• The Fix:
  – Use MyPLC (PlanetLab) as the controller
  – Login only by ssh key, X.509 cert (GENI standard)
  – Ssh login only from specified IP addresses (EC-2 standard)
  – Authorized users can add whitelisted IPs
  – Currently enforced by iptables, but we’ll add support into OpenFlow
• Running final pre-relaunch tests now
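
The source-address restriction in the fix can be sketched as a small rule generator. This is an illustration under stated assumptions, not the team's actual configuration: the function name `ssh_whitelist_rules`, the use of the INPUT chain, port 22, and the example addresses (taken from documentation ranges) are all assumed. The script only prints rule strings and does not modify any firewall.

```python
import ipaddress


def ssh_whitelist_rules(allowed_sources, ssh_port=22):
    """Return iptables commands that accept ssh only from the given IPv4 sources.

    Each entry is validated as an IPv4 address or CIDR block. ACCEPT rules come
    first, followed by a single DROP rule for all other ssh traffic.
    """
    rules = []
    for source in allowed_sources:
        # Raises ValueError on malformed input; a bare address becomes a /32.
        network = ipaddress.IPv4Network(source, strict=False)
        rules.append(f"iptables -A INPUT -p tcp --dport {ssh_port} -s {network} -j ACCEPT")
    rules.append(f"iptables -A INPUT -p tcp --dport {ssh_port} -j DROP")
    return rules


# Hypothetical whitelist using addresses reserved for documentation.
example_rules = ssh_whitelist_rules(["192.0.2.10", "198.51.100.0/24"])
for rule in example_rules:
    print(rule)
```

Rule order matters because iptables evaluates a chain top to bottom: the ACCEPT entries must precede the final DROP. A whitelist of this kind complements key-based login rather than replacing it.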

Page 26

Goals for 2011

• Complete integration with MyPLC
• Integrate the ProtoGENI Resource Specification (RSpec)
  – Modified to make sense for clusters
• Integrate the GENI standard Attribute-Based Access Control (ABAC)
• Add utility to permit users to manually adjust connectivity rules
  – Integration with ProtoGENI RSpec

Page 27

Advancing TransCloud

• If You Are Interested In Using This Environment, Contact Us
• If You Would Like To Contribute Resources, Contact Us

Page 28

TransCloud at NICT

• THANKS!
• Questions????