Research in an Open Cloud Exchange


CLOUD COMPUTING IS HAVING A DRAMATIC IMPACT

• On-demand access

• Economies of scale

All compute/storage will move to the cloud?

Today’s IaaS clouds

• One company responsible for implementing and operating the cloud

• Typically highly secretive about operational practices

• Exposes limited information to enable optimizations

What’s the problem

• Lots of innovation above the IaaS level… but
  • consider EnterpriseDB, or Akamai

• Lots of different providers… but
  • bandwidth between providers limited
  • offerings incompatible; switching a problem
  • price challenges to moving

• No visibility/auditing of internal processes
• Price is terrible for computers run 24x7x365

More challenges

• Provider incentive not aligned with efficient marketplace:
  • stickiness in price, in differentiation
  • advantage other services
  • homogeneity for efficiency

• Hard for large provider to efficiently support niche markets, radically different economic models…

• Niche providers probably can’t support rich ecosystem

We are in the equivalent of the pre-Internet world, where AOL and CompuServe dominated on-line access

Is a different model possible? An “Open Cloud eXchange (OCX)”

[Figure: OCX diagram; labels: C3DDB, HPC, Big Data, Web]

BIG BOX STORE / SHOPPING MALL

CATHEDRAL / BAZAAR

Why is this important

• Anyone can add a new service and compete in a level playing field
• History tells us that opening up to rich community/marketplace competition results in innovation/efficiency:
  • “The Cathedral and the Bazaar” by Eric Steven Raymond
  • “The Master Switch: The Rise and Fall of Information Empires” by Tim Wu

• This could fundamentally change systems research:
  • access to real data
  • access to real users
  • access to scale

Without that… solving the spherical horse problem…

This isn’t crazy… really

• Current clouds are incredibly expensive…
• Much of industry locked out of current clouds
• lots of great open source software
• lots of great niche markets; markets important to us…
• lots of users concerned by vendor lock-in…
• this doesn’t need to be AWS scale to be worth it

• “Past a certain scale; little advantage to economy of scale” (John Goodhue)

The Massachusetts Open Cloud

MGHPCC

15 MW, 90,000 square feet + can grow

THE MASSACHUSETTS COLLABORATORS

Operating Systems, Power, Security, Marketplace…

Cloud Technology

University Research IT Partners: BU, HU, NU, UMass, MIT, MGHPCC

Partners: Brocade, CISCO, Intel, Lenovo, Red Hat, Two Sigma, USAF, Dell, Fujitsu, Mellanox, Cambridge Computer…

Users/applications: Big Data, HPC, Life Sciences, …

Core Team & Students: OCX model, HIL, Billing, Intermediaries…

Data: BU, HU, NU, MIT, UMass, Foundations, Govt…

Education and Workforce: Students, industry


MOC Ecosystem

HOW DO WE START?

[Figure: OpenStack services; labels: Keystone, Neutron, Glance, Nova, Cinder, Keystone]

OPENSTACK FOR AN OCX

• OpenStack is a natural starting point

• Mix & Match federation

[Figure: OpenStack services; labels: Keystone, Neutron, Glance, Nova, Cinder]

Mix and Match (Resource Federation)

• Solution
  • Proxy between OpenStack services

• Status of the project
  • Hosted upstream by the OpenStack infrastructure
  • https://github.com/openstack/mixmatch
  • Production deployment planned for Q1 2017

• Team:
  • Core Team: Kristi Nikolla, Eric Juma, Jeremy Freudberg
  • Contributors: Adam Young (Red Hat), George Silvis, Wjdan Alharthi, Minying Lu, Kyle Liberti

• More information:
  • https://info.massopencloud.org/blog/mixmatch-federation/
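
The idea of a proxy between OpenStack services can be sketched as below. This is only an illustration and not the mixmatch implementation: the `ServiceProvider` and `Proxy` classes, the `route` method, the `/volumes/` path, and the example endpoint URLs are all hypothetical. The sketch assumes the proxy fronts one service type and forwards a request to whichever federated provider owns the requested resource, trying the local provider first.

```python
from dataclasses import dataclass, field


@dataclass
class ServiceProvider:
    """Hypothetical stand-in for one OpenStack deployment in the federation."""
    name: str
    endpoint: str                      # illustrative URL, not a real endpoint
    resources: set = field(default_factory=set)

    def owns(self, resource_id: str) -> bool:
        return resource_id in self.resources


class Proxy:
    """Routes a request for a resource to the provider that owns it."""

    def __init__(self, local: ServiceProvider, remotes: list):
        self.local = local
        self.remotes = remotes

    def route(self, resource_id: str) -> str:
        # Try the local provider first, then each remote provider in order.
        for provider in [self.local, *self.remotes]:
            if provider.owns(resource_id):
                return f"{provider.endpoint}/volumes/{resource_id}"
        raise LookupError(f"no provider owns {resource_id}")


bu = ServiceProvider("bu", "https://cinder.bu.example", {"vol-1"})
neu = ServiceProvider("neu", "https://cinder.neu.example", {"vol-2"})
proxy = Proxy(local=bu, remotes=[neu])
print(proxy.route("vol-2"))  # https://cinder.neu.example/volumes/vol-2
```

A real proxy would also have to carry credentials across providers; that part is left out here.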

[Figure: mixmatch between the Boston University and Northeastern University deployments; labels: Nova, Keystone, Cinder, Keystone]

It’s real…

• Available now: Production OpenStack services…
• Small scale, but growing (couple of hundred servers, 550 TB storage), 200+ users
• VMs, on-demand Big Data (Hadoop, SPARK...)
• What’s coming:
  – Simple GUI for end users
  – OpenShift (Red Hat)
  – Federation across universities
  – Rapid/secure Hardware as a Service
  – 20+ PB Data Lake
  – Cloud Dataverse
• Platform for enormous range of research projects across BU, NEU, MIT & Harvard

Research challenges

• Marketplace mechanisms
• Hosting Datasets
• Multi-provider cloudlet
• Software defined storage
• HPC on the Cloud
• Secure Hardware Multiplexing


Research challenges

• Marketplace mechanisms
• Hosting Datasets: Mercè Crosas (Harvard)
• Multi-provider cloudlet
• Software defined storage
• HPC on the Cloud
• Secure Hardware Multiplexing

AWS Public Datasets

“When data is made publicly available on AWS, anyone can analyze any volume of data without needing to download or store it themselves.”

But, AWS public datasets miss key aspects needed in data repositories

• Incentives to share data
• Citation to each version of the data
• Metadata for Discoverability
• Tiered access to non-public data
• Commitment to data archival & preservation

Today’s repositories incentivize data sharing by giving credit to data authors through formal citation

Persistent citations to datasets published in data repositories

Bibliography

The Dataverse open-source platform enables building any type of data repository

Agriculture data

Repository in Fudan, China

Data from 20 Universities

Public data repository

Science Consortium

Data depositor, Data users

Problems:
• Large datasets
• Lack computational infrastructure

[Figure: data depositors and data users connected through Swift (Object Storage), Nova (Compute), Horizon, Sahara (Analytics), and Giji]


Research challenges

• Marketplace mechanisms
• Hosting Datasets
• Multi-provider cloudlet
• Software defined storage: Peter Desnoyers (NU)
• HPC on the Cloud
• Secure Hardware Multiplexing

Research challenges

• Marketplace mechanisms
• Hosting Datasets
• Multi-provider cloudlet
• Software defined storage
• HPC on the Cloud: Chris Hill (MIT)
• Secure Hardware Multiplexing

Research challenges

• Marketplace mechanisms
• Hosting Datasets
• Multi-provider cloudlet
• Software defined storage
• HPC on the Cloud
• Secure Hardware Multiplexing: Peter Desnoyers (NU), Gene Cooperman (NU), Nabil Schear (MIT LL), Larry Rudolph & Trammell Hudson (Two Sigma), Jason Hennessey (BU), …

HPC

Datacenter has isolated silos


Hardware isolation layer

• Allocate physical nodes
• Allocate networks
• Connect nodes and networks
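
The three operations listed above can be pictured with a toy in-memory model. This is a sketch under stated assumptions, not the real HIL API: the `Hil` class, its method names, and the notion of a "project" owning nodes and networks are hypothetical, chosen only to show how allocation plus an ownership check gives isolation between tenants.

```python
class Hil:
    """Toy in-memory model of the three operations; not the real HIL API."""

    def __init__(self, free_nodes):
        self.free_nodes = set(free_nodes)
        self.node_owner = {}      # node -> project
        self.network_owner = {}   # network -> project
        self.links = set()        # (node, network) pairs

    def allocate_node(self, project, node):
        if node not in self.free_nodes:
            raise ValueError(f"{node} is not free")
        self.free_nodes.remove(node)
        self.node_owner[node] = project

    def allocate_network(self, project, network):
        if network in self.network_owner:
            raise ValueError(f"{network} already exists")
        self.network_owner[network] = project

    def connect(self, project, node, network):
        # Isolation check: a project may only wire together what it owns.
        if self.node_owner.get(node) != project:
            raise PermissionError(f"{project} does not own {node}")
        if self.network_owner.get(network) != project:
            raise PermissionError(f"{project} does not own {network}")
        self.links.add((node, network))


hil = Hil(free_nodes=["node-1", "node-2"])
hil.allocate_node("hpc", "node-1")
hil.allocate_network("hpc", "hpc-net")
hil.connect("hpc", "node-1", "hpc-net")
```

In this model a second project that allocates `node-2` cannot attach it to `hpc-net`; the `connect` call raises `PermissionError`.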

Hardware Isolation Layer (HIL): CONVERGING HPC, BIG DATA & CLOUD

[Figure: HIL partitions running SLURM/PBS, OpenStack, and a custom OS (NeuroDebian?)]

What about security?

[Figure: partitions running SLURM/PBS, OpenStack, and a custom OS (NeuroDebian?)]

Secure Cloud Project

• Shared project with Two Sigma, MIT LL, USAF, Lenovo, Intel

• Integrating attestation infrastructure & secure FW

How fast can we do this?

Bare Metal Imaging Service

• iSCSI-based
• Able to provision + boot in <5 min

Turk, A., Gudimetla, R. S., Kaynar, E. U., Hennessey, J., Tikale, S., Desnoyers, P., & Krieger, O. (2016). An Experiment on Bare-Metal BigData Provisioning. In 8th USENIX Workshop on Hot Topics in Cloud Computing (HotCloud 16).


Rapid Bare-Metal Provisioning and Image Management, Ravisantosh Gudimetla and Apoorve Mohan

Research challenges

• Can we expose rich information about services while not violating customer privacy?
• How can we correlate the information between the different layers?
• How can we identify source of failures?
• How can we create a Networking Marketplace?

Research challenges

• Can we expose rich information about services while not violating customer privacy?
• How can we correlate the information between the different layers?
• How can we identify source of failures?
• Networking Marketplace: Rodrigo Fonseca (Brown)

Common view: Networking is like air conditioning, or power

Part of the infrastructure, provided by the datacenter

Basic Architecture

Jointly administered machines w/ internal network

[Figure labels: GPUs, Storage, Compute, Multi-Provider Inter-Pod Network, Edge of Pod switch]

Research enabled

• New hardware infrastructure; e.g. FPGAs, new processors
• Caching storage from Data Lakes
• Cloud security and composability of security properties; e.g., MACS project
• Smart cities
• Analysis of cloud internal information (logs, metrics) for security, for optimization…
• Highly elastic environments; e.g., 1000 servers for a minute

Research enabled

• New hardware infrastructure; e.g. FPGAs, new processors: Martin Herbordt (BU)
• Caching storage from Data Lakes
• Cloud security and composability of security properties; e.g., MACS project
• Smart cities
• Analysis of cloud internal information (logs, metrics) for security, for optimization…
• Highly elastic environments; e.g., 1000 servers for a minute

Research enabled

• New hardware infrastructure; e.g. FPGAs, new processors
• Caching storage from Data Lakes: Desnoyers (NU), Krieger (BU)
• Cloud security and composability of security properties; e.g., MACS project
• Smart cities
• Analysis of cloud internal information (logs, metrics) for security, for optimization…
• Highly elastic environments; e.g., 1000 servers for a minute

Data Lake in a typical DC

NorthEastern Storage Exchange (NESE): 20+ PB; Harvard, NEU, MIT, BU, UMass

Simple deployment:
• Cache Node per rack
• L1: Rack Local
  – reduce inter-rack traffic
• L2: Cluster Local
  – reduce traffic between clusters and back-end storage
• Implemented by modifying CEPH Rados Gateway
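
The lookup path implied by the two cache levels above can be sketched as a toy model. This is not the modified Rados Gateway: the `TwoLevelCache` class and its names are hypothetical, the L2 is collapsed into a single dictionary rather than being spread over cache nodes, and there is no eviction or capacity limit. It only shows the order in which a read falls through L1 (rack local), L2 (cluster local), and the data lake.

```python
class TwoLevelCache:
    """Toy model: per-rack L1 caches, one cluster-wide L2, then the data lake."""

    def __init__(self, racks, data_lake):
        self.l1 = {rack: {} for rack in racks}   # L1: rack local
        self.l2 = {}                             # L2: cluster local
        self.data_lake = data_lake               # back-end storage
        self.backend_reads = 0

    def get(self, rack, key):
        """Return (value, where it was served from)."""
        if key in self.l1[rack]:
            return self.l1[rack][key], "L1"
        if key in self.l2:
            value, source = self.l2[key], "L2"
        else:
            value, source = self.data_lake[key], "data lake"
            self.backend_reads += 1
            self.l2[key] = value
        self.l1[rack][key] = value               # fill L1 on the way back
        return value, source


cache = TwoLevelCache(racks=["rack1", "rack2"], data_lake={"obj": b"payload"})
print(cache.get("rack1", "obj")[1])  # data lake
print(cache.get("rack1", "obj")[1])  # L1
print(cache.get("rack2", "obj")[1])  # L2
```

In the example, the second rack's first read is served from L2, so the back-end storage is read only once for the object.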

[Figure: Rack 1 to Rack N of compute nodes, each rack with an L1 cache on its cache node (Cache Node 1 to Cache Node N); an L2 cache across the Compute Cluster, in front of the Data Lake]

Datacenter scale Data Delivery Network (D3N)

D3N Results

[Figure: two plots of aggregate throughput (GB/s), one against number of Hadoop nodes (1 to 8) and one against number of Curl nodes (1 to 8); series: RGW, D3N L1 Hit, Maximum SSD Bandwidth]

• Exceeds maximum bandwidth Hadoop

• Demonstrates that it makes sense to share expensive SSDs: faster than local disk

• With extreme benchmark can saturate SSD & 40 Gb NIC

• Will be of enormous value with NESE data lake


Modular Approach to Cloud Security

In security, the sum of the parts is often a hole.

– Dave Safford, circa 2000

Our goal is to build security systems so that the sum of the parts is a holistic security guarantee.

– Ran Canetti, 2016

Synergy between MACS and MOC

Types of connections
• People: researchers can contribute toward both projects
  – Size of MACS: 13 faculty, 11 postdocs, 25+ graduate students
• Tech transition: deploy MACS tech in MOC marketplace
• Problem creation: MOC’s problems feed MACS research
• Funding: joint cloud research has multiplier effect

Value that MOC provides to MACS
• Access: data, meta-data, scale, problems, and users
• Unique trust relationships: federated datacenter

[Figure: layers Hardware, Cloud IaaS management, Operating system, Applications & platforms, Algorithms & techniques; arrows MACS→MOC and MOC→MACS]

Interplay between MACS and MOC

[Figure: project map; labels: Federated Monitoring, MOC Monitoring Infrastructure, Private Monitoring, EbbRT, Secure Hardware, HIL, BMI, Secure Cloud, Secure Dataverse, UC Analysis]

Legend:
• Yellow = MACS
• Blue = MOC
• Green = Joint

Research enabled

• New hardware infrastructure; e.g. FPGAs, new processors
• Caching storage from Data Lakes: Desnoyers (NU), Krieger (BU)
• Cloud security and composability of security properties; e.g., MACS project
• Smart cities: Azer Bestavros (BU)
• Analysis of cloud internal information (logs, metrics) for security, for optimization…
• Highly elastic environments; e.g., 1000 servers for a minute

Example: Smart cities

MOC

Research enabled

• New hardware infrastructure; e.g. FPGAs, new processors
• Caching storage from Data Lakes: Desnoyers (NU), Krieger (BU)
• Cloud security and composability of security properties; e.g., MACS project
• Smart cities
• Analysis of cloud internal information (logs, metrics) for security, for optimization…: Alina Oprea (NU)
• Highly elastic environments; e.g., 1000 servers for a minute

Analytics-based defenses

• Goals
  – Correlate data sources from multiple cloud layers
    • Build user, VM and application profiles
  – Machine learning techniques to detect wide range of threats
    • Protection of cloud infrastructure
    • Enable cloud users to protect their resources
  – Provide data collection and analytics APIs to users

Behavior-based authentication

• Detect credential compromise
  – Developers leak their AWS passwords in GitHub
• Build user profiles based on historical data
  – Login information (IP address, time)
  – VM usage (CPU, memory, disk)
• Anomaly detection
  – Detect unusual activities
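
One simple way to picture the profile-then-detect steps above is a z-score check against a user's history. The slides do not say which technique is used, so this is an assumed stand-in: the function name `is_anomalous`, the threshold of 3.0, and the sample CPU figures are all illustrative.

```python
from statistics import mean, stdev


def is_anomalous(history, observation, threshold=3.0):
    """Flag an observation more than `threshold` sample standard deviations
    away from the mean of the user's history."""
    if len(history) < 2:
        return False          # not enough data to build a profile
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return observation != mu
    return abs(observation - mu) / sigma > threshold


# Hypothetical per-user VM CPU usage history (percent).
cpu_history = [10.0, 12.0, 11.0, 9.0, 13.0, 10.0, 11.0]
print(is_anomalous(cpu_history, 12.0))  # False
print(is_anomalous(cpu_history, 95.0))  # True
```

The same check could be applied per feature (login hour, memory, disk), though combining features would need a richer model than this sketch.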


Suspicious accounts

Network traffic analysis


Use cases
• Detect suspicious communication with external IP addresses
• Detect data exfiltration attempts
• Prevent cloud abuse
  – Malware infection, application exploits, illegal use of cloud
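
As a rough illustration of the exfiltration use case above, the sketch below sums outbound bytes per external destination and flags destinations over a limit. The flow record format (a `(source, destination, bytes)` tuple), the function names, and the byte limit are assumptions made for this example; real sFlow data is sampled and carries many more fields.

```python
from collections import defaultdict
from ipaddress import ip_address


def outbound_bytes_by_destination(flows):
    """Sum bytes sent from private (internal) sources to non-private destinations."""
    totals = defaultdict(int)
    for src, dst, nbytes in flows:
        if ip_address(src).is_private and not ip_address(dst).is_private:
            totals[dst] += nbytes
    return dict(totals)


def suspicious_destinations(flows, byte_limit):
    """Return external destinations that received more than `byte_limit` bytes."""
    totals = outbound_bytes_by_destination(flows)
    return sorted(dst for dst, total in totals.items() if total > byte_limit)


# Hypothetical flow records: (source IP, destination IP, bytes).
flows = [
    ("10.0.0.5", "8.8.8.8", 500),
    ("10.0.0.5", "1.1.1.1", 4_000_000),
    ("10.0.0.7", "1.1.1.1", 3_000_000),
    ("10.0.0.5", "10.0.0.7", 9_000_000),   # internal traffic, ignored
]
print(suspicious_destinations(flows, byte_limit=5_000_000))  # ['1.1.1.1']
```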

[Figure labels: sFlow collector (two instances), MongoDB]

Research enabled

• New hardware infrastructure; e.g. FPGAs, new processors
• Caching storage from Data Lakes: Desnoyers (NU), Krieger (BU)
• Cloud security and composability of security properties; e.g., MACS project
• Smart cities
• Analysis of cloud internal information (logs, metrics) for security, for optimization…: Alina Oprea (NU)
• Highly elastic environments; e.g., 1000 servers for a minute: Jonathan Appavoo (BU)

Example Supporting Interactive, Bursty HPC Applications: OSDI 2016

EbbRT distributed library OS [Appavoo BU]:
• Front-end Linux allocates bare-metal back-end nodes on demand
• Back-end nodes library OS customized to single application needs

[Appavoo]

[Figure labels: Linux Front-End, Library OS Back-Ends, Web Interface, Elastic Software, Elastic Infrastructure, Infrastructure as Elastic Resource Pool]

XSP compute service based on Kittyhawk [Appavoo IBM]
• Fast provisioning based on broadcast
• Hardware level based on HaaS
• IaaS level by pre-allocating VMs out of OpenStack
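
The pre-allocation idea in the last bullet can be sketched as a warm pool. This is not the XSP service: `WarmPool`, its methods, and the `boot` callable (which stands in for a slow provisioning call) are hypothetical, and the VM names are made up.

```python
from collections import deque


class WarmPool:
    """Keeps `size` pre-booted instances ready so requests skip the boot delay."""

    def __init__(self, size, boot):
        self.size = size
        self.boot = boot                       # stand-in for a slow provisioning call
        self.ready = deque(boot() for _ in range(size))

    def acquire(self):
        # Serve from the pool when possible; fall back to booting on demand.
        return self.ready.popleft() if self.ready else self.boot()

    def refill(self):
        # Top the pool back up (in practice this would run in the background).
        while len(self.ready) < self.size:
            self.ready.append(self.boot())


counter = iter(range(1000))
pool = WarmPool(size=2, boot=lambda: f"vm-{next(counter)}")
first = pool.acquire()
print(first, len(pool.ready))  # vm-0 1
```

The trade-off is idle capacity: a larger pool absorbs bigger bursts but holds more unused instances between them.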

[Figure: Request/Response]

Exemplar

[Figure: Fetal Image Reconstruction; bar chart of run time in seconds (axis 0 to 11250) for PC, BG1K, BG4k, BG16k; annotations: synthetic 1024x1024, 200 slices; resized, cropped 96x96, 50 slices; ~2.4 hrs; ~24 s; APP; IRTK 24 hrs]

Red Hat Collaboratory

• Monitoring and Analytics
• OpenShift on the MOC
• Datacenter scale Data Delivery Network (D3N)
• HIL & QUADS
• Accelerator Testbed
• Big Data Analytics and Cloud Dataverse

End-to-end POC: Radiology in the cloud targeting OpenShift with accelerators

Concluding remarks

• MOC is a functioning small-scale cloud for the region today:
  – http://info.massopencloud.org
• Key driver is the OCX Model:
  – Key enablers going on in OpenStack (been a challenge)
  – could become important component of clouds
  – Major research challenges & opportunities
  – Enabling research to co-exist with production:
    • real data, real users, real scale
• Get involved: use it, internships, expose research
• Start replicating the model elsewhere