Clouds at other sites (heprcdocs.phys.uvic.ca/presentations/wlcg-chep-sobie...)
Clouds at other sites
T2-type computing
Randall Sobie
University of Victoria
Randall Sobie IPP/Victoria 1
Overview
• Clouds are used in a variety of ways for Tier-2 type computing
– MC simulation, production and analysis
– Commercial/private, in-house/distributed
• Motivation for using clouds
– Ease of use, reduced manpower costs, resource sharing
– Separation of application and system administration
– Leverage software development by commercial world
• How are clouds being used?
– VM provisioning, job management, benchmarks, storage, networking, monitoring
Cloud computing in HEP
[Figure labels: Opportunistic; Dedicated; Virtual cluster]
Cloud computing in HEP is typically providing 5-20% of the processing of current projects
“Dedicated” clouds (owned by HEP)
“Opportunistic” clouds (private and commercial)
Cloud deployments
Traditional bare-metal
Specific purpose cloud (e.g. LTDA BaBar, HLT clouds)
Standalone/private cloud (e.g. PNNL, NorduGrid)
Bare-metal or in-house cloud with external cloud (e.g. CERN, BNL)
Distributed clouds (e.g. UK, Canada, Australia, INFN clouds)
Examples of cloud deployments (meant to illustrate our use of clouds)
[Diagram: Australian Belle II Grid Site ((Belle II) LCG.Melbourne.au) and Australia-ATLAS Tier 2. Labels: Research Cloud; CREAM CE; Dynamic Torque; TORQUE + Maui; control VMs; distribute jobs via SSH; (Currently 700 cores); 14,000 HEPSpec ~ (1400 cores)]
A single CREAM CE services the ATLAS Tier-2 (Torque) and the Belle II site (Dynamic Torque)
Why private cloud?
• Chosen for flexibility, efficient use of compute resources for services
• Provides easy load-balancing and availability features
• Provides templating features
• Easy re-use of templates to test and instantiate new server instances
• Non-systems staff can provision their own instances of services
• Software Defined Networking is more malleable than physical networking, encourages better networking practices, including security
Lessons learned
• VMs and/or containers provide needed flexibility to support multiple collaborations and different user needs
• Ceph storage is very robust and flexible
• VMs impose a 15%-20% performance penalty on HEP compute workload without careful tuning
– Move to containers on bare metal planned
• OpenStack features do not help us make sure a certain number of instances are up and healthy and consistent
– Kubernetes looks appealing in this respect
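Keeping "a certain number of instances up and healthy" is usually achieved with a reconciliation loop that compares desired state with observed state. The following is a minimal sketch of that idea in plain Python. It does not call any OpenStack or Kubernetes API; the `Instance` record and the `reconcile` helper are hypothetical names introduced only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    """Hypothetical record of one running service instance."""
    name: str
    healthy: bool


def reconcile(desired: int, observed: list[Instance]) -> dict:
    """Return the actions needed to reach `desired` healthy instances.

    Unhealthy instances are always deleted, surplus healthy ones are
    deleted, and missing ones are counted as instances to start.
    """
    healthy = [i for i in observed if i.healthy]
    unhealthy = [i for i in observed if not i.healthy]
    surplus = healthy[desired:]              # healthy instances beyond the target
    missing = max(0, desired - len(healthy))
    return {
        "delete": [i.name for i in unhealthy + surplus],
        "start": missing,
    }


if __name__ == "__main__":
    state = [Instance("svc-1", True), Instance("svc-2", False)]
    print(reconcile(3, state))
```

Running such a function periodically and applying its output is one way to keep a service at a fixed size; an orchestrator would do this continuously on the site's behalf.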
GridPP (P. Love / A. McNab)
University OpenStack instances
• Clouds at HEP institutions (Oxford/Imperial).
• ECDF cloud in Edinburgh has recently been made available to HEP
UK Vacuum deployments
• Key to our light-weight Tier-2 strategy where we operate with minimal manpower at the site (<1000 cores).
Datacentred commercial OpenStack
• Scale of a Tier-2 facility.
• Free access to their system (ATLAS) whilst they were commissioning things; paid-for access when funds available.
• Network connectivity to the UK academic network is only 1 Gbit but they have plans to upgrade
Italy (INFN; Massimo Sgaravatto et al.)
Private OpenStack cloud (Padova-Legnaro) called CLOUD AREA PADOVANA
Used by ~25 user groups/projects that financially contributed to the resources
Batch processing
• Relying on the elastiq framework, HTCondor batch clusters are instantiated.
• These batch clusters are 'dynamic': new worker nodes are automatically added or removed depending on load.
• CMS Cloud project is integrated with the local Tier-2.
• E.g. CMS VMs can access the T2 storage (dCache) using the same local protocol (dCAP) used by the T2 WNs.
• Plans to deploy the Synergy service, which allows resource allocation to be managed using a fair-share approach, without a static partitioning of such resources among the relevant user communities.
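To illustrate how a fair-share approach differs from static partitioning, here is a minimal sketch in plain Python. It is not the Synergy algorithm; the `next_group` function, the group names other than CMS, and the numbers are assumptions made up for the example. The idea shown is simply that the next free resource goes to the group furthest below its target share of recent usage.

```python
from __future__ import annotations


def next_group(shares: dict[str, float], usage: dict[str, float]) -> str:
    """Pick the group furthest below its target share of recent usage."""
    total_share = sum(shares.values())
    total_usage = sum(usage.get(g, 0.0) for g in shares)

    def deficit(group: str) -> float:
        # Target fraction minus the fraction actually consumed so far.
        target = shares[group] / total_share
        used = usage.get(group, 0.0) / total_usage if total_usage else 0.0
        return target - used

    return max(shares, key=deficit)


if __name__ == "__main__":
    shares = {"cms": 50, "theory": 30, "astro": 20}            # hypothetical shares
    usage = {"cms": 800.0, "theory": 100.0, "astro": 100.0}    # core-hours, made up
    print(next_group(shares, usage))
```

With a static partition, an idle group's cores would sit unused; here any group can use the whole cloud, and the shares only decide who goes first when there is contention.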
NorduGrid
Bern, Switzerland
SWITCHengines – Swiss NREN commercial cloud (OpenStack) (free during development phase)
[Diagram: CloudScheduler. Components: CloudScheduler; HTCondor; Job; VM; Compute Cloud; VM Image Repository. Arrows: Submit user script; Start VMs; Starts job]
Canada
Distributed cloud system for ATLAS and Belle II
• Integrated into Panda/DIRAC
• In production for 3-4 years
• Also used by Canadian astronomy
• uCernVM, CVMFS, Squid-discovery (Shoal)
• Distributed VM image repository
• Data written to local storage and transferred
• Benchmarks run at VM boot
• VM time measurements for accounting
• Reasonable monitoring
• Updating system for OpenNebula
• Studying data federations (e.g. Dynafed)
• Context-awareness
• Challenges include managing resources across many administrative domains
10-15 clouds managed by HTCondor/CloudScheduler (4000-5000 cores)
800-1000 cores (each) EC2/Azure (egress fees waived)
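The flow on this slide (user jobs queue in HTCondor, VMs are started on clouds, the VMs then run the jobs) can be sketched as a simple planning step. This is not CloudScheduler's code; the `Cloud` record, the `plan_boots` function, the cloud names and the one-core-per-job assumption are all hypothetical and chosen only to show the shape of the decision.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Cloud:
    """Hypothetical view of one cloud's spare capacity."""
    name: str
    free_cores: int
    cores_per_vm: int = 8


def plan_boots(idle_jobs: int, clouds: list[Cloud]) -> dict[str, int]:
    """Decide how many VMs to start on each cloud for the idle jobs.

    Assumes one core per job and fills clouds in the order given.
    """
    plan: dict[str, int] = {}
    remaining = idle_jobs
    for cloud in clouds:
        if remaining <= 0:
            break
        capacity = cloud.free_cores // cloud.cores_per_vm   # VMs that still fit
        needed = -(-remaining // cloud.cores_per_vm)        # ceiling division
        vms = min(capacity, needed)
        if vms > 0:
            plan[cloud.name] = vms
            remaining -= vms * cloud.cores_per_vm
    return plan


if __name__ == "__main__":
    clouds = [Cloud("cloud-a", free_cores=16), Cloud("cloud-b", free_cores=64)]
    print(plan_boots(20, clouds))
```

A real system would also retire idle VMs and respect per-cloud quotas and image availability, which is where the challenge of many administrative domains shows up.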
Cloud resources
10 clouds
4300 cores
Canadian WLCG “cloud” – includes Australian T2
Friday October 6
Job scheduling / VM provisioning
• Variety of methods for running HEP workloads on clouds
– VM-DIRAC (LHCb and Belle II)
– VAC/Vcycle (UK)
– HTCondor/CloudScheduler (Canada)
– HTC/GlideinWMS (FNAL), HTC/VM (PNNL), HTC/APR (BNL)
– Dynamic-Torque (Australia)
– Cloud Area Padovana (INFN)
– ARC (NorduGrid)
• Each method has its own merits and often was designed to integrate clouds into an existing infrastructure (e.g. local, WLCG and experiment)
Commercial and private clouds
• Commercial cloud use
– Primarily Amazon EC2 and Microsoft Azure (with grants)
– ATLAS discussing use of GCE
– Other commercial OpenStack clouds
• DataCentred (UK), SWITCHengines (Switzerland)
– CERN commercial cloud procurement
• Private clouds
– OpenStack and OpenNebula research-funded clouds but not involved in HEP
Network connectivity
• Amazon and Microsoft clouds are connected to the research networks in North America (probably GCE as well)
– Egress charges can be waived upon request
• Trans-border or trans-ocean traffic can be an issue
– Has become an important discussion topic in the LHCONE meetings
• Private opportunistic clouds
– Traffic flows over research network but not LHCONE network
CPU benchmarks
New suite of “fast” benchmarks
– HEPiX Benchmark Working Group
– Suite available includes “fast HS” (LHCb) and Whetstone benchmarks
• Write to ElasticSearch DB
– Run benchmarks in the pilot job or during the boot of the VM
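As a rough illustration of running a quick benchmark at VM boot and recording the result, the sketch below times a toy floating-point loop and builds a JSON document of the kind that could be written to a database. The loop is a stand-in and is not the HEPiX suite, fast HS, or Whetstone; the function names, the field names and the cloud label are assumptions, and no ElasticSearch call is made.

```python
import json
import socket
import time


def fast_benchmark(iterations: int = 200_000) -> float:
    """Toy floating-point loop; returns iterations per second.

    A stand-in only: not the HEPiX suite, fast HS, or Whetstone.
    """
    x = 1.0
    start = time.perf_counter()
    for i in range(1, iterations + 1):
        x = (x + i / x) * 0.5
    elapsed = time.perf_counter() - start
    return iterations / elapsed


def benchmark_record(cloud: str) -> str:
    """Build the JSON document that would be sent to the database."""
    doc = {
        "host": socket.gethostname(),
        "cloud": cloud,                 # hypothetical label for the cloud
        "benchmark": "toy-fp-loop",
        "score": fast_benchmark(),
        "timestamp": time.time(),
    }
    return json.dumps(doc)


if __name__ == "__main__":
    print(benchmark_record("example-cloud"))
```

Because the record carries the host and cloud name, scores from many VMs can later be compared per cloud or used to weight accounting.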
Data storage
– Data written to local storage on node and then transferred to selected SE
– UK group has done some work integrating their object store with ATLAS
– BNL using S3 storage on EC2 for T2-SE
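The "write locally, then transfer to the selected SE" pattern can be sketched as follows. This is an assumption-laden illustration: a local directory stands in for the storage element, `shutil.copyfile` stands in for a real grid transfer tool, and Adler-32 is used only as an example checksum. The `stage_out` and `adler32` helpers and the file names are hypothetical.

```python
import shutil
import tempfile
import zlib
from pathlib import Path


def adler32(path: Path) -> int:
    """Adler-32 checksum of a file, read in 1 MiB chunks."""
    value = 1
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            value = zlib.adler32(chunk, value)
    return value


def stage_out(local_file: Path, se_dir: Path) -> Path:
    """Copy a finished output file to the stand-in SE and verify the copy."""
    se_dir.mkdir(parents=True, exist_ok=True)
    remote = se_dir / local_file.name
    shutil.copyfile(local_file, remote)
    if adler32(remote) != adler32(local_file):
        raise IOError(f"checksum mismatch for {local_file.name}")
    return remote


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        scratch = Path(tmp) / "scratch" / "output.root"
        scratch.parent.mkdir()
        scratch.write_bytes(b"simulated events")   # job writes to local disk
        print(stage_out(scratch, Path(tmp) / "se"))
```

Writing to node-local disk first keeps the job independent of the wide-area network while it runs; only the final transfer depends on connectivity to the SE.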
Monitoring
Cloud system monitor: Sensu, Munin, RabbitMQ, Mongo-DB, Ganglia
Application monitor: Panda monitoring
Benchmarks and accounting: ElasticSearch DB
[Diagram labels: Application; Cloud or site monitor]
Summary
• Clouds at HEP sites
– Typically integrated into an existing infrastructure
– Seen as a way to better manage multi-user resources
– Cloud R&D funding opportunities
• Opportunistic research clouds
– Easy way to utilize clouds at non-HEP research computing facilities
– No requirement for on-site application specialists or complex software
• Commercial clouds
– EC2/Azure/GCE dominate but other OpenStack clouds are also used
– Grant and some contracted resources
– Trans-border network connectivity being addressed