Deliver Best-in-Class HPC Cloud Solutions Without Losing Your Mind
Post on 14-Feb-2017
Deliver Best-in-Class HPC Cloud Solutions Without Losing Your Mind
WEBINAR
April 13, 2016, 11:00 AM ET
Housekeeping
  Audio help
  Attachments
  Questions
  Rating
Today's Speakers
  Rick Friedman, Vice President, Solution Development, Cycle Computing
  Scott Jeschonek, Director of Product Management, Cloud, Avere Systems
Agenda
  Discuss the current state of HPC clouds and their impact on your HPC world
  Reasons why you aren't 100% cloud-based already
  The hybrid cloud and HPC
  Possible implementations
    Delivering file systems using Avere Systems
    Orchestration using Cycle Computing
HPC Today (and Yesterday, and Tomorrow)
What Drives Today's Needs
  Data
    Who, what, when, how much, where?
  Datacenter limitations
    Can I defy physics?
  User expectations
    Can we even do that?
  Technology shifts
    What is the best practice?
Big Compute Workloads: How are they handled?
Compute Demand vs. Cluster Size
[Chart labels: Cluster Size, Compute Demand, Missed Opportunity, Wasted Resources]
  Internal infrastructure has huge value and some limitations
  Access, not capacity, is the barrier to continued growth
  Perception limits the scale of problem solving
  Public cloud = cost-effective, readily available resources for users with problems and deadlines
  Financial services, manufacturing and life sciences are leading the way
Cost-effective internal infrastructure has enabled users to solve increasingly larger problems over the last 15 years, while also highlighting some inefficiencies. The barrier to sustaining that growth is an access and allocation problem, not a compute problem. Big Compute users are limiting the size of the problems they tackle to the infrastructure they think they can access. Public cloud represents an opportunity to allocate cost-effective, readily available resources to users who value the ability to solve a problem within a deadline. Financial services, manufacturing and life sciences are leading the way; their problem is most acute, and solving it has measurable business benefit.
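As a rough illustration of the "Missed Opportunity" and "Wasted Resources" regions in the chart above, the sketch below compares a demand curve against a fixed cluster size. The function name, the hourly demand figures and the 800-core cluster are all hypothetical values chosen for the example, not numbers from the webinar.

```python
from typing import Sequence, Tuple


def capacity_gap(demand: Sequence[float], cluster_size: float) -> Tuple[float, float]:
    """Return (missed_opportunity, wasted_resources) in core-hours.

    demand holds the cores requested in each hour; cluster_size is the
    fixed number of cores available on premises.
    """
    # Demand above the cluster size cannot be served: missed opportunity.
    missed = sum(max(d - cluster_size, 0.0) for d in demand)
    # Capacity above the demand sits idle: wasted resources.
    wasted = sum(max(cluster_size - d, 0.0) for d in demand)
    return missed, wasted


# Hypothetical hourly demand (cores) against a hypothetical 800-core cluster.
hourly_demand = [200, 450, 900, 1500, 600, 100]
missed, wasted = capacity_gap(hourly_demand, cluster_size=800)
print(f"missed opportunity: {missed:.0f} core-hours")
print(f"wasted resources:   {wasted:.0f} core-hours")
```

A fixed cluster shows both kinds of gap in the same day, which is the argument for sizing compute to the demand curve instead.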
Basic HPC Environment Requirements
  Resource Manager
  Jobs Manager / Scheduler
  Workload
  NAS Storage
  Lots of compute resources (Grid)
Advantages of Clouds
Significantly reduce infrastructure management costs both in money and time
Maintain operational flexibility during scale-out jobs; let the provider deal with scale challenges
Why the Cloud for Big Compute?
  Scientist / Engineer user perspective
    Zero queue times, capacity in minutes
    Scale compute to problem size, not vice versa
    Try / support new computational approaches and software quickly
  SysArchitect perspective
    Dynamically adjust workloads to lowest cost/impact provider
    Focus on computational excellence, not hardware management
    Support a wide range of user types efficiently
  Organizational perspective
    Match spending to actual consumption
    Increase responsiveness to business dynamics
    Grow user base without hardware limitations
Clouds Have Awesome New Capabilities
  Big Data
    Analytics tools
    Massively scalable NoSQL
    Data warehousing
  Machine Learning
    Voice/Vision/Speech
    Early days
So why isn't everything in the cloud?
  Current infrastructure investment (capex)
    Cloud costs not yet completely in line
  Software infrastructure in place
    Costs to refactor, dependencies to consider
  Data environment in one or more data centers
  Orchestration and management of cloud clusters is hard
  Network bandwidth / latency concerns
  Business continuity
Other Reasons You're Not 100% in the Cloud
  Corporate budgets
  Corporate policies
  Corporate politics
  Education / awareness
  Government regulations
  Interest groups
  Vendor relationships
Near Future, Hybrid Cloud
  Adoption of one or more cloud providers
    More than one provider: a hedge on price and SLA
  Mix of on-prem and cloud resources
  Regulatory, proprietary and/or security characteristics will likely keep data in the DC
[Diagram labels: Tokyo office, London office, NYC office, Hong Kong office, Analysts, Submit Jobs, Primary DC, Secondary DC, NAS, Cloud Provider 1, Cloud Provider 2]
HPC in the Cloud
[Diagram labels: Cloud Compute Environment, Cloud Compute API, Data, Scheduler, NAS Storage, Analysts, Jobs, On-Premises Data Center]
HPC in the Cloud, Grids on Demand
[Diagram labels: Cloud Compute Environment, Cloud Compute API, Scheduler1, Scheduler2, Scheduler3, Scheduler4, Data, NAS Storage, Analysts, Scheduler, Jobs, On-Premises Data Center]
Challenges with HPC in the Cloud
How do you get the data close to your compute nodes?
How do you orchestrate on-demand clusters/grids of compute nodes?
How does this all come together?
Data Access Layer
  File System Caching Layer
    Only load necessary blocks of files
    Opaque to compute nodes
[Diagram labels: Cloud Compute Environment, Cloud Compute API, Data Access Layer, Scheduler1, Scheduler2, Scheduler3, Scheduler4, Data, NAS Storage, Analysts, Scheduler, Jobs, On-Premises Data Center]
Advantages of Data Access / Cache Layer
Keep your data on prem! Data is in the cloud only while the compute nodes work the jobs. This reduces security objections and simplifies the move to cloud.
Increase cloud compute performance using file system caching: most of the data will be in RAM, close to the nodes. This avoids ingest latencies and slashes transit latency after the first read.
Scale out using a solution that supports tens of thousands of file system connections from compute cores.
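The caching idea above (load only the blocks a job actually reads, serve repeat reads locally) can be sketched as a read-through block cache. This is a toy illustration, not Avere's implementation: the class name, the 4-byte block size, the in-memory `ORIGIN` dictionary standing in for on-prem NAS, and the lack of eviction are all assumptions made for the example.

```python
from typing import Callable, Dict, List, Tuple

BLOCK_SIZE = 4  # bytes; deliberately tiny so the example is easy to follow


class BlockCache:
    """Read-through cache that fetches only the blocks a read touches."""

    def __init__(self, fetch_block: Callable[[str, int], bytes],
                 block_size: int = BLOCK_SIZE) -> None:
        self._fetch = fetch_block
        self._block_size = block_size
        self._blocks: Dict[Tuple[str, int], bytes] = {}
        self.remote_reads = 0  # how many blocks crossed the WAN

    def read(self, path: str, offset: int, length: int) -> bytes:
        if length <= 0:
            return b""
        first = offset // self._block_size
        last = (offset + length - 1) // self._block_size
        chunks: List[bytes] = []
        for index in range(first, last + 1):
            key = (path, index)
            if key not in self._blocks:
                # Cache miss: pull just this block from the origin.
                self._blocks[key] = self._fetch(path, index)
                self.remote_reads += 1
            chunks.append(self._blocks[key])
        data = b"".join(chunks)
        start = offset - first * self._block_size
        return data[start:start + length]


# Hypothetical origin standing in for the on-prem NAS.
ORIGIN = {"/data/input.bin": b"abcdefghijklmnopqrstuvwxyz"}


def fetch_from_nas(path: str, index: int) -> bytes:
    start = index * BLOCK_SIZE
    return ORIGIN[path][start:start + BLOCK_SIZE]


cache = BlockCache(fetch_from_nas)
print(cache.read("/data/input.bin", 5, 6), cache.remote_reads)  # first read: misses
print(cache.read("/data/input.bin", 6, 3), cache.remote_reads)  # overlapping read: hits
```

The compute node calls `read` as if the file were local; only the first access to a block pays the transit cost. A production cache would also need eviction, write handling and consistency with the origin, none of which is shown here.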
Typical File Access in Hadoop Cluster
Caching files will work for certain types of jobs, where the typical file is accessed by multiple clients
Source: http://blog.cloudera.com/blog/2012/09/what-do-real-life-hadoop-workloads-look-like/
Hybrid Cloud using Avere FXT and vFXT Edge Filers
[Diagram labels: Cloud Compute, On-Prem Compute, Cloud Storage, On-Prem Storage, NAS, Object, Bucket 1, Bucket 2, Bucket n, Virtual Compute Farm, Virtual FXT, Physical FXT]
  File Storage for Private Object
  NAS Optimization
  Cloud NAS
  Cloud Bursting
  Cloud Storage Gateway
The Edge = locating your data close to your compute without truly moving it from your NAS environment
Avere Building Blocks
"Avere is uniquely positioned to offer scale across tens of thousands of cloud compute cores while leaving the data where it originates, on premises, with its global file system and caching capabilities."
- Unnamed CTO
[Diagram labels: Cloud, Cloud Compute, Virtual FXT, On-Premises, Physical FXT, NAS, Object, File Acceleration, 12-20ms, Encrypted]
Orchestration and Management Layer
[Diagram labels: Cloud Compute Environment, Cloud Compute API, Scheduler1, Scheduler2, Scheduler3, Scheduler4, Data, NAS Storage, Analysts, Scheduler, Jobs, On-Premises Data Center]
Optimization
  Benchmark instances
  Make Workflow UI Human workflow
Provisioning
  Workload placement
  Optimal scale
  Cost optimization
  Data scheduling
Cluster Configuration
  Multi-cloud, without changes
  Pre-set or user-defined types
  Abstraction for all cluster data, attributes (roles, OS, etc.)
Monitoring
  Auto-scaling
  Usage tracking
  Error handling
  Reporting
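To make the auto-scaling and cost-optimization items above concrete, here is a minimal sketch of the two decisions an orchestration layer has to make: how many nodes to run for the queued work, and which provider to place them on. It is not Cycle Computing's algorithm; the policy fields, provider names and prices are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ScalingPolicy:
    cores_per_node: int  # cores on each node of the chosen type
    max_nodes: int       # hard cap, e.g. a budget or quota limit


def target_nodes(queued_core_demand: int, policy: ScalingPolicy) -> int:
    """Nodes needed to run all queued work at once, capped by the policy."""
    needed = math.ceil(queued_core_demand / policy.cores_per_node)
    return min(needed, policy.max_nodes)


def cheapest_provider(price_per_node_hour: Dict[str, float]) -> str:
    """Pick the provider with the lowest current node-hour price."""
    return min(price_per_node_hour, key=price_per_node_hour.get)


policy = ScalingPolicy(cores_per_node=16, max_nodes=100)
# Hypothetical prices; a real system would query each provider.
prices = {"provider-1": 0.42, "provider-2": 0.38}

print(target_nodes(1000, policy))  # scale up to cover 1000 queued cores
print(target_nodes(0, policy))     # empty queue: scale to zero
print(cheapest_provider(prices))
```

Scaling to zero when the queue is empty is what lets spending track actual consumption; the cap keeps a runaway queue from producing a runaway bill.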
Complete Multi-Cloud Workflow Control
  User: Web UI, API, command line; job and data workflow
  File: declarative cluster definition; packages, installers, containers, data
  Automated: job placement, cost optimization, auto-scaling, benchmarking, compliance, reporting tools
  Multi-cloud without changes
[Diagram labels: User, Admin, Internal Cluster]
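One way to picture "declarative cluster definition" combined with "multi-cloud without changes" is a single provider-neutral description that is rendered into provider-specific requests at launch time. The sketch below is illustrative only: the field names, the instance-type mapping and the request format are invented for the example and do not reflect Cycle Computing's actual file format or any provider's API.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class ClusterDefinition:
    """Provider-neutral description of a cluster (hypothetical schema)."""
    name: str
    os: str
    roles: Dict[str, int]                      # role name -> node count
    packages: List[str] = field(default_factory=list)


# Hypothetical mapping from provider to an equivalent node type.
INSTANCE_TYPES = {
    "provider-1": "p1.compute.large",
    "provider-2": "p2-hpc-16",
}


def render(definition: ClusterDefinition, provider: str) -> List[dict]:
    """Turn the neutral definition into one launch request per role."""
    instance_type = INSTANCE_TYPES[provider]
    return [
        {
            "cluster": definition.name,
            "provider": provider,
            "instance_type": instance_type,
            "role": role,
            "count": count,
            "os": definition.os,
            "packages": list(definition.packages),
        }
        for role, count in definition.roles.items()
    ]


grid = ClusterDefinition(
    name="risk-grid",
    os="linux",
    roles={"scheduler": 1, "execute": 64},
    packages=["solver"],
)

for request in render(grid, "provider-1"):
    print(request)
```

The same `grid` object renders for either provider; only the `provider` and `instance_type` fields of the output differ, which is the sense in which the definition itself needs no changes.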
How Cycle Makes Cloud Productive
  Scientist / Engineer productivity:
    Simple workflows
    Zero queue time
    Auto-scaling
  SysAdmin productivity:
    Instant access to additional resources
    Workflows linking internal and multiple clouds
    Simple, reliable tools to enable apps with special requirements
  Organizational productivity:
    Secure, consistent cloud access
    Usage tracking
    Ability to leverage multiple providers
Big Data w/o Disrupting Production
  Challenge
    Estimate the carbon stored in Saharan biomass
    Rapidly establish a baseline for later research using large amounts of high-resolution remote sensing data
    Existing internal compute resources fully committed
    Limited window to complete processing
  Cycle solution
    Full workflow including data management between internal data capture and cloud processing
    Leverage spot pricing to minimize cost while maximizing computation
  Results
    Linearly scalable and predictable, enabling a plan for next steps
    Science being done that could not be done otherwise
    1 month from start to initial runs
Overall Architecture: Data In-House
[Diagram labels: Cloud Compute, Scheduler, Avere FXT Edge Filer, Avere FXT, Workload, Cloud API, NAS Storage, Scheduler, Cloud Storage]
What We Covered
  The Current State of HPC Clouds and Their Impact on Your HPC World
  Reasons Why You Aren't 100% Cloud-based Already
  The Hybrid Cloud and HPC
  Possible Implementations
    Delivering File Systems Using Avere Systems
    Orchestration Using Cycle Computing
Thank you!
Cycle Computing Contact Info:
More about Avere Systems: [email protected]
https://twitter.com/averesystems
https://www.youtube.com/user/AvereSystems
https://www.linkedin.com/company/589037
h