infinitely scalable clusters - grid computing on public cloud - london
TRANSCRIPT
変通 [hen-tsoo] noun
1. Resourcefulness – the quality of being able to cope with a difficult situation
2. Adaptability – the ability to change (or be changed) to fit changed circumstances
3. Agility – the power of moving quickly and easily; nimbleness
INFINITELY SCALABLE CLUSTERS
Grid computing on public cloud
WELCOME TO HENTSŪ
October 2016
AGENDA
• Grid computing overview
• Trusted tools moving into public cloud
• Alternative cloud services
SOME BACKGROUND
TERMINOLOGY
• Public Cloud (AWS, Azure, Google)
• Private Cloud (Your datacentre)
• High Performance Computing (HPC)
• Grid computing
• Compute cluster
• MathWorks MATLAB
• CPUs / Processors / Cores
• RAM (processor storage)
• Disk (physical storage)
• IaaS (virtual hardware and networking)
• PaaS (software services)
WHAT IS PUBLIC CLOUD?
“A service provider makes resources, such as virtual machines, applications and storage, available to the general public.”
• Utility model
• No contracts
• Shared hardware / multi-tenant
• Self-managed
WHAT IS GRID COMPUTING?
Traditional resource limitations:
• Data store performance
• PC Processor / Memory / Storage
• Network bandwidth
The researcher may wait a long time for results.
• Grid computing moves the computational work from the PC to a cluster of servers
• The cluster processes the data on behalf of the researcher and returns the results
• Processing time is reduced
• Larger datasets can be tackled
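As a minimal sketch of the idea above, the following Python example splits a job into independent tasks and hands them to a pool of worker processes, which stand in for the cluster. The names `simulate` and `run_on_grid` and the workload itself are hypothetical; a real grid would use a scheduler and remote machines rather than local processes.

```python
from concurrent.futures import ProcessPoolExecutor


def simulate(seed: int) -> int:
    """Hypothetical CPU-bound task: a stand-in for one unit of research work."""
    total = 0
    for i in range(100_000):
        total = (total + seed * i) % 1_000_003
    return total


def run_on_grid(seeds, workers=4):
    """Send independent tasks to worker processes and collect the results in order."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(simulate, seeds))


if __name__ == "__main__":
    # The "researcher" submits eight tasks and waits for all results.
    results = run_on_grid(range(8))
    print(len(results), "results returned")
```

Because the tasks do not depend on each other, adding workers shortens the wait until the number of workers reaches the number of tasks.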
KEY CONCEPTS
The challenges
[Figure: number of tasks against size of data, positioning Big Data, High Throughput Computing, MapReduce and High Performance Computing]
The workflows
Ingest → Process → Analyse → Visualise → Store
CHOICE OF TOOLS AND PLATFORMS
TRUSTED TOOLS & PUBLIC CLOUD
HARDWARE INFLEXIBILITY
• Buy 22-core processors at 2.2GHz or 6-core processors at 3.6GHz?
• Buy 8GB, 16GB or 32GB memory modules (RAM per core ratio)?
• Graphics Processing Units (GPUs)?
• How much local storage per server?
• What network devices between servers (32 or 48 port switches)?
• What size file server?
Grid usage varies depending on research priorities:
[Chart: jobs per day, Monday to Sunday]
PROFILING MATLAB RESOURCE USAGE
• MATLAB uses one processor core at a time (50% on a 2 vCPU machine). Use the Parallel Computing Toolbox for multicore PCs.
• MATLAB stores all data in RAM, with very little I/O while processing
• I/O spike when writing out results
Sysinternals Process Explorer
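The 50% figure follows from simple arithmetic: a process that uses one core at a time can show at most 1 divided by the number of cores as overall CPU utilisation. A small sketch of that calculation (the function name is hypothetical):

```python
def max_single_thread_utilisation(vcpus: int) -> float:
    """Upper bound on total CPU utilisation (percent) for a process that uses one core."""
    if vcpus < 1:
        raise ValueError("vcpus must be at least 1")
    return 100.0 / vcpus


for vcpus in (1, 2, 4, 8):
    limit = max_single_thread_utilisation(vcpus)
    print(f"{vcpus} vCPU: at most {limit:.1f}% total CPU")
```

The larger the machine, the lower the reading looks in a process monitor, even though the single core in use is fully busy.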
MATLAB GRID WITH PUBLIC CLOUD
- Pay only for what you use
- Scale compute resource up AND down
- Minimal capital outlay on hardware
- Experiment with grid computing platforms quickly, cheaply and with no commitment
A DAY IN A PUBLIC CLOUD CLUSTER
[Chart: workers and tasks in queue by time of day]
- Cluster of 32 nodes × 4 cores
- Max 128 workers
- Ramps up as jobs get submitted
- Tears down nodes when jobs have finished
- Minimises costs when not in use
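The ramp-up and tear-down behaviour described above could be driven by a rule like the one sketched below. This is an illustration only: the slide does not say how the cluster decides its size, so the one-task-per-core policy and the function name `desired_nodes` are assumptions. The node and core counts are one reading of the slide (32 nodes of 4 cores each).

```python
import math

CORES_PER_NODE = 4  # assumed from the slide: 4 cores per node
MAX_NODES = 32      # assumed from the slide: 32 nodes, so at most 128 workers


def desired_nodes(queued_tasks: int, running_tasks: int) -> int:
    """Nodes needed to give every queued or running task its own core, capped at MAX_NODES."""
    demand = queued_tasks + running_tasks
    return min(MAX_NODES, math.ceil(demand / CORES_PER_NODE))


for queued in (0, 5, 60, 500):
    print(f"{queued} queued tasks: {desired_nodes(queued, 0)} nodes")
```

With no work queued or running the rule returns zero nodes, which is what keeps the cost down when the cluster is idle.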
IDEAL CLUSTER SIZE?
[Chart: job run time in seconds against number of cores, from 8 to 224]
Ingest → Process → Analyse → Visualise → Store
Optimise other parts of the workflow?
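One way to reason about the question above is Amdahl's law: only the parallel part of a job gets faster as cores are added, so the serial parts of the workflow set a floor on run time. The sketch below uses made-up numbers (a 10,000 second single-core job that is 95% parallel); they are not taken from the chart. The function name `run_time` is hypothetical.

```python
def run_time(total_seconds: float, parallel_fraction: float, cores: int) -> float:
    """Estimated run time under Amdahl's law for a job with a fixed serial part."""
    serial = total_seconds * (1.0 - parallel_fraction)
    parallel = total_seconds * parallel_fraction / cores
    return serial + parallel


# Hypothetical job: 10,000 s on one core, 95% of it parallelisable.
for cores in (8, 16, 32, 64, 96, 128, 160, 192, 224):
    print(f"{cores:>3} cores: {run_time(10_000, 0.95, cores):8.1f} s")
```

In this example the run time can never drop below the 500 second serial part, so past a certain cluster size it pays more to speed up the serial stages than to add cores.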
RUNNING MATLAB CLUSTER ON IAAS
AWS vCPUs are hyper-threaded
Each vCPU is a hyper-thread of an Intel Xeon core for 2nd generation instance types (M4, M3, C4, C3, R3, HS1, G2, I2, and D2)
https://aws.amazon.com/ec2/instance-types/
Azure does not overcommit memory or cores. vCPUs are physical cores. Azure does not use hyper-threading.
https://aws.amazon.com/ec2/instance-types/
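The practical consequence is that the same vCPU count can mean a different number of physical cores on each provider. A rough sizing sketch follows, using the threads-per-core figures stated on this slide (as of October 2016); the function name and the idea of sizing workers by physical core are assumptions for illustration.

```python
# Hardware threads per physical core, as described on the slide.
THREADS_PER_CORE = {"aws": 2, "azure": 1}


def physical_cores(provider: str, vcpus: int) -> int:
    """Physical cores behind a given vCPU count for the named provider."""
    return vcpus // THREADS_PER_CORE[provider]


for provider in ("aws", "azure"):
    print(f"{provider}: 8 vCPUs map to {physical_cores(provider, 8)} physical cores")
```

If one worker per physical core is the target, an 8 vCPU instance would host half as many workers on AWS as on Azure under these figures.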
GRID DEPLOYMENT OPTIONS
1. Infrastructure as a Service (IaaS) DIY
Spin up a compute cluster on VMs for additional capacity and new workloads
2. Burst
Use existing on-premises compute cluster and burst to cloud as required
3. Software as a Service (SaaS)
Software vendors and Managed Service Providers provide their own SaaS solutions. Pay for compute and application software per hour
4. Platform as a Service (PaaS)
Cloud providers’ data analytics platforms as a service: Google BigQuery & Datalab, Microsoft HDInsight, Amazon EMR
CLOUD HOSTED DATA AND ANALYTICS AS A SERVICE
GOOGLE BIG DATA REFERENCE ARCHITECTURE
WHAT IS BIGQUERY?
Hadoop-based “service that enables interactive analysis of massively large datasets”
• Distributed File System – Stores data that’s larger than can fit on a single machine
• Map Reduce – Distributes processing across multiple systems
http://blogs.forrester.com/mike_gualtieri/13-06-07-what_is_hadoop
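The MapReduce pattern named above can be sketched in a single process: each chunk of data is mapped to a partial result independently, then the partial results are reduced into one answer. This is a toy word count, not how any particular service implements it; the function names are hypothetical, and on a real cluster the map calls would run on separate machines.

```python
from collections import Counter
from functools import reduce


def map_phase(chunk: str) -> Counter:
    """Map step: count the words in one chunk, independently of all other chunks."""
    return Counter(chunk.lower().split())


def reduce_phase(left: Counter, right: Counter) -> Counter:
    """Reduce step: merge two partial counts into one."""
    return left + right


def word_count(chunks) -> Counter:
    """Run the map step over every chunk, then fold the partial counts together."""
    return reduce(reduce_phase, map(map_phase, chunks), Counter())


counts = word_count(["grid cloud", "cloud cluster cloud"])
print(counts.most_common())
```

Because each map call touches only its own chunk, the chunks can live on different machines, which is what lets the pattern handle data larger than a single machine can store.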
GOOGLE BIGQUERY AND DATALAB DEMO
DON’T FORGET SECURITY
Security considerations:
• Secure transfer and storage of data and code
• Secure remote access to cloud hosted environment
• Secure authentication
  • Windows AD credentials
  • AWS IAM credentials
  • Google accounts
  • Microsoft accounts
• Auditing (who accessed what, who changed what)
SUMMARY
• Traditional grid and HPC tools can benefit from moving into cloud
• Vast landscape of available tools
• Off-the-shelf PaaS offerings
• Integrations and ecosystems
• Cheap and very quick to experiment
NEXT EVENT: JANUARY 2017
Intellectual Property (IP) security for Public Cloud Services
Securing mobile email and cloud based file storage