TRANSCRIPT
OpenStack DC Meet Up
June 7th at 6:30pm @warehousedc
www.meetup.com/OpenStackDC
www.twitter.com/OpenStackDC
WELCOME!
Thank you to our Sponsor,
( )!
Meet our OpenStack DC Organizers
Haisam Ido
Kapil Thangavelu
Matthew Metheny
Eric Mandel
Jason Ford
Kenna McCabe
Ryan Day
AGENDA
"High-Performance, Heterogeneous Computing and OpenStack" by Karandeep Singh, Cloud Computing and HPC Engineer at University of Southern
California / Information Sciences Institute
"Essex: Architecture and Deployment of Compute Clouds" by Jason Ford, CTO and Co-Founder of BlackMesh
"OpenStack Nova Distributed RPC with Zeromq" by Eric Windisch Senior Systems Engineer at CloudScaling
"OpenStack Bare-Metal Provisioning Framework” by Mikyung Kang at Adaptive Parallel Execution Division at University of Southern
California / Information Sciences Institute
Heterogeneous, High-Performance Cloud Computing using OpenStack
Karan Singh and Steve Crago University of Southern California / Information Sciences Institute
June 7, 2012
Objectives
Heterogeneous, virtualized high performance computing (HPC)
testbed
HPC resources available through private cloud
— Resources available remotely for operations, prototypes, experiments and
disadvantaged users
— Dynamic resource provisioning
— Non-proprietary open source cloud software that can be replicated and
extended as needed
Heterogeneous processing resources
— Large x86-based shared memory machine (SGI UV100)
— General-purpose many-core (Tilera TILEmpower)
— GPU-based accelerators (NVidia Tesla)
Heterogeneous Processing Testbed
Heterogeneous On-Demand Processing Testbed
• Shared memory: 1 SGI Altix UV 100 (Intel Xeon Nehalem, 128 cores)
• Tiled processors: 10 Tilera TILEmpower boards (TILEPro64, 640 cores)
• GPU cluster: 3 NVidia Tesla S2050s (Fermi GPUs, 5,376 cores)
• Commodity cluster (Intel Xeon Clovertown, 80 cores) and storage array
Heterogeneous Processors
• SGI UV 100: Shared memory, traditional HPC, x86 processors that support legacy code. Supports KVM and LXC.
• Tilera TILEmpower: General-purpose many-core, 10x-100x improvement in power efficiency for integer processing, Linux-based C/C++ development environment. Supports bare-metal provisioning.
• Nvidia TESLA 2050: Very high performance and efficiency (100x) for regular computational kernels, CUDA development environment. Supports LXC (host).
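The list above can be read as a capability lookup: each resource class supports different provisioning methods. A minimal Python sketch of how a controller might choose one; the component keys and the preference order are illustrative assumptions, not actual testbed code:

```python
# Provisioning support per resource class, as listed above.
SUPPORTED = {
    "sgi_uv100": ["kvm", "lxc"],
    "tilera_tilempower": ["bare-metal"],
    "nvidia_tesla_2050": ["lxc"],  # host-side LXC for GPU access
}

def provisioning_for(component, preferred=("kvm", "lxc", "bare-metal")):
    """Pick the first preferred method the component supports."""
    for method in preferred:
        if method in SUPPORTED[component]:
            return method
    raise ValueError("no supported provisioning method for " + component)

print(provisioning_for("tilera_tilempower"))  # bare-metal
```

The point of the sketch is that heterogeneity forces this decision per machine type rather than once for the whole cloud.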
Heterogeneity: Architectures
[Chart: CPU vs. GPU comparison: 10^10 samples vs. 10^8 samples; 136.2 seconds vs. 139.5 seconds]
[Image: SGI UV100 rendering 1926 objects]
[Image: Tilera vs. x86 video transcoding]
Infrastructure as a Service (IaaS)
Provides a web services portal and developer tools for managing virtual private clusters, virtual
storage, and virtual machine images
— Images can be provided to users or users can create their own images
— Enables users to access centralized heterogeneous HPC resources through private cloud interface
— Ability to address soft real-time requirements
New machine types so that all of the development tools and higher-level services (PaaS and SaaS
or ASP) can access them
— Each machine type requires (or can handle) a unique image type (e.g. a GPU requires a GPU executable)
— Each machine type has its own image boot process
Machine Types:
•SGI Ultra Violet:
uv.small, uv.large, …
•Tilera TileEmpower:
tile.1x1, tile.2x2, …
•Nvidia Tesla GPU
g1.large+s2050
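The machine-type names above encode which architecture an image must target. A hypothetical Python sketch of that mapping; only the type names come from the slide, and the lookup table and helper functions are invented for illustration:

```python
# Each machine-type prefix implies a CPU architecture, and an image
# built for one architecture cannot boot on another.
TYPE_ARCH = {
    "uv": "x86_64",       # SGI Ultra Violet shared-memory nodes
    "tile": "tilepro64",  # Tilera TILEmpower boards
    "g1": "x86_64+gpu",   # x86 host with an attached Tesla GPU
}

def arch_for_type(machine_type):
    """Return the CPU architecture implied by a machine-type name."""
    prefix = machine_type.split(".")[0].split("+")[0]
    return TYPE_ARCH[prefix]

def image_compatible(machine_type, image_arch):
    """An image boots only on a machine type with a matching architecture."""
    return arch_for_type(machine_type) == image_arch

print(arch_for_type("tile.2x2"))               # tilepro64
print(image_compatible("uv.small", "x86_64"))  # True
```

This is the check the text describes informally: a GPU machine type needs a GPU executable, a Tilera type needs a Tilera image, and so on.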
Browser-based, command-line, and programming interfaces
Management of private instances, application machine images, security credentials, network firewalls and addresses, datacenters, etc.
Agent of Innovation: from visionary to viable
Heterogeneity: Virtualization
• 3D parallel rendering system
— Tachyon v. 0.99
— Rendering a scene with 1926 objects
— Shared memory test
[Chart: Speedup of 3D Rendering (Tachyon) vs. number of H/W threads used (1, 16, 32, 64); speedup ranges from 0 to 70; series: Native (w/o pinning), KVM w/ pinning, LXC w/ pinning (2 times h/w threads), LXC w/ pinning, LXC w/o pinning]
Heterogeneity: GPU Access Methods
[Chart: Host to Device Bandwidth, Pageable (MB/sec vs. transfer size in bytes, up to ~4,000 MB/sec); series: Host, LXC, gVirtus]
[Chart: Matrix Multiply for Increasing NxM (GFlops/sec vs. size NxM, single precision real, 80x160 through 800x1600, up to ~200 GFlops/sec); series: Host, gVirtus, LXC]
Heterogeneity: Message Passing
[Chart: Send/Recv (1,000 iterations), total cycles (log scale, 1,000 to 1,000,000) vs. message size in words (1, 4, 16, 64, 256, 1K, 4K, 16K, 32K, 64K); series: iLib_2.0, MPI_2.0.2, MPI_1.3.5, MPI_2.1.0; labeled points run from 5,403 cycles at 1 word to 762,745 cycles at 64K words]
Future Plans
• Additional devices
• FPGAs
• ARM cores (Calxeda)
• Next-generation GPUs
• New host virtualization options with GPUs
• Collaboration with Nvidia
• Resource scheduling
• Platform-as-a-Service
• Security hardening
• Application demonstrations
• Deployment
Essex: Architecture and Deployment of Compute Clouds
By Jason Ford, CTO of BlackMesh Managed Hosting
Twitter: @bmeshjason and @BlackMesh
Working with virtualization technology for five years
OpenStack since Cactus
BlackMesh formed in 2003
Four datacenters (three in Northern VA and one in Las Vegas NV)
Manage ~650 servers today
About Me and BlackMesh
OpenStack Overview
From: http://ken.pepple.info/openstack/2012/02/21/revisit-openstack-architecture-diablo/
Agenda
Shared-nothing architecture
Nova: Compute
Swift: Object Storage
Glance: Image Service
Quantum: SDN (Network)
Keystone: Authentication
Horizon: Web Dashboard
What we will talk about today:
Nova and related services
What the physical layout looks like for deployments
Overall Security
Compute Images
OpenStack Overview
Nova Architecture
Nova Services
Nova-api: The heart of Nova; the traffic cop for all other services
Nova-volume: Deals with dynamically attached block storage
Nova-network: Manages networking and VLANs
Nova-scheduler: Decides where resources are going to be consumed
Nova-compute: Manages communication between the hypervisor and the API
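In a simple all-on-one-server deployment, the services above are pointed at shared infrastructure through nova.conf. A minimal sketch using Essex-era option names; the hostname, password, and interface names are placeholders, and a real deployment needs many more options than shown here:

```ini
[DEFAULT]
# Shared state for nova-api, nova-scheduler, etc.
sql_connection=mysql://nova:NOVA_PASS@controller/nova
rabbit_host=controller

# Image service and authentication
glance_api_servers=controller:9292
auth_strategy=keystone

# Networking: VLAN-per-project, managed by nova-network
network_manager=nova.network.manager.VlanManager
public_interface=eth0
vlan_interface=eth1
```

As compute nodes are added, they reuse the same file with only the node-local settings changed.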
Nova Typical Deployment
Typical non-highly-available deployment
Add compute nodes as you grow
All services on one server
Hardware Firewall required for management network
High-Availability Deployment of Nova Services
Allows for maximum uptime and service availability
Note: Nova network and volume not shown
Nova Availability Architecture
Compute Images
No standard except for Ubuntu: http://cloud-images.ubuntu.com/
Can add them to Glance and they will just work on Nova compute
Can modify an image by mounting it:
mount -o loop nameofimage.img /mnt
Can install packages into the mounted image (e.g. with --root=/mnt)
Cloud-init packages pull metadata
CentOS and Debian: create via KVM and libvirt; can use kickstart files
No automated way to pull metadata (right now)
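Pulling metadata here means querying the EC2-style metadata service Nova exposes at the link-local address 169.254.169.254, which is what cloud-init does on Ubuntu images and what you would do by hand on images without it. A Python sketch of the idea (not cloud-init's actual code); the stubbed opener keeps the example self-contained:

```python
import io
import urllib.request

# Standard EC2-style metadata endpoint exposed to instances.
METADATA_URL = "http://169.254.169.254/latest/meta-data/"

def fetch_metadata(key, opener=urllib.request.urlopen):
    """Fetch one metadata key, e.g. 'hostname' or 'public-keys/'."""
    with opener(METADATA_URL + key) as resp:
        return resp.read().decode().strip()

# Stubbed example so the sketch runs without a metadata service:
def fake_opener(url):
    return io.BytesIO(b"test-host\n")

print(fetch_metadata("hostname", opener=fake_opener))  # test-host
```

On a real instance you would drop the stub and call fetch_metadata("hostname") directly.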
End of part 1. If interested, part 2 will cover nova-volume, nova-network, and Quantum (just starting to explore). Post here: http://www.meetup.com/OpenStackDC/
Questions?
[email protected] www.blackmesh.com
The End
OpenStack Bare-Metal Provisioning Framework
Mikyung Kang, David Kang, and Stephen Crago
USC/ISI
June 7th, 2012
Nova-Compute Selection
Create a Nova-Compute driver to manage Bare-Metal machines
Create a filter to classify virtual and Bare-Metal machines
* Reference: Joint(NTT+ISI) bare-metal provisioning framework session in Design Summit 2012
Bare-Metal Flags
--instance_type_extra_specs=cpu_arch:x86_64
--instance_type_extra_specs=cpu_arch:tilepro64
--instance_type_extra_specs=cpu_arch:ARM
Instance Request
Instance types & extra specs
Instance types for Bare-Metal machines
• vcpus: the unit of allocation is a whole BM machine
• A BM system runs a single (SMP) OS
• Usually 1
Use instance_type_extra_specs for more information
• cpu_arch: heterogeneous architecture support
• vcores: number of cores in a BM machine
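The extra specs above drive scheduling: a Python sketch of the cpu_arch matching that a bare-metal host filter performs. The host names and capability dictionaries are invented for illustration; the real filter lives in Nova's scheduler code:

```python
def host_passes(host_capabilities, instance_type):
    """Keep a host only if it satisfies every extra spec."""
    extra_specs = instance_type.get("extra_specs", {})
    return all(host_capabilities.get(k) == v for k, v in extra_specs.items())

# Hypothetical hosts advertising their architecture as a capability.
hosts = {
    "uv-node":   {"cpu_arch": "x86_64"},
    "tile-node": {"cpu_arch": "tilepro64"},
}

# An instance type flagged with cpu_arch:tilepro64, as shown above.
bm_type = {"name": "tile.1x1", "extra_specs": {"cpu_arch": "tilepro64"}}

eligible = [h for h, caps in hosts.items() if host_passes(caps, bm_type)]
print(eligible)  # ['tile-node']
```

This is how a single scheduler can dispatch to x86_64, tilepro64, and ARM back ends from one request path.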
Capability & Domain
Pre-populated text file for bare-metal machine information; plan to move it to a DB
Image Provisioning: Tilera
Image Provisioning: PXE
euca-run-instances -t b1.tiny --ramdisk ari-bare --kernel aki-bare ami-a
* Reference: Joint(NTT+ISI) bare-metal provisioning framework session in Design Summit 2012
Current status
General Bare-Metal Provisioning Framework (DONE)
• USC/ISI: OpenStack Upstream: nova/virt/baremetal/*
• Nova-compute w/ bare-metal plug-in (proxy), virtual domain stuff, and tilera-specific back-end code
New features for PXE:X86 machines (DOING)
• NTT docomo: PXE provisioning code with added features such as volume attachment, network isolation, and vnc access
• Waiting for approval to make them open-source (~6/8 or 6/11)
New features for PXE:ARM machines (DOING)
• Calxeda: ARM back-end code
• USC/ISI: ARM instance types and scheduler side
Fault-tolerance of Nova-Compute (bare-metal)
• USC/ISI: bare-metal information DB, fault-detection (master/mirror nova-compute) and fault-recovery
Current Collaboration
USC/ISI: Mikyung Kang <[email protected]>, David Kang <[email protected]>
NTT docomo: Ken Ash <[email protected]>, Mana Kaneko <[email protected]>
Calxeda: Ripal Nathuji <[email protected]>, Bob Blair <[email protected]>
Canonical: Chuck Short <[email protected]>
Mirantis: Roman Bogorodskiy <[email protected]>
THANK YOU FOR COMING!
Please stay tuned for the next Meet Up!
You will receive a survey & your feedback is greatly appreciated!
Follow us on…
http://twitter.com/OpenStackDC
http://meetup.com/OpenStackDC
http://linkedin.com/groups/OpenStack-DC-4207039
http://www.meetup.com/OpenStackDC/suggestion/
http://www.meetup.com/OpenStackDC/messages/boards/