OpenStack DC Meet Up
August 9th at 6:30pm @warehousedc
www.meetup.com/OpenStackDC
www.twitter.com/OpenStackDC
THANK YOU TO OUR SPONSOR,
Meet our OpenStack DC Organizers
Haisam Ido
Kapil Thangavelu
Matthew Metheny
Eric Mandel
Jason Ford
Kenna McCabe
Ryan Day
PRESENTATIONS
"Ansible, Vagrant and OpenStack on your laptop"
by Lorin Hochstein (@lhochstein)
"High-Performance, Heterogeneous Computing and OpenStack"
by David Kang, Cloud Computing and HPC Engineer at University of Southern California / Information Sciences Institute
Vagrant, Ansible and OpenStack on your laptop
Lorin Hochstein Nimbis Services
Twitter: @lhochstein
Setting up OpenStack for production is complex and error-prone
2012-08-04 12:31:56 INFO nova.rpc.common [-] Reconnecting to AMQP server on localhost:5672
2012-08-04 12:31:56 ERROR nova.rpc.common [-] AMQP server on localhost:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 30 seconds.
2012-08-04 12:31:56 TRACE nova.rpc.common Traceback (most recent call last):
2012-08-04 12:31:56 TRACE nova.rpc.common File "/usr/lib/python2.7/dist-packages/nova/rpc/impl_kombu.py", line 446, in reconnect
2012-08-04 12:31:56 TRACE nova.rpc.common self._connect()
2012-08-04 12:31:56 TRACE nova.rpc.common File "/usr/lib/python2.7/dist-packages/nova/rpc/impl_kombu.py", line 423, in _connect
2012-08-04 12:31:56 TRACE nova.rpc.common self.connection.connect()
2012-08-04 12:31:56 TRACE nova.rpc.common File "/usr/lib/python2.7/dist-packages/kombu/connection.py", line 154, in connect
2012-08-04 12:31:56 TRACE nova.rpc.common return self.connection
2012-08-04 12:31:56 TRACE nova.rpc.common File "/usr/lib/python2.7/dist-packages/kombu/connection.py", line 560, in connection
2012-08-04 12:31:56 TRACE nova.rpc.common self._connection = self._establish_connection()
2012-08-04 12:31:56 TRACE nova.rpc.common File "/usr/lib/python2.7/dist-packages/kombu/connection.py", line 521, in _establish_connection
2012-08-04 12:31:56 TRACE nova.rpc.common conn = self.transport.establish_connection()
2012-08-04 12:31:56 TRACE nova.rpc.common File "/usr/lib/python2.7/dist-packages/kombu/transport/pyamqplib.py", line 255, in establish_connection
2012-08-04 12:31:56 TRACE nova.rpc.common connect_timeout=conninfo.connect_timeout)
2012-08-04 12:31:56 TRACE nova.rpc.common File "/usr/lib/python2.7/dist-packages/kombu/transport/pyamqplib.py", line 52, in __init__
2012-08-04 12:31:56 TRACE nova.rpc.common super(Connection, self).__init__(*args, **kwargs)
2012-08-04 12:31:56 TRACE nova.rpc.common File "/usr/lib/python2.7/dist-packages/amqplib/client_0_8/connection.py", line 129, in __init__
2012-08-04 12:31:56 TRACE nova.rpc.common self.transport = create_transport(host, connect_timeout, ssl)
2012-08-04 12:31:56 TRACE nova.rpc.common File "/usr/lib/python2.7/dist-packages/amqplib/client_0_8/transport.py", line 281, in create_transport
2012-08-04 12:31:56 TRACE nova.rpc.common return TCPTransport(host, connect_timeout)
2012-08-04 12:31:56 TRACE nova.rpc.common File "/usr/lib/python2.7/dist-packages/amqplib/client_0_8/transport.py", line 85, in __init__
2012-08-04 12:31:56 TRACE nova.rpc.common raise socket.error, msg
2012-08-04 12:31:56 TRACE nova.rpc.common error: [Errno 111] ECONNREFUSED
Shell scripts are painful; Puppet & Chef have steep learning curves
if [[ $EUID -eq 0 ]]; then
    ROOTSLEEP=${ROOTSLEEP:-10}
    echo "You are running this script as root."
    echo "In $ROOTSLEEP seconds, we will create a user 'stack' and run as that user"
    sleep $ROOTSLEEP
    # since this script runs as a normal user, we need to give that user
    # ability to run sudo
    if [[ "$os_PACKAGE" = "deb" ]]; then
        dpkg -l sudo || apt_get update && install_package sudo
    else
        rpm -qa | grep sudo || install_package sudo
    fi
    if ! getent passwd stack >/dev/null; then
        echo "Creating a user called stack"
        useradd -U -s /bin/bash -d $DEST -m stack
    fi
Source: devstack/stack.sh
Example Ansible play: install ntp
---
- hosts: controller
  tasks:
  - name: ensure ntp packages is installed
    action: apt pkg=ntp
  - name: ensure ntp.conf file is present
    action: copy src=files/ntp.conf dest=/etc/ntp.conf
            owner=root group=root mode=0644
  - name: ensure ntp service is restarted
    action: service name=ntp state=restarted
Specify hosts in an inventory file
[controller]
192.168.206.130
[compute]
192.168.206.131
192.168.206.132
192.168.206.133
192.168.206.134
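To make the inventory format concrete, here is a minimal Python sketch of how a file of this shape maps group names to hosts. It is an illustration only, not Ansible's own parser; the function name parse_inventory is hypothetical.

```python
def parse_inventory(text):
    """Map each [group] header to the list of hosts listed beneath it."""
    groups = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]
            groups.setdefault(current, [])
        elif current is not None:
            groups[current].append(line)
    return groups


# The inventory from the slide, reproduced as a string.
INVENTORY = """\
[controller]
192.168.206.130
[compute]
192.168.206.131
192.168.206.132
192.168.206.133
192.168.206.134
"""

groups = parse_inventory(INVENTORY)
print(groups["controller"])
print(len(groups["compute"]))
```

A play with "hosts: controller" would then target only the single address in the controller group.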
Run the playbook
$ ansible-playbook ntp.yaml
PLAY [controller] *********************
GATHERING FACTS *********************
ok: [192.168.206.130]
TASK: [ensure ntp packages is installed] *********************
ok: [192.168.206.130]
TASK: [ensure ntp.conf file is present] *********************
ok: [192.168.206.130]
TASK: [ensure ntp service is restarted] *********************
ok: [192.168.206.130]
PLAY RECAP *********************
192.168.206.130 : ok=4 changed=3 unreachable=0 failed=0
What did Ansible just do?
1. Made SSH connections to remote host
2. Copied over Python modules and arguments parsed from playbook file
3. Executed modules on remote machine
A single action can be run with the ansible command
$ ansible controller -m apt -a "pkg=ntp"
192.168.206.130 | success >> {
"changed": false,
"item": "",
"module": "apt"
}
Ansible scripts are idempotent: can run multiple times safely
$ ansible-playbook ntp.yaml
PLAY [controller] *********************
GATHERING FACTS *********************
ok: [192.168.206.130]
TASK: [ensure ntp packages is installed] *********************
ok: [192.168.206.130]
TASK: [ensure ntp.conf file is present] *********************
ok: [192.168.206.130]
TASK: [ensure ntp service is restarted] *********************
ok: [192.168.206.130]
PLAY RECAP *********************
192.168.206.130 : ok=4 changed=1 unreachable=0 failed=0
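The idea behind idempotent tasks can be sketched in a few lines of Python: check the current state first, act only if it differs from the desired state, and report whether anything changed. This is a simplified illustration, not the code of Ansible's copy module; the function name ensure_file and the file content are hypothetical.

```python
import os
import tempfile


def ensure_file(path, content):
    """Write content to path only if needed; return True when a change was made."""
    if os.path.exists(path):
        with open(path) as f:
            if f.read() == content:
                return False  # already in the desired state, nothing to do
    with open(path, "w") as f:
        f.write(content)
    return True


# Hypothetical target file and content, used only for this demonstration.
workdir = tempfile.mkdtemp()
target = os.path.join(workdir, "ntp.conf")
first = ensure_file(target, "server 0.pool.ntp.org\n")
second = ensure_file(target, "server 0.pool.ntp.org\n")
print(first, second)
```

The first call reports a change and the second does not. A task such as "state=restarted" acts on every run, which is presumably the one change still counted in the second recap above.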
Use handlers if action should only occur on a state change
---
- hosts: controller
  tasks:
  - name: ensure glance database is present
    action: mysql_db name=glance
    notify:
    - version glance database
  handlers:
  - name: version glance database
    action: command glance-manage version_control 0
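The notify/handler relationship can be sketched as follows: a task that reports a change queues the handlers it names, and queued handlers run once after the tasks. This is a simplified model under that assumption, not Ansible's implementation; run_play and the in-memory "databases" set are hypothetical stand-ins.

```python
def run_play(tasks, handlers):
    """Run tasks in order, then run each notified handler once."""
    notified = set()
    for task in tasks:
        if task["action"]():  # the action returns True when it changed something
            notified.update(task.get("notify", []))
    ran = []
    for name, handler in handlers.items():
        if name in notified:
            handler()
            ran.append(name)
    return ran


databases = set()  # stand-in for the MySQL server state
log = []           # records the commands the handler would run


def ensure_glance_db():
    if "glance" in databases:
        return False  # database already present, no change
    databases.add("glance")
    return True


def version_glance_db():
    log.append("glance-manage version_control 0")


tasks = [{"action": ensure_glance_db, "notify": ["version glance database"]}]
handlers = {"version glance database": version_glance_db}

first_run = run_play(tasks, handlers)
second_run = run_play(tasks, handlers)
print(first_run, second_run, log)
```

On the second run the database already exists, so the task reports no change and the version_control command is not repeated.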
Use templates to substitute variables in config file
keystone.conf:
[DEFAULT]
public_port = 5000
admin_port = 35357
admin_token = {{ admin_token }}
keystone.yaml:
- hosts: controller
  vars:
    admin_token: 012345SECRET99TOKEN012345
  tasks:
  - name: ensure keystone config script is present
    action: template src=keystone.conf dest=/etc/keystone/keystone.conf
            owner=root group=root mode=0644
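Ansible renders templates with Jinja2. As a rough illustration of the substitution step only, the sketch below uses a standard-library regular expression as a stand-in; it handles plain {{ name }} placeholders and nothing else, and the function name render is hypothetical.

```python
import re


def render(template, variables):
    """Replace each {{ name }} placeholder with the matching variable."""
    def substitute(match):
        return str(variables[match.group(1)])
    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)


# The keystone.conf template and variable from the slide.
TEMPLATE = """\
[DEFAULT]
public_port = 5000
admin_port = 35357
admin_token = {{ admin_token }}
"""

rendered = render(TEMPLATE, {"admin_token": "012345SECRET99TOKEN012345"})
print(rendered)
```

The rendered text is what would land in /etc/keystone/keystone.conf, with the placeholder replaced by the value from the play's vars section.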
Ansible ships with many modules and can also run arbitrary shell commands
• apt & yum packages
• Stop/start/restart services
• users & groups
• Add SSH public keys
• MySQL & PostgreSQL users & databases
• VMs managed by libvirt
• Git checkouts
Import a new virtual machine (Ubuntu 12.04 64-bit)
$ vagrant box add precise64 http://files.vagrantup.com/precise64.box
Make a Vagrantfile
Vagrant::Config.run do |config|
  config.vm.box = "precise64"
end
Vagrant can also generate this for you: “vagrant init precise64”
Boot it and connect to it
$ vagrant up
[default] Importing base box 'precise64'...
[default] Matching MAC address for NAT networking...
[default] Clearing any previously set forwarded ports...
[default] Fixed port collision for 22 => 2222. Now on port 2200.
[default] Forwarding ports...
[default] -- 22 => 2200 (adapter 1)
[default] Creating shared folders metadata...
[default] Clearing any previously set network interfaces...
[default] Booting VM...
[default] Waiting for VM to boot. This can take a few minutes.
[default] VM booted and ready for use!
[default] Mounting shared folders...
[default] -- v-root: /vagrant
$ vagrant ssh
Welcome to Ubuntu 12.04 LTS (GNU/Linux 3.2.0-23-generic x86_64)
* Documentation: https://help.ubuntu.com/
Welcome to your Vagrant-built virtual machine.
Last login: Thu Jun 7 00:49:30 2012 from 10.0.2.2
vagrant@precise64:~$
Boot multi-VMs: configure IPs, memory, hostname
Vagrant::Config.run do |config|
  config.vm.box = "precise64"
  config.vm.define :controller do |controller_config|
    controller_config.vm.network :hostonly, "192.168.206.130"
    controller_config.vm.host_name = "controller"
  end
  config.vm.define :compute1 do |compute1_config|
    compute1_config.vm.network :hostonly, "192.168.206.131"
    compute1_config.vm.host_name = "compute1"
    compute1_config.vm.customize ["modifyvm", :id, "--memory", 1024]
  end
end
Config: controller, one compute host, QEMU, FlatDHCP
[Diagram: two VMs, controller and compute1, each with interfaces eth0, eth1 and eth2]
eth0: NAT (both VMs)
192.168.206.*: controller .130, compute1 .131
192.168.100.*: controller .130, compute1 .131
Vagrantfile describes this setup
Vagrant::Config.run do |config|
  config.vm.box = "precise64"
  config.vm.define :controller do |controller_config|
    controller_config.vm.network :hostonly, "192.168.206.130"
    controller_config.vm.host_name = "controller"
  end
  config.vm.define :compute1 do |compute1_config|
    compute1_config.vm.network :hostonly, "192.168.206.131"
    compute1_config.vm.host_name = "compute1"
    compute1_config.vm.customize ["modifyvm", :id, "--memory", 1024]
    compute1_config.vm.customize ["modifyvm", :id, "--nicpromisc3", "allow-all"]
  end
end
If all goes well…
$ make all
. . .
+-------------------------------------+--------------------------------------+
| Property | Value |
+-------------------------------------+--------------------------------------+
| OS-DCF:diskConfig | MANUAL |
| OS-EXT-SRV-ATTR:host | None |
| OS-EXT-SRV-ATTR:hypervisor_hostname | None |
| OS-EXT-SRV-ATTR:instance_name | instance-00000001 |
| OS-EXT-STS:power_state | 0 |
| OS-EXT-STS:task_state | scheduling |
| OS-EXT-STS:vm_state | building |
| accessIPv4 | |
| accessIPv6 | |
| adminPass | CJ8NNNa4dc6f |
| config_drive | |
| created | 2012-08-09T02:51:14Z |
| flavor | m1.tiny |
| hostId | |
| id | 8e9238b8-208d-46a8-8f66-c40660abacff |
| image | cirros-0.3.0-x86_64 |
| key_name | mykey |
| metadata | {} |
| name | cirros |
| progress | 0 |
| status | BUILD |
| tenant_id | 6f29ce771aba46f29f53e178e3b02e66 |
| updated | 2012-08-09T02:51:14Z |
| user_id | ad809727c0a748c9ad12834b6f24b3a1 |
+-------------------------------------+--------------------------------------+
Links
• Vagrantfile & Ansible playbooks for OpenStack: http://github.com/lorin/openstack-ansible
• Ansible: http://ansible.github.com
• Vagrant: http://vagrantup.com
• Ansible playbook examples: https://github.com/ansible/ansible/tree/devel/examples/playbooks
• Vagrant boxes: http://vagrantbox.es
Heterogeneous, High-Performance Cloud Computing using OpenStack
Dong-In Kang, Steve Crago, John P. Walters, Mikyung Kang, Jinwoo Suh, Jeff Burney, and Karandeep Singh
University of Southern California / Information Sciences Institute
August 9, 2012
Objectives
Heterogeneous, virtualized high performance computing (HPC) testbed
HPC resources available through private cloud
- Resources available remotely for operations, prototypes, experiments and disadvantaged users
- Dynamic resource provisioning
- Non-proprietary open source cloud software that can be replicated and extended as needed
Heterogeneous processing resources
- Large x86-based shared memory machine (SGI UV100)
- General-purpose many-core (Tilera TILEmpower)
- GPU-based accelerators (NVidia Tesla)
- Architectures other than x86 (ARM, …)
Heterogeneous Processing Testbed
[Figure: Heterogeneous On-Demand Processing Testbed. Components: shared memory (1 SGI UV100), HPC cluster, tiled processor (10 Tilera TILEmpower), commodity cluster and storage, storage array, GPU cluster (3 Tesla S2050)]
Heterogeneous Processors
• 1 SGI Altix UV 100 (Intel Xeon Nehalem, 128 cores)
• 10 TILEmpower boards (Tilera TILEPro64, 640 cores)
• 3 Tesla 2050s (NVidia Fermi GPUs, 5,376 cores)
• Commodity cluster (Intel Xeon Clovertown, 80 cores)
Processing Component | Characteristics
SGI UV 100 | Shared memory, traditional HPC, x86 processors that support legacy code. Supports KVM and LXC.
Tilera TILEmpower | General-purpose many-core, 10x-100x improvement in power efficiency for integer processing, Linux-based C/C++ development environment. Supports bare-metal provisioning.
Nvidia Tesla 2050 | Very high performance and efficiency (100x) for regular computational kernels, CUDA development environment. Supports LXC (host).
Heterogeneity: Architectures
CPU: 1010 samples, 136.2 seconds
GPU: 108 samples, 139.5 seconds
[Figure panels: SGI UV100 rendering 1926 objects; Tilera vs. x86 video transcoding]
Using Heterogeneous Architecture in OpenStack
New machine types
- Each machine type requires (or can handle) a unique image type (e.g. a GPU requires a GPU executable)
- Each machine type has an image boot process
Machine Types:
• SGI Ultra Violet: sh1.small, sh1.large, …
• Tilera TILEmpower: tp64.8x8, …
• Nvidia Tesla GPU: cg1.large+s2050
Heterogeneity: Virtualization
3D parallel rendering system
— Tachyon v. 0.99
— Rendering a scene with 1926 objects
— Shared memory test
[Figure: Speedup of 3D Rendering (Tachyon). x-axis: Number of H/W Threads Used (1, 16, 32, 64); y-axis: Speedup (0 to 70). Series: Native (w/o pinning), KVM w/ pinning, LXC w/ pinning (2 times h/w threads), LXC w/ pinning, LXC w/o pinning]
Heterogeneity: GPU Access Methods
[Figure: Host to Device Bandwidth, Pageable. x-axis: Bytes; y-axis: MB/sec (0 to 4000). Series: Host, LXC, gVirtus]
[Figure: Matrix Multiply for Increasing NxM. x-axis: Size (NxM), Single Precision Real, from 80x160 to 800x1600; y-axis: GFlops/sec (0 to 200). Series: Host, gVirtus, LXC]
How to Support Heterogeneity in OpenStack
Scheduler
- Uses the ‘instance_type_extra_specs’ table in the ‘nova’ DB
| 15 | cpu_arch | s== x86_64 |
| 15 | hypervisor_type | s== LXC |
| 15 | gpu_arch | s== fermi |
| 15 | gpus | = 1 |
- /etc/nova/nova.conf
instance_type_extra_specs=cpu_arch:x86_64, gpus:4, gpu_arch:fermi
- Schedules an instance only if the table entries and the nova.conf values match
- Blueprint (Under Code Review)
https://blueprints.launchpad.net/nova/+spec/instance-type-extra-specs-extension
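The matching rule can be sketched in Python as follows. This is a rough illustration under stated assumptions, not the code from the blueprint: "s==" is treated as string equality, "=" is assumed to mean a numeric "at least" comparison (which is how a request for 1 GPU would match a host advertising 4), and the host's hypervisor_type is assumed to be reported alongside the nova.conf values. The function and variable names are hypothetical.

```python
def spec_matches(requirement, capability):
    """Check one extra-spec requirement such as 's== fermi' or '= 1'."""
    op, _, wanted = requirement.partition(" ")
    if op == "s==":
        return str(capability) == wanted  # exact string match
    if op == "=":
        # Assumption: "=" is a numeric "at least" comparison.
        return float(capability) >= float(wanted)
    raise ValueError("unsupported operator: %r" % op)


def host_satisfies(extra_specs, capabilities):
    """A host qualifies only if every requirement is met."""
    return all(
        key in capabilities and spec_matches(req, capabilities[key])
        for key, req in extra_specs.items()
    )


# Requirements for instance type 15, taken from the table above.
extra_specs = {
    "cpu_arch": "s== x86_64",
    "hypervisor_type": "s== LXC",
    "gpu_arch": "s== fermi",
    "gpus": "= 1",
}
# Capabilities from the nova.conf line, plus an assumed hypervisor_type.
gpu_host = {"cpu_arch": "x86_64", "hypervisor_type": "LXC",
            "gpus": "4", "gpu_arch": "fermi"}
# A hypothetical host without GPUs, for contrast.
cpu_host = {"cpu_arch": "x86_64", "hypervisor_type": "KVM"}

print(host_satisfies(extra_specs, gpu_host))
print(host_satisfies(extra_specs, cpu_host))
```

Under these assumptions the GPU host qualifies for the instance type and the CPU-only host is filtered out.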
Baremetal Provisioning
- USC/ISI + NTT Docomo
- Blueprint (Under Code Review)
https://blueprints.launchpad.net/nova/+spec/general-bare-metal-provisioning-framework
Future Plans
• Additional devices
• FPGAs
• ARM cores (Calxeda)
• Next-generation GPUs
• New host virtualization options with GPUs
• Collaboration with Nvidia
• Resource scheduling
• Security hardening
• Application demonstrations
• Deployment
THANK YOU FOR COMING!
Please stay tuned for the next Meet Up!
You will receive a survey & your feedback is greatly appreciated!
Follow us on…
http://twitter.com/OpenStackDC
http://meetup.com/OpenStackDC
http://linkedin.com/groups/OpenStack-DC-4207039
http://www.meetup.com/OpenStackDC/suggestion/
http://www.meetup.com/OpenStackDC/messages/boards/