OpenNebulaConf 2014 - One BIT to rule them all - Stefan Kooman

to rule them all Stefan Kooman ([email protected], @basseroet)

Upload: opennebula-project

Post on 14-Jul-2015

● BIT is a business to business internet service provider specialized in colocation and managed hosting

● BIT delivers a high quality IT and internet infrastructure for demanding customers

● Reliability is the focus of BIT’s services (redundancy is the keyword)

● Operates its own datacenters (Ede, NL) and network (NL, DE, EN)

● IPv6 on all services since 2004!

● ISO 27001 certification (All Services)

BIT

AS12859

~900 peers

3 Transits

BIT Network

● Customers want to know where their data gets stored (compliance / privacy concerns)

● Alternative for shared hosting (customers with special requirements)

● Hybrid Solutions: bare metal servers & BIT VMs possible

● Availability (redundancy)

● ISO 27001

Why choose BIT? (instead of $(PUBLIC_)CLOUD)

● Webshops

● Mission Critical Servers

● SMB Infrastructure servers

● Monitoring servers

● MongoDB (because no presentation is complete without mentioning it, Carlo Daffara, OpenNebulaConf, Berlin 2014)

What runs in our clouds?

● Simple but powerful / flexible (KISS)

● Works out of the box

● Reliable

● (API) Interface(s)

● OSS

● Great community / development organization (OpenNebula Systems)

Why we choose ONE

Cloud setups

Cloudy?

Cloud security

Security (confine risks) → Separate VLANs, Storage, Servers, etc.

Protect against “virtual machine escape” attack (worst case scenario)

Don't break production while exploring new features / setups (test-cloud)

BIT-Cloud (Migrating → ONE)

Before

● Ad-hoc management of KVM hypervisors (virt-manager)

● Mixing of BIT / Customer VMs

● No easy overview of resources (capacity / business continuity planning)

● No integration with BACE (BIT Administration & Configuration Engine)

BIT-Cloud (eat your own dog food)

After

● All VMs centrally managed

● Integration With BACE (hooks, XML-RPC)

● Web interface (Sunstone) available for (remote) management (GUI tasks) / low-level VM troubleshooting (GRML FTW!)

BIT-Cloud (How we migrated it 1/2)

Migration Process (non pxe-based VMs)

● Create VM Templates based on old libvirt xml (virsh dumpxml domain)

● Create Images

● dd if=/old/vm/disk.img of=/var/lib/one/datastores/id/hash bs=1M

● Destroy _and_ Undefine old VM

● Instantiate VM Template

● Profit!
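
The steps above could be scripted roughly as follows. This is a hedged dry-run sketch, not the procedure BIT actually used: it registers the disk with `oneimage create` instead of the `dd` copy shown on the slide, and the `run` helper, the `DATASTORE` default, the image name and the template file argument are all illustrative assumptions. It also assumes the ONE template is named after the old libvirt domain.

```shell
# Print each command instead of executing it unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# migrate_vm <libvirt domain> <old disk image> <ONE template file>
migrate_vm() {
    domain=$1
    old_disk=$2
    template_file=$3   # written by hand, based on the dumped libvirt XML

    # Dump the old definition as a reference for the VM template.
    run virsh dumpxml "$domain"
    # Register the template and the disk image in ONE.
    run onetemplate create "$template_file"
    run oneimage create --name "$domain-disk" --path "$old_disk" -d "${DATASTORE:-default}"
    # Destroy _and_ undefine the old VM so it cannot come back by accident.
    run virsh destroy "$domain"
    run virsh undefine "$domain"
    # Start the VM under ONE (assumes the template NAME equals the domain name).
    run onetemplate instantiate "$domain"
}
```

With the default `DRY_RUN=1`, `migrate_vm web01 /old/vm/disk.img /tmp/web01.tmpl` only prints the six commands, which makes it easy to review the plan before running anything for real.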

BIT-Cloud (How we migrated it 2/2)

Migration Process (pxe-based VMs) Cloud style!

● Create VM Templates based on old libvirt xml (virsh dumpxml domain)

● Destroy _and_ Undefine old VM

● Instantiate VM Template

● Profit!

Over and out (get rid of your junk)

Good bye junk, Good bye SUN, welcome to a bright cloudy day … ehh whut?!

A bright cloudy day (it actually does exist)

Customer Portal Interface to securely manage all services (DNS, MAIL, VMs, MONITORING, BILLING, etc.)

● For now only

→ stop, start, reboot

→ Out of band management: Console access (KVM)

● Future

→ full-fledged provisioning / monitoring / metrics

create, destroy, clone, resize capacity, etc.

ONE & BIT Integration

Multiple Datastores

● NetApp Qtree (NFS)

→ Provide separation between customers (“partitioning”)

→ Billing (IOPS / Disk Space)

→ Tiering possible (SAS, SSD/SATA)

→ Dedup: 41% savings on customer images, 61% on BIT images

● Future

→ Distributed Object Storage (Ceph, Gluster)

ONE – Storage

Active checks on Front-end / Hosts (bit-monitoring daemon livestatus ↔ API)

● Vital ONE functions (oned, sched, VM status, Datastores Capacity, etc.)

● Hypervisor hardware, Network Bonds

Passive checks

● Web services (VIPs)

Icinga detects NTP out-of-sync issues within 1 minute after VM live-migration!

ONE – Monitoring (Icinga FTW!)

● Graph as many metrics as possible (because we can)

→ OS, Apps, Storage, Network

● Trend analysis

● Finding performance issues

ONE – Graphing (Munin FTW!)

● Billing network traffic (volume / bandwidth)

● Pay-per-use billing model instead of a 3-monthly contract (possible in future)

ONE – Accounting (one.vm.monitoring)
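
As a rough illustration of pulling monitoring data for billing, the sketch below builds the XML-RPC request body for `one.vm.monitoring`, which takes a session string and a VM ID. The function name, the placeholder credentials and the endpoint in the usage comment are assumptions for illustration; how BIT turns the returned samples into invoices is not described in the talk.

```shell
# vm_monitoring_request <session string> <VM id>
# Prints an XML-RPC methodCall body for one.vm.monitoring.
# Note: the session string is inserted as-is, so XML special characters
# (&, <, >) in it would need escaping first.
vm_monitoring_request() {
    session=$1
    vm_id=$2
    cat <<EOF
<?xml version="1.0"?>
<methodCall>
  <methodName>one.vm.monitoring</methodName>
  <params>
    <param><value><string>$session</string></value></param>
    <param><value><int>$vm_id</int></value></param>
  </params>
</methodCall>
EOF
}

# Usage (endpoint shown is the common default; adjust for your front-end):
#   vm_monitoring_request "user:password" 42 |
#     curl -s -H 'Content-Type: text/xml' --data-binary @- http://localhost:2633/RPC2
```

The response contains the stored monitoring records for the VM, from which traffic counters can be read and compared between two points in time.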

● Provisioning Requirements (VM Templates)

→ Datastores

→ Clusters

→ Hosts

● Custom Attributes (awesome \o/)

→ SCHED_REQUIREMENTS="WINDOWSLICENSED=\"TRUE\""

→ SCHED_REQUIREMENTS="DATACENTER=\"BIT-1\""

→ SCHED_DS_REQUIREMENTS="NAME=system_ds_1_kvm_cluster"

ONE – Features (Filtering with scheduler)
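
To show how such requirement strings can be composed, here is a small sketch that joins several custom host attributes with `&` (logical AND in ONE requirement expressions) and escapes the inner quotes for use in a VM template. The helper names `sched_term` and `sched_requirements` are made up for this example.

```shell
# sched_term <ATTRIBUTE> <value>  ->  ATTRIBUTE="value"
sched_term() {
    printf '%s="%s"' "$1" "$2"
}

# sched_requirements <term>...  ->  one SCHED_REQUIREMENTS template line
sched_requirements() {
    joined=""
    for term in "$@"; do
        if [ -z "$joined" ]; then
            joined=$term
        else
            joined="$joined & $term"
        fi
    done
    # Escape the inner double quotes so the expression fits in a template string.
    escaped=$(printf '%s' "$joined" | sed 's/"/\\"/g')
    printf 'SCHED_REQUIREMENTS="%s"\n' "$escaped"
}
```

For example, `sched_requirements "$(sched_term DATACENTER BIT-1)" "$(sched_term WINDOWSLICENSED TRUE)"` prints a single line that restricts placement to licensed hosts in one datacenter. The attributes themselves still have to be set on the hosts, for instance with `onehost update`.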

VM_HOOK = [
   name      = "notify_running",
   on        = "RUNNING",
   state     = "ACTIVE",
   lcm_state = "RUNNING",
   command   = "notify_running.php",
   arguments = "$ID $TEMPLATE $PREV_STATE $PREV_LCM_STATE" ]

● Executes the script “notify_running.php” to register the VM in BACE and/or update its Host / Datacenter location as soon as the VM enters the running state

ONE – Features (Hooks)
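
The real hook is a PHP script that is not shown in the slides. As a hedged shell sketch of what such a hook receives, the functions below take the four arguments from the hook definition, decode `$TEMPLATE` (passed by ONE as base64-encoded XML) and pull out the VM name. The function names and the output format are assumptions, and the `sed` extraction is a simplification that assumes the first `<NAME>` element is the VM's own name; a real script would use a proper XML parser.

```shell
# Decode the base64 template and print the content of the first <NAME> element.
vm_name_from_template() {
    printf '%s' "$1" | base64 -d |
        sed -n -e 's:</NAME>.*::' -e 's:.*<NAME>::p' | head -n 1
}

# notify_running <ID> <TEMPLATE (base64)> <PREV_STATE> <PREV_LCM_STATE>
# Prints the record a hook could hand to an inventory system such as BACE.
notify_running() {
    vm_id=$1
    template_b64=$2
    prev_state=$3
    prev_lcm_state=$4
    printf 'vm_id=%s name=%s prev=%s/%s\n' \
        "$vm_id" "$(vm_name_from_template "$template_b64")" \
        "$prev_state" "$prev_lcm_state"
}
```

The previous state arguments let the script tell a fresh boot apart from, say, a VM returning to RUNNING after a live migration, which matters when only the host location needs updating.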

VM_HOOK = [
   name      = "send_gratuitous_arp",
   on        = "RUNNING",
   state     = "ACTIVE",
   lcm_state = "RUNNING",
   command   = "segrarp.sh",
   arguments = "$ID $TEMPLATE $PREV_STATE $PREV_LCM_STATE",
   remote    = "yes" ]

Send out a “gratuitous ARP” on the hypervisor as soon as the VM enters the running state

→ Update the forwarding tables of upstream switches

→ Update the ARP cache of routers (only needed if the MAC address changed)

ONE – Features (Hooks)

VM_HOOK = [
   name      = "vhid_flow_fix",
   on        = "RUNNING",
   state     = "ACTIVE",
   lcm_state = "RUNNING",
   command   = "vhidflowfix.sh",
   arguments = "$ID $TEMPLATE $PREV_STATE $PREV_LCM_STATE",
   remote    = "yes" ]

The ONE / Open vSwitch OpenFlow rules that prevent ARP cache poisoning also prevent HA setups (VRRP / CARP) from working correctly

→ Add an OpenFlow rule for the VRID / VHID MAC address (00-00-5E-00-01-XX)

ONE – Features (Hooks)
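
The XX in that address is the VRID (or CARP VHID) as two hex digits. The contents of `vhidflowfix.sh` are not shown in the talk, so the following is only a sketch of the idea: one function derives the virtual MAC, another prints an `ovs-ofctl add-flow` command that would allow frames from that MAC on a given port. The bridge name, port number and priority value are illustrative assumptions.

```shell
# vrrp_mac <vrid>  ->  virtual router MAC for that VRID / VHID (1-255)
vrrp_mac() {
    printf '00:00:5e:00:01:%02x\n' "$1"
}

# vrid_flow_cmd <bridge> <port> <vrid>
# Prints (does not run) a flow rule allowing the virtual MAC as source.
vrid_flow_cmd() {
    bridge=$1
    port=$2
    vrid=$3
    printf 'ovs-ofctl add-flow %s "in_port=%s,dl_src=%s,priority=40000,actions=normal"\n' \
        "$bridge" "$port" "$(vrrp_mac "$vrid")"
}
```

For VRID 10, `vrrp_mac 10` yields `00:00:5e:00:01:0a`. The chosen priority has to be high enough to win over the drop rule installed by the anti-spoofing setup.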

● Live migration over dedicated Network Interfaces

deploy_id=$1
dest_host=$2
HOSTNAME=$(cut -f1 -d. <<< $2)
DOMAIN=$(cut -f2- -d. <<< $2)
MIGSUF="migration"
DEST_MIGR_HOST=$HOSTNAME-$MIGSUF.$DOMAIN
exec_and_log "virsh --connect $LIBVIRT_URI migrate --live $deploy_id $QEMU_PROTOCOL://$DEST_MIGR_HOST/system --migrateuri tcp://$DEST_MIGR_HOST" "Could not migrate $deploy_id to $dest_host"

● Example

host1             IN    A    172.17.17.1
host1-migration   IN    A    10.10.10.1

ONE – Easily hackable (../remotes/vmm/kvm/kvmrc)
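
The hostname rewriting in that snippet can be isolated into a small function, which also makes it easy to test without touching libvirt. This is a sketch using POSIX parameter expansion instead of `cut`; the function name `migration_host` and the handling of hostnames without a domain part are additions for illustration, not part of the original script.

```shell
# migration_host <fqdn> [suffix]
# host1.example.com -> host1-migration.example.com
migration_host() {
    fqdn=$1
    suffix=${2:-migration}
    host=${fqdn%%.*}     # everything before the first dot (cut -f1 -d.)
    domain=${fqdn#*.}    # everything after the first dot  (cut -f2- -d.)
    if [ "$host" = "$fqdn" ]; then
        # No dot in the name: there is no domain part to append.
        printf '%s-%s\n' "$host" "$suffix"
    else
        printf '%s-%s.%s\n' "$host" "$suffix" "$domain"
    fi
}
```

The resulting name must resolve to the address on the dedicated migration network, as in the DNS example on the slide.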

ONE – OneGate

● European Kerio Cloud VMs run on ONE clouds @BIT

We are working on automating / speeding up provisioning process

● Provisioning through:

ONE API (XML-RPC)

OS (ONE contextualized Golden Image, and OneGate as an asynchronous communication channel)

● Configuration management:

Kerio API

ONE – OneFlow (yet to be implemented)

● Autoscaling of VMs based on elasticity rules

→ CPU alone is probably not enough as an indicator (cpufreq scaling on the hypervisor)

● Should work well in a load-balanced environment

→ Need to integrate with F5 somehow

● Set fixed machine type (QEMU) for MS Windows(TM) VMs to avoid breaking during Virtual Hardware upgrades

● Use Virtio (DISK / NIC) whenever you can

→ pfSense 2.1.5: virtio (1 Gbps) vs Intel e1000 (300 Mbps)

● Group VMs together that interconnect a lot: webserver(s) / database(s) (saves inter-host bandwidth, increases performance, lowers latency)

● Expose CPU flags when you need them (e.g. HPC / Rendering)

● Disable KSM (Kernel SamePage Merging) if you need all CPU cycles

Do's and Don'ts (lessons learned so far)
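
For the KSM advice, the kernel exposes a control file under sysfs. The sketch below wraps it in two small functions; the `KSM_RUN_FILE` variable is an assumption added so the functions can be pointed at a scratch file for testing. On a real hypervisor the write needs root, and a tuning service such as ksmtuned, if installed, may switch KSM back on.

```shell
# Path of the KSM control file; override it to test against a scratch file.
KSM_RUN_FILE=${KSM_RUN_FILE:-/sys/kernel/mm/ksm/run}

# Succeeds if the KSM daemon is currently set to run.
ksm_enabled() {
    [ "$(cat "$KSM_RUN_FILE")" = "1" ]
}

# Stop ksmd. Writing 0 keeps pages that are already merged;
# writing 2 instead would also unmerge them.
ksm_disable() {
    echo 0 > "$KSM_RUN_FILE"
}
```

A typical check on a host would be `ksm_enabled && ksm_disable`.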

Group VMs on Host

~ 2 GB/s or ~ 17 Gb/s

ONE - Test-cloud
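
The two figures on this slide are the same measurement in different units. A quick conversion sketch follows; it assumes the "~2 GB/s" was measured in binary units (2 GiB/s), which is what makes the result land near 17 Gb/s rather than 16. The function name is made up for this example.

```shell
# to_gbps <bytes per second>  ->  gigabits per second, one decimal
to_gbps() {
    LC_ALL=C awk -v b="$1" 'BEGIN { printf "%.1f\n", b * 8 / 1e9 }'
}

# 2 GiB/s = 2 * 1024^3 bytes/s
to_gbps 2147483648
```

Under the same conversion, a live migration running at ~600 MB/s corresponds to roughly 4.8 Gb/s.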

● 20 Gbps ethernet Migration links

→ Yes, live-migration goes much faster :-) (~600 MB/s)

● VLAN management on Brocade switch done by ONE (NETCONF / RFC 6241)

ONE - Test-cloud

● Administer ALL IPs in ONE

→ Enable IP aliases ((web)servers in need of extra IPs)

→ Possibility to have OpenFlow “ARP cache poisoning” and “IP hijack prevention” rules enabled

→ Contextualization adjustments needed to handle extra IPs

● Network Integration (SDN)

ONE – Challenges (yet to overcome)

● Inter OpenNebula Clouds: Ultimate Hybrid

→ Complementary to Federation: separate administrative boundaries

● ONE CX (Cloud Exchange)

ONE – Future (let's connect them all)

#1727: Resize disk images

#2347: [anti-]affinity functionality for VMs to be placed in the same physical host

#2650: Re-read oned.conf on reload

#3181: IPv6 hijacking prevention

#2921: (Per VM) DISKIO IO information in Sunstone

#2648: ACL edit/view wizard

#3015: Multi (domain) LDAP authentication support

#2925: Ability to filter in Sunstone on resource usage (CPU, RAM, NETWORK, DISK)

(and thanks for implementing “Multiple Datastores”, Federation, Clone between datastores, VLAN trunks, Address Range (AR), and countless others)

ONE – I want (feature requests, aka Sinterklaasverlanglijstje: a Sinterklaas wish list)

Thanks for your attention!

And thank you very much OpenNebula Systems /

Netways!