Livnat Peer & Arthur Berezin, Red Hat - Neutron High Availability - OpenStack Israel 2015
TRANSCRIPT
@Livnat_Peer, Sr. Engineering Manager, Red Hat
@ArthurBerezin, Sr. Technical Product Manager, Red Hat
Neutron High Availability
OpenStack Israel, Tel-Aviv, June 2015
Agenda
● HA Enabling Technologies: Pacemaker and HAProxy
● Neutron Built-in Mechanisms
– DHCP Agent HA
– L3 Agent with Virtual Router Redundancy Protocol (VRRP)
– Distributed Virtual Routing (DVR)
cc: Morio2015 Source: https://www.wikiwand.com/en/Scuderia_Ferrari
Losing Your Controller
https://www.youtube.com/watch?v=Kb43Nxuwc4I
High Availability
● Minimize downtime by avoiding single points of failure (SPOF)
● Service redundancy
○ Active-Active when possible
■ Stateless services
■ Built-in HA mechanisms
○ Active-Passive for others
● Scale-out architecture
○ Add nodes as you go
HA Enabling Technologies: Pacemaker, HAProxy
Pacemaker
● Cluster Resource Manager
● Uses Corosync for cluster communication
● Monitors and controls resources:
○ Floating Virtual IP Address (VIP)
○ systemd/LSB/OCF services
○ Cloned services (Active/Active)
● STONITH - Fencing with power management
○ Important for ensuring data consistency
Pacemaker OpenStack Service
● Virtual IP (VIP)
● systemd cloned resource
● STONITH fencing
[Diagram: Node 1 (192.168.1.1) and Node 2 (192.168.1.2) each run pcsd, STONITH, and a cloned Service; the Service Virtual IP is 10.0.0.1]
HAProxy Load Balancer
● Load balancing and proxy for HTTP/TCP
● Mature and popular with web applications
● Health checking
● Load distribution
○ Round robin
○ Stick-table
● API isolation
● Failure detection
[Diagram: HAProxy Load Balancer; Node 1, Node 2, and Node 3, with HAProxy in front of two Service instances]
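The health checking, round-robin distribution, and failure detection listed for HAProxy can be illustrated with a small sketch. This is not HAProxy's implementation: the class name, the controller names, and the injected health-check function are all hypothetical, chosen only to show how a round-robin picker can skip a failed backend.

```python
from itertools import cycle
from typing import Callable, Iterable


class RoundRobinBalancer:
    """Toy round-robin balancer that skips backends failing a health check."""

    def __init__(self, backends: Iterable[str], is_healthy: Callable[[str], bool]):
        self._backends = list(backends)
        if not self._backends:
            raise ValueError("at least one backend is required")
        self._is_healthy = is_healthy      # hypothetical health probe
        self._ring = cycle(self._backends)

    def pick(self) -> str:
        """Return the next healthy backend, or raise if none is healthy."""
        # Try each backend at most once per call so a full outage terminates.
        for _ in range(len(self._backends)):
            candidate = next(self._ring)
            if self._is_healthy(candidate):
                return candidate
        raise RuntimeError("no healthy backend available")


# Hypothetical controller names; controller-2 is simulated as down.
down = {"controller-2"}
balancer = RoundRobinBalancer(
    ["controller-1", "controller-2", "controller-3"],
    is_healthy=lambda name: name not in down,
)
picks = [balancer.pick() for _ in range(4)]
```

In this sketch the failed controller is simply skipped on each rotation, so clients keep being served by the remaining two.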
Avoiding SPOFs: A Day in a Highly Available Service Life
[Diagram sequence; the client request throughout is "Give Me Horizon Web UI NOW!"]
● One Neutron-Server controller: Single Point Of Failure
● Neutron-Server Controllers 1, 2, 3 behind HAProxy on Controller 1: HAProxy is a Single Point Of Failure, and each service instance could fail
● Pacemaker Cloned Horizon Service on Controllers 1, 2, 3; HAProxy on Controller 1 is still a Single Point Of Failure
● HAProxy on Controllers 1, 2, 3 as a Pacemaker Cloned HAProxy Service
● Horizon VIP in front of the HAProxy instances
Neutron Built-in Mechanisms
My HA Solution
● External mechanisms
● Neutron built-in mechanisms
● Reference implementation vs. vendor code
Architecture - Assuming Centralized Network Node
[Diagram:
Controller Node: Neutron server, MySQL server, RabbitMQ server
Compute Node: OVS agent, OVS
Network Node: OVS Agent, OVS, DHCP Agent, dnsmasq, Metadata Agent, Metadata Proxy, L3 Agent, keepalived
Networks: Internet / External Network, API Network, Management Network, Data Network]
DHCP Agent
● IP address allocation is done by the Neutron server
● dnsmasq is used as a distribution mechanism of predefined allocations
● The DHCP protocol allows multiple DHCP servers to co-exist while serving the same pool
● Configuration in Neutron
neutron.conf:
dhcp_agents_per_network = X
[Diagram: network node components - OVS Agent, OVS, DHCP Agent, dnsmasq, Metadata Agent, Metadata Proxy, L3 Agent, keepalived]
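Why several DHCP agents can serve one network safely can be sketched as follows. The allocation is decided once by the server, so every agent hosting the network hands out the same answer. The function names, agent names, and MAC/IP values below are hypothetical, and the random choice stands in for whatever scheduler a deployment actually uses.

```python
import random
from typing import Dict, List


def schedule_network(alive_agents: List[str], agents_per_network: int,
                     rng: random.Random) -> List[str]:
    """Pick up to agents_per_network distinct alive agents for one network."""
    count = min(agents_per_network, len(alive_agents))
    return rng.sample(alive_agents, count)


def offer(allocations: Dict[str, str], mac: str) -> str:
    """Every agent answers from the same server-side allocation table."""
    return allocations[mac]


# Allocation decided once by the server (hypothetical MAC/IP values).
allocations = {"fa:16:3e:00:00:01": "10.0.0.5"}

# Equivalent of dhcp_agents_per_network = 2 with three alive agents.
hosting = schedule_network(["agent-a", "agent-b", "agent-c"], 2, random.Random(0))

# Simulate the first hosting agent failing; the survivor still answers.
survivors = hosting[1:]
answers = {agent: offer(allocations, "fa:16:3e:00:00:01") for agent in survivors}
```

Because no agent makes allocation decisions of its own, losing one of them does not change what a VM is offered.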
Process Monitoring
● Dynamic process creation: dnsmasq, keepalived, metadata proxy, etc.
● ProcessMonitor periodically checks that child processes are alive
● Optional actions:
– Respawn process
– Exit agent
– Notify (not available yet)
● Default configuration
check_child_processes_action = respawn
check_child_processes_period = 0
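The respawn and exit actions can be sketched with a toy monitor. This is not Neutron's ProcessMonitor: the class and method names are hypothetical, and the periodic timer is left out so that one liveness pass can be shown in isolation.

```python
import subprocess
import sys
from typing import Dict, List


class ChildProcessMonitor:
    """Toy monitor: track child processes and react when one has died."""

    def __init__(self, action: str = "respawn"):
        if action not in ("respawn", "exit"):
            raise ValueError(f"unsupported action: {action}")
        self.action = action
        self._commands: Dict[str, List[str]] = {}
        self._children: Dict[str, subprocess.Popen] = {}

    def spawn(self, name: str, command: List[str]) -> None:
        """Start a child and remember its command so it can be restarted."""
        self._commands[name] = command
        self._children[name] = subprocess.Popen(command)

    def check_once(self) -> List[str]:
        """Run one liveness pass; return the names of children found dead."""
        dead = [n for n, p in self._children.items() if p.poll() is not None]
        for name in dead:
            if self.action == "respawn":
                self._children[name] = subprocess.Popen(self._commands[name])
            else:
                raise SystemExit(f"child {name!r} died; exiting agent")
        return dead

    def stop_all(self) -> None:
        """Terminate any children still running and reap them."""
        for proc in self._children.values():
            if proc.poll() is None:
                proc.terminate()
            proc.wait()
```

An agent built on this sketch would call `check_once()` from a timer at the configured check period; with the `exit` action, the agent terminates so that an external manager can restart it.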
[Diagram: network node components - OVS Agent, OVS, DHCP Agent, dnsmasq, Metadata Agent, Metadata Proxy, L3 Agent, keepalived]
What Else?
[Diagram, repeated over three slides: DHCP Agent, dnsmasq, Metadata Agent, Metadata Proxy, L3 Agent, keepalived, OVS Agent, OVS]
VRRP (Virtual Router Redundancy Protocol)
● Providing HA of the network’s default gateway
● Configuring default gateway as VIP + Virtual MAC
● Gratuitous ARP after failover
[Diagram label: Sync Net]
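The VIP plus virtual MAC idea can be sketched as follows. The virtual MAC format for IPv4 (00-00-5E-00-01-{VRID}) comes from the VRRP RFCs; the election function is a simplification that only compares priorities among live routers and ignores tie-breaking and preemption settings. The router names and priorities are hypothetical.

```python
from dataclasses import dataclass, replace
from typing import Iterable, Optional


def virtual_mac(vrid: int) -> str:
    """IPv4 VRRP virtual MAC: 00-00-5E-00-01-{VRID}."""
    if not 1 <= vrid <= 255:
        raise ValueError("VRID must be in 1..255")
    return f"00:00:5e:00:01:{vrid:02x}"


@dataclass(frozen=True)
class Router:
    name: str
    priority: int       # higher priority wins the election
    alive: bool = True


def elect_master(routers: Iterable[Router]) -> Optional[Router]:
    """Simplified election: the live router with the highest priority."""
    candidates = [r for r in routers if r.alive]
    return max(candidates, key=lambda r: r.priority, default=None)


# Hypothetical pair of network nodes backing one virtual router (VRID 1).
node1 = Router("network-node-1", priority=100)
node2 = Router("network-node-2", priority=50)

before = elect_master([node1, node2])
after = elect_master([replace(node1, alive=False), node2])  # node 1 fails
gateway_mac = virtual_mac(1)  # unchanged across the failover
```

Since the gateway IP and MAC stay the same after failover, hosts keep their ARP entries; the gratuitous ARP only tells the switches where the virtual MAC now lives.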
L3 HA Implementing VRRP
● Uses keepalived, which implements VRRP internally
● Creates a per-tenant HA network, used for VRRP sync messages
● When an HA router is created, it is scheduled on multiple network nodes (configurable)
● New in Kilo
– Reports which network node is hosting the master instance
● In the works
– L3 HA + l2pop
– External interface tracking
– L3 HA + DVR
Traffic Flow 3-tier Application
[Diagram: Host 1 runs the WWW VM, Host 2 the App VM, Host 3 the DB VM; the Virtual Router runs on the Network Node]
DVR – Distributed Virtual Router
● DVR moves most of the routing to the compute nodes
– Isolating the failure domain of the network node
– Optimizing the network flow
● Traffic types
– East-West (within the tenant, different networks): direct between compute nodes
– North-South with floating IP (VM to/from external network)
– North-South without floating IP (based on SNAT): through network node
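The three traffic types can be summarized as a small decision function. The names are hypothetical. The slide labels only two paths, so the floating-IP branch reflects an assumption, based on general DVR behavior, that floating-IP traffic leaves directly from the compute node.

```python
from enum import Enum


class Path(Enum):
    DIRECT_BETWEEN_COMPUTE_NODES = "direct between compute nodes"
    DIRECT_FROM_COMPUTE_NODE = "direct from compute node to external network"
    THROUGH_NETWORK_NODE = "through network node"


def dvr_path(dst_is_external: bool, has_floating_ip: bool) -> Path:
    """Classify a flow under DVR by destination and floating-IP presence."""
    if not dst_is_external:
        # East-West: routed on the compute nodes themselves.
        return Path.DIRECT_BETWEEN_COMPUTE_NODES
    if has_floating_ip:
        # North-South with floating IP (assumption, see lead-in).
        return Path.DIRECT_FROM_COMPUTE_NODE
    # North-South without floating IP: centralized SNAT.
    return Path.THROUGH_NETWORK_NODE


east_west = dvr_path(dst_is_external=False, has_floating_ip=False)
snat = dvr_path(dst_is_external=True, has_floating_ip=False)
```

Only the SNAT case still depends on the network node, which is why DVR shrinks that node's failure domain without removing it.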
Architecture - Assuming DVR
[Diagram:
Controller Node: Neutron server, MySQL server, RabbitMQ server
Compute Node: OVS agent, OVS
Network Node: OVS Agent, OVS, DHCP Agent, dnsmasq, Metadata Agent, Metadata Proxy, L3 Agent, keepalived
Networks: Internet / External Network, API Network, Management Network, Data Network]
Architecture - Assuming DVR
[Diagram:
Controller Node: Neutron server, MySQL server, RabbitMQ server
Compute Nodes: OVS agent, OVS; one is shown also running an L3 agent, Metadata agent, and Metadata Proxy
Network Node: OVS Agent, OVS, DHCP Agent, dnsmasq, Metadata Agent, Metadata Proxy, L3 Agent, keepalived
Networks: Internet / External Network, API Network, Management Network, Data Network]
Summary
● No one-stop shop
● Maximize the use of built-in solutions
– They are vendor neutral
– Well maintained
– Widely documented
● Understand what you need, use the appropriate tools
– DVR vs. VRRP
– What size is your deployment? Maybe Active-Passive (A/P) is good enough
● The more complicated the solution is, the more likely it is to have bugs
Thank You
Resources
● http://assafmuller.com
● http://specs.openstack.org/openstack/neutron-specs/specs/kilo/agent-child-processes-status.html
● https://github.com/beekhof/osp-ha-deploy/blob/master/ha-openstack.md
● https://docs.google.com/document/d/1jCmraZGirmXq5V1MtRqhjdZCbUfiwBhRkUjDXGt5QUQ/edit
● https://www.youtube.com/watch?v=00j1x-T1vhA