livnat peer & arthur berezin, red hat - neutron high availability - openstack israel 2015

@Livnat_Peer Sr. Engineering Manager, Red Hat @ArthurBerezin Sr. Technical Product Manager, Red Hat Neutron High Availability OpenStack Israel Tel-Aviv June 2015


Post on 13-Aug-2015



Page 1: Livnat Peer & Arthur Berezin, Red Hat - Neutron High Availability - OpenStack Israel 2015

@Livnat_Peer, Sr. Engineering Manager, Red Hat
@ArthurBerezin, Sr. Technical Product Manager, Red Hat

Neutron High Availability

OpenStack Israel, Tel-Aviv, June 2015

Page 2

Agenda

● HA Enabling Technologies
○ Pacemaker and HAProxy
● Neutron Built-in Mechanisms
○ DHCP Agent HA
○ L3 Agent with
■ Virtual Router Redundancy Protocol (VRRP)
■ Distributed Virtual Routing (DVR)

Page 3

cc: Morio2015 Source: https://www.wikiwand.com/en/Scuderia_Ferrari

Page 4

Losing Your Controller

https://www.youtube.com/watch?v=Kb43Nxuwc4I

Page 5

High Availability

● Minimize downtime by avoiding SPOF
● Service redundancy
○ Active-Active when possible
■ Stateless services
■ Built-in HA mechanisms
○ Active-Passive for others
● Scale-out architecture
○ Add nodes as you go

Page 6

Page 7

HA Enabling Technologies: Pacemaker, HAProxy

Page 8

Pacemaker

● Cluster Resource Manager
● Uses Corosync for cluster communication
● Monitors and controls resources:
○ Floating Virtual IP Address (VIP)
○ systemd/LSB/OCF services
○ Cloned services (Active/Active)
● STONITH - Fencing with power management
○ Important for ensuring data consistency

Page 9

Pacemaker OpenStack Service

● Virtual IP (VIP)
● systemd cloned resource
● STONITH fencing

[Diagram: Node 1 (192.168.1.1) and Node 2 (192.168.1.2), each running pcsd, STONITH, and a cloned Service; Virtual IP 10.0.0.1 in front of the Service]

Page 10

HAProxy Load Balancer

Load balancing and proxy for HTTP/TCP
● Mature and popular with web applications
● Health checking
● Load distribution

Page 11

HAProxy

● Load Distribution
○ Round Robin
○ Stick-Table
● API Isolation
● Failure Detection

[Diagram: Node 1, Node 2, Node 3; an HAProxy Load Balancer in front of two Service instances]
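The load-distribution and failure-detection bullets above can be sketched in a few lines of Python. This is a toy model, not HAProxy's implementation: the class name `RoundRobinBalancer`, its methods, and the controller names are all hypothetical, and the `stick_table` dictionary only loosely mirrors what a stick-table does.

```python
class RoundRobinBalancer:
    """Toy balancer: round robin over healthy backends, with optional stickiness."""

    def __init__(self, backends):
        self.backends = list(backends)     # hypothetical backend names
        self.healthy = set(self.backends)  # maintained by health checks
        self.stick_table = {}              # client -> backend it was last sent to
        self._next = 0                     # round-robin cursor

    def mark_down(self, backend):
        # A failed health check takes the backend out of rotation.
        self.healthy.discard(backend)

    def mark_up(self, backend):
        if backend in self.backends:
            self.healthy.add(backend)

    def pick(self, client=None):
        # Honor stickiness only while the remembered backend is still healthy.
        stuck = self.stick_table.get(client)
        if client is not None and stuck in self.healthy:
            return stuck
        # Advance the cursor until it lands on a healthy backend.
        for _ in range(len(self.backends)):
            backend = self.backends[self._next % len(self.backends)]
            self._next += 1
            if backend in self.healthy:
                if client is not None:
                    self.stick_table[client] = backend
                return backend
        raise RuntimeError("no healthy backend available")
```

Calling `pick()` repeatedly cycles through the healthy backends; after `mark_down("controller2")` that backend is skipped, and a sticky client is moved to another backend only once its remembered one is marked down.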

Page 12

Avoiding SPOFs: A Day in the Life of a Highly Available Service

Page 13

[Diagram: user request "Give Me Horizon Web UI NOW!"; a single Neutron-Server Controller]

Page 14

[Diagram: user request "Give Me Horizon Web UI NOW!"; a single Neutron-Server Controller; annotation: "Single Point Of Failure"]

Page 15

[Diagram: user request "Give Me Horizon Web UI NOW!"; HAProxy on Controller 1; Neutron-Server on Controllers 1, 2, and 3]

Page 16

[Diagram: user request "Give Me Horizon Web UI NOW!"; HAProxy on Controller 1; Neutron-Server on Controllers 1, 2, and 3; annotations: "Single Point Of Failure", "Each Could Fail"]

Page 17

[Diagram: user request "Give Me Horizon Web UI NOW!"; HAProxy on Controller 1; Neutron-Server on Controllers 1, 2, and 3; annotations: "Single Point Of Failure", "Pacemaker Cloned Horizon Service"]

Page 18

[Diagram: user request "Give Me Horizon Web UI NOW!"; HAProxy on Controllers 1, 2, and 3; Neutron-Server on Controllers 1, 2, and 3; annotations: "Pacemaker Cloned Horizon Service", "Pacemaker Cloned HAProxy Service"]

Page 19

[Diagram: user request "Give Me Horizon Web UI NOW!" arriving at the Horizon VIP; HAProxy on Controllers 1, 2, and 3; Neutron-Server on Controllers 1, 2, and 3; annotations: "Pacemaker Cloned HAProxy Service", "Pacemaker Cloned Horizon Service"]

Page 20

Neutron Built-in Mechanisms

Page 21

My HA Solution

● External mechanisms

● Neutron built-in mechanisms

● Reference implementation vs. vendor code

Page 22

Architecture - Assuming Centralized Network Node

[Diagram: Controller Node (Neutron server, MySQL server, RabbitMQ server); Network Node (OVS Agent, OVS, L3 Agent, keepalived, DHCP Agent, dnsmasq, Metadata Agent, Metadata Proxy); Compute Node (OVS Agent, OVS); networks: External Network (Internet), API Network, Management Network, Data Network]

Page 23

DHCP Agent

● IP address allocation is done by the Neutron server

● dnsmasq is used as a distribution mechanism of predefined allocations

● The DHCP protocol allows multiple DHCP servers to co-exist while serving the same pool

● Configuration in Neutron

neutron.conf:

dhcp_agents_per_network = X

[Diagram: network node components: OVS Agent, OVS, DHCP Agent, dnsmasq, Metadata Agent, Metadata Proxy, L3 Agent, keepalived]
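As a rough illustration of the setting above, the following Python sketch reads `dhcp_agents_per_network` from a neutron.conf-style text and picks that many agents for a network. The sample file content, the function names, the agent names, and the least-loaded selection policy are all assumptions made for this example; this is not Neutron's scheduler code.

```python
import configparser

# Hypothetical sample; a real neutron.conf contains many more options.
SAMPLE_NEUTRON_CONF = """
[DEFAULT]
dhcp_agents_per_network = 2
"""


def dhcp_agents_per_network(conf_text):
    """Return the configured number of DHCP agents per network."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    # Falls back to 1 when the option is unset (assumed default for this sketch).
    return parser.getint("DEFAULT", "dhcp_agents_per_network", fallback=1)


def pick_dhcp_agents(alive_agents, networks_per_agent, wanted):
    """Choose up to `wanted` live agents, least-loaded first (illustrative policy)."""
    ranked = sorted(alive_agents, key=lambda a: (networks_per_agent.get(a, 0), a))
    return ranked[:wanted]


wanted = dhcp_agents_per_network(SAMPLE_NEUTRON_CONF)
chosen = pick_dhcp_agents(["net1", "net2", "net3"], {"net1": 5, "net2": 1}, wanted)
```

With two or more agents serving a network, each runs its own dnsmasq with the same predefined allocations, so losing one agent does not stop leases from being handed out.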

Page 24

Process Monitoring

● Dynamic process creation: dnsmasq, keepalived, metadata proxy, etc.

● ProcessMonitor periodically checks that the processes are alive

● Optional actions:

– Respawn process

– Exit agent

– Notify (not available yet)

● Default configuration

check_child_processes_action = respawn

check_child_processes_period = 0

[Diagram: network node components: OVS Agent, OVS, DHCP Agent, dnsmasq, Metadata Agent, Metadata Proxy, L3 Agent, keepalived]
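The monitoring idea on this slide can be sketched as a small Python class. This is a simplified stand-in, not Neutron's ProcessMonitor: the class interface, the `is_alive`/`spawn` callables, and the choice to treat a period of 0 as "monitoring disabled" are assumptions made for the example.

```python
import time


class ProcessMonitor:
    """Toy monitor: periodically checks registered children and reacts."""

    def __init__(self, action="respawn"):
        if action not in ("respawn", "exit"):
            raise ValueError("action must be 'respawn' or 'exit'")
        self.action = action
        self.children = {}  # name -> (is_alive, spawn) callables

    def register(self, name, is_alive, spawn):
        self.children[name] = (is_alive, spawn)

    def check_once(self):
        """Run one liveness pass and return the names found dead."""
        dead = [n for n, (is_alive, _) in self.children.items() if not is_alive()]
        for name in dead:
            if self.action == "exit":
                # Give up and let an external manager restart the whole agent.
                raise SystemExit(f"child {name} died, exiting agent")
            self.children[name][1]()  # respawn the child
        return dead

    def run(self, period, rounds):
        """Check every `period` seconds, `rounds` times; period <= 0 skips checks."""
        if period <= 0:
            return
        for _ in range(rounds):
            self.check_once()
            time.sleep(period)
```

In a real agent the `is_alive` callable would inspect a PID and `spawn` would start dnsmasq, keepalived, or the metadata proxy; callables are injected here so the logic can be exercised without real processes.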

Page 25

What Else?

[Diagram: network node components: DHCP Agent, dnsmasq, Metadata Agent, Metadata Proxy, L3 Agent, keepalived, OVS Agent, OVS]

Page 28

VRRP (Virtual Router Redundancy Protocol)

● Providing HA of the network’s default gateway

● Configuring default gateway as VIP + Virtual MAC

● Gratuitous ARP after failover

[Diagram label: Sync Net]
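A minimal sketch of how a VRRP group settles on a master may help here. It assumes the generic rule that the live router with the highest priority wins, with the higher primary IP address breaking ties. The `VrrpRouter` class, the router names, and the addresses are hypothetical, and a given keepalived setup may be configured differently (for example, without preemption).

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class VrrpRouter:
    name: str
    priority: int        # higher wins the election
    address: str         # primary IP on the sync network (illustrative)
    alive: bool = True


def elect_master(routers):
    """Return the live router with the highest (priority, IP), or None."""
    candidates = [r for r in routers if r.alive]
    if not candidates:
        return None
    return max(
        candidates,
        key=lambda r: (r.priority, int(ipaddress.ip_address(r.address))),
    )


group = [
    VrrpRouter("node1", priority=100, address="169.254.192.1"),
    VrrpRouter("node2", priority=50, address="169.254.192.2"),
]
master_before = elect_master(group)   # node1 holds the VIP
group[0].alive = False                # node1 stops sending advertisements
master_after = elect_master(group)    # node2 takes over the VIP
```

After the takeover, the new master sends a gratuitous ARP so that switches and hosts learn where the VIP now lives.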

Page 29

L3 HA Implementing VRRP

● Using keepalived which internally implements VRRP

● Creating a per-tenant HA network, used for VRRP sync messages

● When an HA router is created, it is scheduled on multiple network nodes (configurable)

● New in Kilo

– Report which network node is hosting the master instance

● In the works

– L3 HA + l2pop

– External interface tracking

– L3 HA+DVR
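The "scheduled on multiple network nodes" bullet can be illustrated with a short Python sketch. The function name, the agent dictionaries, and the least-loaded policy are hypothetical; the `max_agents` and `min_agents` parameters stand in for the configurable limits and are not claimed to match Neutron's option names or defaults.

```python
def schedule_ha_router(agents, max_agents=3, min_agents=2):
    """Pick the L3 agents that will host instances of one HA router.

    `agents` is a list of dicts with hypothetical keys:
    "host" (str), "alive" (bool), "routers" (int, current load).
    """
    alive = sorted(
        (a for a in agents if a["alive"]),
        key=lambda a: (a["routers"], a["host"]),  # least-loaded first
    )
    if len(alive) < min_agents:
        # Too few live agents to provide any redundancy.
        raise RuntimeError(
            f"need at least {min_agents} live L3 agents, found {len(alive)}"
        )
    return [a["host"] for a in alive[:max_agents]]


agents = [
    {"host": "net1", "alive": True, "routers": 4},
    {"host": "net2", "alive": True, "routers": 1},
    {"host": "net3", "alive": False, "routers": 0},
    {"host": "net4", "alive": True, "routers": 2},
]
hosts = schedule_ha_router(agents, max_agents=2)
```

Each chosen host would then run a keepalived instance for the router, with one of them acting as VRRP master at any time.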

Page 30

Traffic Flow 3-tier Application

[Diagram: WWW VM on Host 1, App VM on Host 2, DB VM on Host 3; Virtual Router on the Network Node]

Page 31

DVR – Distributed Virtual Router

● DVR moves most of the routing to the compute nodes

– Isolating the failure domain of the network node

– Optimizing the network flow

● Traffic types

– East – West (within the tenant, different networks): direct between compute nodes

– North – South with floating IP (VM to/from external network): handled on the compute node

– North – South without floating IP (based on SNAT): through the network node
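The three traffic types can be summarized as a tiny decision function. This Python sketch only restates the slide; the function name, its parameters, and the returned strings are made up for illustration.

```python
def dvr_traffic_path(destination, has_floating_ip=False):
    """Return where routing happens for a VM's traffic under DVR.

    destination: "tenant" for another network of the same tenant (East-West),
                 "external" for traffic to or from the external network.
    """
    if destination == "tenant":
        # East-West: routed locally, sent directly between compute nodes.
        return "compute nodes"
    if destination == "external":
        if has_floating_ip:
            # North-South with floating IP: handled on the VM's compute node.
            return "compute node"
        # North-South without floating IP: SNAT still lives on the network node.
        return "network node"
    raise ValueError(f"unknown destination: {destination!r}")
```

Only the SNAT case still depends on the network node, which is why DVR shrinks that node's failure domain without removing it.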

Page 32

Architecture - Assuming DVR

[Diagram: Controller Node (Neutron server, MySQL server, RabbitMQ server); Network Node (OVS Agent, OVS, L3 Agent, keepalived, DHCP Agent, dnsmasq, Metadata Agent, Metadata Proxy); Compute Node (OVS Agent, OVS); networks: External Network (Internet), API Network, Management Network, Data Network]

Page 34

Architecture - Assuming DVR

[Diagram: Controller Node (Neutron server, MySQL server, RabbitMQ server); Network Node (OVS Agent, OVS, L3 Agent, keepalived, DHCP Agent, dnsmasq, Metadata Agent, Metadata Proxy); Compute Nodes (OVS Agent, OVS, L3 Agent, Metadata Agent, Metadata Proxy); networks: External Network (Internet), API Network, Management Network, Data Network]

Page 35

Summary

● No one-stop shop

● Maximize the use of built-in solutions

– They are vendor neutral

– Actively maintained

– Widely documented

● Understand what you need, use the appropriate tools

– DVR vs VRRP

– What size is your deployment? Maybe A/P is good enough...

● The more complicated the solution is, the more likely it is to have bugs

Page 36

Thank You

Page 37

Resources

● http://assafmuller.com

● http://specs.openstack.org/openstack/neutron-specs/specs/kilo/agent-child-processes-status.html

● https://github.com/beekhof/osp-ha-deploy/blob/master/ha-openstack.md

● https://docs.google.com/document/d/1jCmraZGirmXq5V1MtRqhjdZCbUfiwBhRkUjDXGt5QUQ/edit


● https://www.youtube.com/watch?v=00j1x-T1vhA