Posted on 13-Aug-2020

Non-disruptive upgrade to SUSE® OpenStack Cloud 7

Nanuk Krinner, SUSE Cloud Engineer, nkrinner@suse.com

Rick Salevsky, SUSE Cloud Engineer, rsalevsky@suse.com

Introduction

Speakers

● Nanuk Krinner

○ Cloud Developer at SUSE

○ Systems Management Engineer

● Rick Salevsky

○ Cloud Engineer at SUSE

○ Release Coordinator

● SUSE OpenStack Cloud

Agenda

● Upgrading

● Why and What

● Non-disruptive Upgrade

○ Process

○ Requirements

○ Upgrading the Administration Node

○ Preparing the Client Nodes

○ Upgrading the Controller Nodes

○ Upgrading the Compute Nodes

○ Finalizing Upgrade

● Goals

Upgrading?

● Day 1: Deploy

● Day 2: Operate

● Day 3: Upgrade

Why and What

Why upgrade?

● Security fixes

● Stability improvements

● Performance improvements

● Closely follow upstream development

● New features

● Stay on a supported release

Problems while upgrading?

● Downtime

● Preparation

● Testing

● Adapting workflows

● Bugs

● Data loss

OpenStack User Survey April 2016

Customer demands

● Rolling upgrade

● Non-disruptive upgrade

● Easy way to cancel

● Clear documentation of what is happening

● Upgrade while skipping one or more releases

Upgrade marathon

[Figure: timeline with the stages Release, Evaluating the release, Planning the upgrade, Testing the upgrade, Fine tuning, Integrating new features, Upgrade; a second version of the slide adds the callout "Maybe not this time?"]

Non-disruptive Upgrade

Process

● Non-disruptive for workloads

○ HA as requirement

○ Network downtime without HA

● Tools

○ WebUI

○ Command line tool (crowbarctl)

○ New REST API (for manual upgrade)

Requirements

● Maintenance Updates installed

● Cloud Network Services are healthy

● Pacemaker is available and healthy

● Ceph is healthy

● Compute Resources are available
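The requirements above can be read as a set of prechecks whose outcome decides how the upgrade may proceed. The following is a minimal sketch, not the actual Crowbar implementation: the check names, the `CheckResult` type, and the split between checks that block the upgrade and checks that only rule out non-disruptive mode are all assumptions made for illustration, based on the slide stating that HA is a requirement for the non-disruptive path.

```python
from dataclasses import dataclass

# Hypothetical check names mirroring the requirements listed above.
# Assumption: failing one of these stops the upgrade entirely.
BLOCKING = {"maintenance_updates", "network_services", "ceph"}
# Assumption: failing one of these only rules out non-disruptive mode.
MODE_DECIDING = {"pacemaker", "compute_resources"}


@dataclass(frozen=True)
class CheckResult:
    name: str
    passed: bool


def choose_mode(results):
    """Return 'blocked', 'normal' or 'non-disruptive' for a list of CheckResult."""
    failed = {result.name for result in results if not result.passed}
    if failed & BLOCKING:
        return "blocked"
    if failed & MODE_DECIDING:
        return "normal"
    return "non-disruptive"


if __name__ == "__main__":
    results = [
        CheckResult("maintenance_updates", True),
        CheckResult("network_services", True),
        CheckResult("ceph", True),
        CheckResult("pacemaker", False),
        CheckResult("compute_resources", True),
    ]
    print(choose_mode(results))
```

In this sketch a cloud without a healthy Pacemaker cluster can still be upgraded, but only in normal mode, which matches the "network downtime without HA" note on the process slide.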

Upgrading the Administration Node

● Preliminary Checks

● Non-disruptive Mode or Normal Mode

● Begin Upgrade

○ Decoupling nodes from Crowbar

○ Disabling chef on nodes

○ Freezing the cloud in its current state

● Backup of the admin node

○ Optional but recommended

○ Used in case the admin node upgrade fails

● Update Repositories

○ Done manually by the administrator

○ SLES 12 SP2

○ SUSE OpenStack Cloud 7

○ Updates repositories

● Upgrade Administration Node OS

○ Background script

○ Stops chef-client service

○ Dumps current database

○ Executes zypper dist-upgrade

● Rebooting
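The three actions of the background script can be sketched as an ordered dry-run plan. This is an illustration under stated assumptions, not the script shipped with the product: the `chef-client` systemd unit name is assumed, and the database dump command is deliberately left as a caller-supplied placeholder because the slides do not name the tool. The sketch only prints commands and never executes them.

```python
import shlex


def admin_upgrade_plan(dump_command):
    """Build the ordered steps of the admin node OS upgrade.

    dump_command is an argv list for whatever tool dumps the current
    database; it is a placeholder supplied by the caller.
    """
    return [
        # Assumption: chef-client runs as a systemd unit of this name.
        ("stop chef-client service", ["systemctl", "stop", "chef-client"]),
        ("dump current database", list(dump_command)),
        ("upgrade the OS", ["zypper", "--non-interactive", "dist-upgrade"]),
    ]


def render(plan):
    """Return one shell-quoted line per step, for review before running."""
    return [f"{shlex.join(argv)}  # {description}" for description, argv in plan]


if __name__ == "__main__":
    # The echo command stands in for the real dump tool.
    for line in render(admin_upgrade_plan(["echo", "dump placeholder"])):
        print(line)
```

Keeping the plan as data makes the ordering explicit: the database is dumped only after chef-client is stopped, and the package upgrade runs last, before the reboot.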

● Creating PostgreSQL database

○ Create new local database

○ Connect to existing database

● Migrating old data to new database

Preparing the Client Nodes

Before Upgrading

● Back up all important OpenStack data

● Create snapshots of important instances

● Last chance to update OpenStack resources

● Prepare client node repositories

○ SLES 12 SP2

○ SUSE OpenStack Cloud 7

○ SLES 12 SP2 High Availability Extension

○ Updates repositories

● Automatic disabling of old repositories

○ SUSE OpenStack Cloud 6

○ SLES 12 SP1
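The repository preparation above amounts to two set operations: find the new repositories that are still missing, and pick out the old ones to disable. The sketch below illustrates that idea only. The repository names and prefixes are assumed for the example and may differ from the names used in a real deployment, and the function names are hypothetical.

```python
# Assumed repository names, for illustration only.
REQUIRED_REPOS = (
    "SLES12-SP2-Pool",
    "SLES12-SP2-Updates",
    "SUSE-OpenStack-Cloud-7-Pool",
    "SUSE-OpenStack-Cloud-7-Updates",
)
HA_REPOS = ("SLE-HA12-SP2-Pool", "SLE-HA12-SP2-Updates")
# Assumed prefixes of the repositories belonging to the old release.
OLD_PREFIXES = ("SLES12-SP1", "SUSE-OpenStack-Cloud-6")


def missing_repos(available, ha_enabled=True):
    """Return required repositories not present in `available`, in a stable order."""
    required = REQUIRED_REPOS + (HA_REPOS if ha_enabled else ())
    return [name for name in required if name not in available]


def repos_to_disable(enabled):
    """Return the enabled repositories that belong to the old release."""
    return [name for name in enabled if name.startswith(OLD_PREFIXES)]


if __name__ == "__main__":
    available = {"SLES12-SP2-Pool", "SLES12-SP2-Updates", "SUSE-OpenStack-Cloud-7-Pool"}
    print("missing:", missing_repos(available))
    print("disable:", repos_to_disable(["SLES12-SP1-Pool", "SLES12-SP2-Pool"]))
```

The `ha_enabled` flag reflects that the High Availability Extension repositories only matter when the cloud was deployed with HA.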

● Stopping Services

○ Irrelevant OpenStack services → network

○ Related OpenStack services

● Creating backup of the OpenStack database

○ Backup is stored on the Administration Node

● OpenStack API will be mostly unavailable

Upgrading the Controller Nodes

[Diagram: controller nodes labeled SLES 12 SP1 and SLES 12 SP2 on the Admin Network, with Pacemaker shown twice; services shown: DHCP, Neutron, OVS, L3, RabbitMQ, Keystone, DB]

● First node to upgrade will be a non-master node

● Migrating neutron l3-agent

● Shut down Pacemaker services

● Upgrade Controller Node OS

● Reboot but do not start any services

● Prevent Pacemaker from running services on non-upgraded nodes

● Core API downtime starts now → all services
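The rule that the first controller to be upgraded is a non-master node can be sketched as a small ordering function. This is a hypothetical helper, not part of Crowbar: the function name and arguments are invented for illustration, and the order of the remaining nodes is an assumption because the slides only say they are handled one at a time.

```python
def controller_upgrade_order(nodes, master):
    """Return the order in which controller nodes are upgraded.

    The first entry is always a non-master node. The remaining nodes,
    including the original master, keep their given order (an assumption;
    the slides only state they are upgraded one at a time).
    """
    if master not in nodes:
        raise ValueError(f"master {master!r} is not one of the controller nodes")
    non_masters = [node for node in nodes if node != master]
    if not non_masters:
        raise ValueError("at least two controller nodes are needed")
    first = non_masters[0]
    rest = [node for node in nodes if node != first]
    return [first] + rest


if __name__ == "__main__":
    print(controller_upgrade_order(["ctrl1", "ctrl2", "ctrl3"], master="ctrl1"))
```

Starting with a non-master node means the current master keeps holding the data until the upgraded node is ready to be promoted.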

[Diagram: controller nodes labeled SLES 12 SP1 and SLES 12 SP2 on the Admin Network, with Pacemaker shown twice; services shown: DHCP, Neutron, OVS, L3, RabbitMQ, Keystone, DB]

● Start Pacemaker Services

● Update all configurations via crowbar

● Stop synchronizing HA resources

● Promote upgraded node to master

● Start all services on the upgraded node

● Core API downtime ends here

[Diagram: controller nodes labeled SLES 12 SP1 and SLES 12 SP2 on the Admin Network, with Pacemaker shown twice; the services DHCP, Neutron, OVS, L3, RabbitMQ, Keystone and DB now appear twice]

For all other controller nodes, one at a time:

● All services are already stopped

● Moving network traffic to upgraded node

● Sync HA slave with master

● Stopping cluster stack

● Upgrade Controller Node OS

Normal Mode (non-High Availability case)

● Several network outages will happen

● Especially during network migrations

Upgrading the Compute Nodes

For all compute nodes, one at a time:

● Disabling nova hypervisor

● Live migrate instances to another compute node

● Stop Pacemaker remote

● Upgrading node OS

● Rebooting node

● Update all configurations via crowbar

● Adding node to pacemaker cluster
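The per-node loop above can be simulated in a few lines to show why spare compute capacity is a requirement. This is a toy model under stated assumptions, not the real orchestration: the function name is hypothetical, instances are plain strings, capacity limits are not modeled, and the "least-loaded node" choice is a simplistic stand-in for the Nova scheduler.

```python
def rolling_compute_upgrade(placement):
    """Simulate upgrading compute nodes one at a time.

    placement maps a node name to the list of instances running on it.
    Returns (final placement, list of log lines).
    """
    placement = {node: list(instances) for node, instances in placement.items()}
    log = []
    for node in list(placement):
        targets = [other for other in placement if other != node]
        if placement[node] and not targets:
            raise RuntimeError(f"no other compute node can take instances from {node}")
        log.append(f"{node}: disable nova hypervisor")
        for instance in list(placement[node]):
            # Pick the least-loaded other node (stand-in for the scheduler).
            target = min(targets, key=lambda other: len(placement[other]))
            placement[target].append(instance)
            log.append(f"{node}: live migrate {instance} to {target}")
        placement[node] = []
        log.append(f"{node}: stop Pacemaker remote, upgrade OS, reboot")
        log.append(f"{node}: update configuration and rejoin the cluster")
    return placement, log


if __name__ == "__main__":
    final, steps = rolling_compute_upgrade({"node1": ["vm-a", "vm-b"], "node2": ["vm-c"]})
    print("\n".join(steps))
    print(final)
```

With a single compute node that still hosts instances, the simulation raises an error, which corresponds to the case where instances cannot be live migrated anywhere.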

Finalizing Upgrade

● Reapplying all barclamps

● Showing the barclamps page

Issues

● Configuration file migration

● Migrations

● All or nothing

● Predefined upgrades

● Create backups!

Goals

● Finish non-disruptive upgrade for other services

● No downtime of important services

● Migrating existing data from every point

● Cancel upgrade in every step

● Rollback upgrade

Questions?

Thank you!
