1 million VMs in 100 datacenters by OpenStack cascading

HUAWEI TECHNOLOGIES CO., LTD. Page 1 Huawei Confidential

Chaoyi Huang ( [email protected] ) Last edited May 18, 2015

Upload: joe-huang

Post on 05-Aug-2015



• Background – what is OpenStack cascading
• Semi-simulation test for 1 million VMs in 100 datacenters
• Evolution of OpenStack cascading

OpenStack cascading is an “OpenStack orchestrates OpenStacks” solution for multi-site clouds that exposes a unified, global OpenStack API.

[Figure: the OpenStack cascading solution exposes one OpenStack API on top of the OpenStack instances in DC 1, DC 2 and DC 3, which come from different vendors and versions (Vendor1 / Version 2.0, Vendor2 / Version 2.1, Vendor3 / Version 2.1) and are each reached through their own OpenStack API.]

Magic happens by just considering OpenStack as its own backend!

[Figure: the usual OpenStack services next to their backends. Nova API, Scheduler, RabbitMQ and Nova Compute, with a Libvirt Driver for KVM and a Nova Driver for Nova. Cinder API, Scheduler, RabbitMQ and Cinder Volume, with an LVM Driver for LVM and a Cinder Driver for Cinder. Neutron Server, RabbitMQ, OVS Agent (OVS), L3 Agent (Linux Router) and a Neutron Agent for Neutron. Glance with image locations (Image1: Loc1: NFS, Loc2: Glance1, Loc3: Glance2; in Glance1 and Glance2, Image1: Loc1: Ceph). Ceilometer API with an HBase store and Ceilometer as a store.]

• Nova as hypervisor
• Cinder as block storage
• Neutron as networking device
• Glance as image location
• Ceilometer as store

** Architecture simplified for illustration only
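The idea of treating OpenStack as its own backend can be illustrated with a toy driver interface. This is a minimal sketch, not code from the cascading PoC: `ComputeDriver`, `HypervisorDriver`, `OpenStackDriver` and `FakeOpenStack` are hypothetical names, and the child OpenStack is modelled as an in-memory object rather than a real API endpoint.

```python
from abc import ABC, abstractmethod


class ComputeDriver(ABC):
    """Hypothetical stand-in for the backend interface of a compute service."""

    @abstractmethod
    def spawn(self, name: str) -> str:
        """Start an instance and return a backend-specific identifier."""


class HypervisorDriver(ComputeDriver):
    """Conventional case: the backend is a local hypervisor such as KVM."""

    def spawn(self, name: str) -> str:
        return f"kvm-domain:{name}"


class FakeOpenStack:
    """Toy model of a child OpenStack that only knows how to boot servers."""

    def __init__(self, site: str) -> None:
        self.site = site
        self.servers: list[str] = []

    def boot_server(self, name: str) -> str:
        self.servers.append(name)
        return f"{self.site}/servers/{name}"


class OpenStackDriver(ComputeDriver):
    """Cascading case: the backend is another OpenStack, reached via its API."""

    def __init__(self, child: FakeOpenStack) -> None:
        self.child = child

    def spawn(self, name: str) -> str:
        # Forward the request to the child instead of talking to a hypervisor.
        return self.child.boot_server(name)


def boot(driver: ComputeDriver, name: str) -> str:
    """The caller does not know which kind of backend serves the request."""
    return driver.spawn(name)


dc1 = FakeOpenStack("dc1")
local_id = boot(HypervisorDriver(), "vm-a")
remote_id = boot(OpenStackDriver(dc1), "vm-b")
print(local_id, remote_id)
```

The same substitution applies, by analogy, to the other services listed above: the upper layer keeps its normal API and scheduling, and only the driver at the bottom changes.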

The driver/agent to OpenStack is called “Proxy”.

[Figure: two layers. Cascading layer (control layer, no resources): Nova API, Cinder API, Neutron Server, Glance and Ceilometer API, with their Schedulers and RabbitMQ, plus Nova Proxy, Cinder Proxy, Neutron Proxy (L2/L3/LB/VPN/FW), Ceilometer-Proxy and Replic-Manager. Cascaded layer (all VM, volume, … resources run in the cascaded layer): Nova, Cinder, Neutron, Glance1, Glance2 and Ceilometer. A legend marks the components introduced for cascading.]

* Keystone is a global service, shared or federated by the cascading and cascaded layers.
* Heat uses the OpenStack API to do orchestration; no cascading is required.

• Background – what is OpenStack cascading
• Semi-simulation test for 1 million VMs in 100 datacenters
• Evolution of OpenStack cascading

Scalability verification

[Figure: one cascading OpenStack exposes the OpenStack API on top of cascaded OpenStacks 1, 2, …, 100, each reached through its own OpenStack API and each managing physical server nodes 1, 2, …, 1000.]

• Max. 100 data centers
• Max. 100k physical server nodes
• Max. 1 million VMs
• Scalability inside one data center, multi-data centers or multi-sites

Test report: http://www.slideshare.net/JoeHuang7/test-report-for-open-stack-cascading-solution-to-support-1-million-v-ms-in-100-data-centers

Test framework

1. One real cascading OpenStack with 100 simulated cascaded OpenStacks.
2. Separate RabbitMQ / MySQL for each service.
3. Active/standby mode for RabbitMQ / MySQL; clustering is not used.
4. Each service runs multiple API servers, with HAProxy as the load balancer.
5. Each cascaded OpenStack is managed by one proxy node in the cascading OpenStack.
6. Each cascaded OpenStack is configured with one AZ.
7. Most of the proxy nodes and cascaded OpenStack simulators run inside VMs.
8. Basic assumption: one cascaded OpenStack can manage 1k physical hosts and 10k VMs.

Traffic generator: SoapUI / shell script
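The headline figures follow directly from the framework's basic assumption. The short calculation below only restates numbers given in the slides; the variable names are illustrative.

```python
# Figures taken from the test framework description.
CASCADED_OPENSTACKS = 100    # simulated cascaded OpenStacks
HOSTS_PER_CASCADED = 1_000   # "1k physical hosts" per cascaded OpenStack
VMS_PER_CASCADED = 10_000    # "10k VMs" per cascaded OpenStack

total_hosts = CASCADED_OPENSTACKS * HOSTS_PER_CASCADED
total_vms = CASCADED_OPENSTACKS * VMS_PER_CASCADED
vms_per_host = VMS_PER_CASCADED // HOSTS_PER_CASCADED
proxy_nodes = CASCADED_OPENSTACKS  # one proxy node per cascaded OpenStack

print(f"physical hosts: {total_hosts:,}")   # 100,000
print(f"VMs:            {total_vms:,}")     # 1,000,000
print(f"VMs per host:   {vms_per_host}")    # 10
print(f"proxy nodes:    {proxy_nodes}")     # 100
```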

Tested system on limited hardware resources

The semi-simulation test used limited hardware resources (45 servers):

• Huawei E6000 (27 servers): 2 x Intel(R) Xeon(R) CPU E5645 @ 2.40 GHz
• Huawei X6000 (5 servers): 2 x Intel(R) Xeon(R) CPU E5645 @ 2.40 GHz
• Huawei RH2288 (12 servers): 2 x Intel(R) Xeon(R) CPU E5-2620 v2 @ 2.10 GHz
• Huawei RH2285 (1 server): 2 x Intel(R) Xeon(R) CPU E5620 @ 2.40 GHz

Controller nodes and compute nodes use 1GE network interface cards (NICs).

[Figure: E6000, X6000, RH2288 and RH2285 servers connected to two S5352 devices.]

Resource allocation

• Load generator: 1 x E6000
• HAProxy: 1 x RH2285
• Keystone: 1 x E6000 + 1 x RH2285
• Glance: 1 x RH2288

• Nova API / nova-conductor: 4 x E6000 + 1 x RH2288
• Nova scheduler: 2 x E6000
• Nova RabbitMQ: 1 x X6000
• Nova DB: 1 x RH2288

• Cinder API: 1 x RH2288
• Cinder RabbitMQ: 1 x X6000
• Cinder DB: 1 x RH2288

• Neutron API: 5 x E6000 + 1 x RH2288
• Neutron RabbitMQ: 1 x X6000
• Neutron DB: 1 x RH2288

• Proxy Node: 2 x RH2288 + 2 x E6000 + 2 x X6000
• Cascaded OpenStack Simulator: 2 x RH2288 + 9 x E6000

Nova:

1. 4 x E6000 Nova-API servers were not enough, so one RH2288 was added during the test. Each Nova-API runs 8 workers and each nova-conductor runs 20 workers.
2. Nova-Scheduler was enhanced to run multiple workers, 18 workers each.

Neutron:

1. 5 x E6000 Neutron-API servers were not enough, so one RH2288 was added during the test.
2. Each Neutron server runs 12 RPC workers and 12 API workers.

Cinder is not the bottleneck.

Features tested:

1. Nova cascading: VM boot, delete, start, stop, reboot, query
2. Cinder cascading: volume create, delete
3. Neutron cascading: L2 network (VxLAN), router (DVR)
4. Glance cascading: Glance as the image location
5. Shared Keystone: using PKI tokens

* Security group, FWaaS and similar Neutron features are not implemented with cascading.

Part of the test cases

Background system: mixed-size tenants with L2/L3 networks. Scenario:

1. Small-scale tenants (4800): each tenant has 4 networks and each network serves 10 instances, 192,000 instances altogether. 100 of the small-scale tenants span 4 cascaded OpenStacks; the others each sit in a single cascaded OpenStack.

2. Medium-scale tenants (150): each tenant has 20 networks and each network serves 100 instances, 300,000 instances altogether. All of them span 5 cascaded OpenStacks.

3. Large-scale tenants (50): each tenant has 40 networks and each network serves 250 instances, 500,000 instances altogether. All of them span 10 cascaded OpenStacks.
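The per-class instance counts can be checked from the tenant, network and instance figures above. The sketch below uses only numbers from the slide; the `TenantClass` type is an illustrative helper, not part of any OpenStack API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TenantClass:
    name: str
    tenants: int
    networks_per_tenant: int
    instances_per_network: int

    @property
    def networks(self) -> int:
        return self.tenants * self.networks_per_tenant

    @property
    def instances(self) -> int:
        return self.networks * self.instances_per_network


# Tenant mix as listed on the slide.
MIX = [
    TenantClass("small", 4800, 4, 10),
    TenantClass("medium", 150, 20, 100),
    TenantClass("large", 50, 40, 250),
]

for c in MIX:
    print(f"{c.name:>6}: {c.networks:>6,} networks, {c.instances:>7,} instances")

total_tenants = sum(c.tenants for c in MIX)
total_networks = sum(c.networks for c in MIX)
total_instances = sum(c.instances for c in MIX)
print(f"total: {total_tenants:,} tenants, {total_networks:,} networks, "
      f"{total_instances:,} instances")
```

The three classes add up to 5,000 tenants, 24,200 networks and 992,000 instances, which is close to the 1 million VMs figure used elsewhere in the deck.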

Part of the test cases

1. Concurrently create 500 virtual machines on networks that already have an interface on the router.

• The test cases were run on a system that had already been populated with 1 million VMs/ports.
• Neutron carries the most load. All API requests finished successfully.
• If higher concurrency is required, more resources are needed.
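A burst of 500 concurrent requests can be sketched as follows. The deck states that SoapUI and shell scripts generated the traffic, so this is not the author's tooling: `fake_create_vm` is a hypothetical stand-in that sleeps instead of calling a real API, and the status code 202 is an assumed success value.

```python
import asyncio

CONCURRENCY = 500  # number of simultaneous requests, as in the test case


class InFlight:
    """Tracks how many simulated requests are outstanding at once."""

    def __init__(self) -> None:
        self.current = 0
        self.peak = 0


async def fake_create_vm(index: int, counter: InFlight) -> int:
    """Hypothetical stand-in for one 'create VM' API call."""
    counter.current += 1
    counter.peak = max(counter.peak, counter.current)
    await asyncio.sleep(0.01)  # simulated API latency
    counter.current -= 1
    return 202  # assumed HTTP-like success status


async def run_burst(n: int) -> tuple[list[int], int]:
    counter = InFlight()
    # gather() schedules all n coroutines before any of them completes.
    statuses = await asyncio.gather(
        *(fake_create_vm(i, counter) for i in range(n))
    )
    return list(statuses), counter.peak


statuses, peak = asyncio.run(run_burst(CONCURRENCY))
succeeded = sum(1 for s in statuses if 200 <= s < 300)
print(f"{succeeded}/{len(statuses)} requests succeeded, peak in flight: {peak}")
```

Replacing the sleep with a real client call would turn this into a load generator; the success count corresponds to the "all API requests finished successfully" check on the slide.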

Part of the test cases

2. Execute 500 concurrent API requests for add_router_interface on networks with instances.

• The test cases were run on a system that had already been populated with 1 million VMs/ports.
• Neutron carries the most load. All API requests finished successfully.
• If higher concurrency is required, more resources are needed.

Part of the test cases

3. Execute 500 concurrent API requests for remove_router_interface on networks with instances.

• The test cases were run on a system that had already been populated with 1 million VMs/ports.
• Neutron carries the most load. All API requests finished successfully.
• If higher concurrency is required, more resources are needed.

Test Conclusion

Conclusion: with 500 concurrent API requests, a system of 1 million VMs / 1 million ports across 100 simulated cascaded OpenStacks, which can be in up to 100 data centers with 100k physical hosts, works on the limited hardware resources.

If cascading is implemented for security_group, FWaaS and similar features, Neutron will need more resources to sustain 500 concurrent API requests.

Issues found in the test

Refer to the test report: http://www.slideshare.net/JoeHuang7/test-report-for-open-stack-cascading-solution-to-support-1-million-v-ms-in-100-data-centers

• Background – what is OpenStack cascading
• Semi-simulation test for 1 million VMs in 100 datacenters
• Evolution of OpenStack cascading

Evolve to unlimited scalability

[Figure: each tenant has its own cascading OpenStack (Tenant 1, Tenant 2, …, Tenant x), reached at https://tenant1.OpenStack/, https://tenant2.OpenStack/, …, https://tenantx.OpenStack/ and exposing the OpenStack API. Each tenant's virtual resources are spread over a pool of cascaded OpenStacks 1, 2, …, y, each reached through its own OpenStack API.]

1. Fully distributed: no central point at all, no scalability bottleneck.
2. Unlimited pool of OpenStack instances in one cloud or in federated clouds.
3. Provides the tenant with a seamless single-OpenStack experience, no matter how many OpenStack instances are behind it.
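The per-tenant layout can be pictured as a small data model. This is only a sketch under stated assumptions: the `CascadingOpenStack` class, the `cascaded-N` names and the assignment of tenants to cascaded OpenStacks are hypothetical; only the endpoint pattern comes from the slide.

```python
from dataclasses import dataclass, field


@dataclass
class CascadingOpenStack:
    """One control plane per tenant (hypothetical model)."""

    tenant: str
    cascaded: list[str] = field(default_factory=list)

    @property
    def endpoint(self) -> str:
        # Endpoint pattern as shown on the slide.
        return f"https://{self.tenant}.OpenStack/"


# Shared pool of cascaded OpenStacks; names are made up for illustration.
POOL = [f"cascaded-{i}" for i in range(1, 5)]

# Each tenant's cascading OpenStack uses its own subset of the pool.
clouds = {
    "tenant1": CascadingOpenStack("tenant1", POOL[0:2]),
    "tenant2": CascadingOpenStack("tenant2", POOL[1:4]),
}


def route(tenant: str) -> str:
    """Look up a tenant's own endpoint; no shared central API is involved."""
    return clouds[tenant].endpoint


print(route("tenant1"), clouds["tenant1"].cascaded)
print(route("tenant2"), clouds["tenant2"].cascaded)
```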

More information:

Wiki: https://wiki.openstack.org/wiki/OpenStack_cascading_solution

PoC Source Code: https://github.com/stackforge/tricircle