Dissecting Open Source Cloud Evolution: An OpenStack Case Study

Salman Baset, Chunqiang Tang, Byung Chul Tak, Long Wang
IBM T. J. Watson Research Center
June 26th, 2013


Page 2: Dissecting Open Source Cloud Evolution: An OpenStack Case Study

Open source cloud projects

IaaS

PaaS

SaaS

Broadly two types:
(1) Native (listed here)
(2) Adapters (e.g., Netflix on EC2)

Page 3: Dissecting Open Source Cloud Evolution: An OpenStack Case Study

Timeline for cloud open source

[Figure: timeline with axis marks at 2001, 2005, and 2006 through 2012; visible labels: Amazon EC2, Google AppEngine.]

Page 4: Dissecting Open Source Cloud Evolution: An OpenStack Case Study

Two characteristics of open source cloud systems

• Distributed multi-component architecture
– Example: OpenStack and Cloud Foundry have more than 10 components for their IaaS controllers
• Rapid development by a community of developers

Page 5: Dissecting Open Source Cloud Evolution: An OpenStack Case Study

Rapid development

• Open source cloud projects are being developed and released at a rapid pace
– OpenStack: releases every six months
– Eucalyptus: every four months
– OpenShift Enterprise: every four months
• Compare it to
– Linux kernel: 2-3 months (3.x to 3.(x+1))
– Ubuntu distro releases: every six months
• Major cloud providers are consuming OpenStack directly from the development trunk
– Two weeks behind the trunk

Page 6: Dissecting Open Source Cloud Evolution: An OpenStack Case Study

Why understand evolution?

• Evolution:
– A git commit or a major release
• Research perspective
– How do logical operations (e.g., create a VM) change across major versions?
• Developer perspective
– What is the impact of my committed changes?
• Provider perspective
– Continuous deployment and delivery
  • How does a provider gain confidence in deploying a new release in production?
  • What is the impact of new changes and configuration options on logical operations?
– Message flow, performance evaluation, fault injection, etc.

Page 7: Dissecting Open Source Cloud Evolution: An OpenStack Case Study

Methods for understanding evolution

• Static
– Source code
– Documentation
• Dynamic
– Log analysis
  • Lab and/or production
– Tracing message flow
  • With or without code instrumentation
  • Automatic correlation of message flow with logs
  • Lab and/or production
– Fault injection
– Performance study
  • Lab

Page 8: Dissecting Open Source Cloud Evolution: An OpenStack Case Study

Our solution

• Without source code modification
– Tracing
– Tracing with log correlation
– Fault injection
• Other solutions
– Google Dapper (built RPC framework leveraging callbacks)
– Twitter Zipkin (attach identifiers to requests)

Page 9: Dissecting Open Source Cloud Evolution: An OpenStack Case Study

Summary of our solution: Tracing

• This simplified diagram shows one example path for one user request.
• A path is the series of system events, such as RECEIVE and SEND, across servers, captured using the LD_PRELOAD technique.
• Prior art: vPath constructs such causal paths of system activities initiated by user requests.

[Figure: a request passes through three servers (e.g., Apache web server, application server, database server). On each server, a monitoring agent between the application and the kernel catches the RECEIVE and SEND events of the handling thread.]
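The path construction described on this slide can be sketched roughly as follows. This is a minimal illustration, not the authors' tool: it assumes the SEND and RECEIVE events have already been captured (the LD_PRELOAD interposer itself is not shown), and the `Event` fields, in particular `conn`, are hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    ts: float      # capture timestamp
    server: str    # server where the event was caught
    thread: int    # thread that issued the call
    kind: str      # "SEND" or "RECEIVE"
    conn: tuple    # (source, destination) of the connection (hypothetical field)


def build_path(events):
    """Order events by time and link each SEND to the RECEIVE on the same connection."""
    pending = {}   # conn -> SEND still waiting for its RECEIVE
    edges = []
    for ev in sorted(events, key=lambda e: e.ts):
        if ev.kind == "SEND":
            pending[ev.conn] = ev
        elif ev.kind == "RECEIVE" and ev.conn in pending:
            snd = pending.pop(ev.conn)
            edges.append((snd.server, ev.server))
    return edges


events = [
    Event(1.0, "apache", 11, "RECEIVE", ("client", "apache")),
    Event(2.0, "apache", 11, "SEND", ("apache", "appserver")),
    Event(3.0, "appserver", 21, "RECEIVE", ("apache", "appserver")),
    Event(4.0, "appserver", 21, "SEND", ("appserver", "db")),
    Event(5.0, "db", 31, "RECEIVE", ("appserver", "db")),
]
path = build_path(events)
```

The first RECEIVE has no captured SEND (the client is not monitored), so the resulting path starts at the web server.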

Page 10: Dissecting Open Source Cloud Evolution: An OpenStack Case Study

Summary of our solution: Tracing with queues

• The path breaks if there are queues in the middle.
– Apache web server inserts a message in the queue and returns
– Application server retrieves the message from the queue and performs work
– How do we correlate these messages?
• Augment path information with unique message information
– e.g., transaction ids
• Run only one logical operation in the system if there is no unique message information

[Figure: the same three-server request path as on the previous slide, with a queue between the Apache web server and the application server.]
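One way to picture the transaction-id approach is the sketch below. It assumes each captured path segment has already been tagged with the id found in its message payload; the dictionary keys (`txid`, `server`, `start`) and the sample ids are illustrative assumptions, not part of the original tool.

```python
from collections import defaultdict


def stitch_by_txid(segments):
    """Group path segments that share a transaction id, ordered by start time."""
    by_tx = defaultdict(list)
    for seg in segments:
        by_tx[seg["txid"]].append(seg)
    return {
        tx: [s["server"] for s in sorted(segs, key=lambda s: s["start"])]
        for tx, segs in by_tx.items()
    }


# Two interleaved requests; the queue hides the direct SEND/RECEIVE link.
segments = [
    {"txid": "req-1", "server": "apache", "start": 1.0},
    {"txid": "req-2", "server": "apache", "start": 1.5},
    {"txid": "req-1", "server": "appserver", "start": 2.0},
    {"txid": "req-2", "server": "appserver", "start": 2.4},
]
stitched = stitch_by_txid(segments)
```

Without such an id there is nothing to group on, which is why the slide falls back to running a single logical operation at a time.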

Page 11: Dissecting Open Source Cloud Evolution: An OpenStack Case Study

Summary of our solution: Log Analysis

• Key idea
– Combine the log information and causality (path) discovery technique

[Figure: two inputs are linked together to give improved semantics for problem diagnosis:
(1) trace low-level system calls to infer causality and understand how an application executes;
(2) monitor log files and link log file entries to observed low-level system calls.]

Page 12: Dissecting Open Source Cloud Evolution: An OpenStack Case Study

Diagram: Detecting Log Writes

• During normal run,
– Maintain a mapping between fd and file name string
– Maintain a list of known/discovered log files
• On ‘write’ system calls,
– Check the parameters to see whether it is a ‘write’ on one of the log files.
– If it is, and the data to be written contains alerting keywords such as ‘ERROR’, then this is a log write due to some error.
– This ‘write’ event will be annotated appropriately.

[Figure: fragment of a path (Request, Recv, Read, write, Send). The write event carries the parameters fd=5, offset=2048, data=“ERROR: …”. A lookup table maps fd to application and log file name; the fd column was fused in extraction (91458).]

application   log file name
Websphere     /var/log/was.log
DB2           /var/log/db2/access.log
DB2           /usr/local/db2/fie22xlv.log
DB2           /usr/local/db2/fie23xlv.log
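The bookkeeping on this slide can be sketched as a small class. This is a hedged illustration only: the class and method names are invented for the example, and it assumes some interposition layer (not shown) calls `on_open`, `on_close`, and `on_write` with the intercepted parameters.

```python
ALERT_KEYWORDS = ("ERROR",)


class LogWriteDetector:
    """Tracks fd -> file name and flags alerting writes to known log files."""

    def __init__(self, known_logs):
        self.known_logs = set(known_logs)   # known/discovered log files
        self.fd_to_name = {}                # fd -> file name string

    def on_open(self, fd, path):
        self.fd_to_name[fd] = path

    def on_close(self, fd):
        self.fd_to_name.pop(fd, None)

    def on_write(self, fd, data):
        """Return an annotation for an alerting log write, otherwise None."""
        path = self.fd_to_name.get(fd)
        if path not in self.known_logs:
            return None
        if any(keyword in data for keyword in ALERT_KEYWORDS):
            return {"log_file": path, "alert": True, "data": data}
        return None


det = LogWriteDetector({"/var/log/was.log"})
det.on_open(5, "/var/log/was.log")
hit = det.on_write(5, "ERROR: ...")          # annotated
miss = det.on_write(5, "INFO: started")      # log write, but no alert keyword
other = det.on_write(7, "ERROR: not a log")  # fd 7 is not a known log file
```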

Page 13: Dissecting Open Source Cloud Evolution: An OpenStack Case Study

Fault Injection for Building up Knowledge Base for Future Problem Diagnosis

• Inject errors, observe the application’s behavior, and build a knowledge base for future problem diagnosis
– Alter the return value of a system call, e.g., to mimic a network communication error
– Observe the logging reaction
– Repeat this for each system call and for each request
– Accumulate the observed logging reactions as a knowledge base
• When an error message is logged in a production system, use the knowledge base to infer the probability of different root causes
– Construct a Bayesian Belief Network for inference
• In the example figure, fault injection changes the return value of the ‘Read’ event to -1. This triggers an error to be logged at a later part of the path.

[Figure: path Request, Recv, Read, write, Send. The return value of Read is altered from 1024 to -1; in reaction to the injected error, a newly appeared write event follows with parameter data=“ERROR: Record missing.”]
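The knowledge-base idea can be illustrated with a much simpler stand-in than a full Bayesian Belief Network: a single application of Bayes' rule with a uniform prior over injected faults. The class below is an assumption-laden sketch; the fault labels and the "connection reset" message are hypothetical, and only "ERROR: Record missing." comes from the slide.

```python
from collections import Counter, defaultdict


class KnowledgeBase:
    """Counts which log messages follow which injected fault."""

    def __init__(self):
        self.counts = defaultdict(Counter)  # fault -> Counter of log messages
        self.trials = Counter()             # fault -> number of injections

    def record(self, fault, logged_message):
        self.trials[fault] += 1
        self.counts[fault][logged_message] += 1

    def posterior(self, logged_message):
        """P(fault | message), assuming a uniform prior over injected faults."""
        likelihood = {
            fault: self.counts[fault][logged_message] / self.trials[fault]
            for fault in self.trials
        }
        total = sum(likelihood.values())
        if total == 0:
            return {}
        return {fault: p / total for fault, p in likelihood.items()}


kb = KnowledgeBase()
kb.record("read=-1", "ERROR: Record missing.")
kb.record("read=-1", "ERROR: Record missing.")
kb.record("send=-1", "ERROR: Record missing.")
kb.record("send=-1", "ERROR: connection reset")   # hypothetical message
ranking = kb.posterior("ERROR: Record missing.")
```

Here the altered read is twice as likely a cause as the altered send, because every read injection produced the message while only half of the send injections did.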

Page 14: Dissecting Open Source Cloud Evolution: An OpenStack Case Study

Brewing complexity: Evolution of OpenStack LOC *

Release   Released   Nova      Cinder   Glance   Keystone   Quantum   Swift    Total
Austin    Oct 2010   17,288    -        -        -          -         12,979   30,627
Bexar     Feb 2011   27,734    -        3,629    -          -         16,014   47,377
Cactus    Apr 2011   43,947    -        4,927    -          -         16,665   65,539
Diablo    Sep 2011   66,395    -        9,961    12,451     -         15,591   91,947
Essex     Apr 2012   87,750    -        15,698   11,555     -         17,646   149,596
Folsom    Sep 2012   103,637   31,241   20,271   13,939     42,118    19,114   230,320
Grizzly   Apr 2013   120,968   49,797   21,261   20,071     60,485    23,035   321,081

* CRLF (raw line counts), not Python LOC

Methodology
wc -l `find . | grep -E '*\.py' | grep -v test | grep -v 'doc'`
wc -l `find . | grep -E '*\.sh' | grep test | grep -v 'doc'`
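For readers who prefer a script to the shell pipeline, the first methodology command can be approximated as below. This is an approximation, not an exact port: it matches a true `.py` suffix (the `grep` pattern matches `.py` anywhere in the path), and the function name and parameters are chosen for this example.

```python
import os


def count_lines(root, suffix=".py", exclude=("test", "doc")):
    """Sum newline counts (like `wc -l`) over files under root with the given
    suffix, skipping any relative path that contains an excluded substring."""
    total = 0
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if not name.endswith(suffix):
                continue
            path = os.path.join(dirpath, name)
            # Filter on the path relative to root, as `find .` would print it.
            if any(word in os.path.relpath(path, root) for word in exclude):
                continue
            with open(path, "rb") as fh:
                total += fh.read().count(b"\n")
    return total


# Example: count_lines("nova") run from a checkout of one release.
```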

Page 15: Dissecting Open Source Cloud Evolution: An OpenStack Case Study

OpenStack logical architecture (grizzly+net+cinder)

[Figure: component diagram.
Group labels on the slide: AUTHENTICATION, IMAGE REPO, COMPUTE CONTROLLER, BLOCK STORAGE, Compute nodes, Volume nodes.
Components shown: dashboard (horizon); keystone, keystone database; glance-api, glance-registry, glance database, glance API (REST); nova-api, nova-scheduler, nova-conductor, nova-compute, nova-network, nova-cert, nova-cells, nova database; cinder-api, cinder-scheduler, cinder-volume, cinder db.
Links are labeled REST or AMQP (one AMQP bus for nova, one for cinder).]

Page 16: Dissecting Open Source Cloud Evolution: An OpenStack Case Study

OpenStack logical architecture (grizzly+quantum+cinder)

[Figure: component diagram.
Group labels on the slide: AUTHENTICATION, IMAGE REPO, COMPUTE CONTROLLER, BLOCK STORAGE, NETWORK CONTROLLER, Compute nodes, Volume nodes.
Components shown: dashboard (horizon); keystone, keystone database; glance-api, glance-registry, glance database, glance API (REST); nova-api, nova-scheduler, nova-conductor, nova-compute, nova-cert, nova-cells, nova database; cinder-api, cinder-scheduler, cinder-volume, cinder db; quantum-server, quantum db, quantum-dhcp, quantum-plugin, quantum-metadata agent, quantum-l3 agent.
Links are labeled REST or AMQP (one AMQP bus each for nova, cinder, and quantum).]

Page 17: Dissecting Open Source Cloud Evolution: An OpenStack Case Study

OpenStack tracing

• Understand OpenStack data and message flow for logical operations, e.g.,
– Create a VM
– Delete a VM
– List VMs
– Create a volume
– Add or remove volume to a VM
– Create a floating IP address
– Add or remove floating IP address from a VM
– Create or destroy a virtual network
• Understand
– REST calls
– Data flow
– AMQP flow
– Timing information
• Build data consistency tool
• Gather data for generating performance load
• Build a performance model

Page 18: Dissecting Open Source Cloud Evolution: An OpenStack Case Study

Key observations from tracing OpenStack (1/2)

• OpenStack is evolving very rapidly, with significant behavior changes from one release to another.
• Total tables
– Grizzly: 105 tables (160 with nova shadow tables), 53 in Diablo
• Creating a VM (grizzly)
– 139 SELECT queries, 37 INSERT queries, 74 UPDATE queries
– 12 tables are touched for INSERT and UPDATE
  • In Diablo (Sep 2011), there were 450 SELECT, 4 INSERT, and 9 UPDATE queries
– 717K read, 458K write
– 655 send() calls to AMQP, 414 recv() calls
• Deleting a VM
– Only a single record is deleted from the database (the rest are archived)
• Request-id
– Instance and request-id are stored in the database (but only after updating quota) and before a request is sent to the scheduler.
• Quota management
– Entries are inserted in the database to indicate resource allocation for a VM. Negative or NULL entries are inserted for deallocation. Each quota entry has an expiration time (one day). E.g., core, fixedIP, etc.
• VM state and task state
– networking, block_device_mapping, spawning
• Keystone
– Token verification is optimized in Grizzly using caches (for flavor=keystone) and PKI

Page 19: Dissecting Open Source Cloud Evolution: An OpenStack Case Study

Key observations from tracing OpenStack (2/2)

• Development of a data consistency checking tool
– Orphan iptable rules (not associated with VM transaction) => security holes
– Orphan data in tables due to errors in VM creation etc. => audit and clean up
– Orphan virsh data => audit and clean up
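At its core, such a consistency check is a comparison between the resources observed on a host and the instances the database knows about. The sketch below assumes both have already been collected into plain Python structures; the function name, the identifiers, and the sample data are all hypothetical.

```python
def find_orphans(observed, known_instances):
    """Return the observed resources whose owning instance is not known."""
    return {
        resource: instance
        for resource, instance in observed.items()
        if instance not in known_instances
    }


# Hypothetical snapshot: instances recorded in the database, iptables rules
# found on the host, and domains reported by virsh.
active_instances = {"vm-1", "vm-2"}
iptables_rules = {"rule-a": "vm-1", "rule-b": "vm-9"}
libvirt_domains = {"dom-1": "vm-1", "dom-2": "vm-2", "dom-3": "vm-7"}

orphan_rules = find_orphans(iptables_rules, active_instances)
orphan_domains = find_orphans(libvirt_domains, active_instances)
```

Anything returned here is a candidate for audit and cleanup; an orphan iptables rule is the case the slide flags as a security hole.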

Page 20: Dissecting Open Source Cloud Evolution: An OpenStack Case Study

Methodology

• Run OpenStack on a single machine (with and without timers disabled)
• Diablo, Essex, Folsom, Grizzly
• Ubuntu, RabbitMQ, MySQL
• Use curl to send API requests to OpenStack
– flavor=keystone
– Image has three parts
  • AMI, ram disk, kernel image
– For keystone, PKI-based token verification is also used in grizzly
– Each service’s token was created before issuing a create or delete VM call
• Use our technique to capture message interaction, generate flow, run message analytics, and insert faults (ongoing)
• curl_createserver.sh

AUTHTOKEN=$1
curl -i http://9.47.240.166:8774/v2/3283d689d02c41248fc82c202e82055a/servers \
  -X POST \
  -H "X-Auth-Project-Id: admin" \
  -H "User-Agent: python-novaclient" \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -H "X-Auth-Token: ${AUTHTOKEN}" \
  -d '{"server": {"name": "test1", "imageRef": "de8882fb-94b3-4105-a212-c0a7fd8ab1e9", "flavorRef": "1", "max_count": 1, "min_count": 1, "networks": [{"uuid": "48de54f9-2a60-4f28-9740-d6317086c32a"}] }}'
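The same create-server request can be assembled with Python's standard library, which makes the structure of the call easier to see. This is a sketch, not the script used in the study: the function name and parameters are invented here, the endpoint and ids are copied from the curl example, and the request is only built, not sent, so that the snippet runs offline.

```python
import json
import urllib.request


def build_create_server_request(base_url, tenant_id, token, name,
                                image_ref, flavor_ref, network_uuid):
    """Build (but do not send) the POST request for creating a server."""
    body = {"server": {
        "name": name,
        "imageRef": image_ref,
        "flavorRef": flavor_ref,
        "max_count": 1,
        "min_count": 1,
        "networks": [{"uuid": network_uuid}],
    }}
    headers = {
        "X-Auth-Project-Id": "admin",
        "User-Agent": "python-novaclient",
        "Content-Type": "application/json",
        "Accept": "application/json",
        "X-Auth-Token": token,
    }
    return urllib.request.Request(
        f"{base_url}/v2/{tenant_id}/servers",
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


req = build_create_server_request(
    "http://9.47.240.166:8774", "3283d689d02c41248fc82c202e82055a",
    "TOKEN", "test1", "de8882fb-94b3-4105-a212-c0a7fd8ab1e9", "1",
    "48de54f9-2a60-4f28-9740-d6317086c32a",
)
# urllib.request.urlopen(req) would issue the call against a live endpoint.
```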

Page 21: Dissecting Open Source Cloud Evolution: An OpenStack Case Study

SQL queries in create, delete, list VMs and tables touched
How to read: tables touched (SQL queries) - [number of tables with INSERT or UPDATE]

                  Diablo       Essex        Folsom (Sep 2012)   Folsom        Grizzly (Apr 2013)   Grizzly
                  (Sep 2011)   (Apr 2012)   nova-network        quantum       nova-network         quantum
SELECT (create)   16 (450)     17 (95)      21 (409)            26 (560)      20 (139)             37 (343)
SELECT (delete)   8 (37)       10 (36)      17 (63)             23 (241)      13 (36)              31 (192)
SELECT (list)     5 (31)       4 (12)       6 (24)              7 (25)        1 (1)                1 (1)
INSERT (create)   4 (4)        4 (4)        8 (23)              9 (24)        10 (37)              13 (40)
INSERT (delete)   0 (0)        0 (0)        1 (3)               1 (3)         3 (6)                4 (6)
INSERT (list)     0 (0)        0 (0)        0 (0)               0 (0)         0 (0)                0 (0)
UPDATE (create)   2 (9) - 5    3 (12) - 5   7 (60) - 11         7 (59) - 13   8 (74) - 13          8 (70) - 16
UPDATE (delete)   4 (6) - 4    6 (10) - 6   8 (22) - 9          8 (25) - 9    10 (31) - 11         10 (26) - 12
UPDATE (list)     0 (0) - 0    0 (0) - 0    0 (0) - 0           0 (0) - 0     0 (0) - 0            0 (0) - 0
DELETE (create)   0 (0) - 0    0 (0) - 0    0 (0) - 0           0 (0) - 0     0 (0) - 0            0 (0) - 0
DELETE (delete)   1 (1)        1 (1)        1 (1)               1 (1)         1 (1)                1 (1)

Tables
– Diablo: 53 total; 4 (glance), 9 (keystone), 39 (nova)
– Essex: 63 total; 4 (glance), 10 (keystone), 49 (nova)
– Folsom: 67 with nova-network, 83 with quantum; 5 (glance), 10 (keystone), 52 (nova); 16 (quantum) on top of the Folsom tables
– Grizzly: 136 with nova-network, 160 with quantum; 6 (glance), 19 (keystone), 111 (nova), 55 shadow nova tables; 24 (quantum) on top of the Grizzly tables

Page 22: Dissecting Open Source Cloud Evolution: An OpenStack Case Study

Keystone REST flow for creating a server (grizzly)

[Figure: sequence diagram with participants User, Keystone, nova-api, glance-api, and glance-registry. Messages shown, in order of appearance:
– Credentials
– Token (role)
– Get services and endpoints + token
– Services + endpoints
– Token + CreateInstance
– Verify + token
– Token + GetImage (to glance-api)
– Verify + token
– Token + GetImage (to glance-registry)
– Verify + token
– image
– image
– Accepted (CreateInstance Success)]

Page 23: Dissecting Open Source Cloud Evolution: An OpenStack Case Study

Create a VM: overview (1/4)

• Which OpenStack component is issuing SELECT queries?

Role                       Component         Diablo   Essex   Folsom-nova-network   Folsom-quantum   Grizzly-nova-network   Grizzly-quantum
Auth.                      keystone          422      54      358                   484              82                     243
API server                 nova-api          4        11      11                    9                10                     10
Agent on compute node      nova-compute      4        5       13                    14               0                      0
Controller agent           nova-conductor    n/a      n/a     n/a                   n/a              15                     16
Network agent on compute   nova-network      13       19      17                    n/a              20                     n/a
Scheduler                  nova-scheduler    1        2       1                     1                4                      4
Image registry server      glance-registry   6        4       8                     8                8                      8
Network API server         quantum-server    n/a      n/a     n/a                   44               n/a                    62

Page 24: Dissecting Open Source Cloud Evolution: An OpenStack Case Study

Create a VM: overview (2/4)

• How many HTTP requests with respect to SELECT calls? Each cell gives the SELECT count, followed in square brackets by the REST calls received (shown in red on the original slide).

Component         Diablo              Essex         Folsom-nova-network   Folsom-quantum       Grizzly-nova-network   Grizzly-quantum
keystone          422 [30 GET]        54 [9 GET]    358 [17 GET]          484 [23 GET]         82 [3 GET]             243 [6 GET, 2 POST]
nova-api          4 [1 POST]          11 [1 POST]   11 [1 POST]           9 [1 POST]           10 [1 POST]            10 [1 POST]
nova-compute      4                   5             13                    14                   0                      0
nova-conductor    n/a                 n/a           n/a                   n/a                  15                     16
nova-network      13                  19            17                    n/a                  20                     n/a
nova-scheduler    1                   2             1                     1                    4                      4
glance-api        0 [2 GET, 5 HEAD]   0 [4 HEAD]    0 [8 HEAD]            0 [8 HEAD]           0 [8 HEAD]             0 [8 HEAD]
glance-registry   6 [7 GET]           4 [4 GET]     8 [8 GET]             8 [8 GET]            8 [8 GET]              8 [8 GET]
quantum-server    n/a                 n/a           n/a                   44 [5 GET, 1 POST]   n/a                    62 [9 GET, 1 POST]

Page 25: Dissecting Open Source Cloud Evolution: An OpenStack Case Study

Why so many SELECT queries in keystone?

• In Diablo, for every keystone GET, 14 SELECT queries are issued, except for the first query (16)
• In Essex, for every keystone GET, 6 SELECT queries are issued
• In Folsom-nova-net/quantum, for every keystone GET, 21 SELECT queries are issued, except for the first query (22)
• In Grizzly-nova-net, 27 SELECT queries for each request except for the first (1).
– Keystone tokens are also cached, so subsequent queries do not result in full keystone token authentication
• If PKI token verification is used, the number of SELECT queries sent by keystone drops from 82 to 7.

keystone SELECT queries [REST calls received]:
Diablo 422 [30 GET]; Essex 54 [9 GET]; Folsom-nova-network 358 [17 GET]; Folsom-quantum 484 [23 GET]; Grizzly-nova-network 82 [3 GET]; Grizzly-quantum 243 [6 GET, 2 POST]
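The per-GET figures for Diablo, Essex, and Folsom can be checked against the keystone totals with a few lines of arithmetic. The helper below is written for this check only; it takes the GET counts and per-GET SELECT counts from the slide and assumes every GET after the first costs the same.

```python
def keystone_selects(gets, per_get, first=None):
    """Total SELECTs when the first GET costs `first` and the rest cost `per_get`."""
    if first is None:
        first = per_get
    return first + (gets - 1) * per_get


totals = {
    "Diablo": keystone_selects(30, 14, first=16),               # 16 + 29 * 14
    "Essex": keystone_selects(9, 6),                            # 9 * 6
    "Folsom-nova-network": keystone_selects(17, 21, first=22),  # 22 + 16 * 21
    "Folsom-quantum": keystone_selects(23, 21, first=22),       # 22 + 22 * 21
}
```

The results are 422, 54, 358, and 484, which match the keystone row for those four configurations.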

Page 26: Dissecting Open Source Cloud Evolution: An OpenStack Case Study

Create a VM: overview (3/4)

• What if there is no keystone?

Keystone enabled
         Diablo   Essex   Folsom-nova-network   Folsom-quantum   Grizzly-nova-network   Grizzly-quantum
SELECT   450      95      409                   560              139                    343
INSERT   4        4       23                    24               37                     40
UPDATE   6        10      60                    58               74                     70

Keystone disabled
         Diablo   Essex   Folsom-nova-network   Folsom-quantum   Grizzly-nova-network   Grizzly-quantum
SELECT   28       41      51                    76               57                     100
INSERT   4        4       23                    24               37                     38
UPDATE   6        10      60                    58               74                     70
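Subtracting the two SELECT rows gives the number of queries attributable to keystone in each configuration. The snippet below does that subtraction with the slide's numbers and also expresses the result as a rounded share of all SELECT queries; the variable names are chosen for this example.

```python
releases = ["Diablo", "Essex", "Folsom-nova-network", "Folsom-quantum",
            "Grizzly-nova-network", "Grizzly-quantum"]
select_enabled = [450, 95, 409, 560, 139, 343]    # keystone enabled
select_disabled = [28, 41, 51, 76, 57, 100]       # keystone disabled

# SELECT queries that disappear when keystone is turned off.
overhead = {r: e - d for r, e, d in zip(releases, select_enabled, select_disabled)}

# The same overhead as a percentage of all SELECT queries, rounded.
share = {r: round(100 * overhead[r] / e) for r, e in zip(releases, select_enabled)}
```

The differences (422, 54, 358, 484, 82, 243) equal the SELECT counts reported for the keystone component, so in Diablo roughly 94% of the SELECT queries for creating a VM come from keystone.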

Page 27: Dissecting Open Source Cloud Evolution: An OpenStack Case Study

Create a VM: overview (4/4)

• Which components are issuing INSERT and UPDATE queries? (keystone enabled for all)

INSERT           Diablo   Essex   Folsom-nova-network   Folsom-quantum   Grizzly-nova-network   Grizzly-quantum
keystone         -        -       -                     -                -                      2
nova-api         3 (3)    3 (3)   6 (10)                6 (10)           7 (21)                 7 (21)
nova-compute     -        -       1 (12)                2 (12)           -                      -
nova-conductor   -        -       -                     -                2 (13)                 2 (13)
nova-network     1        1       1                     -                2                      -
nova-scheduler   -        -       -                     -                1                      1
quantum-server   -        -       -                     2                -                      3

UPDATE           Diablo   Essex   Folsom-nova-network   Folsom-quantum   Grizzly-nova-network   Grizzly-quantum
nova-api         1        1       9                     9                7                      7
nova-compute     1 (5)    1 (6)   4 (47)                4 (47)           -                      -
nova-conductor   -        -       -                     -                5 (59)                 5 (59)
nova-network     3        4       3                     -                6                      -
nova-scheduler   -        1       1                     1                2                      2

UPDATE values whose column could not be recovered from the extraction: nova-network 1 (trailing value), quantum-server 1.

Page 28: Dissecting Open Source Cloud Evolution: An OpenStack Case Study

Tables touched for create VM in grizzly-nova-net

SELECT (Grizzly nova-net)
2 block_device_mapping
6 compute_node_stats
6 fixed_ips
1 floating_ips
8 images
4 instance_actions
2 instance_actions_events
1 instance_info_caches
4 networks
2 provider_fw_rules
5 quotas
4 quota_usages
2 reservations
7 role
1 security_group_rules
3 security_groups
4 virtual_interfaces

INSERT (Grizzly nova-net)
12 compute_node_stats
1 instance_actions
2 instance_actions_events
1 instance_id_mappings
1 instance_info_caches
1 instances
13 instance_system_metadata
4 reservations
1 security_group_instance_association
1 virtual_interfaces

UPDATE (Grizzly nova-net)
6 compute_nodes
44 compute_node_stats
3 fixed_ips
2 instance_actions_events
1 instance_info_caches
8 instances
8 quota_usages
2 reservations

Page 29: Dissecting Open Source Cloud Evolution: An OpenStack Case Study

Data flow for creating a server (grizzly) (1/2)

Lanes on the slide: nova-api, nova-scheduler, nova-conductor, nova-compute

• Create server
• Check quota
INSERT INTO reservations (instances, expires, usageid1)
INSERT INTO reservations (ram, expires, usageid2)
INSERT INTO reservations (core, expires, usageid3)
– Create reservations. No request id. Default: expires after a day if not updated.
UPDATE quota_usages (usageid1)
UPDATE quota_usages (usageid2)
UPDATE quota_usages (usageid3)
– Update quotas.
– What if nova-api dies here? Then quota updates can potentially be permanent until expired or cleaned up.
• Check if images exist
INSERT INTO instances (‘instance_uuid’)
– Create instance in the database.
INSERT INTO security_group_instance_association (‘instance_uuid’)
INSERT INTO instance_system_metadata (‘image_kernel_id, instance_uuid’)
INSERT INTO instance_system_metadata (‘instance_type_memory_mb’)
INSERT INTO instance_system_metadata (‘instance_type_swap’)
INSERT INTO instance_system_metadata (‘instance_type_vcpu_weight’)
INSERT INTO instance_system_metadata (‘instance_type_root_gb’)
INSERT INTO instance_system_metadata (‘instance_type_id’)
INSERT INTO instance_system_metadata (‘image_ramdisk_id’)
INSERT INTO instance_system_metadata (‘instance_type_name’)
INSERT INTO instance_system_metadata (‘instance_type_ephemeral_gb’)
INSERT INTO instance_system_metadata (‘instance_type_rxtx_factor’)
INSERT INTO instance_system_metadata (‘instance_type_flavorid’)
INSERT INTO instance_system_metadata (‘instance_type_flavorid’)
INSERT INTO instance_system_metadata (‘image_base_image_ref’)
INSERT INTO instance_info_caches (‘instance_uuid’)

Page 30: Dissecting Open Source Cloud Evolution: An OpenStack Case Study

Data flow for creating a server (grizzly) (2/2)

Lanes on the slide: nova-api, nova-scheduler, nova-compute, nova-conductor, nova-network

Steps shown, in order of appearance:
INSERT INTO instance_id_mappings (‘instance_uuid’)
Update time in quota_usages table
INSERT INTO instance_actions (instance_uuid, request_id)
– This request is key: it associates the instance id with a request id. But it occurs after quotas and reservations have been updated. BAD!
Send to scheduler (request_id)
INSERT INTO instance_actions_events (scheduling)
INSERT INTO instance_actions_events (compute_run)
Libvirt: create instance
UPDATE instances (task_state = NULL)
GET images from glance
UPDATE instances (host, node)
UPDATE compute_node_stats *
INSERT INTO compute_node_stats
UPDATE instances (task_state=networking)

Page 31: Dissecting Open Source Cloud Evolution: An OpenStack Case Study

How many SQL queries for create VM before a request is sent to:

scheduler
         Diablo   Essex   Folsom-nova-network   Folsom-quantum   Grizzly-nova-network   Grizzly-quantum
SELECT   202      10      27                    289              98                     138
INSERT   0        0       3                     10               21                     21
UPDATE   0        0       3                     9                7                      7

compute
         Diablo   Essex   Folsom-nova-network   Folsom-quantum   Grizzly-nova-network   Grizzly-quantum
SELECT   371      52      292                   290              100                    140
INSERT   3        3       10                    10               22                     22
UPDATE   1        2       10                    10               8                      8

Totals for the whole create VM operation
         Diablo   Essex   Folsom-nova-network   Folsom-quantum   Grizzly-nova-network   Grizzly-quantum
SELECT   450      95      409                   560              139                    343
INSERT   4        4       23                    24               37                     40
UPDATE   6        10      60                    58               74                     70

Page 32: Dissecting Open Source Cloud Evolution: An OpenStack Case Study

Create VM total message bytes – read() or recv()

Component         Diablo            Essex             Folsom-nova-network   Folsom-quantum    Grizzly-nova-network
keystone          154841            23090             198493                269920            41888
nova-api          65596             81836             75507                 21435             22766
nova-compute      155233 (113701)   157660 (105460)   202163 (163107)       206003 (167383)   106396 (110721)
nova-conductor    n/a               n/a               n/a                   n/a               371614
nova-network      98101             77184             62509                 n/a               103100
nova-scheduler    3380              38477             16465                 19688             29674
glance-registry   36764             16632             45798                 46104             30494
glance-api        17440             6326              32386                 32716             11248
quantum-server    n/a               n/a               n/a                   46533             n/a
quantum-dhcp      n/a               n/a               n/a                   3722              n/a
Total             531355            401205            582185                650,615           717,180

Excludes any image transfer

Page 33: Dissecting Open Source Cloud Evolution: An OpenStack Case Study

Create VM total message bytes – write() or send()

Component         Diablo    Essex     Folsom-nova-network   Folsom-quantum    Grizzly-nova-network
keystone          115606    15129     128957                174884            25364
nova-api          50704     70995     25449                 20265             22693
nova-compute      99899     109136    127436                126143 (122363)   74864 (68352)
nova-conductor    n/a       n/a       n/a                   n/a               222228
nova-network      74106     63446     46123                 n/a               57321
nova-scheduler    2964      30182     17662                 21993             26997
glance-registry   23095     11006     18210                 18196             20329
glance-api        8841      5038      10226                 10220             8705
quantum-server    n/a       n/a       n/a                   25986             n/a
quantum-dhcp      n/a       n/a       n/a                   84                n/a
Total             375,447   305,156   374,499               403,507           458,501

Page 34: Dissecting Open Source Cloud Evolution: An OpenStack Case Study

Create a VM: Message exchange with RabbitMQ – send()

Component        Diablo      Essex       Folsom-nova-network   Folsom-quantum   Grizzly-nova-network
nova-api         23 (3392)   35 (4769)   23 (8600)             11 (5254)        11 (4062)
nova-compute     18 (1316)   18 (1430)   18 (3782)             1 (21)           306 (67874)
nova-network     31 (1816)   45 (1018)   32 (2159)             n/a              14 (1786)
nova-scheduler   23 (2392)   12 (2976)   12 (7388)             12 (9737)        7 (11567)
nova-conductor   n/a         n/a         n/a                   n/a              317 (82717)
quantum-server   n/a         n/a         n/a                   36 (4498)        n/a
quantum-dhcp     n/a         n/a         n/a                   4 (84)           n/a

Page 35: Dissecting Open Source Cloud Evolution: An OpenStack Case Study

Create a VM: Message exchange with RabbitMQ – recv()

Component        Diablo      Essex       Folsom-nova-network   Folsom-quantum   Grizzly-nova-network
nova-api         16 (833)    25 (1609)   16 (833)              7 (328)          7 (328)
nova-compute     14 (3442)   14 (2369)   14 (8752)             1 (9479)         230 (94463)
nova-network     18 (1808)   26 (3045)   19 (7298)             n/a              8 (2699)
nova-scheduler   8 (2479)    8 (2918)    8 (5307)              8 (5345)         4 (3861)
nova-conductor   n/a         n/a         n/a                   n/a              172 (58721)
quantum-server   n/a         n/a         n/a                   24 (396)         n/a
quantum-dhcp     n/a         n/a         n/a                   4 (3726)         n/a

Page 36: Dissecting Open Source Cloud Evolution: An OpenStack Case Study

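Create a VM: send() and recv() grizzly-nova net

[Figure: two charts of call counts per component, one for send() and one for recv().]

Component                send()   recv()
nova-compute (comp)      308      2176
nova-conductor (cond)    317      172
glance-api (gapi)        17       1667
glance-registry (greg)   9        139
keystone (keys)          3        3
nova-api (napi)          19       5429
nova-network (netw)      19       12
nova-scheduler (sche)    7        4

Single byte recv in webob library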

Page 37: Dissecting Open Source Cloud Evolution: An OpenStack Case Study

Conclusions

• Complexity is brewing under OpenStack. Beware!
• Build distributed applications with tracing in mind
• Flow diff
– Through an interactive page
• Ongoing and future work
– Fault injection and log correlation
– Leverage tool for other projects, e.g., CloudFoundry
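A flow diff can be as simple as comparing per-component counts between two releases. The sketch below is one possible shape for it, not the interactive page mentioned above: the function name is invented, and the sample data are the create-VM SELECT counts per component for the Folsom and Grizzly nova-network configurations from the overview (1/4) slide.

```python
def flow_diff(old, new):
    """Per-component change between two flows of counts; unchanged keys are omitted."""
    diff = {}
    for key in sorted(set(old) | set(new)):
        before, after = old.get(key, 0), new.get(key, 0)
        if before != after:
            diff[key] = after - before
    return diff


folsom = {"keystone": 358, "nova-api": 11, "nova-compute": 13,
          "nova-network": 17, "nova-scheduler": 1, "glance-registry": 8}
grizzly = {"keystone": 82, "nova-api": 10, "nova-compute": 0,
           "nova-conductor": 15, "nova-network": 20, "nova-scheduler": 4,
           "glance-registry": 8}
changes = flow_diff(folsom, grizzly)
```

The diff makes the architectural shift visible at a glance: nova-compute stops issuing SELECT queries while nova-conductor appears, and keystone's query count drops sharply.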