anatomy of an action

Post on 16-Aug-2015

Mining the event storm

Vladik Romanovsky, Engineer

The Anatomy of an Action

Gordon Chung, Engineer

OpenStack is a wonderful place

when you use OpenStack you might see this

WTF???

if you’re lucky, you might find the real error!

[instance: e7933ceb-d1e7-42fe-9f37-d275ebd375bd] Instance failed to spawn
Traceback (most recent call last):
......
ProcessExecutionError: Unexpected error while running command.
Command: qemu-img convert -O raw /opt/stack/data/nova/instances/_base/7434c85f2968d2cfb05b07d8c769d7d938cec5e8.part /opt/stack/data/nova/instances/_base/7434c85f2968d2cfb05b07d8c769d7d938cec5e8.converted
Exit code: 1
Stdout: u''
Stderr: u'qemu-img: error while reading sector 0: Input/output error\n'

Debugging be Hard

• actions consist of multiple steps
• asynchronous calls that can cause timing issues
• distributed nature of OpenStack can make it difficult to debug
• parsing log files is easy -- if you’re a robot

Use Case: Creating an Instance

Creating an Instance

api | conductor | scheduler | compute manager

build network | build storage | start guest

Creating an Instance

api | conductor | scheduler | compute manager

build network | build storage | start guest

FAIL HERE

Creating an Instance

api | conductor | scheduler | compute manager

build network | build storage | start guest

notification bus

Creating an Instance

api | conductor | scheduler | compute manager

build network | build storage | start guest

FAIL HERE

notification bus

OpenStack Events

• most services emit notifications for some discrete events
• the content of a notification represents the state of the environment, resource, etc. at that point in time
• notifications are defined by a type that describes their content
  • nova: compute.instance.create.*, scheduler.create_volume
  • neutron: port.create.*, network.create.*
  • cinder: volume.detach.*, volume.create.*
  • keystone: identity.user.*, identity.project.*
  • and a lot more...
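The wildcard event types above lend themselves to simple pattern matching. A minimal sketch, assuming a consumer only cares about a handful of types; the `WATCHED_TYPES` list and `is_watched` helper are hypothetical names used for illustration, not part of any OpenStack service:

```python
from fnmatch import fnmatchcase

# Hypothetical subscription list built from event types named on the slide.
WATCHED_TYPES = [
    "compute.instance.create.*",
    "port.create.*",
    "volume.create.*",
    "identity.user.*",
]


def is_watched(event_type, patterns=WATCHED_TYPES):
    """Return True if event_type matches any of the wildcard patterns."""
    return any(fnmatchcase(event_type, pattern) for pattern in patterns)


if __name__ == "__main__":
    for event_type in (
        "compute.instance.create.start",
        "compute.instance.create.error",
        "image.upload",
    ):
        print(event_type, is_watched(event_type))
```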

Creating an Instance

api | conductor | scheduler | host manager

build network | build storage | start guest

notification bus

consumer?

Ceilometer

• telemetry project in OpenStack
• notification agent which consumes messages
• listens to the queues of each OpenStack service
• picks specific measurement values from notifications and builds meters

but wait, there’s more!

every notification is also captured as an Event

Creating an Instance

api | conductor | scheduler | host manager

build network | build storage | start guest

notification bus

ceilometer notification agent | Meters | Events

Ceilometer Events

• initially implemented in Icehouse (part of StackTach integration)

• an Event represents the state of an object in an OpenStack service at a point in time.

• built from INFO and ERROR level notifications emitted by all services

• ability to normalise messages by mapping key attributes from notification messages to a common name

Ceilometer Event Model

• message id
• event type
• timestamp
• traits
  • queryable, indexed attributes
  • i.e. payload.x.y.z => attr1
• raw
  • full notification
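The model above can be sketched as a small data structure plus a trait mapping. This is an illustration only, not Ceilometer's implementation: the `Event` class, `get_path`, `to_event`, the trait names, and the shape of the sample notification are all assumptions.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


def get_path(data, dotted):
    """Follow a dotted path such as 'payload.x.y.z'; return None if absent."""
    current = data
    for key in dotted.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


@dataclass
class Event:
    message_id: str
    event_type: str
    timestamp: str
    traits: Dict[str, Any] = field(default_factory=dict)  # queryable attributes
    raw: Dict[str, Any] = field(default_factory=dict)     # full notification


def to_event(notification, trait_map):
    """Build an Event, copying each mapped path into a named trait."""
    traits = {}
    for name, path in trait_map.items():
        value = get_path(notification, path)
        if value is not None:
            traits[name] = value
    return Event(
        message_id=notification["message_id"],
        event_type=notification["event_type"],
        timestamp=notification["timestamp"],
        traits=traits,
        raw=notification,
    )


if __name__ == "__main__":
    # Hypothetical notification; real payloads differ per service.
    sample = {
        "message_id": "m-1",
        "event_type": "compute.instance.create.end",
        "timestamp": "2015-08-16T10:00:00",
        "payload": {"instance_id": "abc", "state": "active"},
    }
    print(to_event(sample, {"instance_id": "payload.instance_id"}).traits)
```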

Ceilometer Event Processing

• all events are forced through pipelines
• events can be published to multiple targets
  • database
  • file
  • queue
  • http
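The fan-out to multiple targets can be pictured as one pipeline calling several publishers. A rough sketch under stated assumptions: `EventPipeline` and the two publisher factories are hypothetical, with an in-memory list standing in for a database and a text stream standing in for a file.

```python
import io
import json


def make_list_publisher(store):
    """Publisher that appends events to a list (stand-in for a database)."""
    def publish(event):
        store.append(event)
    return publish


def make_file_publisher(stream):
    """Publisher that writes one JSON document per line to a text stream."""
    def publish(event):
        stream.write(json.dumps(event, sort_keys=True) + "\n")
    return publish


class EventPipeline:
    """Hand every event to each configured publisher in turn."""

    def __init__(self, publishers):
        self.publishers = list(publishers)

    def process(self, event):
        for publish in self.publishers:
            publish(event)


if __name__ == "__main__":
    store, buffer = [], io.StringIO()
    pipeline = EventPipeline(
        [make_list_publisher(store), make_file_publisher(buffer)]
    )
    pipeline.process({"event_type": "volume.create.end"})
    print(len(store), buffer.getvalue().strip())
```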

Benefits of Centralised Events

• potential loss of data if logging locally
• normalisation of data
• event flow across services gives context
  • individual events mean nothing
  • end-to-end flow means something

connecting the dots…

Debugging be Easier

• we wanted a view to show all the events of a given action by a resource
  • be able to see any errors
  • temporally aware -- order of events
  • show the flow and context of events
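One way to picture such a view: group events by the request that caused them, order each group by time, and surface the first error. A minimal sketch, assuming each event is a dict with `request_id`, `timestamp` (ISO 8601) and `event_type` keys, and that error events end in `.error`; these field names and the suffix convention are assumptions for illustration.

```python
from collections import defaultdict
from datetime import datetime


def build_timelines(events):
    """Group events by request id and sort each group by timestamp."""
    grouped = defaultdict(list)
    for event in events:
        grouped[event["request_id"]].append(event)
    for timeline in grouped.values():
        timeline.sort(key=lambda e: datetime.fromisoformat(e["timestamp"]))
    return dict(grouped)


def first_error(timeline):
    """Return the earliest event whose type ends in '.error', or None."""
    for event in timeline:
        if event["event_type"].endswith(".error"):
            return event
    return None


if __name__ == "__main__":
    # Hypothetical events for a single request.
    events = [
        {"request_id": "req-1", "timestamp": "2015-08-16T10:00:05",
         "event_type": "compute.instance.create.error"},
        {"request_id": "req-1", "timestamp": "2015-08-16T10:00:01",
         "event_type": "compute.instance.create.start"},
    ]
    for request_id, timeline in build_timelines(events).items():
        print(request_id, [e["event_type"] for e in timeline])
        print("first error:", first_error(timeline))
```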

postmortem analysis using Elasticsearch

Elasticsearch

• document-oriented, schema-free database
• built on top of Apache Lucene
  • focused on providing full-text search capabilities
• distributed, highly available, real-time database
• Kibana - GUI for the database
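For postmortem analysis, a search for one request's events in time order might look like the query body below. This is a sketch only: the field names `traits.request_id` and `timestamp` are assumptions about how events were indexed, `build_request_query` is a hypothetical helper, and the bool/filter form follows current Elasticsearch query DSL, which older releases expressed differently.

```python
import json


def build_request_query(request_id, start, end):
    """Build a query body: events for one request in a time window, oldest first."""
    return {
        "query": {
            "bool": {
                "filter": [
                    # Field names are assumed, not taken from the talk.
                    {"term": {"traits.request_id": request_id}},
                    {"range": {"timestamp": {"gte": start, "lte": end}}},
                ]
            }
        },
        "sort": [{"timestamp": {"order": "asc"}}],
    }


if __name__ == "__main__":
    body = build_request_query(
        "req-1", "2015-08-16T10:00:00", "2015-08-16T11:00:00"
    )
    print(json.dumps(body, indent=2))
```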

KIBANA!!!

HORIZON!!!

Extending Events

• there is a lot of data that isn’t published
• the data that is published is disorganised
• extending support in Horizon
  • drilling down into event to view full raw data
  • filter options - time range, events for a specific request
• Ceilometer
  • alarm on events
  • build metrics from events

thank you

BACKUP

Horizon Events Prototype, by George Peristerakis

https://github.com/enovance/horizon/tree/event-prototype
