Is It Time to Go Global with Cloud Performance Management?

Post on 12-May-2015

Category:

Technology

DESCRIPTION

The topic of federated clouds has been under discussion for several years. In practice, however, there is very little federation across large infrastructure providers today. One of the biggest causes of this delay is insufficient understanding of how to share responsibility across data centers, providers, and so on. This study shows that understanding cloud performance at such a large scale is a crucial part of information support in federated clouds. It also discusses cloud performance measurement and modeling, as well as several practical ongoing projects and works in progress.

TRANSCRIPT

Mission Statement

1. federated clouds = diversification

2. many DCs and/or cloud providers

3. we care mostly about performance

4. practical solutions are needed

Marat Zhanikeev -- maratishe@gmail.com Is It Time to Go Global with Cloud Performance Management? -- http://tinyurl.com/marat140328 2/30...

Example: BizStore

BizStore: One DC is Not Enough

• remember June 2013?
• most services today use vertical integration -- no diversity
• Hitachi does not share DCs with NEC
• regional diversity of one provider is bad
  ◦ how many Amazon DCs in Japan?

(The only possible) solution ...

... is to sign contracts with multiple DCs and manage on the client side
  ◦ to be officially presented/released in April 01

01 myself+0 "High Availability Cloud Storage ... Social Graph ... Smart Distribution" NS研 (April 2014)

BizStore: One DC is Not Enough

[Figure: locations (Kansai, Okinawa, Kyushu) and their data centers (DC1, DC2). The Osaka office, the Naha office and an employee on a business trip reach the data centers over a storage network at varying network distance. The proposed software offers Store APIs on top of a High Availability Data Store that holds content and social metadata across DC1, DC2, ...]

BizStore: Store Diversification

• in software: not a priority list -- optimization engine!
• realtime performance monitoring, read/write optimization, etc.
• sub-file data unit -- chunks

[Figure: storage options seen from the user at growing network distance: local SSD and HDD, then DC1, DC2, ... across the network.]
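
The difference between a priority list and an optimization engine over chunks can be sketched as follows. This is a minimal illustration, not the BizStore implementation: the `Store` and `place_chunks` names, the throughput and latency figures, and the cost function (work already queued on a store, plus latency, plus transfer time at the last measured throughput) are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Store:
    name: str
    throughput_bps: float  # last measured throughput in bytes per second (assumed metric)
    latency_s: float       # last measured latency in seconds (assumed metric)
    queued_s: float = 0.0  # transfer time already assigned to this store


def place_chunks(chunk_sizes, stores):
    """Greedily assign each chunk to the store that would finish it earliest."""
    placement = []
    for size in chunk_sizes:
        best = min(
            stores,
            key=lambda s: s.queued_s + s.latency_s + size / s.throughput_bps,
        )
        best.queued_s += size / best.throughput_bps
        placement.append(best.name)
    return placement


# Hypothetical measurements for two data centers.
stores = [
    Store("DC1", throughput_bps=10e6, latency_s=0.005),
    Store("DC2", throughput_bps=5e6, latency_s=0.030),
]
placement = place_chunks([4_000_000] * 6, stores)
print(placement)
```

With a fixed priority list every chunk would go to DC1. Here the slower DC2 still receives a share once DC1 has enough work queued, which is the kind of behavior realtime performance monitoring makes possible.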

BizStore: Socially Aware Store

• content relevance based on social graph
• relevance is a distribution
• individual redundancy based on distribution
• other link types: same time, location, filetype, ...
• link strength != 1

[Figure: relevance distribution over content in descending order of relevance, up to the end of content; the redundancy (user setting) is bounded by the physical limit of redundancy.]

[Table: there is a link between files when a file is created, viewed, edited or deleted.]
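
Individual redundancy based on a relevance distribution can be sketched roughly like this. The link strengths, the file names, and the `relevance_distribution` and `replicas_for` helpers are hypothetical; the linear scaling rule is only one possible choice and is not taken from the slides.

```python
def relevance_distribution(links):
    """Normalize per-file link strengths into a distribution, in descending order."""
    total = sum(links.values())
    ranked = sorted(links.items(), key=lambda kv: kv[1], reverse=True)
    return [(name, strength / total) for name, strength in ranked]


def replicas_for(distribution, user_redundancy, physical_limit):
    """Scale replica counts by relevance: the most relevant file gets the full
    setting, every file keeps at least one copy, and nothing exceeds the
    physical limit (for example, the number of available stores)."""
    ceiling = min(user_redundancy, physical_limit)
    top = distribution[0][1]
    return {name: max(1, round(ceiling * share / top)) for name, share in distribution}


# Link strengths are not restricted to 0 or 1 (assumed example values).
links = {"report.doc": 2.5, "photo.jpg": 0.5, "budget.xls": 1.5}
plan = replicas_for(relevance_distribution(links), user_redundancy=4, physical_limit=3)
print(plan)
```

The user asks for 4 copies, but only 3 stores exist, so the physical limit wins for the most relevant file and less relevant files get fewer copies.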

Example: Cloud Streaming

Cloud Streaming: Fixing Problems

[Figure: traditional, P2P, adaptive and cloud streaming compared by their problems: congestion (flash crowds), unreliable throughput and unreliable sources. Each problem is marked as fixed along the way to cloud streaming.]

Cloud Streaming: Design

[Figure: P2P streaming (client, tracker, parent peers, service provider (SP)) next to cloud streaming (client, current sources taken from a VM population, service provider (SP)).]

02 myself+0 "Multi-Source Stream Aggregation in the Cloud" Wiley Book on ACDN, Chapter 10 (2014)

Practical Solutions for Federated Clouds

A Shortlist of (S)olutions

1. S1: Nextgen traffic processors at DCs

2. S2: QoS Context and Performance Visualization at DCs

3. S3: Performance Modeling for Federated Clouds

4. S4: Client Side Traffic Boosting

5. ... definitely not a complete list

Solution (S) 1: Nextgen Traffic Processors at DCs (work in progress)

S1: Multicore Packet Capture

[Figure: traffic between global networks and data center internals passes a gateway switch; a mirror of that traffic feeds a capture manager that spreads work over multiple CPUs and writes to storage.]

• multicore is the key
• multicore != traditional parallel processing 03
• on-demand capture, DPI, heterogeneous tasks 04

03 myself+0 "...Multicore Capture in Data Center Forensics" ACM AISACCS-SFCS (June 2014)

04 myself+0 "A Lock-Free Shared Memory Design for ... Multicore Packet Traffic Capture" IJNM (in print)

S1: Multicore Hates Memory Locks

• lock-free design 04: no messages, no memory locks

[Figure: capture processes on cores 1, 2, 3, ... and a manager on core X, each using PF_RING, meet in shared memory organized as a double-linked list (DLL). Manager labels: one thread, create, fork, assign, lifespan, stale check. Capture labels: process/wrap, wrap wait. A TimeManager block is also shown.]

04 myself+0 "A Lock-Free Shared Memory Design for ... Multicore Packet Traffic Capture" IJNM (in print)
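
One common way to get by without messages or memory locks is a single-writer rule: every index in shared memory is written by exactly one party. The sketch below uses a single-producer/single-consumer ring buffer as a generic stand-in. It is not the design from 04, the `SpscRing` name is hypothetical, and in Python it only illustrates the index logic rather than real multicore behavior.

```python
class SpscRing:
    """Single-producer/single-consumer ring: `head` is written only by the
    producer and `tail` only by the consumer, so neither side needs a lock."""

    def __init__(self, capacity):
        # One slot always stays empty so that "full" and "empty" differ.
        self.slots = [None] * (capacity + 1)
        self.head = 0  # next slot to write (producer-owned)
        self.tail = 0  # next slot to read (consumer-owned)

    def push(self, item):
        nxt = (self.head + 1) % len(self.slots)
        if nxt == self.tail:  # full: drop instead of waiting on the consumer
            return False
        self.slots[self.head] = item
        self.head = nxt  # publish only after the slot has been filled
        return True

    def pop(self):
        if self.tail == self.head:  # empty
            return None
        item = self.slots[self.tail]
        self.tail = (self.tail + 1) % len(self.slots)
        return item


ring = SpscRing(capacity=2)
accepted = [ring.push(p) for p in ("pkt1", "pkt2", "pkt3")]
drained = [ring.pop(), ring.pop(), ring.pop()]
print(accepted, drained)
```

The third packet is dropped because the ring is full, and the producer never blocks; for packet capture, dropping is usually preferable to stalling the capture path.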

Solution (S) 2: DC Performance APIs

S2: E2E QoS, M2M Patterns

[Figure: a probe (meter) sits between users and clients and exports over UDP to an analysis machine (merger, per-flow statistics, analyzer with history and state, profiler), which feeds a web application.]

• clean slate: capture QoS context 05
• visualize user communities
• export via APIs to users and/or service providers

05 myself+0 "A holistic community-based architecture for measuring E2E QoS at data centres" IJCSE (in print)

S2: Has to be a Clean Slate

[Figure: processing pipeline for QoS context capture.
Probe (at a router in the data center infrastructure): packet fields (source IP, source port, dest IP, dest port, protocol, timestamp, packet size) and a CRC24 key into a packet hash table with 2^24 keys; entries (#01 ... #07) are chained in a double-linked list (DLL) and exported over UDP or via a file.
Data unit (32-bit rows): source port and dest port; source IP; destination IP; start time (s); start time (us); then repeated entries of a flag (* at the probe, D after the merger), psize and pspace.
  psize: packet size
  pspace: packet space (us)
  D: direction (0 or 1)
UDP RX: buffer (5 s).
Merger: find flow from opposite direction.
Analyzer: read and update history and state; ring buffer of data units per IP on internal networks.]

Statistics per source-dest pair:
MinOWD: global minimum OWD
MaxBatch: max byte count of a packet burst
Bulks: throughputs in flows

• has to be a clean slate!
• Cisco, ntop, sFlow are not feasible
• QoS context is something new
• (figure is vector, so, zoom in!)
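
The hash-table key, the merger step and the MaxBatch statistic can be sketched roughly as follows. This is an illustration under assumptions: the slide names CRC24 but not its parameters or inputs, so the code truncates `zlib.crc32` to 24 bits as a stand-in and hashes the 5-tuple. The `Flow`, `flow_key`, `reverse` and `max_batch` names, the addresses and the 1000 us burst gap are hypothetical.

```python
import zlib
from collections import namedtuple

Flow = namedtuple("Flow", "src_ip src_port dst_ip dst_port protocol")


def flow_key(flow):
    """24-bit table key; zlib.crc32 truncated to 24 bits stands in for CRC24."""
    raw = "|".join(str(field) for field in flow).encode()
    return zlib.crc32(raw) & 0xFFFFFF


def reverse(flow):
    """The same flow seen from the opposite direction (what a merger looks up)."""
    return Flow(flow.dst_ip, flow.dst_port, flow.src_ip, flow.src_port, flow.protocol)


def max_batch(packets, gap_us=1000):
    """MaxBatch: largest byte count of a burst, where a burst is a run of
    packets whose spacing (pspace, in microseconds) stays below gap_us."""
    best = current = 0
    for psize, pspace in packets:
        current = psize if pspace >= gap_us else current + psize
        best = max(best, current)
    return best


table = {}
out = Flow("10.0.0.5", 40000, "192.0.2.7", 443, "tcp")
table[flow_key(out)] = out

# A packet of the return direction maps back to the stored flow.
back = Flow("192.0.2.7", 443, "10.0.0.5", 40000, "tcp")
match = table.get(flow_key(reverse(back)))

# (psize, pspace) pairs as carried in the data units.
burst = max_batch([(1500, 5000), (1500, 10), (1500, 12), (400, 9000), (1500, 20)])
print(match == out, burst)
```

A real table would also have to handle key collisions, which the chained entries in the figure suggest; the dictionary here simply overwrites.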

S2: But Payoff is Great!

[Figure: OWD (ms) + TX time (x0.1 ms), from 0 to 4000, plotted against batch size (bytes), from 0 to 19200.]

Solution (S) 3: Cloud Weather System (work in progress)

S3: Cloud Weather System

[Figure: weather map with a (high/low) pressure front, a typhoon, a drought, and areas of good and bad weather.]

• continents: user, services 07
• water: network
• weather, clouds, etc.: changes in performance
• droughts: insufficiency of infrastructure, users do not get enough capacity
• typhoons: basically, Flash Crowds in services, going viral, ...
• forecasting: possible with enough performance monitoring, similar to stock market

07 myself+0 "Cloud Weather System as a Futuristic Performance Model" IEICE General Conference (March 2013)

Solution (S) 4: Mobile Throughput Boosters

S4: Mobile Throughput Booster

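• so far, only possible in wireless -- WiFi Direct

Connectivity versus connection type:
• singular connectivity, single connection: traditional applications
• singular connectivity, multipath: traditional multipath
• multiple connectivity, single connection: no known cases (wasted potential)
• multiple connectivity, multipath: group communication over 3G/LTE/* + WiFi Direct (THIS PROPOSAL)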

S4: Group Resource Pooling

[Figure: a content provider reaches the main client and the delegated clients over remote connectivity (3G/LTE/* access); the main client and the delegated clients are linked by local connectivity.]
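
Group resource pooling implies that the main client divides a download between its own access link and those of the delegated clients. A minimal sketch, assuming byte ranges are split in proportion to each member's measured throughput; the `split_ranges` name, the member names and the throughput numbers are hypothetical, and the local relay over WiFi Direct is not modeled.

```python
def split_ranges(total_bytes, throughputs):
    """Split [0, total_bytes) into contiguous byte ranges, one per group
    member, sized in proportion to each member's measured throughput."""
    total_rate = sum(throughputs.values())
    members = list(throughputs.items())
    ranges, start = {}, 0
    for i, (name, rate) in enumerate(members):
        is_last = i == len(members) - 1
        # The last member takes the remainder so that no byte is lost to rounding.
        size = total_bytes - start if is_last else int(total_bytes * rate / total_rate)
        ranges[name] = (start, start + size)  # half-open range [start, end)
        start += size
    return ranges


ranges = split_ranges(
    1_000_000,
    {"main": 2.0e6, "delegate1": 1.0e6, "delegate2": 1.0e6},
)
print(ranges)
```

Each member would then fetch its range over its own 3G/LTE link and hand the result to the main client over the local connection.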

S4: Converged Wireless Campus

[Figure: (1) a student develops the APP + CODE and makes it secure; (2) the APP + CODE and API tokens are distributed on campus; (3) students meet and delegate, passing tokens at delegation; (4) university.]

Solution (S) 5: Over-the-Network Indexing

S5: Indexing in Clouds

[Figure: traditional indexing versus Stringex. Both show data, an indexer, an index, the network and a client; in the Stringex case the index is labeled "Read, Write".]

S5: Over-the-Network Optimization

• in short: throughput-centric network storage optimization 08

[Figure: the Stringex client with a sync engine, optimization and a local cache in front of the Stringex index on the network; steps marked 1 (check) and 2 (use).]

08 myself+0 "A New Practical Design for Browsable Over-the-Network Indexing" ISEEE (April 2014)

S5: Performance

[Figure: throughput (log of bytes/doc), from 2.55 to 3.25, plotted against index size (log), from 3.15 to 6.65, for Lucene and Stringex.]

That’s all, thank you ...

[01] myself+0 "High Availability Cloud Storage ... Social Graph ... Smart Distribution" NS研 (April 2014)

[02] myself+0 "Multi-Source Stream Aggregation in the Cloud" Wiley Book on ACDN, Chapter 10 (2014)

[03] myself+0 "...Multicore Capture in Data Center Forensics" ACM AISACCS-SFCS (June 2014)

[04] myself+0 "A Lock-Free Shared Memory Design for ... Multicore Packet Traffic Capture" IJNM (in print)

[05] myself+0 "A holistic community-based architecture for measuring E2E QoS at data centres" IJCSE (in print)

[06] myself+0 "Towards a Practical Method for Interactive Traffic Visualizations in Data Centers" SC研 (May 2014)

[07] myself+0 "Cloud Weather System as a Futuristic Performance Model" IEICE General Conference (March 2013)

[08] myself+0 "A New Practical Design for Browsable Over-the-Network Indexing" ISEEE (April 2014)
