Is It Time to Go Global with Cloud Performance Management?
DESCRIPTION
Federated clouds have been discussed for several years, yet in practice there is very little federation across large infrastructure providers. One of the biggest causes of this delay is an insufficient understanding of how to share responsibility across data centers, providers, and so on. This study shows that understanding cloud performance at such a large scale is a crucial part of information support in federated clouds. It also discusses cloud performance measurement and modeling, along with several practical ongoing projects and works in progress.
TRANSCRIPT
.
Mission Statement
1. federated clouds = diversification
2. many DCs and/or cloud providers
3. we care mostly about performance
4. practical solutions are needed
Marat Zhanikeev -- [email protected] Is It Time to Go Global with Cloud Performance Management? -- http://tinyurl.com/marat140328 2/30...
.
Example: BizStore
.
BizStore: One DC is Not Enough
• remember June 2013?
• most services today use vertical integration -- no diversity
• Hitachi does not share DCs with NEC
• regional diversity of one provider is bad
◦ how many Amazon DCs in Japan?
(The only possible) Solution:
... is to sign contracts with multiple DCs and manage on the client side
◦ to be officially presented/released in April 01
01 myself+0 "High Availability Cloud Storage ... Social Graph ... Smart Distribution" NS研 (April 2014)
.
BizStore: One DC is Not Enough
[Figure: BizStore architecture. Labels: locations (Kansai, Okinawa, Kyushu); data centers (DC1, DC2); Osaka office, Naha office; network distance; storage network; Employee A on a business trip; content / social metadata; high availability data store; store APIs; proposed software.]
.
BizStore: Store Diversification
• in software: not a priority list -- optimization engine!
• realtime performance monitoring, read/write optimization, etc.
• sub-file data unit -- chunks
[Figure: a user reaches SSD and HDD storage at DC1, DC2, ... over the network, with growing network distance.]
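The difference between a priority list and an optimization engine can be pictured with a small sketch. It is only an illustration under assumptions: the class name `StoreMonitor`, the EWMA smoothing factor, and the throughput samples are hypothetical and not taken from BizStore. Each DC keeps a running throughput estimate fed by realtime measurements, and the next chunk read goes to whichever DC currently looks best.

```python
from typing import Dict, Optional


class StoreMonitor:
    """Running per-DC throughput estimates (hypothetical helper, not BizStore code)."""

    def __init__(self, alpha: float = 0.3) -> None:
        self.alpha = alpha  # EWMA weight of the newest sample (assumed value)
        self.estimate: Dict[str, float] = {}

    def report(self, dc: str, throughput_mbps: float) -> None:
        """Fold one realtime measurement into the estimate for a DC."""
        old: Optional[float] = self.estimate.get(dc)
        if old is None:
            self.estimate[dc] = throughput_mbps
        else:
            self.estimate[dc] = self.alpha * throughput_mbps + (1 - self.alpha) * old

    def best_dc(self) -> str:
        """Pick the DC with the highest current estimate for the next chunk."""
        if not self.estimate:
            raise ValueError("no measurements yet")
        return max(self.estimate, key=lambda dc: self.estimate[dc])


monitor = StoreMonitor()
monitor.report("DC1", 80.0)
monitor.report("DC2", 50.0)
first_choice = monitor.best_dc()   # DC1 looks faster at first
for _ in range(5):
    monitor.report("DC1", 10.0)    # DC1 degrades, e.g. under congestion
second_choice = monitor.best_dc()  # the choice moves to DC2
```

A fixed priority list would keep reading from DC1 in this scenario; the estimate-driven choice switches once the measurements justify it.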
.
BizStore: Socially Aware Store
• content relevance based on social graph
• relevance is a distribution
• individual redundancy based on distribution
• other link types: same time, location, filetype, ...
• link strength != 1
[Figure: relevance distribution in descending order. Labels: redundancy (user setting), physical limit of redundancy, end of content.]
[Figure: link types. There is a link between ... when a file is created, viewed, edited, or deleted.]
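One possible reading of "individual redundancy based on distribution" is sketched below. The scaling rule, the function name `replica_plan`, and the relevance scores are assumptions made for illustration, not the method from 01: the most relevant file gets the user-set redundancy, capped by the number of DCs (the physical limit), and less relevant files get proportionally fewer copies.

```python
import math
from typing import Dict


def replica_plan(relevance: Dict[str, float], user_redundancy: int, dc_count: int) -> Dict[str, int]:
    """Map per-file relevance to replica counts (assumed rule, see lead-in)."""
    if not relevance:
        return {}
    cap = min(user_redundancy, dc_count)  # cannot exceed the physical limit
    top = max(relevance.values())
    plan: Dict[str, int] = {}
    # Walk the distribution in descending order of relevance.
    for name, score in sorted(relevance.items(), key=lambda kv: kv[1], reverse=True):
        share = score / top if top > 0 else 0.0
        plan[name] = max(1, math.ceil(share * cap))  # always keep at least one copy
    return plan


# Relevance values are made up; link strengths are fractional (link strength != 1).
scores = {"report.doc": 1.0, "photo.jpg": 0.5, "old.log": 0.1}
plan = replica_plan(scores, user_redundancy=5, dc_count=4)
```

Here the user asks for 5 copies but only 4 DCs exist, so the cap is 4 and the plan falls off with relevance from there.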
.
Example: Cloud Streaming
.
Cloud Streaming: Fixing Problems
[Figure: traditional, P2P, cloud, and adaptive streaming compared. Problems listed: congestion (flash crowds), unreliable throughput, unreliable sources, with "Fixed" marks showing which ones are resolved.]
.
Cloud Streaming: Design
[Figure: P2P streaming and cloud streaming designs side by side. Labels: service provider (SP), tracker, parent peers, VM population, current sources, client.]
02 myself+0 "Multi-Source Stream Aggregation in the Cloud" Wiley Book on ACDN, Chapter 10 (2014)
.
Practical Solutions for Federated Clouds
.
A Shortlist of (S)olutions
1. S1: Nextgen traffic processors at DCs
2. S2: QoS Context and Performance Visualization at DCs
3. S3: Performance Modeling for Federated Clouds
4. S4: Client Side Traffic Boosting
5. .... definitely not a complete list
.
Solution (S) 1: Nextgen Traffic Processors at DCs (work in progress)
.
S1: Multicore Packet Capture
[Figure: multicore packet capture. Labels: global networks, data center internals, gateway switch, mirror, capture manager, CPUs, storage.]
• multicore is the key
• multicore != traditional parallel processing 03
• on-demand capture, DPI, heterogeneous tasks 04
03 myself+0 "...Multicore Capture in Data Center Forensics" ACM ASIACCS-SFCS (June 2014)
04 myself+0 "A Lock-Free Shared Memory Design for ... Multicore Packet Traffic Capture" IJNM (in print)
.
S1: Multicore Hates Memory Locks
• lockfree design 04 : no messages, no memory locks
[Figure: lock-free capture design. Labels: manager and capture processes on cores 1 to X, PF_RING, time manager, shared memory organized as a double-linked list (DLL); steps shown in one thread: create, fork, assign, lifespan, stale check, process/wrap, wrap wait.]
04 myself+0 "A Lock-Free Shared Memory Design for ... Multicore Packet Traffic Capture" IJNM (in print)
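One common way to avoid both messages and memory locks is a single-producer, single-consumer ring in shared memory, where each index has exactly one writer. The sketch below illustrates that discipline only. It is not the design from 04: the names `SpscRing`, `push`, and `pop` are hypothetical, and plain Python objects stand in for a shared memory segment.

```python
from typing import List, Optional


class SpscRing:
    """Single-producer, single-consumer ring buffer.

    `head` is written only by the producer (capture side) and `tail` only by
    the consumer (manager side), so neither side needs a lock. Illustrative
    only: real shared-memory code also needs memory-ordering guarantees.
    """

    def __init__(self, capacity: int) -> None:
        # One slot always stays empty so that full and empty are distinguishable.
        self.slots: List[Optional[bytes]] = [None] * (capacity + 1)
        self.head = 0  # next slot to write (producer-owned)
        self.tail = 0  # next slot to read (consumer-owned)

    def push(self, packet: bytes) -> bool:
        nxt = (self.head + 1) % len(self.slots)
        if nxt == self.tail:      # full: drop instead of blocking
            return False
        self.slots[self.head] = packet
        self.head = nxt           # publish only after the slot is filled
        return True

    def pop(self) -> Optional[bytes]:
        if self.tail == self.head:  # empty
            return None
        packet = self.slots[self.tail]
        self.tail = (self.tail + 1) % len(self.slots)
        return packet


ring = SpscRing(capacity=2)
accepted = [ring.push(b"p1"), ring.push(b"p2"), ring.push(b"p3")]
drained = [ring.pop(), ring.pop(), ring.pop()]
```

When the ring is full the producer drops the packet rather than waiting, which is the usual trade-off in capture paths where blocking a core costs more than losing a sample.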
.
Solution (S) 2: DC Performance APIs
.
S2: E2E QoS, M2M Patterns
[Figure: measurement architecture. Labels: users, clients, probe, meter, UDP, analysis machine, merger, per flow statistics, analyzer, history, state, profiler, web application.]
• clean slate: capture QoS context 05
• visualize user communities
• export via APIs to users and/or service providers
05 myself+0 "A holistic community-based architecture for measuring E2E QoS at data centres" IJCSE (in print)
.
S2: Has to be a Clean Slate
[Figure: probe, merger, and analyzer pipeline inside the data center infrastructure]
Probe (at router): packet (timestamp, source IP, source port, dest IP, dest port, protocol, packet size) -> CRC24 -> hash table with keys 0 .. 2^24, each key pointing to a double-linked list (DLL) of entries; export over UDP or via a file
Data unit (32-bit rows): source port, dest port; source IP; destination IP; start time (s); start time (us); then repeated entries of a flag, psize, pspace
  psize: packet size
  pspace: packet space (us)
Merger: UDP RX buffer (5s); find flow from opposite direction; the flag becomes D: direction (0 or 1)
Analyzer: history and state, read and update; ring buffer of data units per IP on internal networks
Statistic   Meaning (per source-dest pair)
MinOWD      Global minimum OWD
MaxBatch    Max byte count of a packet burst
Bulks       Throughputs in flows
• has to be a clean slate!
• Cisco, ntop, sFlow are not feasible
• QoS context is something new
• (figure is vector, so, zoom in!)
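The probe's hash table can be sketched roughly as follows. Several details are assumptions, since the slide does not state them: which packet fields feed the CRC24, the CRC-24 variant (the OpenPGP one from RFC 4880 is used here), and the record layout. The names `flow_key` and `FlowTable` are hypothetical, and a Python list stands in for the double-linked list behind each key.

```python
import socket
import struct
from collections import defaultdict
from typing import DefaultDict, Dict, List, Tuple

# src IP, src port, dst IP, dst port, protocol
FiveTuple = Tuple[str, int, str, int, int]


def crc24(data: bytes) -> int:
    """CRC-24 as defined for OpenPGP (RFC 4880); the variant is an assumption."""
    crc = 0xB704CE
    for byte in data:
        crc ^= byte << 16
        for _ in range(8):
            crc <<= 1
            if crc & 0x1000000:
                crc ^= 0x1864CFB
    return crc & 0xFFFFFF


def flow_key(flow: FiveTuple) -> int:
    """Hash a flow into one of 2^24 table keys."""
    src_ip, src_port, dst_ip, dst_port, proto = flow
    packed = socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
    packed += struct.pack("!HHB", src_port, dst_port, proto)
    return crc24(packed)


class FlowTable:
    def __init__(self) -> None:
        # key -> chain of flow records that share the same 24-bit key
        self.buckets: DefaultDict[int, List[Dict]] = defaultdict(list)

    def add_packet(self, flow: FiveTuple, ts_us: int, psize: int) -> Dict:
        chain = self.buckets[flow_key(flow)]
        for rec in chain:
            if rec["flow"] == flow:
                # pspace: gap in microseconds since the previous packet
                rec["units"].append((psize, ts_us - rec["last_us"]))
                rec["last_us"] = ts_us
                return rec
        rec = {"flow": flow, "start_us": ts_us, "last_us": ts_us, "units": [(psize, 0)]}
        chain.append(rec)
        return rec


table = FlowTable()
flow = ("10.0.0.1", 5000, "10.0.0.2", 80, 6)
table.add_packet(flow, ts_us=1_000, psize=1500)
record = table.add_packet(flow, ts_us=1_250, psize=400)
```

Each record accumulates (psize, pspace) pairs, which is the kind of per-flow data unit the export format on the slide carries.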
.
S2: But Payoff is Great!
[Plot: OWD (ms) + TX time (x0.1 ms), 0 to 4000, against batch size (bytes), 0 to 19200.]
.
Solution (S) 3: Cloud Weather System (work in progress)
.
S3: Cloud Weather System
[Figure: weather map. Labels: (high/low) pressure front, typhoon, drought, good weather, bad weather.]
• continents: user, services 07
• water: network
• weather, clouds, etc.: changes in performance
• droughts: insufficiency of infrastructure, users do not get enough capacity
• typhoons: basically, Flash Crowds in services, going viral, ...
• forecasting: possible with enough performance monitoring, similar to stock market
07 myself+0 "Cloud Weather System as a Futuristic Performance Model" IEICE General Conference (March 2013)
.
Solution (S) 4: Mobile Throughput Boosters
.
S4: Mobile Throughput Booster
• so far, only possible in wireless -- WiFi Direct
                        Single Connection                   Multipath
Singular Connectivity   Traditional Applications            Traditional Multipath
Multiple Connectivity   No known cases (wasted potential)   Group Communication, 3G/LTE/* + WiFi Direct (THIS PROPOSAL)
.
S4: Group Resource Pooling
[Figure: group resource pooling. Labels: content provider, remote connectivity (3G/LTE/* access), main client, delegated clients, local connectivity.]
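One way to read group resource pooling is that the main client splits a download among itself and the delegated clients, each fetching its part over its own 3G/LTE link and handing it over on the local connection. The sketch below shows only the splitting step, under assumptions: the function name `split_ranges`, the proportional rule, and the throughput figures are hypothetical and not taken from the slides.

```python
from typing import Dict, List, Tuple


def split_ranges(total_bytes: int, throughput: Dict[str, float]) -> List[Tuple[str, int, int]]:
    """Assign contiguous byte ranges [start, end) to clients by throughput share."""
    total_rate = sum(throughput.values())
    if total_bytes <= 0 or total_rate <= 0:
        return []
    ranges: List[Tuple[str, int, int]] = []
    clients = list(throughput.items())
    start = 0
    for i, (client, rate) in enumerate(clients):
        if i == len(clients) - 1:
            end = total_bytes  # the last client absorbs any rounding remainder
        else:
            end = start + int(total_bytes * rate / total_rate)
        ranges.append((client, start, end))
        start = end
    return ranges


# Measured remote throughput per client, in arbitrary units (made-up values).
rates = {"main": 2.0, "delegate-1": 1.0, "delegate-2": 1.0}
plan = split_ranges(1_000_000, rates)
```

With these figures the main client takes the first half and each delegated client a quarter, so all three links finish at roughly the same time.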
.
S4: Converged Wireless Campus
[Figure: converged wireless campus. Labels: student, another student, campus, university, APP + CODE, API tokens; steps: 1 develop, make secure; 2 distribute; 3 meet and delegate (tokens pass at delegation); 4 university.]
.
Solution (S) 5: Over-the-Network Indexing
.
S5: Indexing in Clouds
[Figure: indexing in clouds. Traditional: data, indexer, index, network, client. Stringex: data, indexer, index (read, write), client.]
.
S5: Over-the-Network Optimization
• in short: throughput-centric network storage optimization 08
[Figure: Stringex client design. Labels: Stringex index, Stringex client, sync engine, optimization, local cache; steps: 1 check, 2 use.]
08 myself+0 "A New Practical Design for Browsable Over-the-Network Indexing" ISEEE (April 2014)
.
S5: Performance
[Plot: throughput (log of bytes/doc), 2.55 to 3.25, against index size (log), 3.15 to 6.65, for Lucene and Stringex.]
.
That’s all, thank you ...
.
[01] myself+0 "High Availability Cloud Storage ... Social Graph ... Smart Distribution" NS研 (April 2014)
[02] myself+0 "Multi-Source Stream Aggregation in the Cloud" Wiley Book on ACDN, Chapter 10 (2014)
[03] myself+0 "...Multicore Capture in Data Center Forensics" ACM ASIACCS-SFCS (June 2014)
[04] myself+0 "A Lock-Free Shared Memory Design for ... Multicore Packet Traffic Capture" IJNM (in print)
[05] myself+0 "A holistic community-based architecture for measuring E2E QoS at data centres" IJCSE (in print)
[06] myself+0 "Towards a Practical Method for Interactive Traffic Visualizations in Data Centers" SC研 (May 2014)
[07] myself+0 "Cloud Weather System as a Futuristic Performance Model" IEICE General Conference (March 2013)
[08] myself+0 "A New Practical Design for Browsable Over-the-Network Indexing" ISEEE (April 2014)