Transcript
Page 1: High Availability Cloud Storage as a Software Package with Social Graph, Throughput Awareness, and Smart Distribution Features

High Availability

Page 2: High Availability Cloud Storage as a Software Package with Social Graph, Throughput Awareness, and Smart Distribution Features


Mission Statement

1. high availability business-level cloud data store
2. federated clouds = diversification
3. many DCs and/or cloud providers
4. we care mostly about performance = high availability
5. practical solutions are needed

Marat Zhanikeev -- [email protected] High Availability Cloud Storage: Social, Throughput, Smart -- http://tinyurl.com/marat140417 2/21...

Page 3: High Availability Cloud Storage as a Software Package with Social Graph, Throughput Awareness, and Smart Distribution Features

haStore : The Short Story

Page 4: High Availability Cloud Storage as a Software Package with Social Graph, Throughput Awareness, and Smart Distribution Features

haStore: One DC is Not Enough

• remember June 2013?
• most services today use vertical integration -- no diversity
• Hitachi does not share DCs with NEC
• regional diversity of one provider is bad
◦ how many Amazon DCs in Japan?

(The only possible) Solution: sign contracts with multiple DCs and manage them on the client side

Page 5: High Availability Cloud Storage as a Software Package with Social Graph, Throughput Awareness, and Smart Distribution Features

haStore: One DC is Not Enough

[Figure, labels only. Locations: Kansai (Osaka Office), Okinawa (Naha Office), Kyushu. Data Centers: DC1, DC2. Links: storage network, network distance, business trip. Proposed Software: Employee A ...; Content / Social Metadata; High Availability Data Store; Store APIs; DC1, DC2, ...]

Page 6: High Availability Cloud Storage as a Software Package with Social Graph, Throughput Awareness, and Smart Distribution Features

haStore: Store Diversification

• store = sum of multiple substores
• in software: not a priority list -- optimization engine!
• realtime performance monitoring, read/write optimization, etc.
• sub-file data unit -- chunks

[Figure, labels only: User; SSD, HDD, Network, DC1, DC2, ...; growing network distance]
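
The contrast between a fixed priority list and an optimization engine can be illustrated with a minimal sketch. This is not haStore's actual code: the `Substore` class, its `record` method and the smoothing weight `alpha` are hypothetical names and values chosen for illustration. Each substore keeps a running throughput estimate from live measurements, and the next chunk goes to whichever substore currently looks fastest.

```python
from dataclasses import dataclass


@dataclass
class Substore:
    """Hypothetical substore record (names are illustrative only)."""
    name: str
    throughput: float = 0.0  # running estimate, bytes per second

    def record(self, nbytes: int, seconds: float, alpha: float = 0.8) -> None:
        """Fold one measured transfer into the running estimate (EWMA)."""
        sample = nbytes / seconds
        if self.throughput == 0.0:
            self.throughput = sample
        else:
            self.throughput = alpha * sample + (1 - alpha) * self.throughput


def pick_substore(substores: list[Substore]) -> Substore:
    """Choose by current measured throughput, not by a fixed order."""
    return max(substores, key=lambda s: s.throughput)


dc1 = Substore("DC1")
dc2 = Substore("DC2")

dc1.record(1_000_000, 2.0)    # nearby data center: 500,000 B/s
dc2.record(1_000_000, 5.0)    # distant data center: 200,000 B/s
best_before = pick_substore([dc1, dc2]).name

dc1.record(1_000_000, 50.0)   # DC1 degrades, so its estimate drops
best_after = pick_substore([dc1, dc2]).name
```

With a static priority list DC1 would stay first even after it slows down; here the choice follows the measurements.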

Page 7: High Availability Cloud Storage as a Software Package with Social Graph, Throughput Awareness, and Smart Distribution Features

haStore: Socially Aware Store

• content relevance based on social graph
• relevance is a distribution
• individual redundancy based on distribution
• other link types: same time, location, filetype, ...
• link strength != 1

[Figure, labels only: files in descending order of relevance; relevance distribution; redundancy (user setting); physical limit of redundancy; end of content]

[Table, flattened in extraction: "There is a link" / "Between" / "When a file is ..." with columns Created, Viewed, Edited, Deleted]

Page 8: High Availability Cloud Storage as a Software Package with Social Graph, Throughput Awareness, and Smart Distribution Features

haStore: Software Design

Page 9: High Availability Cloud Storage as a Software Package with Social Graph, Throughput Awareness, and Smart Distribution Features

Design: Specs

• many substores, heterogeneous e2e performance and capacity
• each substore has its own API (Dropbox, GDrive, SSD, etc.), but haStore exports a generic API
• data unit: sub-file blobs, for now fixed 100kb size
• social graph is used to define priority lists of files
◦ different for each user
• optimization is the key element of software engines
1. sync logic
2. redundancy logic
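
The fixed-size sub-file data unit can be sketched as follows. This is an illustration only, under two assumptions not stated on the slide: "100kb" is read as 100,000 bytes, and chunks are named by a SHA-1 content hash (the slides do not say how chunks are identified). The function names are hypothetical.

```python
import hashlib

CHUNK_SIZE = 100_000  # assumption: "100kb" read as 100,000 bytes


def split_into_chunks(data: bytes, size: int = CHUNK_SIZE) -> list[bytes]:
    """Cut a file's bytes into fixed-size blobs; the last one may be shorter."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def chunk_ids(chunks: list[bytes]) -> list[str]:
    """Content hash per chunk (illustrative naming scheme, not from the slides)."""
    return [hashlib.sha1(c).hexdigest() for c in chunks]


def reassemble(chunks: list[bytes]) -> bytes:
    """Concatenate chunks back into the original file contents."""
    return b"".join(chunks)


payload = bytes(250_000)            # a 250,000-byte dummy file of zeros
chunks = split_into_chunks(payload)
ids = chunk_ids(chunks)
```

Working at chunk level lets different parts of one file live on different substores and be synced independently.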

Page 10: High Availability Cloud Storage as a Software Package with Social Graph, Throughput Awareness, and Smart Distribution Features

Design: API Stack

• Generic API starts from Level 2, similar to drivers
• the stack is implemented by each client = each user

[Figure, labels only. Proposed Software: Employee A ...; Content / Social Metadata; High Availability Data Store; Store; DC1, DC2, ...]
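
A driver-like generic API on the client side might look like the sketch below. Everything here is hypothetical: `SubstoreDriver`, `MemoryDriver` and `Store` are illustrative names, and the in-memory backend stands in for real substores because the slides do not show how Dropbox, GDrive or local disks are actually wrapped.

```python
from abc import ABC, abstractmethod


class SubstoreDriver(ABC):
    """Hypothetical generic interface; each backend would wrap its own API."""

    @abstractmethod
    def put(self, chunk_id: str, blob: bytes) -> None: ...

    @abstractmethod
    def get(self, chunk_id: str) -> bytes: ...


class MemoryDriver(SubstoreDriver):
    """Stand-in backend that keeps blobs in a dict."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}

    def put(self, chunk_id: str, blob: bytes) -> None:
        self._blobs[chunk_id] = blob

    def get(self, chunk_id: str) -> bytes:
        return self._blobs[chunk_id]  # raises KeyError if absent


class Store:
    """Client-side store: writes a chunk to several drivers, reads from any."""

    def __init__(self, drivers: list[SubstoreDriver]) -> None:
        self.drivers = drivers

    def put(self, chunk_id: str, blob: bytes, copies: int) -> None:
        for driver in self.drivers[:copies]:
            driver.put(chunk_id, blob)

    def get(self, chunk_id: str) -> bytes:
        for driver in self.drivers:
            try:
                return driver.get(chunk_id)
            except KeyError:
                continue  # this substore does not hold the chunk
        raise KeyError(chunk_id)


dc1, dc2 = MemoryDriver(), MemoryDriver()
store = Store([dc1, dc2])
store.put("c1", b"hello", copies=2)
dc1._blobs.clear()                  # simulate losing one substore
recovered = store.get("c1")
```

Because the stack runs in each client, a read still succeeds as long as one substore holding the chunk is reachable from that client.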

Page 11: High Availability Cloud Storage as a Software Package with Social Graph, Throughput Awareness, and Smart Distribution Features

Design: Sync Engine

• optimization for throughput minimization
• same logic for SSD, HDD and over-the-network

[Figure, labels only: haStore; Storage; SyncEngine; Optimization; LocalCache; Check / Use (numbered 1, 2); GUI, Clients]

Page 12: High Availability Cloud Storage as a Software Package with Social Graph, Throughput Awareness, and Smart Distribution Features

Design: Sync Engine Logic

[Figure, labels only: Bulk; Throughput; History Data; Increase timeout; Performance Tradeoff]
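
Only the figure labels survive for this slide, so the following is a guess at the kind of logic they suggest, not a reconstruction of the author's algorithm. It assumes the engine keeps a throughput history per substore, derives a transfer timeout from it, and increases the timeout after a failed attempt. `ThroughputHistory`, `next_timeout`, `slack` and `backoff` are hypothetical names and parameters.

```python
from collections import deque


class ThroughputHistory:
    """Sliding window of recent throughput samples for one substore."""

    def __init__(self, window: int = 5) -> None:
        self.samples: deque[float] = deque(maxlen=window)

    def add(self, nbytes: int, seconds: float) -> None:
        self.samples.append(nbytes / seconds)

    def estimate(self) -> float:
        """Mean of the window in bytes per second, or 0.0 with no data."""
        return sum(self.samples) / len(self.samples) if self.samples else 0.0


def next_timeout(history: ThroughputHistory, nbytes: int, current: float,
                 timed_out: bool, slack: float = 2.0,
                 backoff: float = 2.0) -> float:
    """Timeout in seconds for the next transfer of nbytes."""
    if timed_out:
        return current * backoff      # last attempt failed: increase timeout
    rate = history.estimate()
    if rate == 0.0:
        return current                # no history yet: keep the current value
    return slack * nbytes / rate      # expected duration plus headroom


history = ThroughputHistory()
history.add(100_000, 1.0)             # 100,000 B/s
history.add(100_000, 0.5)             # 200,000 B/s
t_first = next_timeout(history, 300_000, current=10.0, timed_out=False)
t_retry = next_timeout(history, 300_000, current=t_first, timed_out=True)
```

The tradeoff in this sketch is that a short timeout reacts quickly to a slow substore, while a long one avoids abandoning bulk transfers that would have finished.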

Page 13: High Availability Cloud Storage as a Software Package with Social Graph, Throughput Awareness, and Smart Distribution Features

Design: Redundancy Logic (1)

[Figure, labels only: files in descending order of relevance; relevance distribution; redundancy (user setting); physical limit of redundancy; end of content]
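
One way to turn a relevance distribution into per-file redundancy is sketched below. The mapping is an assumption, since the slide gives only figure labels: the most relevant file receives the user's redundancy setting, capped by the number of substores (taken here as the physical limit), and less relevant files scale down proportionally but never below one copy. The function name and the sample scores are hypothetical.

```python
import math


def copies_per_file(relevance: dict[str, float], user_redundancy: int,
                    n_substores: int) -> dict[str, int]:
    """Map each file's relevance score to a replica count.

    Files are returned in descending order of relevance.
    """
    cap = min(user_redundancy, n_substores)   # physical limit of redundancy
    top = max(relevance.values())
    ranked = sorted(relevance.items(), key=lambda kv: kv[1], reverse=True)
    return {name: max(1, math.ceil(cap * (score / top)))
            for name, score in ranked}


# Hypothetical relevance scores, e.g. counts of social-graph links.
relevance = {"notes.txt": 9.0, "report.doc": 18.0, "old.zip": 1.0}
plan = copies_per_file(relevance, user_redundancy=5, n_substores=4)
```

Here the user asks for 5 copies but only 4 substores exist, so the top file gets 4 copies and the tail of the distribution gets 1.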

Page 14: High Availability Cloud Storage as a Software Package with Social Graph, Throughput Awareness, and Smart Distribution Features

Design: Redundancy Logic (2)

Page 15: High Availability Cloud Storage as a Software Package with Social Graph, Throughput Awareness, and Smart Distribution Features

haStore: Social Graph

Page 16: High Availability Cloud Storage as a Software Package with Social Graph, Throughput Awareness, and Smart Distribution Features

Social Graph : Basics

• current version: only simple types of links
• no link strength

[Table, flattened in extraction: "There is a link" / "Between" / "When a file is ..." with columns Created, Viewed, Edited, Deleted]
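
A graph with simple, unweighted links can be built from a log of file events, as in this sketch. The link endpoints are an assumption: the table's "Between" column did not survive extraction, so links are taken here to connect a user and a file. The event tuple layout and function names are hypothetical.

```python
from collections import defaultdict

# The four file events named on the slide.
LINKING_EVENTS = {"created", "viewed", "edited", "deleted"}


def build_links(events):
    """Return an unweighted user -> set-of-files map from (user, action, file) events."""
    links = defaultdict(set)
    for user, action, filename in events:
        if action in LINKING_EVENTS:
            links[user].add(filename)   # a set: repeated events add no strength
    return links


def related_users(links, user):
    """Users who share at least one file link with the given user."""
    mine = links.get(user, set())
    return {other for other, files in links.items()
            if other != user and files & mine}


events = [
    ("A-san", "created", "plan.doc"),
    ("B-san", "viewed", "plan.doc"),
    ("B-san", "edited", "budget.xls"),
    ("C-san", "created", "memo.txt"),
    ("C-san", "printed", "memo.txt"),   # not a linking event, ignored
]
links = build_links(events)
peers = related_users(links, "A-san")
```

Since there is no link strength in this version, a file viewed once and a file edited daily produce the same single link.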

Page 17: High Availability Cloud Storage as a Software Package with Social Graph, Throughput Awareness, and Smart Distribution Features

Social Graph : Advanced

• community detection

• files that could be linked:
1. touched at roughly the same time
2. touched by the same user
3. same location, filetype, size, etc.
• link strength, different for each kind of relation, variable e2e cost on paths

• discovery based on e2e cost, not hop count
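
Discovery by end-to-end cost rather than hop count can be sketched with Dijkstra's algorithm and a cost budget. The per-relation costs in `COST`, the relation names and the `discover` function are hypothetical; the slides state only that link strength differs by kind of relation. Edges in this sketch are directed.

```python
import heapq

# Hypothetical per-relation costs (lower = stronger link); not from the slides.
COST = {"same_user": 1.0, "same_time": 2.0, "same_type": 5.0}


def discover(graph, start, budget):
    """Files reachable from start with total path cost <= budget (Dijkstra)."""
    best = {start: 0.0}
    heap = [(0.0, start)]
    while heap:
        cost, node = heapq.heappop(heap)
        if cost > best.get(node, float("inf")):
            continue  # stale heap entry
        for neighbor, relation in graph.get(node, []):
            new_cost = cost + COST[relation]
            if new_cost <= budget and new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                heapq.heappush(heap, (new_cost, neighbor))
    del best[start]
    return best


# file -> list of (linked file, kind of relation)
graph = {
    "a.doc": [("b.doc", "same_user"), ("z.zip", "same_type")],
    "b.doc": [("c.doc", "same_time")],
}
found = discover(graph, "a.doc", budget=4.0)
```

In this example `c.doc` is two hops away but is discovered at cost 3.0, while `z.zip` is one hop away and is excluded because its single weak link costs 5.0.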

Page 18: High Availability Cloud Storage as a Software Package with Social Graph, Throughput Awareness, and Smart Distribution Features

Implementation, Tests

Page 19: High Availability Cloud Storage as a Software Package with Social Graph, Throughput Awareness, and Smart Distribution Features

Performance : Demo

[Demo screenshot, labels only: A-san, B-san; DBX, GDR; timestamp 2014-01-22 12:13:30; legend: Block DONE, Block UPLOAD, Block DOWNLOAD]

• also demo

Page 20: High Availability Cloud Storage as a Software Package with Social Graph, Throughput Awareness, and Smart Distribution Features

Wrapup

• haStore: high availability cloud store

• main features

◦ throughput-aware sync/redundancy optimization
◦ sub-file blocks, smart distribution
◦ social graph
• current status: v1.0 in operation, v2.0 on the way

Page 21: High Availability Cloud Storage as a Software Package with Social Graph, Throughput Awareness, and Smart Distribution Features

That’s all, thank you ...
