Geo-Distribution. 唐宇, November 2013. Outline: Geo-distribution, SMFS, RACS

SDP-MARCH-Talk

Geo-Distribution

唐宇, November 2013

Outline

• Geo-distribution
• SMFS
• RACS

Why geo-distribution?

• Securing data from large-scale disasters is important.
  – 40% of enterprises that experience a disaster (e.g. loss of a site) go out of business within five years.
  – A data-loss failure at a large bank can have much greater consequences, with potentially global implications.

Open questions

• The trade-off involves balancing safety against performance
  – Synchronous
    • Sensitive to link latency
  – Semi-synchronous
    • Data can still be lost if disaster strikes
  – Fully asynchronous
    • Weakest safety guarantees

Outline

• Geo-distribution
• SMFS
• RACS

SMFS (Smoke and Mirrors File System)

• Smoke and Mirrors: Reflecting Files at a Geographically Remote Location Without Loss of Performance
  – FAST '09
  – Hakim Weatherspoon, Lakshmi Ganesh, Tudor Marian, Mahesh Balakrishnan, and Ken Birman
  – Cornell University & Microsoft Research (Silicon Valley)

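The paper's work

• Beyond the three common approaches to geo-distribution, the paper proposes a new one, network-sync:
  – It offers stronger guarantees on data reliability than semi-synchronous and asynchronous solutions while retaining their performance
• Support for atomic updates across multiple files (omitted here)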

Data Loss Model

• We consider data to be lost if an update has been acknowledged to the client, but the corresponding data no longer exists in the system.
  – Synchronous
    • When both the primary and mirror sites fail.
  – Semi-synchronous
    • When the primary site fails and sent packets do not make it to the mirror.
  – Asynchronous
    • When the primary site fails.
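
The three loss conditions above can be written as a small predicate. This is only a sketch of the slide's model: the function name, the mode strings, and the boolean parameters are hypothetical names chosen for illustration.

```python
def data_lost(mode, primary_failed, mirror_failed=False, in_flight_delivered=True):
    """Return True if an acknowledged update is lost under the slide's loss model.

    mode                 -- "synchronous", "semi-synchronous", or "asynchronous"
    primary_failed       -- the primary site has failed
    mirror_failed        -- the mirror site has failed
    in_flight_delivered  -- packets already sent reached the mirror
    """
    if mode == "synchronous":
        # The mirror holds every acknowledged update, so both sites must fail.
        return primary_failed and mirror_failed
    if mode == "semi-synchronous":
        # Updates are acknowledged once sent; they are lost if they never arrive.
        return primary_failed and not in_flight_delivered
    if mode == "asynchronous":
        # Acknowledged updates may not have left the primary at all.
        return primary_failed
    raise ValueError(f"unknown mode: {mode}")
```

For example, `data_lost("synchronous", primary_failed=True)` is `False`, while the same failure under `"asynchronous"` gives `True`.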

Failure Model and Assumptions

• Failures can occur at any level
  – Storage devices, storage area network, network links, switches, hubs, wide-area network, and/or an entire site
• Failures can occur simultaneously or in sequence
• Sites may have redundant network paths connecting them
  – This allows us to focus on tolerating failures that disable an entire site, and on combinations of failures such as the loss of both an entire site and the network connecting it to the backup (a rolling disaster)

How network-sync works, in outline

1. It proactively adds redundancy at the network level to transmitted data.
2. It exposes the level of in-network redundancy added for any sent data via feedback notifications.
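
The slide does not say what the sender does with those feedback notifications. One plausible use, assumed here for illustration, is to hold back the client acknowledgement until the update is written locally and the network reports that enough redundancy has been sent. The class, method names, and threshold parameter below are all hypothetical.

```python
class NetworkSyncPrimary:
    """Sketch: acknowledge an update only after a local write AND enough
    in-network redundancy has been reported (an assumed policy)."""

    def __init__(self, required_repair_packets):
        self.required = required_repair_packets
        self.local_done = set()   # updates written to local storage
        self.repair_sent = {}     # update id -> repair packets reported so far
        self.acked = []           # updates acknowledged to the client, in order

    def on_local_write(self, update_id):
        self.local_done.add(update_id)
        self._maybe_ack(update_id)

    def on_feedback(self, update_id, repair_packets_sent):
        # Feedback notification from the network layer (point 2 above).
        self.repair_sent[update_id] = repair_packets_sent
        self._maybe_ack(update_id)

    def _maybe_ack(self, update_id):
        enough = self.repair_sent.get(update_id, 0) >= self.required
        if update_id in self.local_done and enough and update_id not in self.acked:
            self.acked.append(update_id)
```

Under this policy the acknowledgement never waits for a wide-area round trip, only for the local write and the local feedback notification.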

Network-sync implementation

Maelstrom

• Forward Error Correction (FEC)
  – A generic term for a broad collection of techniques aimed at proactively recovering from packet loss or corruption.
  – FEC implementations for data generated in real time are typically parameterized by a rate (r, c): for every r data packets, c error-correction packets are introduced into the stream.
• Maelstrom is one implementation of FEC
  – Its performance is tolerant to random and bursty loss
  – Based on the TCP protocol
• If Maelstrom cannot repair the damaged data either, the packets must be retransmitted
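
To make the (r, c) rate concrete, here is a minimal sketch of the simplest case, c = 1, using plain XOR parity over equal-length packets. It is not Maelstrom's actual encoding; the function names are hypothetical and the scheme is chosen only because it is easy to follow.

```python
from functools import reduce


def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length packets."""
    return bytes(x ^ y for x, y in zip(a, b))


def fec_encode(packets, r):
    """Rate (r, 1): after every r data packets, append one XOR repair packet."""
    out = []
    for start in range(0, len(packets), r):
        group = packets[start:start + r]
        out.extend(group)
        out.append(reduce(xor_bytes, group))
    return out


def recover_group(group):
    """group is the data packets followed by their repair packet, with a lost
    packet represented as None. Returns the data packets of the group."""
    missing = [i for i, p in enumerate(group) if p is None]
    if len(missing) > 1:
        # One repair packet cannot fix two losses: fall back to retransmission.
        raise ValueError("more than one loss in a group: retransmission needed")
    group = list(group)
    if missing:
        present = [p for p in group if p is not None]
        # XOR of all surviving packets equals the missing one.
        group[missing[0]] = reduce(xor_bytes, present)
    return group[:-1]
```

With r = 2, four data packets become six packets on the wire, and any single loss within a group of three is repaired without a retransmission.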

Comparison of Mirroring Protocols

• Network-sync can be understood as an enhancement of the semi-synchronous style of mirroring

• It offers performance similar to semi-synchronous solutions, but with increased reliability

Evaluation Configuration

• Local-sync (semi-synchronous)
• Remote-sync (synchronous)
• Network-sync
• Local-sync+FEC
• Remote-sync+FEC

Reliability During Disaster (1)

The remote-sync and remote-sync+FEC solutions do not lose data in this situation

The y-axis shows both the total number of messages sent and total number of messages lost

Reliability During Disaster (2)

Latency is the time between a local storage server sending a request and a remote storage server receiving the request.

Performance evaluation (1)

The x-axis shows the loss probability on the wide-area link, increased from 0% to 1%, while the y-axis shows the throughput achieved by each of these mirroring solutions.

Performance evaluation (2)

Outline

• Geo-distribution
• SMFS
• RACS

RACS

• RACS: A Case for Cloud Storage Diversity
  – SoCC '10
  – Hussam Abu-Libdeh, Lonnie Princehouse, Hakim Weatherspoon
  – Cornell University

Use case

• The increasing popularity of cloud storage is leading organizations to consider moving data out of their own data centers and into the cloud

• It becomes very expensive for organizations to switch storage providers.

• We argue that striping user data across multiple providers can allow customers to avoid vendor lock-in, reduce the cost of switching providers, and better tolerate provider outages or failures.

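The paper's work

• RACS (Redundant Array of Cloud Storage)
  – A cloud storage proxy that transparently stripes data across multiple cloud storage providers
• Contributions
  – Shows through simulation that RACS can cope with partial data unavailability, and reduce dependence on any one storage provider, at an acceptable extra cost
  – Shows through simulation the cost of switching cloud storage providers
  – Demonstrates that RACS is compatible with Amazon S3 clients and can use multiple storage providers as back ends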

Solution

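• How can dependence on a data storage provider be reduced?
• Mirror or replicate the data?
  – The extra overhead is too high
• Solution
  – Similar to RAID 5
  – Erasure coding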

RAID 5

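• Given n disks, the data to be written is split evenly into (n-1) blocks, which are stored on (n-1) disks
• Bitwise parity is computed over the data in those (n-1) blocks and stored on the nth disk (in RAID 5, the disk that holds parity rotates from stripe to stripe)
• If one disk fails, the lost data can be recovered from the other (n-1) disks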
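
The parity rule can be written compactly. The symbols D_1, ..., D_{n-1} (data blocks) and P (parity block) are introduced here for illustration and do not appear on the slide; ⊕ is bitwise XOR.

```latex
P = D_1 \oplus D_2 \oplus \cdots \oplus D_{n-1},
\qquad
D_i = P \oplus \bigoplus_{j \neq i} D_j .
```

As a small worked case with n = 3: if D_1 = 1010 and D_2 = 0110, then P = 1100. Should the disk holding D_1 fail, P ⊕ D_2 = 1100 ⊕ 0110 = 1010 recovers it.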

Erasure coding

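• Erasure coding
  – It transforms a message of k symbols into a longer message with n symbols such that the original message can be recovered from a subset of the n symbols
• Implementation in RACS
  – Given n disks, the data to be written is split evenly into m blocks, which are stored on m disks
  – Redundant data (specially encoded) is computed from the data on those m disks and stored on the remaining (n-m) disks
  – If a disk's data is lost, it can be recovered from any m of the other disks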
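
The "any m of n" property can be demonstrated with a toy erasure code. The sketch below treats the m data symbols as points on a polynomial over the small prime field GF(257) and uses Lagrange interpolation to produce and to recover shares. The field, the function names, and the share layout are assumptions made for illustration; the slides do not say which code RACS uses.

```python
P = 257  # prime modulus; a small field chosen for this sketch


def _interpolate(points, x):
    """Evaluate at x the unique polynomial of degree < len(points)
    that passes through the given (xi, yi) points, modulo P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # pow(den, -1, P) is the modular inverse (Python 3.8+).
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def encode(data, n):
    """Turn m data symbols (ints in 0..255) into n shares (x, y).
    Shares 1..m carry the data itself; shares m+1..n carry redundancy,
    whose values may be 256 and so are kept as ints, not bytes."""
    m = len(data)
    points = [(i + 1, d) for i, d in enumerate(data)]
    return points + [(x, _interpolate(points, x)) for x in range(m + 1, n + 1)]


def decode(shares, m):
    """Rebuild the m data symbols from any m distinct shares."""
    pts = list(shares)[:m]
    return [_interpolate(pts, x) for x in range(1, m + 1)]
```

For instance, `encode([72, 105, 33], 5)` yields five shares, and `decode` returns `[72, 105, 33]` from any three of them, so up to two of the five repositories can be unavailable.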

Advantages of erasure coding

• Tolerating Outages
• Tolerating Data Loss
• Adapting to Price Changes
• Adapting to New Providers
• Control Monetary Spending
• Choice in Data Recovery

Distributed RACS

• To avoid a bottleneck at a single proxy
• ZooKeeper
  – An open-source counterpart of Chubby

Performance overhead

• Storage
  – RACS uses a factor of n/m more storage, plus some additional overhead for metadata associated with each share
• Number of requests
  – n for put, create, and delete operations
  – m for get operations
• Bandwidth
  – RACS increases the bandwidth used by put operations by a factor of n/m, due to the redundant shares
• Latency
  – Put operations must wait for the slowest of the repositories
  – Get latency could be better than the average of all repositories
  – Coordination with ZooKeeper is another source of latency
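
The storage, request, and bandwidth figures follow directly from n and m, so they are easy to tabulate. The helper below is a sketch with hypothetical names; it ignores the per-share metadata overhead and says nothing about latency.

```python
from fractions import Fraction


def racs_overhead(n, m):
    """Overhead of striping over n repositories when any m shares suffice.
    Returns exact ratios, excluding per-share metadata."""
    if not 0 < m <= n:
        raise ValueError("need 0 < m <= n")
    factor = Fraction(n, m)
    return {
        "storage_factor": factor,        # n/m times the original data size
        "put_bandwidth_factor": factor,  # redundant shares are uploaded too
        "requests_per_put": n,           # also applies to create and delete
        "requests_per_get": m,
    }
```

For example, `racs_overhead(9, 8)` gives a storage factor of 9/8, i.e. 12.5% extra storage, with nine requests per put and eight per get.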

About the experimental dataset

• The data covers 18 months of activity on the Internet Archive (IA, http://www.archive.org) servers.

• The trace represents HTTP and FTP interactions to read and write various documents and media files (images, sounds, videos) stored at the Internet Archive and served to users.

Experimental dataset

Monetary cost

Monthly costs breakdown

Other results

Tolerating a vendor price hike

The cost of switching the Internet Archive’s storage provider

All response times averaged over four runs

Future work

1. Relationships between storage providers are not yet considered
  – The virtual compute nodes of Amazon EC2 can read from and write to Amazon S3 storage with low latency and no bandwidth charges
2. Heterogeneity of storage providers
  – Cloud providers
  – Cluster
  – Desktop PC

Thanks!

Q&A