Differentiated Storage Services

Michael Mesnier, Jason Akers, Feng Chen (Intel Corporation); Tian Luo (The Ohio State University). 23rd ACM Symposium on Operating Systems Principles (SOSP), October 23-26, 2011, Cascais, Portugal.



Page 1: Differentiated Storage Services

Differentiated Storage Services

Michael Mesnier, Jason Akers, Feng Chen
Intel Corporation

Tian Luo
The Ohio State University

23rd ACM Symposium on Operating Systems Principles (SOSP)

October 23-26, 2011, Cascais, Portugal

Page 2: Differentiated Storage Services

An analogy: moving & shipping

Why should computer storage be any different?

Technology overview

Classification, policy assignment, policy enforcement

Page 3: Differentiated Storage Services

Differentiated Storage Services

Classify each I/O in-band

Policy assignment (offline):

Classifier      QoS policy
Metadata        Low latency
Boot files      Low latency
Small files     High throughput
Media files     High bandwidth
…               …

[Figure: a computer system (applications or DB, file system, operating system, each with I/O classification) above a storage system (management firmware, storage controller, QoS policies, QoS mechanisms, storage pools A, B and C). Legend: current & future research.]

Page 4: Differentiated Storage Services

The SCSI CDB

5 bits: 32 classes
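As a rough illustration of how a 5-bit classifier could travel inside a command, the sketch below packs a class into the GROUP NUMBER field of a 10-byte READ/WRITE CDB (bits 4:0 of byte 6 in the SBC layout). The helper names `build_cdb10` and `cdb10_group` are hypothetical, and this is not the authors' kernel patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { OP_READ_10 = 0x28, OP_WRITE_10 = 0x2A, CDB10_LEN = 10 };

/* Fill a 10-byte READ(10)/WRITE(10) CDB; `group` is the 5-bit I/O class.
 * Hypothetical helper, for illustration only. */
void build_cdb10(uint8_t cdb[CDB10_LEN], uint8_t opcode,
                 uint32_t lba, uint16_t blocks, uint8_t group)
{
    assert(group < 32);                    /* 5 bits give 32 classes */
    memset(cdb, 0, CDB10_LEN);
    cdb[0] = opcode;
    cdb[2] = (uint8_t)(lba >> 24);         /* LBA, big-endian */
    cdb[3] = (uint8_t)(lba >> 16);
    cdb[4] = (uint8_t)(lba >> 8);
    cdb[5] = (uint8_t)lba;
    cdb[6] = (uint8_t)(group & 0x1F);      /* GROUP NUMBER, bits 4:0 */
    cdb[7] = (uint8_t)(blocks >> 8);       /* transfer length, big-endian */
    cdb[8] = (uint8_t)blocks;
}

/* Storage-side view: recover the class from a received CDB. */
uint8_t cdb10_group(const uint8_t cdb[CDB10_LEN])
{
    return (uint8_t)(cdb[6] & 0x1F);
}
```

For example, `build_cdb10(cdb, OP_WRITE_10, 231495, 8, 9)` produces a command whose group field reads back as 9 via `cdb10_group(cdb)`.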

Page 6: Differentiated Storage Services

Filesystem prototypes (Ext3 & NTFS)

Classify each I/O in-band

Classifier        Cache priority
Metadata          0
Journal           0
Directories       0
Files <= 4KB      1
Files <= 16KB     2
Files <= 64KB     3
…                 …
Files > 1GB       Lowest

[Figure: a computer system (applications or DB, file system, operating system, each with I/O classification) above a storage system (management firmware, storage controller, QoS policies, QoS mechanisms, disk and SSD). Legend: current & future research.]

FS classification, FS policy assignment, FS policy enforcement

Page 7: Differentiated Storage Services

Database prototype (PostgreSQL)

Classify each I/O in-band

Classifier                   Cache priority
System tables                0
Temp. tables (on write)      1
Randomly accessed tables     2
Temp. tables (on read)       3
Sequential tables            Bypass
Index files                  Bypass

[Figure: a computer system (applications or DB, file system, operating system, each with I/O classification) above a storage system (management firmware, storage controller, QoS policies, QoS mechanisms, disk and SSD). Legend: current & future research.]

DB classification, DB policy assignment, DB policy enforcement
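The database policy table can be read as a small function from I/O class and direction to a cache decision. The sketch below is one way to write that down, assuming the group numbers listed on the "Preliminary DB classes" slide; the names `pg_cache_priority`, `PRIO_BYPASS` and `PRIO_UNLISTED` are hypothetical, and classes that the table does not list (transaction log, free space map) are deliberately left unassigned rather than guessed.

```c
#include <assert.h>
#include <stdbool.h>

/* Group numbers as listed on the "Preliminary DB classes" slide. */
enum pg_group {
    PG_UNCLASSIFIED = 0,  PG_TRANSACTION_LOG = 19, PG_SYSTEM_TABLE = 20,
    PG_FREE_SPACE_MAP = 21, PG_TEMP_TABLE = 22,    PG_RANDOM_TABLE = 23,
    PG_SEQUENTIAL_TABLE = 24, PG_INDEX_FILE = 25
};

/* Hypothetical sentinels: bypass the cache, or no priority given in the table. */
enum { PRIO_BYPASS = -1, PRIO_UNLISTED = -2 };

/* Map a class and I/O direction to a cache priority (0 is highest). */
int pg_cache_priority(enum pg_group g, bool is_write)
{
    switch (g) {
    case PG_SYSTEM_TABLE:     return 0;
    case PG_TEMP_TABLE:       return is_write ? 1 : 3;  /* priority depends on direction */
    case PG_RANDOM_TABLE:     return 2;
    case PG_SEQUENTIAL_TABLE: return PRIO_BYPASS;
    case PG_INDEX_FILE:       return PRIO_BYPASS;
    default:                  return PRIO_UNLISTED;     /* not in the slide's table */
    }
}
```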

Page 8: Differentiated Storage Services

Selective cache algorithms

Selective allocation
– Always allocate high-priority classes
  – E.g., FS metadata and DB system tables are always allocated
– Conditionally allocate low-priority classes
  – Depends on cache pressure, cache contents, etc.
  – High/low cutoff is a tunable parameter

Selective eviction
– Evict in priority order (lowest priority first)
  – E.g., temporary DB tables are evicted before system tables
– Trivially implemented by managing one LRU per class
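A minimal sketch of the two policies, under simplifying assumptions that are not from the talk: the cache holds only four blocks, "cache pressure" is reduced to "the cache is full", and hits on already-cached blocks are not handled. All names (`sel_cache`, `sel_insert`, and so on) are hypothetical. Priority 0 is the highest, and each priority keeps its own LRU list, so eviction simply takes the oldest block from the lowest-priority non-empty list.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NPRIO  4            /* number of priority levels (illustrative) */
#define CAP    4            /* total cache capacity in blocks (tiny on purpose) */
#define NO_LBA UINT64_MAX

struct lru { uint64_t lba[CAP]; int n; };           /* lba[0] is least recently used */
struct sel_cache { struct lru q[NPRIO]; int used; int cutoff; };

void sel_init(struct sel_cache *c, int cutoff)
{
    memset(c, 0, sizeof *c);
    c->cutoff = cutoff;      /* priorities below the cutoff count as "high" */
}

/* Selective allocation: high priority always, low priority only without pressure. */
int sel_should_allocate(const struct sel_cache *c, int prio)
{
    if (prio < c->cutoff)
        return 1;
    return c->used < CAP;
}

/* Selective eviction: oldest block of the lowest-priority non-empty list. */
uint64_t sel_evict(struct sel_cache *c)
{
    for (int p = NPRIO - 1; p >= 0; p--) {
        struct lru *l = &c->q[p];
        if (l->n == 0)
            continue;
        uint64_t victim = l->lba[0];
        memmove(&l->lba[0], &l->lba[1], (size_t)(l->n - 1) * sizeof l->lba[0]);
        l->n--;
        c->used--;
        return victim;
    }
    return NO_LBA;
}

/* Returns 1 if the block was cached; *evicted holds the victim LBA, if any. */
int sel_insert(struct sel_cache *c, uint64_t lba, int prio, uint64_t *evicted)
{
    assert(prio >= 0 && prio < NPRIO);
    *evicted = NO_LBA;
    if (!sel_should_allocate(c, prio))
        return 0;
    if (c->used == CAP)
        *evicted = sel_evict(c);
    struct lru *l = &c->q[prio];
    l->lba[l->n++] = lba;    /* append at the most-recently-used end */
    c->used++;
    return 1;
}
```

With a cutoff of 1, a full cache rejects a new priority-2 block but still admits a priority-0 block by evicting the lowest-priority resident.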

Page 9: Differentiated Storage Services


Technology development

Page 10: Differentiated Storage Services

Ext3 prototype

OS changes (block layer)
– Add classifier to I/O requests
– Only coalesce like-class requests
– Copy classifier into SCSI CDB

Ext3 changes
– 18 classes identified
– Optimized for a file server (small files & metadata)

A small kernel patch; a one-time change to the FS

Ext3 class         Group number    Cache priority
Unclassified       0               12
Superblock         1               0
Group desc.        2               0
Bitmap             3               0
Inode              4               0
Indirect block     5               0
Directories        6               0
Journal            7               0
File <= 4KB        8               1
File <= 16KB       9               2
File <= 64KB       10              3
…                  …               …
File > 1GB         18              11
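The file-size classes and the "only coalesce like-class requests" rule can be sketched as follows. The table elides the thresholds between 64KB and 1GB; the code assumes they keep quadrupling (4KB, 16KB, 64KB, ... 1GB), which is an assumption, although it does yield exactly the eleven file classes (groups 8 to 18, priorities 1 to 11) shown. The names `classify_file_size`, `io_req` and `can_coalesce` are hypothetical and do not come from the Ext3 patch.

```c
#include <assert.h>
#include <stdint.h>

struct ext3_class { int group; int priority; };

/* Map a file size to its group number and cache priority.
 * ASSUMPTION: thresholds quadruple from 4KB up to 1GB. */
struct ext3_class classify_file_size(uint64_t size_bytes)
{
    uint64_t limit = 4096;                       /* File <= 4KB is group 8 */
    for (int k = 0; k < 10; k++, limit *= 4) {
        if (size_bytes <= limit)
            return (struct ext3_class){ 8 + k, 1 + k };
    }
    return (struct ext3_class){ 18, 11 };        /* File > 1GB */
}

/* A block-layer request reduced to the fields that matter here. */
struct io_req { uint64_t lba; uint32_t sectors; int group; };

/* Merge b onto the end of a only if contiguous and of the same class. */
int can_coalesce(const struct io_req *a, const struct io_req *b)
{
    return a->group == b->group && a->lba + a->sectors == b->lba;
}
```

Keeping the class check in the merge test means a merged request never mixes, say, journal blocks with file data, so the single classifier copied into the CDB stays accurate.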

Page 11: Differentiated Storage Services

Ext3 classification illustrated

echo 'Hello, world!' >> foo; sync

– READ_10(lba 231495 len 8 grp 9) <=4KB
– WRITE_10(lba 231495 len 8 grp 9) <=4KB
– WRITE_10(lba 16519223 len 8 grp 8) Journal
– WRITE_10(lba 16519231 len 8 grp 8) Journal
– WRITE_10(lba 16519239 len 8 grp 8) Journal
– WRITE_10(lba 16519247 len 8 grp 8) Journal
– WRITE_10(lba 8279 len 8 grp 5) Inode

7 I/Os (28KB) to write 13 bytes
– Metadata accounts for most of the overhead

I/O classification shows read-modify-write and metadata updates

NTFS classification is implemented with Windows filter drivers
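The 28KB figure can be checked by tallying the trace. The sketch below assumes 512-byte logical blocks, which the slide does not state but which matches its total: 7 I/Os of 8 blocks each come to 28,672 bytes. The labels are copied from the trace as shown; `trace_bytes` is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_BYTES 512u   /* ASSUMPTION: 512-byte logical blocks */

struct trace_io { const char *op; uint32_t lba; uint32_t len; int grp; const char *label; };

/* The seven I/Os listed on the slide. */
static const struct trace_io hello_trace[] = {
    { "READ_10",  231495,   8, 9, "<=4KB"   },
    { "WRITE_10", 231495,   8, 9, "<=4KB"   },
    { "WRITE_10", 16519223, 8, 8, "Journal" },
    { "WRITE_10", 16519231, 8, 8, "Journal" },
    { "WRITE_10", 16519239, 8, 8, "Journal" },
    { "WRITE_10", 16519247, 8, 8, "Journal" },
    { "WRITE_10", 8279,     8, 5, "Inode"   },
};

/* Sum the bytes moved by entries matching `label`; NULL matches every entry. */
uint64_t trace_bytes(const char *label)
{
    uint64_t total = 0;
    size_t n = sizeof hello_trace / sizeof hello_trace[0];
    for (size_t i = 0; i < n; i++) {
        if (label == NULL || strcmp(hello_trace[i].label, label) == 0)
            total += (uint64_t)hello_trace[i].len * SECTOR_BYTES;
    }
    return total;
}
```

Under that assumption the journal and inode writes account for 20KB of the 28KB, with the remaining 8KB being the read-modify-write of the file's own block.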

Page 12: Differentiated Storage Services

PostgreSQL prototype

Classification API: scatter/gather I/O

OS changes (block layer)
– Add O_CLASSIFIED file flag
– Extract classifier from SG I/O

A small OS & DB patch; a one-time change to the OS & DB

Preliminary DB classes

PostgreSQL class     Group number
Unclassified         0
Transaction log      19
System table         20
Free space map       21
Temporary table      22
Random table         23
Sequential table     24
Index file           25
Reserved             26-31

    /* O_CLASSIFIED is the file flag added by the OS patch described above. */
    unsigned char class = 19;              /* transaction log */
    struct iovec myiov[2];
    int fd = open("foo", O_RDWR|O_CLASSIFIED, 0666);
    myiov[0].iov_base = &class;            /* first element carries the classifier */
    myiov[0].iov_len = 1;
    myiov[1].iov_base = "Hello, world!";   /* remaining elements carry the data */
    myiov[1].iov_len = 13;
    writev(fd, myiov, 2);
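On the receiving side, "extract classifier from SG I/O" amounts to peeling the first scatter/gather element off the vector. The following is a userspace sketch of that step, not the kernel patch itself: the function name `extract_classifier` and its error convention are assumptions, and it relies only on the standard `struct iovec` from `<sys/uio.h>`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

/* Split a classified vector into its 1-byte class and the data elements.
 * Returns 0 on success, -1 if the vector is not in the expected shape. */
int extract_classifier(const struct iovec *iov, int iovcnt, uint8_t *cls,
                       const struct iovec **payload, int *payload_cnt)
{
    if (iovcnt < 2 || iov[0].iov_base == NULL || iov[0].iov_len != 1)
        return -1;                         /* need a class byte plus data */
    uint8_t c = *(const uint8_t *)iov[0].iov_base;
    if (c > 31)
        return -1;                         /* class must fit in 5 bits */
    *cls = c;
    *payload = &iov[1];                    /* the rest is ordinary data */
    *payload_cnt = iovcnt - 1;
    return 0;
}
```

Applied to the two-element vector from the slide, this yields class 19 and a single 13-byte data element.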

Page 13: Differentiated Storage Services

Cache implementations

Fully associative read/write LRU cache
– Insert(), Lookup(), Delete(), etc.
– Hash table maps disk LBA to SSD LBA
– Syncer daemon asynchronously cleans cache

Monitors cache pressure for selective allocation
Maintains multiple LRU lists for selective eviction

Front-ends: iSCSI (OS independent) and Linux MD
MD cache module (RAID-9)

Striping:  mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/sdd /dev/sde
Mirroring: mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdd /dev/sde
RAID-9:    mdadm --create /dev/md0 --level=9 --raid-devices=2 <cache> <base>
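The disk-LBA-to-SSD-LBA mapping behind Insert(), Lookup() and Delete() could look roughly like the chained hash table below. This is a sketch under assumptions: the bucket count, the multiplicative hash and the names `lba_map`, `map_insert`, `map_lookup` and `map_delete` are all illustrative, and the real cache also has to track dirty state and LRU position, which is omitted here.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define NBUCKETS 1024       /* illustrative size; must match the 10-bit hash */

struct map_entry { uint64_t disk_lba, ssd_lba; struct map_entry *next; };
struct lba_map { struct map_entry *bucket[NBUCKETS]; };

/* Multiplicative hash; the top 10 bits select one of 1024 buckets. */
static size_t lba_hash(uint64_t lba)
{
    return (size_t)((lba * 0x9E3779B97F4A7C15ULL) >> 54);
}

/* Returns 1 and sets *ssd_lba on a hit, 0 on a miss. */
int map_lookup(const struct lba_map *m, uint64_t disk_lba, uint64_t *ssd_lba)
{
    for (const struct map_entry *e = m->bucket[lba_hash(disk_lba)]; e; e = e->next) {
        if (e->disk_lba == disk_lba) {
            *ssd_lba = e->ssd_lba;
            return 1;
        }
    }
    return 0;
}

/* Adds or updates a mapping; returns 0 on success, -1 if out of memory. */
int map_insert(struct lba_map *m, uint64_t disk_lba, uint64_t ssd_lba)
{
    size_t h = lba_hash(disk_lba);
    for (struct map_entry *e = m->bucket[h]; e; e = e->next) {
        if (e->disk_lba == disk_lba) {
            e->ssd_lba = ssd_lba;
            return 0;
        }
    }
    struct map_entry *e = malloc(sizeof *e);
    if (e == NULL)
        return -1;
    e->disk_lba = disk_lba;
    e->ssd_lba = ssd_lba;
    e->next = m->bucket[h];
    m->bucket[h] = e;
    return 0;
}

/* Removes a mapping; returns 1 if it existed, 0 otherwise. */
int map_delete(struct lba_map *m, uint64_t disk_lba)
{
    struct map_entry **pp = &m->bucket[lba_hash(disk_lba)];
    for (; *pp; pp = &(*pp)->next) {
        if ((*pp)->disk_lba == disk_lba) {
            struct map_entry *dead = *pp;
            *pp = dead->next;
            free(dead);
            return 1;
        }
    }
    return 0;
}
```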

Page 14: Differentiated Storage Services


Evaluation

Page 15: Differentiated Storage Services

Experimental setup

Host OS (Xeon, 2-way, quad-core, 12GB RAM)
– Linux 2.6.34 (patched as described)

Target storage system
– HW RAID array + X25-E cache

Workloads and cache sizes
– SPECsfs: 18GB (10% of 184GB working set)
– TPC-H: 8GB (28% of 29GB working set)

Comparison
– LRU versus LRU-S (LRU with selective caching)
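The quoted cache-to-working-set ratios are rounded figures; a small check, with a hypothetical helper name, shows how they follow from the sizes given (18/184 is about 9.8%, and 8/29 is about 27.6%).

```c
#include <assert.h>

/* Cache size as a percentage of the working set, rounded to the nearest integer. */
unsigned cache_percent(unsigned cache_gb, unsigned working_set_gb)
{
    assert(working_set_gb > 0);
    return (100u * cache_gb + working_set_gb / 2u) / working_set_gb;
}
```

Here `cache_percent(18, 184)` gives 10 and `cache_percent(8, 29)` gives 28, matching the slide.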

Page 16: Differentiated Storage Services

SPECsfs I/O breakdown

[Figure: I/O breakdown under LRU and under LRU-S]

LRU: large files pollute the LRU cache (metadata and small files evicted)
LRU-S: fences off large file I/O

Page 17: Differentiated Storage Services

SPECsfs performance metrics

[Figure: four panels (syncer overhead, I/O throughput, hit rate, running time); series: LRU, LRU-S, HDD]

1.8x speedup

Page 18: Differentiated Storage Services

SPECsfs file latencies

[Figure: reduction in write latency over HDD, LRU and LRU-S]
LRU suffers from write outliers (from eviction overheads)

[Figure: reduction in read latency over HDD, LRU and LRU-S]
LRU-S reduces read latency (most small files are cached)

Page 19: Differentiated Storage Services

TPC-H I/O breakdown

[Figure: I/O breakdown under LRU and under LRU-S]

LRU: indexes pollute the LRU cache (user tables evicted)
LRU-S: fences off index files

Page 20: Differentiated Storage Services

TPC-H performance metrics

[Figure: four panels (syncer overhead, I/O throughput, running time, hit rate); series: LRU, LRU-S, HDD]

1.2x speedup

Page 21: Differentiated Storage Services

Conclusion & future work

Intelligent caching is just the beginning
– Other types of performance differentiation
– Security, reliability, retention, …

Other applications we're looking at
– Databases
– Hypervisors
– Cloud storage
– Big Data (NoSQL DB)

Work already underway in T10
Open source coming soon…

Thank you!

Questions?