Post on 26-Jul-2020

Larger-than-Memory Data Management on Modern Storage Hardware for In-Memory OLTP Database Systems

Lin Ma, Joy Arulraj, Sam Zhao, Andrew Pavlo, Subramanya R. Dulloor, Michael J. Giardino, Jeff Parkhurst, Jason L. Gardner, Kshitij Doshi, Col. Stanley Zdonik

MOTIVATION

• Allow an in-memory DBMS to store and access data on disk without bringing back all the slow parts of a disk-oriented DBMS.

• The properties of different storage devices may affect important design decisions.

STORAGE TECHNOLOGIES

• 10m Tuples – 1KB each
• Synchronization Enabled

[Figure: latency (nanosec, log scale) of 1KB Read, 1KB Write, 64KB Read, and 64KB Write operations on HDD, SMR, SSD, 3D XPoint, NVRAM, and DRAM. Legend: random, sequential.]

DESIGN DECISIONS

• Hardware independent policies
  – Cold Tuple Identification
  – Evicted Tuple Meta-data

• Hardware dependent policies
  – Cold Tuple Retrieval
  – Merging Threshold
  – Access Methods

HARDWARE INDEPENDENT POLICIES

INDEPENDENT POLICIES

• Cold Tuple Identification
  – Option #1: On-line identification
  – Option #2: Off-line identification

• Evicted Tuple Meta-data
  – Option #1: Marker to represent the on-disk position
  – Option #2: Bloom filter
  – Option #3: Rely on virtual paging
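The marker option (Option #1 for evicted tuple meta-data) can be illustrated with a minimal Python sketch. It is not the H-Store implementation: `EvictedMarker`, `Table`, and `evict` are hypothetical names, and an in-memory list stands in for the cold-data storage device.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvictedMarker:
    """Placeholder left in the table heap; points at the tuple's on-disk position."""
    block_id: int
    offset: int


class Table:
    def __init__(self):
        self.heap = {}         # key -> tuple value, or EvictedMarker once evicted
        self.cold_blocks = []  # assumption: a list of blocks stands in for the disk

    def insert(self, key, value):
        self.heap[key] = value

    def evict(self, keys):
        """Write the given tuples into one new cold block and leave markers behind."""
        block_id = len(self.cold_blocks)
        block = []
        for key in keys:
            value = self.heap[key]
            if isinstance(value, EvictedMarker):
                continue  # already evicted
            self.heap[key] = EvictedMarker(block_id, len(block))
            block.append(value)
        self.cold_blocks.append(block)

    def is_evicted(self, key):
        return isinstance(self.heap[key], EvictedMarker)


table = Table()
for i in range(5):
    table.insert(i, f"value-{i}")
table.evict([1, 3, 4])  # tuples chosen by whichever cold-tuple identification is in use
```

In this sketch the index keeps pointing at the heap entry, so a lookup discovers the eviction only when it reaches the marker; that costs a little memory per evicted tuple, in contrast to the Bloom filter and virtual paging options.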

EVICTED TUPLE META-DATA

[Figure: an In-Memory Table Heap (Tuples #00 to #04) and Cold-Data Storage (a header followed by Tuples #01, #03, and #04). Other labels: In-Memory Index, Access Frequency (Tuples #00 to #05), three <Block,Offset> markers, Index, Bloom Filter.]
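The Bloom filter option for evicted tuple meta-data can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the design from the talk: the class name, the bit-array size, the number of hash functions, and the use of salted SHA-256 digests are all choices made for the example.

```python
import hashlib


class BloomFilter:
    """Approximate set membership: no false negatives, occasional false positives."""

    def __init__(self, num_bits=1 << 16, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, key):
        # Derive num_hashes bit positions from salted SHA-256 digests of the key.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))


# Record the keys of evicted tuples; hot tuples are never added.
evicted = BloomFilter()
for key in ("tuple-01", "tuple-03", "tuple-04"):
    evicted.add(key)
```

A negative answer from `might_contain` means the tuple was certainly never evicted, so the cold store can be skipped. A positive answer may be a false positive and still has to be confirmed against the cold-data storage.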

HARDWARE DEPENDENT POLICIES

COLD TUPLE RETRIEVAL

• Option #1: Abort-and-Restart

[Figure: a transaction issues reads for Tuple #00, Tuple #01, and Tuple #02. The In-Memory Table Heap holds Tuples #00 and #02; Cold-Data Storage shows a header and Tuples #03 and #04; Tuple #01 is the evicted tuple being requested. The transaction is marked "Abort".]

COLD TUPLE RETRIEVAL

• Option #2: Synchronous Retrieval

[Figure: a transaction issues reads for Tuple #00, Tuple #01, and Tuple #02. The In-Memory Table Heap holds Tuples #00 and #02; Cold-Data Storage shows a header and Tuples #03 and #04; Tuple #01 is the evicted tuple being requested. The transaction is marked "Stall".]
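The difference between the two retrieval options can be made concrete with a small Python sketch. It is a simplified model rather than H-Store's code: `Store`, `EvictedAccess`, and the two `run_*` functions are hypothetical names, a dictionary stands in for the cold-data storage device, and every fetched tuple is merged straight back into memory.

```python
class EvictedAccess(Exception):
    """Raised when a transaction touches a tuple that lives in cold storage."""

    def __init__(self, key):
        super().__init__(key)
        self.key = key


class Store:
    def __init__(self, hot, cold):
        self.hot = dict(hot)    # in-memory table heap
        self.cold = dict(cold)  # assumption: a dict stands in for the storage device
        self.fetches = 0

    def fetch(self, key):
        """Move one tuple from cold storage back into memory."""
        self.hot[key] = self.cold.pop(key)
        self.fetches += 1

    def read_or_abort(self, key):
        if key not in self.hot:
            raise EvictedAccess(key)
        return self.hot[key]

    def read_sync(self, key):
        if key not in self.hot:
            self.fetch(key)  # the transaction stalls here until the tuple arrives
        return self.hot[key]


def run_abort_and_restart(store, keys):
    """Option #1: abort on a cold access, fetch outside the transaction, restart."""
    restarts = 0
    while True:
        try:
            return [store.read_or_abort(k) for k in keys], restarts
        except EvictedAccess as miss:
            store.fetch(miss.key)
            restarts += 1


def run_synchronous(store, keys):
    """Option #2: block inside the transaction while the tuple is retrieved."""
    return [store.read_sync(k) for k in keys]


ar_store = Store({0: "a", 2: "c"}, {1: "b"})
ar_result, ar_restarts = run_abort_and_restart(ar_store, [0, 1, 2])

sr_store = Store({0: "a", 2: "c"}, {1: "b"})
sr_result = run_synchronous(sr_store, [0, 1, 2])
```

Both runs return the same three values. The abort-and-restart run repeats the work done before the miss, while the synchronous run holds the transaction open for the duration of the device access, which is why device latency matters for this choice.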

MERGING THRESHOLD

• Option #1: Always Merge
• Option #2: Merge Only on Update
• Option #3: Selective Merge

[Figure: a transaction issues reads for Tuple #00, Tuple #01, and Tuple #02. The In-Memory Table Heap holds Tuples #00 and #02; Cold-Data Storage shows a header and Tuples #03 and #04; Tuple #01 is the retrieved tuple. A component labeled "Sketch" is also shown.]
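The three merging options can be expressed as one decision function. The Python sketch below is illustrative only: it assumes that the "Sketch" in the figure is a frequency sketch such as a Count-Min Sketch used to estimate how often an evicted tuple is accessed, that updates always trigger a merge under the selective policy, and that a plain count threshold stands in for a "top-k%" cutoff. The names `CountMinSketch` and `should_merge` are hypothetical.

```python
import hashlib


class CountMinSketch:
    """Fixed-size frequency estimator; estimates never fall below the true count."""

    def __init__(self, width=1024, depth=4):
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _columns(self, key):
        for row in range(self.depth):
            digest = hashlib.sha256(f"{row}:{key}".encode()).digest()
            yield row, int.from_bytes(digest[:8], "big") % self.width

    def add(self, key):
        for row, col in self._columns(key):
            self.rows[row][col] += 1

    def estimate(self, key):
        # Collisions can only inflate a counter, so take the minimum over rows.
        return min(self.rows[row][col] for row, col in self._columns(key))


def should_merge(policy, is_update, access_count=0, threshold=0):
    """Decide whether a retrieved cold tuple is merged back into the table heap."""
    if policy == "always":
        return True
    if policy == "update_only":
        return is_update
    if policy == "selective":
        # Assumption: updates always merge; reads merge only for frequently used tuples.
        return is_update or access_count >= threshold
    raise ValueError(f"unknown policy: {policy}")


frequency = CountMinSketch()
for _ in range(3):
    frequency.add("tuple-01")  # three accesses to the same evicted tuple
```

With a threshold of 3, a read of `tuple-01` would be merged under the selective policy, while a tuple accessed only once would be served without being merged back into the table heap.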

ACCESS METHODS

• Option #1: Block-addressable
  – Block-level access through the file system

• Option #2: Byte-addressable (NVRAM)
  – Use mmap through a file system designed for byte-addressable NVRAM (PMFS)
  – Directly operate on NVRAM-resident data as if it existed in DRAM
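The two access methods differ mainly in how the bytes are reached. The Python sketch below contrasts an explicit seek-and-read against a memory-mapped region using the standard `mmap` module. It runs on an ordinary temporary file, so it only illustrates the programming model; real byte-addressable access would need NVRAM behind a suitable file system such as PMFS. `TUPLE_SIZE` and `read_tuple_block` are names chosen for the example.

```python
import mmap
import os
import tempfile

TUPLE_SIZE = 1024  # matches the 1KB tuples used in the talk's measurements

# Assumption: a regular temporary file stands in for the cold-data store.
path = os.path.join(tempfile.mkdtemp(), "cold_data.bin")
with open(path, "wb") as f:
    f.truncate(4 * TUPLE_SIZE)  # room for four zero-filled tuples


def read_tuple_block(path, slot):
    """Block-style access: seek to the slot and copy the whole tuple into memory."""
    with open(path, "rb") as f:
        f.seek(slot * TUPLE_SIZE)
        return f.read(TUPLE_SIZE)


# Byte-addressable style: map the file and update it in place like a byte array.
with open(path, "r+b") as f:
    region = mmap.mmap(f.fileno(), 0)  # length 0 maps the whole file
    start = 2 * TUPLE_SIZE
    region[start:start + 5] = b"hello"  # in-place store, no explicit write() call
    region.flush()                      # push the change to the backing file
    first_bytes = bytes(region[start:start + 5])
    region.close()

tuple_two = read_tuple_block(path, 2)
```

The mapped path touches only the bytes it needs, whereas the block path copies a full tuple per access, which is the trade-off the byte-addressable option is meant to exploit.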

EVALUATION

• Compare design decisions in H-Store with anti-caching.

• Storage Devices:
  – Hard-Disk Drive (HDD)
  – Shingled Magnetic Recording Drive (SMR)
  – Solid-State Drive (SSD)
  – 3D XPoint (3DX)
  – Non-volatile Memory (NVRAM)

COLD TUPLE RETRIEVAL

• YCSB Workload – 90% Reads / 10% Writes
• 10GB Database using 1.25GB Memory

[Figure: throughput (txn/sec, y-axis from 0 to 200000) of Abort-and-Restart vs. Synchronous Retrieval on HDD, SMR, SSD, 3DX, and NVM. Annotations: large blocks, small blocks.]

MERGING THRESHOLD

• YCSB Workload – 90% Reads / 10% Writes
• 10GB Database using 1.25GB Memory

[Figure: throughput (txn/sec, y-axis from 0 to 200000) of Merge (Update-Only), Merge (Top-5%), Merge (Top-20%), and Merge (All) on HDD (AR) and HDD (SR).]

Improvement:
• SMR: 30%
• SSD: 17%
• 3DX: 21%
• NVRAM: 10%

CONFIGURATION COMPARISON

• Generic Configuration (2013 Anti-caching)
  – Abort-and-Restart Retrieval
  – Merge (All) Threshold
  – 1024 KB Block Size

• Optimized Configuration
  – Synchronous Retrieval
  – Top-5% Merge Threshold
  – Block Sizes: HDD/SMR 1024 KB, SSD/3DX 16 KB
  – Byte-addressable access for NVRAM
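The two configurations can be written down as data, which makes the per-device differences easier to see. The Python sketch below only restates the bullets above; `AntiCacheConfig` and its field names are hypothetical. The slide lists no block size for NVRAM, so that field is left unset, and because the slide does not break the retrieval and merge choices down per device, applying them to every device here is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AntiCacheConfig:
    retrieval: str                 # "abort_and_restart" or "synchronous"
    merge: str                     # "all" or "top_5_percent"
    block_size_kb: Optional[int]   # None where the slide gives no block size
    byte_addressable: bool = False


# Generic configuration (2013 anti-caching), the same on every device.
GENERIC = AntiCacheConfig("abort_and_restart", "all", 1024)

# Optimized configuration, varying block size and access method by device.
OPTIMIZED = {
    "HDD": AntiCacheConfig("synchronous", "top_5_percent", 1024),
    "SMR": AntiCacheConfig("synchronous", "top_5_percent", 1024),
    "SSD": AntiCacheConfig("synchronous", "top_5_percent", 16),
    "3DX": AntiCacheConfig("synchronous", "top_5_percent", 16),
    "NVRAM": AntiCacheConfig("synchronous", "top_5_percent", None,
                             byte_addressable=True),
}
```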

GENERIC VS OPTIMIZED

[Figure: throughput (txn/sec) of the Generic and Optimized configurations on HDD, SMR, SSD, 3D XPoint, and NVRAM. Panels: VOTER (y-axis from 0 to 200000) and TATP (y-axis from 0 to 400000). A DRAM reference is also labeled.]

CONCLUSION

• Low-latency storage devices: Smaller block sizes and synchronous retrieval

• Constraints on merge frequency improve performance

• The performance of NVRAM is as good as pure DRAM if treated correctly

lin.ma@cs.cmu.edu

1-844-88-CMUDB

END

REAL-WORLD IMPLEMENTATIONS

• H-Store – Anti-Caching
• Microsoft Hekaton – Project Siberia
• EPFL’s VoltDB Prototype
• Apache Geode – Overflow Tables
• MemSQL – Columnar Tables
• SolidDB
• P*TIME

MERGING THRESHOLD

• YCSB Workload – 90% Reads / 10% Writes
• 10GB Database using 1.25GB Memory
