An Approach for Hybrid-Memory Scaling Columnar In-Memory Databases

*Bernhard Höppner, °Ahmadshah Waizy, *Hannes Rauhe (* SAP SE, ° Fujitsu Technology Solutions GmbH)

ADMS’14 in conjunction with 40th VLDB, Hangzhou, China, September 1, 2014


Page 1: An Approach for Hybrid-Memory Scaling Columnar In-Memory

An Approach for Hybrid-Memory Scaling Columnar In-Memory Databases

*Bernhard Höppner, °Ahmadshah Waizy, *Hannes Rauhe

* SAP SE ° Fujitsu Technology Solutions GmbH

ADMS’14 in conjunction with 40th VLDB, Hangzhou, China

September 1, 2014

Page 2: An Approach for Hybrid-Memory Scaling Columnar In-Memory

© 2014 SAP SE and Fujitsu Technology Solutions GmbH. All rights reserved.

Memory – Linking Desire and Reality

[Figure: memory technologies compared along four axes: latency (ns to μs), bandwidth (GB/s to MB/s), capacity (GB to TB), and cost ($/GB; 10 $/GB is marked).]

Page 6: An Approach for Hybrid-Memory Scaling Columnar In-Memory

Memory – Linking Desire and Reality

[Figure: DRAM, HDD, SSD, and Storage Class Memory (SCM) placed along four axes: latency (ns to μs), bandwidth (GB/s to MB/s), capacity (GB to TB), and cost ($/GB; 10 $/GB is marked).]

Storage Class Memory (SCM), available 2018+:

• Cheaper than DRAM
• Denser than DRAM
• Bandwidth close to DRAM
• Latency close to DRAM
• In addition: non-volatile, byte-addressable, more energy efficient than DRAM

Page 7: An Approach for Hybrid-Memory Scaling Columnar In-Memory

Memory – Linking Desire and Reality

[Figure: DRAM, HDD, SSD, SCM, and Hybrid-Memory placed along four axes: latency (ns to μs), bandwidth (GB/s to MB/s), capacity (GB to TB), and cost ($/GB; 10 $/GB is marked).]

Hybrid-Memory:

• Fixed amount of DRAM to buffer data from SSD
• PCIe SSD as backing storage
• Offer a separate memory allocator: data is accessed via load/store operations
• Use paging to buffer data of SSD on DRAM: increases the addressable amount of memory
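The idea of file-backed memory that is accessed with plain loads and stores can be illustrated with an ordinary memory mapping. This is only a minimal sketch: it uses the operating system's `mmap` facility rather than the separate allocator from the talk, and the file name and the helpers `open_backed_region`, `store_u32`, and `load_u32` are hypothetical names introduced for illustration.

```python
import mmap
import os
import struct
import tempfile


def open_backed_region(path, size):
    """Create a file of `size` bytes and map it into the address space."""
    with open(path, "w+b") as f:
        f.truncate(size)  # the backing file determines the region's capacity
        # The OS pages file contents into DRAM on access and writes
        # modified pages back to the file.
        return mmap.mmap(f.fileno(), size)


def store_u32(region, offset, value):
    """Write a 32-bit unsigned integer directly into the mapped region."""
    region[offset:offset + 4] = struct.pack("<I", value)


def load_u32(region, offset):
    """Read a 32-bit unsigned integer directly from the mapped region."""
    return struct.unpack("<I", region[offset:offset + 4])[0]


def demo():
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "hybrid_region.bin")  # hypothetical file name
        region = open_backed_region(path, 4 * mmap.PAGESIZE)
        try:
            offset = 2 * mmap.PAGESIZE  # lands in the third page of the region
            store_u32(region, offset, 42)
            return load_u32(region, offset)
        finally:
            region.close()  # unmap before the backing file is deleted
```

The caller never issues explicit read or write calls against the file; which pages currently reside in DRAM is decided by the paging layer.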

Page 8: An Approach for Hybrid-Memory Scaling Columnar In-Memory

IMDBMS – SAP HANA based Architecture

[Figure: column store architecture, split into memory and storage.
Memory: Main Store with dictionary (0 Alpha, 1 Beta, 2 Charlie, 3 Delta) and values (1, 0, 1, 2, 3, 2); Delta Store with dictionary (0 Beta, 1 Delta, 2 Alpha, 3 Charlie) and values (1, 3, 1, 2); a Merge connects Delta Store and Main Store.
Storage: Persistence with Data Volumes and Log Volumes.
Labels: Read Operations, Data Modifying Operations.]
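The dictionary and values pairs in the figure follow the usual dictionary encoding of a column. The sketch below shows the sorted variant that matches the Main Store numbers; the input column is a hypothetical example chosen so that the output reproduces the figure, and the function names are illustrative only.

```python
def dictionary_encode(column):
    """Encode a column as a sorted dictionary plus a vector of value IDs."""
    dictionary = sorted(set(column))  # each distinct value is stored once
    value_ids = {value: i for i, value in enumerate(dictionary)}
    index_vector = [value_ids[value] for value in column]
    return dictionary, index_vector


def dictionary_decode(dictionary, index_vector):
    """Reconstruct the original column from dictionary and value IDs."""
    return [dictionary[i] for i in index_vector]


# Hypothetical column contents, one entry per row.
column = ["Beta", "Alpha", "Beta", "Charlie", "Delta", "Charlie"]
dictionary, index_vector = dictionary_encode(column)
```

The Delta Store dictionary in the figure is not sorted, so a write-optimized variant would append new values to the dictionary instead of re-sorting it.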

Page 10: An Approach for Hybrid-Memory Scaling Columnar In-Memory

IMDBMS – SAP HANA based Architecture

[Figure: the architecture diagram from page 8, extended with a Main Memory box. Labels: Loaded Column(s), Data Request on Column, Limited amount of Memory, Unloaded Table.]

[Plot: table load time [s] (y-axis, 20 to 80) over in-memory table size [GB] (x-axis, 0 to 10).]

Page 11: An Approach for Hybrid-Memory Scaling Columnar In-Memory

IMDBMS – Scaling the Traditional Way

Scale-Up
• Increase the amount of available memory in a single server by adding DRAM
• Limitations:
  • Fujitsu RX-600 S6: 2 TB
  • Fujitsu RX-900 S2: 4 TB
  • Fujitsu 2800B: 12 TB

Scale-Out
• Add additional cooperating instances
• Instances are separate (shared nothing)
• Additional complexity and cost, e.g. on the infrastructure level

Page 13: An Approach for Hybrid-Memory Scaling Columnar In-Memory

IMDBMS – Scaling w/ Hybrid-Memory

[Figure: the architecture diagram from page 8, extended with a Main Memory box and a Hybrid-Memory box, each holding loaded column(s). Labels: Data Request on Column, Limited amount of Memory, Unloaded Table, Load, Unload, On Demand.]

[Plot: approximated hardware cost [1000 Euro] (y-axis, 100 to 600) over database size [TB] (x-axis, 0 to 12). Series: IMDBMS using DRAM only; IMDBMS using DRAM and Hybrid-Memory.]

Page 14: An Approach for Hybrid-Memory Scaling Columnar In-Memory

IMDBMS – Scaling w/ Hybrid-Memory

[Figure: the architecture diagram from page 8 with an additional Hybrid-Memory store next to Main Store and Delta Store. Hybrid-Memory is labeled with DRAM and SSD and holds values (3, 2, 1, 1, 0, 1).]

Page 15: An Approach for Hybrid-Memory Scaling Columnar In-Memory

Hybrid-Memory – Integrating

[Figure: the architecture diagram from page 8 with an additional Hybrid-Memory store, labeled with DRAM and SSD, holding values (3, 2, 1, 1, 0, 1).]

• Hybrid-Memory is introduced on column level if data does not fully fit into DRAM
• Minimize latency overhead by storing less frequently used columns on Hybrid-Memory first
• Higher access skew on column level increases the probability of cache hits
• Reduce large range selects on Hybrid-Memory as data is paged in 4 kB chunks
• Avoid scan operations as sequential operations are not beneficial on Hybrid-Memory
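The effect of 4 kB paging on range selects versus point lookups can be estimated with simple arithmetic. The sketch below assumes fixed-width, uncompressed values whose width divides the page size, which is a simplification made for illustration and not a description of the actual column layout; both function names are hypothetical.

```python
PAGE_SIZE = 4096  # bytes; the slide states that data is paged in 4 kB chunks


def pages_for_range(first_row, last_row, bytes_per_value, page_size=PAGE_SIZE):
    """Number of pages touched when reading rows first_row..last_row inclusive."""
    first_page = (first_row * bytes_per_value) // page_size
    last_byte = last_row * bytes_per_value + bytes_per_value - 1
    return last_byte // page_size - first_page + 1


def pages_for_lookups(rows, bytes_per_value, page_size=PAGE_SIZE):
    """Number of distinct pages touched by individual row lookups.

    Assumes bytes_per_value divides page_size, so no value straddles a page.
    """
    return len({(row * bytes_per_value) // page_size for row in rows})
```

Under these assumptions a range select over one million 4-byte values touches 977 pages, while three point lookups touch at most three, which is why large ranges and scans put the most pressure on the DRAM buffer.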

Page 16: An Approach for Hybrid-Memory Scaling Columnar In-Memory

Hybrid-Memory – Integrating

• Use existing information, do not generate additional statistics
• Parse SQL statements executed in the past
• Generate LFU list for columns

[Flowchart: Fetch SQL Plan Cache → Parse SQL Plan Cache, generate AST → while nodes in AST: get next node from AST, analyze column access (column used for projection: add to RndAcc; column used for selection, not indexed: add to SeqAcc; column used for selection, indexed: add to RndAcc) → Aggregate SeqAcc, RndAcc → Calc. usage frequency for SeqAcc, RndAcc → Sort SeqAcc, RndAcc ascending → Append RndAcc to AccFreq → Append SeqAcc to AccFreq.]
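The flowchart can be sketched in a few lines once the plan cache has been reduced to per-column access records. Everything about the input format here is an assumption: real plan-cache parsing is omitted, each record is a hypothetical `(column, usage, indexed)` tuple, the usage frequency is taken as the share of all recorded accesses, ties are broken by column name, and a column that appears in both lists is kept only at its first position. The column names are merely illustrative.

```python
from collections import Counter


def build_acc_freq(accesses):
    """Build the priority list AccFreq from (column, usage, indexed) records."""
    seq_acc, rnd_acc = Counter(), Counter()
    for column, usage, indexed in accesses:
        if usage == "selection" and not indexed:
            seq_acc[column] += 1  # non-indexed selection: sequential access
        else:
            rnd_acc[column] += 1  # projection or indexed selection: random access

    total = sum(seq_acc.values()) + sum(rnd_acc.values())

    def ascending(counter):
        # Usage frequency = share of all accesses (assumption); least used first.
        return sorted(counter, key=lambda c: (counter[c] / total, c))

    acc_freq = []
    for column in ascending(rnd_acc) + ascending(seq_acc):
        if column not in acc_freq:  # keep the first position only (assumption)
            acc_freq.append(column)
    return acc_freq


# Hypothetical access records extracted from past statements.
accesses = [
    ("c_id", "selection", True),
    ("c_id", "selection", True),
    ("c_id", "selection", True),
    ("c_data", "projection", False),
    ("c_balance", "projection", False),
    ("c_balance", "projection", False),
    ("c_since", "selection", False),
    ("c_last", "selection", False),
    ("c_last", "selection", False),
]
acc_freq = build_acc_freq(accesses)
```

Columns at the front of the resulting list are the first candidates for Hybrid-Memory: rarely used, randomly accessed columns come first, frequently scanned columns last.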

Page 17: An Approach for Hybrid-Memory Scaling Columnar In-Memory

Evaluation – Partitioning TPC-C

• 80% of all columns used in less than 2% of all queries
• Less frequently used columns cause 60% of the data volume

Page 18: An Approach for Hybrid-Memory Scaling Columnar In-Memory

Evaluation – TPC-C Hybrid-Memory 80% : DRAM 20%

[Plot: DBMS throughput [%] (y-axis, 60 to 100) over Hybrid-Memory buffer size [%] (x-axis ticks: 100, 80, 60, 40, 20, 10, 5).]

• 80% of all columns on Hybrid-Memory, 20% on DRAM
• DRAM buffer size for Hybrid-Memory is adjusted
• Upper touchpoint, no paging
• TPC-C performance in tpm% compared to unchanged system
• 10% DRAM buffer reduces tpm by 12%, index vector size reduced by 54%
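How a shrinking buffer interacts with access skew can be explored with a toy simulation. This is not a reproduction of the TPC-C measurement: the replacement policy (LRU), the Zipf-like page popularity, and all parameter values are assumptions made purely for illustration.

```python
import random
from collections import OrderedDict


def lru_hit_rate(trace, capacity):
    """Fraction of page accesses served from an LRU buffer of `capacity` pages."""
    if capacity < 1:
        raise ValueError("capacity must be at least one page")
    cache = OrderedDict()
    hits = 0
    for page in trace:
        if page in cache:
            hits += 1
            cache.move_to_end(page)  # mark as most recently used
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict the least recently used page
            cache[page] = True
    return hits / len(trace)


def skewed_trace(num_pages, length, skew, seed=0):
    """Page accesses with Zipf-like popularity; skew=0 gives uniform access."""
    rng = random.Random(seed)
    weights = [1.0 / (rank + 1) ** skew for rank in range(num_pages)]
    return rng.choices(range(num_pages), weights=weights, k=length)


trace = skewed_trace(num_pages=1000, length=20000, skew=1.0)
rates = [lru_hit_rate(trace, capacity) for capacity in (50, 100, 200, 1000)]
```

With a buffer as large as the data set only first-touch misses remain, which corresponds to the "no paging" end of the plot; comparing `skew=1.0` against `skew=0.0` at a fixed capacity shows how skew helps a small buffer.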

Page 19: An Approach for Hybrid-Memory Scaling Columnar In-Memory

Conclusion and Future Work

Conclusion:

• DRAM + SSD combined to Hybrid-Memory
• Hybrid-Memory made available to applications via a separate allocator
• Extended SAP HANA to use Hybrid-Memory as an additional data store
• Analyzed TPC-C to be partitioned for two data stores
• Ran TPC-C based evaluation reducing the index vector size by 54% while keeping the IMDBMS performance at 88%

Page 20: An Approach for Hybrid-Memory Scaling Columnar In-Memory

Conclusion and Future Work

Future Work:

• Evaluate our approach based on real data and workload
• Integrate column partitioning directly into HANA
• Introduce paging to other data stores (e.g. dictionary)
• Design a cost model for columns to be stored on Hybrid-Memory including row level information

Page 21: An Approach for Hybrid-Memory Scaling Columnar In-Memory

Thank You! Questions? Comments?

Page 22: An Approach for Hybrid-Memory Scaling Columnar In-Memory

Hybrid-Memory – Integrating

• Analyze SQL Plan Cache on column level
• Separate into random and sequential access
• Maintain statistics in two independent lists

[Flowchart as on page 16; highlighted steps: Get next node from AST → Analyze column access → Add to SeqAcc or RndAcc, repeated while nodes remain in the AST.]

Page 23: An Approach for Hybrid-Memory Scaling Columnar In-Memory

Hybrid-Memory – Integrating

[Flowchart as on page 16; highlighted steps: Aggregate SeqAcc, RndAcc → Calc. usage frequency for SeqAcc, RndAcc.]

Page 24: An Approach for Hybrid-Memory Scaling Columnar In-Memory

Hybrid-Memory – Integrating

• Add randomly accessed columns first as they are more suited for Hybrid-Memory
• Reduce access on Hybrid-Memory by keeping frequently used columns on DRAM
• Workload-adapted priority list for Hybrid-Memory

[Flowchart as on page 16; highlighted steps: Sort SeqAcc, RndAcc ascending → Append RndAcc to AccFreq → Append SeqAcc to AccFreq.]
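One way such a priority list could be applied is to move columns to Hybrid-Memory in list order until the remaining columns fit into DRAM. The slides do not describe the placement step in this detail, so the function below is a hypothetical sketch; the column names, sizes, and the DRAM capacity are invented for illustration.

```python
def place_columns(acc_freq, column_sizes, dram_capacity):
    """Split columns into (on_hybrid, on_dram) following the priority list.

    acc_freq      -- columns ordered from best to worst Hybrid-Memory candidate
    column_sizes  -- size of each column, in any consistent unit
    dram_capacity -- DRAM available for column data, in the same unit
    """
    in_dram = sum(column_sizes[column] for column in acc_freq)
    on_hybrid = []
    for column in acc_freq:
        if in_dram <= dram_capacity:
            break  # the remaining columns fit into DRAM
        on_hybrid.append(column)
        in_dram -= column_sizes[column]
    on_dram = [column for column in acc_freq if column not in on_hybrid]
    return on_hybrid, on_dram


# Hypothetical priority list and column sizes in MB.
priority = ["c_data", "c_balance", "c_id"]
sizes = {"c_data": 500, "c_balance": 8, "c_id": 4}
on_hybrid, on_dram = place_columns(priority, sizes, dram_capacity=12)
```

Because the list starts with rarely used, randomly accessed columns, the columns that stay in DRAM are the frequently used ones, which keeps the number of accesses to Hybrid-Memory low.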

Page 25: An Approach for Hybrid-Memory Scaling Columnar In-Memory


© 2014 SAP SE or an SAP affiliate company. All rights reserved.

No part of this publication may be reproduced or transmitted in any form or for any purpose without the express permission of SAP SE or an SAP affiliate company.

SAP and other SAP products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of SAP SE (or an SAP affiliate company) in Germany and other countries. Please see http://global12.sap.com/corporate-en/legal/copyright/index.epx for additional trademark information and notices.

Some software products marketed by SAP SE and its distributors contain proprietary software components of other software vendors.

National product specifications may vary.

These materials are provided by SAP SE or an SAP affiliate company for informational purposes only, without representation or warranty of any kind, and SAP SE or its affiliated companies shall not be liable for errors or omissions with respect to the materials. The only warranties for SAP SE or SAP affiliate company products and services are those that are set forth in the express warranty statements accompanying such products and services, if any. Nothing herein should be construed as constituting an additional warranty.

In particular, SAP SE or its affiliated companies have no obligation to pursue any course of business outlined in this document or any related presentation, or to develop or release any functionality mentioned therein. This document, or any related presentation, and SAP SE’s or its affiliated companies’ strategy and possible future developments, products, and/or platform directions and functionality are all subject to change and may be changed by SAP SE or its affiliated companies at any time for any reason without notice. The information in this document is not a commitment, promise, or legal obligation to deliver any material, code, or functionality. All forward-looking statements are subject to various risks and uncertainties that could cause actual results to differ materially from expectations. Readers are cautioned not to place undue reliance on these forward-looking statements, which speak only as of their dates, and they should not be relied upon in making purchasing decisions.