decompressing snappy compressed files at the speed of …...decompressing snappy compressed files at...

31
1 Decompressing Snappy Compressed Files at the Speed of OpenCAPI Speaker: Jian Fang TU Delft

Upload: others

Post on 29-May-2020

8 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Decompressing Snappy Compressed Files at the Speed of …...Decompressing Snappy Compressed Files at the Speed of OpenCAPI Speaker: Jian Fang . TU Delft. 2 NVMe SNAP OpenCAPI Decom-pression

1

Decompressing Snappy Compressed Files at the Speed of OpenCAPI

Speaker: Jian Fang TU Delft

Page 2: Decompressing Snappy Compressed Files at the Speed of …...Decompressing Snappy Compressed Files at the Speed of OpenCAPI Speaker: Jian Fang . TU Delft. 2 NVMe SNAP OpenCAPI Decom-pression

2

NVMe

SNA

P

Dec

om-

pres

sion

Flet

cher

Memory

ARROW

Tables

OpenCAPI OpenCAPI

SNA

P

Flet

cher

Act

ions

HBM HBM

FPGA FPGA

Sort

Join

Skyline

… CPU POWER9

Spark DB DNA

Seq

… Current Project

OpenCAPI

GPU N

V Li

nk

Scalable Heterogeneous Accelerated DatabasE

SHADE

Page 3: Decompressing Snappy Compressed Files at the Speed of …...Decompressing Snappy Compressed Files at the Speed of OpenCAPI Speaker: Jian Fang . TU Delft. 2 NVMe SNAP OpenCAPI Decom-pression

3

Big Data Processing Database Analytics

Data Movement

Page 4: Decompressing Snappy Compressed Files at the Speed of …...Decompressing Snappy Compressed Files at the Speed of OpenCAPI Speaker: Jian Fang . TU Delft. 2 NVMe SNAP OpenCAPI Decom-pression

4

• Moving Data from/to Storage • Slow, Bottleneck • 1st step in data processing and database analytics

• Decompress-filter Solutions

storage CPU Raw Data

storage FPGA Compressed Data CPU Decompressed

Data

Filtered Data

Netezza

GZIP

LZ4 RLE LZW

???

Decompress-filter

Page 5: Decompressing Snappy Compressed Files at the Speed of …...Decompressing Snappy Compressed Files at the Speed of OpenCAPI Speaker: Jian Fang . TU Delft. 2 NVMe SNAP OpenCAPI Decom-pression

5

Snappy compression Algorithm

• LZ77-based, byte-level, lossless

• In Hadoop ecosystem

• Support Parquet, ORC, etc.

• Low compression ratio

• Fast compression and decompression

Core i7

500MB/s Per Core

OpenCAPI Speed

Goal

Page 6: Decompressing Snappy Compressed Files at the Speed of …...Decompressing Snappy Compressed Files at the Speed of OpenCAPI Speaker: Jian Fang . TU Delft. 2 NVMe SNAP OpenCAPI Decom-pression

Mechanism of Snappy compression

• Two kinds of token: Literal token and copy token • Find match from previous data • Example:

7

ABCDEFGHABCD

offset 8, length4

Use (8,4) to replace “ABCD”

Page 7: Decompressing Snappy Compressed Files at the Speed of …...Decompressing Snappy Compressed Files at the Speed of OpenCAPI Speaker: Jian Fang . TU Delft. 2 NVMe SNAP OpenCAPI Decom-pression

Snappy Compression

8

… ABCDEFGHABCD

Literal

copy : offset 8 length 4

No Match!

… ABCDEFGHABCD

Match!

… (Literal, “ABCDEFGH”) (Copy, 8, 4) Output

Input

Generate a literal token

Find next match

Generate a copy token

Page 8: Decompressing Snappy Compressed Files at the Speed of …...Decompressing Snappy Compressed Files at the Speed of OpenCAPI Speaker: Jian Fang . TU Delft. 2 NVMe SNAP OpenCAPI Decom-pression

Snappy Decompression

9

… ABCDEFGHABCD

Copy

(Literal, “ABCDEFGH”)

… ABCDEFGH

(Copy, 8, 4)

Page 9: Decompressing Snappy Compressed Files at the Speed of …...Decompressing Snappy Compressed Files at the Speed of OpenCAPI Speaker: Jian Fang . TU Delft. 2 NVMe SNAP OpenCAPI Decom-pression

10

Snappy compression Algorithm

• An independent 64KB history for every 64KB data • Token: copy or literal, copy lenth<=64B • Token size: 2 Byte minimum, average ≈< 4

Byte(assumption for design) • Highly data dependent

Benchmark Compression Ratio Average Token Size DNA1 3.31 5.79 DNA2 2.36 7.38 DNA3 2.57 7.03 TPC-H 2.53 2.72

… … …

DNA data source: ftp://ftp.ncbi.nlm.nih.gov/ncbi-asn1

Page 10: Decompressing Snappy Compressed Files at the Speed of …...Decompressing Snappy Compressed Files at the Speed of OpenCAPI Speaker: Jian Fang . TU Delft. 2 NVMe SNAP OpenCAPI Decom-pression

11

Design in FPGA

• An independent 64KB history for every 64KB data • Token: copy or literal, copy lenth<=64B • Token size: 2 Byte minimum, average ≈< 4

Byte(assumption for design) • Highly data dependent

DNA data source: ftp://ftp.ncbi.nlm.nih.gov/ncbi-asn1

4KB BRAM

16X

… 64KB

history

Page 11: Decompressing Snappy Compressed Files at the Speed of …...Decompressing Snappy Compressed Files at the Speed of OpenCAPI Speaker: Jian Fang . TU Delft. 2 NVMe SNAP OpenCAPI Decom-pression

13

Design Challenge

• Deal with different types of tokens

• Handle various size of tokens

• Parallelize token translation

• Keep up with the interface bandwidth

Page 12: Decompressing Snappy Compressed Files at the Speed of …...Decompressing Snappy Compressed Files at the Speed of OpenCAPI Speaker: Jian Fang . TU Delft. 2 NVMe SNAP OpenCAPI Decom-pression

14

Design Challenge

• Deal with different types of tokens

• Handle various size of tokens

• Parallelize token translation

• Keep up with the interface bandwidth

Page 13: Decompressing Snappy Compressed Files at the Speed of …...Decompressing Snappy Compressed Files at the Speed of OpenCAPI Speaker: Jian Fang . TU Delft. 2 NVMe SNAP OpenCAPI Decom-pression

15

Design Challenge

• Deal with different types of tokens

• Handle various size of tokens

• Parallelize token translation

• Keep up with the interface bandwidth

Page 14: Decompressing Snappy Compressed Files at the Speed of …...Decompressing Snappy Compressed Files at the Speed of OpenCAPI Speaker: Jian Fang . TU Delft. 2 NVMe SNAP OpenCAPI Decom-pression

16

Token Structure

• LZ77-based, byte-level • In Hadoop ecosystem • Support Parquet, ORC, etc. • Two types of tokens: literal tokens and copy tokens

Source: Y. Qiao. An fpga-based snappy decompressor-filter. Master’s thesis, Delft University of Technology, 2018.

Literal Token: (literal, length, data) Copy Token : (copy, offset, length)

Literal Token

Page 15: Decompressing Snappy Compressed Files at the Speed of …...Decompressing Snappy Compressed Files at the Speed of OpenCAPI Speaker: Jian Fang . TU Delft. 2 NVMe SNAP OpenCAPI Decom-pression

17

Token Structure

• LZ77-based, byte-level • In Hadoop ecosystem • Support Parquet, ORC, etc. • Two types of tokens: literal tokens and copy tokens

Source: Y. Qiao. An fpga-based snappy decompressor-filter. Master’s thesis, Delft University of Technology, 2018.

Copy Token

Tokens are in various length

Page 16: Decompressing Snappy Compressed Files at the Speed of …...Decompressing Snappy Compressed Files at the Speed of OpenCAPI Speaker: Jian Fang . TU Delft. 2 NVMe SNAP OpenCAPI Decom-pression

18

Prior Solutions: Decode all possiblities

Source: Y. Qiao. An fpga-based snappy decompressor-filter. Master’s thesis, Delft University of Technology, 2018.

Page 17: Decompressing Snappy Compressed Files at the Speed of …...Decompressing Snappy Compressed Files at the Speed of OpenCAPI Speaker: Jian Fang . TU Delft. 2 NVMe SNAP OpenCAPI Decom-pression

19

Design Challenge

• Deal with different types of tokens

• Handle various size of tokens

• Parallelize token translation

• Keep up with the interface bandwidth

Page 18: Decompressing Snappy Compressed Files at the Speed of …...Decompressing Snappy Compressed Files at the Speed of OpenCAPI Speaker: Jian Fang . TU Delft. 2 NVMe SNAP OpenCAPI Decom-pression

20

Token Dependency

Read Write

Read Write Read

Write Read Write

Token1 Token 2 Token3 Token 4

• Read – Read • No dependency

• Write – Write • Never write same bytes • No dependency

• Write – Read • Forwarding

Page 19: Decompressing Snappy Compressed Files at the Speed of …...Decompressing Snappy Compressed Files at the Speed of OpenCAPI Speaker: Jian Fang . TU Delft. 2 NVMe SNAP OpenCAPI Decom-pression

21

Token Dependency – Address Conflict

16X

• Write – Write • Same block, same line

BRAMs

Page 20: Decompressing Snappy Compressed Files at the Speed of …...Decompressing Snappy Compressed Files at the Speed of OpenCAPI Speaker: Jian Fang . TU Delft. 2 NVMe SNAP OpenCAPI Decom-pression

22

Token Dependency – Address Conflict

16X

• Write – Write • Same block, different lines

BRAMs

Page 21: Decompressing Snappy Compressed Files at the Speed of …...Decompressing Snappy Compressed Files at the Speed of OpenCAPI Speaker: Jian Fang . TU Delft. 2 NVMe SNAP OpenCAPI Decom-pression

23

Token Dependency – Address Conflict

16X

• Read – Read

BRAMs

Page 22: Decompressing Snappy Compressed Files at the Speed of …...Decompressing Snappy Compressed Files at the Speed of OpenCAPI Speaker: Jian Fang . TU Delft. 2 NVMe SNAP OpenCAPI Decom-pression

24

Design Challenge

• Deal with different types of tokens

• Handle various size of tokens

• Parallelize token translation

• Keep up with the interface bandwidth

token size * token/cycle * frequency * instances = OpenCAPI BW

Token size: 4B 2 tokens/cycle 250MHz --> 13 instances

Page 23: Decompressing Snappy Compressed Files at the Speed of …...Decompressing Snappy Compressed Files at the Speed of OpenCAPI Speaker: Jian Fang . TU Delft. 2 NVMe SNAP OpenCAPI Decom-pression

25

IDEA – from token-level to BRAM-level

16X

16X

Page 24: Decompressing Snappy Compressed Files at the Speed of …...Decompressing Snappy Compressed Files at the Speed of OpenCAPI Speaker: Jian Fang . TU Delft. 2 NVMe SNAP OpenCAPI Decom-pression

26

Architecture Overview

Page 25: Decompressing Snappy Compressed Files at the Speed of …...Decompressing Snappy Compressed Files at the Speed of OpenCAPI Speaker: Jian Fang . TU Delft. 2 NVMe SNAP OpenCAPI Decom-pression

27

Architecture Overview

• Two-level Parser

Page 26: Decompressing Snappy Compressed Files at the Speed of …...Decompressing Snappy Compressed Files at the Speed of OpenCAPI Speaker: Jian Fang . TU Delft. 2 NVMe SNAP OpenCAPI Decom-pression

28

Architecture Overview

• Two-level Parser

Page 27: Decompressing Snappy Compressed Files at the Speed of …...Decompressing Snappy Compressed Files at the Speed of OpenCAPI Speaker: Jian Fang . TU Delft. 2 NVMe SNAP OpenCAPI Decom-pression

29

Architecture Overview

• Two-level Parser

• Parallel BRAM Operations

Page 28: Decompressing Snappy Compressed Files at the Speed of …...Decompressing Snappy Compressed Files at the Speed of OpenCAPI Speaker: Jian Fang . TU Delft. 2 NVMe SNAP OpenCAPI Decom-pression

30

Architecture Overview

• Two-level Parser

• Parallel BRAM Operations

• Relax Execution Model

Page 29: Decompressing Snappy Compressed Files at the Speed of …...Decompressing Snappy Compressed Files at the Speed of OpenCAPI Speaker: Jian Fang . TU Delft. 2 NVMe SNAP OpenCAPI Decom-pression

31

Parallel BRAM Operations & Relax Execution Model

Page 30: Decompressing Snappy Compressed Files at the Speed of …...Decompressing Snappy Compressed Files at the Speed of OpenCAPI Speaker: Jian Fang . TU Delft. 2 NVMe SNAP OpenCAPI Decom-pression

32

Results

• Implementation Result (on XCKU15P)

• Benchmark • Alice’s Adventures in Wonderland : 85KB (149KB) for

compressed (uncompressed) file.

• Throughput

Flip-Flop LUTs BRAMs Frequency Power

32K(3.32%) 46K(8.95%) 29(2.95%) 250MHz 2.9W

FPGA(single decompressor)

CPU(Core i7 with one thread)

Input 3GB/s 300MB/s

Output 5GB/s 500MB/s

Page 31: Decompressing Snappy Compressed Files at the Speed of …...Decompressing Snappy Compressed Files at the Speed of OpenCAPI Speaker: Jian Fang . TU Delft. 2 NVMe SNAP OpenCAPI Decom-pression

33

Contact Me: [email protected] (Jian Fang)

Open Source

https://github.com/ChenJianyunp/ FPGA-Snappy-Decompressor.git

Publish Paper: J. Fang, J. Chen, P. Hofstee, Z. Al-Ars, J. Hidders, A High-Bandwidth Snappy Decompressor in Reconfigurable Logic (CODES+ISSS 2018)