Future Server Platforms: Persistent Memory for Data-intensive Applications Sudarsun Kannan, Ada Gavrilovska, Karsten Schwan CERCS Research Center Georgia Institute of Technology

Post on 30-Sep-2020


Page 1: Sudarsun Kannan, Ada Gavrilovska, Karsten Schwancerin/NVM-HPDIC14-5.pdf · Non-Volatile Memory (NVM) to the Rescue NVM (e.g., PCM) is byte addressable Provides persistence – 100x

Future Server Platforms: Persistent Memory for Data-intensive Applications

Sudarsun Kannan, Ada Gavrilovska, Karsten Schwan
CERCS Research Center, Georgia Institute of Technology


Background: ‘Big Data’ Research

Scientific/Technical Computing: Scalable, Reliable Data Access

• High-performance I/O: in-situ and in-transit processing for HPC I/O
  • DOE SDAV, ExACT, SDM awards (ORNL, LBL); DOE Sandia joint work on resilience (IPDPS12, SC13, …).
• Heterogeneous multicore platforms (+ accelerator-based systems)
  • DOE ExaOS award (Sandia, ORNL, LBL); Intel NVM award (on clients); additional collaboration with Microsoft (on servers); HP Labs collaboration (HPCA13, TRIOS13, …).

[Figure: in-situ and in-transit data processing, drawn as pipelines of map, fetch, shuffle, and reduce stages bracketed by initialize and finalize]


Background (cont.)

Enterprise and Cloud Computing: Fast Data

• QoS Clouds (IPDPS13); I/O virtualization and IB bypass (Cluster13).
• ‘Monalytics’: real-time data monitoring and analytics; online troubleshooting; scalable Flume-based data streaming benchmark suite; annotated bibliography on troubleshooting (Middleware13, ACM SIGOPS14, ICAC14, …).
• ELF: ELastic Fast data processing (ATC14); Nectere benchmark.
• Data-intensive applications on GPUs:
  • SQL operators on GPUs (Yalamanchili); PGAS for extended memory (Cluster13, GPGPU13, GPGPU14, CGO14, …).
• Note: Big Data management track in ICAC conference (with HP).

[Figure: multi-tier application: Internet, proxy server, front end, middle-tier application logic, database]


Future Server Platforms - Spectrum of Heterogeneity

[Figure: a heterogeneity-aware hypervisor (Xen) with management domain (Dom0) hosting three virtual platforms: General (VM1) with VCPU1 and VCPU2, Hybrid (VM2) with VCPU3 and aVCPU4, and Hybrid (VM3) with VCPU5, aVCPU6, and sVCPU7. VCPUs are grouped into the VMs' general personality (VCPU1, VCPU2, VCPU3, VCPU5), compute personality (aVCPU4, aVCPU6), and crypto personality (sVCPU7), spanning a homogeneous and a heterogeneous view. Privileged software (physical resource manager) runs federated schedulers: a general + asymmetric personality scheduler and heterogeneous personality scheduler(s). Underneath is a heterogeneous many-core platform: Socket1 and Socket2 with PCPUs, sPCPUs, and caches; accelerators aPCPU1 (compute) and aPCPU2 (network); memory/interconnect. Layers are linked by cooperative platform interactions and observed parameters.]


Hetero Processors => Hetero Memory

Growth in data-intensive applications, coupled with increased node core counts and thread parallelism, demands increased on-node memory capacities:

Toward exascale systems:
  Science simulations using experimental data or co-running with online data analytics
Next-generation server and cloud platforms:
  Heterogeneous server nodes for perceptual and cognitive applications (zillians, …)
End client devices:
  Apps with rich features and data, enabled by machine learning, graph processing, MR, …


Motivation

Power (or battery) constraints, along with cost, prevent the use of DRAM to address all of these needs.

SSD/NAND flash remains comparatively slow, so …


Non-Volatile Memory (NVM) to the Rescue

NVM (e.g., PCM) is byte addressable
Provides persistence: 100x faster than SSD
Higher density compared to DRAM (~128GB)
NVM as low-power memory in future exascale platforms

Technical approach:
  Use NVM 'as memory' vs. 'as storage' (under the I/O stack), to enable rapid 'compute' on data
  Use processor caches to reduce write latency impact and improve endurance (4x-10x slower writes, and limited endurance of ~10^8 writes)
  'Enable' applications to use NVM + 'NVM-aware' systems


NVM: Why use it as 'Memory'?

Example: end client devices and applications

NVM (and/or the slower SSD devices) used via I/O APIs:
  High software overheads for block-based I/O interfaces
  Low per-call data sizes, hence more calls
Just using 'mmap' is not sufficient:
  Every mmap/munmap call implies a user/kernel transition
  Requires multiple supporting POSIX calls such as open and close


NVM as Memory: Prior Work: DRAM as Cache

[Figure: CPU with processor cache, DRAM page cache, and NVM]

DRAM acts like a page cache
Good for addressing 'capacity only' needs
May work well for server machines with TBs of DRAM
Power/performance issues for exascale or end client codes


Alternative: Fast Non-Volatile Heap

[Figure: CPU with processor cache and NVM]

To make persistence guarantees:
  Frequent cache flushing, memory fencing, writes to PCM
High persistence management overheads
  Includes user-level and kernel-level overheads


Our Approach: pMem: Dual-Use NVM

Prior research: use NVM either for persistence or as an additional 'capacity heap'
pMem: use NVM for persistence and for additional capacity
  NVM for persistence: “NVMPersist”
  NVM for capacity as a heap: “NVMCap”
NVMCap and NVMPersist threads share the same last-level cache


NVM Dual Use: High Level View

[Figure: application threads 1 and 2 on a processor with a last-level cache, backed by DRAM, the NVMCap heap, and the NVMPersist heap]


NVM-pMem: Dual Use Interface

NVM is partitioned into capacity and persistence zones

[Figure: APP above the user-level NVM library and the kernel layer; below them, DRAM and an NVM node with a capacity zone, a persist zone, and a kernel zone]


pMem: Dual Use NVM

[Figure: APP issuing CapMalloc(size) to the user-level NVM library, above the kernel layer, DRAM, and an NVM node with a capacity zone and a persistence zone]

Application decides when to use NVM for capacity
NVM used as heap without persistence
User-level and kernel managers route application calls
Think of persistent metadata as lightweight file system metadata.


pMem: Dual-Use NVM

[Figure: APP2 issuing PersistMalloc(size) to the user-level NVM library, above the kernel layer, DRAM, and an NVM node with a capacity zone and a persistence zone]


pMem: Dual-Use NVM

Key ideas:

Application-level control:
  Suitable library-based interfaces for persistent (p) vs. non-persistent (np) data
  Expensive I/O calls replaced with 'memory' accesses
  Goal: reduced software use (including the OS)
System-level:
  Deploy NVM as an OS 'memory node'
  NVM 'node' partitioned into volatile + persistent heaps
  NUMA-like kernel allocation policies
Advantages:
  Dual-benefit NVM: capacity + fast persistence


Enabling Persistence Support

hash *table = PersistAlloc(entries, "tableroot");
for each new entry:
    entry_s *entry = PersistAlloc(size, NULL);
    table[count] = entry;
    count++;
temp_buff = CapAlloc(size);

PersistAlloc: requires persistence metadata in library & OS
CapAlloc: no persistent metadata required

Plus the following additional requirements:
  Flush application data from the cache to avoid loss on power failure
  Flush OS data structures and library metadata
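
The flush requirement above can be pictured as a helper that walks a persistent object's address range one cache line at a time. This is only a minimal sketch, not pMem's code: the 64-byte line size, the `flush_range` name, and the callback that stands in for a hardware flush instruction (for example x86 CLFLUSH followed by a store fence) are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64u  /* assumed cache-line size in bytes */

/* Hypothetical hook: on real hardware this would issue a cache-line
 * flush for `line`; a callback keeps the sketch portable. */
typedef void (*flush_fn)(uintptr_t line, void *ctx);

/* Flush every cache line overlapping [addr, addr + len).
 * Returns the number of lines flushed. */
size_t flush_range(uintptr_t addr, size_t len, flush_fn flush, void *ctx)
{
    if (len == 0)
        return 0;
    assert(flush != NULL);

    /* Round the first and last byte down to their line boundaries. */
    uintptr_t first = addr & ~(uintptr_t)(CACHE_LINE - 1);
    uintptr_t last  = (addr + len - 1) & ~(uintptr_t)(CACHE_LINE - 1);

    size_t n = 0;
    for (uintptr_t line = first; ; line += CACHE_LINE) {
        flush(line, ctx);
        n++;
        if (line == last)
            break;
    }
    /* A store fence would follow here so that the flushes are ordered
     * before any later "commit" write. */
    return n;
}

/* Example callback that only counts how many lines were flushed. */
void count_flush(uintptr_t line, void *ctx)
{
    (void)line;
    (*(size_t *)ctx)++;
}
```

An 8-byte object that starts at offset 60 straddles two lines, so `flush_range(60, 8, ...)` issues two flushes; this is why small persistent updates can cost more flushes than their size suggests.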


Dual-Use Solution Challenges: 'Persistence Impact' on NVMCap

'Persistence-unaware' OS page allocations cause cache conflicts between NVMCap and NVMPersist
Persistent library allocator metadata maintenance increases flushes
  Increases NVMCap cache misses for shared data
Transactional (durability) logging of persistent application state increases flushes and NVM writes


Persistence Increases #Cache Misses

Atom platform with 1MB LLC

MSR counters to record LLC misses


pMem Implementation: Emulated NVM Node for Persistence

On boot, configure an OS memory node to emulate NVM
All NVM pages locked, swapping disabled
Provides persistence across application sessions
For persistence across boots, write it to SSD
Paging uses an allocate-on-write policy
Cache line flushes for user-level data and for kernel data structures


pMem Implementation - Kernel

[Figure: list of processes (Process 1, Process 2, Process 3); a process holds compartments (compartment1, compartment2), and a compartment's pages are kept in an RB tree]

Uses process id, compartment id, and fault address to identify the page
One bit per page for the NVM page flag and one bit for the flush flag

Compartments:
  Large regions of NVM allocated by the user-level NVM manager using nvmmap
  They are virtual memory area (VMA) structures
  Apps can explicitly request separate compartments (‘nvmmap’)
  Isolate persistent from non-persistent NVM regions
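
The page lookup described above (process id, compartment id, fault address) can be sketched as a composite key with a total order, which is what a balanced tree such as an RB tree needs. The flat key layout, the 4 KiB page size, and the names `nvm_page_key`, `make_key`, and `key_cmp` are hypothetical; the actual kernel structures are not shown in the slides.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12u  /* assumed 4 KiB pages */

/* Hypothetical lookup key for one NVM page. */
struct nvm_page_key {
    uint32_t pid;             /* owning process */
    uint32_t compartment_id;  /* compartment within the process */
    uint64_t page_index;      /* page offset of the fault inside the compartment */
};

/* Build a key from a faulting address inside a compartment. */
struct nvm_page_key make_key(uint32_t pid, uint32_t compartment_id,
                             uint64_t compartment_base, uint64_t fault_addr)
{
    assert(fault_addr >= compartment_base);
    struct nvm_page_key k;
    k.pid = pid;
    k.compartment_id = compartment_id;
    k.page_index = (fault_addr - compartment_base) >> PAGE_SHIFT;
    return k;
}

/* Total order over keys: by process, then compartment, then page.
 * Returns <0, 0, or >0, like strcmp. */
int key_cmp(const struct nvm_page_key *a, const struct nvm_page_key *b)
{
    if (a->pid != b->pid)
        return a->pid < b->pid ? -1 : 1;
    if (a->compartment_id != b->compartment_id)
        return a->compartment_id < b->compartment_id ? -1 : 1;
    if (a->page_index != b->page_index)
        return a->page_index < b->page_index ? -1 : 1;
    return 0;
}
```

Two faults that land in the same page of the same compartment produce equal keys, so a tree ordered by `key_cmp` resolves both to a single page entry.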


pMem Software Architecture - Allocator

[Figure: the application allocates in chunks from the pMem user-level memory manager, which passes chunks to the kernel layer]

Implemented by extending the jemalloc library to support user-level persistence
Provides application interfaces like “capmalloc”, “persistmalloc”, logging, and application-transparent non-persistent allocations
Manages application data in chunks


Methods for Cache Conflict Reduction

Co-running NVMPersist and NVMCap increases cache conflicts
Solution: cache partitioning
  Hardware techniques: little flexibility
  Software techniques: page coloring is complex (FreeBSD)
    Focused on allocating physically contiguous pages to the application
  Software-based partitioning: a ‘page bucket’ solution for allocating persist vs. cap pages
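
One way to picture the 'page bucket' idea is an allocator that reserves a run of physically contiguous frames per page type and serves later requests of that type from the same run. The sketch below illustrates that idea under assumptions and is not the kernel implementation: the frame numbering, the `pool_init` and `bucket_alloc` names, and the fixed `BUCKET_PAGES` size are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum page_type { PAGE_CAP = 0, PAGE_PERSIST = 1, PAGE_TYPES = 2 };

#define BUCKET_PAGES 4u  /* assumed number of frames reserved per refill */

struct bucket {
    unsigned long next;  /* next unused frame in the current run */
    unsigned long left;  /* frames remaining in the current run */
};

struct frame_pool {
    unsigned long next_free;            /* bump pointer over physical frames */
    struct bucket buckets[PAGE_TYPES];  /* one bucket per page type */
};

void pool_init(struct frame_pool *p, unsigned long first_frame)
{
    p->next_free = first_frame;
    for (size_t i = 0; i < PAGE_TYPES; i++) {
        p->buckets[i].next = 0;
        p->buckets[i].left = 0;
    }
}

/* Return a frame number for a page of type `t`. Frames of one type come
 * from contiguous runs even when requests of both types interleave. */
unsigned long bucket_alloc(struct frame_pool *p, enum page_type t)
{
    assert(t == PAGE_CAP || t == PAGE_PERSIST);
    struct bucket *b = &p->buckets[t];
    if (b->left == 0) {
        /* Refill: reserve a fresh contiguous run for this type. */
        b->next = p->next_free;
        b->left = BUCKET_PAGES;
        p->next_free += BUCKET_PAGES;
    }
    b->left--;
    return b->next++;
}
```

With alternating cap and persist requests, a first-touch allocator would hand out frames 0, 1, 2, 3 in request order; here the cap pages receive frames 0, 1, … and the persist pages receive frames 4, 5, …, so each type stays physically contiguous.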


Conflict Unaware JIT Allocation

Physical frames: NVMCap Pg 1, NVMPersist Pg 2, NVMPersist Pg 3, NVMCap Pg 4, NVMPersist Pg 5, NVMCap Pg 6, …

[Figure: 2-way cache (Tag, Way 1, Way 2) holding NVMCap Pg 1, NVMPersist Pg 2, NVMPersist Pg 3, and NVMCap Pg 4; the interleaved pages are marked as conflicts]

Current OS uses just-in-time (JIT) allocation: pages are allocated on first touch
Reduces physical contiguity of pages as the number of threads increases


Ideal Conflict-Free Allocator

Physical frames: NVMCap Pg 1, NVMCap Pg 2, NVMCap Pg 3, NVMPersist Pg 4, NVMPersist Pg 5, NVMPersist Pg 6, …

[Figure: 2-way cache (Tag, Way 1, Way 2) holding NVMCap Pg 1, NVMCap Pg 2, NVMCap Pg 3, and NVMPersist Pg 4; marked as no conflicts]

Physically contiguous page allocation reduces conflicts
We propose a simple design to achieve contiguity


CAA - Reduction in Contiguity Misses

[Figure: bar chart of the reduction in page contiguity misses relative to JIT allocation (y-axis 0 to 100) for CAA-4 and CAA-16, across end client apps and SPEC benchmarks (x264, conv…, ani…, mcf, libq…, sjeng, povray, soplex, lbm, omn…, astar…)]

Contiguity-Bucket-based Page Allocator
• Reduces page contiguity misses by 89%


NVMCap Cache Miss Reduction

[Figure: bar chart of the reduction in cache misses (%) relative to baseline (y-axis -6 to 10) for CAA-4 and CAA-16, across end client apps and SPEC benchmarks]

Beneficial for apps with large memory footprints
Adding more pages to a bucket can increase cache misses due to linked-list traversal


Agenda

Motivation

Dual use of NVM Heap

High level Design

Programming Interface

Sources of Persistence cost in dual use of NVM

Optimizations to Reduce Persistence Cost

Cache conflict aware allocation

Library allocator optimization

Hybrid Logging

Conclusions and Future Work


Library Allocator Overhead

Nonvolatile heaps require a user-level allocator
Modern allocators use complex data structures
Placing complex allocator structures in NVM requires multiple cache line flushes
Increases cache misses and NVM writes for NVMPersist and NVMCap


Porting DRAM Allocators for NVM

Persistence support for jemalloc: ~4 CLFLUSH per allocation


NVM Write Aware Allocator (NVWA)

Allocator complexity “independent of” NVM support
Idea: place complex allocator structures into DRAM
NVM contains only a log of allocations and deletions

[Figure: log entries C1, C2, C3, …, where C1, C2 indicate the log of allocated chunks]

Flush only log information to NVM (~2 cache lines)
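
The split described above, with the rich allocator state kept in DRAM and only an append-only log kept in NVM, can be sketched as follows. This is an illustration under assumptions rather than the authors' allocator: the record layout, the fixed capacity, and the names `nvm_log`, `log_append`, and `log_replay` are hypothetical, the log lives in ordinary memory standing in for an NVM region, and a counter stands in for real cache-line flushes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LOG_CAPACITY 128u  /* assumed fixed log size for the sketch */

enum log_op { LOG_ALLOC = 1, LOG_FREE = 2 };

struct log_rec {           /* one record per allocation or deletion */
    uint32_t op;
    uint32_t chunk_id;
};

struct nvm_log {           /* stands in for a region of NVM */
    struct log_rec recs[LOG_CAPACITY];
    uint32_t count;        /* number of committed records */
    uint32_t flushes;      /* simulated cache-line flushes */
};

/* Append one record. Models two flushes per append: one for the
 * record's line, one for the line holding `count`. */
int log_append(struct nvm_log *nvlog, enum log_op op, uint32_t chunk_id)
{
    if (nvlog->count == LOG_CAPACITY)
        return -1;
    nvlog->recs[nvlog->count].op = (uint32_t)op;
    nvlog->recs[nvlog->count].chunk_id = chunk_id;
    nvlog->flushes++;      /* record written and flushed first */
    nvlog->count++;
    nvlog->flushes++;      /* then the count that commits it */
    return 0;
}

/* After a restart, rebuild the DRAM-side "live chunk" map by replaying
 * the log. Returns the number of chunks still allocated. */
size_t log_replay(const struct nvm_log *nvlog, uint8_t *live, size_t nchunks)
{
    size_t n_live = 0;
    for (size_t i = 0; i < nchunks; i++)
        live[i] = 0;
    for (uint32_t i = 0; i < nvlog->count; i++) {
        const struct log_rec *r = &nvlog->recs[i];
        if (r->chunk_id >= nchunks)
            continue;      /* ignore records outside the map */
        if (r->op == LOG_ALLOC && !live[r->chunk_id]) {
            live[r->chunk_id] = 1;
            n_live++;
        } else if (r->op == LOG_FREE && live[r->chunk_id]) {
            live[r->chunk_id] = 0;
            n_live--;
        }
    }
    return n_live;
}
```

The design choice this illustrates is that the free lists, size classes, and other allocator structures never need flushing, because they can be reconstructed in DRAM from the log.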


NVWA: Cache Flush Reduction

[Figure: increase in cache misses (%) compared to baseline (y-axis 0 to 10) versus number of hash operations in millions (0.60, 0.80, 1.00, 1.50); legend entry "NVMA…"; annotations: 2%, 8x less CLflush]


Logging Overheads

Logging required for apps with strong durability requirements
Logs must be frequently flushed to NVM
Current word/object logs increase NVM writes
  Word-based logs: high ratio of log metadata to log data
  Object logs: log the entire object even for a one-word change


Hybrid Log Design

Hybrid log to address word/object granularity tradeoffs

Flexible use of word/object logs in same transaction

Applications specify the transaction type

Word- and object-based logs are maintained separately
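
The word/object granularity tradeoff can be made concrete with a simple byte-count model. This is a sketch under stated assumptions, not the hybrid log itself: the 8-byte word, the 16-byte per-record metadata, and the function names are hypothetical, and picking the cheaper log automatically is only an illustration, since in the design above applications specify the transaction type.

```c
#include <assert.h>
#include <stddef.h>

#define WORD_BYTES   8u   /* assumed machine word size */
#define RECORD_META 16u   /* assumed per-record metadata (address, length, txn id) */

enum log_kind { LOG_WORD, LOG_OBJECT };

/* Bytes written to the log if every modified word gets its own record. */
size_t word_log_bytes(size_t words_changed)
{
    return words_changed * (WORD_BYTES + RECORD_META);
}

/* Bytes written to the log if the whole object is logged as one record. */
size_t object_log_bytes(size_t object_bytes)
{
    return object_bytes + RECORD_META;
}

/* Pick the granularity that writes fewer log bytes for one update. */
enum log_kind cheaper_log(size_t words_changed, size_t object_bytes)
{
    /* The changed words must fit inside the object. */
    assert(words_changed * WORD_BYTES <= object_bytes);
    return word_log_bytes(words_changed) <= object_log_bytes(object_bytes)
               ? LOG_WORD
               : LOG_OBJECT;
}
```

Under these assumed sizes, changing one word of a 256-byte object costs 24 log bytes as a word record versus 272 as an object record, while changing 32 words costs 768 versus 272, which is the crossover a hybrid scheme is meant to exploit.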


Optimization - Miss Reduction

Reduces misses by 1-2% compared to CAA + NVWA
Gains grow with an increasing rate of hash operations

[Figure: bar chart of the reduction in misses (%) relative to JIT allocation (y-axis -5 to 15), across end client apps and SPEC benchmarks; legend: "CAA + NVWA +Hybrid"]


Estimated Impact on Runtime

[Figure: bar chart of the reduction in execution time (sec) relative to baseline (y-axis -20 to 40), across end client apps and SPEC benchmarks; legend: "Half and…", "Full Writes"]

Runtime gains can be substantial with optimizations that reduce misses by ~2%

Half-Half: half of the misses removed are NVM writes
Full Writes: all of the misses removed are NVM writes
One-third: 1/3 of the misses removed are NVM writes
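
The scenarios on this slide differ only in the fraction of removed misses assumed to be NVM writes, so a rough estimate can be written as a weighted latency sum. This is a back-of-the-envelope sketch and not the authors' estimation method: the `saved_ns` name, the integer fraction encoding, and any latency values passed in are assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Estimated time saved (ns) when `misses_removed` cache misses are avoided
 * and a fraction write_num/write_den of them would have been NVM writes;
 * the rest are treated as NVM reads. Latencies are caller-supplied. */
uint64_t saved_ns(uint64_t misses_removed,
                  uint64_t write_num, uint64_t write_den,
                  uint64_t read_ns, uint64_t write_ns)
{
    assert(write_den != 0 && write_num <= write_den);
    /* Latency of one removed miss, scaled by write_den to stay in integers. */
    uint64_t weighted = write_num * write_ns
                      + (write_den - write_num) * read_ns;
    return misses_removed * weighted / write_den;
}
```

For example, with a hypothetical 100 ns read and 400 ns write (a 4x write penalty, at the low end of the 4x-10x range quoted earlier in the deck), removing 1000 misses saves 400,000 ns in the full-writes case, 250,000 ns in the half-half case, and 200,000 ns in the one-third case.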


Summary & Future Work

Efficient use of NVM requires cross-stack changes to applications and systems
Analysis of dual-use NVM shows a potentially high impact of NVMPersist on NVMCap
Impact reduced via page contiguity, an NVM-aware user library allocator, and hybrid logging
Improvements result in ~12%-13% fewer cache misses, with consequent substantial gains in application performance
Future work:
  DRAM vs. NVM data structures (e.g., OS allocator)
  Analysis of power implications

=> ‘Think Memory’, not cores!


Shared Platforms: Performance Effects

• Current hypervisors are limited in their ability to meet performance needs and provide isolation for multiple hosted applications
• Application performance depends on resources beyond CPU and memory, including shared resources such as memory bandwidth and I/O
• Shared resources are not as easily partitioned in hardware as CPU and memory (limited support, e.g., NUMA)
• Application resource requirements are elastic along multiple resource dimensions
• State-of-the-art hypervisor resource managers and arbitration offer inadequate solutions for managing elasticity with isolation


Shared Platforms: Performance Effects

• Arbitrary interference for resource shares may have detrimental performance implications
• Some applications are more sensitive to interference than others
• Interference: an application’s resource shares imposing on another’s sensitive resource shares


Challenges for Resource Managers of Shared Platforms

How to improve isolation of multiple performance properties, leveraging limited support in hardware?
How to further efficiently arbitrate and manage elastic application resource demands when multiple varying performance requirements need to be met?
Costs of resource reallocation to maintain elasticity may be non-trivial.


Thanks!

‘Think Memory’: encouraging you to rethink ‘storage’, ‘I/O’, and ‘memory’ (usage and management) for future multicore platforms


Applications Analyzed…

OpenCV-based FaceRecognition *
  Generates training database from images
  Database is in XML format
  For recognition, eigenvector analysis of source with database (150-image database with train/recognize)
Snappy Compression *
  Fast compression library from Google
  Emphasis on speed rather than compression ratio (2GB)
JPEG Conversion Library
  Standard Linux/Windows JPEG library
  We use the JPEG to BMP conversion utility
  Most time spent on image decoding (similar to x264)
  Read and write intensive (5000 images)
Crime/DBACL: Digramic Bayesian Classification (Machine Learning)
  Classifies user emails/documents (500 MB of documents)


pMem - Memory (DRAM) Usage

[Figure: memory usage (MB) for Blck-SSD, pMem-IO, and pMem-Full]

pMem-IO: NVM used only for persistence
pMem-Full: NVM used for persistence and as additional memory
Blck-SSD: block-based SSD usage


pMem: Page Access Latencies

[Figure: page access latency for 2000 pages (y-axis 0 to 3,500,000) for NVM (pMem), Mnemosyne, and DRAM]


pMem for Persistence - Performance Gains

[Figure: user-kernel switch reduction (%) relative to Blck-IO (y-axis -40 to 120) for pMem and M-RD, across FaceRec, JPEG, Snappy, gthumb, and Crime]

[Figure: execution time (sec, y-axis 0 to 70) split into system time and user time for M-SSD, M-RD, pMem, and Blck-RD, across FaceRec, JPEG, Snappy, Crime, and GThumb]

• RD: RamDisk; M: mmap; Blck: block-based access
• Worst case: 4%-6% overhead compared to DRAM when using NVM for execution and storage
• Avoids high context switch costs compared to 'mmap'


Optimizing Checkpoints Using NVM as Virtual Memory

Sudarsun Kannan, Ada Gavrilovska, Karsten Schwan (CERCS - Georgia Tech)
Dejan Milojicic (HP Labs, Palo Alto)