notary: hardware techniques to enhance signatures

Notary: Hardware Techniques to Enhance Signatures Luke Yen Collaborator: Prof. Stark C. Draper Advisor: Prof. Mark D. Hill University of Wisconsin, Madison MICRO-41 - November 11, 2008 www.cs.wisc.edu/multifacet/papers/micro08_notary.pdf

Page 1: Notary: Hardware Techniques to Enhance Signatures

Notary: Hardware Techniques to Enhance Signatures

Luke Yen

Collaborator: Prof. Stark C. Draper

Advisor: Prof. Mark D. Hill

University of Wisconsin, Madison

MICRO-41 - November 11, 2008

www.cs.wisc.edu/multifacet/papers/micro08_notary.pdf

Page 2: Notary: Hardware Techniques to Enhance Signatures

Executive Summary

Tackle 2 problems with hardware signatures:

• Problem 1: Best signature hashing (i.e., H3) has high area & power overheads

• Solution 1: Use entropy analysis to guide lower-cost hashing (Page-Block-XOR, PBX) that performs similarly to H3

– Ex: 160 gates for H3 vs 20 gates for PBX

• Problem 2: Spurious signature conflicts caused by signature bits set by private memory addrs

• Solution 2: Avoid inserting private stack addrs, propose privatization interface for higher performance

04/22/23 University of Wisconsin-Madison2

Page 3: Notary: Hardware Techniques to Enhance Signatures

Outline

• Signature background

• Entropy

• Entropy results & PBX

• Privatization

• Methodology & workloads

• Results

• Conclusions & Future Work

Page 4: Notary: Hardware Techniques to Enhance Signatures

Signature background

• Signatures (hardware Bloom filters) used to summarize and detect conflicts with a transaction’s read- and write-sets

– Inspired by Bulk system [Ceze, ISCA’06]

– Implemented in LogTM-SE [Yen, HPCA’07]

– Can have false positives, but never false negatives

– Also proposed for non-TM purposes (e.g., SC violation detection, atomicity violation detection, race recording)

• Ex: Use k Bloom filters of size m/k, with independent hash functions
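
The "k Bloom filters of size m/k" organization can be sketched in software as follows. This is a minimal illustration, not the hardware design: the `Signature` class, its method names, and the use of `hashlib.blake2b` as a stand-in for the hardware hash functions are all assumptions made for the example.

```python
import hashlib


class Signature:
    """Partitioned Bloom filter: k bit-arrays of m/k bits, one hash per array."""

    def __init__(self, m_bits=2048, k=2):
        assert m_bits % k == 0
        self.k = k
        self.bank_bits = m_bits // k
        self.banks = [0] * k  # each bank is an integer used as a bit-vector

    def _index(self, bank, block_addr):
        # Software stand-in for a hardware hash (assumption): salt by bank number.
        data = f"{bank}:{block_addr}".encode()
        digest = hashlib.blake2b(data, digest_size=8).digest()
        return int.from_bytes(digest, "big") % self.bank_bits

    def insert(self, block_addr):
        for b in range(self.k):
            self.banks[b] |= 1 << self._index(b, block_addr)

    def might_contain(self, block_addr):
        # True for every inserted address (no false negatives);
        # may also be True for an address that was never inserted.
        return all((self.banks[b] >> self._index(b, block_addr)) & 1
                   for b in range(self.k))


read_sig, write_sig = Signature(), Signature()
for addr in (0x1000, 0x1040, 0x1080):  # hypothetical transactional loads
    read_sig.insert(addr)
write_sig.insert(0x1040)               # hypothetical transactional store


def store_conflicts(addr):
    """An external store conflicts if addr may be in the read- or write-set."""
    return read_sig.might_contain(addr) or write_sig.might_contain(addr)
```

An inserted address always tests positive, which is the "never false negatives" property; an address that was never inserted can still test positive when its k bits happen to be set by other addresses.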

Page 5: Notary: Hardware Techniques to Enhance Signatures

Signature hash functions

• Which hash function is best? [Sanchez, MICRO’07]

– Bit-selection? Hash simply decodes some number of input bits

– H3? Each bit of a hash value is an XOR of (on avg.) half of the input address bits

[Figure: LogTM-SE w/ 2kb signatures]

• Result: H3 better with >=2 hash functions

• However, H3 uses many multi-level XOR trees

• Can we improve this?

Page 6: Notary: Hardware Techniques to Enhance Signatures

H3 implementation

• Num XOR = $\frac{\text{addr length in bits}}{4} \cdot c \cdot k$

• Ex: 2kb signatures, k=2, c=10, 32-bit addr = 160 XOR gates per signature

• Can we reduce the total gate count?
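
A small sketch of the two hash styles and of the gate-count arithmetic from this slide. Reading c as the number of hash output bits is an assumption (it matches a 2kb signature split into k=2 filters of 1024 bits); the function names, the random mask generation, and the 64-byte block offset (`low_bit=6`) are also illustrative choices, not the authors' implementation.

```python
import random


def h3_xor_gates(addr_bits, c, k):
    """Gate-count estimate from the slide: (addr length in bits / 4) * c * k."""
    return addr_bits // 4 * c * k


def make_h3(addr_bits, c, seed):
    """Build one H3 hash: one random bit-mask per output bit (assumed setup)."""
    rng = random.Random(seed)
    masks = [rng.getrandbits(addr_bits) for _ in range(c)]

    def h3(addr):
        value = 0
        for bit, mask in enumerate(masks):
            # Parity of the selected bits == XOR of those address bits.
            parity = bin(addr & mask).count("1") & 1
            value |= parity << bit
        return value

    return h3


def bit_select(addr, c, low_bit=6):
    """Bit-selection: use c consecutive address bits directly as the hash."""
    return (addr >> low_bit) & ((1 << c) - 1)


gates = h3_xor_gates(addr_bits=32, c=10, k=2)  # the slide's example
h3 = make_h3(addr_bits=32, c=10, seed=1)
```

With random masks, each output bit XORs about half of the address bits on average, which is where the multi-level XOR trees come from.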

Page 8: Notary: Hardware Techniques to Enhance Signatures

Entropy overview

• Not all address bits have equal randomness

– Ex: High-level address bits unlikely to change if working set size is small

• Key insight: If input bits are random and those bits are used as inputs to hash functions, random hash values result

– Use entropy to measure bit randomness

• Entropy – measure of the uncertainty of a random variable x

Page 9: Notary: Hardware Techniques to Enhance Signatures

Entropy formally defined

• Entropy = $-\sum_{i=1}^{N} p(x_i) \log_2(p(x_i))$

• p(xi) = the probability of the occurrence of value xi

• N = number of sample values random variable x can take on

• Entropy = amount of information required on average to describe outcome of variable x (in bits)

– Ex: What is the best possible lossless compression?

Entropy value of n-bit field:

– Min (0 bits): n-bit field has constant value

– Max (n bits): All bit patterns in n-bit field equally likely

– Other cases fall in between
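
The definition can be checked numerically on the two extreme cases. This is a minimal sketch; the function name `entropy` and the use of observed frequencies as p(xi) are assumptions for illustration.

```python
from collections import Counter
from math import log2


def entropy(samples):
    """Shannon entropy, in bits, of the empirical distribution of samples."""
    counts = Counter(samples)
    total = len(samples)
    return -sum((n / total) * log2(n / total) for n in counts.values())


constant_field = [5, 5, 5, 5]         # a field that never changes
uniform_3bit = list(range(8))         # all 3-bit patterns, equally likely
```

The constant field has zero entropy and the uniform 3-bit field has 3 bits, matching the min and max cases for an n-bit field.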

Page 10: Notary: Hardware Techniques to Enhance Signatures

Our measures of entropy

• For our workloads, we care about:

• Q1: What is the best achievable entropy?

– Global entropy – upper bound on entropy of address

• Q2: How does entropy change within an address?

– Local entropy – entropy of bit-field within the address

[Figure: global entropy is taken over address bits 31..6; local entropy is taken over a bit-field within address bits 31..6, positioned by NSkip]
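
A sketch of how the two measures could be computed over an address trace. The assumptions here are mine: NSkip is counted upward from the lowest block-address bit (bit 6), the window is 16 bits wide, and the trace is a made-up run of 256 consecutive 64-byte blocks.

```python
from collections import Counter
from math import log2


def field_entropy(addrs, low_bit, width):
    """Entropy (bits) of the width-bit field starting at low_bit."""
    mask = (1 << width) - 1
    counts = Counter((a >> low_bit) & mask for a in addrs)
    total = sum(counts.values())
    return -sum((n / total) * log2(n / total) for n in counts.values())


def global_entropy(addrs, block_bits=6, addr_bits=32):
    """Entropy of the whole block address (bits 31..6 by default)."""
    return field_entropy(addrs, block_bits, addr_bits - block_bits)


def local_entropy(addrs, nskip, window=16, block_bits=6):
    """Entropy of a window that skips nskip low-order block-address bits."""
    return field_entropy(addrs, block_bits + nskip, window)


# Hypothetical trace: 256 consecutive 64-byte blocks in one region.
trace = [0x10000000 + 64 * i for i in range(256)]
```

For this trace the global entropy is 8 bits, and the local entropy drops from 8 bits at NSkip=0 to 4 bits at NSkip=4 and 0 bits at NSkip=10, since the window slides away from the bits that actually vary.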

Page 12: Notary: Hardware Techniques to Enhance Signatures

Entropy results

• Workloads to be described later

• Global entropy is at most 16 bits

• Bit-window for local entropy is 16 bits wide (NSkip from 0-10)

– Smaller windows (<16b) may not reach global entropy value

– Larger windows (>16b) hide some fine-grain info

Page 13: Notary: Hardware Techniques to Enhance Signatures

Entropy results summary

• More entropy results in our MICRO paper

• In summary, for our workloads entropy monotonically decreases when moving towards high-order bits

– We calculate the average entropy across the entire workload’s execution

– May miss entropy changes due to program phase behavior

• Our Page-Block-XOR (PBX) hash takes advantage of this overall trend

Page 14: Notary: Hardware Techniques to Enhance Signatures

Page-Block-XOR (PBX)

• Motivated by 3 findings:

– (1) Lower-order bits have most entropy

• Follows from our entropy results

– (2) XORing two bit-fields produces random hash values

• From prior work on XOR hashing (e.g., data placement in caches, DRAM)

– (3) Bit-field overlaps can lead to higher false positives

• Correlation between the two bit-fields can reduce the range of hash values produced (worse for larger signatures)

Page 15: Notary: Hardware Techniques to Enhance Signatures

PBX implementation

• For 2kb signatures with 2 hash functions:

– 20 XOR gates for PBX vs 160 XOR gates for H3!

• PPN and Cache-index fields not tied to system params:

– Use entropy to find two non-overlapping bit-fields with high randomness
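
A minimal sketch of a PBX-style hash: XOR one bit-field from the page-number region with one non-overlapping bit-field from the cache-index region. The field positions (bits 6..15 and 16..25), the function names, and the reading of the gate count as one 2-input XOR per output bit are assumptions; that reading does reproduce the 20-gate figure for c=10 and k=2.

```python
def pbx_hash(addr, c=10, block_bits=6, ppn_low_bit=16):
    """XOR two non-overlapping c-bit fields of the address (assumed positions)."""
    mask = (1 << c) - 1
    index_field = (addr >> block_bits) & mask   # bits 6..15 for the defaults
    ppn_field = (addr >> ppn_low_bit) & mask    # bits 16..25 for the defaults
    return index_field ^ ppn_field


def pbx_xor_gates(c, k):
    """One 2-input XOR per hash output bit, for each of k hash functions."""
    return c * k


pbx_gates = pbx_xor_gates(c=10, k=2)
```

In practice the two field positions would be picked from the measured entropy of the workloads rather than fixed as above.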

Page 16: Notary: Hardware Techniques to Enhance Signatures

Summary thus far

• Problem 1: H3 has high area & power overheads

• Solution 1: Use entropy analysis to guide lower-cost PBX

– Ex: 160 gates for H3 vs 20 gates for PBX

• Problem 2: Spurious signature conflicts caused by signature bits set by private memory addrs

• Solution 2: To be described

Page 18: Notary: Hardware Techniques to Enhance Signatures

Motivation

• False conflicts caused by thread-private addrs

– Avoid conflicts if addrs not inserted in thread’s signatures

Page 19: Notary: Hardware Techniques to Enhance Signatures

Privatization solutions

• Two solutions proposed:

– (1) Remove private stack references from sigs.

• Very little work for programmer/compiler

• Benefits depend on fraction of stack addresses versus all transactional references

– (2) Language-level interface (e.g., private_malloc(), shared_malloc())

• Even higher performance boost

• For skilled programmer

• WARNING: Incorrectly marking shared objects as private can lead to program errors!

Page 20: Notary: Hardware Techniques to Enhance Signatures

Page-based implementation

• Each page is assigned a status, private or shared

– Invariant: Page is shared if any object is shared

• If stack is private, library marks stack pages as private

• If using privatization heap functions, mark heap pages accordingly

Page 21: Notary: Hardware Techniques to Enhance Signatures

OS support

• OS allocates different physical page frames for shared and private pages

– Sets a per-frame bit in translation entry if shared

– Reduce number of page frames used by packing objects with same status together

• Signatures insert memory addresses of transactional references to shared pages

– Query page sharing bit in HW TLB & current transactional status
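
The insertion filter described above can be modeled in a few lines. This is a toy software model under stated assumptions: a Python set stands in for the per-page sharing bit held in the TLB, pages are 4 kB, and the names `mark_shared` and `should_insert` are hypothetical.

```python
PAGE_BITS = 12  # assumed 4 kB pages

shared_pages = set()  # stands in for the per-page sharing bit in the TLB


def mark_shared(addr):
    """Record that the page containing addr holds shared data."""
    shared_pages.add(addr >> PAGE_BITS)


def should_insert(addr, in_transaction):
    """Insert only transactional references that touch a shared page."""
    return in_transaction and (addr >> PAGE_BITS) in shared_pages


mark_shared(0x5000)  # hypothetical shared heap page
```

References to private pages never reach the signature, so they cannot set bits that later alias with another thread's addresses.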

Page 23: Notary: Hardware Techniques to Enhance Signatures

Methodology

• Full-system simulation using Simics and Wisconsin GEMS timing modules

• Transistor-level design for area & power of XOR gates

• CACTI for Bloom filter bit array area & power

• Simulated system

– Single-chip CMP

– 16 single-threaded, in-order cores

– 32kB, 4-way private L1 I & D, write-back

– 8MB, 8-way shared L2 cache

– MESI directory protocol

– Signatures from 64b-64kb (8B-8kB) & “Perfect”

Page 24: Notary: Hardware Techniques to Enhance Signatures

Workloads

• Micro-benchmarks

– BTree – read and write ops on shared tree

– Sparse Matrix – algorithm from dense column vector multiplication kernel

• SPLASH-2 apps

– Barnes & Raytrace – exert most signature pressure

• Stanford STAMP apps

– Vacation, Genome, Delaunay, Bayes, Labyrinth

• DNS server

– BIND

Page 26: Notary: Hardware Techniques to Enhance Signatures

PBX vs H3 area & power

• Area & power overheads (2kb, k=4):

Type of overhead   Bloom filter bit array   H3 hash   PBX hash   H3 sig.   PBX sig.   % savings for PBX sig.
Area (mm2)         2.70e-2                  8.10e-3   4.70e-4    3.50e-2   2.70e-2    23
Power (mW)         1.80e2                   1.04e1    1.02       1.90e2    1.81e2     4.7
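
The savings column can be reproduced from the H3 and PBX signature totals in the table. The helper name below is illustrative; the inputs are the rounded table values.

```python
def pbx_savings_percent(h3_sig, pbx_sig):
    """Relative saving of the PBX signature over the H3 signature, in percent."""
    return 100.0 * (h3_sig - pbx_sig) / h3_sig


area_savings = pbx_savings_percent(3.50e-2, 2.70e-2)   # area in mm2
power_savings = pbx_savings_percent(1.90e2, 1.81e2)    # power in mW
```

The bit array dominates both totals, which is why removing most of the hash logic saves about 23% of area but under 5% of power.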

Page 27: Notary: Hardware Techniques to Enhance Signatures

PBX vs H3 execution time

PBX performs similarly to H3

Additional workload results in paper

Page 28: Notary: Hardware Techniques to Enhance Signatures

Privatization results summary

• Removing private stack references from signatures did not help much

– Most addr references not to stack

– Most likely because running with SPARC ISA. Other ISAs (e.g., x86) likely have more benefits

• Privatization interface helps four workloads

– Remainder either do not have private heap structures or do not have high transactional duty cycle

Page 29: Notary: Hardware Techniques to Enhance Signatures

Privatization interface results

Page 31: Notary: Hardware Techniques to Enhance Signatures

Conclusions

• Tackle 2 problems with signature designs:

– (1) Area and power overheads of H3 hashing

• E.g., 160 XOR gates for H3, 20 for PBX

– (2) False conflicts due to signature bits set by private memory references

• Our solutions:

– (1) Use entropy analysis to guide hashing function (PBX), a low-cost alternative that performs similarly to H3

– (2) Prevent private stack references from entering signatures, and propose a privatization interface for heap allocations

• Notary can be applied to non-TM uses:

– PBX hashing can directly transfer

– Privatization may transfer if addr filtering applies

Page 32: Notary: Hardware Techniques to Enhance Signatures

Future Work

• Dynamic entropy calculation:

– How to adapt PBX hashing to entropy changes over time?

• Dynamic privatization characteristics:

– How common is it for objects to change sharing status (i.e., from private to shared, and vice versa)?

Page 33: Notary: Hardware Techniques to Enhance Signatures

BACKUP SLIDES

Page 34: Notary: Hardware Techniques to Enhance Signatures

Privatization interface

Privatization function: Usage

shared_malloc(size), private_malloc(size): Dynamic allocation of shared and private memory objects

shared_free(ptr), private_free(ptr): Frees up memory allocated by shared or private allocators

privatize_barrier(num_threads, ptr, size), publicize_barrier(num_threads, ptr, size): Program threads come to a common point to privatize or publicize an object. Must be used outside of transactions

Page 35: Notary: Hardware Techniques to Enhance Signatures

Dynamic privatization

• Dynamically switch from private to shared, and vice versa

• If transitioning from private -> shared, safe to mark page as shared (at cost of performance)

• If transitioning from shared -> private, default policy is to disallow if there exist other shared objects on same page

• Otherwise, trap to user software and let programmer call shared_free(), followed by private_malloc() on object
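
The transition rules can be sketched as a toy page model. Everything here is hypothetical naming (`Page`, `publicize`, `privatize`); the model covers only the default policy and assumes that a page with no remaining shared objects may simply revert to private. The trap-to-software path is not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    shared_objects: set = field(default_factory=set)
    private_objects: set = field(default_factory=set)

    @property
    def is_shared(self):
        # Invariant: a page is shared if any object on it is shared.
        return bool(self.shared_objects)


def publicize(page, obj):
    """private -> shared: always safe; the whole page becomes shared."""
    page.private_objects.discard(obj)
    page.shared_objects.add(obj)
    return True


def privatize(page, obj):
    """shared -> private: refused if other shared objects live on the page."""
    if page.shared_objects - {obj}:
        return False
    # Assumption: with no other shared objects, the page can go private.
    page.shared_objects.discard(obj)
    page.private_objects.add(obj)
    return True
```

Marking a page shared is conservative, costing only extra signature insertions, whereas marking it private while shared objects remain would hide real conflicts.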

Page 36: Notary: Hardware Techniques to Enhance Signatures

Bit-field overlaps harmful for PBX

Page 37: Notary: Hardware Techniques to Enhance Signatures

Removing stack refs doesn’t help significantly

Page 38: Notary: Hardware Techniques to Enhance Signatures

Entropy of commercial workloads

Page 39: Notary: Hardware Techniques to Enhance Signatures

Signature Operation Example

Program: xbegin; LD A; ST B; LD C; LD D; ST C

[Figure: addresses A, B, C, D pass through the hash function(s) and set bits in the read (R) and write (W) signatures. External ST E aliases with bits already set: FALSE POSITIVE, CONFLICT. External ST F does not match the signatures: NO CONFLICT]

Page 40: Notary: Hardware Techniques to Enhance Signatures

Type of Hash Functions

• In real programs, addresses are neither independent nor uniformly distributed (key assumptions to derive PFP(n))

• But can generate hash values that are almost uniformly distributed and uncorrelated with good (universal/almost universal) hash functions

• Hash functions considered:

– Bit-selection (inexpensive, low quality)

– H3 [Carter, CSS79] (moderate, higher quality)
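
The slides do not reproduce PFP(n). For reference, the textbook false-positive probability for a Bloom filter built from k filters of m/k bits each, after n insertions of independent and uniformly hashed addresses, is shown below. The symbols m, k and n follow the usage earlier in the deck; treating this as the PFP(n) the slide refers to is an assumption.

```latex
P_{FP}(n) = \left(1 - \left(1 - \frac{k}{m}\right)^{n}\right)^{k}
```

Each insertion sets one bit in each filter, so a given bit stays clear with probability (1 - k/m)^n, and a false positive needs all k probed bits to be set.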