Caches in Systems COMP25212 Cache 4

Upload: bryce-bond

Post on 02-Jan-2016

Learning Objectives

• To understand:
– the “3 C’s” model of cache performance
– time penalties for starting with an empty cache
– system interconnect issues with caching, and solutions
– caching and virtual memory

Describing Cache Misses

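• Compulsory Misses: the first reference to a line, which cannot already be in the cache

• Capacity Misses: the cache is too small to hold all the lines currently in use

• Conflict Misses: lines in use compete for the same cache location (set), even though there is free space elsewhere in the cache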
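One common way to separate the three kinds of miss is to run the same address trace through the real cache and through a fully associative LRU cache of the same size. The sketch below does that under stated assumptions: the function name `classify_misses`, the 64-byte line size and LRU replacement are illustrative choices, not taken from the slides.

```python
from collections import OrderedDict


def classify_misses(trace, num_lines, ways, line_size=64):
    """Count hits and the 3 C's for a trace of byte addresses (illustrative sketch)."""
    num_sets = num_lines // ways
    sets = [OrderedDict() for _ in range(num_sets)]  # the real cache, LRU within each set
    full = OrderedDict()  # reference model: fully associative LRU, same total size
    seen = set()          # every line referenced so far
    counts = {"hit": 0, "compulsory": 0, "capacity": 0, "conflict": 0}

    for addr in trace:
        line = addr // line_size

        # Update the fully associative reference model.
        full_hit = line in full
        if full_hit:
            full.move_to_end(line)
        else:
            if len(full) >= num_lines:
                full.popitem(last=False)  # evict least recently used
            full[line] = True

        # Update the real (set-associative or direct-mapped) cache.
        cache_set = sets[line % num_sets]
        if line in cache_set:
            cache_set.move_to_end(line)
            counts["hit"] += 1
        else:
            if len(cache_set) >= ways:
                cache_set.popitem(last=False)
            cache_set[line] = True
            if line not in seen:
                counts["compulsory"] += 1   # never referenced before
            elif not full_hit:
                counts["capacity"] += 1     # even a fully associative cache missed
            else:
                counts["conflict"] += 1     # only the mapping restriction caused the miss

        seen.add(line)

    return counts
```

For example, alternating between two addresses that map to the same line of a 4-line direct-mapped cache (`classify_misses([0, 256, 0, 256], num_lines=4, ways=1)`) gives two compulsory misses followed by two conflict misses.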

Cache Performance again

• Today’s caches, how long does it take:
a) to fill L3 cache? (8 MB)
b) to fill L2 cache? (256 KB)
c) to fill L1 D cache? (32 KB)

• Number of lines = (cache size) / (line size)
• For the L1 D cache with 64-byte lines: number of lines = 32 K / 64 = 512
• 512 memory accesses at 20 ns each ≈ 10 µs
• ≈ 20,000 clock cycles at 2 GHz
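The same arithmetic can be applied to all three cache sizes. This is a rough sketch that assumes every level uses 64-byte lines and that each line costs one 20 ns memory access, performed one at a time; the function name `fill_time` is made up for illustration.

```python
def fill_time(cache_bytes, line_bytes=64, mem_access_ns=20.0, clock_ghz=2.0):
    """Return (lines, time in ns, clock cycles) to fill an initially empty cache."""
    lines = cache_bytes // line_bytes   # number of lines = cache size / line size
    time_ns = lines * mem_access_ns     # one memory access per line, no overlap assumed
    cycles = time_ns * clock_ghz        # 1 GHz = 1 cycle per ns
    return lines, time_ns, cycles


KB = 1024
MB = 1024 * KB

for name, size in [("L1 D", 32 * KB), ("L2", 256 * KB), ("L3", 8 * MB)]:
    lines, time_ns, cycles = fill_time(size)
    print(f"{name}: {lines} lines, {time_ns / 1000:.2f} us, {cycles:.0f} cycles")
```

Under these assumptions the L1 D cache takes 10.24 µs (20,480 cycles), which is the slide's "10 µs, 20,000 cycles" before rounding; the L2 takes about 82 µs and the L3 about 2.6 ms.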

Caches in Systems

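[Diagram: on-chip, a CPU with an L1 instruction cache (fetch path) and an L1 data cache (data path), both backed by L2. An interconnect joins the chip to RAM memory and to input/output (e.g. disk, network). Annotation: “how often?”]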

Cache Consistency Problem 1

• Problem:
– I/O writes to memory; the cached copy is now out of date

[Diagram: the same system (CPU, L1 instruction cache, data cache, L2, interconnect, RAM memory, input/output), with three copies of a data item “I” marked.]

Cache Consistency Problem 2

[Diagram: the same system (CPU, L1 instruction cache, data cache, L2, interconnect, RAM memory, input/output), with three copies of a data item “I” marked.]

• Problem:
– I/O reads memory; the cache holds a newer value
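Both consistency problems can be reproduced with a toy model. The sketch below assumes a write-back cache and a device that accesses memory directly; the class `WriteBackCache`, the address `0x100` and the string values are all hypothetical, chosen only to make the stale copies visible.

```python
class WriteBackCache:
    """Toy write-back cache in front of a dict that stands in for RAM."""

    def __init__(self, memory):
        self.memory = memory
        self.lines = {}      # address -> cached value
        self.dirty = set()   # addresses modified in the cache but not yet in memory

    def read(self, addr):
        if addr not in self.lines:          # miss: fetch from memory
            self.lines[addr] = self.memory[addr]
        return self.lines[addr]

    def write(self, addr, value):
        self.lines[addr] = value            # write-back: memory is NOT updated here
        self.dirty.add(addr)


def problem_1():
    """I/O writes to memory; the CPU keeps reading the outdated cached copy."""
    memory = {0x100: "old"}
    cache = WriteBackCache(memory)
    cache.read(0x100)                       # CPU brings the item into the cache
    memory[0x100] = "from device"           # device write bypasses the cache
    return cache.read(0x100), memory[0x100]


def problem_2():
    """I/O reads memory; the newer value is still only in the cache."""
    memory = {0x100: "old"}
    cache = WriteBackCache(memory)
    cache.write(0x100, "from CPU")          # newer value lives only in the cache
    device_sees = memory[0x100]             # device read bypasses the cache
    return cache.read(0x100), device_sees
```

In this model `problem_1()` returns `("old", "from device")`: the CPU sees the stale value. `problem_2()` returns `("from CPU", "old")`: the device sees the stale value.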

Cache Consistency Software Solutions

• O/S knows where I/O takes place in memory
– Mark I/O areas as non-cacheable (how?)

• O/S knows when I/O starts and finishes
– Clear caches before & after I/O?
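The two software approaches can be sketched on a toy write-back cache. Everything here is an assumption made for illustration: the class and function names, the `uncacheable` set standing in for whatever mechanism marks an I/O area, and the choice to flush (write back) before I/O and invalidate after it.

```python
class WriteBackCache:
    """Toy write-back cache with flush, invalidate and non-cacheable addresses."""

    def __init__(self, memory, uncacheable=()):
        self.memory = memory
        self.uncacheable = set(uncacheable)  # addresses the O/S marked non-cacheable
        self.lines = {}
        self.dirty = set()

    def read(self, addr):
        if addr in self.uncacheable:
            return self.memory[addr]         # always go to memory
        if addr not in self.lines:
            self.lines[addr] = self.memory[addr]
        return self.lines[addr]

    def write(self, addr, value):
        if addr in self.uncacheable:
            self.memory[addr] = value        # write straight through to memory
            return
        self.lines[addr] = value
        self.dirty.add(addr)

    def flush(self):
        """Write every dirty line back so memory holds the newest values."""
        for addr in self.dirty:
            self.memory[addr] = self.lines[addr]
        self.dirty.clear()

    def invalidate(self):
        """Discard all cached copies so later reads refetch from memory."""
        self.lines.clear()
        self.dirty.clear()


def device_output(cache, memory, addr):
    """Device reads memory: flush first so it sees the CPU's latest writes."""
    cache.flush()
    return memory[addr]


def device_input(cache, memory, addr, value):
    """Device writes memory: flush before, invalidate after."""
    cache.flush()          # do not lose unrelated dirty data when invalidating
    memory[addr] = value   # the I/O transfer itself
    cache.invalidate()     # drop copies that may now be out of date
```

Clearing the whole cache around every transfer is simple but costly, since the cache must then be refilled; marking only the I/O buffers non-cacheable avoids that at the price of slower accesses to those buffers.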

Hardware Solutions: 1

Unfortunately: tends to slow down the cache

[Diagram: the same system (CPU, L1 instruction cache, data cache, L2, interconnect, RAM memory, input/output), with three copies of a data item “I” marked.]

Hardware Solutions: 2 - Snooping

[Diagram: the same system (CPU, L1 instruction cache, data cache, L2, interconnect, RAM memory, input/output), with three copies of a data item “I” marked.]

Snoop logic in cache observes every memory cycle

[Diagram label: snoop]

L2 keeps track of L1 contents
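A minimal sketch of the snooping idea follows, assuming a single write-back cache and one simple policy: invalidate on an observed write, write back on an observed read. The names `SnoopingCache` and `Bus` are hypothetical, real protocols differ in detail, and the L2 tracking of L1 contents mentioned above is not modelled.

```python
class SnoopingCache:
    """Toy write-back cache that reacts to memory cycles it observes on the bus."""

    def __init__(self, memory):
        self.memory = memory
        self.lines = {}
        self.dirty = set()

    def read(self, addr):
        if addr not in self.lines:
            self.lines[addr] = self.memory[addr]
        return self.lines[addr]

    def write(self, addr, value):
        self.lines[addr] = value
        self.dirty.add(addr)

    def snoop_write(self, addr):
        """Another bus master wrote this address: our copy is stale, so drop it."""
        self.lines.pop(addr, None)
        self.dirty.discard(addr)

    def snoop_read(self, addr):
        """Another bus master is reading: make sure memory has our newest value."""
        if addr in self.dirty:
            self.memory[addr] = self.lines[addr]
            self.dirty.discard(addr)


class Bus:
    """Interconnect that shows every I/O memory cycle to the snooping caches."""

    def __init__(self, memory, snoopers):
        self.memory = memory
        self.snoopers = snoopers

    def io_write(self, addr, value):
        self.memory[addr] = value
        for cache in self.snoopers:
            cache.snoop_write(addr)

    def io_read(self, addr):
        for cache in self.snoopers:
            cache.snoop_read(addr)
        return self.memory[addr]
```

With this in place, a CPU read after `bus.io_write(...)` misses and refetches the device's value, and `bus.io_read(...)` returns the CPU's latest write, without any help from the O/S.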

Caches and Virtual Addresses

• CPU addresses – virtual

• Memory addresses – physical

• Recap – use TLB to translate v-to-p

• What addresses in cache?

Option 1: Cache by Physical Addresses

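[Diagram: on-chip, the CPU’s address passes through the TLB and then to the cache ($); data returns from the cache to the CPU. The cache connects to RAM memory.]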

• BUT:
– Address translation in series with cache: SLOW

Option 2: Cache by Virtual Addresses

[Diagram: on-chip, the CPU addresses the cache ($) directly with virtual addresses; the TLB sits between the cache and RAM memory.]

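• BUT:
– Snooping? (memory cycles carry physical addresses, but the cache is looked up by virtual address)
– Aliasing? (two virtual addresses can map to the same physical location)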

More Functional Difficulties

Option 3: Translate in parallel with Cache Lookup

• Translation only affects high-order bits of address
• Address within page remains unchanged

Low-order bits of Physical Address = low-order bits of Virtual Address

Select “index” field of cache address from within low-order bits

Only “Tag” bits changed by translation
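The condition for this to work can be checked numerically: the within-line and index fields together must fit inside the page offset. The sketch below assumes 4 KB pages, 64-byte lines and 64 sets (for example a 32 KB, 8-way cache); these parameters, the one-entry page table and the function names are illustrative assumptions.

```python
PAGE_BYTES = 4096   # assumed page size
LINE_BYTES = 64     # assumed line size
NUM_SETS = 64       # e.g. 32 KB / (64-byte lines * 8 ways)


def log2(n):
    """Exact log2 for a power of two."""
    return n.bit_length() - 1


OFFSET_BITS = log2(LINE_BYTES)       # "within line" field
INDEX_BITS = log2(NUM_SETS)          # "index" field
PAGE_OFFSET_BITS = log2(PAGE_BYTES)  # bits left unchanged by translation


def index_fits_in_page_offset():
    """True if the index can be taken from bits that translation does not change."""
    return OFFSET_BITS + INDEX_BITS <= PAGE_OFFSET_BITS


def split(addr):
    """Split an address into (tag, index, within-line) fields."""
    within_line = addr & (LINE_BYTES - 1)
    index = (addr >> OFFSET_BITS) & (NUM_SETS - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, within_line


def translate(vaddr, page_table):
    """Replace the virtual page number with the physical one; keep the page offset."""
    vpn = vaddr >> PAGE_OFFSET_BITS
    offset = vaddr & (PAGE_BYTES - 1)
    return (page_table[vpn] << PAGE_OFFSET_BITS) | offset


page_table = {0x12345: 0x00ABC}            # hypothetical mapping
vaddr = (0x12345 << PAGE_OFFSET_BITS) | 0x7A4
paddr = translate(vaddr, page_table)
```

Here `split(vaddr)` and `split(paddr)` agree on the index and within-line fields and differ only in the tag, so the cache can start selecting a line from the virtual address while the TLB produces the physical tag for the comparison.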

Option 3 in operation:

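[Diagram: the virtual address is divided into virtual page number, index and within-line fields (bit labels appear as “7520” in the extracted text). The index selects a tag line and a data line. The TLB translates the virtual page number into the physical address, which is compared (= ?) with the stored tag to give “Hit?”; a multiplexer selects the “Data”.]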

The Last Word on Caching?

[Diagram: three chips share RAM memory and input/output. On each chip there are two CPUs, each with its own L1 instruction cache (fetch), L1 data cache (data) and L2, plus an L3 on the same chip.]

You ain’t seen nothing yet!