Anshul Kumar, CSE IITD, CSL718 : Memory Hierarchy, Cache Memories, 6th Feb, 2006


Anshul Kumar, CSE IITD

CSL718 : Memory Hierarchy

Cache Memories

6th Feb, 2006


Anshul Kumar, CSE IITD slide 2

Memory technologies

• Semiconductor (random access)
  – Registers
  – SRAM
  – DRAM
  – FLASH

• Magnetic (random + sequential)
  – FDD
  – HDD

• Optical (random + sequential)
  – CD
  – DVD


Anshul Kumar, CSE IITD slide 3

Hierarchical structure

[Figure: CPU attached to a chain of memory levels. The level nearest the CPU is the smallest, has the highest cost per bit and is the fastest; the level farthest from the CPU is the biggest, has the lowest cost per bit and is the slowest.]


Anshul Kumar, CSE IITD slide 4

System Configuration: e-bay price: Rs. 37,500

Processor: Intel P4 3.2GHz (800FSB) 1024k CPU with Hyper Threading

CPU Fan: P4 Heavy Duty Cooling Fan With Heat Sink

Motherboard: D915G express chipset 800FSB  (up to 3.6GHz support)

Memory: 1GB DDR400 PC3200 DUAL CHANNEL RAM

Video Card: GeForce FX 6200 256MB 16x PCI-e video with TV out

Hard drive: 160GB 7200RPM UDMA-150 SATA

CD drive: 52x32x52x16x CDRW + DVD ROM drive 

Floppy drive: Sony 1.44MB 3.5" drive

Sound: AC 97 6 ch 5.1 Full duplex digital sound, stereo speakers

Network: 10/100 RJ45 onboard network (Ethernet, cable or DSL)

Modem: 56k v92 modem 

Ports: Six USB 2.0 ports, 1 serial, 1 parallel, 1 microphone jack

Case: Black i BOX 522 Mid Tower 400w power supply (front USB)

Keyboard: Black PS2 Windows Keyboard

Mouse: Black PS2 Scroll Mouse

Monitor: 17" SAMSUNG 793S MONITOR


Anshul Kumar, CSE IITD slide 5

Main Memory for Pentium IV: DDR (double data rate) DRAM

Size     Interface   Price
128 MB   PC-333      Rs. 599
256 MB   PC-333      Rs. 1,299
1 GB     PC-333      Rs. 4,999
1 GB     PC-400      Rs. 5,299


Anshul Kumar, CSE IITD slide 6

Disk drives: Seagate Barracuda 7200 RPM

Capacity   Price
40 GB      Rs. 2,999
80 GB      Rs. 3,499
120 GB     Rs. 4,499
160 GB     Rs. 4,799
200 GB     Rs. 5,500
250 GB     Rs. 6,999
300 GB     Rs. 9,900
400 GB     Rs. 14,950
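The two price lists make the cost-per-bit gap between hierarchy levels concrete. The following is a small sketch that compares the 1 GB PC-400 module with the 160 GB drive, using only prices quoted on the slides; the function name `rupees_per_gb` is illustrative.

```python
def rupees_per_gb(price_rs, capacity_gb):
    """Unit cost of a storage device in rupees per gigabyte."""
    return price_rs / capacity_gb


# Prices taken from the slides: 1 GB PC-400 DDR module and 160 GB disk drive.
dram = rupees_per_gb(5299, 1)
disk = rupees_per_gb(4799, 160)

print(f"DRAM: Rs. {dram:.0f}/GB, disk: Rs. {disk:.2f}/GB, ratio: {dram / disk:.0f}x")
```

Per gigabyte, the DRAM module costs more than a hundred times as much as the disk, which is the "highest cost per bit near the CPU" trend of the hierarchy.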


Anshul Kumar, CSE IITD slide 7

Data transfer between levels

unit of transfer = block

[Figure: processor and memory levels, with arrows labeled access, hit, miss and data transfer]


Anshul Kumar, CSE IITD slide 8

Principle of locality

• Temporal Locality
  – references repeated in time

• Spatial Locality
  – references repeated in space
  – Special case: Sequential Locality
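One rough way to see both kinds of locality in an address trace is to count references that repeat an earlier address (temporal) and references that fall in a block already touched through a different address (spatial). This is only an illustrative sketch: the function name, the block size of 4 and the sample trace are assumptions, not part of the slides.

```python
def locality_counts(addresses, block_size=4):
    """Count (temporal, spatial) reuses in an address trace.

    temporal: the exact address was referenced before.
    spatial:  the address is new, but its block was referenced before.
    """
    seen_addrs, seen_blocks = set(), set()
    temporal = spatial = 0
    for addr in addresses:
        block = addr // block_size
        if addr in seen_addrs:
            temporal += 1
        elif block in seen_blocks:
            spatial += 1
        seen_addrs.add(addr)
        seen_blocks.add(block)
    return temporal, spatial


# Hypothetical trace: a sequential sweep, a short re-read, then a new block.
trace = [0, 1, 2, 3, 0, 1, 8, 9]
print(locality_counts(trace))
```

Sequential locality shows up here as the run 0, 1, 2, 3: after the first reference, every following address lands in a block that is already known.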


Anshul Kumar, CSE IITD slide 9

Memory Hierarchy Analysis

Memory $M_i$: $M_1, M_2, \ldots, M_n$

Capacity $s_i$: $s_1 < s_2 < \cdots < s_n$

Unit cost $c_i$: $c_1 > c_2 > \cdots > c_n$

Total cost $C_{total}$: $\sum_i c_i \cdot s_i$

Access time $t_i$: $\tau_1 + \tau_2 + \cdots + \tau_i$ ($\tau_i$ at level $i$)

$\tau_1 < \tau_2 < \cdots < \tau_n$

Hit ratios $h_i(s_i)$: $h_1 < h_2 < \cdots < h_n = 1$

Effective time $T_{eff}$: $\sum_i m_i \cdot h_i \cdot t_i = \sum_i m_i \cdot \tau_i$

Miss before level $i$, $m_i$: $(1-h_1)(1-h_2) \cdots (1-h_{i-1})$
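The two forms of the effective access time can be checked numerically. The sketch below writes the per-level time as `taus` (the time spent at level i, so that t_i is their running sum) and evaluates both sums; the function name and the three-level example values are assumptions chosen for illustration.

```python
def effective_time(hit_ratios, taus):
    """Return (sum of m_i*h_i*t_i, sum of m_i*tau_i) for an n-level hierarchy.

    hit_ratios[i] is h_i (the last level must have h_n = 1),
    taus[i] is the time spent at level i, and t_i = tau_1 + ... + tau_i.
    """
    if hit_ratios[-1] != 1.0:
        raise ValueError("the last level must always hit (h_n = 1)")
    miss_before = 1.0   # m_i: probability that all levels before i missed
    t_i = 0.0           # cumulative access time down to level i
    by_hits = by_levels = 0.0
    for h, tau in zip(hit_ratios, taus):
        t_i += tau
        by_hits += miss_before * h * t_i
        by_levels += miss_before * tau
        miss_before *= 1.0 - h
    return by_hits, by_levels


# Hypothetical three-level hierarchy: cache, main memory, disk.
print(effective_time([0.9, 0.95, 1.0], [1.0, 10.0, 100.0]))
```

Both sums agree because m_i * h_i = m_i - m_{i+1}, so the first sum telescopes into the second once h_n = 1 makes the final miss term vanish.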


Anshul Kumar, CSE IITD slide 10

Cache Types

Instruction | Data | Unified | Split

Split vs. Unified:

• Split allows specializing each part

• Unified allows best use of the capacity

On-chip | Off-chip

• on-chip : fast but small

• off-chip : large but slow

Single level | Multi level


Anshul Kumar, CSE IITD slide 11

Cache Policies

• Placement: what gets placed where?

• Read: when? from where?

• Load: order of bytes/words?

• Fetch: when to fetch new block?

• Replacement: which one?

• Write: when? to where?


Anshul Kumar, CSE IITD slide 12

Block placement strategies

[Figure: three panels showing where memory block 12 may be placed and which tags are searched, each with a Tag array and a Data array.
Direct mapped: block # 0 to 7.
Set associative: set # 0 to 3.
Fully associative: no index, the whole cache is searched.]
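The three placement strategies differ only in how many cache frames a given memory block may occupy. The sketch below assumes the usual modulo mapping and a 2-way organization for the four-set case (neither is stated on the slide); `candidate_frames` is a hypothetical helper name.

```python
def candidate_frames(block_addr, num_frames, ways):
    """Return the cache frame numbers where a memory block may be placed.

    ways = 1 gives a direct-mapped cache, ways = num_frames a fully
    associative one. Assumes set index = block address mod number of sets.
    """
    num_sets = num_frames // ways
    set_index = block_addr % num_sets
    first = set_index * ways
    return list(range(first, first + ways))


# Memory block 12 in a cache with 8 block frames.
print("direct mapped:    ", candidate_frames(12, 8, ways=1))
print("2-way set assoc.: ", candidate_frames(12, 8, ways=2))
print("fully associative:", candidate_frames(12, 8, ways=8))
```

The number of frames returned is also the number of tags that must be searched on a lookup, which is why higher associativity costs more comparators.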


Anshul Kumar, CSE IITD slide 13

Organization/placement policy

Cache:  Set 1 … Set S
Set:    Sector 1, Sector 2, … Sector SE, LRU
Sector: Tag, Block 1, Block 2, … Block B
Block:  AU 1, AU 2, … AU A, V, D, S


Anshul Kumar, CSE IITD slide 14

Addressing Cache

Address fields: Sector Name | Set Index | Block | Displacement

• Sector Name: compared to tags
• Set Index: selects set
• Block: selects block
• Displacement: selects AU

Early select: access data after tag matching

Late select: access data while tag matching


Anshul Kumar, CSE IITD slide 15

Cache organization example

[Figure: cache with 8 sets (rows 1 to 8). Each set holds 2 sectors; each sector has a Tag and 2 blocks; each block has V and D bits and 2 AUs.]


Anshul Kumar, CSE IITD slide 16

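Cache access mechanism

[Figure: direct-mapped cache with 4096 entries (index 0 to 4095), each holding a valid bit v, an 18-bit tag and 32 bits of data. The 32-bit address (bits 31 to 0) is split into an 18-bit tag, a 12-bit index and a 2-bit byte offset. The stored tag is compared (=) with the address tag to produce Hit; the indexed entry supplies Data.]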
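The address split in this mechanism (18-bit tag, 12-bit index, 2-bit byte offset) can be sketched in a few lines, together with a minimal direct-mapped lookup. The names `split_address` and `DirectMappedCache` are illustrative, and the model tracks only valid bits and tags, not the data words.

```python
def split_address(addr, tag_bits=18, index_bits=12, offset_bits=2):
    """Split a 32-bit address into (tag, index, byte offset)."""
    assert tag_bits + index_bits + offset_bits == 32
    offset = addr & ((1 << offset_bits) - 1)
    index = (addr >> offset_bits) & ((1 << index_bits) - 1)
    tag = addr >> (offset_bits + index_bits)
    return tag, index, offset


class DirectMappedCache:
    """4096-entry direct-mapped cache model holding only valid bits and tags."""

    NUM_ENTRIES = 4096  # 2**12, matching the 12-bit index

    def __init__(self):
        self.valid = [False] * self.NUM_ENTRIES
        self.tags = [0] * self.NUM_ENTRIES

    def access(self, addr):
        """Return True on a hit; on a miss, install the new tag."""
        tag, index, _ = split_address(addr)
        hit = self.valid[index] and self.tags[index] == tag
        if not hit:
            self.valid[index] = True
            self.tags[index] = tag
        return hit


cache = DirectMappedCache()
print(split_address(0x12345678))
print(cache.access(0x12345678), cache.access(0x1234567A))
```

The second access differs from the first only in the byte offset, so it maps to the same entry with the same tag and hits.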


Anshul Kumar, CSE IITD slide 17

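Cache with 4 word blocks

[Figure: direct-mapped cache with 1024 entries (index 0 to 1023), each holding a valid bit v, an 18-bit tag and four 32-bit data words. The 32-bit address (bits 31 to 0) is split into an 18-bit tag, a 10-bit index, a 2-bit block offset and a 2-bit byte offset. The tag comparison (=) produces Hit; the block offset drives a Mux that selects one 32-bit word as Data.]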


Anshul Kumar, CSE IITD slide 18

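4-way set associative cache

[Figure: 4-way set-associative cache with 256 sets (index 0 to 255). The 32-bit address (bits 31 to 0) is split into a 20-bit tag, an 8-bit index, a 2-bit block offset and a 2-bit byte offset. Each of the four ways holds v, a 20-bit tag and 128 bits of data, with its own comparator (=) and a Mux that uses the block offset to select a 32-bit word. A final Mux selects the word from the matching way to produce Hit and Data.]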
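The 4-way lookup can be modeled as a search over the four entries of the indexed set, followed by word selection with the block offset. This is a sketch under the field widths shown (20-bit tag, 8-bit index, 2-bit block offset, 2-bit byte offset); the class and method names are hypothetical, and the way to fill is supplied by the caller because the replacement policy is a separate decision.

```python
WAYS = 4
INDEX_BITS = 8


def fields(addr):
    """Split a 32-bit address into (tag, index, block offset, byte offset)."""
    byte_off = addr & 0x3
    block_off = (addr >> 2) & 0x3
    index = (addr >> 4) & 0xFF
    tag = addr >> 12
    return tag, index, block_off, byte_off


class SetAssociativeCache:
    def __init__(self):
        # sets[index][way] = (valid, tag, four 32-bit words)
        self.sets = [[(False, 0, [0] * 4) for _ in range(WAYS)]
                     for _ in range(1 << INDEX_BITS)]

    def read(self, addr):
        """Return (hit, word). All four tags of the set are compared."""
        tag, index, block_off, _ = fields(addr)
        for valid, stored_tag, words in self.sets[index]:
            if valid and stored_tag == tag:
                return True, words[block_off]   # per-way mux on block offset
        return False, None

    def fill(self, addr, words, way):
        """Install a 4-word block in the given way of the indexed set."""
        tag, index, _, _ = fields(addr)
        self.sets[index][way] = (True, tag, list(words))


cache = SetAssociativeCache()
cache.fill(0x12345678, [10, 11, 12, 13], way=2)
print(cache.read(0x12345678))
```

In hardware the four comparisons happen in parallel; the loop here is only a sequential stand-in for that search.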


Anshul Kumar, CSE IITD slide 19

Read policies

• Sequential or concurrent
  – initiate memory access only after detecting a miss
  – initiate memory access along with cache access in anticipation of a miss

• With or without forwarding
  – give data to CPU after filling the missing block in cache
  – forward data to CPU as it gets filled in cache


Anshul Kumar, CSE IITD slide 20

Read Policies

Sequential Simple: $T_{eff} = (1 - p_m) \cdot 1 + p_m \cdot (T + 2)$

Concurrent Simple: $T_{eff} = (1 - p_m) \cdot 1 + p_m \cdot (T + 1)$

Sequential Forward: $T_{eff} = (1 - p_m) \cdot 1 + p_m \cdot (T + 1)$

Concurrent Forward: $T_{eff} = (1 - p_m) \cdot 1 + p_m \cdot T$

[Timing diagrams: cache and memory activity for each of the four cases, drawn with unit-time cache steps (1) and a memory access of duration T]
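The four effective-time formulas follow one pattern: a hit costs 1, and a miss costs T plus one extra unit for starting memory only after the miss is detected (sequential) and one extra unit for reading the cache again after the fill (no forwarding). That decomposition is an interpretation of the formulas, and reading p_m as the miss probability is likewise an assumption; the function name and sample numbers are illustrative.

```python
def t_eff(p_miss, T, sequential, forwarding):
    """Effective access time for one of the four read policies.

    p_miss: miss probability, T: memory access time (in cache cycles).
    Assumed decomposition: +1 if memory starts only after miss detection,
    +1 if the CPU waits for the block fill instead of being forwarded data.
    """
    miss_time = T + (1 if sequential else 0) + (0 if forwarding else 1)
    return (1 - p_miss) * 1 + p_miss * miss_time


# Hypothetical values: 5% misses, memory access of 10 cycles.
for name, seq, fwd in [("Sequential Simple", True, False),
                       ("Concurrent Simple", False, False),
                       ("Sequential Forward", True, True),
                       ("Concurrent Forward", False, True)]:
    print(f"{name}: {t_eff(0.05, 10, seq, fwd):.2f}")
```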


Anshul Kumar, CSE IITD slide 21

Load policies

4 AU block (AU 0, 1, 2, 3), cache miss on AU 1

• Block Load
• Load Forward
• Fetch Bypass (wrap around load)

[Figure: order in which the AUs of the block are loaded under each policy]
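The load policies differ in the order in which the AUs of the missing block arrive. The sketch below assumes the common textbook meanings, which the slide text alone does not confirm: block load brings the whole block starting from AU 0, load forward brings the missed AU and those after it, and fetch bypass (wrap around load) starts at the missed AU and wraps around to the beginning. The function name and policy strings are hypothetical.

```python
def load_order(policy, missed_au, block_size=4):
    """Order in which AUs are loaded after a miss (assumed semantics)."""
    if policy == "block_load":
        # Whole block, starting from AU 0.
        return list(range(block_size))
    if policy == "load_forward":
        # Missed AU and the AUs that follow it in the block.
        return list(range(missed_au, block_size))
    if policy == "fetch_bypass":
        # Missed AU first, then wrap around to the start of the block.
        return [(missed_au + k) % block_size for k in range(block_size)]
    raise ValueError(f"unknown policy: {policy}")


for policy in ("block_load", "load_forward", "fetch_bypass"):
    print(policy, load_order(policy, missed_au=1))
```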


Anshul Kumar, CSE IITD slide 22

Fetch Policies

• Fetch on miss (demand fetching)

• Software prefetching

• Hardware Prefetching


Anshul Kumar, CSE IITD slide 23

Fetch Policies

• Demand fetching
  – fetch only when required (miss)

• Hardware prefetching
  – automatically prefetch next block

• Software prefetching
  – programmer decides to prefetch

Questions:
  – how much ahead (prefetch distance)
  – how often
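The effect of prefetching the next block can be seen on a sequential block trace. This sketch makes several simplifying assumptions that are not from the slides: the cache is unbounded (so replacement plays no role), the prefetch is triggered only on a miss, the prefetch distance is one block, and prefetched blocks arrive before they are needed.

```python
def count_misses(block_trace, prefetch_next=False):
    """Count misses for demand fetching, optionally with next-block prefetch."""
    cached = set()   # unbounded cache: isolates the fetch policy
    misses = 0
    for block in block_trace:
        if block not in cached:
            misses += 1
            cached.add(block)
            if prefetch_next:
                cached.add(block + 1)   # prefetch distance of one block
    return misses


sequential = list(range(8))
print("demand fetching:    ", count_misses(sequential))
print("next-block prefetch:", count_misses(sequential, prefetch_next=True))
```

On this trace the prefetch halves the misses; a larger prefetch distance or prefetching on every access would change the count, which is exactly the "how much ahead" and "how often" question.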


Anshul Kumar, CSE IITD slide 24

Software Control of Cache

Software visible cache
  – mode selection (WT, WB etc)
  – block flush
  – block invalidate
  – block prefetch


Anshul Kumar, CSE IITD slide 25

Replacement Policies

• Least Recently Used (LRU)

• Least Frequently Used (LFU)

• First In First Out (FIFO)

• Random
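LRU and FIFO differ only in whether a hit refreshes a block's position in the eviction order. The sketch below models a single fully associative set with Python's `OrderedDict`; the function name, the capacity of 3 and the trace are illustrative choices, and LFU and Random are left out for brevity.

```python
from collections import OrderedDict


def replacement_misses(trace, capacity, policy):
    """Count misses in one fully associative set under "LRU" or "FIFO"."""
    resident = OrderedDict()   # first entry is the next eviction victim
    misses = 0
    for block in trace:
        if block in resident:
            if policy == "LRU":
                resident.move_to_end(block)   # a hit makes the block most recent
            continue                          # FIFO ignores hits
        misses += 1
        if len(resident) == capacity:
            resident.popitem(last=False)      # evict the oldest entry
        resident[block] = True
    return misses


trace = [1, 2, 3, 1, 4, 1, 2]
print("LRU: ", replacement_misses(trace, 3, "LRU"))
print("FIFO:", replacement_misses(trace, 3, "FIFO"))
```

On this trace FIFO evicts block 1 even though it was just reused, so it takes one more miss than LRU.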


Anshul Kumar, CSE IITD slide 26

Write Policies

• Write Hit
  – Write Back
  – Write Through

• Write Miss
  – Write Back
  – Write Through (with or without Write Allocate)

Buffers are used in all cases to hide latencies
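The difference between the two write-hit policies shows up in the number of writes that reach memory. This sketch counts them for a sequence of write hits and evictions; it assumes every written block is already resident and ignores write buffers and write allocate, so it illustrates the policies rather than modeling a real cache. The operation encoding and function name are hypothetical.

```python
def memory_writes(ops, policy):
    """Count writes to memory for "write_through" or "write_back".

    ops is a list of ("w", block) write hits and ("evict", block) evictions.
    """
    dirty = set()   # blocks modified in cache but not yet written to memory
    writes = 0
    for op, block in ops:
        if op == "w":
            if policy == "write_through":
                writes += 1          # every write goes to memory immediately
            else:
                dirty.add(block)     # write back: only mark the block dirty
        elif op == "evict" and block in dirty:
            dirty.discard(block)
            writes += 1              # dirty block is written once on eviction
    return writes


ops = [("w", 5), ("w", 5), ("w", 5), ("evict", 5)]
print("write through:", memory_writes(ops, "write_through"))
print("write back:   ", memory_writes(ops, "write_back"))
```

Repeated writes to the same block cost one memory write each under write through, but only a single write at eviction under write back.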