Cache - Swarthmore College
Cache
10/27/16
The Memory Hierarchy
Levels from top (smaller, faster, costlier per byte) to bottom (larger, slower, cheaper per byte):
• Registers: 1 cycle to access
• Cache(s) (SRAM): tens of cycles to access
• Main memory (DRAM): ~100 cycles to access
• Flash SSD / local network
• Local secondary storage (disk): ~100M cycles to access
• Remote secondary storage (tapes, the cloud): even slower than disk
Registers and caches are on-chip storage. CPU instructions can directly access registers, caches, and main memory.
Data Access Time over Years
Over time, the gap widens between DRAM, disk, and CPU speeds.
[Figure: access time in ns (log scale, 0.1 to 100,000,000) versus year (1980 to 2010). Curves: disk seek time, flash SSD access time, DRAM access time, SRAM access time, CPU cycle time, and effective CPU cycle time (multicore).]
• Really want to avoid going to disk for data.
• Want to avoid going to main memory for data.
Recall
• A cache is a smaller, faster memory that holds a subset of a larger (slower) memory.
• We take advantage of locality to keep data in cache as often as we can!
• When accessing memory, we check the cache to see if it has the data we're looking for.
Why cache misses occur
• Compulsory (cold-start) miss:
  • First time we use data, load it into cache.
• Capacity miss:
  • Cache is too small to store all the data we're using.
• Conflict miss:
  • To bring in new data to the cache, we evicted other data that we're still using.
Cache design
Questions:
• What data should be brought into the cache?
• Where in the cache should it go?
• What data should be evicted from the cache?
Goals:
• Maximize hit rate.
• Take advantage of temporal and spatial locality.
• Minimize hardware complexity.
Caching Terminology
• Block: the size of a single cache data storage unit. A block is some number of bytes (from contiguous memory addresses).
  • Data gets transferred into cache in entire blocks (no partial blocks).
  • Lower levels may have larger block sizes.
• Line: a single cache entry:
  • data (block) + identifying information + other state
• Hit: the sought data are found in the cache.
  • L1: typically ~95% hit rate
• Miss: the sought data are not found in the cache.
  • Fetch from lower levels.
• Replacement: moving a value out of a cache to make room for a new value in its place.
Cache basics
[Table: lines 0 through 1023, each with the columns Line | metadata | address info | data block.]
Each line stores some data, plus information about what memory address the data came from.
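Suppose the CPU asks for data and it's not in the cache. We need to move it into the cache from memory. Where in the cache should it be allowed to go?
A. In exactly one place.
B. In a few places.
C. In most places, but not all.
D. Anywhere in the cache.
[Diagram: CPU (ALU, registers, cache) connected to main memory over the memory bus, with candidate cache locations marked "?".]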
A. In exactly one place. ("Direct-mapped")
  • Every location in memory is directly mapped to one place in the cache. Easy to find data.
B. In a few places. ("Set associative")
  • A memory location can be mapped to (2, 4, 8) locations in the cache. Middle ground.
C. In most places, but not all.
D. Anywhere in the cache. ("Fully associative")
  • No restrictions on where memory can be placed in the cache. Fewer conflict misses, more searching.
A larger block size (caching memory in larger chunks) is likely to exhibit…
A. Better temporal locality
B. Better spatial locality
C. Fewer misses (better hit rate)
D. More misses (worse hit rate)
E. More than one of the above. (Which?)
Block Size Implications
• Small blocks
  • Room for more blocks
  • Fewer conflict misses
• Large blocks
  • Fewer trips to memory
  • Longer transfer time
  • Fewer cold-start misses
[Diagram: two side-by-side copies of ALU/registers, cache, and main memory.]
Trade-offs
• There is no single best design for all purposes!
• Common systems question: which point in the design space should we choose?
• Given a particular scenario:
  • Analyze needs
  • Choose design that fits the bill
Real CPUs
• Goals: general purpose processing
  • balance needs of many use cases
  • middle of the road: jack of all trades, master of none
• Some associativity
  • 8-way associative (memory in one of eight places)
• Medium size blocks
  • 16 or 32-byte blocks
What should we use to determine whether or not data is in the cache?
A. The memory address of the data.
B. The value of the data.
C. The size of the data.
D. Some other aspect of the data.
Recall: How Memory Read Works
Load operation: movl (A), %eax
(1) CPU places address A on the memory bus.
(2) Memory sends back the value.
[Diagram: CPU chip (register file with %eax, ALU, cache, bus interface) connected through the I/O bridge to main memory, which holds value x at address A.]
Memory Address Tells Us…
• Is the block containing the byte(s) you want already in the cache?
• If not, where should we put that block?
  • Do we need to kick out ("evict") another block?
• Which byte(s) within the block do you want?
Memory Addresses
• Like everything else: series of bits (32 or 64)
• Keep in mind:
  • N bits gives us 2^N unique values.
• 32-bit address:
  • 10110001011100101101010001010110
Divide into regions, each with distinct meaning.
First: Direct-Mapped
• One place data can be.
• Example: let's assume some parameters:
  • 1024 cache locations (every block mapped to one)
  • Block size of 8 bytes
Direct-Mapped
[Table: lines 0 through 1023, each with the columns Line | V | D | Tag | Data (8 bytes). V, D, and Tag are metadata.]
Cache Metadata
• Valid bit: is the entry valid?
  • If set: data is correct, use it if we 'hit' in cache
  • If not set: ignore 'hits', the data is garbage
• Dirty bit: has the data been written?
  • Used by write-back caches
  • If set, need to update memory before eviction
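A minimal C sketch of one cache line carrying the metadata described above. The struct name `cache_line`, its field names, the 8-byte `BLOCK_BYTES`, and the helper `needs_writeback` are illustrative assumptions, not names from the slides.

```c
#include <stdbool.h>
#include <stdint.h>

#define BLOCK_BYTES 8  /* assumed block size, matching the direct-mapped example */

/* One cache line: metadata (valid, dirty, tag) plus the data block. */
struct cache_line {
    bool     valid;              /* set: the tag and data are meaningful */
    bool     dirty;              /* set: data was written, memory is stale */
    uint32_t tag;                /* high-order address bits of the cached block */
    uint8_t  data[BLOCK_BYTES];  /* the cached block itself */
};

/* A line must be written to memory before eviction only if it is
 * both valid and dirty; an invalid line holds garbage. */
static bool needs_writeback(const struct cache_line *line)
{
    return line->valid && line->dirty;
}
```

A write-through cache would not need the dirty bit at all; it is kept here because the slides describe a write-back cache.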
Direct-Mapped
• Address division:
  • Identify byte in block
    • How many bits? 3 (8-byte blocks, and 2^3 = 8)
  • Identify which row (line)
    • How many bits? 10 (1024 lines, and 2^10 = 1024)
Direct-Mapped
• Address division: Tag (19 bits) | Index (10 bits) | Byte offset (3 bits)
Index: which line (row) should we check? Where could the data be?
Example: the index bits hold 4, so we check line 4.
Direct-Mapped
• Address division: Tag (19 bits) | Index (10 bits) | Byte offset (3 bits)
Example: tag = 4217, index = 4. Line 4 holds V = 1, tag = 4217.
In parallel, check:
• Tag: does the cache hold the data we're looking for, or some other block?
• Valid bit: if the entry is not valid, don't trust the garbage in that line (row).
If the tag doesn't match, or the line is invalid, it's a miss!
Direct-Mapped
• Address division: Tag (19 bits) | Index (10 bits) | Byte offset (3 bits)
Example: tag = 4217, index = 4, byte offset = 2. Line 4 holds V = 1, tag = 4217.
Byte offset tells us which subset of the block to retrieve (here, byte 2 of bytes 0 through 7).
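The address division above can be sketched in C with shifts and masks. This assumes the slide geometry (32-bit addresses, 1024 lines, 8-byte blocks); the names `addr_fields` and `split_address` are hypothetical.

```c
#include <stdint.h>

/* Geometry from the slides: 32-bit addresses, 1024 lines, 8-byte blocks. */
#define OFFSET_BITS 3   /* log2(8 bytes per block) */
#define INDEX_BITS  10  /* log2(1024 lines) */

struct addr_fields {
    uint32_t tag;     /* remaining 19 high-order bits */
    uint32_t index;   /* which line (row) to check */
    uint32_t offset;  /* which byte within the block */
};

/* Split an address into tag, index, and byte offset. */
static struct addr_fields split_address(uint32_t addr)
{
    struct addr_fields f;
    f.offset = addr & ((1u << OFFSET_BITS) - 1);
    f.index  = (addr >> OFFSET_BITS) & ((1u << INDEX_BITS) - 1);
    f.tag    = addr >> (OFFSET_BITS + INDEX_BITS);
    return f;
}
```

For the example on the slide, the address built as `(4217u << 13) | (4u << 3) | 2u` splits back into tag 4217, index 4, and offset 2.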
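[Diagram: direct-mapped lookup hardware. Input: memory address, divided into tag, index, and byte offset. The index selects one line (V, D, Tag, Data). The stored tag is compared (=) with the address tag, giving 0: miss or 1: hit. The byte offset selects the byte(s) of the data block to output.]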
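A software sketch of the lookup in the diagram, assuming the 1024-line, 8-byte-block cache from the earlier slides. The names `cache_line` and `cache_read_byte` are hypothetical. Hardware checks the valid bit and the tag in parallel; this code checks them one after the other.

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_LINES   1024
#define BLOCK_BYTES 8
#define OFFSET_BITS 3
#define INDEX_BITS  10

struct cache_line {
    bool     valid;
    bool     dirty;
    uint32_t tag;
    uint8_t  data[BLOCK_BYTES];
};

/* Returns true on a hit and stores the selected byte in *out.
 * Returns false on a miss (invalid line or tag mismatch). */
static bool cache_read_byte(const struct cache_line cache[NUM_LINES],
                            uint32_t addr, uint8_t *out)
{
    uint32_t offset = addr & (BLOCK_BYTES - 1);
    uint32_t index  = (addr >> OFFSET_BITS) & (NUM_LINES - 1);
    uint32_t tag    = addr >> (OFFSET_BITS + INDEX_BITS);

    const struct cache_line *line = &cache[index];
    if (!line->valid || line->tag != tag)
        return false;            /* 0: miss */
    *out = line->data[offset];   /* select byte */
    return true;                 /* 1: hit */
}
```

On a miss the caller would fetch the block from memory and fill the line; that step is left out of this sketch.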
Direct-Mapped Example
• Suppose our addresses are 16 bits long.
• Our cache has 16 entries, block size of 16 bytes
  • 4 bits in address for the index
  • 4 bits in address for byte offset
  • Remaining bits (8): tag
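The bit counts above follow from the cache parameters. A small sketch, assuming power-of-two sizes; `geometry`, `ilog2`, and `direct_mapped_geometry` are hypothetical names.

```c
#include <assert.h>

struct geometry {
    unsigned offset_bits;
    unsigned index_bits;
    unsigned tag_bits;
};

/* log2 of a power of two, e.g. ilog2(16) == 4. */
static unsigned ilog2(unsigned x)
{
    assert(x != 0 && (x & (x - 1)) == 0);  /* must be a power of two */
    unsigned bits = 0;
    while (x > 1) {
        x >>= 1;
        bits++;
    }
    return bits;
}

/* Field widths for a direct-mapped cache. */
static struct geometry direct_mapped_geometry(unsigned addr_bits,
                                              unsigned num_lines,
                                              unsigned block_bytes)
{
    struct geometry g;
    g.offset_bits = ilog2(block_bytes);
    g.index_bits  = ilog2(num_lines);
    g.tag_bits    = addr_bits - g.index_bits - g.offset_bits;
    return g;
}
```

`direct_mapped_geometry(16, 16, 16)` gives 4 offset bits, 4 index bits, and 8 tag bits, matching this slide. `direct_mapped_geometry(32, 1024, 8)` gives 3, 10, and 19, matching the earlier example.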
Direct-Mapped Example
• Let's say we access memory at address:
  • 0110101100110100
• Step 1:
  • Partition address into tag, index, offset
[Table: lines 0 through 15, each with the columns Line | V | D | Tag | Data (16 bytes).]
Direct-Mapped Example
• Let's say we access memory at address:
  • 01101011 0011 0100 (tag, index, offset)
• Step 2:
  • Use index to find line (row)
  • 0011 -> 3
Direct-Mapped Example
• Let's say we access memory at address:
  • 01101011 0011 0100
• Note:
  • ANY address with 0011 (3) as the middle four index bits will map to this cache line.
  • e.g. 11111111 0011 0000
So, which data is here? Data from address 0110101100110100 OR 1111111100110000?
Use the tag to store the high-order bits. It lets us determine which data is here! (Many addresses map here.)
Direct-Mapped Example
• Let's say we access memory at address:
  • 01101011 0011 0100
• Step 3:
  • Check the tag
  • Is it 01101011 (hit)?
  • Something else (miss)?
  • (Must also ensure valid)
[Table: line 3 holds tag 01101011.]
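The three steps of this example can be checked with a few lines of C. The helper names (`tag_of`, `index_of`, `offset_of`, `conflicts`) are hypothetical. The slide's address 0110101100110100 is 0x6B34 in hex, and the second address 1111111100110000 is 0xFF30.

```c
#include <stdint.h>

/* 16-bit addresses, 16 lines, 16-byte blocks:
 * 8-bit tag, 4-bit index, 4-bit byte offset. */
static unsigned tag_of(uint16_t addr)    { return addr >> 8; }
static unsigned index_of(uint16_t addr)  { return (addr >> 4) & 0xF; }
static unsigned offset_of(uint16_t addr) { return addr & 0xF; }

/* Two addresses compete for a line when they share an index
 * but carry different tags. */
static int conflicts(uint16_t a, uint16_t b)
{
    return index_of(a) == index_of(b) && tag_of(a) != tag_of(b);
}
```

Here `index_of(0x6B34)` and `index_of(0xFF30)` are both 3, while the tags differ (0x6B and 0xFF), so `conflicts(0x6B34, 0xFF30)` is nonzero. Only the stored tag tells the two blocks apart.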
Eviction
• If we don't find what we're looking for (miss), we need to bring in the data from memory.
• Make room by kicking something out.
  • If the line to be evicted is dirty, write it to memory first.
• Another important systems distinction:
  • Mechanism: an ability or feature of the system. What you can do.
  • Policy: governs the decision making for using the mechanism. What you should do.
Eviction for direct-mapped cache
• Mechanism: overwrite bits in cache line, updating
  • Valid bit
  • Tag
  • Data
• Policy: not many options for direct-mapped
  • Overwrite at the only location it could be!
Eviction: Direct-Mapped
• Address division: Tag (19 bits) | Index (10 bits) | Byte offset (3 bits)
Requested address: tag = 3941, index = 1020.
Find line: line 1020 holds V = 1, D = 0, tag = 1323, data = 57883.
Tag doesn't match, bring in from memory. If dirty, write back first!
1. Send address to read main memory.
2. Copy data from memory. Update tag.
After the fill, line 1020 holds V = 1, D = 0, tag = 3941, data = 92.
Suppose we had 8-bit addresses, a cache with 8 lines, and a block size of 4 bytes.
• How many bits would we use for:
  • Tag?
  • Index?
  • Offset?
How many of these operations change the cache? How many access memory?
Read 01000100 (Value: 5)
Read 11100010 (Value: 17)
Write 01110000 (Value: 7)
Read 10101010 (Value: 12)
Write 01101100 (Value: 2)
Line | V | D | Tag | Data (4 bytes)
0 | 1 | 0 | 111 | 17
1 | 1 | 0 | 011 | 9
2 | 0 | 0 | 101 | 15
3 | 1 | 1 | 001 | 8
4 | 1 | 0 | 011 | 4
5 | 0 | 0 | 111 | 6
6 | 0 | 0 | 101 | 32
7 | 1 | 0 | 110 | 3
A. 1  B. 2  C. 3  D. 4  E. 5
Stepping through…
With 8-bit addresses, 8 lines, and 4-byte blocks, each address divides into a 3-bit tag, a 3-bit index, and a 2-bit offset.
1. Read 01000100 (Value: 5): index 001, tag 010. Line 1 holds tag 011, so this is a miss. Load from memory: tag 011 -> 010, data 9 -> 5.
2. Read 11100010 (Value: 17): index 000, tag 111. Line 0 is valid and holds tag 111, so this is a hit. No change necessary.
3. Write 01110000 (Value: 7): index 100, tag 011. Line 4 is valid and holds tag 011, so this is a hit. Data 4 -> 7, dirty bit 0 -> 1.
4. Read 10101010 (Value: 12): index 010, tag 101. Note: tag happened to match, but line was invalid, so this is a miss. Valid bit 0 -> 1, data 15 -> 12.
5. Write 01101100 (Value: 2): index 011, tag 011. Line 3 holds tag 001 and is dirty, so this is a miss that evicts a dirty line.
  1. Write dirty line to memory.
  2. Load new value, set it to 2, mark it dirty (write). Tag 001 -> 011, data 8 -> 2.
Cache after all five operations:
Line | V | D | Tag | Data (4 bytes)
0 | 1 | 0 | 111 | 17
1 | 1 | 0 | 010 | 5
2 | 1 | 0 | 101 | 12
3 | 1 | 1 | 011 | 2
4 | 1 | 1 | 011 | 7
5 | 0 | 0 | 111 | 6
6 | 0 | 0 | 101 | 32
7 | 1 | 0 | 110 | 3
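The walk-through above can be replayed with a small simulation. This is a sketch under stated assumptions: the cache is write-back and write-allocate (a write miss loads the block, then writes it), each block is modeled as one value as on the slides, and the names `line`, `stats`, `cache_access`, and `run_slide_example` are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_LINES 8

struct line {
    bool    valid;
    bool    dirty;
    uint8_t tag;    /* 3 bits */
    int     data;   /* the slides show one value per block */
};

struct stats {
    int cache_changes;    /* operations that modified a line */
    int memory_accesses;  /* operations that touched main memory */
    int writebacks;       /* dirty lines written back on eviction */
};

/* One read or write. `value` is what memory holds (read) or the
 * value being stored (write). */
static void cache_access(struct line cache[NUM_LINES], struct stats *st,
                         bool is_write, uint8_t addr, int value)
{
    unsigned index = (addr >> 2) & 0x7;  /* skip the 2 offset bits */
    unsigned tag   = addr >> 5;
    struct line *l = &cache[index];
    bool hit = l->valid && l->tag == tag;

    if (hit && !is_write)
        return;                          /* read hit: no change */
    if (!hit) {
        if (l->valid && l->dirty)
            st->writebacks++;            /* write dirty line to memory */
        st->memory_accesses++;           /* fetch the new block */
        l->valid = true;
        l->dirty = false;
        l->tag   = (uint8_t)tag;
    }
    l->data = value;
    if (is_write)
        l->dirty = true;
    st->cache_changes++;
}

/* Replays the five operations against the slide's initial table. */
static struct stats run_slide_example(void)
{
    struct line cache[NUM_LINES] = {
        /* V  D  tag  data */
        { 1, 0, 0x7, 17 }, { 1, 0, 0x3,  9 },
        { 0, 0, 0x5, 15 }, { 1, 1, 0x1,  8 },
        { 1, 0, 0x3,  4 }, { 0, 0, 0x7,  6 },
        { 0, 0, 0x5, 32 }, { 1, 0, 0x6,  3 },
    };
    struct stats st = { 0, 0, 0 };
    cache_access(cache, &st, false, 0x44,  5);  /* Read  01000100 */
    cache_access(cache, &st, false, 0xE2, 17);  /* Read  11100010 */
    cache_access(cache, &st, true,  0x70,  7);  /* Write 01110000 */
    cache_access(cache, &st, false, 0xAA, 12);  /* Read  10101010 */
    cache_access(cache, &st, true,  0x6C,  2);  /* Write 01101100 */
    return st;
}
```

Under these assumptions `run_slide_example()` reports 4 operations that change the cache, 3 that access memory, and 1 write-back. A write-through or no-write-allocate cache would give different counts.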
Question…
When might a direct-mapped cache be a bad idea?
When two blocks we use a lot have the same index.
The other extreme: fully associative
+ Any block can go in any cache line.
+ Reduces cache misses.
- Have to check every line for matching address.
- Need to store more bits of the address.
- Eviction decisions are harder.
Compromise: set associative
• Each line can hold N blocks.
• Addresses are mapped to a line, but can go in any of that line's N blocks.
Comparison: 1024 Lines (For the same cache size, in bytes of data.)
• Direct-mapped: 1024 indices (10 bits)
• 2-way set associative: 512 sets (9 bits). Tag is 1 bit larger.
[Table: sets 0 through 511, each holding two entries with the columns V | D | Tag | Data (8 bytes).]
2-Way Set Associative
• Address division: Tag (20 bits) | Set (9 bits) | Byte offset (3 bits)
Same capacity as previous example: 1024 rows with 1 entry vs. 512 rows with 2 entries.
Example: tag = 3941, set = 4. Set 4 holds two entries: (V = 1, D = 1, tag = 4063) and (V = 1, D = 0, tag = 3941).
Check all locations in the set, in parallel.
Multiplexer: select correct value (bytes 0 through 7 of the entry whose tag matched).
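A sketch of the 2-way lookup in C, assuming the geometry above (512 sets, 8-byte blocks, 20-bit tag). The names `cache_set` and `find_way` are hypothetical. Hardware compares both tags in parallel and a multiplexer picks the matching entry; the loop below checks the ways one at a time.

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_SETS    512
#define WAYS        2
#define BLOCK_BYTES 8
#define OFFSET_BITS 3
#define SET_BITS    9

struct cache_line {
    bool     valid;
    bool     dirty;
    uint32_t tag;                /* 20 bits for 32-bit addresses */
    uint8_t  data[BLOCK_BYTES];
};

struct cache_set {
    struct cache_line way[WAYS];
};

/* Returns the way that hit (0 or 1), or -1 on a miss. */
static int find_way(const struct cache_set sets[NUM_SETS], uint32_t addr)
{
    uint32_t set = (addr >> OFFSET_BITS) & (NUM_SETS - 1);
    uint32_t tag = addr >> (OFFSET_BITS + SET_BITS);

    for (int w = 0; w < WAYS; w++) {
        const struct cache_line *line = &sets[set].way[w];
        if (line->valid && line->tag == tag)
            return w;
    }
    return -1;
}
```

With set 4 filled as on the slide (tags 4063 and 3941), an address with tag 3941 and set 4 hits in the entry holding 3941, and any other tag misses.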
4-Way Set Associative Cache
Clearly, more complexity here!
Eviction
• Mechanism is the same…
  • Overwrite bits in cache line: update tag, valid, data
• Policy: choose which line in the set to evict
  • Option 1: Pick a random line in set
  • Option 2: Choose an invalid line first
  • Option 3: Choose the least recently used block
    • Has exhibited the least locality, kick it out!
  • Option 4: first 2 then 3
Least Recently Used (LRU)
• Intuition: if it hasn't been used in a while, we have no reason to believe it will be used soon.
• Need extra state to keep track of LRU info.
[Table: each set gains an LRU field next to its two entries (V | D | Tag | Data (8 bytes)). Sets 0 through 4 show LRU values 0, 1, 1, 0, 1; set 4 holds the entries (V = 1, D = 1, tag = 4063) and (V = 1, D = 0, tag = 3941).]
• For perfect LRU info:
  • 2-way: 1 bit
  • 4-way: 8 bits
  • N-way: N * log2 N bits
Another reason why associativity often maxes out at 8 or 16.
These are metadata bits, not "useful" program data storage.
(Approximations make it not quite as bad.)
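A sketch of victim selection for a 2-way set, following the "invalid line first, then least recently used" policy. It assumes a single LRU bit per set that stores the index of the least recently used way; the names `set_state`, `choose_victim`, and `touch` are hypothetical.

```c
#include <stdbool.h>

#define WAYS 2

struct way_state {
    bool valid;
};

struct set_state {
    struct way_state way[WAYS];
    unsigned lru;  /* index of the least recently used way (1 bit) */
};

/* Pick the way to overwrite on a miss: an invalid way if there is
 * one, otherwise the least recently used way. */
static unsigned choose_victim(const struct set_state *s)
{
    for (unsigned w = 0; w < WAYS; w++)
        if (!s->way[w].valid)
            return w;
    return s->lru;
}

/* Record a use of way `w` (hit or fill): the other way becomes
 * the least recently used one. */
static void touch(struct set_state *s, unsigned w)
{
    s->lru = 1u - w;
}
```

One bit is enough here because a 2-way set has only two possible orderings. Higher associativity needs more state per set, which is the cost the slide describes.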
How would the cache change if we performed the following memory operations? (2-way set)
Read 01000100 (Value: 5)
Read 11100010 (Value: 17)
Write 01100100 (Value: 7)
Read 01000110 (Value: 5)
Write 01100000 (Value: 2)
Set # | LRU | V | D | Tag | Data (4 bytes) | V | D | Tag | Data (4 bytes)
0 | 1 | 0 | 0 | 111 | 4 | 1 | 0 | 001 | 17
1 | 0 | 1 | 1 | 111 | 9 | 1 | 0 | 010 | 5
2 through 7 | … | … | … | … | … | … | … | … | …
LRU of 0 means the left line in the set was least recently used. 1 means the right line was used least recently.
Cache Conscious Programming
• Knowing about caching and designing code around it can significantly affect performance.
(ex) 2D array accesses
Algorithmically, both are O(N * M). Is one faster than the other?
Loop A:
for (i = 0; i < N; i++) {
    for (j = 0; j < M; j++) {
        sum += arr[i][j];
    }
}
Loop B:
for (j = 0; j < M; j++) {
    for (i = 0; i < N; i++) {
        sum += arr[i][j];
    }
}
A. Loop A is faster.
B. Loop B is faster.
C. Both would exhibit roughly equal performance.
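Cache Conscious Programming
The first nested loop is more efficient if the cache block size is larger than a single array bucket (for arrays of basic C types, it will be).
(ex) 1 miss every 4 buckets vs. 1 miss every bucket
First loop (row by row):
for (i = 0; i < N; i++) {
    for (j = 0; j < M; j++) {
        sum += arr[i][j];
    }
}
Second loop (column by column):
for (j = 0; j < M; j++) {
    for (i = 0; i < N; i++) {
        sum += arr[i][j];
    }
}
[Figure: access order in the array. The first loop visits buckets 1, 2, 3, ..., 16, ... along each row in turn. The second loop visits buckets 1, 2, 3, 4, ... down each column.]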
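The "1 miss every 4 buckets vs. 1 miss every bucket" claim can be checked with a toy miss counter. Everything about the cache here is an assumption chosen for illustration: a direct-mapped cache with 1024 lines of 16 bytes, a 256 x 256 array of 4-byte integers starting at address 0, and the hypothetical names `access_misses` and `count_misses`.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ROWS        256
#define COLS        256
#define NUM_LINES   1024  /* assumed direct-mapped cache, 16 KiB of data */
#define BLOCK_BYTES 16    /* four 4-byte buckets per block */

static bool     valid[NUM_LINES];
static uint32_t tags[NUM_LINES];

/* Simulate one access to byte address `addr`; true means a miss. */
static bool access_misses(uint32_t addr)
{
    uint32_t block = addr / BLOCK_BYTES;
    uint32_t index = block % NUM_LINES;
    uint32_t tag   = block / NUM_LINES;
    if (valid[index] && tags[index] == tag)
        return false;
    valid[index] = true;  /* fill the line with the new block */
    tags[index]  = tag;
    return true;
}

/* Count misses while visiting every element of a row-major array,
 * either row by row (row_major) or column by column. */
static long count_misses(bool row_major)
{
    long misses = 0;
    for (size_t k = 0; k < NUM_LINES; k++)
        valid[k] = false;  /* start with a cold cache */

    for (uint32_t outer = 0; outer < (row_major ? ROWS : COLS); outer++) {
        for (uint32_t inner = 0; inner < (row_major ? COLS : ROWS); inner++) {
            uint32_t i = row_major ? outer : inner;
            uint32_t j = row_major ? inner : outer;
            uint32_t addr = (i * COLS + j) * (uint32_t)sizeof(int32_t);
            misses += access_misses(addr);
        }
    }
    return misses;
}
```

With these parameters `count_misses(true)` returns 16384 (one miss per four elements) and `count_misses(false)` returns 65536 (a miss on every element). The column-order result depends on the array being much larger than the cache; a small array that fits entirely would miss only once per block in either order.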
A caveat: Amdahl's Law
Idea: an optimization can improve total runtime at most by the fraction it contributes to total runtime.
If a program takes 100 secs to run, and you optimize a portion of the code that accounts for 2% of the runtime, the best your optimization can do is improve the runtime by 2 secs.
Amdahl's Law tells us to focus our optimization efforts on the code that matters:
Speed up what is accounting for the largest portion of runtime to get the largest benefit. And, don't waste time on the small stuff.
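The arithmetic behind the 100-second example, as a small C sketch. The function name `optimized_runtime` and its parameters are hypothetical; `fraction` is the share of the original runtime spent in the optimized code and `speedup` is how many times faster that code becomes.

```c
/* Runtime after speeding up `fraction` of a program by `speedup`x.
 * The untouched part still takes its full time. */
static double optimized_runtime(double total_secs, double fraction,
                                double speedup)
{
    double untouched = total_secs * (1.0 - fraction);
    double optimized = total_secs * fraction / speedup;
    return untouched + optimized;
}
```

Doubling the speed of code that accounts for 2% of a 100-second run gives about 99 seconds, and no speedup of that code can get below 98 seconds. Making code that accounts for 90% of the run 10 times faster gives about 19 seconds.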
"Premature optimization is the root of all evil." - Donald Knuth