Post on 04-Jan-2016

OpenRISC 1000

Yung-Luen Lan, b9090102

Cache Model

• Perspective of the Programming Model.

• Hence, hardware implementation details (cache organization, size, etc.) are outside the scope of this topic.

• The implementation should also work without a cache.

Why Cache?

• Memory Hierarchy

Cache Concept

• How do we know if a data item is in the cache?

• If it is, how do we find it?
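One common answer to both questions is to split the address into a tag, an index, and an offset. The following is a minimal sketch of a direct-mapped lookup, not the OpenRISC organization (which the slides leave unspecified). The block size, the number of entries, and all names are assumptions chosen for illustration.

```python
BLOCK_SIZE = 16   # bytes per cache block (assumed)
NUM_SETS = 8      # number of entries in the cache (assumed)


def split_address(addr):
    """Split an address into (tag, index, offset)."""
    offset = addr % BLOCK_SIZE
    index = (addr // BLOCK_SIZE) % NUM_SETS
    tag = addr // (BLOCK_SIZE * NUM_SETS)
    return tag, index, offset


class DirectMappedCache:
    def __init__(self):
        # One (valid, tag) pair per index.
        self.valid = [False] * NUM_SETS
        self.tags = [0] * NUM_SETS

    def lookup(self, addr):
        """Return True on a hit; on a miss, fill the entry and return False."""
        tag, index, _ = split_address(addr)
        if self.valid[index] and self.tags[index] == tag:
            return True
        self.valid[index] = True
        self.tags[index] = tag
        return False


cache = DirectMappedCache()
first = cache.lookup(0x1234)                             # cold miss
second = cache.lookup(0x1238)                            # same block: hit
conflict = cache.lookup(0x1234 + BLOCK_SIZE * NUM_SETS)  # same index, new tag: miss
```

The index selects where to look, and the stored tag together with the valid bit tells whether the item found there is the one requested.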

Map to memory

Cache Coherency

• Multiple processors

Multiprocessor Cache Coherency

• Extra snooping hardware to detect the action of other processors.

• Snoopy protocol
– Write-invalidate
– Write-update

Snoopy Protocol

• Write-invalidate: The writing processor causes all copies in other caches to be invalidated before changing its local copy. The writing processor issues an invalidation signal over the bus, and all caches check whether they hold a copy. If so, they must invalidate the block.

• Write-update: The writing processor broadcasts the new data over the bus, and all copies are then updated with the new value.
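The difference between the two policies can be sketched with a toy bus model. This is an illustration only: the class `SnoopyBus`, its methods, and the use of write-through to memory (to keep the model short) are assumptions, not part of the slides or of any real protocol implementation.

```python
class SnoopyBus:
    """Toy shared bus; each cache is a dict mapping address to value."""

    def __init__(self, num_caches, policy):
        assert policy in ("write-invalidate", "write-update")
        self.policy = policy
        self.caches = [dict() for _ in range(num_caches)]
        self.memory = {}

    def read(self, cpu, addr):
        cache = self.caches[cpu]
        if addr not in cache:                      # miss: fetch from memory
            cache[addr] = self.memory.get(addr, 0)
        return cache[addr]

    def write(self, cpu, addr, value):
        # Every other cache snoops the bus and reacts if it holds a copy.
        for other, cache in enumerate(self.caches):
            if other == cpu or addr not in cache:
                continue
            if self.policy == "write-invalidate":
                del cache[addr]                    # drop the stale copy
            else:
                cache[addr] = value                # take the broadcast value
        self.caches[cpu][addr] = value
        self.memory[addr] = value                  # write-through, for simplicity


inv = SnoopyBus(2, "write-invalidate")
inv.read(0, 0x40)
inv.read(1, 0x40)
inv.write(0, 0x40, 7)
copy_after_invalidate = 0x40 in inv.caches[1]      # False: copy was invalidated

upd = SnoopyBus(2, "write-update")
upd.read(0, 0x40)
upd.read(1, 0x40)
upd.write(0, 0x40, 7)
copy_after_update = upd.caches[1][0x40]            # 7: copy was updated in place
```

Under write-invalidate the second processor takes a miss on its next read and refetches the block; under write-update it keeps a valid copy at the cost of more bus traffic per write.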

Example

Example(cont.)

Cache Special-Purpose Register

DCCR

• Data Cache Control Register

• Accessible with the l.mtspr/l.mfspr instructions in supervisor mode.

• Enable Ways:

• 0000 0000 All ways disabled/locked

• 1111 1111 All ways enabled/unlocked

ICCR

• Instruction Cache Control Register

• Accessible with the l.mtspr/l.mfspr instructions in supervisor mode.

• Enable Ways:

• 0000 0000 All ways disabled/locked

• 1111 1111 All ways enabled/unlocked

Cache Control Register (cont.)

• If the cache does not implement way locking, the DCCR/ICCR is not required to be implemented.
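The Enable Ways field can be read as a bitmask with one bit per way. The sketch below assumes that bit i of the low 8 bits controls way i, which the slides do not state explicitly; the helper names are hypothetical and only manipulate the register value, they do not access hardware.

```python
NUM_WAYS = 8  # the slides show an 8-bit Enable Ways field


def way_enabled(ccr, way):
    """True if the bit for `way` is set (way enabled/unlocked)."""
    return ((ccr >> way) & 1) == 1


def lock_way(ccr, way):
    """Return the register value with `way` disabled/locked."""
    return ccr & ~(1 << way) & 0xFF


def unlock_way(ccr, way):
    """Return the register value with `way` enabled/unlocked."""
    return (ccr | (1 << way)) & 0xFF


all_enabled = 0b11111111
one_locked = lock_way(all_enabled, 2)   # 0b11111011
```

On real hardware the resulting value would be written to DCCR or ICCR with l.mtspr in supervisor mode.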

Cache Management

• Memory accesses caused by cache management are not recorded (unlike load or store instructions) and cannot invoke any exception.

• Instruction caches do not need to be coherent with the memory or caches of other processors. Software must make the instruction cache coherent with modified instructions in the memory. A typical way to accomplish this is:
– Data cache block write-back (update of the memory)
– l.csync (wait for update to finish)
– Instruction cache block invalidate (clear instruction cache block)
– Flush pipeline
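The four steps above can be sketched with a small software model of one core. This is not OpenRISC code: the `Core` class and its methods are hypothetical, l.csync and the pipeline flush are instantaneous in the model and appear only as comments, and the data cache is assumed to be write-back.

```python
class Core:
    def __init__(self, memory):
        self.memory = memory
        self.dcache = {}   # address -> (value, modified)
        self.icache = {}   # address -> value

    def store(self, addr, value):
        # Write-back data cache: memory is not updated yet.
        self.dcache[addr] = (value, True)

    def fetch(self, addr):
        if addr not in self.icache:
            self.icache[addr] = self.memory[addr]
        return self.icache[addr]

    def dcache_write_back(self, addr):
        if addr in self.dcache:
            value, modified = self.dcache[addr]
            if modified:
                self.memory[addr] = value
                self.dcache[addr] = (value, False)

    def icache_invalidate(self, addr):
        self.icache.pop(addr, None)

    def sync_instruction(self, addr):
        self.dcache_write_back(addr)   # 1. update memory
        # 2. l.csync: wait for the write-back to finish (no-op in this model)
        self.icache_invalidate(addr)   # 3. clear the stale instruction block
        # 4. flush pipeline (nothing to model here)


core = Core({0x100: "old"})
core.fetch(0x100)                 # instruction cache now holds "old"
core.store(0x100, "new")          # modified instruction sits in the data cache
stale = core.fetch(0x100)         # still "old": caches are not coherent
core.sync_instruction(0x100)
fresh = core.fetch(0x100)         # "new" after the sequence
```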

Data Cache Block Prefetch

• Optional special-purpose register.

• Accessible with the l.mtspr/l.mfspr instructions in both user and supervisor modes.

• 32 bits/64 bits

• The DCBPR is written with the effective address and the corresponding block from memory is prefetched into the cache. (write only)

Data Cache Block Flush

• The DCBFR is written with the effective address.

• If coherency is required, then the corresponding:
– Unmodified data cache block is invalidated in all processors.
– Modified data cache block is written back to memory and invalidated in all processors.
– Missing data cache block in the local processor causes a modified copy of the block in another processor to be written back to memory and invalidated. If other processors hold unmodified copies of the block, these are simply invalidated in all processors.

• If coherency is not required, then the corresponding:
– Unmodified data cache block in the local processor is invalidated.
– Modified data cache block is written back to memory and invalidated in the local processor.
– Missing cache block in the local processor does not cause any action.

Data Cache Block Invalidate

• The DCBIR is written with the effective address.

• If coherency is required, then the corresponding:
– Unmodified data cache block is invalidated in all processors.
– Modified data cache block is invalidated in all processors.
– Missing data cache block in the local processor causes the data cache blocks in other processors to be invalidated.

• If coherency is not required, then the corresponding:
– Unmodified data cache block in the local processor is invalidated.
– Modified data cache block in the local processor is invalidated.
– Missing cache block in the local processor does not cause any action.

Data Cache Block Write-Back

• The DCBWR is written with the effective address.

• If coherency is required, then:
– the corresponding data cache block in any of the processors is written back to memory if it was modified.

• If coherency is not required, then:
– the corresponding data cache block in the local processor is written back to memory if it was modified.
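The three data cache block operations (flush, invalidate, write-back) differ in what happens to a modified block. The sketch below models only the local, non-coherent case described in the slides; the class `LocalDataCache` and the `demo` helper are hypothetical names, and writing the DCBFR, DCBIR, or DCBWR register is represented by a plain method call.

```python
class LocalDataCache:
    def __init__(self, memory):
        self.memory = memory
        self.blocks = {}   # address -> [value, modified]

    def store(self, addr, value):
        self.blocks[addr] = [value, True]

    def write_back(self, addr):
        """DCBWR: copy a modified block to memory, keep it cached."""
        block = self.blocks.get(addr)
        if block is not None and block[1]:
            self.memory[addr] = block[0]
            block[1] = False

    def invalidate(self, addr):
        """DCBIR: drop the block, modified or not, without updating memory."""
        self.blocks.pop(addr, None)

    def flush(self, addr):
        """DCBFR: write back if modified, then invalidate."""
        self.write_back(addr)
        self.invalidate(addr)


def demo(operation):
    """Apply one operation to a modified block; return (memory value, still cached)."""
    memory = {0x20: 1}
    cache = LocalDataCache(memory)
    cache.store(0x20, 2)
    getattr(cache, operation)(0x20)
    return memory[0x20], 0x20 in cache.blocks


flush_result = demo("flush")             # (2, False)
invalidate_result = demo("invalidate")   # (1, False): the modification is lost
write_back_result = demo("write_back")   # (2, True)
```

In all three cases a missing block causes no action in this model, matching the non-coherent behavior listed above.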

Data Cache Block Lock

• Optional

• The DCBLR is written with the effective address. The corresponding data cache block in the local processor is locked.

• If all blocks of the same set in all cache ways are locked, then the cache refill may automatically unlock the least-recently used block.
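The interaction between locking and refill can be sketched for a single set. This model assumes LRU replacement among unlocked blocks and takes the "may automatically unlock" option from the slide; an actual implementation is free to behave differently, and the `CacheSet` class is hypothetical.

```python
class CacheSet:
    def __init__(self, num_ways):
        self.num_ways = num_ways
        self.blocks = []   # [tag, locked] pairs, least-recently used first

    def access(self, tag):
        """Return True on a hit; on a miss, refill and return False."""
        for i, block in enumerate(self.blocks):
            if block[0] == tag:
                self.blocks.append(self.blocks.pop(i))   # mark most-recently used
                return True
        self._refill(tag)
        return False

    def lock(self, tag):
        for block in self.blocks:
            if block[0] == tag:
                block[1] = True

    def _refill(self, tag):
        if len(self.blocks) >= self.num_ways:
            # Prefer the least-recently used unlocked block as the victim.
            victim = next((b for b in self.blocks if not b[1]), None)
            if victim is None:
                # Every way is locked: unlock and replace the LRU block.
                victim = self.blocks[0]
            self.blocks.remove(victim)
        self.blocks.append([tag, False])

    def tags(self):
        return [block[0] for block in self.blocks]


s = CacheSet(2)
s.access("A")
s.access("B")
s.lock("A")
s.access("C")            # "B" is evicted; locked "A" survives although it is LRU
after_first = s.tags()   # ['A', 'C']
s.lock("C")
s.access("D")            # all ways locked: LRU block "A" is unlocked and replaced
after_second = s.tags()  # ['C', 'D']
```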

Instruction Cache Block Prefetch

• Optional

• The ICBPR is written with the effective address and the corresponding block from memory is prefetched into the instruction cache.

Instruction Cache Block Invalidate

• The ICBIR is written with the effective address.

• If coherency is required then the corresponding instruction cache blocks in all processors are invalidated.

• If coherency is not required then the corresponding instruction cache block is invalidated in the local processor.

Instruction Cache Block Lock

• Optional

• The ICBLR is written with the effective address. The corresponding instruction cache block in the local processor is locked.

• If all blocks of the same set in all cache ways are locked, then the cache refill may automatically unlock the least-recently used block.

• Missing cache block in the local processor does not cause any action.

Comparison

              Prefetch   Flush   Invalidate   Write-Back   Lock
Data          Optional   Yes     Yes          Yes          Optional
Instruction   Optional   No      Yes          No           Optional

Cache/Memory Coherency

• Synchronize.

• In systems that do not provide cache coherency with the PTE attributes (because they do not implement a memory management unit), it may be provided through explicit cache management.

• Cache coherency in systems with virtual memory can be provided on a page-by-page basis with PTE attributes. The attributes are:
– Cache Coherent (CC Attribute)
– Caching-Inhibited (CI Attribute)
– Write-Back Cache (WBC Attribute)

• When the memory/cache attributes are changed, the cache contents must reflect the new attribute settings. This usually means that cache blocks must be flushed or invalidated.

Pages Designated as Cache Coherent Pages

• CC=0: do not need cache coherency.

• CC=1: need cache coherency.

• To improve performance of uniprocessor systems, memory pages should not be designated as CC=1.

Pages Designated as Caching-Inhibited Pages

• CI=1: Memory accesses go directly to main memory, bypassing the cache.

• The target content should never be available in the cache.

• When the OS marks a page as CI, it should flush the corresponding cache blocks so that no stale copy is left in the cache by accident.

Pages Designated as Write-Back Cache Pages

• WBC=0: Write to both cache and memory (write-through).

• WBC=1: Write only to the local cache (write-back). Requires cache snooping hardware support.
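The effect of the WBC attribute on a store can be sketched as follows. The `PageCache` class is a hypothetical model of the cached blocks of one page, not an OpenRISC structure; the attribute is passed in as a plain boolean.

```python
class PageCache:
    """Toy model of cached blocks belonging to one page."""

    def __init__(self, memory, wbc):
        self.memory = memory
        self.wbc = wbc        # True: write-back page, False: write-through page
        self.lines = {}       # address -> value
        self.dirty = set()    # addresses modified in the cache only

    def store(self, addr, value):
        self.lines[addr] = value
        if self.wbc:
            self.dirty.add(addr)        # WBC=1: memory is updated later
        else:
            self.memory[addr] = value   # WBC=0: memory is updated immediately

    def write_back_all(self):
        for addr in sorted(self.dirty):
            self.memory[addr] = self.lines[addr]
        self.dirty.clear()


through_mem = {0x10: 0}
through = PageCache(through_mem, wbc=False)
through.store(0x10, 5)
through_seen = through_mem[0x10]     # 5: memory already updated

back_mem = {0x10: 0}
back = PageCache(back_mem, wbc=True)
back.store(0x10, 5)
back_seen_before = back_mem[0x10]    # 0: memory is stale until a write-back
back.write_back_all()
back_seen_after = back_mem[0x10]     # 5
```

The stale window in the write-back case is why other processors need snooping hardware, or explicit cache management, to observe the new value.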

Debug Unit (optional)

Power Management (optional)

• Slow down feature

• Doze mode

• Sleep mode

• Suspend mode

• Dynamic clock gating feature
