OpenRISC 1000 Yung-Luen Lan, b9090102

Cache Model

• Perspective of the Programming Model.

• Hence, hardware implementation details (cache organization, size, etc.) are invisible from this perspective.

• The implementation should work without cache.

Why Cache?

• Memory Hierarchy

Cache Concept

• How do we know if a data item is in the cache?

• If it is, how do we find it?
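Both questions are commonly answered by splitting the address into tag, index, and offset fields. The following is a minimal sketch for a direct-mapped cache; the block size, the number of lines, and every name in it are illustrative assumptions, not values from the OpenRISC 1000 specification.

```python
# Toy direct-mapped cache lookup. BLOCK_SIZE, NUM_LINES and all names
# are assumptions chosen for illustration only.
BLOCK_SIZE = 16   # bytes per cache block
NUM_LINES = 8     # number of blocks the cache can hold


def split_address(addr):
    """Split an address into (tag, index, offset)."""
    offset = addr % BLOCK_SIZE
    block_number = addr // BLOCK_SIZE
    index = block_number % NUM_LINES      # which cache line to look in
    tag = block_number // NUM_LINES       # identifies the block stored there
    return tag, index, offset


class DirectMappedCache:
    def __init__(self):
        # Each line remembers the tag of the block it holds (None = invalid).
        self.tags = [None] * NUM_LINES

    def lookup(self, addr):
        """Return True on a hit; on a miss, refill the line and return False."""
        tag, index, _ = split_address(addr)
        if self.tags[index] == tag:
            return True
        self.tags[index] = tag            # refill replaces the previous block
        return False


cache = DirectMappedCache()
first = cache.lookup(0x1234)                            # cold miss
second = cache.lookup(0x1238)                           # same block: hit
conflict = cache.lookup(0x1234 + BLOCK_SIZE * NUM_LINES)  # same index, new tag: miss
```

The index tells the cache where to look, and the stored tag tells it whether the block found there is the one requested.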

Map to memory

Cache Coherency

• Multiple processors

Multiprocessor Cache Coherency

• Extra snooping hardware to detect the action of other processors.

• Snoopy protocol
– Write-invalidate
– Write-update

Snoopy Protocol

• Write-invalidate: The writing processor causes all copies in other caches to be invalidated before changing its local copy. The writing processor issues an invalidation signal over the bus, and all caches check whether they hold a copy. If so, they must invalidate the block.

• Write-update: The writing processor broadcasts the new data over the bus, and all copies are then updated with the new value.
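The difference between the two policies can be sketched with a toy bus model. This is only an illustration under simplifying assumptions: each cache is a dictionary, every write also goes straight to memory, and the class and attribute names are hypothetical.

```python
# Toy bus-snooping model. Each cache is a dict {address: value}.
# All names are hypothetical and exist only for this sketch.
class SnoopyBus:
    def __init__(self, num_caches, policy):
        if policy not in ("write-invalidate", "write-update"):
            raise ValueError("unknown policy")
        self.policy = policy
        self.caches = [dict() for _ in range(num_caches)]
        self.memory = {}

    def read(self, cpu, addr):
        cache = self.caches[cpu]
        if addr not in cache:                    # miss: fetch from memory
            cache[addr] = self.memory.get(addr, 0)
        return cache[addr]

    def write(self, cpu, addr, value):
        # Every other cache snoops the bus and reacts to the write.
        for other, cache in enumerate(self.caches):
            if other == cpu or addr not in cache:
                continue
            if self.policy == "write-invalidate":
                del cache[addr]                  # drop the stale copy
            else:
                cache[addr] = value              # refresh the copy in place
        self.caches[cpu][addr] = value
        self.memory[addr] = value                # write-through keeps the toy simple


inv = SnoopyBus(2, "write-invalidate")
inv.read(0, 0x100)
inv.read(1, 0x100)
inv.write(0, 0x100, 7)
inv_has_copy = 0x100 in inv.caches[1]   # copy in cache 1 was invalidated
inv_reread = inv.read(1, 0x100)         # miss, then refetch of the new value

upd = SnoopyBus(2, "write-update")
upd.read(0, 0x100)
upd.read(1, 0x100)
upd.write(0, 0x100, 7)
upd_copy = upd.caches[1].get(0x100)     # copy in cache 1 was updated in place
```

Under write-invalidate the other processor pays a miss on its next access; under write-update it keeps a valid copy at the cost of broadcasting the data on every write.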

Example

[Two figure slides; images not recoverable.]

Cache Special-Purpose Register

DCCR

• Data Cache Control Register

• Accessible with the l.mtspr/l.mfspr instructions in supervisor mode.

• Enable Ways:

– 0000 0000: all ways disabled/locked

– 1111 1111: all ways enabled/unlocked

ICCR

• Instruction Cache Control Register

• Accessible with the l.mtspr/l.mfspr instructions in supervisor mode.

• Enable Ways:

– 0000 0000: all ways disabled/locked

– 1111 1111: all ways enabled/unlocked

Cache Control Register (cont.)

• If the cache does not implement way locking, the DCCR/ICCR is not required to be implemented.
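The enable-ways field of DCCR and ICCR can be read as an 8-bit mask. The sketch below decodes such a mask; the mapping of bit i to way i is an assumption made for illustration, and the function name is hypothetical.

```python
# Interpreting an 8-bit "enable ways" field as a bitmask.
# Assumption: bit i controls way i (not confirmed by the slides).
def enabled_ways(ew):
    """Return the way numbers whose enable bit is set in the 8-bit field."""
    if not 0 <= ew <= 0xFF:
        raise ValueError("enable-ways field is 8 bits wide")
    return [way for way in range(8) if ew & (1 << way)]


all_locked = enabled_ways(0b00000000)     # no way enabled
all_unlocked = enabled_ways(0b11111111)   # every way enabled
partial = enabled_ways(0b00000101)        # only ways 0 and 2 enabled
```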

Cache Management

• Memory accesses caused by cache management are not recorded (unlike load or store instructions) and cannot invoke any exception.

• Instruction caches do not need to be coherent with the memory or caches of other processors. Software must make the instruction cache coherent with modified instructions in the memory. A typical way to accomplish this is:
– Data cache block write-back (update of the memory)
– l.csync (wait for update to finish)
– Instruction cache block invalidate (clear instruction cache block)
– Flush pipeline
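The four-step sequence can be sketched with a toy single-core model. All class and method names are hypothetical; the l.csync and pipeline-flush steps appear only as comments because the model has no asynchronous behavior.

```python
# Toy model of making the instruction cache coherent after code is
# modified through the data cache. Names are hypothetical.
class Core:
    def __init__(self, memory):
        self.memory = memory
        self.dcache = {}     # addr -> (value, dirty)
        self.icache = {}     # addr -> value

    def store(self, addr, value):
        # Write-back data cache: memory is not updated yet.
        self.dcache[addr] = (value, True)

    def fetch(self, addr):
        if addr not in self.icache:
            self.icache[addr] = self.memory[addr]
        return self.icache[addr]

    def dcache_write_back(self, addr):
        if addr in self.dcache:
            value, dirty = self.dcache[addr]
            if dirty:
                self.memory[addr] = value
                self.dcache[addr] = (value, False)

    def icache_invalidate(self, addr):
        self.icache.pop(addr, None)

    def sync_code(self, addr):
        self.dcache_write_back(addr)   # 1. update memory
        # 2. l.csync would wait here for the write-back to finish
        self.icache_invalidate(addr)   # 3. clear the stale instruction block
        # 4. a pipeline flush would discard already-fetched instructions


core = Core({0x40: "old"})
core.fetch(0x40)                 # instruction cache now holds "old"
core.store(0x40, "new")          # new code sits only in the data cache
stale = core.fetch(0x40)         # instruction cache still returns the old code
core.sync_code(0x40)
fresh = core.fetch(0x40)         # refetched from updated memory
```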

Data Cache Block Prefetch

• Optional special-purpose register.

• Accessible with the l.mtspr/l.mfspr instructions in both user and supervisor modes.

• 32 bits/64 bits

• The DCBPR is written with the effective address, and the corresponding block from memory is prefetched into the cache. (Write only.)

Data Cache Block Flush

• The DCBFR is written with the effective address.

• If coherency is required, then the corresponding:
– Unmodified data cache block is invalidated in all processors.
– Modified data cache block is written back to memory and invalidated in all processors.
– Missing data cache block in the local processor causes a modified data cache block in another processor to be written back to memory and invalidated. If other processors hold an unmodified copy of the block, it is simply invalidated in all processors.

• If coherency is not required, then the corresponding:
– Unmodified data cache block in the local processor is invalidated.
– Modified data cache block is written back to memory and invalidated in the local processor.
– Missing cache block in the local processor does not cause any action.

Data Cache Block Invalidate

• The DCBIR is written with the effective address.

• If coherency is required, then the corresponding:
– Unmodified data cache block is invalidated in all processors.
– Modified data cache block is invalidated in all processors.
– Missing data cache block in the local processor causes the data cache blocks in other processors to be invalidated.

• If coherency is not required, then the corresponding:
– Unmodified data cache block in the local processor is invalidated.
– Modified data cache block in the local processor is invalidated.
– Missing cache block in the local processor does not cause any action.

Data Cache Block Write-Back

• The DCBWR is written with the effective address.

• If coherency is required then:
– the corresponding data cache block in any of the processors is written back to memory if it was modified.

• If coherency is not required then:
– the corresponding data cache block in the local processor is written back to memory if it was modified.
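The three data cache block operations (flush, invalidate, write-back) differ in what happens to modified data. The sketch below models only the case where coherency is not required, on a single local cache; the class and its methods are hypothetical stand-ins for writes to DCBFR, DCBIR, and DCBWR.

```python
# Toy single-processor data cache showing the non-coherent behavior of
# write-back, invalidate and flush. Names are hypothetical.
class DataCache:
    def __init__(self, memory):
        self.memory = memory
        self.blocks = {}          # addr -> [value, modified]

    def store(self, addr, value):
        self.blocks[addr] = [value, True]

    def write_back(self, addr):
        block = self.blocks.get(addr)
        if block is not None and block[1]:
            self.memory[addr] = block[0]   # memory updated, block stays cached
            block[1] = False

    def invalidate(self, addr):
        self.blocks.pop(addr, None)        # modified data is discarded

    def flush(self, addr):
        self.write_back(addr)              # save modified data first
        self.invalidate(addr)              # then drop the block


mem = {0: 1, 8: 1, 16: 1}
dc = DataCache(mem)

dc.store(0, 2)
dc.flush(0)          # memory updated, block removed

dc.store(8, 2)
dc.invalidate(8)     # block removed, memory keeps the old value

dc.store(16, 2)
dc.write_back(16)    # memory updated, block kept as unmodified

dc.flush(24)         # missing block: no action
```

Invalidating a modified block loses the new value, which is why flush rather than invalidate is the safe choice when the data must survive.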

Data Cache Block Lock

• Optional

• The DCBLR is written with the effective address. The corresponding data cache block in the local processor is locked.

• If all blocks of the same set in all cache ways are locked, then the cache refill may automatically unlock the least-recently used block.
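The interaction between locking and replacement can be sketched for a single cache set. This is a toy model under stated assumptions: strict LRU replacement, a lock request on a missing block that does nothing, and hypothetical names throughout.

```python
from collections import OrderedDict


# Toy model of one set in a set-associative cache with block locking.
# Entries are ordered from least to most recently used.
class CacheSet:
    def __init__(self, ways):
        self.ways = ways
        self.blocks = OrderedDict()        # tag -> locked flag

    def access(self, tag):
        if tag in self.blocks:
            self.blocks.move_to_end(tag)   # hit: mark most recently used
            return
        if len(self.blocks) == self.ways:  # set full: choose a victim
            victim = next(
                (t for t, locked in self.blocks.items() if not locked), None
            )
            if victim is None:
                # Every way is locked: unlock and evict the LRU block.
                victim = next(iter(self.blocks))
            del self.blocks[victim]
        self.blocks[tag] = False           # refilled block starts unlocked

    def lock(self, tag):
        if tag in self.blocks:             # assumption: missing block is ignored
            self.blocks[tag] = True


s = CacheSet(ways=2)
s.access("A")
s.access("B")
s.lock("A")
s.access("C")                    # A is locked, so B is evicted
after_first = list(s.blocks)

s.lock("C")
s.access("D")                    # both ways locked: LRU block A is unlocked and evicted
after_second = list(s.blocks)
```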

Instruction Cache Block Prefetch

• Optional

• The ICBPR is written with the effective address and the corresponding block from memory is prefetched into the instruction cache.

Instruction Cache Block Invalidate

• The ICBIR is written with the effective address.

• If coherency is required then the corresponding instruction cache blocks in all processors are invalidated.

• If coherency is not required then the corresponding instruction cache block is invalidated in the local processor.

Instruction Cache Block Lock

• Optional

• The ICBLR is written with the effective address. The corresponding instruction cache block in the local processor is locked.

• If all blocks of the same set in all cache ways are locked, then the cache refill may automatically unlock the least-recently used block.

• Missing cache block in the local processor does not cause any action.

Comparison

Cache        Prefetch  Flush  Invalidate  Write-Back  Lock
Data         Optional  Yes    Yes         Yes         Optional
Instruction  Optional  No     Yes         No          Optional

Cache/Memory Coherency

• synchronize.

• In systems that do not provide cache coherency with the PTE attributes (because they do not implement a memory management unit), it may be provided through explicit cache management.

• Cache coherency in systems with virtual memory can be provided on a page-by-page basis with PTE attributes. The attributes are:
– Cache Coherent (CC Attribute)
– Caching-Inhibited (CI Attribute)
– Write-Back Cache (WBC Attribute)

• When the memory/cache attributes are changed, it is imperative that the cache contents reflect the new attribute settings. This usually means that cache blocks must be flushed or invalidated.

Pages Designated as Cache Coherent Pages

• CC=0: do not need cache coherency.

• CC=1: need cache coherency.

• To improve performance of uniprocessor systems, memory pages should not be designated as CC=1.

Pages Designated as Caching-Inhibited Pages

• CI=1: Memory accesses go directly to main memory, bypassing the cache.

• The target content should never be available in the cache.

• When the OS marks a page as CI, it should flush the corresponding cache blocks so that no copy is accidentally left in the cache.

Pages Designated as Write-Back Cache Pages

• WBC=0: Write to both cache and memory.

• WBC=1: Only write to local cache. (Requires cache snooping hardware support)
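The effect of the WBC attribute on a store can be sketched as follows. This is a toy model with hypothetical function names; the cache is a dictionary and the snooping hardware is not modeled.

```python
# Toy model of a store under the two WBC settings. Names are hypothetical.
def store(cache, memory, addr, value, wbc):
    """WBC=0 writes to cache and memory; WBC=1 writes to the local cache only."""
    if wbc:
        cache[addr] = (value, True)    # modified: memory is now out of date
    else:
        cache[addr] = (value, False)
        memory[addr] = value           # memory always matches the cache


def write_back(cache, memory, addr):
    """Copy a modified block to memory, as a later write-back would."""
    value, modified = cache[addr]
    if modified:
        memory[addr] = value
        cache[addr] = (value, False)


memory = {0x10: 0, 0x20: 0}
cache = {}

store(cache, memory, 0x10, 5, wbc=0)
through = memory[0x10]                 # already updated

store(cache, memory, 0x20, 5, wbc=1)
before = memory[0x20]                  # still the old value
write_back(cache, memory, 0x20)
after = memory[0x20]                   # updated only after write-back
```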

Debug Unit (optional)

Power Management (optional)

• Slow down feature

• Doze mode

• Sleep mode

• Suspend mode

• Dynamic clock gating feature