CS 300 – Lecture 23 Intro to Computer Architecture / Assembly Language Virtual Memory Pipelining

Post on 21-Dec-2015

TRANSCRIPT

Page 1: CS 300 – Lecture 23 Intro to Computer Architecture / Assembly Language Virtual Memory Pipelining

CS 300 – Lecture 23

Intro to Computer Architecture

/ Assembly Language

Virtual Memory

Pipelining

Page 2

Final Exam

Tuesday, 8am.

The exam will be comprehensive

I'll hand out a worksheet (practice exam) Thursday that covers topics since the last exam. Let's schedule a review session Monday.

Page 3

Homework

The last homework is due Friday.

Let's look at the wiki and see what it's about.

Extra credit: if you want it, come see me after class. All EC stuff is due Friday of finals week.

Page 4

Page Management Strategies

This is really an OS topic, but the strategies mirror those in cache management:

* Pages are marked clean / dirty – clean pages never need to swap out. The CPU knows which pages have been modified.

* This makes LRU swapping much easier to do.

* Pages may be shared and retained – note that if you run an application twice, it starts up faster the second time.

* Managing swap space is a big problem – we want to get many pages from swap at the same time.

* Preloading sequential pages is fast.
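The clean/dirty distinction can be sketched in a few lines. This is a toy model (the class and counter names are made up, not any real OS or MMU interface): an LRU page set where evicting a clean page is free and only dirty pages cost a writeback.

```python
from collections import OrderedDict

class PageCache:
    """Toy LRU page set with dirty bits; illustrative only, not a real OS/MMU."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()   # page number -> dirty flag
        self.writebacks = 0          # dirty pages swapped out so far

    def touch(self, page, write=False):
        if page in self.pages:
            self.pages[page] |= write        # a write marks the page dirty
            self.pages.move_to_end(page)     # mark most recently used
            return
        if len(self.pages) == self.capacity:
            victim, dirty = self.pages.popitem(last=False)  # evict LRU page
            if dirty:
                self.writebacks += 1         # clean pages never swap out
        self.pages[page] = write

cache = PageCache(2)
cache.touch(1, write=True)   # page 1 is dirty
cache.touch(2)               # page 2 is clean
cache.touch(3)               # evicts page 1 (dirty): one writeback
cache.touch(4)               # evicts page 2 (clean): no writeback needed
```

Evicting page 2 costs nothing because its contents already match what is on disk – exactly why the hardware tracks the dirty bit for the OS.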

Page 5

Security

Another BIG idea is that you have to use the hardware to manage the security of your system. You can't let just any process access I/O devices or mess with the virtual memory configuration (page tables). This is enforced in hardware by associating a "privilege" with every page. On the Pentium, this restricts execution of some instructions.
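The per-page privilege check can be sketched like this. Everything here is hypothetical (the page numbers, field names, and two-level scheme are illustrative; real x86 has four rings and richer page-table entries): a user-mode access to a supervisor-only page triggers a protection fault.

```python
# Toy page-table entries with a supervisor-only bit; names are illustrative.
KERNEL, USER = 0, 3   # privilege levels, loosely in the spirit of x86 rings

page_table = {
    0x1000: {"supervisor_only": True},   # e.g. page tables, I/O mappings
    0x2000: {"supervisor_only": False},  # ordinary user data
}

def check_access(page, level):
    """Raise on a user-mode access to a supervisor page (a protection fault)."""
    entry = page_table[page]
    if entry["supervisor_only"] and level != KERNEL:
        raise PermissionError("protection fault")
    return True

check_access(0x2000, USER)    # user touching user data: fine
check_access(0x1000, KERNEL)  # kernel touching page tables: fine
```

A user-mode `check_access(0x1000, USER)` raises, which is the software analogue of the hardware trapping into the kernel.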

Page 6

The OS Kernel

The kernel is the part of the OS that handles memory management, protection management, interrupts, and other key resources. When the kernel screws up, your computer is useless!

VM is a BIG part of the OS kernel.

Getting the kernel right is really important.

Page 7

Segments

Another issue is segmented memory – dividing a program into chunks of contiguous memory. Each chunk may be handled differently and be associated with different virtual addresses. On the MIPS we've seen "text" and "data" segments. You can put stack and heap in separate segments too.

Page 8

Segments vs Paging

Paging is invisible – the user doesn't know or care what happens at the page level.

Segments are contiguous chunks of memory that the user is aware of – some segments are special (read only, shared, copy on write).
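The idea of a segment as a base, a limit, and some flags can be sketched as follows. The bases chosen here match the conventional MIPS text and data segment start addresses seen in class; the function and field names are made up for illustration.

```python
# Toy segmented address translation: each segment has a base address,
# a limit, and protection flags. Real descriptors hold more fields.
segments = {
    "text": {"base": 0x0040_0000, "limit": 0x1000, "writable": False},
    "data": {"base": 0x1000_0000, "limit": 0x2000, "writable": True},
}

def translate(segment, offset, write=False):
    """Turn (segment, offset) into a flat address, enforcing limit and flags."""
    seg = segments[segment]
    if offset >= seg["limit"]:
        raise MemoryError("segment limit exceeded")
    if write and not seg["writable"]:
        raise PermissionError("write to read-only segment")
    return seg["base"] + offset

translate("text", 0x10)              # fetch from the code segment: fine
translate("data", 0x10, write=True)  # store into the data segment: fine
```

A write into "text" raises, which is how a read-only code segment catches stray stores in hardware.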

Page 9

The Pentium Again

Segments are layered on top of the paging system. This is the level where capabilities are handled, including a privilege level and lots of other stuff. Each segment has a descriptor that tells the hardware where the segment is and what its capabilities are.

Page 10

Pentium Segment Descriptor

http://www.internals.com/articles/protmode/protmode.htm

Page 11

The Big Picture

Memory management is one of the essential hardware services provided by the CPU. These are things that can't be done without hardware support.

Security is a big deal – having hardware assistance is essential.

Performance issues are more often related to memory than to computation.

Page 12

Our Final Topic: Pipelines

Page 13

Some Interesting History

Back in the 60's, computer designers were faced with some basic issues:

* Some hardware operations (floating point divide) take MUCH longer to execute than others (integer add)

* All logic gates can be used in parallel in a given clock cycle – no need to do "one thing at a time"

* Many instructions don't directly depend on the output of the previous instruction

Page 14

The IBM Stretch

This was the first real "supercomputer". Its goal was to go 100x faster than the then-current industry standard, the IBM 704.

While it never met this goal, it pioneered many of the ideas that were solidified in supercomputers such as the Cray.

Page 15

Big Ideas

* Keep logic circuits busy by avoiding strictly sequential flow of instructions – start future instructions before previous ones are finished

* Speculate so that circuits that would otherwise be idle provide information that may later be useful

* Provide instructions for explicit parallelism / pipelining

* Directly route data between functional units, avoiding delays in memory / registers

Page 16

What To Do With Extra Silicon

On the Stretch, the part of the computer responsible for pipelining provided a significant (2x to 10x?) improvement in performance. Yet its total circuitry was no more than that of a single floating point unit. That is, relatively small amounts of circuitry can deliver large speedups.

Page 17

Speeding Up An Instruction

Typically, an instruction is executed in stages:

* Fetch (brings the instruction in from cache / memory)

* Decode (figure out what the instruction will do)

* Execute (the actual operation, like add or multiply)

* Store (place results in registers / memory)

This varies a lot from processor to processor but the idea is always the same – break up execution into smaller chunks that overlap.

Page 18

Other Speedup Strategies

* Vector instructions: explicit parallelism in the instruction set to feed sequences of data to a functional unit (Cray-1 and successors)

* Multiple instructions at once: pack instruction words with lots of independent operations that execute at the same time (VLIW)

* Replicated CPUs, one instruction stream (SIMD)

Page 19

Pipeline Hazards

Structural: lack of computational resources to perform operations in parallel

Data: dependencies among instructions (write-read)

Control: conditional branching prevents you from knowing which instruction comes next
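A data (write-read) hazard is simple to detect once instructions name their registers. A minimal sketch, representing an instruction as a made-up (dest, src1, src2) tuple of register names:

```python
# Toy data-hazard check: a later instruction that reads a register
# an earlier in-flight instruction writes must wait (or be forwarded to).
def has_data_hazard(earlier, later):
    dest, _, _ = earlier
    _, src1, src2 = later
    return dest in (src1, src2)

add = ("r1", "r2", "r3")   # r1 = r2 + r3
sub = ("r4", "r1", "r5")   # r4 = r1 - r5: reads r1 before add stores it
mul = ("r6", "r2", "r7")   # r6 = r2 * r7: independent of add
```

Here `sub` hazards against `add` but `mul` does not, so a pipeline could start `mul` early while `sub` waits – the kind of reordering the Stretch pioneered.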

Page 20

Memory Hazards

* It's "obvious" which registers an instruction uses.

* It's much harder to figure out which memory locations an instruction touches, and therefore how memory accesses interact.

What can we do?

* Reads can be reordered without worry.

* Reads and writes can't be switched unless we know that they don't interfere.
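Those two rules can be written down directly. A sketch with invented names, modeling a memory operation as a ("read" or "write", address) pair where an unknown address is `None`:

```python
# Toy reordering rule: two memory operations may be swapped only if
# they can't interfere. Two reads never interfere; anything involving
# a write interferes unless the addresses are known to differ.
def can_reorder(op1, op2):
    (kind1, addr1), (kind2, addr2) = op1, op2
    if kind1 == "read" and kind2 == "read":
        return True                 # reorder reads without worry
    if addr1 is None or addr2 is None:
        return False                # unknown address: must assume overlap
    return addr1 != addr2           # a write is safe only if disjoint

can_reorder(("read", 0x10), ("read", 0x10))    # True: reads always commute
can_reorder(("write", 0x10), ("read", 0x20))   # True: provably disjoint
can_reorder(("write", None), ("read", 0x20))   # False: can't prove disjoint
```

The `None` case is the crux: when the hardware can't tell what address a write touches, it must conservatively keep the original order.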

Page 21

Compiler / Programmer Help

The compiler and programmer have access to information that the CPU doesn't: they can often determine whether or not aliasing is possible – and aliasing is what makes it hard to reorder memory accesses.