
REVIEW FROM MIDTERM 1

What to review for the test.

● how to do binary & hex math & conversion
● how to design & verify simple circuits, e.g. adders, mux/demux, decoders ...
● how to speed up the internal workings of a CPU
● basic terminology
● know Boolean algebra & how to use the laws to reduce expressions to simpler equations
● building truth tables and how to use them, including sum of products

What can you do to study?

● do homework #2, build a full-subtractor
● design a mux w/o looking it up
● multiply a few binary numbers (it teaches multiplication and addition & conversion to check the answers)
● review your notes and the overheads
● study the 1-bit ALU in the book and become familiar with how it works (Pg 166-167)
● you only need to study up to page 174

REVIEW FROM MIDTERM 2

What to review for the test.

● basic terminology
● latches (flip-flops), registers, memory units...building, addressing
● how to speed up the internal workings of a CPU
    o speed vs cost tradeoffs
● IJVM machine
    o stack based machines
    o writing code, registers & operation
    o reading and understanding the microcode
    o writing programs in assembly
    o invoking subroutines
● branch prediction (static & dynamic)
● cache
● parallel instruction execution
● Intel ISA
    o instruction formats
    o addressing modes
● DMA controller
● Interrupts & traps


Review Notes (Chapters 1 – 5)

Key Words – CHAPTER 1

Program: A sequence of instructions describing how to perform a certain task

Machine Language: A computer’s primary instructions form a language in which people can communicate with the computer

Structured Computer Organization: Computer systems can be designed in a systematic, organized way

Translation: Converting an entire program into an equivalent program in machine language, then executing the new program and discarding the original

Interpretation: A program is examined and decoded, then carried out immediately

Gates: Built from transistors, and can be combined to form a 1 bit memory

Register: Can hold a single binary number up to some maximum, formed from grouped 1-bit memories

Digital logic level (LEVEL 0): Contains gates, physical components

Microarchitecture level (LEVEL 1): A collection of 8 to 32 registers that form a local memory, and a circuit called an ALU (Arithmetic Logic Unit), which is capable of performing simple arithmetic operations

Data path: The registers are connected to the ALU, creating a path over which data flows. The basic operation of the data path consists of selecting one or two registers, having the ALU operate on them, and storing the result back in some register

Microprogram: On some machines the operation of the data path is controlled by a microprogram. On machines with software control of the data path, the microprogram is an interpreter for the instructions at level 2


Instruction set architecture level (LEVEL 2): Also known as the ISA level, is the part of the computer architecture related to programming, including the native data types, instructions, registers, addressing modes, memory architecture, interrupt and exception handling, and external I/O

Operating system machine level (LEVEL 3): Instructions are identical to level 2, except some of the level 3 instructions are interpreted by the operating system and some are interpreted by the microprogram. This is also known as the “hybrid level”

Assembly language level (LEVEL 4): Programs in assembly language are first translated to level 1, 2, or 3 language and then interpreted by the appropriate virtual or actual machine. The program that performs the translation is called an assembler

Problem-oriented language level (LEVEL 5): languages designed to be used by application programmers with problems to solve. Examples C, C++, Java, PHP. Programs written in these languages are translated to level 3 or level 4 by compilers

Architecture: the set of data types, operations, and features of each level. The architecture deals with those aspects that are visible to the user of that level

Computer architecture: The study of how to design those parts of a computer system that are visible to the programmers

Hardware: Tangible objects, integrated circuits, power supplies, cables, memory, printers, etc.

Algorithms: Detailed instructions telling how to do something

Software: consists of algorithms and their computer representations (ex. Programs)


Moore’s Law: The number of transistors per square inch on integrated circuits had doubled every year since the integrated circuit was invented. Moore predicted that this trend would continue for the foreseeable future

Microcontrollers: a small computer on a single integrated circuit containing a processor core, memory, and programmable input/output peripherals

Cache memory: Used to hold the most commonly used memory words inside or close to the CPU, in order to avoid ‘slow’ accesses to main memory

MMX: Multimedia extension. These instructions were intended to speed up the computations required to process audio and video, making the addition of special multimedia coprocessors unnecessary

Key Words – CHAPTER 2

Bus: A collection of parallel wires for transmitting address, data, and control signals

Central processing unit (CPU): The ‘brain’ of the computer, it is comprised of several distinct parts

-Control Unit: Responsible for fetching instructions from main memory & determining their type

-Arithmetic logic unit (ALU): Performs operations such as addition and Boolean algebra needed to carry out the instructions

Program counter (PC): The most important register which points to the next instruction to be fetched for execution

Instruction Register (IR): Holds the instruction currently being executed

Data path cycle: The process of running two operands through the ALU and storing the result


Instruction Execution (Fetch – decode – execute)

1. Fetch the next instruction from memory into the instruction register
2. Change the program counter to point to the following instruction
3. Determine the type of instruction just fetched
4. If the instruction uses a word in memory, determine where it is
5. Fetch the word, if needed, into a CPU register
6. Execute the instruction
7. Go to step 1 to begin executing the following instruction
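
The seven steps above can be sketched as a small interpreter loop. This is only an illustration: the one-register machine and its opcode names (LOAD, ADD, STORE, HALT) are hypothetical and do not come from the book or the course.

```python
# Hypothetical toy machine: each instruction is a tuple (opcode, address).
# The opcode names are illustrative assumptions, not a real ISA.
def run(program, memory):
    pc = 0    # program counter
    acc = 0   # a single CPU register (accumulator)
    while True:
        ir = program[pc]          # step 1: fetch into the instruction register
        pc += 1                   # step 2: point PC at the following instruction
        opcode, address = ir      # step 3: determine the instruction type
        if opcode == "HALT":
            return acc
        if opcode == "STORE":     # step 6: execute (no operand fetch needed)
            memory[address] = acc
        else:
            word = memory[address]    # steps 4-5: locate and fetch the operand
            if opcode == "LOAD":      # step 6: execute
                acc = word
            elif opcode == "ADD":
                acc += word
            else:
                raise ValueError(f"unknown opcode {opcode}")
        # step 7: the loop goes back to fetch the next instruction


memory = [2, 3, 0]
program = [("LOAD", 0), ("ADD", 1), ("STORE", 2), ("HALT", None)]
result = run(program, memory)
```

Here the program adds the words at addresses 0 and 1 and stores the sum at address 2.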

Interpreter: A program that fetches, examines, and executes the instructions of another program

Reduced instruction set computer (RISC): A small number of simple instructions that execute in one cycle

Complex instruction set computer (CISC): A large set of instructions, many of them complex multi-cycle operations that are typically interpreted by a microprogram

Pipeline: An implementation technique where multiple instructions are overlapped in execution. The computer pipeline is divided in stages. Each stage completes a part of an instruction in parallel

Latency: How long it takes to execute an instruction

Processor bandwidth: How many millions of instructions per second (MIPS) the CPU can execute

Single instruction stream multiple data stream (SIMD): Consists of a large number of identical processors that perform the same sequence of instructions on different sets of data

Vector processor: All of the operations are performed in a single, heavily pipelined functional unit

Vector register: A set of conventional registers that can be loaded from memory in a single instruction

Multiprocessor: A system with more than one CPU sharing a common memory

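Little endian: Bytes in a word are numbered starting at the "little" (low-order) end, so the least significant byte is stored at the lowest address

Big endian: Bytes in a word are numbered starting at the "big" (high-order) end, so the most significant byte is stored at the lowest address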
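
A quick way to see the two byte orders is to lay out the same 32-bit value both ways. This sketch uses Python's built-in `int.to_bytes`; the value 0x12345678 is an arbitrary example.

```python
value = 0x12345678

# Byte 0 (lowest address) comes first in each result.
little = value.to_bytes(4, byteorder="little")  # low-order byte first
big = value.to_bytes(4, byteorder="big")        # high-order byte first

little_hex = little.hex()  # "78563412"
big_hex = big.hex()        # "12345678"
```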

Parity bit: A bit added to the end of a string of binary code that indicates whether the number of bits in the string with the value one is even or odd. Parity bits are used as the simplest form of error detecting code

Cache: Small fast memory, most heavily used memory words are kept in the cache

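Locality principle: Memory references made in any short time interval tend to use only a small fraction of the total memory, so when a word is referenced, it and some of its neighbors are brought into the cache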


Single Inline memory module: SIMM, one row of connectors on memory

Dual inline memory module: DIMM, 2 sides of connectors on memory

Small outline DIMM: SO-DIMM, used in laptops

Track: Circular sequence of bits written as the disk makes a complete rotation

Sector: Each track is divided up into some number of fixed lengths, typically containing 512 bytes of data

Preamble: Precedes sectors, allows the head to be synchronized before reading or writing

Intersector gap: A gap between consecutive sectors

Perpendicular recording: The ‘long’ dimension of the bits is not along the circumference of the disk, but vertically down into the iron oxide

Disk controller: a chip that controls the drive

BIOS: Basic Input Output System, located in the PC’s built in read only memory

IDE: Integrated drive electronics

EIDE: Extended IDE, which supports a second addressing scheme LBA (Logical block addressing)

SCSI: Small computer system interface


RAID: Redundant array of inexpensive disks. The OS treats the drives as one single large logical drive. The controller for the drives does the logical to physical translation.

Striping: Distributing data over multiple drives

Level 0:

View the virtual drive as being divided among the physical drives in chunks of N sectors. Dividing the data into chunks of N sectors and distributing them over the drives is called striping. There is no redundancy. If N = 1 then each strip is 1 sector. If a request for 6 consecutive sectors is received on a four-drive array, the controller issues 4 parallel read requests to fetch the first 4 sectors, followed by 2 parallel requests. Works well for requests for large blocks of data, but small or single-sector requests reduce or eliminate any benefit. Failure of 1 physical drive corrupts the entire logical drive
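
The mapping from a logical sector to a physical drive can be sketched as below. The round-robin placement and the function name `locate` are assumptions for illustration; the notes only say the strips are distributed over the drives.

```python
def locate(sector, num_drives, sectors_per_strip):
    """Map a logical sector to (drive, row, offset), assuming round-robin strips."""
    strip = sector // sectors_per_strip    # which strip holds this sector
    drive = strip % num_drives             # strips rotate across the drives
    row = strip // num_drives              # position of the strip on that drive
    offset = sector % sectors_per_strip    # sector within the strip
    return drive, row, offset


# Six consecutive sectors on four drives with N = 1.
placement = [locate(s, num_drives=4, sectors_per_strip=1) for s in range(6)]
drives = [d for d, _, _ in placement]
rows = [r for _, r, _ in placement]
```

The first four sectors land on four different drives in row 0, so they can be read in parallel; the last two land in row 1 and need a second round of reads.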

Level 1:

Exactly like level 0, except all drives are duplicated.

Data writes go to both the original and the copy, so writes occur at level 0 speed. Reads can be up to twice as fast because the backups can be read in parallel. If a failure occurs, a new drive can be substituted and the data copied to it without loss or downtime


Level 2:

Divide each byte into 2 nibbles (4 bits each). Add a 3-bit Hamming code to each nibble to make a 7-bit word. Read/write each 7-bit word in parallel across 7 drives, one bit per drive. A single drive failure results in no data loss because the Hamming code can correct the missing bit. Requires all drives to remain rotationally synchronized, and requires the controller to calculate 6 Hamming bits per byte (so it must be very fast)
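
A minimal sketch of the nibble-to-7-bit encoding follows. It assumes the standard Hamming(7,4) layout with parity bits at positions 1, 2, and 4; the notes do not say which bit positions the controller actually uses.

```python
def encode(nibble):
    """Turn a 4-bit value into 7 bits (one per drive), parity at positions 1, 2, 4."""
    d = [(nibble >> i) & 1 for i in range(4)]   # d[0] is the least significant bit
    word = [0] * 8                               # index 0 unused; positions 1..7
    word[3], word[5], word[6], word[7] = d
    word[1] = word[3] ^ word[5] ^ word[7]        # covers positions 1, 3, 5, 7
    word[2] = word[3] ^ word[6] ^ word[7]        # covers positions 2, 3, 6, 7
    word[4] = word[5] ^ word[6] ^ word[7]        # covers positions 4, 5, 6, 7
    return word[1:]


def decode(bits):
    """Correct at most one wrong bit, then return the original 4-bit value."""
    word = [0] + list(bits)
    syndrome = 0
    for pos in range(1, 8):
        if word[pos]:
            syndrome ^= pos          # XOR of the positions holding a 1
    if syndrome:
        word[syndrome] ^= 1          # the syndrome names the bad position
    d = (word[3], word[5], word[6], word[7])
    return sum(bit << i for i, bit in enumerate(d))
```

Flipping any one of the seven bits (that is, getting a wrong bit from any one drive) still decodes to the original nibble.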

Level 3:

Simplified version of level 2, using a single parity bit per nibble. A drive failure results in no data loss: because the position of the failed drive in the bit sequence is known, the controller assumes the bit from that drive is 0, and a parity error means it was actually a 1
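
The recovery rule just described can be sketched as follows. Even parity and the helper names are assumptions made for the example.

```python
def parity_bit(bits):
    """XOR of all bits: 0 when the number of 1s is even."""
    p = 0
    for b in bits:
        p ^= b
    return p


def write_stripe(data_bits):
    """Store 4 data bits plus one even-parity bit, one bit per drive."""
    return list(data_bits) + [parity_bit(data_bits)]


def read_with_failed_drive(stripe, failed):
    """Recover the data bits when the drive at index `failed` is unreadable."""
    bits = list(stripe)
    bits[failed] = 0              # controller assumes the missing bit is 0
    if parity_bit(bits) != 0:     # parity error: the bit must have been 1
        bits[failed] = 1
    return bits[:-1]              # drop the parity bit
```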

Level 4:

Like level 0 but with an extra drive for ECC. Level 4 requires extra overhead to compute & write the ECC (because all drives must be read), so small blocks are inefficient. The ECC drive can be a bottleneck. Drive failure results in no loss of data.


Level 5:

Eliminates the bottleneck on the ECC drive by distributing the ECC bit over all drives evenly, and drive failure results in no data loss, but recovery is complex.

DMA: Direct memory access, a controller that reads or writes data to or from memory without CPU intervention

Interrupt handler: When the I/O completes, the controller causes an interrupt, forcing the CPU to immediately suspend its current program and start running a special procedure (the interrupt handler) that checks for errors and informs the OS that the I/O is finished

Modulation: By varying the amplitude, frequency, or phase of a carrier, a sequence of 1s and 0s can be transmitted (think sine waves)

Amplitude modulation: two different voltage levels are used for 0 and 1

Frequency modulation: The voltage level is constant but the carrier frequency is different for 1 and 0, often referred to as frequency shift keying

Phase modulation: The amplitude and frequency do not change, but the phase of the carrier is reversed 180 degrees when the data switches from 0 to 1 or 1 to 0

Full-duplex: Lines that can transmit in both directions at the same time (on different frequencies)

Half-duplex: can only transmit in one direction at a time

Simplex: Lines that can transmit in only one direction

Character code: the mapping of characters onto integers

ASCII: American Standard code for Information Interchange

Unicode: Assigns every character and symbol a unique 16-bit value

Key Words – CHAPTER 3

Gates: Tiny electronic devices that can compute various functions of two-valued signals

Circuit equivalence: another circuit that computes the same function as the original, but does so with fewer gates


Multiplexer: A circuit with 2^n data inputs, one data output, and n control inputs that select one of the data inputs

De-multiplexer: Routes its single input signal to one of 2^n outputs, depending on the values of the n control lines

Comparator: Compares two input words

Clock: A circuit that emits a series of pulses with a precise interval between consecutive pulses

Clock cycle time: The time interval between the corresponding edges of two consecutive pulses

SR Latch: Two inputs, S for setting the latch, and R for resetting (clearing) it.

Key Words – CHAPTER 4

IJVM: Integer Java Virtual Machine, only deals with integers

State: The set of variables maintained by the microprogram

Opcode: (Operation Code) Identifies the instruction, telling whether it is an ADD, BRANCH, or something else

Data Path: Part of the CPU Containing the ALU, its inputs and outputs

MAR: Memory address register

MDR: Memory data register

MIR: Micro-Instruction Register, a register whose function is to hold the current micro-instruction

Local Variable Frame: For each invocation of a method, an area is allocated for storing variables during the lifetime of the invocation

Operand stack: A stack that holds operands during the computation of an arithmetic expression

Constant Pool: Consists of constants, strings, and pointers to other areas of memory

Split cache: Separate caches for instructions and data. Benefits of having a split cache:

-Memory operations can be initiated independently in each cache, effectively doubling the bandwidth of the memory system.

-Stops average people from writing self-modifying code.

-Can help stop viruses from attacking ALL of your data.

Temporal locality: Occurs when recently accessed memory locations are accessed again

Cache line: Main memory is divided up into fixed size blocks typically consisting of 4 to 64 consecutive bytes

Direct-Mapped Cache: Simplest cache, each row in the cache can hold exactly one cache line from main memory


Cache Hit: The cache entry holds the word being referenced again

Cache miss: The cache entry is invalid or the tags do not match, the needed entry is not present in the cache
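
Hits and misses in a direct-mapped cache can be traced with a small simulation. The sizes (4 entries, 16-byte lines) and the class name are assumptions chosen to keep the trace short; only the tags are modeled, not the data.

```python
class DirectMappedCache:
    """Tracks which cache line each entry holds (tags only, no data)."""

    def __init__(self, num_entries, line_size):
        self.num_entries = num_entries
        self.line_size = line_size
        self.valid = [False] * num_entries
        self.tags = [0] * num_entries

    def access(self, address):
        line = address // self.line_size    # which cache line of main memory
        index = line % self.num_entries     # the one entry that line may occupy
        tag = line // self.num_entries      # identifies the line within that entry
        if self.valid[index] and self.tags[index] == tag:
            return "hit"
        self.valid[index] = True            # miss: load the line, evicting the old one
        self.tags[index] = tag
        return "miss"


cache = DirectMappedCache(num_entries=4, line_size=16)
results = [cache.access(a) for a in (0, 4, 16, 64, 0)]
```

Address 4 hits because it shares a line with address 0. Address 64 maps to the same entry as address 0 with a different tag, so it evicts that line and the final access to address 0 misses again.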

WAR dependence (Write After Read): One instruction is trying to overwrite a register that a previous instruction may not have yet finished reading

Key Words – CHAPTER 5

Kernel mode: Intended to run the operating system and allows all instructions to be executed

User mode: Intended to run application programs and does not permit certain sensitive instructions to be executed

General Purpose registers: Hold key local variables and intermediate results of calculations. Their main function is to provide rapid access to heavily used data

PSW (Program Status Word): A control register that is something of a kernel /user hybrid also known as the flag register. Holds condition codes

-N: Set when the result is negative

-Z: Set when the result is zero

-V: Set when the result caused an overflow

-C: Set when the result caused a carry out of the leftmost bit

-A: Set when there was a carry out of bit 3 (auxiliary carry)

-P: Set when the result had even parity
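
As a rough illustration, the condition codes for an 8-bit addition can be computed as below. The 8-bit width, the function name, and reading A as the auxiliary carry (carry out of bit 3) are assumptions for this sketch.

```python
def add8_flags(a, b):
    """Add two 8-bit values and return (result, condition codes)."""
    total = a + b
    result = total & 0xFF
    flags = {
        "N": (result >> 7) & 1,                                 # sign bit of the result
        "Z": int(result == 0),                                  # result is zero
        "V": int(((a ^ result) & (b ^ result) & 0x80) != 0),    # signed overflow
        "C": int(total > 0xFF),                                 # carry out of the leftmost bit
        "A": int(((a & 0x0F) + (b & 0x0F)) > 0x0F),             # carry out of bit 3
        "P": int(bin(result).count("1") % 2 == 0),              # even number of 1 bits
    }
    return result, flags


overflow_case = add8_flags(0x7F, 0x01)   # 127 + 1 overflows as a signed byte
carry_case = add8_flags(0xFF, 0x01)      # 255 + 1 wraps around to 0
```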

Prefix byte: An extra Opcode stuck onto the front of an instruction to change its action

Immediate addressing: Has the virtue of not requiring an extra memory reference to fetch the operand. The disadvantage is that only a constant can be supplied this way

Direct addressing: The instruction will always access exactly the same memory location so while the value can change the location cannot. Can only be used for global variables

Register Mode: Conceptually the same as direct addressing but specifies a register instead of a memory location

Register indirect addressing: Referencing memory without paying the price of having a full memory address in the instruction using a pointer

Indexed addressing: Addressing memory by giving a register and a constant offset

Base indexed addressing: One register is the base and the other is the index, e.g. MOV R4,(R2+R5) and AND R4,(R2+R6)

Programmed I/O: Commonly used in low-end microprocessors. Usually have a single input instruction and single output instruction. Spends lots of time waiting


Interrupt-driven I/O: Uses programmed I/O but with an interrupt enable bit in a device register, so that the software can request a signal when the I/O is completed

Direct Memory Access: A chip with direct access to the bus, the chip has (at least) four registers inside it.

Trap: An automatic procedure call initiated by some condition caused by the program, usually an important but rarely occurring condition Ex. Overflow

Trap handler: Performs some appropriate reaction, such as printing an error message. If the result is within range, no trap occurs

Interrupts: Changes in the flow of control caused not by the running program but by something else, usually related to I/O. Ex: Ricky when it's 5 mins to go

The essential difference between Traps and Interrupts is this:

-Traps are synchronous with the program

-Interrupts are asynchronous


Boolean Algebra Review

Law            AND form                                   OR form

Identity       1X = X                                     0 + X = X
Null           0X = 0                                     1 + X = 1
Idempotent     XX = X                                     X + X = X
Inverse        XX’ = 0                                    X + X’ = 1
Involution     (X’)’ = X                                  Same
Commutative    XY = YX                                    X + Y = Y + X
Associative    (XY)C = X(YC)                              (X + Y) + C = X + (Y + C)
Distributive   X + YC = (X + Y)(X + C)                    X(Y + C) = XY + XC
Absorption 1   X(X + Y) = X                               X + XY = X
Absorption 2   X(X’ + Y) = XY                             X + X’Y = X + Y
Consensus      (X + Y)(X’ + C)(Y + C) = (X + Y)(X’ + C)   XY + X’C + YC = XY + X’C
DeMorgan’s     (XY)’ = X’ + Y’                            (X + Y)’ = X’Y’
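
One way to convince yourself that a law holds is to check it against every row of the truth table. The sketch below does this for a few of the less obvious laws; the dictionary layout and helper names are just one possible arrangement.

```python
from itertools import product


def NOT(x):
    return 1 - x


# Each entry compares the left and right side of a law for one row of the truth table.
laws = {
    "distributive (AND form)": lambda x, y, c: (x | (y & c)) == ((x | y) & (x | c)),
    "absorption 1 (AND form)": lambda x, y, c: (x & (x | y)) == x,
    "absorption 2 (AND form)": lambda x, y, c: (x & (NOT(x) | y)) == (x & y),
    "absorption 2 (OR form)": lambda x, y, c: (x | (NOT(x) & y)) == (x | y),
    "consensus (AND form)": lambda x, y, c:
        ((x | y) & (NOT(x) | c) & (y | c)) == ((x | y) & (NOT(x) | c)),
    "consensus (OR form)": lambda x, y, c:
        ((x & y) | (NOT(x) & c) | (y & c)) == ((x & y) | (NOT(x) & c)),
    "DeMorgan (AND form)": lambda x, y, c: NOT(x & y) == (NOT(x) | NOT(y)),
    "DeMorgan (OR form)": lambda x, y, c: NOT(x | y) == (NOT(x) & NOT(y)),
}

rows = list(product((0, 1), repeat=3))   # all 8 combinations of X, Y, C
failed = [name for name, law in laws.items()
          if not all(law(x, y, c) for x, y, c in rows)]
```

An empty `failed` list means every listed law agrees on all eight input combinations.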

NAND & NOR


1-bit ALU (Pg. 166)

Read page 166!!!!

Instruction Fetch Unit

1. The PC is passed through the ALU and incremented
2. The PC is used to fetch the next byte in the instruction stream
3. Operands are read from memory
4. Operands are written to memory
5. The ALU does a computation and the results are stored back

Instruction Fetch Unit: An instruction fetch unit (IFU) is an independent unit that can fetch and process instructions. It requires only an incrementer and can independently increment PC and fetch bytes from the byte stream before they are needed.


Interrupts

How Interrupts are handled

Protocol:

● device sends a signal to the CPU on the system bus

● at the first opportunity the CPU sends ACK to the device

● device sends an ID # (vector #) to CPU

● CPU pushes PSW w/PC onto stack

● uses the ID as an index into the vector jump table

o jump table contains the addresses of all the interrupt service routines (ISR)

● CPU masks the interrupt bits of the PSW

o may disable all interrupts or only interrupts of lower priority (discuss priority based interrupts)

● CPU loads PC w/ jump address, and transfers control to the interrupt service routine

● when the ISR finishes the interrupted program continues

ISR functions – interrupt service routines

● save registers it will use

o restored before the return

o saved on stack or a fixed location

● determines which device signaled

o (ID # is shared by all members of that group of devices)

o any other info needed.

● takes appropriate action to service the interrupt

● restores the saved registers

● executes a return from ISR


IJVM Instruction set

Stack instructions

Push

● LDC_W <index> – push the word at CPP + index

● ILOAD <var#> - push word at LV + <var#>

● BIPUSH <byte> - push <byte> at (PC + 1)

● DUP – duplicate the top value on the stack and push it back

Pop

● ISTORE <var#> - pop word and store at LV + <var#>

● POP – remove from stack & discard

Arithmetic – Pop 2 words, perform op and push result

● IADD – add the values

● ISUB – subtract the values

● IINC <var#> <const> - location LV+<var#> += <const>

(doesn’t POP anything)

Boolean – POP 2 words, perform op & push result

● IAND – AND the values

● IOR - OR the values

Flow of control - <offset> is 16 bit value, PC = PC + <offset>

● GOTO <offset> - unconditional branch

● IFEQ <offset> - if POP = 0 branch

● IFLT <offset> - if POP < 0 branch

● IF_ICMPEQ <offset> - if POP1 = POP2 branch

● INVOKEVIRTUAL <disp> - call a method @ CPP+<disp>

● IRETURN – return from method

Misc.

● NOP – no operation, just waste time

● SWAP – trade the top two values on the stack

● WIDE – next instruction’s parm is 16 bits wide
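
The stack behavior of these instructions can be tried out with a small interpreter. This is a simplified sketch: instructions are symbolic tuples rather than bytes, local variables are looked up by name, branches and method calls are left out, and 32-bit wraparound is ignored.

```python
def run(program):
    """Execute a list of (opcode, *operands) tuples; return (stack, locals)."""
    stack, local = [], {}
    for op, *args in program:
        if op == "BIPUSH":
            stack.append(args[0])
        elif op == "ILOAD":
            stack.append(local[args[0]])
        elif op == "ISTORE":
            local[args[0]] = stack.pop()
        elif op == "DUP":
            stack.append(stack[-1])
        elif op == "POP":
            stack.pop()
        elif op == "SWAP":
            stack[-1], stack[-2] = stack[-2], stack[-1]
        elif op in ("IADD", "ISUB", "IAND", "IOR"):
            right = stack.pop()      # top of stack
            left = stack.pop()       # word below the top
            stack.append({"IADD": left + right, "ISUB": left - right,
                          "IAND": left & right, "IOR": left | right}[op])
        elif op == "IINC":
            local[args[0]] += args[1]    # does not touch the stack
        elif op != "NOP":
            raise ValueError(f"unsupported opcode {op}")
    return stack, local


# b = 7; c = 3; d = b + c; d++
program = [("BIPUSH", 7), ("ISTORE", "b"), ("BIPUSH", 3), ("ISTORE", "c"),
           ("ILOAD", "b"), ("ILOAD", "c"), ("IADD",), ("ISTORE", "d"),
           ("IINC", "d", 1)]
final_stack, final_locals = run(program)
```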


IJVM Subroutine call/return

The call

 

The return


Intel Addressing Modes

General form:

[<label>:] <opcode> [<operand list>] [// <comment>]

$ - means immediate addressing

% - means a register reference

( ) - means indirection

// comment or # comment

Operand addressing modes supported

1. immediate: actual operand is contained in the instruction
    · operand = operand in instr
    · add $-1, %eax // eax = eax + -1

2. direct: operand specifies the address of the operand
    · operand = address specified
    · jmp Next

3. register direct: operand specifies a register containing the operand.
    · operand = register value
    · sub %ax, %bx // 16 bit

4. register indirect: register specified contains the address of the operand.
    · operand = (register)
    · mov %ebx, (%eax)

5. register indexed
    · mov %eax, 3(%ebx) // move to loc (ebx + 3)

6. register indexed w/displacement: two operands, a register & a displacement. Operand is at location register + disp.
    · operand = (register + disp)
    · mov %eax, 3(%ebx, %esi) // move to loc (ebx + 3 + esi)

Additional modes are supported but we will not use them.

Instruction Restrictions
    · Instructions can have at most one memory ref (no mem to mem instructions)
    · Destination cannot be immediate mode
    · Source & dest must be same size


Computer History


● Basic terminology
    o Look through chapter three, chapter four, and chapter five notes.
● Latches (flip-flops p. 172), registers, memory units...building, addressing
    o Latches are a temporary buffer space and used for timing.
    o SR flip-flop
        ▪ If you put a 1 on the set, it will output a 1.
        ▪ If you put a 1 on the reset, it will output a 0.
        ▪ Truth Table

            Input     Output
            S   R     Q   Q’
            0   0     Q   Q’
            0   1     0   1
            1   0     1   0
            1   1     U   U

            U = Unstable
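
The truth table can be reproduced by simulating the two cross-coupled gates until the outputs settle. Building the latch from NOR gates is an assumption here (it is the usual textbook construction), and the function names are illustrative.

```python
def nor(a, b):
    return int(not (a or b))


def sr_latch(s, r, q, q_bar):
    """Settle a NOR-based SR latch starting from the stored state (q, q_bar)."""
    for _ in range(4):                  # a few passes are enough to settle
        new_q = nor(r, q_bar)           # upper gate: inputs R and Q'
        new_q_bar = nor(s, new_q)       # lower gate: inputs S and Q
        if (new_q, new_q_bar) == (q, q_bar):
            break
        q, q_bar = new_q, new_q_bar
    return q, q_bar


after_set = sr_latch(1, 0, 0, 1)        # S = 1 forces Q to 1
held = sr_latch(0, 0, *after_set)       # S = R = 0 keeps the stored value
after_reset = sr_latch(0, 1, *held)     # R = 1 forces Q to 0
both_high = sr_latch(1, 1, 0, 1)        # S = R = 1: Q and Q' are no longer complements
```

With S = R = 1 this model drives both outputs to 0, which is why that row of the table is marked U: Q and Q’ stop being complements, and the state the latch ends up in when both inputs drop back to 0 is not predictable.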

    o D flip-flop
        ▪ Basically a latch or piece of memory.
        ▪ If you put a 1 in, it will output a 1.
        ▪ If you put a 0 in, it will output a 0.
        ▪ To see a D flip flop made from an SR flip flop, look @ notes.
        ▪ Since there is only one input, you can’t get 11 or 00.
    o Registers

SP Stack pointer

LV Local variable pointer

TOS Value at the top of stack

H Temporary storage

▪ Use microcode to clear and control the reset line, input data strobe, and output data strobe.

● Why microcode? Because you can update and change it whereas hardware is more difficult.

▪ Shift Register:  A register that is designed to allow the bits of its contents to be moved to left or right.

    o Memory Unit
        ▪ Memory Model

            Operand Stack
            Local vars n
            Local vars 1
            Constants
            Program Store

        ▪ Four areas of memory are defined (low to high)
            ● Method Area: Contains the functions to be executed. PC points to the next instruction to execute.
            ● Constant pool: CPP points to the first word of the pool. Pool contains all the literals, constants, and pointers to other areas set by the system.
            ● Local Var. Frame: (parms, return addresses, automatic local vars) LV points to the start of the current allocation block.
            ● Operand Stack: Above the local variable frame. Holds the intermediate calculations. SP points to top of stack.
        ▪ Memory Control Register
            ● MAR: Memory address register
            ● MDR: Memory data register
            ● PC: Program counter
            ● MBR: Memory branch register
        ▪ Memory Allocation (5 areas)

            I/O Buffers
            Program Stack
            Heap
            Constants/Globals
            User Program

● How to speed up the internal workings of a CPU (Speed vs. Cost) (p. 283)
    o Speed can be measured in a variety of ways and there are three approaches for increasing the speed of execution:
        ▪ Reduce the number of clock cycles needed to execute an instruction.
        ▪ Simplify the organization so that the clock cycle can be shorter.
        ▪ Overlap the execution of instructions. This is by far the most interesting approach and offers the most opportunity for dramatic increases in speed.
    o Path Length: The number of clock cycles needed to execute a set of operations.
    o Reducing the Execution Path Length (p. 285)
        ▪ Merging the Interpreter Loop with the microcode
            ● In the Mic-1, the main loop consists of one microinstruction that must be executed at the beginning of every IJVM instruction. In some cases, it is possible to overlap it with the previous instruction.
        ▪ A Three-Bus Architecture
            ● Another easy fix is to have two full input buses to the ALU, an A bus and a B bus, giving three buses in total.
        ▪ An Instruction Fetch Unit
            ● Helps achieve a more dynamic improvement.
            ● We need to look at the common parts of every instruction and notice that for every instruction, the following operations may occur:
                1. The PC is passed through the ALU and incremented.
                2. The PC is used to fetch the next byte in the instruction stream.
                3. Operands are read from memory.
                4. Operands are written to memory.
                5. The ALU does a computation and the results are stored back.
        ▪ An instruction fetch unit (IFU) is an independent unit that can fetch and process instructions. It requires only an incrementer and can independently increment PC and fetch bytes from the byte stream before they are needed.
        ▪ The IFU can also assemble 8- and 16-bit operands so that they are ready for use whenever needed. There are at least two ways this can be done:
            1. The IFU can actually interpret each opcode, determining how many additional fields must be fetched, and assemble them into a register ready for use by the main execution unit.
            2. The IFU can take advantage of the stream nature of the instructions and make available at all times the next 8- and 16-bit pieces, whether or not doing so makes any sense. The main execution unit can then ask for whatever it needs.
    o In order to speed up the instruction set, the third technique - overlapping the execution of instructions - must be exploited.
    o Cost can be measured in many ways such as a count of the number of components or how much time it takes to do an operation.
● IJVM machine
    o stack based machines
    o writing code, registers & operation
    o reading and understanding the microcode
    o writing programs in assembly

Source                           Code              Object

A = 0;                           BIPUSH 0          0x10 0x00
                                 ISTORE A          0x36 0x01

B = 7;                           BIPUSH 7          0x10 0x07
                                 DUP               0x59
                                 ISTORE B          0x36 0x02

C = 3;                           BIPUSH 3          0x10 0x03
                                 DUP               0x59
                                 ISTORE C          0x36 0x03

D = B + C;                       IADD              0x60
                                 DUP               0x59
                                 DUP               0x59
                                 ISTORE D          0x36 0x04

If (D > 0)                       IFEQ ELSE1        0x99 0x00 0x0B
    A = 5;                       IFLT ELSE2        0x9B 0x00 0x09
                                 BIPUSH 5          0x10 0x05
                                 GOTO NEXT         0xA7 0x00 0x06

Else                      ELSE1: POP               0x57
    A = 7;                ELSE2: BIPUSH 7          0x10 0x07
                          NEXT:  ISTORE A          0x36 0x01

B = 0;                           BIPUSH 0          0x10 0x00
                                 ISTORE B          0x36 0x02

for (C = 0; C < 3; C++)          BIPUSH 3          0x10 0x03
    B = B + 2;                   ISTORE C          0x36 0x03
                          LOOP:  ILOAD C           0x15 0x03
                                 IFEQ END          0x99 0x00 0x0C
                                 IINC B 2          0x84 0x02 0x02
                                 IINC C -1         0x84 0x03 0xFF
                                 GOTO LOOP         0xA7 0xFF 0xF5
                          END:
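To see how the stack behaves for the first four statements, the straight-line part of the object code can be run through a tiny interpreter. This is a minimal sketch covering only BIPUSH, DUP, IADD, and ISTORE; the function name `run` and the dictionary used for local variables are assumptions made for the example, and constants are treated as non-negative bytes.

```python
# Opcode values as they appear in the object-code column.
BIPUSH, DUP, IADD, ISTORE = 0x10, 0x59, 0x60, 0x36


def run(code: bytes):
    """Execute a straight-line fragment; return (stack, local variables)."""
    stack, local_vars, pc = [], {}, 0
    while pc < len(code):
        op = code[pc]
        if op == BIPUSH:      # push the next byte as a constant
            stack.append(code[pc + 1])
            pc += 2
        elif op == DUP:       # duplicate the top of the stack
            stack.append(stack[-1])
            pc += 1
        elif op == IADD:      # pop two words, push their sum
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
            pc += 1
        elif op == ISTORE:    # pop into the variable slot given by the next byte
            local_vars[code[pc + 1]] = stack.pop()
            pc += 2
        else:
            raise ValueError(f"unsupported opcode 0x{op:02X}")
    return stack, local_vars


program = bytes([
    0x10, 0x00, 0x36, 0x01,        # A = 0
    0x10, 0x07, 0x59, 0x36, 0x02,  # B = 7, a copy of B stays on the stack
    0x10, 0x03, 0x59, 0x36, 0x03,  # C = 3, a copy of C stays on the stack
    0x60, 0x59, 0x59, 0x36, 0x04,  # D = B + C, two copies of D stay behind
])
stack, local_vars = run(program)
print(stack, local_vars)
```

The two copies of D left on the stack are the ones that IFEQ and IFLT consume in the If statement, which is why the code uses DUP twice before ISTORE D.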

o invoking subroutines

Branch prediction (static & dynamic)

o Why and how?

o Static

▪ All decisions are made at compile time.

o Dynamic

▪ Predicting while other things are going on.

▪ Carried out at run time while the program is running.

▪ Downside:  They require specialized and expensive hardware and are complex.
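As an illustration of dynamic prediction, here is a minimal software sketch of a 2-bit saturating counter, which is one common scheme and not necessarily the one used in class. The class name `TwoBitPredictor`, the helper `accuracy`, and the branch history are all made up for the example; real predictors are built in hardware.

```python
class TwoBitPredictor:
    """Saturating 2-bit counter: states 0-1 predict not taken, 2-3 predict taken."""

    def __init__(self, state: int = 0):
        self.state = state

    def predict(self) -> bool:
        return self.state >= 2

    def update(self, taken: bool) -> None:
        # Move one step toward the actual outcome, staying within 0..3.
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)


def accuracy(outcomes, predictor) -> float:
    """Fraction of branches predicted correctly, updating after each one."""
    correct = 0
    for taken in outcomes:
        if predictor.predict() == taken:
            correct += 1
        predictor.update(taken)
    return correct / len(outcomes)


# Hypothetical loop branch: taken 9 times, then not taken once, run 3 times.
history = ([True] * 9 + [False]) * 3
print(accuracy(history, TwoBitPredictor()))
```

Because the counter needs two wrong guesses in a row to change its prediction, the single not-taken branch at the end of each loop does not flip the prediction for the next run of the loop.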

Cache

o Split Cache:  Separate caches for instructions and data.

▪ Benefits:

● Memory operations can be initiated independently in each cache, effectively doubling the bandwidth of the memory system.

▪ Bus Arbiter:  Makes sure that no two sets of data try to get to the bus at the same time. Randomly selects one piece to go before the other.

o Many memory systems are more complicated than this and an additional cache, called a level 2 cache, may reside between the instruction and data caches and main memory.

o Spatial Locality:  Observation that memory locations with addresses numerically similar to a recently accessed memory location are likely to be accessed in the near future.

o Temporal Locality:  Occurs when recently accessed memory locations are accessed again.

o Cache Lines:  Main memory is divided into fixed-size blocks called cache lines, each typically consisting of 4 to 64 consecutive bytes.

o Direct-Mapped Cache:  Simplest cache. A cache where the cache location for a given address is determined from the middle address bits.

o Cache Hit:  A cache hit is a state in which data requested for processing by a component or application is found in the cache memory. It is a faster means of delivering data to the processor, as the cache already contains the requested data.

▪ A cache hit occurs when an application or software requests data. First, the central processing unit (CPU) looks for the data in its closest memory location, which is usually the primary cache. If the requested data is found in the cache, it is considered a cache hit.


▪ A cache hit serves data more quickly, as the data can be retrieved by reading the cache memory. The cache hit also can be in disk caches where the requested data is stored and accessed at first query.

o Cache Miss:  A cache miss is a state where the data requested for processing by a component or application is not found in the cache memory. It causes execution delays by requiring the program or application to fetch the data from other cache levels or the main memory.

▪ A cache miss occurs within cache memory access modes and methods. For each new request, the processor searches the primary cache to find that data. If the data is not found, it is considered a cache miss.

▪ Each cache miss slows down the overall process because after a cache miss, the central processing unit (CPU) must look in the next levels of the memory hierarchy, such as the L2 and L3 caches and then random access memory (RAM), for that data. Further, a new entry is created and copied into the cache before it can be accessed by the processor.
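Hits and misses in a direct-mapped cache can be simulated in a few lines. This is a minimal sketch under assumed parameters: the line size of 32 bytes and the 8 cache entries are chosen only to keep the demo small, and the class name `DirectMappedCache` is made up for the example.

```python
LINE_SIZE = 32  # bytes per cache line (assumed, within the 4 to 64 byte range)
NUM_LINES = 8   # number of cache entries (assumed, kept tiny for the demo)


class DirectMappedCache:
    def __init__(self):
        self.tags = [None] * NUM_LINES  # one tag slot per cache entry
        self.hits = 0
        self.misses = 0

    def access(self, address: int) -> bool:
        """Return True on a hit; on a miss, load the line and return False."""
        line_number = address // LINE_SIZE  # drop the byte-offset bits
        index = line_number % NUM_LINES     # middle bits choose the entry
        tag = line_number // NUM_LINES      # high bits identify the line
        if self.tags[index] == tag:
            self.hits += 1
            return True
        self.tags[index] = tag              # replace whatever was there
        self.misses += 1
        return False


cache = DirectMappedCache()
for address in [0, 4, 8, 256, 0, 36]:
    print(address, "hit" if cache.access(address) else "miss")
```

Addresses 0, 4, and 8 share one line, so the second and third accesses hit (spatial locality). Address 256 maps to the same entry as address 0 and evicts it, so the later access to 0 misses again.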

Parallel instruction execution

o Instruction scheduling

▪ Since these rules are in place, the CPU is free to schedule instructions within a group in any order it chooses, possibly in parallel, without having to worry about conflicts.

Intel ISA

o instruction formats

▪ Data Movement Instructions

● Copying data from one place to another is the most fundamental of all operations and by copying, we mean creating a new object.

● When we say the contents of memory location 2000 have been moved to some register, we always mean that an identical copy has been created there and that the original is still undisturbed in location 2000.

▪ Dyadic Operations

● Combine two operands to produce a result.

▪ Monadic Operations

● Have one operand and produce one result. Because one fewer address has to be specified than with dyadic operations, the instructions are sometimes shorter, though often other information needs to be specified.

o addressing modes

▪ Addressing:  Most instructions have operands, so some way is needed to specify where they are; this is called addressing.

▪ Address Modes:  Up until now, we have paid little attention to how the bits of an address field are interpreted to find the operand; the different ways of doing so are called address modes.

▪ Immediate Addressing:  The simplest way for an instruction to specify an operand is for the address part of the instruction actually to contain the operand itself rather than an address or other information describing where the operand is.

▪ Direct Addressing:  A method for specifying an operand in memory is just to give its full address.

▪ Register Addressing:  Conceptually the same as direct addressing but specifies a register instead of memory location.

▪ Register Indirect Addressing:  The operand being specified comes from memory or goes to memory, but its address is not hardwired into the instruction, as in direct addressing.

▪ Indexed Addressing:  Addressing memory by giving a register plus a constant offset.

● MOVE 6A, B

o Addr = 6+A

▪ Based-Indexed Addressing:  Some machines have an addressing mode in which the memory address is computed by adding up two registers plus an (optional) offset.
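The address calculations for the modes above can be compared side by side. This is only a sketch on a toy machine: the function name `fetch_operand`, the register names, and the memory contents are all hypothetical and chosen for illustration.

```python
def fetch_operand(mode, field, registers, memory, index_reg=None, offset=0):
    """Return the operand selected by the given addressing mode (toy model)."""
    if mode == "immediate":          # the field is the operand itself
        return field
    if mode == "direct":             # the field is a full memory address
        return memory[field]
    if mode == "register":           # the field names a register
        return registers[field]
    if mode == "register_indirect":  # the named register holds the address
        return memory[registers[field]]
    if mode == "indexed":            # address = register + constant offset
        return memory[registers[field] + offset]
    if mode == "based_indexed":      # address = register + register + offset
        return memory[registers[field] + registers[index_reg] + offset]
    raise ValueError(f"unknown addressing mode: {mode}")


# Hypothetical machine state used only for this demo.
registers = {"A": 100, "B": 4}
memory = {2000: 11, 100: 22, 106: 33, 110: 44}

print(fetch_operand("immediate", 5, registers, memory))
print(fetch_operand("direct", 2000, registers, memory))
print(fetch_operand("register", "A", registers, memory))
print(fetch_operand("register_indirect", "A", registers, memory))
print(fetch_operand("indexed", "A", registers, memory, offset=6))
print(fetch_operand("based_indexed", "A", registers, memory, index_reg="B", offset=6))
```

The indexed case mirrors the Addr = 6+A example: with A holding 100, the operand comes from address 106.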

▪ Stack Addressing

● Make machine instructions as short as possible.

● Reverse Polish Notation

o Infix:  The form with the operator “in” between the operands.

o Postfix (Reverse Polish Notation):  The form with the operator after the operands.

o Ideal notation for evaluating formulas on a computer with a stack.

o Ex:  (8+2*5) / (1+3*2-4)  --->  8 2 5 * + 1 3 2 * + 4 - /
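The reason postfix suits a stack machine is that evaluation needs only push and pop. A minimal sketch of an evaluator follows; the function name `eval_rpn` is made up for the example, and it assumes tokens are separated by spaces and that only the four basic operators appear.

```python
import operator

# Map each operator symbol to the function that applies it.
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def eval_rpn(expression: str) -> float:
    """Evaluate a space-separated postfix expression using a stack."""
    stack = []
    for token in expression.split():
        if token in OPS:
            right = stack.pop()  # the top of the stack is the right operand
            left = stack.pop()
            stack.append(OPS[token](left, right))
        else:
            stack.append(float(token))
    if len(stack) != 1:
        raise ValueError("malformed expression")
    return stack[0]


print(eval_rpn("8 2 5 * + 1 3 2 * + 4 - /"))
```

For the example above the evaluator computes 8 + 10 = 18 and 1 + 6 - 4 = 3, then divides to get 6, the same value as the infix form.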

DMA controller

o Direct Memory Access (DMA):  Capability provided by some computer bus architectures that allows data to be sent directly from an attached device (such as a disk drive) to the memory on the computer's motherboard.

▪ DMA:  A little computer that moves stuff from secondary storage, etc., and puts it into memory.

▪ I/O Polling:  The CPU has to constantly check whether the device is done (very expensive).

▪ Interrupt driven solution:  It’s a doorbell that lets the CPU continue doing what it needs to do and then alerts it when something is done.

Traps

o Traps

▪ Trap:  A kind of automatic procedure call initiated by some condition caused by the program, usually an important but rarely occurring condition.

▪ Trap Handler:  Performs some appropriate action, such as printing an error message.


o Self-Modifying Code:  Code that alters its own instructions while it is executing - usually to reduce the instruction path length and improve performance or simply to reduce otherwise repetitively similar code, thus simplifying maintenance.

o Procedures

▪ Recursive Procedure:  A procedure that calls itself, either directly or indirectly via a chain of other procedures.

▪ Procedure Prolog:  Code that saves the old frame pointer, sets up a new one, and advances the stack pointer to reserve space for local variables.

▪ Procedure Epilog:  Upon procedure exit, the stack must be cleaned up again.

● One of the most important characteristics of any computer is how short and fast it can make the prolog and epilog. If they’re long and slow, procedure calls will be expensive.

o Coroutines:  Computer program components that generalize subroutines for nonpreemptive multitasking, by allowing multiple entry points for suspending and resuming execution at certain locations.
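Suspending and resuming at chosen points can be illustrated with Python generators, which are one readily available form of coroutine. This is a sketch rather than how coroutines are implemented at the machine level, and the names `producer` and `averager` are made up for the example.

```python
def producer(items):
    """Hand out one item at a time, suspending after each one."""
    for item in items:
        yield item  # suspend here; resume when the next item is requested


def averager():
    """Receive values one at a time and report the running average."""
    total, count, average = 0, 0, None
    while True:
        value = yield average  # suspend until a value is sent in
        total += value
        count += 1
        average = total / count


avg = averager()
next(avg)  # run the coroutine up to its first yield
results = [avg.send(value) for value in producer([10, 20, 60])]
print(results)
```

Unlike a subroutine, `averager` keeps its local variables between calls and continues from where it left off each time a value is sent in.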

Review Problems

1. List the 5 machine levels, and give an example of each.


2. What is split cache? Explain the benefits of using it.

3. Give an example where using static branch prediction would be beneficial.

4. Explain the difference between traps and interrupts, give an example of each.

5. What is an instruction fetch unit and what are the step-by-step operations needed to be executed?

6. Give three approaches to increase execution speed of a program.

7. Prove (A’B’) = (A+B)’


8. What is the difference between macro and function calls?

9. Make a GNU function that gets two numbers, adds them, and prints the result.

10. What are the differences between RISC and CISC based machines?

11. Draw a 3-bit dynamic branch prediction diagram.