13 Reduced Instruction Set Computers Computer Organization

Upload: emerald-gordon

Post on 05-Jan-2016

13 Reduced Instruction Set Computers

Computer Organization

Major Advances in Computers (1)
- The family concept
  - IBM System/360, 1964
  - DEC PDP-8
  - Separates architecture from implementation
- Microprogrammed control unit
  - Idea by Wilkes, 1951
  - Produced by IBM S/360, 1964
- Cache memory
  - IBM S/360 Model 85, 1969

Major Advances in Computers (2)
- Solid-state RAM
  - (See memory notes)
- Microprocessors
  - Intel 4004, 1971
- Pipelining
  - Introduces parallelism into the fetch-execute cycle
- Multiple processors

The Next Step - RISC
- Reduced Instruction Set Computer
- Key features
  - Large number of general-purpose registers, or use of compiler technology to optimize register use
  - Limited and simple instruction set
  - Emphasis on optimising the instruction pipeline

[Table: Comparison of processors]

Driving force for CISC
- Software costs far exceed hardware costs
- Increasingly complex high-level languages
- Semantic gap
- Leads to:
  - Large instruction sets
  - More addressing modes
  - Hardware implementations of HLL statements
    - e.g. CASE (switch) on VAX

Intention of CISC
- Ease compiler writing
- Improve execution efficiency
  - Complex operations in microcode
- Support more complex HLLs

Execution Characteristics
- Operations performed
- Operands used
- Execution sequencing
- Studies have been done based on programs written in HLLs
- Dynamic studies are measured during the execution of the program

Operations
- Assignments
  - Movement of data
- Conditional statements (IF, LOOP)
  - Sequence control
- Procedure call-return is very time consuming
- Some HLL instructions lead to many machine code operations

Weighted Relative Dynamic Frequency of HLL Operations [PATT82a]

          Dynamic Occurrence    Machine-Instruction Weighted    Memory-Reference Weighted
          Pascal    C           Pascal    C                     Pascal    C
ASSIGN    45%       38%         13%       13%                   14%       15%
LOOP      5%        3%          42%       32%                   33%       26%
CALL      15%       12%         31%       33%                   44%       45%
IF        29%       43%         11%       21%                   7%        13%
GOTO      -         3%          -         -                     -         -
OTHER     6%        1%          3%        1%                    2%        1%
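
The weighted columns come from multiplying each statement type's dynamic occurrence by a per-statement cost and normalizing. A minimal sketch of that arithmetic follows. The C occurrence figures are taken from the table, but the per-statement instruction counts are hypothetical values chosen only for illustration, so the output will not reproduce the published column.

```python
def weighted_frequency(occurrence, cost):
    """Weight each statement type's occurrence by its cost, then normalize to sum to 1."""
    raw = {op: occurrence[op] * cost[op] for op in occurrence}
    total = sum(raw.values())
    return {op: value / total for op, value in raw.items()}


# Dynamic occurrence for C, copied from the table (as fractions).
c_occurrence = {
    "ASSIGN": 0.38, "LOOP": 0.03, "CALL": 0.12,
    "IF": 0.43, "GOTO": 0.03, "OTHER": 0.01,
}

# HYPOTHETICAL average machine instructions generated per statement type.
# These numbers are assumptions for illustration, not data from [PATT82a].
hypothetical_cost = {
    "ASSIGN": 2, "LOOP": 60, "CALL": 16,
    "IF": 3, "GOTO": 1, "OTHER": 5,
}

weighted = weighted_frequency(c_occurrence, hypothetical_cost)
for op, share in sorted(weighted.items(), key=lambda item: -item[1]):
    print(f"{op:7s}{share:6.1%}")
```

Even with made-up costs, the effect seen in the table appears: rarely executed but expensive statements (CALL, LOOP) overtake the frequent but cheap ASSIGN once weighted.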

Operands
- Mainly local scalar variables
- Optimisation should concentrate on accessing local variables

                    Pascal    C      Average
Integer Constant    16%       23%    20%
Scalar Variable     58%       53%    55%
Array/Structure     26%       24%    25%

Procedure Calls
- Very time consuming
- Depends on number of parameters passed
- Depends on level of nesting
- Most programs do not do a lot of calls followed by lots of returns
- Most variables are local
- (cf. locality of reference)

Implications
- Best support is given by optimising most used and most time consuming features
- Large number of registers
  - Operand referencing
- Careful design of pipelines
  - Branch prediction etc.
- Simplified (reduced) instruction set

Large Register File
- Software solution
  - Require compiler to allocate registers
  - Allocate based on most used variables in a given time
  - Requires sophisticated program analysis
- Hardware solution
  - Have more registers
  - Thus more variables will be in registers

Registers for Local Variables
- Store local scalar variables in registers
- Reduces memory access
- Every procedure (function) call changes locality
- Parameters must be passed
- Results must be returned
- Variables from calling programs must be restored

Register Windows
- Only few parameters
- Limited range of depth of call
- Use multiple small sets of registers
- Calls switch to a different set of registers
- Returns switch back to a previously used set of registers

Register Windows cont.
- Three areas within a register set
  - Parameter registers
  - Local registers
  - Temporary registers
  - Temporary registers from one set overlap parameter registers from the next
  - This allows parameter passing without moving data
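
The overlap can be sketched as an address calculation from (window, logical register) to a physical register. The sizes below (6 parameter, 10 local, 6 temporary registers, 8 windows) and the name `physical_index` are assumptions made for illustration, not taken from the slides.

```python
# Assumed sizes, for illustration only.
PARAMS = 6     # parameter registers per window
LOCALS = 10    # local registers per window
TEMPS = 6      # temporary registers per window; must equal PARAMS for the overlap
WINDOWS = 8    # number of windows in the register file

STRIDE = LOCALS + TEMPS          # physical registers each new window adds
FILE_SIZE = WINDOWS * STRIDE     # total physical registers (the file wraps around)


def physical_index(window, logical):
    """Map a logical register of a window to a physical register number.

    Logical layout: [0, PARAMS) parameters, then LOCALS locals, then TEMPS temporaries.
    """
    if not 0 <= logical < PARAMS + LOCALS + TEMPS:
        raise ValueError("logical register out of range")
    return (window * STRIDE + logical) % FILE_SIZE


# The caller's first temporary and the callee's first parameter are the same
# physical register, so passing the argument moves no data.
caller_temp0 = physical_index(0, PARAMS + LOCALS)
callee_param0 = physical_index(1, 0)
print(caller_temp0, callee_param0)
```

Because the file wraps, the temporaries of the last window coincide with the parameters of window 0, which is why the windows are usually drawn as a ring.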

[Figure: Overlapping Register Windows]

[Figure: Circular Buffer diagram]

Operation of Circular Buffer
- When a call is made, a current window pointer is moved to show the currently active register window
- If all windows are in use, an interrupt is generated and the oldest window (the one furthest back in the call nesting) is saved to memory
- A saved window pointer indicates where the next saved window should be restored to
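
The behaviour described above can be simulated with a small model. This is a sketch under simplifying assumptions: every window is usable (real designs often reserve one because of the overlap), the class and attribute names are invented here, and the saved window pointer is represented implicitly by a stack of spilled window slots.

```python
class WindowBuffer:
    """Toy model of a circular buffer of register windows."""

    def __init__(self, num_windows):
        self.num_windows = num_windows
        self.cwp = 0          # current window pointer
        self.resident = 1     # windows currently held in registers
        self.saved = []       # slots of windows spilled to memory, oldest first
        self.overflows = 0    # interrupts taken on call
        self.underflows = 0   # interrupts taken on return

    def call(self):
        if self.resident == self.num_windows:
            # All windows in use: spill the oldest one to memory.
            oldest = (self.cwp - (self.resident - 1)) % self.num_windows
            self.saved.append(oldest)
            self.resident -= 1
            self.overflows += 1
        self.cwp = (self.cwp + 1) % self.num_windows
        self.resident += 1

    def ret(self):
        if self.resident == 1:
            if not self.saved:
                raise RuntimeError("return from the outermost procedure")
            # The caller's window was spilled: restore it to the slot behind cwp.
            slot = self.saved.pop()
            assert slot == (self.cwp - 1) % self.num_windows
            self.underflows += 1
        else:
            self.resident -= 1
        self.cwp = (self.cwp - 1) % self.num_windows


buf = WindowBuffer(num_windows=4)
for _ in range(6):    # nest six calls deep
    buf.call()
for _ in range(6):    # and unwind again
    buf.ret()
print(buf.overflows, buf.underflows, buf.cwp)
```

With four windows, the first three calls fit in registers; only the deeper calls cause saves, and only the matching deep returns cause restores. That is the case the design bets on, since call depth usually fluctuates within a narrow range.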

Global Variables
- Allocated by the compiler to memory
  - Inefficient for frequently accessed variables
- Have a set of registers for global variables

Registers v Cache

Large Register File                              Cache
All local scalars                                Recently-used local scalars
Individual variables                             Blocks of memory
Compiler-assigned global variables               Recently-used global variables
Save/Restore based on procedure nesting depth    Save/Restore based on cache replacement algorithm
Register addressing                              Memory addressing

[Figure: Referencing a Scalar - Window Based Register File]

[Figure: Referencing a Scalar - Cache]

Compiler Based Register Optimization
- Assume small number of registers (16-32)
- Optimizing use is up to compiler
- HLL programs have no explicit references to registers
  - usually - think about C - register int
- Assign symbolic or virtual register to each candidate variable
- Map (unlimited) symbolic registers to real registers
- Symbolic registers that do not overlap can share real registers
- If you run out of real registers some variables use memory
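
Whether two symbolic registers "overlap" can be decided from their live ranges. The sketch below assumes each symbolic register is live over a single half-open interval of instruction positions, which is a simplification. The register names and ranges are hypothetical.

```python
from itertools import combinations


def overlaps(a, b):
    """Two half-open intervals [start, end) overlap if each starts before the other ends."""
    return a[0] < b[1] and b[0] < a[1]


def interference_edges(live_ranges):
    """Return the pairs of symbolic registers that are live at the same time."""
    return [
        (x, y)
        for x, y in combinations(sorted(live_ranges), 2)
        if overlaps(live_ranges[x], live_ranges[y])
    ]


# HYPOTHETICAL live ranges: symbolic register -> (first use, one past last use).
live_ranges = {"A": (0, 4), "B": (2, 6), "C": (5, 9), "D": (8, 12)}

edges = interference_edges(live_ranges)
print(edges)
```

Here A and C never overlap, so they may share one real register, as may B and D. Two real registers are enough for four symbolic ones.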

Graph Coloring
- Given a graph of nodes and edges
- Assign a color to each node
- Adjacent nodes have different colors
- Use minimum number of colors
- Nodes are symbolic registers
- Two registers that are live in the same program fragment are joined by an edge
- Try to color the graph with n colors, where n is the number of real registers
- Nodes that cannot be colored are placed in memory
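
A minimal sketch of this allocation step follows. It uses a greedy highest-degree-first heuristic, which is one common approach and is not guaranteed to find a minimum coloring. The interference graph is a hypothetical example, not the one in the slide's figure.

```python
def build_graph(edges):
    """Turn a list of undirected edges into an adjacency map."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def color_registers(graph, num_registers):
    """Greedily assign real registers (colors) to symbolic registers (nodes).

    Returns (assignment, spilled): nodes that cannot be colored go to memory.
    """
    assignment = {}
    spilled = []
    # Handle the most constrained nodes first; ties broken by name.
    for node in sorted(graph, key=lambda n: (-len(graph[n]), n)):
        used = {assignment[nb] for nb in graph[node] if nb in assignment}
        free = [r for r in range(num_registers) if r not in used]
        if free:
            assignment[node] = free[0]
        else:
            spilled.append(node)
    return assignment, spilled


# HYPOTHETICAL interference graph over symbolic registers A..F.
graph = build_graph([
    ("A", "B"), ("A", "C"), ("B", "C"),
    ("B", "D"), ("C", "D"), ("D", "E"), ("E", "F"),
])

with_three, spilled_three = color_registers(graph, num_registers=3)
with_two, spilled_two = color_registers(graph, num_registers=2)
print(with_three, spilled_three)
print(with_two, spilled_two)
```

With three real registers every node is colored. With two, the triangle A-B-C cannot be colored and some symbolic registers are placed in memory.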

[Figure: Graph Coloring Approach]

Why CISC (1)?
- Compiler simplification?
  - Disputed…
  - Complex machine instructions harder to exploit
  - Optimization more difficult
- Smaller programs?
  - Program takes up less memory but…
  - Memory is now cheap
  - May not occupy fewer bits, just look shorter in symbolic form
    - More instructions require longer op-codes
    - Register references require fewer bits
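
The last two points are simple bit arithmetic. In the sketch below, the instruction counts, register count and address width are hypothetical, chosen only to show the direction of the effect.

```python
def field_bits(count):
    """Bits needed to give each of `count` items a distinct binary code."""
    if count < 1:
        raise ValueError("count must be positive")
    return (count - 1).bit_length()


# HYPOTHETICAL machines, for illustration only.
small_opcode = field_bits(64)      # 64 instructions
large_opcode = field_bits(300)     # 300 instructions
register_operand = field_bits(32)  # one of 32 registers
memory_operand = 16                # assumed 16-bit address field

# A two-operand instruction on each machine.
risc_like_bits = small_opcode + 2 * register_operand
cisc_like_bits = large_opcode + 2 * memory_operand
print(risc_like_bits, cisc_like_bits)
```

A memory-to-memory instruction may replace several register instructions in symbolic form, yet occupy a comparable number of bits once its longer opcode and address fields are counted.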

Why CISC (2)?
- Faster programs?
  - Bias towards use of simpler instructions
  - More complex control unit
  - Microprogram control store larger
  - thus simple instructions take longer to execute
- It is far from clear that CISC is the appropriate solution

RISC Characteristics
- One instruction per cycle
- Register to register operations
- Few, simple addressing modes
- Few, simple instruction formats
- Hardwired design (no microcode)
- Fixed instruction format
- More compile time/effort

RISC v CISC
- Not clear cut
- Many designs borrow from both philosophies
- e.g. PowerPC and Pentium II

RISC Pipelining
- Most instructions are register to register
- Two phases of execution
  - I: Instruction fetch
  - E: Execute
    - ALU operation with register input and output
- For load and store
  - I: Instruction fetch
  - E: Execute
    - Calculate memory address
  - D: Memory
    - Register to memory or memory to register operation
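
The phases above allow a rough cycle count with and without overlap. The model below is a deliberately simple sketch and does not reproduce any particular timing diagram. It assumes one cycle per phase, a single memory port (so an I phase cannot share a cycle with another instruction's I or D phase), at most one fetch per cycle, and no branch penalty. The five-instruction program is hypothetical.

```python
# Phases per instruction kind, as listed above.
PHASES = {"alu": "IE", "branch": "IE", "load": "IED", "store": "IED"}


def sequential_cycles(program):
    """No overlap: each instruction finishes before the next is fetched."""
    return sum(len(PHASES[kind]) for kind in program)


def overlapped_cycles(program):
    """Overlap fetch with execution, subject to a single memory port."""
    memory_busy = set()   # cycles in which memory is already in use
    next_fetch = 1        # earliest cycle the next instruction may be fetched
    last_cycle = 0
    for kind in program:
        phases = PHASES[kind]
        start = next_fetch
        # Delay the instruction until its I and D phases land in free cycles.
        while any(start + i in memory_busy
                  for i, phase in enumerate(phases) if phase in "ID"):
            start += 1
        memory_busy.update(start + i
                           for i, phase in enumerate(phases) if phase in "ID")
        next_fetch = start + 1
        last_cycle = max(last_cycle, start + len(phases) - 1)
    return last_cycle


# HYPOTHETICAL instruction mix.
program = ["load", "load", "alu", "store", "branch"]
print(sequential_cycles(program), overlapped_cycles(program))
```

Under these assumptions the program takes 13 cycles sequentially and 8 with overlap. The stall before the ALU instruction comes from the two loads occupying memory in their D phases.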

[Figure: Effects of Pipelining]

Optimization of Pipelining
- Delayed branch
  - Does not take effect until after execution of following instruction
  - This following instruction is the delay slot

Normal and Delayed Branch

Address    Normal Branch    Delayed Branch    Optimized Delayed Branch
100        LOAD X, rA       LOAD X, rA        LOAD X, rA
101        ADD 1, rA        ADD 1, rA         JUMP 105
102        JUMP 105         JUMP 106          ADD 1, rA
103        ADD rA, rB       NOOP              ADD rA, rB
104        SUB rC, rB       ADD rA, rB        SUB rC, rB
105        STORE rA, Z      SUB rC, rB        STORE rA, Z
106                         STORE rA, Z
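
The three columns can be checked with a tiny interpreter. This is a sketch with assumed semantics: the second operand is the destination, a delayed jump takes effect after exactly one following instruction, and the initial values (X = 7, rB = 10, rC = 3) are arbitrary test data.

```python
def run(program, delayed):
    """Execute {address: (op, a, b)} and return (registers, memory, trace)."""
    regs = {"rA": 0, "rB": 10, "rC": 3}   # arbitrary initial values
    mem = {"X": 7, "Z": 0}
    pc = min(program)
    pending = None                        # jump target waiting on the delay slot
    trace = []
    while pc in program:
        op, a, b = program[pc]
        trace.append(pc)
        next_pc = pc + 1
        if pending is not None:           # this instruction is the delay slot
            next_pc, pending = pending, None
        if op == "LOAD":
            regs[b] = mem[a]
        elif op == "ADD":
            regs[b] += regs[a] if a in regs else a   # register or immediate
        elif op == "SUB":
            regs[b] -= regs[a]
        elif op == "STORE":
            mem[b] = regs[a]
        elif op == "JUMP":
            if delayed:
                pending = a               # takes effect after the next instruction
            else:
                next_pc = a
        pc = next_pc
    return regs, mem, trace


def listing(*instructions):
    """Number instructions from address 100, as in the table."""
    return dict(enumerate(instructions, start=100))


LOAD, INC = ("LOAD", "X", "rA"), ("ADD", 1, "rA")
ADD, SUB = ("ADD", "rA", "rB"), ("SUB", "rC", "rB")
STORE, NOOP = ("STORE", "rA", "Z"), ("NOOP", None, None)

normal = run(listing(LOAD, INC, ("JUMP", 105, None), ADD, SUB, STORE), delayed=False)
delayed = run(listing(LOAD, INC, ("JUMP", 106, None), NOOP, ADD, SUB, STORE), delayed=True)
optimized = run(listing(LOAD, ("JUMP", 105, None), INC, ADD, SUB, STORE), delayed=True)

for name, (regs, mem, trace) in [("normal", normal), ("delayed", delayed), ("optimized", optimized)]:
    print(name, mem["Z"], regs["rB"], trace)
```

All three versions store X + 1 into Z and leave rB untouched. The plain delayed version executes one extra instruction (the NOOP), while the optimized version fills the delay slot with useful work and executes the same number of instructions as the normal branch.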

[Figure: Use of Delayed Branch]

Controversy
- Quantitative
  - compare program sizes and execution speeds
- Qualitative
  - examine issues of high level language support and use of VLSI real estate
- Problems
  - No pair of RISC and CISC that are directly comparable
  - No definitive set of test programs
  - Difficult to separate hardware effects from compiler effects
  - Most comparisons done on “toy” rather than production machines
  - Most commercial devices are a mixture