chapter 2: advanced computer architecture

Instruction Set Principles and Examples UNIT - 2

Upload: tigabu-yaya

Post on 19-Feb-2018


Page 1: Chapter 2: Advanced computer Architecture

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 1101

Instruction Set Principles and Examples, UNIT - 2

Classification of Instruction Set Architectures

Instruction Set Design

Multiple implementations: 8086 → Pentium
ISAs evolve: MIPS-I, MIPS-II, MIPS-III, MIPS-IV, MIPS-V, MDMX, MIPS-32, MIPS-64

[Figure: the instruction set as the interface between software and hardware]

MIPS (originally an acronym for Microprocessor without Interlocked Pipeline Stages)

MIPS is a reduced instruction set computer (RISC) instruction set architecture (ISA) developed by MIPS Technologies (formerly MIPS Computer Systems, Inc.). The early MIPS architectures were 32-bit, and later versions were 64-bit. Multiple revisions of the MIPS instruction set exist, including MIPS I, MIPS II, MIPS III, MIPS IV, MIPS V, MIPS32, and MIPS64. The current revisions are MIPS32 (for 32-bit implementations) and MIPS64 (for 64-bit implementations). MIPS32 and MIPS64 define a control register set as well as the instruction set.

Typical Processor Execution Cycle

• Instruction Fetch: obtain instruction from program storage
• Instruction Decode: determine required actions and instruction size
• Operand Fetch: locate and obtain operand data
• Execute: compute result value or status
• Result Store: deposit results in register or storage for later use
• Next Instruction: determine successor instruction
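The execution cycle can be sketched as a loop over a toy machine. This is an illustration only: the tuple-based instruction format, the opcode names `LOADI`, `ADD` and `HALT`, and the function `run` are assumptions made for the example and do not belong to any real ISA.

```python
# Toy fetch-decode-execute loop (hypothetical three-opcode machine).
def run(program):
    regs = [0] * 4          # four general-purpose registers
    pc = 0                  # program counter
    while True:
        instr = program[pc]                 # instruction fetch
        op, a, b = instr                    # instruction decode
        next_pc = pc + 1                    # default successor instruction
        if op == "LOADI":
            regs[a] = b                     # execute + result store: regs[a] = constant
        elif op == "ADD":
            regs[a] = regs[a] + regs[b]     # operand fetch, execute, result store
        elif op == "HALT":
            return regs
        else:
            raise ValueError(f"unknown opcode {op!r}")
        pc = next_pc                        # next instruction

final_regs = run([("LOADI", 0, 5), ("LOADI", 1, 7), ("ADD", 0, 1), ("HALT", 0, 0)])
```

Real processors overlap these steps in a pipeline; the loop only shows their logical order.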

Instruction and Data Memory: Unified or Separate

[Figure: the programmer's view (a computer program as instructions such as ADD, SUBTRACT, AND, OR, COMPARE) and the computer's view (CPU, memory, I/O)]

Princeton (Von Neumann) Architecture
- Data and instructions mixed in the same unified memory
- Program as data
- Storage utilization
- Single memory interface

Harvard Architecture
- Data and instructions in separate memories
- Has advantages in certain high-performance implementations
- Can optimize each memory

Classifying Instruction Set Architectures

There are four types of internal storage used by the processor to store operands, explicitly and implicitly, for execution of a program:
1. Stack
2. Accumulator
3. Set of registers (register-memory)
4. Set of registers (register-register/load-store)

The operands in a stack architecture are implicitly on the top of the stack, and in an accumulator architecture one operand is implicitly the accumulator. The general-purpose register architectures (register-memory and register-register) have only explicit operands, either in registers or in memory locations.

Basic Addressing Classes

Declining cost of registers

Operand locations for four instruction set architecture classes

The arrows indicate whether the operand is an input or the result of the ALU operation, or both an input and result. Lighter shades indicate inputs, and the dark shade indicates the result. In (a), a Top Of Stack register (TOS) points to the top input operand, which is combined with the operand below. The first operand is removed from the stack, the result takes the place of the second operand, and TOS is updated to point to the result. All operands are implicit. In (b), the accumulator is both an implicit input operand and a result. In (c), one input operand is a register, one is in memory, and the result goes to a register. All operands are registers in (d) and, like the stack architecture, can be transferred to memory only via separate instructions: push or pop for (a) and load or store for (d).

Code Sequence for C = A + B

Stack   | Accumulator | Register-memory | Register-register
Push A  | Load A      | Load R1, A      | Load R1, A
Push B  | Add B       | Add R3, R1, B   | Load R2, B
Add     | Store C     | Store R3, C     | Add R3, R1, R2
Pop C   |             |                 | Store R3, C

The code sequence for C = A + B for four classes of instruction sets. Note that the Add instruction has implicit operands for stack and accumulator architectures, and explicit operands for register architectures. It is assumed that A, B, and C all belong in memory and that the values of A and B cannot be destroyed.
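The difference between implicit and explicit operands can be traced in a few lines. The sketch below models only the accumulator and register-register (load-store) sequences for C = A + B; the function names and the dictionary used as "memory" are assumptions made for illustration.

```python
# Trace C = A + B on two of the four classes (illustrative model only).
def run_accumulator(mem):
    acc = mem["A"]              # Load  A
    acc = acc + mem["B"]        # Add   B   (the accumulator is an implicit operand)
    mem["C"] = acc              # Store C
    return 3                    # number of instructions executed

def run_load_store(mem):
    regs = {}
    regs["R1"] = mem["A"]                   # Load  R1, A
    regs["R2"] = mem["B"]                   # Load  R2, B
    regs["R3"] = regs["R1"] + regs["R2"]    # Add   R3, R1, R2 (all operands explicit)
    mem["C"] = regs["R3"]                   # Store R3, C
    return 4                                # number of instructions executed

mem_acc = {"A": 2, "B": 3, "C": 0}
mem_ls = {"A": 2, "B": 3, "C": 0}
count_acc = run_accumulator(mem_acc)
count_ls = run_load_store(mem_ls)
```

Both versions leave A and B untouched, as the code sequence assumes; the load-store version needs one more instruction but only touches memory in loads and stores.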

Stack Architectures

Accumulator Architectures

Register-Set Architectures

Register-to-Register (Load-Store) Architectures

Register-to-Memory Architectures

Memory-to-Memory Architectures

Instruction Formats

Instruction Set Architecture (ISA)

• To command a computer's hardware, you must speak its language
• The words of a machine's language are called instructions, and its vocabulary is called the instruction set
• Once you learn one machine language, it is easy to pick up others:
  - There are few fundamental operations that all computers must provide
  - All designers have the same goal of finding a language that simplifies building the hardware and the compiler while maximizing performance and minimizing cost
• Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept
• The MIPS instruction set is used as a case study

Interface Design

A good interface:
• Lasts through many implementations (portability, compatibility)
• Is used in many different ways (generality)
• Provides convenient functionality to higher levels
• Permits an efficient implementation at lower levels

Design decisions must take into account:
• Technology
• Machine organization
• Programming languages
• Compiler technology
• Operating systems

[Figure: one interface with several uses above it and several implementations (imp 0, imp 1, ...) below it, over time]

Classifying Instruction Set Architectures

Accumulator Architecture
• Common in early stored-program computers, when hardware was so expensive
• Machine has only one register (accumulator) involved in all math and logical operations
• All operations assume the accumulator as a source operand and a destination for the operation, with the other operand stored in memory

Extended Accumulator Architecture
• Dedicated registers for specific operations, e.g. stack and array index registers, added
• The 8086 microprocessor is an example of such a special-purpose register architecture

General-Purpose Register Architecture
• MIPS is an example of such an architecture, where registers are not tied to a single role
• This type of instruction set can be further divided into:
  - Register-memory: allows for one operand to be in memory
  - Register-register (load-store): demands all operands to be in registers

Machine        | # general-purpose registers | Architecture style             | Year
Motorola 6800  | 2                           | Accumulator                    | 1974
DEC VAX        | 16                          | Register-memory, memory-memory | 1977
Intel 8086     | 1                           | Extended accumulator           | 1978
Motorola 68000 | 16                          | Register-memory                | 1980
Intel 80386    | 8                           | Register-memory                | 1985
PowerPC        | 32                          | Load-store                     | 1992
DEC Alpha      | 32                          | Load-store                     | 1992

Compact Code and Stack Architectures

• When memory is scarce, machines like the Intel 80x86 had variable-length instructions to match varying operand specifications and minimize code size
• Stack machines abandoned registers altogether, arguing that it is hard for compilers to use them efficiently
• Operands are pushed on a stack from memory, and the results have to be popped from the stack to memory
• Operations take their operands by default from the top of the stack and insert the results back onto the stack
• Stack machines simplify compilers and lend themselves to a compact instruction encoding, but limit compiler optimization (e.g. in math expressions)

Example: A = B + C
  Push AddressC   # Top=Top+4; Stack[Top]=Memory[AddressC]
  Push AddressB   # Top=Top+4; Stack[Top]=Memory[AddressB]
  add             # Stack[Top-4]=Stack[Top]+Stack[Top-4]; Top=Top-4
  Pop AddressA    # Memory[AddressA]=Stack[Top]; Top=Top-4

• Compact code is important for heralded network computers, where programs must be downloaded over the Internet (e.g. Java-based applications)
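The push/add/pop sequence for A = B + C can be run on a minimal stack machine model. The operation names, the tuple encoding and the function `run_stack_program` are hypothetical; a Python list stands in for the hardware stack, so the Top pointer is implicit.

```python
# Minimal stack machine for A = B + C (hypothetical instruction encoding).
def run_stack_program(program, memory):
    stack = []
    for op, *args in program:
        if op == "push":                # push Memory[addr] onto the stack
            stack.append(memory[args[0]])
        elif op == "add":               # pop two operands, push their sum
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif op == "pop":               # pop the top of stack into Memory[addr]
            memory[args[0]] = stack.pop()
        else:
            raise ValueError(f"unknown operation {op!r}")
    return memory

memory = run_stack_program(
    [("push", "C"), ("push", "B"), ("add",), ("pop", "A")],
    {"A": 0, "B": 10, "C": 32},
)
```

Note that `add` names no operands at all, which is where the compact encoding comes from.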

Other Types of Architecture

High-Level-Language Architecture
• In the 1960s, systems software was rarely written in high-level languages, and virtually every commercial operating system before Unix was written in assembly
• Some people blamed the code density on the instruction set rather than the programming language
• A machine design philosophy was advocated with the goal of making the hardware more like high-level languages
• The effectiveness of high-level languages, memory size limitations and the lack of efficient compilers doomed this philosophy to a historical footnote

Reduced Instruction Set Architecture
• With the recent development in compiler technology and expanded memory sizes, fewer programmers are using assembly-level coding
• Instruction set architecture became measurable by the way compilers, rather than programmers, use it
• RISC architecture favors simplifying hardware design over enriching the offered set of instructions, relying on compilers to effectively use them to perform complex operations
• Virtually all new architectures since 1982 follow the RISC philosophy of fixed instruction lengths, load-store operations, and limited addressing modes

Evolution of Instruction Sets

Single Accumulator (EDSAC, 1950)
  ↓
Accumulator + Index Registers (Manchester Mark I, IBM 700 series, 1953)
  ↓
Separation of Programming Model from Implementation
  • High-level Language Based (B5000, 1963)
  • Concept of a Family (IBM 360, 1964)
  ↓
General Purpose Register Machines
  • Complex Instruction Sets (VAX, Intel 432, 1977-80)
  • Load/Store Architecture (CDC 6600, Cray 1, 1963-76)
  ↓
RISC (MIPS, SPARC, HP-PA, IBM RS6000, ... 1987)

Register-Memory Architectures

Effect of the number of memory operands

# memory addresses | Max. number of operands | Examples
0                  | 3                       | SPARC, MIPS, PowerPC, ALPHA
1                  | 2                       | Intel 80x86, Motorola 68000
2                  | 2                       | VAX (also has 3-operand format)
3                  | 3                       | VAX (also has 2-operand format)

Interpreting Memory Addressing

• The address of a word matches the byte address of one of its 4 bytes
• The addresses of sequential words differ by 4 (word size in bytes)
• Word addresses are multiples of 4 (alignment restriction)
• Machines that use the address of the leftmost byte as the word address are called "Big Endian", and those that use the rightmost byte are called "Little Endian"
• Misalignment complicates memory access and causes programs to run slower (some machines do not allow misaligned memory access at all)
• Byte ordering can be a problem when exchanging data among different machines
• Byte addresses affect array index calculation, which must account for word addressing and the offset within the word

Object addressed | Aligned at byte offsets | Misaligned at byte offsets
Byte             | 0, 1, 2, 3, 4, 5, 6, 7  | Never
Half word        | 0, 2, 4, 6              | 1, 3, 5, 7
Word             | 0, 4                    | 1, 2, 3, 5, 6, 7
Double word      | 0                       | 1, 2, 3, 4, 5, 6, 7
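Byte order and alignment are easy to check with a short sketch. It assumes 4-byte words as in the text; the helper names `word_bytes` and `is_aligned` are made up for the example, and `int.to_bytes` is used to model how a word would be laid out in memory.

```python
# Byte order and alignment, sketched with Python's int.to_bytes.
WORD_SIZE = 4  # bytes per word, as assumed in the text

def word_bytes(value, byte_order):
    """Return the 4 bytes of a word in memory order for 'big' or 'little' endian."""
    return list(value.to_bytes(WORD_SIZE, byte_order))

def is_aligned(address, size):
    """An object of `size` bytes is aligned when its address is a multiple of size."""
    return address % size == 0

big = word_bytes(0x0A0B0C0D, "big")          # most significant byte at the lowest address
little = word_bytes(0x0A0B0C0D, "little")    # least significant byte at the lowest address
word_addresses = [i * WORD_SIZE for i in range(3)]   # sequential words differ by 4
```

The same 32-bit value produces reversed byte sequences in the two orderings, which is exactly the problem when data is exchanged between machines.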

Addressing Modes

• Addressing modes refer to how the location of an operand (its effective address) is specified
• Addressing modes have the ability to:
  - Significantly reduce instruction counts
  - Increase the average CPI
  - Increase the complexity of building a machine
• The VAX machine is used for benchmark data since it supports a wide range of memory addressing modes
• Famous addressing modes can be classified based on:
  - the source of the data: register, immediate, or memory
  - the address calculation: direct and indirect
• An indexed addressing mode is usually provided to allow efficient implementation of loops and array access

Example of Addressing Modes

Addressing mode | Example | Meaning | When used
Register | Add R4, R3 | Regs[R4] ← Regs[R4] + Regs[R3] | When a value is in a register
Immediate | Add R4, #3 | Regs[R4] ← Regs[R4] + 3 | For constants
Displacement | Add R4, 100(R1) | Regs[R4] ← Regs[R4] + Mem[100 + Regs[R1]] | Accessing local variables
Register indirect | Add R4, (R1) | Regs[R4] ← Regs[R4] + Mem[Regs[R1]] | Accessing using a pointer or a computed address
Indexed | Add R4, (R1 + R2) | Regs[R4] ← Regs[R4] + Mem[Regs[R1] + Regs[R2]] | Sometimes useful in array addressing: R1 = base of the array, R2 = index amount
Direct or absolute | Add R4, (1001) | Regs[R4] ← Regs[R4] + Mem[1001] | Sometimes useful for accessing static data; address constant may need to be large
Memory indirect or memory deferred | Add R4, @(R3) | Regs[R4] ← Regs[R4] + Mem[Mem[Regs[R3]]] | If R3 is the address of the pointer p, then the mode yields *p
Autoincrement | Add R4, (R2)+ | Regs[R4] ← Regs[R4] + Mem[Regs[R2]]; Regs[R2] ← Regs[R2] + d | Useful for stepping through arrays within a loop. R2 points to the start of the array; each reference increments R2 by d
Autodecrement | Add R4, -(R2) | Regs[R2] ← Regs[R2] - d; Regs[R4] ← Regs[R4] + Mem[Regs[R2]] | Same use as autoincrement. Autodecrement/increment can also act as push/pop to implement a stack
Scaled | Add R4, 100(R2)[R3] | Regs[R4] ← Regs[R4] + Mem[100 + Regs[R2] + Regs[R3] * d] | Used to index arrays
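How each mode locates its operand can be written out directly. The function below is a sketch under simple assumptions: registers and memory are Python dictionaries, the mode names are strings chosen for this example, and the autoincrement/autodecrement modes are left out because they also modify a register.

```python
# Operand lookup for several addressing modes (toy register file and memory).
def operand(mode, regs, mem, reg=None, const=0, index=None, d=4):
    if mode == "register":
        return regs[reg]
    if mode == "immediate":
        return const
    if mode == "displacement":          # Mem[const + Regs[reg]]
        return mem[const + regs[reg]]
    if mode == "register_indirect":     # Mem[Regs[reg]]
        return mem[regs[reg]]
    if mode == "indexed":               # Mem[Regs[reg] + Regs[index]]
        return mem[regs[reg] + regs[index]]
    if mode == "direct":                # Mem[const]
        return mem[const]
    if mode == "memory_indirect":       # Mem[Mem[Regs[reg]]]
        return mem[mem[regs[reg]]]
    if mode == "scaled":                # Mem[const + Regs[reg] + Regs[index] * d]
        return mem[const + regs[reg] + regs[index] * d]
    raise ValueError(f"unsupported mode {mode!r}")

regs = {"R1": 8, "R2": 4, "R3": 2}
mem = {8: 100, 12: 7, 16: 55, 20: 9, 100: 77, 108: 33}
```

For instance, `operand("displacement", regs, mem, reg="R1", const=100)` reads `mem[108]`, and the memory indirect mode performs two memory reads, which hints at why richer modes raise the average CPI.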

Addressing Mode for Signal Processing

• DSPs offer special addressing modes to better serve popular algorithms

Modulo addressing
• Since DSPs deal with continuous data streams, circular buffers are widely used
• Circular or modulo addressing allows automatic increment and decrement, and resets the pointer when reaching the end of the buffer

Reverse addressing
• The resulting address is the bit-reversed order of the current address
• Reverse addressing mode expedites the access, which otherwise requires a number of logical instructions or extra memory accesses

Fast Fourier Transform (bit-reversed order)
0 (000) → 0 (000)
1 (001) → 4 (100)
2 (010) → 2 (010)
3 (011) → 6 (110)
4 (100) → 1 (001)
5 (101) → 5 (101)
6 (110) → 3 (011)
7 (111) → 7 (111)

• Special features require either hand coding or a compiler that uses such features (GNU would not be a good choice)
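Both DSP addressing modes reduce to small address computations. The sketch below models them in software; the function names are hypothetical, and a real DSP performs these updates in its address-generation hardware rather than with instructions.

```python
# Modulo (circular) and bit-reversed address generation, as plain functions.
def next_circular(pointer, start, length, step=1):
    """Advance a pointer inside the buffer [start, start + length), wrapping at the ends."""
    return start + (pointer - start + step) % length

def bit_reverse(index, bits):
    """Reverse the low `bits` bits of index, e.g. 001 -> 100 for bits=3."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result

fft_order = [bit_reverse(i, 3) for i in range(8)]
```

The software version of `bit_reverse` needs a loop of shifts and masks per address, which is the overhead the reverse addressing mode removes.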

Operations of the Computer Hardware

"There must certainly be instructions for performing the fundamental arithmetic operations."
Burks, Goldstine and von Neumann, 1947

• Assembly language is a symbolic representation of what the processor actually understands
• The MIPS assembler allows only one instruction per line and ignores comments following # until the end of the line

Example: translation of a segment of a C program to MIPS assembly instructions

C:    f = (g + h) - (i + j)

MIPS:
  add t0, g, h    # temp variable t0 contains "g + h"
  add t1, i, j    # temp variable t1 contains "i + j"
  sub f, t0, t1   # f = t0 - t1 = (g + h) - (i + j)

Operations in the Instruction Set

Operator type          | Examples
Arithmetic and logical | Integer arithmetic and logical operations: add, and, subtract, or
Data transfer          | Loads-stores (move instructions on machines with memory addressing)
Control                | Branch, jump, procedure call and return, trap
System                 | Operating system call, virtual memory management instructions
Floating point         | Floating-point instructions: add, multiply
Decimal                | Decimal add, decimal multiply, decimal-to-character conversion
String                 | String move, string compare, string search
Graphics               | Pixel operations, compression/decompression operations

• Arithmetic, logical, data transfer and control are almost standard categories for all machines
• System instructions are required for multi-programming environments, although support for system functions varies
• Decimal and string instructions can be primitives, e.g. on the IBM 360 and the VAX
• Support for floating point, decimal, string and graphics can be optional, sometimes provided via a co-processor
• Some machines rely on the compiler to synthesize special operations such as string handling from simpler instructions

Operations for Media & Signal Processing

• Single instruction multiple data (SIMD) and vector instructions are often supported in DSPs, which are commonly used in multimedia and signal processing applications
• Partitioned add (integer)
  - Performs multiple 16-bit additions on a 64-bit ALU, since most data are narrow
  - Increases ALU throughput for multimedia applications
• Paired single operations (float)
  - Allow the same register to act as two operands to the same operation
  - Handy in dealing with vertices and coordinates
• Multiply and accumulate
  - Very handy for calculating dot products of vectors (signal processing) and matrix multiplication
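A partitioned add and a multiply-accumulate chain can be modelled in software as follows. The lane width of 16 bits inside a 64-bit word is an assumption chosen for the sketch, as are all function names; real SIMD hardware performs the lane additions in parallel and may saturate instead of wrapping around.

```python
# Partitioned add: four independent 16-bit lanes packed in one 64-bit word.
LANE_BITS = 16
LANES = 4
LANE_MASK = (1 << LANE_BITS) - 1

def pack(values):
    """Pack four 16-bit values into one 64-bit integer (lane 0 in the low bits)."""
    word = 0
    for lane, value in enumerate(values):
        word |= (value & LANE_MASK) << (lane * LANE_BITS)
    return word

def unpack(word):
    """Split a 64-bit integer back into its four 16-bit lanes."""
    return [(word >> (lane * LANE_BITS)) & LANE_MASK for lane in range(LANES)]

def partitioned_add(x, y):
    """Add lane by lane; carries never cross a lane boundary (wraparound per lane)."""
    return pack([(a + b) & LANE_MASK for a, b in zip(unpack(x), unpack(y))])

def multiply_accumulate(xs, ys):
    """Dot product written as a chain of multiply-accumulate steps."""
    acc = 0
    for a, b in zip(xs, ys):
        acc += a * b
    return acc

lanes = unpack(partitioned_add(pack([1, 2, 3, 0xFFFF]), pack([10, 20, 30, 1])))
```

The last lane overflows and wraps to 0 without disturbing its neighbours, which is the property that distinguishes a partitioned add from an ordinary 64-bit add.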

Frequency of Operations Usage

• Make the common case fast by focusing on these operations
• The most widely executed instructions are the simple operations of an instruction set
• The following is the average usage in SPECint92 on Intel 80x86

Rank  | 80x86 instruction      | Integer average (% total executed)
1     | Load                   | 22%
2     | Conditional branch     | 20%
3     | Compare                | 16%
4     | Store                  | 12%
5     | Add                    | 8%
6     | And                    | 6%
7     | Sub                    | 5%
8     | Move register-register | 4%
9     | Call                   | 1%
10    | Return                 | 1%
Total |                        | 96%

Control Flow Instructions

• Jump: for unconditional change in the control flow
• Branch: for conditional change in the control flow
• Procedure calls and returns

Data is based on SPEC on Alpha

Destination Address Definition

• Relative addressing with respect to the program counter proved to be the best choice for forward and backward branches or jumps (load-address independent)
• To allow for dynamic loading of library routines, register indirect addressing allows addresses to be loaded in special registers (e.g. virtual functions in C++ and system calls in a case statement)

Data is based on SPEC on Alpha

Condition Evaluation

• Compare-and-branch can be efficient if the majority of conditions are comparisons with zero
• Remember to focus on the common case

Based on SPEC on MIPS

Frequency of Types of Comparison

Data is based on SPEC on Alpha

• Different benchmarks and machines set new design priorities
• DSPs support a repeat instruction for "for" loops (vectors) using registers

Supporting Procedures

Execution of a procedure follows these steps:
• Store parameters in a place accessible to the procedure
• Transfer control to the procedure
• Acquire the storage resources needed for the procedure
• Perform the desired task
• Store the result value in a place accessible to the calling program
• Return control to the point of origin

The hardware provides a program counter to trace instruction flow and manage transfer of control.

Parameter Passing
• Registers can be used for passing a small number of parameters
• A stack is used to spill registers of the current context, making room for the called procedure to run and allowing large parameters to be passed
• Storage of machine state can be performed by caller or callee
• Handling of shared variables is important to ensure correct semantics and thus requires clear specifications in the library interface
• Global variables stored in registers need careful handling

Type and Size of Operands

• The type of an operand is designated by encoding it in the instruction's operation code
• The type of an operand, e.g. single-precision float, effectively gives its size
• Common operand types include character, half-word and word-size integers, and single- and double-precision floating point
• Characters are almost always in ASCII, integers are in 2's complement, and floating point is in IEEE 754
• The 16-bit Unicode used in Java is gaining popularity due to its support for international character sets
• For business applications, some architectures support a decimal format in binary-coded decimal (BCD)
• Depending on the size of the word, the complexity of handling different operand types differs
• DSPs offer fixed-point data types to support high-precision floating point arithmetic and to allow sharing a single exponent for multiple numbers
• For graphics applications, vertex and pixel operands are added features

Size of Operands

• The double-word data type is used for double-precision floating point operations and for address storage in machines with a 64-bit wide address bus
• Words are used for integer operations and for 32-bit address bus machines
• Because of the mix in SPEC, word and double-word data types dominate

Frequency of reference by size, based on SPEC2000 on Alpha

Instruction Representation

• Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary, or base 2)
• Numbers are stored in computers as a series of high and low electronic signals (binary numbers)
• Binary digits are called bits and are considered the atom of computing
• Each piece of an instruction is a number, and placing these numbers together forms the instruction
• The assembler translates the symbolic assembly instructions into machine language instructions (machine code)

Example:
  Assembly:                add $t0, $s1, $s2
  M/C language (decimal):  0      | 17     | 18     | 8      | 0      | 32
  M/C language (binary):   000000 | 10001  | 10010  | 01000  | 00000  | 100000
  Field widths:            6 bits | 5 bits | 5 bits | 5 bits | 5 bits | 6 bits

Note: the MIPS compiler by default maps $s0..$s7 to registers 16-23 and $t0..$t7 to registers 8-15
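Placing the field numbers together is a matter of shifting and OR-ing. The sketch below packs the six fields of a register-format instruction using the 6/5/5/5/5/6-bit layout; `encode_r` is a name made up for the example, and the register numbers follow the standard MIPS convention ($s1 = 17, $s2 = 18, $t0 = 8).

```python
# Pack the six fields of a MIPS R-format instruction into one 32-bit word.
def encode_r(op, rs, rt, rd, shamt, funct):
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct

# add $t0, $s1, $s2 -> op=0, rs=$s1=17, rt=$s2=18, rd=$t0=8, shamt=0, funct=32
word = encode_r(0, 17, 18, 8, 0, 32)
bits = format(word, "032b")     # the instruction as a 32-character bit string
```

Printing `hex(word)` gives the form in which the instruction would usually appear in a memory dump.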

Encoding an Instruction Set

• Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation
• The operation is typically specified in one field, called the opcode
• The addressing mode for the operand can be encoded with the operation, or specified through a separate identifier when a large number of modes is supported
• The architecture must balance several competing factors:
  - Desire to support as many registers and addressing modes as possible
  - Effect of operand specification on the size of the instruction (program)
  - Desire to simplify instruction fetching and decoding during execution
• Fixed-size instruction encoding simplifies the CPU design while limiting the addressing modes supported
• An architect caring about code size can use variable-size encoding
• A hybrid approach is to allow variability by supporting multiple-sized instructions

Encoding Examples

MIPS Instruction Format

Register-format instructions

  op     | rs     | rt     | rd     | shamt  | funct
  6 bits | 5 bits | 5 bits | 5 bits | 5 bits | 6 bits

  op:    Basic operation of the instruction, traditionally called the opcode
  rs:    The first register source operand
  rt:    The second register source operand
  rd:    The register destination operand; it gets the result of the operation
  shamt: Shift amount
  funct: This field selects the specific variant of the operation of the op field

Immediate-type instructions

  op     | rs     | rt     | address
  6 bits | 5 bits | 5 bits | 16 bits

• Some instructions need longer fields than provided, e.g. for large-value constants
• The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register

Example: lw $t0, 32($s3)   # Temporary register $t0 gets A[8]

Instruction | Format | op | rs  | rt  | rd  | shamt | funct | address
add         | R      | 0  | reg | reg | reg | 0     | 32    | N/A
sub         | R      | 0  | reg | reg | reg | 0     | 34    | N/A
lw          | I      | 35 | reg | reg | N/A | N/A   | N/A   | address
sw          | I      | 43 | reg | reg | N/A | N/A   | N/A   | address
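The immediate format and its signed 16-bit range can be checked with a small encoder and decoder. The names `encode_i` and `decode_i` are assumptions for this sketch; the field positions follow the 6/5/5/16-bit layout, and the register numbers use the standard MIPS convention ($s3 = 19, $t0 = 8).

```python
# Pack and unpack a MIPS I-format instruction (op, rs, rt, 16-bit address).
def encode_i(op, rs, rt, offset):
    if not -(1 << 15) <= offset < (1 << 15):
        raise ValueError("offset does not fit in a signed 16-bit field")
    return (op << 26) | (rs << 21) | (rt << 16) | (offset & 0xFFFF)

def decode_i(word):
    offset = word & 0xFFFF
    if offset & 0x8000:          # sign-extend the 16-bit field
        offset -= 1 << 16
    return (word >> 26) & 0x3F, (word >> 21) & 0x1F, (word >> 16) & 0x1F, offset

# lw $t0, 32($s3) -> op=35, rs=$s3=19, rt=$t0=8, offset=32
lw_word = encode_i(35, 19, 8, 32)
```

An offset of 32768 is rejected by `encode_i`, which is the software view of the ±2^15-byte reach of a load word instruction.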

The Stored Program Concept

• Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept
• Today's computers are built on two key principles:
  - Instructions are represented as numbers
  - Programs can be stored in memory to be read or written just like numbers
• The power of the concept: memory can contain
  - the source code for an editor
  - the compiled m/c code for the editor
  - the text that the compiled program is using
  - the compiler that generated the code

[Figure: a processor connected to a memory that holds an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and the source code in C for the editor program]

Compiling if-then-else in MIPS

Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiled MIPS code for the following C if statement?

  if (i == j) f = g + h; else f = g - h;

[Figure: flowchart. The test i == j leads to f = g + h when i = j, and to the label Else with f = g - h when i ≠ j; both paths meet at the label Exit]

MIPS:
        bne $s3, $s4, Else    # go to Else if i ≠ j
        add $s0, $s1, $s2     # f = g + h (skipped if i ≠ j)
        j   Exit
  Else: sub $s0, $s1, $s2     # f = g - h (skipped if i = j)
  Exit:
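The effect of the branch and jump can be followed by stepping through the four instructions with an explicit program counter. This is a sketch, not a MIPS simulator: the tuple encoding, the label table and the function `run_if_else` are assumptions, and instruction positions stand in for addresses.

```python
# Step through the bne / add / j / sub sequence with an explicit program counter.
def run_if_else(i, j, g, h):
    regs = {"$s0": 0, "$s1": g, "$s2": h, "$s3": i, "$s4": j}
    program = [
        ("bne", "$s3", "$s4", "Else"),
        ("add", "$s0", "$s1", "$s2"),
        ("j", "Exit"),
        ("sub", "$s0", "$s1", "$s2"),       # label Else
    ]
    labels = {"Else": 3, "Exit": 4}         # label -> instruction index
    pc = 0
    while pc < len(program):
        op, *args = program[pc]
        pc += 1                             # fall through by default
        if op == "bne" and regs[args[0]] != regs[args[1]]:
            pc = labels[args[2]]            # branch taken
        elif op == "j":
            pc = labels[args[0]]            # unconditional jump
        elif op == "add":
            regs[args[0]] = regs[args[1]] + regs[args[2]]
        elif op == "sub":
            regs[args[0]] = regs[args[1]] - regs[args[2]]
    return regs["$s0"]                      # value of f
```

With i equal to j the `sub` is never reached, and with i different from j the `add` and `j` are skipped, matching the comments in the assembly listing.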

Typical Compilation

Major Types of Optimization

Optimization name | Explanation | Frequency
High-level | At or near source level; machine-independent |
  Procedure integration | Replace procedure call by procedure body | N.M.
Local | Within straight-line code |
  Common subexpression elimination | Replace two instances of the same computation by a single copy | 18%
  Constant propagation | Replace all instances of a variable that is assigned a constant with the constant | 22%
  Stack height reduction | Rearrange expression tree to minimize resources needed for expression evaluation | N.M.
Global | Across a branch |
  Global common subexpression elimination | Same as local, but this version crosses branches | 13%
  Copy propagation | Replace all instances of a variable A that has been assigned X (i.e. A = X) with X | 11%
  Code motion | Remove code from a loop that computes the same value each iteration of the loop | 16%
  Induction variable elimination | Simplify/eliminate array-addressing calculations within loops | 2%
Machine-dependent | Depends on machine knowledge |
  Strength reduction | Many examples, such as replacing multiply by a constant with adds and shifts | N.M.
  Pipeline scheduling | Reorder instructions to improve pipeline performance | N.M.

N.M. = not measured
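Two of the listed optimizations are easy to show by hand. The sketch below writes out what a compiler might produce for strength reduction (a multiply by a constant turned into shifts and adds) and for code motion (a loop-invariant product hoisted out of the loop); both functions are illustrative and are not taken from any particular compiler.

```python
# Strength reduction: multiply by a known constant using only shifts and adds.
def multiply_by_constant(x, constant):
    """Compute x * constant (constant >= 0) from the set bits of the constant."""
    if constant < 0:
        raise ValueError("this sketch handles non-negative constants only")
    result = 0
    shift = 0
    while constant:
        if constant & 1:
            result += x << shift    # add x * 2**shift for each set bit
        constant >>= 1
        shift += 1
    return result

# Code motion: the loop-invariant product a * b is computed once, outside the loop.
def scale_all(values, a, b):
    factor = a * b                  # hoisted out of the loop
    return [v * factor for v in values]
```

For a constant such as 10 the first function performs two shifted additions, (x << 1) + (x << 3), in place of one multiply.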

Effect of Compiler Optimization

[Figure: program and compiler optimization level; measurements taken on MIPS]

Level 0: non-optimized code
Level 1: local optimization
Level 2: global optimization, s/w pipelining
Level 3: adds procedure integration

IntelQs MM and PoerPC Altiec have small vector processing capabilitiestargeting Multimedia applications (to speed up graphics)

Intel added ne set of instructions called Streaming SIM lttension

A maOor advantage of vector computers is hiding latency of memory accessby loading multiple elements and then overlapping eecution ith data

transfer

ector computers typically have strided and2or gather2scatter addressing to

perform operations on distant memory locations Strided addressing allos memory access in increment larger than one

ather2scatter addressing is similar to register indirect mode here theaddress are stored instead of the data

Supporting vector operation ithout strided addressing such as IntelQs MMlimits the potential speedup

Such limited support for vector processing ma0es the use of vectori-ing compiler optimi-ation unpopular and restrict its scope to hand coded routines

Compiler Support for Multimedia Instramp

SIM instructions on MM and Altiec tend to be solutions not primitivesSIM instructions on MM and Altiec tend to be solutions not primitives
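The two addressing styles named above can be illustrated with a short sketch. It assumes a flat word-addressed memory modeled as a Python list; the function names `strided_load`, `gather`, and `scatter` are hypothetical and do not correspond to any real instruction set.

```python
def strided_load(memory, base, stride, count):
    """Read `count` elements starting at `base`, stepping `stride` slots each time."""
    return [memory[base + i * stride] for i in range(count)]


def gather(memory, addresses):
    """Read the elements whose addresses are listed in `addresses`."""
    return [memory[a] for a in addresses]


def scatter(memory, addresses, values):
    """Write each value to the address paired with it."""
    for a, v in zip(addresses, values):
        memory[a] = v


# A 4x4 matrix stored row by row in a flat 16-word memory: 100, 101, ..., 115.
memory = list(range(100, 116))

column = strided_load(memory, base=1, stride=4, count=4)  # second column of the matrix
picked = gather(memory, [7, 2, 9])                         # elements in arbitrary order
scatter(memory, [0, 15], [-1, -2])                         # write to distant locations
```

With unit-stride access only, reading a matrix column as in `column` would need one separate load per element, which is the limitation the slide attributes to MMX.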

Starting a Program

[Figure: C program -> Compiler -> Assembly language program -> Assembler -> Object: machine language module, together with Object: library routine (machine language) -> Linker -> Executable: machine language program -> Loader -> Memory]

Linker
- Place code & data modules symbolically in memory
- Determine the address of data & instruction labels
- Patch both internal & external references

Object files for Unix typically contain:
- Header: size & position of components
- Text segment: machine code
- Data segment: static and dynamic variables
- Relocation info: identify absolute memory references
- Symbol table: name & location of labels, procedures and variables
- Debugging info: mapping source to object code, break points, etc.

Loading Executable Program

[Figure: memory layout with regions Reserved, Text (pc), Static data ($gp), Dynamic data, and Stack ($sp)]

To load an executable, the operating system follows these steps:
- Reads the executable file header to determine the size of the text and data segments
- Creates an address space large enough for the text and data
- Copies the instructions and data from the executable file into memory
- Copies the parameters (if any) to the main program onto the stack
- Initializes the machine registers and sets the stack pointer to the first free location
- Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program

Instruction Set Design Issues

- Number of Addresses
- Flow of Control
- Operand Types
- Addressing Modes
- Instruction Types
- Instruction Formats

Number of Addresses

Four categories:
- 3-address machines
  - 2 for the source operands and one for the result
- 2-address machines
  - One address doubles as source and result
- 1-address machines
  - Accumulator machines
  - Accumulator is used for one source and result
- 0-address machines
  - Stack machines
  - Operands are taken from the stack
  - Result goes onto the stack

Number of Addresses (cont.)

Three-address machines
- Two for the source operands, one for the result
- RISC processors use three addresses
- Sample instructions:

    add  dest,src1,src2    M(dest) = [src1] + [src2]
    sub  dest,src1,src2    M(dest) = [src1] - [src2]
    mult dest,src1,src2    M(dest) = [src1] * [src2]

Three addresses: Operand 1, Operand 2, Result
Example: a = b + c

Three-address instruction formats are not common because they require a relatively long instruction format to hold the three address references.

Number of Addresses (cont.)

Example
C statement:

    A = B + C * D - E + F + A

Equivalent code:

    mult T,C,D    T = C*D
    add  T,T,B    T = B + C*D
    sub  T,T,E    T = B + C*D - E
    add  T,T,F    T = B + C*D - E + F
    add  A,T,A    A = B + C*D - E + F + A
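The three-address sequence above can be traced with a small interpreter. This is only a sketch: it assumes a hypothetical machine whose operands are named memory cells held in a Python dict, and the names `run_three_address` and `OPS`, as well as the sample values, are illustrative rather than taken from the slides.

```python
import operator

# Map each mnemonic to the arithmetic it performs.
OPS = {"add": operator.add, "sub": operator.sub, "mult": operator.mul}


def run_three_address(program, memory):
    """Execute (op, dest, src1, src2) tuples: M(dest) = [src1] op [src2]."""
    for op, dest, src1, src2 in program:
        memory[dest] = OPS[op](memory[src1], memory[src2])
    return memory


program = [
    ("mult", "T", "C", "D"),  # T = C*D
    ("add", "T", "T", "B"),   # T = B + C*D
    ("sub", "T", "T", "E"),   # T = B + C*D - E
    ("add", "T", "T", "F"),   # T = B + C*D - E + F
    ("add", "A", "T", "A"),   # A = B + C*D - E + F + A
]

# Hypothetical initial values, chosen only to make the trace easy to check.
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
run_three_address(program, memory)
```

Every instruction names all three of its operands, which is why the encoding needs room for three address fields.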

Number of Addresses (cont.)

Two-address machines
- One address doubles (for source operand & result)
- Last example makes a case for it
  - Address T is used twice
- Sample instructions:

    load dest,src    M(dest) = [src]
    add  dest,src    M(dest) = [dest] + [src]
    sub  dest,src    M(dest) = [dest] - [src]
    mult dest,src    M(dest) = [dest] * [src]

Two addresses: one address doubles as operand and result
Example: a = a + b

The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation.

Number of Addresses (cont.)

Example
C statement:

    A = B + C * D - E + F + A

Equivalent code:

    load T,C    T = C
    mult T,D    T = C*D
    add  T,B    T = B + C*D
    sub  T,E    T = B + C*D - E
    add  T,F    T = B + C*D - E + F
    add  A,T    A = B + C*D - E + F + A

Number of Addresses (cont.)

One-address machines
- Use special set of registers called accumulators
  - Specify one source operand & receive the result
- Called accumulator machines
- Sample instructions:

    load  addr    accum = [addr]
    store addr    M[addr] = accum
    add   addr    accum = accum + [addr]
    sub   addr    accum = accum - [addr]
    mult  addr    accum = accum * [addr]

Number of Addresses (cont.)

Example
C statement:

    A = B + C * D - E + F + A

Equivalent code:

    load  C    load C into accum
    mult  D    accum = C*D
    add   B    accum = C*D + B
    sub   E    accum = B + C*D - E
    add   F    accum = B + C*D - E + F
    add   A    accum = B + C*D - E + F + A
    store A    store accum contents in A
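The accumulator sequence above can be checked the same way. This sketch assumes a hypothetical one-address machine with a single accumulator and a dict of named memory cells; `run_accumulator` and the sample values are illustrative only.

```python
def run_accumulator(program, memory):
    """Execute (op, addr) pairs; the accumulator is the implicit second operand."""
    accum = 0
    for op, addr in program:
        if op == "load":
            accum = memory[addr]
        elif op == "store":
            memory[addr] = accum
        elif op == "add":
            accum = accum + memory[addr]
        elif op == "sub":
            accum = accum - memory[addr]
        elif op == "mult":
            accum = accum * memory[addr]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return accum


program = [
    ("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
    ("add", "F"), ("add", "A"), ("store", "A"),
]

# Hypothetical initial values, chosen only to make the trace easy to check.
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
final_accum = run_accumulator(program, memory)
```

Each instruction carries one address, but every arithmetic step still reads memory, which is the source of the high memory traffic usually attributed to accumulator machines.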

Number of Addresses (cont.)

Zero-address machines
- Stack supplies operands and receives the result
  - Special instructions to load and store use an address
- Called stack machines (Ex: HP 3000, Burroughs B5500)
- Sample instructions:

    push addr    push([addr])
    pop  addr    pop([addr])
    add          push(pop + pop)
    sub          push(pop - pop)
    mult         push(pop * pop)

Number of Addresses (cont.)

Example
C statement:

    A = B + C * D - E + F + A

Equivalent code:

    push E
    push C
    push D
    mult
    push B
    add
    sub
    push F
    add
    push A
    add
    pop  A
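The stack sequence above depends on operand order for `sub`, so a trace helps. This sketch assumes the convention given in the sample instructions, push(pop - pop), where the first value popped is the left operand. The name `run_stack` and the sample values are hypothetical.

```python
def run_stack(program, memory):
    """Execute zero-address instructions; only push and pop carry an address."""
    stack = []
    for instr in program:
        op = instr[0]
        if op == "push":
            stack.append(memory[instr[1]])
        elif op == "pop":
            memory[instr[1]] = stack.pop()
        else:
            first = stack.pop()   # top of stack
            second = stack.pop()  # element below it
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)  # push(pop - pop)
            elif op == "mult":
                stack.append(first * second)
            else:
                raise ValueError(f"unknown instruction: {op}")
    return stack


program = [
    ("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
    ("push", "B"), ("add",), ("sub",),
    ("push", "F"), ("add",), ("push", "A"), ("add",), ("pop", "A"),
]

# Hypothetical initial values, chosen only to make the trace easy to check.
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
remaining = run_stack(program, memory)
```

Pushing E first is what lets `sub` compute (B + C*D) - E: by the time `sub` runs, the sum is on top and E is directly below it.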

Load/Store Architecture

- Instructions expect operands in internal processor registers
  - Special LOAD and STORE instructions move data between registers and memory
- RISC uses this architecture
- Reduces instruction length

Load/Store Architecture (cont.)

Sample instructions:

    load  Rd,addr       Rd = [addr]
    store addr,Rs       (addr) = Rs
    add   Rd,Rs1,Rs2    Rd = Rs1 + Rs2
    sub   Rd,Rs1,Rs2    Rd = Rs1 - Rs2
    mult  Rd,Rs1,Rs2    Rd = Rs1 * Rs2

Number of Addresses (cont.)

Example
C statement:

    A = B + C * D - E + F + A

Equivalent code:

    load  R1,B
    load  R2,C
    load  R3,D
    load  R4,E
    load  R5,F
    load  R6,A
    mult  R2,R2,R3
    add   R2,R2,R1
    sub   R2,R2,R4
    add   R2,R2,R5
    add   R2,R2,R6
    store A,R2

Flow of Control

- Default is sequential flow
- Several instructions alter this default execution
  - Branches
    - Unconditional
    - Conditional
    - Delayed branches
  - Procedure calls
    - Delayed procedure calls

Flow of Control (cont.)

- Branches
  - Unconditional
    - Absolute address
    - PC-relative
      - Target address is specified relative to PC contents
      - Relocatable code
  - Example: MIPS
    - Absolute address

        j target

    - PC-relative

        b target
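The difference between the two forms shows up in how the target address is computed. The sketch below follows the classic 32-bit MIPS encodings as commonly documented: `j` carries a 26-bit word index that is combined with the upper four bits of PC + 4, and `b` carries a signed 16-bit word offset relative to PC + 4. The function names are hypothetical.

```python
def jump_target(pc, instr_index):
    """Target of `j`: upper 4 bits of PC+4 joined with the 26-bit index shifted left 2."""
    return ((pc + 4) & 0xF0000000) | ((instr_index & 0x03FFFFFF) << 2)


def branch_target(pc, offset16):
    """Target of `b`: PC+4 plus the sign-extended 16-bit offset, counted in words."""
    if offset16 & 0x8000:          # sign-extend the 16-bit field
        offset16 -= 0x10000
    return (pc + 4 + (offset16 << 2)) & 0xFFFFFFFF


pc = 0x00400000
absolute = jump_target(pc, 0x0100004)       # same target wherever the jump sits in this region
forward = branch_target(pc, 3)              # three instructions past the delay slot
backward = branch_target(0x00400010, 0xFFFB)  # offset field encodes -5
```

Because the branch offset is relative to the PC, moving the code to a different base address leaves the encoded offset valid, which is what makes PC-relative code relocatable.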

Flow of Control (cont.)

[Figure: Pentium example]

Flow of Control (cont.)

- Branches
  - Conditional
    - Jump is taken only if the condition is met
  - Two types
    - Set-Then-Jump
      - Condition testing is separated from branching
      - Condition code registers are used to convey the condition test result
      - Condition code registers keep a record of the status of the last ALU operation, such as overflow condition
    - Example: Pentium code

        cmp AX,BX     compare AX and BX
        je  target    jump if equal

Flow of Control (cont.)

    - Test-and-Jump
      - Single instruction performs condition testing and branching
    - Example: MIPS instruction

        beq Rsrc1,Rsrc2,target

      Jumps to target if Rsrc1 = Rsrc2

- Delayed branching
  - Control is transferred after executing the instruction that follows the branch instruction
    - This instruction slot is called the delay slot
  - Improves efficiency
  - Highly pipelined RISC processors support it

Flow of Control (cont.)

- Procedure calls
  - Facilitate modular programming
  - Require two pieces of information to return
    - End of procedure
      - Pentium uses the ret instruction
      - MIPS uses the jr instruction
    - Return address
      - In a (special) register: MIPS allows any general-purpose register
      - On the stack: Pentium

Flow of Control (cont.)

[Figure: delay slot]

Parameter Passing

- Two basic techniques
  - Register-based (e.g., PowerPC, MIPS)
    - Internal registers are used
      - Faster
      - Limit the number of parameters
      - Recursive procedure
  - Stack-based (e.g., Pentium)
    - Stack is used
      - More general

2. Operand Types

- Instructions support basic data types
  - Characters
  - Integers
  - Floating-point
- Instruction overload
  - Same instruction for different data types
  - Example: Pentium

      mov AL,address     loads an 8-bit value
      mov AX,address     loads a 16-bit value
      mov EAX,address    loads a 32-bit value

Operand Types (cont.)

- Separate instructions
  - Instructions specify the operand size
  - Example: MIPS

      lb Rdest,address    loads a byte
      lh Rdest,address    loads a halfword (16 bits)
      lw Rdest,address    loads a word (32 bits)
      ld Rdest,address    loads a doubleword (64 bits)

  - Similar instructions for store
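What "the instruction specifies the operand size" means can be sketched by reading the same memory bytes at different widths. The helper `load` below is hypothetical; it assumes big-endian byte order and sign extension, which matches the signed MIPS loads but is only one of the byte orders a MIPS system can use.

```python
def load(memory, address, size, signed=True, byteorder="big"):
    """Read `size` bytes at `address` and interpret them as one integer."""
    return int.from_bytes(memory[address:address + size], byteorder, signed=signed)


# Eight bytes of hypothetical memory contents.
memory = bytes([0xFF, 0xFE, 0x00, 0x01, 0x00, 0x00, 0x00, 0x2A])

byte_value = load(memory, 0, 1)                     # like lb: one byte, sign-extended
half_value = load(memory, 0, 2)                     # like lh: 16 bits
word_value = load(memory, 4, 4)                     # like lw: 32 bits
unsigned_byte = load(memory, 0, 1, signed=False)    # zero-extended variant
```

The address is the same in the first two calls; only the width chosen by the instruction changes the value that lands in the register.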

3. Addressing Modes

- How the operands are specified
  - Operands can be in three places
    - Registers
      - Register addressing mode
    - Part of instruction
      - Constant
      - Immediate addressing mode
      - All processors support these two addressing modes
    - Memory
      - Difference between RISC and CISC
      - CISC supports a large variety of addressing modes
      - RISC follows load/store architecture

4. Instruction Types

- Several types of instructions
  - Data movement
    - Pentium: mov dest,src
    - Some do not provide direct data movement instructions
    - Indirect data movement

        add Rdest,Rsrc,0    Rdest = Rsrc + 0

  - Arithmetic and Logical
    - Arithmetic
      - Integer and floating-point, signed and unsigned
      - add, subtract, multiply, divide
    - Logical
      - and, or, not, xor

Instruction Types (cont.)

- Condition code bits
  - S: Sign bit (0 = +, 1 = -)
  - Z: Zero bit (0 = nonzero, 1 = zero)
  - O: Overflow bit (0 = no overflow, 1 = overflow)
  - C: Carry bit (0 = no carry, 1 = carry)
- Example: Pentium

    cmp count,25    compare count to 25 (subtract 25 from count)
    je  target      jump if equal
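How a compare sets these four bits can be sketched as follows. The function `compare_flags` is hypothetical; it assumes a 32-bit two's-complement subtraction in which the carry bit records a borrow, as it does for x86 `cmp`.

```python
def compare_flags(a, b, bits=32):
    """Return the S, Z, O, C condition bits produced by computing a - b."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "S": int(bool(result & sign)),                    # result is negative
        "Z": int(result == 0),                            # operands were equal
        "O": int(bool((a ^ b) & (a ^ result) & sign)),    # signed overflow
        "C": int(a < b),                                  # unsigned borrow
    }


equal = compare_flags(25, 25)
smaller = compare_flags(24, 25)
overflow = compare_flags(0x80000000, 1)   # most negative value minus one
```

A following `je` only needs to inspect the Z bit, which is the sense in which condition testing is separated from branching.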

Instruction Types (cont.)

- Flow control and I/O instructions
  - Branch
  - Procedure call
  - Interrupts
- I/O instructions
  - Memory-mapped I/O
    - Most processors support memory-mapped I/O
    - No separate instructions for I/O
  - Isolated I/O
    - Pentium supports isolated I/O
    - Separate I/O instructions

        in  AX,io_port     read from an I/O port
        out io_port,AX     write to an I/O port

5. Instruction Formats

- Two types
  - Fixed-length
    - Used by RISC processors
    - 32-bit RISC processors use 32-bit-wide instructions
      - Examples: SPARC, MIPS, PowerPC
  - Variable-length
    - Used by CISC processors
    - Memory operands need more bits to specify
- Opcode
  - Major and exact operation

Examples of Instruction Formats

RISC (Reduced Instruction Set Computer)
versus
CISC (Complex Instruction Set Computer)

RISC vs. CISC

- The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.
- RISC systems access memory only with explicit load and store instructions.
- In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.

RISC vs. CISC

- The difference between CISC and RISC becomes evident through the basic computer performance equation.
- RISC systems shorten execution time by reducing the clock cycles per instruction.
- CISC systems improve performance by reducing the number of instructions per program.
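The equation itself is not shown in the text. A standard form of the basic performance equation, given here as an assumption about what the slide intended, is:

```latex
\[
\frac{\text{time}}{\text{program}}
  = \frac{\text{instructions}}{\text{program}}
  \times \frac{\text{cycles}}{\text{instruction}}
  \times \frac{\text{time}}{\text{cycle}}
\]
```

Read against the two points above, RISC aims at the middle factor (cycles per instruction) while CISC aims at the first factor (instructions per program).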

RISC vs. CISC

- The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.
- The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.
- With fixed-length instructions, RISC lends itself to pipelining and speculative execution.

RISC vs. CISC

Consider the program fragments:

    CISC              RISC
    mov ax, 10        mov ax, 0
    mov bx, 5         mov bx, 10
    mul bx, ax        mov cx, 5
                      Begin: add ax, bx
                             loop Begin

- The total clock cycles for the CISC version might be:
  (2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles
- While the clock cycles for the RISC version are:
  (3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles
- With the RISC clock cycle being shorter, RISC gives us much faster execution speeds.

RISC vs. CISC

- Because of their load-store ISAs, RISC architectures require a large number of CPU registers.
- These registers provide fast access to data during sequential program execution.
- They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.
- Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.

RISC vs. CISC

[Figure: overlapping register windows]

- This is how registers can be overlapped in a RISC system.
- The current window pointer (CWP) points to the active register window.

RISC vs. CISC

- It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.
- Some RISC systems provide more extravagant instruction sets than some CISC systems.
- Some systems combine both approaches.
- The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.

RISC vs. CISC

RISC
- Multiple register sets.
- Three operands per instruction.
- Parameter passing through register windows.
- Single-cycle instructions.
- Hardwired control.
- Highly pipelined.

CISC
- Single register set.
- One or two register operands per instruction.
- Parameter passing through memory.
- Multiple cycle instructions.
- Microprogrammed control.
- Less pipelined.

Continued....

RISC vs. CISC

RISC
- Simple instructions, few in number.
- Fixed length instructions.
- Complexity in compiler.
- Only LOAD/STORE instructions access memory.
- Few addressing modes.

CISC
- Many complex instructions.
- Variable length instructions.
- Complexity in microcode.
- Many instructions can access memory.
- Many addressing modes.

Summary

Instruction Set Design Issues

Instruction set design issues include:
- Where are operands stored?
  - registers, memory, stack, accumulator
- How many explicit operands are there?
  - 0, 1, 2, or 3
- How is the operand location specified?
  - register, immediate, indirect, . . .
- What type & size of operands are supported?
  - byte, int, float, double, string, vector, . . .
- What operations are supported?
  - add, sub, mul, move, compare, . . .

More About General Purpose Registers

Why do almost all new architectures use GPRs?
- Registers are much faster than memory (even cache)
  - Register values are available immediately
  - When memory isn't ready, the processor must wait ("stall")
- Registers are convenient for variable storage
  - Compiler assigns some variables just to registers
  - More compact code, since small fields specify registers (compared to memory addresses)

[Figure: Processor (Registers), Cache, Memory, Disk]

What Operations are Needed

- Arithmetic & Logical
  - Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
  - Logical operation: AND, OR, XOR, NOT
- Data Transfer: copy, load, store
- Control: branch, jump, call, return
- Floating Point: ADDF, MULF, DIVF
  - Same as arithmetic but usually take bigger operands
- Decimal: ADDD, CONVERT
- String: move, compare, search
- Graphics: pixel and vertex, compression/decompression operations

Stack Architecture: Pros and Cons

Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures

Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures

Accumulator Architecture: Pros and Cons

Pros
- Very low hardware requirements
- Easy to design and understand

Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic

Memory-Memory Architecture: Pros and Cons

Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)

Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required

Memory-Register Architecture: Pros and Cons

Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density

Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers


Typical Processor Execution Cycle

- Instruction Fetch: obtain instruction from program storage
- Instruction Decode: determine required actions and instruction size
- Operand Fetch: locate and obtain operand data
- Execute: compute result value or status
- Result Store: deposit results in register or storage for later use
- Next Instruction: determine successor instruction

Instruction and Data Memory: Unified or Separate

[Figure: programmer's view (ADD, SUBTRACT, AND, OR, COMPARE, ...) and computer's view of a computer program (instructions); CPU, Memory, I/O]

Princeton (von Neumann) Architecture
- Data and instructions mixed in same unified memory
- Program as data
- Storage utilization
- Single memory interface

Harvard Architecture
- Data & instructions in separate memories
- Has advantages in certain high performance implementations
- Can optimize each memory

Classifying Instruction Set Architectures

There are four types of internal storage used by the processor to store operands explicitly and implicitly for execution of a program:
1. Stack
2. Accumulator
3. Set of registers (register-memory)
4. Set of registers (register-register/load-store)

The operands in a stack architecture are implicitly on the top of the stack, and in an accumulator architecture one operand is implicitly the accumulator. The general-purpose register architectures (register-memory and register-register) have only explicit operands, either in registers or memory locations.

Basic Addressing Classes

Declining cost of registers

Operand locations for four instruction set architecture classes

The arrows indicate whether the operand is an input or the result of the ALU operation, or both an input and result. Lighter shades indicate inputs, and the dark shade indicates the result. In (a), a Top Of Stack register (TOS) points to the top input operand, which is combined with the operand below. The first operand is removed from the stack, the result takes the place of the second operand, and TOS is updated to point to the result. All operands are implicit. In (b), the accumulator is both an implicit input operand and a result. In (c), one input operand is a register, one is in memory, and the result goes to a register. All operands are registers in (d) and, like the stack architecture, can be transferred to memory only via separate instructions: push or pop for (a) and load or store for (d).

Code Sequence for C = A + B

    Stack     Accumulator   Register-memory   Register-register
    Push A    Load A        Load R1,A         Load R1,A
    Push B    Add B         Add R3,R1,B       Load R2,B
    Add       Store C       Store R3,C        Add R3,R1,R2
    Pop C                                     Store R3,C

The code sequence for C = A + B for four classes of instruction sets. Note that the Add instruction has implicit operands for stack and accumulator architectures, and explicit operands for register architectures. It is assumed that A, B, and C all belong in memory and that the values of A and B cannot be destroyed.

Stack Architectures

Accumulator Architectures

Register-Set Architectures

Register-to-Register: Load-Store Architectures

Register-to-Memory Architectures

Memory-to-Memory Architectures

Instruction Formats

Instruction Set Architecture (ISA)

- To command a computer's hardware, you must speak its language.
  - The words of a machine's language are called instructions, and its vocabulary is called the instruction set.
- Once you learn one machine language, it is easy to pick up others:
  - There are few fundamental operations that all computers must provide.
  - All designers have the same goal of finding a language that simplifies building the hardware and the compiler while maximizing performance and minimizing cost.
- Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept.
- The MIPS instruction set is used as a case study.

Interface Design

A good interface:
- Lasts through many implementations (portability, compatibility)
- Is used in many different ways (generality)
- Provides convenient functionality to higher levels
- Permits an efficient implementation at lower levels

Design decisions must take into account:
- Technology
- Machine organization
- Programming languages
- Compiler technology
- Operating systems

[Figure: one interface with several uses above it and several implementations below it, over time]

Cl if i I t ti S t A hit t

Classifying Instruction Set Architectures

Accumulator Architecture
• Common in early stored-program computers, when hardware was very expensive
• The machine has only one register (the accumulator) involved in all math and logical operations
• All operations assume the accumulator as a source operand and as the destination of the operation, with the other operand stored in memory

Extended Accumulator Architecture
• Dedicated registers for specific operations were added, e.g. stack and array index registers
• The 8086 microprocessor is an example of such a special-purpose register architecture

General-Purpose Register Architecture
• MIPS is an example of such an architecture, where registers are not tied to a single role
• This type of instruction set can be further divided into:
  • Register-memory: allows one operand to be in memory
  • Register-register (load-store): demands that all operands be in registers

Machine | # general-purpose registers | Architecture style | Year
Motorola 6800 | 2 | Accumulator | 1974
DEC VAX | 16 | Register-memory, memory-memory | 1977
Intel 8086 | 1 | Extended accumulator | 1978
Motorola 68000 | 16 | Register-memory | 1980
Intel 80386 | 8 | Register-memory | 1985
PowerPC | 32 | Load-store | 1992
DEC Alpha | 32 | Load-store | 1992

Compact Code and Stack Architectures

When memory is scarce, machines like the Intel 80x86 use variable-length instructions to match varying operand specifications and minimize code size.

Stack machines abandoned registers altogether, arguing that it is hard for compilers to use them efficiently.

Operands are pushed onto a stack from memory, and the results have to be popped from the stack to memory.

Operations take their operands by default from the top of the stack and insert the results back onto the stack.

Stack machines simplify compilers and lend themselves to a compact instruction encoding, but limit compiler optimization (e.g. in math expressions).

Example: A = B + C

Push AddressC   # Top = Top + 4; Stack[Top] = Memory[AddressC]
Push AddressB   # Top = Top + 4; Stack[Top] = Memory[AddressB]
add             # Stack[Top-4] = Stack[Top] + Stack[Top-4]; Top = Top - 4
Pop AddressA    # Memory[AddressA] = Stack[Top]; Top = Top - 4
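As a rough illustration of the push/add/pop sequence above, the following Python sketch models the stack with an explicit Top pointer that moves in steps of 4 bytes. The function name `run`, the dictionary-based memory, and the sample values for B and C are assumptions made for this sketch; they do not come from the slides.

```python
WORD = 4  # bytes per stack slot, matching the "Top = Top + 4" comments


def run(program, memory):
    """Execute a tiny push/add/pop program and return the updated memory."""
    stack = {}   # sparse stack storage indexed by byte offset
    top = 0      # Top pointer; offset 0 itself is never used
    for op, *operand in program:
        if op == "push":
            top += WORD
            stack[top] = memory[operand[0]]
        elif op == "add":
            stack[top - WORD] = stack[top] + stack[top - WORD]
            top -= WORD
        elif op == "pop":
            memory[operand[0]] = stack[top]
            top -= WORD
        else:
            raise ValueError(f"unknown operation: {op}")
    return memory


# A = B + C with hypothetical values B = 5 and C = 7
memory = {"A": 0, "B": 5, "C": 7}
program = [("push", "C"), ("push", "B"), ("add",), ("pop", "A")]
run(program, memory)
print(memory["A"])  # prints 12
```

Note that none of the arithmetic instructions names an operand, which is where the compact encoding comes from: only push and pop carry an address.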

Compact code is important for the heralded network computers, where programs must be downloaded over the Internet (e.g. Java-based applications).

Other Types of Architecture

High-Level-Language Architecture
• In the 1960s, systems software was rarely written in high-level languages, and virtually every commercial operating system before Unix was written in assembly
• Some people blamed the code density on the instruction set rather than on the programming language
• A machine design philosophy was advocated with the goal of making the hardware more like high-level languages
• The effectiveness of high-level languages, memory size limitations, and the lack of efficient compilers doomed this philosophy to a historical footnote

Reduced Instruction Set Architecture
• With the development of compiler technology and expanded memory sizes, fewer programmers are using assembly-level coding
• Instruction set architectures came to be measured by how well compilers, rather than programmers, use them
• RISC architecture favors simplifying hardware design over enriching the offered set of instructions, relying on compilers to use them effectively to perform complex operations
• Virtually all new architectures since 1982 follow the RISC philosophy of fixed instruction lengths, load-store operations, and limited addressing modes

Evolution of Instruction Sets

[Figure: evolution of instruction sets. Single accumulator (EDSAC); accumulator + index registers (Manchester Mark I); separation of programming model from implementation (high-level-language-based machines and the concept of a family); general-purpose register machines; complex instruction sets (VAX, Intel) and load/store architectures (CDC, Cray 1); RISC]

Register-Memory Architectures

Effect of the number of memory operands:

# memory addresses | Max. number of operands | Examples
0 | 3 | SPARC, MIPS, PowerPC, Alpha
1 | 2 | Intel 80x86, Motorola 68000
2 | 2 | VAX (also has 3-operand format)
3 | 3 | VAX (also has 2-operand format)

Memory Address
Interpreting Memory Addressing

The address of a word matches the byte address of one of its 4 bytes.

The addresses of sequential words differ by 4 (the word size in bytes).

Word addresses are multiples of 4 (alignment restriction).

Machines that use the address of the leftmost byte as the word address are called "Big Endian", and those that use the rightmost byte are called "Little Endian".

Misalignment complicates memory access and causes programs to run slower (some machines do not allow misaligned memory access at all).

Byte ordering can be a problem when exchanging data among different machines.

Byte addressing affects array index calculation, which must account for word addressing and the offset within the word.

Object addressed | Aligned at byte offsets | Misaligned at byte offsets
Byte | 0, 1, 2, 3, 4, 5, 6, 7 | Never
Half word | 0, 2, 4, 6 | 1, 3, 5, 7
Word | 0, 4 | 1, 2, 3, 5, 6, 7
Double word | 0 | 1, 2, 3, 4, 5, 6, 7
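The alignment rule and the two byte orders can be checked with a short Python sketch. It assumes 4-byte words as on the slide; the helper names `is_aligned` and `word_address` and the sample value 0x0A0B0C0D are illustrative choices, not taken from the source.

```python
import struct

WORD_SIZE = 4  # bytes per word, as assumed on the slide


def is_aligned(address, size):
    """An object of `size` bytes is aligned when its address is a multiple of size."""
    return address % size == 0


def word_address(base, index):
    """Byte address of element `index` in an array of words starting at `base`."""
    return base + index * WORD_SIZE


value = 0x0A0B0C0D
big_endian = struct.pack(">I", value)     # most significant byte at the lowest address
little_endian = struct.pack("<I", value)  # least significant byte at the lowest address

print(big_endian.hex(), little_endian.hex())
# Byte offsets within a double word at which a 4-byte word would be misaligned
print([offset for offset in range(8) if not is_aligned(offset, WORD_SIZE)])
```

The second line of output reproduces the "Word" row of the misalignment table; the two hex strings show why data exchanged between machines of different endianness must be byte-swapped.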

Addressing Modes

Addressing modes specify how the location of an operand (its effective address) is determined.

Addressing modes have the ability to:
- Significantly reduce instruction counts
- Increase the average CPI
- Increase the complexity of building a machine

The VAX machine is used for benchmark data since it supports a wide range of memory addressing modes.

Commonly used addressing modes can be classified based on:
- the source of the data: register, immediate, or memory
- the address calculation: direct or indirect

An indexed addressing mode is usually provided to allow efficient implementation of loops and array access.

Example of Addressing Modes

Addressing mode | Example | Meaning | When used
Register | Add R4, R3 | Regs[R4] <- Regs[R4] + Regs[R3] | When a value is in a register
Immediate | Add R4, #3 | Regs[R4] <- Regs[R4] + 3 | For constants
Displacement | Add R4, 100(R1) | Regs[R4] <- Regs[R4] + Mem[100 + Regs[R1]] | Accessing local variables
Register indirect | Add R4, (R1) | Regs[R4] <- Regs[R4] + Mem[Regs[R1]] | Accessing using a pointer or a computed address
Indexed | Add R4, (R1 + R2) | Regs[R4] <- Regs[R4] + Mem[Regs[R1] + Regs[R2]] | Sometimes useful in array addressing: R1 = base of the array, R2 = index amount
Direct or absolute | Add R4, (1001) | Regs[R4] <- Regs[R4] + Mem[1001] | Sometimes useful for accessing static data; the address constant may need to be large
Memory indirect or memory deferred | Add R4, @(R3) | Regs[R4] <- Regs[R4] + Mem[Mem[Regs[R3]]] | If R3 is the address of the pointer p, then the mode yields *p
Autoincrement | Add R4, (R2)+ | Regs[R4] <- Regs[R4] + Mem[Regs[R2]]; Regs[R2] <- Regs[R2] + d | Useful for stepping through arrays within a loop. R2 points to the start of the array; each reference increments R2 by d
Autodecrement | Add R4, -(R2) | Regs[R2] <- Regs[R2] - d; Regs[R4] <- Regs[R4] + Mem[Regs[R2]] | Same use as autoincrement. Autodecrement/autoincrement can also act as push/pop to implement a stack
Scaled | Add R4, 100(R2)[R3] | Regs[R4] <- Regs[R4] + Mem[100 + Regs[R2] + Regs[R3] * d] | Used to index arrays
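The "Meaning" column can be read as a recipe for locating the second operand of each Add. The Python sketch below models registers and memory as dictionaries and returns that operand for each mode. The function name `fetch_operand`, its parameter names, and the sample register and memory contents are all assumptions made for illustration.

```python
def fetch_operand(mode, regs, mem, r=None, x=None, const=0, d=4):
    """Return the source operand selected by one addressing mode.

    regs maps register names to values and mem maps addresses to values.
    r is the base or pointer register, x the index register, const the
    immediate, displacement, or absolute address, and d the element size.
    """
    if mode == "register":
        return regs[r]
    if mode == "immediate":
        return const
    if mode == "displacement":
        return mem[const + regs[r]]
    if mode == "register_indirect":
        return mem[regs[r]]
    if mode == "indexed":
        return mem[regs[r] + regs[x]]
    if mode == "direct":
        return mem[const]
    if mode == "memory_indirect":
        return mem[mem[regs[r]]]
    if mode == "autoincrement":
        value = mem[regs[r]]
        regs[r] += d          # pointer advances after the access
        return value
    if mode == "autodecrement":
        regs[r] -= d          # pointer moves back before the access
        return mem[regs[r]]
    if mode == "scaled":
        return mem[const + regs[r] + regs[x] * d]
    raise ValueError(f"unknown addressing mode: {mode}")


# Hypothetical machine state
regs = {"R1": 200, "R2": 8, "R3": 2, "R4": 10}
mem = {100: 7, 108: 9, 116: 11, 200: 100, 208: 5, 1001: 42}

# Add R4, 100(R2): displacement mode reads Mem[100 + 8]
regs["R4"] += fetch_operand("displacement", regs, mem, r="R2", const=100)
print(regs["R4"])  # prints 19
```

Autoincrement and autodecrement are the only modes here that modify a register as a side effect, which is why they can double as pop and push.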

Addressing Modes for Signal Processing

DSPs offer special addressing modes to better serve popular algorithms.

Special features require either hand coding or a compiler that uses such features (GNU would not be a good choice).

Modulo addressing
- Since DSPs deal with continuous data streams, circular buffers are widely used
- Circular or modulo addressing allows automatic increment and decrement, and resets the pointer when it reaches the end of the buffer

Reverse addressing
- The resulting address is the bit-reversed form of the current address
- Reverse addressing mode expedites the access, which otherwise requires a number of logical instructions or extra memory accesses

Fast Fourier Transform (bit-reversed indices):

0 (000) -> 0 (000)
1 (001) -> 4 (100)
2 (010) -> 2 (010)
3 (011) -> 6 (110)
4 (100) -> 1 (001)
5 (101) -> 5 (101)
6 (110) -> 3 (011)
7 (111) -> 7 (111)
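Both DSP addressing modes reduce to a few lines of integer arithmetic, which is what the hardware performs for free on every access. The sketch below is a software model only; the function names `next_circular` and `bit_reverse` and their parameters are assumptions for illustration.

```python
def next_circular(pointer, start, length, step=1):
    """Advance a pointer inside a circular buffer of `length` slots at `start`."""
    return start + (pointer - start + step) % length


def bit_reverse(index, bits):
    """Reverse the lowest `bits` bits of `index`, as used for FFT reordering."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


# Pointer wraps from the last slot of an 8-entry buffer back to the first
print(next_circular(7, start=0, length=8))       # prints 0
# 3-bit bit-reversed order, matching the FFT table
print([bit_reverse(i, 3) for i in range(8)])     # prints [0, 4, 2, 6, 1, 5, 3, 7]
```

Without hardware support, the loop in `bit_reverse` is roughly the "number of logical instructions" that each access would otherwise cost.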

Operations of the Computer Hardware

"There must certainly be instructions for performing the fundamental arithmetic operations."
Burks, Goldstine, and von Neumann, 1947

Assembly language is a symbolic representation of what the processor actually understands.

The MIPS assembler allows only one instruction per line and ignores comments, which run from # to the end of the line.

Example: translation of a segment of a C program to MIPS assembly instructions

C: f = (g + h) - (i + j)

MIPS:

add t0, g, h    # temp variable t0 contains "g + h"
add t1, i, j    # temp variable t1 contains "i + j"
sub f, t0, t1   # f = t0 - t1 = (g + h) - (i + j)

Operations in the Instruction Set

Operator type | Examples
Arithmetic and logical | Integer arithmetic and logical operations: add, and, subtract, or
Data transfer | Loads and stores (move instructions on machines with memory addressing)
Control | Branch, jump, procedure call and return, trap
System | Operating system call, virtual memory management instructions
Floating point | Floating-point instructions: add, multiply
Decimal | Decimal add, decimal multiply, decimal-to-character conversion
String | String move, string compare, string search
Graphics | Pixel operations, compression/decompression operations

Arithmetic, logical, data transfer, and control are almost standard categories for all machines.

System instructions are required for multiprogramming environments, although support for system functions varies.

Decimal and string instructions can be primitives, e.g. on the IBM 360 and the VAX.

Support for floating point, decimal, string, and graphics is optional and is sometimes provided via a co-processor.

Some machines rely on the compiler to synthesize special operations, such as string handling, from simpler instructions.

Operations for Media & Signal Processing

Single instruction, multiple data (SIMD) and vector instructions are often supported in DSPs, which are commonly used in multimedia and signal processing applications.

Partitioned add (integer)
- Performs multiple 16-bit additions on a 64-bit ALU, since most data are narrow
- Increases ALU throughput for multimedia applications

Paired single operations (float)
- Allow the same register to act as two operands to the same operation
- Handy in dealing with vertices and coordinates

Multiply and accumulate
- Very handy for calculating dot products of vectors (signal processing) and matrix multiplication
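A partitioned add can be modelled in software by treating one 64-bit integer as four independent 16-bit lanes, and multiply-accumulate as a running sum of products. The sketch below assumes four 16-bit lanes with wraparound on overflow; the lane width, the names `pack`, `unpack`, `partitioned_add`, and `dot_product`, and the sample data are illustrative assumptions.

```python
LANES = 4
LANE_BITS = 16
LANE_MASK = (1 << LANE_BITS) - 1


def pack(values):
    """Pack four 16-bit values into one 64-bit word, lane 0 in the low bits."""
    word = 0
    for i, value in enumerate(values):
        word |= (value & LANE_MASK) << (i * LANE_BITS)
    return word


def unpack(word):
    """Split a 64-bit word back into its four 16-bit lanes."""
    return [(word >> (i * LANE_BITS)) & LANE_MASK for i in range(LANES)]


def partitioned_add(a, b):
    """Add lane by lane; a carry out of one lane never reaches the next."""
    result = 0
    for i in range(LANES):
        shift = i * LANE_BITS
        lane_sum = ((a >> shift) + (b >> shift)) & LANE_MASK
        result |= lane_sum << shift
    return result


def dot_product(xs, ys):
    """Multiply-accumulate: one multiply and one add per element pair."""
    accumulator = 0
    for x, y in zip(xs, ys):
        accumulator += x * y
    return accumulator


total = partitioned_add(pack([1, 2, 3, 0xFFFF]), pack([10, 20, 30, 1]))
print(unpack(total))                       # prints [11, 22, 33, 0]
print(dot_product([1, 2, 3], [4, 5, 6]))   # prints 32
```

The last lane shows the defining property: 0xFFFF + 1 wraps to 0 inside its lane instead of carrying into a neighbour, which is what distinguishes a partitioned add from an ordinary 64-bit add.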

Frequency of Operations Usage

Make the common case fast by focusing on these operations.

The most widely executed instructions are the simple operations of an instruction set.

The following is the average usage in SPECint on the Intel 80x86:

Rank | 80x86 instruction | Integer average (% total executed)
1 | Load | 22%
2 | Conditional branch | 20%
3 | Compare | 16%
4 | Store | 12%
5 | Add | 8%
6 | And | 6%
7 | Sub | 5%
8 | Move register-register | 4%
9 | Call | 1%
10 | Return | 1%
Total | | 96%

Control Flow Instructions

Jump: for an unconditional change in the control flow

Branch: for a conditional change in the control flow

Procedure calls and returns

Data is based on SPEC on Alpha.

Destination Address Definition

Relative addressing with respect to the program counter proved to be the best choice for forward and backward branches or jumps (the code is independent of its load address).

To allow for dynamic loading of library routines, register indirect addressing allows addresses to be loaded in special registers (e.g. virtual functions in C++ and system calls in a case statement).

Data is based on SPEC on Alpha.

Condition Evaluation

Compare-and-branch can be efficient if the majority of conditions are comparisons with zero.

Remember to focus on the common case.

Based on SPEC on MIPS.

Frequency of Types of Comparison

Data is based on SPEC on Alpha.

Different benchmarks and machines set new design priorities.

DSPs support a repeat instruction for "for" loops (vectors), using registers.

Supporting Procedures

Execution of a procedure follows these steps:
- Store parameters in a place accessible to the procedure
- Transfer control to the procedure
- Acquire the storage resources needed for the procedure
- Perform the desired task
- Store the result value in a place accessible to the calling program
- Return control to the point of origin

The hardware provides a program counter to trace instruction flow and manage transfer of control.

Parameter Passing
- Registers can be used for passing a small number of parameters
- A stack is used to spill the registers of the current context, to make room for the called procedure to run, and to allow large parameters to be passed
- Storage of machine state can be performed by the caller or the callee
- Handling of shared variables is important to ensure correct semantics and thus requires clear specifications in the library interface

Global variables stored in registers need careful handling.

Type and Size of Operands

The type of an operand is designated by encoding it in the instruction's operation code.

The type of an operand, e.g. single-precision float, effectively gives its size.

Common operand types include character, half-word and word-size integer, and single- and double-precision floating point.

Characters are almost always in ASCII, integers are in 2's complement, and floating point is in IEEE 754.

The 16-bit Unicode used in Java is gaining popularity due to its support for international character sets.

For business applications, some architectures support a decimal format in binary-coded decimal (BCD).

Depending on the size of the word, the complexity of handling different operand types differs.

DSPs offer fixed-point data types to support high-precision floating-point arithmetic and to allow sharing a single exponent among multiple numbers.

For graphics applications, vertex and pixel operands are added features.

Size of Operands

Frequency of reference by size, based on SPEC2000 on Alpha:

The double-word data type is used for double-precision floating-point operations and for address storage in machines with a 64-bit-wide address bus.

Words are used for integer operations and for addresses on 32-bit address bus machines.

Because of the mix in SPEC, word and double-word data types dominate.

Instruction Representation

Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary, or base 2).

Numbers are stored in computers as a series of high and low electronic signals (binary numbers).

Binary digits are called bits and are considered the atoms of computing.

Each piece of an instruction is a number, and placing these numbers together forms the instruction.

The assembler translates symbolic assembly instructions into machine language instructions (machine code).

Example:

Assembly: add $t0, $s1, $s2

M/C language (decimal): 0 | 17 | 18 | 8 | 0 | 32

M/C language (binary): 000000 | 10001 | 10010 | 01000 | 00000 | 100000

Field widths: 6 bits | 5 bits | 5 bits | 5 bits | 5 bits | 6 bits

Note: by default the MIPS compiler maps $s0, ..., $s7 to registers 16-23 and $t0, ..., $t7 to registers 8-15.
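The way the six numbers are "placed together" is plain shifting and OR-ing. The sketch below packs the fields of add $t0, $s1, $s2 into a 32-bit word using the field widths 6/5/5/5/5/6; the function name `encode_r_type` and the constant names are assumptions made for this illustration.

```python
def encode_r_type(rs, rt, rd, shamt, funct, op=0):
    """Pack the six R-format fields (6/5/5/5/5/6 bits) into one 32-bit word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


# Register numbers follow the mapping in the note: $t0 = 8, $s1 = 17, $s2 = 18
T0, S1, S2 = 8, 17, 18
ADD_FUNCT = 32

word = encode_r_type(rs=S1, rt=S2, rd=T0, shamt=0, funct=ADD_FUNCT)
print(f"{word:032b}")  # prints 00000010001100100100000000100000
print(hex(word))       # prints 0x2324020
```

Reading the binary string in groups of 6, 5, 5, 5, 5, and 6 bits gives back the decimal fields 0, 17, 18, 8, 0, 32. Note that the destination register appears first in assembly but third among the register fields.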

Encoding an Instruction Set

Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation.

The operation is typically specified in one field, called the opcode.

The addressing mode for each operand can be encoded with the operation, or specified through a separate identifier when a large number of modes is supported.

The architecture must balance several competing factors:
- Desire to support as many registers and addressing modes as possible
- Effect of operand specification on the size of the instruction (and the program)
- Desire to simplify instruction fetching and decoding during execution

Fixed-size instruction encoding simplifies the CPU design while limiting the addressing modes supported.

An architect who cares about code size can use variable-size encoding.

A hybrid approach is to allow variability by supporting multiple instruction sizes.

Encoding Examples

MIPS Instruction Format

Register-format instructions (R-type):

op | rs | rt | rd | shamt | funct
6 bits | 5 bits | 5 bits | 5 bits | 5 bits | 6 bits

op: basic operation of the instruction, traditionally called the opcode
rs: the first register source operand
rt: the second register source operand
rd: the register destination operand; it gets the result of the operation
shamt: shift amount
funct: selects the specific variant of the operation in the op field

Immediate-type instructions (I-type):

op | rs | rt | address
6 bits | 5 bits | 5 bits | 16 bits

Some instructions need longer fields than the register format provides, for large-value constants.

The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register.

Example: lw $t0, 32($s3)   # Temporary register $t0 gets A[8]

Instruction | Format | op | rs | rt | rd | shamt | funct | address
add | R | 0 | reg | reg | reg | 0 | 32 | N/A
sub | R | 0 | reg | reg | reg | 0 | 34 | N/A
lw | I | 35 | reg | reg | N/A | N/A | N/A | address
sw | I | 43 | reg | reg | N/A | N/A | N/A | address
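The reach of a load follows directly from the 16-bit address field, which holds a signed offset. The sketch below encodes an I-format instruction and rejects offsets outside that range; the function name `encode_i_type`, the constant names, and the choice of base register $s3 (register 19) are assumptions for illustration.

```python
def encode_i_type(op, rs, rt, offset):
    """Pack the I-format fields (6/5/5/16 bits); offset is a signed 16-bit value."""
    if not -(1 << 15) <= offset < (1 << 15):
        raise ValueError("offset does not fit in a signed 16-bit field")
    return (op << 26) | (rs << 21) | (rt << 16) | (offset & 0xFFFF)


LW = 35            # opcode of load word, from the table above
T0, S3 = 8, 19     # $t0 = register 8, $s3 = register 19

word = encode_i_type(LW, rs=S3, rt=T0, offset=32)   # lw $t0, 32($s3)
print(hex(word))                                    # prints 0x8e680020

# A negative offset is stored in two's complement in the low 16 bits
print(hex(encode_i_type(LW, rs=S3, rt=T0, offset=-4)))  # prints 0x8e68fffc
```

An offset of 32768 would raise ValueError here, which mirrors why larger constants need a different mechanism than a single I-format instruction.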

The Stored Program Concept

Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept.

Today's computers are built on two key principles:
- Instructions are represented as numbers
- Programs can be stored in memory to be read or written just like numbers

The power of the concept: memory can contain
- the source code for an editor
- the compiled machine code for the editor
- the text that the compiled program is using
- the compiler that generated the code

[Figure: a processor connected to a memory that holds an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and source code in C for the editor program]

Compiling if-then-else in MIPS

Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiled MIPS code for the following C if statement?

if (i == j) f = g + h; else f = g - h;

[Figure: flowchart. The test i == j leads to f = g + h when i = j and to f = g - h (label Else:) when i != j; both paths join at Exit:]

MIPS:

      bne $s3, $s4, Else   # go to Else if i != j
      add $s0, $s1, $s2    # f = g + h (skipped if i != j)
      j Exit               # go to Exit
Else: sub $s0, $s1, $s2    # f = g - h (skipped if i = j)
Exit:

Typical Compilation

Major Types of Optimization

Optimization name | Explanation | Frequency
High-level | At or near source level; machine independent |
Procedure integration | Replace procedure call by procedure body | N.M.
Local | Within straight-line code |
Common subexpression elimination | Replace two instances of the same computation by a single copy | 18%
Constant propagation | Replace all instances of a variable that is assigned a constant with the constant |
Stack height reduction | Rearrange expression tree to minimize resources needed for expression evaluation | N.M.
Global | Across a branch |
Global common subexpression elimination | Same as local, but this version crosses branches |
Copy propagation | Replace all instances of a variable A that has been assigned X (i.e. A = X) with X |
Code motion | Remove code from a loop that computes the same value each iteration of the loop |
Induction variable elimination | Simplify/eliminate array-addressing calculations within loops |
Machine-dependent | Depends on machine knowledge |
Strength reduction | Many examples, such as replacing a multiply by a constant with adds and shifts | N.M.
Pipeline scheduling | Reorder instructions to improve pipeline performance | N.M.
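Two of the transformations in the table are easy to show by writing the "before" and "after" forms side by side. The Python functions below are a hand-written illustration of what a compiler would do automatically on its intermediate code; the function names and the multiply-by-10 example are assumptions chosen for this sketch.

```python
def scale_by_ten_naive(x):
    """Before strength reduction: one integer multiply."""
    return x * 10


def scale_by_ten_reduced(x):
    """After strength reduction: x * 10 == x * 8 + x * 2, using shifts and an add."""
    return (x << 3) + (x << 1)


def before_cse(a, b, c):
    """Before common subexpression elimination: a + b is computed twice."""
    x = (a + b) * c
    y = (a + b) * 2
    return x, y


def after_cse(a, b, c):
    """After elimination: a + b is computed once and reused."""
    t = a + b
    return t * c, t * 2


print(scale_by_ten_reduced(7))                   # prints 70
print(before_cse(1, 2, 3) == after_cse(1, 2, 3)) # prints True
```

Whether the shift-and-add form is actually faster depends on the machine, which is why strength reduction is listed under the machine-dependent optimizations.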

Effect of Compiler Optimization

[Figure: effect of compiler optimization level, by program and compiler optimization level]

Level 0: non-optimized code
Level 1: local optimization
Level 2: global optimization, s/w pipelining
Level 3: adds procedure integration

Compiler Support for Multimedia Instructions

Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics).

Intel added a new set of instructions called Streaming SIMD Extension.

A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer.

Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations:
- Strided addressing allows memory access in increments larger than one
- Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data

Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup.

Such limited support for vector processing makes the use of vectorizing compiler optimizations unpopular and restricts its scope to hand-coded routines.

SIMD instructions on MMX and AltiVec tend to be solutions, not primitives.
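The difference between strided and gather/scatter addressing is only in where the element addresses come from. The sketch below models memory as a Python list; the function names `strided_load`, `gather`, and `scatter` and the sample data are assumptions for illustration, not the interface of any real vector instruction set.

```python
def strided_load(memory, base, stride, count):
    """Load `count` elements starting at `base`, stepping by `stride` each time."""
    return [memory[base + i * stride] for i in range(count)]


def gather(memory, addresses):
    """Load one element per entry of an address vector (register-indirect style)."""
    return [memory[address] for address in addresses]


def scatter(memory, addresses, values):
    """Store each value at the address held in the matching address-vector slot."""
    for address, value in zip(addresses, values):
        memory[address] = value


memory = list(range(100, 116))            # hypothetical contents: 100, 101, ..., 115

print(strided_load(memory, 1, 4, 3))      # prints [101, 105, 109]
print(gather(memory, [7, 0, 12]))         # prints [107, 100, 112]

scatter(memory, [2, 9], [-1, -2])
print(memory[2], memory[9])               # prints -1 -2
```

A stride of 4 is what walking down one column of a row-major matrix with four columns looks like; without strided loads each of those elements needs its own scalar access.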

Starting a Program

[Figure: C program -> Compiler -> Assembly language program -> Assembler -> Object: machine language module, together with Object: library routine (machine language) -> Linker -> Executable: machine language program -> Loader -> Memory]

Linker
- Places code and data modules symbolically in memory
- Determines the addresses of data and instruction labels
- Patches both internal and external references

Object files for Unix typically contain:
- Header: size and position of components
- Text segment: machine code
- Data segment: static and dynamic variables
- Relocation info: identifies absolute memory references
- Symbol table: names and locations of labels, procedures, and variables
- Debugging info: mapping of source to object code, breakpoints, etc.

Loading Executable Program

[Figure: memory layout. Stack at the top, with $sp at 7fff ffff hex; dynamic data below it; static data starting at 1000 0000 hex, with $gp at 1000 8000 hex; text starting at 0040 0000 hex, with pc; reserved area starting at address 0]

To load an executable, the operating system follows these steps:
- Reads the executable file header to determine the size of the text and data segments
- Creates an address space large enough for the text and data
- Copies the instructions and data from the executable file into memory
- Copies the parameters (if any) to the main program onto the stack
- Initializes the machine registers and sets the stack pointer to the first free location
- Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program

Instruction Set Design Issues

Instruction set design issues include:
- Number of addresses
- Flow of control
- Operand types and addressing modes
- Instruction types
- Instruction formats

Number of Addresses

Four categories:

3-address machines
- Two for the source operands and one for the result

2-address machines
- One address doubles as source and result

1-address machines
- Accumulator machines
- Accumulator is used for one source and the result

0-address machines
- Stack machines
- Operands are taken from the stack
- Result goes onto the stack

Number of Addresses (cont.)

Three-address machines
- Two for the source operands, one for the result
- RISC processors use three addresses

Sample instructions:

add dest,src1,src2    ; M(dest) = [src1] + [src2]
sub dest,src1,src2    ; M(dest) = [src1] - [src2]
mult dest,src1,src2   ; M(dest) = [src1] * [src2]

Three addresses: Operand 1 | Operand 2 | Result

Example: a = b + c

Three-address instruction formats are not common, because they require a relatively long instruction format to hold the three address references.

Number of Addresses (cont.)

Example

C statement:

A = B + C * D - E + F + A

Equivalent code:

mult T,C,D    ; T = C*D
add T,T,B     ; T = B + C*D
sub T,T,E     ; T = B + C*D - E
add T,T,F     ; T = B + C*D - E + F
add A,T,A     ; A = B + C*D - E + F + A

Number of Addresses (cont.)

Two-address machines
- One address doubles as source operand and result
- The last example makes a case for it: address T is used twice

Sample instructions:

load dest,src   ; M(dest) = [src]
add dest,src    ; M(dest) = [dest] + [src]
sub dest,src    ; M(dest) = [dest] - [src]
mult dest,src   ; M(dest) = [dest] * [src]

Two addresses: one address doubles as operand and result

Example: a = a + b

The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation.

Number of Addresses (cont.)

Example

C statement:

A = B + C * D - E + F + A

Equivalent code:

load T,C    ; T = C
mult T,D    ; T = C*D
add T,B     ; T = B + C*D
sub T,E     ; T = B + C*D - E
add T,F     ; T = B + C*D - E + F
add A,T     ; A = B + C*D - E + F + A

Number of Addresses (cont.)

One-address machines
- Use a special set of registers called accumulators
- The instruction specifies one source operand; the accumulator receives the result
- Called accumulator machines

Sample instructions:

load addr    ; accum = [addr]
store addr   ; M[addr] = accum
add addr     ; accum = accum + [addr]
sub addr     ; accum = accum - [addr]
mult addr    ; accum = accum * [addr]

Number of Addresses (cont.)

Example

C statement:

A = B + C * D - E + F + A

Equivalent code:

load C     ; load C into accum
mult D     ; accum = C*D
add B      ; accum = C*D + B
sub E      ; accum = B + C*D - E
add F      ; accum = B + C*D - E + F
add A      ; accum = B + C*D - E + F + A
store A    ; store accum contents in A
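The one-address sequence above can be traced with a few lines of Python that keep a single accumulator. This is a sketch of the instruction semantics listed on the previous slide; the function name `run_accumulator` and the sample values for A through F are assumptions for illustration.

```python
def run_accumulator(program, memory):
    """Interpret one-address instructions against a single accumulator."""
    accum = 0
    for op, addr in program:
        if op == "load":
            accum = memory[addr]
        elif op == "store":
            memory[addr] = accum
        elif op == "add":
            accum += memory[addr]
        elif op == "sub":
            accum -= memory[addr]
        elif op == "mult":
            accum *= memory[addr]
        else:
            raise ValueError(f"unknown operation: {op}")
    return memory


# Hypothetical initial values
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
program = [
    ("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
    ("add", "F"), ("add", "A"), ("store", "A"),
]
run_accumulator(program, memory)
print(memory["A"])  # prints 16, i.e. 2 + 3*4 - 5 + 6 + 1
```

Every instruction carries exactly one address, so the program needs seven instructions and seven memory references where the three-address version needed five instructions.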

Number of Addresses (cont.)

Zero-address machines
- Stack supplies the operands and receives the result
- Special instructions to load and store use an address
- Called stack machines (e.g. HP 3000, Burroughs B5500)

Sample instructions:

push addr   ; push([addr])
pop addr    ; pop([addr])
add         ; push(pop + pop)
sub         ; push(pop - pop)
mult        ; push(pop * pop)

Number of Addresses (cont.)

Example

C statement

    A = B + C * D - E + F + A

Equivalent code:

    push E
    push C
    push D
    mult
    push B
    add
    sub
    push F
    add
    push A
    add
    pop  A
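To check the order of pushes in the stack version, the sequence can be run on a toy zero-address machine. This is a minimal sketch in Python; `run_stack` and the sample values are hypothetical, and `sub` is taken to mean push(pop - pop), that is, the first value popped minus the second.

```python
def run_stack(program, memory):
    """Execute a zero-address program; only push and pop carry an address."""
    stack = []
    for instr in program:
        op = instr[0]
        if op == "push":
            stack.append(memory[instr[1]])
        elif op == "pop":
            memory[instr[1]] = stack.pop()
        else:
            first = stack.pop()   # value on top of the stack
            second = stack.pop()  # value just below it
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)
            elif op == "mult":
                stack.append(first * second)
            else:
                raise ValueError(f"unknown opcode: {op}")
    return memory


# Arbitrary sample values (assumed for this sketch).
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
program = [
    ("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
    ("push", "B"), ("add",), ("sub",), ("push", "F"), ("add",),
    ("push", "A"), ("add",), ("pop", "A"),
]
result = run_stack(program, dict(memory))
```

E is pushed first so that it sits below B + C*D when `sub` executes.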

Load/Store Architecture

- Instructions expect operands in internal processor registers
  - Special LOAD and STORE instructions move data between registers and memory
- RISC uses this architecture
- Reduces instruction length

Load/Store Architecture (cont.)

Sample instructions

    load  Rd,addr       ; Rd = [addr]
    store addr,Rs       ; (addr) = Rs
    add   Rd,Rs1,Rs2    ; Rd = Rs1 + Rs2
    sub   Rd,Rs1,Rs2    ; Rd = Rs1 - Rs2
    mult  Rd,Rs1,Rs2    ; Rd = Rs1 * Rs2

Number of Addresses (cont.)

Example

C statement

    A = B + C * D - E + F + A

Equivalent code:

    load  R1,B
    load  R2,C
    load  R3,D
    load  R4,E
    load  R5,F
    load  R6,A
    mult  R2,R2,R3
    add   R2,R2,R1
    sub   R2,R2,R4
    add   R2,R2,R5
    add   R2,R2,R6
    store A,R2
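The load/store version can be traced the same way. The sketch below is a hypothetical register machine in Python in which only `load` and `store` touch memory; the function name, register names, and sample values are assumptions for illustration.

```python
def run_load_store(program, memory):
    """Execute a load/store program; arithmetic uses registers only."""
    regs = {}
    for op, *args in program:
        if op == "load":
            rd, addr = args
            regs[rd] = memory[addr]
        elif op == "store":
            addr, rs = args
            memory[addr] = regs[rs]
        else:
            rd, rs1, rs2 = args
            if op == "add":
                regs[rd] = regs[rs1] + regs[rs2]
            elif op == "sub":
                regs[rd] = regs[rs1] - regs[rs2]
            elif op == "mult":
                regs[rd] = regs[rs1] * regs[rs2]
            else:
                raise ValueError(f"unknown opcode: {op}")
    return memory


# Arbitrary sample values (assumed for this sketch).
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
program = [
    ("load", "R1", "B"), ("load", "R2", "C"), ("load", "R3", "D"),
    ("load", "R4", "E"), ("load", "R5", "F"), ("load", "R6", "A"),
    ("mult", "R2", "R2", "R3"), ("add", "R2", "R2", "R1"),
    ("sub", "R2", "R2", "R4"), ("add", "R2", "R2", "R5"),
    ("add", "R2", "R2", "R6"), ("store", "A", "R2"),
]
result = run_load_store(program, dict(memory))
memory_accesses = sum(1 for op, *_ in program if op in ("load", "store"))
```

The program is longer than the accumulator or stack versions, but only 7 of its 12 instructions access memory, and each instruction has a simple, regular shape.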

Flow of Control

- Default is sequential flow
- Several instructions alter this default execution
  - Branches
    - Unconditional
    - Conditional
    - Delayed branches
  - Procedure calls
    - Delayed procedure calls

Flow of Control (cont.)

Branches
- Unconditional
  - Absolute address
  - PC-relative
    - Target address is specified relative to PC contents
    - Relocatable code
- Example: MIPS
  - Absolute address

        j target

  - PC-relative

        b target


Flow of Control (cont.)

Branches
- Conditional
  - Jump is taken only if the condition is met
- Two types
  - Set-Then-Jump
    - Condition testing is separated from branching
    - Condition code registers are used to convey the condition test result
    - Condition code registers keep a record of the status of the last ALU operation, such as an overflow condition
  - Example: Pentium code

        cmp AX,BX     ; compare AX and BX
        je  target    ; jump if equal

Flow of Control (cont.)

  - Test-and-Jump
    - Single instruction performs condition testing and branching
  - Example: MIPS instruction

        beq Rsrc1,Rsrc2,target

    Jumps to target if Rsrc1 = Rsrc2

Delayed branching
- Control is transferred after executing the instruction that follows the branch instruction
  - This instruction slot is called the delay slot
- Improves efficiency
- Highly pipelined RISC processors support it

Flow of Control (cont.)

Procedure calls
- Facilitate modular programming
- Require two pieces of information to return
  - End of procedure
    - Pentium uses the ret instruction
    - MIPS uses the jr instruction
  - Return address
    - In a (special) register
      - MIPS allows any general-purpose register
    - On the stack
      - Pentium

Flow of Control (cont.)

[Figure: delay slot]

Parameter Passing

Two basic techniques
- Register-based (e.g., PowerPC, MIPS)
  - Internal registers are used
    - Faster
    - Limit the number of parameters
    - Recursive procedure
- Stack-based (e.g., Pentium)
  - Stack is used
    - More general

2 Operand Types

- Instructions support basic data types
  - Characters
  - Integers
  - Floating-point
- Instruction overload
  - Same instruction for different data types
  - Example: Pentium

        mov AL,address     ; loads an 8-bit value
        mov AX,address     ; loads a 16-bit value
        mov EAX,address    ; loads a 32-bit value

Operand Types (cont.)

- Separate instructions
  - Instructions specify the operand size
  - Example: MIPS

        lb Rdest,address    ; loads a byte
        lh Rdest,address    ; loads a halfword (16 bits)
        lw Rdest,address    ; loads a word (32 bits)
        ld Rdest,address    ; loads a doubleword (64 bits)

  - Similar instructions for store

3 Addressing Modes

- How the operands are specified
- Operands can be in three places
  - Registers
    - Register addressing mode
  - Part of instruction
    - Constant
    - Immediate addressing mode
    - All processors support these two addressing modes
  - Memory
    - Difference between RISC and CISC
    - CISC supports a large variety of addressing modes
    - RISC follows load/store architecture

4 Instruction Types

Several types of instructions
- Data movement
  - Pentium: mov dest,src
  - Some do not provide direct data movement instructions
  - Indirect data movement

        add Rdest,Rsrc,0    ; Rdest = Rsrc+0

- Arithmetic and Logical
  - Arithmetic
    - Integer and floating-point, signed and unsigned
    - add, subtract, multiply, divide
  - Logical
    - and, or, not, xor

Instruction Types (cont.)

Condition code bits
- S: Sign bit (0 = +, 1 = -)
- Z: Zero bit (0 = nonzero, 1 = zero)
- O: Overflow bit (0 = no overflow, 1 = overflow)
- C: Carry bit (0 = no carry, 1 = carry)

Example: Pentium

    cmp count,25    ; compare count to 25
                    ; subtract 25 from count
    je  target      ; jump if equal
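A compare instruction sets the condition code bits by performing a subtraction and discarding the result. The sketch below shows one way to derive the four bits for a fixed-width subtraction in Python. The function name `compare_flags` and the 16-bit default width are assumptions, and the carry bit is modeled as a borrow, which is the x86 convention for subtraction.

```python
def compare_flags(a, b, bits=16):
    """Return the S, Z, O, C bits produced by computing a - b in `bits` bits."""
    mask = (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    a &= mask
    b &= mask
    diff = (a - b) & mask
    return {
        "S": int(bool(diff & sign_bit)),  # result is negative
        "Z": int(diff == 0),              # result is zero, so a == b
        # Signed overflow: operands differ in sign and the result's sign
        # differs from the sign of a.
        "O": int(bool((a ^ b) & (a ^ diff) & sign_bit)),
        "C": int(a < b),                  # borrow out of the unsigned subtraction
    }


equal = compare_flags(25, 25)
smaller = compare_flags(10, 25)
overflow = compare_flags(0x8000, 1)  # most negative 16-bit value minus 1
```

A conditional jump such as jump-if-equal then only needs to test the Z bit.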

Instruction Types (cont.)

- Flow control and I/O instructions
  - Branch
  - Procedure call
  - Interrupts
- I/O instructions
  - Memory-mapped I/O
    - Most processors support memory-mapped I/O
    - No separate instructions for I/O
  - Isolated I/O
    - Pentium supports isolated I/O
    - Separate I/O instructions

        in  AX,io_port    ; read from an I/O port
        out io_port,AX    ; write to an I/O port

5 Instruction Formats

Two types
- Fixed-length
  - Used by RISC processors
  - 32-bit RISC processors use 32-bit-wide instructions
    - Examples: SPARC, MIPS, PowerPC
- Variable-length
  - Used by CISC processors
  - Memory operands need more bits to specify

Opcode
- Major and exact operation

Examples of Instruction Formats

RISC (Reduced Instruction Set Computer)
versus
CISC (Complex Instruction Set Computer)

RISC vs CISC

- The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.
- RISC systems access memory only with explicit load and store instructions.
- In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.

RISC vs CISC

- The difference between CISC and RISC becomes evident through the basic computer performance equation.
- RISC systems shorten execution time by reducing the clock cycles per instruction.
- CISC systems improve performance by reducing the number of instructions per program.
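For reference, the basic performance equation is commonly written as the product below. This is the standard textbook form, reproduced here as an assumption about what the slide showed.

```latex
\[
\frac{\text{time}}{\text{program}}
  = \frac{\text{time}}{\text{cycle}}
  \times \frac{\text{cycles}}{\text{instruction}}
  \times \frac{\text{instructions}}{\text{program}}
\]
```

RISC designs attack the middle factor (cycles per instruction), while CISC designs attack the last one (instructions per program).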

RISC vs CISC

- The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.
- The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.
- With fixed-length instructions, RISC lends itself to pipelining and speculative execution.

RISC vs CISC

Consider the following program fragments:

CISC:

    mov ax, 10
    mov bx, 5
    mul bx, ax

RISC:

    mov ax, 0
    mov bx, 10
    mov cx, 5
    Begin: add ax, bx
           loop Begin

The total clock cycles for the CISC version might be:

    (2 movs x 1 cycle) + (1 mul x 30 cycles) = 32 cycles

While the clock cycles for the RISC version are:

    (3 movs x 1 cycle) + (5 adds x 1 cycle) + (5 loops x 1 cycle) = 13 cycles

With the RISC clock cycle being shorter, RISC gives us much faster execution speeds.
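The cycle counts can be reproduced by counting the instructions the RISC loop actually executes. The sketch below is written in Python and assumes one cycle each for mov, add, and loop and 30 cycles for mul; the names `risc_multiply`, `CYCLES_RISC`, and `CYCLES_CISC` are hypothetical.

```python
def risc_multiply(multiplicand, count):
    """Mimic the RISC fragment: add `multiplicand` into ax `count` times."""
    ax, bx, cx = 0, multiplicand, count  # the three mov instructions
    executed = {"mov": 3, "add": 0, "loop": 0}
    while cx > 0:
        ax += bx                 # add ax, bx
        executed["add"] += 1
        cx -= 1                  # loop decrements cx and branches while nonzero
        executed["loop"] += 1
    return ax, executed


# Assumed per-instruction cycle costs.
CYCLES_RISC = {"mov": 1, "add": 1, "loop": 1}
CYCLES_CISC = {"mov": 1, "mul": 30}

product, executed = risc_multiply(10, 5)
risc_cycles = sum(CYCLES_RISC[op] * n for op, n in executed.items())
cisc_cycles = 2 * CYCLES_CISC["mov"] + 1 * CYCLES_CISC["mul"]
```

The RISC fragment executes more instructions (13 versus 3) but needs fewer total cycles under these assumed costs.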

RISC vs CISC

- Because of their load-store ISAs, RISC architectures require a large number of CPU registers.
- These registers provide fast access to data during sequential program execution.
- They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.
- Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.

RISC vs CISC

- This is how registers can be overlapped in a RISC system.
- The current window pointer (CWP) points to the active register window.
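Overlapping windows can be modeled as views into one circular register file, where the output registers of one window are the same physical registers as the input registers of the next. The sketch below is a simplified Python model; the class name `RegisterWindows`, the window count, and the register counts per window are assumptions, and window overflow (more nested calls than windows) is not handled.

```python
class RegisterWindows:
    """Circular register file with overlapping in/local/out windows."""

    def __init__(self, n_windows=4, n_in=8, n_local=8):
        self.n_windows = n_windows
        self.n_in = n_in                 # also the number of out registers
        self.stride = n_in + n_local     # physical registers added per window
        self.regs = [0] * (n_windows * self.stride)
        self.cwp = 0                     # current window pointer

    def _index(self, kind, i):
        base = self.cwp * self.stride
        offset = {"in": 0, "local": self.n_in, "out": self.stride}[kind]
        return (base + offset + i) % len(self.regs)

    def read(self, kind, i):
        return self.regs[self._index(kind, i)]

    def write(self, kind, i, value):
        self.regs[self._index(kind, i)] = value

    def call(self):
        self.cwp = (self.cwp + 1) % self.n_windows

    def ret(self):
        self.cwp = (self.cwp - 1) % self.n_windows


rw = RegisterWindows()
rw.write("out", 0, 42)           # caller places an argument in an out register
rw.call()
argument = rw.read("in", 0)      # callee sees it as an in register, no copy made
rw.write("in", 0, argument + 1)  # callee leaves its result in the same register
rw.ret()
returned = rw.read("out", 0)
```

Passing a parameter costs no memory traffic here: a call only changes the CWP.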

RISC vs CISC

- It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.
- Some RISC systems provide more extravagant instruction sets than some CISC systems.
- Some systems combine both approaches.
- The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.

RISC vs CISC

RISC                                        CISC
Multiple register sets.                     Single register set.
Three operands per instruction.             One or two register operands per instruction.
Parameter passing through register windows. Parameter passing through memory.
Single-cycle instructions.                  Multiple cycle instructions.
Hardwired control.                          Microprogrammed control.
Highly pipelined.                           Less pipelined.

Continued...

RISC vs CISC

RISC                                           CISC
Simple instructions, few in number.            Many complex instructions.
Fixed length instructions.                     Variable length instructions.
Complexity in compiler.                        Complexity in microcode.
Only LOAD/STORE instructions access memory.    Many instructions can access memory.
Few addressing modes.                          Many addressing modes.

Summary

Instruction Set Design Issues

Instruction set design issues include:
- Where are operands stored?
  - registers, memory, stack, accumulator
- How many explicit operands are there?
  - 0, 1, 2, or 3
- How is the operand location specified?
  - register, immediate, indirect, ...
- What type and size of operands are supported?
  - byte, int, float, double, string, vector, ...
- What operations are supported?
  - add, sub, mul, move, compare, ...

More About General Purpose Registers

Why do almost all new architectures use GPRs?
- Registers are much faster than memory (even cache)
  - Register values are available immediately
  - When memory isn't ready, the processor must wait ("stall")
- Registers are convenient for variable storage
  - Compiler assigns some variables just to registers
  - More compact code, since small fields specify registers (compared to memory addresses)

[Figure: memory hierarchy: Processor (Registers), Cache, Memory, Disk]

What Operations are Needed

- Arithmetic and Logical
  - Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
  - Logical operation: AND, OR, XOR, NOT
- Data Transfer - copy, load, store
- Control - branch, jump, call, return
- Floating Point - ADDF, MULF, DIVF
  - Same as arithmetic but usually take bigger operands
- Decimal - ADDD, CONVERT
- String - move, compare, search
- Graphics - pixel and vertex, compression/decompression operations

Stack Architecture: Pros and Cons

Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures

Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures

Accumulator Architecture: Pros and Cons

Pros
- Very low hardware requirements
- Easy to design and understand

Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic

Memory-Memory Architecture: Pros and Cons

Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)

Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required

Memory-Register Architecture: Pros and Cons

Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density

Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers


Typical Processor Execution Cycle

- Instruction Fetch: Obtain instruction from program storage
- Instruction Decode: Determine required actions and instruction size
- Operand Fetch: Locate and obtain operand data
- Execute: Compute result value or status
- Result Store: Deposit results in register or storage for later use
- Next Instruction: Determine successor instruction
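The six stages map directly onto the body of an interpreter loop. The following is a minimal sketch in Python for a hypothetical three-operand machine; the instruction format `(op, dest, src1, src2)`, the `OPS` table, and the sample values are assumptions for illustration.

```python
import operator

# Hypothetical opcode table for this sketch.
OPS = {"add": operator.add, "sub": operator.sub, "mult": operator.mul}


def run(program, storage):
    """Run a list of (op, dest, src1, src2) tuples against a storage dict."""
    pc = 0
    while pc < len(program):
        instr = program[pc]                  # instruction fetch
        op, dest, src1, src2 = instr         # instruction decode
        x, y = storage[src1], storage[src2]  # operand fetch
        result = OPS[op](x, y)               # execute
        storage[dest] = result               # result store
        pc += 1                              # next instruction (sequential)
    return storage


# f = (g + h) - (i + j) with arbitrary sample values.
storage = {"g": 5, "h": 3, "i": 2, "j": 1}
program = [
    ("add", "t0", "g", "h"),
    ("add", "t1", "i", "j"),
    ("sub", "f", "t0", "t1"),
]
final = run(program, storage)
```

A branch instruction would differ only in the last stage, where it would assign a new value to `pc` instead of incrementing it.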

Instruction and Data Memory: Unified or Separate

[Figure: Programmer's View (computer program as instructions: ADD, SUBTRACT, AND, OR, COMPARE, ...) and Computer's View (CPU, Memory, I/O)]

Princeton (Von Neumann) Architecture
- Data and instructions mixed in the same unified memory
- Program as data
- Storage utilization
- Single memory interface

Harvard Architecture
- Data and instructions in separate memories
- Has advantages in certain high performance implementations
- Can optimize each memory

Classifying Instruction Set Architectures

There are four types of internal storage used by the processor to store operands, explicitly and implicitly, for execution of a program:
1. Stack
2. Accumulator
3. Set of Registers (Register-Memory)
4. Set of Registers (Register-Register/load-store)

The operands in a stack architecture are implicitly on the top of the stack, and in an accumulator architecture one operand is implicitly the accumulator. The general-purpose register architectures (Register-Memory and Register-Register) have only explicit operands, either in registers or in memory locations.

Basic Addressing Classes

Declining cost of registers

Operand locations for four instruction set architecture classes

The arrows indicate whether the operand is an input or the result of the ALU operation, or both an input and result. Lighter shades indicate inputs, and the dark shade indicates the result. In (a), a Top Of Stack register (TOS) points to the top input operand, which is combined with the operand below. The first operand is removed from the stack, the result takes the place of the second operand, and TOS is updated to point to the result. All operands are implicit. In (b), the Accumulator is both an implicit input operand and a result. In (c), one input operand is a register, one is in memory, and the result goes to a register. All operands are registers in (d) and, like the stack architecture, can be transferred to memory only via separate instructions: push or pop for (a) and load or store for (d).

Code Sequence for C = A + B

Stack       Accumulator   Register-memory   Register-register
Push A      Load A        Load R1,A         Load R1,A
Push B      Add B         Add R3,R1,B       Load R2,B
Add         Store C       Store R3,C        Add R3,R1,R2
Pop C                                       Store R3,C

The code sequence for C = A + B for four classes of instruction sets. Note that the Add instruction has implicit operands for stack and accumulator architectures, and explicit operands for register architectures. It is assumed that A, B, and C all belong in memory and that the values of A and B cannot be destroyed.

Stack Architectures

Accumulator Architectures

Register-Set Architectures

Register-to-Register: Load-Store Architectures

Register-to-Memory Architectures

Memory-to-Memory Architectures

Instruction Formats

Instruction Set Architecture (ISA)

- To command a computer's hardware, you must speak its language
  - The words of a machine's language are called instructions, and its vocabulary is called the instruction set
- Once you learn one machine language, it is easy to pick up others:
  - There are few fundamental operations that all computers must provide
  - All designers have the same goal of finding a language that simplifies building the hardware and the compiler while maximizing performance and minimizing cost
- Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept
- The MIPS instruction set is used as a case study

Interface Design

A good interface:
- Lasts through many implementations (portability, compatibility)
- Is used in many different ways (generality)
- Provides convenient functionality to higher levels
- Permits an efficient implementation at lower levels

Design decisions must take into account:
- Technology
- Machine organization
- Programming languages
- Compiler technology
- Operating systems

[Figure: one interface with several uses (use) above it and several implementations (imp) below it, over time]

Classifying Instruction Set Architectures

Accumulator Architecture
- Common in early stored-program computers, when hardware was so expensive
- Machine has only one register (accumulator) involved in all math and logical operations
- All operations assume the accumulator as a source operand and a destination for the operation, with the other operand stored in memory

Extended Accumulator Architecture
- Dedicated registers for specific operations, e.g., stack and array index registers, added
- The 8086 microprocessor is an example of such a special-purpose register architecture

General-Purpose Register Architecture
- MIPS is an example of such an architecture, where registers are not tied to a single role
- This type of instruction set can be further divided into:
  - Register-memory: allows for one operand to be in memory
  - Register-register (load-store): demands all operands to be in registers

Machine          # general-purpose registers   Architecture style               Year
Motorola 6800    2                             Accumulator                      1974
DEC VAX          16                            Register-memory, memory-memory   1977
Intel 8086       1                             Extended accumulator             1978
Motorola 68000   16                            Register-memory                  1980
Intel 80386      8                             Register-memory                  1985
PowerPC          32                            Load-store                       1992
DEC Alpha        32                            Load-store                       1992

Compact Code and Stack Architectures

- When memory is scarce, machines like the Intel 80x86 had variable-length instructions to match varying operand specifications and minimize code size
- Stack machines abandoned registers altogether, arguing that it is hard for compilers to use them efficiently
- Operands are to be pushed on a stack from memory, and the results have to be popped from the stack to memory
- Operations take their operands by default from the top of the stack and insert the results back onto the stack
- Stack machines simplify compilers and lend themselves to a compact instruction encoding, but limit compiler optimization (e.g., in math expressions)

Example: A = B + C

    Push AddressC    # Top=Top+4; Stack[Top]=Memory[AddressC]
    Push AddressB    # Top=Top+4; Stack[Top]=Memory[AddressB]
    add              # Stack[Top-4]=Stack[Top]+Stack[Top-4]; Top=Top-4
    Pop  AddressA    # Memory[AddressA]=Stack[Top]; Top=Top-4

- Compact code is important for heralded network computers, where programs must be downloaded over the Internet (e.g., Java-based applications)

Other Types of Architecture

High-Level-Language Architecture
- In the 1960s, systems software was rarely written in high-level languages, and virtually every commercial operating system before Unix was written in assembly
- Some people blamed the code density on the instruction set rather than the programming language
- A machine design philosophy was advocated with the goal of making the hardware more like high-level languages
- The effectiveness of high-level languages, memory size limitation, and lack of efficient compilers doomed this philosophy to a historical footnote

Reduced Instruction Set Architecture
- With the recent development in compiler technology and expanded memory sizes, fewer programmers are using assembly-level coding
- Instruction set architecture became measurable in the way compilers, rather than programmers, use it
- RISC architecture favors simplifying hardware design over enriching the offered set of instructions, relying on compilers to effectively use them to perform complex operations
- Virtually all new architectures since 1982 follow the RISC philosophy of fixed instruction lengths, load-store operations, and limited addressing modes

Evolution of Instruction Sets

- Single Accumulator (EDSAC 1950)
- Accumulator + Index Registers (Manchester Mark I, IBM 700 series 1953)
- Separation of Programming Model from Implementation
  - High-level Language Based (B5000 1963)
  - Concept of a Family (IBM 360 1964)
- General Purpose Register Machines
  - Complex Instruction Sets (Vax, Intel 432 1977-80)
  - Load/Store Architecture (CDC 6600, Cray 1 1963-76)
- RISC (Mips, Sparc, HP-PA, IBM RS6000, ... 1987)

Register-Memory Architectures

Effect of the number of memory operands

# memory addresses   Max. number of operands   Examples
0                    3                         SPARC, MIPS, PowerPC, ALPHA
1                    2                         Intel 80x86, Motorola 68000
2                    2                         VAX (also has 3 operands format)
3                    3                         VAX (also has 2 operands format)

Memory Address

Interpreting Memory Addressing

- The address of a word matches the byte address of one of its 4 bytes
- The addresses of sequential words differ by 4 (word size in bytes)
- Word addresses are multiples of 4 (alignment restriction)
- Machines that use the address of the leftmost byte as the word address are called "Big Endian", and those that use the rightmost byte are called "Little Endian"
- Misalignment complicates memory access and causes programs to run slower (some machines do not allow misaligned memory access at all)
- Byte ordering can be a problem when exchanging data among different machines
- Byte addresses affect array index calculation, which must account for word addressing and the offset within the word

Object addressed   Aligned at byte offsets   Misaligned at byte offsets
Byte               0,1,2,3,4,5,6,7           Never
Half word          0,2,4,6                   1,3,5,7
Word               0,4                       1,2,3,5,6,7
Double word        0                         1,2,3,4,5,6,7
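Both byte ordering and the alignment rule can be checked in a few lines of Python using the standard `struct` module. The word value and the helper name `is_aligned` are arbitrary choices made for this sketch.

```python
import struct

word = 0x0A0B0C0D

# Big endian stores the most significant byte at the lowest address;
# little endian stores the least significant byte there.
big = struct.pack(">I", word)
little = struct.pack("<I", word)

# Reading big-endian bytes on a little-endian machine scrambles the value.
misread = int.from_bytes(big, "little")


def is_aligned(address, size):
    """An object of `size` bytes is aligned when its address is a multiple of size."""
    return address % size == 0


word_offsets = [offset for offset in range(8) if is_aligned(offset, 4)]
half_offsets = [offset for offset in range(8) if is_aligned(offset, 2)]
```

The `misread` value illustrates why data exchanged between machines of different endianness has to be converted explicitly.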

Addressing Modes

- Addressing modes refer to how to specify the location of an operand (effective address)
- Addressing modes have the ability to:
  - Significantly reduce instruction counts
  - Increase the average CPI
  - Increase the complexity of building a machine
- The VAX machine is used for benchmark data since it supports a wide range of memory addressing modes
- Famous addressing modes can be classified based on:
  - the source of the data, into register, immediate, or memory
  - the address calculation, into direct and indirect
- An indexed addressing mode is usually provided to allow efficient implementation of loops and array access

Example of Addressing Modes

Addressing mode: Example / Meaning / When used

Register: Add R4, R3
  Meaning: Regs[R4] ← Regs[R4] + Regs[R3]
  When used: When a value is in a register

Immediate: Add R4, #3
  Meaning: Regs[R4] ← Regs[R4] + 3
  When used: For constants

Displacement: Add R4, 100(R1)
  Meaning: Regs[R4] ← Regs[R4] + Mem[100 + Regs[R1]]
  When used: Accessing local variables

Register indirect: Add R4, (R1)
  Meaning: Regs[R4] ← Regs[R4] + Mem[Regs[R1]]
  When used: Accessing using a pointer or a computed address

Indexed: Add R4, (R1 + R2)
  Meaning: Regs[R4] ← Regs[R4] + Mem[Regs[R1] + Regs[R2]]
  When used: Sometimes useful in array addressing: R1 = base of the array, R2 = index amount

Direct or absolute: Add R4, (1001)
  Meaning: Regs[R4] ← Regs[R4] + Mem[1001]
  When used: Sometimes useful for accessing static data; address constant may need to be large

Memory indirect or memory deferred: Add R4, @(R3)
  Meaning: Regs[R4] ← Regs[R4] + Mem[Mem[Regs[R3]]]
  When used: If R3 is the address of the pointer p, then the mode yields *p

Autoincrement: Add R4, (R2)+
  Meaning: Regs[R4] ← Regs[R4] + Mem[Regs[R2]]; Regs[R2] ← Regs[R2] + d
  When used: Useful for stepping through arrays within a loop. R2 points to the start of the array; each reference increments R2 by d

Autodecrement: Add R4, -(R2)
  Meaning: Regs[R2] ← Regs[R2] - d; Regs[R4] ← Regs[R4] + Mem[Regs[R2]]
  When used: Same use as autoincrement. Autodecrement/increment can also act as push/pop to implement a stack

Scaled: Add R4, 100(R2)[R3]
  Meaning: Regs[R4] ← Regs[R4] + Mem[100 + Regs[R2] + Regs[R3] × d]
  When used: Used to index arrays
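The operand selection for several of the modes listed above can be sketched in a few lines of Python. This is only an illustrative model: the function name `operand`, the register names, and the memory contents are assumptions made for the example, and registers and memory are modelled as plain dictionaries.

```python
# Toy model of operand fetch for several addressing modes.
# All names and values here are illustrative assumptions.

def operand(mode, regs, mem, *, reg=None, index=None, const=0, d=4):
    """Return the operand value selected by the given addressing mode."""
    if mode == "register":
        return regs[reg]
    if mode == "immediate":
        return const
    if mode == "displacement":
        return mem[const + regs[reg]]
    if mode == "register_indirect":
        return mem[regs[reg]]
    if mode == "indexed":
        return mem[regs[reg] + regs[index]]
    if mode == "direct":
        return mem[const]
    if mode == "memory_indirect":
        return mem[mem[regs[reg]]]
    if mode == "scaled":
        return mem[const + regs[reg] + regs[index] * d]
    raise ValueError(f"unknown mode: {mode}")


regs = {"R1": 100, "R2": 8, "R3": 2}
mem = {100: 200, 108: 11, 200: 7, 208: 9, 1001: 5}

displacement_value = operand("displacement", regs, mem, reg="R1", const=100)
indexed_value = operand("indexed", regs, mem, reg="R1", index="R2")
deferred_value = operand("memory_indirect", regs, mem, reg="R1")
scaled_value = operand("scaled", regs, mem, reg="R1", index="R3", const=100)
```

Each branch mirrors one "Meaning" entry: the mode only changes how the address is formed, not what is done with the fetched value.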


Addressing Mode for Signal Processing

Fast Fourier Transform (bit-reversed index mapping)

0 (000) → 0 (000)

1 (001) → 4 (100)

2 (010) → 2 (010)

3 (011) → 6 (110)

4 (100) → 1 (001)

5 (101) → 5 (101)

6 (110) → 3 (011)

7 (111) → 7 (111)

Modulo addressing

Since DSP deals with continuous data streams, circular buffers are widely used

Circular or modulo addressing allows automatic increment and decrement and resets the pointer when reaching the end of the buffer

Reverse addressing

The resulting address is the bit-reversed order of the current address

Reverse addressing mode expedites the access, which otherwise requires a number of logical instructions or extra memory accesses

DSP offers special addressing modes to better serve popular algorithms

Special features require either hand coding or a compiler that uses such features (GNU would not be a good choice)
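A minimal sketch of the two DSP addressing modes described above, written as ordinary Python functions. The helper names `modulo_next` and `bit_reverse` are assumptions for illustration; real DSPs perform these updates in the address-generation hardware rather than in software.

```python
def modulo_next(ptr, step, base, length):
    """Advance ptr by step, wrapping inside the circular buffer [base, base + length)."""
    return base + (ptr - base + step) % length


def bit_reverse(index, bits):
    """Return index with its low `bits` bits in reversed order."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


# Access order for an 8-point FFT (3 address bits).
fft_order = [bit_reverse(i, 3) for i in range(8)]

# A pointer at the last slot of an 8-entry buffer starting at address 16 wraps to the start.
wrapped = modulo_next(23, 1, base=16, length=8)
```

Without hardware support, each bit-reversed address costs a short loop of shifts and masks like the one in `bit_reverse`, which is the overhead the special mode avoids.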


Operations of the Computer Hardware

"There must certainly be instructions for performing the fundamental arithmetic operations."

Burks, Goldstine, and von Neumann, 1947

Assembly language is a symbolic representation of what the processor actually understands

The MIPS assembler allows only one instruction per line and ignores comments following # until the end of the line

Example:

Translation of a segment of a C program to MIPS assembly instructions

C: f = (g + h) - (i + j)

MIPS:

add t0, g, h    # temp variable t0 contains "g + h"
add t1, i, j    # temp variable t1 contains "i + j"
sub f, t0, t1   # f = t0 - t1 = (g + h) - (i + j)


Operations in the Instruction Set

Operator type: Examples

Arithmetic and logical: Integer arithmetic and logical operations: add, and, subtract, or

Data transfer: Loads-stores (move instructions on machines with memory addressing)

Control: Branch, jump, procedure call and return, trap

System: Operating system call, virtual memory management instructions

Floating point: Floating point instructions: add, multiply

Decimal: Decimal add, decimal multiply, decimal to character conversion

String: String move, string compare, string search

Graphics: Pixel operations, compression/decompression operations

Arithmetic, logical, data transfer and control are almost standard categories for all machines

System instructions are required for multi-programming environments, although support for system functions varies

Decimal and string instructions can be primitives, e.g. IBM 360 and the VAX

Support for floating point, decimal, string and graphics can be optional, sometimes provided via a co-processor

Some machines rely on the compiler to synthesize special operations such as string handling from simpler instructions


Operations for Media and Signal Processing

Single instruction multiple data (SIMD) and vector instructions are often supported in DSPs, which are commonly used in multimedia and signal processing applications

Partitioned Add (integer)

Performs multiple narrower additions in parallel on a 64-bit ALU, since most data are narrow

Increases ALU throughput for multimedia applications

Paired single operations (float)

Allow the same register to act as two operands to the same operation

Handy in dealing with vertices and coordinates

Multiply and accumulate

Very handy for calculating dot products of vectors (signal processing) and matrix multiplication
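The partitioned add and multiply-accumulate operations above can be illustrated with a small software model. The lane width of 16 bits, the lane count of 4, and the function names are assumptions chosen for the example; the hardware does all lanes in one ALU pass by cutting the carry chain between lanes.

```python
def partitioned_add(a, b, lane_bits=16, lanes=4):
    """Add two packed words lane by lane; a carry out of a lane is discarded."""
    mask = (1 << lane_bits) - 1
    result = 0
    for lane in range(lanes):
        shift = lane * lane_bits
        lane_sum = ((a >> shift) & mask) + ((b >> shift) & mask)
        result |= (lane_sum & mask) << shift
    return result


def dot_product(xs, ys):
    """Dot product written as a chain of multiply-accumulate steps."""
    acc = 0
    for x, y in zip(xs, ys):
        acc += x * y  # one multiply-accumulate per element pair
    return acc


packed_sum = partitioned_add(0x0001_FFFF_0003_0004, 0x0001_0001_0001_0001)
dot = dot_product([1, 2, 3], [4, 5, 6])
```

Note that the lane holding 0xFFFF wraps to 0 instead of carrying into its neighbour, which is exactly what distinguishes a partitioned add from an ordinary 64-bit add.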


Frequency of Operations Usage

The most widely executed instructions are the simple operations of an instruction set

The following is the average usage in SPECint on the Intel 80x86

Rank / 80x86 instruction / Integer average (% total executed)

1 Load 22%

2 Conditional branch 20%

3 Compare 16%

4 Store 12%

5 Add 8%

6 And 6%

7 Sub 5%

8 Move register-register 4%

9 Call 1%

10 Return 1%

Total 96%

Make the common case fast by focusing on these operations


Control Flow Instructions

Jump for unconditional change in the control flow

Branch for conditional change in the control flow

Procedure calls and returns

Data is based on SPEC on Alpha


Destination Address Definition

Relative addressing with respect to the program counter proved to be the best choice for forward and backward branches or jumps (load address independent)

To allow for dynamic loading of library routines, register indirect addressing allows addresses to be loaded in special registers (e.g. virtual functions in C++ and system calls in a case statement)

Data is based on SPEC on Alpha


Condition Evaluation

Compare-and-branch can be efficient if the majority of conditions are comparisons with zero

Remember to focus on the common case

Based on SPEC on MIPS


Frequency of Types of Comparison

Data is based on SPEC on Alpha

A different benchmark and machine set a new design priority

DSPs support a repeat instruction for for-loops (vectors) using registers


Supporting Procedures

Execution of a procedure follows these steps:

Store parameters in a place accessible to the procedure

Transfer control to the procedure

Acquire the storage resources needed for the procedure

Perform the desired task

Store the result value in a place accessible to the calling program

Return control to the point of origin

The hardware provides a program counter to trace instruction flow and manage transfer of control

Parameter Passing

Registers can be used for passing a small number of parameters

A stack is used to spill registers of the current context and make room for the called procedure to run, and to allow large parameters to be passed

Storage of machine state can be performed by caller or callee

Handling of shared variables is important to ensure correct semantics and thus requires clear specifications in the library interface

Global variables stored in registers need careful handling


Type and Size of Operands

The type of an operand is designated by encoding it in the instruction's operation code

The type of an operand, e.g. single precision float, effectively gives its size

Common operand types include character, half word and word size integer, and single- and double-precision floating point

Characters are almost always in ASCII, integers are in 2's complement, and floating point is in IEEE 754

The 16-bit Unicode used in Java is gaining popularity due to its support for international character sets

For business applications some architectures support a decimal format in binary coded decimal (BCD)

Depending on the size of the word, the complexity of handling different operand types differs

DSP offers fixed point data types to support high precision floating point arithmetic and to allow sharing a single exponent for multiple numbers

For graphics applications, vertex and pixel operands are added features


Size of Operands

Frequency of reference by size, based on SPEC2000 on Alpha

Double-word data type is used for double-precision floating point operations and address storage in machines with a 64-bit wide address bus

Words are used for integer operations and for 32-bit address bus machines

Because of the mix in SPEC, word and double-word data types dominate


Instruction Representation

Humans are taught to think in base 10 (decimal) but numbers may be represented in any base (123 in base 10 = 1111011 in binary or base 2)

Numbers are stored in computers as a series of high and low electronic signals (binary numbers)

Binary digits are called bits and are considered the atom of computing

Each piece of an instruction is a number, and placing these numbers together forms the instruction

The assembler translates the symbolic assembly instructions into machine language instructions (machine code)

Example:

Assembly: add $t0, $s1, $s2

M/C language (decimal): 0 17 18 8 0 32

M/C language (binary): 000000 10001 10010 01000 00000 100000

Field widths: 6 bits, 5 bits, 5 bits, 5 bits, 5 bits, 6 bits

Note: the MIPS compiler by default maps $s0, ..., $s7 to registers 16-23 and $t0, ..., $t7 to registers 8-15


Encoding an Instruction Set

Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation

The operation is typically specified in one field called the opcode

The addressing mode for the operand can be encoded with the operation or specified through a separate identifier in case of a large number of supported modes

The architecture must balance several competing factors:

Desire to support as many registers and addressing modes as possible

Effect of operand specification on the size of the instruction (program)

Desire to simplify instruction fetching and decoding during execution

Fixed size instruction encoding simplifies the CPU design while limiting the addressing modes supported

An architect caring about the code size can use variable size encoding

A hybrid approach is to allow variability by supporting multiple-sized instructions

Encoding Examples

MIPS Instruction Format

Register-format instructions

Fields: op | rs | rt | rd | shamt | funct

Widths: 6 bits | 5 bits | 5 bits | 5 bits | 5 bits | 6 bits

op: Basic operation of the instruction, traditionally called the opcode

rs: The first register source operand

rt: The second register source operand

rd: The register destination operand; it gets the result of the operation

shamt: Shift amount

funct: This field selects the specific variant of the operation in the op field

Immediate-type instructions

Fields: op | rs | rt | address

Widths: 6 bits | 5 bits | 5 bits | 16 bits

Some instructions need longer fields than provided for large value constants

The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register

Example: lw $t0, 32($s3)   # Temporary register $t0 gets A[8]

Instruction / Format / op / rs / rt / rd / shamt / funct / address

add / R / 0 / reg / reg / reg / 0 / 32 / N/A

sub / R / 0 / reg / reg / reg / 0 / 34 / N/A

lw / I / 35 / reg / reg / N/A / N/A / N/A / address

sw / I / 43 / reg / reg / N/A / N/A / N/A / address
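To make the field layout above concrete, the following sketch packs the fields of an R-format and an I-format instruction into a 32-bit word. The helper names `encode_r` and `encode_i` are assumptions for illustration; the field widths and the register numbers ($t0 = 8, $s1 = 17, $s2 = 18, $s3 = 19) follow the standard MIPS conventions used in the slides.

```python
def encode_r(op, rs, rt, rd, shamt, funct):
    """Pack an R-format instruction: op(6) rs(5) rt(5) rd(5) shamt(5) funct(6)."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


def encode_i(op, rs, rt, address):
    """Pack an I-format instruction: op(6) rs(5) rt(5) address(16)."""
    return (op << 26) | (rs << 21) | (rt << 16) | (address & 0xFFFF)


add_word = encode_r(0, 17, 18, 8, 0, 32)  # add $t0, $s1, $s2
lw_word = encode_i(35, 19, 8, 32)         # lw $t0, 32($s3)

add_bits = format(add_word, "032b")
```

Printing `add_bits` in groups of 6, 5, 5, 5, 5, and 6 bits reproduces the binary machine-language form of the add instruction.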


The Stored Program Concept

Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept

Today's computers are built on two key principles:

Instructions are represented as numbers

Programs can be stored in memory to be read or written just like numbers

The power of the concept

Memory can contain:

the source code for an editor

the compiled m/c code for the editor

the text that the compiled program is using

the compiler that generated the code

[Figure: a processor connected to a memory that holds an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and the source code in C for the editor program]


Compiling if-then-else in MIPS

Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiled code for the following C if statement?

if (i == j) f = g + h; else f = g - h;

[Flowchart: test i == j; when i = j, f = g + h; when i ≠ j (Else:), f = g - h; both paths meet at Exit:]

MIPS:

      bne $s3, $s4, Else   # go to Else if i ≠ j
      add $s0, $s1, $s2    # f = g + h (skipped if i ≠ j)
      j Exit
Else: sub $s0, $s1, $s2    # f = g - h (skipped if i = j)
Exit:

Typical Compilation

Major Types of Optimization

Optimization name: Explanation (Frequency)

High-level (at or near source level; machine-independent)

Procedure integration: Replace procedure call by procedure body (N.M.)

Local (within straight-line code)

Common subexpression elimination: Replace two instances of the same computation by a single copy (18%)

Constant propagation: Replace all instances of a variable that is assigned a constant with the constant (22%)

Stack height reduction: Rearrange expression tree to minimize resources needed for expression evaluation (N.M.)

Global (across a branch)

Global common subexpression elimination: Same as local, but this version crosses branches (13%)

Copy propagation: Replace all instances of a variable A that has been assigned X (i.e. A = X) with X (11%)

Code motion: Remove code from a loop that computes the same value each iteration of the loop (16%)

Induction variable elimination: Simplify/eliminate array addressing calculations within loops (2%)

Machine-dependent (depends on machine knowledge)

Strength reduction: Many examples, such as replacing a multiply by a constant with adds and shifts (N.M.)

Pipeline scheduling: Reorder instructions to improve pipeline performance (N.M.)


Effect of Compiler Optimization

Measurements taken on S

[Figure: results plotted by program and compiler optimization level]

Level 0: non-optimized code

Level 1: local optimization

Level 2: global optimization, s/w pipelining

Level 3: adds procedure integration


Compiler Support for Multimedia Instructions

Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics)

Intel added a new set of instructions called Streaming SIMD Extension

A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer

Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations

Strided addressing allows memory access in increments larger than one

Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data

Supporting vector operations without strided addressing, such as Intel's MMX, limits the potential speedup

Such limited support for vector processing makes the use of vectorizing compiler optimization unpopular and restricts its scope to hand coded routines

SIMD instructions on MMX and AltiVec tend to be solutions, not primitives
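The difference between strided and gather/scatter addressing described above can be sketched with memory modelled as a Python list. The function names and the sample memory contents are assumptions for illustration only.

```python
def strided_load(memory, base, stride, count):
    """Load `count` elements starting at `base`, stepping `stride` slots each time."""
    return [memory[base + i * stride] for i in range(count)]


def gather(memory, addresses):
    """Load one element from each explicitly listed address."""
    return [memory[a] for a in addresses]


def scatter(memory, addresses, values):
    """Store each value to its explicitly listed address."""
    for a, v in zip(addresses, values):
        memory[a] = v


memory = [10 * i for i in range(10)]      # memory[i] holds 10 * i
column = strided_load(memory, 1, 3, 3)    # slots 1, 4, 7
picked = gather(memory, [9, 0, 5])
scatter(memory, [2, 6], [-1, -2])
```

A strided load suits regular patterns such as walking down a matrix column, while gather/scatter handles irregular patterns because the address list itself is data.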


Starting a Program

[Figure: C program → Compiler → Assembly language program → Assembler → Object: machine language module, together with Object: library routine (machine language) → Linker → Executable: machine language program → Loader → Memory]

Linker

- Place code and data modules symbolically in memory

- Determine the addresses of data and instruction labels

- Patch both internal and external references

Object files for Unix typically contain:

Header: size and position of components

Text segment: machine code

Data segment: static and dynamic variables

Relocation info: identifies absolute memory references

Symbol table: name and location of labels, procedures and variables

Debugging info: mapping of source to object code, break points, etc.


Loading Executable Program

[Figure: memory layout from high to low addresses: Stack ($sp at 7fff ffff hex), Dynamic data, Static data (from 1000 0000 hex, with $gp at 1000 8000 hex), Text (pc at 0040 0000 hex), Reserved (from 0)]

To load an executable, the operating system follows these steps:

Reads the executable file header to determine the size of the text and data segments

Creates an address space large enough for the text and data

Copies the instructions and data from the executable file into memory

Copies the parameters (if any) to the main program onto the stack

Initializes the machine registers and sets the stack pointer to the first free location

Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program


Instruction Set Design Issues

Instruction set design issues include:

Number of Addresses

Flow of Control

Operand Types

Addressing Modes

Instruction Types

Instruction Formats


Number of Addresses

Four categories

3-address machines

- 2 for the source operands and one for the result

2-address machines

- One address doubles as source and result

1-address machines

- Accumulator machines

- Accumulator is used for one source and the result

0-address machines

- Stack machines

- Operands are taken from the stack

- Result goes onto the stack


Number of Addresses (cont.)

Three-address machines

Two for the source operands, one for the result

RISC processors use three addresses

Sample instructions

add dest,src1,src2    ; M(dest) = [src1] + [src2]
sub dest,src1,src2    ; M(dest) = [src1] - [src2]
mult dest,src1,src2   ; M(dest) = [src1] * [src2]

Three addresses: Operand 1, Operand 2, Result

Example: a = b + c

Three-address instruction formats are not common because they require a relatively long instruction format to hold the three address references


Number of Addresses (cont.)

Example

C statement

A = B + C * D - E + F + A

Equivalent code:

mult T,C,D   ; T = C*D
add T,T,B    ; T = B + C*D
sub T,T,E    ; T = B + C*D - E
add T,T,F    ; T = B + C*D - E + F
add A,T,A    ; A = B + C*D - E + F + A


Number of Addresses (cont.)

Two-address machines

One address doubles (for source operand and result)

The last example makes a case for it

- Address T is used twice

Sample instructions

load dest,src   ; M(dest) = [src]
add dest,src    ; M(dest) = [dest] + [src]
sub dest,src    ; M(dest) = [dest] - [src]
mult dest,src   ; M(dest) = [dest] * [src]

Two addresses

One address doubles as operand and result

Example: a = a + b

The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation


Number of Addresses (cont.)

Example

C statement

A = B + C * D - E + F + A

Equivalent code:

load T,C   ; T = C
mult T,D   ; T = C*D
add T,B    ; T = B + C*D
sub T,E    ; T = B + C*D - E
add T,F    ; T = B + C*D - E + F
add A,T    ; A = B + C*D - E + F + A


Number of Addresses (cont.)

One-address machines

Use a special set of registers called accumulators

- Specify one source operand and receive the result

Called accumulator machines

Sample instructions

load addr    ; accum = [addr]
store addr   ; M[addr] = accum
add addr     ; accum = accum + [addr]
sub addr     ; accum = accum - [addr]
mult addr    ; accum = accum * [addr]


Number of Addresses (cont.)

Example

C statement

A = B + C * D - E + F + A

Equivalent code:

load C    ; load C into accum
mult D    ; accum = C*D
add B     ; accum = C*D + B
sub E     ; accum = B + C*D - E
add F     ; accum = B + C*D - E + F
add A     ; accum = B + C*D - E + F + A
store A   ; store accum contents in A


Number of Addresses (cont.)

Zero-address machines

Stack supplies operands and receives the result

- Special instructions to load and store use an address

Called stack machines (Ex: HP 3000, Burroughs B5500)

Sample instructions

push addr   ; push([addr])
pop addr    ; pop([addr])
add         ; push(pop + pop)
sub         ; push(pop - pop)
mult        ; push(pop * pop)


Number of Addresses (cont.)

Example

C statement

A = B + C * D - E + F + A

Equivalent code:

push E
push C
push D
mult
push B
add
sub
push F
add
push A
add
pop A
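The zero-address example above can be checked with a tiny stack-machine interpreter. This is a sketch under stated assumptions: the function name `run_stack_program`, the dictionary-based memory, and the sample variable values are all made up for illustration, and `sub` follows the slide's definition push(pop - pop), so the first value popped is the left operand.

```python
def run_stack_program(program, memory):
    """Execute zero-address instructions against a dict-based memory."""
    stack = []
    for instr in program:
        op, *args = instr.split()
        if op == "push":
            stack.append(memory[args[0]])
        elif op == "pop":
            memory[args[0]] = stack.pop()
        else:
            first, second = stack.pop(), stack.pop()  # first = top of stack
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)
            elif op == "mult":
                stack.append(first * second)
            else:
                raise ValueError(f"unknown instruction: {instr}")
    return memory


program = ["push E", "push C", "push D", "mult", "push B", "add",
           "sub", "push F", "add", "push A", "add", "pop A"]
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
result = run_stack_program(program, memory)["A"]
```

Pushing E first is what makes the later `sub` compute (B + C*D) - E rather than the reverse, since E sits below the partial sum on the stack.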


Load/Store Architecture

Instructions expect operands in internal processor registers

Special LOAD and STORE instructions move data between registers and memory

RISC uses this architecture

Reduces instruction length


Load/Store Architecture (cont.)

Sample instructions

load Rd,addr      ; Rd = [addr]
store addr,Rs     ; (addr) = Rs
add Rd,Rs1,Rs2    ; Rd = Rs1 + Rs2
sub Rd,Rs1,Rs2    ; Rd = Rs1 - Rs2
mult Rd,Rs1,Rs2   ; Rd = Rs1 * Rs2


Number of Addresses (cont.)

Example

C statement

A = B + C * D - E + F + A

Equivalent code:

load R1,B
load R2,C
load R3,D
load R4,E
load R5,F
load R6,A
mult R2,R2,R3
add R2,R2,R1
sub R2,R2,R4
add R2,R2,R5
add R2,R2,R6
store A,R2


Flow of Control

Default is sequential flow

Several instructions alter this default execution

Branches

- Unconditional

- Conditional

- Delayed branches

Procedure calls

- Delayed procedure calls


Flow of Control (cont.)

Branches

Unconditional

- Absolute address

- PC-relative

  - Target address is specified relative to PC contents

  - Relocatable code

Example: MIPS

- Absolute address

  j target

- PC-relative

  b target
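The two ways of forming a branch target can be worked through numerically. The sketch below follows the usual MIPS32 conventions (a 16-bit signed word offset relative to the instruction after the branch, and a 26-bit word address combined with the upper four PC bits for `j`); the function names and the sample addresses are assumptions for illustration.

```python
def branch_target(pc, offset16):
    """PC-relative target: sign-extend the 16-bit word offset and add it to PC + 4."""
    if offset16 & 0x8000:
        offset16 -= 0x10000
    return (pc + 4 + (offset16 << 2)) & 0xFFFFFFFF


def jump_target(pc, addr26):
    """Absolute-style target: upper 4 bits of PC + 4, then the 26-bit word address."""
    return ((pc + 4) & 0xF0000000) | ((addr26 & 0x03FFFFFF) << 2)


forward = branch_target(0x00400000, 3)        # three words past the next instruction
backward = branch_target(0x00400010, 0xFFFF)  # offset of -1 word
absolute = jump_target(0x00400000, 0x0100000)
```

Because `branch_target` depends only on the distance from the branch, the same encoded offset stays valid wherever the code is loaded, which is why PC-relative code is relocatable.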


Flow of Control (cont.)

Branches

Conditional

- Jump is taken only if the condition is met

Two types

- Set-Then-Jump

  - Condition testing is separated from branching

  - Condition code registers are used to convey the condition test result

  - Condition code registers keep a record of the status of the last ALU operation, such as an overflow condition

- Example: Pentium code

  cmp AX,BX    ; compare AX and BX
  je target    ; jump if equal


- Test-and-Jump

  - Single instruction performs condition testing and branching

- Example: MIPS instruction

  beq Rsrc1,Rsrc2,target

  Jumps to target if Rsrc1 = Rsrc2

Delayed branching

Control is transferred after executing the instruction that follows the branch instruction

- This instruction slot is called the delay slot

Improves efficiency

Highly pipelined RISC processors support it


Procedure calls

Facilitate modular programming

Require two pieces of information to return

- End of procedure

  - Pentium uses the ret instruction

  - MIPS uses the jr instruction

- Return address

  - In a (special) register: MIPS allows any general-purpose register

  - On the stack: Pentium

[Figure: delay slot]

Parameter Passing

Two basic techniques

Register-based (e.g. PowerPC, MIPS)

- Internal registers are used

  - Faster

  - Limits the number of parameters

  - Recursive procedure

Stack-based (e.g. Pentium)

- Stack is used

  - More general

Operand Types

Instructions support basic data types

Characters

Integers

Floating-point

Instruction overload

Same instruction for different data types

Example: Pentium

mov AL,address    ; loads an 8-bit value
mov AX,address    ; loads a 16-bit value
mov EAX,address   ; loads a 32-bit value


Separate instructions

Instructions specify the operand size

Example: MIPS

lb Rdest,address   ; loads a byte
lh Rdest,address   ; loads a halfword (16 bits)
lw Rdest,address   ; loads a word (32 bits)
ld Rdest,address   ; loads a doubleword (64 bits)

Similar instruction: store

Addressing Modes

How the operands are specified

Operands can be in three places

- Registers

  - Register addressing mode

- Part of instruction

  - Constant

  - Immediate addressing mode

  - All processors support these two addressing modes

- Memory

  - Difference between RISC and CISC

  - CISC supports a large variety of addressing modes

  - RISC follows load/store architecture

Instruction Types

Several types of instructions

Data movement

- Pentium: mov dest,src

- Some do not provide direct data movement instructions

- Indirect data movement

  add Rdest,Rsrc,0   ; Rdest = Rsrc + 0

Arithmetic and Logical

- Arithmetic

  - Integer and floating-point, signed and unsigned

  - add, subtract, multiply, divide

- Logical

  - and, or, not, xor


Instruction Types (cont.)

Condition code bits

S: Sign bit (0 = +, 1 = -)

Z: Zero bit (0 = nonzero, 1 = zero)

O: Overflow bit (0 = no overflow, 1 = overflow)

C: Carry bit (0 = no carry, 1 = carry)

Example: Pentium

cmp count,25   ; compare count to 25
               ; (subtract 25 from count)
je target      ; jump if equal
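How a compare instruction sets the four condition code bits can be sketched as follows. The function name `compare_flags` and the default 32-bit width are assumptions for the example, and the carry bit is modelled as a borrow (set when the first operand is smaller as an unsigned number), which is how x86 reports carry after a subtraction.

```python
def compare_flags(a, b, bits=32):
    """Return the S, Z, O, C condition bits produced by computing a - b."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    diff = (a - b) & mask
    return {
        "S": int(bool(diff & sign)),                      # result is negative
        "Z": int(diff == 0),                              # result is zero
        "O": int(bool((a ^ b) & (a ^ diff) & sign)),      # signed overflow
        "C": int(a < b),                                  # unsigned borrow
    }


equal = compare_flags(25, 25)
smaller = compare_flags(10, 25)
overflow = compare_flags(0x80, 1, bits=8)  # -128 - 1 does not fit in 8 bits
```

A `je` after the compare only needs to look at the Z bit: it is 1 exactly when the two operands were equal.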


Instruction Types (cont.)

Flow control and I/O instructions

- Branch

- Procedure call

- Interrupts

I/O instructions

- Memory-mapped I/O

  - Most processors support memory-mapped I/O

  - No separate instructions for I/O

- Isolated I/O

  - Pentium supports isolated I/O

  - Separate I/O instructions

  in AX,io_port    ; read from an I/O port
  out io_port,AX   ; write to an I/O port

Instruction Formats

Two types

Fixed-length

- Used by RISC processors

- 32-bit RISC processors use 32-bit wide instructions

  - Examples: SPARC, MIPS, PowerPC

Variable-length

- Used by CISC processors

- Memory operands need more bits to specify

Opcode

Major and exact operation

Examples of Instruction Formats

ISC e)uce) Instruction Set Computer 3

ersus

CISC Comple Instruction Set Computer3

RISC vs. CISC

The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.

RISC systems access memory only with explicit load and store instructions.

In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.

The difference between CISC and RISC becomes evident through the basic computer performance equation:

time/program = (time/cycle) × (cycles/instruction) × (instructions/program)

RISC systems shorten execution time by reducing the clock cycles per instruction.

CISC systems improve performance by reducing the number of instructions per program.

The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.

The more complex (and variable) instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.

With fixed-length instructions, RISC lends itself to pipelining and speculative execution.

Consider the program fragments:

CISC:
  mov ax, 10
  mov bx, 5
  mul bx, ax

RISC:
  mov ax, 0
  mov bx, 10
  mov cx, 5
Begin:
  add ax, bx
  loop Begin

The total clock cycles for the CISC version might be:
(2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles

While the clock cycles for the RISC version are:
(3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles

With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds.
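The cycle comparison is just a weighted sum, which a short sketch makes explicit. The per-instruction costs below are assumptions chosen for illustration (1 cycle for each mov, add and loop, 30 cycles for a mul); real costs depend on the processor.

```python
def total_cycles(instruction_mix):
    """Sum count * cycles-per-instruction over (count, cycles) pairs."""
    return sum(count * cycles for count, cycles in instruction_mix)

# Assumed costs: 1 cycle for mov/add/loop, 30 cycles for mul.
cisc = [(2, 1), (1, 30)]          # 2 movs, 1 mul
risc = [(3, 1), (5, 1), (5, 1)]   # 3 movs, 5 adds, 5 loop iterations

print(total_cycles(cisc), total_cycles(risc))
```

Under these assumptions the RISC fragment executes more instructions but still needs fewer cycles, because each instruction is cheap.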

Because of their load-store ISAs, RISC architectures require a large number of CPU registers.

These registers provide fast access to data during sequential program execution.

They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.

Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.

This is how registers can be overlapped in a RISC system.

The current window pointer (CWP) points to the active register window.
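A small model may help show why overlapping windows make parameter passing cheap. Everything here is a hypothetical sketch: the class name, the window count and the 8/8 split between shared and local registers are assumptions, and window overflow (spilling to memory when the windows run out) is not modelled.

```python
class RegisterWindows:
    """Overlapping register windows laid over one circular physical register file."""

    def __init__(self, windows=4, shared=8, local=8):
        self.windows = windows
        self.shared = shared
        self.stride = shared + local               # registers advanced per call
        self.regs = [0] * (windows * self.stride)  # physical register file
        self.cwp = 0                               # current window pointer

    def _index(self, kind, n):
        # Window layout: ins | locals | outs. The outs of window w are the ins of w + 1.
        offset = {"in": 0, "local": self.shared, "out": self.stride}[kind]
        return (self.cwp * self.stride + offset + n) % len(self.regs)

    def write(self, kind, n, value):
        self.regs[self._index(kind, n)] = value

    def read(self, kind, n):
        return self.regs[self._index(kind, n)]

    def call(self):
        self.cwp = (self.cwp + 1) % self.windows

    def ret(self):
        self.cwp = (self.cwp - 1) % self.windows


rw = RegisterWindows()
rw.write("out", 0, 42)    # caller places an argument in an out register
rw.call()                 # only the CWP moves; nothing is copied
print(rw.read("in", 0))   # callee reads the same physical register
```

The callee's "in" registers are the caller's "out" registers, so a call passes parameters by moving the pointer rather than by copying through memory.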

It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.

Some RISC systems provide more extravagant instruction sets than some CISC systems.

Some systems combine both approaches.

The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.

RISC
- Multiple register sets.
- Three operands per instruction.
- Parameter passing through register windows.
- Single-cycle instructions.
- Hardwired control.
- Highly pipelined.

CISC
- Single register set.
- One or two register operands per instruction.
- Parameter passing through memory.
- Multiple-cycle instructions.
- Microprogrammed control.
- Less pipelined.

RISC
- Simple instructions, few in number.
- Fixed-length instructions.
- Complexity in compiler.
- Only LOAD/STORE instructions access memory.
- Few addressing modes.

CISC
- Many complex instructions.
- Variable-length instructions.
- Complexity in microcode.
- Many instructions can access memory.
- Many addressing modes.

Summary

Instruction Set Design Issues

Instruction set design issues include:
- Where are operands stored?
  registers, memory, stack, accumulator
- How many explicit operands are there?
  0, 1, 2, or 3
- How is the operand location specified?
  register, immediate, indirect, ...
- What type and size of operands are supported?
  byte, int, float, double, string, vector, ...
- What operations are supported?
  add, sub, mul, move, compare, ...

More About General Purpose Registers

Why do almost all new architectures use GPRs?

Registers are much faster than memory (even cache)
- Register values are available immediately
- When memory isn't ready, the processor must wait ("stall")

Registers are convenient for variable storage
- Compiler assigns some variables just to registers
- More compact code, since small fields specify registers (compared to memory addresses)

[Figure: storage hierarchy: processor (registers, cache), memory, disk]

What Operations are Needed

Arithmetic + Logical
- Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
- Logical operation: AND, OR, XOR, NOT

Data Transfer - copy, load, store

Control - branch, jump, call, return

Floating Point - ADD, MUL, DIV (same as arithmetic but usually take bigger operands)

Decimal - ADDD, CONVERT

String - move, compare, search

Graphics - pixel and vertex, compression/decompression operations

Stack Architecture: Pros and Cons

Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures

Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures

Accumulator Architecture: Pros and Cons

Pros
- Very low hardware requirements
- Easy to design and understand

Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic

Memory-Memory Architecture: Pros and Cons

Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)

Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required

Memory-Register Architecture: Pros and Cons

Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density

Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers


MIPS (originally an acronym for Microprocessor without Interlocked Pipeline Stages)

MIPS is a reduced instruction set computer (RISC) instruction set architecture (ISA) developed by MIPS Technologies (formerly MIPS Computer Systems, Inc.). The early MIPS architectures were 32-bit, and later versions were 64-bit. Multiple revisions of the MIPS instruction set exist, including MIPS I, MIPS II, MIPS III, MIPS IV, MIPS V, MIPS32, and MIPS64. The current revisions are MIPS32 (for 32-bit implementations) and MIPS64 (for 64-bit implementations). MIPS32 and MIPS64 define a control register set as well as the instruction set.

Typical Processor Execution Cycle

- Instruction Fetch: Obtain instruction from program storage
- Instruction Decode: Determine required actions and instruction size
- Operand Fetch: Locate and obtain operand data
- Execute: Compute result value or status
- Result Store: Deposit results in register or storage for later use
- Next Instruction: Determine successor instruction
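The six stages of the execution cycle map directly onto the body of an interpreter loop. The sketch below is a toy, not a model of any real processor: the tuple instruction format, the register names and the two opcodes are all assumptions made for illustration.

```python
def run(program, registers):
    """Tiny interpreter: each loop iteration walks through the six stages."""
    pc = 0
    while pc < len(program):
        instruction = program[pc]                  # instruction fetch
        op, dest, src1, src2 = instruction         # instruction decode
        a, b = registers[src1], registers[src2]    # operand fetch
        if op == "add":                            # execute
            result = a + b
        elif op == "sub":
            result = a - b
        else:
            raise ValueError(f"unknown opcode: {op}")
        registers[dest] = result                   # result store
        pc += 1                                    # next instruction
    return registers


regs = run(
    [("add", "r3", "r1", "r2"), ("sub", "r4", "r3", "r1")],
    {"r1": 5, "r2": 7, "r3": 0, "r4": 0},
)
print(regs["r3"], regs["r4"])
```

A branch instruction would differ only in the last stage: instead of `pc += 1`, it would assign a new value to `pc`.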

Instruction and Data Memory: Unified or Separate

[Figure: programmer's view (a computer program made of instructions such as ADD, SUBTRACT, AND, OR, COMPARE) versus computer's view (CPU, memory, I/O)]

Princeton (Von Neumann) Architecture
- Data and instructions mixed in same unified memory
- Program as data
- Storage utilization
- Single memory interface

Harvard Architecture
- Data and instructions in separate memories
- Has advantages in certain high-performance implementations
- Can optimize each memory

Classifying Instruction Set Architectures

There are four types of internal storage used by the processor to store operands, explicitly and implicitly, for execution of a program:
1. Stack
2. Accumulator
3. Set of Registers (Register-Memory)
4. Set of Registers (Register-Register/load-store)

The operands in a stack architecture are implicitly on the top of the stack, and in an accumulator architecture one operand is implicitly the accumulator. The general-purpose register architectures (Register-Memory and Register-Register) have only explicit operands, either in registers or memory locations.

Basic Addressing Classes

Declining cost of registers

Operand locations for four instruction set architecture classes

The arrows indicate whether the operand is an input or the result of the ALU operation, or both an input and result. Lighter shades indicate inputs, and the dark shade indicates the result. In (a), a Top Of Stack register (TOS) points to the top input operand, which is combined with the operand below. The first operand is removed from the stack, the result takes the place of the second operand, and TOS is updated to point to the result. All operands are implicit. In (b), the Accumulator is both an implicit input operand and a result. In (c), one input operand is a register, one is in memory, and the result goes to a register. All operands are registers in (d) and, like the stack architecture, can be transferred to memory only via separate instructions: push or pop for (a) and load or store for (d).

Code Sequence for C = A + B

Stack | Accumulator | Register-memory | Register-register
Push A | Load A | Load R1,A | Load R1,A
Push B | Add B | Add R3,R1,B | Load R2,B
Add | Store C | Store R3,C | Add R3,R1,R2
Pop C | | | Store R3,C

The code sequence for C = A + B for four classes of instruction sets. Note that the Add instruction has implicit operands for stack and accumulator architectures, and explicit operands for register architectures. It is assumed that A, B, and C all belong in memory and that the values of A and B cannot be destroyed.
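To make the implicit-versus-explicit distinction concrete, here is a rough sketch of the accumulator and register-register sequences as plain functions. Memory is modelled as a dictionary keyed by variable name, which is an assumption for illustration only; the comments show the instruction each line stands for.

```python
def run_accumulator(memory):
    """C = A + B on an accumulator machine: one operand is always implicit."""
    acc = memory["A"]           # Load A
    acc = acc + memory["B"]     # Add B
    memory["C"] = acc           # Store C
    return memory


def run_load_store(memory):
    """C = A + B on a register-register machine: every operand is named."""
    regs = {}
    regs["R1"] = memory["A"]                # Load R1,A
    regs["R2"] = memory["B"]                # Load R2,B
    regs["R3"] = regs["R1"] + regs["R2"]    # Add R3,R1,R2
    memory["C"] = regs["R3"]                # Store R3,C
    return memory


print(run_accumulator({"A": 2, "B": 3, "C": 0}))
print(run_load_store({"A": 2, "B": 3, "C": 0}))
```

Both versions leave A and B untouched. The load-store version needs one more instruction, but only its loads and store touch memory.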

Stack Architectures

Accumulator Architectures

Register-Set Architectures

Register-to-Register: Load-Store Architectures

Register-to-Memory Architectures

Memory-to-Memory Architectures

Instruction Formats

Instruction Set Architecture (ISA)

To command a computer's hardware, you must speak its language.

The words of a machine's language are called instructions, and its vocabulary is called an instruction set.

Once you learn one machine language, it is easy to pick up others:
- There are few fundamental operations that all computers must provide
- All designers have the same goal of finding a language that simplifies building the hardware and the compiler while maximizing performance and minimizing cost

Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept.

The MIPS instruction set is used as a case study.

Interface Design

A good interface:
- Lasts through many implementations (portability, compatibility)
- Is used in many different ways (generality)
- Provides convenient functionality to higher levels
- Permits an efficient implementation at lower levels

Design decisions must take into account:
- Technology
- Machine organization
- Programming languages
- Compiler technology
- Operating systems

[Figure: a single interface shared by several uses above it and several implementations below it over time]

Classifying Instruction Set Architectures

Accumulator Architecture
• Common in early stored-program computers, when hardware was so expensive
• Machine has only one register (accumulator) involved in all math and logical operations
• All operations assume the accumulator as a source operand and a destination for the operation, with the other operand stored in memory

Extended Accumulator Architecture
• Dedicated registers for specific operations, e.g. stack and array index registers, added
• The 8086 microprocessor is an example of such a special-purpose register architecture

General-Purpose Register Architecture
• MIPS is an example of such an architecture, where registers are not tied to a single role
• This type of instruction set can be further divided into:
  • Register-memory: allows for one operand to be in memory
  • Register-register (load-store): demands all operands to be in registers

Machine | # general-purpose registers | Architecture style | Year
Motorola 6800 | 2 | Accumulator | 1974
DEC VAX | 16 | Register-memory, memory-memory | 1977
Intel 8086 | 1 | Extended accumulator | 1978
Motorola 68000 | 16 | Register-memory | 1980
Intel 80386 | 8 | Register-memory | 1985
PowerPC | 32 | Load-store | 1992
DEC Alpha | 32 | Load-store | 1992

Compact Code and Stack Architectures

When memory is scarce, machines like the Intel 80x86 had variable-length instructions to match varying operand specifications and minimize code size.

Stack machines abandoned registers altogether, arguing that it is hard for compilers to use them efficiently.

Operands are to be pushed on a stack from memory, and the results have to be popped from the stack to memory.

Operations take their operands by default from the top of the stack and insert the results back onto the stack.

Stack machines simplify compilers and lend themselves to a compact instruction encoding, but limit compiler optimization (e.g. in math expressions).

Example: A = B + C
  Push AddressC   # Top=Top+4; Stack[Top]=Memory[AddressC]
  Push AddressB   # Top=Top+4; Stack[Top]=Memory[AddressB]
  add             # Stack[Top-4]=Stack[Top]+Stack[Top-4]; Top=Top-4
  Pop AddressA    # Memory[AddressA]=Stack[Top]; Top=Top-4

Compact code is important for heralded network computers, where programs must be downloaded over the Internet (e.g. Java-based applications).
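The Push/add/Pop semantics in the example can be traced with a minimal sketch that keeps an explicit Top pointer moving one word at a time. The function name, the dictionary-based memory and the 4-byte word size are assumptions for illustration.

```python
def run_stack_program(program, memory, word=4):
    """Execute push/add/pop with an explicit Top pointer that moves one word at a time."""
    stack = {}   # stack storage, keyed by byte offset
    top = 0
    for op, *operand in program:
        if op == "push":        # Top=Top+4; Stack[Top]=Memory[addr]
            top += word
            stack[top] = memory[operand[0]]
        elif op == "add":       # Stack[Top-4]=Stack[Top]+Stack[Top-4]; Top=Top-4
            stack[top - word] = stack[top] + stack[top - word]
            top -= word
        elif op == "pop":       # Memory[addr]=Stack[Top]; Top=Top-4
            memory[operand[0]] = stack[top]
            top -= word
        else:
            raise ValueError(f"unknown operation: {op}")
    return memory


memory = run_stack_program(
    [("push", "C"), ("push", "B"), ("add",), ("pop", "A")],
    {"A": 0, "B": 2, "C": 3},
)
print(memory["A"])
```

Only push and pop carry an address. The add names no operands at all, which is where the compact encoding comes from.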

Other Types of Architecture

High-Level-Language Architecture
• In the 1960s, systems software was rarely written in high-level languages, and virtually every commercial operating system before Unix was written in assembly
• Some people blamed the code density on the instruction set rather than the programming language
• A machine design philosophy was advocated with the goal of making the hardware more like high-level languages
• The effectiveness of high-level languages, memory size limitations and the lack of efficient compilers doomed this philosophy to a historical footnote

Reduced Instruction Set Architecture
• With the recent development in compiler technology and expanded memory sizes, fewer programmers are using assembly-level coding
• Instruction set architecture became measurable in the way compilers, rather than programmers, use them
• RISC architecture favors simplifying hardware design over enriching the offered set of instructions, relying on compilers to use them effectively to perform complex operations
• Virtually all new architectures since 1982 follow the RISC philosophy of fixed instruction lengths, load-store operations, and limited addressing modes

Evolution of Instruction Sets

Single Accumulator (EDSAC, 1950)
  -> Accumulator + Index Registers (Manchester Mark I, IBM 700 series, 1953)
  -> Separation of Programming Model from Implementation
     - High-level Language Based (B5000, 1963)
     - Concept of a Family (IBM 360, 1964)
  -> General Purpose Register Machines
     - Complex Instruction Sets (VAX, Intel 432, 1977-80)
     - Load/Store Architecture (CDC 6600, Cray 1, 1963-76)
  -> RISC (MIPS, SPARC, HP-PA, IBM RS6000, ..., 1987)

Register-Memory Architectures

Effect of the number of memory operands:

Number of memory addresses | Max. number of operands | Examples
0 | 3 | SPARC, MIPS, PowerPC, ALPHA
1 | 2 | Intel 80x86, Motorola 68000
2 | 2 | VAX (also has 3-operand format)
3 | 3 | VAX (also has 2-operand format)

Interpreting Memory Addressing

- The address of a word matches the byte address of one of its 4 bytes
- The addresses of sequential words differ by 4 (word size in bytes)
- Words' addresses are multiples of 4 (alignment restriction)
- Machines that use the address of the leftmost byte as the word address are called "Big Endian", and those that use the rightmost byte are called "Little Endian"
- Misalignment complicates memory access and causes programs to run slower (some machines do not allow misaligned memory access at all)
- Byte ordering can be a problem when exchanging data among different machines
- Byte addresses affect array index calculation, to account for word addressing and the offset within the word

Object addressed | Aligned at byte offsets | Misaligned at byte offsets
Byte | 0,1,2,3,4,5,6,7 | Never
Half word | 0,2,4,6 | 1,3,5,7
Word | 0,4 | 1,2,3,5,6,7
Double word | 0 | 1,2,3,4,5,6,7
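Both ideas on this slide, byte ordering and alignment, can be checked with a few lines using the standard `struct` module. The helper name `is_aligned` and the sample value `0x0A0B0C0D` are illustrative choices, not taken from the slides.

```python
import struct


def is_aligned(address, size):
    """An object of `size` bytes is aligned when its address is a multiple of its size."""
    return address % size == 0


word = 0x0A0B0C0D
big = struct.pack(">I", word)      # most significant (leftmost) byte at the lowest address
little = struct.pack("<I", word)   # least significant (rightmost) byte at the lowest address

print(big.hex(), little.hex())
print([offset for offset in range(8) if is_aligned(offset, 4)])   # aligned word offsets
print([offset for offset in range(8) if is_aligned(offset, 2)])   # aligned half-word offsets
```

The two packed byte strings hold the same 32-bit value in opposite byte order, which is the mismatch that causes trouble when machines exchange raw data.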

Addressing Modes

Addressing modes refer to how to specify the location of an operand (effective address).

Addressing modes have the ability to:
- Significantly reduce instruction counts
- Increase the average CPI
- Increase the complexity of building a machine

The VAX machine is used for benchmark data since it supports a wide range of memory addressing modes.

Famous addressing modes can be classified based on:
- the source of the data, into register, immediate or memory
- the address calculation, into direct and indirect

An indexed addressing mode is usually provided to allow efficient implementation of loops and array access.

Example of Addressing Modes

Addressing mode | Example | Meaning | When used
Register | Add R4,R3 | Regs[R4] <- Regs[R4] + Regs[R3] | When a value is in a register
Immediate | Add R4,#3 | Regs[R4] <- Regs[R4] + 3 | For constants
Displacement | Add R4,100(R1) | Regs[R4] <- Regs[R4] + Mem[100 + Regs[R1]] | Accessing local variables
Register indirect | Add R4,(R1) | Regs[R4] <- Regs[R4] + Mem[Regs[R1]] | Accessing using a pointer or a computed address
Indexed | Add R4,(R1 + R2) | Regs[R4] <- Regs[R4] + Mem[Regs[R1] + Regs[R2]] | Sometimes useful in array addressing: R1 = base of the array; R2 = index amount
Direct or absolute | Add R4,(1001) | Regs[R4] <- Regs[R4] + Mem[1001] | Sometimes useful for accessing static data; address constant may need to be large
Memory indirect or memory deferred | Add R4,@(R3) | Regs[R4] <- Regs[R4] + Mem[Mem[Regs[R3]]] | If R3 is the address of the pointer p, then mode yields *p
Autoincrement | Add R4,(R2)+ | Regs[R4] <- Regs[R4] + Mem[Regs[R2]]; Regs[R2] <- Regs[R2] + d | Useful for stepping through arrays within a loop. R2 points to start of the array; each reference increments R2 by d
Autodecrement | Add R4,-(R2) | Regs[R2] <- Regs[R2] - d; Regs[R4] <- Regs[R4] + Mem[Regs[R2]] | Same use as autoincrement. Autodecrement/increment can also act as push/pop to implement a stack
Scaled | Add R4,100(R2)[R3] | Regs[R4] <- Regs[R4] + Mem[100 + Regs[R2] + Regs[R3] × d] | Used to index arrays
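For the memory-referencing modes in the table, the effective address is a small arithmetic expression over registers and constants. The sketch below covers a subset of the modes; the function name, keyword parameters and register values are hypothetical, chosen only to mirror the "Meaning" column.

```python
def effective_address(mode, regs, base=None, index=None, disp=0, scale=1):
    """Return the memory address that an operand specifier refers to."""
    if mode == "displacement":        # disp(Rbase)
        return disp + regs[base]
    if mode == "register_indirect":   # (Rbase)
        return regs[base]
    if mode == "indexed":             # (Rbase + Rindex)
        return regs[base] + regs[index]
    if mode == "direct":              # (disp)
        return disp
    if mode == "scaled":              # disp(Rbase)[Rindex], index scaled by operand size d
        return disp + regs[base] + regs[index] * scale
    raise ValueError(f"unsupported mode: {mode}")


regs = {"R1": 1000, "R2": 2000, "R3": 3}
print(effective_address("displacement", regs, base="R1", disp=100))
print(effective_address("indexed", regs, base="R1", index="R2"))
print(effective_address("scaled", regs, base="R2", index="R3", disp=100, scale=4))
```

Register and immediate modes are left out because they do not produce a memory address, and memory indirect would need a second memory lookup on the result.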

Addressing Mode for Signal Processing

DSP offers special addressing modes to better serve popular algorithms.

Modulo addressing
- Since DSP deals with continuous data streams, circular buffers are widely used
- Circular or modulo addressing allows automatic increment and decrement, and resets the pointer when reaching the end of the buffer

Reverse addressing
- The resulting address is the reverse order of the current address
- Reverse addressing mode expedites the access, which otherwise requires a number of logical instructions or extra memory accesses

Fast Fourier Transform

Address | Bit-reversed address
0 (000) | 0 (000)
1 (001) | 4 (100)
2 (010) | 2 (010)
3 (011) | 6 (110)
4 (100) | 1 (001)
5 (101) | 5 (101)
6 (110) | 3 (011)
7 (111) | 7 (111)

Special features require either hand coding or a compiler that uses such features (GNU would not be a good choice).
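Both DSP addressing modes are easy to express in software, which also shows the work the hardware saves. The function names and the buffer parameters below are assumptions for illustration.

```python
def modulo_next(pointer, start, length, step=1):
    """Advance a circular-buffer pointer, wrapping at the end of the buffer."""
    return start + (pointer - start + step) % length


def bit_reverse(address, bits):
    """Reverse the low `bits` bits of an address (FFT-style reordering)."""
    reversed_address = 0
    for _ in range(bits):
        reversed_address = (reversed_address << 1) | (address & 1)
        address >>= 1
    return reversed_address


print([bit_reverse(i, 3) for i in range(8)])
print(modulo_next(pointer=107, start=100, length=8))
```

The software bit reversal needs a loop of shifts and masks per address, which is the "number of logical instructions" that a dedicated addressing mode removes.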

Operations of the Computer Hardware

"There must certainly be instructions for performing the fundamental arithmetic operations."
Burks, Goldstine and von Neumann, 1947

Assembly language is a symbolic representation of what the processor actually understands.

The MIPS assembler allows only one instruction per line and ignores comments following # until the end of the line.

Example: translation of a segment of a C program to MIPS assembly instructions

C: f = (g + h) - (i + j)

MIPS:
  add t0, g, h    # temp variable t0 contains "g + h"
  add t1, i, j    # temp variable t1 contains "i + j"
  sub f, t0, t1   # f = t0 - t1 = (g + h) - (i + j)

Operations in the Instruction Set

Operator type | Examples
Arithmetic and logical | Integer arithmetic and logical operations: add, and, subtract, or
Data transfer | Loads-stores (move instructions on machines with memory addressing)
Control | Branch, jump, procedure call and return, trap
System | Operating system call, virtual memory management instructions
Floating point | Floating point instructions: add, multiply
Decimal | Decimal add, decimal multiply, decimal to character conversion
String | String move, string compare, string search
Graphics | Pixel operations, compression/decompression operations

- Arithmetic, logical, data transfer and control are almost standard categories for all machines
- System instructions are required for multi-programming environments, although support for system functions varies
- Decimal and string instructions can be primitives, e.g. on the IBM 360 and the VAX
- Support for floating point, decimal, string and graphics is sometimes provided optionally via a co-processor
- Some machines rely on the compiler to synthesize special operations, such as string handling, from simpler instructions

Operations for Media & Signal Processing

Single instruction multiple data (SIMD) and vector instructions are often supported in DSPs, which are commonly used in multimedia and signal processing applications.

Partitioned Add (integer)
- Perform multiple 16-bit additions on a 64-bit ALU, since most data are narrow
- Increases ALU throughput for multimedia applications

Paired single operations (float)
- Allow the same register to act as two operands to the same operation
- Handy in dealing with vertices and coordinates

Multiply and accumulate
- Very handy for calculating dot products of vectors (signal processing) and matrix multiplication
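A partitioned add and a multiply-accumulate loop can both be imitated in software. This is a sketch under assumptions: four 16-bit lanes in a 64-bit word, wrap-around (not saturating) lane arithmetic, and illustrative function names.

```python
def partitioned_add(x, y, lane_bits=16, lanes=4):
    """Add independent lanes packed in one word; carries do not cross lane boundaries."""
    lane_mask = (1 << lane_bits) - 1
    result = 0
    for lane in range(lanes):
        shift = lane * lane_bits
        lane_sum = ((x >> shift) & lane_mask) + ((y >> shift) & lane_mask)
        result |= (lane_sum & lane_mask) << shift   # wrap within the lane
    return result


def dot_product(a, b):
    """Dot product expressed as repeated multiply-accumulate steps."""
    acc = 0
    for x, y in zip(a, b):
        acc += x * y   # one multiply-accumulate
    return acc


print(hex(partitioned_add(0x0001_FFFF_0003_0004, 0x0001_0001_0005_0006)))
print(dot_product([1, 2, 3], [4, 5, 6]))
```

In the first call, the lane holding 0xFFFF + 0x0001 wraps to zero without disturbing its neighbour, which is the difference between a partitioned add and an ordinary 64-bit add.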

Frequency of Operations Usage

The most widely executed instructions are the simple operations of an instruction set.

The following is the average usage in SPECint92 on Intel 80x86:

Rank | 80x86 instruction | Integer average (% total executed)
1 | Load | 22%
2 | Conditional branch | 20%
3 | Compare | 16%
4 | Store | 12%
5 | Add | 8%
6 | And | 6%
7 | Sub | 5%
8 | Move register-register | 4%
9 | Call | 1%
10 | Return | 1%
Total | | 96%

Make the common case fast by focusing on these operations.

Control Flow Instructions

- Jump for unconditional change in the control flow
- Branch for conditional change in the control flow
- Procedure calls and returns

Data is based on SPEC on Alpha.

Destination Address Definition

Relative addressing with respect to the program counter proved to be the best choice for forward and backward branching or jumps (load address independent).

To allow for dynamic loading of library routines, register indirect addressing allows addresses to be loaded in special registers (e.g. virtual functions in C++ and system calls in a case statement).

Data is based on SPEC on Alpha.
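PC-relative addressing can be sketched as a one-line computation. The convention assumed here is the MIPS-style one: the offset is counted in instructions, relative to the instruction following the branch. Other architectures define the base and the units differently, and the function name is hypothetical.

```python
def branch_target(pc, word_offset, instruction_bytes=4):
    """Target of a PC-relative branch, offset counted in instructions from the next PC."""
    return pc + instruction_bytes + word_offset * instruction_bytes


print(hex(branch_target(0x1000, 3)))    # forward branch
print(hex(branch_target(0x1000, -2)))   # backward branch

# Load address independence: moving the code moves the target by the same amount.
relocation = 0x4000
print(branch_target(0x1000 + relocation, 3) - branch_target(0x1000, 3) == relocation)
```

Because the encoded offset never mentions an absolute address, the same instruction bits work wherever the program is loaded.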

Condition Evaluation

Compare-and-branch can be efficient if the majority of conditions are comparisons with zero.

Remember to focus on the common case.

Based on SPEC on MIPS.

Frequency of Types of Comparison

Data is based on SPEC on Alpha.

A different benchmark and machine set new design priorities.

DSPs support a repeat instruction for "for" loops (vectors) using registers.

Supporting Procedures

Execution of a procedure follows these steps:
- Store parameters in a place accessible to the procedure
- Transfer control to the procedure
- Acquire the storage resources needed for the procedure
- Perform the desired task
- Store the result value in a place accessible to the calling program
- Return control to the point of origin

The hardware provides a program counter to trace instruction flow and manage transfer of control.

Parameter Passing
- Registers can be used for passing a small number of parameters
- A stack is used to spill registers of the current context and make room for the called procedure to run, and to allow for large parameters to be passed
- Storage of machine state can be performed by caller or callee
- Handling of shared variables is important to ensure correct semantics and thus requires clear specifications in the library interface

Global variables stored in registers need careful handling.

Type and Size of Operands

- The type of an operand is designated by encoding it in the instruction's operation code
- The type of an operand, e.g. single-precision float, effectively gives its size
- Common operand types include character, half-word and word-size integer, and single- and double-precision floating point
- Characters are almost always in ASCII, integers are in 2's complement, and floating point is in IEEE 754
- The 16-bit Unicode used in Java is gaining popularity due to its support for the international character sets
- For business applications, some architectures support a decimal format in binary coded decimal (BCD)
- Depending on the size of the word, the complexity of handling different operand types differs
- DSP offers fixed-point data types to support high-precision floating point arithmetic and to allow sharing a single exponent for multiple numbers
- For graphics applications, vertex and pixel operands are added features

Size of Operands

Frequency of reference by size, based on SPEC2000 on Alpha.

- Double-word data type is used for double-precision floating point operations and address storage in machines with a 64-bit-wide address bus
- Words are used for integer operations and for 32-bit address bus machines
- Because of the mix in SPEC, word and double-word data types dominate

Instruction Representation

- Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary, or base 2)
- Numbers are stored in computers as a series of high and low electronic signals (binary numbers)
- Binary digits are called bits and are considered the atom of computing
- Each piece of an instruction is a number, and placing these numbers together forms the instruction
- The assembler translates the symbolic assembly instructions into machine language instructions (machine code)

Example:

Assembly: add $t0, $s1, $s2

M/C language (decimal):

    0       17      18      8       0       32

M/C language (binary):

    000000  10001   10010   01000   00000   100000
    6 bits  5 bits  5 bits  5 bits  5 bits  6 bits

Note: the MIPS compiler by default maps $s0,...,$s7 to registers 16-23 and $t0,...,$t7 to registers 8-15
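Placing the field numbers together to form the instruction word can be sketched in a few lines of Python. The function name `encode_r_type` is an assumption made for this illustration. The field widths (6, 5, 5, 5, 5, 6 bits) and the register numbers for `add $t0, $s1, $s2` are the standard MIPS ones.

```python
def encode_r_type(op, rs, rt, rd, shamt, funct):
    """Pack the six R-format fields into one 32-bit instruction word."""
    fields = [(op, 6), (rs, 5), (rt, 5), (rd, 5), (shamt, 5), (funct, 6)]
    word = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"field value {value} does not fit in {width} bits")
        word = (word << width) | value  # append this field to the right
    return word

if __name__ == "__main__":
    # add $t0, $s1, $s2: rs = $s1 = 17, rt = $s2 = 18, rd = $t0 = 8, funct = 32
    word = encode_r_type(0, 17, 18, 8, 0, 32)
    print(format(word, "032b"))
    print(hex(word))
```

Each field is shifted in from the right, so the first field (op) ends up in the most significant bits, matching the left-to-right layout of the format.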

Encoding an Instruction Set

- Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation
- The operation is typically specified in one field, called the opcode
- The addressing mode for the operand can be encoded with the operation, or specified through a separate identifier when a large number of modes is supported
- The architecture must balance several competing factors:
  - Desire to support as many registers and addressing modes as possible
  - Effect of operand specification on the size of the instruction (program)
  - Desire to simplify instruction fetching and decoding during execution
- Fixed-size instruction encoding simplifies the CPU design while limiting the addressing modes supported
- An architect who cares about code size can use variable-size encoding
- A hybrid approach is to allow variability by supporting multiple-sized instructions

Encoding Examples

MIPS Instruction format

Register-format instructions:

    op      rs      rt      rd      shamt   funct
    6 bits  5 bits  5 bits  5 bits  5 bits  6 bits

- op: Basic operation of the instruction, traditionally called the opcode
- rs: The first register source operand
- rt: The second register source operand
- rd: The register destination operand; it gets the result of the operation
- shamt: Shift amount
- funct: This field selects the specific variant of the operation in the op field

Immediate-type instructions:

    op      rs      rt      address
    6 bits  5 bits  5 bits  16 bits

- Some instructions need longer fields than those provided above, for example for large constants
- The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register
- Example: lw $t0, 32($s3)   # Temporary register $t0 gets A[8]

    Instruction  Format  op  rs   rt   rd   shamt  funct  address
    add          R       0   reg  reg  reg  0      32     N/A
    sub          R       0   reg  reg  reg  0      34     N/A
    lw           I       35  reg  reg  N/A  N/A    N/A    address
    sw           I       43  reg  reg  N/A  N/A    N/A    address
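The immediate format can be explored from the other direction by decoding a word back into its fields. The sketch below is illustrative: the function names are assumptions, and the sample word is a hand-assembled `lw $t0, 32($s3)` (op = 35, rs = $s3 = 19, rt = $t0 = 8). Sign-extending the 16-bit field is what gives the reach of ±2^15 bytes around the base register.

```python
def sign_extend_16(value):
    """Interpret the low 16 bits of value as a signed two's complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

def decode_i_type(word):
    """Split a 32-bit I-format word into op, rs, rt and the signed address field."""
    return {
        "op": (word >> 26) & 0x3F,   # top 6 bits
        "rs": (word >> 21) & 0x1F,   # next 5 bits: base register
        "rt": (word >> 16) & 0x1F,   # next 5 bits: register being loaded or stored
        "address": sign_extend_16(word),
    }

if __name__ == "__main__":
    print(decode_i_type(0x8E680020))
    # Smallest and largest offsets that fit in the 16-bit field.
    print(sign_extend_16(0x8000), sign_extend_16(0x7FFF))
```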

The Stored Program Concept

- Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept
- Today's computers are built on two key principles:
  - Instructions are represented as numbers
  - Programs can be stored in memory to be read or written, just like numbers

The power of the concept: memory can contain

- the source code for an editor
- the compiled m/c code for the editor
- the text that the compiled program is using
- the compiler that generated the code

[Figure: a processor connected to a memory that holds an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and the source code in C for the editor program]

Compiling if-then-else in MIPS

Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiler's code for the following C if statement?

    if (i == j) f = g + h; else f = g - h;

[Figure: flowchart testing i == j, with f = g + h on the i = j path and f = g - h on the i ≠ j (Else) path, both joining at Exit]

MIPS:

          bne $s3, $s4, Else    # go to Else if i ≠ j
          add $s0, $s1, $s2     # f = g + h (skipped if i ≠ j)
          j   Exit
    Else: sub $s0, $s1, $s2     # f = g - h (skipped if i = j)
    Exit:
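The control flow of this branch sequence can be traced with a tiny interpreter. This is a minimal sketch, not a MIPS simulator: the tuple-based instruction form and the names `run` and `compute_f` are assumptions made for this example, and only the four instructions used above are modelled.

```python
# The if-then-else sequence, written as tuples for the toy interpreter below.
IF_THEN_ELSE = [
    ("bne", "$s3", "$s4", "Else"),   # go to Else if i != j
    ("add", "$s0", "$s1", "$s2"),    # f = g + h
    ("j", "Exit"),
    ("label", "Else"),
    ("sub", "$s0", "$s1", "$s2"),    # f = g - h
    ("label", "Exit"),
]

def run(program, regs):
    """Execute the toy program, updating and returning the register dict."""
    labels = {ins[1]: i for i, ins in enumerate(program) if ins[0] == "label"}
    pc = 0
    while pc < len(program):
        ins = program[pc]
        op = ins[0]
        if op == "add":
            regs[ins[1]] = regs[ins[2]] + regs[ins[3]]
        elif op == "sub":
            regs[ins[1]] = regs[ins[2]] - regs[ins[3]]
        elif op == "bne" and regs[ins[1]] != regs[ins[2]]:
            pc = labels[ins[3]]      # branch taken
            continue
        elif op == "j":
            pc = labels[ins[1]]      # unconditional jump
            continue
        pc += 1                      # labels and untaken branches fall through
    return regs

def compute_f(g, h, i, j):
    """Return f after running the sequence with the given g, h, i, j."""
    regs = {"$s0": 0, "$s1": g, "$s2": h, "$s3": i, "$s4": j}
    return run(IF_THEN_ELSE, regs)["$s0"]

if __name__ == "__main__":
    print(compute_f(5, 3, 1, 1))
    print(compute_f(5, 3, 1, 2))
```

The branch tests the opposite condition (i ≠ j) so that the then-part can follow it directly without an extra jump.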

Typical Compilation

Major Types of Optimization

Columns: optimization name, explanation, frequency (shown only where the value survives in the transcript)

High-level (at or near source level; machine-independent)
- Procedure integration: replace procedure call by procedure body. Frequency: N.M.

Local (within straight-line code)
- Common subexpression elimination: replace two instances of the same computation by a single copy
- Constant propagation: replace all instances of a variable that is assigned a constant with the constant
- Stack height reduction: rearrange expression tree to minimize resources needed for expression evaluation. Frequency: N.M.

Global (across a branch)
- Global common subexpression elimination: same as local, but this version crosses branches
- Copy propagation: replace all instances of a variable A that has been assigned X (i.e., A = X) with X
- Code motion: remove code from a loop that computes the same value each iteration of the loop
- Induction variable elimination: simplify/eliminate array addressing calculations within loops

Machine-dependent (depends on machine knowledge)
- Strength reduction: many examples, such as replacing multiply by a constant with adds and shifts. Frequency: N.M.
- Pipeline scheduling: reorder instructions to improve pipeline performance. Frequency: N.M.
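Three of these transformations can be shown by hand on a small loop. The sketch below is an illustration under stated assumptions: the two function names and the loop itself are invented for this example, and the "optimized" version is written manually to mirror what a compiler might do, not produced by any real compiler.

```python
def address_sum_naive(base, n, a, b):
    """Straightforward version, as a compiler might first see it."""
    total = 0
    for i in range(n):
        offset = a * b                             # same value every iteration
        total += base + i * 4 + offset + a * b     # a * b computed a second time
    return total

def address_sum_optimized(base, n, a, b):
    """Same result after hand-applied optimizations."""
    offset = a * b                    # code motion: hoisted out of the loop
    fixed = base + offset + offset    # common subexpression: a * b reused, not recomputed
    total = 0
    for i in range(n):
        total += fixed + (i << 2)     # strength reduction: multiply by 4 becomes a shift
    return total

if __name__ == "__main__":
    print(address_sum_naive(100, 10, 3, 7))
    print(address_sum_optimized(100, 10, 3, 7))
```

Both functions return the same value for non-negative `n`; the second simply does less work per iteration.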

Effect of Compiler Optimization

Measurements taken on S

[Figure: results by program and compiler optimization level]

- Level 0: non-optimized code
- Level 1: local optimization
- Level 2: global optimization, s/w pipelining
- Level 3: adds procedure integration

Compiler Support for Multimedia Instr.

- Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics)
- Intel added a new set of instructions, called Streaming SIMD Extension
- A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer
- Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations
  - Strided addressing allows memory access in increments larger than one
  - Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data
- Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup
- Such limited support for vector processing makes the use of vectorizing compiler optimization unpopular and restricts its scope to hand-coded routines
- SIMD instructions on MMX and AltiVec tend to be solutions, not primitives
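The difference between strided and gather/scatter addressing can be sketched with plain Python lists standing in for memory. The function names and the list-based memory model are assumptions made for this illustration; real vector hardware performs these accesses in a single instruction.

```python
def strided_load(memory, base, stride, count):
    """Load count elements starting at base, stepping by stride each time."""
    return [memory[base + k * stride] for k in range(count)]

def gather_load(memory, index_vector):
    """Load the elements whose addresses are listed in index_vector."""
    return [memory[address] for address in index_vector]

def scatter_store(memory, index_vector, values):
    """Store each value at the address given by the matching index entry."""
    for address, value in zip(index_vector, values):
        memory[address] = value

if __name__ == "__main__":
    memory = list(range(100, 116))
    print(strided_load(memory, 0, 4, 4))   # every fourth element, e.g. one matrix column
    print(gather_load(memory, [3, 9, 1]))  # arbitrary locations named by an index vector
    scatter_store(memory, [3, 9, 1], [0, 0, 0])
    print(memory)
```

Without strided access, each of these elements would need its own scalar load before the vector operation could start, which is the lost speedup the slide refers to.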

Starting a Program

[Figure: translation hierarchy. C program -> Compiler -> Assembly language program -> Assembler -> Object: machine language module (plus Object: library routine, machine language) -> Linker -> Executable: machine language program -> Loader -> Memory]

Linker:

- Place code & data modules symbolically in memory
- Determine the address of data & instruction labels
- Patch both internal & external references

Object files for Unix typically contain:

- Header: size & position of components
- Text segment: machine code
- Data segment: static and dynamic variables
- Relocation info: identifies absolute memory references
- Symbol table: name & location of labels, procedures and variables
- Debugging info: mapping of source to object code, break points, etc.
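The three linker steps can be sketched with a toy object-file model. Everything here is hypothetical and chosen for illustration: the dictionary layout of an object module, the `(opcode, operand)` tuples, the word-addressed memory, and the names `link`, `MAIN_OBJ` and `LIB_OBJ`. Real object formats are considerably richer.

```python
# Hypothetical object modules: code words, a symbol table (label -> offset),
# and relocation info (offset of a word whose operand needs a label's address).
MAIN_OBJ = {
    "name": "main.o",
    "code": [("nop", None), ("call", None), ("halt", None)],
    "symbols": {"main": 0},
    "relocs": [(1, "printf")],
}
LIB_OBJ = {
    "name": "lib.o",
    "code": [("work", None), ("ret", None)],
    "symbols": {"printf": 0},
    "relocs": [],
}

def link(modules, text_base=0):
    """Place modules, resolve label addresses, and patch references."""
    # Step 1: place the code modules one after another in memory.
    placement = {}
    address = text_base
    for module in modules:
        placement[module["name"]] = address
        address += len(module["code"])
    # Step 2: determine the absolute address of every label.
    symbol_table = {}
    for module in modules:
        for label, offset in module["symbols"].items():
            symbol_table[label] = placement[module["name"]] + offset
    # Step 3: patch internal and external references using the relocation info.
    image = []
    for module in modules:
        code = list(module["code"])
        for offset, label in module["relocs"]:
            opcode, _ = code[offset]
            code[offset] = (opcode, symbol_table[label])
        image.extend(code)
    return image, symbol_table

if __name__ == "__main__":
    image, symbols = link([MAIN_OBJ, LIB_OBJ], text_base=0x400000)
    print(symbols)
    print(image)
```

The relocation entries are what let the assembler emit a module without knowing where it, or the library routine it calls, will finally live.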

Loading Executable Program

[Figure: MIPS memory layout, from top to bottom: Stack ($sp at 7fff ffff hex), Dynamic data, Static data ($gp at 1000 8000 hex, segment starting at 1000 0000 hex), Text (pc at 0040 0000 hex), Reserved (down to 0)]

To load an executable, the operating system follows these steps:

- Reads the executable file header to determine the size of the text and data segments
- Creates an address space large enough for the text and data
- Copies the instructions and data from the executable file into memory
- Copies the parameters (if any) to the main program onto the stack
- Initializes the machine registers and sets the stack pointer to the first free location
- Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program

Instruction Set Design Issues

Instruction set design issues:

- Number of Addresses
- Flow of Control
- Operand Types
- Addressing Modes
- Instruction Types
- Instruction Formats

Number of Addresses

Four categories:

- 3-address machines
  - 2 for the source operands and one for the result
- 2-address machines
  - One address doubles as source and result
- 1-address machines
  - Accumulator machines
  - Accumulator is used for one source and result
- 0-address machines
  - Stack machines
  - Operands are taken from the stack
  - Result goes onto the stack

Number of Addresses (cont.)

Three-address machines

- Two for the source operands, one for the result
- RISC processors use three addresses
- Sample instructions

    add  dest,src1,src2    ; M(dest) = [src1] + [src2]
    sub  dest,src1,src2    ; M(dest) = [src1] - [src2]
    mult dest,src1,src2    ; M(dest) = [src1] * [src2]

Three addresses: Operand 1, Operand 2, Result

Example: a = b + c

Three-address instruction formats are not common, because they require a relatively long instruction format to hold the three address references

Number of Addresses (cont.)

Example

C statement:

    A = B + C * D - E + F + A

Equivalent code:

    mult T,C,D    ; T = C*D
    add  T,T,B    ; T = B+C*D
    sub  T,T,E    ; T = B+C*D-E
    add  T,T,F    ; T = B+C*D-E+F
    add  A,T,A    ; A = B+C*D-E+F+A
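The three-address sequence can be checked by executing it against a small memory model. This is a sketch under assumptions: the dictionary used as memory, the tuple instruction form, and the names `run_three_address` and `THREE_ADDRESS_CODE` are invented for the example.

```python
OPERATIONS = {
    "add": lambda x, y: x + y,
    "sub": lambda x, y: x - y,
    "mult": lambda x, y: x * y,
}

# The five-instruction sequence for A = B + C * D - E + F + A.
THREE_ADDRESS_CODE = [
    ("mult", "T", "C", "D"),
    ("add", "T", "T", "B"),
    ("sub", "T", "T", "E"),
    ("add", "T", "T", "F"),
    ("add", "A", "T", "A"),
]

def run_three_address(program, memory):
    """Execute (op, dest, src1, src2) tuples: M(dest) = [src1] op [src2]."""
    for op, dest, src1, src2 in program:
        memory[dest] = OPERATIONS[op](memory[src1], memory[src2])
    return memory

if __name__ == "__main__":
    memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6, "T": 0}
    print(run_three_address(THREE_ADDRESS_CODE, memory)["A"])
```

Five instructions suffice, but each one names three memory locations, which is the encoding cost the slide points out.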

Number of Addresses (cont.)

Two-address machines

- One address doubles (for source operand & result)
- The last example makes a case for it
  - Address T is used twice
- Sample instructions

    load dest,src    ; M(dest) = [src]
    add  dest,src    ; M(dest) = [dest] + [src]
    sub  dest,src    ; M(dest) = [dest] - [src]
    mult dest,src    ; M(dest) = [dest] * [src]

Two addresses: one address doubles as operand and result

Example: a = a + b

The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation

Number of Addresses (cont.)

Example

C statement:

    A = B + C * D - E + F + A

Equivalent code:

    load T,C    ; T = C
    mult T,D    ; T = C*D
    add  T,B    ; T = B+C*D
    sub  T,E    ; T = B+C*D-E
    add  T,F    ; T = B+C*D-E+F
    add  A,T    ; A = B+C*D-E+F+A

Number of Addresses (cont.)

One-address machines

- Use a special set of registers called accumulators
  - Specify one source operand & receive the result
- Called accumulator machines
- Sample instructions

    load  addr    ; accum = [addr]
    store addr    ; M[addr] = accum
    add   addr    ; accum = accum + [addr]
    sub   addr    ; accum = accum - [addr]
    mult  addr    ; accum = accum * [addr]

Number of Addresses (cont.)

Example

C statement:

    A = B + C * D - E + F + A

Equivalent code:

    load  C    ; load C into accum
    mult  D    ; accum = C*D
    add   B    ; accum = C*D+B
    sub   E    ; accum = B+C*D-E
    add   F    ; accum = B+C*D-E+F
    add   A    ; accum = B+C*D-E+F+A
    store A    ; store accum contents in A
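The accumulator sequence can be traced the same way, with the accumulator as the single implicit operand. The sketch is illustrative only: the function name `run_accumulator`, the tuple instruction form and the dictionary memory are assumptions for this example.

```python
# The seven-instruction accumulator sequence for A = B + C * D - E + F + A.
ACCUMULATOR_CODE = [
    ("load", "C"),
    ("mult", "D"),
    ("add", "B"),
    ("sub", "E"),
    ("add", "F"),
    ("add", "A"),
    ("store", "A"),
]

def run_accumulator(program, memory):
    """Execute (op, addr) tuples; return the memory and the access count."""
    accum = 0
    accesses = 0
    for op, addr in program:
        accesses += 1                  # every instruction touches memory once
        if op == "load":
            accum = memory[addr]
        elif op == "store":
            memory[addr] = accum
        elif op == "add":
            accum += memory[addr]
        elif op == "sub":
            accum -= memory[addr]
        elif op == "mult":
            accum *= memory[addr]
        else:
            raise ValueError(f"unknown instruction {op!r}")
    return memory, accesses

if __name__ == "__main__":
    memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
    final, accesses = run_accumulator(ACCUMULATOR_CODE, memory)
    print(final["A"], accesses)
```

Instructions are short because only one address is named, but every instruction goes to memory, which is the traffic cost of this style.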

Number of Addresses (cont.)

Zero-address machines

- Stack supplies operands and receives the result
  - Special instructions to load and store use an address
- Called stack machines (Ex: HP3000, Burroughs B5500)
- Sample instructions

    push addr    ; push([addr])
    pop  addr    ; pop([addr])
    add          ; push(pop + pop)
    sub          ; push(pop - pop)
    mult         ; push(pop * pop)

Number of Addresses (cont.)

Example

C statement:

    A = B + C * D - E + F + A

Equivalent code:

    push E
    push C
    push D
    mult
    push B
    add
    sub
    push F
    add
    push A
    add
    pop  A
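The stack sequence relies on the order of the two pops in `sub`, so it is worth tracing. The sketch below follows the sample-instruction semantics literally (the first value popped is the left operand). The names `run_stack` and `STACK_CODE`, the tuple form and the dictionary memory are assumptions for this example.

```python
# The twelve-instruction stack sequence for A = B + C * D - E + F + A.
STACK_CODE = [
    ("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
    ("push", "B"), ("add",), ("sub",),
    ("push", "F"), ("add",),
    ("push", "A"), ("add",),
    ("pop", "A"),
]

def run_stack(program, memory):
    """Execute the zero-address program; only push and pop name an address."""
    stack = []
    for instruction in program:
        op = instruction[0]
        if op == "push":
            stack.append(memory[instruction[1]])
        elif op == "pop":
            memory[instruction[1]] = stack.pop()
        else:
            first = stack.pop()        # top of stack
            second = stack.pop()       # element below it
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)   # push(pop - pop)
            elif op == "mult":
                stack.append(first * second)
            else:
                raise ValueError(f"unknown instruction {op!r}")
    return memory

if __name__ == "__main__":
    memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
    print(run_stack(STACK_CODE, memory)["A"])
```

E is pushed first precisely so that it sits below B + C*D when `sub` executes, giving (B + C*D) - E rather than the reverse.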

Load/Store Architecture

- Instructions expect operands in internal processor registers
  - Special LOAD and STORE instructions move data between registers and memory
- RISC uses this architecture
- Reduces instruction length

Load/Store Architecture (cont.)

Sample instructions

    load  Rd,addr       ; Rd = [addr]
    store addr,Rs       ; (addr) = Rs
    add   Rd,Rs1,Rs2    ; Rd = Rs1 + Rs2
    sub   Rd,Rs1,Rs2    ; Rd = Rs1 - Rs2
    mult  Rd,Rs1,Rs2    ; Rd = Rs1 * Rs2

Number of Addresses (cont.)

Example

C statement:

    A = B + C * D - E + F + A

Equivalent code:

    load  R1,B
    load  R2,C
    load  R3,D
    load  R4,E
    load  R5,F
    load  R6,A
    mult  R2,R2,R3
    add   R2,R2,R1
    sub   R2,R2,R4
    add   R2,R2,R5
    add   R2,R2,R6
    store A,R2

Flow of Control

- Default is sequential flow
- Several instructions alter this default execution
  - Branches
    - Unconditional
    - Conditional
    - Delayed branches
  - Procedure calls
    - Delayed procedure calls

Flow of Control (cont.)

Branches

- Unconditional
  - Absolute address
  - PC-relative
    - Target address is specified relative to PC contents
    - Relocatable code
- Example: MIPS
  - Absolute address

        j target

  - PC-relative

        b target


Flow of Control (cont.)

Branches

- Conditional
  - Jump is taken only if the condition is met
- Two types
  - Set-Then-Jump
    - Condition testing is separated from branching
    - Condition code registers are used to convey the condition test result
    - Condition code registers keep a record of the status of the last ALU operation, such as an overflow condition
  - Example: Pentium code

        cmp AX,BX     ; compare AX and BX
        je  target    ; jump if equal

Flow of Control (cont.)

  - Test-and-Jump
    - Single instruction performs condition testing and branching
  - Example: MIPS instruction

        beq Rsrc1,Rsrc2,target

    Jumps to target if Rsrc1 = Rsrc2

Delayed branching

- Control is transferred after executing the instruction that follows the branch instruction
  - This instruction slot is called the delay slot
- Improves efficiency
- Highly pipelined RISC processors support it

Flow of Control (cont.)

Procedure calls

- Facilitate modular programming
- Require two pieces of information to return
  - End of procedure
    - Pentium uses the ret instruction
    - MIPS uses the jr instruction
  - Return address
    - In a (special) register: MIPS allows any general-purpose register
    - On the stack: Pentium

Flow of Control (cont.)

[Figure: delay slot]

Parameter Passing

Two basic techniques

- Register-based (e.g., PowerPC, MIPS)
  - Internal registers are used
    - Faster
    - Limit the number of parameters
    - Recursive procedure
- Stack-based (e.g., Pentium)
  - Stack is used
    - More general

2 Operand Types

- Instructions support basic data types
  - Characters
  - Integers
  - Floating-point
- Instruction overload
  - Same instruction for different data types
  - Example: Pentium

        mov AL,address     ; loads an 8-bit value
        mov AX,address     ; loads a 16-bit value
        mov EAX,address    ; loads a 32-bit value

Operand Types (cont.)

- Separate instructions
  - Instructions specify the operand size
  - Example: MIPS

        lb Rdest,address    ; loads a byte
        lh Rdest,address    ; loads a halfword (16 bits)
        lw Rdest,address    ; loads a word (32 bits)
        ld Rdest,address    ; loads a doubleword (64 bits)

  - Similar instruction: store

3 Addressing Modes

- How the operands are specified
  - Operands can be in three places
    - Registers
      - Register addressing mode
    - Part of instruction
      - Constant
      - Immediate addressing mode
      - All processors support these two addressing modes
    - Memory
      - Difference between RISC and CISC
      - CISC supports a large variety of addressing modes
      - RISC follows load/store architecture

4 Instruction Types

- Several types of instructions
  - Data movement
    - Pentium: mov dest,src
    - Some do not provide direct data movement instructions
    - Indirect data movement

          add Rdest,Rsrc,0    ; Rdest = Rsrc+0

  - Arithmetic and Logical
    - Arithmetic
      - Integer and floating-point, signed and unsigned
      - add, subtract, multiply, divide
    - Logical
      - and, or, not, xor

Instruction Types (cont.)

Condition code bits

- S: Sign bit (0 = +, 1 = -)
- Z: Zero bit (0 = nonzero, 1 = zero)
- O: Overflow bit (0 = no overflow, 1 = overflow)
- C: Carry bit (0 = no carry, 1 = carry)

Example: Pentium

    cmp count,25    ; compare count to 25
                    ; (subtract 25 from count)
    je  target      ; jump if equal
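How a compare sets these four bits can be sketched by performing the subtraction on fixed-width values. The function names `compare` and `je_taken` and the 32-bit default width are assumptions for this example. The flag definitions follow the usual convention that the carry bit records a borrow on subtraction.

```python
def compare(a, b, bits=32):
    """Emulate a compare: compute a - b and return the condition code bits."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    ua, ub = a & mask, b & mask          # operands as unsigned bit patterns
    result = (ua - ub) & mask            # result is discarded after the flags are set
    return {
        "S": int(bool(result & sign)),   # sign of the result
        "Z": int(result == 0),           # result is zero, i.e. a == b
        "C": int(ua < ub),               # borrow out of the unsigned subtraction
        # Signed overflow: operands differ in sign and the result's sign differs from a's.
        "O": int(bool((ua ^ ub) & (ua ^ result) & sign)),
    }

def je_taken(flags):
    """A jump-if-equal is taken when the zero bit is set."""
    return flags["Z"] == 1

if __name__ == "__main__":
    print(compare(25, 25), je_taken(compare(25, 25)))
    print(compare(24, 25), je_taken(compare(24, 25)))
```

The branch instruction never sees the operands; it only inspects the bits left behind by the compare, which is the set-then-jump style.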

Instruction Types (cont.)

- Flow control and I/O instructions
  - Branch
  - Procedure call
  - Interrupts
- I/O instructions
  - Memory-mapped I/O
    - Most processors support memory-mapped I/O
    - No separate instructions for I/O
  - Isolated I/O
    - Pentium supports isolated I/O
    - Separate I/O instructions

          in  AX,io_port    ; read from an I/O port
          out io_port,AX    ; write to an I/O port

5 Instruction Formats

- Two types
  - Fixed-length
    - Used by RISC processors
    - 32-bit RISC processors use 32-bit wide instructions
      - Examples: SPARC, MIPS, PowerPC
  - Variable-length
    - Used by CISC processors
    - Memory operands need more bits to specify
- Opcode
  - Major and exact operation

Examples of Instruction Formats

RISC (Reduced Instruction Set Computer)
versus
CISC (Complex Instruction Set Computer)

RISC vs. CISC

- The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.
- RISC systems access memory only with explicit load and store instructions.
- In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.

RISC vs. CISC

- The difference between CISC and RISC becomes evident through the basic computer performance equation:

      time/program = time/cycle × cycles/instruction × instructions/program

- RISC systems shorten execution time by reducing the clock cycles per instruction.
- CISC systems improve performance by reducing the number of instructions per program.

RISC vs. CISC

- The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.
- The more complex (and variable) instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.
- With fixed-length instructions, RISC lends itself to pipelining and speculative execution.

RISC vs. CISC

Consider the following program fragments:

CISC:

    mov ax, 10
    mov bx, 5
    mul bx, ax

RISC:

           mov  ax, 0
           mov  bx, 10
           mov  cx, 5
    Begin: add  ax, bx
           loop Begin

The total clock cycles for the CISC version might be:

    (2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles

While the clock cycles for the RISC version are:

    (3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles

With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds.
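The cycle counts for the two fragments can be reproduced with a few lines of Python. The per-instruction costs below (1 cycle for mov, add and loop, 30 cycles for mul) and the instruction counts are assumptions taken from the worked example on this slide, and the names `CYCLES_PER_INSTRUCTION`, `total_cycles` and `execution_time_ns` are invented for this sketch.

```python
# Assumed per-instruction costs for this example, in clock cycles.
CYCLES_PER_INSTRUCTION = {"mov": 1, "mul": 30, "add": 1, "loop": 1}

# (instruction, number of times it executes)
CISC_TRACE = [("mov", 2), ("mul", 1)]
RISC_TRACE = [("mov", 3), ("add", 5), ("loop", 5)]

def total_cycles(trace):
    """Sum cycles over every executed instruction in the trace."""
    return sum(CYCLES_PER_INSTRUCTION[op] * count for op, count in trace)

def execution_time_ns(trace, cycle_time_ns):
    """time/program = cycles/program x time/cycle."""
    return total_cycles(trace) * cycle_time_ns

if __name__ == "__main__":
    print(total_cycles(CISC_TRACE), total_cycles(RISC_TRACE))
    # Hypothetical cycle times, chosen only to show the second factor at work.
    print(execution_time_ns(CISC_TRACE, 2.0), execution_time_ns(RISC_TRACE, 1.0))
```

The RISC fragment executes more instructions (13 against 3) but fewer cycles, and any reduction in cycle time widens the gap further.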

RISC vs. CISC

- Because of their load-store ISAs, RISC architectures require a large number of CPU registers.
- These registers provide fast access to data during sequential program execution.
- They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.
- Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.

RISC vs. CISC

- This is how registers can be overlapped in a RISC system.
- The current window pointer (CWP) points to the active register window.

[Figure: overlapping register windows]
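The overlap can be modelled as a circular register file in which one window's output registers are the next window's input registers. This is a simplified sketch under assumptions: the class name `RegisterWindows`, the group size of 8, the four windows, and the choice to increment the CWP on a call are all illustrative, and window overflow (spilling to memory when the windows run out) is not modelled.

```python
class RegisterWindows:
    """Toy model of overlapping register windows selected by a CWP."""

    def __init__(self, windows=4, group=8):
        self.windows = windows
        self.group = group
        # Each window owns its 'in' and 'local' groups; its 'out' group is
        # physically the 'in' group of the next window.
        self.physical = [0] * (windows * 2 * group)
        self.cwp = 0  # current window pointer

    def _index(self, kind, n):
        if not 0 <= n < self.group:
            raise IndexError("register number out of range")
        offset = {"in": 0, "local": self.group, "out": 2 * self.group}[kind]
        return (self.cwp * 2 * self.group + offset + n) % len(self.physical)

    def read(self, kind, n):
        return self.physical[self._index(kind, n)]

    def write(self, kind, n, value):
        self.physical[self._index(kind, n)] = value

    def call(self):
        """Enter a procedure: advance to the next window."""
        self.cwp = (self.cwp + 1) % self.windows

    def ret(self):
        """Leave a procedure: go back to the caller's window."""
        self.cwp = (self.cwp - 1) % self.windows

if __name__ == "__main__":
    regs = RegisterWindows()
    regs.write("out", 0, 42)       # caller places a parameter in an out register
    regs.call()
    print(regs.read("in", 0))      # callee sees it as an in register, no copy made
    regs.write("in", 1, 7)         # callee leaves a return value
    regs.ret()
    print(regs.read("out", 1))     # caller reads it from its out register
```

Passing a parameter costs nothing beyond changing the CWP, which is the overhead reduction described above.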

RISC vs. CISC

- It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.
- Some RISC systems provide more extravagant instruction sets than some CISC systems.
- Some systems combine both approaches.
- The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.

RISC vs. CISC

RISC
- Multiple register sets.
- Three operands per instruction.
- Parameter passing through register windows.
- Single-cycle instructions.
- Hardwired control.
- Highly pipelined.

CISC
- Single register set.
- One or two register operands per instruction.
- Parameter passing through memory.
- Multiple cycle instructions.
- Microprogrammed control.
- Less pipelined.

Continued

RISC vs. CISC

RISC
- Simple instructions, few in number.
- Fixed length instructions.
- Complexity in compiler.
- Only LOAD/STORE instructions access memory.
- Few addressing modes.

CISC
- Many complex instructions.
- Variable length instructions.
- Complexity in microcode.
- Many instructions can access memory.
- Many addressing modes.

Summary

Instruction Set Design Issues

Instruction set design issues include:

- Where are operands stored?
  - registers, memory, stack, accumulator
- How many explicit operands are there?
  - 0, 1, 2, or 3
- How is the operand location specified?
  - register, immediate, indirect, ...
- What type & size of operands are supported?
  - byte, int, float, double, string, vector, ...
- What operations are supported?
  - add, sub, mul, move, compare, ...

More About General Purpose Registers

- Why do almost all new architectures use GPRs?
  - Registers are much faster than memory (even cache)
    - Register values are available immediately
    - When memory isn't ready, the processor must wait ("stall")
  - Registers are convenient for variable storage
    - Compiler assigns some variables just to registers
    - More compact code, since small fields specify registers (compared to memory addresses)

[Figure: storage hierarchy. Processor (Registers), Cache, Memory, Disk]

What Operations are Needed

- Arithmetic + Logical
  - Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
  - Logical operation: AND, OR, NOT
- Data Transfer: copy, load, store
- Control: branch, jump, call, return
- Floating Point: ADDF, MULF, DIVF
  - Same as arithmetic, but usually take bigger operands
- Decimal: ADDD, CONVERT
- String: move, compare, search
- Graphics: pixel and vertex, compression/decompression operations

Stack Architecture: Pros and Cons

Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures

Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures

Accumulator Architecture: Pros and Cons

Pros
- Very low hardware requirements
- Easy to design and understand

Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic

Memory-Memory Architecture: Pros and Cons

Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)

Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required

Memory-Register Architecture: Pros and Cons

Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density

Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers

Typical Processor Execution Cycle

Instruction Fetch: Obtain instruction from program storage
Instruction Decode: Determine required actions and instruction size
Operand Fetch: Locate and obtain operand data
Execute: Compute result value or status
Result Store: Deposit results in register or storage for later use
Next Instruction: Determine successor instruction
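The six stages of the execution cycle can be sketched as a small interpreter loop. This is only an illustration in Python, assuming a hypothetical one-accumulator toy machine: the opcodes (LOAD, ADD, STORE, JUMP, HALT) and the tuple-based instruction format are invented for this sketch and do not come from the slides or from MIPS.

```python
def run(program, memory):
    """Execute a toy program given as a list of (opcode, operand) tuples."""
    pc = 0    # program counter: index of the next instruction
    acc = 0   # the single accumulator register of this toy machine
    while True:
        instruction = program[pc]          # instruction fetch
        opcode, operand = instruction      # instruction decode
        next_pc = pc + 1                   # default successor instruction
        if opcode == "HALT":
            return memory
        if opcode == "LOAD":
            acc = memory[operand]          # operand fetch, result kept in register
        elif opcode == "ADD":
            value = memory[operand]        # operand fetch
            acc = acc + value              # execute
        elif opcode == "STORE":
            memory[operand] = acc          # result store to memory
        elif opcode == "JUMP":
            next_pc = operand              # instruction chooses its own successor
        else:
            raise ValueError(f"unknown opcode {opcode!r}")
        pc = next_pc                       # next instruction


program = [("LOAD", "A"), ("ADD", "B"), ("STORE", "C"), ("HALT", None)]
result = run(program, {"A": 2, "B": 3, "C": 0})
```

In this sketch the memory is a dictionary keyed by variable name, which keeps the loop readable; a real machine would use numeric addresses.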

Instruction and Data Memory: Unified or Separate

[Figure: the programmer's view (instructions such as ADD, SUBTRACT, AND, OR, COMPARE) versus the computer's view of a computer program (instructions), with CPU, memory, and I/O.]

Princeton (Von Neumann) Architecture
- Data and instructions mixed in the same unified memory
- Program as data
- Storage utilization
- Single memory interface

Harvard Architecture
- Data and instructions in separate memories
- Has advantages in certain high-performance implementations
- Can optimize each memory

Classifying Instruction Set Architectures

There are four types of internal storage used by the processor to store operands, explicitly or implicitly, during execution of a program:
1. Stack
2. Accumulator
3. Set of registers (register-memory)
4. Set of registers (register-register/load-store)

The operands in a stack architecture are implicitly on the top of the stack, and in an accumulator architecture one operand is implicitly the accumulator. The general-purpose register architectures (register-memory and register-register) have only explicit operands, either in registers or in memory locations.

Basic Addressing Classes

Declining cost of registers

Operand locations for four instruction set architecture classes

The arrows indicate whether the operand is an input or the result of the ALU operation, or both an input and result. Lighter shades indicate inputs, and the dark shade indicates the result. In (a), a Top Of Stack register (TOS) points to the top input operand, which is combined with the operand below. The first operand is removed from the stack, the result takes the place of the second operand, and TOS is updated to point to the result. All operands are implicit. In (b), the Accumulator is both an implicit input operand and a result. In (c), one input operand is a register, one is in memory, and the result goes to a register. All operands are registers in (d) and, like the stack architecture, can be transferred to memory only via separate instructions: push or pop for (a) and load or store for (d).

Code Sequence for C = A + B

Stack    | Accumulator | Register-memory | Register-register
Push A   | Load A      | Load R1, A      | Load R1, A
Push B   | Add B       | Add R3, R1, B   | Load R2, B
Add      | Store C     | Store R3, C     | Add R3, R1, R2
Pop C    |             |                 | Store R3, C

The code sequence for C = A + B for four classes of instruction sets. Note that the Add instruction has implicit operands for stack and accumulator architectures, and explicit operands for register architectures. It is assumed that A, B, and C all belong in memory and that the values of A and B cannot be destroyed.
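To make the difference between implicit and explicit operands concrete, the four code sequences for C = A + B can be mimicked in Python. This is a sketch under simple assumptions: memory is modelled as a dictionary keyed by variable name, and the function names are hypothetical, chosen only for this illustration.

```python
def run_stack(mem):
    stack = []
    stack.append(mem["A"])            # Push A
    stack.append(mem["B"])            # Push B
    top = stack.pop()                 # Add: both operands are implicit,
    below = stack.pop()               # taken from the top of the stack
    stack.append(below + top)
    mem["C"] = stack.pop()            # Pop C


def run_accumulator(mem):
    acc = mem["A"]                    # Load A
    acc = acc + mem["B"]              # Add B: the accumulator is implicit
    mem["C"] = acc                    # Store C


def run_register_memory(mem):
    regs = {}
    regs["R1"] = mem["A"]                 # Load R1, A
    regs["R3"] = regs["R1"] + mem["B"]    # Add R3, R1, B: one operand in memory
    mem["C"] = regs["R3"]                 # Store R3, C


def run_register_register(mem):
    regs = {}
    regs["R1"] = mem["A"]                 # Load R1, A
    regs["R2"] = mem["B"]                 # Load R2, B
    regs["R3"] = regs["R1"] + regs["R2"]  # Add R3, R1, R2: registers only
    mem["C"] = regs["R3"]                 # Store R3, C


results = []
for sequence in (run_stack, run_accumulator,
                 run_register_memory, run_register_register):
    memory = {"A": 2, "B": 3, "C": 0}
    sequence(memory)
    results.append(memory)
```

All four sequences leave the same value in C and leave A and B untouched; they differ only in how many instructions are needed and in where the operands are named.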

Stack Architectures

Accumulator Architectures

Register-Set Architectures

Register-to-Register: Load-Store Architectures

Register-to-Memory Architectures

Memory-to-Memory Architectures

Instruction Formats

Instruction Set Architecture (ISA)

To command a computer's hardware, you must speak its language. The words of a machine's language are called instructions, and its vocabulary is called the instruction set.

Once you learn one machine language, it is easy to pick up others:
- There are few fundamental operations that all computers must provide
- All designers have the same goal of finding a language that simplifies building the hardware and the compiler while maximizing performance and minimizing cost

Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept.

The MIPS instruction set is used as a case study.

Interface Design

A good interface:
- Lasts through many implementations (portability, compatibility)
- Is used in many different ways (generality)
- Provides convenient functionality to higher levels
- Permits an efficient implementation at lower levels

Design decisions must take into account:
- Technology
- Machine organization
- Programming languages
- Compiler technology
- Operating systems

[Figure: a single interface with several uses above it and several implementations (imp 0, imp 1, ...) below it, over time.]

Classifying Instruction Set Architectures

Accumulator Architecture
• Common in early stored-program computers, when hardware was so expensive
• Machine has only one register (accumulator) involved in all math and logical operations
• All operations assume the accumulator as a source operand and a destination for the operation, with the other operand stored in memory

Extended Accumulator Architecture
• Dedicated registers for specific operations, e.g. stack and array index registers, added
• The 8086 microprocessor is an example of such a special-purpose register architecture

General-Purpose Register Architecture
• MIPS is an example of such an architecture, where registers are not restricted to a single role
• This type of instruction set can be further divided into:
  • Register-memory: allows for one operand to be in memory
  • Register-register (load-store): demands all operands to be in registers

Machine        | # general-purpose registers | Architecture style              | Year
Motorola 6800  | 2                           | Accumulator                     | 1974
DEC VAX        | 16                          | Register-memory, memory-memory  | 1977
Intel 8086     | 1                           | Extended accumulator            | 1978
Motorola 68000 | 16                          | Register-memory                 | 1980
Intel 80386    | 8                           | Register-memory                 | 1985
PowerPC        | 32                          | Load-store                      | 1992
DEC Alpha      | 32                          | Load-store                      | 1992

Compact Code and Stack Architectures

- When memory is scarce, machines like the Intel 80x86 had variable-length instructions to match varying operand specifications and minimize code size
- Stack machines abandoned registers altogether, arguing that it is hard for compilers to use them efficiently
- Operands are pushed on a stack from memory, and the results have to be popped from the stack to memory
- Operations take their operands by default from the top of the stack and insert the results back onto the stack
- Stack machines simplify compilers and lend themselves to a compact instruction encoding, but limit compiler optimization (e.g. in math expressions)

Example: A = B + C

Push AddressC    # Top=Top+4; Stack[Top]=Memory[AddressC]
Push AddressB    # Top=Top+4; Stack[Top]=Memory[AddressB]
add              # Stack[Top-4]=Stack[Top]+Stack[Top-4]; Top=Top-4
Pop AddressA     # Memory[AddressA]=Stack[Top]; Top=Top-4

- Compact code is important for heralded network computers, where programs must be downloaded over the Internet (e.g. Java-based applications)

Other Types of Architecture

High-Level-Language Architecture
• In the 1960s, systems software was rarely written in high-level languages, and virtually every commercial operating system before Unix was written in assembly
• Some people blamed the code density on the instruction set rather than the programming language
• A machine design philosophy was advocated with the goal of making the hardware more like high-level languages
• The effectiveness of high-level languages, memory size limitation and lack of efficient compilers doomed this philosophy to a historical footnote

Reduced Instruction Set Architecture
• With the recent development in compiler technology and expanded memory sizes, fewer programmers are using assembly-level coding
• Instruction set architecture became measurable in the way compilers, rather than programmers, use them
• RISC architecture favors simplifying hardware design over enriching the offered set of instructions, relying on compilers to effectively use them to perform complex operations
• Virtually all new architectures since 1982 follow the RISC philosophy of fixed instruction lengths, load-store operations, and limited addressing modes

Evolution of Instruction Sets

[Figure: evolution of instruction sets. Single Accumulator (EDSAC) -> Accumulator + Index Registers (Manchester Mark I) -> Separation of Programming Model from Implementation (High-level Language Based; Concept of a Family) -> General Purpose Register Machines -> Complex Instruction Sets (VAX, Intel) and Load/Store Architecture (CDC, Cray 1) -> RISC (SPARC and others).]

Register-Memory Architectures

Number of memory addresses | Max. number of operands | Examples
0                          | 3                       | SPARC, MIPS, PowerPC, ALPHA
1                          | 2                       | Intel 80x86, Motorola 68000
2                          | 2                       | VAX (also has 3 operands format)
3                          | 3                       | VAX (also has 2 operands format)

Effect of the number of memory operands

Interpreting Memory Addressing

- The address of a word matches the byte address of one of its 4 bytes
- The addresses of sequential words differ by 4 (word size in bytes)
- Word addresses are multiples of 4 (alignment restriction)
- Machines that use the address of the leftmost byte as the word address are called "Big Endian", and those that use the rightmost byte are called "Little Endian"
- Misalignment complicates memory access and causes programs to run slower (some machines do not allow misaligned memory access at all)
- Byte ordering can be a problem when exchanging data among different machines
- Byte addresses affect array index calculation, to account for word addressing and offset within the word

Object addressed | Aligned at byte offsets | Misaligned at byte offsets
Byte             | 0, 1, 2, 3, 4, 5, 6, 7  | Never
Half word        | 0, 2, 4, 6              | 1, 3, 5, 7
Word             | 0, 4                    | 1, 2, 3, 5, 6, 7
Double word      | 0                       | 1, 2, 3, 4, 5, 6, 7
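The alignment rule and the two byte orders can be checked with a few lines of Python. This is a minimal sketch assuming the natural-alignment convention used in the table (an object of size s bytes is aligned when its address is a multiple of s); the helper names are hypothetical, and `struct.pack` from the standard library is used only to show the byte layouts.

```python
import struct


def is_aligned(address, size):
    """True when an object of `size` bytes starts at a multiple of its size."""
    return address % size == 0


def aligned_offsets(size, span=8):
    """Byte offsets within a `span`-byte block at which the object is aligned."""
    return [offset for offset in range(span) if is_aligned(offset, size)]


word = 0x0A0B0C0D
big_endian = struct.pack(">I", word)     # most significant byte at the lowest address
little_endian = struct.pack("<I", word)  # least significant byte at the lowest address
```

Reading the big-endian bytes back with the little-endian format yields a different number, which is the data-exchange problem mentioned in the list.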

Addressing Modes

- Addressing modes refer to how to specify the location of an operand (effective address)
- Addressing modes have the ability to:
  - Significantly reduce instruction counts
  - Increase the average CPI
  - Increase the complexity of building a machine
- The VAX machine is used for benchmark data since it supports a wide range of memory addressing modes
- Famous addressing modes can be classified based on:
  - the source of the data, into register, immediate, or memory
  - the address calculation, into direct and indirect
- An indexed addressing mode is usually provided to allow efficient implementation of loops and array access

Example of Addressing Modes

Addressing mode | Example | Meaning | When used
Register | Add R4, R3 | Regs[R4] ← Regs[R4] + Regs[R3] | When a value is in a register
Immediate | Add R4, #3 | Regs[R4] ← Regs[R4] + 3 | For constants
Displacement | Add R4, 100(R1) | Regs[R4] ← Regs[R4] + Mem[100 + Regs[R1]] | Accessing local variables
Register indirect | Add R4, (R1) | Regs[R4] ← Regs[R4] + Mem[Regs[R1]] | Accessing using a pointer or a computed address
Indexed | Add R4, (R1 + R2) | Regs[R4] ← Regs[R4] + Mem[Regs[R1] + Regs[R2]] | Sometimes useful in array addressing: R1 = base of the array; R2 = index amount
Direct or absolute | Add R4, (1001) | Regs[R4] ← Regs[R4] + Mem[1001] | Sometimes useful for accessing static data; address constant may need to be large
Memory indirect or memory deferred | Add R4, @(R3) | Regs[R4] ← Regs[R4] + Mem[Mem[Regs[R3]]] | If R3 is the address of the pointer p, then mode yields *p
Autoincrement | Add R4, (R2)+ | Regs[R4] ← Regs[R4] + Mem[Regs[R2]]; Regs[R2] ← Regs[R2] + d | Useful for stepping through arrays within a loop. R2 points to start of the array; each reference increments R2 by d
Autodecrement | Add R4, -(R2) | Regs[R2] ← Regs[R2] - d; Regs[R4] ← Regs[R4] + Mem[Regs[R2]] | Same use as autoincrement. Autodecrement/increment can also act as push/pop to implement a stack
Scaled | Add R4, 100(R2)[R3] | Regs[R4] ← Regs[R4] + Mem[100 + Regs[R2] + Regs[R3] * d] | Used to index arrays
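The "Meaning" column of the addressing-mode table translates almost directly into code. The following Python sketch fetches the second operand of an Add for each mode; it assumes registers and memory are plain dictionaries, and the function name `fetch_operand` and its parameter names are hypothetical, introduced only for this illustration.

```python
def fetch_operand(mode, regs, mem, base=None, index=None, const=0, d=4):
    """Return the operand value selected by `mode`; d is the element size."""
    if mode == "register":
        return regs[base]
    if mode == "immediate":
        return const
    if mode == "displacement":
        return mem[const + regs[base]]
    if mode == "register_indirect":
        return mem[regs[base]]
    if mode == "indexed":
        return mem[regs[base] + regs[index]]
    if mode == "direct":
        return mem[const]
    if mode == "memory_indirect":
        return mem[mem[regs[base]]]
    if mode == "autoincrement":
        value = mem[regs[base]]      # use the address first ...
        regs[base] += d              # ... then step the register forward
        return value
    if mode == "autodecrement":
        regs[base] -= d              # step the register back first ...
        return mem[regs[base]]       # ... then use the new address
    if mode == "scaled":
        return mem[const + regs[base] + regs[index] * d]
    raise ValueError(f"unknown addressing mode {mode!r}")


regs = {"R1": 200, "R2": 300, "R3": 4}
mem = {200: 300, 300: 22, 304: 33, 400: 44, 416: 88, 1001: 66}
```

Autoincrement followed by autodecrement on the same register returns it to its starting value, which is why the pair can implement push and pop.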

Addressing Mode for Signal Processing

DSP offers special addressing modes to better serve popular algorithms

Modulo addressing
- Since DSP deals with continuous data streams, circular buffers are widely used
- Circular or modulo addressing allows automatic increment and decrement, and resets the pointer when reaching the end of the buffer

Reverse addressing
- Resulting address is the reverse order of the current address
- Reverse addressing mode expedites the access, which otherwise requires a number of logical instructions or extra memory accesses

Fast Fourier Transform

0 (000) -> 0 (000)
1 (001) -> 4 (100)
2 (010) -> 2 (010)
3 (011) -> 6 (110)
4 (100) -> 1 (001)
5 (101) -> 5 (101)
6 (110) -> 3 (011)
7 (111) -> 7 (111)

Special features require either hand coding or a compiler that uses such features (GNU would not be a good choice)
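Both DSP addressing modes are simple to express in software, which also shows what the hardware saves. The sketch below is an assumption-laden illustration in Python: `bit_reverse` and `circular_next` are hypothetical helper names, and the buffer is described by a base address and a length rather than by any particular DSP's registers.

```python
def bit_reverse(index, bits):
    """Reverse the lowest `bits` bits of `index` (FFT reordering)."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)  # move the lowest bit to the result
        index >>= 1
    return result


def circular_next(pointer, step, base, length):
    """Advance `pointer` by `step` inside the buffer [base, base + length)."""
    return base + (pointer - base + step) % length


fft_order = [bit_reverse(i, 3) for i in range(8)]
```

Without hardware support each reversed address costs a loop of shifts and masks like the one above, which is the "number of logical instructions" the slide refers to.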

Operations of the Computer Hardware

"There must certainly be instructions for performing the fundamental arithmetic operations."
Burks, Goldstine and von Neumann, 1947

Assembly language is a symbolic representation of what the processor actually understands.

The MIPS assembler allows only one instruction per line and ignores comments following # until the end of the line.

Example:

Translation of a segment of a C program to MIPS assembly instructions

C: f = (g + h) - (i + j)

MIPS:

add t0, g, h     # temp variable t0 contains "g + h"
add t1, i, j     # temp variable t1 contains "i + j"
sub f, t0, t1    # f = t0 - t1 = (g + h) - (i + j)

Operations in the Instruction Set

Operator type | Examples
Arithmetic and logical | Integer arithmetic and logical operations: add, and, subtract, or
Data transfer | Loads-stores (move instructions on machines with memory addressing)
Control | Branch, jump, procedure call and return, trap
System | Operating system call, virtual memory management instructions
Floating point | Floating point instructions: add, multiply
Decimal | Decimal add, decimal multiply, decimal to character conversion
String | String move, string compare, string search
Graphics | Pixel operations, compression/decompression operations

- Arithmetic, logical, data transfer and control are almost standard categories for all machines
- System instructions are required for multi-programming environments, although support for system functions varies
- Decimal and string instructions can be primitives, e.g. on the IBM 360 and the VAX
- Support for floating point, decimal, string and graphics can be optional, sometimes provided via a co-processor
- Some machines rely on the compiler to synthesize special operations, such as string handling, from simpler instructions

Operations for Media & Signal Processing

- Single instruction multiple data (SIMD) and vector instructions are often supported in DSPs, which are commonly used in multimedia and signal processing applications

Partitioned Add (integer)
- Perform multiple narrow (for example 16-bit) additions on a 64-bit ALU, since most data are narrow
- Increases ALU throughput for multimedia applications

Paired single operations (float)
- Allow the same register to act as two operands to the same operation
- Handy in dealing with vertices and coordinates

Multiply and accumulate
- Very handy for calculating dot products of vectors (signal processing) and matrix multiplication
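A partitioned add and a multiply-accumulate can be sketched in Python to show what these instructions compute. The lane width of 16 bits and the count of four lanes per 64-bit word are assumptions made for this example, and `pack`, `unpack`, `partitioned_add` and `multiply_accumulate` are hypothetical helper names.

```python
LANES = 4                         # assumed: four lanes per 64-bit word
LANE_BITS = 16                    # assumed: 16 bits per lane
LANE_MASK = (1 << LANE_BITS) - 1


def pack(lanes):
    """Pack a list of four small integers into one 64-bit word."""
    word = 0
    for i, value in enumerate(lanes):
        word |= (value & LANE_MASK) << (i * LANE_BITS)
    return word


def unpack(word):
    """Split a 64-bit word back into its four lanes."""
    return [(word >> (i * LANE_BITS)) & LANE_MASK for i in range(LANES)]


def partitioned_add(a, b):
    """Add lane by lane; a carry out of one lane never reaches the next."""
    result = 0
    for lane in range(LANES):
        shift = lane * LANE_BITS
        lane_sum = ((a >> shift) & LANE_MASK) + ((b >> shift) & LANE_MASK)
        result |= (lane_sum & LANE_MASK) << shift   # wrap around inside the lane
    return result


def multiply_accumulate(xs, ys):
    """Dot product built from repeated multiply-accumulate steps."""
    acc = 0
    for x, y in zip(xs, ys):
        acc += x * y
    return acc


lane_sums = unpack(partitioned_add(pack([1, 2, 3, 4]), pack([10, 20, 30, 40])))
```

The difference from an ordinary 64-bit add shows up on overflow: a lane that wraps does not disturb its neighbour.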

Frequency of Operations Usage

- The most widely executed instructions are the simple operations of an instruction set
- The following is the average usage in SPECint92 on Intel 80x86

Rank | 80x86 instruction      | Integer average (% total executed)
1    | Load                   | 22%
2    | Conditional branch     | 20%
3    | Compare                | 16%
4    | Store                  | 12%
5    | Add                    | 8%
6    | And                    | 6%
7    | Sub                    | 5%
8    | Move register-register | 4%
9    | Call                   | 1%
10   | Return                 | 1%
     | Total                  | 96%

Make the common case fast by focusing on these operations

Control Flow Instructions

- Jump: for unconditional change in the control flow
- Branch: for conditional change in the control flow
- Procedure calls and returns

Data is based on SPEC on Alpha

Destination Address Definition

- Relative addressing with respect to the program counter proved to be the best choice for forward and backward branches or jumps (load-address independent)
- To allow for dynamic loading of library routines, register indirect addressing allows addresses to be loaded in special registers (e.g. virtual functions in C++ and system calls in a case statement)

Data is based on SPEC on Alpha
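The claim that PC-relative branches are load-address independent can be checked with a small calculation. This Python sketch is only an illustration: the load addresses, the offset of the branch inside the code, and the displacement are made-up numbers, and the target is simply the program counter plus the displacement (real ISAs typically add the displacement to the address of the following instruction and may scale it).

```python
def pc_relative_target(pc, displacement):
    """Branch target encoded as a signed distance from the program counter."""
    return pc + displacement


def register_indirect_target(regs, reg):
    """Branch target read from a register, e.g. a dynamically loaded routine."""
    return regs[reg]


branch_offset = 0x20      # assumed position of the branch inside the code
displacement = -0x10      # assumed backward branch of 16 bytes

# The same code loaded at two different addresses.
target_a = pc_relative_target(0x1000 + branch_offset, displacement)
target_b = pc_relative_target(0x8000 + branch_offset, displacement)

# A target known only at run time is held in a register instead.
library_target = register_indirect_target({"R5": 0x4000}, "R5")
```

In both placements the branch lands at the same offset inside the code, so the encoded displacement never needs patching when the program is relocated.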

Condition Evaluation

Compare and branch can be efficient if the majority of conditions are comparisons with zero

Remember to focus on the common case

Based on SPEC on MIPS

Frequency of Types of Comparison

Data is based on SPEC on Alpha

Different benchmarks and machines set new design priorities

DSPs support a repeat instruction for for-loops (vectors) using registers

Supporting Procedures

Execution of a procedure follows these steps:
- Store parameters in a place accessible to the procedure
- Transfer control to the procedure
- Acquire the storage resources needed for the procedure
- Perform the desired task
- Store the result value in a place accessible to the calling program
- Return control to the point of origin

The hardware provides a program counter to trace instruction flow and manage transfer of control

Parameter Passing
- Registers can be used for passing a small number of parameters
- A stack is used to spill registers of the current context, to make room for the called procedure to run, and to allow large parameters to be passed
- Storage of machine state can be performed by caller or callee
- Handling of shared variables is important to ensure correct semantics and thus requires clear specifications in the library interface

Global variables stored in registers need careful handling

Type and Size of Operands

- The type of an operand is designated by encoding it in the instruction's operation code
- The type of an operand, e.g. single-precision float, effectively gives its size
- Common operand types include character, half-word and word-size integer, and single- and double-precision floating point
- Characters are almost always in ASCII, integers are in 2's complement, and floating point is in IEEE 754
- The 16-bit Unicode, used in Java, is gaining popularity due to its support for international character sets
- For business applications, some architectures support a decimal format in binary coded decimal (BCD)
- Depending on the size of the word, the complexity of handling different operand types differs
- DSP offers fixed-point data types to support high-precision floating point arithmetic and to allow sharing a single exponent for multiple numbers
- For graphics applications, vertex and pixel operands are added features

Size of Operands

- Double-word data type is used for double-precision floating point operations and for address storage in machines with a 64-bit wide address bus
- Words are used for integer operations and for 32-bit address bus machines
- Because of the mix in SPEC, word and double-word data types dominate

Frequency of reference by size, based on SPEC2000 on Alpha

Instruction Representation

- Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary, or base 2)
- Numbers are stored in computers as a series of high and low electronic signals (binary numbers)
- Binary digits are called bits and are considered the atom of computing
- Each piece of an instruction is a number, and placing these numbers together forms the instruction
- The assembler translates the assembly symbolic instructions into machine language instructions (machine code)

Example:

Assembly: add $t0, $s1, $s2

M/C language (decimal): 0      | 17     | 18     | 8      | 0      | 32
M/C language (binary):  000000 | 10001  | 10010  | 01000  | 00000  | 100000
                        6 bits | 5 bits | 5 bits | 5 bits | 5 bits | 6 bits

Note: the MIPS compiler by default maps $s0, ..., $s7 to reg. 16-23 and $t0, ..., $t7 to reg. 8-15

Encoding an Instruction Set

- Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation
- The operation is typically specified in one field, called the opcode
- The addressing mode for the operand can be encoded with the operation, or specified through a separate identifier when a large number of modes is supported
- The architecture must balance between several competing factors:
  - Desire to support as many registers and addressing modes as possible
  - Effect of operand specification on the size of the instruction (program)
  - Desire to simplify instruction fetching and decoding during execution
- Fixed-size instruction encoding simplifies the CPU design while limiting the addressing modes supported
- An architect caring about code size can use variable-size encoding
- A hybrid approach is to allow variability by supporting multiple instruction sizes

Encoding Examples

MIPS Instruction Format

Register-format instructions

op     | rs     | rt     | rd     | shamt  | funct
6 bits | 5 bits | 5 bits | 5 bits | 5 bits | 6 bits

op: Basic operation of the instruction, traditionally called opcode
rs: The first register source operand
rt: The second register source operand
rd: The register destination operand; it gets the result of the operation
shamt: Shift amount
funct: This field selects the specific variant of the operation of the op field

Immediate-type instructions

op     | rs     | rt     | address
6 bits | 5 bits | 5 bits | 16 bits

Some instructions need longer fields than provided, for large-value constants

The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register

Example: lw $t0, 32($s3)    # Temporary register $t0 gets A[8]

Instruction | Format | op | rs  | rt  | rd  | shamt | funct | address
add         | R      | 0  | reg | reg | reg | 0     | 32    | N/A
sub         | R      | 0  | reg | reg | reg | 0     | 34    | N/A
lw          | I      | 35 | reg | reg | N/A | N/A   | N/A   | address
sw          | I      | 43 | reg | reg | N/A | N/A   | N/A   | address
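The two MIPS formats can be assembled by shifting each field into place. The Python sketch below assumes the standard MIPS field widths (6, 5, 5, 5, 5, 6 bits for the register format and 6, 5, 5, 16 bits for the immediate format) and the usual register numbers ($t0 = 8, $s1 = 17, $s2 = 18, $s3 = 19); the function names `encode_r` and `encode_i` are hypothetical.

```python
def encode_r(op, rs, rt, rd, shamt, funct):
    """Pack the six register-format fields into one 32-bit word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


def encode_i(op, rs, rt, address):
    """Pack the four immediate-format fields; the address is kept to 16 bits."""
    return (op << 26) | (rs << 21) | (rt << 16) | (address & 0xFFFF)


T0, S1, S2, S3 = 8, 17, 18, 19       # register numbers for $t0, $s1, $s2, $s3

add_word = encode_r(op=0, rs=S1, rt=S2, rd=T0, shamt=0, funct=32)  # add $t0, $s1, $s2
lw_word = encode_i(op=35, rs=S3, rt=T0, address=32)                # lw $t0, 32($s3)

add_bits = format(add_word, "032b")  # the instruction as a 32-character bit string
```

Printing `add_bits` and splitting it 6/5/5/5/5/6 reproduces the fields 000000 10001 10010 01000 00000 100000.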

The Stored Program Concept

- Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept
- Today's computers are built on two key principles:
  - Instructions are represented as numbers
  - Programs can be stored in memory to be read or written just like numbers
- The power of the concept: memory can contain:
  - the source code for an editor
  - the compiled m/c code for the editor
  - the text that the compiled program is using
  - the compiler that generated the code

[Figure: a processor connected to a memory holding an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and source code in C for the editor program.]

Compiling if-then-else in MIPS

Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiled MIPS code for the following C if statement?

if (i == j) f = g + h; else f = g - h;

[Figure: flowchart testing i == j. If i = j, f = g + h; else (i ≠ j), f = g - h; both paths join at Exit.]

MIPS:

      bne $s3, $s4, Else    # go to Else if i ≠ j
      add $s0, $s1, $s2     # f = g + h (skipped if i ≠ j)
      j Exit                # go to Exit
Else: sub $s0, $s1, $s2     # f = g - h (skipped if i = j)
Exit:

Typical Compilation

Major Types of Optimization

Optimization name | Explanation | Frequency

High-level (at or near source level; machine independent)
Procedure integration | Replace procedure call by procedure body | N.M.

Local (within straight-line code)
Common subexpression elimination | Replace two instances of the same computation by single copy | 18%
Constant propagation | Replace all instances of a variable that is assigned a constant with the constant | 22%
Stack height reduction | Rearrange expression tree to minimize resources needed for expression evaluation | N.M.

Global (across a branch)
Global common subexpression elimination | Same as local, but this version crosses branches | 13%
Copy propagation | Replace all instances of a variable A that has been assigned X (i.e. A = X) with X | 11%
Code motion | Remove code from a loop that computes the same value each iteration of the loop | 16%
Induction variable elimination | Simplify/eliminate array-addressing calculations within loops | 2%

Machine-dependent (depends on machine knowledge)
Strength reduction | Many examples, such as replace multiply by a constant with adds and shifts | N.M.
Pipeline scheduling | Reorder instructions to improve pipeline performance | N.M.
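Three of the optimizations in the table (common subexpression elimination, code motion, and strength reduction) can be shown by hand on a small function. The Python sketch below is an illustration only: `before` and `after` are hypothetical names, the rewrite is done manually rather than by a compiler, and the multiplier 10 is an arbitrary choice.

```python
def before(a, b, c, n):
    """Straightforward version, as a programmer might write it."""
    total = 0
    for i in range(n):
        x = (a + b) * 10      # a + b is computed here ...
        y = (a + b) + c       # ... and again here; neither depends on i
        total += x + y + i
    return total


def after(a, b, c, n):
    """Same computation after three hand-applied optimizations."""
    t = a + b                 # common subexpression: computed once
    x = (t << 3) + (t << 1)   # strength reduction: t * 10 = t * 8 + t * 2
    y = t + c                 # code motion: loop-invariant work hoisted
    total = 0
    for i in range(n):
        total += x + y + i
    return total


same = before(3, 4, 5, 10) == after(3, 4, 5, 10)
```

The loop body shrinks from two additions and a multiply to a single accumulation, which is the kind of saving the frequency column is measuring.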

Effect of Compiler Optimization

[Figure: effect of compiler optimization level for each program; measurements taken on MIPS.]

Level 0: non-optimized code
Level 1: local optimization
Level 2: global optimization, s/w pipelining
Level 3: adds procedure integration

Compiler Support for Multimedia Instr.

- Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics)
- Intel added a new set of instructions, called Streaming SIMD Extension
- A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer
- Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations
- Strided addressing allows memory access in increments larger than one
- Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data
- Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup
- Such limited support for vector processing makes the use of vectorizing compiler optimization unpopular and restricts its scope to hand-coded routines

SIMD instructions on MMX and AltiVec tend to be solutions, not primitives
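Strided and gather/scatter accesses are easy to picture on a flat array. This Python sketch assumes memory is a simple list indexed by element and uses hypothetical helper names (`strided_load`, `gather`, `scatter`); it models only which elements are touched, not how a vector unit overlaps the transfers.

```python
def strided_load(memory, base, stride, count):
    """Load `count` elements starting at `base`, `stride` elements apart."""
    return [memory[base + i * stride] for i in range(count)]


def gather(memory, addresses):
    """Load the elements whose addresses are listed in an index vector."""
    return [memory[address] for address in addresses]


def scatter(memory, addresses, values):
    """Store each value at the matching address from the index vector."""
    for address, value in zip(addresses, values):
        memory[address] = value


matrix = list(range(12))        # a 3 x 4 matrix stored row by row
column1 = strided_load(matrix, base=1, stride=4, count=3)
picked = gather(matrix, [11, 0, 6])
```

Reading a matrix column is the classic use of a stride larger than one: without it, each column element needs its own scalar load.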

Starting a Program

[Figure: translation hierarchy. C program -> Compiler -> Assembly language program -> Assembler -> Object: Machine language module, together with Object: Library routine (machine language) -> Linker -> Executable: Machine language program -> Loader -> Memory.]

Linker:
- Place code and data modules symbolically in memory
- Determine the addresses of data and instruction labels
- Patch both internal and external references

Object files for Unix typically contain:
- Header: size and position of components
- Text segment: machine code
- Data segment: static and dynamic variables
- Relocation info: identify absolute memory references
- Symbol table: name and location of labels, procedures and variables
- Debugging info: mapping source to object code, break points, etc.

Loading Executable Program

[Figure: MIPS memory layout, from high to low addresses: Stack ($sp, 7fff ffff hex), Dynamic data, Static data ($gp at 1000 8000 hex, segment starting at 1000 0000 hex), Text (pc, 0040 0000 hex), Reserved (from 0)]

To load an executable, the operating system follows these steps:

Reads the executable file header to determine the size of the text and data segments

Creates an address space large enough for the text and data

Copies the instructions and data from the executable file into memory

Copies the parameters (if any) to the main program onto the stack

Initializes the machine registers and sets the stack pointer to the first free location

Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program

Instruction Set Design Issues

Instruction set design issues:
- Number of Addresses
- Flow of Control
- Operand Types and Addressing Modes
- Instruction Types
- Instruction Formats

Number of Addresses

Four categories

3-address machines
- 2 for the source operands and one for the result

2-address machines
- One address doubles as source and result

1-address machines
- Accumulator machines
- Accumulator is used for one source and result

0-address machines
- Stack machines
- Operands are taken from the stack
- Result goes onto the stack

Number of Addresses (cont.)

Three-address machines

Two for the source operands, one for the result

RISC processors use three addresses

Sample instructions

add  dest,src1,src2    M(dest) = [src1] + [src2]
sub  dest,src1,src2    M(dest) = [src1] - [src2]
mult dest,src1,src2    M(dest) = [src1] * [src2]

Three addresses: Operand 1, Operand 2, Result

Example: a = b + c

Three-address instruction formats are not common because they require a relatively long instruction format to hold the three address references

Number of Addresses (cont.)

Example

C statement

A = B + C * D - E + F + A

Equivalent code:

mult T,C,D    ; T = C*D
add  T,T,B    ; T = B+C*D
sub  T,T,E    ; T = B+C*D-E
add  T,T,F    ; T = B+C*D-E+F
add  A,T,A    ; A = B+C*D-E+F+A
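The three-address sequence for A = B + C * D - E + F + A can be traced with a short sketch. The interpreter, the tuple encoding of instructions, and the sample values for A to F are assumptions made for illustration only.

```python
def run_three_address(program, memory):
    """Execute (op, dest, src1, src2) tuples: M(dest) = [src1] op [src2]."""
    ops = {
        "add": lambda x, y: x + y,
        "sub": lambda x, y: x - y,
        "mult": lambda x, y: x * y,
    }
    for op, dest, src1, src2 in program:
        memory[dest] = ops[op](memory[src1], memory[src2])
    return memory

three_address_program = [
    ("mult", "T", "C", "D"),  # T = C*D
    ("add", "T", "T", "B"),   # T = B+C*D
    ("sub", "T", "T", "E"),   # T = B+C*D-E
    ("add", "T", "T", "F"),   # T = B+C*D-E+F
    ("add", "A", "T", "A"),   # A = B+C*D-E+F+A
]
result = run_three_address(
    three_address_program, {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
)
```

Five instructions suffice here, but each one has to encode three addresses.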

Number of Addresses (cont.)

Two-address machines

One address doubles (for source operand and result)

Last example makes a case for it
- Address T is used twice

Sample instructions

load dest,src    M(dest) = [src]
add  dest,src    M(dest) = [dest] + [src]
sub  dest,src    M(dest) = [dest] - [src]
mult dest,src    M(dest) = [dest] * [src]

Two addresses: one address doubles as operand and result

Example: a = a + b

The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation

Number of Addresses (cont.)

Example

C statement

A = B + C * D - E + F + A

Equivalent code:

load T,C    ; T = C
mult T,D    ; T = C*D
add  T,B    ; T = B+C*D
sub  T,E    ; T = B+C*D-E
add  T,F    ; T = B+C*D-E+F
add  A,T    ; A = B+C*D-E+F+A
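The two-address version of the same statement can be sketched the same way. Again, the interpreter, the tuple encoding, and the sample values are illustrative assumptions.

```python
def run_two_address(program, memory):
    """Execute (op, dest, src) tuples; dest doubles as source and result."""
    for op, dest, src in program:
        if op == "load":
            memory[dest] = memory[src]
        elif op == "add":
            memory[dest] = memory[dest] + memory[src]
        elif op == "sub":
            memory[dest] = memory[dest] - memory[src]
        elif op == "mult":
            memory[dest] = memory[dest] * memory[src]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return memory

two_address_program = [
    ("load", "T", "C"),  # T = C
    ("mult", "T", "D"),  # T = C*D
    ("add", "T", "B"),   # T = B+C*D
    ("sub", "T", "E"),   # T = B+C*D-E
    ("add", "T", "F"),   # T = B+C*D-E+F
    ("add", "A", "T"),   # A = B+C*D-E+F+A
]
two_address_result = run_two_address(
    two_address_program, {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
)
```

The extra load at the start is the price of overwriting one operand: six shorter instructions instead of five longer ones.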

Number of Addresses (cont.)

One-address machines

Use special set of registers called accumulators
- Specify one source operand and receive the result

Called accumulator machines

Sample instructions

load  addr    accum = [addr]
store addr    M[addr] = accum
add   addr    accum = accum + [addr]
sub   addr    accum = accum - [addr]
mult  addr    accum = accum * [addr]

Number of Addresses (cont.)

Example

C statement

A = B + C * D - E + F + A

Equivalent code:

load  C    ; load C into accum
mult  D    ; accum = C*D
add   B    ; accum = C*D+B
sub   E    ; accum = B+C*D-E
add   F    ; accum = B+C*D-E+F
add   A    ; accum = B+C*D-E+F+A
store A    ; store accum contents in A
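A minimal accumulator-machine sketch for the same statement follows. The single `accum` variable stands in for the accumulator; the encoding and sample values are assumptions for illustration.

```python
def run_accumulator(program, memory):
    """Execute (op, addr) pairs against one implicit accumulator."""
    accum = 0
    for op, addr in program:
        if op == "load":
            accum = memory[addr]
        elif op == "store":
            memory[addr] = accum
        elif op == "add":
            accum = accum + memory[addr]
        elif op == "sub":
            accum = accum - memory[addr]
        elif op == "mult":
            accum = accum * memory[addr]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return memory

accumulator_program = [
    ("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
    ("add", "F"), ("add", "A"), ("store", "A"),
]
accumulator_result = run_accumulator(
    accumulator_program, {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
)
```

Every instruction names one memory address, so all seven instructions touch memory.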

Number of Addresses (cont.)

Zero-address machines

Stack supplies operands and receives the result
- Special instructions to load and store use an address

Called stack machines (Ex: HP3000, Burroughs B5500)

Sample instructions

push addr    push([addr])
pop  addr    pop([addr])
add          push(pop + pop)
sub          push(pop - pop)
mult         push(pop * pop)

Number of Addresses (cont.)

Example

C statement

A = B + C * D - E + F + A

Equivalent code:

push E    sub
push C    push F
push D    add
mult      push A
push B    add
add       pop A
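The stack version can be checked with a small sketch. It reads the two code columns top to bottom, left column first, and takes `sub` as push(pop - pop), so the value popped first is the minuend. The interpreter and the sample values are assumptions for illustration.

```python
def run_stack(program, memory):
    """Execute zero-address code; only push and pop carry an address."""
    stack = []
    for instr in program:
        op = instr[0]
        if op == "push":
            stack.append(memory[instr[1]])
        elif op == "pop":
            memory[instr[1]] = stack.pop()
        else:
            first = stack.pop()   # top of stack, popped first
            second = stack.pop()  # element below it
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)
            elif op == "mult":
                stack.append(first * second)
            else:
                raise ValueError(f"unknown opcode: {op}")
    return memory

stack_program = [
    ("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
    ("push", "B"), ("add",), ("sub",), ("push", "F"),
    ("add",), ("push", "A"), ("add",), ("pop", "A"),
]
stack_result = run_stack(
    stack_program, {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
)
```

E is pushed first so that it sits below B + C*D when `sub` executes.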

Load/Store Architecture

Instructions expect operands in internal processor registers
- Special LOAD and STORE instructions move data between registers and memory

RISC uses this architecture

Reduces instruction length

Load/Store Architecture (cont.)

Sample instructions

load  Rd,addr       Rd = [addr]
store addr,Rs       (addr) = Rs
add   Rd,Rs1,Rs2    Rd = Rs1 + Rs2
sub   Rd,Rs1,Rs2    Rd = Rs1 - Rs2
mult  Rd,Rs1,Rs2    Rd = Rs1 * Rs2

Number of Addresses (cont.)

Example

C statement

A = B + C * D - E + F + A

Equivalent code:

load R1,B    mult  R2,R2,R3
load R2,C    add   R2,R2,R1
load R3,D    sub   R2,R2,R4
load R4,E    add   R2,R2,R5
load R5,F    add   R2,R2,R6
load R6,A    store A,R2
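A load/store sketch of the same statement is given below. The register names R1 to R6, the tuple encoding, and the sample values are assumptions for illustration; the sketch also counts memory accesses to show that only load and store touch memory.

```python
def run_load_store(program, memory):
    """Execute load/store code; arithmetic works on registers only."""
    regs = {}
    memory_accesses = 0
    ops = {
        "add": lambda x, y: x + y,
        "sub": lambda x, y: x - y,
        "mult": lambda x, y: x * y,
    }
    for instr in program:
        op = instr[0]
        if op == "load":
            _, rd, addr = instr
            regs[rd] = memory[addr]
            memory_accesses += 1
        elif op == "store":
            _, addr, rs = instr
            memory[addr] = regs[rs]
            memory_accesses += 1
        else:
            _, rd, rs1, rs2 = instr
            regs[rd] = ops[op](regs[rs1], regs[rs2])
    return memory, memory_accesses

load_store_program = [
    ("load", "R1", "B"), ("load", "R2", "C"), ("load", "R3", "D"),
    ("load", "R4", "E"), ("load", "R5", "F"), ("load", "R6", "A"),
    ("mult", "R2", "R2", "R3"), ("add", "R2", "R2", "R1"),
    ("sub", "R2", "R2", "R4"), ("add", "R2", "R2", "R5"),
    ("add", "R2", "R2", "R6"), ("store", "A", "R2"),
]
load_store_memory, accesses = run_load_store(
    load_store_program, {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
)
```

The program is longer (twelve instructions), but the five arithmetic instructions need only short register fields.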

Flow of Control

Default is sequential flow

Several instructions alter this default execution

Branches
- Unconditional
- Conditional
- Delayed branches

Procedure calls
- Delayed procedure calls

Flow of Control (cont.)

Branches

Unconditional
- Absolute address
- PC-relative
  • Target address is specified relative to PC contents
  • Relocatable code

Example: MIPS
- Absolute address
  j target
- PC-relative
  b target
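Why PC-relative branches give relocatable code can be seen in a simplified sketch. The addresses and the helper names are assumptions; real instruction sets typically measure the offset from the instruction after the branch and may scale it, which is omitted here.

```python
def absolute_target(encoded_address):
    """An absolute branch jumps to the address stored in the instruction."""
    return encoded_address

def pc_relative_target(pc, offset):
    """A PC-relative branch jumps to the current PC plus a stored offset."""
    return pc + offset

branch_pc = 0x00400010   # where the branch instruction sits (assumed)
label = 0x00400030       # where the target label sits (assumed)
offset = label - branch_pc

# Move the whole code block 0x1000 bytes higher without re-encoding anything.
shift = 0x1000
relocated_relative = pc_relative_target(branch_pc + shift, offset)
relocated_absolute = absolute_target(label)
```

After the move, the PC-relative branch still reaches the label at its new address, while the absolute branch still points at the old one and would need patching.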


Flow of Control (cont.)

Branches

Conditional
- Jump is taken only if the condition is met

Two types
- Set-Then-Jump
  • Condition testing is separated from branching
  • Condition code registers are used to convey the condition test result
  • Condition code registers keep a record of the status of the last ALU operation, such as overflow condition
- Example: Pentium code
  cmp AX,BX    ; compare AX and BX
  je  target   ; jump if equal

Flow of Control (cont.)

- Test-and-Jump
  • Single instruction performs condition testing and branching
- Example: MIPS instruction
  beq Rsrc1,Rsrc2,target
  Jumps to target if Rsrc1 = Rsrc2

Delayed branching

Control is transferred after executing the instruction that follows the branch instruction
- This instruction slot is called delay slot

Improves efficiency

Highly pipelined RISC processors support delayed branching
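Delayed branching can be sketched with a toy instruction trace. The program encoding (a list of tuples, with `("branch", target_index)` for a branch) is an assumption for illustration and models a single delay slot.

```python
def run_with_delay_slot(program):
    """Return the names of executed instructions, honoring one delay slot."""
    trace = []
    pc = 0
    pending_target = None  # set by a branch, applied after the delay slot
    while pc < len(program):
        instr = program[pc]
        trace.append(instr[0])
        next_pc = pc + 1
        if pending_target is not None:
            # The delay-slot instruction has just executed: transfer now.
            next_pc = pending_target
            pending_target = None
        elif instr[0] == "branch":
            pending_target = instr[1]
        pc = next_pc
    return trace

delay_program = [("add",), ("branch", 4), ("sub",), ("mult",), ("store",)]
trace = run_with_delay_slot(delay_program)
```

The `sub` that follows the branch still executes; `mult` is skipped because control reaches index 4 only after the delay slot.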

Flow of Control (cont.)

Procedure calls

Facilitate modular programming

Require two pieces of information to return
- End of procedure
  • Pentium uses ret instruction
  • MIPS uses jr instruction
- Return address
  • In a (special) register: MIPS allows any general-purpose register
  • On the stack: Pentium

Flow of Control (cont.)

[Figure: delay slot]

Parameter Passing

Two basic techniques

Register-based (e.g., PowerPC, MIPS)
- Internal registers are used
  • Faster
  • Limit the number of parameters
  • Recursive procedure

Stack-based (e.g., Pentium)
- Stack is used
  • More general

2 Operand Types

Instructions support basic data types
- Characters
- Integers
- Floating-point

Instruction overload
- Same instruction for different data types
- Example: Pentium
  mov AL,address     ; loads an 8-bit value
  mov AX,address     ; loads a 16-bit value
  mov EAX,address    ; loads a 32-bit value

Operand Types (cont.)

Separate instructions
- Instructions specify the operand size
- Example: MIPS
  lb Rdest,address    ; loads a byte
  lh Rdest,address    ; loads a halfword (16 bits)
  lw Rdest,address    ; loads a word (32 bits)
  ld Rdest,address    ; loads a doubleword (64 bits)

Similar instructions for store

3 Addressing Modes

How the operands are specified

Operands can be in three places
- Registers
  • Register addressing mode
- Part of instruction
  • Constant
  • Immediate addressing mode
  • All processors support these two addressing modes
- Memory
  • Difference between RISC and CISC
  • CISC supports a large variety of addressing modes
  • RISC follows load/store architecture

4 Instruction Types

Several types of instructions

Data movement
- Pentium: mov dest,src
- Some do not provide direct data movement instructions
- Indirect data movement
  add Rdest,Rsrc,0    ; Rdest = Rsrc+0

Arithmetic and Logical
- Arithmetic
  • Integer and floating-point, signed and unsigned
  • add, subtract, multiply, divide
- Logical
  • and, or, not, xor

Instruction Types (cont.)

Condition code bits

S: Sign bit (0 = +, 1 = -)
Z: Zero bit (0 = nonzero, 1 = zero)
O: Overflow bit (0 = no overflow, 1 = overflow)
C: Carry bit (0 = no carry, 1 = carry)

Example: Pentium

cmp count,25    ; compare count to 25
                ; subtract 25 from count
je  target      ; jump if equal
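How a compare sets the four condition code bits can be sketched as below. The function models a compare as a subtraction on 8-bit values whose result is discarded; the function name, the 8-bit width, and the carry-as-borrow convention (the one used by x86) are assumptions for illustration.

```python
def compare_flags(a, b, bits=8):
    """Return the S, Z, O, C bits produced by computing a - b."""
    mask = (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    sign = int(bool(result & sign_bit))
    zero = int(result == 0)
    carry = int(a < b)  # a borrow was needed in the unsigned subtraction
    # Signed overflow: operands differ in sign and the result's sign
    # differs from the sign of the first operand.
    overflow = int(bool((a ^ b) & (a ^ result) & sign_bit))
    return {"S": sign, "Z": zero, "O": overflow, "C": carry}

equal = compare_flags(25, 25)
smaller = compare_flags(10, 25)
wrapped = compare_flags(0x80, 1)  # -128 - 1 does not fit in 8 signed bits
```

A jump-if-equal instruction would then be taken exactly when the Z bit is 1.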

Instruction Types (cont.)

Flow control and I/O instructions
- Branch
- Procedure call
- Interrupts

I/O instructions
- Memory-mapped I/O
  • Most processors support memory-mapped I/O
  • No separate instructions for I/O
- Isolated I/O
  • Pentium supports isolated I/O
  • Separate I/O instructions
    in  AX,io_port    ; read from an I/O port
    out io_port,AX    ; write to an I/O port

5 Instruction Formats

Two types

Fixed-length
- Used by RISC processors
- 32-bit RISC processors use 32-bit wide instructions
  • Examples: SPARC, MIPS, PowerPC

Variable-length
- Used by CISC processors
- Memory operands need more bits to specify

Opcode
- Major and exact operation

Examples of Instruction Formats

RISC (Reduced Instruction Set Computer) versus CISC (Complex Instruction Set Computer)

RISC vs CISC

The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute

RISC systems access memory only with explicit load and store instructions

In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable

RISC vs CISC

The difference between CISC and RISC becomes evident through the basic computer performance equation:

time/program = (time/cycle) × (cycles/instruction) × (instructions/program)

RISC systems shorten execution time by reducing the clock cycles per instruction

CISC systems improve performance by reducing the number of instructions per program
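The trade-off can be made concrete with the usual form of the performance equation: execution time is the instruction count times the cycles per instruction times the cycle time. In the sketch below every number is invented purely for illustration and does not describe any real processor.

```python
def execution_time(instructions, cycles_per_instruction, cycle_time_ns):
    """Execution time in ns = instructions x CPI x time per cycle."""
    return instructions * cycles_per_instruction * cycle_time_ns

# Hypothetical CISC-style run: fewer instructions, more cycles each.
cisc_time = execution_time(
    instructions=100, cycles_per_instruction=6.0, cycle_time_ns=2.0
)

# Hypothetical RISC-style run: more instructions, fewer cycles each.
risc_time = execution_time(
    instructions=250, cycles_per_instruction=1.5, cycle_time_ns=1.0
)
```

With these made-up figures the RISC-style run finishes sooner even though it executes more instructions; different figures could reverse the outcome.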

RISC vs CISC

The simple instruction set of RISC machines enables control units to be hardwired for maximum speed

The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time

With fixed-length instructions, RISC lends itself to pipelining and speculative execution

RISC vs CISC

Consider the following program fragments:

CISC
mov ax, 10
mov bx, 5
mul bx, ax

RISC
mov ax, 0
mov bx, 10
mov cx, 5
Begin: add ax, bx
loop Begin

The total clock cycles for the CISC version might be:
(2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles

While the clock cycles for the RISC version are:
(3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles

With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds
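The cycle counts for the two fragments can be reproduced with a small sketch. It assumes a 30-cycle multiply for the CISC version and one cycle for every mov, add, and loop instruction; these costs and the function names are assumptions for illustration.

```python
def cisc_multiply(a, b, mul_cycles=30):
    """Two 1-cycle moves followed by one multi-cycle multiply."""
    cycles = 2 * 1 + mul_cycles
    return a * b, cycles

def risc_multiply(a, b):
    """Multiply by repeated addition, one cycle per instruction."""
    cycles = 3            # mov ax,0 ; mov bx,a ; mov cx,b
    ax, bx, cx = 0, a, b
    while cx > 0:
        ax += bx          # add ax, bx
        cx -= 1           # loop Begin: decrement cx, branch if not zero
        cycles += 2       # one cycle for add, one for loop
    return ax, cycles

cisc_product, cisc_cycles = cisc_multiply(10, 5)
risc_product, risc_cycles = risc_multiply(10, 5)
```

Both versions compute the same product; the RISC cycle count grows with the loop count, so the comparison depends on the operand values as well as on the assumed multiply cost.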

RISC vs CISC

Because of their load-store ISAs, RISC architectures require a large number of CPU registers

These registers provide fast access to data during sequential program execution

They can also be employed to reduce the overhead typically caused by passing parameters to subprograms

Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers

RISC vs CISC

This is how registers can be overlapped in a RISC system

The current window pointer (CWP) points to the active register window

RISC vs CISC

It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures

Some RISC systems provide more extravagant instruction sets than some CISC systems

Some systems combine both approaches

The following two slides summarize the characteristics that traditionally typify the differences between these two architectures

RISC vs CISC

RISC
- Multiple register sets
- Three operands per instruction
- Parameter passing through register windows
- Single-cycle instructions
- Hardwired control
- Highly pipelined

CISC
- Single register set
- One or two register operands per instruction
- Parameter passing through memory
- Multiple cycle instructions
- Microprogrammed control
- Less pipelined

Continued...

RISC vs CISC

RISC
- Simple instructions, few in number
- Fixed length instructions
- Complexity in compiler
- Only LOAD/STORE instructions access memory
- Few addressing modes

CISC
- Many complex instructions
- Variable length instructions
- Complexity in microcode
- Many instructions can access memory
- Many addressing modes

Summary

Instruction Set Design Issues

Instruction set design issues include:

Where are operands stored?
- registers, memory, stack, accumulator

How many explicit operands are there?
- 0, 1, 2, or 3

How is the operand location specified?
- register, immediate, indirect, . . .

What type and size of operands are supported?
- byte, int, float, double, string, vector, . . .

What operations are supported?
- add, sub, mul, move, compare, . . .

More About General Purpose Registers

Why do almost all new architectures use GPRs?

Registers are much faster than memory (even cache)
- Register values are available immediately
- When memory isn't ready, processor must wait ("stall")

Registers are convenient for variable storage
- Compiler assigns some variables just to registers
- More compact code since small fields specify registers (compared to memory addresses)

[Figure: Processor (Registers), Cache, Memory, Disk]

What Operations are Needed

Arithmetic + Logical
- Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
- Logical operation: AND, OR, XOR, NOT

Data Transfer - copy, load, store

Control - branch, jump, call, return

Floating Point - ADDF, MULF, DIVF
- Same as arithmetic but usually take bigger operands

Decimal - ADDD, CONVERT

String - move, compare, search

Graphics - pixel and vertex, compression/decompression operations

Stack Architecture: Pros and Cons

Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures

Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures

Accumulator Architecture: Pros and Cons

Pros
- Very low hardware requirements
- Easy to design and understand

Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic

Memory-Memory Architecture: Pros and Cons

Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)

Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required

Memory-Register Architecture: Pros and Cons

Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density

Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers

Instruction and Data Memory: Unified or Separate

[Figure: a computer program (instructions) in the programmer's view (ADD, SUBTRACT, AND, OR, COMPARE, ...) and in the computer's view, held in memory connected to the CPU and I/O]

Princeton (Von Neumann) Architecture
- Data and instructions mixed in the same unified memory
- Program as data
- Storage utilization
- Single memory interface

Harvard Architecture
- Data and instructions in separate memories
- Has advantages in certain high performance implementations
- Can optimize each memory

Classifying Instruction Set Architectures

There are four types of internal storage used by the processor to store operands explicitly and implicitly for execution of a program:
1. Stack
2. Accumulator
3. Set of Registers (Register-Memory)
4. Set of Registers (Register-Register/load-store)

The operands in a stack architecture are implicitly on the top of the stack, and in an accumulator architecture one operand is implicitly the accumulator. The general-purpose register architectures (Register-Memory and Register-Register) have only explicit operands, either in registers or memory locations

Basic Addressing Classes

[Figure: basic addressing classes, annotated "Declining cost of registers"]

Operand locations for four instruction set architecture classes

The arrows indicate whether the operand is an input or the result of the ALU operation, or both an input and result. Lighter shades indicate inputs, and the dark shade indicates the result. In (a), a Top Of Stack register (TOS) points to the top input operand, which is combined with the operand below. The first operand is removed from the stack, the result takes the place of the second operand, and TOS is updated to point to the result. All operands are implicit. In (b), the Accumulator is both an implicit input operand and a result. In (c), one input operand is a register, one is in memory, and the result goes to a register. All operands are registers in (d) and, like the stack architecture, can be transferred to memory only via separate instructions: push or pop for (a) and load or store for (d)

Code Sequence for C = A + B

Stack     Accumulator   Register-memory   Register-register
Push A    Load A        Load R1,A         Load R1,A
Push B    Add B         Add R3,R1,B       Load R2,B
Add       Store C       Store R3,C        Add R3,R1,R2
Pop C                                     Store R3,C

The code sequence for C = A + B for four classes of instruction sets. Note that the Add instruction has implicit operands for stack and accumulator architectures, and explicit operands for register architectures. It is assumed that A, B, and C all belong in memory and that the values of A and B cannot be destroyed

Stack Architectures

Accumulator Architectures

Register-Set Architectures

Register-to-Register: Load-Store Architectures

Register-to-Memory Architectures

Memory-to-Memory Architectures

Instruction Formats

Instruction Set Architecture (ISA)

To command a computer's hardware, you must speak its language

The words of a machine's language are called instructions, and its vocabulary is called the instruction set

Once you learn one machine language, it is easy to pick up others:
- There are few fundamental operations that all computers must provide
- All designers have the same goal of finding a language that simplifies building the hardware and the compiler while maximizing performance and minimizing cost

Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept

The MIPS instruction set is used as a case study

Interface Design

A good interface:
- Lasts through many implementations (portability, compatibility)
- Is used in many different ways (generality)
- Provides convenient functionality to higher levels
- Permits an efficient implementation at lower levels

Design decisions must take into account:
- Technology
- Machine organization
- Programming languages
- Compiler technology
- Operating systems

[Figure: an interface with several uses (use) above it and several implementations (imp) below it, along a time axis]

Classifying Instruction Set Architectures

Accumulator Architecture
• Common in early stored-program computers, when hardware was so expensive
• Machine has only one register (accumulator) involved in all math and logical operations
• All operations assume the accumulator as a source operand and a destination for the operation, with the other operand stored in memory

Extended Accumulator Architecture
• Dedicated registers for specific operations, e.g., stack and array index registers, added
• The 8086 microprocessor is an example of such a special-purpose register arch.

General-Purpose Register Architecture
• MIPS is an example of such an arch., where registers are not tied to a single role
• This type of instruction set can be further divided into:
  • Register-memory: allows for one operand to be in memory
  • Register-register (load-store): demands all operands to be in registers

Machine          # general-purpose registers   Architecture style                Year
Motorola 6800    2                             Accumulator                       1974
DEC VAX          16                            Register-memory, memory-memory    1977
Intel 8086       1                             Extended accumulator              1978
Motorola 68000   16                            Register-memory                   1980
Intel 80386      8                             Register-memory                   1985
PowerPC          32                            Load-store                        1992
DEC Alpha        32                            Load-store                        1992

Compact Code and Stack Architectures

When memory was scarce, machines like the Intel 80x86 had variable-length instructions to match varying operand specifications and minimize code size

Stack machines abandoned registers altogether, arguing that it is hard for compilers to use them efficiently

Operands are to be pushed on a stack from memory, and the results have to be popped from the stack to memory

Operations take their operands by default from the top of the stack and insert the results back onto the stack

Stack machines simplify compilers and lend themselves to a compact instruction encoding, but limit compiler optimization (e.g., in math expressions)

Example: A = B + C
Push AddressC    ; Top=Top+4; Stack[Top]=Memory[AddressC]
Push AddressB    ; Top=Top+4; Stack[Top]=Memory[AddressB]
add              ; Stack[Top-4]=Stack[Top]+Stack[Top-4]; Top=Top-4
Pop AddressA     ; Memory[AddressA]=Stack[Top]; Top=Top-4

Compact code is important for heralded network computers, where programs must be downloaded over the Internet (e.g., Java-based applications)

Other types of Architecture

High-Level-Language Architecture
• In the 1960s, systems software was rarely written in high-level languages, and virtually every commercial operating system before Unix was written in assembly
• Some people blamed the code density on the instruction set rather than the programming language
• A machine design philosophy was advocated with the goal of making the hardware more like high-level languages
• The effectiveness of high-level languages, memory size limitation and lack of efficient compilers doomed this philosophy to a historical footnote

Reduced Instruction Set Architecture
• With the recent development in compiler technology and expanded memory sizes, fewer programmers are using assembly level coding
• Instruction set architecture became measurable in the way compilers, rather than programmers, use them
• RISC architecture favors simplifying hardware design over enriching the offered set of instructions, relying on compilers to effectively use them to perform complex operations
• Virtually all new architectures since 1982 follow the RISC philosophy of fixed instruction lengths, load-store operations, and limited addressing modes


Evolution of Instruction Sets

Single Accumulator (EDSAC, 1950)

Accumulator + Index Registers (Manchester Mark I, IBM 700 series, 1953)

Separation of Programming Model from Implementation:
  High-Level Language Based (B5000, 1963)
  Concept of a Family (IBM 360, 1964)

General Purpose Register Machines:
  Complex Instruction Sets (VAX, Intel 432, 1977-80)
  Load/Store Architecture (CDC 6600, Cray 1, 1963-76)

RISC (MIPS, SPARC, HP-PA, IBM RS6000, ... 1987)

Register-Memory Architectures

Effect of the number of memory operands:

Number of memory addresses | Max. number of operands | Examples
0 | 3 | SPARC, MIPS, PowerPC, ALPHA
1 | 2 | Intel 80x86, Motorola 68000
2 | 2 | VAX (also has 3-operand formats)
3 | 3 | VAX (also has 2-operand formats)

Interpreting Memory Addressing

The address of a word matches the byte address of one of its 4 bytes.

The addresses of sequential words differ by 4 (the word size in bytes).

Word addresses are multiples of 4 (alignment restriction).

Machines that use the address of the leftmost byte as the word address are called "Big Endian", and those that use the rightmost byte are called "Little Endian".

Misalignment complicates memory access and causes programs to run slower. (Some machines do not allow misaligned memory access at all.)

Byte ordering can be a problem when exchanging data among different machines.

Byte addressing affects array index calculation, which must account for word addressing and the offset within the word.

Object addressed | Aligned at byte offsets | Misaligned at byte offsets
Byte | 0, 1, 2, 3, 4, 5, 6, 7 | Never
Half word | 0, 2, 4, 6 | 1, 3, 5, 7
Word | 0, 4 | 1, 2, 3, 5, 6, 7
Double word | 0 | 1, 2, 3, 4, 5, 6, 7
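As a minimal sketch of the two ideas above, the alignment rule reduces to a remainder test and byte ordering can be shown with Python's standard struct module. The helper names are hypothetical.

```python
import struct


def is_aligned(address, size):
    """An object of `size` bytes is aligned when its address is a multiple of size."""
    return address % size == 0


def word_bytes(value, big_endian):
    """Return the 4 bytes of a 32-bit word in memory order for either byte ordering."""
    fmt = ">I" if big_endian else "<I"
    return struct.pack(fmt, value)
```

For the word 0x0A0B0C0D, a Big Endian machine stores byte 0A at the lowest address, while a Little Endian machine stores 0D there.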


Addressing Modes

Addressing modes specify how the location of an operand (its effective address) is determined.

Addressing modes can:
  Significantly reduce instruction counts
  Increase the average CPI
  Increase the complexity of building a machine

The VAX machine is used for benchmark data since it supports a wide range of memory addressing modes.

Common addressing modes can be classified based on:
  the source of the data: register, immediate or memory
  the address calculation: direct or indirect

An indexed addressing mode is usually provided to allow efficient implementation of loops and array access.


Example of Addressing Modes

Addressing mode | Example | Meaning | When used
Register | Add R4, R3 | Regs[R4] ← Regs[R4] + Regs[R3] | When a value is in a register
Immediate | Add R4, #3 | Regs[R4] ← Regs[R4] + 3 | For constants
Displacement | Add R4, 100(R1) | Regs[R4] ← Regs[R4] + Mem[100 + Regs[R1]] | Accessing local variables
Register indirect | Add R4, (R1) | Regs[R4] ← Regs[R4] + Mem[Regs[R1]] | Accessing using a pointer or a computed address
Indexed | Add R4, (R1 + R2) | Regs[R4] ← Regs[R4] + Mem[Regs[R1] + Regs[R2]] | Sometimes useful in array addressing: R1 = base of the array; R2 = index amount
Direct or absolute | Add R4, (1001) | Regs[R4] ← Regs[R4] + Mem[1001] | Sometimes useful for accessing static data; address constant may need to be large
Memory indirect or memory deferred | Add R4, @(R3) | Regs[R4] ← Regs[R4] + Mem[Mem[Regs[R3]]] | If R3 is the address of the pointer p, then mode yields *p
Autoincrement | Add R4, (R2)+ | Regs[R4] ← Regs[R4] + Mem[Regs[R2]]; Regs[R2] ← Regs[R2] + d | Useful for stepping through arrays within a loop. R2 points to start of the array; each reference increments R2 by d
Autodecrement | Add R4, -(R2) | Regs[R2] ← Regs[R2] - d; Regs[R4] ← Regs[R4] + Mem[Regs[R2]] | Same use as autoincrement. Autodecrement/increment can also act as push/pop to implement a stack
Scaled | Add R4, 100(R2)[R3] | Regs[R4] ← Regs[R4] + Mem[100 + Regs[R2] + Regs[R3] × d] | Used to index arrays
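The "Meaning" column of the addressing-mode table can be read as a lookup rule for the source operand. The sketch below models several of the modes with dictionaries for registers and memory; the function name and keyword arguments are assumptions for illustration, and the autoincrement/autodecrement side effects are left out.

```python
def fetch_operand(mode, regs, mem, **f):
    """Return the source operand value selected by one addressing mode."""
    if mode == "register":
        return regs[f["r"]]
    if mode == "immediate":
        return f["imm"]
    if mode == "displacement":
        return mem[f["disp"] + regs[f["r"]]]
    if mode == "register_indirect":
        return mem[regs[f["r"]]]
    if mode == "indexed":
        return mem[regs[f["base"]] + regs[f["index"]]]
    if mode == "direct":
        return mem[f["addr"]]
    if mode == "memory_indirect":
        return mem[mem[regs[f["r"]]]]
    if mode == "scaled":
        # d is the operand size in bytes
        return mem[f["disp"] + regs[f["base"]] + regs[f["index"]] * f["d"]]
    raise ValueError("unknown addressing mode: " + mode)
```

An Add instruction would then add the returned value to Regs[R4], whichever mode selected it.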


Addressing Mode for Signal Processing

DSP offers special addressing modes to better serve popular algorithms.

Modulo addressing

Since DSP deals with continuous data streams, circular buffers are widely used.

Circular or modulo addressing allows automatic increment and decrement, and resets the pointer when it reaches the end of the buffer.

Reverse addressing (Fast Fourier Transform)

The resulting address is the bit-reversed order of the current address:

0 (000) → 0 (000)
1 (001) → 4 (100)
2 (010) → 2 (010)
3 (011) → 6 (110)
4 (100) → 1 (001)
5 (101) → 5 (101)
6 (110) → 3 (011)
7 (111) → 7 (111)

Reverse addressing mode expedites an access that would otherwise require a number of logical instructions or extra memory accesses.

Special features require either hand coding or a compiler that uses such features (GNU would not be a good choice).
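Both DSP addressing modes can be imitated in software, which also shows the extra work the hardware modes save. This is a minimal sketch; the function names and the buffer parameters are illustrative assumptions.

```python
def bit_reverse(index, bits):
    """Reverse the lowest `bits` bits of index, as in FFT reverse addressing."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)  # move the lowest bit to the result
        index >>= 1
    return result


def circular_step(pointer, step, base, length):
    """Advance a pointer inside a circular buffer of `length` entries starting at base."""
    return base + (pointer - base + step) % length
```

For an 8-point FFT, bit_reverse(i, 3) reproduces the mapping listed above, and circular_step wraps a pointer back to the start of the buffer when it runs past the end.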


Operations of the Computer Hardware

"There must certainly be instructions for performing the fundamental arithmetic operations."

Burks, Goldstine and von Neumann, 1947

Assembly language is a symbolic representation of what the processor actually understands.

The MIPS assembler allows only one instruction per line and ignores comments, which follow # until the end of the line.

Example: translation of a segment of a C program to MIPS assembly instructions

C: f = (g + h) - (i + j)

MIPS:

add t0, g, h    # temp variable t0 contains "g + h"
add t1, i, j    # temp variable t1 contains "i + j"
sub f, t0, t1   # f = t0 - t1 = (g + h) - (i + j)

Operations in the Instruction Set

Operator type | Examples
Arithmetic and logical | Integer arithmetic and logical operations: add, and, subtract, or
Data transfer | Loads-stores (move instructions on machines with memory addressing)
Control | Branch, jump, procedure call and return, trap
System | Operating system call, virtual memory management instructions
Floating point | Floating point instructions: add, multiply
Decimal | Decimal add, decimal multiply, decimal to character conversion
String | String move, string compare, string search
Graphics | Pixel operations, compression/decompression operations

Arithmetic, logical, data transfer and control are almost standard categories for all machines.

System instructions are required for multi-programming environments, although support for system functions varies.

Decimal and string instructions can be primitives, e.g. on the IBM 360 and the VAX.

Support for floating point, decimal, string and graphics is optional and sometimes provided via a co-processor.

Some machines rely on the compiler to synthesize special operations such as string handling from simpler instructions.


Operations for Media & Signal Processing

Single instruction multiple data (SIMD) and vector instructions are often supported in DSPs, which are commonly used in multimedia and signal processing applications.

Partitioned add (integer)
  Performs multiple narrow additions on a 64-bit ALU, since most data are narrow
  Increases ALU throughput for multimedia applications

Paired single operations (float)
  Allow the same register to act as two operands of the same operation
  Handy in dealing with vertices and coordinates

Multiply and accumulate
  Very handy for calculating dot products of vectors (signal processing) and matrix multiplication
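A rough software model of two of these operations is sketched below. The 16-bit lane width is an assumption chosen for illustration (the lane width on the slide did not survive extraction), and both function names are hypothetical.

```python
def partitioned_add(a, b, lane_bits=16, word_bits=64):
    """Add two 64-bit words lane by lane; carries do not cross lane boundaries."""
    mask = (1 << lane_bits) - 1
    result = 0
    for shift in range(0, word_bits, lane_bits):
        lane = ((a >> shift) & mask) + ((b >> shift) & mask)
        result |= (lane & mask) << shift  # the carry out of a lane is discarded
    return result


def dot_product(xs, ys):
    """Dot product written as a chain of multiply-accumulate steps."""
    acc = 0
    for x, y in zip(xs, ys):
        acc += x * y  # one multiply-accumulate per element pair
    return acc
```

In hardware the lanes are added in parallel by breaking the carry chain of the wide ALU, so one instruction does the work of the whole loop.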

Frequency of Operations Usage

Make the common case fast by focusing on these operations.

The most widely executed instructions are the simple operations of an instruction set.

The following is the average usage in SPECint92 on Intel 80x86:

Rank | 80x86 instruction | Integer average (% total executed)
1 | Load | 22%
2 | Conditional branch | 20%
3 | Compare | 16%
4 | Store | 12%
5 | Add | 8%
6 | And | 6%
7 | Sub | 5%
8 | Move register-register | 4%
9 | Call | 1%
10 | Return | 1%
Total | | 96%


Control Flow Instructions

Jump: an unconditional change in the control flow

Branch: a conditional change in the control flow

Procedure calls and returns

Data is based on SPEC on Alpha.


Destination Address Definition

Addressing relative to the program counter proved to be the best choice for forward and backward branches or jumps, because the code is independent of its load address.

To allow for dynamic loading of library routines, register indirect addressing allows addresses to be loaded into special registers (e.g. virtual functions in C++ and system calls in a case statement).

Data is based on SPEC on Alpha.


Condition Evaluation

Compare-and-branch can be efficient if the majority of conditions are comparisons with zero.

Remember to focus on the common case.

Based on SPEC on MIPS.


Frequency of Types of Comparison

Data is based on SPEC on Alpha.

A different benchmark and machine set new design priorities.

DSPs support a repeat instruction for for-loops (vectors) using registers.


Supporting Procedures

Execution of a procedure follows these steps:
  Store parameters in a place accessible to the procedure
  Transfer control to the procedure
  Acquire the storage resources needed for the procedure
  Perform the desired task
  Store the result value in a place accessible to the calling program
  Return control to the point of origin

The hardware provides a program counter to trace instruction flow and manage transfer of control.

Parameter Passing

Registers can be used for passing a small number of parameters.

A stack is used to spill registers of the current context, making room for the called procedure to run, and to allow large parameters to be passed.

Storage of machine state can be performed by the caller or the callee.

Handling of shared variables is important to ensure correct semantics and thus requires clear specifications in the library interface.

Global variables stored in registers need careful handling.


Type and Size of Operands

The type of an operand is designated by encoding it in the instruction's operation code.

The type of an operand, e.g. single precision float, effectively gives its size.

Common operand types include character, half word and word size integer, and single- and double-precision floating point.

Characters are almost always in ASCII, integers in 2's complement and floating point in IEEE 754.

The 16-bit Unicode used in Java is gaining popularity due to its support for international character sets.

For business applications, some architectures support a decimal format in binary coded decimal (BCD).

Depending on the size of the word, the complexity of handling different operand types differs.

DSP offers fixed point data types to support high precision floating point arithmetic and to allow sharing a single exponent among multiple numbers.

For graphics applications, vertex and pixel operands are added features.
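The two number representations named above can be inspected directly. The sketch below uses Python's standard struct module; the helper names are hypothetical.

```python
import struct


def twos_complement(value, bits=32):
    """Return the unsigned bit pattern that encodes value in 2's complement."""
    return value & ((1 << bits) - 1)


def float_bits(x):
    """Return the 32-bit IEEE 754 single-precision pattern of x as an integer."""
    return struct.unpack(">I", struct.pack(">f", x))[0]
```

For example, -1 is all ones in 2's complement, and 1.0 in single precision has a zero sign bit, a biased exponent of 127 and a zero fraction.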

Size of Operands

The double-word data type is used for double-precision floating point operations and for address storage in machines with a 64-bit wide address bus.

Words are used for integer operations and on machines with a 32-bit address bus.

For these reasons, word and double-word data types dominate the mix in SPEC.

Frequency of reference by size, based on SPEC2000 on Alpha.


Instruction Representation

Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary, or base 2).

Numbers are stored in computers as a series of high and low electronic signals (binary numbers).

Binary digits are called bits and are considered the atoms of computing.

Each piece of an instruction is a number, and placing these numbers together forms the instruction.

The assembler translates the symbolic assembly instructions into machine language instructions (machine code).

Example:

Assembly: add $t0, $s1, $s2

M/C language (decimal): 0 17 18 8 0 32

M/C language (binary): 000000 10001 10010 01000 00000 100000

Field widths: 6 bits, 5 bits, 5 bits, 5 bits, 5 bits, 6 bits

Note: the MIPS compiler by default maps $s0, ..., $s7 to registers 16-23 and $t0, ..., $t7 to registers 8-15.
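Placing the six field numbers together can be sketched as a shift-and-or loop. The function name is hypothetical; the field widths are the 6, 5, 5, 5, 5, 6 bits listed above.

```python
def encode_r(op, rs, rt, rd, shamt, funct):
    """Pack the six R-format fields into one 32-bit instruction word."""
    fields = ((op, 6), (rs, 5), (rt, 5), (rd, 5), (shamt, 5), (funct, 6))
    word = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError("field value does not fit in its width")
        word = (word << width) | value  # append this field to the right
    return word
```

With the register mapping in the note, add $t0, $s1, $s2 corresponds to encode_r(0, 17, 18, 8, 0, 32).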


Encoding an Instruction Set

Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation.

The operation is typically specified in one field, called the opcode.

The addressing mode for an operand can be encoded with the operation, or specified through a separate identifier when a large number of modes is supported.

The architect must balance several competing factors:
  Desire to support as many registers and addressing modes as possible
  Effect of operand specification on the size of the instruction (and program)
  Desire to simplify instruction fetching and decoding during execution

Fixed-size instruction encoding simplifies the CPU design while limiting the addressing modes supported.

An architect who cares about code size can use variable-size encoding.

A hybrid approach allows some variability by supporting multiple instruction sizes.

Encoding Examples

MIPS Instruction Format

Register-format instructions

op | rs | rt | rd | shamt | funct
6 bits | 5 bits | 5 bits | 5 bits | 5 bits | 6 bits

op: basic operation of the instruction, traditionally called the opcode
rs: the first register source operand
rt: the second register source operand
rd: the register destination operand; it gets the result of the operation
shamt: shift amount
funct: selects the specific variant of the operation in the op field

Immediate-type instructions

op | rs | rt | address
6 bits | 5 bits | 5 bits | 16 bits

Some instructions need longer fields than the register format provides, for example to hold a large constant.

The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register.

Example: lw $t0, 32($s3)   # temporary register $t0 gets A[8]

Instruction | Format | op | rs | rt | rd | shamt | funct | address
add | R | 0 | reg | reg | reg | 0 | 32 | N/A
sub | R | 0 | reg | reg | reg | 0 | 34 | N/A
lw | I | 35 | reg | reg | N/A | N/A | N/A | address
sw | I | 43 | reg | reg | N/A | N/A | N/A | address
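The immediate format and its ±2^15 reach can be sketched the same way. The function name is hypothetical, and the sketch assumes the 16-bit address field holds a signed offset stored in 2's complement.

```python
def encode_i(op, rs, rt, offset):
    """Pack an I-format instruction; offset must fit in 16 signed bits."""
    if not -(1 << 15) <= offset < (1 << 15):
        raise ValueError("offset does not fit in 16 signed bits")
    # the low 16 bits carry the offset in 2's complement
    return (op << 26) | (rs << 21) | (rt << 16) | (offset & 0xFFFF)
```

For lw $t0, 32($s3), the opcode is 35, the base register $s3 is register 19 and the target $t0 is register 8, so the call would be encode_i(35, 19, 8, 32).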


The Stored Program Concept

Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept.

Today's computers are built on two key principles:
  Instructions are represented as numbers
  Programs can be stored in memory to be read or written just like numbers

The power of the concept: memory can contain
  the source code for an editor
  the compiled m/c code for the editor
  the text that the compiled program is using
  the compiler that generated the code

Figure: a processor connected to a memory that holds an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and the source code in C for the editor program.


Compiling if-then-else in MIPS

Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiled MIPS code for the following C if statement?

if (i == j) f = g + h; else f = g - h;

Flowchart: test i == j. If i = j, execute f = g + h. If i ≠ j, go to Else and execute f = g - h. Both paths meet at Exit.

MIPS:

      bne $s3, $s4, Else   # go to Else if i ≠ j
      add $s0, $s1, $s2    # f = g + h (skipped if i ≠ j)
      j Exit
Else: sub $s0, $s1, $s2    # f = g - h (skipped if i = j)
Exit:
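To check that the branch and jump skip the right instructions, the four-instruction sequence can be stepped through with a toy interpreter. This is only a sketch: the tuple encoding of the program and the label table are assumptions, and branch delay slots are ignored.

```python
def run_if_else(regs):
    """Step through the bne/add/j/sub sequence and return the final value of $s0."""
    program = [
        ("bne", "s3", "s4", "Else"),  # go to Else if i != j
        ("add", "s0", "s1", "s2"),    # f = g + h
        ("j", "Exit"),
        ("sub", "s0", "s1", "s2"),    # Else: f = g - h
    ]
    labels = {"Else": 3, "Exit": 4}   # label -> index of the next instruction
    pc = 0
    while pc < len(program):
        op, *args = program[pc]
        pc += 1                        # default is sequential flow
        if op == "bne":
            if regs[args[0]] != regs[args[1]]:
                pc = labels[args[2]]
        elif op == "j":
            pc = labels[args[0]]
        elif op == "add":
            regs[args[0]] = regs[args[1]] + regs[args[2]]
        elif op == "sub":
            regs[args[0]] = regs[args[1]] - regs[args[2]]
    return regs["s0"]
```

With g = 10 and h = 3, the result is g + h when i equals j and g - h otherwise.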

Typical Compilation

Major Types of Optimization

Each entry gives the optimization name, its explanation, and its frequency where one is listed.

High-level (at or near source level; machine independent)
  Procedure integration: replace procedure call by procedure body (frequency: N.M.)

Local (within straight-line code)
  Common subexpression elimination: replace two instances of the same computation by a single copy (frequency: 18%)
  Constant propagation: replace all instances of a variable that is assigned a constant with the constant
  Stack height reduction: rearrange expression tree to minimize resources needed for expression evaluation (frequency: N.M.)

Global (across a branch)
  Global common subexpression elimination: same as local, but this version crosses branches
  Copy propagation: replace all instances of a variable A that has been assigned X (i.e. A = X) with X
  Code motion: remove code from a loop that computes the same value on each iteration of the loop
  Induction variable elimination: simplify/eliminate array addressing calculations within loops

Machine-dependent (depends on machine knowledge)
  Strength reduction: many examples, such as replacing multiply by a constant with adds and shifts (frequency: N.M.)
  Pipeline scheduling: reorder instructions to improve pipeline performance (frequency: N.M.)
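Strength reduction is the easiest of these to show in a few lines. The sketch below replaces a multiply by a non-negative constant with one shift and add per set bit of the constant; the function name is hypothetical, and a real compiler would emit the shifts and adds directly rather than loop at run time.

```python
def multiply_by_constant(x, c):
    """Compute x * c for a non-negative constant c using only shifts and adds."""
    result = 0
    shift = 0
    while c:
        if c & 1:
            result += x << shift  # add x * 2**shift for each set bit of c
        c >>= 1
        shift += 1
    return result
```

For c = 10 (binary 1010) this reduces to (x << 1) + (x << 3).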

Effect of Compiler Optimization

Measurements taken on MIPS

Figure axes: program and compiler optimization level

Level 0: non-optimized code
Level 1: local optimization
Level 2: global optimization, s/w pipelining
Level 3: adds procedure integration


Compiler Support for Multimedia Instr.

Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics).

Intel added a new set of instructions called Streaming SIMD Extensions.

A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer.

Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations.
  Strided addressing allows memory access in increments larger than one.
  Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data.

Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup.

Such limited support for vector processing makes vectorizing compiler optimization unpopular and restricts its scope to hand-coded routines.

SIMD instructions on MMX and AltiVec tend to be solutions, not primitives.


Starting a Program

Figure: C program → Compiler → Assembly language program → Assembler → Object: machine language module, joined by Object: library routine (machine language) → Linker → Executable: machine language program → Loader → Memory

Linker
- Places code and data modules symbolically in memory
- Determines the addresses of data and instruction labels
- Patches both internal and external references

Object files for Unix typically contain:
  Header: size and position of components
  Text segment: machine code
  Data segment: static and dynamic variables
  Relocation info: identifies absolute memory references
  Symbol table: names and locations of labels, procedures and variables
  Debugging info: mapping of source to object code, break points, etc.

Loading Executable Program

To load an executable, the operating system follows these steps:
  Reads the executable file header to determine the size of the text and data segments
  Creates an address space large enough for the text and data
  Copies the instructions and data from the executable file into memory
  Copies the parameters (if any) to the main program onto the stack
  Initializes the machine registers and sets the stack pointer to the first free location
  Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program

Memory layout (high to low addresses):
  $sp → 7fff ffff hex: Stack
  Dynamic data
  $gp → 1000 8000 hex; Static data starts at 1000 0000 hex
  pc → 0040 0000 hex: Text
  0: Reserved

Instruction Set Design Issues

Instruction set design issues include:
  Number of addresses
  Flow of control
  Operand types
  Addressing modes
  Instruction types
  Instruction formats


Number of Addresses

Four categories:
  3-address machines
    - 2 for the source operands and one for the result
  2-address machines
    - One address doubles as source and result
  1-address machines
    - Accumulator machines
    - Accumulator is used for one source and result
  0-address machines
    - Stack machines
    - Operands are taken from the stack
    - Result goes onto the stack


Number of Addresses (cont.)

Three-address machines
  Two for the source operands, one for the result
  RISC processors use three addresses
  Sample instructions:
    add dest,src1,src2    ; M(dest)=[src1]+[src2]
    sub dest,src1,src2    ; M(dest)=[src1]-[src2]
    mult dest,src1,src2   ; M(dest)=[src1]*[src2]

Three addresses: Operand 1, Operand 2, Result

Example: a = b + c

Three-address instruction formats are not common because they require a relatively long instruction format to hold the three address references.


Number of Addresses (cont.)

Example

C statement:

A = B + C * D - E + F + A

Equivalent code:

mult T,C,D   ; T = C*D
add T,T,B    ; T = B+C*D
sub T,T,E    ; T = B+C*D-E
add T,T,F    ; T = B+C*D-E+F
add A,T,A    ; A = B+C*D-E+F+A


Number of Addresses (cont.)

Two-address machines
  One address doubles as source operand and result
  The last example makes a case for it
    - Address T is used twice
  Sample instructions:
    load dest,src    ; M(dest)=[src]
    add dest,src     ; M(dest)=[dest]+[src]
    sub dest,src     ; M(dest)=[dest]-[src]
    mult dest,src    ; M(dest)=[dest]*[src]

Two addresses: one address doubles as operand and result

Example: a = a + b

The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation.


Number of Addresses (cont.)

Example

C statement:

A = B + C * D - E + F + A

Equivalent code:

load T,C    ; T = C
mult T,D    ; T = C*D
add T,B     ; T = B+C*D
sub T,E     ; T = B+C*D-E
add T,F     ; T = B+C*D-E+F
add A,T     ; A = B+C*D-E+F+A


Number of Addresses (cont.)

One-address machines
  Use a special set of registers called accumulators
    - Specify one source operand and receive the result
  Called accumulator machines
  Sample instructions:
    load addr    ; accum = [addr]
    store addr   ; M[addr] = accum
    add addr     ; accum = accum + [addr]
    sub addr     ; accum = accum - [addr]
    mult addr    ; accum = accum * [addr]


Number of Addresses (cont.)

Example

C statement:

A = B + C * D - E + F + A

Equivalent code:

load C     ; load C into accum
mult D     ; accum = C*D
add B      ; accum = C*D+B
sub E      ; accum = B+C*D-E
add F      ; accum = B+C*D-E+F
add A      ; accum = B+C*D-E+F+A
store A    ; store accum contents in A
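The accumulator example can be checked with a few lines of simulation. The program is the seven-instruction sequence from the example; the function name, the tuple encoding and the dictionary memory are illustrative assumptions.

```python
ACCUM_PROGRAM = [
    ("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
    ("add", "F"), ("add", "A"), ("store", "A"),
]


def run_accumulator(program, memory):
    """Execute one-address instructions against a single accumulator."""
    accum = 0
    for op, addr in program:
        if op == "load":
            accum = memory[addr]
        elif op == "store":
            memory[addr] = accum
        elif op == "add":
            accum += memory[addr]
        elif op == "sub":
            accum -= memory[addr]
        elif op == "mult":
            accum *= memory[addr]
        else:
            raise ValueError("unknown instruction: " + op)
    return memory
```

Every instruction names exactly one memory address; the accumulator is the implied second operand and destination.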


Number of Addresses (cont.)

Zero-address machines
  Stack supplies operands and receives the result
    - Special instructions to load and store use an address
  Called stack machines (Ex: HP3000, Burroughs B5500)
  Sample instructions:
    push addr    ; push([addr])
    pop addr     ; pop([addr])
    add          ; push(pop + pop)
    sub          ; push(pop - pop)
    mult         ; push(pop * pop)


Number of Addresses (cont.)

Example

C statement:

A = B + C * D - E + F + A

Equivalent code:

push E
push C
push D
mult
push B
add
sub
push F
add
push A
add
pop A
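The zero-address sequence can be traced the same way. In this sketch, sub is read as push(pop - pop) with the first popped value on the left, as in the sample instructions; the function name and the program encoding are assumptions for illustration.

```python
STACK_PROGRAM = [
    ("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
    ("push", "B"), ("add",), ("sub",), ("push", "F"),
    ("add",), ("push", "A"), ("add",), ("pop", "A"),
]


def run_stack_machine(program, memory):
    """Execute zero-address instructions; only push and pop name an address."""
    stack = []
    for op, *args in program:
        if op == "push":
            stack.append(memory[args[0]])
        elif op == "pop":
            memory[args[0]] = stack.pop()
        else:
            first = stack.pop()   # top of the stack
            second = stack.pop()  # next entry
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)
            elif op == "mult":
                stack.append(first * second)
            else:
                raise ValueError("unknown instruction: " + op)
    return memory
```

Pushing E first is what lets the later sub compute (B + C*D) - E without any temporary location.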


Load/Store Architecture

Instructions expect operands in internal processor registers

Special LOAD and STORE instructions move data between registers and memory

RISC uses this architecture

Reduces instruction length


Load/Store Architecture (cont.)

Sample instructions:

load Rd,addr      ; Rd = [addr]
store addr,Rs     ; (addr) = Rs
add Rd,Rs1,Rs2    ; Rd = Rs1 + Rs2
sub Rd,Rs1,Rs2    ; Rd = Rs1 - Rs2
mult Rd,Rs1,Rs2   ; Rd = Rs1 * Rs2


Load/Store Architecture (cont.)

Example

C statement:

A = B + C * D - E + F + A

Equivalent code:

load R1,B
load R2,C
load R3,D
load R4,E
load R5,F
load R6,A
mult R2,R2,R3
add R2,R2,R1
sub R2,R2,R4
add R2,R2,R5
add R2,R2,R6
store A,R2


Flow of Control

Default is sequential flow

Several instructions alter this default execution
  Branches
    - Unconditional
    - Conditional
    - Delayed branches
  Procedure calls
    - Delayed procedure calls


Flow of Control (cont.)

Branches
  Unconditional
    - Absolute address
    - PC-relative
      • Target address is specified relative to PC contents
      • Relocatable code
  Example: MIPS
    - Absolute address
        j target
    - PC-relative
        b target


Flow of Control (cont.)

Branches
  Conditional
    - Jump is taken only if the condition is met
  Two types
    - Set-Then-Jump
      • Condition testing is separated from branching
      • Condition code registers are used to convey the condition test result
      • Condition code registers keep a record of the status of the last ALU operation, such as the overflow condition
    - Example: Pentium code
        cmp AX,BX    ; compare AX and BX
        je target    ; jump if equal


Flow of Control (cont.)

    - Test-and-Jump
      • Single instruction performs condition testing and branching
    - Example: MIPS instruction
        beq Rsrc1,Rsrc2,target
      Jumps to target if Rsrc1 = Rsrc2

Delayed branching
  Control is transferred after executing the instruction that follows the branch instruction
    - This instruction slot is called the delay slot
  Improves efficiency
  Highly pipelined RISC processors support it


Procedure calls Lacilitate modular programming

1eJuire to pieces of information to return

$ ltnd of procedure U Pentium

uses ret instruction

U MIPS

uses 9r instruction

$ 1eturn address U In a (special) register

MIPS allos any general$purpose register

U 5n the stac0

Pentium

0lo1 of Control cont-0lo1 of Control

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 72101

- -

0lo1 of Control cont-0lo1 of Control

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 73101

- -

elay slot

Parameter Passing

Two basic techniques

- Register-based (e.g., PowerPC, MIPS)
  * Internal registers are used
    + Faster
    + Limit the number of parameters
    + Recursive procedure
- Stack-based (e.g., Pentium)
  * Stack is used
    + More general

2. Operand Types

Instructions support basic data types

- Characters
- Integers
- Floating-point

Instruction overload

- Same instruction for different data types
- Example: Pentium
    mov AL,address     ; loads an 8-bit value
    mov AX,address     ; loads a 16-bit value
    mov EAX,address    ; loads a 32-bit value

Operand Types (cont.)

Separate instructions

- Instructions specify the operand size
- Example: MIPS
    lb Rdest,address    ; loads a byte
    lh Rdest,address    ; loads a halfword (16 bits)
    lw Rdest,address    ; loads a word (32 bits)
    ld Rdest,address    ; loads a doubleword (64 bits)
  Similar instructions exist for store

3. Addressing Modes

How the operands are specified

Operands can be in three places

- Registers
  * Register addressing mode
- Part of instruction
  * Constant
  * Immediate addressing mode
  * All processors support these two addressing modes
- Memory
  * Difference between RISC and CISC
  * CISC supports a large variety of addressing modes
  * RISC follows load/store architecture

4. Instruction Types

Several types of instructions

- Data movement
  * Pentium: mov dest,src
  * Some do not provide direct data movement instructions
  * Indirect data movement
      add Rdest,Rsrc,0    ; Rdest = Rsrc+0
- Arithmetic and Logical
  * Arithmetic
    + Integer and floating-point, signed and unsigned
    + add, subtract, multiply, divide
  * Logical
    + and, or, not, xor

Instruction Types (cont.)

Condition code bits

- S: Sign bit (0 = +, 1 = -)
- Z: Zero bit (0 = nonzero, 1 = zero)
- O: Overflow bit (0 = no overflow, 1 = overflow)
- C: Carry bit (0 = no carry, 1 = carry)

Example: Pentium

    cmp count,25    ; compare count to 25
                    ; subtract 25 from count
    je  target      ; jump if equal
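A minimal Python sketch of how a compare instruction could set the four condition code bits, assuming 16-bit operands and the x86 convention that the carry bit records a borrow on subtraction. The helper names `compare_flags` and `jump_if_equal` are hypothetical and only illustrate the set-then-jump idea; they are not part of any real instruction set or library.

```python
def compare_flags(a: int, b: int, bits: int = 16) -> dict:
    """Return the S, Z, O, C bits after computing a - b on `bits`-wide values."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "S": int(bool(result & sign)),   # sign bit of the result
        "Z": int(result == 0),           # result is zero
        # Signed overflow: operands differ in sign and the result's sign differs from a's.
        "O": int(bool((a ^ b) & (a ^ result) & sign)),
        # Assumed x86-style borrow: set when a < b as unsigned values.
        "C": int(a < b),
    }


def jump_if_equal(flags: dict) -> bool:
    """Model of `je`: the jump is taken when the zero bit is set."""
    return flags["Z"] == 1


if __name__ == "__main__":
    count = 25
    flags = compare_flags(count, 25)
    print(flags, "jump taken:", jump_if_equal(flags))
```

The compare only updates the flags; the separate jump reads them, which is the separation of testing from branching described above.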

Instruction Types (cont.)

Flow control and I/O instructions

- Branch
- Procedure call
- Interrupts

I/O instructions

- Memory-mapped I/O
  * Most processors support memory-mapped I/O
  * No separate instructions for I/O
- Isolated I/O
  * Pentium supports isolated I/O
  * Separate I/O instructions
      in  AX,io_port    ; read from an I/O port
      out io_port,AX    ; write to an I/O port

5. Instruction Formats

Two types

- Fixed-length
  * Used by RISC processors
  * 32-bit RISC processors use 32-bit-wide instructions
    + Examples: SPARC, MIPS, PowerPC
- Variable-length
  * Used by CISC processors
  * Memory operands need more bits to specify

Opcode

- Major and exact operation

Examples of Instruction Formats

RISC (Reduced Instruction Set Computer) versus CISC (Complex Instruction Set Computer)

RISC vs. CISC

The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.

RISC systems access memory only with explicit load and store instructions.

In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.

The difference between CISC and RISC becomes evident through the basic computer performance equation:

    time/program = (time/cycle) x (cycles/instruction) x (instructions/program)

RISC systems shorten execution time by reducing the clock cycles per instruction.

CISC systems improve performance by reducing the number of instructions per program.
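The trade-off can be tried out with a small Python sketch of the performance equation. The function name `execution_time` and all of the numbers below are made up for illustration; they are not measurements of any real processor.

```python
def execution_time(instructions: int,
                   cycles_per_instruction: float,
                   seconds_per_cycle: float) -> float:
    """time/program = instructions/program x cycles/instruction x time/cycle."""
    return instructions * cycles_per_instruction * seconds_per_cycle


if __name__ == "__main__":
    # Hypothetical figures: the CISC-style program has fewer instructions but a higher CPI;
    # the RISC-style program has more instructions but a lower CPI.
    cisc = execution_time(instructions=1_000, cycles_per_instruction=8.0, seconds_per_cycle=2e-9)
    risc = execution_time(instructions=3_000, cycles_per_instruction=1.2, seconds_per_cycle=2e-9)
    print(f"CISC-style: {cisc:.2e} s, RISC-style: {risc:.2e} s")
```

Either factor can dominate, which is why neither approach wins automatically.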

The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.

The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.

With fixed-length instructions, RISC lends itself to pipelining and speculative execution.

Consider the program fragments:

CISC:
    mov ax, 10
    mov bx, 5
    mul bx, ax

RISC:
    mov ax, 0
    mov bx, 10
    mov cx, 5
    Begin: add ax, bx
           loop Begin

The total clock cycles for the CISC version might be:

    (2 movs x 1 cycle) + (1 mul x 30 cycles) = 32 cycles

While the clock cycles for the RISC version are:

    (3 movs x 1 cycle) + (5 adds x 1 cycle) + (5 loops x 1 cycle) = 13 cycles

With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds.
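The cycle totals can be checked with a short Python sketch. The per-instruction costs (1 cycle for mov, add and loop; 30 cycles for mul) and the loop count of 5 are assumptions taken for illustration, and `total_cycles` is a hypothetical helper name.

```python
def total_cycles(trace):
    """Sum the cycle cost of every executed instruction in a trace."""
    return sum(cycles for _name, cycles in trace)


# Executed-instruction traces as (mnemonic, assumed cycle cost) pairs.
cisc_trace = [("mov", 1), ("mov", 1), ("mul", 30)]
# Three movs set up the registers, then the add/loop pair runs five times.
risc_trace = [("mov", 1)] * 3 + [("add", 1), ("loop", 1)] * 5

if __name__ == "__main__":
    print("CISC cycles:", total_cycles(cisc_trace))
    print("RISC cycles:", total_cycles(risc_trace))
```

The RISC version executes more instructions, yet under these assumed costs it needs fewer cycles.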

Because of their load-store ISAs, RISC architectures require a large number of CPU registers.

These registers provide fast access to data during sequential program execution.

They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.

Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.

(Figure: overlapped register windows)

This is how registers can be overlapped in a RISC system.

The current window pointer (CWP) points to the active register window.

It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.

Some RISC systems provide more extravagant instruction sets than some CISC systems.

Some systems combine both approaches.

The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.

RISC

- Multiple register sets.
- Three operands per instruction.
- Parameter passing through register windows.
- Single-cycle instructions.
- Hardwired control.
- Highly pipelined.

CISC

- Single register set.
- One or two register operands per instruction.
- Parameter passing through memory.
- Multiple cycle instructions.
- Microprogrammed control.
- Less pipelined.

Continued...

RISC

- Simple instructions, few in number.
- Fixed length instructions.
- Complexity in compiler.
- Only LOAD/STORE instructions access memory.
- Few addressing modes.

CISC

- Many complex instructions.
- Variable length instructions.
- Complexity in microcode.
- Many instructions can access memory.
- Many addressing modes.

Summary

Instruction Set Design Issues

Instruction set design issues include:

- Where are operands stored?
  * registers, memory, stack, accumulator
- How many explicit operands are there?
  * 0, 1, 2, or 3
- How is the operand location specified?
  * register, immediate, indirect, ...
- What type and size of operands are supported?
  * byte, int, float, double, string, vector, ...
- What operations are supported?
  * add, sub, mul, move, compare, ...

More About General Purpose Registers

Why do almost all new architectures use GPRs?

- Registers are much faster than memory (even cache)
  * Register values are available immediately
  * When memory isn't ready, the processor must wait ("stall")
- Registers are convenient for variable storage
  * Compiler assigns some variables just to registers
  * More compact code, since small fields specify registers (compared to memory addresses)

(Figure: Processor, Registers, Cache, Memory, Disk)

What Operations are Needed

- Arithmetic and Logical
  * Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
  * Logical operation: AND, OR, XOR, NOT
- Data Transfer: copy, load, store
- Control: branch, jump, call, return
- Floating Point: ADDF, MULF, DIVF
  * Same as arithmetic but usually take bigger operands
- Decimal: ADDD, CONVERT
- String: move, compare, search
- Graphics: pixel and vertex, compression/decompression operations

Stack Architecture: Pros and Cons

Pros

- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures

Cons

- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures

Accumulator Architecture: Pros and Cons

Pros

- Very low hardware requirements
- Easy to design and understand

Cons

- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic

Memory-Memory Architecture: Pros and Cons

Pros

- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)

Cons

- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required

Memory-Register Architecture: Pros and Cons

Pros

- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density

Cons

- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers

Page 7: Chapter 2: Advanced computer Architecture

Classifying Instruction Set Architectures

There are four types of internal storage used by the processor to store operands explicitly and implicitly for execution of a program:

1. Stack
2. Accumulator
3. Set of Registers (Register-Memory)
4. Set of Registers (Register-Register/load-store)

The operands in a stack architecture are implicitly on the top of the stack, and in an accumulator architecture one operand is implicitly the accumulator. The general-purpose register architectures (Register-Memory and Register-Register) have only explicit operands, either in registers or memory locations.

Basic Addressing Classes

(Figure: declining cost of registers)

Operand locations for four instruction set architecture classes

The arrows indicate whether the operand is an input or the result of the ALU operation, or both an input and result. Lighter shades indicate inputs, and the dark shade indicates the result. In (a), a Top Of Stack register (TOS) points to the top input operand, which is combined with the operand below. The first operand is removed from the stack, the result takes the place of the second operand, and TOS is updated to point to the result. All operands are implicit. In (b), the Accumulator is both an implicit input operand and a result. In (c), one input operand is a register, one is in memory, and the result goes to a register. All operands are registers in (d) and, like the stack architecture, can be transferred to memory only via separate instructions: push or pop for (a) and load or store for (d).

Code Sequence for C = A + B

Stack       Accumulator   Register-memory   Register-register
Push A      Load A        Load R1,A         Load R1,A
Push B      Add B         Add R3,R1,B       Load R2,B
Add         Store C       Store R3,C        Add R3,R1,R2
Pop C                                       Store R3,C

The code sequence for C = A + B for four classes of instruction sets. Note that the Add instruction has implicit operands for stack and accumulator architectures, and explicit operands for register architectures. It is assumed that A, B, and C all belong in memory and that the values of A and B cannot be destroyed.

Stack Architectures

Accumulator Architectures

Register-Set Architectures

Register-to-Register: Load-Store Architectures

Register-to-Memory Architectures

Memory-to-Memory Architectures

Instruction Formats

Instruction Set Architecture (ISA)

To command a computer's hardware, you must speak its language.

The words of a machine's language are called instructions, and its vocabulary is called its instruction set.

Once you learn one machine language, it is easy to pick up others:

- There are few fundamental operations that all computers must provide
- All designers have the same goal of finding a language that simplifies building the hardware and the compiler while maximizing performance and minimizing cost

Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept.

The MIPS instruction set is used as a case study.

Interface Design

A good interface:

- Lasts through many implementations (portability, compatibility)
- Is used in many different ways (generality)
- Provides convenient functionality to higher levels
- Permits an efficient implementation at lower levels

Design decisions must take into account:

- Technology
- Machine organization
- Programming languages
- Compiler technology
- Operating systems

(Figure: one interface with several uses above it and several implementations below it, over time)

Classifying Instruction Set Architectures

Accumulator Architecture

• Common in early stored-program computers, when hardware was so expensive
• Machine has only one register (the accumulator) involved in all math and logical operations
• All operations assume the accumulator as a source operand and as the destination of the operation, with the other operand stored in memory

Extended Accumulator Architecture

• Dedicated registers for specific operations, e.g. stack and array index registers, are added
• The 8086 microprocessor is an example of such a special-purpose register architecture

General-Purpose Register Architecture

• MIPS is an example of such an architecture, where registers are not restricted to a single role
• This type of instruction set can be further divided into:
  • Register-memory: allows one operand to be in memory
  • Register-register (load-store): demands all operands to be in registers

Machine          # general-purpose registers   Architecture style               Year
Motorola 6800    2                             Accumulator                      1974
DEC VAX          16                            Register-memory, memory-memory   1977
Intel 8086       1                             Extended accumulator             1978
Motorola 68000   16                            Register-memory                  1980
Intel 80386      8                             Register-memory                  1985
PowerPC          32                            Load-store                       1992
DEC Alpha        32                            Load-store                       1992

Compact Code and Stack Architectures

When memory is scarce, machines like the Intel 80x86 had variable-length instructions to match varying operand specifications and minimize code size.

Stack machines abandoned registers altogether, arguing that it is hard for compilers to use them efficiently.

Operands are to be pushed on a stack from memory, and the results have to be popped from the stack to memory.

Operations take their operands by default from the top of the stack and insert the results back onto the stack.

Stack machines simplify compilers and lend themselves to a compact instruction encoding, but limit compiler optimization (e.g. in math expressions).

Example: A = B + C

    Push AddressC    # Top=Top+4; Stack[Top]=Memory[AddressC]
    Push AddressB    # Top=Top+4; Stack[Top]=Memory[AddressB]
    add              # Stack[Top-4]=Stack[Top]+Stack[Top-4]; Top=Top-4
    Pop  AddressA    # Memory[AddressA]=Stack[Top]; Top=Top-4

Compact code is important for heralded network computers, where programs must be downloaded over the Internet (e.g. Java-based applications).
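The push/add/pop sequence for A = B + C can be simulated with a few lines of Python. This is only a sketch of the idea: the tuple encoding of instructions, the function name `run_stack_program`, and the use of a dictionary keyed by variable name as "memory" are assumptions made for illustration.

```python
def run_stack_program(program, memory):
    """Execute a tiny stack-machine program against a dict-based memory."""
    stack = []
    for op, *args in program:
        if op == "push":                 # copy a memory operand onto the stack
            stack.append(memory[args[0]])
        elif op == "add":                # operands are implicit: the top two entries
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif op == "pop":                # move the top of the stack back to memory
            memory[args[0]] = stack.pop()
        else:
            raise ValueError(f"unknown operation: {op}")
    return memory


if __name__ == "__main__":
    memory = {"A": 0, "B": 7, "C": 5}
    program = [("push", "C"), ("push", "B"), ("add",), ("pop", "A")]
    print(run_stack_program(program, memory))
```

Note that the add instruction carries no operand fields at all, which is where the compact encoding of stack machines comes from.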

Other Types of Architecture

High-Level-Language Architecture

• In the 1960s, systems software was rarely written in high-level languages, and virtually every commercial operating system before Unix was written in assembly
• Some people blamed the code density on the instruction set rather than the programming language
• A machine design philosophy was advocated with the goal of making the hardware more like high-level languages
• The effectiveness of high-level languages, memory size limitations, and the lack of efficient compilers doomed this philosophy to a historical footnote

Reduced Instruction Set Architecture

• With the recent development in compiler technology and expanded memory sizes, fewer programmers are using assembly-level coding
• Instruction set architecture became measurable in the way compilers, rather than programmers, use it
• RISC architecture favors simplifying hardware design over enriching the offered set of instructions, relying on compilers to use them effectively to perform complex operations
• Virtually all new architectures since 1982 follow the RISC philosophy of fixed instruction lengths, load-store operations, and limited addressing modes

Evolution of Instruction Sets

Single Accumulator (EDSAC 1950)

Accumulator + Index Registers (Manchester Mark I, IBM 700 series 1953)

Separation of Programming Model from Implementation

- High-level Language Based (B5000 1963)
- Concept of a Family (IBM 360 1964)

General Purpose Register Machines

- Complex Instruction Sets (VAX, Intel 432 1977-80)
- Load/Store Architecture (CDC 6600, Cray 1 1963-76)

RISC (MIPS, SPARC, IBM RS6000, ... 1987)

Register-Memory Architectures

Effect of the number of memory operands

# memory addresses   Max. number of operands   Examples
0                    3                         SPARC, MIPS, PowerPC, ALPHA
1                    2                         Intel 80x86, Motorola 68000
2                    2                         VAX (also has 3-operand formats)
3                    3                         VAX (also has 2-operand formats)

Interpreting Memory Addressing

The address of a word matches the byte address of one of its 4 bytes

The addresses of sequential words differ by 4 (word size in bytes)

Words' addresses are multiples of 4 (alignment restriction)

Machines that use the address of the leftmost byte as the word address are called "Big Endian", and those that use the rightmost byte are called "Little Endian"

Misalignment complicates memory access and causes programs to run slower (some machines do not allow misaligned memory access at all)

Byte ordering can be a problem when exchanging data among different machines

Byte addressing affects array index calculation, which must account for word addressing and the offset within the word

Object addressed   Aligned at byte offsets   Misaligned at byte offsets
Byte               0,1,2,3,4,5,6,7           Never
Half word          0,2,4,6                   1,3,5,7
Word               0,4                       1,2,3,5,6,7
Double word        0                         1,2,3,4,5,6,7
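The alignment rule and the two byte orders can be checked with a short Python sketch. The helper names `is_aligned` and `word_bytes` are hypothetical; `int.to_bytes` is the standard Python call used to lay a 4-byte word out in either byte order.

```python
def is_aligned(address: int, size: int) -> bool:
    """An object of `size` bytes is aligned when its address is a multiple of `size`."""
    return address % size == 0


def word_bytes(value: int, byteorder: str) -> list:
    """Return the 4 bytes of a word in memory order, lowest address first."""
    return list(value.to_bytes(4, byteorder))


if __name__ == "__main__":
    for name, size in [("byte", 1), ("half word", 2), ("word", 4), ("double word", 8)]:
        aligned = [offset for offset in range(8) if is_aligned(offset, size)]
        print(f"{name:12s} aligned at offsets {aligned}")

    word = 0x0A0B0C0D
    print("big endian   :", [hex(b) for b in word_bytes(word, "big")])
    print("little endian:", [hex(b) for b in word_bytes(word, "little")])
```

In the big-endian layout the most significant byte sits at the lowest address; in the little-endian layout the least significant byte does, which is why exchanging raw words between machines needs care.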

Addressing Modes

Addressing modes refer to how to specify the location of an operand (effective address)

Addressing modes have the ability to:

- Significantly reduce instruction counts
- Increase the average CPI
- Increase the complexity of building a machine

The VAX machine is used for benchmark data since it supports a wide range of memory addressing modes

Famous addressing modes can be classified based on:

- the source of the data, into register, immediate, or memory
- the address calculation, into direct and indirect

An indexed addressing mode is usually provided to allow efficient implementation of loops and array access

Example of Addressing Modes

Register
    Example:   Add R4, R3
    Meaning:   Regs[R4] <- Regs[R4] + Regs[R3]
    When used: When a value is in a register

Immediate
    Example:   Add R4, #3
    Meaning:   Regs[R4] <- Regs[R4] + 3
    When used: For constants

Displacement
    Example:   Add R4, 100(R1)
    Meaning:   Regs[R4] <- Regs[R4] + Mem[100 + Regs[R1]]
    When used: Accessing local variables

Register indirect
    Example:   Add R4, (R1)
    Meaning:   Regs[R4] <- Regs[R4] + Mem[Regs[R1]]
    When used: Accessing using a pointer or a computed address

Indexed
    Example:   Add R4, (R1 + R2)
    Meaning:   Regs[R4] <- Regs[R4] + Mem[Regs[R1] + Regs[R2]]
    When used: Sometimes useful in array addressing: R1 = base of the array; R2 = index amount

Direct or absolute
    Example:   Add R4, (1001)
    Meaning:   Regs[R4] <- Regs[R4] + Mem[1001]
    When used: Sometimes useful for accessing static data; address constant may need to be large

Memory indirect or memory deferred
    Example:   Add R4, @(R3)
    Meaning:   Regs[R4] <- Regs[R4] + Mem[Mem[Regs[R3]]]
    When used: If R3 is the address of the pointer p, then the mode yields *p

Autoincrement
    Example:   Add R4, (R2)+
    Meaning:   Regs[R4] <- Regs[R4] + Mem[Regs[R2]]
               Regs[R2] <- Regs[R2] + d
    When used: Useful for stepping through arrays within a loop. R2 points to the start of the array; each reference increments R2 by d

Autodecrement
    Example:   Add R4, -(R2)
    Meaning:   Regs[R2] <- Regs[R2] - d
               Regs[R4] <- Regs[R4] + Mem[Regs[R2]]
    When used: Same use as autoincrement. Autodecrement/increment can also act as push/pop to implement a stack

Scaled
    Example:   Add R4, 100(R2)[R3]
    Meaning:   Regs[R4] <- Regs[R4] + Mem[100 + Regs[R2] + Regs[R3] * d]
    When used: Used to index arrays
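How each mode locates its operand can be sketched in Python. The function `fetch_operand`, its parameter names, and the register and memory contents below are all invented for illustration; the autoincrement and autodecrement modes are left out because they also modify a register.

```python
def fetch_operand(mode, regs, mem, base=None, index=None, const=0, d=4):
    """Return the source operand value selected by an addressing mode."""
    if mode == "register":
        return regs[base]
    if mode == "immediate":
        return const
    if mode == "displacement":
        return mem[const + regs[base]]
    if mode == "register_indirect":
        return mem[regs[base]]
    if mode == "indexed":
        return mem[regs[base] + regs[index]]
    if mode == "direct":
        return mem[const]
    if mode == "memory_indirect":
        return mem[mem[regs[base]]]
    if mode == "scaled":
        return mem[const + regs[base] + regs[index] * d]
    raise ValueError(f"unsupported addressing mode: {mode}")


# Made-up machine state: a few registers and a sparse memory.
regs = {"R1": 200, "R2": 8, "R3": 2}
mem = {200: 300, 208: 11, 300: 42, 308: 9, 1001: 5}

if __name__ == "__main__":
    print(fetch_operand("displacement", regs, mem, base="R1", const=100))
    print(fetch_operand("indexed", regs, mem, base="R1", index="R2"))
    print(fetch_operand("memory_indirect", regs, mem, base="R1"))
```

Memory indirect needs two memory reads where register indirect needs one, which hints at why richer modes can raise the average CPI.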

Addressing Mode for Signal Processing

DSP offers special addressing modes to better serve popular algorithms

Modulo addressing

- Since DSP deals with continuous data streams, circular buffers are widely used
- Circular or modulo addressing allows automatic increment and decrement, and resets the pointer when reaching the end of the buffer

Reverse addressing

- The resulting address is the reverse order of the current address
- Reverse addressing mode expedites the access, which otherwise requires a number of logical instructions or extra memory accesses

Fast Fourier Transform

    0 (000)  ->  0 (000)
    1 (001)  ->  4 (100)
    2 (010)  ->  2 (010)
    3 (011)  ->  6 (110)
    4 (100)  ->  1 (001)
    5 (101)  ->  5 (101)
    6 (110)  ->  3 (011)
    7 (111)  ->  7 (111)

Special features require either hand coding or a compiler that uses such features (GNU would not be a good choice)
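Both special modes are easy to mimic in software, which also shows the extra work a DSP saves by doing them in the address unit. The following is a minimal Python sketch; `bit_reverse` and `modulo_advance` are hypothetical helper names, not DSP instructions.

```python
def bit_reverse(index: int, bits: int) -> int:
    """Reverse the lowest `bits` bits of `index` (the FFT reordering)."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)   # shift the lowest bit of index into result
        index >>= 1
    return result


def modulo_advance(pointer: int, step: int, base: int, length: int) -> int:
    """Step a pointer through a circular buffer [base, base + length), wrapping at the ends."""
    return base + (pointer - base + step) % length


if __name__ == "__main__":
    print([bit_reverse(i, 3) for i in range(8)])

    # Walk a 4-entry circular buffer that starts at address 100.
    pointer = 100
    visited = []
    for _ in range(6):
        visited.append(pointer)
        pointer = modulo_advance(pointer, 1, base=100, length=4)
    print(visited)
```

In software each address costs a loop or a modulo operation; the hardware modes fold this into the memory access itself.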

Operations of the Computer Hardware

"There must certainly be instructions for performing the fundamental arithmetic operations."
Burks, Goldstine and von Neumann, 1947

Assembly language is a symbolic representation of what the processor actually understands

The MIPS assembler allows only one instruction per line and ignores comments following # until the end of the line

Example:

Translation of a segment of a C program to MIPS assembly instructions

C:  f = (g + h) - (i + j)

MIPS:

    add t0, g, h     # temp variable t0 contains "g + h"
    add t1, i, j     # temp variable t1 contains "i + j"
    sub f, t0, t1    # f = t0 - t1 = (g + h) - (i + j)

Operations in the Instruction Set

Operator type            Examples
Arithmetic and logical   Integer arithmetic and logical operations: add, and, subtract, or
Data transfer            Loads-stores (move instructions on machines with memory addressing)
Control                  Branch, jump, procedure call and return, trap
System                   Operating system call, virtual memory management instructions
Floating point           Floating point instructions: add, multiply
Decimal                  Decimal add, decimal multiply, decimal to character conversion
String                   String move, string compare, string search
Graphics                 Pixel operations, compression/decompression operations

Arithmetic, logical, data transfer, and control are almost standard categories for all machines

System instructions are required for multi-programming environments, although support for system functions varies

Decimal and string instructions can be primitives, e.g. on the IBM 360 and the VAX

Support for floating point, decimal, string, and graphics is sometimes optional and provided via a co-processor

Some machines rely on the compiler to synthesize special operations, such as string handling, from simpler instructions

Operations for Media and Signal Processing

Single instruction multiple data (SIMD) and vector instructions are often supported in DSPs, which are commonly used in multimedia and signal processing applications

Partitioned Add (integer)

- Performs multiple 16-bit additions on a 64-bit ALU, since most data are narrow
- Increases ALU throughput for multimedia applications

Paired single operations (float)

- Allow the same register to act as two operands to the same operation
- Handy in dealing with vertices and coordinates

Multiply and accumulate

- Very handy for calculating dot products of vectors (signal processing) and matrix multiplication
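A partitioned add and a multiply-accumulate loop can be modeled in plain Python integers. This is a sketch under assumed parameters (four 16-bit lanes packed in one 64-bit word, wrap-around on lane overflow); `partitioned_add` and `multiply_accumulate` are hypothetical names, not real instructions.

```python
def partitioned_add(x: int, y: int, lane_bits: int = 16, lanes: int = 4) -> int:
    """Add corresponding lanes of two packed words; carries never cross a lane boundary."""
    mask = (1 << lane_bits) - 1
    result = 0
    for lane in range(lanes):
        shift = lane * lane_bits
        lane_sum = ((x >> shift) & mask) + ((y >> shift) & mask)
        result |= (lane_sum & mask) << shift   # the carry out of each lane is discarded
    return result


def multiply_accumulate(xs, ys) -> int:
    """Dot product built from repeated multiply-accumulate steps."""
    acc = 0
    for a, b in zip(xs, ys):
        acc += a * b                           # one MAC step per element pair
    return acc


if __name__ == "__main__":
    packed = partitioned_add(0x0001_FFFF_0003_0004, 0x0001_0001_0001_0001)
    print(hex(packed))
    print(multiply_accumulate([1, 2, 3], [4, 5, 6]))
```

The second lane in the example overflows and wraps to zero without disturbing its neighbor, which is the property that distinguishes a partitioned add from an ordinary 64-bit add.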

Frequency of Operations Usage

The most widely executed instructions are the simple operations of an instruction set

The following is the average usage in SPECint92 on the Intel 80x86

Rank   80x86 instruction        Integer average (% total executed)
1      Load                     22%
2      Conditional branch       20%
3      Compare                  16%
4      Store                    12%
5      Add                      8%
6      And                      6%
7      Sub                      5%
8      Move register-register   4%
9      Call                     1%
10     Return                   1%
       Total                    96%

Make the common case fast by focusing on these operations

Control Flow Instructions

- Jump: for unconditional change in the control flow
- Branch: for conditional change in the control flow
- Procedure calls and returns

Data is based on SPEC on Alpha

Destination Address Definition

Relative addressing with respect to the program counter proved to be the best choice for forward and backward branching or jumps (load-address independent)

To allow for dynamic loading of library routines, register indirect addressing allows addresses to be loaded in special registers (e.g. virtual functions in C++ and system calls in a case statement)

Data is based on SPEC on Alpha

Condition Evaluation

Compare-and-branch can be efficient if the majority of conditions are comparisons with zero

Remember to focus on the common case

Based on SPEC on MIPS

Frequency of Types of Comparison

Data is based on SPEC on Alpha

Different benchmarks and machines set new design priorities

DSPs support a repeat instruction for "for" loops (vectors), using registers

Supporting Procedures

Execution of a procedure follows these steps:

1. Store parameters in a place accessible to the procedure
2. Transfer control to the procedure
3. Acquire the storage resources needed for the procedure
4. Perform the desired task
5. Store the result value in a place accessible to the calling program
6. Return control to the point of origin

The hardware provides a program counter to trace instruction flow and manage transfer of control.

Parameter Passing

- Registers can be used for passing a small number of parameters
- A stack is used to spill registers of the current context and make room for the called procedure to run, and to allow large parameters to be passed
- Storage of machine state can be performed by the caller or the callee
- Handling of shared variables is important to ensure correct semantics and thus requires clear specifications in the library interface

Global variables stored in registers need careful handling.

Type and Size of Operands

- The type of an operand is designated by encoding it in the instruction's operation code
- The type of an operand, e.g. single-precision float, effectively gives its size
- Common operand types include character, half-word and word-size integer, and single- and double-precision floating point
- Characters are almost always in ASCII, integers are in 2's complement, and floating point is in IEEE 754
- The 16-bit Unicode used in Java is gaining popularity due to its support for international character sets
- For business applications, some architectures support a decimal format in binary coded decimal (BCD)
- Depending on the size of the word, the complexity of handling different operand types differs
- DSPs offer fixed-point data types to support high-precision floating-point arithmetic and to allow sharing a single exponent among multiple numbers
- For graphics applications, vertex and pixel operands are added features

Size of Operands

- The double-word data type is used for double-precision floating-point operations and for address storage in machines with a 64-bit wide address bus
- Words are used for integer operations and on machines with a 32-bit address bus
- Because of the mix in SPEC, word and double-word data types dominate

Frequency of reference by size is based on SPEC2000 on Alpha.

Instruction Representation

- Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary, or base 2)
- Numbers are stored in computers as a series of high and low electronic signals (binary numbers)
- Binary digits are called bits and are considered the atom of computing
- Each piece of an instruction is a number, and placing these numbers together forms the instruction
- The assembler translates the symbolic assembly instructions into machine language instructions (machine code)

Example:

Assembly:                add $t0, $s1, $s2

M/C language (decimal):  0       17      18      8       0       32
M/C language (binary):   000000  10001   10010   01000   00000   100000
                         6 bits  5 bits  5 bits  5 bits  5 bits  6 bits

Note: the MIPS compiler by default maps $s0..$s7 to registers 16-23 and $t0..$t7 to registers 8-15.
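The field-by-field view of an instruction can be sketched in a few lines of Python. This is a minimal illustration, not an assembler: the function name `encode_r_type` is hypothetical, and the field values assume the standard MIPS register numbering ($s1 = 17, $s2 = 18, $t0 = 8) and the `add` function code 32.

```python
def encode_r_type(op, rs, rt, rd, shamt, funct):
    """Pack the six R-format fields (6/5/5/5/5/6 bits) into one 32-bit word."""
    fields = ((op, 6), (rs, 5), (rt, 5), (rd, 5), (shamt, 5), (funct, 6))
    word = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"field value {value} does not fit in {width} bits")
        # Shift what we have so far to the left and append the next field.
        word = (word << width) | value
    return word


# add $t0, $s1, $s2: rs = $s1 (17), rt = $s2 (18), rd = $t0 (8), funct = 32
ADD_T0_S1_S2 = encode_r_type(0, 17, 18, 8, 0, 32)
print(format(ADD_T0_S1_S2, "032b"))
print(hex(ADD_T0_S1_S2))
```

Placing the numbers side by side is all the encoding does; the range check shows why a 5-bit register field limits the architecture to 32 registers.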

Encoding an Instruction Set

- Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation
- The operation is typically specified in one field, called the opcode
- The addressing mode for the operand can be encoded with the operation, or specified through a separate identifier when a large number of modes is supported
- The architecture must balance several competing factors:
  1. Desire to support as many registers and addressing modes as possible
  2. Effect of operand specification on the size of the instruction (program)
  3. Desire to simplify instruction fetching and decoding during execution
- Fixed-size instruction encoding simplifies the CPU design while limiting the addressing modes supported
- An architect who cares about code size can use variable-size encoding
- A hybrid approach is to allow variability by supporting multiple-sized instructions

Encoding Examples

[Figure: instruction encoding examples]

MIPS Instruction format

Register-format instructions:

  op      rs      rt      rd      shamt   funct
  6 bits  5 bits  5 bits  5 bits  5 bits  6 bits

op: Basic operation of the instruction, traditionally called opcode
rs: The first register source operand
rt: The second register source operand
rd: The register destination operand; it gets the result of the operation
shamt: Shift amount
funct: This field selects the specific variant of the operation of the op field

Immediate-type instructions:

  op      rs      rt      address
  6 bits  5 bits  5 bits  16 bits

- Some instructions need longer fields than provided, for large-value constants
- The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register

Example: lw $t0, 32($s3)    # Temporary register $t0 gets A[8]

Instruction  Format  op   rs   rt   rd   shamt  funct  address
add          R       0    reg  reg  reg  0      32     N/A
sub          R       0    reg  reg  reg  0      34     N/A
lw           I       35   reg  reg  N/A  N/A    N/A    address
sw           I       43   reg  reg  N/A  N/A    N/A    address
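The reach of the 16-bit address field can be made concrete with a small sketch. The helper names `encode_i_type` and `decode_offset` are hypothetical; the sketch assumes the standard MIPS values (opcode 35 for `lw`, $s3 = register 19, $t0 = register 8) and treats the address field as a signed two's-complement offset.

```python
def encode_i_type(op, rs, rt, offset):
    """Pack an I-format instruction: 6-bit op, 5-bit rs, 5-bit rt, 16-bit offset."""
    if not -(1 << 15) <= offset < (1 << 15):
        raise ValueError("offset must fit in a signed 16-bit field")
    # Masking with 0xFFFF stores a negative offset in two's complement.
    return (op << 26) | (rs << 21) | (rt << 16) | (offset & 0xFFFF)


def decode_offset(word):
    """Recover the signed offset from the low 16 bits of an I-format word."""
    imm = word & 0xFFFF
    return imm - (1 << 16) if imm & 0x8000 else imm


# lw $t0, 32($s3): base register rs = $s3 (19), target rt = $t0 (8), offset 32
LW_T0_32_S3 = encode_i_type(35, 19, 8, 32)
print(hex(LW_T0_32_S3))
```

Anything outside -32768..32767 bytes from the base register cannot be expressed in one load, which is the limit the slide refers to.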

The Stored Program Concept

Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept.

Today's computers are built on two key principles:

1. Instructions are represented as numbers
2. Programs can be stored in memory to be read or written just like numbers

The power of the concept: memory can contain

- the source code for an editor
- the compiled m/c code for the editor
- the text that the compiled program is using
- the compiler that generated the code

[Figure: a processor connected to a memory that holds an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and the source code in C for the editor program]

Compiling if-then-else in MIPS

Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiler's code for the following C if statement?

  if (i == j) f = g + h; else f = g - h;

[Figure: flowchart testing i == j; the i == j path computes f = g + h, the i ≠ j path (Else:) computes f = g - h, and both paths meet at Exit:]

MIPS:

        bne $s3, $s4, Else    # go to Else if i ≠ j
        add $s0, $s1, $s2     # f = g + h (skipped if i ≠ j)
        j   Exit
  Else: sub $s0, $s1, $s2     # f = g - h (skipped if i = j)
  Exit:

Typical Compilation

[Figure: typical compilation]

Major Types of Optimization

Each entry gives the optimization name, its explanation, and its frequency (N.M. = not measured).

High level (at or near source level; machine independent)
- Procedure integration: replace a procedure call by the procedure body. Frequency: N.M.

Local (within straight-line code)
- Common subexpression elimination: replace two instances of the same computation by a single copy. Frequency: 18%
- Constant propagation: replace all instances of a variable that is assigned a constant with the constant. Frequency: 22%
- Stack height reduction: rearrange the expression tree to minimize the resources needed for expression evaluation. Frequency: N.M.

Global (across a branch)
- Global common subexpression elimination: same as local, but this version crosses branches. Frequency: 13%
- Copy propagation: replace all instances of a variable A that has been assigned X (i.e. A = X) with X. Frequency: 11%
- Code motion: remove code from a loop that computes the same value in each iteration of the loop. Frequency: 16%
- Induction variable elimination: simplify/eliminate array addressing calculations within loops. Frequency: 2%

Machine dependent (depends on machine knowledge)
- Strength reduction: many examples, such as replacing a multiply by a constant with adds and shifts. Frequency: N.M.
- Pipeline scheduling: reorder instructions to improve pipeline performance. Frequency: N.M.
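As one concrete case of strength reduction, a multiply by a constant can be rewritten with shifts and an add. The sketch below is only an illustration of the rewrite a compiler might apply; the constant 10 and the function name `times_ten` are assumptions chosen for the example.

```python
def times_ten(x):
    """Compute x * 10 as (x * 8) + (x * 2), using two shifts and one add."""
    return (x << 3) + (x << 1)


for value in (0, 7, -3):
    print(value, times_ten(value), value * 10)
```

Whether the rewrite pays off depends on the machine: it helps when a multiply costs many more cycles than a shift or an add, which is why the optimization is classed as machine dependent.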

Effect of Compiler Optimization

Measurements taken on MIPS

[Figure: chart with axis "Program and Compiler Optimization Level"]

Level 0: non-optimized code
Level 1: local optimization
Level 2: global optimization, s/w pipelining
Level 3: adds procedure integration

Compiler Support for Multimedia Instr.

- Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics)
- Intel added a new set of instructions, called Streaming SIMD Extension
- A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer
- Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations
  - Strided addressing allows memory access in increments larger than one
  - Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data
- Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup
- Such limited support for vector processing makes the use of vectorizing compiler optimization unpopular and restricts its scope to hand-coded routines

SIMD instructions on MMX and AltiVec tend to be solutions, not primitives.
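The two vector addressing styles can be sketched with plain Python lists standing in for memory. This is a behavioural model only; the function names and the 4 x 4 row-major layout are assumptions made for the illustration, not part of any real instruction set.

```python
def strided_load(memory, base, stride, count):
    """Load `count` elements starting at `base`, stepping by `stride` each time."""
    return [memory[base + i * stride] for i in range(count)]


def gather_load(memory, index_vector):
    """Load the elements whose addresses are listed in `index_vector`."""
    return [memory[address] for address in index_vector]


def scatter_store(memory, index_vector, values):
    """Store each value at the address given by the matching index entry."""
    for address, value in zip(index_vector, values):
        memory[address] = value


# A 4 x 4 matrix stored row by row; column 1 is every 4th element from address 1.
memory = list(range(100, 116))
column = strided_load(memory, base=1, stride=4, count=4)
picked = gather_load(memory, [7, 0, 12])
print(column, picked)
```

Without a strided load, fetching the column would take four separate scalar loads, which is the lost speedup the slide attributes to MMX.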

Starting a Program

[Figure: C program -> Compiler -> Assembly language program -> Assembler -> Object: machine language module (plus Object: library routine, machine language) -> Linker -> Executable: machine language program -> Loader -> Memory]

Object files for Unix typically contain:

- Header: size and position of components
- Text segment: machine code
- Data segment: static and dynamic variables
- Relocation info: identifies absolute memory references
- Symbol table: name and location of labels, procedures and variables
- Debugging info: mapping of source to object code, break points, etc.

Linker:

- Places code and data modules symbolically in memory
- Determines the addresses of data and instruction labels
- Patches both internal and external references

Loading Executable Program

[Figure: MIPS memory layout, from high to low addresses: Stack ($sp at 7fff ffff hex), Dynamic data, Static data (starting at 1000 0000 hex, with $gp at 1000 8000 hex), Text (pc at 0040 0000 hex), Reserved (from address 0)]

To load an executable, the operating system follows these steps:

1. Reads the executable file header to determine the size of the text and data segments
2. Creates an address space large enough for the text and data
3. Copies the instructions and data from the executable file into memory
4. Copies the parameters (if any) to the main program onto the stack
5. Initializes the machine registers and sets the stack pointer to the first free location
6. Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program

Instruction Set Design Issues

- Number of Addresses
- Flow of Control
- Operand Types
- Addressing Modes
- Instruction Types
- Instruction Formats

Number of Addresses

Four categories:

- 3-address machines
  - 2 for the source operands and one for the result
- 2-address machines
  - One address doubles as source and result
- 1-address machines
  - Accumulator machines
  - Accumulator is used for one source and result
- 0-address machines
  - Stack machines
  - Operands are taken from the stack
  - Result goes onto the stack

Three-address machines

- Two for the source operands, one for the result
- RISC processors use three addresses

Sample instructions:

  add  dest,src1,src2    ;M(dest)=[src1]+[src2]
  sub  dest,src1,src2    ;M(dest)=[src1]-[src2]
  mult dest,src1,src2    ;M(dest)=[src1]*[src2]

Three addresses: Operand 1, Operand 2, Result. Example: a = b + c

Three-address instruction formats are not common because they require a relatively long instruction format to hold the three address references.

Example (three-address machine)

C statement:

  A = B + C * D - E + F + A

Equivalent code:

  mult T,C,D    ;T = C*D
  add  T,T,B    ;T = B+C*D
  sub  T,T,E    ;T = B+C*D-E
  add  T,T,F    ;T = B+C*D-E+F
  add  A,T,A    ;A = B+C*D-E+F+A

Two-address machines

- One address doubles (for source operand and result)
- Last example makes a case for it
  - Address T is used twice

Sample instructions:

  load dest,src    ;M(dest)=[src]
  add  dest,src    ;M(dest)=[dest]+[src]
  sub  dest,src    ;M(dest)=[dest]-[src]
  mult dest,src    ;M(dest)=[dest]*[src]

Two addresses: one address doubles as operand and result. Example: a = a + b

The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation.

Example (two-address machine)

C statement:

  A = B + C * D - E + F + A

Equivalent code:

  load T,C    ;T = C
  mult T,D    ;T = C*D
  add  T,B    ;T = B+C*D
  sub  T,E    ;T = B+C*D-E
  add  T,F    ;T = B+C*D-E+F
  add  A,T    ;A = B+C*D-E+F+A

One-address machines

- Use a special set of registers called accumulators
  - Specify one source operand and receive the result
- Called accumulator machines

Sample instructions:

  load  addr    ;accum = [addr]
  store addr    ;M[addr] = accum
  add   addr    ;accum = accum + [addr]
  sub   addr    ;accum = accum - [addr]
  mult  addr    ;accum = accum * [addr]

Example (one-address machine)

C statement:

  A = B + C * D - E + F + A

Equivalent code:

  load  C    ;load C into accum
  mult  D    ;accum = C*D
  add   B    ;accum = C*D+B
  sub   E    ;accum = B+C*D-E
  add   F    ;accum = B+C*D-E+F
  add   A    ;accum = B+C*D-E+F+A
  store A    ;store accum contents in A
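The accumulator sequence can be checked with a tiny interpreter. This is a sketch under stated assumptions: `run_accumulator` is a hypothetical helper, memory is modelled as a dictionary keyed by variable name, and the initial values are arbitrary.

```python
def run_accumulator(program, memory):
    """Execute one-address instructions against a single accumulator."""
    accum = 0
    for opcode, addr in program:
        if opcode == "load":
            accum = memory[addr]
        elif opcode == "store":
            memory[addr] = accum
        elif opcode == "add":
            accum += memory[addr]
        elif opcode == "sub":
            accum -= memory[addr]
        elif opcode == "mult":
            accum *= memory[addr]
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return memory


# A = B + C * D - E + F + A, in the same order as the one-address example
program = [("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
           ("add", "F"), ("add", "A"), ("store", "A")]
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
run_accumulator(program, memory)
print(memory["A"])
```

Every instruction except the implicit accumulator names a memory operand, which is the source of the high memory traffic usually attributed to this style.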

Zero-address machines

- Stack supplies operands and receives the result
  - Special instructions to load and store use an address
- Called stack machines (Ex: HP3000, Burroughs B5500)

Sample instructions:

  push addr    ;push([addr])
  pop  addr    ;pop([addr])
  add          ;push(pop + pop)
  sub          ;push(pop - pop)
  mult         ;push(pop * pop)

Example (zero-address machine)

C statement:

  A = B + C * D - E + F + A

Equivalent code:

  push E
  push C
  push D
  mult
  push B
  add
  sub
  push F
  add
  push A
  add
  pop  A
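A short interpreter makes the operand order of the stack version easier to follow. It is a sketch, not a model of any real stack machine: `run_stack` is a hypothetical helper, and it assumes the convention from the sample instructions that the first value popped is the left operand, so `sub` computes push(pop - pop).

```python
def run_stack(program, memory):
    """Execute zero-address instructions; only push and pop name an address."""
    stack = []
    for instruction in program:
        opcode, *operand = instruction.split()
        if opcode == "push":
            stack.append(memory[operand[0]])
        elif opcode == "pop":
            memory[operand[0]] = stack.pop()
        elif opcode in ("add", "sub", "mult"):
            first = stack.pop()   # popped first, used as the left operand
            second = stack.pop()
            if opcode == "add":
                stack.append(first + second)
            elif opcode == "sub":
                stack.append(first - second)
            else:
                stack.append(first * second)
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return stack


# A = B + C * D - E + F + A, in the same order as the zero-address example
program = ["push E", "push C", "push D", "mult", "push B", "add", "sub",
           "push F", "add", "push A", "add", "pop A"]
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
leftover = run_stack(program, memory)
print(memory["A"], leftover)
```

E is pushed first so that it sits below B + C*D when `sub` executes; with the opposite pop order the sequence would compute E - (B + C*D) instead.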

Load/Store Architecture

- Instructions expect operands in internal processor registers
  - Special LOAD and STORE instructions move data between registers and memory
- RISC uses this architecture
- Reduces instruction length

Sample instructions:

  load  Rd,addr       ;Rd = [addr]
  store addr,Rs       ;(addr) = Rs
  add   Rd,Rs1,Rs2    ;Rd = Rs1 + Rs2
  sub   Rd,Rs1,Rs2    ;Rd = Rs1 - Rs2
  mult  Rd,Rs1,Rs2    ;Rd = Rs1 * Rs2

Example (load/store machine)

C statement:

  A = B + C * D - E + F + A

Equivalent code:

  load  R1,B
  load  R2,C
  load  R3,D
  load  R4,E
  load  R5,F
  load  R6,A
  mult  R2,R2,R3
  add   R2,R2,R1
  sub   R2,R2,R4
  add   R2,R2,R5
  add   R2,R2,R6
  store A,R2

Flow of Control

- Default is sequential flow
- Several instructions alter this default execution
  - Branches
    - Unconditional
    - Conditional
    - Delayed branches
  - Procedure calls
    - Delayed procedure calls

Branches

- Unconditional
  - Absolute address
  - PC-relative
    - Target address is specified relative to PC contents
    - Relocatable code
- Example: MIPS
  - Absolute address
      j target
  - PC-relative
      b target
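The difference between the two target calculations can be sketched as follows. The function names are hypothetical; the arithmetic assumes the usual MIPS32 rules, where a branch adds a sign-extended 16-bit word offset to the address of the following instruction, and `j` keeps the top four bits of that address and fills in a 26-bit word index.

```python
def sign_extend_16(value):
    """Interpret the low 16 bits of `value` as a two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def pc_relative_target(pc, imm16):
    """Branch target: (PC + 4) plus the signed word offset scaled to bytes."""
    return (pc + 4 + (sign_extend_16(imm16) << 2)) & 0xFFFFFFFF


def absolute_jump_target(pc, index26):
    """Jump target: upper 4 bits of (PC + 4) joined with the 26-bit word index."""
    return ((pc + 4) & 0xF0000000) | ((index26 & 0x03FFFFFF) << 2)


# The same branch encoding gives the same displacement at any load address,
# which is what makes PC-relative code relocatable.
for pc in (0x00400000, 0x10000000):
    print(hex(pc), hex(pc_relative_target(pc, 3)))
```

The jump encoding, by contrast, names a fixed location, so code that uses it has to be patched by the linker or loader if it is moved.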


Branches

- Conditional
  - Jump is taken only if the condition is met
- Two types
  - Set-Then-Jump
    - Condition testing is separated from branching
    - Condition code registers are used to convey the condition test result
    - Condition code registers keep a record of the status of the last ALU operation, such as an overflow condition
  - Example: Pentium code

      cmp AX,BX     ;compare AX and BX
      je  target    ;jump if equal

  - Test-and-Jump
    - Single instruction performs condition testing and branching
  - Example: MIPS instruction

      beq Rsrc1,Rsrc2,target

    Jumps to target if Rsrc1 = Rsrc2

- Delayed branching
  - Control is transferred after executing the instruction that follows the branch instruction
    - This instruction slot is called the delay slot
  - Improves efficiency
  - Highly pipelined RISC processors support it

Procedure calls

- Facilitate modular programming
- Require two pieces of information to return
  - End of procedure
    - Pentium uses the ret instruction
    - MIPS uses the jr instruction
  - Return address
    - In a (special) register: MIPS allows any general-purpose register
    - On the stack: Pentium

[Figure: delay slot]

Parameter Passing

Two basic techniques:

- Register-based (e.g. PowerPC, MIPS)
  - Internal registers are used
    - Faster
    - Limit the number of parameters
    - Recursive procedure
- Stack-based (e.g. Pentium)
  - Stack is used
    - More general

Operand Types

- Instructions support basic data types
  - Characters
  - Integers
  - Floating-point
- Instruction overload
  - Same instruction for different data types
  - Example: Pentium

      mov AL,address     ;loads an 8-bit value
      mov AX,address     ;loads a 16-bit value
      mov EAX,address    ;loads a 32-bit value

- Separate instructions
  - Instructions specify the operand size
  - Example: MIPS

      lb Rdest,address    ;loads a byte
      lh Rdest,address    ;loads a halfword (16 bits)
      lw Rdest,address    ;loads a word (32 bits)
      ld Rdest,address    ;loads a doubleword (64 bits)

    Similar instructions exist for store

Addressing Modes

- How the operands are specified
  - Operands can be in three places
    - Registers
      - Register addressing mode
    - Part of instruction
      - Constant
      - Immediate addressing mode
      - All processors support these two addressing modes
    - Memory
      - Difference between RISC and CISC
      - CISC supports a large variety of addressing modes
      - RISC follows load/store architecture

Instruction Types

Several types of instructions:

- Data movement
  - Pentium: mov dest,src
  - Some do not provide direct data movement instructions
  - Indirect data movement

      add Rdest,Rsrc,0    ;Rdest = Rsrc+0

- Arithmetic and Logical
  - Arithmetic
    - Integer and floating-point, signed and unsigned
    - add, subtract, multiply, divide
  - Logical
    - and, or, not, xor

Condition code bits

- S: Sign bit (0 = +, 1 = -)
- Z: Zero bit (0 = nonzero, 1 = zero)
- O: Overflow bit (0 = no overflow, 1 = overflow)
- C: Carry bit (0 = no carry, 1 = carry)

Example: Pentium

  cmp count,25    ;compare count to 25
                  ;subtract 25 from count
  je  target      ;jump if equal
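How a compare sets the four condition code bits can be sketched for a 32-bit subtraction. The function name `compare_flags` is hypothetical, and the carry bit is modelled with the x86 convention for subtraction, where it is set when an unsigned borrow occurs; other processors define it differently.

```python
def compare_flags(a, b, bits=32):
    """Return the S, Z, O, C bits produced by computing a - b in `bits` bits."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "S": int(bool(result & sign)),   # sign bit of the result
        "Z": int(result == 0),           # set when the operands are equal
        # Signed overflow: operands differ in sign and the result's sign
        # differs from the sign of the first operand.
        "O": int(bool((a ^ b) & (a ^ result) & sign)),
        "C": int(a < b),                 # unsigned borrow (x86 convention)
    }


print(compare_flags(7, 7))
print(compare_flags(1, 2))
```

A jump-if-equal then only needs to look at the Z bit, which is why the test and the branch can be separate instructions.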

- Flow control and I/O instructions
  - Branch
  - Procedure call
  - Interrupts
- I/O instructions
  - Memory-mapped I/O
    - Most processors support memory-mapped I/O
    - No separate instructions for I/O
  - Isolated I/O
    - Pentium supports isolated I/O
    - Separate I/O instructions

        in  AX,io_port    ;read from an I/O port
        out io_port,AX    ;write to an I/O port

Instruction Formats

Two types:

- Fixed-length
  - Used by RISC processors
  - 32-bit RISC processors use 32-bit wide instructions
    - Examples: SPARC, MIPS, PowerPC
- Variable-length
  - Used by CISC processors
  - Memory operands need more bits to specify

Opcode

- Major and exact operation

Examples of Instruction Formats

[Figure: examples of instruction formats]

RISC (Reduced Instruction Set Computer) versus CISC (Complex Instruction Set Computer)

RISC vs CISC

- The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute
- RISC systems access memory only with explicit load and store instructions
- In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable

- The difference between CISC and RISC becomes evident through the basic computer performance equation:

    time/program = (time/cycle) × (cycles/instruction) × (instructions/program)

- RISC systems shorten execution time by reducing the clock cycles per instruction
- CISC systems improve performance by reducing the number of instructions per program

- The simple instruction set of RISC machines enables control units to be hardwired for maximum speed
- The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time
- With fixed-length instructions, RISC lends itself to pipelining and speculative execution

Consider the following program fragments:

CISC:

         mov ax, 10
         mov bx, 5
         mul bx, ax

RISC:

         mov ax, 0
         mov bx, 10
         mov cx, 5
  Begin: add ax, bx
         loop Begin

The total clock cycles for the CISC version might be:

  (2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles

While the clock cycles for the RISC version are:

  (3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles

With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds.
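The cycle comparison is simple enough to work through in code. The sketch below only restates the arithmetic; the instruction counts (a 30-cycle multiply against a five-iteration add loop) are illustrative assumptions for a 10 × 5 product, and `total_cycles` is a hypothetical helper.

```python
def total_cycles(instruction_mix):
    """Sum count * cycles-per-instruction over (count, cpi) pairs."""
    return sum(count * cpi for count, cpi in instruction_mix)


# CISC: two 1-cycle movs and one multiply assumed to take 30 cycles
cisc_cycles = total_cycles([(2, 1), (1, 30)])

# RISC: three movs, then five add/loop pairs, each instruction taking 1 cycle
risc_cycles = total_cycles([(3, 1), (5, 1), (5, 1)])

print(cisc_cycles, risc_cycles)
```

The RISC fragment executes more instructions (13 against 3) yet needs fewer cycles, which is the trade the performance equation describes: more instructions per program, fewer cycles per instruction.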

- Because of their load-store ISAs, RISC architectures require a large number of CPU registers
- These registers provide fast access to data during sequential program execution
- They can also be employed to reduce the overhead typically caused by passing parameters to subprograms
- Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers

[Figure: overlapping register windows]

- This is how registers can be overlapped in a RISC system
- The current window pointer (CWP) points to the active register window

- It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures
- Some RISC systems provide more extravagant instruction sets than some CISC systems
- Some systems combine both approaches
- The following two slides summarize the characteristics that traditionally typify the differences between these two architectures

RISC                                          CISC
Multiple register sets                        Single register set
Three operands per instruction                One or two register operands per instruction
Parameter passing through register windows    Parameter passing through memory
Single-cycle instructions                     Multiple-cycle instructions
Hardwired control                             Microprogrammed control
Highly pipelined                              Less pipelined
Simple instructions, few in number            Many complex instructions
Fixed-length instructions                     Variable-length instructions
Complexity in compiler                        Complexity in microcode
Only LOAD/STORE instructions access memory    Many instructions can access memory
Few addressing modes                          Many addressing modes

Summary: Instruction Set Design Issues

Instruction set design issues include:

- Where are operands stored?
  - registers, memory, stack, accumulator
- How many explicit operands are there?
  - 0, 1, 2, or 3
- How is the operand location specified?
  - register, immediate, indirect, ...
- What type and size of operands are supported?
  - byte, int, float, double, string, vector, ...
- What operations are supported?
  - add, sub, mul, move, compare, ...

More About General Purpose Registers

Why do almost all new architectures use GPRs?

- Registers are much faster than memory (even cache)
  - Register values are available immediately
  - When memory isn't ready, the processor must wait ("stall")
- Registers are convenient for variable storage
  - Compiler assigns some variables just to registers
  - More compact code, since small fields specify registers (compared to memory addresses)

[Figure: memory hierarchy: Processor (Registers), Cache, Memory, Disk]

What Operations are Needed

Arithmetic & Logical
- Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
- Logical operation: AND, OR, XOR, NOT

Data Transfer - copy, load, store

Control - branch, jump, call, return

Floating Point - ADDF, MULF, DIVF. Same as arithmetic, but usually take bigger operands

Decimal - ADDD, CONVERT

String - move, compare, search

Graphics - pixel and vertex, compression/decompression operations

Stack Architecture: Pros and Cons

Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures

Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures

Accumulator Architecture: Pros and Cons

Pros
- Very low hardware requirements
- Easy to design and understand

Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic

Memory-Memory Architecture: Pros and Cons

Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)

Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required

Memory-Register Architecture: Pros and Cons

Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density

Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers

Basic Addressing Classes

Declining cost of registers

Operand locations for four instruction set architecture classes

[Figure: operand locations for (a) stack, (b) accumulator, (c) register-memory, and (d) register-register (load-store) architectures]

The arrows indicate whether the operand is an input or the result of the ALU operation, or both an input and result. Lighter shades indicate inputs, and the dark shade indicates the result. In (a), a Top Of Stack register (TOS) points to the top input operand, which is combined with the operand below. The first operand is removed from the stack, the result takes the place of the second operand, and TOS is updated to point to the result. All operands are implicit. In (b), the accumulator is both an implicit input operand and a result. In (c), one input operand is a register, one is in memory, and the result goes to a register. All operands are registers in (d) and, like the stack architecture, can be transferred to memory only via separate instructions: push or pop for (a) and load or store for (d).

Code Sequence for C = A + B

Stack     Accumulator   Register-memory   Register-register
Push A    Load A        Load R1, A        Load R1, A
Push B    Add B         Add R3, R1, B     Load R2, B
Add       Store C       Store R3, C       Add R3, R1, R2
Pop C                                     Store R3, C

The code sequence for C = A + B for four classes of instruction sets. Note that the Add instruction has implicit operands for stack and accumulator architectures, and explicit operands for register architectures. It is assumed that A, B, and C all belong in memory and that the values of A and B cannot be destroyed.
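The accumulator, register-memory, and register-register sequences for C = A + B can be traced with a small interpreter. This is only an illustrative sketch, not a model of any real machine: the tuple encoding of instructions, the `run` function, the opcode names, and the sample values of A and B are all assumptions made for the example.

```python
def run(program, memory):
    """Execute a list of (opcode, *operands) tuples and return the register file."""
    regs = {"ACC": 0}  # the accumulator is modeled as one more named register
    for op, *args in program:
        if op == "load":        # load Rd, X      : Rd <- Mem[X]
            rd, addr = args
            regs[rd] = memory[addr]
        elif op == "store":     # store Rs, X     : Mem[X] <- Rs
            rs, addr = args
            memory[addr] = regs[rs]
        elif op == "add_mem":   # add Rd, Rs, X   : Rd <- Rs + Mem[X]
            rd, rs, addr = args
            regs[rd] = regs[rs] + memory[addr]
        elif op == "add_reg":   # add Rd, Rs, Rt  : Rd <- Rs + Rt
            rd, rs, rt = args
            regs[rd] = regs[rs] + regs[rt]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return regs


# Accumulator: the accumulator is the implicit source and destination of Add.
accumulator = [("load", "ACC", "A"), ("add_mem", "ACC", "ACC", "B"), ("store", "ACC", "C")]
# Register-memory: one operand of Add may come straight from memory.
register_memory = [("load", "R1", "A"), ("add_mem", "R3", "R1", "B"), ("store", "R3", "C")]
# Register-register (load-store): Add sees registers only.
load_store = [("load", "R1", "A"), ("load", "R2", "B"),
              ("add_reg", "R3", "R1", "R2"), ("store", "R3", "C")]

for name, program in (("accumulator", accumulator),
                      ("register-memory", register_memory),
                      ("register-register", load_store)):
    memory = {"A": 2, "B": 3, "C": 0}
    run(program, memory)
    print(f"{name}: {len(program)} instructions, C = {memory['C']}")
```

All three programs leave A and B untouched, as the caption requires; they differ only in instruction count and in how many instructions touch memory.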

Stack Architectures

Accumulator Architectures

Register-Set Architectures

Register-to-Register: Load-Store Architectures

Register-to-Memory Architectures

Memory-to-Memory Architectures

Instruction Formats

Instruction Set Architecture (ISA)

To command a computer's hardware, you must speak its language. The words of a machine's language are called instructions, and its vocabulary is called an instruction set.

Once you learn one machine language, it is easy to pick up others: there are few fundamental operations that all computers must provide.

All designers have the same goal of finding a language that simplifies building the hardware and the compiler while maximizing performance and minimizing cost.

Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept.

The MIPS instruction set is used as a case study.

Interface Design

A good interface:
- Lasts through many implementations (portability, compatibility)
- Is used in many different ways (generality)
- Provides convenient functionality to higher levels
- Permits an efficient implementation at lower levels

Design decisions must take into account:
- Technology
- Machine organization
- Programming languages
- Compiler technology
- Operating systems

[Figure: one interface with several uses above it and several implementations (imp 0, imp 1, ...) below it over time]

Classifying Instruction Set Architectures

Accumulator Architecture
• Common in early stored-program computers, when hardware was so expensive
• Machine has only one register (accumulator) involved in all math & logical operations
• All operations assume the accumulator as a source operand and a destination for the operation, with the other operand stored in memory

Extended Accumulator Architecture
• Dedicated registers for specific operations, e.g. stack and array index registers, added
• The 8086 microprocessor is an example of such a special-purpose register architecture

General-Purpose Register Architecture
• MIPS is an example of such an architecture, where registers are not tied to a single role
• This type of instruction set can be further divided into:
  • Register-memory: allows one operand to be in memory
  • Register-register (load-store): demands all operands to be in registers

Machine           # general-purpose registers   Architecture style                Year
Motorola 6800     2                             Accumulator                       1974
DEC VAX           16                            Register-memory, memory-memory    1977
Intel 8086        1                             Extended accumulator              1978
Motorola 68000    16                            Register-memory                   1980
Intel 80386       8                             Register-memory                   1985
PowerPC           32                            Load-store                        1992
DEC Alpha         32                            Load-store                        1992

Compact Code and Stack Architectures

When memory is scarce, machines like the Intel 8086 had variable-length instructions to match varying operand specifications and minimize code size.

Stack machines abandoned registers altogether, arguing that it is hard for compilers to use them efficiently.

Operands are pushed onto a stack from memory, and the results have to be popped from the stack to memory.

Operations take their operands by default from the top of the stack and insert the results back onto the stack.

Stack machines simplify compilers and lend themselves to a compact instruction encoding, but limit compiler optimization (e.g. in math expressions).

Example: A = B + C

Push AddressC   # Top=Top+4; Stack[Top]=Memory[AddressC]
Push AddressB   # Top=Top+4; Stack[Top]=Memory[AddressB]
add             # Stack[Top-4]=Stack[Top]+Stack[Top-4]; Top=Top-4
Pop AddressA    # Memory[AddressA]=Stack[Top]; Top=Top-4

Compact code is important for the heralded network computers, where programs must be downloaded over the Internet (e.g. Java-based applications).
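The four stack operations in the A = B + C example can be followed step by step in a few lines of Python. This is a sketch under stated assumptions: the `StackMachine` class, its dictionary-based stack, and the sample values of B and C are hypothetical, chosen only to mirror the Top = Top + 4 bookkeeping written in the slide's comments.

```python
WORD = 4  # bytes per stack slot, matching the Top = Top + 4 steps in the comments


class StackMachine:
    def __init__(self, memory):
        self.memory = memory  # maps a symbolic address to a value
        self.stack = {}       # maps a byte offset to a value
        self.top = 0          # byte offset of the current top of stack

    def push(self, address):
        # Top = Top + 4; Stack[Top] = Memory[address]
        self.top += WORD
        self.stack[self.top] = self.memory[address]

    def add(self):
        # Stack[Top-4] = Stack[Top] + Stack[Top-4]; Top = Top - 4
        self.stack[self.top - WORD] = self.stack[self.top] + self.stack[self.top - WORD]
        self.top -= WORD

    def pop(self, address):
        # Memory[address] = Stack[Top]; Top = Top - 4
        self.memory[address] = self.stack[self.top]
        self.top -= WORD


memory = {"A": 0, "B": 7, "C": 5}
machine = StackMachine(memory)
machine.push("C")
machine.push("B")
machine.add()      # operands are implicit: the two topmost stack entries
machine.pop("A")
print(memory["A"], machine.top)
```

Note that `add` names no operands at all, which is where the compact encoding comes from, while every operand still costs a separate push from memory.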

Other Types of Architecture

High-Level-Language Architecture
• In the 1960s, systems software was rarely written in high-level languages, and virtually every commercial operating system before Unix was written in assembly
• Some people blamed the code density on the instruction set rather than the programming language
• A machine design philosophy was advocated with the goal of making the hardware more like high-level languages
• The effectiveness of high-level languages, memory size limitations, and the lack of efficient compilers doomed this philosophy to a historical footnote

Reduced Instruction Set Architecture
• With the recent development in compiler technology and expanded memory sizes, fewer programmers are using assembly-level coding
• Instruction set architecture became measurable in the way compilers, rather than programmers, use it
• RISC architecture favors simplifying hardware design over enriching the offered set of instructions, relying on compilers to effectively use them to perform complex operations
• Virtually all new architectures since 1982 follow the RISC philosophy of fixed instruction lengths, load-store operations, and limited addressing modes

Evolution of Instruction Sets

[Figure: evolution of instruction sets, top to bottom]

Single Accumulator (EDSAC)
Accumulator + Index Registers (Manchester Mark I, ... series)
Separation of Programming Model from Implementation
- High-level Language Based
- Concept of a Family
General Purpose Register Machines
- Complex Instruction Sets (VAX, Intel ...)
- Load/Store Architecture (CDC ..., Cray 1)
RISC (MIPS, SPARC, ...)

Register-Memory Architectures

Effect of the number of memory operands

# memory addresses   Max. number of operands   Examples
0                    3                         SPARC, MIPS, PowerPC, ALPHA
1                    2                         Intel 80x86, Motorola 68000
2                    2                         VAX (also has 3 operands format)
3                    3                         VAX (also has 2 operands format)

Interpreting Memory Addressing

The address of a word matches the byte address of one of its 4 bytes.

The addresses of sequential words differ by 4 (word size in bytes).

Word addresses are multiples of 4 (alignment restriction).

Machines that use the address of the leftmost byte as the word address are called "Big Endian", and those that use the rightmost byte are called "Little Endian".

Misalignment complicates memory access and causes programs to run slower (some machines do not allow misaligned memory access at all).

Byte ordering can be a problem when exchanging data among different machines.

Byte addresses affect array index calculation, to account for word addressing and the offset within the word.

Object addressed   Aligned at byte offsets   Misaligned at byte offsets
Byte               0,1,2,3,4,5,6,7           Never
Half word          0,2,4,6                   1,3,5,7
Word               0,4                       1,2,3,5,6,7
Double word        0                         1,2,3,4,5,6,7
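The alignment rule behind the table (an object of size s bytes is aligned when its address is a multiple of s) and the two byte orders can be checked with a short sketch. The helper names `is_aligned` and `aligned_offsets` and the sample value are assumptions for illustration; `int.to_bytes` is the standard Python call.

```python
def is_aligned(address: int, size: int) -> bool:
    """An object of `size` bytes is aligned when its address is a multiple of size."""
    return address % size == 0


def aligned_offsets(size: int, span: int = 8) -> list:
    """List the aligned byte offsets inside a `span`-byte window."""
    return [offset for offset in range(span) if is_aligned(offset, size)]


for name, size in (("Byte", 1), ("Half word", 2), ("Word", 4), ("Double word", 8)):
    print(f"{name:12} aligned at {aligned_offsets(size)}")

# Byte order: the same 32-bit word laid out in memory both ways.
value = 0x0A0B0C0D
big = value.to_bytes(4, "big")        # most significant byte at the lowest address
little = value.to_bytes(4, "little")  # least significant byte at the lowest address
print(big.hex(), little.hex())
```

Reading the little-endian bytes back as big-endian would give a different number, which is the data-exchange problem mentioned above.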

Addressing Modes

Addressing modes refer to how the location of an operand (effective address) is specified.

Addressing modes have the ability to:
- Significantly reduce instruction counts
- Increase the average CPI
- Increase the complexity of building a machine

The VAX machine is used for benchmark data, since it supports a wide range of memory addressing modes.

Well-known addressing modes can be classified based on:
- the source of the data, into register, immediate, or memory
- the address calculation, into direct and indirect

An indexed addressing mode is usually provided to allow efficient implementation of loops and array access.

Example of Addressing Modes

Register
  Example: Add R4, R3
  Meaning: Regs[R4] ← Regs[R4] + Regs[R3]
  When used: When a value is in a register

Immediate
  Example: Add R4, #3
  Meaning: Regs[R4] ← Regs[R4] + 3
  When used: For constants

Displacement
  Example: Add R4, 100(R1)
  Meaning: Regs[R4] ← Regs[R4] + Mem[100 + Regs[R1]]
  When used: Accessing local variables

Register indirect
  Example: Add R4, (R1)
  Meaning: Regs[R4] ← Regs[R4] + Mem[Regs[R1]]
  When used: Accessing using a pointer or a computed address

Indexed
  Example: Add R4, (R1 + R2)
  Meaning: Regs[R4] ← Regs[R4] + Mem[Regs[R1] + Regs[R2]]
  When used: Sometimes useful in array addressing: R1 = base of the array; R2 = index amount

Direct or absolute
  Example: Add R4, (1001)
  Meaning: Regs[R4] ← Regs[R4] + Mem[1001]
  When used: Sometimes useful for accessing static data; address constant may need to be large

Memory indirect or memory deferred
  Example: Add R4, @(R3)
  Meaning: Regs[R4] ← Regs[R4] + Mem[Mem[Regs[R3]]]
  When used: If R3 is the address of the pointer p, then the mode yields *p

Autoincrement
  Example: Add R4, (R2)+
  Meaning: Regs[R4] ← Regs[R4] + Mem[Regs[R2]]; Regs[R2] ← Regs[R2] + d
  When used: Useful for stepping through arrays within a loop. R2 points to the start of the array; each reference increments R2 by d

Autodecrement
  Example: Add R4, -(R2)
  Meaning: Regs[R2] ← Regs[R2] - d; Regs[R4] ← Regs[R4] + Mem[Regs[R2]]
  When used: Same use as autoincrement. Autodecrement/increment can also act as push/pop to implement a stack

Scaled
  Example: Add R4, 100(R2)[R3]
  Meaning: Regs[R4] ← Regs[R4] + Mem[100 + Regs[R2] + Regs[R3] * d]
  When used: Used to index arrays
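The "Meaning" entries for the memory addressing modes reduce to one effective-address calculation per mode. The sketch below is a hypothetical helper, not part of any real simulator: the function name, the keyword parameters, and the register and memory contents are assumptions. Register and immediate modes are omitted because they do not reference memory.

```python
def effective_address(mode, regs, mem, *, disp=0, base=None, index=None, d=4):
    """Return the memory address the operand refers to.

    The auto modes also update the base register, as in the table above.
    """
    if mode == "displacement":        # Mem[disp + Regs[base]]
        return disp + regs[base]
    if mode == "register_indirect":   # Mem[Regs[base]]
        return regs[base]
    if mode == "indexed":             # Mem[Regs[base] + Regs[index]]
        return regs[base] + regs[index]
    if mode == "direct":              # Mem[disp]
        return disp
    if mode == "memory_indirect":     # Mem[Mem[Regs[base]]]
        return mem[regs[base]]
    if mode == "autoincrement":       # use the address, then add d
        address = regs[base]
        regs[base] += d
        return address
    if mode == "autodecrement":       # subtract d, then use the address
        regs[base] -= d
        return regs[base]
    if mode == "scaled":              # Mem[disp + Regs[base] + Regs[index] * d]
        return disp + regs[base] + regs[index] * d
    raise ValueError(f"unknown addressing mode: {mode}")


regs = {"R1": 200, "R2": 16, "R3": 3}
mem = {200: 1000}  # the word at address 200 holds a pointer to address 1000

print(effective_address("displacement", regs, mem, disp=100, base="R1"))
print(effective_address("memory_indirect", regs, mem, base="R1"))
print(effective_address("scaled", regs, mem, disp=100, base="R2", index="R3"))
```

An autoincrement followed by an autodecrement on the same register returns the same address twice, which is why the pair can serve as pop and push.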

Addressing Mode for Signal Processing

DSP offers special addressing modes to better serve popular algorithms.

Modulo addressing
- Since DSP deals with continuous data streams, circular buffers are widely used
- Circular or modulo addressing allows automatic increment and decrement, and resets the pointer when reaching the end of the buffer

Reverse addressing
- The resulting address is the current address with its bits in reverse order
- Reverse addressing mode expedites the access, which otherwise requires a number of logical instructions or extra memory accesses

Fast Fourier Transform

0 (000) → 0 (000)
1 (001) → 4 (100)
2 (010) → 2 (010)
3 (011) → 6 (110)
4 (100) → 1 (001)
5 (101) → 5 (101)
6 (110) → 3 (011)
7 (111) → 7 (111)

Special features require either hand coding or a compiler that uses such features (GNU would not be a good choice).
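Both DSP addressing modes are simple to express in software, which also shows the "number of logical instructions" a processor without them has to spend. The function names and the buffer parameters below are assumptions made for this sketch.

```python
def bit_reverse(index: int, bits: int) -> int:
    """Reverse the low `bits` bits of index (reverse addressing)."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)  # move the lowest bit of index into result
        index >>= 1
    return result


def next_circular(pointer: int, step: int, base: int, length: int) -> int:
    """Advance a pointer inside the buffer [base, base + length), wrapping at both ends."""
    return base + (pointer - base + step) % length


# Access order of an 8-point FFT: each 3-bit index is bit-reversed.
print([bit_reverse(i, 3) for i in range(8)])

# Walk an 8-entry circular buffer that starts at address 0x100.
pointer = 0x106
for _ in range(3):
    pointer = next_circular(pointer, 1, base=0x100, length=8)
    print(hex(pointer))
```

The loop in `bit_reverse` costs several shift and mask operations per access; a hardware reverse-addressing mode produces the same address as part of the load.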

Operations of the Computer Hardware

"There must certainly be instructions for performing the fundamental arithmetic operations."
Burks, Goldstine, and von Neumann, 1947

Assembly language is a symbolic representation of what the processor actually understands.

The MIPS assembler allows only one instruction per line and ignores comments following # until the end of the line.

Example:

Translation of a segment of a C program to MIPS assembly instructions

C: f = (g + h) - (i + j)

MIPS:

add t0, g, h    # temp variable t0 contains "g + h"
add t1, i, j    # temp variable t1 contains "i + j"
sub f, t0, t1   # f = t0 - t1 = (g + h) - (i + j)

Operations in the Instruction Set

Operator type          Examples
Arithmetic and logical Integer arithmetic and logical operations: add, and, subtract, or
Data transfer          Loads-stores (move instructions on machines with memory addressing)
Control                Branch, jump, procedure call and return, trap
System                 Operating system call, virtual memory management instructions
Floating point         Floating point instructions: add, multiply
Decimal                Decimal add, decimal multiply, decimal to character conversion
String                 String move, string compare, string search
Graphics               Pixel operations, compression/decompression operations

Arithmetic, logical, data transfer, and control are almost standard categories for all machines.

System instructions are required for multi-programming environments, although support for system functions varies.

Decimal and string instructions can be primitives, e.g. on the IBM 360 and the VAX.

Support for floating point, decimal, string, and graphics is optional and is sometimes provided via a co-processor.

Some machines rely on the compiler to synthesize special operations, such as string handling, from simpler instructions.

Operations for Media & Signal Processing

Single instruction multiple data (SIMD) and vector instructions are often supported in DSPs, which are commonly used in multimedia and signal processing applications.

Partitioned Add (integer)
- Perform multiple 16-bit additions on a 64-bit ALU, since most data are narrow
- Increases ALU throughput for multimedia applications

Paired single operations (float)
- Allow the same register to act as two operands to the same operation
- Handy in dealing with vertices and coordinates

Multiply and accumulate
- Very handy for calculating dot products of vectors (signal processing) and matrix multiplication
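A partitioned add and a multiply-accumulate loop can be sketched as follows. The lane layout (four 16-bit lanes in a 64-bit word) comes from the slide; the wrap-around behavior on lane overflow is an assumption of this sketch (real media instruction sets often also offer saturating variants), and the function names are hypothetical.

```python
LANES = 4
LANE_BITS = 16
LANE_MASK = (1 << LANE_BITS) - 1


def partitioned_add(x: int, y: int) -> int:
    """Add four 16-bit lanes packed into 64-bit words; carries never cross a lane."""
    result = 0
    for lane in range(LANES):
        shift = lane * LANE_BITS
        a = (x >> shift) & LANE_MASK
        b = (y >> shift) & LANE_MASK
        result |= ((a + b) & LANE_MASK) << shift  # drop the carry out of the lane
    return result


def dot_product(xs, ys):
    """Accumulate x*y products; each loop iteration is one multiply-accumulate."""
    acc = 0
    for x, y in zip(xs, ys):
        acc += x * y
    return acc


packed = partitioned_add(0x0001_FFFF_0003_0004, 0x0001_0001_0001_0001)
print(f"{packed:#018x}")
print(dot_product([1, 2, 3], [4, 5, 6]))
```

In hardware the four lane additions happen in one ALU pass; the loop here only spells out what the carry chain is prevented from doing.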

Frequency of Operations Usage

Make the common case fast by focusing on these operations.

The most widely executed instructions are the simple operations of an instruction set.

The following is the average usage in SPECint92 on Intel 80x86:

Rank   80x86 instruction        Integer average (% total executed)
1      Load                     22%
2      Conditional branch       20%
3      Compare                  16%
4      Store                    12%
5      Add                      8%
6      And                      6%
7      Sub                      5%
8      Move register-register   4%
9      Call                     1%
10     Return                   1%
       Total                    96%

Control Flow Instructions

Jump: for unconditional change in the control flow

Branch: for conditional change in the control flow

Procedure calls and returns

[Figure: frequency of control flow instructions] Data is based on SPEC on Alpha

Destination Address Definition

Relative addressing with respect to the program counter proved to be the best choice for forward and backward branches or jumps (load address independent).

To allow for dynamic loading of library routines, register indirect addressing allows addresses to be loaded in special registers (e.g. virtual functions in C++ and system calls in a case statement).

[Figure: branch distance] Data is based on SPEC on Alpha
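Why PC-relative addressing is load address independent can be seen from the target calculation itself. The sketch below assumes the MIPS convention, in which the offset counts 4-byte instructions from the instruction after the branch; the function name and the sample addresses are made up for illustration.

```python
def branch_target(pc: int, offset_words: int) -> int:
    """Target of a MIPS-style PC-relative branch located at address pc."""
    return pc + 4 + 4 * offset_words


# The same branch, with the same encoded offset, loaded at two different addresses.
for load_address in (0x0040_0000, 0x1000_0000):
    branch_pc = load_address + 0x20           # the branch sits 0x20 bytes into the code
    target = branch_target(branch_pc, -5)     # backward branch, e.g. to a loop head
    print(hex(target - load_address))         # same position inside the code both times
```

Because only the distance is encoded, the code can be placed anywhere in memory without patching the branch.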

Condition Evaluation

Compare-and-branch can be efficient if the majority of conditions are comparisons with zero.

Remember to focus on the common case.

[Figure: condition evaluation] Based on SPEC on MIPS

Frequency of Types of Comparison

[Figure: frequency of types of comparison] Data is based on SPEC on Alpha

Different benchmarks and machines set new design priorities.

DSPs support a repeat instruction for for-loops (vectors) using registers.

Supporting Procedures

Execution of a procedure follows these steps:
- Store parameters in a place accessible to the procedure
- Transfer control to the procedure
- Acquire the storage resources needed for the procedure
- Perform the desired task
- Store the result value in a place accessible to the calling program
- Return control to the point of origin

The hardware provides a program counter to trace instruction flow and manage transfer of control.

Parameter Passing
- Registers can be used for passing a small number of parameters
- A stack is used to spill registers of the current context and make room for the called procedure to run, and to allow for large parameters to be passed
- Storage of machine state can be performed by caller or callee
- Handling of shared variables is important to ensure correct semantics, and thus requires clear specifications in the library interface

Global variables stored in registers need careful handling.

Type and Size of Operands

The type of an operand is designated by encoding it in the instruction's operation code.

The type of an operand, e.g. single precision float, effectively gives its size.

Common operand types include character, half word and word size integer, and single- and double-precision floating point.

Characters are almost always in ASCII, integers are in 2's complement, and floating point is in IEEE 754.

The 16-bit Unicode used in Java is gaining popularity due to its support for international character sets.

For business applications, some architectures support a decimal format in binary coded decimal (BCD).

Depending on the size of the word, the complexity of handling different operand types differs.

DSP offers fixed point data types to support high precision floating point arithmetic and to allow sharing a single exponent for multiple numbers.

For graphics applications, vertex and pixel operands are added features.

Size of Operands

Double-word data type is used for double-precision floating point operations and for address storage in machines with a 64-bit wide address bus.

Words are used for integer operations and for 32-bit address bus machines.

Because of the mix in SPEC, word and double-word data types dominate.

[Figure: frequency of reference by size, based on SPEC2000 on Alpha]

Instruction Representation

Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary, or base 2).

Numbers are stored in computers as a series of high and low electronic signals (binary numbers).

Binary digits are called bits and are considered the atom of computing.

Each piece of an instruction is a number, and placing these numbers together forms the instruction.

The assembler translates the symbolic assembly instructions into machine language instructions (machine code).

Example:

Assembly: add $t0, $s1, $s2

M/C language (decimal):   0        17       18       8        0        32
M/C language (binary):    000000   10001    10010    01000    00000    100000
                          6 bits   5 bits   5 bits   5 bits   5 bits   6 bits

Note: the MIPS compiler by default maps $s0, ..., $s7 to registers 16-23 and $t0, ..., $t7 to registers 8-15.
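"Placing these numbers together" is a matter of shifting each field to its position and OR-ing the pieces. The function below is a minimal sketch (its name is an assumption); the field widths and the register numbers for add $t0, $s1, $s2 are the ones given in the example above.

```python
def encode_r_format(op, rs, rt, rd, shamt, funct):
    """Pack six fields (6, 5, 5, 5, 5, 6 bits) into one 32-bit instruction word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


# add $t0, $s1, $s2: rs = $s1 = 17, rt = $s2 = 18, rd = $t0 = 8
word = encode_r_format(op=0, rs=17, rt=18, rd=8, shamt=0, funct=32)
print(f"{word:032b}")   # the six binary fields run together
print(f"{word:#010x}")  # the same word in hexadecimal
```

Reading the binary string in groups of 6, 5, 5, 5, 5, and 6 bits gives back the decimal row of the example.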

Encoding an Instruction Set

Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation.

The operation is typically specified in one field, called the opcode.

The addressing mode for the operand can be encoded with the operation, or specified through a separate identifier when a large number of modes is supported.

The architecture must balance several competing factors:
- Desire to support as many registers and addressing modes as possible
- Effect of operand specification on the size of the instruction (program)
- Desire to simplify instruction fetching and decoding during execution

Fixed-size instruction encoding simplifies the CPU design while limiting the addressing modes supported.

An architect caring about code size can use variable-size encoding.

A hybrid approach is to allow variability by supporting multiple-sized instructions.

Encoding Examples

MIPS Instruction Format

Register-format instructions

op       rs       rt       rd       shamt    funct
6 bits   5 bits   5 bits   5 bits   5 bits   6 bits

op: Basic operation of the instruction, traditionally called opcode
rs: The first register source operand
rt: The second register source operand
rd: The register destination operand; it gets the result of the operation
shamt: Shift amount
funct: This field selects the specific variant of the operation of the op field

Immediate-type instructions

op       rs       rt       address
6 bits   5 bits   5 bits   16 bits

Some instructions need longer fields than provided, for large value constants.

The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register.

Example: lw $t0, 32($s3)   # Temporary register $t0 gets A[8]

Instruction   Format   op   rs    rt    rd    shamt   funct   address
add           R        0    reg   reg   reg   0       32      N/A
sub           R        0    reg   reg   reg   0       34      N/A
lw            I        35   reg   reg   N/A   N/A     N/A     address
sw            I        43   reg   reg   N/A   N/A     N/A     address
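The immediate format and its ±2^15 reach can be sketched the same way. The function name is hypothetical; the opcode 35 for lw and the register numbers ($s3 = 19, $t0 = 8) follow from the table and the register mapping stated earlier. Storing the offset in two's complement is the standard MIPS convention.

```python
def encode_i_format(op, rs, rt, offset):
    """Pack an immediate-format instruction; offset must fit in a signed 16-bit field."""
    if not -(1 << 15) <= offset < (1 << 15):
        raise ValueError("offset does not fit in 16 bits")
    # Masking stores a negative offset in two's complement form.
    return (op << 26) | (rs << 21) | (rt << 16) | (offset & 0xFFFF)


# lw $t0, 32($s3): op = 35, base register rs = $s3 = 19, destination rt = $t0 = 8
word = encode_i_format(op=35, rs=19, rt=8, offset=32)
print(f"{word:#010x}")

# An offset of 2^15 bytes is one past the largest positive value the field can hold.
try:
    encode_i_format(op=35, rs=19, rt=8, offset=1 << 15)
except ValueError as error:
    print("rejected:", error)
```

Data farther than the 16-bit field allows has to be reached by first building a new base address in a register.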

The Stored Program Concept

Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept.

Today's computers are built on two key principles:
- Instructions are represented as numbers
- Programs can be stored in memory to be read or written just like numbers

The power of the concept: memory can contain
- the source code for an editor
- the compiled m/c code for the editor
- the text that the compiled program is using
- the compiler that generated the code

[Figure: a processor connected to a memory that holds an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and source code in C for the editor program]

Compiling if-then-else in MIPS

Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiled MIPS code for the following C if statement?

if (i == j) f = g + h; else f = g - h;

[Figure: flowchart. Test i == j; when i = j, f = g + h; when i ≠ j, control goes to Else: and f = g - h; both paths meet at Exit:]

MIPS:

      bne $s3, $s4, Else   # go to Else if i ≠ j
      add $s0, $s1, $s2    # f = g + h (skipped if i ≠ j)
      j Exit
Else: sub $s0, $s1, $s2    # f = g - h (skipped if i = j)
Exit:

Typical Compilation

Major Types of Optimization

High-level (at or near source level; machine independent)
- Procedure integration: replace procedure call by procedure body (frequency: N.M.)

Local (within straight-line code)
- Common subexpression elimination: replace two instances of the same computation by a single copy
- Constant propagation: replace all instances of a variable that is assigned a constant with the constant
- Stack height reduction: rearrange expression tree to minimize resources needed for expression evaluation (frequency: N.M.)

Global (across a branch)
- Global common subexpression elimination: same as local, but this version crosses branches
- Copy propagation: replace all instances of a variable A that has been assigned X (i.e. A = X) with X
- Code motion: remove code from a loop that computes the same value each iteration of the loop
- Induction variable elimination: simplify/eliminate array addressing calculations within loops

Machine-dependent (depends on machine knowledge)
- Strength reduction: many examples, such as replacing multiply by a constant with adds and shifts (frequency: N.M.)
- Pipeline scheduling: reorder instructions to improve pipeline performance (frequency: N.M.)
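Two of the transformations listed above are easy to show as before/after pairs. The functions below are illustrative stand-ins for what a compiler does to its intermediate code; their names and the choice of the constant 10 are assumptions made for this sketch.

```python
# Strength reduction: multiply by a constant replaced with shifts and an add.
def times_ten(x: int) -> int:
    return x * 10


def times_ten_reduced(x: int) -> int:
    return (x << 3) + (x << 1)  # 8x + 2x


# Common subexpression elimination: (a + b) is computed once and reused.
def before_cse(a: int, b: int, c: int) -> int:
    return (a + b) * c + (a + b)


def after_cse(a: int, b: int, c: int) -> int:
    t = a + b  # single copy of the shared computation
    return t * c + t


print(times_ten(7), times_ten_reduced(7))
print(before_cse(1, 2, 3), after_cse(1, 2, 3))
```

Both rewrites keep the result unchanged; what changes is the number and the cost of the operations executed.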

Effect of Compiler Optimization

[Figure: program and compiler optimization level. Measurements taken on MIPS]

Level 0: non-optimized code
Level 1: local optimization
Level 2: global optimization, s/w pipelining
Level 3: adds procedure integration

Compiler Support for Multimedia Instructions

Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics).

Intel added a new set of instructions, called Streaming SIMD Extension.

A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer.

Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations.
- Strided addressing allows memory access in increments larger than one
- Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data

Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup.

Such limited support for vector processing makes the use of vectorizing compiler optimization unpopular and restricts its scope to hand-coded routines.

SIMD instructions on MMX and AltiVec tend to be solutions, not primitives.
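Strided and gather/scatter addressing differ only in where the element addresses come from. In the sketch below memory is modeled as a Python list indexed by address; the function names and the sample contents are assumptions for illustration.

```python
def strided_load(memory, base, stride, count):
    """Load `count` elements starting at `base`, `stride` locations apart."""
    return [memory[base + i * stride] for i in range(count)]


def gather(memory, addresses):
    """Load from an explicit list of addresses (an index vector)."""
    return [memory[address] for address in addresses]


def scatter(memory, addresses, values):
    """Store each value to the matching address in the index vector."""
    for address, value in zip(addresses, values):
        memory[address] = value


memory = list(range(100, 116))            # memory[i] holds 100 + i
print(strided_load(memory, 1, 4, 3))      # e.g. one column of a 4-wide row-major matrix
print(gather(memory, [3, 0, 7]))
scatter(memory, [3, 0], [-1, -2])
print(memory[:4])
```

Without such modes the elements must be collected one at a time with scalar loads before a SIMD instruction can use them, which is the speedup limit mentioned above.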


Starting a Program

[Figure: C program -> Compiler -> Assembly language program -> Assembler -> Object: machine language module, plus Object: library routine (machine language) -> Linker -> Executable: machine language program -> Loader -> Memory]

Linker

- Place code and data modules symbolically in memory
- Determine the addresses of data and instruction labels
- Patch both internal and external references

Object files for Unix typically contain:

- Header: size and position of components
- Text segment: machine code
- Data segment: static and dynamic variables
- Relocation info: identifies absolute memory references
- Symbol table: names and locations of labels, procedures and variables
- Debugging info: mapping of source to object code, break points, etc.
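The three linker tasks can be sketched as a toy program. Everything here is an assumption made for illustration: the `ObjectModule` layout is hypothetical and far simpler than a real Unix object file, addresses are counted in words, and a `0` word marks a slot waiting to be patched.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ObjectModule:
    """Hypothetical, heavily simplified object file."""
    name: str
    text: List[int]                       # machine words; 0 marks an unresolved address slot
    symbols: Dict[str, int]               # label -> offset inside this module
    relocations: List[Tuple[int, str]] = field(default_factory=list)  # (offset, label)


def link(modules: List[ObjectModule], base: int = 0) -> List[int]:
    # Step 1: place the modules one after another and record each start address.
    starts: Dict[str, int] = {}
    address = base
    for m in modules:
        starts[m.name] = address
        address += len(m.text)

    # Step 2: determine the absolute address of every label.
    absolute: Dict[str, int] = {}
    for m in modules:
        for label, offset in m.symbols.items():
            absolute[label] = starts[m.name] + offset

    # Step 3: patch internal and external references with those addresses.
    image: List[int] = []
    for m in modules:
        words = list(m.text)
        for offset, label in m.relocations:
            words[offset] = absolute[label]
        image.extend(words)
    return image
```

Linking a `main` module that refers to a label `helper` defined in a second module fills the placeholder word with the label's final address.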

Loading Executable Program

[Figure: memory layout]

$sp -> 7fff ffff hex    Stack
                        Dynamic data
$gp -> 1000 8000 hex    Static data
       1000 0000 hex
                        Text
pc  -> 0040 0000 hex
                        Reserved
       0

To load an executable, the operating system follows these steps:

Reads the executable file header to determine the size of the text and data segments

Creates an address space large enough for the text and data

Copies the instructions and data from the executable file into memory

Copies the parameters (if any) to the main program onto the stack

Initializes the machine registers and sets the stack pointer to the first free location

Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program
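The loading steps can be mirrored in a short sketch. This is a toy model rather than how any real operating system works: the `Executable` and `Process` types are hypothetical, memory is a list of words, the stack grows downward from the top of memory, and the start-up routine is assumed to sit at address 0.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Executable:
    """Hypothetical executable: the 'header' is just the two segment sizes."""
    text: List[int]
    data: List[int]


@dataclass
class Process:
    memory: List[int]
    pc: int
    sp: int


def load(exe: Executable, args: List[int], stack_words: int = 8) -> Process:
    if len(args) >= stack_words:
        raise ValueError("stack too small for the parameters")
    # 1. Read the header to learn the text and data segment sizes.
    text_size, data_size = len(exe.text), len(exe.data)
    # 2. Create an address space large enough for text, data and a stack.
    memory = [0] * (text_size + data_size + stack_words)
    # 3. Copy instructions and data into memory.
    memory[0:text_size] = exe.text
    memory[text_size:text_size + data_size] = exe.data
    # 4. Copy the parameters onto the stack.
    sp = len(memory)
    for arg in args:
        sp -= 1
        memory[sp] = arg
    # 5. Set the stack pointer to the first free location below the parameters.
    sp -= 1
    # 6. "Jump" to the start-up routine by setting the program counter.
    return Process(memory=memory, pc=0, sp=sp)
```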

Instruction Set Design Issues

Instruction set design issues:

Number of Addresses

Flow of Control

Operand Types

Addressing Modes

Instruction Types

Instruction Formats

Number of Addresses

Four categories

3-address machines
- 2 for the source operands and one for the result

2-address machines
- One address doubles as source and result

1-address machines
- Accumulator machines
- Accumulator is used for one source and result

0-address machines
- Stack machines
- Operands are taken from the stack
- Result goes onto the stack

Number of Addresses (cont.)

Three-address machines

Two for the source operands, one for the result

RISC processors use three addresses

Sample instructions

    add  dest,src1,src2    ; M(dest) = [src1] + [src2]
    sub  dest,src1,src2    ; M(dest) = [src1] - [src2]
    mult dest,src1,src2    ; M(dest) = [src1] * [src2]

Three addresses: Operand 1, Operand 2, Result

Example: a = b + c

Three-address instruction formats are not common because they require a relatively long instruction format to hold the three address references

Number of Addresses (cont.)

Example

C statement

    A = B + C * D - E + F + A

Equivalent code:

    mult T,C,D    ; T = C*D
    add  T,T,B    ; T = B+C*D
    sub  T,T,E    ; T = B+C*D-E
    add  T,T,F    ; T = B+C*D-E+F
    add  A,T,A    ; A = B+C*D-E+F+A
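To check the three-address sequence by hand, a minimal interpreter is enough. The sketch below assumes every operand names a memory cell, held in a Python dictionary; the tuple encoding of instructions is an illustration, not a real instruction format.

```python
from typing import Dict, List, Tuple

Instruction = Tuple[str, str, str, str]  # (opcode, dest, src1, src2)

OPS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mult": lambda a, b: a * b,
}


def run_three_address(program: List[Instruction], memory: Dict[str, int]) -> Dict[str, int]:
    """Execute dest = src1 op src2 for each instruction in order."""
    for opcode, dest, src1, src2 in program:
        memory[dest] = OPS[opcode](memory[src1], memory[src2])
    return memory


# The five-instruction sequence for A = B + C*D - E + F + A.
PROGRAM: List[Instruction] = [
    ("mult", "T", "C", "D"),
    ("add", "T", "T", "B"),
    ("sub", "T", "T", "E"),
    ("add", "T", "T", "F"),
    ("add", "A", "T", "A"),
]
```

Running it with A=1, B=2, C=3, D=4, E=5, F=6 leaves A equal to 2 + 12 - 5 + 6 + 1.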

Number of Addresses (cont.)

Two-address machines

One address doubles (for source operand and result)

Last example makes a case for it
- Address T is used twice

Sample instructions

    load dest,src    ; M(dest) = [src]
    add  dest,src    ; M(dest) = [dest] + [src]
    sub  dest,src    ; M(dest) = [dest] - [src]
    mult dest,src    ; M(dest) = [dest] * [src]

Two addresses: one address doubles as operand and result

Example: a = a + b

The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation

Number of Addresses (cont.)

Example

C statement

    A = B + C * D - E + F + A

Equivalent code:

    load T,C    ; T = C
    mult T,D    ; T = C*D
    add  T,B    ; T = B+C*D
    sub  T,E    ; T = B+C*D-E
    add  T,F    ; T = B+C*D-E+F
    add  A,T    ; A = B+C*D-E+F+A

Number of Addresses (cont.)

One-address machines

Use special set of registers called accumulators
- Specify one source operand and receive the result

Called accumulator machines

Sample instructions

    load  addr    ; accum = [addr]
    store addr    ; M[addr] = accum
    add   addr    ; accum = accum + [addr]
    sub   addr    ; accum = accum - [addr]
    mult  addr    ; accum = accum * [addr]

Number of Addresses (cont.)

Example

C statement

    A = B + C * D - E + F + A

Equivalent code:

    load  C    ; load C into accum
    mult  D    ; accum = C*D
    add   B    ; accum = C*D+B
    sub   E    ; accum = B+C*D-E
    add   F    ; accum = B+C*D-E+F
    add   A    ; accum = B+C*D-E+F+A
    store A    ; store accum contents in A
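The accumulator sequence can be traced with a very small model. This sketch assumes a single accumulator and a dictionary standing in for memory; the `(opcode, addr)` tuples are only an illustrative encoding.

```python
from typing import Dict, List, Tuple


def run_accumulator(program: List[Tuple[str, str]], memory: Dict[str, int]) -> Dict[str, int]:
    """Run a one-address program; the accumulator is the implicit second operand."""
    accum = 0
    for opcode, addr in program:
        if opcode == "load":
            accum = memory[addr]
        elif opcode == "store":
            memory[addr] = accum
        elif opcode == "add":
            accum += memory[addr]
        elif opcode == "sub":
            accum -= memory[addr]
        elif opcode == "mult":
            accum *= memory[addr]
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return memory


# The seven-instruction sequence for A = B + C*D - E + F + A.
ACC_PROGRAM: List[Tuple[str, str]] = [
    ("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
    ("add", "F"), ("add", "A"), ("store", "A"),
]
```

Note that no temporary T is needed: the accumulator plays that role, at the cost of one memory access per instruction.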

Number of Addresses (cont.)

Zero-address machines

Stack supplies operands and receives the result
- Special instructions to load and store use an address

Called stack machines (Ex: HP3000, Burroughs B5500)

Sample instructions

    push addr    ; push([addr])
    pop  addr    ; pop([addr])
    add          ; push(pop + pop)
    sub          ; push(pop - pop)
    mult         ; push(pop * pop)

Number of Addresses (cont.)

Example

C statement

    A = B + C * D - E + F + A

Equivalent code:

    push E
    push C
    push D
    mult
    push B
    add
    sub
    push F
    add
    push A
    add
    pop  A
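A stack-machine trace makes the operand order visible. The sketch below follows the `push(pop - pop)` convention from the sample instructions, so the first value popped (the top of the stack) is the left operand; the tuple encoding is an assumption for illustration.

```python
from typing import Dict, List, Tuple


def run_stack(program: List[Tuple[str, ...]], memory: Dict[str, int]) -> Dict[str, int]:
    """Run a zero-address program; only push and pop carry an address."""
    stack: List[int] = []
    for instr in program:
        opcode = instr[0]
        if opcode == "push":
            stack.append(memory[instr[1]])
        elif opcode == "pop":
            memory[instr[1]] = stack.pop()
        else:
            first = stack.pop()    # top of stack, left operand
            second = stack.pop()   # next entry, right operand
            if opcode == "add":
                stack.append(first + second)
            elif opcode == "sub":
                stack.append(first - second)
            elif opcode == "mult":
                stack.append(first * second)
            else:
                raise ValueError(f"unknown opcode: {opcode}")
    return memory


# The twelve-instruction sequence for A = B + C*D - E + F + A.
STACK_PROGRAM: List[Tuple[str, ...]] = [
    ("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
    ("push", "B"), ("add",), ("sub",), ("push", "F"),
    ("add",), ("push", "A"), ("add",), ("pop", "A"),
]
```

Pushing E first is what lets the later `sub` compute (B + C*D) - E rather than the reverse.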

Load/Store Architecture

Instructions expect operands in internal processor registers

Special LOAD and STORE instructions move data between registers and memory

RISC uses this architecture

Reduces instruction length

Load/Store Architecture (cont.)

Sample instructions

    load  Rd,addr       ; Rd = [addr]
    store addr,Rs       ; (addr) = Rs
    add   Rd,Rs1,Rs2    ; Rd = Rs1 + Rs2
    sub   Rd,Rs1,Rs2    ; Rd = Rs1 - Rs2
    mult  Rd,Rs1,Rs2    ; Rd = Rs1 * Rs2

Number of Addresses (cont.)

Example

C statement

    A = B + C * D - E + F + A

Equivalent code:

    load  R1,B
    load  R2,C
    load  R3,D
    load  R4,E
    load  R5,F
    load  R6,A
    mult  R2,R2,R3
    add   R2,R2,R1
    sub   R2,R2,R4
    add   R2,R2,R5
    add   R2,R2,R6
    store A,R2

Flow of Control

Default is sequential flow

Several instructions alter this default execution

Branches
- Unconditional
- Conditional
- Delayed branches

Procedure calls
- Delayed procedure calls

Flow of Control (cont.)

Branches

Unconditional
- Absolute address
- PC-relative
  - Target address is specified relative to PC contents
  - Relocatable code

Example: MIPS
- Absolute address
    j target
- PC-relative
    b target

Flow of Control (cont.)

Branches

Conditional
- Jump is taken only if the condition is met

Two types
- Set-Then-Jump
  - Condition testing is separated from branching
  - Condition code registers are used to convey the condition test result
  - Condition code registers keep a record of the status of the last ALU operation, such as overflow condition
- Example: Pentium code
    cmp AX,BX    ; compare AX and BX
    je  target   ; jump if equal

Flow of Control (cont.)

- Test-and-Jump
  - Single instruction performs condition testing and branching
- Example: MIPS instruction
    beq Rsrc1,Rsrc2,target
  Jumps to target if Rsrc1 = Rsrc2

Delayed branching

Control is transferred after executing the instruction that follows the branch instruction
- This instruction slot is called the delay slot

Improves efficiency

Highly pipelined RISC processors support it

Flow of Control (cont.)

Procedure calls

Facilitate modular programming

Require two pieces of information to return
- End of procedure
  - Pentium uses the ret instruction
  - MIPS uses the jr instruction
- Return address
  - In a (special) register: MIPS allows any general-purpose register
  - On the stack: Pentium

Flow of Control (cont.)

[Figure: delay slot]

Parameter Passing

Two basic techniques

Register-based (e.g., PowerPC, MIPS)
- Internal registers are used
  - Faster
  - Limit the number of parameters
  - Recursive procedure

Stack-based (e.g., Pentium)
- Stack is used
  - More general

2 Operand Types

Instructions support basic data types
- Characters
- Integers
- Floating-point

Instruction overload
- Same instruction for different data types
- Example: Pentium
    mov AL,address     ; loads an 8-bit value
    mov AX,address     ; loads a 16-bit value
    mov EAX,address    ; loads a 32-bit value

Operand Types (cont.)

Separate instructions
- Instructions specify the operand size
- Example: MIPS
    lb Rdest,address    ; loads a byte
    lh Rdest,address    ; loads a halfword (16 bits)
    lw Rdest,address    ; loads a word (32 bits)
    ld Rdest,address    ; loads a doubleword (64 bits)

Similar instructions for store

3 Addressing Modes

How the operands are specified

Operands can be in three places
- Registers
  - Register addressing mode
- Part of instruction
  - Constant
  - Immediate addressing mode
  - All processors support these two addressing modes
- Memory
  - Difference between RISC and CISC
  - CISC supports a large variety of addressing modes
  - RISC follows load/store architecture

4 Instruction Types

Several types of instructions

Data movement
- Pentium: mov dest,src
- Some do not provide direct data movement instructions
- Indirect data movement
    add Rdest,Rsrc,0    ; Rdest = Rsrc+0

Arithmetic and Logical
- Arithmetic
  - Integer and floating-point, signed and unsigned
  - add, subtract, multiply, divide
- Logical
  - and, or, not, xor

Instruction Types (cont.)

Condition code bits

S: Sign bit (0 = +, 1 = -)

Z: Zero bit (0 = nonzero, 1 = zero)

O: Overflow bit (0 = no overflow, 1 = overflow)

C: Carry bit (0 = no carry, 1 = carry)

Example: Pentium

    cmp count,25    ; compare count to 25
                    ; subtract 25 from count
    je  target      ; jump if equal
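How a compare sets the four condition code bits can be sketched as follows. This is an illustration under stated assumptions: operands are 8 bits wide by default, and the carry bit is treated as a borrow on subtraction, which is the x86 convention and is not spelled out on the slide. The function names are hypothetical.

```python
from typing import Dict


def compare(a: int, b: int, bits: int = 8) -> Dict[str, int]:
    """Compute a - b in `bits`-wide arithmetic and return only the flags."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "S": 1 if result & sign else 0,                   # sign of the result
        "Z": 1 if result == 0 else 0,                     # result is zero
        "O": 1 if (a ^ b) & (a ^ result) & sign else 0,   # signed overflow
        "C": 1 if a < b else 0,                           # unsigned borrow
    }


def jump_if_equal(flags: Dict[str, int]) -> bool:
    """Model of je: the branch is taken when the zero bit is set."""
    return flags["Z"] == 1
```

The compare discards the difference and keeps only the flags, which is why a separate conditional jump can act on it afterwards.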

Instruction Types (cont.)

Flow control and I/O instructions
- Branch
- Procedure call
- Interrupts

I/O instructions
- Memory-mapped I/O
  - Most processors support memory-mapped I/O
  - No separate instructions for I/O
- Isolated I/O
  - Pentium supports isolated I/O
  - Separate I/O instructions
      in  AX,io_port    ; read from an I/O port
      out io_port,AX    ; write to an I/O port

5 Instruction Formats

Two types

Fixed-length
- Used by RISC processors
- 32-bit RISC processors use 32-bit-wide instructions
  - Examples: SPARC, MIPS, PowerPC

Variable-length
- Used by CISC processors
- Memory operands need more bits to specify

Opcode
- Major and exact operation

Examples of Instruction Formats

RISC (Reduced Instruction Set Computer) versus CISC (Complex Instruction Set Computer)

RISC vs. CISC

The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.

RISC systems access memory only with explicit load and store instructions.

In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.

RISC vs. CISC

The difference between CISC and RISC becomes evident through the basic computer performance equation:

    time/program = (time/cycle) × (cycles/instruction) × (instructions/program)

RISC systems shorten execution time by reducing the clock cycles per instruction.

CISC systems improve performance by reducing the number of instructions per program.
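The trade-off between the two approaches can be explored numerically with the performance equation. The sketch below is only an illustration: the instruction counts, CPI values and cycle times are made-up numbers, not measurements of any real processor.

```python
def cpu_time(instructions: int, cpi: float, cycle_time_ns: float) -> float:
    """time/program = instructions/program * cycles/instruction * time/cycle."""
    return instructions * cpi * cycle_time_ns


if __name__ == "__main__":
    # Hypothetical CISC-style machine: fewer instructions, more cycles each.
    cisc_ns = cpu_time(instructions=1_000_000, cpi=4.0, cycle_time_ns=2.0)
    # Hypothetical RISC-style machine: more instructions, fewer and shorter cycles.
    risc_ns = cpu_time(instructions=1_500_000, cpi=1.5, cycle_time_ns=1.0)
    print(cisc_ns, risc_ns)
```

Under these assumed numbers the RISC-style machine wins despite executing 50% more instructions, because both the CPI and the cycle time are lower.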

RISC vs. CISC

The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.

The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.

With fixed-length instructions, RISC lends itself to pipelining and speculative execution.

RISC vs. CISC

Consider the two program fragments:

CISC:

    mov ax, 10
    mov bx, 5
    mul bx, ax

RISC:

    mov ax, 0
    mov bx, 10
    mov cx, 5
    Begin: add ax, bx
    loop Begin

The total clock cycles for the CISC version might be:

    (2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles

While the clock cycles for the RISC version are:

    (3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles

With the RISC clock cycle being shorter, RISC gives us much faster execution speeds.

RISC vs. CISC

Because of their load-store ISAs, RISC architectures require a large number of CPU registers.

These registers provide fast access to data during sequential program execution.

They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.

Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.

RISC vs. CISC

This is how registers can be overlapped in a RISC system.

The current window pointer (CWP) points to the active register window.

RISC vs. CISC

It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.

Some RISC systems provide more extravagant instruction sets than some CISC systems.

Some systems combine both approaches.

The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.

RISC vs. CISC

RISC
- Multiple register sets.
- Three operands per instruction.
- Parameter passing through register windows.
- Single-cycle instructions.
- Hardwired control.
- Highly pipelined.

CISC
- Single register set.
- One or two register operands per instruction.
- Parameter passing through memory.
- Multiple cycle instructions.
- Microprogrammed control.
- Less pipelined.

Continued...

RISC vs. CISC

RISC
- Simple instructions, few in number.
- Fixed length instructions.
- Complexity in compiler.
- Only LOAD/STORE instructions access memory.
- Few addressing modes.

CISC
- Many complex instructions.
- Variable length instructions.
- Complexity in microcode.
- Many instructions can access memory.
- Many addressing modes.

Summary

Instruction Set Design Issues

Instruction set design issues include:

Where are operands stored?
- registers, memory, stack, accumulator

How many explicit operands are there?
- 0, 1, 2, or 3

How is the operand location specified?
- register, immediate, indirect, . . .

What type and size of operands are supported?
- byte, int, float, double, string, vector. . .

What operations are supported?
- add, sub, mul, move, compare . . .

More About General Purpose Registers

Why do almost all new architectures use GPRs?

Registers are much faster than memory (even cache)
- Register values are available immediately
- When memory isn't ready, the processor must wait ("stall")

Registers are convenient for variable storage
- Compiler assigns some variables just to registers
- More compact code, since small fields specify registers (compared to memory addresses)

[Figure: Processor (Registers, Cache), Memory, Disk]

What Operations are Needed

Arithmetic + Logical
- Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
- Logical operation: AND, OR, XOR, NOT

Data Transfer - copy, load, store

Control - branch, jump, call, return

Floating Point - ADDF, MULF, DIVF (same as arithmetic but usually take bigger operands)

Decimal - ADDD, CONVERT

String - move, compare, search

Graphics - pixel and vertex, compression/decompression operations

Stack Architecture: Pros and Cons

Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures

Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures

Accumulator Architecture: Pros and Cons

Pros
- Very low hardware requirements
- Easy to design and understand

Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic

Memory-Memory Architecture: Pros and Cons

Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)

Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required

Memory-Register Architecture: Pros and Cons

Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density

Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers

Operand locations for four instruction set architecture classes

The arrows indicate whether the operand is an input or the result of the ALU operation, or both an input and result. Lighter shades indicate inputs, and the dark shade indicates the result. In (a), a Top Of Stack register (TOS) points to the top input operand, which is combined with the operand below. The first operand is removed from the stack, the result takes the place of the second operand, and TOS is updated to point to the result. All operands are implicit. In (b), the Accumulator is both an implicit input operand and a result. In (c), one input operand is a register, one is in memory, and the result goes to a register. All operands are registers in (d) and, like the stack architecture, can be transferred to memory only via separate instructions: push or pop for (a) and load or store for (d).

Code Sequence for C = A + B

Stack      Accumulator    Register-memory    Register-register
Push A     Load A         Load R1,A          Load R1,A
Push B     Add B          Add R3,R1,B        Load R2,B
Add        Store C        Store R3,C         Add R3,R1,R2
Pop C                                        Store R3,C

The code sequence for C = A + B for four classes of instruction sets. Note that the Add instruction has implicit operands for stack and accumulator architectures, and explicit operands for register architectures. It is assumed that A, B, and C all belong in memory and that the values of A and B cannot be destroyed.

Stack Architectures

Accumulator Architectures

Register-Set Architectures

Register-to-Register: Load-Store Architectures

Register-to-Memory Architectures

Memory-to-Memory Architectures

Instruction Formats

Instruction Set Architecture (ISA)

To command a computer's hardware, you must speak its language. The words of a machine's language are called instructions, and its vocabulary is called the instruction set.

Once you learn one machine language, it is easy to pick up others:
- There are few fundamental operations that all computers must provide
- All designers have the same goal of finding a language that simplifies building the hardware and the compiler while maximizing performance and minimizing cost

Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept

The MIPS instruction set is used as a case study

Interface Design

A good interface:
- Lasts through many implementations (portability, compatibility)
- Is used in many different ways (generality)
- Provides convenient functionality to higher levels
- Permits an efficient implementation at lower levels

Design decisions must take into account:
- Technology
- Machine organization
- Programming languages
- Compiler technology
- Operating systems

[Figure: one interface shared by several uses and several implementations over time]

Classifying Instruction Set Architectures

Accumulator Architecture
• Common in early stored-program computers, when hardware was so expensive
• Machine has only one register (accumulator) involved in all math and logical operations
• All operations assume the accumulator as a source operand and a destination for the operation, with the other operand stored in memory

Extended Accumulator Architecture
• Dedicated registers for specific operations, e.g., stack and array index registers, added
• The 8086 microprocessor is an example of such a special-purpose register architecture

General-Purpose Register Architecture
• MIPS is an example of such an architecture, where registers are not tied to a single role
• This type of instruction set can be further divided into:
• Register-memory: allows for one operand to be in memory
• Register-register (load-store): demands all operands to be in registers

Machine           # general-purpose registers    Architecture style                 Year
Motorola 6800     2                              Accumulator                        1974
DEC VAX           16                             Register-memory, memory-memory     1977
Intel 8086        1                              Extended accumulator               1978
Motorola 68000    16                             Register-memory                    1980
Intel 80386       8                              Register-memory                    1985
PowerPC           32                             Load-store                         1992
DEC Alpha         32                             Load-store                         1992

Compact Code and Stack Architectures

When memory is scarce, machines like the Intel 80x86 had variable-length instructions to match varying operand specifications and minimize code size

Stack machines abandoned registers altogether, arguing that it is hard for compilers to use them efficiently

Operands are to be pushed on a stack from memory, and the results have to be popped from the stack to memory

Operations take their operands by default from the top of the stack and insert the results back onto the stack

Stack machines simplify compilers and lend themselves to a compact instruction encoding, but limit compiler optimization (e.g., in math expressions)

Example: A = B + C

    Push AddressC    # Top=Top+4; Stack[Top]=Memory[AddressC]
    Push AddressB    # Top=Top+4; Stack[Top]=Memory[AddressB]
    add              # Stack[Top-4]=Stack[Top]+Stack[Top-4]; Top=Top-4
    Pop AddressA     # Memory[AddressA]=Stack[Top]; Top=Top-4

Compact code is important for heralded network computers, where programs must be downloaded over the Internet (e.g., Java-based applications)

Other Types of Architecture

High-Level-Language Architecture

• In the 1960s, systems software was rarely written in high-level languages, and virtually every commercial operating system before Unix was written in assembly

• Some people blamed the code density on the instruction set rather than the programming language

• A machine design philosophy was advocated with the goal of making the hardware more like high-level languages

• The effectiveness of high-level languages, memory size limitations and the lack of efficient compilers doomed this philosophy to a historical footnote

Reduced Instruction Set Architecture

• With the recent development in compiler technology and expanded memory sizes, fewer programmers are using assembly-level coding

• Instruction set architecture became measurable in the way compilers, rather than programmers, use them

• RISC architecture favors simplifying hardware design over enriching the offered set of instructions, relying on compilers to effectively use them to perform complex operations

• Virtually all new architectures since 1982 follow the RISC philosophy of fixed instruction lengths, load-store operations, and limited addressing modes

Evolution of Instruction Sets

[Figure: evolution of instruction sets]

Single Accumulator (EDSAC 1950)

Accumulator + Index Registers (Manchester Mark I, IBM 700 series 1953)

Separation of Programming Model from Implementation
- High-level Language Based (B5000 1963)
- Concept of a Family (IBM 360 1964)

General Purpose Register Machines
- Complex Instruction Sets (Vax, Intel 432 1977-80)
- Load/Store Architecture (CDC 6600, Cray 1 1963-76)

RISC (Mips, Sparc, HP-PA, IBM RS6000, . . . 1987)

Register-Memory Architectures

Effect of the number of memory operands

No. of memory addresses   Max. number of operands   Examples
0                         3                         SPARC, MIPS, PowerPC, ALPHA
1                         2                         Intel 80x86, Motorola 68000
2                         2                         VAX (also has 3-operand format)
3                         3                         VAX (also has 2-operand format)

Memory Addressing

Interpreting Memory Addressing

The address of a word matches the byte address of one of its 4 bytes

The addresses of sequential words differ by 4 (word size in bytes)

Word addresses are multiples of 4 (alignment restriction)

Machines that use the address of the leftmost byte as the word address are called "Big Endian", and those that use the rightmost byte are called "Little Endian"

Misalignment complicates memory access and causes programs to run slower (some machines do not allow misaligned memory access at all)

Byte ordering can be a problem when exchanging data among different machines

Byte addresses affect array index calculation, which must account for word addressing and the offset within the word

Object addressed   Aligned at byte offsets    Misaligned at byte offsets
Byte               0, 1, 2, 3, 4, 5, 6, 7     Never
Half word          0, 2, 4, 6                 1, 3, 5, 7
Word               0, 4                       1, 2, 3, 5, 6, 7
Double word        0                          1, 2, 3, 4, 5, 6, 7
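Byte ordering and the alignment rule can be checked with a few lines of code. The sketch below uses Python's standard `struct` module; the helper names `word_bytes` and `is_aligned` and the sample value are assumptions chosen for illustration.

```python
import struct

WORD_SIZE = 4  # bytes per word, as assumed on the slide


def word_bytes(value, byte_order):
    """Return the 4 bytes of a 32-bit word in increasing address order."""
    fmt = ">I" if byte_order == "big" else "<I"
    return struct.pack(fmt, value)


def is_aligned(address, size):
    """An object of `size` bytes is aligned when its address is a multiple of size."""
    return address % size == 0


value = 0x0A0B0C0D
print(word_bytes(value, "big").hex())     # most significant byte at the lowest address
print(word_bytes(value, "little").hex())  # least significant byte at the lowest address
print(is_aligned(8, WORD_SIZE), is_aligned(6, WORD_SIZE))
```

The two machines store the same value but disagree on which byte sits at the word's address, which is why data exchanged between them needs an agreed byte order.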

Addressing Modes

Addressing modes refer to how to specify the location of an operand (effective address)

Addressing modes have the ability to:
    Significantly reduce instruction counts
    Increase the average CPI
    Increase the complexity of building a machine

The VAX machine is used for benchmark data since it supports a wide range of memory addressing modes

Famous addressing modes can be classified based on:
    the source of the data, into register, immediate, or memory
    the address calculation, into direct and indirect

An indexed addressing mode is usually provided to allow efficient implementation of loops and array access

Example of Addressing Modes

Each entry lists: addressing mode, example, meaning, when used.

Register
    Example: ADD R4, R3
    Meaning: Regs[R4] ← Regs[R4] + Regs[R3]
    When used: When a value is in a register

Immediate
    Example: ADD R4, #3
    Meaning: Regs[R4] ← Regs[R4] + 3
    When used: For constants

Displacement
    Example: ADD R4, 100(R1)
    Meaning: Regs[R4] ← Regs[R4] + Mem[100 + Regs[R1]]
    When used: Accessing local variables

Register indirect
    Example: ADD R4, (R1)
    Meaning: Regs[R4] ← Regs[R4] + Mem[Regs[R1]]
    When used: Accessing using a pointer or a computed address

Indexed
    Example: ADD R4, (R1 + R2)
    Meaning: Regs[R4] ← Regs[R4] + Mem[Regs[R1] + Regs[R2]]
    When used: Sometimes useful in array addressing: R1 = base of the array; R2 = index amount

Direct or absolute
    Example: ADD R4, (1001)
    Meaning: Regs[R4] ← Regs[R4] + Mem[1001]
    When used: Sometimes useful for accessing static data; address constant may need to be large

Memory indirect or memory deferred
    Example: ADD R4, @(R3)
    Meaning: Regs[R4] ← Regs[R4] + Mem[Mem[Regs[R3]]]
    When used: If R3 is the address of the pointer p, then the mode yields *p

Autoincrement
    Example: ADD R4, (R2)+
    Meaning: Regs[R4] ← Regs[R4] + Mem[Regs[R2]]; Regs[R2] ← Regs[R2] + d
    When used: Useful for stepping through arrays within a loop. R2 points to the start of the array; each reference increments R2 by d

Autodecrement
    Example: ADD R4, -(R2)
    Meaning: Regs[R2] ← Regs[R2] - d; Regs[R4] ← Regs[R4] + Mem[Regs[R2]]
    When used: Same use as autoincrement. Autodecrement/increment can also act as push/pop to implement a stack

Scaled
    Example: ADD R4, 100(R2)[R3]
    Meaning: Regs[R4] ← Regs[R4] + Mem[100 + Regs[R2] + Regs[R3] × d]
    When used: Used to index arrays
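The "Meaning" column of the addressing-mode table reduces to a small effective-address calculation per mode. The sketch below covers the memory modes only; the function name `effective_address`, its keyword parameters, and the register and memory contents are hypothetical and chosen for illustration.

```python
def effective_address(mode, regs, mem, *, disp=0, base=None, index=None, d=4):
    """Return the memory address an operand refers to under each mode."""
    if mode == "displacement":        # Mem[disp + Regs[base]]
        return disp + regs[base]
    if mode == "register_indirect":   # Mem[Regs[base]]
        return regs[base]
    if mode == "indexed":             # Mem[Regs[base] + Regs[index]]
        return regs[base] + regs[index]
    if mode == "direct":              # Mem[disp]
        return disp
    if mode == "memory_indirect":     # Mem[Mem[Regs[base]]]: one extra memory read
        return mem[regs[base]]
    if mode == "scaled":              # Mem[disp + Regs[base] + Regs[index] * d]
        return disp + regs[base] + regs[index] * d
    raise ValueError(f"unknown mode: {mode}")


# Hypothetical machine state for the example.
regs = {"R1": 1000, "R2": 8, "R3": 2000}
mem = {2000: 3000}

print(effective_address("displacement", regs, mem, disp=100, base="R1"))
print(effective_address("scaled", regs, mem, disp=100, base="R1", index="R2", d=4))
```

Memory indirect is the only mode here that reads memory just to form the address, which is one reason richer modes can raise the average CPI.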

Addressing Mode for Signal Processing

DSP offers special addressing modes to better serve popular algorithms

Modulo addressing
    Since DSP deals with continuous data streams, circular buffers are widely used
    Circular or modulo addressing allows automatic increment and decrement, and resets the pointer when reaching the end of the buffer

Reverse addressing
    The resulting address is the reverse order of the current address
    Reverse addressing mode expedites the access, which otherwise requires a number of logical instructions or extra memory accesses

Fast Fourier Transform (index and its bit-reversed counterpart)
    0 (000) → 0 (000)
    1 (001) → 4 (100)
    2 (010) → 2 (010)
    3 (011) → 6 (110)
    4 (100) → 1 (001)
    5 (101) → 5 (101)
    6 (110) → 3 (011)
    7 (111) → 7 (111)

Special features require either hand coding or a compiler that uses such features (GNU would not be a good choice)
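Both DSP addressing modes are simple to express in software, which shows what the hardware does on every access. This is a sketch; the function names and the buffer bounds are assumptions made for the example.

```python
def circular_next(pointer, step, start, length):
    """Advance a pointer inside the circular buffer [start, start + length)."""
    return start + (pointer - start + step) % length


def bit_reverse(index, bits):
    """Reverse the low `bits` bits of index, as used for FFT reordering."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)  # shift in the lowest remaining bit
        index >>= 1
    return result


# Wrap-around at both ends of a hypothetical 8-entry buffer starting at address 100.
print(circular_next(107, 1, 100, 8))
print(circular_next(100, -1, 100, 8))

# Bit-reversed order for an 8-point FFT.
print([bit_reverse(i, 3) for i in range(8)])
```

Without hardware support, each of these costs several instructions per access (a modulo, or a shift-and-mask loop), which is the overhead the special modes remove.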

Operations of the Computer Hardware

"There must certainly be instructions for performing the fundamental arithmetic operations."
Burks, Goldstine, and von Neumann, 1947

Assembly language is a symbolic representation of what the processor actually understands

The MIPS assembler allows only one instruction per line and ignores comments following # until the end of the line

Example: Translation of a segment of a C program to MIPS assembly instructions

C:
f = (g + h) - (i + j);

MIPS:
add t0, g, h     # temp variable t0 contains "g + h"
add t1, i, j     # temp variable t1 contains "i + j"
sub f, t0, t1    # f = t0 - t1 = (g + h) - (i + j)

Operations in the Instruction Set

Operator type             Examples
Arithmetic and logical    Integer arithmetic and logical operations: add, and, subtract, or
Data transfer             Loads-stores (move instructions on machines with memory addressing)
Control                   Branch, jump, procedure call and return, trap
System                    Operating system call, virtual memory management instructions
Floating point            Floating point instructions: add, multiply
Decimal                   Decimal add, decimal multiply, decimal-to-character conversion
String                    String move, string compare, string search
Graphics                  Pixel operations, compression/decompression operations

Arithmetic, logical, data transfer and control are almost standard categories for all machines

System instructions are required for multi-programming environments, although support for system functions varies

Decimal and string instructions can be primitives, e.g., on the IBM 360 and the VAX

Support for floating point, decimal, string and graphics can be optional, sometimes provided via a co-processor

Some machines rely on the compiler to synthesize special operations, such as string handling, from simpler instructions

Operations for Media and Signal Processing

Single instruction multiple data (SIMD) and vector instructions are often supported in DSPs, which are commonly used in multimedia and signal processing applications

Partitioned Add (integer)
    Performs multiple 16-bit additions on a 64-bit ALU, since most data are narrow
    Increases ALU throughput for multimedia applications

Paired single operations (float)
    Allow the same register to act as two operands to the same operation
    Handy in dealing with vertices and coordinates

Multiply and accumulate
    Very handy for calculating dot products of vectors (signal processing) and matrix multiplication
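A partitioned add treats one wide register as several narrow lanes and keeps carries from crossing lane boundaries. The sketch below assumes four 16-bit lanes in a 64-bit word and wrap-around (non-saturating) arithmetic; the names `partitioned_add` and `dot_product` are illustrative, not taken from any real instruction set.

```python
LANE_BITS = 16
LANES = 4
LANE_MASK = (1 << LANE_BITS) - 1


def partitioned_add(x, y):
    """Add four independent 16-bit lanes packed into two 64-bit words."""
    result = 0
    for lane in range(LANES):
        shift = lane * LANE_BITS
        a = (x >> shift) & LANE_MASK
        b = (y >> shift) & LANE_MASK
        # Wrap within the lane: a carry never spills into the neighbouring lane.
        result |= ((a + b) & LANE_MASK) << shift
    return result


def dot_product(xs, ys):
    """Dot product written as a chain of multiply-accumulate steps."""
    acc = 0
    for x, y in zip(xs, ys):
        acc += x * y  # one multiply-accumulate per element pair
    return acc


print(hex(partitioned_add(0x0001_FFFF_0003_0004, 0x0001_0001_0001_0001)))
print(dot_product([1, 2, 3], [4, 5, 6]))
```

In hardware the four lane additions happen in one ALU pass rather than in a loop; the loop here only makes the lane boundaries explicit.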

Frequency of Operations Usage

Make the common case fast by focusing on these operations

The most widely executed instructions are the simple operations of an instruction set

The following is the average usage in SPECint92 on the Intel 80x86

Rank   80x86 instruction         Integer average (% total executed)
1      Load                      22%
2      Conditional branch        20%
3      Compare                   16%
4      Store                     12%
5      Add                       8%
6      And                       6%
7      Sub                       5%
8      Move register-register    4%
9      Call                      1%
10     Return                    1%
       Total                     96%

Control Flow Instructions

Jump: for unconditional change in the control flow

Branch: for conditional change in the control flow

Procedure calls and returns

Data is based on SPEC on Alpha

Destination Address Definition

Relative addressing with respect to the program counter proved to be the best choice for forward and backward branching or jumps (load address independent)

To allow for dynamic loading of library routines, register indirect addressing allows addresses to be loaded in special registers (e.g., virtual functions in C++ and system calls in a case statement)

Data is based on SPEC on Alpha

Condition Evaluation

Compare-and-branch can be efficient if the majority of conditions are comparisons with zero

Remember to focus on the common case

Based on SPEC on MIPS

Frequency of Types of Comparison

Data is based on SPEC on Alpha

Different benchmarks and machines set new design priorities

DSPs support a repeat instruction for for-loops (vectors) using registers

Supporting Procedures

Execution of a procedure follows these steps:
    Store parameters in a place accessible to the procedure
    Transfer control to the procedure
    Acquire the storage resources needed for the procedure
    Perform the desired task
    Store the result value in a place accessible to the calling program
    Return control to the point of origin

The hardware provides a program counter to trace instruction flow and manage transfer of control

Parameter Passing
    Registers can be used for passing a small number of parameters
    A stack is used to spill registers of the current context and make room for the called procedure to run, and to allow large parameters to be passed

Storage of machine state can be performed by the caller or the callee

Handling of shared variables is important to ensure correct semantics and thus requires clear specifications in the library interface

Global variables stored in registers need careful handling

Type and Size of Operands

The type of an operand is designated by encoding it in the instruction's operation code

The type of an operand, e.g., single-precision float, effectively gives its size

Common operand types include character, half-word and word-size integer, and single- and double-precision floating point

Characters are almost always in ASCII, integers are in 2's complement, and floating point is in IEEE 754

The 16-bit Unicode used in Java is gaining popularity due to its support for international character sets

For business applications, some architectures support a decimal format in binary coded decimal (BCD)

Depending on the size of the word, the complexity of handling different operand types differs

DSP offers fixed-point data types to support high-precision floating-point arithmetic and to allow sharing a single exponent for multiple numbers

For graphics applications, vertex and pixel operands are added features

Size of Operands

Frequency of reference by size, based on SPEC2000 on Alpha

The double-word data type is used for double-precision floating-point operations and for address storage in machines with a 64-bit wide address bus

Words are used for integer operations and for 32-bit address bus machines

Because of the mix in SPEC, word and double-word data types dominate

Instruction Representation

Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary, or base 2)

Numbers are stored in computers as a series of high and low electronic signals (binary numbers)

Binary digits are called bits and are considered the atoms of computing

Each piece of an instruction is a number, and placing these numbers together forms the instruction

The assembler translates symbolic assembly instructions into machine language instructions (machine code)

Example:

Assembly: add $t0, $s1, $s2

M/C language (decimal):   0        17       18       8        0        32
M/C language (binary):    000000   10001    10010    01000    00000    100000
Field width:              6 bits   5 bits   5 bits   5 bits   5 bits   6 bits

Note: the MIPS compiler by default maps $s0,...,$s7 to reg. 16-23 and $t0,...,$t7 to reg. 8-15
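Placing the field numbers together is a matter of shifting each field into position. The sketch below packs the six R-format fields for add $t0, $s1, $s2 using the register numbering from the note (registers 16-23 for $s0-$s7, 8-15 for $t0-$t7); the names `REGISTERS` and `encode_r_type` are made up for the example.

```python
# Register numbers following the default mapping quoted in the note.
REGISTERS = {f"$t{i}": 8 + i for i in range(8)}
REGISTERS.update({f"$s{i}": 16 + i for i in range(8)})


def encode_r_type(op, rs, rt, rd, shamt, funct):
    """Pack the six R-format fields (6, 5, 5, 5, 5, 6 bits) into a 32-bit word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


# add $t0, $s1, $s2: op = 0 and funct = 32 select "add"; rd is the destination.
word = encode_r_type(
    0, REGISTERS["$s1"], REGISTERS["$s2"], REGISTERS["$t0"], 0, 32
)
print(f"{word:032b}")
print(hex(word))
```

Note that the destination register comes first in the assembly text but sits in the fourth field (rd) of the machine word.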

Encoding an Instruction Set

Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation

The operation is typically specified in one field called the opcode

The addressing mode for the operand can be encoded with the operation, or specified through a separate identifier when a large number of modes is supported

The architecture must balance several competing factors:
    Desire to support as many registers and addressing modes as possible
    Effect of operand specification on the size of the instruction (program)
    Desire to simplify instruction fetching and decoding during execution

Fixed-size instruction encoding simplifies the CPU design while limiting the addressing modes supported

An architect who cares about code size can use variable-size encoding

A hybrid approach is to allow variability by supporting multiple-sized instructions

Encoding Examples

MIPS Instruction Format

Register-format instructions

    op       rs       rt       rd       shamt    funct
    6 bits   5 bits   5 bits   5 bits   5 bits   6 bits

    op: Basic operation of the instruction, traditionally called the opcode
    rs: The first register source operand
    rt: The second register source operand
    rd: The register destination operand; it gets the result of the operation
    shamt: Shift amount
    funct: This field selects the specific variant of the operation in the op field

Immediate-type instructions

    op       rs       rt       address
    6 bits   5 bits   5 bits   16 bits

    Some instructions need longer fields than provided, e.g., for large-value constants
    The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register
    Example: lw $t0, 32($s3)    # Temporary register $t0 gets A[8]

Instruction   Format   op   rs    rt    rd    shamt   funct   address
add           R        0    reg   reg   reg   0       32      n/a
sub           R        0    reg   reg   reg   0       34      n/a
lw            I        35   reg   reg   n/a   n/a     n/a     address
sw            I        43   reg   reg   n/a   n/a     n/a     address
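The immediate format and its 16-bit address limit can be sketched the same way as the register format. The example assumes the load-word opcode 35 from the table, a byte offset of 32 to reach A[8] with 4-byte words, and $s3 (register 19) as the base register; `encode_i_type` is a hypothetical helper name.

```python
def encode_i_type(op, rs, rt, offset):
    """Pack op (6 bits), rs (5), rt (5) and a signed 16-bit offset into 32 bits."""
    if not -(1 << 15) <= offset < (1 << 15):
        raise ValueError("offset does not fit in 16 signed bits")
    # Masking keeps the two's-complement bit pattern of a negative offset.
    return (op << 26) | (rs << 21) | (rt << 16) | (offset & 0xFFFF)


# lw $t0, 32($s3): op = 35, base register $s3 = 19, target register $t0 = 8.
word = encode_i_type(35, 19, 8, 32)
print(hex(word))
```

An offset outside the signed 16-bit range cannot be encoded in one instruction, which is the "longer fields than provided" problem mentioned for large constants.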

The Stored Program Concept

Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept

Today's computers are built on two key principles:
    Instructions are represented as numbers
    Programs can be stored in memory to be read or written just like numbers

The power of the concept: memory can contain
    the source code for an editor
    the compiled m/c code for the editor
    the text that the compiled program is using
    the compiler that generated the code

[Figure: a processor connected to a memory that holds an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and source code in C for the editor program]

Compiling if-then-else in MIPS

Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiled MIPS code for the following C if statement?

if (i == j) f = g + h; else f = g - h;

[Flowchart: test i == j; when i = j execute f = g + h; when i ≠ j go to Else: and execute f = g - h; both paths join at Exit:]

MIPS:

      bne $s3, $s4, Else    # go to Else if i ≠ j
      add $s0, $s1, $s2     # f = g + h (skipped if i ≠ j)
      j Exit                # go to Exit
Else: sub $s0, $s1, $s2     # f = g - h (skipped if i = j)
Exit:

Typical Compilation

Major Types of Optimization

Each entry lists: optimization name, explanation, frequency (N.M. = not measured).

High-level (at or near source level; machine-independent)
    Procedure integration: Replace procedure call by procedure body. N.M.

Local (within straight-line code)
    Common subexpression elimination: Replace two instances of the same computation by a single copy. 18%
    Constant propagation: Replace all instances of a variable that is assigned a constant with the constant. 22%
    Stack height reduction: Rearrange expression tree to minimize resources needed for expression evaluation. N.M.

Global (across a branch)
    Global common subexpression elimination: Same as local, but this version crosses branches. 13%
    Copy propagation: Replace all instances of a variable A that has been assigned X (i.e., A = X) with X. 11%
    Code motion: Remove code from a loop that computes the same value each iteration of the loop. 16%
    Induction variable elimination: Simplify/eliminate array addressing calculations within loops. 2%

Machine-dependent (depends on machine knowledge)
    Strength reduction: Many examples, such as replacing multiply by a constant with adds and shifts. N.M.
    Pipeline scheduling: Reorder instructions to improve pipeline performance. N.M.

Effect of Compiler Optimization

Measurements taken on MIPS

[Chart axis: program and compiler optimization level]

Level 0: non-optimized code
Level 1: local optimization
Level 2: global optimization, s/w pipelining
Level 3: adds procedure integration

Compiler Support for Multimedia Instr.

Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics)

Intel added a new set of instructions called Streaming SIMD Extension

A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer

Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations
    Strided addressing allows memory access in increments larger than one
    Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data

Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup

Such limited support for vector processing makes the use of vectorizing compiler optimization unpopular and restricts its scope to hand-coded routines

SIMD instructions on MMX and AltiVec tend to be solutions, not primitives

Starting a Program

[Figure: C program → Compiler → Assembly language program → Assembler → Object: machine language module; this module and Object: library routine (machine language) → Linker → Executable: machine language program → Loader → Memory]

Linker
    - Place code & data modules symbolically in memory
    - Determine the address of data & instruction labels
    - Patch both internal & external references

Object files for Unix typically contain:
    Header: size & position of components
    Text segment: machine code
    Data segment: static and dynamic variables
    Relocation info: identify absolute memory references
    Symbol table: name & location of labels, procedures and variables
    Debugging info: mapping source to object code, break points, etc.

Loading Executable Program

To load an executable, the operating system follows these steps:
    Reads the executable file header to determine the size of the text and data segments
    Creates an address space large enough for the text and data
    Copies the instructions and data from the executable file into memory
    Copies the parameters (if any) to the main program onto the stack
    Initializes the machine registers and sets the stack pointer to the first free location
    Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program

[Figure: memory layout, from high to low addresses]
    $sp   7fff ffff hex    Stack
                           Dynamic data
    $gp   1000 8000 hex    Static data
          1000 0000 hex
    pc    0040 0000 hex    Text
          0                Reserved

Instruction Set Design Issues

Number of Addresses

Flow of Control

Operand Types

Addressing Modes

Instruction Types

Instruction Formats

Number of Addresses

Four categories
    3-address machines
        - 2 for the source operands and one for the result
    2-address machines
        - One address doubles as source and result
    1-address machines
        - Accumulator machines
        - Accumulator is used for one source and result
    0-address machines
        - Stack machines
        - Operands are taken from the stack
        - Result goes onto the stack

Number of Addresses (cont.)

Three-address machines
    Two for the source operands, one for the result
    RISC processors use three addresses
    Sample instructions
        add  dest,src1,src2    ; M(dest)=[src1]+[src2]
        sub  dest,src1,src2    ; M(dest)=[src1]-[src2]
        mult dest,src1,src2    ; M(dest)=[src1]*[src2]

Three addresses: Operand 1, Operand 2, Result
    Example: a = b + c
    Three-address instruction formats are not common because they require a relatively long instruction format to hold the three address references

Number of Addresses (cont.)

Example
    C statement
        A = B + C * D - E + F + A
    Equivalent code:
        mult T,C,D    ; T = C*D
        add  T,T,B    ; T = B+C*D
        sub  T,T,E    ; T = B+C*D-E
        add  T,T,F    ; T = B+C*D-E+F
        add  A,T,A    ; A = B+C*D-E+F+A

Number of Addresses (cont.)

Two-address machines
    One address doubles (for source operand & result)
    Last example makes a case for it
        - Address T is used twice
    Sample instructions
        load dest,src    ; M(dest)=[src]
        add  dest,src    ; M(dest)=[dest]+[src]
        sub  dest,src    ; M(dest)=[dest]-[src]
        mult dest,src    ; M(dest)=[dest]*[src]

Two addresses: one address doubles as operand and result
    Example: a = a + b
    The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation

Number of Addresses (cont.)

Example
    C statement
        A = B + C * D - E + F + A
    Equivalent code:
        load T,C    ; T = C
        mult T,D    ; T = C*D
        add  T,B    ; T = B+C*D
        sub  T,E    ; T = B+C*D-E
        add  T,F    ; T = B+C*D-E+F
        add  A,T    ; A = B+C*D-E+F+A

Number of Addresses (cont.)

One-address machines
    Use a special set of registers called accumulators
        - Specify one source operand & receive the result
    Called accumulator machines
    Sample instructions
        load  addr    ; accum = [addr]
        store addr    ; M[addr] = accum
        add   addr    ; accum = accum + [addr]
        sub   addr    ; accum = accum - [addr]
        mult  addr    ; accum = accum * [addr]

Number of Addresses (cont.)

Example
    C statement
        A = B + C * D - E + F + A
    Equivalent code:
        load  C    ; load C into accum
        mult  D    ; accum = C*D
        add   B    ; accum = C*D+B
        sub   E    ; accum = B+C*D-E
        add   F    ; accum = B+C*D-E+F
        add   A    ; accum = B+C*D-E+F+A
        store A    ; store accum contents in A
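The one-address program can be checked by simulating the accumulator directly. This is a minimal sketch; the function name `run_accumulator` and the values assigned to A through F are arbitrary assumptions for the example.

```python
def run_accumulator(program, memory):
    """Execute (opcode, address) pairs on a single-accumulator machine."""
    accum = 0
    for opcode, addr in program:
        if opcode == "load":
            accum = memory[addr]
        elif opcode == "store":
            memory[addr] = accum
        elif opcode == "add":
            accum += memory[addr]
        elif opcode == "sub":
            accum -= memory[addr]
        elif opcode == "mult":
            accum *= memory[addr]
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return memory


# Hypothetical operand values.
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
program = [
    ("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
    ("add", "F"), ("add", "A"), ("store", "A"),
]
run_accumulator(program, memory)
print(memory["A"])  # B + C*D - E + F + A with the values above
```

Every instruction names exactly one memory address; the accumulator is the implicit second source and the implicit destination, so the program needs seven short instructions instead of five long three-address ones.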

Number of Addresses (cont.)

Zero-address machines
    Stack supplies operands and receives the result
        - Special instructions to load and store use an address
    Called stack machines (Ex: HP3000, Burroughs B5500)
    Sample instructions
        push addr    ; push([addr])
        pop  addr    ; pop([addr])
        add          ; push(pop + pop)
        sub          ; push(pop - pop)
        mult         ; push(pop * pop)

Number of Addresses (cont.)

Example
    C statement
        A = B + C * D - E + F + A
    Equivalent code:
        push E
        push C
        push D
        mult
        push B
        add
        sub
        push F
        add
        push A
        add
        pop  A

Load/Store Architecture

Instructions expect operands in internal processor registers
    Special LOAD and STORE instructions move data between registers and memory
    RISC uses this architecture
    Reduces instruction length

Load/Store Architecture (cont.)

Sample instructions
    load  Rd,addr       ; Rd = [addr]
    store addr,Rs       ; (addr) = Rs
    add   Rd,Rs1,Rs2    ; Rd = Rs1 + Rs2
    sub   Rd,Rs1,Rs2    ; Rd = Rs1 - Rs2
    mult  Rd,Rs1,Rs2    ; Rd = Rs1 * Rs2

Number of Addresses (cont.)

Example
    C statement
        A = B + C * D - E + F + A
    Equivalent code:
        load  R1,B
        load  R2,C
        load  R3,D
        load  R4,E
        load  R5,F
        load  R6,A
        mult  R2,R2,R3
        add   R2,R2,R1
        sub   R2,R2,R4
        add   R2,R2,R5
        add   R2,R2,R6
        store A,R2

Flow of Control

Default is sequential flow

Several instructions alter this default execution
    Branches
        - Unconditional
        - Conditional
        - Delayed branches
    Procedure calls
        - Delayed procedure calls

Flow of Control (cont.)

Branches
    Unconditional
        - Absolute address
        - PC-relative
            * Target address is specified relative to PC contents
            * Relocatable code
    Example: MIPS
        - Absolute address
            j target
        - PC-relative
            b target
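The difference between the two target forms shows up when the same code is loaded at a different address. The sketch below assumes the MIPS32 conventions, where a branch offset is counted in words from the instruction after the branch (PC + 4) and a j instruction supplies a 26-bit word index joined with the upper 4 bits of PC + 4; the function names are illustrative.

```python
def pc_relative_target(pc, offset_words):
    """Branch target: offset counted in 4-byte words from PC + 4 (assumed convention)."""
    return (pc + 4 + (offset_words << 2)) & 0xFFFFFFFF


def absolute_jump_target(pc, index_26):
    """Jump target: upper 4 bits of PC + 4 joined with the 26-bit index times 4."""
    return ((pc + 4) & 0xF0000000) | ((index_26 & 0x03FFFFFF) << 2)


# The same branch encoding reaches the same relative distance at any load address.
print(hex(pc_relative_target(0x00400000, 3)))
print(hex(pc_relative_target(0x10000000, 3)))

# The jump encoding names a fixed location instead.
print(hex(absolute_jump_target(0x00400000, 0x00100008)))
```

Because the branch encodes only a distance, moving the code does not require patching it, which is what "relocatable code" refers to; the jump index would have to be rewritten by the linker or loader.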

Flow of Control (cont.)

[Figure; the only recoverable label is "Pentium"]

Flow of Control (cont.)

Branches
    Conditional
        - Jump is taken only if the condition is met
    Two types
        - Set-Then-Jump
            * Condition testing is separated from branching
            * Condition code registers are used to convey the condition test result
            * Condition code registers keep a record of the status of the last ALU operation, such as the overflow condition
        - Example: Pentium code
            cmp AX,BX     ; compare AX and BX
            je  target    ; jump if equal
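Set-then-jump splits the work in two: the compare only records flags, and the jump only reads them. The sketch below models a 16-bit compare in the style of the Pentium example; the flag names follow the usual x86 abbreviations (ZF, SF, CF, OF), and the functions `cmp16` and `je` are simplified stand-ins, not an emulation of the real instructions.

```python
def cmp16(a, b):
    """Model a 16-bit compare: compute a - b, discard it, keep only the flags."""
    a &= 0xFFFF
    b &= 0xFFFF
    result = (a - b) & 0xFFFF
    return {
        "ZF": result == 0,                            # operands were equal
        "SF": bool(result & 0x8000),                  # sign bit of the difference
        "CF": a < b,                                  # unsigned borrow
        "OF": bool((a ^ b) & (a ^ result) & 0x8000),  # signed overflow
    }


def je(flags):
    """Jump-if-equal consults only the zero flag left by the last compare."""
    return flags["ZF"]


print(je(cmp16(5, 5)))
print(je(cmp16(3, 5)))
print(cmp16(0x8000, 1)["OF"])  # most negative value minus 1 overflows
```

A test-and-jump instruction folds both steps into one and needs no flag register, at the cost of a more complex branch instruction.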

Flow of Control (cont.)

        - Test-and-Jump
            * Single instruction performs condition testing and branching
        - Example: MIPS instruction
            beq Rsrc1,Rsrc2,target
            Jumps to target if Rsrc1 = Rsrc2

Delayed branching
    Control is transferred after executing the instruction that follows the branch instruction
        - This instruction slot is called the delay slot
    Improves efficiency
    Highly pipelined RISC processors support it

Flow of Control (cont.)

Procedure calls
    Facilitate modular programming
    Require two pieces of information to return
        - End of procedure
            * Pentium uses the ret instruction
            * MIPS uses the jr instruction
        - Return address
            * In a (special) register: MIPS allows any general-purpose register
            * On the stack: Pentium

Flow of Control (cont.)

[Figures; the only recoverable label is "Delay slot"]

Parameter Passing

Two basic techniques
    Register-based (e.g., PowerPC, MIPS)
        - Internal registers are used
            * Faster
            * Limit the number of parameters
            * Recursive procedure
    Stack-based (e.g., Pentium)
        - Stack is used
            * More general

2 Operand Types

Instructions support basic data types
- Characters
- Integers
- Floating-point

Instruction overload
Same instruction for different data types
Example: Pentium
    mov AL,address    ; loads an 8-bit value
    mov AX,address    ; loads a 16-bit value
    mov EAX,address   ; loads a 32-bit value

Operand Types

Separate instructions
Instructions specify the operand size
Example: MIPS
    lb Rdest,address   ; loads a byte
    lh Rdest,address   ; loads a halfword (16 bits)
    lw Rdest,address   ; loads a word (32 bits)
    ld Rdest,address   ; loads a doubleword (64 bits)
Similar instructions exist for store

3 Addressing Modes

How the operands are specified
Operands can be in three places
- Registers
  - Register addressing mode
- Part of instruction
  - Constant
  - Immediate addressing mode
  - All processors support these two addressing modes
- Memory
  - Difference between RISC and CISC
  - CISC supports a large variety of addressing modes
  - RISC follows load/store architecture

4 Instruction Types

Several types of instructions
Data movement
- Pentium: mov dest,src
- Some do not provide direct data movement instructions
- Indirect data movement
    add Rdest,Rsrc,0   ; Rdest = Rsrc+0
Arithmetic and Logical
- Arithmetic
  - Integer and floating-point, signed and unsigned
  - add, subtract, multiply, divide
- Logical
  - and, or, not, xor

Instruction Types (cont.)

Condition code bits
S: Sign bit (0 = +, 1 = -)
Z: Zero bit (0 = nonzero, 1 = zero)
O: Overflow bit (0 = no overflow, 1 = overflow)
C: Carry bit (0 = no carry, 1 = carry)
Example: Pentium
    cmp count,25   ; compare count to 25
                   ; subtract 25 from count
    je  target     ; jump if equal
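
The way a compare instruction feeds a later conditional jump can be sketched in a few lines of Python. This is only an illustrative model, not Pentium's actual flag logic: the function names `compare` and `je`, the 16-bit operand width, and the dictionary of flags are all assumptions made for the sketch.

```python
def compare(a, b, bits=16):
    """Model a cmp instruction: compute a - b, keep only the condition codes."""
    mask = (1 << bits) - 1          # e.g. 0xFFFF for 16-bit operands
    sign = 1 << (bits - 1)          # position of the sign bit
    a &= mask
    b &= mask
    diff = (a - b) & mask           # the result itself is discarded by cmp
    return {
        "S": int(bool(diff & sign)),                   # sign bit of the result
        "Z": int(diff == 0),                           # result is zero
        "C": int(a < b),                               # unsigned borrow
        "O": int(bool((a ^ b) & (a ^ diff) & sign)),   # signed overflow
    }


def je(flags):
    """Model 'jump if equal': the jump is taken when the zero bit is set."""
    return flags["Z"] == 1


count = 25
flags = compare(count, 25)
print(flags, "jump taken" if je(flags) else "fall through")
```

Separating `compare` from `je` mirrors the set-then-jump style: the test and the branch are two instructions that communicate only through the condition code bits.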

Instruction Types (cont.)

Flow control and I/O instructions
- Branch
- Procedure call
- Interrupts
I/O instructions
- Memory-mapped I/O
  - Most processors support memory-mapped I/O
  - No separate instructions for I/O
- Isolated I/O
  - Pentium supports isolated I/O
  - Separate I/O instructions
      in  AX,io_port   ; read from an I/O port
      out io_port,AX   ; write to an I/O port

5 Instruction Formats

Two types
Fixed-length
- Used by RISC processors
- 32-bit RISC processors use 32-bit-wide instructions
  - Examples: SPARC, MIPS, PowerPC
Variable-length
- Used by CISC processors
- Memory operands need more bits to specify
Opcode
- Major and exact operation

Examples of Instruction Formats

RISC (Reduced Instruction Set Computer)
versus
CISC (Complex Instruction Set Computer)

RISC vs. CISC

The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute

RISC systems access memory only with explicit load and store instructions

In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable

RISC vs. CISC

The difference between CISC and RISC becomes evident through the basic computer performance equation:

    time/program = (time/cycle) x (cycles/instruction) x (instructions/program)

RISC systems shorten execution time by reducing the clock cycles per instruction

CISC systems improve performance by reducing the number of instructions per program
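
The trade-off in the performance equation can be made concrete with a small Python sketch. The function name `execution_time_ns` and every number below are hypothetical, chosen only to show how the three factors multiply. They are not measurements of any real machine.

```python
def execution_time_ns(instructions, cycles_per_instruction, cycle_time_ns):
    """time/program = instructions/program x cycles/instruction x time/cycle."""
    return instructions * cycles_per_instruction * cycle_time_ns


# Hypothetical CISC-style run: fewer instructions, more cycles for each one.
cisc_ns = execution_time_ns(instructions=1_000, cycles_per_instruction=6, cycle_time_ns=2.0)

# Hypothetical RISC-style run: more instructions, about one cycle for each one.
risc_ns = execution_time_ns(instructions=2_500, cycles_per_instruction=1, cycle_time_ns=1.5)

print(f"CISC-style: {cisc_ns} ns, RISC-style: {risc_ns} ns")
```

With these made-up inputs the RISC-style run finishes sooner even though it executes more instructions, because the reduction in cycles per instruction outweighs the larger instruction count.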

RISC vs. CISC

The simple instruction set of RISC machines enables control units to be hardwired for maximum speed

The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time

With fixed-length instructions, RISC lends itself to pipelining and speculative execution

RISC vs. CISC

Consider these two program fragments:

CISC:
    mov ax, 10
    mov bx, 5
    mul bx, ax

RISC:
    mov ax, 0
    mov bx, 10
    mov cx, 5
    Begin: add ax, bx
           loop Begin

The total clock cycles for the CISC version might be:
    (2 movs x 1 cycle) + (1 mul x 30 cycles) = 32 cycles

While the clock cycles for the RISC version are:
    (3 movs x 1 cycle) + (5 adds x 1 cycle) + (5 loops x 1 cycle) = 13 cycles

With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds
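
The cycle arithmetic for the two fragments can be checked with a short Python sketch. The per-instruction costs (1 cycle for mov, add and loop, 30 cycles for mul) and the execution counts are the slide's illustrative assumptions, and the names `CISC_MIX`, `RISC_MIX` and `total_cycles` are introduced here only for the example.

```python
# Each entry: (mnemonic, times executed, cycles per execution).
CISC_MIX = [("mov", 2, 1), ("mul", 1, 30)]
RISC_MIX = [("mov", 3, 1), ("add", 5, 1), ("loop", 5, 1)]


def total_cycles(mix):
    """Sum count x cycles over every instruction kind in the mix."""
    return sum(count * cycles for _name, count, cycles in mix)


print("CISC cycles:", total_cycles(CISC_MIX))
print("RISC cycles:", total_cycles(RISC_MIX))
```

The RISC fragment executes 13 instructions against 3 for CISC, yet needs fewer cycles because no single instruction is expensive.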

RISC vs. CISC

Because of their load-store ISAs, RISC architectures require a large number of CPU registers

These registers provide fast access to data during sequential program execution

They can also be employed to reduce the overhead typically caused by passing parameters to subprograms

Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers

RISC vs. CISC

This is how registers can be overlapped in a RISC system

The current window pointer (CWP) points to the active register window

[Figure: overlapping register windows]

RISC vs. CISC

It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures

Some RISC systems provide more extravagant instruction sets than some CISC systems

Some systems combine both approaches

The following two slides summarize the characteristics that traditionally typify the differences between these two architectures

RISC vs. CISC

RISC
- Multiple register sets.
- Three operands per instruction.
- Parameter passing through register windows.
- Single-cycle instructions.
- Hardwired control.
- Highly pipelined.

CISC
- Single register set.
- One or two register operands per instruction.
- Parameter passing through memory.
- Multiple cycle instructions.
- Microprogrammed control.
- Less pipelined.

Continued...

RISC vs. CISC

RISC
- Simple instructions, few in number.
- Fixed length instructions.
- Complexity in compiler.
- Only LOAD/STORE instructions access memory.
- Few addressing modes.

CISC
- Many complex instructions.
- Variable length instructions.
- Complexity in microcode.
- Many instructions can access memory.
- Many addressing modes.

Summary

Instruction Set Design Issues

Instruction set design issues include:
Where are operands stored?
- registers, memory, stack, accumulator
How many explicit operands are there?
- 0, 1, 2, or 3
How is the operand location specified?
- register, immediate, indirect, . . .
What type and size of operands are supported?
- byte, int, float, double, string, vector . . .
What operations are supported?
- add, sub, mul, move, compare . . .

More About General Purpose Registers

Why do almost all new architectures use GPRs?
Registers are much faster than memory (even cache)
- Register values are available immediately
- When memory isn't ready, the processor must wait ("stall")
Registers are convenient for variable storage
- Compiler assigns some variables just to registers
- More compact code, since small fields specify registers (compared to memory addresses)

[Figure: memory hierarchy: Processor, Registers, Cache, Memory, Disk]

What Operations are Needed

Arithmetic + Logical
- Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
- Logical operation: AND, OR, XOR, NOT
Data Transfer - copy, load, store
Control - branch, jump, call, return
Floating Point - ADDF, MULF, DIVF
- Same as arithmetic but usually take bigger operands
Decimal - ADDD, CONVERT
String - move, compare, search
Graphics - pixel and vertex, compression/decompression operations

Stack Architecture: Pros and Cons

Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures

Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures

Accumulator Architecture: Pros and Cons

Pros
- Very low hardware requirements
- Easy to design and understand

Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic

Memory-Memory Architecture: Pros and Cons

Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)

Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required

Memory-Register Architecture: Pros and Cons

Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density

Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers

Code Sequence for C = A + B

Stack | Accumulator | Register-memory | Register-register (load-store)
Push A | Load A | Load R1, A | Load R1, A
Push B | Add B | Add R3, R1, B | Load R2, B
Add | Store C | Store R3, C | Add R3, R1, R2
Pop C | | | Store R3, C

The code sequence for C = A + B for four classes of instruction sets. Note that the Add instruction has implicit operands for stack and accumulator architectures, and explicit operands for register architectures. It is assumed that A, B, and C all belong in memory and that the values of A and B cannot be destroyed.

[Figure slides, titles only:]
Stack Architectures
Accumulator Architectures
Register-Set Architectures
Register-to-Register: Load-Store Architectures
Register-to-Memory Architectures
Memory-to-Memory Architectures
Instruction Formats

Instruction Set Architecture (ISA)
- To command a computer's hardware, you must speak its language
- The words of a machine's language are called instructions, and its vocabulary is called an instruction set
- Once you learn one machine language, it is easy to pick up others:
  - There are few fundamental operations that all computers must provide
  - All designers have the same goal of finding a language that simplifies building the hardware and the compiler while maximizing performance and minimizing cost
- Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept
- The MIPS instruction set is used as a case study

Interface Design

A good interface
- Lasts through many implementations (portability, compatibility)
- Is used in many different ways (generality)
- Provides convenient functionality to higher levels
- Permits an efficient implementation at lower levels

Design decisions must take into account
- Technology
- Machine organization
- Programming languages
- Compiler technology
- Operating systems

[Figure: one interface with several uses above it and several implementations below it, over time]

Classifying Instruction Set Architectures

Accumulator Architecture
• Common in early stored-program computers, when hardware was so expensive
• Machine has only one register (accumulator) involved in all math and logical operations
• All operations assume the accumulator as a source operand and a destination for the operation, with the other operand stored in memory

Extended Accumulator Architecture
• Dedicated registers for specific operations, e.g. stack and array index registers, added
• The 8086 microprocessor is an example of such a special-purpose register architecture

General-Purpose Register Architecture
• MIPS is an example of such an architecture, where registers are not restricted to playing a single role
• This type of instruction set can be further divided into:
  • Register-memory: allows for one operand to be in memory
  • Register-register (load-store): demands all operands to be in registers

Machine | # general-purpose registers | Architecture style | Year
Motorola 6800 | 2 | Accumulator | 1974
DEC VAX | 16 | Register-memory, memory-memory | 1977
Intel 8086 | 1 | Extended accumulator | 1978
Motorola 68000 | 16 | Register-memory | 1980
Intel 80386 | 8 | Register-memory | 1985
PowerPC | 32 | Load-store | 1992
DEC Alpha | 32 | Load-store | 1992

Compact Code and Stack Architectures
- When memory is scarce, machines like the Intel 80x86 had variable-length instructions to match varying operand specifications and minimize code size
- Stack machines abandoned registers altogether, arguing that it is hard for compilers to use them efficiently
- Operands are to be pushed on a stack from memory, and the results have to be popped from the stack to memory
- Operations take their operands by default from the top of the stack and insert the results back onto the stack
- Stack machines simplify compilers and lend themselves to a compact instruction encoding, but limit compiler optimization (e.g. in math expressions)

Example: A = B + C
    Push AddressC   # Top=Top+4; Stack[Top]=Memory[AddressC]
    Push AddressB   # Top=Top+4; Stack[Top]=Memory[AddressB]
    add             # Stack[Top-4]=Stack[Top]+Stack[Top-4]; Top=Top-4
    Pop AddressA    # Memory[AddressA]=Stack[Top]; Top=Top-4

- Compact code is important for heralded network computers, where programs must be downloaded over the Internet (e.g. Java-based applications)
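
The push/add/pop sequence for A = B + C can be traced with a tiny stack-machine interpreter. This is a minimal sketch under stated assumptions: the function `run_stack_program`, the tuple-based instruction format, and the use of a Python list instead of an explicit Top pointer are all simplifications invented for illustration.

```python
def run_stack_program(program, memory):
    """Execute push/add/pop instructions against a dict-based memory."""
    stack = []
    for op, *args in program:
        if op == "push":            # copy a memory operand onto the stack
            stack.append(memory[args[0]])
        elif op == "add":           # operands are implicit: the top two entries
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif op == "pop":           # move the top of the stack back to memory
            memory[args[0]] = stack.pop()
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory


memory = {"A": 0, "B": 7, "C": 5}
program = [("push", "C"), ("push", "B"), ("add",), ("pop", "A")]
run_stack_program(program, memory)
print(memory["A"])
```

Note that `add` names no operands at all, which is where the compact encoding comes from, while every value still has to travel through memory and the stack.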

Other types of Architecture

High-Level-Language Architecture
• In the 1960s, systems software was rarely written in high-level languages, and virtually every commercial operating system before Unix was written in assembly
• Some people blamed the code density on the instruction set rather than the programming language
• A machine design philosophy was advocated with the goal of making the hardware more like high-level languages
• The effectiveness of high-level languages, memory size limitation, and lack of efficient compilers doomed this philosophy to a historical footnote

Reduced Instruction Set Architecture
• With the recent development in compiler technology and expanded memory sizes, fewer programmers are using assembly level coding
• Instruction set architecture became measurable in the way compilers, rather than programmers, use them
• RISC architecture favors simplifying hardware design over enriching the offered set of instructions, relying on compilers to effectively use them to perform complex operations
• Virtually all new architectures since 1982 follow the RISC philosophy of fixed instruction lengths, load-store operations, and limited addressing modes

Evolution of Instruction Sets

Single Accumulator (EDSAC 1950)
Accumulator + Index Registers (Manchester Mark I, IBM 700 series 1953)
Separation of Programming Model from Implementation
- High-level Language Based (B5000 1963)
- Concept of a Family (IBM 360 1964)
General Purpose Register Machines
- Complex Instruction Sets (Vax, Intel 432 1977-80)
- Load/Store Architecture (CDC 6600, Cray 1 1963-76)
RISC (Mips, Sparc, HP-PA, IBM RS6000 . . . 1987)

Register-Memory Architectures

Effect of the number of memory operands

# memory addresses | Max. number of operands | Examples
0 | 3 | SPARC, MIPS, PowerPC, ALPHA
1 | 2 | Intel 80x86, Motorola 68000
2 | 2 | VAX (also has 3 operands format)
3 | 3 | VAX (also has 2 operands format)

Interpreting Memory Addressing
- The address of a word matches the byte address of one of its 4 bytes
- The addresses of sequential words differ by 4 (word size in bytes)
- Words' addresses are multiples of 4 (alignment restriction)
- Machines that use the address of the leftmost byte as the word address are called "Big Endian", and those that use the rightmost byte are called "Little Endian"
- Misalignment complicates memory access and causes programs to run slower (some machines do not allow misaligned memory access at all)
- Byte ordering can be a problem when exchanging data among different machines
- Byte addresses affect array index calculation, which must account for word addressing and the offset within the word

Object addressed | Aligned at byte offsets | Misaligned at byte offsets
Byte | 0, 1, 2, 3, 4, 5, 6, 7 | Never
Half word | 0, 2, 4, 6 | 1, 3, 5, 7
Word | 0, 4 | 1, 2, 3, 5, 6, 7
Double word | 0 | 1, 2, 3, 4, 5, 6, 7
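
Both ideas on this slide, alignment and byte ordering, can be checked with a short Python sketch. The helper names `is_aligned` and `misaligned_offsets` and the sample word `0x0A0B0C0D` are assumptions made for the example. `struct.pack` is the standard-library call used to lay the word out in each byte order.

```python
import struct


def is_aligned(address, size):
    """An object of `size` bytes is aligned when its address is a multiple of size."""
    return address % size == 0


def misaligned_offsets(size, span=8):
    """Byte offsets within a `span`-byte block at which an object would be misaligned."""
    return [offset for offset in range(span) if not is_aligned(offset, size)]


for name, size in [("byte", 1), ("half word", 2), ("word", 4), ("double word", 8)]:
    print(f"{name:12s} misaligned at {misaligned_offsets(size)}")

# The same 32-bit word laid out in memory under each byte order.
word = 0x0A0B0C0D
big_endian = struct.pack(">I", word)      # most significant byte at the lowest address
little_endian = struct.pack("<I", word)   # least significant byte at the lowest address
print(big_endian.hex(), little_endian.hex())
```

Exchanging the raw bytes between a big-endian and a little-endian machine without conversion would turn `0x0A0B0C0D` into `0x0D0C0B0A`, which is the byte-ordering problem the slide mentions.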

Addressing Modes
- Addressing modes refer to how to specify the location of an operand (effective address)
- Addressing modes have the ability to:
  - Significantly reduce instruction counts
  - Increase the average CPI
  - Increase the complexity of building a machine
- The VAX machine is used for benchmark data since it supports a wide range of memory addressing modes
- Famous addressing modes can be classified based on:
  - the source of the data, into register, immediate, or memory
  - the address calculation, into direct and indirect
- An indexed addressing mode is usually provided to allow efficient implementation of loops and array access

Example of Addressing Modes

Addressing mode | Example | Meaning | When used
Register | Add R4, R3 | Regs[R4] <- Regs[R4] + Regs[R3] | When a value is in a register
Immediate | Add R4, #3 | Regs[R4] <- Regs[R4] + 3 | For constants
Displacement | Add R4, 100(R1) | Regs[R4] <- Regs[R4] + Mem[100 + Regs[R1]] | Accessing local variables
Register indirect | Add R4, (R1) | Regs[R4] <- Regs[R4] + Mem[Regs[R1]] | Accessing using a pointer or a computed address
Indexed | Add R4, (R1 + R2) | Regs[R4] <- Regs[R4] + Mem[Regs[R1] + Regs[R2]] | Sometimes useful in array addressing: R1 = base of the array; R2 = index amount
Direct or absolute | Add R4, (1001) | Regs[R4] <- Regs[R4] + Mem[1001] | Sometimes useful for accessing static data; address constant may need to be large
Memory indirect or memory deferred | Add R4, @(R3) | Regs[R4] <- Regs[R4] + Mem[Mem[Regs[R3]]] | If R3 is the address of the pointer p, then mode yields *p
Autoincrement | Add R4, (R2)+ | Regs[R4] <- Regs[R4] + Mem[Regs[R2]]; Regs[R2] <- Regs[R2] + d | Useful for stepping through arrays within a loop. R2 points to start of the array; each reference increments R2 by d
Autodecrement | Add R4, -(R2) | Regs[R2] <- Regs[R2] - d; Regs[R4] <- Regs[R4] + Mem[Regs[R2]] | Same use as autoincrement. Autodecrement/increment can also act as push/pop to implement a stack
Scaled | Add R4, 100(R2)[R3] | Regs[R4] <- Regs[R4] + Mem[100 + Regs[R2] + Regs[R3] * d] | Used to index arrays
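
The "Meaning" column of the addressing-mode table translates almost directly into code. The Python sketch below is a hedged illustration only: the function `fetch_operand`, its keyword arguments, and the sample register and memory contents are all invented for the example, and the autoincrement/autodecrement modes are left out because they also modify a register.

```python
def fetch_operand(mode, regs, mem, **f):
    """Return the source operand value selected by an addressing mode."""
    if mode == "register":
        return regs[f["reg"]]
    if mode == "immediate":
        return f["imm"]
    if mode == "displacement":
        return mem[f["disp"] + regs[f["base"]]]
    if mode == "register_indirect":
        return mem[regs[f["base"]]]
    if mode == "indexed":
        return mem[regs[f["base"]] + regs[f["index"]]]
    if mode == "direct":
        return mem[f["addr"]]
    if mode == "memory_indirect":
        return mem[mem[regs[f["base"]]]]
    if mode == "scaled":
        return mem[f["disp"] + regs[f["base"]] + regs[f["index"]] * f["d"]]
    raise ValueError(f"unsupported mode: {mode}")


# Hypothetical machine state: register contents and a sparse memory.
regs = {"R1": 200, "R2": 8, "R3": 2}
mem = {300: 11, 200: 22, 208: 33, 1001: 44, 22: 55, 116: 66}

print(fetch_operand("displacement", regs, mem, disp=100, base="R1"))
print(fetch_operand("memory_indirect", regs, mem, base="R1"))
print(fetch_operand("scaled", regs, mem, disp=100, base="R2", index="R3", d=4))
```

Each extra level of address arithmetic or memory lookup in `fetch_operand` corresponds to extra work in hardware, which is why richer addressing modes can lower instruction counts while raising CPI and design complexity.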

Addressing Mode for Signal Processing

Fast Fourier Transform (bit-reversed index order)
0 (000) -> 0 (000)
1 (001) -> 4 (100)
2 (010) -> 2 (010)
3 (011) -> 6 (110)
4 (100) -> 1 (001)
5 (101) -> 5 (101)
6 (110) -> 3 (011)
7 (111) -> 7 (111)

Modulo addressing
- Since DSP deals with continuous data streams, circular buffers are widely used
- Circular or modulo addressing allows automatic increment and decrement, and resets the pointer when reaching the end of the buffer

Reverse addressing
- The resulting address is the reverse order of the current address
- Reverse addressing mode expedites the access, which otherwise requires a number of logical instructions or extra memory accesses

- DSP offers special addressing modes to better serve popular algorithms
- Special features require either hand coding or a compiler that uses such features (GNU would not be a good choice)
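
What these two DSP addressing modes compute can be shown in software. The following Python sketch is an assumption-laden illustration, not a description of any particular DSP: the function names `bit_reverse` and `modulo_next` and the buffer parameters are hypothetical.

```python
def bit_reverse(index, bits):
    """Reverse the low `bits` bits of index (the FFT reordering pattern)."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)   # move the lowest bit to the output
        index >>= 1
    return result


def modulo_next(pointer, step, start, length):
    """Advance a pointer inside a circular buffer [start, start + length)."""
    return start + (pointer - start + step) % length


print([bit_reverse(i, 3) for i in range(8)])

# Walk a 4-entry circular buffer that starts at address 16.
pointer = 16
visited = []
for _ in range(6):
    visited.append(pointer)
    pointer = modulo_next(pointer, 1, start=16, length=4)
print(visited)
```

In software each of these helpers costs several shifts, masks, or a modulo operation per access. A DSP that provides the modes in hardware gets the same address as a side effect of the load or store, which is the saving the slide refers to.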

Operations of the Computer Hardware

"There must certainly be instructions for performing the fundamental arithmetic operations."
Burks, Goldstine and von Neumann, 1947

- Assembly language is a symbolic representation of what the processor actually understands
- The MIPS assembler allows only one instruction per line and ignores comments following # until the end of the line

Example: translation of a segment of a C program to MIPS assembly instructions

C: f = (g + h) - (i + j)

MIPS:
    add t0, g, h    # temp variable t0 contains "g + h"
    add t1, i, j    # temp variable t1 contains "i + j"
    sub f, t0, t1   # f = t0 - t1 = (g + h) - (i + j)

Operations in the Instruction Set

Operator type | Examples
Arithmetic and logical | Integer arithmetic and logical operations: add, and, subtract, or
Data transfer | Loads-stores (move instructions on machines with memory addressing)
Control | Branch, jump, procedure call and return, trap
System | Operating system call, virtual memory management instructions
Floating point | Floating point instructions: add, multiply
Decimal | Decimal add, decimal multiply, decimal to character conversion
String | String move, string compare, string search
Graphics | Pixel operations, compression/decompression operations

- Arithmetic, logical, data transfer and control are almost standard categories for all machines
- System instructions are required for multi-programming environments, although support for system functions varies
- Decimal and string instructions can be primitives, e.g. IBM 360 and the VAX
- Support for floating point, decimal, string and graphics can be optionally provided, sometimes via a co-processor
- Some machines rely on the compiler to synthesize special operations, such as string handling, from simpler instructions

Operations for Media & Signal Processing
- Single instruction multiple data (SIMD) and vector instructions are often supported in DSPs, which are commonly used in multimedia and signal processing applications
- Partitioned Add (integer)
  - Performs multiple 16-bit additions on a 64-bit ALU, since most data are narrow
  - Increases ALU throughput for multimedia applications
- Paired single operations (float)
  - Allow the same register to act as two operands to the same operation
  - Handy in dealing with vertices and coordinates
- Multiply and accumulate
  - Very handy for calculating dot products of vectors (signal processing) and matrix multiplication
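
The partitioned add and multiply-accumulate operations can be modeled in plain Python to show what the hardware does in one step. This is a sketch under stated assumptions: `partitioned_add` and `dot_product` are illustrative names, four 16-bit lanes in a 64-bit word are assumed, and lane overflow is assumed to wrap around (real media instructions often offer saturating variants as well).

```python
def partitioned_add(a, b, lane_bits=16, lanes=4):
    """Add two packed words lane by lane; carries never cross a lane boundary."""
    mask = (1 << lane_bits) - 1
    result = 0
    for lane in range(lanes):
        shift = lane * lane_bits
        lane_sum = ((a >> shift) & mask) + ((b >> shift) & mask)
        result |= (lane_sum & mask) << shift   # carry out of the lane is dropped
    return result


def dot_product(xs, ys):
    """Dot product written as a chain of multiply-accumulate steps."""
    acc = 0
    for x, y in zip(xs, ys):
        acc += x * y        # one multiply-accumulate per element pair
    return acc


packed = partitioned_add(0x0001_FFFF_0003_0004, 0x0001_0001_0001_0001)
print(hex(packed))
print(dot_product([1, 2, 3], [4, 5, 6]))
```

The lane holding `0xFFFF` wraps to zero without disturbing its neighbor, which is the property that distinguishes a partitioned add from an ordinary 64-bit add.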

Frequency of Operations Usage

- The most widely executed instructions are the simple operations of an instruction set
- The following is the average usage in SPECint on Intel 80x86

Rank | 80x86 instruction | Integer average (% total executed)
1 | Load | 22%
2 | Conditional branch | 20%
3 | Compare | 16%
4 | Store | 12%
5 | Add | 8%
6 | And | 6%
7 | Sub | 5%
8 | Move register-register | 4%
9 | Call | 1%
10 | Return | 1%
Total | | 96%

Make the common case fast by focusing on these operations

Control Flow Instructions
- Jump for unconditional change in the control flow
- Branch for conditional change in the control flow
- Procedure calls and returns

[Figure: frequency of control flow instructions; data is based on SPEC on Alpha]

Destination Address Definition
- Relative addressing with respect to the program counter proved to be the best choice for forward and backward branching or jumps (load address independent)
- To allow for dynamic loading of library routines, register indirect addressing allows addresses to be loaded in special registers (e.g. virtual functions in C++ and system calls in a case statement)

[Figure: data is based on SPEC on Alpha]

Condition Evaluation
- Compare and branch can be efficient if the majority of conditions are comparisons with zero
- Remember to focus on the common case

[Figure: based on SPEC on MIPS]

Frequency of Types of Comparison

[Figure: frequency of types of comparison; data is based on SPEC on Alpha]

- Different benchmarks and machines set new design priorities
- DSPs support a repeat instruction for "for" loops (vectors) using registers

Supporting Procedures

Execution of a procedure follows these steps:
- Store parameters in a place accessible to the procedure
- Transfer control to the procedure
- Acquire the storage resources needed for the procedure
- Perform the desired task
- Store the result value in a place accessible to the calling program
- Return control to the point of origin

The hardware provides a program counter to trace instruction flow and manage transfer of control

Parameter Passing
- Registers can be used for passing a small number of parameters
- A stack is used to spill registers of the current context and make room for the called procedure to run, and to allow for large parameters to be passed
- Storage of machine state can be performed by caller or callee
- Handling of shared variables is important to ensure correct semantics and thus requires clear specifications in the library interface

Global variables stored in registers need careful handling

Type and Size of Operands
- The type of an operand is designated by encoding it in the instruction's operation code
- The type of an operand, e.g. single precision float, effectively gives its size
- Common operand types include character, half word and word size integer, and single- and double-precision floating point
- Characters are almost always in ASCII, integers are in 2's complement, and floating point is in IEEE 754
- The 16-bit Unicode used in Java is gaining popularity due to its support for international character sets
- For business applications, some architectures support a decimal format in binary coded decimal (BCD)
- Depending on the size of the word, the complexity of handling different operand types differs
- DSP offers fixed point data types to support high precision floating point arithmetic and to allow sharing a single exponent for multiple numbers
- For graphics applications, vertex and pixel operands are added features

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 39101

ouble$ord data type is used for double$precision floating point operationsand address storage in machines ith a amp$bit ide address bus

Dords are used for integer operations and for $bit address bus machines

8ecause the mi in SPltC ord and double$ord data types dominates

Sie of $perands

LreJuency of reference by si-e based on SPltCgtgtgt on Alpha

Instruction Representation

Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary, or base 2).

Numbers are stored in computers as a series of high and low electronic signals (binary numbers).

Binary digits are called bits and are considered the atom of computing.

Each piece of an instruction is a number, and placing these numbers together forms the instruction.

The assembler translates symbolic assembly instructions into machine language instructions (machine code).

Example:

Assembly:                    add $t0, $s1, $s2

Machine language (decimal):  0       17      18      8       0       32

Machine language (binary):   000000  10001   10010   01000   00000   100000

Field widths:                6 bits  5 bits  5 bits  5 bits  5 bits  6 bits

Note: the MIPS compiler by default maps $s0..$s7 to registers 16-23 and $t0..$t7 to registers 8-15.
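The idea that an instruction is just a set of numbers placed side by side can be sketched in a few lines. The helper below is hypothetical (encode_r_type is not part of any assembler); it assumes the standard MIPS R-format field widths of 6, 5, 5, 5, 5 and 6 bits, and the conventional register numbers 8 for $t0, 17 for $s1 and 18 for $s2.

```python
def encode_r_type(op: int, rs: int, rt: int, rd: int, shamt: int, funct: int) -> int:
    """Pack the six R-format fields into one 32-bit instruction word."""
    fields = [(op, 6), (rs, 5), (rt, 5), (rd, 5), (shamt, 5), (funct, 6)]
    word = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"field value {value} does not fit in {width} bits")
        word = (word << width) | value   # append the field to the right
    return word


if __name__ == "__main__":
    # add $t0, $s1, $s2  ->  op=0, rs=17, rt=18, rd=8, shamt=0, funct=32
    word = encode_r_type(0, 17, 18, 8, 0, 32)
    print(f"{word:032b}")
    print(f"0x{word:08X}")
```

Note that the destination register comes first in the assembly text but sits in the fourth field (rd) of the machine word.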

Encoding an Instruction Set

Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation.

The operation is typically specified in one field, called the opcode.

The addressing mode for the operand can be encoded with the operation, or specified through a separate identifier when a large number of modes is supported.

The architecture must balance several competing factors:

- Desire to support as many registers and addressing modes as possible

- Effect of operand specification on the size of the instruction (and of the program)

- Desire to simplify instruction fetching and decoding during execution

Fixed-size instruction encoding simplifies the CPU design while limiting the addressing modes supported.

An architect who cares about code size can use variable-size encoding.

A hybrid approach is to allow variability by supporting multiple instruction sizes.

Encoding Examples

MIPS Instruction Format

Register-format instructions:

op:    Basic operation of the instruction, traditionally called the opcode
rs:    The first register source operand
rt:    The second register source operand
rd:    The register destination operand; it gets the result of the operation
shamt: Shift amount
funct: This field selects the specific variant of the operation in the op field

Immediate-type instructions:

Some instructions need longer fields than those provided, e.g. for large-value constants.

The 16-bit address means a load word instruction can load a word within a region of ±2^15 (32,768) bytes of the address in the base register.

Example: lw $t0, 32($s3)    # Temporary register $t0 gets A[8]

Instruction  Format  op   rs   rt   rd   shamt  funct  address
add          R       0    reg  reg  reg  0      32     N/A
sub          R       0    reg  reg  reg  0      34     N/A
lw           I       35   reg  reg  N/A  N/A    N/A    address
sw           I       43   reg  reg  N/A  N/A    N/A    address

R-format fields: op (6 bits), rs (5 bits), rt (5 bits), rd (5 bits), shamt (5 bits), funct (6 bits)

I-format fields: op (6 bits), rs (5 bits), rt (5 bits), address (16 bits)
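The reach of the 16-bit address field can be made concrete with a small sketch. The function encode_i_type is a hypothetical helper, not an assembler API; it assumes the I-format layout of op (6 bits), rs (5 bits), rt (5 bits) and a signed 16-bit offset, and uses opcode 35 for lw with $s3 = register 19 and $t0 = register 8.

```python
def encode_i_type(op: int, rs: int, rt: int, offset: int) -> int:
    """Pack an I-format instruction; `offset` is a signed 16-bit byte offset."""
    if not -(1 << 15) <= offset < (1 << 15):
        raise ValueError("offset must lie within -32768..32767 bytes of the base register")
    for value, width in ((op, 6), (rs, 5), (rt, 5)):
        if not 0 <= value < (1 << width):
            raise ValueError(f"field value {value} does not fit in {width} bits")
    # The offset is stored as a 16-bit 2's complement pattern in the low half-word.
    return (op << 26) | (rs << 21) | (rt << 16) | (offset & 0xFFFF)


if __name__ == "__main__":
    # lw $t0, 32($s3): base register rs = 19, target register rt = 8
    print(f"0x{encode_i_type(35, 19, 8, 32):08X}")
```

An offset outside the signed 16-bit range is rejected here; on a real machine the compiler would have to build the address in a register with extra instructions instead.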

The Stored Program Concept

Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept.

Today's computers are built on two key principles:

- Instructions are represented as numbers

- Programs can be stored in memory to be read or written, just like numbers

The power of the concept: memory can contain

- the source code for an editor

- the compiled machine code for the editor

- the text that the compiled program is using

- the compiler that generated the code

[Figure: a processor connected to a memory that holds an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and the source code in C for the editor program]

Compiling if-then-else in MIPS

Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiled MIPS code for the following C if statement?

if (i == j) f = g + h; else f = g - h;

[Flowchart: test i == j; when i = j, f = g + h; when i ≠ j, control goes to Else: and f = g - h; both paths meet at Exit:]

MIPS:

      bne $s3, $s4, Else    # go to Else if i ≠ j
      add $s0, $s1, $s2     # f = g + h (skipped if i ≠ j)
      j   Exit
Else: sub $s0, $s1, $s2     # f = g - h (skipped if i = j)
Exit:
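To check that the branch sequence really implements the C statement, it can be traced with a toy interpreter. Everything below is an illustrative sketch: the tuple-based program format, the run function and the nop at the Exit label are assumptions made for this example, and only the four instructions used in the fragment are modelled.

```python
def run(program, regs):
    """Execute a list of (label, opcode, operands) tuples and return the registers."""
    labels = {label: index for index, (label, _, _) in enumerate(program) if label}
    pc = 0
    while pc < len(program):
        _, op, args = program[pc]
        pc += 1                                  # default: sequential flow
        if op == "bne":                          # branch if the two registers differ
            a, b, target = args
            if regs[a] != regs[b]:
                pc = labels[target]
        elif op == "j":                          # unconditional jump
            pc = labels[args[0]]
        elif op == "add":
            dest, a, b = args
            regs[dest] = regs[a] + regs[b]
        elif op == "sub":
            dest, a, b = args
            regs[dest] = regs[a] - regs[b]
        elif op != "nop":
            raise ValueError(f"unsupported instruction: {op}")
    return regs


# f, g, h, i, j live in s0..s4, as in the text.
IF_ELSE = [
    (None,   "bne", ("s3", "s4", "Else")),
    (None,   "add", ("s0", "s1", "s2")),
    (None,   "j",   ("Exit",)),
    ("Else", "sub", ("s0", "s1", "s2")),
    ("Exit", "nop", ()),
]


def compiled_if(g, h, i, j):
    regs = {"s0": 0, "s1": g, "s2": h, "s3": i, "s4": j}
    return run(IF_ELSE, regs)["s0"]


if __name__ == "__main__":
    print(compiled_if(5, 3, 1, 1))   # i == j, so f = g + h
    print(compiled_if(5, 3, 1, 2))   # i != j, so f = g - h
```

The compiler tests the opposite condition (bne rather than beq) so that the then-part can fall through without an extra jump.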

Typical Compilation

Major Types of Optimization

Optimization name: Explanation (Frequency; N.M. = not measured)

High level (at or near source level; machine independent)

- Procedure integration: Replace procedure call by procedure body (N.M.)

Local (within straight-line code)

- Common subexpression elimination: Replace two instances of the same computation by a single copy

- Constant propagation: Replace all instances of a variable that is assigned a constant with the constant

- Stack height reduction: Rearrange expression tree to minimize resources needed for expression evaluation (N.M.)

Global (across a branch)

- Global common subexpression elimination: Same as local, but this version crosses branches

- Copy propagation: Replace all instances of a variable A that has been assigned X (i.e., A = X) with X

- Code motion: Remove code from a loop that computes the same value each iteration of the loop

- Induction variable elimination: Simplify/eliminate array addressing calculations within loops

Machine-dependent (depends on machine knowledge)

- Strength reduction: Many examples, such as replacing multiply by a constant with adds and shifts (N.M.)

- Pipeline scheduling: Reorder instructions to improve pipeline performance (N.M.)
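Several of the optimizations listed above can be seen side by side in a small before-and-after sketch. The two functions are invented for illustration and are written at source level; a real compiler applies these transformations on its intermediate representation, not on Python text.

```python
def unoptimized(a: int, b: int, n: int) -> int:
    scale = 8                      # variable assigned a constant
    total = 0
    for i in range(n):
        x = (a + b) * scale        # a + b computed twice per iteration;
        y = (a + b) * i            # x does not depend on i at all
        total += x + y
    return total


def optimized(a: int, b: int, n: int) -> int:
    common = a + b                 # common subexpression elimination
    x = common << 3                # constant propagation (scale -> 8), then
                                   # strength reduction (multiply by 8 -> shift by 3)
    total = 0                      # code motion: x is computed once, outside the loop
    for i in range(n):
        total += x + common * i
    return total


if __name__ == "__main__":
    print(unoptimized(2, 3, 4), optimized(2, 3, 4))
```

Both versions return the same value for every input; only the amount of work per loop iteration changes.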

Effect of Compiler Optimization

[Figure: program and compiler optimization level; measurements taken on S]

Level 0: non-optimized code

Level 1: local optimization

Level 2: global optimization, s/w pipelining

Level 3: adds procedure integration

Compiler Support for Multimedia Instr.

Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics).

Intel added a new set of instructions, called Streaming SIMD Extension.

A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer.

Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations:

- Strided addressing allows memory access in increments larger than one

- Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data

Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup.

Such limited support for vector processing makes the use of vectorizing compiler optimization unpopular and restricts its scope to hand-coded routines.

SIMD instructions on MMX and AltiVec tend to be solutions, not primitives.
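The two vector addressing styles can be mimicked on a flat list standing in for memory. This is only a sketch of the access patterns; strided_load, gather and scatter are hypothetical names, and a real vector unit would perform each of these as a single instruction rather than a loop.

```python
def strided_load(memory, base, stride, count):
    """Load `count` elements starting at `base`, stepping `stride` elements each time."""
    return [memory[base + k * stride] for k in range(count)]


def gather(memory, indices):
    """Load the elements whose addresses are stored in an index vector."""
    return [memory[i] for i in indices]


def scatter(memory, indices, values):
    """Store each value at the address given by the matching index."""
    for i, value in zip(indices, values):
        memory[i] = value


if __name__ == "__main__":
    # A 3x4 matrix stored row by row: element (r, c) lives at address 4*r + c.
    matrix = list(range(12))
    print(strided_load(matrix, base=1, stride=4, count=3))   # column 1 of the matrix
    print(gather(matrix, [10, 2, 7]))                        # arbitrary, non-regular addresses
```

Without strided access, reading a matrix column needs one scalar load per element, which is the limitation the text attributes to MMX.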

Starting a Program

[Figure: C program -> Compiler -> Assembly language program -> Assembler -> Object: machine language module, together with Object: library routine (machine language) -> Linker -> Executable: machine language program -> Loader -> Memory]

Linker:

- Place code and data modules symbolically in memory

- Determine the addresses of data and instruction labels

- Patch both internal and external references

Object files for Unix typically contain:

Header: size and position of components

Text segment: machine code

Data segment: static and dynamic variables

Relocation info: identifies absolute memory references

Symbol table: names and locations of labels, procedures and variables

Debugging info: mapping of source to object code, breakpoints, etc.

Loading Executable Program

[Figure: MIPS memory layout, from high to low addresses: Stack ($sp at 7fff ffff hex), Dynamic data, Static data ($gp at 1000 8000 hex, segment starting at 1000 0000 hex), Text (pc at 0040 0000 hex), Reserved (from 0)]

To load an executable, the operating system follows these steps:

- Reads the executable file header to determine the size of the text and data segments

- Creates an address space large enough for the text and data

- Copies the instructions and data from the executable file into memory

- Copies the parameters (if any) to the main program onto the stack

- Initializes the machine registers and sets the stack pointer to the first free location

- Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program

Instruction Set Design Issues

Instruction set design issues:

- Number of addresses

- Flow of control

- Operand types

- Addressing modes

- Instruction types

- Instruction formats

Number of Addresses

Four categories:

- 3-address machines
  - 2 for the source operands and one for the result

- 2-address machines
  - One address doubles as source and result

- 1-address machines
  - Accumulator machines
  - Accumulator is used for one source and result

- 0-address machines
  - Stack machines
  - Operands are taken from the stack
  - Result goes onto the stack

Number of Addresses (cont.)

Three-address machines

- Two for the source operands, one for the result

- RISC processors use three addresses

- Sample instructions

  add  dest,src1,src2    M(dest) = [src1] + [src2]

  sub  dest,src1,src2    M(dest) = [src1] - [src2]

  mult dest,src1,src2    M(dest) = [src1] * [src2]

Three addresses: Operand 1, Operand 2, Result

Example: a = b + c

Three-address instruction formats are not common, because they require a relatively long instruction format to hold the three address references.

Number of Addresses (cont.)

Example

C statement:

A = B + C * D - E + F + A

Equivalent code:

mult T,C,D    T = C*D

add  T,T,B    T = B + C*D

sub  T,T,E    T = B + C*D - E

add  T,T,F    T = B + C*D - E + F

add  A,T,A    A = B + C*D - E + F + A
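The three-address sequence can be executed on a toy machine to confirm that it computes the C statement. The model below is a sketch under simple assumptions: memory is a dictionary keyed by variable name, and each instruction is a tuple of an opcode followed by dest, src1, src2; none of this corresponds to a real instruction set.

```python
def run_three_address(program, memory):
    """Each instruction reads two memory operands and writes a third."""
    operations = {
        "add":  lambda x, y: x + y,
        "sub":  lambda x, y: x - y,
        "mult": lambda x, y: x * y,
    }
    for op, dest, src1, src2 in program:
        memory[dest] = operations[op](memory[src1], memory[src2])
    return memory


# A = B + C * D - E + F + A
PROGRAM_3ADDR = [
    ("mult", "T", "C", "D"),
    ("add",  "T", "T", "B"),
    ("sub",  "T", "T", "E"),
    ("add",  "T", "T", "F"),
    ("add",  "A", "T", "A"),
]

if __name__ == "__main__":
    mem = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
    print(run_three_address(PROGRAM_3ADDR, mem)["A"])
```

Five instructions suffice, but each one carries three address fields, which is the instruction-length cost the text points out.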

Number of Addresses (cont.)

Two-address machines

- One address doubles (for source operand and result)

- Last example makes a case for it
  - Address T is used twice

- Sample instructions

  load dest,src    M(dest) = [src]

  add  dest,src    M(dest) = [dest] + [src]

  sub  dest,src    M(dest) = [dest] - [src]

  mult dest,src    M(dest) = [dest] * [src]

Two addresses: one address doubles as operand and result

Example: a = a + b

The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation.

Number of Addresses (cont.)

Example

C statement:

A = B + C * D - E + F + A

Equivalent code:

load T,C    T = C

mult T,D    T = C*D

add  T,B    T = B + C*D

sub  T,E    T = B + C*D - E

add  T,F    T = B + C*D - E + F

add  A,T    A = B + C*D - E + F + A

Number of Addresses (cont.)

One-address machines

- Use a special set of registers called accumulators
  - Specify one source operand and receive the result

- Called accumulator machines

- Sample instructions

  load  addr    accum = [addr]

  store addr    M[addr] = accum

  add   addr    accum = accum + [addr]

  sub   addr    accum = accum - [addr]

  mult  addr    accum = accum * [addr]

Number of Addresses (cont.)

Example

C statement:

A = B + C * D - E + F + A

Equivalent code:

load  C    load C into accum

mult  D    accum = C*D

add   B    accum = C*D + B

sub   E    accum = B + C*D - E

add   F    accum = B + C*D - E + F

add   A    accum = B + C*D - E + F + A

store A    store accum contents in A
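An accumulator machine is simple enough to model in a dozen lines, which makes the implicit operand visible. The sketch below assumes a single accumulator and a dictionary for memory; the function name run_accumulator and the tuple format are illustrative only.

```python
def run_accumulator(program, memory):
    """Every instruction names one address; the accumulator is the implicit operand."""
    accum = 0
    for op, addr in program:
        if op == "load":
            accum = memory[addr]
        elif op == "store":
            memory[addr] = accum
        elif op == "add":
            accum = accum + memory[addr]
        elif op == "sub":
            accum = accum - memory[addr]
        elif op == "mult":
            accum = accum * memory[addr]
        else:
            raise ValueError(f"unsupported instruction: {op}")
    return memory


# A = B + C * D - E + F + A
PROGRAM_ACC = [
    ("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
    ("add", "F"), ("add", "A"), ("store", "A"),
]

if __name__ == "__main__":
    mem = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
    print(run_accumulator(PROGRAM_ACC, mem)["A"])
```

The program is longer than the three-address version (seven instructions), but each instruction needs only one address field, and every step touches memory.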

Number of Addresses (cont.)

Zero-address machines

- Stack supplies operands and receives the result
  - Special instructions to load and store use an address

- Called stack machines (Ex: HP 3000, Burroughs B5500)

- Sample instructions

  push addr    push([addr])

  pop  addr    pop([addr])

  add          push(pop + pop)

  sub          push(pop - pop)

  mult         push(pop * pop)

Number of Addresses (cont.)

Example

C statement:

A = B + C * D - E + F + A

Equivalent code:

push E

push C

push D

mult

push B

add

sub

push F

add

push A

add

pop  A
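The order of the pushes matters because sub computes push(pop - pop), so the value popped first becomes the left operand. A small stack-machine sketch makes this easy to trace; run_stack and the tuple-based program are assumptions for illustration, with a Python list as the stack and a dictionary as memory.

```python
def run_stack(program, memory):
    """Zero-address machine: arithmetic takes its operands from the stack."""
    stack = []
    for instruction in program:
        op = instruction[0]
        if op == "push":
            stack.append(memory[instruction[1]])
        elif op == "pop":
            memory[instruction[1]] = stack.pop()
        elif op in ("add", "sub", "mult"):
            first = stack.pop()            # top of stack
            second = stack.pop()           # element below it
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)   # push(pop - pop)
            else:
                stack.append(first * second)
        else:
            raise ValueError(f"unsupported instruction: {op}")
    return memory


# A = B + C * D - E + F + A; E is pushed first so it ends up below B + C*D.
PROGRAM_STACK = [
    ("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
    ("push", "B"), ("add",), ("sub",),
    ("push", "F"), ("add",), ("push", "A"), ("add",), ("pop", "A"),
]

if __name__ == "__main__":
    mem = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
    print(run_stack(PROGRAM_STACK, mem)["A"])
```

Only push and pop carry an address; the arithmetic instructions are opcode-only, which is where the good code density of stack machines comes from.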

Load/Store Architecture

- Instructions expect operands in internal processor registers

- Special LOAD and STORE instructions move data between registers and memory

- RISC uses this architecture

- Reduces instruction length

Load/Store Architecture (cont.)

Sample instructions

load  Rd,addr      Rd = [addr]

store addr,Rs      (addr) = Rs

add   Rd,Rs1,Rs2   Rd = Rs1 + Rs2

sub   Rd,Rs1,Rs2   Rd = Rs1 - Rs2

mult  Rd,Rs1,Rs2   Rd = Rs1 * Rs2

Number of Addresses (cont.)

Example

C statement:

A = B + C * D - E + F + A

Equivalent code:

load  R1,B

load  R2,C

load  R3,D

load  R4,E

load  R5,F

load  R6,A

mult  R2,R2,R3

add   R2,R2,R1

sub   R2,R2,R4

add   R2,R2,R5

add   R2,R2,R6

store A,R2

Flow of Control

- Default is sequential flow

- Several instructions alter this default execution

  - Branches
    - Unconditional
    - Conditional
    - Delayed branches

  - Procedure calls
    - Delayed procedure calls

Flow of Control (cont.)

Branches

- Unconditional

  - Absolute address

  - PC-relative
    - Target address is specified relative to PC contents
    - Relocatable code

- Example: MIPS

  - Absolute address

    j target

  - PC-relative

    b target
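How the two MIPS forms arrive at a target address can be sketched as plain arithmetic. The functions below are illustrative helpers, not part of any toolchain; they assume the MIPS32 rules that a branch adds its sign-extended 16-bit word offset to the address of the following instruction, and that j combines the upper four bits of that address with a 26-bit word index.

```python
def branch_target(pc: int, offset_field: int) -> int:
    """PC-relative: target = (pc + 4) + sign_extend(16-bit offset) * 4."""
    offset = offset_field & 0xFFFF
    if offset & 0x8000:                 # sign-extend the 16-bit field
        offset -= 0x10000
    return (pc + 4 + (offset << 2)) & 0xFFFFFFFF


def jump_target(pc: int, index_field: int) -> int:
    """j: the upper 4 bits come from pc + 4, the rest from the 26-bit index * 4."""
    return ((pc + 4) & 0xF0000000) | ((index_field & 0x03FFFFFF) << 2)


if __name__ == "__main__":
    print(hex(branch_target(0x00400000, 3)))
    print(hex(jump_target(0x00400000, 0x00100004)))
    # Moving the code by 0x1000 bytes moves the branch target by the same amount:
    print(hex(branch_target(0x00401000, 3)))
```

Because the branch encodes only a distance, the same instruction word stays valid wherever the code is loaded, which is what makes PC-relative code relocatable; the j instruction encodes a location and has to be patched if the code moves.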

Flow of Control (cont.)

Branches

- Conditional
  - Jump is taken only if the condition is met

- Two types

  - Set-Then-Jump
    - Condition testing is separated from branching
    - Condition code registers are used to convey the condition test result
    - Condition code registers keep a record of the status of the last ALU operation, such as an overflow condition

  - Example: Pentium code

    cmp AX,BX     compare AX and BX

    je  target    jump if equal

Flow of Control (cont.)

  - Test-and-Jump
    - Single instruction performs condition testing and branching

  - Example: MIPS instruction

    beq Rsrc1,Rsrc2,target

    Jumps to target if Rsrc1 = Rsrc2

Delayed branching

- Control is transferred after executing the instruction that follows the branch instruction
  - This instruction slot is called the delay slot

- Improves efficiency

- Highly pipelined RISC processors support it

Flow of Control (cont.)

Procedure calls

- Facilitate modular programming

- Require two pieces of information to return

  - End of procedure
    - Pentium uses the ret instruction
    - MIPS uses the jr instruction

  - Return address
    - In a (special) register: MIPS allows any general-purpose register
    - On the stack: Pentium

[Figure: delay slot]

Parameter Passing

Two basic techniques

- Register-based (e.g., PowerPC, MIPS)
  - Internal registers are used
    - Faster
    - Limit the number of parameters
    - Recursive procedure

- Stack-based (e.g., Pentium)
  - Stack is used
    - More general

2 Operand Types

Instructions support basic data types

- Characters

- Integers

- Floating-point

Instruction overload

- Same instruction for different data types

- Example: Pentium

  mov AL,address     loads an 8-bit value

  mov AX,address     loads a 16-bit value

  mov EAX,address    loads a 32-bit value

Operand Types (cont.)

Separate instructions

- Instructions specify the operand size

- Example: MIPS

  lb Rdest,address    loads a byte

  lh Rdest,address    loads a halfword (16 bits)

  lw Rdest,address    loads a word (32 bits)

  ld Rdest,address    loads a doubleword (64 bits)

Similar instructions for store

3 Addressing Modes

How the operands are specified

- Operands can be in three places

  - Registers
    - Register addressing mode

  - Part of instruction
    - Constant
    - Immediate addressing mode
    - All processors support these two addressing modes

  - Memory
    - Difference between RISC and CISC
    - CISC supports a large variety of addressing modes
    - RISC follows load/store architecture

4 Instruction Types

Several types of instructions

- Data movement
  - Pentium: mov dest,src
  - Some do not provide direct data movement instructions
  - Indirect data movement

    add Rdest,Rsrc,0    Rdest = Rsrc+0

- Arithmetic and Logical
  - Arithmetic
    - Integer and floating-point, signed and unsigned
    - add, subtract, multiply, divide
  - Logical
    - and, or, not, xor

Instruction Types (cont.)

Condition code bits

S: Sign bit (0 = +, 1 = -)

Z: Zero bit (0 = nonzero, 1 = zero)

O: Overflow bit (0 = no overflow, 1 = overflow)

C: Carry bit (0 = no carry, 1 = carry)

Example: Pentium

cmp count,25    compare count to 25 (subtract 25 from count)

je  target      jump if equal
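What a compare instruction does to the four condition code bits can be sketched directly. The function below is a hypothetical model, not the behaviour of any particular CPU's documentation: it assumes the compare is a subtraction whose result is discarded, an 8-bit operand width by default, and the x86-style convention that the carry bit signals a borrow.

```python
def compare(a: int, b: int, bits: int = 8) -> dict:
    """Return the S, Z, O, C bits produced by computing a - b on `bits`-bit operands."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "S": int(bool(result & sign)),                   # sign of the result
        "Z": int(result == 0),                           # result is zero: operands equal
        "O": int(bool((a ^ b) & (a ^ result) & sign)),   # signed overflow
        "C": int(a < b),                                 # unsigned borrow
    }


def jump_if_equal(flags: dict) -> bool:
    """je inspects only the zero bit left behind by the preceding compare."""
    return flags["Z"] == 1


if __name__ == "__main__":
    print(compare(25, 25), jump_if_equal(compare(25, 25)))
    print(compare(24, 25), jump_if_equal(compare(24, 25)))
```

Keeping the test and the jump separate means any later instruction that changes the flags must not be placed between them.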

Instruction Types (cont.)

Flow control and I/O instructions

- Branch

- Procedure call

- Interrupts

I/O instructions

- Memory-mapped I/O
  - Most processors support memory-mapped I/O
  - No separate instructions for I/O

- Isolated I/O
  - Pentium supports isolated I/O
  - Separate I/O instructions

    in  AX,io_port    read from an I/O port

    out io_port,AX    write to an I/O port

5 Instruction Formats

Two types

- Fixed-length
  - Used by RISC processors
  - 32-bit RISC processors use 32-bit wide instructions
    - Examples: SPARC, MIPS, PowerPC

- Variable-length
  - Used by CISC processors
  - Memory operands need more bits to specify

Opcode

- Major and exact operation

Examples of Instruction Formats

RISC (Reduced Instruction Set Computer) versus CISC (Complex Instruction Set Computer)

RISC vs CISC

The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.

RISC systems access memory only with explicit load and store instructions.

In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.

The difference between CISC and RISC becomes evident through the basic computer performance equation:

time/program = (time/cycle) x (cycles/instruction) x (instructions/program)

RISC systems shorten execution time by reducing the clock cycles per instruction.

CISC systems improve performance by reducing the number of instructions per program.

The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.

The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.

With fixed-length instructions, RISC lends itself to pipelining and speculative execution.

Consider the following program fragments:

CISC:

       mov ax, 10
       mov bx, 5
       mul bx, ax

RISC:

       mov ax, 0
       mov bx, 10
       mov cx, 5
Begin: add ax, bx
       loop Begin

The total clock cycles for the CISC version might be:

(2 movs x 1 cycle) + (1 mul x 30 cycles) = 32 cycles

While the clock cycles for the RISC version are:

(3 movs x 1 cycle) + (5 adds x 1 cycle) + (5 loops x 1 cycle) = 13 cycles

With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds.
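The cycle counting above is a direct application of the performance equation, and can be reproduced with a few lines of arithmetic. The instruction mixes below are assumptions chosen for illustration: a multiply by 5 done either as one 30-cycle mul or as a five-iteration add loop, with every other instruction taking one cycle. The helper names are invented for this sketch.

```python
def total_cycles(instruction_mix):
    """Sum count * cycles-per-instruction over (count, cycles) pairs."""
    return sum(count * cycles for count, cycles in instruction_mix)


def execution_time(instruction_mix, cycle_time_ns):
    """time/program = cycles/program * time/cycle."""
    return total_cycles(instruction_mix) * cycle_time_ns


# Assumed mixes: (number of instructions executed, cycles each)
CISC_MIX = [(2, 1), (1, 30)]            # 2 movs, 1 mul
RISC_MIX = [(3, 1), (5, 1), (5, 1)]     # 3 movs, 5 adds, 5 loops

if __name__ == "__main__":
    print(total_cycles(CISC_MIX), total_cycles(RISC_MIX))
    # Even with an identical cycle time the RISC fragment finishes sooner here.
    print(execution_time(CISC_MIX, 1.0), execution_time(RISC_MIX, 1.0))
```

The RISC fragment executes more instructions (13 versus 3) yet needs fewer cycles, because the saving in cycles per instruction outweighs the larger instruction count.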

Because of their load-store ISAs, RISC architectures require a large number of CPU registers.

These registers provide fast access to data during sequential program execution.

They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.

Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.

[Figure: overlapping register windows]

This is how registers can be overlapped in a RISC system.

The current window pointer (CWP) points to the active register window.

It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.

Some RISC systems provide more extravagant instruction sets than some CISC systems.

Some systems combine both approaches.

The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.

RISC                                          CISC
Multiple register sets.                       Single register set.
Three operands per instruction.               One or two register operands per instruction.
Parameter passing through register windows.   Parameter passing through memory.
Single-cycle instructions.                    Multiple-cycle instructions.
Hardwired control.                            Microprogrammed control.
Highly pipelined.                             Less pipelined.

(Continued)

RISC                                          CISC
Simple instructions, few in number.           Many complex instructions.
Fixed-length instructions.                    Variable-length instructions.
Complexity in compiler.                       Complexity in microcode.
Only LOAD/STORE instructions access memory.   Many instructions can access memory.
Few addressing modes.                         Many addressing modes.

Summary

Instruction Set Design Issues

Instruction set design issues include:

- Where are operands stored?
  - registers, memory, stack, accumulator

- How many explicit operands are there?
  - 0, 1, 2, or 3

- How is the operand location specified?
  - register, immediate, indirect, . . .

- What type and size of operands are supported?
  - byte, int, float, double, string, vector . . .

- What operations are supported?
  - add, sub, mul, move, compare . . .

More About General Purpose Registers

Why do almost all new architectures use GPRs?

- Registers are much faster than memory (even cache)
  - Register values are available immediately
  - When memory isn't ready, the processor must wait ("stall")

- Registers are convenient for variable storage
  - Compiler assigns some variables just to registers
  - More compact code, since small fields specify registers (compared to memory addresses)

[Figure: memory hierarchy: Processor with Registers, Cache, Memory, Disk]

What Operations are Needed

- Arithmetic + Logical
  - Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
  - Logical operation: AND, OR, XOR, NOT

- Data Transfer - copy, load, store

- Control - branch, jump, call, return

- Floating Point
  - ADDF, MULF, DIVF (same as arithmetic, but usually take bigger operands)

- Decimal - ADDD, CONVERT

- String - move, compare, search

- Graphics - pixel and vertex, compression/decompression operations

Stack Architecture Pros and Cons

Pros

- Good code density (implicit top of stack)

- Low hardware requirements

- Easy to write a simpler compiler for stack architectures

Cons

- Stack becomes the bottleneck

- Little ability for parallelism or pipelining

- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed

- Difficult to write an optimizing compiler for stack architectures

Accumulator Architecture Pros and Cons

Pros

- Very low hardware requirements

- Easy to design and understand

Cons

- Accumulator becomes the bottleneck

- Little ability for parallelism or pipelining

- High memory traffic

Memory-Memory Architecture Pros and Cons

Pros

- Requires fewer instructions (especially if 3 operands)

- Easy to write compilers for (especially if 3 operands)

Cons

- Very high memory traffic (especially if 3 operands)

- Variable number of clocks per instruction

- With two operands, more data movements are required

Memory-Register Architecture Pros and Cons

Pros

- Some data can be accessed without loading first

- Instruction format easy to encode

- Good code density

Cons

- Operands are not equivalent (poor orthogonality)

- Variable number of clocks per instruction

- May limit number of registers

Stack Architectures

Accumulator Architectures

Register-Set Architectures

Register-to-Register: Load-Store Architectures

Register-to-Memory Architectures

Memory-to-Memory Architectures

Instruction Formats

Instruction Set Architecture (ISA )Instruction Set Architecture (ISA )

To command a computer's hardware, you must speak its language. The words of a machine's language are called instructions, and its vocabulary is called the instruction set.

Once you learn one machine language, it is easy to pick up others: there are few fundamental operations that all computers must provide.

All designers have the same goal of finding a language that simplifies building the hardware and the compiler while maximizing performance and minimizing cost.

Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept.

The MIPS instruction set is used as a case study.

Interface Design

A good interface:
Lasts through many implementations (portability, compatibility)
Is used in many different ways (generality)
Provides convenient functionality to higher levels
Permits an efficient implementation at lower levels

Design decisions must take into account:
Technology
Machine organization
Programming languages
Compiler technology
Operating systems

[Figure: a single interface with several uses above it and several implementations (imp 0, imp 1, ...) below it, over time]

Classifying Instruction Set Architectures

Accumulator Architecture
• Common in early stored-program computers, when hardware was so expensive
• Machine has only one register (accumulator) involved in all math and logical operations
• All operations assume the accumulator as a source operand and a destination for the operation, with the other operand stored in memory

Extended Accumulator Architecture
• Dedicated registers for specific operations, e.g., stack and array index registers, added
• The 8086 microprocessor is an example of such a special-purpose register architecture

General-Purpose Register Architecture
• MIPS is an example of such an architecture, where registers are not restricted to a single role
• This type of instruction set can be further divided into:
• Register-memory: allows one operand to be in memory
• Register-register (load-store): demands all operands to be in registers

Machine | # general-purpose registers | Architecture style | Year
Motorola 6800 | | Accumulator | 1974
DEC VAX | | Register-memory, memory-memory | 1977
Intel 8086 | | Extended accumulator | 1978
Motorola 68000 | | Register-memory | 1980
Intel 80386 | | Register-memory | 1985
PowerPC | | Load-store |
DEC Alpha | | Load-store |

Compact Code and Stack Architectures

When memory is scarce, machines like the Intel 80x86 had variable-length instructions to match varying operand specifications and minimize code size

Stack machines abandoned registers altogether, arguing that it is hard for compilers to use them efficiently

Operands are pushed onto a stack from memory, and the results are popped from the stack to memory

Operations take their operands by default from the top of the stack and insert the results back onto the stack

Stack machines simplify compilers and lend themselves to a compact instruction encoding, but limit compiler optimization (e.g., in math expressions)

Example: A = B + C

Push AddressC   # Top=Top+4; Stack[Top]=Memory[AddressC]
Push AddressB   # Top=Top+4; Stack[Top]=Memory[AddressB]
add             # Stack[Top-4]=Stack[Top]+Stack[Top-4]; Top=Top-4
Pop AddressA    # Memory[AddressA]=Stack[Top]; Top=Top-4

Compact code is important for heralded network computers, where programs must be downloaded over the Internet (e.g., Java-based applications)
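The four-instruction sequence for A = B + C can be traced with a small simulation. This is only a sketch: the function name `run_stack_program`, the dictionary-based memory, and the sample values are assumptions made for illustration, not part of the slides.

```python
WORD = 4  # bytes per stack slot, matching the slide's Top = Top + 4


def run_stack_program(memory, program):
    """Execute (opcode, address) pairs on a hypothetical stack machine."""
    stack = {}  # maps a byte offset (a value of Top) to the word stored there
    top = 0     # Top pointer; grows by WORD on every push
    for op, addr in program:
        if op == "push":
            top += WORD
            stack[top] = memory[addr]
        elif op == "add":
            # Both operands come implicitly from the top of the stack.
            stack[top - WORD] = stack[top] + stack[top - WORD]
            top -= WORD
        elif op == "pop":
            memory[addr] = stack[top]
            top -= WORD
        else:
            raise ValueError(f"unknown opcode: {op}")
    return memory, top


# Illustrative values for B and C (assumed, not from the slides).
memory = {"A": 0, "B": 7, "C": 5}
program = [("push", "C"), ("push", "B"), ("add", None), ("pop", "A")]
memory, top = run_stack_program(memory, program)
```

Only the push and pop instructions carry an address; add names no operands at all, which is where the compact encoding comes from.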

Other Types of Architecture

High-Level-Language Architecture

• In the 1960s, systems software was rarely written in high-level languages, and virtually every commercial operating system before Unix was written in assembly

• Some people blamed the code density on the instruction set rather than the programming language

• A machine design philosophy was advocated with the goal of making the hardware more like high-level languages

• The effectiveness of high-level languages, memory size limitations, and the lack of efficient compilers doomed this philosophy to a historical footnote

Reduced Instruction Set Architecture

• With recent developments in compiler technology and expanded memory sizes, fewer programmers are using assembly-level coding

• Instruction set architecture came to be measured by how well compilers, rather than programmers, use it

• RISC architecture favors simplifying hardware design over enriching the offered set of instructions, relying on compilers to use them effectively to perform complex operations

• Virtually all new architectures since 1982 follow the RISC philosophy of fixed instruction lengths, load-store operations, and limited addressing modes

Evolution of Instruction Sets

[Figure: evolution of instruction sets]
Single Accumulator (EDSAC)
Accumulator + Index Registers (Manchester Mark I)
Separation of Programming Model from Implementation
High-level Language Based | Concept of a Family
General Purpose Register Machines
Complex Instruction Sets (VAX, Intel) | Load/Store Architecture (CDC, Cray 1)
RISC (MIPS, SPARC)

Register-Memory Architectures

Effect of the number of memory operands

Number of memory addresses | Max. number of operands | Examples
0 | 3 | SPARC, MIPS, PowerPC, ALPHA
1 | 2 | Intel 80x86, Motorola 68000
2 | 2 | VAX (also has 3-operand format)
3 | 3 | VAX (also has 2-operand format)

Memory Address: Interpreting Memory Addressing

The address of a word matches the byte address of one of its 4 bytes

The addresses of sequential words differ by 4 (word size in bytes)

Word addresses are multiples of 4 (alignment restriction)

Machines that use the address of the leftmost byte as the word address are called "Big Endian", and those that use the rightmost byte are called "Little Endian"

Misalignment complicates memory access and causes programs to run slower (some machines do not allow misaligned memory access at all)

Byte ordering can be a problem when exchanging data among different machines

Byte addresses affect array index calculation, which must account for word addressing and the offset within the word

Object addressed | Aligned at byte offsets | Misaligned at byte offsets
Byte | 0, 1, 2, 3, 4, 5, 6, 7 | Never
Half word | 0, 2, 4, 6 | 1, 3, 5, 7
Word | 0, 4 | 1, 2, 3, 5, 6, 7
Double word | 0 | 1, 2, 3, 4, 5, 6, 7
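The alignment rule and the two byte orders can be checked with a few lines of Python. This is a sketch under assumptions: the helper names `is_aligned` and `word_bytes` and the sample value 0x0A0B0C0D are illustrative only, and a 4-byte word is assumed as in the slide.

```python
def is_aligned(address, size):
    """An object of `size` bytes is aligned when its address is a multiple of size."""
    return address % size == 0


def word_bytes(value, byteorder):
    """Return the 4 bytes of a word in increasing address order."""
    return list(value.to_bytes(4, byteorder))


# Byte offsets within an 8-byte window at which each object size is aligned.
aligned_half_offsets = [a for a in range(8) if is_aligned(a, 2)]
aligned_word_offsets = [a for a in range(8) if is_aligned(a, 4)]
aligned_double_offsets = [a for a in range(8) if is_aligned(a, 8)]

# The same word laid out in memory under both conventions.
big = word_bytes(0x0A0B0C0D, "big")        # most significant byte at the lowest address
little = word_bytes(0x0A0B0C0D, "little")  # least significant byte at the lowest address
```

Exchanging the word between a big-endian and a little-endian machine without conversion would read the bytes in the opposite order, which is the data-exchange problem noted above.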

Addressing Modes

Addressing modes refer to how to specify the location of an operand (effective address)

Addressing modes have the ability to:
Significantly reduce instruction counts
Increase the average CPI
Increase the complexity of building a machine

The VAX machine is used for benchmark data since it supports a wide range of memory addressing modes

Famous addressing modes can be classified based on:
the source of the data, into register, immediate, or memory
the address calculation, into direct and indirect

An indexed addressing mode is usually provided to allow efficient implementation of loops and array access

Example of Addressing Modes

Addressing mode | Example | Meaning | When used
Register | Add R4, R3 | Regs[R4] ← Regs[R4] + Regs[R3] | When a value is in a register
Immediate | Add R4, #3 | Regs[R4] ← Regs[R4] + 3 | For constants
Displacement | Add R4, 100(R1) | Regs[R4] ← Regs[R4] + Mem[100 + Regs[R1]] | Accessing local variables
Register indirect | Add R4, (R1) | Regs[R4] ← Regs[R4] + Mem[Regs[R1]] | Accessing using a pointer or a computed address
Indexed | Add R4, (R1 + R2) | Regs[R4] ← Regs[R4] + Mem[Regs[R1] + Regs[R2]] | Sometimes useful in array addressing: R1 = base of the array; R2 = index amount
Direct or absolute | Add R4, (1001) | Regs[R4] ← Regs[R4] + Mem[1001] | Sometimes useful for accessing static data; address constant may need to be large
Memory indirect or memory deferred | Add R4, @(R3) | Regs[R4] ← Regs[R4] + Mem[Mem[Regs[R3]]] | If R3 is the address of the pointer p, then mode yields *p
Autoincrement | Add R4, (R2)+ | Regs[R4] ← Regs[R4] + Mem[Regs[R2]]; Regs[R2] ← Regs[R2] + d | Useful for stepping through arrays within a loop. R2 points to start of the array; each reference increments R2 by d
Autodecrement | Add R4, -(R2) | Regs[R2] ← Regs[R2] - d; Regs[R4] ← Regs[R4] + Mem[Regs[R2]] | Same use as autoincrement. Autodecrement/increment can also act as push/pop to implement a stack
Scaled | Add R4, 100(R2)[R3] | Regs[R4] ← Regs[R4] + Mem[100 + Regs[R2] + Regs[R3] * d] | Used to index arrays
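The "Meaning" column reduces to a small effective-address calculation for the memory modes. The sketch below is illustrative only: the function `effective_address`, its keyword parameters, and the register and memory contents are assumptions, and only the modes without side effects are covered.

```python
def effective_address(mode, regs, mem, base=None, index=None, disp=0, d=1):
    """Return the memory address an operand refers to under a given mode."""
    if mode == "displacement":        # Mem[disp + Regs[base]]
        return disp + regs[base]
    if mode == "register_indirect":   # Mem[Regs[base]]
        return regs[base]
    if mode == "indexed":             # Mem[Regs[base] + Regs[index]]
        return regs[base] + regs[index]
    if mode == "direct":              # Mem[disp]
        return disp
    if mode == "memory_indirect":     # Mem[Mem[Regs[base]]]
        return mem[regs[base]]
    if mode == "scaled":              # Mem[disp + Regs[base] + Regs[index] * d]
        return disp + regs[base] + regs[index] * d
    raise ValueError(f"unsupported mode: {mode}")


# Assumed register and memory contents, for illustration only.
regs = {"R1": 200, "R2": 8, "R3": 3}
mem = {200: 1000}

ea_displacement = effective_address("displacement", regs, mem, base="R1", disp=100)
ea_indexed = effective_address("indexed", regs, mem, base="R1", index="R2")
ea_memory_indirect = effective_address("memory_indirect", regs, mem, base="R1")
ea_scaled = effective_address("scaled", regs, mem, base="R2", index="R3", disp=100, d=4)
```

Memory indirect needs one extra memory read just to form the address, which is one way richer modes raise the average CPI.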

Addressing Mode for Signal Processing

Fast Fourier Transform (index and its bit-reversed counterpart)

0 (000) | 0 (000)
1 (001) | 4 (100)
2 (010) | 2 (010)
3 (011) | 6 (110)
4 (100) | 1 (001)
5 (101) | 5 (101)
6 (110) | 3 (011)
7 (111) | 7 (111)

Modulo addressing

Since DSP deals with continuous data streams, circular buffers are widely used

Circular or modulo addressing allows automatic increment and decrement and resets the pointer when reaching the end of the buffer

Reverse addressing

The resulting address is the reverse order of the current address

Reverse addressing mode expedites the access, which otherwise requires a number of logical instructions or extra memory accesses

DSP offers special addressing modes to better serve popular algorithms

Special features require either hand coding or a compiler that uses such features (GNU would not be a good choice)
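Both DSP addressing modes can be expressed in a few lines of software, which also shows the work the hardware mode saves. This is a sketch: `modulo_next` and `bit_reverse` are hypothetical helper names, and the 8-entry buffer starting at address 8 is an assumed example.

```python
def modulo_next(pointer, step, base, length):
    """Advance a circular-buffer pointer, wrapping inside [base, base + length)."""
    return base + (pointer - base + step) % length


def bit_reverse(index, bits):
    """Reverse the low `bits` bits of index, as used to order FFT data."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


# Bit-reversed order for an 8-point FFT (3 address bits).
fft_order = [bit_reverse(i, 3) for i in range(8)]

# Stepping past the end of a buffer occupying addresses 8..15 wraps to its start.
wrapped_forward = modulo_next(15, 1, base=8, length=8)
wrapped_backward = modulo_next(8, -1, base=8, length=8)
```

Without the dedicated mode, each access pays for the loop in `bit_reverse` or the remainder in `modulo_next`, which is the "number of logical instructions" the slide refers to.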

Operations of the Computer Hardware

"There must certainly be instructions for performing the fundamental arithmetic operations."
Burks, Goldstine, and von Neumann, 1947

Assembly language is a symbolic representation of what the processor actually understands

The MIPS assembler allows only one instruction per line and ignores comments following # until the end of the line

Example:

Translation of a segment of a C program to MIPS assembly instructions

C: f = (g + h) - (i + j)

MIPS:

add t0, g, h   # temp variable t0 contains "g + h"
add t1, i, j   # temp variable t1 contains "i + j"
sub f, t0, t1  # f = t0 - t1 = (g + h) - (i + j)

Operations in the Instruction Set

Operator type | Examples
Arithmetic and logical | Integer arithmetic and logical operations: add, and, subtract, or
Data transfer | Loads-stores (move instructions on machines with memory addressing)
Control | Branch, jump, procedure call and return, trap
System | Operating system call, virtual memory management instructions
Floating point | Floating point instructions: add, multiply
Decimal | Decimal add, decimal multiply, decimal-to-character conversion
String | String move, string compare, string search
Graphics | Pixel operations, compression/decompression operations

Arithmetic, logical, data transfer, and control are almost standard categories for all machines

System instructions are required for multi-programming environments, although support for system functions varies

Decimal and string instructions can be primitives, e.g., IBM 360 and the VAX

Support for floating point, decimal, string, and graphics can be optional, sometimes provided via a co-processor

Some machines rely on the compiler to synthesize special operations, such as string handling, from simpler instructions

Operations for Media & Signal Processing

Single instruction multiple data (SIMD) and vector instructions are often supported in DSPs, which are commonly used in multimedia and signal processing applications

Partitioned Add (integer)

Perform multiple 16-bit additions on a 64-bit ALU, since most data are narrow

Increases ALU throughput for multimedia applications

Paired single operations (float)

Allow the same register to act as two operands to the same operation

Handy in dealing with vertices and coordinates

Multiply and accumulate

Very handy for calculating dot products of vectors (signal processing) and matrix multiplication
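A partitioned add and a multiply-accumulate loop can be modeled in software as follows. This is a sketch under stated assumptions: four 16-bit lanes in a 64-bit word, wraparound (not saturating) arithmetic per lane, and the helper names `pack`, `unpack`, `partitioned_add`, and `dot_product` are all illustrative.

```python
LANES = 4                          # assumed: four lanes per 64-bit word
LANE_BITS = 16                     # assumed lane width
LANE_MASK = (1 << LANE_BITS) - 1


def pack(values):
    """Pack LANES small integers into one 64-bit word, lane 0 in the low bits."""
    word = 0
    for i, value in enumerate(values):
        word |= (value & LANE_MASK) << (i * LANE_BITS)
    return word


def unpack(word):
    """Split a 64-bit word back into its LANES lane values."""
    return [(word >> (i * LANE_BITS)) & LANE_MASK for i in range(LANES)]


def partitioned_add(a, b):
    """Add lane by lane; a carry out of one lane never reaches the next."""
    return pack([(x + y) & LANE_MASK for x, y in zip(unpack(a), unpack(b))])


def dot_product(xs, ys):
    """Dot product as a chain of multiply-accumulate steps."""
    acc = 0
    for x, y in zip(xs, ys):
        acc += x * y  # one multiply-accumulate per element pair
    return acc


lane_sums = unpack(partitioned_add(pack([1, 2, 3, 0xFFFF]), pack([10, 20, 30, 1])))
dot = dot_product([1, 2, 3], [4, 5, 6])
```

The last lane overflows and wraps to zero without disturbing its neighbors, which is what distinguishes a partitioned add from an ordinary 64-bit add.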

Frequency of Operations Usage

Make the common case fast by focusing on these operations

The most widely executed instructions are the simple operations of an instruction set

The following is the average usage in SPECint on the Intel 80x86

Rank | 80x86 instruction | Integer average (% total executed)
1 | Load | 22%
2 | Conditional branch | 20%
3 | Compare | 16%
4 | Store | 12%
5 | Add | 8%
6 | And | 6%
7 | Sub | 5%
8 | Move register-register | 4%
9 | Call | 1%
10 | Return | 1%
Total | | 96%

Control Flow Instructions

Jump for unconditional change in the control flow

Branch for conditional change in the control flow

Procedure calls and returns

Data is based on SPEC on Alpha

Destination Address Definition

Relative addressing with respect to the program counter proved to be the best choice for forward and backward branching or jumps (load address independent)

To allow for dynamic loading of library routines, register indirect addressing allows addresses to be loaded in special registers (e.g., virtual functions in C++ and system calls in a case statement)

Data is based on SPEC on Alpha

Condition Evaluation

Compare-and-branch can be efficient if the majority of conditions are comparisons with zero

Remember to focus on the common case

Based on SPEC on MIPS

Frequency of Types of Comparison

Data is based on SPEC on Alpha

Different benchmarks and machines set new design priorities

DSPs support a repeat instruction for for-loops (vectors) using registers

Supporting Procedures

Execution of a procedure follows these steps:
Store parameters in a place accessible to the procedure
Transfer control to the procedure
Acquire the storage resources needed for the procedure
Perform the desired task
Store the result value in a place accessible to the calling program
Return control to the point of origin

The hardware provides a program counter to trace instruction flow and manage transfer of control

Parameter Passing

Registers can be used for passing a small number of parameters

A stack is used to spill registers of the current context and make room for the called procedure to run, and to allow large parameters to be passed

Storage of machine state can be performed by the caller or the callee

Handling of shared variables is important to ensure correct semantics and thus requires clear specifications in the library interface

Global variables stored in registers need careful handling

Type and Size of Operands

The type of an operand is designated by encoding it in the instruction's operation code

The type of an operand, e.g., single-precision float, effectively gives its size

Common operand types include character, half-word and word-size integers, and single- and double-precision floating point

Characters are almost always in ASCII, integers are in 2's complement, and floating point is in IEEE 754

The 16-bit Unicode used in Java is gaining popularity due to its support for international character sets

For business applications, some architectures support a decimal format in binary coded decimal (BCD)

Depending on the size of the word, the complexity of handling different operand types differs

DSP offers fixed-point data types to support high-precision floating-point arithmetic and to allow sharing a single exponent among multiple numbers

For graphics applications, vertex and pixel operands are added features

Size of Operands

The double-word data type is used for double-precision floating point operations and for address storage in machines with a 64-bit wide address bus

Words are used for integer operations and for 32-bit address bus machines

Because of the mix in SPEC, word and double-word data types dominate

Frequency of reference by size, based on SPEC2000 on Alpha

Instruction Representation

Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary, or base 2)

Numbers are stored in computers as a series of high and low electronic signals (binary numbers)

Binary digits are called bits and are considered the atom of computing

Each piece of an instruction is a number, and placing these numbers together forms the instruction

The assembler translates symbolic assembly instructions into machine language instructions (machine code)

Example:

Assembly: add $t0, $s1, $s2

M/C language (decimal): 0 | 17 | 18 | 8 | 0 | 32

M/C language (binary): 000000 | 10001 | 10010 | 01000 | 00000 | 100000
Field widths: 6 bits | 5 bits | 5 bits | 5 bits | 5 bits | 6 bits

Note: the MIPS compiler by default maps $s0..$s7 to registers 16-23 and $t0..$t7 to registers 8-15
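Placing the six field values side by side is just shifting and OR-ing. The sketch below packs the fields of add $t0, $s1, $s2 into one 32-bit word; the function name `encode_r` is an assumption made for illustration, while the field widths and values are those shown above.

```python
def encode_r(op, rs, rt, rd, shamt, funct):
    """Pack the six register-format fields (6, 5, 5, 5, 5, 6 bits) into 32 bits."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


# add $t0, $s1, $s2: op=0, rs=$s1=17, rt=$s2=18, rd=$t0=8, shamt=0, funct=32
add_word = encode_r(0, 17, 18, 8, 0, 32)
add_bits = format(add_word, "032b")   # the same word as a 32-character bit string
```

Each field occupies a fixed position, so decoding is the reverse: shift right and mask.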

Encoding an Instruction Set

Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation

The operation is typically specified in one field, called the opcode

The addressing mode for the operand can be encoded with the operation, or specified through a separate identifier when a large number of modes is supported

The architecture must balance several competing factors:
Desire to support as many registers and addressing modes as possible
Effect of operand specification on the size of the instruction (program)
Desire to simplify instruction fetching and decoding during execution

Fixed-size instruction encoding simplifies the CPU design while limiting the addressing modes supported

An architect caring about code size can use variable-size encoding

A hybrid approach is to allow variability by supporting multiple instruction sizes

Encoding Examples

MIPS Instruction Format

Register-format instructions:

op | rs | rt | rd | shamt | funct
6 bits | 5 bits | 5 bits | 5 bits | 5 bits | 6 bits

op: Basic operation of the instruction, traditionally called the opcode
rs: The first register source operand
rt: The second register source operand
rd: The register destination operand; it gets the result of the operation
shamt: Shift amount
funct: This field selects the specific variant of the operation in the op field

Immediate-type instructions:

op | rs | rt | address
6 bits | 5 bits | 5 bits | 16 bits

Some instructions need longer fields than provided, for large-value constants

The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register

Example: lw $t0, 32($s3)   # Temporary register $t0 gets A[8]

Instruction | Format | op | rs | rt | rd | shamt | funct | address
add | R | 0 | reg | reg | reg | 0 | 32 | N/A
sub | R | 0 | reg | reg | reg | 0 | 34 | N/A
lw | I | 35 | reg | reg | N/A | N/A | N/A | address
sw | I | 43 | reg | reg | N/A | N/A | N/A | address
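The immediate format and its ±2^15 reach can be illustrated the same way. This is a sketch: `encode_i` is a hypothetical helper, the register numbers assume the default mapping ($s3 is register 19, $t0 is register 8), and the range check reflects a signed 16-bit field.

```python
def encode_i(op, rs, rt, offset):
    """Pack the immediate-format fields (6, 5, 5, 16 bits) into 32 bits."""
    if not -(1 << 15) <= offset < (1 << 15):
        raise ValueError("offset does not fit in 16 signed bits")
    # Mask so that a negative offset is stored in two's complement form.
    return (op << 26) | (rs << 21) | (rt << 16) | (offset & 0xFFFF)


# lw $t0, 32($s3): op=35, rs=$s3=19 (base register), rt=$t0=8, offset=32
lw_word = encode_i(35, 19, 8, 32)
negative_offset_field = encode_i(35, 19, 8, -4) & 0xFFFF
```

An offset of 2^15 or more raises an error here, mirroring the point that constants needing longer fields cannot be expressed in a single instruction of this format.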

The Stored Program Concept

Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept

Today's computers are built on two key principles:
Instructions are represented as numbers
Programs can be stored in memory to be read or written just like numbers

The power of the concept: memory can contain
the source code for an editor
the compiled machine code for the editor
the text that the compiled program is using
the compiler that generated the code

[Figure: a processor connected to a memory that holds an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and the source code in C for the editor program]

Compiling if-then-else in MIPS

Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiled MIPS code for the following C if statement?

if (i == j) f = g + h; else f = g - h;

[Figure: flowchart testing i == j; when i = j, f = g + h; when i ≠ j (Else:), f = g - h; both paths join at Exit:]

MIPS:

      bne $s3, $s4, Else   # go to Else if i ≠ j
      add $s0, $s1, $s2    # f = g + h (skipped if i ≠ j)
      j Exit
Else: sub $s0, $s1, $s2    # f = g - h (skipped if i = j)
Exit:
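The control flow of the compiled sequence can be traced with a tiny interpreter. This is only a sketch: the tuple encoding of instructions, the `run` and `if_then_else` functions, and the label table are assumptions made for illustration and cover just the four instructions used here.

```python
def run(program, labels, regs):
    """Interpret a list of (opcode, operands...) tuples until the program ends."""
    pc = 0
    while pc < len(program):
        op, *args = program[pc]
        pc += 1  # default: fall through to the next instruction
        if op == "bne":
            a, b, target = args
            if regs[a] != regs[b]:
                pc = labels[target]
        elif op == "j":
            pc = labels[args[0]]
        elif op == "add":
            dst, x, y = args
            regs[dst] = regs[x] + regs[y]
        elif op == "sub":
            dst, x, y = args
            regs[dst] = regs[x] - regs[y]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return regs


PROGRAM = [
    ("bne", "s3", "s4", "Else"),  # go to Else if i != j
    ("add", "s0", "s1", "s2"),    # f = g + h
    ("j", "Exit"),
    ("sub", "s0", "s1", "s2"),    # Else: f = g - h
]
LABELS = {"Else": 3, "Exit": 4}   # instruction indexes of the two labels


def if_then_else(g, h, i, j):
    """Run the sequence with f, g, h, i, j in s0..s4 and return f."""
    regs = {"s0": 0, "s1": g, "s2": h, "s3": i, "s4": j}
    return run(PROGRAM, LABELS, regs)["s0"]
```

Calling `if_then_else(10, 3, 5, 5)` follows the fall-through path, while `if_then_else(10, 3, 5, 6)` takes the branch to Else; exactly one of add and sub executes in either case.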

Typical Compilation

Major Types of Optimization

Each entry gives the optimization name, its explanation, and the frequency where it is reported (N.M. where the slide marks it so)

High-level (at or near source level; machine-independent)
Procedure integration: Replace procedure call by procedure body (N.M.)

Local (within straight-line code)
Common subexpression elimination: Replace two instances of the same computation by a single copy (18%)
Constant propagation: Replace all instances of a variable that is assigned a constant with the constant
Stack height reduction: Rearrange expression tree to minimize resources needed for expression evaluation (N.M.)

Global (across a branch)
Global common subexpression elimination: Same as local, but this version crosses branches
Copy propagation: Replace all instances of a variable A that has been assigned X (i.e., A = X) with X
Code motion: Remove code from a loop that computes the same value in each iteration of the loop
Induction variable elimination: Simplify/eliminate array addressing calculations within loops

Machine-dependent (depends on machine knowledge)
Strength reduction: Many examples, such as replacing a multiply by a constant with adds and shifts (N.M.)
Pipeline scheduling: Reorder instructions to improve pipeline performance (N.M.)
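Three of the listed transformations can be shown on one small loop. The sketch below is hand-written for illustration: `naive` and `optimized` are hypothetical functions, and the "after" version is what a compiler might produce, not the output of any particular compiler.

```python
def naive(a, b, n):
    """Straightforward loop with redundant work in every iteration."""
    out = []
    for i in range(n):
        x = (a + b) * i        # a + b is loop-invariant
        y = (a + b) + i * 8    # a + b computed a second time; multiply by a constant
        out.append(x + y)
    return out


def optimized(a, b, n):
    """The same computation after three of the optimizations listed above."""
    out = []
    t = a + b                  # common subexpression, hoisted out of the loop (code motion)
    for i in range(n):
        out.append(t * i + t + (i << 3))  # i * 8 replaced by a shift (strength reduction)
    return out
```

Both functions return the same list for non-negative n; the optimized version performs one addition of a and b in total instead of two per iteration.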

Effect of Compiler Optimization

easurements taken on S

[Figure: results by program and compiler optimization level]

Level 0: non-optimized code
Level 1: local optimization
Level 2: global optimization, software pipelining
Level 3: adds procedure integration

Compiler Support for Multimedia Instructions

Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics)

Intel added a new set of instructions called Streaming SIMD Extension

A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer

Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations

Strided addressing allows memory access in increments larger than one

Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data

Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup

Such limited support for vector processing makes the use of vectorizing compiler optimizations unpopular and restricts their scope to hand-coded routines

SIMD instructions on MMX and AltiVec tend to be solutions, not primitives
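The difference between strided and gather/scatter access can be sketched with a flat list standing in for memory. The function names `strided_load`, `gather`, and `scatter`, the 4x4 row-major layout, and the stored values are assumptions made for illustration.

```python
def strided_load(memory, base, stride, count):
    """Load `count` elements starting at `base`, stepping by `stride`."""
    return [memory[base + k * stride] for k in range(count)]


def gather(memory, addresses):
    """Load from an explicit list of addresses (the addresses are stored, not the data)."""
    return [memory[a] for a in addresses]


def scatter(memory, addresses, values):
    """Store values to an explicit list of addresses."""
    for address, value in zip(addresses, values):
        memory[address] = value


# A 4x4 matrix stored row by row in 16 consecutive locations (values 100..115).
memory = list(range(100, 116))

column = strided_load(memory, base=1, stride=4, count=4)  # column 1 of the matrix
picked = gather(memory, [0, 7, 3])
scatter(memory, [0, 7, 3], [value * 2 for value in picked])
```

Reading a matrix column needs a stride equal to the row length; without strided addressing the elements must be collected one at a time, which is the limit on speedup noted above.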

Starting a Program

[Figure: C program -> Compiler -> Assembly language program -> Assembler -> Object: machine language module, together with Object: library routine (machine language) -> Linker -> Executable: machine language program -> Loader -> Memory]

Linker:
Place code and data modules symbolically in memory
Determine the addresses of data and instruction labels
Patch both internal and external references

Object files for Unix typically contain:
Header: size and position of components
Text segment: machine code
Data segment: static and dynamic variables
Relocation info: identifies absolute memory references
Symbol table: name and location of labels, procedures, and variables
Debugging info: mapping of source to object code, breakpoints, etc.

Loading Executable Program

[Figure: memory layout from high to low addresses: stack ($sp at 7fff ffff hex), dynamic data, static data ($gp at 1000 8000 hex, segment starting at 1000 0000 hex), text (pc at 0040 0000 hex), reserved (starting at 0)]

To load an executable, the operating system follows these steps:
Reads the executable file header to determine the size of the text and data segments
Creates an address space large enough for the text and data
Copies the instructions and data from the executable file into memory
Copies the parameters (if any) to the main program onto the stack
Initializes the machine registers and sets the stack pointer to the first free location
Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program

Instruction Set Design Issues

Instruction set design issues include:
Number of Addresses
Flow of Control
Operand Types
Addressing Modes
Instruction Types
Instruction Formats

Number of Addresses

Four categories:

3-address machines
- 2 for the source operands and one for the result

2-address machines
- One address doubles as source and result

1-address machines
- Accumulator machines
- Accumulator is used for one source and result

0-address machines
- Stack machines
- Operands are taken from the stack
- Result goes onto the stack

Number of Addresses (cont.)

Three-address machines

Two for the source operands, one for the result

RISC processors use three addresses

Sample instructions:

add dest,src1,src2     M(dest)=[src1]+[src2]
sub dest,src1,src2     M(dest)=[src1]-[src2]
mult dest,src1,src2    M(dest)=[src1]*[src2]

Three addresses: Operand 1, Operand 2, Result

Example: a = b + c

Three-address instruction formats are not common because they require a relatively long instruction format to hold the three address references

Number of Addresses (cont.)

Example

C statement:

A = B + C * D - E + F + A

Equivalent code:

mult T,C,D    T = C*D
add T,T,B     T = B+C*D
sub T,T,E     T = B+C*D-E
add T,T,F     T = B+C*D-E+F
add A,T,A     A = B+C*D-E+F+A
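The five-instruction sequence can be executed against a small memory model to confirm it computes the C statement. This is a sketch: `run_three_address`, the dictionary used as memory, and the initial values of A through F are assumptions made for illustration.

```python
import operator

OPS = {"add": operator.add, "sub": operator.sub, "mult": operator.mul}


def run_three_address(program, memory):
    """Execute (op, dest, src1, src2) tuples: M(dest) = [src1] op [src2]."""
    for op, dest, src1, src2 in program:
        memory[dest] = OPS[op](memory[src1], memory[src2])
    return memory


program = [
    ("mult", "T", "C", "D"),  # T = C*D
    ("add", "T", "T", "B"),   # T = B+C*D
    ("sub", "T", "T", "E"),   # T = B+C*D-E
    ("add", "T", "T", "F"),   # T = B+C*D-E+F
    ("add", "A", "T", "A"),   # A = B+C*D-E+F+A
]

# Illustrative initial values (assumed).
initial = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6, "T": 0}
result = run_three_address(program, dict(initial))
expected = (initial["B"] + initial["C"] * initial["D"] - initial["E"]
            + initial["F"] + initial["A"])
```

Every instruction after the first reuses T as both a source and the destination, which is the observation that motivates two-address formats.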

Number of Addresses (cont.)

Two-address machines

- One address doubles (for a source operand and the result)
- The last example makes a case for it
  - Address T is used twice

Sample instructions

    load dest,src    ; M(dest) = [src]
    add  dest,src    ; M(dest) = [dest] + [src]
    sub  dest,src    ; M(dest) = [dest] - [src]
    mult dest,src    ; M(dest) = [dest] * [src]

Two addresses: one address doubles as operand and result

Example: a = a + b

The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation.

Number of Addresses (cont.)

Example

C statement

    A = B + C * D - E + F + A

Equivalent code:

    load T,C    ; T = C
    mult T,D    ; T = C*D
    add  T,B    ; T = B + C*D
    sub  T,E    ; T = B + C*D - E
    add  T,F    ; T = B + C*D - E + F
    add  A,T    ; A = B + C*D - E + F + A

Number of Addresses (cont.)

One-address machines

- Use a special set of registers called accumulators
  - Specify one source operand and receive the result
- Called accumulator machines

Sample instructions

    load  addr    ; accum = [addr]
    store addr    ; M[addr] = accum
    add   addr    ; accum = accum + [addr]
    sub   addr    ; accum = accum - [addr]
    mult  addr    ; accum = accum * [addr]

Number of Addresses (cont.)

Example

C statement

    A = B + C * D - E + F + A

Equivalent code:

    load  C    ; load C into accum
    mult  D    ; accum = C*D
    add   B    ; accum = C*D + B
    sub   E    ; accum = B + C*D - E
    add   F    ; accum = B + C*D - E + F
    add   A    ; accum = B + C*D - E + F + A
    store A    ; store accum contents in A
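The accumulator version can be sketched the same way. In this minimal Python sketch the single implicit register is the local variable `accum`; the function name, the tuple encoding, and the sample memory values are assumptions for illustration only.

```python
def run_accumulator(program, memory):
    """Execute (op, addr) pairs on a one-address (accumulator) machine."""
    accum = 0
    for op, addr in program:
        if op == "load":
            accum = memory[addr]
        elif op == "store":
            memory[addr] = accum
        elif op == "add":
            accum += memory[addr]
        elif op == "sub":
            accum -= memory[addr]
        elif op == "mult":
            accum *= memory[addr]
        else:
            raise ValueError(f"unknown operation: {op}")
    return memory


# A = B + C * D - E + F + A
accumulator_program = [
    ("load", "C"),
    ("mult", "D"),
    ("add", "B"),
    ("sub", "E"),
    ("add", "F"),
    ("add", "A"),
    ("store", "A"),
]

# Hypothetical sample values
acc_result = run_accumulator(
    accumulator_program, {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
)
```

Each instruction is shorter than its three-address counterpart, but the sequence needs seven instructions and every one of them touches memory.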

Number of Addresses (cont.)

Zero-address machines

- Stack supplies operands and receives the result
  - Special instructions to load and store use an address
- Called stack machines (Ex: HP 3000, Burroughs B5500)

Sample instructions

    push addr    ; push([addr])
    pop  addr    ; pop([addr])
    add          ; push(pop + pop)
    sub          ; push(pop - pop)
    mult         ; push(pop * pop)

Number of Addresses (cont.)

Example

C statement

    A = B + C * D - E + F + A

Equivalent code:

    push E
    push C
    push D
    mult
    push B
    add
    sub
    push F
    add
    push A
    add
    pop  A
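A short stack-machine sketch shows why E is pushed first in the sequence above. This is a minimal Python sketch under one stated assumption: `sub` computes (first value popped) minus (second value popped), which is the reading of push(pop - pop) that makes the slide's sequence produce B + C*D - E. The function name, instruction encoding, and sample values are illustrative.

```python
def run_stack(program, memory):
    """Execute a zero-address program; only push/pop carry an address."""
    stack = []
    for instr in program:
        op = instr[0]
        if op == "push":
            stack.append(memory[instr[1]])
        elif op == "pop":
            memory[instr[1]] = stack.pop()
        else:
            first = stack.pop()   # value on top of the stack
            second = stack.pop()  # value just below it
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)  # assumed operand order
            elif op == "mult":
                stack.append(first * second)
            else:
                raise ValueError(f"unknown operation: {op}")
    return memory


stack_program = [
    ("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
    ("push", "B"), ("add",), ("sub",),
    ("push", "F"), ("add",),
    ("push", "A"), ("add",),
    ("pop", "A"),
]

# Hypothetical sample values
stack_result = run_stack(
    stack_program, {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
)
```

The arithmetic instructions carry no addresses at all, at the cost of twelve instructions for the same statement.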

Load/Store Architecture

- Instructions expect operands in internal processor registers
  - Special LOAD and STORE instructions move data between registers and memory
- RISC uses this architecture
- Reduces instruction length

Load/Store Architecture (cont.)

Sample instructions

    load  Rd,addr       ; Rd = [addr]
    store addr,Rs       ; (addr) = Rs
    add   Rd,Rs1,Rs2    ; Rd = Rs1 + Rs2
    sub   Rd,Rs1,Rs2    ; Rd = Rs1 - Rs2
    mult  Rd,Rs1,Rs2    ; Rd = Rs1 * Rs2

Number of Addresses (cont.)

Example

C statement

    A = B + C * D - E + F + A

Equivalent code:

    load  R1,B
    load  R2,C
    load  R3,D
    load  R4,E
    load  R5,F
    load  R6,A
    mult  R2,R2,R3
    add   R2,R2,R1
    sub   R2,R2,R4
    add   R2,R2,R5
    add   R2,R2,R6
    store A,R2

Flow of Control

- Default is sequential flow
- Several instructions alter this default execution
  - Branches
    - Unconditional
    - Conditional
    - Delayed branches
  - Procedure calls
    - Delayed procedure calls

Flow of Control (cont.)

Branches

- Unconditional
  - Absolute address
  - PC-relative
    - Target address is specified relative to PC contents
    - Relocatable code
- Example: MIPS
  - Absolute address

        j target

  - PC-relative

        b target
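The difference between the two target forms comes down to how the address is computed. The following is a minimal Python sketch of MIPS-style target calculation; the function names are made up for illustration, and the sketch ignores delay slots and exceptions. It assumes the usual MIPS32 encoding, in which a branch offset is a signed word count relative to the instruction after the branch and a jump supplies a 26-bit word index that is combined with the upper four bits of PC + 4.

```python
def pc_relative_target(pc, offset_words):
    """Branch target: offset is counted in 4-byte words from PC + 4."""
    return (pc + 4 + 4 * offset_words) & 0xFFFFFFFF


def absolute_jump_target(pc, target_field):
    """Jump target: 26-bit word index placed in the region of PC + 4."""
    region = (pc + 4) & 0xF0000000
    return region | ((target_field & 0x03FFFFFF) << 2)


# The same encoded offset works wherever the code is loaded,
# which is why PC-relative branches give relocatable code.
at_0x1000 = pc_relative_target(0x1000, 3)
at_0x2000 = pc_relative_target(0x2000, 3)
```

Moving the code by 0x1000 bytes moves the branch target by the same amount without re-encoding the instruction, whereas the jump field would have to be patched.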


Flow of Control (cont.)

Branches

- Conditional
  - Jump is taken only if the condition is met
- Two types
  - Set-Then-Jump
    - Condition testing is separated from branching
    - Condition code registers are used to convey the condition test result
    - Condition code registers keep a record of the status of the last ALU operation, such as the overflow condition
  - Example: Pentium code

        cmp AX,BX     ; compare AX and BX
        je  target    ; jump if equal

Flow of Control (cont.)

  - Test-and-Jump
    - Single instruction performs condition testing and branching
  - Example: MIPS instruction

        beq Rsrc1,Rsrc2,target

    Jumps to target if Rsrc1 = Rsrc2

Delayed branching

- Control is transferred after executing the instruction that follows the branch instruction
  - This instruction slot is called the delay slot
- Improves efficiency
- Highly pipelined RISC processors support delayed branching

Flow of Control (cont.)

Procedure calls

- Facilitate modular programming
- Require two pieces of information to return
  - End of procedure
    - Pentium uses the ret instruction
    - MIPS uses the jr instruction
  - Return address
    - In a (special) register
      - MIPS allows any general-purpose register
    - On the stack
      - Pentium

Flow of Control (cont.)

[Figure: delay slot]

Parameter Passing

Two basic techniques

- Register-based (e.g., PowerPC, MIPS)
  - Internal registers are used
    - Faster
    - Limit the number of parameters
    - Recursive procedures
- Stack-based (e.g., Pentium)
  - Stack is used
    - More general

2. Operand Types

- Instructions support basic data types
  - Characters
  - Integers
  - Floating-point
- Instruction overload
  - Same instruction for different data types
  - Example: Pentium

        mov AL,address     ; loads an 8-bit value
        mov AX,address     ; loads a 16-bit value
        mov EAX,address    ; loads a 32-bit value

Operand Types (cont.)

- Separate instructions
  - Instructions specify the operand size
  - Example: MIPS

        lb Rdest,address    ; loads a byte
        lh Rdest,address    ; loads a halfword (16 bits)
        lw Rdest,address    ; loads a word (32 bits)
        ld Rdest,address    ; loads a doubleword (64 bits)

  - Similar instructions exist for store

3. Addressing Modes

How the operands are specified

- Operands can be in three places
  - Registers
    - Register addressing mode
  - Part of instruction
    - Constant
    - Immediate addressing mode
    - All processors support these two addressing modes
  - Memory
    - Difference between RISC and CISC
    - CISC supports a large variety of addressing modes
    - RISC follows load/store architecture

4. Instruction Types

Several types of instructions

- Data movement
  - Pentium: mov dest,src
  - Some do not provide direct data movement instructions
  - Indirect data movement

        add Rdest,Rsrc,0    ; Rdest = Rsrc + 0

- Arithmetic and logical
  - Arithmetic
    - Integer and floating-point, signed and unsigned
    - add, subtract, multiply, divide
  - Logical
    - and, or, not, xor

Instruction Types (cont.)

Condition code bits

- S: Sign bit (0 = +, 1 = -)
- Z: Zero bit (0 = nonzero, 1 = zero)
- O: Overflow bit (0 = no overflow, 1 = overflow)
- C: Carry bit (0 = no carry, 1 = carry)

Example: Pentium

    cmp count,25    ; compare count to 25
                    ; (subtract 25 from count)
    je  target      ; jump if equal
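A compare instruction sets the condition code bits as if it had subtracted its second operand from the first, without keeping the difference. The following minimal Python sketch computes the four bits listed above for a fixed-width subtraction. The function name and the 32-bit default width are assumptions, and the carry bit is modeled as the borrow out of the unsigned subtraction, which is the x86 convention.

```python
def compare_flags(a, b, bits=32):
    """Return the S, Z, O, C bits produced by computing a - b."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "S": int(bool(result & sign)),  # sign bit of the result
        "Z": int(result == 0),          # result is zero
        "C": int(a < b),                # unsigned borrow
        # signed overflow: operands differ in sign and the
        # result's sign differs from the first operand's sign
        "O": int(bool((a ^ b) & (a ^ result) & sign)),
    }


equal = compare_flags(25, 25)
smaller = compare_flags(3, 25)
overflow = compare_flags(-2**31, 1)
```

A conditional jump such as je then only has to look at the Z bit; it does not need to know which operands were compared.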

Instruction Types (cont.)

- Flow control and I/O instructions
  - Branch
  - Procedure call
  - Interrupts
- I/O instructions
  - Memory-mapped I/O
    - Most processors support memory-mapped I/O
    - No separate instructions for I/O
  - Isolated I/O
    - Pentium supports isolated I/O
    - Separate I/O instructions

        in  AX,io_port    ; read from an I/O port
        out io_port,AX    ; write to an I/O port

5. Instruction Formats

Two types

- Fixed-length
  - Used by RISC processors
  - 32-bit RISC processors use 32-bit-wide instructions
    - Examples: SPARC, MIPS, PowerPC
- Variable-length
  - Used by CISC processors
  - Memory operands need more bits to specify

Opcode

- Major and exact operation

Examples of Instruction Formats

RISC (Reduced Instruction Set Computer) versus CISC (Complex Instruction Set Computer)

RISC vs. CISC

- The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.
- RISC systems access memory only with explicit load and store instructions.
- In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.

RISC vs. CISC (cont.)

- The difference between CISC and RISC becomes evident through the basic computer performance equation:

      time/program = (time/cycle) × (cycles/instruction) × (instructions/program)

- RISC systems shorten execution time by reducing the clock cycles per instruction.
- CISC systems improve performance by reducing the number of instructions per program.

RISC vs. CISC (cont.)

- The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.
- The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.
- With fixed-length instructions, RISC lends itself to pipelining and speculative execution.

RISC vs. CISC (cont.)

Consider the following program fragments:

    CISC                 RISC

    mov ax, 10           mov ax, 0
    mov bx, 5            mov bx, 10
    mul bx, ax           mov cx, 5
                  Begin: add ax, bx
                         loop Begin

The total clock cycles for the CISC version might be:

    (2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles

While the clock cycles for the RISC version are:

    (3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles

With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds.
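The cycle totals in this comparison are sums of instruction count times cycles per instruction. Below is a minimal Python sketch of that arithmetic. It assumes the costs the fragment comparison appears to use (1 cycle each for mov, add, and loop; 30 cycles for mul) and a loop body that runs 5 times; the function name and dictionary layout are illustrative only.

```python
def total_cycles(instruction_counts, cycles_per_instruction):
    """Sum count * cycles over every instruction kind in the program."""
    return sum(
        count * cycles_per_instruction[name]
        for name, count in instruction_counts.items()
    )


# Assumed per-instruction costs, in clock cycles
cisc_costs = {"mov": 1, "mul": 30}
risc_costs = {"mov": 1, "add": 1, "loop": 1}

# CISC fragment: two movs and one multiply
cisc_cycles = total_cycles({"mov": 2, "mul": 1}, cisc_costs)

# RISC fragment: three movs, then add/loop executed five times each
risc_cycles = total_cycles({"mov": 3, "add": 5, "loop": 5}, risc_costs)
```

The RISC fragment executes more instructions (13 versus 3) yet needs fewer cycles, which is the trade the performance equation describes: more instructions per program in exchange for fewer cycles per instruction.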

RISC vs. CISC (cont.)

- Because of their load-store ISAs, RISC architectures require a large number of CPU registers.
- These registers provide fast access to data during sequential program execution.
- They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.
- Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.

RISC vs. CISC (cont.)

- This is how registers can be overlapped in a RISC system.
- The current window pointer (CWP) points to the active register window.

RISC vs. CISC (cont.)

- It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.
- Some RISC systems provide more extravagant instruction sets than some CISC systems.
- Some systems combine both approaches.
- The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.

RISC vs. CISC (cont.)

RISC

- Multiple register sets.
- Three operands per instruction.
- Parameter passing through register windows.
- Single-cycle instructions.
- Hardwired control.
- Highly pipelined.

CISC

- Single register set.
- One or two register operands per instruction.
- Parameter passing through memory.
- Multiple-cycle instructions.
- Microprogrammed control.
- Less pipelined.

RISC vs. CISC (cont.)

RISC

- Simple instructions, few in number.
- Fixed-length instructions.
- Complexity in compiler.
- Only LOAD/STORE instructions access memory.
- Few addressing modes.

CISC

- Many complex instructions.
- Variable-length instructions.
- Complexity in microcode.
- Many instructions can access memory.
- Many addressing modes.

Summary

Instruction Set Design Issues

Instruction set design issues include:

- Where are operands stored?
  - registers, memory, stack, accumulator
- How many explicit operands are there?
  - 0, 1, 2, or 3
- How is the operand location specified?
  - register, immediate, indirect, ...
- What type and size of operands are supported?
  - byte, int, float, double, string, vector, ...
- What operations are supported?
  - add, sub, mul, move, compare, ...

More About General-Purpose Registers

Why do almost all new architectures use GPRs?

- Registers are much faster than memory (even cache)
  - Register values are available immediately
  - When memory isn't ready, the processor must wait ("stall")
- Registers are convenient for variable storage
  - Compiler assigns some variables just to registers
  - More compact code, since small fields specify registers (compared to memory addresses)

[Figure: storage hierarchy: Processor (Registers), Cache, Memory, Disk]

What Operations Are Needed

- Arithmetic and logical
  - Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
  - Logical operations: AND, OR, XOR, NOT
- Data transfer: copy, load, store
- Control: branch, jump, call, return
- Floating point: ADDF, MULF, DIVF
  - Same as arithmetic, but usually take bigger operands
- Decimal: ADDD, CONVERT
- String: move, compare, search
- Graphics: pixel and vertex, compression/decompression operations

Stack Architecture: Pros and Cons

Pros

- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures

Cons

- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures

Accumulator Architecture: Pros and Cons

Pros

- Very low hardware requirements
- Easy to design and understand

Cons

- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic

Memory-Memory Architecture: Pros and Cons

Pros

- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)

Cons

- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required

Memory-Register Architecture: Pros and Cons

Pros

- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density

Cons

- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers

Accumulator Architectures

Register-Set Architectures

Register-to-Register: Load-Store Architectures

Register-to-Memory Architectures

Memory-to-Memory Architectures

Instruction Formats

Instruction Set Architecture (ISA)

- To command a computer's hardware, you must speak its language.
- The words of a machine's language are called instructions, and its vocabulary is called the instruction set.
- Once you learn one machine language, it is easy to pick up others:
  - There are few fundamental operations that all computers must provide.
  - All designers have the same goal of finding a language that simplifies building the hardware and the compiler while maximizing performance and minimizing cost.
- Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept.
- The MIPS instruction set is used as a case study.

Interface Design

A good interface

- Lasts through many implementations (portability, compatibility)
- Is used in many different ways (generality)
- Provides convenient functionality to higher levels
- Permits an efficient implementation at lower levels

Design decisions must take into account

- Technology
- Machine organization
- Programming languages
- Compiler technology
- Operating systems

[Figure: one interface shared by several uses above it and several implementations below it, over time]

Classifying Instruction Set Architectures

Accumulator architecture

- Common in early stored-program computers, when hardware was so expensive
- Machine has only one register (the accumulator) involved in all math and logical operations
- All operations assume the accumulator as a source operand and as the destination for the operation, with the other operand stored in memory

Extended accumulator architecture

- Dedicated registers for specific operations, e.g., stack and array index registers, are added
- The 8086 microprocessor is an example of such a special-purpose register architecture

General-purpose register architecture

- MIPS is an example of such an architecture, where registers are not restricted to a single role
- This type of instruction set can be further divided into:
  - Register-memory: allows one operand to be in memory
  - Register-register (load-store): demands all operands to be in registers

Machine         | # general-purpose registers | Architecture style             | Year
Motorola 6800   |                             | Accumulator                    | 1974
DEC VAX         |                             | Register-memory, memory-memory | 1977
Intel 8086      |                             | Extended accumulator           | 1978
Motorola 68000  |                             | Register-memory                | 1980
Intel 80386     |                             | Register-memory                | 1985
PowerPC         |                             | Load-store                     | 1992
DEC Alpha       |                             | Load-store                     | 1992

Compact Code and Stack Architectures

- When memory is scarce, machines like the Intel 80x86 had variable-length instructions to match varying operand specifications and minimize code size.
- Stack machines abandoned registers altogether, arguing that it is hard for compilers to use them efficiently.
- Operands are pushed onto a stack from memory, and the results are popped from the stack to memory.
- Operations take their operands by default from the top of the stack and insert the results back onto the stack.
- Stack machines simplify compilers and lend themselves to a compact instruction encoding, but limit compiler optimization (e.g., in math expressions).

Example: A = B + C

    Push AddressC    # Top=Top+4; Stack[Top]=Memory[AddressC]
    Push AddressB    # Top=Top+4; Stack[Top]=Memory[AddressB]
    add              # Stack[Top-4]=Stack[Top]+Stack[Top-4]; Top=Top-4
    Pop  AddressA    # Memory[AddressA]=Stack[Top]; Top=Top-4

- Compact code is important for heralded network computers, where programs must be downloaded over the Internet (e.g., Java-based applications).

Other Types of Architecture

High-level-language architecture

- In the 1960s, systems software was rarely written in high-level languages, and virtually every commercial operating system before Unix was written in assembly.
- Some people blamed the code density on the instruction set rather than the programming language.
- A machine design philosophy was advocated with the goal of making the hardware more like high-level languages.
- The effectiveness of high-level languages, memory size limitations, and the lack of efficient compilers doomed this philosophy to a historical footnote.

Reduced instruction set architecture

- With the recent development in compiler technology and expanded memory sizes, fewer programmers are using assembly-level coding.
- Instruction set architecture became measurable in the way compilers, rather than programmers, use it.
- RISC architecture favors simplifying hardware design over enriching the offered set of instructions, relying on compilers to use them effectively to perform complex operations.
- Virtually all new architectures since 1982 follow the RISC philosophy of fixed instruction lengths, load-store operations, and limited addressing modes.

Evolution of Instruction Sets

[Figure: evolution of instruction sets, top to bottom]

- Single Accumulator (EDSAC, 1950)
- Accumulator + Index Registers (Manchester Mark I, IBM 700 series, 1953)
- Separation of Programming Model from Implementation
  - High-level Language Based (B5000, 1963)
  - Concept of a Family (IBM 360, 1964)
- General Purpose Register Machines
  - Complex Instruction Sets (VAX, Intel 432, 1977-80)
  - Load/Store Architecture (CDC 6600, Cray 1, 1963-76)
- RISC (MIPS, SPARC, HP-PA, IBM RS6000, ..., 1987)

Register-Memory Architectures

Effect of the number of memory operands

# memory addresses | Max. number of operands | Examples
0                  | 3                       | SPARC, MIPS, PowerPC, ALPHA
1                  | 2                       | Intel 80x86, Motorola 68000
2                  | 2                       | VAX (also has 3 operands format)
3                  | 3                       | VAX (also has 2 operands format)

Interpreting Memory Addressing

- The address of a word matches the byte address of one of its 4 bytes
- The addresses of sequential words differ by 4 (word size in bytes)
- Word addresses are multiples of 4 (alignment restriction)
- Machines that use the address of the leftmost byte as the word address are called "Big Endian", and those that use the rightmost byte are called "Little Endian"
- Misalignment complicates memory access and causes programs to run slower (some machines do not allow misaligned memory access at all)
- Byte ordering can be a problem when exchanging data among different machines
- Byte addresses affect array index calculation, to account for word addressing and the offset within the word

Object addressed | Aligned at byte offsets | Misaligned at byte offsets
Byte             | 0,1,2,3,4,5,6,7         | Never
Half word        | 0,2,4,6                 | 1,3,5,7
Word             | 0,4                     | 1,2,3,5,6,7
Double word      | 0                       | 1,2,3,4,5,6,7
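Both the alignment table and the byte-ordering point can be reproduced in a few lines. This is a minimal Python sketch; the function names and the sample word 0x0A0B0C0D are chosen for illustration, and object sizes of 1, 2, 4, and 8 bytes are assumed for byte, half word, word, and double word.

```python
def is_aligned(address, size):
    """An object of `size` bytes is aligned when its address is a multiple of size."""
    return address % size == 0


def aligned_offsets(size, span=8):
    """Byte offsets within a `span`-byte block at which the object is aligned."""
    return [offset for offset in range(span) if is_aligned(offset, size)]


def word_bytes(value, byteorder):
    """Bytes of a 32-bit word in increasing address order."""
    return list(value.to_bytes(4, byteorder))


big = word_bytes(0x0A0B0C0D, "big")        # most significant byte at lowest address
little = word_bytes(0x0A0B0C0D, "little")  # least significant byte at lowest address
```

The two byte lists are mirror images of each other, which is exactly the problem that arises when a word is written by one kind of machine and read byte by byte on the other.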

Addressing Modes

- Addressing modes refer to how the location of an operand (the effective address) is specified
- Addressing modes have the ability to:
  - Significantly reduce instruction counts
  - Increase the average CPI
  - Increase the complexity of building a machine
- The VAX machine is used for benchmark data, since it supports a wide range of memory addressing modes
- Common addressing modes can be classified based on:
  - the source of the data: register, immediate, or memory
  - the address calculation: direct or indirect
- An indexed addressing mode is usually provided to allow efficient implementation of loops and array access

Examples of Addressing Modes

Addressing mode | Example | Meaning | When used
Register | Add R4,R3 | Regs[R4] ← Regs[R4] + Regs[R3] | When a value is in a register
Immediate | Add R4,#3 | Regs[R4] ← Regs[R4] + 3 | For constants
Displacement | Add R4,100(R1) | Regs[R4] ← Regs[R4] + Mem[100 + Regs[R1]] | Accessing local variables
Register indirect | Add R4,(R1) | Regs[R4] ← Regs[R4] + Mem[Regs[R1]] | Accessing using a pointer or a computed address
Indexed | Add R4,(R1 + R2) | Regs[R4] ← Regs[R4] + Mem[Regs[R1] + Regs[R2]] | Sometimes useful in array addressing: R1 = base of the array; R2 = index amount
Direct or absolute | Add R4,(1001) | Regs[R4] ← Regs[R4] + Mem[1001] | Sometimes useful for accessing static data; address constant may need to be large
Memory indirect or memory deferred | Add R4,@(R3) | Regs[R4] ← Regs[R4] + Mem[Mem[Regs[R3]]] | If R3 is the address of the pointer p, then the mode yields *p
Autoincrement | Add R4,(R2)+ | Regs[R4] ← Regs[R4] + Mem[Regs[R2]]; Regs[R2] ← Regs[R2] + d | Useful for stepping through arrays within a loop. R2 points to the start of the array; each reference increments R2 by d
Autodecrement | Add R4,-(R2) | Regs[R2] ← Regs[R2] - d; Regs[R4] ← Regs[R4] + Mem[Regs[R2]] | Same use as autoincrement. Autodecrement/increment can also act as push/pop to implement a stack
Scaled | Add R4,100(R2)[R3] | Regs[R4] ← Regs[R4] + Mem[100 + Regs[R2] + Regs[R3] * d] | Used to index arrays
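The "Meaning" column of the addressing-mode table boils down to a different effective-address formula per mode. The following minimal Python sketch computes that address for the memory modes that have no side effects; the function signature, the mode names as strings, and the sample register and memory contents are all assumptions made for illustration.

```python
def effective_address(mode, regs, mem, *, reg=None, index=None, disp=0, scale=1):
    """Return the memory address an operand refers to under the given mode."""
    if mode == "displacement":        # disp(reg)
        return disp + regs[reg]
    if mode == "register_indirect":   # (reg)
        return regs[reg]
    if mode == "indexed":             # (reg + index)
        return regs[reg] + regs[index]
    if mode == "direct":              # (disp)
        return disp
    if mode == "memory_indirect":     # @(reg): the address is itself read from memory
        return mem[regs[reg]]
    if mode == "scaled":              # disp(reg)[index], element size = scale
        return disp + regs[reg] + regs[index] * scale
    raise ValueError(f"unsupported mode: {mode}")


# Hypothetical machine state
regs = {"R1": 200, "R2": 8, "R3": 300}
mem = {300: 1001}

ea_disp = effective_address("displacement", regs, mem, reg="R1", disp=100)
ea_index = effective_address("indexed", regs, mem, reg="R1", index="R2")
ea_indirect = effective_address("memory_indirect", regs, mem, reg="R3")
ea_scaled = effective_address("scaled", regs, mem, reg="R2", index="R3", disp=100, scale=4)
```

Memory indirect is the only mode here that needs a memory read just to find the address, which hints at why richer addressing modes can raise the average CPI.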

Addressing Modes for Signal Processing

DSP offers special addressing modes to better serve popular algorithms.

Modulo addressing

- Since DSP deals with continuous data streams, circular buffers are widely used
- Circular or modulo addressing allows automatic increment and decrement, and resets the pointer when reaching the end of the buffer

Reverse addressing

- The resulting address is the bit-reversed order of the current address
- Reverse addressing mode expedites the access, which otherwise requires a number of logical instructions or extra memory accesses

Fast Fourier Transform

    0 (000)  ->  0 (000)
    1 (001)  ->  4 (100)
    2 (010)  ->  2 (010)
    3 (011)  ->  6 (110)
    4 (100)  ->  1 (001)
    5 (101)  ->  5 (101)
    6 (110)  ->  3 (011)
    7 (111)  ->  7 (111)

Special features require either hand coding or a compiler that uses such features (GNU would not be a good choice).
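Both DSP addressing modes are easy to state in software, which also shows the work the hardware saves on every access. This is a minimal Python sketch; the function names are illustrative, and an 8-entry buffer with 3-bit indices is assumed to match the FFT mapping above.

```python
def next_circular(index, step, length):
    """Modulo addressing: advance a pointer and wrap at the buffer end."""
    return (index + step) % length


def bit_reverse(index, bits):
    """Reverse addressing: mirror the low `bits` bits of the index."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


# Access order for an 8-point FFT
fft_order = [bit_reverse(i, 3) for i in range(8)]

# Stepping forward past the end, and backward past the start, of an 8-entry buffer
wrapped_forward = next_circular(7, 1, 8)
wrapped_backward = next_circular(0, -1, 8)
```

The loop in `bit_reverse` is the "number of logical instructions" a processor without this mode would have to execute for each element.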

Operations of the Computer Hardware

"There must certainly be instructions for performing the fundamental arithmetic operations."
    Burks, Goldstine, and von Neumann, 1947

- Assembly language is a symbolic representation of what the processor actually understands
- The MIPS assembler allows only one instruction per line and ignores comments following # until the end of the line

Example: translation of a segment of a C program to MIPS assembly instructions

C:

    f = (g + h) - (i + j)

MIPS:

    add t0, g, h     # temp variable t0 contains "g + h"
    add t1, i, j     # temp variable t1 contains "i + j"
    sub f, t0, t1    # f = t0 - t1 = (g + h) - (i + j)

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 30101

$perator type 7amples

Arithmetic and logical Integer arithmetic and logical operations6 add and subtract or

ata Transfer 3oads$stores (move instructions on machines ith memory addressing)

Control 8ranch Oump procedure call and return trap

System 5perating system call irtual memory management instructions

Lloating point Lloating point instructions6 add multiply

ecimal ecimal add decimal multiply decimal to character conversion

String String move string compare string search

raphics Piel operations compression2decompression operations

$perations in the Instruction Set

Arithmetic logical data transfer and control are almost standard categoriesfor all machines

System instructions are reJuired for multi$programming environmentsalthough support for system functions varies

ecimal and string instructions can be primitives eg I8M gt and the A

Support for floating point decimal string and graphics can be optionallysometimes provided via co$processor

Some machines rely on the compiler to synthesi-e special operations suchas string handling from simpler instructions

Operations for Media and Signal Processing

Single instruction multiple data (SIMD) and vector instructions are often supported in DSPs, which are commonly used in multimedia and signal processing applications

Partitioned Add (integer)
- Perform multiple narrow additions in parallel on a 64-bit ALU, since most data are narrow
- Increases ALU throughput for multimedia applications

Paired single operations (float)
- Allow the same register to act as two operands to the same operation
- Handy in dealing with vertices and coordinates

Multiply and accumulate
- Very handy for calculating dot products of vectors (signal processing) and matrix multiplication
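The two integer operations above can be sketched in a few lines of Python. This is only an illustration: the 16-bit lane width, the four-lane layout, and the function names `partitioned_add` and `dot_product` are assumptions made for the sketch, not details taken from the slide.

```python
def partitioned_add(a: int, b: int, lane_bits: int = 16, lanes: int = 4) -> int:
    """Add two packed words lane by lane; a carry never crosses a lane boundary."""
    mask = (1 << lane_bits) - 1
    result = 0
    for lane in range(lanes):
        shift = lane * lane_bits
        lane_sum = ((a >> shift) & mask) + ((b >> shift) & mask)
        result |= (lane_sum & mask) << shift  # wrap around inside the lane
    return result


def dot_product(xs, ys):
    """Dot product written as a chain of multiply-accumulate steps."""
    acc = 0
    for x, y in zip(xs, ys):
        acc += x * y  # one multiply-accumulate per pair of elements
    return acc
```

Note how the lane holding 0xFFFF wraps to zero instead of carrying into its neighbour, which is what lets one wide ALU behave like several narrow ones.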

Frequency of Operations Usage

Make the common case fast by focusing on these operations

The most widely executed instructions are the simple operations of an instruction set

The following is the average usage in SPECint on the Intel 80x86:

Rank  80x86 instruction        Integer average (% total executed)
1     Load                     22%
2     Conditional branch       20%
3     Compare                  16%
4     Store                    12%
5     Add                      8%
6     And                      6%
7     Sub                      5%
8     Move register-register   4%
9     Call                     1%
10    Return                   1%
      Total                    96%

Control Flow Instructions

Jump for unconditional change in the control flow

Branch for conditional change in the control flow

Procedure calls and returns

Data is based on SPEC on Alpha

Destination Address Definition

Relative addressing w.r.t. the program counter proved to be the best choice for forward and backward branching or jumps (load address independent)

To allow for dynamic loading of library routines, register indirect addressing allows addresses to be loaded in special registers (e.g., virtual functions in C++ and system calls in a case statement)

Data is based on SPEC on Alpha
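A small sketch may help show why PC-relative branches are independent of the load address. It assumes a MIPS-like convention (4-byte instructions, offset counted in instructions from the instruction after the branch); the function name `branch_target` and the addresses used are illustrative only.

```python
def branch_target(pc: int, offset_words: int, instr_bytes: int = 4) -> int:
    """Target of a PC-relative branch: next instruction address plus a signed offset."""
    return pc + instr_bytes + offset_words * instr_bytes


def relocated_distance(load_address: int, offset_words: int) -> int:
    """Distance from the branch to its target when the code is loaded at load_address."""
    return branch_target(load_address, offset_words) - load_address
```

Because only the offset is encoded in the instruction, the distance between a branch and its target stays the same wherever the loader places the code, so no patching is needed.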

Condition Evaluation

Compare-and-branch can be efficient if the majority of conditions are comparisons with zero

Remember to focus on the common case

Based on SPEC on MIPS

Frequency of Types of Comparison

Data is based on SPEC on Alpha

A different benchmark and machine set new design priorities

DSPs support a repeat instruction for "for" loops (vectors) using registers

Supporting Procedures

Execution of a procedure follows these steps:
- Store parameters in a place accessible to the procedure
- Transfer control to the procedure
- Acquire the storage resources needed for the procedure
- Perform the desired task
- Store the result value in a place accessible to the calling program
- Return control to the point of origin

The hardware provides a program counter to trace instruction flow and manage transfer of control

Parameter Passing
- Registers can be used for passing a small number of parameters
- A stack is used to spill registers of the current context and make room for the called procedure to run, and to allow for large parameters to be passed
- Storage of machine state can be performed by caller or callee
- Handling of shared variables is important to ensure correct semantics and thus requires clear specifications in the library interface

Global variables stored in registers need careful handling

Type and Size of Operands

The type of an operand is designated by encoding it in the instruction's operation code

The type of an operand, e.g., single precision float, effectively gives its size

Common operand types include character, half word and word size integer, and single- and double-precision floating point

Characters are almost always in ASCII, integers are in two's complement, and floating point is in IEEE 754

The 16-bit Unicode used in Java is gaining popularity due to its support for international character sets

For business applications, some architectures support a decimal format in binary coded decimal (BCD)

Depending on the size of the word, the complexity of handling different operand types differs

DSP offers fixed point data types to support high precision floating point arithmetic and to allow sharing a single exponent for multiple numbers

For graphics applications, vertex and pixel operands are added features
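Two of the integer encodings named above, two's complement and BCD, can be illustrated with a short sketch. The function names are hypothetical and the packed layout (one decimal digit per 4-bit nibble) is the assumed BCD variant.

```python
def to_twos_complement(value: int, bits: int = 32) -> int:
    """Bit pattern of a signed integer in two's complement, as an unsigned int."""
    if not -(1 << (bits - 1)) <= value < (1 << (bits - 1)):
        raise ValueError("value does not fit in the given width")
    return value & ((1 << bits) - 1)


def to_packed_bcd(value: int) -> int:
    """Non-negative integer to packed BCD: one decimal digit per 4-bit nibble."""
    if value < 0:
        raise ValueError("this sketch handles non-negative values only")
    result, shift = 0, 0
    while True:
        value, digit = divmod(value, 10)
        result |= digit << shift
        shift += 4
        if value == 0:
            return result
```

Printing a packed BCD value in hexadecimal shows the decimal digits directly, which is why the format suits business applications that convert to and from text often.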

Size of Operands

Double-word data type is used for double-precision floating point operations and address storage in machines with a 64-bit wide address bus

Words are used for integer operations and for 32-bit address bus machines

Because of the mix in SPEC, word and double-word data types dominate

Frequency of reference by size based on SPEC2000 on Alpha

Instruction Representation

Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary, or base 2)

Numbers are stored in computers as a series of high and low electronic signals (binary numbers)

Binary digits are called bits and are considered the atom of computing

Each piece of an instruction is a number, and placing these numbers together forms the instruction

The assembler translates the symbolic assembly instructions into machine language instructions (machine code)

Example:

Assembly: add $t0, $s1, $s2

M/C language (decimal):  0       17      18      8       0       32
M/C language (binary):   000000  10001   10010   01000   00000   100000
Field width:             6 bits  5 bits  5 bits  5 bits  5 bits  6 bits

Note: the MIPS compiler by default maps $s0..$s7 to reg. 16-23 and $t0..$t7 to reg. 8-15
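Placing the field numbers together can be sketched as shifts and ORs. The sketch assumes the standard MIPS R-format field widths (6, 5, 5, 5, 5, 6 bits) and uses the field values 0, 17, 18, 8, 0, 32 for `add $t0, $s1, $s2`; the function name `encode_r_type` is illustrative.

```python
def encode_r_type(op: int, rs: int, rt: int, rd: int, shamt: int, funct: int) -> int:
    """Pack the six R-format fields into one 32-bit instruction word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


# add $t0, $s1, $s2: $s1 is register 17, $s2 is register 18, $t0 is register 8
word = encode_r_type(op=0, rs=17, rt=18, rd=8, shamt=0, funct=32)
bits = format(word, "032b")
```

Reading `bits` in groups of 6, 5, 5, 5, 5, and 6 digits gives back the binary fields of the example.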

Encoding an Instruction Set

Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation

The operation is typically specified in one field called the opcode

The addressing mode for the operand can be encoded with the operation, or specified through a separate identifier in case of a large number of supported modes

The architecture must balance between several competing factors:
- Desire to support as many registers and addressing modes as possible
- Effect of operand specification on the size of the instruction (program)
- Desire to simplify instruction fetching and decoding during execution

Fixed size instruction encoding simplifies the CPU design while limiting the addressing modes supported

An architect caring about the code size can use variable size encoding

A hybrid approach is to allow variability by supporting multiple-sized instructions

Encoding Examples

MIPS Instruction Format

Register-format instructions

op      rs      rt      rd      shamt   funct
6 bits  5 bits  5 bits  5 bits  5 bits  6 bits

op: Basic operation of the instruction, traditionally called opcode
rs: The first register source operand
rt: The second register source operand
rd: The register destination operand; it gets the result of the operation
shamt: Shift amount
funct: This field selects the specific variant of the operation of the op field

Immediate-type instructions

op      rs      rt      address
6 bits  5 bits  5 bits  16 bits

Some instructions need longer fields than provided, for large value constants

The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register

Example: lw $t0, 32($s3)   # Temporary register $t0 gets A[8]

Instruction  Format  op  rs   rt   rd   shamt  funct  address
add          R       0   reg  reg  reg  0      32     N/A
sub          R       0   reg  reg  reg  0      34     N/A
lw           I       35  reg  reg  N/A  N/A    N/A    address
sw           I       43  reg  reg  N/A  N/A    N/A    address
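The immediate format can be sketched the same way. The example assumes the standard MIPS I-format layout (6-bit op, two 5-bit registers, 16-bit signed address field) and the usual register numbers ($s3 = 19, $t0 = 8); `encode_i_type` is a hypothetical helper name.

```python
def encode_i_type(op: int, rs: int, rt: int, imm: int) -> int:
    """Pack an I-format instruction; imm must fit in a signed 16-bit field."""
    if not -(1 << 15) <= imm < (1 << 15):
        raise ValueError("immediate does not fit in 16 bits")
    return (op << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


# lw $t0, 32($s3): op = 35, base register $s3 = 19, destination $t0 = 8, offset 32
lw_word = encode_i_type(op=35, rs=19, rt=8, imm=32)
```

The range check makes the ±2^15 byte reach of a load explicit: an offset outside that window cannot be expressed in a single instruction.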

The Stored Program Concept

Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept

Today's computers are built on two key principles:
- Instructions are represented as numbers
- Programs can be stored in memory to be read or written just like numbers

The power of the concept: memory can contain
- the source code for an editor
- the compiled m/c code for the editor
- the text that the compiled program is using
- the compiler that generated the code

[Figure: Processor connected to Memory holding: Accounting program (machine code), Editor program (machine code), C compiler (machine code), Payroll data, Book text, Source code in C for editor program]

Compiling if-then-else in MIPS

Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiler's code for the following C if statement?

if (i == j) f = g + h; else f = g - h;

[Figure: flowchart. Test i == j; if i = j, f = g + h; if i ≠ j, go to Else: f = g - h; both paths meet at Exit:]

MIPS:

      bne $s3, $s4, Else    # go to Else if i ≠ j
      add $s0, $s1, $s2     # f = g + h (skipped if i ≠ j)
      j Exit
Else: sub $s0, $s1, $s2     # f = g - h (skipped if i = j)
Exit:

Typical Compilation

Major Types of Optimization

Optimization name: Explanation (Frequency; N.M. = not measured)

High-level: at or near source level, machine independent
- Procedure integration: Replace procedure call by procedure body (N.M.)

Local: within straight line code
- Common subexpression elimination: Replace two instances of the same computation by a single copy (18%)
- Constant propagation: Replace all instances of a variable that is assigned a constant with the constant (22%)
- Stack height reduction: Rearrange expression tree to minimize resources needed for expression evaluation (N.M.)

Global: across a branch
- Global common subexpression elimination: Same as local, but this version crosses branches (13%)
- Copy propagation: Replace all instances of a variable A that has been assigned X (i.e., A = X) with X (11%)
- Code motion: Remove code from a loop that computes the same value each iteration of the loop (16%)
- Induction variable elimination: Simplify/eliminate array addressing calculations within loops (2%)

Machine-dependent: depends on machine knowledge
- Strength reduction: Many examples, such as replacing a multiply by a constant with adds and shifts (N.M.)
- Pipeline scheduling: Reorder instructions to improve pipeline performance (N.M.)
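As a small illustration of strength reduction, a multiply by the constant 10 can be rewritten as two shifts and an add, since 10x = 8x + 2x. The constant and the function names are chosen for the sketch; a real compiler picks the rewrite based on the target machine's instruction costs.

```python
def times_ten_multiply(x: int) -> int:
    """Original form: one multiply by a constant."""
    return x * 10


def times_ten_shift_add(x: int) -> int:
    """Strength-reduced form: 10*x = 8*x + 2*x, using two shifts and one add."""
    return (x << 3) + (x << 1)
```

Both functions compute the same value; the second only pays off on machines where a multiply costs more than two shifts and an add, which is why the slide lists it as machine-dependent.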

Effect of Compiler Optimization

[Figure: effect of optimization by Program and Compiler Optimization Level]

Measurements taken on S

Level 0: non-optimized code

Level 1: local optimization

Level 2: global optimization, s/w pipelining

Level 3: adds procedure integration

Compiler Support for Multimedia Instr.

Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics)

Intel added a new set of instructions called Streaming SIMD Extension

A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer

Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations
- Strided addressing allows memory access in increments larger than one
- Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data
- Supporting vector operations without strided addressing, such as Intel's MMX, limits the potential speedup

Such limited support for vector processing makes the use of vectorizing compiler optimization unpopular and restricts its scope to hand coded routines

SIMD instructions on MMX and AltiVec tend to be solutions, not primitives
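The two vector addressing styles can be sketched with a Python list standing in for memory. The function names and the list-based memory model are assumptions for illustration; real vector hardware performs these accesses in a single instruction.

```python
def strided_load(memory, base, stride, count):
    """Load count elements starting at base, stepping stride locations each time."""
    return [memory[base + i * stride] for i in range(count)]


def gather(memory, indices):
    """Load the elements whose addresses are listed in an index vector."""
    return [memory[i] for i in indices]


def scatter(memory, indices, values):
    """Store values to the addresses listed in an index vector."""
    for i, v in zip(indices, values):
        memory[i] = v
```

A strided load with a stride equal to the row length reads one column of a row-major matrix, which is the typical case that unit-stride-only SIMD extensions handle poorly.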

Starting a Program

[Figure: C program -> Compiler -> Assembly language program -> Assembler -> Object: Machine language module, together with Object: Library routine (machine language) -> Linker -> Executable: Machine language program -> Loader -> Memory]

Linker
- Place code and data modules symbolically in memory
- Determine the address of data and instruction labels
- Patch both internal and external references

Object files for Unix typically contain:
- Header: size and position of components
- Text segment: machine code
- Data segment: static and dynamic variables
- Relocation info: identify absolute memory references
- Symbol table: name and location of labels, procedures, and variables
- Debugging info: mapping source to object code, break points, etc.

Loading Executable Program

[Figure: memory layout, from high to low addresses: Stack ($sp at 7fff ffff hex), Dynamic data, Static data (starting at 1000 0000 hex, $gp at 1000 8000 hex), Text (pc at 0040 0000 hex), Reserved (from address 0)]

To load an executable, the operating system follows these steps:
- Reads the executable file header to determine the size of the text and data segments
- Creates an address space large enough for the text and data
- Copies the instructions and data from the executable file into memory
- Copies the parameters (if any) to the main program onto the stack
- Initializes the machine registers and sets the stack pointer to the first free location
- Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program

Instruction Set Design Issues

Instruction set design issues
- Number of Addresses
- Flow of Control
- Operand Types
- Addressing Modes
- Instruction Types
- Instruction Formats

Number of Addresses

Four categories

3-address machines
- 2 for the source operands and one for the result

2-address machines
- One address doubles as source and result

1-address machines
- Accumulator machines
- Accumulator is used for one source and result

0-address machines
- Stack machines
- Operands are taken from the stack
- Result goes onto the stack

Number of Addresses (cont.)

Three-address machines
- Two for the source operands, one for the result
- RISC processors use three addresses
- Sample instructions

add dest,src1,src2     ; M(dest)=[src1]+[src2]
sub dest,src1,src2     ; M(dest)=[src1]-[src2]
mult dest,src1,src2    ; M(dest)=[src1]*[src2]

Three addresses: Operand 1, Operand 2, Result

Example: a = b + c

Three-address instruction formats are not common because they require a relatively long instruction format to hold the three address references

Number of Addresses (cont.)

Example

C statement

A = B + C * D - E + F + A

Equivalent code:

mult T,C,D    ; T = C*D
add T,T,B     ; T = B+C*D
sub T,T,E     ; T = B+C*D-E
add T,T,F     ; T = B+C*D-E+F
add A,T,A     ; A = B+C*D-E+F+A

Number of Addresses (cont.)

Two-address machines
- One address doubles (for source operand and result)
- Last example makes a case for it
  - Address T is used twice
- Sample instructions

load dest,src    ; M(dest)=[src]
add dest,src     ; M(dest)=[dest]+[src]
sub dest,src     ; M(dest)=[dest]-[src]
mult dest,src    ; M(dest)=[dest]*[src]

Two addresses: one address doubles as operand and result

Example: a = a + b

The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation

Number of Addresses (cont.)

Example

C statement

A = B + C * D - E + F + A

Equivalent code:

load T,C    ; T = C
mult T,D    ; T = C*D
add T,B     ; T = B+C*D
sub T,E     ; T = B+C*D-E
add T,F     ; T = B+C*D-E+F
add A,T     ; A = B+C*D-E+F+A

Number of Addresses (cont.)

One-address machines
- Use a special set of registers called accumulators
  - Specify one source operand and receive the result
- Called accumulator machines
- Sample instructions

load addr     ; accum = [addr]
store addr    ; M[addr] = accum
add addr      ; accum = accum + [addr]
sub addr      ; accum = accum - [addr]
mult addr     ; accum = accum * [addr]

Number of Addresses (cont.)

Example

C statement

A = B + C * D - E + F + A

Equivalent code:

load C     ; load C to accum
mult D     ; accum = C*D
add B      ; accum = C*D+B
sub E      ; accum = B+C*D-E
add F      ; accum = B+C*D-E+F
add A      ; accum = B+C*D-E+F+A
store A    ; store accum contents in A

Number of Addresses (cont.)

Zero-address machines
- Stack supplies operands and receives the result
  - Special instructions to load and store use an address
- Called stack machines (Ex: HP3000, Burroughs B5500)
- Sample instructions

push addr    ; push([addr])
pop addr     ; pop([addr])
add          ; push(pop + pop)
sub          ; push(pop - pop)
mult         ; push(pop * pop)

Number of Addresses (cont.)

Example

C statement

A = B + C * D - E + F + A

Equivalent code:

push E
push C
push D
mult
push B
add
sub
push F
add
push A
add
pop A
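The stack code for A = B + C * D - E + F + A can be checked with a tiny interpreter. This is a sketch under stated assumptions: the instruction spelling (`push`, `pop`, `add`, `sub`, `mult`), the dictionary used as memory, and the rule that `sub` computes first-popped minus second-popped (following the push(pop - pop) definition) are modelling choices, not a description of any real stack machine.

```python
def run_stack_machine(program, memory):
    """Execute zero-address code; only push and pop name a memory address."""
    stack = []
    for instr in program:
        op, *args = instr.split()
        if op == "push":
            stack.append(memory[args[0]])
        elif op == "pop":
            memory[args[0]] = stack.pop()
        else:
            first = stack.pop()   # operand on top of the stack
            second = stack.pop()  # operand just below it
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)
            elif op == "mult":
                stack.append(first * second)
            else:
                raise ValueError(f"unknown instruction: {instr}")
    return memory


PROGRAM = ["push E", "push C", "push D", "mult", "push B", "add",
           "sub", "push F", "add", "push A", "add", "pop A"]
values = run_stack_machine(PROGRAM, {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6})
```

Pushing E first is what makes the later `sub` produce (B + C*D) - E: by then E sits below the partial sum on the stack.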

Load/Store Architecture

Instructions expect operands in internal processor registers
- Special LOAD and STORE instructions move data between registers and memory
- RISC uses this architecture
- Reduces instruction length

Load/Store Architecture (cont.)

Sample instructions

load Rd,addr       ; Rd = [addr]
store addr,Rs      ; (addr) = Rs
add Rd,Rs1,Rs2     ; Rd = Rs1 + Rs2
sub Rd,Rs1,Rs2     ; Rd = Rs1 - Rs2
mult Rd,Rs1,Rs2    ; Rd = Rs1 * Rs2

Number of Addresses (cont.)

Example

C statement

A = B + C * D - E + F + A

Equivalent code:

load R1,B
load R2,C
load R3,D
load R4,E
load R5,F
load R6,A
mult R2,R2,R3
add R2,R2,R1
sub R2,R2,R4
add R2,R2,R5
add R2,R2,R6
store A,R2

Flow of Control

Default is sequential flow

Several instructions alter this default execution
- Branches
  - Unconditional
  - Conditional
  - Delayed branches
- Procedure calls
  - Delayed procedure calls

Flow of Control (cont.)

Branches
- Unconditional
  - Absolute address
  - PC-relative
    * Target address is specified relative to PC contents
    * Relocatable code
- Example: MIPS
  - Absolute address
    j target
  - PC-relative
    b target


Flow of Control (cont.)

Branches
- Conditional
  - Jump is taken only if the condition is met
- Two types
  - Set-Then-Jump
    * Condition testing is separated from branching
    * Condition code registers are used to convey the condition test result
    * Condition code registers keep a record of the status of the last ALU operation, such as overflow condition
  - Example: Pentium code

cmp AX,BX    ; compare AX and BX
je target    ; jump if equal

Flow of Control (cont.)

  - Test-and-Jump
    * Single instruction performs condition testing and branching
  - Example: MIPS instruction

beq Rsrc1,Rsrc2,target    ; jumps to target if Rsrc1 = Rsrc2

Delayed branching
- Control is transferred after executing the instruction that follows the branch instruction
  - This instruction slot is called the delay slot
- Improves efficiency
- Highly pipelined RISC processors support it

Flow of Control (cont.)

Procedure calls
- Facilitate modular programming
- Require two pieces of information to return
  - End of procedure
    * Pentium uses the ret instruction
    * MIPS uses the jr instruction
  - Return address
    * In a (special) register: MIPS allows any general-purpose register
    * On the stack: Pentium

Flow of Control (cont.)

[Figure: delay slot]

Parameter Passing

Two basic techniques
- Register-based (e.g., PowerPC, MIPS)
  - Internal registers are used
    * Faster
    * Limit the number of parameters
    * Recursive procedure
- Stack-based (e.g., Pentium)
  - Stack is used
    * More general

2. Operand Types

Instructions support basic data types
- Characters
- Integers
- Floating-point

Instruction overload
- Same instruction for different data types
- Example: Pentium

mov AL,address     ; loads an 8-bit value
mov AX,address     ; loads a 16-bit value
mov EAX,address    ; loads a 32-bit value

Operand Types (cont.)

Separate instructions
- Instructions specify the operand size
- Example: MIPS

lb Rdest,address    ; loads a byte
lh Rdest,address    ; loads a halfword (16 bits)
lw Rdest,address    ; loads a word (32 bits)
ld Rdest,address    ; loads a doubleword (64 bits)

Similar instructions exist for store

3. Addressing Modes

How the operands are specified
- Operands can be in three places
  - Registers
    * Register addressing mode
  - Part of instruction
    * Constant
    * Immediate addressing mode
    * All processors support these two addressing modes
  - Memory
    * Difference between RISC and CISC
    * CISC supports a large variety of addressing modes
    * RISC follows load/store architecture

4. Instruction Types

Several types of instructions
- Data movement
  - Pentium: mov dest,src
  - Some do not provide direct data movement instructions
  - Indirect data movement

add Rdest,Rsrc,0    ; Rdest = Rsrc+0

- Arithmetic and Logical
  - Arithmetic
    * Integer and floating-point, signed and unsigned
    * add, subtract, multiply, divide
  - Logical
    * and, or, not, xor

Instruction Types (cont.)

Condition code bits
- S: Sign bit (0 = +, 1 = -)
- Z: Zero bit (0 = nonzero, 1 = zero)
- O: Overflow bit (0 = no overflow, 1 = overflow)
- C: Carry bit (0 = no carry, 1 = carry)

Example: Pentium

cmp count,25    ; compare count to 25
                ; subtract 25 from count
je target       ; jump if equal
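How a compare sets the condition code bits can be sketched as a subtraction whose result is discarded. The 32-bit width, the dictionary of flags, and the name `compare_flags` are assumptions for the sketch; the carry bit follows the x86 convention of signalling a borrow.

```python
def compare_flags(a: int, b: int, bits: int = 32) -> dict:
    """Condition code bits produced by computing a - b in a fixed-width ALU."""
    mask = (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    a &= mask
    b &= mask
    diff = (a - b) & mask
    return {
        "S": int(bool(diff & sign_bit)),                  # sign of the result
        "Z": int(diff == 0),                              # result is zero
        "O": int(bool((a ^ b) & (a ^ diff) & sign_bit)),  # signed overflow
        "C": int(a < b),                                  # borrow (unsigned a < b)
    }


def jump_if_equal(flags: dict) -> bool:
    """je is taken when the zero bit is set."""
    return flags["Z"] == 1
```

This separation is the set-then-jump style: the compare only records status, and a later branch instruction inspects the recorded bits.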

Instruction Types (cont.)

Flow control and I/O instructions
- Branch
- Procedure call
- Interrupts

I/O instructions
- Memory-mapped I/O
  * Most processors support memory-mapped I/O
  * No separate instructions for I/O
- Isolated I/O
  * Pentium supports isolated I/O
  * Separate I/O instructions

in AX,io_port     ; read from an I/O port
out io_port,AX    ; write to an I/O port

5. Instruction Formats

Two types
- Fixed-length
  - Used by RISC processors
  - 32-bit RISC processors use 32-bit wide instructions
    * Examples: SPARC, MIPS, PowerPC
- Variable-length
  - Used by CISC processors
  - Memory operands need more bits to specify

Opcode
- Major and exact operation

Examples of Instruction Formats

RISC (Reduced Instruction Set Computer)
versus
CISC (Complex Instruction Set Computer)

RISC vs. CISC

The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute

RISC systems access memory only with explicit load and store instructions

In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable

RISC vs. CISC

The difference between CISC and RISC becomes evident through the basic computer performance equation:

time/program = (time/cycle) × (cycles/instruction) × (instructions/program)

RISC systems shorten execution time by reducing the clock cycles per instruction

CISC systems improve performance by reducing the number of instructions per program

RISC vs. CISC

The simple instruction set of RISC machines enables control units to be hardwired for maximum speed

The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time

With fixed-length instructions, RISC lends itself to pipelining and speculative execution

RISC vs. CISC

Consider the two program fragments:

CISC:
mov ax, 10
mov bx, 5
mul bx, ax

RISC:
       mov ax, 0
       mov bx, 10
       mov cx, 5
Begin: add ax, bx
       loop Begin

The total clock cycles for the CISC version might be:
(2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles

While the clock cycles for the RISC version are:
(3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles

With the RISC clock cycle being shorter, RISC gives us much faster execution speeds
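The cycle counts in this comparison can be reproduced with a short calculation. The sketch assumes the per-instruction costs used in the example (1 cycle for each mov, add, and loop; 30 cycles for the mul) and a loop that runs 5 times; `total_cycles` is an illustrative helper, and real cycle counts depend on the implementation.

```python
def total_cycles(mix):
    """Sum of (instruction count x cycles per instruction) over an instruction mix."""
    return sum(count * cycles for count, cycles in mix)


# CISC fragment: 2 movs at 1 cycle each, 1 mul at 30 cycles
cisc_cycles = total_cycles([(2, 1), (1, 30)])

# RISC fragment: 3 movs, then 5 adds and 5 loop instructions, all at 1 cycle
risc_cycles = total_cycles([(3, 1), (5, 1), (5, 1)])
```

The RISC version executes more instructions (13 versus 3) yet needs fewer cycles, which is the trade-off the performance equation describes.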

RISC vs. CISC

Because of their load-store ISAs, RISC architectures require a large number of CPU registers

These registers provide fast access to data during sequential program execution

They can also be employed to reduce the overhead typically caused by passing parameters to subprograms

Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers

RISC s CISCRISC s CISC

RISC s CISCRISC s CISC

RISC vs. CISC

This is how registers can be overlapped in a RISC system.

The current window pointer (CWP) points to the active register window.

[Figure: overlapping register windows]
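A minimal Python sketch of overlapping register windows follows. It is a simplification under stated assumptions: the window count, the group size of 8, the class name and the choice to increment the current window pointer on a call are all illustrative, and window overflow (spilling to memory) is not modeled.

```python
class RegisterWindows:
    """Overlapping windows: a caller's 'out' group is the callee's 'in' group."""

    def __init__(self, n_windows=4, group_size=8):
        self.n_windows = n_windows
        self.group_size = group_size
        # Each window owns one in/out group and one local group.
        self.physical = [0] * (n_windows * 2 * group_size)
        self.cwp = 0  # current window pointer

    def _index(self, group, i):
        base = self.cwp * 2 * self.group_size
        offset = {"in": 0,
                  "local": self.group_size,
                  "out": 2 * self.group_size}[group]
        # 'out' of window w lands on 'in' of window w + 1 (circularly).
        return (base + offset + i) % len(self.physical)

    def read(self, group, i):
        return self.physical[self._index(group, i)]

    def write(self, group, i, value):
        self.physical[self._index(group, i)] = value

    def call(self):
        self.cwp = (self.cwp + 1) % self.n_windows

    def ret(self):
        self.cwp = (self.cwp - 1) % self.n_windows


windows = RegisterWindows()
windows.write("out", 0, 42)     # caller places a parameter
windows.call()
param = windows.read("in", 0)   # callee sees it without any memory traffic
windows.write("in", 1, param + 1)
windows.ret()
result = windows.read("out", 1)
```

No data is copied on the call: only the window pointer moves, which is the overhead reduction the slides describe.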

RISC vs. CISC

It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.

Some RISC systems provide more extravagant instruction sets than some CISC systems.

Some systems combine both approaches.

The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.

RISC vs. CISC

RISC
• Multiple register sets.
• Three operands per instruction.
• Parameter passing through register windows.
• Single-cycle instructions.
• Hardwired control.
• Highly pipelined.

CISC
• Single register set.
• One or two register operands per instruction.
• Parameter passing through memory.
• Multiple-cycle instructions.
• Microprogrammed control.
• Less pipelined.

(continued)

RISC vs. CISC

RISC
• Simple instructions, few in number.
• Fixed-length instructions.
• Complexity in compiler.
• Only LOAD/STORE instructions access memory.
• Few addressing modes.

CISC
• Many complex instructions.
• Variable-length instructions.
• Complexity in microcode.
• Many instructions can access memory.
• Many addressing modes.

Summary

Instruction Set Design Issues

Instruction set design issues include:

Where are operands stored?
- registers, memory, stack, accumulator

How many explicit operands are there?
- 0, 1, 2, or 3

How is the operand location specified?
- register, immediate, indirect, ...

What type and size of operands are supported?
- byte, int, float, double, string, vector, ...

What operations are supported?
- add, sub, mul, move, compare, ...

More About General Purpose Registers

Why do almost all new architectures use GPRs?

Registers are much faster than memory (even cache)
- Register values are available immediately
- When memory isn't ready, the processor must wait ("stall")

Registers are convenient for variable storage
- Compiler assigns some variables just to registers
- More compact code, since small fields specify registers (compared to memory addresses)

[Figure: memory hierarchy: Processor (Registers), Cache, Memory, Disk]

What Operations are Needed

Arithmetic + Logical
- Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
- Logical operation: AND, OR, XOR, NOT

Data Transfer - copy, load, store

Control - branch, jump, call, return

Floating Point
- ADDF, MULF, DIVF
- Same as arithmetic but usually take bigger operands

Decimal - ADDD, CONVERT

String - move, compare, search

Graphics - pixel and vertex, compression/decompression operations

Stack Architecture: Pros and Cons

Pros
• Good code density (implicit top of stack)
• Low hardware requirements
• Easy to write a simpler compiler for stack architectures

Cons
• Stack becomes the bottleneck
• Little ability for parallelism or pipelining
• Data is not always at the top of stack when needed, so additional instructions like TOP and SWAP are needed
• Difficult to write an optimizing compiler for stack architectures

Accumulator Architecture: Pros and Cons

Pros
• Very low hardware requirements
• Easy to design and understand

Cons
• Accumulator becomes the bottleneck
• Little ability for parallelism or pipelining
• High memory traffic

Memory-Memory Architecture: Pros and Cons

Pros
• Requires fewer instructions (especially if 3 operands)
• Easy to write compilers for (especially if 3 operands)

Cons
• Very high memory traffic (especially if 3 operands)
• Variable number of clocks per instruction
• With two operands, more data movements are required

Memory-Register Architecture: Pros and Cons

Pros
• Some data can be accessed without loading first
• Instruction format easy to encode
• Good code density

Cons
• Operands are not equivalent (poor orthogonality)
• Variable number of clocks per instruction
• May limit number of registers

Register-Set Architectures

Register-to-Register: Load-Store Architectures

Register-to-Memory Architectures

Memory-to-Memory Architectures

Instruction Formats

Instruction Set Architecture (ISA)

To command a computer's hardware, you must speak its language.

The words of a machine's language are called instructions, and its vocabulary is called an instruction set.

Once you learn one machine language, it is easy to pick up others:
- There are few fundamental operations that all computers must provide
- All designers have the same goal of finding a language that simplifies building the hardware and the compiler while maximizing performance and minimizing cost

Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept.

The MIPS instruction set is used as a case study.

Interface Design

A good interface:
- Lasts through many implementations (portability, compatibility)
- Is used in many different ways (generality)
- Provides convenient functionality to higher levels
- Permits an efficient implementation at lower levels

Design decisions must take into account:
- Technology
- Machine organization
- Programming languages
- Compiler technology
- Operating systems

[Figure: a single interface connecting several uses to several implementations over time]

Classifying Instruction Set Architectures

Accumulator Architecture
• Common in early stored-program computers, when hardware was so expensive
• Machine has only one register (accumulator) involved in all math & logical operations
• All operations assume the accumulator as a source operand and a destination for the operation, with the other operand stored in memory

Extended Accumulator Architecture
• Dedicated registers for specific operations, e.g. stack and array index registers, are added
• The 8086 microprocessor is an example of such a special-purpose register architecture

General-Purpose Register Architecture
• MIPS is an example of such an architecture, where registers are not tied to a single role
• This type of instruction set can be further divided into:
  • Register-memory: allows for one operand to be in memory
  • Register-register (load-store): demands all operands to be in registers

Machine | # general-purpose registers | Architecture style | Year
Motorola 6800 | 2 | Accumulator | 1974
DEC VAX | 16 | Register-memory, memory-memory | 1977
Intel 8086 | 1 | Extended accumulator | 1978
Motorola 68000 | 16 | Register-memory | 1980
Intel 80386 | 8 | Register-memory | 1985
PowerPC | 32 | Load-store | 1992
DEC Alpha | 32 | Load-store | 1992

Compact Code and Stack Architectures

When memory is scarce, machines like the Intel 80x86 had variable-length instructions to match varying operand specifications and minimize code size.

Stack machines abandoned registers altogether, arguing that it is hard for compilers to use them efficiently.

Operands are to be pushed on a stack from memory, and the results have to be popped from the stack to memory.

Operations take their operands by default from the top of the stack and insert the results back onto the stack.

Stack machines simplify compilers and lend themselves to a compact instruction encoding, but limit compiler optimization (e.g. in math expressions).

Example: A = B + C

    Push AddressC   # Top=Top+4; Stack[Top]=Memory[AddressC]
    Push AddressB   # Top=Top+4; Stack[Top]=Memory[AddressB]
    add             # Stack[Top-4]=Stack[Top]+Stack[Top-4]; Top=Top-4
    Pop AddressA    # Memory[AddressA]=Stack[Top]; Top=Top-4

Compact code is important for heralded network computers, where programs must be downloaded over the Internet (e.g. Java-based applications).
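The push/add/pop sequence for A = B + C can be traced with a small Python sketch of a stack machine. This is an illustration only: the function name, the tuple-based program format and the use of a Python list for the stack (instead of an explicit Top pointer advancing by 4) are assumptions made for brevity.

```python
def run_stack_program(program, memory):
    """Execute push/add/pop instructions against a dict-based memory."""
    stack = []
    for op, *args in program:
        if op == "push":            # Stack[Top] = Memory[address]
            stack.append(memory[args[0]])
        elif op == "add":           # replace the two top entries by their sum
            top = stack.pop()
            stack.append(stack.pop() + top)
        elif op == "pop":           # Memory[address] = Stack[Top]
            memory[args[0]] = stack.pop()
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory


memory = {"A": 0, "B": 7, "C": 5}
program = [("push", "C"), ("push", "B"), ("add",), ("pop", "A")]
run_stack_program(program, memory)
print(memory["A"])
```

Note that "add" names no operands at all, which is where the compact encoding comes from; the cost is the extra push and pop traffic to memory.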

Other Types of Architecture

High-Level-Language Architecture
• In the 1960s, systems software was rarely written in high-level languages, and virtually every commercial operating system before Unix was written in assembly
• Some people blamed the code density on the instruction set rather than the programming language
• A machine design philosophy was advocated with the goal of making the hardware more like high-level languages
• The effectiveness of high-level languages, memory size limitations and the lack of efficient compilers doomed this philosophy to a historical footnote

Reduced Instruction Set Architecture
• With the recent development in compiler technology and expanded memory sizes, fewer programmers are using assembly-level coding
• Instruction set architecture became measurable in the way compilers, rather than programmers, use it
• RISC architecture favors simplifying hardware design over enriching the offered set of instructions, relying on compilers to effectively use them to perform complex operations
• Virtually all new architectures since 1982 follow the RISC philosophy of fixed instruction lengths, load-store operations and limited addressing modes

Evolution of Instruction Sets

[Figure: evolution tree of instruction sets]
Single Accumulator (EDSAC)
Accumulator + Index Registers (Manchester Mark I)
Separation of Programming Model from Implementation
High-level Language Based; Concept of a Family
General Purpose Register Machines
Complex Instruction Sets (VAX, Intel); Load/Store Architecture (CDC, Cray 1)
RISC (MIPS, SPARC, ...)

Register-Memory Architectures

Effect of the number of memory operands

# memory addresses | Max. number of operands | Examples
0 | 3 | SPARC, MIPS, PowerPC, Alpha
1 | 2 | Intel 80x86, Motorola 68000
2 | 2 | VAX (also has 3-operand format)
3 | 3 | VAX (also has 2-operand format)

Memory Address: Interpreting Memory Addressing

The address of a word matches the byte address of one of its 4 bytes.

The addresses of sequential words differ by 4 (word size in bytes).

Words' addresses are multiples of 4 (alignment restriction).

Machines that use the address of the leftmost byte as the word address are called "Big Endian", and those that use the rightmost byte are called "Little Endian".

Misalignment complicates memory access and causes programs to run slower. (Some machines do not allow misaligned memory access at all.)

Byte ordering can be a problem when exchanging data among different machines.

Byte addresses affect array index calculation, to account for word addressing and offset within the word.

Object addressed | Aligned at byte offsets | Misaligned at byte offsets
Byte | 0, 1, 2, 3, 4, 5, 6, 7 | Never
Half word | 0, 2, 4, 6 | 1, 3, 5, 7
Word | 0, 4 | 1, 2, 3, 5, 6, 7
Double word | 0 | 1, 2, 3, 4, 5, 6, 7
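The alignment rule and the two byte orderings can be illustrated with Python's standard struct module. The helper name is_aligned and the sample word value are illustrative assumptions; the 1-, 2-, 4- and 8-byte sizes follow the byte, half word, word and double word objects discussed in the slides.

```python
import struct


def is_aligned(address, size):
    """An object of `size` bytes is aligned when its address is a multiple of size."""
    return address % size == 0


word = 0x0A0B0C0D
big_endian = struct.pack(">I", word)      # most significant byte at the lowest address
little_endian = struct.pack("<I", word)   # least significant byte at the lowest address

print(big_endian.hex(), little_endian.hex())
print([offset for offset in range(8) if is_aligned(offset, 4)])   # aligned word offsets
```

Reading the four bytes back with the wrong ordering yields a different number, which is why byte order matters when machines exchange data.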

Addressing Modes

Addressing modes refer to how to specify the location of an operand (effective address).

Addressing modes have the ability to:
- Significantly reduce instruction counts
- Increase the average CPI
- Increase the complexity of building a machine

The VAX machine is used for benchmark data since it supports a wide range of memory addressing modes.

Famous addressing modes can be classified based on:
- the source of the data, into register, immediate or memory
- the address calculation, into direct and indirect

An indexed addressing mode is usually provided to allow efficient implementation of loops and array access.

Example of Addressing Modes

Addressing mode | Example | Meaning | When used
Register | Add R4, R3 | Regs[R4] ← Regs[R4] + Regs[R3] | When a value is in a register
Immediate | Add R4, #3 | Regs[R4] ← Regs[R4] + 3 | For constants
Displacement | Add R4, 100(R1) | Regs[R4] ← Regs[R4] + Mem[100 + Regs[R1]] | Accessing local variables
Register indirect | Add R4, (R1) | Regs[R4] ← Regs[R4] + Mem[Regs[R1]] | Accessing using a pointer or a computed address
Indexed | Add R4, (R1 + R2) | Regs[R4] ← Regs[R4] + Mem[Regs[R1] + Regs[R2]] | Sometimes useful in array addressing: R1 = base of the array; R2 = index amount
Direct or absolute | Add R4, (1001) | Regs[R4] ← Regs[R4] + Mem[1001] | Sometimes useful for accessing static data; address constant may need to be large
Memory indirect or memory deferred | Add R4, @(R3) | Regs[R4] ← Regs[R4] + Mem[Mem[Regs[R3]]] | If R3 is the address of the pointer p, then the mode yields *p
Autoincrement | Add R4, (R2)+ | Regs[R4] ← Regs[R4] + Mem[Regs[R2]]; Regs[R2] ← Regs[R2] + d | Useful for stepping through arrays within a loop. R2 points to the start of the array; each reference increments R2 by d
Autodecrement | Add R4, -(R2) | Regs[R2] ← Regs[R2] - d; Regs[R4] ← Regs[R4] + Mem[Regs[R2]] | Same use as autoincrement. Autodecrement/increment can also act as push/pop to implement a stack
Scaled | Add R4, 100(R2)[R3] | Regs[R4] ← Regs[R4] + Mem[100 + Regs[R2] + Regs[R3] * d] | Used to index arrays
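To make the operand-fetch rules for the addressing modes concrete, here is a hedged Python sketch that computes the source operand for several of them. The function name fetch_operand, its parameter names, and the register and memory contents are all hypothetical; only the address arithmetic (for example Mem[constant + Regs[base]] for displacement) follows the usual definitions of these modes.

```python
def fetch_operand(mode, regs, mem, base=None, index=None, const=0, d=4):
    """Return the source operand selected by an addressing mode."""
    if mode == "register":
        return regs[base]
    if mode == "immediate":
        return const
    if mode == "displacement":
        return mem[const + regs[base]]
    if mode == "register_indirect":
        return mem[regs[base]]
    if mode == "indexed":
        return mem[regs[base] + regs[index]]
    if mode == "direct":
        return mem[const]
    if mode == "memory_indirect":
        return mem[mem[regs[base]]]
    if mode == "scaled":
        return mem[const + regs[base] + regs[index] * d]
    raise ValueError(f"unsupported mode: {mode}")


# Made-up machine state for the demonstration.
regs = {"R1": 1000, "R2": 8, "R3": 2}
mem = {1000: 11, 1008: 22, 1100: 33, 1001: 44, 11: 55, 116: 66}

print(fetch_operand("displacement", regs, mem, base="R1", const=100))
print(fetch_operand("scaled", regs, mem, base="R2", index="R3", const=100))
```

Autoincrement and autodecrement are left out because they also modify the base register, which would need a side effect on regs.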

Addressing Mode for Signal Processing

DSP offers special addressing modes to better serve popular algorithms.

Special features require either hand coding or a compiler that uses such features (GNU would not be a good choice).

Modulo addressing
- Since DSP deals with continuous data streams, circular buffers are widely used
- Circular or modulo addressing allows automatic increment and decrement, and resets the pointer when reaching the end of the buffer

Reverse addressing
- The resulting address is the reverse order of the current address
- Reverse addressing mode expedites the access, which otherwise requires a number of logical instructions or extra memory accesses

Fast Fourier Transform (bit-reversed order)

0 (000) → 0 (000)
1 (001) → 4 (100)
2 (010) → 2 (010)
3 (011) → 6 (110)
4 (100) → 1 (001)
5 (101) → 5 (101)
6 (110) → 3 (011)
7 (111) → 7 (111)
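Both DSP addressing modes are easy to mimic in software, which also shows the work the hardware saves. The Python sketch below is illustrative: the function names and the buffer parameters are assumptions, not part of any particular DSP's instruction set.

```python
def bit_reverse(index, bits):
    """Reverse the lowest `bits` bits of index (FFT bit-reversed addressing)."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


def modulo_next(pointer, step, start, length):
    """Advance a circular-buffer pointer, wrapping inside [start, start + length)."""
    return start + (pointer - start + step) % length


print([bit_reverse(i, 3) for i in range(8)])
print(modulo_next(19, 1, start=16, length=4))   # wraps back to the buffer start
```

In software each bit reversal costs a loop of shifts and masks per access; a reverse addressing mode produces the same address as part of the memory operation.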

Operations of the Computer Hardware

"There must certainly be instructions for performing the fundamental arithmetic operations."
Burks, Goldstine and von Neumann, 1947

Assembly language is a symbolic representation of what the processor actually understands.

The MIPS assembler allows only one instruction per line and ignores comments following # until the end of the line.

Example: Translation of a segment of a C program to MIPS assembly instructions

C:  f = (g + h) - (i + j)

MIPS:

    add t0, g, h    # temp variable t0 contains "g + h"
    add t1, i, j    # temp variable t1 contains "i + j"
    sub f, t0, t1   # f = t0 - t1 = (g + h) - (i + j)

Operations in the Instruction Set

Operator type | Examples
Arithmetic and logical | Integer arithmetic and logical operations: add, and, subtract, or
Data transfer | Loads-stores (move instructions on machines with memory addressing)
Control | Branch, jump, procedure call and return, trap
System | Operating system call, virtual memory management instructions
Floating point | Floating point instructions: add, multiply
Decimal | Decimal add, decimal multiply, decimal to character conversion
String | String move, string compare, string search
Graphics | Pixel operations, compression/decompression operations

Arithmetic, logical, data transfer and control are almost standard categories for all machines.

System instructions are required for multi-programming environments, although support for system functions varies.

Decimal and string instructions can be primitives, e.g. on the IBM 360 and the VAX.

Support for floating point, decimal, string and graphics can be optional, sometimes provided via a co-processor.

Some machines rely on the compiler to synthesize special operations, such as string handling, from simpler instructions.

Operations for Media & Signal Processing

Single instruction multiple data (SIMD) and vector instructions are often supported in DSPs, which are commonly used in multimedia and signal processing applications.

Partitioned Add (integer)
- Performs multiple 16-bit additions on a 64-bit ALU, since most data are narrow
- Increases ALU throughput for multimedia applications

Paired single operations (float)
- Allow the same register to act as two operands to the same operation
- Handy in dealing with vertices and coordinates

Multiply and accumulate
- Very handy for calculating dot products of vectors (signal processing) and matrix multiplication
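A partitioned add and a multiply-accumulate can be sketched in Python as follows. The lane layout (four 16-bit lanes in one 64-bit word, wrapping on overflow) and all function names are assumptions for illustration; real media instruction sets differ in lane widths and often offer saturating variants.

```python
LANE_BITS = 16
LANES = 4
LANE_MASK = (1 << LANE_BITS) - 1


def pack(lanes):
    """Pack four 16-bit values into one 64-bit word (lane 0 in the low bits)."""
    word = 0
    for i, value in enumerate(lanes):
        word |= (value & LANE_MASK) << (i * LANE_BITS)
    return word


def unpack(word):
    return [(word >> (i * LANE_BITS)) & LANE_MASK for i in range(LANES)]


def partitioned_add(a, b):
    """Add lane by lane; a carry out of one lane never reaches the next."""
    return pack([x + y for x, y in zip(unpack(a), unpack(b))])


def multiply_accumulate(xs, ys):
    """Dot product as a chain of multiply-accumulate steps."""
    acc = 0
    for x, y in zip(xs, ys):
        acc += x * y
    return acc


total = partitioned_add(pack([1, 2, 3, 0xFFFF]), pack([10, 20, 30, 1]))
print(unpack(total))
print(multiply_accumulate([1, 2, 3], [4, 5, 6]))
```

The last lane wraps to 0 without disturbing its neighbours, which is the property that lets one wide ALU serve as several narrow ones.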

Frequency of Operations Usage

Make the common case fast by focusing on these operations.

The most widely executed instructions are the simple operations of an instruction set.

The following is the average usage in SPECint on Intel 80x86:

Rank | 80x86 instruction | Integer average (% total executed)
1 | Load | 22%
2 | Conditional branch | 20%
3 | Compare | 16%
4 | Store | 12%
5 | Add | 8%
6 | And | 6%
7 | Sub | 5%
8 | Move register-register | 4%
9 | Call | 1%
10 | Return | 1%
Total | | 96%

Control Flow Instructions

Jump for unconditional change in the control flow

Branch for conditional change in the control flow

Procedure calls and returns

Data is based on SPEC on Alpha

Destination Address Definition

Relative addressing with respect to the program counter proved to be the best choice for forward and backward branching or jumps (load address independent).

To allow for dynamic loading of library routines, register indirect addressing allows addresses to be loaded in special registers (e.g. virtual functions in C++ and system calls in a case statement).

Data is based on SPEC on Alpha

Condition Evaluation

Compare and branch can be efficient if the majority of conditions are comparisons with zero.

Remember to focus on the common case.

Based on SPEC on MIPS

Frequency of Types of Comparison

Data is based on SPEC on Alpha

Different benchmarks and machines set new design priorities.

DSPs support a repeat instruction for for-loops (vectors) using registers.

Supporting Procedures

Execution of a procedure follows these steps:
- Store parameters in a place accessible to the procedure
- Transfer control to the procedure
- Acquire the storage resources needed for the procedure
- Perform the desired task
- Store the result value in a place accessible to the calling program
- Return control to the point of origin

The hardware provides a program counter to trace instruction flow and manage transfer of control.

Parameter Passing
- Registers can be used for passing a small number of parameters
- A stack is used to spill registers of the current context and make room for the called procedure to run, and to allow for large parameters to be passed
- Storage of machine state can be performed by caller or callee
- Handling of shared variables is important to ensure correct semantics, and thus requires clear specifications in the library interface

Global variables stored in registers need careful handling.

Type and Size of Operands

The type of an operand is designated by encoding it in the instruction's operation code.

The type of an operand, e.g. single precision float, effectively gives its size.

Common operand types include character, half word and word size integer, and single- and double-precision floating point.

Characters are almost always in ASCII, integers are in 2's complement, and floating point is in IEEE 754.

The 16-bit Unicode, used in Java, is gaining popularity due to its support for the international character sets.

For business applications, some architectures support a decimal format in binary coded decimal (BCD).

Depending on the size of the word, the complexity of handling different operand types differs.

DSP offers fixed point data types to support high precision floating point arithmetic and to allow sharing a single exponent for multiple numbers.

For graphics applications, vertex and pixel operands are added features.
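The link between operand type, size and representation can be inspected with Python's standard struct module. This is a sketch only: the type names in the dictionary are illustrative labels, and the sizes shown are struct's standard sizes rather than those of any specific machine.

```python
import struct

# struct format codes, using standard (machine-independent) sizes via "<".
operand_types = {
    "character": "b",
    "half word": "h",
    "word": "i",
    "single-precision float": "f",
    "double-precision float": "d",
}

sizes = {name: struct.calcsize("<" + code) for name, code in operand_types.items()}
print(sizes)

minus_one = struct.pack(">i", -1).hex()   # 2's complement encoding of -1
one_float = struct.pack(">f", 1.0).hex()  # IEEE 754 single-precision encoding of 1.0
print(minus_one, one_float)
```

Knowing the type is enough to know how many bytes to fetch and how to interpret them, which is why the type can live in the opcode.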

Size of Operands

Double-word data type is used for double-precision floating point operations and for address storage in machines with a 64-bit wide address bus.

Words are used for integer operations and for 32-bit address bus machines.

Because of the mix in SPEC, word and double-word data types dominate.

[Figure: frequency of reference by size, based on SPEC2000 on Alpha]

Instruction Representation

Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary, or base 2).

Numbers are stored in computers as a series of high and low electronic signals (binary numbers).

Binary digits are called bits and are considered the atom of computing.

Each piece of an instruction is a number, and placing these numbers together forms the instruction.

The assembler translates the symbolic assembly instructions into machine language instructions (machine code).

Example:

Assembly:                 add $t0, $s1, $s2

M/C language (decimal):   0       17      18      8       0       32
M/C language (binary):    000000  10001   10010   01000   00000   100000
                          6 bits  5 bits  5 bits  5 bits  5 bits  6 bits

Note: The MIPS compiler by default maps $s0,...,$s7 to reg. 16-23 and $t0,...,$t7 to reg. 8-15.
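Placing the field numbers together to form the instruction word can be reproduced with a few shifts. The Python sketch below assumes the standard MIPS R-format field widths (6, 5, 5, 5, 5, 6 bits) and the conventional register numbers $t0 = 8, $s1 = 17, $s2 = 18; the function name is illustrative.

```python
def encode_r_type(op, rs, rt, rd, shamt, funct):
    """Pack the six R-format fields into one 32-bit instruction word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


# add $t0, $s1, $s2  ->  op=0, rs=$s1 (17), rt=$s2 (18), rd=$t0 (8), shamt=0, funct=32
word = encode_r_type(op=0, rs=17, rt=18, rd=8, shamt=0, funct=32)
binary = f"{word:032b}"
print(binary)
print(f"0x{word:08X}")
```

Note that the destination register $t0 comes first in the assembly text but sits in the fourth field (rd) of the machine word.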

Encoding an Instruction Set

Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation.

The operation is typically specified in one field, called the opcode.

The addressing mode for the operand can be encoded with the operation, or specified through a separate identifier in case of a large number of supported modes.

The architecture must balance between several competing factors:
- Desire to support as many registers and addressing modes as possible
- Effect of operand specification on the size of the instruction (program)
- Desire to simplify instruction fetching and decoding during execution

Fixed size instruction encoding simplifies the CPU design while limiting the addressing modes supported.

An architect caring about the code size can use variable size encoding.

A hybrid approach is to allow variability by supporting multiple-sized instructions.

Encoding Examples

MIPS Instruction Format

Register-format instructions:

    op      rs      rt      rd      shamt   funct
    6 bits  5 bits  5 bits  5 bits  5 bits  6 bits

op: Basic operation of the instruction, traditionally called the opcode
rs: The first register source operand
rt: The second register source operand
rd: The register destination operand; it gets the result of the operation
shamt: Shift amount
funct: This field selects the specific variant of the operation of the op field

Immediate-type instructions:

    op      rs      rt      address
    6 bits  5 bits  5 bits  16 bits

Some instructions need longer fields than provided, for large value constants.

The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register.

Example: lw $t0, 32($s3)   # Temporary register $t0 gets A[8]

Instruction | Format | op | rs | rt | rd | shamt | funct | address
add | R | 0 | reg | reg | reg | 0 | 32 | N/A
sub | R | 0 | reg | reg | reg | 0 | 34 | N/A
lw | I | 35 | reg | reg | N/A | N/A | N/A | address
sw | I | 43 | reg | reg | N/A | N/A | N/A | address
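The immediate format and its ±2^15 byte reach can be checked with a short Python sketch. It assumes the standard MIPS I-format layout (6-bit op, 5-bit rs, 5-bit rt, 16-bit signed address field), op = 35 for lw, and register numbers $s3 = 19 and $t0 = 8; the function name and the range check are illustrative additions.

```python
def encode_i_type(op, rs, rt, immediate):
    """Pack an I-format instruction; the 16-bit field holds a signed offset."""
    if not -(1 << 15) <= immediate < (1 << 15):
        raise ValueError("offset does not fit in a signed 16-bit field")
    return (op << 26) | (rs << 21) | (rt << 16) | (immediate & 0xFFFF)


# lw $t0, 32($s3)  ->  op=35, rs=$s3 (19), rt=$t0 (8), address=32
lw_word = encode_i_type(op=35, rs=19, rt=8, immediate=32)
print(f"0x{lw_word:08X}")

# A negative offset is stored in 2's complement in the low 16 bits.
negative = encode_i_type(op=35, rs=19, rt=8, immediate=-4)
print(f"0x{negative & 0xFFFF:04X}")
```

An offset of 32 bytes selects A[8] when the array elements are 4-byte words and $s3 holds the array's base address.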

The Stored Program Concept

Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept.

Today's computers are built on two key principles:
- Instructions are represented as numbers
- Programs can be stored in memory to be read or written just like numbers

The power of the concept: memory can contain
- the source code for an editor
- the compiled m/c code for the editor
- the text that the compiled program is using
- the compiler that generated the code

[Figure: a processor connected to a memory holding an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and source code in C for the editor program]

Compiling if-then-else in MIPS

Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiled MIPS code for the following C if statement?

    if (i == j) f = g + h; else f = g - h;

[Figure: flowchart testing i == j; if i = j, f = g + h; if i ≠ j (Else:), f = g - h; both paths join at Exit:]

MIPS:

          bne $s3, $s4, Else    # go to Else if i ≠ j
          add $s0, $s1, $s2     # f = g + h (skipped if i ≠ j)
          j Exit
    Else: sub $s0, $s1, $s2     # f = g - h (skipped if i = j)
    Exit:
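The control flow of the compiled if-then-else can be traced with a toy interpreter. The Python sketch below is a simplification written for this example: it understands only add, sub, bne and j, treats a line ending in a colon as a label, and ignores real MIPS details such as branch delay slots and 32-bit overflow. The function name and program format are assumptions.

```python
def run(program, regs):
    """Interpret a tiny subset of MIPS assembly over a register dictionary."""
    labels, instructions = {}, []
    for line in program:
        if line.endswith(":"):
            labels[line[:-1]] = len(instructions)   # label marks the next instruction
        else:
            instructions.append(line.replace(",", " ").split())
    pc = 0
    while pc < len(instructions):
        op, *args = instructions[pc]
        pc += 1
        if op == "add":
            regs[args[0]] = regs[args[1]] + regs[args[2]]
        elif op == "sub":
            regs[args[0]] = regs[args[1]] - regs[args[2]]
        elif op == "bne":
            if regs[args[0]] != regs[args[1]]:
                pc = labels[args[2]]
        elif op == "j":
            pc = labels[args[0]]
        else:
            raise ValueError(f"unsupported instruction: {op}")
    return regs


PROGRAM = [
    "bne $s3, $s4, Else",
    "add $s0, $s1, $s2",
    "j Exit",
    "Else:",
    "sub $s0, $s1, $s2",
    "Exit:",
]

equal = run(PROGRAM, {"$s0": 0, "$s1": 5, "$s2": 3, "$s3": 1, "$s4": 1})
different = run(PROGRAM, {"$s0": 0, "$s1": 5, "$s2": 3, "$s3": 1, "$s4": 2})
print(equal["$s0"], different["$s0"])
```

The compiler tests the opposite condition (bne rather than beq) so that the "then" part can follow the branch directly.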

Typical Compilation

Major Types of Optimization

Optimization name | Explanation | Frequency

High-level: at or near source level; machine independent
Procedure integration | Replace procedure call by procedure body | N.M.

Local: within straight-line code
Common subexpression elimination | Replace two instances of the same computation by a single copy | 18%
Constant propagation | Replace all instances of a variable that is assigned a constant with the constant | 22%
Stack height reduction | Rearrange expression tree to minimize resources needed for expression evaluation | N.M.

Global: across a branch
Global common subexpression elimination | Same as local, but this version crosses branches | 13%
Copy propagation | Replace all instances of a variable A that has been assigned X (i.e., A = X) with X | 11%
Code motion | Remove code from a loop that computes the same value each iteration of the loop | 16%
Induction variable elimination | Simplify/eliminate array addressing calculations within loops | 2%

Machine-dependent: depends on machine knowledge
Strength reduction | Many examples, such as replacing multiply by a constant with adds and shifts | N.M.
Pipeline scheduling | Reorder instructions to improve pipeline performance | N.M.

Effect of Compiler Optimization

[Figure: program and compiler optimization level; measurements taken on MIPS]

Level 0: non-optimized code
Level 1: local optimization
Level 2: global optimization, s/w pipelining
Level 3: adds procedure integration

IntelQs MM and PoerPC Altiec have small vector processing capabilitiestargeting Multimedia applications (to speed up graphics)

Intel added ne set of instructions called Streaming SIM lttension

A maOor advantage of vector computers is hiding latency of memory accessby loading multiple elements and then overlapping eecution ith data

transfer

ector computers typically have strided and2or gather2scatter addressing to

perform operations on distant memory locations Strided addressing allos memory access in increment larger than one

ather2scatter addressing is similar to register indirect mode here theaddress are stored instead of the data

Supporting vector operation ithout strided addressing such as IntelQs MMlimits the potential speedup

Such limited support for vector processing ma0es the use of vectori-ing compiler optimi-ation unpopular and restrict its scope to hand coded routines

Compiler Support for Multimedia Instramp

SIM instructions on MM and Altiec tend to be solutions not primitivesSIM instructions on MM and Altiec tend to be solutions not primitives
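The two addressing styles mentioned above can be sketched in a few lines. This is a minimal model that treats memory as a Python list; the function names and the sample data are assumptions made for illustration, not part of any real vector instruction set.

```python
# Minimal model of vector addressing; names and data are illustrative assumptions.

def strided_load(memory, base, stride, count):
    """Load `count` elements starting at `base`, stepping `stride` cells each time."""
    return [memory[base + i * stride] for i in range(count)]


def gather(memory, addresses):
    """Gather: load from an explicit list of addresses (an index vector)."""
    return [memory[a] for a in addresses]


def scatter(memory, addresses, values):
    """Scatter: store each value at the matching address in the index vector."""
    for a, v in zip(addresses, values):
        memory[a] = v


memory = list(range(100, 116))                            # memory[i] == 100 + i
column = strided_load(memory, base=1, stride=4, count=4)  # cells 1, 5, 9, 13
picked = gather(memory, [3, 0, 7])                        # arbitrary locations
scatter(memory, [3, 0, 7], [-1, -2, -3])                  # write back to the same places
```

A stride of 4 here corresponds to walking down one column of a 4-wide row-major matrix, which is the typical case where unit-stride-only hardware loses its advantage.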

Starting a Program

[Figure: C program → Compiler → Assembly language program → Assembler → Object: machine language module, together with Object: library routine (machine language) → Linker → Executable: machine language program → Loader → Memory]

Linker
- Place code and data modules symbolically in memory
- Determine the addresses of data and instruction labels
- Patch both internal and external references

Object files for Unix typically contain:
- Header: size and position of components
- Text segment: machine code
- Data segment: static and dynamic variables
- Relocation info: identify absolute memory references
- Symbol table: names and locations of labels, procedures and variables
- Debugging info: mapping source to object code, break points, etc.

Loading Executable Program

[Figure: memory layout with regions for text (pc), static data ($gp), dynamic data, stack ($sp), and a reserved area]

To load an executable, the operating system follows these steps:
1. Reads the executable file header to determine the size of the text and data segments
2. Creates an address space large enough for the text and data
3. Copies the instructions and data from the executable file into memory
4. Copies the parameters (if any) to the main program onto the stack
5. Initializes the machine registers and sets the stack pointer to the first free location
6. Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program

Instruction Set Design Issues

Instruction set design issues:
- Number of addresses
- Flow of control
- Operand types
- Addressing modes
- Instruction types
- Instruction formats

Number of Addresses

Four categories
- 3-address machines
  - Two for the source operands and one for the result
- 2-address machines
  - One address doubles as source and result
- 1-address machines
  - Accumulator machines
  - Accumulator is used for one source and result
- 0-address machines
  - Stack machines
  - Operands are taken from the stack
  - Result goes onto the stack

Number of Addresses (cont.)

Three-address machines
- Two for the source operands, one for the result
- RISC processors use three addresses
- Sample instructions:
    add  dest,src1,src2    M(dest) = [src1] + [src2]
    sub  dest,src1,src2    M(dest) = [src1] - [src2]
    mult dest,src1,src2    M(dest) = [src1] * [src2]

Three addresses: Operand 1, Operand 2, Result
Example: a = b + c

Three-address instruction formats are not common because they require a relatively long instruction format to hold the three address references.

Number of Addresses (cont.)

Example
- C statement:
    A = B + C * D - E + F + A
- Equivalent code:
    mult T,C,D    ; T = C*D
    add  T,T,B    ; T = B + C*D
    sub  T,T,E    ; T = B + C*D - E
    add  T,T,F    ; T = B + C*D - E + F
    add  A,T,A    ; A = B + C*D - E + F + A
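The three-address sequence above can be checked with a tiny interpreter. This is a sketch under stated assumptions: memory is modeled as a dictionary, the function name `run_three_address` is hypothetical, and the initial values of A through F are arbitrary sample data.

```python
# Sketch of a three-address machine; every operand names a memory cell.
# The function name and the sample values are illustrative assumptions.

def run_three_address(program, memory):
    """Execute (op, dest, src1, src2) tuples: M(dest) = [src1] op [src2]."""
    operations = {
        "add": lambda x, y: x + y,
        "sub": lambda x, y: x - y,
        "mult": lambda x, y: x * y,
    }
    for op, dest, src1, src2 in program:
        memory[dest] = operations[op](memory[src1], memory[src2])
    return memory


memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6, "T": 0}
program = [
    ("mult", "T", "C", "D"),  # T = C*D
    ("add", "T", "T", "B"),   # T = B + C*D
    ("sub", "T", "T", "E"),   # T = B + C*D - E
    ("add", "T", "T", "F"),   # T = B + C*D - E + F
    ("add", "A", "T", "A"),   # A = B + C*D - E + F + A
]
final = run_three_address(program, dict(memory))
```

With these sample values the direct evaluation of B + C*D - E + F + A is 2 + 12 - 5 + 6 + 1 = 16, which is what the five instructions leave in A.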

Number of Addresses (cont.)

Two-address machines
- One address doubles (for source operand and result)
- Last example makes a case for it
  - Address T is used twice
- Sample instructions:
    load dest,src    M(dest) = [src]
    add  dest,src    M(dest) = [dest] + [src]
    sub  dest,src    M(dest) = [dest] - [src]
    mult dest,src    M(dest) = [dest] * [src]

Two addresses: one address doubles as operand and result
Example: a = a + b

The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation.

Number of Addresses (cont.)

Example
- C statement:
    A = B + C * D - E + F + A
- Equivalent code:
    load T,C    ; T = C
    mult T,D    ; T = C*D
    add  T,B    ; T = B + C*D
    sub  T,E    ; T = B + C*D - E
    add  T,F    ; T = B + C*D - E + F
    add  A,T    ; A = B + C*D - E + F + A

Number of Addresses (cont.)

One-address machines
- Use a special set of registers called accumulators
  - Specify one source operand and receive the result
- Called accumulator machines
- Sample instructions:
    load  addr    accum = [addr]
    store addr    M[addr] = accum
    add   addr    accum = accum + [addr]
    sub   addr    accum = accum - [addr]
    mult  addr    accum = accum * [addr]

Number of Addresses (cont.)

Example
- C statement:
    A = B + C * D - E + F + A
- Equivalent code:
    load  C    ; load C into accum
    mult  D    ; accum = C*D
    add   B    ; accum = C*D + B
    sub   E    ; accum = B + C*D - E
    add   F    ; accum = B + C*D - E + F
    add   A    ; accum = B + C*D - E + F + A
    store A    ; store accum contents in A

Number of Addresses (cont.)

Zero-address machines
- Stack supplies operands and receives the result
  - Special instructions to load and store use an address
- Called stack machines (Ex: HP3000, Burroughs B5500)
- Sample instructions:
    push addr    push([addr])
    pop  addr    pop([addr])
    add          push(pop + pop)
    sub          push(pop - pop)
    mult         push(pop * pop)

Number of Addresses (cont.)

Example
- C statement:
    A = B + C * D - E + F + A
- Equivalent code:
    push E
    push C
    push D
    mult
    push B
    add
    sub
    push F
    add
    push A
    add
    pop  A
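The operand order in the stack version is easy to get wrong, so a small simulation helps. The sketch below assumes the semantics given in the sample instructions, where `sub` computes push(pop - pop) with the first pop taking the top of the stack. The function name `run_stack_machine` and the sample values are assumptions.

```python
# Sketch of a zero-address (stack) machine; names and data are assumptions.

def run_stack_machine(program, memory):
    """Run 'push X', 'pop X', 'add', 'sub', 'mult' on a list-based stack."""
    stack = []
    for instruction in program:
        op, *args = instruction.split()
        if op == "push":
            stack.append(memory[args[0]])
        elif op == "pop":
            memory[args[0]] = stack.pop()
        else:
            first = stack.pop()    # top of stack
            second = stack.pop()   # element below the top
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)   # push(pop - pop)
            elif op == "mult":
                stack.append(first * second)
            else:
                raise ValueError(f"unknown instruction: {instruction}")
    return memory


memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
program = [
    "push E", "push C", "push D", "mult", "push B", "add",
    "sub", "push F", "add", "push A", "add", "pop A",
]
final_memory = run_stack_machine(program, dict(memory))
```

Pushing E first is what makes the later `sub` produce (B + C*D) - E rather than E - (B + C*D): by then the sum is on top and E sits beneath it.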

Load/Store Architecture

- Instructions expect operands in internal processor registers
- Special LOAD and STORE instructions move data between registers and memory
- RISC uses this architecture
- Reduces instruction length

Load/Store Architecture (cont.)

Sample instructions:
    load  Rd,addr       Rd = [addr]
    store addr,Rs       (addr) = Rs
    add   Rd,Rs1,Rs2    Rd = Rs1 + Rs2
    sub   Rd,Rs1,Rs2    Rd = Rs1 - Rs2
    mult  Rd,Rs1,Rs2    Rd = Rs1 * Rs2

Number of Addresses (cont.)

Example
- C statement:
    A = B + C * D - E + F + A
- Equivalent code:
    load  R1,B
    load  R2,C
    load  R3,D
    load  R4,E
    load  R5,F
    load  R6,A
    mult  R2,R2,R3
    add   R2,R2,R1
    sub   R2,R2,R4
    add   R2,R2,R5
    add   R2,R2,R6
    store A,R2

Flow of Control

- Default is sequential flow
- Several instructions alter this default execution
  - Branches
    - Unconditional
    - Conditional
    - Delayed branches
  - Procedure calls
    - Delayed procedure calls

Flow of Control (cont.)

Branches
- Unconditional
  - Absolute address
  - PC-relative
    - Target address is specified relative to PC contents
    - Relocatable code
- Example: MIPS
  - Absolute address
      j target
  - PC-relative
      b target
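The difference between the two MIPS forms shows up in how the target address is computed. The sketch below follows the standard MIPS32 encoding, where a branch carries a signed 16-bit word offset and a jump carries a 26-bit word index. The function names are hypothetical helpers written for this illustration.

```python
# Target-address arithmetic for MIPS32 branches and jumps.
# Function names are illustrative assumptions; the arithmetic follows the
# usual MIPS32 rules (PC+4 base, word-aligned targets).

def sign_extend_16(value):
    """Interpret the low 16 bits of `value` as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def branch_target(pc, offset16):
    """PC-relative: target = (PC + 4) + sign_extend(offset) * 4."""
    return (pc + 4 + (sign_extend_16(offset16) << 2)) & 0xFFFFFFFF


def jump_target(pc, index26):
    """Pseudo-absolute: upper 4 bits of (PC + 4) joined with index * 4."""
    return ((pc + 4) & 0xF0000000) | ((index26 & 0x03FFFFFF) << 2)
```

Because the branch target depends only on the distance from the branch itself, code using it can be moved in memory without patching, which is the relocatable-code property listed above.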

Flow of Control (cont.)

[Figure: Pentium example]

Flow of Control (cont.)

Branches
- Conditional
  - Jump is taken only if the condition is met
- Two types
  - Set-Then-Jump
    - Condition testing is separated from branching
    - Condition code registers are used to convey the condition test result
    - Condition code registers keep a record of the status of the last ALU operation, such as the overflow condition
  - Example: Pentium code
      cmp AX,BX     ; compare AX and BX
      je  target    ; jump if equal

Flow of Control (cont.)

  - Test-and-Jump
    - Single instruction performs condition testing and branching
  - Example: MIPS instruction
      beq Rsrc1,Rsrc2,target
    Jumps to target if Rsrc1 = Rsrc2

Delayed branching
- Control is transferred after executing the instruction that follows the branch instruction
  - This instruction slot is called the delay slot
- Improves efficiency
- Highly pipelined RISC processors support it

Flow of Control (cont.)

Procedure calls
- Facilitate modular programming
- Require two pieces of information to return
  - End of procedure
    - Pentium uses the ret instruction
    - MIPS uses the jr instruction
  - Return address
    - In a (special) register: MIPS allows any general-purpose register
    - On the stack: Pentium

Flow of Control (cont.)

[Figure: delay slot]

Parameter Passing

Two basic techniques
- Register-based (e.g., PowerPC, MIPS)
  - Internal registers are used
    - Faster
    - Limit the number of parameters
    - Recursive procedure
- Stack-based (e.g., Pentium)
  - Stack is used
    - More general

Operand Types

- Instructions support basic data types
  - Characters
  - Integers
  - Floating-point
- Instruction overload
  - Same instruction for different data types
  - Example: Pentium
      mov AL,address     ; loads an 8-bit value
      mov AX,address     ; loads a 16-bit value
      mov EAX,address    ; loads a 32-bit value

Operand Types

- Separate instructions
  - Instructions specify the operand size
  - Example: MIPS
      lb Rdest,address    ; loads a byte
      lh Rdest,address    ; loads a halfword (16 bits)
      lw Rdest,address    ; loads a word (32 bits)
      ld Rdest,address    ; loads a doubleword (64 bits)
  - Similar instructions for store

Addressing Modes

- How the operands are specified
  - Operands can be in three places
    - Registers
      - Register addressing mode
    - Part of instruction
      - Constant
      - Immediate addressing mode
      - All processors support these two addressing modes
    - Memory
      - Difference between RISC and CISC
      - CISC supports a large variety of addressing modes
      - RISC follows load/store architecture

Instruction Types

Several types of instructions
- Data movement
  - Pentium: mov dest,src
  - Some do not provide direct data movement instructions
  - Indirect data movement
      add Rdest,Rsrc,0    ; Rdest = Rsrc + 0
- Arithmetic and logical
  - Arithmetic
    - Integer and floating-point, signed and unsigned
    - add, subtract, multiply, divide
  - Logical
    - and, or, not, xor

Instruction Types (cont.)

Condition code bits
- S: Sign bit (0 = +, 1 = -)
- Z: Zero bit (0 = nonzero, 1 = zero)
- O: Overflow bit (0 = no overflow, 1 = overflow)
- C: Carry bit (0 = no carry, 1 = carry)
- Example: Pentium
    cmp count,25    ; compare count to 25
                    ; (subtract 25 from count)
    je  target      ; jump if equal
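A compare instruction sets the condition code bits by performing a subtraction and discarding the result. The sketch below models that for fixed-width integers. The function name `compare_flags` is hypothetical, and the carry bit follows the x86 convention in which a borrow on subtraction sets C, which is an assumption about which convention the slide intends.

```python
# Sketch: condition code bits produced by a compare (a - b) on `bits`-wide values.
# The function name is an assumption; carry follows the x86 borrow convention.

def compare_flags(a, b, bits=32):
    mask = (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask

    sign = 1 if result & sign_bit else 0
    zero = 1 if result == 0 else 0
    carry = 1 if a < b else 0  # unsigned borrow
    # Signed overflow: operands differ in sign and the result's sign differs from a's.
    overflow = 1 if ((a ^ b) & (a ^ result) & sign_bit) else 0
    return {"S": sign, "Z": zero, "O": overflow, "C": carry}


def jump_if_equal(flags):
    """A 'jump if equal' is taken exactly when the zero bit is set."""
    return flags["Z"] == 1
```

This also shows why set-then-jump needs two instructions: the compare only records the flags, and the conditional jump reads them later.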

Instruction Types (cont.)

- Flow control and I/O instructions
  - Branch
  - Procedure call
  - Interrupts
- I/O instructions
  - Memory-mapped I/O
    - Most processors support memory-mapped I/O
    - No separate instructions for I/O
  - Isolated I/O
    - Pentium supports isolated I/O
    - Separate I/O instructions
        in  AX,io_port    ; read from an I/O port
        out io_port,AX    ; write to an I/O port

Instruction Formats

Two types
- Fixed-length
  - Used by RISC processors
  - 32-bit RISC processors use 32-bit-wide instructions
    - Examples: SPARC, MIPS, PowerPC
- Variable-length
  - Used by CISC processors
  - Memory operands need more bits to specify
- Opcode
  - Major and exact operation

Examples of Instruction Formats

[Figure: examples of instruction formats]

RISC (Reduced Instruction Set Computer) versus CISC (Complex Instruction Set Computer)

RISC vs. CISC

- The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.
- RISC systems access memory only with explicit load and store instructions.
- In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.

RISC vs. CISC

- The difference between CISC and RISC becomes evident through the basic computer performance equation:

    time/program = time/cycle x cycles/instruction x instructions/program

- RISC systems shorten execution time by reducing the clock cycles per instruction.
- CISC systems improve performance by reducing the number of instructions per program.

RISC vs. CISC

- The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.
- The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.
- With fixed-length instructions, RISC lends itself to pipelining and speculative execution.

RISC vs. CISC

Consider the following program fragments:

CISC:
    mov ax, 10
    mov bx, 5
    mul bx, ax

RISC:
    mov ax, 0
    mov bx, 10
    mov cx, 5
Begin:
    add ax, bx
    loop Begin

- The total clock cycles for the CISC version might be:
    (2 movs x 1 cycle) + (1 mul x 30 cycles) = 32 cycles
- While the clock cycles for the RISC version are:
    (3 movs x 1 cycle) + (5 adds x 1 cycle) + (5 loops x 1 cycle) = 13 cycles
- With the RISC clock cycle being shorter, RISC gives us much faster execution speeds.
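The cycle comparison can be reproduced with a short simulation. The numbers used here are assumptions chosen for illustration: operands 10 and 5, a 30-cycle multiply, and one cycle for every other instruction. The function names are hypothetical.

```python
# Sketch comparing cycle counts for a multiply done two ways.
# Operand values and per-instruction cycle costs are illustrative assumptions.

def cisc_multiply(a, b, mov_cycles=1, mul_cycles=30):
    """Two register loads followed by one multi-cycle multiply instruction."""
    cycles = 2 * mov_cycles + mul_cycles
    return a * b, cycles


def risc_multiply(a, b, cycles_per_instruction=1):
    """Three register loads, then a loop that adds `a` to a total `b` times."""
    cycles = 3 * cycles_per_instruction    # mov total, mov addend, mov counter
    total = 0
    counter = b
    while counter > 0:
        total += a                            # add instruction
        counter -= 1                          # loop instruction: decrement and branch
        cycles += 2 * cycles_per_instruction
    return total, cycles


cisc_result, cisc_cycles = cisc_multiply(10, 5)
risc_result, risc_cycles = risc_multiply(10, 5)
```

The RISC loop's cost grows with the multiplier, so the advantage depends on the operand values as well as on the assumed multiply latency.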

RISC vs. CISC

- Because of their load-store ISAs, RISC architectures require a large number of CPU registers.
- These registers provide fast access to data during sequential program execution.
- They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.
- Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.

RISC vs. CISC

- This is how registers can be overlapped in a RISC system.
- The current window pointer (CWP) points to the active register window.

[Figure: overlapping register windows]

RISC vs. CISC

- It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.
- Some RISC systems provide more extravagant instruction sets than some CISC systems.
- Some systems combine both approaches.
- The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.

RISC vs. CISC

RISC
- Multiple register sets.
- Three operands per instruction.
- Parameter passing through register windows.
- Single-cycle instructions.
- Hardwired control.
- Highly pipelined.

CISC
- Single register set.
- One or two register operands per instruction.
- Parameter passing through memory.
- Multiple-cycle instructions.
- Microprogrammed control.
- Less pipelined.

Continued...

RISC vs. CISC

RISC
- Simple instructions, few in number.
- Fixed-length instructions.
- Complexity in compiler.
- Only LOAD/STORE instructions access memory.
- Few addressing modes.

CISC
- Many complex instructions.
- Variable-length instructions.
- Complexity in microcode.
- Many instructions can access memory.
- Many addressing modes.

Summary

Instruction Set Design Issues

Instruction set design issues include:
- Where are operands stored?
  - registers, memory, stack, accumulator
- How many explicit operands are there?
  - 0, 1, 2, or 3
- How is the operand location specified?
  - register, immediate, indirect, ...
- What type and size of operands are supported?
  - byte, int, float, double, string, vector, ...
- What operations are supported?
  - add, sub, mul, move, compare, ...

More About General Purpose Registers

Why do almost all new architectures use GPRs?
- Registers are much faster than memory (even cache)
  - Register values are available immediately
  - When memory isn't ready, the processor must wait ("stall")
- Registers are convenient for variable storage
  - Compiler assigns some variables just to registers
  - More compact code, since small fields specify registers (compared to memory addresses)

[Figure: Processor (Registers, Cache), Memory, Disk]

What Operations are Needed

- Arithmetic and logical
  - Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
  - Logical operations: AND, OR, XOR, NOT
- Data transfer: copy, load, store
- Control: branch, jump, call, return
- Floating point: ADDF, MULF, DIVF
  - Same as arithmetic, but usually take bigger operands
- Decimal: ADDD, CONVERT
- String: move, compare, search
- Graphics: pixel and vertex, compression/decompression operations

Stack Architecture: Pros and Cons

Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures

Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures

Accumulator Architecture: Pros and Cons

Pros
- Very low hardware requirements
- Easy to design and understand

Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic

Memory-Memory Architecture: Pros and Cons

Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)

Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required

Memory-Register Architecture: Pros and Cons

Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density

Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers

Page 14: Chapter 2: Advanced computer Architecture

Register-to-Register: Load-Store Architectures

Register-to-Memory Architectures

Memory-to-Memory Architectures

Instruction Formats

Instruction Set Architecture (ISA)

- To command a computer's hardware, you must speak its language.
- The words of a machine's language are called instructions, and its vocabulary is called an instruction set.
- Once you learn one machine language, it is easy to pick up others:
  - There are few fundamental operations that all computers must provide
  - All designers have the same goal of finding a language that simplifies building the hardware and the compiler while maximizing performance and minimizing cost
- Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept.
- The MIPS instruction set is used as a case study.

Interface Design

A good interface:
- Lasts through many implementations (portability, compatibility)
- Is used in many different ways (generality)
- Provides convenient functionality to higher levels
- Permits an efficient implementation at lower levels

Design decisions must take into account:
- Technology
- Machine organization
- Programming languages
- Compiler technology
- Operating systems

[Figure: a single interface with several uses above it and several implementations (imp 0, imp 1, ...) below it, over time]

Classifying Instruction Set Architectures

Accumulator Architecture
- Common in early stored-program computers, when hardware was very expensive
- Machine has only one register (accumulator) involved in all math and logical operations
- All operations assume the accumulator as a source operand and a destination for the operation, with the other operand stored in memory

Extended Accumulator Architecture
- Dedicated registers for specific operations, e.g., stack and array index registers, added
- The 8086 microprocessor is an example of such a special-purpose register architecture

General-Purpose Register Architecture
- MIPS is an example of such an architecture, where registers are not tied to a single role
- This type of instruction set can be further divided into:
  - Register-memory: allows for one operand to be in memory
  - Register-register (load-store): demands all operands to be in registers

Machine           # general-purpose registers   Architecture style                 Year
Motorola 6800     2                             Accumulator                        1974
DEC VAX           16                            Register-memory, memory-memory     1977
Intel 8086        1                             Extended accumulator               1978
Motorola 68000    16                            Register-memory                    1980
Intel 80386       8                             Register-memory                    1985
PowerPC           32                            Load-store                         1992
DEC Alpha         32                            Load-store                         1992

Compact Code and Stack Architectures

- When memory is scarce, machines like the Intel 80x86 had variable-length instructions to match varying operand specifications and minimize code size
- Stack machines abandoned registers altogether, arguing that it is hard for compilers to use them efficiently
- Operands are to be pushed on a stack from memory, and the results have to be popped from the stack to memory
- Operations take their operands by default from the top of the stack and insert the results back onto the stack
- Stack machines simplify compilers and lend themselves to a compact instruction encoding, but limit compiler optimization (e.g., in math expressions)

Example: A = B + C
    Push AddressC    ; Top = Top + 4; Stack[Top] = Memory[AddressC]
    Push AddressB    ; Top = Top + 4; Stack[Top] = Memory[AddressB]
    add              ; Stack[Top-4] = Stack[Top] + Stack[Top-4]; Top = Top - 4
    Pop  AddressA    ; Memory[AddressA] = Stack[Top]; Top = Top - 4

- Compact code is important for heralded network computers, where programs must be downloaded over the Internet (e.g., Java-based applications)

Other Types of Architecture

High-Level-Language Architecture
- In the 1960s, systems software was rarely written in high-level languages, and virtually every commercial operating system before Unix was written in assembly
- Some people blamed the code density on the instruction set rather than the programming language
- A machine design philosophy was advocated with the goal of making the hardware more like high-level languages
- The effectiveness of high-level languages, memory size limitations, and lack of efficient compilers doomed this philosophy to a historical footnote

Reduced Instruction Set Architecture
- With the recent development in compiler technology and expanded memory sizes, fewer programmers are using assembly-level coding
- Instruction set architecture became measurable in the way compilers, rather than programmers, use them
- RISC architecture favors simplifying hardware design over enriching the offered set of instructions, relying on compilers to effectively use them to perform complex operations
- Virtually all new architectures since 1982 follow the RISC philosophy of fixed instruction lengths, load-store operations, and limited addressing modes

Evolution of Instruction Sets

Single Accumulator (EDSAC 1950)
  → Accumulator + Index Registers (Manchester Mark I, IBM 700 series 1953)
  → Separation of Programming Model from Implementation
    - High-level Language Based (B5000 1963)
    - Concept of a Family (IBM 360 1964)
  → General Purpose Register Machines
    - Complex Instruction Sets (Vax, Intel 432 1977-80)
    - Load/Store Architecture (CDC 6600, Cray 1 1963-76)
      → RISC (MIPS, SPARC, HP-PA, IBM RS6000, ... 1987)

Register-Memory Architectures

# memory addresses   Max. number of operands   Examples
0                    3                         SPARC, MIPS, PowerPC, ALPHA
1                    2                         Intel 80x86, Motorola 68000
2                    2                         VAX (also has 3-operand format)
3                    3                         VAX (also has 2-operand format)

Effect of the number of memory operands

M Add

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 25101

Memory AddressInterpreting Memory Addressing

The address of a ord matches the byte address of one of its amp bytes

The addresses of seJuential ords differ by amp (ord si-e in byte)

ords9 addresses are multiple of amp (alignment restriction)

Machines that use the address of the leftmost byte as the ord address iscalled Kig EndianK and those that use rightmost bytes called Kittle EndianK

Misalignment complicates memory access and causes programs to run sloer (Some machines does not allo misaligned memory access at all)

8yte ordering can be a problem hen echanging data among different machines 8yte addresses affects array inde calculation to account for ord addressing and offset ithin the ord

Object addressed | Aligned at byte offsets | Misaligned at byte offsets
Byte | 0, 1, 2, 3, 4, 5, 6, 7 | Never
Half word | 0, 2, 4, 6 | 1, 3, 5, 7
Word | 0, 4 | 1, 2, 3, 5, 6, 7
Double word | 0 | 1, 2, 3, 4, 5, 6, 7
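The byte-ordering and alignment rules above can be illustrated with a small sketch. It assumes a 32-bit word and uses Python's standard `struct` module; the helper names `word_bytes` and `is_aligned` are made up for this illustration and do not come from the slides.

```python
import struct

def word_bytes(value: int, byteorder: str) -> list:
    """Return the 4 bytes of a 32-bit word in memory order (lowest address first)."""
    fmt = ">I" if byteorder == "big" else "<I"
    return list(struct.pack(fmt, value))

def is_aligned(address: int, size: int) -> bool:
    """An object of `size` bytes is aligned when its address is a multiple of its size."""
    return address % size == 0

# Big Endian: the most significant byte sits at the lowest address.
big = word_bytes(0x0A0B0C0D, "big")
# Little Endian: the least significant byte sits at the lowest address.
little = word_bytes(0x0A0B0C0D, "little")
# Byte offsets 0..7 at which a 4-byte word is aligned.
aligned_word_offsets = [a for a in range(8) if is_aligned(a, 4)]
```

Exchanging the raw bytes of `big` with a little-endian machine would make it read a different number, which is the data-exchange problem the slide mentions.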

Addressing Modes

• Addressing modes refer to how to specify the location of an operand (effective address)
• Addressing modes have the ability to:
  - Significantly reduce instruction counts
  - Increase the average CPI
  - Increase the complexity of building a machine
• The VAX machine is used for benchmark data since it supports a wide range of memory addressing modes
• Famous addressing modes can be classified based on:
  - the source of the data, into register, immediate or memory
  - the address calculation, into direct and indirect
• An indexed addressing mode is usually provided to allow efficient implementation of loops and array access

Example of Addressing Modes

Addressing mode | Example | Meaning | When used
Register | ADD R4, R3 | Regs[R4] ← Regs[R4] + Regs[R3] | When a value is in a register
Immediate | ADD R4, #3 | Regs[R4] ← Regs[R4] + 3 | For constants
Displacement | ADD R4, 100(R1) | Regs[R4] ← Regs[R4] + Mem[100 + Regs[R1]] | Accessing local variables
Register indirect | ADD R4, (R1) | Regs[R4] ← Regs[R4] + Mem[Regs[R1]] | Accessing using a pointer or a computed address
Indexed | ADD R4, (R1 + R2) | Regs[R4] ← Regs[R4] + Mem[Regs[R1] + Regs[R2]] | Sometimes useful in array addressing: R1 = base of the array; R2 = index amount
Direct or absolute | ADD R4, (1001) | Regs[R4] ← Regs[R4] + Mem[1001] | Sometimes useful for accessing static data; address constant may need to be large
Memory indirect or memory deferred | ADD R4, @(R3) | Regs[R4] ← Regs[R4] + Mem[Mem[Regs[R3]]] | If R3 is the address of the pointer p, then mode yields *p
Autoincrement | ADD R4, (R2)+ | Regs[R4] ← Regs[R4] + Mem[Regs[R2]]; Regs[R2] ← Regs[R2] + d | Useful for stepping through arrays within a loop. R2 points to start of the array; each reference increments R2 by d
Autodecrement | ADD R4, -(R2) | Regs[R2] ← Regs[R2] - d; Regs[R4] ← Regs[R4] + Mem[Regs[R2]] | Same use as autoincrement. Autodecrement/increment can also act as push/pop to implement a stack
Scaled | ADD R4, 100(R2)[R3] | Regs[R4] ← Regs[R4] + Mem[100 + Regs[R2] + Regs[R3] * d] | Used to index arrays
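As a rough sketch of how the operand is located under each mode, the following models the register file and memory as Python dictionaries. The register contents and the rule Mem[a] = 10 * a are arbitrary assumptions chosen only so the results are easy to check by hand; the function names are illustrative.

```python
# Hypothetical machine state: register values and memory contents are made up.
regs = {"R1": 8, "R2": 16, "R3": 2, "R4": 0}
mem = {addr: 10 * addr for addr in range(2048)}

def displacement(disp, base):
    return mem[disp + regs[base]]          # Mem[disp + Regs[base]]

def register_indirect(base):
    return mem[regs[base]]                 # Mem[Regs[base]]

def indexed(base, index):
    return mem[regs[base] + regs[index]]   # Mem[Regs[base] + Regs[index]]

def direct(address):
    return mem[address]                    # Mem[address]

def memory_indirect(reg):
    return mem[mem[regs[reg]]]             # Mem[Mem[Regs[reg]]]

def autoincrement(reg, d):
    value = mem[regs[reg]]                 # use the address first ...
    regs[reg] += d                         # ... then step the pointer by d
    return value

def autodecrement(reg, d):
    regs[reg] -= d                         # step the pointer back first ...
    return mem[regs[reg]]                  # ... then use the new address

def scaled(disp, base, index, d):
    return mem[disp + regs[base] + regs[index] * d]
```

Each function returns the memory operand only; a full ADD would then add that value to Regs[R4].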

Addressing Mode for Signal Processing

Fast Fourier Transform (bit-reversed ordering):

0 (000) → 0 (000)
1 (001) → 4 (100)
2 (010) → 2 (010)
3 (011) → 6 (110)
4 (100) → 1 (001)
5 (101) → 5 (101)
6 (110) → 3 (011)
7 (111) → 7 (111)

Modulo addressing
• Since DSP deals with continuous data streams, circular buffers are widely used
• Circular or modulo addressing allows automatic increment and decrement and resets the pointer when reaching the end of the buffer

Reverse addressing
• Resulting address is the reverse order of the current address
• Reverse addressing mode expedites the access, which otherwise requires a number of logical instructions or extra memory accesses

DSP offers special addressing modes to better serve popular algorithms

Special features require either hand coding or a compiler that uses such features (GNU would not be a good choice)
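Both DSP addressing modes can be sketched in software, which also shows the extra logical operations that dedicated hardware avoids. This is a minimal illustration; the function names and the buffer parameters are assumptions, not part of any DSP instruction set.

```python
def bit_reverse(index: int, bits: int) -> int:
    """Reverse the lowest `bits` bits of index (bit-reversed addressing for the FFT)."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)  # move the lowest bit of index into result
        index >>= 1
    return result

def modulo_step(pointer: int, step: int, base: int, length: int) -> int:
    """Advance a circular-buffer pointer, wrapping inside [base, base + length)."""
    return base + (pointer - base + step) % length

# Access order for an 8-point FFT with 3-bit indices.
fft_order = [bit_reverse(i, 3) for i in range(8)]
```

Here `modulo_step(7, 1, 0, 8)` wraps back to the start of an 8-entry buffer, which is the automatic pointer reset described above.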

Operations of the Computer Hardware

"There must certainly be instructions for performing the fundamental arithmetic operations."
Burks, Goldstine and von Neumann, 1947

• Assembly language is a symbolic representation of what the processor actually understands
• The MIPS assembler allows only one instruction per line and ignores comments following # until the end of the line

Example:

Translation of a segment of a C program to MIPS assembly instructions

C: f = (g + h) - (i + j)

MIPS:

add t0, g, h    # temp variable t0 contains "g + h"
add t1, i, j    # temp variable t1 contains "i + j"
sub f, t0, t1   # f = t0 - t1 = (g + h) - (i + j)

Operations in the Instruction Set

Operator type | Examples
Arithmetic and logical | Integer arithmetic and logical operations: add, and, subtract, or
Data transfer | Loads-stores (move instructions on machines with memory addressing)
Control | Branch, jump, procedure call and return, trap
System | Operating system call, virtual memory management instructions
Floating point | Floating point instructions: add, multiply
Decimal | Decimal add, decimal multiply, decimal to character conversion
String | String move, string compare, string search
Graphics | Pixel operations, compression/decompression operations

• Arithmetic, logical, data transfer and control are almost standard categories for all machines
• System instructions are required for multi-programming environments, although support for system functions varies
• Decimal and string instructions can be primitives, e.g. on the IBM 360 and the VAX
• Support for floating point, decimal, string and graphics can be optional, sometimes provided via a co-processor
• Some machines rely on the compiler to synthesize special operations such as string handling from simpler instructions

Operations for Media and Signal Processing

• Single instruction multiple data (SIMD) and vector instructions are often supported in DSPs, which are commonly used in multimedia and signal processing applications
• Partitioned Add (integer)
  - Perform multiple 16-bit additions on a 64-bit ALU since most data are narrow
  - Increases ALU throughput for multimedia applications
• Paired single operations (float)
  - Allow the same register to act as two operands to the same operation
  - Handy in dealing with vertices and coordinates
• Multiply and accumulate
  - Very handy for calculating dot products of vectors (signal processing) and matrix multiplication
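A partitioned add and a multiply-accumulate loop can be sketched as follows. The lane layout (four 16-bit lanes in a 64-bit value, with carries discarded at lane boundaries) is an assumption made for illustration; real media extensions also offer saturating variants that are not modeled here.

```python
LANES = 4                         # assumed: four lanes ...
LANE_BITS = 16                    # ... of 16 bits each in one 64-bit value
LANE_MASK = (1 << LANE_BITS) - 1

def partitioned_add(a: int, b: int) -> int:
    """Add corresponding 16-bit lanes of two 64-bit values independently."""
    result = 0
    for lane in range(LANES):
        shift = lane * LANE_BITS
        lane_sum = ((a >> shift) & LANE_MASK) + ((b >> shift) & LANE_MASK)
        # The carry out of a lane is dropped so it cannot spill into the next lane.
        result |= (lane_sum & LANE_MASK) << shift
    return result

def multiply_accumulate(xs, ys):
    """Dot product expressed as repeated multiply-and-accumulate steps."""
    acc = 0
    for x, y in zip(xs, ys):
        acc += x * y              # one MAC operation per element pair
    return acc
```

In hardware the lane loop disappears: the ALU simply breaks its carry chain at the lane boundaries, so one add instruction does the work of four.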

Frequency of Operations Usage

Rank | 80x86 instruction | Integer average (% total executed)
1 | Load | 22%
2 | Conditional branch | 20%
3 | Compare | 16%
4 | Store | 12%
5 | Add | 8%
6 | And | 6%
7 | Sub | 5%
8 | Move register-register | 4%
9 | Call | 1%
10 | Return | 1%
Total | | 96%

Make the common case fast by focusing on these operations

• The most widely executed instructions are the simple operations of an instruction set
• The above is the average usage in SPECint92 on Intel 80x86

Control Flow Instructions

• Jump: for unconditional change in the control flow
• Branch: for conditional change in the control flow
• Procedure calls and returns

Data is based on SPEC on Alpha

Destination Address Definition

• Relative addressing with respect to the program counter proved to be the best choice for forward and backward branches or jumps (load address independent)
• To allow for dynamic loading of library routines, register indirect addressing allows addresses to be loaded in special registers (e.g. virtual functions in C++ and system calls in a case statement)

Data is based on SPEC on Alpha

Condition Evaluation

• Compare-and-branch can be efficient if the majority of conditions are comparisons with zero

Remember to focus on the common case

Based on SPEC on MIPS

Frequency of Types of Comparison

Data is based on SPEC on Alpha

Different benchmarks and machines set new design priorities

DSPs support a repeat instruction for for-loops (vectors) using registers

Supporting Procedures

• Execution of a procedure follows these steps:
  - Store parameters in a place accessible to the procedure
  - Transfer control to the procedure
  - Acquire the storage resources needed for the procedure
  - Perform the desired task
  - Store the result value in a place accessible to the calling program
  - Return control to the point of origin
• The hardware provides a program counter to trace instruction flow and manage transfer of control

Parameter Passing

• Registers can be used for passing a small number of parameters
• A stack is used to spill registers of the current context and make room for the called procedure to run, and to allow large parameters to be passed
• Storage of machine state can be performed by the caller or the callee
• Handling of shared variables is important to ensure correct semantics and thus requires clear specifications in the library interface

Global variables stored in registers need careful handling

Type and Size of Operands

• The type of an operand is designated by encoding it in the instruction's operation code
• The type of an operand, e.g. single precision float, effectively gives its size
• Common operand types include character, half word and word size integer, and single- and double-precision floating point
• Characters are almost always in ASCII, integers are in 2's complement, and floating point is in IEEE 754
• The 16-bit Unicode used in Java is gaining popularity due to its support for international character sets
• For business applications, some architectures support a decimal format in binary coded decimal (BCD)
• Depending on the size of the word, the complexity of handling different operand types differs
• DSP offers fixed point data types to support high precision floating point arithmetic and to allow sharing a single exponent for multiple numbers
• For graphics applications, vertex and pixel operands are added features
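The encodings named above can be made concrete with a short sketch. The helper names are illustrative assumptions; `struct` is Python's standard library module for packing values into bytes, and the BCD layout shown is the packed form with one decimal digit per 4-bit nibble.

```python
import struct

def twos_complement(value: int, bits: int) -> int:
    """Bit pattern of a signed integer in `bits`-bit 2's complement."""
    return value & ((1 << bits) - 1)

def to_bcd(value: int) -> int:
    """Packed BCD of a non-negative integer: one decimal digit per nibble."""
    result = 0
    shift = 0
    while True:
        result |= (value % 10) << shift
        value //= 10
        shift += 4
        if value == 0:
            return result

def ieee754_single_bits(x: float) -> int:
    """32-bit pattern of x in IEEE 754 single precision."""
    return struct.unpack(">I", struct.pack(">f", x))[0]

# A character in 16-bit Unicode occupies two bytes.
unicode_size = len("A".encode("utf-16-be"))
```

Note that `to_bcd(1947)` gives the pattern 0x1947: the hexadecimal digits read like the decimal number, which is what makes BCD convenient for decimal arithmetic and character conversion.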

Size of Operands

• The double-word data type is used for double-precision floating point operations and for address storage in machines with a 64-bit wide address bus
• Words are used for integer operations and for 32-bit address bus machines
• Because of the mix in SPEC, word and double-word data types dominate

Frequency of reference by size based on SPEC2000 on Alpha

Instruction Representation

• Humans are taught to think in base 10 (decimal) but numbers may be represented in any base (123 in base 10 = 1111011 in binary or base 2)
• Numbers are stored in computers as a series of high and low electronic signals (binary numbers)
• Binary digits are called bits and are considered the atom of computing
• Each piece of an instruction is a number, and placing these numbers together forms the instruction
• The assembler translates the symbolic assembly instructions into machine language instructions (machine code)
• Example:

Assembly: add $t0, $s1, $s2

M/C language (decimal): 0 | 17 | 18 | 8 | 0 | 32

M/C language (binary): 000000 | 10001 | 10010 | 01000 | 00000 | 100000

Field widths: 6 bits | 5 bits | 5 bits | 5 bits | 5 bits | 6 bits

Note: the MIPS compiler by default maps $s0, ..., $s7 to reg. 16-23 and $t0, ..., $t7 to reg. 8-15
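Placing the field numbers together can be sketched as shifts and ORs. The example assumes the standard MIPS R-format layout (6/5/5/5/5/6 bits) and the instruction add $t0, $s1, $s2; the dictionary and function names are illustrative only.

```python
# Default register numbering: $t0..$t7 are registers 8-15, $s0..$s7 are 16-23.
REGISTER_NUMBERS = {f"$t{i}": 8 + i for i in range(8)}
REGISTER_NUMBERS.update({f"$s{i}": 16 + i for i in range(8)})

def encode_r_format(op, rs, rt, rd, shamt, funct):
    """Pack the six R-format fields into one 32-bit instruction word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct

ADD_FUNCT = 32  # funct value that selects add when op is 0

# add $t0, $s1, $s2 : rd = $t0, rs = $s1, rt = $s2
word = encode_r_format(
    0,
    REGISTER_NUMBERS["$s1"],
    REGISTER_NUMBERS["$s2"],
    REGISTER_NUMBERS["$t0"],
    0,
    ADD_FUNCT,
)
fields = f"{word:032b}"  # the same word as a 32-character bit string
```

Note that the destination register appears first in assembly but sits in the fourth field (rd) of the machine word.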

Encoding an Instruction Set

• Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation
• The operation is typically specified in one field called the opcode
• The addressing mode for the operand can be encoded with the operation, or specified through a separate identifier in case of a large number of supported modes
• The architecture must balance between several competing factors:
  - Desire to support as many registers and addressing modes as possible
  - Effect of operand specification on the size of the instruction (program)
  - Desire to simplify instruction fetching and decoding during execution
• Fixed size instruction encoding simplifies the CPU design while limiting the addressing modes supported
• An architect caring about code size can use variable size encoding
• A hybrid approach is to allow variability by supporting multiple instruction sizes

Encoding Examples

MIPS Instruction format

Register-format instructions:

op | rs | rt | rd | shamt | funct
6 bits | 5 bits | 5 bits | 5 bits | 5 bits | 6 bits

op: Basic operation of the instruction, traditionally called opcode
rs: The first register source operand
rt: The second register source operand
rd: The register destination operand; it gets the result of the operation
shamt: Shift amount
funct: This field selects the specific variant of the operation of the op field

Immediate-type instructions:

op | rs | rt | address
6 bits | 5 bits | 5 bits | 16 bits

• Some instructions need longer fields than provided, for large value constants
• The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register
• Example: lw $t0, 32($s3)   # Temporary register $t0 gets A[8]

Instruction | Format | op | rs | rt | rd | shamt | funct | address
add | R | 0 | reg | reg | reg | 0 | 32 | N/A
sub | R | 0 | reg | reg | reg | 0 | 34 | N/A
lw | I | 35 | reg | reg | N/A | N/A | N/A | address
sw | I | 43 | reg | reg | N/A | N/A | N/A | address
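The immediate format and its ±2^15 byte reach can be sketched in the same style. The instruction lw $t0, 32($s3) is assumed as the example (register $s3 is number 19 and $t0 is number 8 in the default MIPS numbering); `encode_i_format` is a hypothetical helper name.

```python
def encode_i_format(op, rs, rt, offset):
    """Pack an I-format instruction; the offset must fit in 16 signed bits."""
    if not -(1 << 15) <= offset < (1 << 15):
        raise ValueError("offset does not fit in 16 signed bits")
    # Negative offsets are stored in 2's complement in the low 16 bits.
    return (op << 26) | (rs << 21) | (rt << 16) | (offset & 0xFFFF)

LW_OPCODE = 35

# lw $t0, 32($s3): base register rs = 19 ($s3), destination rt = 8 ($t0)
lw_word = encode_i_format(LW_OPCODE, 19, 8, 32)
lw_negative = encode_i_format(LW_OPCODE, 19, 8, -4)

def fits(offset):
    """True when the offset can be encoded in the 16-bit address field."""
    try:
        encode_i_format(LW_OPCODE, 19, 8, offset)
        return True
    except ValueError:
        return False
```

An offset of 32768 is rejected, which is why larger constants need extra instructions to build the address in a register first.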

The Stored Program Concept

• Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept
• Today's computers are built on two key principles:
  - Instructions are represented as numbers
  - Programs can be stored in memory to be read or written just like numbers
• The power of the concept: memory can contain
  - the source code for an editor
  - the compiled m/c code for the editor
  - the text that the compiled program is using
  - the compiler that generated the code

[Figure: a processor connected to a memory that holds an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and source code in C for the editor program]

Compiling if-then-else in MIPS

Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiled MIPS code for the following C if statement?

if (i == j) f = g + h; else f = g - h;

[Figure: flowchart. Test i == j; when i = j execute f = g + h; when i ≠ j go to Else and execute f = g - h; both paths join at Exit]

MIPS:

      bne $s3, $s4, Else    # go to Else if i ≠ j
      add $s0, $s1, $s2     # f = g + h (skipped if i ≠ j)
      j Exit
Else: sub $s0, $s1, $s2     # f = g - h (skipped if i = j)
Exit:

Typical Compilation

Major Types of Optimization

Optimization name | Explanation | Frequency
High level | At or near source level; machine indep. |
Procedure integration | Replace procedure call by procedure body | N.M.
Local | Within straight line code |
Common subexpression elimination | Replace two instances of the same computation by a single copy | 18%
Constant propagation | Replace all instances of a variable that is assigned a constant with the constant | 22%
Stack height reduction | Rearrange expression tree to minimize resources needed for expression evaluation | N.M.
Global | Across a branch |
Global common subexpression elimination | Same as local, but this version crosses branches | 13%
Copy propagation | Replace all instances of a variable A that has been assigned X (i.e. A = X) with X | 11%
Code motion | Remove code from a loop that computes the same value each iteration of the loop | 16%
Induction variable elimination | Simplify/eliminate array addressing calculations within loops | 2%
Machine-dependent | Depends on machine knowledge |
Strength reduction | Many examples, such as replacing a multiply by a constant with adds and shifts | N.M.
Pipeline scheduling | Reorder instructions to improve pipeline performance | N.M.

Effect of Compiler Optimization

[Figure: measured effect of compiler optimization level on each program]

Level 0: non-optimized code
Level 1: local optimization
Level 2: global optimization, s/w pipelining
Level 3: adds procedure integration

Compiler Support for Multimedia Instr.

• Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics)
• Intel added a new set of instructions called Streaming SIMD Extension
• A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer
• Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations
  - Strided addressing allows memory access in increments larger than one
  - Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data
• Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup
• Such limited support for vector processing makes the use of vectorizing compiler optimization unpopular and restricts its scope to hand coded routines

SIMD instructions on MMX and AltiVec tend to be solutions, not primitives
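Strided and gather/scatter access can be sketched with a flat Python list standing in for memory. The function names and the 3 x 4 row-major matrix are assumptions made for illustration; a vector machine would perform each of these as a single vector load or store.

```python
def strided_load(memory, base, stride, count):
    """Load `count` elements starting at `base`, stepping `stride` locations each time."""
    return [memory[base + i * stride] for i in range(count)]

def gather_load(memory, addresses):
    """Load from an explicit list of addresses (the addresses are stored, not the data)."""
    return [memory[a] for a in addresses]

def scatter_store(memory, addresses, values):
    """Store each value to its corresponding address."""
    for address, value in zip(addresses, values):
        memory[address] = value

# A 3 x 4 matrix stored row by row: element (r, c) lives at index 4 * r + c.
matrix = list(range(12))
column_1 = strided_load(matrix, 1, 4, 3)      # stride 4 walks down a column
picked = gather_load(matrix, [7, 0, 10])
scatter_store(matrix, [0, 11], [-1, -2])
```

Without strided access, reading a matrix column takes one scalar load per element, which is the limitation on speedup noted above.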

Starting a Program

[Figure: C program → Compiler → Assembly language program → Assembler → Object: machine language module, together with Object: library routine (machine language) → Linker → Executable: machine language program → Loader → Memory]

Linker:
  - Place code & data modules symbolically in memory
  - Determine the address of data & instruction labels
  - Patch both internal & external references

Object files for Unix typically contain:
  - Header: size & position of components
  - Text segment: machine code
  - Data segment: static and dynamic variables
  - Relocation info: identify absolute memory references
  - Symbol table: name & location of labels, procedures and variables
  - Debugging info: mapping source to object code, break points, etc.

Loading Executable Program

[Figure: memory layout. Reserved region at address 0; Text from 0040 0000 hex (pc); Static data from 1000 0000 hex, with $gp at 1000 8000 hex; Dynamic data above the static data; Stack at the top, with $sp at 7fff ffff hex]

To load an executable, the operating system follows these steps:
  - Reads the executable file header to determine the size of the text and data segments
  - Creates an address space large enough for the text and data
  - Copies the instructions and data from the executable file into memory
  - Copies the parameters (if any) to the main program onto the stack
  - Initializes the machine registers and sets the stack pointer to the first free location
  - Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program

Instruction Set Design Issues

• Instruction Set Design Issues
  - Number of Addresses
  - Flow of Control
  - Operand Types
  - Addressing Modes
  - Instruction Types
  - Instruction Formats

Number of Addresses

• Four categories
  - 3-address machines
    · 2 for the source operands and one for the result
  - 2-address machines
    · One address doubles as source and result
  - 1-address machines
    · Accumulator machines
    · Accumulator is used for one source and result
  - 0-address machines
    · Stack machines
    · Operands are taken from the stack
    · Result goes onto the stack

Number of Addresses (cont.)

• Three-address machines
  - Two for the source operands, one for the result
  - RISC processors use three addresses
  - Sample instructions

    add  dest,src1,src2   ; M(dest)=[src1]+[src2]
    sub  dest,src1,src2   ; M(dest)=[src1]-[src2]
    mult dest,src1,src2   ; M(dest)=[src1]*[src2]

Three addresses: Operand 1, Operand 2, Result

Example: a = b + c

Three-address instruction formats are not common because they require a relatively long instruction format to hold the three address references

Number of Addresses (cont.)

• Example
  - C statement

    A = B + C * D - E + F + A

  - Equivalent code:

    mult T,C,D   ; T = C*D
    add  T,T,B   ; T = B+C*D
    sub  T,T,E   ; T = B+C*D-E
    add  T,T,F   ; T = B+C*D-E+F
    add  A,T,A   ; A = B+C*D-E+F+A

Number of Addresses (cont.)

• Two-address machines
  - One address doubles (for source operand & result)
  - Last example makes a case for it
    · Address T is used twice
  - Sample instructions

    load dest,src   ; M(dest)=[src]
    add  dest,src   ; M(dest)=[dest]+[src]
    sub  dest,src   ; M(dest)=[dest]-[src]
    mult dest,src   ; M(dest)=[dest]*[src]

Two addresses: one address doubles as operand and result

Example: a = a + b

The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation

Number of Addresses (cont.)

• Example
  - C statement

    A = B + C * D - E + F + A

  - Equivalent code:

    load T,C   ; T = C
    mult T,D   ; T = C*D
    add  T,B   ; T = B+C*D
    sub  T,E   ; T = B+C*D-E
    add  T,F   ; T = B+C*D-E+F
    add  A,T   ; A = B+C*D-E+F+A

Number of Addresses (cont.)

• One-address machines
  - Use a special set of registers called accumulators
    · Specify one source operand & receive the result
  - Called accumulator machines
  - Sample instructions

    load  addr   ; accum = [addr]
    store addr   ; M[addr] = accum
    add   addr   ; accum = accum + [addr]
    sub   addr   ; accum = accum - [addr]
    mult  addr   ; accum = accum * [addr]

Number of Addresses (cont.)

• Example
  - C statement

    A = B + C * D - E + F + A

  - Equivalent code:

    load  C   ; load C into accum
    mult  D   ; accum = C*D
    add   B   ; accum = C*D+B
    sub   E   ; accum = B+C*D-E
    add   F   ; accum = B+C*D-E+F
    add   A   ; accum = B+C*D-E+F+A
    store A   ; store accum contents in A

Number of Addresses (cont.)

• Zero-address machines
  - Stack supplies operands and receives the result
    · Special instructions to load and store use an address
  - Called stack machines (Ex: HP3000, Burroughs B5500)
  - Sample instructions

    push addr   ; push([addr])
    pop  addr   ; pop([addr])
    add         ; push(pop + pop)
    sub         ; push(pop - pop)
    mult        ; push(pop * pop)

Number of Addresses (cont.)

• Example
  - C statement

    A = B + C * D - E + F + A

  - Equivalent code:

    push E
    push C
    push D
    mult
    push B
    add
    sub
    push F
    add
    push A
    add
    pop  A
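The zero-address code for A = B + C * D - E + F + A can be checked with a tiny stack-machine interpreter. This is a sketch under stated assumptions: variable values are arbitrary, memory is a dictionary keyed by variable name, and sub follows the push(pop - pop) definition, so the first value popped (the top of the stack) is the left operand.

```python
def run_stack_program(program, memory):
    """Execute zero-address instructions against `memory` and return it."""
    stack = []
    for instruction in program:
        opcode, *operand = instruction.split()
        if opcode == "push":
            stack.append(memory[operand[0]])      # push([addr])
        elif opcode == "pop":
            memory[operand[0]] = stack.pop()      # store top of stack at addr
        elif opcode == "add":
            stack.append(stack.pop() + stack.pop())
        elif opcode == "sub":
            top = stack.pop()                     # first pop is the left operand
            stack.append(top - stack.pop())
        elif opcode == "mult":
            stack.append(stack.pop() * stack.pop())
        else:
            raise ValueError(f"unknown instruction: {instruction}")
    return memory

program = [
    "push E", "push C", "push D", "mult", "push B", "add",
    "sub", "push F", "add", "push A", "add", "pop A",
]
# Arbitrary test values for the variables.
memory = run_stack_program(program, {"A": 7, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6})
```

Pushing E first is what makes the later sub compute (B + C * D) - E rather than the reverse, since only the stack order identifies the operands.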

Load/Store Architecture

• Instructions expect operands in internal processor registers
  - Special LOAD and STORE instructions move data between registers and memory
  - RISC uses this architecture
  - Reduces instruction length

Load/Store Architecture (cont.)

• Sample instructions

    load  Rd,addr      ; Rd = [addr]
    store addr,Rs      ; (addr) = Rs
    add   Rd,Rs1,Rs2   ; Rd = Rs1 + Rs2
    sub   Rd,Rs1,Rs2   ; Rd = Rs1 - Rs2
    mult  Rd,Rs1,Rs2   ; Rd = Rs1 * Rs2

Number of Addresses (cont.)

• Example
  - C statement

    A = B + C * D - E + F + A

  - Equivalent code:

    load  R1,B
    load  R2,C
    load  R3,D
    load  R4,E
    load  R5,F
    load  R6,A
    mult  R2,R2,R3
    add   R2,R2,R1
    sub   R2,R2,R4
    add   R2,R2,R5
    add   R2,R2,R6
    store A,R2

Flow of Control

• Default is sequential flow
• Several instructions alter this default execution
  - Branches
    · Unconditional
    · Conditional
    · Delayed branches
  - Procedure calls
    · Delayed procedure calls

Flow of Control (cont.)

• Branches
  - Unconditional
    · Absolute address
    · PC-relative
      Target address is specified relative to PC contents
      Relocatable code
  - Example: MIPS
    · Absolute address

      j target

    · PC-relative

      b target

Flow of Control (cont.)

• Branches
  - Conditional
    · Jump is taken only if the condition is met
  - Two types
    · Set-Then-Jump
      Condition testing is separated from branching
      Condition code registers are used to convey the condition test result
      Condition code registers keep a record of the status of the last ALU operation, such as an overflow condition
    · Example: Pentium code

      cmp AX,BX    ; compare AX and BX
      je  target   ; jump if equal

Flow of Control (cont.)

    · Test-and-Jump
      Single instruction performs condition testing and branching
    · Example: MIPS instruction

      beq Rsrc1,Rsrc2,target

      Jumps to target if Rsrc1 = Rsrc2

• Delayed branching
  - Control is transferred after executing the instruction that follows the branch instruction
    · This instruction slot is called the delay slot
  - Improves efficiency
  - Highly pipelined RISC processors support it

Flow of Control (cont.)

• Procedure calls
  - Facilitate modular programming
  - Require two pieces of information to return
    · End of procedure
      Pentium uses the ret instruction
      MIPS uses the jr instruction
    · Return address
      In a (special) register: MIPS allows any general-purpose register
      On the stack: Pentium

Flow of Control (cont.)

[Figure: delay slot]

Parameter Passing

• Two basic techniques
  - Register-based (e.g. PowerPC, MIPS)
    · Internal registers are used
      Faster
      Limit the number of parameters
      Recursive procedure
  - Stack-based (e.g. Pentium)
    · Stack is used
      More general

2. Operand Types

• Instructions support basic data types
  - Characters
  - Integers
  - Floating-point
• Instruction overload
  - Same instruction for different data types
  - Example: Pentium

    mov AL,address    ; loads an 8-bit value
    mov AX,address    ; loads a 16-bit value
    mov EAX,address   ; loads a 32-bit value

Operand Types (cont.)

• Separate instructions
  - Instructions specify the operand size
  - Example: MIPS

    lb Rdest,address   ; loads a byte
    lh Rdest,address   ; loads a halfword (16 bits)
    lw Rdest,address   ; loads a word (32 bits)
    ld Rdest,address   ; loads a doubleword (64 bits)

  - Similar instructions exist for store

3. Addressing Modes

• How the operands are specified
  - Operands can be in three places
    · Registers
      Register addressing mode
    · Part of instruction
      Constant
      Immediate addressing mode
      All processors support these two addressing modes
    · Memory
      Difference between RISC and CISC
      CISC supports a large variety of addressing modes
      RISC follows load/store architecture

4. Instruction Types

Several types of instructions

Data movement
- Pentium: mov dest,src
- Some do not provide direct data movement instructions
- Indirect data movement
    add Rdest,Rsrc,0    ; Rdest = Rsrc+0

Arithmetic and Logical
- Arithmetic
  - Integer and floating-point, signed and unsigned
  - add, subtract, multiply, divide
- Logical
  - and, or, not, xor

Instruction Types (cont.)

Condition code bits
- S: Sign bit (0 = +, 1 = -)
- Z: Zero bit (0 = nonzero, 1 = zero)
- O: Overflow bit (0 = no overflow, 1 = overflow)
- C: Carry bit (0 = no carry, 1 = carry)

Example: Pentium
    cmp count,25    ; compare count to 25 (subtract 25 from count)
    je  target      ; jump if equal
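The way a compare sets the condition code bits can be sketched in Python. This is a minimal model, assuming 32-bit two's-complement operands and treating the carry bit as the borrow of the subtraction; the function names `compare_flags` and `je_taken` are hypothetical.

```python
def compare_flags(a, b, bits=32):
    """Flags produced by computing a - b on bits-wide two's-complement values."""
    mask = (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "S": int(bool(result & sign_bit)),   # 1 = negative result
        "Z": int(result == 0),               # 1 = zero result
        "C": int(a < b),                     # 1 = borrow out of the subtraction
        # Overflow: operand signs differ and the result sign differs from a's sign.
        "O": int(bool((a ^ b) & (a ^ result) & sign_bit)),
    }


def je_taken(flags):
    """A jump-if-equal is taken when the zero bit is set."""
    return flags["Z"] == 1


print(compare_flags(25, 25), je_taken(compare_flags(25, 25)))
print(compare_flags(24, 25), je_taken(compare_flags(24, 25)))
```

The compare discards the difference and keeps only the flags, so the following conditional jump needs no operands beyond its target.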

Instruction Types (cont.)

Flow control and I/O instructions
- Branch
- Procedure call
- Interrupts

I/O instructions
- Memory-mapped I/O
  - Most processors support memory-mapped I/O
  - No separate instructions for I/O
- Isolated I/O
  - Pentium supports isolated I/O
  - Separate I/O instructions
      in  AX,io_port    ; read from an I/O port
      out io_port,AX    ; write to an I/O port

5. Instruction Formats

Two types

Fixed-length
- Used by RISC processors
- 32-bit RISC processors use 32-bit-wide instructions
  - Examples: SPARC, MIPS, PowerPC

Variable-length
- Used by CISC processors
- Memory operands need more bits to specify

Opcode
- Major and exact operation

Examples of Instruction Formats

RISC (Reduced Instruction Set Computer)
versus
CISC (Complex Instruction Set Computer)

RISC vs. CISC

The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.

RISC systems access memory only with explicit load and store instructions.

In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.

The difference between CISC and RISC becomes evident through the basic computer performance equation:

time/program = (time/cycle) x (cycles/instruction) x (instructions/program)

RISC systems shorten execution time by reducing the clock cycles per instruction.

CISC systems improve performance by reducing the number of instructions per program.

The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.

The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.

With fixed-length instructions, RISC lends itself to pipelining and speculative execution.

Consider these program fragments:

CISC                    RISC
mov ax, 10              mov ax, 0
mov bx, 5               mov bx, 10
mul bx, ax              mov cx, 5
                 Begin: add ax, bx
                        loop Begin

The total clock cycles for the CISC version might be:
(2 movs x 1 cycle) + (1 mul x 30 cycles) = 32 cycles

While the clock cycles for the RISC version are:
(3 movs x 1 cycle) + (5 adds x 1 cycle) + (5 loops x 1 cycle) = 13 cycles

With the RISC clock cycle being shorter, RISC gives us much faster execution speeds.
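The cycle comparison can be checked with a short Python sketch. The instruction mixes (2 movs and 1 mul for the CISC fragment; 3 movs, 5 adds and 5 loops for the RISC fragment) and the per-instruction costs (1 cycle each, 30 for mul) are assumptions corresponding to a 10 x 5 multiplication; the helper names are hypothetical.

```python
def total_cycles(counts, cycles_per_instruction):
    """Sum instruction_count * cycles over every instruction kind."""
    return sum(n * cycles_per_instruction[op] for op, n in counts.items())


def execution_time(cycles, seconds_per_cycle):
    """time/program = cycles/program * time/cycle."""
    return cycles * seconds_per_cycle


# Assumed costs: simple instructions take 1 cycle, mul takes 30.
CPI = {"mov": 1, "add": 1, "loop": 1, "mul": 30}

cisc = total_cycles({"mov": 2, "mul": 1}, CPI)             # 2*1 + 1*30
risc = total_cycles({"mov": 3, "add": 5, "loop": 5}, CPI)  # 3*1 + 5*1 + 5*1

print("CISC cycles:", cisc)
print("RISC cycles:", risc)
```

The RISC fragment executes more instructions but fewer cycles, so it finishes sooner whenever its clock period is no longer than that of the CISC machine.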

Because of their load-store ISAs, RISC architectures require a large number of CPU registers.

These registers provide fast access to data during sequential program execution.

They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.

Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.

This is how registers can be overlapped in a RISC system.

The current window pointer (CWP) points to the active register window.

It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.

Some RISC systems provide more extravagant instruction sets than some CISC systems.

Some systems combine both approaches.

The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.

RISC                                        CISC
Multiple register sets                      Single register set
Three operands per instruction              One or two register operands per instruction
Parameter passing through register windows  Parameter passing through memory
Single-cycle instructions                   Multiple-cycle instructions
Hardwired control                           Microprogrammed control
Highly pipelined                            Less pipelined

(continued)

RISC                                        CISC
Simple instructions, few in number          Many complex instructions
Fixed-length instructions                   Variable-length instructions
Complexity in compiler                      Complexity in microcode
Only LOAD/STORE instructions access memory  Many instructions can access memory
Few addressing modes                        Many addressing modes

Summary

Instruction Set Design Issues

Instruction set design issues include:
- Where are operands stored?
  - registers, memory, stack, accumulator
- How many explicit operands are there?
  - 0, 1, 2, or 3
- How is the operand location specified?
  - register, immediate, indirect, ...
- What type and size of operands are supported?
  - byte, int, float, double, string, vector, ...
- What operations are supported?
  - add, sub, mul, move, compare, ...

More About General Purpose Registers

Why do almost all new architectures use GPRs?
- Registers are much faster than memory (even cache)
  - Register values are available immediately
  - When memory isn't ready, the processor must wait ("stall")
- Registers are convenient for variable storage
  - Compiler assigns some variables just to registers
  - More compact code, since small fields specify registers (compared to memory addresses)

[Figure: storage hierarchy: processor registers, cache, memory, disk]

What Operations are Needed

- Arithmetic and Logical
  - Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
  - Logical operations: AND, OR, XOR, NOT
- Data Transfer: copy, load, store
- Control: branch, jump, call, return
- Floating Point: ADDF, MULF, DIVF (same as arithmetic but usually take bigger operands)
- Decimal: ADDD, CONVERT
- String: move, compare, search
- Graphics: pixel and vertex, compression/decompression operations

Stack Architecture: Pros and Cons

Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures

Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures

Accumulator Architecture: Pros and Cons

Pros
- Very low hardware requirements
- Easy to design and understand

Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic

Memory-Memory Architecture: Pros and Cons

Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)

Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required

Memory-Register Architecture: Pros and Cons

Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density

Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers

Register-to-Memory Architectures

Memory-to-Memory Architectures

Instruction Formats

Instruction Set Architecture (ISA)

- To command a computer's hardware, you must speak its language
- The words of a machine's language are called instructions, and its vocabulary is called an instruction set
- Once you learn one machine language, it is easy to pick up others:
  - There are few fundamental operations that all computers must provide
  - All designers have the same goal of finding a language that simplifies building the hardware and the compiler while maximizing performance and minimizing cost
- Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept
- The MIPS instruction set is used as a case study

Interface Design

A good interface:
- Lasts through many implementations (portability, compatibility)
- Is used in many different ways (generality)
- Provides convenient functionality to higher levels
- Permits an efficient implementation at lower levels

Design decisions must take into account:
- Technology
- Machine organization
- Programming languages
- Compiler technology
- Operating systems

[Figure: one interface with several uses above it and several implementations (imp 0, imp 1, ...) below it, over time]

Classifying Instruction Set Architectures

Accumulator Architecture
- Common in early stored-program computers, when hardware was so expensive
- Machine has only one register (accumulator) involved in all math and logical operations
- All operations assume the accumulator as a source operand and a destination for the operation, with the other operand stored in memory

Extended Accumulator Architecture
- Dedicated registers for specific operations, e.g. stack and array index registers, added
- The 8086 microprocessor is an example of such a special-purpose register architecture

General-Purpose Register Architecture
- MIPS is an example of such an architecture, where registers are not tied to a single role
- This type of instruction set can be further divided into:
  - Register-memory: allows one operand to be in memory
  - Register-register (load-store): demands all operands to be in registers

Machine          # general-purpose registers   Architecture style               Year
Motorola 6800    2                             Accumulator                      1974
DEC VAX          16                            Register-memory, memory-memory   1977
Intel 8086       1                             Extended accumulator             1978
Motorola 68000   16                            Register-memory                  1980
Intel 80386      8                             Register-memory                  1985
PowerPC          32                            Load-store                       1992
DEC Alpha        32                            Load-store                       1992

Compact Code and Stack Architectures

- When memory is scarce, machines like the Intel 80x86 had variable-length instructions to match varying operand specifications and minimize code size
- Stack machines abandoned registers altogether, arguing that it is hard for compilers to use them efficiently
- Operands are pushed onto a stack from memory, and the results have to be popped from the stack to memory
- Operations take their operands by default from the top of the stack and insert the results back onto the stack
- Stack machines simplify compilers and lend themselves to a compact instruction encoding, but limit compiler optimization (e.g. in math expressions)

Example: A = B + C
    Push AddressC    # Top=Top+4; Stack[Top]=Memory[AddressC]
    Push AddressB    # Top=Top+4; Stack[Top]=Memory[AddressB]
    add              # Stack[Top-4]=Stack[Top]+Stack[Top-4]; Top=Top-4
    Pop  AddressA    # Memory[AddressA]=Stack[Top]; Top=Top-4

- Compact code is important for heralded network computers, where programs must be downloaded over the Internet (e.g. Java-based applications)
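The push/add/pop sequence for A = B + C can be traced with a small Python sketch of a stack machine. It is a simplified model: memory is a dictionary keyed by variable name rather than by byte address, the stack is a Python list, and the function name `run_stack_program` and the values of B and C are hypothetical.

```python
def run_stack_program(program, memory):
    """Execute push/add/pop instructions against a dict-based memory."""
    stack = []
    for op, *operand in program:
        if op == "push":
            stack.append(memory[operand[0]])   # Stack[Top] = Memory[addr]
        elif op == "add":
            top = stack.pop()
            stack.append(stack.pop() + top)    # two top entries replaced by their sum
        elif op == "pop":
            memory[operand[0]] = stack.pop()   # Memory[addr] = Stack[Top]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory


memory = {"A": 0, "B": 7, "C": 5}
program = [("push", "C"), ("push", "B"), ("add",), ("pop", "A")]
run_stack_program(program, memory)
print(memory["A"])
```

Note that `add` names no operands at all, which is where the compact encoding of stack machines comes from.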

Other Types of Architecture

High-Level-Language Architecture
- In the 1960s, systems software was rarely written in high-level languages, and virtually every commercial operating system before Unix was written in assembly
- Some people blamed the code density on the instruction set rather than the programming language
- A machine design philosophy was advocated with the goal of making the hardware more like high-level languages
- The effectiveness of high-level languages, memory size limitations, and the lack of efficient compilers doomed this philosophy to a historical footnote

Reduced Instruction Set Architecture
- With the recent development in compiler technology and expanded memory sizes, fewer programmers are using assembly-level coding
- Instruction set architectures came to be measured by how well compilers, rather than programmers, use them
- RISC architecture favors simplifying hardware design over enriching the offered set of instructions, relying on compilers to use them effectively to perform complex operations
- Virtually all new architectures since 1982 follow the RISC philosophy of fixed instruction lengths, load-store operations, and limited addressing modes

Evolution of Instruction Sets

[Figure: evolution of instruction sets]
- Single Accumulator (EDSAC)
- Accumulator + Index Registers (Manchester Mark I, IBM 700 series)
- Separation of Programming Model from Implementation
  - High-level Language Based
  - Concept of a Family
- General Purpose Register Machines
  - Complex Instruction Sets (VAX, Intel 432)
  - Load/Store Architecture (CDC 6600, Cray 1)
- RISC (MIPS, SPARC, IBM RS6000, ...)

Register-Memory Architectures

Effect of the number of memory operands:

# memory addresses   Max. number of operands   Examples
0                    3                         SPARC, MIPS, PowerPC, ALPHA
1                    2                         Intel 80x86, Motorola 68000
2                    2                         VAX (also has 3-operand format)
3                    3                         VAX (also has 2-operand format)

Memory Address

Interpreting Memory Addressing
- The address of a word matches the byte address of one of its 4 bytes
- The addresses of sequential words differ by 4 (word size in bytes)
- Word addresses are multiples of 4 (alignment restriction)
- Machines that use the address of the leftmost byte as the word address are called "Big Endian", and those that use the rightmost byte are called "Little Endian"
- Misalignment complicates memory access and causes programs to run slower (some machines do not allow misaligned memory access at all)
- Byte ordering can be a problem when exchanging data among different machines
- Byte addressing affects array index calculation, which must account for word addressing and the offset within the word

Object addressed   Aligned at byte offsets   Misaligned at byte offsets
Byte               0, 1, 2, 3, 4, 5, 6, 7    Never
Half word          0, 2, 4, 6                1, 3, 5, 7
Word               0, 4                      1, 2, 3, 5, 6, 7
Double word        0                         1, 2, 3, 4, 5, 6, 7
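The alignment rule and the two byte orders can be illustrated with a short Python sketch. The helper names `is_aligned` and `word_bytes` and the sample value 0x12345678 are assumptions for the example.

```python
import struct


def is_aligned(address, size):
    """An object of size bytes is aligned when its address is a multiple of size."""
    return address % size == 0


def word_bytes(value, byte_order):
    """Bytes of a 32-bit word in memory order; byte_order is 'big' or 'little'."""
    fmt = ">I" if byte_order == "big" else "<I"
    return list(struct.pack(fmt, value))


# Byte offsets 0..7 at which half words, words and double words are aligned.
for name, size in (("half word", 2), ("word", 4), ("double word", 8)):
    print(name, [a for a in range(8) if is_aligned(a, size)])

print([hex(b) for b in word_bytes(0x12345678, "big")])
print([hex(b) for b in word_bytes(0x12345678, "little")])
```

The two byte lists are mirror images of each other, which is why data exchanged between machines of different endianness must be converted.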

Addressing Modes

- Addressing modes refer to how to specify the location of an operand (effective address)
- Addressing modes have the ability to:
  - Significantly reduce instruction counts
  - Increase the average CPI
  - Increase the complexity of building a machine
- The VAX machine is used for benchmark data since it supports a wide range of memory addressing modes
- Common addressing modes can be classified based on:
  - the source of the data, into register, immediate, or memory
  - the address calculation, into direct and indirect
- An indexed addressing mode is usually provided to allow efficient implementation of loops and array access

Example of Addressing Modes

Register
    Example:   Add R4, R3
    Meaning:   Regs[R4] ← Regs[R4] + Regs[R3]
    When used: When a value is in a register

Immediate
    Example:   Add R4, #3
    Meaning:   Regs[R4] ← Regs[R4] + 3
    When used: For constants

Displacement
    Example:   Add R4, 100(R1)
    Meaning:   Regs[R4] ← Regs[R4] + Mem[100 + Regs[R1]]
    When used: Accessing local variables

Register indirect
    Example:   Add R4, (R1)
    Meaning:   Regs[R4] ← Regs[R4] + Mem[Regs[R1]]
    When used: Accessing using a pointer or a computed address

Indexed
    Example:   Add R4, (R1 + R2)
    Meaning:   Regs[R4] ← Regs[R4] + Mem[Regs[R1] + Regs[R2]]
    When used: Sometimes useful in array addressing: R1 = base of the array, R2 = index amount

Direct or absolute
    Example:   Add R4, (1001)
    Meaning:   Regs[R4] ← Regs[R4] + Mem[1001]
    When used: Sometimes useful for accessing static data; address constant may need to be large

Memory indirect or memory deferred
    Example:   Add R4, @(R3)
    Meaning:   Regs[R4] ← Regs[R4] + Mem[Mem[Regs[R3]]]
    When used: If R3 is the address of the pointer p, then the mode yields *p

Autoincrement
    Example:   Add R4, (R2)+
    Meaning:   Regs[R4] ← Regs[R4] + Mem[Regs[R2]]
               Regs[R2] ← Regs[R2] + d
    When used: Useful for stepping through arrays within a loop. R2 points to the start of the array; each reference increments R2 by d

Autodecrement
    Example:   Add R4, -(R2)
    Meaning:   Regs[R2] ← Regs[R2] - d
               Regs[R4] ← Regs[R4] + Mem[Regs[R2]]
    When used: Same use as autoincrement. Autodecrement/increment can also act as push/pop to implement a stack

Scaled
    Example:   Add R4, 100(R2)[R3]
    Meaning:   Regs[R4] ← Regs[R4] + Mem[100 + Regs[R2] + Regs[R3] * d]
    When used: Used to index arrays
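How several of these modes select their source operand can be sketched in Python. The register and memory contents below are invented for illustration, the function name `operand_value` and its keyword parameters are hypothetical, and the autoincrement/autodecrement modes are left out because they also modify a register.

```python
def operand_value(mode, regs, mem, *, reg=None, base=None, index=None,
                  const=0, d=4):
    """Return the source operand selected by some of the addressing modes."""
    if mode == "register":
        return regs[reg]
    if mode == "immediate":
        return const
    if mode == "displacement":
        return mem[const + regs[base]]
    if mode == "register_indirect":
        return mem[regs[base]]
    if mode == "indexed":
        return mem[regs[base] + regs[index]]
    if mode == "direct":
        return mem[const]
    if mode == "memory_indirect":
        return mem[mem[regs[base]]]
    if mode == "scaled":
        return mem[const + regs[base] + regs[index] * d]
    raise ValueError(f"unsupported mode: {mode}")


# Invented machine state, for illustration only.
regs = {"R1": 8, "R2": 16, "R3": 2, "R4": 1}
mem = {8: 40, 24: 7, 40: 99, 108: 5, 124: 3, 1001: 11}

# Add R4, 100(R1): Regs[R4] <- Regs[R4] + Mem[100 + Regs[R1]]
regs["R4"] += operand_value("displacement", regs, mem, base="R1", const=100)
print(regs["R4"])
```

Every mode reduces to a different way of forming one memory address (or none), which is why richer modes cut instruction counts but complicate the hardware.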

Addressing Mode for Signal Processing

DSPs offer special addressing modes to better serve popular algorithms. Special features require either hand coding or a compiler that uses such features (GNU would not be a good choice).

Modulo addressing
- Since DSPs deal with continuous data streams, circular buffers are widely used
- Circular or modulo addressing allows automatic increment and decrement, and resets the pointer when reaching the end of the buffer

Reverse addressing
- The resulting address is the reverse order of the current address
- Reverse addressing mode expedites the access, which otherwise requires a number of logical instructions or extra memory accesses

Fast Fourier Transform:
    0 (000)  →  0 (000)
    1 (001)  →  4 (100)
    2 (010)  →  2 (010)
    3 (011)  →  6 (110)
    4 (100)  →  1 (001)
    5 (101)  →  5 (101)
    6 (110)  →  3 (011)
    7 (111)  →  7 (111)
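Both DSP addressing modes are easy to model in software, which also shows the work the hardware saves. The sketch below is an illustration only; the function names `bit_reverse` and `modulo_increment` and the buffer addresses are hypothetical.

```python
def bit_reverse(index, bits):
    """Reverse the low bits of index, e.g. 001 becomes 100 for bits=3."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


def modulo_increment(pointer, start, length, step=1):
    """Advance a circular-buffer pointer, wrapping at either end of the buffer."""
    return start + (pointer - start + step) % length


# Access order for an 8-point FFT.
print([bit_reverse(i, 3) for i in range(8)])

# Walk a 4-entry circular buffer that starts at address 100.
p = 100
for _ in range(6):
    print(p, end=" ")
    p = modulo_increment(p, 100, 4)
print()
```

In software each access costs a loop or a modulo operation; a DSP performs the same address update as a side effect of the memory access.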

Operations of the Computer Hardware

"There must certainly be instructions for performing the fundamental arithmetic operations."
Burks, Goldstine and von Neumann, 1947

- Assembly language is a symbolic representation of what the processor actually understands
- The MIPS assembler allows only one instruction per line and ignores comments following # until the end of the line

Example: translation of a segment of a C program to MIPS assembly instructions

C:
    f = (g + h) - (i + j)

MIPS:
    add t0, g, h     # temp variable t0 contains "g + h"
    add t1, i, j     # temp variable t1 contains "i + j"
    sub f, t0, t1    # f = t0 - t1 = (g + h) - (i + j)

Operations in the Instruction Set

Operator type            Examples
Arithmetic and logical   Integer arithmetic and logical operations: add, and, subtract, or
Data transfer            Loads-stores (move instructions on machines with memory addressing)
Control                  Branch, jump, procedure call and return, trap
System                   Operating system call, virtual memory management instructions
Floating point           Floating point instructions: add, multiply
Decimal                  Decimal add, decimal multiply, decimal to character conversion
String                   String move, string compare, string search
Graphics                 Pixel operations, compression/decompression operations

- Arithmetic, logical, data transfer and control are almost standard categories for all machines
- System instructions are required for multi-programming environments, although support for system functions varies
- Decimal and string instructions can be primitives, e.g. on the IBM 360 and the VAX
- Support for floating point, decimal, string and graphics can be optional, sometimes provided via a co-processor
- Some machines rely on the compiler to synthesize special operations, such as string handling, from simpler instructions

Operations for Media and Signal Processing

Single instruction multiple data (SIMD) and vector instructions are often supported in DSPs, which are commonly used in multimedia and signal processing applications.

Partitioned add (integer)
- Performs multiple 16-bit additions on a 64-bit ALU, since most data are narrow
- Increases ALU throughput for multimedia applications

Paired single operations (float)
- Allow the same register to act as two operands to the same operation
- Handy in dealing with vertices and coordinates

Multiply and accumulate
- Very handy for calculating dot products of vectors (signal processing) and matrix multiplication
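A partitioned add and a multiply-accumulate can be imitated in Python to show what these instructions compute. This is a sketch assuming four 16-bit lanes packed into a 64-bit word with wrap-around arithmetic per lane; the function names and operand values are hypothetical.

```python
def partitioned_add16(x, y):
    """Add four independent 16-bit lanes packed into two 64-bit words."""
    result = 0
    for lane in range(4):
        shift = 16 * lane
        a = (x >> shift) & 0xFFFF
        b = (y >> shift) & 0xFFFF
        result |= ((a + b) & 0xFFFF) << shift   # carry never crosses into the next lane
    return result


def multiply_accumulate(xs, ys):
    """Dot product computed as a chain of multiply-accumulate steps."""
    acc = 0
    for x, y in zip(xs, ys):
        acc += x * y
    return acc


print(hex(partitioned_add16(0x0001_FFFF_0003_0004, 0x0001_0001_0005_0006)))
print(multiply_accumulate([1, 2, 3], [4, 5, 6]))
```

The lane holding 0xFFFF + 0x0001 wraps to 0 without disturbing its neighbour, which is the difference between a partitioned add and an ordinary 64-bit add.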

Frequency of Operations Usage

Rank   80x86 instruction        Integer average (% total executed)
1      Load                     22%
2      Conditional branch       20%
3      Compare                  16%
4      Store                    12%
5      Add                      8%
6      And                      6%
7      Sub                      5%
8      Move register-register   4%
9      Call                     1%
10     Return                   1%
       Total                    96%

Make the common case fast by focusing on these operations.
- The most widely executed instructions are the simple operations of an instruction set
- The table shows the average usage in SPECint on the Intel 80x86

Control Flow Instructions

- Jump: for unconditional change in the control flow
- Branch: for conditional change in the control flow
- Procedure calls and returns

Data is based on SPEC on Alpha.

Destination Address Definition

- Relative addressing with respect to the program counter proved to be the best choice for forward and backward branches or jumps (load-address independent)
- To allow for dynamic loading of library routines, register indirect addressing allows addresses to be loaded in special registers (e.g. virtual functions in C++ and system calls in a case statement)

Data is based on SPEC on Alpha.

Condition Evaluation

- Compare and branch can be efficient if the majority of conditions are comparisons with zero
- Remember to focus on the common case

Based on SPEC on MIPS.

Frequency of Types of Comparison

Data is based on SPEC on Alpha.

- Different benchmarks and machines set new design priorities
- DSPs support a repeat instruction for for-loops (vectors) using registers

Supporting Procedures

Execution of a procedure follows these steps:
- Store parameters in a place accessible to the procedure
- Transfer control to the procedure
- Acquire the storage resources needed for the procedure
- Perform the desired task
- Store the result value in a place accessible to the calling program
- Return control to the point of origin

The hardware provides a program counter to trace instruction flow and manage transfer of control.

Parameter Passing
- Registers can be used for passing a small number of parameters
- A stack is used to spill the registers of the current context, to make room for the called procedure to run, and to allow large parameters to be passed
- Storage of machine state can be performed by caller or callee
- Handling of shared variables is important to ensure correct semantics and thus requires clear specifications in the library interface

Global variables stored in registers need careful handling.

Type and Size of Operands

- The type of an operand is designated by encoding it in the instruction's operation code
- The type of an operand, e.g. single-precision float, effectively gives its size
- Common operand types include character, half-word and word-size integer, and single- and double-precision floating point
- Characters are almost always in ASCII, integers in 2's complement, and floating point in IEEE 754
- The 16-bit Unicode used in Java is gaining popularity due to its support for international character sets
- For business applications, some architectures support a decimal format in binary coded decimal (BCD)
- Depending on the size of the word, the complexity of handling different operand types differs
- DSPs offer fixed-point data types to support high-precision floating-point arithmetic and to allow sharing a single exponent for multiple numbers
- For graphics applications, vertex and pixel operands are added features

Size of Operands

- Double-word data type is used for double-precision floating-point operations and for address storage in machines with a 64-bit wide address bus
- Words are used for integer operations and for 32-bit address bus machines
- Because of the mix in SPEC, word and double-word data types dominate

[Figure: frequency of reference by size, based on SPEC2000 on Alpha]

Instruction Representation

- Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary, or base 2)
- Numbers are stored in computers as a series of high and low electronic signals (binary numbers)
- Binary digits are called bits and are considered the atoms of computing
- Each piece of an instruction is a number, and placing these numbers together forms the instruction
- The assembler translates symbolic assembly instructions into machine language instructions (machine code)

Example:
    Assembly:                 add $t0, $s1, $s2
    M/C language (decimal):   0       17      18      8       0       32
    M/C language (binary):    000000  10001   10010   01000   00000   100000
    Field widths:             6 bits  5 bits  5 bits  5 bits  5 bits  6 bits

Note: The MIPS compiler by default maps $s0, ..., $s7 to registers 16-23 and $t0, ..., $t7 to registers 8-15.
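Placing the field numbers together can be done with shifts, as in the following Python sketch. It assumes the standard MIPS R-format field widths (6, 5, 5, 5, 5, 6 bits) and the default register mapping noted above; the names `REGISTERS` and `encode_r_format` are hypothetical.

```python
# Default mapping: $s0..$s7 -> registers 16..23, $t0..$t7 -> registers 8..15.
REGISTERS = {f"$s{i}": 16 + i for i in range(8)}
REGISTERS.update({f"$t{i}": 8 + i for i in range(8)})


def encode_r_format(op, rs, rt, rd, shamt, funct):
    """Pack the six R-format fields (6, 5, 5, 5, 5, 6 bits) into one 32-bit word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


# add $t0, $s1, $s2: op = 0 and funct = 32 select the add operation.
word = encode_r_format(0, REGISTERS["$s1"], REGISTERS["$s2"], REGISTERS["$t0"], 0, 32)
print(f"{word:032b}")
print(hex(word))
```

Note the field order: the two source registers come first and the destination register third, the reverse of the assembly operand order.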

Encoding an Instruction Set

- Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation
- The operation is typically specified in one field, called the opcode
- The addressing mode for an operand can be encoded with the operation, or specified through a separate identifier when a large number of modes is supported
- The architecture must balance several competing factors:
  - Desire to support as many registers and addressing modes as possible
  - Effect of operand specification on the size of the instruction (program)
  - Desire to simplify instruction fetching and decoding during execution
- Fixed-size instruction encoding simplifies the CPU design while limiting the addressing modes supported
- An architect who cares about code size can use variable-size encoding
- A hybrid approach allows variability by supporting multiple instruction sizes

Encoding Examples

MIPS Instruction format

Register-format instructions

    op      rs      rt      rd      shamt   funct
    6 bits  5 bits  5 bits  5 bits  5 bits  6 bits

- op: Basic operation of the instruction, traditionally called the opcode
- rs: The first register source operand
- rt: The second register source operand
- rd: The register destination operand; it gets the result of the operation
- shamt: Shift amount
- funct: This field selects the specific variant of the operation in the op field

Immediate-type instructions

    op      rs      rt      address
    6 bits  5 bits  5 bits  16 bits

- Some instructions need longer fields than the register format provides, e.g. for large constants
- The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register
- Example: lw $t0, 32($s3)   # Temporary register $t0 gets A[8]

    Instruction  Format  op  rs   rt   rd   shamt  funct  address
    add          R       0   reg  reg  reg  0      32     N/A
    sub          R       0   reg  reg  reg  0      34     N/A
    lw           I       35  reg  reg  N/A  N/A    N/A    address
    sw           I       43  reg  reg  N/A  N/A    N/A    address
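The field layouts above can be checked by packing them by hand. The following is a minimal Python sketch, not part of the original slides; the helper names encode_r and encode_i are hypothetical, and the register numbers assume the default MIPS convention ($t0 = 8, $s1 = 17, $s2 = 18, $s3 = 19).

```python
def encode_r(op, rs, rt, rd, shamt, funct):
    """Pack the six R-format fields (6, 5, 5, 5, 5, 6 bits) into one 32-bit word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


def encode_i(op, rs, rt, address):
    """Pack the four I-format fields (6, 5, 5, 16 bits); address is a signed 16-bit offset."""
    return (op << 26) | (rs << 21) | (rt << 16) | (address & 0xFFFF)


# Assumed register numbers under the default MIPS convention.
T0, S1, S2, S3 = 8, 17, 18, 19

add_word = encode_r(op=0, rs=S1, rt=S2, rd=T0, shamt=0, funct=32)  # add $t0, $s1, $s2
lw_word = encode_i(op=35, rs=S3, rt=T0, address=32)                # lw  $t0, 32($s3)

print(f"{add_word:032b}")  # the six binary fields, concatenated
print(f"{lw_word:#010x}")
```

Both formats keep op, rs and rt in the same bit positions, which is what lets the hardware start decoding before it knows which format it is looking at.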

The Stored Program Concept

Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept

Today's computers are built on two key principles:
- Instructions are represented as numbers
- Programs can be stored in memory to be read or written, just like numbers

The power of the concept: memory can contain
- the source code for an editor
- the compiled m/c code for the editor
- the text that the compiled program is using
- the compiler that generated the code

[Figure: a processor connected to a memory holding an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and source code in C for the editor program]

Compiling if-then-else in MIPS

Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiler's code for the following C if statement?

    if (i == j) f = g + h; else f = g - h;

MIPS:

          bne $s3, $s4, Else    # go to Else if i ≠ j
          add $s0, $s1, $s2     # f = g + h (skipped if i ≠ j)
          j   Exit
    Else: sub $s0, $s1, $s2     # f = g - h (skipped if i = j)
    Exit:

[Figure: flowchart testing i == j; the i = j path computes f = g + h, the i ≠ j path (Else:) computes f = g - h, and both paths meet at Exit:]
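To see that the branch-based sequence really selects between the two assignments, the sketch below interprets just the four instructions involved (bne, add, sub, j). It is an illustrative toy in Python, not a MIPS simulator; the tuple-based program encoding and the names run and if_then_else are assumptions made for this example, and the register assignment ($s0..$s4 for f, g, h, i, j) follows the slide.

```python
# The compiled if-then-else, with labels written as pseudo-instructions.
PROGRAM = [
    ("bne", "$s3", "$s4", "Else"),   # go to Else if i != j
    ("add", "$s0", "$s1", "$s2"),    # f = g + h
    ("j", "Exit"),
    ("label", "Else"),
    ("sub", "$s0", "$s1", "$s2"),    # f = g - h
    ("label", "Exit"),
]


def run(program, regs):
    """Execute a tiny subset of MIPS (bne, add, sub, j) over a register dictionary."""
    labels = {ins[1]: idx for idx, ins in enumerate(program) if ins[0] == "label"}
    pc = 0
    while pc < len(program):
        op, *args = program[pc]
        pc += 1  # default: sequential flow
        if op == "bne" and regs[args[0]] != regs[args[1]]:
            pc = labels[args[2]]
        elif op == "j":
            pc = labels[args[0]]
        elif op == "add":
            regs[args[0]] = regs[args[1]] + regs[args[2]]
        elif op == "sub":
            regs[args[0]] = regs[args[1]] - regs[args[2]]
    return regs


def if_then_else(g, h, i, j):
    """Return f after running the compiled code with the given variable values."""
    regs = {"$s0": 0, "$s1": g, "$s2": h, "$s3": i, "$s4": j}
    return run(PROGRAM, regs)["$s0"]


print(if_then_else(5, 3, 1, 1))  # i == j, so f = g + h
print(if_then_else(5, 3, 1, 2))  # i != j, so f = g - h
```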

Typical Compilation

[Figure: typical compilation flow]

Major Types of Optimization

Columns: Optimization name, Explanation, Frequency

High-level: at or near source level; machine independent
- Procedure integration: replace procedure call by procedure body. Frequency: N.M.

Local: within straight-line code
- Common subexpression elimination: replace two instances of the same computation by a single copy. Frequency: 18%
- Constant propagation: replace all instances of a variable that is assigned a constant with the constant
- Stack height reduction: rearrange expression tree to minimize resources needed for expression evaluation. Frequency: N.M.

Global: across a branch
- Global common subexpression elimination: same as local, but this version crosses branches
- Copy propagation: replace all instances of a variable A that has been assigned X (i.e., A = X) with X
- Code motion: remove code from a loop that computes the same value each iteration of the loop
- Induction variable elimination: simplify/eliminate array addressing calculations within loops

Machine-dependent: depends on machine knowledge
- Strength reduction: many examples, such as replacing multiply by a constant with adds and shifts. Frequency: N.M.
- Pipeline scheduling: reorder instructions to improve pipeline performance. Frequency: N.M.
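A few of these transformations can be illustrated by applying them by hand. The Python sketch below is only an illustration of what a compiler might do, written under the assumption of integer inputs; the function names unoptimized and hand_optimized are hypothetical. It applies constant propagation, code motion, and strength reduction to a small loop and checks that the result is unchanged.

```python
def unoptimized(values):
    """Straightforward loop, as a programmer might write it."""
    scale = 8                        # a variable that is assigned a constant
    out = []
    for v in values:
        limit = len(values) * 2      # computes the same value on every iteration
        out.append(v * scale + limit)
    return out


def hand_optimized(values):
    """The same loop after three optimizations applied by hand."""
    limit = len(values) << 1         # code motion out of the loop, and x * 2 -> x << 1
    out = []
    for v in values:
        # constant propagation (scale -> 8), then strength reduction (v * 8 -> v << 3)
        out.append((v << 3) + limit)
    return out


sample = [1, 2, 3]
print(unoptimized(sample))
print(hand_optimized(sample))
```

The shift-for-multiply rewrite is only valid for integers, which is one reason strength reduction is listed as depending on machine knowledge.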

Effect of Compiler Optimization

Measurements taken on MIPS

[Figure: bar chart; axis label "Program and Compiler Optimization Level"]

Level 0: non-optimized code

Level 1: local optimization

Level 2: global optimization, s/w pipelining

Level 3: adds procedure integration

Compiler Support for Multimedia Instr.

Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics)

Intel added a new set of instructions called Streaming SIMD Extension

A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer

Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations
- Strided addressing allows memory access in increments larger than one
- Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data

Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup

Such limited support for vector processing makes the use of vectorizing compiler optimization unpopular and restricts its scope to hand-coded routines

SIMD instructions on MMX and AltiVec tend to be solutions, not primitives
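The two addressing styles named above differ only in how the element addresses are produced. A minimal Python sketch, assuming memory is modelled as a flat list and using hypothetical helper names (strided_load, gather_load, scatter_store):

```python
def strided_load(memory, base, stride, count):
    """Load `count` elements starting at `base`, stepping `stride` locations each time."""
    return [memory[base + k * stride] for k in range(count)]


def gather_load(memory, addresses):
    """Load from an explicit list of addresses (the addresses are stored, not the data)."""
    return [memory[a] for a in addresses]


def scatter_store(memory, addresses, values):
    """Store each value at its matching address; the inverse of gather_load."""
    for a, v in zip(addresses, values):
        memory[a] = v


# A 4x4 matrix stored row by row in locations 0..15, holding 100..115.
memory = list(range(100, 116))

column_1 = strided_load(memory, base=1, stride=4, count=4)   # one matrix column
diagonal = gather_load(memory, [0, 5, 10, 15])               # arbitrary locations
scatter_store(memory, [0, 5, 10, 15], [0, 0, 0, 0])          # zero the diagonal

print(column_1, diagonal, memory[:6])
```

Without strided access, reading a matrix column means issuing one scalar load per element, which is the speedup limitation the slide attributes to MMX.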

Starting a Program

[Figure: C program -> Compiler -> Assembly language program -> Assembler -> Object: Machine language module, together with Object: Library routine (machine language) -> Linker -> Executable: Machine language program -> Loader -> Memory]

Linker
- Place code and data modules symbolically in memory
- Determine the address of data and instruction labels
- Patch both internal and external references

Object files for Unix typically contain:
- Header: size and position of components
- Text segment: machine code
- Data segment: static and dynamic variables
- Relocation info: identify absolute memory references
- Symbol table: name and location of labels, procedures and variables
- Debugging info: mapping source to object code, break points, etc.

Loading Executable Program

[Figure: MIPS memory layout from high to low addresses: Stack ($sp at 7fff ffff hex), Dynamic data, Static data ($gp at 1000 8000 hex, segment starting at 1000 0000 hex), Text (pc at 0040 0000 hex), Reserved]

To load an executable, the operating system follows these steps:
- Reads the executable file header to determine the size of the text and data segments
- Creates an address space large enough for the text and data
- Copies the instructions and data from the executable file into memory
- Copies the parameters (if any) to the main program onto the stack
- Initializes the machine registers and sets the stack pointer to the first free location
- Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program

Instruction Set Design Issues

Instruction set design issues:
- Number of Addresses
- Flow of Control
- Operand Types
- Addressing Modes
- Instruction Types
- Instruction Formats

Number of Addresses

Four categories
- 3-address machines
  - 2 for the source operands and one for the result
- 2-address machines
  - One address doubles as source and result
- 1-address machines
  - Accumulator machines
  - Accumulator is used for one source and result
- 0-address machines
  - Stack machines
  - Operands are taken from the stack
  - Result goes onto the stack

Number of Addresses (cont.)

Three-address machines
- Two for the source operands, one for the result
- RISC processors use three addresses
- Sample instructions

    add  dest,src1,src2     M(dest) = [src1] + [src2]
    sub  dest,src1,src2     M(dest) = [src1] - [src2]
    mult dest,src1,src2     M(dest) = [src1] * [src2]

Three addresses: Operand 1, Operand 2, Result
- Example: a = b + c
- Three-address instruction formats are not common, because they require a relatively long instruction format to hold the three address references

Number of Addresses (cont.)

Example
- C statement

    A = B + C * D - E + F + A

- Equivalent code:

    mult T,C,D      T = C*D
    add  T,T,B      T = B+C*D
    sub  T,T,E      T = B+C*D-E
    add  T,T,F      T = B+C*D-E+F
    add  A,T,A      A = B+C*D-E+F+A

Number of Addresses (cont.)

Two-address machines
- One address doubles (for source operand and result)
- Last example makes a case for it
  - Address T is used twice
- Sample instructions

    load dest,src       M(dest) = [src]
    add  dest,src       M(dest) = [dest] + [src]
    sub  dest,src       M(dest) = [dest] - [src]
    mult dest,src       M(dest) = [dest] * [src]

Two addresses: one address doubles as operand and result
- Example: a = a + b
- The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation

Number of Addresses (cont.)

Example
- C statement

    A = B + C * D - E + F + A

- Equivalent code:

    load T,C        T = C
    mult T,D        T = C*D
    add  T,B        T = B+C*D
    sub  T,E        T = B+C*D-E
    add  T,F        T = B+C*D-E+F
    add  A,T        A = B+C*D-E+F+A

Number of Addresses (cont.)

One-address machines
- Use a special set of registers called accumulators
  - Specify one source operand and receive the result
- Called accumulator machines
- Sample instructions

    load  addr      accum = [addr]
    store addr      M[addr] = accum
    add   addr      accum = accum + [addr]
    sub   addr      accum = accum - [addr]
    mult  addr      accum = accum * [addr]

Number of Addresses (cont.)

Example
- C statement

    A = B + C * D - E + F + A

- Equivalent code:

    load  C         load C into accum
    mult  D         accum = C*D
    add   B         accum = C*D+B
    sub   E         accum = B+C*D-E
    add   F         accum = B+C*D-E+F
    add   A         accum = B+C*D-E+F+A
    store A         store accum contents in A
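The accumulator sequence can be traced mechanically. Below is a minimal Python sketch of a one-address machine, assuming memory is a dictionary keyed by variable name; the function name run_accumulator and the sample values are illustrative only.

```python
def run_accumulator(program, memory):
    """Run one-address instructions; every operation implicitly uses the accumulator."""
    accum = 0
    for op, addr in program:
        if op == "load":
            accum = memory[addr]
        elif op == "store":
            memory[addr] = accum
        elif op == "add":
            accum += memory[addr]
        elif op == "sub":
            accum -= memory[addr]
        elif op == "mult":
            accum *= memory[addr]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory


# A = B + C * D - E + F + A
program = [
    ("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
    ("add", "F"), ("add", "A"), ("store", "A"),
]
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
run_accumulator(program, memory)
print(memory["A"])
```

Every instruction except the final store reads memory, which is the high memory traffic that accumulator designs are usually criticized for.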

Number of Addresses (cont.)

Zero-address machines
- Stack supplies operands and receives the result
  - Special instructions to load and store use an address
- Called stack machines (Ex: HP3000, Burroughs B5500)
- Sample instructions

    push addr       push([addr])
    pop  addr       pop([addr])
    add             push(pop + pop)
    sub             push(pop - pop)
    mult            push(pop * pop)

Number of Addresses (cont.)

Example
- C statement

    A = B + C * D - E + F + A

- Equivalent code:

    push E
    push C
    push D
    mult
    push B
    add
    sub
    push F
    add
    push A
    add
    pop  A
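The stack sequence is easiest to follow by running it. The Python sketch below models a zero-address machine under one stated assumption: for a binary operation, the first value popped is the left operand, matching the slide's push(pop - pop) notation. The function name run_stack and the sample values are illustrative.

```python
def run_stack(program, memory):
    """Run zero-address instructions; only push and pop name a memory address."""
    stack = []
    for instr in program:
        op = instr[0]
        if op == "push":
            stack.append(memory[instr[1]])
        elif op == "pop":
            memory[instr[1]] = stack.pop()
        else:
            first = stack.pop()      # popped first: left operand (assumed convention)
            second = stack.pop()     # popped second: right operand
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)
            elif op == "mult":
                stack.append(first * second)
            else:
                raise ValueError(f"unknown instruction: {op}")
    return memory


# A = B + C * D - E + F + A
program = [
    ("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
    ("push", "B"), ("add",), ("sub",),
    ("push", "F"), ("add",), ("push", "A"), ("add",), ("pop", "A"),
]
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
run_stack(program, memory)
print(memory["A"])
```

E is pushed first so that it sits below B + C*D when sub executes; under the assumed operand order the subtraction then yields (B + C*D) - E.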

Load/Store Architecture

Instructions expect operands in internal processor registers
- Special LOAD and STORE instructions move data between registers and memory
- RISC uses this architecture
- Reduces instruction length

Load/Store Architecture (cont.)

Sample instructions

    load  Rd,addr       Rd = [addr]
    store addr,Rs       (addr) = Rs
    add   Rd,Rs1,Rs2    Rd = Rs1 + Rs2
    sub   Rd,Rs1,Rs2    Rd = Rs1 - Rs2
    mult  Rd,Rs1,Rs2    Rd = Rs1 * Rs2

Number of Addresses (cont.)

Example
- C statement

    A = B + C * D - E + F + A

- Equivalent code:

    load  R1,B
    load  R2,C
    load  R3,D
    load  R4,E
    load  R5,F
    load  R6,A
    mult  R2,R2,R3
    add   R2,R2,R1
    sub   R2,R2,R4
    add   R2,R2,R5
    add   R2,R2,R6
    store A,R2

Flow of Control

Default is sequential flow

Several instructions alter this default execution
- Branches
  - Unconditional
  - Conditional
  - Delayed branches
- Procedure calls
  - Delayed procedure calls

Flow of Control (cont.)

Branches
- Unconditional
  - Absolute address
  - PC-relative
    - Target address is specified relative to PC contents
    - Relocatable code
- Example: MIPS
  - Absolute address

        j target

  - PC-relative

        b target
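The difference between the two target calculations can be made concrete. The Python sketch below follows the MIPS32 conventions as commonly documented (a 16-bit signed word offset relative to the instruction after the branch, and a 26-bit word index for jumps); the function names are hypothetical and delay-slot effects are ignored.

```python
def sign_extend_16(value):
    """Interpret the low 16 bits of `value` as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def branch_target(pc, offset_field):
    """PC-relative: the offset counts words from the instruction after the branch."""
    return (pc + 4 + (sign_extend_16(offset_field) << 2)) & 0xFFFFFFFF


def jump_target(pc, index_field):
    """Absolute within a region: 26-bit word index joined to the top 4 bits of PC + 4."""
    return ((pc + 4) & 0xF0000000) | ((index_field & 0x03FFFFFF) << 2)


pc = 0x00400000
print(hex(branch_target(pc, 3)))            # three words past the next instruction
print(hex(branch_target(pc + 0x10, 0xFFFF)))  # offset -1: branch back to itself
print(hex(jump_target(pc, 0x00100004)))
```

Because branch_target depends only on the distance from the PC, the same encoded branch still works after the code is moved to another address, which is what makes PC-relative code relocatable.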

Flow of Control (cont.)

Branches
- Conditional
  - Jump is taken only if the condition is met
- Two types
  - Set-Then-Jump
    - Condition testing is separated from branching
    - Condition code registers are used to convey the condition test result
    - Condition code registers keep a record of the status of the last ALU operation, such as an overflow condition
  - Example: Pentium code

        cmp AX,BX       compare AX and BX
        je  target      jump if equal

Flow of Control (cont.)

  - Test-and-Jump
    - Single instruction performs condition testing and branching
  - Example: MIPS instruction

        beq Rsrc1,Rsrc2,target

    Jumps to target if Rsrc1 = Rsrc2

Delayed branching
- Control is transferred after executing the instruction that follows the branch instruction
  - This instruction slot is called the delay slot
- Improves efficiency
- Highly pipelined RISC processors support it

Flow of Control (cont.)

Procedure calls
- Facilitate modular programming
- Require two pieces of information to return
  - End of procedure
    - Pentium uses the ret instruction
    - MIPS uses the jr instruction
  - Return address
    - In a (special) register: MIPS allows any general-purpose register
    - On the stack: Pentium

Flow of Control (cont.)

[Figure: delay slot]

Parameter Passing

Two basic techniques
- Register-based (e.g., PowerPC, MIPS)
  - Internal registers are used
    - Faster
    - Limit the number of parameters
    - Recursive procedure
- Stack-based (e.g., Pentium)
  - Stack is used
    - More general

2 Operand Types

Instructions support basic data types
- Characters
- Integers
- Floating-point

Instruction overload
- Same instruction for different data types
- Example: Pentium

    mov AL,address      loads an 8-bit value
    mov AX,address      loads a 16-bit value
    mov EAX,address     loads a 32-bit value

Operand Types

Separate instructions
- Instructions specify the operand size
- Example: MIPS

    lb Rdest,address    loads a byte
    lh Rdest,address    loads a halfword (16 bits)
    lw Rdest,address    loads a word (32 bits)
    ld Rdest,address    loads a doubleword (64 bits)

Similar instructions exist for store
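The four load sizes differ only in how many bytes are read and how the result is widened. A minimal Python sketch, assuming a big-endian byte-addressed memory and sign-extending loads, with alignment checks omitted; MEMORY and the helper load are illustrative names, while lb, lh, lw and ld mirror the MIPS mnemonics.

```python
# Eight bytes of byte-addressed memory (assumed big-endian for this sketch).
MEMORY = bytes([0xFF, 0xFE, 0x00, 0x2A, 0x12, 0x34, 0x56, 0x78])


def load(memory, address, size):
    """Read `size` bytes at `address` as a signed (sign-extended) big-endian integer."""
    return int.from_bytes(memory[address:address + size], byteorder="big", signed=True)


def lb(memory, address):
    return load(memory, address, 1)   # byte


def lh(memory, address):
    return load(memory, address, 2)   # halfword, 16 bits


def lw(memory, address):
    return load(memory, address, 4)   # word, 32 bits


def ld(memory, address):
    return load(memory, address, 8)   # doubleword, 64 bits


print(lb(MEMORY, 0), lh(MEMORY, 0), lw(MEMORY, 4))
```

The same starting address yields different values depending on the operand size, which is why the size has to be carried either in the opcode (MIPS) or in the operand register name (Pentium).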

3 Addressing Modes

How the operands are specified
- Operands can be in three places
  - Registers
    - Register addressing mode
  - Part of instruction
    - Constant
    - Immediate addressing mode
    - All processors support these two addressing modes
  - Memory
    - Difference between RISC and CISC
    - CISC supports a large variety of addressing modes
    - RISC follows load/store architecture

4 Instruction Types

Several types of instructions
- Data movement
  - Pentium: mov dest,src
  - Some do not provide direct data movement instructions
  - Indirect data movement

        add Rdest,Rsrc,0    Rdest = Rsrc+0

- Arithmetic and Logical
  - Arithmetic
    - Integer and floating-point, signed and unsigned
    - add, subtract, multiply, divide
  - Logical
    - and, or, not, xor

Instruction Types (cont.)

Condition code bits
- S: Sign bit (0 = +, 1 = -)
- Z: Zero bit (0 = nonzero, 1 = zero)
- O: Overflow bit (0 = no overflow, 1 = overflow)
- C: Carry bit (0 = no carry, 1 = carry)

Example: Pentium

    cmp count,25        compare count to 25 (subtract 25 from count)
    je  target          jump if equal
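How a compare sets the four bits can be sketched directly. The Python function below is an illustrative model, not the Pentium's actual implementation: it subtracts b from a at a fixed width, discards the result, and reports S, Z, O and C, with C treated as the borrow flag in the way x86 defines carry for subtraction. The name compare_flags is hypothetical.

```python
def compare_flags(a, b, bits=32):
    """Return the S, Z, O, C condition codes produced by computing a - b."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "S": int(bool(result & sign)),                    # result is negative
        "Z": int(result == 0),                            # operands were equal
        "O": int(bool((a ^ b) & (a ^ result) & sign)),    # signed overflow
        "C": int(a < b),                                  # unsigned borrow
    }


def jump_if_equal(flags):
    """je is taken exactly when the zero bit is set."""
    return flags["Z"] == 1


print(compare_flags(25, 25))
print(compare_flags(10, 25))
print(jump_if_equal(compare_flags(25, 25)))
```

Separating the test (cmp) from the jump (je) is the Set-Then-Jump style: the flags act as the channel between the two instructions.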

Instruction Types (cont.)

Flow control and I/O instructions
- Branch
- Procedure call
- Interrupts

I/O instructions
- Memory-mapped I/O
  - Most processors support memory-mapped I/O
  - No separate instructions for I/O
- Isolated I/O
  - Pentium supports isolated I/O
  - Separate I/O instructions

        in  AX,io_port      read from an I/O port
        out io_port,AX      write to an I/O port

5 Instruction Formats

Two types
- Fixed-length
  - Used by RISC processors
  - 32-bit RISC processors use 32-bit wide instructions
    - Examples: SPARC, MIPS, PowerPC
- Variable-length
  - Used by CISC processors
  - Memory operands need more bits to specify

Opcode
- Major and exact operation

Examples of Instruction Formats

[Figure: examples of instruction formats]

RISC (Reduced Instruction Set Computer)
versus
CISC (Complex Instruction Set Computer)

RISC vs CISC

The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.

RISC systems access memory only with explicit load and store instructions.

In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.

RISC vs CISC

The difference between CISC and RISC becomes evident through the basic computer performance equation:

    time/program = time/cycle × cycles/instruction × instructions/program

RISC systems shorten execution time by reducing the clock cycles per instruction.

CISC systems improve performance by reducing the number of instructions per program.
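The trade-off can be explored numerically with the standard performance equation, execution time = instruction count × cycles per instruction × cycle time. The Python sketch below uses made-up instruction counts, CPI values and cycle times purely for illustration; none of the numbers come from the slides, and cpu_time_ns is a hypothetical helper.

```python
def cpu_time_ns(instructions, cycles_per_instruction, cycle_time_ns):
    """Execution time in nanoseconds: instructions x cycles/instruction x time/cycle."""
    return instructions * cycles_per_instruction * cycle_time_ns


# Hypothetical machines running the same task (illustrative numbers only).
cisc_time = cpu_time_ns(instructions=1_000, cycles_per_instruction=6, cycle_time_ns=2)
risc_time = cpu_time_ns(instructions=3_000, cycles_per_instruction=1, cycle_time_ns=1)

print(cisc_time, risc_time)
```

In this made-up case the RISC program executes three times as many instructions yet finishes sooner, because both its cycles per instruction and its cycle time are lower; with other numbers the comparison could go the other way.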

RISC vs CISC

The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.

The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.

With fixed-length instructions, RISC lends itself to pipelining and speculative execution.

RISC vs CISC

Consider the program fragments:

    CISC                RISC
    mov ax, 10          mov ax, 0
    mov bx, 5           mov bx, 10
    mul bx, ax          mov cx, 5
                        Begin: add ax, bx
                               loop Begin

The total clock cycles for the CISC version might be:

    (2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles

While the clock cycles for the RISC version is:

    (3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles

With the RISC clock cycle being shorter, RISC gives us much faster execution speeds.
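The cycle tally for the two fragments can be reproduced with a few lines of Python. The per-instruction costs below are assumptions for illustration (1 cycle for mov, add and loop, 30 cycles for mul, and a loop body that runs 5 times); CYCLES and total_cycles are hypothetical names.

```python
# Assumed cycle cost of each instruction type (illustrative values).
CYCLES = {"mov": 1, "add": 1, "loop": 1, "mul": 30}


def total_cycles(instruction_counts):
    """Sum cycles over a mapping of instruction name -> number of times executed."""
    return sum(CYCLES[op] * count for op, count in instruction_counts.items())


# CISC fragment: two movs and one multi-cycle mul.
cisc = total_cycles({"mov": 2, "mul": 1})

# RISC fragment: three movs, then add and loop each executed 5 times.
risc = total_cycles({"mov": 3, "add": 5, "loop": 5})

print(cisc, risc)
```

The RISC version executes more than four times as many instructions but, under these assumed costs, needs fewer than half the cycles.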

RISC vs CISC

Because of their load-store ISAs, RISC architectures require a large number of CPU registers.

These registers provide fast access to data during sequential program execution.

They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.

Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.

RISC vs CISC

This is how registers can be overlapped in a RISC system.

The current window pointer (CWP) points to the active register window.

[Figure: overlapping register windows]

RISC vs CISC

It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.

Some RISC systems provide more extravagant instruction sets than some CISC systems.

Some systems combine both approaches.

The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.

RISC vs CISC

RISC
- Multiple register sets.
- Three operands per instruction.
- Parameter passing through register windows.
- Single-cycle instructions.
- Hardwired control.
- Highly pipelined.

CISC
- Single register set.
- One or two register operands per instruction.
- Parameter passing through memory.
- Multiple cycle instructions.
- Microprogrammed control.
- Less pipelined.

Continued...

RISC vs CISC

RISC
- Simple instructions, few in number.
- Fixed length instructions.
- Complexity in compiler.
- Only LOAD/STORE instructions access memory.
- Few addressing modes.

CISC
- Many complex instructions.
- Variable length instructions.
- Complexity in microcode.
- Many instructions can access memory.
- Many addressing modes.

Summary

Instruction Set Design Issues

Instruction set design issues include:
- Where are operands stored?
  - registers, memory, stack, accumulator
- How many explicit operands are there?
  - 0, 1, 2, or 3
- How is the operand location specified?
  - register, immediate, indirect, ...
- What type and size of operands are supported?
  - byte, int, float, double, string, vector, ...
- What operations are supported?
  - add, sub, mul, move, compare, ...

More About General Purpose Registers

Why do almost all new architectures use GPRs?
- Registers are much faster than memory (even cache)
  - Register values are available immediately
  - When memory isn't ready, the processor must wait ("stall")
- Registers are convenient for variable storage
  - Compiler assigns some variables just to registers
  - More compact code, since small fields specify registers (compared to memory addresses)

[Figure: storage hierarchy: Processor (Registers), Cache, Memory, Disk]

What Operations are Needed

Arithmetic + Logical
- Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
- Logical operation: AND, OR, XOR, NOT

Data Transfer - copy, load, store

Control - branch, jump, call, return

Floating Point
- ADDF, MULF, DIVF (same as arithmetic, but usually take bigger operands)

Decimal - ADDD, CONVERT

String - move, compare, search

Graphics - pixel and vertex, compression/decompression operations

Stacks Architecture: Pros and Cons

Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures

Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures

Accumulators Architecture: Pros and Cons

Pros
- Very low hardware requirements
- Easy to design and understand

Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic

Memory-Memory Architecture: Pros and Cons

Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)

Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required

Memory-Register Architecture: Pros and Cons

Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density

Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers

Page 16: Chapter 2: Advanced computer Architecture

Memory-to-Memory Architectures

Instruction Formats

Instruction Set Architecture (ISA)

To command a computer's hardware, you must speak its language

The words of a machine's language are called instructions, and its vocabulary is called the instruction set

Once you learn one machine language, it is easy to pick up others:
- There are few fundamental operations that all computers must provide
- All designers have the same goal of finding a language that simplifies building the hardware and the compiler while maximizing performance and minimizing cost

Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept

The MIPS instruction set is used as a case study

Interface Design

A good interface:
- Lasts through many implementations (portability, compatibility)
- Is used in many different ways (generality)
- Provides convenient functionality to higher levels
- Permits an efficient implementation at lower levels

Design decisions must take into account:
- Technology
- Machine organization
- Programming languages
- Compiler technology
- Operating systems

[Figure: an interface with several uses above it and several implementations (imp) below it, over time]

Classifying Instruction Set Architectures

Accumulator Architecture
• Common in early stored-program computers, when hardware was so expensive
• Machine has only one register (accumulator) involved in all math and logical operations
• All operations assume the accumulator as a source operand and a destination for the operation, with the other operand stored in memory

Extended Accumulator Architecture
• Dedicated registers for specific operations, e.g. stack and array index registers, added
• The 8086 microprocessor is an example of such a special-purpose register architecture

General-Purpose Register Architecture
• MIPS is an example of such an architecture, where registers are not restricted to playing a single role
• This type of instruction set can be further divided into:
  • Register-memory: allows for one operand to be in memory
  • Register-register (load-store): demands all operands to be in registers

    Machine          # general-purpose registers   Architecture style                Year
    Motorola 6800    2                             Accumulator                       1974
    DEC VAX          16                            Register-memory, memory-memory    1977
    Intel 8086       1                             Extended accumulator              1978
    Motorola 68000   16                            Register-memory                   1980
    Intel 80386      8                             Register-memory                   1985
    PowerPC          32                            Load-store                        1992
    DEC Alpha        32                            Load-store                        1992

Compact Code and Stack Architectures

When memory is scarce, machines like the Intel 80x86 had variable-length instructions to match varying operand specifications and minimize code size
Stack machines abandoned registers altogether, arguing that it is hard for compilers to use them efficiently
Operands are pushed onto a stack from memory, and results are popped from the stack to memory
Operations take their operands by default from the top of the stack and push the results back onto the stack
Stack machines simplify compilers and lend themselves to a compact instruction encoding, but limit compiler optimization (e.g. in math expressions)

Example: A = B + C
Push AddressC   # Top = Top + 4; Stack[Top] = Memory[AddressC]
Push AddressB   # Top = Top + 4; Stack[Top] = Memory[AddressB]
add             # Stack[Top-4] = Stack[Top] + Stack[Top-4]; Top = Top - 4
Pop AddressA    # Memory[AddressA] = Stack[Top]; Top = Top - 4

Compact code is important for heralded network computers, where programs must be downloaded over the Internet (e.g. Java-based applications)
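The push/add/pop sequence for A = B + C can be sketched as a tiny interpreter. This is only an illustration under assumptions: the function name run_stack_program, the tuple encoding of instructions, and the use of a Python dict for memory are all hypothetical and are not part of the slides.

```python
# Minimal sketch of a stack (zero-address) machine. All names are illustrative.
def run_stack_program(program, memory):
    """Execute push/pop/add instructions; memory is a dict of named words."""
    stack = []
    for op, *args in program:
        if op == "push":        # copy a memory word onto the top of the stack
            stack.append(memory[args[0]])
        elif op == "pop":       # move the top of the stack back into memory
            memory[args[0]] = stack.pop()
        elif op == "add":       # replace the two top entries with their sum
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory


memory = {"A": 0, "B": 7, "C": 5}
program = [("push", "C"), ("push", "B"), ("add",), ("pop", "A")]
run_stack_program(program, memory)
print(memory["A"])  # 12
```

Note that only push and pop carry an address; add needs no operand fields at all, which is where the compact encoding comes from.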

Other Types of Architecture

High-Level-Language Architecture
• In the 1960s, systems software was rarely written in high-level languages, and virtually every commercial operating system before Unix was written in assembly
• Some people blamed the code density on the instruction set rather than the programming language
• A machine design philosophy was advocated with the goal of making the hardware more like high-level languages
• The effectiveness of high-level languages, memory size limitations, and the lack of efficient compilers doomed this philosophy to a historical footnote

Reduced Instruction Set Architecture
• With the recent development in compiler technology and expanded memory sizes, fewer programmers are using assembly-level coding
• Instruction set architecture became measurable in the way compilers, rather than programmers, use them
• RISC architecture favors simplifying hardware design over enriching the offered set of instructions, relying on compilers to use them effectively to perform complex operations
• Virtually all new architectures since 1982 follow the RISC philosophy of fixed instruction lengths, load-store operations, and limited addressing modes

Evolution of Instruction Sets

[Figure: evolution of instruction sets, from top to bottom]
Single Accumulator (EDSAC)
Accumulator + Index Registers (Manchester Mark I series)
Separation of Programming Model from Implementation
High-level Language Based; Concept of a Family
General Purpose Register Machines
Complex Instruction Sets (VAX, Intel); Load/Store Architecture (CDC, Cray 1)
RISC (MIPS, SPARC, and others)

Register-Memory Architectures

Effect of the number of memory operands:

# memory addresses | Max. number of operands | Examples
0 | 3 | SPARC, MIPS, PowerPC, ALPHA
1 | 2 | Intel 80x86, Motorola 68000
2 | 2 | VAX (also has 3 operands format)
3 | 3 | VAX (also has 2 operands format)

Interpreting Memory Addressing

The address of a word matches the byte address of one of its 4 bytes
The addresses of sequential words differ by 4 (word size in bytes)
Word addresses are multiples of 4 (alignment restriction)
Machines that use the address of the leftmost byte as the word address are called "Big Endian", and those that use the rightmost byte are called "Little Endian"
Misalignment complicates memory access and causes programs to run slower (some machines do not allow misaligned memory access at all)
Byte ordering can be a problem when exchanging data among different machines
Byte addressing affects array index calculation, which must account for word addressing and the offset within the word

Object addressed | Aligned at byte offsets | Misaligned at byte offsets
Byte | 0,1,2,3,4,5,6,7 | Never
Half word | 0,2,4,6 | 1,3,5,7
Word | 0,4 | 1,2,3,5,6,7
Double word | 0 | 1,2,3,4,5,6,7
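Both the byte-ordering point and the alignment table can be checked with a short sketch. It uses Python's standard struct module; the helper names is_aligned and misaligned_offsets are hypothetical and only serve the illustration.

```python
import struct


def is_aligned(address, size):
    """An object of `size` bytes is aligned when its address is a multiple of size."""
    return address % size == 0


def misaligned_offsets(size, span=8):
    """Byte offsets within a `span`-byte window at which the object is misaligned."""
    return [off for off in range(span) if not is_aligned(off, size)]


word = 0x0A0B0C0D
big = struct.pack(">I", word)      # most significant byte stored first
little = struct.pack("<I", word)   # least significant byte stored first

print(big.hex(), little.hex())
for name, size in [("byte", 1), ("half word", 2), ("word", 4), ("double word", 8)]:
    print(name, misaligned_offsets(size))
```

The same 32-bit value yields two different byte sequences, which is why data exchanged between machines needs an agreed byte order.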

Addressing Modes

Addressing modes refer to how to specify the location of an operand (effective address)

Addressing modes have the ability to:
- Significantly reduce instruction counts
- Increase the average CPI
- Increase the complexity of building a machine

The VAX machine is used for benchmark data since it supports a wide range of memory addressing modes

Famous addressing modes can be classified based on:
- the source of the data, into register, immediate, or memory
- the address calculation, into direct and indirect

An indexed addressing mode is usually provided to allow efficient implementation of loops and array access

Example of Addressing Modes

Register
  Example: Add R4, R3
  Meaning: Regs[R4] <- Regs[R4] + Regs[R3]
  When used: When a value is in a register

Immediate
  Example: Add R4, #3
  Meaning: Regs[R4] <- Regs[R4] + 3
  When used: For constants

Displacement
  Example: Add R4, 100(R1)
  Meaning: Regs[R4] <- Regs[R4] + Mem[100 + Regs[R1]]
  When used: Accessing local variables

Register indirect
  Example: Add R4, (R1)
  Meaning: Regs[R4] <- Regs[R4] + Mem[Regs[R1]]
  When used: Accessing using a pointer or a computed address

Indexed
  Example: Add R4, (R1 + R2)
  Meaning: Regs[R4] <- Regs[R4] + Mem[Regs[R1] + Regs[R2]]
  When used: Sometimes useful in array addressing: R1 = base of the array; R2 = index amount

Direct or absolute
  Example: Add R4, (1001)
  Meaning: Regs[R4] <- Regs[R4] + Mem[1001]
  When used: Sometimes useful for accessing static data; address constant may need to be large

Memory indirect or memory deferred
  Example: Add R4, @(R3)
  Meaning: Regs[R4] <- Regs[R4] + Mem[Mem[Regs[R3]]]
  When used: If R3 is the address of the pointer p, then mode yields *p

Autoincrement
  Example: Add R4, (R2)+
  Meaning: Regs[R4] <- Regs[R4] + Mem[Regs[R2]]; Regs[R2] <- Regs[R2] + d
  When used: Useful for stepping through arrays within a loop. R2 points to start of the array; each reference increments R2 by d

Autodecrement
  Example: Add R4, -(R2)
  Meaning: Regs[R2] <- Regs[R2] - d; Regs[R4] <- Regs[R4] + Mem[Regs[R2]]
  When used: Same use as autoincrement. Autodecrement/increment can also act as push/pop to implement a stack

Scaled
  Example: Add R4, 100(R2)[R3]
  Meaning: Regs[R4] <- Regs[R4] + Mem[100 + Regs[R2] + Regs[R3] * d]
  When used: Used to index arrays
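The "Meaning" column can be read as a recipe for fetching the source operand. The sketch below writes a few of the modes as one function; the name fetch_operand, its keyword parameters, and the register and memory contents are assumptions made up for the illustration (autodecrement is omitted for brevity).

```python
def fetch_operand(mode, regs, mem, *, reg=None, index=None, const=0, d=4):
    """Return the source operand selected by an addressing mode.

    regs maps register names to values, mem maps addresses to values,
    and d is the element size used by the autoincrement and scaled modes.
    """
    if mode == "register":
        return regs[reg]
    if mode == "immediate":
        return const
    if mode == "displacement":
        return mem[const + regs[reg]]
    if mode == "register_indirect":
        return mem[regs[reg]]
    if mode == "indexed":
        return mem[regs[reg] + regs[index]]
    if mode == "direct":
        return mem[const]
    if mode == "memory_indirect":
        return mem[mem[regs[reg]]]
    if mode == "autoincrement":
        value = mem[regs[reg]]
        regs[reg] += d          # side effect: step to the next element
        return value
    if mode == "scaled":
        return mem[const + regs[reg] + regs[index] * d]
    raise ValueError(f"unknown addressing mode: {mode}")


regs = {"R1": 100, "R2": 8, "R3": 2}
mem = {100: 200, 108: 11, 116: 22, 200: 33, 1001: 55}

print(fetch_operand("displacement", regs, mem, reg="R1", const=8))             # Mem[8 + 100]
print(fetch_operand("memory_indirect", regs, mem, reg="R1"))                   # Mem[Mem[100]]
print(fetch_operand("scaled", regs, mem, reg="R2", index="R3", const=100))     # Mem[100 + 8 + 2*4]
```

Each extra mode adds another branch to this function, which mirrors the slide's point that richer addressing modes increase the complexity of the machine.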

Addressing Mode for Signal Processing

DSP offers special addressing modes to better serve popular algorithms

Modulo addressing
- Since DSP deals with continuous data streams, circular buffers are widely used
- Circular or modulo addressing allows automatic increment and decrement, and resets the pointer when reaching the end of the buffer

Reverse addressing
- The resulting address is the reverse order of the current address
- Reverse addressing mode expedites the access, which otherwise requires a number of logical instructions or extra memory accesses

Fast Fourier Transform (bit-reversed order):
0 (000) -> 0 (000)
1 (001) -> 4 (100)
2 (010) -> 2 (010)
3 (011) -> 6 (110)
4 (100) -> 1 (001)
5 (101) -> 5 (101)
6 (110) -> 3 (011)
7 (111) -> 7 (111)

Special features require either hand coding or a compiler that uses such features (GNU would not be a good choice)
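Without hardware support, both special modes cost a few ordinary instructions per access. As a rough sketch (the function names and the buffer layout are assumptions for illustration only), the two address calculations look like this in software:

```python
def bit_reverse(index, bits):
    """Reverse the lowest `bits` bits of index, as used for FFT reordering."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)   # shift the lowest bit of index into result
        index >>= 1
    return result


def modulo_next(pointer, step, base, length):
    """Advance a circular-buffer pointer, wrapping inside [base, base + length)."""
    return base + (pointer - base + step) % length


print([bit_reverse(i, 3) for i in range(8)])   # [0, 4, 2, 6, 1, 5, 3, 7]
print(hex(modulo_next(0x107, 1, 0x100, 8)))    # wraps back to the start: 0x100
print(hex(modulo_next(0x100, -1, 0x100, 8)))   # wraps to the end: 0x107
```

The loop in bit_reverse is the "number of logical instructions" that a reverse addressing mode saves on every access.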

Operations of the Computer Hardware

"There must certainly be instructions for performing the fundamental arithmetic operations."
Burks, Goldstine, and von Neumann, 1947

Assembly language is a symbolic representation of what the processor actually understands
The MIPS assembler allows only one instruction per line and ignores comments following # until the end of the line

Example: translation of a segment of a C program to MIPS assembly instructions

C: f = (g + h) - (i + j)

MIPS:
add t0, g, h    # temp variable t0 contains "g + h"
add t1, i, j    # temp variable t1 contains "i + j"
sub f, t0, t1   # f = t0 - t1 = (g + h) - (i + j)

Operations in the Instruction Set

Operator type | Examples
Arithmetic and logical | Integer arithmetic and logical operations: add, and, subtract, or
Data transfer | Loads-stores (move instructions on machines with memory addressing)
Control | Branch, jump, procedure call and return, trap
System | Operating system call, virtual memory management instructions
Floating point | Floating point instructions: add, multiply
Decimal | Decimal add, decimal multiply, decimal-to-character conversion
String | String move, string compare, string search
Graphics | Pixel operations, compression/decompression operations

Arithmetic, logical, data transfer, and control are almost standard categories for all machines
System instructions are required for multi-programming environments, although support for system functions varies
Decimal and string instructions can be primitives, e.g. on the IBM 360 and the VAX
Support for floating point, decimal, string, and graphics can optionally be provided via a co-processor
Some machines rely on the compiler to synthesize special operations, such as string handling, from simpler instructions

Operations for Media and Signal Processing

Single instruction multiple data (SIMD) and vector instructions are often supported in DSPs, which are commonly used in multimedia and signal processing applications

Partitioned Add (integer)
- Performs multiple 16-bit additions on a 64-bit ALU, since most data are narrow
- Increases ALU throughput for multimedia applications

Paired single operations (float)
- Allow the same register to act as two operands to the same operation
- Handy in dealing with vertices and coordinates

Multiply and accumulate
- Very handy for calculating dot products of vectors (signal processing) and matrix multiplication
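A partitioned add treats one wide register as several independent narrow lanes. The sketch below models four 16-bit lanes in a 64-bit word with plain Python integers, plus a multiply-accumulate loop for a dot product. The lane layout, the wrap-around overflow behaviour, and all function names are assumptions chosen for the illustration, not a description of any particular instruction set.

```python
LANE_BITS = 16
LANES = 4
LANE_MASK = (1 << LANE_BITS) - 1


def pack(values):
    """Pack four 16-bit values into one 64-bit word, lane 0 in the low bits."""
    word = 0
    for i, value in enumerate(values):
        word |= (value & LANE_MASK) << (i * LANE_BITS)
    return word


def unpack(word):
    """Split a 64-bit word back into its four 16-bit lanes."""
    return [(word >> (i * LANE_BITS)) & LANE_MASK for i in range(LANES)]


def partitioned_add(a, b):
    """Add lane by lane; a carry out of one lane never reaches the next lane."""
    return pack([x + y for x, y in zip(unpack(a), unpack(b))])


def multiply_accumulate(xs, ys):
    """Dot product written as repeated multiply-and-accumulate steps."""
    acc = 0
    for x, y in zip(xs, ys):
        acc += x * y
    return acc


total = partitioned_add(pack([1, 2, 3, 0xFFFF]), pack([10, 20, 30, 1]))
print(unpack(total))                                # last lane wraps around to 0
print(multiply_accumulate([1, 2, 3], [4, 5, 6]))    # 32
```

An ordinary 64-bit add on the same packed words would let the overflow of one lane carry into its neighbour, which is exactly what the partitioned version suppresses.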

Frequency of Operations Usage

Make the common case fast by focusing on these operations
The most widely executed instructions are the simple operations of an instruction set
The following is the average usage in SPECint92 on Intel 80x86:

Rank | 80x86 instruction | Integer average (% total executed)
1 | Load | 22%
2 | Conditional branch | 20%
3 | Compare | 16%
4 | Store | 12%
5 | Add | 8%
6 | And | 6%
7 | Sub | 5%
8 | Move register-register | 4%
9 | Call | 1%
10 | Return | 1%
Total | | 96%

Control Flow Instructions

Jump: for unconditional change in the control flow
Branch: for conditional change in the control flow
Procedure calls and returns

Data is based on SPEC on Alpha

Destination Address Definition

Relative addressing with respect to the program counter proved to be the best choice for forward and backward branching or jumps (load address independent)
To allow for dynamic loading of library routines, register indirect addressing allows addresses to be loaded in special registers (e.g. virtual functions in C++ and system calls in a case statement)

Data is based on SPEC on Alpha

Condition Evaluation

Compare-and-branch can be efficient if the majority of conditions are comparisons with zero
Remember to focus on the common case

Based on SPEC on MIPS

Frequency of Types of Comparison

Data is based on SPEC on Alpha
A different benchmark and machine set a new design priority
DSPs support a repeat instruction for for-loops (vectors) using registers

Supporting Procedures

Execution of a procedure follows these steps:
1. Store parameters in a place accessible to the procedure
2. Transfer control to the procedure
3. Acquire the storage resources needed for the procedure
4. Perform the desired task
5. Store the result value in a place accessible to the calling program
6. Return control to the point of origin

The hardware provides a program counter to trace instruction flow and manage transfer of control

Parameter Passing
- Registers can be used for passing a small number of parameters
- A stack is used to spill registers of the current context and make room for the called procedure to run, and to allow large parameters to be passed
- Storage of machine state can be performed by the caller or the callee
- Handling of shared variables is important to ensure correct semantics, and thus requires clear specifications in the library interface
- Global variables stored in registers need careful handling

Type and Size of Operands

The type of an operand is designated by encoding it in the instruction's operation code
The type of an operand, e.g. single-precision float, effectively gives its size
Common operand types include character, half-word and word-size integer, and single- and double-precision floating point
Characters are almost always in ASCII, integers are in 2's complement, and floating point is in IEEE 754
The 16-bit Unicode used in Java is gaining popularity due to its support for international character sets
For business applications, some architectures support a decimal format in binary coded decimal (BCD)
Depending on the size of the word, the complexity of handling different operand types differs
DSP offers fixed-point data types to support high-precision floating-point arithmetic and to allow sharing a single exponent for multiple numbers
For graphics applications, vertex and pixel operands are added features

Size of Operands

Frequency of reference by size, based on SPEC2000 on Alpha

The double-word data type is used for double-precision floating point operations and for address storage in machines with a 64-bit wide address bus
Words are used for integer operations and for 32-bit address bus machines
Because of the mix in SPEC, word and double-word data types dominate

Instruction Representation

Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary, or base 2)
Numbers are stored in computers as a series of high and low electronic signals (binary numbers)
Binary digits are called bits and are considered the atom of computing
Each piece of an instruction is a number, and placing these numbers together forms the instruction
The assembler translates the symbolic assembly instructions into machine language instructions (machine code)

Example:
Assembly: add $t0, $s1, $s2
M/C language (decimal): 0 | 17 | 18 | 8 | 0 | 32
M/C language (binary): 000000 | 10001 | 10010 | 01000 | 00000 | 100000
Field widths: 6 bits | 5 bits | 5 bits | 5 bits | 5 bits | 6 bits

Note: the MIPS compiler by default maps $s0..$s7 to reg. 16-23 and $t0..$t7 to reg. 8-15
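Placing the field numbers side by side is just shifting and OR-ing. A minimal sketch, assuming the usual 6/5/5/5/5/6-bit register-format layout (the function name encode_r_format is hypothetical):

```python
def encode_r_format(op, rs, rt, rd, shamt, funct):
    """Pack the six register-format fields into one 32-bit instruction word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


# add $t0, $s1, $s2 with $t0 = reg 8, $s1 = reg 17, $s2 = reg 18
word = encode_r_format(op=0, rs=17, rt=18, rd=8, shamt=0, funct=32)
print(f"{word:032b}")   # the six binary fields, concatenated
print(f"{word:#010x}")  # 0x02324020
```

Reading the printed 32 bits in groups of 6, 5, 5, 5, 5, and 6 gives back the decimal fields 0, 17, 18, 8, 0, 32.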

Encoding an Instruction Set

Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation
The operation is typically specified in one field, called the opcode
The addressing mode for the operand can be encoded with the operation, or specified through a separate identifier when a large number of modes is supported

The architecture must balance several competing factors:
- Desire to support as many registers and addressing modes as possible
- Effect of operand specification on the size of the instruction (program)
- Desire to simplify instruction fetching and decoding during execution

Fixed-size instruction encoding simplifies the CPU design while limiting the addressing modes supported
An architect who cares about code size can use variable-size encoding
A hybrid approach is to allow variability by supporting multiple-sized instructions

Encoding Examples

MIPS Instruction Format

Register-format instructions:
op | rs | rt | rd | shamt | funct
6 bits | 5 bits | 5 bits | 5 bits | 5 bits | 6 bits

op: Basic operation of the instruction, traditionally called the opcode
rs: The first register source operand
rt: The second register source operand
rd: The register destination operand; it gets the result of the operation
shamt: Shift amount
funct: This field selects the specific variant of the operation of the op field

Immediate-type instructions:
op | rs | rt | address
6 bits | 5 bits | 5 bits | 16 bits

Some instructions need longer fields than provided, for large-value constants
The 16-bit address means a load word instruction can load a word within a region of ±2^15 (32,768) bytes of the address in the base register
Example: lw $t0, 32($s3)   # Temporary register $t0 gets A[8]

Instruction | Format | op | rs | rt | rd | shamt | funct | address
add | R | 0 | reg | reg | reg | 0 | 32 | N/A
sub | R | 0 | reg | reg | reg | 0 | 34 | N/A
lw | I | 35 | reg | reg | N/A | N/A | N/A | address
sw | I | 43 | reg | reg | N/A | N/A | N/A | address
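The reach of a load word follows directly from the 16-bit signed address field. The sketch below encodes an immediate-type instruction and recovers the sign-extended offset; the function names are hypothetical, and the register numbers assume $t0 = reg 8 and $s3 = reg 19.

```python
def encode_i_format(op, rs, rt, imm):
    """Pack op (6 bits), rs (5), rt (5) and a signed 16-bit immediate into 32 bits."""
    if not -(1 << 15) <= imm < (1 << 15):
        raise ValueError("immediate does not fit in 16 signed bits")
    return (op << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


def decode_immediate(word):
    """Sign-extend the low 16 bits of an instruction word."""
    imm = word & 0xFFFF
    return imm - (1 << 16) if imm & 0x8000 else imm


lw = encode_i_format(op=35, rs=19, rt=8, imm=32)      # lw $t0, 32($s3)
print(f"{lw:#010x}")                                  # 0x8e680020
print(decode_immediate(encode_i_format(35, 19, 8, -4)))
```

Offsets outside -32768..32767 raise an error here; on a real machine the compiler would have to build such an address with extra instructions.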

The Stored Program Concept

Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept

Today's computers are built on two key principles:
- Instructions are represented as numbers
- Programs can be stored in memory to be read or written just like numbers

The power of the concept: memory can contain
- the source code for an editor
- the compiled m/c code for the editor
- the text that the compiled program is using
- the compiler that generated the code

[Figure: a processor connected to a memory that holds an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and the source code in C for the editor program]

Compiling if-then-else in MIPS

Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiled MIPS code for the following C if statement?

if (i == j) f = g + h; else f = g - h;

[Figure: flowchart testing i == j; when i = j, f = g + h is executed; when i ≠ j, control goes to Else: and f = g - h is executed; both paths meet at Exit:]

MIPS:
      bne $s3, $s4, Else    # go to Else if i ≠ j
      add $s0, $s1, $s2     # f = g + h (skipped if i ≠ j)
      j Exit
Else: sub $s0, $s1, $s2     # f = g - h (skipped if i = j)
Exit:

Typical Compilation

Major Types of Optimization

Optimization name | Explanation | Frequency

High-level: at or near source level; machine independent
Procedure integration | Replace procedure call by procedure body | N.M.

Local: within straight-line code
Common subexpression elimination | Replace two instances of the same computation by a single copy | 18%
Constant propagation | Replace all instances of a variable that is assigned a constant with the constant | 22%
Stack height reduction | Rearrange expression tree to minimize resources needed for expression evaluation | N.M.

Global: across a branch
Global common subexpression elimination | Same as local, but this version crosses branches | 13%
Copy propagation | Replace all instances of a variable A that has been assigned X (i.e. A = X) with X | 11%
Code motion | Remove code from a loop that computes the same value each iteration of the loop | 16%
Induction variable elimination | Simplify/eliminate array-addressing calculations within loops | 2%

Machine-dependent: depends on machine knowledge
Strength reduction | Many examples, such as replacing a multiply by a constant with adds and shifts | N.M.
Pipeline scheduling | Reorder instructions to improve pipeline performance | N.M.

Effect of Compiler Optimization

[Figure: effect of compiler optimization level, by program. Measurements taken on MIPS]

Level 0: non-optimized code
Level 1: local optimization
Level 2: global optimization, s/w pipelining
Level 3: adds procedure integration

Compiler Support for Multimedia Instr.

Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics)
Intel added a new set of instructions, called Streaming SIMD Extension
A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer
Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations
Strided addressing allows memory access in increments larger than one
Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data
Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup
Such limited support for vector processing makes the use of vectorizing compiler optimization unpopular and restricts its scope to hand-coded routines

SIMD instructions on MMX and AltiVec tend to be solutions, not primitives
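The difference between strided and gather/scatter addressing is easiest to see on a flat memory array. The following is only a sketch with hypothetical helper names; a Python list stands in for memory and list indices stand in for addresses.

```python
def strided_load(memory, base, stride, count):
    """Load `count` elements starting at `base`, stepping `stride` locations each time."""
    return [memory[base + i * stride] for i in range(count)]


def gather(memory, addresses):
    """Load the elements found at an explicit list of addresses."""
    return [memory[address] for address in addresses]


def scatter(memory, addresses, values):
    """Store each value at the matching address from the list."""
    for address, value in zip(addresses, values):
        memory[address] = value


memory = list(range(100, 116))            # memory[i] holds 100 + i
print(strided_load(memory, 1, 4, 3))      # every 4th element starting at address 1
print(gather(memory, [3, 0, 7]))          # arbitrary, unordered addresses
scatter(memory, [3, 0, 7], [-1, -2, -3])
print(memory[:8])
```

A strided access needs only a base and a step (for example, one column of a row-major matrix), while gather/scatter needs a whole vector of addresses, much like register indirect addressing applied element by element.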

Starting a Program

[Figure: translation hierarchy] C program -> Compiler -> Assembly language program -> Assembler -> Object: machine language module (together with Object: library routine, machine language) -> Linker -> Executable: machine language program -> Loader -> Memory

Linker
- Places code and data modules symbolically in memory
- Determines the addresses of data and instruction labels
- Patches both internal and external references

Object files for Unix typically contain:
- Header: size and position of components
- Text segment: machine code
- Data segment: static and dynamic variables
- Relocation info: identifies absolute memory references
- Symbol table: name and location of labels, procedures, and variables
- Debugging info: mapping of source to object code, break points, etc.

Loading Executable Program

[Figure: memory layout, from high to low addresses]
$sp -> 7fff ffff hex: Stack
Dynamic data
$gp -> 1000 8000 hex: Static data (starting at 1000 0000 hex)
pc -> 0040 0000 hex: Text
0: Reserved

To load an executable, the operating system follows these steps:
1. Reads the executable file header to determine the size of the text and data segments
2. Creates an address space large enough for the text and data
3. Copies the instructions and data from the executable file into memory
4. Copies the parameters (if any) to the main program onto the stack
5. Initializes the machine registers and sets the stack pointer to the first free location
6. Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program

Instruction Set Design Issues

Instruction set design issues:
- Number of Addresses
- Flow of Control
- Operand Types
- Addressing Modes
- Instruction Types
- Instruction Formats

Number of Addresses

Four categories:
3-address machines
- 2 for the source operands and one for the result
2-address machines
- One address doubles as source and result
1-address machines
- Accumulator machines
- Accumulator is used for one source and result
0-address machines
- Stack machines
- Operands are taken from the stack
- Result goes onto the stack

Number of Addresses (cont.)

Three-address machines
- Two for the source operands, one for the result
- RISC processors use three addresses
- Sample instructions:
  add dest,src1,src2    ; M(dest)=[src1]+[src2]
  sub dest,src1,src2    ; M(dest)=[src1]-[src2]
  mult dest,src1,src2   ; M(dest)=[src1]*[src2]

Three addresses: Operand 1, Operand 2, Result
Example: a = b + c
Three-address instruction formats are not common, because they require a relatively long instruction format to hold the three address references

Number of Addresses (cont.)

Example
C statement:
A = B + C * D - E + F + A
Equivalent code:
mult T,C,D   ; T = C*D
add T,T,B    ; T = B+C*D
sub T,T,E    ; T = B+C*D-E
add T,T,F    ; T = B+C*D-E+F
add A,T,A    ; A = B+C*D-E+F+A

Number of Addresses (cont.)

Two-address machines
- One address doubles (for source operand and result)
- Last example makes a case for it
  - Address T is used twice
- Sample instructions:
  load dest,src   ; M(dest)=[src]
  add dest,src    ; M(dest)=[dest]+[src]
  sub dest,src    ; M(dest)=[dest]-[src]
  mult dest,src   ; M(dest)=[dest]*[src]

Two addresses: one address doubles as operand and result
Example: a = a + b
The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation.

Number of Addresses (cont.)

Example
C statement:
A = B + C * D - E + F + A
Equivalent code:
load T,C   ; T = C
mult T,D   ; T = C*D
add T,B    ; T = B+C*D
sub T,E    ; T = B+C*D-E
add T,F    ; T = B+C*D-E+F
add A,T    ; A = B+C*D-E+F+A

Number of Addresses (cont.)

One-address machines
- Use a special set of registers called accumulators
  - Specify one source operand and receive the result
- Called accumulator machines
- Sample instructions:
  load addr    ; accum = [addr]
  store addr   ; M[addr] = accum
  add addr     ; accum = accum + [addr]
  sub addr     ; accum = accum - [addr]
  mult addr    ; accum = accum * [addr]

Number of Addresses (cont.)

Example
C statement:
A = B + C * D - E + F + A
Equivalent code:
load C    ; load C into accum
mult D    ; accum = C*D
add B     ; accum = C*D+B
sub E     ; accum = B+C*D-E
add F     ; accum = B+C*D-E+F
add A     ; accum = B+C*D-E+F+A
store A   ; store accum contents in A
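The one-address sequence can be traced with a few lines of code. This is a sketch only: the interpreter name run_accumulator, the tuple form of the instructions, and the sample values of A through F are assumptions made up for the illustration.

```python
def run_accumulator(program, memory):
    """Run a one-address program; every arithmetic instruction implies the accumulator."""
    accum = 0
    for op, addr in program:
        if op == "load":
            accum = memory[addr]
        elif op == "store":
            memory[addr] = accum
        elif op == "add":
            accum = accum + memory[addr]
        elif op == "sub":
            accum = accum - memory[addr]
        elif op == "mult":
            accum = accum * memory[addr]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory


memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
program = [("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
           ("add", "F"), ("add", "A"), ("store", "A")]
run_accumulator(program, memory)
print(memory["A"])  # 2 + 3*4 - 5 + 6 + 1 = 16
```

Every instruction names exactly one memory address, so instructions are short, but the statement now takes seven of them and every step touches memory.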

Number of Addresses (cont.)

Zero-address machines
- Stack supplies operands and receives the result
  - Special instructions to load and store use an address
- Called stack machines (Ex: HP3000, Burroughs B5500)
- Sample instructions:
  push addr   ; push([addr])
  pop addr    ; pop([addr])
  add         ; push(pop + pop)
  sub         ; push(pop - pop)
  mult        ; push(pop * pop)

Number of Addresses (cont.)

Example
C statement:
A = B + C * D - E + F + A
Equivalent code:
push E
push C
push D
mult
push B
add
sub
push F
add
push A
add
pop A

Load/Store Architecture

Instructions expect operands in internal processor registers
- Special LOAD and STORE instructions move data between registers and memory
- RISC uses this architecture
- Reduces instruction length

Load/Store Architecture (cont.)

Sample instructions:
load Rd,addr       ; Rd = [addr]
store addr,Rs      ; (addr) = Rs
add Rd,Rs1,Rs2     ; Rd = Rs1 + Rs2
sub Rd,Rs1,Rs2     ; Rd = Rs1 - Rs2
mult Rd,Rs1,Rs2    ; Rd = Rs1 * Rs2

Number of Addresses (cont.)

Example
C statement:
A = B + C * D - E + F + A
Equivalent code:
load R1,B
load R2,C
load R3,D
load R4,E
load R5,F
load R6,A
mult R2,R2,R3
add R2,R2,R1
sub R2,R2,R4
add R2,R2,R5
add R2,R2,R6
store A,R2

Flow of Control

Default is sequential flow
Several instructions alter this default execution:
- Branches
  - Unconditional
  - Conditional
  - Delayed branches
- Procedure calls
  - Delayed procedure calls

Flow of Control (cont.)

Branches

Unconditional
- Absolute address
- PC-relative
  - Target address is specified relative to PC contents
  - Relocatable code

Example: MIPS
- Absolute address
  j target
- PC-relative
  b target

Flow of Control (cont.)

Branches

Conditional
- Jump is taken only if the condition is met

Two types
- Set-Then-Jump
  - Condition testing is separated from branching
  - Condition code registers are used to convey the condition test result
  - Condition code registers keep a record of the status of the last ALU operation, such as the overflow condition
- Example: Pentium code
  cmp AX,BX    ;compare AX and BX
  je target    ;jump if equal

Flow of Control (cont.)

- Test-and-Jump
  - A single instruction performs condition testing and branching
- Example: MIPS instruction
  beq Rsrc1,Rsrc2,target
  Jumps to target if Rsrc1 = Rsrc2

Delayed branching
- Control is transferred after executing the instruction that follows the branch instruction
- This instruction slot is called the delay slot
- Improves efficiency
- Highly pipelined RISC processors support delayed branching
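Delayed branching can be illustrated with a toy interpreter. The sketch below is an assumption-laden model, not any real processor: instructions are `("op", label)` or `("branch", target_index)` tuples, and the function name `run_with_delay_slot` is made up for this example. It shows that the instruction placed right after a branch (the delay slot) still executes before control moves to the target.

```python
def run_with_delay_slot(program):
    """Return the labels of executed ops on a toy machine with one branch delay slot."""
    executed = []
    pc = 0
    pending_target = None          # target of a branch whose delay slot has not run yet
    while pc < len(program):
        kind, value = program[pc]
        next_pc = pc + 1
        if pending_target is not None:
            # The current instruction occupies the delay slot: it runs,
            # and only afterwards does the earlier branch take effect.
            next_pc = pending_target
            pending_target = None
        if kind == "branch":
            pending_target = value
        else:
            executed.append(value)
        pc = next_pc
    return executed


program = [
    ("op", "before"),
    ("branch", 4),        # unconditional branch to index 4
    ("op", "delay"),      # delay slot: executes even though the branch is taken
    ("op", "skipped"),    # never executed
    ("op", "target"),
]
trace = run_with_delay_slot(program)
print(trace)
```

A compiler or assembler for such a machine tries to move a useful instruction into the delay slot; when it cannot, it fills the slot with a no-op.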

Flow of Control (cont.)

Procedure calls
- Facilitate modular programming
- Require two pieces of information to return
  - End of procedure
    - Pentium uses the ret instruction
    - MIPS uses the jr instruction
  - Return address
    - In a (special) register: MIPS allows any general-purpose register
    - On the stack: Pentium

Flow of Control (cont.)

[Figure: delay slot]

Parameter Passing

Two basic techniques

Register-based (e.g., PowerPC, MIPS)
- Internal registers are used
  - Faster
  - Limits the number of parameters
  - Recursive procedures

Stack-based (e.g., Pentium)
- Stack is used
  - More general

2 Operand Types

Instructions support basic data types
- Characters
- Integers
- Floating-point

Instruction overload
- Same instruction for different data types
- Example: Pentium
  mov AL,address     ;loads an 8-bit value
  mov AX,address     ;loads a 16-bit value
  mov EAX,address    ;loads a 32-bit value

Operand Types (cont.)

Separate instructions
- Instructions specify the operand size
- Example: MIPS
  lb Rdest,address    ;loads a byte
  lh Rdest,address    ;loads a halfword (16 bits)
  lw Rdest,address    ;loads a word (32 bits)
  ld Rdest,address    ;loads a doubleword (64 bits)

Similar instruction: store

3 Addressing Modes

How the operands are specified

Operands can be in three places
- Registers
  - Register addressing mode
- Part of instruction
  - Constant
  - Immediate addressing mode
  - All processors support these two addressing modes
- Memory
  - Difference between RISC and CISC
  - CISC supports a large variety of addressing modes
  - RISC follows load/store architecture

4 Instruction Types

Several types of instructions

Data movement
- Pentium: mov dest,src
- Some do not provide direct data movement instructions
- Indirect data movement
  add Rdest,Rsrc,0    ;Rdest = Rsrc+0

Arithmetic and Logical
- Arithmetic
  - Integer and floating-point, signed and unsigned
  - add, subtract, multiply, divide
- Logical
  - and, or, not, xor

Instruction Types (cont.)

Condition code bits
- S: Sign bit (0 = +, 1 = -)
- Z: Zero bit (0 = nonzero, 1 = zero)
- O: Overflow bit (0 = no overflow, 1 = overflow)
- C: Carry bit (0 = no carry, 1 = carry)

Example: Pentium
  cmp count,25    ;compare count to 25
                  ;subtract 25 from count
  je target       ;jump if equal
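How a compare instruction sets the S, Z, O and C bits can be sketched by performing the subtraction in fixed-width arithmetic. This is an illustrative model, not the Pentium's actual implementation: the function name `compare_flags` and the dictionary return value are assumptions, and the carry bit is modeled as the borrow out of an unsigned subtraction.

```python
def compare_flags(a, b, bits=32):
    """Model 'cmp a,b': compute a - b in `bits`-wide arithmetic and return the flag bits."""
    mask = (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    a &= mask
    b &= mask
    diff = (a - b) & mask
    return {
        "S": int(bool(diff & sign_bit)),   # sign of the result
        "Z": int(diff == 0),               # result is zero, i.e. operands are equal
        "C": int(a < b),                   # borrow when viewed as unsigned numbers
        # Signed overflow: operands have different signs and the
        # result's sign differs from the first operand's sign.
        "O": int(bool((a ^ b) & (a ^ diff) & sign_bit)),
    }


def jump_if_equal(flags):
    """A 'je'-style branch is taken when the zero bit is set."""
    return flags["Z"] == 1


equal = compare_flags(25, 25)
smaller = compare_flags(3, 25)
overflow = compare_flags(-2**31, 1)
print(equal, smaller, overflow)
```

In the set-then-jump style, the compare and the conditional jump are separate instructions that communicate only through these bits, which is why the jump must follow the compare before any other flag-setting instruction runs.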

Instruction Types (cont.)

Flow control and I/O instructions
- Branch
- Procedure call
- Interrupts

I/O instructions
- Memory-mapped I/O
  - Most processors support memory-mapped I/O
  - No separate instructions for I/O
- Isolated I/O
  - Pentium supports isolated I/O
  - Separate I/O instructions
    in AX,io_port     ;read from an I/O port
    out io_port,AX    ;write to an I/O port

5 Instruction Formats

Two types

Fixed-length
- Used by RISC processors
- 32-bit RISC processors use 32-bit wide instructions
  - Examples: SPARC, MIPS, PowerPC

Variable-length
- Used by CISC processors
- Memory operands need more bits to specify

Opcode
- Major and exact operation

Examples of Instruction Formats

RISC (Reduced Instruction Set Computer)
versus
CISC (Complex Instruction Set Computer)

RISC vs CISC

The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.

RISC systems access memory only with explicit load and store instructions.

In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.

The difference between CISC and RISC becomes evident through the basic computer performance equation:

time/program = (time/cycle) × (cycles/instruction) × (instructions/program)

RISC systems shorten execution time by reducing the clock cycles per instruction.

CISC systems improve performance by reducing the number of instructions per program.

The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.

The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.

With fixed-length instructions, RISC lends itself to pipelining and speculative execution.

Consider the following program fragments:

CISC:
       mov ax, 10
       mov bx, 5
       mul bx, ax

RISC:
       mov ax, 0
       mov bx, 10
       mov cx, 5
Begin: add ax, bx
       loop Begin

The total clock cycles for the CISC version might be:
(2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles

While the clock cycles for the RISC version are:
(3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles

With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds.
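The cycle comparison is a direct application of counting instructions times cycles per instruction. The sketch below reproduces that arithmetic under the assumptions used in this example: every mov, add and loop takes 1 cycle, the mul takes 30 cycles, and the RISC loop body runs 5 times. The names `CISC_MIX`, `RISC_MIX` and `total_cycles` are made up for illustration.

```python
# Each entry: (instruction, number of times executed, cycles per execution).
CISC_MIX = [("mov", 2, 1), ("mul", 1, 30)]
RISC_MIX = [("mov", 3, 1), ("add", 5, 1), ("loop", 5, 1)]


def total_cycles(mix):
    """Sum count * cycles over every instruction class in the mix."""
    return sum(count * cycles for _name, count, cycles in mix)


def instruction_count(mix):
    """Total number of instructions executed."""
    return sum(count for _name, count, _cycles in mix)


cisc_cycles = total_cycles(CISC_MIX)
risc_cycles = total_cycles(RISC_MIX)
print(cisc_cycles, risc_cycles)
print(instruction_count(CISC_MIX), instruction_count(RISC_MIX))
```

The RISC fragment executes more instructions (13 versus 3) yet needs fewer cycles, which is the trade-off the performance equation describes: fewer instructions per program on one side, fewer cycles per instruction on the other.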

Because of their load-store ISAs, RISC architectures require a large number of CPU registers.

These registers provide fast access to data during sequential program execution.

They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.

Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.

[Figure: overlapping register windows]

This is how registers can be overlapped in a RISC system.

The current window pointer (CWP) points to the active register window.

It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.

Some RISC systems provide more extravagant instruction sets than some CISC systems.

Some systems combine both approaches.

The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.

RISC
- Multiple register sets.
- Three operands per instruction.
- Parameter passing through register windows.
- Single-cycle instructions.
- Hardwired control.
- Highly pipelined.

CISC
- Single register set.
- One or two register operands per instruction.
- Parameter passing through memory.
- Multiple-cycle instructions.
- Microprogrammed control.
- Less pipelined.

Continued...

RISC
- Simple instructions, few in number.
- Fixed-length instructions.
- Complexity in compiler.
- Only LOAD/STORE instructions access memory.
- Few addressing modes.

CISC
- Many complex instructions.
- Variable-length instructions.
- Complexity in microcode.
- Many instructions can access memory.
- Many addressing modes.

Summary

Instruction Set Design Issues

Instruction set design issues include:

Where are operands stored?
- registers, memory, stack, accumulator

How many explicit operands are there?
- 0, 1, 2, or 3

How is the operand location specified?
- register, immediate, indirect, . . .

What type and size of operands are supported?
- byte, int, float, double, string, vector, . . .

What operations are supported?
- add, sub, mul, move, compare, . . .

More About General Purpose Registers

Why do almost all new architectures use GPRs?

Registers are much faster than memory (even cache)
- Register values are available immediately
- When memory isn't ready, the processor must wait ("stall")

Registers are convenient for variable storage
- The compiler assigns some variables just to registers
- More compact code, since small fields specify registers (compared to memory addresses)

[Figure: Processor (Registers), Cache, Memory, Disk]

What Operations are Needed

Arithmetic + Logical
- Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
- Logical operation: AND, OR, XOR, NOT

Data Transfer - copy, load, store

Control - branch, jump, call, return

Floating Point - ADDF, MULF, DIVF
- Same as arithmetic but usually take bigger operands

Decimal - ADDD, CONVERT

String - move, compare, search

Graphics - pixel and vertex, compression/decompression operations

Stack Architecture: Pros and Cons

Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures

Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures

Accumulator Architecture: Pros and Cons

Pros
- Very low hardware requirements
- Easy to design and understand

Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic

Memory-Memory Architecture: Pros and Cons

Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)

Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required

Memory-Register Architecture: Pros and Cons

Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density

Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers

Instruction Formats

Instruction Set Architecture (ISA)

To command a computer's hardware, you must speak its language.

The words of a machine's language are called instructions, and its vocabulary is called an instruction set.

Once you learn one machine language, it is easy to pick up others:
- There are few fundamental operations that all computers must provide
- All designers have the same goal of finding a language that simplifies building the hardware and the compiler while maximizing performance and minimizing cost

Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept.

The MIPS instruction set is used as a case study.

Interface Design

A good interface:
- Lasts through many implementations (portability, compatibility)
- Is used in many different ways (generality)
- Provides convenient functionality to higher levels
- Permits an efficient implementation at lower levels

Design decisions must take into account:
- Technology
- Machine organization
- Programming languages
- Compiler technology
- Operating systems

[Figure: one interface with several uses (use) above it and several implementations (imp) below it, over time]

Classifying Instruction Set Architectures

Accumulator Architecture
• Common in early stored-program computers, when hardware was so expensive
• Machine has only one register (accumulator) involved in all math and logical operations
• All operations assume the accumulator as a source operand and a destination for the operation, with the other operand stored in memory

Extended Accumulator Architecture
• Dedicated registers for specific operations, e.g. stack and array index registers, added
• The 8086 microprocessor is an example of such a special-purpose register architecture

General-Purpose Register Architecture
• MIPS is an example of such an architecture, where registers are not tied to a single role
• This type of instruction set can be further divided into:
  • Register-memory: allows for one operand to be in memory
  • Register-register (load-store): demands all operands to be in registers

Machine | # general-purpose registers | Architecture style | Year
Motorola 6800 | 2 | Accumulator | 1974
DEC VAX | 16 | Register-memory, memory-memory | 1977
Intel 8086 | 1 | Extended accumulator | 1978
Motorola 68000 | 16 | Register-memory | 1980
Intel 80386 | 8 | Register-memory | 1985
PowerPC | 32 | Load-store | 1992
DEC Alpha | 32 | Load-store | 1992

Compact Code and Stack Architectures

When memory is scarce, machines like the Intel 80x86 had variable-length instructions to match varying operand specifications and minimize code size

Stack machines abandoned registers altogether, arguing that it is hard for compilers to use them efficiently

Operands are to be pushed on a stack from memory, and the results have to be popped from the stack to memory

Operations take their operands by default from the top of the stack and insert the results back onto the stack

Stack machines simplify compilers and lend themselves to a compact instruction encoding, but limit compiler optimization (e.g. in math expressions)

Example: A = B + C
Push AddressC    # Top = Top + 4; Stack[Top] = Memory[AddressC]
Push AddressB    # Top = Top + 4; Stack[Top] = Memory[AddressB]
add              # Stack[Top-4] = Stack[Top] + Stack[Top-4]; Top = Top - 4
Pop AddressA     # Memory[AddressA] = Stack[Top]; Top = Top - 4

Compact code is important for the heralded network computers, where programs must be downloaded over the Internet (e.g. Java-based applications)

Other Types of Architecture

High-Level-Language Architecture
• In the 1960s, systems software was rarely written in high-level languages, and virtually every commercial operating system before Unix was written in assembly
• Some people blamed the code density on the instruction set rather than the programming language
• A machine design philosophy was advocated with the goal of making the hardware more like high-level languages
• The effectiveness of high-level languages, memory size limitations and the lack of efficient compilers doomed this philosophy to a historical footnote

Reduced Instruction Set Architecture
• With the recent development in compiler technology and expanded memory sizes, fewer programmers are using assembly-level coding
• Instruction set architecture became measurable in the way compilers, rather than programmers, use it
• RISC architecture favors simplifying hardware design over enriching the offered set of instructions, relying on compilers to use them effectively to perform complex operations
• Virtually all new architectures since 1982 follow the RISC philosophy of fixed instruction lengths, load-store operations, and limited addressing modes

Evolution of Instruction Sets

Single Accumulator (EDSAC 1950)

Accumulator + Index Registers (Manchester Mark I, IBM 700 series 1953)

Separation of Programming Model from Implementation
- High-level Language Based (B5000 1963)
- Concept of a Family (IBM 360 1964)

General Purpose Register Machines
- Complex Instruction Sets (Vax, Intel 432 1977-80)
- Load/Store Architecture (CDC 6600, Cray 1 1963-76)

RISC (MIPS, SPARC, HP-PA, IBM RS6000, . . . 1987)

Register-Memory Architectures

Effect of the number of memory operands

# memory addresses | Max. number of operands | Examples
0 | 3 | SPARC, MIPS, PowerPC, ALPHA
1 | 2 | Intel 80x86, Motorola 68000
2 | 2 | VAX (also has 3-operand format)
3 | 3 | VAX (also has 2-operand format)

Interpreting Memory Addressing

The address of a word matches the byte address of one of its 4 bytes

The addresses of sequential words differ by 4 (word size in bytes)

Word addresses are multiples of 4 (alignment restriction)

Machines that use the address of the leftmost byte as the word address are called "Big Endian", and those that use the rightmost byte are called "Little Endian"

Misalignment complicates memory access and causes programs to run slower (some machines do not allow misaligned memory access at all)

Byte ordering can be a problem when exchanging data among different machines

Byte addressing affects array index calculation, which must account for word addressing and the offset within the word

Object addressed | Aligned at byte offsets | Misaligned at byte offsets
Byte | 0,1,2,3,4,5,6,7 | Never
Half word | 0,2,4,6 | 1,3,5,7
Word | 0,4 | 1,2,3,5,6,7
Double word | 0 | 1,2,3,4,5,6,7
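Byte ordering and alignment can both be demonstrated in a few lines. This is a small sketch using Python's standard `struct` module; the sample word `0x0A0B0C0D` and the helper names `is_aligned` and `misaligned_offsets` are assumptions chosen for illustration.

```python
import struct

WORD = 0x0A0B0C0D

# Big Endian: the most significant (leftmost) byte sits at the lowest address.
big = struct.pack(">I", WORD)
# Little Endian: the least significant (rightmost) byte sits at the lowest address.
little = struct.pack("<I", WORD)

print([hex(b) for b in big])
print([hex(b) for b in little])


def is_aligned(address, size):
    """An object of `size` bytes is aligned when its address is a multiple of size."""
    return address % size == 0


def misaligned_offsets(size, span=8):
    """Byte offsets within an 8-byte span at which an object of `size` bytes is misaligned."""
    return [offset for offset in range(span) if not is_aligned(offset, size)]


half_word_bad = misaligned_offsets(2)
word_bad = misaligned_offsets(4)
double_word_bad = misaligned_offsets(8)
print(half_word_bad, word_bad, double_word_bad)
```

Reading the little-endian bytes back with the big-endian format gives a different number, which is the problem that arises when machines with different byte orders exchange binary data.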

Addressing Modes

Addressing modes refer to how to specify the location of an operand (effective address)

Addressing modes have the ability to:
- Significantly reduce instruction counts
- Increase the average CPI
- Increase the complexity of building a machine

The VAX machine is used for benchmark data since it supports a wide range of memory addressing modes

Famous addressing modes can be classified based on:
- the source of the data, into register, immediate or memory
- the address calculation, into direct and indirect

An indexed addressing mode is usually provided to allow efficient implementation of loops and array access

Example of Addressing Modes

Addressing mode | Example | Meaning | When used
Register | Add R4,R3 | Regs[R4] <- Regs[R4] + Regs[R3] | When a value is in a register
Immediate | Add R4,#3 | Regs[R4] <- Regs[R4] + 3 | For constants
Displacement | Add R4,100(R1) | Regs[R4] <- Regs[R4] + Mem[100 + Regs[R1]] | Accessing local variables
Register indirect | Add R4,(R1) | Regs[R4] <- Regs[R4] + Mem[Regs[R1]] | Accessing using a pointer or a computed address
Indexed | Add R4,(R1 + R2) | Regs[R4] <- Regs[R4] + Mem[Regs[R1] + Regs[R2]] | Sometimes useful in array addressing: R1 = base of the array; R2 = index amount
Direct or absolute | Add R4,(1001) | Regs[R4] <- Regs[R4] + Mem[1001] | Sometimes useful for accessing static data; address constant may need to be large
Memory indirect or memory deferred | Add R4,@(R3) | Regs[R4] <- Regs[R4] + Mem[Mem[Regs[R3]]] | If R3 is the address of the pointer p, then the mode yields *p
Autoincrement | Add R4,(R2)+ | Regs[R4] <- Regs[R4] + Mem[Regs[R2]]; Regs[R2] <- Regs[R2] + d | Useful for stepping through arrays within a loop. R2 points to the start of the array; each reference increments R2 by d
Autodecrement | Add R4,-(R2) | Regs[R2] <- Regs[R2] - d; Regs[R4] <- Regs[R4] + Mem[Regs[R2]] | Same use as autoincrement. Autodecrement/increment can also act as push/pop to implement a stack
Scaled | Add R4,100(R2)[R3] | Regs[R4] <- Regs[R4] + Mem[100 + Regs[R2] + Regs[R3] * d] | Used to index arrays
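The operand lookups performed by the main addressing modes can be written out as plain dictionary accesses. The sketch below is an illustration only: `fetch_operand`, its keyword arguments, and the register and memory contents are hypothetical, and the autoincrement and autodecrement modes are left out because they also modify a register.

```python
def fetch_operand(mode, regs, mem, **fields):
    """Return the source operand selected by `mode` (a simplified model)."""
    if mode == "register":
        return regs[fields["reg"]]
    if mode == "immediate":
        return fields["value"]
    if mode == "displacement":
        return mem[fields["disp"] + regs[fields["base"]]]
    if mode == "register_indirect":
        return mem[regs[fields["base"]]]
    if mode == "indexed":
        return mem[regs[fields["base"]] + regs[fields["index"]]]
    if mode == "direct":
        return mem[fields["addr"]]
    if mode == "memory_indirect":
        return mem[mem[regs[fields["base"]]]]
    if mode == "scaled":
        return mem[fields["disp"] + regs[fields["base"]] + regs[fields["index"]] * fields["d"]]
    raise ValueError(f"unknown addressing mode: {mode}")


# Sample machine state chosen only for this illustration.
regs = {"R1": 8, "R2": 2, "R3": 16}
mem = {8: 50, 10: 60, 16: 8, 108: 70, 166: 80, 1001: 90}

print(fetch_operand("displacement", regs, mem, disp=100, base="R1"))
print(fetch_operand("memory_indirect", regs, mem, base="R3"))
print(fetch_operand("scaled", regs, mem, disp=100, base="R2", index="R3", d=4))
```

Each extra level of address arithmetic or memory lookup in these expressions is work the hardware must do within one instruction, which is why rich addressing modes reduce instruction counts but tend to raise the average CPI.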

Addressing Modes for Signal Processing

DSP offers special addressing modes to better serve popular algorithms

Special features require either hand coding or a compiler that uses such features (GNU would not be a good choice)

Modulo addressing
- Since DSP deals with continuous data streams, circular buffers are widely used
- Circular or modulo addressing allows automatic increment and decrement, and resets the pointer when reaching the end of the buffer

Reverse addressing
- The resulting address is the reverse bit order of the current address
- Reverse addressing mode expedites the access, which otherwise requires a number of logical instructions or extra memory accesses

Fast Fourier Transform
0 (000) -> 0 (000)
1 (001) -> 4 (100)
2 (010) -> 2 (010)
3 (011) -> 6 (110)
4 (100) -> 1 (001)
5 (101) -> 5 (101)
6 (110) -> 3 (011)
7 (111) -> 7 (111)
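Both DSP addressing modes reduce to short address computations. The following sketch models them in software under stated assumptions: the function names `modulo_next` and `bit_reverse` and their parameters are made up for illustration, and a real DSP performs these steps in its address-generation hardware rather than in instructions.

```python
def modulo_next(pointer, step, base, length):
    """Advance a circular-buffer pointer, wrapping inside [base, base + length)."""
    return base + (pointer - base + step) % length


def bit_reverse(index, bits):
    """Reverse the low `bits` bits of `index`, as used to reorder FFT data."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)   # move the lowest bit to the other end
        index >>= 1
    return result


# Walking forward past the end of an 8-entry buffer that starts at address 100.
wrapped = modulo_next(107, 1, base=100, length=8)
# Walking backward past the start.
wrapped_back = modulo_next(100, -1, base=100, length=8)
# Bit-reversed order for an 8-point FFT (3 address bits).
fft_order = [bit_reverse(i, 3) for i in range(8)]
print(wrapped, wrapped_back, fft_order)
```

Without hardware support, each access would need the shift, mask and compare operations shown here, which is the overhead these modes are designed to remove.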

Operations of the Computer Hardware

"There must certainly be instructions for performing the fundamental arithmetic operations."
Burks, Goldstine and von Neumann, 1947

Assembly language is a symbolic representation of what the processor actually understands

The MIPS assembler allows only one instruction per line and ignores comments following # until the end of the line

Example: Translation of a segment of a C program to MIPS assembly instructions

C: f = (g + h) - (i + j)

MIPS:
add t0, g, h     # temp variable t0 contains "g + h"
add t1, i, j     # temp variable t1 contains "i + j"
sub f, t0, t1    # f = t0 - t1 = (g + h) - (i + j)

Operations in the Instruction Set

Operator type | Examples
Arithmetic and logical | Integer arithmetic and logical operations: add, and, subtract, or
Data transfer | Loads-stores (move instructions on machines with memory addressing)
Control | Branch, jump, procedure call and return, trap
System | Operating system call, virtual memory management instructions
Floating point | Floating point instructions: add, multiply
Decimal | Decimal add, decimal multiply, decimal to character conversion
String | String move, string compare, string search
Graphics | Pixel operations, compression/decompression operations

Arithmetic, logical, data transfer and control are almost standard categories for all machines

System instructions are required for multi-programming environments, although support for system functions varies

Decimal and string instructions can be primitives, e.g. on the IBM 360 and the VAX

Support for floating point, decimal, string and graphics can be optional, sometimes provided via a co-processor

Some machines rely on the compiler to synthesize special operations, such as string handling, from simpler instructions

Operations for Media and Signal Processing

Single instruction multiple data (SIMD) and vector instructions are often supported in DSPs, which are commonly used in multimedia and signal processing applications

Partitioned Add (integer)
- Perform multiple 16-bit additions on a 64-bit ALU, since most data are narrow
- Increases ALU throughput for multimedia applications

Paired single operations (float)
- Allow the same register to act as two operands to the same operation
- Handy in dealing with vertices and coordinates

Multiply and accumulate
- Very handy for calculating dot products of vectors (signal processing) and matrix multiplication
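A partitioned add and a multiply-accumulate loop can both be modeled in ordinary integer arithmetic. The sketch below assumes four 16-bit lanes packed into a 64-bit word with wraparound in each lane; the function names `partitioned_add16` and `multiply_accumulate` are made up for this illustration, and real hardware does the lane additions in parallel rather than in a loop.

```python
LANE_BITS = 16
LANES = 4
LANE_MASK = (1 << LANE_BITS) - 1


def partitioned_add16(x, y):
    """Add four independent 16-bit lanes of two 64-bit words (no carry between lanes)."""
    result = 0
    for lane in range(LANES):
        shift = LANE_BITS * lane
        a = (x >> shift) & LANE_MASK
        b = (y >> shift) & LANE_MASK
        # Masking the sum discards the carry so it cannot spill into the next lane.
        result |= ((a + b) & LANE_MASK) << shift
    return result


def multiply_accumulate(xs, ys):
    """Dot product computed as repeated 'acc = acc + x * y' steps."""
    acc = 0
    for x, y in zip(xs, ys):
        acc += x * y
    return acc


packed_sum = partitioned_add16(0x0001_FFFF_0003_0004, 0x0001_0001_0001_0001)
dot = multiply_accumulate([1, 2, 3], [4, 5, 6])
print(hex(packed_sum), dot)
```

The lane holding 0xFFFF wraps to 0x0000 without disturbing its neighbor, which is the only hardware change a partitioned add needs relative to a normal 64-bit add: the carry chain is cut at lane boundaries.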

Frequency of Operations Usage

The most widely executed instructions are the simple operations of an instruction set

The following is the average usage in SPECint92 on Intel 80x86

Rank | 80x86 instruction | Integer average (% total executed)
1 | Load | 22%
2 | Conditional branch | 20%
3 | Compare | 16%
4 | Store | 12%
5 | Add | 8%
6 | And | 6%
7 | Sub | 5%
8 | Move register-register | 4%
9 | Call | 1%
10 | Return | 1%
Total | | 96%

Make the common case fast by focusing on these operations

Control Flow Instructions

Jump: for an unconditional change in the control flow

Branch: for a conditional change in the control flow

Procedure calls and returns

[Figure: frequency of control flow instructions. Data is based on SPEC on Alpha]

Destination Address Definition

Relative addressing with respect to the program counter proved to be the best choice for forward and backward branches or jumps (load-address independent)

To allow for dynamic loading of library routines, register indirect addressing allows addresses to be loaded in special registers (e.g. virtual functions in C++ and system calls in a case statement)

[Figure: branch distances. Data is based on SPEC on Alpha]

Condition Evaluation

Compare-and-branch can be efficient if the majority of conditions are comparisons with zero

Remember to focus on the common case

[Figure: based on SPEC on MIPS]

Frequency of Types of Comparison

[Figure: data is based on SPEC on Alpha]

Different benchmarks and machines set new design priorities

DSPs support a repeat instruction for "for" loops (vectors) using registers

Supporting Procedures

Execution of a procedure follows these steps:
- Store parameters in a place accessible to the procedure
- Transfer control to the procedure
- Acquire the storage resources needed for the procedure
- Perform the desired task
- Store the result value in a place accessible to the calling program
- Return control to the point of origin

The hardware provides a program counter to trace instruction flow and manage transfer of control

Parameter Passing
- Registers can be used for passing a small number of parameters
- A stack is used to spill registers of the current context, to make room for the called procedure to run, and to allow large parameters to be passed

Storage of machine state can be performed by the caller or the callee

Handling of shared variables is important to ensure correct semantics and thus requires clear specifications in the library interface

Global variables stored in registers need careful handling

Type and Size of Operands

- The type of an operand is designated by encoding it in the instruction's operation code
- The type of an operand, e.g. single-precision float, effectively gives its size
- Common operand types include character, half-word and word-size integer, and single- and double-precision floating point
- Characters are almost always in ASCII, integers are in 2's complement, and floating point is in IEEE 754
- The 16-bit Unicode used in Java is gaining popularity due to its support for international character sets
- For business applications, some architectures support a decimal format in binary-coded decimal (BCD)
- Depending on the size of the word, the complexity of handling different operand types differs
- DSPs offer fixed-point data types to support high-precision floating-point arithmetic and to allow sharing a single exponent for multiple numbers
- For graphics applications, vertex and pixel operands are added features
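The integer and decimal formats named above can be illustrated with a minimal sketch. The function names `to_twos_complement` and `to_packed_bcd` are hypothetical helpers invented for this example, and packing one decimal digit per 4-bit nibble is one common BCD layout, assumed here rather than taken from the slides.

```python
def to_twos_complement(value: int, bits: int) -> int:
    """Return the unsigned bit pattern encoding `value` in `bits`-bit two's complement."""
    low, high = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not low <= value <= high:
        raise ValueError(f"{value} does not fit in {bits} bits")
    # Masking keeps the low `bits` bits, which is exactly the two's-complement pattern.
    return value & ((1 << bits) - 1)


def to_packed_bcd(value: int) -> int:
    """Pack a non-negative decimal integer into BCD, one digit per 4-bit nibble."""
    if value < 0:
        raise ValueError("this sketch handles non-negative values only")
    result, shift = 0, 0
    while True:
        value, digit = divmod(value, 10)
        result |= digit << shift   # least significant digit goes in the lowest nibble
        shift += 4
        if value == 0:
            return result


if __name__ == "__main__":
    print(format(to_twos_complement(-1, 8), "08b"))   # bit pattern of -1 in 8 bits
    print(hex(to_packed_bcd(1995)))                   # hex digits mirror the decimal digits
```

Written in hexadecimal, a packed-BCD value reads the same as its decimal digits, which is why business-oriented instruction sets can convert it to text cheaply.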

Size of Operands

- The double-word data type is used for double-precision floating-point operations and for address storage in machines with a 64-bit-wide address bus
- Words are used for integer operations and on machines with a 32-bit address bus
- Because of the mix in SPEC, word and double-word data types dominate

(Figure: frequency of reference by size, based on SPEC2000 on Alpha.)

Instruction Representation

- Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary, or base 2)
- Numbers are stored in computers as a series of high and low electronic signals (binary numbers)
- Binary digits are called bits and are considered the atom of computing
- Each piece of an instruction is a number, and placing these numbers together forms the instruction
- Assemblers translate the symbolic assembly instructions into machine language instructions (machine code)

Example:

    Assembly:                     add $t0, $s1, $s2
    Machine language (decimal):   0       17     18     8      0      32
    Machine language (binary):    000000  10001  10010  01000  00000  100000
    Field widths (bits):          6       5      5      5      5      6

Note: the MIPS compiler by default maps $s0..$s7 to registers 16-23 and $t0..$t7 to registers 8-15.
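As a rough sketch of how the fields of a register-format instruction are placed together into one 32-bit word, the snippet below packs `add $t0, $s1, $s2`. The helper names (`REGISTERS`, `encode_r_format`, `encode_add`) are made up for this illustration; the register numbers follow the usual MIPS convention ($t0-$t7 are 8-15, $s0-$s7 are 16-23), and opcode 0 with funct 32 is the standard MIPS encoding of `add`.

```python
# Register name -> register number, using the conventional MIPS numbering.
REGISTERS = {f"$t{i}": 8 + i for i in range(8)}
REGISTERS.update({f"$s{i}": 16 + i for i in range(8)})


def encode_r_format(op: int, rs: int, rt: int, rd: int, shamt: int, funct: int) -> int:
    """Pack the six R-format fields (6, 5, 5, 5, 5, 6 bits) into a 32-bit word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


def encode_add(rd: str, rs: str, rt: str) -> int:
    """Encode `add rd, rs, rt` (opcode 0, funct 32)."""
    return encode_r_format(0, REGISTERS[rs], REGISTERS[rt], REGISTERS[rd], 0, 32)


if __name__ == "__main__":
    word = encode_add("$t0", "$s1", "$s2")
    print(format(word, "032b"))  # the six binary fields, concatenated
    print(hex(word))
```

Note that the destination register comes first in assembly but third among the register fields of the encoded word.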

Encoding an Instruction Set

- Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation
- The operation is typically specified in one field, called the opcode
- The addressing mode for the operand can be encoded with the operation, or specified through a separate identifier when a large number of modes is supported
- The architecture must balance several competing factors:
  - Desire to support as many registers and addressing modes as possible
  - Effect of operand specification on the size of the instruction (program)
  - Desire to simplify instruction fetching and decoding during execution
- Fixed-size instruction encoding simplifies the CPU design while limiting the addressing modes supported
- An architect caring about code size can use variable-size encoding
- A hybrid approach is to allow variability by supporting multiple-sized instructions

Encoding Examples

MIPS Instruction Format

Register-format instructions:

    op      rs      rt      rd      shamt   funct
    6 bits  5 bits  5 bits  5 bits  5 bits  6 bits

- op: Basic operation of the instruction, traditionally called the opcode
- rs: The first register source operand
- rt: The second register source operand
- rd: The register destination operand; it gets the result of the operation
- shamt: Shift amount
- funct: This field selects the specific variant of the operation in the op field

Immediate-type instructions:

    op      rs      rt      address
    6 bits  5 bits  5 bits  16 bits

- Some instructions need longer fields than provided, for large-value constants
- The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register
- Example: lw $t0, 32($s3)   # Temporary register $t0 gets A[8]

    Instruction  Format  op  rs   rt   rd   shamt  funct  address
    add          R       0   reg  reg  reg  0      32     n/a
    sub          R       0   reg  reg  reg  0      34     n/a
    lw           I       35  reg  reg  n/a  n/a    n/a    address
    sw           I       43  reg  reg  n/a  n/a    n/a    address
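The reach of a 16-bit address field can be checked with a short sketch. The helper names are hypothetical; opcode 35 is the standard MIPS encoding of `lw`, and the base register number 19 ($s3) and target register 8 ($t0) are chosen purely for illustration.

```python
LW_OPCODE = 35  # standard MIPS opcode for load word


def encode_i_format(op: int, rs: int, rt: int, offset: int) -> int:
    """Pack the immediate-format fields (6, 5, 5, 16 bits) into a 32-bit word."""
    if not -(1 << 15) <= offset < (1 << 15):
        raise ValueError("offset does not fit in a signed 16-bit field")
    # The offset is stored in two's complement in the low 16 bits.
    return (op << 26) | (rs << 21) | (rt << 16) | (offset & 0xFFFF)


def sign_extend_16(field: int) -> int:
    """Recover the signed offset from a raw 16-bit field."""
    return field - (1 << 16) if field & 0x8000 else field


if __name__ == "__main__":
    word = encode_i_format(LW_OPCODE, 19, 8, 32)    # lw $t0, 32($s3)
    print(hex(word))
    print(sign_extend_16(word & 0xFFFF))            # the offset read back from the word
    print(-(1 << 15), (1 << 15) - 1)                # smallest and largest encodable offsets
```

Because the field is signed, the reachable region extends both below and above the base address, which is what the ±2^15 bound expresses.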

The Stored Program Concept

Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept.

Today's computers are built on two key principles:

- Instructions are represented as numbers
- Programs can be stored in memory to be read or written just like numbers

The power of the concept: memory can contain

- the source code for an editor
- the compiled machine code for the editor
- the text that the compiled program is using
- the compiler that generated the code

(Figure: a processor connected to memory that holds an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and the source code in C for the editor program.)

Compiling if-then-else in MIPS

Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiler's code for the following C if statement?

    if (i == j) f = g + h; else f = g - h;

(Figure: flowchart testing i == j; when i = j the path computes f = g + h, when i ≠ j the Else path computes f = g - h, and both paths meet at Exit.)

MIPS:

    bne $s3, $s4, Else       # go to Else if i ≠ j
    add $s0, $s1, $s2        # f = g + h (skipped if i ≠ j)
    j   Exit
    Else: sub $s0, $s1, $s2  # f = g - h (skipped if i = j)
    Exit:

Typical Compilation

Major Types of Optimization

Each entry gives the optimization name and its explanation; a frequency is shown where the slide still carries one (N.M.).

High-level (at or near source level; machine-independent)

- Procedure integration: replace procedure call by procedure body (N.M.)

Local (within straight-line code)

- Common subexpression elimination: replace two instances of the same computation by a single copy
- Constant propagation: replace all instances of a variable that is assigned a constant with the constant
- Stack height reduction: rearrange the expression tree to minimize resources needed for expression evaluation (N.M.)

Global (across a branch)

- Global common subexpression elimination: same as local, but this version crosses branches
- Copy propagation: replace all instances of a variable A that has been assigned X (i.e., A = X) with X
- Code motion: remove code from a loop that computes the same value on each iteration of the loop
- Induction variable elimination: simplify/eliminate array-addressing calculations within loops

Machine-dependent (depends on machine knowledge)

- Strength reduction: many examples, such as replacing a multiply by a constant with adds and shifts (N.M.)
- Pipeline scheduling: reorder instructions to improve pipeline performance (N.M.)
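Strength reduction, the replacement of a multiply by a constant with adds and shifts, can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, and a real compiler performs the decomposition once at compile time and emits straight-line shift/add instructions rather than running a loop.

```python
def times_ten(x: int) -> int:
    """x * 10 rewritten by hand as (x * 8) + (x * 2), using only shifts and an add."""
    return (x << 3) + (x << 1)


def multiply_by_constant(x: int, constant: int) -> int:
    """Multiply x by a non-negative constant using one shifted add per set bit."""
    if constant < 0:
        raise ValueError("this sketch handles non-negative constants only")
    result, shift = 0, 0
    while constant:
        if constant & 1:            # this power of two is present in the constant
            result += x << shift
        constant >>= 1
        shift += 1
    return result


if __name__ == "__main__":
    print(times_ten(7), multiply_by_constant(7, 10))
```

The rewrite only pays off when the constant has few set bits; otherwise the chain of shifts and adds can cost more than the multiply it replaces.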

Effect of Compiler Optimization

(Figure: measurements by program and compiler optimization level.)

- Level 0: non-optimized code
- Level 1: local optimization
- Level 2: global optimization, s/w pipelining
- Level 3: adds procedure integration

Compiler Support for Multimedia Instructions

- Intel's MMX and PowerPC AltiVec have small vector-processing capabilities targeting multimedia applications (to speed up graphics)
- Intel added a new set of instructions, called Streaming SIMD Extension
- A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer
- Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations
  - Strided addressing allows memory access in increments larger than one
  - Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data
- Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup
- Such limited support for vector processing makes the use of vectorizing compiler optimization unpopular and restricts its scope to hand-coded routines

SIMD instructions on MMX and AltiVec tend to be solutions, not primitives.
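The two addressing styles can be pictured with a small software model. This is a sketch only: memory is modelled as a Python list, and `strided_load`, `gather_load`, and `scatter_store` are hypothetical names, not instructions of any real vector unit.

```python
def strided_load(memory, base, stride, count):
    """Load `count` elements starting at `base`, stepping `stride` elements each time."""
    return [memory[base + i * stride] for i in range(count)]


def gather_load(memory, indices):
    """Load the elements whose addresses are listed in an index vector."""
    return [memory[i] for i in indices]


def scatter_store(memory, indices, values):
    """Store each value at the address given by the matching index-vector entry."""
    for index, value in zip(indices, values):
        memory[index] = value


if __name__ == "__main__":
    # A 4x4 matrix stored row by row: element (r, c) lives at address 4*r + c.
    memory = list(range(100, 116))
    print(strided_load(memory, 1, 4, 4))        # column 1 of the matrix
    print(gather_load(memory, [0, 5, 10, 15]))  # the main diagonal
```

A column of a row-major matrix is the typical case for a stride larger than one; irregular patterns such as a sparse index list need gather/scatter.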

Starting a Program

(Figure: C program -> Compiler -> Assembly language program -> Assembler -> Object: machine language module, together with Object: library routine (machine language) -> Linker -> Executable: machine language program -> Loader -> Memory.)

Linker:

- Place code and data modules symbolically in memory
- Determine the addresses of data and instruction labels
- Patch both internal and external references

Object files for Unix typically contain:

- Header: size and position of components
- Text segment: machine code
- Data segment: static and dynamic variables
- Relocation info: identifies absolute memory references
- Symbol table: names and locations of labels, procedures, and variables
- Debugging info: mapping of source to object code, break points, etc.

Loading Executable Program

(Figure: memory layout from high to low addresses: the stack, with $sp at 7fff ffff hex; dynamic data; static data starting at 1000 0000 hex, with $gp at 1000 8000 hex; text, with pc at 0040 0000 hex; and a reserved region down to address 0.)

To load an executable, the operating system follows these steps:

- Reads the executable file header to determine the size of the text and data segments
- Creates an address space large enough for the text and data
- Copies the instructions and data from the executable file into memory
- Copies the parameters (if any) to the main program onto the stack
- Initializes the machine registers and sets the stack pointer to the first free location
- Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program

Instruction Set Design Issues

- Number of Addresses
- Flow of Control
- Operand Types & Addressing Modes
- Instruction Types
- Instruction Formats

Number of Addresses

Four categories:

- 3-address machines
  - 2 for the source operands and one for the result
- 2-address machines
  - One address doubles as source and result
- 1-address machines
  - Accumulator machines
  - Accumulator is used for one source and result
- 0-address machines
  - Stack machines
  - Operands are taken from the stack
  - Result goes onto the stack

Number of Addresses (cont.)

Three-address machines:

- Two for the source operands, one for the result
- RISC processors use three addresses
- Sample instructions:

    add  dest,src1,src2    ; M(dest) = [src1] + [src2]
    sub  dest,src1,src2    ; M(dest) = [src1] - [src2]
    mult dest,src1,src2    ; M(dest) = [src1] * [src2]

Three addresses: Operand 1, Operand 2, Result

Example: a = b + c

Three-address instruction formats are not common, because they require a relatively long instruction format to hold the three address references.

Number of Addresses (cont.)

Example

C statement:

    A = B + C * D - E + F + A

Equivalent code:

    mult T,C,D    ; T = C*D
    add  T,T,B    ; T = B+C*D
    sub  T,T,E    ; T = B+C*D-E
    add  T,T,F    ; T = B+C*D-E+F
    add  A,T,A    ; A = B+C*D-E+F+A

Number of Addresses (cont.)

Two-address machines:

- One address doubles (for source operand & result)
- The last example makes a case for it
  - Address T is used twice
- Sample instructions:

    load dest,src    ; M(dest) = [src]
    add  dest,src    ; M(dest) = [dest] + [src]
    sub  dest,src    ; M(dest) = [dest] - [src]
    mult dest,src    ; M(dest) = [dest] * [src]

Two addresses: one address doubles as operand and result

Example: a = a + b

The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation.

Number of Addresses (cont.)

Example

C statement:

    A = B + C * D - E + F + A

Equivalent code:

    load T,C    ; T = C
    mult T,D    ; T = C*D
    add  T,B    ; T = B+C*D
    sub  T,E    ; T = B+C*D-E
    add  T,F    ; T = B+C*D-E+F
    add  A,T    ; A = B+C*D-E+F+A

Number of Addresses (cont.)

One-address machines:

- Use a special set of registers called accumulators
  - Specify one source operand & receive the result
- Called accumulator machines
- Sample instructions:

    load  addr    ; accum = [addr]
    store addr    ; M[addr] = accum
    add   addr    ; accum = accum + [addr]
    sub   addr    ; accum = accum - [addr]
    mult  addr    ; accum = accum * [addr]

Number of Addresses (cont.)

Example

C statement:

    A = B + C * D - E + F + A

Equivalent code:

    load  C    ; load C into accum
    mult  D    ; accum = C*D
    add   B    ; accum = C*D+B
    sub   E    ; accum = B+C*D-E
    add   F    ; accum = B+C*D-E+F
    add   A    ; accum = B+C*D-E+F+A
    store A    ; store accum contents in A
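The accumulator sequence above can be traced with a tiny interpreter. This is a sketch under assumptions: memory is modelled as a dictionary keyed by variable name, `run_accumulator` is a hypothetical helper, and the starting values of A through F are arbitrary numbers picked for the demonstration.

```python
def run_accumulator(program, memory):
    """Execute one-address instructions; every operation goes through `accum`."""
    memory = dict(memory)   # work on a copy so the caller's values are untouched
    accum = 0
    for opcode, addr in program:
        if opcode == "load":
            accum = memory[addr]
        elif opcode == "store":
            memory[addr] = accum
        elif opcode == "add":
            accum = accum + memory[addr]
        elif opcode == "sub":
            accum = accum - memory[addr]
        elif opcode == "mult":
            accum = accum * memory[addr]
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return memory


# A = B + C * D - E + F + A, in the same order as the slide's code.
PROGRAM = [("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
           ("add", "F"), ("add", "A"), ("store", "A")]

if __name__ == "__main__":
    start = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
    print(run_accumulator(PROGRAM, start)["A"])
```

Every instruction except the implicit accumulator access touches memory, which is the source of the high memory traffic attributed to accumulator machines.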

Number of Addresses (cont.)

Zero-address machines:

- Stack supplies operands and receives the result
  - Special instructions to load and store use an address
- Called stack machines (Ex: HP 3000, Burroughs B5500)
- Sample instructions:

    push addr    ; push([addr])
    pop  addr    ; pop([addr])
    add          ; push(pop + pop)
    sub          ; push(pop - pop)
    mult         ; push(pop * pop)

Number of Addresses (cont.)

Example

C statement:

    A = B + C * D - E + F + A

Equivalent code:

    push E
    push C
    push D
    mult
    push B
    add
    sub
    push F
    add
    push A
    add
    pop  A
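The stack sequence can be checked the same way. In this sketch, `run_stack` is a hypothetical helper, the variable values are arbitrary, and `sub` is read as push(pop - pop) with the first pop taking the top of the stack, which is the reading under which pushing E first yields the intended result.

```python
def run_stack(program, memory):
    """Execute zero-address instructions; only push and pop name a memory address."""
    memory = dict(memory)   # work on a copy so the caller's values are untouched
    stack = []
    for instruction in program:
        opcode = instruction[0]
        if opcode == "push":
            stack.append(memory[instruction[1]])
        elif opcode == "pop":
            memory[instruction[1]] = stack.pop()
        else:
            top = stack.pop()     # first pop: the most recently pushed value
            below = stack.pop()   # second pop: the value underneath it
            if opcode == "add":
                stack.append(top + below)
            elif opcode == "sub":
                stack.append(top - below)
            elif opcode == "mult":
                stack.append(top * below)
            else:
                raise ValueError(f"unknown opcode: {opcode}")
    return memory


# A = B + C * D - E + F + A, in the same order as the slide's code.
PROGRAM = [("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
           ("push", "B"), ("add",), ("sub",), ("push", "F"), ("add",),
           ("push", "A"), ("add",), ("pop", "A")]

if __name__ == "__main__":
    start = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
    print(run_stack(PROGRAM, start)["A"])
```

Pushing E before the product is what lets the later `sub` find its operands in the right order without any extra swap instruction.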

Load/Store Architecture

- Instructions expect operands in internal processor registers
- Special LOAD and STORE instructions move data between registers and memory
- RISC uses this architecture
- Reduces instruction length

Load/Store Architecture (cont.)

Sample instructions:

    load  Rd,addr      ; Rd = [addr]
    store addr,Rs      ; (addr) = Rs
    add   Rd,Rs1,Rs2   ; Rd = Rs1 + Rs2
    sub   Rd,Rs1,Rs2   ; Rd = Rs1 - Rs2
    mult  Rd,Rs1,Rs2   ; Rd = Rs1 * Rs2

Number of Addresses (cont.)

Example

C statement:

    A = B + C * D - E + F + A

Equivalent code:

    load  R1,B
    load  R2,C
    load  R3,D
    load  R4,E
    load  R5,F
    load  R6,A
    mult  R2,R2,R3
    add   R2,R2,R1
    sub   R2,R2,R4
    add   R2,R2,R5
    add   R2,R2,R6
    store A,R2

Flow of Control

- Default is sequential flow
- Several instructions alter this default execution
  - Branches
    - Unconditional
    - Conditional
    - Delayed branches
  - Procedure calls
    - Delayed procedure calls

Flow of Control (cont.)

Branches

- Unconditional
  - Absolute address
  - PC-relative
    - Target address is specified relative to PC contents
    - Relocatable code
- Example: MIPS
  - Absolute address: j target
  - PC-relative: b target


Flow of Control (cont.)

Branches

- Conditional
  - Jump is taken only if the condition is met
- Two types
  - Set-Then-Jump
    - Condition testing is separated from branching
    - Condition code registers are used to convey the condition test result
    - Condition code registers keep a record of the status of the last ALU operation, such as an overflow condition
  - Example: Pentium code

    cmp AX,BX     ; compare AX and BX
    je  target    ; jump if equal

Flow of Control (cont.)

- Test-and-Jump
  - Single instruction performs condition testing and branching
  - Example: MIPS instruction

    beq Rsrc1,Rsrc2,target    ; jumps to target if Rsrc1 = Rsrc2

Delayed branching

- Control is transferred after executing the instruction that follows the branch instruction
  - This instruction slot is called the delay slot
- Improves efficiency
- Highly pipelined RISC processors support it

Flow of Control (cont.)

Procedure calls

- Facilitate modular programming
- Require two pieces of information to return
  - End of procedure
    - Pentium uses the ret instruction
    - MIPS uses the jr instruction
  - Return address
    - In a (special) register: MIPS allows any general-purpose register
    - On the stack: Pentium

Flow of Control (cont.)

(Figure: delay slot.)

Parameter Passing

Two basic techniques:

- Register-based (e.g., PowerPC, MIPS)
  - Internal registers are used
    - Faster
    - Limit the number of parameters
    - Recursive procedure
- Stack-based (e.g., Pentium)
  - Stack is used
    - More general

2. Operand Types

- Instructions support basic data types
  - Characters
  - Integers
  - Floating-point
- Instruction overload
  - Same instruction for different data types
  - Example: Pentium

    mov AL,address     ; loads an 8-bit value
    mov AX,address     ; loads a 16-bit value
    mov EAX,address    ; loads a 32-bit value

Operand Types (cont.)

- Separate instructions
  - Instructions specify the operand size
  - Example: MIPS

    lb Rdest,address    ; loads a byte
    lh Rdest,address    ; loads a halfword (16 bits)
    lw Rdest,address    ; loads a word (32 bits)
    ld Rdest,address    ; loads a doubleword (64 bits)

  - Similar instructions for store

3. Addressing Modes

How the operands are specified

- Operands can be in three places
  - Registers
    - Register addressing mode
  - Part of instruction
    - Constant
    - Immediate addressing mode
    - All processors support these two addressing modes
  - Memory
    - Difference between RISC and CISC
    - CISC supports a large variety of addressing modes
    - RISC follows load/store architecture

4. Instruction Types

Several types of instructions:

- Data movement
  - Pentium: mov dest,src
  - Some do not provide direct data movement instructions
  - Indirect data movement:

    add Rdest,Rsrc,0    ; Rdest = Rsrc+0

- Arithmetic and Logical
  - Arithmetic
    - Integer and floating-point, signed and unsigned
    - add, subtract, multiply, divide
  - Logical
    - and, or, not, xor

Instruction Types (cont.)

Condition code bits

- S: Sign bit (0 = +, 1 = -)
- Z: Zero bit (0 = nonzero, 1 = zero)
- O: Overflow bit (0 = no overflow, 1 = overflow)
- C: Carry bit (0 = no carry, 1 = carry)

Example: Pentium

    cmp count,25    ; compare count to 25
                    ; (subtract 25 from count)
    je  target      ; jump if equal
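How a compare sets the four condition code bits can be sketched by modelling the comparison as a subtraction whose result is discarded. The function name `compare_flags` and the 8-bit default width are assumptions made for this illustration; the carry bit is modelled as the borrow out of the subtraction, which is how x86-style processors define it for `cmp`.

```python
def compare_flags(a: int, b: int, bits: int = 8) -> dict:
    """Return the S, Z, O, C bits produced by computing a - b in `bits` bits."""
    mask = (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    a &= mask                      # treat the inputs as raw bit patterns
    b &= mask
    result = (a - b) & mask
    return {
        "S": int(bool(result & sign_bit)),   # sign of the result
        "Z": int(result == 0),               # result is zero: the operands were equal
        # Signed overflow: operands differ in sign and the result's sign differs from a's.
        "O": int(bool((a ^ b) & (a ^ result) & sign_bit)),
        "C": int(a < b),                     # unsigned borrow
    }


def jump_if_equal_taken(flags: dict) -> bool:
    """A je-style branch tests only the zero bit."""
    return flags["Z"] == 1


if __name__ == "__main__":
    print(compare_flags(25, 25))
    print(compare_flags(3, 5))
    print(jump_if_equal_taken(compare_flags(25, 25)))
```

Separating the test from the branch this way is what the set-then-jump style relies on: the branch instruction only inspects bits that an earlier instruction left behind.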

Instruction Types (cont.)

Flow control and I/O instructions

- Branch
- Procedure call
- Interrupts

I/O instructions

- Memory-mapped I/O
  - Most processors support memory-mapped I/O
  - No separate instructions for I/O
- Isolated I/O
  - Pentium supports isolated I/O
  - Separate I/O instructions

    in  AX,io_port    ; read from an I/O port
    out io_port,AX    ; write to an I/O port

5. Instruction Formats

Two types:

- Fixed-length
  - Used by RISC processors
  - 32-bit RISC processors use 32-bit-wide instructions
    - Examples: SPARC, MIPS, PowerPC
- Variable-length
  - Used by CISC processors
  - Memory operands need more bits to specify

Opcode

- Major and exact operation

Examples of Instruction Formats

RISC (Reduced Instruction Set Computer) versus CISC (Complex Instruction Set Computer)

RISC vs CISC

- The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.
- RISC systems access memory only with explicit load and store instructions.
- In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.

RISC vs CISC

The difference between CISC and RISC becomes evident through the basic computer performance equation:

    time/program = (time/cycle) × (cycles/instruction) × (instructions/program)

- RISC systems shorten execution time by reducing the clock cycles per instruction.
- CISC systems improve performance by reducing the number of instructions per program.

RISC vs CISC

- The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.
- The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.
- With fixed-length instructions, RISC lends itself to pipelining and speculative execution.

RISC vs CISC

Consider the two program fragments:

CISC:

    mov ax, 10
    mov bx, 5
    mul bx, ax

RISC:

    mov ax, 0
    mov bx, 10
    mov cx, 5
    Begin: add ax, bx
    loop Begin

The total clock cycles for the CISC version might be:

    (2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles

While the clock cycles for the RISC version are:

    (3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles

With the RISC clock cycle being shorter, RISC gives us much faster execution speeds.
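The cycle comparison can be worked through with the performance equation. The figures below are assumptions for illustration: a 10 × 5 multiplication, one cycle for each simple instruction, a 30-cycle multiply, and a hypothetical 1 ns clock for both machines. The helper names are made up for this sketch.

```python
def total_cycles(instruction_mix):
    """Sum count * cycles-per-instruction over (count, cycles) pairs."""
    return sum(count * cycles for count, cycles in instruction_mix)


def execution_time_ns(instruction_mix, cycle_time_ns):
    """time/program = total cycles * time/cycle."""
    return total_cycles(instruction_mix) * cycle_time_ns


# (instruction count, cycles per instruction); all values are assumed.
CISC_MIX = [(2, 1), (1, 30)]          # 2 movs, 1 multi-cycle mul
RISC_MIX = [(3, 1), (5, 1), (5, 1)]   # 3 movs, 5 adds, 5 loop instructions

if __name__ == "__main__":
    print(total_cycles(CISC_MIX), total_cycles(RISC_MIX))
    # Same clock assumed for both; a shorter RISC cycle would widen the gap.
    print(execution_time_ns(CISC_MIX, 1.0), execution_time_ns(RISC_MIX, 1.0))
```

The RISC fragment executes more instructions, yet needs fewer cycles because no single instruction is expensive; the outcome depends entirely on the assumed multiply latency and loop count.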

RISC vs CISC

- Because of their load-store ISAs, RISC architectures require a large number of CPU registers.
- These registers provide fast access to data during sequential program execution.
- They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.
- Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.

RISC vs CISC

- This is how registers can be overlapped in a RISC system.
- The current window pointer (CWP) points to the active register window.

RISC vs CISC

- It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.
- Some RISC systems provide more extravagant instruction sets than some CISC systems.
- Some systems combine both approaches.
- The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.

RISC vs CISC

RISC

- Multiple register sets.
- Three operands per instruction.
- Parameter passing through register windows.
- Single-cycle instructions.
- Hardwired control.
- Highly pipelined.

CISC

- Single register set.
- One or two register operands per instruction.
- Parameter passing through memory.
- Multiple-cycle instructions.
- Microprogrammed control.
- Less pipelined.

Continued...

RISC vs CISC

RISC

- Simple instructions, few in number.
- Fixed-length instructions.
- Complexity in compiler.
- Only LOAD/STORE instructions access memory.
- Few addressing modes.

CISC

- Many complex instructions.
- Variable-length instructions.
- Complexity in microcode.
- Many instructions can access memory.
- Many addressing modes.

Summary

Instruction Set Design Issues

Instruction set design issues include:

- Where are operands stored?
  - registers, memory, stack, accumulator
- How many explicit operands are there?
  - 0, 1, 2, or 3
- How is the operand location specified?
  - register, immediate, indirect, ...
- What type & size of operands are supported?
  - byte, int, float, double, string, vector, ...
- What operations are supported?
  - add, sub, mul, move, compare, ...

More About General Purpose Registers

Why do almost all new architectures use GPRs?

- Registers are much faster than memory (even cache)
  - Register values are available immediately
  - When memory isn't ready, the processor must wait ("stall")
- Registers are convenient for variable storage
  - Compiler assigns some variables just to registers
  - More compact code, since small fields specify registers (compared to memory addresses)

(Figure: processor with registers, then cache, memory, and disk.)

What Operations are Needed

- Arithmetic + Logical
  - Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
  - Logical operation: AND, OR, XOR, NOT
- Data Transfer: copy, load, store
- Control: branch, jump, call, return
- Floating Point: ADDF, MULF, DIVF
  - Same as arithmetic, but usually take bigger operands
- Decimal: ADDD, CONVERT
- String: move, compare, search
- Graphics: pixel and vertex, compression/decompression operations

Stack Architecture: Pros and Cons

Pros

- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures

Cons

- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures

Accumulator Architecture: Pros and Cons

Pros

- Very low hardware requirements
- Easy to design and understand

Cons

- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic

Memory-Memory Architecture: Pros and Cons

Pros

- Requires fewer instructions (especially if 3 operands)

- Easy to write compilers for (especially if 3 operands)

Cons

- Very high memory traffic (especially if 3 operands)

- Variable number of clocks per instruction

- With two operands, more data movements are required

Memory-Register Architecture: Pros and Cons

Pros

- Some data can be accessed without loading first

- Instruction format easy to encode

- Good code density

Cons

- Operands are not equivalent (poor orthogonality)

- Variable number of clocks per instruction

- May limit number of registers

Instruction Set Architecture (ISA)

To command a computer's hardware, you must speak its language.

The words of a machine's language are called instructions, and its vocabulary is called the instruction set.

Once you learn one machine language, it is easy to pick up others: there are a few fundamental operations that all computers must provide.

All designers have the same goal of finding a language that simplifies building the hardware and the compiler while maximizing performance and minimizing cost.

Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept.

The MIPS instruction set is used as a case study.

Interface Design

A good interface:

- Lasts through many implementations (portability, compatibility)

- Is used in many different ways (generality)

- Provides convenient functionality to higher levels

- Permits an efficient implementation at lower levels

Design decisions must take into account:

- Technology

- Machine organization

- Programming languages

- Compiler technology

- Operating systems

[Figure: a single Interface, with several uses above it and several implementations (imp 0, imp 1, ...) below it, over time]

Classifying Instruction Set Architectures

Accumulator Architecture

- Common in early stored-program computers, when hardware was so expensive

- Machine has only one register (accumulator) involved in all math and logical operations

- All operations assume the accumulator as a source operand and a destination for the operation, with the other operand stored in memory

Extended Accumulator Architecture

- Dedicated registers for specific operations, e.g. stack and array index registers, are added

- The 8086 microprocessor is an example of such a special-purpose register architecture

General-Purpose Register Architecture

- MIPS is an example of such an architecture, where registers are not tied to a single role

- This type of instruction set can be further divided into:

- Register-memory: allows one operand to be in memory

- Register-register (load-store): demands all operands to be in registers

Machine | # general-purpose registers | Architecture style | Year
Motorola 6800 | 2 | Accumulator | 1974
DEC VAX | 16 | Register-memory, memory-memory | 1977
Intel 8086 | 1 | Extended accumulator | 1978
Motorola 68000 | 16 | Register-memory | 1980
Intel 80386 | 8 | Register-memory | 1985
PowerPC | 32 | Load-store | 1992
DEC Alpha | 32 | Load-store | 1992

Compact Code and Stack Architectures

- When memory is scarce, machines like the Intel 80x86 had variable-length instructions to match varying operand specifications and minimize code size

- Stack machines abandoned registers altogether, arguing that it is hard for compilers to use them efficiently

- Operands are pushed on a stack from memory, and the results have to be popped from the stack to memory

- Operations take their operands by default from the top of the stack and insert the results back onto the stack

- Stack machines simplify compilers and lend themselves to a compact instruction encoding, but limit compiler optimization (e.g. in math expressions)

Example: A = B + C

    Push AddressC   # Top=Top+4; Stack[Top]=Memory[AddressC]
    Push AddressB   # Top=Top+4; Stack[Top]=Memory[AddressB]
    add             # Stack[Top-4]=Stack[Top]+Stack[Top-4]; Top=Top-4
    Pop AddressA    # Memory[AddressA]=Stack[Top]; Top=Top-4

- Compact code is important for heralded network computers, where programs must be downloaded over the Internet (e.g. Java-based applications)
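The push/add/pop sequence for A = B + C can be traced with a small interpreter. This is only a sketch: the function name `run_stack_program`, the tuple-based program format, and the use of a Python dict as memory are assumptions made for illustration, and the explicit Top pointer arithmetic from the slide is replaced by a Python list.

```python
# Minimal stack-machine sketch (hypothetical helper, not from the slides).
# Memory is a dict keyed by symbolic address; the stack is a Python list.
def run_stack_program(program, memory):
    stack = []
    for op, *args in program:
        if op == "push":            # push Memory[address] onto the stack
            stack.append(memory[args[0]])
        elif op == "pop":           # pop the top of stack into Memory[address]
            memory[args[0]] = stack.pop()
        elif op == "add":           # replace the top two entries by their sum
            top = stack.pop()
            below = stack.pop()
            stack.append(top + below)
        else:
            raise ValueError(f"unknown operation: {op}")
    return memory


memory = {"A": 0, "B": 5, "C": 7}
program = [("push", "C"), ("push", "B"), ("add",), ("pop", "A")]
run_stack_program(program, memory)
print(memory["A"])
```

Note that the `add` instruction names no operands at all, which is where the compact encoding of stack machines comes from.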

Other Types of Architecture

High-Level-Language Architecture

- In the 1960s, systems software was rarely written in high-level languages, and virtually every commercial operating system before Unix was written in assembly

- Some people blamed the code density on the instruction set rather than the programming language

- A machine design philosophy was advocated with the goal of making the hardware more like high-level languages

- The effectiveness of high-level languages, memory size limitations and the lack of efficient compilers doomed this philosophy to a historical footnote

Reduced Instruction Set Architecture

- With the recent development in compiler technology and expanded memory sizes, fewer programmers are using assembly-level coding

- Instruction set architectures came to be measured by how well compilers, rather than programmers, use them

- RISC architecture favors simplifying hardware design over enriching the offered set of instructions, relying on compilers to use them effectively to perform complex operations

- Virtually all new architectures since 1982 follow the RISC philosophy of fixed instruction lengths, load-store operations and limited addressing modes

Evolution of Instruction Sets

[Figure: evolution of instruction sets. Single Accumulator (EDSAC) -> Accumulator + Index Registers (Manchester Mark I and later series) -> Separation of Programming Model from Implementation (High-level Language Based; Concept of a Family) -> General Purpose Register Machines -> Complex Instruction Sets (VAX, Intel) and Load/Store Architecture (CDC, Cray 1) -> RISC (MIPS, SPARC, ...)]

Register-Memory Architectures

Effect of the number of memory operands

# memory addresses | Max. number of operands | Examples
0 | 3 | SPARC, MIPS, PowerPC, ALPHA
1 | 2 | Intel 80x86, Motorola 68000
2 | 2 | VAX (also has 3-operand format)
3 | 3 | VAX (also has 2-operand format)

Interpreting Memory Addressing

- The address of a word matches the byte address of one of its 4 bytes

- The addresses of sequential words differ by 4 (word size in bytes)

- Word addresses are multiples of 4 (alignment restriction)

- Machines that use the address of the leftmost byte as the word address are called "Big Endian", and those that use the rightmost byte are called "Little Endian"

- Misalignment complicates memory access and causes programs to run slower (some machines do not allow misaligned memory access at all)

- Byte ordering can be a problem when exchanging data among different machines

- Byte addresses affect array index calculation, to account for word addressing and the offset within the word

Object addressed | Aligned at byte offsets | Misaligned at byte offsets
Byte | 0, 1, 2, 3, 4, 5, 6, 7 | Never
Half word | 0, 2, 4, 6 | 1, 3, 5, 7
Word | 0, 4 | 1, 2, 3, 5, 6, 7
Double word | 0 | 1, 2, 3, 4, 5, 6, 7
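Byte ordering and the alignment rule can be checked directly. The sketch below uses Python's standard `struct` module to lay out one 32-bit word in both byte orders; the helper name `is_aligned` and the sample value are assumptions chosen for illustration.

```python
import struct

word = 0x0A0B0C0D

# Big Endian: the most significant byte sits at the lowest address.
big = struct.pack(">I", word)
# Little Endian: the least significant byte sits at the lowest address.
little = struct.pack("<I", word)

print(big.hex(), little.hex())


def is_aligned(address, size_in_bytes):
    """An object is aligned when its address is a multiple of its size."""
    return address % size_in_bytes == 0


# A word (4 bytes) is aligned at byte offsets 0 and 4 only.
print([offset for offset in range(8) if is_aligned(offset, 4)])
```

Exchanging the packed bytes between machines with different byte orders without conversion would reinterpret 0x0A0B0C0D as 0x0D0C0B0A, which is the data-exchange problem noted above.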

Addressing Modes

Addressing modes refer to how to specify the location of an operand (effective address)

Addressing modes have the ability to:

- Significantly reduce instruction counts

- Increase the average CPI

- Increase the complexity of building a machine

The VAX machine is used for benchmark data since it supports a wide range of memory addressing modes

Famous addressing modes can be classified based on:

- the source of the data, into register, immediate or memory

- the address calculation, into direct and indirect

An indexed addressing mode is usually provided to allow efficient implementation of loops and array access

Example of Addressing Modes

Each entry gives: addressing mode; example; meaning; when used.

Register
    Example: Add R4, R3
    Meaning: Regs[R4] <- Regs[R4] + Regs[R3]
    When used: when a value is in a register

Immediate
    Example: Add R4, #3
    Meaning: Regs[R4] <- Regs[R4] + 3
    When used: for constants

Displacement
    Example: Add R4, 100(R1)
    Meaning: Regs[R4] <- Regs[R4] + Mem[100 + Regs[R1]]
    When used: accessing local variables

Register indirect
    Example: Add R4, (R1)
    Meaning: Regs[R4] <- Regs[R4] + Mem[Regs[R1]]
    When used: accessing using a pointer or a computed address

Indexed
    Example: Add R4, (R1 + R2)
    Meaning: Regs[R4] <- Regs[R4] + Mem[Regs[R1] + Regs[R2]]
    When used: sometimes useful in array addressing: R1 = base of the array, R2 = index amount

Direct or absolute
    Example: Add R4, (1001)
    Meaning: Regs[R4] <- Regs[R4] + Mem[1001]
    When used: sometimes useful for accessing static data; the address constant may need to be large

Memory indirect or memory deferred
    Example: Add R4, @(R3)
    Meaning: Regs[R4] <- Regs[R4] + Mem[Mem[Regs[R3]]]
    When used: if R3 is the address of the pointer p, then the mode yields *p

Autoincrement
    Example: Add R4, (R2)+
    Meaning: Regs[R4] <- Regs[R4] + Mem[Regs[R2]]; Regs[R2] <- Regs[R2] + d
    When used: useful for stepping through arrays within a loop. R2 points to the start of the array; each reference increments R2 by d

Autodecrement
    Example: Add R4, -(R2)
    Meaning: Regs[R2] <- Regs[R2] - d; Regs[R4] <- Regs[R4] + Mem[Regs[R2]]
    When used: same use as autoincrement. Autodecrement/increment can also act as push/pop to implement a stack

Scaled
    Example: Add R4, 100(R2)[R3]
    Meaning: Regs[R4] <- Regs[R4] + Mem[100 + Regs[R2] + Regs[R3] * d]
    When used: used to index arrays
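The "meaning" column of the addressing-mode table can be read as a recipe for fetching the operand. The following sketch turns several of those recipes into code; the function `fetch_operand`, its keyword names, and the register and memory contents are all hypothetical, chosen only so that each mode reads a different memory cell.

```python
# Sketch: how each addressing mode locates its source operand.
# regs maps register names to values; mem maps addresses to values.
def fetch_operand(mode, regs, mem, **f):
    if mode == "register":
        return regs[f["reg"]]
    if mode == "immediate":
        return f["value"]
    if mode == "displacement":          # Mem[disp + Regs[base]]
        return mem[f["disp"] + regs[f["base"]]]
    if mode == "register_indirect":     # Mem[Regs[base]]
        return mem[regs[f["base"]]]
    if mode == "indexed":               # Mem[Regs[base] + Regs[index]]
        return mem[regs[f["base"]] + regs[f["index"]]]
    if mode == "direct":                # Mem[address]
        return mem[f["address"]]
    if mode == "memory_indirect":       # Mem[Mem[Regs[base]]]
        return mem[mem[regs[f["base"]]]]
    if mode == "scaled":                # Mem[disp + Regs[base] + Regs[index] * d]
        return mem[f["disp"] + regs[f["base"]] + regs[f["index"]] * f["d"]]
    raise ValueError(f"unknown addressing mode: {mode}")


regs = {"R1": 200, "R2": 8, "R3": 2, "R4": 10}
mem = {300: 11, 200: 22, 208: 33, 1001: 44, 22: 55, 116: 66}

print(fetch_operand("displacement", regs, mem, disp=100, base="R1"))
print(fetch_operand("scaled", regs, mem, disp=100, base="R2", index="R3", d=4))
```

Autoincrement and autodecrement are left out because they also modify the base register, which would need the function to return updated state as well.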

Addressing Modes for Signal Processing

DSP offers special addressing modes to better serve popular algorithms

Special features require either hand coding or a compiler that uses such features (GNU would not be a good choice)

Modulo addressing

- Since DSP deals with continuous data streams, circular buffers are widely used

- Circular or modulo addressing allows automatic increment and decrement, and resets the pointer when reaching the end of the buffer

Reverse addressing

- The resulting address is the reverse order of the current address

- Reverse addressing mode expedites the access, which otherwise requires a number of logical instructions or extra memory accesses

Fast Fourier Transform (bit-reversed order)

Index (binary) | Reversed (binary)
0 (000) | 0 (000)
1 (001) | 4 (100)
2 (010) | 2 (010)
3 (011) | 6 (110)
4 (100) | 1 (001)
5 (101) | 5 (101)
6 (110) | 3 (011)
7 (111) | 7 (111)
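Both DSP addressing modes are simple to state in software, which also shows why doing them with ordinary instructions costs several extra operations per access. The sketch below is illustrative only: `bit_reverse` and `next_circular` are hypothetical helper names, and the buffer layout (a start address plus a length) is an assumption.

```python
def bit_reverse(index, bits):
    """Return index with its low `bits` bits in reverse order."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)  # move the lowest bit of index into result
        index >>= 1
    return result


def next_circular(pointer, step, start, length):
    """Advance a pointer inside a circular buffer, wrapping at either end."""
    return start + (pointer - start + step) % length


# Reproduces the 8-point FFT ordering from the table.
print([bit_reverse(i, 3) for i in range(8)])

# Stepping past the end of an 8-entry buffer at address 0 wraps to the start.
print(next_circular(7, 1, start=0, length=8))
```

The loop in `bit_reverse` is the "number of logical instructions" that a hardware reverse-addressing mode avoids.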

Operations of the Computer Hardware

"There must certainly be instructions for performing the fundamental arithmetic operations."
Burks, Goldstine and von Neumann, 1947

Assembly language is a symbolic representation of what the processor actually understands

The MIPS assembler allows only one instruction per line and ignores comments following # until the end of the line

Example:

Translation of a segment of a C program to MIPS assembly instructions

C:

    f = (g + h) - (i + j)

MIPS:

    add t0, g, h    # temp variable t0 contains "g + h"
    add t1, i, j    # temp variable t1 contains "i + j"
    sub f, t0, t1   # f = t0 - t1 = (g + h) - (i + j)

Operations in the Instruction Set

Operator type | Examples
Arithmetic and logical | Integer arithmetic and logical operations: add, and, subtract, or
Data transfer | Loads-stores (move instructions on machines with memory addressing)
Control | Branch, jump, procedure call and return, trap
System | Operating system call, virtual memory management instructions
Floating point | Floating point instructions: add, multiply
Decimal | Decimal add, decimal multiply, decimal to character conversion
String | String move, string compare, string search
Graphics | Pixel operations, compression/decompression operations

- Arithmetic, logical, data transfer and control are almost standard categories for all machines

- System instructions are required for multi-programming environments, although support for system functions varies

- Decimal and string instructions can be primitives, e.g. on the IBM 360 and the VAX

- Support for floating point, decimal, string and graphics can be optional, sometimes provided via a co-processor

- Some machines rely on the compiler to synthesize special operations, such as string handling, from simpler instructions

Operations for Media & Signal Processing

Single instruction multiple data (SIMD) and vector instructions are often supported in DSPs, which are commonly used in multimedia and signal processing applications

Partitioned add (integer)

- Performs multiple narrow additions (e.g. 16-bit) on a 64-bit ALU, since most data are narrow

- Increases ALU throughput for multimedia applications

Paired single operations (float)

- Allow the same register to act as two operands to the same operation

- Handy in dealing with vertices and coordinates

Multiply and accumulate

- Very handy for calculating dot products of vectors (signal processing) and matrix multiplication
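A partitioned add and a multiply-accumulate loop can both be modelled in a few lines. This is a sketch under stated assumptions: four 16-bit lanes packed into one 64-bit value (the lane width is a parameter, not a claim about any particular machine), wrap-around on lane overflow, and hypothetical function names.

```python
def partitioned_add(a, b, lane_bits=16, lanes=4):
    """Add two packed values lane by lane; carries never cross a lane boundary."""
    mask = (1 << lane_bits) - 1
    result = 0
    for lane in range(lanes):
        shift = lane * lane_bits
        lane_sum = ((a >> shift) & mask) + ((b >> shift) & mask)
        result |= (lane_sum & mask) << shift  # discard the carry out of the lane
    return result


def dot_product(xs, ys):
    """Dot product expressed as repeated multiply-accumulate steps."""
    acc = 0
    for x, y in zip(xs, ys):
        acc += x * y  # one multiply-accumulate
    return acc


print(hex(partitioned_add(0x0001_FFFF_0003_0004, 0x0001_0001_0001_0001)))
print(dot_product([1, 2, 3], [4, 5, 6]))
```

In the first call the second-highest lane holds 0xFFFF + 1, which wraps to 0 without disturbing its neighbour; a plain 64-bit add would instead carry into the next lane.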

Frequency of Operations Usage

Make the common case fast by focusing on these operations

- The most widely executed instructions are the simple operations of an instruction set

- The following is the average usage in SPECint92 on the Intel 80x86

Rank | 80x86 instruction | Integer average (% of total executed)
1 | Load | 22%
2 | Conditional branch | 20%
3 | Compare | 16%
4 | Store | 12%
5 | Add | 8%
6 | And | 6%
7 | Sub | 5%
8 | Move register-register | 4%
9 | Call | 1%
10 | Return | 1%
Total | | 96%

Control Flow Instructions

- Jump: for unconditional change in the control flow

- Branch: for conditional change in the control flow

- Procedure calls and returns

[Figure: data is based on SPEC on Alpha]

Destination Address Definition

- Relative addressing with respect to the program counter proved to be the best choice for forward and backward branching or jumps (load address independent)

- To allow for dynamic loading of library routines, register indirect addressing allows addresses to be loaded in special registers (e.g. virtual functions in C++ and system calls in a case statement)

[Figure: data is based on SPEC on Alpha]

Condition Evaluation

Compare and branch can be efficient if the majority of conditions are comparisons with zero

Remember to focus on the common case

[Figure: based on SPEC on MIPS]

Frequency of Types of Comparison

[Figure: frequency of types of comparison; data is based on SPEC on Alpha]

Different benchmarks and machines set new design priorities

DSPs support a repeat instruction for "for" loops (vectors) using registers

Supporting Procedures

Execution of a procedure follows these steps:

- Store parameters in a place accessible to the procedure

- Transfer control to the procedure

- Acquire the storage resources needed for the procedure

- Perform the desired task

- Store the result value in a place accessible to the calling program

- Return control to the point of origin

The hardware provides a program counter to trace instruction flow and manage transfer of control

Parameter Passing

- Registers can be used for passing a small number of parameters

- A stack is used to spill registers of the current context and make room for the called procedure to run, and to allow large parameters to be passed

- Storage of machine state can be performed by the caller or the callee

- Handling of shared variables is important to ensure correct semantics, and thus requires clear specifications in the library interface

Global variables stored in registers need careful handling

Type and Size of Operands

- The type of an operand is designated by encoding it in the instruction's operation code

- The type of an operand, e.g. single-precision float, effectively gives its size

- Common operand types include character, half-word and word-size integer, and single- and double-precision floating point

- Characters are almost always in ASCII, integers are in 2's complement, and floating point is in IEEE 754

- The 16-bit Unicode used in Java is gaining popularity due to its support for international character sets

- For business applications, some architectures support a decimal format in binary coded decimal (BCD)

- Depending on the size of the word, the complexity of handling different operand types differs

- DSP offers fixed-point data types to support high-precision floating point arithmetic and to allow sharing a single exponent among multiple numbers

- For graphics applications, vertex and pixel operands are added features
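The operand encodings named above (2's complement integers, IEEE 754 floating point, 16-bit character codes) can be inspected with Python's standard library. The helper names `twos_complement` and `float_bits` are hypothetical; the snippet only illustrates the bit patterns and is not tied to any particular instruction set.

```python
import struct


def twos_complement(value, bits):
    """Bit pattern of a signed integer in `bits`-bit 2's complement."""
    return value & ((1 << bits) - 1)


def float_bits(value):
    """Bit pattern of a single-precision IEEE 754 float."""
    return struct.unpack(">I", struct.pack(">f", value))[0]


print(hex(twos_complement(-1, 8)))    # all ones in an 8-bit field
print(hex(float_bits(1.0)))           # sign 0, biased exponent 127, fraction 0
print("A".encode("utf-16-be").hex())  # one 16-bit code unit for 'A'
```

The same 32-bit pattern means different things depending on the operand type, which is why the type has to be carried by the opcode.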

Size of Operands

- Double-word data type is used for double-precision floating point operations and for address storage in machines with a 64-bit wide address bus

- Words are used for integer operations and for 32-bit address bus machines

- Because of the mix in SPEC, word and double-word data types dominate

[Figure: frequency of reference by size, based on SPEC2000 on Alpha]

Instruction Representation

- Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary, or base 2)

- Numbers are stored in computers as a series of high and low electronic signals (binary numbers)

- Binary digits are called bits and are considered the atom of computing

- Each piece of an instruction is a number, and placing these numbers together forms the instruction

- The assembler translates the symbolic assembly instructions into machine language instructions (machine code)

Example:

Assembly: add $t0, $s1, $s2

M/C language (decimal):

    0 | 17 | 18 | 8 | 0 | 32

M/C language (binary):

    000000 | 10001 | 10010 | 01000 | 00000 | 100000
    6 bits | 5 bits | 5 bits | 5 bits | 5 bits | 6 bits

Note: the MIPS compiler by default maps $s0..$s7 to registers 16-23 and $t0..$t7 to registers 8-15
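Placing the field numbers together to form the instruction is just shifting and OR-ing. The sketch below packs the six fields of the add $t0, $s1, $s2 example into one 32-bit word using the 6/5/5/5/5/6-bit layout shown above; the function name `encode_r_type` is an assumption for illustration.

```python
def encode_r_type(op, rs, rt, rd, shamt, funct):
    """Pack the six R-format fields (6, 5, 5, 5, 5, 6 bits) into a 32-bit word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


# add $t0, $s1, $s2: $s1 = register 17, $s2 = register 18, $t0 = register 8
word = encode_r_type(op=0, rs=17, rt=18, rd=8, shamt=0, funct=32)

print(f"{word:032b}")
print(hex(word))
```

Reading the printed binary string in groups of 6, 5, 5, 5, 5 and 6 bits gives back the six decimal fields.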

Encoding an Instruction Set

- Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation

- The operation is typically specified in one field called the opcode

- The addressing mode for the operand can be encoded with the operation, or specified through a separate identifier in the case of a large number of supported modes

- The architecture must balance several competing factors:

- Desire to support as many registers and addressing modes as possible

- Effect of operand specification on the size of the instruction (program)

- Desire to simplify instruction fetching and decoding during execution

- Fixed-size instruction encoding simplifies the CPU design while limiting the addressing modes supported

- An architect caring about code size can use variable-size encoding

- A hybrid approach is to allow variability by supporting multiple-sized instructions

Encoding Examples

[Figure: Encoding Examples]

MIPS Instruction Format

Register-format instructions

    op | rs | rt | rd | shamt | funct
    6 bits | 5 bits | 5 bits | 5 bits | 5 bits | 6 bits

- op: basic operation of the instruction, traditionally called the opcode
- rs: the first register source operand
- rt: the second register source operand
- rd: the register destination operand; it gets the result of the operation
- shamt: shift amount
- funct: this field selects the specific variant of the operation in the op field

Immediate-type instructions

    op | rs | rt | address
    6 bits | 5 bits | 5 bits | 16 bits

- Some instructions need longer fields than provided, for large-value constants

- The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register

Example: lw $t0, 32($s3)    # Temporary register $t0 gets A[8]

Instruction | Format | op | rs | rt | rd | shamt | funct | address
add | R | 0 | reg | reg | reg | 0 | 32 | N/A
sub | R | 0 | reg | reg | reg | 0 | 34 | N/A
lw | I | 35 | reg | reg | N/A | N/A | N/A | address
sw | I | 43 | reg | reg | N/A | N/A | N/A | address

The Stored Program Concept

Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept

Today's computers are built on two key principles:

- Instructions are represented as numbers

- Programs can be stored in memory to be read or written just like numbers

The power of the concept: memory can contain

- the source code for an editor

- the compiled machine code for the editor

- the text that the compiled program is using

- the compiler that generated the code

[Figure: Processor connected to Memory holding: Accounting program (machine code), Editor program (machine code), C compiler (machine code), Payroll data, Book text, Source code in C for editor program]

Compiling if-then-else in MIPS

Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiled MIPS code for the following C if statement?

    if (i == j) f = g + h; else f = g - h;

[Figure: flowchart. Test i == j; if i = j then f = g + h; if i ≠ j then Else: f = g - h; both paths join at Exit:]

MIPS:

          bne $s3, $s4, Else    # go to Else if i ≠ j
          add $s0, $s1, $s2     # f = g + h (skipped if i ≠ j)
          j Exit
    Else: sub $s0, $s1, $s2     # f = g - h (skipped if i = j)
    Exit:

Typical Compilation

[Figure: Typical Compilation]

Major Types of Optimization

Each entry gives: optimization name; explanation; frequency where given (N.M. = not measured).

High-level (at or near source level; machine independent)

- Procedure integration: replace procedure call by procedure body (N.M.)

Local (within straight-line code)

- Common subexpression elimination: replace two instances of the same computation by a single copy (18%)

- Constant propagation: replace all instances of a variable that is assigned a constant with the constant

- Stack height reduction: rearrange the expression tree to minimize resources needed for expression evaluation (N.M.)

Global (across a branch)

- Global common subexpression elimination: same as local, but this version crosses branches

- Copy propagation: replace all instances of a variable A that has been assigned X (i.e. A = X) with X

- Code motion: remove code from a loop that computes the same value each iteration of the loop

- Induction variable elimination: simplify/eliminate array addressing calculations within loops

Machine-dependent (depends on machine knowledge)

- Strength reduction: many examples, such as replacing a multiply by a constant with adds and shifts (N.M.)

- Pipeline scheduling: reorder instructions to improve pipeline performance (N.M.)
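Two of the optimizations listed above are easy to show as before/after pairs. The sketch below is hand-written for illustration (a compiler would apply these rewrites to its intermediate code, not to Python source), and all function names are hypothetical.

```python
# Strength reduction: replace a multiply by a constant with shifts and adds.
def times_ten_multiply(x):
    return x * 10


def times_ten_shifts(x):
    return (x << 3) + (x << 1)  # 8x + 2x


# Common subexpression elimination: compute (a + b) once instead of twice.
def before_cse(a, b, c):
    return (a + b) * c + (a + b)


def after_cse(a, b, c):
    t = a + b  # single copy of the shared computation
    return t * c + t


print(times_ten_shifts(7), after_cse(1, 2, 3))
```

Both rewrites preserve the result for every input; what changes is the number and cost of the operations performed.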

Effect of Compiler Optimization

[Figure: effect of compiler optimization, by program and compiler optimization level; measurements taken on MIPS]

Level 0: non-optimized code

Level 1: local optimization

Level 2: global optimization, s/w pipelining

Level 3: adds procedure integration

Compiler Support for Multimedia Instructions

- Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics)

- Intel added a new set of instructions called the Streaming SIMD Extension

- A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer

- Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations

- Strided addressing allows memory access in increments larger than one

- Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data

- Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup

- Such limited support for vector processing makes the use of vectorizing compiler optimization unpopular and restricts its scope to hand-coded routines

SIMD instructions on MMX and AltiVec tend to be solutions, not primitives
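The two vector addressing styles differ only in where the element addresses come from. A minimal sketch, assuming memory is a Python list indexed by element and using hypothetical function names:

```python
def strided_load(memory, base, stride, count):
    """Load `count` elements starting at `base`, `stride` elements apart."""
    return [memory[base + i * stride] for i in range(count)]


def gather(memory, addresses):
    """Load the elements found at an explicit list of addresses."""
    return [memory[address] for address in addresses]


def scatter(memory, addresses, values):
    """Store each value at its paired address."""
    for address, value in zip(addresses, values):
        memory[address] = value


memory = list(range(100, 116))  # 16 elements holding 100..115

print(strided_load(memory, base=1, stride=4, count=3))  # e.g. one column of a 4-wide matrix
print(gather(memory, [0, 7, 2]))
```

With strided access the addresses are computed from a base and an increment; with gather/scatter they are themselves data, which is the similarity to register indirect mode noted above.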

Starting a Program

[Figure: C program -> Compiler -> Assembly language program -> Assembler -> Object: machine language module (together with Object: library routine (machine language)) -> Linker -> Executable: machine language program -> Loader -> Memory]

Linker

- Place code and data modules symbolically in memory

- Determine the address of data and instruction labels

- Patch both internal and external references

Object files for Unix typically contain:

- Header: size and position of components

- Text segment: machine code

- Data segment: static and dynamic variables

- Relocation info: identifies absolute memory references

- Symbol table: name and location of labels, procedures and variables

- Debugging info: mapping of source to object code, break points, etc.

Loading Executable Program

[Figure: memory layout, from high to low addresses: Stack ($sp at 7fff ffff hex), Dynamic data, Static data ($gp at 1000 8000 hex; segment starts at 1000 0000 hex), Text (pc at 0040 0000 hex), Reserved (from 0)]

To load an executable, the operating system follows these steps:

- Reads the executable file header to determine the size of the text and data segments

- Creates an address space large enough for the text and data

- Copies the instructions and data from the executable file into memory

- Copies the parameters (if any) to the main program onto the stack

- Initializes the machine registers and sets the stack pointer to the first free location

- Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program

Instruction Set Design Issues

Instruction set design issues:

- Number of addresses

- Flow of control

- Operand types

- Addressing modes

- Instruction types

- Instruction formats

Number of Addresses

Four categories

3-address machines

- 2 for the source operands and one for the result

2-address machines

- One address doubles as source and result

1-address machines

- Accumulator machines

- Accumulator is used for one source and the result

0-address machines

- Stack machines

- Operands are taken from the stack

- Result goes onto the stack

Number of Addresses (cont.)

Three-address machines

- Two for the source operands, one for the result

- RISC processors use three addresses

Sample instructions

    add dest,src1,src2     ; M(dest)=[src1]+[src2]
    sub dest,src1,src2     ; M(dest)=[src1]-[src2]
    mult dest,src1,src2    ; M(dest)=[src1]*[src2]

Three addresses: Operand 1, Operand 2, Result

Example: a = b + c

Three-address instruction formats that hold full memory addresses are not common, because they require a relatively long instruction format to hold the three address references. RISC instructions stay short because their three addresses are register numbers.

Number of Addresses (cont.)

Example

C statement

    A = B + C * D - E + F + A

Equivalent code:

    mult T,C,D    ; T = C*D
    add T,T,B     ; T = B+C*D
    sub T,T,E     ; T = B+C*D-E
    add T,T,F     ; T = B+C*D-E+F
    add A,T,A     ; A = B+C*D-E+F+A
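The three-address sequence for A = B + C * D - E + F + A can be checked with a tiny interpreter. This is a sketch only: memory is modelled as a dict of named locations, and `run_three_address`, the tuple program format and the initial values are assumptions made for illustration.

```python
import operator

OPS = {"add": operator.add, "sub": operator.sub, "mult": operator.mul}


def run_three_address(program, memory):
    """Each instruction is (op, dest, src1, src2): M[dest] = M[src1] op M[src2]."""
    for op, dest, src1, src2 in program:
        memory[dest] = OPS[op](memory[src1], memory[src2])
    return memory


memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6, "T": 0}
program = [
    ("mult", "T", "C", "D"),  # T = C*D
    ("add", "T", "T", "B"),   # T = B+C*D
    ("sub", "T", "T", "E"),   # T = B+C*D-E
    ("add", "T", "T", "F"),   # T = B+C*D-E+F
    ("add", "A", "T", "A"),   # A = B+C*D-E+F+A
]
run_three_address(program, memory)
print(memory["A"])
```

Five instructions suffice, but each one carries three address fields.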

Number of Addresses (cont.)

Two-address machines

- One address doubles (for source operand and result)

- The last example makes a case for it: address T is used twice

Sample instructions

    load dest,src    ; M(dest)=[src]
    add dest,src     ; M(dest)=[dest]+[src]
    sub dest,src     ; M(dest)=[dest]-[src]
    mult dest,src    ; M(dest)=[dest]*[src]

Two addresses: one address doubles as operand and result

Example: a = a + b

The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation.

Number of Addresses (cont.)

Example

C statement

    A = B + C * D - E + F + A

Equivalent code:

    load T,C        T = C
    mult T,D        T = C*D
    add T,B         T = B + C*D
    sub T,E         T = B + C*D - E
    add T,F         T = B + C*D - E + F
    add A,T         A = B + C*D - E + F + A

Number of Addresses (cont.)

One-address machines

Use a special set of registers called accumulators

- Specify one source operand and receive the result

Called accumulator machines

Sample instructions

    load addr       accum = [addr]
    store addr      M[addr] = accum
    add addr        accum = accum + [addr]
    sub addr        accum = accum - [addr]
    mult addr       accum = accum * [addr]

Number of Addresses (cont.)

Example

C statement

    A = B + C * D - E + F + A

Equivalent code:

    load C          load C into accum
    mult D          accum = C*D
    add B           accum = C*D + B
    sub E           accum = B + C*D - E
    add F           accum = B + C*D - E + F
    add A           accum = B + C*D - E + F + A
    store A         store accum contents in A
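The one-address version can be sketched the same way, with a single implicit accumulator. The function name `run_accumulator`, the memory dictionary, and the initial values are assumptions made for illustration.

```python
# Hypothetical accumulator machine: each instruction names one memory address.
def run_accumulator(program, memory):
    """Execute (op, addr) pairs against a single implicit accumulator."""
    accum = 0
    for op, addr in program:
        if op == "load":
            accum = memory[addr]
        elif op == "store":
            memory[addr] = accum
        elif op == "add":
            accum += memory[addr]
        elif op == "sub":
            accum -= memory[addr]
        elif op == "mult":
            accum *= memory[addr]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory


# A = B + C * D - E + F + A
program = [
    ("load", "C"),
    ("mult", "D"),
    ("add", "B"),
    ("sub", "E"),
    ("add", "F"),
    ("add", "A"),
    ("store", "A"),
]

# Initial values are assumptions chosen only for illustration.
final_memory = run_accumulator(
    program, {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
)
print(final_memory["A"])
```

The instructions are shorter than in the three-address form, but seven of them are needed, and every one touches memory.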

Number of Addresses (cont.)

Zero-address machines

Stack supplies operands and receives the result

- Special instructions to load and store use an address

Called stack machines (Ex: HP3000, Burroughs B5500)

Sample instructions

    push addr       push([addr])
    pop addr        pop([addr])
    add             push(pop + pop)
    sub             push(pop - pop)
    mult            push(pop * pop)

Number of Addresses (cont.)

Example

C statement

    A = B + C * D - E + F + A

Equivalent code:

    push E
    push C
    push D
    mult
    push B
    add
    sub
    push F
    add
    push A
    add
    pop A
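A minimal sketch of a stack machine that runs the zero-address sequence above. It assumes, following the sample instruction `sub = push(pop - pop)`, that the first value popped is the left operand. The name `run_stack` and the initial memory values are hypothetical.

```python
# Hypothetical zero-address (stack) machine.
def run_stack(program, memory):
    """Execute push/pop/add/sub/mult; only push and pop carry an address."""
    stack = []
    for instruction in program:
        op = instruction[0]
        if op == "push":
            stack.append(memory[instruction[1]])
        elif op == "pop":
            memory[instruction[1]] = stack.pop()
        elif op in ("add", "sub", "mult"):
            first = stack.pop()    # top of stack, used as the left operand
            second = stack.pop()   # next entry, used as the right operand
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)
            else:
                stack.append(first * second)
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory


# A = B + C * D - E + F + A
program = [
    ("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
    ("push", "B"), ("add",), ("sub",),
    ("push", "F"), ("add",),
    ("push", "A"), ("add",),
    ("pop", "A"),
]

# Initial values are assumptions chosen only for illustration.
final_memory = run_stack(
    program, {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
)
print(final_memory["A"])
```

E is pushed first so that it sits below B + C*D when `sub` runs, which is why the operand order of the source expression is not the push order.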

Load/Store Architecture

Instructions expect operands in internal processor registers

Special LOAD and STORE instructions move data between registers and memory

RISC uses this architecture

Reduces instruction length

Load/Store Architecture (cont.)

Sample instructions

    load Rd,addr        Rd = [addr]
    store addr,Rs       (addr) = Rs
    add Rd,Rs1,Rs2      Rd = Rs1 + Rs2
    sub Rd,Rs1,Rs2      Rd = Rs1 - Rs2
    mult Rd,Rs1,Rs2     Rd = Rs1 * Rs2

Number of Addresses (cont.)

Example

C statement

    A = B + C * D - E + F + A

Equivalent code:

    load R1,B
    load R2,C
    load R3,D
    load R4,E
    load R5,F
    load R6,A
    mult R2,R2,R3
    add R2,R2,R1
    sub R2,R2,R4
    add R2,R2,R5
    add R2,R2,R6
    store A,R2

Flow of Control

Default is sequential flow

Several instructions alter this default execution

Branches

- Unconditional

- Conditional

- Delayed branches

Procedure calls

- Delayed procedure calls

Flow of Control (cont.)

Branches

Unconditional

- Absolute address

- PC-relative

  • Target address is specified relative to PC contents

  • Relocatable code

Example: MIPS

- Absolute address

    j target

- PC-relative

    b target
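How the two kinds of target are formed can be sketched numerically. The sketch below assumes the classic 32-bit MIPS encodings, which the slide does not spell out: a branch carries a signed 16-bit word offset relative to the instruction after the branch, and `j` carries a 26-bit word index combined with the upper four bits of PC + 4. The function names are hypothetical.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign_bit = 1 << (bits - 1)
    return (value & (sign_bit - 1)) - (value & sign_bit)


def pc_relative_target(pc, offset16):
    """Branch target: PC + 4 plus the sign-extended word offset times 4."""
    return (pc + 4 + (sign_extend(offset16, 16) << 2)) & 0xFFFFFFFF


def absolute_target(pc, index26):
    """Jump target: upper 4 bits of PC + 4, then the 26-bit index times 4."""
    return ((pc + 4) & 0xF0000000) | ((index26 & 0x03FFFFFF) << 2)


pc = 0x00400010
print(hex(pc_relative_target(pc, 3)))        # forward branch
print(hex(pc_relative_target(pc, 0xFFFF)))   # offset -1: branch to itself
print(hex(absolute_target(pc, 0x0100008)))   # jump within the current region
```

Because the branch target depends only on the distance from the PC, the same encoded instruction works wherever the code is loaded, which is the relocatability point made above.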

Flow of Control (cont.)

Branches

Conditional

- Jump is taken only if the condition is met

Two types

- Set-Then-Jump

  • Condition testing is separated from branching

  • Condition code registers are used to convey the condition test result

  • Condition code registers keep a record of the status of the last ALU operation, such as the overflow condition

- Example: Pentium code

    cmp AX,BX       compare AX and BX
    je target       jump if equal

Flow of Control (cont.)

- Test-and-Jump

  • Single instruction performs condition testing and branching

- Example: MIPS instruction

    beq Rsrc1,Rsrc2,target

  Jumps to target if Rsrc1 = Rsrc2

Delayed branching

Control is transferred after executing the instruction that follows the branch instruction

- This instruction slot is called the delay slot

Improves efficiency

Highly pipelined RISC processors support it

Flow of Control (cont.)

Procedure calls

Facilitate modular programming

Require two pieces of information to return

- End of procedure

  • Pentium uses the ret instruction

  • MIPS uses the jr instruction

- Return address

  • In a (special) register: MIPS allows any general-purpose register

  • On the stack: Pentium

Flow of Control (cont.)

[Figure: delay slot]

Parameter Passing

Two basic techniques

Register-based (e.g., PowerPC, MIPS)

- Internal registers are used

  • Faster

  • Limit the number of parameters

  • Recursive procedure

Stack-based (e.g., Pentium)

- Stack is used

  • More general

Operand Types

Instructions support basic data types

- Characters

- Integers

- Floating-point

Instruction overload

- Same instruction for different data types

- Example: Pentium

    mov AL,address      loads an 8-bit value
    mov AX,address      loads a 16-bit value
    mov EAX,address     loads a 32-bit value

Operand Types (cont.)

Separate instructions

- Instructions specify the operand size

- Example: MIPS

    lb Rdest,address    loads a byte
    lh Rdest,address    loads a halfword (16 bits)
    lw Rdest,address    loads a word (32 bits)
    ld Rdest,address    loads a doubleword (64 bits)

Similar instructions exist for store

Addressing Modes

How the operands are specified

Operands can be in three places

- Registers

  • Register addressing mode

- Part of instruction

  • Constant

  • Immediate addressing mode

  • All processors support these two addressing modes

- Memory

  • Difference between RISC and CISC

  • CISC supports a large variety of addressing modes

  • RISC follows load/store architecture

Instruction Types

Several types of instructions

Data movement

- Pentium: mov dest,src

- Some do not provide direct data movement instructions

- Indirect data movement

    add Rdest,Rsrc,0    Rdest = Rsrc + 0

Arithmetic and Logical

- Arithmetic

  • Integer and floating-point, signed and unsigned

  • add, subtract, multiply, divide

- Logical

  • and, or, not, xor

Instruction Types (cont.)

Condition code bits

    S: Sign bit (0 = +, 1 = -)
    Z: Zero bit (0 = nonzero, 1 = zero)
    O: Overflow bit (0 = no overflow, 1 = overflow)
    C: Carry bit (0 = no carry, 1 = carry)

Example: Pentium

    cmp count,25    compare count to 25 (subtract 25 from count)
    je target       jump if equal

Instruction Types (cont.)

Flow control and I/O instructions

- Branch

- Procedure call

- Interrupts

I/O instructions

- Memory-mapped I/O

  • Most processors support memory-mapped I/O

  • No separate instructions for I/O

- Isolated I/O

  • Pentium supports isolated I/O

  • Separate I/O instructions

    in AX,io_port       read from an I/O port
    out io_port,AX      write to an I/O port

Instruction Formats

Two types

Fixed-length

- Used by RISC processors

- 32-bit RISC processors use 32-bit-wide instructions

  • Examples: SPARC, MIPS, PowerPC

Variable-length

- Used by CISC processors

- Memory operands need more bits to specify

Opcode

- Major and exact operation

Examples of Instruction Formats

[Figure: examples of instruction formats]

RISC (Reduced Instruction Set Computer) versus CISC (Complex Instruction Set Computer)

RISC vs. CISC

The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.

RISC systems access memory only with explicit load and store instructions.

In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.

RISC vs. CISC (cont.)

The difference between CISC and RISC becomes evident through the basic computer performance equation:

    time/program = (time/cycle) x (cycles/instruction) x (instructions/program)

RISC systems shorten execution time by reducing the clock cycles per instruction.

CISC systems improve performance by reducing the number of instructions per program.

RISC vs. CISC (cont.)

The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.

The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.

With fixed-length instructions, RISC lends itself to pipelining and speculative execution.

RISC vs. CISC (cont.)

Consider these program fragments:

    CISC                RISC

    mov ax, 10          mov ax, 0
    mov bx, 5           mov bx, 10
    mul bx, ax          mov cx, 5
                        Begin: add ax, bx
                        loop Begin

The total clock cycles for the CISC version might be:

    (2 movs x 1 cycle) + (1 mul x 30 cycles) = 32 cycles

While the clock cycles for the RISC version are:

    (3 movs x 1 cycle) + (5 adds x 1 cycle) + (5 loops x 1 cycle) = 13 cycles

With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds.
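The cycle arithmetic for the two fragments can be checked with a few lines of code. The instruction counts and per-instruction cycle costs below are assumptions made for illustration (a 30-cycle multiply, single-cycle moves, adds and loops, and a loop body executed five times); real processors differ.

```python
def total_cycles(instruction_mix):
    """Sum count * cycles over a {name: (count, cycles_each)} mapping."""
    return sum(count * cycles for count, cycles in instruction_mix.values())


# Assumed costs: the multiply is a single slow instruction on the CISC side,
# and is replaced by repeated single-cycle adds on the RISC side.
cisc_mix = {"mov": (2, 1), "mul": (1, 30)}
risc_mix = {"mov": (3, 1), "add": (5, 1), "loop": (5, 1)}

cisc_cycles = total_cycles(cisc_mix)
risc_cycles = total_cycles(risc_mix)
print(cisc_cycles, risc_cycles)
```

The RISC fragment executes more instructions but fewer cycles, which is the trade-off expressed by the performance equation: more instructions per program, fewer cycles per instruction.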

RISC vs. CISC (cont.)

Because of their load-store ISAs, RISC architectures require a large number of CPU registers.

These registers provide fast access to data during sequential program execution.

They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.

Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.

RISC vs. CISC (cont.)

This is how registers can be overlapped in a RISC system.

The current window pointer (CWP) points to the active register window.

[Figure: overlapping register windows]

RISC vs. CISC (cont.)

It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.

Some RISC systems provide more extravagant instruction sets than some CISC systems.

Some systems combine both approaches.

The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.

RISC vs. CISC (cont.)

RISC

- Multiple register sets.

- Three operands per instruction.

- Parameter passing through register windows.

- Single-cycle instructions.

- Hardwired control.

- Highly pipelined.

CISC

- Single register set.

- One or two register operands per instruction.

- Parameter passing through memory.

- Multiple-cycle instructions.

- Microprogrammed control.

- Less pipelined.

Continued...

RISC vs. CISC (cont.)

RISC

- Simple instructions, few in number.

- Fixed-length instructions.

- Complexity in compiler.

- Only LOAD/STORE instructions access memory.

- Few addressing modes.

CISC

- Many complex instructions.

- Variable-length instructions.

- Complexity in microcode.

- Many instructions can access memory.

- Many addressing modes.

Summary

Instruction Set Design Issues

Instruction set design issues include:

Where are operands stored?

- registers, memory, stack, accumulator

How many explicit operands are there?

- 0, 1, 2, or 3

How is the operand location specified?

- register, immediate, indirect, ...

What type and size of operands are supported?

- byte, int, float, double, string, vector, ...

What operations are supported?

- add, sub, mul, move, compare, ...

More About General Purpose Registers

Why do almost all new architectures use GPRs?

Registers are much faster than memory (even cache)

- Register values are available immediately

- When memory isn't ready, the processor must wait ("stall")

Registers are convenient for variable storage

- Compiler assigns some variables just to registers

- More compact code, since small fields specify registers (compared to memory addresses)

[Figure: memory hierarchy: Processor (Registers), Cache, Memory, Disk]

What Operations are Needed

Arithmetic and Logical

- Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT

- Logical operations: AND, OR, XOR, NOT

Data Transfer: copy, load, store

Control: branch, jump, call, return

Floating Point: ADDF, MULF, DIVF

- Same as arithmetic, but usually take bigger operands

Decimal: ADDD, CONVERT

String: move, compare, search

Graphics: pixel and vertex, compression/decompression operations

Stack Architecture: Pros and Cons

Pros

- Good code density (implicit top of stack)

- Low hardware requirements

- Easy to write a simpler compiler for stack architectures

Cons

- Stack becomes the bottleneck

- Little ability for parallelism or pipelining

- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed

- Difficult to write an optimizing compiler for stack architectures

Accumulator Architecture: Pros and Cons

Pros

- Very low hardware requirements

- Easy to design and understand

Cons

- Accumulator becomes the bottleneck

- Little ability for parallelism or pipelining

- High memory traffic

Memory-Memory Architecture: Pros and Cons

Pros

- Requires fewer instructions (especially if 3 operands)

- Easy to write compilers for (especially if 3 operands)

Cons

- Very high memory traffic (especially if 3 operands)

- Variable number of clocks per instruction

- With two operands, more data movements are required

Memory-Register Architecture: Pros and Cons

Pros

- Some data can be accessed without loading first

- Instruction format easy to encode

- Good code density

Cons

- Operands are not equivalent (poor orthogonality)

- Variable number of clocks per instruction

- May limit number of registers

Interface Design

A good interface

- Lasts through many implementations (portability, compatibility)

- Is used in many different ways (generality)

- Provides convenient functionality to higher levels

- Permits an efficient implementation at lower levels

Design decisions must take into account

- Technology

- Machine organization

- Programming languages

- Compiler technology

- Operating systems

[Figure: one interface with several uses above it and several implementations (imp 0, imp 1, ...) below it, changing over time]

Classifying Instruction Set Architectures

Accumulator Architecture

• Common in early stored-program computers, when hardware was so expensive

• Machine has only one register (the accumulator) involved in all math and logical operations

• All operations assume the accumulator as a source operand and a destination for the operation, with the other operand stored in memory

Extended Accumulator Architecture

• Dedicated registers for specific operations, e.g., stack and array index registers, added

• The 8086 microprocessor is an example of such a special-purpose register architecture

General-Purpose Register Architecture

• MIPS is an example of such an architecture, where registers are not tied to a single role

• This type of instruction set can be further divided into:

  • Register-memory: allows one operand to be in memory

  • Register-register (load-store): demands all operands to be in registers

    Machine           # general-purpose registers   Architecture style               Year
    Motorola 6800     2                             Accumulator                      1974
    DEC VAX           16                            Register-memory, memory-memory   1977
    Intel 8086        1                             Extended accumulator             1978
    Motorola 68000    16                            Register-memory                  1980
    Intel 80386       8                             Register-memory                  1985
    PowerPC           32                            Load-store                       1992
    DEC Alpha         32                            Load-store                       1992

Compact Code and Stack Architectures

When memory is scarce, machines like the Intel 80x86 had variable-length instructions to match varying operand specifications and minimize code size

Stack machines abandoned registers altogether, arguing that it is hard for compilers to use them efficiently

Operands are pushed onto a stack from memory, and the results have to be popped from the stack to memory

Operations take their operands by default from the top of the stack and insert the results back onto the stack

Stack machines simplify compilers and lend themselves to a compact instruction encoding, but limit compiler optimization (e.g., in math expressions)

Example: A = B + C

    Push AddressC    # Top = Top + 4; Stack[Top] = Memory[AddressC]
    Push AddressB    # Top = Top + 4; Stack[Top] = Memory[AddressB]
    add              # Stack[Top-4] = Stack[Top] + Stack[Top-4]; Top = Top - 4
    Pop AddressA     # Memory[AddressA] = Stack[Top]; Top = Top - 4

Compact code is important for heralded network computers, where programs must be downloaded over the Internet (e.g., Java-based applications)

Other Types of Architecture

High-Level-Language Architecture

• In the 1960s, systems software was rarely written in high-level languages, and virtually every commercial operating system before Unix was written in assembly

• Some people blamed the code density on the instruction set rather than the programming language

• A machine design philosophy was advocated with the goal of making the hardware more like high-level languages

• The effectiveness of high-level languages, memory size limitations, and the lack of efficient compilers doomed this philosophy to a historical footnote

Reduced Instruction Set Architecture

• With the recent development in compiler technology and expanded memory sizes, fewer programmers are using assembly-level coding

• Instruction set architectures came to be measured by how well compilers, rather than programmers, use them

• RISC architecture favors simplifying hardware design over enriching the offered set of instructions, relying on compilers to effectively use them to perform complex operations

• Virtually all new architectures since the 1980s follow the RISC philosophy of fixed instruction lengths, load-store operations, and limited addressing modes

Evolution of Instruction Sets

[Figure: evolution of instruction sets]

- Single Accumulator (EDSAC, 1950)

- Accumulator + Index Registers (Manchester Mark I, IBM 700 series, 1953)

- Separation of Programming Model from Implementation

  • High-level Language Based (B5000, 1963)

  • Concept of a Family (IBM 360, 1964)

- General Purpose Register Machines

  • Complex Instruction Sets (VAX, Intel 432, 1977-80)

  • Load/Store Architecture (CDC 6600, Cray 1, 1963-76)

- RISC (MIPS, SPARC, HP-PA, IBM RS6000, ..., 1987)

Register-Memory Architectures

Effect of the number of memory operands

    # memory addresses   Max. number of operands   Examples
    0                    3                         SPARC, MIPS, PowerPC, ALPHA
    1                    2                         Intel 80x86, Motorola 68000
    2                    2                         VAX (also has 3-operand format)
    3                    3                         VAX (also has 2-operand format)

Memory Address

Interpreting Memory Addressing

The address of a word matches the byte address of one of its 4 bytes

The addresses of sequential words differ by 4 (word size in bytes)

Words' addresses are multiples of 4 (alignment restriction)

Machines that use the address of the leftmost byte as the word address are called "Big Endian", and those that use the rightmost byte are called "Little Endian"

Misalignment complicates memory access and causes programs to run slower (some machines do not allow misaligned memory access at all)

Byte ordering can be a problem when exchanging data among different machines

Byte addresses affect array index calculation, to account for word addressing and the offset within the word

    Object addressed   Aligned at byte offsets   Misaligned at byte offsets
    Byte               0,1,2,3,4,5,6,7           Never
    Half word          0,2,4,6                   1,3,5,7
    Word               0,4                       1,2,3,5,6,7
    Double word        0                         1,2,3,4,5,6,7
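Byte ordering and the alignment rule can both be demonstrated with Python's standard `struct` module. This is a sketch for illustration: the value `0x0A0B0C0D` and the helper `is_aligned` are arbitrary choices, not something from the slides.

```python
import struct

value = 0x0A0B0C0D

# ">I" packs a 32-bit unsigned integer most significant byte first (big endian);
# "<I" packs it least significant byte first (little endian).
big_endian_bytes = struct.pack(">I", value)
little_endian_bytes = struct.pack("<I", value)

# Reading big-endian bytes on a little-endian machine scrambles the value.
misread = struct.unpack("<I", big_endian_bytes)[0]


def is_aligned(address, size):
    """An object of `size` bytes is aligned when its address is a multiple of size."""
    return address % size == 0


# Offsets within an 8-byte window at which each object size is aligned.
aligned_offsets = {
    size: [offset for offset in range(8) if is_aligned(offset, size)]
    for size in (1, 2, 4, 8)
}

print(big_endian_bytes.hex(), little_endian_bytes.hex(), hex(misread))
print(aligned_offsets)
```

The `aligned_offsets` dictionary reproduces the "aligned at byte offsets" column of the table above for byte, half word, word, and double word.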

Addressing Modes

Addressing modes refer to how to specify the location of an operand (effective address)

Addressing modes have the ability to:

- Significantly reduce instruction counts

- Increase the average CPI

- Increase the complexity of building a machine

The VAX machine is used for benchmark data since it supports a wide range of memory addressing modes

Common addressing modes can be classified based on:

- the source of the data, into register, immediate, or memory

- the address calculation, into direct and indirect

An indexed addressing mode is usually provided to allow efficient implementation of loops and array access

Example of Addressing Modes

Register
    Example:   Add R4, R3
    Meaning:   Regs[R4] <- Regs[R4] + Regs[R3]
    When used: When a value is in a register

Immediate
    Example:   Add R4, #3
    Meaning:   Regs[R4] <- Regs[R4] + 3
    When used: For constants

Displacement
    Example:   Add R4, 100(R1)
    Meaning:   Regs[R4] <- Regs[R4] + Mem[100 + Regs[R1]]
    When used: Accessing local variables

Register indirect
    Example:   Add R4, (R1)
    Meaning:   Regs[R4] <- Regs[R4] + Mem[Regs[R1]]
    When used: Accessing using a pointer or a computed address

Indexed
    Example:   Add R4, (R1 + R2)
    Meaning:   Regs[R4] <- Regs[R4] + Mem[Regs[R1] + Regs[R2]]
    When used: Sometimes useful in array addressing: R1 = base of the array, R2 = index amount

Direct or absolute
    Example:   Add R4, (1001)
    Meaning:   Regs[R4] <- Regs[R4] + Mem[1001]
    When used: Sometimes useful for accessing static data; address constant may need to be large

Memory indirect or memory deferred
    Example:   Add R4, @(R3)
    Meaning:   Regs[R4] <- Regs[R4] + Mem[Mem[Regs[R3]]]
    When used: If R3 is the address of the pointer p, then the mode yields *p

Autoincrement
    Example:   Add R4, (R2)+
    Meaning:   Regs[R4] <- Regs[R4] + Mem[Regs[R2]]; Regs[R2] <- Regs[R2] + d
    When used: Useful for stepping through arrays within a loop. R2 points to the start of the array; each reference increments R2 by d

Autodecrement
    Example:   Add R4, -(R2)
    Meaning:   Regs[R2] <- Regs[R2] - d; Regs[R4] <- Regs[R4] + Mem[Regs[R2]]
    When used: Same use as autoincrement. Autodecrement/increment can also act as push/pop to implement a stack

Scaled
    Example:   Add R4, 100(R2)[R3]
    Meaning:   Regs[R4] <- Regs[R4] + Mem[100 + Regs[R2] + Regs[R3] * d]
    When used: Used to index arrays
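The effective-address calculations in the table can be written out as a small function. This is a sketch, not a model of any real decoder: the name `effective_address`, its parameters, and the register and memory contents are hypothetical, and only the modes that actually form a memory address are covered.

```python
def effective_address(mode, regs, mem, r1=None, r2=None, const=0, d=4):
    """Return the memory address an operand refers to under a given mode.

    regs maps register names to values, mem maps addresses to values,
    const is the displacement or absolute address, d is the element size.
    """
    if mode == "displacement":
        return const + regs[r1]
    if mode == "register_indirect":
        return regs[r1]
    if mode == "indexed":
        return regs[r1] + regs[r2]
    if mode == "direct":
        return const
    if mode == "memory_indirect":
        return mem[regs[r1]]
    if mode == "scaled":
        return const + regs[r1] + regs[r2] * d
    raise ValueError(f"unsupported mode: {mode}")


# Register and memory contents are assumptions chosen only for illustration.
regs = {"R1": 200, "R2": 8, "R3": 3}
mem = {200: 1000}

print(effective_address("displacement", regs, mem, r1="R1", const=100))
print(effective_address("indexed", regs, mem, r1="R1", r2="R2"))
print(effective_address("memory_indirect", regs, mem, r1="R1"))
print(effective_address("scaled", regs, mem, r1="R2", r2="R3", const=100))
```

Register and immediate modes are left out because they do not produce a memory address; autoincrement and autodecrement use the register-indirect address and additionally update the register by d.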

Addressing Mode for Signal Processing

Fast Fourier Transform (bit-reversed index order)

    0 (000) -> 0 (000)
    1 (001) -> 4 (100)
    2 (010) -> 2 (010)
    3 (011) -> 6 (110)
    4 (100) -> 1 (001)
    5 (101) -> 5 (101)
    6 (110) -> 3 (011)
    7 (111) -> 7 (111)

Modulo addressing

- Since DSP deals with continuous data streams, circular buffers are widely used

- Circular or modulo addressing allows automatic increment and decrement, and resets the pointer when reaching the end of the buffer

Reverse addressing

- The resulting address is the reverse order of the current address

- Reverse addressing mode expedites the access, which otherwise requires a number of logical instructions or extra memory accesses

DSP offers special addressing modes to better serve popular algorithms

Special features require either hand coding or a compiler that uses such features (GNU would not be a good choice)
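Both DSP addressing modes are easy to express in software, which also shows the work the hardware saves. The helper names `bit_reverse` and `modulo_next`, and the buffer base and length, are assumptions made for this sketch.

```python
def bit_reverse(index, bits):
    """Reverse the low `bits` bits of index, as used to reorder FFT data."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


def modulo_next(pointer, step, base, length):
    """Advance a pointer inside a circular buffer [base, base + length)."""
    return base + (pointer - base + step) % length


# Bit-reversed order for an 8-point FFT (3 index bits).
fft_order = [bit_reverse(i, 3) for i in range(8)]
print(fft_order)

# Circular buffer of 8 entries starting at address 100 (values are arbitrary).
print(modulo_next(107, 1, 100, 8))    # wraps from the last entry to the first
print(modulo_next(100, -1, 100, 8))   # wraps from the first entry to the last
```

Without hardware support, each bit-reversed access costs a loop like the one in `bit_reverse`, and each circular-buffer step costs a compare or a modulo operation.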

Operations of the Computer Hardware

"There must certainly be instructions for performing the fundamental arithmetic operations."
Burks, Goldstine, and von Neumann, 1947

Assembly language is a symbolic representation of what the processor actually understands

The MIPS assembler allows only one instruction per line and ignores comments following # until the end of the line

Example:

Translation of a segment of a C program to MIPS assembly instructions

C: f = (g + h) - (i + j)

MIPS:

    add t0, g, h    # temp variable t0 contains "g + h"
    add t1, i, j    # temp variable t1 contains "i + j"
    sub f, t0, t1   # f = t0 - t1 = (g + h) - (i + j)

Operations in the Instruction Set

    Operator type            Examples
    Arithmetic and logical   Integer arithmetic and logical operations: add, and, subtract, or
    Data transfer            Loads-stores (move instructions on machines with memory addressing)
    Control                  Branch, jump, procedure call and return, trap
    System                   Operating system call, virtual memory management instructions
    Floating point           Floating-point instructions: add, multiply
    Decimal                  Decimal add, decimal multiply, decimal-to-character conversion
    String                   String move, string compare, string search
    Graphics                 Pixel operations, compression/decompression operations

Arithmetic, logical, data transfer, and control are almost standard categories for all machines

System instructions are required for multi-programming environments, although support for system functions varies

Decimal and string instructions can be primitives, e.g., on the IBM 360 and the VAX

Support for floating point, decimal, string, and graphics is sometimes provided optionally via a co-processor

Some machines rely on the compiler to synthesize special operations, such as string handling, from simpler instructions

Operations for Media and Signal Processing

Single instruction multiple data (SIMD) and vector instructions are often supported in DSPs, which are commonly used in multimedia and signal processing applications

Partitioned Add (integer)

- Perform multiple 16-bit additions on a 64-bit ALU, since most data are narrow

- Increases ALU throughput for multimedia applications

Paired single operations (float)

- Allow the same register to act as two operands to the same operation

- Handy in dealing with vertices and coordinates

Multiply and accumulate

- Very handy for calculating dot products of vectors (signal processing) and matrix multiplication
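A partitioned add and a multiply-accumulate loop can be sketched in a few lines. The lane width (four 16-bit lanes in a 64-bit word), the wrap-around behaviour on overflow, and the function names are assumptions made for illustration; real media extensions also offer saturating variants.

```python
def partitioned_add16(a, b):
    """Add four independent 16-bit lanes packed into 64-bit words.

    Each lane wraps around on overflow; carries never cross into the next lane.
    """
    result = 0
    for lane in range(4):
        shift = 16 * lane
        lane_sum = ((a >> shift) & 0xFFFF) + ((b >> shift) & 0xFFFF)
        result |= (lane_sum & 0xFFFF) << shift
    return result


def dot_product(xs, ys):
    """Dot product as a chain of multiply-accumulate steps."""
    acc = 0
    for x, y in zip(xs, ys):
        acc += x * y   # one multiply-accumulate per element pair
    return acc


packed = partitioned_add16(0x0001_FFFF_0003_0004, 0x0001_0001_0001_0001)
print(hex(packed))
print(dot_product([1, 2, 3], [4, 5, 6]))
```

The third lane shows why the ALU must break the carry chain: 0xFFFF + 1 wraps to 0 instead of spilling into the neighbouring lane.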

Frequency of Operations Usage

The most widely executed instructions are the simple operations of an instruction set

The following is the average usage in SPECint92 on Intel 80x86

    Rank    80x86 instruction         Integer average (% total executed)
    1       Load                      22%
    2       Conditional branch        20%
    3       Compare                   16%
    4       Store                     12%
    5       Add                       8%
    6       And                       6%
    7       Sub                       5%
    8       Move register-register    4%
    9       Call                      1%
    10      Return                    1%
            Total                     96%

Make the common case fast by focusing on these operations

Control Flow Instructions

Jump: for unconditional change in the control flow

Branch: for conditional change in the control flow

Procedure calls and returns

[Figure: frequency of control flow instructions. Data is based on SPEC on Alpha]

Destination Address Definition

Relative addressing with respect to the program counter proved to be the best choice for forward and backward branching or jumps (load address independent)

To allow for dynamic loading of library routines, register indirect addressing allows addresses to be loaded in special registers (e.g., virtual functions in C++ and system calls in a case statement)

Data is based on SPEC on Alpha

Condition Evaluation

Compare-and-branch can be efficient if the majority of conditions are comparisons with zero

Remember to focus on the common case

Based on SPEC on MIPS

Frequency of Types of Comparison

Data is based on SPEC on Alpha

A different benchmark and machine set new design priorities

DSPs support a repeat instruction for "for" loops (vectors) using registers

Supporting Procedures

Execution of a procedure follows these steps:
- Store parameters in a place accessible to the procedure
- Transfer control to the procedure
- Acquire the storage resources needed for the procedure
- Perform the desired task
- Store the result value in a place accessible to the calling program
- Return control to the point of origin

The hardware provides a program counter to trace instruction flow and manage transfer of control

Parameter Passing
- Registers can be used for passing a small number of parameters
- A stack is used to spill registers of the current context, making room for the called procedure to run, and to allow large parameters to be passed
- Storage of machine state can be performed by the caller or the callee
- Handling of shared variables is important to ensure correct semantics and thus requires clear specifications in the library interface

Global variables stored in registers need careful handling
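The procedure-call steps can be illustrated with a small Python model. It is a sketch under assumptions: the caller saves all of its registers (one of the two options mentioned), and the register names `a0`, `a1`, `t0`, `v0` as well as the helpers `call` and `add_two` are hypothetical, chosen only to resemble MIPS conventions.

```python
def call(procedure, args, regs, stack):
    """Caller-save sketch: spill registers, pass arguments, run, restore."""
    stack.append(dict(regs))              # spill the caller's context to the stack
    for i, value in enumerate(args):      # a small number of parameters go in registers
        regs[f"a{i}"] = value
    result = procedure(regs)              # transfer control to the procedure
    saved = stack.pop()                   # back at the point of origin: restore context
    regs.clear()
    regs.update(saved)
    regs["v0"] = result                   # result placed where the caller can read it
    return result


def add_two(regs):
    """Example callee; it is free to overwrite temporaries such as t0."""
    regs["t0"] = regs["a0"] + regs["a1"]
    return regs["t0"]


registers = {"t0": 99}                    # caller's live temporary
call_stack = []
returned = call(add_two, [3, 4], registers, call_stack)
```

After the call, `registers["t0"]` still holds the caller's value because the saved context was restored from the stack.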

Type and Size of Operands

- The type of an operand is designated by encoding it in the instruction's operation code
- The type of an operand (e.g., single-precision float) effectively gives its size
- Common operand types include character, half-word and word-size integer, and single- and double-precision floating point
- Characters are almost always in ASCII, integers in 2's complement, and floating point in IEEE 754
- The 16-bit Unicode used in Java is gaining popularity due to its support for international character sets
- For business applications, some architectures support a decimal format in binary coded decimal (BCD)
- Depending on the size of the word, the complexity of handling different operand types differs
- DSPs offer fixed-point data types to support high-precision floating-point arithmetic and to allow sharing a single exponent among multiple numbers
- For graphics applications, vertex and pixel operands are added features
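Two of the integer encodings named above, 2's complement and binary coded decimal, can be sketched as follows. The function names `twos_complement` and `to_bcd` are hypothetical, and the packed-BCD layout (one decimal digit per 4-bit nibble, least significant digit in the lowest nibble) is an assumption made for illustration.

```python
def twos_complement(value, bits):
    """Return the unsigned bit pattern that represents a signed value."""
    low, high = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not low <= value <= high:
        raise ValueError(f"{value} does not fit in {bits} bits")
    return value & ((1 << bits) - 1)


def to_bcd(value):
    """Pack a non-negative integer into BCD, one decimal digit per nibble."""
    if value < 0:
        raise ValueError("this sketch handles non-negative values only")
    result, shift = 0, 0
    while True:
        value, digit = divmod(value, 10)
        result |= digit << shift
        shift += 4
        if value == 0:
            return result
```

Printed in hexadecimal, a BCD value reads like the decimal number it encodes, which is why the format suits business arithmetic.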

Size of Operands

- Double-word data type is used for double-precision floating point operations and for address storage in machines with a 64-bit wide address bus
- Words are used for integer operations and for 32-bit address bus machines
- Because of the mix in SPEC, word and double-word data types dominate

Frequency of reference by size, based on SPEC2000 on Alpha

Instruction Representation

- Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary, or base 2)
- Numbers are stored in computers as a series of high and low electronic signals (binary numbers)
- Binary digits are called bits and are considered the atom of computing
- Each piece of an instruction is a number, and placing these numbers together forms the instruction
- The assembler translates symbolic assembly instructions into machine language instructions (machine code)

Example:

Assembly:                    add $t0, $s1, $s2

M/C language (decimal):      0       17      18      8       0       32

M/C language (binary):       000000  10001   10010   01000   00000   100000
                             6 bits  5 bits  5 bits  5 bits  5 bits  6 bits

Note: the MIPS compiler by default maps $s0,...,$s7 to registers 16-23 and $t0,...,$t7 to registers 8-15
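Placing the field numbers together to form an instruction can be sketched as bit packing. The sketch assumes the standard MIPS register-format field widths (6, 5, 5, 5, 5, 6 bits) and encodes `add $t0, $s1, $s2`, whose fields are 0, 17, 18, 8, 0, 32. The function name `encode_r_format` is hypothetical.

```python
def encode_r_format(op, rs, rt, rd, shamt, funct):
    """Pack six fields into one 32-bit MIPS register-format word."""
    fields = [(op, 6), (rs, 5), (rt, 5), (rd, 5), (shamt, 5), (funct, 6)]
    word = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"field value {value} does not fit in {width} bits")
        word = (word << width) | value    # append the field on the right
    return word


# add $t0, $s1, $s2 with $t0 = register 8, $s1 = 17, $s2 = 18
add_word = encode_r_format(0, 17, 18, 8, 0, 32)
add_bits = format(add_word, "032b")
```

Here `add_bits` is the same binary string as the six fields written side by side.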

Encoding an Instruction Set

- Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation
- The operation is typically specified in one field, called the opcode
- The addressing mode for an operand can be encoded with the operation, or specified through a separate identifier when a large number of modes is supported
- The architecture must balance several competing factors:
  - Desire to support as many registers and addressing modes as possible
  - Effect of operand specification on the size of the instruction (program)
  - Desire to simplify instruction fetching and decoding during execution
- Fixed-size instruction encoding simplifies the CPU design while limiting the addressing modes supported
- An architect who cares about code size can use variable-size encoding
- A hybrid approach allows variability by supporting multiple instruction sizes

Encoding Examples

MIPS Instruction Format

Register-format instructions:

    op      rs      rt      rd      shamt   funct
    6 bits  5 bits  5 bits  5 bits  5 bits  6 bits

- op: basic operation of the instruction, traditionally called the opcode
- rs: the first register source operand
- rt: the second register source operand
- rd: the register destination operand; it gets the result of the operation
- shamt: shift amount
- funct: selects the specific variant of the operation in the op field

Immediate-type instructions:

    op      rs      rt      address
    6 bits  5 bits  5 bits  16 bits

- Some instructions need longer fields than provided, for large-value constants
- The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register
- Example: lw $t0, 32($s3)    # Temporary register $t0 gets A[8]

Instruction  Format  op  rs   rt   rd    shamt  funct  address
add          R       0   reg  reg  reg   0      32     n.a.
sub          R       0   reg  reg  reg   0      34     n.a.
lw           I       35  reg  reg  n.a.  n.a.   n.a.   address
sw           I       43  reg  reg  n.a.  n.a.   n.a.   address
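The immediate format and its ±2^15-byte reach can be checked with a short sketch. It assumes the standard MIPS layout (6-bit op, 5-bit rs, 5-bit rt, 16-bit signed offset) and encodes `lw $t0, 32($s3)`, where `$s3` is register 19 and `$t0` is register 8. The function name `encode_i_format` is hypothetical.

```python
def encode_i_format(op, rs, rt, offset):
    """Pack an immediate-format word; offset is a signed 16-bit byte offset."""
    if not -(1 << 15) <= offset < (1 << 15):
        raise ValueError("offset outside the signed 16-bit range")
    # The offset is stored in 2's complement in the low 16 bits.
    return (op << 26) | (rs << 21) | (rt << 16) | (offset & 0xFFFF)


lw_word = encode_i_format(35, 19, 8, 32)       # lw $t0, 32($s3)
lw_back = encode_i_format(35, 19, 8, -4)       # lw $t0, -4($s3)
```

An offset of 32768 is rejected because it no longer fits in the 16-bit field, which is the limit the slide refers to.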

The Stored Program Concept

- Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept
- Today's computers are built on two key principles:
  - Instructions are represented as numbers
  - Programs can be stored in memory to be read or written just like numbers
- The power of the concept: memory can contain
  - the source code for an editor
  - the compiled machine code for the editor
  - the text that the compiled program is using
  - the compiler that generated the code

[Figure: a processor connected to a memory that holds an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and the source code in C for the editor program]

Compiling if-then-else in MIPS

Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiled MIPS code for the following C if statement?

    if (i == j) f = g + h; else f = g - h;

[Flowchart: test i == j; when i = j, execute f = g + h; when i ≠ j, go to Else and execute f = g - h; both paths meet at Exit]

MIPS:

          bne $s3, $s4, Else    # go to Else if i ≠ j
          add $s0, $s1, $s2     # f = g + h (skipped if i ≠ j)
          j   Exit
    Else: sub $s0, $s1, $s2     # f = g - h (skipped if i = j)
    Exit:
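To see how the branch and jump steer execution, the compiled if-then-else sequence (bne to Else, add, j to Exit, Else: sub, Exit:) can be run on a tiny interpreter. This is a sketch, not a MIPS simulator: it supports only the four instructions used here, and the tuple-based program format and the name `run_mips` are hypothetical.

```python
def run_mips(program, regs):
    """Interpret a tiny subset of MIPS: bne, j, add, sub, plus labels."""
    labels = {ins[1]: index for index, ins in enumerate(program) if ins[0] == "label"}
    pc = 0
    while pc < len(program):
        op, *args = program[pc]
        pc += 1                                   # default: sequential flow
        if op == "bne":
            if regs[args[0]] != regs[args[1]]:
                pc = labels[args[2]]              # conditional change of flow
        elif op == "j":
            pc = labels[args[0]]                  # unconditional change of flow
        elif op == "add":
            regs[args[0]] = regs[args[1]] + regs[args[2]]
        elif op == "sub":
            regs[args[0]] = regs[args[1]] - regs[args[2]]
        # "label" entries are markers and execute as no-ops
    return regs


IF_THEN_ELSE = [
    ("bne", "s3", "s4", "Else"),
    ("add", "s0", "s1", "s2"),
    ("j", "Exit"),
    ("label", "Else"),
    ("sub", "s0", "s1", "s2"),
    ("label", "Exit"),
]

equal = run_mips(IF_THEN_ELSE, {"s0": 0, "s1": 10, "s2": 3, "s3": 5, "s4": 5})
differ = run_mips(IF_THEN_ELSE, {"s0": 0, "s1": 10, "s2": 3, "s3": 5, "s4": 6})
```

With i equal to j only the add executes; otherwise only the sub does.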

Typical Compilation

Major Types of Optimization

Optimization name: Explanation (Frequency)

High-level (at or near source level; machine independent)
- Procedure integration: replace procedure call by procedure body (N.M.)

Local (within straight-line code)
- Common subexpression elimination: replace two instances of the same computation by a single copy (18%)
- Constant propagation: replace all instances of a variable that is assigned a constant with the constant (22%)
- Stack height reduction: rearrange expression tree to minimize resources needed for expression evaluation (N.M.)

Global (across a branch)
- Global common subexpression elimination: same as local, but this version crosses branches (13%)
- Copy propagation: replace all instances of a variable A that has been assigned X (i.e., A = X) with X (11%)
- Code motion: remove code from a loop that computes the same value each iteration of the loop (16%)
- Induction variable elimination: simplify/eliminate array addressing calculations within loops (2%)

Machine-dependent (depends on machine knowledge)
- Strength reduction: many examples, such as replacing multiply by a constant with adds and shifts (N.M.)
- Pipeline scheduling: reorder instructions to improve pipeline performance (N.M.)

N.M. = not measured
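Two of the optimizations named on this slide, strength reduction and code motion, can be shown as before/after pairs. This is a hand-written sketch of what such transformations preserve, not the output of any compiler; all four function names are hypothetical.

```python
def times_ten(x):
    """Original form: multiply by a constant."""
    return x * 10


def times_ten_reduced(x):
    """Strength-reduced form: 10x = 8x + 2x, using shifts and one add."""
    return (x << 3) + (x << 1)


def scale_all(values, a, b):
    """Original loop: a + b is recomputed on every iteration."""
    return [v * (a + b) for v in values]


def scale_all_hoisted(values, a, b):
    """After code motion: the loop-invariant a + b is computed once."""
    factor = a + b
    return [v * factor for v in values]
```

In both pairs the transformed version computes the same result with cheaper or fewer operations.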

Effect of Compiler Optimization

[Figure: measurements taken on S; axes are program and compiler optimization level]

- Level 0: non-optimized code
- Level 1: local optimization
- Level 2: global optimization, s/w pipelining
- Level 3: adds procedure integration

Compiler Support for Multimedia Instr.

- Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics)
- Intel added a new set of instructions called Streaming SIMD Extension
- A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer
- Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations
  - Strided addressing allows memory access in increments larger than one
  - Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data
- Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup
- Such limited support for vector processing makes the use of vectorizing compiler optimization unpopular and restricts its scope to hand-coded routines

SIMD instructions on MMX and AltiVec tend to be solutions, not primitives
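Strided and gather/scatter addressing can be modeled on a flat Python list standing in for memory. The function names `strided_load`, `gather`, and `scatter` are hypothetical, and the sketch ignores element sizes and alignment.

```python
def strided_load(memory, base, stride, count):
    """Load count elements starting at base, stepping by stride each time."""
    return [memory[base + i * stride] for i in range(count)]


def gather(memory, indices):
    """Load elements from the locations listed in an index vector."""
    return [memory[i] for i in indices]


def scatter(memory, indices, values):
    """Store each value at the location named by the matching index."""
    for i, value in zip(indices, values):
        memory[i] = value


memory = list(range(100, 116))                 # 16 words holding 100..115
column = strided_load(memory, 1, 4, 3)         # e.g. one column of a 4-wide matrix
picked = gather(memory, [7, 0, 3])
scatter(memory, [7, 0, 3], [-1, -2, -3])
```

A stride of 4 walks down one column of a row-major matrix with four elements per row, the kind of distant-location access the slide describes.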

Starting a Program

[Figure: C program -> Compiler -> Assembly language program -> Assembler -> Object: machine language module, together with Object: library routine (machine language) -> Linker -> Executable: machine language program -> Loader -> Memory]

Linker
- Places code and data modules symbolically in memory
- Determines the addresses of data and instruction labels
- Patches both internal and external references

Object files for Unix typically contain:
- Header: size and position of components
- Text segment: machine code
- Data segment: static and dynamic variables
- Relocation info: identifies absolute memory references
- Symbol table: names and locations of labels, procedures, and variables
- Debugging info: mapping of source to object code, break points, etc.

Loading Executable Program

[Figure: memory layout]

    $sp -> 7fff ffff hex   Stack
                           Dynamic data
    $gp -> 1000 8000 hex   Static data
           1000 0000 hex
    pc  -> 0040 0000 hex   Text
           0               Reserved

To load an executable, the operating system follows these steps:
- Reads the executable file header to determine the size of the text and data segments
- Creates an address space large enough for the text and data
- Copies the instructions and data from the executable file into memory
- Copies the parameters (if any) to the main program onto the stack
- Initializes the machine registers and sets the stack pointer to the first free location
- Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program

Instruction Set Design Issues

Instruction set design issues:
- Number of addresses
- Flow of control
- Operand types
- Addressing modes
- Instruction types
- Instruction formats

Number of Addresses

Four categories
- 3-address machines
  - 2 for the source operands and one for the result
- 2-address machines
  - One address doubles as source and result
- 1-address machines
  - Accumulator machines
  - Accumulator is used for one source and result
- 0-address machines
  - Stack machines
  - Operands are taken from the stack
  - Result goes onto the stack

Number of Addresses (cont.)

Three-address machines
- Two for the source operands, one for the result
- RISC processors use three addresses
- Sample instructions

    add  dest,src1,src2      M(dest) = [src1] + [src2]
    sub  dest,src1,src2      M(dest) = [src1] - [src2]
    mult dest,src1,src2      M(dest) = [src1] * [src2]

Three addresses: Operand 1, Operand 2, Result

Example: a = b + c

Three-address instruction formats are not common because they require a relatively long instruction format to hold the three address references

Number of Addresses (cont.)

Example
- C statement

    A = B + C * D - E + F + A

- Equivalent code:

    mult T,C,D      ; T = C*D
    add  T,T,B      ; T = B + C*D
    sub  T,T,E      ; T = B + C*D - E
    add  T,T,F      ; T = B + C*D - E + F
    add  A,T,A      ; A = B + C*D - E + F + A
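The three-address sequence for A = B + C * D - E + F + A can be executed on a minimal model in which every operand name is a memory location. The interpreter `run_three_address` and the sample values in `memory3` are hypothetical, chosen only to make the result easy to check by hand.

```python
import operator

OPS = {"add": operator.add, "sub": operator.sub, "mult": operator.mul}


def run_three_address(program, memory):
    """Each instruction names a destination and two sources."""
    for op, dest, src1, src2 in program:
        memory[dest] = OPS[op](memory[src1], memory[src2])
    return memory


THREE_ADDRESS = [
    ("mult", "T", "C", "D"),   # T = C*D
    ("add", "T", "T", "B"),    # T = B + C*D
    ("sub", "T", "T", "E"),    # T = B + C*D - E
    ("add", "T", "T", "F"),    # T = B + C*D - E + F
    ("add", "A", "T", "A"),    # A = B + C*D - E + F + A
]

memory3 = run_three_address(
    THREE_ADDRESS, {"A": 7, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6, "T": 0}
)
```

Five instructions suffice, but each one carries three address fields, which is the length cost the slide points out.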

Number of Addresses (cont.)

Two-address machines
- One address doubles (as source operand and result)
- The last example makes a case for it
  - Address T is used twice
- Sample instructions

    load dest,src      M(dest) = [src]
    add  dest,src      M(dest) = [dest] + [src]
    sub  dest,src      M(dest) = [dest] - [src]
    mult dest,src      M(dest) = [dest] * [src]

Two addresses: one address doubles as operand and result

Example: a = a + b

The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation

Number of Addresses (cont.)

Example
- C statement

    A = B + C * D - E + F + A

- Equivalent code:

    load T,C      ; T = C
    mult T,D      ; T = C*D
    add  T,B      ; T = B + C*D
    sub  T,E      ; T = B + C*D - E
    add  T,F      ; T = B + C*D - E + F
    add  A,T      ; A = B + C*D - E + F + A

Number of Addresses (cont.)

One-address machines
- Use a special set of registers called accumulators
  - Specify one source operand and receive the result
- Called accumulator machines
- Sample instructions

    load  addr      accum = [addr]
    store addr      M[addr] = accum
    add   addr      accum = accum + [addr]
    sub   addr      accum = accum - [addr]
    mult  addr      accum = accum * [addr]

Number of Addresses (cont.)

Example
- C statement

    A = B + C * D - E + F + A

- Equivalent code:

    load  C      ; load C into accum
    mult  D      ; accum = C*D
    add   B      ; accum = C*D + B
    sub   E      ; accum = B + C*D - E
    add   F      ; accum = B + C*D - E + F
    add   A      ; accum = B + C*D - E + F + A
    store A      ; store accum contents in A
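The accumulator version of the same statement can be traced with a one-register model. The interpreter `run_accumulator` and the sample memory values are hypothetical; only the five sample instructions from the slides are supported.

```python
def run_accumulator(program, memory):
    """One implicit register (the accumulator) is a source and the result."""
    accum = 0
    for op, addr in program:
        if op == "load":
            accum = memory[addr]
        elif op == "store":
            memory[addr] = accum
        elif op == "add":
            accum = accum + memory[addr]
        elif op == "sub":
            accum = accum - memory[addr]
        elif op == "mult":
            accum = accum * memory[addr]
        else:
            raise ValueError(f"unknown instruction {op}")
    return memory


ONE_ADDRESS = [
    ("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
    ("add", "F"), ("add", "A"), ("store", "A"),
]

memory1 = run_accumulator(ONE_ADDRESS, {"A": 7, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6})
```

Seven short instructions replace the five longer three-address ones: fewer address bits per instruction, more instructions per statement.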

Number of Addresses (cont.)

Zero-address machines
- Stack supplies operands and receives the result
  - Special instructions to load and store use an address
- Called stack machines (Ex: HP3000, Burroughs B5500)
- Sample instructions

    push addr      push([addr])
    pop  addr      pop([addr])
    add            push(pop + pop)
    sub            push(pop - pop)
    mult           push(pop * pop)

Number of Addresses (cont.)

Example
- C statement

    A = B + C * D - E + F + A

- Equivalent code:

    push E
    push C
    push D
    mult
    push B
    add
    sub
    push F
    add
    push A
    add
    pop  A
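The stack-machine sequence can be checked with a small interpreter that follows the sample-instruction definitions, in particular sub as push(pop - pop), where the first pop returns the top of the stack. The name `run_stack_machine` and the sample values are hypothetical.

```python
def run_stack_machine(program, memory):
    """Zero-address machine: arithmetic takes operands from the stack."""
    stack = []
    for instruction in program:
        op = instruction[0]
        if op == "push":
            stack.append(memory[instruction[1]])
        elif op == "pop":
            memory[instruction[1]] = stack.pop()
        elif op == "add":
            stack.append(stack.pop() + stack.pop())
        elif op == "sub":
            top = stack.pop()              # first pop: top of stack
            below = stack.pop()            # second pop: element beneath it
            stack.append(top - below)      # push(pop - pop)
        elif op == "mult":
            stack.append(stack.pop() * stack.pop())
        else:
            raise ValueError(f"unknown instruction {op}")
    return memory


ZERO_ADDRESS = [
    ("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
    ("push", "B"), ("add",), ("sub",),
    ("push", "F"), ("add",), ("push", "A"), ("add",), ("pop", "A"),
]

memory0 = run_stack_machine(ZERO_ADDRESS, {"A": 7, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6})
```

E is pushed first so that it sits beneath B + C*D when sub executes, which yields B + C*D - E under the push(pop - pop) rule.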

Load/Store Architecture

- Instructions expect operands in internal processor registers
  - Special LOAD and STORE instructions move data between registers and memory
- RISC uses this architecture
- Reduces instruction length

Load/Store Architecture (cont.)

Sample instructions

    load  Rd,addr          Rd = [addr]
    store addr,Rs          (addr) = Rs
    add   Rd,Rs1,Rs2       Rd = Rs1 + Rs2
    sub   Rd,Rs1,Rs2       Rd = Rs1 - Rs2
    mult  Rd,Rs1,Rs2       Rd = Rs1 * Rs2

Number of Addresses (cont.)

Example
- C statement

    A = B + C * D - E + F + A

- Equivalent code:

    load  R1,B
    load  R2,C
    load  R3,D
    load  R4,E
    load  R5,F
    load  R6,A
    mult  R2,R2,R3
    add   R2,R2,R1
    sub   R2,R2,R4
    add   R2,R2,R5
    add   R2,R2,R6
    store A,R2

Flow of Control

- Default is sequential flow
- Several instructions alter this default execution
  - Branches
    - Unconditional
    - Conditional
    - Delayed branches
  - Procedure calls
    - Delayed procedure calls

Flow of Control (cont.)

Branches
- Unconditional
  - Absolute address
  - PC-relative
    - Target address is specified relative to PC contents
    - Relocatable code
- Example: MIPS
  - Absolute address

        j target

  - PC-relative

        b target
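The difference between the two target calculations can be sketched for 32-bit MIPS. The formulas follow the usual MIPS32 definitions (branch: PC + 4 plus the sign-extended 16-bit word offset shifted left by 2; jump: upper 4 bits of PC + 4 combined with the 26-bit field shifted left by 2). The function names `branch_target` and `jump_target` are hypothetical.

```python
WORD_MASK = 0xFFFFFFFF


def branch_target(pc, offset16):
    """PC-relative: target depends on where the branch itself is located."""
    if offset16 & 0x8000:                  # sign-extend the 16-bit field
        offset16 -= 0x10000
    return (pc + 4 + (offset16 << 2)) & WORD_MASK


def jump_target(pc, address26):
    """Pseudo-absolute: the 26-bit field replaces the low 28 bits of PC + 4."""
    return ((pc + 4) & 0xF0000000) | ((address26 & 0x03FFFFFF) << 2)
```

Because `branch_target` adds to the PC, the same encoded branch still reaches the right instruction after the code is moved, which is why PC-relative code is relocatable.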

Flow of Control (cont.)

Branches
- Conditional
  - Jump is taken only if the condition is met
- Two types
  - Set-Then-Jump
    - Condition testing is separated from branching
    - Condition code registers are used to convey the condition test result
    - Condition code registers keep a record of the status of the last ALU operation, such as the overflow condition
    - Example: Pentium code

        cmp AX,BX      ; compare AX and BX
        je  target     ; jump if equal

Flow of Control (cont.)

  - Test-and-Jump
    - Single instruction performs condition testing and branching
    - Example: MIPS instruction

        beq Rsrc1,Rsrc2,target

      Jumps to target if Rsrc1 = Rsrc2

Delayed branching
- Control is transferred after executing the instruction that follows the branch instruction
  - This instruction slot is called the delay slot
- Improves efficiency
- Highly pipelined RISC processors support it

Flow of Control (cont.)

Procedure calls
- Facilitate modular programming
- Require two pieces of information to return
  - End of procedure
    - Pentium uses the ret instruction
    - MIPS uses the jr instruction
  - Return address
    - In a (special) register: MIPS allows any general-purpose register
    - On the stack: Pentium

Flow of Control (cont.)

[Figure: delay slot]

Parameter Passing

Two basic techniques
- Register-based (e.g., PowerPC, MIPS)
  - Internal registers are used
    - Faster
    - Limit the number of parameters
    - Recursive procedures
- Stack-based (e.g., Pentium)
  - Stack is used
    - More general

2 Operand Types

- Instructions support basic data types
  - Characters
  - Integers
  - Floating-point
- Instruction overload
  - Same instruction for different data types
  - Example: Pentium

        mov AL,address      ; loads an 8-bit value
        mov AX,address      ; loads a 16-bit value
        mov EAX,address     ; loads a 32-bit value

Operand Types

- Separate instructions
  - Instructions specify the operand size
  - Example: MIPS

        lb Rdest,address      ; loads a byte
        lh Rdest,address      ; loads a halfword (16 bits)
        lw Rdest,address      ; loads a word (32 bits)
        ld Rdest,address      ; loads a doubleword (64 bits)

    Similar instructions exist for store

3 Addressing Modes

- How the operands are specified
  - Operands can be in three places
    - Registers
      - Register addressing mode
    - Part of instruction
      - Constant
      - Immediate addressing mode
      - All processors support these two addressing modes
    - Memory
      - Difference between RISC and CISC
      - CISC supports a large variety of addressing modes
      - RISC follows load/store architecture

4 Instruction Types

Several types of instructions
- Data movement
  - Pentium: mov dest,src
  - Some do not provide direct data movement instructions
  - Indirect data movement

        add Rdest,Rsrc,0      ; Rdest = Rsrc + 0

- Arithmetic and Logical
  - Arithmetic
    - Integer and floating-point, signed and unsigned
    - add, subtract, multiply, divide
  - Logical
    - and, or, not, xor

Instruction Types (cont.)

Condition code bits
- S: Sign bit (0 = +, 1 = -)
- Z: Zero bit (0 = nonzero, 1 = zero)
- O: Overflow bit (0 = no overflow, 1 = overflow)
- C: Carry bit (0 = no carry, 1 = carry)

Example: Pentium

    cmp count,25      ; compare count to 25
                      ; (subtract 25 from count)
    je  target        ; jump if equal
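How a compare sets the four condition code bits can be sketched by performing the subtraction a - b on fixed-width integers. The sketch assumes x86-style conventions (carry set when the unsigned subtraction borrows, overflow set when the signed result does not fit); the function name `compare_flags` and the 32-bit default width are assumptions.

```python
def compare_flags(a, b, bits=32):
    """Return the S, Z, O, C bits produced by computing a - b."""
    mask = (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    a &= mask
    b &= mask
    diff = (a - b) & mask
    return {
        "S": int(bool(diff & sign_bit)),          # result is negative
        "Z": int(diff == 0),                      # operands were equal
        # Signed overflow: operands differ in sign and the result's sign
        # differs from the sign of a.
        "O": int(bool((a ^ b) & (a ^ diff) & sign_bit)),
        "C": int(a < b),                          # unsigned borrow
    }


def jump_if_equal(flags):
    """je tests only the zero bit left behind by the compare."""
    return flags["Z"] == 1
```

A set-then-jump pair therefore communicates through these bits alone: the compare writes them and the conditional jump reads them.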

Instruction Types (cont.)

Flow control and I/O instructions
- Branch
- Procedure call
- Interrupts

I/O instructions
- Memory-mapped I/O
  - Most processors support memory-mapped I/O
  - No separate instructions for I/O
- Isolated I/O
  - Pentium supports isolated I/O
  - Separate I/O instructions

        in  AX,io_port      ; read from an I/O port
        out io_port,AX      ; write to an I/O port

5 Instruction Formats

Two types
- Fixed-length
  - Used by RISC processors
  - 32-bit RISC processors use 32-bit-wide instructions
    - Examples: SPARC, MIPS, PowerPC
- Variable-length
  - Used by CISC processors
  - Memory operands need more bits to specify

Opcode
- Major and exact operation

Examples of Instruction Formats

RISC (Reduced Instruction Set Computer)
versus
CISC (Complex Instruction Set Computer)

RISC vs. CISC

- The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.
- RISC systems access memory only with explicit load and store instructions.
- In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.

RISC vs. CISC

- The difference between CISC and RISC becomes evident through the basic computer performance equation:

    time/program = time/cycle * cycles/instruction * instructions/program

- RISC systems shorten execution time by reducing the clock cycles per instruction.
- CISC systems improve performance by reducing the number of instructions per program.

RISC vs. CISC

- The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.
- The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.
- With fixed-length instructions, RISC lends itself to pipelining and speculative execution.

RISC vs. CISC

Consider the following program fragments:

    CISC                    RISC
    mov ax, 10              mov ax, 0
    mov bx, 5               mov bx, 10
    mul bx, ax              mov cx, 5
                     Begin: add ax, bx
                            loop Begin

The total clock cycles for the CISC version might be:

    (2 movs * 1 cycle) + (1 mul * 30 cycles) = 32 cycles

While the clock cycles for the RISC version are:

    (3 movs * 1 cycle) + (5 adds * 1 cycle) + (5 loops * 1 cycle) = 13 cycles

With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds.
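The cycle comparison can be reproduced from per-instruction counts. The sketch assumes illustrative costs: every mov, add, and loop takes 1 cycle, the CISC mul takes 30 cycles, and the RISC loop body runs 5 times. The cycle times passed to `execution_time` are hypothetical, as are the function names.

```python
def total_cycles(instruction_mix):
    """Sum count * cycles over (count, cycles_per_instruction) pairs."""
    return sum(count * cycles for count, cycles in instruction_mix)


def execution_time(instruction_mix, cycle_time_ns):
    """time/program = (cycles/program) * (time/cycle)."""
    return total_cycles(instruction_mix) * cycle_time_ns


# Assumed costs: 2 movs at 1 cycle and 1 mul at 30 cycles.
CISC_MIX = [(2, 1), (1, 30)]
# Assumed costs: 3 movs, 5 adds, and 5 loops, each at 1 cycle.
RISC_MIX = [(3, 1), (5, 1), (5, 1)]

cisc_cycles = total_cycles(CISC_MIX)
risc_cycles = total_cycles(RISC_MIX)
```

The RISC fragment executes more instructions (13 versus 3) but fewer cycles, and a shorter cycle time widens the gap further.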

RISC vs. CISC

- Because of their load-store ISAs, RISC architectures require a large number of CPU registers.
- These registers provide fast access to data during sequential program execution.
- They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.
- Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.

RISC vs. CISC

- This is how registers can be overlapped in a RISC system.
- The current window pointer (CWP) points to the active register window.

RISC vs. CISC

- It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.
- Some RISC systems provide more extravagant instruction sets than some CISC systems.
- Some systems combine both approaches.
- The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.

RISC vs. CISC

RISC
- Multiple register sets.
- Three operands per instruction.
- Parameter passing through register windows.
- Single-cycle instructions.
- Hardwired control.
- Highly pipelined.

CISC
- Single register set.
- One or two register operands per instruction.
- Parameter passing through memory.
- Multiple cycle instructions.
- Microprogrammed control.
- Less pipelined.

Continued...

RISC vs. CISC

RISC
- Simple instructions, few in number.
- Fixed length instructions.
- Complexity in compiler.
- Only LOAD/STORE instructions access memory.
- Few addressing modes.

CISC
- Many complex instructions.
- Variable length instructions.
- Complexity in microcode.
- Many instructions can access memory.
- Many addressing modes.

Summary

Instruction Set Design IssuesInstruction Set Desi

gn Issues

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 95101

g

Instruction set )esin issues inclu)e here are operan)s store)lt

- reisters memor stac= accumulator

7o5 man eplicit operan)s are therelt

- 0 + 2 or amp

7o5 is the operan) location specifie)lt

- reister imme)iate in)irect 4 4 4

hat tpe gt sie of operan)s are supporte)lt

- 9te int float )ou9le strin ector4 4 4

hat operations are supporte)lt

- a)) su9 mul moe compare 4 4 4

More About General Purpose Registers

Why do almost all new architectures use GPRs?

Registers are much faster than memory (even cache):
- Register values are available immediately
- When memory isn't ready, the processor must wait ("stall")

Registers are convenient for variable storage:
- Compiler assigns some variables just to registers
- More compact code, since small fields specify registers (compared to memory addresses)

[Figure: memory hierarchy: processor with registers, cache, memory, disk]

What Operations are Needed

Arithmetic & Logical
- Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
- Logical operation: AND, NOT

Data Transfer - copy, load, store

Control - branch, jump, call, return

Floating Point - ADDF, MULF, DIVF; same as arithmetic but usually take bigger operands

Decimal - ADDD, CONVERT

String - move, compare, search

Graphics - pixel and vertex, compression/decompression operations

Stack Architecture Pros and Cons

Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures

Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures

Accumulator Architecture Pros and Cons

Pros
- Very low hardware requirements
- Easy to design and understand

Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic

Memory-Memory Architecture Pros and Cons

Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)

Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required

Memory-Register Architecture Pros and Cons

Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density

Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers


Classifying Instruction Set Architectures

Accumulator Architecture
• Common in early stored-program computers, when hardware was so expensive
• Machine has only one register (accumulator) involved in all math and logical operations
• All operations assume the accumulator as a source operand and a destination for the operation, with the other operand stored in memory

Extended Accumulator Architecture
• Dedicated registers for specific operations, e.g. stack and array index registers, added
• The 8086 microprocessor is an example of such a special-purpose register architecture

General-Purpose Register Architecture
• MIPS is an example of such an architecture, where registers are not tied to a single role
• This type of instruction set can be further divided into:
• Register-memory: allows one operand to be in memory
• Register-register (load-store): demands all operands to be in registers

Machine | # general-purpose registers | Architecture style | Year
Motorola 6800 | 2 | Accumulator | 1974
DEC VAX | 16 | Register-memory, memory-memory | 1977
Intel 8086 | 1 | Extended accumulator | 1978
Motorola 68000 | 16 | Register-memory | 1980
Intel 80386 | 8 | Register-memory | 1985
PowerPC | 32 | Load-store | 1992
DEC Alpha | 32 | Load-store | 1992

Compact Code and Stack Architectures

- When memory is scarce, machines like the Intel 80x86 had variable-length instructions to match varying operand specifications and minimize code size
- Stack machines abandoned registers altogether, arguing that it is hard for compilers to use them efficiently
- Operands are pushed onto a stack from memory, and the results have to be popped from the stack to memory
- Operations take their operands by default from the top of the stack and insert the results back onto the stack
- Stack machines simplify compilers and lend themselves to a compact instruction encoding, but limit compiler optimization (e.g. in math expressions)

Example: A = B + C

Push AddressC   # Top = Top + 4; Stack[Top] = Memory[AddressC]
Push AddressB   # Top = Top + 4; Stack[Top] = Memory[AddressB]
add             # Stack[Top-4] = Stack[Top] + Stack[Top-4]; Top = Top - 4
Pop AddressA    # Memory[AddressA] = Stack[Top]; Top = Top - 4

- Compact code is important for heralded network computers, where programs must be downloaded over the Internet (e.g. Java-based applications)
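The push/add/pop sequence for A = B + C can be traced with a small simulation. This is only a sketch of a hypothetical zero-address machine: the function name, the tuple-based program format, and the dictionary standing in for memory are assumptions made for illustration, not part of the slides.

```python
# Hypothetical zero-address stack machine; all names here are illustrative.
def run_stack_program(program, memory):
    """Execute (opcode, operand) pairs against a dict-based memory."""
    stack = []
    for opcode, operand in program:
        if opcode == "push":       # Stack[Top] = Memory[operand]
            stack.append(memory[operand])
        elif opcode == "add":      # pop two values, push their sum
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif opcode == "pop":      # Memory[operand] = Stack[Top]
            memory[operand] = stack.pop()
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return memory


memory = {"A": 0, "B": 5, "C": 7}
program = [("push", "C"), ("push", "B"), ("add", None), ("pop", "A")]
run_stack_program(program, memory)
print(memory["A"])
```

Note that the add instruction names no operands at all, which is where the compact encoding comes from; the cost is that every value has to pass through the top of the stack.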

Other Types of Architecture

High-Level-Language Architecture
• In the 1960s, systems software was rarely written in high-level languages, and virtually every commercial operating system before Unix was written in assembly
• Some people blamed the code density on the instruction set rather than the programming language
• A machine design philosophy was advocated with the goal of making the hardware more like high-level languages
• The effectiveness of high-level languages, memory size limitation and lack of efficient compilers doomed this philosophy to a historical footnote

Reduced Instruction Set Architecture
• With the recent development in compiler technology and expanded memory sizes, fewer programmers are using assembly-level coding
• Instruction set architecture became measurable in the way compilers, rather than programmers, use them
• RISC architecture favors simplifying hardware design over enriching the offered set of instructions, relying on compilers to effectively use them to perform complex operations
• Virtually all new architectures since the 1980s follow the RISC philosophy of fixed instruction lengths, load-store operations, and limited addressing modes

Evolution of Instruction Sets

[Figure: evolution of instruction sets, top to bottom]
Single Accumulator (EDSAC, 1950)
Accumulator + Index Registers (Manchester Mark I, IBM 700 series, 1953)
Separation of Programming Model from Implementation
- High-level Language Based (B5000, 1963)
- Concept of a Family (IBM 360, 1964)
General Purpose Register Machines
- Complex Instruction Sets (VAX, Intel 432, 1977-80)
- Load/Store Architecture (CDC 6600, Cray 1, 1963-76)
RISC (MIPS, SPARC, HP-PA, IBM RS6000, ... 1987)

Register-Memory Architectures

Effect of the number of memory operands

# memory addresses | Max. number of operands | Examples
0 | 3 | SPARC, MIPS, PowerPC, ALPHA
1 | 2 | Intel 80x86, Motorola 68000
2 | 2 | VAX (also has 3 operands format)
3 | 3 | VAX (also has 2 operands format)

Interpreting Memory Addressing

- The address of a word matches the byte address of one of its 4 bytes
- The addresses of sequential words differ by 4 (word size in bytes)
- Word addresses are multiples of 4 (alignment restriction)
- Machines that use the address of the leftmost byte as the word address are called "Big Endian", and those that use the rightmost byte are called "Little Endian"
- Misalignment complicates memory access and causes programs to run slower (some machines do not allow misaligned memory access at all)
- Byte ordering can be a problem when exchanging data among different machines
- Byte addresses affect array index calculation, which must account for word addressing and the offset within the word

Object addressed | Aligned at byte offsets | Misaligned at byte offsets
Byte | 0, 1, 2, 3, 4, 5, 6, 7 | Never
Half word | 0, 2, 4, 6 | 1, 3, 5, 7
Word | 0, 4 | 1, 2, 3, 5, 6, 7
Double word | 0 | 1, 2, 3, 4, 5, 6, 7
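Byte ordering and the alignment rule can both be checked in a few lines. The sketch below uses Python's standard struct module to lay out one 32-bit word in both byte orders; the helper name is_aligned and the sample value 0x0A0B0C0D are assumptions chosen for illustration.

```python
import struct


def is_aligned(address, size):
    """An object of `size` bytes is aligned when its address is a multiple of `size`."""
    return address % size == 0


word = 0x0A0B0C0D
big = struct.pack(">I", word)     # big endian: most significant byte at the lowest address
little = struct.pack("<I", word)  # little endian: least significant byte at the lowest address

print(big.hex(), little.hex())
# Byte offsets (0..7) at which a 4-byte word is aligned
print([offset for offset in range(8) if is_aligned(offset, 4)])
```

The same four bytes appear in opposite order in the two layouts, which is why data exchanged between machines of different endianness has to be byte-swapped.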

Addressing Modes

Addressing modes refer to how to specify the location of an operand (effective address).

Addressing modes have the ability to:
- Significantly reduce instruction counts
- Increase the average CPI
- Increase the complexity of building a machine

The VAX machine is used for benchmark data since it supports a wide range of memory addressing modes.

Famous addressing modes can be classified based on:
- the source of the data: register, immediate, or memory
- the address calculation: direct and indirect

An indexed addressing mode is usually provided to allow efficient implementation of loops and array access.

Example of Addressing Modes

Addressing mode | Example | Meaning | When used
Register | ADD R4, R3 | Regs[R4] ← Regs[R4] + Regs[R3] | When a value is in a register
Immediate | ADD R4, #3 | Regs[R4] ← Regs[R4] + 3 | For constants
Displacement | ADD R4, 100(R1) | Regs[R4] ← Regs[R4] + Mem[100 + Regs[R1]] | Accessing local variables
Register indirect | ADD R4, (R1) | Regs[R4] ← Regs[R4] + Mem[Regs[R1]] | Accessing using a pointer or a computed address
Indexed | ADD R4, (R1 + R2) | Regs[R4] ← Regs[R4] + Mem[Regs[R1] + Regs[R2]] | Sometimes useful in array addressing: R1 = base of the array, R2 = index amount
Direct or absolute | ADD R4, (1001) | Regs[R4] ← Regs[R4] + Mem[1001] | Sometimes useful for accessing static data; address constant may need to be large
Memory indirect or memory deferred | ADD R4, @(R3) | Regs[R4] ← Regs[R4] + Mem[Mem[Regs[R3]]] | If R3 is the address of the pointer p, then the mode yields *p
Autoincrement | ADD R4, (R2)+ | Regs[R4] ← Regs[R4] + Mem[Regs[R2]]; Regs[R2] ← Regs[R2] + d | Useful for stepping through arrays within a loop. R2 points to the start of the array; each reference increments R2 by d
Autodecrement | ADD R4, -(R2) | Regs[R2] ← Regs[R2] - d; Regs[R4] ← Regs[R4] + Mem[Regs[R2]] | Same use as autoincrement. Autodecrement/increment can also act as push/pop to implement a stack
Scaled | ADD R4, 100(R2)[R3] | Regs[R4] ← Regs[R4] + Mem[100 + Regs[R2] + Regs[R3] * d] | Used to index arrays
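The "Meaning" column of the table boils down to a different effective-address formula per mode. The sketch below computes that address for the memory-referencing modes; the function name, the mode strings, and the dictionary-based register file and memory are all assumptions made for illustration.

```python
def effective_address(mode, regs, mem, *, base=None, index=None, disp=0, scale=1):
    """Return the memory address an operand refers to under the given mode."""
    if mode == "displacement":        # Mem[disp + Regs[base]]
        return disp + regs[base]
    if mode == "register_indirect":   # Mem[Regs[base]]
        return regs[base]
    if mode == "indexed":             # Mem[Regs[base] + Regs[index]]
        return regs[base] + regs[index]
    if mode == "direct":              # Mem[disp]
        return disp
    if mode == "memory_indirect":     # Mem[Mem[Regs[base]]]: one extra memory read
        return mem[regs[base]]
    if mode == "scaled":              # Mem[disp + Regs[base] + Regs[index] * d]
        return disp + regs[base] + regs[index] * scale
    raise ValueError(f"unsupported mode: {mode}")


regs = {"R1": 200, "R2": 8, "R3": 2}
mem = {200: 400}  # the word at address 200 holds a pointer to address 400
print(effective_address("displacement", regs, mem, base="R1", disp=100))
print(effective_address("scaled", regs, mem, base="R2", index="R3", disp=100, scale=4))
```

Register and immediate modes are left out because they do not reference memory, and autoincrement/autodecrement are register indirect plus a side effect on the base register.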

Addressing Mode for Signal Processing

Fast Fourier Transform

0 (000) → 0 (000)
1 (001) → 4 (100)
2 (010) → 2 (010)
3 (011) → 6 (110)
4 (100) → 1 (001)
5 (101) → 5 (101)
6 (110) → 3 (011)
7 (111) → 7 (111)

Modulo addressing
- Since DSP deals with continuous data streams, circular buffers are widely used
- Circular or modulo addressing allows automatic increment and decrement, and resets the pointer when reaching the end of the buffer

Reverse addressing
- The resulting address is the reverse order of the current address
- Reverse addressing mode expedites the access, which otherwise requires a number of logical instructions or extra memory accesses

DSP offers special addressing modes to better serve popular algorithms.

Special features require either hand coding or a compiler that uses such features (GNU would not be a good choice).
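Both DSP addressing modes are easy to express in software, which also shows why doing them in hardware saves instructions. The following is a minimal sketch; the function names and the buffer parameters are assumptions chosen for illustration.

```python
def bit_reverse(index, bits):
    """Reverse the low `bits` bits of `index`, as in FFT reverse addressing."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)  # move the lowest bit of index into result
        index >>= 1
    return result


def next_circular(pointer, step, start, length):
    """Advance a pointer inside the circular buffer [start, start + length)."""
    return start + (pointer - start + step) % length


print([bit_reverse(i, 3) for i in range(8)])
print(next_circular(103, 1, start=100, length=4))
```

The software version of bit reversal needs a loop of shifts and masks per access, which is the "number of logical instructions" the reverse addressing mode avoids.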

Operations of the Computer Hardware

"There must certainly be instructions for performing the fundamental arithmetic operations."
Burks, Goldstine and von Neumann, 1947

Assembly language is a symbolic representation of what the processor actually understands.

The MIPS assembler allows only one instruction per line and ignores comments following # until the end of the line.

Example: translation of a segment of a C program to MIPS assembly instructions

C: f = (g + h) - (i + j)

MIPS:

add t0, g, h    # temp variable t0 contains "g + h"
add t1, i, j    # temp variable t1 contains "i + j"
sub f, t0, t1   # f = t0 - t1 = (g + h) - (i + j)

Operations in the Instruction Set

Operator type | Examples
Arithmetic and logical | Integer arithmetic and logical operations: add, and, subtract, or
Data Transfer | Loads-stores (move instructions on machines with memory addressing)
Control | Branch, jump, procedure call and return, trap
System | Operating system call, virtual memory management instructions
Floating point | Floating point instructions: add, multiply
Decimal | Decimal add, decimal multiply, decimal-to-character conversion
String | String move, string compare, string search
Graphics | Pixel operations, compression/decompression operations

- Arithmetic, logical, data transfer and control are almost standard categories for all machines
- System instructions are required for multi-programming environments, although support for system functions varies
- Decimal and string instructions can be primitives, e.g. on the IBM 360 and the VAX
- Support for floating point, decimal, string and graphics is sometimes provided optionally via a co-processor
- Some machines rely on the compiler to synthesize special operations, such as string handling, from simpler instructions

Operations for Media & Signal Process.

Single instruction multiple data (SIMD) and vector instructions are often supported in DSPs, which are commonly used in multimedia and signal processing applications.

Partitioned Add (integer)
- Perform multiple 16-bit additions on a 64-bit ALU, since most data are narrow
- Increases ALU throughput for multimedia applications

Paired single operations (float)
- Allow the same register to act as two operands to the same operation
- Handy in dealing with vertices and coordinates

Multiply and accumulate
- Very handy for calculating dot products of vectors (signal processing) and matrix multiplication
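A partitioned add treats one wide word as several independent narrow lanes and stops carries at the lane boundaries. The sketch below emulates that for four 16-bit lanes in a 64-bit word; the lane width, the wrap-around (non-saturating) behavior, and the helper names pack and unpack are assumptions for illustration.

```python
LANE_BITS = 16
LANES = 4
LANE_MASK = (1 << LANE_BITS) - 1


def pack(values):
    """Pack four 16-bit values into one 64-bit word, lane 0 in the low bits."""
    word = 0
    for lane, value in enumerate(values):
        word |= (value & LANE_MASK) << (lane * LANE_BITS)
    return word


def unpack(word):
    """Split a 64-bit word back into its four 16-bit lanes."""
    return [(word >> (lane * LANE_BITS)) & LANE_MASK for lane in range(LANES)]


def partitioned_add(a, b):
    """Add lane by lane; a carry out of one lane is dropped, not propagated."""
    result = 0
    for lane in range(LANES):
        shift = lane * LANE_BITS
        lane_sum = ((a >> shift) & LANE_MASK) + ((b >> shift) & LANE_MASK)
        result |= (lane_sum & LANE_MASK) << shift
    return result


total = partitioned_add(pack([1, 2, 3, 0xFFFF]), pack([10, 20, 30, 1]))
print(unpack(total))
```

In hardware the four lane additions happen in the same cycle on one ALU; the loop here only models the result, not the parallelism.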

Frequency of Operations Usage

Make the common case fast by focusing on these operations.

The most widely executed instructions are the simple operations of an instruction set.

The following is the average usage in SPECint92 on the Intel 80x86:

Rank | 80x86 instruction | Integer average (% total executed)
1 | Load | 22%
2 | Conditional branch | 20%
3 | Compare | 16%
4 | Store | 12%
5 | Add | 8%
6 | And | 6%
7 | Sub | 5%
8 | Move register-register | 4%
9 | Call | 1%
10 | Return | 1%
Total | | 96%

Control Flow Instructions

- Jump: for unconditional change in the control flow
- Branch: for conditional change in the control flow
- Procedure calls and returns

Data is based on SPEC on Alpha

Destination Address Definition

- Relative addressing with respect to the program counter proved to be the best choice for forward and backward branches or jumps (load-address independent)
- To allow for dynamic loading of library routines, register indirect addressing allows addresses to be loaded in special registers (e.g. virtual functions in C++ and system calls in a case statement)

Data is based on SPEC on Alpha

Condition Evaluation

Compare-and-branch can be efficient if the majority of conditions are comparisons with zero.

Remember to focus on the common case.

Based on SPEC on MIPS

Frequency of Types of Comparison

Data is based on SPEC on Alpha

Different benchmarks and machines set new design priorities.

DSPs support a repeat instruction for for-loops (vectors) using registers.

Supporting Procedures

Execution of a procedure follows these steps:
- Store parameters in a place accessible to the procedure
- Transfer control to the procedure
- Acquire the storage resources needed for the procedure
- Perform the desired task
- Store the result value in a place accessible to the calling program
- Return control to the point of origin

The hardware provides a program counter to trace instruction flow and manage transfer of control.

Parameter Passing
- Registers can be used for passing a small number of parameters
- A stack is used to spill registers of the current context and make room for the called procedure to run, and to allow large parameters to be passed
- Storage of machine state can be performed by the caller or the callee
- Handling of shared variables is important to ensure correct semantics and thus requires clear specifications in the library interface

Global variables stored in registers need careful handling.

Type and Size of Operands

- The type of an operand is designated by encoding it in the instruction's operation code
- The type of an operand, e.g. single-precision float, effectively gives its size
- Common operand types include character, half-word and word-size integers, and single- and double-precision floating point
- Characters are almost always in ASCII, integers are in 2's complement, and floating point is in IEEE 754
- The 16-bit Unicode used in Java is gaining popularity due to its support for international character sets
- For business applications, some architectures support a decimal format in binary coded decimal (BCD)
- Depending on the size of the word, the complexity of handling different operand types differs
- DSP offers fixed-point data types to support high-precision floating point arithmetic and to allow sharing a single exponent for multiple numbers
- For graphics applications, vertex and pixel operands are added features

Size of Operands

- The double-word data type is used for double-precision floating point operations and for address storage in machines with a 64-bit wide address bus
- Words are used for integer operations and for 32-bit address bus machines
- Because of the mix in SPEC, word and double-word data types dominate

Frequency of reference by size, based on SPEC2000 on Alpha

Instruction Representation

- Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary, or base 2)
- Numbers are stored in computers as a series of high and low electronic signals (binary numbers)
- Binary digits are called bits and are considered the atom of computing
- Each piece of an instruction is a number, and placing these numbers together forms the instruction
- The assembler translates the assembly symbolic instructions into machine language instructions (machine code)

Example:

Assembly: add $t0, $s1, $s2

M/C language (decimal): 0 | 17 | 18 | 8 | 0 | 32

M/C language (binary): 000000 | 10001 | 10010 | 01000 | 00000 | 100000
Field widths: 6 bits | 5 bits | 5 bits | 5 bits | 5 bits | 6 bits

Note: the MIPS compiler by default maps $s0, ..., $s7 to registers 16-23 and $t0, ..., $t7 to registers 8-15.

Encoding an Instruction Set

- Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation
- The operation is typically specified in one field, called the opcode
- The addressing mode for the operand can be encoded with the operation, or specified through a separate identifier when a large number of modes is supported
- The architecture must balance between several competing factors:
  - Desire to support as many registers and addressing modes as possible
  - Effect of operand specification on the size of the instruction (program)
  - Desire to simplify instruction fetching and decoding during execution
- Fixed-size instruction encoding simplifies the CPU design while limiting the addressing modes supported
- An architect caring about code size can use variable-size encoding
- A hybrid approach is to allow variability by supporting multiple-sized instructions

Encoding Examples

MIPS Instruction Format

Register-format instructions

op | rs | rt | rd | shamt | funct
6 bits | 5 bits | 5 bits | 5 bits | 5 bits | 6 bits

- op: Basic operation of the instruction, traditionally called the opcode
- rs: The first register source operand
- rt: The second register source operand
- rd: The register destination operand; it gets the result of the operation
- shamt: Shift amount
- funct: This field selects the specific variant of the operation of the op field

Immediate-type instructions

op | rs | rt | address
6 bits | 5 bits | 5 bits | 16 bits

- Some instructions need longer fields than provided, for large-value constants
- The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register
- Example: lw $t0, 32($s3)   # Temporary register $t0 gets A[8]

Instruction | Format | op | rs | rt | rd | shamt | funct | address
add | R | 0 | reg | reg | reg | 0 | 32 | N/A
sub | R | 0 | reg | reg | reg | 0 | 34 | N/A
lw | I | 35 | reg | reg | N/A | N/A | N/A | address
sw | I | 43 | reg | reg | N/A | N/A | N/A | address
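Packing the fields into a 32-bit word is a matter of shifting each one to its position. The sketch below encodes add $t0, $s1, $s2 and lw $t0, 32($s3) using the field widths and opcode values listed above; the function names encode_r and encode_i are assumptions, and the register numbers follow the usual MIPS convention ($t0 = 8, $s1 = 17, $s2 = 18, $s3 = 19).

```python
def encode_r(op, rs, rt, rd, shamt, funct):
    """Pack an R-format instruction: 6 + 5 + 5 + 5 + 5 + 6 bits."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


def encode_i(op, rs, rt, address):
    """Pack an I-format instruction: 6 + 5 + 5 + 16 bits, signed 16-bit address."""
    if not -(1 << 15) <= address < (1 << 15):
        raise ValueError("address does not fit in 16 signed bits")
    return (op << 26) | (rs << 21) | (rt << 16) | (address & 0xFFFF)


add_word = encode_r(op=0, rs=17, rt=18, rd=8, shamt=0, funct=32)  # add $t0, $s1, $s2
lw_word = encode_i(op=35, rs=19, rt=8, address=32)                # lw $t0, 32($s3)
print(f"{add_word:032b}")
print(f"{lw_word:#010x}")
```

The range check in encode_i is the ±2^15 limit from the slide: a larger offset simply has no encoding in this format.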

The Stored Program Concept

Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept.

Today's computers are built on two key principles:
- Instructions are represented as numbers
- Programs can be stored in memory to be read or written just like numbers

The power of the concept: memory can contain
- the source code for an editor
- the compiled m/c code for the editor
- the text that the compiled program is using
- the compiler that generated the code

[Figure: a processor connected to a memory that holds an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and the source code in C for the editor program]

Compiling if-then-else in MIPS

Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiled MIPS code for the following C if statement?

if (i == j) f = g + h; else f = g - h;

[Figure: flowchart testing i == j; the i == j path computes f = g + h, the i ≠ j path (Else:) computes f = g - h, and both paths join at Exit:]

MIPS:

      bne $s3, $s4, Else   # go to Else if i ≠ j
      add $s0, $s1, $s2    # f = g + h (skipped if i ≠ j)
      j Exit
Else: sub $s0, $s1, $s2    # f = g - h (skipped if i = j)
Exit:

Typical Compilation

Major Types of Optimization

Optimization name | Explanation | Frequency

High-level | At or near source level; machine independent |
Procedure integration | Replace procedure call by procedure body | N.M.

Local | Within straight-line code |
Common subexpression elimination | Replace two instances of the same computation by a single copy | 18%
Constant propagation | Replace all instances of a variable that is assigned a constant with the constant | 22%
Stack height reduction | Rearrange expression tree to minimize resources needed for expression evaluation | N.M.

Global | Across a branch |
Global common subexpression elimination | Same as local, but this version crosses branches | 13%
Copy propagation | Replace all instances of a variable A that has been assigned X (i.e. A = X) with X | 11%
Code motion | Remove code from a loop that computes the same value each iteration of the loop | 16%
Induction variable elimination | Simplify/eliminate array addressing calculations within loops | 2%

Machine-dependent | Depends on machine knowledge |
Strength reduction | Many examples, such as replacing a multiply by a constant with adds and shifts | N.M.
Pipeline scheduling | Reorder instructions to improve pipeline performance | N.M.
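Two of these transformations can be shown by writing the "before" and "after" forms by hand and checking that they agree. This is only an illustration of what a compiler would do automatically; the function names and the choice of the constant 10 are assumptions.

```python
def times_ten_multiply(x):
    """Before strength reduction: one multiply by a constant."""
    return x * 10


def times_ten_shift_add(x):
    """After strength reduction: 10*x = 8*x + 2*x, using shifts and an add."""
    return (x << 3) + (x << 1)


def expr_naive(a, b, c):
    """Before common subexpression elimination: a + b is computed twice."""
    return (a + b) * c + (a + b)


def expr_cse(a, b, c):
    """After common subexpression elimination: a + b is computed once."""
    t = a + b
    return t * c + t


print(times_ten_multiply(7), times_ten_shift_add(7))
print(expr_naive(1, 2, 3), expr_cse(1, 2, 3))
```

Whether the shift-and-add form is actually faster depends on the machine, which is why the table lists strength reduction as machine-dependent.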

Effect of Compiler Optimization

[Figure: measurements by program and compiler optimization level]

Measurements taken on MIPS

Level 0: non-optimized code
Level 1: local optimization
Level 2: global optimization, s/w pipelining
Level 3: adds procedure integration

Compiler Support for Multimedia Instr.

- Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics)
- Intel added a new set of instructions, called Streaming SIMD Extension
- A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer
- Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations
- Strided addressing allows memory access in increments larger than one
- Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data
- Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup
- Such limited support for vector processing makes the use of vectorizing compiler optimization unpopular and restricts its scope to hand-coded routines

SIMD instructions on MMX and AltiVec tend to be solutions, not primitives.
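The difference between strided and gather/scatter addressing is only in how the element addresses are produced. The sketch below models both over a plain list standing in for memory; the function names and the 4x4 row-major layout are assumptions for illustration.

```python
def strided_load(memory, base, stride, count):
    """Load `count` elements starting at `base`, `stride` locations apart."""
    return [memory[base + i * stride] for i in range(count)]


def gather_load(memory, addresses):
    """Load the elements whose addresses are listed in an index vector."""
    return [memory[address] for address in addresses]


def scatter_store(memory, addresses, values):
    """Store each value at the matching address from the index vector."""
    for address, value in zip(addresses, values):
        memory[address] = value


memory = list(range(100, 116))  # 4x4 matrix in row-major order; memory[i] == 100 + i
column = strided_load(memory, base=1, stride=4, count=4)  # second column of the matrix
picked = gather_load(memory, [3, 0, 9])
scatter_store(memory, [3, 0, 9], [-1, -2, -3])
print(column, picked)
```

A column of a row-major matrix is the classic strided case: without a stride, the four elements would need four separate scalar loads.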

Starting a Program

[Figure: C program → Compiler → Assembly language program → Assembler → Object: machine language module, together with Object: library routine (machine language) → Linker → Executable: machine language program → Loader → Memory]

Linker
- Place code & data modules symbolically in memory
- Determine the address of data & instruction labels
- Patch both internal & external references

Object files for Unix typically contain:
- Header: size & position of components
- Text segment: machine code
- Data segment: static and dynamic variables
- Relocation info: identify absolute memory references
- Symbol table: name & location of labels, procedures and variables
- Debugging info: mapping source to object code, break points, etc.

Loading Executable Program

[Figure: memory layout from high to low addresses: stack ($sp at 7fff ffff hex), dynamic data, static data (from 1000 0000 hex, with $gp at 1000 8000 hex), text (pc at 0040 0000 hex), reserved (from address 0)]

To load an executable, the operating system follows these steps:
- Reads the executable file header to determine the size of the text and data segments
- Creates an address space large enough for the text and data
- Copies the instructions and data from the executable file into memory
- Copies the parameters (if any) to the main program onto the stack
- Initializes the machine registers and sets the stack pointer to the first free location
- Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program

Instruction Set Design Issues

Instruction set design issues:
- Number of Addresses
- Flow of Control
- Operand Types
- Addressing Modes
- Instruction Types
- Instruction Formats

Number of Addresses

Four categories:

3-address machines
- 2 for the source operands and one for the result

2-address machines
- One address doubles as source and result

1-address machines
- Accumulator machines
- Accumulator is used for one source and result

0-address machines
- Stack machines
- Operands are taken from the stack
- Result goes onto the stack

um+er of Addresses cont-um+er of Addresses cont-

Three$address machines

To for the source operands one for the result

1ISC processors use three addresses

Sample instructions

add destsrc1src2

M(dest)=[src1]+[src2]

sub destsrc1src2

M(dest)=[src1]-[src2]

mult destsrc1src2

M(dest)=[src1][src2]

Three addresses

Operand 1 Operand 2 Result

Example a = b + c

Three-address instruction formats are not common because they reuire a

relatiely lon instruction format to hold the three address references

Number of Addresses (cont.)

Example

C statement

A = B + C * D - E + F + A

Equivalent code:

mult T,C,D    ; T = C*D
add T,T,B     ; T = B + C*D
sub T,T,E     ; T = B + C*D - E
add T,T,F     ; T = B + C*D - E + F
add A,T,A     ; A = B + C*D - E + F + A

Number of Addresses (cont.)

Two-address machines

One address doubles (for source operand and result)

Last example makes a case for it
- Address T is used twice

Sample instructions

load dest,src    ; M(dest) = [src]
add dest,src     ; M(dest) = [dest] + [src]
sub dest,src     ; M(dest) = [dest] - [src]
mult dest,src    ; M(dest) = [dest] * [src]

Two addresses

One address doubles as operand and result

Example: a = a + b

The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation.

Number of Addresses (cont.)

Example

C statement

A = B + C * D - E + F + A

Equivalent code:

load T,C    ; T = C
mult T,D    ; T = C*D
add T,B     ; T = B + C*D
sub T,E     ; T = B + C*D - E
add T,F     ; T = B + C*D - E + F
add A,T     ; A = B + C*D - E + F + A

Number of Addresses (cont.)

One-address machines

Use a special set of registers called accumulators
- Specify one source operand and receive the result

Called accumulator machines

Sample instructions

load addr     ; accum = [addr]
store addr    ; M[addr] = accum
add addr      ; accum = accum + [addr]
sub addr      ; accum = accum - [addr]
mult addr     ; accum = accum * [addr]

Number of Addresses (cont.)

Example

C statement

A = B + C * D - E + F + A

Equivalent code:

load C     ; load C into accum
mult D     ; accum = C*D
add B      ; accum = C*D + B
sub E      ; accum = B + C*D - E
add F      ; accum = B + C*D - E + F
add A      ; accum = B + C*D - E + F + A
store A    ; store accum contents in A
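The accumulator sequence above can be traced with a small simulation. This is only a minimal sketch in Python, not a model of any real machine: the function name `run_accumulator`, the tuple encoding of instructions, and the sample memory values are all assumptions made for illustration.

```python
# Hypothetical one-address (accumulator) machine; names and values are illustrative.
def run_accumulator(program, memory):
    """Execute (opcode, addr) pairs against a dict-based memory and return it."""
    accum = 0
    for opcode, addr in program:
        if opcode == "load":
            accum = memory[addr]
        elif opcode == "store":
            memory[addr] = accum
        elif opcode == "add":
            accum = accum + memory[addr]
        elif opcode == "sub":
            accum = accum - memory[addr]
        elif opcode == "mult":
            accum = accum * memory[addr]
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return memory

# A = B + C*D - E + F + A, using the seven instructions from the example.
program = [("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
           ("add", "F"), ("add", "A"), ("store", "A")]
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}  # assumed sample values
result = run_accumulator(program, dict(memory))
print(result["A"])
```

Every instruction names a single memory address; the accumulator is the implicit second source and the implicit destination, which is why seven short instructions suffice.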

Number of Addresses (cont.)

Zero-address machines

Stack supplies operands and receives the result
- Special instructions to load and store use an address

Called stack machines (Ex: HP3000, Burroughs B5500)

Sample instructions

push addr    ; push([addr])
pop addr     ; pop([addr])
add          ; push(pop + pop)
sub          ; push(pop - pop)
mult         ; push(pop * pop)

Number of Addresses (cont.)

Example

C statement

A = B + C * D - E + F + A

Equivalent code:

push E
push C
push D
mult
push B
add
sub
push F
add
push A
add
pop A
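The stack sequence above can be checked with a short simulation. This is a hedged sketch in Python rather than a model of any real stack machine: `run_stack`, the tuple encoding, and the sample values are assumptions. It follows the slide's definition `sub = push(pop - pop)`, so the first value popped (the top of the stack) is the left operand.

```python
# Hypothetical zero-address (stack) machine; names and values are illustrative.
def run_stack(program, memory):
    """Execute stack instructions against a dict-based memory and return it."""
    stack = []
    for instr in program:
        opcode = instr[0]
        if opcode == "push":
            stack.append(memory[instr[1]])
        elif opcode == "pop":
            memory[instr[1]] = stack.pop()
        elif opcode in ("add", "sub", "mult"):
            first = stack.pop()    # top of stack
            second = stack.pop()   # item below the top
            if opcode == "add":
                stack.append(first + second)
            elif opcode == "sub":
                stack.append(first - second)
            else:
                stack.append(first * second)
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return memory

# A = B + C*D - E + F + A, in the order given in the example.
program = [("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
           ("push", "B"), ("add",), ("sub",), ("push", "F"), ("add",),
           ("push", "A"), ("add",), ("pop", "A")]
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}  # assumed sample values
result = run_stack(program, dict(memory))
print(result["A"])
```

Pushing E first is what makes the later `sub` compute (B + C*D) - E under this operand order.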

Load/Store Architecture

Instructions expect operands in internal processor registers

Special LOAD and STORE instructions move data between registers and memory

RISC uses this architecture

Reduces instruction length

Load/Store Architecture (cont.)

Sample instructions

load Rd,addr       ; Rd = [addr]
store addr,Rs      ; (addr) = Rs
add Rd,Rs1,Rs2     ; Rd = Rs1 + Rs2
sub Rd,Rs1,Rs2     ; Rd = Rs1 - Rs2
mult Rd,Rs1,Rs2    ; Rd = Rs1 * Rs2

Number of Addresses (cont.)

Example

C statement

A = B + C * D - E + F + A

Equivalent code:

load R1,B
load R2,C
load R3,D
load R4,E
load R5,F
load R6,A
mult R2,R2,R3
add R2,R2,R1
sub R2,R2,R4
add R2,R2,R5
add R2,R2,R6
store A,R2

Flow of Control

Default is sequential flow

Several instructions alter this default execution

Branches
- Unconditional
- Conditional
- Delayed branches

Procedure calls
- Delayed procedure calls

Flow of Control (cont.)

Branches

Unconditional
- Absolute address
- PC-relative
  * Target address is specified relative to PC contents
  * Relocatable code

Example: MIPS
- Absolute address
    j target
- PC-relative
    b target

Flow of Control (cont.)

Branches

Conditional
- Jump is taken only if the condition is met

Two types
- Set-Then-Jump
  * Condition testing is separated from branching
  * Condition code registers are used to convey the condition test result
  * Condition code registers keep a record of the status of the last ALU operation, such as the overflow condition
- Example: Pentium code
    cmp AX,BX    ; compare AX and BX
    je target    ; jump if equal

Flow of Control (cont.)

- Test-and-Jump
  * Single instruction performs condition testing and branching
- Example: MIPS instruction
    beq Rsrc1,Rsrc2,target
  Jumps to target if Rsrc1 = Rsrc2

Delayed branching

Control is transferred after executing the instruction that follows the branch instruction
- This instruction slot is called the delay slot

Improves efficiency

Highly pipelined RISC processors support delayed branching
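The effect of a delay slot can be seen in a toy interpreter. The following is a minimal sketch in Python, assuming an invented three-instruction machine (`addi`, `beq`, `nop`) whose branch targets are program indices; it also assumes no branch is ever placed in a delay slot. None of these names come from a real ISA.

```python
# Toy interpreter contrasting delayed and non-delayed branches (illustrative only).
def run(program, regs, delayed=True):
    pc = 0
    pending_target = None  # target of a taken branch whose delay slot has not run yet
    while pc < len(program):
        instr = program[pc]
        next_pc = pc + 1
        if pending_target is not None:
            # This instruction sits in the delay slot: execute it, then transfer control.
            next_pc = pending_target
            pending_target = None
        if instr[0] == "addi":
            _, reg, imm = instr
            regs[reg] += imm
        elif instr[0] == "beq":
            _, a, b, target = instr
            if regs[a] == regs[b]:
                if delayed:
                    pending_target = target
                else:
                    next_pc = target
        elif instr[0] != "nop":
            raise ValueError(f"unknown opcode: {instr[0]}")
        pc = next_pc
    return regs

program = [
    ("beq", "r1", "r2", 3),  # taken, because r1 == r2
    ("addi", "r3", 1),       # delay slot: runs only under delayed branching
    ("addi", "r3", 10),      # skipped whenever the branch is taken
    ("addi", "r3", 100),     # branch target
]
with_slot = run(program, {"r1": 0, "r2": 0, "r3": 0}, delayed=True)
without_slot = run(program, {"r1": 0, "r2": 0, "r3": 0}, delayed=False)
print(with_slot["r3"], without_slot["r3"])
```

With delayed branching the instruction after the branch always executes, so a compiler can place useful work there instead of letting the pipeline idle.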

Flow of Control (cont.)

Procedure calls

Facilitate modular programming

Require two pieces of information to return
- End of procedure
  * Pentium uses ret instruction
  * MIPS uses jr instruction
- Return address
  * In a (special) register
    MIPS allows any general-purpose register
  * On the stack
    Pentium

Flow of Control (cont.)

[Figure: delay slot]

Parameter Passing

Two basic techniques

Register-based (e.g., PowerPC, MIPS)
- Internal registers are used
  * Faster
  * Limit the number of parameters
  * Recursive procedure

Stack-based (e.g., Pentium)
- Stack is used
  * More general

2 Operand Types

Instructions support basic data types
- Characters
- Integers
- Floating-point

Instruction overload

Same instruction for different data types

Example: Pentium

mov AL,address     ; loads an 8-bit value
mov AX,address     ; loads a 16-bit value
mov EAX,address    ; loads a 32-bit value

Operand Types

Separate instructions

Instructions specify the operand size

Example: MIPS

lb Rdest,address    ; loads a byte
lh Rdest,address    ; loads a halfword (16 bits)
lw Rdest,address    ; loads a word (32 bits)
ld Rdest,address    ; loads a doubleword (64 bits)

Similar instructions for store

3 Addressing Modes

How the operands are specified

Operands can be in three places
- Registers
  * Register addressing mode
- Part of instruction
  * Constant
  * Immediate addressing mode
  * All processors support these two addressing modes
- Memory
  * Difference between RISC and CISC
  * CISC supports a large variety of addressing modes
  * RISC follows load/store architecture

4 Instruction Types

Several types of instructions

Data movement
- Pentium: mov dest,src
- Some do not provide direct data movement instructions
- Indirect data movement
    add Rdest,Rsrc,0    ; Rdest = Rsrc+0

Arithmetic and Logical
- Arithmetic
  * Integer and floating-point, signed and unsigned
  * add, subtract, multiply, divide
- Logical
  * and, or, not, xor

Instruction Types (cont.)

Condition code bits

S: Sign bit (0 = +, 1 = -)
Z: Zero bit (0 = nonzero, 1 = zero)
O: Overflow bit (0 = no overflow, 1 = overflow)
C: Carry bit (0 = no carry, 1 = carry)

Example: Pentium

cmp count,25    ; compare count to 25
                ; subtract 25 from count
je target       ; jump if equal
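How a compare sets the four condition code bits can be sketched in a few lines. The code below is a minimal Python illustration, assuming a 16-bit operand width and the x86 convention that a compare is a subtraction whose result is discarded and whose carry bit reports a borrow; the function name `compare_flags` is invented for this sketch.

```python
# Illustrative flag computation for "cmp a, b" (a - b, result discarded).
def compare_flags(a, b, bits=16):
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "S": int(bool(result & sign)),                  # sign of the truncated result
        "Z": int(result == 0),                          # set when the operands are equal
        "O": int(bool((a ^ b) & (a ^ result) & sign)),  # signed overflow on subtraction
        "C": int(a < b),                                # borrow (x86-style carry on subtract)
    }

def jump_if_equal(a, b):
    """Mirror of 'cmp a,b' followed by 'je target': taken when Z is set."""
    return compare_flags(a, b)["Z"] == 1

print(compare_flags(25, 25))
print(compare_flags(10, 25))
```

The branch instruction never looks at the operands themselves; it only reads the bits the preceding compare left behind, which is the set-then-jump style described earlier.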

Instruction Types (cont.)

Flow control and I/O instructions
- Branch
- Procedure call
- Interrupts

I/O instructions
- Memory-mapped I/O
  * Most processors support memory-mapped I/O
  * No separate instructions for I/O
- Isolated I/O
  * Pentium supports isolated I/O
  * Separate I/O instructions
      in AX,io_port     ; read from an I/O port
      out io_port,AX    ; write to an I/O port

5 Instruction Formats

Two types

Fixed-length
- Used by RISC processors
- 32-bit RISC processors use 32-bit-wide instructions
  * Examples: SPARC, MIPS, PowerPC

Variable-length
- Used by CISC processors
- Memory operands need more bits to specify

Opcode

Major and exact operation
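A fixed-length format can be illustrated by packing fields into one 32-bit word. The sketch below assumes the classic MIPS R-type layout (6-bit opcode, three 5-bit register fields, 5-bit shift amount, 6-bit function code), which is not spelled out on the slide; the helper names `encode_r_type` and `decode_r_type` are invented for illustration.

```python
# Field widths of the assumed MIPS R-type layout, most significant field first.
R_TYPE_FIELDS = [("opcode", 6), ("rs", 5), ("rt", 5), ("rd", 5), ("shamt", 5), ("funct", 6)]

def encode_r_type(rs, rt, rd, shamt, funct, opcode=0):
    """Pack the six fields into a single 32-bit instruction word."""
    values = {"opcode": opcode, "rs": rs, "rt": rt, "rd": rd, "shamt": shamt, "funct": funct}
    word = 0
    for name, width in R_TYPE_FIELDS:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word = (word << width) | value
    return word

def decode_r_type(word):
    """Split a 32-bit word back into its fields."""
    fields = {}
    shift = 32
    for name, width in R_TYPE_FIELDS:
        shift -= width
        fields[name] = (word >> shift) & ((1 << width) - 1)
    return fields

# add $t0, $s1, $s2: rd = 8 ($t0), rs = 17 ($s1), rt = 18 ($s2), funct = 0x20 (add)
word = encode_r_type(rs=17, rt=18, rd=8, shamt=0, funct=0x20)
print(hex(word))
```

Because every field sits at a fixed bit position, the decoder needs only shifts and masks, which is one reason fixed-length formats are easy to decode in hardware.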

Examples of Instruction Formats

RISC (Reduced Instruction Set Computer) versus CISC (Complex Instruction Set Computer)

RISC vs. CISC

The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.

RISC systems access memory only with explicit load and store instructions.

In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.

The difference between CISC and RISC becomes evident through the basic computer performance equation:

time/program = (time/cycle) x (cycles/instruction) x (instructions/program)

RISC systems shorten execution time by reducing the clock cycles per instruction.

CISC systems improve performance by reducing the number of instructions per program.

The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.

The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.

With fixed-length instructions, RISC lends itself to pipelining and speculative execution.

Consider the program fragments:

CISC

mov ax, 10
mov bx, 5
mul bx, ax

RISC

       mov ax, 0
       mov bx, 10
       mov cx, 5
Begin: add ax, bx
       loop Begin

The total clock cycles for the CISC version might be:

(2 movs x 1 cycle) + (1 mul x 30 cycles) = 32 cycles

While the clock cycles for the RISC version is:

(3 movs x 1 cycle) + (5 adds x 1 cycle) + (5 loops x 1 cycle) = 13 cycles

With the RISC clock cycle being shorter, RISC gives us much faster execution speeds.
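The cycle comparison can be reproduced with a few lines of arithmetic. This is a hedged sketch in Python: the per-instruction costs (1 cycle for `mov`, `add` and `loop`, 30 cycles for `mul`) and the operand values 10 and 5 are assumptions used for illustration, and `total_cycles` and `risc_multiply` are invented names.

```python
def total_cycles(counts_and_costs):
    """Sum count * cycles over (instruction count, cycles per instruction) pairs."""
    return sum(count * cycles for count, cycles in counts_and_costs)

def risc_multiply(bx, cx):
    """Multiply by repeated addition, counting executed instructions (assumes cx >= 1)."""
    ax = 0
    executed = 3              # mov ax / mov bx / mov cx
    while True:
        ax += bx              # add ax, bx
        executed += 1
        cx -= 1               # loop Begin: decrement cx, branch back while cx != 0
        executed += 1
        if cx == 0:
            break
    return ax, executed

cisc_cycles = total_cycles([(2, 1), (1, 30)])          # 2 movs + 1 mul
risc_cycles = total_cycles([(3, 1), (5, 1), (5, 1)])   # 3 movs + 5 adds + 5 loops
product, executed = risc_multiply(10, 5)
print(cisc_cycles, risc_cycles, product, executed)
```

The RISC fragment executes more instructions, yet under these assumed costs it needs fewer cycles, which is the trade the performance equation describes.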

Because of their load-store ISAs, RISC architectures require a large number of CPU registers.

These registers provide fast access to data during sequential program execution.

They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.

Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.

This is how registers can be overlapped in a RISC system.

The current window pointer (CWP) points to the active register window.

It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.

Some RISC systems provide more extravagant instruction sets than some CISC systems.

Some systems combine both approaches.

The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.

RISC
- Multiple register sets.
- Three operands per instruction.
- Parameter passing through register windows.
- Single-cycle instructions.
- Hardwired control.
- Highly pipelined.

CISC
- Single register set.
- One or two register operands per instruction.
- Parameter passing through memory.
- Multiple cycle instructions.
- Microprogrammed control.
- Less pipelined.

Continued....

RISC
- Simple instructions, few in number.
- Fixed length instructions.
- Complexity in compiler.
- Only LOAD/STORE instructions access memory.
- Few addressing modes.

CISC
- Many complex instructions.
- Variable length instructions.
- Complexity in microcode.
- Many instructions can access memory.
- Many addressing modes.

Summary

Instruction Set Design Issues

Instruction set design issues include:

Where are operands stored?
- registers, memory, stack, accumulator

How many explicit operands are there?
- 0, 1, 2, or 3

How is the operand location specified?
- register, immediate, indirect, . . .

What type & size of operands are supported?
- byte, int, float, double, string, vector . . .

What operations are supported?
- add, sub, mul, move, compare . . .

More About General Purpose Registers

Why do almost all new architectures use GPRs?

Registers are much faster than memory (even cache)
- Register values are available immediately
- When memory isn't ready, processor must wait ("stall")

Registers are convenient for variable storage
- Compiler assigns some variables just to registers
- More compact code since small fields specify registers (compared to memory addresses)

[Figure: memory hierarchy: processor registers, cache, memory, disk]

What Operations are Needed

Arithmetic & Logical
- Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
- Logical operation: AND, OR, XOR, NOT

Data Transfer - copy, load, store

Control - branch, jump, call, return

Floating Point: ADDF, MULF, DIVF (same as arithmetic but usually take bigger operands)

Decimal - ADDD, CONVERT

String - move, compare, search

Graphics - pixel and vertex, compression/decompression operations

Stacks Architecture Pros and Cons

Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures

Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures

Accumulators Architecture Pros and Cons

Pros
- Very low hardware requirements
- Easy to design and understand

Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic

Memory-Memory Architecture Pros and Cons

Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)

Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required

Memory-Register Architecture Pros and Cons

Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density

Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers

Compact Code and Stack Architectures

When memory is scarce, machines like the Intel 80x86 had variable-length instructions to match varying operand specifications and minimize code size

Stack machines abandoned registers altogether, arguing that it is hard for compilers to use them efficiently

Operands are to be pushed on a stack from memory and the results have to be popped from the stack to memory

Operations take their operands by default from the top of the stack and insert the results back onto the stack

Stack machines simplify compilers and lent themselves to a compact instruction encoding but limit compiler optimization (e.g. in math expressions)

Example: A = B + C

Push AddressC    # Top=Top+4; Stack[Top]=Memory[AddressC]
Push AddressB    # Top=Top+4; Stack[Top]=Memory[AddressB]
add              # Stack[Top-4]=Stack[Top]+Stack[Top-4]; Top=Top-4
Pop AddressA     # Memory[AddressA]=Stack[Top]; Top=Top-4

Compact code is important for heralded network computers where programs must be downloaded over the Internet (e.g. Java-based applications)

Other types of Architecture

High-Level-Language Architecture

• In the 1960s, systems software was rarely written in high-level languages and virtually every commercial operating system before Unix was written in assembly

• Some people blamed the code density on the instruction set rather than the programming language

• A machine design philosophy was advocated with the goal of making the hardware more like high-level languages

• The effectiveness of high-level languages, memory size limitation and lack of efficient compilers doomed this philosophy to a historical footnote

Reduced Instruction Set Architecture

• With the recent development in compiler technology and expanded memory sizes, fewer programmers are using assembly level coding

• Instruction set architecture became measurable in the way compilers, rather than programmers, use them

• RISC architecture favors simplifying hardware design over enriching the offered set of instructions, relying on compilers to effectively use them to perform complex operations

• Virtually all new architectures since 1982 follow the RISC philosophy of fixed instruction lengths, load-store operations, and limited addressing modes

Evolution of Instruction Sets

[Figure: evolution of instruction sets, from top to bottom]
Single Accumulator (EDSAC)
Accumulator + Index Registers (Manchester Mark I)
Separation of Programming Model from Implementation
- High-level Language Based
- Concept of a Family
General Purpose Register Machines
- Complex Instruction Sets (VAX, Intel)
- Load/Store Architecture (CDC, Cray 1)
RISC (SPARC)

Register-Memory Architectures

Number of memory addresses   Max. number of operands   Examples
0                            3                         SPARC, MIPS, PowerPC, ALPHA
1                            2                         Intel 80x86, Motorola 68000
2                            2                         VAX (also has 3 operands format)
3                            3                         VAX (also has 2 operands format)

Effect of the number of memory operands

Interpreting Memory Addressing

The address of a word matches the byte address of one of its 4 bytes

The addresses of sequential words differ by 4 (word size in bytes)

Words' addresses are multiples of 4 (alignment restriction)

Machines that use the address of the leftmost byte as the word address are called "Big Endian" and those that use the rightmost byte are called "Little Endian"

Misalignment complicates memory access and causes programs to run slower (some machines do not allow misaligned memory access at all)

Byte ordering can be a problem when exchanging data among different machines

Byte addresses affect array index calculation to account for word addressing and the offset within the word

Object addressed   Aligned at byte offsets   Misaligned at byte offsets
Byte               0, 1, 2, 3, 4, 5, 6, 7    Never
Half word          0, 2, 4, 6                1, 3, 5, 7
Word               0, 4                      1, 2, 3, 5, 6, 7
Double word        0                         1, 2, 3, 4, 5, 6, 7
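The alignment rule and the two byte orders can be checked directly. The following is a small Python sketch under the assumption of 4-byte words within an 8-byte span; `is_aligned`, `aligned_offsets`, and the sample value `0x0A0B0C0D` are illustrative choices, not anything taken from the slides.

```python
def is_aligned(address, size):
    """An object of `size` bytes is aligned when its address is a multiple of its size."""
    return address % size == 0

def aligned_offsets(size, span=8):
    """Byte offsets within `span` bytes at which an object of `size` bytes is aligned."""
    return [offset for offset in range(span) if is_aligned(offset, size)]

value = 0x0A0B0C0D
big = value.to_bytes(4, "big")        # most significant byte at the lowest address
little = value.to_bytes(4, "little")  # least significant byte at the lowest address

print(aligned_offsets(2), aligned_offsets(4), aligned_offsets(8))
print(big.hex(), little.hex())
```

Reading the big-endian bytes back as little-endian yields a different integer, which is the data-exchange problem between machines that the slide mentions.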

Addressing Modes

Addressing modes refer to how to specify the location of an operand (effective address)

Addressing modes have the ability to:
- Significantly reduce instruction counts
- Increase the average CPI
- Increase the complexity of building a machine

The VAX machine is used for benchmark data since it supports a wide range of memory addressing modes

Famous addressing modes can be classified based on:
- the source of the data into register, immediate or memory
- the address calculation into direct and indirect

An indexed addressing mode is usually provided to allow efficient implementation of loops and array access

Example of Addressing Modes

Register
  Example:   Add R4, R3
  Meaning:   Regs[R4] <- Regs[R4] + Regs[R3]
  When used: When a value is in a register

Immediate
  Example:   Add R4, #3
  Meaning:   Regs[R4] <- Regs[R4] + 3
  When used: For constants

Displacement
  Example:   Add R4, 100(R1)
  Meaning:   Regs[R4] <- Regs[R4] + Mem[100 + Regs[R1]]
  When used: Accessing local variables

Register indirect
  Example:   Add R4, (R1)
  Meaning:   Regs[R4] <- Regs[R4] + Mem[Regs[R1]]
  When used: Accessing using a pointer or a computed address

Indexed
  Example:   Add R4, (R1 + R2)
  Meaning:   Regs[R4] <- Regs[R4] + Mem[Regs[R1] + Regs[R2]]
  When used: Sometimes useful in array addressing: R1 = base of the array; R2 = index amount

Direct or absolute
  Example:   Add R4, (1001)
  Meaning:   Regs[R4] <- Regs[R4] + Mem[1001]
  When used: Sometimes useful for accessing static data; address constant may need to be large

Memory indirect or memory deferred
  Example:   Add R4, @(R3)
  Meaning:   Regs[R4] <- Regs[R4] + Mem[Mem[Regs[R3]]]
  When used: If R3 is the address of the pointer p, then mode yields *p

Autoincrement
  Example:   Add R4, (R2)+
  Meaning:   Regs[R4] <- Regs[R4] + Mem[Regs[R2]]
             Regs[R2] <- Regs[R2] + d
  When used: Useful for stepping through arrays within a loop. R2 points to start of the array; each reference increments R2 by d

Autodecrement
  Example:   Add R4, -(R2)
  Meaning:   Regs[R2] <- Regs[R2] - d
             Regs[R4] <- Regs[R4] + Mem[Regs[R2]]
  When used: Same use as autoincrement. Autodecrement/increment can also act as push/pop to implement a stack

Scaled
  Example:   Add R4, 100(R2)[R3]
  Meaning:   Regs[R4] <- Regs[R4] + Mem[100 + Regs[R2] + Regs[R3] * d]
  When used: Used to index arrays
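The addressing modes listed above differ only in how the operand is located, which a short function can make explicit. This is a minimal Python sketch, assuming a dict-based register file and memory and an element size `d` of 4 bytes; the function `fetch_operand`, its parameter names, and all sample values are invented for illustration.

```python
def fetch_operand(mode, regs, mem, base=None, index=None, const=None, d=4):
    """Return the source operand selected by an addressing mode (may update regs)."""
    if mode == "register":
        return regs[base]
    if mode == "immediate":
        return const
    if mode == "displacement":
        return mem[const + regs[base]]
    if mode == "register_indirect":
        return mem[regs[base]]
    if mode == "indexed":
        return mem[regs[base] + regs[index]]
    if mode == "direct":
        return mem[const]
    if mode == "memory_indirect":
        return mem[mem[regs[base]]]
    if mode == "autoincrement":
        value = mem[regs[base]]
        regs[base] += d            # pointer advances after the access
        return value
    if mode == "autodecrement":
        regs[base] -= d            # pointer moves back before the access
        return mem[regs[base]]
    if mode == "scaled":
        return mem[const + regs[base] + regs[index] * d]
    raise ValueError(f"unknown addressing mode: {mode}")

regs = {"R1": 100, "R2": 8, "R3": 2}                                     # assumed register contents
mem = {8: 77, 11: 55, 100: 11, 108: 22, 116: 66, 200: 33, 1001: 44}      # assumed memory contents

print(fetch_operand("displacement", regs, mem, base="R1", const=100))
print(fetch_operand("scaled", regs, mem, base="R2", index="R3", const=100))
```

Only the two auto modes write back to a register; every other mode is a pure address calculation followed by at most two memory reads.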

Addressing Mode for Signal Processing

DSP offers special addressing modes to better serve popular algorithms

Modulo addressing
- Since DSP deals with continuous data streams, circular buffers are widely used
- Circular or modulo addressing allows automatic increment and decrement and resets the pointer when reaching the end of the buffer

Reverse addressing
- Resulting address is the reverse order of the current address
- Reverse addressing mode expedites the access, which otherwise requires a number of logical instructions or extra memory accesses

Fast Fourier Transform

0 (000) -> 0 (000)
1 (001) -> 4 (100)
2 (010) -> 2 (010)
3 (011) -> 6 (110)
4 (100) -> 1 (001)
5 (101) -> 5 (101)
6 (110) -> 3 (011)
7 (111) -> 7 (111)

Special features require either hand coding or a compiler that uses such features (GNU would not be a good choice)
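Both DSP addressing modes reduce to a few integer operations when written in software. The sketch below is a hedged Python illustration: `bit_reverse` and `modulo_next` are invented helper names, and the buffer start address and length are assumed values. In a DSP these calculations would be done by the address-generation hardware rather than by instructions.

```python
def bit_reverse(index, bits):
    """Reverse the low `bits` bits of index (the access order used by an FFT)."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result

def modulo_next(pointer, step, start, length):
    """Advance a circular-buffer pointer, wrapping inside [start, start + length)."""
    return start + (pointer - start + step) % length

order = [bit_reverse(i, 3) for i in range(8)]
print(order)
print(modulo_next(107, 1, start=100, length=8))   # wraps back to the buffer start
```

Without hardware support, each reversed address costs a loop of shifts and masks, which is the overhead the reverse addressing mode removes.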

Operations of the Computer Hardware

"There must certainly be instructions for performing the fundamental arithmetic operations."

Burks, Goldstine and von Neumann, 1947

Assembly language is a symbolic representation of what the processor actually understands

MIPS assembler allows only one instruction per line and ignores comments following # until end of line

Example:

Translation of a segment of a C program to MIPS assembly instructions

C: f = (g + h) - (i + j)

MIPS:

add t0, g, h     # temp variable t0 contains "g + h"
add t1, i, j     # temp variable t1 contains "i + j"
sub f, t0, t1    # f = t0 - t1 = (g + h) - (i + j)

Operations in the Instruction Set

Operator type            Examples
Arithmetic and logical   Integer arithmetic and logical operations: add, and, subtract, or
Data Transfer            Loads-stores (move instructions on machines with memory addressing)
Control                  Branch, jump, procedure call and return, trap
System                   Operating system call, virtual memory management instructions
Floating point           Floating point instructions: add, multiply
Decimal                  Decimal add, decimal multiply, decimal to character conversion
String                   String move, string compare, string search
Graphics                 Pixel operations, compression/decompression operations

Arithmetic, logical, data transfer and control are almost standard categories for all machines

System instructions are required for multi-programming environments, although support for system functions varies

Decimal and string instructions can be primitives, e.g. IBM 360 and the VAX

Support for floating point, decimal, string and graphics can be optionally provided, sometimes via a co-processor

Some machines rely on the compiler to synthesize special operations such as string handling from simpler instructions

Operations for Media & Signal Process

Single instruction multiple data (SIMD) and vector instructions are often supported in DSPs, which are commonly used in multimedia and signal processing applications

Partitioned Add (integer)
- Perform multiple 16-bit additions on a 64-bit ALU since most data are narrow
- Increases ALU throughput for multimedia applications

Paired single operations (float)
- Allow same register to be acting as two operands to the same operation
- Handy in dealing with vertices and coordinates

Multiply and accumulate
- Very handy for calculating dot products of vectors (signal processing) and matrix multiplication
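Partitioned add and multiply-accumulate can both be emulated with ordinary integer arithmetic. The following is a minimal Python sketch, assuming four 16-bit lanes packed into a 64-bit word and wrap-around (non-saturating) lane arithmetic; `partitioned_add` and `multiply_accumulate` are invented names, and real media instruction sets differ in lane widths and overflow handling.

```python
def partitioned_add(a, b, lane_bits=16, total_bits=64):
    """Add corresponding lanes of two packed words; carries never cross a lane boundary."""
    lane_mask = (1 << lane_bits) - 1
    result = 0
    for shift in range(0, total_bits, lane_bits):
        lane_sum = ((a >> shift) & lane_mask) + ((b >> shift) & lane_mask)
        result |= (lane_sum & lane_mask) << shift   # the carry out of each lane is dropped
    return result

def multiply_accumulate(xs, ys):
    """Dot product expressed as repeated multiply-accumulate steps."""
    acc = 0
    for x, y in zip(xs, ys):
        acc += x * y
    return acc

packed = partitioned_add(0x0001_FFFF_0003_0004, 0x0001_0001_0001_0001)
print(hex(packed))
print(multiply_accumulate([1, 2, 3], [4, 5, 6]))
```

A plain 64-bit add would let the overflow from the `FFFF` lane carry into its neighbour; keeping lanes independent is what makes one wide ALU behave like several narrow ones.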

Frequency of Operations Usage

Make the common case fast by focusing on these operations

The most widely executed instructions are the simple operations of an instruction set

The following is the average usage in SPECint on Intel 80x86

Rank   80x86 instruction        Integer average (% total executed)
1      Load                     22%
2      Conditional branch       20%
3      Compare                  16%
4      Store                    12%
5      Add                      8%
6      And                      6%
7      Sub                      5%
8      Move register-register   4%
9      Call                     1%
10     Return                   1%
       Total                    96%

Control Flow Instructions

Jump for unconditional change in the control flow

Branch for conditional change in the control flow

Procedure calls and returns

Data is based on SPEC on Alpha

Destination Address Definition

Relative addressing w.r.t. the program counter proved to be the best choice for forward and backward branching or jumps (load address independent)

To allow for dynamic loading of library routines, register indirect addressing allows addresses to be loaded in special registers (e.g. virtual functions in C++ and system calls in a case statement)

Data is based on SPEC on Alpha

Condition Evaluation

Compare-and-branch can be efficient if the majority of conditions are comparisons with zero

Remember to focus on the common case

Based on SPEC on MIPS

Frequency of Types of Comparison

Data is based on SPEC on Alpha

Different benchmarks and machines set new design priorities

DSPs support a repeat instruction for for-loops (vectors) using registers

Supporting Procedures

Execution of a procedure follows these steps:

Store parameters in a place accessible to the procedure

Transfer control to the procedure

Acquire the storage resources needed for the procedure

Perform the desired task

Store the result value in a place accessible to the calling program

Return control to the point of origin

The hardware provides a program counter to trace instruction flow and manage transfer of control

Parameter Passing

Registers can be used for passing a small number of parameters

A stack is used to spill registers of the current context and make room for the called procedure to run, and to allow for large parameters to be passed

Storage of machine state can be performed by caller or callee

Handling of shared variables is important to ensure correct semantics and thus requires clear specifications in the library interface

Global variables stored in registers need careful handling

Type and Size of Operands

The type of an operand is designated by encoding it in the instruction's operation code

The type of an operand, e.g. single-precision float, effectively gives its size

Common operand types include character, half-word and word-size integer, and single- and double-precision floating point

Characters are almost always in ASCII, integers are in two's complement, and floating point is in IEEE 754

The 16-bit Unicode used in Java is gaining popularity due to its support for international character sets

For business applications, some architectures support a decimal format in binary coded decimal (BCD)

Depending on the size of the word, the complexity of handling different operand types differs

DSPs offer fixed-point data types to support high-precision floating point arithmetic and to allow sharing a single exponent for multiple numbers

For graphics applications, vertex and pixel operands are added features
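Two of the encodings named above, two's complement and packed BCD, are easy to show concretely. This is a small sketch; the helper names `twos_complement` and `to_bcd` are invented for the example and the bit widths are arbitrary choices.

```python
def twos_complement(value, bits):
    """Bit pattern of a signed integer in a field of the given width."""
    return value & ((1 << bits) - 1)


def to_bcd(n):
    """Pack a non-negative integer into BCD, one decimal digit per 4 bits."""
    if n < 0:
        raise ValueError("this sketch only handles non-negative values")
    result, shift = 0, 0
    while True:
        n, digit = divmod(n, 10)
        result |= digit << shift
        shift += 4
        if n == 0:
            return result


print(hex(twos_complement(-5, 16)))  # 16-bit pattern of -5
print(hex(to_bcd(1995)))             # each hex digit is one decimal digit
```

Reading the BCD result in hexadecimal gives back the decimal digits directly, which is why business-oriented architectures found the format convenient.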

Size of Operands

Double-word data type is used for double-precision floating point operations and for address storage in machines with a 64-bit wide address bus

Words are used for integer operations and for 32-bit address bus machines

Because of the mix in SPEC, word and double-word data types dominate

Frequency of reference by size based on SPEC2000 on Alpha

Instruction Representation

Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary, or base 2)

Numbers are stored in computers as a series of high and low electronic signals (binary numbers)

Binary digits are called bits and are considered the atom of computing

Each piece of an instruction is a number, and placing these numbers together forms the instruction

The assembler translates the symbolic assembly instructions into machine language instructions (machine code)

Example:

Assembly: add $t0, $s1, $s2

M/C language (decimal): 0 17 18 8 0 32

M/C language (binary): 000000 10001 10010 01000 00000 100000 (fields of 6, 5, 5, 5, 5 and 6 bits)

Note: the MIPS compiler by default maps $s0..$s7 to registers 16-23 and $t0..$t7 to registers 8-15
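Placing the field numbers together can be done with shifts and ORs. The sketch below assumes the standard MIPS R-format field widths (6, 5, 5, 5, 5, 6 bits) and the usual register numbering ($t0 = 8, $s1 = 17, $s2 = 18); `encode_r` is a name made up for this example.

```python
def encode_r(op, rs, rt, rd, shamt, funct):
    """Assemble a 32-bit R-format word from its six fields."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


# add $t0, $s1, $s2  ->  op=0, rs=$s1=17, rt=$s2=18, rd=$t0=8, shamt=0, funct=32
word = encode_r(0, 17, 18, 8, 0, 32)
print(format(word, "032b"))
print(hex(word))
```

Note that the destination register comes first in the assembly text but sits in the third register field (rd) of the machine word.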

Encoding an Instruction Set

Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation

The operation is typically specified in one field called the opcode

The addressing mode for the operand can be encoded with the operation, or specified through a separate identifier when a large number of modes is supported

The architecture must balance between several competing factors:

Desire to support as many registers and addressing modes as possible

Effect of operand specification on the size of the instruction (program)

Desire to simplify instruction fetching and decoding during execution

Fixed-size instruction encoding simplifies the CPU design while limiting the addressing modes supported

An architect caring about code size can use variable-size encoding

A hybrid approach is to allow variability by supporting multiple-sized instructions

Encoding Examples

MIPS Instruction Format

Register-format instructions:

op (6 bits) | rs (5 bits) | rt (5 bits) | rd (5 bits) | shamt (5 bits) | funct (6 bits)

op: Basic operation of the instruction, traditionally called opcode

rs: The first register source operand

rt: The second register source operand

rd: The register destination operand; it gets the result of the operation

shamt: Shift amount

funct: This field selects the specific variant of the operation of the op field

Immediate-type instructions:

op (6 bits) | rs (5 bits) | rt (5 bits) | address (16 bits)

Some instructions need longer fields than provided, for large-value constants

The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register

Example: lw $t0, 32($s3)   # Temporary register $t0 gets A[8]

Instruction  Format  op  rs   rt   rd   shamt  funct  address
add          R       0   reg  reg  reg  0      32     N/A
sub          R       0   reg  reg  reg  0      34     N/A
lw           I       35  reg  reg  N/A  N/A    N/A    address
sw           I       43  reg  reg  N/A  N/A    N/A    address
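The immediate format and its address range can be sketched the same way. This assumes the standard MIPS I-format layout (6-bit op, two 5-bit registers, 16-bit signed address field) and register numbers $t0 = 8 and $s3 = 19; `encode_i` is a hypothetical helper, not part of any toolchain.

```python
def encode_i(op, rs, rt, imm):
    """Assemble a 32-bit I-format word; imm must fit in 16 signed bits."""
    if not -(1 << 15) <= imm < (1 << 15):
        raise ValueError("offset outside the 16-bit signed range")
    return (op << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


# lw $t0, 32($s3)  ->  op=35, rs=$s3=19 (base), rt=$t0=8 (destination), offset=32
lw_word = encode_i(35, 19, 8, 32)
print(hex(lw_word))

# An offset of 40000 bytes does not fit in the address field.
try:
    encode_i(35, 19, 8, 40000)
except ValueError as err:
    print("rejected:", err)
```

The range check is the practical meaning of the 16-bit address field: anything farther from the base register than that needs extra instructions to build the address.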

The Stored Program Concept

Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept

Today's computers are built on two key principles:

Instructions are represented as numbers

Programs can be stored in memory to be read or written just like numbers

The power of the concept: memory can contain

the source code for an editor

the compiled m/c code for the editor

the text that the compiled program is using

the compiler that generated the code

[Figure: a processor connected to a memory holding an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and the source code in C for the editor program]

Compiling if-then-else in MIPS

Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiler's code for the following C if statement?

if (i == j) f = g + h; else f = g - h;

[Flowchart: test i == j; when i = j, f = g + h; when i ≠ j (Else), f = g - h; both paths join at Exit]

MIPS:

bne $s3, $s4, Else       # go to Else if i ≠ j

add $s0, $s1, $s2        # f = g + h (skipped if i ≠ j)

j Exit                   # go to Exit

Else: sub $s0, $s1, $s2  # f = g - h (skipped if i = j)

Exit:

Typical Compilation

Major Types of Optimization

Each entry gives the optimization name, its explanation, and its frequency (N.M. = not measured).

High-level: at or near source level; machine independent

Procedure integration: Replace procedure call by procedure body (N.M.)

Local: within straight-line code

Common subexpression elimination: Replace two instances of the same computation by a single copy (18%)

Constant propagation: Replace all instances of a variable that is assigned a constant with the constant (22%)

Stack height reduction: Rearrange expression tree to minimize resources needed for expression evaluation (N.M.)

Global: across a branch

Global common subexpression elimination: Same as local, but this version crosses branches (13%)

Copy propagation: Replace all instances of a variable A that has been assigned X (i.e., A = X) with X (11%)

Code motion: Remove code from a loop that computes the same value each iteration of the loop (16%)

Induction variable elimination: Simplify/eliminate array-addressing calculations within loops (2%)

Machine-dependent: depends on machine knowledge

Strength reduction: Many examples, such as replacing a multiply by a constant with adds and shifts (N.M.)

Pipeline scheduling: Reorder instructions to improve pipeline performance (N.M.)
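Two of the optimizations listed above, strength reduction and code motion, can be illustrated at source level. This is only a sketch of the idea; a real compiler applies these transformations to its intermediate code, and the function names here are invented for the example.

```python
def times_10_multiply(x):
    """Original form: a multiply by a constant."""
    return x * 10


def times_10_shift_add(x):
    """Strength-reduced form: 10*x = 8*x + 2*x, using shifts and one add."""
    return (x << 3) + (x << 1)


def scaled_sum_naive(values, a, b):
    """a * b is recomputed on every iteration although it never changes."""
    total = 0
    for v in values:
        total += v * (a * b)
    return total


def scaled_sum_hoisted(values, a, b):
    """Code motion: the loop-invariant product is computed once, before the loop."""
    scale = a * b
    total = 0
    for v in values:
        total += v * scale
    return total


print(times_10_multiply(7), times_10_shift_add(7))
print(scaled_sum_naive([1, 2, 3], 4, 5), scaled_sum_hoisted([1, 2, 3], 4, 5))
```

Both rewritten versions compute the same results as the originals; the gain is that the work left inside the hot path is cheaper.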

Effect of Compiler Optimization

Measurements taken on S

[Figure: results by program and compiler optimization level]

Level 0: non-optimized code

Level 1: local optimization

Level 2: global optimization, s/w pipelining

Level 3: adds procedure integration

Compiler Support for Multimedia Instr.

Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics)

Intel added a new set of instructions called Streaming SIMD Extension

A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer

Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations

Strided addressing allows memory access in increments larger than one

Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data

Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup

Such limited support for vector processing makes the use of vectorizing compiler optimization unpopular and restricts its scope to hand-coded routines

SIMD instructions on MMX and AltiVec tend to be solutions, not primitives
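The two vector addressing styles mentioned above can be modelled on a flat list standing in for memory. This is a sketch under that assumption; `strided_load`, `gather` and `scatter` are illustrative names, not instructions of any particular machine.

```python
def strided_load(memory, base, stride, count):
    """Load count elements starting at base, stepping by stride each time."""
    return [memory[base + i * stride] for i in range(count)]


def gather(memory, indices):
    """Load the elements whose addresses are listed in an index vector."""
    return [memory[i] for i in indices]


def scatter(memory, indices, values):
    """Store each value at the address given by the matching index."""
    for i, v in zip(indices, values):
        memory[i] = v


# A 4x4 matrix stored row by row: one column is a stride-4 access.
matrix = list(range(100, 116))
print(strided_load(matrix, 1, 4, 4))   # column 1
print(gather(matrix, [0, 5, 10, 15]))  # the diagonal, via an index vector
```

Without a strided load, fetching a column means four separate scalar loads followed by packing, which is the kind of overhead that limits the speedup of short-vector extensions.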

Starting a Program

[Figure: C program -> Compiler -> Assembly language program -> Assembler -> Object: machine language module, together with Object: library routine (machine language) -> Linker -> Executable: machine language program -> Loader -> Memory]

Linker

Place code & data modules symbolically in memory

Determine the address of data & instruction labels

Patch both internal & external references

Object files for Unix typically contain:

Header: size & position of components

Text segment: machine code

Data segment: static and dynamic variables

Relocation info: identify absolute memory references

Symbol table: name & location of labels, procedures and variables

Debugging info: mapping source to object code, break points, etc.

Loading Executable Program

Memory layout (top to bottom):

$sp -> 7fff ffff hex   Stack
                       Dynamic data
$gp -> 1000 8000 hex   Static data (starting at 1000 0000 hex)
pc  -> 0040 0000 hex   Text
       0               Reserved

To load an executable, the operating system follows these steps:

Reads the executable file header to determine the size of the text and data segments

Creates an address space large enough for the text and data

Copies the instructions and data from the executable file into memory

Copies the parameters (if any) to the main program onto the stack

Initializes the machine registers and sets the stack pointer to the first free location

Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program

Instruction Set Design Issues

Number of Addresses

Flow of Control

Operand Types

Addressing Modes

Instruction Types

Instruction Formats

Number of Addresses

Four categories

3-address machines

- 2 for the source operands and one for the result

2-address machines

- One address doubles as source and result

1-address machines

- Accumulator machines

- Accumulator is used for one source and result

0-address machines

- Stack machines

- Operands are taken from the stack

- Result goes onto the stack

Number of Addresses (cont.)

Three-address machines

Two for the source operands, one for the result

RISC processors use three addresses

Sample instructions

add dest,src1,src2   ; M(dest)=[src1]+[src2]

sub dest,src1,src2   ; M(dest)=[src1]-[src2]

mult dest,src1,src2  ; M(dest)=[src1]*[src2]

Three addresses: Operand 1, Operand 2, Result

Example: a = b + c

Three-address instruction formats are not common, because they require a relatively long instruction format to hold the three address references

Number of Addresses (cont.)

Example

C statement

A = B + C * D - E + F + A

Equivalent code:

mult T,C,D  ; T = C*D

add T,T,B   ; T = B+C*D

sub T,T,E   ; T = B+C*D-E

add T,T,F   ; T = B+C*D-E+F

add A,T,A   ; A = B+C*D-E+F+A

Number of Addresses (cont.)

Two-address machines

One address doubles (for source operand & result)

Last example makes a case for it

- Address T is used twice

Sample instructions

load dest,src  ; M(dest)=[src]

add dest,src   ; M(dest)=[dest]+[src]

sub dest,src   ; M(dest)=[dest]-[src]

mult dest,src  ; M(dest)=[dest]*[src]

Two addresses: one address doubles as operand and result

Example: a = a + b

The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation

Number of Addresses (cont.)

Example

C statement

A = B + C * D - E + F + A

Equivalent code:

load T,C  ; T = C

mult T,D  ; T = C*D

add T,B   ; T = B+C*D

sub T,E   ; T = B+C*D-E

add T,F   ; T = B+C*D-E+F

add A,T   ; A = B+C*D-E+F+A

Number of Addresses (cont.)

One-address machines

Use a special set of registers called accumulators

- Specify one source operand & receive the result

Called accumulator machines

Sample instructions

load addr   ; accum = [addr]

store addr  ; M[addr] = accum

add addr    ; accum = accum + [addr]

sub addr    ; accum = accum - [addr]

mult addr   ; accum = accum * [addr]

Number of Addresses (cont.)

Example

C statement

A = B + C * D - E + F + A

Equivalent code:

load C   ; load C into accum

mult D   ; accum = C*D

add B    ; accum = C*D+B

sub E    ; accum = B+C*D-E

add F    ; accum = B+C*D-E+F

add A    ; accum = B+C*D-E+F+A

store A  ; store accum contents in A
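The one-address sequence above can be checked with a tiny interpreter. This is a sketch of an accumulator machine as described on these slides, with memory modelled as a dictionary; the initial variable values are arbitrary and `run_accumulator` is an invented name.

```python
def run_accumulator(program, memory):
    """Execute (opcode, address) pairs against a single accumulator."""
    acc = 0
    for op, addr in program:
        if op == "load":
            acc = memory[addr]
        elif op == "store":
            memory[addr] = acc
        elif op == "add":
            acc += memory[addr]
        elif op == "sub":
            acc -= memory[addr]
        elif op == "mult":
            acc *= memory[addr]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return memory


memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}  # assumed test values
program = [("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
           ("add", "F"), ("add", "A"), ("store", "A")]
run_accumulator(program, memory)
print(memory["A"])  # compare with B + C*D - E + F + A using the original A
```

Every instruction names at most one memory address because the accumulator is the implicit second operand and the implicit destination.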

Number of Addresses (cont.)

Zero-address machines

Stack supplies operands and receives the result

- Special instructions to load and store use an address

Called stack machines (Ex: HP3000, Burroughs B5500)

Sample instructions

push addr  ; push([addr])

pop addr   ; pop([addr])

add        ; push(pop + pop)

sub        ; push(pop - pop)

mult       ; push(pop * pop)

Number of Addresses (cont.)

Example

C statement

A = B + C * D - E + F + A

Equivalent code:

push E

push C

push D

mult

push B

add

sub

push F

add

push A

add

pop A
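The zero-address sequence can be traced with a small stack interpreter. The sketch follows the sample instruction definitions given earlier, where `sub` computes (first value popped) minus (second value popped); the variable values and the name `run_stack` are assumptions for illustration.

```python
def run_stack(program, memory):
    """Execute a zero-address program; only push and pop carry an address."""
    stack = []
    for instr in program:
        op = instr[0]
        if op == "push":
            stack.append(memory[instr[1]])
        elif op == "pop":
            memory[instr[1]] = stack.pop()
        elif op in ("add", "sub", "mult"):
            first = stack.pop()   # top of stack
            second = stack.pop()  # element below it
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)
            else:
                stack.append(first * second)
        else:
            raise ValueError(f"unknown opcode: {op}")
    return memory


memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}  # assumed test values
program = [("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
           ("push", "B"), ("add",), ("sub",), ("push", "F"), ("add",),
           ("push", "A"), ("add",), ("pop", "A")]
run_stack(program, memory)
print(memory["A"])
```

This also explains why E is pushed first: it must sit below B + C*D when `sub` executes so that the subtraction comes out as (B + C*D) - E.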

Load/Store Architecture

Instructions expect operands in internal processor registers

Special LOAD and STORE instructions move data between registers and memory

RISC uses this architecture

Reduces instruction length

Load/Store Architecture (cont.)

Sample instructions

load Rd,addr     ; Rd = [addr]

store addr,Rs    ; (addr) = Rs

add Rd,Rs1,Rs2   ; Rd = Rs1 + Rs2

sub Rd,Rs1,Rs2   ; Rd = Rs1 - Rs2

mult Rd,Rs1,Rs2  ; Rd = Rs1 * Rs2

Number of Addresses (cont.)

Example

C statement

A = B + C * D - E + F + A

Equivalent code:

load R1,B

load R2,C

load R3,D

load R4,E

load R5,F

load R6,A

mult R2,R2,R3

add R2,R2,R1

sub R2,R2,R4

add R2,R2,R5

add R2,R2,R6

store A,R2

Flow of Control

Default is sequential flow

Several instructions alter this default execution

Branches

- Unconditional

- Conditional

- Delayed branches

Procedure calls

- Delayed procedure calls

Flow of Control (cont.)

Branches

Unconditional

- Absolute address

- PC-relative

  Target address is specified relative to PC contents

  Relocatable code

Example: MIPS

- Absolute address

j target

- PC-relative

b target

Flow of Control (cont.)

Branches

Conditional

- Jump is taken only if the condition is met

Two types

- Set-Then-Jump

  Condition testing is separated from branching

  Condition code registers are used to convey the condition test result

  Condition code registers keep a record of the status of the last ALU operation, such as an overflow condition

- Example: Pentium code

cmp AX,BX  ; compare AX and BX

je target  ; jump if equal

Flow of Control (cont.)

- Test-and-Jump

  Single instruction performs condition testing and branching

- Example: MIPS instruction

beq Rsrc1,Rsrc2,target

Jumps to target if Rsrc1 = Rsrc2

Delayed branching

Control is transferred after executing the instruction that follows the branch instruction

- This instruction slot is called the delay slot

Improves efficiency

Highly pipelined RISC processors support it

Flow of Control (cont.)

Procedure calls

Facilitate modular programming

Require two pieces of information to return

- End of procedure

  Pentium uses the ret instruction

  MIPS uses the jr instruction

- Return address

  In a (special) register: MIPS allows any general-purpose register

  On the stack: Pentium

Flow of Control (cont.)

[Figure: delay slot]

Parameter Passing

Two basic techniques

Register-based (e.g., PowerPC, MIPS)

- Internal registers are used

  Faster

  Limit the number of parameters

  Recursive procedure

Stack-based (e.g., Pentium)

- Stack is used

  More general

2 Operand Types

Instructions support basic data types

Characters

Integers

Floating-point

Instruction overload

Same instruction for different data types

Example: Pentium

mov AL,address   ; loads an 8-bit value

mov AX,address   ; loads a 16-bit value

mov EAX,address  ; loads a 32-bit value

Operand Types (cont.)

Separate instructions

Instructions specify the operand size

Example: MIPS

lb Rdest,address  ; loads a byte

lh Rdest,address  ; loads a halfword (16 bits)

lw Rdest,address  ; loads a word (32 bits)

ld Rdest,address  ; loads a doubleword (64 bits)

Similar instructions for store

3 Addressing Modes

How the operands are specified

Operands can be in three places

- Registers

  Register addressing mode

- Part of instruction

  Constant

  Immediate addressing mode

  All processors support these two addressing modes

- Memory

  Difference between RISC and CISC

  CISC supports a large variety of addressing modes

  RISC follows load/store architecture

4 Instruction Types

Several types of instructions

Data movement

- Pentium: mov dest,src

- Some do not provide direct data movement instructions

- Indirect data movement

add Rdest,Rsrc,0  ; Rdest = Rsrc+0

Arithmetic and Logical

- Arithmetic

  Integer and floating-point, signed and unsigned

  add, subtract, multiply, divide

- Logical

  and, or, not, xor

Instruction Types (cont.)

Condition code bits

S: Sign bit (0 = +, 1 = -)

Z: Zero bit (0 = nonzero, 1 = zero)

O: Overflow bit (0 = no overflow, 1 = overflow)

C: Carry bit (0 = no carry, 1 = carry)

Example: Pentium

cmp count,25  ; compare count to 25
              ; subtract 25 from count

je target     ; jump if equal
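How a compare sets the four condition code bits can be sketched as a subtraction whose result is discarded. The example assumes 32-bit operands and the x86 convention that the carry bit after a subtraction signals a borrow; `compare_flags` is a name invented for this illustration.

```python
def compare_flags(a, b, bits=32):
    """Flags produced by computing a - b in a fixed-width register."""
    mask = (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "Z": int(result == 0),                  # zero: the operands were equal
        "S": int(bool(result & sign_bit)),      # sign of the truncated result
        "C": int(a < b),                        # borrow, treating both as unsigned
        # signed overflow: operand signs differ and the result's sign differs from a's
        "O": int(bool((a ^ b) & (a ^ result) & sign_bit)),
    }


print(compare_flags(25, 25))        # equal: a jump-if-equal would be taken
print(compare_flags(3, 25))         # smaller: negative result with a borrow
print(compare_flags(-2**31, 1))     # most negative value minus one overflows
```

A following conditional jump only inspects these bits; a jump-if-equal, for instance, tests Z alone.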

Instruction Types (cont.)

Flow control and I/O instructions

- Branch

- Procedure call

- Interrupts

I/O instructions

- Memory-mapped I/O

  Most processors support memory-mapped I/O

  No separate instructions for I/O

- Isolated I/O

  Pentium supports isolated I/O

  Separate I/O instructions

in AX,io_port   ; read from an I/O port

out io_port,AX  ; write to an I/O port

5 Instruction Formats

Two types

Fixed-length

- Used by RISC processors

- 32-bit RISC processors use 32-bit wide instructions

  Examples: SPARC, MIPS, PowerPC

Variable-length

- Used by CISC processors

- Memory operands need more bits to specify

Opcode

Major and exact operation

Examples of Instruction Formats

RISC (Reduced Instruction Set Computer) versus CISC (Complex Instruction Set Computer)

RISC vs CISC

The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute

RISC systems access memory only with explicit load and store instructions

In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable

The difference between CISC and RISC becomes evident through the basic computer performance equation:

time/program = time/cycle × cycles/instruction × instructions/program

RISC systems shorten execution time by reducing the clock cycles per instruction

CISC systems improve performance by reducing the number of instructions per program

The simple instruction set of RISC machines enables control units to be hardwired for maximum speed

The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time

With fixed-length instructions, RISC lends itself to pipelining and speculative execution

Consider the two program fragments:

CISC:

mov ax, 10

mov bx, 5

mul bx, ax

RISC:

mov ax, 0

mov bx, 10

mov cx, 5

Begin: add ax, bx

loop Begin

The total clock cycles for the CISC version might be: (2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles

While the clock cycles for the RISC version are: (3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles

With the RISC clock cycle being shorter, RISC gives us much faster execution speeds
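The cycle totals in this comparison are just sums of instruction count times cycles per instruction. The sketch below works the arithmetic under assumed figures: single-cycle moves, adds and loop instructions, a 30-cycle multiply, and a loop that runs 5 times. These numbers are assumptions for illustration, as is the function name `total_cycles`.

```python
def total_cycles(mix):
    """Sum count * cycles over a list of (count, cycles_per_instruction) pairs."""
    return sum(count * cycles for count, cycles in mix)


def execution_time(instructions, cycles_per_instruction, seconds_per_cycle):
    """time/program = instructions/program * cycles/instruction * time/cycle."""
    return instructions * cycles_per_instruction * seconds_per_cycle


# Assumed CISC mix: 2 moves at 1 cycle each, 1 multiply at 30 cycles.
cisc = total_cycles([(2, 1), (1, 30)])
# Assumed RISC mix: 3 moves, 5 adds and 5 loop instructions, 1 cycle each.
risc = total_cycles([(3, 1), (5, 1), (5, 1)])
print(cisc, risc)
```

The RISC fragment executes more instructions (13 against 3) yet needs fewer cycles, which is the trade the performance equation describes: more instructions per program in exchange for fewer cycles per instruction.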

Because of their load-store ISAs, RISC architectures require a large number of CPU registers

These registers provide fast access to data during sequential program execution

They can also be employed to reduce the overhead typically caused by passing parameters to subprograms

Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers

This is how registers can be overlapped in a RISC system

The current window pointer (CWP) points to the active register window

[Figure: overlapping register windows]

It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures

Some RISC systems provide more extravagant instruction sets than some CISC systems

Some systems combine both approaches

The following two slides summarize the characteristics that traditionally typify the differences between these two architectures

RISC

Multiple register sets.

Three operands per instruction.

Parameter passing through register windows.

Single-cycle instructions.

Hardwired control.

Highly pipelined.

CISC

Single register set.

One or two register operands per instruction.

Parameter passing through memory.

Multiple-cycle instructions.

Microprogrammed control.

Less pipelined.

Continued....

RISC

Simple instructions, few in number.

Fixed-length instructions.

Complexity in compiler.

Only LOAD/STORE instructions access memory.

Few addressing modes.

CISC

Many complex instructions.

Variable-length instructions.

Complexity in microcode.

Many instructions can access memory.

Many addressing modes.

Summary

Instruction Set Design Issues

Instruction set design issues include:

Where are operands stored?

- registers, memory, stack, accumulator

How many explicit operands are there?

- 0, 1, 2, or 3

How is the operand location specified?

- register, immediate, indirect, . . .

What type & size of operands are supported?

- byte, int, float, double, string, vector, . . .

What operations are supported?

- add, sub, mul, move, compare, . . .

More About General Purpose Registers

Why do almost all new architectures use GPRs?

Registers are much faster than memory (even cache)
- Register values are available immediately
- When memory isn't ready, the processor must wait ("stall")

Registers are convenient for variable storage
- Compiler assigns some variables just to registers
- More compact code, since small fields specify registers (compared to memory addresses)

[Figure: storage hierarchy: Processor (Registers), Cache, Memory, Disk]

What Operations are Needed

Arithmetic and Logical
- Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
- Logical operation: AND, OR, XOR, NOT

Data Transfer: copy, load, store

Control: branch, jump, call, return

Floating Point: ADDF, MULF, DIVF
- Same as arithmetic, but usually take bigger operands

Decimal: ADDD, CONVERT

String: move, compare, search

Graphics: pixel and vertex, compression/decompression operations

Stack Architecture: Pros and Cons

Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures

Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures

Accumulator Architecture: Pros and Cons

Pros
- Very low hardware requirements
- Easy to design and understand

Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic

Memory-Memory Architecture: Pros and Cons

Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)

Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required

Memory-Register Architecture: Pros and Cons

Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density

Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers

Other Types of Architecture

High-Level-Language Architecture
- In the 1960s, systems software was rarely written in high-level languages, and virtually every commercial operating system before Unix was written in assembly.
- Some people blamed the code density on the instruction set rather than the programming language.
- A machine design philosophy was advocated with the goal of making the hardware more like high-level languages.
- The effectiveness of high-level languages, memory size limitations, and the lack of efficient compilers doomed this philosophy to a historical footnote.

Reduced Instruction Set Architecture
- With the recent development in compiler technology and expanded memory sizes, fewer programmers are using assembly-level coding.
- Instruction set architecture became measurable in the way compilers, rather than programmers, use them.
- RISC architecture favors simplifying hardware design over enriching the offered set of instructions, relying on compilers to effectively use them to perform complex operations.
- Virtually all new architectures since 1982 follow the RISC philosophy of fixed instruction lengths, load-store operations, and limited addressing modes.

Evolution of Instruction Sets

[Figure: evolution of instruction sets, top to bottom:
Single Accumulator (EDSAC 1950);
Accumulator + Index Registers (Manchester Mark I, IBM 700 series 1953);
Separation of Programming Model from Implementation: High-Level Language Based (B5000 1963) and Concept of a Family (IBM 360 1964);
General Purpose Register Machines: Complex Instruction Sets (VAX, Intel 432 1977-80) and Load/Store Architecture (CDC 6600, Cray 1 1963-76);
RISC (MIPS, SPARC, HP-PA, IBM RS6000, ... 1987)]

Register-Memory Architectures

Effect of the number of memory operands:

Memory addresses   Max. number of operands   Examples
0                  3                         SPARC, MIPS, PowerPC, Alpha
1                  2                         Intel 80x86, Motorola 68000
2                  2                         VAX (also has 3-operand format)
3                  3                         VAX (also has 2-operand format)

Memory Addressing

Interpreting Memory Addressing
- The address of a word matches the byte address of one of its 4 bytes.
- The addresses of sequential words differ by 4 (word size in bytes).
- Word addresses are multiples of 4 (alignment restriction).
- Machines that use the address of the leftmost byte as the word address are called "Big Endian", and those that use the rightmost byte are called "Little Endian".
- Misalignment complicates memory access and causes programs to run slower (some machines do not allow misaligned memory access at all).
- Byte ordering can be a problem when exchanging data among different machines.
- Byte addresses affect array index calculation, which must account for word addressing and the offset within the word.

Object addressed   Aligned at byte offsets   Misaligned at byte offsets
Byte               0, 1, 2, 3, 4, 5, 6, 7    Never
Half word          0, 2, 4, 6                1, 3, 5, 7
Word               0, 4                      1, 2, 3, 5, 6, 7
Double word        0                         1, 2, 3, 4, 5, 6, 7
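The byte-ordering and alignment rules above can be made concrete with a small sketch. It assumes a 32-bit word and uses Python's standard struct module; the helper names word_bytes and is_aligned are illustrative only and do not come from the slides.

```python
import struct

def word_bytes(value, byte_order):
    """Return the 4 bytes of a 32-bit word as laid out in memory, lowest address first."""
    fmt = ">I" if byte_order == "big" else "<I"
    return struct.pack(fmt, value)

def is_aligned(address, size):
    """An object of `size` bytes is aligned when its address is a multiple of its size."""
    return address % size == 0

word = 0x0A0B0C0D
big = word_bytes(word, "big")        # most significant byte at the lowest address
little = word_bytes(word, "little")  # least significant byte at the lowest address

print(big.hex(), little.hex())
print(is_aligned(8, 4), is_aligned(6, 4), is_aligned(6, 2))
```

The same value produces two different byte sequences, which is why data exchanged between machines needs an agreed byte order.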

Addressing Modes

- Addressing modes refer to how to specify the location of an operand (effective address).
- Addressing modes have the ability to:
  - Significantly reduce instruction counts
  - Increase the average CPI
  - Increase the complexity of building a machine
- The VAX machine is used for benchmark data since it supports a wide range of memory addressing modes.
- Well-known addressing modes can be classified based on:
  - the source of the data: register, immediate, or memory
  - the address calculation: direct or indirect
- An indexed addressing mode is usually provided to allow efficient implementation of loops and array access.

Example of Addressing Modes

Register
  Example:   Add R4, R3
  Meaning:   Regs[R4] ← Regs[R4] + Regs[R3]
  When used: When a value is in a register

Immediate
  Example:   Add R4, #3
  Meaning:   Regs[R4] ← Regs[R4] + 3
  When used: For constants

Displacement
  Example:   Add R4, 100(R1)
  Meaning:   Regs[R4] ← Regs[R4] + Mem[100 + Regs[R1]]
  When used: Accessing local variables

Register indirect
  Example:   Add R4, (R1)
  Meaning:   Regs[R4] ← Regs[R4] + Mem[Regs[R1]]
  When used: Accessing using a pointer or a computed address

Indexed
  Example:   Add R4, (R1 + R2)
  Meaning:   Regs[R4] ← Regs[R4] + Mem[Regs[R1] + Regs[R2]]
  When used: Sometimes useful in array addressing: R1 = base of the array; R2 = index amount

Direct or absolute
  Example:   Add R4, (1001)
  Meaning:   Regs[R4] ← Regs[R4] + Mem[1001]
  When used: Sometimes useful for accessing static data; address constant may need to be large

Memory indirect or memory deferred
  Example:   Add R4, @(R3)
  Meaning:   Regs[R4] ← Regs[R4] + Mem[Mem[Regs[R3]]]
  When used: If R3 is the address of the pointer p, then this mode yields *p

Autoincrement
  Example:   Add R4, (R2)+
  Meaning:   Regs[R4] ← Regs[R4] + Mem[Regs[R2]]; Regs[R2] ← Regs[R2] + d
  When used: Useful for stepping through arrays within a loop. R2 points to the start of the array; each reference increments R2 by d

Autodecrement
  Example:   Add R4, -(R2)
  Meaning:   Regs[R2] ← Regs[R2] - d; Regs[R4] ← Regs[R4] + Mem[Regs[R2]]
  When used: Same use as autoincrement. Autodecrement/increment can also act as push/pop to implement a stack

Scaled
  Example:   Add R4, 100(R2)[R3]
  Meaning:   Regs[R4] ← Regs[R4] + Mem[100 + Regs[R2] + Regs[R3] × d]
  When used: Used to index arrays
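As an illustration of how each mode locates its operand, here is a minimal sketch of an operand fetcher. It is only a model under stated assumptions: registers and memory are plain dictionaries, and the function name fetch_operand and its keyword arguments (r, imm, disp, base, index, addr, d) are hypothetical names introduced for this example.

```python
def fetch_operand(mode, regs, mem, **f):
    """Return the source operand selected by an addressing mode.

    regs maps register names to values; mem maps addresses to values.
    Autoincrement and autodecrement also update the base register.
    """
    if mode == "register":
        return regs[f["r"]]
    if mode == "immediate":
        return f["imm"]
    if mode == "displacement":
        return mem[f["disp"] + regs[f["base"]]]
    if mode == "register_indirect":
        return mem[regs[f["base"]]]
    if mode == "indexed":
        return mem[regs[f["base"]] + regs[f["index"]]]
    if mode == "direct":
        return mem[f["addr"]]
    if mode == "memory_indirect":
        return mem[mem[regs[f["base"]]]]
    if mode == "autoincrement":
        value = mem[regs[f["base"]]]
        regs[f["base"]] += f["d"]      # step the pointer after the access
        return value
    if mode == "autodecrement":
        regs[f["base"]] -= f["d"]      # step the pointer before the access
        return mem[regs[f["base"]]]
    if mode == "scaled":
        return mem[f["disp"] + regs[f["base"]] + regs[f["index"]] * f["d"]]
    raise ValueError(f"unknown addressing mode: {mode}")

regs = {"R1": 100, "R2": 200, "R3": 3}
mem = {100: 200, 103: 11, 200: 7, 204: 9, 312: 5, 1001: 42}
print(fetch_operand("displacement", regs, mem, disp=100, base="R1"))
print(fetch_operand("scaled", regs, mem, disp=100, base="R2", index="R3", d=4))
```

Each branch mirrors one Meaning entry: the mode only changes how the address is formed, not what the Add does with the value.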

Addressing Mode for Signal Processing

DSP offers special addressing modes to better serve popular algorithms.

Modulo addressing
- Since DSP deals with continuous data streams, circular buffers are widely used.
- Circular or modulo addressing allows automatic increment and decrement and resets the pointer when reaching the end of the buffer.

Reverse addressing
- The resulting address is the reverse order of the current address.
- Reverse addressing mode expedites the access, which otherwise requires a number of logical instructions or extra memory accesses.

Fast Fourier Transform (bit-reversed order):

0 (000)   0 (000)
1 (001)   4 (100)
2 (010)   2 (010)
3 (011)   6 (110)
4 (100)   1 (001)
5 (101)   5 (101)
6 (110)   3 (011)
7 (111)   7 (111)

Special features require either hand coding or a compiler that uses such features (GNU would not be a good choice).
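Both DSP addressing modes are simple to model in software, which also shows why hardware support saves instructions. The sketch below is illustrative only; the function names bit_reverse and modulo_next are assumptions, not DSP instruction names.

```python
def bit_reverse(index, bits):
    """Reverse the low `bits` bits of index (reverse addressing, as used by the FFT)."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)  # move the lowest bit of index into result
        index >>= 1
    return result

def modulo_next(pointer, step, base, length):
    """Advance a circular-buffer pointer, wrapping inside [base, base + length)."""
    return base + (pointer - base + step) % length

# Access order for an 8-point FFT
print([bit_reverse(i, 3) for i in range(8)])

# Walk a 4-entry circular buffer that starts at address 100
p = 100
for _ in range(6):
    p = modulo_next(p, 1, 100, 4)
print(p)
```

In software each reversed index costs a short loop of shifts and masks; a DSP with reverse addressing produces it as part of the memory access.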

Operations of the Computer Hardware

"There must certainly be instructions for performing the fundamental arithmetic operations."
Burks, Goldstine and von Neumann, 1947

- Assembly language is a symbolic representation of what the processor actually understands.
- The MIPS assembler allows only one instruction per line and ignores comments following # until the end of the line.

Example: translation of a segment of a C program to MIPS assembly instructions

C:
    f = (g + h) - (i + j)

MIPS:
    add t0, g, h     # temp variable t0 contains "g + h"
    add t1, i, j     # temp variable t1 contains "i + j"
    sub f, t0, t1    # f = t0 - t1 = (g + h) - (i + j)

Operations in the Instruction Set

Operator type            Examples
Arithmetic and logical   Integer arithmetic and logical operations: add, and, subtract, or
Data transfer            Loads-stores (move instructions on machines with memory addressing)
Control                  Branch, jump, procedure call and return, trap
System                   Operating system call, virtual memory management instructions
Floating point           Floating point instructions: add, multiply
Decimal                  Decimal add, decimal multiply, decimal to character conversion
String                   String move, string compare, string search
Graphics                 Pixel operations, compression/decompression operations

- Arithmetic, logical, data transfer and control are almost standard categories for all machines.
- System instructions are required for multi-programming environments, although support for system functions varies.
- Decimal and string instructions can be primitives, e.g. on the IBM 360 and the VAX.
- Support for floating point, decimal, string and graphics is optional and is sometimes provided via a co-processor.
- Some machines rely on the compiler to synthesize special operations, such as string handling, from simpler instructions.

Operations for Media and Signal Processing

Single instruction multiple data (SIMD) and vector instructions are often supported in DSPs, which are commonly used in multimedia and signal processing applications.

Partitioned Add (integer)
- Performs multiple 16-bit additions on a 64-bit ALU, since most data are narrow
- Increases ALU throughput for multimedia applications

Paired single operations (float)
- Allow the same register to act as two operands to the same operation
- Handy in dealing with vertices and coordinates

Multiply and accumulate
- Very handy for calculating dot products of vectors (signal processing) and matrix multiplication
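A partitioned add and a multiply-accumulate loop can be sketched as follows. This is a software model under assumptions, not any vendor's instruction: four 16-bit lanes packed into a 64-bit word are assumed, and the names pack, unpack, partitioned_add and dot_product are made up for the example.

```python
LANE_BITS = 16
LANES = 4
LANE_MASK = (1 << LANE_BITS) - 1

def pack(lanes):
    """Pack four 16-bit values into one 64-bit word, lane 0 in the low bits."""
    word = 0
    for i, value in enumerate(lanes):
        word |= (value & LANE_MASK) << (i * LANE_BITS)
    return word

def unpack(word):
    """Split a 64-bit word back into its four 16-bit lanes."""
    return [(word >> (i * LANE_BITS)) & LANE_MASK for i in range(LANES)]

def partitioned_add(a, b):
    """Add lane by lane; a carry out of one lane never reaches its neighbour."""
    return pack([x + y for x, y in zip(unpack(a), unpack(b))])

def dot_product(xs, ys):
    """Dot product written as repeated multiply-accumulate steps."""
    acc = 0
    for x, y in zip(xs, ys):
        acc += x * y  # one multiply-and-accumulate per element pair
    return acc

a = pack([0xFFFF, 2, 3, 4])
b = pack([1, 20, 30, 40])
print(unpack(partitioned_add(a, b)))
print(dot_product([1, 2, 3], [4, 5, 6]))
```

Lane 0 wraps around to 0 without disturbing lane 1, which is the property that distinguishes a partitioned add from an ordinary 64-bit add.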

Frequency of Operations Usage

- Make the common case fast by focusing on these operations.
- The most widely executed instructions are the simple operations of an instruction set.
- The following is the average usage in SPECint92 on Intel 80x86:

Rank   80x86 instruction          Integer average (% total executed)
1      Load                       22%
2      Conditional branch         20%
3      Compare                    16%
4      Store                      12%
5      Add                        8%
6      And                        6%
7      Sub                        5%
8      Move register-register     4%
9      Call                       1%
10     Return                     1%
       Total                      96%

Control Flow Instructions

- Jump: for unconditional change in the control flow
- Branch: for conditional change in the control flow
- Procedure calls and returns

Data is based on SPEC on Alpha.

Destination Address Definition

- Relative addressing with respect to the program counter proved to be the best choice for forward and backward branches or jumps (load address independent).
- To allow for dynamic loading of library routines, register indirect addressing allows addresses to be loaded in special registers (e.g. virtual functions in C++ and system calls in a case statement).

Data is based on SPEC on Alpha.

Condition Evaluation

- Compare-and-branch can be efficient if the majority of conditions are comparisons with zero.
- Remember to focus on the common case.

Based on SPEC on MIPS.

Frequency of Types of Comparison

- Data is based on SPEC on Alpha.
- A different benchmark and machine set new design priorities.
- DSPs support a repeat instruction for for-loops (vectors) using registers.

Supporting Procedures

Execution of a procedure follows these steps:
- Store parameters in a place accessible to the procedure
- Transfer control to the procedure
- Acquire the storage resources needed for the procedure
- Perform the desired task
- Store the result value in a place accessible to the calling program
- Return control to the point of origin

The hardware provides a program counter to trace instruction flow and manage transfer of control.

Parameter Passing
- Registers can be used for passing a small number of parameters.
- A stack is used to spill registers of the current context, to make room for the called procedure to run, and to allow large parameters to be passed.
- Storage of machine state can be performed by the caller or the callee.
- Handling of shared variables is important to ensure correct semantics, and thus requires clear specifications in the library interface.
- Global variables stored in registers need careful handling.

Type and Size of Operands

- The type of an operand is designated by encoding it in the instruction's operation code.
- The type of an operand, e.g. single precision float, effectively gives its size.
- Common operand types include character, half word and word size integer, and single- and double-precision floating point.
- Characters are almost always in ASCII, integers are in 2's complement, and floating point is in IEEE 754.
- The 16-bit Unicode used in Java is gaining popularity due to its support for international character sets.
- For business applications, some architectures support a decimal format in binary coded decimal (BCD).
- Depending on the size of the word, the complexity of handling different operand types differs.
- DSP offers fixed point data types to support high precision floating point arithmetic and to allow sharing a single exponent for multiple numbers.
- For graphics applications, vertex and pixel operands are added features.
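The three encodings named above (2's complement, IEEE 754, and BCD) can be inspected with a short sketch. It relies on Python's standard struct module for the IEEE 754 single-precision layout; the helper names twos_complement, float_bits and to_bcd are illustrative assumptions.

```python
import struct

def twos_complement(value, bits):
    """Bit pattern of a signed integer in `bits`-bit 2's complement."""
    return value & ((1 << bits) - 1)

def float_bits(x):
    """32-bit pattern of x encoded as an IEEE 754 single-precision float."""
    return struct.unpack(">I", struct.pack(">f", x))[0]

def to_bcd(n):
    """Encode a non-negative integer in binary coded decimal, 4 bits per digit."""
    result = 0
    shift = 0
    while True:
        n, digit = divmod(n, 10)
        result |= digit << shift
        shift += 4
        if n == 0:
            return result

print(hex(twos_complement(-1, 8)))
print(hex(float_bits(1.0)))
print(hex(to_bcd(1947)))
```

The operand type fixes both the size and the interpretation: the same bit pattern means different things as an integer, a float, or a BCD number.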

Size of Operands

- The double-word data type is used for double-precision floating point operations and for address storage in machines with a 64-bit wide address bus.
- Words are used for integer operations and for 32-bit address bus machines.
- Word and double-word data types dominate because of the mix in SPEC.

Frequency of reference by size is based on SPEC2000 on Alpha.

Instruction Representation

- Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary, or base 2).
- Numbers are stored in computers as a series of high and low electronic signals (binary numbers).
- Binary digits are called bits and are considered the atom of computing.
- Each piece of an instruction is a number, and placing these numbers together forms the instruction.
- The assembler translates the symbolic assembly instructions into machine language instructions (machine code).

Example:

    Assembly:                 add $t0, $s1, $s2
    M/C language (decimal):   0       17      18      8       0       32
    M/C language (binary):    000000  10001   10010   01000   00000   100000
    Field widths:             6 bits  5 bits  5 bits  5 bits  5 bits  6 bits

Note: the MIPS compiler by default maps $s0, ..., $s7 to registers 16-23 and $t0, ..., $t7 to registers 8-15.

Encoding an Instruction Set

- Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation.
- The operation is typically specified in one field, called the opcode.
- The addressing mode for the operand can be encoded with the operation, or specified through a separate identifier when a large number of modes is supported.
- The architecture must balance several competing factors:
  - Desire to support as many registers and addressing modes as possible
  - Effect of operand specification on the size of the instruction (program)
  - Desire to simplify instruction fetching and decoding during execution
- Fixed size instruction encoding simplifies the CPU design while limiting the addressing modes supported.
- An architect caring about code size can use variable size encoding.
- A hybrid approach is to allow variability by supporting multiple-sized instructions.

Encoding Examples

[Figure: Encoding Examples]

MIPS Instruction Format

Register-format instructions:

    op      rs      rt      rd      shamt   funct
    6 bits  5 bits  5 bits  5 bits  5 bits  6 bits

- op: basic operation of the instruction, traditionally called the opcode
- rs: the first register source operand
- rt: the second register source operand
- rd: the register destination operand; it gets the result of the operation
- shamt: shift amount
- funct: this field selects the specific variant of the operation of the op field

Immediate-type instructions:

    op      rs      rt      address
    6 bits  5 bits  5 bits  16 bits

- Some instructions need longer fields than provided, for large value constants.
- The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register.
- Example: lw $t0, 32($s3)    # Temporary register $t0 gets A[8]

Instruction   Format   op   rs    rt    rd     shamt   funct   address
add           R        0    reg   reg   reg    0       32      N/A
sub           R        0    reg   reg   reg    0       34      N/A
lw            I        35   reg   reg   N/A    N/A     N/A     address
sw            I        43   reg   reg   N/A    N/A     N/A     address
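Placing the field numbers side by side is just shifting and OR-ing. The sketch below packs the two formats into 32-bit words, assuming the standard MIPS field widths (6/5/5/5/5/6 for R-format, 6/5/5/16 for I-format) and the usual register numbers ($t0 = 8, $s1 = 17, $s2 = 18, $s3 = 19); the function names are hypothetical.

```python
def encode_r_format(op, rs, rt, rd, shamt, funct):
    """Pack the six R-format fields into a 32-bit word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct

def encode_i_format(op, rs, rt, address):
    """Pack the four I-format fields; the address is kept to 16 bits."""
    return (op << 26) | (rs << 21) | (rt << 16) | (address & 0xFFFF)

T0, S1, S2, S3 = 8, 17, 18, 19   # conventional MIPS register numbers

# add $t0, $s1, $s2  ->  op=0, rs=$s1, rt=$s2, rd=$t0, shamt=0, funct=32
add_word = encode_r_format(0, S1, S2, T0, 0, 32)

# lw $t0, 32($s3)    ->  op=35, rs=$s3 (base), rt=$t0 (destination), address=32
lw_word = encode_i_format(35, S3, T0, 32)

print(format(add_word, "032b"))
print(hex(add_word), hex(lw_word))
```

Note that in the assembly text the destination comes first, while in the encoded word rd sits after rs and rt.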

The Stored Program Concept

- Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept.
- Today's computers are built on two key principles:
  - Instructions are represented as numbers
  - Programs can be stored in memory to be read or written just like numbers

The power of the concept: memory can contain
- the source code for an editor
- the compiled m/c code for the editor
- the text that the compiled program is using
- the compiler that generated the code

[Figure: a Processor connected to Memory, which holds: Accounting program (machine code), Editor program (machine code), C compiler (machine code), Payroll data, Book text, Source code in C for editor program]

Compiling if-then-else in MIPS

Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiled MIPS code for the following C if statement?

    if (i == j) f = g + h; else f = g - h;

[Figure: flowchart: test i == j; if i = j, execute f = g + h; if i ≠ j, go to Else: and execute f = g - h; both paths meet at Exit:]

MIPS:

          bne $s3, $s4, Else     # go to Else if i ≠ j
          add $s0, $s1, $s2      # f = g + h (skipped if i ≠ j)
          j   Exit               # go to Exit
    Else: sub $s0, $s1, $s2      # f = g - h (skipped if i = j)
    Exit:

Typical Compilation

[Figure: Typical Compilation]

Major Types of Optimization

Optimization name                        Explanation                                                                               Frequency

High-level (at or near source level; machine independent)
Procedure integration                    Replace procedure call by procedure body                                                  N.M.

Local (within straight line code)
Common subexpression elimination         Replace two instances of the same computation by a single copy                            18%
Constant propagation                     Replace all instances of a variable that is assigned a constant with the constant         22%
Stack height reduction                   Rearrange expression tree to minimize resources needed for expression evaluation          N.M.

Global (across a branch)
Global common subexpression elimination  Same as local, but this version crosses branches                                          13%
Copy propagation                         Replace all instances of a variable A that has been assigned X (i.e. A = X) with X        11%
Code motion                              Remove code from a loop that computes the same value each iteration of the loop           16%
Induction variable elimination           Simplify/eliminate array-addressing calculations within loops                             2%

Machine-dependent (depends on machine knowledge)
Strength reduction                       Many examples, such as replacing a multiply by a constant with adds and shifts            N.M.
Pipeline scheduling                      Reorder instructions to improve pipeline performance                                      N.M.

(N.M. = not measured)
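Two of the listed transformations are easy to show as before/after pairs. The sketch below is hand-written for illustration and is not the output of any compiler; the function names are made up. It shows strength reduction (a multiply by the constant 10 replaced with shifts and an add) and code motion (a loop-invariant sum hoisted out of the loop).

```python
def times_ten(x):
    """Original form: one multiply by a constant."""
    return x * 10

def times_ten_reduced(x):
    """Strength-reduced form: x*10 = x*8 + x*2, using two shifts and an add."""
    return (x << 3) + (x << 1)

def sum_scaled(values, a, b):
    """Original loop: a + b is recomputed on every iteration."""
    total = 0
    for v in values:
        total += v * (a + b)
    return total

def sum_scaled_hoisted(values, a, b):
    """After code motion: the invariant a + b is computed once, before the loop."""
    factor = a + b
    total = 0
    for v in values:
        total += v * factor
    return total

print(times_ten(7), times_ten_reduced(7))
print(sum_scaled([1, 2, 3], 4, 5), sum_scaled_hoisted([1, 2, 3], 4, 5))
```

Both rewrites preserve the result while replacing expensive or repeated work with cheaper work, which is the common pattern behind most entries in the table.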

Effect of Compiler Optimization

[Figure: effect of compiler optimization, by program and compiler optimization level. Measurements taken on MIPS.]

- Level 0: non-optimized code
- Level 1: local optimization
- Level 2: global optimization, s/w pipelining
- Level 3: adds procedure integration

Compiler Support for Multimedia Instructions

- Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics).
- Intel added a new set of instructions called Streaming SIMD Extension.
- A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer.
- Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations:
  - Strided addressing allows memory access in increments larger than one.
  - Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data.
- Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup.
- Such limited support for vector processing makes the use of vectorizing compiler optimization unpopular and restricts its scope to hand coded routines.

SIMD instructions on MMX and AltiVec tend to be solutions, not primitives.
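Strided and gather/scatter accesses differ only in where the element addresses come from. The following sketch models memory as a Python list; strided_load, gather and scatter are illustrative names and do not correspond to instructions of any particular vector machine.

```python
def strided_load(memory, base, stride, count):
    """Load `count` elements starting at `base`, stepping `stride` locations each time."""
    return [memory[base + i * stride] for i in range(count)]

def gather(memory, addresses):
    """Load elements from an explicit list of addresses (an index vector)."""
    return [memory[a] for a in addresses]

def scatter(memory, addresses, values):
    """Store each value to its matching address from the index vector."""
    for a, v in zip(addresses, values):
        memory[a] = v

memory = list(range(100, 116))   # 16 locations holding 100..115

# Column 1 of a 4x4 row-major matrix: stride equals the row length
print(strided_load(memory, 1, 4, 4))

# Arbitrary access pattern driven by stored addresses
print(gather(memory, [7, 0, 12]))
scatter(memory, [3, 5], [-1, -2])
print(memory[3], memory[5])
```

Without these modes, each such element needs its own address computation and scalar load before a vector operation can use it.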

Starting a Program

[Figure: C program → Compiler → Assembly language program → Assembler → Object: machine language module, plus Object: library routine (machine language) → Linker → Executable: machine language program → Loader → Memory]

Linker
- Places code and data modules symbolically in memory
- Determines the addresses of data and instruction labels
- Patches both internal and external references

Object files for Unix typically contain:
- Header: size and position of components
- Text segment: machine code
- Data segment: static and dynamic variables
- Relocation info: identifies absolute memory references
- Symbol table: names and locations of labels, procedures and variables
- Debugging info: mapping of source to object code, break points, etc.

Loading Executable Program

[Figure: memory layout, from low to high addresses: Reserved (from 0); Text (pc starts at 0040 0000 hex); Static data (from 1000 0000 hex, with $gp at 1000 8000 hex); Dynamic data; Stack (with $sp at 7fff ffff hex)]

To load an executable, the operating system follows these steps:
- Reads the executable file header to determine the size of the text and data segments
- Creates an address space large enough for the text and data
- Copies the instructions and data from the executable file into memory
- Copies the parameters (if any) to the main program onto the stack
- Initializes the machine registers and sets the stack pointer to the first free location
- Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program

Instruction Set Design Issues

Instruction set design issues:
- Number of Addresses
- Flow of Control
- Operand Types
- Addressing Modes
- Instruction Types
- Instruction Formats

Number of Addresses

Four categories:
- 3-address machines
  - 2 for the source operands and one for the result
- 2-address machines
  - One address doubles as source and result
- 1-address machines
  - Accumulator machines
  - Accumulator is used for one source and the result
- 0-address machines
  - Stack machines
  - Operands are taken from the stack
  - Result goes onto the stack

Number of Addresses (cont.)

Three-address machines
- Two for the source operands, one for the result
- RISC processors use three addresses
- Sample instructions:

    add  dest,src1,src2    ; M(dest) = [src1] + [src2]
    sub  dest,src1,src2    ; M(dest) = [src1] - [src2]
    mult dest,src1,src2    ; M(dest) = [src1] * [src2]

Three addresses: Operand 1, Operand 2, Result
Example: a = b + c

Three-address instruction formats are not common because they require a relatively long instruction format to hold the three address references.

Number of Addresses (cont.)

Example

C statement:

    A = B + C * D - E + F + A

Equivalent code:

    mult T,C,D    ; T = C*D
    add  T,T,B    ; T = B+C*D
    sub  T,T,E    ; T = B+C*D-E
    add  T,T,F    ; T = B+C*D-E+F
    add  A,T,A    ; A = B+C*D-E+F+A

Number of Addresses (cont.)

Two-address machines
- One address doubles (for source operand and result)
- The last example makes a case for it
  - Address T is used twice
- Sample instructions:

    load dest,src    ; M(dest) = [src]
    add  dest,src    ; M(dest) = [dest] + [src]
    sub  dest,src    ; M(dest) = [dest] - [src]
    mult dest,src    ; M(dest) = [dest] * [src]

Two addresses: one address doubles as operand and result
Example: a = a + b

The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation.

Number of Addresses (cont.)

Example

C statement:

    A = B + C * D - E + F + A

Equivalent code:

    load T,C    ; T = C
    mult T,D    ; T = C*D
    add  T,B    ; T = B+C*D
    sub  T,E    ; T = B+C*D-E
    add  T,F    ; T = B+C*D-E+F
    add  A,T    ; A = B+C*D-E+F+A

Number of Addresses (cont.)

One-address machines
- Use a special set of registers called accumulators
  - Specify one source operand and receive the result
- Called accumulator machines
- Sample instructions:

    load  addr    ; accum = [addr]
    store addr    ; M[addr] = accum
    add   addr    ; accum = accum + [addr]
    sub   addr    ; accum = accum - [addr]
    mult  addr    ; accum = accum * [addr]

Number of Addresses (cont.)

Example

C statement:

    A = B + C * D - E + F + A

Equivalent code:

    load  C    ; load C into accum
    mult  D    ; accum = C*D
    add   B    ; accum = C*D+B
    sub   E    ; accum = B+C*D-E
    add   F    ; accum = B+C*D-E+F
    add   A    ; accum = B+C*D-E+F+A
    store A    ; store accum contents in A
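The accumulator sequence can be checked by running it on a toy interpreter. This is a sketch under assumptions: memory is a dictionary keyed by variable name, the operand values are arbitrary test data, and run_accumulator is a hypothetical helper, not a model of any real machine.

```python
def run_accumulator(program, memory):
    """Execute one-address instructions; every operation goes through a single accumulator."""
    accum = 0
    for opcode, addr in program:
        if opcode == "load":
            accum = memory[addr]
        elif opcode == "store":
            memory[addr] = accum
        elif opcode == "add":
            accum += memory[addr]
        elif opcode == "sub":
            accum -= memory[addr]
        elif opcode == "mult":
            accum *= memory[addr]
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return memory

# A = B + C * D - E + F + A
program = [
    ("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
    ("add", "F"), ("add", "A"), ("store", "A"),
]
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
run_accumulator(program, memory)
print(memory["A"])
```

Seven instructions are needed, each carrying one memory address, and every instruction touches memory, which matches the "high memory traffic" drawback of accumulator architectures.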

Number of Addresses (cont.)

Zero-address machines
- Stack supplies operands and receives the result
  - Special instructions to load and store use an address
- Called stack machines (Ex: HP3000, Burroughs B5500)
- Sample instructions:

    push addr    ; push([addr])
    pop  addr    ; pop([addr])
    add          ; push(pop + pop)
    sub          ; push(pop - pop)
    mult         ; push(pop * pop)

Number of Addresses (cont.)

Example

C statement:

    A = B + C * D - E + F + A

Equivalent code:

    push E
    push C
    push D
    mult
    push B
    add
    sub
    push F
    add
    push A
    add
    pop  A
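To check the order of pushes in the zero-address example, the program can be run on a tiny stack-machine model. This is a sketch under assumptions: `run_stack`, the dictionary memory, and the initial values are illustrative names and data, and `sub` is taken to mean push(first pop - second pop), as in the sample instructions.

```python
def run_stack(program, memory):
    """Run a zero-address program; arithmetic takes operands from the stack."""
    stack = []
    for instr in program:
        op = instr[0]
        if op == "push":
            stack.append(memory[instr[1]])
        elif op == "pop":
            memory[instr[1]] = stack.pop()
        elif op in ("add", "sub", "mult"):
            first = stack.pop()    # value on top of the stack
            second = stack.pop()   # value just below it
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)
            else:
                stack.append(first * second)
        else:
            raise ValueError(f"unknown opcode: {op}")
    return memory


# Hypothetical initial values.
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
program = [
    ("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
    ("push", "B"), ("add",), ("sub",),
    ("push", "F"), ("add",), ("push", "A"), ("add",), ("pop", "A"),
]
run_stack(program, memory)
print(memory["A"])  # 2 + 3*4 - 5 + 6 + 1 = 16
```

E is pushed first so that it sits below B+C*D when `sub` executes; only push and pop carry an address.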

Load/Store Architecture

- Instructions expect operands in internal processor registers
  - Special LOAD and STORE instructions move data between registers and memory
- RISC uses this architecture
- Reduces instruction length

Load/Store Architecture (cont.)

Sample instructions:

    load  Rd,addr     ; Rd = [addr]
    store addr,Rs     ; (addr) = Rs
    add   Rd,Rs1,Rs2  ; Rd = Rs1 + Rs2
    sub   Rd,Rs1,Rs2  ; Rd = Rs1 - Rs2
    mult  Rd,Rs1,Rs2  ; Rd = Rs1 * Rs2

Number of Addresses (cont.)

Example

C statement:

    A = B + C * D - E + F + A

Equivalent code:

    load  R1,B
    load  R2,C
    load  R3,D
    load  R4,E
    load  R5,F
    load  R6,A
    mult  R2,R2,R3
    add   R2,R2,R1
    sub   R2,R2,R4
    add   R2,R2,R5
    add   R2,R2,R6
    store A,R2

Flow of Control

- Default is sequential flow
- Several instructions alter this default execution
  - Branches
    - Unconditional
    - Conditional
    - Delayed branches
  - Procedure calls
    - Delayed procedure calls

Flow of Control (cont.)

Branches
- Unconditional
  - Absolute address
  - PC-relative
    - Target address is specified relative to PC contents
    - Relocatable code
- Example: MIPS
  - Absolute address

        j target

  - PC-relative

        b target

Flow of Control (cont.)

Branches
- Conditional
  - Jump is taken only if the condition is met
- Two types
  - Set-Then-Jump
    - Condition testing is separated from branching
    - Condition code registers are used to convey the condition test result
    - Condition code registers keep a record of the status of the last ALU operation, such as an overflow condition
  - Example: Pentium code

        cmp AX,BX    ; compare AX and BX
        je  target   ; jump if equal

Flow of Control (cont.)

  - Test-and-Jump
    - Single instruction performs condition testing and branching
  - Example: MIPS instruction

        beq Rsrc1,Rsrc2,target

    Jumps to target if Rsrc1 = Rsrc2

Delayed branching
- Control is transferred after executing the instruction that follows the branch instruction
  - This instruction slot is called the delay slot
- Improves efficiency
- Highly pipelined RISC processors support it

Flow of Control (cont.)

Procedure calls
- Facilitate modular programming
- Require two pieces of information to return
  - End of procedure
    - Pentium uses the ret instruction
    - MIPS uses the jr instruction
  - Return address
    - In a (special) register: MIPS allows any general-purpose register
    - On the stack: Pentium

Flow of Control (cont.)

[Figure: delay slot]

Parameter Passing

Two basic techniques
- Register-based (e.g., PowerPC, MIPS)
  - Internal registers are used
    - Faster
    - Limit the number of parameters
    - Recursive procedure
- Stack-based (e.g., Pentium)
  - Stack is used
    - More general

2 Operand Types

- Instructions support basic data types
  - Characters
  - Integers
  - Floating-point
- Instruction overload
  - Same instruction for different data types
  - Example: Pentium

        mov AL,address    ; loads an 8-bit value
        mov AX,address    ; loads a 16-bit value
        mov EAX,address   ; loads a 32-bit value

Operand Types (cont.)

- Separate instructions
  - Instructions specify the operand size
  - Example: MIPS

        lb Rdest,address   ; loads a byte
        lh Rdest,address   ; loads a halfword (16 bits)
        lw Rdest,address   ; loads a word (32 bits)
        ld Rdest,address   ; loads a doubleword (64 bits)

  - Similar instructions exist for store

3 Addressing Modes

- How the operands are specified
- Operands can be in three places
  - Registers
    - Register addressing mode
  - Part of instruction
    - Constant
    - Immediate addressing mode
    - All processors support these two addressing modes
  - Memory
    - Difference between RISC and CISC
    - CISC supports a large variety of addressing modes
    - RISC follows load/store architecture

4 Instruction Types

Several types of instructions
- Data movement
  - Pentium: mov dest,src
  - Some do not provide direct data movement instructions
  - Indirect data movement

        add Rdest,Rsrc,0   ; Rdest = Rsrc+0

- Arithmetic and logical
  - Arithmetic
    - Integer and floating-point, signed and unsigned
    - add, subtract, multiply, divide
  - Logical
    - and, or, not, xor

Instruction Types (cont.)

Condition code bits
- S: Sign bit (0 = +, 1 = -)
- Z: Zero bit (0 = nonzero, 1 = zero)
- O: Overflow bit (0 = no overflow, 1 = overflow)
- C: Carry bit (0 = no carry, 1 = carry)

Example: Pentium

    cmp count,25   ; compare count to 25
                   ; (subtract 25 from count)
    je  target     ; jump if equal
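How a compare sets the four condition code bits can be sketched as follows. The function name `compare_flags` and the 32-bit default width are assumptions for illustration; the carry bit is modeled here as the borrow out of an unsigned subtraction, which is how x86-style compares behave.

```python
def compare_flags(a, b, bits=32):
    """Return the S, Z, O, C bits produced by computing a - b at a fixed width."""
    mask = (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "S": int(bool(result & sign_bit)),   # sign of the result
        "Z": int(result == 0),               # result is zero
        # Signed overflow: operands differ in sign and the result's sign differs from a.
        "O": int(bool((a ^ b) & (a ^ result) & sign_bit)),
        "C": int(a < b),                     # borrow in unsigned arithmetic
    }


def jump_if_equal(flags):
    """A je-style branch is taken exactly when the zero bit is set."""
    return flags["Z"] == 1


print(compare_flags(25, 25))  # Z is set, so a following je would be taken
print(compare_flags(24, 25))  # S and C are set, Z is clear
```

The compare discards the difference and keeps only the flags, which is what lets the branch be a separate instruction in the set-then-jump style.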

Instruction Types (cont.)

- Flow control and I/O instructions
  - Branch
  - Procedure call
  - Interrupts
- I/O instructions
  - Memory-mapped I/O
    - Most processors support memory-mapped I/O
    - No separate instructions for I/O
  - Isolated I/O
    - Pentium supports isolated I/O
    - Separate I/O instructions

          in  AX,port   ; read from an I/O port
          out port,AX   ; write to an I/O port

5 Instruction Formats

Two types
- Fixed-length
  - Used by RISC processors
  - 32-bit RISC processors use 32-bit wide instructions
    - Examples: SPARC, MIPS, PowerPC
- Variable-length
  - Used by CISC processors
  - Memory operands need more bits to specify
- Opcode
  - Major and exact operation

Examples of Instruction Formats

[Figure: examples of instruction formats]

RISC (Reduced Instruction Set Computer)
versus
CISC (Complex Instruction Set Computer)

RISC vs. CISC

- The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.
- RISC systems access memory only with explicit load and store instructions.
- In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.

RISC vs. CISC (cont.)

- The difference between CISC and RISC becomes evident through the basic computer performance equation:

      time / program = (time / cycle) x (cycles / instruction) x (instructions / program)

- RISC systems shorten execution time by reducing the clock cycles per instruction.
- CISC systems improve performance by reducing the number of instructions per program.

RISC vs. CISC (cont.)

- The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.
- The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.
- With fixed-length instructions, RISC lends itself to pipelining and speculative execution.

RISC vs. CISC (cont.)

Consider the two program fragments:

    CISC                 RISC
    mov ax, 10                  mov ax, 0
    mov bx, 5                   mov bx, 10
    mul bx, ax                  mov cx, 5
                         Begin: add ax, bx
                                loop Begin

The total clock cycles for the CISC version might be:

    (2 movs x 1 cycle) + (1 mul x 30 cycles) = 32 cycles

While the clock cycles for the RISC version are:

    (3 movs x 1 cycle) + (5 adds x 1 cycle) + (5 loops x 1 cycle) = 13 cycles

With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds.
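The cycle counts for the two fragments can be recomputed, and turned into execution times, with a few lines of code. This is a sketch: the instruction counts and per-instruction cycle costs are the ones usually quoted with this textbook example (a 30-cycle multiply, single-cycle everything else), and the two clock periods are purely hypothetical numbers chosen to illustrate the last factor of the performance equation.

```python
def total_cycles(instruction_mix):
    """Sum count * cycles-per-instruction over every instruction class."""
    return sum(count * cycles for count, cycles in instruction_mix.values())


def execution_time_ns(instruction_mix, cycle_time_ns):
    """time/program = total cycles * time/cycle."""
    return total_cycles(instruction_mix) * cycle_time_ns


# (count, cycles per instruction); values assumed as in the usual form of the example.
cisc = {"mov": (2, 1), "mul": (1, 30)}
risc = {"mov": (3, 1), "add": (5, 1), "loop": (5, 1)}

print(total_cycles(cisc))  # 32
print(total_cycles(risc))  # 13

# Hypothetical clock periods, for illustration only.
print(execution_time_ns(cisc, cycle_time_ns=2.0))
print(execution_time_ns(risc, cycle_time_ns=1.0))
```

The RISC fragment executes more instructions (13 against 3) but needs fewer cycles, so it wins even before any difference in clock period is considered.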

RISC vs. CISC (cont.)

- Because of their load-store ISAs, RISC architectures require a large number of CPU registers.
- These registers provide fast access to data during sequential program execution.
- They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.
- Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.

RISC vs. CISC (cont.)

- This is how registers can be overlapped in a RISC system.
- The current window pointer (CWP) points to the active register window.

[Figure: overlapping register windows]

RISC vs. CISC (cont.)

- It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.
- Some RISC systems provide more extravagant instruction sets than some CISC systems.
- Some systems combine both approaches.
- The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.

RISC vs. CISC (cont.)

RISC                                           | CISC
Multiple register sets.                        | Single register set.
Three operands per instruction.                | One or two register operands per instruction.
Parameter passing through register windows.    | Parameter passing through memory.
Single-cycle instructions.                     | Multiple-cycle instructions.
Hardwired control.                             | Microprogrammed control.
Highly pipelined.                              | Less pipelined.

Continued

RISC vs. CISC (cont.)

RISC                                           | CISC
Simple instructions, few in number.            | Many complex instructions.
Fixed-length instructions.                     | Variable-length instructions.
Complexity in compiler.                        | Complexity in microcode.
Only LOAD/STORE instructions access memory.    | Many instructions can access memory.
Few addressing modes.                          | Many addressing modes.

Summary

Instruction Set Design Issues

Instruction set design issues include:
- Where are operands stored?
  - registers, memory, stack, accumulator
- How many explicit operands are there?
  - 0, 1, 2, or 3
- How is the operand location specified?
  - register, immediate, indirect, ...
- What type and size of operands are supported?
  - byte, int, float, double, string, vector, ...
- What operations are supported?
  - add, sub, mul, move, compare, ...

More About General Purpose Registers

Why do almost all new architectures use GPRs?
- Registers are much faster than memory (even cache)
  - Register values are available immediately
  - When memory isn't ready, the processor must wait ("stall")
- Registers are convenient for variable storage
  - Compiler assigns some variables just to registers
  - More compact code, since small fields specify registers (compared to memory addresses)

[Figure: memory hierarchy: Processor (Registers, Cache), Memory, Disk]

What Operations are Needed

- Arithmetic and Logical
  - Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
  - Logical operation: AND, OR, XOR, NOT
- Data Transfer: copy, load, store
- Control: branch, jump, call, return
- Floating Point: ADDF, MULF, DIVF (same as arithmetic, but usually take bigger operands)
- Decimal: ADDD, CONVERT
- String: move, compare, search
- Graphics: pixel and vertex, compression/decompression operations

Stack Architecture: Pros and Cons

Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures

Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures

Accumulator Architecture: Pros and Cons

Pros
- Very low hardware requirements
- Easy to design and understand

Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic

Memory-Memory Architecture: Pros and Cons

Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)

Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required

Memory-Register Architecture: Pros and Cons

Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density

Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers

Evolution of Instruction Sets

[Figure: evolution of instruction sets]
- Single Accumulator (EDSAC)
- Accumulator + Index Registers (Manchester Mark I, ... series)
- Separation of Programming Model from Implementation
  - High-level Language Based
  - Concept of a Family
- General Purpose Register Machines
  - Complex Instruction Sets (VAX, Intel ...)
  - Load/Store Architecture (CDC ..., Cray 1)
- RISC (MIPS, SPARC, ...)

Register-Memory Architectures

Effect of the number of memory operands:

Number of memory addresses | Max. number of operands | Examples
0                          | 3                       | SPARC, MIPS, PowerPC, ALPHA
1                          | 2                       | Intel 80x86, Motorola 68000
2                          | 2                       | VAX (also has 3-operand format)
3                          | 3                       | VAX (also has 2-operand format)

Memory Address: Interpreting Memory Addressing

- The address of a word matches the byte address of one of its 4 bytes
- The addresses of sequential words differ by 4 (word size in bytes)
- Word addresses are multiples of 4 (alignment restriction)
- Machines that use the address of the leftmost byte as the word address are called "Big Endian", and those that use the rightmost byte are called "Little Endian"
- Misalignment complicates memory access and causes programs to run slower (some machines do not allow misaligned memory access at all)
- Byte ordering can be a problem when exchanging data among different machines
- Byte addressing affects array index calculation, which must account for word addressing and the offset within the word

Object addressed | Aligned at byte offsets | Misaligned at byte offsets
Byte             | 0, 1, 2, 3, 4, 5, 6, 7  | Never
Half word        | 0, 2, 4, 6              | 1, 3, 5, 7
Word             | 0, 4                    | 1, 2, 3, 5, 6, 7
Double word      | 0                       | 1, 2, 3, 4, 5, 6, 7
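The alignment rule and the two byte orders can be checked directly. This is a small sketch: `is_aligned` and the sample word 0x0A0B0C0D are illustrative choices, while `struct.pack` with the ">" and "<" prefixes is the standard-library way to request big-endian and little-endian layouts.

```python
import struct


def is_aligned(address, size):
    """An object of `size` bytes is aligned when its address is a multiple of size."""
    return address % size == 0


# Aligned byte offsets within an 8-byte block, per object size.
for name, size in [("byte", 1), ("half word", 2), ("word", 4), ("double word", 8)]:
    print(name, [offset for offset in range(8) if is_aligned(offset, size)])

word = 0x0A0B0C0D
big = struct.pack(">I", word)     # most significant byte at the lowest address
little = struct.pack("<I", word)  # least significant byte at the lowest address
print(big.hex(), little.hex())    # 0a0b0c0d 0d0c0b0a
```

Reading bytes written by one convention with the other convention silently reverses them, which is the data-exchange problem the slide mentions.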

Addressing Modes

- Addressing modes refer to how to specify the location of an operand (effective address)
- Addressing modes have the ability to:
  - Significantly reduce instruction counts
  - Increase the average CPI
  - Increase the complexity of building a machine
- The VAX machine is used for benchmark data since it supports a wide range of memory addressing modes
- Famous addressing modes can be classified based on:
  - the source of the data: register, immediate or memory
  - the address calculation: direct and indirect
- An indexed addressing mode is usually provided to allow efficient implementation of loops and array access

Example of Addressing Modes

Addressing mode | Example | Meaning | When used
Register | Add R4,R3 | Regs[R4] <- Regs[R4] + Regs[R3] | When a value is in a register
Immediate | Add R4,#3 | Regs[R4] <- Regs[R4] + 3 | For constants
Displacement | Add R4,100(R1) | Regs[R4] <- Regs[R4] + Mem[100 + Regs[R1]] | Accessing local variables
Register indirect | Add R4,(R1) | Regs[R4] <- Regs[R4] + Mem[Regs[R1]] | Accessing using a pointer or a computed address
Indexed | Add R4,(R1 + R2) | Regs[R4] <- Regs[R4] + Mem[Regs[R1] + Regs[R2]] | Sometimes useful in array addressing: R1 = base of the array; R2 = index amount
Direct or absolute | Add R4,(1001) | Regs[R4] <- Regs[R4] + Mem[1001] | Sometimes useful for accessing static data; address constant may need to be large
Memory indirect or memory deferred | Add R4,@(R3) | Regs[R4] <- Regs[R4] + Mem[Mem[Regs[R3]]] | If R3 is the address of the pointer p, then the mode yields *p
Autoincrement | Add R4,(R2)+ | Regs[R4] <- Regs[R4] + Mem[Regs[R2]]; Regs[R2] <- Regs[R2] + d | Useful for stepping through arrays within a loop. R2 points to the start of the array; each reference increments R2 by d
Autodecrement | Add R4,-(R2) | Regs[R2] <- Regs[R2] - d; Regs[R4] <- Regs[R4] + Mem[Regs[R2]] | Same use as autoincrement. Autodecrement/increment can also act as push/pop to implement a stack
Scaled | Add R4,100(R2)[R3] | Regs[R4] <- Regs[R4] + Mem[100 + Regs[R2] + Regs[R3] x d] | Used to index arrays
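The effective-address calculation behind the memory modes in the table can be written out as a single function. This is a sketch only: the function name, the mode strings, the register contents and the memory contents are all assumptions made up for illustration, and `d` stands for the operand size in bytes as in the table.

```python
def effective_address(mode, regs, mem, base=None, index=None, disp=0, d=4):
    """Return the memory address an operand refers to under a given mode."""
    if mode == "displacement":        # Mem[disp + Regs[base]]
        return disp + regs[base]
    if mode == "register_indirect":   # Mem[Regs[base]]
        return regs[base]
    if mode == "indexed":             # Mem[Regs[base] + Regs[index]]
        return regs[base] + regs[index]
    if mode == "direct":              # Mem[disp]
        return disp
    if mode == "memory_indirect":     # Mem[Mem[Regs[base]]]
        return mem[regs[base]]
    if mode == "scaled":              # Mem[disp + Regs[base] + Regs[index] * d]
        return disp + regs[base] + regs[index] * d
    raise ValueError(f"unknown addressing mode: {mode}")


# Hypothetical machine state.
regs = {"R1": 200, "R2": 300, "R3": 2}
mem = {200: 1000}

print(effective_address("displacement", regs, mem, base="R1", disp=100))          # 300
print(effective_address("indexed", regs, mem, base="R1", index="R2"))             # 500
print(effective_address("memory_indirect", regs, mem, base="R1"))                 # 1000
print(effective_address("scaled", regs, mem, base="R2", index="R3", disp=100))    # 408
```

Register and immediate modes are left out because they do not produce a memory address; autoincrement and autodecrement would additionally update the base register by d.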

Addressing Mode for Signal Processing

DSP offers special addressing modes to better serve popular algorithms. Special features require either hand coding or a compiler that uses such features (GNU would not be a good choice).

Modulo addressing
- Since DSP deals with continuous data streams, circular buffers are widely used
- Circular or modulo addressing allows automatic increment and decrement, and resets the pointer when reaching the end of the buffer

Reverse addressing
- The resulting address is the reverse order of the current address
- Reverse addressing mode expedites the access, which otherwise requires a number of logical instructions or extra memory accesses

Fast Fourier Transform (bit-reversed order):

    0 (000) -> 0 (000)
    1 (001) -> 4 (100)
    2 (010) -> 2 (010)
    3 (011) -> 6 (110)
    4 (100) -> 1 (001)
    5 (101) -> 5 (101)
    6 (110) -> 3 (011)
    7 (111) -> 7 (111)
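Both DSP addressing modes are easy to state in software, which also shows the work the hardware saves. The helper names `bit_reverse` and `modulo_next`, and the buffer base and length used below, are illustrative assumptions.

```python
def bit_reverse(index, bits):
    """Reverse the low `bits` bits of index (FFT-style reverse addressing)."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


def modulo_next(pointer, step, base, length):
    """Advance a pointer inside the circular buffer [base, base + length)."""
    return base + (pointer - base + step) % length


print([bit_reverse(i, 3) for i in range(8)])  # [0, 4, 2, 6, 1, 5, 3, 7]

# Hypothetical 4-entry circular buffer starting at address 100.
print(modulo_next(103, 1, base=100, length=4))   # wraps forward to 100
print(modulo_next(100, -1, base=100, length=4))  # wraps backward to 103
```

Without hardware support, each reversed address costs a loop of shifts and masks, and each circular-buffer step costs an extra compare or remainder operation.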

Operations of the Computer Hardware

"There must certainly be instructions for performing the fundamental arithmetic operations."
(Burks, Goldstine and von Neumann)

- Assembly language is a symbolic representation of what the processor actually understands
- The MIPS assembler allows only one instruction per line and ignores comments following # until the end of the line

Example: translation of a segment of a C program to MIPS assembly instructions

C:

    f = (g + h) - (i + j)

MIPS:

    add t0, g, h     # temp variable t0 contains "g + h"
    add t1, i, j     # temp variable t1 contains "i + j"
    sub f, t0, t1    # f = t0 - t1 = (g + h) - (i + j)

Operations in the Instruction Set

Operator type         | Examples
Arithmetic and logical | Integer arithmetic and logical operations: add, and, subtract, or
Data transfer         | Loads-stores (move instructions on machines with memory addressing)
Control               | Branch, jump, procedure call and return, trap
System                | Operating system call, virtual memory management instructions
Floating point        | Floating point instructions: add, multiply
Decimal               | Decimal add, decimal multiply, decimal to character conversion
String                | String move, string compare, string search
Graphics              | Pixel operations, compression/decompression operations

- Arithmetic, logical, data transfer and control are almost standard categories for all machines
- System instructions are required for multi-programming environments, although support for system functions varies
- Decimal and string instructions can be primitives, e.g. on the IBM 360 and the VAX
- Support for floating point, decimal, string and graphics can be optional, sometimes provided via a co-processor
- Some machines rely on the compiler to synthesize special operations, such as string handling, from simpler instructions

Operations for Media and Signal Processing

- Single instruction multiple data (SIMD) and vector instructions are often supported in DSPs, which are commonly used in multimedia and signal processing applications
- Partitioned add (integer)
  - Perform multiple 16-bit additions on a 64-bit ALU, since most data are narrow
  - Increases ALU throughput for multimedia applications
- Paired single operations (float)
  - Allow the same register to act as two operands to the same operation
  - Handy in dealing with vertices and coordinates
- Multiply and accumulate
  - Very handy for calculating dot products of vectors (signal processing) and matrix multiplication
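A partitioned add and a multiply-accumulate loop can be modeled in a few lines. This is a software sketch of what the hardware does in one operation: the function names are illustrative, and the lane layout (four 16-bit lanes in a 64-bit word, with wrap-around on overflow rather than saturation) is an assumption.

```python
def partitioned_add16(x, y):
    """Add four independent 16-bit lanes packed into two 64-bit words."""
    result = 0
    for lane in range(4):
        shift = 16 * lane
        a = (x >> shift) & 0xFFFF
        b = (y >> shift) & 0xFFFF
        # Mask the sum so a carry never spills into the neighbouring lane.
        result |= ((a + b) & 0xFFFF) << shift
    return result


def dot_product(xs, ys):
    """Each loop iteration is one multiply-accumulate (MAC) step."""
    acc = 0
    for x, y in zip(xs, ys):
        acc += x * y
    return acc


print(hex(partitioned_add16(0x0001_FFFF_0003_0004, 0x0001_0001_0001_0001)))
print(dot_product([1, 2, 3], [4, 5, 6]))  # 1*4 + 2*5 + 3*6 = 32
```

The only difference from an ordinary 64-bit add is that the carry chain is cut at each lane boundary, which is why the operation is cheap to add to an existing ALU.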

Frequency of Operations Usage

- The most widely executed instructions are the simple operations of an instruction set
- The following is the average usage in SPECint on Intel 80x86

Rank | 80x86 instruction      | Integer average (% total executed)
1    | Load                   | 22%
2    | Conditional branch     | 20%
3    | Compare                | 16%
4    | Store                  | 12%
5    | Add                    | 8%
6    | And                    | 6%
7    | Sub                    | 5%
8    | Move register-register | 4%
9    | Call                   | 1%
10   | Return                 | 1%
     | Total                  | 96%

Make the common case fast by focusing on these operations.

Control Flow Instructions

- Jump for unconditional change in the control flow
- Branch for conditional change in the control flow
- Procedure calls and returns

[Figure: frequency of control flow instructions; data is based on SPEC on Alpha]

Destination Address Definition

- Relative addressing with respect to the program counter proved to be the best choice for forward and backward branching or jumps (load address independent)
- To allow for dynamic loading of library routines, register indirect addressing allows addresses to be loaded in special registers (e.g. virtual functions in C++ and system calls in a case statement)

[Figure: branch distances; data is based on SPEC on Alpha]

Condition Evaluation

- Compare-and-branch can be efficient if the majority of conditions are comparisons with zero
- Remember to focus on the common case

[Figure: condition evaluation; based on SPEC on MIPS]

Frequency of Types of Comparison

[Figure: frequency of types of comparison; data is based on SPEC on Alpha]

- Different benchmarks and machines set new design priorities
- DSPs support a repeat instruction for for-loops (vectors) using registers

Supporting Procedures

Execution of a procedure follows these steps:
- Store parameters in a place accessible to the procedure
- Transfer control to the procedure
- Acquire the storage resources needed for the procedure
- Perform the desired task
- Store the result value in a place accessible to the calling program
- Return control to the point of origin

The hardware provides a program counter to trace instruction flow and manage transfer of control.

Parameter Passing
- Registers can be used for passing a small number of parameters
- A stack is used to spill registers of the current context, to make room for the called procedure to run, and to allow large parameters to be passed
- Storage of machine state can be performed by the caller or the callee
- Handling of shared variables is important to ensure correct semantics and thus requires clear specifications in the library interface
- Global variables stored in registers need careful handling

Type and Size of Operands

- The type of an operand is designated by encoding it in the instruction's operation code
- The type of an operand, e.g. single-precision float, effectively gives its size
- Common operand types include character, half-word and word-size integer, and single- and double-precision floating point
- Characters are almost always in ASCII, integers are in 2's complement, and floating point is in IEEE 754
- The 16-bit Unicode used in Java is gaining popularity due to its support for international character sets
- For business applications, some architectures support a decimal format in binary coded decimal (BCD)
- Depending on the size of the word, the complexity of handling different operand types differs
- DSP offers fixed-point data types to support high-precision floating point arithmetic and to allow sharing a single exponent among multiple numbers
- For graphics applications, vertex and pixel operands are added features

Size of Operands

- The double-word data type is used for double-precision floating point operations and for address storage in machines with a 64-bit wide address bus
- Words are used for integer operations and for 32-bit address bus machines
- Because of the mix in SPEC, word and double-word data types dominate

[Figure: frequency of reference by size, based on SPEC2000 on Alpha]

Instruction Representation

- Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary, or base 2)
- Numbers are stored in computers as a series of high and low electronic signals (binary numbers)
- Binary digits are called bits and are considered the atom of computing
- Each piece of an instruction is a number, and placing these numbers together forms the instruction
- The assembler translates the symbolic assembly instructions into machine language instructions (machine code)

Example:

    Assembly:                add $t0, $s1, $s2

    M/C language (decimal):  0       17      18      8       0       32
    M/C language (binary):   000000  10001   10010   01000   00000   100000
                             6 bits  5 bits  5 bits  5 bits  5 bits  6 bits

Note: the MIPS compiler by default maps $s0..$s7 to registers 16-23 and $t0..$t7 to registers 8-15.
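Placing the field numbers together to form the instruction word is just shifting and OR-ing. The sketch below packs a MIPS register-format instruction from its six fields; the function name `encode_r_type` is illustrative, and the field values used are the standard MIPS numbers for add $t0, $s1, $s2 (register 8 for $t0, 17 and 18 for $s1 and $s2, funct 32 for add).

```python
def encode_r_type(op, rs, rt, rd, shamt, funct):
    """Pack the 6/5/5/5/5/6-bit fields of a MIPS R-format instruction."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


# add $t0, $s1, $s2  ->  op=0, rs=17 ($s1), rt=18 ($s2), rd=8 ($t0), shamt=0, funct=32
word = encode_r_type(0, 17, 18, 8, 0, 32)
print(hex(word))             # 0x2324020
print(format(word, "032b"))  # 00000010001100100100000000100000
```

Note that the destination register comes last in the assembly syntax's encoding order: the rd field sits after both source fields even though it is written first in the instruction.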

Encoding an Instruction Set

- Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation
- The operation is typically specified in one field, called the opcode
- The addressing mode for the operand can be encoded with the operation, or specified through a separate identifier when a large number of modes is supported
- The architect must balance several competing factors:
  - Desire to support as many registers and addressing modes as possible
  - Effect of operand specification on the size of the instruction (program)
  - Desire to simplify instruction fetching and decoding during execution
- Fixed-size instruction encoding simplifies the CPU design while limiting the addressing modes supported
- An architect caring about code size can use variable-size encoding
- A hybrid approach is to allow variability by supporting multiple instruction sizes

Encoding Examples

MIPS Instruction format

Register-format instructions

op: Basic operation of the instruction, traditionally called the opcode
rs: The first register source operand
rt: The second register source operand
rd: The register destination operand; it gets the result of the operation
shamt: Shift amount
funct: This field selects the specific variant of the operation in the op field

Immediate-type instructions

Some instructions need longer fields than those provided above, for example for large-value constants

The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register

Example: lw $t0, 32($s3)   # Temporary register $t0 gets A[8]

Instruction   Format   op   rs    rt    rd    shamt   funct   address
add           R        0    reg   reg   reg   0       32      N/A
sub           R        0    reg   reg   reg   0       34      N/A
lw            I        35   reg   reg   N/A   N/A     N/A     address
sw            I        43   reg   reg   N/A   N/A     N/A     address

R-format fields: op (6 bits), rs (5 bits), rt (5 bits), rd (5 bits), shamt (5 bits), funct (6 bits)

I-format fields: op (6 bits), rs (5 bits), rt (5 bits), address (16 bits)
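The immediate format and its offset limit can be illustrated the same way. The sketch below assumes the standard MIPS I-format layout (6-bit op, 5-bit rs, 5-bit rt, 16-bit signed address field) and the load-word opcode 35; `encode_i_type` is a hypothetical helper name, and the register numbers ($s3 = 19, $t0 = 8) follow the usual MIPS convention.

```python
def encode_i_type(op, rs, rt, imm):
    """Pack an I-format instruction; imm is a signed 16-bit offset."""
    if not -(1 << 15) <= imm < (1 << 15):
        raise ValueError("offset does not fit in a signed 16-bit field")
    # Mask the offset to 16 bits so negative values use two's complement.
    return (op << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


# lw $t0, 32($s3): base register $s3 (reg 19), target $t0 (reg 8), offset 32
lw_word = encode_i_type(op=35, rs=19, rt=8, imm=32)
print(f"{lw_word:#010x}")
```

The range check is where the "±2^15 bytes" limit comes from: any offset outside the signed 16-bit range cannot be expressed in a single load instruction.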

The Stored Program Concept

Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept

Today's computers are built on two key principles:

- Instructions are represented as numbers

- Programs can be stored in memory to be read or written just like numbers

The power of the concept

Memory can contain:

- the source code for an editor

- the compiled machine code for the editor

- the text that the compiled program is using

- the compiler that generated the code

[Figure: a processor connected to a memory that holds an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and the source code in C for the editor program]

Compiling if-then-else in MIPS

Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiled MIPS code for the following C if statement?

if (i == j) f = g + h; else f = g - h;

[Flowchart: test i == j; when i = j, execute f = g + h and continue at Exit; when i ≠ j, go to Else, execute f = g - h, and continue at Exit]

MIPS:

      bne $s3, $s4, Else    # go to Else if i ≠ j
      add $s0, $s1, $s2     # f = g + h (skipped if i ≠ j)
      j Exit
Else: sub $s0, $s1, $s2     # f = g - h (skipped if i = j)
Exit:
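To see how the branch and the jump select exactly one of the two assignments, the sequence can be traced with a toy interpreter. This is a sketch only: it assumes the register mapping f, g, h, i, j to s0 through s4, models just the four instructions used here, and the names `run` and `if_then_else` are made up for the illustration.

```python
PROGRAM = [
    ("bne", "s3", "s4", "Else"),  # go to Else if i != j
    ("add", "s0", "s1", "s2"),    # f = g + h
    ("j", "Exit"),
    ("label", "Else"),
    ("sub", "s0", "s1", "s2"),    # f = g - h
    ("label", "Exit"),
]


def run(program, regs):
    """Execute the tiny instruction subset above on a register dictionary."""
    labels = {ins[1]: pos for pos, ins in enumerate(program) if ins[0] == "label"}
    pc = 0
    while pc < len(program):
        op, *args = program[pc]
        pc += 1  # default: sequential flow
        if op == "bne":
            if regs[args[0]] != regs[args[1]]:
                pc = labels[args[2]]
        elif op == "j":
            pc = labels[args[0]]
        elif op == "add":
            regs[args[0]] = regs[args[1]] + regs[args[2]]
        elif op == "sub":
            regs[args[0]] = regs[args[1]] - regs[args[2]]
        elif op != "label":
            raise ValueError(f"unsupported instruction: {op}")
    return regs


def if_then_else(g, h, i, j):
    regs = {"s0": 0, "s1": g, "s2": h, "s3": i, "s4": j}
    return run(PROGRAM, regs)["s0"]


print(if_then_else(5, 3, 1, 1))  # i == j, so f = g + h
print(if_then_else(5, 3, 1, 2))  # i != j, so f = g - h
```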

Typical Compilation

Major Types of Optimization

Optimization name: Explanation (Frequency)

High-level: at or near source level; machine independent
- Procedure integration: Replace procedure call by procedure body (N.M.)

Local: within straight-line code
- Common subexpression elimination: Replace two instances of the same computation by a single copy
- Constant propagation: Replace all instances of a variable that is assigned a constant with the constant
- Stack height reduction: Rearrange expression tree to minimize resources needed for expression evaluation (N.M.)

Global: across a branch
- Global common subexpression elimination: Same as local, but this version crosses branches
- Copy propagation: Replace all instances of a variable A that has been assigned X (i.e., A = X) with X
- Code motion: Remove code from a loop that computes the same value each iteration of the loop
- Induction variable elimination: Simplify/eliminate array addressing calculations within loops

Machine-dependent: depends on machine knowledge
- Strength reduction: Many examples, such as replacing a multiply by a constant with adds and shifts (N.M.)
- Pipeline scheduling: Reorder instructions to improve pipeline performance (N.M.)
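Several of these transformations can be shown by hand on a small loop. The two functions below are an illustrative sketch, not compiler output: `unoptimized` and `optimized` are hypothetical names, and the second function is what a compiler might produce after constant propagation, common subexpression elimination, code motion, and strength reduction.

```python
def unoptimized(a, n):
    scale = 8                             # variable assigned a constant
    out = []
    for i in range(n):
        bias = len(a) - 1                 # same value on every iteration
        x = a[i] * scale + a[i] * scale   # the same product computed twice
        out.append(x + bias)
    return out


def optimized(a, n):
    bias = len(a) - 1                     # code motion: hoisted out of the loop
    out = []
    for i in range(n):
        # Constant propagation replaces scale with 8; strength reduction
        # turns the multiply by 8 into a shift by 3.
        t = a[i] << 3
        # Common subexpression elimination: the product is computed once.
        out.append(t + t + bias)
    return out


data = [3, -1, 7, 0]
print(unoptimized(data, len(data)) == optimized(data, len(data)))
```

Both versions compute the same values; the optimized one simply does less work per iteration.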

Effect of Compiler Optimization

[Figure: measurements for each program and compiler optimization level]

Level 0: non-optimized code

Level 1: local optimization

Level 2: global optimization, s/w pipelining

Level 3: adds procedure integration

Compiler Support for Multimedia Instr.

Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics)

Intel added a new set of instructions called Streaming SIMD Extension

A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer

Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations

- Strided addressing allows memory access in increments larger than one

- Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data

Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup

Such limited support for vector processing makes the use of vectorizing compiler optimizations unpopular and restricts their scope to hand-coded routines

SIMD instructions on MMX and AltiVec tend to be solutions, not primitives

Starting a Program

[Figure: C program → Compiler → Assembly language program → Assembler → Object: machine language module, together with Object: library routine (machine language) → Linker → Executable: machine language program → Loader → Memory]

Linker

- Places code and data modules symbolically in memory

- Determines the addresses of data and instruction labels

- Patches both internal and external references

Object files for Unix typically contain:

- Header: size and position of components

- Text segment: machine code

- Data segment: static and dynamic variables

- Relocation info: identifies absolute memory references

- Symbol table: names and locations of labels, procedures and variables

- Debugging info: mapping of source to object code, break points, etc.

Loading Executable Program

[Figure: memory layout from high to low addresses: stack ($sp at 7fff ffff hex), dynamic data, static data (from 1000 0000 hex, with $gp at 1000 8000 hex), text (pc at 0040 0000 hex), reserved (down to address 0)]

To load an executable, the operating system follows these steps:

- Reads the executable file header to determine the size of the text and data segments

- Creates an address space large enough for the text and data

- Copies the instructions and data from the executable file into memory

- Copies the parameters (if any) to the main program onto the stack

- Initializes the machine registers and sets the stack pointer to the first free location

- Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program

Instruction Set Design Issues

Instruction set design issues:

- Number of Addresses

- Flow of Control

- Operand Types

- Addressing Modes

- Instruction Types

- Instruction Formats

Number of Addresses

Four categories

3-address machines
- 2 for the source operands and one for the result

2-address machines
- One address doubles as source and result

1-address machines
- Accumulator machines
- Accumulator is used for one source and result

0-address machines
- Stack machines
- Operands are taken from the stack
- Result goes onto the stack

Number of Addresses cont.

Three-address machines

Two for the source operands, one for the result

RISC processors use three addresses

Sample instructions

add  dest,src1,src2     M(dest)=[src1]+[src2]
sub  dest,src1,src2     M(dest)=[src1]-[src2]
mult dest,src1,src2     M(dest)=[src1]*[src2]

Three addresses: Operand 1, Operand 2, Result

Example: a = b + c

Three-address instruction formats are not common because they require a relatively long instruction format to hold the three address references

Number of Addresses cont.

Example

C statement

A = B + C * D - E + F + A

Equivalent code:

mult T,C,D     T = C*D
add  T,T,B     T = B+C*D
sub  T,T,E     T = B+C*D-E
add  T,T,F     T = B+C*D-E+F
add  A,T,A     A = B+C*D-E+F+A

Number of Addresses cont.

Two-address machines

One address doubles (for source operand and result)

Last example makes a case for it
- Address T is used twice

Sample instructions

load dest,src     M(dest)=[src]
add  dest,src     M(dest)=[dest]+[src]
sub  dest,src     M(dest)=[dest]-[src]
mult dest,src     M(dest)=[dest]*[src]

Two addresses: one address doubles as operand and result

Example: a = a + b

The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation

Number of Addresses cont.

Example

C statement

A = B + C * D - E + F + A

Equivalent code:

load T,C     T = C
mult T,D     T = C*D
add  T,B     T = B+C*D
sub  T,E     T = B+C*D-E
add  T,F     T = B+C*D-E+F
add  A,T     A = B+C*D-E+F+A

Number of Addresses cont.

One-address machines

Use a special set of registers called accumulators
- Specify one source operand and receive the result

Called accumulator machines

Sample instructions

load  addr     accum = [addr]
store addr     M[addr] = accum
add   addr     accum = accum + [addr]
sub   addr     accum = accum - [addr]
mult  addr     accum = accum * [addr]

Number of Addresses cont.

Example

C statement

A = B + C * D - E + F + A

Equivalent code:

load  C     load C into accum
mult  D     accum = C*D
add   B     accum = C*D+B
sub   E     accum = B+C*D-E
add   F     accum = B+C*D-E+F
add   A     accum = B+C*D-E+F+A
store A     store accum contents in A
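The one-address sequence can be checked with a small accumulator-machine model. This is a sketch under stated assumptions: memory is a dictionary of named locations, only the five sample instructions are modeled, and `run_accumulator` and the initial values are invented for the illustration.

```python
def run_accumulator(program, memory):
    """Execute one-address instructions; the accumulator is the implicit operand."""
    accum = 0
    for op, addr in program:
        if op == "load":
            accum = memory[addr]
        elif op == "store":
            memory[addr] = accum
        elif op == "add":
            accum = accum + memory[addr]
        elif op == "sub":
            accum = accum - memory[addr]
        elif op == "mult":
            accum = accum * memory[addr]
        else:
            raise ValueError(f"unsupported instruction: {op}")
    return memory


# A = B + C * D - E + F + A
program = [("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
           ("add", "F"), ("add", "A"), ("store", "A")]
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
expected = memory["B"] + memory["C"] * memory["D"] - memory["E"] + memory["F"] + memory["A"]
run_accumulator(program, memory)
print(memory["A"], expected)
```

Every instruction except `load` and `store` both reads and overwrites the accumulator, which is why the example needs seven instructions and seven memory accesses.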

Number of Addresses cont.

Zero-address machines

Stack supplies operands and receives the result
- Special instructions to load and store use an address

Called stack machines (Ex: HP 3000, Burroughs B5500)

Sample instructions

push addr     push([addr])
pop  addr     pop([addr])
add           push(pop + pop)
sub           push(pop - pop)
mult          push(pop * pop)

Number of Addresses cont.

Example

C statement

A = B + C * D - E + F + A

Equivalent code:

push E
push C
push D
mult
push B
add
sub
push F
add
push A
add
pop  A
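A short stack-machine model makes the operand order of the zero-address code easier to follow. The sketch assumes that for `sub` the first value popped (the top of the stack) is the left operand, matching push(pop - pop) read left to right; `run_stack_machine` and the initial memory values are hypothetical.

```python
def run_stack_machine(program, memory):
    """Execute zero-address instructions; only push and pop name an address."""
    stack = []
    for instruction in program:
        op = instruction[0]
        if op == "push":
            stack.append(memory[instruction[1]])
        elif op == "pop":
            memory[instruction[1]] = stack.pop()
        elif op in ("add", "sub", "mult"):
            first = stack.pop()    # top of stack
            second = stack.pop()   # next entry down
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)
            else:
                stack.append(first * second)
        else:
            raise ValueError(f"unsupported instruction: {op}")
    return memory


# A = B + C * D - E + F + A
stack_program = [("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
                 ("push", "B"), ("add",), ("sub",), ("push", "F"), ("add",),
                 ("push", "A"), ("add",), ("pop", "A")]
stack_memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
run_stack_machine(stack_program, stack_memory)
print(stack_memory["A"])
```

E is pushed first so that it sits below B + C * D when `sub` executes, which is why the code does not follow the left-to-right order of the C statement.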

Load/Store Architecture

Instructions expect operands in internal processor registers

Special LOAD and STORE instructions move data between registers and memory

RISC uses this architecture

Reduces instruction length

Load/Store Architecture cont.

Sample instructions

load  Rd,addr        Rd = [addr]
store addr,Rs        (addr) = Rs
add   Rd,Rs1,Rs2     Rd = Rs1 + Rs2
sub   Rd,Rs1,Rs2     Rd = Rs1 - Rs2
mult  Rd,Rs1,Rs2     Rd = Rs1 * Rs2

Number of Addresses cont.

Example

C statement

A = B + C * D - E + F + A

Equivalent code:

load  R1,B
load  R2,C
load  R3,D
load  R4,E
load  R5,F
load  R6,A
mult  R2,R2,R3
add   R2,R2,R1
sub   R2,R2,R4
add   R2,R2,R5
add   R2,R2,R6
store A,R2

Flow of Control

Default is sequential flow

Several instructions alter this default execution

Branches
- Unconditional
- Conditional
- Delayed branches

Procedure calls
- Delayed procedure calls

Flow of Control cont.

Branches

Unconditional
- Absolute address
- PC-relative
  * Target address is specified relative to PC contents
  * Relocatable code

Example: MIPS
- Absolute address
    j target
- PC-relative
    b target

Flow of Control cont.

Branches

Conditional
- Jump is taken only if the condition is met

Two types
- Set-Then-Jump
  * Condition testing is separated from branching
  * Condition code registers are used to convey the condition test result
  * Condition code registers keep a record of the status of the last ALU operation, such as an overflow condition
- Example: Pentium code

    cmp AX,BX      compare AX and BX
    je  target     jump if equal

Flow of Control cont.

- Test-and-Jump
  * Single instruction performs condition testing and branching
- Example: MIPS instruction

    beq Rsrc1,Rsrc2,target

  Jumps to target if Rsrc1 = Rsrc2

Delayed branching

Control is transferred after executing the instruction that follows the branch instruction
- This instruction slot is called the delay slot

Improves efficiency

Highly pipelined RISC processors support it

Flow of Control cont.

Procedure calls

Facilitate modular programming

Require two pieces of information to return
- End of procedure
  * Pentium uses the ret instruction
  * MIPS uses the jr instruction
- Return address
  * In a (special) register: MIPS allows any general-purpose register
  * On the stack: Pentium

Flow of Control cont.

[Figure: delay slot]

Parameter Passing

Two basic techniques

Register-based (e.g., PowerPC, MIPS)
- Internal registers are used
  * Faster
  * Limit the number of parameters
  * Recursive procedure

Stack-based (e.g., Pentium)
- Stack is used
  * More general

Operand Types

Instructions support basic data types
- Characters
- Integers
- Floating-point

Instruction overload
- Same instruction for different data types
- Example: Pentium

    mov AL,address      loads an 8-bit value
    mov AX,address      loads a 16-bit value
    mov EAX,address     loads a 32-bit value

Operand Types cont.

Separate instructions
- Instructions specify the operand size
- Example: MIPS

    lb Rdest,address     loads a byte
    lh Rdest,address     loads a halfword (16 bits)
    lw Rdest,address     loads a word (32 bits)
    ld Rdest,address     loads a doubleword (64 bits)

Similar instruction: store

Addressing Modes

How the operands are specified

Operands can be in three places
- Registers
  * Register addressing mode
- Part of instruction
  * Constant
  * Immediate addressing mode
  * All processors support these two addressing modes
- Memory
  * Difference between RISC and CISC
  * CISC supports a large variety of addressing modes
  * RISC follows load/store architecture

Instruction Types

Several types of instructions

Data movement
- Pentium: mov dest,src
- Some do not provide direct data movement instructions
- Indirect data movement

    add Rdest,Rsrc,0     Rdest = Rsrc+0

Arithmetic and Logical
- Arithmetic
  * Integer and floating-point, signed and unsigned
  * add, subtract, multiply, divide
- Logical
  * and, or, not, xor

Instruction Types cont.

Condition code bits

S: Sign bit (0 = +, 1 = -)

Z: Zero bit (0 = nonzero, 1 = zero)

O: Overflow bit (0 = no overflow, 1 = overflow)

C: Carry bit (0 = no carry, 1 = carry)

Example: Pentium

    cmp count,25     compare count to 25
                     subtract 25 from count
    je  target       jump if equal
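How a compare instruction sets these bits can be sketched by modeling the subtraction directly. The code below is an illustration only: it assumes an 8-bit operand width, the x86 convention that the carry bit records a borrow on subtraction, and the usual two's-complement overflow rule; `compare` is a hypothetical helper, not an instruction of any real assembler.

```python
def compare(a, b, bits=8):
    """Return the S, Z, O, C condition bits that subtracting b from a would set."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask  # the difference is discarded; only flags are kept
    return {
        "S": int(bool(result & sign)),  # sign bit of the result
        "Z": int(result == 0),          # set when the operands are equal
        # Signed overflow: operands differ in sign and the result's sign
        # differs from the sign of the first operand.
        "O": int(bool((a ^ b) & (a ^ result) & sign)),
        "C": int(a < b),                # borrow out of the unsigned subtraction
    }


equal = compare(25, 25)
smaller = compare(5, 25)
overflow = compare(0x80, 0x01)  # -128 - 1 does not fit in 8 signed bits
print(equal, smaller, overflow)
```

A jump-if-equal instruction then only has to look at the Z bit, which is what separates condition testing from branching in the set-then-jump style.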

Instruction Types cont.

Flow control and I/O instructions
- Branch
- Procedure call
- Interrupts

I/O instructions
- Memory-mapped I/O
  * Most processors support memory-mapped I/O
  * No separate instructions for I/O
- Isolated I/O
  * Pentium supports isolated I/O
  * Separate I/O instructions

    in  AX,io_port     read from an I/O port
    out io_port,AX     write to an I/O port

Instruction Formats

Two types

Fixed-length
- Used by RISC processors
- 32-bit RISC processors use 32-bit-wide instructions
  * Examples: SPARC, MIPS, PowerPC

Variable-length
- Used by CISC processors
- Memory operands need more bits to specify

Opcode
- Major and exact operation

Examples of Instruction Formats

RISC (Reduced Instruction Set Computer)

versus

CISC (Complex Instruction Set Computer)

RISC vs CISC

The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute

RISC systems access memory only with explicit load and store instructions

In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable

RISC vs CISC

The difference between CISC and RISC becomes evident through the basic computer performance equation:

time/program = (time/cycle) × (cycles/instruction) × (instructions/program)

RISC systems shorten execution time by reducing the clock cycles per instruction

CISC systems improve performance by reducing the number of instructions per program

RISC vs CISC

The simple instruction set of RISC machines enables control units to be hardwired for maximum speed

The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time

With fixed-length instructions, RISC lends itself to pipelining and speculative execution

RISC vs CISC

Consider these two program fragments:

CISC:

       mov ax, 10
       mov bx, 5
       mul bx, ax

RISC:

       mov ax, 0
       mov bx, 10
       mov cx, 5
Begin: add ax, bx
       loop Begin

The total clock cycles for the CISC version might be:
(2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles

While the clock cycles for the RISC version are:
(3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles

With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds
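The cycle comparison is simple arithmetic on instruction counts and per-instruction costs. The sketch below assumes the figures of the example (single-cycle mov, add and loop instructions, a 30-cycle multiply, and a loop that runs five times); `total_cycles` and `cpu_time` are hypothetical helper names, and the 2 ns and 1 ns cycle times are made-up numbers used only to show the effect of a shorter clock.

```python
def total_cycles(mix):
    """Sum count * cycles over (count, cycles_per_instruction) pairs."""
    return sum(count * cycles for count, cycles in mix)


def cpu_time(cycles, cycle_time_ns):
    """Execution time = total clock cycles * time per cycle."""
    return cycles * cycle_time_ns


cisc_mix = [(2, 1), (1, 30)]        # 2 movs, 1 mul
risc_mix = [(3, 1), (5, 1), (5, 1)]  # 3 movs, 5 adds, 5 loops

cisc_cycles = total_cycles(cisc_mix)
risc_cycles = total_cycles(risc_mix)
print(cisc_cycles, risc_cycles)

# Assumed cycle times, for illustration only.
print(cpu_time(cisc_cycles, 2.0), cpu_time(risc_cycles, 1.0))
```

The RISC fragment executes more instructions but fewer total cycles, and any reduction in cycle time widens the gap further.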

RISC vs CISC

Because of their load-store ISAs, RISC architectures require a large number of CPU registers

These registers provide fast access to data during sequential program execution

They can also be employed to reduce the overhead typically caused by passing parameters to subprograms

Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers

RISC vs CISC

This is how registers can be overlapped in a RISC system

The current window pointer (CWP) points to the active register window

RISC vs CISC

It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures

Some RISC systems provide more extravagant instruction sets than some CISC systems

Some systems combine both approaches

The following two slides summarize the characteristics that traditionally typify the differences between these two architectures

RISC vs CISC

RISC
- Multiple register sets.
- Three operands per instruction.
- Parameter passing through register windows.
- Single-cycle instructions.
- Hardwired control.
- Highly pipelined.

CISC
- Single register set.
- One or two register operands per instruction.
- Parameter passing through memory.
- Multiple cycle instructions.
- Microprogrammed control.
- Less pipelined.

Continued...

RISC vs CISC

RISC
- Simple instructions, few in number.
- Fixed length instructions.
- Complexity in compiler.
- Only LOAD/STORE instructions access memory.
- Few addressing modes.

CISC
- Many complex instructions.
- Variable length instructions.
- Complexity in microcode.
- Many instructions can access memory.
- Many addressing modes.

Summary

Instruction Set Design Issues

Instruction set design issues include:

Where are operands stored?
- registers, memory, stack, accumulator

How many explicit operands are there?
- 0, 1, 2, or 3

How is the operand location specified?
- register, immediate, indirect, ...

What type and size of operands are supported?
- byte, int, float, double, string, vector, ...

What operations are supported?
- add, sub, mul, move, compare, ...

More About General Purpose Registers

Why do almost all new architectures use GPRs?

Registers are much faster than memory (even cache)
- Register values are available immediately
- When memory isn't ready, the processor must wait ("stall")

Registers are convenient for variable storage
- Compiler assigns some variables just to registers
- More compact code, since small fields specify registers (compared to memory addresses)

[Figure: memory hierarchy: Processor with Registers, Cache, Memory, Disk]

What Operations are Needed

Arithmetic + Logical
- Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
- Logical operation: AND, OR, XOR, NOT

Data Transfer - copy, load, store

Control - branch, jump, call, return

Floating Point - ADDF, MULF, DIVF (same as arithmetic but usually take bigger operands)

Decimal - ADDD, CONVERT

String - move, compare, search

Graphics - pixel and vertex, compression/decompression operations

Stacks Architecture Pros and Cons

Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures

Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures

Accumulators Architecture Pros and Cons

Pros
- Very low hardware requirements
- Easy to design and understand

Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic

Memory-Memory Architecture Pros and Cons

Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)

Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required

Memory-Register Architecture Pros and Cons

Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density

Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers

Register-Memory Architectures

Effect of the number of memory operands

Number of memory addresses   Max. number of operands   Examples
0                            3                         SPARC, MIPS, PowerPC, ALPHA
1                            2                         Intel 80X86, Motorola 68000
2                            2                         VAX (also has 3 operands format)
3                            3                         VAX (also has 2 operands format)

Memory Address

Interpreting Memory Addressing

The address of a word matches the byte address of one of its 4 bytes

The addresses of sequential words differ by 4 (word size in bytes)

Word addresses are multiples of 4 (alignment restriction)

Machines that use the address of the leftmost byte as the word address are called "Big Endian" and those that use the rightmost byte are called "Little Endian"

Misalignment complicates memory access and causes programs to run slower (some machines do not allow misaligned memory access at all)

Byte ordering can be a problem when exchanging data among different machines

Byte addresses affect array index calculation, which must account for word addressing and the offset within the word

Object addressed   Aligned at byte offsets   Misaligned at byte offsets
Byte               0,1,2,3,4,5,6,7           Never
Half word          0,2,4,6                   1,3,5,7
Word               0,4                       1,2,3,5,6,7
Double word        0                         1,2,3,4,5,6,7
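Both ideas, alignment and byte ordering, can be checked with a few lines of code. This is a minimal sketch: it assumes a 4-byte word, treats an access as aligned when the address is a multiple of the object size, and uses Python's built-in `int.to_bytes` to show the two byte orders; the helper names and the sample value are invented for the illustration.

```python
def is_aligned(address, size):
    """An object of `size` bytes is aligned when its address is a multiple of size."""
    return address % size == 0


def word_bytes(value, byteorder):
    """Bytes of a 32-bit word in memory order, lowest address first."""
    return list(value.to_bytes(4, byteorder))


# Byte offsets 0..7 at which a 4-byte word would be misaligned.
misaligned_word_offsets = [off for off in range(8) if not is_aligned(off, 4)]
print(misaligned_word_offsets)

value = 0x0A0B0C0D
print([hex(b) for b in word_bytes(value, "big")])     # leftmost byte at the lowest address
print([hex(b) for b in word_bytes(value, "little")])  # rightmost byte at the lowest address
```

The two byte lists contain the same bytes in opposite order, which is exactly the problem that arises when machines of different endianness exchange raw data.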

Addressing Modes

Addressing modes refer to how to specify the location of an operand (effective address)

Addressing modes have the ability to:

- Significantly reduce instruction counts

- Increase the average CPI

- Increase the complexity of building a machine

The VAX machine is used for benchmark data since it supports a wide range of memory addressing modes

Famous addressing modes can be classified based on:

- the source of the data, into register, immediate or memory

- the address calculation, into direct and indirect

An indexed addressing mode is usually provided to allow efficient implementation of loops and array access

Example of Addressing Modes

Addressing mode | Example | Meaning | When used
Register | Add R4, R3 | Regs[R4] <- Regs[R4] + Regs[R3] | When a value is in a register
Immediate | Add R4, #3 | Regs[R4] <- Regs[R4] + 3 | For constants
Displacement | Add R4, 100(R1) | Regs[R4] <- Regs[R4] + Mem[100 + Regs[R1]] | Accessing local variables
Register indirect | Add R4, (R1) | Regs[R4] <- Regs[R4] + Mem[Regs[R1]] | Accessing using a pointer or a computed address
Indexed | Add R4, (R1 + R2) | Regs[R4] <- Regs[R4] + Mem[Regs[R1] + Regs[R2]] | Sometimes useful in array addressing: R1 = base of the array; R2 = index amount
Direct or absolute | Add R4, (1001) | Regs[R4] <- Regs[R4] + Mem[1001] | Sometimes useful for accessing static data; address constant may need to be large
Memory indirect or memory deferred | Add R4, @(R3) | Regs[R4] <- Regs[R4] + Mem[Mem[Regs[R3]]] | If R3 is the address of the pointer p, then mode yields *p
Autoincrement | Add R4, (R2)+ | Regs[R4] <- Regs[R4] + Mem[Regs[R2]]; Regs[R2] <- Regs[R2] + d | Useful for stepping through arrays within a loop. R2 points to start of the array; each reference increments R2 by d
Autodecrement | Add R4, -(R2) | Regs[R2] <- Regs[R2] - d; Regs[R4] <- Regs[R4] + Mem[Regs[R2]] | Same use as autoincrement. Autodecrement/increment can also act as push/pop to implement a stack
Scaled | Add R4, 100(R2)[R3] | Regs[R4] <- Regs[R4] + Mem[100 + Regs[R2] + Regs[R3] * d] | Used to index arrays
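The "Meaning" column of the table can be read as a recipe for fetching an operand. The following is a minimal sketch of that recipe for the modes without side effects; the function `operand`, its parameter names, and the register and memory contents are all assumptions made for illustration.

```python
def operand(mode, regs, mem, r=None, x=None, disp=0, d=4):
    """Return the source operand value for a few addressing modes.

    r and x name registers, disp is the constant in the instruction,
    d is the size in bytes of the data item (used by the scaled mode).
    """
    if mode == "register":
        return regs[r]
    if mode == "immediate":
        return disp
    if mode == "displacement":
        return mem[disp + regs[r]]
    if mode == "register_indirect":
        return mem[regs[r]]
    if mode == "indexed":
        return mem[regs[r] + regs[x]]
    if mode == "direct":
        return mem[disp]
    if mode == "memory_indirect":
        return mem[mem[regs[r]]]
    if mode == "scaled":
        return mem[disp + regs[r] + regs[x] * d]
    raise ValueError(mode)


# Hypothetical machine state: a few registers and a sparse memory.
regs = {"R1": 200, "R2": 8, "R3": 2}
mem = {200: 300, 208: 11, 300: 22, 308: 44, 1001: 33}

print(operand("displacement", regs, mem, r="R1", disp=100))
print(operand("memory_indirect", regs, mem, r="R1"))
```

Autoincrement and autodecrement are left out because they also update the register, which would need the function to return or mutate machine state.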

Addressing Mode for Signal Processing

DSP offers special addressing modes to better serve popular algorithms

Modulo addressing
- Since DSP deals with continuous data streams, circular buffers are widely used
- Circular or modulo addressing allows automatic increment and decrement and resets the pointer when reaching the end of the buffer

Reverse addressing
- Resulting address is the reverse (bit) order of the current address
- Reverse addressing mode expedites the access, which otherwise requires a number of logical instructions or extra memory accesses

Fast Fourier Transform: index (binary) -> reversed index (binary)
0 (000) -> 0 (000)
1 (001) -> 4 (100)
2 (010) -> 2 (010)
3 (011) -> 6 (110)
4 (100) -> 1 (001)
5 (101) -> 5 (101)
6 (110) -> 3 (011)
7 (111) -> 7 (111)

Special features require either hand coding or a compiler that uses such features (GNU would not be a good choice)
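Both DSP addressing modes reduce to a few lines of arithmetic. The sketch below assumes hypothetical helper names (`bit_reverse`, `modulo_next`) and a made-up buffer at address 100; real DSPs do this in the address-generation hardware.

```python
def bit_reverse(index, bits):
    """Reverse the low `bits` bits of index, as used to reorder FFT data."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


def modulo_next(pointer, start, length, step=1):
    """Advance a pointer inside a circular buffer [start, start + length)."""
    return start + (pointer - start + step) % length


print([bit_reverse(i, 3) for i in range(8)])
print(modulo_next(103, 100, 4))   # wraps from the last slot back to the first
```

Without these modes, each access would need the explicit shift, mask, and compare instructions that the loop bodies above spell out.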

Operations of the Computer Hardware

"There must certainly be instructions for performing the fundamental arithmetic operations."

Burks, Goldstine, and von Neumann, 1947

Assembly language is a symbolic representation of what the processor actually understands

The MIPS assembler allows only one instruction per line and ignores comments following # until the end of the line

Example:

Translation of a segment of a C program to MIPS assembly instructions

C: f = (g + h) - (i + j)

MIPS:

add t0, g, h # temp variable t0 contains "g + h"
add t1, i, j # temp variable t1 contains "i + j"
sub f, t0, t1 # f = t0 - t1 = (g + h) - (i + j)

Operations in the Instruction Set

Operator type | Examples
Arithmetic and logical | Integer arithmetic and logical operations: add, and, subtract, or
Data transfer | Loads-stores (move instructions on machines with memory addressing)
Control | Branch, jump, procedure call and return, trap
System | Operating system call, virtual memory management instructions
Floating point | Floating point instructions: add, multiply
Decimal | Decimal add, decimal multiply, decimal-to-character conversion
String | String move, string compare, string search
Graphics | Pixel operations, compression/decompression operations

Arithmetic, logical, data transfer, and control are almost standard categories for all machines

System instructions are required for multi-programming environments, although support for system functions varies

Decimal and string instructions can be primitives, e.g. IBM 360 and the VAX

Support for floating point, decimal, string, and graphics is optional and sometimes provided via a co-processor

Some machines rely on the compiler to synthesize special operations, such as string handling, from simpler instructions

Operations for Media & Signal Processing

Single instruction multiple data (SIMD) and vector instructions are often supported in DSPs, which are commonly used in multimedia and signal processing applications

Partitioned Add (integer)
- Perform multiple 16-bit additions on a 64-bit ALU, since most data are narrow
- Increases ALU throughput for multimedia applications

Paired single operations (float)
- Allow the same register to act as two operands to the same operation
- Handy in dealing with vertices and coordinates

Multiply and accumulate
- Very handy for calculating dot products of vectors (signal processing) and matrix multiplication
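A partitioned add treats one wide word as several narrow lanes whose carries never cross lane boundaries. The sketch below assumes four 16-bit lanes in a 64-bit word; the names `pack`, `unpack`, `partitioned_add`, and `multiply_accumulate` are illustrative, not taken from any real instruction set.

```python
LANES = 4
LANE_BITS = 16
LANE_MASK = (1 << LANE_BITS) - 1


def pack(values):
    """Pack four 16-bit values into one 64-bit word, lane 0 in the low bits."""
    word = 0
    for i, v in enumerate(values):
        word |= (v & LANE_MASK) << (i * LANE_BITS)
    return word


def unpack(word):
    """Split a 64-bit word back into its four 16-bit lanes."""
    return [(word >> (i * LANE_BITS)) & LANE_MASK for i in range(LANES)]


def partitioned_add(a, b):
    """Add lane by lane; each lane wraps around on overflow independently."""
    return pack([x + y for x, y in zip(unpack(a), unpack(b))])


def multiply_accumulate(xs, ys):
    """Dot product written as a chain of multiply-accumulate steps."""
    acc = 0
    for x, y in zip(xs, ys):
        acc += x * y
    return acc


print(unpack(partitioned_add(pack([1, 2, 3, 0xFFFF]), pack([10, 20, 30, 1]))))
print(multiply_accumulate([1, 2, 3], [4, 5, 6]))
```

The last lane overflows and wraps to zero without disturbing its neighbours, which is the property that distinguishes a partitioned add from an ordinary 64-bit add.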

Frequency of Operations Usage

Make the common case fast by focusing on these operations

The most widely executed instructions are the simple operations of an instruction set

The following is the average usage in SPECint92 on Intel 80x86

Rank | 80x86 instruction | Integer average (% total executed)
1 | Load | 22%
2 | Conditional branch | 20%
3 | Compare | 16%
4 | Store | 12%
5 | Add | 8%
6 | And | 6%
7 | Sub | 5%
8 | Move register-register | 4%
9 | Call | 1%
10 | Return | 1%
Total | | 96%

Control Flow Instructions

Jump: for unconditional change in the control flow

Branch: for conditional change in the control flow

Procedure calls and returns

Data is based on SPEC on Alpha

Destination Address Definition

Relative addressing with respect to the program counter proved to be the best choice for forward and backward branching or jumps (load address independent)

To allow for dynamic loading of library routines, register indirect addressing allows addresses to be loaded in special registers (e.g. virtual functions in C++ and system calls in a case statement)

Data is based on SPEC on Alpha

Condition Evaluation

Compare-and-branch can be efficient if the majority of conditions are comparisons with zero

Remember to focus on the common case

Based on SPEC on MIPS

Frequency of Types of Comparison

Data is based on SPEC on Alpha

Different benchmarks and machines set new design priorities

DSPs support a repeat instruction for for-loops (vectors) using registers

Supporting Procedures

Execution of a procedure follows these steps:
- Store parameters in a place accessible to the procedure
- Transfer control to the procedure
- Acquire the storage resources needed for the procedure
- Perform the desired task
- Store the result value in a place accessible to the calling program
- Return control to the point of origin

The hardware provides a program counter to trace instruction flow and manage transfer of control

Parameter Passing
- Registers can be used for passing a small number of parameters
- A stack is used to spill registers of the current context and make room for the called procedure to run, and to allow large parameters to be passed

Storage of machine state can be performed by caller or callee

Handling of shared variables is important to ensure correct semantics and thus requires clear specifications in the library interface

Global variables stored in registers need careful handling

Type and Size of Operands

The type of an operand is designated by encoding it in the instruction's operation code

The type of an operand, e.g. single-precision float, effectively gives its size

Common operand types include character, half-word and word-size integer, and single- and double-precision floating point

Characters are almost always in ASCII, integers are in 2's complement, and floating point is in IEEE 754

The 16-bit Unicode used in Java is gaining popularity due to its support for international character sets

For business applications, some architectures support a decimal format in binary coded decimal (BCD)

Depending on the size of the word, the complexity of handling different operand types differs

DSPs offer fixed-point data types as a low-cost substitute for floating-point arithmetic, and allow sharing a single exponent among multiple numbers

For graphics applications, vertex and pixel operands are added features
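Two of the encodings named above, 2's complement integers and binary coded decimal, are easy to show in a minimal sketch. The function names `twos_complement` and `to_bcd` are assumptions chosen for this example.

```python
def twos_complement(value, bits):
    """Return the bit pattern that stores `value` in a `bits`-wide 2's complement field."""
    return value & ((1 << bits) - 1)


def to_bcd(n):
    """Pack a non-negative integer as BCD, one decimal digit per 4 bits."""
    result = 0
    shift = 0
    while True:
        n, digit = divmod(n, 10)
        result |= digit << shift
        shift += 4
        if n == 0:
            return result


print(hex(twos_complement(-1, 8)))
print(hex(to_bcd(1947)))
```

In BCD the hexadecimal dump of the stored value reads like the decimal number itself, which is what makes decimal-to-character conversion cheap for business data.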

Size of Operands

Double-word data type is used for double-precision floating point operations and for address storage in machines with a 64-bit wide address bus

Words are used for integer operations and for 32-bit address bus machines

Because of the mix in SPEC, word and double-word data types dominate

Frequency of reference by size based on SPEC2000 on Alpha

Instruction Representation

Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary, or base 2)

Numbers are stored in computers as a series of high and low electronic signals (binary numbers)

Binary digits are called bits and are considered the atom of computing

Each piece of an instruction is a number, and placing these numbers together forms the instruction

The assembler translates the symbolic assembly instructions into machine language instructions (machine code)

Example:

Assembly: add $t0, $s1, $s2

M/C language (decimal): 0 | 17 | 18 | 8 | 0 | 32

M/C language (binary): 000000 | 10001 | 10010 | 01000 | 00000 | 100000

Field widths: 6 bits | 5 bits | 5 bits | 5 bits | 5 bits | 6 bits

Note: the MIPS compiler by default maps $s0, ..., $s7 to registers 16-23 and $t0, ..., $t7 to registers 8-15
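Placing the field numbers together is a matter of shifting each one into position. The sketch below assumes the standard MIPS R-format field widths (6, 5, 5, 5, 5, 6 bits) and encodes an add with rs = 17, rt = 18, rd = 8; the function name `encode_r` is an assumption.

```python
def encode_r(op, rs, rt, rd, shamt, funct):
    """Assemble a 32-bit R-format word from its six fields."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


# op = 0 and funct = 32 select add; registers 17, 18, and 8 are the operands.
word = encode_r(0, 17, 18, 8, 0, 32)

print(f"{word:032b}")
print(f"0x{word:08X}")
```

Reading the 32-bit string in groups of 6, 5, 5, 5, 5, and 6 bits gives back the six decimal field values.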

Encoding an Instruction Set

Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation

The operation is typically specified in one field called the opcode

The addressing mode for the operand can be encoded with the operation, or specified through a separate identifier when a large number of modes is supported

The architecture must balance several competing factors:
- Desire to support as many registers and addressing modes as possible
- Effect of operand specification on the size of the instruction (program)
- Desire to simplify instruction fetching and decoding during execution

Fixed-size instruction encoding simplifies the CPU design while limiting the addressing modes supported

An architect who cares about code size can use variable-size encoding

A hybrid approach is to allow variability by supporting multiple-sized instructions

Encoding Examples

MIPS Instruction format

Register-format instructions

Fields: op | rs | rt | rd | shamt | funct
Widths: 6 bits | 5 bits | 5 bits | 5 bits | 5 bits | 6 bits

op: Basic operation of the instruction, traditionally called opcode
rs: The first register source operand
rt: The second register source operand
rd: The register destination operand; it gets the result of the operation
shamt: Shift amount
funct: This field selects the specific variant of the operation of the op field

Immediate-type instructions

Fields: op | rs | rt | address
Widths: 6 bits | 5 bits | 5 bits | 16 bits

Some instructions need longer fields than provided, for large-value constants

The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register

Example: lw $t0, 32($s3) # Temporary register $t0 gets A[8]

Instruction | Format | op | rs | rt | rd | shamt | funct | address
add | R | 0 | reg | reg | reg | 0 | 32 | N/A
sub | R | 0 | reg | reg | reg | 0 | 34 | N/A
lw | I | 35 | reg | reg | N/A | N/A | N/A | address
sw | I | 43 | reg | reg | N/A | N/A | N/A | address
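The immediate format can be sketched the same way as the register format, with the extra step that the 16-bit address field is sign-extended when decoded. The function names `encode_i` and `decode_i` are assumptions, and the sample is a load word with op = 35, base register 19, target register 8, and offset 32.

```python
def encode_i(op, rs, rt, imm):
    """Assemble a 32-bit I-format word; imm is stored in 16 bits."""
    return (op << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


def decode_i(word):
    """Split an I-format word into (op, rs, rt, imm) with imm sign-extended."""
    imm = word & 0xFFFF
    if imm & 0x8000:          # top bit set means a negative offset
        imm -= 0x10000
    return (word >> 26) & 0x3F, (word >> 21) & 0x1F, (word >> 16) & 0x1F, imm


word = encode_i(35, 19, 8, 32)
print(f"0x{word:08X}")
print(decode_i(encode_i(35, 19, 8, -4)))
```

The sign extension is what gives the offset its range of -32768 to +32767 bytes around the base register.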

The Stored Program Concept

Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept

Today's computers are built on two key principles:
- Instructions are represented as numbers
- Programs can be stored in memory to be read or written just like numbers

The power of the concept: memory can contain
- the source code for an editor
- the compiled m/c code for the editor
- the text that the compiled program is using
- the compiler that generated the code

Figure: a processor connected to a memory that holds an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and the source code in C for the editor program

Compiling if-then-else in MIPS

Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiler's code for the following C if statement?

if (i == j) f = g + h; else f = g - h;

Figure: flowchart testing i == j; the i = j path computes f = g + h, the i ≠ j path (Else:) computes f = g - h, and both paths meet at Exit:

MIPS:

bne $s3, $s4, Else # go to Else if i ≠ j
add $s0, $s1, $s2 # f = g + h (skipped if i ≠ j)
j Exit
Else: sub $s0, $s1, $s2 # f = g - h (skipped if i = j)
Exit:

Typical Compilation

Major Types of Optimization

Optimization name | Explanation | Frequency
High level | At or near source level; machine independent |
Procedure integration | Replace procedure call by procedure body | N.M.
Local | Within straight-line code |
Common subexpression elimination | Replace two instances of the same computation by single copy | 18%
Constant propagation | Replace all instances of a variable that is assigned a constant with the constant | 22%
Stack height reduction | Rearrange expression tree to minimize resources needed for expression evaluation | N.M.
Global | Across a branch |
Global common subexpression elimination | Same as local, but this version crosses branches | 13%
Copy propagation | Replace all instances of a variable A that has been assigned X (i.e., A = X) with X | 11%
Code motion | Remove code from a loop that computes the same value each iteration of the loop | 16%
Induction variable elimination | Simplify/eliminate array-addressing calculations within loops | 2%
Machine-dependent | Depends on machine knowledge |
Strength reduction | Many examples, such as replace multiply by a constant with adds and shifts | N.M.
Pipeline scheduling | Reorder instructions to improve pipeline performance | N.M.
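Strength reduction, the machine-dependent example named in the table, can be illustrated with a multiply by the constant 10. This is a sketch of the idea only; the function names are assumptions, and a real compiler applies the rewrite to machine instructions rather than source code.

```python
def times_10_multiply(x):
    """The original form: one multiply instruction."""
    return x * 10


def times_10_shift_add(x):
    """The reduced form: 10*x = 8*x + 2*x, using two shifts and one add."""
    return (x << 3) + (x << 1)


print(times_10_multiply(7), times_10_shift_add(7))
```

Whether the shift-and-add form is actually faster depends on the relative cost of multiply, shift, and add on the target machine, which is why the table classifies it as machine-dependent.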

Effect of Compiler Optimization

Figure: measurements by program and compiler optimization level

Level 0: non-optimized code

Level 1: local optimization

Level 2: global optimization, s/w pipelining

Level 3: adds procedure integration

Compiler Support for Multimedia Instr.

Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics)

Intel added a new set of instructions called Streaming SIMD Extension

A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer

Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations
- Strided addressing allows memory access in increments larger than one
- Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data

Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup

Such limited support for vector processing makes the use of vectorizing compiler optimization unpopular and restricts its scope to hand-coded routines

SIMD instructions on MMX and AltiVec tend to be solutions, not primitives
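The two vector addressing styles differ only in where the element addresses come from. In the sketch below memory is modeled as a Python list, and the names `strided_load`, `gather`, and `scatter` are assumptions for illustration.

```python
def strided_load(memory, base, stride, count):
    """Load `count` elements starting at `base`, `stride` locations apart."""
    return [memory[base + i * stride] for i in range(count)]


def gather(memory, addresses):
    """Load from an explicit list of addresses (an index vector)."""
    return [memory[a] for a in addresses]


def scatter(memory, addresses, values):
    """Store each value to the matching address in the index vector."""
    for a, v in zip(addresses, values):
        memory[a] = v


memory = list(range(100, 116))             # hypothetical 16-word memory
column = strided_load(memory, 1, 4, 3)     # e.g. one column of a 4-wide matrix
picked = gather(memory, [7, 0, 3])
scatter(memory, [7, 0, 3], [-1, -2, -3])
print(column, picked, memory[:8])
```

A strided load computes its addresses from a base and a step, while gather/scatter reads them from a stored vector, which is the sense in which it resembles register indirect mode.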

Starting a Program

Figure: C program -> Compiler -> Assembly language program -> Assembler -> Object: machine language module (together with Object: library routine, machine language) -> Linker -> Executable: machine language program -> Loader -> Memory

Linker
- Place code & data modules symbolically in memory
- Determine the address of data & instruction labels
- Patch both internal & external references

Object files for Unix typically contain:
- Header: size & position of components
- Text segment: machine code
- Data segment: static and dynamic variables
- Relocation info: identify absolute memory references
- Symbol table: name & location of labels, procedures, and variables
- Debugging info: mapping source to object code, break points, etc.

Loading Executable Program

Figure: memory layout, from high to low addresses
$sp -> 7fff ffff hex: Stack
Dynamic data
$gp -> 1000 8000 hex: Static data (starting at 1000 0000 hex)
pc -> 0040 0000 hex: Text
0: Reserved

To load an executable, the operating system follows these steps:
- Reads the executable file header to determine the size of the text and data segments
- Creates an address space large enough for the text and data
- Copies the instructions and data from the executable file into memory
- Copies the parameters (if any) to the main program onto the stack
- Initializes the machine registers and sets the stack pointer to the first free location
- Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program

Instruction Set Design Issues

Instruction set design issues include:
- Number of Addresses
- Flow of Control
- Operand Types
- Addressing Modes
- Instruction Types
- Instruction Formats

Number of Addresses

Four categories

3-address machines
- 2 for the source operands and one for the result

2-address machines
- One address doubles as source and result

1-address machines
- Accumulator machines
- Accumulator is used for one source and result

0-address machines
- Stack machines
- Operands are taken from the stack
- Result goes onto the stack

Number of Addresses (cont.)

Three-address machines
- Two for the source operands, one for the result
- RISC processors use three addresses
- Sample instructions

add dest,src1,src2 ; M(dest) = [src1] + [src2]
sub dest,src1,src2 ; M(dest) = [src1] - [src2]
mult dest,src1,src2 ; M(dest) = [src1] * [src2]

Three addresses: Operand 1, Operand 2, Result

Example: a = b + c

Three-address instruction formats are not common because they require a relatively long instruction format to hold the three address references

Number of Addresses (cont.)

Example

C statement

A = B + C * D - E + F + A

Equivalent code:

mult T,C,D ; T = C*D
add T,T,B ; T = B + C*D
sub T,T,E ; T = B + C*D - E
add T,T,F ; T = B + C*D - E + F
add A,T,A ; A = B + C*D - E + F + A

Number of Addresses (cont.)

Two-address machines
- One address doubles (for source operand & result)
- Last example makes a case for it: address T is used twice
- Sample instructions

load dest,src ; M(dest) = [src]
add dest,src ; M(dest) = [dest] + [src]
sub dest,src ; M(dest) = [dest] - [src]
mult dest,src ; M(dest) = [dest] * [src]

Two addresses: one address doubles as operand and result

Example: a = a + b

The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation

Number of Addresses (cont.)

Example

C statement

A = B + C * D - E + F + A

Equivalent code:

load T,C ; T = C
mult T,D ; T = C*D
add T,B ; T = B + C*D
sub T,E ; T = B + C*D - E
add T,F ; T = B + C*D - E + F
add A,T ; A = B + C*D - E + F + A

Number of Addresses (cont.)

One-address machines
- Use a special set of registers called accumulators, which specify one source operand & receive the result
- Called accumulator machines
- Sample instructions

load addr ; accum = [addr]
store addr ; M[addr] = accum
add addr ; accum = accum + [addr]
sub addr ; accum = accum - [addr]
mult addr ; accum = accum * [addr]

Number of Addresses (cont.)

Example

C statement

A = B + C * D - E + F + A

Equivalent code:

load C ; load C into accum
mult D ; accum = C*D
add B ; accum = C*D + B
sub E ; accum = B + C*D - E
add F ; accum = B + C*D - E + F
add A ; accum = B + C*D - E + F + A
store A ; store accum contents in A
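The one-address sequence can be checked by running it on a tiny interpreter. This is a minimal sketch: `run_accumulator`, the tuple encoding of instructions, and the initial memory values are assumptions made for the example.

```python
def run_accumulator(program, memory):
    """Execute (opcode, address) pairs on a single-accumulator machine."""
    accum = 0
    for op, addr in program:
        if op == "load":
            accum = memory[addr]
        elif op == "store":
            memory[addr] = accum
        elif op == "add":
            accum += memory[addr]
        elif op == "sub":
            accum -= memory[addr]
        elif op == "mult":
            accum *= memory[addr]
        else:
            raise ValueError(op)
    return memory


program = [("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
           ("add", "F"), ("add", "A"), ("store", "A")]
memory = {"A": 7, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}   # hypothetical values
run_accumulator(program, memory)
print(memory["A"])
```

Every instruction names a single memory address because the accumulator is the implied second operand and destination.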

Number of Addresses (cont.)

Zero-address machines
- Stack supplies operands and receives the result
- Special instructions to load and store use an address
- Called stack machines (Ex: HP3000, Burroughs B5500)
- Sample instructions

push addr ; push([addr])
pop addr ; pop([addr])
add ; push(pop + pop)
sub ; push(pop - pop)
mult ; push(pop * pop)

Number of Addresses (cont.)

Example

C statement

A = B + C * D - E + F + A

Equivalent code:

push E
push C
push D
mult
push B
add
sub
push F
add
push A
add
pop A
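A small interpreter makes the operand order of the zero-address code explicit. The sketch below assumes that `sub` computes (first value popped) minus (second value popped), matching push(pop - pop) read left to right; `run_stack`, the tuple encoding, and the initial memory values are assumptions.

```python
def run_stack(program, memory):
    """Execute a zero-address program; only push and pop carry an address."""
    stack = []
    for instr in program:
        op = instr[0]
        if op == "push":
            stack.append(memory[instr[1]])
        elif op == "pop":
            memory[instr[1]] = stack.pop()
        else:
            first = stack.pop()      # top of stack
            second = stack.pop()     # element below it
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)
            elif op == "mult":
                stack.append(first * second)
            else:
                raise ValueError(op)
    return stack


program = [("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
           ("push", "B"), ("add",), ("sub",), ("push", "F"), ("add",),
           ("push", "A"), ("add",), ("pop", "A")]
memory = {"A": 7, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}   # hypothetical values
leftover = run_stack(program, memory)
print(memory["A"], leftover)
```

Pushing E first is what lets the later sub find B + C*D on top of the stack and E beneath it.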

Load/Store Architecture

Instructions expect operands in internal processor registers

Special LOAD and STORE instructions move data between registers and memory

RISC uses this architecture

Reduces instruction length

Load/Store Architecture (cont.)

Sample instructions

load Rd,addr ; Rd = [addr]
store addr,Rs ; (addr) = Rs
add Rd,Rs1,Rs2 ; Rd = Rs1 + Rs2
sub Rd,Rs1,Rs2 ; Rd = Rs1 - Rs2
mult Rd,Rs1,Rs2 ; Rd = Rs1 * Rs2

Load/Store Architecture (cont.)

Example

C statement

A = B + C * D - E + F + A

Equivalent code:

load R1,B
load R2,C
load R3,D
load R4,E
load R5,F
load R6,A
mult R2,R2,R3
add R2,R2,R1
sub R2,R2,R4
add R2,R2,R5
add R2,R2,R6
store A,R2

Flow of Control

Default is sequential flow

Several instructions alter this default execution

Branches
- Unconditional
- Conditional
- Delayed branches

Procedure calls
- Delayed procedure calls

Flow of Control (cont.)

Branches

Unconditional
- Absolute address
- PC-relative
  * Target address is specified relative to PC contents
  * Relocatable code

Example: MIPS
- Absolute address: j target
- PC-relative: b target

Flow of Control (cont.)

Branches

Conditional
- Jump is taken only if the condition is met

Two types
- Set-Then-Jump
  * Condition testing is separated from branching
  * Condition code registers are used to convey the condition test result
  * Condition code registers keep a record of the status of the last ALU operation, such as the overflow condition
- Example: Pentium code

cmp AX,BX ; compare AX and BX
je target ; jump if equal

Flow of Control (cont.)

- Test-and-Jump
  * Single instruction performs condition testing and branching
- Example: MIPS instruction

beq Rsrc1,Rsrc2,target ; jumps to target if Rsrc1 = Rsrc2

Delayed branching
- Control is transferred after executing the instruction that follows the branch instruction
- This instruction slot is called the delay slot
- Improves efficiency
- Highly pipelined RISC processors support it

Flow of Control (cont.)

Procedure calls

Facilitate modular programming

Require two pieces of information to return
- End of procedure
  * Pentium uses the ret instruction
  * MIPS uses the jr instruction
- Return address
  * In a (special) register: MIPS allows any general-purpose register
  * On the stack: Pentium

Flow of Control (cont.)

Figure: delay slot

Parameter Passing

Two basic techniques

Register-based (e.g., PowerPC, MIPS)
- Internal registers are used
  * Faster
  * Limits the number of parameters
  * Awkward for recursive procedures

Stack-based (e.g., Pentium)
- Stack is used
  * More general

2. Operand Types

Instructions support basic data types
- Characters
- Integers
- Floating-point

Instruction overload
- Same instruction for different data types
- Example: Pentium

mov AL,address ; loads an 8-bit value
mov AX,address ; loads a 16-bit value
mov EAX,address ; loads a 32-bit value

Operand Types (cont.)

Separate instructions
- Instructions specify the operand size
- Example: MIPS

lb Rdest,address ; loads a byte
lh Rdest,address ; loads a halfword (16 bits)
lw Rdest,address ; loads a word (32 bits)
ld Rdest,address ; loads a doubleword (64 bits)

Similar instructions exist for store

3. Addressing Modes

How the operands are specified

Operands can be in three places
- Registers
  * Register addressing mode
- Part of instruction
  * Constant
  * Immediate addressing mode
  * All processors support these two addressing modes
- Memory
  * Difference between RISC and CISC
  * CISC supports a large variety of addressing modes
  * RISC follows load/store architecture

4. Instruction Types

Several types of instructions

Data movement
- Pentium: mov dest,src
- Some do not provide direct data movement instructions
- Indirect data movement

add Rdest,Rsrc,0 ; Rdest = Rsrc + 0

Arithmetic and Logical
- Arithmetic
  * Integer and floating-point, signed and unsigned
  * add, subtract, multiply, divide
- Logical
  * and, or, not, xor

Instruction Types (cont.)

Condition code bits

S: Sign bit (0 = +, 1 = -)

Z: Zero bit (0 = nonzero, 1 = zero)

O: Overflow bit (0 = no overflow, 1 = overflow)

C: Carry bit (0 = no carry, 1 = carry)

Example: Pentium

cmp count,25 ; compare count to 25 (subtract 25 from count)
je target ; jump if equal
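How a compare sets the four condition code bits can be sketched by modeling the subtraction on fixed-width values. The function `compare`, its 32-bit default width, and the convention that C records a borrow out of the unsigned subtraction are assumptions made for this example.

```python
def compare(a, b, bits=32):
    """Subtract b from a, discard the result, and return the S, Z, O, C bits."""
    mask = (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "S": int(bool(result & sign_bit)),
        "Z": int(result == 0),
        # Signed overflow: operands differ in sign and the result's sign differs from a's.
        "O": int(bool((a ^ b) & (a ^ result) & sign_bit)),
        # Carry is modeled as the borrow of the unsigned subtraction.
        "C": int(a < b),
    }


def jump_if_equal(flags):
    """The branch half of set-then-jump: taken when the zero bit is set."""
    return flags["Z"] == 1


print(compare(25, 25), jump_if_equal(compare(25, 25)))
print(compare(3, 25))
```

The compare and the jump communicate only through the flag bits, which is why the two instructions can be separated by other instructions that leave the flags unchanged.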

Instruction Types (cont.)

Flow control and I/O instructions

Flow control
- Branch
- Procedure call
- Interrupts

I/O instructions
- Memory-mapped I/O
  * Most processors support memory-mapped I/O
  * No separate instructions for I/O
- Isolated I/O
  * Pentium supports isolated I/O
  * Separate I/O instructions

in AX,io_port ; read from an I/O port
out io_port,AX ; write to an I/O port

5 Instruction 0ormats5 Instruction 0ormats

Two types

Fixed-length
- Used by RISC processors
- 32-bit RISC processors use 32-bit wide instructions
  - Examples: SPARC, MIPS, PowerPC

Variable-length
- Used by CISC processors
- Memory operands need more bits to specify

Opcode
- Major and exact operation

Examples of Instruction Formats

RISC (Reduced Instruction Set Computer)
versus
CISC (Complex Instruction Set Computer)

RISC vs CISC

The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.

RISC systems access memory only with explicit load and store instructions.

In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.

The difference between CISC and RISC becomes evident through the basic computer performance equation:

time/program = time/cycle x cycles/instruction x instructions/program

RISC systems shorten execution time by reducing the clock cycles per instruction.

CISC systems improve performance by reducing the number of instructions per program.
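
The trade-off can be made concrete with the basic performance equation, execution time = instructions per program x cycles per instruction x time per cycle. The sketch below is only an illustration: the function name `cpu_time` and every number in it are assumptions chosen for the example, not measurements from any real machine.

```python
def cpu_time(instructions, cycles_per_instruction, seconds_per_cycle):
    """Execution time = instruction count x CPI x clock cycle time."""
    return instructions * cycles_per_instruction * seconds_per_cycle


# Hypothetical workloads: the RISC version executes more instructions,
# but each one needs fewer clock cycles on average.
risc_seconds = cpu_time(instructions=1_500_000,
                        cycles_per_instruction=1.2,
                        seconds_per_cycle=1e-9)
cisc_seconds = cpu_time(instructions=1_000_000,
                        cycles_per_instruction=4.0,
                        seconds_per_cycle=1e-9)

print(f"RISC: {risc_seconds * 1e3:.2f} ms, CISC: {cisc_seconds * 1e3:.2f} ms")
```

With other assumed values the comparison can go the other way, which is why all three factors have to be considered together.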


The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.

The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.

With fixed-length instructions, RISC lends itself to pipelining and speculative execution.

Consider the program fragments:

CISC:
mov ax, 10
mov bx, 5
mul bx, ax

RISC:
mov ax, 0
mov bx, 10
mov cx, 5
Begin: add ax, bx
loop Begin

The total clock cycles for the CISC version might be:
(2 movs x 1 cycle) + (1 mul x 30 cycles) = 32 cycles

While the clock cycles for the RISC version are:
(3 movs x 1 cycle) + (5 adds x 1 cycle) + (5 loops x 1 cycle) = 13 cycles

With the RISC clock cycle being shorter, RISC gives us much faster execution speeds.
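
The cycle comparison for the two fragments can be tallied mechanically. The sketch below assumes, for illustration only, that mov, add and loop each cost 1 clock cycle, that mul costs 30, and that the RISC loop body runs 5 times; the names `CYCLES` and `total_cycles` are hypothetical.

```python
# Assumed cost of each instruction in clock cycles (illustrative values).
CYCLES = {"mov": 1, "add": 1, "loop": 1, "mul": 30}


def total_cycles(instruction_counts):
    """Sum cycles over a mapping of mnemonic -> number of times executed."""
    return sum(CYCLES[op] * count for op, count in instruction_counts.items())


# CISC fragment: two moves and one multiply.
cisc_cycles = total_cycles({"mov": 2, "mul": 1})

# RISC fragment: three moves, then add/loop executed five times each.
risc_cycles = total_cycles({"mov": 3, "add": 5, "loop": 5})

print(cisc_cycles, risc_cycles)
```

The tally counts clock cycles only; the wall-clock result also depends on the length of each machine's clock cycle.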


Because of their load-store ISAs, RISC architectures require a large number of CPU registers.

These registers provide fast access to data during sequential program execution.

They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.

Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.

This is how registers can be overlapped in a RISC system.

The current window pointer (CWP) points to the active register window.
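
A minimal sketch of overlapping register windows follows. It assumes a layout of 8 "in", 8 "local" and 8 "out" registers per window, with the caller's "out" registers physically shared with the callee's "in" registers; the class `RegisterWindows`, the window count, and the direction in which the CWP moves are illustrative assumptions, and window overflow is not handled.

```python
class RegisterWindows:
    """Toy register file with overlapping windows (no overflow handling)."""

    def __init__(self, n_windows=4):
        self.n_windows = n_windows
        # Each window contributes 16 new physical registers (in + local);
        # its 8 "out" registers are the next window's "in" registers.
        self.phys = [0] * (n_windows * 16)
        self.cwp = 0  # current window pointer

    def _index(self, reg):
        # reg 0-7: in, 8-15: local, 16-23: out
        if not 0 <= reg < 24:
            raise IndexError("window register must be in 0..23")
        return (self.cwp * 16 + reg) % len(self.phys)

    def read(self, reg):
        return self.phys[self._index(reg)]

    def write(self, reg, value):
        self.phys[self._index(reg)] = value

    def call(self):
        self.cwp = (self.cwp + 1) % self.n_windows

    def ret(self):
        self.cwp = (self.cwp - 1) % self.n_windows


rw = RegisterWindows()
rw.write(16, 42)        # caller places an argument in its first "out" register
rw.call()               # CWP advances; no data is copied
arg = rw.read(0)        # callee sees the argument in its first "in" register
rw.write(0, arg + 1)    # callee leaves a result in the same shared register
rw.ret()
result = rw.read(16)    # caller reads the result from its "out" register
```

The point of the overlap is that the call passes a parameter without any memory traffic: only the window pointer changes.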


It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.

Some RISC systems provide more extravagant instruction sets than some CISC systems.

Some systems combine both approaches.

The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.

RISC
- Multiple register sets.
- Three operands per instruction.
- Parameter passing through register windows.
- Single-cycle instructions.
- Hardwired control.
- Highly pipelined.

CISC
- Single register set.
- One or two register operands per instruction.
- Parameter passing through memory.
- Multiple cycle instructions.
- Microprogrammed control.
- Less pipelined.

Continued....

RISC
- Simple instructions, few in number.
- Fixed length instructions.
- Complexity in compiler.
- Only LOAD/STORE instructions access memory.
- Few addressing modes.

CISC
- Many complex instructions.
- Variable length instructions.
- Complexity in microcode.
- Many instructions can access memory.
- Many addressing modes.

Summary

Instruction Set Design Issues

Instruction set design issues include:

Where are operands stored?
- registers, memory, stack, accumulator

How many explicit operands are there?
- 0, 1, 2, or 3

How is the operand location specified?
- register, immediate, indirect, . . .

What type & size of operands are supported?
- byte, int, float, double, string, vector . . .

What operations are supported?
- add, sub, mul, move, compare . . .

More About General Purpose Registers

Why do almost all new architectures use GPRs?

Registers are much faster than memory (even cache)
- Register values are available immediately
- When memory isn't ready, processor must wait ("stall")

Registers are convenient for variable storage
- Compiler assigns some variables just to registers
- More compact code since small fields specify registers (compared to memory addresses)

[Figure: storage hierarchy: Processor, Registers, Cache, Memory, Disk]

What Operations are Needed

Arithmetic + Logical
- Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
- Logical operation: AND, OR, XOR, NOT

Data Transfer - copy, load, store

Control - branch, jump, call, return

Floating Point - ADDF, MULF, DIVF (same as arithmetic but usually take bigger operands)

Decimal - ADDD, CONVERT

String - move, compare, search

Graphics - pixel and vertex, compression/decompression operations

Stacks Architecture: Pros and Cons

Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures

Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures

Accumulators Architecture: Pros and Cons

Pros
- Very low hardware requirements
- Easy to design and understand

Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic

Memory-Memory Architecture: Pros and Cons

Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)

Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required

Memory-Register Architecture: Pros and Cons

Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density

Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers

Page 25: Chapter 2: Advanced computer Architecture


Interpreting Memory Addressing

- The address of a word matches the byte address of one of its 4 bytes
- The addresses of sequential words differ by 4 (word size in bytes)
- Word addresses are multiples of 4 (alignment restriction)
- Machines that use the address of the leftmost byte as the word address are called "Big Endian" and those that use the rightmost byte are called "Little Endian"
- Misalignment complicates memory access and causes programs to run slower (some machines do not allow misaligned memory access at all)
- Byte ordering can be a problem when exchanging data among different machines
- Byte addressing affects array index calculation, which must account for word addressing and the offset within the word

Object addressed   Aligned at byte offsets   Misaligned at byte offsets
Byte               0, 1, 2, 3, 4, 5, 6, 7    Never
Half word          0, 2, 4, 6                1, 3, 5, 7
Word               0, 4                      1, 2, 3, 5, 6, 7
Double word        0                         1, 2, 3, 4, 5, 6, 7
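
Byte ordering and alignment can both be checked with a few lines of code. The sketch below uses Python's standard `struct` module to lay out one 32-bit word in both byte orders; the value `0x0A0B0C0D` and the helper name `is_aligned` are arbitrary choices for illustration.

```python
import struct

value = 0x0A0B0C0D

# ">I" packs a 32-bit unsigned integer most-significant byte first
# (big endian); "<I" packs it least-significant byte first (little endian).
big_endian = struct.pack(">I", value)
little_endian = struct.pack("<I", value)


def is_aligned(address, size):
    """An object of `size` bytes is aligned if its address is a multiple of size."""
    return address % size == 0


# Byte offsets within an 8-byte span at which a 4-byte word is misaligned.
misaligned_word_offsets = [a for a in range(8) if not is_aligned(a, 4)]

print(big_endian.hex(), little_endian.hex(), misaligned_word_offsets)
```

Two machines that exchange the raw bytes of `value` without agreeing on an order would each reconstruct a different number, which is the data-exchange problem noted above.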

Addressing Modes

Addressing modes refer to how to specify the location of an operand (effective address)

Addressing modes have the ability to:
- Significantly reduce instruction counts
- Increase the average CPI
- Increase the complexity of building a machine

The VAX machine is used for benchmark data since it supports a wide range of memory addressing modes

Common addressing modes can be classified based on:
- the source of the data: register, immediate or memory
- the address calculation: direct and indirect

An indexed addressing mode is usually provided to allow efficient implementation of loops and array access

Example of Addressing Modes

Register
  Example:   Add R4, R3
  Meaning:   Regs[R4] <- Regs[R4] + Regs[R3]
  When used: When a value is in a register

Immediate
  Example:   Add R4, #3
  Meaning:   Regs[R4] <- Regs[R4] + 3
  When used: For constants

Displacement
  Example:   Add R4, 100(R1)
  Meaning:   Regs[R4] <- Regs[R4] + Mem[100 + Regs[R1]]
  When used: Accessing local variables

Register indirect
  Example:   Add R4, (R1)
  Meaning:   Regs[R4] <- Regs[R4] + Mem[Regs[R1]]
  When used: Accessing using a pointer or a computed address

Indexed
  Example:   Add R4, (R1 + R2)
  Meaning:   Regs[R4] <- Regs[R4] + Mem[Regs[R1] + Regs[R2]]
  When used: Sometimes useful in array addressing: R1 = base of the array; R2 = index amount

Direct or absolute
  Example:   Add R4, (1001)
  Meaning:   Regs[R4] <- Regs[R4] + Mem[1001]
  When used: Sometimes useful for accessing static data; address constant may need to be large

Memory indirect or memory deferred
  Example:   Add R4, @(R3)
  Meaning:   Regs[R4] <- Regs[R4] + Mem[Mem[Regs[R3]]]
  When used: If R3 is the address of the pointer p, then mode yields *p

Autoincrement
  Example:   Add R4, (R2)+
  Meaning:   Regs[R4] <- Regs[R4] + Mem[Regs[R2]]
             Regs[R2] <- Regs[R2] + d
  When used: Useful for stepping through arrays within a loop. R2 points to start of the array; each reference increments R2 by d

Autodecrement
  Example:   Add R4, -(R2)
  Meaning:   Regs[R2] <- Regs[R2] - d
             Regs[R4] <- Regs[R4] + Mem[Regs[R2]]
  When used: Same use as autoincrement. Autodecrement/increment can also act as push/pop to implement a stack

Scaled
  Example:   Add R4, 100(R2)[R3]
  Meaning:   Regs[R4] <- Regs[R4] + Mem[100 + Regs[R2] + Regs[R3] * d]
  When used: Used to index arrays
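
The effective-address arithmetic behind several of these modes can be sketched as a small function. Everything here is an assumption made for illustration: the function name `effective_address`, the mode names as strings, and the register and memory contents.

```python
def effective_address(mode, regs, mem, base=None, index=None, disp=0, d=1):
    """Return the memory address an operand refers to under a given mode."""
    if mode == "displacement":        # disp(Rbase)
        return disp + regs[base]
    if mode == "register_indirect":   # (Rbase)
        return regs[base]
    if mode == "indexed":             # (Rbase + Rindex)
        return regs[base] + regs[index]
    if mode == "direct":              # (disp)
        return disp
    if mode == "memory_indirect":     # @(Rbase): memory holds the address
        return mem[regs[base]]
    if mode == "scaled":              # disp(Rbase)[Rindex], element size d
        return disp + regs[base] + regs[index] * d
    raise ValueError(f"unknown addressing mode: {mode}")


# Hypothetical machine state.
regs = {"R1": 1000, "R2": 8, "R3": 2}
mem = {1000: 2000}

ea_displacement = effective_address("displacement", regs, mem, base="R1", disp=100)
ea_indexed = effective_address("indexed", regs, mem, base="R1", index="R2")
ea_mem_indirect = effective_address("memory_indirect", regs, mem, base="R1")
ea_scaled = effective_address("scaled", regs, mem, base="R2", index="R3", disp=100, d=4)
```

Register and immediate modes are left out because they name no memory address, and the auto-increment and auto-decrement modes are omitted because they also update a register as a side effect.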

Addressing Mode for Signal Processing

DSP offers special addressing modes to better serve popular algorithms

Modulo addressing
- Since DSP deals with continuous data streams, circular buffers are widely used
- Circular or modulo addressing allows automatic increment and decrement and resets the pointer when reaching the end of the buffer

Reverse addressing
- Resulting address is the reverse order of the current address
- Reverse addressing mode expedites the access, which otherwise requires a number of logical instructions or extra memory accesses

Fast Fourier Transform

0 (000)    0 (000)
1 (001)    4 (100)
2 (010)    2 (010)
3 (011)    6 (110)
4 (100)    1 (001)
5 (101)    5 (101)
6 (110)    3 (011)
7 (111)    7 (111)

Special features require either hand coding or a compiler that uses such features (GNU would not be a good choice)
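
Both DSP addressing modes are simple to express in software, which also shows what the hardware saves. The sketch below is illustrative; the names `modulo_next` and `bit_reverse` and the buffer addresses are assumptions.

```python
def modulo_next(pointer, base, length, step=1):
    """Advance a pointer inside a circular buffer [base, base + length)."""
    return base + (pointer - base + step) % length


def bit_reverse(index, bits):
    """Reverse the low `bits` bits of index, as used for FFT reordering."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


# Pointer at the last slot of a 4-entry buffer wraps back to the base.
wrapped = modulo_next(pointer=103, base=100, length=4)

# Stepping backwards from the base wraps to the last slot.
wrapped_back = modulo_next(pointer=100, base=100, length=4, step=-1)

# Access order for an 8-point FFT.
fft_order = [bit_reverse(i, 3) for i in range(8)]
print(wrapped, wrapped_back, fft_order)
```

Without hardware support, each access pays for the loop in `bit_reverse` or the remainder in `modulo_next`, which is the overhead these modes remove.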

Operations of the Computer Hardware

"There must certainly be instructions for performing the fundamental arithmetic operations."
Burks, Goldstine and von Neumann, 1947

Assembly language is a symbolic representation of what the processor actually understands

MIPS assembler allows only one instruction per line and ignores comments following # until end of line

Example:

Translation of a segment of a C program to MIPS assembly instructions

C: f = (g + h) - (i + j)

MIPS:

add t0, g, h     # temp variable t0 contains "g + h"
add t1, i, j     # temp variable t1 contains "i + j"
sub f, t0, t1    # f = t0 - t1 = (g + h) - (i + j)

Operations in the Instruction Set

Operator type            Examples
Arithmetic and logical   Integer arithmetic and logical operations: add, and, subtract, or
Data transfer            Loads-stores (move instructions on machines with memory addressing)
Control                  Branch, jump, procedure call and return, trap
System                   Operating system call, virtual memory management instructions
Floating point           Floating point instructions: add, multiply
Decimal                  Decimal add, decimal multiply, decimal to character conversion
String                   String move, string compare, string search
Graphics                 Pixel operations, compression/decompression operations

- Arithmetic, logical, data transfer and control are almost standard categories for all machines
- System instructions are required for multi-programming environments, although support for system functions varies
- Decimal and string instructions can be primitives, e.g. IBM 360 and the VAX
- Support for floating point, decimal, string and graphics is optional and sometimes provided via a co-processor
- Some machines rely on the compiler to synthesize special operations, such as string handling, from simpler instructions

Operations for Media & Signal Processing

Single instruction multiple data (SIMD) and vector instructions are often supported in DSPs, which are commonly used in multimedia and signal processing applications

Partitioned Add (integer)
- Perform multiple 16-bit additions on a 64-bit ALU, since most data are narrow
- Increases ALU throughput for multimedia applications

Paired single operations (float)
- Allow the same register to act as two operands to the same operation
- Handy in dealing with vertices and coordinates

Multiply and accumulate
- Very handy for calculating dot products of vectors (signal processing) and matrix multiplication
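
A partitioned add and a multiply-accumulate loop can be imitated in software to show what the hardware does in one step. The sketch below assumes four 16-bit lanes packed into a 64-bit word with wraparound on overflow (real media instruction sets may saturate instead); the names `partitioned_add16` and `dot_product` are hypothetical.

```python
MASK16 = 0xFFFF


def partitioned_add16(a, b):
    """Add four independent 16-bit lanes packed into 64-bit words.

    Carries are not allowed to cross lane boundaries; each lane wraps
    around on overflow.
    """
    result = 0
    for lane in range(4):
        shift = 16 * lane
        lane_sum = ((a >> shift) & MASK16) + ((b >> shift) & MASK16)
        result |= (lane_sum & MASK16) << shift
    return result


def dot_product(xs, ys):
    """Dot product written as a chain of multiply-accumulate steps."""
    acc = 0
    for x, y in zip(xs, ys):
        acc += x * y  # one multiply-accumulate operation
    return acc


packed_sum = partitioned_add16(0x0001_FFFF_0003_0004, 0x0001_0001_0001_0001)
dot = dot_product([1, 2, 3], [4, 5, 6])
print(hex(packed_sum), dot)
```

The second lane shows why the carry chain must be cut: 0xFFFF + 1 wraps to 0 without disturbing the lane above it.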

Frequency of Operations Usage

Make the common case fast by focusing on these operations

The most widely executed instructions are the simple operations of an instruction set

The following is the average usage in SPECint92 on Intel 80x86

Rank   80x86 instruction          Integer average (% total executed)
1      Load                       22%
2      Conditional branch         20%
3      Compare                    16%
4      Store                      12%
5      Add                        8%
6      And                        6%
7      Sub                        5%
8      Move register-register     4%
9      Call                       1%
10     Return                     1%
       Total                      96%

Control Flow Instructions

- Jump for unconditional change in the control flow
- Branch for conditional change in the control flow
- Procedure calls and returns

Data is based on SPEC on Alpha

Destination Address Definition

- Relative addressing with respect to the program counter proved to be the best choice for forward and backward branches or jumps (load address independent)
- To allow for dynamic loading of library routines, register indirect addressing allows addresses to be loaded in special registers (e.g. virtual functions in C++ and system calls in a case statement)

Data is based on SPEC on Alpha

Condition Evaluation

Compare and branch can be efficient if the majority of conditions are comparisons with zero

Remember to focus on the common case

Based on SPEC on MIPS

Frequency of Types of Comparison

Data is based on SPEC on Alpha

Different benchmarks and machines set new design priorities

DSPs support a repeat instruction for "for" loops (vectors) using registers

Supporting Procedures

Execution of a procedure follows these steps:
- Store parameters in a place accessible to the procedure
- Transfer control to the procedure
- Acquire the storage resources needed for the procedure
- Perform the desired task
- Store the result value in a place accessible to the calling program
- Return control to the point of origin

The hardware provides a program counter to trace instruction flow and manage transfer of control

Parameter Passing
- Registers can be used for passing a small number of parameters
- A stack is used to spill registers of the current context, to make room for the called procedure to run, and to allow large parameters to be passed
- Storage of machine state can be performed by caller or callee
- Handling of shared variables is important to ensure correct semantics and thus requires clear specifications in the library interface

Global variables stored in registers need careful handling

Type and Size of Operands

- The type of an operand is designated by encoding it in the instruction's operation code
- The type of an operand, e.g. single precision float, effectively gives its size
- Common operand types include character, half word and word size integer, and single- and double-precision floating point
- Characters are almost always in ASCII, integers are in 2's complement, and floating point is in IEEE 754
- The 16-bit Unicode used in Java is gaining popularity due to its support for the international character sets
- For business applications, some architectures support a decimal format in binary coded decimal (BCD)
- Depending on the size of the word, the complexity of handling different operand types differs
- DSP offers fixed point data types to support high precision floating point arithmetic and to allow sharing a single exponent for multiple numbers
- For graphics applications, vertex and pixel operands are added features

Size of Operands

Frequency of reference by size, based on SPEC2000 on Alpha

- Double-word data type is used for double-precision floating point operations and address storage in machines with a 64-bit wide address bus
- Words are used for integer operations and for 32-bit address bus machines
- Because of the mix in SPEC, word and double-word data types dominate

Instruction Representation

- Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary or base 2)
- Numbers are stored in computers as a series of high and low electronic signals (binary numbers)
- Binary digits are called bits and are considered the atom of computing
- Each piece of an instruction is a number, and placing these numbers together forms the instruction
- The assembler translates the symbolic assembly instructions into machine language instructions (machine code)

Example:

Assembly: add $t0, $s1, $s2

M/C language (decimal):   0        17       18       8        0        32
M/C language (binary):    000000   10001    10010    01000    00000    100000
                          6 bits   5 bits   5 bits   5 bits   5 bits   6 bits

Note: MIPS compiler by default maps $s0,...,$s7 to reg. 16-23 and $t0,...,$t7 to reg. 8-15

Encoding an Instruction Set

- Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation
- The operation is typically specified in one field called the opcode
- The addressing mode for the operand can be encoded with the operation or specified through a separate identifier when a large number of modes is supported
- The architecture must balance between several competing factors:
  - Desire to support as many registers and addressing modes as possible
  - Effect of operand specification on the size of the instruction (program)
  - Desire to simplify instruction fetching and decoding during execution
- Fixed size instruction encoding simplifies the CPU design while limiting the addressing modes supported
- An architect caring about the code size can use variable size encoding
- A hybrid approach is to allow variability by supporting multiple-sized instructions

Encoding Examples

MIPS Instruction format

Register-format instructions

op       rs       rt       rd       shamt    funct
6 bits   5 bits   5 bits   5 bits   5 bits   6 bits

op: Basic operation of the instruction, traditionally called opcode
rs: The first register source operand
rt: The second register source operand
rd: The register destination operand; it gets the result of the operation
shamt: Shift amount
funct: This field selects the specific variant of the operation of the op field

Immediate-type instructions

op       rs       rt       address
6 bits   5 bits   5 bits   16 bits

Some instructions need longer fields than provided, for large value constants

The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register

Example: lw $t0, 32($s3)    # Temporary register $t0 gets A[8]

Instruction   Format   op   rs    rt    rd    shamt   funct   address
add           R        0    reg   reg   reg   0       32      N/A
sub           R        0    reg   reg   reg   0       34      N/A
lw            I        35   reg   reg   N/A   N/A     N/A     address
sw            I        43   reg   reg   N/A   N/A     N/A     address
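
The field layouts can be checked by packing the fields into a 32-bit word. The sketch below assumes the standard MIPS field widths (6/5/5/5/5/6 bits for register format, 6/5/5/16 for immediate format) and register numbers ($t0 = 8, $s1 = 17, $s2 = 18, $s3 = 19); the helper names `encode_r` and `encode_i` are hypothetical.

```python
def encode_r(op, rs, rt, rd, shamt, funct):
    """Pack a register-format instruction: op|rs|rt|rd|shamt|funct."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


def encode_i(op, rs, rt, address):
    """Pack an immediate-format instruction: op|rs|rt|16-bit address."""
    return (op << 26) | (rs << 21) | (rt << 16) | (address & 0xFFFF)


T0, S1, S2, S3 = 8, 17, 18, 19  # MIPS register numbers

# add $t0, $s1, $s2  -> op=0, rs=$s1, rt=$s2, rd=$t0, shamt=0, funct=32
add_word = encode_r(0, S1, S2, T0, 0, 32)

# lw $t0, 32($s3)    -> op=35, rs=$s3 (base), rt=$t0 (destination), address=32
lw_word = encode_i(35, S3, T0, 32)

print(f"{add_word:032b}")
print(f"{add_word:#010x} {lw_word:#010x}")
```

Masking the address with `0xFFFF` keeps negative offsets in their 16-bit two's complement form, which is how the ±2^15 byte range arises.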

The Stored Program Concept

Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept

Today's computers are built on two key principles:
- Instructions are represented as numbers
- Programs can be stored in memory to be read or written just like numbers

The power of the concept: memory can contain
- the source code for an editor
- the compiled m/c code for the editor
- the text that the compiled program is using
- the compiler that generated the code

[Figure: Processor connected to Memory, which holds: Accounting program (machine code), Editor program (machine code), C compiler (machine code), Payroll data, Book text, Source code in C for editor program]

Compiling if-then-else in MIPS

Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiler's code for the following C if statement:

if (i == j) f = g + h; else f = g - h;

[Figure: flowchart testing i == j; the i = j path computes f = g + h, the i ≠ j path (Else:) computes f = g - h, and both paths meet at Exit:]

MIPS:

      bne $s3, $s4, Else    # go to Else if i ≠ j
      add $s0, $s1, $s2     # f = g + h (skipped if i ≠ j)
      j Exit                # go to Exit
Else: sub $s0, $s1, $s2     # f = g - h (skipped if i = j)
Exit:

Typical Compilation

Major Types of Optimization

High-level (at or near source level; machine independent)
- Procedure integration: Replace procedure call by procedure body (frequency: N.M.)

Local (within straight line code)
- Common subexpression elimination: Replace two instances of the same computation by a single copy
- Constant propagation: Replace all instances of a variable that is assigned a constant with the constant
- Stack height reduction: Rearrange expression tree to minimize resources needed for expression evaluation (frequency: N.M.)

Global (across a branch)
- Global common subexpression elimination: Same as local, but this version crosses branches
- Copy propagation: Replace all instances of a variable A that has been assigned X (i.e. A = X) with X
- Code motion: Remove code from a loop that computes the same value each iteration of the loop
- Induction variable elimination: Simplify/eliminate array addressing calculations within loops

Machine-dependent (depends on machine knowledge)
- Strength reduction: Many examples, such as replacing a multiply by a constant with adds and shifts (frequency: N.M.)
- Pipeline scheduling: Reorder instructions to improve pipeline performance (frequency: N.M.)

Effect of Compiler Optimization

[Figure: program and compiler optimization level]

Level 0: non-optimized code
Level 1: local optimization
Level 2: global optimization, s/w pipelining
Level 3: adds procedure integration

Compiler Support for Multimedia Instr.

- Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics)
- Intel added a new set of instructions, called Streaming SIMD Extension
- A major advantage of vector computers is hiding latency of memory access by loading multiple elements and then overlapping execution with data transfer
- Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations
  - Strided addressing allows memory access in increments larger than one
  - Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data
- Supporting vector operation without strided addressing, such as Intel's MMX, limits the potential speedup
- Such limited support for vector processing makes the use of vectorizing compiler optimization unpopular and restricts its scope to hand coded routines

SIMD instructions on MMX and AltiVec tend to be solutions, not primitives
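
Strided and gather/scatter accesses can be illustrated with ordinary list indexing. This is only a sketch of the access patterns, not of any vector instruction set; the function names and the contents of `memory` are assumptions.

```python
def strided_load(memory, base, stride, count):
    """Load `count` elements starting at `base`, `stride` locations apart."""
    return [memory[base + i * stride] for i in range(count)]


def gather(memory, addresses):
    """Load elements from an explicit list of addresses."""
    return [memory[address] for address in addresses]


def scatter(memory, addresses, values):
    """Store each value at its corresponding address."""
    for address, value in zip(addresses, values):
        memory[address] = value


memory = list(range(100, 120))  # hypothetical memory contents

# Every fourth element, e.g. one column of a row-major matrix with 4 columns.
column = strided_load(memory, base=1, stride=4, count=3)

# Arbitrary locations named by an index vector.
picked = gather(memory, [7, 0, 12])
scatter(memory, [7, 0, 12], [-1, -2, -3])
```

With only unit-stride loads, the column access above would need three separate scalar loads, which is the limitation attributed to MMX.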

Starting a Program

[Figure: C program -> Compiler -> Assembly language program -> Assembler -> Object: Machine language module, together with Object: Library routine (machine language) -> Linker -> Executable: Machine language program -> Loader -> Memory]

Linker
- Place code & data modules symbolically in memory
- Determine the address of data & instruction labels
- Patch both internal & external references

Object files for Unix typically contain:
- Header: size & position of components
- Text segment: machine code
- Data segment: static and dynamic variables
- Relocation info: identify absolute memory references
- Symbol table: name & location of labels, procedures and variables
- Debugging info: mapping source to object code, break points, etc.

Loading Executable Program

[Figure: memory layout from low to high addresses: Reserved (from 0); Text (from 0040 0000 hex, pc); Static data (from 1000 0000 hex, with $gp at 1000 8000 hex); Dynamic data; Stack (growing down from 7fff ffff hex, $sp)]

To load an executable, the operating system follows these steps:
- Reads the executable file header to determine the size of the text and data segments
- Creates an address space large enough for the text and data
- Copies the instructions and data from the executable file into memory
- Copies the parameters (if any) to the main program onto the stack
- Initializes the machine registers and sets the stack pointer to the first free location
- Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program

Instruction Set Design Issues

Instruction set design issues
- Number of addresses
- Flow of control
- Operand types
- Addressing modes
- Instruction types
- Instruction formats

Number of Addresses

Four categories:

- Three-address machines
  - Two for the source operands and one for the result
- Two-address machines
  - One address doubles as source and result
- One-address machines
  - Accumulator machines
  - Accumulator is used for one source and result
- Zero-address machines
  - Stack machines
  - Operands are taken from the stack
  - Result goes onto the stack

Number of Addresses (cont.)

Three-address machines

- Two addresses for the source operands, one for the result
- RISC processors use three addresses

Sample instructions:

add  dest, src1, src2    ; M(dest) = [src1] + [src2]
sub  dest, src1, src2    ; M(dest) = [src1] - [src2]
mult dest, src1, src2    ; M(dest) = [src1] * [src2]

Three addresses: Operand 1, Operand 2, Result

Example: a = b + c

Three-address instruction formats are not common because they require a relatively long instruction format to hold the three address references.

Number of Addresses (cont.)

Example

C statement:

A = B + C * D - E + F + A

Equivalent code:

mult T, C, D    ; T = C*D
add  T, T, B    ; T = B + C*D
sub  T, T, E    ; T = B + C*D - E
add  T, T, F    ; T = B + C*D - E + F
add  A, T, A    ; A = B + C*D - E + F + A
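A minimal sketch of how a three-address machine would run this sequence, assuming the statement is A = B + C*D - E + F + A. The function name `run_three_address`, the tuple encoding of instructions, and the sample values are all illustrative choices, not part of the slides.

```python
def run_three_address(program, memory):
    """Execute (op, dest, src1, src2) tuples against a dict of named memory cells."""
    ops = {
        "add": lambda x, y: x + y,
        "sub": lambda x, y: x - y,
        "mult": lambda x, y: x * y,
    }
    for op, dest, src1, src2 in program:
        # Both sources are read before the destination is written.
        memory[dest] = ops[op](memory[src1], memory[src2])
    return memory


three_address_program = [
    ("mult", "T", "C", "D"),
    ("add", "T", "T", "B"),
    ("sub", "T", "T", "E"),
    ("add", "T", "T", "F"),
    ("add", "A", "T", "A"),
]
three_address_memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
run_three_address(three_address_program, three_address_memory)
```

With these sample values the final A equals 2 + 3*4 - 5 + 6 + 1. Five instructions are needed, each carrying three addresses.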

Number of Addresses (cont.)

Two-address machines

- One address doubles as source operand and result
- The last example makes a case for it
  - Address T is used twice

Sample instructions:

load dest, src    ; M(dest) = [src]
add  dest, src    ; M(dest) = [dest] + [src]
sub  dest, src    ; M(dest) = [dest] - [src]
mult dest, src    ; M(dest) = [dest] * [src]

Two addresses: one address doubles as operand and result

Example: a = a + b

The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation.

Number of Addresses (cont.)

Example

C statement:

A = B + C * D - E + F + A

Equivalent code:

load T, C    ; T = C
mult T, D    ; T = C*D
add  T, B    ; T = B + C*D
sub  T, E    ; T = B + C*D - E
add  T, F    ; T = B + C*D - E + F
add  A, T    ; A = B + C*D - E + F + A

Number of Addresses (cont.)

One-address machines

- Use a special set of registers called accumulators
  - Specify one source operand and receive the result
- Called accumulator machines

Sample instructions:

load  addr    ; accum = [addr]
store addr    ; M[addr] = accum
add   addr    ; accum = accum + [addr]
sub   addr    ; accum = accum - [addr]
mult  addr    ; accum = accum * [addr]

Number of Addresses (cont.)

Example

C statement:

A = B + C * D - E + F + A

Equivalent code:

load  C    ; load C into accum
mult  D    ; accum = C*D
add   B    ; accum = C*D + B
sub   E    ; accum = B + C*D - E
add   F    ; accum = B + C*D - E + F
add   A    ; accum = B + C*D - E + F + A
store A    ; store accum contents in A
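The accumulator sequence can be checked with a small sketch. The function name `run_accumulator`, the tuple encoding, and the sample values are assumptions made for illustration only.

```python
def run_accumulator(program, memory):
    """Execute (op, addr) pairs on a one-address machine with a single accumulator."""
    accum = 0
    for op, addr in program:
        if op == "load":
            accum = memory[addr]
        elif op == "store":
            memory[addr] = accum
        elif op == "add":
            accum += memory[addr]
        elif op == "sub":
            accum -= memory[addr]
        elif op == "mult":
            accum *= memory[addr]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return memory


accumulator_program = [
    ("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
    ("add", "F"), ("add", "A"), ("store", "A"),
]
accumulator_memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
run_accumulator(accumulator_program, accumulator_memory)
```

The program is seven instructions long, but each instruction names only one memory address, so every instruction is shorter than its three-address counterpart.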

Number of Addresses (cont.)

Zero-address machines

- Stack supplies operands and receives the result
  - Special instructions to load and store use an address
- Called stack machines (e.g., HP3000, Burroughs B5500)

Sample instructions:

push addr    ; push([addr])
pop  addr    ; pop([addr])
add          ; push(pop + pop)
sub          ; push(pop - pop)
mult         ; push(pop * pop)

Number of Addresses (cont.)

Example

C statement:

A = B + C * D - E + F + A

Equivalent code:

push E
push C
push D
mult
push B
add
sub
push F
add
push A
add
pop  A
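A sketch of the stack-machine sequence, assuming `sub` means push(pop - pop) with the first value popped as the left operand, which is why E is pushed first. The function name `run_stack` and the sample values are illustrative.

```python
def run_stack(program, memory):
    """Execute zero-address instructions; only push and pop carry an address."""
    stack = []
    for instr in program:
        op = instr[0]
        if op == "push":
            stack.append(memory[instr[1]])
        elif op == "pop":
            memory[instr[1]] = stack.pop()
        else:
            first = stack.pop()   # top of stack
            second = stack.pop()  # next element down
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)
            elif op == "mult":
                stack.append(first * second)
            else:
                raise ValueError(f"unknown opcode: {op}")
    return memory


stack_program = [
    ("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
    ("push", "B"), ("add",), ("sub",), ("push", "F"),
    ("add",), ("push", "A"), ("add",), ("pop", "A"),
]
stack_memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
run_stack(stack_program, stack_memory)
```

Twelve instructions are needed here, the most of any of the formats, but the arithmetic instructions carry no addresses at all.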

Load/Store Architecture

- Instructions expect operands in internal processor registers
  - Special LOAD and STORE instructions move data between registers and memory
- RISC uses this architecture
- Reduces instruction length

Load/Store Architecture (cont.)

Sample instructions:

load  Rd, addr        ; Rd = [addr]
store addr, Rs        ; (addr) = Rs
add   Rd, Rs1, Rs2    ; Rd = Rs1 + Rs2
sub   Rd, Rs1, Rs2    ; Rd = Rs1 - Rs2
mult  Rd, Rs1, Rs2    ; Rd = Rs1 * Rs2

Number of Addresses (cont.)

Example

C statement:

A = B + C * D - E + F + A

Equivalent code:

load  R1, B
load  R2, C
load  R3, D
load  R4, E
load  R5, F
load  R6, A
mult  R2, R2, R3
add   R2, R2, R1
sub   R2, R2, R4
add   R2, R2, R5
add   R2, R2, R6
store A, R2

Flow of Control

- Default is sequential flow
- Several instructions alter this default execution
  - Branches
    - Unconditional
    - Conditional
    - Delayed branches
  - Procedure calls
    - Delayed procedure calls

Flow of Control (cont.)

Branches

- Unconditional
  - Absolute address
  - PC-relative
    - Target address is specified relative to PC contents
    - Relocatable code
- Example: MIPS
  - Absolute address
    j target
  - PC-relative
    b target

Flow of Control (cont.)

Branches

- Conditional
  - Jump is taken only if the condition is met
- Two types
  - Set-Then-Jump
    - Condition testing is separated from branching
    - Condition code registers are used to convey the condition test result
    - Condition code registers keep a record of the status of the last ALU operation, such as an overflow condition
  - Example: Pentium code
    cmp AX, BX    ; compare AX and BX
    je  target    ; jump if equal

Flow of Control (cont.)

- Test-and-Jump
  - Single instruction performs condition testing and branching
  - Example: MIPS instruction
    beq Rsrc1, Rsrc2, target    # jumps to target if Rsrc1 = Rsrc2

- Delayed branching
  - Control is transferred after executing the instruction that follows the branch instruction
    - This instruction slot is called the delay slot
  - Improves efficiency
  - Highly pipelined RISC processors support it

Flow of Control (cont.)

Procedure calls

- Facilitate modular programming
- Require two pieces of information to return
  - End of procedure
    - Pentium uses the ret instruction
    - MIPS uses the jr instruction
  - Return address
    - In a (special) register
      - MIPS allows any general-purpose register
    - On the stack
      - Pentium

Flow of Control (cont.)

[Figure: code sequence illustrating the delay slot]

Parameter Passing

Two basic techniques:

- Register-based (e.g., PowerPC, MIPS)
  - Internal registers are used
    - Faster
    - Limits the number of parameters
    - Recursive procedures
- Stack-based (e.g., Pentium)
  - Stack is used
    - More general

Operand Types

- Instructions support basic data types
  - Characters
  - Integers
  - Floating-point
- Instruction overload
  - Same instruction for different data types
  - Example: Pentium
    mov AL, address     ; loads an 8-bit value
    mov AX, address     ; loads a 16-bit value
    mov EAX, address    ; loads a 32-bit value

Operand Types (cont.)

- Separate instructions
  - Instructions specify the operand size
  - Example: MIPS
    lb Rdest, address    # loads a byte
    lh Rdest, address    # loads a halfword (16 bits)
    lw Rdest, address    # loads a word (32 bits)
    ld Rdest, address    # loads a doubleword (64 bits)
  - Similar instructions for store

Addressing Modes

- How the operands are specified
- Operands can be in three places
  - Registers
    - Register addressing mode
  - Part of the instruction
    - Constant
    - Immediate addressing mode
    - All processors support these two addressing modes
  - Memory
    - Difference between RISC and CISC
    - CISC supports a large variety of addressing modes
    - RISC follows the load/store architecture

Instruction Types

Several types of instructions:

- Data movement
  - Pentium: mov dest, src
  - Some do not provide direct data movement instructions
  - Indirect data movement
    add Rdest, Rsrc, 0    ; Rdest = Rsrc + 0
- Arithmetic and logical
  - Arithmetic
    - Integer and floating-point, signed and unsigned
    - add, subtract, multiply, divide
  - Logical
    - and, or, not, xor

Instruction Types (cont.)

Condition code bits:

- S: Sign bit (0 = +, 1 = -)
- Z: Zero bit (0 = nonzero, 1 = zero)
- O: Overflow bit (0 = no overflow, 1 = overflow)
- C: Carry bit (0 = no carry, 1 = carry)

Example: Pentium

cmp count, 25    ; compare count to 25 (subtract 25 from count)
je  target       ; jump if equal
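How a compare sets the condition code bits can be sketched as follows. This is a simplified model, assuming a 16-bit operand width and x86-style semantics in which a compare is a subtraction whose result is discarded; `cmp_flags` is a hypothetical helper name.

```python
def cmp_flags(a, b, bits=16):
    """Return the S, Z, O, C bits produced by computing a - b at the given width."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "S": int(bool(result & sign)),                  # result is negative
        "Z": int(result == 0),                          # operands were equal
        # Signed overflow: operands differ in sign and the result's sign differs from a's.
        "O": int(bool((a ^ b) & (a ^ result) & sign)),
        "C": int(a < b),                                # unsigned borrow
    }


equal_flags = cmp_flags(25, 25)
smaller_flags = cmp_flags(3, 25)
overflow_flags = cmp_flags(0x8000, 1)
```

A conditional jump such as `je` then only inspects the recorded bits: it is taken when Z is 1, as in `equal_flags`.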

Instruction Types (cont.)

- Flow control and I/O instructions
  - Branch
  - Procedure call
  - Interrupts
- I/O instructions
  - Memory-mapped I/O
    - Most processors support memory-mapped I/O
    - No separate instructions for I/O
  - Isolated I/O
    - Pentium supports isolated I/O
    - Separate I/O instructions
      in  AX, io_port    ; read from an I/O port
      out io_port, AX    ; write to an I/O port

Instruction Formats

Two types:

- Fixed-length
  - Used by RISC processors
  - 32-bit RISC processors use 32-bit-wide instructions
    - Examples: SPARC, MIPS, PowerPC
- Variable-length
  - Used by CISC processors
  - Memory operands need more bits to specify

Opcode

- Major and exact operation

Examples of Instruction Formats

RISC (Reduced Instruction Set Computer) versus CISC (Complex Instruction Set Computer)

RISC vs. CISC

- The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.
- RISC systems access memory only with explicit load and store instructions.
- In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.

RISC vs. CISC

The difference between CISC and RISC becomes evident through the basic computer performance equation:

time/program = time/cycle x cycles/instruction x instructions/program

- RISC systems shorten execution time by reducing the clock cycles per instruction.
- CISC systems improve performance by reducing the number of instructions per program.

RISC vs. CISC

- The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.
- The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.
- With fixed-length instructions, RISC lends itself to pipelining and speculative execution.

RISC vs. CISC

Consider the program fragments:

CISC:
mov ax, 10
mov bx, 5
mul bx, ax

RISC:
mov ax, 0
mov bx, 10
mov cx, 5
Begin: add ax, bx
loop Begin

The total clock cycles for the CISC version might be:
(2 movs x 1 cycle) + (1 mul x 30 cycles) = 32 cycles

While the clock cycles for the RISC version are:
(3 movs x 1 cycle) + (5 adds x 1 cycle) + (5 loops x 1 cycle) = 13 cycles

With the RISC clock cycle being shorter, RISC gives us much faster execution speeds.
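The cycle comparison can be reproduced with a short calculation. The instruction counts and per-instruction costs below are assumptions taken from the usual textbook form of this example (a 30-cycle multiply and single-cycle mov, add, and loop instructions); `total_cycles` is a hypothetical helper.

```python
def total_cycles(instruction_counts, cycles_per_instruction):
    """Sum count x cycles over every instruction type in a program fragment."""
    return sum(
        count * cycles_per_instruction[op]
        for op, count in instruction_counts.items()
    )


# Assumed costs: multiply takes 30 cycles, everything else takes 1.
cisc_cycles = total_cycles({"mov": 2, "mul": 1}, {"mov": 1, "mul": 30})
risc_cycles = total_cycles(
    {"mov": 3, "add": 5, "loop": 5},
    {"mov": 1, "add": 1, "loop": 1},
)
```

The RISC fragment executes more instructions but fewer total cycles under these assumptions, which is the trade-off the performance equation describes.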

RISC vs. CISC

- Because of their load-store ISAs, RISC architectures require a large number of CPU registers.
- These registers provide fast access to data during sequential program execution.
- They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.
- Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.

RISC vs. CISC

- This is how registers can be overlapped in a RISC system.
- The current window pointer (CWP) points to the active register window.

[Figure: overlapping register windows]

RISC vs. CISC

- It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.
- Some RISC systems provide more extravagant instruction sets than some CISC systems.
- Some systems combine both approaches.
- The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.

RISC vs. CISC

RISC

- Multiple register sets.
- Three operands per instruction.
- Parameter passing through register windows.
- Single-cycle instructions.
- Hardwired control.
- Highly pipelined.

CISC

- Single register set.
- One or two register operands per instruction.
- Parameter passing through memory.
- Multiple cycle instructions.
- Microprogrammed control.
- Less pipelined.

Continued...

RISC vs. CISC

RISC

- Simple instructions, few in number.
- Fixed length instructions.
- Complexity in compiler.
- Only LOAD/STORE instructions access memory.
- Few addressing modes.

CISC

- Many complex instructions.
- Variable length instructions.
- Complexity in microcode.
- Many instructions can access memory.
- Many addressing modes.

Summary

Instruction Set Design Issues

Instruction set design issues include:

- Where are operands stored?
  - registers, memory, stack, accumulator
- How many explicit operands are there?
  - 0, 1, 2, or 3
- How is the operand location specified?
  - register, immediate, indirect, ...
- What type and size of operands are supported?
  - byte, int, float, double, string, vector, ...
- What operations are supported?
  - add, sub, mul, move, compare, ...

More About General Purpose Registers

Why do almost all new architectures use GPRs?

- Registers are much faster than memory (even cache)
  - Register values are available immediately
  - When memory isn't ready, the processor must wait ("stall")
- Registers are convenient for variable storage
  - Compiler assigns some variables just to registers
  - More compact code, since small fields specify registers (compared to memory addresses)

[Figure: memory hierarchy: Processor (Registers), Cache, Memory, Disk]

What Operations are Needed

- Arithmetic and logical
  - Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
  - Logical operations: AND, OR, XOR, NOT
- Data transfer: copy, load, store
- Control: branch, jump, call, return
- Floating point: ADDF, MULF, DIVF (same as arithmetic but usually take bigger operands)
- Decimal: ADDD, CONVERT
- String: move, compare, search
- Graphics: pixel and vertex, compression/decompression operations

Stack Architecture Pros and Cons

Pros

- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures

Cons

- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures

Accumulator Architecture Pros and Cons

Pros

- Very low hardware requirements
- Easy to design and understand

Cons

- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic

Memory-Memory Architecture Pros and Cons

Pros

- Requires fewer instructions (especially with 3 operands)
- Easy to write compilers for (especially with 3 operands)

Cons

- Very high memory traffic (especially with 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required

Memory-Register Architecture Pros and Cons

Pros

- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density

Cons

- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers

Addressing Modes

- Addressing modes refer to how to specify the location of an operand (effective address)
- Addressing modes have the ability to:
  - Significantly reduce instruction counts
  - Increase the average CPI
  - Increase the complexity of building a machine
- The VAX machine is used for benchmark data since it supports a wide range of memory addressing modes
- Well-known addressing modes can be classified based on:
  - the source of the data: register, immediate, or memory
  - the address calculation: direct and indirect
- An indexed addressing mode is usually provided to allow efficient implementation of loops and array access

Example of Addressing Modes

Addressing mode | Example | Meaning | When used
Register | Add R4, R3 | Regs[R4] <- Regs[R4] + Regs[R3] | When a value is in a register
Immediate | Add R4, #3 | Regs[R4] <- Regs[R4] + 3 | For constants
Displacement | Add R4, 100(R1) | Regs[R4] <- Regs[R4] + Mem[100 + Regs[R1]] | Accessing local variables
Register indirect | Add R4, (R1) | Regs[R4] <- Regs[R4] + Mem[Regs[R1]] | Accessing using a pointer or a computed address
Indexed | Add R4, (R1 + R2) | Regs[R4] <- Regs[R4] + Mem[Regs[R1] + Regs[R2]] | Sometimes useful in array addressing: R1 = base of the array, R2 = index amount
Direct or absolute | Add R4, (1001) | Regs[R4] <- Regs[R4] + Mem[1001] | Sometimes useful for accessing static data; address constant may need to be large
Memory indirect or memory deferred | Add R4, @(R3) | Regs[R4] <- Regs[R4] + Mem[Mem[Regs[R3]]] | If R3 is the address of the pointer p, then the mode yields *p
Autoincrement | Add R4, (R2)+ | Regs[R4] <- Regs[R4] + Mem[Regs[R2]]; Regs[R2] <- Regs[R2] + d | Useful for stepping through arrays within a loop. R2 points to the start of the array; each reference increments R2 by d
Autodecrement | Add R4, -(R2) | Regs[R2] <- Regs[R2] - d; Regs[R4] <- Regs[R4] + Mem[Regs[R2]] | Same use as autoincrement. Autodecrement/increment can also act as push/pop to implement a stack
Scaled | Add R4, 100(R2)[R3] | Regs[R4] <- Regs[R4] + Mem[100 + Regs[R2] + Regs[R3] * d] | Used to index arrays
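The operand fetch for each mode in the table can be sketched as one function. This is an illustrative model only: `fetch_operand`, its keyword parameters, and the dictionary-based register file and memory are assumptions, and `d` stands for the operand size used by the autoincrement, autodecrement, and scaled modes.

```python
def fetch_operand(mode, regs, mem, *, reg=None, base=None, index=None,
                  disp=0, imm=None, d=1):
    """Return the source operand selected by an addressing mode.

    regs maps register names to values; mem maps addresses to values.
    Autoincrement and autodecrement also update the base register.
    """
    if mode == "register":
        return regs[reg]
    if mode == "immediate":
        return imm
    if mode == "displacement":
        return mem[disp + regs[base]]
    if mode == "register_indirect":
        return mem[regs[base]]
    if mode == "indexed":
        return mem[regs[base] + regs[index]]
    if mode == "direct":
        return mem[disp]
    if mode == "memory_indirect":
        return mem[mem[regs[base]]]
    if mode == "autoincrement":
        value = mem[regs[base]]
        regs[base] += d          # post-increment
        return value
    if mode == "autodecrement":
        regs[base] -= d          # pre-decrement
        return mem[regs[base]]
    if mode == "scaled":
        return mem[disp + regs[base] + regs[index] * d]
    raise ValueError(f"unknown addressing mode: {mode}")


demo_regs = {"R1": 100, "R2": 8, "R3": 2}
demo_mem = {7: 42, 100: 7, 108: 9, 116: 11, 200: 5}
```

For instance, `fetch_operand("memory_indirect", demo_regs, demo_mem, base="R1")` follows the pointer stored at address 100 and returns the value stored at address 7.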

Addressing Mode for Signal Processing

DSP offers special addressing modes to better serve popular algorithms.

Modulo addressing

- Since DSP deals with continuous data streams, circular buffers are widely used
- Circular or modulo addressing allows automatic increment and decrement, and resets the pointer when reaching the end of the buffer

Reverse addressing

- The resulting address is the reverse order of the current address
- Reverse addressing mode expedites the access, which otherwise requires a number of logical instructions or extra memory accesses

Fast Fourier Transform (bit-reversed index mapping):

0 (000) -> 0 (000)
1 (001) -> 4 (100)
2 (010) -> 2 (010)
3 (011) -> 6 (110)
4 (100) -> 1 (001)
5 (101) -> 5 (101)
6 (110) -> 3 (011)
7 (111) -> 7 (111)

Special features require either hand coding or a compiler that uses such features (GNU would not be a good choice).
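Both DSP addressing modes can be sketched in software, which also shows the extra work a processor without them has to do. The helper names `modulo_next` and `bit_reverse` are illustrative, and the buffer base and length are arbitrary sample values.

```python
def modulo_next(pointer, step, base, length):
    """Advance a circular-buffer pointer, wrapping inside [base, base + length)."""
    return base + (pointer - base + step) % length


def bit_reverse(index, bits):
    """Reverse the low `bits` bits of index, as used to reorder FFT data."""
    reversed_index = 0
    for _ in range(bits):
        reversed_index = (reversed_index << 1) | (index & 1)
        index >>= 1
    return reversed_index


# Stepping forward from the last slot of a 4-entry buffer at address 100 wraps to the start.
wrapped_forward = modulo_next(103, 1, base=100, length=4)
# Stepping backward from the first slot wraps to the end.
wrapped_backward = modulo_next(100, -1, base=100, length=4)
# The 3-bit reversal used by an 8-point FFT.
fft_order = [bit_reverse(i, 3) for i in range(8)]
```

In hardware, the wrap and the bit reversal happen as part of address generation, so neither costs extra instructions inside the loop.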

Operations of the Computer Hardware

"There must certainly be instructions for performing the fundamental arithmetic operations."
Burks, Goldstine, and von Neumann, 1947

- Assembly language is a symbolic representation of what the processor actually understands
- The MIPS assembler allows only one instruction per line and ignores comments following # until the end of the line

Example:

Translation of a segment of a C program to MIPS assembly instructions

C: f = (g + h) - (i + j)

MIPS:

add t0, g, h     # temp variable t0 contains "g + h"
add t1, i, j     # temp variable t1 contains "i + j"
sub f, t0, t1    # f = t0 - t1 = (g + h) - (i + j)

Operations in the Instruction Set

Operator type | Examples
Arithmetic and logical | Integer arithmetic and logical operations: add, and, subtract, or
Data transfer | Loads-stores (move instructions on machines with memory addressing)
Control | Branch, jump, procedure call and return, trap
System | Operating system call, virtual memory management instructions
Floating point | Floating point instructions: add, multiply
Decimal | Decimal add, decimal multiply, decimal to character conversion
String | String move, string compare, string search
Graphics | Pixel operations, compression/decompression operations

- Arithmetic, logical, data transfer, and control are almost standard categories for all machines
- System instructions are required for multi-programming environments, although support for system functions varies
- Decimal and string instructions can be primitives, e.g., on the IBM 360 and the VAX
- Support for floating point, decimal, string, and graphics can be optional, sometimes provided via a co-processor
- Some machines rely on the compiler to synthesize special operations, such as string handling, from simpler instructions

Operations for Media and Signal Processing

Single instruction multiple data (SIMD) and vector instructions are often supported in DSPs, which are commonly used in multimedia and signal processing applications.

Partitioned add (integer)

- Performs multiple narrower additions on a 64-bit ALU, since most data are narrow
- Increases ALU throughput for multimedia applications

Paired single operations (float)

- Allow the same register to act as two operands to the same operation
- Handy in dealing with vertices and coordinates

Multiply and accumulate

- Very handy for calculating dot products of vectors (signal processing) and matrix multiplication
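A partitioned add can be sketched in software as independent lane-wise additions packed into one word. The 16-bit lane width is an assumption made for illustration (the lane width is not legible in the slide text), and `partitioned_add` is a hypothetical helper name.

```python
def partitioned_add(x, y, lane_bits=16, word_bits=64):
    """Add two packed words lane by lane; a carry never crosses a lane boundary."""
    lane_mask = (1 << lane_bits) - 1
    result = 0
    for shift in range(0, word_bits, lane_bits):
        lane_sum = ((x >> shift) & lane_mask) + ((y >> shift) & lane_mask)
        # Keep only the low lane_bits of the sum, discarding the carry out.
        result |= (lane_sum & lane_mask) << shift
    return result


packed_sum = partitioned_add(0x0001_FFFF_0003_0004, 0x0001_0001_0001_0001)
```

Here the lane holding 0xFFFF wraps to 0x0000 without disturbing its neighbour, so `packed_sum` is 0x0002_0000_0004_0005. A hardware implementation gets the same effect by breaking the carry chain at lane boundaries.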

Frequency of Operations Usage

- Make the common case fast by focusing on these operations
- The most widely executed instructions are the simple operations of an instruction set
- The following is the average usage in SPECint on Intel 80x86

Rank | 80x86 instruction | Integer average (% total executed)
1 | Load | 22%
2 | Conditional branch | 20%
3 | Compare | 16%
4 | Store | 12%
5 | Add | 8%
6 | And | 6%
7 | Sub | 5%
8 | Move register-register | 4%
9 | Call | 1%
10 | Return | 1%
Total | | 96%

Control Flow Instructions

- Jump for unconditional change in the control flow
- Branch for conditional change in the control flow
- Procedure calls and returns

Data is based on SPEC on Alpha

Destination Address Definition

- Relative addressing with respect to the program counter proved to be the best choice for forward and backward branching or jumps (load address independent)
- To allow for dynamic loading of library routines, register indirect addressing allows addresses to be loaded in special registers (e.g., virtual functions in C++ and system calls in a case statement)

Data is based on SPEC on Alpha

Condition Evaluation

- Compare-and-branch can be efficient if the majority of conditions are comparisons with zero
- Remember to focus on the common case

Based on SPEC on MIPS

Frequency of Types of Comparison

Data is based on SPEC on Alpha

- A different benchmark and machine set new design priorities
- DSPs support a repeat instruction for for-loops (vectors) using registers

Supporting Procedures

Execution of a procedure follows these steps:

- Store parameters in a place accessible to the procedure
- Transfer control to the procedure
- Acquire the storage resources needed for the procedure
- Perform the desired task
- Store the result value in a place accessible to the calling program
- Return control to the point of origin

The hardware provides a program counter to trace instruction flow and manage transfer of control.

Parameter Passing

- Registers can be used for passing a small number of parameters
- A stack is used to spill registers of the current context and make room for the called procedure to run, and to allow large parameters to be passed
- Storage of machine state can be performed by the caller or the callee
- Handling of shared variables is important to ensure correct semantics, and thus requires clear specifications in the library interface

Global variables stored in registers need careful handling.

Type and Size of Operands

- The type of an operand is designated by encoding it in the instruction's operation code
- The type of an operand, e.g., single-precision float, effectively gives its size
- Common operand types include character, half-word and word-size integer, and single- and double-precision floating point
- Characters are almost always in ASCII, integers are in 2's complement, and floating point is in IEEE 754
- The 16-bit Unicode used in Java is gaining popularity due to its support for international character sets
- For business applications, some architectures support a decimal format in binary coded decimal (BCD)
- Depending on the size of the word, the complexity of handling different operand types differs
- DSP offers fixed-point data types to support high-precision floating point arithmetic and to allow sharing a single exponent among multiple numbers
- For graphics applications, vertex and pixel operands are added features

Size of Operands

- The double-word data type is used for double-precision floating point operations and for address storage in machines with a 64-bit-wide address bus
- Words are used for integer operations and for 32-bit address bus machines
- Because of the mix in SPEC, word and double-word data types dominate

Frequency of reference by size, based on SPEC2000 on Alpha

Instruction Representation

Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary, or base 2)

Numbers are stored in computers as a series of high and low electronic signals (binary numbers)

Binary digits are called bits and are considered the atom of computing

Each piece of an instruction is a number, and placing these numbers together forms the instruction

Assemblers translate the symbolic assembly instructions into machine language instructions (machine code)

Example:

Assembly: add $t0, $s1, $s2

M/C language (decimal): 0 17 18 8 0 32

M/C language (binary): 000000 10001 10010 01000 00000 100000

Field widths: 6 bits, 5 bits, 5 bits, 5 bits, 5 bits, 6 bits

Note: the MIPS compiler by default maps $s0 through $s7 to registers 16 through 23 and $t0 through $t7 to registers 8 through 15
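The idea that placing the field numbers together forms the instruction can be sketched in a few lines of Python. This is a minimal illustration, not an assembler; the function name `encode_r_format` is hypothetical, and the field values are the standard MIPS ones for `add $t0, $s1, $s2`.

```python
def encode_r_format(op, rs, rt, rd, shamt, funct):
    """Pack the six R-format fields (6, 5, 5, 5, 5, 6 bits) into one 32-bit word."""
    fields = ((op, 6), (rs, 5), (rt, 5), (rd, 5), (shamt, 5), (funct, 6))
    word = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"field value {value} does not fit in {width} bits")
        word = (word << width) | value   # append this field to the right
    return word

# add $t0, $s1, $s2: op=0, rs=$s1 (17), rt=$s2 (18), rd=$t0 (8), shamt=0, funct=32
ADD_T0_S1_S2 = encode_r_format(0, 17, 18, 8, 0, 32)
ADD_BITS = format(ADD_T0_S1_S2, "032b")
```

`ADD_BITS` is the 32-character binary string made of the six fields in order, so it can be compared directly against the binary row of the example.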

Encoding an Instruction Set

Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation

The operation is typically specified in one field called the opcode. The addressing mode for the operand can be encoded with the operation or, when a large number of modes is supported, specified through a separate identifier

The architecture must balance several competing factors:
- Desire to support as many registers and addressing modes as possible
- Effect of operand specification on the size of the instruction (program)
- Desire to simplify instruction fetching and decoding during execution

Fixed-size instruction encoding simplifies the CPU design while limiting the addressing modes supported

An architect caring about the code size can use variable-size encoding

A hybrid approach is to allow variability by supporting multiple-sized instructions

Encoding Examples

[Figure: encoding examples]

MIPS Instruction format

Register-format instructions

op (6 bits) | rs (5 bits) | rt (5 bits) | rd (5 bits) | shamt (5 bits) | funct (6 bits)

op: Basic operation of the instruction, traditionally called opcode
rs: The first register source operand
rt: The second register source operand
rd: The register destination operand; it gets the result of the operation
shamt: Shift amount
funct: This field selects the specific variant of the operation of the op field

Immediate-type instructions

op (6 bits) | rs (5 bits) | rt (5 bits) | address (16 bits)

Some instructions need longer fields than provided, for example for large constants

The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register

Example: lw $t0, 32($s3)    # Temporary register $t0 gets A[8]

Instruction  Format  op  rs   rt   rd   shamt  funct  address
add          R       0   reg  reg  reg  0      32     n/a
sub          R       0   reg  reg  reg  0      34     n/a
lw           I       35  reg  reg  n/a  n/a    n/a    address
sw           I       43  reg  reg  n/a  n/a    n/a    address
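The reach of a 16-bit address field can be checked with a small sketch. The code below packs an immediate-type word and recovers the sign-extended offset; `encode_i_format` and `decode_offset` are hypothetical names, and `lw $t0, 32($s3)` is used only as an illustrative instruction with the standard MIPS opcode 35 and register numbers 19 and 8.

```python
def encode_i_format(op, rs, rt, offset):
    """Pack op (6 bits), rs (5), rt (5) and a signed 16-bit offset into a 32-bit word."""
    if not -(1 << 15) <= offset < (1 << 15):
        raise ValueError("offset must fit in a signed 16-bit field")
    for value, width in ((op, 6), (rs, 5), (rt, 5)):
        if not 0 <= value < (1 << width):
            raise ValueError(f"field value {value} does not fit in {width} bits")
    return (op << 26) | (rs << 21) | (rt << 16) | (offset & 0xFFFF)

def decode_offset(word):
    """Recover the sign-extended 16-bit offset from an I-format word."""
    raw = word & 0xFFFF
    return raw - (1 << 16) if raw & 0x8000 else raw

# lw $t0, 32($s3): op=35, rs=$s3 (19), rt=$t0 (8), offset=32
LW_EXAMPLE = encode_i_format(35, 19, 8, 32)
```

Offsets outside the range -32768 to 32767 raise an error, which is the same limit that forces a compiler to use extra instructions for far-away data.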

The Stored Program Concept

Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept

Today's computers are built on two key principles:
- Instructions are represented as numbers
- Programs can be stored in memory to be read or written just like numbers

The power of the concept

Memory can contain:
- the source code for an editor
- the compiled machine code for the editor
- the text that the compiled program is using
- the compiler that generated the code

[Figure: a processor connected to a memory that holds an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and the source code in C for the editor program]

Compiling if-then-else in MIPS

Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiled MIPS code for the following C if statement?

if (i == j) f = g + h; else f = g - h;

[Figure: flowchart. If i == j, f = g + h; if i ≠ j, control goes to Else, where f = g - h; both paths meet at Exit]

MIPS:

bne $s3, $s4, Else       # go to Else if i ≠ j
add $s0, $s1, $s2        # f = g + h (skipped if i ≠ j)
j Exit
Else: sub $s0, $s1, $s2  # f = g - h (skipped if i = j)
Exit:

Typical Compilation

[Figure: typical compilation]

Major Types of Optimization

Each entry gives the optimization name, its explanation, and its frequency where the slide still shows one.

High level (at or near source level, machine independent)
- Procedure integration: Replace procedure call by procedure body. Frequency: N.M.

Local (within straight-line code)
- Common subexpression elimination: Replace two instances of the same computation by a single copy
- Constant propagation: Replace all instances of a variable that is assigned a constant with the constant
- Stack height reduction: Rearrange expression tree to minimize resources needed for expression evaluation. Frequency: N.M.

Global (across a branch)
- Global common subexpression elimination: Same as local, but this version crosses branches
- Copy propagation: Replace all instances of a variable A that has been assigned X (i.e., A = X) with X
- Code motion: Remove code from a loop that computes the same value each iteration of the loop
- Induction variable elimination: Simplify/eliminate array addressing calculations within loops

Machine-dependent (depends on machine knowledge)
- Strength reduction: Many examples, such as replacing multiply by a constant with adds and shifts. Frequency: N.M.
- Pipeline scheduling: Reorder instructions to improve pipeline performance. Frequency: N.M.
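Two of the optimizations above, common subexpression elimination and strength reduction, can be shown by hand on a tiny function. The sketch below is written in Python purely for illustration (a real compiler applies these transformations to its intermediate code); `before` and `after` are hypothetical names.

```python
def before(a, b, x):
    # The subexpression (a + b) is computed twice, and two values
    # are multiplied by constants.
    y = (a + b) * 8
    z = (a + b) + x * 5
    return y, z

def after(a, b, x):
    t = a + b                 # common subexpression elimination: compute a + b once
    y = t << 3                # strength reduction: multiply by 8 becomes a shift
    z = t + ((x << 2) + x)    # strength reduction: x * 5 becomes a shift and an add
    return y, z
```

Both functions return the same pair for any integers; only the operations used to get there differ.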

Effect of Compiler Optimization

[Figure: program and compiler optimization level; measurements taken on S]

Level 0: non-optimized code
Level 1: local optimization
Level 2: global optimization, s/w pipelining
Level 3: adds procedure integration

Compiler Support for Multimedia Instr.

Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics)

Intel added a new set of instructions called Streaming SIMD Extension

A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer

Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations
- Strided addressing allows memory access in increments larger than one
- Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data

Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup

Such limited support for vector processing makes the use of vectorizing compiler optimization unpopular and restricts its scope to hand-coded routines

SIMD instructions on MMX and AltiVec tend to be solutions, not primitives
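Strided and gather/scatter addressing can be pictured with ordinary list indexing. The sketch below models memory as a Python list; the function names are hypothetical and the code says nothing about how MMX or AltiVec actually behave.

```python
def strided_load(memory, base, stride, count):
    """Load `count` elements starting at `base`, stepping `stride` locations each time."""
    return [memory[base + i * stride] for i in range(count)]

def gather(memory, addresses):
    """Load one element from each address held in the index vector `addresses`."""
    return [memory[a] for a in addresses]

def scatter(memory, addresses, values):
    """Store each value at the matching address from the index vector."""
    for a, v in zip(addresses, values):
        memory[a] = v

memory = list(range(100, 116))           # 16 locations holding 100..115
column = strided_load(memory, 1, 4, 4)   # every 4th element, starting at index 1
picked = gather(memory, [7, 0, 12])      # elements at arbitrary addresses
scatter(memory, [7, 0, 12], [-1, -2, -3])
```

A strided load is what a vector unit needs to walk down one column of a row-major matrix; without it, each element has to be fetched separately.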

Starting a Program

[Figure: C program -> Compiler -> Assembly language program -> Assembler -> Object: machine language module (together with Object: library routine, machine language) -> Linker -> Executable: machine language program -> Loader -> Memory]

Linker
- Place code & data modules symbolically in memory
- Determine the address of data & instruction labels
- Patch both internal & external references

Object files for Unix typically contain:
- Header: size & position of components
- Text segment: machine code
- Data segment: static and dynamic variables
- Relocation info: identify absolute memory references
- Symbol table: name & location of labels, procedures and variables
- Debugging info: mapping source to object code, break points, etc.

Loading Executable Program

[Figure: memory layout, from high to low addresses]
$sp -> 7fff ffff hex    Stack
                        Dynamic data
$gp -> 1000 8000 hex    Static data
       1000 0000 hex
pc  -> 0040 0000 hex    Text
       0                Reserved

To load an executable, the operating system follows these steps:
- Reads the executable file header to determine the size of the text and data segments
- Creates an address space large enough for the text and data
- Copies the instructions and data from the executable file into memory
- Copies the parameters (if any) to the main program onto the stack
- Initializes the machine registers and sets the stack pointer to the first free location
- Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program

Instruction Set Design Issues

Instruction set design issues
- Number of Addresses
- Flow of Control
- Operand Types & Addressing Modes
- Instruction Types
- Instruction Formats

Number of Addresses

Four categories
3-address machines
- 2 for the source operands and one for the result
2-address machines
- One address doubles as source and result
1-address machines
- Accumulator machines
- Accumulator is used for one source and result
0-address machines
- Stack machines
- Operands are taken from the stack
- Result goes onto the stack

Three-address machines
- Two for the source operands, one for the result
- RISC processors use three addresses

Sample instructions
add dest,src1,src2     ; M(dest)=[src1]+[src2]
sub dest,src1,src2     ; M(dest)=[src1]-[src2]
mult dest,src1,src2    ; M(dest)=[src1]*[src2]

Three addresses: Operand 1, Operand 2, Result

Example: a = b + c

Three-address instruction formats are not common because they require a relatively long instruction format to hold the three address references

Example (three-address machine)

C statement
A = B + C * D - E + F + A

Equivalent code:
mult T,C,D    ; T = C*D
add T,T,B     ; T = B+C*D
sub T,T,E     ; T = B+C*D-E
add T,T,F     ; T = B+C*D-E+F
add A,T,A     ; A = B+C*D-E+F+A

Two-address machines
- One address doubles (for source operand & result)
- Last example makes a case for it
  - Address T is used twice

Sample instructions
load dest,src    ; M(dest)=[src]
add dest,src     ; M(dest)=[dest]+[src]
sub dest,src     ; M(dest)=[dest]-[src]
mult dest,src    ; M(dest)=[dest]*[src]

Two addresses: one address doubles as operand and result

Example: a = a + b

The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation

Example (two-address machine)

C statement
A = B + C * D - E + F + A

Equivalent code:
load T,C    ; T = C
mult T,D    ; T = C*D
add T,B     ; T = B+C*D
sub T,E     ; T = B+C*D-E
add T,F     ; T = B+C*D-E+F
add A,T     ; A = B+C*D-E+F+A

One-address machines
- Use a special set of registers called accumulators
  - Specify one source operand & receive the result
- Called accumulator machines

Sample instructions
load addr     ; accum = [addr]
store addr    ; M[addr] = accum
add addr      ; accum = accum + [addr]
sub addr      ; accum = accum - [addr]
mult addr     ; accum = accum * [addr]

Example (one-address machine)

C statement
A = B + C * D - E + F + A

Equivalent code:
load C     ; load C into accum
mult D     ; accum = C*D
add B      ; accum = C*D+B
sub E      ; accum = B+C*D-E
add F      ; accum = B+C*D-E+F
add A      ; accum = B+C*D-E+F+A
store A    ; store accum contents in A
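The accumulator sequence can be traced with a small interpreter. This is a minimal sketch under stated assumptions: memory is a Python dictionary keyed by variable name, the sample values are arbitrary, and `run_accumulator` is a hypothetical name.

```python
def run_accumulator(program, memory):
    """Execute (opcode, address) pairs on a one-address machine and return the memory."""
    accum = 0
    for opcode, addr in program:
        if opcode == "load":
            accum = memory[addr]
        elif opcode == "store":
            memory[addr] = accum
        elif opcode == "add":
            accum += memory[addr]
        elif opcode == "sub":
            accum -= memory[addr]
        elif opcode == "mult":
            accum *= memory[addr]
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return memory

# A = B + C * D - E + F + A, in the order used by the one-address example
program = [("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
           ("add", "F"), ("add", "A"), ("store", "A")]
memory = run_accumulator(program, {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6})
```

Every instruction except the implicit accumulator update touches memory, which is the source of the high memory traffic attributed to this style.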

Zero-address machines
- Stack supplies operands and receives the result
  - Special instructions to load and store use an address
- Called stack machines (Ex: HP3000, Burroughs B5500)

Sample instructions
push addr    ; push([addr])
pop addr     ; pop([addr])
add          ; push(pop + pop)
sub          ; push(pop - pop)
mult         ; push(pop * pop)

Example (zero-address machine)

C statement
A = B + C * D - E + F + A

Equivalent code:
push E
push C
push D
mult
push B
add
sub
push F
add
push A
add
pop A
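A short interpreter makes it easier to see why E is pushed first. The sketch below assumes that `sub` computes push(pop - pop) with the first pop being the top of the stack, which is one reading of the sample instructions; `run_stack` and the sample values are hypothetical.

```python
def run_stack(program, memory):
    """Execute a zero-address program; push and pop carry an address, ALU ops do not."""
    stack = []
    for instr in program:
        opcode = instr[0]
        if opcode == "push":
            stack.append(memory[instr[1]])
        elif opcode == "pop":
            memory[instr[1]] = stack.pop()
        elif opcode in ("add", "sub", "mult"):
            first = stack.pop()     # top of stack
            second = stack.pop()    # element below it
            if opcode == "add":
                result = first + second
            elif opcode == "sub":
                result = first - second   # assumed operand order: push(pop - pop)
            else:
                result = first * second
            stack.append(result)
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return memory

program = [("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
           ("push", "B"), ("add",), ("sub",), ("push", "F"), ("add",),
           ("push", "A"), ("add",), ("pop", "A")]
memory = run_stack(program, {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6})
```

With this operand order, E has to sit below B + C*D when `sub` executes, so it is pushed before the other operands.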

Load/Store Architecture

Instructions expect operands in internal processor registers
- Special LOAD and STORE instructions move data between registers and memory
- RISC uses this architecture
- Reduces instruction length

Sample instructions
load Rd,addr       ; Rd = [addr]
store addr,Rs      ; (addr) = Rs
add Rd,Rs1,Rs2     ; Rd = Rs1 + Rs2
sub Rd,Rs1,Rs2     ; Rd = Rs1 - Rs2
mult Rd,Rs1,Rs2    ; Rd = Rs1 * Rs2

Example (load/store machine)

C statement
A = B + C * D - E + F + A

Equivalent code:
load R1,B
load R2,C
load R3,D
load R4,E
load R5,F
load R6,A
mult R2,R2,R3
add R2,R2,R1
sub R2,R2,R4
add R2,R2,R5
add R2,R2,R6
store A,R2

Flow of Control

Default is sequential flow

Several instructions alter this default execution
- Branches
  - Unconditional
  - Conditional
  - Delayed branches
- Procedure calls
  - Delayed procedure calls

Branches

Unconditional
- Absolute address
- PC-relative
  - Target address is specified relative to PC contents
  - Relocatable code

Example: MIPS
- Absolute address
  j target
- PC-relative
  b target

Branches

Conditional
- Jump is taken only if the condition is met

Two types
- Set-Then-Jump
  - Condition testing is separated from branching
  - Condition code registers are used to convey the condition test result
  - Condition code registers keep a record of the status of the last ALU operation, such as overflow condition
- Example: Pentium code
  cmp AX,BX    ; compare AX and BX
  je target    ; jump if equal

- Test-and-Jump
  - Single instruction performs condition testing and branching
- Example: MIPS instruction
  beq Rsrc1,Rsrc2,target    ; jumps to target if Rsrc1 = Rsrc2

Delayed branching
- Control is transferred after executing the instruction that follows the branch instruction
  - This instruction slot is called the delay slot
- Improves efficiency
- Highly pipelined RISC processors support it

Procedure calls
- Facilitate modular programming
- Require two pieces of information to return
  - End of procedure
    - Pentium uses the ret instruction
    - MIPS uses the jr instruction
  - Return address
    - In a (special) register: MIPS allows any general-purpose register
    - On the stack: Pentium

[Figure: delay slot]

Parameter Passing

Two basic techniques
- Register-based (e.g., PowerPC, MIPS)
  - Internal registers are used
    - Faster
    - Limit the number of parameters
    - Recursive procedure
- Stack-based (e.g., Pentium)
  - Stack is used
    - More general

Operand Types

Instructions support basic data types
- Characters
- Integers
- Floating-point

Instruction overload
- Same instruction for different data types
- Example: Pentium
  mov AL,address     ; loads an 8-bit value
  mov AX,address     ; loads a 16-bit value
  mov EAX,address    ; loads a 32-bit value

Separate instructions
- Instructions specify the operand size
- Example: MIPS
  lb Rdest,address    ; loads a byte
  lh Rdest,address    ; loads a halfword (16 bits)
  lw Rdest,address    ; loads a word (32 bits)
  ld Rdest,address    ; loads a doubleword (64 bits)

Similar instructions exist for store

Addressing Modes

How the operands are specified

Operands can be in three places
- Registers
  - Register addressing mode
- Part of instruction
  - Constant
  - Immediate addressing mode
  - All processors support these two addressing modes
- Memory
  - Difference between RISC and CISC
  - CISC supports a large variety of addressing modes
  - RISC follows load/store architecture

Instruction Types

Several types of instructions

Data movement
- Pentium: mov dest,src
- Some do not provide direct data movement instructions
- Indirect data movement
  add Rdest,Rsrc,0    ; Rdest = Rsrc+0

Arithmetic and Logical
- Arithmetic
  - Integer and floating-point, signed and unsigned
  - add, subtract, multiply, divide
- Logical
  - and, or, not, xor

Condition code bits
- S: Sign bit (0 = +, 1 = -)
- Z: Zero bit (0 = nonzero, 1 = zero)
- O: Overflow bit (0 = no overflow, 1 = overflow)
- C: Carry bit (0 = no carry, 1 = carry)

Example: Pentium
cmp count,25    ; compare count to 25 (subtract 25 from count)
je target       ; jump if equal
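How a compare sets the four condition code bits can be sketched as a subtraction at a fixed width. The function names `compare` and `jump_if_equal` are hypothetical, and the carry bit here follows the x86 convention for subtraction (set when a borrow occurs), which is an assumption rather than something the slide states.

```python
def compare(a, b, bits=32):
    """Compute a - b at the given width and return the S, Z, O, C condition bits."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    ua, ub = a & mask, b & mask
    result = (ua - ub) & mask
    return {
        "S": int(bool(result & sign)),
        "Z": int(result == 0),
        # Signed overflow: the operands differ in sign and the result's
        # sign differs from the sign of a.
        "O": int(bool((ua ^ ub) & (ua ^ result) & sign)),
        # Assumed convention: carry is set when the unsigned subtraction borrows.
        "C": int(ua < ub),
    }

def jump_if_equal(flags):
    """Model of a jump-if-equal test: the branch is taken when the zero bit is set."""
    return flags["Z"] == 1
```

The compare only records the outcome; the later jump reads the zero bit, which is the separation that the Set-Then-Jump style relies on.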

Flow control and I/O instructions
- Branch
- Procedure call
- Interrupts

I/O instructions
- Memory-mapped I/O
  - Most processors support memory-mapped I/O
  - No separate instructions for I/O
- Isolated I/O
  - Pentium supports isolated I/O
  - Separate I/O instructions
    in AX,io_port     ; read from an I/O port
    out io_port,AX    ; write to an I/O port

Instruction Formats

Two types
- Fixed-length
  - Used by RISC processors
  - 32-bit RISC processors use 32-bit wide instructions
    - Examples: SPARC, MIPS, PowerPC
- Variable-length
  - Used by CISC processors
  - Memory operands need more bits to specify

Opcode
- Major and exact operation

Examples of Instruction Formats

[Figure: examples of instruction formats]

RISC (Reduced Instruction Set Computer) versus CISC (Complex Instruction Set Computer)

RISC vs CISC

The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute

RISC systems access memory only with explicit load and store instructions

In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable

The difference between CISC and RISC becomes evident through the basic computer performance equation:

time/program = time/cycle × cycles/instruction × instructions/program

RISC systems shorten execution time by reducing the clock cycles per instruction

CISC systems improve performance by reducing the number of instructions per program

The simple instruction set of RISC machines enables control units to be hardwired for maximum speed

The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time

With fixed-length instructions, RISC lends itself to pipelining and speculative execution

Consider the two program fragments:

CISC
mov ax, 10
mov bx, 5
mul bx, ax

RISC
mov ax, 0
mov bx, 10
mov cx, 5
Begin: add ax, bx
loop Begin

The total clock cycles for the CISC version might be:
(2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles

While the clock cycles for the RISC version are:
(3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles

With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds
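The cycle comparison is an instance of the performance equation (total cycles equal the sum of instruction counts times cycles per instruction, and time equals cycles times cycle time). The sketch below works it through in Python; the per-instruction cycle counts (1-cycle moves, adds and loops, a 30-cycle multiply) and the cycle times are illustrative assumptions, and the function names are hypothetical.

```python
def total_cycles(instruction_mix):
    """Sum count * cycles-per-instruction over (count, cycles) pairs."""
    return sum(count * cycles for count, cycles in instruction_mix)

def execution_time(instruction_mix, cycle_time_ns):
    """Time per program = total cycles * time per cycle."""
    return total_cycles(instruction_mix) * cycle_time_ns

# Assumed mixes: (number of instructions executed, cycles each)
cisc_mix = [(2, 1), (1, 30)]           # 2 movs, 1 multiply
risc_mix = [(3, 1), (5, 1), (5, 1)]    # 3 movs, 5 adds, 5 loop instructions

CISC_CYCLES = total_cycles(cisc_mix)
RISC_CYCLES = total_cycles(risc_mix)
```

The RISC fragment executes more instructions (13 against 3) yet needs fewer cycles under these assumptions, which is the trade-off the equation is meant to expose.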

Because of their load-store ISAs, RISC architectures require a large number of CPU registers

These registers provide fast access to data during sequential program execution

They can also be employed to reduce the overhead typically caused by passing parameters to subprograms

Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers

[Figure: overlapped register windows]

This is how registers can be overlapped in a RISC system

The current window pointer (CWP) points to the active register window

It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures

Some RISC systems provide more extravagant instruction sets than some CISC systems

Some systems combine both approaches

The following two slides summarize the characteristics that traditionally typify the differences between these two architectures

RISC
- Multiple register sets
- Three operands per instruction
- Parameter passing through register windows
- Single-cycle instructions
- Hardwired control
- Highly pipelined
- Simple instructions, few in number
- Fixed length instructions
- Complexity in compiler
- Only LOAD/STORE instructions access memory
- Few addressing modes

CISC
- Single register set
- One or two register operands per instruction
- Parameter passing through memory
- Multiple cycle instructions
- Microprogrammed control
- Less pipelined
- Many complex instructions
- Variable length instructions
- Complexity in microcode
- Many instructions can access memory
- Many addressing modes

Summary

Instruction Set Design Issues

Instruction set design issues include:
- Where are operands stored?
  - registers, memory, stack, accumulator
- How many explicit operands are there?
  - 0, 1, 2, or 3
- How is the operand location specified?
  - register, immediate, indirect, ...
- What type & size of operands are supported?
  - byte, int, float, double, string, vector, ...
- What operations are supported?
  - add, sub, mul, move, compare, ...

More About General Purpose Registers

Why do almost all new architectures use GPRs?
- Registers are much faster than memory (even cache)
  - Register values are available immediately
  - When memory isn't ready, the processor must wait ("stall")
- Registers are convenient for variable storage
  - Compiler assigns some variables just to registers
  - More compact code, since small fields specify registers (compared to memory addresses)

[Figure: processor with registers and cache, then memory, then disk]

What Operations are Needed

Arithmetic + Logical
- Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
- Logical operation: AND, OR, NOT

Data Transfer - copy, load, store

Control - branch, jump, call, return

Floating Point
- ADDF, MULF, DIVF (same as arithmetic but usually take bigger operands)

Decimal - ADDD, CONVERT

String - move, compare, search

Graphics - pixel and vertex, compression/decompression operations

Stack Architecture: Pros and Cons

Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures

Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures

Accumulator Architecture: Pros and Cons

Pros
- Very low hardware requirements
- Easy to design and understand

Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic

Memory-Memory Architecture: Pros and Cons

Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)

Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required

Memory-Register Architecture: Pros and Cons

Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density

Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers

Page 27: Chapter 2: Advanced computer Architecture

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 27101

Examples of Addressing Modes

Register
  Example: Add R4, R3
  Meaning: Regs[R4] <- Regs[R4] + Regs[R3]
  When used: When a value is in a register

Immediate
  Example: Add R4, #3
  Meaning: Regs[R4] <- Regs[R4] + 3
  When used: For constants

Displacement
  Example: Add R4, 100(R1)
  Meaning: Regs[R4] <- Regs[R4] + Mem[100 + Regs[R1]]
  When used: Accessing local variables

Register indirect
  Example: Add R4, (R1)
  Meaning: Regs[R4] <- Regs[R4] + Mem[Regs[R1]]
  When used: Accessing using a pointer or a computed address

Indexed
  Example: Add R4, (R1 + R2)
  Meaning: Regs[R4] <- Regs[R4] + Mem[Regs[R1] + Regs[R2]]
  When used: Sometimes useful in array addressing: R1 = base of the array, R2 = index amount

Direct or absolute
  Example: Add R4, (1001)
  Meaning: Regs[R4] <- Regs[R4] + Mem[1001]
  When used: Sometimes useful for accessing static data; the address constant may need to be large

Memory indirect or memory deferred
  Example: Add R4, @(R3)
  Meaning: Regs[R4] <- Regs[R4] + Mem[Mem[Regs[R3]]]
  When used: If R3 is the address of the pointer p, then the mode yields *p

Autoincrement
  Example: Add R4, (R2)+
  Meaning: Regs[R4] <- Regs[R4] + Mem[Regs[R2]]; Regs[R2] <- Regs[R2] + d
  When used: Useful for stepping through arrays within a loop. R2 points to the start of the array; each reference increments R2 by d

Autodecrement
  Example: Add R4, -(R2)
  Meaning: Regs[R2] <- Regs[R2] - d; Regs[R4] <- Regs[R4] + Mem[Regs[R2]]
  When used: Same use as autoincrement. Autodecrement/increment can also act as push/pop to implement a stack

Scaled
  Example: Add R4, 100(R2)[R3]
  Meaning: Regs[R4] <- Regs[R4] + Mem[100 + Regs[R2] + Regs[R3] * d]
  When used: Used to index arrays

Addressing Mode for Signal Processing

Fast Fourier Transform (bit-reversed index order):

  0 (000) -> 0 (000)
  1 (001) -> 4 (100)
  2 (010) -> 2 (010)
  3 (011) -> 6 (110)
  4 (100) -> 1 (001)
  5 (101) -> 5 (101)
  6 (110) -> 3 (011)
  7 (111) -> 7 (111)

Modulo addressing
- Since DSP deals with continuous data streams, circular buffers are widely used
- Circular or modulo addressing allows automatic increment and decrement and resets the pointer when reaching the end of the buffer

Reverse addressing
- The resulting address is the reverse (bit) order of the current address
- Reverse addressing mode expedites the access, which otherwise requires a number of logical instructions or extra memory accesses

DSP offers special addressing modes to better serve popular algorithms

Special features require either hand coding or a compiler that uses such features (GNU would not be a good choice)
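The two DSP addressing modes can be sketched in a few lines of Python. This is only an illustration: the function names `bit_reverse` and `circular_next` are hypothetical, and real DSPs implement both modes in address-generation hardware rather than in software.

```python
def bit_reverse(index: int, bits: int) -> int:
    """Return index with its lowest `bits` bits written in reverse order."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)  # move the low bit of index into result
        index >>= 1
    return result


def circular_next(pointer: int, step: int, base: int, length: int) -> int:
    """Advance pointer by step inside [base, base + length), wrapping at either end."""
    return base + (pointer - base + step) % length


# Access order for an 8-point FFT (3 address bits).
print([bit_reverse(i, 3) for i in range(8)])
# Stepping past the end of a circular buffer that starts at address 100.
print(circular_next(107, 1, base=100, length=8))
```

Without these modes, each access would need explicit shift, mask, and compare instructions, which is the overhead the slide refers to.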

Operations of the Computer Hardware

"There must certainly be instructions for performing the fundamental arithmetic operations."
Burks, Goldstine, and von Neumann, 1947

Assembly language is a symbolic representation of what the processor actually understands

The MIPS assembler allows only one instruction per line and ignores comments following # until the end of the line

Example:

Translation of a segment of a C program to MIPS assembly instructions

C:    f = (g + h) - (i + j)

MIPS:
  add t0, g, h     # temp variable t0 contains g + h
  add t1, i, j     # temp variable t1 contains i + j
  sub f, t0, t1    # f = t0 - t1 = (g + h) - (i + j)

Operations in the Instruction Set

Operator type            Examples
Arithmetic and logical   Integer arithmetic and logical operations: add, and, subtract, or
Data transfer            Loads-stores (move instructions on machines with memory addressing)
Control                  Branch, jump, procedure call and return, trap
System                   Operating system call, virtual memory management instructions
Floating point           Floating point instructions: add, multiply
Decimal                  Decimal add, decimal multiply, decimal to character conversion
String                   String move, string compare, string search
Graphics                 Pixel operations, compression/decompression operations

- Arithmetic, logical, data transfer and control are almost standard categories for all machines
- System instructions are required for multi-programming environments, although support for system functions varies
- Decimal and string instructions can be primitives, e.g. on the IBM 360 and the VAX
- Support for floating point, decimal, string and graphics can be optional, sometimes provided via a co-processor
- Some machines rely on the compiler to synthesize special operations, such as string handling, from simpler instructions

Operations for Media & Signal Processing

Single instruction multiple data (SIMD) and vector instructions are often supported in DSPs, which are commonly used in multimedia and signal processing applications

Partitioned add (integer)
- Performs multiple narrow additions in parallel on a 64-bit ALU, since most data are narrow
- Increases ALU throughput for multimedia applications

Paired single operations (float)
- Allow the same register to act as two operands to the same operation
- Handy in dealing with vertices and coordinates

Multiply and accumulate
- Very handy for calculating dot products of vectors (signal processing) and matrix multiplication
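A rough Python model of a partitioned add and a multiply-accumulate loop follows. The 16-bit lane width is an assumption chosen for illustration (the lane width is not recoverable from the slide), and the helper names are hypothetical.

```python
LANE_BITS = 16                      # assumed lane width, for illustration only
LANES = 64 // LANE_BITS             # lanes that fit in one 64-bit word
LANE_MASK = (1 << LANE_BITS) - 1


def pack(values):
    """Pack LANES small integers into one 64-bit word, lane 0 in the low bits."""
    word = 0
    for i, v in enumerate(values):
        word |= (v & LANE_MASK) << (i * LANE_BITS)
    return word


def unpack(word):
    """Split a 64-bit word back into its lanes."""
    return [(word >> (i * LANE_BITS)) & LANE_MASK for i in range(LANES)]


def partitioned_add(a, b):
    """Add lane by lane; a carry out of one lane never reaches the next lane."""
    return pack([(x + y) & LANE_MASK for x, y in zip(unpack(a), unpack(b))])


def multiply_accumulate(xs, ys):
    """Dot product written as repeated multiply-and-accumulate steps."""
    acc = 0
    for x, y in zip(xs, ys):
        acc += x * y                # one MAC operation per element pair
    return acc


print(unpack(partitioned_add(pack([1, 2, 3, 0xFFFF]), pack([10, 20, 30, 1]))))
print(multiply_accumulate([1, 2, 3], [4, 5, 6]))
```

The last lane in the example wraps around to 0 instead of carrying into a neighbour, which is the property that distinguishes a partitioned add from an ordinary 64-bit add.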

Frequency of Operations Usage

Make the common case fast by focusing on these operations

The most widely executed instructions are the simple operations of an instruction set

The following is the average usage in SPECint92 on Intel 80x86:

Rank   80x86 instruction         Integer average (% total executed)
1      Load                      22%
2      Conditional branch        20%
3      Compare                   16%
4      Store                     12%
5      Add                       8%
6      And                       6%
7      Sub                       5%
8      Move register-register    4%
9      Call                      1%
10     Return                    1%
       Total                     96%

Control Flow Instructions

- Jump: for unconditional change in the control flow
- Branch: for conditional change in the control flow
- Procedure calls and returns

Data is based on SPEC on Alpha

Destination Address Definition

- Relative addressing with respect to the program counter proved to be the best choice for forward and backward branches or jumps (load address independent)
- To allow for dynamic loading of library routines, register indirect addressing allows addresses to be loaded in special registers (e.g. virtual functions in C++ and system calls in a case statement)

Data is based on SPEC on Alpha
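Why PC-relative targets are independent of the load address can be seen in a small sketch. It assumes a generic machine where the target is simply the PC plus a sign-extended offset field. Real ISAs often scale the offset and add it to the address of the next instruction, which this sketch deliberately ignores. The function names are hypothetical.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's complement number."""
    sign_bit = 1 << (bits - 1)
    return (value & (sign_bit - 1)) - (value & sign_bit)


def pc_relative_target(pc: int, offset_field: int, bits: int = 16) -> int:
    """Target = PC + signed offset (simplified: no scaling, no PC+4 adjustment)."""
    return pc + sign_extend(offset_field, bits)


# The same encoded offset works wherever the code is loaded.
for load_address in (0x1000, 0x9000):
    branch_pc = load_address + 0x40
    target = pc_relative_target(branch_pc, 0xFFFC)   # 0xFFFC encodes -4
    print(hex(branch_pc), "->", hex(target))
```

An absolute target, by contrast, would have to be patched whenever the program is placed at a different address.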

Condition Evaluation

Compare-and-branch can be efficient if the majority of conditions are comparisons with zero

Remember to focus on the common case

Based on SPEC on MIPS

Frequency of Types of Comparison

Data is based on SPEC on Alpha

Different benchmarks and machines set new design priorities

DSPs support a repeat instruction for for-loops (vectors) using registers

Supporting Procedures

Execution of a procedure follows these steps:
1. Store parameters in a place accessible to the procedure
2. Transfer control to the procedure
3. Acquire the storage resources needed for the procedure
4. Perform the desired task
5. Store the result value in a place accessible to the calling program
6. Return control to the point of origin

The hardware provides a program counter to trace instruction flow and manage transfer of control

Parameter Passing
- Registers can be used for passing a small number of parameters
- A stack is used to spill registers of the current context, to make room for the called procedure to run, and to allow large parameters to be passed
- Storage of machine state can be performed by the caller or the callee
- Handling of shared variables is important to ensure correct semantics and thus requires clear specifications in the library interface

Global variables stored in registers need careful handling

Type and Size of Operands

- The type of an operand is designated by encoding it in the instruction's operation code
- The type of an operand, e.g. single precision float, effectively gives its size
- Common operand types include character, half word and word size integer, and single- and double-precision floating point
- Characters are almost always in ASCII, integers are in 2's complement, and floating point is in IEEE 754
- The 16-bit Unicode used in Java is gaining popularity due to its support for international character sets
- For business applications, some architectures support a decimal format in binary coded decimal (BCD)
- Depending on the size of the word, the complexity of handling different operand types differs
- DSP offers fixed point data types to support high precision floating point arithmetic and to allow sharing a single exponent among multiple numbers
- For graphics applications, vertex and pixel operands are added features
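Two of the integer encodings named above, 2's complement and packed BCD, can be illustrated with a short Python sketch. The function names are hypothetical and the code is a teaching aid, not a model of any particular machine.

```python
def to_twos_complement(value: int, bits: int) -> int:
    """Encode a signed integer as a `bits`-wide two's complement bit pattern."""
    if not -(1 << (bits - 1)) <= value < (1 << (bits - 1)):
        raise ValueError(f"{value} does not fit in {bits} bits")
    return value & ((1 << bits) - 1)


def from_twos_complement(pattern: int, bits: int) -> int:
    """Decode a `bits`-wide two's complement bit pattern to a signed integer."""
    sign_bit = 1 << (bits - 1)
    return (pattern & (sign_bit - 1)) - (pattern & sign_bit)


def to_packed_bcd(value: int) -> int:
    """Encode a non-negative integer with one decimal digit per 4-bit nibble."""
    if value < 0:
        raise ValueError("packed BCD sketch handles non-negative values only")
    result, shift = 0, 0
    while True:
        value, digit = divmod(value, 10)
        result |= digit << shift
        shift += 4
        if value == 0:
            return result


print(hex(to_twos_complement(-1, 8)))
print(from_twos_complement(0x80, 8))
print(hex(to_packed_bcd(1947)))
```

In packed BCD the hexadecimal rendering of the encoded word reads the same as the decimal value, which is why it suits business arithmetic that must avoid binary rounding of decimal fractions.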

Size of Operands

- The double-word data type is used for double-precision floating point operations and for address storage in machines with a 64-bit wide address bus
- Words are used for integer operations and on machines with a 32-bit address bus
- Because of the mix in SPEC, word and double-word data types dominate

Frequency of reference by size, based on SPEC2000 on Alpha

Instruction Representation

- Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary, or base 2)
- Numbers are stored in computers as a series of high and low electronic signals (binary numbers)
- Binary digits are called bits and are considered the atom of computing
- Each piece of an instruction is a number, and placing these numbers together forms the instruction
- The assembler translates the symbolic assembly instructions into machine language instructions (machine code)

Example:

Assembly:                 add $t0, $s1, $s2
M/C language (decimal):   0      17    18    8     0     32
M/C language (binary):    000000 10001 10010 01000 00000 100000
Field widths:             6 bits 5 bits 5 bits 5 bits 5 bits 6 bits

Note: the MIPS compiler by default maps $s0..$s7 to registers 16-23 and $t0..$t7 to registers 8-15
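How the six numeric fields are packed into one 32-bit word can be sketched as follows. The field values 0, 17, 18, 8, 0, 32 are those of `add $t0, $s1, $s2` in the standard MIPS encoding; the function name `encode_r_type` is hypothetical.

```python
def encode_r_type(op, rs, rt, rd, shamt, funct):
    """Concatenate the six R-format fields (6, 5, 5, 5, 5, 6 bits) into a 32-bit word."""
    fields = [(op, 6), (rs, 5), (rt, 5), (rd, 5), (shamt, 5), (funct, 6)]
    word = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"field value {value} does not fit in {width} bits")
        word = (word << width) | value      # append this field to the right
    return word


# add $t0, $s1, $s2  ->  op=0, rs=$s1=17, rt=$s2=18, rd=$t0=8, shamt=0, funct=32
word = encode_r_type(0, 17, 18, 8, 0, 32)
print(f"{word:032b}")
print(f"0x{word:08x}")
```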

Encoding an Instruction Set

- Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation
- The operation is typically specified in one field, called the opcode
- The addressing mode for the operand can be encoded with the operation, or specified through a separate identifier when a large number of modes is supported
- The architecture must balance several competing factors:
  - Desire to support as many registers and addressing modes as possible
  - Effect of operand specification on the size of the instruction (program)
  - Desire to simplify instruction fetching and decoding during execution
- Fixed size instruction encoding simplifies the CPU design while limiting the addressing modes supported
- An architect who cares about code size can use variable size encoding
- A hybrid approach is to allow variability by supporting multiple-sized instructions

Encoding Examples

MIPS Instruction Format

Register-format instructions

  op (6 bits) | rs (5 bits) | rt (5 bits) | rd (5 bits) | shamt (5 bits) | funct (6 bits)

- op: Basic operation of the instruction, traditionally called the opcode
- rs: The first register source operand
- rt: The second register source operand
- rd: The register destination operand; it gets the result of the operation
- shamt: Shift amount
- funct: This field selects the specific variant of the operation in the op field

Immediate-type instructions

  op (6 bits) | rs (5 bits) | rt (5 bits) | address (16 bits)

- Some instructions need longer fields than provided, for large value constants
- The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register
- Example: lw $t0, 32($s3)    # Temporary register $t0 gets A[8]

Instruction   Format   op   rs    rt    rd    shamt   funct   address
add           R        0    reg   reg   reg   0       32      N/A
sub           R        0    reg   reg   reg   0       34      N/A
lw            I        35   reg   reg   N/A   N/A     N/A     address
sw            I        43   reg   reg   N/A   N/A     N/A     address
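The immediate format and its ±2^15 byte reach can be checked with a small encoder. The function name `encode_i_type` is hypothetical. The opcode 35 for load word and the register numbers $s3 = 19 and $t0 = 8 follow the standard MIPS conventions.

```python
def encode_i_type(op, rs, rt, immediate):
    """Pack op (6 bits), rs (5), rt (5) and a signed 16-bit immediate into 32 bits."""
    if not 0 <= op < 64:
        raise ValueError("op must fit in 6 bits")
    if not (0 <= rs < 32 and 0 <= rt < 32):
        raise ValueError("register numbers must fit in 5 bits")
    if not -(1 << 15) <= immediate < (1 << 15):
        raise ValueError("immediate must fit in 16 signed bits")
    return (op << 26) | (rs << 21) | (rt << 16) | (immediate & 0xFFFF)


# lw $t0, 32($s3): op=35, base register rs=$s3=19, target rt=$t0=8, offset 32
print(f"0x{encode_i_type(35, 19, 8, 32):08x}")

# An offset beyond the 16-bit range cannot be encoded in a single instruction.
try:
    encode_i_type(35, 19, 8, 1 << 15)
except ValueError as err:
    print("rejected:", err)
```

Offsets outside the range have to be built in a register first, which is the sense in which some instructions "need longer fields than provided".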

The Stored Program Concept

Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept

Today's computers are built on two key principles:
- Instructions are represented as numbers
- Programs can be stored in memory to be read or written just like numbers

The power of the concept: memory can contain
- the source code for an editor
- the compiled machine code for the editor
- the text that the compiled program is using
- the compiler that generated the code

Figure: a processor connected to a memory that holds an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and the source code in C for the editor program

Compiling if-then-else in MIPS

Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiled MIPS code for the following C if statement?

  if (i == j) f = g + h; else f = g - h;

Figure: flowchart testing i == j; when i = j control falls through to f = g + h, when i ≠ j control goes to Else: f = g - h; both paths meet at Exit:

MIPS:

        bne $s3, $s4, Else      # go to Else if i ≠ j
        add $s0, $s1, $s2       # f = g + h (skipped if i ≠ j)
        j   Exit                # go to Exit
  Else: sub $s0, $s1, $s2       # f = g - h (skipped if i = j)
  Exit:

Typical Compilation

Major Types of Optimization

Optimization name                          Explanation                                                                          Frequency

High-level (at or near source level; machine independent)
  Procedure integration                    Replace procedure call by procedure body                                             N.M.

Local (within straight-line code)
  Common subexpression elimination         Replace two instances of the same computation by a single copy                       18%
  Constant propagation                     Replace all instances of a variable that is assigned a constant with the constant    22%
  Stack height reduction                   Rearrange expression tree to minimize resources needed for expression evaluation     N.M.

Global (across a branch)
  Global common subexpression elimination  Same as local, but this version crosses branches                                     13%
  Copy propagation                         Replace all instances of a variable A that has been assigned X (i.e. A = X) with X   11%
  Code motion                              Remove code from a loop that computes the same value each iteration of the loop      16%
  Induction variable elimination           Simplify/eliminate array addressing calculations within loops                        2%

Machine-dependent (depends on machine knowledge)
  Strength reduction                       Many examples, such as replacing multiply by a constant with adds and shifts         N.M.
  Pipeline scheduling                      Reorder instructions to improve pipeline performance                                 N.M.

N.M. = not measured
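Two of the optimizations above can be shown as before/after pairs in Python. This is only a source-level illustration with hypothetical function names; a real compiler applies these rewrites to its intermediate representation, not to source text.

```python
# Strength reduction: multiply by the constant 10 replaced by shifts and an add.
def times_ten_multiply(x: int) -> int:
    return x * 10


def times_ten_shift_add(x: int) -> int:
    return (x << 3) + (x << 1)        # 8x + 2x


# Common subexpression elimination: (a + b) is computed once instead of twice.
def before_cse(a: int, b: int, c: int):
    x = (a + b) * c
    y = (a + b) - c
    return x, y


def after_cse(a: int, b: int, c: int):
    t = a + b                          # single copy of the shared computation
    return t * c, t - c


print(times_ten_multiply(7), times_ten_shift_add(7))
print(before_cse(2, 3, 4), after_cse(2, 3, 4))
```

Both rewrites preserve the result while removing work: the first trades a multiply for cheaper operations, the second removes a redundant add.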

Effect of Compiler Optimization

Figure: measurements taken on MIPS, plotted by program and compiler optimization level

- Level 0: non-optimized code
- Level 1: local optimization
- Level 2: global optimization, s/w pipelining
- Level 3: adds procedure integration

Compiler Support for Multimedia Instructions

- Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics)
- Intel added a new set of instructions, called the Streaming SIMD Extension
- A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer
- Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations
  - Strided addressing allows memory access in increments larger than one
  - Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data
- Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup
- Such limited support for vector processing makes the use of vectorizing compiler optimizations unpopular and restricts their scope to hand-coded routines

SIMD instructions on MMX and AltiVec tend to be solutions, not primitives
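Strided and gather/scatter accesses can be sketched with a Python list standing in for memory. The function names are hypothetical, and a vector machine would perform each of these as a single instruction rather than a loop.

```python
def strided_load(memory, base, stride, count):
    """Load `count` elements starting at `base`, stepping by `stride` each time."""
    return [memory[base + i * stride] for i in range(count)]


def gather(memory, addresses):
    """Load the elements whose addresses are listed in an index vector."""
    return [memory[a] for a in addresses]


def scatter(memory, addresses, values):
    """Store each value at the address given by the matching index-vector entry."""
    for a, v in zip(addresses, values):
        memory[a] = v


memory = list(range(100, 120))
print(strided_load(memory, base=1, stride=4, count=3))   # e.g. one matrix column
print(gather(memory, [7, 0, 3]))
scatter(memory, [7, 0, 3], [-1, -2, -3])
print(memory[:8])
```

A stride equal to the row length walks down a column of a row-major matrix, which is the typical case that unit-stride-only SIMD extensions handle poorly.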

Starting a Program

Figure: C program -> Compiler -> Assembly language program -> Assembler -> Object: machine language module (together with Object: library routine, machine language) -> Linker -> Executable: machine language program -> Loader -> Memory

Linker
- Places code and data modules symbolically in memory
- Determines the addresses of data and instruction labels
- Patches both internal and external references

Object files for Unix typically contain:
- Header: size and position of components
- Text segment: machine code
- Data segment: static and dynamic variables
- Relocation info: identifies absolute memory references
- Symbol table: name and location of labels, procedures and variables
- Debugging info: mapping of source to object code, break points, etc.

Loading Executable Program

Figure: MIPS memory layout

  $sp -> 7fff ffff hex    Stack
                          Dynamic data
  $gp -> 1000 8000 hex    Static data
         1000 0000 hex
  pc  -> 0040 0000 hex    Text
         0                Reserved

To load an executable, the operating system follows these steps:
1. Reads the executable file header to determine the size of the text and data segments
2. Creates an address space large enough for the text and data
3. Copies the instructions and data from the executable file into memory
4. Copies the parameters (if any) to the main program onto the stack
5. Initializes the machine registers and sets the stack pointer to the first free location
6. Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program

Instruction Set Design Issues

Instruction set design issues:
- Number of addresses
- Flow of control
- Operand types
- Addressing modes
- Instruction types
- Instruction formats

Number of Addresses

Four categories:
- 3-address machines
  - 2 for the source operands and one for the result
- 2-address machines
  - One address doubles as source and result
- 1-address machines
  - Accumulator machines
  - Accumulator is used for one source and the result
- 0-address machines
  - Stack machines
  - Operands are taken from the stack
  - Result goes onto the stack

Number of Addresses (cont.)

Three-address machines
- Two for the source operands, one for the result
- RISC processors use three addresses
- Sample instructions:

    add  dest,src1,src2    ; M(dest) = [src1] + [src2]
    sub  dest,src1,src2    ; M(dest) = [src1] - [src2]
    mult dest,src1,src2    ; M(dest) = [src1] * [src2]

Three addresses: Operand 1, Operand 2, Result

Example: a = b + c

Three-address instruction formats are not common because they require a relatively long instruction format to hold the three address references

Number of Addresses (cont.)

Example

C statement:

    A = B + C * D - E + F + A

Equivalent code:

    mult T,C,D    ; T = C*D
    add  T,T,B    ; T = B + C*D
    sub  T,T,E    ; T = B + C*D - E
    add  T,T,F    ; T = B + C*D - E + F
    add  A,T,A    ; A = B + C*D - E + F + A

Number of Addresses (cont.)

Two-address machines
- One address doubles (for source operand and result)
- The last example makes a case for it
  - Address T is used twice
- Sample instructions:

    load dest,src    ; M(dest) = [src]
    add  dest,src    ; M(dest) = [dest] + [src]
    sub  dest,src    ; M(dest) = [dest] - [src]
    mult dest,src    ; M(dest) = [dest] * [src]

Two addresses: one address doubles as operand and result

Example: a = a + b

The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation

Number of Addresses (cont.)

Example

C statement:

    A = B + C * D - E + F + A

Equivalent code:

    load T,C    ; T = C
    mult T,D    ; T = C*D
    add  T,B    ; T = B + C*D
    sub  T,E    ; T = B + C*D - E
    add  T,F    ; T = B + C*D - E + F
    add  A,T    ; A = B + C*D - E + F + A

Number of Addresses (cont.)

One-address machines
- Use a special set of registers called accumulators
  - Specify one source operand and receive the result
- Called accumulator machines
- Sample instructions:

    load  addr    ; accum = [addr]
    store addr    ; M[addr] = accum
    add   addr    ; accum = accum + [addr]
    sub   addr    ; accum = accum - [addr]
    mult  addr    ; accum = accum * [addr]

Number of Addresses (cont.)

Example

C statement:

    A = B + C * D - E + F + A

Equivalent code:

    load  C    ; load C into accum
    mult  D    ; accum = C*D
    add   B    ; accum = C*D + B
    sub   E    ; accum = B + C*D - E
    add   F    ; accum = B + C*D - E + F
    add   A    ; accum = B + C*D - E + F + A
    store A    ; store accum contents in A
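The one-address sequence can be checked with a tiny accumulator-machine interpreter. This is a sketch: the tuple-based program format and the name `run_accumulator` are assumptions made for illustration, and the variable values are arbitrary.

```python
def run_accumulator(program, memory):
    """Execute one-address instructions; every operation goes through `acc`."""
    acc = 0
    for op, addr in program:
        if op == "load":
            acc = memory[addr]
        elif op == "store":
            memory[addr] = acc
        elif op == "add":
            acc += memory[addr]
        elif op == "sub":
            acc -= memory[addr]
        elif op == "mult":
            acc *= memory[addr]
        else:
            raise ValueError(f"unknown instruction {op!r}")
    return memory


program = [("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
           ("add", "F"), ("add", "A"), ("store", "A")]
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
run_accumulator(program, memory)
print(memory["A"])    # B + C*D - E + F + A evaluated with the values above
```

Every instruction except the final store reads memory, which illustrates the high memory traffic listed among the accumulator architecture's drawbacks.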

Number of Addresses (cont.)

Zero-address machines
- The stack supplies the operands and receives the result
  - Special instructions to load and store use an address
- Called stack machines (Ex: HP3000, Burroughs B5500)
- Sample instructions:

    push addr    ; push([addr])
    pop  addr    ; pop([addr])
    add          ; push(pop + pop)
    sub          ; push(pop - pop)
    mult         ; push(pop * pop)

Number of Addresses (cont.)

Example

C statement:

    A = B + C * D - E + F + A

Equivalent code:

    push E        sub
    push C        push F
    push D        add
    mult          push A
    push B        add
    add           pop A
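The zero-address sequence, read down the left column and then the right, can be traced with a small stack-machine interpreter. This sketch assumes that `sub` computes the first popped value minus the second, matching the `push(pop - pop)` definition; the program encoding and the name `run_stack` are hypothetical.

```python
def run_stack(program, memory):
    """Execute zero-address instructions; only push and pop name a memory address."""
    stack = []
    for instr in program:
        op = instr[0]
        if op == "push":
            stack.append(memory[instr[1]])
        elif op == "pop":
            memory[instr[1]] = stack.pop()
        elif op in ("add", "sub", "mult"):
            first = stack.pop()          # value on top of the stack
            second = stack.pop()         # value just below it
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)
            else:
                stack.append(first * second)
        else:
            raise ValueError(f"unknown instruction {op!r}")
    return memory


program = [("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
           ("push", "B"), ("add",), ("sub",), ("push", "F"), ("add",),
           ("push", "A"), ("add",), ("pop", "A")]
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
run_stack(program, memory)
print(memory["A"])
```

E is pushed first so that, when `sub` executes, B + C*D is on top and E is beneath it. This ordering concern is one reason stack code is harder to optimize.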

Load/Store Architecture

- Instructions expect operands in internal processor registers
- Special LOAD and STORE instructions move data between registers and memory
- RISC uses this architecture
- Reduces instruction length

Load/Store Architecture (cont.)

Sample instructions:

    load  Rd,addr       ; Rd = [addr]
    store addr,Rs       ; (addr) = Rs
    add   Rd,Rs1,Rs2    ; Rd = Rs1 + Rs2
    sub   Rd,Rs1,Rs2    ; Rd = Rs1 - Rs2
    mult  Rd,Rs1,Rs2    ; Rd = Rs1 * Rs2

Number of Addresses (cont.)

Example

C statement:

    A = B + C * D - E + F + A

Equivalent code:

    load R1,B        mult  R2,R2,R3
    load R2,C        add   R2,R2,R1
    load R3,D        sub   R2,R2,R4
    load R4,E        add   R2,R2,R5
    load R5,F        add   R2,R2,R6
    load R6,A        store A,R2

Flow of Control

- Default is sequential flow
- Several instructions alter this default execution
  - Branches
    - Unconditional
    - Conditional
    - Delayed branches
  - Procedure calls
    - Delayed procedure calls

Flow of Control (cont.)

Branches
- Unconditional
  - Absolute address
  - PC-relative
    - Target address is specified relative to the PC contents
    - Relocatable code
- Example: MIPS
  - Absolute address

        j target

  - PC-relative

        b target

Flow of Control (cont.)

Branches
- Conditional
  - Jump is taken only if the condition is met
- Two types
  - Set-Then-Jump
    - Condition testing is separated from branching
    - Condition code registers are used to convey the condition test result
    - Condition code registers keep a record of the status of the last ALU operation, such as the overflow condition
  - Example: Pentium code

        cmp AX,BX     ; compare AX and BX
        je  target    ; jump if equal

Flow of Control (cont.)

  - Test-and-Jump
    - A single instruction performs condition testing and branching
  - Example: MIPS instruction

        beq Rsrc1,Rsrc2,target    ; jumps to target if Rsrc1 = Rsrc2

Delayed branching
- Control is transferred after executing the instruction that follows the branch instruction
  - This instruction slot is called the delay slot
- Improves efficiency
- Supported by highly pipelined RISC processors

Flow of Control (cont.)

Procedure calls
- Facilitate modular programming
- Require two pieces of information to return
  - End of procedure
    - Pentium uses the ret instruction
    - MIPS uses the jr instruction
  - Return address
    - In a (special) register: MIPS allows any general-purpose register
    - On the stack: Pentium

Parameter Passing

Two basic techniques:
- Register-based (e.g. PowerPC, MIPS)
  - Internal registers are used
    - Faster
    - Limit the number of parameters
    - Recursive procedure
- Stack-based (e.g. Pentium)
  - Stack is used
    - More general

2 Operand Types

- Instructions support basic data types
  - Characters
  - Integers
  - Floating-point
- Instruction overload
  - Same instruction for different data types
  - Example: Pentium

        mov AL,address     ; loads an 8-bit value
        mov AX,address     ; loads a 16-bit value
        mov EAX,address    ; loads a 32-bit value

Operand Types (cont.)

- Separate instructions
  - Instructions specify the operand size
  - Example: MIPS

        lb Rdest,address    ; loads a byte
        lh Rdest,address    ; loads a halfword (16 bits)
        lw Rdest,address    ; loads a word (32 bits)
        ld Rdest,address    ; loads a doubleword (64 bits)

  - Similar instructions exist for store

3 Addressing Modes

- How the operands are specified
- Operands can be in three places
  - Registers
    - Register addressing mode
  - Part of the instruction
    - Constant
    - Immediate addressing mode
    - All processors support these two addressing modes
  - Memory
    - Difference between RISC and CISC
    - CISC supports a large variety of addressing modes
    - RISC follows the load/store architecture

4 Instruction Types

Several types of instructions:
- Data movement
  - Pentium: mov dest,src
  - Some processors do not provide direct data movement instructions
  - Indirect data movement:

        add Rdest,Rsrc,0    ; Rdest = Rsrc + 0

- Arithmetic and logical
  - Arithmetic
    - Integer and floating-point, signed and unsigned
    - add, subtract, multiply, divide
  - Logical
    - and, or, not, xor

Instruction Types (cont.)

Condition code bits

S: Sign bit (0 = +, 1 = -)
Z: Zero bit (0 = nonzero, 1 = zero)
O: Overflow bit (0 = no overflow, 1 = overflow)
C: Carry bit (0 = no carry, 1 = carry)

Example: Pentium

cmp count,25 ; compare count to 25
             ; subtract 25 from count
je target    ; jump if equal
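A minimal sketch of how the four condition code bits could be derived for a compare, which subtracts and discards the result. The function names and the borrow convention for C (carry set when the unsigned minuend is smaller, as on x86) are assumptions for illustration, not taken from the slides.

```python
def compare_flags(a: int, b: int, bits: int = 32) -> dict:
    """Flags produced by computing a - b on `bits`-wide operands (result discarded)."""
    mask = (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    sign_a, sign_b, sign_r = a & sign_bit, b & sign_bit, result & sign_bit
    return {
        "S": int(sign_r != 0),   # sign of the result
        "Z": int(result == 0),   # result is zero, i.e. a == b
        # signed overflow: operand signs differ and the result sign differs from a
        "O": int(sign_a != sign_b and sign_r != sign_a),
        # assumed borrow convention: set when unsigned a < unsigned b
        "C": int(a < b),
    }


def jump_if_equal_taken(flags: dict) -> bool:
    """je tests only the Zero bit."""
    return flags["Z"] == 1


flags_equal = compare_flags(25, 25)
flags_less = compare_flags(24, 25)
```

With count = 25 the Zero bit is set and the jump is taken; with count = 24 the Sign and Carry bits are set instead.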

Instruction Types (cont.)

Flow control and I/O instructions
- Branch
- Procedure call
- Interrupts

I/O instructions
- Memory-mapped I/O
  - Most processors support memory-mapped I/O
  - No separate instructions for I/O
- Isolated I/O
  - Pentium supports isolated I/O
  - Separate I/O instructions

in AX,port ; read from an I/O port
out port,AX ; write to an I/O port

5 Instruction Formats

Two types

Fixed-length
- Used by RISC processors
- 32-bit RISC processors use 32-bit wide instructions
  - Examples: SPARC, MIPS, PowerPC

Variable-length
- Used by CISC processors
- Memory operands need more bits to specify

Opcode
- Major and exact operation

Examples of Instruction Formats

RISC (Reduced Instruction Set Computer)
versus
CISC (Complex Instruction Set Computer)

RISC vs. CISC

The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.

RISC systems access memory only with explicit load and store instructions.

In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.

RISC vs. CISC

The difference between CISC and RISC becomes evident through the basic computer performance equation:

time/program = (time/cycle) x (cycles/instruction) x (instructions/program)

RISC systems shorten execution time by reducing the clock cycles per instruction.

CISC systems improve performance by reducing the number of instructions per program.

RISC vs. CISC

The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.

The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.

With fixed-length instructions, RISC lends itself to pipelining and speculative execution.

RISC vs. CISC

Consider the following program fragments:

CISC:
mov ax, 10
mov bx, 5
mul bx, ax

RISC:
       mov ax, 0
       mov bx, 10
       mov cx, 5
Begin: add ax, bx
       loop Begin

The total clock cycles for the CISC version might be:
(2 movs x 1 cycle) + (1 mul x 30 cycles) = 32 cycles

While the clock cycles for the RISC version is:
(3 movs x 1 cycle) + (5 adds x 1 cycle) + (5 loops x 1 cycle) = 13 cycles

With the RISC clock cycle being shorter, RISC gives us much faster execution speeds.
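The cycle counts can be checked with a short sketch. The per-instruction costs (1 cycle for mov, add and loop, 30 cycles for mul) are the illustrative figures from the slide, not measurements, and the function names are hypothetical.

```python
def total_cycles(instruction_mix: dict) -> int:
    """Sum count x cycles over a mix given as {name: (count, cycles_each)}."""
    return sum(count * cycles for count, cycles in instruction_mix.values())


def risc_multiply(multiplicand: int, multiplier: int) -> tuple:
    """Multiply by repeated addition, as in the RISC fragment (multiplier >= 1).

    Returns (product, cycles), charging one cycle per instruction executed.
    """
    ax, bx, cx = 0, multiplicand, multiplier
    cycles = 3                      # three mov instructions
    while True:
        ax += bx                    # add ax, bx
        cycles += 1
        cx -= 1                     # loop Begin: decrement cx, branch if nonzero
        cycles += 1
        if cx == 0:
            break
    return ax, cycles


cisc_mix = {"mov": (2, 1), "mul": (1, 30)}
risc_mix = {"mov": (3, 1), "add": (5, 1), "loop": (5, 1)}
```

Both fragments compute 10 x 5 = 50. The RISC loop executes more instructions but spends fewer cycles in total.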

RISC vs. CISC

Because of their load-store ISAs, RISC architectures require a large number of CPU registers.

These registers provide fast access to data during sequential program execution.

They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.

Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.

RISC vs. CISC

This is how registers can be overlapped in a RISC system.

The current window pointer (CWP) points to the active register window.

RISC vs. CISC

It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.

Some RISC systems provide more extravagant instruction sets than some CISC systems.

Some systems combine both approaches.

The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.

RISC vs. CISC

RISC
- Multiple register sets.
- Three operands per instruction.
- Parameter passing through register windows.
- Single-cycle instructions.
- Hardwired control.
- Highly pipelined.

CISC
- Single register set.
- One or two register operands per instruction.
- Parameter passing through memory.
- Multiple cycle instructions.
- Microprogrammed control.
- Less pipelined.

Continued....

RISC vs. CISC

RISC
- Simple instructions, few in number.
- Fixed length instructions.
- Complexity in compiler.
- Only LOAD/STORE instructions access memory.
- Few addressing modes.

CISC
- Many complex instructions.
- Variable length instructions.
- Complexity in microcode.
- Many instructions can access memory.
- Many addressing modes.

Summary

Instruction Set Design Issues

Instruction set design issues include:

Where are operands stored?
- registers, memory, stack, accumulator

How many explicit operands are there?
- 0, 1, 2, or 3

How is the operand location specified?
- register, immediate, indirect, . . .

What type & size of operands are supported?
- byte, int, float, double, string, vector . . .

What operations are supported?
- add, sub, mul, move, compare . . .

More About General Purpose Registers

Why do almost all new architectures use GPRs?

Registers are much faster than memory (even cache)
- Register values are available immediately
- When memory isn't ready, processor must wait ("stall")

Registers are convenient for variable storage
- Compiler assigns some variables just to registers
- More compact code since small fields specify registers (compared to memory addresses)

[Figure: memory hierarchy: Processor, Registers, Cache, Memory, Disk]

What Operations are Needed

Arithmetic + Logical
- Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
- Logical operation: AND, OR, XOR, NOT

Data Transfer - copy, load, store

Control - branch, jump, call, return

Floating Point: ADDF, MULF, DIVF
- Same as arithmetic but usually take bigger operands

Decimal - ADDD, CONVERT

String - move, compare, search

Graphics - pixel and vertex, compression/decompression operations

Stack Architecture Pros and Cons

Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures

Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures

Accumulators Architecture Pros and Cons

Pros
- Very low hardware requirements
- Easy to design and understand

Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic

Memory-Memory Architecture Pros and Cons

Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)

Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required

Memory-Register Architecture Pros and Cons

Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density

Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers

Addressing Mode for Signal Processing

DSP offers special addressing modes to better serve popular algorithms

Modulo addressing
- Since DSP deals with continuous data streams, circular buffers are widely used
- Circular or modulo addressing allows automatic increment and decrement and resets the pointer when reaching the end of the buffer

Reverse addressing
- Resulting address is the reverse order of the current address
- Reverse addressing mode expedites the access, which otherwise requires a number of logical instructions or extra memory accesses

Fast Fourier Transform

0 (000) -> 0 (000)
1 (001) -> 4 (100)
2 (010) -> 2 (010)
3 (011) -> 6 (110)
4 (100) -> 1 (001)
5 (101) -> 5 (101)
6 (110) -> 3 (011)
7 (111) -> 7 (111)

Special features require either hand coding or a compiler that uses such features (GNU would not be a good choice)
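The two DSP addressing modes can be sketched in software. This is only an illustration of the address arithmetic that the hardware performs as part of the access; the function names and the `base`/`length` parameters are assumptions.

```python
def bit_reverse(index: int, bits: int) -> int:
    """Reverse the low `bits` bits of index (reverse addressing, as used by the FFT)."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


def modulo_next(pointer: int, step: int, base: int, length: int) -> int:
    """Advance a circular-buffer pointer, wrapping inside [base, base + length)."""
    return base + (pointer - base + step) % length


fft_order = [bit_reverse(i, 3) for i in range(8)]
```

For an 8-point FFT, `fft_order` reproduces the address mapping in the table above.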

Operations of the Computer Hardware

"There must certainly be instructions for performing the fundamental arithmetic operations."
Burks, Goldstine and von Neumann, 1947

Assembly language is a symbolic representation of what the processor actually understands

MIPS assembler allows only one instruction per line and ignores comments following # until end of line

Example: Translation of a segment of a C program to MIPS assembly instructions

C: f = (g + h) - (i + j)

MIPS:
add t0, g, h # temp variable t0 contains "g + h"
add t1, i, j # temp variable t1 contains "i + j"
sub f, t0, t1 # f = t0 - t1 = (g + h) - (i + j)

Operations in the Instruction Set

Operator type: Examples

Arithmetic and logical: Integer arithmetic and logical operations: add, and, subtract, or
Data Transfer: Loads-stores (move instructions on machines with memory addressing)
Control: Branch, jump, procedure call and return, trap
System: Operating system call, virtual memory management instructions
Floating point: Floating point instructions: add, multiply
Decimal: Decimal add, decimal multiply, decimal to character conversion
String: String move, string compare, string search
Graphics: Pixel operations, compression/decompression operations

Arithmetic, logical, data transfer and control are almost standard categories for all machines

System instructions are required for multi-programming environments, although support for system functions varies

Decimal and string instructions can be primitives, e.g. IBM 360 and the VAX

Support for floating point, decimal, string and graphics is optional and sometimes provided via a co-processor

Some machines rely on the compiler to synthesize special operations such as string handling from simpler instructions

Operations for Media & Signal Processing

Single instruction multiple data (SIMD) and vector instructions are often supported in DSPs, which are commonly used in multimedia and signal processing applications

Partitioned Add (integer)
- Perform multiple 16-bit additions on a 64-bit ALU, since most data are narrow
- Increases ALU throughput for multimedia applications

Paired single operations (float)
- Allow same register to be acting as two operands to the same operation
- Handy in dealing with vertices and coordinates

Multiply and accumulate
- Very handy for calculating dot products of vectors (signal processing) and matrix multiplication
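A partitioned add and a multiply-accumulate can be modelled as below. The lane width (four 16-bit lanes in one 64-bit word), the wraparound-on-overflow behaviour and all names are assumptions for this sketch; real media extensions also offer saturating variants.

```python
LANE_BITS = 16
LANES = 4
LANE_MASK = (1 << LANE_BITS) - 1


def pack(values: list) -> int:
    """Pack four 16-bit values into one 64-bit word, lane 0 in the low bits."""
    word = 0
    for lane, value in enumerate(values):
        word |= (value & LANE_MASK) << (lane * LANE_BITS)
    return word


def unpack(word: int) -> list:
    """Split a 64-bit word back into its four 16-bit lanes."""
    return [(word >> (lane * LANE_BITS)) & LANE_MASK for lane in range(LANES)]


def partitioned_add(x: int, y: int) -> int:
    """Add lane by lane; a carry out of one lane never reaches the next lane."""
    return pack([(a + b) & LANE_MASK for a, b in zip(unpack(x), unpack(y))])


def multiply_accumulate(xs: list, ys: list) -> int:
    """Dot product as a chain of multiply-accumulate steps: acc = acc + x * y."""
    acc = 0
    for x, y in zip(xs, ys):
        acc += x * y
    return acc


lanes_sum = unpack(partitioned_add(pack([1, 2, 3, 0xFFFF]), pack([10, 20, 30, 1])))
```

The last lane wraps to 0 without disturbing its neighbours, which is the difference from one ordinary 64-bit add.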

Frequency of Operations Usage

The most widely executed instructions are the simple operations of an instruction set

The following is the average usage in SPECint92 on Intel 80x86

Rank  80x86 instruction        Integer average (% total executed)
1     Load                     22%
2     Conditional branch       20%
3     Compare                  16%
4     Store                    12%
5     Add                      8%
6     And                      6%
7     Sub                      5%
8     Move register-register   4%
9     Call                     1%
10    Return                   1%
      Total                    96%

Make the common case fast by focusing on these operations

Control Flow Instructions

Jump for unconditional change in the control flow

Branch for conditional change in the control flow

Procedure calls and returns

Data is based on SPEC on Alpha

Destination Address Definition

Relative addressing w.r.t. the program counter proved to be the best choice for forward and backward branching or jumps (load address independent)

To allow for dynamic loading of library routines, register indirect addressing allows addresses to be loaded in special registers (e.g. virtual functions in C++ and system calls in a case statement)

Data is based on SPEC on Alpha
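A small sketch of why PC-relative targets are independent of the load address, and how a register indirect jump takes its target from a register. The names, the byte-offset convention and the example addresses are all hypothetical.

```python
def pc_relative_target(pc: int, offset: int) -> int:
    """Target of a PC-relative branch: the offset is added to the current PC."""
    return pc + offset


def register_indirect_target(registers: dict, name: str) -> int:
    """Target of a register indirect jump: whatever address the register holds."""
    return registers[name]


# The same code loaded at two different base addresses (hypothetical values).
branch_position = 0x20       # position of the branch inside the code
offset = -0x10               # backward branch, encoded in the instruction
target_a = pc_relative_target(0x1000 + branch_position, offset)
target_b = pc_relative_target(0x8000 + branch_position, offset)
```

The encoded offset never changes, yet both copies branch to the same position within their own image.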

Condition Evaluation

Compare-and-branch can be efficient if the majority of conditions are comparisons with zero

Remember to focus on the common case

Based on SPEC on MIPS

Frequency of Types of Comparison

Data is based on SPEC on Alpha

Different benchmarks and machines set new design priorities

DSPs support a repeat instruction for for-loops (vectors) using registers

Supporting Procedures

Execution of a procedure follows the following steps:
- Store parameters in a place accessible to the procedure
- Transfer control to the procedure
- Acquire the storage resources needed for the procedure
- Perform the desired task
- Store the result value in a place accessible to the calling program
- Return control to the point of origin

The hardware provides a program counter to trace instruction flow and manage transfer of control

Parameter Passing
- Registers can be used for passing a small number of parameters
- A stack is used to spill registers of the current context and make room for the called procedure to run, and to allow for large parameters to be passed
- Storage of machine state can be performed by caller or callee
- Handling of shared variables is important to ensure correct semantics and thus requires clear specifications in the library interface

Global variables stored in registers need careful handling

Type and Size of Operands

The type of an operand is designated by encoding it in the instruction's operation code

The type of an operand, e.g. single precision float, effectively gives its size

Common operand types include character, half word and word size integer, single- and double-precision floating point

Characters are almost always in ASCII, integers are in 2's complement, and floating point is in IEEE 754

The 16-bit Unicode used in Java is gaining popularity due to its support for the international character sets

For business applications some architectures support a decimal format in binary coded decimal (BCD)

Depending on the size of the word, the complexity of handling different operand types differs

DSP offers fixed point data types to support high precision floating point arithmetic and to allow sharing a single exponent for multiple numbers

For Graphics applications vertex and pixel operands are added features
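The encodings named above can be inspected with a short sketch. The helper names are hypothetical, and packed BCD (one decimal digit per 4 bits) is assumed as the decimal format.

```python
import struct


def twos_complement(value: int, bits: int) -> int:
    """Bit pattern of a signed integer in `bits`-wide 2's complement."""
    if not -(1 << (bits - 1)) <= value < (1 << (bits - 1)):
        raise ValueError("value does not fit in the given width")
    return value & ((1 << bits) - 1)


def to_packed_bcd(value: int) -> int:
    """Encode a non-negative integer with one decimal digit per 4-bit nibble."""
    if value < 0:
        raise ValueError("packed BCD sketch handles non-negative values only")
    result, shift = 0, 0
    while True:
        value, digit = divmod(value, 10)
        result |= digit << shift
        shift += 4
        if value == 0:
            break
    return result


# IEEE 754 single and double precision encodings of 1.0 (big-endian bytes).
single = struct.pack(">f", 1.0)
double = struct.pack(">d", 1.0)
```

The operand type fixes the size: the single-precision value occupies 4 bytes and the double-precision value 8.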

Size of Operands

Double-word data type is used for double-precision floating point operations and address storage in machines with a 64-bit wide address bus

Words are used for integer operations and for 32-bit address bus machines

Because of the mix in SPEC, word and double-word data types dominate

Frequency of reference by size based on SPEC2000 on Alpha

Instruction Representation

Humans are taught to think in base 10 (decimal) but numbers may be represented in any base (123 in base 10 = 1111011 in binary or base 2)

Numbers are stored in computers as a series of high and low electronic signals (binary numbers)

Binary digits are called bits and considered the atom of computing

Each piece of an instruction is a number, and placing these numbers together forms the instruction

The assembler translates the assembly symbolic instructions into machine language instructions (machine code)

Example:

Assembly: add $t0, $s1, $s2

M/C language (decimal): 0 17 18 8 0 32

M/C language (binary): 000000 10001 10010 01000 00000 100000
                       6 bits 5 bits 5 bits 5 bits 5 bits 6 bits

Note: MIPS compiler by default maps $s0..$s7 to reg. 16-23 and $t0..$t7 to reg. 8-15
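Placing the field numbers together can be sketched as shifts and ORs. The field widths (6, 5, 5, 5, 5, 6 bits) follow the layout above; the function names are hypothetical.

```python
def encode_r_type(op: int, rs: int, rt: int, rd: int, shamt: int, funct: int) -> int:
    """Concatenate the six register-format fields into one 32-bit word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


def decode_r_type(word: int) -> tuple:
    """Split a 32-bit word back into (op, rs, rt, rd, shamt, funct)."""
    return (
        (word >> 26) & 0x3F,
        (word >> 21) & 0x1F,
        (word >> 16) & 0x1F,
        (word >> 11) & 0x1F,
        (word >> 6) & 0x1F,
        word & 0x3F,
    )


# add $t0, $s1, $s2: rs = $s1 = 17, rt = $s2 = 18, rd = $t0 = 8, funct = 32
add_word = encode_r_type(0, 17, 18, 8, 0, 32)
add_bits = format(add_word, "032b")
```

`add_bits` is the binary string shown in the example with the field separators removed.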

Encoding an Instruction Set

Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation

The operation is typically specified in one field called opcode

The addressing mode for the operand can be encoded with the operation or specified through a separate identifier in case of a large number of supported modes

The architecture must balance between several competing factors:
- Desire to support as many registers and addressing modes as possible
- Effect of operand specification on the size of the instruction (program)
- Desire to simplify instruction fetching and decoding during execution

Fixed size instruction encoding simplifies the CPU design while limiting the addressing modes supported

An architect caring about the code size can use variable size encoding

A hybrid approach is to allow variability by supporting multiple-sized instructions

Encoding Examples

MIPS Instruction format

Register-format instructions

Fields: op (6 bits), rs (5 bits), rt (5 bits), rd (5 bits), shamt (5 bits), funct (6 bits)

op: Basic operation of the instruction, traditionally called opcode
rs: The first register source operand
rt: The second register source operand
rd: The register destination operand; it gets the result of the operation
shamt: Shift amount
funct: This field selects the specific variant of the operation of the op field

Immediate-type instructions

Fields: op (6 bits), rs (5 bits), rt (5 bits), address (16 bits)

Some instructions need longer fields than provided for large value constants

The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register

Example: lw $t0, 32($s3) # Temporary register $t0 gets A[8]

Instruction  Format  op  rs   rt   rd   shamt  funct  address
add          R       0   reg  reg  reg  0      32     N/A
sub          R       0   reg  reg  reg  0      34     N/A
lw           I       35  reg  reg  N/A  N/A    N/A    address
sw           I       43  reg  reg  N/A  N/A    N/A    address
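The immediate format can be sketched the same way. The register numbers use the default mapping ($s3 = 19, $t0 = 8); the function names are hypothetical.

```python
def encode_i_type(op: int, rs: int, rt: int, immediate: int) -> int:
    """Build an immediate-format word; the 16-bit field holds a signed offset."""
    if not -(1 << 15) <= immediate < (1 << 15):
        raise ValueError("offset does not fit in 16 bits")
    return (op << 26) | (rs << 21) | (rt << 16) | (immediate & 0xFFFF)


def sign_extend_16(field: int) -> int:
    """Recover the signed offset from the 16-bit address field."""
    return field - 0x10000 if field & 0x8000 else field


# lw $t0, 32($s3): op = 35, base register rs = 19, destination rt = 8
lw_word = encode_i_type(35, 19, 8, 32)
negative_offset_word = encode_i_type(35, 19, 8, -4)
```

The range check reflects the ±2^15 byte reach of the 16-bit field: offsets from -32768 to 32767 are representable.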

The Stored Program Concept

Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept

Today's computers are built on two key principles:
- Instructions are represented as numbers
- Programs can be stored in memory to be read or written just like numbers

The power of the concept: memory can contain
- the source code for an editor
- the compiled m/c code for the editor
- the text that the compiled program is using
- the compiler that generated the code

[Figure: a Processor connected to Memory holding: Accounting program (machine code), Editor program (machine code), C compiler (machine code), Payroll data, Book text, Source code in C for editor program]

Compiling if-then-else in MIPS

Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiled MIPS code for the following C if statement?

if (i == j) f = g + h; else f = g - h;

[Figure: flowchart of the if statement: test i == j; when i = j, f = g + h; when i ≠ j (Else:), f = g - h; both paths join at Exit:]

MIPS:

      bne $s3, $s4, Else # go to Else if i ≠ j
      add $s0, $s1, $s2  # f = g + h (skipped if i ≠ j)
      j Exit
Else: sub $s0, $s1, $s2  # f = g - h (skipped if i = j)
Exit:
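The control flow of the compiled fragment can be traced with a toy interpreter. It understands only the four instructions used here and is an illustrative sketch, not a MIPS simulator; all names are hypothetical.

```python
def run(program: list, registers: dict) -> dict:
    """Execute a tiny subset of MIPS (add, sub, bne, j) given as text lines."""
    labels, instructions = {}, []
    for line in program:                      # first pass: record label positions
        if line.endswith(":"):
            labels[line[:-1]] = len(instructions)
        else:
            instructions.append(line.replace(",", " ").split())
    pc = 0
    while pc < len(instructions):             # second pass: execute
        op, *args = instructions[pc]
        pc += 1
        if op == "add":
            registers[args[0]] = registers[args[1]] + registers[args[2]]
        elif op == "sub":
            registers[args[0]] = registers[args[1]] - registers[args[2]]
        elif op == "bne":
            if registers[args[0]] != registers[args[1]]:
                pc = labels[args[2]]
        elif op == "j":
            pc = labels[args[0]]
        else:
            raise ValueError(f"unsupported instruction: {op}")
    return registers


IF_THEN_ELSE = [
    "bne $s3, $s4, Else",
    "add $s0, $s1, $s2",
    "j Exit",
    "Else:",
    "sub $s0, $s1, $s2",
    "Exit:",
]


def compiled_if(g: int, h: int, i: int, j: int) -> int:
    """Run the fragment with f in $s0, g in $s1, h in $s2, i in $s3, j in $s4."""
    regs = {"$s0": 0, "$s1": g, "$s2": h, "$s3": i, "$s4": j}
    return run(IF_THEN_ELSE, regs)["$s0"]
```

Calling `compiled_if(7, 3, 1, 1)` follows the add path and `compiled_if(7, 3, 1, 2)` follows the sub path.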

Typical Compilation

Major Types of Optimization

Optimization name: explanation (frequency; N.M. = not measured)

High-level (at or near source level, machine independent)
- Procedure integration: Replace procedure call by procedure body (N.M.)

Local (within straight line code)
- Common subexpression elimination: Replace two instances of the same computation by a single copy (18%)
- Constant propagation: Replace all instances of a variable that is assigned a constant with the constant (22%)
- Stack height reduction: Rearrange expression tree to minimize resources needed for expression evaluation (N.M.)

Global (across a branch)
- Global common subexpression elimination: Same as local, but this version crosses branches (13%)
- Copy propagation: Replace all instances of a variable A that has been assigned X (i.e., A = X) with X (11%)
- Code motion: Remove code from a loop that computes the same value each iteration of the loop (16%)
- Induction variable elimination: Simplify/eliminate array addressing calculations within loops (2%)

Machine-dependent (depends on machine knowledge)
- Strength reduction: Many examples, such as replacing a multiply by a constant with adds and shifts (N.M.)
- Pipeline scheduling: Reorder instructions to improve pipeline performance (N.M.)
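Strength reduction of a multiply by a constant can be sketched as one shifted add per set bit of the constant. The function name is hypothetical, and a real compiler would choose the cheapest sequence for the target machine rather than this generic decomposition.

```python
def multiply_by_constant(x: int, constant: int) -> int:
    """Compute x * constant (constant >= 0) using only shifts and adds."""
    result = 0
    shift = 0
    while constant:
        if constant & 1:            # this power of two is part of the constant
            result += x << shift
        constant >>= 1
        shift += 1
    return result


# x * 10 becomes (x << 3) + (x << 1), since 10 = 8 + 2
times_ten = multiply_by_constant(7, 10)
```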

Effect of Compiler Optimization

Measurements taken on S

[Figure: effect of optimization, by program and compiler optimization level]

Level 0: non-optimized code
Level 1: local optimization
Level 2: global optimization, s/w pipelining
Level 3: adds procedure integration

Compiler Support for Multimedia Instr.

Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics)

Intel added a new set of instructions called Streaming SIMD Extension

A major advantage of vector computers is hiding latency of memory access by loading multiple elements and then overlapping execution with data transfer

Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations
- Strided addressing allows memory access in increments larger than one
- Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data

Supporting vector operation without strided addressing, such as Intel's MMX, limits the potential speedup

Such limited support for vector processing makes the use of vectorizing compiler optimization unpopular and restricts its scope to hand coded routines

SIMD instructions on MMX and AltiVec tend to be solutions, not primitives
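The two vector addressing styles can be sketched on a flat list standing in for memory. The function names and the 4 x 4 row-major matrix are assumptions for illustration.

```python
def strided_load(memory: list, base: int, stride: int, count: int) -> list:
    """Load `count` elements starting at `base`, stepping by `stride` each time."""
    return [memory[base + i * stride] for i in range(count)]


def gather(memory: list, addresses: list) -> list:
    """Load from an arbitrary list of addresses (an index vector)."""
    return [memory[address] for address in addresses]


def scatter(memory: list, addresses: list, values: list) -> None:
    """Store values to an arbitrary list of addresses."""
    for address, value in zip(addresses, values):
        memory[address] = value


# A 4 x 4 matrix stored row by row; column 1 is reached with a stride of 4.
matrix_memory = list(range(100, 116))
column_1 = strided_load(matrix_memory, base=1, stride=4, count=4)
picked = gather(matrix_memory, [3, 0, 7])
scatter(matrix_memory, [2, 5], [-1, -2])
```

Without strided addressing, fetching `column_1` would take four separate scalar loads.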

Starting a Program

[Figure: C program -> Compiler -> Assembly language program -> Assembler -> Object: Machine language module, together with Object: Library routine (machine language) -> Linker -> Executable: Machine language program -> Loader -> Memory]

Linker
- Place code & data modules symbolically in memory
- Determine the address of data & instruction labels
- Patch both internal & external references

Object files for Unix typically contain:
- Header: size & position of components
- Text segment: machine code
- Data segment: static and dynamic variables
- Relocation info: identify absolute memory references
- Symbol table: name & location of labels, procedures and variables
- Debugging info: mapping source to object code, break points, etc.

Loading Executable Program

[Figure: memory layout, top to bottom: Stack ($sp at 7fff ffff hex), Dynamic data, Static data (from 1000 0000 hex, $gp at 1000 8000 hex), Text (pc at 0040 0000 hex), Reserved (from 0)]

To load an executable, the operating system follows these steps:
- Reads the executable file header to determine the size of the text and data segments
- Creates an address space large enough for the text and data
- Copies the instructions and data from the executable file into memory
- Copies the parameters (if any) to the main program onto the stack
- Initializes the machine registers and sets the stack pointer to the first free location
- Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program

Instruction Set Design Issues

Instruction Set Design Issues
- Number of Addresses
- Flow of Control
- Operand Types
- Addressing Modes
- Instruction Types
- Instruction Formats

Number of Addresses

Four categories

3-address machines
- 2 for the source operands and one for the result

2-address machines
- One address doubles as source and result

1-address machines
- Accumulator machines
- Accumulator is used for one source and result

0-address machines
- Stack machines
- Operands are taken from the stack
- Result goes onto the stack

Number of Addresses (cont.)

Three-address machines
- Two for the source operands, one for the result
- RISC processors use three addresses

Sample instructions

add dest,src1,src2 ; M(dest)=[src1]+[src2]
sub dest,src1,src2 ; M(dest)=[src1]-[src2]
mult dest,src1,src2 ; M(dest)=[src1]*[src2]

Three addresses: Operand 1, Operand 2, Result

Example: a = b + c

Three-address instruction formats are not common, because they require a relatively long instruction format to hold the three address references

Number of Addresses (cont.)

Example

C statement

A = B + C * D - E + F + A

Equivalent code:

mult T,C,D ; T = C*D
add T,T,B ; T = B+C*D
sub T,T,E ; T = B+C*D-E
add T,T,F ; T = B+C*D-E+F
add A,T,A ; A = B+C*D-E+F+A

Number of Addresses (cont.)

Two-address machines
- One address doubles (for source operand & result)
- Last example makes a case for it
  - Address T is used twice

Sample instructions

load dest,src ; M(dest)=[src]
add dest,src ; M(dest)=[dest]+[src]
sub dest,src ; M(dest)=[dest]-[src]
mult dest,src ; M(dest)=[dest]*[src]

Two addresses: one address doubles as operand and result

Example: a = a + b

The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation

Number of Addresses (cont.)

Example

C statement

A = B + C * D - E + F + A

Equivalent code:

load T,C ; T = C
mult T,D ; T = C*D
add T,B ; T = B+C*D
sub T,E ; T = B+C*D-E
add T,F ; T = B+C*D-E+F
add A,T ; A = B+C*D-E+F+A

Number of Addresses (cont.)

One-address machines
- Use special set of registers called accumulators
  - Specify one source operand & receive the result
- Called accumulator machines

Sample instructions

load addr ; accum = [addr]
store addr ; M[addr] = accum
add addr ; accum = accum + [addr]
sub addr ; accum = accum - [addr]
mult addr ; accum = accum * [addr]

Number of Addresses (cont.)

Example

C statement

A = B + C * D - E + F + A

Equivalent code:

load C ; load C into accum
mult D ; accum = C*D
add B ; accum = C*D + B
sub E ; accum = B + C*D - E
add F ; accum = B + C*D - E + F
add A ; accum = B + C*D - E + F + A
store A ; store accum contents in A
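The accumulator sequence above can be checked the same way. The following Python sketch assumes a single accumulator and a dictionary as memory; `run_accumulator` and the operand values are hypothetical names and data, not part of the slides.

```python
def run_accumulator(program, memory):
    """Execute (op, addr) pairs on a hypothetical one-accumulator machine."""
    accum = 0
    for op, addr in program:
        if op == "load":
            accum = memory[addr]
        elif op == "store":
            memory[addr] = accum
        elif op == "add":
            accum += memory[addr]
        elif op == "sub":
            accum -= memory[addr]
        elif op == "mult":
            accum *= memory[addr]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory


# A = B + C*D - E + F + A; every instruction names exactly one address.
ACCUMULATOR_PROGRAM = [
    ("load", "C"),
    ("mult", "D"),
    ("add", "B"),
    ("sub", "E"),
    ("add", "F"),
    ("add", "A"),
    ("store", "A"),
]

# Sample values chosen only for this sketch.
memory1 = run_accumulator(
    ACCUMULATOR_PROGRAM, {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
)
print(memory1["A"])
```

The program grows to seven instructions, but each instruction needs only one address field and no temporary location in memory.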

Number of Addresses (cont.)

Zero-address machines
Stack supplies operands and receives the result
- Special instructions to load and store use an address
Called stack machines (Ex: HP3000, Burroughs B5500)

Sample instructions

push addr ; push([addr])
pop addr ; pop([addr])
add ; push(pop + pop)
sub ; push(pop - pop)
mult ; push(pop * pop)

Number of Addresses (cont.)

Example

C statement

A = B + C * D - E + F + A

Equivalent code:

push E
push C
push D
mult
push B
add
sub
push F
add
push A
add
pop A
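A short trace makes the operand order of the stack version easier to follow. This Python sketch assumes the convention given in the sample instructions, where `sub` computes push(pop - pop), so the first value popped is the left operand. The name `run_stack` and the operand values are hypothetical.

```python
def run_stack(program, memory):
    """Execute a zero-address program; only push and pop carry an address."""
    stack = []
    for instr in program:
        op = instr[0]
        if op == "push":
            stack.append(memory[instr[1]])
        elif op == "pop":
            memory[instr[1]] = stack.pop()
        else:
            first = stack.pop()   # top of stack
            second = stack.pop()  # element below the top
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)  # push(pop - pop)
            elif op == "mult":
                stack.append(first * second)
            else:
                raise ValueError(f"unknown instruction: {op}")
    return memory


# A = B + C*D - E + F + A; E is pushed first so that it ends up below B + C*D.
STACK_PROGRAM = [
    ("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
    ("push", "B"), ("add",), ("sub",),
    ("push", "F"), ("add",),
    ("push", "A"), ("add",),
    ("pop", "A"),
]

# Sample values chosen only for this sketch.
memory0 = run_stack(
    STACK_PROGRAM, {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
)
print(memory0["A"])
```

Pushing E before the product is what lets the single `sub` produce (B + C*D) - E rather than E - (B + C*D).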

Load/Store Architecture

Instructions expect operands in internal processor registers
Special LOAD and STORE instructions move data between registers and memory
RISC uses this architecture
Reduces instruction length

Load/Store Architecture (cont.)

Sample instructions

load $d,addr ; $d = [addr]
store addr,$s ; (addr) = $s
add $d,$s1,$s2 ; $d = $s1 + $s2
sub $d,$s1,$s2 ; $d = $s1 - $s2
mult $d,$s1,$s2 ; $d = $s1 * $s2

Number of Addresses (cont.)

Example

C statement

A = B + C * D - E + F + A

Equivalent code:

load $1,B
load $2,C
load $3,D
load $4,E
load $5,F
load $6,A
mult $2,$2,$3
add $2,$2,$1
sub $2,$2,$4
add $2,$2,$5
add $2,$2,$6
store A,$2

Flow of Control

Default is sequential flow
Several instructions alter this default execution
Branches
- Unconditional
- Conditional
- Delayed branches
Procedure calls
- Delayed procedure calls

Flow of Control (cont.)

Branches
Unconditional
- Absolute address
- PC-relative
  * Target address is specified relative to PC contents
  * Relocatable code
Example: MIPS
- Absolute address
    j target
- PC-relative
    b target
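The difference between the two target forms can be made concrete with the usual MIPS32 address arithmetic: a branch adds a sign-extended 16-bit word offset to PC + 4, while a jump replaces the low 28 bits of PC + 4 with a 26-bit word index. The Python sketch below uses hypothetical helper names and arbitrary sample addresses; it ignores delay slots.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    sign_bit = 1 << (bits - 1)
    return (value & (sign_bit - 1)) - (value & sign_bit)


def branch_target(pc, offset16):
    """PC-relative target: (PC + 4) + (sign-extended offset * 4)."""
    return (pc + 4 + (sign_extend(offset16, 16) << 2)) & 0xFFFFFFFF


def jump_target(pc, index26):
    """Pseudo-absolute target: top 4 bits of PC + 4, then index * 4."""
    return ((pc + 4) & 0xF0000000) | ((index26 & 0x03FFFFFF) << 2)


# The same branch encoding stays correct when the code is loaded elsewhere:
# moving the branch by 0x1000 moves its target by exactly 0x1000.
original = branch_target(0x00400000, 3)
relocated = branch_target(0x00401000, 3)
print(hex(original), hex(relocated))
```

The jump form, by contrast, encodes where the target is rather than how far away it is, so the encoded field has to be fixed up if the code is moved.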

Flow of Control (cont.)

Branches
Conditional
- Jump is taken only if the condition is met
Two types
- Set-Then-Jump
  * Condition testing is separated from branching
  * Condition code registers are used to convey the condition test result
  * Condition code registers keep a record of the status of the last ALU operation, such as the overflow condition
- Example: Pentium code
    cmp AX,BX ; compare AX and BX
    je target ; jump if equal

Flow of Control (cont.)

- Test-and-Jump
  * Single instruction performs condition testing and branching
- Example: MIPS instruction
    beq $src1,$src2,target
  Jumps to target if Rsrc1 = Rsrc2

Delayed branching
Control is transferred after executing the instruction that follows the branch instruction
- This instruction slot is called the delay slot
Improves efficiency
Highly pipelined RISC processors support it

Flow of Control (cont.)

Procedure calls
Facilitate modular programming
Require two pieces of information to return
- End of procedure
  * Pentium uses the ret instruction
  * MIPS uses the jr instruction
- Return address
  * In a (special) register
      MIPS allows any general-purpose register
  * On the stack
      Pentium

Flow of Control (cont.)

Delay slot

Parameter Passing

Two basic techniques
Register-based (e.g., PowerPC, MIPS)
- Internal registers are used
  * Faster
  * Limit the number of parameters
  * Recursive procedure
Stack-based (e.g., Pentium)
- Stack is used
  * More general

2 Operand Types

Instructions support basic data types
Characters
Integers
Floating-point

Instruction overload
Same instruction for different data types
Example: Pentium
    mov AL,address ; loads an 8-bit value
    mov AX,address ; loads a 16-bit value
    mov EAX,address ; loads a 32-bit value

Operand Types

Separate instructions
Instructions specify the operand size
Example: MIPS
    lb $dest,address ; loads a byte
    lh $dest,address ; loads a halfword (16 bits)
    lw $dest,address ; loads a word (32 bits)
    ld $dest,address ; loads a doubleword (64 bits)
Similar instructions for store

3 Addressing Modes

How the operands are specified
Operands can be in three places
- Registers
  * Register addressing mode
- Part of instruction
  * Constant
  * Immediate addressing mode
  * All processors support these two addressing modes
- Memory
  * Difference between RISC and CISC
  * CISC supports a large variety of addressing modes
  * RISC follows load/store architecture

4 Instruction Types

Several types of instructions
Data movement
- Pentium: mov dest,src
- Some do not provide direct data movement instructions
- Indirect data movement
    add $dest,$src,0 ; $dest = $src+0
Arithmetic and Logical
- Arithmetic
  * Integer and floating-point, signed and unsigned
  * add, subtract, multiply, divide
- Logical
  * and, or, not, xor

Instruction Types (cont.)

Condition code bits
S: Sign bit (0 = +, 1 = -)
Z: Zero bit (0 = nonzero, 1 = zero)
O: Overflow bit (0 = no overflow, 1 = overflow)
C: Carry bit (0 = no carry, 1 = carry)
Example: Pentium
    cmp count,25 ; compare count to 25
                 ; subtract 25 from count
    je target ; jump if equal
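The way a compare sets the four condition code bits can be sketched in a few lines. This Python model subtracts the second operand from the first in a fixed width and derives S, Z, O, and C from the result. It assumes a 32-bit word and the x86 convention that the carry bit records a borrow on subtraction; `compare_flags` is a hypothetical helper, not an instruction of any processor.

```python
def compare_flags(a, b, bits=32):
    """Return the S, Z, O, C bits produced by computing a - b in `bits` bits."""
    mask = (1 << bits) - 1
    a &= mask
    b &= mask
    result = (a - b) & mask
    sign_a = a >> (bits - 1)
    sign_b = b >> (bits - 1)
    sign_r = result >> (bits - 1)
    return {
        "S": sign_r,                      # 1 when the result is negative
        "Z": int(result == 0),            # 1 when the operands are equal
        # Signed overflow: operands differ in sign and the result's sign
        # differs from the first operand's sign.
        "O": int(sign_a != sign_b and sign_r != sign_a),
        "C": int(a < b),                  # assumed borrow convention
    }


def jump_if_equal(flags):
    """je only inspects the zero bit left behind by the compare."""
    return flags["Z"] == 1


print(compare_flags(25, 25), jump_if_equal(compare_flags(25, 25)))
print(compare_flags(10, 25), jump_if_equal(compare_flags(10, 25)))
```

This is the set-then-jump split: the compare only records status bits, and the later jump decides by reading them.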

Instruction Types (cont.)

Flow control and I/O instructions
- Branch
- Procedure call
- Interrupts
I/O instructions
- Memory-mapped I/O
  * Most processors support memory-mapped I/O
  * No separate instructions for I/O
- Isolated I/O
  * Pentium supports isolated I/O
  * Separate I/O instructions
      in AX,io_port ; read from an I/O port
      out io_port,AX ; write to an I/O port

5 Instruction Formats

Two types
Fixed-length
- Used by RISC processors
- 32-bit RISC processors use 32-bit-wide instructions
  * Examples: SPARC, MIPS, PowerPC
Variable-length
- Used by CISC processors
- Memory operands need more bits to specify
Opcode
Major and exact operation

Examples of Instruction Formats

RISC (Reduced Instruction Set Computer)
versus
CISC (Complex Instruction Set Computer)

RISC vs CISC

The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.

RISC systems access memory only with explicit load and store instructions.

In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.

RISC vs CISC

The difference between CISC and RISC becomes evident through the basic computer performance equation:

time/program = time/cycle x cycles/instruction x instructions/program

RISC systems shorten execution time by reducing the clock cycles per instruction.

CISC systems improve performance by reducing the number of instructions per program.

RISC vs CISC

The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.

The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.

With fixed-length instructions, RISC lends itself to pipelining and speculative execution.

RISC vs CISC

Consider the following program fragments:

CISC:
    mov ax, 10
    mov bx, 5
    mul bx, ax

RISC:
    mov ax, 0
    mov bx, 10
    mov cx, 5
Begin: add ax, bx
    loop Begin

The total clock cycles for the CISC version might be:
(2 movs x 1 cycle) + (1 mul x 30 cycles) = 32 cycles

While the clock cycles for the RISC version are:
(3 movs x 1 cycle) + (5 adds x 1 cycle) + (5 loops x 1 cycle) = 13 cycles

With the RISC clock cycle being shorter, RISC gives us much faster execution speeds.
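The cycle comparison is just a weighted sum of instruction counts. The Python sketch below redoes that arithmetic under stated assumptions: mov, add, and loop each take 1 cycle, mul takes 30 cycles, and the RISC loop runs 5 times. These figures are illustrative inputs for the sketch, not measurements of any real processor.

```python
# Assumed cycle cost per instruction kind (illustrative only).
CYCLES = {"mov": 1, "add": 1, "loop": 1, "mul": 30}


def total_cycles(instruction_counts, cycles_per_instruction):
    """Sum of (executed count x cycles) over every instruction kind."""
    return sum(
        count * cycles_per_instruction[op]
        for op, count in instruction_counts.items()
    )


# CISC fragment: two moves and one multi-cycle multiply.
cisc_cycles = total_cycles({"mov": 2, "mul": 1}, CYCLES)

# RISC fragment: three moves, then add and loop executed five times each.
risc_cycles = total_cycles({"mov": 3, "add": 5, "loop": 5}, CYCLES)

print(cisc_cycles, risc_cycles)
```

The RISC fragment executes more instructions (13 versus 3) yet needs fewer cycles, which is the trade-off the performance equation describes.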

RISC vs CISC

Because of their load-store ISAs, RISC architectures require a large number of CPU registers.

These registers provide fast access to data during sequential program execution.

They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.

Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.

RISC vs CISC

This is how registers can be overlapped in a RISC system.

The current window pointer (CWP) points to the active register window.

RISC vs CISC

It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.

Some RISC systems provide more extravagant instruction sets than some CISC systems.

Some systems combine both approaches.

The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.

RISC vs CISC

RISC
- Multiple register sets.
- Three operands per instruction.
- Parameter passing through register windows.
- Single-cycle instructions.
- Hardwired control.
- Highly pipelined.

CISC
- Single register set.
- One or two register operands per instruction.
- Parameter passing through memory.
- Multiple-cycle instructions.
- Microprogrammed control.
- Less pipelined.

Continued...

RISC vs CISC

RISC
- Simple instructions, few in number.
- Fixed-length instructions.
- Complexity in compiler.
- Only LOAD/STORE instructions access memory.
- Few addressing modes.

CISC
- Many complex instructions.
- Variable-length instructions.
- Complexity in microcode.
- Many instructions can access memory.
- Many addressing modes.

RISC vs CISC

Summary

Instruction Set Design Issues

Instruction set design issues include:
Where are operands stored?
- registers, memory, stack, accumulator
How many explicit operands are there?
- 0, 1, 2, or 3
How is the operand location specified?
- register, immediate, indirect, . . .
What type & size of operands are supported?
- byte, int, float, double, string, vector, . . .
What operations are supported?
- add, sub, mul, move, compare, . . .

More About General Purpose Registers

Why do almost all new architectures use GPRs?
Registers are much faster than memory (even cache)
- Register values are available immediately
- When memory isn't ready, processor must wait ("stall")
Registers are convenient for variable storage
- Compiler assigns some variables just to registers
- More compact code since small fields specify registers (compared to memory addresses)

Figure: Processor (Registers), Cache, Memory, Disk

What Operations are Needed

Arithmetic & Logical
- Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
- Logical operation: AND, OR, XOR, NOT
Data Transfer - copy, load, store
Control - branch, jump, call, return
Floating Point - ADDF, MULF, DIVF
- Same as arithmetic but usually take bigger operands
Decimal - ADDD, CONVERT
String - move, compare, search
Graphics - pixel and vertex, compression/decompression operations

Stack Architecture Pros and Cons

Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures
Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures

Accumulator Architecture Pros and Cons

Pros
- Very low hardware requirements
- Easy to design and understand
Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic

Memory-Memory Architecture Pros and Cons

Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)
Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required

Memory-Register Architecture Pros and Cons

Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density
Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers

Page 29: Chapter 2: Advanced computer Architecture

Operations of the Computer Hardware

"There must certainly be instructions for performing the fundamental arithmetic operations."
Burks, Goldstine and von Neumann, 1947

Assembly language is a symbolic representation of what the processor actually understands
MIPS assembler allows only one instruction per line and ignores comments following # until the end of the line

Example:
Translation of a segment of a C program to MIPS assembly instructions
C: f = (g + h) - (i + j)
MIPS:
    add t0, g, h # temp variable t0 contains "g + h"
    add t1, i, j # temp variable t1 contains "i + j"
    sub f, t0, t1 # f = t0 - t1 = (g + h) - (i + j)

Operations in the Instruction Set

Operator type            Examples
Arithmetic and logical   Integer arithmetic and logical operations: add, and, subtract, or
Data Transfer            Loads-stores (move instructions on machines with memory addressing)
Control                  Branch, jump, procedure call and return, trap
System                   Operating system call, virtual memory management instructions
Floating point           Floating point instructions: add, multiply
Decimal                  Decimal add, decimal multiply, decimal to character conversion
String                   String move, string compare, string search
Graphics                 Pixel operations, compression/decompression operations

Arithmetic, logical, data transfer, and control are almost standard categories for all machines
System instructions are required for multi-programming environments, although support for system functions varies
Decimal and string instructions can be primitives, e.g., IBM 360 and the VAX
Support for floating point, decimal, string, and graphics can be optionally (sometimes) provided via a co-processor
Some machines rely on the compiler to synthesize special operations such as string handling from simpler instructions

Operations for Media & Signal Process.

Single instruction multiple data (SIMD) and vector instructions are often supported in DSPs, which are commonly used in multimedia and signal processing applications

Partitioned Add (integer)
- Perform multiple 16-bit additions on a 64-bit ALU since most data are narrow
- Increases ALU throughput for multimedia applications
Paired single operations (float)
- Allow the same register to act as two operands to the same operation
- Handy in dealing with vertices and coordinates
Multiply and accumulate
- Very handy for calculating dot products of vectors (signal processing) and matrix multiplication
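Two of these operations are easy to model in software. The Python sketch below treats a 64-bit word as four independent 16-bit lanes for a partitioned add, and shows multiply-accumulate as the inner step of a dot product. The lane width, lane count, wraparound behaviour, and all function names are assumptions made for illustration; real media instruction sets differ in detail, and many also offer saturating variants.

```python
LANE_BITS = 16                      # assumed lane width
LANES = 4                           # four lanes fill a 64-bit word
LANE_MASK = (1 << LANE_BITS) - 1


def pack(values):
    """Pack LANES small integers into one word, lane 0 in the low bits."""
    word = 0
    for i, value in enumerate(values):
        word |= (value & LANE_MASK) << (i * LANE_BITS)
    return word


def unpack(word):
    """Split a word back into its lanes."""
    return [(word >> (i * LANE_BITS)) & LANE_MASK for i in range(LANES)]


def partitioned_add(a, b):
    """Add lane by lane; a carry out of one lane never reaches the next."""
    result = 0
    for i in range(LANES):
        shift = i * LANE_BITS
        lane_sum = ((a >> shift) & LANE_MASK) + ((b >> shift) & LANE_MASK)
        result |= (lane_sum & LANE_MASK) << shift  # wraparound, not saturation
    return result


def multiply_accumulate(xs, ys):
    """Dot product built from repeated acc = acc + x * y steps."""
    acc = 0
    for x, y in zip(xs, ys):
        acc += x * y
    return acc


lanes = unpack(partitioned_add(pack([1, 2, 3, 0xFFFF]), pack([10, 20, 30, 1])))
dot = multiply_accumulate([1, 2, 3], [4, 5, 6])
print(lanes, dot)
```

The last lane wraps to 0 without disturbing its neighbour, which is the property that distinguishes a partitioned add from an ordinary 64-bit add.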

Frequency of Operations Usage

Make the common case fast by focusing on these operations
The most widely executed instructions are the simple operations of an instruction set
The following is the average usage in SPECint92 on Intel 80x86

Rank   80x86 instruction        Integer average (% total executed)
1      Load                     22%
2      Conditional branch       20%
3      Compare                  16%
4      Store                    12%
5      Add                      8%
6      And                      6%
7      Sub                      5%
8      Move register-register   4%
9      Call                     1%
10     Return                   1%
       Total                    96%

Control Flow Instructions

Jump for unconditional change in the control flow
Branch for conditional change in the control flow
Procedure calls and returns

Data is based on SPEC on Alpha

Destination Address Definition

Relative addressing w.r.t. the program counter proved to be the best choice for forward and backward branching or jumps (load address independent)
To allow for dynamic loading of library routines, register indirect addressing allows addresses to be loaded in special registers (e.g., virtual functions in C++ and system calls in a case statement)

Data is based on SPEC on Alpha

Condition Evaluation

Compare-and-branch can be efficient if the majority of conditions are comparisons with zero

Remember to focus on the common case

Based on SPEC on MIPS

Frequency of Types of Comparison

Data is based on SPEC on Alpha

Different benchmark and machine set new design priority

DSPs support a repeat instruction for for-loops (vectors) using registers

Supporting Procedures

Execution of a procedure follows the following steps:
- Store parameters in a place accessible to the procedure
- Transfer control to the procedure
- Acquire the storage resources needed for the procedure
- Perform the desired task
- Store the result value in a place accessible to the calling program
- Return control to the point of origin

The hardware provides a program counter to trace instruction flow and manage transfer of control

Parameter Passing
- Registers can be used for passing a small number of parameters
- A stack is used to spill registers of the current context, to make room for the called procedure to run, and to allow large parameters to be passed
- Storage of machine state can be performed by caller or callee
- Handling of shared variables is important to ensure correct semantics and thus requires clear specifications in the library interface

Global variables stored in registers need careful handling

Type and Size of Operands

The type of an operand is designated by encoding it in the instruction's operation code
The type of an operand, e.g., single precision float, effectively gives its size
Common operand types include character, half word and word size integer, single- and double-precision floating point
Characters are almost always in ASCII, integers are in 2's complement, and floating point is in IEEE 754
The 16-bit Unicode used in Java is gaining popularity due to its support for international character sets
For business applications, some architectures support a decimal format in binary coded decimal (BCD)
Depending on the size of the word, the complexity of handling different operand types differs
DSPs offer fixed point data types to support high precision floating point arithmetic and to allow sharing a single exponent for multiple numbers
For graphics applications, vertex and pixel operands are added features
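Since the operand size fixes how an integer bit pattern is interpreted, a short example of 2's complement encoding at different widths may help. The Python sketch below converts between signed values and their fixed-width bit patterns; both function names are hypothetical helpers introduced for this illustration.

```python
def to_twos_complement(value, bits):
    """Return the unsigned bit pattern that represents value in `bits` bits."""
    low = -(1 << (bits - 1))
    high = (1 << (bits - 1)) - 1
    if not low <= value <= high:
        raise ValueError(f"{value} does not fit in {bits} bits")
    return value & ((1 << bits) - 1)


def from_twos_complement(pattern, bits):
    """Interpret an unsigned bit pattern as a signed value of `bits` bits."""
    if pattern & (1 << (bits - 1)):   # sign bit set: value is negative
        return pattern - (1 << bits)
    return pattern


# The same value needs a different pattern at each operand size.
byte_pattern = to_twos_complement(-1, 8)
half_pattern = to_twos_complement(-1, 16)
print(hex(byte_pattern), hex(half_pattern), from_twos_complement(0x80, 8))
```

The pattern 0x80 is -128 as a byte but +128 as a halfword, which is why load instructions must state the operand size.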

Size of Operands

Frequency of reference by size based on SPEC2000 on Alpha

Double-word data type is used for double-precision floating point operations and address storage in machines with a 64-bit wide address bus
Words are used for integer operations and for 32-bit address bus machines
Because of the mix in SPEC, word and double-word data types dominate

Instruction Representation

Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary or base 2)
Numbers are stored in computers as a series of high and low electronic signals (binary numbers)
Binary digits are called bits and considered the atom of computing
Each piece of an instruction is a number, and placing these numbers together forms the instruction
Assembler translates the assembly symbolic instructions into machine language instructions (machine code)

Example:
Assembly: add $t0, $s1, $s2
M/C language (decimal): 0 17 18 8 0 32
M/C language (binary): 000000 10001 10010 01000 00000 100000
                       6 bits 5 bits 5 bits 5 bits 5 bits 6 bits

Note: MIPS compiler by default maps $s0,...,$s7 to reg. 16-23 and $t0,...,$t7 to reg. 8-15
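Placing the field numbers together to form an instruction is a matter of shifting and OR-ing. The Python sketch below packs the six R-format fields of the MIPS add example (op 0, rs 17, rt 18, rd 8, shamt 0, funct 32) into one 32-bit word; `encode_r_type` is a hypothetical helper name, and the field widths follow the 6/5/5/5/5/6 layout.

```python
def encode_r_type(op, rs, rt, rd, shamt, funct):
    """Concatenate the six R-format fields into a 32-bit machine word."""
    fields = ((op, 6), (rs, 5), (rt, 5), (rd, 5), (shamt, 5), (funct, 6))
    word = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"field value {value} does not fit in {width} bits")
        word = (word << width) | value   # append this field on the right
    return word


# add $t0, $s1, $s2  ->  rd = $t0 (8), rs = $s1 (17), rt = $s2 (18)
add_word = encode_r_type(op=0, rs=17, rt=18, rd=8, shamt=0, funct=32)
print(format(add_word, "032b"), hex(add_word))
```

Note that the destination register appears first in the assembly text but third among the register fields of the encoded word.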

Encoding an Instruction Set

Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation
The operation is typically specified in one field, called the opcode
The addressing mode for the operand can be encoded with the operation or specified through a separate identifier in case of a large number of supported modes
The architecture must balance between several competing factors:
- Desire to support as many registers and addressing modes as possible
- Effect of operand specification on the size of the instruction (program)
- Desire to simplify instruction fetching and decoding during execution
Fixed size instruction encoding simplifies the CPU design while limiting the addressing modes supported
An architect caring about the code size can use variable size encoding
A hybrid approach is to allow variability by supporting multiple-sized instructions

Encoding Examples

MIPS Instruction format

Register-format instructions

op       rs       rt       rd       shamt    funct
6 bits   5 bits   5 bits   5 bits   5 bits   6 bits

op: Basic operation of the instruction, traditionally called the opcode
rs: The first register source operand
rt: The second register source operand
rd: The register destination operand; it gets the result of the operation
shamt: Shift amount
funct: This field selects the specific variant of the operation of the op field

Immediate-type instructions

op       rs       rt       address
6 bits   5 bits   5 bits   16 bits

Some instructions need longer fields than provided for large value constants
The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register
Example: lw $t0, 32($s3) # Temporary register $t0 gets A[8]

Instruction   Format   op   rs    rt    rd    shamt   funct   address
add           R        0    reg   reg   reg   0       32      N/A
sub           R        0    reg   reg   reg   0       34      N/A
lw            I        35   reg   reg   N/A   N/A     N/A     address
sw            I        43   reg   reg   N/A   N/A     N/A     address
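The immediate format can be sketched the same way, and doing so shows where the ±2^15 byte limit comes from: the offset has to fit in a signed 16-bit field. This Python sketch encodes a load word with op 35, base register 19 ($s3), target register 8 ($t0), and offset 32; `encode_i_type` is a hypothetical helper name.

```python
def encode_i_type(op, rs, rt, offset):
    """Pack op, base register rs, target register rt and a signed offset."""
    if not -(1 << 15) <= offset < (1 << 15):
        raise ValueError(f"offset {offset} does not fit in a signed 16-bit field")
    for name, value, width in (("op", op, 6), ("rs", rs, 5), ("rt", rt, 5)):
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
    # The offset is stored in 2's complement in the low 16 bits.
    return (op << 26) | (rs << 21) | (rt << 16) | (offset & 0xFFFF)


LW_OPCODE = 35

# lw $t0, 32($s3): base register $s3 is 19, target register $t0 is 8.
lw_word = encode_i_type(LW_OPCODE, rs=19, rt=8, offset=32)
print(hex(lw_word))
```

An offset of 32768 is rejected by the range check, while -32768 is accepted, matching the asymmetric range of a 16-bit 2's complement field.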

The Stored Program Concept

Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept
Today's computers are built on two key principles:
- Instructions are represented as numbers
- Programs can be stored in memory to be read or written just like numbers

The power of the concept:
memory can contain:
- the source code for an editor
- the compiled m/c code for the editor
- the text that the compiled program is using
- the compiler that generated the code

Figure: Processor connected to Memory, which holds: Accounting program (machine code), Editor program (machine code), C compiler (machine code), Payroll data, Book text, Source code in C for editor program

Compiling if-then-else in MIPS

Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiled MIPS code for the following C if statement?

if (i == j) f = g + h; else f = g - h;

Figure: flowchart testing i == j; the i = j path computes f = g + h, the i ≠ j path (Else:) computes f = g - h, and both paths join at Exit:

MIPS:
    bne $s3, $s4, Else # go to Else if i ≠ j
    add $s0, $s1, $s2 # f = g + h (skipped if i ≠ j)
    j Exit
Else: sub $s0, $s1, $s2 # f = g - h (skipped if i = j)
Exit:
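To see which instructions are skipped on each path, the four-instruction sequence can be stepped through by a tiny interpreter. The Python sketch below handles only add, sub, bne, and j, resolves labels to instruction indexes, and ignores branch delay slots. It is an illustration under those assumptions, not a MIPS simulator, and `run_mips_fragment` is a hypothetical name.

```python
def run_mips_fragment(lines, regs):
    """Execute a small add/sub/bne/j program and return the executed opcodes."""
    labels, instrs = {}, []
    for line in lines:
        if line.endswith(":"):
            labels[line[:-1]] = len(instrs)   # label names the next instruction
        else:
            instrs.append(line.replace(",", " ").split())
    executed, pc = [], 0
    while pc < len(instrs):
        op, *args = instrs[pc]
        executed.append(op)
        pc += 1
        if op == "add":
            regs[args[0]] = regs[args[1]] + regs[args[2]]
        elif op == "sub":
            regs[args[0]] = regs[args[1]] - regs[args[2]]
        elif op == "bne":
            if regs[args[0]] != regs[args[1]]:
                pc = labels[args[2]]
        elif op == "j":
            pc = labels[args[0]]
        else:
            raise ValueError(f"unsupported instruction: {op}")
    return executed


IF_THEN_ELSE = [
    "bne $s3, $s4, Else",
    "add $s0, $s1, $s2",
    "j Exit",
    "Else:",
    "sub $s0, $s1, $s2",
    "Exit:",
]

# f=$s0, g=$s1, h=$s2, i=$s3, j=$s4; sample values chosen for this sketch.
equal_regs = {"$s0": 0, "$s1": 5, "$s2": 3, "$s3": 7, "$s4": 7}
equal_trace = run_mips_fragment(IF_THEN_ELSE, equal_regs)
differ_regs = {"$s0": 0, "$s1": 5, "$s2": 3, "$s3": 7, "$s4": 9}
differ_trace = run_mips_fragment(IF_THEN_ELSE, differ_regs)
print(equal_trace, equal_regs["$s0"], differ_trace, differ_regs["$s0"])
```

The branch tests the opposite condition (i ≠ j) so that the then-part can follow it directly, and the unconditional `j Exit` keeps the then-path from falling into the else-part.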

ypical Compilation

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 46101

ypical Compilation

Major Types of Optimization

Optimization name: explanation (frequency, where it survives in the slide)

High-level (at or near source level; machine independent)
- Procedure integration: replace procedure call by procedure body (N.M.)

Local (within straight-line code)
- Common subexpression elimination: replace two instances of the same computation by a single copy
- Constant propagation: replace all instances of a variable that is assigned a constant with the constant
- Stack height reduction: rearrange expression tree to minimize resources needed for expression evaluation (N.M.)

Global (across a branch)
- Global common subexpression elimination: same as local, but this version crosses branches
- Copy propagation: replace all instances of a variable A that has been assigned X (i.e., A = X) with X
- Code motion: remove code from a loop that computes the same value each iteration of the loop
- Induction variable elimination: simplify/eliminate array addressing calculations within loops

Machine-dependent (depends on machine knowledge)
- Strength reduction: many examples, such as replacing a multiply by a constant with adds and shifts (N.M.)
- Pipeline scheduling: reorder instructions to improve pipeline performance (N.M.)
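As a rough illustration of two of the transformations listed above (code motion and strength reduction), the sketch below applies them by hand to a small loop. The function names and the loop itself are made up for this example; a real compiler performs these rewrites on its intermediate representation, not on source text.

```python
def naive(values, scale):
    """Unoptimized loop: recomputes a loop-invariant product every
    iteration and multiplies by a constant."""
    out = []
    for v in values:
        k = scale * 4            # loop-invariant: same value each iteration
        out.append(v * 8 + k)    # multiply by a constant
    return out


def hand_optimized(values, scale):
    """Same computation after code motion and strength reduction."""
    k = scale * 4                # code motion: hoisted out of the loop
    out = []
    for v in values:
        out.append((v << 3) + k) # strength reduction: v * 8 becomes a shift
    return out
```

Both functions return the same list for integer inputs; only the amount of work done inside the loop differs.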

Effect of Compiler Optimization

easurements taken on S

[Figure: program and compiler optimization level]

Level 0: non-optimized code
Level 1: local optimization
Level 2: global optimization, s/w pipelining
Level 3: adds procedure integration

Compiler Support for Multimedia Instructions

Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics).

Intel added a new set of instructions called Streaming SIMD Extensions.

A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer.

Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations.
- Strided addressing allows memory access in increments larger than one.
- Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data.

Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup.

Such limited support for vector processing makes the use of vectorizing compiler optimization unpopular and restricts its scope to hand-coded routines.

SIMD instructions on MMX and AltiVec tend to be solutions, not primitives.
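The difference between strided and gather/scatter addressing can be sketched as follows. This is only a software model under simple assumptions: memory is a Python list, and the helper names strided_load, gather_load and scatter_store are invented for illustration rather than taken from any real instruction set.

```python
def strided_load(memory, base, stride, count):
    """Load `count` elements starting at `base`, stepping by `stride`."""
    return [memory[base + i * stride] for i in range(count)]


def gather_load(memory, indices):
    """Load the elements whose addresses are listed in `indices`."""
    return [memory[i] for i in indices]


def scatter_store(memory, indices, values):
    """Store each value at the address given by the matching index."""
    for i, v in zip(indices, values):
        memory[i] = v


# A 4x4 matrix stored row by row: element (r, c) lives at address 4*r + c.
matrix = list(range(100, 116))
column_1 = strided_load(matrix, base=1, stride=4, count=4)
diagonal = gather_load(matrix, [0, 5, 10, 15])
```

With a stride of 4 the column of a row-major matrix is fetched in one operation; the diagonal has no fixed stride from the base of each row, so it is fetched through an explicit list of addresses.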

Starting a Program

[Figure: C program -> Compiler -> Assembly language program -> Assembler -> Object: machine language module (together with Object: library routine, machine language) -> Linker -> Executable: machine language program -> Loader -> Memory]

Linker:
- Place code and data modules symbolically in memory
- Determine the address of data and instruction labels
- Patch both internal and external references

Object files for Unix typically contain:
- Header: size and position of components
- Text segment: machine code
- Data segment: static and dynamic variables
- Relocation info: identify absolute memory references
- Symbol table: name and location of labels, procedures and variables
- Debugging info: mapping source to object code, break points, etc.
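The three linker steps (place modules, determine label addresses, patch references) can be sketched with a toy model. Everything here is an assumption made for illustration: the module layout (a dict with "name", "code", "symbols" and "relocs" keys), word-sized addresses, and the instruction strings are hypothetical and do not follow any real object file format.

```python
def link(modules):
    """Toy linker: returns (memory image, symbol table)."""
    # Step 1: place the modules one after another in memory.
    base = {}
    address = 0
    for m in modules:
        base[m["name"]] = address
        address += len(m["code"])

    # Step 2: determine the absolute address of every label.
    symtab = {}
    for m in modules:
        for symbol, offset in m["symbols"].items():
            symtab[symbol] = base[m["name"]] + offset

    # Step 3: patch each reference recorded in the relocation info.
    image = []
    for m in modules:
        code = list(m["code"])
        for offset, symbol in m["relocs"]:
            code[offset] = symtab[symbol]
        image.extend(code)
    return image, symtab


main_obj = {"name": "main", "code": ["call", None, "halt"],
            "symbols": {"main": 0}, "relocs": [(1, "square")]}
lib_obj = {"name": "lib", "code": ["mul", "ret"],
           "symbols": {"square": 0}, "relocs": []}
image, symtab = link([main_obj, lib_obj])
```

The unresolved operand of `call` in the first module is filled in with the final address of `square`, which is only known once both modules have been placed.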

Loading Executable Program

[Figure: memory layout from low to high addresses: reserved, text (pc), static data ($gp), dynamic data, stack ($sp)]

To load an executable, the operating system follows these steps:
- Reads the executable file header to determine the size of the text and data segments
- Creates an address space large enough for the text and data
- Copies the instructions and data from the executable file into memory
- Copies the parameters (if any) to the main program onto the stack
- Initializes the machine registers and sets the stack pointer to the first free location
- Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program

Instruction Set Design Issues

Instruction set design issues:
- Number of addresses
- Flow of control
- Operand types
- Addressing modes
- Instruction types
- Instruction formats

Number of Addresses

Four categories:
- 3-address machines
  - 2 for the source operands and one for the result
- 2-address machines
  - One address doubles as source and result
- 1-address machines
  - Accumulator machines
  - Accumulator is used for one source and result
- 0-address machines
  - Stack machines
  - Operands are taken from the stack
  - Result goes onto the stack

Number of Addresses (cont.)

Three-address machines
- Two for the source operands, one for the result
- RISC processors use three addresses

Sample instructions:
  add  dest,src1,src2    M(dest) = [src1] + [src2]
  sub  dest,src1,src2    M(dest) = [src1] - [src2]
  mult dest,src1,src2    M(dest) = [src1] * [src2]

Three addresses: Operand 1, Operand 2, Result
Example: a = b + c

Three-address instruction formats are not common because they require a relatively long instruction format to hold the three address references.

Number of Addresses (cont.)

Example
C statement:
  A = B + C * D - E + F + A

Equivalent code:
  mult T,C,D    T = C*D
  add  T,T,B    T = B + C*D
  sub  T,T,E    T = B + C*D - E
  add  T,T,F    T = B + C*D - E + F
  add  A,T,A    A = B + C*D - E + F + A

Number of Addresses (cont.)

Two-address machines
- One address doubles (for source operand and result)
- Last example makes a case for it
  - Address T is used twice

Sample instructions:
  load dest,src    M(dest) = [src]
  add  dest,src    M(dest) = [dest] + [src]
  sub  dest,src    M(dest) = [dest] - [src]
  mult dest,src    M(dest) = [dest] * [src]

Two addresses: one address doubles as operand and result
Example: a = a + b

The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation.

Number of Addresses (cont.)

Example
C statement:
  A = B + C * D - E + F + A

Equivalent code:
  load T,C    T = C
  mult T,D    T = C*D
  add  T,B    T = B + C*D
  sub  T,E    T = B + C*D - E
  add  T,F    T = B + C*D - E + F
  add  A,T    A = B + C*D - E + F + A

Number of Addresses (cont.)

One-address machines
- Use a special set of registers called accumulators
  - Specify one source operand and receive the result
- Called accumulator machines

Sample instructions:
  load  addr    accum = [addr]
  store addr    M[addr] = accum
  add   addr    accum = accum + [addr]
  sub   addr    accum = accum - [addr]
  mult  addr    accum = accum * [addr]

Number of Addresses (cont.)

Example
C statement:
  A = B + C * D - E + F + A

Equivalent code:
  load  C    load C into accum
  mult  D    accum = C*D
  add   B    accum = C*D + B
  sub   E    accum = B + C*D - E
  add   F    accum = B + C*D - E + F
  add   A    accum = B + C*D - E + F + A
  store A    store accum contents in A
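To check the one-address sequence above, it can be run on a tiny accumulator-machine model. This is a minimal sketch, assuming memory is a dict keyed by variable name and the program is a list of (opcode, address) pairs; the function name run_accumulator is hypothetical.

```python
def run_accumulator(program, memory):
    """Execute one-address instructions against a single accumulator."""
    accum = 0
    for op, addr in program:
        if op == "load":
            accum = memory[addr]
        elif op == "store":
            memory[addr] = accum
        elif op == "add":
            accum = accum + memory[addr]
        elif op == "sub":
            accum = accum - memory[addr]
        elif op == "mult":
            accum = accum * memory[addr]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return memory


# A = B + C * D - E + F + A
PROGRAM = [("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
           ("add", "F"), ("add", "A"), ("store", "A")]
result = run_accumulator(PROGRAM, {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6})
```

Every arithmetic instruction touches memory once, which is why accumulator machines have short instructions but high memory traffic.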

Number of Addresses (cont.)

Zero-address machines
- Stack supplies operands and receives the result
  - Special instructions to load and store use an address
- Called stack machines (Ex: HP 3000, Burroughs B5500)

Sample instructions:
  push addr    push([addr])
  pop  addr    pop([addr])
  add          push(pop + pop)
  sub          push(pop - pop)
  mult         push(pop * pop)

Number of Addresses (cont.)

Example
C statement:
  A = B + C * D - E + F + A

Equivalent code:
  push E
  push C
  push D
  mult
  push B
  add
  sub
  push F
  add
  push A
  add
  pop  A
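The zero-address sequence can be traced with a small stack-machine model. This is a sketch under stated assumptions: memory is a dict keyed by variable name, and for `sub` the first value popped is the left operand, following the slide's push(pop - pop) notation. The function name run_stack is invented for this example.

```python
def run_stack(program, memory):
    """Execute zero-address instructions; push and pop carry an address."""
    stack = []
    for instr in program:
        op = instr[0]
        if op == "push":
            stack.append(memory[instr[1]])
        elif op == "pop":
            memory[instr[1]] = stack.pop()
        elif op in ("add", "sub", "mult"):
            first = stack.pop()    # top of stack
            second = stack.pop()   # next element down
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)
            else:
                stack.append(first * second)
        else:
            raise ValueError(f"unknown opcode: {op}")
    return memory


# A = B + C * D - E + F + A
PROGRAM = [("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
           ("push", "B"), ("add",), ("sub",), ("push", "F"), ("add",),
           ("push", "A"), ("add",), ("pop", "A")]
result = run_stack(PROGRAM, {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6})
```

E is pushed first so that it sits below B + C*D when `sub` executes; the operand order on the stack, not an address field, decides what is subtracted from what.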

Load/Store Architecture

- Instructions expect operands in internal processor registers
- Special LOAD and STORE instructions move data between registers and memory
- RISC uses this architecture
- Reduces instruction length

Load/Store Architecture (cont.)

Sample instructions:
  load  Rd,addr       Rd = [addr]
  store addr,Rs       (addr) = Rs
  add   Rd,Rs1,Rs2    Rd = Rs1 + Rs2
  sub   Rd,Rs1,Rs2    Rd = Rs1 - Rs2
  mult  Rd,Rs1,Rs2    Rd = Rs1 * Rs2

Number of Addresses (cont.)

Example
C statement:
  A = B + C * D - E + F + A

Equivalent code:
  load  R1,B
  load  R2,C
  load  R3,D
  load  R4,E
  load  R5,F
  load  R6,A
  mult  R2,R2,R3
  add   R2,R2,R1
  sub   R2,R2,R4
  add   R2,R2,R5
  add   R2,R2,R6
  store A,R2

Flow of Control

- Default is sequential flow
- Several instructions alter this default execution
  - Branches
    - Unconditional
    - Conditional
    - Delayed branches
  - Procedure calls
    - Delayed procedure calls

Flow of Control (cont.)

Branches
- Unconditional
  - Absolute address
  - PC-relative
    - Target address is specified relative to PC contents
    - Relocatable code
- Example: MIPS
  - Absolute address
      j target
  - PC-relative
      b target


Flow of Control (cont.)

Branches
- Conditional
  - Jump is taken only if the condition is met
- Two types
  - Set-Then-Jump
    - Condition testing is separated from branching
    - Condition code registers are used to convey the condition test result
    - Condition code registers keep a record of the status of the last ALU operation, such as an overflow condition
  - Example: Pentium code
      cmp AX,BX     compare AX and BX
      je  target    jump if equal

Flow of Control (cont.)

  - Test-and-Jump
    - Single instruction performs condition testing and branching
  - Example: MIPS instruction
      beq Rsrc1,Rsrc2,target
    Jumps to target if Rsrc1 = Rsrc2

Delayed branching
- Control is transferred after executing the instruction that follows the branch instruction
  - This instruction slot is called the delay slot
- Improves efficiency
- Highly pipelined RISC processors support it

Flow of Control (cont.)

Procedure calls
- Facilitate modular programming
- Require two pieces of information to return
  - End of procedure
    - Pentium uses the ret instruction
    - MIPS uses the jr instruction
  - Return address
    - In a (special) register: MIPS allows any general-purpose register
    - On the stack: Pentium

Flow of Control (cont.)

[Figure: delay slot]

Parameter Passing

Two basic techniques
- Register-based (e.g., PowerPC, MIPS)
  - Internal registers are used
    - Faster
    - Limit the number of parameters
    - Recursive procedure
- Stack-based (e.g., Pentium)
  - Stack is used
    - More general

2. Operand Types

Instructions support basic data types
- Characters
- Integers
- Floating-point

Instruction overload
- Same instruction for different data types
- Example: Pentium
    mov AL,address     loads an 8-bit value
    mov AX,address     loads a 16-bit value
    mov EAX,address    loads a 32-bit value

Operand Types (cont.)

Separate instructions
- Instructions specify the operand size
- Example: MIPS
    lb Rdest,address    loads a byte
    lh Rdest,address    loads a halfword (16 bits)
    lw Rdest,address    loads a word (32 bits)
    ld Rdest,address    loads a doubleword (64 bits)
- Similar instructions: store

3. Addressing Modes

How the operands are specified
- Operands can be in three places
  - Registers
    - Register addressing mode
  - Part of instruction
    - Constant
    - Immediate addressing mode
    - All processors support these two addressing modes
  - Memory
    - Difference between RISC and CISC
    - CISC supports a large variety of addressing modes
    - RISC follows load/store architecture

4. Instruction Types

Several types of instructions
- Data movement
  - Pentium: mov dest,src
  - Some do not provide direct data movement instructions
  - Indirect data movement
      add Rdest,Rsrc,0    Rdest = Rsrc + 0
- Arithmetic and Logical
  - Arithmetic
    - Integer and floating-point, signed and unsigned
    - add, subtract, multiply, divide
  - Logical
    - and, or, not, xor

Instruction Types (cont.)

Condition code bits
- S: Sign bit (0 = +, 1 = -)
- Z: Zero bit (0 = nonzero, 1 = zero)
- O: Overflow bit (0 = no overflow, 1 = overflow)
- C: Carry bit (0 = no carry, 1 = carry)

Example: Pentium
    cmp count,25    compare count to 25 (subtract 25 from count)
    je  target      jump if equal
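How a compare sets the four condition code bits can be sketched in software. The model below assumes a 32-bit word and the x86-style convention that a compare subtracts its second operand from its first and that the carry bit records a borrow; the function name compare and the dict of flags are illustrative, not part of any processor's definition.

```python
def compare(a, b, bits=32):
    """Return the S, Z, O, C bits produced by computing a - b."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "S": int(bool(result & sign)),   # sign bit of the result
        "Z": int(result == 0),           # result is zero
        # signed overflow: operands differ in sign and the result's
        # sign differs from the first operand's sign
        "O": int(bool((a ^ b) & (a ^ result) & sign)),
        "C": int(a < b),                 # borrow in unsigned subtraction
    }


def jump_if_equal(flags):
    """A je-style branch looks only at the zero bit."""
    return flags["Z"] == 1
```

The branch instruction never sees the operands; it only reads the bits that the earlier compare left behind, which is what separates Set-Then-Jump from Test-and-Jump.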

Instruction Types (cont.)

Flow control and I/O instructions
- Branch
- Procedure call
- Interrupts

I/O instructions
- Memory-mapped I/O
  - Most processors support memory-mapped I/O
  - No separate instructions for I/O
- Isolated I/O
  - Pentium supports isolated I/O
  - Separate I/O instructions
      in  AX,io_port    read from an I/O port
      out io_port,AX    write to an I/O port

5. Instruction Formats

Two types
- Fixed-length
  - Used by RISC processors
  - 32-bit RISC processors use 32-bit wide instructions
    - Examples: SPARC, MIPS, PowerPC
- Variable-length
  - Used by CISC processors
  - Memory operands need more bits to specify

Opcode
- Major and exact operation

Examples of Instruction Formats

[Figure: examples of instruction formats]

RISC (Reduced Instruction Set Computer) versus CISC (Complex Instruction Set Computer)

RISC vs CISC

The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.

RISC systems access memory only with explicit load and store instructions.

In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.

RISC vs CISC

The difference between CISC and RISC becomes evident through the basic computer performance equation:

  time/program = time/cycle x cycles/instruction x instructions/program

RISC systems shorten execution time by reducing the clock cycles per instruction.

CISC systems improve performance by reducing the number of instructions per program.

RISC vs CISC

The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.

The more complex and variable instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.

With fixed-length instructions, RISC lends itself to pipelining and speculative execution.

RISC vs CISC

Consider the program fragments:

CISC:
  mov ax, 10
  mov bx, 5
  mul bx, ax

RISC:
  mov ax, 0
  mov bx, 10
  mov cx, 5
  Begin: add ax, bx
         loop Begin

The total clock cycles for the CISC version might be:
  (2 movs x 1 cycle) + (1 mul x 30 cycles) = 32 cycles

While the clock cycles for the RISC version is:
  (3 movs x 1 cycle) + (5 adds x 1 cycle) + (5 loops x 1 cycle) = 13 cycles

With the RISC clock cycle being shorter, RISC gives us much faster execution speeds.
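The cycle comparison can be reproduced with a few lines of arithmetic. The instruction counts and per-instruction cycle costs below are assumptions chosen for illustration (a 30-cycle multiply against single-cycle moves, adds and loops); they are not measurements of any particular processor, and the helper name total_cycles is made up.

```python
def total_cycles(instruction_mix):
    """Sum count * cycles over (count, cycles_per_instruction) pairs."""
    return sum(count * cycles for count, cycles in instruction_mix)


def execution_time_ns(instruction_mix, cycle_time_ns):
    """time/program = total cycles x time/cycle."""
    return total_cycles(instruction_mix) * cycle_time_ns


# Assumed mixes: CISC uses 2 movs and 1 multi-cycle mul;
# RISC uses 3 movs, 5 adds and 5 loop instructions, one cycle each.
CISC_MIX = [(2, 1), (1, 30)]
RISC_MIX = [(3, 1), (5, 1), (5, 1)]

cisc_cycles = total_cycles(CISC_MIX)
risc_cycles = total_cycles(RISC_MIX)
```

The RISC fragment executes more instructions but fewer cycles, which is the trade the performance equation describes: more instructions per program in exchange for fewer cycles per instruction.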

RISC vs CISC

Because of their load-store ISAs, RISC architectures require a large number of CPU registers.

These registers provide fast access to data during sequential program execution.

They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.

Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.

RISC vs CISC

[Figure: overlapping register windows]

This is how registers can be overlapped in a RISC system.

The current window pointer (CWP) points to the active register window.

RISC vs CISC

It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.

Some RISC systems provide more extravagant instruction sets than some CISC systems.

Some systems combine both approaches.

The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.

RISC vs CISC

RISC
- Multiple register sets.
- Three operands per instruction.
- Parameter passing through register windows.
- Single-cycle instructions.
- Hardwired control.
- Highly pipelined.

CISC
- Single register set.
- One or two register operands per instruction.
- Parameter passing through memory.
- Multiple cycle instructions.
- Microprogrammed control.
- Less pipelined.

Continued...

RISC vs CISC

RISC
- Simple instructions, few in number.
- Fixed length instructions.
- Complexity in compiler.
- Only LOAD/STORE instructions access memory.
- Few addressing modes.

CISC
- Many complex instructions.
- Variable length instructions.
- Complexity in microcode.
- Many instructions can access memory.
- Many addressing modes.

Summary

Instruction Set Design Issues

Instruction set design issues include:
- Where are operands stored?
  - registers, memory, stack, accumulator
- How many explicit operands are there?
  - 0, 1, 2, or 3
- How is the operand location specified?
  - register, immediate, indirect, ...
- What type and size of operands are supported?
  - byte, int, float, double, string, vector, ...
- What operations are supported?
  - add, sub, mul, move, compare, ...

More About General Purpose Registers

Why do almost all new architectures use GPRs?
- Registers are much faster than memory (even cache)
  - Register values are available immediately
  - When memory isn't ready, the processor must wait ("stall")
- Registers are convenient for variable storage
  - Compiler assigns some variables just to registers
  - More compact code, since small fields specify registers (compared to memory addresses)

[Figure: Processor (Registers), Cache, Memory, Disk]

What Operations are Needed

- Arithmetic and Logical
  - Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
  - Logical operation: AND, OR, XOR, NOT
- Data Transfer: copy, load, store
- Control: branch, jump, call, return
- Floating Point: ADDF, MULF, DIVF
  - Same as arithmetic but usually take bigger operands
- Decimal: ADDD, CONVERT
- String: move, compare, search
- Graphics: pixel and vertex, compression/decompression operations

Stacks Architecture Pros and Cons

Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures

Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures

Accumulators Architecture Pros and Cons

Pros
- Very low hardware requirements
- Easy to design and understand

Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic

Memory-Memory Architecture Pros and Cons

Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)

Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required

Memory-Register Architecture Pros and Cons

Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density

Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers

Operations in the Instruction Set

Operator type: examples
- Arithmetic and logical: integer arithmetic and logical operations: add, and, subtract, or
- Data transfer: loads-stores (move instructions on machines with memory addressing)
- Control: branch, jump, procedure call and return, trap
- System: operating system call, virtual memory management instructions
- Floating point: floating point instructions: add, multiply
- Decimal: decimal add, decimal multiply, decimal to character conversion
- String: string move, string compare, string search
- Graphics: pixel operations, compression/decompression operations

Arithmetic, logical, data transfer and control are almost standard categories for all machines.

System instructions are required for multi-programming environments, although support for system functions varies.

Decimal and string instructions can be primitives, e.g. on the IBM 360 and the VAX.

Support for floating point, decimal, string and graphics is optional and is sometimes provided via a co-processor.

Some machines rely on the compiler to synthesize special operations, such as string handling, from simpler instructions.

Operations for Media and Signal Processing

Single instruction multiple data (SIMD) and vector instructions are often supported in DSPs, which are commonly used in multimedia and signal processing applications.

Partitioned Add (integer)
- Perform multiple 16-bit additions on a 64-bit ALU, since most data are narrow
- Increases ALU throughput for multimedia applications

Paired single operations (float)
- Allow the same register to act as two operands to the same operation
- Handy in dealing with vertices and coordinates

Multiply and accumulate
- Very handy for calculating dot products of vectors (signal processing) and matrix multiplication
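A partitioned add can be modeled in software by treating one wide word as several independent lanes. The sketch below assumes four 16-bit lanes in a 64-bit word with wrap-around (non-saturating) arithmetic; the helper names pack, unpack and partitioned_add are invented for this example and do not correspond to a specific instruction.

```python
def pack(values, lane_bits=16):
    """Pack a list of small integers into one word, lane 0 lowest."""
    mask = (1 << lane_bits) - 1
    word = 0
    for i, v in enumerate(values):
        word |= (v & mask) << (i * lane_bits)
    return word


def unpack(word, lane_bits=16, lanes=4):
    """Split a word back into its lanes."""
    mask = (1 << lane_bits) - 1
    return [(word >> (i * lane_bits)) & mask for i in range(lanes)]


def partitioned_add(x, y, lane_bits=16, lanes=4):
    """Add lane by lane; a carry out of one lane never enters the next."""
    mask = (1 << lane_bits) - 1
    result = 0
    for i in range(lanes):
        shift = i * lane_bits
        lane_sum = ((x >> shift) & mask) + ((y >> shift) & mask)
        result |= (lane_sum & mask) << shift
    return result


total = unpack(partitioned_add(pack([0xFFFF, 2, 3, 4]), pack([1, 20, 30, 40])))
```

Lane 0 overflows and wraps to zero without disturbing lane 1, which is the property that distinguishes a partitioned add from an ordinary 64-bit add.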

Frequency of Operations Usage

The most widely executed instructions are the simple operations of an instruction set.

The following is the average usage in SPECint92 on Intel 80x86:

Rank   80x86 instruction         Integer average (% total executed)
1      Load                      22%
2      Conditional branch        20%
3      Compare                   16%
4      Store                     12%
5      Add                       8%
6      And                       6%
7      Sub                       5%
8      Move register-register    4%
9      Call                      1%
10     Return                    1%
       Total                     96%

Make the common case fast by focusing on these operations.

Control Flow Instructions

- Jump: for unconditional change in the control flow
- Branch: for conditional change in the control flow
- Procedure calls and returns

[Figure: frequency of control flow instructions. Data is based on SPEC on Alpha.]

Destination Address Definition

Relative addressing with respect to the program counter proved to be the best choice for forward and backward branching or jumps (load address independent).

To allow for dynamic loading of library routines, register indirect addressing allows addresses to be loaded in special registers (e.g. virtual functions in C++ and system calls in a case statement).

[Figure: branch distance data. Data is based on SPEC on Alpha.]
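Why PC-relative targets are independent of the load address can be seen from a small calculation. The sketch assumes the MIPS convention for conditional branches, where a signed 16-bit word offset is added to the address of the instruction after the branch; the function names are chosen for this example.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def branch_target(pc, offset16):
    """MIPS-style target: (PC + 4) + sign-extended word offset * 4."""
    return (pc + 4 + (sign_extend(offset16, 16) << 2)) & 0xFFFFFFFF


# The same encoded offset gives the same distance wherever the code is loaded.
forward_a = branch_target(0x00400000, 3) - 0x00400000
forward_b = branch_target(0x10000000, 3) - 0x10000000
backward = branch_target(0x00400010, 0xFFFF)
```

Because only the distance is encoded, the instruction bits do not have to be patched when the program is placed at a different address, which is what makes the code relocatable.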

Condition Evaluation

Compare-and-branch can be efficient if the majority of conditions are comparisons with zero.

Remember to focus on the common case.

[Figure: condition evaluation data, based on SPEC on MIPS.]

Frequency of Types of Comparison

[Figure: frequency of types of comparison. Data is based on SPEC on Alpha.]

Different benchmarks and machines set new design priorities.

DSPs support a repeat instruction for for-loops (vectors) using registers.

Supporting Procedures

Execution of a procedure follows these steps:
- Store parameters in a place accessible to the procedure
- Transfer control to the procedure
- Acquire the storage resources needed for the procedure
- Perform the desired task
- Store the result value in a place accessible to the calling program
- Return control to the point of origin

The hardware provides a program counter to trace instruction flow and manage transfer of control.

Parameter Passing
- Registers can be used for passing a small number of parameters
- A stack is used to spill registers of the current context, to make room for the called procedure to run, and to allow large parameters to be passed
- Storage of machine state can be performed by caller or callee
- Handling of shared variables is important to ensure correct semantics and thus requires clear specifications in the library interface

Global variables stored in registers need careful handling.

Type and Size of Operands

- The type of an operand is designated by encoding it in the instruction's operation code
- The type of an operand (e.g. single-precision float) effectively gives its size
- Common operand types include character, half-word and word-size integer, and single- and double-precision floating point
- Characters are almost always in ASCII, integers are in two's complement, and floating point is in IEEE 754
- The 16-bit Unicode used in Java is gaining popularity due to its support for international character sets
- For business applications, some architectures support a decimal format in binary-coded decimal (BCD)
- Depending on the size of the word, the complexity of handling different operand types differs
- DSPs offer fixed-point data types to support high-precision floating-point arithmetic and to allow sharing a single exponent among multiple numbers
- For graphics applications, vertex and pixel operands are added features
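To make two of the integer encodings named above concrete, here is a minimal Python sketch of two's complement and packed BCD. The helper names `twos_complement` and `to_bcd` are illustrative assumptions, not part of any ISA or library mentioned in the slides.

```python
def twos_complement(value: int, bits: int) -> int:
    """Return the unsigned bit pattern that encodes `value` in `bits`-bit two's complement."""
    low, high = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not low <= value <= high:
        raise ValueError(f"{value} does not fit in {bits} bits")
    return value & ((1 << bits) - 1)


def to_bcd(value: int) -> int:
    """Pack a non-negative decimal integer into BCD, one decimal digit per 4-bit nibble."""
    if value < 0:
        raise ValueError("this sketch handles non-negative values only")
    result, shift = 0, 0
    while True:
        value, digit = divmod(value, 10)
        result |= digit << shift   # place the digit in the next nibble
        shift += 4
        if value == 0:
            return result


print(hex(twos_complement(-5, 8)))  # bit pattern of -5 in one byte
print(hex(to_bcd(1234)))            # each hex digit is one decimal digit
```

The BCD form trades density for easy decimal conversion, which is why it appears in business-oriented instruction sets.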

Size of Operands

- The double-word data type is used for double-precision floating-point operations and for address storage in machines with a 64-bit wide address bus
- Words are used for integer operations and for 32-bit address bus machines
- Because of this mix, word and double-word data types dominate in SPEC

[Figure: frequency of reference by size, based on SPEC2000 on Alpha]

Instruction Representation

- Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary, or base 2)
- Numbers are stored in computers as a series of high and low electronic signals (binary numbers)
- Binary digits are called bits and are considered the atom of computing
- Each piece of an instruction is a number, and placing these numbers together forms the instruction
- The assembler translates symbolic assembly instructions into machine language instructions (machine code)

Example:

Assembly: add $t0, $s1, $s2

Machine language (decimal):

    0        17       18       8        0        32

Machine language (binary):

    000000   10001    10010    01000    00000    100000
    6 bits   5 bits   5 bits   5 bits   5 bits   6 bits

Note: the MIPS compiler by default maps $s0,...,$s7 to registers 16-23 and $t0,...,$t7 to registers 8-15
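The field packing in this example can be checked with a short Python sketch. The function name `encode_r_type` is an assumption made for illustration; the field widths (6, 5, 5, 5, 5, 6 bits) and the values for add $t0, $s1, $s2 are the ones on the slide.

```python
def encode_r_type(op: int, rs: int, rt: int, rd: int, shamt: int, funct: int) -> int:
    """Pack the six register-format fields into one 32-bit instruction word."""
    fields = ((op, 6), (rs, 5), (rt, 5), (rd, 5), (shamt, 5), (funct, 6))
    word = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"field value {value} does not fit in {width} bits")
        word = (word << width) | value   # append the field to the right of the word
    return word


# add $t0, $s1, $s2: rs = $s1 = 17, rt = $s2 = 18, rd = $t0 = 8, funct = 32
word = encode_r_type(op=0, rs=17, rt=18, rd=8, shamt=0, funct=32)
print(f"{word:032b}")
print(f"0x{word:08X}")   # 0x02324020
```

Placing the numbers side by side is all the "encoding" amounts to, which is the point of the slide.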

Encoding an Instruction Set

- Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation
- The operation is typically specified in one field, called the opcode
- The addressing mode for the operand can be encoded with the operation, or specified through a separate identifier when a large number of modes is supported
- The architecture must balance several competing factors:
  - Desire to support as many registers and addressing modes as possible
  - Effect of operand specification on the size of the instruction (and program)
  - Desire to simplify instruction fetching and decoding during execution
- Fixed-size instruction encoding simplifies the CPU design while limiting the addressing modes supported
- An architect who cares about code size can use variable-size encoding
- A hybrid approach allows variability by supporting multiple instruction sizes

Encoding Examples

[Figure: Encoding Examples]

MIPS Instruction Format

Register-format instructions

    op       rs       rt       rd       shamt    funct
    6 bits   5 bits   5 bits   5 bits   5 bits   6 bits

- op: basic operation of the instruction, traditionally called the opcode
- rs: the first register source operand
- rt: the second register source operand
- rd: the register destination operand; it gets the result of the operation
- shamt: shift amount
- funct: this field selects the specific variant of the operation in the op field

Immediate-type instructions

    op       rs       rt       address
    6 bits   5 bits   5 bits   16 bits

- Some instructions need longer fields than those provided above, for large-value constants
- The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register
- Example: lw $t0, 32($s3)   # temporary register $t0 gets A[8]

    Instruction  Format  op  rs   rt   rd   shamt  funct  address
    add          R       0   reg  reg  reg  0      32     N/A
    sub          R       0   reg  reg  reg  0      34     N/A
    lw           I       35  reg  reg  N/A  N/A    N/A    address
    sw           I       43  reg  reg  N/A  N/A    N/A    address
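The immediate format and its ±2^15 byte reach can be sketched the same way. The names `encode_i_type` and `decode_i_type` are hypothetical helpers; the opcode 35 for lw and the register numbers ($s3 = 19, $t0 = 8) follow the table and the register mapping given earlier.

```python
def encode_i_type(op: int, rs: int, rt: int, offset: int) -> int:
    """Pack op (6 bits), rs (5), rt (5) and a signed 16-bit offset into a 32-bit word."""
    if not 0 <= op < 64 or not 0 <= rs < 32 or not 0 <= rt < 32:
        raise ValueError("op, rs or rt out of range")
    if not -(1 << 15) <= offset < (1 << 15):
        raise ValueError("offset is outside the signed 16-bit range")
    return (op << 26) | (rs << 21) | (rt << 16) | (offset & 0xFFFF)


def decode_i_type(word: int) -> tuple[int, int, int, int]:
    """Split a 32-bit word back into (op, rs, rt, signed offset)."""
    offset = word & 0xFFFF
    if offset & 0x8000:          # sign-extend the 16-bit field
        offset -= 1 << 16
    return word >> 26, (word >> 21) & 0x1F, (word >> 16) & 0x1F, offset


# lw $t0, 32($s3): base register rs = 19, destination rt = 8
word = encode_i_type(op=35, rs=19, rt=8, offset=32)
print(f"0x{word:08X}")           # 0x8E680020
print(decode_i_type(word))
```

An offset of 32768 is rejected, which is the encoding-level reason a single load can only reach about 32 KB on either side of the base register.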

The Stored Program Concept

Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept

Today's computers are built on two key principles:

- Instructions are represented as numbers
- Programs can be stored in memory to be read or written, just like numbers

The power of the concept: memory can contain

- the source code for an editor
- the compiled machine code for the editor
- the text that the compiled program is using
- the compiler that generated the code

[Figure: a processor connected to a memory that holds an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and the source code in C for the editor program]

Compiling if-then-else in MIPS

Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiled MIPS code for the following C if statement?

    if (i == j) f = g + h; else f = g - h;

[Figure: flowchart that tests i == j; when i == j it computes f = g + h, when i != j it goes to Else: and computes f = g - h; both paths meet at Exit:]

MIPS:

          bne  $s3, $s4, Else    # go to Else if i != j
          add  $s0, $s1, $s2     # f = g + h (skipped if i != j)
          j    Exit
    Else: sub  $s0, $s1, $s2     # f = g - h (skipped if i == j)
    Exit:

Typical Compilation

[Figure: Typical Compilation]

Major Types of Optimization

Optimization name: explanation (frequency; N.M. = not measured)

High-level (at or near source level; machine-independent)

- Procedure integration: replace procedure call by procedure body (N.M.)

Local (within straight-line code)

- Common subexpression elimination: replace two instances of the same computation by a single copy (18%)
- Constant propagation: replace all instances of a variable that is assigned a constant with the constant
- Stack height reduction: rearrange the expression tree to minimize resources needed for expression evaluation (N.M.)

Global (across a branch)

- Global common subexpression elimination: same as local, but this version crosses branches
- Copy propagation: replace all instances of a variable A that has been assigned X (i.e. A = X) with X
- Code motion: remove code from a loop that computes the same value each iteration of the loop
- Induction variable elimination: simplify/eliminate array addressing calculations within loops

Machine-dependent (depends on machine knowledge)

- Strength reduction: many examples, such as replacing multiply by a constant with adds and shifts (N.M.)
- Pipeline scheduling: reorder instructions to improve pipeline performance (N.M.)
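Three of the optimizations listed above can be illustrated at source level. The following Python sketch is a hand-written before/after pair, not compiler output; the function names and the constant 10 are assumptions chosen for the example.

```python
def before(a: int, b: int, c: int) -> int:
    """Unoptimized form: (a + b) is computed twice and n is a known constant."""
    n = 10
    x = (a + b) * c
    y = (a + b) * n
    return x + y


def after(a: int, b: int, c: int) -> int:
    """Same computation after three transformations applied by hand."""
    t = a + b                  # common subexpression elimination: compute a + b once
    x = t * c
    # constant propagation replaces n with 10; strength reduction then
    # rewrites t * 10 as t * 8 + t * 2 using shifts and an add
    y = (t << 3) + (t << 1)
    return x + y


print(before(2, 3, 4), after(2, 3, 4))
```

Both functions return the same value for any integers, which is the invariant every such transformation has to preserve.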

Effect of Compiler Optimization

[Figure: measurements by program and compiler optimization level]

- Level 0: non-optimized code
- Level 1: local optimization
- Level 2: global optimization, s/w pipelining
- Level 3: adds procedure integration

Compiler Support for Multimedia Instr.

- Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics)
- Intel added a new set of instructions called Streaming SIMD Extension
- A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer
- Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations
  - Strided addressing allows memory access in increments larger than one
  - Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data
- Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup
- Such limited support for vector processing makes the use of vectorizing compiler optimization unpopular and restricts its scope to hand-coded routines

SIMD instructions on MMX and AltiVec tend to be solutions, not primitives
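The two vector addressing styles can be sketched in a few lines of Python, with a flat list standing in for memory. The function names and the 4x4 row-major matrix are assumptions used only for illustration.

```python
def strided_load(memory: list[int], base: int, stride: int, count: int) -> list[int]:
    """Load `count` elements starting at `base`, stepping `stride` elements each time."""
    return [memory[base + i * stride] for i in range(count)]


def gather(memory: list[int], indices: list[int]) -> list[int]:
    """Load the elements whose addresses are stored in `indices`."""
    return [memory[i] for i in indices]


def scatter(memory: list[int], indices: list[int], values: list[int]) -> None:
    """Store each value at the address held in the matching position of `indices`."""
    for index, value in zip(indices, values):
        memory[index] = value


# A 4x4 matrix stored row by row: column 1 sits at stride 4 from element 1.
memory = list(range(100, 116))
print(strided_load(memory, base=1, stride=4, count=4))
print(gather(memory, [7, 0, 3]))
```

Without a strided load, reading a matrix column means one scalar access per element, which is the limitation the slide attributes to MMX.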

Starting a Program

[Figure: C program -> Compiler -> Assembly language program -> Assembler -> Object: machine language module, together with Object: library routine (machine language) -> Linker -> Executable: machine language program -> Loader -> Memory]

Linker

- Places code and data modules symbolically in memory
- Determines the addresses of data and instruction labels
- Patches both internal and external references

Object files for Unix typically contain:

- Header: size and position of components
- Text segment: machine code
- Data segment: static and dynamic variables
- Relocation info: identifies absolute memory references
- Symbol table: names and locations of labels, procedures and variables
- Debugging info: mapping of source to object code, breakpoints, etc.

Loading Executable Program

[Figure: memory layout, from the top: Stack ($sp at 7fff ffff hex), Dynamic data, Static data ($gp at 1000 8000 hex, segment starting at 1000 0000 hex), Text (pc at 0040 0000 hex), Reserved (from 0)]

To load an executable, the operating system follows these steps:

- Reads the executable file header to determine the size of the text and data segments
- Creates an address space large enough for the text and data
- Copies the instructions and data from the executable file into memory
- Copies the parameters (if any) to the main program onto the stack
- Initializes the machine registers and sets the stack pointer to the first free location
- Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program

Instruction Set Design Issues

- Number of Addresses
- Flow of Control
- Operand Types
- Addressing Modes
- Instruction Types
- Instruction Formats

Number of Addresses

Four categories:

- 3-address machines
  - 2 for the source operands and one for the result
- 2-address machines
  - One address doubles as source and result
- 1-address machines
  - Accumulator machines
  - Accumulator is used for one source and result
- 0-address machines
  - Stack machines
  - Operands are taken from the stack
  - Result goes onto the stack

Number of Addresses (cont.)

Three-address machines

- Two for the source operands, one for the result
- RISC processors use three addresses
- Sample instructions:

    add  dest,src1,src2     ; M(dest) = [src1] + [src2]
    sub  dest,src1,src2     ; M(dest) = [src1] - [src2]
    mult dest,src1,src2     ; M(dest) = [src1] * [src2]

[Figure: three-address format with fields Operand 1, Operand 2, Result. Example: a = b + c]

Three-address instruction formats are not common because they require a relatively long instruction format to hold the three address references

Number of Addresses (cont.)

Example

C statement:

    A = B + C * D - E + F + A

Equivalent code:

    mult T,C,D     ; T = C*D
    add  T,T,B     ; T = B + C*D
    sub  T,T,E     ; T = B + C*D - E
    add  T,T,F     ; T = B + C*D - E + F
    add  A,T,A     ; A = B + C*D - E + F + A

Number of Addresses (cont.)

Two-address machines

- One address doubles (for source operand and result)
- The last example makes a case for it
  - Address T is used twice
- Sample instructions:

    load dest,src     ; M(dest) = [src]
    add  dest,src     ; M(dest) = [dest] + [src]
    sub  dest,src     ; M(dest) = [dest] - [src]
    mult dest,src     ; M(dest) = [dest] * [src]

[Figure: two-address format; one address doubles as operand and result. Example: a = a + b]

The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation

Number of Addresses (cont.)

Example

C statement:

    A = B + C * D - E + F + A

Equivalent code:

    load T,C     ; T = C
    mult T,D     ; T = C*D
    add  T,B     ; T = B + C*D
    sub  T,E     ; T = B + C*D - E
    add  T,F     ; T = B + C*D - E + F
    add  A,T     ; A = B + C*D - E + F + A

Number of Addresses (cont.)

One-address machines

- Use a special set of registers called accumulators
  - Specify one source operand and receive the result
- Called accumulator machines
- Sample instructions:

    load  addr     ; accum = [addr]
    store addr     ; M[addr] = accum
    add   addr     ; accum = accum + [addr]
    sub   addr     ; accum = accum - [addr]
    mult  addr     ; accum = accum * [addr]

Number of Addresses (cont.)

Example

C statement:

    A = B + C * D - E + F + A

Equivalent code:

    load  C     ; load C into accum
    mult  D     ; accum = C*D
    add   B     ; accum = C*D + B
    sub   E     ; accum = B + C*D - E
    add   F     ; accum = B + C*D - E + F
    add   A     ; accum = B + C*D - E + F + A
    store A     ; store accum contents in A
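The accumulator sequence above can be traced with a small interpreter. This is a minimal sketch: the function name `run_accumulator`, the tuple-based program format and the sample values for A through F are assumptions made for the example, not a model of any real machine.

```python
def run_accumulator(program: list[tuple[str, str]], memory: dict[str, int]) -> dict[str, int]:
    """Execute one-address instructions against a single accumulator register."""
    accum = 0
    for opcode, addr in program:
        if opcode == "load":
            accum = memory[addr]
        elif opcode == "store":
            memory[addr] = accum
        elif opcode == "add":
            accum += memory[addr]
        elif opcode == "sub":
            accum -= memory[addr]
        elif opcode == "mult":
            accum *= memory[addr]
        else:
            raise ValueError(f"unknown opcode {opcode!r}")
    return memory


# A = B + C * D - E + F + A, exactly as sequenced on the slide
program = [("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
           ("add", "F"), ("add", "A"), ("store", "A")]
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
print(run_accumulator(program, memory)["A"])
```

Every instruction except the implicit accumulator update touches memory, which is why accumulator machines show high memory traffic.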

Number of Addresses (cont.)

Zero-address machines

- Stack supplies operands and receives the result
  - Special instructions to load and store use an address
- Called stack machines (Ex: HP3000, Burroughs B5500)
- Sample instructions:

    push addr     ; push([addr])
    pop  addr     ; pop([addr])
    add           ; push(pop + pop)
    sub           ; push(pop - pop)
    mult          ; push(pop * pop)

Number of Addresses (cont.)

Example

C statement:

    A = B + C * D - E + F + A

Equivalent code:

    push E
    push C
    push D
    mult
    push B
    add
    sub
    push F
    add
    push A
    add
    pop  A
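The stack version relies on operand order: with sub defined as push(pop - pop), the first value popped (the top of the stack) is the minuend. The sketch below makes that explicit; `run_stack` and the sample values are illustrative assumptions.

```python
def run_stack(program: list[tuple[str, ...]], memory: dict[str, int]) -> dict[str, int]:
    """Execute zero-address instructions; only push and pop name a memory address."""
    stack: list[int] = []
    for instr in program:
        opcode = instr[0]
        if opcode == "push":
            stack.append(memory[instr[1]])
        elif opcode == "pop":
            memory[instr[1]] = stack.pop()
        elif opcode in ("add", "sub", "mult"):
            first = stack.pop()    # top of stack
            second = stack.pop()   # element below it
            if opcode == "add":
                stack.append(first + second)
            elif opcode == "sub":
                stack.append(first - second)   # push(pop - pop)
            else:
                stack.append(first * second)
        else:
            raise ValueError(f"unknown opcode {opcode!r}")
    return memory


program = [("push", "E"), ("push", "C"), ("push", "D"), ("mult",), ("push", "B"),
           ("add",), ("sub",), ("push", "F"), ("add",), ("push", "A"), ("add",),
           ("pop", "A")]
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
print(run_stack(program, memory)["A"])
```

Pushing E first is what lets the later sub compute (B + C*D) - E without any swap instruction.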

Load/Store Architecture

- Instructions expect operands in internal processor registers
- Special LOAD and STORE instructions move data between registers and memory
- RISC uses this architecture
- Reduces instruction length

Load/Store Architecture (cont.)

Sample instructions:

    load  Rd,addr       ; Rd = [addr]
    store addr,Rs       ; (addr) = Rs
    add   Rd,Rs1,Rs2    ; Rd = Rs1 + Rs2
    sub   Rd,Rs1,Rs2    ; Rd = Rs1 - Rs2
    mult  Rd,Rs1,Rs2    ; Rd = Rs1 * Rs2

Number of Addresses (cont.)

Example

C statement:

    A = B + C * D - E + F + A

Equivalent code:

    load  R1,B
    load  R2,C
    load  R3,D
    load  R4,E
    load  R5,F
    load  R6,A
    mult  R2,R2,R3
    add   R2,R2,R1
    sub   R2,R2,R4
    add   R2,R2,R5
    add   R2,R2,R6
    store A,R2

Flow of Control

- Default is sequential flow
- Several instructions alter this default execution
  - Branches
    - Unconditional
    - Conditional
    - Delayed branches
  - Procedure calls
    - Delayed procedure calls

Flow of Control (cont.)

Branches

- Unconditional
  - Absolute address
  - PC-relative
    - Target address is specified relative to PC contents
    - Relocatable code
- Example: MIPS
  - Absolute address

        j target

  - PC-relative

        b target
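Why PC-relative targets give relocatable code can be seen from the arithmetic. The sketch below assumes the MIPS convention that the encoded value is a signed 16-bit count of words, measured from the instruction after the branch; the function names and the sample addresses are assumptions for illustration.

```python
def pc_relative_target(pc: int, word_offset: int) -> int:
    """Return the branch target for a branch located at address `pc`."""
    return (pc + 4 + (word_offset << 2)) & 0xFFFFFFFF


def word_offset_for(pc: int, target: int) -> int:
    """Return the offset field an assembler would store for a branch at `pc` to `target`."""
    delta = target - (pc + 4)
    if delta % 4 != 0:
        raise ValueError("target is not word aligned relative to the branch")
    offset = delta // 4
    if not -(1 << 15) <= offset < (1 << 15):
        raise ValueError("target is out of range for a 16-bit offset")
    return offset


offset = word_offset_for(pc=0x00400000, target=0x00400010)
print(offset)
# Moving the whole program by 0x1000 bytes leaves the encoded offset unchanged.
print(word_offset_for(pc=0x00401000, target=0x00401010) == offset)
```

An absolute jump, by contrast, embeds the address itself, so it has to be patched if the code is loaded somewhere else.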

Flow of Control (cont.)

Branches

- Conditional
  - Jump is taken only if the condition is met
- Two types
  - Set-Then-Jump
    - Condition testing is separated from branching
    - Condition code registers are used to convey the condition test result
    - Condition code registers keep a record of the status of the last ALU operation, such as an overflow condition
    - Example: Pentium code

          cmp AX,BX      ; compare AX and BX
          je  target     ; jump if equal

Flow of Control (cont.)

  - Test-and-Jump
    - Single instruction performs condition testing and branching
    - Example: MIPS instruction

          beq Rsrc1,Rsrc2,target

      Jumps to target if Rsrc1 = Rsrc2

Delayed branching

- Control is transferred after executing the instruction that follows the branch instruction
  - This instruction slot is called the delay slot
- Improves efficiency
- Highly pipelined RISC processors support it

Flow of Control (cont.)

Procedure calls

- Facilitate modular programming
- Require two pieces of information to return
  - End of procedure
    - Pentium uses the ret instruction
    - MIPS uses the jr instruction
  - Return address
    - In a (special) register: MIPS allows any general-purpose register
    - On the stack: Pentium

Flow of Control (cont.)

[Figure: delay slot]

Parameter Passing

Two basic techniques:

- Register-based (e.g. PowerPC, MIPS)
  - Internal registers are used
    - Faster
    - Limit the number of parameters
    - Recursive procedure
- Stack-based (e.g. Pentium)
  - Stack is used
    - More general

2 Operand Types

- Instructions support basic data types
  - Characters
  - Integers
  - Floating-point
- Instruction overload
  - Same instruction for different data types
  - Example: Pentium

        mov AL,address     ; loads an 8-bit value
        mov AX,address     ; loads a 16-bit value
        mov EAX,address    ; loads a 32-bit value

- Separate instructions
  - Instructions specify the operand size
  - Example: MIPS

        lb Rdest,address   ; loads a byte
        lh Rdest,address   ; loads a halfword (16 bits)
        lw Rdest,address   ; loads a word (32 bits)
        ld Rdest,address   ; loads a doubleword (64 bits)

  - Similar instructions for store

3 Addressing Modes

How the operands are specified

- Operands can be in three places
  - Registers
    - Register addressing mode
  - Part of instruction
    - Constant
    - Immediate addressing mode
    - All processors support these two addressing modes
  - Memory
    - Difference between RISC and CISC
    - CISC supports a large variety of addressing modes
    - RISC follows load/store architecture

4 Instruction Types

Several types of instructions

- Data movement
  - Pentium: mov dest,src
  - Some do not provide direct data movement instructions
  - Indirect data movement

        add Rdest,Rsrc,0     ; Rdest = Rsrc + 0

- Arithmetic and Logical
  - Arithmetic
    - Integer and floating-point, signed and unsigned
    - add, subtract, multiply, divide
  - Logical
    - and, or, not, xor

Instruction Types (cont.)

Condition code bits

- S: Sign bit (0 = +, 1 = -)
- Z: Zero bit (0 = nonzero, 1 = zero)
- O: Overflow bit (0 = no overflow, 1 = overflow)
- C: Carry bit (0 = no carry, 1 = carry)

Example: Pentium

    cmp count,25     ; compare count to 25 (subtract 25 from count)
    je  target       ; jump if equal
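How a compare sets the four condition code bits can be sketched by performing the subtraction and discarding the result. The function name `compare_flags` is hypothetical, and the carry bit here follows the x86 convention of acting as a borrow on subtraction, which is an assumption beyond what the slide states.

```python
def compare_flags(a: int, b: int, bits: int = 16) -> dict[str, int]:
    """Return the S, Z, O and C bits produced by computing a - b in `bits` bits."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "S": int(bool(result & sign)),                    # sign of the result
        "Z": int(result == 0),                            # set when a == b
        "O": int(bool((a ^ b) & (a ^ result) & sign)),    # signed overflow
        "C": int(a < b),                                  # unsigned borrow
    }


count = 25
flags = compare_flags(count, 25)
print(flags)
print("jump taken" if flags["Z"] else "fall through")   # je tests only the Z bit
```

The branch instruction then reads the bits it needs, which is the separation of testing from branching that Set-Then-Jump refers to.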

Instruction Types (cont.)

- Flow control and I/O instructions
  - Branch
  - Procedure call
  - Interrupts
- I/O instructions
  - Memory-mapped I/O
    - Most processors support memory-mapped I/O
    - No separate instructions for I/O
  - Isolated I/O
    - Pentium supports isolated I/O
    - Separate I/O instructions

          in  AX,port     ; read from an I/O port
          out port,AX     ; write to an I/O port

5 Instruction Formats

Two types

- Fixed-length
  - Used by RISC processors
  - 32-bit RISC processors use 32-bit wide instructions
    - Examples: SPARC, MIPS, PowerPC
- Variable-length
  - Used by CISC processors
  - Memory operands need more bits to specify

Opcode

- Major and exact operation

Examples of Instruction Formats

[Figure: Examples of Instruction Formats]

RISC (Reduced Instruction Set Computer) versus CISC (Complex Instruction Set Computer)

RISC vs CISC

- The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute
- RISC systems access memory only with explicit load and store instructions
- In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable

RISC vs CISC (cont.)

- The difference between CISC and RISC becomes evident through the basic computer performance equation:

      time/program = (time/cycle) x (cycles/instruction) x (instructions/program)

- RISC systems shorten execution time by reducing the clock cycles per instruction
- CISC systems improve performance by reducing the number of instructions per program

RISC vs CISC (cont.)

- The simple instruction set of RISC machines enables control units to be hardwired for maximum speed
- The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time
- With fixed-length instructions, RISC lends itself to pipelining and speculative execution

RISC vs CISC (cont.)

Consider the following program fragments:

CISC:

           mov  ax, 10
           mov  bx, 5
           mul  bx, ax

RISC:

           mov  ax, 0
           mov  bx, 10
           mov  cx, 5
    Begin: add  ax, bx
           loop Begin

The total clock cycles for the CISC version might be:

    (2 movs x 1 cycle) + (1 mul x 30 cycles) = 32 cycles

While the clock cycles for the RISC version are:

    (3 movs x 1 cycle) + (5 adds x 1 cycle) + (5 loops x 1 cycle) = 13 cycles

With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds
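The cycle totals for the two fragments follow directly from the instruction mix. The sketch below uses the counts from this example (2 movs and one 30-cycle mul for CISC; 3 movs, 5 adds and 5 loops at 1 cycle each for RISC); the function names and the cycle times passed to `execution_time_ns` are hypothetical values chosen only to show the third factor of the performance equation.

```python
def total_cycles(mix: dict[str, tuple[int, int]]) -> int:
    """Sum count * cycles-per-instruction over an instruction mix."""
    return sum(count * cycles for count, cycles in mix.values())


def execution_time_ns(mix: dict[str, tuple[int, int]], cycle_time_ns: float) -> float:
    """time/program = cycles/program * time/cycle."""
    return total_cycles(mix) * cycle_time_ns


# instruction name -> (times executed, cycles each)
cisc_mix = {"mov": (2, 1), "mul": (1, 30)}
risc_mix = {"mov": (3, 1), "add": (5, 1), "loop": (5, 1)}

print(total_cycles(cisc_mix), total_cycles(risc_mix))
# Hypothetical cycle times: the RISC clock is assumed to be the shorter one.
print(execution_time_ns(cisc_mix, cycle_time_ns=2.0))
print(execution_time_ns(risc_mix, cycle_time_ns=1.0))
```

The RISC fragment executes more instructions (13 against 3) yet needs fewer cycles, which is the trade the performance equation describes.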

RISC vs CISC (cont.)

- Because of their load-store ISAs, RISC architectures require a large number of CPU registers
- These registers provide fast access to data during sequential program execution
- They can also be employed to reduce the overhead typically caused by passing parameters to subprograms
- Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers

RISC vs CISC (cont.)

- This is how registers can be overlapped in a RISC system
- The current window pointer (CWP) points to the active register window

[Figure: overlapping register windows]

RISC vs CISC (cont.)

- It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures
- Some RISC systems provide more extravagant instruction sets than some CISC systems
- Some systems combine both approaches
- The following two slides summarize the characteristics that traditionally typify the differences between these two architectures

RISC vs CISC (cont.)

    RISC                                          CISC
    Multiple register sets                        Single register set
    Three operands per instruction                One or two register operands per instruction
    Parameter passing through register windows    Parameter passing through memory
    Single-cycle instructions                     Multiple-cycle instructions
    Hardwired control                             Microprogrammed control
    Highly pipelined                              Less pipelined
    Simple instructions, few in number            Many complex instructions
    Fixed-length instructions                     Variable-length instructions
    Complexity in compiler                        Complexity in microcode
    Only LOAD/STORE instructions access memory    Many instructions can access memory
    Few addressing modes                          Many addressing modes

Summary

Instruction Set Design Issues

Instruction set design issues include:

- Where are operands stored?
  - registers, memory, stack, accumulator
- How many explicit operands are there?
  - 0, 1, 2, or 3
- How is the operand location specified?
  - register, immediate, indirect, ...
- What type and size of operands are supported?
  - byte, int, float, double, string, vector, ...
- What operations are supported?
  - add, sub, mul, move, compare, ...

More About General Purpose Registers

Why do almost all new architectures use GPRs?

- Registers are much faster than memory (even cache)
  - Register values are available immediately
  - When memory isn't ready, the processor must wait ("stall")
- Registers are convenient for variable storage
  - Compiler assigns some variables just to registers
  - More compact code, since small fields specify registers (compared to memory addresses)

[Figure: storage hierarchy: Processor with Registers, Cache, Memory, Disk]

What Operations are Needed

- Arithmetic + Logical
  - Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
  - Logical operation: AND, OR, XOR, NOT
- Data Transfer: copy, load, store
- Control: branch, jump, call, return
- Floating Point: ADDF, MULF, DIVF
  - Same as arithmetic but usually take bigger operands
- Decimal: ADDD, CONVERT
- String: move, compare, search
- Graphics: pixel and vertex, compression/decompression operations

Stack Architecture: Pros and Cons

- Pros
  - Good code density (implicit top of stack)
  - Low hardware requirements
  - Easy to write a simpler compiler for stack architectures
- Cons
  - Stack becomes the bottleneck
  - Little ability for parallelism or pipelining
  - Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
  - Difficult to write an optimizing compiler for stack architectures

Accumulator Architecture: Pros and Cons

- Pros
  - Very low hardware requirements
  - Easy to design and understand
- Cons
  - Accumulator becomes the bottleneck
  - Little ability for parallelism or pipelining
  - High memory traffic

Memory-Memory Architecture: Pros and Cons

- Pros
  - Requires fewer instructions (especially if 3 operands)
  - Easy to write compilers for (especially if 3 operands)
- Cons
  - Very high memory traffic (especially if 3 operands)
  - Variable number of clocks per instruction
  - With two operands, more data movements are required

Memory-Register Architecture: Pros and Cons

- Pros
  - Some data can be accessed without loading first
  - Instruction format easy to encode
  - Good code density
- Cons
  - Operands are not equivalent (poor orthogonality)
  - Variable number of clocks per instruction
  - May limit number of registers


Operations for Media & Signal Processing

Single instruction multiple data (SIMD) and vector instructions are often supported in DSPs, which are commonly used in multimedia and signal processing applications

Partitioned Add (integer)
- Perform multiple 16-bit additions on a 64-bit ALU, since most data are narrow
- Increases ALU throughput for multimedia applications

Paired single operations (float)
- Allow the same register to act as two operands to the same operation
- Handy in dealing with vertices and coordinates

Multiply and accumulate
- Very handy for calculating dot products of vectors (signal processing) and matrix multiplication
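The partitioned add can be sketched in software as a minimal illustration, assuming four 16-bit lanes packed into one 64-bit word. The helper names `pack`, `unpack` and `partitioned_add` are hypothetical and not taken from any real instruction set; the point is only that a carry out of one lane must not spill into the next.

```python
def pack(lanes, lane_bits=16):
    """Pack a list of small integers into one word, lane 0 in the low bits."""
    mask = (1 << lane_bits) - 1
    word = 0
    for i, value in enumerate(lanes):
        word |= (value & mask) << (i * lane_bits)
    return word


def unpack(word, lane_bits=16, word_bits=64):
    """Split a word back into its lanes, lane 0 first."""
    mask = (1 << lane_bits) - 1
    return [(word >> shift) & mask for shift in range(0, word_bits, lane_bits)]


def partitioned_add(a, b, lane_bits=16, word_bits=64):
    """Add corresponding lanes of a and b; each lane wraps independently."""
    mask = (1 << lane_bits) - 1
    result = 0
    for shift in range(0, word_bits, lane_bits):
        lane_sum = ((a >> shift) & mask) + ((b >> shift) & mask)
        result |= (lane_sum & mask) << shift  # drop the carry out of the lane
    return result


a = pack([1, 2, 3, 0xFFFF])
b = pack([10, 20, 30, 1])
lanes = unpack(partitioned_add(a, b))  # the last lane wraps around to 0
```

Real hardware performs all lanes in one ALU pass by cutting the carry chain at lane boundaries; the loop here only models the result.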

Frequency of Operations Usage

Make the common case fast by focusing on these operations

The most widely executed instructions are the simple operations of an instruction set

The following is the average usage in SPECint on the Intel 80x86

Rank   80x86 instruction          Integer average (% total executed)
1      Load                       22%
2      Conditional branch         20%
3      Compare                    16%
4      Store                      12%
5      Add                        8%
6      And                        6%
7      Sub                        5%
8      Move register-register     4%
9      Call                       1%
10     Return                     1%
       Total                      96%

Control Flow Instructions

Jump for unconditional change in the control flow

Branch for conditional change in the control flow

Procedure calls and returns

Data is based on SPEC on Alpha

Destination Address Definition

Relative addressing with respect to the program counter proved to be the best choice for forward and backward branching or jumps (load-address independent)

To allow for dynamic loading of library routines, register indirect addressing allows addresses to be loaded in special registers (e.g. virtual functions in C++ and system calls in a case statement)

Data is based on SPEC on Alpha

Condition Evaluation

Compare-and-branch can be efficient if the majority of conditions are comparisons with zero

Remember to focus on the common case

Based on SPEC on MIPS

Frequency of Types of Comparison

Data is based on SPEC on Alpha

A different benchmark and machine set new design priorities

DSPs support a repeat instruction for for-loops (vectors) using registers

Supporting Procedures

Execution of a procedure follows these steps:
- Store parameters in a place accessible to the procedure
- Transfer control to the procedure
- Acquire the storage resources needed for the procedure
- Perform the desired task
- Store the result value in a place accessible to the calling program
- Return control to the point of origin

The hardware provides a program counter to trace instruction flow and manage transfer of control

Parameter Passing
- Registers can be used for passing a small number of parameters
- A stack is used to spill registers of the current context, to make room for the called procedure to run, and to allow large parameters to be passed
- Storage of machine state can be performed by caller or callee
- Handling of shared variables is important to ensure correct semantics and thus requires clear specifications in the library interface

Global variables stored in registers need careful handling

Type and Size of Operands

The type of an operand is designated by encoding it in the instruction's operation code

The type of an operand, e.g. single precision float, effectively gives its size

Common operand types include character, half word and word size integer, and single- and double-precision floating point

Characters are almost always in ASCII, integers are in 2's complement, and floating point is in IEEE 754

The 16-bit Unicode used in Java is gaining popularity due to its support for international character sets

For business applications, some architectures support a decimal format in binary coded decimal (BCD)

Depending on the size of the word, the complexity of handling different operand types differs

DSPs offer fixed point data types to support high precision floating point arithmetic and to allow sharing a single exponent for multiple numbers

For graphics applications, vertex and pixel operands are added features

Size of Operands

Double-word data type is used for double-precision floating point operations and address storage in machines with a 64-bit wide address bus

Words are used for integer operations and for 32-bit address bus machines

Because of the mix in SPEC, word and double-word data types dominate

Frequency of reference by size, based on SPEC2000 on Alpha

Instruction Representation

Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary, or base 2)

Numbers are stored in computers as a series of high and low electronic signals (binary numbers)

Binary digits are called bits and are considered the atom of computing

Each piece of an instruction is a number, and placing these numbers together forms the instruction

The assembler translates the symbolic assembly instructions into machine language instructions (machine code)

Example:

Assembly: add $t0, $s1, $s2

M/C language (decimal):  0       17      18      8       0       32
M/C language (binary):   000000  10001   10010   01000   00000   100000
                         6 bits  5 bits  5 bits  5 bits  5 bits  6 bits

Note: the MIPS compiler by default maps $s0,...,$s7 to registers 16-23 and $t0,...,$t7 to registers 8-15
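As a rough sketch of how the fields are placed together, the following packs the six register-format fields (6, 5, 5, 5, 5 and 6 bits) into one 32-bit word. The function name `encode_r` and the small `REG` table are illustrative assumptions; the register numbers follow the default mapping noted above.

```python
def encode_r(op, rs, rt, rd, shamt, funct):
    """Pack the six R-format fields into a 32-bit instruction word."""
    fields = [(op, 6), (rs, 5), (rt, 5), (rd, 5), (shamt, 5), (funct, 6)]
    word = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"field value {value} does not fit in {width} bits")
        word = (word << width) | value  # append the field on the right
    return word


# Only the three registers used by the example (default MIPS numbering).
REG = {"$t0": 8, "$s1": 17, "$s2": 18}

# add $t0, $s1, $s2  ->  op=0, rs=$s1, rt=$s2, rd=$t0, shamt=0, funct=32
word = encode_r(0, REG["$s1"], REG["$s2"], REG["$t0"], 0, 32)
bits = format(word, "032b")
```

Note that the destination register comes first in the assembly text but sits in the fourth field (rd) of the encoded word.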

Encoding an Instruction Set

Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation

The operation is typically specified in one field called the opcode

The addressing mode for the operand can be encoded with the operation, or specified through a separate identifier when a large number of modes is supported

The architecture must balance several competing factors:
- Desire to support as many registers and addressing modes as possible
- Effect of operand specification on the size of the instruction (program)
- Desire to simplify instruction fetching and decoding during execution

Fixed size instruction encoding simplifies the CPU design while limiting the addressing modes supported

An architect caring about code size can use variable size encoding

A hybrid approach is to allow variability by supporting multiple-sized instructions

Encoding Examples

MIPS Instruction format

Register-format instructions

op      rs      rt      rd      shamt   funct
6 bits  5 bits  5 bits  5 bits  5 bits  6 bits

op: Basic operation of the instruction, traditionally called the opcode
rs: The first register source operand
rt: The second register source operand
rd: The register destination operand; it gets the result of the operation
shamt: Shift amount
funct: This field selects the specific variant of the operation in the op field

Immediate-type instructions

op      rs      rt      address
6 bits  5 bits  5 bits  16 bits

Some instructions need longer fields than provided, for large value constants

The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register

Example: lw $t0, 32($s3) # Temporary register $t0 gets A[8]

Instruction  Format  op  rs   rt   rd   shamt  funct  address
add          R       0   reg  reg  reg  0      32     n/a
sub          R       0   reg  reg  reg  0      34     n/a
lw           I       35  reg  reg  n/a  n/a    n/a    address
sw           I       43  reg  reg  n/a  n/a    n/a    address
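The immediate format and its offset limit can be illustrated with a small sketch. It assumes the usual MIPS convention that the 16-bit field is a signed two's complement offset, and the helper name `encode_i` is hypothetical.

```python
def encode_i(op, rs, rt, imm):
    """Pack an I-format instruction; imm must fit in 16 signed bits."""
    if not -(1 << 15) <= imm < (1 << 15):
        raise ValueError("offset does not fit in a signed 16-bit field")
    # The immediate is stored in two's complement in the low 16 bits.
    return (op << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


# lw $t0, 32($s3): op=35, base register $s3=19, target register $t0=8
lw_word = encode_i(35, 19, 8, 32)

# A negative offset is stored in two's complement form.
negative_field = encode_i(35, 19, 8, -4) & 0xFFFF

# An offset outside the signed 16-bit range cannot be encoded.
try:
    encode_i(35, 19, 8, 1 << 15)
    out_of_range_rejected = False
except ValueError:
    out_of_range_rejected = True
```

Offsets beyond this range have to be built in a register first, which is one reason the base-plus-offset form is kept small.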

The Stored Program Concept

Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept

Today's computers are built on two key principles:
- Instructions are represented as numbers
- Programs can be stored in memory to be read or written just like numbers

The power of the concept: memory can contain
- the source code for an editor
- the compiled machine code for the editor
- the text that the compiled program is using
- the compiler that generated the code

[Figure: a processor connected to a memory that holds an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and the source code in C for the editor program]

Compiling if-then-else in MIPS

Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiled MIPS code for the following C if statement?

if (i == j) f = g + h; else f = g - h;

[Figure: flowchart testing i == j; when i = j the path computes f = g + h, when i ≠ j the path labelled Else: computes f = g - h, and both paths join at Exit:]

MIPS

      bne $s3, $s4, Else    # go to Else if i ≠ j
      add $s0, $s1, $s2     # f = g + h (skipped if i ≠ j)
      j Exit                # skip the else part
Else: sub $s0, $s1, $s2     # f = g - h (skipped if i = j)
Exit:

Typical Compilation

Major Types of Optimization

Optimization name: explanation (frequency; N.M. = not measured)

High-level: at or near source level, machine independent
- Procedure integration: replace procedure call by procedure body (N.M.)

Local: within straight-line code
- Common subexpression elimination: replace two instances of the same computation by a single copy (18%)
- Constant propagation: replace all instances of a variable that is assigned a constant with the constant (22%)
- Stack height reduction: rearrange expression tree to minimize resources needed for expression evaluation (N.M.)

Global: across a branch
- Global common subexpression elimination: same as local, but this version crosses branches (13%)
- Copy propagation: replace all instances of a variable A that has been assigned X (i.e. A = X) with X (11%)
- Code motion: remove code from a loop that computes the same value each iteration of the loop (16%)
- Induction variable elimination: simplify/eliminate array-addressing calculations within loops (2%)

Machine-dependent: depends on machine knowledge
- Strength reduction: many examples, such as replacing multiply by a constant with adds and shifts (N.M.)
- Pipeline scheduling: reorder instructions to improve pipeline performance (N.M.)
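Strength reduction of a multiply by a constant can be sketched as follows. This is a minimal illustration that assumes a non-negative constant and uses the plain binary decomposition; real compilers often pick cheaper sequences (for example using subtraction), and the function names are hypothetical.

```python
def shifts_for_constant(c):
    """Return shift amounts s such that x * c == sum of (x << s)."""
    if c < 0:
        raise ValueError("this sketch only handles non-negative constants")
    return [bit for bit in range(c.bit_length()) if (c >> bit) & 1]


def multiply_by_constant(x, c):
    """Compute x * c using only shifts and adds."""
    return sum(x << s for s in shifts_for_constant(c))


# 10 = 0b1010, so x * 10 becomes (x << 1) + (x << 3).
shifts = shifts_for_constant(10)
product = multiply_by_constant(7, 10)
```

The transformation pays off only when shifts and adds are cheaper than the multiplier on the target machine, which is why it is listed as machine-dependent.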

Effect of Compiler Optimization

[Figure: measurements by program and compiler optimization level]

Level 0: non-optimized code
Level 1: local optimization
Level 2: global optimization, s/w pipelining
Level 3: adds procedure integration

Compiler Support for Multimedia Instructions

Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics)

Intel added a new set of instructions called Streaming SIMD Extension

A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer

Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations
- Strided addressing allows memory access in increments larger than one
- Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data

Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup

Such limited support for vector processing makes the use of vectorizing compiler optimization unpopular and restricts its scope to hand coded routines

SIMD instructions on MMX and AltiVec tend to be solutions, not primitives
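The two addressing styles can be modelled in a few lines. This is only a sketch over a Python list standing in for memory; the names `strided_load`, `gather` and `scatter` are illustrative and do not correspond to any particular vector instruction set.

```python
def strided_load(memory, base, stride, count):
    """Load count elements starting at base, stepping by stride each time."""
    return [memory[base + i * stride] for i in range(count)]


def gather(memory, indices):
    """Load the elements whose addresses are listed in an index vector."""
    return [memory[i] for i in indices]


def scatter(memory, indices, values):
    """Store values to the addresses listed in an index vector."""
    for i, value in zip(indices, values):
        memory[i] = value


memory = list(range(100, 116))            # 16 words holding 100..115
column = strided_load(memory, 1, 4, 3)    # every 4th word, starting at word 1
picked = gather(memory, [0, 7, 2])        # arbitrary, non-regular addresses
scatter(memory, [0, 7, 2], [-1, -2, -3])  # write back to the same addresses
```

A strided load suits column access in a row-major matrix, while gather/scatter suits sparse data where the addresses follow no fixed step.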

Starting a Program

[Figure: C program -> Compiler -> Assembly language program -> Assembler -> Object: machine language module, together with Object: library routine (machine language) -> Linker -> Executable: machine language program -> Loader -> Memory]

Linker
- Place code & data modules symbolically in memory
- Determine the address of data & instruction labels
- Patch both internal & external references

Object files for Unix typically contain:
- Header: size & position of components
- Text segment: machine code
- Data segment: static and dynamic variables
- Relocation info: identifies absolute memory references
- Symbol table: name & location of labels, procedures and variables
- Debugging info: mapping of source to object code, break points, etc.

Loading Executable Program

[Figure: MIPS memory layout from high to low addresses: stack ($sp at 7fff ffff hex), dynamic data, static data starting at 1000 0000 hex ($gp at 1000 8000 hex), text starting at 0040 0000 hex (pc), and a reserved region down to address 0]

To load an executable, the operating system follows these steps:
- Reads the executable file header to determine the size of the text and data segments
- Creates an address space large enough for the text and data
- Copies the instructions and data from the executable file into memory
- Copies the parameters (if any) to the main program onto the stack
- Initializes the machine registers and sets the stack pointer to the first free location
- Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program

Instruction Set Design Issues

Instruction set design issues
- Number of Addresses
- Flow of Control
- Operand Types
- Addressing Modes
- Instruction Types
- Instruction Formats

Number of Addresses

Four categories

3-address machines
- 2 for the source operands and one for the result

2-address machines
- One address doubles as source and result

1-address machines
- Accumulator machines
- Accumulator is used for one source and the result

0-address machines
- Stack machines
- Operands are taken from the stack
- Result goes onto the stack

Number of Addresses (cont.)

Three-address machines
- Two for the source operands, one for the result
- RISC processors use three addresses

Sample instructions
add  dest,src1,src2    ; M(dest)=[src1]+[src2]
sub  dest,src1,src2    ; M(dest)=[src1]-[src2]
mult dest,src1,src2    ; M(dest)=[src1]*[src2]

Three addresses: Operand 1, Operand 2, Result
Example: a = b + c

Three-address instruction formats are not common because they require a relatively long instruction format to hold the three address references

Number of Addresses (cont.)

Example

C statement
A = B + C * D - E + F + A

Equivalent code:
mult T,C,D    ; T = C*D
add  T,T,B    ; T = B+C*D
sub  T,T,E    ; T = B+C*D-E
add  T,T,F    ; T = B+C*D-E+F
add  A,T,A    ; A = B+C*D-E+F+A

Number of Addresses (cont.)

Two-address machines
- One address doubles (for source operand & result)
- The last example makes a case for it
  - Address T is used twice

Sample instructions
load dest,src    ; M(dest)=[src]
add  dest,src    ; M(dest)=[dest]+[src]
sub  dest,src    ; M(dest)=[dest]-[src]
mult dest,src    ; M(dest)=[dest]*[src]

Two addresses: one address doubles as operand and result
Example: a = a + b

The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation

Number of Addresses (cont.)

Example

C statement
A = B + C * D - E + F + A

Equivalent code:
load T,C    ; T = C
mult T,D    ; T = C*D
add  T,B    ; T = B+C*D
sub  T,E    ; T = B+C*D-E
add  T,F    ; T = B+C*D-E+F
add  A,T    ; A = B+C*D-E+F+A

Number of Addresses (cont.)

One-address machines
- Use a special set of registers called accumulators
  - Specify one source operand & receive the result
- Called accumulator machines

Sample instructions
load  addr    ; accum = [addr]
store addr    ; M[addr] = accum
add   addr    ; accum = accum + [addr]
sub   addr    ; accum = accum - [addr]
mult  addr    ; accum = accum * [addr]

Number of Addresses (cont.)

Example

C statement
A = B + C * D - E + F + A

Equivalent code:
load  C    ; load C into accum
mult  D    ; accum = C*D
add   B    ; accum = C*D+B
sub   E    ; accum = B+C*D-E
add   F    ; accum = B+C*D-E+F
add   A    ; accum = B+C*D-E+F+A
store A    ; store accum contents in A
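The one-address sequence can be checked with a tiny interpreter. This is a sketch only: memory is a Python dictionary keyed by variable name, the sample values are arbitrary, and `run_accumulator` is a hypothetical helper rather than a model of any real machine.

```python
def run_accumulator(program, memory):
    """Execute (opcode, address) pairs on a single-accumulator machine."""
    accum = 0
    for op, addr in program:
        if op == "load":
            accum = memory[addr]
        elif op == "store":
            memory[addr] = accum
        elif op == "add":
            accum += memory[addr]
        elif op == "sub":
            accum -= memory[addr]
        elif op == "mult":
            accum *= memory[addr]
        else:
            raise ValueError(f"unknown opcode {op!r}")
    return memory


program = [
    ("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
    ("add", "F"), ("add", "A"), ("store", "A"),
]
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
expected = memory["B"] + memory["C"] * memory["D"] - memory["E"] + memory["F"] + memory["A"]
run_accumulator(program, memory)
```

Every instruction except the first touches memory for one operand, which is the high memory traffic listed among the accumulator drawbacks.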

Number of Addresses (cont.)

Zero-address machines
- Stack supplies operands and receives the result
  - Special instructions to load and store use an address
- Called stack machines (Ex: HP3000, Burroughs B5500)

Sample instructions
push addr    ; push([addr])
pop  addr    ; pop([addr])
add          ; push(pop + pop)
sub          ; push(pop - pop)
mult         ; push(pop * pop)

Number of Addresses (cont.)

Example

C statement
A = B + C * D - E + F + A

Equivalent code:
push E
push C
push D
mult
push B
add
sub
push F
add
push A
add
pop  A
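The zero-address sequence can be traced the same way. The sketch below assumes that each arithmetic instruction computes "first value popped, operator, second value popped", matching the push(pop - pop) notation in the sample instructions; `run_stack` and the sample values are illustrative.

```python
def run_stack(program, memory):
    """Execute a zero-address program; only push and pop name an address."""
    stack = []
    for instr in program:
        op = instr[0]
        if op == "push":
            stack.append(memory[instr[1]])
        elif op == "pop":
            memory[instr[1]] = stack.pop()
        else:
            first = stack.pop()   # value on top of the stack
            second = stack.pop()  # value just below it
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)
            elif op == "mult":
                stack.append(first * second)
            else:
                raise ValueError(f"unknown opcode {op!r}")
    return memory


program = [
    ("push", "E"), ("push", "C"), ("push", "D"), ("mult",), ("push", "B"),
    ("add",), ("sub",), ("push", "F"), ("add",), ("push", "A"), ("add",),
    ("pop", "A"),
]
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
run_stack(program, memory)
```

E is pushed first so that it sits below B+C*D when sub executes; this kind of operand ordering is what makes the stack a bottleneck for compilers.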

Load/Store Architecture

Instructions expect operands in internal processor registers
- Special LOAD and STORE instructions move data between registers and memory
- RISC uses this architecture
- Reduces instruction length

Load/Store Architecture (cont.)

Sample instructions
load  Rd,addr       ; Rd = [addr]
store addr,Rs       ; (addr) = Rs
add   Rd,Rs1,Rs2    ; Rd = Rs1 + Rs2
sub   Rd,Rs1,Rs2    ; Rd = Rs1 - Rs2
mult  Rd,Rs1,Rs2    ; Rd = Rs1 * Rs2

Number of Addresses (cont.)

Example

C statement
A = B + C * D - E + F + A

Equivalent code:
load  R1,B
load  R2,C
load  R3,D
load  R4,E
load  R5,F
load  R6,A
mult  R2,R2,R3
add   R2,R2,R1
sub   R2,R2,R4
add   R2,R2,R5
add   R2,R2,R6
store A,R2

Flow of Control

Default is sequential flow

Several instructions alter this default execution
- Branches
  - Unconditional
  - Conditional
  - Delayed branches
- Procedure calls
  - Delayed procedure calls

Flow of Control (cont.)

Branches
- Unconditional
  - Absolute address
  - PC-relative
    - Target address is specified relative to PC contents
    - Relocatable code
- Example: MIPS
  - Absolute address
    j target
  - PC-relative
    b target
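Why PC-relative branches give relocatable code can be shown with a short sketch. It assumes the MIPS convention that the 16-bit branch field counts words relative to the instruction after the branch (PC + 4); the function names are hypothetical.

```python
def branch_offset(pc, target):
    """Word offset stored in a MIPS-style PC-relative branch at address pc."""
    delta = target - (pc + 4)          # relative to the following instruction
    if delta % 4 != 0:
        raise ValueError("target is not word aligned")
    offset = delta // 4
    if not -(1 << 15) <= offset < (1 << 15):
        raise ValueError("target is out of range for a 16-bit offset")
    return offset


def branch_target(pc, offset):
    """Recover the target address from the branch address and its offset."""
    return pc + 4 + (offset << 2)


# The same code loaded at two different base addresses has the same offset.
offset_a = branch_offset(0x00400000, 0x00400010)
offset_b = branch_offset(0x10000000, 0x10000010)

# A backward branch, as at the bottom of a loop, has a negative offset.
back = branch_offset(0x00400010, 0x00400000)
```

Because only the distance is encoded, the loader does not have to patch such branches when the code is placed at a different address, whereas an absolute jump target must be relocated.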

Flow of Control (cont.)

Branches
- Conditional
  - Jump is taken only if the condition is met
- Two types
  - Set-Then-Jump
    - Condition testing is separated from branching
    - Condition code registers are used to convey the condition test result
    - Condition code registers keep a record of the status of the last ALU operation, such as an overflow condition
  - Example: Pentium code
    cmp AX,BX    ; compare AX and BX
    je  target   ; jump if equal

Flow of Control (cont.)

  - Test-and-Jump
    - Single instruction performs condition testing and branching
  - Example: MIPS instruction
    beq Rsrc1,Rsrc2,target
    Jumps to target if Rsrc1 = Rsrc2

Delayed branching
- Control is transferred after executing the instruction that follows the branch instruction
  - This instruction slot is called the delay slot
- Improves efficiency
- Highly pipelined RISC processors support it

Flow of Control (cont.)

Procedure calls
- Facilitate modular programming
- Require two pieces of information to return
  - End of procedure
    - Pentium uses the ret instruction
    - MIPS uses the jr instruction
  - Return address
    - In a (special) register: MIPS allows any general-purpose register
    - On the stack: Pentium

Flow of Control (cont.)

[Figure: delay slot]

Parameter Passing

Two basic techniques
- Register-based (e.g. PowerPC, MIPS)
  - Internal registers are used
    - Faster
    - Limit the number of parameters
    - Recursive procedure
- Stack-based (e.g. Pentium)
  - Stack is used
    - More general

2 Operand Types

Instructions support basic data types
- Characters
- Integers
- Floating-point

Instruction overload
- Same instruction for different data types
- Example: Pentium
  mov AL,address     ; loads an 8-bit value
  mov AX,address     ; loads a 16-bit value
  mov EAX,address    ; loads a 32-bit value

Operand Types (cont.)

Separate instructions
- Instructions specify the operand size
- Example: MIPS
  lb Rdest,address    ; loads a byte
  lh Rdest,address    ; loads a halfword (16 bits)
  lw Rdest,address    ; loads a word (32 bits)
  ld Rdest,address    ; loads a doubleword (64 bits)
  Similar instructions for store

3 Addressing Modes

How the operands are specified
- Operands can be in three places
  - Registers
    - Register addressing mode
  - Part of instruction
    - Constant
    - Immediate addressing mode
    - All processors support these two addressing modes
  - Memory
    - Difference between RISC and CISC
    - CISC supports a large variety of addressing modes
    - RISC follows load/store architecture

4 Instruction Types

Several types of instructions
- Data movement
  - Pentium: mov dest,src
  - Some do not provide direct data movement instructions
  - Indirect data movement
    add Rdest,Rsrc,0    ; Rdest = Rsrc+0
- Arithmetic and Logical
  - Arithmetic
    - Integer and floating-point, signed and unsigned
    - add, subtract, multiply, divide
  - Logical
    - and, or, not, xor

Instruction Types (cont.)

Condition code bits
- S: Sign bit (0 = +, 1 = -)
- Z: Zero bit (0 = nonzero, 1 = zero)
- O: Overflow bit (0 = no overflow, 1 = overflow)
- C: Carry bit (0 = no carry, 1 = carry)

Example: Pentium
cmp count,25    ; compare count to 25 (subtract 25 from count)
je  target      ; jump if equal
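How a compare sets the condition code bits can be sketched by modelling the subtraction it performs. The sketch assumes an 8-bit operand width for readability and the x86 convention that the carry bit records a borrow on subtraction; `flags_after_sub` is a hypothetical helper.

```python
def flags_after_sub(a, b, bits=8):
    """Condition code bits after computing a - b, as a compare would."""
    mask = (1 << bits) - 1
    a &= mask
    b &= mask
    result = (a - b) & mask
    sign_bit = bits - 1
    return {
        "S": (result >> sign_bit) & 1,   # sign of the truncated result
        "Z": int(result == 0),           # set when the operands are equal
        # Signed overflow: operands differ in sign and the result's sign
        # differs from the sign of a.
        "O": (((a ^ b) & (a ^ result)) >> sign_bit) & 1,
        "C": int(a < b),                 # borrow out of an unsigned subtract
    }


equal = flags_after_sub(25, 25)      # je would be taken: Z is set
smaller = flags_after_sub(5, 10)     # negative result, with a borrow
overflow = flags_after_sub(0x80, 1)  # -128 - 1 does not fit in 8 signed bits
```

A following je only has to look at the Z bit, which is why the compare and the jump can be separate instructions in the Set-Then-Jump style.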

Instruction Types (cont.)

Flow control and I/O instructions
- Branch
- Procedure call
- Interrupts

I/O instructions
- Memory-mapped I/O
  - Most processors support memory-mapped I/O
  - No separate instructions for I/O
- Isolated I/O
  - Pentium supports isolated I/O
  - Separate I/O instructions
    in  AX,io_port    ; read from an I/O port
    out io_port,AX    ; write to an I/O port

5 Instruction Formats

Two types
- Fixed-length
  - Used by RISC processors
  - 32-bit RISC processors use 32-bit wide instructions
    - Examples: SPARC, MIPS, PowerPC
- Variable-length
  - Used by CISC processors
  - Memory operands need more bits to specify

Opcode
- Major and exact operation

Examples of Instruction Formats

RISC (Reduced Instruction Set Computer) versus CISC (Complex Instruction Set Computer)

RISC vs CISC

The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute

RISC systems access memory only with explicit load and store instructions

In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable

RISC vs CISC

The difference between CISC and RISC becomes evident through the basic computer performance equation:

time/program = (time/cycle) x (cycles/instruction) x (instructions/program)

RISC systems shorten execution time by reducing the clock cycles per instruction

CISC systems improve performance by reducing the number of instructions per program

RISC vs CISC

The simple instruction set of RISC machines enables control units to be hardwired for maximum speed

The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time

With fixed-length instructions, RISC lends itself to pipelining and speculative execution

RISC vs CISC

Consider the two program fragments:

CISC
       mov ax, 10
       mov bx, 5
       mul bx, ax

RISC
       mov ax, 0
       mov bx, 10
       mov cx, 5
Begin: add ax, bx
       loop Begin

The total clock cycles for the CISC version might be:
(2 movs x 1 cycle) + (1 mul x 30 cycles) = 32 cycles

While the clock cycles for the RISC version are:
(3 movs x 1 cycle) + (5 adds x 1 cycle) + (5 loops x 1 cycle) = 13 cycles

With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds
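The cycle comparison can be reproduced with a few lines of arithmetic. The per-instruction costs below (1 cycle for mov, add and loop, 30 cycles for mul) and the two cycle times are assumptions chosen for illustration, not measurements of any real processor.

```python
# Assumed cost of each instruction kind, in clock cycles.
CYCLES = {"mov": 1, "add": 1, "loop": 1, "mul": 30}


def total_cycles(executed):
    """Sum cycles over a mapping of instruction kind -> times executed."""
    return sum(CYCLES[op] * count for op, count in executed.items())


def execution_time_ns(cycles, cycle_time_ns):
    """Execution time = cycles executed x time per cycle."""
    return cycles * cycle_time_ns


# CISC fragment: two movs and one multi-cycle multiply.
cisc_cycles = total_cycles({"mov": 2, "mul": 1})

# RISC fragment: three movs, then the add/loop pair runs five times.
risc_cycles = total_cycles({"mov": 3, "add": 5, "loop": 5})

# Hypothetical cycle times: the simpler RISC datapath is given a shorter clock.
cisc_time = execution_time_ns(cisc_cycles, cycle_time_ns=2.0)
risc_time = execution_time_ns(risc_cycles, cycle_time_ns=1.0)
```

The RISC version executes more instructions (13 against 3) yet needs fewer cycles, which is the trade-off between cycles per instruction and instructions per program.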

RISC vs CISC

Because of their load-store ISAs, RISC architectures require a large number of CPU registers

These registers provide fast access to data during sequential program execution

They can also be employed to reduce the overhead typically caused by passing parameters to subprograms

Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers

RISC vs. CISC

This is how registers can be overlapped in a RISC system.

The current window pointer (CWP) points to the active register window.
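To make the overlap concrete, here is a toy model of overlapping register windows. It is only a sketch: the class name, the window count, and the sizes (8 "in", 8 "local", 8 "out" registers) are assumptions for illustration, and window overflow or underflow handling is left out.

```python
class RegisterWindows:
    """Toy overlapping register windows (all sizes are illustrative assumptions)."""

    def __init__(self, n_windows=4, n_in=8, n_local=8):
        self.n_windows = n_windows
        self.n_in = n_in                  # also the number of "out" registers
        self.n_local = n_local
        self.stride = n_in + n_local      # physical registers added per window
        self.phys = [0] * (self.stride * n_windows)
        self.cwp = 0                      # current window pointer

    def _index(self, kind, i):
        # The "out" registers of window k are the "in" registers of window k + 1.
        base = self.cwp * self.stride
        offset = {"in": 0, "local": self.n_in, "out": self.stride}[kind]
        return (base + offset + i) % len(self.phys)

    def read(self, kind, i):
        return self.phys[self._index(kind, i)]

    def write(self, kind, i, value):
        self.phys[self._index(kind, i)] = value

    def call(self):
        self.cwp = (self.cwp + 1) % self.n_windows

    def ret(self):
        self.cwp = (self.cwp - 1) % self.n_windows


rw = RegisterWindows()
rw.write("out", 0, 42)      # caller places an argument in its "out" register
rw.call()                   # advance the CWP: the callee's window is now active
arg = rw.read("in", 0)      # callee sees the same physical register as "in"
rw.write("in", 0, arg + 1)  # callee hands a result back through the overlap
rw.ret()
result = rw.read("out", 0)
print(arg, result)
```

No data is copied on call or return; only the window pointer moves, which is the source of the savings in parameter passing.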

RISC vs. CISC

It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.

Some RISC systems provide more extravagant instruction sets than some CISC systems.

Some systems combine both approaches.

The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.

RISC vs. CISC

RISC
- Multiple register sets.
- Three operands per instruction.
- Parameter passing through register windows.
- Single-cycle instructions.
- Hardwired control.
- Highly pipelined.

CISC
- Single register set.
- One or two register operands per instruction.
- Parameter passing through memory.
- Multiple-cycle instructions.
- Microprogrammed control.
- Less pipelined.

RISC vs. CISC

RISC
- Simple instructions, few in number.
- Fixed-length instructions.
- Complexity in compiler.
- Only LOAD/STORE instructions access memory.
- Few addressing modes.

CISC
- Many complex instructions.
- Variable-length instructions.
- Complexity in microcode.
- Many instructions can access memory.
- Many addressing modes.

Summary

Instruction Set Design Issues

Instruction set design issues include:

Where are operands stored?
- registers, memory, stack, accumulator

How many explicit operands are there?
- 0, 1, 2, or 3

How is the operand location specified?
- register, immediate, indirect, ...

What type and size of operands are supported?
- byte, int, float, double, string, vector, ...

What operations are supported?
- add, sub, mul, move, compare, ...

More About General Purpose Registers

Why do almost all new architectures use GPRs?

Registers are much faster than memory (even cache)
- Register values are available immediately
- When memory isn't ready, the processor must wait ("stall")

Registers are convenient for variable storage
- Compiler assigns some variables just to registers
- More compact code, since small fields specify registers (compared to memory addresses)

[Figure: storage hierarchy: Processor (Registers), Cache, Memory, Disk]

What Operations are Needed

Arithmetic and Logical
- Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
- Logical operation: AND, OR, XOR, NOT

Data Transfer - copy, load, store

Control - branch, jump, call, return

Floating Point - ADDF, MULF, DIVF (same as arithmetic but usually take bigger operands)

Decimal - ADDD, CONVERT

String - move, compare, search

Graphics - pixel and vertex, compression/decompression operations

Stack Architecture: Pros and Cons

Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures

Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures

Accumulator Architecture: Pros and Cons

Pros
- Very low hardware requirements
- Easy to design and understand

Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic

Memory-Memory Architecture: Pros and Cons

Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)

Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required

Memory-Register Architecture: Pros and Cons

Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density

Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers

Page 32: Chapter 2: Advanced computer Architecture

Frequency of Operations Usage

The most widely executed instructions are the simple operations of an instruction set.

The following is the average usage in SPECint92 on Intel 80x86:

Rank  80x86 instruction        Integer average (% total executed)
1     Load                     22%
2     Conditional branch       20%
3     Compare                  16%
4     Store                    12%
5     Add                      8%
6     And                      6%
7     Sub                      5%
8     Move register-register   4%
9     Call                     1%
10    Return                   1%
      Total                    96%

Make the common case fast by focusing on these operations.

Control Flow Instructions

- Jump: for unconditional change in the control flow
- Branch: for conditional change in the control flow
- Procedure calls and returns

Data is based on SPEC on Alpha.

Destination Address Definition

Relative addressing with respect to the program counter proved to be the best choice for forward and backward branching or jumps (load-address independent).

To allow for dynamic loading of library routines, register indirect addressing allows addresses to be loaded in special registers (e.g. virtual functions in C++ and system calls in a case statement).

Data is based on SPEC on Alpha.
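As a sketch of why PC-relative targets are independent of the load address, the snippet below computes a branch target in the MIPS style: the address of the following instruction plus a sign-extended 16-bit word offset. The helper names and the example addresses are hypothetical.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    sign_bit = 1 << (bits - 1)
    return (value & (sign_bit - 1)) - (value & sign_bit)


def branch_target(pc, offset_field, bits=16):
    """MIPS-style target: next instruction address plus word offset * 4."""
    return pc + 4 + (sign_extend(offset_field, bits) << 2)


forward = branch_target(0x00400000, 0x0003)    # 3 words past the next instruction
backward = branch_target(0x00400010, 0xFFFE)   # offset field encodes -2

# Loading the same code 0x1000 bytes higher shifts the target by the same amount,
# so the encoded offset does not need to change.
relocated = branch_target(0x00401000, 0x0003)
print(hex(forward), hex(backward), hex(relocated))
```

The encoded offset stays the same wherever the code is loaded, which is the property the slide calls load-address independence.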

Condition Evaluation

Compare-and-branch can be efficient if the majority of conditions are comparisons with zero.

Remember to focus on the common case.

Based on SPEC on MIPS.

Frequency of Types of Comparison

Data is based on SPEC on Alpha.

Different benchmarks and machines set new design priorities.

DSPs support a repeat instruction for "for" loops (vectors) using registers.

Supporting Procedures

Execution of a procedure follows these steps:
- Store parameters in a place accessible to the procedure
- Transfer control to the procedure
- Acquire the storage resources needed for the procedure
- Perform the desired task
- Store the result value in a place accessible to the calling program
- Return control to the point of origin

The hardware provides a program counter to trace instruction flow and manage transfer of control.

Parameter Passing
- Registers can be used for passing a small number of parameters
- A stack is used to spill registers of the current context and make room for the called procedure to run, and to allow for large parameters to be passed
- Storage of machine state can be performed by caller or callee
- Handling of shared variables is important to ensure correct semantics and thus requires clear specifications in the library interface

Global variables stored in registers need careful handling.

Type and Size of Operands

The type of an operand is designated by encoding it in the instruction's operation code.

The type of an operand, e.g. single precision float, effectively gives its size.

Common operand types include character, half word and word size integer, and single- and double-precision floating point.

Characters are almost always in ASCII, integers are in 2's complement, and floating point is in IEEE 754.

The 16-bit Unicode used in Java is gaining popularity due to its support for the international character sets.

For business applications, some architectures support a decimal format in binary coded decimal (BCD).

Depending on the size of the word, the complexity of handling different operand types differs.

DSP offers fixed point data types to support high precision floating point arithmetic and to allow sharing a single exponent for multiple numbers.

For graphics applications, vertex and pixel operands are added features.

Size of Operands

The double-word data type is used for double-precision floating point operations and address storage in machines with a 64-bit wide address bus.

Words are used for integer operations and for 32-bit address bus machines.

Because of the mix in SPEC, word and double-word data types dominate.

Frequency of reference by size based on SPEC2000 on Alpha.

Instruction Representation

Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary, or base 2).

Numbers are stored in computers as a series of high and low electronic signals (binary numbers).

Binary digits are called bits and are considered the atom of computing.

Each piece of an instruction is a number, and placing these numbers together forms the instruction.

The assembler translates the symbolic assembly instructions into machine language instructions (machine code).

Example:

Assembly: add $t0, $s1, $s2

Machine language (decimal):
0        17       18       8        0        32

Machine language (binary):
000000   10001    10010    01000    00000    100000
6 bits   5 bits   5 bits   5 bits   5 bits   6 bits

Note: the MIPS compiler by default maps $s0...$s7 to registers 16-23 and $t0...$t7 to registers 8-15.
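The field packing for a MIPS register-format instruction can be checked with a few shifts. This is a minimal sketch for add $t0, $s1, $s2, using the standard field widths (6, 5, 5, 5, 5, 6 bits) and the usual register numbers ($t0 = 8, $s1 = 17, $s2 = 18); the function name is hypothetical.

```python
def encode_r_type(op, rs, rt, rd, shamt, funct):
    """Pack the six R-format fields into one 32-bit instruction word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


# add $t0, $s1, $s2: op = 0, rs = $s1 (17), rt = $s2 (18), rd = $t0 (8), funct = 32
word = encode_r_type(op=0, rs=17, rt=18, rd=8, shamt=0, funct=32)
bits = format(word, "032b")

# Split the bit string back into its fields for display.
widths = [6, 5, 5, 5, 5, 6]
fields, pos = [], 0
for w in widths:
    fields.append(bits[pos:pos + w])
    pos += w

print(hex(word))
print(" ".join(fields))
```

Note that the destination register appears first in the assembly text but fourth in the encoded word.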

Encoding an Instruction Set

Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation.

The operation is typically specified in one field called the opcode.

The addressing mode for the operand can be encoded with the operation or specified through a separate identifier in case of a large number of supported modes.

The architecture must balance between several competing factors:
- Desire to support as many registers and addressing modes as possible
- Effect of operand specification on the size of the instruction (program)
- Desire to simplify instruction fetching and decoding during execution

Fixed size instruction encoding simplifies the CPU design while limiting the addressing modes supported.

An architect caring about the code size can use variable size encoding.

A hybrid approach is to allow variability by supporting multiple-sized instructions.

Encoding Examples

MIPS Instruction format

Register-format instructions:

op       rs       rt       rd       shamt    funct
6 bits   5 bits   5 bits   5 bits   5 bits   6 bits

- op: Basic operation of the instruction, traditionally called the opcode
- rs: The first register source operand
- rt: The second register source operand
- rd: The register destination operand; it gets the result of the operation
- shamt: Shift amount
- funct: This field selects the specific variant of the operation of the op field

Immediate-type instructions:

op       rs       rt       address
6 bits   5 bits   5 bits   16 bits

Some instructions need longer fields than provided for large value constants.

The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register.

Example: lw $t0, 32($s3)    # Temporary register $t0 gets A[8]

Instruction  Format  op   rs   rt   rd   shamt  funct  address
add          R       0    reg  reg  reg  0      32     N/A
sub          R       0    reg  reg  reg  0      34     N/A
lw           I       35   reg  reg  N/A  N/A    N/A    address
sw           I       43   reg  reg  N/A  N/A    N/A    address

The Stored Program Concept

Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept.

Today's computers are built on two key principles:
- Instructions are represented as numbers
- Programs can be stored in memory to be read or written just like numbers

The power of the concept: memory can contain
- the source code for an editor
- the compiled machine code for the editor
- the text that the compiled program is using
- the compiler that generated the code

[Figure: a processor connected to a memory holding an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and source code in C for the editor program]

Compiling if-then-else in MIPS

Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiler's code for the following C if statement?

if (i == j) f = g + h; else f = g - h;

[Figure: flowchart testing i == j; the i = j path computes f = g + h, the i ≠ j path (Else:) computes f = g - h, and both paths join at Exit:]

MIPS:

      bne $s3, $s4, Else    # go to Else if i ≠ j
      add $s0, $s1, $s2     # f = g + h (skipped if i ≠ j)
      j   Exit
Else: sub $s0, $s1, $s2     # f = g - h (skipped if i = j)
Exit:

Typical Compilation

Major Types of Optimization

Each entry gives the optimization name, an explanation, and its frequency (N.M. = not measured).

High-level (at or near source level; machine independent)
- Procedure integration: Replace procedure call by procedure body. Frequency: N.M.

Local (within straight-line code)
- Common subexpression elimination: Replace two instances of the same computation by a single copy. Frequency: 18%
- Constant propagation: Replace all instances of a variable that is assigned a constant with the constant. Frequency: 22%
- Stack height reduction: Rearrange expression tree to minimize resources needed for expression evaluation. Frequency: N.M.

Global (across a branch)
- Global common subexpression elimination: Same as local, but this version crosses branches. Frequency: 13%
- Copy propagation: Replace all instances of a variable A that has been assigned X (i.e. A = X) with X. Frequency: 11%
- Code motion: Remove code from a loop that computes the same value each iteration of the loop. Frequency: 16%
- Induction variable elimination: Simplify/eliminate array addressing calculations within loops. Frequency: 2%

Machine-dependent (depends on machine knowledge)
- Strength reduction: Many examples, such as replacing multiply by a constant with adds and shifts. Frequency: N.M.
- Pipeline scheduling: Reorder instructions to improve pipeline performance. Frequency: N.M.

Effect of Compiler Optimization

Measurements taken on S

[Figure: chart by program and compiler optimization level]

- Level 0: non-optimized code
- Level 1: local optimization
- Level 2: global optimization, s/w pipelining
- Level 3: adds procedure integration

Compiler Support for Multimedia Instr.

Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics).

Intel added a new set of instructions called Streaming SIMD Extension.

A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer.

Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations.
- Strided addressing allows memory access in increments larger than one
- Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data

Supporting vector operation without strided addressing, such as Intel's MMX, limits the potential speedup.

Such limited support for vector processing makes the use of vectorizing compiler optimization unpopular and restricts its scope to hand coded routines.

SIMD instructions on MMX and AltiVec tend to be solutions, not primitives.

Starting a Program

[Figure: C program, Compiler, Assembly language program, Assembler, Object: machine language module (together with Object: library routine (machine language)), Linker, Executable: machine language program, Loader, Memory]

Linker
- Place code and data modules symbolically in memory
- Determine the address of data and instruction labels
- Patch both internal and external references

Object files for Unix typically contain:
- Header: size and position of components
- Text segment: machine code
- Data segment: static and dynamic variables
- Relocation info: identify absolute memory references
- Symbol table: name and location of labels, procedures and variables
- Debugging info: mapping source to object code, break points, etc.

Loading Executable Program

[Figure: memory layout from address 0 upward: Reserved; Text (pc at 0040 0000 hex); Static data (from 1000 0000 hex, $gp at 1000 8000 hex); Dynamic data; Stack ($sp at 7fff ffff hex)]

To load an executable, the operating system follows these steps:
- Reads the executable file header to determine the size of the text and data segments
- Creates an address space large enough for the text and data
- Copies the instructions and data from the executable file into memory
- Copies the parameters (if any) to the main program onto the stack
- Initializes the machine registers and sets the stack pointer to the first free location
- Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program

Instruction Set Design Issues

Instruction set design issues:
- Number of Addresses
- Flow of Control
- Operand Types
- Addressing Modes
- Instruction Types
- Instruction Formats

Number of Addresses

Four categories:
- 3-address machines
  - 2 for the source operands and one for the result
- 2-address machines
  - One address doubles as source and result
- 1-address machines
  - Accumulator machines
  - Accumulator is used for one source and result
- 0-address machines
  - Stack machines
  - Operands are taken from the stack
  - Result goes onto the stack

Number of Addresses (cont.)

Three-address machines
- Two for the source operands, one for the result
- RISC processors use three addresses

Sample instructions:
    add  dest,src1,src2    ; M(dest) = [src1] + [src2]
    sub  dest,src1,src2    ; M(dest) = [src1] - [src2]
    mult dest,src1,src2    ; M(dest) = [src1] * [src2]

Three addresses: Operand 1, Operand 2, Result
Example: a = b + c

Three-address instruction formats are not common because they require a relatively long instruction format to hold the three address references.

Number of Addresses (cont.)

Example

C statement:
    A = B + C * D - E + F + A

Equivalent code:
    mult T,C,D    ; T = C*D
    add  T,T,B    ; T = B + C*D
    sub  T,T,E    ; T = B + C*D - E
    add  T,T,F    ; T = B + C*D - E + F
    add  A,T,A    ; A = B + C*D - E + F + A

Number of Addresses (cont.)

Two-address machines
- One address doubles (for source operand and result)
- Last example makes a case for it
  - Address T is used twice

Sample instructions:
    load dest,src    ; M(dest) = [src]
    add  dest,src    ; M(dest) = [dest] + [src]
    sub  dest,src    ; M(dest) = [dest] - [src]
    mult dest,src    ; M(dest) = [dest] * [src]

Two addresses: one address doubles as operand and result
Example: a = a + b

The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation.

Number of Addresses (cont.)

Example

C statement:
    A = B + C * D - E + F + A

Equivalent code:
    load T,C    ; T = C
    mult T,D    ; T = C*D
    add  T,B    ; T = B + C*D
    sub  T,E    ; T = B + C*D - E
    add  T,F    ; T = B + C*D - E + F
    add  A,T    ; A = B + C*D - E + F + A

Number of Addresses (cont.)

One-address machines
- Use a special set of registers called accumulators
  - Specify one source operand and receive the result
- Called accumulator machines

Sample instructions:
    load  addr    ; accum = [addr]
    store addr    ; M[addr] = accum
    add   addr    ; accum = accum + [addr]
    sub   addr    ; accum = accum - [addr]
    mult  addr    ; accum = accum * [addr]

Number of Addresses (cont.)

Example

C statement:
    A = B + C * D - E + F + A

Equivalent code:
    load  C    ; load C into accum
    mult  D    ; accum = C*D
    add   B    ; accum = C*D + B
    sub   E    ; accum = B + C*D - E
    add   F    ; accum = B + C*D - E + F
    add   A    ; accum = B + C*D - E + F + A
    store A    ; store accum contents in A
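To check that a one-address sequence like this computes the intended expression, a tiny accumulator-machine interpreter is enough. This is a sketch only: the interpreter, the dictionary used as memory, and the sample values for A through F are all assumptions made for illustration.

```python
def run_accumulator(program, memory):
    """Execute (opcode, address) pairs on a machine with one accumulator."""
    acc = 0
    for op, addr in program:
        if op == "load":
            acc = memory[addr]
        elif op == "store":
            memory[addr] = acc
        elif op == "add":
            acc += memory[addr]
        elif op == "sub":
            acc -= memory[addr]
        elif op == "mult":
            acc *= memory[addr]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory


# Hypothetical initial values for the variables.
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
expected = memory["B"] + memory["C"] * memory["D"] - memory["E"] + memory["F"] + memory["A"]

program = [
    ("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
    ("add", "F"), ("add", "A"), ("store", "A"),
]
run_accumulator(program, memory)
print(memory["A"], expected)
```

Every instruction except the first touches memory for its operand, which illustrates the high memory traffic that accumulator architectures are known for.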

Number of Addresses (cont.)

Zero-address machines
- Stack supplies operands and receives the result
  - Special instructions to load and store use an address
- Called stack machines (Ex: HP3000, Burroughs B5500)

Sample instructions:
    push addr    ; push([addr])
    pop  addr    ; pop([addr])
    add          ; push(pop + pop)
    sub          ; push(pop - pop)
    mult         ; push(pop * pop)

Number of Addresses (cont.)

Example

C statement:
    A = B + C * D - E + F + A

Equivalent code:
    push E
    push C
    push D
    mult
    push B
    add
    sub
    push F
    add
    push A
    add
    pop  A
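A zero-address sequence can be verified the same way with a small stack-machine interpreter. The sketch below assumes that a binary operation pops the top of the stack as its left operand and the next element as its right operand, matching sub = push(pop - pop); the interpreter and the sample values are hypothetical.

```python
def run_stack(program, memory):
    """Execute a zero-address program; only push and pop carry an address."""
    stack = []
    for line in program:
        op, *arg = line.split()
        if op == "push":
            stack.append(memory[arg[0]])
        elif op == "pop":
            memory[arg[0]] = stack.pop()
        elif op in ("add", "sub", "mult"):
            first = stack.pop()    # top of stack
            second = stack.pop()   # element below the top
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)
            else:
                stack.append(first * second)
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory


# Hypothetical initial values for the variables.
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
expected = memory["B"] + memory["C"] * memory["D"] - memory["E"] + memory["F"] + memory["A"]

program = [
    "push E", "push C", "push D", "mult", "push B", "add",
    "sub", "push F", "add", "push A", "add", "pop A",
]
run_stack(program, memory)
print(memory["A"], expected)
```

E is pushed first so that it sits below B + C*D when sub executes; this ordering constraint is one reason stack code can need extra rearranging instructions.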

Load/Store Architecture

- Instructions expect operands in internal processor registers
- Special LOAD and STORE instructions move data between registers and memory
- RISC uses this architecture
- Reduces instruction length

Load/Store Architecture (cont.)

Sample instructions:
    load  Rd,addr       ; Rd = [addr]
    store addr,Rs       ; (addr) = Rs
    add   Rd,Rs1,Rs2    ; Rd = Rs1 + Rs2
    sub   Rd,Rs1,Rs2    ; Rd = Rs1 - Rs2
    mult  Rd,Rs1,Rs2    ; Rd = Rs1 * Rs2

Number of Addresses (cont.)

Example

C statement:
    A = B + C * D - E + F + A

Equivalent code:
    load  R1,B
    load  R2,C
    load  R3,D
    load  R4,E
    load  R5,F
    load  R6,A
    mult  R2,R2,R3
    add   R2,R2,R1
    sub   R2,R2,R4
    add   R2,R2,R5
    add   R2,R2,R6
    store A,R2

Flow of Control

- Default is sequential flow
- Several instructions alter this default execution
  - Branches
    - Unconditional
    - Conditional
    - Delayed branches
  - Procedure calls
    - Delayed procedure calls

Flow of Control (cont.)

Branches
- Unconditional
  - Absolute address
  - PC-relative
    - Target address is specified relative to PC contents
    - Relocatable code
- Example: MIPS
  - Absolute address
        j target
  - PC-relative
        b target

Flow of Control (cont.)

Branches
- Conditional
  - Jump is taken only if the condition is met
- Two types
  - Set-Then-Jump
    - Condition testing is separated from branching
    - Condition code registers are used to convey the condition test result
    - Condition code registers keep a record of the status of the last ALU operation, such as overflow condition
    - Example: Pentium code
          cmp AX,BX    ; compare AX and BX
          je  target   ; jump if equal

Flow of Control (cont.)

  - Test-and-Jump
    - Single instruction performs condition testing and branching
    - Example: MIPS instruction
          beq Rsrc1,Rsrc2,target
      Jumps to target if Rsrc1 = Rsrc2

Delayed branching
- Control is transferred after executing the instruction that follows the branch instruction
  - This instruction slot is called the delay slot
- Improves efficiency
- Highly pipelined RISC processors support it

Flow of Control (cont.)

Procedure calls
- Facilitate modular programming
- Require two pieces of information to return
  - End of procedure
    - Pentium uses the ret instruction
    - MIPS uses the jr instruction
  - Return address
    - In a (special) register: MIPS allows any general-purpose register
    - On the stack: Pentium

Flow of Control (cont.)

[Figure: delay slot]

Parameter Passing

Two basic techniques
- Register-based (e.g., PowerPC, MIPS)
  - Internal registers are used
    - Faster
    - Limit the number of parameters
    - Recursive procedure
- Stack-based (e.g., Pentium)
  - Stack is used
    - More general

Operand Types

Instructions support basic data types
- Characters
- Integers
- Floating-point

Instruction overload
- Same instruction for different data types
- Example: Pentium
      mov AL,address     ; loads an 8-bit value
      mov AX,address     ; loads a 16-bit value
      mov EAX,address    ; loads a 32-bit value

Operand Types (cont.)

Separate instructions
- Instructions specify the operand size
- Example: MIPS
      lb Rdest,address    ; loads a byte
      lh Rdest,address    ; loads a halfword (16 bits)
      lw Rdest,address    ; loads a word (32 bits)
      ld Rdest,address    ; loads a doubleword (64 bits)
- Similar instructions for store

Addressing Modes

How the operands are specified
- Operands can be in three places
  - Registers
    - Register addressing mode
  - Part of instruction
    - Constant
    - Immediate addressing mode
    - All processors support these two addressing modes
  - Memory
    - Difference between RISC and CISC
    - CISC supports a large variety of addressing modes
    - RISC follows load/store architecture

Instruction Types

Several types of instructions
- Data movement
  - Pentium: mov dest,src
  - Some do not provide direct data movement instructions
  - Indirect data movement
        add Rdest,Rsrc,0    ; Rdest = Rsrc + 0
- Arithmetic and Logical
  - Arithmetic
    - Integer and floating-point, signed and unsigned
    - add, subtract, multiply, divide
  - Logical
    - and, or, not, xor

Instruction Types (cont.)

Condition code bits

- S: Sign bit (0 = +, 1 = -)
- Z: Zero bit (0 = nonzero, 1 = zero)
- O: Overflow bit (0 = no overflow, 1 = overflow)
- C: Carry bit (0 = no carry, 1 = carry)

Example: Pentium

    cmp count,25       ; compare count to 25
                       ; subtract 25 from count
    je  target         ; jump if equal
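How a compare sets the four condition code bits can be sketched as follows. This is a minimal model, assuming a 32-bit machine and the usual convention that a compare computes `a - b` and discards the result; the names `compare` and `je_taken` are made up for the illustration.

```python
def compare(a, b, bits=32):
    """Return the S, Z, O, C flags produced by computing a - b."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "S": int(bool(result & sign)),                   # sign bit of the result
        "Z": int(result == 0),                           # result is zero
        # Signed overflow: operands have different signs and the
        # result's sign differs from the sign of a.
        "O": int(bool((a ^ b) & (a ^ result) & sign)),
        "C": int(a < b),                                 # borrow from the unsigned subtraction
    }

def je_taken(flags):
    """A jump-if-equal style branch only looks at the zero bit."""
    return flags["Z"] == 1

print(compare(25, 25), je_taken(compare(25, 25)))
print(compare(10, 25), je_taken(compare(10, 25)))
```

The branch never sees the operands themselves, only the flags left behind by the preceding instruction.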

Instruction Types (cont.)

Flow control and I/O instructions

- Branch
- Procedure call
- Interrupts

I/O instructions

- Memory-mapped I/O
  • Most processors support memory-mapped I/O
  • No separate instructions for I/O
- Isolated I/O
  • Pentium supports isolated I/O
  • Separate I/O instructions

    in  AX,io_port     ; read from an I/O port
    out io_port,AX     ; write to an I/O port

Instruction Formats

Two types

Fixed-length

- Used by RISC processors
- 32-bit RISC processors use 32-bit wide instructions
  • Examples: SPARC, MIPS, PowerPC

Variable-length

- Used by CISC processors
- Memory operands need more bits to specify

Opcode

- Major and exact operation

Examples of Instruction Formats

RISC (Reduced Instruction Set Computer)
versus
CISC (Complex Instruction Set Computer)

RISC vs. CISC

The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.

RISC systems access memory only with explicit load and store instructions.

In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.

The difference between CISC and RISC becomes evident through the basic computer performance equation:

    time/program = (time/cycle) × (cycles/instruction) × (instructions/program)

RISC systems shorten execution time by reducing the clock cycles per instruction.

CISC systems improve performance by reducing the number of instructions per program.
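The trade-off in the performance equation can be made concrete with a small calculation. The instruction counts, CPI values, and cycle times below are invented purely for illustration; they are not measurements of any real machine.

```python
def execution_time(instructions, cycles_per_instruction, cycle_time_ns):
    """time/program = instructions/program * cycles/instruction * time/cycle."""
    return instructions * cycles_per_instruction * cycle_time_ns

# Hypothetical CISC-style machine: fewer instructions, more cycles each.
cisc_ns = execution_time(instructions=100, cycles_per_instruction=4, cycle_time_ns=2.0)

# Hypothetical RISC-style machine: more instructions, fewer and shorter cycles.
risc_ns = execution_time(instructions=150, cycles_per_instruction=1.5, cycle_time_ns=1.0)

print(f"CISC-style: {cisc_ns} ns, RISC-style: {risc_ns} ns")
```

With these made-up numbers the RISC-style machine wins despite executing 50% more instructions, because the other two factors shrink by more than that; different numbers could reverse the result.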

The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.

The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.

With fixed-length instructions, RISC lends itself to pipelining and speculative execution.

Consider the following program fragments:

    CISC                      RISC

    mov ax, 10                mov ax, 0
    mov bx, 5                 mov bx, 10
    mul bx, ax                mov cx, 5
                       Begin: add ax, bx
                              loop Begin

The total clock cycles for the CISC version might be:

    (2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles

While the clock cycles for the RISC version are:

    (3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles

With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds.
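The cycle totals for the two fragments can be recomputed with a short tally. The sketch assumes the per-instruction costs used in the comparison (1 cycle for `mov`, `add`, and `loop`, 30 cycles for `mul`) and a loop body that runs five times; the `CYCLES` table and `total_cycles` helper are illustrative names.

```python
# Assumed cost of each instruction, in clock cycles.
CYCLES = {"mov": 1, "add": 1, "loop": 1, "mul": 30}

# Dynamic instruction traces: what actually executes, in order.
cisc_trace = ["mov", "mov", "mul"]
risc_trace = ["mov", "mov", "mov"] + ["add", "loop"] * 5

def total_cycles(trace):
    """Sum the cycle cost of every executed instruction."""
    return sum(CYCLES[op] for op in trace)

print("CISC:", total_cycles(cisc_trace), "cycles for", len(cisc_trace), "instructions")
print("RISC:", total_cycles(risc_trace), "cycles for", len(risc_trace), "instructions")
```

The RISC version executes more than four times as many instructions yet needs fewer cycles, which is the point of the comparison.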

Because of their load-store ISAs, RISC architectures require a large number of CPU registers.

These registers provide fast access to data during sequential program execution.

They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.

Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.

This is how registers can be overlapped in a RISC system.

The current window pointer (CWP) points to the active register window.

[Figure: overlapping register windows]

It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.

Some RISC systems provide more extravagant instruction sets than some CISC systems.

Some systems combine both approaches.

The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.

RISC                                          CISC

Multiple register sets.                       Single register set.
Three operands per instruction.               One or two register operands per instruction.
Parameter passing through register windows.   Parameter passing through memory.
Single-cycle instructions.                    Multiple-cycle instructions.
Hardwired control.                            Microprogrammed control.
Highly pipelined.                             Less pipelined.
Simple instructions, few in number.           Many complex instructions.
Fixed-length instructions.                    Variable-length instructions.
Complexity in compiler.                       Complexity in microcode.
Only LOAD/STORE instructions access memory.   Many instructions can access memory.
Few addressing modes.                         Many addressing modes.

Summary

Instruction Set Design Issues

Instruction set design issues include:

Where are operands stored?
- registers, memory, stack, accumulator

How many explicit operands are there?
- 0, 1, 2, or 3

How is the operand location specified?
- register, immediate, indirect, . . .

What type and size of operands are supported?
- byte, int, float, double, string, vector, . . .

What operations are supported?
- add, sub, mul, move, compare, . . .

More About General Purpose Registers

Why do almost all new architectures use GPRs?

Registers are much faster than memory (even cache)
- Register values are available immediately
- When memory isn't ready, processor must wait ("stall")

Registers are convenient for variable storage
- Compiler assigns some variables just to registers
- More compact code, since small fields specify registers (compared to memory addresses)

[Figure: Processor with Registers, Cache, Memory, Disk]

What Operations are Needed

Arithmetic + Logical
- Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
- Logical operation: AND, OR, XOR, NOT

Data Transfer - copy, load, store

Control - branch, jump, call, return

Floating Point - ADD, MUL, DIV
- Same as arithmetic but usually take bigger operands

Decimal - ADDD, CONVERT

String - move, compare, search

Graphics - pixel and vertex, compression/decompression operations

Stack Architecture Pros and Cons

Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures

Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures

Accumulator Architecture Pros and Cons

Pros
- Very low hardware requirements
- Easy to design and understand

Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic

Memory-Memory Architecture Pros and Cons

Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)

Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required

Memory-Register Architecture Pros and Cons

Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density

Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers

Control Flow Instructions

Jump: for unconditional change in the control flow

Branch: for conditional change in the control flow

Procedure calls and returns

Data is based on SPEC2000 on Alpha

Destination Address Definition

Relative addressing with respect to the program counter proved to be the best choice for forward and backward branches or jumps (load address independent)

To allow for dynamic loading of library routines, register indirect addressing allows addresses to be loaded in special registers (e.g., virtual functions in C++ and system calls in a case statement)

Data is based on SPEC2000 on Alpha
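The two ways of naming a branch destination can be sketched as below. This is a simplified model: the offset is taken relative to the branch's own address (real ISAs usually measure from the following instruction), and the addresses and the register name `r31` are made up for the example.

```python
def pc_relative_target(branch_address, offset):
    """Target = address of the branch plus a signed offset stored in the instruction."""
    return branch_address + offset

def register_indirect_target(registers, name):
    """Target = whatever address the named register holds at run time."""
    return registers[name]

# The same encoded offset works wherever the code happens to be loaded.
OFFSET = -0x10
for load_base in (0x1000, 0x8000):
    branch_address = load_base + 0x20
    target = pc_relative_target(branch_address, OFFSET)
    print(f"loaded at {load_base:#x}: branch at {branch_address:#x} goes to {target:#x}")

# A register-indirect jump can reach an address that is only known at run time,
# such as a library routine placed by a dynamic loader.
registers = {"r31": 0x7F001234}
print(hex(register_indirect_target(registers, "r31")))
```

In both load positions the branch lands 0x10 bytes past the start of the code, which is what "load address independent" means in practice.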

Condition Evaluation

Compare-and-branch can be efficient if the majority of conditions are comparisons with zero

Remember to focus on the common case

Based on SPEC on MIPS

Frequency of Types of Comparison

Data is based on SPEC2000 on Alpha

Different benchmark and machine set new design priority

DSPs support a repeat instruction for for loops (vectors) using registers

Supporting Procedures

Execution of a procedure follows these steps:
- Store parameters in a place accessible to the procedure
- Transfer control to the procedure
- Acquire the storage resources needed for the procedure
- Perform the desired task
- Store the result value in a place accessible to the calling program
- Return control to the point of origin

The hardware provides a program counter to trace instruction flow and manage transfer of control

Parameter Passing
- Registers can be used for passing a small number of parameters
- A stack is used to spill registers of the current context and make room for the called procedure to run, and to allow for large parameters to be passed
- Storage of machine state can be performed by caller or callee
- Handling of shared variables is important to ensure correct semantics and thus requires clear specifications in the library interface

Global variables stored in registers need careful handling

Type and Size of Operands

The type of an operand is designated by encoding it in the instruction's operation code

The type of an operand, e.g., single-precision float, effectively gives its size

Common operand types include character, half-word and word-size integer, and single- and double-precision floating point

Characters are almost always in ASCII, integers are in 2's complement, and floating point is in IEEE 754

The 16-bit Unicode used in Java is gaining popularity due to its support for international character sets

For business applications, some architectures support a decimal format in binary coded decimal (BCD)

Depending on the size of the word, the complexity of handling different operand types differs

DSPs offer fixed-point data types to support high-precision floating-point arithmetic and to allow sharing a single exponent for multiple numbers

For graphics applications, vertex and pixel operands are added features

Size of Operands

Double-word data type is used for double-precision floating point operations and for address storage in machines with a 64-bit wide address bus

Words are used for integer operations and for 32-bit address bus machines

Because of the mix in SPEC, word and double-word data types dominate

Frequency of reference by size based on SPEC2000 on Alpha

Instruction Representation

Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary, or base 2)

Numbers are stored in computers as a series of high and low electronic signals (binary numbers)

Binary digits are called bits and considered the atom of computing

Each piece of an instruction is a number, and placing these numbers together forms the instruction

Assemblers translate the symbolic assembly instructions into machine language instructions (machine code)

Example:

    Assembly:                 add $t0, $s1, $s2

    M/C language (decimal):   0       17      18      8       0       32

    M/C language (binary):    000000  10001   10010   01000   00000   100000
                              6 bits  5 bits  5 bits  5 bits  5 bits  6 bits

Note: the MIPS compiler by default maps $s0,...,$s7 to regs 16-23 and $t0,...,$t7 to regs 8-15
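Packing the six fields of a register-format instruction into one 32-bit word can be sketched as follows. The helper name `encode_r` is made up for this illustration; the field widths (6, 5, 5, 5, 5, 6 bits) and the values for `add $t0, $s1, $s2` (op 0, rs 17, rt 18, rd 8, shamt 0, funct 32) follow the standard MIPS register-format layout.

```python
def encode_r(op, rs, rt, rd, shamt, funct):
    """Pack R-format fields (6, 5, 5, 5, 5, 6 bits) into a 32-bit word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct

# add $t0, $s1, $s2  ->  rd = $t0 (8), rs = $s1 (17), rt = $s2 (18)
word = encode_r(op=0, rs=17, rt=18, rd=8, shamt=0, funct=32)

print(f"{word:032b}")   # the six binary fields run together
print(f"{word:#010x}")  # the same word in hexadecimal
```

Note that the destination register comes first in the assembly text but sits in the fourth field of the machine word.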

Encoding an Instruction Set

Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation

The operation is typically specified in one field, called the opcode

The addressing mode for the operand can be encoded with the operation, or specified through a separate identifier in case of a large number of supported modes

The architecture must balance between several competing factors:
- Desire to support as many registers and addressing modes as possible
- Effect of operand specification on the size of the instruction (program)
- Desire to simplify instruction fetching and decoding during execution

Fixed-size instruction encoding simplifies the CPU design while limiting the addressing modes supported

An architect caring about the code size can use variable-size encoding

A hybrid approach is to allow variability by supporting multiple-sized instructions

Encoding Examples

MIPS Instruction format

Register-format instructions:

    op      rs      rt      rd      shamt   funct
    6 bits  5 bits  5 bits  5 bits  5 bits  6 bits

op: Basic operation of the instruction, traditionally called the opcode
rs: The first register source operand
rt: The second register source operand
rd: The register destination operand; it gets the result of the operation
shamt: Shift amount
funct: This field selects the specific variant of the operation in the op field

Immediate-type instructions:

    op      rs      rt      address
    6 bits  5 bits  5 bits  16 bits

Some instructions need longer fields than provided, for large value constants

The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register

Example: lw $t0, 32($s3)    # Temporary register $t0 gets A[8]

    Instruction  Format  op   rs   rt   rd   shamt  funct  address
    add          R       0    reg  reg  reg  0      32     N/A
    sub          R       0    reg  reg  reg  0      34     N/A
    lw           I       35   reg  reg  N/A  N/A    N/A    address
    sw           I       43   reg  reg  N/A  N/A    N/A    address
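The immediate-type layout, including the ±2^15 limit on the address field, can be sketched in the same way. `encode_i` is a hypothetical helper name; the opcode 35 for load word and the register numbers ($s3 = 19, $t0 = 8) follow the standard MIPS conventions.

```python
def encode_i(op, rs, rt, address):
    """Pack I-format fields (6, 5, 5, 16 bits) into a 32-bit word."""
    if not -(1 << 15) <= address < (1 << 15):
        raise ValueError("address field must fit in 16 signed bits")
    # Negative offsets are stored in 16-bit two's complement form.
    return (op << 26) | (rs << 21) | (rt << 16) | (address & 0xFFFF)

# lw $t0, 32($s3)  ->  op = 35, base register rs = $s3 (19), rt = $t0 (8)
word = encode_i(op=35, rs=19, rt=8, address=32)
print(f"{word:#010x}")

# An offset one past the limit does not fit in the instruction.
try:
    encode_i(35, 19, 8, 1 << 15)
except ValueError as err:
    print("rejected:", err)
```

An offset outside the 16-bit range has to be built in a register with extra instructions, which is the cost of keeping every instruction the same length.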

The Stored Program Concept

Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept

Today's computers are built on two key principles:
- Instructions are represented as numbers
- Programs can be stored in memory to be read or written just like numbers

The power of the concept: memory can contain
- the source code for an editor
- the compiled m/c code for the editor
- the text that the compiled program is using
- the compiler that generated the code

[Figure: a processor connected to memory holding an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and source code in C for the editor program]

Compiling if-then-else in MIPS

Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiled MIPS code for the following C if statement:

    if (i == j) f = g + h; else f = g - h;

[Figure: flowchart testing i == j; when i = j, f = g + h; when i ≠ j (Else), f = g - h; both paths meet at Exit]

MIPS:

          bne $s3, $s4, Else     # go to Else if i ≠ j
          add $s0, $s1, $s2      # f = g + h (skipped if i ≠ j)
          j   Exit
    Else: sub $s0, $s1, $s2      # f = g - h (skipped if i = j)
    Exit:
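The control flow of the compiled if-then-else can be traced with a toy interpreter. This is a sketch only: it models just the four instructions used, keeps labels in a separate table, and ignores branch delay slots; `PROGRAM`, `LABELS`, and `run` are names invented for the illustration.

```python
# One tuple per instruction; label positions are kept in LABELS.
PROGRAM = [
    ("bne", "s3", "s4", "Else"),   # go to Else if i != j
    ("add", "s0", "s1", "s2"),     # f = g + h
    ("j", "Exit"),
    ("sub", "s0", "s1", "s2"),     # Else: f = g - h
]
LABELS = {"Else": 3, "Exit": 4}

def run(registers):
    """Execute PROGRAM on a copy of the register file and return the result."""
    regs = dict(registers)
    pc = 0
    while pc < len(PROGRAM):
        op, *args = PROGRAM[pc]
        pc += 1                              # default: sequential flow
        if op == "bne":
            if regs[args[0]] != regs[args[1]]:
                pc = LABELS[args[2]]
        elif op == "j":
            pc = LABELS[args[0]]
        elif op == "add":
            regs[args[0]] = regs[args[1]] + regs[args[2]]
        elif op == "sub":
            regs[args[0]] = regs[args[1]] - regs[args[2]]
    return regs

# f=$s0, g=$s1, h=$s2, i=$s3, j=$s4
print(run({"s0": 0, "s1": 7, "s2": 3, "s3": 5, "s4": 5})["s0"])  # i == j path
print(run({"s0": 0, "s1": 7, "s2": 3, "s3": 5, "s4": 6})["s0"])  # i != j path
```

Only one of the `add` and `sub` instructions executes on any run: the branch skips the `add`, and the jump skips the `sub`.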

Typical Compilation

Major Types of Optimization

Optimization name: explanation (frequency where given)

High-level (at or near source level; machine independent)
- Procedure integration: replace procedure call by procedure body (N.M.)

Local (within straight line code)
- Common subexpression elimination: replace two instances of the same computation by a single copy
- Constant propagation: replace all instances of a variable that is assigned a constant with the constant
- Stack height reduction: rearrange expression tree to minimize resources needed for expression evaluation (N.M.)

Global (across a branch)
- Global common subexpression elimination: same as local, but this version crosses branches
- Copy propagation: replace all instances of a variable A that has been assigned X (i.e., A = X) with X
- Code motion: remove code from a loop that computes the same value each iteration of the loop
- Induction variable elimination: simplify/eliminate array addressing calculations within loops

Machine-dependent (depends on machine knowledge)
- Strength reduction: many examples, such as replacing a multiply by a constant with adds and shifts (N.M.)
- Pipeline scheduling: reorder instructions to improve pipeline performance (N.M.)
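Strength reduction of a multiply by a constant can be sketched as below. The function is a hand-written illustration of the shift-and-add idea, not the output of any particular compiler, and it assumes a non-negative constant.

```python
def multiply_by_constant(x, c):
    """Compute x * c using only shifts and adds (c must be non-negative)."""
    result = 0
    shift = 0
    while c:
        if c & 1:                  # this bit of the constant is set,
            result += x << shift   # so add x * 2**shift
        c >>= 1
        shift += 1
    return result

def times_ten(x):
    """What a compiler might emit for x * 10: 10 = 8 + 2."""
    return (x << 3) + (x << 1)

print(multiply_by_constant(7, 10), times_ten(7))
```

Because the constant is known at compile time, the loop disappears in practice and only the adds and shifts for the set bits remain, as in `times_ten`.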

Effect of Compiler Optimization

Measurements taken on MIPS

[Figure: effect of optimization, by Program and Compiler Optimization Level]

Level 0: non-optimized code
Level 1: local optimization
Level 2: global optimization, s/w pipelining
Level 3: adds procedure integration

Compiler Support for Multimedia Instr.

Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics)

Intel added a new set of instructions, called Streaming SIMD Extension

A major advantage of vector computers is hiding latency of memory access by loading multiple elements and then overlapping execution with data transfer

Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations

Strided addressing allows memory access in increments larger than one

Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data

Supporting vector operations without strided addressing, such as Intel's MMX, limits the potential speedup

Such limited support for vector processing makes the use of vectorizing compiler optimization unpopular and restricts its scope to hand-coded routines

SIMD instructions on MMX and AltiVec tend to be solutions, not primitives
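Strided and gather/scatter addressing can be illustrated on a flat list standing in for memory. This is a behavioural sketch only (no real vector hardware is modelled), and the helper names and the 3x4 matrix are made up for the example.

```python
def strided_load(memory, base, stride, count):
    """Load `count` elements starting at `base`, stepping by `stride`."""
    return [memory[base + i * stride] for i in range(count)]

def gather_load(memory, addresses):
    """Load the elements whose addresses are listed in an index vector."""
    return [memory[a] for a in addresses]

def scatter_store(memory, addresses, values):
    """Store each value at the corresponding address of the index vector."""
    for address, value in zip(addresses, values):
        memory[address] = value

# A 3x4 matrix stored row by row: element (r, c) lives at index 4*r + c.
memory = list(range(12))

column_1 = strided_load(memory, base=1, stride=4, count=3)
picked = gather_load(memory, [7, 0, 10])
scatter_store(memory, [7, 0, 10], [-1, -2, -3])

print(column_1, picked, memory)
```

Without strided access, reading a matrix column means one scalar load per element, which is the kind of limit on speedup described above.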

Starting a Program

[Figure: C program → Compiler → Assembly language program → Assembler → Object: Machine language module, together with Object: Library routine (machine language) → Linker → Executable: Machine language program → Loader → Memory]

Linker
- Place code and data modules symbolically in memory
- Determine the address of data and instruction labels
- Patch both internal and external references

Object files for Unix typically contain:
- Header: size and position of components
- Text segment: machine code
- Data segment: static and dynamic variables
- Relocation info: identify absolute memory references
- Symbol table: name and location of labels, procedures and variables
- Debugging info: mapping source to object code, break points, etc.

Loading Executable Program

To load an executable, the operating system follows these steps:
- Reads the executable file header to determine the size of the text and data segments
- Creates an address space large enough for the text and data
- Copies the instructions and data from the executable file into memory
- Copies the parameters (if any) to the main program onto the stack
- Initializes the machine registers and sets the stack pointer to the first free location
- Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program

[Figure: memory layout, from low to high addresses: Reserved (from 0); Text (pc at 0040 0000 hex); Static data (from 1000 0000 hex, with $gp at 1000 8000 hex); Dynamic data; Stack (with $sp at 7fff ffff hex)]

Instruction Set Design Issues

- Number of Addresses
- Flow of Control
- Operand Types
- Addressing Modes
- Instruction Types
- Instruction Formats

Number of Addresses

Four categories

3-address machines
- 2 for the source operands and one for the result

2-address machines
- One address doubles as source and result

1-address machines
- Accumulator machines
- Accumulator is used for one source and result

0-address machines
- Stack machines
- Operands are taken from the stack
- Result goes onto the stack

Number of Addresses (cont.)

Three-address machines

Two for the source operands, one for the result

RISC processors use three addresses

Sample instructions

    add  dest,src1,src2    ; M(dest)=[src1]+[src2]
    sub  dest,src1,src2    ; M(dest)=[src1]-[src2]
    mult dest,src1,src2    ; M(dest)=[src1]*[src2]

Three addresses: Operand 1, Operand 2, Result

Example: a = b + c

Three-address instruction formats are not common because they require a relatively long instruction format to hold the three address references

Number of Addresses (cont.)

Example

C statement

    A = B + C * D - E + F + A

Equivalent code:

    mult T,C,D    ; T = C*D
    add  T,T,B    ; T = B+C*D
    sub  T,T,E    ; T = B+C*D-E
    add  T,T,F    ; T = B+C*D-E+F
    add  A,T,A    ; A = B+C*D-E+F+A

Number of Addresses (cont.)

Two-address machines

One address doubles (for source operand and result)

Last example makes a case for it
- Address T is used twice

Sample instructions

    load dest,src    ; M(dest)=[src]
    add  dest,src    ; M(dest)=[dest]+[src]
    sub  dest,src    ; M(dest)=[dest]-[src]
    mult dest,src    ; M(dest)=[dest]*[src]

Two addresses: one address doubles as operand and result

Example: a = a + b

The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation

Number of Addresses (cont.)

Example

C statement

    A = B + C * D - E + F + A

Equivalent code:

    load T,C    ; T = C
    mult T,D    ; T = C*D
    add  T,B    ; T = B+C*D
    sub  T,E    ; T = B+C*D-E
    add  T,F    ; T = B+C*D-E+F
    add  A,T    ; A = B+C*D-E+F+A

Number of Addresses (cont.)

One-address machines

Use special set of registers called accumulators
- Specify one source operand and receive the result

Called accumulator machines

Sample instructions

    load  addr    ; accum = [addr]
    store addr    ; M[addr] = accum
    add   addr    ; accum = accum + [addr]
    sub   addr    ; accum = accum - [addr]
    mult  addr    ; accum = accum * [addr]

Number of Addresses (cont.)

Example

C statement

    A = B + C * D - E + F + A

Equivalent code:

    load  C    ; load C into accum
    mult  D    ; accum = C*D
    add   B    ; accum = C*D+B
    sub   E    ; accum = B+C*D-E
    add   F    ; accum = B+C*D-E+F
    add   A    ; accum = B+C*D-E+F+A
    store A    ; store accum contents in A
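The one-address sequence can be checked with a tiny accumulator-machine model. The interpreter and the sample values for A through F are assumptions chosen for this sketch; only the seven-instruction program comes from the example.

```python
def run_accumulator(program, memory):
    """Execute one-address instructions against a single accumulator."""
    memory = dict(memory)      # work on a copy of the data memory
    accum = 0
    for op, addr in program:
        if op == "load":
            accum = memory[addr]
        elif op == "store":
            memory[addr] = accum
        elif op == "add":
            accum += memory[addr]
        elif op == "sub":
            accum -= memory[addr]
        elif op == "mult":
            accum *= memory[addr]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory

PROGRAM = [
    ("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
    ("add", "F"), ("add", "A"), ("store", "A"),
]

# Arbitrary sample values.
data = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
result = run_accumulator(PROGRAM, data)
print(result["A"])
```

Every instruction except `store` reads memory, which is where the high memory traffic of accumulator machines comes from.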

Number of Addresses (cont.)

Zero-address machines

Stack supplies operands and receives the result
- Special instructions to load and store use an address

Called stack machines (Ex: HP3000, Burroughs B5500)

Sample instructions

    push addr    ; push([addr])
    pop  addr    ; pop([addr])
    add          ; push(pop + pop)
    sub          ; push(pop - pop)
    mult         ; push(pop * pop)

Number of Addresses (cont.)

Example

C statement

    A = B + C * D - E + F + A

Equivalent code:

    push E
    push C
    push D
    mult
    push B
    add
    sub
    push F
    add
    push A
    add
    pop  A
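The zero-address sequence can be traced with a small stack-machine model. The interpreter and the sample values are assumptions for this sketch; `sub` follows the `push(pop - pop)` definition, so the first value popped is the minuend, which is why E is pushed first.

```python
def run_stack(program, memory):
    """Execute zero-address instructions; only push and pop name an address."""
    memory = dict(memory)
    stack = []
    for instruction in program:
        op, *operand = instruction.split()
        if op == "push":
            stack.append(memory[operand[0]])
        elif op == "pop":
            memory[operand[0]] = stack.pop()
        else:
            first = stack.pop()        # top of stack
            second = stack.pop()       # next element down
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)
            elif op == "mult":
                stack.append(first * second)
            else:
                raise ValueError(f"unknown instruction: {op}")
    return memory

PROGRAM = [
    "push E", "push C", "push D", "mult", "push B", "add",
    "sub", "push F", "add", "push A", "add", "pop A",
]

# Arbitrary sample values.
data = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
result = run_stack(PROGRAM, data)
print(result["A"])
```

The arithmetic instructions carry no operand fields at all, which gives the good code density noted for stack architectures, at the cost of twelve instructions for one statement.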

Load/Store Architecture

Instructions expect operands in internal processor registers

Special LOAD and STORE instructions move data between registers and memory

RISC uses this architecture

Reduces instruction length

Load/Store Architecture (cont.)

Sample instructions

    load  Rd,addr       ; Rd = [addr]
    store addr,Rs       ; (addr) = Rs
    add   Rd,Rs1,Rs2    ; Rd = Rs1 + Rs2
    sub   Rd,Rs1,Rs2    ; Rd = Rs1 - Rs2
    mult  Rd,Rs1,Rs2    ; Rd = Rs1 * Rs2

Number of Addresses (cont.)

Example

C statement

    A = B + C * D - E + F + A

Equivalent code:

    load R1,B           mult  R2,R2,R3
    load R2,C           add   R2,R2,R1
    load R3,D           sub   R2,R2,R4
    load R4,E           add   R2,R2,R5
    load R5,F           add   R2,R2,R6
    load R6,A           store A,R2

Flow of Control

Default is sequential flow

Several instructions alter this default execution

Branches
- Unconditional
- Conditional
- Delayed branches

Procedure calls
- Delayed procedure calls

Flow of Control (cont.)

Branches

Unconditional
- Absolute address
- PC-relative
  • Target address is specified relative to PC contents
  • Relocatable code

Example: MIPS
- Absolute address

    j target

- PC-relative

    b target


Flow of Control (cont.)

Branches
Conditional
- Jump is taken only if the condition is met
Two types
- Set-Then-Jump
  - Condition testing is separated from branching
  - Condition code registers are used to convey the condition test result
  - Condition code registers keep a record of the status of the last ALU operation, such as the overflow condition
- Example: Pentium code
    cmp AX,BX     ; compare AX and BX
    je target     ; jump if equal

Flow of Control (cont.)

- Test-and-Jump
  - Single instruction performs condition testing and branching
- Example: MIPS instruction
    beq Rsrc1,Rsrc2,target
  Jumps to target if Rsrc1 = Rsrc2
Delayed branching
Control is transferred after executing the instruction that follows the branch instruction
- This instruction slot is called the delay slot
Improves efficiency
Highly pipelined RISC processors support delayed branching
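The effect of a delay slot can be shown with a toy interpreter. This is a sketch under assumed conventions: instructions are (op, argument) pairs, "b" is an unconditional branch to an instruction index, and a branch placed inside a delay slot is not handled.

```python
def run_with_delay_slot(program):
    """Return the order in which non-branch instructions execute.

    A branch takes effect only after the instruction that follows it
    (the delay slot) has executed.
    """
    trace = []
    pc = 0
    pending_target = None
    while pc < len(program):
        op, arg = program[pc]
        next_pc = pc + 1
        if pending_target is not None:
            # The current instruction sits in the delay slot: run it, then jump.
            next_pc = pending_target
            pending_target = None
        if op == "b":
            pending_target = arg
        else:
            trace.append(arg)
        pc = next_pc
    return trace

program = [
    ("op", "before"),
    ("b", 4),              # branch to index 4
    ("op", "delay slot"),  # still executes
    ("op", "skipped"),
    ("op", "target"),
]
trace = run_with_delay_slot(program)
```

The instruction after the branch runs even though the branch is taken, so a compiler can fill that slot with useful work instead of leaving the pipeline idle.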

Flow of Control (cont.)

Procedure calls
Facilitate modular programming
Require two pieces of information to return
- End of procedure
  - Pentium uses ret instruction
  - MIPS uses jr instruction
- Return address
  - In a (special) register
    - MIPS allows any general-purpose register
  - On the stack
    - Pentium

Flow of Control (cont.)

[Figure: delay slot]

Parameter Passing

Two basic techniques
Register-based (e.g., PowerPC, MIPS)
- Internal registers are used
  - Faster
  - Limit the number of parameters
  - Recursive procedure
Stack-based (e.g., Pentium)
- Stack is used
  - More general

Operand Types

Instructions support basic data types
- Characters
- Integers
- Floating-point
Instruction overload
- Same instruction for different data types
- Example: Pentium
    mov AL,address     ; loads an 8-bit value
    mov AX,address     ; loads a 16-bit value
    mov EAX,address    ; loads a 32-bit value

Operand Types (cont.)

Separate instructions
- Instructions specify the operand size
- Example: MIPS
    lb Rdest,address    ; loads a byte
    lh Rdest,address    ; loads a halfword (16 bits)
    lw Rdest,address    ; loads a word (32 bits)
    ld Rdest,address    ; loads a doubleword (64 bits)
Similar instructions for store
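How the operand size changes what a load returns can be sketched as follows. The helper name `load`, the little-endian byte order, and the unsigned interpretation are assumptions for illustration; a real byte or halfword load may also sign-extend.

```python
def load(memory, address, size):
    """Read `size` bytes starting at `address` as an unsigned little-endian integer."""
    return int.from_bytes(memory[address:address + size], "little")

memory = bytes([0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08])

byte_value = load(memory, 0, 1)        # like a byte load
halfword_value = load(memory, 0, 2)    # like a halfword load (16 bits)
word_value = load(memory, 0, 4)        # like a word load (32 bits)
doubleword_value = load(memory, 0, 8)  # like a doubleword load (64 bits)
```

The same address yields four different values, which is why the instruction (or the register name, in the overloaded style) has to carry the operand size.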

Addressing Modes

How the operands are specified
Operands can be in three places
- Registers
  - Register addressing mode
- Part of instruction
  - Constant
  - Immediate addressing mode
  - All processors support these two addressing modes
- Memory
  - Difference between RISC and CISC
  - CISC supports a large variety of addressing modes
  - RISC follows load/store architecture

Instruction Types

Several types of instructions
Data movement
- Pentium: mov dest,src
- Some do not provide direct data movement instructions
- Indirect data movement
    add Rdest,Rsrc,0    ; Rdest = Rsrc+0
Arithmetic and Logical
- Arithmetic
  - Integer and floating-point, signed and unsigned
  - add, subtract, multiply, divide
- Logical
  - and, or, not, xor

Instruction Types (cont.)

Condition code bits
- S: Sign bit (0 = +, 1 = -)
- Z: Zero bit (0 = nonzero, 1 = zero)
- O: Overflow bit (0 = no overflow, 1 = overflow)
- C: Carry bit (0 = no carry, 1 = carry)

Example: Pentium
    cmp count,25    ; compare count to 25
                    ; subtract 25 from count
    je target       ; jump if equal
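A compare instruction sets the condition code bits by performing a subtraction and discarding the result. The sketch below assumes 16-bit two's complement operands and models the carry bit as a borrow (the x86 convention); the function name and the bit width are illustrative choices.

```python
def compare_flags(a, b, bits=16):
    """Condition code bits produced by computing a - b on `bits`-wide operands."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    diff = (a - b) & mask
    return {
        "S": int(bool(diff & sign)),                  # sign of the result
        "Z": int(diff == 0),                          # result is zero
        # Signed overflow: operand signs differ and the result sign differs from a.
        "O": int(bool((a ^ b) & (a ^ diff) & sign)),
        # Borrow out of the top bit: unsigned a is smaller than unsigned b.
        "C": int(a < b),
    }

equal = compare_flags(25, 25)
smaller = compare_flags(3, 25)
overflow = compare_flags(0x8000, 1)   # most negative 16-bit value minus 1
```

A jump-if-equal instruction then only needs to test Z, which is the separation between condition testing and branching described for set-then-jump designs.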

Instruction Types (cont.)

Flow control and I/O instructions
- Branch
- Procedure call
- Interrupts
I/O instructions
- Memory-mapped I/O
  - Most processors support memory-mapped I/O
  - No separate instructions for I/O
- Isolated I/O
  - Pentium supports isolated I/O
  - Separate I/O instructions
      in AX,io_port     ; read from an I/O port
      out io_port,AX    ; write to an I/O port

Instruction Formats

Two types
Fixed-length
- Used by RISC processors
- 32-bit RISC processors use 32-bit wide instructions
  - Examples: SPARC, MIPS, PowerPC
Variable-length
- Used by CISC processors
- Memory operands need more bits to specify
Opcode
Major and exact operation

Examples of Instruction Formats

RISC (Reduced Instruction Set Computer)
versus
CISC (Complex Instruction Set Computer)

RISC vs CISC

The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.

RISC systems access memory only with explicit load and store instructions.

In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.

RISC vs CISC

The difference between CISC and RISC becomes evident through the basic computer performance equation:

time/program = time/cycle × cycles/instruction × instructions/program

RISC systems shorten execution time by reducing the clock cycles per instruction.

CISC systems improve performance by reducing the number of instructions per program.
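The trade-off can be made concrete with the usual form of the performance equation, time per program = instructions per program × cycles per instruction × time per cycle. The workloads below are hypothetical numbers chosen only to illustrate the two strategies; they are not measurements.

```python
def execution_time_ns(instructions, cycles_per_instruction, ns_per_cycle):
    """time/program = instructions/program * cycles/instruction * time/cycle."""
    return instructions * cycles_per_instruction * ns_per_cycle

# Hypothetical CISC-like program: fewer instructions, more cycles each.
cisc_like = execution_time_ns(
    instructions=1_000_000, cycles_per_instruction=6, ns_per_cycle=2)

# Hypothetical RISC-like program: more instructions, one cycle each.
risc_like = execution_time_ns(
    instructions=1_500_000, cycles_per_instruction=1, ns_per_cycle=2)
```

With these assumed figures the RISC-like program executes 50% more instructions yet finishes sooner, because the product of the three factors is what matters, not any single one.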

RISC vs CISC

The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.

The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.

With fixed-length instructions, RISC lends itself to pipelining and speculative execution.

RISC vs CISC

Consider the two program fragments:

    CISC                  RISC
    mov ax, 10            mov ax, 0
    mov bx, 5             mov bx, 10
    mul bx, ax            mov cx, 5
                   Begin: add ax, bx
                          loop Begin

The total clock cycles for the CISC version might be:
(2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles

While the clock cycles for the RISC version is:
(3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles

With the RISC clock cycle being shorter, RISC gives us much faster execution speeds.
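The cycle totals can be reproduced with a few lines of arithmetic. The counts used here (2 movs plus one 30-cycle mul for the CISC fragment; 3 movs, 5 adds and 5 loop instructions at one cycle each for the RISC fragment) are assumed from the usual textbook version of this example, and the helper names are illustrative.

```python
def total_cycles(groups):
    """Sum count * cycles over (count, cycles_per_instruction) pairs."""
    return sum(count * cycles for count, cycles in groups)

# CISC fragment: 2 movs at 1 cycle each, 1 mul at an assumed 30 cycles.
cisc_cycles = total_cycles([(2, 1), (1, 30)])

# RISC fragment: 3 movs, 5 adds and 5 loop instructions at 1 cycle each.
risc_cycles = total_cycles([(3, 1), (5, 1), (5, 1)])

def multiply_by_repeated_add(multiplicand, count):
    """Mirror the RISC loop: add the multiplicand to an accumulator `count` times."""
    ax = 0
    for _ in range(count):
        ax += multiplicand
    return ax

product = multiply_by_repeated_add(10, 5)
```

Both fragments compute the same product; the RISC version simply trades one long multiply for a short loop of single-cycle instructions.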

RISC vs CISC

Because of their load-store ISAs, RISC architectures require a large number of CPU registers.

These registers provide fast access to data during sequential program execution.

They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.

Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.

RISC vs CISC

This is how registers can be overlapped in a RISC system.

The current window pointer (CWP) points to the active register window.

RISC vs CISC

It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.

Some RISC systems provide more extravagant instruction sets than some CISC systems.

Some systems combine both approaches.

The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.

RISC vs CISC

RISC
- Multiple register sets.
- Three operands per instruction.
- Parameter passing through register windows.
- Single-cycle instructions.
- Hardwired control.
- Highly pipelined.

CISC
- Single register set.
- One or two register operands per instruction.
- Parameter passing through memory.
- Multiple cycle instructions.
- Microprogrammed control.
- Less pipelined.

Continued...

RISC vs CISC

RISC
- Simple instructions, few in number.
- Fixed length instructions.
- Complexity in compiler.
- Only LOAD/STORE instructions access memory.
- Few addressing modes.

CISC
- Many complex instructions.
- Variable length instructions.
- Complexity in microcode.
- Many instructions can access memory.
- Many addressing modes.

Summary

Instruction Set Design Issues

Instruction set design issues include:
Where are operands stored?
- registers, memory, stack, accumulator
How many explicit operands are there?
- 0, 1, 2, or 3
How is the operand location specified?
- register, immediate, indirect, . . .
What type and size of operands are supported?
- byte, int, float, double, string, vector . . .
What operations are supported?
- add, sub, mul, move, compare . . .

More About General Purpose Registers

Why do almost all new architectures use GPRs?
Registers are much faster than memory (even cache)
- Register values are available immediately
- When memory isn't ready, processor must wait ("stall")
Registers are convenient for variable storage
- Compiler assigns some variables just to registers
- More compact code since small fields specify registers (compared to memory addresses)

[Figure: storage hierarchy: Processor, Registers, Cache, Memory, Disk]

What Operations are Needed

Arithmetic + Logical
- Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
- Logical operation: AND, OR, XOR, NOT
Data Transfer - copy, load, store
Control - branch, jump, call, return
Floating Point - ADDF, MULF, DIVF
- Same as arithmetic but usually take bigger operands
Decimal - ADDD, CONVERT
String - move, compare, search
Graphics - pixel and vertex, compression/decompression operations

Stack Architecture: Pros and Cons

Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures
Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures

Accumulator Architecture: Pros and Cons

Pros
- Very low hardware requirements
- Easy to design and understand
Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic

Memory-Memory Architecture: Pros and Cons

Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)
Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required

Memory-Register Architecture: Pros and Cons

Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density
Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers

Destination Address Definition

Relative addressing with respect to the program counter proved to be the best choice for forward and backward branching or jumps (load address independent)

To allow for dynamic loading of library routines, register indirect addressing allows addresses to be loaded in special registers (e.g., virtual functions in C++ and system calls in a case statement)

Data is based on SPEC on Alpha

Condition Evaluation

Compare-and-branch can be efficient if the majority of conditions are comparisons with zero

Remember to focus on the common case

Based on SPEC on MIPS

Frequency of Types of Comparison

Data is based on SPEC on Alpha

Different benchmarks and machines set new design priorities

DSPs support a repeat instruction for for-loops (vectors) using registers

Supporting Procedures

Execution of a procedure follows these steps:
- Store parameters in a place accessible to the procedure
- Transfer control to the procedure
- Acquire the storage resources needed for the procedure
- Perform the desired task
- Store the result value in a place accessible to the calling program
- Return control to the point of origin
The hardware provides a program counter to trace instruction flow and manage transfer of control

Parameter Passing
- Registers can be used for passing a small number of parameters
- A stack is used to spill registers of the current context, to make room for the called procedure to run, and to allow for large parameters to be passed
- Storage of machine state can be performed by caller or callee
- Handling of shared variables is important to ensure correct semantics and thus requires clear specifications in the library interface

Global variables stored in registers need careful handling

Type and Size of Operands

The type of an operand is designated by encoding it in the instruction's operation code
The type of an operand, e.g. single precision float, effectively gives its size
Common operand types include character, half word and word size integer, and single- and double-precision floating point
Characters are almost always in ASCII, integers in 2's complement, and floating point in IEEE 754
The 16-bit Unicode used in Java is gaining popularity due to its support for international character sets
For business applications, some architectures support a decimal format in binary coded decimal (BCD)
Depending on the size of the word, the complexity of handling different operand types differs
DSPs offer fixed-point data types to support high-precision floating-point arithmetic and to allow sharing a single exponent for multiple numbers
For graphics applications, vertex and pixel operands are added features

Size of Operands

Double-word data type is used for double-precision floating point operations and address storage in machines with a 64-bit wide address bus
Words are used for integer operations and for 32-bit address bus machines
Because of the mix in SPEC, word and double-word data types dominate

Frequency of reference by size based on SPEC2000 on Alpha

Instruction Representation

Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary, or base 2)
Numbers are stored in computers as a series of high and low electronic signals (binary numbers)
Binary digits are called bits and considered the atom of computing
Each piece of an instruction is a number, and placing these numbers together forms the instruction
The assembler translates the symbolic assembly instructions into machine language instructions (machine code)

Example:
Assembly: add $t0, $s1, $s2
M/C language (decimal):
    0       17      18      8       0       32
M/C language (binary):
    000000  10001   10010   01000   00000   100000
    6 bits  5 bits  5 bits  5 bits  5 bits  6 bits

Note: the MIPS compiler by default maps $s0,...,$s7 to registers 16-23 and $t0,...,$t7 to registers 8-15
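Packing the fields of a register-format instruction into one 32-bit word can be sketched as below. The function name is illustrative; the field widths (6, 5, 5, 5, 5, 6 bits) and the field values for add $t0, $s1, $s2 (op 0, rs 17, rt 18, rd 8, shamt 0, funct 32) follow the standard MIPS encoding.

```python
def encode_r_format(op, rs, rt, rd, shamt, funct):
    """Pack MIPS R-format fields (6, 5, 5, 5, 5, 6 bits) into a 32-bit word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct

# add $t0, $s1, $s2 : $s1 is register 17, $s2 is 18, $t0 is 8; funct 32 selects add.
word = encode_r_format(op=0, rs=17, rt=18, rd=8, shamt=0, funct=32)
bits = f"{word:032b}"

# Split the bit string back into the six fields for display.
fields = [bits[0:6], bits[6:11], bits[11:16], bits[16:21], bits[21:26], bits[26:32]]
```

Each field is just a small number; concatenating them is all the assembler has to do for this instruction.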

Encoding an Instruction Set

Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation
The operation is typically specified in one field called the opcode
The addressing mode for the operand can be encoded with the operation, or specified through a separate identifier when a large number of modes is supported
The architecture must balance several competing factors:
- Desire to support as many registers and addressing modes as possible
- Effect of operand specification on the size of the instruction (program)
- Desire to simplify instruction fetching and decoding during execution
Fixed-size instruction encoding simplifies the CPU design while limiting the addressing modes supported
An architect caring about code size can use variable-size encoding
A hybrid approach is to allow variability by supporting multiple-sized instructions

Encoding Examples

MIPS Instruction Format

Register-format instructions:

    op      rs      rt      rd      shamt   funct
    6 bits  5 bits  5 bits  5 bits  5 bits  6 bits

op: Basic operation of the instruction, traditionally called opcode
rs: The first register source operand
rt: The second register source operand
rd: The register destination operand; it gets the result of the operation
shamt: Shift amount
funct: This field selects the specific variant of the operation of the op field

Immediate-type instructions:

    op      rs      rt      address
    6 bits  5 bits  5 bits  16 bits

Some instructions need longer fields than provided, for large value constants
The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register
Example: lw $t0, 32($s3)    # Temporary register $t0 gets A[8]

    Instruction  Format  op  rs   rt   rd   shamt  funct  address
    add          R       0   reg  reg  reg  0      32     N/A
    sub          R       0   reg  reg  reg  0      34     N/A
    lw           I       35  reg  reg  N/A  N/A    N/A    address
    sw           I       43  reg  reg  N/A  N/A    N/A    address
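The immediate format and its 16-bit signed offset can be sketched in the same way. The function names are illustrative; the example encodes lw $t0, 32($s3) with the standard MIPS values op 35, rs 19 ($s3) and rt 8 ($t0), which are stated here as assumptions.

```python
def encode_i_format(op, rs, rt, offset):
    """Pack MIPS I-format fields (6, 5, 5, 16 bits) into a 32-bit word."""
    if not -(1 << 15) <= offset < (1 << 15):
        raise ValueError("offset does not fit in a signed 16-bit field")
    return (op << 26) | (rs << 21) | (rt << 16) | (offset & 0xFFFF)

def decode_offset(word):
    """Recover the signed 16-bit offset from the low half of the word."""
    raw = word & 0xFFFF
    return raw - (1 << 16) if raw & 0x8000 else raw

lw_word = encode_i_format(op=35, rs=19, rt=8, offset=32)
negative = encode_i_format(op=35, rs=19, rt=8, offset=-4)

try:
    encode_i_format(op=35, rs=19, rt=8, offset=1 << 15)
    out_of_range_rejected = False
except ValueError:
    out_of_range_rejected = True
```

The range check makes the limit visible: anything further than 2^15 bytes from the base register cannot be reached with a single load in this format.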

The Stored Program Concept

Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept
Today's computers are built on two key principles:
- Instructions are represented as numbers
- Programs can be stored in memory to be read or written just like numbers
The power of the concept: memory can contain
- the source code for an editor
- the compiled machine code for the editor
- the text that the compiled program is using
- the compiler that generated the code

[Figure: a processor attached to memory holding an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and source code in C for the editor program]

Compiling if-then-else in MIPS

Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiler's code for the following C if statement?

    if (i == j) f = g + h; else f = g - h;

[Figure: flowchart testing i == j; the i = j path runs f = g + h, the i ≠ j path (Else:) runs f = g - h, and both paths reach Exit:]

MIPS:
          bne $s3, $s4, Else    # go to Else if i ≠ j
          add $s0, $s1, $s2     # f = g + h (skipped if i ≠ j)
          j Exit
    Else: sub $s0, $s1, $s2     # f = g - h (skipped if i = j)
    Exit:
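The branch structure of the compiled code can be traced with a toy interpreter. This is a sketch only: the tuple encoding, the "label" pseudo-instruction and the register names without the dollar sign are assumptions made for illustration.

```python
def run_if_then_else(regs):
    """Interpret a bne / add / j / sub sequence and return the value of f (s0)."""
    program = [
        ("bne", "s3", "s4", "Else"),   # go to Else if i != j
        ("add", "s0", "s1", "s2"),     # f = g + h
        ("j", "Exit"),
        ("label", "Else"),
        ("sub", "s0", "s1", "s2"),     # f = g - h
        ("label", "Exit"),
    ]
    labels = {ins[1]: idx for idx, ins in enumerate(program) if ins[0] == "label"}
    pc = 0
    while pc < len(program):
        ins = program[pc]
        pc += 1
        if ins[0] == "bne":
            if regs[ins[1]] != regs[ins[2]]:
                pc = labels[ins[3]]
        elif ins[0] == "j":
            pc = labels[ins[1]]
        elif ins[0] == "add":
            regs[ins[1]] = regs[ins[2]] + regs[ins[3]]
        elif ins[0] == "sub":
            regs[ins[1]] = regs[ins[2]] - regs[ins[3]]
    return regs["s0"]

# g = 7, h = 3; first with i == j, then with i != j.
f_equal = run_if_then_else({"s0": 0, "s1": 7, "s2": 3, "s3": 5, "s4": 5})
f_differ = run_if_then_else({"s0": 0, "s1": 7, "s2": 3, "s3": 5, "s4": 6})
```

Note that the compiler tests the opposite condition (not equal) so that the then-branch can fall through without a jump.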

Typical Compilation

Major Types of Optimization

Optimization name: explanation (frequency, where given)

High-level (at or near source level, machine independent)
- Procedure integration: replace procedure call by procedure body (N.M.)

Local (within straight-line code)
- Common subexpression elimination: replace two instances of the same computation by a single copy
- Constant propagation: replace all instances of a variable that is assigned a constant with the constant
- Stack height reduction: rearrange expression tree to minimize resources needed for expression evaluation (N.M.)

Global (across a branch)
- Global common subexpression elimination: same as local, but this version crosses branches
- Copy propagation: replace all instances of a variable A that has been assigned X (i.e., A = X) with X
- Code motion: remove code from a loop that computes the same value each iteration of the loop
- Induction variable elimination: simplify/eliminate array addressing calculations within loops

Machine-dependent (depends on machine knowledge)
- Strength reduction: many examples, such as replacing a multiply by a constant with adds and shifts (N.M.)
- Pipeline scheduling: reorder instructions to improve pipeline performance (N.M.)
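Strength reduction, one of the machine-dependent optimizations listed, can be illustrated by replacing a multiply by a constant with shifts and an add. The function names and the constant 10 are chosen for illustration only.

```python
def times_ten_multiply(x):
    """Straightforward form: one multiply."""
    return x * 10

def times_ten_shift_add(x):
    """Strength-reduced form: 10*x = 8*x + 2*x, using two shifts and one add."""
    return (x << 3) + (x << 1)

# Both forms agree over a range of inputs, including negative values.
agree = all(times_ten_multiply(x) == times_ten_shift_add(x) for x in range(-100, 101))
```

Whether the rewritten form is actually faster depends on the relative cost of multiply, shift and add on the target machine, which is why this optimization is classed as machine-dependent.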

Effect of Compiler Optimization

[Figure: program and compiler optimization level]

Measurements taken on S

Level 0: non-optimized code
Level 1: local optimization
Level 2: global optimization, s/w pipelining
Level 3: adds procedure integration

Compiler Support for Multimedia Instr.

Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics)
Intel added a new set of instructions called Streaming SIMD Extension
A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer
Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations
- Strided addressing allows memory access in increments larger than one
- Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data
Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup
Such limited support for vector processing makes the use of vectorizing compiler optimization unpopular and restricts its scope to hand-coded routines

SIMD instructions on MMX and AltiVec tend to be solutions, not primitives
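The two vector addressing styles can be sketched with plain lists standing in for memory. The function names and the sample data are assumptions for illustration; real vector hardware performs these accesses in a single instruction.

```python
def strided_load(memory, base, stride, count):
    """Load `count` elements starting at `base`, stepping by `stride`."""
    return [memory[base + i * stride] for i in range(count)]

def gather(memory, indices):
    """Load the elements whose addresses are listed in `indices`."""
    return [memory[i] for i in indices]

def scatter(memory, indices, values):
    """Store each value at the address listed in the matching position of `indices`."""
    for i, value in zip(indices, values):
        memory[i] = value

memory = list(range(100, 116))            # 16 cells holding 100..115

column = strided_load(memory, base=1, stride=4, count=3)
picked = gather(memory, [7, 2, 11])
scatter(memory, [0, 15], [-1, -2])
```

A stride of 4 is what reading one column of a row-major matrix with 4 columns looks like; gather and scatter cover the cases where the addresses follow no regular pattern.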

Starting a Program

[Figure: C program -> Compiler -> Assembly language program -> Assembler -> Object: Machine language module (together with Object: Library routine (machine language)) -> Linker -> Executable: Machine language program -> Loader -> Memory]

Linker
- Place code and data modules symbolically in memory
- Determine the address of data and instruction labels
- Patch both internal and external references

Object files for Unix typically contain:
- Header: size and position of components
- Text segment: machine code
- Data segment: static and dynamic variables
- Relocation info: identify absolute memory references
- Symbol table: name and location of labels, procedures and variables
- Debugging info: mapping source to object code, break points, etc.

Loading Executable Program

[Figure: MIPS memory layout, top to bottom: $sp at 7fff ffff hex, Stack, Dynamic data, Static data ($gp at 1000 8000 hex, starting at 1000 0000 hex), Text (pc at 0040 0000 hex), Reserved, 0]

To load an executable, the operating system follows these steps:
- Reads the executable file header to determine the size of the text and data segments
- Creates an address space large enough for the text and data
- Copies the instructions and data from the executable file into memory
- Copies the parameters (if any) to the main program onto the stack
- Initializes the machine registers and sets the stack pointer to the first free location
- Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program

Instruction Set Design Issues

Instruction Set Design Issues
- Number of Addresses
- Flow of Control
- Operand Types
- Addressing Modes
- Instruction Types
- Instruction Formats

Number of Addresses

Four categories
3-address machines
- 2 for the source operands and one for the result
2-address machines
- One address doubles as source and result
1-address machines
- Accumulator machines
- Accumulator is used for one source and result
0-address machines
- Stack machines
- Operands are taken from the stack
- Result goes onto the stack

Number of Addresses (cont.)

Three-address machines
Two for the source operands, one for the result
RISC processors use three addresses
Sample instructions
    add dest,src1,src2     M(dest)=[src1]+[src2]
    sub dest,src1,src2     M(dest)=[src1]-[src2]
    mult dest,src1,src2    M(dest)=[src1]*[src2]

Three addresses: Operand 1, Operand 2, Result
Example: a = b + c
Three-address instruction formats are not common, because they require a relatively long instruction format to hold the three address references.

Number of Addresses (cont.)

Example
C statement
    A = B + C * D - E + F + A
Equivalent code:
    mult T,C,D    ; T = C*D
    add T,T,B     ; T = B+C*D
    sub T,T,E     ; T = B+C*D-E
    add T,T,F     ; T = B+C*D-E+F
    add A,T,A     ; A = B+C*D-E+F+A
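The three-address sequence can be executed directly against a dictionary standing in for memory. This is a sketch: the tuple encoding and the sample values are assumptions for illustration, and T is the temporary location used in the example.

```python
import operator

OPS = {"add": operator.add, "sub": operator.sub, "mult": operator.mul}

def run_three_address(program, memory):
    """Execute (op, dest, src1, src2) instructions: M(dest) = [src1] op [src2]."""
    for op, dest, src1, src2 in program:
        memory[dest] = OPS[op](memory[src1], memory[src2])
    return memory

program = [
    ("mult", "T", "C", "D"),   # T = C*D
    ("add", "T", "T", "B"),    # T = B+C*D
    ("sub", "T", "T", "E"),    # T = B+C*D-E
    ("add", "T", "T", "F"),    # T = B+C*D-E+F
    ("add", "A", "T", "A"),    # A = B+C*D-E+F+A
]
memory = run_three_address(program, {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6})
```

Five instructions suffice, but each one names three memory locations, which is the long instruction format the text warns about.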

Number of Addresses (cont.)

Two-address machines
One address doubles (for source operand and result)
Last example makes a case for it
- Address T is used twice
Sample instructions
    load dest,src    M(dest)=[src]
    add dest,src     M(dest)=[dest]+[src]
    sub dest,src     M(dest)=[dest]-[src]
    mult dest,src    M(dest)=[dest]*[src]

Two addresses: one address doubles as operand and result
Example: a = a + b
The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation.

Number of Addresses (cont.)

Example
C statement
    A = B + C * D - E + F + A
Equivalent code:
    load T,C      ; T = C
    mult T,D      ; T = C*D
    add T,B       ; T = B+C*D
    sub T,E       ; T = B+C*D-E
    add T,F       ; T = B+C*D-E+F
    add A,T       ; A = B+C*D-E+F+A

Number of Addresses (cont.)

One-address machines

Use a special set of registers called accumulators
- Specify one source operand & receive the result

Called accumulator machines

Sample instructions

load addr      accum = [addr]
store addr     M[addr] = accum
add addr       accum = accum + [addr]
sub addr       accum = accum - [addr]
mult addr      accum = accum * [addr]

Number of Addresses (cont.)

Example

C statement

A = B + C * D - E + F + A

Equivalent code:

load C      ; load C into accum
mult D      ; accum = C*D
add B       ; accum = C*D+B
sub E       ; accum = B+C*D-E
add F       ; accum = B+C*D-E+F
add A       ; accum = B+C*D-E+F+A
store A     ; store accum contents in A
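The accumulator sequence can be traced the same way. This is a minimal sketch under assumed names (run_accumulator, a dictionary for memory, sample operand values); it is not taken from the slides.

```python
# Minimal sketch of a one-address (accumulator) machine; names and values are assumed.
def run_accumulator(program, memory):
    """Execute (op, addr) pairs against a single accumulator; return updated memory."""
    mem = dict(memory)
    accum = 0
    for op, addr in program:
        if op == "load":
            accum = mem[addr]
        elif op == "store":
            mem[addr] = accum
        elif op == "add":
            accum = accum + mem[addr]
        elif op == "sub":
            accum = accum - mem[addr]
        elif op == "mult":
            accum = accum * mem[addr]
        else:
            raise ValueError(f"unknown operation: {op}")
    return mem

# A = B + C * D - E + F + A
acc_program = [
    ("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
    ("add", "F"), ("add", "A"), ("store", "A"),
]
acc_memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}  # sample values (assumed)
acc_result = run_accumulator(acc_program, acc_memory)
print(acc_result["A"])
```

Every instruction carries only one address, but each one also touches memory, which is where the high memory traffic of accumulator machines comes from.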

Number of Addresses (cont.)

Zero-address machines

Stack supplies operands and receives the result
- Special instructions to load and store use an address

Called stack machines (Ex: HP3000, Burroughs B5500)

Sample instructions

push addr     push([addr])
pop addr      pop([addr])
add           push(pop + pop)
sub           push(pop - pop)
mult          push(pop * pop)

Number of Addresses (cont.)

Example

C statement

A = B + C * D - E + F + A

Equivalent code:

push E
push C
push D
mult
push B
add
sub
push F
add
push A
add
pop A
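A stack machine can be sketched in a few lines. The function name run_stack and the sample values are assumptions for illustration; the operand order for sub follows the slide's definition push(pop - pop), that is, the first value popped minus the second.

```python
# Minimal sketch of a zero-address (stack) machine; names and values are assumed.
def run_stack(program, memory):
    """Execute stack instructions; only push and pop carry an address."""
    mem = dict(memory)
    stack = []
    for instr in program:
        op = instr[0]
        if op == "push":
            stack.append(mem[instr[1]])
        elif op == "pop":
            mem[instr[1]] = stack.pop()
        else:
            first = stack.pop()    # value on top of the stack
            second = stack.pop()   # value just below it
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)  # push(pop - pop)
            elif op == "mult":
                stack.append(first * second)
            else:
                raise ValueError(f"unknown operation: {op}")
    return mem

# A = B + C * D - E + F + A
stack_program = [
    ("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
    ("push", "B"), ("add",), ("sub",),
    ("push", "F"), ("add",), ("push", "A"), ("add",), ("pop", "A"),
]
stack_memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}  # sample values (assumed)
stack_result = run_stack(stack_program, stack_memory)
print(stack_result["A"])
```

E is pushed first so that it sits below B + C*D when sub executes; the arithmetic instructions themselves need no address field at all.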

Load/Store Architecture

Instructions expect operands in internal processor registers

Special LOAD and STORE instructions move data between registers and memory

RISC uses this architecture

Reduces instruction length

Load/Store Architecture (cont.)

Sample instructions

load Rd,addr       Rd = [addr]
store addr,Rs      (addr) = Rs
add Rd,Rs1,Rs2     Rd = Rs1 + Rs2
sub Rd,Rs1,Rs2     Rd = Rs1 - Rs2
mult Rd,Rs1,Rs2    Rd = Rs1 * Rs2

Number of Addresses (cont.)

Example

C statement

A = B + C * D - E + F + A

Equivalent code:

load R1,B
load R2,C
load R3,D
load R4,E
load R5,F
load R6,A
mult R2,R2,R3
add  R2,R2,R1
sub  R2,R2,R4
add  R2,R2,R5
add  R2,R2,R6
store A,R2

Flow of Control

Default is sequential flow

Several instructions alter this default execution

Branches
- Unconditional
- Conditional
- Delayed branches

Procedure calls
- Delayed procedure calls

Flow of Control (cont.)

Branches

Unconditional
- Absolute address
- PC-relative
  * Target address is specified relative to PC contents
  * Relocatable code

Example: MIPS
- Absolute address

j target

- PC-relative

b target
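Why PC-relative branches give relocatable code can be shown with a short sketch. The function name and the addresses are hypothetical, and the offset is taken relative to the branch instruction's own address as a simplifying assumption (real ISAs typically measure from the following instruction and may scale the offset).

```python
# Minimal sketch: a PC-relative target depends only on where the branch sits,
# so the same encoded offset works wherever the code is loaded.
def pc_relative_target(pc, offset):
    """Target address = address of the branch (simplified) + signed offset."""
    return pc + offset

OFFSET = 0x10  # offset stored in the instruction (assumed value)

distances = []
for load_base in (0x1000, 0x8000):      # same code loaded at two base addresses
    branch_pc = load_base + 0x20        # branch sits 0x20 bytes into the code
    target = pc_relative_target(branch_pc, OFFSET)
    distances.append(target - load_base)
    print(hex(load_base), hex(target))

# The target is the same distance into the code in both cases. An absolute
# address stored in the instruction would have to be patched when the code moves.
```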

Flow of Control (cont.)

Branches

Conditional
- Jump is taken only if the condition is met

Two types
- Set-Then-Jump
  * Condition testing is separated from branching
  * Condition code registers are used to convey the condition test result
  * Condition code registers keep a record of the status of the last ALU operation, such as overflow condition
- Example: Pentium code

cmp AX,BX     ; compare AX and BX
je target     ; jump if equal

Flow of Control (cont.)

- Test-and-Jump
  * Single instruction performs condition testing and branching
- Example: MIPS instruction

beq Rsrc1,Rsrc2,target

Jumps to target if Rsrc1 = Rsrc2

Delayed branching

Control is transferred after executing the instruction that follows the branch instruction
- This instruction slot is called the delay slot

Improves efficiency

Highly pipelined RISC processors support delayed branching

Flow of Control (cont.)

Procedure calls

Facilitate modular programming

Require two pieces of information to return
- End of procedure
  * Pentium uses ret instruction
  * MIPS uses jr instruction
- Return address
  * In a (special) register: MIPS allows any general-purpose register
  * On the stack: Pentium

Flow of Control (cont.)

[Figure: delay slot]

Parameter Passing

Two basic techniques

Register-based (e.g., PowerPC, MIPS)
- Internal registers are used
  * Faster
  * Limit the number of parameters
  * Recursive procedure

Stack-based (e.g., Pentium)
- Stack is used
  * More general

2 Operand Types

Instructions support basic data types
- Characters
- Integers
- Floating-point

Instruction overload
- Same instruction for different data types
- Example: Pentium

mov AL,address      ; loads an 8-bit value
mov AX,address      ; loads a 16-bit value
mov EAX,address     ; loads a 32-bit value

Operand Types (cont.)

Separate instructions

Instructions specify the operand size

Example: MIPS

lb Rdest,address     ; loads a byte
lh Rdest,address     ; loads a halfword (16 bits)
lw Rdest,address     ; loads a word (32 bits)
ld Rdest,address     ; loads a doubleword (64 bits)

Similar instructions exist for store
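The effect of the operand size can be sketched with Python's struct module. The memory contents, the load helper, and the little-endian byte order are assumptions made for this illustration; MIPS itself can be configured for either byte order.

```python
# Minimal sketch: the same address yields different values depending on the
# operand size named by the instruction (byte, halfword, word, doubleword).
import struct

# Eight bytes of example memory (assumed contents)
memory = bytes([0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x08])

# Signed little-endian formats for 1, 2, 4 and 8 byte operands
FORMATS = {1: "<b", 2: "<h", 4: "<i", 8: "<q"}

def load(mem, address, size):
    """Read a signed integer of `size` bytes starting at `address`."""
    return struct.unpack_from(FORMATS[size], mem, address)[0]

for name, size in (("lb", 1), ("lh", 2), ("lw", 4), ("ld", 8)):
    print(name, hex(load(memory, 0, size)))
```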

3 Addressing Modes

How the operands are specified

Operands can be in three places
- Registers
  * Register addressing mode
- Part of instruction
  * Constant
  * Immediate addressing mode
  * All processors support these two addressing modes
- Memory
  * Difference between RISC and CISC
  * CISC supports a large variety of addressing modes
  * RISC follows load/store architecture

4 Instruction Types

Several types of instructions

Data movement
- Pentium: mov dest,src
- Some do not provide direct data movement instructions
- Indirect data movement

add Rdest,Rsrc,0     ; Rdest = Rsrc+0

Arithmetic and Logical
- Arithmetic
  * Integer and floating-point, signed and unsigned
  * add, subtract, multiply, divide
- Logical
  * and, or, not, xor

Instruction Types (cont.)

Condition code bits

S: Sign bit (0 = +, 1 = -)
Z: Zero bit (0 = nonzero, 1 = zero)
O: Overflow bit (0 = no overflow, 1 = overflow)
C: Carry bit (0 = no carry, 1 = carry)

Example: Pentium

cmp count,25     ; compare count to 25
                 ; (subtract 25 from count)
je target        ; jump if equal
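How a compare instruction sets the four condition code bits can be sketched as follows. The function name compare_flags, the default 32-bit width, and the treatment of carry as a borrow (the x86 convention for subtraction) are assumptions of this sketch.

```python
# Minimal sketch: condition code bits after computing a - b, as a compare does.
def compare_flags(a, b, bits=32):
    """Return the S, Z, O and C bits produced by the subtraction a - b."""
    mask = (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    a &= mask                      # view both operands as bit patterns
    b &= mask
    result = (a - b) & mask
    return {
        "S": int(bool(result & sign_bit)),     # sign of the result
        "Z": int(result == 0),                 # result is zero
        # signed overflow: operands differ in sign and the result's sign differs from a's
        "O": int(bool((a ^ b) & (a ^ result) & sign_bit)),
        "C": int(a < b),                       # borrow out (unsigned a < b)
    }

def jump_if_equal(flags):
    """je is taken exactly when the zero bit is set."""
    return flags["Z"] == 1

print(compare_flags(25, 25))
print(compare_flags(3, 25))
```

The branch instruction never sees the operands; it only inspects the bits left behind by the compare, which is the set-then-jump style.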

Instruction Types (cont.)

Flow control and I/O instructions
- Branch
- Procedure call
- Interrupts

I/O instructions
- Memory-mapped I/O
  * Most processors support memory-mapped I/O
  * No separate instructions for I/O
- Isolated I/O
  * Pentium supports isolated I/O
  * Separate I/O instructions

in AX,io_port      ; read from an I/O port
out io_port,AX     ; write to an I/O port

5 Instruction Formats

Two types

Fixed-length
- Used by RISC processors
- 32-bit RISC processors use 32-bit-wide instructions
  * Examples: SPARC, MIPS, PowerPC

Variable-length
- Used by CISC processors
- Memory operands need more bits to specify

Opcode
- Major and exact operation

Examples of Instruction Formats

[Figure: examples of instruction formats]

RISC (Reduced Instruction Set Computer)
versus
CISC (Complex Instruction Set Computer)

RISC vs CISC

The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.

RISC systems access memory only with explicit load and store instructions.

In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.

RISC vs CISC

The difference between CISC and RISC becomes evident through the basic computer performance equation:

time/program = time/cycle × cycles/instruction × instructions/program

RISC systems shorten execution time by reducing the clock cycles per instruction.

CISC systems improve performance by reducing the number of instructions per program.

RISC vs CISC

The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.

The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.

With fixed-length instructions, RISC lends itself to pipelining and speculative execution.

RISC vs CISC

Consider the two program fragments:

CISC                  RISC
mov ax, 10            mov ax, 0
mov bx, 5             mov bx, 10
mul bx, ax            mov cx, 5
                      Begin: add ax, bx
                             loop Begin

The total clock cycles for the CISC version might be:
(2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles

While the clock cycles for the RISC version are:
(3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles

With the RISC clock cycle being shorter as well, RISC gives us much faster execution speeds.
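The cycle counts for the two fragments can be checked with a small simulation. This is a sketch under stated assumptions: the operands are 10 and 5, every simple instruction costs 1 cycle, the multiply costs 30 cycles, and the function names are invented for this illustration.

```python
# Minimal sketch: count cycles for a CISC multiply versus a RISC add loop.
def cisc_version(a, b, mul_cycles=30):
    """mov, mov, mul: few instructions, one of them expensive (assumed cost)."""
    cycles = 0
    ax = a
    cycles += 1               # mov ax, a
    bx = b
    cycles += 1               # mov bx, b
    bx = bx * ax
    cycles += mul_cycles      # mul bx, ax
    return bx, cycles

def risc_version(a, b):
    """Multiply by repeated addition; every instruction takes one cycle. Needs b >= 1."""
    cycles = 0
    ax = 0
    cycles += 1               # mov ax, 0
    bx = a
    cycles += 1               # mov bx, a
    cx = b
    cycles += 1               # mov cx, b
    while True:
        ax += bx
        cycles += 1           # add ax, bx
        cx -= 1
        cycles += 1           # loop Begin: decrement cx, branch while cx != 0
        if cx == 0:
            break
    return ax, cycles

print(cisc_version(10, 5))
print(risc_version(10, 5))
```

The RISC count grows with the loop count, so the comparison depends on the operand values as well as on the assumed cost of the multiply.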

RISC vs CISC

Because of their load-store ISAs, RISC architectures require a large number of CPU registers.

These registers provide fast access to data during sequential program execution.

They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.

Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.

RISC vs CISC

[Figure: overlapping register windows]

This is how registers can be overlapped in a RISC system.

The current window pointer (CWP) points to the active register window.

RISC vs CISC

It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.

Some RISC systems provide more extravagant instruction sets than some CISC systems.

Some systems combine both approaches.

The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.

RISC vs CISC

RISC                                           CISC
Multiple register sets.                        Single register set.
Three operands per instruction.                One or two register operands per instruction.
Parameter passing through register windows.    Parameter passing through memory.
Single-cycle instructions.                     Multiple-cycle instructions.
Hardwired control.                             Microprogrammed control.
Highly pipelined.                              Less pipelined.
Simple instructions, few in number.            Many complex instructions.
Fixed-length instructions.                     Variable-length instructions.
Complexity in compiler.                        Complexity in microcode.
Only LOAD/STORE instructions access memory.    Many instructions can access memory.
Few addressing modes.                          Many addressing modes.

Summary

Instruction Set Design Issues

Instruction set design issues include:

Where are operands stored?
- registers, memory, stack, accumulator

How many explicit operands are there?
- 0, 1, 2, or 3

How is the operand location specified?
- register, immediate, indirect, . . .

What type & size of operands are supported?
- byte, int, float, double, string, vector . . .

What operations are supported?
- add, sub, mul, move, compare . . .

More About General Purpose Registers

Why do almost all new architectures use GPRs?

Registers are much faster than memory (even cache)
- Register values are available immediately
- When memory isn't ready, processor must wait ("stall")

Registers are convenient for variable storage
- Compiler assigns some variables just to registers
- More compact code since small fields specify registers (compared to memory addresses)

[Figure: memory hierarchy: Processor (Registers, Cache), Memory, Disk]

What Operations are Needed

Arithmetic + Logical
- Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
- Logical operation: AND, OR, XOR, NOT

Data Transfer - copy, load, store

Control - branch, jump, call, return

Floating Point
- ADDF, MULF, DIVF
- Same as arithmetic but usually take bigger operands

Decimal - ADDD, CONVERT

String - move, compare, search

Graphics - pixel and vertex, compression/decompression operations

Stack Architecture: Pros and Cons

Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures

Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures

Accumulator Architecture: Pros and Cons

Pros
- Very low hardware requirements
- Easy to design and understand

Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic

Memory-Memory Architecture: Pros and Cons

Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)

Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required

Memory-Register Architecture: Pros and Cons

Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density

Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers

Condition Evaluation

Compare/branch can be efficient if the majority of conditions are comparisons with zero

Remember to focus on the common case

Based on SPEC on MIPS

Frequency of Types of Comparison

Data is based on SPEC on Alpha

Different benchmark and machine set new design priority

DSPs support repeat instruction for for loops (vectors) using registers

Supporting Procedures

Execution of a procedure follows these steps:
- Store parameters in a place accessible to the procedure
- Transfer control to the procedure
- Acquire the storage resources needed for the procedure
- Perform the desired task
- Store the result value in a place accessible to the calling program
- Return control to the point of origin

The hardware provides a program counter to trace instruction flow and manage transfer of control

Parameter Passing
- Registers can be used for passing a small number of parameters
- A stack is used to spill registers of the current context and make room for the called procedure to run and to allow for large parameters to be passed
- Storage of machine state can be performed by caller or callee
- Handling of shared variables is important to ensure correct semantics and thus requires clear specifications in the library interface

Global variables stored in registers need careful handling

Type and Size of Operands

The type of an operand is designated by encoding it in the instruction's operation code

The type of an operand, e.g. single precision float, effectively gives its size

Common operand types include character, half word and word size integer, single- and double-precision floating point

Characters are almost always in ASCII, integers are in 2's complement, and floating point is in IEEE 754

The 16-bit Unicode used in Java is gaining popularity due to its support for the international character sets

For business applications, some architectures support a decimal format in binary coded decimal (BCD)

Depending on the size of the word, the complexity of handling different operand types differs

DSP offers fixed point data types to support high precision floating point arithmetic and to allow sharing a single exponent for multiple numbers

For graphics applications, vertex and pixel operands are added features
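Two of the encodings named above, 2's complement integers and binary coded decimal, can be illustrated with a short sketch. The function names and the choice of packed BCD (one decimal digit per 4-bit nibble) are assumptions of this example.

```python
# Minimal sketch of two operand encodings; helper names are hypothetical.
def to_twos_complement(value, bits=8):
    """Return the bit pattern of a signed integer in `bits`-wide 2's complement."""
    low, high = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not low <= value <= high:
        raise ValueError(f"{value} does not fit in {bits} bits")
    return value & ((1 << bits) - 1)

def to_packed_bcd(value):
    """Encode a non-negative integer with one decimal digit per 4-bit nibble."""
    if value < 0:
        raise ValueError("this sketch handles non-negative values only")
    encoded = 0
    for position, digit in enumerate(reversed(str(value))):
        encoded |= int(digit) << (4 * position)
    return encoded

print(hex(to_twos_complement(-1)))    # all ones
print(hex(to_packed_bcd(1234)))       # hex digits mirror the decimal digits
```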

Size of Operands

[Figure: frequency of reference by size, based on SPEC2000 on Alpha]

Double-word data type is used for double-precision floating point operations and address storage in machines with a 64-bit wide address bus

Words are used for integer operations and for 32-bit address bus machines

Because of the mix in SPEC, word and double-word data types dominate

Instruction Representation

Humans are taught to think in base 10 (decimal) but numbers may be represented in any base (123 in base 10 = 1111011 in binary or base 2)

Numbers are stored in computers as a series of high and low electronic signals (binary numbers)

Binary digits are called bits and considered the atom of computing

Each piece of an instruction is a number and placing these numbers together forms the instruction

The assembler translates the assembly symbolic instructions into machine language instructions (machine code)

Example:

Assembly:                 add $t0, $s1, $s2

M/C language (decimal):   0       17      18      8       0       32

M/C language (binary):    000000  10001   10010   01000   00000   100000
                          6 bits  5 bits  5 bits  5 bits  5 bits  6 bits

Note: the MIPS compiler by default maps $s0,...,$s7 to reg. 16-23 and $t0,...,$t7 to reg. 8-15
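Placing the field numbers together can be sketched as a few shifts and ORs. The function name encode_r_format is an assumption; the field widths (6, 5, 5, 5, 5, 6 bits) and the values for add $t0, $s1, $s2 (op 0, rs 17, rt 18, rd 8, shamt 0, funct 32) follow the MIPS register format.

```python
# Minimal sketch: pack the six fields of a MIPS register-format instruction
# into one 32-bit word.
def encode_r_format(op, rs, rt, rd, shamt, funct):
    """Fields are 6, 5, 5, 5, 5 and 6 bits wide, from most to least significant."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct

# add $t0, $s1, $s2: $s1 is register 17, $s2 is register 18, $t0 is register 8
word = encode_r_format(op=0, rs=17, rt=18, rd=8, shamt=0, funct=32)
print(f"{word:032b}")
print(f"0x{word:08x}")
```

Because every field has a fixed position, decoding is the reverse: shift right and mask with the field width.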

Encoding an Instruction Set

Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation

The operation is typically specified in one field called opcode

The addressing mode for the operand can be encoded with the operation or specified through a separate identifier in case of a large number of supported modes

The architecture must balance between several competing factors:
- Desire to support as many registers and addressing modes as possible
- Effect of operand specification on the size of the instruction (program)
- Desire to simplify instruction fetching and decoding during execution

Fixed size instruction encoding simplifies the CPU design while limiting the addressing modes supported

An architect caring about the code size can use variable size encoding

A hybrid approach is to allow variability by supporting multiple-sized instructions

Encoding Examples

[Figure: encoding examples]

MIPS Instruction Format

Register-format instructions:

op       rs       rt       rd       shamt    funct
6 bits   5 bits   5 bits   5 bits   5 bits   6 bits

op: Basic operation of the instruction, traditionally called opcode
rs: The first register source operand
rt: The second register source operand
rd: The register destination operand; it gets the result of the operation
shamt: Shift amount
funct: This field selects the specific variant of the operation of the op field

Immediate-type instructions:

op       rs       rt       address
6 bits   5 bits   5 bits   16 bits

Some instructions need longer fields than provided for large value constants

The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register

Example: lw $t0, 32($s3)     # Temporary register $t0 gets A[8]

Instruction   Format   op   rs    rt    rd    shamt   funct   address
add           R        0    reg   reg   reg   0       32      N/A
sub           R        0    reg   reg   reg   0       34      N/A
lw            I        35   reg   reg   N/A   N/A     N/A     address
sw            I        43   reg   reg   N/A   N/A     N/A     address

The Stored Program Concept

Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept

Today's computers are built on two key principles:
- Instructions are represented as numbers
- Programs can be stored in memory to be read or written just like numbers

The power of the concept: memory can contain
- the source code for an editor
- the compiled m/c code for the editor
- the text that the compiled program is using
- the compiler that generated the code

[Figure: Processor and Memory; memory holds accounting program (machine code), editor program (machine code), C compiler (machine code), payroll data, book text, source code in C for editor program]

Compiling if-then-else in MIPS

Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiled MIPS code for the following C if statement?

if (i == j) f = g + h; else f = g - h;

[Figure: flowchart. Test i == j; if i = j, f = g + h; if i ≠ j, Else: f = g - h; both paths meet at Exit:]

MIPS:

      bne $s3, $s4, Else     # go to Else if i ≠ j
      add $s0, $s1, $s2      # f = g + h (skipped if i ≠ j)
      j Exit
Else: sub $s0, $s1, $s2      # f = g - h (skipped if i = j)
Exit:

Typical Compilation

[Figure: typical compilation]

Major Types of Optimization

Optimization name: Explanation (Frequency)

High-level (at or near source level; machine independent)
- Procedure integration: Replace procedure call by procedure body (N.M.)

Local (within straight-line code)
- Common subexpression elimination: Replace two instances of the same computation by a single copy
- Constant propagation: Replace all instances of a variable that is assigned a constant with the constant
- Stack height reduction: Rearrange expression tree to minimize resources needed for expression evaluation (N.M.)

Global (across a branch)
- Global common subexpression elimination: Same as local, but this version crosses branches
- Copy propagation: Replace all instances of a variable A that has been assigned X (i.e., A = X) with X
- Code motion: Remove code from a loop that computes the same value each iteration of the loop
- Induction variable elimination: Simplify/eliminate array addressing calculations within loops

Machine-dependent (depends on machine knowledge)
- Strength reduction: Many examples, such as replace multiply by a constant with adds and shifts (N.M.)
- Pipeline scheduling: Reorder instructions to improve pipeline performance (N.M.)
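Strength reduction, the first machine-dependent entry, can be shown with a multiply by the constant 10. The constant and the function names are chosen for illustration only; a compiler would pick the shift/add sequence from the constant's binary form.

```python
# Minimal sketch of strength reduction: x * 10 rewritten as shifts and an add.
def times_ten_multiply(x):
    """Original form: one multiply instruction."""
    return x * 10

def times_ten_shift_add(x):
    """Reduced form: 10*x = 8*x + 2*x = (x << 3) + (x << 1)."""
    return (x << 3) + (x << 1)

for value in (0, 1, 7, 123):
    print(value, times_ten_multiply(value), times_ten_shift_add(value))
```

Whether the rewrite pays off depends on the relative cost of multiply, shift and add on the target machine, which is why the slide lists it under machine-dependent optimizations.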

Effect of Compiler Optimization

easurements taken on S

[Figure: effect of optimization, by program and compiler optimization level]

Level 0: non-optimized code

Level 1: local optimization

Level 2: global optimization, s/w pipelining

Level 3: adds procedure integration

Compiler Support for Multimedia Instr.

Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics)

Intel added a new set of instructions, called Streaming SIMD Extension

A major advantage of vector computers is hiding latency of memory access by loading multiple elements and then overlapping execution with data transfer

Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations
- Strided addressing allows memory access in increments larger than one
- Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data

Supporting vector operation without strided addressing, such as Intel's MMX, limits the potential speedup

Such limited support for vector processing makes the use of vectorizing compiler optimization unpopular and restricts its scope to hand coded routines

SIMD instructions on MMX and AltiVec tend to be solutions, not primitives
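The two vector addressing styles can be sketched with plain Python lists standing in for memory. The function names, the list model of memory, and the sample indices are assumptions of this illustration, not a description of any particular vector ISA.

```python
# Minimal sketch of strided and gather/scatter addressing.
def strided_load(memory, base, stride, count):
    """Load `count` elements starting at `base`, stepping by `stride`."""
    return [memory[base + i * stride] for i in range(count)]

def gather(memory, indices):
    """Load the elements whose addresses are listed in an index vector."""
    return [memory[i] for i in indices]

def scatter(memory, indices, values):
    """Store values back to the addresses listed in the index vector."""
    for i, value in zip(indices, values):
        memory[i] = value

memory = list(range(100, 116))           # 16 example memory cells (assumed contents)
column = strided_load(memory, 1, 4, 3)   # e.g. one column of a 4-wide row-major array
picked = gather(memory, [7, 0, 12])      # arbitrary locations named by an index vector
scatter(memory, [7, 0, 12], [-1, -2, -3])
print(column, picked, memory[:2])
```

Without these modes, each distant element needs its own scalar load before the vector operation can start, which is the speedup limit the slide attributes to MMX.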

Starting a Program

[Figure: C program -> Compiler -> Assembly language program -> Assembler -> Object: Machine language module (plus Object: Library routine (machine language)) -> Linker -> Executable: Machine language program -> Loader -> Memory]

Linker
- Place code & data modules symbolically in memory
- Determine the address of data & instruction labels
- Patch both internal & external references

Object files for Unix typically contain:
- Header: size & position of components
- Text segment: machine code
- Data segment: static and dynamic variables
- Relocation info: identify absolute memory references
- Symbol table: name & location of labels, procedures and variables
- Debugging info: mapping source to object code, break points, etc.

Loading Executable Program

[Figure: MIPS memory layout. Reserved from address 0; Text from pc = 0040 0000 hex; Static data from 1000 0000 hex, with $gp = 1000 8000 hex; Dynamic data above it; Stack growing down from $sp = 7fff ffff hex]

To load an executable, the operating system follows these steps:
- Reads the executable file header to determine the size of text and data segments
- Creates an address space large enough for the text and data
- Copies the instructions and data from the executable file into memory
- Copies the parameters (if any) to the main program onto the stack
- Initializes the machine registers and sets the stack pointer to the first free location
- Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 52101

Instruction Set Design Issues

- Number of Addresses
- Flow of Control
- Operand Types
- Addressing Modes
- Instruction Types
- Instruction Formats

Number of Addresses

Four categories:
- 3-address machines
  - 2 for the source operands and one for the result
- 2-address machines
  - One address doubles as source and result
- 1-address machines
  - Accumulator machines
  - Accumulator is used for one source and the result
- 0-address machines
  - Stack machines
  - Operands are taken from the stack
  - Result goes onto the stack

Number of Addresses (cont.)

Three-address machines
- Two for the source operands, one for the result
- RISC processors use three addresses
- Sample instructions
    add  dest,src1,src2    M(dest) = [src1] + [src2]
    sub  dest,src1,src2    M(dest) = [src1] - [src2]
    mult dest,src1,src2    M(dest) = [src1] * [src2]

Three addresses: Operand 1, Operand 2, Result
Example: a = b + c

Three-address instruction formats are not common because they require a relatively long instruction format to hold the three address references.

Number of Addresses (cont.)

Example
C statement:
    A = B + C * D - E + F + A
Equivalent code:
    mult T,C,D    ; T = C*D
    add  T,T,B    ; T = B + C*D
    sub  T,T,E    ; T = B + C*D - E
    add  T,T,F    ; T = B + C*D - E + F
    add  A,T,A    ; A = B + C*D - E + F + A
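The three-address sequence above can be checked with a small interpreter. This is only an illustrative sketch: the function `run_three_address`, the tuple encoding of instructions, and the sample values placed in memory are assumptions made here, not part of the original slides.

```python
# Toy three-address machine (illustrative): every operand names a memory cell.
OPS = {
    "add": lambda x, y: x + y,
    "sub": lambda x, y: x - y,
    "mult": lambda x, y: x * y,
}

def run_three_address(program, memory):
    """Execute (op, dest, src1, src2) tuples: M[dest] = M[src1] op M[src2]."""
    for op, dest, src1, src2 in program:
        memory[dest] = OPS[op](memory[src1], memory[src2])
    return memory

# The slide's code for A = B + C*D - E + F + A.
program = [
    ("mult", "T", "C", "D"),
    ("add",  "T", "T", "B"),
    ("sub",  "T", "T", "E"),
    ("add",  "T", "T", "F"),
    ("add",  "A", "T", "A"),
]

# Sample values are arbitrary (hypothetical), chosen only for the check.
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6, "T": 0}
expected = memory["B"] + memory["C"] * memory["D"] - memory["E"] + memory["F"] + memory["A"]
run_three_address(program, memory)
print(memory["A"], expected)
```

Five instructions suffice here, but each one carries three address fields, which is the instruction-length cost the slide mentions.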

Number of Addresses (cont.)

Two-address machines
- One address doubles (for source operand & result)
- Last example makes a case for it
  - Address T is used twice
- Sample instructions
    load dest,src    M(dest) = [src]
    add  dest,src    M(dest) = [dest] + [src]
    sub  dest,src    M(dest) = [dest] - [src]
    mult dest,src    M(dest) = [dest] * [src]

Two addresses: one address doubles as operand and result
Example: a = a + b

The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation.

Number of Addresses (cont.)

Example
C statement:
    A = B + C * D - E + F + A
Equivalent code:
    load T,C    ; T = C
    mult T,D    ; T = C*D
    add  T,B    ; T = B + C*D
    sub  T,E    ; T = B + C*D - E
    add  T,F    ; T = B + C*D - E + F
    add  A,T    ; A = B + C*D - E + F + A

Number of Addresses (cont.)

One-address machines
- Use a special set of registers called accumulators
  - Specify one source operand & receive the result
- Called accumulator machines
- Sample instructions
    load  addr    accum = [addr]
    store addr    M[addr] = accum
    add   addr    accum = accum + [addr]
    sub   addr    accum = accum - [addr]
    mult  addr    accum = accum * [addr]

Number of Addresses (cont.)

Example
C statement:
    A = B + C * D - E + F + A
Equivalent code:
    load  C    ; load C into accum
    mult  D    ; accum = C*D
    add   B    ; accum = C*D + B
    sub   E    ; accum = B + C*D - E
    add   F    ; accum = B + C*D - E + F
    add   A    ; accum = B + C*D - E + F + A
    store A    ; store accum contents in A
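A minimal sketch of the accumulator sequence above, assuming a single accumulator and a dictionary standing in for memory. The names `run_accumulator` and `data_accesses`, and the sample values, are hypothetical and only serve the illustration.

```python
def run_accumulator(program, memory):
    """Run (op, addr) pairs on a one-accumulator machine.

    Returns the number of data memory accesses, one per instruction,
    since every instruction names exactly one memory operand.
    """
    accum = 0
    data_accesses = 0
    for op, addr in program:
        data_accesses += 1
        if op == "load":
            accum = memory[addr]
        elif op == "store":
            memory[addr] = accum
        elif op == "add":
            accum = accum + memory[addr]
        elif op == "sub":
            accum = accum - memory[addr]
        elif op == "mult":
            accum = accum * memory[addr]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return data_accesses

# The slide's code for A = B + C*D - E + F + A.
acc_program = [
    ("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
    ("add", "F"), ("add", "A"), ("store", "A"),
]
acc_memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}  # arbitrary sample values
accesses = run_accumulator(acc_program, acc_memory)
print(acc_memory["A"], accesses)
```

Each instruction is short because it names only one address, but every one of them touches memory.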

Number of Addresses (cont.)

Zero-address machines
- Stack supplies operands and receives the result
  - Special instructions to load and store use an address
- Called stack machines (Ex: HP3000, Burroughs B5500)
- Sample instructions
    push addr    push([addr])
    pop  addr    pop([addr])
    add          push(pop + pop)
    sub          push(pop - pop)
    mult         push(pop * pop)

Number of Addresses (cont.)

Example
C statement:
    A = B + C * D - E + F + A
Equivalent code:
    push E
    push C
    push D
    mult
    push B
    add
    sub
    push F
    add
    push A
    add
    pop  A
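The stack code above depends on operand order for `sub`: with `push(pop - pop)`, the first value popped (the top of the stack) is the left operand. The sketch below assumes exactly that convention; `run_stack_machine` and the sample memory values are hypothetical names and data introduced for illustration.

```python
def run_stack_machine(program, memory):
    """Run a zero-address program; only push/pop carry an address."""
    stack = []
    for instr in program:
        op = instr[0]
        if op == "push":
            stack.append(memory[instr[1]])
        elif op == "pop":
            memory[instr[1]] = stack.pop()
        else:
            first = stack.pop()    # top of stack
            second = stack.pop()   # next entry below it
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)   # push(pop - pop)
            elif op == "mult":
                stack.append(first * second)
            else:
                raise ValueError(f"unknown opcode: {op}")
    return stack

# The slide's code for A = B + C*D - E + F + A.
stack_program = [
    ("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
    ("push", "B"), ("add",), ("sub",),
    ("push", "F"), ("add",), ("push", "A"), ("add",), ("pop", "A"),
]
stack_memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}  # arbitrary sample values
leftover = run_stack_machine(stack_program, stack_memory)
print(stack_memory["A"], leftover)
```

Pushing E first is what makes the later `sub` compute (B + C*D) - E rather than the reverse.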

Load/Store Architecture

- Instructions expect operands in internal processor registers
- Special LOAD and STORE instructions move data between registers and memory
- RISC uses this architecture
- Reduces instruction length

Load/Store Architecture (cont.)

Sample instructions
    load  Rd,addr      Rd = [addr]
    store addr,Rs      (addr) = Rs
    add   Rd,Rs1,Rs2   Rd = Rs1 + Rs2
    sub   Rd,Rs1,Rs2   Rd = Rs1 - Rs2
    mult  Rd,Rs1,Rs2   Rd = Rs1 * Rs2

Number of Addresses (cont.)

Example
C statement:
    A = B + C * D - E + F + A
Equivalent code:
    load  R1,B
    load  R2,C
    load  R3,D
    load  R4,E
    load  R5,F
    load  R6,A
    mult  R2,R2,R3
    add   R2,R2,R1
    sub   R2,R2,R4
    add   R2,R2,R5
    add   R2,R2,R6
    store A,R2
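A short sketch of the load/store version, assuming registers and memory are kept in separate dictionaries so that only `load` and `store` can cross between them. The function name `run_load_store`, the instruction tuples, and the sample values are assumptions for illustration.

```python
def run_load_store(program, regs, memory):
    """Arithmetic touches registers only; load/store are the only memory accesses."""
    memory_accesses = 0
    for instr in program:
        op = instr[0]
        if op == "load":                 # load Rd,addr
            regs[instr[1]] = memory[instr[2]]
            memory_accesses += 1
        elif op == "store":              # store addr,Rs
            memory[instr[1]] = regs[instr[2]]
            memory_accesses += 1
        elif op == "add":
            regs[instr[1]] = regs[instr[2]] + regs[instr[3]]
        elif op == "sub":
            regs[instr[1]] = regs[instr[2]] - regs[instr[3]]
        elif op == "mult":
            regs[instr[1]] = regs[instr[2]] * regs[instr[3]]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return memory_accesses

ls_program = [
    ("load", "R1", "B"), ("load", "R2", "C"), ("load", "R3", "D"),
    ("load", "R4", "E"), ("load", "R5", "F"), ("load", "R6", "A"),
    ("mult", "R2", "R2", "R3"), ("add", "R2", "R2", "R1"),
    ("sub", "R2", "R2", "R4"), ("add", "R2", "R2", "R5"),
    ("add", "R2", "R2", "R6"), ("store", "A", "R2"),
]
ls_regs = {}
ls_memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}  # arbitrary sample values
ls_accesses = run_load_store(ls_program, ls_regs, ls_memory)
print(ls_memory["A"], ls_accesses)
```

The program is longer (twelve instructions), but the five arithmetic instructions never wait on memory.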

Flow of Control

- Default is sequential flow
- Several instructions alter this default execution
  - Branches
    - Unconditional
    - Conditional
    - Delayed branches
  - Procedure calls
    - Delayed procedure calls

Flow of Control (cont.)

Branches
- Unconditional
  - Absolute address
  - PC-relative
    - Target address is specified relative to PC contents
    - Relocatable code
- Example: MIPS
  - Absolute address
      j target
  - PC-relative
      b target
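How a target address is formed in the two cases can be sketched as follows, using the standard MIPS32 rules: a branch adds a sign-extended 16-bit word offset to PC + 4, while `j` combines a 26-bit word index with the upper four bits of PC + 4 (so it is absolute only within a 256 MB region). The function names are hypothetical, and delay slots are ignored.

```python
def sign_extend16(value):
    """Interpret the low 16 bits of value as a two's complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

def branch_target(pc, offset16):
    """PC-relative: target = (PC + 4) + (sign-extended offset * 4)."""
    return (pc + 4 + (sign_extend16(offset16) << 2)) & 0xFFFFFFFF

def jump_target(pc, index26):
    """j: upper 4 bits of PC + 4, then the 26-bit word index shifted left by 2."""
    return ((pc + 4) & 0xF0000000) | ((index26 & 0x03FFFFFF) << 2)

print(hex(branch_target(0x00400000, 3)))
print(hex(jump_target(0x00400000, 0x00100004)))
```

Because the branch encodes only a distance, the same instruction bits remain valid wherever the code is loaded, which is what makes PC-relative code relocatable.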

Flow of Control (cont.)

Branches
- Conditional
  - Jump is taken only if the condition is met
- Two types
  - Set-Then-Jump
    - Condition testing is separated from branching
    - Condition code registers are used to convey the condition test result
    - Condition code registers keep a record of the status of the last ALU operation, such as an overflow condition
  - Example: Pentium code
      cmp AX,BX     ; compare AX and BX
      je  target    ; jump if equal

Flow of Control (cont.)

  - Test-and-Jump
    - Single instruction performs condition testing and branching
  - Example: MIPS instruction
      beq Rsrc1,Rsrc2,target
    Jumps to target if Rsrc1 = Rsrc2

Delayed branching
- Control is transferred after executing the instruction that follows the branch instruction
  - This instruction slot is called the delay slot
- Improves efficiency
- Highly pipelined RISC processors support it

Flow of Control (cont.)

Procedure calls
- Facilitate modular programming
- Require two pieces of information to return
  - End of procedure
    - Pentium uses the ret instruction
    - MIPS uses the jr instruction
  - Return address
    - In a (special) register
      - MIPS allows any general-purpose register
    - On the stack
      - Pentium

Flow of Control (cont.)

[Figure: delay slot]

Parameter Passing

Two basic techniques
- Register-based (e.g., PowerPC, MIPS)
  - Internal registers are used
    - Faster
    - Limits the number of parameters
    - Recursive procedures
- Stack-based (e.g., Pentium)
  - Stack is used
    - More general

2. Operand Types

- Instructions support basic data types
  - Characters
  - Integers
  - Floating-point
- Instruction overload
  - Same instruction for different data types
  - Example: Pentium
      mov AL,address     ; loads an 8-bit value
      mov AX,address     ; loads a 16-bit value
      mov EAX,address    ; loads a 32-bit value

Operand Types (cont.)

- Separate instructions
  - Instructions specify the operand size
  - Example: MIPS
      lb Rdest,address    ; loads a byte
      lh Rdest,address    ; loads a halfword (16 bits)
      lw Rdest,address    ; loads a word (32 bits)
      ld Rdest,address    ; loads a doubleword (64 bits)
  - Similar instructions exist for store
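What the size-specific loads return can be sketched with Python's `struct` module. The sketch assumes big-endian byte order and signed (sign-extending) loads; the byte values in `memory` and the helper names `lb`, `lh`, `lw`, `ld` are illustrative stand-ins that mirror the mnemonics, not an emulation of any particular processor.

```python
import struct

# Eight bytes of hypothetical memory contents, addresses 0..7.
memory = bytes([0x80, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07])

def lb(addr):
    """Load a signed byte (8 bits)."""
    return struct.unpack_from(">b", memory, addr)[0]

def lh(addr):
    """Load a signed halfword (16 bits), big-endian."""
    return struct.unpack_from(">h", memory, addr)[0]

def lw(addr):
    """Load a signed word (32 bits), big-endian."""
    return struct.unpack_from(">i", memory, addr)[0]

def ld(addr):
    """Load a signed doubleword (64 bits), big-endian."""
    return struct.unpack_from(">q", memory, addr)[0]

print(lb(0), lh(0), lw(0), ld(0))
```

The same starting address yields four different values because the opcode alone decides how many bytes are read and how the top bit is interpreted.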

3. Addressing Modes

- How the operands are specified
- Operands can be in three places
  - Registers
    - Register addressing mode
  - Part of the instruction
    - Constant
    - Immediate addressing mode
    - All processors support these two addressing modes
  - Memory
    - Difference between RISC and CISC
    - CISC supports a large variety of addressing modes
    - RISC follows the load/store architecture

4. Instruction Types

Several types of instructions
- Data movement
  - Pentium: mov dest,src
  - Some processors do not provide direct data movement instructions
  - Indirect data movement
      add Rdest,Rsrc,0    ; Rdest = Rsrc + 0
- Arithmetic and Logical
  - Arithmetic
    - Integer and floating-point, signed and unsigned
    - add, subtract, multiply, divide
  - Logical
    - and, or, not, xor

Instruction Types (cont.)

Condition code bits
- S: Sign bit (0 = +, 1 = -)
- Z: Zero bit (0 = nonzero, 1 = zero)
- O: Overflow bit (0 = no overflow, 1 = overflow)
- C: Carry bit (0 = no carry, 1 = carry)

Example: Pentium
    cmp count,25    ; compare count to 25
                    ; (subtract 25 from count)
    je  target      ; jump if equal
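The four condition code bits can be derived from the subtraction that a compare performs. The sketch below assumes 32-bit operands and x86-style conventions (carry means an unsigned borrow); the function name `cmp_flags` and the dictionary layout are hypothetical.

```python
def cmp_flags(a, b, bits=32):
    """Return the S, Z, O, C bits produced by computing a - b."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "S": int(bool(result & sign)),                   # result is negative
        "Z": int(result == 0),                           # operands are equal
        # Signed overflow: operands differ in sign and the result's sign
        # differs from the sign of the first operand.
        "O": int(bool((a ^ b) & (a ^ result) & sign)),
        "C": int(a < b),                                 # unsigned borrow
    }

flags = cmp_flags(25, 25)
jump_taken = flags["Z"] == 1   # what "jump if equal" tests
print(flags, jump_taken)
```

The branch never looks at the operands themselves; it only reads the Z bit left behind by the compare, which is why the two steps can be separate instructions.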

Instruction Types (cont.)

- Flow control and I/O instructions
  - Branch
  - Procedure call
  - Interrupts
- I/O instructions
  - Memory-mapped I/O
    - Most processors support memory-mapped I/O
    - No separate instructions for I/O
  - Isolated I/O
    - Pentium supports isolated I/O
    - Separate I/O instructions
        in  AX,io_port    ; read from an I/O port
        out io_port,AX    ; write to an I/O port

5. Instruction Formats

Two types
- Fixed-length
  - Used by RISC processors
  - 32-bit RISC processors use 32-bit wide instructions
    - Examples: SPARC, MIPS, PowerPC
- Variable-length
  - Used by CISC processors
  - Memory operands need more bits to specify

Opcode
- Major and exact operation

Examples of Instruction Formats

RISC (Reduced Instruction Set Computer)
versus
CISC (Complex Instruction Set Computer)

RISC vs CISC

- The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.
- RISC systems access memory only with explicit load and store instructions.
- In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.

RISC vs CISC

- The difference between CISC and RISC becomes evident through the basic computer performance equation:

    time / program = (time / cycle) x (cycles / instruction) x (instructions / program)

- RISC systems shorten execution time by reducing the clock cycles per instruction.
- CISC systems improve performance by reducing the number of instructions per program.

RISC vs CISC

- The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.
- The more complex (and variable) instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.
- With fixed-length instructions, RISC lends itself to pipelining and speculative execution.

RISC vs CISC

Consider the two program fragments:

CISC:
    mov ax, 10
    mov bx, 5
    mul bx, ax

RISC:
    mov ax, 0
    mov bx, 10
    mov cx, 5
    Begin: add ax, bx
           loop Begin

The total clock cycles for the CISC version might be:
    (2 movs x 1 cycle) + (1 mul x 30 cycles) = 32 cycles

While the clock cycles for the RISC version are:
    (3 movs x 1 cycle) + (5 adds x 1 cycle) + (5 loops x 1 cycle) = 13 cycles

With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds.
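The cycle counts above are sums of (instruction count x cycles per instruction), and the RISC loop computes the same product as the single CISC multiply. A minimal sketch follows; the per-instruction cycle counts are the slide's illustrative figures, and the names `total_cycles` and `risc_multiply` are hypothetical.

```python
def total_cycles(instruction_mix):
    """Sum count * cycles over (count, cycles_each) pairs."""
    return sum(count * cycles for count, cycles in instruction_mix)

cisc_mix = [(2, 1), (1, 30)]           # 2 movs, 1 mul
risc_mix = [(3, 1), (5, 1), (5, 1)]    # 3 movs, 5 adds, 5 loops

def risc_multiply(multiplicand, count):
    """What the RISC fragment does: add the multiplicand `count` times."""
    ax = 0
    for _ in range(count):             # cx counts down from `count`
        ax += multiplicand             # add ax, bx
    return ax

cisc_cycles = total_cycles(cisc_mix)
risc_cycles = total_cycles(risc_mix)
print(cisc_cycles, risc_cycles, risc_multiply(10, 5))
```

The RISC version executes 13 instructions instead of 3, yet needs fewer cycles under these assumed costs. With a multiplier much larger than 5 the repeated-add loop would lose that advantage.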

RISC vs CISC

- Because of their load-store ISAs, RISC architectures require a large number of CPU registers.
- These registers provide fast access to data during sequential program execution.
- They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.
- Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.

RISC vs CISC

[Figure: overlapping register windows]
- This is how registers can be overlapped in a RISC system.
- The current window pointer (CWP) points to the active register window.

RISC vs CISC

- It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.
- Some RISC systems provide more extravagant instruction sets than some CISC systems.
- Some systems combine both approaches.
- The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.

RISC vs CISC

RISC
- Multiple register sets.
- Three operands per instruction.
- Parameter passing through register windows.
- Single-cycle instructions.
- Hardwired control.
- Highly pipelined.

CISC
- Single register set.
- One or two register operands per instruction.
- Parameter passing through memory.
- Multiple cycle instructions.
- Microprogrammed control.
- Less pipelined.

Continued...

RISC vs CISC

RISC
- Simple instructions, few in number.
- Fixed length instructions.
- Complexity in compiler.
- Only LOAD/STORE instructions access memory.
- Few addressing modes.

CISC
- Many complex instructions.
- Variable length instructions.
- Complexity in microcode.
- Many instructions can access memory.
- Many addressing modes.

Summary

Instruction Set Design Issues

Instruction set design issues include:
- Where are operands stored?
  - registers, memory, stack, accumulator
- How many explicit operands are there?
  - 0, 1, 2, or 3
- How is the operand location specified?
  - register, immediate, indirect, ...
- What type & size of operands are supported?
  - byte, int, float, double, string, vector, ...
- What operations are supported?
  - add, sub, mul, move, compare, ...

More About General Purpose Registers

Why do almost all new architectures use GPRs?
- Registers are much faster than memory (even cache)
  - Register values are available immediately
  - When memory isn't ready, the processor must wait ("stall")
- Registers are convenient for variable storage
  - Compiler assigns some variables just to registers
  - More compact code, since small fields specify registers (compared to memory addresses)

[Figure: Processor (Registers) - Cache - Memory - Disk]

What Operations are Needed

- Arithmetic + Logical
  - Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
  - Logical operations: AND, OR, XOR, NOT
- Data Transfer: copy, load, store
- Control: branch, jump, call, return
- Floating Point: ADDF, MULF, DIVF
  - Same as arithmetic but usually take bigger operands
- Decimal: ADDD, CONVERT
- String: move, compare, search
- Graphics: pixel and vertex, compression/decompression operations

Stack Architecture: Pros and Cons

Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures

Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures

Accumulator Architecture: Pros and Cons

Pros
- Very low hardware requirements
- Easy to design and understand

Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic

Memory-Memory Architecture: Pros and Cons

Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)

Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required

Memory-Register Architecture: Pros and Cons

Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density

Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers

Frequency of Types of Comparison

[Figure: frequency of comparison types]
- Data is based on SPEC2000 on Alpha
- A different benchmark and machine set new design priorities
- DSPs support a repeat instruction for for-loops (vectors) using registers

Supporting Procedures

Execution of a procedure follows these steps:
- Store parameters in a place accessible to the procedure
- Transfer control to the procedure
- Acquire the storage resources needed for the procedure
- Perform the desired task
- Store the result value in a place accessible to the calling program
- Return control to the point of origin

The hardware provides a program counter to trace instruction flow and manage transfer of control.

Parameter Passing
- Registers can be used for passing a small number of parameters
- A stack is used to spill registers of the current context and make room for the called procedure to run, and to allow large parameters to be passed
- Storage of machine state can be performed by caller or callee
- Handling of shared variables is important to ensure correct semantics and thus requires clear specifications in the library interface

Global variables stored in registers need careful handling.

Type and Size of Operands

- The type of an operand is designated by encoding it in the instruction's operation code
- The type of an operand, e.g. single precision float, effectively gives its size
- Common operand types include character, half word and word size integer, and single- and double-precision floating point
- Characters are almost always in ASCII, integers in 2's complement, and floating point in IEEE 754
- The 16-bit Unicode, used in Java, is gaining popularity due to its support for international character sets
- For business applications, some architectures support a decimal format in binary coded decimal (BCD)
- Depending on the size of the word, the complexity of handling different operand types differs
- DSPs offer fixed point data types to support high precision floating point arithmetic and to allow sharing a single exponent for multiple numbers
- For graphics applications, vertex and pixel operands are added features

Size of Operands

- The double-word data type is used for double-precision floating point operations and for address storage in machines with a 64-bit wide address bus
- Words are used for integer operations and for 32-bit address bus machines
- Because of the mix in SPEC, word and double-word data types dominate

[Figure: frequency of reference by size, based on SPEC2000 on Alpha]

Instruction Representation

- Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary, or base 2)
- Numbers are stored in computers as a series of high and low electronic signals (binary numbers)
- Binary digits are called bits and are considered the atom of computing
- Each piece of an instruction is a number, and placing these numbers together forms the instruction
- The assembler translates symbolic assembly instructions into machine language instructions (machine code)
- Example:

    Assembly:                 add $t0, $s1, $s2
    M/C language (decimal):   0       17     18     8      0      32
    M/C language (binary):    000000  10001  10010  01000  00000  100000
                              6 bits  5 bits 5 bits 5 bits 5 bits 6 bits

Note: the MIPS compiler by default maps $s0,...,$s7 to reg. 16-23 and $t0,...,$t7 to reg. 8-15
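Placing the six field values side by side is a matter of shifting each one to its bit position. The sketch below packs the fields of `add $t0, $s1, $s2` using the standard MIPS R-format widths (6, 5, 5, 5, 5, 6 bits); the function name `encode_r_format` and the small register table are hypothetical helpers.

```python
# Default register numbers used by the example (subset only).
REGISTERS = {"$t0": 8, "$s1": 17, "$s2": 18}

def encode_r_format(op, rs, rt, rd, shamt, funct):
    """Pack the six R-format fields into one 32-bit word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct

# add rd, rs, rt  ->  op = 0, funct = 32
word = encode_r_format(
    op=0,
    rs=REGISTERS["$s1"],
    rt=REGISTERS["$s2"],
    rd=REGISTERS["$t0"],
    shamt=0,
    funct=32,
)
bits = format(word, "032b")
fields = [bits[0:6], bits[6:11], bits[11:16], bits[16:21], bits[21:26], bits[26:32]]
print(hex(word), fields)
```

Note that the destination register appears first in the assembly text but fourth in the encoded word.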

Encoding an Instruction Set

- Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation
- The operation is typically specified in one field, called the opcode
- The addressing mode for the operand can be encoded with the operation, or specified through a separate identifier when a large number of modes is supported
- The architecture must balance several competing factors:
  - Desire to support as many registers and addressing modes as possible
  - Effect of operand specification on the size of the instruction (program)
  - Desire to simplify instruction fetching and decoding during execution
- Fixed size instruction encoding simplifies the CPU design while limiting the addressing modes supported
- An architect caring about code size can use variable size encoding
- A hybrid approach is to allow variability by supporting multiple-sized instructions

Encoding Examples

MIPS Instruction Format

Register-format instructions:

    op      rs      rt      rd      shamt   funct
    6 bits  5 bits  5 bits  5 bits  5 bits  6 bits

- op: Basic operation of the instruction, traditionally called the opcode
- rs: The first register source operand
- rt: The second register source operand
- rd: The register destination operand; it gets the result of the operation
- shamt: Shift amount
- funct: This field selects the specific variant of the operation in the op field

Immediate-type instructions:

    op      rs      rt      address
    6 bits  5 bits  5 bits  16 bits

- Some instructions need longer fields than provided, for large value constants
- The 16-bit address means a load word instruction can load a word within a region of plus or minus 2^15 bytes of the address in the base register
- Example: lw $t0, 32($s3)    # Temporary register $t0 gets A[8]

    Instruction  Format  op  rs   rt   rd   shamt  funct  address
    add          R       0   reg  reg  reg  0      32     N/A
    sub          R       0   reg  reg  reg  0      34     N/A
    lw           I       35  reg  reg  N/A  N/A    N/A    address
    sw           I       43  reg  reg  N/A  N/A    N/A    address
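The immediate format can be sketched the same way as the register format: three short fields followed by one 16-bit field. The example encodes `lw $t0, 32($s3)` with the standard opcode 35 and register numbers 19 ($s3) and 8 ($t0); the function names `encode_i_format` and `decode_i_format` are hypothetical.

```python
def encode_i_format(op, rs, rt, address):
    """Pack op (6 bits), rs (5), rt (5) and a 16-bit signed address field."""
    if not -(1 << 15) <= address < (1 << 15):
        raise ValueError("address field must fit in 16 signed bits")
    return (op << 26) | (rs << 21) | (rt << 16) | (address & 0xFFFF)

def decode_i_format(word):
    """Recover (op, rs, rt, address), sign-extending the address field."""
    address = word & 0xFFFF
    if address & 0x8000:
        address -= 1 << 16
    return (word >> 26) & 0x3F, (word >> 21) & 0x1F, (word >> 16) & 0x1F, address

# lw $t0, 32($s3): base register $s3 = 19 goes in rs, target $t0 = 8 goes in rt.
lw_word = encode_i_format(op=35, rs=19, rt=8, address=32)
sw_word = encode_i_format(op=43, rs=19, rt=8, address=32)
print(hex(lw_word), hex(sw_word), decode_i_format(lw_word))
```

The range check reflects the plus or minus 2^15 byte reach of the 16-bit field; an offset outside that range cannot be encoded in a single load.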

The Stored Program Concept

- Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept
- Today's computers are built on two key principles:
  - Instructions are represented as numbers
  - Programs can be stored in memory to be read or written just like numbers
- The power of the concept: memory can contain
  - the source code for an editor
  - the compiled m/c code for the editor
  - the text that the compiled program is using
  - the compiler that generated the code

[Figure: Processor connected to Memory holding: Accounting program (machine code), Editor program (machine code), C compiler (machine code), Payroll data, Book text, Source code in C for editor program]

Compiling if-then-else in MIPS

Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiler's code for the following C if statement?

    if (i == j) f = g + h; else f = g - h;

[Figure: flowchart. Test i == j; if i = j then f = g + h; if i != j then Else: f = g - h; both paths meet at Exit:]

MIPS:

          bne $s3, $s4, Else    # go to Else if i != j
          add $s0, $s1, $s2     # f = g + h (skipped if i != j)
          j   Exit
    Else: sub $s0, $s1, $s2     # f = g - h (skipped if i = j)
    Exit:
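The control flow of the compiled code can be traced with a tiny interpreter that understands only the four opcodes used. This is a sketch under simplifying assumptions: branch delay slots are ignored, labels are resolved to list indices by hand, and the function name `run_if_then_else` is hypothetical.

```python
def run_if_then_else(i, j, g, h):
    """Trace the bne/add/j/sub sequence and return the final value of f ($s0)."""
    regs = {"$s0": 0, "$s1": g, "$s2": h, "$s3": i, "$s4": j}
    program = [
        ("bne", "$s3", "$s4", "Else"),
        ("add", "$s0", "$s1", "$s2"),
        ("j", "Exit"),
        ("sub", "$s0", "$s1", "$s2"),   # label Else
    ]
    labels = {"Else": 3, "Exit": 4}      # Exit is one past the last instruction
    pc = 0
    while pc < len(program):
        instr = program[pc]
        op = instr[0]
        if op == "bne":
            taken = regs[instr[1]] != regs[instr[2]]
            pc = labels[instr[3]] if taken else pc + 1
        elif op == "j":
            pc = labels[instr[1]]
        elif op == "add":
            regs[instr[1]] = regs[instr[2]] + regs[instr[3]]
            pc += 1
        elif op == "sub":
            regs[instr[1]] = regs[instr[2]] - regs[instr[3]]
            pc += 1
    return regs["$s0"]

print(run_if_then_else(1, 1, 5, 3), run_if_then_else(1, 2, 5, 3))
```

The compiler tests the opposite condition (`bne` for `==`) so that the then-part can fall through directly after the branch.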

Typical Compilation

Major Types of Optimization

Columns: Optimization Name | Explanation | Frequency

High-level (at or near source level; machine independent)
- Procedure integration | Replace procedure call by procedure body | N.M.

Local (within straight line code)
- Common sub-expression elimination | Replace two instances of the same computation by a single copy
- Constant propagation | Replace all instances of a variable that is assigned a constant with the constant
- Stack height reduction | Rearrange expression tree to minimize resources needed for expression evaluation | N.M.

Global (across a branch)
- Global common subexpression elimination | Same as local, but this version crosses branches
- Copy propagation | Replace all instances of a variable A that has been assigned X (i.e. A = X) with X
- Code motion | Remove code from a loop that computes the same value in each iteration of the loop
- Induction variable elimination | Simplify/eliminate array addressing calculations within loops

Machine-dependent (depends on machine knowledge)
- Strength reduction | Many examples, such as replacing a multiply by a constant with adds and shifts | N.M.
- Pipeline scheduling | Reorder instructions to improve pipeline performance | N.M.

Effect of Compiler Optimization

Measurements taken on S

[Figure: axis label "Program and Compiler Optimization Level"]
- Level 0: non-optimized code
- Level 1: local optimization
- Level 2: global optimization, s/w pipelining
- Level 3: adds procedure integration

Compiler Support for Multimedia Instr.

- Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics).
- Intel added a new set of instructions called Streaming SIMD Extensions.
- A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer.
- Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations.
  - Strided addressing allows memory access in increments larger than one.
  - Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data.
- Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup.
- Such limited support for vector processing makes vectorizing compiler optimization unpopular and restricts its scope to hand-coded routines.

SIMD instructions on MMX and AltiVec tend to be solutions, not primitives.
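To make the two vector addressing styles concrete, here is a minimal Python sketch that models memory as a list. The helper names `strided_load`, `gather`, and `scatter` are illustrative assumptions, not instructions from any real vector ISA.

```python
def strided_load(memory, base, stride, count):
    """Load `count` elements starting at `base`, stepping `stride` slots each time."""
    return [memory[base + i * stride] for i in range(count)]


def gather(memory, indices):
    """Load the elements whose addresses are listed in `indices`."""
    return [memory[i] for i in indices]


def scatter(memory, indices, values):
    """Store each value at the address listed in the matching slot of `indices`."""
    for i, v in zip(indices, values):
        memory[i] = v


if __name__ == "__main__":
    mem = list(range(100, 116))
    print(strided_load(mem, 1, 4, 3))  # every fourth element, starting at index 1
    print(gather(mem, [0, 5, 2]))      # arbitrary, non-contiguous addresses
    scatter(mem, [0, 5], [-1, -2])
    print(mem[:6])
```

A strided load covers regular patterns such as one column of a row-major matrix, while gather/scatter covers irregular patterns such as sparse data, because the index vector holds addresses rather than data.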

Starting a Program

[Figure: C program -> Compiler -> Assembly language program -> Assembler -> Object: machine language module (together with Object: library routine, machine language) -> Linker -> Executable: machine language program -> Loader -> Memory]

Linker
- Places code and data modules symbolically in memory
- Determines the addresses of data and instruction labels
- Patches both internal and external references

Object files for Unix typically contain:
- Header: size and position of components
- Text segment: machine code
- Data segment: static and dynamic variables
- Relocation info: identifies absolute memory references
- Symbol table: name and location of labels, procedures and variables
- Debugging info: mapping of source to object code, break points, etc.

Loading Executable Program

[Figure: memory layout from high to low addresses: stack ($sp -> 7fff ffff hex), dynamic data, static data ($gp -> 1000 8000 hex, segment starting at 1000 0000 hex), text (pc -> 0040 0000 hex), reserved (from address 0)]

To load an executable, the operating system follows these steps:
- Reads the executable file header to determine the size of the text and data segments
- Creates an address space large enough for the text and data
- Copies the instructions and data from the executable file into memory
- Copies the parameters (if any) to the main program onto the stack
- Initializes the machine registers and sets the stack pointer to the first free location
- Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program

Instruction Set Design Issues

Instruction set design issues:
- Number of addresses
- Flow of control
- Operand types
- Addressing modes
- Instruction types
- Instruction formats

Number of Addresses

Four categories:
- 3-address machines
  - 2 for the source operands and one for the result
- 2-address machines
  - One address doubles as source and result
- 1-address machines
  - Accumulator machines
  - Accumulator is used for one source and result
- 0-address machines
  - Stack machines
  - Operands are taken from the stack
  - Result goes onto the stack

Number of Addresses (cont.)

Three-address machines
- Two for the source operands, one for the result
- RISC processors use three addresses
- Sample instructions:
    add  dest,src1,src2   ; M(dest) = [src1] + [src2]
    sub  dest,src1,src2   ; M(dest) = [src1] - [src2]
    mult dest,src1,src2   ; M(dest) = [src1] * [src2]

Three addresses: Operand 1, Operand 2, Result
Example: a = b + c

Three-address instruction formats are not common because they require a relatively long instruction format to hold the three address references.

Number of Addresses (cont.)

Example
C statement:
    A = B + C * D - E + F + A
Equivalent code:
    mult T,C,D   ; T = C*D
    add  T,T,B   ; T = B + C*D
    sub  T,T,E   ; T = B + C*D - E
    add  T,T,F   ; T = B + C*D - E + F
    add  A,T,A   ; A = B + C*D - E + F + A

Number of Addresses (cont.)

Two-address machines
- One address doubles (for source operand and result)
- Last example makes a case for it
  - Address T is used twice
- Sample instructions:
    load dest,src   ; M(dest) = [src]
    add  dest,src   ; M(dest) = [dest] + [src]
    sub  dest,src   ; M(dest) = [dest] - [src]
    mult dest,src   ; M(dest) = [dest] * [src]

Two addresses: one address doubles as operand and result
Example: a = a + b

The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation.

Number of Addresses (cont.)

Example
C statement:
    A = B + C * D - E + F + A
Equivalent code:
    load T,C   ; T = C
    mult T,D   ; T = C*D
    add  T,B   ; T = B + C*D
    sub  T,E   ; T = B + C*D - E
    add  T,F   ; T = B + C*D - E + F
    add  A,T   ; A = B + C*D - E + F + A

Number of Addresses (cont.)

One-address machines
- Use a special set of registers called accumulators
  - Specify one source operand and receive the result
- Called accumulator machines
- Sample instructions:
    load  addr   ; accum = [addr]
    store addr   ; M[addr] = accum
    add   addr   ; accum = accum + [addr]
    sub   addr   ; accum = accum - [addr]
    mult  addr   ; accum = accum * [addr]

Number of Addresses (cont.)

Example
C statement:
    A = B + C * D - E + F + A
Equivalent code:
    load  C   ; load C into accum
    mult  D   ; accum = C*D
    add   B   ; accum = C*D + B
    sub   E   ; accum = B + C*D - E
    add   F   ; accum = B + C*D - E + F
    add   A   ; accum = B + C*D - E + F + A
    store A   ; store accum contents in A
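The one-address sequence can be checked with a tiny interpreter. The sketch below is an illustrative model only: the function `run_accumulator` and the dictionary-based memory are assumptions made for the example, not part of any real instruction set.

```python
def run_accumulator(program, memory):
    """Run (opcode, address) pairs on a machine with a single accumulator."""
    accum = 0
    for op, addr in program:
        if op == "load":
            accum = memory[addr]
        elif op == "store":
            memory[addr] = accum
        elif op == "add":
            accum += memory[addr]
        elif op == "sub":
            accum -= memory[addr]
        elif op == "mult":
            accum *= memory[addr]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return memory


# One-address code for A = B + C * D - E + F + A
PROGRAM = [("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
           ("add", "F"), ("add", "A"), ("store", "A")]

if __name__ == "__main__":
    mem = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
    print(run_accumulator(PROGRAM, mem)["A"])
```

Every instruction except the implicit accumulator reference touches memory, which is the source of the high memory traffic usually attributed to accumulator machines.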

Number of Addresses (cont.)

Zero-address machines
- Stack supplies operands and receives the result
  - Special instructions to load and store use an address
- Called stack machines (Ex: HP3000, Burroughs B5500)
- Sample instructions:
    push addr   ; push([addr])
    pop  addr   ; pop([addr])
    add         ; push(pop + pop)
    sub         ; push(pop - pop)
    mult        ; push(pop * pop)

Number of Addresses (cont.)

Example
C statement:
    A = B + C * D - E + F + A
Equivalent code:
    push E
    push C
    push D
    mult
    push B
    add
    sub
    push F
    add
    push A
    add
    pop  A
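The zero-address sequence can be traced with a small stack interpreter. This is a sketch under stated assumptions: `run_stack_machine` is a hypothetical helper, and `sub` is taken to mean push(first pop - second pop), which is the reading under which pushing E first yields B + C*D - E.

```python
def run_stack_machine(program, memory):
    """Run a zero-address program; only push and pop carry an address."""
    stack = []
    for instr in program:
        op, *args = instr.split()
        if op == "push":
            stack.append(memory[args[0]])
        elif op == "pop":
            memory[args[0]] = stack.pop()
        elif op in ("add", "sub", "mult"):
            first = stack.pop()    # operand on top of the stack
            second = stack.pop()   # operand beneath it
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)
            else:
                stack.append(first * second)
        else:
            raise ValueError(f"unknown opcode: {op}")
    return memory


# Zero-address code for A = B + C * D - E + F + A
STACK_PROGRAM = ["push E", "push C", "push D", "mult", "push B", "add",
                 "sub", "push F", "add", "push A", "add", "pop A"]

if __name__ == "__main__":
    mem = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
    print(run_stack_machine(STACK_PROGRAM, mem)["A"])
```

The arithmetic instructions carry no operand fields at all, which is why stack code is dense, but the order in which values are pushed has to be planned around the operand order of non-commutative operations such as `sub`.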

Load/Store Architecture

- Instructions expect operands in internal processor registers
  - Special LOAD and STORE instructions move data between registers and memory
- RISC uses this architecture
- Reduces instruction length

Load/Store Architecture (cont.)

Sample instructions:
    load  Rd,addr      ; Rd = [addr]
    store addr,Rs      ; (addr) = Rs
    add   Rd,Rs1,Rs2   ; Rd = Rs1 + Rs2
    sub   Rd,Rs1,Rs2   ; Rd = Rs1 - Rs2
    mult  Rd,Rs1,Rs2   ; Rd = Rs1 * Rs2

Number of Addresses (cont.)

Example
C statement:
    A = B + C * D - E + F + A
Equivalent code:
    load  R1,B
    load  R2,C
    load  R3,D
    load  R4,E
    load  R5,F
    load  R6,A
    mult  R2,R2,R3
    add   R2,R2,R1
    sub   R2,R2,R4
    add   R2,R2,R5
    add   R2,R2,R6
    store A,R2

Flow of Control

- Default is sequential flow
- Several instructions alter this default execution
  - Branches
    - Unconditional
    - Conditional
    - Delayed branches
  - Procedure calls
    - Delayed procedure calls

Flow of Control (cont.)

Branches
- Unconditional
  - Absolute address
  - PC-relative
    - Target address is specified relative to PC contents
    - Relocatable code
- Example: MIPS
  - Absolute address
      j target
  - PC-relative
      b target

Flow of Control (cont.)

Branches
- Conditional
  - Jump is taken only if the condition is met
- Two types
  - Set-Then-Jump
    - Condition testing is separated from branching
    - Condition code registers are used to convey the condition test result
    - Condition code registers keep a record of the status of the last ALU operation, such as an overflow condition
  - Example: Pentium code
      cmp AX,BX    ; compare AX and BX
      je  target   ; jump if equal

Flow of Control (cont.)

  - Test-and-Jump
    - Single instruction performs condition testing and branching
  - Example: MIPS instruction
      beq Rsrc1,Rsrc2,target
    Jumps to target if Rsrc1 = Rsrc2

Delayed branching
- Control is transferred after executing the instruction that follows the branch instruction
  - This instruction slot is called the delay slot
- Improves efficiency
- Highly pipelined RISC processors support it

Flow of Control (cont.)

Procedure calls
- Facilitate modular programming
- Require two pieces of information to return
  - End of procedure
    - Pentium uses the ret instruction
    - MIPS uses the jr instruction
  - Return address
    - In a (special) register: MIPS allows any general-purpose register
    - On the stack: Pentium

[Figure: delay slot]

Parameter Passing

Two basic techniques
- Register-based (e.g., PowerPC, MIPS)
  - Internal registers are used
    - Faster
    - Limit the number of parameters
    - Recursive procedure
- Stack-based (e.g., Pentium)
  - Stack is used
    - More general

2. Operand Types

- Instructions support basic data types
  - Characters
  - Integers
  - Floating-point
- Instruction overload
  - Same instruction for different data types
  - Example: Pentium
      mov AL,address    ; loads an 8-bit value
      mov AX,address    ; loads a 16-bit value
      mov EAX,address   ; loads a 32-bit value

Operand Types (cont.)

- Separate instructions
  - Instructions specify the operand size
  - Example: MIPS
      lb Rdest,address   ; loads a byte
      lh Rdest,address   ; loads a halfword (16 bits)
      lw Rdest,address   ; loads a word (32 bits)
      ld Rdest,address   ; loads a doubleword (64 bits)
  - Similar instructions for store

3. Addressing Modes

- How the operands are specified
- Operands can be in three places
  - Registers
    - Register addressing mode
  - Part of instruction
    - Constant
    - Immediate addressing mode
    - All processors support these two addressing modes
  - Memory
    - Difference between RISC and CISC
    - CISC supports a large variety of addressing modes
    - RISC follows load/store architecture

4. Instruction Types

Several types of instructions
- Data movement
  - Pentium: mov dest,src
  - Some do not provide direct data movement instructions
  - Indirect data movement
      add Rdest,Rsrc,0   ; Rdest = Rsrc + 0
- Arithmetic and logical
  - Arithmetic
    - Integer and floating-point, signed and unsigned
    - add, subtract, multiply, divide
  - Logical
    - and, or, not, xor

Instruction Types (cont.)

Condition code bits
- S: Sign bit (0 = +, 1 = -)
- Z: Zero bit (0 = nonzero, 1 = zero)
- O: Overflow bit (0 = no overflow, 1 = overflow)
- C: Carry bit (0 = no carry, 1 = carry)

Example: Pentium
    cmp count,25   ; compare count to 25 (subtract 25 from count)
    je  target     ; jump if equal
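The way a compare sets the four condition code bits can be sketched in a few lines of Python. This is a simplified model: the function name `compare_flags` and the default 32-bit width are assumptions for the example, and the carry bit follows the x86 convention of signalling a borrow on subtraction.

```python
def compare_flags(a: int, b: int, bits: int = 32) -> dict:
    """Return the S, Z, O, C bits produced by computing a - b in `bits` bits."""
    mask = (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "S": int(bool(result & sign_bit)),  # sign of the truncated result
        "Z": int(result == 0),              # result is zero, so a == b
        # Signed overflow: operands differ in sign and the result's sign differs from a's.
        "O": int(bool((a ^ b) & (a ^ result) & sign_bit)),
        "C": int(a < b),                    # borrow out of the unsigned subtraction
    }


if __name__ == "__main__":
    print(compare_flags(25, 25))   # equal operands set Z, so a "jump if equal" is taken
    print(compare_flags(10, 25))   # smaller first operand sets S and C
```

A set-then-jump branch such as `je` then only has to inspect the Z bit; it never sees the operands themselves.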

Instruction Types (cont.)

Flow control and I/O instructions
- Branch
- Procedure call
- Interrupts

I/O instructions
- Memory-mapped I/O
  - Most processors support memory-mapped I/O
  - No separate instructions for I/O
- Isolated I/O
  - Pentium supports isolated I/O
  - Separate I/O instructions
      in  AX,io_port   ; read from an I/O port
      out io_port,AX   ; write to an I/O port

5. Instruction Formats

Two types
- Fixed-length
  - Used by RISC processors
  - 32-bit RISC processors use 32-bit wide instructions
    - Examples: SPARC, MIPS, PowerPC
- Variable-length
  - Used by CISC processors
  - Memory operands need more bits to specify

Opcode
- Major and exact operation

Examples of Instruction Formats

[Figure: examples of instruction formats]

RISC (Reduced Instruction Set Computer) versus CISC (Complex Instruction Set Computer)

RISC vs. CISC

- The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.
- RISC systems access memory only with explicit load and store instructions.
- In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.

RISC vs. CISC

- The difference between CISC and RISC becomes evident through the basic computer performance equation:
    time / program = (time / cycle) × (cycles / instruction) × (instructions / program)
- RISC systems shorten execution time by reducing the clock cycles per instruction.
- CISC systems improve performance by reducing the number of instructions per program.

RISC vs. CISC

- The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.
- The more complex (and variable) instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.
- With fixed-length instructions, RISC lends itself to pipelining and speculative execution.

RISC vs. CISC

Consider the following program fragments:

CISC:
    mov ax, 10
    mov bx, 5
    mul bx, ax

RISC:
    mov ax, 0
    mov bx, 10
    mov cx, 5
    Begin: add ax, bx
           loop Begin

The total clock cycles for the CISC version might be:
    (2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles

While the clock cycles for the RISC version are:
    (3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles

With the RISC clock cycle being shorter, RISC gives us much faster execution speeds.
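The cycle tallies for the two fragments are a weighted sum of instruction counts and per-instruction cycle costs. The sketch below assumes the figures of the worked example (a 30-cycle multiply and single-cycle mov, add, and loop instructions); these costs are illustrative rather than measurements of any real processor, and `total_cycles` is a hypothetical helper.

```python
def total_cycles(instruction_mix):
    """Sum count * cycles over (count, cycles_per_instruction) pairs."""
    return sum(count * cycles for count, cycles in instruction_mix)


# CISC fragment: two movs at 1 cycle each, one mul at 30 cycles (assumed cost).
CISC_MIX = [(2, 1), (1, 30)]
# RISC fragment: three movs, then five add/loop pairs, all single-cycle (assumed cost).
RISC_MIX = [(3, 1), (5, 1), (5, 1)]

if __name__ == "__main__":
    print("CISC cycles:", total_cycles(CISC_MIX))
    print("RISC cycles:", total_cycles(RISC_MIX))
```

The RISC fragment executes more instructions but needs fewer cycles in total; multiplying each total by the machine's clock period gives the execution time.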

RISC vs. CISC

- Because of their load-store ISAs, RISC architectures require a large number of CPU registers.
- These registers provide fast access to data during sequential program execution.
- They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.
- Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.

RISC vs. CISC

- This is how registers can be overlapped in a RISC system.
- The current window pointer (CWP) points to the active register window.

[Figure: overlapping register windows]

RISC vs. CISC

- It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.
- Some RISC systems provide more extravagant instruction sets than some CISC systems.
- Some systems combine both approaches.
- The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.

RISC vs. CISC

RISC
- Multiple register sets.
- Three operands per instruction.
- Parameter passing through register windows.
- Single-cycle instructions.
- Hardwired control.
- Highly pipelined.

CISC
- Single register set.
- One or two register operands per instruction.
- Parameter passing through memory.
- Multiple cycle instructions.
- Microprogrammed control.
- Less pipelined.

RISC vs. CISC

RISC
- Simple instructions, few in number.
- Fixed length instructions.
- Complexity in compiler.
- Only LOAD/STORE instructions access memory.
- Few addressing modes.

CISC
- Many complex instructions.
- Variable length instructions.
- Complexity in microcode.
- Many instructions can access memory.
- Many addressing modes.

Summary

Instruction Set Design Issues

Instruction set design issues include:
- Where are operands stored?
  - registers, memory, stack, accumulator
- How many explicit operands are there?
  - 0, 1, 2, or 3
- How is the operand location specified?
  - register, immediate, indirect, ...
- What type and size of operands are supported?
  - byte, int, float, double, string, vector, ...
- What operations are supported?
  - add, sub, mul, move, compare, ...

More About General Purpose Registers

Why do almost all new architectures use GPRs?
- Registers are much faster than memory (even cache)
  - Register values are available immediately
  - When memory isn't ready, the processor must wait ("stall")
- Registers are convenient for variable storage
  - Compiler assigns some variables just to registers
  - More compact code, since small fields specify registers (compared to memory addresses)

[Figure: memory hierarchy: Processor (Registers), Cache, Memory, Disk]

What Operations are Needed

- Arithmetic and logical
  - Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
  - Logical operation: AND, OR, XOR, NOT
- Data transfer: copy, load, store
- Control: branch, jump, call, return
- Floating point: ADDF, MULF, DIVF
  - Same as arithmetic but usually take bigger operands
- Decimal: ADDD, CONVERT
- String: move, compare, search
- Graphics: pixel and vertex, compression/decompression operations

Stacks Architecture: Pros and Cons

Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures

Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures

Accumulators Architecture: Pros and Cons

Pros
- Very low hardware requirements
- Easy to design and understand

Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic

Memory-Memory Architecture: Pros and Cons

Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)

Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required

Memory-Register Architecture: Pros and Cons

Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density

Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers

Supporting Procedures

Execution of a procedure follows these steps:
- Store parameters in a place accessible to the procedure
- Transfer control to the procedure
- Acquire the storage resources needed for the procedure
- Perform the desired task
- Store the result value in a place accessible to the calling program
- Return control to the point of origin

The hardware provides a program counter to trace instruction flow and manage transfer of control.

Parameter Passing
- Registers can be used for passing a small number of parameters
- A stack is used to spill registers of the current context and make room for the called procedure to run, and to allow for large parameters to be passed
- Storage of machine state can be performed by caller or callee
- Handling of shared variables is important to ensure correct semantics and thus requires clear specifications in the library interface

Global variables stored in registers need careful handling.

Type and Size of Operands

- The type of an operand is designated by encoding it in the instruction's operation code
- The type of an operand, e.g. single precision float, effectively gives its size
- Common operand types include character, half word and word size integer, and single- and double-precision floating point
  - Characters are almost always in ASCII, integers are in 2's complement, and floating point is in IEEE 754
- The 16-bit Unicode, used in Java, is gaining popularity due to its support for international character sets
- For business applications, some architectures support a decimal format in binary coded decimal (BCD)
- Depending on the size of the word, the complexity of handling different operand types differs
- DSP offers fixed point data types to support high precision floating point arithmetic and to allow sharing a single exponent for multiple numbers
- For graphics applications, vertex and pixel operands are added features

Size of Operands

- Double-word data type is used for double-precision floating point operations and for address storage in machines with a 64-bit wide address bus
- Words are used for integer operations and for 32-bit address bus machines
- Because of the mix in SPEC, word and double-word data types dominate

[Figure: frequency of reference by size, based on SPEC2000 on Alpha]

Instruction Representation

- Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary, or base 2)
- Numbers are stored in computers as a series of high and low electronic signals (binary numbers)
- Binary digits are called bits and are considered the atom of computing
- Each piece of an instruction is a number, and placing these numbers together forms the instruction
- The assembler translates symbolic assembly instructions into machine language instructions (machine code)
- Example:
    Assembly:                add $t0, $s1, $s2
    M/C language (decimal):  0       17      18      8       0       32
    M/C language (binary):   000000  10001   10010   01000   00000   100000
    Field widths:            6 bits  5 bits  5 bits  5 bits  5 bits  6 bits

Note: the MIPS compiler by default maps $s0, ..., $s7 to registers 16-23 and $t0, ..., $t7 to registers 8-15.
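Packing the six fields of a register-format instruction into one 32-bit word is a sequence of shifts and ORs. The Python sketch below assumes the standard MIPS R-format widths (6, 5, 5, 5, 5, 6 bits) and the register numbering $s0-$s7 = 16-23, $t0-$t7 = 8-15; the helper names `encode_r_format` and `encode_add` are hypothetical.

```python
# Register numbers for the saved and temporary registers.
REGISTERS = {f"$s{i}": 16 + i for i in range(8)}
REGISTERS.update({f"$t{i}": 8 + i for i in range(8)})


def encode_r_format(op, rs, rt, rd, shamt, funct):
    """Pack the six R-format fields into a single 32-bit instruction word."""
    fields = ((op, 6), (rs, 5), (rt, 5), (rd, 5), (shamt, 5), (funct, 6))
    word = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"value {value} does not fit in {width} bits")
        word = (word << width) | value
    return word


def encode_add(rd, rs, rt):
    """Encode `add rd, rs, rt` (op = 0, shamt = 0, funct = 32)."""
    return encode_r_format(0, REGISTERS[rs], REGISTERS[rt], REGISTERS[rd], 0, 32)


if __name__ == "__main__":
    word = encode_add("$t0", "$s1", "$s2")
    print(format(word, "032b"))
    print(hex(word))
```

Note that the destination register comes first in the assembly syntax but occupies the fourth field of the machine word.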

Encoding an Instruction Set

- Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation
- The operation is typically specified in one field called the opcode
- The addressing mode for the operand can be encoded with the operation, or specified through a separate identifier when a large number of modes is supported
- The architecture must balance several competing factors:
  - Desire to support as many registers and addressing modes as possible
  - Effect of operand specification on the size of the instruction (program)
  - Desire to simplify instruction fetching and decoding during execution
- Fixed size instruction encoding simplifies the CPU design while limiting the addressing modes supported
- An architect caring about code size can use variable size encoding
- A hybrid approach is to allow variability by supporting multiple-sized instructions

Encoding Examples

[Figure: encoding examples]

MIPS Instruction Format

Register-format instructions

    op      rs      rt      rd      shamt   funct
    6 bits  5 bits  5 bits  5 bits  5 bits  6 bits

- op: basic operation of the instruction, traditionally called the opcode
- rs: the first register source operand
- rt: the second register source operand
- rd: the register destination operand; it gets the result of the operation
- shamt: shift amount
- funct: this field selects the specific variant of the operation of the op field

Immediate-type instructions

    op      rs      rt      address
    6 bits  5 bits  5 bits  16 bits

- Some instructions need longer fields than provided, for large value constants
- The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register
- Example: lw $t0, 32($s3)   # Temporary register $t0 gets A[8]

    Instruction  Format  op  rs   rt   rd   shamt  funct  address
    add          R       0   reg  reg  reg  0      32     N/A
    sub          R       0   reg  reg  reg  0      34     N/A
    lw           I       35  reg  reg  N/A  N/A    N/A    address
    sw           I       43  reg  reg  N/A  N/A    N/A    address
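The immediate format and its ±2^15 byte reach can be illustrated the same way. This is a sketch assuming the standard MIPS I-format layout (6-bit op, 5-bit rs, 5-bit rt, 16-bit signed offset) and the opcode 35 for load word; `encode_i_format` is a hypothetical helper name.

```python
def encode_i_format(op, rs, rt, offset):
    """Pack op, rs, rt and a signed 16-bit offset into a 32-bit I-format word."""
    if not -(1 << 15) <= offset < (1 << 15):
        raise ValueError("offset does not fit in a signed 16-bit field")
    for value, width in ((op, 6), (rs, 5), (rt, 5)):
        if not 0 <= value < (1 << width):
            raise ValueError(f"value {value} does not fit in {width} bits")
    # A negative offset is stored in two's complement in the low 16 bits.
    return (op << 26) | (rs << 21) | (rt << 16) | (offset & 0xFFFF)


if __name__ == "__main__":
    # lw $t0, 32($s3): op = 35, base register $s3 = 19, target register $t0 = 8
    print(hex(encode_i_format(35, 19, 8, 32)))
```

The range check makes the limitation concrete: an offset of 32768 cannot be encoded, so an access farther than 2^15 bytes from the base register needs extra instructions to build the address.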

The Stored Program Concept

- Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept
- Today's computers are built on two key principles:
  - Instructions are represented as numbers
  - Programs can be stored in memory to be read or written just like numbers
- The power of the concept: memory can contain
  - the source code for an editor
  - the compiled m/c code for the editor
  - the text that the compiled program is using
  - the compiler that generated the code

[Figure: a processor connected to a memory holding an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and the source code in C for the editor program]

Compiling if-then-else in MIPS

Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiler's code for the following C if statement?

if (i == j) f = g + h; else f = g - h;

[Figure: flowchart testing i == j; when i = j, f = g + h is executed and control jumps to Exit; when i ≠ j, control goes to Else, where f = g - h is executed]

MIPS

      bne $s3, $s4, Else    # go to Else if i ≠ j
      add $s0, $s1, $s2     # f = g + h (skipped if i ≠ j)
      j   Exit
Else: sub $s0, $s1, $s2     # f = g - h (skipped if i = j)
Exit:
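The control flow of the branch sequence can be traced with a short sketch. It is only an illustration of how the bne / add / j / sub sequence behaves; the function name and the string labels used as a stand-in program counter are assumptions made for this example.

```python
def if_then_else(g, h, i, j):
    """Mirror the bne/add/j/sub sequence for: if (i == j) f = g + h; else f = g - h;"""
    pc = "start"
    f = None
    while pc != "Exit":
        if pc == "start":
            # bne: branch to Else when i != j, otherwise fall through
            pc = "Else" if i != j else "then"
        elif pc == "then":
            f = g + h      # add
            pc = "Exit"    # j Exit
        elif pc == "Else":
            f = g - h      # sub, then control reaches the Exit label
            pc = "Exit"
    return f

print(if_then_else(5, 3, 1, 1), if_then_else(5, 3, 1, 2))
```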

Typical Compilation

Major Types of Optimization

Optimization name: explanation (frequency, where recorded)

High-level (at or near source level; machine-independent)
- Procedure integration: replace procedure call by procedure body (N.M.)

Local (within straight-line code)
- Common subexpression elimination: replace two instances of the same computation by a single copy
- Constant propagation: replace all instances of a variable that is assigned a constant with the constant
- Stack height reduction: rearrange expression tree to minimize resources needed for expression evaluation (N.M.)

Global (across a branch)
- Global common subexpression elimination: same as local, but this version crosses branches
- Copy propagation: replace all instances of a variable A that has been assigned X (i.e., A = X) with X
- Code motion: remove code from a loop that computes the same value each iteration of the loop
- Induction variable elimination: simplify/eliminate array addressing calculations within loops

Machine-dependent (depends on machine knowledge)
- Strength reduction: many examples, such as replacing multiply by a constant with adds and shifts (N.M.)
- Pipeline scheduling: reorder instructions to improve pipeline performance (N.M.)
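Strength reduction, listed among the machine-dependent optimizations, can be illustrated with one small case. The sketch below replaces a multiply by the constant 10 with two shifts and an add; the constant and the function name `times_ten` are chosen only for this example.

```python
def times_ten(x):
    """Compute 10 * x with shifts and one add: 10x = 8x + 2x."""
    return (x << 3) + (x << 1)

print(times_ten(7))
```

Whether this is actually faster depends on the relative cost of multiply, shift, and add on the target machine, which is why the optimization is machine-dependent.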

Effect of Compiler Optimization

Measurements taken on S

[Figure: results for each program at each compiler optimization level; axis label: Program and Compiler Optimization Level]

Level 0: non-optimized code

Level 1: local optimization

Level 2: global optimization, s/w pipelining

Level 3: adds procedure integration

Compiler Support for Multimedia Instr.

Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics)

Intel added a new set of instructions called Streaming SIMD Extension

A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer

Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations

Strided addressing allows memory access in increments larger than one

Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data

Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup

Such limited support for vector processing makes the use of vectorizing compiler optimization unpopular and restricts its scope to hand-coded routines

SIMD instructions on MMX and AltiVec tend to be solutions, not primitives
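The difference between strided and gather/scatter addressing can be sketched with plain lists standing in for memory. The function names and the sample memory contents are hypothetical; real vector hardware performs these accesses in a single instruction rather than in a loop.

```python
def strided_load(memory, base, stride, count):
    """Read `count` elements starting at `base`, stepping by `stride`."""
    return [memory[base + k * stride] for k in range(count)]

def gather(memory, indices):
    """Read the elements whose addresses are listed in `indices`."""
    return [memory[a] for a in indices]

def scatter(memory, indices, values):
    """Write each value to the address listed at the same position in `indices`."""
    for a, v in zip(indices, values):
        memory[a] = v

memory = list(range(100, 116))          # 16 words holding 100, 101, ..., 115
column = strided_load(memory, 1, 4, 4)  # every 4th word, starting at address 1
picked = gather(memory, [7, 2, 11])     # addresses come from an index vector
scatter(memory, [0, 15], [-1, -2])
print(column, picked)
```

Strided access needs only a base and a step, while gather/scatter needs a whole vector of addresses, which is the sense in which it resembles register indirect addressing.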

Starting a Program

[Figure: C program -> Compiler -> Assembly language program -> Assembler -> Object: machine language module, together with Object: library routine (machine language) -> Linker -> Executable: machine language program -> Loader -> Memory]

Linker

- Place code and data modules symbolically in memory
- Determine the addresses of data and instruction labels
- Patch both internal and external references

Object files for Unix typically contain:

Header: size and position of components

Text segment: machine code

Data segment: static and dynamic variables

Relocation info: identifies absolute memory references

Symbol table: names and locations of labels, procedures, and variables

Debugging info: mapping of source to object code, breakpoints, etc.

Loading Executable Program

[Figure: MIPS memory layout from low to high addresses: Reserved (starting at 0), Text (pc starts at 0040 0000 hex), Static data (starting at 1000 0000 hex, with $gp at 1000 8000 hex), Dynamic data, and Stack ($sp starts at 7fff ffff hex)]

To load an executable, the operating system follows these steps:

Reads the executable file header to determine the size of the text and data segments

Creates an address space large enough for the text and data

Copies the instructions and data from the executable file into memory

Copies the parameters (if any) to the main program onto the stack

Initializes the machine registers and sets the stack pointer to the first free location

Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program

Instruction Set Design Issues

Instruction set design issues:

Number of Addresses

Flow of Control

Operand Types

Addressing Modes

Instruction Types

Instruction Formats

Number of Addresses

Four categories

3-address machines
- 2 for the source operands and one for the result

2-address machines
- One address doubles as source and result

1-address machines
- Accumulator machines
- Accumulator is used for one source and result

0-address machines
- Stack machines
- Operands are taken from the stack
- Result goes onto the stack

Number of Addresses (cont.)

Three-address machines

Two for the source operands, one for the result

RISC processors use three addresses

Sample instructions

add  dest,src1,src2    ; M(dest) = [src1] + [src2]
sub  dest,src1,src2    ; M(dest) = [src1] - [src2]
mult dest,src1,src2    ; M(dest) = [src1] * [src2]

Three addresses: Operand 1, Operand 2, Result

Example: a = b + c

Three-address instruction formats are not common because they require a relatively long instruction format to hold the three address references.

Number of Addresses (cont.)

Example

C statement

A = B + C * D - E + F + A

Equivalent code:

mult T,C,D    ; T = C*D
add  T,T,B    ; T = B + C*D
sub  T,T,E    ; T = B + C*D - E
add  T,T,F    ; T = B + C*D - E + F
add  A,T,A    ; A = B + C*D - E + F + A

Number of Addresses (cont.)

Two-address machines

One address doubles (as source operand and result)

Last example makes a case for it
- Address T is used twice

Sample instructions

load dest,src    ; M(dest) = [src]
add  dest,src    ; M(dest) = [dest] + [src]
sub  dest,src    ; M(dest) = [dest] - [src]
mult dest,src    ; M(dest) = [dest] * [src]

Two addresses: one address doubles as operand and result

Example: a = a + b

The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation.

Number of Addresses (cont.)

Example

C statement

A = B + C * D - E + F + A

Equivalent code:

load T,C    ; T = C
mult T,D    ; T = C*D
add  T,B    ; T = B + C*D
sub  T,E    ; T = B + C*D - E
add  T,F    ; T = B + C*D - E + F
add  A,T    ; A = B + C*D - E + F + A

Number of Addresses (cont.)

One-address machines

Use a special set of registers called accumulators
- Specify one source operand and receive the result

Called accumulator machines

Sample instructions

load  addr    ; accum = [addr]
store addr    ; M[addr] = accum
add   addr    ; accum = accum + [addr]
sub   addr    ; accum = accum - [addr]
mult  addr    ; accum = accum * [addr]

Number of Addresses (cont.)

Example

C statement

A = B + C * D - E + F + A

Equivalent code:

load  C    ; load C into accum
mult  D    ; accum = C*D
add   B    ; accum = C*D + B
sub   E    ; accum = B + C*D - E
add   F    ; accum = B + C*D - E + F
add   A    ; accum = B + C*D - E + F + A
store A    ; store accum contents in A
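The one-address listing above can be checked with a tiny accumulator-machine interpreter. This is a sketch under simple assumptions: memory is a Python dictionary keyed by variable name, the function name `run_accumulator` is hypothetical, and the sample values are chosen only for illustration.

```python
def run_accumulator(program, memory):
    """Execute (opcode, address) pairs on a one-address accumulator machine."""
    accum = 0
    for op, addr in program:
        if op == "load":
            accum = memory[addr]
        elif op == "store":
            memory[addr] = accum
        elif op == "add":
            accum += memory[addr]
        elif op == "sub":
            accum -= memory[addr]
        elif op == "mult":
            accum *= memory[addr]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return memory

# Sample values, for illustration only
acc_memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
acc_program = [("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
               ("add", "F"), ("add", "A"), ("store", "A")]
run_accumulator(acc_program, acc_memory)
print(acc_memory["A"])
```

Every arithmetic instruction touches memory once, which is the source of the high memory traffic attributed to accumulator machines.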

Number of Addresses (cont.)

Zero-address machines

Stack supplies operands and receives the result
- Special instructions to load and store use an address

Called stack machines (Ex: HP3000, Burroughs B5500)

Sample instructions

push addr    ; push([addr])
pop  addr    ; pop([addr])
add          ; push(pop + pop)
sub          ; push(pop - pop)
mult         ; push(pop * pop)

Number of Addresses (cont.)

Example

C statement

A = B + C * D - E + F + A

Equivalent code:

push E
push C
push D
mult
push B
add
sub
push F
add
push A
add
pop  A
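The zero-address listing can be traced with a short stack-machine sketch. It assumes, following the sample instruction definitions, that `sub` computes push(pop - pop) with the first pop taking the top of the stack; the function name and sample values are hypothetical.

```python
def run_stack(program, memory):
    """Execute a zero-address program; only push and pop name a memory address."""
    stack = []
    for instr in program:
        op = instr[0]
        if op == "push":
            stack.append(memory[instr[1]])
        elif op == "pop":
            memory[instr[1]] = stack.pop()
        elif op in ("add", "sub", "mult"):
            first = stack.pop()     # top of stack
            second = stack.pop()    # element below it
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)   # push(pop - pop)
            else:
                stack.append(first * second)
        else:
            raise ValueError(f"unknown opcode: {op}")
    return memory

# Sample values, for illustration only
stk_memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
stk_program = [("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
               ("push", "B"), ("add",), ("sub",), ("push", "F"), ("add",),
               ("push", "A"), ("add",), ("pop", "A")]
run_stack(stk_program, stk_memory)
print(stk_memory["A"])
```

Pushing E first is what makes the later `sub` produce (B + C*D) - E rather than the reverse.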

Load/Store Architecture

Instructions expect operands in internal processor registers

Special LOAD and STORE instructions move data between registers and memory

RISC uses this architecture

Reduces instruction length

Load/Store Architecture (cont.)

Sample instructions

load  Rd,addr       ; Rd = [addr]
store addr,Rs       ; (addr) = Rs
add   Rd,Rs1,Rs2    ; Rd = Rs1 + Rs2
sub   Rd,Rs1,Rs2    ; Rd = Rs1 - Rs2
mult  Rd,Rs1,Rs2    ; Rd = Rs1 * Rs2

Number of Addresses (cont.)

Example

C statement

A = B + C * D - E + F + A

Equivalent code:

load  R1,B
load  R2,C
load  R3,D
load  R4,E
load  R5,F
load  R6,A
mult  R2,R2,R3
add   R2,R2,R1
sub   R2,R2,R4
add   R2,R2,R5
add   R2,R2,R6
store A,R2

Flow of Control

Default is sequential flow

Several instructions alter this default execution

Branches
- Unconditional
- Conditional
- Delayed branches

Procedure calls
- Delayed procedure calls

Flow of Control (cont.)

Branches

Unconditional
- Absolute address
- PC-relative
  Target address is specified relative to PC contents
  Relocatable code

Example: MIPS
- Absolute address

  j target

- PC-relative

  b target

Flow of Control (cont.)

Branches

Conditional
- Jump is taken only if the condition is met

Two types

- Set-Then-Jump
  Condition testing is separated from branching
  Condition code registers are used to convey the condition test result
  Condition code registers keep a record of the status of the last ALU operation, such as the overflow condition
- Example: Pentium code

  cmp AX,BX     ; compare AX and BX
  je  target    ; jump if equal

Flow of Control (cont.)

- Test-and-Jump
  Single instruction performs condition testing and branching
- Example: MIPS instruction

  beq Rsrc1,Rsrc2,target

  Jumps to target if Rsrc1 = Rsrc2

Delayed branching

Control is transferred after executing the instruction that follows the branch instruction
- This instruction slot is called the delay slot

Improves efficiency

Highly pipelined RISC processors support it

Flow of Control (cont.)

Procedure calls

Facilitate modular programming

Require two pieces of information to return
- End of procedure
  Pentium uses the ret instruction
  MIPS uses the jr instruction
- Return address
  In a (special) register: MIPS allows any general-purpose register
  On the stack: Pentium

Flow of Control (cont.)

[Figure: delay slot]

Parameter Passing

Two basic techniques

Register-based (e.g., PowerPC, MIPS)
- Internal registers are used
  Faster
  Limit the number of parameters
  Recursive procedure

Stack-based (e.g., Pentium)
- Stack is used
  More general

2 Operand Types

Instructions support basic data types

Characters

Integers

Floating-point

Instruction overload

Same instruction for different data types

Example: Pentium

mov AL,address     ; loads an 8-bit value
mov AX,address     ; loads a 16-bit value
mov EAX,address    ; loads a 32-bit value

Operand Types (cont.)

Separate instructions

Instructions specify the operand size

Example: MIPS

lb Rdest,address    ; loads a byte
lh Rdest,address    ; loads a halfword (16 bits)
lw Rdest,address    ; loads a word (32 bits)
ld Rdest,address    ; loads a doubleword (64 bits)

Similar instructions exist for store

3 Addressing Modes

How the operands are specified

Operands can be in three places

- Registers
  Register addressing mode
- Part of instruction
  Constant
  Immediate addressing mode
  All processors support these two addressing modes
- Memory
  Difference between RISC and CISC
  CISC supports a large variety of addressing modes
  RISC follows load/store architecture

4 Instruction Types

Several types of instructions

Data movement
- Pentium: mov dest,src
- Some do not provide direct data movement instructions
- Indirect data movement

  add Rdest,Rsrc,0    ; Rdest = Rsrc + 0

Arithmetic and Logical
- Arithmetic
  Integer and floating-point, signed and unsigned
  add, subtract, multiply, divide
- Logical
  and, or, not, xor

Instruction Types (cont.)

Condition code bits

S: Sign bit (0 = +, 1 = -)

Z: Zero bit (0 = nonzero, 1 = zero)

O: Overflow bit (0 = no overflow, 1 = overflow)

C: Carry bit (0 = no carry, 1 = carry)

Example: Pentium

cmp count,25    ; compare count to 25
                ; (subtract 25 from count)
je  target      ; jump if equal
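How a compare sets the four condition code bits can be sketched as follows. This is an illustrative model, not the Pentium's actual flag logic: the function names, the 16-bit default width, and the choice to treat the carry bit as a borrow on subtraction (the x86 convention) are assumptions of this example.

```python
def compare_flags(a, b, bits=16):
    """Return the S, Z, O, C bits produced by subtracting b from a."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "S": int(bool(result & sign)),   # 1 = result is negative
        "Z": int(result == 0),           # 1 = result is zero
        # Signed overflow: operands differ in sign and the result's sign differs from a's
        "O": int(bool((a ^ b) & (a ^ result) & sign)),
        "C": int(a < b),                 # 1 = borrow (unsigned a is less than b)
    }

def jump_if_equal(flags):
    """Set-then-jump: the branch only inspects the Z bit left by the compare."""
    return flags["Z"] == 1

print(compare_flags(25, 25), jump_if_equal(compare_flags(25, 25)))
```

The compare and the branch communicate only through the flags, which is what separates set-then-jump from test-and-jump instructions.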

Instruction Types (cont.)

Flow control and I/O instructions
- Branch
- Procedure call
- Interrupts

I/O instructions
- Memory-mapped I/O
  Most processors support memory-mapped I/O
  No separate instructions for I/O
- Isolated I/O
  Pentium supports isolated I/O
  Separate I/O instructions

  in  AX,io_port    ; read from an I/O port
  out io_port,AX    ; write to an I/O port

5 Instruction Formats

Two types

Fixed-length
- Used by RISC processors
- 32-bit RISC processors use 32-bit-wide instructions
  Examples: SPARC, MIPS, PowerPC

Variable-length
- Used by CISC processors
- Memory operands need more bits to specify

Opcode

Major and exact operation

Examples of Instruction Formats

RISC (Reduced Instruction Set Computer) versus CISC (Complex Instruction Set Computer)

RISC vs CISC

The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.

RISC systems access memory only with explicit load and store instructions.

In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.

RISC vs CISC

The difference between CISC and RISC becomes evident through the basic computer performance equation.

RISC systems shorten execution time by reducing the clock cycles per instruction.

CISC systems improve performance by reducing the number of instructions per program.
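The basic computer performance equation referred to on this slide is commonly written as below. This is the standard textbook form, given here as an assumption about the notation the slide used.

```latex
\[
\frac{\text{time}}{\text{program}}
  = \frac{\text{time}}{\text{cycle}}
  \times \frac{\text{cycles}}{\text{instruction}}
  \times \frac{\text{instructions}}{\text{program}}
\]
```

In these terms, RISC designs aim to lower the cycles-per-instruction factor, while CISC designs aim to lower the instructions-per-program factor.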

RISC vs CISC

The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.

The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.

With fixed-length instructions, RISC lends itself to pipelining and speculative execution.

RISC vs CISC

Consider the two program fragments:

CISC

       mov ax, 10
       mov bx, 5
       mul bx, ax

RISC

       mov ax, 0
       mov bx, 10
       mov cx, 5
Begin: add ax, bx
       loop Begin

The total clock cycles for the CISC version might be:

(2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles

While the clock cycles for the RISC version is:

(3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles

With the RISC clock cycle being shorter, RISC gives us much faster execution speeds.
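The cycle totals for the two fragments are simple weighted sums, sketched below. The instruction counts and per-instruction cycle costs (two 1-cycle movs and one 30-cycle mul for CISC; three movs, five adds, and five loops at 1 cycle each for RISC) are assumptions taken from the usual form of this example, and the helper name is hypothetical.

```python
def total_cycles(counts):
    """Sum count * cycles_each over (count, cycles_each) pairs."""
    return sum(n * c for n, c in counts)

cisc = total_cycles([(2, 1), (1, 30)])          # 2 movs, 1 mul
risc = total_cycles([(3, 1), (5, 1), (5, 1)])   # 3 movs, 5 adds, 5 loops
print(cisc, risc)
```

The RISC fragment executes more instructions but fewer cycles, which is the trade-off the slide is pointing at.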

RISC vs CISC

Because of their load-store ISAs, RISC architectures require a large number of CPU registers.

These registers provide fast access to data during sequential program execution.

They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.

Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.

RISC vs CISC

This is how registers can be overlapped in a RISC system.

The current window pointer (CWP) points to the active register window.

RISC vs CISC

It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.

Some RISC systems provide more extravagant instruction sets than some CISC systems.

Some systems combine both approaches.

The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.

RISC vs CISC

RISC                                          CISC
Multiple register sets                        Single register set
Three operands per instruction                One or two register operands per instruction
Parameter passing through register windows    Parameter passing through memory
Single-cycle instructions                     Multiple-cycle instructions
Hardwired control                             Microprogrammed control
Highly pipelined                              Less pipelined

RISC vs CISC (continued)

RISC                                          CISC
Simple instructions, few in number            Many complex instructions
Fixed-length instructions                     Variable-length instructions
Complexity in compiler                        Complexity in microcode
Only LOAD/STORE instructions access memory    Many instructions can access memory
Few addressing modes                          Many addressing modes

Summary

Instruction Set Design Issues

Instruction set design issues include:

Where are operands stored?
- registers, memory, stack, accumulator

How many explicit operands are there?
- 0, 1, 2, or 3

How is the operand location specified?
- register, immediate, indirect, ...

What type and size of operands are supported?
- byte, int, float, double, string, vector, ...

What operations are supported?
- add, sub, mul, move, compare, ...

More About General Purpose Registers

Why do almost all new architectures use GPRs?

Registers are much faster than memory (even cache)
- Register values are available immediately
- When memory isn't ready, the processor must wait ("stall")

Registers are convenient for variable storage
- Compiler assigns some variables just to registers
- More compact code, since small fields specify registers (compared to memory addresses)

[Figure: storage hierarchy: Processor with Registers, Cache, Memory, Disk]

What Operations are Needed

Arithmetic and Logical

Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT

Logical operation: AND, OR, XOR, NOT

Data Transfer - copy, load, store

Control - branch, jump, call, return

Floating Point: ADDF, MULF, DIVF

Same as arithmetic but usually take bigger operands

Decimal - ADDD, CONVERT

String - move, compare, search

Graphics - pixel and vertex, compression/decompression operations

Stack Architecture: Pros and Cons

Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures

Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures

Accumulator Architecture: Pros and Cons

Pros
- Very low hardware requirements
- Easy to design and understand

Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic

Memory-Memory Architecture: Pros and Cons

Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)

Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required

Memory-Register Architecture: Pros and Cons

Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density

Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers

Type and Size of Operands

The type of an operand is designated by encoding it in the instruction's operation code

The type of an operand, e.g. single-precision float, effectively gives its size

Common operand types include character, half-word and word-size integer, and single- and double-precision floating point

Characters are almost always in ASCII, integers in 2's complement, and floating point in IEEE 754

The 16-bit Unicode used in Java is gaining popularity due to its support for international character sets

For business applications, some architectures support a decimal format in binary coded decimal (BCD)

Depending on the size of the word, the complexity of handling different operand types differs

DSP offers fixed-point data types to support high-precision floating-point arithmetic and to allow sharing a single exponent among multiple numbers

For graphics applications, vertex and pixel operands are added features

Size of Operands

Double-word data type is used for double-precision floating-point operations and for address storage in machines with a 64-bit-wide address bus

Words are used for integer operations and for 32-bit address bus machines

Because of the mix in SPEC, word and double-word data types dominate

[Figure: frequency of reference by size, based on SPEC2000 on Alpha]

Instruction Representation

Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary, or base 2)

Numbers are stored in computers as a series of high and low electronic signals (binary numbers)

Binary digits are called bits and are considered the atom of computing

Each piece of an instruction is a number, and placing these numbers together forms the instruction

The assembler translates the symbolic assembly instructions into machine language instructions (machine code)

Example:

Assembly: add $t0, $s1, $s2

M/C language (decimal):

0        17       18       8        0        32

M/C language (binary):

000000   10001    10010    01000    00000    100000
6 bits   5 bits   5 bits   5 bits   5 bits   6 bits

Note: the MIPS compiler by default maps $s0,...,$s7 to registers 16-23 and $t0,...,$t7 to registers 8-15
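Placing the field numbers together to form the instruction can be sketched as bit packing. The function name `encode_r_format` is hypothetical, and the field values (0, 17, 18, 8, 0, 32 for add $t0, $s1, $s2) assume the standard MIPS encoding of `add` and the register mapping given in the note above.

```python
def encode_r_format(op, rs, rt, rd, shamt, funct):
    """Concatenate the six R-format fields (6, 5, 5, 5, 5, 6 bits) into a 32-bit word."""
    fields = [(op, 6), (rs, 5), (rt, 5), (rd, 5), (shamt, 5), (funct, 6)]
    word = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError("field value does not fit in its width")
        word = (word << width) | value
    return word

# add $t0, $s1, $s2: rs = $s1 = 17, rt = $s2 = 18, rd = $t0 = 8, funct = 32
r_word = encode_r_format(0, 17, 18, 8, 0, 32)
r_bits = format(r_word, "032b")
print(r_bits)
```

Reading the printed string in groups of 6, 5, 5, 5, 5, and 6 bits gives back the six decimal field values.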

Encoding an Instruction Set

Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation

The operation is typically specified in one field, called the opcode

The addressing mode for the operand can be encoded with the operation or, when a large number of modes is supported, specified through a separate identifier

The architecture must balance several competing factors:

Desire to support as many registers and addressing modes as possible

Effect of operand specification on the size of the instruction (program)

Desire to simplify instruction fetching and decoding during execution

Fixed-size instruction encoding simplifies the CPU design while limiting the addressing modes supported

An architect who cares about code size can use variable-size encoding

A hybrid approach is to allow variability by supporting multiple instruction sizes

Encoding Examples

MIPS Instruction format

Register-format instructions

op: Basic operation of the instruction, traditionally called the opcode
rs: The first register source operand
rt: The second register source operand
rd: The register destination operand; it gets the result of the operation
shamt: Shift amount
funct: This field selects the specific variant of the operation in the op field

Immediate-type instructions

Some instructions need longer fields than the register format provides, for example to hold a large constant.

The 16-bit address means a load word instruction can load a word within a region of ±2^15 (32,768) bytes of the address in the base register.

Example: lw $t0, 32($s3)    # Temporary register $t0 gets A[8]

Instruction  Format  op  rs   rt   rd   shamt  funct  address
add          R       0   reg  reg  reg  0      32     N/A
sub          R       0   reg  reg  reg  0      34     N/A
lw           I       35  reg  reg  N/A  N/A    N/A    address
sw           I       43  reg  reg  N/A  N/A    N/A    address

R-format fields: op (6 bits), rs (5 bits), rt (5 bits), rd (5 bits), shamt (5 bits), funct (6 bits)

I-format fields: op (6 bits), rs (5 bits), rt (5 bits), address (16 bits)

The Stored-Program Concept

Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept.

Today's computers are built on two key principles:
- Instructions are represented as numbers
- Programs can be stored in memory to be read or written just like numbers

The power of the concept: memory can contain
- the source code for an editor
- the compiled machine code for the editor
- the text that the compiled program is using
- the compiler that generated the code

[Figure: a processor connected to memory; memory holds an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and the source code in C for the editor program]

Compiling if-then-else in MIPS

Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiled MIPS code for the following C if statement?

if (i == j) f = g + h; else f = g - h;

[Figure: flowchart. Test i == j; if i = j, execute f = g + h; if i ≠ j, go to Else and execute f = g - h; both paths meet at Exit]

MIPS:

      bne $s3, $s4, Else   # go to Else if i ≠ j
      add $s0, $s1, $s2    # f = g + h (skipped if i ≠ j)
      j   Exit
Else: sub $s0, $s1, $s2    # f = g - h (skipped if i = j)
Exit:

Typical Compilation

Major Types of Optimization

Optimization name: Explanation (Frequency)

High-level (at or near source level; machine-independent)
- Procedure integration: Replace procedure call by procedure body. (N.M.)

Local (within straight-line code)
- Common subexpression elimination: Replace two instances of the same computation by a single copy.
- Constant propagation: Replace all instances of a variable that is assigned a constant with the constant.
- Stack height reduction: Rearrange the expression tree to minimize the resources needed for expression evaluation. (N.M.)

Global (across a branch)
- Global common subexpression elimination: Same as local, but this version crosses branches.
- Copy propagation: Replace all instances of a variable A that has been assigned X (i.e., A = X) with X.
- Code motion: Remove code from a loop that computes the same value each iteration of the loop.
- Induction variable elimination: Simplify/eliminate array addressing calculations within loops.

Machine-dependent (depends on machine knowledge)
- Strength reduction: Many examples, such as replacing multiply by a constant with adds and shifts. (N.M.)
- Pipeline scheduling: Reorder instructions to improve pipeline performance. (N.M.)

N.M. = not measured
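To make a few of these transformations concrete, the sketch below writes the same computation twice: once naively, and once the way a compiler might leave it after constant propagation, common subexpression elimination, code motion, and strength reduction. The function names and the computation itself are invented for illustration; real compilers apply these passes to an intermediate representation, not to source text.

```python
def unoptimized(a, b, n):
    total = 0
    for i in range(n):
        scale = 4              # a variable that is only ever assigned a constant
        x = (a + b) * scale    # (a + b) is computed here ...
        y = (a + b) + i        # ... and computed again here
        total += x + y
    return total


def optimized(a, b, n):
    t = a + b      # common subexpression computed once and hoisted out of the loop (code motion)
    x = t << 2     # constant propagation (scale = 4), then strength reduction (* 4 becomes a shift)
    total = 0
    for i in range(n):
        total += x + t + i
    return total


print(unoptimized(3, 5, 10), optimized(3, 5, 10))
```

The two functions return the same value for any integers a and b; only the amount of work inside the loop differs.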

Effect of Compiler Optimization

Measurements taken on S

[Figure: chart by program and compiler optimization level]

Level 0: non-optimized code
Level 1: local optimization
Level 2: global optimization, s/w pipelining
Level 3: adds procedure integration

Compiler Support for Multimedia Instr.

- Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics).
- Intel added a new set of instructions called Streaming SIMD Extension.
- A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer.
- Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations.
  - Strided addressing allows memory access in increments larger than one.
  - Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data.
- Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup.
- Such limited support for vector processing makes vectorizing compiler optimizations unpopular and restricts their scope to hand-coded routines.

SIMD instructions on MMX and AltiVec tend to be solutions, not primitives.
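The two addressing styles mentioned above can be sketched in a few lines. This is only a software model, with memory represented as a Python list and the helper names strided_load, gather, and scatter chosen for illustration; it shows which elements each style touches, not how vector hardware implements it.

```python
def strided_load(memory, base, stride, count):
    """Read count elements starting at base, stepping by stride each time."""
    return [memory[base + k * stride] for k in range(count)]


def gather(memory, indices):
    """Read the elements whose addresses are listed in an index vector."""
    return [memory[i] for i in indices]


def scatter(memory, indices, values):
    """Write values to the addresses listed in an index vector."""
    for i, v in zip(indices, values):
        memory[i] = v


# A 3 x 4 matrix stored row by row: element (r, c) lives at address 4*r + c.
memory = list(range(12))

column_1 = strided_load(memory, base=1, stride=4, count=3)  # one column = stride of 4
picked = gather(memory, [10, 0, 7])                         # arbitrary, non-regular addresses
scatter(memory, [10, 0, 7], [-1, -2, -3])

print(column_1, picked, memory)
```

Walking a matrix column is the typical case that needs a stride larger than one, which is why an instruction set without strided access has to fall back on element-by-element loads.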

Starting a Program

[Figure: C program → Compiler → Assembly language program → Assembler → Object: machine language module (together with Object: library routine, machine language) → Linker → Executable: machine language program → Loader → Memory]

Linker
- Places code and data modules symbolically in memory
- Determines the addresses of data and instruction labels
- Patches both internal and external references

Object files for Unix typically contain:
- Header: size and position of components
- Text segment: machine code
- Data segment: static and dynamic variables
- Relocation info: identifies absolute memory references
- Symbol table: names and locations of labels, procedures, and variables
- Debugging info: mapping of source to object code, breakpoints, etc.
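The three linker steps can be illustrated with a toy model. Everything here is an assumption made for the sketch: each "object module" is a dictionary with a code list, a table of label offsets, and a table of references that need patching, and one code word occupies one address. Real object formats are far richer.

```python
def link(modules):
    # Step 1: place the modules one after another in memory.
    base, next_free = {}, 0
    for module in modules:
        base[module["name"]] = next_free
        next_free += len(module["code"])

    # Step 2: determine the absolute address of every label.
    symbols = {}
    for module in modules:
        for label, offset in module["labels"].items():
            symbols[label] = base[module["name"]] + offset

    # Step 3: patch internal and external references with those addresses.
    image = []
    for module in modules:
        code = list(module["code"])
        for offset, label in module["refs"].items():
            opcode, _ = code[offset]
            code[offset] = (opcode, symbols[label])
        image.extend(code)
    return image, symbols


main_obj = {
    "name": "main",
    "code": [("call", None), ("jump", None)],
    "labels": {"main": 0, "done": 1},
    "refs": {0: "square", 1: "done"},   # one external and one internal reference
}
lib_obj = {
    "name": "lib",
    "code": [("mul", None), ("ret", None)],
    "labels": {"square": 0},
    "refs": {},
}

image, symbols = link([main_obj, lib_obj])
print(symbols)
print(image)
```

The refs table plays the role of the relocation info and the labels table plays the role of the symbol table listed above.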

Loading an Executable Program

[Figure: MIPS memory layout, from high to low addresses: stack ($sp at 7fff ffff hex), dynamic data, static data (starting at 1000 0000 hex, with $gp at 1000 8000 hex), text (pc starts at 0040 0000 hex), and a reserved region starting at address 0]

To load an executable, the operating system follows these steps:
1. Reads the executable file header to determine the size of the text and data segments
2. Creates an address space large enough for the text and data
3. Copies the instructions and data from the executable file into memory
4. Copies the parameters (if any) to the main program onto the stack
5. Initializes the machine registers and sets the stack pointer to the first free location
6. Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program

Instruction Set Design Issues

Instruction set design issues:
- Number of addresses
- Flow of control
- Operand types
- Addressing modes
- Instruction types
- Instruction formats

Number of Addresses

Four categories:
- 3-address machines
  - Two for the source operands and one for the result
- 2-address machines
  - One address doubles as source and result
- 1-address machines
  - Accumulator machines
  - Accumulator is used for one source and result
- 0-address machines
  - Stack machines
  - Operands are taken from the stack
  - Result goes onto the stack

Number of Addresses (cont.)

Three-address machines
- Two for the source operands, one for the result
- RISC processors use three addresses
- Sample instructions:

  add  dest,src1,src2    M(dest) = [src1] + [src2]
  sub  dest,src1,src2    M(dest) = [src1] - [src2]
  mult dest,src1,src2    M(dest) = [src1] * [src2]

Three addresses: Operand 1, Operand 2, Result
Example: a = b + c

Three-address instruction formats are not common because they require a relatively long instruction format to hold the three address references.

Number of Addresses (cont.)

Example
C statement:
  A = B + C * D - E + F + A
Equivalent code:
  mult T,C,D    ; T = C*D
  add  T,T,B    ; T = B + C*D
  sub  T,T,E    ; T = B + C*D - E
  add  T,T,F    ; T = B + C*D - E + F
  add  A,T,A    ; A = B + C*D - E + F + A
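The three-address sequence above can be checked with a very small interpreter. This is a sketch under stated assumptions: memory is a Python dictionary keyed by variable name, the tuple format (op, dest, src1, src2) is made up for the example, and the initial values are arbitrary.

```python
import operator

# Each operation reads two memory operands and writes a third.
OPS = {"add": operator.add, "sub": operator.sub, "mult": operator.mul}


def run_three_address(program, memory):
    for op, dest, src1, src2 in program:
        memory[dest] = OPS[op](memory[src1], memory[src2])
    return memory


program = [
    ("mult", "T", "C", "D"),   # T = C*D
    ("add",  "T", "T", "B"),   # T = B + C*D
    ("sub",  "T", "T", "E"),   # T = B + C*D - E
    ("add",  "T", "T", "F"),   # T = B + C*D - E + F
    ("add",  "A", "T", "A"),   # A = B + C*D - E + F + A
]
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6, "T": 0}
run_three_address(program, memory)
print(memory["A"])
```

Five instructions are enough, but each one carries three address fields, which is the instruction-length cost the slide refers to.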

Number of Addresses (cont.)

Two-address machines
- One address doubles (for source operand and result)
- Last example makes a case for it
  - Address T is used twice
- Sample instructions:

  load dest,src    M(dest) = [src]
  add  dest,src    M(dest) = [dest] + [src]
  sub  dest,src    M(dest) = [dest] - [src]
  mult dest,src    M(dest) = [dest] * [src]

Two addresses: one address doubles as operand and result
Example: a = a + b

The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation.

Number of Addresses (cont.)

Example
C statement:
  A = B + C * D - E + F + A
Equivalent code:
  load T,C    ; T = C
  mult T,D    ; T = C*D
  add  T,B    ; T = B + C*D
  sub  T,E    ; T = B + C*D - E
  add  T,F    ; T = B + C*D - E + F
  add  A,T    ; A = B + C*D - E + F + A
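A two-address version of the same check looks as follows. Again this is only a sketch: the tuple format (op, dest, src), the dictionary used as memory, and the initial values are assumptions made for illustration.

```python
def run_two_address(program, memory):
    for op, dest, src in program:
        if op == "load":
            memory[dest] = memory[src]
        elif op == "add":
            memory[dest] = memory[dest] + memory[src]   # dest is both source and result
        elif op == "sub":
            memory[dest] = memory[dest] - memory[src]
        elif op == "mult":
            memory[dest] = memory[dest] * memory[src]
        else:
            raise ValueError(f"unknown operation: {op}")
    return memory


program = [
    ("load", "T", "C"),   # T = C
    ("mult", "T", "D"),   # T = C*D
    ("add",  "T", "B"),   # T = B + C*D
    ("sub",  "T", "E"),   # T = B + C*D - E
    ("add",  "T", "F"),   # T = B + C*D - E + F
    ("add",  "A", "T"),   # A = B + C*D - E + F + A
]
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6, "T": 0}
run_two_address(program, memory)
print(memory["A"])
```

The extra load at the start is the price of letting one address double as source and result: C has to be copied into T so that the multiply does not overwrite it.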

Number of Addresses (cont.)

One-address machines
- Use a special set of registers called accumulators
  - Specify one source operand and receive the result
- Called accumulator machines
- Sample instructions:

  load  addr    accum = [addr]
  store addr    M[addr] = accum
  add   addr    accum = accum + [addr]
  sub   addr    accum = accum - [addr]
  mult  addr    accum = accum * [addr]

Number of Addresses (cont.)

Example
C statement:
  A = B + C * D - E + F + A
Equivalent code:
  load  C    ; load C into accum
  mult  D    ; accum = C*D
  add   B    ; accum = C*D + B
  sub   E    ; accum = B + C*D - E
  add   F    ; accum = B + C*D - E + F
  add   A    ; accum = B + C*D - E + F + A
  store A    ; store accum contents in A
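The accumulator sequence can be traced with a sketch like the one below. The single variable accum models the accumulator; the tuple format (op, addr), the dictionary used as memory, and the initial values are assumptions made for illustration.

```python
def run_accumulator(program, memory):
    accum = 0
    for op, addr in program:
        if op == "load":
            accum = memory[addr]
        elif op == "store":
            memory[addr] = accum
        elif op == "add":
            accum = accum + memory[addr]
        elif op == "sub":
            accum = accum - memory[addr]
        elif op == "mult":
            accum = accum * memory[addr]
        else:
            raise ValueError(f"unknown operation: {op}")
    return memory


program = [
    ("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
    ("add", "F"), ("add", "A"), ("store", "A"),
]
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
run_accumulator(program, memory)
print(memory["A"])
```

Every instruction names only one address, but every arithmetic step goes through memory and through the one accumulator.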

Number of Addresses (cont.)

Zero-address machines
- Stack supplies operands and receives the result
  - Special instructions to load and store use an address
- Called stack machines (Ex: HP3000, Burroughs B5500)
- Sample instructions:

  push addr    push([addr])
  pop  addr    pop([addr])
  add          push(pop + pop)
  sub          push(pop - pop)
  mult         push(pop * pop)

Number of Addresses (cont.)

Example
C statement:
  A = B + C * D - E + F + A
Equivalent code:
  push E
  push C
  push D
  mult
  push B
  add
  sub
  push F
  add
  push A
  add
  pop  A
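The stack sequence is easiest to follow by running it. In the sketch below, sub is taken to compute (first value popped) minus (second value popped), following the push(pop - pop) notation in the sample instructions; that reading is an assumption, and it is the reason E is pushed first. The tuple format and initial values are also made up for the example.

```python
def run_stack(program, memory):
    stack = []
    for instr in program:
        op = instr[0]
        if op == "push":
            stack.append(memory[instr[1]])
        elif op == "pop":
            memory[instr[1]] = stack.pop()
        else:
            first = stack.pop()    # value on top of the stack
            second = stack.pop()   # value just below it
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)
            elif op == "mult":
                stack.append(first * second)
            else:
                raise ValueError(f"unknown operation: {op}")
    return memory, stack


program = [
    ("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
    ("push", "B"), ("add",), ("sub",),
    ("push", "F"), ("add",), ("push", "A"), ("add",), ("pop", "A"),
]
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
memory, stack = run_stack(program, memory)
print(memory["A"], stack)
```

Only push and pop carry an address; the arithmetic instructions are zero-address, at the cost of twelve instructions instead of five.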

Load/Store Architecture

- Instructions expect operands in internal processor registers
- Special LOAD and STORE instructions move data between registers and memory
- RISC uses this architecture
- Reduces instruction length

Load/Store Architecture (cont.)

Sample instructions:

  load  Rd,addr       Rd = [addr]
  store addr,Rs       (addr) = Rs
  add   Rd,Rs1,Rs2    Rd = Rs1 + Rs2
  sub   Rd,Rs1,Rs2    Rd = Rs1 - Rs2
  mult  Rd,Rs1,Rs2    Rd = Rs1 * Rs2

Number of Addresses (cont.)

Example
C statement:
  A = B + C * D - E + F + A
Equivalent code:
  load  R1,B
  load  R2,C
  load  R3,D
  load  R4,E
  load  R5,F
  load  R6,A
  mult  R2,R2,R3
  add   R2,R2,R1
  sub   R2,R2,R4
  add   R2,R2,R5
  add   R2,R2,R6
  store A,R2
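A load/store machine can be modeled by keeping registers and memory in separate dictionaries, so that arithmetic instructions can only touch registers. This is a minimal sketch: the register names R1 to R6, the tuple formats, and the initial values are assumptions chosen for illustration.

```python
import operator

ALU = {"add": operator.add, "sub": operator.sub, "mult": operator.mul}


def run_load_store(program, memory):
    regs = {}
    for instr in program:
        op = instr[0]
        if op == "load":              # load Rd,addr
            _, rd, addr = instr
            regs[rd] = memory[addr]
        elif op == "store":           # store addr,Rs
            _, addr, rs = instr
            memory[addr] = regs[rs]
        else:                         # arithmetic works on registers only
            _, rd, rs1, rs2 = instr
            regs[rd] = ALU[op](regs[rs1], regs[rs2])
    return memory, regs


program = [
    ("load", "R1", "B"), ("load", "R2", "C"), ("load", "R3", "D"),
    ("load", "R4", "E"), ("load", "R5", "F"), ("load", "R6", "A"),
    ("mult", "R2", "R2", "R3"), ("add", "R2", "R2", "R1"),
    ("sub", "R2", "R2", "R4"), ("add", "R2", "R2", "R5"),
    ("add", "R2", "R2", "R6"), ("store", "A", "R2"),
]
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
memory, regs = run_load_store(program, memory)
print(memory["A"])
```

Memory is touched exactly seven times (six loads and one store); the five arithmetic instructions need only short register fields.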

Flow of Control

- Default is sequential flow
- Several instructions alter this default execution
  - Branches
    - Unconditional
    - Conditional
    - Delayed branches
  - Procedure calls
    - Delayed procedure calls

Flow of Control (cont.)

Branches
- Unconditional
  - Absolute address
  - PC-relative
    - Target address is specified relative to PC contents
    - Relocatable code
- Example: MIPS
  - Absolute address: j target
  - PC-relative: b target

Flow of Control (cont.)

Branches
- Conditional
  - Jump is taken only if the condition is met
- Two types
  - Set-Then-Jump
    - Condition testing is separated from branching
    - Condition code registers are used to convey the condition test result
    - Condition code registers keep a record of the status of the last ALU operation, such as an overflow condition
    - Example: Pentium code

        cmp AX,BX     ; compare AX and BX
        je  target    ; jump if equal

Flow of Control (cont.)

  - Test-and-Jump
    - Single instruction performs condition testing and branching
    - Example: MIPS instruction

        beq Rsrc1,Rsrc2,target

      Jumps to target if Rsrc1 = Rsrc2
- Delayed branching
  - Control is transferred after executing the instruction that follows the branch instruction
    - This instruction slot is called the delay slot
  - Improves efficiency
  - Highly pipelined RISC processors support it

Flow of Control (cont.)

Procedure calls
- Facilitate modular programming
- Require two pieces of information to return
  - End of procedure
    - Pentium uses the ret instruction
    - MIPS uses the jr instruction
  - Return address
    - In a (special) register: MIPS allows any general-purpose register
    - On the stack: Pentium

Flow of Control (cont.)

[Figure: delay slot]

Parameter Passing

Two basic techniques
- Register-based (e.g., PowerPC, MIPS)
  - Internal registers are used
    - Faster
    - Limits the number of parameters
    - Recursive procedures
- Stack-based (e.g., Pentium)
  - Stack is used
    - More general

2 Operand Types

- Instructions support basic data types
  - Characters
  - Integers
  - Floating-point
- Instruction overload
  - Same instruction for different data types
  - Example: Pentium

      mov AL,address     ; loads an 8-bit value
      mov AX,address     ; loads a 16-bit value
      mov EAX,address    ; loads a 32-bit value

Operand Types (cont.)

- Separate instructions
  - Instructions specify the operand size
  - Example: MIPS

      lb Rdest,address    ; loads a byte
      lh Rdest,address    ; loads a halfword (16 bits)
      lw Rdest,address    ; loads a word (32 bits)
      ld Rdest,address    ; loads a doubleword (64 bits)

    Similar instructions exist for store

3 Addressing Modes

- How the operands are specified
  - Operands can be in three places
    - Registers: register addressing mode
    - Part of the instruction (constant): immediate addressing mode
    - All processors support these two addressing modes
    - Memory
      - Difference between RISC and CISC
      - CISC supports a large variety of addressing modes
      - RISC follows the load/store architecture

4 Instruction Types

- Several types of instructions
  - Data movement
    - Pentium: mov dest,src
    - Some do not provide direct data movement instructions
    - Indirect data movement

        add Rdest,Rsrc,0    ; Rdest = Rsrc+0

  - Arithmetic and logical
    - Arithmetic
      - Integer and floating-point, signed and unsigned
      - add, subtract, multiply, divide
    - Logical
      - and, or, not, xor

Instruction Types (cont.)

- Condition code bits
  - S: Sign bit (0 = +, 1 = -)
  - Z: Zero bit (0 = nonzero, 1 = zero)
  - O: Overflow bit (0 = no overflow, 1 = overflow)
  - C: Carry bit (0 = no carry, 1 = carry)
- Example: Pentium

    cmp count,25    ; compare count to 25
                    ; (subtract 25 from count)
    je  target      ; jump if equal
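The way a compare sets these four bits can be modeled in software. The sketch below assumes a 32-bit word, two's-complement operands, and the x86-style convention that the carry bit records a borrow on subtraction; the function names compare and je_taken are invented for the example.

```python
def compare(a, b, bits=32):
    """Model cmp a,b: compute a - b, keep only the condition code bits."""
    mask = (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "Z": int(result == 0),                  # zero: operands were equal
        "S": int(bool(result & sign_bit)),      # sign: top bit of the result
        "C": int(a < b),                        # carry: borrow in unsigned subtraction
        # overflow: operands had different signs and the result's sign differs from a's
        "O": int(bool((a ^ b) & (a ^ result) & sign_bit)),
    }


def je_taken(flags):
    """A 'jump if equal' only looks at the zero bit left behind by the compare."""
    return flags["Z"] == 1


print(compare(25, 25), je_taken(compare(25, 25)))
print(compare(10, 25), je_taken(compare(10, 25)))
```

This is the set-then-jump style: the compare and the jump are separate instructions that communicate only through the condition code bits.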

Instruction Types (cont.)

- Flow control and I/O instructions
  - Branch
  - Procedure call
  - Interrupts
- I/O instructions
  - Memory-mapped I/O
    - Most processors support memory-mapped I/O
    - No separate instructions for I/O
  - Isolated I/O
    - Pentium supports isolated I/O
    - Separate I/O instructions

        in  AX,io_port     ; read from an I/O port
        out io_port,AX     ; write to an I/O port

5 Instruction Formats

- Two types
  - Fixed-length
    - Used by RISC processors
    - 32-bit RISC processors use 32-bit-wide instructions
    - Examples: SPARC, MIPS, PowerPC
  - Variable-length
    - Used by CISC processors
    - Memory operands need more bits to specify
- Opcode
  - Major and exact operation

Examples of Instruction Formats

RISC (Reduced Instruction Set Computer) versus CISC (Complex Instruction Set Computer)

RISC vs CISC

- The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.
- RISC systems access memory only with explicit load and store instructions.
- In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.

- The difference between CISC and RISC becomes evident through the basic computer performance equation:

    time / program = (time / cycle) × (cycles / instruction) × (instructions / program)

- RISC systems shorten execution time by reducing the clock cycles per instruction.
- CISC systems improve performance by reducing the number of instructions per program.


- The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.
- The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.
- With fixed-length instructions, RISC lends itself to pipelining and speculative execution.

Consider the two program fragments:

CISC:
         mov ax, 10
         mov bx, 5
         mul bx, ax

RISC:
         mov ax, 0
         mov bx, 10
         mov cx, 5
  Begin: add ax, bx
         loop Begin

The total clock cycles for the CISC version might be:
  (2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles

While the clock cycles for the RISC version are:
  (3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles

With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds.
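The cycle comparison is a direct application of "instruction count times cycles per instruction", summed over instruction types. The sketch below redoes that arithmetic and also runs the RISC-style loop to confirm it computes the same product. The specific numbers (a 30-cycle multiply, single-cycle mov, add, and loop, operands 10 and 5) are assumptions for the sake of the example, not measurements.

```python
def total_cycles(counts, cycles_per_instruction):
    """Sum of (how many times each instruction runs) x (cycles it takes)."""
    return sum(counts[name] * cycles_per_instruction[name] for name in counts)


# CISC-style fragment: two moves and one multi-cycle multiply.
cisc_cycles = total_cycles({"mov": 2, "mul": 1}, {"mov": 1, "mul": 30})

# RISC-style fragment: three moves, then add and loop executed five times each.
risc_cycles = total_cycles({"mov": 3, "add": 5, "loop": 5},
                           {"mov": 1, "add": 1, "loop": 1})


def multiply_by_repeated_add(multiplicand, multiplier):
    """What the RISC loop computes: add the multiplicand, multiplier times."""
    ax, bx, cx = 0, multiplicand, multiplier
    while cx > 0:
        ax += bx      # add ax, bx
        cx -= 1       # loop Begin: decrement cx, branch while nonzero
    return ax


print(cisc_cycles, risc_cycles, multiply_by_repeated_add(10, 5))
```

The RISC fragment executes more instructions but fewer total cycles under these assumptions; with a larger multiplier the balance would shift, since the loop count grows with it.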


- Because of their load-store ISAs, RISC architectures require a large number of CPU registers.
- These registers provide fast access to data during sequential program execution.
- They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.
- Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.

- This is how registers can be overlapped in a RISC system.
- The current window pointer (CWP) points to the active register window.

[Figure: overlapping register windows]


- It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.
- Some RISC systems provide more extravagant instruction sets than some CISC systems.
- Some systems combine both approaches.
- The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.


RISC
- Multiple register sets.
- Three operands per instruction.
- Parameter passing through register windows.
- Single-cycle instructions.
- Hardwired control.
- Highly pipelined.

CISC
- Single register set.
- One or two register operands per instruction.
- Parameter passing through memory.
- Multiple-cycle instructions.
- Microprogrammed control.
- Less pipelined.

(Continued)


RISC
- Simple instructions, few in number.
- Fixed-length instructions.
- Complexity in compiler.
- Only LOAD/STORE instructions access memory.
- Few addressing modes.

CISC
- Many complex instructions.
- Variable-length instructions.
- Complexity in microcode.
- Many instructions can access memory.
- Many addressing modes.

Summary

Instruction Set Design Issues

Instruction set design issues include:
- Where are operands stored?
  - registers, memory, stack, accumulator
- How many explicit operands are there?
  - 0, 1, 2, or 3
- How is the operand location specified?
  - register, immediate, indirect, ...
- What type and size of operands are supported?
  - byte, int, float, double, string, vector, ...
- What operations are supported?
  - add, sub, mul, move, compare, ...

More About General Purpose Registers

Why do almost all new architectures use GPRs?
- Registers are much faster than memory (even cache)
  - Register values are available immediately
  - When memory isn't ready, the processor must wait ("stall")
- Registers are convenient for variable storage
  - Compiler assigns some variables just to registers
  - More compact code, since small fields specify registers (compared to memory addresses)

[Figure: memory hierarchy: processor registers, cache, memory, disk]

What Operations are Needed

- Arithmetic and logical
  - Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
  - Logical operations: AND, OR, NOT
- Data transfer: copy, load, store
- Control: branch, jump, call, return
- Floating point: ADDF, MULF, DIVF (same as arithmetic, but usually take bigger operands)
- Decimal: ADDD, CONVERT
- String: move, compare, search
- Graphics: pixel and vertex, compression/decompression operations

Stack Architecture: Pros and Cons

Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures

Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures

Accumulator Architecture: Pros and Cons

Pros
- Very low hardware requirements
- Easy to design and understand

Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic

Memory-Memory Architecture: Pros and Cons

Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)

Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required

Memory-Register Architecture: Pros and Cons

Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density

Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers

Size of Operands

- The double-word data type is used for double-precision floating-point operations and for address storage in machines with a 64-bit-wide address bus.
- Words are used for integer operations and for 32-bit address bus machines.
- Because of the mix in SPEC, word and double-word data types dominate.

[Figure: frequency of reference by size, based on SPEC2000 on Alpha]

Instruction Representation

- Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary, or base 2).
- Numbers are stored in computers as a series of high and low electronic signals (binary numbers).
- Binary digits are called bits and are considered the atom of computing.
- Each piece of an instruction is a number, and placing these numbers together forms the instruction.
- The assembler translates symbolic assembly instructions into machine language instructions (machine code).

Example:

Assembly:                add $t0, $s1, $s2
M/C language (decimal):  0       17      18      8       0       32
M/C language (binary):   000000  10001   10010   01000   00000   100000
Field width:             6 bits  5 bits  5 bits  5 bits  5 bits  6 bits

Note: by default, the MIPS compiler maps $s0, ..., $s7 to registers 16-23 and $t0, ..., $t7 to registers 8-15.

Encoding an Instruction Set

- Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation.
- The operation is typically specified in one field, called the opcode.
- The addressing mode for each operand can be encoded with the operation, or specified through a separate identifier when a large number of modes is supported.

The architecture must balance beteen several competing factors6

esire to support as many registers and addressing modes as possible

ltffect of operand specification on the si-e of the instruction (program)

esire to simplify instruction fetching and decoding during eecution

Lied si-e instruction encoding simplify the CP4 design hile limiting theaddressing modes supported

An architect caring about the code si-e can use variable si-e encoding

A hybrid approach is to allo variability by supporting multiple$si-edinstruction

Encoding Examples

MIPS Instruction Format

Register-format (R-type) instructions

op: Basic operation of the instruction, traditionally called the opcode
rs: The first register source operand
rt: The second register source operand
rd: The register destination operand; it gets the result of the operation
shamt: Shift amount
funct: Function; this field selects the specific variant of the operation in the op field

Field layout (R-type): op (6 bits), rs (5 bits), rt (5 bits), rd (5 bits), shamt (5 bits), funct (6 bits)

Immediate-type (I-type) instructions

Some instructions need longer fields than those provided above, for example to hold large constants.

The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register.

Example: lw $t0, 32($s3)    # Temporary register $t0 gets A[8]

Field layout (I-type): op (6 bits), rs (5 bits), rt (5 bits), address (16 bits)

Instruction  Format  op   rs   rt   rd   shamt  funct  address
add          R       0    reg  reg  reg  0      32     N/A
sub          R       0    reg  reg  reg  0      34     N/A
lw           I       35   reg  reg  N/A  N/A    N/A    address
sw           I       43   reg  reg  N/A  N/A    N/A    address
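The I-type layout can be illustrated with a short sketch. The function name encode_i_type is hypothetical, and the register numbers ($s3 = 19, $t0 = 8) and the lw opcode (35) are the standard MIPS values, assumed here rather than taken from the slide.

```python
def encode_i_type(op, rs, rt, offset):
    """Pack an I-type word: op (6 bits), rs (5), rt (5), address (16).

    encode_i_type is a hypothetical helper written for illustration only.
    """
    # The address field is a signed 16-bit value, hence the +/- 2^15 reach.
    if not -(1 << 15) <= offset < (1 << 15):
        raise ValueError("offset does not fit in a signed 16-bit field")
    return (op << 26) | (rs << 21) | (rt << 16) | (offset & 0xFFFF)


# lw $t0, 32($s3): op = 35, base register $s3 = 19, target register $t0 = 8
LW_WORD = encode_i_type(35, 19, 8, 32)
print(hex(LW_WORD))
```

An offset outside the signed 16-bit range raises ValueError, which mirrors why such a load can only reach addresses near the base register.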

The Stored-Program Concept

Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept.

Today's computers are built on two key principles:

Instructions are represented as numbers.

Programs can be stored in memory to be read or written, just like numbers.

The power of the concept: memory can contain

the source code for an editor

the compiled machine code for the editor

the text that the compiled program is using

the compiler that generated the code

[Figure: a processor connected to a memory that holds an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and source code in C for the editor program]

Compiling if-then-else in MIPS

Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiled MIPS code for the following C if statement?

    if (i == j) f = g + h; else f = g - h;

[Figure: flowchart testing i == j; the equal path computes f = g + h, the Else path computes f = g - h, and both paths join at Exit]

MIPS:

          bne $s3, $s4, Else    # go to Else if i != j
          add $s0, $s1, $s2     # f = g + h (skipped if i != j)
          j   Exit              # go to Exit
    Else: sub $s0, $s1, $s2     # f = g - h (skipped if i == j)
    Exit:
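A minimal sketch of the control flow that the compiled sequence follows. The compiler tests the opposite condition (i != j) so that the then-part can fall through. The function run_if_else and its trace strings are hypothetical, written only to mirror the listing.

```python
def run_if_else(g, h, i, j):
    """Follow the same path as the bne / add / j / sub sequence."""
    executed = []
    if i != j:                       # bne $s3, $s4, Else  (branch taken)
        executed.append("bne (taken)")
        f = g - h                    # Else: sub $s0, $s1, $s2
        executed.append("sub")
    else:                            # branch not taken: fall through
        executed.append("bne (not taken)")
        f = g + h                    # add $s0, $s1, $s2
        executed.append("add")
        executed.append("j Exit")    # skip over the Else part
    return f, executed


print(run_if_else(5, 3, 1, 1))
print(run_if_else(5, 3, 1, 2))
```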

Typical Compilation

Major Types of Optimization

Optimization name: Explanation (Frequency)

High-level: at or near the source level; machine-independent

Procedure integration: Replace procedure call by procedure body (N.M.)

Local: within straight-line code

Common subexpression elimination: Replace two instances of the same computation by a single copy

Constant propagation: Replace all instances of a variable that is assigned a constant with the constant

Stack height reduction: Rearrange expression tree to minimize resources needed for expression evaluation (N.M.)

Global: across a branch

Global common subexpression elimination: Same as local, but this version crosses branches

Copy propagation: Replace all instances of a variable A that has been assigned X (i.e., A = X) with X

Code motion: Remove code from a loop that computes the same value each iteration of the loop

Induction variable elimination: Simplify/eliminate array addressing calculations within loops

Machine-dependent: depends on machine knowledge

Strength reduction: Many examples, such as replacing a multiply by a constant with adds and shifts (N.M.)

Pipeline scheduling: Reorder instructions to improve pipeline performance (N.M.)
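Three of the transformations listed above can be illustrated by hand on a small loop. This is only a sketch of what a compiler might do; the functions before and after are hypothetical and written so that both compute the same result.

```python
def before(a, b, n):
    """Unoptimized form: a + b is computed twice per iteration."""
    total = 0
    for k in range(n):
        x = (a + b) * 8      # loop-invariant value, recomputed every time
        y = (a + b) + k      # second instance of the same subexpression
        total += x + y
    return total


def after(a, b, n):
    """Hand-optimized form of the same computation."""
    t = a + b                # common subexpression elimination
    x = t << 3               # code motion out of the loop, plus strength
                             # reduction (multiply by 8 becomes a shift)
    total = 0
    for k in range(n):
        total += x + t + k
    return total


print(before(2, 3, 4), after(2, 3, 4))
```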

Effect of Compiler Optimization

Measurements taken on S

[Figure: measurements plotted by program and compiler optimization level]

Level 0: non-optimized code

Level 1: local optimization

Level 2: global optimization, s/w pipelining

Level 3: adds procedure integration

Compiler Support for Multimedia Instr.

Intel's MMX and PowerPC AltiVec have small vector-processing capabilities targeting multimedia applications (to speed up graphics).

Intel added a new set of instructions called Streaming SIMD Extensions.

A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer.

Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations.

Strided addressing allows memory access in increments larger than one.

Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data.

Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup.

Such limited support for vector processing makes vectorizing compiler optimizations unpopular and restricts their scope to hand-coded routines.

SIMD instructions on MMX and AltiVec tend to be solutions, not primitives.
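The two addressing styles can be sketched on a flat list that stands in for memory. The helper names strided_load, gather, and scatter are hypothetical, and the addresses are plain list indices rather than byte addresses.

```python
def strided_load(memory, base, stride, count):
    """Load count elements starting at base, stepping by stride."""
    return [memory[base + k * stride] for k in range(count)]


def gather(memory, index_vector):
    """Load the elements whose addresses are held in index_vector."""
    return [memory[i] for i in index_vector]


def scatter(memory, index_vector, values):
    """Store values to the addresses held in index_vector."""
    for i, v in zip(index_vector, values):
        memory[i] = v


memory = list(range(100, 116))          # memory[k] holds 100 + k
column = strided_load(memory, base=1, stride=4, count=4)
picked = gather(memory, [7, 0, 12])
scatter(memory, [7, 0, 12], [-1, -2, -3])
print(column, picked, memory[:8])
```

With a stride of 4 the load walks one column of a 4-wide row-major matrix, which is the kind of access a unit-stride-only SIMD extension cannot express directly.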

Starting a Program

[Figure: C program -> Compiler -> Assembly language program -> Assembler -> Object: machine language module, together with Object: library routine (machine language) -> Linker -> Executable: machine language program -> Loader -> Memory]

Linker

- Places code and data modules symbolically in memory
- Determines the addresses of data and instruction labels
- Patches both internal and external references

Object files for Unix typically contain:

Header: size and position of components

Text segment: machine code

Data segment: static and dynamic variables

Relocation info: identifies absolute memory references

Symbol table: names and locations of labels, procedures, and variables

Debugging info: mapping of source to object code, breakpoints, etc.

Loading Executable Program

[Figure: memory layout. From the top: Stack ($sp at 7fff ffff hex), Dynamic data, Static data ($gp at 1000 8000 hex, segment starting at 1000 0000 hex), Text (pc at 0040 0000 hex), Reserved (from address 0)]

To load an executable, the operating system follows these steps:

Reads the executable file header to determine the size of the text and data segments

Creates an address space large enough for the text and data

Copies the instructions and data from the executable file into memory

Copies the parameters (if any) to the main program onto the stack

Initializes the machine registers and sets the stack pointer to the first free location

Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program

Instruction Set Design Issues

Number of Addresses

Flow of Control

Operand Types

Addressing Modes

Instruction Types

Instruction Formats

Number of Addresses

Four categories

3-address machines
- 2 for the source operands and one for the result

2-address machines
- One address doubles as source and result

1-address machines
- Accumulator machines
- Accumulator is used for one source and result

0-address machines
- Stack machines
- Operands are taken from the stack
- Result goes onto the stack

Number of Addresses (cont.)

Three-address machines

Two for the source operands, one for the result

RISC processors use three addresses

Sample instructions

    add  dest,src1,src2    ; M(dest) = [src1] + [src2]
    sub  dest,src1,src2    ; M(dest) = [src1] - [src2]
    mult dest,src1,src2    ; M(dest) = [src1] * [src2]

Three addresses: Operand 1, Operand 2, Result

Example: a = b + c

Three-address instruction formats are not common because they require a relatively long instruction format to hold the three address references.

Number of Addresses (cont.)

Example

C statement

    A = B + C * D - E + F + A

Equivalent code:

    mult T,C,D    ; T = C*D
    add  T,T,B    ; T = B+C*D
    sub  T,T,E    ; T = B+C*D-E
    add  T,T,F    ; T = B+C*D-E+F
    add  A,T,A    ; A = B+C*D-E+F+A
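The three-address sequence can be checked with a small interpreter. This is a sketch: run_three_address is a hypothetical helper, memory is modeled as a dictionary keyed by variable name, and the initial values are arbitrary.

```python
def run_three_address(program, memory):
    """Execute (op, dest, src1, src2) tuples against a dict-based memory."""
    ops = {
        "add": lambda x, y: x + y,
        "sub": lambda x, y: x - y,
        "mult": lambda x, y: x * y,
    }
    for op, dest, src1, src2 in program:
        memory[dest] = ops[op](memory[src1], memory[src2])
    return memory


program = [
    ("mult", "T", "C", "D"),   # T = C*D
    ("add", "T", "T", "B"),    # T = B+C*D
    ("sub", "T", "T", "E"),    # T = B+C*D-E
    ("add", "T", "T", "F"),    # T = B+C*D-E+F
    ("add", "A", "T", "A"),    # A = B+C*D-E+F+A
]
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6, "T": 0}
result = run_three_address(program, memory)
print(result["A"])
```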

Number of Addresses (cont.)

Two-address machines

One address doubles (for source operand and result)

Last example makes a case for it
- Address T is used twice

Sample instructions

    load dest,src    ; M(dest) = [src]
    add  dest,src    ; M(dest) = [dest] + [src]
    sub  dest,src    ; M(dest) = [dest] - [src]
    mult dest,src    ; M(dest) = [dest] * [src]

Two addresses: one address doubles as operand and result

Example: a = a + b

The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation.

Number of Addresses (cont.)

Example

C statement

    A = B + C * D - E + F + A

Equivalent code:

    load T,C    ; T = C
    mult T,D    ; T = C*D
    add  T,B    ; T = B+C*D
    sub  T,E    ; T = B+C*D-E
    add  T,F    ; T = B+C*D-E+F
    add  A,T    ; A = B+C*D-E+F+A

Number of Addresses (cont.)

One-address machines

Use a special set of registers called accumulators
- Specify one source operand and receive the result

Called accumulator machines

Sample instructions

    load  addr    ; accum = [addr]
    store addr    ; M[addr] = accum
    add   addr    ; accum = accum + [addr]
    sub   addr    ; accum = accum - [addr]
    mult  addr    ; accum = accum * [addr]

Number of Addresses (cont.)

Example

C statement

    A = B + C * D - E + F + A

Equivalent code:

    load  C    ; load C into accum
    mult  D    ; accum = C*D
    add   B    ; accum = C*D+B
    sub   E    ; accum = B+C*D-E
    add   F    ; accum = B+C*D-E+F
    add   A    ; accum = B+C*D-E+F+A
    store A    ; store accum contents in A
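A sketch of an accumulator machine running the sequence above. The name run_accumulator is hypothetical and the initial memory values are arbitrary; the point is that every instruction names only one address, with the accumulator as the implicit second operand and destination.

```python
def run_accumulator(program, memory):
    """Execute (op, addr) pairs; accum is the implicit operand and result."""
    accum = 0
    for op, addr in program:
        if op == "load":
            accum = memory[addr]
        elif op == "store":
            memory[addr] = accum
        elif op == "add":
            accum += memory[addr]
        elif op == "sub":
            accum -= memory[addr]
        elif op == "mult":
            accum *= memory[addr]
        else:
            raise ValueError(f"unknown operation: {op}")
    return memory


acc_program = [("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
               ("add", "F"), ("add", "A"), ("store", "A")]
acc_memory = run_accumulator(
    acc_program, {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6})
print(acc_memory["A"])
```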

Number of Addresses (cont.)

Zero-address machines

Stack supplies operands and receives the result
- Special instructions to load and store use an address

Called stack machines (Ex: HP3000, Burroughs B5500)

Sample instructions

    push addr    ; push([addr])
    pop  addr    ; pop([addr])
    add          ; push(pop + pop)
    sub          ; push(pop - pop)
    mult         ; push(pop * pop)

Number of Addresses (cont.)

Example

C statement

    A = B + C * D - E + F + A

Equivalent code:

    push E
    push C
    push D
    mult
    push B
    add
    sub
    push F
    add
    push A
    add
    pop  A
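The stack sequence can be traced with a minimal interpreter. This is a sketch under one assumption: for sub, the first value popped (the top of the stack) is the left operand, matching push(pop - pop) in the sample instructions. The name run_stack_machine and the initial values are hypothetical.

```python
def run_stack_machine(program, memory):
    """Execute zero-address code; only push and pop carry an address."""
    stack = []
    for instr in program:
        op = instr[0]
        if op == "push":
            stack.append(memory[instr[1]])
        elif op == "pop":
            memory[instr[1]] = stack.pop()
        else:
            first = stack.pop()     # top of stack
            second = stack.pop()    # element below it
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)
            elif op == "mult":
                stack.append(first * second)
            else:
                raise ValueError(f"unknown operation: {op}")
    return memory


stack_program = [("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
                 ("push", "B"), ("add",), ("sub",), ("push", "F"), ("add",),
                 ("push", "A"), ("add",), ("pop", "A")]
stack_memory = run_stack_machine(
    stack_program, {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6})
print(stack_memory["A"])
```

Pushing E first is what makes the later sub compute (B + C*D) - E rather than the reverse.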

Load/Store Architecture

Instructions expect operands in internal processor registers

Special LOAD and STORE instructions move data between registers and memory

RISC uses this architecture

Reduces instruction length

Load/Store Architecture (cont.)

Sample instructions

    load  Rd,addr       ; Rd = [addr]
    store addr,Rs       ; (addr) = Rs
    add   Rd,Rs1,Rs2    ; Rd = Rs1 + Rs2
    sub   Rd,Rs1,Rs2    ; Rd = Rs1 - Rs2
    mult  Rd,Rs1,Rs2    ; Rd = Rs1 * Rs2

Number of Addresses (cont.)

Example

C statement

    A = B + C * D - E + F + A

Equivalent code:

    load  R1,B
    load  R2,C
    load  R3,D
    load  R4,E
    load  R5,F
    load  R6,A
    mult  R2,R2,R3
    add   R2,R2,R1
    sub   R2,R2,R4
    add   R2,R2,R5
    add   R2,R2,R6
    store A,R2

Flow of Control

Default is sequential flow

Several instructions alter this default execution

Branches
- Unconditional
- Conditional
- Delayed branches

Procedure calls
- Delayed procedure calls

Flow of Control (cont.)

Branches

Unconditional
- Absolute address
- PC-relative
  - Target address is specified relative to PC contents
  - Relocatable code

Example: MIPS
- Absolute address

    j target

- PC-relative

    b target
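Why PC-relative branches give relocatable code can be shown with a simplified model. This sketch assumes the target is just PC plus a byte offset; real MIPS branches are relative to the instruction after the branch and count in words. The function names and the load addresses are hypothetical.

```python
def pc_relative_target(pc, offset):
    """Target of a PC-relative branch: the offset is added to the PC."""
    return pc + offset


def absolute_target(address_field):
    """Target of an absolute jump: the address field is used as-is."""
    return address_field


# Hypothetical layout: branch at +0x20 and its target at +0x40 in the code.
OLD_BASE, NEW_BASE = 0x1000, 0x5000
offset = 0x40 - 0x20                 # stored in the PC-relative branch
address_field = OLD_BASE + 0x40      # stored in the absolute jump

# After the code is moved to NEW_BASE, the same offset still works,
# while the absolute address field would have to be patched.
relocated_relative = pc_relative_target(NEW_BASE + 0x20, offset)
relocated_absolute = absolute_target(address_field)
print(hex(relocated_relative), hex(relocated_absolute))
```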

Flow of Control (cont.)

Branches

Conditional
- Jump is taken only if the condition is met

Two types
- Set-Then-Jump
  - Condition testing is separated from branching
  - Condition code registers are used to convey the condition test result
  - Condition code registers keep a record of the status of the last ALU operation, such as an overflow condition
- Example: Pentium code

    cmp AX,BX     ; compare AX and BX
    je  target    ; jump if equal

Flow of Control (cont.)

- Test-and-Jump
  - Single instruction performs condition testing and branching
- Example: MIPS instruction

    beq Rsrc1,Rsrc2,target    ; jumps to target if Rsrc1 = Rsrc2

Delayed branching

Control is transferred after executing the instruction that follows the branch instruction
- This instruction slot is called the delay slot

Improves efficiency

Highly pipelined RISC processors support delayed branching

Flow of Control (cont.)

Procedure calls

Facilitate modular programming

Require two pieces of information to return
- End of procedure
  - Pentium uses the ret instruction
  - MIPS uses the jr instruction
- Return address
  - In a (special) register: MIPS allows any general-purpose register
  - On the stack: Pentium

Flow of Control (cont.)

[Figure: delay slot]

Parameter Passing

Two basic techniques

Register-based (e.g., PowerPC, MIPS)
- Internal registers are used
  - Faster
  - Limits the number of parameters
  - Recursive procedures

Stack-based (e.g., Pentium)
- Stack is used
  - More general

2 Operand Types

Instructions support basic data types
- Characters
- Integers
- Floating-point

Instruction overload
- Same instruction for different data types
- Example: Pentium

    mov AL,address     ; loads an 8-bit value
    mov AX,address     ; loads a 16-bit value
    mov EAX,address    ; loads a 32-bit value

Operand Types (cont.)

Separate instructions
- Instructions specify the operand size
- Example: MIPS

    lb Rdest,address    ; loads a byte
    lh Rdest,address    ; loads a halfword (16 bits)
    lw Rdest,address    ; loads a word (32 bits)
    ld Rdest,address    ; loads a doubleword (64 bits)

Similar instructions exist for store

3 Addressing Modes

How the operands are specified

Operands can be in three places
- Registers
  - Register addressing mode
- Part of instruction
  - Constant
  - Immediate addressing mode
  - All processors support these two addressing modes
- Memory
  - Difference between RISC and CISC
  - CISC supports a large variety of addressing modes
  - RISC follows load/store architecture

4 Instruction Types

Several types of instructions

Data movement
- Pentium: mov dest,src
- Some do not provide direct data movement instructions
- Indirect data movement

    add Rdest,Rsrc,0    ; Rdest = Rsrc+0

Arithmetic and Logical
- Arithmetic
  - Integer and floating-point, signed and unsigned
  - add, subtract, multiply, divide
- Logical
  - and, or, not, xor

Instruction Types (cont.)

Condition code bits

S: Sign bit (0 = +, 1 = -)

Z: Zero bit (0 = nonzero, 1 = zero)

O: Overflow bit (0 = no overflow, 1 = overflow)

C: Carry bit (0 = no carry, 1 = carry)

Example: Pentium

    cmp count,25    ; compare count to 25
                    ; subtract 25 from count
    je  target      ; jump if equal
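How a compare sets the four condition code bits can be sketched for 32-bit operands. The function flags_after_sub is hypothetical, and it assumes the x86 convention that the carry bit after a subtraction records a borrow.

```python
MASK32 = 0xFFFFFFFF


def flags_after_sub(a, b):
    """Return the S, Z, O, C bits produced by computing a - b in 32 bits."""
    a &= MASK32
    b &= MASK32
    result = (a - b) & MASK32
    sign = result >> 31
    zero = int(result == 0)
    # Signed overflow: the operands differ in sign and the result's sign
    # differs from the sign of a.
    overflow = (((a ^ b) & (a ^ result)) >> 31) & 1
    carry = int(a < b)          # assumed convention: carry means borrow
    return {"S": sign, "Z": zero, "O": overflow, "C": carry}


# cmp count,25 followed by je target: the jump is taken when Z = 1.
print(flags_after_sub(25, 25))
print(flags_after_sub(10, 25))
```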

Instruction Types (cont.)

Flow control and I/O instructions
- Branch
- Procedure call
- Interrupts

I/O instructions
- Memory-mapped I/O
  - Most processors support memory-mapped I/O
  - No separate instructions for I/O
- Isolated I/O
  - Pentium supports isolated I/O
  - Separate I/O instructions

    in  AX,io_port    ; read from an I/O port
    out io_port,AX    ; write to an I/O port

5 Instruction Formats

Two types

Fixed-length
- Used by RISC processors
- 32-bit RISC processors use 32-bit-wide instructions
  - Examples: SPARC, MIPS, PowerPC

Variable-length
- Used by CISC processors
- Memory operands need more bits to specify

Opcode
- Major and exact operation

Examples of Instruction Formats

RISC (Reduced Instruction Set Computer) versus CISC (Complex Instruction Set Computer)

RISC vs. CISC

The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.

RISC systems access memory only with explicit load and store instructions.

In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.

RISC vs. CISC

The difference between CISC and RISC becomes evident through the basic computer performance equation:

    time/program = (time/cycle) x (cycles/instruction) x (instructions/program)

RISC systems shorten execution time by reducing the clock cycles per instruction.

CISC systems improve performance by reducing the number of instructions per program.
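The performance equation is a product of three factors, which a short sketch makes concrete. The function execution_time and all of the numbers below are hypothetical, chosen only to show the two design styles pulling on different factors.

```python
def execution_time(instructions, cycles_per_instruction, ns_per_cycle):
    """time/program = instructions x cycles/instruction x time/cycle."""
    return instructions * cycles_per_instruction * ns_per_cycle


# Hypothetical program: the RISC version needs more instructions but
# fewer cycles per instruction; the CISC version is the other way round.
risc_ns = execution_time(instructions=1000, cycles_per_instruction=1,
                         ns_per_cycle=2)
cisc_ns = execution_time(instructions=400, cycles_per_instruction=4,
                         ns_per_cycle=2)
print(risc_ns, cisc_ns)
```

Which style wins depends entirely on the three factors for a given program and machine; the figures here are not measurements.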

RISC vs. CISC

The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.

The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.

With fixed-length instructions, RISC lends itself to pipelining and speculative execution.

RISC vs. CISC

Consider the following program fragments:

CISC

    mov ax, 10
    mov bx, 5
    mul bx, ax

RISC

           mov ax, 0
           mov bx, 10
           mov cx, 5
    Begin: add ax, bx
           loop Begin

The total clock cycles for the CISC version might be:

    (2 movs x 1 cycle) + (1 mul x 30 cycles) = 32 cycles

While the clock cycles for the RISC version are:

    (3 movs x 1 cycle) + (5 adds x 1 cycle) + (5 loops x 1 cycle) = 13 cycles

With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds.
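The cycle totals can be recomputed with a short sketch. It assumes the per-instruction costs used in the example (1 cycle each for mov, add, and loop, 30 cycles for mul) and a loop body that runs 5 times; total_cycles is a hypothetical helper.

```python
def total_cycles(counts_and_costs):
    """Sum count x cycles over (count, cycles_per_instruction) pairs."""
    return sum(count * cycles for count, cycles in counts_and_costs)


# CISC: 2 movs at 1 cycle each, 1 mul at an assumed 30 cycles.
cisc_cycles = total_cycles([(2, 1), (1, 30)])

# RISC: 3 movs, then 5 iterations of add and loop, all at 1 cycle each.
risc_cycles = total_cycles([(3, 1), (5, 1), (5, 1)])

print(cisc_cycles, risc_cycles)
```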

RISC vs. CISC

Because of their load-store ISAs, RISC architectures require a large number of CPU registers.

These registers provide fast access to data during sequential program execution.

They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.

Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.

RISC vs. CISC

This is how registers can be overlapped in a RISC system.

The current window pointer (CWP) points to the active register window.

[Figure: overlapping register windows]

RISC vs. CISC

It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.

Some RISC systems provide more extravagant instruction sets than some CISC systems.

Some systems combine both approaches.

The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.

RISC vs. CISC

RISC
- Multiple register sets.
- Three operands per instruction.
- Parameter passing through register windows.
- Single-cycle instructions.
- Hardwired control.
- Highly pipelined.

CISC
- Single register set.
- One or two register operands per instruction.
- Parameter passing through memory.
- Multiple-cycle instructions.
- Microprogrammed control.
- Less pipelined.

Continued...

RISC vs. CISC

RISC
- Simple instructions, few in number.
- Fixed-length instructions.
- Complexity in compiler.
- Only LOAD/STORE instructions access memory.
- Few addressing modes.

CISC
- Many complex instructions.
- Variable-length instructions.
- Complexity in microcode.
- Many instructions can access memory.
- Many addressing modes.

Summary

Instruction Set Design Issues

Instruction set design issues include:

Where are operands stored?
- registers, memory, stack, accumulator

How many explicit operands are there?
- 0, 1, 2, or 3

How is the operand location specified?
- register, immediate, indirect, ...

What type and size of operands are supported?
- byte, int, float, double, string, vector, ...

What operations are supported?
- add, sub, mul, move, compare, ...

More About General Purpose Registers

Why do almost all new architectures use GPRs?

Registers are much faster than memory (even cache)
- Register values are available immediately
- When memory isn't ready, the processor must wait ("stall")

Registers are convenient for variable storage
- The compiler assigns some variables just to registers
- More compact code, since small fields specify registers (compared to memory addresses)

[Figure: storage hierarchy: Processor (Registers, Cache), Memory, Disk]

What Operations are Needed

Arithmetic + Logical
- Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
- Logical operation: AND, OR, XOR, NOT

Data Transfer: copy, load, store

Control: branch, jump, call, return

Floating Point: ADDF, MULF, DIVF
- Same as arithmetic, but usually takes bigger operands

Decimal: ADDD, CONVERT

String: move, compare, search

Graphics: pixel and vertex, compression/decompression operations

Stacks Architecture: Pros and Cons

Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures

Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures

Accumulators Architecture: Pros and Cons

Pros
- Very low hardware requirements
- Easy to design and understand

Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic

Memory-Memory Architecture: Pros and Cons

Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)

Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required

Memory-Register Architecture: Pros and Cons

Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density

Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers

Instruction Representation

Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary, or base 2).

Numbers are stored in computers as a series of high and low electronic signals (binary numbers).

Binary digits are called bits and are considered the atom of computing.

Each piece of an instruction is a number, and placing these numbers together forms the instruction.

The assembler translates symbolic assembly instructions into machine language instructions (machine code).

Example:

Assembly: add $t0, $s1, $s2

M/C language (decimal):  0       17     18     8      0      32

M/C language (binary):   000000  10001  10010  01000  00000  100000

Field widths:            6 bits  5 bits 5 bits 5 bits 5 bits 6 bits

Note: the MIPS compiler by default maps $s0, ..., $s7 to registers 16-23 and $t0, ..., $t7 to registers 8-15.
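Placing the field numbers together can be sketched as shifts and ORs. The helper names encode_r_type and split_fields are hypothetical; the field values are those of add $t0, $s1, $s2 under the standard MIPS register numbering ($s1 = 17, $s2 = 18, $t0 = 8) with funct 32 for add.

```python
def encode_r_type(op, rs, rt, rd, shamt, funct):
    """Pack six fields of widths 6, 5, 5, 5, 5, 6 into one 32-bit word."""
    return ((op << 26) | (rs << 21) | (rt << 16)
            | (rd << 11) | (shamt << 6) | funct)


def split_fields(word):
    """Render a 32-bit word as its six binary fields, for inspection."""
    bits = format(word, "032b")
    widths = [6, 5, 5, 5, 5, 6]
    fields, start = [], 0
    for width in widths:
        fields.append(bits[start:start + width])
        start += width
    return " ".join(fields)


# add $t0, $s1, $s2  ->  op = 0, rs = 17, rt = 18, rd = 8, shamt = 0, funct = 32
ADD_WORD = encode_r_type(0, 17, 18, 8, 0, 32)
print(hex(ADD_WORD))
print(split_fields(ADD_WORD))
```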

ncoding an Instruction Set

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 41101

ncoding an Instruction Set Instruction encoding affects the si-e of the compiled program and the

compleity of the CP4 implementation

The operation is typically specified in one field called opcode The addressing mode for the operand can be encoded ith the operation

or specified through a separate identifier in case of large number ofsupported modes

The architecture must balance beteen several competing factors6

esire to support as many registers and addressing modes as possible

ltffect of operand specification on the si-e of the instruction (program)

esire to simplify instruction fetching and decoding during eecution

Lied si-e instruction encoding simplify the CP4 design hile limiting theaddressing modes supported

An architect caring about the code si-e can use variable si-e encoding

A hybrid approach is to allo variability by supporting multiple$si-edinstruction

ncoding 7amples

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 42101

ncoding 7amples

MIPS Instruction format

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 43101

MIPS Instruction format Register3format instructions

op6 8asic operation of the instruction traditionally called opcoders6 The first register source operandrt 6 The second register source operandrd 6 The register destination operand it gets the result of the operations$mat 6 Shift amountfunct 6 This field selects the specific variant of the operation of the op field

Immediate3type instructions

Some instructions need longer fields than provided for large value constant

The $bit address means a load ord instruction can load a ord ithin a

region of plusmn

bytes of the address in the base register ltample6 l Rtgt (Rs) G Temporary register Rtgt gets A=+Instruction 6ormat op rs rt rd shamt funct address

add 1 gt reg reg reg gt 72A

sub 1 gt reg reg reg gt amp 72A

l I reg reg 72A 72A 72A address

s I amp reg reg 72A 72A 72A address

o p r s f u n c ts h a m tr dr t b i t s b i t s b i t s b i t s b i t s b i t s

o p r s a d d r e s sr t b i t s b i t s b i t s b i t s

The Stored Program Concept

Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept.

Today's computers are built on two key principles:

- Instructions are represented as numbers
- Programs can be stored in memory to be read or written just like numbers

The power of the concept: memory can contain

- the source code for an editor
- the compiled machine code for the editor
- the text that the compiled program is using
- the compiler that generated the code

[Figure: a processor connected to a memory that holds an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and the source code in C for the editor program]

Compiling if-then-else in MIPS

Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiled MIPS code for the following C if statement?

    if (i == j) f = g + h; else f = g - h;

[Figure: flowchart that tests i == j; the i == j path computes f = g + h, the i ≠ j path (Else:) computes f = g - h, and both paths meet at Exit:]

MIPS:

          bne $s3, $s4, Else    # go to Else if i ≠ j
          add $s0, $s1, $s2     # f = g + h (skipped if i ≠ j)
          j   Exit
    Else: sub $s0, $s1, $s2     # f = g - h (skipped if i = j)
    Exit:

Typical Compilation

Major Types of Optimization

Columns: optimization name, explanation, frequency (N.M.: not measured)

High-level (at or near the source level; machine-independent)

- Procedure integration: Replace procedure call by procedure body. Frequency: N.M.

Local (within straight-line code)

- Common subexpression elimination: Replace two instances of the same computation by a single copy
- Constant propagation: Replace all instances of a variable that is assigned a constant with the constant
- Stack height reduction: Rearrange expression tree to minimize resources needed for expression evaluation. Frequency: N.M.

Global (across a branch)

- Global common subexpression elimination: Same as local, but this version crosses branches
- Copy propagation: Replace all instances of a variable A that has been assigned X (i.e., A = X) with X
- Code motion: Remove code from a loop that computes the same value in each iteration of the loop
- Induction variable elimination: Simplify/eliminate array-addressing calculations within loops

Machine-dependent (depends on machine knowledge)

- Strength reduction: Many examples, such as replacing a multiply by a constant with adds and shifts. Frequency: N.M.
- Pipeline scheduling: Reorder instructions to improve pipeline performance. Frequency: N.M.
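Three of the transformations named in the optimization table (constant propagation, common subexpression elimination, and strength reduction) can be shown by hand on a small function. The sketch below is illustrative only: the function names and the expression are made up, and a real compiler applies these steps to its intermediate representation, not to Python source.

```python
def unoptimized(a, b, x):
    n = 4                  # a variable that is assigned a constant
    t1 = (a + b) * n       # a + b is computed here ...
    t2 = (a + b) * x       # ... and computed again here
    return t1 + t2


def optimized(a, b, x):
    s = a + b              # common subexpression elimination: compute a + b once
    t1 = s << 2            # constant propagation (n becomes 4), then
                           # strength reduction (multiply by 4 becomes a shift)
    t2 = s * x
    return t1 + t2


# The transformed function must return the same value for every input.
for args in [(1, 2, 3), (-5, 7, 0), (10, 20, -3), (-4, 1, 9)]:
    assert unoptimized(*args) == optimized(*args)

print(unoptimized(1, 2, 3), optimized(1, 2, 3))
```

The equivalence check is the point of the exercise: an optimization is only valid if it preserves the result while reducing the work.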

Effect of Compiler Optimization

Measurements taken on S

[Figure: results per program and compiler optimization level]

- Level 0: non-optimized code
- Level 1: local optimization
- Level 2: global optimization, s/w pipelining
- Level 3: adds procedure integration

Compiler Support for Multimedia Instructions

- Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics)
- Intel added a new set of instructions called Streaming SIMD Extension
- A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer
- Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations
  - Strided addressing allows memory access in increments larger than one
  - Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data
- Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup
- Such limited support for vector processing makes vectorizing compiler optimizations unpopular and restricts their scope to hand-coded routines

SIMD instructions on MMX and AltiVec tend to be solutions, not primitives.
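The difference between strided and gather addressing can be sketched with a flat list standing in for memory. The names memory, load_strided, and load_gather are hypothetical and model only the address pattern, not any real vector instruction.

```python
# Hypothetical flat memory: memory[k] holds the value 100 + k.
memory = list(range(100, 132))


def load_strided(base, stride, count):
    """Load count elements starting at base, stepping stride locations each time."""
    return [memory[base + i * stride] for i in range(count)]


def load_gather(addresses):
    """Load one element per stored address (the addresses are kept, not the data)."""
    return [memory[address] for address in addresses]


column = load_strided(base=0, stride=4, count=4)   # e.g. one column of a 4-wide row-major matrix
scattered = load_gather([3, 17, 5, 30])            # arbitrary, non-uniform locations

print(column)
print(scattered)
```

Without strided access, fetching a matrix column would take one separate load per element, which is the speedup limit the slide attributes to MMX.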

Starting a Program

[Figure: C program -> Compiler -> Assembly language program -> Assembler -> Object: machine language module (together with Object: library routine, machine language) -> Linker -> Executable: machine language program -> Loader -> Memory]

Linker

- Places code and data modules symbolically in memory
- Determines the addresses of data and instruction labels
- Patches both internal and external references

Object files for Unix typically contain:

- Header: size and position of components
- Text segment: machine code
- Data segment: static and dynamic variables
- Relocation info: identifies absolute memory references
- Symbol table: names and locations of labels, procedures, and variables
- Debugging info: mapping of source to object code, breakpoints, etc.
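The three linker tasks (place the modules, determine label addresses, patch references) can be sketched as a two-pass procedure. The object-module layout below is invented for illustration and is far simpler than a real Unix object file: "code" is a list of words, "symbols" maps a label to its offset inside the module, and "relocs" maps a word index to the label whose address belongs there.

```python
# Hypothetical object modules (not a real object file format).
main_obj = {"code": [10, 0, 11], "symbols": {"main": 0}, "relocs": {1: "helper"}}
lib_obj = {"code": [20, 21], "symbols": {"helper": 1}, "relocs": {}}


def link(modules, base=0):
    """Place modules back to back, resolve labels, and patch references."""
    # Pass 1: assign a start address to each module and record absolute label addresses.
    symbols, starts, address = {}, [], base
    for module in modules:
        starts.append(address)
        for name, offset in module["symbols"].items():
            symbols[name] = address + offset
        address += len(module["code"])

    # Pass 2: copy each module's code and patch every relocation entry.
    image = []
    for module in modules:
        code = list(module["code"])
        for index, name in module["relocs"].items():
            code[index] = symbols[name]   # works for internal and external labels alike
        image.extend(code)
    return image, symbols


image, symbols = link([main_obj, lib_obj])
print(image)
print(symbols)
```

The relocation info and symbol table listed for Unix object files are exactly the two pieces of data this procedure consumes.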

Loading Executable Program

[Figure: memory layout. From the top: the stack ($sp at 7fff ffff hex) grows down toward the dynamic data; static data starts at 1000 0000 hex ($gp at 1000 8000 hex); text starts at 0040 0000 hex (pc); the region from address 0 up to the text segment is reserved]

To load an executable, the operating system follows these steps:

- Reads the executable file header to determine the size of the text and data segments
- Creates an address space large enough for the text and data
- Copies the instructions and data from the executable file into memory
- Copies the parameters (if any) to the main program onto the stack
- Initializes the machine registers and sets the stack pointer to the first free location
- Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program

Instruction Set Design Issues

- Number of Addresses
- Flow of Control
- Operand Types and Addressing Modes
- Instruction Types
- Instruction Formats

Number of Addresses

Four categories:

- 3-address machines
  - 2 for the source operands and one for the result
- 2-address machines
  - One address doubles as source and result
- 1-address machines
  - Accumulator machines
  - Accumulator is used for one source and result
- 0-address machines
  - Stack machines
  - Operands are taken from the stack
  - Result goes onto the stack

Number of Addresses (cont.)

Three-address machines

- Two for the source operands, one for the result
- RISC processors use three addresses
- Sample instructions

    add  dest,src1,src2    ; M(dest) = [src1] + [src2]
    sub  dest,src1,src2    ; M(dest) = [src1] - [src2]
    mult dest,src1,src2    ; M(dest) = [src1] * [src2]

Three addresses: Operand 1, Operand 2, Result

Example: a = b + c

Three-address instruction formats are not common because they require a relatively long instruction format to hold the three address references.

Number of Addresses (cont.)

Example

C statement:

    A = B + C * D - E + F + A

Equivalent code:

    mult T,C,D    ; T = C*D
    add  T,T,B    ; T = B + C*D
    sub  T,T,E    ; T = B + C*D - E
    add  T,T,F    ; T = B + C*D - E + F
    add  A,T,A    ; A = B + C*D - E + F + A

Number of Addresses (cont.)

Two-address machines

- One address doubles (for source operand and result)
- Last example makes a case for it
  - Address T is used twice
- Sample instructions

    load dest,src    ; M(dest) = [src]
    add  dest,src    ; M(dest) = [dest] + [src]
    sub  dest,src    ; M(dest) = [dest] - [src]
    mult dest,src    ; M(dest) = [dest] * [src]

Two addresses: one address doubles as operand and result

Example: a = a + b

The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation.

Number of Addresses (cont.)

Example

C statement:

    A = B + C * D - E + F + A

Equivalent code:

    load T,C    ; T = C
    mult T,D    ; T = C*D
    add  T,B    ; T = B + C*D
    sub  T,E    ; T = B + C*D - E
    add  T,F    ; T = B + C*D - E + F
    add  A,T    ; A = B + C*D - E + F + A

Number of Addresses (cont.)

One-address machines

- Use a special set of registers called accumulators
  - Specify one source operand and receive the result
- Called accumulator machines
- Sample instructions

    load  addr    ; accum = [addr]
    store addr    ; M[addr] = accum
    add   addr    ; accum = accum + [addr]
    sub   addr    ; accum = accum - [addr]
    mult  addr    ; accum = accum * [addr]

Number of Addresses (cont.)

Example

C statement:

    A = B + C * D - E + F + A

Equivalent code:

    load  C    ; load C into accum
    mult  D    ; accum = C*D
    add   B    ; accum = C*D + B
    sub   E    ; accum = B + C*D - E
    add   F    ; accum = B + C*D - E + F
    add   A    ; accum = B + C*D - E + F + A
    store A    ; store accum contents in A
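The accumulator sequence for A = B + C * D - E + F + A can be checked with a very small interpreter. This is a sketch under stated assumptions: the function run_accumulator and the dictionary used as memory are made up for illustration, and the operand values are arbitrary.

```python
def run_accumulator(program, memory):
    """Execute one-address instructions; the accumulator is the implicit operand."""
    accum = 0
    for opcode, address in program:
        if opcode == "load":
            accum = memory[address]
        elif opcode == "store":
            memory[address] = accum
        elif opcode == "add":
            accum = accum + memory[address]
        elif opcode == "sub":
            accum = accum - memory[address]
        elif opcode == "mult":
            accum = accum * memory[address]
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return memory


memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
program = [
    ("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
    ("add", "F"), ("add", "A"), ("store", "A"),
]
run_accumulator(program, memory)
print(memory["A"])
```

Every instruction except the first and last touches memory once and the accumulator once, which illustrates why accumulator designs have short instructions but high memory traffic.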

Number of Addresses (cont.)

Zero-address machines

- Stack supplies operands and receives the result
  - Special instructions to load and store use an address
- Called stack machines (Ex: HP3000, Burroughs B5500)
- Sample instructions

    push addr    ; push([addr])
    pop  addr    ; pop([addr])
    add          ; push(pop + pop)
    sub          ; push(pop - pop)
    mult         ; push(pop * pop)

Number of Addresses (cont.)

Example

C statement:

    A = B + C * D - E + F + A

Equivalent code:

    push E
    push C
    push D
    mult
    push B
    add
    sub
    push F
    add
    push A
    add
    pop  A
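The zero-address sequence can be traced with a list used as the stack. The sketch assumes that in push(pop - pop) the first pop (the top of the stack) is the left operand, which is what makes the sequence compute B + C*D - E; the function name run_stack and the operand values are illustrative.

```python
def run_stack(program, memory):
    """Execute zero-address instructions; only push and pop carry an address."""
    stack = []
    for instruction in program:
        opcode = instruction[0]
        if opcode == "push":
            stack.append(memory[instruction[1]])
        elif opcode == "pop":
            memory[instruction[1]] = stack.pop()
        elif opcode in ("add", "sub", "mult"):
            first = stack.pop()     # top of stack
            second = stack.pop()    # element below it
            if opcode == "add":
                stack.append(first + second)
            elif opcode == "sub":
                stack.append(first - second)
            else:
                stack.append(first * second)
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return memory


memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
program = [
    ("push", "E"), ("push", "C"), ("push", "D"), ("mult",), ("push", "B"), ("add",),
    ("sub",), ("push", "F"), ("add",), ("push", "A"), ("add",), ("pop", "A"),
]
run_stack(program, memory)
print(memory["A"])
```

Pushing E first is deliberate: it leaves E underneath B + C*D so that the later sub finds its operands in the right order without a swap instruction.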

Load/Store Architecture

- Instructions expect operands in internal processor registers
- Special LOAD and STORE instructions move data between registers and memory
- RISC uses this architecture
- Reduces instruction length

Load/Store Architecture (cont.)

Sample instructions

    load  Rd,addr       ; Rd = [addr]
    store addr,Rs       ; (addr) = Rs
    add   Rd,Rs1,Rs2    ; Rd = Rs1 + Rs2
    sub   Rd,Rs1,Rs2    ; Rd = Rs1 - Rs2
    mult  Rd,Rs1,Rs2    ; Rd = Rs1 * Rs2

Number of Addresses (cont.)

Example

C statement:

    A = B + C * D - E + F + A

Equivalent code:

    load  R1,B
    load  R2,C
    load  R3,D
    load  R4,E
    load  R5,F
    load  R6,A
    mult  R2,R2,R3
    add   R2,R2,R1
    sub   R2,R2,R4
    add   R2,R2,R5
    add   R2,R2,R6
    store A,R2

Flow of Control

- Default is sequential flow
- Several instructions alter this default execution
  - Branches
    - Unconditional
    - Conditional
    - Delayed branches
  - Procedure calls
    - Delayed procedure calls

Flow of Control (cont.)

Branches

- Unconditional
  - Absolute address
  - PC-relative
    - Target address is specified relative to PC contents
    - Relocatable code
- Example: MIPS
  - Absolute address

        j target

  - PC-relative

        b target

Flow of Control (cont.)

Branches

- Conditional
  - Jump is taken only if the condition is met
- Two types
  - Set-Then-Jump
    - Condition testing is separated from branching
    - Condition code registers are used to convey the condition test result
    - Condition code registers keep a record of the status of the last ALU operation, such as an overflow condition
  - Example: Pentium code

        cmp AX,BX     ; compare AX and BX
        je  target    ; jump if equal

Flow of Control (cont.)

  - Test-and-Jump
    - Single instruction performs condition testing and branching
  - Example: MIPS instruction

        beq Rsrc1,Rsrc2,target    ; jumps to target if Rsrc1 = Rsrc2

- Delayed branching
  - Control is transferred after executing the instruction that follows the branch instruction
    - This instruction slot is called the delay slot
  - Improves efficiency
  - Highly pipelined RISC processors support it

Flow of Control (cont.)

Procedure calls

- Facilitate modular programming
- Require two pieces of information to return
  - End of procedure
    - Pentium uses the ret instruction
    - MIPS uses the jr instruction
  - Return address
    - In a (special) register: MIPS allows any general-purpose register
    - On the stack: Pentium

Flow of Control (cont.)

[Figure: delay slot]

Parameter Passing

Two basic techniques

- Register-based (e.g., PowerPC, MIPS)
  - Internal registers are used
    - Faster
    - Limits the number of parameters
    - Recursive procedures
- Stack-based (e.g., Pentium)
  - Stack is used
    - More general

Operand Types

- Instructions support basic data types
  - Characters
  - Integers
  - Floating-point
- Instruction overload
  - Same instruction for different data types
  - Example: Pentium

        mov AL,address     ; loads an 8-bit value
        mov AX,address     ; loads a 16-bit value
        mov EAX,address    ; loads a 32-bit value

Operand Types (cont.)

- Separate instructions
  - Instructions specify the operand size
  - Example: MIPS

        lb Rdest,address    ; loads a byte
        lh Rdest,address    ; loads a halfword (16 bits)
        lw Rdest,address    ; loads a word (32 bits)
        ld Rdest,address    ; loads a doubleword (64 bits)

  - Similar instructions exist for store
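The effect of the operand size named by each load instruction (byte, halfword, word, doubleword) can be sketched by reading 1, 2, 4, or 8 bytes from the same address. The byte values are arbitrary, the load helper is hypothetical, and big-endian byte order with sign extension is an assumption of this sketch; a real MIPS system may run in either byte order.

```python
# Hypothetical 8-byte memory region starting at address 0.
memory = bytes([0x80, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07])


def load(address, size_in_bytes):
    """Read size_in_bytes bytes at address and sign-extend them to an integer."""
    chunk = memory[address:address + size_in_bytes]
    return int.from_bytes(chunk, byteorder="big", signed=True)


lb = load(0, 1)   # byte
lh = load(0, 2)   # halfword (16 bits)
lw = load(0, 4)   # word (32 bits)
ld = load(0, 8)   # doubleword (64 bits)

print(lb, lh, lw, ld)
```

The same address yields four different values, which is why the operand size must be carried either by the opcode (MIPS) or by the register name (Pentium).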

Addressing Modes

- How the operands are specified
  - Operands can be in three places
    - Registers
      - Register addressing mode
    - Part of instruction
      - Constant
      - Immediate addressing mode
      - All processors support these two addressing modes
    - Memory
      - Difference between RISC and CISC
      - CISC supports a large variety of addressing modes
      - RISC follows load/store architecture

Instruction Types

- Several types of instructions
  - Data movement
    - Pentium: mov dest,src
    - Some do not provide direct data movement instructions
    - Indirect data movement

          add Rdest,Rsrc,0    ; Rdest = Rsrc + 0

  - Arithmetic and Logical
    - Arithmetic
      - Integer and floating-point, signed and unsigned
      - add, subtract, multiply, divide
    - Logical
      - and, or, not, xor

Instruction Types (cont.)

- Condition code bits
  - S: Sign bit (0 = +, 1 = -)
  - Z: Zero bit (0 = nonzero, 1 = zero)
  - O: Overflow bit (0 = no overflow, 1 = overflow)
  - C: Carry bit (0 = no carry, 1 = carry)
- Example: Pentium

      cmp count,25    ; compare count to 25 (subtract 25 from count)
      je  target      ; jump if equal
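How a compare instruction sets the four condition code bits can be sketched by performing the subtraction at a fixed width. The function compare is hypothetical; the 16-bit default width and the convention that C records a borrow on subtraction (as on x86) are assumptions of this sketch.

```python
def compare(a, b, bits=16):
    """Compute a - b at the given width and return the S, Z, O, C condition code bits."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "S": int(bool(result & sign)),                    # result is negative
        "Z": int(result == 0),                            # result is zero
        "O": int(bool((a ^ b) & (a ^ result) & sign)),    # signed overflow of a - b
        "C": int(a < b),                                  # borrow out of the unsigned subtraction
    }


def jump_if_equal(flags):
    """je is taken exactly when the zero bit is set."""
    return flags["Z"] == 1


equal = compare(25, 25)
smaller = compare(10, 25)
overflowed = compare(0x8000, 1)    # most negative 16-bit value minus 1

print(equal, jump_if_equal(equal))
print(smaller, jump_if_equal(smaller))
print(overflowed)
```

This is the Set-Then-Jump pattern: the compare writes the bits, and a later, separate instruction reads them to decide whether to branch.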

Instruction Types (cont.)

- Flow control and I/O instructions
  - Branch
  - Procedure call
  - Interrupts
- I/O instructions
  - Memory-mapped I/O
    - Most processors support memory-mapped I/O
    - No separate instructions for I/O
  - Isolated I/O
    - Pentium supports isolated I/O
    - Separate I/O instructions

          in  AX,io_port     ; read from an I/O port
          out io_port,AX     ; write to an I/O port

Instruction Formats

- Two types
  - Fixed-length
    - Used by RISC processors
    - 32-bit RISC processors use 32-bit-wide instructions
    - Examples: SPARC, MIPS, PowerPC
  - Variable-length
    - Used by CISC processors
    - Memory operands need more bits to specify
- Opcode
  - Major and exact operation

Examples of Instruction Formats

[Figure: examples of instruction formats]

RISC (Reduced Instruction Set Computer) versus CISC (Complex Instruction Set Computer)

RISC vs CISC

- The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute
- RISC systems access memory only with explicit load and store instructions
- In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable

RISC vs CISC

- The difference between CISC and RISC becomes evident through the basic computer performance equation:

      time/program = time/cycle × cycles/instruction × instructions/program

- RISC systems shorten execution time by reducing the clock cycles per instruction
- CISC systems improve performance by reducing the number of instructions per program
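The basic computer performance equation states that the time per program is the product of instructions per program, cycles per instruction, and time per cycle. The sketch below plugs in made-up numbers (the instruction counts, cycle counts, and the 2 ns cycle time are assumptions chosen only to show the trade-off, not measurements).

```python
def execution_time_ns(instructions, cycles_per_instruction, ns_per_cycle):
    """time/program = instructions/program * cycles/instruction * time/cycle."""
    return instructions * cycles_per_instruction * ns_per_cycle


# Hypothetical CISC-style program: fewer instructions, more cycles for each one.
cisc_time = execution_time_ns(instructions=1000, cycles_per_instruction=4, ns_per_cycle=2)

# Hypothetical RISC-style program: more instructions, one cycle for each one.
risc_time = execution_time_ns(instructions=1500, cycles_per_instruction=1, ns_per_cycle=2)

print(cisc_time, risc_time)
```

Either factor can be attacked: the CISC row lowers the instruction count and the RISC row lowers the cycles per instruction, and only the product decides which program finishes first.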

RISC vs CISC

- The simple instruction set of RISC machines enables control units to be hardwired for maximum speed
- The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time
- With fixed-length instructions, RISC lends itself to pipelining and speculative execution

RISC vs CISC

Consider the program fragments:

    CISC                  RISC

    mov ax, 10                   mov ax, 0
    mov bx, 5                    mov bx, 10
    mul bx, ax                   mov cx, 5
                          Begin: add ax, bx
                                 loop Begin

- The total clock cycles for the CISC version might be:

      (2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles

- While the clock cycles for the RISC version are:

      (3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles

- With the RISC clock cycle being shorter, RISC gives us much faster execution speeds
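The cycle counts for the two fragments (multiplying 10 by 5 with a single mul versus a loop of adds) can be reproduced by counting one cycle per simple instruction. The per-instruction costs (1 cycle for mov, add, and loop; 30 cycles for mul) are the illustrative figures quoted for this example, not measurements, and the function names are hypothetical.

```python
def cisc_multiply(a, b, mul_cycles=30):
    """Two mov instructions (1 cycle each) followed by one multi-cycle mul."""
    cycles = 2 * 1 + mul_cycles
    return a * b, cycles


def risc_multiply(a, b):
    """Multiply by repeated addition, mirroring the mov/add/loop fragment."""
    cycles = 3 * 1                # mov ax, 0 / mov bx, a / mov cx, b
    ax, bx, cx = 0, a, b
    while cx > 0:
        ax += bx                  # add ax, bx: 1 cycle
        cx -= 1                   # loop Begin: decrement cx and branch, 1 cycle
        cycles += 2
    return ax, cycles


print(cisc_multiply(10, 5))
print(risc_multiply(10, 5))
```

Note that the loop count depends on the multiplier: with a larger second operand the repeated-addition version needs more cycles, so the comparison holds only for small values like the 5 used here.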

RISC vs CISC

- Because of their load-store ISAs, RISC architectures require a large number of CPU registers
- These registers provide fast access to data during sequential program execution
- They can also be employed to reduce the overhead typically caused by passing parameters to subprograms
- Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers

RISC vs CISC

- This is how registers can be overlapped in a RISC system
- The current window pointer (CWP) points to the active register window

[Figure: overlapping register windows]

RISC vs CISC

- It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures
- Some RISC systems provide more extravagant instruction sets than some CISC systems
- Some systems combine both approaches
- The following two slides summarize the characteristics that traditionally typify the differences between these two architectures

RISC vs CISC

RISC

- Multiple register sets
- Three operands per instruction
- Parameter passing through register windows
- Single-cycle instructions
- Hardwired control
- Highly pipelined

CISC

- Single register set
- One or two register operands per instruction
- Parameter passing through memory
- Multiple-cycle instructions
- Microprogrammed control
- Less pipelined

Continued...

RISC vs CISC

RISC

- Simple instructions, few in number
- Fixed-length instructions
- Complexity in compiler
- Only LOAD/STORE instructions access memory
- Few addressing modes

CISC

- Many complex instructions
- Variable-length instructions
- Complexity in microcode
- Many instructions can access memory
- Many addressing modes

Summary

Instruction Set Design Issues

Instruction set design issues include:

- Where are operands stored?
  - registers, memory, stack, accumulator
- How many explicit operands are there?
  - 0, 1, 2, or 3
- How is the operand location specified?
  - register, immediate, indirect, ...
- What type and size of operands are supported?
  - byte, int, float, double, string, vector, ...
- What operations are supported?
  - add, sub, mul, move, compare, ...

More About General Purpose Registers

Why do almost all new architectures use GPRs?

- Registers are much faster than memory (even cache)
  - Register values are available immediately
  - When memory isn't ready, the processor must wait ("stall")
- Registers are convenient for variable storage
  - Compiler assigns some variables just to registers
  - More compact code, since small fields specify registers (compared to memory addresses)

[Figure: storage hierarchy: Processor (Registers), Cache, Memory, Disk]

What Operations are Needed

- Arithmetic and Logical
  - Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
  - Logical operation: AND, OR, XOR, NOT
- Data Transfer: copy, load, store
- Control: branch, jump, call, return
- Floating Point: ADDF, MULF, DIVF
  - Same as arithmetic but usually take bigger operands
- Decimal: ADDD, CONVERT
- String: move, compare, search
- Graphics: pixel and vertex, compression/decompression operations

Stack Architecture: Pros and Cons

Pros

- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures

Cons

- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures

Accumulator Architecture: Pros and Cons

Pros

- Very low hardware requirements
- Easy to design and understand

Cons

- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic

Memory-Memory Architecture: Pros and Cons

Pros

- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)

Cons

- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required

Memory-Register Architecture: Pros and Cons

Pros

- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density

Cons

- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers

Page 41: Chapter 2: Advanced computer Architecture

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 41101

ncoding an Instruction Set Instruction encoding affects the si-e of the compiled program and the

compleity of the CP4 implementation

The operation is typically specified in one field called opcode The addressing mode for the operand can be encoded ith the operation

or specified through a separate identifier in case of large number ofsupported modes

The architecture must balance beteen several competing factors6

esire to support as many registers and addressing modes as possible

ltffect of operand specification on the si-e of the instruction (program)

esire to simplify instruction fetching and decoding during eecution

Lied si-e instruction encoding simplify the CP4 design hile limiting theaddressing modes supported

An architect caring about the code si-e can use variable si-e encoding

A hybrid approach is to allo variability by supporting multiple$si-edinstruction

ncoding 7amples

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 42101

ncoding 7amples

MIPS Instruction format

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 43101

MIPS Instruction format Register3format instructions

op6 8asic operation of the instruction traditionally called opcoders6 The first register source operandrt 6 The second register source operandrd 6 The register destination operand it gets the result of the operations$mat 6 Shift amountfunct 6 This field selects the specific variant of the operation of the op field

Immediate3type instructions

Some instructions need longer fields than provided for large value constant

The $bit address means a load ord instruction can load a ord ithin a

region of plusmn

bytes of the address in the base register ltample6 l Rtgt (Rs) G Temporary register Rtgt gets A=+Instruction 6ormat op rs rt rd shamt funct address

add 1 gt reg reg reg gt 72A

sub 1 gt reg reg reg gt amp 72A

l I reg reg 72A 72A 72A address

s I amp reg reg 72A 72A 72A address

o p r s f u n c ts h a m tr dr t b i t s b i t s b i t s b i t s b i t s b i t s

o p r s a d d r e s sr t b i t s b i t s b i t s b i t s

he Stored Program Concepthe Stored Pro

gram Concept

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 44101

he Stored Program Concepthe Stored Program Concept 3earning ho instructions are represented leads to discovering

the secret of computing6 the stored$program concept

TodayQs computers are build on to 0ey principles 6 Instructions are represented as numbers

Programs can be stored in memory to beread or ritten Oust li0e numbers

he power of the concept

memory can contain6

the source code for an editor

the compiled m2c code for the editor

the tet that the compiled program is using

the compiler that generated the code

P r o c e s s o r

A c c o u n t i n g p r o g r a m( m a c h i n e c o d e )

lt d i t o r p r o g r a m( m a c h i n e c o d e )

C c o m p i l e r ( m a c h i n e c o d e )

P a y r o l l d a t a

8 o o 0 t e t

S o u r c e c o d e i n Cf o r e d i t o r p r o g r a m

M e m o r y

Compiling if-then-else in MIPS

Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiled code for the following C if statement?

if (i == j) f = g + h; else f = g - h;

[Figure: flowchart. The test i == j leads to f = g + h when i = j, and to Else: f = g - h when i ≠ j; both paths meet at Exit:]

MIPS:

        bne $s3, $s4, Else    # go to Else if i ≠ j
        add $s0, $s1, $s2     # f = g + h (skipped if i ≠ j)
        j   Exit              # go to Exit
Else:   sub $s0, $s1, $s2     # f = g - h (skipped if i = j)
Exit:

Typical Compilation

[Figure: typical compilation]

Major Types of Optimization

Optimization name: Explanation (Frequency)
N.M. = not measured.

High-level: at or near source level; machine independent
- Procedure integration: Replace procedure call by procedure body (N.M.)

Local: within straight-line code
- Common subexpression elimination: Replace two instances of the same computation by a single copy
- Constant propagation: Replace all instances of a variable that is assigned a constant with the constant
- Stack height reduction: Rearrange expression tree to minimize resources needed for expression evaluation (N.M.)

Global: across a branch
- Global common subexpression elimination: Same as local, but this version crosses branches
- Copy propagation: Replace all instances of a variable A that has been assigned X (i.e., A = X) with X
- Code motion: Remove code from a loop that computes the same value on each iteration of the loop
- Induction variable elimination: Simplify/eliminate array-addressing calculations within loops

Machine-dependent: depends on machine knowledge
- Strength reduction: Many examples, such as replacing a multiply by a constant with adds and shifts (N.M.)
- Pipeline scheduling: Reorder instructions to improve pipeline performance (N.M.)
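Strength reduction, as listed among the machine-dependent optimizations, can be illustrated with a short sketch. The function names below are hypothetical and the code only mimics by hand what a compiler would do automatically: a multiply by the constant 10 becomes two shifts and an add, and a per-iteration multiply inside a loop becomes a running addition.

```python
def times_ten(x):
    """Multiply by the constant 10 with shifts and adds: 10x = 8x + 2x."""
    return (x << 3) + (x << 1)


def scaled_indices_naive(n, step):
    """Compute i * step with one multiply per iteration."""
    return [i * step for i in range(n)]


def scaled_indices_reduced(n, step):
    """Same values, but the multiply is replaced by a running addition."""
    result = []
    offset = 0
    for _ in range(n):
        result.append(offset)
        offset += step  # cheaper add replaces the multiply
    return result


print(times_ten(7))
print(scaled_indices_reduced(5, 4))
```

The loop version is the pattern that arises in array addressing, where the offset of element i is i times the element size.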

Effect of Compiler Optimization

[Figure: measurements by program and compiler optimization level]

Level 0: non-optimized code
Level 1: local optimization
Level 2: global optimization, s/w pipelining
Level 3: adds procedure integration

Compiler Support for Multimedia Instr.

- Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics)
- Intel added a new set of instructions called Streaming SIMD Extension
- A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer
- Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations
  - Strided addressing allows memory access in increments larger than one
  - Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data
- Supporting vector operations without strided addressing, as Intel's MMX does, limits the potential speedup
- Such limited support for vector processing makes vectorizing compiler optimization unpopular and restricts its scope to hand-coded routines

SIMD instructions on MMX and AltiVec tend to be solutions, not primitives
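The difference between strided and gather/scatter addressing can be shown with a small software model. This is only an illustrative sketch: memory is modeled as a Python list of words, and the function names strided_load, gather, and scatter are hypothetical, not instructions of any real vector ISA.

```python
def strided_load(memory, base, stride, count):
    """Read `count` elements starting at `base`, stepping by `stride` words."""
    return [memory[base + i * stride] for i in range(count)]


def gather(memory, addresses):
    """Read the elements whose addresses are listed in an index vector."""
    return [memory[a] for a in addresses]


def scatter(memory, addresses, values):
    """Write each value to the address given by the matching index entry."""
    for a, v in zip(addresses, values):
        memory[a] = v


memory = list(range(100, 116))        # 16 words; word k holds 100 + k

# One column of a 4 x 4 row-major matrix: stride equals the row length.
column = strided_load(memory, base=1, stride=4, count=4)

# Arbitrary locations: the index vector stores addresses, not data.
picked = gather(memory, [7, 0, 12])
scatter(memory, [7, 0, 12], [-1, -2, -3])

print(column, picked)
```

Without a strided load, the column access above would need one scalar load per element, which is the limitation the slide attributes to MMX.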

Starting a Program

[Figure: translation hierarchy. C program -> Compiler -> Assembly language program -> Assembler -> Object: machine language module (together with Object: library routine, machine language) -> Linker -> Executable: machine language program -> Loader -> Memory]

Linker
- Places code and data modules symbolically in memory
- Determines the addresses of data and instruction labels
- Patches both internal and external references

Object files for Unix typically contain:
- Header: size and position of components
- Text segment: machine code
- Data segment: static and dynamic variables
- Relocation info: identifies absolute memory references
- Symbol table: names and locations of labels, procedures, and variables
- Debugging info: mapping of source to object code, breakpoints, etc.
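The three linker steps can be sketched on a toy object format. Everything here is an assumption made for illustration: a module is a dictionary with a list of words ("code"), a label table ("labels", name to offset), and a relocation list ("refs", offset to symbol name). Real object formats are far richer.

```python
def link(modules):
    """Toy linker: place modules, resolve labels, patch references."""
    symbol_table = {}
    bases = []
    base = 0
    for module in modules:
        bases.append(base)                       # step 1: place the module
        for name, offset in module["labels"].items():
            symbol_table[name] = base + offset   # step 2: fix label addresses
        base += len(module["code"])

    image = []
    for module, module_base in zip(modules, bases):
        code = list(module["code"])
        for offset, name in module["refs"]:
            code[offset] = symbol_table[name]    # step 3: patch references
        image.extend(code)
    return image, symbol_table


main_obj = {"code": ["call", None, "halt"],
            "labels": {"main": 0},
            "refs": [(1, "square")]}             # external reference
lib_obj = {"code": ["mul", "ret"],
           "labels": {"square": 0},
           "refs": []}

image, symbols = link([main_obj, lib_obj])
print(image, symbols)
```

The unresolved slot (None) in the first module plays the role of a relocation entry: the linker fills it once the final address of square is known.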

Loading Executable Program

[Figure: memory allocation, from the top of the address space down: Stack ($sp at 7fff ffff hex), Dynamic data, Static data (starting at 1000 0000 hex, with $gp at 1000 8000 hex), Text (pc at 0040 0000 hex), Reserved (from address 0)]

To load an executable, the operating system follows these steps:
1. Reads the executable file header to determine the size of the text and data segments
2. Creates an address space large enough for the text and data
3. Copies the instructions and data from the executable file into memory
4. Copies the parameters (if any) to the main program onto the stack
5. Initializes the machine registers and sets the stack pointer to the first free location
6. Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program

Instruction Set Design Issues

Instruction set design issues
- Number of addresses
- Flow of control
- Operand types
- Addressing modes
- Instruction types
- Instruction formats

Number of Addresses

Four categories
- 3-address machines
  - 2 for the source operands and one for the result
- 2-address machines
  - One address doubles as source and result
- 1-address machines
  - Accumulator machines
  - Accumulator is used for one source and result
- 0-address machines
  - Stack machines
  - Operands are taken from the stack
  - Result goes onto the stack

Number of Addresses (cont.)

Three-address machines
- Two for the source operands, one for the result
- RISC processors use three addresses
- Sample instructions

    add  dest,src1,src2    ; M(dest) = [src1] + [src2]
    sub  dest,src1,src2    ; M(dest) = [src1] - [src2]
    mult dest,src1,src2    ; M(dest) = [src1] * [src2]

Three addresses: Operand 1, Operand 2, Result
Example: a = b + c

Three-address instruction formats are not common because they require a relatively long instruction format to hold the three address references.

Number of Addresses (cont.)

Example
- C statement

    A = B + C * D - E + F + A

- Equivalent code:

    mult T,C,D    ; T = C*D
    add  T,T,B    ; T = B+C*D
    sub  T,T,E    ; T = B+C*D-E
    add  T,T,F    ; T = B+C*D-E+F
    add  A,T,A    ; A = B+C*D-E+F+A

Number of Addresses (cont.)

Two-address machines
- One address doubles (for source operand and result)
- Last example makes a case for it
  - Address T is used twice
- Sample instructions

    load dest,src    ; M(dest) = [src]
    add  dest,src    ; M(dest) = [dest] + [src]
    sub  dest,src    ; M(dest) = [dest] - [src]
    mult dest,src    ; M(dest) = [dest] * [src]

Two addresses: one address doubles as operand and result
Example: a = a + b

The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation.

Number of Addresses (cont.)

Example
- C statement

    A = B + C * D - E + F + A

- Equivalent code:

    load T,C    ; T = C
    mult T,D    ; T = C*D
    add  T,B    ; T = B+C*D
    sub  T,E    ; T = B+C*D-E
    add  T,F    ; T = B+C*D-E+F
    add  A,T    ; A = B+C*D-E+F+A

Number of Addresses (cont.)

One-address machines
- Use a special set of registers called accumulators
  - Specify one source operand and receive the result
- Called accumulator machines
- Sample instructions

    load  addr    ; accum = [addr]
    store addr    ; M[addr] = accum
    add   addr    ; accum = accum + [addr]
    sub   addr    ; accum = accum - [addr]
    mult  addr    ; accum = accum * [addr]

Number of Addresses (cont.)

Example
- C statement

    A = B + C * D - E + F + A

- Equivalent code:

    load  C    ; load C into accum
    mult  D    ; accum = C*D
    add   B    ; accum = C*D+B
    sub   E    ; accum = B+C*D-E
    add   F    ; accum = B+C*D-E+F
    add   A    ; accum = B+C*D-E+F+A
    store A    ; store accum contents in A
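The one-address code above can be traced with a tiny interpreter. This is a sketch under stated assumptions: memory is modeled as a dictionary from variable names to integers, the function name run_accumulator is hypothetical, and the initial values of A through F are made up for the demonstration.

```python
def run_accumulator(program, memory):
    """Execute one-address instructions against a single accumulator."""
    accum = 0
    for opcode, addr in program:
        if opcode == "load":
            accum = memory[addr]
        elif opcode == "store":
            memory[addr] = accum
        elif opcode == "add":
            accum = accum + memory[addr]
        elif opcode == "sub":
            accum = accum - memory[addr]
        elif opcode == "mult":
            accum = accum * memory[addr]
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return memory


# Hypothetical initial values, chosen only for the demonstration.
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}

# A = B + C * D - E + F + A
program = [("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
           ("add", "F"), ("add", "A"), ("store", "A")]

run_accumulator(program, memory)
print(memory["A"])
```

Every instruction names exactly one memory address; the accumulator is the implicit second operand and the implicit destination.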

Number of Addresses (cont.)

Zero-address machines
- Stack supplies operands and receives the result
  - Special instructions to load and store use an address
- Called stack machines (Ex: HP 3000, Burroughs B5500)
- Sample instructions

    push addr    ; push([addr])
    pop  addr    ; pop([addr])
    add          ; push(pop + pop)
    sub          ; push(pop - pop)
    mult         ; push(pop * pop)

Number of Addresses (cont.)

Example
- C statement

    A = B + C * D - E + F + A

- Equivalent code:

    push E
    push C
    push D
    mult
    push B
    add
    sub
    push F
    add
    push A
    add
    pop  A
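The zero-address sequence can be checked with a small stack interpreter. The sketch below assumes the semantics given in the sample instructions, where sub is push(pop - pop) and the first pop returns the top of the stack; the function name run_stack and the initial variable values are hypothetical.

```python
def run_stack(program, memory):
    """Execute zero-address instructions; only push and pop name an address."""
    stack = []
    for instruction in program:
        opcode = instruction[0]
        if opcode == "push":
            stack.append(memory[instruction[1]])
        elif opcode == "pop":
            memory[instruction[1]] = stack.pop()
        else:
            first = stack.pop()    # top of stack
            second = stack.pop()   # element below it
            if opcode == "add":
                stack.append(first + second)
            elif opcode == "sub":
                stack.append(first - second)   # push(pop - pop)
            elif opcode == "mult":
                stack.append(first * second)
            else:
                raise ValueError(f"unknown opcode: {opcode}")
    return memory


# Hypothetical initial values, chosen only for the demonstration.
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}

# A = B + C * D - E + F + A
program = [("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
           ("push", "B"), ("add",), ("sub",), ("push", "F"), ("add",),
           ("push", "A"), ("add",), ("pop", "A")]

run_stack(program, memory)
print(memory["A"])
```

E is pushed first so that it sits below B + C*D when sub executes, which is why the code begins with push E.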

Load/Store Architecture

- Instructions expect operands in internal processor registers
- Special LOAD and STORE instructions move data between registers and memory
- RISC uses this architecture
- Reduces instruction length

Load/Store Architecture (cont.)

Sample instructions

    load  Rd,addr       ; Rd = [addr]
    store addr,Rs       ; (addr) = Rs
    add   Rd,Rs1,Rs2    ; Rd = Rs1 + Rs2
    sub   Rd,Rs1,Rs2    ; Rd = Rs1 - Rs2
    mult  Rd,Rs1,Rs2    ; Rd = Rs1 * Rs2

Number of Addresses (cont.)

Example
- C statement

    A = B + C * D - E + F + A

- Equivalent code:

    load  R1,B
    load  R2,C
    load  R3,D
    load  R4,E
    load  R5,F
    load  R6,A
    mult  R2,R2,R3
    add   R2,R2,R1
    sub   R2,R2,R4
    add   R2,R2,R5
    add   R2,R2,R6
    store A,R2

Flow of Control

- Default is sequential flow
- Several instructions alter this default execution
  - Branches
    - Unconditional
    - Conditional
    - Delayed branches
  - Procedure calls
    - Delayed procedure calls

Flow of Control (cont.)

Branches
- Unconditional
  - Absolute address
  - PC-relative
    - Target address is specified relative to PC contents
    - Relocatable code
- Example: MIPS
  - Absolute address

      j target

  - PC-relative

      b target
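How the two kinds of target are formed can be sketched numerically. The code below follows the standard MIPS rules as an assumption (a branch adds a sign-extended 16-bit word offset to PC + 4; a jump replaces the low 28 bits of PC + 4 with its 26-bit field shifted left by two); the function names are hypothetical.

```python
def sign_extend_16(value):
    """Interpret the low 16 bits of `value` as a signed integer."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def branch_target(pc, offset_field):
    """PC-relative: target = (PC + 4) + (signed word offset * 4)."""
    return (pc + 4 + (sign_extend_16(offset_field) << 2)) & 0xFFFFFFFF


def jump_target(pc, target_field):
    """Absolute within a 256 MB region: upper 4 bits of PC + 4, then field * 4."""
    return ((pc + 4) & 0xF0000000) | ((target_field & 0x03FFFFFF) << 2)


forward = branch_target(0x00400000, 3)        # three instructions past PC + 4
backward = branch_target(0x00400010, 0xFFFF)  # offset -1: branch to itself
absolute = jump_target(0x00400000, 0x00100004)

# The same offset field works wherever the code is loaded (relocatable code).
same_distance = (branch_target(0x10000000, 3) - 0x10000000
                 == branch_target(0x00400000, 3) - 0x00400000)

print(hex(forward), hex(backward), hex(absolute), same_distance)
```

Because the branch encodes a distance rather than an address, the instruction bits do not change when the program is moved, which is the sense in which PC-relative code is relocatable.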

Flow of Control (cont.)

Branches
- Conditional
  - Jump is taken only if the condition is met
- Two types
  - Set-Then-Jump
    - Condition testing is separated from branching
    - Condition code registers are used to convey the condition test result
    - Condition code registers keep a record of the status of the last ALU operation, such as the overflow condition
    - Example: Pentium code

        cmp AX,BX     ; compare AX and BX
        je  target    ; jump if equal

Flow of Control (cont.)

  - Test-and-Jump
    - Single instruction performs condition testing and branching
    - Example: MIPS instruction

        beq Rsrc1,Rsrc2,target

      Jumps to target if Rsrc1 = Rsrc2

Delayed branching
- Control is transferred after executing the instruction that follows the branch instruction
  - This instruction slot is called the delay slot
- Improves efficiency
- Highly pipelined RISC processors support it

Flow of Control (cont.)

Procedure calls
- Facilitate modular programming
- Require two pieces of information to return
  - End of procedure
    - Pentium uses the ret instruction
    - MIPS uses the jr instruction
  - Return address
    - In a (special) register
      - MIPS allows any general-purpose register
    - On the stack
      - Pentium

Flow of Control (cont.)

[Figure: delay slot]

Parameter Passing

Two basic techniques
- Register-based (e.g., PowerPC, MIPS)
  - Internal registers are used
    - Faster
    - Limits the number of parameters
    - Awkward for recursive procedures
- Stack-based (e.g., Pentium)
  - Stack is used
    - More general

2 Operand Types

- Instructions support basic data types
  - Characters
  - Integers
  - Floating-point
- Instruction overload
  - Same instruction for different data types
  - Example: Pentium

      mov AL,address     ; loads an 8-bit value
      mov AX,address     ; loads a 16-bit value
      mov EAX,address    ; loads a 32-bit value

Operand Types (cont.)

- Separate instructions
  - Instructions specify the operand size
  - Example: MIPS

      lb Rdest,address    ; loads a byte
      lh Rdest,address    ; loads a halfword (16 bits)
      lw Rdest,address    ; loads a word (32 bits)
      ld Rdest,address    ; loads a doubleword (64 bits)

  - Similar instructions for store
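The effect of the operand size can be seen by reading the same memory bytes with different widths. This is a minimal sketch: the helper name load is hypothetical, memory is a Python bytes object, and big-endian byte order is assumed (MIPS can be configured either way).

```python
def load(memory, address, size, signed=True, byteorder="big"):
    """Read `size` bytes at `address` and interpret them as one integer."""
    return int.from_bytes(memory[address:address + size], byteorder, signed=signed)


memory = bytes([0xFF, 0xFE, 0x00, 0x01, 0x00, 0x00, 0x00, 0x2A])

byte_value = load(memory, 0, 1)                    # like lb: sign-extended byte
unsigned_byte = load(memory, 0, 1, signed=False)   # like lbu: zero-extended byte
halfword = load(memory, 0, 2)                      # like lh: 16 bits
word = load(memory, 0, 4)                          # like lw: 32 bits
second_word = load(memory, 4, 4)                   # the next aligned word

print(byte_value, unsigned_byte, halfword, word, second_word)
```

The same starting address yields four different values, which is why the instruction (or, on the Pentium, the register name) has to state the operand size.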

3 Addressing Modes

- How the operands are specified
- Operands can be in three places
  - Registers
    - Register addressing mode
  - Part of instruction
    - Constant
    - Immediate addressing mode
    - All processors support these two addressing modes
  - Memory
    - Difference between RISC and CISC
    - CISC supports a large variety of addressing modes
    - RISC follows load/store architecture

4 Instruction Types

Several types of instructions
- Data movement
  - Pentium: mov dest,src
  - Some do not provide direct data movement instructions
  - Indirect data movement

      add Rdest,Rsrc,0    ; Rdest = Rsrc+0

- Arithmetic and Logical
  - Arithmetic
    - Integer and floating-point, signed and unsigned
    - add, subtract, multiply, divide
  - Logical
    - and, or, not, xor

Instruction Types (cont.)

- Condition code bits
  - S: Sign bit (0 = +, 1 = -)
  - Z: Zero bit (0 = nonzero, 1 = zero)
  - O: Overflow bit (0 = no overflow, 1 = overflow)
  - C: Carry bit (0 = no carry, 1 = carry)
- Example: Pentium

    cmp count,25    ; compare count to 25
                    ; (subtract 25 from count)
    je  target      ; jump if equal
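How a compare sets the four condition code bits can be sketched in software. The function name compare_flags is hypothetical, a 32-bit operand width is assumed, and C is treated as the borrow out of the subtraction, which is how x86 defines the carry flag for cmp and sub.

```python
def compare_flags(a, b, bits=32):
    """Return the S, Z, O, C bits produced by computing a - b."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "S": int(bool(result & sign)),                  # sign of the result
        "Z": int(result == 0),                          # result is zero
        "O": int(bool((a ^ b) & (a ^ result) & sign)),  # signed overflow
        "C": int(a < b),                                # unsigned borrow
    }


equal = compare_flags(25, 25)
smaller = compare_flags(24, 25)
overflow = compare_flags(0x80000000, 1)   # most negative value minus 1

# je is taken exactly when the Z bit is set.
jump_taken = bool(equal["Z"])

print(equal, smaller, overflow, jump_taken)
```

The compare discards the difference and keeps only the flags, so the conditional jump that follows needs no operands of its own.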

Instruction Types (cont.)

- Flow control and I/O instructions
  - Branch
  - Procedure call
  - Interrupts
- I/O instructions
  - Memory-mapped I/O
    - Most processors support memory-mapped I/O
    - No separate instructions for I/O
  - Isolated I/O
    - Pentium supports isolated I/O
    - Separate I/O instructions

        in  AX,io_port    ; read from an I/O port
        out io_port,AX    ; write to an I/O port

5 Instruction Formats

Two types
- Fixed-length
  - Used by RISC processors
  - 32-bit RISC processors use 32-bit-wide instructions
    - Examples: SPARC, MIPS, PowerPC
- Variable-length
  - Used by CISC processors
  - Memory operands need more bits to specify

Opcode
- Major and exact operation

Examples of Instruction Formats

[Figure: examples of instruction formats]

RISC (Reduced Instruction Set Computer)
versus
CISC (Complex Instruction Set Computer)

RISC vs. CISC

- The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.
- RISC systems access memory only with explicit load and store instructions.
- In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.

RISC vs. CISC (cont.)

- The difference between CISC and RISC becomes evident through the basic computer performance equation:

    time/program = time/cycle × cycles/instruction × instructions/program

- RISC systems shorten execution time by reducing the clock cycles per instruction.
- CISC systems improve performance by reducing the number of instructions per program.
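The trade-off between the two approaches can be made concrete by evaluating execution time as the product of instruction count, cycles per instruction, and cycle time. The sketch below uses made-up numbers purely for illustration; none of them are measurements, and the function name cpu_time_ns is hypothetical.

```python
def cpu_time_ns(instructions, cycles_per_instruction, ns_per_cycle):
    """time/program = instructions/program * cycles/instruction * time/cycle."""
    return instructions * cycles_per_instruction * ns_per_cycle


# Hypothetical CISC-style program: fewer instructions, more cycles each.
cisc_time = cpu_time_ns(instructions=1_000, cycles_per_instruction=4, ns_per_cycle=2)

# Hypothetical RISC-style program: more instructions, one short cycle each.
risc_time = cpu_time_ns(instructions=1_500, cycles_per_instruction=1, ns_per_cycle=1)

print(cisc_time, risc_time)
```

With these assumed figures the RISC version executes 50% more instructions yet finishes sooner, because the other two factors of the product shrink by more than the instruction count grows.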

RISC vs. CISC (cont.)

- The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.
- The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.
- With fixed-length instructions, RISC lends itself to pipelining and speculative execution.

RISC vs. CISC (cont.)

Consider the following program fragments:

CISC:
        mov ax, 10
        mov bx, 5
        mul bx, ax

RISC:
        mov ax, 0
        mov bx, 10
        mov cx, 5
Begin:  add ax, bx
        loop Begin

The total clock cycles for the CISC version might be:
    (2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles

While the clock cycles for the RISC version are:
    (3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles

With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds.

RISC vs. CISC (cont.)

- Because of their load-store ISAs, RISC architectures require a large number of CPU registers.
- These registers provide fast access to data during sequential program execution.
- They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.
- Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.

RISC vs. CISC (cont.)

- This is how registers can be overlapped in a RISC system.
- The current window pointer (CWP) points to the active register window.

[Figure: overlapping register windows]
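Overlapping register windows can be modeled with a circular register file in which a caller's output registers are the same physical registers as the callee's input registers. The class below is a hypothetical sketch: the window count and register counts are arbitrary choices for the demonstration, and window overflow (more nested calls than windows) is not handled.

```python
class RegisterWindows:
    """Circular register file with overlapping windows (no overflow handling)."""

    def __init__(self, windows=4, shared=8, local=8):
        self.windows = windows
        self.shared = shared              # registers shared by caller and callee
        self.step = shared + local        # physical registers advanced per call
        self.regs = [0] * (windows * self.step)
        self.cwp = 0                      # current window pointer

    def _index(self, kind, n):
        # "out" of window k is the same physical range as "in" of window k + 1.
        offsets = {"in": 0, "local": self.shared, "out": self.step}
        return (self.cwp * self.step + offsets[kind] + n) % len(self.regs)

    def read(self, kind, n):
        return self.regs[self._index(kind, n)]

    def write(self, kind, n, value):
        self.regs[self._index(kind, n)] = value

    def call(self):
        self.cwp = (self.cwp + 1) % self.windows

    def ret(self):
        self.cwp = (self.cwp - 1) % self.windows


rw = RegisterWindows()
rw.write("out", 0, 42)           # caller places a parameter in an out register
rw.call()                        # only the CWP changes; no data is copied
received = rw.read("in", 0)      # callee sees the parameter as an in register
rw.write("in", 0, received + 1)  # callee leaves its result in the same place
rw.ret()
returned = rw.read("out", 0)

print(received, returned)
```

Parameter passing costs nothing beyond updating the pointer, which is the overhead reduction the preceding slides describe.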

RISC vs. CISC (cont.)

- It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.
- Some RISC systems provide more extravagant instruction sets than some CISC systems.
- Some systems combine both approaches.
- The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.

RISC vs. CISC (cont.)

RISC                                           CISC
Multiple register sets.                        Single register set.
Three operands per instruction.                One or two register operands per instruction.
Parameter passing through register windows.    Parameter passing through memory.
Single-cycle instructions.                     Multiple-cycle instructions.
Hardwired control.                             Microprogrammed control.
Highly pipelined.                              Less pipelined.
Simple instructions, few in number.            Many complex instructions.
Fixed-length instructions.                     Variable-length instructions.
Complexity in compiler.                        Complexity in microcode.
Only LOAD/STORE instructions access memory.    Many instructions can access memory.
Few addressing modes.                          Many addressing modes.

Summary

Instruction Set Design Issues

Instruction set design issues include:
- Where are operands stored?
  - registers, memory, stack, accumulator
- How many explicit operands are there?
  - 0, 1, 2, or 3
- How is the operand location specified?
  - register, immediate, indirect, ...
- What type and size of operands are supported?
  - byte, int, float, double, string, vector, ...
- What operations are supported?
  - add, sub, mul, move, compare, ...

More About General Purpose Registers

Why do almost all new architectures use GPRs?
- Registers are much faster than memory (even cache)
  - Register values are available immediately
  - When memory isn't ready, the processor must wait ("stall")
- Registers are convenient for variable storage
  - Compiler assigns some variables just to registers
  - More compact code, since small fields specify registers (compared to memory addresses)

[Figure: memory hierarchy. Processor (Registers), Cache, Memory, Disk]

What Operations are Needed

- Arithmetic + Logical
  - Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
  - Logical operation: AND, OR, XOR, NOT
- Data Transfer: copy, load, store
- Control: branch, jump, call, return
- Floating Point: ADDF, MULF, DIVF
  - Same as arithmetic but usually take bigger operands
- Decimal: ADDD, CONVERT
- String: move, compare, search
- Graphics: pixel and vertex, compression/decompression operations

Stack Architecture Pros and Cons

Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures

Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures

Accumulator Architecture Pros and Cons

Pros
- Very low hardware requirements
- Easy to design and understand

Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic

Memory-Memory Architecture Pros and Cons

Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)

Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required

Memory-Register Architecture Pros and Cons

Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density

Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers

op6 8asic operation of the instruction traditionally called opcoders6 The first register source operandrt 6 The second register source operandrd 6 The register destination operand it gets the result of the operations$mat 6 Shift amountfunct 6 This field selects the specific variant of the operation of the op field

Immediate3type instructions

Some instructions need longer fields than provided for large value constant

The $bit address means a load ord instruction can load a ord ithin a

region of plusmn

bytes of the address in the base register ltample6 l Rtgt (Rs) G Temporary register Rtgt gets A=+Instruction 6ormat op rs rt rd shamt funct address

add 1 gt reg reg reg gt 72A

sub 1 gt reg reg reg gt amp 72A

l I reg reg 72A 72A 72A address

s I amp reg reg 72A 72A 72A address

o p r s f u n c ts h a m tr dr t b i t s b i t s b i t s b i t s b i t s b i t s

o p r s a d d r e s sr t b i t s b i t s b i t s b i t s

he Stored Program Concepthe Stored Pro

gram Concept

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 44101

he Stored Program Concepthe Stored Program Concept 3earning ho instructions are represented leads to discovering

the secret of computing6 the stored$program concept

TodayQs computers are build on to 0ey principles 6 Instructions are represented as numbers

Programs can be stored in memory to beread or ritten Oust li0e numbers

he power of the concept

memory can contain6

the source code for an editor

the compiled m2c code for the editor

the tet that the compiled program is using

the compiler that generated the code

P r o c e s s o r

A c c o u n t i n g p r o g r a m( m a c h i n e c o d e )

lt d i t o r p r o g r a m( m a c h i n e c o d e )

C c o m p i l e r ( m a c h i n e c o d e )

P a y r o l l d a t a

8 o o 0 t e t

S o u r c e c o d e i n Cf o r e d i t o r p r o g r a m

M e m o r y

Compiling if3then3else in MIPS

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 45101

Compiling if3then3else in MIPS Assuming t$e fi=e =ariales famp gamp $amp iampand lt correspond to t$e fi=e registersgts t$roug$ gts+amp $at is t$e compilerS code for t$e folloing C ifstatement

if (i 44 lt) f 4 g 5 $ else f 4 g - $

i E E O

f E g U hf E g F h

lt l s e 6

lt i t 6

i E O i ne O

bne Rs Rsamp ltlse G go to ltlse if i ne O

add Rsgt Rs Rs G f E g F h (s0ipped if i ne O)

O ltit

ltlse6 sub Rsgt Rs Rs G f E g $ h (s0ipped if i E O)

ltit6

MIPS

ypical Compilation

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 46101

ypical Compilation

Ma9or ypes of $ptimiation

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 47101

$ptimiation ame 7planation 6re-uency

+igh Fleel

Procedure integration

$t or near source leelamp machine indep

1eplace procedure call by procedure body 7M

5ocal

Common sub$ epressionelimination

Constant propagation

Stac0 height reduction

(ithin straight line code

1eplace to instances of the same computation bysingle copy

1eplace all instances of a variable that is assigned aconstant ith the constant

1earrange epression tree to minimi-e resourcesneeded for epression evaluation

=

7M

Glo8al

lobal common subepression elimination

Copy propagation

Code motion

Induction variable

elimination

$cross a ranch

Same as local but this version crosses branches

1eplace all instances of a variable A that has beenassigned (ie A E ) ith

1emove code from a loop that computes same value

each iteration of the loopSimplify2eliminate array Uaddressing calculationsithin loops

Machine3dependant

Strength reduction

Pipeline Scheduling

Depends on machine )nowledge

Many eamples such as replace multiply by aconstant ith adds and shifts

1eorder instructions to improve pipeline performance

7M

7M

Ma9or ypes of $ptimiation

ffect of Complier $ptimiation

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 48101

easurements taken on S

P r o g r a m a

n d C o m p i l e r $ p t i m i a t i

o n 5 e e l

e=el 6 non$optimi-ed code

e=el 16 local optimi-ation

e=el 6 global optimi-ation s2 pipelining

e=el 6 adds procedure integration

ffect of Complier $ptimiation

Compiler Support for Multimedia Instructions

- Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics).
- Intel added a new set of instructions called Streaming SIMD Extension.
- A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer.
- Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations.
  - Strided addressing allows memory access in increments larger than one.
  - Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data.
- Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup.
- Such limited support for vector processing makes vectorizing compiler optimization unpopular and restricts its scope to hand-coded routines.
- SIMD instructions on MMX and AltiVec tend to be solutions, not primitives.
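The two addressing styles mentioned on this slide can be sketched as follows. This is a minimal illustration that models memory as a Python list; the helper names (strided_load, gather, scatter) are assumptions and do not correspond to any real instruction set.

```python
def strided_load(memory, base, stride, count):
    # Strided addressing: read every `stride`-th element starting at `base`.
    return [memory[base + i * stride] for i in range(count)]


def gather(memory, indices):
    # Gather: an index vector holds the addresses of the elements to load.
    return [memory[i] for i in indices]


def scatter(memory, indices, values):
    # Scatter: store each value at the address given by the index vector.
    for i, v in zip(indices, values):
        memory[i] = v


memory = list(range(100, 116))            # 16 words holding 100..115
column = strided_load(memory, base=1, stride=4, count=4)
picked = gather(memory, [3, 0, 9])
scatter(memory, [3, 0, 9], [-1, -2, -3])
```

A stride of 4 over a 4x4 row-major array, as here, reads one column, which is the kind of access a unit-stride-only SIMD extension cannot do in one operation.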

Starting a Program

[Figure: C program -> Compiler -> Assembly language program -> Assembler -> Object: machine language module (together with Object: library routine, machine language) -> Linker -> Executable: machine language program -> Loader -> Memory]

Linker
- Place code and data modules symbolically in memory
- Determine the addresses of data and instruction labels
- Patch both internal and external references

Object files for Unix typically contain:
- Header: size and position of components
- Text segment: machine code
- Data segment: static and dynamic variables
- Relocation info: identifies absolute memory references
- Symbol table: names and locations of labels, procedures and variables
- Debugging info: mapping of source to object code, breakpoints, etc.

Loading Executable Program

[Figure: memory layout, from high to low addresses]
  $sp -> 7fff ffff hex   Stack
                         Dynamic data
  $gp -> 1000 8000 hex   Static data
         1000 0000 hex
                         Text
  pc  -> 0040 0000 hex
                         Reserved
         0

To load an executable, the operating system follows these steps:
1. Reads the executable file header to determine the size of the text and data segments
2. Creates an address space large enough for the text and data
3. Copies the instructions and data from the executable file into memory
4. Copies the parameters (if any) to the main program onto the stack
5. Initializes the machine registers and sets the stack pointer to the first free location
6. Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program

Instruction Set Design Issues

- Number of Addresses
- Flow of Control
- Operand Types
- Addressing Modes
- Instruction Types
- Instruction Formats

Number of Addresses

Four categories
- 3-address machines
  - Two for the source operands and one for the result
- 2-address machines
  - One address doubles as source and result
- 1-address machines
  - Accumulator machines
  - Accumulator is used for one source and result
- 0-address machines
  - Stack machines
  - Operands are taken from the stack
  - Result goes onto the stack

Number of Addresses (cont.)

Three-address machines
- Two for the source operands, one for the result
- RISC processors use three addresses
- Sample instructions

    add  dest,src1,src2    ; M(dest) = [src1] + [src2]
    sub  dest,src1,src2    ; M(dest) = [src1] - [src2]
    mult dest,src1,src2    ; M(dest) = [src1] * [src2]

Three addresses: Operand 1, Operand 2, Result
Example: a = b + c

Three-address instruction formats are not common, because they require a relatively long instruction format to hold the three address references.

Number of Addresses (cont.)

Example
- C statement

    A = B + C * D - E + F + A

- Equivalent code:

    mult T,C,D    ; T = C*D
    add  T,T,B    ; T = B + C*D
    sub  T,T,E    ; T = B + C*D - E
    add  T,T,F    ; T = B + C*D - E + F
    add  A,T,A    ; A = B + C*D - E + F + A
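The three-address sequence can be checked with a tiny interpreter. This is a sketch under stated assumptions: memory is a Python dict keyed by variable name, the tuple encoding of instructions is invented for illustration, and the operand values are arbitrary.

```python
import operator

# Each opcode maps to the arithmetic it performs on two source operands.
OPS = {"add": operator.add, "sub": operator.sub, "mult": operator.mul}


def run_three_address(program, memory):
    # Execute (op, dest, src1, src2) tuples: M[dest] = M[src1] op M[src2].
    for op, dest, src1, src2 in program:
        memory[dest] = OPS[op](memory[src1], memory[src2])
    return memory


program = [
    ("mult", "T", "C", "D"),   # T = C*D
    ("add", "T", "T", "B"),    # T = B + C*D
    ("sub", "T", "T", "E"),    # T = B + C*D - E
    ("add", "T", "T", "F"),    # T = B + C*D - E + F
    ("add", "A", "T", "A"),    # A = B + C*D - E + F + A
]
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6, "T": 0}
expected = memory["B"] + memory["C"] * memory["D"] - memory["E"] + memory["F"] + memory["A"]
run_three_address(program, memory)
```

With these sample values the C expression and the five-instruction sequence both leave 16 in A.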

Number of Addresses (cont.)

Two-address machines
- One address doubles (for source operand and result)
- Last example makes a case for it
  - Address T is used twice
- Sample instructions

    load dest,src    ; M(dest) = [src]
    add  dest,src    ; M(dest) = [dest] + [src]
    sub  dest,src    ; M(dest) = [dest] - [src]
    mult dest,src    ; M(dest) = [dest] * [src]

Two addresses: one address doubles as operand and result
Example: a = a + b

The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation.

Number of Addresses (cont.)

Example
- C statement

    A = B + C * D - E + F + A

- Equivalent code:

    load T,C    ; T = C
    mult T,D    ; T = C*D
    add  T,B    ; T = B + C*D
    sub  T,E    ; T = B + C*D - E
    add  T,F    ; T = B + C*D - E + F
    add  A,T    ; A = B + C*D - E + F + A

Number of Addresses (cont.)

One-address machines
- Use a special set of registers called accumulators
  - Specify one source operand and receive the result
- Called accumulator machines
- Sample instructions

    load  addr    ; accum = [addr]
    store addr    ; M[addr] = accum
    add   addr    ; accum = accum + [addr]
    sub   addr    ; accum = accum - [addr]
    mult  addr    ; accum = accum * [addr]

Number of Addresses (cont.)

Example
- C statement

    A = B + C * D - E + F + A

- Equivalent code:

    load  C    ; load C into accum
    mult  D    ; accum = C*D
    add   B    ; accum = C*D + B
    sub   E    ; accum = B + C*D - E
    add   F    ; accum = B + C*D - E + F
    add   A    ; accum = B + C*D - E + F + A
    store A    ; store accum contents in A
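The accumulator sequence can be traced with a minimal one-address interpreter. The dict-based memory, the tuple encoding, and the sample values are assumptions made for illustration only.

```python
def run_accumulator(program, memory):
    # One-address machine: every instruction names one memory operand,
    # and the accumulator is the implicit second operand and destination.
    accum = 0
    for op, addr in program:
        if op == "load":
            accum = memory[addr]
        elif op == "store":
            memory[addr] = accum
        elif op == "add":
            accum += memory[addr]
        elif op == "sub":
            accum -= memory[addr]
        elif op == "mult":
            accum *= memory[addr]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return accum


program = [
    ("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
    ("add", "F"), ("add", "A"), ("store", "A"),
]
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
final_accum = run_accumulator(program, memory)
```

Seven instructions are needed here, against five for the three-address version, but each instruction carries only one address.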

Number of Addresses (cont.)

Zero-address machines
- Stack supplies operands and receives the result
  - Special instructions to load and store use an address
- Called stack machines (Ex: HP 3000, Burroughs B5500)
- Sample instructions

    push addr    ; push([addr])
    pop  addr    ; pop([addr])
    add          ; push(pop + pop)
    sub          ; push(pop - pop)
    mult         ; push(pop * pop)

Number of Addresses (cont.)

Example
- C statement

    A = B + C * D - E + F + A

- Equivalent code:

    push E
    push C
    push D
    mult
    push B
    add
    sub
    push F
    add
    push A
    add
    pop  A
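The stack sequence can be traced with a small zero-address interpreter. This sketch assumes that in push(pop - pop) the first pop (the most recently pushed value) is the left operand, which is what makes the sequence above produce B + C*D - E; the tuple encoding and sample values are also assumptions.

```python
def run_stack(program, memory):
    # Zero-address machine: arithmetic takes both operands from the stack
    # and pushes the result; only push and pop name a memory address.
    stack = []
    for instr in program:
        op = instr[0]
        if op == "push":
            stack.append(memory[instr[1]])
        elif op == "pop":
            memory[instr[1]] = stack.pop()
        else:
            first = stack.pop()    # most recently pushed value
            second = stack.pop()
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)
            elif op == "mult":
                stack.append(first * second)
            else:
                raise ValueError(f"unknown opcode: {op}")
    return stack


program = [
    ("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
    ("push", "B"), ("add",), ("sub",), ("push", "F"), ("add",),
    ("push", "A"), ("add",), ("pop", "A"),
]
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
leftover = run_stack(program, memory)
```

E is pushed first so that it sits below B + C*D when sub executes.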

Load/Store Architecture

- Instructions expect operands in internal processor registers
  - Special LOAD and STORE instructions move data between registers and memory
- RISC uses this architecture
- Reduces instruction length

Load/Store Architecture (cont.)

Sample instructions

    load  Rd,addr       ; Rd = [addr]
    store addr,Rs       ; (addr) = Rs
    add   Rd,Rs1,Rs2    ; Rd = Rs1 + Rs2
    sub   Rd,Rs1,Rs2    ; Rd = Rs1 - Rs2
    mult  Rd,Rs1,Rs2    ; Rd = Rs1 * Rs2

Number of Addresses (cont.)

Example
- C statement

    A = B + C * D - E + F + A

- Equivalent code:

    load  R1,B
    load  R2,C
    load  R3,D
    load  R4,E
    load  R5,F
    load  R6,A
    mult  R2,R2,R3
    add   R2,R2,R1
    sub   R2,R2,R4
    add   R2,R2,R5
    add   R2,R2,R6
    store A,R2

Flow of Control

- Default is sequential flow
- Several instructions alter this default execution
  - Branches
    - Unconditional
    - Conditional
    - Delayed branches
  - Procedure calls
    - Delayed procedure calls

Flow of Control (cont.)

Branches
- Unconditional
  - Absolute address
  - PC-relative
    - Target address is specified relative to PC contents
    - Relocatable code
- Example: MIPS
  - Absolute address

        j target

  - PC-relative

        b target

Flow of Control (cont.)

[Figure not recoverable; the surviving labels mention Pentium]

Flow of Control (cont.)

Branches
- Conditional
  - Jump is taken only if the condition is met
- Two types
  - Set-Then-Jump
    - Condition testing is separated from branching
    - Condition code registers are used to convey the condition test result
    - Condition code registers keep a record of the status of the last ALU operation, such as an overflow condition
  - Example: Pentium code

        cmp AX,BX     ; compare AX and BX
        je  target    ; jump if equal

Flow of Control (cont.)

  - Test-and-Jump
    - Single instruction performs condition testing and branching
  - Example: MIPS instruction

        beq Rsrc1,Rsrc2,target

    Jumps to target if Rsrc1 = Rsrc2

- Delayed branching
  - Control is transferred after executing the instruction that follows the branch instruction
    - This instruction slot is called the delay slot
  - Improves efficiency
  - Highly pipelined RISC processors support it
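The effect of a delay slot can be seen in a toy simulator. This is only a sketch: the three-instruction machine (addi, beq, and list indices as branch targets), the register names, and the `delayed` switch are all assumptions invented for illustration, not a model of any real pipeline.

```python
def run(program, delayed):
    # Execute a tiny program; with delayed=True, the instruction right
    # after a taken branch (the delay slot) runs before control transfers.
    regs = {"r1": 0, "r2": 0, "r3": 0}
    pc = 0
    pending = None            # branch target waiting for the delay slot
    while pc < len(program):
        instr = program[pc]
        next_pc = pc + 1
        if pending is not None:
            # This instruction is the delay slot: go to the target afterwards.
            next_pc, pending = pending, None
        if instr[0] == "addi":
            regs[instr[1]] += instr[2]
        elif instr[0] == "beq" and regs[instr[1]] == regs[instr[2]]:
            if delayed:
                pending = instr[3]
            else:
                next_pc = instr[3]
        pc = next_pc
    return regs


program = [
    ("beq", "r1", "r2", 3),   # taken: r1 and r2 are both 0
    ("addi", "r3", 1),        # delay slot
    ("addi", "r3", 10),       # skipped in both modes
    ("addi", "r3", 100),      # branch target
]
without_slot = run(program, delayed=False)["r3"]
with_slot = run(program, delayed=True)["r3"]
```

The instruction in the delay slot contributes to the result only in the delayed mode, which is why a compiler tries to place useful work there.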

Flow of Control (cont.)

Procedure calls
- Facilitate modular programming
- Require two pieces of information to return
  - End of procedure
    - Pentium uses the ret instruction
    - MIPS uses the jr instruction
  - Return address
    - In a (special) register: MIPS allows any general-purpose register
    - On the stack: Pentium

Flow of Control (cont.)

[Figure: delay slot]

Parameter Passing

Two basic techniques
- Register-based (e.g., PowerPC, MIPS)
  - Internal registers are used
    - Faster
    - Limits the number of parameters
    - Recursive procedures
- Stack-based (e.g., Pentium)
  - Stack is used
    - More general

2. Operand Types

- Instructions support basic data types
  - Characters
  - Integers
  - Floating-point
- Instruction overload
  - Same instruction for different data types
  - Example: Pentium

        mov AL,address     ; loads an 8-bit value
        mov AX,address     ; loads a 16-bit value
        mov EAX,address    ; loads a 32-bit value

Operand Types

- Separate instructions
  - Instructions specify the operand size
  - Example: MIPS

        lb Rdest,address    ; loads a byte
        lh Rdest,address    ; loads a halfword (16 bits)
        lw Rdest,address    ; loads a word (32 bits)
        ld Rdest,address    ; loads a doubleword (64 bits)

  - Similar instructions exist for store

3. Addressing Modes

- How the operands are specified
  - Operands can be in three places
    - Registers
      - Register addressing mode
    - Part of instruction
      - Constant
      - Immediate addressing mode
      - All processors support these two addressing modes
    - Memory
      - Difference between RISC and CISC
      - CISC supports a large variety of addressing modes
      - RISC follows load/store architecture

4. Instruction Types

Several types of instructions
- Data movement
  - Pentium: mov dest,src
  - Some do not provide direct data movement instructions
  - Indirect data movement

        add Rdest,Rsrc,0    ; Rdest = Rsrc + 0

- Arithmetic and Logical
  - Arithmetic
    - Integer and floating-point, signed and unsigned
    - add, subtract, multiply, divide
  - Logical
    - and, or, not, xor

Instruction Types (cont.)

Condition code bits
- S: Sign bit (0 = +, 1 = -)
- Z: Zero bit (0 = nonzero, 1 = zero)
- O: Overflow bit (0 = no overflow, 1 = overflow)
- C: Carry bit (0 = no carry, 1 = carry)

Example: Pentium

    cmp count,25    ; compare count to 25
                    ; subtract 25 from count
    je  target      ; jump if equal
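How a compare sets the four condition code bits can be sketched in a few lines. This is an illustrative model with an assumed 8-bit word size and hypothetical helper names (compare, je_taken); the carry bit is modelled as the borrow out of an unsigned subtraction, which is the x86 convention.

```python
def compare(a, b, bits=8):
    # Model of a compare: compute a - b, discard the result, keep the flags.
    mask = (1 << bits) - 1
    a &= mask
    b &= mask
    result = (a - b) & mask
    sign_shift = bits - 1
    return {
        "S": (result >> sign_shift) & 1,   # sign bit of the result
        "Z": int(result == 0),             # result is zero
        # Signed overflow: operands differ in sign and the result's sign
        # differs from the sign of a.
        "O": (((a ^ b) & (a ^ result)) >> sign_shift) & 1,
        "C": int(a < b),                   # borrow in unsigned subtraction
    }


def je_taken(flags):
    # "Jump if equal" only inspects the zero bit.
    return flags["Z"] == 1


equal = compare(25, 25)
smaller = compare(5, 25)
overflow = compare(0x80, 1)   # -128 - 1 does not fit in 8 signed bits
```

This is the set-then-jump style: the compare records its outcome in the flags, and a later branch instruction reads them.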

Instruction Types (cont.)

- Flow control and I/O instructions
  - Branch
  - Procedure call
  - Interrupts
- I/O instructions
  - Memory-mapped I/O
    - Most processors support memory-mapped I/O
    - No separate instructions for I/O
  - Isolated I/O
    - Pentium supports isolated I/O
    - Separate I/O instructions

          in  AX,io_port     ; read from an I/O port
          out io_port,AX     ; write to an I/O port

5. Instruction Formats

Two types
- Fixed-length
  - Used by RISC processors
  - 32-bit RISC processors use 32-bit-wide instructions
    - Examples: SPARC, MIPS, PowerPC
- Variable-length
  - Used by CISC processors
  - Memory operands need more bits to specify

Opcode
- Major and exact operation

Examples of Instruction Formats

RISC (Reduced Instruction Set Computer) versus CISC (Complex Instruction Set Computer)

RISC vs CISC

- The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.
- RISC systems access memory only with explicit load and store instructions.
- In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.

RISC vs CISC

- The difference between CISC and RISC becomes evident through the basic computer performance equation:

    time/program = (time/cycle) x (cycles/instruction) x (instructions/program)

- RISC systems shorten execution time by reducing the clock cycles per instruction.
- CISC systems improve performance by reducing the number of instructions per program.

RISC vs CISC

- The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.
- The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.
- With fixed-length instructions, RISC lends itself to pipelining and speculative execution.

RISC vs CISC

Consider these program fragments:

CISC:

    mov ax, 10
    mov bx, 5
    mul bx, ax

RISC:

           mov ax, 0
           mov bx, 10
           mov cx, 5
    Begin: add ax, bx
           loop Begin

The total clock cycles for the CISC version might be:
(2 movs x 1 cycle) + (1 mul x 30 cycles) = 32 cycles

While the clock cycles for the RISC version are:
(3 movs x 1 cycle) + (5 adds x 1 cycle) + (5 loops x 1 cycle) = 13 cycles

With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds.
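The cycle arithmetic for the two fragments can be written out as a short calculation. The per-instruction costs used here (1 cycle for mov, add and loop; 30 cycles for mul) and the operand values 10 and 5 are assumptions for illustration; real costs depend on the processor.

```python
def total_cycles(instruction_counts, cycles_per_instruction):
    # Sum of (how many times each instruction executes) x (its cycle cost).
    return sum(
        count * cycles_per_instruction[name]
        for name, count in instruction_counts.items()
    )


# CISC fragment: two moves and one multi-cycle multiply.
cisc_cycles = total_cycles({"mov": 2, "mul": 1}, {"mov": 1, "mul": 30})

# RISC fragment: three moves, then a loop of add + loop executed five times.
risc_cycles = total_cycles(
    {"mov": 3, "add": 5, "loop": 5}, {"mov": 1, "add": 1, "loop": 1}
)

# Both fragments compute the same product: 10 added to itself 5 times.
product_by_addition = sum(10 for _ in range(5))
```

The RISC version executes more instructions (13 against 3), but every one of them takes a single cycle.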

RISC vs CISC

- Because of their load-store ISAs, RISC architectures require a large number of CPU registers.
- These registers provide fast access to data during sequential program execution.
- They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.
- Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.

RISC vs CISC

- This is how registers can be overlapped in a RISC system.
- The current window pointer (CWP) points to the active register window.

[Figure: overlapping register windows]

RISC vs CISC

- It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.
- Some RISC systems provide more extravagant instruction sets than some CISC systems.
- Some systems combine both approaches.
- The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.

RISC vs CISC

RISC
- Multiple register sets.
- Three operands per instruction.
- Parameter passing through register windows.
- Single-cycle instructions.
- Hardwired control.
- Highly pipelined.

CISC
- Single register set.
- One or two register operands per instruction.
- Parameter passing through memory.
- Multiple-cycle instructions.
- Microprogrammed control.
- Less pipelined.

Continued

RISC vs CISC

RISC
- Simple instructions, few in number.
- Fixed-length instructions.
- Complexity in compiler.
- Only LOAD/STORE instructions access memory.
- Few addressing modes.

CISC
- Many complex instructions.
- Variable-length instructions.
- Complexity in microcode.
- Many instructions can access memory.
- Many addressing modes.

Summary

Instruction Set Design Issues

Instruction set design issues include:
- Where are operands stored?
  - registers, memory, stack, accumulator
- How many explicit operands are there?
  - 0, 1, 2, or 3
- How is the operand location specified?
  - register, immediate, indirect, ...
- What type and size of operands are supported?
  - byte, int, float, double, string, vector, ...
- What operations are supported?
  - add, sub, mul, move, compare, ...

More About General Purpose Registers

Why do almost all new architectures use GPRs?
- Registers are much faster than memory (even cache)
  - Register values are available immediately
  - When memory isn't ready, the processor must wait ("stall")
- Registers are convenient for variable storage
  - Compiler assigns some variables just to registers
  - More compact code, since small fields specify registers (compared to memory addresses)

[Figure: memory hierarchy: Processor, Registers, Cache, Memory, Disk]

What Operations are Needed

- Arithmetic and Logical
  - Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
  - Logical operation: AND, OR, XOR, NOT
- Data Transfer: copy, load, store
- Control: branch, jump, call, return
- Floating Point: ADDF, MULF, DIVF (same as arithmetic, but usually take bigger operands)
- Decimal: ADDD, CONVERT
- String: move, compare, search
- Graphics: pixel and vertex, compression/decompression operations

Stack Architecture Pros and Cons

Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures

Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures

Accumulator Architecture Pros and Cons

Pros
- Very low hardware requirements
- Easy to design and understand

Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic

Memory-Memory Architecture Pros and Cons

Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)

Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required

Memory-Register Architecture Pros and Cons

Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density

Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers

MIPS Instruction Format

Register-format instructions

    field:   op      rs      rt      rd      shamt   funct
    width:   6 bits  5 bits  5 bits  5 bits  5 bits  6 bits

- op: basic operation of the instruction, traditionally called the opcode
- rs: the first register source operand
- rt: the second register source operand
- rd: the register destination operand; it gets the result of the operation
- shamt: shift amount
- funct: this field selects the specific variant of the operation in the op field

Immediate-type instructions

    field:   op      rs      rt      address
    width:   6 bits  5 bits  5 bits  16 bits

- Some instructions need longer fields than provided, for example for large-value constants
- The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register
- Example: lw $t0, 32($s3)    # Temporary register $t0 gets A[8]

    Instruction  Format  op  rs   rt   rd    shamt  funct  address
    add          R       0   reg  reg  reg   0      32     n.a.
    sub          R       0   reg  reg  reg   0      34     n.a.
    lw           I       35  reg  reg  n.a.  n.a.   n.a.   address
    sw           I       43  reg  reg  n.a.  n.a.   n.a.   address
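The two field layouts can be made concrete by packing the fields into a 32-bit word. This is a sketch: the helper names encode_r and encode_i are invented for illustration, and the register numbers follow the usual MIPS convention ($t0 = 8, $s1 = 17, $s2 = 18, $s3 = 19).

```python
def encode_r(op, rs, rt, rd, shamt, funct):
    # R-format: op(6) rs(5) rt(5) rd(5) shamt(5) funct(6), packed MSB first.
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


def encode_i(op, rs, rt, address):
    # I-format: op(6) rs(5) rt(5) address(16); address is a signed offset.
    if not -(1 << 15) <= address < (1 << 15):
        raise ValueError("offset does not fit in 16 bits")
    return (op << 26) | (rs << 21) | (rt << 16) | (address & 0xFFFF)


T0, S1, S2, S3 = 8, 17, 18, 19   # conventional MIPS register numbers

add_word = encode_r(0, S1, S2, T0, 0, 32)   # add $t0, $s1, $s2
sub_word = encode_r(0, S1, S2, T0, 0, 34)   # sub $t0, $s1, $s2
lw_word = encode_i(35, S3, T0, 32)          # lw  $t0, 32($s3)
```

Note that in the I-format the base register goes in rs and the register being loaded goes in rt; the range check mirrors the ±2^15-byte limit of the 16-bit address field.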

The Stored Program Concept

- Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept
- Today's computers are built on two key principles:
  - Instructions are represented as numbers
  - Programs can be stored in memory to be read or written just like numbers
- The power of the concept: memory can contain
  - the source code for an editor
  - the compiled m/c code for the editor
  - the text that the compiled program is using
  - the compiler that generated the code

[Figure: a processor connected to a memory holding an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and the source code in C for the editor program]

Compiling if-then-else in MIPS

Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiled code for the following C if statement?

    if (i == j) f = g + h; else f = g - h;

[Figure: flowchart testing i == j; when i = j, f = g + h; when i != j (Else), f = g - h; both paths join at Exit]

MIPS:

          bne $s3, $s4, Else    # go to Else if i != j
          add $s0, $s1, $s2     # f = g + h (skipped if i != j)
          j   Exit
    Else: sub $s0, $s1, $s2     # f = g - h (skipped if i = j)
    Exit:

ypical Compilation

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 46101

ypical Compilation

Ma9or ypes of $ptimiation

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 47101

$ptimiation ame 7planation 6re-uency

+igh Fleel

Procedure integration

$t or near source leelamp machine indep

1eplace procedure call by procedure body 7M

5ocal

Common sub$ epressionelimination

Constant propagation

Stac0 height reduction

(ithin straight line code

1eplace to instances of the same computation bysingle copy

1eplace all instances of a variable that is assigned aconstant ith the constant

1earrange epression tree to minimi-e resourcesneeded for epression evaluation

=

7M

Glo8al

lobal common subepression elimination

Copy propagation

Code motion

Induction variable

elimination

$cross a ranch

Same as local but this version crosses branches

1eplace all instances of a variable A that has beenassigned (ie A E ) ith

1emove code from a loop that computes same value

each iteration of the loopSimplify2eliminate array Uaddressing calculationsithin loops

Machine3dependant

Strength reduction

Pipeline Scheduling

Depends on machine )nowledge

Many eamples such as replace multiply by aconstant ith adds and shifts

1eorder instructions to improve pipeline performance

7M

7M

Ma9or ypes of $ptimiation

ffect of Complier $ptimiation

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 48101

easurements taken on S

P r o g r a m a

n d C o m p i l e r $ p t i m i a t i

o n 5 e e l

e=el 6 non$optimi-ed code

e=el 16 local optimi-ation

e=el 6 global optimi-ation s2 pipelining

e=el 6 adds procedure integration

ffect of Complier $ptimiation

Compiler Support for Multimedia Instr

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 49101

IntelQs MM and PoerPC Altiec have small vector processing capabilitiestargeting Multimedia applications (to speed up graphics)

Intel added ne set of instructions called Streaming SIM lttension

A maOor advantage of vector computers is hiding latency of memory accessby loading multiple elements and then overlapping eecution ith data

transfer

ector computers typically have strided and2or gather2scatter addressing to

perform operations on distant memory locations Strided addressing allos memory access in increment larger than one

ather2scatter addressing is similar to register indirect mode here theaddress are stored instead of the data

Supporting vector operation ithout strided addressing such as IntelQs MMlimits the potential speedup

Such limited support for vector processing ma0es the use of vectori-ing compiler optimi-ation unpopular and restrict its scope to hand coded routines

Compiler Support for Multimedia Instramp

SIM instructions on MM and Altiec tend to be solutions not primitivesSIM instructions on MM and Altiec tend to be solutions not primitives

Starting a Program

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 50101

Starting a Program

A s s e m b l e r

A s s e m b l y l a n g u a g e p r o g r a m

C o m p i l e r

C p r o g r a m

3 i n 0 e r

lt e c u t a b l e 6 M a c h i n e l a n g u a g e p r o g r a m

3 o a d e r

M e m o r y

5 b O e c t 6 M a c h i n e l a n g u a g e m o d u l e 5 b O e c t 6 3 i b r a r y r o u t i n e ( m a c h i n e l a n g u a g e )

$ Place code data modules

symbolically in memory

$etermine the address of data instruction labels

$Patch both internal eternal ref

$ Place code data modules

symbolically in memory

$etermine the address of data instruction labels

$Patch both internal eternal ref

5bOect files for 4ni typically contains6

eader6 si-e position of components

Tet segment6 machine code

ata segment6 static and dynamic variables1elocation info6 identify absolute memory ref

Symbol table6 name location of labelsprocedures and variables

ebugging info6 mapping source to obOectcode brea0 points etc

5inker

Loading Executable Program

(Figure: memory layout, from high to low addresses)
Stack: $sp -> 7fff ffff hex
Dynamic data
Static data: $gp -> 1000 8000 hex, segment starting at 1000 0000 hex
Text: pc -> 0040 0000 hex
Reserved: from address 0

To load an executable, the operating system follows these steps:

Reads the executable file header to determine the size of the text and data segments

Creates an address space large enough for the text and data

Copies the instructions and data from the executable file into memory

Copies the parameters (if any) to the main program onto the stack

Initializes the machine registers and sets the stack pointer to the first free location

Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program

Instruction Set Design Issues

Instruction set design issues:

Number of Addresses

Flow of Control

Operand Types

Addressing Modes

Instruction Types

Instruction Formats

Number of Addresses

Four categories

3-address machines
- Two for the source operands and one for the result

2-address machines
- One address doubles as source and result

1-address machines
- Accumulator machines
- Accumulator is used for one source and result

0-address machines
- Stack machines
- Operands are taken from the stack
- Result goes onto the stack

Number of Addresses cont.

Three-address machines

Two for the source operands, one for the result

RISC processors use three addresses

Sample instructions

add dest,src1,src2     M(dest) = [src1] + [src2]

sub dest,src1,src2     M(dest) = [src1] - [src2]

mult dest,src1,src2    M(dest) = [src1] * [src2]

Three addresses: Operand 1, Operand 2, Result

Example: a = b + c

Three-address instruction formats are not common, because they require a relatively long instruction format to hold the three address references.

Number of Addresses cont.

Example

C statement

A = B + C * D - E + F + A

Equivalent code:

mult T,C,D    T = C*D

add T,T,B     T = B + C*D

sub T,T,E     T = B + C*D - E

add T,T,F     T = B + C*D - E + F

add A,T,A     A = B + C*D - E + F + A
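The three-address sequence can be checked with a minimal interpreter. This is only a sketch: the tuple encoding of instructions and the sample values for A to F are assumptions, not part of the slides.

```python
import operator

# Map each mnemonic from the slide to the Python operation it stands for.
OPS = {"add": operator.add, "sub": operator.sub, "mult": operator.mul}


def run_three_address(program, mem):
    """Each instruction is (op, dest, src1, src2); all three name memory locations."""
    for op, dest, src1, src2 in program:
        mem[dest] = OPS[op](mem[src1], mem[src2])
    return mem


program = [
    ("mult", "T", "C", "D"),   # T = C*D
    ("add", "T", "T", "B"),    # T = B + C*D
    ("sub", "T", "T", "E"),    # T = B + C*D - E
    ("add", "T", "T", "F"),    # T = B + C*D - E + F
    ("add", "A", "T", "A"),    # A = B + C*D - E + F + A
]
mem = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6, "T": 0}
run_three_address(program, mem)
print(mem["A"])
```

Five instructions are enough, but each one carries three address fields, which is the instruction-length cost noted earlier.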

Number of Addresses cont.

Two-address machines

One address doubles (for source operand and result)

Last example makes a case for it
- Address T is used twice

Sample instructions

load dest,src    M(dest) = [src]

add dest,src     M(dest) = [dest] + [src]

sub dest,src     M(dest) = [dest] - [src]

mult dest,src    M(dest) = [dest] * [src]

Two addresses: one address doubles as operand and result

Example: a = a + b

The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation.

Number of Addresses cont.

Example

C statement

A = B + C * D - E + F + A

Equivalent code:

load T,C    T = C

mult T,D    T = C*D

add T,B     T = B + C*D

sub T,E     T = B + C*D - E

add T,F     T = B + C*D - E + F

add A,T     A = B + C*D - E + F + A

Number of Addresses cont.

One-address machines

Use a special set of registers called accumulators
- Specify one source operand and receive the result

Called accumulator machines

Sample instructions

load addr     accum = [addr]

store addr    M[addr] = accum

add addr      accum = accum + [addr]

sub addr      accum = accum - [addr]

mult addr     accum = accum * [addr]

Number of Addresses cont.

Example

C statement

A = B + C * D - E + F + A

Equivalent code:

load C     load C into accum

mult D     accum = C*D

add B      accum = C*D + B

sub E      accum = B + C*D - E

add F      accum = B + C*D - E + F

add A      accum = B + C*D - E + F + A

store A    store accum contents in A
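A short accumulator-machine sketch makes the implicit operand visible. The (op, addr) tuple encoding and the sample values are assumptions chosen for illustration.

```python
def run_accumulator(program, mem):
    """Each instruction is (op, addr); the accumulator is the implicit second operand."""
    accum = 0
    for op, addr in program:
        if op == "load":
            accum = mem[addr]
        elif op == "store":
            mem[addr] = accum
        elif op == "add":
            accum = accum + mem[addr]
        elif op == "sub":
            accum = accum - mem[addr]
        elif op == "mult":
            accum = accum * mem[addr]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return mem


program = [
    ("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
    ("add", "F"), ("add", "A"), ("store", "A"),
]
mem = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
run_accumulator(program, mem)
print(mem["A"])
```

Every one of the seven instructions touches memory, which is the high memory traffic usually cited against accumulator designs.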

Number of Addresses cont.

Zero-address machines

Stack supplies operands and receives the result
- Special instructions to load and store use an address

Called stack machines (Ex: HP 3000, Burroughs B5500)

Sample instructions

push addr    push([addr])

pop addr     pop([addr])

add          push(pop + pop)

sub          push(pop - pop)

mult         push(pop * pop)

Number of Addresses cont.

Example

C statement

A = B + C * D - E + F + A

Equivalent code:

push E

push C

push D

mult

push B

add

sub

push F

add

push A

add

pop A
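The stack sequence is easiest to follow by running it. The sketch below assumes that in push(pop - pop) the first pop (the old top of stack) is the left operand, which is why E is pushed first; the tuple encoding and sample values are illustrative.

```python
def run_stack(program, mem):
    """Zero-address machine: only push and pop name an address."""
    stack = []
    for instr in program:
        op = instr[0]
        if op == "push":
            stack.append(mem[instr[1]])
        elif op == "pop":
            mem[instr[1]] = stack.pop()
        else:
            first = stack.pop()    # old top of stack
            second = stack.pop()   # element below it
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)
            elif op == "mult":
                stack.append(first * second)
            else:
                raise ValueError(f"unknown instruction: {op}")
    return mem


program = [
    ("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
    ("push", "B"), ("add",), ("sub",),
    ("push", "F"), ("add",), ("push", "A"), ("add",), ("pop", "A"),
]
mem = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
run_stack(program, mem)
print(mem["A"])
```

The arithmetic instructions carry no addresses at all, but twelve instructions are needed where the three-address form used five.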

Load/Store Architecture

Instructions expect operands in internal processor registers

Special LOAD and STORE instructions move data between registers and memory

RISC uses this architecture

Reduces instruction length

Load/Store Architecture cont.

Sample instructions

load $d,addr        $d = [addr]

store addr,$s       (addr) = $s

add $d,$s1,$s2      $d = $s1 + $s2

sub $d,$s1,$s2      $d = $s1 - $s2

mult $d,$s1,$s2     $d = $s1 * $s2

Number of Addresses cont.

Example

C statement

A = B + C * D - E + F + A

Equivalent code:

load $1,B

load $2,C

load $3,D

load $4,E

load $5,F

load $6,A

mult $2,$2,$3

add $2,$2,$1

sub $2,$2,$4

add $2,$2,$5

add $2,$2,$6

store A,$2
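A sketch of the load/store version follows, counting how often memory is touched. Register numbers follow the reconstruction $1 = B through $6 = A, which is an assumption; the tuple encoding is illustrative.

```python
def run_load_store(program, mem):
    """Only load and store access `mem`; arithmetic works on registers."""
    regs = {}
    memory_accesses = 0
    for instr in program:
        op = instr[0]
        if op == "load":                 # load $d, addr
            _, d, addr = instr
            regs[d] = mem[addr]
            memory_accesses += 1
        elif op == "store":              # store addr, $s
            _, addr, s = instr
            mem[addr] = regs[s]
            memory_accesses += 1
        else:                            # add/sub/mult $d, $s1, $s2
            _, d, s1, s2 = instr
            a, b = regs[s1], regs[s2]
            if op == "add":
                regs[d] = a + b
            elif op == "sub":
                regs[d] = a - b
            elif op == "mult":
                regs[d] = a * b
            else:
                raise ValueError(f"unknown instruction: {op}")
    return memory_accesses


program = [
    ("load", 1, "B"), ("load", 2, "C"), ("load", 3, "D"),
    ("load", 4, "E"), ("load", 5, "F"), ("load", 6, "A"),
    ("mult", 2, 2, 3), ("add", 2, 2, 1), ("sub", 2, 2, 4),
    ("add", 2, 2, 5), ("add", 2, 2, 6), ("store", "A", 2),
]
mem = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
accesses = run_load_store(program, mem)
print(mem["A"], accesses)
```

The program is longer than the three-address memory version, but each operand is fetched from memory exactly once and the arithmetic instructions need only short register fields.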

Flow of Control

Default is sequential flow

Several instructions alter this default execution

Branches
- Unconditional
- Conditional
- Delayed branches

Procedure calls
- Delayed procedure calls

Flow of Control cont.

Branches

Unconditional
- Absolute address
- PC-relative
  • Target address is specified relative to PC contents
  • Relocatable code

Example: MIPS
- Absolute address

j target

- PC-relative

b target
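How the two target calculations differ can be sketched in a few lines. The formulas assume the usual MIPS32 encoding (a 16-bit word offset relative to the instruction after the branch, and a 26-bit word index combined with the upper four bits of PC + 4 for j); the function names are hypothetical.

```python
def pc_relative_target(pc, offset_words):
    """Branch target: offset counts 4-byte instructions from the instruction after the branch."""
    return (pc + 4 + (offset_words << 2)) & 0xFFFFFFFF


def absolute_jump_target(pc, index26):
    """Jump target: 26-bit word index placed under the upper 4 bits of PC + 4."""
    return ((pc + 4) & 0xF0000000) | ((index26 & 0x03FFFFFF) << 2)


pc = 0x00400010
print(hex(pc_relative_target(pc, 3)))            # forward branch
print(hex(pc_relative_target(pc, -2)))           # backward branch
print(hex(absolute_jump_target(pc, 0x0100000)))  # jump to a fixed address

# Relocatable code: moving the code by 0x1000 moves the branch target by 0x1000 too,
# while the jump still lands on the same fixed address.
moved = pc + 0x1000
assert pc_relative_target(moved, 3) - pc_relative_target(pc, 3) == 0x1000
assert absolute_jump_target(moved, 0x0100000) == absolute_jump_target(pc, 0x0100000)
```

The last two checks show why PC-relative branches need no patching when code is moved, whereas an absolute target must be fixed up by the linker or loader.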

Flow of Control cont.

Branches

Conditional
- Jump is taken only if the condition is met

Two types
- Set-Then-Jump
  • Condition testing is separated from branching
  • Condition code registers are used to convey the condition test result
  • Condition code registers keep a record of the status of the last ALU operation, such as the overflow condition
- Example: Pentium code

cmp AX,BX    ; compare AX and BX

je target    ; jump if equal

Flow of Control cont.

- Test-and-Jump
  • Single instruction performs condition testing and branching
- Example: MIPS instruction

beq $src1,$src2,target

Jumps to target if Rsrc1 = Rsrc2

Delayed branching

Control is transferred after executing the instruction that follows the branch instruction
- This instruction slot is called the delay slot

Improves efficiency

Highly pipelined RISC processors support it

Flow of Control cont.

Procedure calls

Facilitate modular programming

Require two pieces of information to return
- End of procedure
  • Pentium uses the ret instruction
  • MIPS uses the jr instruction
- Return address
  • In a (special) register: MIPS allows any general-purpose register
  • On the stack: Pentium

Flow of Control cont.

(Figure: delay slot)

Parameter Passing

Two basic techniques

Register-based (e.g., PowerPC, MIPS)
- Internal registers are used
  • Faster
  • Limit the number of parameters
  • Recursive procedure

Stack-based (e.g., Pentium)
- Stack is used
  • More general

Operand Types

Instructions support basic data types

Characters

Integers

Floating-point

Instruction overload

Same instruction for different data types

Example: Pentium

mov AL,address     ; loads an 8-bit value

mov AX,address     ; loads a 16-bit value

mov EAX,address    ; loads a 32-bit value

Operand Types cont.

Separate instructions

Instructions specify the operand size

Example: MIPS

lb $dest,address    ; loads a byte

lh $dest,address    ; loads a halfword (16 bits)

lw $dest,address    ; loads a word (32 bits)

ld $dest,address    ; loads a doubleword (64 bits)

Similar instructions exist for store

Addressing Modes

How the operands are specified

Operands can be in three places
- Registers
  • Register addressing mode
- Part of instruction
  • Constant
  • Immediate addressing mode
  • All processors support these two addressing modes
- Memory
  • Difference between RISC and CISC
  • CISC supports a large variety of addressing modes
  • RISC follows load/store architecture

Instruction Types

Several types of instructions

Data movement
- Pentium: mov dest,src
- Some do not provide direct data movement instructions
- Indirect data movement

add $dest,$src,0    ; $dest = $src + 0

Arithmetic and Logical
- Arithmetic
  • Integer and floating-point, signed and unsigned
  • add, subtract, multiply, divide
- Logical
  • and, or, not, xor

Instruction Types cont.

Condition code bits

S: Sign bit (0 = +, 1 = -)

Z: Zero bit (0 = nonzero, 1 = zero)

O: Overflow bit (0 = no overflow, 1 = overflow)

C: Carry bit (0 = no carry, 1 = carry)

Example: Pentium

cmp count,25    ; compare count to 25 (subtract 25 from count)

je target       ; jump if equal
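The four condition code bits can be derived from a subtraction, which is what a compare does internally. The sketch below assumes x86-style semantics for a 16-bit compare (the difference is computed, the flags are set, and the result is discarded); the function name and dictionary layout are illustrative.

```python
def compare_flags(a, b, bits=16):
    """Return the S, Z, O, C bits produced by comparing a with b (computing a - b)."""
    mask = (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    a &= mask
    b &= mask
    diff = (a - b) & mask
    return {
        "S": int(bool(diff & sign_bit)),   # result is negative
        "Z": int(diff == 0),               # result is zero, so a == b
        "C": int(a < b),                   # borrow out of the unsigned subtraction
        # Signed overflow: operands had different signs and the result's sign differs from a's.
        "O": int(bool((a ^ b) & (a ^ diff) & sign_bit)),
    }


flags = compare_flags(25, 25)
jump_taken = flags["Z"] == 1   # a jump-if-equal is taken when the zero bit is set
print(flags, jump_taken)
```

A conditional jump then only has to inspect these bits; it never sees the operands themselves, which is the separation the Set-Then-Jump style relies on.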

Instruction Types cont.

Flow control and I/O instructions
- Branch
- Procedure call
- Interrupts

I/O instructions
- Memory-mapped I/O
  • Most processors support memory-mapped I/O
  • No separate instructions for I/O
- Isolated I/O
  • Pentium supports isolated I/O
  • Separate I/O instructions

in AX,io_port     ; read from an I/O port

out io_port,AX    ; write to an I/O port

Instruction Formats

Two types

Fixed-length
- Used by RISC processors
- 32-bit RISC processors use 32-bit wide instructions
  • Examples: SPARC, MIPS, PowerPC

Variable-length
- Used by CISC processors
- Memory operands need more bits to specify

Opcode

Major and exact operation

Examples of Instruction Formats

RISC (Reduced Instruction Set Computer) versus CISC (Complex Instruction Set Computer)

RISC vs CISC

The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.

RISC systems access memory only with explicit load and store instructions.

In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.

RISC vs CISC

The difference between CISC and RISC becomes evident through the basic computer performance equation:

time/program = (time/cycle) x (cycles/instruction) x (instructions/program)

RISC systems shorten execution time by reducing the clock cycles per instruction.

CISC systems improve performance by reducing the number of instructions per program.

RISC vs CISC

The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.

The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.

With fixed-length instructions, RISC lends itself to pipelining and speculative execution.

RISC vs CISC

Consider the program fragments:

CISC

mov ax, 10

mov bx, 5

mul bx, ax

RISC

mov ax, 0

mov bx, 10

mov cx, 5

Begin: add ax, bx

loop Begin

The total clock cycles for the CISC version might be:

(2 movs x 1 cycle) + (1 mul x 30 cycles) = 32 cycles

While the clock cycles for the RISC version are:

(3 movs x 1 cycle) + (5 adds x 1 cycle) + (5 loops x 1 cycle) = 13 cycles

With the RISC clock cycle being shorter as well, RISC gives us much faster execution speeds.
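The cycle comparison can be replayed in a few lines. This is a sketch under stated assumptions: the operands are 10 and 5, every simple instruction costs 1 cycle, and the CISC multiply costs 30 cycles; these are the slide's illustrative estimates, not measurements of any real processor.

```python
# Assumed per-instruction costs (illustrative only).
CYCLES = {"mov": 1, "add": 1, "loop": 1, "mul": 30}


def cisc_fragment():
    """Two moves followed by one multi-cycle multiply."""
    ax, bx = 10, 5
    cycles = 2 * CYCLES["mov"]
    bx = bx * ax
    cycles += CYCLES["mul"]
    return bx, cycles


def risc_fragment():
    """Three moves, then repeated addition controlled by a loop counter."""
    ax, bx, cx = 0, 10, 5
    cycles = 3 * CYCLES["mov"]
    while cx > 0:
        ax += bx                  # add ax, bx
        cycles += CYCLES["add"]
        cx -= 1                   # loop Begin: decrement cx, repeat while nonzero
        cycles += CYCLES["loop"]
    return ax, cycles


print(cisc_fragment(), risc_fragment())
```

Both fragments compute the same product. The RISC version executes more instructions, yet it finishes in fewer cycles because each instruction is cheap.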

RISC vs CISC

Because of their load-store ISAs, RISC architectures require a large number of CPU registers.

These registers provide fast access to data during sequential program execution.

They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.

Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.

RISC vs CISC

(Figure: overlapping register windows)

This is how registers can be overlapped in a RISC system.

The current window pointer (CWP) points to the active register window.

RISC vs CISC

It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.

Some RISC systems provide more extravagant instruction sets than some CISC systems.

Some systems combine both approaches.

The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.

RISC vs CISC

RISC

Multiple register sets.

Three operands per instruction.

Parameter passing through register windows.

Single-cycle instructions.

Hardwired control.

Highly pipelined.

CISC

Single register set.

One or two register operands per instruction.

Parameter passing through memory.

Multiple-cycle instructions.

Microprogrammed control.

Less pipelined.

RISC vs CISC

RISC

Simple instructions, few in number.

Fixed-length instructions.

Complexity in compiler.

Only LOAD/STORE instructions access memory.

Few addressing modes.

CISC

Many complex instructions.

Variable-length instructions.

Complexity in microcode.

Many instructions can access memory.

Many addressing modes.

Summary

Instruction Set Design Issues

Instruction set design issues include:

Where are operands stored?
- registers, memory, stack, accumulator

How many explicit operands are there?
- 0, 1, 2, or 3

How is the operand location specified?
- register, immediate, indirect, . . .

What type and size of operands are supported?
- byte, int, float, double, string, vector, . . .

What operations are supported?
- add, sub, mul, move, compare, . . .

More About General Purpose Registers

Why do almost all new architectures use GPRs?

Registers are much faster than memory (even cache)
- Register values are available immediately
- When memory isn't ready, the processor must wait ("stall")

Registers are convenient for variable storage
- Compiler assigns some variables just to registers
- More compact code, since small fields specify registers (compared to memory addresses)

(Figure: Processor with Registers and Cache, then Memory, then Disk)

What Operations are Needed

Arithmetic and Logical

Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT

Logical operation: AND, OR, XOR, NOT

Data Transfer: copy, load, store

Control: branch, jump, call, return

Floating Point: ADDF, MULF, DIVF (same as arithmetic, but usually take bigger operands)

Decimal: ADDD, CONVERT

String: move, compare, search

Graphics: pixel and vertex, compression/decompression operations

Stack Architecture: Pros and Cons

Pros

Good code density (implicit top of stack)

Low hardware requirements

Easy to write a simpler compiler for stack architectures

Cons

Stack becomes the bottleneck

Little ability for parallelism or pipelining

Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed

Difficult to write an optimizing compiler for stack architectures

Accumulator Architecture: Pros and Cons

Pros
• Very low hardware requirements
• Easy to design and understand

Cons
• Accumulator becomes the bottleneck
• Little ability for parallelism or pipelining
• High memory traffic

Memory-Memory Architecture: Pros and Cons

Pros
• Requires fewer instructions (especially if 3 operands)
• Easy to write compilers for (especially if 3 operands)

Cons
• Very high memory traffic (especially if 3 operands)
• Variable number of clocks per instruction
• With two operands, more data movements are required

Memory-Register Architecture: Pros and Cons

Pros
• Some data can be accessed without loading first
• Instruction format easy to encode
• Good code density

Cons
• Operands are not equivalent (poor orthogonality)
• Variable number of clocks per instruction
• May limit number of registers

The Stored Program Concept

Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept.

Today's computers are built on two key principles:

Instructions are represented as numbers

Programs can be stored in memory to be read or written just like numbers

The power of the concept

Memory can contain:

the source code for an editor

the compiled machine code for the editor

the text that the compiled program is using

the compiler that generated the code

(Figure: a Processor connected to Memory holding: Accounting program (machine code), Editor program (machine code), C compiler (machine code), Payroll data, Book text, Source code in C for editor program)

Compiling if-then-else in MIPS

Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiled MIPS code for the following C if statement?

if (i == j) f = g + h; else f = g - h;

(Flowchart: test i == j; when i = j execute f = g + h; when i != j go to Else: and execute f = g - h; both paths meet at Exit:)

MIPS

bne $s3, $s4, Else         # go to Else if i != j

add $s0, $s1, $s2          # f = g + h (skipped if i != j)

j Exit

Else: sub $s0, $s1, $s2    # f = g - h (skipped if i = j)

Exit:

Typical Compilation

Major Types of Optimization

Columns: Optimization name, Explanation, Frequency (N.M. where the slide shows it)

High-level (at or near source level; machine independent)

Procedure integration: replace procedure call by procedure body. Frequency: N.M.

Local (within straight-line code)

Common subexpression elimination: replace two instances of the same computation by a single copy.

Constant propagation: replace all instances of a variable that is assigned a constant with the constant.

Stack height reduction: rearrange the expression tree to minimize resources needed for expression evaluation. Frequency: N.M.

Global (across a branch)

Global common subexpression elimination: same as local, but this version crosses branches.

Copy propagation: replace all instances of a variable A that has been assigned X (i.e., A = X) with X.

Code motion: remove code from a loop that computes the same value on each iteration of the loop.

Induction variable elimination: simplify/eliminate array addressing calculations within loops.

Machine-dependent (depends on machine knowledge)

Strength reduction: many examples, such as replacing a multiply by a constant with adds and shifts. Frequency: N.M.

Pipeline scheduling: reorder instructions to improve pipeline performance. Frequency: N.M.
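Strength reduction, the first machine-dependent entry, is easy to show concretely. The sketch below picks the constant 10 as an arbitrary example and rewrites the multiply by hand; a compiler would apply the same identity automatically when shifts and adds are cheaper than a multiply on the target machine.

```python
def times_ten(x):
    """x * 10 rewritten with shifts and one add, using 10x = 8x + 2x."""
    return (x << 3) + (x << 1)


# The rewritten form must agree with the plain multiply for every input.
for value in range(-50, 51):
    assert times_ten(value) == value * 10

print(times_ten(7))
```

Whether this is a win depends on the relative cost of multiply, shift and add on the machine, which is why the optimization is classed as machine-dependent.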

Effect of Compiler Optimization

(Figure: measurements plotted by program and compiler optimization level)

Level 0: non-optimized code

Level 1: local optimization

Level 2: global optimization, software pipelining

Level 3: adds procedure integration

Compiler Support for Multimedia Instr.

Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics).

Intel added a new set of instructions, called Streaming SIMD Extension.

A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer.

ector computers typically have strided and2or gather2scatter addressing to

perform operations on distant memory locations Strided addressing allos memory access in increment larger than one

ather2scatter addressing is similar to register indirect mode here theaddress are stored instead of the data

Supporting vector operation ithout strided addressing such as IntelQs MMlimits the potential speedup

Such limited support for vector processing ma0es the use of vectori-ing compiler optimi-ation unpopular and restrict its scope to hand coded routines

Compiler Support for Multimedia Instramp

SIM instructions on MM and Altiec tend to be solutions not primitivesSIM instructions on MM and Altiec tend to be solutions not primitives

Starting a Program

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 50101

Starting a Program

A s s e m b l e r

A s s e m b l y l a n g u a g e p r o g r a m

C o m p i l e r

C p r o g r a m

3 i n 0 e r

lt e c u t a b l e 6 M a c h i n e l a n g u a g e p r o g r a m

3 o a d e r

M e m o r y

5 b O e c t 6 M a c h i n e l a n g u a g e m o d u l e 5 b O e c t 6 3 i b r a r y r o u t i n e ( m a c h i n e l a n g u a g e )

$ Place code data modules

symbolically in memory

$etermine the address of data instruction labels

$Patch both internal eternal ref

$ Place code data modules

symbolically in memory

$etermine the address of data instruction labels

$Patch both internal eternal ref

5bOect files for 4ni typically contains6

eader6 si-e position of components

Tet segment6 machine code

ata segment6 static and dynamic variables1elocation info6 identify absolute memory ref

Symbol table6 name location of labelsprocedures and variables

ebugging info6 mapping source to obOectcode brea0 points etc

5inker

5oading 7ecuta8le Program

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 51101

R s p

R g p

gt gt amp gt gt gt gt gth e

gt

gt gt gt gt gt gt gt h e

T e t

S t a t i c d a t a

y n a m i c d a t a

S t a c 0B f f f f f f f

h e

gt gt gt = gt gt gth e

p c

1 e s e r v e d

5oading 7ecuta8le Program

To load an eecutable the operating systemfollos these steps6

1eads the eecutable file header todetermine the si-e of tet and data segments

Creates an address space large enough forthe tet and data

Copies the instructions and data from the

eecutable file into memory

Copies the parameters (if any) to the mainprogram onto the stac0

Initiali-es the machine registers and sets thestac0 pointer to the first free location

umps to a start$up routines that copies theparameters into the argument registers andcalls the main routine of the program

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 52101

Instruction Set Design IssuesInstruction Set Desi

gn Issues

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 53101

Instruction Set Design IssuesInstruction Set Design Issues

Instruction Set esign Issues 7umber of Addresses

Llo of Control

5perand Typesamp Addressing Modes

Instruction Types

Instruction Lormats

um+er of Addressesum+er of Addresses

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 54101

um+er of Addressesum+er of Addresses

Lour categories

$address machines$ for the source operands and one for the result

$address machines

$ 5ne address doubles as source and result

$address machine$ Accumulator machines

$ Accumulator is used for one source and result

gt$address machines

$ Stac0 machines

$ 5perands are ta0en from the stac0

$ 1esult goes onto the stac0

um+er of Addresses cont -um+er of Addresses

cont -

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 55101

um+er of Addresses cont-um+er of Addresses cont-

Three$address machines

To for the source operands one for the result

1ISC processors use three addresses

Sample instructions

add destsrc1src2

M(dest)=[src1]+[src2]

sub destsrc1src2

M(dest)=[src1]-[src2]

mult destsrc1src2

M(dest)=[src1][src2]

Three addresses

Operand 1 Operand 2 Result

Example a = b + c

Three-address instruction formats are not common because they reuire a

relatiely lon instruction format to hold the three address references

um+er of Addresses cont -um+er of Addresses

cont -

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 56101

um+er of Addresses cont-um+er of Addresses cont-

ltample

C statement

A C H D F 6 A

ltJuivalent code6

mult TCD T = CD

add TTB T = B+CD

sub TTE T = B+CD-E

add TTF T = B+CD-E+Fadd ATA A = B+CD-E+F+A

Number of Addresses (cont.)

- Two-address machines
  - One address doubles (for source operand and result)
  - Last example makes a case for it
    - Address T is used twice
  - Sample instructions
      load dest,src    M(dest) = [src]
      add  dest,src    M(dest) = [dest] + [src]
      sub  dest,src    M(dest) = [dest] - [src]
      mult dest,src    M(dest) = [dest] * [src]

Two addresses: one address doubles as operand and result
Example: a = a + b
The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation.

Number of Addresses (cont.)

- Example
  - C statement
      A = B + C * D - E + F + A
  - Equivalent code:
      load T,C    ; T = C
      mult T,D    ; T = C*D
      add  T,B    ; T = B + C*D
      sub  T,E    ; T = B + C*D - E
      add  T,F    ; T = B + C*D - E + F
      add  A,T    ; A = B + C*D - E + F + A

Number of Addresses (cont.)

- One-address machines
  - Use a special set of registers called accumulators
    - Specify one source operand and receive the result
  - Called accumulator machines
  - Sample instructions
      load  addr    accum = [addr]
      store addr    M[addr] = accum
      add   addr    accum = accum + [addr]
      sub   addr    accum = accum - [addr]
      mult  addr    accum = accum * [addr]

Number of Addresses (cont.)

- Example
  - C statement
      A = B + C * D - E + F + A
  - Equivalent code:
      load  C    ; load C into accum
      mult  D    ; accum = C*D
      add   B    ; accum = C*D + B
      sub   E    ; accum = B + C*D - E
      add   F    ; accum = B + C*D - E + F
      add   A    ; accum = B + C*D - E + F + A
      store A    ; store accum contents in A
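A one-address machine can be sketched the same way. The interpreter below is a hypothetical model (the tuple encoding and sample values are assumptions); it also counts data-memory accesses, since every accumulator instruction here touches memory exactly once.

```python
def run_accumulator(program, memory):
    """Run (op, addr) pairs on a single-accumulator machine.

    Returns the number of data-memory accesses performed.
    """
    accum = 0
    accesses = 0
    for op, addr in program:
        accesses += 1  # each instruction reads or writes one memory operand
        if op == "load":
            accum = memory[addr]
        elif op == "store":
            memory[addr] = accum
        elif op == "add":
            accum += memory[addr]
        elif op == "sub":
            accum -= memory[addr]
        elif op == "mult":
            accum *= memory[addr]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return accesses


# A = B + C * D - E + F + A
PROGRAM = [
    ("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
    ("add", "F"), ("add", "A"), ("store", "A"),
]

memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
accesses = run_accumulator(PROGRAM, memory)
print(memory["A"], accesses)
```

The instructions are short (one address each), but the accumulator is involved in every step, which is why it tends to become the bottleneck.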

Number of Addresses (cont.)

- Zero-address machines
  - Stack supplies operands and receives the result
    - Special instructions to load and store use an address
  - Called stack machines (Ex: HP3000, Burroughs B5500)
  - Sample instructions
      push addr    push([addr])
      pop  addr    pop([addr])
      add          push(pop + pop)
      sub          push(pop - pop)
      mult         push(pop * pop)

Number of Addresses (cont.)

- Example
  - C statement
      A = B + C * D - E + F + A
  - Equivalent code:
      push E
      push C
      push D
      mult      ; C*D
      push B
      add       ; B + C*D
      sub       ; B + C*D - E
      push F
      add       ; B + C*D - E + F
      push A
      add       ; B + C*D - E + F + A
      pop  A
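The stack sequence is easiest to check by running it. The sketch below assumes the operand order given by the sample instructions, where the first value popped is the left operand (so `sub` computes first pop minus second pop); the tuple encoding and sample values are illustrative assumptions.

```python
def run_stack(program, memory):
    """Run a zero-address program; only push/pop carry an address."""
    stack = []
    for op, *operand in program:
        if op == "push":
            stack.append(memory[operand[0]])
        elif op == "pop":
            memory[operand[0]] = stack.pop()
        else:
            first = stack.pop()   # most recently pushed value
            second = stack.pop()
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)
            elif op == "mult":
                stack.append(first * second)
            else:
                raise ValueError(f"unknown opcode: {op}")
    return stack


# A = B + C * D - E + F + A
PROGRAM = [
    ("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
    ("push", "B"), ("add",), ("sub",),
    ("push", "F"), ("add",), ("push", "A"), ("add",), ("pop", "A"),
]

memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
leftover = run_stack(PROGRAM, memory)
print(memory["A"], leftover)
```

Pushing E first is what makes `sub` come out as (B + C*D) - E rather than the reverse.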

Load/Store Architecture

- Instructions expect operands in internal processor registers
  - Special LOAD and STORE instructions move data between registers and memory
- RISC uses this architecture
- Reduces instruction length

Load/Store Architecture (cont.)

- Sample instructions
    load  Rd,addr       Rd = [addr]
    store addr,Rs       (addr) = Rs
    add   Rd,Rs1,Rs2    Rd = Rs1 + Rs2
    sub   Rd,Rs1,Rs2    Rd = Rs1 - Rs2
    mult  Rd,Rs1,Rs2    Rd = Rs1 * Rs2

Number of Addresses (cont.)

- Example
  - C statement
      A = B + C * D - E + F + A
  - Equivalent code:
      load  R1,B
      load  R2,C
      load  R3,D
      load  R4,E
      load  R5,F
      load  R6,A
      mult  R2,R2,R3
      add   R2,R2,R1
      sub   R2,R2,R4
      add   R2,R2,R5
      add   R2,R2,R6
      store A,R2
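A load/store version of the same statement can be sketched as below. The register assignment R1 through R6, the tuple encoding, and the sample values are assumptions of this sketch; the point it illustrates is that only `load` and `store` touch memory.

```python
def run_load_store(program, memory):
    """Run a load/store program; returns the number of memory accesses."""
    regs = {}
    mem_accesses = 0
    for op, *args in program:
        if op == "load":        # load Rd,addr
            rd, addr = args
            regs[rd] = memory[addr]
            mem_accesses += 1
        elif op == "store":     # store addr,Rs
            addr, rs = args
            memory[addr] = regs[rs]
            mem_accesses += 1
        elif op in ("add", "sub", "mult"):   # register-only arithmetic
            rd, rs1, rs2 = args
            a, b = regs[rs1], regs[rs2]
            regs[rd] = {"add": a + b, "sub": a - b, "mult": a * b}[op]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return mem_accesses


# A = B + C * D - E + F + A
PROGRAM = [
    ("load", "R1", "B"), ("load", "R2", "C"), ("load", "R3", "D"),
    ("load", "R4", "E"), ("load", "R5", "F"), ("load", "R6", "A"),
    ("mult", "R2", "R2", "R3"), ("add", "R2", "R2", "R1"),
    ("sub", "R2", "R2", "R4"), ("add", "R2", "R2", "R5"),
    ("add", "R2", "R2", "R6"), ("store", "A", "R2"),
]

memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
accesses = run_load_store(PROGRAM, memory)
print(memory["A"], len(PROGRAM), accesses)
```

The program is longer (12 instructions), but the arithmetic instructions need only short register fields and make no memory accesses.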

Flow of Control

- Default is sequential flow
- Several instructions alter this default execution
  - Branches
    - Unconditional
    - Conditional
    - Delayed branches
  - Procedure calls
    - Delayed procedure calls

Flow of Control (cont.)

- Branches
  - Unconditional
    - Absolute address
    - PC-relative
      * Target address is specified relative to PC contents
      * Relocatable code
  - Example: MIPS
    - Absolute address
        j target
    - PC-relative
        b target
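Why PC-relative branches give relocatable code can be shown with a little arithmetic. The sketch below is deliberately simplified: the offset is added directly to the address of the branch itself, whereas real encodings (including MIPS) differ in detail. The function names, addresses, and 4-byte instruction size are all assumptions for illustration.

```python
INSTR_SIZE = 4  # assumed fixed instruction size in bytes


def absolute_target(target_address):
    """An absolute branch carries the full target address."""
    return target_address


def pc_relative_target(pc, offset):
    """A PC-relative branch carries only a signed offset from the PC."""
    return pc + offset


def landing_index(target, load_base):
    """Index of the instruction (within the loaded code) that `target` hits."""
    return (target - load_base) // INSTR_SIZE


# The branch is instruction 2 of a code block and wants to reach instruction 5.
ASSEMBLED_BASE = 0x1000
abs_field = ASSEMBLED_BASE + 5 * INSTR_SIZE   # encoded absolute address
rel_field = (5 - 2) * INSTR_SIZE              # encoded offset

for load_base in (0x1000, 0x8000):
    pc = load_base + 2 * INSTR_SIZE
    rel_hit = landing_index(pc_relative_target(pc, rel_field), load_base)
    abs_hit = landing_index(absolute_target(abs_field), load_base)
    print(hex(load_base), rel_hit, abs_hit)
```

When the code is loaded at a different base, the PC-relative branch still reaches instruction 5, while the absolute address would have to be patched.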


Flow of Control (cont.)

- Branches
  - Conditional
    - Jump is taken only if the condition is met
  - Two types
    - Set-Then-Jump
      * Condition testing is separated from branching
      * Condition code registers are used to convey the condition test result
      * Condition code registers keep a record of the status of the last ALU operation, such as the overflow condition
    - Example: Pentium code
        cmp AX,BX     ; compare AX and BX
        je  target    ; jump if equal

Flow of Control (cont.)

    - Test-and-Jump
      * Single instruction performs condition testing and branching
    - Example: MIPS instruction
        beq Rsrc1,Rsrc2,target
      Jumps to target if Rsrc1 = Rsrc2

- Delayed branching
  - Control is transferred after executing the instruction that follows the branch instruction
    - This instruction slot is called the delay slot
  - Improves efficiency
  - Highly pipelined RISC processors support it
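The effect of a delay slot on execution order can be sketched with a toy program counter model. The instruction encoding (`("branch", target_index)` and `("op", None)`) is hypothetical, and a branch placed inside a delay slot is not handled.

```python
def run(program, delayed):
    """Return the indices of the instructions in the order they execute."""
    trace = []
    pc = 0
    pending = None  # branch target waiting for the delay slot to finish
    while pc < len(program):
        op, arg = program[pc]
        trace.append(pc)
        next_pc = pc + 1
        if pending is not None:
            next_pc = pending   # delay-slot instruction done: transfer now
            pending = None
        if op == "branch":
            if delayed:
                pending = arg   # take effect after the next instruction
            else:
                next_pc = arg   # take effect immediately
        pc = next_pc
    return trace


PROGRAM = [
    ("op", None),      # 0
    ("branch", 4),     # 1
    ("op", None),      # 2: the delay slot
    ("op", None),      # 3: skipped either way
    ("op", None),      # 4: branch target
]

print(run(PROGRAM, delayed=False))
print(run(PROGRAM, delayed=True))
```

With delayed branching, instruction 2 executes before control reaches the target, so a compiler can place useful work there instead of wasting the slot.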

Flow of Control (cont.)

- Procedure calls
  - Facilitate modular programming
  - Require two pieces of information to return
    - End of procedure
      * Pentium uses the ret instruction
      * MIPS uses the jr instruction
    - Return address
      * In a (special) register
        MIPS allows any general-purpose register
      * On the stack
        Pentium


Parameter Passing

- Two basic techniques
  - Register-based (e.g., PowerPC, MIPS)
    - Internal registers are used
      * Faster
      * Limit the number of parameters
      * Recursive procedure
  - Stack-based (e.g., Pentium)
    - Stack is used
      * More general

2 Operand Types

- Instructions support basic data types
  - Characters
  - Integers
  - Floating-point
- Instruction overload
  - Same instruction for different data types
  - Example: Pentium
      mov AL,address     ; loads an 8-bit value
      mov AX,address     ; loads a 16-bit value
      mov EAX,address    ; loads a 32-bit value

Operand Types

- Separate instructions
  - Instructions specify the operand size
  - Example: MIPS
      lb Rdest,address    ; loads a byte
      lh Rdest,address    ; loads a halfword (16 bits)
      lw Rdest,address    ; loads a word (32 bits)
      ld Rdest,address    ; loads a doubleword (64 bits)
    Similar instructions for store
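The difference between the four load sizes can be seen by reading the same address at each width. The sketch below uses Python's `struct` module on a small byte array; the little-endian byte order, the signed interpretation, and the memory contents are assumptions chosen for illustration, and the helper names simply mirror the mnemonics.

```python
import struct

# Eight bytes of hypothetical memory starting at address 0.
MEMORY = bytes([0x80, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07])


def lb(addr):
    """Load a signed byte (8 bits)."""
    return struct.unpack_from("<b", MEMORY, addr)[0]


def lh(addr):
    """Load a signed halfword (16 bits)."""
    return struct.unpack_from("<h", MEMORY, addr)[0]


def lw(addr):
    """Load a signed word (32 bits)."""
    return struct.unpack_from("<i", MEMORY, addr)[0]


def ld(addr):
    """Load a signed doubleword (64 bits)."""
    return struct.unpack_from("<q", MEMORY, addr)[0]


for load in (lb, lh, lw, ld):
    print(load.__name__, hex(load(0) & 0xFFFFFFFFFFFFFFFF))
```

The same address yields four different values because each instruction fixes how many bytes are fetched and how they are extended.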

3 Addressing Modes

- How the operands are specified
  - Operands can be in three places
    - Registers
      * Register addressing mode
    - Part of instruction
      * Constant
      * Immediate addressing mode
      * All processors support these two addressing modes
    - Memory
      * Difference between RISC and CISC
      * CISC supports a large variety of addressing modes
      * RISC follows load/store architecture

4 Instruction Types

- Several types of instructions
  - Data movement
    - Pentium: mov dest,src
    - Some do not provide direct data movement instructions
    - Indirect data movement
        add Rdest,Rsrc,0    ; Rdest = Rsrc+0
  - Arithmetic and Logical
    - Arithmetic
      * Integer and floating-point, signed and unsigned
      * add, subtract, multiply, divide
    - Logical
      * and, or, not, xor

Instruction Types (cont.)

- Condition code bits
    S: Sign bit (0 = +, 1 = -)
    Z: Zero bit (0 = nonzero, 1 = zero)
    O: Overflow bit (0 = no overflow, 1 = overflow)
    C: Carry bit (0 = no carry, 1 = carry)
- Example: Pentium
    cmp count,25    ; compare count to 25
                    ; subtract 25 from count
    je  target      ; jump if equal
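How a compare sets the four condition code bits can be sketched by doing the subtraction on a fixed-width ALU. The function below is an illustrative model, not a description of any specific processor: the 32-bit width is an assumption, and the carry bit follows the x86 convention that carry after a subtraction signals an unsigned borrow.

```python
def compare_flags(a, b, bits=32):
    """Flags produced by computing a - b on a `bits`-wide ALU."""
    mask = (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    ua, ub = a & mask, b & mask
    result = (ua - ub) & mask
    return {
        "S": int(bool(result & sign_bit)),
        "Z": int(result == 0),
        # Signed overflow: the operands have different signs and the
        # result's sign differs from the sign of a.
        "O": int(bool((ua ^ ub) & (ua ^ result) & sign_bit)),
        # Assumed x86 convention: carry set means an unsigned borrow.
        "C": int(ua < ub),
    }


def je_taken(flags):
    """`je` (jump if equal) is taken when the zero bit is set."""
    return flags["Z"] == 1


print(compare_flags(25, 25), je_taken(compare_flags(25, 25)))
print(compare_flags(10, 25), je_taken(compare_flags(10, 25)))
print(compare_flags(-2**31, 1))
```

The compare discards the difference and keeps only the flags, which the following conditional jump then tests.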

Instruction Types (cont.)

- Flow control and I/O instructions
  - Branch
  - Procedure call
  - Interrupts
- I/O instructions
  - Memory-mapped I/O
    * Most processors support memory-mapped I/O
    * No separate instructions for I/O
  - Isolated I/O
    * Pentium supports isolated I/O
    * Separate I/O instructions
        in  AX,io_port     ; read from an I/O port
        out io_port,AX     ; write to an I/O port

5 Instruction Formats

- Two types
  - Fixed-length
    - Used by RISC processors
    - 32-bit RISC processors use 32-bit-wide instructions
      * Examples: SPARC, MIPS, PowerPC
  - Variable-length
    - Used by CISC processors
    - Memory operands need more bits to specify
- Opcode
  - Major and exact operation

Examples of Instruction Formats

RISC (Reduced Instruction Set Computer) versus CISC (Complex Instruction Set Computer)

RISC vs. CISC

- The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.
- RISC systems access memory only with explicit load and store instructions.
- In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.

RISC vs. CISC

- The difference between CISC and RISC becomes evident through the basic computer performance equation:
    time/program = (time/cycle) x (cycles/instruction) x (instructions/program)
- RISC systems shorten execution time by reducing the clock cycles per instruction.
- CISC systems improve performance by reducing the number of instructions per program.
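The trade-off in the performance equation can be made concrete with a short calculation. All numbers below are hypothetical and chosen only to show how the three factors interact; they are not measurements of any real machine.

```python
def cpu_time_ns(instructions, cycles_per_instruction, ns_per_cycle):
    """time/program = instructions/program * cycles/instruction * time/cycle."""
    return instructions * cycles_per_instruction * ns_per_cycle


# Hypothetical CISC-style machine: fewer instructions, more cycles each.
cisc_ns = cpu_time_ns(instructions=1000, cycles_per_instruction=4, ns_per_cycle=2)

# Hypothetical RISC-style machine: more instructions, one short cycle each.
risc_ns = cpu_time_ns(instructions=1500, cycles_per_instruction=1, ns_per_cycle=1)

print(cisc_ns, risc_ns)
```

Under these assumed figures the RISC-style program is longer yet finishes sooner; with different figures the comparison could go the other way, which is why all three factors have to be considered together.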

RISC vs. CISC

- The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.
- The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.
- With fixed-length instructions, RISC lends itself to pipelining and speculative execution.

RISC vs. CISC

Consider the following program fragments:

    CISC                 RISC
    mov ax, 10           mov ax, 0
    mov bx, 5            mov bx, 10
    mul bx, ax           mov cx, 5
                         Begin: add ax, bx
                         loop Begin

The total clock cycles for the CISC version might be:
    (2 movs x 1 cycle) + (1 mul x 30 cycles) = 32 cycles
While the clock cycles for the RISC version are:
    (3 movs x 1 cycle) + (5 adds x 1 cycle) + (5 loops x 1 cycle) = 13 cycles
With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds.
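The cycle totals can be rederived by counting instructions as the two fragments run. This sketch assumes costs of 1 cycle each for mov, add, and loop and 30 cycles for mul, and operands 10 and 5; the function names are hypothetical.

```python
# Assumed per-instruction costs in clock cycles.
CYCLES = {"mov": 1, "mul": 30, "add": 1, "loop": 1}


def cisc_multiply(a, b):
    """Two movs to set up the operands, then a single multi-cycle mul."""
    counts = {"mov": 2, "mul": 1}
    return a * b, counts


def risc_multiply(a, b):
    """Multiply by repeated addition, as in the RISC fragment (b >= 1)."""
    counts = {"mov": 3, "add": 0, "loop": 0}
    ax, bx, cx = 0, a, b
    while True:
        ax += bx               # Begin: add ax, bx
        counts["add"] += 1
        cx -= 1                # loop Begin: decrement cx, repeat if nonzero
        counts["loop"] += 1
        if cx == 0:
            break
    return ax, counts


def total_cycles(counts):
    return sum(CYCLES[op] * n for op, n in counts.items())


cisc_result, cisc_counts = cisc_multiply(10, 5)
risc_result, risc_counts = risc_multiply(10, 5)
print(cisc_result, total_cycles(cisc_counts))
print(risc_result, total_cycles(risc_counts))
```

The RISC count grows with the loop count, so the advantage shown here depends on the small multiplier chosen for the example.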

RISC vs. CISC

- Because of their load-store ISAs, RISC architectures require a large number of CPU registers.
- These registers provide fast access to data during sequential program execution.
- They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.
- Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.

RISC vs. CISC

- This is how registers can be overlapped in a RISC system.
- The current window pointer (CWP) points to the active register window.

RISC vs. CISC

- It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.
- Some RISC systems provide more extravagant instruction sets than some CISC systems.
- Some systems combine both approaches.
- The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.

RISC vs. CISC

- RISC
  - Multiple register sets.
  - Three operands per instruction.
  - Parameter passing through register windows.
  - Single-cycle instructions.
  - Hardwired control.
  - Highly pipelined.
- CISC
  - Single register set.
  - One or two register operands per instruction.
  - Parameter passing through memory.
  - Multiple-cycle instructions.
  - Microprogrammed control.
  - Less pipelined.

Continued...

RISC vs. CISC

- RISC
  - Simple instructions, few in number.
  - Fixed-length instructions.
  - Complexity in compiler.
  - Only LOAD/STORE instructions access memory.
  - Few addressing modes.
- CISC
  - Many complex instructions.
  - Variable-length instructions.
  - Complexity in microcode.
  - Many instructions can access memory.
  - Many addressing modes.

Summary

Instruction Set Design Issues

- Instruction set design issues include:
  - Where are operands stored?
    - registers, memory, stack, accumulator
  - How many explicit operands are there?
    - 0, 1, 2, or 3
  - How is the operand location specified?
    - register, immediate, indirect, ...
  - What type and size of operands are supported?
    - byte, int, float, double, string, vector, ...
  - What operations are supported?
    - add, sub, mul, move, compare, ...

More About General Purpose Registers

- Why do almost all new architectures use GPRs?
  - Registers are much faster than memory (even cache)
    - Register values are available immediately
    - When memory isn't ready, processor must wait ("stall")
  - Registers are convenient for variable storage
    - Compiler assigns some variables just to registers
    - More compact code since small fields specify registers (compared to memory addresses)

[Figure: Processor, Registers, Cache, Memory, Disk]

What Operations are Needed

- Arithmetic and Logical
  - Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
  - Logical operation: AND, OR, XOR, NOT
- Data Transfer - copy, load, store
- Control - branch, jump, call, return
- Floating Point - ADDF, MULF, DIVF
  - Same as arithmetic but usually take bigger operands
- Decimal - ADDD, CONVERT
- String - move, compare, search
- Graphics - pixel and vertex, compression/decompression operations

Stack Architecture: Pros and Cons

- Pros
  - Good code density (implicit top of stack)
  - Low hardware requirements
  - Easy to write a simpler compiler for stack architectures
- Cons
  - Stack becomes the bottleneck
  - Little ability for parallelism or pipelining
  - Data is not always at the top of stack when needed, so additional instructions like TOP and SWAP are needed
  - Difficult to write an optimizing compiler for stack architectures

Accumulator Architecture: Pros and Cons

- Pros
  - Very low hardware requirements
  - Easy to design and understand
- Cons
  - Accumulator becomes the bottleneck
  - Little ability for parallelism or pipelining
  - High memory traffic

Memory-Memory Architecture: Pros and Cons

- Pros
  - Requires fewer instructions (especially if 3 operands)
  - Easy to write compilers for (especially if 3 operands)
- Cons
  - Very high memory traffic (especially if 3 operands)
  - Variable number of clocks per instruction
  - With two operands, more data movements are required

Memory-Register Architecture: Pros and Cons

- Pros
  - Some data can be accessed without loading first
  - Instruction format easy to encode
  - Good code density
- Cons
  - Operands are not equivalent (poor orthogonality)
  - Variable number of clocks per instruction
  - May limit number of registers

Compiling if-then-else in MIPS

Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiled code for the following C if statement?

    if (i == j) f = g + h; else f = g - h;

[Figure: flowchart testing i == j; the i = j path computes f = g + h, the i != j path (Else:) computes f = g - h, and both paths join at Exit:]

MIPS:

    bne $s3, $s4, Else         # go to Else if i != j
    add $s0, $s1, $s2          # f = g + h (skipped if i != j)
    j   Exit                   # go to Exit
    Else: sub $s0, $s1, $s2    # f = g - h (skipped if i = j)
    Exit:

Typical Compilation

Major Types of Optimization

Columns: optimization name, explanation, frequency

- High-level: at or near source level; machine-independent
  - Procedure integration: replace procedure call by procedure body. Frequency: N.M.
- Local: within straight-line code
  - Common subexpression elimination: replace two instances of the same computation by a single copy
  - Constant propagation: replace all instances of a variable that is assigned a constant with the constant
  - Stack height reduction: rearrange expression tree to minimize resources needed for expression evaluation. Frequency: N.M.
- Global: across a branch
  - Global common subexpression elimination: same as local, but this version crosses branches
  - Copy propagation: replace all instances of a variable A that has been assigned X (i.e., A = X) with X
  - Code motion: remove code from a loop that computes the same value each iteration of the loop
  - Induction variable elimination: simplify/eliminate array-addressing calculations within loops
- Machine-dependent: depends on machine knowledge
  - Strength reduction: many examples, such as replace multiply by a constant with adds and shifts. Frequency: N.M.
  - Pipeline scheduling: reorder instructions to improve pipeline performance. Frequency: N.M.
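Strength reduction, the machine-dependent entry in the list above, can be sketched in a few lines. The function below is an illustrative model of what a compiler might emit for a multiply by a known constant; the name `multiply_by_constant` is hypothetical, and a real compiler would pick the shift/add sequence at compile time rather than loop at run time.

```python
def multiply_by_constant(x, constant):
    """Compute x * constant (constant >= 0) using only shifts and adds."""
    result = 0
    shift = 0
    while constant:
        if constant & 1:
            result += x << shift   # add x * 2**shift for each set bit
        constant >>= 1
        shift += 1
    return result


# x * 10 becomes (x << 3) + (x << 1), since 10 = 0b1010.
x = 7
print(multiply_by_constant(x, 10), (x << 3) + (x << 1))
```

Whether this is a win depends on the machine: it pays off only when shifts and adds together cost less than the multiply they replace.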

Effect of Compiler Optimization

[Figure: measurements for each program and compiler optimization level]

- Level 0: non-optimized code
- Level 1: local optimization
- Level 2: global optimization, s/w pipelining
- Level 3: adds procedure integration

Compiler Support for Multimedia Instr.

- Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics)
- Intel added a new set of instructions called Streaming SIMD Extension
- A major advantage of vector computers is hiding latency of memory access by loading multiple elements and then overlapping execution with data transfer
- Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations
- Strided addressing allows memory access in increments larger than one
- Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data
- Supporting vector operation without strided addressing, such as Intel's MMX, limits the potential speedup
- Such limited support for vector processing makes the use of vectorizing compiler optimization unpopular and restricts its scope to hand-coded routines

SIMD instructions on MMX and AltiVec tend to be solutions, not primitives
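The two vector addressing styles mentioned above can be sketched on a flat list standing in for memory. The helper names and the 3 x 4 row-major matrix are assumptions for illustration; real vector hardware performs these accesses in the memory system rather than in a loop.

```python
def strided_load(memory, base, stride, count):
    """Load `count` elements starting at `base`, stepping by `stride`."""
    return [memory[base + i * stride] for i in range(count)]


def gather_load(memory, index_vector):
    """Load the elements whose addresses are listed in `index_vector`."""
    return [memory[i] for i in index_vector]


def scatter_store(memory, index_vector, values):
    """Store each value at the address given by the matching index."""
    for i, value in zip(index_vector, values):
        memory[i] = value


# A 3 x 4 matrix stored row by row: element (r, c) lives at r * 4 + c.
memory = list(range(100, 112))

column_1 = strided_load(memory, base=1, stride=4, count=3)
picked = gather_load(memory, [11, 0, 5])
scatter_store(memory, [11, 0, 5], [-1, -2, -3])
print(column_1, picked, memory)
```

Reading a matrix column is the typical case for a stride larger than one; without strided loads, each element has to be fetched separately, which limits the speedup.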

Starting a Program

[Figure: C program -> Compiler -> Assembly language program -> Assembler -> Object: Machine language module, together with Object: Library routine (machine language) -> Linker -> Executable: Machine language program -> Loader -> Memory]

Linker:
- Place code and data modules symbolically in memory
- Determine the address of data and instruction labels
- Patch both internal and external references

Object files for Unix typically contain:
- Header: size and position of components
- Text segment: machine code
- Data segment: static and dynamic variables
- Relocation info: identify absolute memory references
- Symbol table: name and location of labels, procedures and variables
- Debugging info: mapping source to object code, breakpoints, etc.

Loading Executable Program

[Figure: memory layout showing Reserved, Text (pc), Static data ($gp), Dynamic data, and Stack ($sp) regions with their hex addresses]

To load an executable, the operating system follows these steps:
- Reads the executable file header to determine the size of text and data segments
- Creates an address space large enough for the text and data
- Copies the instructions and data from the executable file into memory
- Copies the parameters (if any) to the main program onto the stack
- Initializes the machine registers and sets the stack pointer to the first free location
- Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program

Instruction Set Design Issues

- Instruction Set Design Issues
  - Number of Addresses
  - Flow of Control
  - Operand Types
  - Addressing Modes
  - Instruction Types
  - Instruction Formats

Number of Addresses

- Four categories
  - 3-address machines
    - 2 for the source operands and one for the result
  - 2-address machines
    - One address doubles as source and result
  - 1-address machines
    - Accumulator machines
    - Accumulator is used for one source and result
  - 0-address machines
    - Stack machines
    - Operands are taken from the stack
    - Result goes onto the stack


0lo1 of Control 0lo1 of Control

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 66101

0lo1 of Control 0lo1 of Control

efault is seJuential flo

Several instructions alter this defaulteecution

8ranches$ 4nconditional

$ Conditional

$ elayed branches Procedure calls

$ elayed procedure calls

0lo1 of Control cont-0lo1 of Control

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 67101

0lo1 of Control cont-0lo1 of Control cont-

8ranches

4nconditional

$ Absolute address

$ PC$relative

U Target address is specified relative to PC contents U 1elocatable code

ltample6 MIPS

$ Absolute address

9 target

$ PC$relative

8 target


Flow of Control (cont.)

Branches
- Conditional
  - Jump is taken only if the condition is met
- Two types
  - Set-Then-Jump
    - Condition testing is separated from branching
    - Condition code registers are used to convey the condition test result
    - Condition code registers keep a record of the status of the last ALU operation, such as an overflow condition
  - Example: Pentium code

        cmp AX,BX    ; compare AX and BX
        je  target   ; jump if equal

Flow of Control (cont.)

  - Test-and-Jump
    - Single instruction performs condition testing and branching
  - Example: MIPS instruction

        beq Rsrc1,Rsrc2,target

    Jumps to target if Rsrc1 = Rsrc2
- Delayed branching
  - Control is transferred after executing the instruction that follows the branch instruction
    - This instruction slot is called the delay slot
  - Improves efficiency
  - Highly pipelined RISC processors support it

Flow of Control (cont.)

Procedure calls
- Facilitate modular programming
- Require two pieces of information to return
  - End of procedure
    - Pentium uses the ret instruction
    - MIPS uses the jr instruction
  - Return address
    - In a (special) register: MIPS allows any general-purpose register
    - On the stack: Pentium

Flow of Control (cont.)

[Figure: delay slot]

Parameter Passing

- Two basic techniques
  - Register-based (e.g., PowerPC, MIPS)
    - Internal registers are used
      - Faster
      - Limit the number of parameters
      - Recursive procedure
  - Stack-based (e.g., Pentium)
    - Stack is used
      - More general

2. Operand Types

- Instructions support basic data types
  - Characters
  - Integers
  - Floating-point
- Instruction overload
  - Same instruction for different data types
  - Example: Pentium

        mov AL,address     ; loads an 8-bit value
        mov AX,address     ; loads a 16-bit value
        mov EAX,address    ; loads a 32-bit value

Operand Types (cont.)

- Separate instructions
  - Instructions specify the operand size
  - Example: MIPS

        lb Rdest,address    ; loads a byte
        lh Rdest,address    ; loads a halfword (16 bits)
        lw Rdest,address    ; loads a word (32 bits)
        ld Rdest,address    ; loads a doubleword (64 bits)

  - Similar instructions exist for store
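To make the operand sizes concrete, the sketch below reads a byte, a halfword, a word, and a doubleword from the same starting address of a small byte array. It is an illustration only: the byte values are made up, and little-endian byte order is an assumption (MIPS can be configured for either byte order).

```python
import struct

# Eight bytes of hypothetical memory contents, starting at address 0.
memory = bytes([0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x08])

# Signed little-endian reads, analogous to lb / lh / lw / ld.
byte_val = struct.unpack_from("<b", memory, 0)[0]   # 8 bits
half_val = struct.unpack_from("<h", memory, 0)[0]   # 16 bits
word_val = struct.unpack_from("<i", memory, 0)[0]   # 32 bits
dword_val = struct.unpack_from("<q", memory, 0)[0]  # 64 bits
```

The address is the same in all four reads; only the size encoded in the instruction (here, the format character) changes how many bytes are fetched.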

3. Addressing Modes

- How the operands are specified
  - Operands can be in three places
    - Registers
      - Register addressing mode
    - Part of instruction
      - Constant
      - Immediate addressing mode
      - All processors support these two addressing modes
    - Memory
      - Difference between RISC and CISC
      - CISC supports a large variety of addressing modes
      - RISC follows the load/store architecture

4. Instruction Types

- Several types of instructions
  - Data movement
    - Pentium: mov dest,src
    - Some do not provide direct data movement instructions
    - Indirect data movement

          add Rdest,Rsrc,0    ; Rdest = Rsrc+0

  - Arithmetic and Logical
    - Arithmetic
      - Integer and floating-point, signed and unsigned
      - add, subtract, multiply, divide
    - Logical
      - and, or, not, xor

Instruction Types (cont.)

- Condition code bits
  - S: Sign bit (0 = +, 1 = -)
  - Z: Zero bit (0 = nonzero, 1 = zero)
  - O: Overflow bit (0 = no overflow, 1 = overflow)
  - C: Carry bit (0 = no carry, 1 = carry)
- Example: Pentium

      cmp count,25    ; compare count to 25
                      ; (subtract 25 from count)
      je  target      ; jump if equal
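The way a compare sets the condition code bits can be sketched in a few lines. The function below is a hypothetical model, not any processor's actual definition: `cmp_flags` is an invented name, the 32-bit width is an assumption, and the carry bit is treated as a borrow indicator for subtraction, as on x86.

```python
def cmp_flags(a, b, bits=32):
    """Return the S, Z, O, C bits produced by computing a - b on `bits`-bit values."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    diff = (a - b) & mask
    s = 1 if diff & sign else 0          # sign of the result
    z = 1 if diff == 0 else 0            # result is zero
    # Signed overflow: the operands differ in sign and the result's
    # sign differs from the sign of the first operand.
    o = 1 if ((a ^ b) & (a ^ diff) & sign) else 0
    c = 1 if a < b else 0                # unsigned borrow
    return {"S": s, "Z": z, "O": o, "C": c}


equal = cmp_flags(25, 25)        # a "jump if equal" would be taken: Z = 1
smaller = cmp_flags(10, 25)      # negative result, borrow generated
overflow = cmp_flags(0x80000000, 1)  # most negative value minus 1
```

A conditional jump such as "jump if equal" then only has to inspect the Z bit; it does not repeat the comparison.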

Instruction Types (cont.)

- Flow control and I/O instructions
  - Branch
  - Procedure call
  - Interrupts
- I/O instructions
  - Memory-mapped I/O
    - Most processors support memory-mapped I/O
    - No separate instructions for I/O
  - Isolated I/O
    - Pentium supports isolated I/O
    - Separate I/O instructions

          in  AX,io_port    ; read from an I/O port
          out io_port,AX    ; write to an I/O port

5. Instruction Formats

- Two types
  - Fixed-length
    - Used by RISC processors
    - 32-bit RISC processors use 32-bit wide instructions
      - Examples: SPARC, MIPS, PowerPC
  - Variable-length
    - Used by CISC processors
    - Memory operands need more bits to specify
- Opcode
  - Major and exact operation

Examples of Instruction Formats

[Figure: examples of instruction formats]

RISC (Reduced Instruction Set Computer) versus CISC (Complex Instruction Set Computer)

RISC vs CISC

- The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.
- RISC systems access memory only with explicit load and store instructions.
- In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.

RISC vs CISC

- The difference between CISC and RISC becomes evident through the basic computer performance equation.
- RISC systems shorten execution time by reducing the clock cycles per instruction.
- CISC systems improve performance by reducing the number of instructions per program.
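For reference, the basic performance equation is usually written in the following standard textbook form; it is given here as an assumption about what the slide showed. RISC attacks the middle factor, CISC the last one.

```latex
\frac{\text{time}}{\text{program}}
  = \frac{\text{time}}{\text{cycle}}
  \times \frac{\text{cycles}}{\text{instruction}}
  \times \frac{\text{instructions}}{\text{program}}
```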

RISC vs CISC

- The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.
- The more complex (and variable) instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.
- With fixed-length instructions, RISC lends itself to pipelining and speculative execution.

RISC vs CISC

Consider the following program fragments:

    CISC                RISC
    mov ax, 10          mov ax, 0
    mov bx, 5           mov bx, 10
    mul bx, ax          mov cx, 5
                        Begin: add ax, bx
                               loop Begin

The total clock cycles for the CISC version might be:

    (2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles

While the clock cycles for the RISC version are:

    (3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles

With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds.
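The cycle comparison above is simple enough to check mechanically. The sketch below assumes one cycle for each mov, add, and loop and 30 cycles for the multiply; these latencies are illustrative figures, not measurements of a real processor, and `total_cycles` is a hypothetical helper name.

```python
def total_cycles(instruction_mix):
    """Sum count * cycles over (name, count, cycles_each) entries."""
    return sum(count * cycles for _name, count, cycles in instruction_mix)


# CISC fragment: two movs and one multi-cycle multiply.
cisc = [("mov", 2, 1), ("mul", 1, 30)]

# RISC fragment: three movs, then an add and a loop instruction
# executed once per iteration for five iterations.
risc = [("mov", 3, 1), ("add", 5, 1), ("loop", 5, 1)]

cisc_cycles = total_cycles(cisc)
risc_cycles = total_cycles(risc)
```

The RISC version executes 13 instructions against 3 for CISC, yet needs fewer cycles in total because every instruction completes in a single cycle.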

RISC vs CISC

- Because of their load-store ISAs, RISC architectures require a large number of CPU registers.
- These registers provide fast access to data during sequential program execution.
- They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.
- Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.

RISC vs CISC

[Figure: overlapping register windows]

- This is how registers can be overlapped in a RISC system.
- The current window pointer (CWP) points to the active register window.

RISC vs CISC

- It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.
- Some RISC systems provide more extravagant instruction sets than some CISC systems.
- Some systems combine both approaches.
- The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.

RISC vs CISC

RISC                                          CISC
Multiple register sets                        Single register set
Three operands per instruction                One or two register operands per instruction
Parameter passing through register windows    Parameter passing through memory
Single-cycle instructions                     Multiple-cycle instructions
Hardwired control                             Microprogrammed control
Highly pipelined                              Less pipelined
Simple instructions, few in number            Many complex instructions
Fixed-length instructions                     Variable-length instructions
Complexity in compiler                        Complexity in microcode
Only LOAD/STORE instructions access memory    Many instructions can access memory
Few addressing modes                          Many addressing modes

Summary

Instruction Set Design Issues

Instruction set design issues include:
- Where are operands stored?
  - registers, memory, stack, accumulator
- How many explicit operands are there?
  - 0, 1, 2, or 3
- How is the operand location specified?
  - register, immediate, indirect, ...
- What type and size of operands are supported?
  - byte, int, float, double, string, vector, ...
- What operations are supported?
  - add, sub, mul, move, compare, ...

More About General Purpose Registers

Why do almost all new architectures use GPRs?
- Registers are much faster than memory (even cache)
  - Register values are available immediately
  - When memory isn't ready, the processor must wait ("stall")
- Registers are convenient for variable storage
  - Compiler assigns some variables just to registers
  - More compact code, since small fields specify registers (compared to memory addresses)

[Figure: memory hierarchy: Processor with Registers, Cache, Memory, Disk]

What Operations are Needed

- Arithmetic and Logical
  - Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
  - Logical operation: AND, OR, XOR, NOT
- Data Transfer: copy, load, store
- Control: branch, jump, call, return
- Floating Point: ADDF, MULF, DIVF
  - Same as arithmetic but usually take bigger operands
- Decimal: ADDD, CONVERT
- String: move, compare, search
- Graphics: pixel and vertex, compression/decompression operations

Stack Architecture Pros and Cons

Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures

Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures

Accumulator Architecture Pros and Cons

Pros
- Very low hardware requirements
- Easy to design and understand

Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic

Memory-Memory Architecture Pros and Cons

Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)

Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required

Memory-Register Architecture Pros and Cons

Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density

Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers

Typical Compilation

[Figure: typical compilation]

Major Types of Optimization

Optimization name: Explanation [Frequency, where legible; N.M. as given in the source]

High-level (at or near source level; machine independent)
- Procedure integration: Replace procedure call by procedure body [N.M.]

Local (within straight-line code)
- Common subexpression elimination: Replace two instances of the same computation by a single copy
- Constant propagation: Replace all instances of a variable that is assigned a constant with the constant
- Stack height reduction: Rearrange expression tree to minimize resources needed for expression evaluation [N.M.]

Global (across a branch)
- Global common subexpression elimination: Same as local, but this version crosses branches
- Copy propagation: Replace all instances of a variable A that has been assigned X (i.e., A = X) with X
- Code motion: Remove code from a loop that computes the same value each iteration of the loop
- Induction variable elimination: Simplify/eliminate array addressing calculations within loops

Machine-dependent (depends on machine knowledge)
- Strength reduction: Many examples, such as replacing a multiply by a constant with adds and shifts [N.M.]
- Pipeline scheduling: Reorder instructions to improve pipeline performance [N.M.]
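Strength reduction, listed among the machine-dependent optimizations above, is easy to show on a multiply by a constant. The sketch below is a hand-written illustration of the transformation a compiler might apply; `times_ten` is a hypothetical name, and whether shifts and adds are actually cheaper than a multiply depends on the target machine.

```python
def times_ten(x):
    """Multiply by 10 using shifts and adds: 10*x = 8*x + 2*x."""
    return (x << 3) + (x << 1)


# The rewritten form must agree with the plain multiply for every input.
checked = all(times_ten(x) == 10 * x for x in range(-1000, 1001))
```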

Effect of Compiler Optimization

Measurements taken on S

[Figure: program and compiler optimization level]

- Level 0: non-optimized code
- Level 1: local optimization
- Level 2: global optimization, s/w pipelining
- Level 3: adds procedure integration

Compiler Support for Multimedia Instr.

- Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics)
- Intel added a new set of instructions, called Streaming SIMD Extension
- A major advantage of vector computers is hiding latency of memory access by loading multiple elements and then overlapping execution with data transfer
- Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations
  - Strided addressing allows memory access in increments larger than one
  - Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data
- Supporting vector operation without strided addressing, as in Intel's MMX, limits the potential speedup
- Such limited support for vector processing makes the use of vectorizing compiler optimization unpopular and restricts its scope to hand-coded routines
- SIMD instructions on MMX and AltiVec tend to be solutions, not primitives
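The two addressing styles mentioned above can be sketched with ordinary lists. This is a conceptual model only: `strided_load`, `gather`, and `scatter` are hypothetical helper names, and a real vector unit performs these accesses in hardware rather than in a loop.

```python
def strided_load(memory, base, stride, count):
    """Read `count` elements starting at `base`, stepping by `stride`."""
    return [memory[base + i * stride] for i in range(count)]


def gather(memory, indices):
    """Read the elements whose addresses are listed in `indices`."""
    return [memory[i] for i in indices]


def scatter(memory, indices, values):
    """Write each value to the address at the same position in `indices`."""
    for i, v in zip(indices, values):
        memory[i] = v


memory = list(range(100, 116))           # 16 words holding 100..115
column = strided_load(memory, 1, 4, 4)   # every 4th word, starting at address 1
picked = gather(memory, [7, 0, 12])      # arbitrary, non-contiguous addresses
scatter(memory, [7, 0, 12], [-1, -2, -3])
```

A strided load is what walking down one column of a row-major matrix needs, while gather/scatter covers access patterns with no regular spacing at all.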

Starting a Program

[Figure: C program -> Compiler -> Assembly language program -> Assembler -> Object: machine language module, together with Object: library routine (machine language) -> Linker -> Executable: machine language program -> Loader -> Memory]

Linker
- Place code and data modules symbolically in memory
- Determine the address of data and instruction labels
- Patch both internal and external references

Object files for Unix typically contain:
- Header: size and position of components
- Text segment: machine code
- Data segment: static and dynamic variables
- Relocation info: identifies absolute memory references
- Symbol table: name and location of labels, procedures, and variables
- Debugging info: mapping source to object code, break points, etc.

Loading Executable Program

[Figure: memory layout]

    $sp -> 7fff ffff hex    Stack
                            Dynamic data
    $gp -> 1000 8000 hex    Static data
           1000 0000 hex
    pc  -> 0040 0000 hex    Text
           0                Reserved

To load an executable, the operating system follows these steps:
- Reads the executable file header to determine the size of the text and data segments
- Creates an address space large enough for the text and data
- Copies the instructions and data from the executable file into memory
- Copies the parameters (if any) to the main program onto the stack
- Initializes the machine registers and sets the stack pointer to the first free location
- Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program

Instruction Set Design Issues

- Number of Addresses
- Flow of Control
- Operand Types
- Addressing Modes
- Instruction Types
- Instruction Formats

Number of Addresses

Four categories
- 3-address machines
  - 2 for the source operands and one for the result
- 2-address machines
  - One address doubles as source and result
- 1-address machines
  - Accumulator machines
  - Accumulator is used for one source and result
- 0-address machines
  - Stack machines
  - Operands are taken from the stack
  - Result goes onto the stack

Number of Addresses (cont.)

Three-address machines
- Two for the source operands, one for the result
- RISC processors use three addresses
- Sample instructions

      add  dest,src1,src2    M(dest)=[src1]+[src2]
      sub  dest,src1,src2    M(dest)=[src1]-[src2]
      mult dest,src1,src2    M(dest)=[src1]*[src2]

Three addresses: Operand 1, Operand 2, Result
Example: a = b + c

Three-address instruction formats are not common, because they require a relatively long instruction format to hold the three address references.

Number of Addresses (cont.)

Example: three-address machine

C statement

    A = B + C * D - E + F + A

Equivalent code:

    mult T,C,D    ; T = C*D
    add  T,T,B    ; T = B+C*D
    sub  T,T,E    ; T = B+C*D-E
    add  T,T,F    ; T = B+C*D-E+F
    add  A,T,A    ; A = B+C*D-E+F+A

Number of Addresses (cont.)

Two-address machines
- One address doubles (for source operand and result)
- Last example makes a case for it
  - Address T is used twice
- Sample instructions

      load dest,src    M(dest)=[src]
      add  dest,src    M(dest)=[dest]+[src]
      sub  dest,src    M(dest)=[dest]-[src]
      mult dest,src    M(dest)=[dest]*[src]

Two addresses: one address doubles as operand and result
Example: a = a + b

The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation.

Number of Addresses (cont.)

Example: two-address machine

C statement

    A = B + C * D - E + F + A

Equivalent code:

    load T,C    ; T = C
    mult T,D    ; T = C*D
    add  T,B    ; T = B+C*D
    sub  T,E    ; T = B+C*D-E
    add  T,F    ; T = B+C*D-E+F
    add  A,T    ; A = B+C*D-E+F+A

Number of Addresses (cont.)

One-address machines
- Use a special set of registers called accumulators
  - Specify one source operand and receive the result
- Called accumulator machines
- Sample instructions

      load  addr    accum = [addr]
      store addr    M[addr] = accum
      add   addr    accum = accum + [addr]
      sub   addr    accum = accum - [addr]
      mult  addr    accum = accum * [addr]

Number of Addresses (cont.)

Example: one-address machine

C statement

    A = B + C * D - E + F + A

Equivalent code:

    load  C    ; load C into accum
    mult  D    ; accum = C*D
    add   B    ; accum = C*D+B
    sub   E    ; accum = B+C*D-E
    add   F    ; accum = B+C*D-E+F
    add   A    ; accum = B+C*D-E+F+A
    store A    ; store accum contents in A
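The accumulator sequence above can be traced with a few lines of Python. This is a minimal sketch under stated assumptions: `run_accumulator` is a hypothetical name, memory is a dictionary keyed by variable name, and the sample values are made up.

```python
def run_accumulator(program, memory):
    """Execute one-address instructions against a single accumulator."""
    accum = 0
    for op, addr in program:
        if op == "load":        # accum = [addr]
            accum = memory[addr]
        elif op == "store":     # M[addr] = accum
            memory[addr] = accum
        elif op == "add":       # accum = accum + [addr]
            accum += memory[addr]
        elif op == "sub":       # accum = accum - [addr]
            accum -= memory[addr]
        elif op == "mult":      # accum = accum * [addr]
            accum *= memory[addr]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory


# A = B + C * D - E + F + A
program = [("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
           ("add", "F"), ("add", "A"), ("store", "A")]
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}  # sample values
result = run_accumulator(program, memory)
```

Every instruction except the implicit accumulator reference goes to memory, which is the source of the high memory traffic usually attributed to this style.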

Number of Addresses (cont.)

Zero-address machines
- Stack supplies operands and receives the result
  - Special instructions to load and store use an address
- Called stack machines (Ex: HP3000, Burroughs B5500)
- Sample instructions

      push addr    push([addr])
      pop  addr    pop([addr])
      add          push(pop + pop)
      sub          push(pop - pop)
      mult         push(pop * pop)

Number of Addresses (cont.)

Example: zero-address machine

C statement

    A = B + C * D - E + F + A

Equivalent code:

    push E
    push C
    push D
    mult
    push B
    add
    sub
    push F
    add
    push A
    add
    pop  A
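The stack-machine sequence above can be checked with a short simulator. This is a sketch, not a model of any real machine: `run_stack` is a hypothetical name, the sample values are made up, and `sub` is taken to mean push(first pop - second pop), following the sample instruction definitions.

```python
def run_stack(program, memory):
    """Execute zero-address instructions; only push and pop name an address."""
    stack = []
    for instr in program:
        op = instr[0]
        if op == "push":                 # push([addr])
            stack.append(memory[instr[1]])
        elif op == "pop":                # pop([addr])
            memory[instr[1]] = stack.pop()
        else:
            first = stack.pop()          # value on top of the stack
            second = stack.pop()         # value below it
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)
            elif op == "mult":
                stack.append(first * second)
            else:
                raise ValueError(f"unknown instruction: {op}")
    return memory


# A = B + C * D - E + F + A; E is pushed first so that sub sees
# (B + C*D) on top of the stack and E underneath it.
program = [("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
           ("push", "B"), ("add",), ("sub",), ("push", "F"), ("add",),
           ("push", "A"), ("add",), ("pop", "A")]
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}  # sample values
result = run_stack(program, memory)
```

Pushing E before the product is what makes the subtraction come out in the right order; the operand order on the stack is part of the program.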

)oadStore Architecture)oadStore Architecture

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 63101

)oadStore Architecture)oadStore Architecture

Instructions epect operands in internal processor registers Special 35A and ST51lt instructions move data beteen registers

and memory

1ISC uses this architecture

1educes instruction length

()

)oadStore Architecture cont-)oadStore Architecture

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 64101

)oadStore Architecture cont-)oadStore Architecture cont-

Sample instructionsload $daddr $d = [addr]

store addr$s (addr) = $s

add $d$s$samp $d = $s + $sampsub $d$s$samp $d = $s - $samp

mult $d$s$samp $d = $s $samp

um+er of Addresses cont-um+er of Addresses

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 65101

um+er of Addresses cont-um+er of Addresses cont-

ampleC statement

A = B + C D E + F + A

1uialent co)eload $B mult $amp$amp$

load $ampC add $amp$amp$

load $D sub $amp$amp$

load $E add $amp$amp$

load $F add $amp$amp$

load $A store A$amp

0lo1 of Control 0lo1 of Control

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 66101

0lo1 of Control 0lo1 of Control

efault is seJuential flo

Several instructions alter this defaulteecution

8ranches$ 4nconditional

$ Conditional

$ elayed branches Procedure calls

$ elayed procedure calls

0lo1 of Control cont-0lo1 of Control

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 67101

0lo1 of Control cont-0lo1 of Control cont-

8ranches

4nconditional

$ Absolute address

$ PC$relative

U Target address is specified relative to PC contents U 1elocatable code

ltample6 MIPS

$ Absolute address

9 target

$ PC$relative

8 target

0lo1 of Control cont-0lo1 of Control

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 68101

0lo1 of Control cont- -

e entium e R

0lo1 of Control cont-0lo1 of Control

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 69101

lo1 o Co t ol co t- -

8ranches

Conditional

$ ump is ta0en only if the condition is met

To types

$ Set$Then$ump

U Condition testing is separated from branching U Condition code registers are used to convey the condition test

result

U Condition code registers 0eep a record of the status of the last A34 operation such as overflo condition

$ ltample6 Pentium codecm AB comare A ad B

e taret um e0ual

0lo1 of Control cont-0lo1 of Control

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 70101

- -

$ Test$and$ump

U Single instruction performs condition testing and branching

$ ltample6 MIPS instruction

be0 $src$srcamptaret

umps to target if 1src E 1src

elayed branching

Control is transferred after eecuting the instruction thatfollos the branch instruction

$ This instruction slot is called delay slot Improves efficiency

ighly pipelined 1ISC processors support

0lo1 of Control cont-0lo1 of Control

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 71101

- -

Procedure calls Lacilitate modular programming

1eJuire to pieces of information to return

$ ltnd of procedure U Pentium

uses ret instruction

U MIPS

uses 9r instruction

$ 1eturn address U In a (special) register

MIPS allos any general$purpose register

U 5n the stac0

Pentium

0lo1 of Control cont-0lo1 of Control

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 72101

- -

0lo1 of Control cont-0lo1 of Control

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 73101

- -

elay slot

Parameter PassingParameter Passin

g

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 74101

gg

To basic techniJues 1egister$based (eg PoerPC MIPS)

$ Internal registers are used U Laster

U 3imit the number of parameters U 1ecursive procedure

Stac0$based (eg Pentium)

$ Stac0 is used U More general

2 perand Types2

perand Types

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 75101

p yp

Instructions support basic data types

Characters Integers

Lloating$point

Instruction overload

Same instruction for different data types

ltample6 Pentium mo1 A2address loads a 3-bt 1alue

mo1 Aaddress loads a -bt 1alue

mo1 EAaddress loads a amp-bt 1alue

perand Types

perand Types

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 76101

Separate instructions

Instructions specify the operand si-e

ltample6 MIPS

lb $destaddress loads a b4te

l $destaddress loads a al5ord( bts)

l5 $destaddress loads a 5ord

(amp bts)

ld $destaddress loads a double5ord

( bts)imilar instruction store

3 Addressing Modes3 Addressin

g Modes

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 77101

o the operands are specified

5perands can be in three places

$ 1egisters U 1egister addressing mode

$ Part of instruction U Constant

U Immediate addressing mode

U All processors support these to addressing modes

$ Memory U ifference beteen 1ISC and CISC

U CISC supports a large variety of addressing modes

U 1ISC follos load2store architecture

4 Instruction Types4 Instruction T

ypes

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 78101

Several types of instructions

ata movement$ Pentium6 mo1 destsrc

$ Some do not provide direct data movement instructions

$ Indirect data movement

add $dest$src6 $dest = $src+6

Arithmetic and 3ogical

$ Arithmetic U Integer and floating$point signed and unsigned U add subtract multiply divide

$ 3ogical U andB orB notB 7or

Instruction Types cont-Instruction T

ypes cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 79101

Condition code bits

S6 Sign bit (gt E F E $)

6 Vero bit (gt E non-ero E -ero)

$6 5verflo bit (gt E no overflo E overflo)

C6 Carry bit (gt E no carry E carry)

ltample6 Pentium

cm coutamp comare cout to amp

subtract amp rom cout

e taret um e0ual

Instruction Types cont-Instruction T

ypes cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 80101

Llo control and I25 instructions

$ 8ranch

$ Procedure call

$ Interrupts

I25 instructions$ Memory$mapped I25

U Most processors support memory$mapped I25

U 7o separate instructions for I25

$ Isolated I25 U Pentium supports isolated I25

U Separate I25 instructions

Ao7ort read from an IO ort

out o7ortA rte to an IO ort

5. Instruction Formats

Two types

Fixed-length
- Used by RISC processors
- 32-bit RISC processors use 32-bit-wide instructions
  - Examples: SPARC, MIPS, PowerPC

Variable-length
- Used by CISC processors
- Memory operands need more bits to specify

Opcode
- Major and exact operation
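With a fixed-length format, decoding can be little more than slicing bit fields at known positions. The sketch below assumes the classic MIPS R-type layout (6-bit opcode, three 5-bit register fields, 5-bit shift amount, 6-bit function code); the function name `decode_r_type` is hypothetical and the layout is stated here as an assumption rather than taken from the slides.

```python
def decode_r_type(word: int) -> dict:
    """Split a 32-bit word into R-type style fields (6-5-5-5-5-6 bits)."""
    return {
        "opcode": (word >> 26) & 0x3F,  # bits 31..26
        "rs": (word >> 21) & 0x1F,      # bits 25..21, first source register
        "rt": (word >> 16) & 0x1F,      # bits 20..16, second source register
        "rd": (word >> 11) & 0x1F,      # bits 15..11, destination register
        "shamt": (word >> 6) & 0x1F,    # bits 10..6, shift amount
        "funct": word & 0x3F,           # bits 5..0, exact operation
    }


# 0x012A4020 is assumed here to encode a register-to-register add.
fields = decode_r_type(0x012A4020)
```

A variable-length format cannot be decoded this way, because the position of later fields depends on what the earlier bytes say.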

Examples of Instruction Formats

[Figure: examples of instruction formats]

RISC (Reduced Instruction Set Computer)
versus
CISC (Complex Instruction Set Computer)

RISC vs. CISC

The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.

RISC systems access memory only with explicit load and store instructions.

In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.

RISC vs. CISC

The difference between CISC and RISC becomes evident through the basic computer performance equation:

    time/program = (time/cycle) x (cycles/instruction) x (instructions/program)

RISC systems shorten execution time by reducing the clock cycles per instruction.

CISC systems improve performance by reducing the number of instructions per program.

RISC vs. CISC

The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.

The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.

With fixed-length instructions, RISC lends itself to pipelining and speculative execution.

RISC vs. CISC

Consider the following program fragments:

    CISC                 RISC
    mov ax, 10           mov ax, 0
    mov bx, 5            mov bx, 10
    mul bx, ax           mov cx, 5
                  Begin: add ax, bx
                         loop Begin

The total clock cycles for the CISC version might be:
    (2 movs x 1 cycle) + (1 mul x 30 cycles) = 32 cycles

While the clock cycles for the RISC version are:
    (3 movs x 1 cycle) + (5 adds x 1 cycle) + (5 loops x 1 cycle) = 13 cycles

With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds.
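The cycle comparison can be checked with a few lines of arithmetic. The sketch below is illustrative only: the instruction mixes and the per-instruction costs (1 cycle for mov, add and loop, 30 cycles for mul, five loop iterations) are assumed values for this example, and `total_cycles` is a hypothetical helper name.

```python
def total_cycles(mix):
    """Sum count * cycles-per-instruction over (name, count, cycles) tuples."""
    return sum(count * cycles for _name, count, cycles in mix)


# Assumed instruction mixes: (mnemonic, how many executed, cycles each)
cisc_mix = [("mov", 2, 1), ("mul", 1, 30)]
risc_mix = [("mov", 3, 1), ("add", 5, 1), ("loop", 5, 1)]

cisc_cycles = total_cycles(cisc_mix)
risc_cycles = total_cycles(risc_mix)
```

The RISC fragment executes more instructions yet needs fewer cycles, which is the trade-off the performance equation describes.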

RISC vs. CISC

Because of their load-store ISAs, RISC architectures require a large number of CPU registers.

These registers provide fast access to data during sequential program execution.

They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.

Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.

RISC vs. CISC

This is how registers can be overlapped in a RISC system.

The current window pointer (CWP) points to the active register window.

[Figure: overlapping register windows]

RISC vs. CISC

It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.

Some RISC systems provide more extravagant instruction sets than some CISC systems.

Some systems combine both approaches.

The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.

RISC vs. CISC

RISC
- Multiple register sets.
- Three operands per instruction.
- Parameter passing through register windows.
- Single-cycle instructions.
- Hardwired control.
- Highly pipelined.

CISC
- Single register set.
- One or two register operands per instruction.
- Parameter passing through memory.
- Multiple-cycle instructions.
- Microprogrammed control.
- Less pipelined.

(continued)

RISC vs. CISC

RISC
- Simple instructions, few in number.
- Fixed-length instructions.
- Complexity in compiler.
- Only LOAD/STORE instructions access memory.
- Few addressing modes.

CISC
- Many complex instructions.
- Variable-length instructions.
- Complexity in microcode.
- Many instructions can access memory.
- Many addressing modes.

Summary

Instruction Set Design Issues

Instruction set design issues include:

Where are operands stored?
- registers, memory, stack, accumulator

How many explicit operands are there?
- 0, 1, 2, or 3

How is the operand location specified?
- register, immediate, indirect, ...

What type and size of operands are supported?
- byte, int, float, double, string, vector, ...

What operations are supported?
- add, sub, mul, move, compare, ...

More About General Purpose Registers

Why do almost all new architectures use GPRs?

Registers are much faster than memory (even cache)
- Register values are available immediately
- When memory isn't ready, the processor must wait ("stall")

Registers are convenient for variable storage
- Compiler assigns some variables just to registers
- More compact code, since small fields specify registers (compared to memory addresses)

[Figure: Processor (Registers, Cache), Memory, Disk]

What Operations are Needed

Arithmetic and Logical
- Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
- Logical operation: AND, OR, XOR, NOT

Data Transfer - copy, load, store

Control - branch, jump, call, return

Floating Point - ADDF, MULF, DIVF
- Same as arithmetic but usually take bigger operands

Decimal - ADDD, CONVERT

String - move, compare, search

Graphics - pixel and vertex, compression/decompression operations

Stack Architecture: Pros and Cons

Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures

Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures

Accumulator Architecture: Pros and Cons

Pros
- Very low hardware requirements
- Easy to design and understand

Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic

Memory-Memory Architecture: Pros and Cons

Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)

Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required

Memory-Register Architecture: Pros and Cons

Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density

Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers

Major Types of Optimization

Optimization name: explanation (frequency, where given; N.M. = not measured)

High-level (at or near source level; machine independent)
- Procedure integration: replace procedure call by procedure body (N.M.)

Local (within straight-line code)
- Common subexpression elimination: replace two instances of the same computation by a single copy
- Constant propagation: replace all instances of a variable that is assigned a constant with the constant
- Stack height reduction: rearrange expression tree to minimize resources needed for expression evaluation (N.M.)

Global (across a branch)
- Global common subexpression elimination: same as local, but this version crosses branches
- Copy propagation: replace all instances of a variable A that has been assigned X (i.e., A = X) with X
- Code motion: remove code from a loop that computes the same value each iteration of the loop
- Induction variable elimination: simplify/eliminate array addressing calculations within loops

Machine-dependent (depends on machine knowledge)
- Strength reduction: many examples, such as replacing a multiply by a constant with adds and shifts (N.M.)
- Pipeline scheduling: reorder instructions to improve pipeline performance (N.M.)
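Strength reduction is perhaps the easiest of these optimizations to show in a few lines. The sketch below is only an illustration of the idea, with hypothetical function names: a multiply by the constant 10 is rewritten as two shifts and an add, and the two forms are compared on a handful of sample values.

```python
def times_ten_mul(x: int) -> int:
    """Original form: one multiply by a constant."""
    return x * 10


def times_ten_shift_add(x: int) -> int:
    """Strength-reduced form: 10*x = 8*x + 2*x = (x << 3) + (x << 1)."""
    return (x << 3) + (x << 1)


samples = [0, 1, 7, 123, -5]
agree = all(times_ten_mul(x) == times_ten_shift_add(x) for x in samples)
```

Whether the rewritten form is actually faster depends on the relative cost of multiply, shift and add on the target machine, which is why this optimization is listed as machine-dependent.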

Effect of Compiler Optimization

[Figure: measurements by program and compiler optimization level]

Level 0: non-optimized code
Level 1: local optimization
Level 2: global optimization, s/w pipelining
Level 3: adds procedure integration

Compiler Support for Multimedia Instructions

Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics).

Intel added a new set of instructions, called Streaming SIMD Extension.

A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer.

Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations.

Strided addressing allows memory access in increments larger than one.

Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data.

Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup.

Such limited support for vector processing makes the use of vectorizing compiler optimization unpopular and restricts its scope to hand-coded routines.

SIMD instructions on MMX and AltiVec tend to be solutions, not primitives.
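The two addressing styles mentioned above can be mimicked on an ordinary list standing in for memory. This is a minimal sketch under that assumption; `strided_load`, `gather` and `scatter` are hypothetical helper names, not instructions of any real vector machine.

```python
def strided_load(memory, base, stride, count):
    """Read `count` elements starting at `base`, stepping by `stride`."""
    return [memory[base + i * stride] for i in range(count)]


def gather(memory, addresses):
    """Read the elements whose addresses are listed in `addresses`."""
    return [memory[a] for a in addresses]


def scatter(memory, addresses, values):
    """Write each value to the address listed at the same position."""
    for a, v in zip(addresses, values):
        memory[a] = v


memory = list(range(100, 116))                # 16 words holding 100..115
column = strided_load(memory, base=1, stride=4, count=4)
picked = gather(memory, [3, 0, 9])
scatter(memory, [3, 0, 9], [-1, -2, -3])
```

A stride of 4 here corresponds to walking down one column of a 4-wide row-major matrix, the kind of access that is awkward without strided addressing.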

Starting a Program

C program -> Compiler -> Assembly language program -> Assembler -> Object: machine language module (together with Object: library routine (machine language)) -> Linker -> Executable: machine language program -> Loader -> Memory

Linker
- Place code and data modules symbolically in memory
- Determine the address of data and instruction labels
- Patch both internal and external references

Object files for Unix typically contain:
- Header: size and position of components
- Text segment: machine code
- Data segment: static and dynamic variables
- Relocation info: identify absolute memory references
- Symbol table: name and location of labels, procedures and variables
- Debugging info: mapping source to object code, break points, etc.

Loading Executable Program

[Figure: memory layout, from high to low addresses]
    $sp -> 7fff ffff hex   Stack
                           Dynamic data
    $gp -> 1000 8000 hex   Static data
           1000 0000 hex
                           Text
    pc  -> 0040 0000 hex
                           Reserved
           0

To load an executable, the operating system follows these steps:

- Reads the executable file header to determine the size of the text and data segments
- Creates an address space large enough for the text and data
- Copies the instructions and data from the executable file into memory
- Copies the parameters (if any) to the main program onto the stack
- Initializes the machine registers and sets the stack pointer to the first free location
- Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program

Instruction Set Design Issues

- Number of addresses
- Flow of control
- Operand types
- Addressing modes
- Instruction types
- Instruction formats

Number of Addresses

Four categories

3-address machines
- 2 for the source operands and one for the result

2-address machines
- One address doubles as source and result

1-address machines
- Accumulator machines
- Accumulator is used for one source and result

0-address machines
- Stack machines
- Operands are taken from the stack
- Result goes onto the stack

Number of Addresses (cont.)

Three-address machines
- Two for the source operands, one for the result
- RISC processors use three addresses

Sample instructions
      add  dest,src1,src2   ; M(dest) = [src1] + [src2]
      sub  dest,src1,src2   ; M(dest) = [src1] - [src2]
      mult dest,src1,src2   ; M(dest) = [src1] * [src2]

Three addresses: Operand 1, Operand 2, Result
Example: a = b + c

Three-address instruction formats are not common because they require a relatively long instruction format to hold the three address references.

Number of Addresses (cont.)

Example

C statement
      A = B + C * D - E + F + A

Equivalent code:
      mult T,C,D   ; T = C*D
      add  T,T,B   ; T = B+C*D
      sub  T,T,E   ; T = B+C*D-E
      add  T,T,F   ; T = B+C*D-E+F
      add  A,T,A   ; A = B+C*D-E+F+A

Number of Addresses (cont.)

Two-address machines
- One address doubles (for source operand and result)
- The last example makes a case for it
  - Address T is used twice

Sample instructions
      load dest,src   ; M(dest) = [src]
      add  dest,src   ; M(dest) = [dest] + [src]
      sub  dest,src   ; M(dest) = [dest] - [src]
      mult dest,src   ; M(dest) = [dest] * [src]

Two addresses: one address doubles as operand and result
Example: a = a + b

The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation.

Number of Addresses (cont.)

Example

C statement
      A = B + C * D - E + F + A

Equivalent code:
      load T,C   ; T = C
      mult T,D   ; T = C*D
      add  T,B   ; T = B+C*D
      sub  T,E   ; T = B+C*D-E
      add  T,F   ; T = B+C*D-E+F
      add  A,T   ; A = B+C*D-E+F+A

Number of Addresses (cont.)

One-address machines
- Use a special set of registers called accumulators
  - Specify one source operand and receive the result
- Called accumulator machines

Sample instructions
      load  addr   ; accum = [addr]
      store addr   ; M[addr] = accum
      add   addr   ; accum = accum + [addr]
      sub   addr   ; accum = accum - [addr]
      mult  addr   ; accum = accum * [addr]

Number of Addresses (cont.)

Example

C statement
      A = B + C * D - E + F + A

Equivalent code:
      load  C   ; load C into accum
      mult  D   ; accum = C*D
      add   B   ; accum = C*D+B
      sub   E   ; accum = B+C*D-E
      add   F   ; accum = B+C*D-E+F
      add   A   ; accum = B+C*D-E+F+A
      store A   ; store accum contents in A
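The accumulator sequence above can be traced with a very small interpreter. This is a sketch for illustration only: `run_accumulator`, the tuple encoding of instructions, and the sample values stored in memory are all assumptions made for this example.

```python
def run_accumulator(program, memory):
    """Execute one-address instructions against a single accumulator."""
    acc = 0
    for op, addr in program:
        if op == "load":
            acc = memory[addr]
        elif op == "store":
            memory[addr] = acc
        elif op == "add":
            acc += memory[addr]
        elif op == "sub":
            acc -= memory[addr]
        elif op == "mult":
            acc *= memory[addr]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return memory


# A = B + C * D - E + F + A, using the one-address instruction order
acc_program = [
    ("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
    ("add", "F"), ("add", "A"), ("store", "A"),
]
acc_memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
run_accumulator(acc_program, acc_memory)
```

Every arithmetic instruction in this trace reads memory, which is one way to see why accumulator machines generate high memory traffic.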

Number of Addresses (cont.)

Zero-address machines
- Stack supplies operands and receives the result
  - Special instructions to load and store use an address
- Called stack machines (Ex: HP3000, Burroughs B5500)

Sample instructions
      push addr   ; push([addr])
      pop  addr   ; pop([addr])
      add         ; push(pop + pop)
      sub         ; push(pop - pop)
      mult        ; push(pop * pop)

Number of Addresses (cont.)

Example

C statement
      A = B + C * D - E + F + A

Equivalent code:
      push E
      push C
      push D
      mult
      push B
      add
      sub
      push F
      add
      push A
      add
      pop  A
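To see why E is pushed first in the zero-address sequence, it helps to trace the stack. The interpreter below is a minimal sketch: `run_stack`, the tuple encoding, and the sample memory values are assumptions for this example, and `sub` is taken to mean "first value popped minus second value popped", following the push(pop - pop) notation.

```python
def run_stack(program, memory):
    """Execute zero-address instructions on an operand stack."""
    stack = []
    for instr in program:
        op = instr[0]
        if op == "push":
            stack.append(memory[instr[1]])
        elif op == "pop":
            memory[instr[1]] = stack.pop()
        elif op in ("add", "sub", "mult"):
            first = stack.pop()    # top of stack
            second = stack.pop()   # element below it
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)
            else:
                stack.append(first * second)
        else:
            raise ValueError(f"unknown opcode: {op}")
    return memory


# A = B + C * D - E + F + A, using the zero-address instruction order
stack_program = [
    ("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
    ("push", "B"), ("add",), ("sub",),
    ("push", "F"), ("add",), ("push", "A"), ("add",), ("pop", "A"),
]
stack_memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
run_stack(stack_program, stack_memory)
```

Only push and pop name an address; the arithmetic instructions carry no operands at all, which is where the good code density of stack machines comes from.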

Load/Store Architecture

- Instructions expect operands in internal processor registers
- Special LOAD and STORE instructions move data between registers and memory
- RISC uses this architecture
- Reduces instruction length

Load/Store Architecture (cont.)

Sample instructions
      load  Rd,addr      ; Rd = [addr]
      store addr,Rs      ; (addr) = Rs
      add   Rd,Rs1,Rs2   ; Rd = Rs1 + Rs2
      sub   Rd,Rs1,Rs2   ; Rd = Rs1 - Rs2
      mult  Rd,Rs1,Rs2   ; Rd = Rs1 * Rs2

Number of Addresses (cont.)

Example

C statement
      A = B + C * D - E + F + A

Equivalent code:
      load  R1,B
      load  R2,C
      load  R3,D
      load  R4,E
      load  R5,F
      load  R6,A
      mult  R2,R2,R3
      add   R2,R2,R1
      sub   R2,R2,R4
      add   R2,R2,R5
      add   R2,R2,R6
      store A,R2
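The load/store version can be traced the same way as the other machine types. In the sketch below, which is only an illustration, `run_load_store`, the register names R1 to R6, and the sample memory values are assumptions; the point is that only `load` and `store` touch memory while arithmetic works purely on registers.

```python
def run_load_store(program, memory):
    """Execute a small load/store program; returns the register file."""
    regs = {}
    for op, *args in program:
        if op == "load":
            rd, addr = args
            regs[rd] = memory[addr]
        elif op == "store":
            addr, rs = args
            memory[addr] = regs[rs]
        elif op == "add":
            rd, rs1, rs2 = args
            regs[rd] = regs[rs1] + regs[rs2]
        elif op == "sub":
            rd, rs1, rs2 = args
            regs[rd] = regs[rs1] - regs[rs2]
        elif op == "mult":
            rd, rs1, rs2 = args
            regs[rd] = regs[rs1] * regs[rs2]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return regs


# A = B + C * D - E + F + A with all operands loaded into registers first
ls_program = [
    ("load", "R1", "B"), ("load", "R2", "C"), ("load", "R3", "D"),
    ("load", "R4", "E"), ("load", "R5", "F"), ("load", "R6", "A"),
    ("mult", "R2", "R2", "R3"), ("add", "R2", "R2", "R1"),
    ("sub", "R2", "R2", "R4"), ("add", "R2", "R2", "R5"),
    ("add", "R2", "R2", "R6"), ("store", "A", "R2"),
]
ls_memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
ls_regs = run_load_store(ls_program, ls_memory)
```

The program is longer than the accumulator or stack versions, but memory is accessed exactly seven times (six loads and one store) regardless of how much arithmetic follows.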

Flow of Control

Default is sequential flow

Several instructions alter this default execution

Branches
- Unconditional
- Conditional
- Delayed branches

Procedure calls
- Delayed procedure calls

Flow of Control (cont.)

Branches

Unconditional
- Absolute address
- PC-relative
  - Target address is specified relative to PC contents
  - Relocatable code

Example: MIPS
- Absolute address
      j target
- PC-relative
      b target
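The link between PC-relative branches and relocatable code can be shown with simple address arithmetic. This is a deliberately simplified sketch: `branch_target_pc_relative` is a hypothetical name, the addresses are made up, and real instruction sets (MIPS included) typically measure the offset from the following instruction and scale it, which is ignored here.

```python
def branch_target_pc_relative(pc: int, offset: int) -> int:
    """Target of a PC-relative branch: the offset is added to the PC."""
    return pc + offset


# The same encoded offset works wherever the code happens to be loaded.
original = branch_target_pc_relative(pc=0x1000, offset=0x40)
relocated = branch_target_pc_relative(pc=0x5000, offset=0x40)
shift = relocated - original
```

Moving the code by 0x4000 moves the branch target by the same amount without changing the instruction, whereas an absolute target address would have to be patched.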

Flow of Control (cont.)

Branches

Conditional
- Jump is taken only if the condition is met

Two types
- Set-Then-Jump
  - Condition testing is separated from branching
  - Condition code registers are used to convey the condition test result
  - Condition code registers keep a record of the status of the last ALU operation, such as an overflow condition
- Example: Pentium code
      cmp AX,BX    ; compare AX and BX
      je  target   ; jump if equal

Flow of Control (cont.)

- Test-and-Jump
  - Single instruction performs condition testing and branching
- Example: MIPS instruction
      beq Rsrc1,Rsrc2,target
  Jumps to target if Rsrc1 = Rsrc2

Delayed branching
- Control is transferred after executing the instruction that follows the branch instruction
  - This instruction slot is called the delay slot
- Improves efficiency
- Highly pipelined RISC processors support delayed branching

Flow of Control (cont.)

Procedure calls
- Facilitate modular programming
- Require two pieces of information to return
  - End of procedure
    - Pentium uses the ret instruction
    - MIPS uses the jr instruction
  - Return address
    - In a (special) register: MIPS allows any general-purpose register
    - On the stack: Pentium

Flow of Control (cont.)

[Figure: delay slot]

Parameter Passing

Two basic techniques

Register-based (e.g., PowerPC, MIPS)
- Internal registers are used
  - Faster
  - Limit the number of parameters
  - Recursive procedure

Stack-based (e.g., Pentium)
- Stack is used
  - More general

2. Operand Types

Instructions support basic data types
- Characters
- Integers
- Floating-point

Instruction overload
- Same instruction for different data types
- Example: Pentium
      mov AL,address    ; loads an 8-bit value
      mov AX,address    ; loads a 16-bit value
      mov EAX,address   ; loads a 32-bit value

Operand Types (cont.)

Separate instructions
- Instructions specify the operand size
- Example: MIPS
      lb Rdest,address   ; loads a byte
      lh Rdest,address   ; loads a halfword (16 bits)
      lw Rdest,address   ; loads a word (32 bits)
      ld Rdest,address   ; loads a doubleword (64 bits)

Similar instructions exist for store.
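The effect of choosing an operand size can be imitated by reading 1, 2, 4 or 8 bytes from the same address. The sketch below is only an illustration: the helper name `load`, the big-endian byte order, and the sample memory contents are assumptions for this example rather than a description of any particular processor.

```python
def load(memory: bytes, address: int, size: int, signed: bool = True) -> int:
    """Read `size` bytes at `address` and interpret them as one integer."""
    chunk = memory[address:address + size]
    if len(chunk) != size:
        raise IndexError("load runs past the end of memory")
    return int.from_bytes(chunk, byteorder="big", signed=signed)


sample_memory = bytes([0xFF, 0xFE, 0x00, 0x01, 0x00, 0x00, 0x00, 0x2A])

byte_val = load(sample_memory, 0, 1)                 # byte-sized load
half_val = load(sample_memory, 0, 2)                 # halfword-sized load
word_val = load(sample_memory, 0, 4, signed=False)   # word-sized load
dword_val = load(sample_memory, 0, 8, signed=False)  # doubleword-sized load
```

The same starting address yields four different values, which is why an instruction set must convey the operand size either through the register name, as in the Pentium example, or through the opcode, as in the MIPS example.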

3 Addressing Modes3 Addressin

g Modes

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 77101

o the operands are specified

5perands can be in three places

$ 1egisters U 1egister addressing mode

$ Part of instruction U Constant

U Immediate addressing mode

U All processors support these to addressing modes

$ Memory U ifference beteen 1ISC and CISC

U CISC supports a large variety of addressing modes

U 1ISC follos load2store architecture

4 Instruction Types4 Instruction T

ypes

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 78101

Several types of instructions

ata movement$ Pentium6 mo1 destsrc

$ Some do not provide direct data movement instructions

$ Indirect data movement

add $dest$src6 $dest = $src+6

Arithmetic and 3ogical

$ Arithmetic U Integer and floating$point signed and unsigned U add subtract multiply divide

$ 3ogical U andB orB notB 7or

Instruction Types cont-Instruction T

ypes cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 79101

Condition code bits

S6 Sign bit (gt E F E $)

6 Vero bit (gt E non-ero E -ero)

$6 5verflo bit (gt E no overflo E overflo)

C6 Carry bit (gt E no carry E carry)

ltample6 Pentium

cm coutamp comare cout to amp

subtract amp rom cout

e taret um e0ual

Instruction Types cont-Instruction T

ypes cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 80101

Llo control and I25 instructions

$ 8ranch

$ Procedure call

$ Interrupts

I25 instructions$ Memory$mapped I25

U Most processors support memory$mapped I25

U 7o separate instructions for I25

$ Isolated I25 U Pentium supports isolated I25

U Separate I25 instructions

Ao7ort read from an IO ort

out o7ortA rte to an IO ort

5 Instruction 0ormats5 Instruction 0ormats

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 81101

To types

Lied$length$ 4sed by 1ISC processors

$ $bit 1ISC processors use $bits ide instructions U ltamples6 SPA1C MIPS PoerPC

ariable$length

$ 4sed by CISC processors

$ Memory operands need more bits to specify

5pcode

MaOor and eact operation

Examples of Instruction 0ormatsExam

ples of Instruction 0ormats

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 82101

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 83101

ISC e)uce) Instruction Set Computer 3

ersus

CISC Comple Instruction Set Computer3

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 84101

0

RISC s CISCRISC s CISC

The underlying philosophy of 1ISC machines is that asystem is better able to manage program eecutionhen the program consists of only a fe differentinstructions that are the same length and reJuire thesame number of cloc0 cycles to decode and eecute

1ISC systems access memory only ith eplicit loadand store instructions

In CISC systems many different 0inds of instructionsaccess memory ma0ing instruction length variableand fetch$decode$eecute time unpredictable

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 85101

The difference beteen CISC and 1ISC becomesevident through the basic computer performanceeJuation6

1ISC systems shorten eecution time by reducingthe cloc0 cycles per instruction

CISC systems improve performance by reducing thenumber of instructions per program

RISC s CISCRISC s CISC

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 86101

(

The simple instruction set of 1ISC machinesenables control units to be hardired for maimumspeed

The more comple$$ and variable$$ instruction set of

CISC machines reJuires microcode$based controlunits that interpret instructions as they are fetchedfrom memory This translation ta0es time

Dith fied$length instructions 1ISC lends itself topipelining and speculative eecution

RISC s CISCRISC s CISC

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 87101

mo1 a8 6 mo1 b8 6 mo1 c8

Be add a8 b8 loo Be

Consider the the program fragments6

The total cloc0 cycles for the CISC version might be6(amp mo1s c4cle) + ( mul 6 c4cles) = amp c4cles

Dhile the cloc0 cycles for the 1ISC version is6

( mo1s c4cle) + ( adds c4cle) + ( loos c4cle) = c4cles

Dith 1ISC cloc0 cycle being shorter 1ISC gives usmuch faster eecution speeds

mo1 a8 6 mo1 b8 mul b8 a8

CISC RISC

RISC s CISCRISC s CISC

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 88101

8ecause of their load$store ISAs 1ISC architecturesreJuire a large number of CP4 registers

These register provide fast access to data duringseJuential program eecution

They can also be employed to reduce the overheadtypically caused by passing parameters tosubprograms

Instead of pulling parameters off of a stac0 the

subprogram is directed to use a subset of registers

RISC s CISCRISC s CISC

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 89101

3

This is horegisters canbe overlappedin a 1ISCsystem

The currentindo pointer (CDP) pointsto the activeregister

indo

RISC s CISCRISC s CISC

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 90101

34

It is becoming increasingly difficult to distinguish1ISC architectures from CISC architectures

Some 1ISC systems provide more etravagantinstruction sets than some CISC systems

Some systems combine both approaches The folloing to slides summari-e the

characteristics that traditionally typify the differencesbeteen these to architectures

RISC s CISCRISC s CISC

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 91101

31

RISC Multiple reister sets4

Three operan)s perinstruction4

Parameter passinthrouh reister5in)o5s4

Sinle-ccle

instructions4 7ar)5ire)

control4

7ihl pipeline)4

CISC Sinle reister set4

ne or t5o reisteroperan)s per

instruction4 Parameter passin

throuh memor4

Multiple ccle

instructions4 Microproramme)

control4

(ess pipeline)4ontinued

RISC vs. CISC

RISC
- Simple instructions, few in number.
- Fixed-length instructions.
- Complexity in compiler.
- Only LOAD/STORE instructions access memory.
- Few addressing modes.

CISC
- Many complex instructions.
- Variable-length instructions.
- Complexity in microcode.
- Many instructions can access memory.
- Many addressing modes.

Summary

Instruction Set Design Issues

Instruction set design issues include:

- Where are operands stored?
  - registers, memory, stack, accumulator
- How many explicit operands are there?
  - 0, 1, 2, or 3
- How is the operand location specified?
  - register, immediate, indirect, ...
- What type and size of operands are supported?
  - byte, int, float, double, string, vector, ...
- What operations are supported?
  - add, sub, mul, move, compare, ...

More About General Purpose Registers

Why do almost all new architectures use GPRs?

- Registers are much faster than memory (even cache).
  - Register values are available immediately.
  - When memory isn't ready, the processor must wait ("stall").
- Registers are convenient for variable storage.
  - Compiler assigns some variables just to registers.
  - More compact code, since small fields specify registers (compared to memory addresses).

[Figure: memory hierarchy with Processor, Registers, Cache, Memory, Disk]

What Operations Are Needed

- Arithmetic and logical
  - Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
  - Logical operations: AND, OR, NOT
- Data transfer: copy, load, store
- Control: branch, jump, call, return
- Floating point: ADDF, MULF, DIVF (same as arithmetic, but usually take bigger operands)
- Decimal: ADDD, CONVERT
- String: move, compare, search
- Graphics: pixel and vertex, compression/decompression operations

Stack Architecture Pros and Cons

Pros
- Good code density (implicit top of stack).
- Low hardware requirements.
- Easy to write a simpler compiler for stack architectures.

Cons
- Stack becomes the bottleneck.
- Little ability for parallelism or pipelining.
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed.
- Difficult to write an optimizing compiler for stack architectures.

Accumulator Architecture Pros and Cons

Pros
- Very low hardware requirements.
- Easy to design and understand.

Cons
- Accumulator becomes the bottleneck.
- Little ability for parallelism or pipelining.
- High memory traffic.

Memory-Memory Architecture Pros and Cons

Pros
- Requires fewer instructions (especially if 3 operands).
- Easy to write compilers for (especially if 3 operands).

Cons
- Very high memory traffic (especially if 3 operands).
- Variable number of clocks per instruction.
- With two operands, more data movements are required.

Memory-Register Architecture Pros and Cons

Pros
- Some data can be accessed without loading first.
- Instruction format easy to encode.
- Good code density.

Cons
- Operands are not equivalent (poor orthogonality).
- Variable number of clocks per instruction.
- May limit number of registers.

Effect of Compiler Optimization

Measurements taken on MIPS.

[Figure: program and compiler optimization level]

- Level 0: non-optimized code
- Level 1: local optimization
- Level 2: global optimization, software pipelining
- Level 3: adds procedure integration

Compiler Support for Multimedia Instructions

- Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics).
- Intel added a new set of instructions called Streaming SIMD Extensions.
- A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer.
- Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations.
  - Strided addressing allows memory access in increments larger than one.
  - Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data.
- Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup.
- Such limited support for vector processing makes the use of vectorizing compiler optimizations unpopular and restricts their scope to hand-coded routines.

SIMD instructions on MMX and AltiVec tend to be solutions, not primitives.
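The two addressing styles mentioned above can be illustrated with a short Python sketch. Memory is modelled as a plain list, and the function names (`strided_load`, `gather`, `scatter`) are hypothetical; they stand in for what a vector unit would do in hardware and are not the instructions of any particular processor.

```python
def strided_load(memory, base, stride, count):
    """Load `count` elements starting at `base`, stepping by `stride`."""
    return [memory[base + i * stride] for i in range(count)]


def gather(memory, indices):
    """Load the elements whose addresses are held in an index vector."""
    return [memory[i] for i in indices]


def scatter(memory, indices, values):
    """Store each value at the address held in the matching index slot."""
    for i, v in zip(indices, values):
        memory[i] = v


memory = list(range(100, 116))            # 16 words holding 100..115
column = strided_load(memory, 1, 4, 3)    # every 4th word: [101, 105, 109]
picked = gather(memory, [0, 7, 3])        # arbitrary words: [100, 107, 103]
scatter(memory, [0, 7, 3], [-1, -2, -3])  # write back to the same places
```

A strided load fits regular layouts such as one column of a row-major matrix; gather/scatter fits sparse or irregular data, at the cost of keeping a vector of addresses.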

Starting a Program

[Figure: C program -> Compiler -> Assembly language program -> Assembler -> Object: machine language module (together with Object: library routine, machine language) -> Linker -> Executable: machine language program -> Loader -> Memory]

Linker
- Place code and data modules symbolically in memory.
- Determine the addresses of data and instruction labels.
- Patch both internal and external references.

Object files for Unix typically contain:
- Header: size and position of components
- Text segment: machine code
- Data segment: static and dynamic variables
- Relocation info: identifies absolute memory references
- Symbol table: names and locations of labels, procedures, and variables
- Debugging info: mapping of source to object code, breakpoints, etc.
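The three linker steps (place modules, determine label addresses, patch references) can be sketched as follows. The module layout used here, a dict with `name`, `size`, `symbols`, and `refs` keys, is a made-up format for illustration only and does not correspond to any real object-file format.

```python
def link(modules):
    """Toy linker: returns (symbol_table, patches).

    Each module is a dict with hypothetical keys:
      name    - module name
      size    - size of the module in bytes
      symbols - {label: offset of the label inside the module}
      refs    - [(offset of the reference inside the module, label)]
    """
    # Step 1: place the modules one after another in memory.
    base, address = {}, 0
    for m in modules:
        base[m["name"]] = address
        address += m["size"]

    # Step 2: determine the absolute address of every label.
    symbol_table = {}
    for m in modules:
        for label, offset in m["symbols"].items():
            if label in symbol_table:
                raise ValueError(f"duplicate symbol: {label}")
            symbol_table[label] = base[m["name"]] + offset

    # Step 3: patch internal and external references.
    patches = {}
    for m in modules:
        for offset, label in m["refs"]:
            patches[base[m["name"]] + offset] = symbol_table[label]
    return symbol_table, patches


main_obj = {"name": "main", "size": 16, "symbols": {"main": 0},
            "refs": [(4, "helper")]}
lib_obj = {"name": "lib", "size": 8, "symbols": {"helper": 0}, "refs": []}
symbols, patches = link([main_obj, lib_obj])
```

An undefined label shows up here as a `KeyError` in step 3, which corresponds to an "undefined reference" error in a real linker.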

Loading an Executable Program

[Figure: memory layout with Reserved, Text (pc), Static data ($gp), Dynamic data, and Stack ($sp) regions]

To load an executable, the operating system follows these steps:

- Reads the executable file header to determine the size of the text and data segments.
- Creates an address space large enough for the text and data.
- Copies the instructions and data from the executable file into memory.
- Copies the parameters (if any) to the main program onto the stack.
- Initializes the machine registers and sets the stack pointer to the first free location.
- Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program.

Instruction Set Design Issues

- Number of Addresses
- Flow of Control
- Operand Types
- Addressing Modes
- Instruction Types
- Instruction Formats

Number of Addresses

Four categories:

- 3-address machines
  - Two for the source operands and one for the result
- 2-address machines
  - One address doubles as source and result
- 1-address machines
  - Accumulator machines
  - Accumulator is used for one source and result
- 0-address machines
  - Stack machines
  - Operands are taken from the stack
  - Result goes onto the stack

Number of Addresses (cont.)

Three-address machines

- Two for the source operands, one for the result
- RISC processors use three addresses
- Sample instructions:

    add  dest,src1,src2    ; M(dest) = [src1] + [src2]
    sub  dest,src1,src2    ; M(dest) = [src1] - [src2]
    mult dest,src1,src2    ; M(dest) = [src1] * [src2]

Three addresses: Operand 1, Operand 2, Result

Example: a = b + c

Three-address instruction formats are not common because they require a relatively long instruction format to hold the three address references.

Number of Addresses (cont.)

Example

C statement:

    A = B + C * D - E + F + A

Equivalent code:

    mult T,C,D    ; T = C*D
    add  T,T,B    ; T = B + C*D
    sub  T,T,E    ; T = B + C*D - E
    add  T,T,F    ; T = B + C*D - E + F
    add  A,T,A    ; A = B + C*D - E + F + A
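To check that the five three-address instructions really compute the C statement, they can be run on a tiny interpreter. This is a sketch only: memory is a Python dict keyed by variable name, the function name `run_three_address` is made up, and the initial values are arbitrary test data rather than anything from the slides.

```python
def run_three_address(program, memory):
    """Execute (op, dest, src1, src2) tuples against a dict-based memory."""
    ops = {
        "add": lambda x, y: x + y,
        "sub": lambda x, y: x - y,
        "mult": lambda x, y: x * y,
    }
    for op, dest, src1, src2 in program:
        # M(dest) = [src1] op [src2]
        memory[dest] = ops[op](memory[src1], memory[src2])
    return memory


memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6, "T": 0}
expected = (memory["B"] + memory["C"] * memory["D"]
            - memory["E"] + memory["F"] + memory["A"])

program = [
    ("mult", "T", "C", "D"),
    ("add", "T", "T", "B"),
    ("sub", "T", "T", "E"),
    ("add", "T", "T", "F"),
    ("add", "A", "T", "A"),
]
run_three_address(program, memory)
```

With these test values, `memory["A"]` ends up equal to `expected`. Every instruction names three memory locations, which is the encoding cost this category pays for its short instruction count.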

Number of Addresses (cont.)

Two-address machines

- One address doubles (for source operand and result)
- Last example makes a case for it
  - Address T is used twice
- Sample instructions:

    load dest,src    ; M(dest) = [src]
    add  dest,src    ; M(dest) = [dest] + [src]
    sub  dest,src    ; M(dest) = [dest] - [src]
    mult dest,src    ; M(dest) = [dest] * [src]

Two addresses: one address doubles as operand and result.

Example: a = a + b

The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation.

Number of Addresses (cont.)

Example

C statement:

    A = B + C * D - E + F + A

Equivalent code:

    load T,C    ; T = C
    mult T,D    ; T = C*D
    add  T,B    ; T = B + C*D
    sub  T,E    ; T = B + C*D - E
    add  T,F    ; T = B + C*D - E + F
    add  A,T    ; A = B + C*D - E + F + A

Number of Addresses (cont.)

One-address machines

- Use a special set of registers called accumulators
  - Specify one source operand and receive the result
- Called accumulator machines
- Sample instructions:

    load  addr    ; accum = [addr]
    store addr    ; M[addr] = accum
    add   addr    ; accum = accum + [addr]
    sub   addr    ; accum = accum - [addr]
    mult  addr    ; accum = accum * [addr]

Number of Addresses (cont.)

Example

C statement:

    A = B + C * D - E + F + A

Equivalent code:

    load  C    ; load C into accum
    mult  D    ; accum = C*D
    add   B    ; accum = C*D + B
    sub   E    ; accum = B + C*D - E
    add   F    ; accum = B + C*D - E + F
    add   A    ; accum = B + C*D - E + F + A
    store A    ; store accum contents in A
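The accumulator version can be traced the same way. The sketch below assumes a single accumulator and a dict-based memory; `run_accumulator` is a hypothetical name and the initial values are arbitrary test data.

```python
def run_accumulator(program, memory):
    """Execute (op, addr) tuples on a one-address (accumulator) machine."""
    accum = 0
    for op, addr in program:
        if op == "load":
            accum = memory[addr]
        elif op == "store":
            memory[addr] = accum
        elif op == "add":
            accum = accum + memory[addr]
        elif op == "sub":
            accum = accum - memory[addr]
        elif op == "mult":
            accum = accum * memory[addr]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory


memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
expected = (memory["B"] + memory["C"] * memory["D"]
            - memory["E"] + memory["F"] + memory["A"])

program = [
    ("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
    ("add", "F"), ("add", "A"), ("store", "A"),
]
run_accumulator(program, memory)
```

Each instruction carries only one address, but all seven of them touch memory, which is the "high memory traffic" drawback listed for accumulator architectures.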

Number of Addresses (cont.)

Zero-address machines

- Stack supplies operands and receives the result
  - Special instructions to load and store use an address
- Called stack machines (Ex: HP3000, Burroughs B5500)
- Sample instructions:

    push addr    ; push([addr])
    pop  addr    ; pop([addr])
    add          ; push(pop + pop)
    sub          ; push(pop - pop)
    mult         ; push(pop * pop)

Number of Addresses (cont.)

Example

C statement:

    A = B + C * D - E + F + A

Equivalent code:

    push E
    push C
    push D
    mult
    push B
    add
    sub
    push F
    add
    push A
    add
    pop  A
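The stack program is the least obvious of the four, so it is worth tracing. The sketch below follows the slide's definition `sub = push(pop - pop)`, meaning the first value popped is the left operand. The function name `run_stack` and the initial values are illustrative assumptions.

```python
def run_stack(program, memory):
    """Execute a zero-address (stack machine) program."""
    stack = []
    for instr in program:
        op = instr[0]
        if op == "push":
            stack.append(memory[instr[1]])
        elif op == "pop":
            memory[instr[1]] = stack.pop()
        else:
            first = stack.pop()    # top of stack
            second = stack.pop()   # next element down
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)   # push(pop - pop)
            elif op == "mult":
                stack.append(first * second)
            else:
                raise ValueError(f"unknown instruction: {op}")
    return memory


memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
expected = (memory["B"] + memory["C"] * memory["D"]
            - memory["E"] + memory["F"] + memory["A"])

program = [
    ("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
    ("push", "B"), ("add",), ("sub",),
    ("push", "F"), ("add",), ("push", "A"), ("add",), ("pop", "A"),
]
run_stack(program, memory)
```

E is pushed first so that it sits underneath `B + C*D` when `sub` executes; with the operand order above, `sub` then yields `(B + C*D) - E`.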

Load/Store Architecture

- Instructions expect operands in internal processor registers
- Special LOAD and STORE instructions move data between registers and memory
- RISC uses this architecture
- Reduces instruction length

Load/Store Architecture (cont.)

Sample instructions:

    load  Rd,addr       ; Rd = [addr]
    store addr,Rs       ; (addr) = Rs
    add   Rd,Rs1,Rs2    ; Rd = Rs1 + Rs2
    sub   Rd,Rs1,Rs2    ; Rd = Rs1 - Rs2
    mult  Rd,Rs1,Rs2    ; Rd = Rs1 * Rs2

Number of Addresses (cont.)

Example

C statement:

    A = B + C * D - E + F + A

Equivalent code:

    load  R1,B
    load  R2,C
    load  R3,D
    load  R4,E
    load  R5,F
    load  R6,A
    mult  R2,R2,R3
    add   R2,R2,R1
    sub   R2,R2,R4
    add   R2,R2,R5
    add   R2,R2,R6
    store A,R2

Flow of Control

- Default is sequential flow
- Several instructions alter this default execution
  - Branches
    - Unconditional
    - Conditional
    - Delayed branches
  - Procedure calls
    - Delayed procedure calls

Flow of Control (cont.)

Branches

- Unconditional
  - Absolute address
  - PC-relative
    - Target address is specified relative to PC contents
    - Relocatable code
- Example: MIPS
  - Absolute address

      j target

  - PC-relative

      b target
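The difference between the two forms shows up when the target address is computed. The sketch below uses the classic 32-bit MIPS rules: a branch holds a signed 16-bit word offset added to the address of the following instruction, and `j` holds a 26-bit word index that replaces the low 28 bits of that address. The helper names are hypothetical.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def pc_relative_target(pc, offset_field):
    """Branch target: address of next instruction plus offset (in words)."""
    return (pc + 4 + (sign_extend(offset_field, 16) << 2)) & 0xFFFFFFFF


def absolute_jump_target(pc, index_field):
    """Jump target: top 4 bits of next PC, then the 26-bit index as words."""
    return ((pc + 4) & 0xF0000000) | ((index_field & 0x3FFFFFF) << 2)


# The same branch encoding works wherever the code is loaded (relocatable):
t1 = pc_relative_target(0x00400000, 3) - 0x00400000
t2 = pc_relative_target(0x10000000, 3) - 0x10000000

# The jump encoding names a fixed location regardless of where it sits:
j1 = absolute_jump_target(0x00400000, 0x100008)
j2 = absolute_jump_target(0x00400100, 0x100008)
```

Here `t1` and `t2` are the same displacement from two different load addresses, while `j1` and `j2` are the same absolute address, which is why PC-relative code can be moved without patching.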


Flow of Control (cont.)

Branches

- Conditional
  - Jump is taken only if the condition is met
- Two types
  - Set-Then-Jump
    - Condition testing is separated from branching
    - Condition code registers are used to convey the condition test result
    - Condition code registers keep a record of the status of the last ALU operation, such as the overflow condition
    - Example: Pentium code

        cmp AX,BX     ; compare AX and BX
        je  target    ; jump if equal

Flow of Control (cont.)

  - Test-and-Jump
    - Single instruction performs condition testing and branching
    - Example: MIPS instruction

        beq Rsrc1,Rsrc2,target

      Jumps to target if Rsrc1 = Rsrc2

Delayed branching

- Control is transferred after executing the instruction that follows the branch instruction
  - This instruction slot is called the delay slot
- Improves efficiency
- Highly pipelined RISC processors support it

Flow of Control (cont.)

Procedure calls

- Facilitate modular programming
- Require two pieces of information to return
  - End of procedure
    - Pentium uses the ret instruction
    - MIPS uses the jr instruction
  - Return address
    - In a (special) register: MIPS allows any general-purpose register
    - On the stack: Pentium

[Figure: delay slot]

Parameter Passing

Two basic techniques:

- Register-based (e.g., PowerPC, MIPS)
  - Internal registers are used
    - Faster
    - Limits the number of parameters
    - Recursive procedures are a problem
- Stack-based (e.g., Pentium)
  - Stack is used
    - More general

Operand Types

- Instructions support basic data types
  - Characters
  - Integers
  - Floating-point
- Instruction overload
  - Same instruction for different data types
  - Example: Pentium

      mov AL,address     ; loads an 8-bit value
      mov AX,address     ; loads a 16-bit value
      mov EAX,address    ; loads a 32-bit value

Operand Types (cont.)

- Separate instructions
  - Instructions specify the operand size
  - Example: MIPS

      lb Rdest,address    ; loads a byte
      lh Rdest,address    ; loads a halfword (16 bits)
      lw Rdest,address    ; loads a word (32 bits)
      ld Rdest,address    ; loads a doubleword (64 bits)

  - Similar instructions exist for store

Addressing Modes

How the operands are specified

- Operands can be in three places
  - Registers
    - Register addressing mode
  - Part of instruction
    - Constant
    - Immediate addressing mode
    - All processors support these two addressing modes
  - Memory
    - Difference between RISC and CISC
    - CISC supports a large variety of addressing modes
    - RISC follows load/store architecture

Instruction Types

Several types of instructions:

- Data movement
  - Pentium: mov dest,src
  - Some do not provide direct data movement instructions
  - Indirect data movement

      add Rdest,Rsrc,0    ; Rdest = Rsrc + 0

- Arithmetic and logical
  - Arithmetic
    - Integer and floating-point, signed and unsigned
    - add, subtract, multiply, divide
  - Logical
    - and, or, not, xor

Instruction Types (cont.)

- Condition code bits
  - S: Sign bit (0 = +, 1 = -)
  - Z: Zero bit (0 = nonzero, 1 = zero)
  - O: Overflow bit (0 = no overflow, 1 = overflow)
  - C: Carry bit (0 = no carry, 1 = carry)
- Example: Pentium

    cmp count,25    ; compare count to 25
                    ; (subtract 25 from count)
    je  target      ; jump if equal
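How a compare sets the four condition code bits can be sketched as below. The function models a subtraction on fixed-width two's-complement values; it follows the usual x86 convention that the carry bit acts as a borrow after a subtract, which is an assumption about the slide's intent. The names `compare` and `jump_if_equal` are made up.

```python
def compare(a, b, bits=32):
    """Compute a - b on `bits`-wide values and return the condition codes."""
    mask = (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "S": int(bool(result & sign_bit)),   # sign of the result
        "Z": int(result == 0),               # result is zero
        "C": int(a < b),                     # unsigned borrow
        # Signed overflow: operands differ in sign and the result's sign
        # differs from the first operand's sign.
        "O": int(bool((a ^ b) & (a ^ result) & sign_bit)),
    }


def jump_if_equal(flags):
    """Set-then-jump: the branch only looks at the flags, not the operands."""
    return flags["Z"] == 1


equal = compare(25, 25)
smaller = compare(24, 25)
overflow = compare(-2**31, 1)   # most negative value minus one
```

`equal` has Z set, so `jump_if_equal(equal)` is true; `smaller` has S and C set; `overflow` has O set because the true result does not fit in 32 signed bits.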

Instruction Types (cont.)

- Flow control and I/O instructions
  - Branch
  - Procedure call
  - Interrupts
- I/O instructions
  - Memory-mapped I/O
    - Most processors support memory-mapped I/O
    - No separate instructions for I/O
  - Isolated I/O
    - Pentium supports isolated I/O
    - Separate I/O instructions

        in  AX,io_port    ; read from an I/O port
        out io_port,AX    ; write to an I/O port

Instruction Formats

Two types:

- Fixed-length
  - Used by RISC processors
  - 32-bit RISC processors use 32-bit-wide instructions
    - Examples: SPARC, MIPS, PowerPC
- Variable-length
  - Used by CISC processors
  - Memory operands need more bits to specify

Opcode: major and exact operation
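As a concrete instance of a fixed-length format, the sketch below packs a MIPS R-type instruction into one 32-bit word. The field widths (6, 5, 5, 5, 5, 6 bits for opcode, rs, rt, rd, shamt, funct) are the standard MIPS32 layout; the function name `encode_r_type` is a hypothetical helper.

```python
def encode_r_type(opcode, rs, rt, rd, shamt, funct):
    """Pack the six R-type fields into a single 32-bit instruction word."""
    fields = [(opcode, 6), (rs, 5), (rt, 5), (rd, 5), (shamt, 5), (funct, 6)]
    word = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"value {value} does not fit in {width} bits")
        word = (word << width) | value
    return word


# add $t0, $t1, $t2: opcode 0, rs = $t1 (9), rt = $t2 (10), rd = $t0 (8),
# shamt 0, funct 0x20.
word = encode_r_type(0, 9, 10, 8, 0, 0x20)
```

Because every field sits at a fixed bit position, the decoder can extract register numbers before it knows which operation is being performed. A variable-length format cannot do this, since the position of later fields depends on earlier ones.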

Examples of Instruction Formats

[Figure: examples of instruction formats]

RISC (Reduced Instruction Set Computer)
versus
CISC (Complex Instruction Set Computer)

RISC vs. CISC

The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.

RISC systems access memory only with explicit load and store instructions.

In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.

RISC vs. CISC

The difference between CISC and RISC becomes evident through the basic computer performance equation:

    time/program = time/cycle x cycles/instruction x instructions/program

RISC systems shorten execution time by reducing the clock cycles per instruction.

CISC systems improve performance by reducing the number of instructions per program.
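The trade-off can be made concrete by treating execution time as the product of instruction count, cycles per instruction, and cycle time. The numbers below are invented purely for illustration and are not measurements of any real processor.

```python
def execution_time_ns(instructions, cycles_per_instruction, cycle_time_ns):
    """time/program = instructions/program x cycles/instruction x time/cycle."""
    return instructions * cycles_per_instruction * cycle_time_ns


# Hypothetical CISC build: fewer instructions, more cycles for each one.
cisc_time = execution_time_ns(1_000_000, 4, 1)

# Hypothetical RISC build: 50% more instructions, half the cycles for each.
risc_time = execution_time_ns(1_500_000, 2, 1)

speedup = cisc_time / risc_time
```

With these made-up figures the RISC build is faster even though it executes more instructions, because the drop in cycles per instruction outweighs the growth in instruction count. Different figures could just as easily favor the CISC build, which is why all three terms have to be considered together.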

RISC vs. CISC

The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.

The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.

With fixed-length instructions, RISC lends itself to pipelining and speculative execution.

RISC vs. CISC

Consider the two program fragments:

CISC:

    mov ax, 10
    mov bx, 5
    mul bx, ax

RISC:

           mov ax, 0
           mov bx, 10
           mov cx, 5
    Begin: add ax, bx
           loop Begin

The total clock cycles for the CISC version might be:

    (2 movs x 1 cycle) + (1 mul x 30 cycles) = 32 cycles

While the clock cycles for the RISC version are:

    (3 movs x 1 cycle) + (5 adds x 1 cycle) + (5 loops x 1 cycle) = 13 cycles

With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds.

RISC vs. CISC

Because of their load-store ISAs, RISC architectures require a large number of CPU registers.

These registers provide fast access to data during sequential program execution.

They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.

Instead of pulling parameters off a stack, the subprogram is directed to use a subset of registers.


Page 49: Chapter 2: Advanced computer Architecture

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 49101

IntelQs MM and PoerPC Altiec have small vector processing capabilitiestargeting Multimedia applications (to speed up graphics)

Intel added ne set of instructions called Streaming SIM lttension

A maOor advantage of vector computers is hiding latency of memory accessby loading multiple elements and then overlapping eecution ith data

transfer

ector computers typically have strided and2or gather2scatter addressing to

perform operations on distant memory locations Strided addressing allos memory access in increment larger than one

ather2scatter addressing is similar to register indirect mode here theaddress are stored instead of the data

Supporting vector operation ithout strided addressing such as IntelQs MMlimits the potential speedup

Such limited support for vector processing ma0es the use of vectori-ing compiler optimi-ation unpopular and restrict its scope to hand coded routines

Compiler Support for Multimedia Instramp

SIM instructions on MM and Altiec tend to be solutions not primitivesSIM instructions on MM and Altiec tend to be solutions not primitives

Starting a Program

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 50101

Starting a Program

A s s e m b l e r

A s s e m b l y l a n g u a g e p r o g r a m

C o m p i l e r

C p r o g r a m

3 i n 0 e r

lt e c u t a b l e 6 M a c h i n e l a n g u a g e p r o g r a m

3 o a d e r

M e m o r y

5 b O e c t 6 M a c h i n e l a n g u a g e m o d u l e 5 b O e c t 6 3 i b r a r y r o u t i n e ( m a c h i n e l a n g u a g e )

$ Place code data modules

symbolically in memory

$etermine the address of data instruction labels

$Patch both internal eternal ref

$ Place code data modules

symbolically in memory

$etermine the address of data instruction labels

$Patch both internal eternal ref

5bOect files for 4ni typically contains6

eader6 si-e position of components

Tet segment6 machine code

ata segment6 static and dynamic variables1elocation info6 identify absolute memory ref

Symbol table6 name location of labelsprocedures and variables

ebugging info6 mapping source to obOectcode brea0 points etc

5inker

Loading Executable Program

[Figure: MIPS memory layout, top to bottom]
$sp -> 7fff ffff hex    Stack
                        Dynamic data
$gp -> 1000 8000 hex
       1000 0000 hex    Static data
pc  -> 0040 0000 hex    Text
       0                Reserved

To load an executable, the operating system follows these steps:
- Reads the executable file header to determine the size of the text and data segments
- Creates an address space large enough for the text and data
- Copies the instructions and data from the executable file into memory
- Copies the parameters (if any) to the main program onto the stack
- Initializes the machine registers and sets the stack pointer to the first free location
- Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program

Instruction Set Design Issues

- Number of Addresses
- Flow of Control
- Operand Types
- Addressing Modes
- Instruction Types
- Instruction Formats

Number of Addresses

Four categories:
- 3-address machines: two for the source operands and one for the result
- 2-address machines: one address doubles as source and result
- 1-address machines (accumulator machines): the accumulator is used for one source and the result
- 0-address machines (stack machines): operands are taken from the stack, and the result goes onto the stack

Number of Addresses (cont.)

Three-address machines
- Two for the source operands, one for the result
- RISC processors use three addresses

Sample instructions
add  dest,src1,src2    ; M(dest) = [src1] + [src2]
sub  dest,src1,src2    ; M(dest) = [src1] - [src2]
mult dest,src1,src2    ; M(dest) = [src1] * [src2]

Three addresses: Operand 1, Operand 2, Result
Example: a = b + c

Three-address instruction formats that name memory locations are not common, because they require a relatively long instruction format to hold the three address references. RISC processors keep the three-operand form affordable by naming registers, which need only a few bits each.

Number of Addresses (cont.)

Example
C statement:
A = B + C * D - E + F + A

Equivalent code:
mult T,C,D    ; T = C*D
add  T,T,B    ; T = B + C*D
sub  T,T,E    ; T = B + C*D - E
add  T,T,F    ; T = B + C*D - E + F
add  A,T,A    ; A = B + C*D - E + F + A
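To check that the three-address sequence really computes A = B + C*D - E + F + A, it can be run through a tiny interpreter. This is a sketch only: the tuple encoding and the initial values of A through F are assumptions chosen for the example.

```python
def run_three_address(program, memory):
    """Execute (op, dest, src1, src2) tuples against named memory cells."""
    ops = {
        "add": lambda x, y: x + y,
        "sub": lambda x, y: x - y,
        "mult": lambda x, y: x * y,
    }
    for op, dest, src1, src2 in program:
        memory[dest] = ops[op](memory[src1], memory[src2])
    return memory


# Made-up initial values; T is the temporary used by the slide's code.
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6, "T": 0}
program = [
    ("mult", "T", "C", "D"),
    ("add", "T", "T", "B"),
    ("sub", "T", "T", "E"),
    ("add", "T", "T", "F"),
    ("add", "A", "T", "A"),
]
run_three_address(program, memory)
```

With these values, B + C*D - E + F + A is 2 + 12 - 5 + 6 + 1, so `memory["A"]` ends up as 16 after five instructions.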

Number of Addresses (cont.)

Two-address machines
- One address doubles (for a source operand and the result)
- The last example makes a case for it: address T is used twice

Sample instructions
load dest,src    ; M(dest) = [src]
add  dest,src    ; M(dest) = [dest] + [src]
sub  dest,src    ; M(dest) = [dest] - [src]
mult dest,src    ; M(dest) = [dest] * [src]

Two addresses: one address doubles as operand and result
Example: a = a + b

The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation.

Number of Addresses (cont.)

Example
C statement:
A = B + C * D - E + F + A

Equivalent code:
load T,C    ; T = C
mult T,D    ; T = C*D
add  T,B    ; T = B + C*D
sub  T,E    ; T = B + C*D - E
add  T,F    ; T = B + C*D - E + F
add  A,T    ; A = B + C*D - E + F + A
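The two-address sequence can be checked the same way. In this sketch the destination is read as well as written, which is why the extra `load` is needed to seed T; the tuple encoding and the initial values are assumptions for illustration.

```python
def run_two_address(program, memory):
    """Execute (op, dest, src) tuples; dest doubles as a source operand."""
    for op, dest, src in program:
        if op == "load":
            memory[dest] = memory[src]
        elif op == "add":
            memory[dest] = memory[dest] + memory[src]
        elif op == "sub":
            memory[dest] = memory[dest] - memory[src]
        elif op == "mult":
            memory[dest] = memory[dest] * memory[src]
        else:
            raise ValueError(f"unknown op: {op}")
    return memory


# Made-up initial values, chosen only for the example.
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6, "T": 0}
program = [
    ("load", "T", "C"),
    ("mult", "T", "D"),
    ("add", "T", "B"),
    ("sub", "T", "E"),
    ("add", "T", "F"),
    ("add", "A", "T"),
]
run_two_address(program, memory)
```

The result is the same as with three addresses, at the cost of one more instruction (six instead of five), each of which is shorter.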

Number of Addresses (cont.)

One-address machines
- Use a special set of registers called accumulators
  - The accumulator supplies one source operand and receives the result
- Called accumulator machines

Sample instructions
load  addr    ; accum = [addr]
store addr    ; M[addr] = accum
add   addr    ; accum = accum + [addr]
sub   addr    ; accum = accum - [addr]
mult  addr    ; accum = accum * [addr]

Number of Addresses (cont.)

Example
C statement:
A = B + C * D - E + F + A

Equivalent code:
load  C    ; load C into accum
mult  D    ; accum = C*D
add   B    ; accum = C*D + B
sub   E    ; accum = B + C*D - E
add   F    ; accum = B + C*D - E + F
add   A    ; accum = B + C*D - E + F + A
store A    ; store accum contents in A
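A minimal accumulator-machine sketch for the same statement follows. The single implicit register is modelled as a local variable; the tuple encoding and initial values are again assumptions made for the example.

```python
def run_accumulator(program, memory):
    """Execute (op, addr) pairs; every ALU op uses the implicit accumulator."""
    accum = 0
    for op, addr in program:
        if op == "load":
            accum = memory[addr]
        elif op == "store":
            memory[addr] = accum
        elif op == "add":
            accum += memory[addr]
        elif op == "sub":
            accum -= memory[addr]
        elif op == "mult":
            accum *= memory[addr]
        else:
            raise ValueError(f"unknown op: {op}")
    return accum


# Made-up initial values, chosen only for the example.
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
program = [
    ("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
    ("add", "F"), ("add", "A"), ("store", "A"),
]
final_accum = run_accumulator(program, memory)
```

Every instruction touches memory once here, which is one way to see why accumulator designs are associated with high memory traffic.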

Number of Addresses (cont.)

Zero-address machines
- The stack supplies operands and receives the result
  - Special instructions to load and store use an address
- Called stack machines (Ex: HP3000, Burroughs B5500)

Sample instructions
push addr    ; push([addr])
pop  addr    ; pop([addr])
add          ; push(pop + pop)
sub          ; push(pop - pop)
mult         ; push(pop * pop)

Number of Addresses (cont.)

Example
C statement:
A = B + C * D - E + F + A

Equivalent code:
push E
push C
push D
mult
push B
add
sub
push F
add
push A
add
pop  A
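The stack sequence depends on operand order, so it is worth tracing. The sketch below assumes that `sub` computes (first popped) - (second popped), matching the notation push(pop - pop); the tuple encoding and the initial values are illustrative assumptions.

```python
def run_stack(program, memory):
    """Execute zero-address code; only push and pop carry an address."""
    stack = []
    for instr in program:
        op = instr[0]
        if op == "push":
            stack.append(memory[instr[1]])
        elif op == "pop":
            memory[instr[1]] = stack.pop()
        elif op == "add":
            stack.append(stack.pop() + stack.pop())
        elif op == "sub":
            top = stack.pop()               # first pop is the left operand
            stack.append(top - stack.pop())
        elif op == "mult":
            stack.append(stack.pop() * stack.pop())
        else:
            raise ValueError(f"unknown op: {op}")
    return memory


# Made-up initial values, chosen only for the example.
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
program = [
    ("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
    ("push", "B"), ("add",), ("sub",), ("push", "F"),
    ("add",), ("push", "A"), ("add",), ("pop", "A"),
]
run_stack(program, memory)
```

E is pushed first so that it sits underneath B + C*D when `sub` executes; under the assumed operand order this yields (B + C*D) - E.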

Load/Store Architecture

- Instructions expect operands in internal processor registers
- Special LOAD and STORE instructions move data between registers and memory
- RISC uses this architecture
- Reduces instruction length

Load/Store Architecture (cont.)

Sample instructions
load  Rd,addr       ; Rd = [addr]
store addr,Rs       ; (addr) = Rs
add   Rd,Rs1,Rs2    ; Rd = Rs1 + Rs2
sub   Rd,Rs1,Rs2    ; Rd = Rs1 - Rs2
mult  Rd,Rs1,Rs2    ; Rd = Rs1 * Rs2

Number of Addresses (cont.)

Example
C statement:
A = B + C * D - E + F + A

Equivalent code:
load  R1,B
load  R2,C
load  R3,D
load  R4,E
load  R5,F
load  R6,A
mult  R2,R2,R3
add   R2,R2,R1
sub   R2,R2,R4
add   R2,R2,R5
add   R2,R2,R6
store A,R2
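The load/store version uses the most instructions but the fewest memory accesses. The sketch below counts both; the register names follow the listing above, while the tuple encoding and the initial values are assumptions made for the example.

```python
def run_load_store(program, memory):
    """Run load/store-style code; return (registers, memory access count)."""
    regs, accesses = {}, 0
    alu = {
        "add": lambda x, y: x + y,
        "sub": lambda x, y: x - y,
        "mult": lambda x, y: x * y,
    }
    for instr in program:
        op = instr[0]
        if op == "load":                 # load Rd,addr
            _, rd, addr = instr
            regs[rd] = memory[addr]
            accesses += 1
        elif op == "store":              # store addr,Rs
            _, addr, rs = instr
            memory[addr] = regs[rs]
            accesses += 1
        else:                            # ALU ops touch registers only
            _, rd, rs1, rs2 = instr
            regs[rd] = alu[op](regs[rs1], regs[rs2])
    return regs, accesses


# Made-up initial values, chosen only for the example.
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
program = [
    ("load", "R1", "B"), ("load", "R2", "C"), ("load", "R3", "D"),
    ("load", "R4", "E"), ("load", "R5", "F"), ("load", "R6", "A"),
    ("mult", "R2", "R2", "R3"), ("add", "R2", "R2", "R1"),
    ("sub", "R2", "R2", "R4"), ("add", "R2", "R2", "R5"),
    ("add", "R2", "R2", "R6"), ("store", "A", "R2"),
]
regs, accesses = run_load_store(program, memory)
```

Twelve instructions run, but only seven of them (six loads and one store) reach memory; the arithmetic works entirely on registers.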

Flow of Control

Default is sequential flow
Several instructions alter this default execution:
- Branches
  - Unconditional
  - Conditional
  - Delayed branches
- Procedure calls
  - Delayed procedure calls

Flow of Control (cont.)

Branches
Unconditional
- Absolute address
- PC-relative
  - Target address is specified relative to PC contents
  - Relocatable code

Example: MIPS
- Absolute address
  j target
- PC-relative
  b target


Flow of Control (cont.)

Branches
Conditional
- Jump is taken only if the condition is met

Two types
- Set-Then-Jump
  - Condition testing is separated from branching
  - Condition code registers are used to convey the condition test result
  - Condition code registers keep a record of the status of the last ALU operation, such as an overflow condition
- Example: Pentium code
  cmp AX,BX    ; compare AX and BX
  je  target   ; jump if equal

Flow of Control (cont.)

- Test-and-Jump
  - Single instruction performs condition testing and branching
- Example: MIPS instruction
  beq Rsrc1,Rsrc2,target
  Jumps to target if Rsrc1 = Rsrc2

Delayed branching
- Control is transferred after executing the instruction that follows the branch instruction
  - This instruction slot is called the delay slot
- Improves efficiency
- Highly pipelined RISC processors support it
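The effect of a delay slot on execution order can be shown with a small trace. In this sketch a program is a list in which plain strings are ordinary instructions and a `("b", index)` tuple is an unconditional branch; this encoding is an assumption for the example, and a branch placed inside a delay slot is deliberately not handled.

```python
def run(program, delayed):
    """Return the order in which instructions execute."""
    trace, pc, pending = [], 0, None
    while pc < len(program):
        instr = program[pc]
        is_branch = isinstance(instr, tuple)
        trace.append("b" if is_branch else instr)
        next_pc = pc + 1
        if pending is not None:
            # The instruction just executed was the delay slot;
            # only now does control transfer to the branch target.
            next_pc, pending = pending, None
        elif is_branch:
            if delayed:
                pending = instr[1]     # transfer after one more instruction
            else:
                next_pc = instr[1]     # transfer immediately
        pc = next_pc
    return trace


program = ["i0", ("b", 4), "i2", "i3", "i4"]
immediate = run(program, delayed=False)
with_slot = run(program, delayed=True)
```

With delayed branching, `i2` runs even though the branch is taken, so a compiler can place useful work there instead of leaving the pipeline idle.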

Flow of Control (cont.)

Procedure calls
- Facilitate modular programming
- Require two pieces of information to return:
  - End of procedure
    - Pentium uses the ret instruction
    - MIPS uses the jr instruction
  - Return address
    - In a (special) register: MIPS allows any general-purpose register
    - On the stack: Pentium

Flow of Control (cont.)

[Figure: delay slot]

Parameter Passing

Two basic techniques
- Register-based (e.g., PowerPC, MIPS)
  - Internal registers are used
    - Faster
    - Limits the number of parameters
    - Awkward for recursive procedures
- Stack-based (e.g., Pentium)
  - Stack is used
    - More general

2 Operand Types

Instructions support basic data types
- Characters
- Integers
- Floating-point

Instruction overload
- Same instruction for different data types
- Example: Pentium
  mov AL,address     ; loads an 8-bit value
  mov AX,address     ; loads a 16-bit value
  mov EAX,address    ; loads a 32-bit value

Operand Types (cont.)

Separate instructions
- Instructions specify the operand size
- Example: MIPS
  lb Rdest,address    ; loads a byte
  lh Rdest,address    ; loads a halfword (16 bits)
  lw Rdest,address    ; loads a word (32 bits)
  ld Rdest,address    ; loads a doubleword (64 bits)
- Similar instructions exist for store
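What the four load sizes return from the same address can be illustrated with Python's `int.from_bytes`. This is a sketch under two stated assumptions: little-endian byte order (MIPS can be configured either way) and sign extension of the loaded value; the helper name `load` is invented for the example.

```python
def load(memory, address, size, signed=True):
    """Load `size` bytes starting at `address` (little-endian assumed)."""
    chunk = memory[address:address + size]
    return int.from_bytes(chunk, "little", signed=signed)


# Eight bytes of memory: 0xFFFFFFF0 followed by four zero bytes.
memory = bytes([0xF0, 0xFF, 0xFF, 0xFF, 0x00, 0x00, 0x00, 0x00])

lb = load(memory, 0, 1)   # byte
lh = load(memory, 0, 2)   # halfword (16 bits)
lw = load(memory, 0, 4)   # word (32 bits)
ld = load(memory, 0, 8)   # doubleword (64 bits)
```

The first three loads see a negative number because the top bit of what they read is set; the doubleword load reads the zero bytes as well and therefore yields a large positive value.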

3 Addressing Modes

How the operands are specified
Operands can be in three places:
- Registers
  - Register addressing mode
- Part of instruction
  - Constant
  - Immediate addressing mode
  - All processors support these two addressing modes
- Memory
  - Difference between RISC and CISC
  - CISC supports a large variety of addressing modes
  - RISC follows load/store architecture

4 Instruction Types

Several types of instructions
- Data movement
  - Pentium: mov dest,src
  - Some do not provide direct data movement instructions
  - Indirect data movement
    add Rdest,Rsrc,0    ; Rdest = Rsrc + 0
- Arithmetic and Logical
  - Arithmetic
    - Integer and floating-point, signed and unsigned
    - add, subtract, multiply, divide
  - Logical
    - and, or, not, xor

Instruction Types (cont.)

Condition code bits
- S: Sign bit (0 = +, 1 = -)
- Z: Zero bit (0 = nonzero, 1 = zero)
- O: Overflow bit (0 = no overflow, 1 = overflow)
- C: Carry bit (0 = no carry, 1 = carry)

Example: Pentium
cmp count,25    ; compare count to 25 (subtract 25 from count)
je  target      ; jump if equal
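How a compare sets the four condition code bits can be sketched for a fixed operand width. The function below is an illustration, not a model of any particular processor: it assumes 8-bit operands by default and the x86 convention that the carry bit records a borrow on subtraction.

```python
def cmp_flags(a, b, bits=8):
    """Compute S, Z, O, C for a - b on `bits`-bit operands."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    s = int(bool(result & sign))          # sign bit of the result
    z = int(result == 0)                  # result is zero
    c = int(a < b)                        # unsigned borrow (assumed x86 style)
    # Signed overflow: the operands differ in sign and the result's
    # sign differs from the sign of the first operand.
    o = int(bool((a ^ b) & (a ^ result) & sign))
    return {"S": s, "Z": z, "O": o, "C": c}


equal = cmp_flags(25, 25)       # count == 25: a following je would jump
smaller = cmp_flags(10, 25)     # negative result, borrow needed
overflow = cmp_flags(0x80, 1)   # -128 - 1 does not fit in 8 signed bits
```

A `je` after the compare only needs to look at Z; the other bits serve signed and unsigned relational jumps.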

Instruction Types (cont.)

Flow control and I/O instructions
- Branch
- Procedure call
- Interrupts

I/O instructions
- Memory-mapped I/O
  - Most processors support memory-mapped I/O
  - No separate instructions for I/O
- Isolated I/O
  - Pentium supports isolated I/O
  - Separate I/O instructions
    in  AX,io_port    ; read from an I/O port
    out io_port,AX    ; write to an I/O port

5 Instruction Formats

Two types
- Fixed-length
  - Used by RISC processors
  - 32-bit RISC processors use 32-bit-wide instructions
    - Examples: SPARC, MIPS, PowerPC
- Variable-length
  - Used by CISC processors
  - Memory operands need more bits to specify

Opcode
- Major and exact operation
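A fixed-length format can be illustrated by packing the fields of a MIPS R-format instruction into one 32-bit word. The field widths (6/5/5/5/5/6 bits) and the `add` function code 0x20 are the standard MIPS32 values; the helper name `encode_r_type` is an assumption for the example.

```python
def encode_r_type(rs, rt, rd, shamt, funct, opcode=0):
    """Pack MIPS R-format fields into a 32-bit instruction word."""
    fields = (
        ("opcode", opcode, 6), ("rs", rs, 5), ("rt", rt, 5),
        ("rd", rd, 5), ("shamt", shamt, 5), ("funct", funct, 6),
    )
    for name, value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
    return (
        (opcode << 26) | (rs << 21) | (rt << 16)
        | (rd << 11) | (shamt << 6) | funct
    )


# add $t0, $s1, $s2  ->  rd = 8 ($t0), rs = 17 ($s1), rt = 18 ($s2)
word = encode_r_type(rs=17, rt=18, rd=8, shamt=0, funct=0x20)
```

The opcode field gives the major operation and, for R-format instructions, the function field selects the exact one. Each register needs only five bits, which is what lets three register operands fit in a fixed 32-bit word.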

Examples of Instruction Formats

RISC (Reduced Instruction Set Computer)
versus
CISC (Complex Instruction Set Computer)

RISC vs CISC

The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.

RISC systems access memory only with explicit load and store instructions.

In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.

RISC vs CISC

The difference between CISC and RISC becomes evident through the basic computer performance equation:

time/program = time/cycle x cycles/instruction x instructions/program

RISC systems shorten execution time by reducing the clock cycles per instruction.

CISC systems improve performance by reducing the number of instructions per program.
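The basic performance equation is commonly written as time per program = instructions per program x cycles per instruction x time per cycle. The sketch below evaluates it for two machines; every number in it is hypothetical and chosen only to show the trade-off, not measured from real processors.

```python
def cpu_time_ns(instructions, cycles_per_instruction, ns_per_cycle):
    """Execution time in nanoseconds from the three factors of the equation."""
    return instructions * cycles_per_instruction * ns_per_cycle


# Hypothetical CISC machine: fewer instructions, more cycles for each one.
cisc_ns = cpu_time_ns(instructions=1_000_000,
                      cycles_per_instruction=4,
                      ns_per_cycle=1)

# Hypothetical RISC machine: three times the instructions, one cycle each.
risc_ns = cpu_time_ns(instructions=3_000_000,
                      cycles_per_instruction=1,
                      ns_per_cycle=1)
```

With these made-up figures the RISC program runs more instructions and still finishes first. A different choice of numbers could reverse the outcome, which is why all three factors have to be considered together.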

RISC vs CISC

The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.

The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.

With fixed-length instructions, RISC lends itself to pipelining and speculative execution.

RISC vs CISC

Consider these program fragments:

CISC
mov ax, 10
mov bx, 5
mul bx, ax

RISC
mov ax, 0
mov bx, 10
mov cx, 5
Begin: add ax, bx
loop Begin

The total clock cycles for the CISC version might be:
(2 movs x 1 cycle) + (1 mul x 30 cycles) = 32 cycles

While the clock cycles for the RISC version are:
(3 movs x 1 cycle) + (5 adds x 1 cycle) + (5 loops x 1 cycle) = 13 cycles

With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds.
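The cycle comparison can be reproduced by simulating both fragments and adding up per-instruction costs. The sketch assumes the fragments multiply 10 by 5 and that `mov`, `add` and `loop` cost 1 cycle each while `mul` costs 30; these operand values and costs are assumptions for the illustration rather than figures for any specific processor.

```python
# Assumed per-instruction cycle costs.
CYCLES = {"mov": 1, "mul": 30, "add": 1, "loop": 1}


def run_cisc():
    """mov ax,10 / mov bx,5 / mul bx,ax"""
    ax, bx = 10, 5
    cycles = 2 * CYCLES["mov"]
    bx = bx * ax
    cycles += CYCLES["mul"]
    return bx, cycles


def run_risc():
    """mov ax,0 / mov bx,10 / mov cx,5 / Begin: add ax,bx / loop Begin"""
    ax, bx, cx = 0, 10, 5
    cycles = 3 * CYCLES["mov"]
    while True:
        ax += bx                  # add ax, bx
        cycles += CYCLES["add"]
        cx -= 1                   # loop decrements cx and repeats if nonzero
        cycles += CYCLES["loop"]
        if cx == 0:
            break
    return ax, cycles


cisc_result, cisc_cycles = run_cisc()
risc_result, risc_cycles = run_risc()
```

Both fragments compute the same product; the RISC version executes thirteen simple instructions instead of three, yet needs fewer cycles under the assumed costs.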

RISC vs CISC

Because of their load-store ISAs, RISC architectures require a large number of CPU registers.

These registers provide fast access to data during sequential program execution.

They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.

Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.

RISC vs CISC

This is how registers can be overlapped in a RISC system.

The current window pointer (CWP) points to the active register window.

RISC vs CISC

It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.

Some RISC systems provide more extravagant instruction sets than some CISC systems.

Some systems combine both approaches.

The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.

RISC vs CISC

RISC
- Multiple register sets.
- Three operands per instruction.
- Parameter passing through register windows.
- Single-cycle instructions.
- Hardwired control.
- Highly pipelined.

CISC
- Single register set.
- One or two register operands per instruction.
- Parameter passing through memory.
- Multiple-cycle instructions.
- Microprogrammed control.
- Less pipelined.

RISC vs CISC

RISC
- Simple instructions, few in number.
- Fixed-length instructions.
- Complexity in compiler.
- Only LOAD/STORE instructions access memory.
- Few addressing modes.

CISC
- Many complex instructions.
- Variable-length instructions.
- Complexity in microcode.
- Many instructions can access memory.
- Many addressing modes.

Summary

Instruction Set Design Issues

Instruction set design issues include:
Where are operands stored?
- registers, memory, stack, accumulator
How many explicit operands are there?
- 0, 1, 2, or 3
How is the operand location specified?
- register, immediate, indirect, ...
What type and size of operands are supported?
- byte, int, float, double, string, vector, ...
What operations are supported?
- add, sub, mul, move, compare, ...

More About General Purpose Registers

Why do almost all new architectures use GPRs?
Registers are much faster than memory (even cache)
- Register values are available immediately
- When memory isn't ready, the processor must wait ("stall")
Registers are convenient for variable storage
- Compiler assigns some variables just to registers
- More compact code, since small fields specify registers (compared to memory addresses)

[Figure: memory hierarchy: Processor, Registers, Cache, Memory, Disk]

What Operations are Needed

Arithmetic and Logical
- Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
- Logical operations: AND, OR, XOR, NOT
Data Transfer - copy, load, store
Control - branch, jump, call, return
Floating Point - ADDF, MULF, DIVF (same as arithmetic, but usually take bigger operands)
Decimal - ADDD, CONVERT
String - move, compare, search
Graphics - pixel and vertex, compression/decompression operations

Stack Architecture: Pros and Cons

Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simple compiler for stack architectures

Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures

Accumulator Architecture: Pros and Cons

Pros
- Very low hardware requirements
- Easy to design and understand

Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic

Memory-Memory Architecture: Pros and Cons

Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)

Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required

Memory-Register Architecture: Pros and Cons

Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density

Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers

Page 50: Chapter 2: Advanced computer Architecture

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 50101

Starting a Program

A s s e m b l e r

A s s e m b l y l a n g u a g e p r o g r a m

C o m p i l e r

C p r o g r a m

3 i n 0 e r

lt e c u t a b l e 6 M a c h i n e l a n g u a g e p r o g r a m

3 o a d e r

M e m o r y

5 b O e c t 6 M a c h i n e l a n g u a g e m o d u l e 5 b O e c t 6 3 i b r a r y r o u t i n e ( m a c h i n e l a n g u a g e )

$ Place code data modules

symbolically in memory

$etermine the address of data instruction labels

$Patch both internal eternal ref

$ Place code data modules

symbolically in memory

$etermine the address of data instruction labels

$Patch both internal eternal ref

5bOect files for 4ni typically contains6

eader6 si-e position of components

Tet segment6 machine code

ata segment6 static and dynamic variables1elocation info6 identify absolute memory ref

Symbol table6 name location of labelsprocedures and variables

ebugging info6 mapping source to obOectcode brea0 points etc

5inker

5oading 7ecuta8le Program

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 51101

R s p

R g p

gt gt amp gt gt gt gt gth e

gt

gt gt gt gt gt gt gt h e

T e t

S t a t i c d a t a

y n a m i c d a t a

S t a c 0B f f f f f f f

h e

gt gt gt = gt gt gth e

p c

1 e s e r v e d

5oading 7ecuta8le Program

To load an eecutable the operating systemfollos these steps6

1eads the eecutable file header todetermine the si-e of tet and data segments

Creates an address space large enough forthe tet and data

Copies the instructions and data from the

eecutable file into memory

Copies the parameters (if any) to the mainprogram onto the stac0

Initiali-es the machine registers and sets thestac0 pointer to the first free location

umps to a start$up routines that copies theparameters into the argument registers andcalls the main routine of the program

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 52101

Instruction Set Design IssuesInstruction Set Desi

gn Issues

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 53101

Instruction Set Design IssuesInstruction Set Design Issues

Instruction Set esign Issues 7umber of Addresses

Llo of Control

5perand Typesamp Addressing Modes

Instruction Types

Instruction Lormats

um+er of Addressesum+er of Addresses

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 54101

um+er of Addressesum+er of Addresses

Lour categories

$address machines$ for the source operands and one for the result

$address machines

$ 5ne address doubles as source and result

$address machine$ Accumulator machines

$ Accumulator is used for one source and result

gt$address machines

$ Stac0 machines

$ 5perands are ta0en from the stac0

$ 1esult goes onto the stac0

um+er of Addresses cont -um+er of Addresses

cont -

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 55101

um+er of Addresses cont-um+er of Addresses cont-

Three$address machines

To for the source operands one for the result

1ISC processors use three addresses

Sample instructions

add destsrc1src2

M(dest)=[src1]+[src2]

sub destsrc1src2

M(dest)=[src1]-[src2]

mult destsrc1src2

M(dest)=[src1][src2]

Three addresses

Operand 1 Operand 2 Result

Example a = b + c

Three-address instruction formats are not common because they reuire a

relatiely lon instruction format to hold the three address references

um+er of Addresses cont -um+er of Addresses

cont -

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 56101

um+er of Addresses cont-um+er of Addresses cont-

ltample

C statement

A C H D F 6 A

ltJuivalent code6

mult TCD T = CD

add TTB T = B+CD

sub TTE T = B+CD-E

add TTF T = B+CD-E+Fadd ATA A = B+CD-E+F+A

um+er of Addresses cont -um+er of Addresses

cont -

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 57101

um+er of Addresses cont-um+er of Addresses cont-

To$address machines

5ne address doubles (for source operand result)

3ast eample ma0es a case for it

$ Address T is used tice

Sample instructions

load destsrc M(dest)=[src]

add destsrc M(dest)=[dest]+[src]

sub destsrc M(dest)=[dest]-[src]

mult destsrc M(dest)=[dest][src]

Two Addresses

One address doubles as operand and resultExample a = a + b

The t$o-address formal reduces the space reuirement but also

introduces some a$$ardness To aoid alterin the alue of an

operand a ampOE instruction is used to moe one of the alues to a

result or temporary location before performin the operation

um+er of Addresses cont -um+er of Addresses

cont -

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 58101

um+er of Addresses cont-um+er of Addresses cont-

ltample

C statement

A C H D F 6 A

ltJuivalent code6

load TC T = C

mult TD T = CD

add TB T = B+CD

sub TE T = B+CD-Eadd TF T = B+CD-E+F

add AT A = B+CD-E+F+A

um+er of Addresses cont -um+er of Addresses

cont -

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 59101

um+er of Addresses cont-um+er of Addresses cont-

5ne$address machines 4se special set of registers called accumulators

$ Specify one source operand receive the result

Called accumulator machines

Sample instructions

load addr accum = [addr]

store addr M[addr] = accumadd addr accum = accum + [addr]

sub addr accum = accum - [addr]

mult addr accum = accum [addr]

um+er of Addresses cont -um+er of Addresses

cont -

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 60101

um+er of Addresses cont-um+er of Addresses cont-

ltample

C statementA C H D F 6 A

ltJuivalent code6

load C load C to accum

mult D accum = CD

add B accum = CD+B

sub E accum = B+CD-Eadd F accum = B+CD-E+F

add A accum = B+CD-E+F+A

store A store accum cotets A

um+er of Addresses cont -um+er of Addresses

cont -

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 61101

um+er of Addresses cont-um+er of Addresses cont-

Vero$address machines

Stac0 supplies operands and receives the result$ Special instructions to load and store use an address

Called stac0 machines (lt6 Pgtgtgt 8urroughs 8gtgt)

Sample instructions

us addr us([addr])

o addr o([addr])

add us(o + o)

sub us(o - o) mult us(o o)

um+er of Addresses cont -um+er of Addresses

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 62101

um+er of Addresses cont-um+er of Addresses cont-

ltample

C statement

A C H D F 6 A

ltJuivalent code6

us E sub

us C us F

us D add

Mult us A

us B add

add o A

)oadStore Architecture)oadStore Architecture

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 63101

)oadStore Architecture)oadStore Architecture

Instructions epect operands in internal processor registers Special 35A and ST51lt instructions move data beteen registers

and memory

1ISC uses this architecture

1educes instruction length

()

)oadStore Architecture cont-)oadStore Architecture

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 64101

)oadStore Architecture cont-)oadStore Architecture cont-

Sample instructionsload $daddr $d = [addr]

store addr$s (addr) = $s

add $d$s$samp $d = $s + $sampsub $d$s$samp $d = $s - $samp

mult $d$s$samp $d = $s $samp

um+er of Addresses cont-um+er of Addresses

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 65101

um+er of Addresses cont-um+er of Addresses cont-

ampleC statement

A = B + C D E + F + A

1uialent co)eload $B mult $amp$amp$

load $ampC add $amp$amp$

load $D sub $amp$amp$

load $E add $amp$amp$

load $F add $amp$amp$

load $A store A$amp

0lo1 of Control 0lo1 of Control

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 66101

0lo1 of Control 0lo1 of Control

efault is seJuential flo

Several instructions alter this defaulteecution

8ranches$ 4nconditional

$ Conditional

$ elayed branches Procedure calls

$ elayed procedure calls

0lo1 of Control cont-0lo1 of Control

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 67101

0lo1 of Control cont-0lo1 of Control cont-

8ranches

4nconditional

$ Absolute address

$ PC$relative

U Target address is specified relative to PC contents U 1elocatable code

ltample6 MIPS

$ Absolute address

9 target

$ PC$relative

8 target

0lo1 of Control cont-0lo1 of Control

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 68101

0lo1 of Control cont- -

e entium e R

Flow of Control (cont.)

Branches

Conditional
- Jump is taken only if the condition is met

Two types
- Set-Then-Jump
  • Condition testing is separated from branching
  • Condition code registers are used to convey the condition test result
  • Condition code registers keep a record of the status of the last ALU operation, such as an overflow condition
- Example: Pentium code

    cmp AX,BX        compare AX and BX
    je  target       jump if equal

Flow of Control (cont.)

- Test-and-Jump
  • Single instruction performs condition testing and branching
- Example: MIPS instruction

    beq $src1,$src2,target

  Jumps to target if $src1 = $src2

Delayed branching

Control is transferred after executing the instruction that follows the branch instruction
- This instruction slot is called the delay slot

Improves efficiency

Highly pipelined RISC processors support delayed branching

Flow of Control (cont.)

Procedure calls

Facilitate modular programming

Require two pieces of information to return
- End of procedure
  • Pentium uses the ret instruction
  • MIPS uses the jr instruction
- Return address
  • In a (special) register: MIPS allows any general-purpose register
  • On the stack: Pentium

Flow of Control (cont.)

[Figures not recoverable from the transcript; one surviving label reads "Delay slot"]

Parameter Passing

Two basic techniques

Register-based (e.g., PowerPC, MIPS)
- Internal registers are used
  • Faster
  • Limit the number of parameters
  • Recursive procedure

Stack-based (e.g., Pentium)
- Stack is used
  • More general

2. Operand Types

Instructions support basic data types
- Characters
- Integers
- Floating-point

Instruction overload

Same instruction for different data types

Example: Pentium

    mov AL,address       loads an 8-bit value
    mov AX,address       loads a 16-bit value
    mov EAX,address      loads a 32-bit value

Operand Types (cont.)

Separate instructions

Instructions specify the operand size

Example: MIPS

    lb $dest,address     loads a byte
    lh $dest,address     loads a halfword (16 bits)
    lw $dest,address     loads a word (32 bits)
    ld $dest,address     loads a doubleword (64 bits)

Similar instruction: store
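The effect of choosing a different load width at the same address can be illustrated with Python's struct module. This is a sketch under stated assumptions: memory is modelled as a bytes object, byte order is big-endian (MIPS can run in either byte order), and the helper name load is hypothetical. Like lb, lh, lw and ld, it sign-extends the value it reads.

```python
import struct

# Signed struct format codes for 1-, 2-, 4- and 8-byte loads.
FORMATS = {1: "b", 2: "h", 4: "i", 8: "q"}


def load(memory, address, size):
    """Read a signed value of `size` bytes at `address` (big-endian)."""
    return struct.unpack_from(">" + FORMATS[size], memory, address)[0]


memory = struct.pack(">q", -2)      # bytes FF FF FF FF FF FF FF FE

print(load(memory, 7, 1))           # byte 0xFE       -> -2
print(load(memory, 0, 1))           # byte 0xFF       -> -1
print(load(memory, 6, 2))           # halfword 0xFFFE -> -2
print(load(memory, 4, 4))           # word            -> -2
print(load(memory, 0, 8))           # doubleword      -> -2
```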

3. Addressing Modes

How the operands are specified

Operands can be in three places
- Registers
  • Register addressing mode
- Part of instruction
  • Constant
  • Immediate addressing mode
  • All processors support these two addressing modes
- Memory
  • Difference between RISC and CISC
  • CISC supports a large variety of addressing modes
  • RISC follows load/store architecture

4. Instruction Types

Several types of instructions

Data movement
- Pentium: mov dest,src
- Some do not provide direct data movement instructions
- Indirect data movement

    add $dest,$src,0     $dest = $src+0

Arithmetic and Logical
- Arithmetic
  • Integer and floating-point, signed and unsigned
  • add, subtract, multiply, divide
- Logical
  • and, or, not, xor

Instruction Types (cont.)

Condition code bits

S: Sign bit (0 = +, 1 = -)
Z: Zero bit (0 = nonzero, 1 = zero)
O: Overflow bit (0 = no overflow, 1 = overflow)
C: Carry bit (0 = no carry, 1 = carry)

Example: Pentium

    cmp count,25         compare count to 25
                         (subtract 25 from count)
    je  target           jump if equal
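How a compare sets the four condition code bits can be sketched as follows. This is an illustrative Python model, not a description of any particular processor's implementation: it assumes a 16-bit operand width by default, and follows the x86 convention that the carry bit records a borrow on subtraction. The function name compare_flags is hypothetical.

```python
def compare_flags(a, b, bits=16):
    """Return the S, Z, O, C bits produced by computing a - b."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "S": int(bool(result & sign)),                  # result is negative
        "Z": int(result == 0),                          # operands were equal
        "O": int(bool((a ^ b) & (a ^ result) & sign)),  # signed overflow
        "C": int(a < b),                                # unsigned borrow
    }


print(compare_flags(25, 25))      # Z = 1, so "je target" would be taken
print(compare_flags(24, 25))      # S = 1 and C = 1, Z = 0
print(compare_flags(0x8000, 1))   # most negative value minus 1: O = 1
```

In the set-then-jump style, the branch instruction that follows reads only these bits; it never sees the original operands.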

Instruction Types (cont.)

Flow control and I/O instructions
- Branch
- Procedure call
- Interrupts

I/O instructions
- Memory-mapped I/O
  • Most processors support memory-mapped I/O
  • No separate instructions for I/O
- Isolated I/O
  • Pentium supports isolated I/O
  • Separate I/O instructions

    in  AX,io_port       read from an I/O port
    out io_port,AX       write to an I/O port

5. Instruction Formats

Two types

Fixed-length
- Used by RISC processors
- 32-bit RISC processors use 32-bit wide instructions
  • Examples: SPARC, MIPS, PowerPC

Variable-length
- Used by CISC processors
- Memory operands need more bits to specify

Opcode
- Major and exact operation
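A fixed-length format can be made concrete by packing the fields of one instruction into a 32-bit word. This sketch assumes the standard MIPS R-type layout (6-bit opcode, three 5-bit register fields, 5-bit shift amount, 6-bit function code); the helper name encode_r_type is hypothetical.

```python
def encode_r_type(rs, rt, rd, shamt, funct, opcode=0):
    """Pack MIPS R-type fields into one 32-bit instruction word."""
    fields = ((opcode, 6), (rs, 5), (rt, 5), (rd, 5), (shamt, 5), (funct, 6))
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"value {value} does not fit in {width} bits")
    return (opcode << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


# add $t0,$t1,$t2 : $t0 is register 8, $t1 is 9, $t2 is 10; funct 0x20 is add.
word = encode_r_type(rs=9, rt=10, rd=8, shamt=0, funct=0x20)
print(f"{word:08x}")  # 012a4020
```

The opcode field gives the major operation and the function field selects the exact one. Every instruction occupies the same 32 bits, so the next instruction address is known before decoding finishes.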

Examples of Instruction Formats

[Figure not recoverable from the transcript]

RISC (Reduced Instruction Set Computer)
versus
CISC (Complex Instruction Set Computer)

RISC vs. CISC

The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.

RISC systems access memory only with explicit load and store instructions.

In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.

RISC vs. CISC

The difference between CISC and RISC becomes evident through the basic computer performance equation:

    time/program = (time/cycle) x (cycles/instruction) x (instructions/program)

RISC systems shorten execution time by reducing the clock cycles per instruction.

CISC systems improve performance by reducing the number of instructions per program.
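The trade-off between the two approaches can be tried out numerically. This is a minimal sketch of the standard performance equation (execution time = instruction count x cycles per instruction x cycle time); the function name and all the numbers below are made-up illustrations, not measurements of real machines.

```python
def execution_time(instructions, cpi, cycle_time_ns):
    """Execution time in ns: instructions * cycles/instruction * ns/cycle."""
    return instructions * cpi * cycle_time_ns


# Hypothetical program compiled two ways, on the same clock.
cisc_like = execution_time(instructions=1000, cpi=4.0, cycle_time_ns=2.0)
risc_like = execution_time(instructions=1500, cpi=1.5, cycle_time_ns=2.0)

print(cisc_like)  # 8000.0 ns: fewer instructions, more cycles each
print(risc_like)  # 4500.0 ns: more instructions, fewer cycles each
```

Neither factor decides the outcome alone; what matters is the product of all three.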

RISC vs. CISC

The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.

The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.

With fixed-length instructions, RISC lends itself to pipelining and speculative execution.

RISC vs. CISC

Consider the following program fragments:

    CISC                    RISC
    mov ax, 10              mov ax, 0
    mov bx, 5               mov bx, 10
    mul bx, ax              mov cx, 5
                     Begin: add ax, bx
                            loop Begin

The total clock cycles for the CISC version might be:

    (2 movs x 1 cycle) + (1 mul x 30 cycles) = 32 cycles

While the clock cycles for the RISC version are:

    (3 movs x 1 cycle) + (5 adds x 1 cycle) + (5 loops x 1 cycle) = 13 cycles

With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds.
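The cycle comparison is simple arithmetic and can be checked with a few lines of Python. The per-instruction costs used here (1 cycle for mov, add and loop, 30 cycles for mul) and the loop count of 5 are assumptions for illustration, since the figures are not legible in the transcript; real costs vary by processor.

```python
# Assumed cycle cost of each instruction kind (illustrative only).
CYCLES = {"mov": 1, "add": 1, "loop": 1, "mul": 30}


def total_cycles(executed):
    """Sum cycles for a mapping of instruction name -> times executed."""
    return sum(CYCLES[name] * count for name, count in executed.items())


cisc = {"mov": 2, "mul": 1}              # multiply done by one complex instruction
risc = {"mov": 3, "add": 5, "loop": 5}   # multiply done by repeated addition

print(total_cycles(cisc))  # 32
print(total_cycles(risc))  # 13
```

The RISC fragment executes more instructions (13 against 3) but still finishes in fewer cycles under these assumptions.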

RISC vs. CISC

Because of their load-store ISAs, RISC architectures require a large number of CPU registers.

These registers provide fast access to data during sequential program execution.

They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.

Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.

RISC vs. CISC

[Figure: overlapping register windows]

This is how registers can be overlapped in a RISC system.

The current window pointer (CWP) points to the active register window.
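The overlap idea can be modelled in a few lines of Python. This is a simplified sketch, not the layout of any specific processor: the class name, the window sizes (8 shared and 8 local registers), and the direction in which the CWP moves on a call are all assumptions, and window overflow and underflow are not handled.

```python
class RegisterWindows:
    """Circular register file in which a caller's 'out' registers
    are the same physical registers as the callee's 'in' registers."""

    def __init__(self, windows=4, shared=8, local=8):
        self.shared = shared
        self.stride = shared + local           # registers advanced per call
        self.windows = windows
        self.regs = [0] * (windows * self.stride)
        self.cwp = 0                           # current window pointer

    def _index(self, kind, n):
        offset = {"in": 0, "local": self.shared, "out": self.stride}[kind]
        return (self.cwp * self.stride + offset + n) % len(self.regs)

    def read(self, kind, n):
        return self.regs[self._index(kind, n)]

    def write(self, kind, n, value):
        self.regs[self._index(kind, n)] = value

    def call(self):
        self.cwp = (self.cwp + 1) % self.windows

    def ret(self):
        self.cwp = (self.cwp - 1) % self.windows


rw = RegisterWindows()
rw.write("out", 0, 42)     # caller places an argument in its out register
rw.call()                  # only the CWP changes; no data is copied
print(rw.read("in", 0))    # callee sees 42 in its in register
rw.write("in", 0, 7)       # callee leaves a return value
rw.ret()
print(rw.read("out", 0))   # caller reads 7
```

Passing a parameter costs nothing beyond moving the pointer, which is the overhead reduction described on the previous slide.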

RISC vs. CISC

It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.

Some RISC systems provide more extravagant instruction sets than some CISC systems.

Some systems combine both approaches.

The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.

RISC vs. CISC

RISC
- Multiple register sets.
- Three operands per instruction.
- Parameter passing through register windows.
- Single-cycle instructions.
- Hardwired control.
- Highly pipelined.

CISC
- Single register set.
- One or two register operands per instruction.
- Parameter passing through memory.
- Multiple-cycle instructions.
- Microprogrammed control.
- Less pipelined.

Continued...

RISC vs. CISC

RISC
- Simple instructions, few in number.
- Fixed-length instructions.
- Complexity in compiler.
- Only LOAD/STORE instructions access memory.
- Few addressing modes.

CISC
- Many complex instructions.
- Variable-length instructions.
- Complexity in microcode.
- Many instructions can access memory.
- Many addressing modes.

Summary

Instruction Set Design Issues

Instruction set design issues include:

Where are operands stored?
- registers, memory, stack, accumulator

How many explicit operands are there?
- 0, 1, 2, or 3

How is the operand location specified?
- register, immediate, indirect, ...

What type and size of operands are supported?
- byte, int, float, double, string, vector, ...

What operations are supported?
- add, sub, mul, move, compare, ...

More About General Purpose Registers

Why do almost all new architectures use GPRs?

Registers are much faster than memory (even cache)
- Register values are available immediately
- When memory isn't ready, the processor must wait ("stall")

Registers are convenient for variable storage
- Compiler assigns some variables just to registers
- More compact code, since small fields specify registers (compared to memory addresses)

[Figure: storage hierarchy: Processor (Registers), Cache, Memory, Disk]

What Operations Are Needed

Arithmetic and Logical
- Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
- Logical operation: AND, OR, XOR, NOT

Data Transfer - copy, load, store

Control - branch, jump, call, return

Floating Point - ADDF, MULF, DIVF
- Same as arithmetic but usually take bigger operands

Decimal - ADDD, CONVERT

String - move, compare, search

Graphics - pixel and vertex, compression/decompression operations

Stack Architecture: Pros and Cons

Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures

Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures

Accumulator Architecture: Pros and Cons

Pros
- Very low hardware requirements
- Easy to design and understand

Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic

Memory-Memory Architecture: Pros and Cons

Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)

Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required

Memory-Register Architecture: Pros and Cons

Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density

Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers

Loading Executable Program

[Figure: memory layout showing the Reserved, Text, Static data, Dynamic data and Stack regions, with the pc, $gp and $sp pointers; the hexadecimal addresses are not recoverable from the transcript]

To load an executable, the operating system follows these steps:

- Reads the executable file header to determine the size of the text and data segments
- Creates an address space large enough for the text and data
- Copies the instructions and data from the executable file into memory
- Copies the parameters (if any) to the main program onto the stack
- Initializes the machine registers and sets the stack pointer to the first free location
- Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program

Instruction Set Design Issues

Instruction Set Design Issues
- Number of Addresses
- Flow of Control
- Operand Types
- Addressing Modes
- Instruction Types
- Instruction Formats

Number of Addresses

Four categories

3-address machines
- 2 for the source operands and one for the result

2-address machines
- One address doubles as source and result

1-address machines
- Accumulator machines
- Accumulator is used for one source and result

0-address machines
- Stack machines
- Operands are taken from the stack
- Result goes onto the stack

Number of Addresses (cont.)

Three-address machines

Two for the source operands, one for the result

RISC processors use three addresses

Sample instructions

    add  dest,src1,src2      M(dest)=[src1]+[src2]
    sub  dest,src1,src2      M(dest)=[src1]-[src2]
    mult dest,src1,src2      M(dest)=[src1]*[src2]

Three addresses

    Operand 1 | Operand 2 | Result

Example: a = b + c

Three-address instruction formats are not common, because they require a relatively long instruction format to hold the three address references.

Number of Addresses (cont.)

Example

C statement

    A = B + C * D - E + F + A

Equivalent code:

    mult T,C,D       T = C*D
    add  T,T,B       T = B+C*D
    sub  T,T,E       T = B+C*D-E
    add  T,T,F       T = B+C*D-E+F
    add  A,T,A       A = B+C*D-E+F+A

Number of Addresses (cont.)

Two-address machines

One address doubles (for source operand and result)

Last example makes a case for it
- Address T is used twice

Sample instructions

    load dest,src        M(dest)=[src]
    add  dest,src        M(dest)=[dest]+[src]
    sub  dest,src        M(dest)=[dest]-[src]
    mult dest,src        M(dest)=[dest]*[src]

Two addresses

One address doubles as operand and result

Example: a = a + b

The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation.

Number of Addresses (cont.)

Example

C statement

    A = B + C * D - E + F + A

Equivalent code:

    load T,C         T = C
    mult T,D         T = C*D
    add  T,B         T = B+C*D
    sub  T,E         T = B+C*D-E
    add  T,F         T = B+C*D-E+F
    add  A,T         A = B+C*D-E+F+A
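The three-address and two-address sequences can be checked against each other with one small interpreter. This is an illustrative Python sketch: memory is modelled as a dictionary of named locations, and the helper name run_memory_machine and the sample values are assumptions.

```python
OPS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mult": lambda a, b: a * b,
}


def run_memory_machine(program, memory):
    """Run two- or three-address instructions whose operands are all in memory."""
    for op, dest, *srcs in program:
        if op == "load":                  # load dest,src : M(dest) = [src]
            memory[dest] = memory[srcs[0]]
        elif len(srcs) == 2:              # three-address: dest = src1 op src2
            memory[dest] = OPS[op](memory[srcs[0]], memory[srcs[1]])
        else:                             # two-address: dest = dest op src
            memory[dest] = OPS[op](memory[dest], memory[srcs[0]])
    return memory


THREE_ADDRESS = [
    ("mult", "T", "C", "D"), ("add", "T", "T", "B"), ("sub", "T", "T", "E"),
    ("add", "T", "T", "F"), ("add", "A", "T", "A"),
]
TWO_ADDRESS = [
    ("load", "T", "C"), ("mult", "T", "D"), ("add", "T", "B"),
    ("sub", "T", "E"), ("add", "T", "F"), ("add", "A", "T"),
]

sample = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6, "T": 0}
print(run_memory_machine(THREE_ADDRESS, dict(sample))["A"])  # 16
print(run_memory_machine(TWO_ADDRESS, dict(sample))["A"])    # 16
```

The two-address version needs one extra instruction (the initial load), but each instruction carries one address fewer.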

Number of Addresses (cont.)

One-address machines

Use a special set of registers called accumulators
- Specify one source operand and receive the result

Called accumulator machines

Sample instructions

    load  addr       accum = [addr]
    store addr       M[addr] = accum
    add   addr       accum = accum + [addr]
    sub   addr       accum = accum - [addr]
    mult  addr       accum = accum * [addr]

Number of Addresses (cont.)

Example

C statement

    A = B + C * D - E + F + A

Equivalent code:

    load  C          load C into accum
    mult  D          accum = C*D
    add   B          accum = C*D+B
    sub   E          accum = B+C*D-E
    add   F          accum = B+C*D-E+F
    add   A          accum = B+C*D-E+F+A
    store A          store accum contents in A
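The accumulator sequence can be traced the same way. This is a minimal Python sketch with a single implicit accumulator; the helper name run_accumulator and the sample values are assumptions for illustration.

```python
def run_accumulator(program, memory):
    """Run one-address instructions against a single accumulator."""
    accum = 0
    for op, addr in program:
        if op == "load":
            accum = memory[addr]
        elif op == "store":
            memory[addr] = accum
        elif op == "add":
            accum = accum + memory[addr]
        elif op == "sub":
            accum = accum - memory[addr]
        elif op == "mult":
            accum = accum * memory[addr]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory


ACCUM_PROGRAM = [
    ("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
    ("add", "F"), ("add", "A"), ("store", "A"),
]

sample = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
print(run_accumulator(ACCUM_PROGRAM, dict(sample))["A"])  # 16
```

Every instruction except the implicit accumulator update refers to memory, which is why accumulator machines generate high memory traffic.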

Number of Addresses (cont.)

Zero-address machines

Stack supplies operands and receives the result
- Special instructions to load and store use an address

Called stack machines (Ex: HP3000, Burroughs B5500)

Sample instructions

    push addr        push([addr])
    pop  addr        pop([addr])
    add              push(pop + pop)
    sub              push(pop - pop)
    mult             push(pop * pop)

Number of Addresses (cont.)

Example

C statement

    A = B + C * D - E + F + A

Equivalent code:

    push E           sub
    push C           push F
    push D           add
    mult             push A
    push B           add
    add              pop  A
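The stack sequence depends on operand order, so it is worth tracing. In this Python sketch, sub follows the sample definition push(pop - pop): the first value popped (the top of the stack) is the left operand. The helper name run_stack and the sample values are assumptions. The program is read down the left column of the listing above, then down the right column.

```python
def run_stack(program, memory):
    """Run zero-address instructions; only push and pop name an address."""
    stack = []
    for instr in program:
        op = instr[0]
        if op == "push":                 # push addr : push([addr])
            stack.append(memory[instr[1]])
        elif op == "pop":                # pop addr : [addr] = popped value
            memory[instr[1]] = stack.pop()
        else:
            first = stack.pop()          # top of stack
            second = stack.pop()         # value beneath it
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)
            elif op == "mult":
                stack.append(first * second)
            else:
                raise ValueError(f"unknown instruction: {op}")
    return memory


STACK_PROGRAM = [
    ("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
    ("push", "B"), ("add",), ("sub",),
    ("push", "F"), ("add",), ("push", "A"), ("add",), ("pop", "A"),
]

sample = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
print(run_stack(STACK_PROGRAM, dict(sample))["A"])  # 16
```

E is pushed first so that it sits beneath B + C*D when sub executes, giving (B + C*D) - E rather than the reverse.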

Load/Store Architecture

Instructions expect operands in internal processor registers

Special LOAD and STORE instructions move data between registers and memory

RISC uses this architecture

Reduces instruction length

More A+out 6eneral Purpose egistersMore A+out 6eneral Pu

rpose egisters

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 96101

h )o almost all ne5 architectures usePslt

eisters are much faster than memor eencache3

- eister alues are aaila9le imme)iatel

- hen memor isnt rea) processor must 5aitBstall3

eisters are conenient for aria9le storae

- Compiler assins some aria9les Dust to reisters

- More compact co)e since small fiel)s specifreisters

compare) to memor a))resses3Registers Cache

MemoryProcessor Disk

7hat perations are eeded7hat

perations are eeded

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 97101

3

Arithmetic E (oical

Inteer arithmetic A$$ SU MU(T $I S7IT

(oical operation AN$ NT

$ata Transfer - cop loa) store

Control - 9ranch Dump call return

loatin Point A$$ MU( $I 3 Same as arithmetic 9ut usuall ta=e 9ier operan)s

$ecimal - A$$$ CNT

Strin - moe compare search

raphics F piel an) erte compressionG)ecompression operations

Stacamps Architecture Pros and ConsStacamps Architecture Pros and Cons

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 98101

Stacamps Architecture Pros and ConsStacamps Architecture Pros and Cons

Pros oo) co)e )ensit implicit top of stac=3

(o5 har)5are re1uirements

as to 5rite a simpler compiler for stac= architectures

Cons Stac= 9ecomes the 9ottlenec=

(ittle a9ilit for parallelism or pipelinin

$ata is not al5as at the top of stac= 5hen nee) so a))itionalinstructions li=e TP an) SAP are nee)e)

$ifficult to 5rite an optimiin compiler for stac= architectures

Accumulators Architecture Pros and Cons

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 99101

Accumulators Architecture Pros and Cons

Pros U ery lo hardare reJuirements

U ltasy to design and understand

Cons U Accumulator becomes the bottlenec0

U 3ittle ability for parallelism or pipelining U igh memory traffic

Memory Memory Architecture Pros and Cons

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 100101

Memory3Memory Architecture Pros and Cons

Pros U 1eJuires feer instructions (especially if operands)

U ltasy to rite compilers for (especially if operands)

Cons U ery high memory traffic (especially if operands)

U ariable number of cloc0s per instruction

U Dith to operands more data movements are reJuired

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 101101

Memory3Register Architecture Pros and Cons

Pros U Some data can be accessed ithout loading first

U Instruction format easy to encode

U ood code density

Cons U 5perands are not eJuivalent (poor orthogonal)

U ariable number of cloc0s per instruction U May limit number of registers

Instruction Set Design Issues

- Number of Addresses
- Flow of Control
- Operand Types
- Addressing Modes
- Instruction Types
- Instruction Formats

Number of Addresses

Four categories
- 3-address machines
  - 2 for the source operands and one for the result
- 2-address machines
  - One address doubles as source and result
- 1-address machines
  - Accumulator machines
  - Accumulator is used for one source and result
- 0-address machines
  - Stack machines
  - Operands are taken from the stack
  - Result goes onto the stack

Number of Addresses (cont.)

Three-address machines
- Two for the source operands, one for the result
- RISC processors use three addresses
- Sample instructions

    add  dest,src1,src2    ; M(dest) = [src1] + [src2]
    sub  dest,src1,src2    ; M(dest) = [src1] - [src2]
    mult dest,src1,src2    ; M(dest) = [src1] * [src2]

Three addresses: Operand 1, Operand 2, Result
Example: a = b + c

Three-address instruction formats are not common because they require a relatively long instruction format to hold the three address references.

Number of Addresses (cont.)

Example
- C statement

    A = B + C * D - E + F + A

- Equivalent code:

    mult T,C,D    ; T = C*D
    add  T,T,B    ; T = B + C*D
    sub  T,T,E    ; T = B + C*D - E
    add  T,T,F    ; T = B + C*D - E + F
    add  A,T,A    ; A = B + C*D - E + F + A
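The three-address sequence can be checked with a small interpreter. This is only a sketch: the function name `run_three_address`, the tuple encoding of instructions, and the sample values for A to F are assumptions made for illustration, not part of the slides.

```python
# Hypothetical three-address machine: every operand is a named memory cell.
OPS = {
    "add": lambda x, y: x + y,
    "sub": lambda x, y: x - y,
    "mult": lambda x, y: x * y,
}


def run_three_address(program, memory):
    """Execute (op, dest, src1, src2) tuples: M(dest) = [src1] op [src2]."""
    for op, dest, src1, src2 in program:
        memory[dest] = OPS[op](memory[src1], memory[src2])
    return memory


# A = B + C * D - E + F + A, using the temporary T as on the slide
program = [
    ("mult", "T", "C", "D"),
    ("add", "T", "T", "B"),
    ("sub", "T", "T", "E"),
    ("add", "T", "T", "F"),
    ("add", "A", "T", "A"),
]
# Sample values are arbitrary (assumption)
memory = run_three_address(
    program, {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
)
```

Five instructions are enough here, but each one carries three address fields, which is the encoding cost the slide points out.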

Number of Addresses (cont.)

Two-address machines
- One address doubles (for source operand & result)
- Last example makes a case for it
  - Address T is used twice
- Sample instructions

    load dest,src    ; M(dest) = [src]
    add  dest,src    ; M(dest) = [dest] + [src]
    sub  dest,src    ; M(dest) = [dest] - [src]
    mult dest,src    ; M(dest) = [dest] * [src]

Two addresses: one address doubles as operand and result
Example: a = a + b

The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation.
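The awkwardness described above shows up as soon as a result must not overwrite either source. A minimal sketch, assuming a hypothetical tuple encoding and the slides' `load dest,src` instruction playing the role of the MOVE:

```python
def run_two_address(program, memory):
    """Execute (op, dest, src) tuples; dest is both first source and result."""
    for op, dest, src in program:
        if op == "load":        # load dest,src : M(dest) = [src]
            memory[dest] = memory[src]
        elif op == "add":       # add dest,src : M(dest) = [dest] + [src]
            memory[dest] += memory[src]
        elif op == "sub":       # sub dest,src : M(dest) = [dest] - [src]
            memory[dest] -= memory[src]
        elif op == "mult":      # mult dest,src : M(dest) = [dest] * [src]
            memory[dest] *= memory[src]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return memory


# c = a + b without destroying a: copy a into c first, then add b to c.
mem = run_two_address(
    [("load", "c", "a"), ("add", "c", "b")],
    {"a": 7, "b": 5, "c": 0},   # sample values are arbitrary (assumption)
)
```

A single three-address `add c,a,b` becomes two shorter instructions here.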

Number of Addresses (cont.)

Example
- C statement

    A = B + C * D - E + F + A

- Equivalent code:

    load T,C    ; T = C
    mult T,D    ; T = C*D
    add  T,B    ; T = B + C*D
    sub  T,E    ; T = B + C*D - E
    add  T,F    ; T = B + C*D - E + F
    add  A,T    ; A = B + C*D - E + F + A

Number of Addresses (cont.)

One-address machines
- Use special set of registers called accumulators
  - Specify one source operand & receive the result
- Called accumulator machines
- Sample instructions

    load  addr    ; accum = [addr]
    store addr    ; M[addr] = accum
    add   addr    ; accum = accum + [addr]
    sub   addr    ; accum = accum - [addr]
    mult  addr    ; accum = accum * [addr]

Number of Addresses (cont.)

Example
- C statement

    A = B + C * D - E + F + A

- Equivalent code:

    load  C    ; load C into accum
    mult  D    ; accum = C*D
    add   B    ; accum = C*D + B
    sub   E    ; accum = B + C*D - E
    add   F    ; accum = B + C*D - E + F
    add   A    ; accum = B + C*D - E + F + A
    store A    ; store accum contents in A
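The accumulator sequence can be traced the same way. A minimal sketch, assuming a hypothetical `run_accumulator` helper and arbitrary sample values:

```python
def run_accumulator(program, memory):
    """Execute (op, addr) pairs on a machine with a single accumulator."""
    accum = 0
    for op, addr in program:
        if op == "load":        # accum = [addr]
            accum = memory[addr]
        elif op == "store":     # M[addr] = accum
            memory[addr] = accum
        elif op == "add":       # accum = accum + [addr]
            accum += memory[addr]
        elif op == "sub":       # accum = accum - [addr]
            accum -= memory[addr]
        elif op == "mult":      # accum = accum * [addr]
            accum *= memory[addr]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return memory


program = [
    ("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
    ("add", "F"), ("add", "A"), ("store", "A"),
]
# Sample values are arbitrary (assumption)
result = run_accumulator(
    program, {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
)
```

Every instruction names a memory address, so all seven instructions touch memory.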

Number of Addresses (cont.)

Zero-address machines
- Stack supplies operands and receives the result
  - Special instructions to load and store use an address
- Called stack machines (Ex: HP3000, Burroughs B5500)
- Sample instructions

    push addr    ; push([addr])
    pop  addr    ; pop([addr])
    add          ; push(pop + pop)
    sub          ; push(pop - pop)
    mult         ; push(pop * pop)

Number of Addresses (cont.)

Example
- C statement

    A = B + C * D - E + F + A

- Equivalent code:

    push E          sub
    push C          push F
    push D          add
    mult            push A
    push B          add
    add             pop  A
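Reading the listing down the left column and then the right column, the stack program can be traced with a short simulator. This is a sketch: `run_stack` and the tuple encoding are hypothetical, and `sub` follows the slides' definition push(pop - pop), where the first pop is the top of the stack.

```python
def run_stack(program, memory):
    """Execute a zero-address program; only push/pop carry an address."""
    stack = []
    for instr in program:
        op = instr[0]
        if op == "push":                    # push([addr])
            stack.append(memory[instr[1]])
        elif op == "pop":                   # [addr] = pop()
            memory[instr[1]] = stack.pop()
        elif op == "add":                   # push(pop + pop)
            stack.append(stack.pop() + stack.pop())
        elif op == "mult":                  # push(pop * pop)
            stack.append(stack.pop() * stack.pop())
        elif op == "sub":                   # push(pop - pop)
            top = stack.pop()
            below = stack.pop()
            stack.append(top - below)
        else:
            raise ValueError(f"unknown opcode: {op}")
    return memory


program = [
    ("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
    ("push", "B"), ("add",), ("sub",), ("push", "F"),
    ("add",), ("push", "A"), ("add",), ("pop", "A"),
]
# Sample values are arbitrary (assumption)
result = run_stack(program, {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6})
```

E is pushed first so that it sits below B + C*D when `sub` runs, which is why the operand order in the program differs from the order in the C statement.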

Load/Store Architecture

- Instructions expect operands in internal processor registers
  - Special LOAD and STORE instructions move data between registers and memory
- RISC uses this architecture
- Reduces instruction length

Load/Store Architecture (cont.)

Sample instructions

    load  Rd,addr       ; Rd = [addr]
    store addr,Rs       ; (addr) = Rs
    add   Rd,Rs1,Rs2    ; Rd = Rs1 + Rs2
    sub   Rd,Rs1,Rs2    ; Rd = Rs1 - Rs2
    mult  Rd,Rs1,Rs2    ; Rd = Rs1 * Rs2

Number of Addresses (cont.)

Example
- C statement

    A = B + C * D - E + F + A

- Equivalent code:

    load R1,B          mult  R2,R2,R3
    load R2,C          add   R2,R2,R1
    load R3,D          sub   R2,R2,R4
    load R4,E          add   R2,R2,R5
    load R5,F          add   R2,R2,R6
    load R6,A          store A,R2
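A load/store machine separates memory traffic from arithmetic, which a simulator can make visible by counting memory accesses. A minimal sketch, assuming a hypothetical `run_load_store` helper, registers named R1 to R6, and arbitrary sample values:

```python
ALU = {
    "add": lambda x, y: x + y,
    "sub": lambda x, y: x - y,
    "mult": lambda x, y: x * y,
}


def run_load_store(program, memory):
    """Run a load/store program; return (memory, number of memory accesses)."""
    regs = {}
    memory_accesses = 0
    for instr in program:
        op = instr[0]
        if op == "load":            # load Rd,addr : Rd = [addr]
            _, rd, addr = instr
            regs[rd] = memory[addr]
            memory_accesses += 1
        elif op == "store":         # store addr,Rs : (addr) = Rs
            _, addr, rs = instr
            memory[addr] = regs[rs]
            memory_accesses += 1
        else:                       # add/sub/mult Rd,Rs1,Rs2 use registers only
            _, rd, rs1, rs2 = instr
            regs[rd] = ALU[op](regs[rs1], regs[rs2])
    return memory, memory_accesses


program = [
    ("load", "R1", "B"), ("load", "R2", "C"), ("load", "R3", "D"),
    ("load", "R4", "E"), ("load", "R5", "F"), ("load", "R6", "A"),
    ("mult", "R2", "R2", "R3"), ("add", "R2", "R2", "R1"),
    ("sub", "R2", "R2", "R4"), ("add", "R2", "R2", "R5"),
    ("add", "R2", "R2", "R6"), ("store", "A", "R2"),
]
# Sample values are arbitrary (assumption)
result, accesses = run_load_store(
    program, {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
)
```

The program is longer than the accumulator version (12 instructions), but only the six loads and the one store go to memory.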

Flow of Control

- Default is sequential flow
- Several instructions alter this default execution
  - Branches
    - Unconditional
    - Conditional
    - Delayed branches
  - Procedure calls
    - Delayed procedure calls

Flow of Control (cont.)

Branches
- Unconditional
  - Absolute address
  - PC-relative
    - Target address is specified relative to PC contents
    - Relocatable code
- Example: MIPS
  - Absolute address

        j target

  - PC-relative

        b target
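The difference between the two target calculations, and why PC-relative code is relocatable, can be sketched as follows. The function names and addresses are made up for illustration, and real ISAs add details (instruction alignment, offset scaling) that are left out here.

```python
def absolute_target(address):
    """Absolute branch: the instruction holds the target address itself."""
    return address


def pc_relative_target(pc, offset):
    """PC-relative branch: the instruction holds an offset from the PC."""
    return pc + offset


# The same code loaded at two different base addresses (hypothetical values).
branch_at = 8           # position of the branch inside the code block
offset = 16             # encoded distance from the branch to its target
for base in (0x1000, 0x8000):
    target = pc_relative_target(base + branch_at, offset)
    # The target stays 24 bytes into the block wherever the block is loaded.
    assert target - base == branch_at + offset
```

An absolute target, by contrast, would have to be patched whenever the block is loaded at a different base address.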


Flow of Control (cont.)

Branches
- Conditional
  - Jump is taken only if the condition is met
- Two types
  - Set-Then-Jump
    - Condition testing is separated from branching
    - Condition code registers are used to convey the condition test result
    - Condition code registers keep a record of the status of the last ALU operation, such as an overflow condition
  - Example: Pentium code

        cmp AX,BX     ; compare AX and BX
        je  target    ; jump if equal

Flow of Control (cont.)

  - Test-and-Jump
    - Single instruction performs condition testing and branching
  - Example: MIPS instruction

        beq Rsrc1,Rsrc2,target

    Jumps to target if Rsrc1 = Rsrc2

Delayed branching
- Control is transferred after executing the instruction that follows the branch instruction
  - This instruction slot is called the delay slot
- Improves efficiency
- Highly pipelined RISC processors support delayed branching
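The effect of a delay slot on execution order can be shown with a toy simulator. This is a sketch under simplifying assumptions: the instruction encoding is hypothetical, branches are unconditional, and a branch is never placed in a delay slot.

```python
def run_with_delay_slot(program):
    """Return the names of the non-branch instructions in execution order.

    program is a list of ("op", name) or ("branch", target_index) tuples.
    A branch takes effect only after the instruction that follows it.
    """
    trace = []
    pc = 0
    pending = None                  # branch target waiting on the delay slot
    while pc < len(program):
        kind, arg = program[pc]
        next_pc = pc + 1
        if pending is not None:     # this instruction sits in the delay slot
            next_pc, pending = pending, None
        if kind == "branch":
            pending = arg           # jump is deferred by one instruction
        else:
            trace.append(arg)
        pc = next_pc
    return trace


program = [
    ("op", "i1"),
    ("branch", 4),
    ("op", "slot"),       # delay slot: executes even though the branch is taken
    ("op", "skipped"),
    ("op", "target"),
]
trace = run_with_delay_slot(program)
```

The instruction in the delay slot runs before control reaches the target, so a compiler tries to place useful work there instead of a no-op.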

Flow of Control (cont.)

Procedure calls
- Facilitate modular programming
- Require two pieces of information to return
  - End of procedure
    - Pentium uses the ret instruction
    - MIPS uses the jr instruction
  - Return address
    - In a (special) register
      - MIPS allows any general-purpose register
    - On the stack
      - Pentium

Flow of Control (cont.)

[Figure: delay slot]

Parameter Passing

Two basic techniques
- Register-based (e.g., PowerPC, MIPS)
  - Internal registers are used
    - Faster
    - Limit the number of parameters
    - Recursive procedure
- Stack-based (e.g., Pentium)
  - Stack is used
    - More general

2. Operand Types

- Instructions support basic data types
  - Characters
  - Integers
  - Floating-point
- Instruction overload
  - Same instruction for different data types
  - Example: Pentium

        mov AL,address     ; loads an 8-bit value
        mov AX,address     ; loads a 16-bit value
        mov EAX,address    ; loads a 32-bit value

Operand Types (cont.)

- Separate instructions
  - Instructions specify the operand size
  - Example: MIPS

        lb Rdest,address    ; loads a byte
        lh Rdest,address    ; loads a halfword (16 bits)
        lw Rdest,address    ; loads a word (32 bits)
        ld Rdest,address    ; loads a doubleword (64 bits)

  - Similar instructions exist for store

3. Addressing Modes

- How the operands are specified
- Operands can be in three places
  - Registers
    - Register addressing mode
  - Part of instruction
    - Constant
    - Immediate addressing mode
    - All processors support these two addressing modes
  - Memory
    - Difference between RISC and CISC
    - CISC supports a large variety of addressing modes
    - RISC follows load/store architecture

4. Instruction Types

Several types of instructions
- Data movement
  - Pentium: mov dest,src
  - Some do not provide direct data movement instructions
  - Indirect data movement

        add Rdest,Rsrc,0    ; Rdest = Rsrc + 0

- Arithmetic and Logical
  - Arithmetic
    - Integer and floating-point, signed and unsigned
    - add, subtract, multiply, divide
  - Logical
    - and, or, not, xor

Instruction Types (cont.)

Condition code bits
- S: Sign bit (0 = +, 1 = -)
- Z: Zero bit (0 = nonzero, 1 = zero)
- O: Overflow bit (0 = no overflow, 1 = overflow)
- C: Carry bit (0 = no carry, 1 = carry)

Example: Pentium

    cmp count,25    ; compare count to 25
                    ; (subtract 25 from count)
    je  target      ; jump if equal
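How a compare sets the four condition code bits can be sketched by doing the subtraction on fixed-width integers. The `compare` helper is hypothetical, the 8-bit width is an assumption, and C is modeled as a borrow flag (the x86 convention for subtraction).

```python
def compare(a, b, bits=8):
    """Compute the S, Z, O, C bits for a - b on `bits`-wide operands."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "S": int(bool(result & sign)),      # result is negative
        "Z": int(result == 0),              # result is zero
        # Signed overflow: operands differ in sign and the result's sign
        # differs from the first operand's sign.
        "O": int(bool((a ^ b) & (a ^ result) & sign)),
        "C": int(a < b),                    # unsigned borrow
    }


def je_taken(flags):
    """je (jump if equal) looks only at the zero bit."""
    return flags["Z"] == 1


equal = compare(25, 25)
less = compare(24, 25)
overflow = compare(0x80, 1)     # -128 - 1 does not fit in 8 signed bits
```

The branch instruction never sees the operands, only the bits left behind by the compare, which is the separation that Set-Then-Jump refers to.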

Instruction Types (cont.)

- Flow control and I/O instructions
  - Branch
  - Procedure call
  - Interrupts
- I/O instructions
  - Memory-mapped I/O
    - Most processors support memory-mapped I/O
    - No separate instructions for I/O
  - Isolated I/O
    - Pentium supports isolated I/O
    - Separate I/O instructions

        in  AX,io_port    ; read from an I/O port
        out io_port,AX    ; write to an I/O port

5. Instruction Formats

Two types
- Fixed-length
  - Used by RISC processors
  - 32-bit RISC processors use 32-bit wide instructions
    - Examples: SPARC, MIPS, PowerPC
- Variable-length
  - Used by CISC processors
  - Memory operands need more bits to specify
- Opcode
  - Major and exact operation
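With a fixed-length format, decoding is a matter of slicing the word at fixed bit positions. As an illustration, the sketch below splits a 32-bit MIPS R-type word into its fields (6-bit opcode, three 5-bit register numbers, 5-bit shift amount, 6-bit function code). The helper name `decode_r_type` is hypothetical.

```python
def decode_r_type(word):
    """Split a 32-bit MIPS R-type instruction word into its fields."""
    return {
        "opcode": (word >> 26) & 0x3F,  # major operation
        "rs": (word >> 21) & 0x1F,      # first source register
        "rt": (word >> 16) & 0x1F,      # second source register
        "rd": (word >> 11) & 0x1F,      # destination register
        "shamt": (word >> 6) & 0x1F,    # shift amount
        "funct": word & 0x3F,           # exact operation within the opcode
    }


# 0x012A4020 encodes: add $t0, $t1, $t2 (registers 8, 9 and 10)
fields = decode_r_type(0x012A4020)
```

For R-type instructions the opcode field is 0 and the funct field (0x20 for add) selects the exact operation, matching the "major and exact operation" split on the slide.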

Examples of Instruction Formats

RISC (Reduced Instruction Set Computer)
versus
CISC (Complex Instruction Set Computer)

RISC vs. CISC

The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.

RISC systems access memory only with explicit load and store instructions.

In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.

RISC vs. CISC

The difference between CISC and RISC becomes evident through the basic computer performance equation:

    time/program = time/cycle x cycles/instruction x instructions/program

RISC systems shorten execution time by reducing the clock cycles per instruction.

CISC systems improve performance by reducing the number of instructions per program.

RISC vs. CISC

The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.

The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.

With fixed-length instructions, RISC lends itself to pipelining and speculative execution.

RISC vs. CISC

Consider the following program fragments:

    CISC                  RISC
    mov ax, 10            mov ax, 0
    mov bx, 5             mov bx, 10
    mul bx, ax            mov cx, 5
                   Begin: add ax, bx
                          loop Begin

The total clock cycles for the CISC version might be:

    (2 movs x 1 cycle) + (1 mul x 30 cycles) = 32 cycles

While the clock cycles for the RISC version is:

    (3 movs x 1 cycle) + (5 adds x 1 cycle) + (5 loops x 1 cycle) = 13 cycles

With the RISC clock cycle being shorter, RISC gives us much faster execution speeds.
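The cycle counts for the two fragments can be tallied with a few lines of code. This is a sketch: the instruction mixes assume 2 movs and one 30-cycle mul for the CISC fragment, and 3 movs plus 5 adds and 5 loops at one cycle each for the RISC fragment; the cycle times passed to `execution_time` are placeholders, not measurements.

```python
def total_cycles(mix):
    """Sum count * cycles-per-instruction over (count, cycles) pairs."""
    return sum(count * cycles for count, cycles in mix)


def execution_time(cycles, cycle_time_ns):
    """time/program = cycles/program * time/cycle."""
    return cycles * cycle_time_ns


cisc_cycles = total_cycles([(2, 1), (1, 30)])           # movs, mul
risc_cycles = total_cycles([(3, 1), (5, 1), (5, 1)])    # movs, adds, loops

# Hypothetical cycle times: the RISC clock is assumed to be the shorter one.
cisc_time = execution_time(cisc_cycles, cycle_time_ns=2.0)
risc_time = execution_time(risc_cycles, cycle_time_ns=1.0)
```

The RISC fragment executes more instructions (13 against 3) but needs fewer cycles, and a shorter cycle widens the gap further.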

RISC vs. CISC

Because of their load-store ISAs, RISC architectures require a large number of CPU registers.

These registers provide fast access to data during sequential program execution.

They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.

Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.

RISC vs. CISC

[Figure: overlapping register windows]

This is how registers can be overlapped in a RISC system.

The current window pointer (CWP) points to the active register window.
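Overlapping windows can be modeled as views onto one circular register file, where the output registers of one window are the input registers of the next. The sketch below is an assumption-laden illustration: the class name, the window count, and the 8/8/8 split into in, local, and out registers are chosen for the example, and window overflow handling is left out.

```python
class RegisterWindows:
    """Toy model of overlapped register windows selected by a CWP."""

    def __init__(self, n_windows=4, n_shared=8, n_local=8):
        self.n_windows = n_windows
        self.n_shared = n_shared            # size of the in/out overlap
        self.stride = n_shared + n_local    # new registers per window
        self.regs = [0] * (n_windows * self.stride)
        self.cwp = 0                        # current window pointer

    def _index(self, kind, i):
        base = self.cwp * self.stride
        offset = {"in": 0, "local": self.n_shared, "out": self.stride}[kind]
        # The "out" registers of window w land on the "in" registers of w + 1.
        return (base + offset + i) % len(self.regs)

    def read(self, kind, i):
        return self.regs[self._index(kind, i)]

    def write(self, kind, i, value):
        self.regs[self._index(kind, i)] = value

    def call(self):
        """Procedure call: advance the CWP to the next window."""
        self.cwp = (self.cwp + 1) % self.n_windows

    def ret(self):
        """Procedure return: move the CWP back to the caller's window."""
        self.cwp = (self.cwp - 1) % self.n_windows


rw = RegisterWindows()
rw.write("out", 0, 42)              # caller places a parameter in an out register
rw.call()
param_seen = rw.read("in", 0)       # callee finds it in its in registers
rw.write("local", 0, 7)             # callee locals do not disturb the caller
rw.ret()
```

No data is copied on the call: only the CWP changes, which is how register windows reduce parameter-passing overhead.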

RISC vs. CISC

It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.

Some RISC systems provide more extravagant instruction sets than some CISC systems.

Some systems combine both approaches.

The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.

RISC vs. CISC

    RISC                                    CISC
    Multiple register sets.                 Single register set.
    Three operands per instruction.         One or two register operands per instruction.
    Parameter passing through register      Parameter passing through memory.
    windows.
    Single-cycle instructions.              Multiple cycle instructions.
    Hardwired control.                      Microprogrammed control.
    Highly pipelined.                       Less pipelined.

Continued....

RISC vs. CISC

    RISC                                    CISC
    Simple instructions, few in number.     Many complex instructions.
    Fixed length instructions.              Variable length instructions.
    Complexity in compiler.                 Complexity in microcode.
    Only LOAD/STORE instructions access     Many instructions can access memory.
    memory.
    Few addressing modes.                   Many addressing modes.

Summary

Instruction Set Design Issues

Instruction set design issues include:
- Where are operands stored?
  - registers, memory, stack, accumulator
- How many explicit operands are there?
  - 0, 1, 2, or 3
- How is the operand location specified?
  - register, immediate, indirect, . . .
- What type & size of operands are supported?
  - byte, int, float, double, string, vector. . .
- What operations are supported?
  - add, sub, mul, move, compare . . .

More About General Purpose Registers

Why do almost all new architectures use GPRs?
- Registers are much faster than memory (even cache)
  - Register values are available immediately
  - When memory isn't ready, processor must wait ("stall")
- Registers are convenient for variable storage
  - Compiler assigns some variables just to registers
  - More compact code since small fields specify registers (compared to memory addresses)

[Figure: memory hierarchy: Processor (Registers, Cache), Memory, Disk]

What Operations are Needed

- Arithmetic & Logical
  - Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
  - Logical operation: AND, OR, XOR, NOT
- Data Transfer - copy, load, store
- Control - branch, jump, call, return
- Floating Point: ADDF, MULF, DIVF (same as arithmetic but usually take bigger operands)
- Decimal - ADDD, CONVERT
- String - move, compare, search
- Graphics - pixel and vertex, compression/decompression operations

Stack Architecture: Pros and Cons

Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures

Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures

Accumulator Architecture: Pros and Cons

Pros
- Very low hardware requirements
- Easy to design and understand

Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic

Memory-Memory Architecture: Pros and Cons

Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)

Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required

Memory-Register Architecture: Pros and Cons

Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density

Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers

Page 53: Chapter 2: Advanced computer Architecture

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 53101

Instruction Set Design IssuesInstruction Set Design Issues

Instruction Set esign Issues 7umber of Addresses

Llo of Control

5perand Typesamp Addressing Modes

Instruction Types

Instruction Lormats

um+er of Addressesum+er of Addresses

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 54101

um+er of Addressesum+er of Addresses

Lour categories

$address machines$ for the source operands and one for the result

$address machines

$ 5ne address doubles as source and result

$address machine$ Accumulator machines

$ Accumulator is used for one source and result

gt$address machines

$ Stac0 machines

$ 5perands are ta0en from the stac0

$ 1esult goes onto the stac0

um+er of Addresses cont -um+er of Addresses

cont -

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 55101

um+er of Addresses cont-um+er of Addresses cont-

Three$address machines

To for the source operands one for the result

1ISC processors use three addresses

Sample instructions

add destsrc1src2

M(dest)=[src1]+[src2]

sub destsrc1src2

M(dest)=[src1]-[src2]

mult destsrc1src2

M(dest)=[src1][src2]

Three addresses

Operand 1 Operand 2 Result

Example a = b + c

Three-address instruction formats are not common because they reuire a

relatiely lon instruction format to hold the three address references

um+er of Addresses cont -um+er of Addresses

cont -

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 56101

um+er of Addresses cont-um+er of Addresses cont-

ltample

C statement

A C H D F 6 A

ltJuivalent code6

mult TCD T = CD

add TTB T = B+CD

sub TTE T = B+CD-E

add TTF T = B+CD-E+Fadd ATA A = B+CD-E+F+A

um+er of Addresses cont -um+er of Addresses

cont -

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 57101

um+er of Addresses cont-um+er of Addresses cont-

To$address machines

5ne address doubles (for source operand result)

3ast eample ma0es a case for it

$ Address T is used tice

Sample instructions

load destsrc M(dest)=[src]

add destsrc M(dest)=[dest]+[src]

sub destsrc M(dest)=[dest]-[src]

mult destsrc M(dest)=[dest][src]

Two Addresses

One address doubles as operand and resultExample a = a + b

The t$o-address formal reduces the space reuirement but also

introduces some a$$ardness To aoid alterin the alue of an

operand a ampOE instruction is used to moe one of the alues to a

result or temporary location before performin the operation

um+er of Addresses cont -um+er of Addresses

cont -

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 58101

um+er of Addresses cont-um+er of Addresses cont-

ltample

C statement

A C H D F 6 A

ltJuivalent code6

load TC T = C

mult TD T = CD

add TB T = B+CD

sub TE T = B+CD-Eadd TF T = B+CD-E+F

add AT A = B+CD-E+F+A

um+er of Addresses cont -um+er of Addresses

cont -

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 59101

um+er of Addresses cont-um+er of Addresses cont-

5ne$address machines 4se special set of registers called accumulators

$ Specify one source operand receive the result

Called accumulator machines

Sample instructions

load addr accum = [addr]

store addr M[addr] = accumadd addr accum = accum + [addr]

sub addr accum = accum - [addr]

mult addr accum = accum [addr]

um+er of Addresses cont -um+er of Addresses

cont -

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 60101

um+er of Addresses cont-um+er of Addresses cont-

ltample

C statementA C H D F 6 A

ltJuivalent code6

load C load C to accum

mult D accum = CD

add B accum = CD+B

sub E accum = B+CD-Eadd F accum = B+CD-E+F

add A accum = B+CD-E+F+A

store A store accum cotets A

um+er of Addresses cont -um+er of Addresses

cont -

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 61101

um+er of Addresses cont-um+er of Addresses cont-

Vero$address machines

Stac0 supplies operands and receives the result$ Special instructions to load and store use an address

Called stac0 machines (lt6 Pgtgtgt 8urroughs 8gtgt)

Sample instructions

us addr us([addr])

o addr o([addr])

add us(o + o)

sub us(o - o) mult us(o o)

um+er of Addresses cont -um+er of Addresses

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 62101

um+er of Addresses cont-um+er of Addresses cont-

ltample

C statement

A C H D F 6 A

ltJuivalent code6

us E sub

us C us F

us D add

Mult us A

us B add

add o A

Load/Store Architecture

Instructions expect operands in internal processor registers
  Special LOAD and STORE instructions move data between registers and memory
  RISC uses this architecture
  Reduces instruction length

Load/Store Architecture cont.

Sample instructions
    load  Rd,addr       Rd = [addr]
    store addr,Rs       (addr) = Rs
    add   Rd,Rs1,Rs2    Rd = Rs1 + Rs2
    sub   Rd,Rs1,Rs2    Rd = Rs1 - Rs2
    mult  Rd,Rs1,Rs2    Rd = Rs1 * Rs2

Number of Addresses cont.

Example
  C statement
    A = B + C * D - E + F + A
  Equivalent code:
    load  R1,B
    load  R2,C
    load  R3,D
    load  R4,E
    load  R5,F
    load  R6,A
    mult  R2,R2,R3
    add   R2,R2,R1
    sub   R2,R2,R4
    add   R2,R2,R5
    add   R2,R2,R6
    store A,R2
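A load/store version of the same computation is sketched below in Python. The simulator name run_load_store, the register names R1 to R6, and the memory values are assumptions for illustration. The sketch also counts memory accesses to show that only load and store reach memory.

```python
def run_load_store(program, memory):
    """Run a load/store program; returns (final memory, memory access count)."""
    mem = dict(memory)
    regs = {}
    accesses = 0
    for op, *args in program:
        if op == "load":
            rd, addr = args
            regs[rd] = mem[addr]
            accesses += 1
        elif op == "store":
            addr, rs = args
            mem[addr] = regs[rs]
            accesses += 1
        elif op in ("add", "sub", "mult"):
            rd, rs1, rs2 = args           # register operands only
            if op == "add":
                regs[rd] = regs[rs1] + regs[rs2]
            elif op == "sub":
                regs[rd] = regs[rs1] - regs[rs2]
            else:
                regs[rd] = regs[rs1] * regs[rs2]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return mem, accesses


PROGRAM = [
    ("load", "R1", "B"), ("load", "R2", "C"), ("load", "R3", "D"),
    ("load", "R4", "E"), ("load", "R5", "F"), ("load", "R6", "A"),
    ("mult", "R2", "R2", "R3"), ("add", "R2", "R2", "R1"),
    ("sub", "R2", "R2", "R4"), ("add", "R2", "R2", "R5"),
    ("add", "R2", "R2", "R6"), ("store", "A", "R2"),
]
MEMORY = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}  # assumed values

if __name__ == "__main__":
    final, count = run_load_store(PROGRAM, MEMORY)
    print(final["A"], count)
```

The program is longer than the accumulator version (12 instructions instead of 7), but the five arithmetic instructions never wait on memory.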

Flow of Control

Default is sequential flow
Several instructions alter this default execution
  Branches
    - Unconditional
    - Conditional
    - Delayed branches
  Procedure calls
    - Delayed procedure calls

Flow of Control cont.

Branches
  Unconditional
    - Absolute address
    - PC-relative
      * Target address is specified relative to PC contents
      * Relocatable code
  Example: MIPS
    - Absolute address
        j target
    - PC-relative
        b target

Flow of Control cont.

Branches
  Conditional
    - Jump is taken only if the condition is met
  Two types
    - Set-Then-Jump
      * Condition testing is separated from branching
      * Condition code registers are used to convey the condition test result
      * Condition code registers keep a record of the status of the last ALU operation, such as an overflow condition
    - Example: Pentium code
        cmp AX,BX     ; compare AX and BX
        je  target    ; jump if equal

Flow of Control cont.

    - Test-and-Jump
      * Single instruction performs condition testing and branching
    - Example: MIPS instruction
        beq Rsrc1,Rsrc2,target
      Jumps to target if Rsrc1 = Rsrc2
  Delayed branching
    Control is transferred after executing the instruction that follows the branch instruction
      - This instruction slot is called the delay slot
    Improves efficiency
    Supported by highly pipelined RISC processors

Flow of Control cont.

Procedure calls
  Facilitate modular programming
  Require two pieces of information to return
    - End of procedure
      * Pentium uses the ret instruction
      * MIPS uses the jr instruction
    - Return address
      * In a (special) register: MIPS allows any general-purpose register
      * On the stack: Pentium

Flow of Control cont.

[Figure: delay slot]

Parameter Passing

Two basic techniques
  Register-based (e.g., PowerPC, MIPS)
    - Internal registers are used
      * Faster
      * Limits the number of parameters
      * Recursive procedures (registers must be saved between calls)
  Stack-based (e.g., Pentium)
    - Stack is used
      * More general

2 Operand Types

Instructions support basic data types
  Characters
  Integers
  Floating-point
Instruction overload
  Same instruction for different data types
  Example: Pentium
    mov AL,address     ; loads an 8-bit value
    mov AX,address     ; loads a 16-bit value
    mov EAX,address    ; loads a 32-bit value

Operand Types cont.

Separate instructions
  Instructions specify the operand size
  Example: MIPS
    lb Rdest,address    ; loads a byte
    lh Rdest,address    ; loads a halfword (16 bits)
    lw Rdest,address    ; loads a word (32 bits)
    ld Rdest,address    ; loads a doubleword (64 bits)
  Similar instructions exist for store
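The effect of size-specific loads can be sketched in Python. The snippet is an illustration only: the byte values, the big-endian byte order, and the helper name load are assumptions, and the sign extension mirrors the behaviour of the signed MIPS loads.

```python
# Operand size in bytes for each size-specific load mnemonic.
SIZES = {"lb": 1, "lh": 2, "lw": 4, "ld": 8}

# Eight bytes of example memory (assumed contents).
MEMORY = bytes([0xFF, 0xFE, 0x00, 0x01, 0x00, 0x00, 0x00, 0x2A])


def load(op, address, memory=MEMORY):
    """Read SIZES[op] bytes at address and sign-extend them to an integer."""
    size = SIZES[op]
    chunk = memory[address:address + size]
    if len(chunk) != size:
        raise IndexError("load runs past the end of memory")
    return int.from_bytes(chunk, "big", signed=True)  # big-endian is assumed


if __name__ == "__main__":
    print(load("lb", 0))   # one byte: 0xFF
    print(load("lh", 0))   # two bytes: 0xFFFE
    print(load("lw", 4))   # four bytes: 0x0000002A
```

The same address yields different values depending on the opcode, which is exactly the information the instruction must carry when the register name does not imply a size.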

3 Addressing Modes

How the operands are specified
  Operands can be in three places
    - Registers
      * Register addressing mode
    - Part of instruction
      * Constant
      * Immediate addressing mode
      * All processors support these two addressing modes
    - Memory
      * Difference between RISC and CISC
      * CISC supports a large variety of addressing modes
      * RISC follows load/store architecture

4 Instruction Types

Several types of instructions
  Data movement
    - Pentium: mov dest,src
    - Some do not provide direct data movement instructions
    - Indirect data movement
        add Rdest,Rsrc,0    ; Rdest = Rsrc+0
  Arithmetic and Logical
    - Arithmetic
      * Integer and floating-point, signed and unsigned
      * add, subtract, multiply, divide
    - Logical
      * and, or, not, xor

Instruction Types cont.

Condition code bits
  S: Sign bit (0 = +, 1 = -)
  Z: Zero bit (0 = nonzero, 1 = zero)
  O: Overflow bit (0 = no overflow, 1 = overflow)
  C: Carry bit (0 = no carry, 1 = carry)
Example: Pentium
    cmp count,25    ; compare count to 25
                    ; (subtract 25 from count)
    je  target      ; jump if equal
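How a compare sets the four condition code bits can be sketched as follows. This is a hedged Python model, not a description of any particular processor: the function name compare_flags is hypothetical, and the carry bit is treated as a borrow on subtraction, as x86 does.

```python
def compare_flags(a, b, bits=32):
    """Model cmp a,b: compute a - b, discard the result, keep the flags."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    diff = (a - b) & mask
    return {
        "S": int(bool(diff & sign)),                   # result is negative
        "Z": int(diff == 0),                           # operands are equal
        "O": int(bool((a ^ b) & (a ^ diff) & sign)),   # signed overflow
        "C": int(a < b),                               # unsigned borrow
    }


def jump_if_equal(flags):
    """je tests only the zero bit left behind by the compare."""
    return flags["Z"] == 1


if __name__ == "__main__":
    print(compare_flags(25, 25))
    print(jump_if_equal(compare_flags(10, 25)))
```

The branch instruction never sees the operands; it reads only the flag bits, which is what makes this a set-then-jump scheme.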

Instruction Types cont.

Flow control and I/O instructions
  - Branch
  - Procedure call
  - Interrupts
I/O instructions
  - Memory-mapped I/O
    * Most processors support memory-mapped I/O
    * No separate instructions for I/O
  - Isolated I/O
    * Pentium supports isolated I/O
    * Separate I/O instructions
        in  AX,port    ; read from an I/O port
        out port,AX    ; write to an I/O port

5 Instruction Formats

Two types
  Fixed-length
    - Used by RISC processors
    - 32-bit RISC processors use 32-bit-wide instructions
      * Examples: SPARC, MIPS, PowerPC
  Variable-length
    - Used by CISC processors
    - Memory operands need more bits to specify
Opcode
  Major and exact operation
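As an illustration of a fixed-length format, the sketch below packs the fields of a MIPS R-type instruction into one 32-bit word. The field widths (6, 5, 5, 5, 5, 6 bits) and the funct code 0x20 for add follow the MIPS32 encoding; the helper name encode_r_type and the register table are hypothetical conveniences.

```python
# Register numbers for three MIPS temporaries.
REGISTERS = {"$t0": 8, "$t1": 9, "$t2": 10}


def encode_r_type(funct, rd, rs, rt, shamt=0, opcode=0):
    """Pack opcode | rs | rt | rd | shamt | funct into a 32-bit word."""
    fields = [(opcode, 6), (rs, 5), (rt, 5), (rd, 5), (shamt, 5), (funct, 6)]
    word = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError("field value does not fit its width")
        word = (word << width) | value
    return word


if __name__ == "__main__":
    # add $t0, $t1, $t2
    word = encode_r_type(0x20, REGISTERS["$t0"], REGISTERS["$t1"], REGISTERS["$t2"])
    print(f"{word:#010x}")
```

Because every field sits at a fixed bit position, the decoder can extract register numbers before it knows which operation the instruction performs. A variable-length format has to examine the opcode first to learn where the remaining fields are.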

Examples of Instruction Formats

[Figure: examples of instruction formats]

RISC (Reduced Instruction Set Computer)
versus
CISC (Complex Instruction Set Computer)

RISC vs. CISC

The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.

RISC systems access memory only with explicit load and store instructions.

In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.

RISC vs. CISC

The difference between CISC and RISC becomes evident through the basic computer performance equation.

RISC systems shorten execution time by reducing the clock cycles per instruction.

CISC systems improve performance by reducing the number of instructions per program.
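A common textbook form of the performance equation is given here as a sketch rather than a quotation of the slide:

```latex
\[
\frac{\text{time}}{\text{program}}
  = \frac{\text{time}}{\text{cycle}}
  \times \frac{\text{cycles}}{\text{instruction}}
  \times \frac{\text{instructions}}{\text{program}}
\]
```

In this form, RISC designs attack the middle factor and CISC designs attack the last one. The first factor, the clock period, depends mainly on the implementation technology.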

RISC vs. CISC

The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.

The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.

With fixed-length instructions, RISC lends itself to pipelining and speculative execution.

RISC vs. CISC

Consider the program fragments:

  CISC
    mov ax, 10
    mov bx, 5
    mul bx, ax

  RISC
    mov ax, 0
    mov bx, 10
    mov cx, 5
    Begin: add ax, bx
    loop Begin

The total clock cycles for the CISC version might be:
    (2 movs x 1 cycle) + (1 mul x 30 cycles) = 32 cycles

While the clock cycles for the RISC version are:
    (3 movs x 1 cycle) + (5 adds x 1 cycle) + (5 loops x 1 cycle) = 13 cycles

With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds.
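The cycle counts for the two fragments can be reproduced with a short calculation. The Python sketch below is illustrative: the per-instruction costs (1 cycle for mov, add, and loop; 30 cycles for mul) are assumed figures, and the names total_cycles and risc_multiply are hypothetical.

```python
# Assumed cost of each instruction, in clock cycles.
CYCLES = {"mov": 1, "add": 1, "loop": 1, "mul": 30}


def total_cycles(counts):
    """Sum the cycles for a mapping of opcode -> number of executions."""
    return sum(CYCLES[op] * n for op, n in counts.items())


def risc_multiply(multiplicand, multiplier):
    """Multiply by repeated addition, counting the instructions executed.

    Assumes multiplier >= 1, as in the fragment's loop counter.
    """
    counts = {"mov": 3, "add": 0, "loop": 0}
    ax, bx, cx = 0, multiplicand, multiplier
    while cx > 0:
        ax += bx                 # Begin: add ax, bx
        counts["add"] += 1
        cx -= 1                  # loop Begin (decrement cx, branch if nonzero)
        counts["loop"] += 1
    return ax, counts


if __name__ == "__main__":
    product, risc_counts = risc_multiply(10, 5)
    print(product, total_cycles(risc_counts))
    print(total_cycles({"mov": 2, "mul": 1}))
```

The RISC count grows with the multiplier, so the comparison favours the loop only while the multiplier is small relative to the assumed cost of mul.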

RISC vs. CISC

Because of their load-store ISAs, RISC architectures require a large number of CPU registers.

These registers provide fast access to data during sequential program execution.

They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.

Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.

RISC vs. CISC

[Figure: overlapping register windows]

This is how registers can be overlapped in a RISC system.

The current window pointer (CWP) points to the active register window.

RISC vs. CISC

It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.

Some RISC systems provide more extravagant instruction sets than some CISC systems.

Some systems combine both approaches.

The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.

RISC vs. CISC

RISC
  Multiple register sets.
  Three operands per instruction.
  Parameter passing through register windows.
  Single-cycle instructions.
  Hardwired control.
  Highly pipelined.

CISC
  Single register set.
  One or two register operands per instruction.
  Parameter passing through memory.
  Multiple-cycle instructions.
  Microprogrammed control.
  Less pipelined.

Continued...

RISC vs. CISC

RISC
  Simple instructions, few in number.
  Fixed-length instructions.
  Complexity in compiler.
  Only LOAD/STORE instructions access memory.
  Few addressing modes.

CISC
  Many complex instructions.
  Variable-length instructions.
  Complexity in microcode.
  Many instructions can access memory.
  Many addressing modes.

Summary

Instruction Set Design Issues

Instruction set design issues include:
  Where are operands stored?
    - registers, memory, stack, accumulator
  How many explicit operands are there?
    - 0, 1, 2, or 3
  How is the operand location specified?
    - register, immediate, indirect, ...
  What type and size of operands are supported?
    - byte, int, float, double, string, vector, ...
  What operations are supported?
    - add, sub, mul, move, compare, ...

More About General Purpose Registers

Why do almost all new architectures use GPRs?
  Registers are much faster than memory (even cache)
    - Register values are available immediately
    - When memory isn't ready, the processor must wait ("stall")
  Registers are convenient for variable storage
    - Compiler assigns some variables just to registers
    - More compact code, since small fields specify registers (compared to memory addresses)

[Figure: memory hierarchy: Processor, Registers, Cache, Memory, Disk]

What Operations are Needed

Arithmetic + Logical
  Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
  Logical operations: AND, OR, XOR, NOT
Data Transfer - copy, load, store
Control - branch, jump, call, return
Floating Point
  ADDF, MULF, DIVF (same as arithmetic, but usually take bigger operands)
Decimal - ADDD, CONVERT
String - move, compare, search
Graphics - pixel and vertex, compression/decompression operations

Stack Architecture: Pros and Cons

Pros
  Good code density (implicit top of stack)
  Low hardware requirements
  Easy to write a simple compiler for stack architectures
Cons
  Stack becomes the bottleneck
  Little ability for parallelism or pipelining
  Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
  Difficult to write an optimizing compiler for stack architectures

Accumulator Architecture: Pros and Cons

Pros
  * Very low hardware requirements
  * Easy to design and understand
Cons
  * Accumulator becomes the bottleneck
  * Little ability for parallelism or pipelining
  * High memory traffic

Memory-Memory Architecture: Pros and Cons

Pros
  * Requires fewer instructions (especially if 3 operands)
  * Easy to write compilers for (especially if 3 operands)
Cons
  * Very high memory traffic (especially if 3 operands)
  * Variable number of clocks per instruction
  * With two operands, more data movements are required

Memory-Register Architecture: Pros and Cons

Pros
  * Some data can be accessed without loading first
  * Instruction format easy to encode
  * Good code density
Cons
  * Operands are not equivalent (poor orthogonality)
  * Variable number of clocks per instruction
  * May limit number of registers

Number of Addresses

Four categories
  3-address machines
    - 2 for the source operands and one for the result
  2-address machines
    - One address doubles as source and result
  1-address machines
    - Accumulator machines
    - Accumulator is used for one source and result
  0-address machines
    - Stack machines
    - Operands are taken from the stack
    - Result goes onto the stack

Number of Addresses cont.

Three-address machines
  Two for the source operands, one for the result
  RISC processors use three addresses
  Sample instructions
    add  dest,src1,src2    M(dest) = [src1] + [src2]
    sub  dest,src1,src2    M(dest) = [src1] - [src2]
    mult dest,src1,src2    M(dest) = [src1] * [src2]

Three addresses
  Operand 1, Operand 2, Result
  Example: a = b + c
  Three-address instruction formats are not common, because they require a relatively long instruction format to hold the three address references.

Number of Addresses cont.

Example
  C statement
    A = B + C * D - E + F + A
  Equivalent code:
    mult T,C,D    ; T = C*D
    add  T,T,B    ; T = B+C*D
    sub  T,T,E    ; T = B+C*D-E
    add  T,T,F    ; T = B+C*D-E+F
    add  A,T,A    ; A = B+C*D-E+F+A
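The three-address sequence can be traced with a few lines of Python. This is a minimal sketch under stated assumptions: the name run_three_address, the tuple encoding, and the memory values (including the temporary T) are illustrative and not part of the slides.

```python
import operator

# Arithmetic performed by each three-address opcode.
OPS = {"add": operator.add, "sub": operator.sub, "mult": operator.mul}


def run_three_address(program, memory):
    """Each instruction names a destination and two sources, all in memory."""
    mem = dict(memory)
    for op, dest, src1, src2 in program:
        mem[dest] = OPS[op](mem[src1], mem[src2])
    return mem


PROGRAM = [
    ("mult", "T", "C", "D"),
    ("add", "T", "T", "B"),
    ("sub", "T", "T", "E"),
    ("add", "T", "T", "F"),
    ("add", "A", "T", "A"),
]
MEMORY = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6, "T": 0}  # assumed

if __name__ == "__main__":
    print(run_three_address(PROGRAM, MEMORY)["A"])
```

Four of the five instructions name T both as destination and as first source. That redundancy is what a two-address format removes.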

Number of Addresses cont.

Two-address machines
  One address doubles (for source operand and result)
  Last example makes a case for it
    - Address T is used twice
  Sample instructions
    load dest,src    M(dest) = [src]
    add  dest,src    M(dest) = [dest] + [src]
    sub  dest,src    M(dest) = [dest] - [src]
    mult dest,src    M(dest) = [dest] * [src]

Two addresses
  One address doubles as operand and result
  Example: a = a + b
  The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation.

Number of Addresses cont.

Example
  C statement
    A = B + C * D - E + F + A
  Equivalent code:
    load T,C    ; T = C
    mult T,D    ; T = C*D
    add  T,B    ; T = B+C*D
    sub  T,E    ; T = B+C*D-E
    add  T,F    ; T = B+C*D-E+F
    add  A,T    ; A = B+C*D-E+F+A
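The two-address sequence is sketched below in the same style. The simulator name run_two_address, the tuple encoding, and the memory values are assumptions for illustration only.

```python
import operator

# Arithmetic performed by each two-address opcode.
OPS = {"add": operator.add, "sub": operator.sub, "mult": operator.mul}


def run_two_address(program, memory):
    """The destination address doubles as the first source operand."""
    mem = dict(memory)
    for op, dest, src in program:
        if op == "load":
            mem[dest] = mem[src]                   # plain copy, no arithmetic
        else:
            mem[dest] = OPS[op](mem[dest], mem[src])
    return mem


PROGRAM = [
    ("load", "T", "C"), ("mult", "T", "D"), ("add", "T", "B"),
    ("sub", "T", "E"), ("add", "T", "F"), ("add", "A", "T"),
]
MEMORY = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6, "T": 0}  # assumed

if __name__ == "__main__":
    print(run_two_address(PROGRAM, MEMORY)["A"])
```

The initial load into T plays the role of the extra move: it preserves C, which mult would otherwise overwrite. The cost is six instructions instead of five.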

um+er of Addresses cont -um+er of Addresses

cont -

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 59101

um+er of Addresses cont-um+er of Addresses cont-

5ne$address machines 4se special set of registers called accumulators

$ Specify one source operand receive the result

Called accumulator machines

Sample instructions

load addr accum = [addr]

store addr M[addr] = accumadd addr accum = accum + [addr]

sub addr accum = accum - [addr]

mult addr accum = accum [addr]

um+er of Addresses cont -um+er of Addresses

cont -

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 60101

um+er of Addresses cont-um+er of Addresses cont-

ltample

C statementA C H D F 6 A

ltJuivalent code6

load C load C to accum

mult D accum = CD

add B accum = CD+B

sub E accum = B+CD-Eadd F accum = B+CD-E+F

add A accum = B+CD-E+F+A

store A store accum cotets A

um+er of Addresses cont -um+er of Addresses

cont -

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 61101

um+er of Addresses cont-um+er of Addresses cont-

Vero$address machines

Stac0 supplies operands and receives the result$ Special instructions to load and store use an address

Called stac0 machines (lt6 Pgtgtgt 8urroughs 8gtgt)

Sample instructions

us addr us([addr])

o addr o([addr])

add us(o + o)

sub us(o - o) mult us(o o)

um+er of Addresses cont -um+er of Addresses

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 62101

um+er of Addresses cont-um+er of Addresses cont-

ltample

C statement

A C H D F 6 A

ltJuivalent code6

us E sub

us C us F

us D add

Mult us A

us B add

add o A

)oadStore Architecture)oadStore Architecture

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 63101

)oadStore Architecture)oadStore Architecture

Instructions epect operands in internal processor registers Special 35A and ST51lt instructions move data beteen registers

and memory

1ISC uses this architecture

1educes instruction length

()

)oadStore Architecture cont-)oadStore Architecture

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 64101

)oadStore Architecture cont-)oadStore Architecture cont-

Sample instructionsload $daddr $d = [addr]

store addr$s (addr) = $s

add $d$s$samp $d = $s + $sampsub $d$s$samp $d = $s - $samp

mult $d$s$samp $d = $s $samp

um+er of Addresses cont-um+er of Addresses

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 65101

um+er of Addresses cont-um+er of Addresses cont-

ampleC statement

A = B + C D E + F + A

1uialent co)eload $B mult $amp$amp$

load $ampC add $amp$amp$

load $D sub $amp$amp$

load $E add $amp$amp$

load $F add $amp$amp$

load $A store A$amp

0lo1 of Control 0lo1 of Control

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 66101

0lo1 of Control 0lo1 of Control

efault is seJuential flo

Several instructions alter this defaulteecution

8ranches$ 4nconditional

$ Conditional

$ elayed branches Procedure calls

$ elayed procedure calls

0lo1 of Control cont-0lo1 of Control

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 67101

0lo1 of Control cont-0lo1 of Control cont-

8ranches

4nconditional

$ Absolute address

$ PC$relative

U Target address is specified relative to PC contents U 1elocatable code

ltample6 MIPS

$ Absolute address

9 target

$ PC$relative

8 target

0lo1 of Control cont-0lo1 of Control

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 68101

0lo1 of Control cont- -

e entium e R

0lo1 of Control cont-0lo1 of Control

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 69101

lo1 o Co t ol co t- -

8ranches

Conditional

$ ump is ta0en only if the condition is met

To types

$ Set$Then$ump

U Condition testing is separated from branching U Condition code registers are used to convey the condition test

result

U Condition code registers 0eep a record of the status of the last A34 operation such as overflo condition

$ ltample6 Pentium codecm AB comare A ad B

e taret um e0ual

0lo1 of Control cont-0lo1 of Control

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 70101

- -

$ Test$and$ump

U Single instruction performs condition testing and branching

$ ltample6 MIPS instruction

be0 $src$srcamptaret

umps to target if 1src E 1src

elayed branching

Control is transferred after eecuting the instruction thatfollos the branch instruction

$ This instruction slot is called delay slot Improves efficiency

ighly pipelined 1ISC processors support

0lo1 of Control cont-0lo1 of Control

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 71101

- -

Procedure calls Lacilitate modular programming

1eJuire to pieces of information to return

$ ltnd of procedure U Pentium

uses ret instruction

U MIPS

uses 9r instruction

$ 1eturn address U In a (special) register

MIPS allos any general$purpose register

U 5n the stac0

Pentium

0lo1 of Control cont-0lo1 of Control

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 72101

- -

0lo1 of Control cont-0lo1 of Control

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 73101

- -

elay slot

Parameter PassingParameter Passin

g

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 74101

gg

To basic techniJues 1egister$based (eg PoerPC MIPS)

$ Internal registers are used U Laster

U 3imit the number of parameters U 1ecursive procedure

Stac0$based (eg Pentium)

$ Stac0 is used U More general

2 perand Types2

perand Types

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 75101

p yp

Instructions support basic data types

Characters Integers

Lloating$point

Instruction overload

Same instruction for different data types

ltample6 Pentium mo1 A2address loads a 3-bt 1alue

mo1 Aaddress loads a -bt 1alue

mo1 EAaddress loads a amp-bt 1alue

perand Types

perand Types

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 76101

Separate instructions

Instructions specify the operand si-e

ltample6 MIPS

lb $destaddress loads a b4te

l $destaddress loads a al5ord( bts)

l5 $destaddress loads a 5ord

(amp bts)

ld $destaddress loads a double5ord

( bts)imilar instruction store

3 Addressing Modes3 Addressin

g Modes

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 77101

o the operands are specified

5perands can be in three places

$ 1egisters U 1egister addressing mode

$ Part of instruction U Constant

U Immediate addressing mode

U All processors support these to addressing modes

$ Memory U ifference beteen 1ISC and CISC

U CISC supports a large variety of addressing modes

U 1ISC follos load2store architecture

4 Instruction Types4 Instruction T

ypes

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 78101

Several types of instructions

ata movement$ Pentium6 mo1 destsrc

$ Some do not provide direct data movement instructions

$ Indirect data movement

add $dest$src6 $dest = $src+6

Arithmetic and 3ogical

$ Arithmetic U Integer and floating$point signed and unsigned U add subtract multiply divide

$ 3ogical U andB orB notB 7or

Instruction Types cont-Instruction T

ypes cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 79101

Condition code bits

S6 Sign bit (gt E F E $)

6 Vero bit (gt E non-ero E -ero)

$6 5verflo bit (gt E no overflo E overflo)

C6 Carry bit (gt E no carry E carry)

ltample6 Pentium

cm coutamp comare cout to amp

subtract amp rom cout

e taret um e0ual

Instruction Types cont-Instruction T

ypes cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 80101

Llo control and I25 instructions

$ 8ranch

$ Procedure call

$ Interrupts

I25 instructions$ Memory$mapped I25

U Most processors support memory$mapped I25

U 7o separate instructions for I25

$ Isolated I25 U Pentium supports isolated I25

U Separate I25 instructions

Ao7ort read from an IO ort

out o7ortA rte to an IO ort

5 Instruction 0ormats5 Instruction 0ormats

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 81101

To types

Lied$length$ 4sed by 1ISC processors

$ $bit 1ISC processors use $bits ide instructions U ltamples6 SPA1C MIPS PoerPC

ariable$length

$ 4sed by CISC processors

$ Memory operands need more bits to specify

5pcode

MaOor and eact operation

Examples of Instruction 0ormatsExam

ples of Instruction 0ormats

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 82101

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 83101

ISC e)uce) Instruction Set Computer 3

ersus

CISC Comple Instruction Set Computer3

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 84101

0

RISC s CISCRISC s CISC

The underlying philosophy of 1ISC machines is that asystem is better able to manage program eecutionhen the program consists of only a fe differentinstructions that are the same length and reJuire thesame number of cloc0 cycles to decode and eecute

1ISC systems access memory only ith eplicit loadand store instructions

In CISC systems many different 0inds of instructionsaccess memory ma0ing instruction length variableand fetch$decode$eecute time unpredictable

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 85101

The difference beteen CISC and 1ISC becomesevident through the basic computer performanceeJuation6

1ISC systems shorten eecution time by reducingthe cloc0 cycles per instruction

CISC systems improve performance by reducing thenumber of instructions per program

RISC s CISCRISC s CISC

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 86101

(

The simple instruction set of 1ISC machinesenables control units to be hardired for maimumspeed

The more comple$$ and variable$$ instruction set of

CISC machines reJuires microcode$based controlunits that interpret instructions as they are fetchedfrom memory This translation ta0es time

Dith fied$length instructions 1ISC lends itself topipelining and speculative eecution

RISC s CISCRISC s CISC

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 87101

mo1 a8 6 mo1 b8 6 mo1 c8

Be add a8 b8 loo Be

Consider the the program fragments6

The total cloc0 cycles for the CISC version might be6(amp mo1s c4cle) + ( mul 6 c4cles) = amp c4cles

Dhile the cloc0 cycles for the 1ISC version is6

( mo1s c4cle) + ( adds c4cle) + ( loos c4cle) = c4cles

Dith 1ISC cloc0 cycle being shorter 1ISC gives usmuch faster eecution speeds

mo1 a8 6 mo1 b8 mul b8 a8

CISC RISC

RISC s CISCRISC s CISC

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 88101

8ecause of their load$store ISAs 1ISC architecturesreJuire a large number of CP4 registers

These register provide fast access to data duringseJuential program eecution

They can also be employed to reduce the overheadtypically caused by passing parameters tosubprograms

Instead of pulling parameters off of a stac0 the

subprogram is directed to use a subset of registers

RISC s CISCRISC s CISC

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 89101

3

This is horegisters canbe overlappedin a 1ISCsystem

The currentindo pointer (CDP) pointsto the activeregister

indo

RISC vs. CISC

- It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.
- Some RISC systems provide more extravagant instruction sets than some CISC systems.
- Some systems combine both approaches.
- The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.

RISC vs. CISC

RISC                                          CISC
- Multiple register sets.                     - Single register set.
- Three operands per instruction.             - One or two register operands per instruction.
- Parameter passing through register windows. - Parameter passing through memory.
- Single-cycle instructions.                  - Multiple cycle instructions.
- Hardwired control.                          - Microprogrammed control.
- Highly pipelined.                           - Less pipelined.

(Continued)

RISC vs. CISC

RISC                                          CISC
- Simple instructions, few in number.         - Many complex instructions.
- Fixed length instructions.                  - Variable length instructions.
- Complexity in compiler.                     - Complexity in microcode.
- Only LOAD/STORE instructions access memory. - Many instructions can access memory.
- Few addressing modes.                       - Many addressing modes.

Summary

Instruction Set Design Issues

Instruction set design issues include:
- Where are operands stored?
  - registers, memory, stack, accumulator
- How many explicit operands are there?
  - 0, 1, 2, or 3
- How is the operand location specified?
  - register, immediate, indirect, ...
- What type and size of operands are supported?
  - byte, int, float, double, string, vector, ...
- What operations are supported?
  - add, sub, mul, move, compare, ...

More About General Purpose Registers

Why do almost all new architectures use GPRs?
- Registers are much faster than memory (even cache)
  - Register values are available immediately
  - When memory isn't ready, the processor must wait ("stall")
- Registers are convenient for variable storage
  - Compiler assigns some variables just to registers
  - More compact code, since small fields specify registers (compared to memory addresses)

[Figure: memory hierarchy: Processor (Registers), Cache, Memory, Disk]

What Operations are Needed

- Arithmetic and Logical
  - Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
  - Logical operation: AND, OR, XOR, NOT
- Data Transfer: copy, load, store
- Control: branch, jump, call, return
- Floating Point: ADDF, MULF, DIVF (same as arithmetic, but usually take bigger operands)
- Decimal: ADDD, CONVERT
- String: move, compare, search
- Graphics: pixel and vertex, compression/decompression operations

Stack Architecture: Pros and Cons

Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures

Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures

Accumulator Architecture: Pros and Cons

Pros
- Very low hardware requirements
- Easy to design and understand

Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic

Memory-Memory Architecture: Pros and Cons

Pros
- Requires fewer instructions (especially with three operands)
- Easy to write compilers for (especially with three operands)

Cons
- Very high memory traffic (especially with three operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required

Memory-Register Architecture: Pros and Cons

Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density

Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers

Number of Addresses (cont.)

Three-address machines
- Two for the source operands, one for the result
- RISC processors use three addresses
- Sample instructions
    add  dest,src1,src2    ; M(dest)=[src1]+[src2]
    sub  dest,src1,src2    ; M(dest)=[src1]-[src2]
    mult dest,src1,src2    ; M(dest)=[src1]*[src2]

Three addresses: Operand 1, Operand 2, Result
- Example: a = b + c
- Three-address instruction formats are not common because they require a relatively long instruction format to hold the three address references.

Number of Addresses (cont.)

Example
- C statement
    A = B + C * D - E + F + A
- Equivalent code:
    mult T,C,D    ; T = C*D
    add  T,T,B    ; T = B+C*D
    sub  T,T,E    ; T = B+C*D-E
    add  T,T,F    ; T = B+C*D-E+F
    add  A,T,A    ; A = B+C*D-E+F+A
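The three-address sequence can be traced with a small Python sketch. The interpreter, the dictionary standing in for memory and the operand values are all hypothetical. Only the five instructions come from the slide.

```python
import operator

OPS = {"add": operator.add, "sub": operator.sub, "mult": operator.mul}

def run_three_address(program, memory):
    """Execute (op, dest, src1, src2) tuples: M(dest) = [src1] op [src2]."""
    for op, dest, src1, src2 in program:
        memory[dest] = OPS[op](memory[src1], memory[src2])
    return memory

program = [
    ("mult", "T", "C", "D"),   # T = C*D
    ("add",  "T", "T", "B"),   # T = B+C*D
    ("sub",  "T", "T", "E"),   # T = B+C*D-E
    ("add",  "T", "T", "F"),   # T = B+C*D-E+F
    ("add",  "A", "T", "A"),   # A = B+C*D-E+F+A
]

# Arbitrary sample values, chosen only for the check below.
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
expected = memory["B"] + memory["C"] * memory["D"] - memory["E"] + memory["F"] + memory["A"]

run_three_address(program, memory)
print(memory["A"], expected)
```

Every instruction names three memory locations, so the code is short (five instructions), but each instruction must carry three address fields.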

Number of Addresses (cont.)

Two-address machines
- One address doubles (for source operand and result)
- Last example makes a case for it
  - Address T is used twice
- Sample instructions
    load dest,src    ; M(dest)=[src]
    add  dest,src    ; M(dest)=[dest]+[src]
    sub  dest,src    ; M(dest)=[dest]-[src]
    mult dest,src    ; M(dest)=[dest]*[src]

Two addresses: one address doubles as operand and result
- Example: a = a + b
- The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation.

Number of Addresses (cont.)

Example
- C statement
    A = B + C * D - E + F + A
- Equivalent code:
    load T,C    ; T = C
    mult T,D    ; T = C*D
    add  T,B    ; T = B+C*D
    sub  T,E    ; T = B+C*D-E
    add  T,F    ; T = B+C*D-E+F
    add  A,T    ; A = B+C*D-E+F+A

Number of Addresses (cont.)

One-address machines
- Use a special set of registers called accumulators
  - Specify one source operand and receive the result
- Called accumulator machines
- Sample instructions
    load  addr    ; accum = [addr]
    store addr    ; M[addr] = accum
    add   addr    ; accum = accum + [addr]
    sub   addr    ; accum = accum - [addr]
    mult  addr    ; accum = accum * [addr]

Number of Addresses (cont.)

Example
- C statement
    A = B + C * D - E + F + A
- Equivalent code:
    load  C    ; load C into accum
    mult  D    ; accum = C*D
    add   B    ; accum = C*D+B
    sub   E    ; accum = B+C*D-E
    add   F    ; accum = B+C*D-E+F
    add   A    ; accum = B+C*D-E+F+A
    store A    ; store accum contents in A
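A hedged sketch of an accumulator machine running the same sequence is shown below. It also counts data-memory accesses, since every one-address instruction here reads or writes memory once. The interpreter and the sample values are assumptions made for illustration.

```python
def run_accumulator(program, memory):
    """Run (op, addr) pairs; return (memory, number of data-memory accesses)."""
    accum = 0
    accesses = 0
    for op, addr in program:
        accesses += 1                  # each instruction touches memory once
        if op == "load":
            accum = memory[addr]
        elif op == "store":
            memory[addr] = accum
        elif op == "add":
            accum += memory[addr]
        elif op == "sub":
            accum -= memory[addr]
        elif op == "mult":
            accum *= memory[addr]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory, accesses

program = [("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
           ("add", "F"), ("add", "A"), ("store", "A")]

# Arbitrary sample values, chosen only for the check below.
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
memory, accesses = run_accumulator(program, memory)
print(memory["A"], accesses)
```

Seven instructions produce seven data-memory accesses, and every operation funnels through the single accumulator.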

Number of Addresses (cont.)

Zero-address machines
- Stack supplies operands and receives the result
  - Special instructions to load and store use an address
- Called stack machines (Ex: HP3000, Burroughs B5500)
- Sample instructions
    push addr    ; push([addr])
    pop  addr    ; pop([addr])
    add          ; push(pop + pop)
    sub          ; push(pop - pop)
    mult         ; push(pop * pop)

Number of Addresses (cont.)

Example
- C statement
    A = B + C * D - E + F + A
- Equivalent code:
    push E
    push C
    push D
    mult
    push B
    add
    sub
    push F
    add
    push A
    add
    pop  A
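The stack sequence can be checked with a small Python sketch. Following the sample instructions, sub is taken to compute push(pop - pop), with the first pop returning the top of the stack. The interpreter and the sample values are hypothetical.

```python
def run_stack(program, memory):
    """Run a zero-address program given as a list of instruction strings."""
    stack = []
    for instruction in program:
        op, *args = instruction.split()
        if op == "push":
            stack.append(memory[args[0]])
        elif op == "pop":
            memory[args[0]] = stack.pop()
        else:
            first = stack.pop()        # top of stack
            second = stack.pop()       # element below it
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)
            elif op == "mult":
                stack.append(first * second)
            else:
                raise ValueError(f"unknown instruction: {op}")
    return memory

program = ["push E", "push C", "push D", "mult", "push B", "add",
           "sub", "push F", "add", "push A", "add", "pop A"]

# Arbitrary sample values, chosen only for the check below.
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
run_stack(program, memory)
print(memory["A"])
```

E is pushed first so that it sits below B + C*D when sub executes. Under the push(pop - pop) convention this yields (B + C*D) - E rather than the reverse.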

Load/Store Architecture

- Instructions expect operands in internal processor registers
  - Special LOAD and STORE instructions move data between registers and memory
- RISC uses this architecture
- Reduces instruction length

Load/Store Architecture (cont.)

Sample instructions
    load  Rd,addr      ; Rd = [addr]
    store addr,Rs      ; (addr) = Rs
    add   Rd,Rs1,Rs2   ; Rd = Rs1 + Rs2
    sub   Rd,Rs1,Rs2   ; Rd = Rs1 - Rs2
    mult  Rd,Rs1,Rs2   ; Rd = Rs1 * Rs2

Number of Addresses (cont.)

Example
- C statement
    A = B + C * D - E + F + A
- Equivalent code:
    load  R1,B
    load  R2,C
    load  R3,D
    load  R4,E
    load  R5,F
    load  R6,A
    mult  R2,R2,R3
    add   R2,R2,R1
    sub   R2,R2,R4
    add   R2,R2,R5
    add   R2,R2,R6
    store A,R2

Flow of Control

- Default is sequential flow
- Several instructions alter this default execution
  - Branches
    - Unconditional
    - Conditional
    - Delayed branches
  - Procedure calls
    - Delayed procedure calls

Flow of Control (cont.)

Branches
- Unconditional
  - Absolute address
  - PC-relative
    - Target address is specified relative to PC contents
    - Relocatable code
- Example: MIPS
  - Absolute address
      j target
  - PC-relative
      b target
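How the two kinds of target address are formed can be sketched in Python for MIPS32. There, a branch encodes a signed 16-bit word offset that is added to the address of the instruction following the branch (PC + 4), while j encodes a 26-bit word index that replaces the low 28 bits of that address. The function names and the sample addresses are assumptions for illustration.

```python
def sign_extend_16(value):
    """Interpret the low 16 bits of value as a two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

def branch_target(pc, offset16):
    """PC-relative: (PC + 4) plus the sign-extended word offset times 4."""
    return (pc + 4 + (sign_extend_16(offset16) << 2)) & 0xFFFFFFFF

def jump_target(pc, index26):
    """Absolute within a 256 MB region: upper 4 bits of PC + 4, then index * 4."""
    return ((pc + 4) & 0xF0000000) | ((index26 & 0x3FFFFFF) << 2)

forward = branch_target(0x00400000, 3)        # three words past the delay slot
backward = branch_target(0x00400010, 0xFFFF)  # offset -1: back to the branch itself
absolute = jump_target(0x00400000, 0x100004)
print(hex(forward), hex(backward), hex(absolute))
```

Because a branch stores only a distance, the code keeps working when the whole program is loaded at a different address, which is the relocatability the slide mentions.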

Flow of Control (cont.)

Branches
- Conditional
  - Jump is taken only if the condition is met
- Two types
  - Set-Then-Jump
    - Condition testing is separated from branching
    - Condition code registers are used to convey the condition test result
    - Condition code registers keep a record of the status of the last ALU operation, such as an overflow condition
  - Example: Pentium code
      cmp AX,BX     ; compare AX and BX
      je  target    ; jump if equal

Flow of Control (cont.)

  - Test-and-Jump
    - Single instruction performs condition testing and branching
  - Example: MIPS instruction
      beq Rsrc1,Rsrc2,target
    Jumps to target if Rsrc1 = Rsrc2

Delayed branching
- Control is transferred after executing the instruction that follows the branch instruction
  - This instruction slot is called the delay slot
- Improves efficiency
- Highly pipelined RISC processors support it

Flow of Control (cont.)

Procedure calls
- Facilitate modular programming
- Require two pieces of information to return
  - End of procedure
    - Pentium uses the ret instruction
    - MIPS uses the jr instruction
  - Return address
    - In a (special) register
      - MIPS allows any general-purpose register
    - On the stack
      - Pentium

Flow of Control (cont.)

[Figure: delay slot]

Parameter Passing

Two basic techniques
- Register-based (e.g., PowerPC, MIPS)
  - Internal registers are used
    - Faster
    - Limit the number of parameters
    - Recursive procedures are harder to support
- Stack-based (e.g., Pentium)
  - Stack is used
    - More general

2. Operand Types

- Instructions support basic data types
  - Characters
  - Integers
  - Floating-point
- Instruction overload
  - Same instruction for different data types
  - Example: Pentium
      mov AL,address     ; loads an 8-bit value
      mov AX,address     ; loads a 16-bit value
      mov EAX,address    ; loads a 32-bit value

Operand Types (cont.)

- Separate instructions
  - Instructions specify the operand size
  - Example: MIPS
      lb Rdest,address    ; loads a byte
      lh Rdest,address    ; loads a halfword (16 bits)
      lw Rdest,address    ; loads a word (32 bits)
      ld Rdest,address    ; loads a doubleword (64 bits)
  - Similar instruction: store
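What the size-specific loads return can be sketched in Python. The byte buffer standing in for memory, the big-endian byte order and the load helper are assumptions for illustration (MIPS can run in either byte order). On MIPS, lb and lh sign-extend the loaded value, which signed=True mimics here.

```python
def load(memory, address, size, byteorder="big"):
    """Read `size` bytes at `address` and sign-extend them, like lb/lh/lw/ld."""
    return int.from_bytes(memory[address:address + size], byteorder, signed=True)

# Hypothetical memory contents, chosen to show sign extension.
memory = bytes([0xFF, 0xFE, 0x00, 0x01, 0x00, 0x00, 0x00, 0x2A])

byte_value = load(memory, 0, 1)        # lb: 0xFF
half_value = load(memory, 0, 2)        # lh: 0xFFFE
word_value = load(memory, 0, 4)        # lw: 0xFFFE0001
second_word = load(memory, 4, 4)       # lw: 0x0000002A
double_value = load(memory, 0, 8)      # ld: all eight bytes
print(byte_value, half_value, word_value, second_word, double_value)
```

The same starting address yields different values depending on the operand size named by the opcode, which is why MIPS needs a separate instruction per size.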

3. Addressing Modes

How the operands are specified
- Operands can be in three places
  - Registers
    - Register addressing mode
  - Part of instruction
    - Constant
    - Immediate addressing mode
    - All processors support these two addressing modes
  - Memory
    - Difference between RISC and CISC
    - CISC supports a large variety of addressing modes
    - RISC follows load/store architecture

4. Instruction Types

Several types of instructions
- Data movement
  - Pentium: mov dest,src
  - Some do not provide direct data movement instructions
  - Indirect data movement
      add Rdest,Rsrc,0    ; Rdest = Rsrc+0
- Arithmetic and Logical
  - Arithmetic
    - Integer and floating-point, signed and unsigned
    - add, subtract, multiply, divide
  - Logical
    - and, or, not, xor

Instruction Types (cont.)

Condition code bits
- S: Sign bit (0 = +, 1 = -)
- Z: Zero bit (0 = nonzero, 1 = zero)
- O: Overflow bit (0 = no overflow, 1 = overflow)
- C: Carry bit (0 = no carry, 1 = carry)
- Example: Pentium
    cmp count,25    ; compare count to 25
                    ; subtract 25 from count
    je  target      ; jump if equal
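A minimal Python sketch of how a compare sets the four condition code bits follows. A compare is modelled as a subtraction whose result is discarded, with carry meaning an unsigned borrow, as on x86. The function name, the default 32-bit width and the sample operands are assumptions.

```python
def cmp_flags(a, b, bits=32):
    """Return the S, Z, O and C bits produced by computing a - b."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "S": int(bool(result & sign)),                   # result is negative
        "Z": int(result == 0),                           # operands were equal
        "O": int(bool((a ^ b) & (a ^ result) & sign)),   # signed overflow
        "C": int(a < b),                                 # unsigned borrow
    }

equal = cmp_flags(25, 25)
smaller = cmp_flags(3, 25)
overflow = cmp_flags(0x80000000, 1)   # most negative 32-bit value minus 1
print(equal, smaller, overflow)
```

A following je only inspects the Z bit, so the branch instruction itself never needs to see the operands.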

Instruction Types (cont.)

Flow control and I/O instructions
- Branch
- Procedure call
- Interrupts

I/O instructions
- Memory-mapped I/O
  - Most processors support memory-mapped I/O
  - No separate instructions for I/O
- Isolated I/O
  - Pentium supports isolated I/O
  - Separate I/O instructions
      in  AX,io_port    ; read from an I/O port
      out io_port,AX    ; write to an I/O port

5. Instruction Formats

Two types
- Fixed-length
  - Used by RISC processors
  - 32-bit RISC processors use 32-bit-wide instructions
    - Examples: SPARC, MIPS, PowerPC
- Variable-length
  - Used by CISC processors
  - Memory operands need more bits to specify

Opcode
- Major and exact operation
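Fixed-length formats make decoding a matter of shifting and masking, since every field sits at a known bit position. The sketch below splits a 32-bit MIPS R-type instruction word into its fields. The field layout is the standard MIPS32 one, while the function name and the choice of example word are assumptions for illustration.

```python
def decode_r_type(word):
    """Split a 32-bit MIPS R-type instruction word into its six fields."""
    return {
        "opcode": (word >> 26) & 0x3F,   # bits 31..26: major operation
        "rs":     (word >> 21) & 0x1F,   # bits 25..21: first source register
        "rt":     (word >> 16) & 0x1F,   # bits 20..16: second source register
        "rd":     (word >> 11) & 0x1F,   # bits 15..11: destination register
        "shamt":  (word >> 6) & 0x1F,    # bits 10..6:  shift amount
        "funct":  word & 0x3F,           # bits 5..0:   exact operation
    }

# 0x012A4020 encodes add $t0, $t1, $t2 (registers 8, 9 and 10).
fields = decode_r_type(0x012A4020)
print(fields)
```

Here the opcode field gives the major operation (0 for this class of register instructions) and funct selects the exact one (0x20 for add), matching the "major and exact operation" split on the slide.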

Examples of Instruction Formats

[Figure: examples of instruction formats]

RISC (Reduced Instruction Set Computer)
versus
CISC (Complex Instruction Set Computer)

RISC vs. CISC

- The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.
- RISC systems access memory only with explicit load and store instructions.
- In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.

RISC vs. CISC

- The difference between CISC and RISC becomes evident through the basic computer performance equation:
    time/program = (time/cycle) × (cycles/instruction) × (instructions/program)
- RISC systems shorten execution time by reducing the clock cycles per instruction.
- CISC systems improve performance by reducing the number of instructions per program.


Page 56: Chapter 2: Advanced computer Architecture

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 56101

um+er of Addresses cont-um+er of Addresses cont-

ltample

C statement

A C H D F 6 A

ltJuivalent code6

mult TCD T = CD

add TTB T = B+CD

sub TTE T = B+CD-E

add TTF T = B+CD-E+Fadd ATA A = B+CD-E+F+A

um+er of Addresses cont -um+er of Addresses

cont -

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 57101

um+er of Addresses cont-um+er of Addresses cont-

To$address machines

5ne address doubles (for source operand result)

3ast eample ma0es a case for it

$ Address T is used tice

Sample instructions

load destsrc M(dest)=[src]

add destsrc M(dest)=[dest]+[src]

sub destsrc M(dest)=[dest]-[src]

mult destsrc M(dest)=[dest][src]

Two Addresses

One address doubles as operand and resultExample a = a + b

The t$o-address formal reduces the space reuirement but also

introduces some a$$ardness To aoid alterin the alue of an

operand a ampOE instruction is used to moe one of the alues to a

result or temporary location before performin the operation

um+er of Addresses cont -um+er of Addresses

cont -

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 58101

um+er of Addresses cont-um+er of Addresses cont-

ltample

C statement

A C H D F 6 A

ltJuivalent code6

load TC T = C

mult TD T = CD

add TB T = B+CD

sub TE T = B+CD-Eadd TF T = B+CD-E+F

add AT A = B+CD-E+F+A

um+er of Addresses cont -um+er of Addresses

cont -

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 59101

um+er of Addresses cont-um+er of Addresses cont-

5ne$address machines 4se special set of registers called accumulators

$ Specify one source operand receive the result

Called accumulator machines

Sample instructions

load addr accum = [addr]

store addr M[addr] = accumadd addr accum = accum + [addr]

sub addr accum = accum - [addr]

mult addr accum = accum [addr]

um+er of Addresses cont -um+er of Addresses

cont -

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 60101

um+er of Addresses cont-um+er of Addresses cont-

ltample

C statementA C H D F 6 A

ltJuivalent code6

load C load C to accum

mult D accum = CD

add B accum = CD+B

sub E accum = B+CD-Eadd F accum = B+CD-E+F

add A accum = B+CD-E+F+A

store A store accum cotets A

um+er of Addresses cont -um+er of Addresses

cont -

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 61101

um+er of Addresses cont-um+er of Addresses cont-

Vero$address machines

Stac0 supplies operands and receives the result$ Special instructions to load and store use an address

Called stac0 machines (lt6 Pgtgtgt 8urroughs 8gtgt)

Sample instructions

us addr us([addr])

o addr o([addr])

add us(o + o)

sub us(o - o) mult us(o o)

um+er of Addresses cont -um+er of Addresses

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 62101

um+er of Addresses cont-um+er of Addresses cont-

ltample

C statement

A C H D F 6 A

ltJuivalent code6

us E sub

us C us F

us D add

Mult us A

us B add

add o A

)oadStore Architecture)oadStore Architecture

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 63101

)oadStore Architecture)oadStore Architecture

Instructions epect operands in internal processor registers Special 35A and ST51lt instructions move data beteen registers

and memory

1ISC uses this architecture

1educes instruction length

()

)oadStore Architecture cont-)oadStore Architecture

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 64101

)oadStore Architecture cont-)oadStore Architecture cont-

Sample instructionsload $daddr $d = [addr]

store addr$s (addr) = $s

add $d$s$samp $d = $s + $sampsub $d$s$samp $d = $s - $samp

mult $d$s$samp $d = $s $samp

um+er of Addresses cont-um+er of Addresses

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 65101

um+er of Addresses cont-um+er of Addresses cont-

ampleC statement

A = B + C D E + F + A

1uialent co)eload $B mult $amp$amp$

load $ampC add $amp$amp$

load $D sub $amp$amp$

load $E add $amp$amp$

load $F add $amp$amp$

load $A store A$amp

0lo1 of Control 0lo1 of Control

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 66101

0lo1 of Control 0lo1 of Control

efault is seJuential flo

Several instructions alter this defaulteecution

8ranches$ 4nconditional

$ Conditional

$ elayed branches Procedure calls

$ elayed procedure calls

0lo1 of Control cont-0lo1 of Control

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 67101

0lo1 of Control cont-0lo1 of Control cont-

8ranches

4nconditional

$ Absolute address

$ PC$relative

U Target address is specified relative to PC contents U 1elocatable code

ltample6 MIPS

$ Absolute address

9 target

$ PC$relative

8 target

0lo1 of Control cont-0lo1 of Control

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 68101

0lo1 of Control cont- -

e entium e R

Flow of Control (cont.)

Branches

Conditional
- Jump is taken only if the condition is met

Two types
- Set-Then-Jump
  * Condition testing is separated from branching
  * Condition code registers are used to convey the condition test result
  * Condition code registers keep a record of the status of the last ALU operation, such as an overflow condition
- Example: Pentium code
    cmp AX,BX     ; compare AX and BX
    je  target    ; jump if equal

Flow of Control (cont.)

- Test-and-Jump
  * Single instruction performs condition testing and branching
- Example: MIPS instruction
    beq Rsrc1,Rsrc2,target
  Jumps to target if Rsrc1 = Rsrc2

Delayed branching

Control is transferred after executing the instruction that follows the branch instruction
- This instruction slot is called the delay slot

Improves efficiency

Highly pipelined RISC processors support it
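Delayed branching is easier to see in an execution trace. The following is a rough sketch in Python of a toy interpreter, under the assumption of a made-up two-instruction language (`op` and `jump`); `run_toy` is a hypothetical name and the model does not handle a jump placed in the delay slot.

```python
def run_toy(program, delayed):
    """Run a toy program and return the labels of the executed instructions.

    Each instruction is ("op", label) or ("jump", target_index).
    With delayed=True the instruction after a jump (the delay slot)
    executes before control is transferred.
    """
    trace = []
    pc = 0
    pending = None                   # branch target waiting for the delay slot
    while pc < len(program):
        kind, arg = program[pc]
        next_pc = pc + 1
        if pending is not None:      # this instruction sits in the delay slot
            next_pc = pending
            pending = None
        if kind == "jump":
            trace.append(f"jump->{arg}")
            if delayed:
                pending = arg        # transfer after the next instruction
            else:
                next_pc = arg        # transfer immediately
        else:
            trace.append(arg)
        pc = next_pc
    return trace


program = [("op", "a"), ("jump", 4), ("op", "b"), ("op", "c"), ("op", "d")]
print(run_toy(program, delayed=False))  # ['a', 'jump->4', 'd']
print(run_toy(program, delayed=True))   # ['a', 'jump->4', 'b', 'd']
```

With delayed branching the instruction `b` still does useful work while the branch resolves, which is where the efficiency gain comes from.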

Flow of Control (cont.)

Procedure calls

Facilitate modular programming

Require two pieces of information to return
- End of procedure
  * Pentium uses the ret instruction
  * MIPS uses the jr instruction
- Return address
  * In a (special) register: MIPS allows any general-purpose register
  * On the stack: Pentium

Flow of Control (cont.)

[Figure: delay slot]

Parameter Passing

Two basic techniques

Register-based (e.g., PowerPC, MIPS)
- Internal registers are used
  * Faster
  * Limit the number of parameters
  * Recursive procedure

Stack-based (e.g., Pentium)
- Stack is used
  * More general

2 Operand Types

Instructions support basic data types
- Characters
- Integers
- Floating-point

Instruction overload
- Same instruction for different data types
- Example: Pentium
    mov AL,address     ; loads an 8-bit value
    mov AX,address     ; loads a 16-bit value
    mov EAX,address    ; loads a 32-bit value

Operand Types

Separate instructions
- Instructions specify the operand size
- Example: MIPS
    lb Rdest,address    ; loads a byte
    lh Rdest,address    ; loads a halfword (16 bits)
    lw Rdest,address    ; loads a word (32 bits)
    ld Rdest,address    ; loads a doubleword (64 bits)

Similar instructions for store

3 Addressing Modes

How the operands are specified

Operands can be in three places
- Registers
  * Register addressing mode
- Part of instruction
  * Constant
  * Immediate addressing mode
  * All processors support these two addressing modes
- Memory
  * Difference between RISC and CISC
  * CISC supports a large variety of addressing modes
  * RISC follows load/store architecture

4 Instruction Types

Several types of instructions

Data movement
- Pentium: mov dest,src
- Some do not provide direct data movement instructions
- Indirect data movement
    add Rdest,Rsrc,0    ; Rdest = Rsrc+0

Arithmetic and Logical
- Arithmetic
  * Integer and floating-point, signed and unsigned
  * add, subtract, multiply, divide
- Logical
  * and, or, not, xor

Instruction Types (cont.)

Condition code bits
- S: Sign bit (0 = +, 1 = -)
- Z: Zero bit (0 = nonzero, 1 = zero)
- O: Overflow bit (0 = no overflow, 1 = overflow)
- C: Carry bit (0 = no carry, 1 = carry)

Example: Pentium
    cmp count,25    ; compare count to 25 (subtract 25 from count)
    je  target      ; jump if equal
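How a compare sets the four condition code bits can be sketched in a few lines. The Python below is an illustration only: `compare` is a hypothetical helper, the 8-bit default width is an assumption, and the carry bit follows the x86 convention of signalling a borrow on subtraction.

```python
def compare(a, b, bits=8):
    """Return the condition code bits after computing a - b, as a compare does."""
    mask = (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask          # the difference is discarded, only flags are kept
    return {
        "S": int(bool(result & sign_bit)),                   # sign of the result
        "Z": int(result == 0),                               # result is zero
        "O": int(bool((a ^ b) & (a ^ result) & sign_bit)),   # signed overflow
        "C": int(a < b),                                     # unsigned borrow
    }


print(compare(25, 25))    # {'S': 0, 'Z': 1, 'O': 0, 'C': 0}  -> "jump if equal" is taken
print(compare(5, 25))     # {'S': 1, 'Z': 0, 'O': 0, 'C': 1}
print(compare(0x80, 1))   # {'S': 0, 'Z': 0, 'O': 1, 'C': 0}  -> -128 - 1 overflows
```

A set-then-jump branch such as "jump if equal" then only needs to inspect the Z bit.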

Instruction Types (cont.)

Flow control and I/O instructions
- Branch
- Procedure call
- Interrupts

I/O instructions
- Memory-mapped I/O
  * Most processors support memory-mapped I/O
  * No separate instructions for I/O
- Isolated I/O
  * Pentium supports isolated I/O
  * Separate I/O instructions
      in  AX,io_port    ; read from an I/O port
      out io_port,AX    ; write to an I/O port

5 Instruction Formats

Two types

Fixed-length
- Used by RISC processors
- 32-bit RISC processors use 32-bit-wide instructions
  * Examples: SPARC, MIPS, PowerPC

Variable-length
- Used by CISC processors
- Memory operands need more bits to specify

Opcode
- Major and exact operation
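A fixed-length format makes decoding a matter of shifting and masking. As a sketch, the Python below splits a 32-bit word using the MIPS R-type field layout (6-bit opcode, three 5-bit register fields, 5-bit shift amount, 6-bit function code). That layout is brought in from the MIPS architecture rather than from these slides, and `decode_r_type` is a hypothetical helper name.

```python
def decode_r_type(word):
    """Split a 32-bit MIPS R-type instruction word into its fields."""
    return {
        "opcode": (word >> 26) & 0x3F,   # major operation (6 bits)
        "rs":     (word >> 21) & 0x1F,   # first source register
        "rt":     (word >> 16) & 0x1F,   # second source register
        "rd":     (word >> 11) & 0x1F,   # destination register
        "shamt":  (word >> 6) & 0x1F,    # shift amount
        "funct":  word & 0x3F,           # exact operation (6 bits)
    }


# 0x02324020 encodes: add $t0, $s1, $s2  (registers 8, 17, 18)
fields = decode_r_type(0x02324020)
print(fields)
# {'opcode': 0, 'rs': 17, 'rt': 18, 'rd': 8, 'shamt': 0, 'funct': 32}
```

Here the opcode field carries the major operation and the function field selects the exact one, and every field sits at a fixed bit position. A variable-length format cannot be decoded this way because the position of later fields depends on earlier ones.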

Examples of Instruction Formats

[Figure: examples of instruction formats]

RISC (Reduced Instruction Set Computer)
versus
CISC (Complex Instruction Set Computer)

RISC vs. CISC

The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.

RISC systems access memory only with explicit load and store instructions.

In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.

RISC vs. CISC

The difference between CISC and RISC becomes evident through the basic computer performance equation:

    time/program = (time/cycle) × (cycles/instruction) × (instructions/program)

RISC systems shorten execution time by reducing the clock cycles per instruction.

CISC systems improve performance by reducing the number of instructions per program.

RISC vs. CISC

The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.

The more complex (and variable) instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.

With fixed-length instructions, RISC lends itself to pipelining and speculative execution.

RISC vs. CISC

Consider the following program fragments:

CISC

    mov ax, 10
    mov bx, 5
    mul bx, ax

RISC

           mov ax, 0
           mov bx, 10
           mov cx, 5
    Begin: add ax, bx
           loop Begin

The total clock cycles for the CISC version might be:

    (2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles

While the clock cycles for the RISC version are:

    (3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles

With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds.
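The cycle comparison can be reproduced by counting cycles while the two fragments run. The sketch below is in Python and rests on assumed, illustrative costs (1 cycle for each mov, add, and loop, and 30 cycles for mul) and assumed operand values of 10 and 5. Real instruction timings vary by processor.

```python
# Assumed, illustrative cycle costs; not measurements of any real processor.
CYCLES = {"mov": 1, "add": 1, "loop": 1, "mul": 30}


def cisc_version():
    """Multiply with a single multi-cycle mul instruction."""
    cycles = 0
    ax = 10
    cycles += CYCLES["mov"]
    bx = 5
    cycles += CYCLES["mov"]
    bx = bx * ax
    cycles += CYCLES["mul"]
    return bx, cycles


def risc_version():
    """Multiply by repeated single-cycle additions."""
    cycles = 0
    ax = 0
    cycles += CYCLES["mov"]
    bx = 10
    cycles += CYCLES["mov"]
    cx = 5
    cycles += CYCLES["mov"]
    while True:
        ax += bx                    # Begin: add ax, bx
        cycles += CYCLES["add"]
        cx -= 1                     # loop Begin: decrement cx, branch if nonzero
        cycles += CYCLES["loop"]
        if cx == 0:
            break
    return ax, cycles


print(cisc_version())  # (50, 32)
print(risc_version())  # (50, 13)
```

Both versions compute the same product. The RISC fragment executes more instructions but spends fewer cycles in total under these assumed costs.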

RISC vs. CISC

Because of their load-store ISAs, RISC architectures require a large number of CPU registers.

These registers provide fast access to data during sequential program execution.

They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.

Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.

RISC vs. CISC

This is how registers can be overlapped in a RISC system.

The current window pointer (CWP) points to the active register window.
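Overlapping register windows can be modelled as views onto one physical register array. The Python below is a simplified sketch, not a description of any particular processor: the class name, the group size of 8, and the choice to advance the window pointer upward on a call are assumptions, and window overflow handling is left out.

```python
class WindowedRegisterFile:
    """Toy register file with overlapping windows (sizes are illustrative)."""

    def __init__(self, n_windows=4, group=8):
        self.group = group                   # registers per in/local/out group
        self.stride = 2 * group              # adjacent windows share one group
        self.regs = [0] * (n_windows * self.stride)
        self.cwp = 0                         # current window pointer

    def _index(self, kind, n):
        offset = {"in": 0, "local": 1, "out": 2}[kind] * self.group + n
        return (self.cwp * self.stride + offset) % len(self.regs)

    def read(self, kind, n):
        return self.regs[self._index(kind, n)]

    def write(self, kind, n, value):
        self.regs[self._index(kind, n)] = value

    def call(self):
        self.cwp += 1    # the callee's "in" group is the caller's "out" group

    def ret(self):
        self.cwp -= 1


rf = WindowedRegisterFile()
rf.write("out", 0, 42)        # caller places an argument in an out register
rf.call()
print(rf.read("in", 0))       # 42: the callee sees it without any copying
rf.write("in", 1, 7)          # callee leaves a result in the shared group
rf.ret()
print(rf.read("out", 1))      # 7
```

Passing a parameter costs no memory traffic because caller and callee name the same physical registers through different windows.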

RISC vs. CISC

It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.

Some RISC systems provide more extravagant instruction sets than some CISC systems.

Some systems combine both approaches.

The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.

RISC vs. CISC

RISC
- Multiple register sets.
- Three operands per instruction.
- Parameter passing through register windows.
- Single-cycle instructions.
- Hardwired control.
- Highly pipelined.

CISC
- Single register set.
- One or two register operands per instruction.
- Parameter passing through memory.
- Multiple cycle instructions.
- Microprogrammed control.
- Less pipelined.

Continued....

RISC vs. CISC

RISC
- Simple instructions, few in number.
- Fixed length instructions.
- Complexity in compiler.
- Only LOAD/STORE instructions access memory.
- Few addressing modes.

CISC
- Many complex instructions.
- Variable length instructions.
- Complexity in microcode.
- Many instructions can access memory.
- Many addressing modes.

Summary

Instruction Set Design Issues

Instruction set design issues include:

Where are operands stored?
- registers, memory, stack, accumulator

How many explicit operands are there?
- 0, 1, 2, or 3

How is the operand location specified?
- register, immediate, indirect, . . .

What type & size of operands are supported?
- byte, int, float, double, string, vector . . .

What operations are supported?
- add, sub, mul, move, compare . . .

More About General Purpose Registers

Why do almost all new architectures use GPRs?

Registers are much faster than memory (even cache)
- Register values are available immediately
- When memory isn't ready, processor must wait ("stall")

Registers are convenient for variable storage
- Compiler assigns some variables just to registers
- More compact code since small fields specify registers (compared to memory addresses)

[Figure: memory hierarchy showing processor registers, cache, memory, and disk]

What Operations are Needed

Arithmetic + Logical
- Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
- Logical operation: AND, OR, XOR, NOT

Data Transfer - copy, load, store

Control - branch, jump, call, return

Floating Point - ADDF, MULF, DIVF
- Same as arithmetic but usually take bigger operands

Decimal - ADDD, CONVERT

String - move, compare, search

Graphics - pixel and vertex, compression/decompression operations

Stack Architecture Pros and Cons

Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures

Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures

Accumulator Architecture Pros and Cons

Pros
- Very low hardware requirements
- Easy to design and understand

Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic

Memory-Memory Architecture Pros and Cons

Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)

Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required

Memory-Register Architecture Pros and Cons

Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density

Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers

Number of Addresses (cont.)

Two-address machines

One address doubles (for source operand & result)

Last example makes a case for it
- Address T is used twice

Sample instructions

    load   dest,src    M(dest) = [src]
    add    dest,src    M(dest) = [dest] + [src]
    sub    dest,src    M(dest) = [dest] - [src]
    mult   dest,src    M(dest) = [dest] * [src]

Two Addresses

One address doubles as operand and result. Example: a = a + b

The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation.

Number of Addresses (cont.)

Example

C statement

    A = B + C * D - E + F + A

Equivalent code:

    load   T,C    ; T = C
    mult   T,D    ; T = C*D
    add    T,B    ; T = B+C*D
    sub    T,E    ; T = B+C*D-E
    add    T,F    ; T = B+C*D-E+F
    add    A,T    ; A = B+C*D-E+F+A
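The two-address sequence above can be checked with a short interpreter. This is a minimal sketch in Python: `run_two_address` is a hypothetical name, memory is modelled as a dictionary, and the sample values are arbitrary.

```python
def run_two_address(program, mem):
    """Each instruction is (op, dest, src); dest is both an operand and the result."""
    ops = {
        "load": lambda d, s: s,
        "add":  lambda d, s: d + s,
        "sub":  lambda d, s: d - s,
        "mult": lambda d, s: d * s,
    }
    for op, dest, src in program:
        # mem.get covers the first load into a location that does not exist yet
        mem[dest] = ops[op](mem.get(dest, 0), mem[src])
    return mem


two_addr_program = [
    ("load", "T", "C"),
    ("mult", "T", "D"),
    ("add", "T", "B"),
    ("sub", "T", "E"),
    ("add", "T", "F"),
    ("add", "A", "T"),
]
two_addr_mem = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
run_two_address(two_addr_program, two_addr_mem)
print(two_addr_mem["A"])  # 16
print(two_addr_mem["C"])  # 3: working in the temporary T leaves C intact
```

Because every arithmetic instruction overwrites its destination, the temporary T is what keeps B through F unchanged.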

Number of Addresses (cont.)

One-address machines

Use special set of registers called accumulators
- Specify one source operand & receive the result

Called accumulator machines

Sample instructions

    load   addr    accum = [addr]
    store  addr    M[addr] = accum
    add    addr    accum = accum + [addr]
    sub    addr    accum = accum - [addr]
    mult   addr    accum = accum * [addr]

Number of Addresses (cont.)

Example

C statement

    A = B + C * D - E + F + A

Equivalent code:

    load   C    ; load C into accum
    mult   D    ; accum = C*D
    add    B    ; accum = C*D+B
    sub    E    ; accum = B+C*D-E
    add    F    ; accum = B+C*D-E+F
    add    A    ; accum = B+C*D-E+F+A
    store  A    ; store accum contents in A
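An accumulator machine is simple enough to simulate in a dozen lines. The Python sketch below assumes a single accumulator and a dictionary standing in for memory; `run_accumulator` and the sample values are illustrative, not taken from the slides.

```python
def run_accumulator(program, mem):
    """Each instruction is (op, addr); the accumulator is the implicit operand."""
    accum = 0
    for op, addr in program:
        if op == "load":
            accum = mem[addr]
        elif op == "store":
            mem[addr] = accum
        elif op == "add":
            accum += mem[addr]
        elif op == "sub":
            accum -= mem[addr]
        elif op == "mult":
            accum *= mem[addr]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return mem


accum_program = [
    ("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
    ("add", "F"), ("add", "A"), ("store", "A"),
]
accum_mem = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
run_accumulator(accum_program, accum_mem)
print(accum_mem["A"])  # 16
```

Every instruction except the first and last reads one operand from memory, which illustrates why accumulator designs generate high memory traffic.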

Number of Addresses (cont.)

Zero-address machines

Stack supplies operands and receives the result
- Special instructions to load and store use an address

Called stack machines (Ex: HP3000, Burroughs B5500)

Sample instructions

    push  addr    push([addr])
    pop   addr    pop([addr])
    add           push(pop + pop)
    sub           push(pop - pop)
    mult          push(pop * pop)

Number of Addresses (cont.)

Example

C statement

    A = B + C * D - E + F + A

Equivalent code:

    push E
    push C
    push D
    mult
    push B
    add
    sub
    push F
    add
    push A
    add
    pop  A
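The stack code is the hardest of the four variants to follow by eye, so a trace helps. The Python below is a small sketch of a zero-address machine. It assumes the sample-instruction semantics given earlier, where the first pop supplies the left operand. `run_stack` and the sample values are hypothetical.

```python
def run_stack(program, mem):
    """Arithmetic takes both operands from the stack; only push and pop name an address."""
    stack = []
    for instr in program:
        op = instr[0]
        if op == "push":
            stack.append(mem[instr[1]])
        elif op == "pop":
            mem[instr[1]] = stack.pop()
        else:
            first = stack.pop()       # top of stack: left operand
            second = stack.pop()      # next entry: right operand
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)
            elif op == "mult":
                stack.append(first * second)
            else:
                raise ValueError(f"unknown opcode: {op}")
    return mem


stack_program = [
    ("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
    ("push", "B"), ("add",), ("sub",),
    ("push", "F"), ("add",), ("push", "A"), ("add",), ("pop", "A"),
]
stack_mem = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
run_stack(stack_program, stack_mem)
print(stack_mem["A"])  # 16
```

E is pushed first so that it sits below B + C*D when sub executes, giving (B + C*D) - E rather than the reverse.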

Load/Store Architecture

Instructions expect operands in internal processor registers

Special LOAD and STORE instructions move data between registers and memory

RISC uses this architecture

Reduces instruction length

)oadStore Architecture cont-)oadStore Architecture

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 64101

)oadStore Architecture cont-)oadStore Architecture cont-

Sample instructionsload $daddr $d = [addr]

store addr$s (addr) = $s

add $d$s$samp $d = $s + $sampsub $d$s$samp $d = $s - $samp

mult $d$s$samp $d = $s $samp

um+er of Addresses cont-um+er of Addresses

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 65101

um+er of Addresses cont-um+er of Addresses cont-

ampleC statement

A = B + C D E + F + A

1uialent co)eload $B mult $amp$amp$

load $ampC add $amp$amp$

load $D sub $amp$amp$

load $E add $amp$amp$

load $F add $amp$amp$

load $A store A$amp

0lo1 of Control 0lo1 of Control

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 66101

0lo1 of Control 0lo1 of Control

efault is seJuential flo

Several instructions alter this defaulteecution

8ranches$ 4nconditional

$ Conditional

$ elayed branches Procedure calls

$ elayed procedure calls

0lo1 of Control cont-0lo1 of Control

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 67101

0lo1 of Control cont-0lo1 of Control cont-

8ranches

4nconditional

$ Absolute address

$ PC$relative

U Target address is specified relative to PC contents U 1elocatable code

ltample6 MIPS

$ Absolute address

9 target

$ PC$relative

8 target

0lo1 of Control cont-0lo1 of Control

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 68101

0lo1 of Control cont- -

e entium e R

0lo1 of Control cont-0lo1 of Control

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 69101

lo1 o Co t ol co t- -

8ranches

Conditional

$ ump is ta0en only if the condition is met

To types

$ Set$Then$ump

U Condition testing is separated from branching U Condition code registers are used to convey the condition test

result

U Condition code registers 0eep a record of the status of the last A34 operation such as overflo condition

$ ltample6 Pentium codecm AB comare A ad B

e taret um e0ual

0lo1 of Control cont-0lo1 of Control

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 70101

- -

$ Test$and$ump

U Single instruction performs condition testing and branching

$ ltample6 MIPS instruction

be0 $src$srcamptaret

umps to target if 1src E 1src

elayed branching

Control is transferred after eecuting the instruction thatfollos the branch instruction

$ This instruction slot is called delay slot Improves efficiency

ighly pipelined 1ISC processors support

0lo1 of Control cont-0lo1 of Control

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 71101

- -

Procedure calls Lacilitate modular programming

1eJuire to pieces of information to return

$ ltnd of procedure U Pentium

uses ret instruction

U MIPS

uses 9r instruction

$ 1eturn address U In a (special) register

MIPS allos any general$purpose register

U 5n the stac0

Pentium

0lo1 of Control cont-0lo1 of Control

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 72101

- -

0lo1 of Control cont-0lo1 of Control

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 73101

- -

elay slot

Parameter PassingParameter Passin

g

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 74101

gg

To basic techniJues 1egister$based (eg PoerPC MIPS)

$ Internal registers are used U Laster

U 3imit the number of parameters U 1ecursive procedure

Stac0$based (eg Pentium)

$ Stac0 is used U More general

2 perand Types2

perand Types

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 75101

p yp

Instructions support basic data types

Characters Integers

Lloating$point

Instruction overload

Same instruction for different data types

ltample6 Pentium mo1 A2address loads a 3-bt 1alue

mo1 Aaddress loads a -bt 1alue

mo1 EAaddress loads a amp-bt 1alue

perand Types

perand Types

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 76101

Separate instructions

Instructions specify the operand si-e

ltample6 MIPS

lb $destaddress loads a b4te

l $destaddress loads a al5ord( bts)

l5 $destaddress loads a 5ord

(amp bts)

ld $destaddress loads a double5ord

( bts)imilar instruction store

3 Addressing Modes3 Addressin

g Modes

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 77101

o the operands are specified

5perands can be in three places

$ 1egisters U 1egister addressing mode

$ Part of instruction U Constant

U Immediate addressing mode

U All processors support these to addressing modes

$ Memory U ifference beteen 1ISC and CISC

U CISC supports a large variety of addressing modes

U 1ISC follos load2store architecture

4 Instruction Types4 Instruction T

ypes

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 78101

Several types of instructions

ata movement$ Pentium6 mo1 destsrc

$ Some do not provide direct data movement instructions

$ Indirect data movement

add $dest$src6 $dest = $src+6

Arithmetic and 3ogical

$ Arithmetic U Integer and floating$point signed and unsigned U add subtract multiply divide

$ 3ogical U andB orB notB 7or

Instruction Types cont-Instruction T

ypes cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 79101

Condition code bits

S6 Sign bit (gt E F E $)

6 Vero bit (gt E non-ero E -ero)

$6 5verflo bit (gt E no overflo E overflo)

C6 Carry bit (gt E no carry E carry)

ltample6 Pentium

cm coutamp comare cout to amp

subtract amp rom cout

e taret um e0ual

Instruction Types cont-Instruction T

ypes cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 80101

Llo control and I25 instructions

$ 8ranch

$ Procedure call

$ Interrupts

I25 instructions$ Memory$mapped I25

U Most processors support memory$mapped I25

U 7o separate instructions for I25

$ Isolated I25 U Pentium supports isolated I25

U Separate I25 instructions

Ao7ort read from an IO ort

out o7ortA rte to an IO ort

5 Instruction 0ormats5 Instruction 0ormats

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 81101

To types

Lied$length$ 4sed by 1ISC processors

$ $bit 1ISC processors use $bits ide instructions U ltamples6 SPA1C MIPS PoerPC

ariable$length

$ 4sed by CISC processors

$ Memory operands need more bits to specify

5pcode

MaOor and eact operation

Examples of Instruction 0ormatsExam

ples of Instruction 0ormats

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 82101

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 83101

ISC e)uce) Instruction Set Computer 3

ersus

CISC Comple Instruction Set Computer3

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 84101

0

RISC s CISCRISC s CISC

The underlying philosophy of 1ISC machines is that asystem is better able to manage program eecutionhen the program consists of only a fe differentinstructions that are the same length and reJuire thesame number of cloc0 cycles to decode and eecute

1ISC systems access memory only ith eplicit loadand store instructions

In CISC systems many different 0inds of instructionsaccess memory ma0ing instruction length variableand fetch$decode$eecute time unpredictable

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 85101

The difference beteen CISC and 1ISC becomesevident through the basic computer performanceeJuation6

1ISC systems shorten eecution time by reducingthe cloc0 cycles per instruction

CISC systems improve performance by reducing thenumber of instructions per program

RISC s CISCRISC s CISC

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 86101

(

The simple instruction set of 1ISC machinesenables control units to be hardired for maimumspeed

The more comple$$ and variable$$ instruction set of

CISC machines reJuires microcode$based controlunits that interpret instructions as they are fetchedfrom memory This translation ta0es time

Dith fied$length instructions 1ISC lends itself topipelining and speculative eecution

RISC s CISCRISC s CISC

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 87101

mo1 a8 6 mo1 b8 6 mo1 c8

Be add a8 b8 loo Be

Consider the the program fragments6

The total cloc0 cycles for the CISC version might be6(amp mo1s c4cle) + ( mul 6 c4cles) = amp c4cles

Dhile the cloc0 cycles for the 1ISC version is6

( mo1s c4cle) + ( adds c4cle) + ( loos c4cle) = c4cles

Dith 1ISC cloc0 cycle being shorter 1ISC gives usmuch faster eecution speeds

mo1 a8 6 mo1 b8 mul b8 a8

CISC RISC

RISC s CISCRISC s CISC

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 88101

8ecause of their load$store ISAs 1ISC architecturesreJuire a large number of CP4 registers

These register provide fast access to data duringseJuential program eecution

They can also be employed to reduce the overheadtypically caused by passing parameters tosubprograms

Instead of pulling parameters off of a stac0 the

subprogram is directed to use a subset of registers

RISC s CISCRISC s CISC

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 89101

3

This is horegisters canbe overlappedin a 1ISCsystem

The currentindo pointer (CDP) pointsto the activeregister

indo

RISC s CISCRISC s CISC

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 90101

34

It is becoming increasingly difficult to distinguish1ISC architectures from CISC architectures

Some 1ISC systems provide more etravagantinstruction sets than some CISC systems

Some systems combine both approaches The folloing to slides summari-e the

characteristics that traditionally typify the differencesbeteen these to architectures

RISC s CISCRISC s CISC

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 91101

31

RISC Multiple reister sets4

Three operan)s perinstruction4

Parameter passinthrouh reister5in)o5s4

Sinle-ccle

instructions4 7ar)5ire)

control4

7ihl pipeline)4

CISC Sinle reister set4

ne or t5o reisteroperan)s per

instruction4 Parameter passin

throuh memor4

Multiple ccle

instructions4 Microproramme)

control4

(ess pipeline)4ontinued

RISC s CISCRISC s CISC

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 92101

32

RISC Simple instructions

fe5 in num9er4

ie) lenth

instructions4 Compleit in

compiler4

nl 29ADT9$E

instructions accessmemor4

e5 a))ressin mo)es4

CISC Man comple

instructions4

aria9le lenth

instructions4 Compleit in

microco)e4

Man instructions can

access memor4

Man a))ressinmo)es4

RISC s CISCRISC s CISC

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 93101

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 94101

Summar

Instruction Set Design IssuesInstruction Set Desi

gn Issues

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 95101

g

Instruction set )esin issues inclu)e here are operan)s store)lt

- reisters memor stac= accumulator

7o5 man eplicit operan)s are therelt

- 0 + 2 or amp

7o5 is the operan) location specifie)lt

- reister imme)iate in)irect 4 4 4

hat tpe gt sie of operan)s are supporte)lt

- 9te int float )ou9le strin ector4 4 4

hat operations are supporte)lt

- a)) su9 mul moe compare 4 4 4

More A+out 6eneral Purpose egistersMore A+out 6eneral Pu

rpose egisters

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 96101

h )o almost all ne5 architectures usePslt

eisters are much faster than memor eencache3

- eister alues are aaila9le imme)iatel

- hen memor isnt rea) processor must 5aitBstall3

eisters are conenient for aria9le storae

- Compiler assins some aria9les Dust to reisters

- More compact co)e since small fiel)s specifreisters

compare) to memor a))resses3Registers Cache

MemoryProcessor Disk

7hat perations are eeded7hat

perations are eeded

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 97101

3

Arithmetic E (oical

Inteer arithmetic A$$ SU MU(T $I S7IT

(oical operation AN$ NT

$ata Transfer - cop loa) store

Control - 9ranch Dump call return

loatin Point A$$ MU( $I 3 Same as arithmetic 9ut usuall ta=e 9ier operan)s

$ecimal - A$$$ CNT

Strin - moe compare search

raphics F piel an) erte compressionG)ecompression operations

Stack Architecture: Pros and Cons

Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures

Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures

Accumulator Architecture: Pros and Cons

Pros
- Very low hardware requirements
- Easy to design and understand

Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic

Memory-Memory Architecture: Pros and Cons

Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)

Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required

Memory-Register Architecture: Pros and Cons

Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density

Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers

Number of Addresses (cont.)

Example

C statement:
    A = B + C * D - E + F + A

Equivalent code:
    load  T,C    ; T = C
    mult  T,D    ; T = C*D
    add   T,B    ; T = B + C*D
    sub   T,E    ; T = B + C*D - E
    add   T,F    ; T = B + C*D - E + F
    add   A,T    ; A = B + C*D - E + F + A

Number of Addresses (cont.)

One-address machines
- Use a special set of registers called accumulators
  - Specify one source operand and receive the result
- Called accumulator machines

Sample instructions:
    load   addr    ; accum = [addr]
    store  addr    ; [addr] = accum
    add    addr    ; accum = accum + [addr]
    sub    addr    ; accum = accum - [addr]
    mult   addr    ; accum = accum * [addr]

Number of Addresses (cont.)

Example

C statement:
    A = B + C * D - E + F + A

Equivalent code:
    load   C    ; load C into accum
    mult   D    ; accum = C*D
    add    B    ; accum = C*D + B
    sub    E    ; accum = B + C*D - E
    add    F    ; accum = B + C*D - E + F
    add    A    ; accum = B + C*D - E + F + A
    store  A    ; store accum contents in A
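The accumulator sequence above can be traced with a small interpreter. This is only an illustrative Python sketch: the function name run_accumulator, the dictionary used as memory, and the sample values are assumptions made for the example and are not part of the slides.

```python
def run_accumulator(program, memory):
    """Execute (opcode, address) pairs on a one-address (accumulator) machine."""
    accum = 0
    for op, addr in program:
        if op == "load":
            accum = memory[addr]          # accum = [addr]
        elif op == "store":
            memory[addr] = accum          # [addr] = accum
        elif op == "add":
            accum += memory[addr]         # accum = accum + [addr]
        elif op == "sub":
            accum -= memory[addr]         # accum = accum - [addr]
        elif op == "mult":
            accum *= memory[addr]         # accum = accum * [addr]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return memory


# A = B + C*D - E + F + A, using the instruction order from the slide
program = [("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
           ("add", "F"), ("add", "A"), ("store", "A")]

# Arbitrary sample values (hypothetical, chosen only for the trace)
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
run_accumulator(program, memory)
print(memory["A"])  # 2 + 3*4 - 5 + 6 + 1 = 16
```

Every arithmetic instruction names a single memory operand, which is why each one costs a memory access: this is the "high memory traffic" drawback listed for accumulator architectures.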

Number of Addresses (cont.)

Zero-address machines
- Stack supplies operands and receives the result
  - Special instructions to load and store use an address
- Called stack machines (Ex: HP3000, Burroughs B5500)

Sample instructions:
    push  addr    ; push([addr])
    pop   addr    ; pop([addr])
    add           ; push(pop + pop)
    sub           ; push(pop - pop)
    mult          ; push(pop * pop)

Number of Addresses (cont.)

Example

C statement:
    A = B + C * D - E + F + A

Equivalent code:
    push  E
    push  C
    push  D
    mult
    push  B
    add
    sub
    push  F
    add
    push  A
    add
    pop   A
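The stack sequence above can be checked with a short simulation. This is a sketch under stated assumptions: run_stack and the sample memory values are hypothetical, and each binary operation is read as "first pop, operator, second pop", following the push(pop - pop) notation of the sample instructions.

```python
def run_stack(program, memory):
    """Execute instruction strings on a zero-address (stack) machine."""
    stack = []
    for instr in program:
        op, *args = instr.split()
        if op == "push":
            stack.append(memory[args[0]])      # push([addr])
        elif op == "pop":
            memory[args[0]] = stack.pop()      # pop([addr])
        elif op in ("add", "sub", "mult"):
            first = stack.pop()                # value on top of the stack
            second = stack.pop()               # value beneath it
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)   # push(pop - pop)
            else:
                stack.append(first * second)
        else:
            raise ValueError(f"unknown opcode: {op}")
    return memory


program = ["push E", "push C", "push D", "mult", "push B", "add",
           "sub", "push F", "add", "push A", "add", "pop A"]

# Arbitrary sample values (hypothetical)
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
run_stack(program, memory)
print(memory["A"])  # 2 + 3*4 - 5 + 6 + 1 = 16
```

E is pushed first so that it sits beneath B + C*D when sub executes. Having to order pushes this way is an instance of the "data is not always at the top of stack" drawback.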

Load/Store Architecture

- Instructions expect operands in internal processor registers
  - Special LOAD and STORE instructions move data between registers and memory
- RISC uses this architecture
- Reduces instruction length

Load/Store Architecture (cont.)

Sample instructions:
    load   Rd,addr       ; Rd = [addr]
    store  addr,Rs       ; [addr] = Rs
    add    Rd,Rs1,Rs2    ; Rd = Rs1 + Rs2
    sub    Rd,Rs1,Rs2    ; Rd = Rs1 - Rs2
    mult   Rd,Rs1,Rs2    ; Rd = Rs1 * Rs2

Number of Addresses (cont.)

Example

C statement:
    A = B + C * D - E + F + A

Equivalent code:
    load   R1,B
    load   R2,C
    load   R3,D
    load   R4,E
    load   R5,F
    load   R6,A
    mult   R2,R2,R3
    add    R2,R2,R1
    sub    R2,R2,R4
    add    R2,R2,R5
    add    R2,R2,R6
    store  A,R2
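The load/store version can be traced the same way. In this hedged Python sketch, run_load_store, the register names R1 to R6, and the sample values are assumptions made for illustration. Most register numbers are lost in the extracted slide text, so the numbering here is one consistent choice.

```python
def run_load_store(program, memory):
    """Run (opcode, operand, ...) tuples; only load and store touch memory."""
    regs = {}
    alu = {
        "add": lambda x, y: x + y,
        "sub": lambda x, y: x - y,
        "mult": lambda x, y: x * y,
    }
    for op, *operands in program:
        if op == "load":
            rd, addr = operands
            regs[rd] = memory[addr]                 # Rd = [addr]
        elif op == "store":
            addr, rs = operands
            memory[addr] = regs[rs]                 # [addr] = Rs
        elif op in alu:
            rd, rs1, rs2 = operands
            regs[rd] = alu[op](regs[rs1], regs[rs2])  # Rd = Rs1 op Rs2
        else:
            raise ValueError(f"unknown opcode: {op}")
    return memory


program = [
    ("load", "R1", "B"), ("load", "R2", "C"), ("load", "R3", "D"),
    ("load", "R4", "E"), ("load", "R5", "F"), ("load", "R6", "A"),
    ("mult", "R2", "R2", "R3"), ("add", "R2", "R2", "R1"),
    ("sub", "R2", "R2", "R4"), ("add", "R2", "R2", "R5"),
    ("add", "R2", "R2", "R6"), ("store", "A", "R2"),
]

# Arbitrary sample values (hypothetical)
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
run_load_store(program, memory)
print(memory["A"])  # 2 + 3*4 - 5 + 6 + 1 = 16
```

The program needs more instructions than the accumulator version (twelve against seven), but all arithmetic works on registers and memory is touched only by the six loads and the single store.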

Flow of Control

- Default is sequential flow
- Several instructions alter this default execution
  - Branches
    - Unconditional
    - Conditional
    - Delayed branches
  - Procedure calls
    - Delayed procedure calls

Flow of Control (cont.)

Branches
- Unconditional
  - Absolute address
  - PC-relative
    - Target address is specified relative to PC contents
    - Relocatable code
- Example: MIPS
  - Absolute address
        j  target
  - PC-relative
        b  target
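The link between PC-relative addressing and relocatable code can be seen with a little arithmetic. The Python sketch below is a simplification with hypothetical names and addresses: it measures the offset from the branch instruction itself, whereas a real MIPS encoding measures it from the following instruction and counts in words.

```python
def pc_relative_offset(branch_address, target_address):
    """Offset that a PC-relative branch would encode (simplified model)."""
    return target_address - branch_address


def resolve_pc_relative(pc, offset):
    """Target address computed at run time from the PC and the encoded offset."""
    return pc + offset


# Load the same code at two different base addresses (hypothetical values).
for base in (0x1000, 0x8000):
    branch = base + 0x20    # where the branch instruction ends up
    target = base + 0x40    # where the branch target ends up
    offset = pc_relative_offset(branch, target)
    print(hex(base), hex(offset), hex(resolve_pc_relative(branch, offset)))
```

The encoded offset is the same for both load addresses, so the instruction bits need no patching when the code moves. An absolute target would differ between the two cases.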


Flow of Control (cont.)

Branches
- Conditional
  - Jump is taken only if the condition is met
- Two types
  - Set-Then-Jump
    - Condition testing is separated from branching
    - Condition code registers are used to convey the condition test result
    - Condition code registers keep a record of the status of the last ALU operation, such as an overflow condition
  - Example: Pentium code
        cmp  AX,BX     ; compare AX and BX
        je   target    ; jump if equal

Flow of Control (cont.)

  - Test-and-Jump
    - Single instruction performs condition testing and branching
  - Example: MIPS instruction
        beq  Rsrc1,Rsrc2,target
    Jumps to target if Rsrc1 = Rsrc2

Delayed branching
- Control is transferred after executing the instruction that follows the branch instruction
  - This instruction slot is called the delay slot
- Improves efficiency
- Highly pipelined RISC processors support delayed branching

Flow of Control (cont.)

Procedure calls
- Facilitate modular programming
- Require two pieces of information to return
  - End of procedure
    - Pentium uses the ret instruction
    - MIPS uses the jr instruction
  - Return address
    - In a (special) register
      - MIPS allows any general-purpose register
    - On the stack
      - Pentium

Flow of Control (cont.)

[Figure: delay slot]

Parameter Passing

Two basic techniques
- Register-based (e.g., PowerPC, MIPS)
  - Internal registers are used
    - Faster
    - Limit the number of parameters
    - Recursive procedure
- Stack-based (e.g., Pentium)
  - Stack is used
    - More general

2 Operand Types

Instructions support basic data types
- Characters
- Integers
- Floating-point

Instruction overload
- Same instruction for different data types
- Example: Pentium
      mov  AL,address     ; loads an 8-bit value
      mov  AX,address     ; loads a 16-bit value
      mov  EAX,address    ; loads a 32-bit value

Operand Types (cont.)

Separate instructions
- Instructions specify the operand size
- Example: MIPS
      lb  Rdest,address    ; loads a byte
      lh  Rdest,address    ; loads a halfword (16 bits)
      lw  Rdest,address    ; loads a word (32 bits)
      ld  Rdest,address    ; loads a doubleword (64 bits)
- Similar instructions exist for store
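To make the size-specific loads concrete, the following Python sketch reads 1, 2, 4, and 8 bytes from the same address. The memory contents, the load helper, and the big-endian byte order are assumptions made for the example. The sample bytes have their top bit clear, so the sign extension that a real lb or lh performs would not change the printed values.

```python
# Hypothetical memory image: eight consecutive bytes starting at address 0
MEMORY = bytes([0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x88])

# Operand size in bytes implied by each load opcode
SIZES = {"lb": 1, "lh": 2, "lw": 4, "ld": 8}


def load(opcode, address, memory=MEMORY):
    """Return the value that the given load opcode would fetch from address."""
    size = SIZES[opcode]
    return int.from_bytes(memory[address:address + size], "big")


for opcode in SIZES:
    print(opcode, hex(load(opcode, 0)))
# lb 0x11
# lh 0x1122
# lw 0x11223344
# ld 0x1122334455667788
```

Here the opcode alone fixes the operand size. The Pentium style takes the size from the destination register (AL, AX, or EAX) of a single mov instruction.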

3 Addressing Modes

How the operands are specified
- Operands can be in three places
  - Registers
    - Register addressing mode
  - Part of instruction
    - Constant
    - Immediate addressing mode
    - All processors support these two addressing modes
  - Memory
    - Difference between RISC and CISC
    - CISC supports a large variety of addressing modes
    - RISC follows load/store architecture

4 Instruction Types

Several types of instructions
- Data movement
  - Pentium: mov dest,src
  - Some do not provide direct data movement instructions
  - Indirect data movement
        add  Rdest,Rsrc,0    ; Rdest = Rsrc+0
- Arithmetic and Logical
  - Arithmetic
    - Integer and floating-point, signed and unsigned
    - add, subtract, multiply, divide
  - Logical
    - and, or, not, xor

Instruction Types (cont.)

Condition code bits
- S: Sign bit (0 = +, 1 = -)
- Z: Zero bit (0 = nonzero, 1 = zero)
- O: Overflow bit (0 = no overflow, 1 = overflow)
- C: Carry bit (0 = no carry, 1 = carry)

Example: Pentium
      cmp  count,25    ; compare count to 25
                       ; (subtract 25 from count)
      je   target      ; jump if equal
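The way a compare sets these four bits can be sketched in a few lines of Python. The function name compare_flags and the 32-bit default width are assumptions. The carry bit is modeled as the borrow out of an unsigned subtraction, which is the x86 convention, and the subtraction result is used only to derive the flags.

```python
def compare_flags(a, b, bits=32):
    """Return the S, Z, O, C bits produced by computing a - b on `bits`-wide values."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "S": int(bool(result & sign)),    # sign of the result
        "Z": int(result == 0),            # result is zero
        # signed overflow: operands differ in sign and the result's sign differs from a's
        "O": int(bool((a ^ b) & (a ^ result) & sign)),
        "C": int(a < b),                  # borrow in the unsigned subtraction
    }


count = 25
flags = compare_flags(count, 25)
print(flags)  # {'S': 0, 'Z': 1, 'O': 0, 'C': 0}
print("je taken" if flags["Z"] else "je not taken")  # je taken
```

With count equal to 25 the subtraction yields zero, so Z is set and a following je would jump. For count = 10 the result is negative and a borrow occurs, giving S = 1 and C = 1 with Z = 0.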

Instruction Types (cont.)

- Flow control and I/O instructions
  - Branch
  - Procedure call
  - Interrupts
- I/O instructions
  - Memory-mapped I/O
    - Most processors support memory-mapped I/O
    - No separate instructions for I/O
  - Isolated I/O
    - Pentium supports isolated I/O
    - Separate I/O instructions
          in   AX,io_port    ; read from an I/O port
          out  io_port,AX    ; write to an I/O port

5 Instruction Formats

Two types
- Fixed-length
  - Used by RISC processors
  - 32-bit RISC processors use 32-bit wide instructions
    - Examples: SPARC, MIPS, PowerPC
- Variable-length
  - Used by CISC processors
  - Memory operands need more bits to specify

Opcode
- Major and exact operation

[Figure: Examples of Instruction Formats]

RISC (Reduced Instruction Set Computer)
versus
CISC (Complex Instruction Set Computer)

RISC vs CISC

The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.

RISC systems access memory only with explicit load and store instructions.

In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.

RISC vs CISC

The difference between CISC and RISC becomes evident through the basic computer performance equation:

    time/program = (time/cycle) x (cycles/instruction) x (instructions/program)

RISC systems shorten execution time by reducing the clock cycles per instruction.

CISC systems improve performance by reducing the number of instructions per program.

RISC vs CISC

The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.

The more complex (and variable) instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.

With fixed-length instructions, RISC lends itself to pipelining and speculative execution.

RISC vs CISC

Consider the program fragments:

CISC:
    mov ax, 10
    mov bx, 5
    mul bx, ax

RISC:
    mov ax, 0
    mov bx, 10
    mov cx, 5
Begin:
    add ax, bx
    loop Begin

The total clock cycles for the CISC version might be:
    (2 movs x 1 cycle) + (1 mul x 30 cycles) = 32 cycles

While the clock cycles for the RISC version are:
    (3 movs x 1 cycle) + (5 adds x 1 cycle) + (5 loops x 1 cycle) = 13 cycles

With the RISC clock cycle being shorter, RISC gives us much faster execution speeds.
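The cycle tallies for the two fragments can be reproduced with a few lines of Python. This is a rough sketch: the total_cycles helper is hypothetical, and the per-instruction costs (1 cycle for mov, add, and loop, and 30 cycles for mul) are the illustrative figures assumed for this example, not measurements of any real processor.

```python
def total_cycles(instruction_mix):
    """Sum count * cycles over a mapping of opcode -> (count, cycles each)."""
    return sum(count * cycles for count, cycles in instruction_mix.values())


# opcode: (number executed, cycles per instruction) -- assumed illustrative costs
cisc = {"mov": (2, 1), "mul": (1, 30)}
risc = {"mov": (3, 1), "add": (5, 1), "loop": (5, 1)}

print(total_cycles(cisc))  # 32
print(total_cycles(risc))  # 13
```

The RISC fragment executes more instructions (13 against 3) but needs fewer cycles. This is the trade-off the performance equation expresses: instructions per program goes up while cycles per instruction goes down.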

RISC vs CISC

Because of their load-store ISAs, RISC architectures require a large number of CPU registers.

These registers provide fast access to data during sequential program execution.

They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.

Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.

RISC vs CISC

This is how registers can be overlapped in a RISC system.

The current window pointer (CWP) points to the active register window.

[Figure: overlapping register windows]

RISC vs CISC

It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.

Some RISC systems provide more extravagant instruction sets than some CISC systems.

Some systems combine both approaches.

The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.

RISC vs CISC

RISC
- Multiple register sets.
- Three operands per instruction.
- Parameter passing through register windows.
- Single-cycle instructions.
- Hardwired control.
- Highly pipelined.

CISC
- Single register set.
- One or two register operands per instruction.
- Parameter passing through memory.
- Multiple cycle instructions.
- Microprogrammed control.
- Less pipelined.

(continued)

RISC vs CISC

RISC
- Simple instructions, few in number.
- Fixed length instructions.
- Complexity in compiler.
- Only LOAD/STORE instructions access memory.
- Few addressing modes.

CISC
- Many complex instructions.
- Variable length instructions.
- Complexity in microcode.
- Many instructions can access memory.
- Many addressing modes.

Summary


Page 59: Chapter 2: Advanced computer Architecture

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 59101

um+er of Addresses cont-um+er of Addresses cont-

5ne$address machines 4se special set of registers called accumulators

$ Specify one source operand receive the result

Called accumulator machines

Sample instructions

load addr accum = [addr]

store addr M[addr] = accumadd addr accum = accum + [addr]

sub addr accum = accum - [addr]

mult addr accum = accum [addr]

um+er of Addresses cont -um+er of Addresses

cont -

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 60101

um+er of Addresses cont-um+er of Addresses cont-

ltample

C statementA C H D F 6 A

ltJuivalent code6

load C load C to accum

mult D accum = CD

add B accum = CD+B

sub E accum = B+CD-Eadd F accum = B+CD-E+F

add A accum = B+CD-E+F+A

store A store accum cotets A

um+er of Addresses cont -um+er of Addresses

cont -

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 61101

um+er of Addresses cont-um+er of Addresses cont-

Vero$address machines

Stac0 supplies operands and receives the result$ Special instructions to load and store use an address

Called stac0 machines (lt6 Pgtgtgt 8urroughs 8gtgt)

Sample instructions

us addr us([addr])

o addr o([addr])

add us(o + o)

sub us(o - o) mult us(o o)

um+er of Addresses cont -um+er of Addresses

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 62101

um+er of Addresses cont-um+er of Addresses cont-

ltample

C statement

A C H D F 6 A

ltJuivalent code6

us E sub

us C us F

us D add

Mult us A

us B add

add o A

)oadStore Architecture)oadStore Architecture

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 63101

)oadStore Architecture)oadStore Architecture

Instructions epect operands in internal processor registers Special 35A and ST51lt instructions move data beteen registers

and memory

1ISC uses this architecture

1educes instruction length

()

)oadStore Architecture cont-)oadStore Architecture

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 64101

)oadStore Architecture cont-)oadStore Architecture cont-

Sample instructionsload $daddr $d = [addr]

store addr$s (addr) = $s

add $d$s$samp $d = $s + $sampsub $d$s$samp $d = $s - $samp

mult $d$s$samp $d = $s $samp

um+er of Addresses cont-um+er of Addresses

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 65101

um+er of Addresses cont-um+er of Addresses cont-

ampleC statement

A = B + C D E + F + A

1uialent co)eload $B mult $amp$amp$

load $ampC add $amp$amp$

load $D sub $amp$amp$

load $E add $amp$amp$

load $F add $amp$amp$

load $A store A$amp

0lo1 of Control 0lo1 of Control

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 66101

0lo1 of Control 0lo1 of Control

efault is seJuential flo

Several instructions alter this defaulteecution

8ranches$ 4nconditional

$ Conditional

$ elayed branches Procedure calls

$ elayed procedure calls

0lo1 of Control cont-0lo1 of Control

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 67101

0lo1 of Control cont-0lo1 of Control cont-

8ranches

4nconditional

$ Absolute address

$ PC$relative

U Target address is specified relative to PC contents U 1elocatable code

ltample6 MIPS

$ Absolute address

9 target

$ PC$relative

8 target

0lo1 of Control cont-0lo1 of Control

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 68101

0lo1 of Control cont- -

e entium e R

0lo1 of Control cont-0lo1 of Control

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 69101

lo1 o Co t ol co t- -

8ranches

Conditional

$ ump is ta0en only if the condition is met

To types

$ Set$Then$ump

U Condition testing is separated from branching U Condition code registers are used to convey the condition test

result

U Condition code registers 0eep a record of the status of the last A34 operation such as overflo condition

$ ltample6 Pentium codecm AB comare A ad B

e taret um e0ual

0lo1 of Control cont-0lo1 of Control

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 70101

- -

$ Test$and$ump

U Single instruction performs condition testing and branching

$ ltample6 MIPS instruction

be0 $src$srcamptaret

umps to target if 1src E 1src

elayed branching

Control is transferred after eecuting the instruction thatfollos the branch instruction

$ This instruction slot is called delay slot Improves efficiency

ighly pipelined 1ISC processors support

0lo1 of Control cont-0lo1 of Control

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 71101

- -

Procedure calls Lacilitate modular programming

1eJuire to pieces of information to return

$ ltnd of procedure U Pentium

uses ret instruction

U MIPS

uses 9r instruction

$ 1eturn address U In a (special) register

MIPS allos any general$purpose register

U 5n the stac0

Pentium

0lo1 of Control cont-0lo1 of Control

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 72101

- -

0lo1 of Control cont-0lo1 of Control

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 73101

- -

elay slot

Parameter PassingParameter Passin

g

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 74101

gg

To basic techniJues 1egister$based (eg PoerPC MIPS)

$ Internal registers are used U Laster

U 3imit the number of parameters U 1ecursive procedure

Stac0$based (eg Pentium)

$ Stac0 is used U More general

2 perand Types2

perand Types

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 75101

p yp

Instructions support basic data types

Characters Integers

Lloating$point

Instruction overload

Same instruction for different data types

ltample6 Pentium mo1 A2address loads a 3-bt 1alue

mo1 Aaddress loads a -bt 1alue

mo1 EAaddress loads a amp-bt 1alue

perand Types

perand Types

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 76101

Separate instructions

Instructions specify the operand si-e

ltample6 MIPS

lb $destaddress loads a b4te

l $destaddress loads a al5ord( bts)

l5 $destaddress loads a 5ord

(amp bts)

ld $destaddress loads a double5ord

( bts)imilar instruction store

3 Addressing Modes3 Addressin

g Modes

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 77101

o the operands are specified

5perands can be in three places

$ 1egisters U 1egister addressing mode

$ Part of instruction U Constant

U Immediate addressing mode

U All processors support these to addressing modes

$ Memory U ifference beteen 1ISC and CISC

U CISC supports a large variety of addressing modes

U 1ISC follos load2store architecture

4 Instruction Types4 Instruction T

ypes

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 78101

Several types of instructions

ata movement$ Pentium6 mo1 destsrc

$ Some do not provide direct data movement instructions

$ Indirect data movement

add $dest$src6 $dest = $src+6

Arithmetic and 3ogical

$ Arithmetic U Integer and floating$point signed and unsigned U add subtract multiply divide

$ 3ogical U andB orB notB 7or

Instruction Types cont-Instruction T

ypes cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 79101

Condition code bits

S6 Sign bit (gt E F E $)

6 Vero bit (gt E non-ero E -ero)

$6 5verflo bit (gt E no overflo E overflo)

C6 Carry bit (gt E no carry E carry)

ltample6 Pentium

cm coutamp comare cout to amp

subtract amp rom cout

e taret um e0ual

Instruction Types cont-Instruction T

ypes cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 80101

Llo control and I25 instructions

$ 8ranch

$ Procedure call

$ Interrupts

I25 instructions$ Memory$mapped I25

U Most processors support memory$mapped I25

U 7o separate instructions for I25

$ Isolated I25 U Pentium supports isolated I25

U Separate I25 instructions

Ao7ort read from an IO ort

out o7ortA rte to an IO ort

5 Instruction 0ormats5 Instruction 0ormats

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 81101

To types

Lied$length$ 4sed by 1ISC processors

$ $bit 1ISC processors use $bits ide instructions U ltamples6 SPA1C MIPS PoerPC

ariable$length

$ 4sed by CISC processors

$ Memory operands need more bits to specify

5pcode

MaOor and eact operation

Examples of Instruction 0ormatsExam

ples of Instruction 0ormats

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 82101

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 83101

ISC e)uce) Instruction Set Computer 3

ersus

CISC Comple Instruction Set Computer3

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 84101

0

RISC s CISCRISC s CISC

The underlying philosophy of 1ISC machines is that asystem is better able to manage program eecutionhen the program consists of only a fe differentinstructions that are the same length and reJuire thesame number of cloc0 cycles to decode and eecute

1ISC systems access memory only ith eplicit loadand store instructions

In CISC systems many different 0inds of instructionsaccess memory ma0ing instruction length variableand fetch$decode$eecute time unpredictable

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 85101

The difference beteen CISC and 1ISC becomesevident through the basic computer performanceeJuation6

1ISC systems shorten eecution time by reducingthe cloc0 cycles per instruction

CISC systems improve performance by reducing thenumber of instructions per program

RISC s CISCRISC s CISC

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 86101

(

The simple instruction set of 1ISC machinesenables control units to be hardired for maimumspeed

The more comple$$ and variable$$ instruction set of

CISC machines reJuires microcode$based controlunits that interpret instructions as they are fetchedfrom memory This translation ta0es time

Dith fied$length instructions 1ISC lends itself topipelining and speculative eecution

Consider the program fragments:

CISC                    RISC
mov ax, 10              mov ax, 0
mov bx, 5               mov bx, 10
mul bx, ax              mov cx, 5
                 Begin: add ax, bx
                        loop Begin

The total clock cycles for the CISC version might be:

(2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles

While the clock cycles for the RISC version are:

(3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles

With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds.
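The cycle arithmetic for the two fragments can be checked with a few lines of code. This is a minimal sketch that assumes the costs usually quoted with this example (1 cycle each for mov, add and loop, 30 cycles for mul) and the operands 10 and 5; the table and function names are illustrative.

```python
# Assumed per-instruction costs in clock cycles (illustrative, not measured).
CYCLES = {"mov": 1, "add": 1, "loop": 1, "mul": 30}


def total_cycles(executed_counts):
    """Sum cycles over a mapping of opcode -> number of times it executes."""
    return sum(CYCLES[op] * count for op, count in executed_counts.items())


def multiply_by_repeated_add(multiplicand, count):
    """Mirror the RISC fragment: add the multiplicand 'count' times."""
    ax = 0
    cx = count
    while cx > 0:          # the add / loop pair runs once per iteration
        ax += multiplicand
        cx -= 1
    return ax


cisc_cycles = total_cycles({"mov": 2, "mul": 1})
risc_cycles = total_cycles({"mov": 3, "add": 5, "loop": 5})
product = multiply_by_repeated_add(10, 5)

print(cisc_cycles, risc_cycles, product)
```

The counts for add and loop depend on the loop count held in cx, so a larger multiplier would erode the RISC advantage in this particular example.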

Because of their load-store ISAs, RISC architectures require a large number of CPU registers.

These registers provide fast access to data during sequential program execution.

They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.

Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.

[Figure: overlapping register windows]

This is how registers can be overlapped in a RISC system.

The current window pointer (CWP) points to the active register window.
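The overlap can be modeled in a few lines. The sketch below is an assumption-laden toy, not a description of any specific processor: it assumes 4 windows, 8 shared and 8 local registers per window, a CWP that increments on a call, and it ignores window overflow (spilling to memory). The class and method names are hypothetical.

```python
class RegisterWindows:
    """Toy model: a caller's 'out' registers are the callee's 'in' registers."""

    def __init__(self, n_windows=4, n_shared=8, n_local=8):
        self.n_windows = n_windows
        self.n_shared = n_shared              # size of each in/out group
        self.n_local = n_local
        self.stride = n_shared + n_local      # new physical registers per window
        self.regs = [0] * (n_windows * self.stride)
        self.cwp = 0                          # current window pointer

    def _index(self, kind, i):
        offsets = {"in": 0, "local": self.n_shared, "out": self.stride}
        limit = self.n_local if kind == "local" else self.n_shared
        if not 0 <= i < limit:
            raise IndexError(f"{kind}{i} is outside the window")
        # 'out' of window w lands on 'in' of window w + 1 (wrapping around).
        return (self.cwp * self.stride + offsets[kind] + i) % len(self.regs)

    def read(self, kind, i):
        return self.regs[self._index(kind, i)]

    def write(self, kind, i, value):
        self.regs[self._index(kind, i)] = value

    def call(self):
        self.cwp = (self.cwp + 1) % self.n_windows

    def ret(self):
        self.cwp = (self.cwp - 1) % self.n_windows


windows = RegisterWindows()
windows.write("out", 0, 42)        # caller places an argument
windows.write("local", 0, 7)       # caller's private value
windows.call()
argument_seen_by_callee = windows.read("in", 0)
callee_local = windows.read("local", 0)
windows.write("in", 0, 99)         # callee leaves a return value
windows.ret()
result_seen_by_caller = windows.read("out", 0)
```

No data is copied on call or return; only the CWP changes, which is the source of the reduced parameter-passing overhead.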

It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.

Some RISC systems provide more extravagant instruction sets than some CISC systems.

Some systems combine both approaches.

The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.

RISC
- Multiple register sets.
- Three operands per instruction.
- Parameter passing through register windows.
- Single-cycle instructions.
- Hardwired control.
- Highly pipelined.

CISC
- Single register set.
- One or two register operands per instruction.
- Parameter passing through memory.
- Multiple-cycle instructions.
- Microprogrammed control.
- Less pipelined.

RISC
- Simple instructions, few in number.
- Fixed-length instructions.
- Complexity in compiler.
- Only LOAD/STORE instructions access memory.
- Few addressing modes.

CISC
- Many complex instructions.
- Variable-length instructions.
- Complexity in microcode.
- Many instructions can access memory.
- Many addressing modes.

Summary

Instruction Set Design Issues

Instruction set design issues include:

Where are operands stored?
- registers, memory, stack, accumulator

How many explicit operands are there?
- 0, 1, 2, or 3

How is the operand location specified?
- register, immediate, indirect, ...

What type and size of operands are supported?
- byte, int, float, double, string, vector, ...

What operations are supported?
- add, sub, mul, move, compare, ...

More About General Purpose Registers

Why do almost all new architectures use GPRs?

Registers are much faster than memory (even cache)
- Register values are available immediately
- When memory isn't ready, the processor must wait ("stall")

Registers are convenient for variable storage
- Compiler assigns some variables just to registers
- More compact code, since small fields specify registers (compared to memory addresses)

[Figure: storage hierarchy from Processor (Registers) to Cache, Memory, and Disk]

What Operations are Needed

- Arithmetic & Logical
  - Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
  - Logical operation: AND, OR, NOT
- Data Transfer: copy, load, store
- Control: branch, jump, call, return
- Floating Point: ADDF, MULF, DIVF (same as arithmetic but usually take bigger operands)
- Decimal: ADDD, CONVERT
- String: move, compare, search
- Graphics: pixel and vertex, compression/decompression operations

Stack Architecture: Pros and Cons

Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures

Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures

Accumulator Architecture: Pros and Cons

Pros
- Very low hardware requirements
- Easy to design and understand

Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic

Memory-Memory Architecture: Pros and Cons

Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)

Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required

Memory-Register Architecture: Pros and Cons

Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density

Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers

Number of Addresses (cont.)

Example

C statement:

  A = B + C * D - E + F + A

Equivalent code:

  load  C     ; load C into accum
  mult  D     ; accum = C*D
  add   B     ; accum = C*D+B
  sub   E     ; accum = B+C*D-E
  add   F     ; accum = B+C*D-E+F
  add   A     ; accum = B+C*D-E+F+A
  store A     ; store accum contents in A
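The one-address sequence above can be traced with a tiny interpreter. This is a minimal sketch: the function name, the tuple encoding of instructions, and the sample values in memory are all assumptions made for illustration.

```python
def run_accumulator(program, memory):
    """Execute one-address instructions against a single accumulator."""
    acc = 0
    for op, addr in program:
        if op == "load":
            acc = memory[addr]
        elif op == "store":
            memory[addr] = acc
        elif op == "add":
            acc += memory[addr]
        elif op == "sub":
            acc -= memory[addr]
        elif op == "mult":
            acc *= memory[addr]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return memory


accumulator_program = [
    ("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
    ("add", "F"), ("add", "A"), ("store", "A"),
]
accumulator_memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
run_accumulator(accumulator_program, accumulator_memory)
print(accumulator_memory["A"])
```

Every instruction names a memory operand, which is why the accumulator design is associated with high memory traffic.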

Zero-address machines

- Stack supplies operands and receives the result
  - Special instructions to load and store use an address
- Called stack machines (Ex: HP3000, Burroughs B5500)

Sample instructions

  push addr   ; push([addr])
  pop  addr   ; pop([addr])
  add         ; push(pop + pop)
  sub         ; push(pop - pop)
  mult        ; push(pop * pop)

Example

C statement:

  A = B + C * D - E + F + A

Equivalent code:

  push E        sub
  push C        push F
  push D        add
  mult          push A
  push B        add
  add           pop  A
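The zero-address sequence can be traced the same way. The sketch below reads the two columns of the listing top to bottom, left column first, and assumes that in push(pop - pop) the first value popped is the left operand; the function name, instruction encoding and sample values are illustrative.

```python
def run_stack_machine(program, memory):
    """Execute zero-address instructions; only push/pop carry an address."""
    stack = []
    for instruction in program:
        op = instruction[0]
        if op == "push":
            stack.append(memory[instruction[1]])
        elif op == "pop":
            memory[instruction[1]] = stack.pop()
        elif op in ("add", "sub", "mult"):
            first = stack.pop()     # top of stack
            second = stack.pop()    # element below it
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)
            else:
                stack.append(first * second)
        else:
            raise ValueError(f"unknown opcode: {op}")
    return memory


stack_program = [
    ("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
    ("push", "B"), ("add",), ("sub",),
    ("push", "F"), ("add",), ("push", "A"), ("add",), ("pop", "A"),
]
stack_memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
run_stack_machine(stack_program, stack_memory)
print(stack_memory["A"])
```

E is pushed first so that it sits underneath B + C*D when sub executes, which yields (B + C*D) - E under the operand order assumed here.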

Load/Store Architecture

- Instructions expect operands in internal processor registers
  - Special LOAD and STORE instructions move data between registers and memory
- RISC uses this architecture
- Reduces instruction length

Sample instructions

  load  Rd,addr      ; Rd = [addr]
  store addr,Rs      ; (addr) = Rs
  add   Rd,Rs1,Rs2   ; Rd = Rs1 + Rs2
  sub   Rd,Rs1,Rs2   ; Rd = Rs1 - Rs2
  mult  Rd,Rs1,Rs2   ; Rd = Rs1 * Rs2

Example

C statement:

  A = B + C * D - E + F + A

Equivalent code:

  load R1,B        mult  R2,R2,R3
  load R2,C        add   R2,R2,R1
  load R3,D        sub   R2,R2,R4
  load R4,E        add   R2,R2,R5
  load R5,F        add   R2,R2,R6
  load R6,A        store A,R2
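A matching sketch for the load/store version follows. It assumes register names R1 to R6, reads the listing's left column before its right column, and uses made-up memory values; the function name and tuple encoding are hypothetical.

```python
def run_load_store(program, memory):
    """Arithmetic touches registers only; load/store are the sole memory accesses."""
    regs = {}
    for op, *args in program:
        if op == "load":
            rd, addr = args
            regs[rd] = memory[addr]
        elif op == "store":
            addr, rs = args
            memory[addr] = regs[rs]
        elif op in ("add", "sub", "mult"):
            rd, rs1, rs2 = args
            if op == "add":
                regs[rd] = regs[rs1] + regs[rs2]
            elif op == "sub":
                regs[rd] = regs[rs1] - regs[rs2]
            else:
                regs[rd] = regs[rs1] * regs[rs2]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return memory


load_store_program = [
    ("load", "R1", "B"), ("load", "R2", "C"), ("load", "R3", "D"),
    ("load", "R4", "E"), ("load", "R5", "F"), ("load", "R6", "A"),
    ("mult", "R2", "R2", "R3"), ("add", "R2", "R2", "R1"),
    ("sub", "R2", "R2", "R4"), ("add", "R2", "R2", "R5"),
    ("add", "R2", "R2", "R6"), ("store", "A", "R2"),
]
load_store_memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
run_load_store(load_store_program, load_store_memory)
print(load_store_memory["A"])
```

The program is longer than the accumulator version, but the five arithmetic instructions name only registers, so they need no address field.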

Flow of Control

- Default is sequential flow
- Several instructions alter this default execution
  - Branches
    - Unconditional
    - Conditional
    - Delayed branches
  - Procedure calls
    - Delayed procedure calls

Branches

- Unconditional
  - Absolute address
  - PC-relative
    - Target address is specified relative to PC contents
    - Relocatable code
- Example: MIPS
  - Absolute address
      j target
  - PC-relative
      b target
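Why PC-relative branches give relocatable code can be shown with simple address arithmetic. The sketch below is deliberately simplified: it measures the offset from the branch's own address in bytes, whereas real encodings differ in detail (for example, measuring from the following instruction and counting in words). All names and addresses are hypothetical.

```python
def encode_pc_relative(branch_address, target_address):
    """Offset stored in the instruction: distance from the branch to its target."""
    return target_address - branch_address


def resolve_pc_relative(pc, offset):
    """Target computed at run time from the current PC."""
    return pc + offset


# The same code loaded at two different base addresses.
BRANCH_OFFSET_IN_CODE = 8      # branch sits 8 bytes into the code
TARGET_OFFSET_IN_CODE = 32     # its target sits 32 bytes into the code

offsets = []
absolute_targets = []
for base in (0x1000, 0x8000):
    branch = base + BRANCH_OFFSET_IN_CODE
    target = base + TARGET_OFFSET_IN_CODE
    offsets.append(encode_pc_relative(branch, target))
    absolute_targets.append(target)

print(offsets, [hex(t) for t in absolute_targets])
```

The encoded offset is identical at both load addresses, so the instruction bits need no patching, while the absolute target changes with the base and would have to be rewritten.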

Branches

- Conditional
  - Jump is taken only if the condition is met
- Two types
  - Set-Then-Jump
    - Condition testing is separated from branching
    - Condition code registers are used to convey the condition test result
    - Condition code registers keep a record of the status of the last ALU operation, such as an overflow condition
  - Example: Pentium code
      cmp AX,BX    ; compare AX and BX
      je  target   ; jump if equal

  - Test-and-Jump
    - Single instruction performs condition testing and branching
  - Example: MIPS instruction
      beq Rsrc1,Rsrc2,target
    Jumps to target if Rsrc1 = Rsrc2

- Delayed branching
  - Control is transferred after executing the instruction that follows the branch instruction
    - This instruction slot is called the delay slot
  - Improves efficiency
  - Highly pipelined RISC processors support delayed branching
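Delayed branching is easier to see in an execution trace. The toy interpreter below is a sketch under simplifying assumptions: there is a single delay slot, every branch is unconditional and taken, and a branch placed in a delay slot is treated as an ordinary instruction. The opcodes and function name are invented for the illustration.

```python
def run_with_delay_slot(program):
    """Return the indices of executed instructions, honoring one delay slot."""
    trace = []
    pc = 0
    pending_target = None
    while pc < len(program):
        op, arg = program[pc]
        trace.append(pc)
        next_pc = pc + 1
        if pending_target is not None:
            # This instruction was the delay slot; now the branch takes effect.
            next_pc = pending_target
            pending_target = None
        elif op == "branch":
            # Do not jump yet: the following instruction executes first.
            pending_target = arg
        pc = next_pc
    return trace


delay_slot_program = [
    ("nop", None),      # 0
    ("branch", 4),      # 1: branch to instruction 4
    ("add", None),      # 2: delay slot, still executed
    ("sub", None),      # 3: skipped
    ("halt", None),     # 4: branch target
]
delay_slot_trace = run_with_delay_slot(delay_slot_program)
print(delay_slot_trace)
```

Instruction 2 runs even though it follows a taken branch, so a compiler can place useful work there instead of leaving the pipeline idle.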

Procedure calls

- Facilitate modular programming
- Require two pieces of information to return
  - End of procedure
    - Pentium uses the ret instruction
    - MIPS uses the jr instruction
  - Return address
    - In a (special) register: MIPS allows any general-purpose register
    - On the stack: Pentium

[Figure: delay slot]

Parameter Passing

Two basic techniques

- Register-based (e.g., PowerPC, MIPS)
  - Internal registers are used
    - Faster
    - Limit the number of parameters
    - Recursive procedure
- Stack-based (e.g., Pentium)
  - Stack is used
    - More general

Operand Types

- Instructions support basic data types
  - Characters
  - Integers
  - Floating-point
- Instruction overload
  - Same instruction for different data types
  - Example: Pentium
      mov AL,address    ; loads an 8-bit value
      mov AX,address    ; loads a 16-bit value
      mov EAX,address   ; loads a 32-bit value

- Separate instructions
  - Instructions specify the operand size
  - Example: MIPS
      lb Rdest,address   ; loads a byte
      lh Rdest,address   ; loads a halfword (16 bits)
      lw Rdest,address   ; loads a word (32 bits)
      ld Rdest,address   ; loads a doubleword (64 bits)
  - Similar instructions exist for store
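The effect of the operand size on a load can be sketched with plain byte slicing. The helper below is hypothetical; it assumes big-endian byte order and sign extension (as the signed MIPS loads do), and the memory contents are made up.

```python
def load(memory, address, size, signed=True, byteorder="big"):
    """Read 'size' bytes starting at 'address' and interpret them as an integer."""
    chunk = memory[address:address + size]
    if len(chunk) != size:
        raise IndexError("load runs past the end of memory")
    return int.from_bytes(chunk, byteorder=byteorder, signed=signed)


sample_memory = bytes([0xFF, 0xFE, 0x00, 0x01, 0x00, 0x00, 0x00, 0x2A])

byte_value = load(sample_memory, 0, 1)                          # like lb
halfword_value = load(sample_memory, 0, 2)                      # like lh
word_value = load(sample_memory, 0, 4)                          # like lw
unsigned_byte_value = load(sample_memory, 0, 1, signed=False)   # like lbu
second_word_value = load(sample_memory, 4, 4)

print(byte_value, halfword_value, word_value, unsigned_byte_value, second_word_value)
```

The same starting address yields different values depending on how many bytes the instruction reads and whether the result is sign-extended.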

Addressing Modes

- How the operands are specified
- Operands can be in three places
  - Registers
    - Register addressing mode
  - Part of instruction
    - Constant
    - Immediate addressing mode
    - All processors support these two addressing modes
  - Memory
    - Difference between RISC and CISC
    - CISC supports a large variety of addressing modes
    - RISC follows load/store architecture

Instruction Types

Several types of instructions

- Data movement
  - Pentium: mov dest,src
  - Some do not provide direct data movement instructions
  - Indirect data movement
      add Rdest,Rsrc,0   ; Rdest = Rsrc+0
- Arithmetic and Logical
  - Arithmetic
    - Integer and floating-point, signed and unsigned
    - add, subtract, multiply, divide
  - Logical
    - and, or, not, xor

Condition code bits

- S: Sign bit (0 = +, 1 = -)
- Z: Zero bit (0 = nonzero, 1 = zero)
- O: Overflow bit (0 = no overflow, 1 = overflow)
- C: Carry bit (0 = no carry, 1 = carry)

Example: Pentium

  cmp count,25   ; compare count to 25
                 ; (subtract 25 from count)
  je  target     ; jump if equal
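How a compare sets these bits can be sketched by performing the subtraction on fixed-width integers. The function below is an illustrative model, assuming 16-bit operands and the usual convention that the carry bit records a borrow on subtraction; it is not a description of any particular processor's flag register.

```python
def compare_flags(a, b, bits=16):
    """Compute S, Z, O, C for the subtraction a - b on 'bits'-wide values."""
    mask = (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "S": int(bool(result & sign_bit)),
        "Z": int(result == 0),
        # Signed overflow: operands differ in sign and the result's sign differs from a's.
        "O": int(bool((a ^ b) & (a ^ result) & sign_bit)),
        # Carry models a borrow out of the unsigned subtraction.
        "C": int(a < b),
    }


def jump_if_equal_taken(flags):
    """je is taken when the zero bit is set."""
    return flags["Z"] == 1


equal_case = compare_flags(25, 25)
smaller_case = compare_flags(10, 25)
overflow_case = compare_flags(0x8000, 1)
print(equal_case, smaller_case, overflow_case)
```

The branch instruction itself only inspects the bits, which is what separates condition testing from branching in the set-then-jump style.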

Flow control and I/O instructions

- Flow control
  - Branch
  - Procedure call
  - Interrupts
- I/O instructions
  - Memory-mapped I/O
    - Most processors support memory-mapped I/O
    - No separate instructions for I/O
  - Isolated I/O
    - Pentium supports isolated I/O
    - Separate I/O instructions
        in  AX,port    ; read from an I/O port
        out port,AX    ; write to an I/O port

Instruction Formats

Two types

- Fixed-length
  - Used by RISC processors
  - 32-bit RISC processors use 32-bit wide instructions
    - Examples: SPARC, MIPS, PowerPC
- Variable-length
  - Used by CISC processors
  - Memory operands need more bits to specify

Opcode

- Major and exact operation
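As one concrete instance of a fixed-length format, the sketch below packs and unpacks a 32-bit MIPS R-type instruction (fields: opcode 6 bits, rs 5, rt 5, rd 5, shamt 5, funct 6). The helper names are hypothetical; the register numbers used ($t0 = 8, $t1 = 9, $t2 = 10) and the add function code 0x20 follow the standard MIPS conventions.

```python
R_TYPE_FIELDS = (("opcode", 6), ("rs", 5), ("rt", 5), ("rd", 5), ("shamt", 5), ("funct", 6))


def encode_r_type(rs, rt, rd, shamt, funct, opcode=0):
    """Pack the six R-type fields into one 32-bit word, most significant field first."""
    values = {"opcode": opcode, "rs": rs, "rt": rt, "rd": rd, "shamt": shamt, "funct": funct}
    word = 0
    for name, width in R_TYPE_FIELDS:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word = (word << width) | value
    return word


def decode_r_type(word):
    """Split a 32-bit word back into its R-type fields."""
    fields = {}
    shift = 32
    for name, width in R_TYPE_FIELDS:
        shift -= width
        fields[name] = (word >> shift) & ((1 << width) - 1)
    return fields


# add $t0, $t1, $t2  ->  rd = 8, rs = 9, rt = 10, funct = 0x20
add_word = encode_r_type(rs=9, rt=10, rd=8, shamt=0, funct=0x20)
print(hex(add_word), decode_r_type(add_word))
```

Because every field sits at a fixed bit position, decoding is a handful of shifts and masks, which is part of what makes fixed-length formats easy to pipeline.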

Examples of Instruction Formats

Page 61: Chapter 2: Advanced computer Architecture

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 61101

um+er of Addresses cont-um+er of Addresses cont-

Vero$address machines

Stac0 supplies operands and receives the result$ Special instructions to load and store use an address

Called stac0 machines (lt6 Pgtgtgt 8urroughs 8gtgt)

Sample instructions

us addr us([addr])

o addr o([addr])

add us(o + o)

sub us(o - o) mult us(o o)

um+er of Addresses cont -um+er of Addresses

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 62101

um+er of Addresses cont-um+er of Addresses cont-

ltample

C statement

A C H D F 6 A

ltJuivalent code6

us E sub

us C us F

us D add

Mult us A

us B add

add o A

)oadStore Architecture)oadStore Architecture

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 63101

)oadStore Architecture)oadStore Architecture

Instructions epect operands in internal processor registers Special 35A and ST51lt instructions move data beteen registers

and memory

1ISC uses this architecture

1educes instruction length

()

)oadStore Architecture cont-)oadStore Architecture

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 64101

)oadStore Architecture cont-)oadStore Architecture cont-

Sample instructionsload $daddr $d = [addr]

store addr$s (addr) = $s

add $d$s$samp $d = $s + $sampsub $d$s$samp $d = $s - $samp

mult $d$s$samp $d = $s $samp

um+er of Addresses cont-um+er of Addresses

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 65101

um+er of Addresses cont-um+er of Addresses cont-

ampleC statement

A = B + C D E + F + A

1uialent co)eload $B mult $amp$amp$

load $ampC add $amp$amp$

load $D sub $amp$amp$

load $E add $amp$amp$

load $F add $amp$amp$

load $A store A$amp

0lo1 of Control 0lo1 of Control

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 66101

0lo1 of Control 0lo1 of Control

efault is seJuential flo

Several instructions alter this defaulteecution

8ranches$ 4nconditional

$ Conditional

$ elayed branches Procedure calls

$ elayed procedure calls

0lo1 of Control cont-0lo1 of Control

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 67101

0lo1 of Control cont-0lo1 of Control cont-

8ranches

4nconditional

$ Absolute address

$ PC$relative

U Target address is specified relative to PC contents U 1elocatable code

ltample6 MIPS

$ Absolute address

9 target

$ PC$relative

8 target

0lo1 of Control cont-0lo1 of Control

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 68101

0lo1 of Control cont- -

e entium e R

0lo1 of Control cont-0lo1 of Control

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 69101

lo1 o Co t ol co t- -

8ranches

Conditional

$ ump is ta0en only if the condition is met

To types

$ Set$Then$ump

U Condition testing is separated from branching U Condition code registers are used to convey the condition test

result

U Condition code registers 0eep a record of the status of the last A34 operation such as overflo condition

$ ltample6 Pentium codecm AB comare A ad B

e taret um e0ual

0lo1 of Control cont-0lo1 of Control

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 70101

- -

$ Test$and$ump

U Single instruction performs condition testing and branching

$ ltample6 MIPS instruction

be0 $src$srcamptaret

umps to target if 1src E 1src

elayed branching

Control is transferred after eecuting the instruction thatfollos the branch instruction

$ This instruction slot is called delay slot Improves efficiency

ighly pipelined 1ISC processors support

0lo1 of Control cont-0lo1 of Control

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 71101

- -

Procedure calls Lacilitate modular programming

1eJuire to pieces of information to return

$ ltnd of procedure U Pentium

uses ret instruction

U MIPS

uses 9r instruction

$ 1eturn address U In a (special) register

MIPS allos any general$purpose register

U 5n the stac0

Pentium

0lo1 of Control cont-0lo1 of Control

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 72101

- -

0lo1 of Control cont-0lo1 of Control

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 73101

- -

elay slot

Parameter PassingParameter Passin

g

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 74101

gg

To basic techniJues 1egister$based (eg PoerPC MIPS)

$ Internal registers are used U Laster

U 3imit the number of parameters U 1ecursive procedure

Stac0$based (eg Pentium)

$ Stac0 is used U More general

2 perand Types2

perand Types

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 75101

p yp

Instructions support basic data types

Characters Integers

Lloating$point

Instruction overload

Same instruction for different data types

ltample6 Pentium mo1 A2address loads a 3-bt 1alue

mo1 Aaddress loads a -bt 1alue

mo1 EAaddress loads a amp-bt 1alue

perand Types

perand Types

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 76101

Separate instructions

Instructions specify the operand si-e

ltample6 MIPS

lb $destaddress loads a b4te

l $destaddress loads a al5ord( bts)

l5 $destaddress loads a 5ord

(amp bts)

ld $destaddress loads a double5ord

( bts)imilar instruction store

3 Addressing Modes3 Addressin

g Modes

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 77101

o the operands are specified

5perands can be in three places

$ 1egisters U 1egister addressing mode

$ Part of instruction U Constant

U Immediate addressing mode

U All processors support these to addressing modes

$ Memory U ifference beteen 1ISC and CISC

U CISC supports a large variety of addressing modes

U 1ISC follos load2store architecture

4 Instruction Types4 Instruction T

ypes

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 78101

Several types of instructions

ata movement$ Pentium6 mo1 destsrc

$ Some do not provide direct data movement instructions

$ Indirect data movement

add $dest$src6 $dest = $src+6

Arithmetic and 3ogical

$ Arithmetic U Integer and floating$point signed and unsigned U add subtract multiply divide

$ 3ogical U andB orB notB 7or

Instruction Types (cont.)

Condition code bits
- S: Sign bit (0 = +, 1 = -)
- Z: Zero bit (0 = nonzero, 1 = zero)
- O: Overflow bit (0 = no overflow, 1 = overflow)
- C: Carry bit (0 = no carry, 1 = carry)

Example: Pentium

    cmp count, 25    ; compare count to 25
                     ; (subtract 25 from count)
    je target        ; jump if equal
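
As a rough illustration of how a compare sets these bits, the sketch below computes S, Z, O, and C for `a - b` on fixed-width two's-complement values. The function names `cmp_flags` and `je_taken` and the 32-bit default width are assumptions for this example; the carry bit follows the x86 convention of signalling an unsigned borrow.

```python
def cmp_flags(a: int, b: int, bits: int = 32) -> dict:
    """Compute the S, Z, O, C bits for a - b on `bits`-wide values."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "S": int(bool(result & sign)),
        "Z": int(result == 0),
        # Signed overflow: operands differ in sign and the result's sign
        # differs from the sign of the first operand.
        "O": int(bool((a ^ b) & (a ^ result) & sign)),
        # Carry acts as a borrow flag for an unsigned subtraction.
        "C": int(a < b),
    }


def je_taken(flags: dict) -> bool:
    """A jump-if-equal is taken exactly when the zero bit is set."""
    return flags["Z"] == 1


if __name__ == "__main__":
    print(cmp_flags(25, 25), je_taken(cmp_flags(25, 25)))
    print(cmp_flags(10, 25), je_taken(cmp_flags(10, 25)))
```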

Instruction Types (cont.)

Flow control and I/O instructions
- Branch
- Procedure call
- Interrupts

I/O instructions
- Memory-mapped I/O
    - Most processors support memory-mapped I/O
    - No separate instructions for I/O
- Isolated I/O
    - Pentium supports isolated I/O
    - Separate I/O instructions

    in AX, io_port     ; read from an I/O port
    out io_port, AX    ; write to an I/O port

5. Instruction Formats

Two types
- Fixed-length
    - Used by RISC processors
    - 32-bit RISC processors use 32-bit-wide instructions
    - Examples: SPARC, MIPS, PowerPC
- Variable-length
    - Used by CISC processors
    - Memory operands need more bits to specify

Opcode
- Major and exact operation

Examples of Instruction Formats

(Figure: example instruction formats.)

RISC (Reduced Instruction Set Computer) versus CISC (Complex Instruction Set Computer)

RISC vs CISC

The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.

RISC systems access memory only with explicit load and store instructions.

In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.

The difference between CISC and RISC becomes evident through the basic computer performance equation:

    time/program = (time/cycle) × (cycles/instruction) × (instructions/program)

RISC systems shorten execution time by reducing the clock cycles per instruction.

CISC systems improve performance by reducing the number of instructions per program.
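
The trade-off can be explored numerically with the standard form of the performance equation, execution time = instructions per program × cycles per instruction × time per cycle. The sketch below is only illustrative: the function name `cpu_time` and every number in the two example calls are invented to show the shape of the comparison, not measurements of any real machine.

```python
def cpu_time(instructions: int, cycles_per_instruction: float,
             cycle_time_ns: float) -> float:
    """Execution time in nanoseconds from the three performance factors."""
    return instructions * cycles_per_instruction * cycle_time_ns


if __name__ == "__main__":
    # Hypothetical program: the CISC build needs fewer instructions,
    # the RISC build needs fewer cycles per instruction.
    cisc_ns = cpu_time(instructions=1_000, cycles_per_instruction=6.0,
                       cycle_time_ns=2.0)
    risc_ns = cpu_time(instructions=2_500, cycles_per_instruction=1.2,
                       cycle_time_ns=1.5)
    print(f"CISC: {cisc_ns:.0f} ns, RISC: {risc_ns:.0f} ns")
```

Neither style wins by construction; the outcome depends on how far each factor can be pushed for a given workload.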

The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.

The more complex (and variable) instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.

With fixed-length instructions, RISC lends itself to pipelining and speculative execution.

Consider the program fragments:

    CISC                  RISC
    mov ax, 10            mov ax, 0
    mov bx, 5             mov bx, 10
    mul bx, ax            mov cx, 5
                          Begin: add ax, bx
                          loop Begin

The total clock cycles for the CISC version might be:

    (2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles

While the clock cycles for the RISC version are:

    (3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles

With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds.
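
The cycle estimate for the two fragments can be reproduced with a short tally. This is a sketch under the slide's own assumptions (1 cycle for `mov`, `add`, and `loop`, 30 cycles for `mul`, and a loop body that runs 5 times); the names `CYCLES` and `total_cycles` are introduced here for illustration.

```python
# Assumed per-instruction costs, taken from the estimate in the text.
CYCLES = {"mov": 1, "add": 1, "loop": 1, "mul": 30}


def total_cycles(trace):
    """Sum the cycle cost of every executed instruction in the trace."""
    return sum(CYCLES[op] for op in trace)


# Executed instructions, in order, for each fragment.
cisc_trace = ["mov", "mov", "mul"]
risc_trace = ["mov", "mov", "mov"] + ["add", "loop"] * 5

if __name__ == "__main__":
    print("CISC cycles:", total_cycles(cisc_trace))
    print("RISC cycles:", total_cycles(risc_trace))
```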

Because of their load-store ISAs, RISC architectures require a large number of CPU registers.

These registers provide fast access to data during sequential program execution.

They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.

Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.

Registers can be overlapped in a RISC system, so that one procedure's output registers are the next procedure's input registers.

The current window pointer (CWP) points to the active register window.
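
A minimal sketch of overlapping register windows follows. The layout (a circular file in which each window has 8 in, 8 local, and 8 out registers, with the out registers of one window doubling as the in registers of the next) is an assumption chosen for illustration, and the class `RegisterWindows` with its methods is hypothetical. Window overflow (more nested calls than windows) is deliberately not handled.

```python
class RegisterWindows:
    """Circular register file with overlapping windows selected by a CWP."""

    def __init__(self, n_windows: int = 4, n_in: int = 8, n_local: int = 8):
        self.n_windows = n_windows
        self.n_in = n_in                  # also the number of out registers
        self.stride = n_in + n_local      # distance between window bases
        self.regs = [0] * (n_windows * self.stride)
        self.cwp = 0                      # current window pointer

    def _index(self, kind: str, i: int) -> int:
        base = self.cwp * self.stride
        offset = {"in": 0, "local": self.n_in, "out": self.stride}[kind]
        # The out registers of this window alias the in registers of the next.
        return (base + offset + i) % len(self.regs)

    def read(self, kind: str, i: int) -> int:
        return self.regs[self._index(kind, i)]

    def write(self, kind: str, i: int, value: int) -> None:
        self.regs[self._index(kind, i)] = value

    def call(self) -> None:
        """Enter a procedure: advance the CWP to the next window."""
        self.cwp = (self.cwp + 1) % self.n_windows

    def ret(self) -> None:
        """Leave a procedure: move the CWP back to the caller's window."""
        self.cwp = (self.cwp - 1) % self.n_windows


if __name__ == "__main__":
    rw = RegisterWindows()
    rw.write("out", 0, 42)     # caller places an argument in out0
    rw.call()
    print(rw.read("in", 0))    # callee sees it as in0 without any copy
```

No data is moved on a call; only the pointer changes, which is where the saving over stack-based parameter passing comes from.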

It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.

Some RISC systems provide more extravagant instruction sets than some CISC systems.

Some systems combine both approaches.

The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.

    RISC                                          CISC
    Multiple register sets                        Single register set
    Three operands per instruction                One or two register operands per instruction
    Parameter passing through register windows    Parameter passing through memory
    Single-cycle instructions                     Multiple-cycle instructions
    Hardwired control                             Microprogrammed control
    Highly pipelined                              Less pipelined
    Simple instructions, few in number            Many complex instructions
    Fixed-length instructions                     Variable-length instructions
    Complexity in compiler                        Complexity in microcode
    Only LOAD/STORE instructions access memory    Many instructions can access memory
    Few addressing modes                          Many addressing modes

Summary

Instruction Set Design Issues

Instruction set design issues include:
- Where are operands stored?
    - registers, memory, stack, accumulator
- How many explicit operands are there?
    - 0, 1, 2, or 3
- How is the operand location specified?
    - register, immediate, indirect, ...
- What type and size of operands are supported?
    - byte, int, float, double, string, vector, ...
- What operations are supported?
    - add, sub, mul, move, compare, ...

More About General Purpose Registers

Why do almost all new architectures use GPRs?
- Registers are much faster than memory (even cache)
    - Register values are available immediately
    - When memory isn't ready, the processor must wait ("stall")
- Registers are convenient for variable storage
    - Compiler assigns some variables just to registers
    - More compact code, since small fields specify registers (compared to memory addresses)

(Figure: processor with registers and cache, backed by memory and disk.)

What Operations are Needed

Arithmetic and logical
- Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
- Logical operations: AND, OR, XOR, NOT

Data transfer: copy, load, store

Control: branch, jump, call, return

Floating point: ADD, MUL, DIV (same as arithmetic, but usually take bigger operands)

Decimal: ADDD, CONVERT

String: move, compare, search

Graphics: pixel and vertex, compression/decompression operations

Stack Architecture: Pros and Cons

Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures

Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures

Accumulator Architecture: Pros and Cons

Pros
- Very low hardware requirements
- Easy to design and understand

Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic

Memory-Memory Architecture: Pros and Cons

Pros
- Requires fewer instructions (especially with 3 operands)
- Easy to write compilers for (especially with 3 operands)

Cons
- Very high memory traffic (especially with 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required

Memory-Register Architecture: Pros and Cons

Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density

Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers

Number of Addresses (cont.)

Example
- C statement:

    A = B + C * D - E + F + A

- Equivalent code:

    push E
    push C
    push D
    mult
    push B
    add
    sub
    push F
    add
    push A
    add
    pop A
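
The stack code above can be traced with a tiny interpreter. This is a sketch, not a model of any particular machine: the function `run_stack_program` is hypothetical, and it assumes each binary operation computes `top OP next-to-top`, which is the convention under which `sub` yields `(B + C * D) - E` for this program.

```python
import operator

BINARY_OPS = {"add": operator.add, "sub": operator.sub, "mult": operator.mul}


def run_stack_program(program, memory):
    """Execute zero-address stack code against a dict of named memory cells."""
    stack = []
    for instruction in program:
        op, *args = instruction.split()
        if op == "push":
            stack.append(memory[args[0]])
        elif op == "pop":
            memory[args[0]] = stack.pop()
        else:
            top = stack.pop()
            below = stack.pop()
            # Assumed operand order: result = top OP below.
            stack.append(BINARY_OPS[op](top, below))
    return memory


PROGRAM = [
    "push E", "push C", "push D", "mult", "push B", "add",
    "sub", "push F", "add", "push A", "add", "pop A",
]

if __name__ == "__main__":
    cells = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
    print(run_stack_program(PROGRAM, cells)["A"])
```

Only `push` and `pop` name a memory operand; every arithmetic instruction takes its operands implicitly from the stack.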

Load/Store Architecture

- Instructions expect operands in internal processor registers
- Special LOAD and STORE instructions move data between registers and memory
- RISC uses this architecture
- Reduces instruction length

Load/Store Architecture (cont.)

Sample instructions

    load Rd, addr         ; Rd = [addr]
    store addr, Rs        ; [addr] = Rs
    add Rd, Rs1, Rs2      ; Rd = Rs1 + Rs2
    sub Rd, Rs1, Rs2      ; Rd = Rs1 - Rs2
    mult Rd, Rs1, Rs2     ; Rd = Rs1 * Rs2

Number of Addresses (cont.)

Example
- C statement:

    A = B + C * D - E + F + A

- Equivalent code:

    load R1, B
    load R2, C
    load R3, D
    load R4, E
    load R5, F
    load R6, A
    mult R2, R2, R3
    add R2, R2, R1
    sub R2, R2, R4
    add R2, R2, R5
    add R2, R2, R6
    store A, R2
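
The load/store version can be checked the same way. The sketch below interprets the six sample instruction forms against a dictionary of named memory cells; `run_load_store`, the register names, and the memory values are illustrative assumptions.

```python
import operator

ALU_OPS = {"add": operator.add, "sub": operator.sub, "mult": operator.mul}


def run_load_store(program, memory):
    """Execute load/store code; only load and store touch `memory`."""
    regs = {}
    for line in program:
        op, operands = line.split(maxsplit=1)
        args = [a.strip() for a in operands.split(",")]
        if op == "load":
            regs[args[0]] = memory[args[1]]
        elif op == "store":
            memory[args[0]] = regs[args[1]]
        else:
            rd, rs1, rs2 = args
            regs[rd] = ALU_OPS[op](regs[rs1], regs[rs2])
    return memory


PROGRAM = [
    "load R1, B", "load R2, C", "load R3, D",
    "load R4, E", "load R5, F", "load R6, A",
    "mult R2, R2, R3", "add R2, R2, R1", "sub R2, R2, R4",
    "add R2, R2, R5", "add R2, R2, R6", "store A, R2",
]

if __name__ == "__main__":
    cells = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
    print(run_load_store(PROGRAM, cells)["A"])
```

Both programs in this example are 12 instructions long, but here the arithmetic instructions carry three register fields and never reference memory.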

Flow of Control

Default is sequential flow

Several instructions alter this default execution
- Branches
    - Unconditional
    - Conditional
    - Delayed branches
- Procedure calls
    - Delayed procedure calls

Flow of Control (cont.)

Branches
- Unconditional
    - Absolute address
    - PC-relative
        - Target address is specified relative to PC contents
        - Relocatable code
- Example: MIPS
    - Absolute address

        j target

    - PC-relative

        b target
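
To make the two target calculations concrete, the sketch below follows the MIPS32 encodings: `j` carries a 26-bit word index that is combined with the upper 4 bits of the address of the following instruction, and `b` carries a signed 16-bit word offset added to that same address. The function names and the example addresses are assumptions for illustration.

```python
def j_target(pc: int, instr_index: int) -> int:
    """Target of a MIPS32 `j`: upper 4 bits of PC+4 joined with index << 2."""
    return ((pc + 4) & 0xF0000000) | ((instr_index & 0x03FFFFFF) << 2)


def b_target(pc: int, offset16: int) -> int:
    """Target of a MIPS32 PC-relative branch with a 16-bit word offset."""
    if offset16 & 0x8000:          # sign-extend the 16-bit field
        offset16 -= 0x10000
    return (pc + 4 + (offset16 << 2)) & 0xFFFFFFFF


if __name__ == "__main__":
    print(hex(j_target(0x00400000, 0x00100008)))
    print(hex(b_target(0x00400000, 3)))
    # Moving the code changes the branch target by the same amount,
    # which is why PC-relative code is relocatable.
    print(hex(b_target(0x10000000, 3)))
```

The jump target stays the same wherever the code is placed within its 256 MB region, while the branch target moves with the code.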

Flow of Control (cont.)

Branches
- Conditional
    - Jump is taken only if the condition is met
- Two types
    - Set-Then-Jump
        - Condition testing is separated from branching
        - Condition code registers are used to convey the condition test result
        - Condition code registers keep a record of the status of the last ALU operation, such as an overflow condition
    - Example: Pentium code

        cmp AX, BX    ; compare AX and BX
        je target     ; jump if equal

Flow of Control (cont.)

    - Test-and-Jump
        - Single instruction performs condition testing and branching
    - Example: MIPS instruction

        beq Rsrc1, Rsrc2, target

      Jumps to target if Rsrc1 = Rsrc2

Delayed branching
- Control is transferred after executing the instruction that follows the branch instruction
    - This instruction slot is called the delay slot
- Improves efficiency
- Highly pipelined RISC processors support delayed branching
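
Delayed branching can be sketched with a toy interpreter that records the order in which instructions execute. The instruction tuples, the `jump` opcode, and the function `run_with_delay_slot` are all hypothetical; the only behaviour modelled is that the instruction after a jump runs before control is transferred. A jump placed inside a delay slot is not handled.

```python
def run_with_delay_slot(program):
    """Return the indices of executed instructions for a delayed-branch machine.

    Each entry is a tuple (op, arg). For op == "jump", arg is the target index
    and the transfer happens only after the following instruction has run.
    """
    trace = []
    pc = 0
    pending_target = None
    while pc < len(program):
        op, arg = program[pc]
        trace.append(pc)
        next_pc = pc + 1
        if pending_target is not None:
            # This instruction occupied the delay slot; now take the branch.
            next_pc = pending_target
            pending_target = None
        elif op == "jump":
            pending_target = arg
        pc = next_pc
    return trace


PROGRAM = [
    ("nop", None),
    ("jump", 4),
    ("work", None),      # delay slot: executes even though the jump is taken
    ("skipped", None),
    ("done", None),
]

if __name__ == "__main__":
    print(run_with_delay_slot(PROGRAM))
```

A compiler that can move useful work into the slot avoids wasting the cycle that the pipeline would otherwise lose on the branch.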

Flow of Control (cont.)

Procedure calls
- Facilitate modular programming
- Require two pieces of information to return
    - End of procedure
        - Pentium uses the ret instruction
        - MIPS uses the jr instruction
    - Return address
        - In a (special) register: MIPS allows any general-purpose register
        - On the stack: Pentium

Flow of Control (cont.)

(Figure: delay slot.)

Parameter Passing

Two basic techniques
- Register-based (e.g., PowerPC, MIPS)
    - Internal registers are used
        - Faster
        - Limits the number of parameters
        - Awkward for recursive procedures
- Stack-based (e.g., Pentium)
    - Stack is used
        - More general


Page 63: Chapter 2: Advanced computer Architecture

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 63101

)oadStore Architecture)oadStore Architecture

Instructions epect operands in internal processor registers Special 35A and ST51lt instructions move data beteen registers

and memory

1ISC uses this architecture

1educes instruction length

()

)oadStore Architecture cont-)oadStore Architecture

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 64101

)oadStore Architecture cont-)oadStore Architecture cont-

Sample instructionsload $daddr $d = [addr]

store addr$s (addr) = $s

add $d$s$samp $d = $s + $sampsub $d$s$samp $d = $s - $samp

mult $d$s$samp $d = $s $samp

um+er of Addresses cont-um+er of Addresses

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 65101

um+er of Addresses cont-um+er of Addresses cont-

ampleC statement

A = B + C D E + F + A

1uialent co)eload $B mult $amp$amp$

load $ampC add $amp$amp$

load $D sub $amp$amp$

load $E add $amp$amp$

load $F add $amp$amp$

load $A store A$amp

0lo1 of Control 0lo1 of Control

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 66101

0lo1 of Control 0lo1 of Control

efault is seJuential flo

Several instructions alter this defaulteecution

8ranches$ 4nconditional

$ Conditional

$ elayed branches Procedure calls

$ elayed procedure calls

0lo1 of Control cont-0lo1 of Control

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 67101

0lo1 of Control cont-0lo1 of Control cont-

8ranches

4nconditional

$ Absolute address

$ PC$relative

U Target address is specified relative to PC contents U 1elocatable code

ltample6 MIPS

$ Absolute address

9 target

$ PC$relative

8 target

0lo1 of Control cont-0lo1 of Control

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 68101

0lo1 of Control cont- -

e entium e R

0lo1 of Control cont-0lo1 of Control

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 69101

lo1 o Co t ol co t- -

8ranches

Conditional

$ ump is ta0en only if the condition is met

To types

$ Set$Then$ump

U Condition testing is separated from branching U Condition code registers are used to convey the condition test

result

U Condition code registers 0eep a record of the status of the last A34 operation such as overflo condition

$ ltample6 Pentium codecm AB comare A ad B

e taret um e0ual

0lo1 of Control cont-0lo1 of Control

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 70101

- -

$ Test$and$ump

U Single instruction performs condition testing and branching

$ ltample6 MIPS instruction

be0 $src$srcamptaret

umps to target if 1src E 1src

elayed branching

Control is transferred after eecuting the instruction thatfollos the branch instruction

$ This instruction slot is called delay slot Improves efficiency

ighly pipelined 1ISC processors support

0lo1 of Control cont-0lo1 of Control

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 71101

- -

Procedure calls Lacilitate modular programming

1eJuire to pieces of information to return

$ ltnd of procedure U Pentium

uses ret instruction

U MIPS

uses 9r instruction

$ 1eturn address U In a (special) register

MIPS allos any general$purpose register

U 5n the stac0

Pentium

0lo1 of Control cont-0lo1 of Control

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 72101

- -

0lo1 of Control cont-0lo1 of Control

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 73101

- -

elay slot

Parameter PassingParameter Passin

g

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 74101

gg

To basic techniJues 1egister$based (eg PoerPC MIPS)

$ Internal registers are used U Laster

U 3imit the number of parameters U 1ecursive procedure

Stac0$based (eg Pentium)

$ Stac0 is used U More general

2 perand Types2

perand Types

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 75101

p yp

Instructions support basic data types

Characters Integers

Lloating$point

Instruction overload

Same instruction for different data types

ltample6 Pentium mo1 A2address loads a 3-bt 1alue

mo1 Aaddress loads a -bt 1alue

mo1 EAaddress loads a amp-bt 1alue

Operand Types (cont.)

Separate instructions
- Instructions specify the operand size
- Example: MIPS
  lb Rdest,address   ; loads a byte
  lh Rdest,address   ; loads a halfword (16 bits)
  lw Rdest,address   ; loads a word (32 bits)
  ld Rdest,address   ; loads a doubleword (64 bits)
- Similar instructions exist for store

3 Addressing Modes

How the operands are specified

Operands can be in three places
- Registers
  * Register addressing mode
- Part of instruction
  * Constant
  * Immediate addressing mode
  * All processors support these two addressing modes
- Memory
  * Difference between RISC and CISC
  * CISC supports a large variety of addressing modes
  * RISC follows load/store architecture

4 Instruction Types

Several types of instructions

Data movement
- Pentium: mov dest,src
- Some do not provide direct data movement instructions
- Indirect data movement
  add Rdest,Rsrc,0   ; Rdest = Rsrc+0

Arithmetic and Logical
- Arithmetic
  * Integer and floating-point, signed and unsigned
  * add, subtract, multiply, divide
- Logical
  * and, or, not, xor

Instruction Types (cont.)

Condition code bits
- S: Sign bit (0 = +, 1 = -)
- Z: Zero bit (0 = nonzero, 1 = zero)
- O: Overflow bit (0 = no overflow, 1 = overflow)
- C: Carry bit (0 = no carry, 1 = carry)

Example: Pentium
  cmp count,25   ; compare count to 25
                 ; subtract 25 from count
  je  target     ; jump if equal
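As a rough illustration of how a compare sets these bits, the following Python sketch computes S, Z, O, and C for the subtraction a - b. The helper names `compare_flags` and `jump_if_equal`, the 32-bit width, and the borrow convention for C (the one x86 uses) are assumptions made for this example, not something the slides specify.

```python
def compare_flags(a, b, bits=32):
    """Return the condition code bits produced by computing a - b.

    Hypothetical helper: a and b are treated as bit patterns of width `bits`.
    C is set when an unsigned borrow occurs (x86 convention, assumed here).
    """
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "S": int(bool(result & sign)),   # 1 when the result is negative
        "Z": int(result == 0),           # 1 when the result is zero
        "C": int(a < b),                 # 1 when a borrow was needed
        # Signed overflow: the operands differ in sign and the result's
        # sign differs from the sign of the first operand.
        "O": int(bool((a ^ b) & (a ^ result) & sign)),
    }


def jump_if_equal(flags):
    """A je-style branch is taken when the zero bit is set."""
    return flags["Z"] == 1
```

Comparing a value with itself sets Z, so `jump_if_equal(compare_flags(25, 25))` is true, which mirrors the cmp/je pair above.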

Instruction Types (cont.)

Flow control and I/O instructions
- Branch
- Procedure call
- Interrupts

I/O instructions
- Memory-mapped I/O
  * Most processors support memory-mapped I/O
  * No separate instructions for I/O
- Isolated I/O
  * Pentium supports isolated I/O
  * Separate I/O instructions
    in  AX,io_port    ; read from an I/O port
    out io_port,AX    ; write to an I/O port

5 Instruction Formats

Two types

Fixed-length
- Used by RISC processors
- 32-bit RISC processors use 32-bit wide instructions
  * Examples: SPARC, MIPS, PowerPC

Variable-length
- Used by CISC processors
- Memory operands need more bits to specify

Opcode
- Major and exact operation
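The fixed-length idea can be made concrete with the MIPS R-type layout: a 6-bit opcode, three 5-bit register fields, a 5-bit shift amount, and a 6-bit function code, 32 bits in total. The sketch below packs those fields into one word; the helper name `encode_r_type` is hypothetical and chosen only for this illustration.

```python
def encode_r_type(rs, rt, rd, shamt, funct, opcode=0):
    """Pack the six MIPS R-type fields into a single 32-bit word."""
    fields = [(opcode, 6), (rs, 5), (rt, 5), (rd, 5), (shamt, 5), (funct, 6)]
    word = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"field value {value} does not fit in {width} bits")
        word = (word << width) | value   # append the field on the right
    return word


# add $t0,$t1,$t2 uses rd=8, rs=9, rt=10 and function code 0x20.
ADD_T0_T1_T2 = encode_r_type(rs=9, rt=10, rd=8, shamt=0, funct=0x20)
```

Because every field has a fixed position, a decoder can extract the register numbers before it knows which operation the opcode selects, which is part of why fixed-length formats suit pipelined hardware.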

Examples of Instruction Formats

RISC (Reduced Instruction Set Computer)
versus
CISC (Complex Instruction Set Computer)

RISC vs CISC

The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.

RISC systems access memory only with explicit load and store instructions.

In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.

RISC vs CISC

The difference between CISC and RISC becomes evident through the basic computer performance equation:

time/program = (time/cycle) × (cycles/instruction) × (instructions/program)

RISC systems shorten execution time by reducing the clock cycles per instruction.

CISC systems improve performance by reducing the number of instructions per program.
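The performance equation multiplies three factors: instructions per program, cycles per instruction, and time per cycle. A minimal Python sketch follows; the function name `execution_time_ns` and all of the numbers are made up for illustration and are not measurements of any real processor.

```python
def execution_time_ns(instructions, cycles_per_instruction, ns_per_cycle):
    """time/program = instructions/program * cycles/instruction * time/cycle."""
    return instructions * cycles_per_instruction * ns_per_cycle


# Hypothetical workloads: the RISC version executes more instructions,
# but each one takes fewer cycles.
RISC_TIME = execution_time_ns(instructions=1_500_000, cycles_per_instruction=1, ns_per_cycle=2)
CISC_TIME = execution_time_ns(instructions=1_000_000, cycles_per_instruction=4, ns_per_cycle=2)
```

With these assumed figures the RISC program needs 3,000,000 ns and the CISC program 8,000,000 ns; different assumptions about instruction count or cycles per instruction can reverse the outcome.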

RISC vs CISC

The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.

The more complex (and variable) instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.

With fixed-length instructions, RISC lends itself to pipelining and speculative execution.

RISC vs CISC

Consider the following program fragments:

CISC                    RISC
mov ax, 10              mov ax, 0
mov bx, 5               mov bx, 10
mul bx, ax              mov cx, 5
                 Begin: add ax, bx
                        loop Begin

The total clock cycles for the CISC version might be:
(2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles

While the clock cycles for the RISC version are:
(3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles

With the RISC clock cycle being shorter, RISC gives us much faster execution speeds.
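The cycle tally can be checked with a small Python sketch that runs both fragments and counts cycles. It assumes the fragments multiply 10 by 5, that mov, add, and loop each cost 1 cycle, and that mul costs 30 cycles; the function names and the `CYCLES` table are hypothetical, chosen only for this example.

```python
# Assumed per-instruction costs in clock cycles (illustrative, not measured).
CYCLES = {"mov": 1, "add": 1, "loop": 1, "mul": 30}


def run_cisc():
    """mov ax,10 / mov bx,5 / mul bx,ax: one complex multiply instruction."""
    ax, bx = 10, 5
    cycles = 2 * CYCLES["mov"]
    bx = bx * ax
    cycles += CYCLES["mul"]
    return bx, cycles


def run_risc():
    """Multiply by repeated addition: add ax,bx is executed cx times."""
    ax, bx, cx = 0, 10, 5
    cycles = 3 * CYCLES["mov"]
    while True:
        ax += bx                  # Begin: add ax, bx
        cycles += CYCLES["add"]
        cx -= 1                   # loop Begin: decrement cx, branch if nonzero
        cycles += CYCLES["loop"]
        if cx == 0:
            break
    return ax, cycles
```

Both versions produce 50; under these assumed costs the CISC fragment takes 32 cycles and the RISC fragment 13.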

RISC vs CISC

Because of their load-store ISAs, RISC architectures require a large number of CPU registers.

These registers provide fast access to data during sequential program execution.

They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.

Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.

RISC vs CISC

This is how registers can be overlapped in a RISC system.

The current window pointer (CWP) points to the active register window.

[Figure: overlapping register windows]
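One way to picture overlapping windows is a circular register file in which a caller's "out" registers are the same physical registers as the callee's "in" registers. The Python sketch below is only an illustration under assumed sizes (4 windows, 8 shared and 8 local registers per window); the class `RegisterWindows` and its method names are hypothetical, and window overflow handling is left out.

```python
class RegisterWindows:
    """Circular register file with overlapping windows (illustrative sketch)."""

    def __init__(self, windows=4, shared=8, local=8):
        self.windows = windows
        self.shared = shared                # registers shared with a neighbour
        self.step = shared + local          # physical registers advanced per call
        self.regs = [0] * (windows * self.step)
        self.cwp = 0                        # current window pointer

    def _index(self, kind, n):
        # "in" starts the window, "local" follows, "out" overlaps the next window.
        offset = {"in": 0, "local": self.shared, "out": self.step}[kind]
        return (self.cwp * self.step + offset + n) % len(self.regs)

    def read(self, kind, n):
        return self.regs[self._index(kind, n)]

    def write(self, kind, n, value):
        self.regs[self._index(kind, n)] = value

    def call(self):
        """Procedure call: move to the next window."""
        self.cwp = (self.cwp + 1) % self.windows

    def ret(self):
        """Procedure return: move back to the previous window."""
        self.cwp = (self.cwp - 1) % self.windows
```

A caller writes an argument with `write("out", 0, 42)`, executes `call()`, and the callee finds it with `read("in", 0)` without any memory traffic. Real designs differ in window size and in the direction the pointer moves.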

RISC vs CISC

It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.

Some RISC systems provide more extravagant instruction sets than some CISC systems.

Some systems combine both approaches.

The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.

RISC vs CISC

RISC
- Multiple register sets.
- Three operands per instruction.
- Parameter passing through register windows.
- Single-cycle instructions.
- Hardwired control.
- Highly pipelined.

CISC
- Single register set.
- One or two register operands per instruction.
- Parameter passing through memory.
- Multiple cycle instructions.
- Microprogrammed control.
- Less pipelined.

Continued...

RISC vs CISC

RISC
- Simple instructions, few in number.
- Fixed length instructions.
- Complexity in compiler.
- Only LOAD/STORE instructions access memory.
- Few addressing modes.

CISC
- Many complex instructions.
- Variable length instructions.
- Complexity in microcode.
- Many instructions can access memory.
- Many addressing modes.

Summary

Instruction Set Design Issues

Instruction set design issues include:
- Where are operands stored?
  * registers, memory, stack, accumulator
- How many explicit operands are there?
  * 0, 1, 2, or 3
- How is the operand location specified?
  * register, immediate, indirect, ...
- What type and size of operands are supported?
  * byte, int, float, double, string, vector, ...
- What operations are supported?
  * add, sub, mul, move, compare, ...

More About General Purpose Registers

Why do almost all new architectures use GPRs?

Registers are much faster than memory (even cache)
- Register values are available immediately
- When memory isn't ready, the processor must wait ("stall")

Registers are convenient for variable storage
- Compiler assigns some variables just to registers
- More compact code, since small fields specify registers (compared to memory addresses)

[Figure: memory hierarchy showing Processor, Registers, Cache, Memory, Disk]

What Operations are Needed

Arithmetic & Logical
- Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
- Logical operation: AND, OR, XOR, NOT

Data Transfer - copy, load, store

Control - branch, jump, call, return

Floating Point - ADDF, MULF, DIVF (same as arithmetic but usually take bigger operands)

Decimal - ADDD, CONVERT

String - move, compare, search

Graphics - pixel and vertex, compression/decompression operations

Stack Architecture: Pros and Cons

Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures

Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures
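The trade-offs listed for the stack architecture can be seen in a tiny stack machine. In the Python sketch below, arithmetic instructions name no operands because they always work on the top of the stack, and a swap is needed when the operands arrive in the wrong order. The interpreter `run_stack`, the instruction names, and the sample program are all hypothetical, written only for this illustration.

```python
def run_stack(program, memory):
    """Execute (op, *operands) tuples on a minimal stack machine."""
    stack = []
    for op, *args in program:
        if op == "push":                  # push the value stored at a memory name
            stack.append(memory[args[0]])
        elif op == "pop":                 # pop the top of stack into memory
            memory[args[0]] = stack.pop()
        elif op == "swap":                # exchange the two topmost entries
            stack[-1], stack[-2] = stack[-2], stack[-1]
        else:                             # add / sub / mult use implicit operands
            right = stack.pop()
            left = stack.pop()
            stack.append({"add": left + right,
                          "sub": left - right,
                          "mult": left * right}[op])
    return memory


# X = E - C * D, with the product computed before E is pushed.
PROGRAM = [
    ("push", "C"), ("push", "D"), ("mult",),
    ("push", "E"), ("swap",), ("sub",),
    ("pop", "X"),
]
```

The arithmetic instructions carry no operand fields, which is where the good code density comes from, while the extra swap shows the cost of data not being at the top of the stack when it is needed.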

Accumulator Architecture: Pros and Cons

Pros
- Very low hardware requirements
- Easy to design and understand

Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic

Memory-Memory Architecture: Pros and Cons

Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)

Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required

Memory-Register Architecture: Pros and Cons

Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density

Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers

Page 64: Chapter 2: Advanced computer Architecture

Load/Store Architecture (cont.)

Sample instructions
  load  Rd,addr        Rd = [addr]
  store addr,Rs        (addr) = Rs
  add   Rd,Rs1,Rs2     Rd = Rs1 + Rs2
  sub   Rd,Rs1,Rs2     Rd = Rs1 - Rs2
  mult  Rd,Rs1,Rs2     Rd = Rs1 * Rs2

Number of Addresses (cont.)

Example C statement

  A = B + C * D - E + F + A

Equivalent code

  load R1,B        mult  R2,R2,R3
  load R2,C        add   R2,R2,R1
  load R3,D        sub   R2,R2,R4
  load R4,E        add   R2,R2,R5
  load R5,F        add   R2,R2,R6
  load R6,A        store A,R2
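To check that a load/store sequence of this kind really computes A = B + C * D - E + F + A, the Python sketch below interprets a three-address program on a toy register machine. The interpreter `run`, the tuple encoding, and the register assignment R1 to R6 are assumptions made for this example.

```python
def run(program, memory):
    """Execute (op, *operands) tuples on a tiny load/store machine."""
    regs = {}
    for op, *args in program:
        if op == "load":                  # load Rd,addr : Rd = [addr]
            rd, addr = args
            regs[rd] = memory[addr]
        elif op == "store":               # store addr,Rs : [addr] = Rs
            addr, rs = args
            memory[addr] = regs[rs]
        else:                             # add / sub / mult Rd,Rs1,Rs2
            rd, rs1, rs2 = args
            a, b = regs[rs1], regs[rs2]
            regs[rd] = {"add": a + b, "sub": a - b, "mult": a * b}[op]
    return memory


PROGRAM = [
    ("load", "R1", "B"), ("load", "R2", "C"), ("load", "R3", "D"),
    ("load", "R4", "E"), ("load", "R5", "F"), ("load", "R6", "A"),
    ("mult", "R2", "R2", "R3"),   # R2 = C * D
    ("add", "R2", "R2", "R1"),    # R2 = B + C * D
    ("sub", "R2", "R2", "R4"),    # ... - E
    ("add", "R2", "R2", "R5"),    # ... + F
    ("add", "R2", "R2", "R6"),    # ... + A
    ("store", "A", "R2"),
]
```

Only the six loads and the final store touch memory; all arithmetic works on registers, which is the defining property of a load/store architecture.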

Flow of Control

Default is sequential flow

Several instructions alter this default execution
- Branches
  * Unconditional
  * Conditional
  * Delayed branches
- Procedure calls
  * Delayed procedure calls

Flow of Control (cont.)

Branches

Unconditional
- Absolute address
- PC-relative
  * Target address is specified relative to PC contents
  * Relocatable code

Example: MIPS
- Absolute address
  j target
- PC-relative
  b target
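The difference between the two forms shows up in how the target address is computed. The Python sketch below follows the classic 32-bit MIPS rules: a branch adds a sign-extended 16-bit word offset to PC + 4, while a jump combines the upper four bits of PC + 4 with a 26-bit word index. The function names are hypothetical and the addresses are arbitrary sample values.

```python
def branch_target(pc, offset16):
    """PC-relative target: (PC + 4) + sign-extended 16-bit offset * 4."""
    offset16 &= 0xFFFF
    offset = offset16 - 0x10000 if offset16 & 0x8000 else offset16  # sign-extend
    return (pc + 4 + (offset << 2)) & 0xFFFFFFFF


def jump_target(pc, index26):
    """Absolute target: upper 4 bits of PC + 4 joined with a 26-bit word index."""
    return ((pc + 4) & 0xF0000000) | ((index26 & 0x03FFFFFF) << 2)
```

Moving the code elsewhere in memory shifts `branch_target` by the same amount for an unchanged offset, which is why PC-relative code is relocatable; the absolute form would need its index field rewritten.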

Flow of Control (cont.)

Branches

Conditional
- Jump is taken only if the condition is met

Two types
- Set-Then-Jump
  * Condition testing is separated from branching
  * Condition code registers are used to convey the condition test result
  * Condition code registers keep a record of the status of the last ALU operation, such as an overflow condition
- Example: Pentium code
  cmp AX,BX     ; compare AX and BX
  je  target    ; jump if equal

Flow of Control (cont.)

- Test-and-Jump
  * Single instruction performs condition testing and branching
- Example: MIPS instruction
  beq Rsrc1,Rsrc2,target
  Jumps to target if Rsrc1 = Rsrc2

Delayed branching
- Control is transferred after executing the instruction that follows the branch instruction
  * This instruction slot is called the delay slot
- Improves efficiency
- Highly pipelined RISC processors support delayed branching
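Delayed branching can be pictured with a toy interpreter in which a branch only takes effect after the following instruction (the delay slot) has executed. The Python sketch below records which instruction indices run; the function `run_delayed` and the two-field instruction tuples are hypothetical, and the behaviour of a branch placed inside a delay slot is deliberately not modelled.

```python
def run_delayed(program):
    """Return the indices executed when each branch takes effect one instruction late."""
    trace = []
    pc = 0
    pending = None                 # branch target waiting for the delay slot
    while pc < len(program):
        op, arg = program[pc]
        trace.append(pc)
        next_pc = pc + 1
        if pending is not None:    # the instruction just run was the delay slot
            next_pc = pending
            pending = None
        if op == "branch":
            pending = arg          # redirect only after the next instruction
        pc = next_pc
    return trace


PROGRAM = [
    ("op", None),      # 0
    ("branch", 4),     # 1: branch to index 4
    ("op", None),      # 2: delay slot, still executed
    ("op", None),      # 3: skipped
    ("op", None),      # 4: branch target
]
```

For this program the trace is 0, 1, 2, 4: the delay-slot instruction at index 2 runs even though the branch at index 1 is taken, so a compiler can place useful work there instead of letting the pipeline idle.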


Page 65: Chapter 2: Advanced computer Architecture

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 65101

um+er of Addresses cont-um+er of Addresses cont-

ampleC statement

A = B + C D E + F + A

1uialent co)eload $B mult $amp$amp$

load $ampC add $amp$amp$

load $D sub $amp$amp$

load $E add $amp$amp$

load $F add $amp$amp$

load $A store A$amp

0lo1 of Control 0lo1 of Control

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 66101

0lo1 of Control 0lo1 of Control

efault is seJuential flo

Several instructions alter this defaulteecution

8ranches$ 4nconditional

$ Conditional

$ elayed branches Procedure calls

$ elayed procedure calls

0lo1 of Control cont-0lo1 of Control

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 67101

0lo1 of Control cont-0lo1 of Control cont-

8ranches

4nconditional

$ Absolute address

$ PC$relative

U Target address is specified relative to PC contents U 1elocatable code

ltample6 MIPS

$ Absolute address

9 target

$ PC$relative

8 target

0lo1 of Control cont-0lo1 of Control

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 68101

0lo1 of Control cont- -

e entium e R

0lo1 of Control cont-0lo1 of Control

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 69101

lo1 o Co t ol co t- -

8ranches

Conditional

$ ump is ta0en only if the condition is met

To types

$ Set$Then$ump

U Condition testing is separated from branching U Condition code registers are used to convey the condition test

result

U Condition code registers 0eep a record of the status of the last A34 operation such as overflo condition

$ ltample6 Pentium codecm AB comare A ad B

e taret um e0ual

0lo1 of Control cont-0lo1 of Control

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 70101

- -

$ Test$and$ump

U Single instruction performs condition testing and branching

$ ltample6 MIPS instruction

be0 $src$srcamptaret

umps to target if 1src E 1src

elayed branching

Control is transferred after eecuting the instruction thatfollos the branch instruction

$ This instruction slot is called delay slot Improves efficiency

ighly pipelined 1ISC processors support

0lo1 of Control cont-0lo1 of Control

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 71101

- -

Procedure calls Lacilitate modular programming

1eJuire to pieces of information to return

$ ltnd of procedure U Pentium

uses ret instruction

U MIPS

uses 9r instruction

$ 1eturn address U In a (special) register

MIPS allos any general$purpose register

U 5n the stac0

Pentium

0lo1 of Control cont-0lo1 of Control

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 72101

- -

0lo1 of Control cont-0lo1 of Control

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 73101

- -

elay slot

Parameter PassingParameter Passin

g

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 74101

gg

To basic techniJues 1egister$based (eg PoerPC MIPS)

$ Internal registers are used U Laster

U 3imit the number of parameters U 1ecursive procedure

Stac0$based (eg Pentium)

$ Stac0 is used U More general

2 perand Types2

perand Types

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 75101

p yp

Instructions support basic data types

Characters Integers

Lloating$point

Instruction overload

Same instruction for different data types

ltample6 Pentium mo1 A2address loads a 3-bt 1alue

mo1 Aaddress loads a -bt 1alue

mo1 EAaddress loads a amp-bt 1alue

perand Types

perand Types

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 76101

Separate instructions

Instructions specify the operand si-e

ltample6 MIPS

lb $destaddress loads a b4te

l $destaddress loads a al5ord( bts)

l5 $destaddress loads a 5ord

(amp bts)

ld $destaddress loads a double5ord

( bts)imilar instruction store

3. Addressing Modes

How the operands are specified
- Operands can be in three places
  - Registers
    - Register addressing mode
  - Part of instruction
    - Constant
    - Immediate addressing mode
    - All processors support these two addressing modes
  - Memory
    - Difference between RISC and CISC
    - CISC supports a large variety of addressing modes
    - RISC follows load/store architecture
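The three operand locations listed above can be sketched as a small operand-fetch routine. This is a toy model: the mode names, the `fetch_operand` function, and the two memory modes shown (direct and register-indirect) are assumptions chosen for illustration; real processors define their own sets of modes.

```python
def fetch_operand(mode, field, registers, memory):
    """Return the operand value selected by an addressing mode."""
    if mode == "register":           # operand is in a register
        return registers[field]
    if mode == "immediate":          # operand is the constant in the instruction
        return field
    if mode == "direct":             # operand is in memory at address `field`
        return memory[field]
    if mode == "register-indirect":  # a register holds the memory address
        return memory[registers[field]]
    raise ValueError(f"unsupported addressing mode: {mode}")


registers = [0, 100]
memory = {8: 9, 100: 7}
for mode, field in (("register", 1), ("immediate", 5),
                    ("direct", 8), ("register-indirect", 1)):
    print(mode, fetch_operand(mode, field, registers, memory))
```

In a load/store design, only the load and store instructions would ever take one of the memory branches; arithmetic instructions would be limited to the register and immediate cases.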

4. Instruction Types

Several types of instructions
- Data movement
  - Pentium: mov dest,src
  - Some do not provide direct data movement instructions
  - Indirect data movement
      add Rdest,Rsrc,0   ; Rdest = Rsrc+0
- Arithmetic and Logical
  - Arithmetic
    - Integer and floating-point, signed and unsigned
    - add, subtract, multiply, divide
  - Logical
    - and, or, not, xor

Instruction Types (cont.)

Condition code bits
- S: Sign bit (0 = +, 1 = -)
- Z: Zero bit (0 = nonzero, 1 = zero)
- O: Overflow bit (0 = no overflow, 1 = overflow)
- C: Carry bit (0 = no carry, 1 = carry)

Example: Pentium
    cmp count,25   ; compare count to 25
                   ; subtract 25 from count
    je target      ; jump if equal
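To make the condition code bits concrete, here is a minimal sketch of how a compare (a subtraction whose result is discarded) could set S, Z, O, and C, and how a jump-if-equal then tests Z. The function names and the 32-bit default width are assumptions for illustration; C is modeled as the borrow out of an unsigned subtraction.

```python
def compare_flags(a: int, b: int, bits: int = 32) -> dict:
    """Return the S, Z, O, C bits after computing a - b on `bits`-wide values."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "S": int(bool(result & sign)),   # 1 when the result is negative
        "Z": int(result == 0),           # 1 when the result is zero
        # Signed overflow: the operands differ in sign and the result's
        # sign differs from the first operand's sign.
        "O": int(bool((a ^ b) & (a ^ result) & sign)),
        "C": int(a < b),                 # borrow out of an unsigned subtract
    }


def je_taken(flags: dict) -> bool:
    """A jump-if-equal is taken when the zero bit is set."""
    return flags["Z"] == 1


print(compare_flags(25, 25), je_taken(compare_flags(25, 25)))
print(compare_flags(10, 25), je_taken(compare_flags(10, 25)))
```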

Instruction Types (cont.)

Flow control and I/O instructions
- Branch
- Procedure call
- Interrupts

I/O instructions
- Memory-mapped I/O
  - Most processors support memory-mapped I/O
  - No separate instructions for I/O
- Isolated I/O
  - Pentium supports isolated I/O
  - Separate I/O instructions
      in AX,io_port    ; read from an I/O port
      out io_port,AX   ; write to an I/O port

5. Instruction Formats

Two types
- Fixed-length
  - Used by RISC processors
  - 32-bit RISC processors use 32-bit-wide instructions
    - Examples: SPARC, MIPS, PowerPC
- Variable-length
  - Used by CISC processors
  - Memory operands need more bits to specify

Opcode
- Major and exact operation
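As one concrete case of a fixed-length format, the sketch below packs a MIPS32 R-type instruction into a single 32-bit word using the standard field widths (opcode 6, rs 5, rt 5, rd 5, shamt 5, funct 6 bits). The helper name `encode_r_type` is hypothetical. Here the opcode field is zero and the funct field selects the exact operation.

```python
def encode_r_type(rs: int, rt: int, rd: int, shamt: int, funct: int) -> int:
    """Pack the R-type fields into one 32-bit word (opcode field is 0)."""
    opcode = 0
    for value, width in ((rs, 5), (rt, 5), (rd, 5), (shamt, 5), (funct, 6)):
        if not 0 <= value < (1 << width):
            raise ValueError("field does not fit in its bit width")
    return ((opcode << 26) | (rs << 21) | (rt << 16)
            | (rd << 11) | (shamt << 6) | funct)


# add $t0, $t1, $t2  ->  rd = 8 ($t0), rs = 9 ($t1), rt = 10 ($t2), funct = 0x20
word = encode_r_type(rs=9, rt=10, rd=8, shamt=0, funct=0x20)
print(f"{word:#010x}")
```

Every instruction occupies exactly four bytes, so the decoder can locate each field with fixed shifts and masks.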

[Figure: Examples of Instruction Formats]

RISC (Reduced Instruction Set Computer)
versus
CISC (Complex Instruction Set Computer)

RISC vs. CISC

- The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.
- RISC systems access memory only with explicit load and store instructions.
- In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.

RISC vs. CISC

- The difference between CISC and RISC becomes evident through the basic computer performance equation:

    time/program = (time/cycle) x (cycles/instruction) x (instructions/program)

- RISC systems shorten execution time by reducing the clock cycles per instruction.
- CISC systems improve performance by reducing the number of instructions per program.
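The performance equation can be tried out numerically. The sketch below multiplies the three factors for two hypothetical machines; every number in it is made up for illustration and does not describe any real processor.

```python
def cpu_time(instructions: int, cycles_per_instruction: float,
             cycle_time_ns: float) -> float:
    """time/program = instructions/program * cycles/instruction * time/cycle."""
    return instructions * cycles_per_instruction * cycle_time_ns


# Made-up numbers: the CISC-like machine runs fewer instructions, while the
# RISC-like machine needs fewer cycles per instruction and has a shorter cycle.
cisc_like = cpu_time(instructions=1_000, cycles_per_instruction=4.0, cycle_time_ns=2.0)
risc_like = cpu_time(instructions=1_500, cycles_per_instruction=1.5, cycle_time_ns=1.0)
print(cisc_like, risc_like)
```

Each design philosophy attacks a different factor of the product, so neither wins automatically; the outcome depends on how much each factor actually changes.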

RISC vs. CISC

- The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.
- The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.
- With fixed-length instructions, RISC lends itself to pipelining and speculative execution.

RISC vs. CISC

Consider the following program fragments:

CISC:
    mov ax, 10
    mov bx, 5
    mul bx, ax

RISC:
    mov ax, 0
    mov bx, 10
    mov cx, 5
    Begin: add ax, bx
    loop Begin

- The total clock cycles for the CISC version might be:
    (2 movs x 1 cycle) + (1 mul x 30 cycles) = 32 cycles
- While the clock cycles for the RISC version is:
    (3 movs x 1 cycle) + (5 adds x 1 cycle) + (5 loops x 1 cycle) = 13 cycles
- With the RISC clock cycle being shorter, RISC gives us much faster execution speeds.
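The cycle arithmetic for the two fragments can be checked with a short sketch. The per-instruction costs below are assumptions for illustration (one cycle each for mov, add, and loop, and 30 cycles for mul), as is the loop count of 5. The helper names are hypothetical.

```python
# Assumed per-instruction costs in clock cycles (illustrative values).
CYCLES = {"mov": 1, "add": 1, "loop": 1, "mul": 30}


def total_cycles(counts: dict) -> int:
    """Sum cycles for a fragment given how often each opcode executes."""
    return sum(CYCLES[op] * n for op, n in counts.items())


def multiply_by_repeated_add(value: int, times: int) -> int:
    """What the RISC-style loop computes: add `value` to a total `times` times."""
    total = 0
    for _ in range(times):
        total += value
    return total


# CISC-style fragment: two moves and one multiply.
cisc = total_cycles({"mov": 2, "mul": 1})
# RISC-style fragment: three moves, then an add/loop pair executed 5 times.
risc = total_cycles({"mov": 3, "add": 5, "loop": 5})
print(cisc, risc, multiply_by_repeated_add(10, 5))
```

The RISC-style fragment executes more instructions but fewer cycles under these assumed costs, which is the trade the slide describes.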

RISC vs. CISC

- Because of their load-store ISAs, RISC architectures require a large number of CPU registers.
- These registers provide fast access to data during sequential program execution.
- They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.
- Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.

RISC vs. CISC

- This is how registers can be overlapped in a RISC system.
- The current window pointer (CWP) points to the active register window.

[Figure: overlapping register windows]
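A toy model may help show how overlapping windows pass parameters without touching memory. In this sketch the class name, the window count, and the 8-in/8-local split are all assumptions. The key idea is that one window's output registers are the same physical registers as the next window's input registers.

```python
class RegisterWindows:
    """Toy overlapping register windows; sizes and names are assumptions."""

    def __init__(self, windows: int = 4, ins: int = 8, locals_: int = 8):
        self.windows = windows
        self.stride = ins + locals_      # new physical registers per window
        # One extra block of `ins` registers holds the last window's outputs.
        self.regs = [0] * (windows * self.stride + ins)
        self.cwp = 0                     # current window pointer

    def _base(self) -> int:
        return self.cwp * self.stride

    def write_out(self, i: int, value: int) -> None:
        # Outputs of this window share storage with inputs of the next one.
        self.regs[self._base() + self.stride + i] = value

    def read_in(self, i: int) -> int:
        return self.regs[self._base() + i]

    def call(self) -> None:
        if self.cwp + 1 >= self.windows:
            raise OverflowError("no free window; a real machine would spill to memory")
        self.cwp += 1

    def ret(self) -> None:
        if self.cwp == 0:
            raise IndexError("already in the outermost window")
        self.cwp -= 1


rw = RegisterWindows()
rw.write_out(0, 42)    # caller places a parameter in its output register
rw.call()              # advancing the CWP makes it the callee's input register
print(rw.read_in(0))
```

No data is copied on the call; only the pointer moves.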

RISC vs. CISC

- It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.
- Some RISC systems provide more extravagant instruction sets than some CISC systems.
- Some systems combine both approaches.
- The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.

RISC vs. CISC

RISC
- Multiple register sets.
- Three operands per instruction.
- Parameter passing through register windows.
- Single-cycle instructions.
- Hardwired control.
- Highly pipelined.

CISC
- Single register set.
- One or two register operands per instruction.
- Parameter passing through memory.
- Multiple-cycle instructions.
- Microprogrammed control.
- Less pipelined.

Continued...

RISC vs. CISC

RISC
- Simple instructions, few in number.
- Fixed-length instructions.
- Complexity in compiler.
- Only LOAD/STORE instructions access memory.
- Few addressing modes.

CISC
- Many complex instructions.
- Variable-length instructions.
- Complexity in microcode.
- Many instructions can access memory.
- Many addressing modes.

Summary

Instruction Set Design Issues

Instruction set design issues include:
- Where are operands stored?
  - registers, memory, stack, accumulator
- How many explicit operands are there?
  - 0, 1, 2, or 3
- How is the operand location specified?
  - register, immediate, indirect, ...
- What type and size of operands are supported?
  - byte, int, float, double, string, vector, ...
- What operations are supported?
  - add, sub, mul, move, compare, ...

More About General Purpose Registers

Why do almost all new architectures use GPRs?
- Registers are much faster than memory (even cache)
  - Register values are available immediately
  - When memory isn't ready, processor must wait ("stall")
- Registers are convenient for variable storage
  - Compiler assigns some variables just to registers
  - More compact code since small fields specify registers (compared to memory addresses)

[Figure: Processor, Registers, Cache, Memory, Disk]

What Operations are Needed

- Arithmetic + Logical
  - Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
  - Logical operation: AND, OR, XOR, NOT
- Data Transfer - copy, load, store
- Control - branch, jump, call, return
- Floating Point
  - ADDF, MULF, DIVF (same as arithmetic but usually take bigger operands)
- Decimal - ADDD, CONVERT
- String - move, compare, search
- Graphics - pixel and vertex, compression/decompression operations

Stack Architecture: Pros and Cons

Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures

Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures
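The points above can be seen in a tiny zero-address interpreter. This is a sketch under assumptions: the instruction names PUSH, POP, ADD, MUL, and SWAP and the dictionary-based memory are chosen for illustration and do not model any particular machine.

```python
def run_stack_program(program, memory):
    """Execute a tiny zero-address program and return the final stack."""
    stack = []
    for instr in program:
        op, *args = instr.split()
        if op == "PUSH":                 # memory -> top of stack
            stack.append(memory[args[0]])
        elif op == "POP":                # top of stack -> memory
            memory[args[0]] = stack.pop()
        elif op == "ADD":                # operands are implicit: the top two entries
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "MUL":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op == "SWAP":               # reorder when data is not on top
            stack[-1], stack[-2] = stack[-2], stack[-1]
        else:
            raise ValueError(f"unknown instruction: {instr}")
    return stack


memory = {"A": 2, "B": 3, "C": 4, "D": 0}
# D = (A + B) * C
program = ["PUSH A", "PUSH B", "ADD", "PUSH C", "MUL", "POP D"]
run_stack_program(program, memory)
print(memory["D"])
```

ADD and MUL carry no operand fields, which is where the code density comes from. Every operation also goes through the top of the stack, which is why the stack becomes the bottleneck.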

Accumulator Architecture: Pros and Cons

Pros
- Very low hardware requirements
- Easy to design and understand

Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic

Memory-Memory Architecture: Pros and Cons

Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)

Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required

Memory-Register Architecture: Pros and Cons

Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density

Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers

Flow of Control

- Default is sequential flow
- Several instructions alter this default execution
  - Branches
    - Unconditional
    - Conditional
    - Delayed branches
  - Procedure calls
    - Delayed procedure calls

Flow of Control (cont.)

Branches
- Unconditional
  - Absolute address
  - PC-relative
    - Target address is specified relative to PC contents
    - Relocatable code
- Example: MIPS
  - Absolute address
      j target
  - PC-relative
      b target
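To illustrate why PC-relative branches give relocatable code, the sketch below computes a MIPS-style branch target as (PC + 4) plus a sign-extended 16-bit word offset shifted left by two. The helper names and the example addresses are hypothetical.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` of value as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def branch_target(pc: int, offset16: int) -> int:
    """MIPS-style PC-relative target: (PC + 4) + offset * 4."""
    return (pc + 4 + (sign_extend(offset16, 16) << 2)) & 0xFFFFFFFF


def branch_offset(pc: int, target: int) -> int:
    """Inverse of branch_target: the 16-bit field an assembler would store."""
    delta = (target - (pc + 4)) >> 2
    if not -(1 << 15) <= delta < (1 << 15):
        raise ValueError("target out of range for a 16-bit offset")
    return delta & 0xFFFF


# The same offset field works wherever the code is loaded.
for base in (0x00400000, 0x10000000):
    print(hex(branch_target(base + 0x10, 0xFFFB)))
```

Because the instruction stores only the distance to the target, moving the whole program to a different base address leaves the encoded branch unchanged.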

Flow of Control (cont.)

[Figure: labels include Pentium]

Flow of Control (cont.)

Branches
- Conditional
  - Jump is taken only if the condition is met
- Two types
  - Set-Then-Jump
    - Condition testing is separated from branching
    - Condition code registers are used to convey the condition test result
    - Condition code registers keep a record of the status of the last ALU operation, such as overflow condition
  - Example: Pentium code
      cmp AX,BX    ; compare AX and BX
      je target    ; jump if equal

Flow of Control (cont.)

  - Test-and-Jump
    - Single instruction performs condition testing and branching
  - Example: MIPS instruction
      beq Rsrc1,Rsrc2,target
    Jumps to target if Rsrc1 = Rsrc2

Delayed branching
- Control is transferred after executing the instruction that follows the branch instruction
  - This instruction slot is called the delay slot
- Improves efficiency
- Highly pipelined RISC processors support delayed branching
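The effect of a delay slot can be sketched with a toy interpreter that runs the same program with and without delayed branching. The program representation, the instruction names, and the `run` function are assumptions made for this example, and corner cases such as a branch placed in a delay slot are ignored.

```python
def run(program, delayed: bool):
    """Run a toy program and return the names of the instructions executed.

    Each entry is (name, branch_target_or_None); targets are list indexes.
    """
    executed = []
    pc = 0
    pending = None                  # branch target waiting for the delay slot
    while pc < len(program):
        name, target = program[pc]
        executed.append(name)
        pc += 1
        if pending is not None:     # the delay-slot instruction has just run
            pc, pending = pending, None
        elif target is not None:
            if delayed:
                pending = target    # transfer control after the next instruction
            else:
                pc = target         # transfer control immediately
    return executed


program = [
    ("branch", 3),       # jump to index 3
    ("slot", None),      # instruction that follows the branch
    ("skipped", None),
    ("target", None),
]
print(run(program, delayed=False))
print(run(program, delayed=True))
```

With delayed branching, the instruction after the branch does useful work while the pipeline redirects, instead of being fetched and then discarded.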


Page 67: Chapter 2: Advanced computer Architecture

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 67101

0lo1 of Control cont-0lo1 of Control cont-

8ranches

4nconditional

$ Absolute address

$ PC$relative

U Target address is specified relative to PC contents U 1elocatable code

ltample6 MIPS

$ Absolute address

9 target

$ PC$relative

8 target

0lo1 of Control cont-0lo1 of Control

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 68101

0lo1 of Control cont- -

e entium e R

0lo1 of Control cont-0lo1 of Control

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 69101

lo1 o Co t ol co t- -

8ranches

Conditional

$ ump is ta0en only if the condition is met

To types

$ Set$Then$ump

U Condition testing is separated from branching U Condition code registers are used to convey the condition test

result

U Condition code registers 0eep a record of the status of the last A34 operation such as overflo condition

$ ltample6 Pentium codecm AB comare A ad B

e taret um e0ual

0lo1 of Control cont-0lo1 of Control

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 70101

- -

$ Test$and$ump

U Single instruction performs condition testing and branching

$ ltample6 MIPS instruction

be0 $src$srcamptaret

umps to target if 1src E 1src

elayed branching

Control is transferred after eecuting the instruction thatfollos the branch instruction

$ This instruction slot is called delay slot Improves efficiency

ighly pipelined 1ISC processors support

0lo1 of Control cont-0lo1 of Control

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 71101

- -

Procedure calls Lacilitate modular programming

1eJuire to pieces of information to return

$ ltnd of procedure U Pentium

uses ret instruction

U MIPS

uses 9r instruction

$ 1eturn address U In a (special) register

MIPS allos any general$purpose register

U 5n the stac0

Pentium

0lo1 of Control cont-0lo1 of Control

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 72101

- -

0lo1 of Control cont-0lo1 of Control

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 73101

- -

elay slot

Parameter PassingParameter Passin

g

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 74101

gg

To basic techniJues 1egister$based (eg PoerPC MIPS)

$ Internal registers are used U Laster

U 3imit the number of parameters U 1ecursive procedure

Stac0$based (eg Pentium)

$ Stac0 is used U More general

2 perand Types2

perand Types

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 75101

p yp

Instructions support basic data types

Characters Integers

Lloating$point

Instruction overload

Same instruction for different data types

ltample6 Pentium mo1 A2address loads a 3-bt 1alue

mo1 Aaddress loads a -bt 1alue

mo1 EAaddress loads a amp-bt 1alue

perand Types

perand Types

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 76101

Separate instructions

Instructions specify the operand si-e

ltample6 MIPS

lb $destaddress loads a b4te

l $destaddress loads a al5ord( bts)

l5 $destaddress loads a 5ord

(amp bts)

ld $destaddress loads a double5ord

( bts)imilar instruction store

3 Addressing Modes3 Addressin

g Modes

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 77101

o the operands are specified

5perands can be in three places

$ 1egisters U 1egister addressing mode

$ Part of instruction U Constant

U Immediate addressing mode

U All processors support these to addressing modes

$ Memory U ifference beteen 1ISC and CISC

U CISC supports a large variety of addressing modes

U 1ISC follos load2store architecture

4 Instruction Types4 Instruction T

ypes

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 78101

Several types of instructions

ata movement$ Pentium6 mo1 destsrc

$ Some do not provide direct data movement instructions

$ Indirect data movement

add $dest$src6 $dest = $src+6

Arithmetic and 3ogical

$ Arithmetic U Integer and floating$point signed and unsigned U add subtract multiply divide

$ 3ogical U andB orB notB 7or

Instruction Types cont-Instruction T

ypes cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 79101

Condition code bits

S6 Sign bit (gt E F E $)

6 Vero bit (gt E non-ero E -ero)

$6 5verflo bit (gt E no overflo E overflo)

C6 Carry bit (gt E no carry E carry)

ltample6 Pentium

cm coutamp comare cout to amp

subtract amp rom cout

e taret um e0ual

Instruction Types (cont'd)

Flow control and I/O instructions
- Branch
- Procedure call
- Interrupts

I/O instructions
- Memory-mapped I/O
  * Most processors support memory-mapped I/O
  * No separate instructions for I/O
- Isolated I/O
  * Pentium supports isolated I/O
  * Separate I/O instructions

    in  AX,io_port    ; read from an I/O port
    out io_port,AX    ; write to an I/O port

5 Instruction Formats

Two types

Fixed-length
- Used by RISC processors
- 32-bit RISC processors use 32-bit-wide instructions
  * Examples: SPARC, MIPS, PowerPC

Variable-length
- Used by CISC processors
- Memory operands need more bits to specify

Opcode
- Major and exact operation
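To make the fixed-length idea concrete, here is a minimal sketch that packs and unpacks one 32-bit MIPS R-type instruction (fields opcode, rs, rt, rd, shamt, funct with widths 6, 5, 5, 5, 5, 6). The helper names are invented for this example, and only the R-type layout is covered.

```python
# MIPS R-type field layout, most significant field first.
FIELD_WIDTHS = [("opcode", 6), ("rs", 5), ("rt", 5), ("rd", 5), ("shamt", 5), ("funct", 6)]

def encode_r_type(**fields):
    """Pack named fields into one 32-bit word; missing fields default to 0."""
    word = 0
    for name, width in FIELD_WIDTHS:
        value = fields.get(name, 0)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word = (word << width) | value
    return word

def decode_r_type(word):
    """Split a 32-bit word back into its named fields."""
    fields = {}
    for name, width in reversed(FIELD_WIDTHS):
        fields[name] = word & ((1 << width) - 1)
        word >>= width
    return fields

# add $t0, $t1, $t2  ->  rd=8, rs=9, rt=10, funct=0x20
ADD_T0_T1_T2 = encode_r_type(rs=9, rt=10, rd=8, funct=0x20)
```

Because every field sits at a fixed bit position, decoding needs no knowledge of the previous instruction's length, which is what makes fetch and decode simple on fixed-length machines.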

Examples of Instruction Formats

RISC (Reduced Instruction Set Computer)
versus
CISC (Complex Instruction Set Computer)

RISC vs CISC

The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.

RISC systems access memory only with explicit load and store instructions.

In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.

The difference between CISC and RISC becomes evident through the basic computer performance equation:

    time/program = time/cycle × cycles/instruction × instructions/program

RISC systems shorten execution time by reducing the clock cycles per instruction.

CISC systems improve performance by reducing the number of instructions per program.
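The performance equation multiplies instruction count, cycles per instruction, and cycle time. The sketch below evaluates it for two made-up workloads; every number here is a hypothetical input chosen for illustration, not a measurement.

```python
def execution_time(instructions, cycles_per_instruction, seconds_per_cycle):
    """time/program = instructions/program * cycles/instruction * time/cycle."""
    return instructions * cycles_per_instruction * seconds_per_cycle

# Hypothetical CISC-like profile: fewer instructions, more cycles each.
cisc_time = execution_time(1_000_000, 4.0, 1e-9)

# Hypothetical RISC-like profile: more instructions, fewer cycles each.
risc_time = execution_time(1_500_000, 1.2, 1e-9)
```

With these assumed inputs the RISC-like profile finishes sooner even though it executes more instructions, because the lower cycles-per-instruction term outweighs the larger instruction count.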

The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.

The more complex (and variable) instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.

With fixed-length instructions, RISC lends itself to pipelining and speculative execution.

Consider the following program fragments:

CISC

    mov ax, 10
    mov bx, 5
    mul bx, ax

RISC

    mov ax, 0
    mov bx, 10
    mov cx, 5
    Begin: add ax, bx
    loop Begin

The total clock cycles for the CISC version might be:

    (2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles

While the clock cycles for the RISC version are:

    (3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles

With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds.
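The cycle counts for the two fragments can be checked with a small simulation. This is a sketch under stated assumptions: the operands are 10 and 5, every mov, add and loop costs one cycle, and the multiply costs 30 cycles. The function names are invented for this example.

```python
MUL_CYCLES = 30  # assumed cost of one multi-cycle multiply

def run_cisc():
    """mov ax,10 / mov bx,5 / mul bx,ax"""
    ax, bx = 10, 5
    cycles = 2               # two single-cycle movs
    bx = bx * ax
    cycles += MUL_CYCLES     # one multi-cycle mul
    return bx, cycles

def run_risc():
    """mov ax,0 / mov bx,10 / mov cx,5 / Begin: add ax,bx / loop Begin"""
    ax, bx, cx = 0, 10, 5
    cycles = 3               # three single-cycle movs
    while True:
        ax += bx             # add ax, bx
        cycles += 1
        cx -= 1              # loop: decrement cx, branch back while nonzero
        cycles += 1
        if cx == 0:
            break
    return ax, cycles
```

Both versions compute the same product; the repeated-addition version executes more instructions but fewer total cycles under these assumed costs.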

Because of their load-store ISAs, RISC architectures require a large number of CPU registers.

These registers provide fast access to data during sequential program execution.

They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.

Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.

This is how registers can be overlapped in a RISC system.

The current window pointer (CWP) points to the active register window.
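Overlapping register windows can be modeled as a circular register file in which one window's output registers are the next window's input registers. The sketch below is a simplified model: the sizes (8 in, 8 local, 8 out), the direction in which the CWP moves on a call, and the class name are assumptions, and window overflow handling is left out.

```python
class RegisterWindows:
    """Circular register file with overlapping in/out registers."""

    OFFSETS = {"in": 0, "local": 8, "out": 16}

    def __init__(self, n_windows=4):
        self.n_windows = n_windows
        # Each window contributes 16 new physical registers; its outs overlap the next window's ins.
        self.regs = [0] * (n_windows * 16)
        self.cwp = 0  # current window pointer

    def _index(self, kind, i):
        if not 0 <= i < 8:
            raise IndexError("register number must be 0..7")
        return (self.cwp * 16 + self.OFFSETS[kind] + i) % len(self.regs)

    def read(self, kind, i):
        return self.regs[self._index(kind, i)]

    def write(self, kind, i, value):
        self.regs[self._index(kind, i)] = value

    def call(self):
        """Enter a procedure: advance the CWP so the caller's outs become the callee's ins."""
        self.cwp = (self.cwp + 1) % self.n_windows

    def ret(self):
        """Leave a procedure: move the CWP back to the caller's window."""
        self.cwp = (self.cwp - 1) % self.n_windows
```

A caller writes an argument to an out register, calls, and the callee finds it in the matching in register without any copy through memory; a return value travels back the same way.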

It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.

Some RISC systems provide more extravagant instruction sets than some CISC systems.

Some systems combine both approaches.

The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.

RISC
- Multiple register sets.
- Three operands per instruction.
- Parameter passing through register windows.
- Single-cycle instructions.
- Hardwired control.
- Highly pipelined.

CISC
- Single register set.
- One or two register operands per instruction.
- Parameter passing through memory.
- Multiple cycle instructions.
- Microprogrammed control.
- Less pipelined.

Continued...

RISC
- Simple instructions, few in number.
- Fixed length instructions.
- Complexity in compiler.
- Only LOAD/STORE instructions access memory.
- Few addressing modes.

CISC
- Many complex instructions.
- Variable length instructions.
- Complexity in microcode.
- Many instructions can access memory.
- Many addressing modes.

Summary

Instruction Set Design Issues

Instruction set design issues include:

Where are operands stored?
- registers, memory, stack, accumulator

How many explicit operands are there?
- 0, 1, 2, or 3

How is the operand location specified?
- register, immediate, indirect, ...

What type and size of operands are supported?
- byte, int, float, double, string, vector, ...

What operations are supported?
- add, sub, mul, move, compare, ...

More About General Purpose Registers

Why do almost all new architectures use GPRs?

Registers are much faster than memory (even cache)
- Register values are available immediately
- When memory isn't ready, processor must wait ("stall")

Registers are convenient for variable storage
- Compiler assigns some variables just to registers
- More compact code since small fields specify registers (compared to memory addresses)

[Figure: memory hierarchy with Processor, Registers, Cache, Memory, Disk]

What Operations are Needed

Arithmetic and Logical
- Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
- Logical operation: AND, OR, NOT

Data Transfer - copy, load, store

Control - branch, jump, call, return

Floating Point - ADDF, MULF, DIVF (same as arithmetic but usually take bigger operands)

Decimal - ADDD, CONVERT

String - move, compare, search

Graphics - pixel and vertex, compression/decompression operations

Stack Architecture: Pros and Cons

Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures

Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures
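The need for extra instructions such as SWAP can be seen in a tiny stack-machine interpreter. This is an illustrative sketch: the instruction names, the tuple encoding and the `run` helper are assumptions, and SUB is defined here as "second from top minus top".

```python
def run(program, memory):
    """Execute a list of (opcode, *args) tuples on a stack machine."""
    stack = []
    for op, *args in program:
        if op == "PUSH":                 # push a memory variable
            stack.append(memory[args[0]])
        elif op == "POP":                # pop into a memory variable
            memory[args[0]] = stack.pop()
        elif op == "SWAP":               # exchange the top two entries
            stack[-1], stack[-2] = stack[-2], stack[-1]
        elif op in ("ADD", "SUB"):       # operands are implicit: the top two entries
            top = stack.pop()
            below = stack.pop()
            stack.append(below + top if op == "ADD" else below - top)
        else:
            raise ValueError(f"unknown opcode: {op}")
    return memory

# B happens to be pushed first, so computing A - B needs a SWAP before SUB.
WITH_SWAP = [("PUSH", "B"), ("PUSH", "A"), ("SWAP",), ("SUB",), ("POP", "C")]
WITHOUT_SWAP = [("PUSH", "B"), ("PUSH", "A"), ("SUB",), ("POP", "C")]
```

The arithmetic instructions carry no operand fields, which is where the code density comes from, but every operand has to be steered to the top of the stack first.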

Accumulator Architecture: Pros and Cons

Pros
- Very low hardware requirements
- Easy to design and understand

Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic

Memory-Memory Architecture: Pros and Cons

Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)

Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required

Memory-Register Architecture: Pros and Cons

Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density

Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers

Page 68: Chapter 2: Advanced computer Architecture

Flow of Control (cont'd)

[Figure not recovered; only a "Pentium" label survives]

Flow of Control (cont'd)

Branches

Conditional
- Jump is taken only if the condition is met

Two types
- Set-Then-Jump
  * Condition testing is separated from branching
  * Condition code registers are used to convey the condition test result
  * Condition code registers keep a record of the status of the last ALU operation, such as an overflow condition
- Example: Pentium code

    cmp AX,BX     ; compare AX and BX
    je  target    ; jump if equal

Flow of Control (cont'd)

- Test-and-Jump
  * Single instruction performs condition testing and branching
- Example: MIPS instruction

    beq Rsrc1,Rsrc2,target

  Jumps to target if Rsrc1 = Rsrc2

Delayed branching
- Control is transferred after executing the instruction that follows the branch instruction
  * This instruction slot is called the delay slot
- Improves efficiency
- Highly pipelined RISC processors support it
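Delayed branching can be sketched with a small interpreter in which a taken branch only takes effect after the following instruction has run. The three-opcode instruction set, the tuple encoding and the `run` helper are assumptions for this example, and a branch placed inside a delay slot is not modeled.

```python
def run(program, regs, delayed=True):
    """Execute (opcode, *args) tuples; return the list of executed addresses."""
    pc = 0
    pending = None  # target of a taken branch waiting for its delay slot
    trace = []
    while pc < len(program):
        op, *args = program[pc]
        trace.append(pc)
        next_pc = pc + 1
        take_now, pending = pending, None
        if op == "addi":                      # rd = rs + immediate
            rd, rs, imm = args
            regs[rd] = regs[rs] + imm
        elif op == "beq":                     # test-and-jump in one instruction
            a, b, target = args
            if regs[a] == regs[b]:
                if delayed:
                    pending = target          # transfer after the delay slot
                else:
                    next_pc = target          # transfer immediately
        elif op == "halt":
            break
        else:
            raise ValueError(f"unknown opcode: {op}")
        if take_now is not None:
            next_pc = take_now                # the delay slot has just executed
        pc = next_pc
    return trace

PROGRAM = [
    ("beq", "r1", "r1", 3),     # 0: always taken
    ("addi", "r2", "r2", 1),    # 1: delay slot
    ("addi", "r2", "r2", 100),  # 2: skipped by the branch
    ("halt",),                  # 3
]
```

With delayed branching the instruction at address 1 still executes, so a compiler can place useful work there instead of leaving the pipeline idle while the branch resolves.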

Flow of Control (cont'd)

Procedure calls
- Facilitate modular programming
- Require two pieces of information to return
  * End of procedure
    Pentium uses the ret instruction
    MIPS uses the jr instruction
  * Return address
    In a (special) register: MIPS allows any general-purpose register
    On the stack: Pentium

Flow of Control (cont'd)

[Figures not recovered; one legible label: "Delay slot"]

Parameter Passing

Two basic techniques

Register-based (e.g., PowerPC, MIPS)
- Internal registers are used
  * Faster
  * Limit the number of parameters
  * Recursive procedure

Stack-based (e.g., Pentium)
- Stack is used
  * More general

2 Operand Types

Instructions support basic data types
- Characters
- Integers
- Floating-point

Instruction overload
- Same instruction for different data types
- Example: Pentium

    mov AL,address     ; loads an 8-bit value
    mov AX,address     ; loads a 16-bit value
    mov EAX,address    ; loads a 32-bit value


Page 69: Chapter 2: Advanced computer Architecture

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 69101

lo1 o Co t ol co t- -

8ranches

Conditional

$ ump is ta0en only if the condition is met

To types

$ Set$Then$ump

U Condition testing is separated from branching U Condition code registers are used to convey the condition test

result

U Condition code registers 0eep a record of the status of the last A34 operation such as overflo condition

$ ltample6 Pentium codecm AB comare A ad B

e taret um e0ual

0lo1 of Control cont-0lo1 of Control

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 70101

- -

$ Test$and$ump

U Single instruction performs condition testing and branching

$ ltample6 MIPS instruction

be0 $src$srcamptaret

umps to target if 1src E 1src

elayed branching

Control is transferred after eecuting the instruction thatfollos the branch instruction

$ This instruction slot is called delay slot Improves efficiency

ighly pipelined 1ISC processors support

0lo1 of Control cont-0lo1 of Control

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 71101

- -

Procedure calls Lacilitate modular programming

1eJuire to pieces of information to return

$ ltnd of procedure U Pentium

uses ret instruction

U MIPS

uses 9r instruction

$ 1eturn address U In a (special) register

MIPS allos any general$purpose register

U 5n the stac0

Pentium

0lo1 of Control cont-0lo1 of Control

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 72101

- -

0lo1 of Control cont-0lo1 of Control

cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 73101

- -

elay slot

Parameter PassingParameter Passin

g

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 74101

gg

To basic techniJues 1egister$based (eg PoerPC MIPS)

$ Internal registers are used U Laster

U 3imit the number of parameters U 1ecursive procedure

Stac0$based (eg Pentium)

$ Stac0 is used U More general

2 perand Types2

perand Types

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 75101

p yp

Instructions support basic data types

Characters Integers

Lloating$point

Instruction overload

Same instruction for different data types

ltample6 Pentium mo1 A2address loads a 3-bt 1alue

mo1 Aaddress loads a -bt 1alue

mo1 EAaddress loads a amp-bt 1alue

perand Types

perand Types

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 76101

Separate instructions

Instructions specify the operand si-e

ltample6 MIPS

lb $destaddress loads a b4te

l $destaddress loads a al5ord( bts)

l5 $destaddress loads a 5ord

(amp bts)

ld $destaddress loads a double5ord

( bts)imilar instruction store

3 Addressing Modes3 Addressin

g Modes

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 77101

o the operands are specified

5perands can be in three places

$ 1egisters U 1egister addressing mode

$ Part of instruction U Constant

U Immediate addressing mode

U All processors support these to addressing modes

$ Memory U ifference beteen 1ISC and CISC

U CISC supports a large variety of addressing modes

U 1ISC follos load2store architecture

4 Instruction Types4 Instruction T

ypes

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 78101

Several types of instructions

ata movement$ Pentium6 mo1 destsrc

$ Some do not provide direct data movement instructions

$ Indirect data movement

add $dest$src6 $dest = $src+6

Arithmetic and 3ogical

$ Arithmetic U Integer and floating$point signed and unsigned U add subtract multiply divide

$ 3ogical U andB orB notB 7or

Instruction Types cont-Instruction T

ypes cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 79101

Condition code bits

S6 Sign bit (gt E F E $)

6 Vero bit (gt E non-ero E -ero)

$6 5verflo bit (gt E no overflo E overflo)

C6 Carry bit (gt E no carry E carry)

ltample6 Pentium

cm coutamp comare cout to amp

subtract amp rom cout

e taret um e0ual

Instruction Types cont-Instruction T

ypes cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 80101

Llo control and I25 instructions

$ 8ranch

$ Procedure call

$ Interrupts

I25 instructions$ Memory$mapped I25

U Most processors support memory$mapped I25

U 7o separate instructions for I25

$ Isolated I25 U Pentium supports isolated I25

U Separate I25 instructions

Ao7ort read from an IO ort

out o7ortA rte to an IO ort

5 Instruction 0ormats5 Instruction 0ormats

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 81101

To types

Lied$length$ 4sed by 1ISC processors

$ $bit 1ISC processors use $bits ide instructions U ltamples6 SPA1C MIPS PoerPC

ariable$length

$ 4sed by CISC processors

$ Memory operands need more bits to specify

5pcode

MaOor and eact operation

Examples of Instruction 0ormatsExam

ples of Instruction 0ormats

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 82101

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 83101

ISC e)uce) Instruction Set Computer 3

ersus

CISC Comple Instruction Set Computer3

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 84101

0

RISC s CISCRISC s CISC

The underlying philosophy of 1ISC machines is that asystem is better able to manage program eecutionhen the program consists of only a fe differentinstructions that are the same length and reJuire thesame number of cloc0 cycles to decode and eecute

1ISC systems access memory only ith eplicit loadand store instructions

In CISC systems many different 0inds of instructionsaccess memory ma0ing instruction length variableand fetch$decode$eecute time unpredictable

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 85101

The difference beteen CISC and 1ISC becomesevident through the basic computer performanceeJuation6

1ISC systems shorten eecution time by reducingthe cloc0 cycles per instruction

CISC systems improve performance by reducing thenumber of instructions per program

RISC s CISCRISC s CISC

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 86101

(

The simple instruction set of 1ISC machinesenables control units to be hardired for maimumspeed

The more comple$$ and variable$$ instruction set of

CISC machines reJuires microcode$based controlunits that interpret instructions as they are fetchedfrom memory This translation ta0es time

Dith fied$length instructions 1ISC lends itself topipelining and speculative eecution

RISC vs CISC

Consider the program fragments:

CISC:
    mov ax, 10
    mov bx, 5
    mul bx, ax

RISC:
    mov ax, 0
    mov bx, 10
    mov cx, 5
Begin:
    add ax, bx
    loop Begin

The total clock cycles for the CISC version might be:
(2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles

While the clock cycles for the RISC version are:
(3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles

With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds.
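The cycle accounting behind this comparison can be sketched as follows. The per-instruction costs (1 cycle for mov, add, and loop; 30 cycles for mul) are the assumed values of this example, not measurements of a real machine, and the trace lists are a hypothetical encoding of the two fragments.

```python
# Assumed per-instruction costs in clock cycles (illustrative only).
CYCLES = {"mov": 1, "add": 1, "loop": 1, "mul": 30}

def total_cycles(trace):
    """Sum the cycle cost of every executed instruction in a trace."""
    return sum(CYCLES[op] for op in trace)

# CISC fragment: two moves, then one multi-cycle multiply.
cisc_trace = ["mov", "mov", "mul"]

# RISC fragment: three moves, then an add/loop pair executed five times
# (multiplication by repeated addition).
risc_trace = ["mov", "mov", "mov"] + ["add", "loop"] * 5

print(total_cycles(cisc_trace))  # 32
print(total_cycles(risc_trace))  # 13
```

The RISC trace executes more instructions (13 against 3) yet needs fewer cycles, which is the point the slide makes.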

RISC vs CISC

Because of their load-store ISAs, RISC architectures require a large number of CPU registers.

These registers provide fast access to data during sequential program execution.

They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.

Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.

RISC vs CISC

This is how registers can be overlapped in a RISC system.

The current window pointer (CWP) points to the active register window.
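A toy model of overlapping register windows may make the idea concrete. Everything here is an assumption made for illustration: the class name, the window sizes, and the direction in which the current window pointer moves. Real designs differ in these details and must also handle running out of windows, which this sketch ignores.

```python
class RegisterWindows:
    """Toy model: each window's "out" registers are the next window's "in" registers."""

    def __init__(self, n_windows=4, n_shared=4, n_local=4):
        self.n_windows = n_windows
        self.n_shared = n_shared           # registers shared between neighbours
        self.stride = n_shared + n_local   # physical registers added per window
        self.regs = [0] * (self.stride * n_windows)
        self.cwp = 0                       # current window pointer

    def _index(self, kind, i):
        # Map a (kind, number) pair in the active window to a physical register.
        base = self.cwp * self.stride
        offset = {"in": 0, "local": self.n_shared, "out": self.stride}[kind]
        return (base + offset + i) % len(self.regs)

    def write(self, kind, i, value):
        self.regs[self._index(kind, i)] = value

    def read(self, kind, i):
        return self.regs[self._index(kind, i)]

    def call(self):
        # Entering a procedure makes the next window the active one.
        self.cwp = (self.cwp + 1) % self.n_windows

    def ret(self):
        # Returning makes the caller's window active again.
        self.cwp = (self.cwp - 1) % self.n_windows


rw = RegisterWindows()
rw.write("out", 0, 42)        # caller places an argument in an "out" register
rw.call()                     # switch to the callee's window
arg = rw.read("in", 0)        # callee sees the same physical register as "in"
rw.write("in", 0, arg + 1)    # callee leaves a return value in place
rw.ret()
result = rw.read("out", 0)    # caller reads it back without touching memory
print(arg, result)
```

No value is copied and no stack access occurs during the call; only the window pointer changes.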

RISC vs CISC

It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.

Some RISC systems provide more extravagant instruction sets than some CISC systems.

Some systems combine both approaches.

The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.

RISC vs CISC

RISC                                          CISC
Multiple register sets.                       Single register set.
Three operands per instruction.               One or two register operands per instruction.
Parameter passing through register windows.   Parameter passing through memory.
Single-cycle instructions.                    Multiple-cycle instructions.
Hardwired control.                            Microprogrammed control.
Highly pipelined.                             Less pipelined.

(Continued)

RISC vs CISC

RISC                                          CISC
Simple instructions, few in number.           Many complex instructions.
Fixed-length instructions.                    Variable-length instructions.
Complexity in compiler.                       Complexity in microcode.
Only LOAD/STORE instructions access memory.   Many instructions can access memory.
Few addressing modes.                         Many addressing modes.

RISC vs CISC

Summary

Instruction Set Design Issues

Instruction set design issues include:

Where are operands stored?
- registers, memory, stack, accumulator

How many explicit operands are there?
- 0, 1, 2, or 3

How is the operand location specified?
- register, immediate, indirect, ...

What type and size of operands are supported?
- byte, int, float, double, string, vector, ...

What operations are supported?
- add, sub, mul, move, compare, ...

More About General Purpose Registers

Why do almost all new architectures use GPRs?

Registers are much faster than memory (even cache)
- Register values are available immediately
- When memory isn't ready, the processor must wait ("stall")

Registers are convenient for variable storage
- Compiler assigns some variables just to registers
- More compact code, since small fields specify registers (compared to memory addresses)

[Figure: Processor, Registers, Cache, Memory, Disk]

What Operations are Needed

Arithmetic and Logical
- Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
- Logical operation: AND, OR, XOR, NOT

Data Transfer - copy, load, store

Control - branch, jump, call, return

Floating Point - ADDF, MULF, DIVF (same as arithmetic, but usually take bigger operands)

Decimal - ADDD, CONVERT

String - move, compare, search

Graphics - pixel and vertex, compression/decompression operations

Stack Architecture Pros and Cons

Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures

Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures
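A minimal sketch of a stack machine shows both sides of this list: operands are implicit, so instructions are short, but an operand in the wrong position needs an extra instruction. The mnemonics and the tuple encoding are hypothetical and not taken from any real instruction set.

```python
def run_stack_program(program, memory):
    """Interpret a tiny stack-machine program against a dict used as memory."""
    stack = []
    for op, *args in program:
        if op == "push":                 # memory -> top of stack
            stack.append(memory[args[0]])
        elif op == "pop":                # top of stack -> memory
            memory[args[0]] = stack.pop()
        elif op == "add":                # operands are implicit: the top two entries
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "sub":
            b, a = stack.pop(), stack.pop()
            stack.append(a - b)
        elif op == "swap":               # needed when operands are in the wrong order
            stack[-1], stack[-2] = stack[-2], stack[-1]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory


memory = {"A": 3, "B": 4, "C": 0, "D": 0}
program = [
    ("push", "A"), ("push", "B"), ("add",), ("pop", "C"),              # C = A + B
    ("push", "A"), ("push", "B"), ("swap",), ("sub",), ("pop", "D"),   # D = B - A
]
run_stack_program(program, memory)
print(memory["C"], memory["D"])
```

Every arithmetic instruction reads and writes the top of the stack, which is why the stack becomes the bottleneck and why independent operations are hard to overlap.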

Accumulator Architecture Pros and Cons

Pros
- Very low hardware requirements
- Easy to design and understand

Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic

Memory-Memory Architecture Pros and Cons

Pros
- Requires fewer instructions (especially with 3 operands)
- Easy to write compilers for (especially with 3 operands)

Cons
- Very high memory traffic (especially with 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required

Memory-Register Architecture Pros and Cons

Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density

Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers
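To compare the architecture styles side by side, the sketch below writes out one plausible instruction sequence for C = A + B in each style and counts instructions and data-memory accesses. The mnemonics, register names, and the load-store row are assumptions added for illustration.

```python
# Each entry is (hypothetical mnemonic, data-memory accesses it performs).
SEQUENCES = {
    "stack": [("push A", 1), ("push B", 1), ("add", 0), ("pop C", 1)],
    "accumulator": [("load A", 1), ("add B", 1), ("store C", 1)],
    "memory-memory": [("add C, A, B", 3)],
    "memory-register": [("load R1, A", 1), ("add R1, B", 1), ("store C, R1", 1)],
    "load-store": [("load R1, A", 1), ("load R2, B", 1),
                   ("add R3, R1, R2", 0), ("store C, R3", 1)],
}

# Map each style to (instruction count, total data-memory accesses).
summary = {
    name: (len(seq), sum(accesses for _, accesses in seq))
    for name, seq in SEQUENCES.items()
}

for name, (n_instr, n_mem) in summary.items():
    print(f"{name:16s} instructions={n_instr} memory accesses={n_mem}")
```

For a single statement every style touches memory three times; what changes is the number of instructions. The traffic differences listed on these slides appear once values are reused, because only the register-based styles can keep an operand out of memory between statements.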

Flow of Control (cont.)

- Test-and-Jump
  - Single instruction performs condition testing and branching
  - Example: MIPS instruction

        beq Rsrc1, Rsrc2, target

    Jumps to target if Rsrc1 = Rsrc2

Delayed branching
- Control is transferred after executing the instruction that follows the branch instruction
  - This instruction slot is called the delay slot
- Improves efficiency
- Supported by highly pipelined RISC processors
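Delayed branching can be sketched with a very small interpreter. The instruction encoding and the `run` helper are hypothetical, and the model assumes the instruction in the delay slot is not itself a jump.

```python
def run(program, delayed=True):
    """Execute (op, arg) pairs and return the instructions in execution order."""
    pc, executed = 0, []
    pending_target = None              # branch target waiting for the delay slot
    while pc < len(program):
        op, arg = program[pc]
        executed.append((op, arg))
        next_pc = pc + 1
        if pending_target is not None:
            # The delay-slot instruction has just run; now take the branch.
            next_pc, pending_target = pending_target, None
        elif op == "jump":
            if delayed:
                pending_target = arg   # transfer control one instruction later
            else:
                next_pc = arg          # transfer control immediately
        pc = next_pc
    return executed


program = [
    ("jump", 3),            # branch to index 3
    ("work", "delay slot"),
    ("work", "skipped"),
    ("work", "target"),
]
with_slot = run(program, delayed=True)
without_slot = run(program, delayed=False)
print(with_slot)
print(without_slot)
```

With delayed branching the instruction after the jump still executes, so a compiler can place useful work there instead of leaving the pipeline idle.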

Flow of Control (cont.)

Procedure calls
- Facilitate modular programming
- Require two pieces of information to return
  - End of procedure
    - Pentium uses the ret instruction
    - MIPS uses the jr instruction
  - Return address
    - In a (special) register: MIPS allows any general-purpose register
    - On the stack: Pentium

Flow of Control (cont.)

[Figure: delay slot]

Parameter Passing

Two basic techniques

Register-based (e.g., PowerPC, MIPS)
- Internal registers are used
  - Faster
  - Limits the number of parameters
  - Complicates recursive procedures

Stack-based (e.g., Pentium)
- Stack is used
  - More general

2. Operand Types

Instructions support basic data types
- Characters
- Integers
- Floating-point

Instruction overload
- Same instruction for different data types
- Example: Pentium

      mov AL,address     ; loads an 8-bit value
      mov AX,address     ; loads a 16-bit value
      mov EAX,address    ; loads a 32-bit value

Operand Types (cont.)

Separate instructions
- Instructions specify the operand size
- Example: MIPS

      lb Rdest,address    # loads a byte
      lh Rdest,address    # loads a halfword (16 bits)
      lw Rdest,address    # loads a word (32 bits)
      ld Rdest,address    # loads a doubleword (64 bits)

- Similar instructions exist for store
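What "the instruction specifies the operand size" means can be sketched by loading 1, 2, 4, or 8 bytes from the same address. The `load` helper, the memory contents, and the big-endian byte order are assumptions made for this example.

```python
# A small byte-addressed memory image (contents chosen arbitrarily).
memory = bytes([0x80, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07])

def load(address, size, signed=True):
    """Load `size` bytes starting at `address` as a big-endian integer."""
    return int.from_bytes(memory[address:address + size], "big", signed=signed)

byte_value = load(0, 1)                     # like a byte load: 0x80 sign-extends
unsigned_byte = load(0, 1, signed=False)    # the same byte without sign extension
halfword = load(0, 2, signed=False)         # 16 bits
word = load(4, 4)                           # 32 bits
doubleword = load(0, 8, signed=False)       # 64 bits

print(byte_value, unsigned_byte, hex(halfword), hex(word), hex(doubleword))
```

The address is identical in the first three loads; only the size chosen by the instruction changes how many bytes are read and how they are interpreted.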

3. Addressing Modes

How the operands are specified

Operands can be in three places
- Registers
  - Register addressing mode
- Part of instruction
  - Constant
  - Immediate addressing mode
  - All processors support these two addressing modes
- Memory
  - Difference between RISC and CISC
  - CISC supports a large variety of addressing modes
  - RISC follows load/store architecture
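The three operand locations can be sketched as a single operand-fetch function. The mode names, register names, and memory contents are hypothetical, and the two memory modes shown are only a small sample of what a CISC machine might offer.

```python
def fetch_operand(mode, value, registers, memory):
    """Return the operand selected by an addressing mode."""
    if mode == "register":            # operand is in a register
        return registers[value]
    if mode == "immediate":           # operand is a constant inside the instruction
        return value
    if mode == "direct":              # instruction holds the memory address
        return memory[value]
    if mode == "register_indirect":   # a register holds the memory address
        return memory[registers[value]]
    raise ValueError(f"unknown addressing mode: {mode}")


registers = {"r1": 100, "r2": 5}
memory = {100: 7, 200: 9}

print(fetch_operand("register", "r2", registers, memory))
print(fetch_operand("immediate", 25, registers, memory))
print(fetch_operand("direct", 200, registers, memory))
print(fetch_operand("register_indirect", "r1", registers, memory))
```

In a load/store design only the load and store instructions would use the memory modes; arithmetic instructions would be limited to the register and immediate cases.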

4. Instruction Types

Several types of instructions

Data movement
- Pentium: mov dest,src
- Some do not provide direct data movement instructions
- Indirect data movement

      add Rdest,Rsrc,0    /* Rdest = Rsrc+0 */

Arithmetic and Logical
- Arithmetic
  - Integer and floating-point, signed and unsigned
  - add, subtract, multiply, divide
- Logical
  - and, or, not, xor

Instruction Types (cont.)

Condition code bits
- S: Sign bit (0 = +, 1 = -)
- Z: Zero bit (0 = nonzero, 1 = zero)
- O: Overflow bit (0 = no overflow, 1 = overflow)
- C: Carry bit (0 = no carry, 1 = carry)

Example: Pentium

      cmp count,25    ; compare count to 25 (subtract 25 from count)
      je target       ; jump if equal
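How a compare sets the condition code bits can be sketched by performing the subtraction and discarding the result. The `compare_flags` helper is hypothetical; it assumes 32-bit two's-complement operands and treats the carry bit as the borrow out of the unsigned subtraction.

```python
def compare_flags(a, b, bits=32):
    """Condition code bits produced by computing a - b on `bits`-bit values."""
    mask = (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "S": int(bool(result & sign_bit)),   # sign of the result
        "Z": int(result == 0),               # result is zero, so a == b
        # Signed overflow: operands differ in sign and the result's sign
        # differs from the sign of the first operand.
        "O": int(bool((a ^ b) & (a ^ result) & sign_bit)),
        "C": int(a < b),                     # borrow from the unsigned subtraction
    }


equal = compare_flags(25, 25)
smaller = compare_flags(10, 25)
overflow = compare_flags(-2**31, 1)
print(equal, smaller, overflow)
```

A conditional jump such as "jump if equal" then only needs to test the Z bit left behind by the compare.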

Instruction Types (cont.)

Flow control and I/O instructions
- Branch
- Procedure call
- Interrupts

I/O instructions
- Memory-mapped I/O
  - Most processors support memory-mapped I/O
  - No separate instructions for I/O
- Isolated I/O
  - Pentium supports isolated I/O
  - Separate I/O instructions

      in AX,io_port     ; read from an I/O port
      out io_port,AX    ; write to an I/O port

5. Instruction Formats

Two types

Fixed-length
- Used by RISC processors
- 32-bit RISC processors use 32-bit-wide instructions
  - Examples: SPARC, MIPS, PowerPC

Variable-length
- Used by CISC processors
- Memory operands need more bits to specify

Opcode
- Major and exact operation
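One benefit of a fixed-length format is that every field sits at a known bit position. The sketch below splits a 32-bit MIPS R-type instruction word into its fields (opcode 6 bits, rs 5, rt 5, rd 5, shamt 5, funct 6); the function name is hypothetical.

```python
def decode_r_type(word):
    """Split a 32-bit MIPS R-type instruction word into its six fields."""
    return {
        "opcode": (word >> 26) & 0x3F,   # bits 31..26
        "rs":     (word >> 21) & 0x1F,   # bits 25..21, first source register
        "rt":     (word >> 16) & 0x1F,   # bits 20..16, second source register
        "rd":     (word >> 11) & 0x1F,   # bits 15..11, destination register
        "shamt":  (word >> 6) & 0x1F,    # bits 10..6, shift amount
        "funct":  word & 0x3F,           # bits 5..0, exact operation
    }


# 0x012A4020 encodes: add $t0, $t1, $t2  ($t0 = 8, $t1 = 9, $t2 = 10)
fields = decode_r_type(0x012A4020)
print(fields)
```

The opcode field gives the major operation (0 for R-type) and the funct field gives the exact one (0x20 for add). A variable-length format cannot be decoded with fixed shifts like these, because field positions depend on earlier bytes.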

Examples of Instruction Formats

RISC (Reduced Instruction Set Computer)

versus

CISC (Complex Instruction Set Computer)



Flow of Control (cont.)

Delay slot

Parameter Passing

Two basic techniques

Register-based (e.g., PowerPC, MIPS)

- Internal registers are used
  - Faster
  - Limit the number of parameters
  - Recursive procedure

Stack-based (e.g., Pentium)

- Stack is used
  - More general

2. Operand Types

Instructions support basic data types

- Characters
- Integers
- Floating-point

Instruction overload

- Same instruction for different data types
- Example: Pentium

    mov AL,address     ; loads an 8-bit value
    mov AX,address     ; loads a 16-bit value
    mov EAX,address    ; loads a 32-bit value

Operand Types (cont.)

Separate instructions

- Instructions specify the operand size
- Example: MIPS

    lb $dest,address    # loads a byte
    lh $dest,address    # loads a halfword (16 bits)
    lw $dest,address    # loads a word (32 bits)
    ld $dest,address    # loads a doubleword (64 bits)

- Similar instructions exist for store
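The effect of size-specific loads can be sketched in Python. This is only an illustration under stated assumptions: memory is byte-addressed and little-endian, values are read as unsigned (real MIPS `lb` and `lh` sign-extend), and alignment is ignored. The `load` helper and the memory contents are made up for the example.

```python
# Illustrative model: a tiny little-endian, byte-addressed memory (contents are made up).
memory = bytes([0x78, 0x56, 0x34, 0x12, 0xEF, 0xCD, 0xAB, 0x90])

# Operand size in bytes selected by each load mnemonic.
LOAD_SIZES = {"lb": 1, "lh": 2, "lw": 4, "ld": 8}

def load(mnemonic, address):
    """Return the unsigned value read by a load of the given size (hypothetical helper)."""
    size = LOAD_SIZES[mnemonic]
    return int.from_bytes(memory[address:address + size], "little")

for op in ("lb", "lh", "lw", "ld"):
    print(f"{op}: {LOAD_SIZES[op] * 8:2d} bits -> {load(op, 0):#x}")
```

The same address yields a different value for each mnemonic because the opcode, not the operand, fixes how many bytes are read.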

3. Addressing Modes

How the operands are specified

Operands can be in three places

- Registers
  - Register addressing mode
- Part of instruction
  - Constant
  - Immediate addressing mode
  - All processors support these two addressing modes
- Memory
  - Difference between RISC and CISC
  - CISC supports a large variety of addressing modes
  - RISC follows load/store architecture
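The three operand locations can be sketched as a small Python function. The register names, memory contents, and the base-plus-displacement form of the memory mode are assumptions chosen for illustration, not a description of any particular processor.

```python
# Illustrative machine state (names and values are assumptions).
registers = {"r1": 100, "r2": 7}
memory = {100: 42, 104: 99}

def operand(mode, value, displacement=0):
    """Fetch an operand according to its addressing mode."""
    if mode == "register":    # operand is held in a register
        return registers[value]
    if mode == "immediate":   # operand is a constant carried in the instruction
        return value
    if mode == "memory":      # operand is in memory at base register + displacement
        return memory[registers[value] + displacement]
    raise ValueError(f"unknown addressing mode: {mode}")

print(operand("register", "r2"))    # 7
print(operand("immediate", 5))      # 5
print(operand("memory", "r1", 4))   # 99
```

In a load/store design only the load and store instructions would use the memory branch; arithmetic instructions would be limited to the first two.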

4. Instruction Types

Several types of instructions

Data movement

- Pentium: mov dest,src
- Some do not provide direct data movement instructions
- Indirect data movement

    add $dest,$src,0    ; $dest = $src+0

Arithmetic and Logical

- Arithmetic
  - Integer and floating-point, signed and unsigned
  - add, subtract, multiply, divide
- Logical
  - and, or, not, xor

Instruction Types (cont.)

Condition code bits

- S: Sign bit (0 = +, 1 = -)
- Z: Zero bit (0 = nonzero, 1 = zero)
- O: Overflow bit (0 = no overflow, 1 = overflow)
- C: Carry bit (0 = no carry, 1 = carry)

Example: Pentium

    cmp count,25    ; compare count to 25
                    ; (subtract 25 from count)
    je target       ; jump if equal
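How a compare sets the condition code bits can be sketched in Python. The sketch assumes 32-bit operands and the x86 convention that the carry bit records a borrow on subtraction; `cmp_flags` is a hypothetical helper, not a processor API.

```python
MASK = 0xFFFFFFFF  # assume 32-bit operands

def cmp_flags(a, b):
    """Condition code bits produced by computing a - b, as a compare would."""
    a &= MASK
    b &= MASK
    result = (a - b) & MASK
    sign = result >> 31                 # top bit of the result
    zero = int(result == 0)
    carry = int(a < b)                  # unsigned borrow (x86 convention, assumed)
    # Signed overflow: operands differ in sign and the result's sign differs from a's.
    overflow = ((a ^ b) & (a ^ result)) >> 31
    return {"S": sign, "Z": zero, "O": overflow, "C": carry}

print(cmp_flags(25, 25))  # Z is 1, so a following "jump if equal" would be taken
print(cmp_flags(10, 25))  # Z is 0, S and C are 1
```

A conditional jump then only inspects the bits; it does not repeat the subtraction.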

Instruction Types (cont.)

Flow control and I/O instructions

- Branch
- Procedure call
- Interrupts

I/O instructions

- Memory-mapped I/O
  - Most processors support memory-mapped I/O
  - No separate instructions for I/O
- Isolated I/O
  - Pentium supports isolated I/O
  - Separate I/O instructions

    in AX,io_port     ; read from an I/O port
    out io_port,AX    ; write to an I/O port

5. Instruction Formats

Two types

Fixed-length

- Used by RISC processors
- 32-bit RISC processors use 32-bit-wide instructions
  - Examples: SPARC, MIPS, PowerPC

Variable-length

- Used by CISC processors
- Memory operands need more bits to specify

Opcode

- Major and exact operation

Examples of Instruction Formats

RISC (Reduced Instruction Set Computer) versus CISC (Complex Instruction Set Computer)

RISC vs. CISC

The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.

RISC systems access memory only with explicit load and store instructions.

In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.

The difference between CISC and RISC becomes evident through the basic computer performance equation:

    time/program = (time/cycle) × (cycles/instruction) × (instructions/program)

RISC systems shorten execution time by reducing the clock cycles per instruction.

CISC systems improve performance by reducing the number of instructions per program.

The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.

The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.

With fixed-length instructions, RISC lends itself to pipelining and speculative execution.

Consider the program fragments:

CISC

    mov ax, 10
    mov bx, 5
    mul bx, ax

RISC

           mov ax, 0
           mov bx, 10
           mov cx, 5
    Begin: add ax, bx
           loop Begin

The total clock cycles for the CISC version might be:

    (2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles

While the clock cycles for the RISC version are:

    (3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles

With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds.
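The cycle arithmetic for the two fragments can be checked with a short Python sketch. It assumes the costs used in the example (1 cycle each for mov, add, and loop, and 30 cycles for mul) and that the RISC loop body runs five times; the `total_cycles` helper is illustrative only.

```python
# Cycle costs assumed in the example: mov, add and loop take 1 cycle; mul takes 30.
CYCLES = {"mov": 1, "add": 1, "loop": 1, "mul": 30}

def total_cycles(executed):
    """Sum cycles over (mnemonic, times_executed) pairs."""
    return sum(CYCLES[op] * count for op, count in executed)

# Dynamic instruction counts for each fragment.
cisc = [("mov", 2), ("mul", 1)]
risc = [("mov", 3), ("add", 5), ("loop", 5)]

print("CISC cycles:", total_cycles(cisc))  # 32
print("RISC cycles:", total_cycles(risc))  # 13
```

The RISC fragment executes more instructions (13 against 3) yet needs fewer cycles, which is the trade the performance equation describes.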

Because of their load-store ISAs, RISC architectures require a large number of CPU registers.

These registers provide fast access to data during sequential program execution.

They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.

Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.

This is how registers can be overlapped in a RISC system.

The current window pointer (CWP) points to the active register window.
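Overlapping register windows can be sketched in Python as follows. The layout is an assumption modeled loosely on SPARC-style windows (8 "in", 8 "local", and 8 "out" registers per window, with the caller's outs doubling as the callee's ins). The number of windows, the class and method names, and the direction in which the CWP moves on a call are illustrative choices, and window overflow is not handled.

```python
WINDOWS = 4          # hypothetical number of windows
REGS_PER_GROUP = 8   # ins, locals and outs each hold 8 registers (assumption)

class RegisterWindows:
    def __init__(self):
        # Each window adds 8 locals + 8 outs; its ins are the previous window's outs.
        self.file = [0] * (WINDOWS * 2 * REGS_PER_GROUP)
        self.cwp = 0  # current window pointer

    def _index(self, group, n):
        base = self.cwp * 2 * REGS_PER_GROUP
        offset = {"in": 0, "local": REGS_PER_GROUP, "out": 2 * REGS_PER_GROUP}[group]
        return (base + offset + n) % len(self.file)  # circular register file

    def read(self, group, n):
        return self.file[self._index(group, n)]

    def write(self, group, n, value):
        self.file[self._index(group, n)] = value

    def call(self):   # entering a procedure activates the next window
        self.cwp = (self.cwp + 1) % WINDOWS

    def ret(self):    # returning reactivates the caller's window
        self.cwp = (self.cwp - 1) % WINDOWS

rw = RegisterWindows()
rw.write("out", 0, 42)    # caller places a parameter in an out register
rw.call()
print(rw.read("in", 0))   # callee sees 42 in its in register; nothing was copied
```

Because the caller's outs and the callee's ins are the same physical registers, passing a parameter costs only a change of the CWP.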

It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.

Some RISC systems provide more extravagant instruction sets than some CISC systems.

Some systems combine both approaches.

The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.

RISC

- Multiple register sets.
- Three operands per instruction.
- Parameter passing through register windows.
- Single-cycle instructions.
- Hardwired control.
- Highly pipelined.

CISC

- Single register set.
- One or two register operands per instruction.
- Parameter passing through memory.
- Multiple cycle instructions.
- Microprogrammed control.
- Less pipelined.

Continued...

RISC

- Simple instructions, few in number.
- Fixed length instructions.
- Complexity in compiler.
- Only LOAD/STORE instructions access memory.
- Few addressing modes.

CISC

- Many complex instructions.
- Variable length instructions.
- Complexity in microcode.
- Many instructions can access memory.
- Many addressing modes.

Summary

Instruction Set Design Issues

Instruction set design issues include:

Where are operands stored?

- registers, memory, stack, accumulator

How many explicit operands are there?

- 0, 1, 2, or 3

How is the operand location specified?

- register, immediate, indirect, . . .

What type and size of operands are supported?

- byte, int, float, double, string, vector . . .

What operations are supported?

- add, sub, mul, move, compare . . .

More About General Purpose Registers

Why do almost all new architectures use GPRs?

Registers are much faster than memory (even cache)

- Register values are available immediately
- When memory isn't ready, the processor must wait ("stall")

Registers are convenient for variable storage

- Compiler assigns some variables just to registers
- More compact code, since small fields specify registers (compared to memory addresses)

[Figure: memory hierarchy with Processor (Registers), Cache, Memory, Disk]

What Operations are Needed

Arithmetic + Logical

- Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
- Logical operation: AND, OR, XOR, NOT

Data Transfer - copy, load, store

Control - branch, jump, call, return

Floating Point - ADDF, MULF, DIVF. Same as arithmetic but usually take bigger operands

Decimal - ADDD, CONVERT

String - move, compare, search

Graphics - pixel and vertex, compression/decompression operations

Stack Architecture: Pros and Cons

Pros

- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures

Cons

- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures
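The implicit-operand style of a stack architecture, and the need for instructions such as SWAP, can be sketched with a toy interpreter in Python. The instruction names and the tuple encoding are hypothetical, chosen only to illustrate the points above.

```python
def run_stack_program(program, memory):
    """Execute a toy zero-address program; ADD, SUB and SWAP name no operands."""
    stack = []
    for op, *args in program:
        if op == "PUSH":                 # memory -> top of stack
            stack.append(memory[args[0]])
        elif op == "POP":                # top of stack -> memory
            memory[args[0]] = stack.pop()
        elif op == "ADD":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif op == "SUB":
            right, left = stack.pop(), stack.pop()
            stack.append(left - right)
        elif op == "SWAP":               # reorder when data is not where it is needed
            stack[-1], stack[-2] = stack[-2], stack[-1]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory

mem = {"A": 3, "B": 4}
# C = A + B: operands are implicit, so ADD needs no address fields.
run_stack_program([("PUSH", "A"), ("PUSH", "B"), ("ADD",), ("POP", "C")], mem)
# D = B - A with A pushed first: an extra SWAP is needed before SUB.
run_stack_program([("PUSH", "A"), ("PUSH", "B"), ("SWAP",), ("SUB",), ("POP", "D")], mem)
print(mem["C"], mem["D"])  # 7 1
```

Every operation goes through the top of the stack, which is why it becomes the bottleneck and why the operations are hard to overlap.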

Accumulator Architecture: Pros and Cons

Pros

- Very low hardware requirements
- Easy to design and understand

Cons

- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic

Memory-Memory Architecture: Pros and Cons

Pros

- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)

Cons

- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required

Memory-Register Architecture: Pros and Cons

Pros

- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density

Cons

- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers

Page 73: Chapter 2: Advanced computer Architecture

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 73101

- -

elay slot

Parameter PassingParameter Passin

g

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 74101

gg

To basic techniJues 1egister$based (eg PoerPC MIPS)

$ Internal registers are used U Laster

U 3imit the number of parameters U 1ecursive procedure

Stac0$based (eg Pentium)

$ Stac0 is used U More general

2 perand Types2

perand Types

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 75101

p yp

Instructions support basic data types

Characters Integers

Lloating$point

Instruction overload

Same instruction for different data types

ltample6 Pentium mo1 A2address loads a 3-bt 1alue

mo1 Aaddress loads a -bt 1alue

mo1 EAaddress loads a amp-bt 1alue

perand Types

perand Types

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 76101

Separate instructions

Instructions specify the operand si-e

ltample6 MIPS

lb $destaddress loads a b4te

l $destaddress loads a al5ord( bts)

l5 $destaddress loads a 5ord

(amp bts)

ld $destaddress loads a double5ord

( bts)imilar instruction store

3 Addressing Modes3 Addressin

g Modes

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 77101

o the operands are specified

5perands can be in three places

$ 1egisters U 1egister addressing mode

$ Part of instruction U Constant

U Immediate addressing mode

U All processors support these to addressing modes

$ Memory U ifference beteen 1ISC and CISC

U CISC supports a large variety of addressing modes

U 1ISC follos load2store architecture

4 Instruction Types4 Instruction T

ypes

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 78101

Several types of instructions

ata movement$ Pentium6 mo1 destsrc

$ Some do not provide direct data movement instructions

$ Indirect data movement

add $dest$src6 $dest = $src+6

Arithmetic and 3ogical

$ Arithmetic U Integer and floating$point signed and unsigned U add subtract multiply divide

$ 3ogical U andB orB notB 7or

Instruction Types cont-Instruction T

ypes cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 79101

Condition code bits

S6 Sign bit (gt E F E $)

6 Vero bit (gt E non-ero E -ero)

$6 5verflo bit (gt E no overflo E overflo)

C6 Carry bit (gt E no carry E carry)

ltample6 Pentium

cm coutamp comare cout to amp

subtract amp rom cout

e taret um e0ual

Instruction Types cont-Instruction T

ypes cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 80101

Llo control and I25 instructions

$ 8ranch

$ Procedure call

$ Interrupts

I25 instructions$ Memory$mapped I25

U Most processors support memory$mapped I25

U 7o separate instructions for I25

$ Isolated I25 U Pentium supports isolated I25

U Separate I25 instructions

Ao7ort read from an IO ort

out o7ortA rte to an IO ort

5 Instruction 0ormats5 Instruction 0ormats

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 81101

To types

Lied$length$ 4sed by 1ISC processors

$ $bit 1ISC processors use $bits ide instructions U ltamples6 SPA1C MIPS PoerPC

ariable$length

$ 4sed by CISC processors

$ Memory operands need more bits to specify

5pcode

MaOor and eact operation

Examples of Instruction 0ormatsExam

ples of Instruction 0ormats

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 82101

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 83101

ISC e)uce) Instruction Set Computer 3

ersus

CISC Comple Instruction Set Computer3

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 84101

0

RISC s CISCRISC s CISC

The underlying philosophy of 1ISC machines is that asystem is better able to manage program eecutionhen the program consists of only a fe differentinstructions that are the same length and reJuire thesame number of cloc0 cycles to decode and eecute

1ISC systems access memory only ith eplicit loadand store instructions

In CISC systems many different 0inds of instructionsaccess memory ma0ing instruction length variableand fetch$decode$eecute time unpredictable

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 85101

The difference beteen CISC and 1ISC becomesevident through the basic computer performanceeJuation6

1ISC systems shorten eecution time by reducingthe cloc0 cycles per instruction

CISC systems improve performance by reducing thenumber of instructions per program

RISC s CISCRISC s CISC

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 86101

(

The simple instruction set of 1ISC machinesenables control units to be hardired for maimumspeed

The more comple$$ and variable$$ instruction set of

CISC machines reJuires microcode$based controlunits that interpret instructions as they are fetchedfrom memory This translation ta0es time

Dith fied$length instructions 1ISC lends itself topipelining and speculative eecution

RISC s CISCRISC s CISC

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 87101

mo1 a8 6 mo1 b8 6 mo1 c8

Be add a8 b8 loo Be

Consider the the program fragments6

The total cloc0 cycles for the CISC version might be6(amp mo1s c4cle) + ( mul 6 c4cles) = amp c4cles

Dhile the cloc0 cycles for the 1ISC version is6

( mo1s c4cle) + ( adds c4cle) + ( loos c4cle) = c4cles

Dith 1ISC cloc0 cycle being shorter 1ISC gives usmuch faster eecution speeds

mo1 a8 6 mo1 b8 mul b8 a8

CISC RISC

RISC s CISCRISC s CISC

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 88101

8ecause of their load$store ISAs 1ISC architecturesreJuire a large number of CP4 registers

These register provide fast access to data duringseJuential program eecution

They can also be employed to reduce the overheadtypically caused by passing parameters tosubprograms

Instead of pulling parameters off of a stac0 the

subprogram is directed to use a subset of registers

RISC s CISCRISC s CISC

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 89101

3

This is horegisters canbe overlappedin a 1ISCsystem

The currentindo pointer (CDP) pointsto the activeregister

indo

RISC s CISCRISC s CISC

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 90101

34

It is becoming increasingly difficult to distinguish1ISC architectures from CISC architectures

Some 1ISC systems provide more etravagantinstruction sets than some CISC systems

Some systems combine both approaches The folloing to slides summari-e the

characteristics that traditionally typify the differencesbeteen these to architectures

RISC s CISCRISC s CISC

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 91101

31

RISC Multiple reister sets4

Three operan)s perinstruction4

Parameter passinthrouh reister5in)o5s4

Sinle-ccle

instructions4 7ar)5ire)

control4

7ihl pipeline)4

CISC Sinle reister set4

ne or t5o reisteroperan)s per

instruction4 Parameter passin

throuh memor4

Multiple ccle

instructions4 Microproramme)

control4

(ess pipeline)4ontinued

RISC s CISCRISC s CISC

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 92101

32

RISC Simple instructions

fe5 in num9er4

ie) lenth

instructions4 Compleit in

compiler4

nl 29ADT9$E

instructions accessmemor4

e5 a))ressin mo)es4

CISC Man comple

instructions4

aria9le lenth

instructions4 Compleit in

microco)e4

Man instructions can

access memor4

Man a))ressinmo)es4

RISC s CISCRISC s CISC

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 93101

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 94101

Summar

Instruction Set Design IssuesInstruction Set Desi

gn Issues

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 95101

g

Instruction set )esin issues inclu)e here are operan)s store)lt

- reisters memor stac= accumulator

7o5 man eplicit operan)s are therelt

- 0 + 2 or amp

7o5 is the operan) location specifie)lt

- reister imme)iate in)irect 4 4 4

hat tpe gt sie of operan)s are supporte)lt

- 9te int float )ou9le strin ector4 4 4

hat operations are supporte)lt

- a)) su9 mul moe compare 4 4 4

More A+out 6eneral Purpose egistersMore A+out 6eneral Pu

rpose egisters

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 96101

h )o almost all ne5 architectures usePslt

eisters are much faster than memor eencache3

- eister alues are aaila9le imme)iatel

- hen memor isnt rea) processor must 5aitBstall3

eisters are conenient for aria9le storae

- Compiler assins some aria9les Dust to reisters

- More compact co)e since small fiel)s specifreisters

compare) to memor a))resses3Registers Cache

MemoryProcessor Disk

7hat perations are eeded7hat

perations are eeded

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 97101

3

Arithmetic E (oical

Inteer arithmetic A$$ SU MU(T $I S7IT

(oical operation AN$ NT

$ata Transfer - cop loa) store

Control - 9ranch Dump call return

loatin Point A$$ MU( $I 3 Same as arithmetic 9ut usuall ta=e 9ier operan)s

$ecimal - A$$$ CNT

Strin - moe compare search

raphics F piel an) erte compressionG)ecompression operations

Stacamps Architecture Pros and ConsStacamps Architecture Pros and Cons

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 98101

Stacamps Architecture Pros and ConsStacamps Architecture Pros and Cons

Pros oo) co)e )ensit implicit top of stac=3

(o5 har)5are re1uirements

as to 5rite a simpler compiler for stac= architectures

Cons Stac= 9ecomes the 9ottlenec=

(ittle a9ilit for parallelism or pipelinin

$ata is not al5as at the top of stac= 5hen nee) so a))itionalinstructions li=e TP an) SAP are nee)e)

$ifficult to 5rite an optimiin compiler for stac= architectures

Accumulators Architecture Pros and Cons

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 99101

Accumulators Architecture Pros and Cons

Pros U ery lo hardare reJuirements

U ltasy to design and understand

Cons U Accumulator becomes the bottlenec0

U 3ittle ability for parallelism or pipelining U igh memory traffic

Memory Memory Architecture Pros and Cons

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 100101

Memory3Memory Architecture Pros and Cons

Pros U 1eJuires feer instructions (especially if operands)

U ltasy to rite compilers for (especially if operands)

Cons U ery high memory traffic (especially if operands)

U ariable number of cloc0s per instruction

U Dith to operands more data movements are reJuired

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 101101

Memory3Register Architecture Pros and Cons

Pros U Some data can be accessed ithout loading first

U Instruction format easy to encode

U ood code density

Cons U 5perands are not eJuivalent (poor orthogonal)

U ariable number of cloc0s per instruction U May limit number of registers

Page 74: Chapter 2: Advanced computer Architecture

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 74101

gg

To basic techniJues 1egister$based (eg PoerPC MIPS)

$ Internal registers are used U Laster

U 3imit the number of parameters U 1ecursive procedure

Stac0$based (eg Pentium)

$ Stac0 is used U More general

2 perand Types2

perand Types

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 75101

p yp

Instructions support basic data types

Characters Integers

Lloating$point

Instruction overload

Same instruction for different data types

ltample6 Pentium mo1 A2address loads a 3-bt 1alue

mo1 Aaddress loads a -bt 1alue

mo1 EAaddress loads a amp-bt 1alue

perand Types

perand Types

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 76101

Separate instructions

Instructions specify the operand si-e

ltample6 MIPS

lb $destaddress loads a b4te

l $destaddress loads a al5ord( bts)

l5 $destaddress loads a 5ord

(amp bts)

ld $destaddress loads a double5ord

( bts)imilar instruction store

3 Addressing Modes3 Addressin

g Modes

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 77101

o the operands are specified

5perands can be in three places

$ 1egisters U 1egister addressing mode

$ Part of instruction U Constant

U Immediate addressing mode

U All processors support these to addressing modes

$ Memory U ifference beteen 1ISC and CISC

U CISC supports a large variety of addressing modes

U 1ISC follos load2store architecture

4 Instruction Types4 Instruction T

ypes

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 78101

Several types of instructions

ata movement$ Pentium6 mo1 destsrc

$ Some do not provide direct data movement instructions

$ Indirect data movement

add $dest$src6 $dest = $src+6

Arithmetic and 3ogical

$ Arithmetic U Integer and floating$point signed and unsigned U add subtract multiply divide

$ 3ogical U andB orB notB 7or

Instruction Types cont-Instruction T

ypes cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 79101

Condition code bits

S6 Sign bit (gt E F E $)

6 Vero bit (gt E non-ero E -ero)

$6 5verflo bit (gt E no overflo E overflo)

C6 Carry bit (gt E no carry E carry)

ltample6 Pentium

cm coutamp comare cout to amp

subtract amp rom cout

e taret um e0ual

Instruction Types cont-Instruction T

ypes cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 80101

Llo control and I25 instructions

$ 8ranch

$ Procedure call

$ Interrupts

I25 instructions$ Memory$mapped I25

U Most processors support memory$mapped I25

U 7o separate instructions for I25

$ Isolated I25 U Pentium supports isolated I25

U Separate I25 instructions

Ao7ort read from an IO ort

out o7ortA rte to an IO ort

5 Instruction 0ormats5 Instruction 0ormats

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 81101

To types

Lied$length$ 4sed by 1ISC processors

$ $bit 1ISC processors use $bits ide instructions U ltamples6 SPA1C MIPS PoerPC

ariable$length

$ 4sed by CISC processors

$ Memory operands need more bits to specify

5pcode

MaOor and eact operation

Examples of Instruction 0ormatsExam

ples of Instruction 0ormats

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 82101

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 83101

ISC e)uce) Instruction Set Computer 3

ersus

CISC Comple Instruction Set Computer3

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 84101

0

RISC s CISCRISC s CISC

The underlying philosophy of 1ISC machines is that asystem is better able to manage program eecutionhen the program consists of only a fe differentinstructions that are the same length and reJuire thesame number of cloc0 cycles to decode and eecute

1ISC systems access memory only ith eplicit loadand store instructions

In CISC systems many different 0inds of instructionsaccess memory ma0ing instruction length variableand fetch$decode$eecute time unpredictable

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 85101

The difference beteen CISC and 1ISC becomesevident through the basic computer performanceeJuation6

1ISC systems shorten eecution time by reducingthe cloc0 cycles per instruction

CISC systems improve performance by reducing thenumber of instructions per program

RISC s CISCRISC s CISC

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 86101

(

The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.

The more complex (and variable) instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.

With fixed-length instructions, RISC lends itself to pipelining and speculative execution.


Consider the following program fragments:

CISC

    mov ax, 10
    mov bx, 5
    mul bx, ax

RISC

    mov ax, 0
    mov bx, 10
    mov cx, 5
    Begin: add ax, bx
    loop Begin

The total clock cycles for the CISC version might be:

(2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles

While the clock cycles for the RISC version are:

(3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles

With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds.
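
The cycle counts for the two fragments can be checked with a small simulation. This is a minimal sketch, assuming operand values of 10 and 5, one cycle for every simple instruction, and a 30-cycle multiply; the function names `run_cisc` and `run_risc` are hypothetical.

```python
def run_cisc(a, b, mul_cycles=30):
    """Two register loads plus one multi-cycle multiply (cycle costs are assumed)."""
    cycles = 2 * 1            # mov ax, a ; mov bx, b
    product = a * b           # mul bx, ax
    cycles += mul_cycles
    return product, cycles


def run_risc(a, b):
    """Multiply by repeated addition using only single-cycle instructions.

    Assumes b >= 1, as in the fragment above.
    """
    ax, bx, cx = 0, a, b      # mov ax, 0 ; mov bx, a ; mov cx, b
    cycles = 3 * 1
    while cx > 0:
        ax += bx              # add ax, bx
        cycles += 1
        cx -= 1               # loop Begin: decrement cx, branch while nonzero
        cycles += 1
    return ax, cycles


print(run_cisc(10, 5))   # (50, 32)
print(run_risc(10, 5))   # (50, 13)
```

Both versions compute the same product. The RISC count grows with the loop count, so the comparison depends on the operand values as well as on the assumed multiply latency.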


Because of their load-store ISAs, RISC architectures require a large number of CPU registers.

These registers provide fast access to data during sequential program execution.

They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.

Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.


This is how registers can be overlapped in a RISC system.

The current window pointer (CWP) points to the active register window.
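
The overlap idea can be sketched in a few lines. This is a simplified model under stated assumptions: four windows of eight "in", eight "local", and eight "out" registers, no global registers, no overflow handling, and a CWP that increments on a call (real designs may move it in the other direction). The class name `RegisterWindows` and its methods are hypothetical.

```python
class RegisterWindows:
    """Overlapping windows: the caller's 'out' registers are the callee's 'in' registers."""

    def __init__(self, windows=4, per_group=8):
        self.windows = windows
        self.per_group = per_group
        # Each window owns one 'in' group and one 'local' group;
        # its 'out' group is physically the next window's 'in' group.
        self.regs = [0] * (windows * 2 * per_group)
        self.cwp = 0  # current window pointer

    def _index(self, group, n):
        base = self.cwp * 2 * self.per_group
        offset = {"in": 0, "local": 1, "out": 2}[group] * self.per_group + n
        return (base + offset) % len(self.regs)  # wrap around circularly

    def read(self, group, n):
        return self.regs[self._index(group, n)]

    def write(self, group, n, value):
        self.regs[self._index(group, n)] = value

    def call(self):
        self.cwp = (self.cwp + 1) % self.windows

    def ret(self):
        self.cwp = (self.cwp - 1) % self.windows


rw = RegisterWindows()
rw.write("out", 0, 42)      # caller places an argument in an 'out' register
rw.call()                   # advance the CWP; no data is copied
argument = rw.read("in", 0)
rw.write("in", 0, argument + 1)   # callee leaves its result in the shared register
rw.ret()
result = rw.read("out", 0)
print(argument, result)
```

Because the parameter never moves, the call costs only a change of the window pointer, which is the saving the slide describes.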


It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.

Some RISC systems provide more extravagant instruction sets than some CISC systems.

Some systems combine both approaches.

The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.


RISC

- Multiple register sets.
- Three operands per instruction.
- Parameter passing through register windows.
- Single-cycle instructions.
- Hardwired control.
- Highly pipelined.

CISC

- Single register set.
- One or two register operands per instruction.
- Parameter passing through memory.
- Multiple-cycle instructions.
- Microprogrammed control.
- Less pipelined.


RISC

- Simple instructions, few in number.
- Fixed-length instructions.
- Complexity in compiler.
- Only LOAD/STORE instructions access memory.
- Few addressing modes.

CISC

- Many complex instructions.
- Variable-length instructions.
- Complexity in microcode.
- Many instructions can access memory.
- Many addressing modes.


Summary

Instruction Set Design Issues


Instruction set design issues include:

Where are operands stored?

- registers, memory, stack, accumulator

How many explicit operands are there?

- 0, 1, 2, or 3

How is the operand location specified?

- register, immediate, indirect, ...

What type and size of operands are supported?

- byte, int, float, double, string, vector, ...

What operations are supported?

- add, sub, mul, move, compare, ...

More About General Purpose Registers


Why do almost all new architectures use GPRs?

Registers are much faster than memory (even cache)

- Register values are available immediately
- When memory isn't ready, the processor must wait ("stall")

Registers are convenient for variable storage

- Compiler assigns some variables just to registers
- More compact code, since small fields specify registers (compared to memory addresses)

[Figure: processor with registers, cache, memory, and disk]

What Operations are Needed


Arithmetic and Logical

- Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
- Logical operations: AND, OR, NOT

Data Transfer - copy, load, store

Control - branch, jump, call, return

Floating Point - ADDF, MULF, DIVF (same as arithmetic but usually take bigger operands)

Decimal - ADDD, CONVERT

String - move, compare, search

Graphics - pixel and vertex, compression/decompression operations

Stack Architecture Pros and Cons


Pros

- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures

Cons

- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures
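
To make the "implicit top of stack" point concrete, here is a minimal sketch of a zero-operand stack machine evaluating D = (A + B) * C. The instruction names (`push`, `pop`, `add`, `mul`, `swap`) and the tuple encoding are assumptions for illustration, not the instruction set of any particular machine.

```python
def run_stack_program(program, memory):
    """Execute a list of (operation, *operands) tuples on an operand stack."""
    stack = []
    for op, *args in program:
        if op == "push":                 # memory -> top of stack
            stack.append(memory[args[0]])
        elif op == "pop":                # top of stack -> memory
            memory[args[0]] = stack.pop()
        elif op == "add":                # operands are implicit: the top two entries
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif op == "mul":
            right, left = stack.pop(), stack.pop()
            stack.append(left * right)
        elif op == "swap":               # needed when data is not in the right order
            stack[-1], stack[-2] = stack[-2], stack[-1]
        else:
            raise ValueError(f"unknown operation: {op}")
    return memory


memory = {"A": 2, "B": 3, "C": 4, "D": 0}
program = [
    ("push", "A"),
    ("push", "B"),
    ("add",),
    ("push", "C"),
    ("mul",),
    ("pop", "D"),
]
run_stack_program(program, memory)
print(memory["D"])  # 20
```

The arithmetic instructions name no operands, which keeps them short, but every operation goes through the top of the stack, which is the bottleneck listed under the cons.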

Accumulators Architecture Pros and Cons


Pros

- Very low hardware requirements
- Easy to design and understand

Cons

- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic

Memory-Memory Architecture Pros and Cons


Pros

- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)

Cons

- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required


Memory-Register Architecture Pros and Cons

Pros

- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density

Cons

- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers


Operand Types

Instructions support basic data types

- Characters
- Integers
- Floating-point

Instruction overload

- Same instruction for different data types
- Example: Pentium

    mov AL, address    ; loads an 8-bit value
    mov AX, address    ; loads a 16-bit value
    mov EAX, address   ; loads a 32-bit value

Operand Types


Separate instructions

- Instructions specify the operand size
- Example: MIPS

    lb Rdest, address    ; loads a byte
    lh Rdest, address    ; loads a halfword (16 bits)
    lw Rdest, address    ; loads a word (32 bits)
    ld Rdest, address    ; loads a doubleword (64 bits)

- Similar instructions for store
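
The effect of choosing the operand size in the instruction can be mimicked with Python's standard `struct` module. This is a sketch only: the byte values in `memory`, the helper name `load`, and the choice of big-endian byte order are assumptions, and the sign extension shown matches the signed load variants.

```python
import struct

# Eight bytes of hypothetical memory contents starting at address 0.
memory = bytes([0x80, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07])


def load(fmt, address):
    """Read one signed value of the given struct format from `memory` (big-endian)."""
    return struct.unpack_from(">" + fmt, memory, address)[0]


print(load("b", 0))   # like lb: 1 byte
print(load("h", 0))   # like lh: 2 bytes
print(load("i", 0))   # like lw: 4 bytes
print(load("q", 0))   # like ld: 8 bytes
```

The same address yields four different values because each load reads a different number of bytes, which is why the size has to be part of the opcode when instructions are not overloaded.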

3. Addressing Modes


How the operands are specified

- Operands can be in three places
  - Registers
    - Register addressing mode
  - Part of instruction
    - Constant
    - Immediate addressing mode
    - All processors support these two addressing modes
  - Memory
    - Difference between RISC and CISC
    - CISC supports a large variety of addressing modes
    - RISC follows load/store architecture
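
The three operand locations can be sketched as a single lookup function. The register names, memory contents, and the function `fetch_operand` are hypothetical; the point is only where the value comes from in each mode.

```python
registers = {"r1": 7, "r2": 100}
memory = {100: 55, 104: 66}


def fetch_operand(mode, value):
    """Return an operand according to its addressing mode."""
    if mode == "register":      # operand is held in a register
        return registers[value]
    if mode == "immediate":     # operand is a constant inside the instruction
        return value
    if mode == "memory":        # operand is read from a memory address
        return memory[value]
    raise ValueError(f"unknown addressing mode: {mode}")


# CISC style: an arithmetic instruction may take a memory operand directly.
cisc_sum = fetch_operand("register", "r1") + fetch_operand("memory", 100)

# Load/store style: memory is touched only by an explicit load,
# and the add then works on registers.
registers["r3"] = fetch_operand("memory", 100)   # load r3, [100]
risc_sum = fetch_operand("register", "r1") + fetch_operand("register", "r3")

print(cisc_sum, risc_sum)
```

Both styles reach the same result; the load/store version spends an extra instruction but keeps every arithmetic instruction free of memory accesses.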

4. Instruction Types


Several types of instructions

- Data movement
  - Pentium: mov dest, src
  - Some do not provide direct data movement instructions
  - Indirect data movement

      add Rdest, Rsrc, 0    ; Rdest = Rsrc + 0

- Arithmetic and Logical
  - Arithmetic
    - Integer and floating-point, signed and unsigned
    - add, subtract, multiply, divide
  - Logical
    - and, or, not, xor

Instruction Types (cont.)


Condition code bits

- S: Sign bit (0 = +, 1 = -)
- Z: Zero bit (0 = nonzero, 1 = zero)
- O: Overflow bit (0 = no overflow, 1 = overflow)
- C: Carry bit (0 = no carry, 1 = carry)

Example: Pentium

    cmp count, 25    ; compare count to 25
                     ; (subtract 25 from count)
    je target        ; jump if equal
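
How a compare sets the four condition code bits can be sketched as follows. This is an illustrative model, assuming 32-bit two's-complement operands and the x86 convention that the carry bit signals an unsigned borrow on subtraction; the function name `compare` is hypothetical.

```python
def compare(a, b, bits=32):
    """Return the condition codes produced by computing a - b."""
    mask = (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "S": int(bool(result & sign_bit)),   # result is negative
        "Z": int(result == 0),               # result is zero
        # Signed overflow: the operands differ in sign and the
        # result's sign differs from the first operand's.
        "O": int(bool((a ^ b) & (a ^ result) & sign_bit)),
        "C": int(a < b),                     # unsigned borrow
    }


flags = compare(25, 25)
print(flags)
jump_taken = flags["Z"] == 1   # "jump if equal" tests only the zero bit
print(jump_taken)
```

The compare discards the difference and keeps only the flags, so the conditional jump that follows needs no operands beyond its target.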

Instruction Types (cont.)


Flow control and I/O instructions

- Branch
- Procedure call
- Interrupts

I/O instructions

- Memory-mapped I/O
  - Most processors support memory-mapped I/O
  - No separate instructions for I/O
- Isolated I/O
  - Pentium supports isolated I/O
  - Separate I/O instructions

      in AX, io_port     ; read from an I/O port
      out io_port, AX    ; write to an I/O port

5. Instruction Formats


Two types

- Fixed-length
  - Used by RISC processors
  - 32-bit RISC processors use 32-bit-wide instructions
    - Examples: SPARC, MIPS, PowerPC
- Variable-length
  - Used by CISC processors
  - Memory operands need more bits to specify

Opcode

- Major and exact operation
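
A fixed-length format makes decoding a matter of shifts and masks. The sketch below splits a 32-bit word using the MIPS R-type field layout (6-bit opcode, three 5-bit register fields, 5-bit shift amount, 6-bit function code); the function name `decode_r_type` is hypothetical and the other MIPS formats are not handled.

```python
def decode_r_type(word):
    """Split a 32-bit MIPS R-type instruction word into its fields."""
    return {
        "opcode": (word >> 26) & 0x3F,   # major operation
        "rs": (word >> 21) & 0x1F,       # first source register
        "rt": (word >> 16) & 0x1F,       # second source register
        "rd": (word >> 11) & 0x1F,       # destination register
        "shamt": (word >> 6) & 0x1F,     # shift amount
        "funct": word & 0x3F,            # exact operation within the opcode
    }


fields = decode_r_type(0x012A4020)       # encoding of: add $t0, $t1, $t2
print(fields)
```

Every field sits at the same bit position in every R-type instruction, so the decoder never has to inspect one field to find another. Variable-length formats give up this property in exchange for room to encode memory operands.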

Page 77: Chapter 2: Advanced computer Architecture

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 77101

o the operands are specified

5perands can be in three places

$ 1egisters U 1egister addressing mode

$ Part of instruction U Constant

U Immediate addressing mode

U All processors support these to addressing modes

$ Memory U ifference beteen 1ISC and CISC

U CISC supports a large variety of addressing modes

U 1ISC follos load2store architecture

4 Instruction Types4 Instruction T

ypes

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 78101

Several types of instructions

ata movement$ Pentium6 mo1 destsrc

$ Some do not provide direct data movement instructions

$ Indirect data movement

add $dest$src6 $dest = $src+6

Arithmetic and 3ogical

$ Arithmetic U Integer and floating$point signed and unsigned U add subtract multiply divide

$ 3ogical U andB orB notB 7or

Instruction Types (cont'd)

Condition code bits
- S: Sign bit (0 = +, 1 = -)
- Z: Zero bit (0 = nonzero, 1 = zero)
- O: Overflow bit (0 = no overflow, 1 = overflow)
- C: Carry bit (0 = no carry, 1 = carry)

Example: Pentium
    cmp count,25    ; compare count to 25
                    ; subtract 25 from count
    je  target      ; jump if equal
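A compare instruction sets the condition codes by performing a subtraction and discarding the result. The sketch below shows one way the four bits could be derived; the `compare_flags` helper is hypothetical, and treating the carry bit as the borrow of the subtraction follows the x86 convention, which is an assumption here.

```python
def compare_flags(a, b, bits=32):
    """Condition codes after computing a - b on two's-complement values."""
    mask = (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "Z": int(result == 0),                  # zero: operands were equal
        "S": int(bool(result & sign_bit)),      # sign: result is negative
        "C": int(a < b),                        # carry: unsigned borrow
        # overflow: operand signs differ and the result sign differs from a
        "O": int(bool((a ^ b) & (a ^ result) & sign_bit)),
    }


flags = compare_flags(25, 25)
print(flags)
print("je taken" if flags["Z"] else "je not taken")
```

A conditional jump such as `je` then only needs to test the zero bit.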

Instruction Types (cont'd)

Flow control and I/O instructions
- Branch
- Procedure call
- Interrupts

I/O instructions
- Memory-mapped I/O
  - Most processors support memory-mapped I/O
  - No separate instructions for I/O
- Isolated I/O
  - Pentium supports isolated I/O
  - Separate I/O instructions
      in  AX,io_port    ; read from an I/O port
      out io_port,AX    ; write to an I/O port

5 Instruction Formats

Two types

Fixed-length
- Used by RISC processors
- 32-bit RISC processors use 32-bit-wide instructions
  - Examples: SPARC, MIPS, PowerPC

Variable-length
- Used by CISC processors
- Memory operands need more bits to specify

Opcode
- Major and exact operation
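As an illustration of a fixed-length format, the sketch below packs and unpacks a 32-bit MIPS R-type instruction, whose fields are opcode (6 bits), rs (5), rt (5), rd (5), shamt (5), and funct (6). The helper names `encode_r_type` and `decode_r_type` are made up for this example.

```python
def encode_r_type(rs, rt, rd, shamt, funct, opcode=0):
    """Pack the six R-type fields into one 32-bit word."""
    return (
        (opcode << 26) | (rs << 21) | (rt << 16)
        | (rd << 11) | (shamt << 6) | funct
    )


def decode_r_type(word):
    """Unpack a 32-bit word into its R-type fields."""
    return {
        "opcode": (word >> 26) & 0x3F,
        "rs": (word >> 21) & 0x1F,
        "rt": (word >> 16) & 0x1F,
        "rd": (word >> 11) & 0x1F,
        "shamt": (word >> 6) & 0x1F,
        "funct": word & 0x3F,
    }


# add $t0, $t1, $t2  ->  rd = 8, rs = 9, rt = 10, funct = 0x20
word = encode_r_type(rs=9, rt=10, rd=8, shamt=0, funct=0x20)
print(f"{word:#010x}")
print(decode_r_type(word))
```

Because every field sits at a fixed bit position, decoding needs only shifts and masks, which is part of why fixed-length formats are simple to decode.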

Examples of Instruction Formats

[Figure: examples of instruction formats]

RISC (Reduced Instruction Set Computer)
versus
CISC (Complex Instruction Set Computer)

RISC vs CISC

The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.

RISC systems access memory only with explicit load and store instructions.

In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.

RISC vs CISC

The difference between CISC and RISC becomes evident through the basic computer performance equation:

    time / program = (time / cycle) × (cycles / instruction) × (instructions / program)

RISC systems shorten execution time by reducing the clock cycles per instruction.

CISC systems improve performance by reducing the number of instructions per program.
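The trade-off can be made concrete by multiplying the three factors: instructions per program, cycles per instruction, and time per cycle. The figures below are invented purely for illustration and do not describe any real processor.

```python
def execution_time_ns(instructions, cycles_per_instruction, cycle_time_ns):
    """Execution time as the product of the three performance factors."""
    return instructions * cycles_per_instruction * cycle_time_ns


# Hypothetical workloads: the CISC-style program has fewer instructions,
# the RISC-style program has fewer cycles per instruction.
cisc_ns = execution_time_ns(1_000_000, 4, 1)
risc_ns = execution_time_ns(1_500_000, 1, 1)

print(cisc_ns)
print(risc_ns)
```

With these assumed numbers the RISC-style program runs more instructions yet finishes sooner, because the reduction in cycles per instruction outweighs the longer instruction count.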

RISC vs CISC

The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.

The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.

With fixed-length instructions, RISC lends itself to pipelining and speculative execution.

RISC vs CISC

Consider the program fragments:

    CISC              RISC
    mov ax, 10        mov ax, 0
    mov bx, 5         mov bx, 10
    mul bx, ax        mov cx, 5
                      Begin: add ax, bx
                      loop Begin

The total clock cycles for the CISC version might be:

    (2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles

While the clock cycles for the RISC version are:

    (3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles

With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds.

RISC vs CISC

Because of their load-store ISAs, RISC architectures require a large number of CPU registers.

These registers provide fast access to data during sequential program execution.

They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.

Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.

RISC vs CISC

This is how registers can be overlapped in a RISC system.

The current window pointer (CWP) points to the active register window.

[Figure: overlapping register windows]
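A minimal sketch of overlapping windows follows, assuming each window has a block of input, local, and output registers and that a caller's output block is physically the same storage as the callee's input block. The class, its sizes, and the choice to increment the CWP on a call are simplifications for illustration; real designs differ in detail and spill to memory when the windows run out.

```python
class RegisterWindows:
    """Overlapping windows: the caller's outs are the callee's ins."""

    def __init__(self, windows=4, ins=8, local=8):
        self.windows = windows
        self.stride = ins + local   # physical registers advanced per call
        self.regs = [0] * (windows * self.stride + ins)
        self.cwp = 0                # current window pointer

    def _base(self):
        return self.cwp * self.stride

    def write_out(self, index, value):
        # The out block starts where the next window's in block starts.
        self.regs[self._base() + self.stride + index] = value

    def read_in(self, index):
        return self.regs[self._base() + index]

    def call(self):
        if self.cwp + 1 >= self.windows:
            raise OverflowError("no free window; hardware would spill to memory")
        self.cwp += 1

    def ret(self):
        self.cwp -= 1


rw = RegisterWindows()
rw.write_out(0, 42)   # caller places a parameter in its out register 0
rw.call()             # moving the CWP exposes it as the callee's in register 0
print(rw.read_in(0))
```

No data is copied on the call: only the window pointer moves, which is the source of the reduced parameter-passing overhead.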

RISC vs CISC

It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.

Some RISC systems provide more extravagant instruction sets than some CISC systems.

Some systems combine both approaches.

The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.

RISC vs CISC

RISC                                          CISC
Multiple register sets.                       Single register set.
Three operands per instruction.               One or two register operands per instruction.
Parameter passing through register windows.   Parameter passing through memory.
Single-cycle instructions.                    Multiple cycle instructions.
Hardwired control.                            Microprogrammed control.
Highly pipelined.                             Less pipelined.

Continued...

RISC vs CISC

RISC                                          CISC
Simple instructions, few in number.           Many complex instructions.
Fixed length instructions.                    Variable length instructions.
Complexity in compiler.                       Complexity in microcode.
Only LOAD/STORE instructions access memory.   Many instructions can access memory.
Few addressing modes.                         Many addressing modes.

Summary

Instruction Set Design Issues

Instruction set design issues include:

Where are operands stored?
- registers, memory, stack, accumulator

How many explicit operands are there?
- 0, 1, 2, or 3

How is the operand location specified?
- register, immediate, indirect, . . .

What type & size of operands are supported?
- byte, int, float, double, string, vector, . . .

What operations are supported?
- add, sub, mul, move, compare, . . .

More About General Purpose Registers

Why do almost all new architectures use GPRs?

Registers are much faster than memory (even cache)
- Register values are available immediately
- When memory isn't ready, processor must wait ("stall")

Registers are convenient for variable storage
- Compiler assigns some variables just to registers
- More compact code since small fields specify registers (compared to memory addresses)

[Figure: memory hierarchy showing processor registers, cache, memory, and disk]

What Operations are Needed

Arithmetic & Logical
- Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
- Logical operation: AND, OR, XOR, NOT

Data Transfer - copy, load, store

Control - branch, jump, call, return

Floating Point - ADDF, MULF, DIVF (same as arithmetic but usually take bigger operands)

Decimal - ADDD, CONVERT

String - move, compare, search

Graphics - pixel and vertex, compression/decompression operations

Stacks Architecture Pros and Cons

Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures

Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures
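A toy interpreter shows both sides of this trade-off: operands are implicit (good code density), but every operation goes through the top of the stack, and a SWAP is needed when values arrive in the wrong order. The instruction names and the `run_stack_program` helper are illustrative and not taken from any particular machine.

```python
def run_stack_program(program):
    """Execute a list of (opcode, *operands) tuples on an operand stack."""
    stack = []
    for opcode, *operands in program:
        if opcode == "PUSH":
            stack.append(operands[0])
        elif opcode == "SWAP":       # exchange the two topmost entries
            stack[-1], stack[-2] = stack[-2], stack[-1]
        elif opcode in ("ADD", "SUB", "MUL"):
            right = stack.pop()      # operands are implicit: top two entries
            left = stack.pop()
            if opcode == "ADD":
                stack.append(left + right)
            elif opcode == "SUB":
                stack.append(left - right)
            else:
                stack.append(left * right)
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return stack[-1]


# (2 + 3) * 4
print(run_stack_program([("PUSH", 2), ("PUSH", 3), ("ADD",), ("PUSH", 4), ("MUL",)]))
# 10 - 3, with the operands pushed in the wrong order and fixed by SWAP
print(run_stack_program([("PUSH", 3), ("PUSH", 10), ("SWAP",), ("SUB",)]))
```

Every arithmetic instruction reads and writes the same stack top, which is why the stack becomes the bottleneck and leaves little room for parallelism.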

Accumulators Architecture Pros and Cons

Pros
- Very low hardware requirements
- Easy to design and understand

Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic

Memory-Memory Architecture Pros and Cons

Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)

Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required

Memory-Register Architecture Pros and Cons

Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density

Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers
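To compare the four styles side by side, the sketch below writes C = A + B for each one and counts instructions and memory data accesses. The instruction sequences follow the usual textbook pattern, and the mnemonics, register names, and the `summarize` helper are illustrative assumptions.

```python
# Each entry is (instruction text, memory data accesses it performs).
PROGRAMS = {
    "stack":           [("push A", 1), ("push B", 1), ("add", 0), ("pop C", 1)],
    "accumulator":     [("load A", 1), ("add B", 1), ("store C", 1)],
    "memory-memory":   [("add C, A, B", 3)],
    "memory-register": [("load R1, A", 1), ("add R1, B", 1), ("store R1, C", 1)],
}


def summarize(program):
    """Return (instruction count, memory data accesses) for one program."""
    return len(program), sum(accesses for _, accesses in program)


for name, program in PROGRAMS.items():
    count, accesses = summarize(program)
    print(f"{name:16s} instructions={count} memory accesses={accesses}")
```

For a single statement every style touches memory three times; the differences in memory traffic show up in longer code, where register-based styles can keep intermediate values out of memory while stack, accumulator, and memory-memory styles cannot.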

Page 78: Chapter 2: Advanced computer Architecture

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 78101

Several types of instructions

ata movement$ Pentium6 mo1 destsrc

$ Some do not provide direct data movement instructions

$ Indirect data movement

add $dest$src6 $dest = $src+6

Arithmetic and 3ogical

$ Arithmetic U Integer and floating$point signed and unsigned U add subtract multiply divide

$ 3ogical U andB orB notB 7or

Instruction Types cont-Instruction T

ypes cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 79101

Condition code bits

S6 Sign bit (gt E F E $)

6 Vero bit (gt E non-ero E -ero)

$6 5verflo bit (gt E no overflo E overflo)

C6 Carry bit (gt E no carry E carry)

ltample6 Pentium

cm coutamp comare cout to amp

subtract amp rom cout

e taret um e0ual

Instruction Types cont-Instruction T

ypes cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 80101

Llo control and I25 instructions

$ 8ranch

$ Procedure call

$ Interrupts

I25 instructions$ Memory$mapped I25

U Most processors support memory$mapped I25

U 7o separate instructions for I25

$ Isolated I25 U Pentium supports isolated I25

U Separate I25 instructions

Ao7ort read from an IO ort

out o7ortA rte to an IO ort

5 Instruction 0ormats5 Instruction 0ormats

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 81101

To types

Lied$length$ 4sed by 1ISC processors

$ $bit 1ISC processors use $bits ide instructions U ltamples6 SPA1C MIPS PoerPC

ariable$length

$ 4sed by CISC processors

$ Memory operands need more bits to specify

5pcode

MaOor and eact operation

Examples of Instruction 0ormatsExam

ples of Instruction 0ormats

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 82101

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 83101

ISC e)uce) Instruction Set Computer 3

ersus

CISC Comple Instruction Set Computer3

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 84101

0

RISC s CISCRISC s CISC

The underlying philosophy of 1ISC machines is that asystem is better able to manage program eecutionhen the program consists of only a fe differentinstructions that are the same length and reJuire thesame number of cloc0 cycles to decode and eecute

1ISC systems access memory only ith eplicit loadand store instructions

In CISC systems many different 0inds of instructionsaccess memory ma0ing instruction length variableand fetch$decode$eecute time unpredictable

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 85101

The difference beteen CISC and 1ISC becomesevident through the basic computer performanceeJuation6

1ISC systems shorten eecution time by reducingthe cloc0 cycles per instruction

CISC systems improve performance by reducing thenumber of instructions per program

RISC s CISCRISC s CISC

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 86101

(

The simple instruction set of 1ISC machinesenables control units to be hardired for maimumspeed

The more comple$$ and variable$$ instruction set of

CISC machines reJuires microcode$based controlunits that interpret instructions as they are fetchedfrom memory This translation ta0es time

Dith fied$length instructions 1ISC lends itself topipelining and speculative eecution

RISC s CISCRISC s CISC

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 87101

mo1 a8 6 mo1 b8 6 mo1 c8

Be add a8 b8 loo Be

Consider the the program fragments6

The total cloc0 cycles for the CISC version might be6(amp mo1s c4cle) + ( mul 6 c4cles) = amp c4cles

Dhile the cloc0 cycles for the 1ISC version is6

( mo1s c4cle) + ( adds c4cle) + ( loos c4cle) = c4cles

Dith 1ISC cloc0 cycle being shorter 1ISC gives usmuch faster eecution speeds

mo1 a8 6 mo1 b8 mul b8 a8

CISC RISC

RISC s CISCRISC s CISC

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 88101

8ecause of their load$store ISAs 1ISC architecturesreJuire a large number of CP4 registers

These register provide fast access to data duringseJuential program eecution

They can also be employed to reduce the overheadtypically caused by passing parameters tosubprograms

Instead of pulling parameters off of a stac0 the

subprogram is directed to use a subset of registers

RISC s CISCRISC s CISC

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 89101

3

This is horegisters canbe overlappedin a 1ISCsystem

The currentindo pointer (CDP) pointsto the activeregister

indo

RISC s CISCRISC s CISC

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 90101

34

It is becoming increasingly difficult to distinguish1ISC architectures from CISC architectures

Some 1ISC systems provide more etravagantinstruction sets than some CISC systems

Some systems combine both approaches The folloing to slides summari-e the

characteristics that traditionally typify the differencesbeteen these to architectures

RISC s CISCRISC s CISC

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 91101

31

RISC Multiple reister sets4

Three operan)s perinstruction4

Parameter passinthrouh reister5in)o5s4

Sinle-ccle

instructions4 7ar)5ire)

control4

7ihl pipeline)4

CISC Sinle reister set4

ne or t5o reisteroperan)s per

instruction4 Parameter passin

throuh memor4

Multiple ccle

instructions4 Microproramme)

control4

(ess pipeline)4ontinued

RISC s CISCRISC s CISC

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 92101

32

RISC Simple instructions

fe5 in num9er4

ie) lenth

instructions4 Compleit in

compiler4

nl 29ADT9$E

instructions accessmemor4

e5 a))ressin mo)es4

CISC Man comple

instructions4

aria9le lenth

instructions4 Compleit in

microco)e4

Man instructions can

access memor4

Man a))ressinmo)es4

RISC s CISCRISC s CISC

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 93101

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 94101

Summar

Instruction Set Design IssuesInstruction Set Desi

gn Issues

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 95101

g

Instruction set )esin issues inclu)e here are operan)s store)lt

- reisters memor stac= accumulator

7o5 man eplicit operan)s are therelt

- 0 + 2 or amp

7o5 is the operan) location specifie)lt

- reister imme)iate in)irect 4 4 4

hat tpe gt sie of operan)s are supporte)lt

- 9te int float )ou9le strin ector4 4 4

hat operations are supporte)lt

- a)) su9 mul moe compare 4 4 4

More A+out 6eneral Purpose egistersMore A+out 6eneral Pu

rpose egisters

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 96101

h )o almost all ne5 architectures usePslt

eisters are much faster than memor eencache3

- eister alues are aaila9le imme)iatel

- hen memor isnt rea) processor must 5aitBstall3

eisters are conenient for aria9le storae

- Compiler assins some aria9les Dust to reisters

- More compact co)e since small fiel)s specifreisters

compare) to memor a))resses3Registers Cache

MemoryProcessor Disk

7hat perations are eeded7hat

perations are eeded

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 97101

3

Arithmetic E (oical

Inteer arithmetic A$$ SU MU(T $I S7IT

(oical operation AN$ NT

$ata Transfer - cop loa) store

Control - 9ranch Dump call return

loatin Point A$$ MU( $I 3 Same as arithmetic 9ut usuall ta=e 9ier operan)s

$ecimal - A$$$ CNT

Strin - moe compare search

raphics F piel an) erte compressionG)ecompression operations

Stacamps Architecture Pros and ConsStacamps Architecture Pros and Cons

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 98101

Stacamps Architecture Pros and ConsStacamps Architecture Pros and Cons

Pros oo) co)e )ensit implicit top of stac=3

(o5 har)5are re1uirements

as to 5rite a simpler compiler for stac= architectures

Cons Stac= 9ecomes the 9ottlenec=

(ittle a9ilit for parallelism or pipelinin

$ata is not al5as at the top of stac= 5hen nee) so a))itionalinstructions li=e TP an) SAP are nee)e)

$ifficult to 5rite an optimiin compiler for stac= architectures

Accumulators Architecture Pros and Cons

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 99101

Accumulators Architecture Pros and Cons

Pros U ery lo hardare reJuirements

U ltasy to design and understand

Cons U Accumulator becomes the bottlenec0

U 3ittle ability for parallelism or pipelining U igh memory traffic

Memory Memory Architecture Pros and Cons

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 100101

Memory3Memory Architecture Pros and Cons

Pros U 1eJuires feer instructions (especially if operands)

U ltasy to rite compilers for (especially if operands)

Cons U ery high memory traffic (especially if operands)

U ariable number of cloc0s per instruction

U Dith to operands more data movements are reJuired

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 101101

Memory3Register Architecture Pros and Cons

Pros U Some data can be accessed ithout loading first

U Instruction format easy to encode

U ood code density

Cons U 5perands are not eJuivalent (poor orthogonal)

U ariable number of cloc0s per instruction U May limit number of registers

Page 79: Chapter 2: Advanced computer Architecture

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 79101

Condition code bits

S6 Sign bit (gt E F E $)

6 Vero bit (gt E non-ero E -ero)

$6 5verflo bit (gt E no overflo E overflo)

C6 Carry bit (gt E no carry E carry)

ltample6 Pentium

cm coutamp comare cout to amp

subtract amp rom cout

e taret um e0ual

Instruction Types cont-Instruction T

ypes cont-

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 80101

Llo control and I25 instructions

$ 8ranch

$ Procedure call

$ Interrupts

I25 instructions$ Memory$mapped I25

U Most processors support memory$mapped I25

U 7o separate instructions for I25

$ Isolated I25 U Pentium supports isolated I25

U Separate I25 instructions

Ao7ort read from an IO ort

out o7ortA rte to an IO ort

5 Instruction 0ormats5 Instruction 0ormats

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 81101

To types

Lied$length$ 4sed by 1ISC processors

$ $bit 1ISC processors use $bits ide instructions U ltamples6 SPA1C MIPS PoerPC

ariable$length

$ 4sed by CISC processors

$ Memory operands need more bits to specify

5pcode

MaOor and eact operation

Examples of Instruction 0ormatsExam

ples of Instruction 0ormats

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 82101

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 83101

ISC e)uce) Instruction Set Computer 3

ersus

CISC Comple Instruction Set Computer3

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 84101

0

RISC s CISCRISC s CISC

The underlying philosophy of 1ISC machines is that asystem is better able to manage program eecutionhen the program consists of only a fe differentinstructions that are the same length and reJuire thesame number of cloc0 cycles to decode and eecute

1ISC systems access memory only ith eplicit loadand store instructions

In CISC systems many different 0inds of instructionsaccess memory ma0ing instruction length variableand fetch$decode$eecute time unpredictable

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 85101

The difference beteen CISC and 1ISC becomesevident through the basic computer performanceeJuation6

1ISC systems shorten eecution time by reducingthe cloc0 cycles per instruction

CISC systems improve performance by reducing thenumber of instructions per program

RISC s CISCRISC s CISC

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 86101

(

The simple instruction set of 1ISC machinesenables control units to be hardired for maimumspeed

The more comple$$ and variable$$ instruction set of

CISC machines reJuires microcode$based controlunits that interpret instructions as they are fetchedfrom memory This translation ta0es time

Dith fied$length instructions 1ISC lends itself topipelining and speculative eecution

RISC s CISCRISC s CISC

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 87101

mo1 a8 6 mo1 b8 6 mo1 c8

Be add a8 b8 loo Be

Consider the the program fragments6

The total cloc0 cycles for the CISC version might be6(amp mo1s c4cle) + ( mul 6 c4cles) = amp c4cles

Dhile the cloc0 cycles for the 1ISC version is6

( mo1s c4cle) + ( adds c4cle) + ( loos c4cle) = c4cles

Dith 1ISC cloc0 cycle being shorter 1ISC gives usmuch faster eecution speeds

mo1 a8 6 mo1 b8 mul b8 a8

CISC RISC

RISC s CISCRISC s CISC

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 88101

8ecause of their load$store ISAs 1ISC architecturesreJuire a large number of CP4 registers

These register provide fast access to data duringseJuential program eecution

They can also be employed to reduce the overheadtypically caused by passing parameters tosubprograms

Instead of pulling parameters off of a stac0 the

subprogram is directed to use a subset of registers

RISC s CISCRISC s CISC

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 89101

3

This is horegisters canbe overlappedin a 1ISCsystem

The currentindo pointer (CDP) pointsto the activeregister

indo

RISC s CISCRISC s CISC

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 90101

34

It is becoming increasingly difficult to distinguish1ISC architectures from CISC architectures

Some 1ISC systems provide more etravagantinstruction sets than some CISC systems

Some systems combine both approaches The folloing to slides summari-e the

characteristics that traditionally typify the differencesbeteen these to architectures

RISC s CISCRISC s CISC

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 91101

31

RISC Multiple reister sets4

Three operan)s perinstruction4

Parameter passinthrouh reister5in)o5s4

Sinle-ccle

instructions4 7ar)5ire)

control4

7ihl pipeline)4

CISC Sinle reister set4

ne or t5o reisteroperan)s per

instruction4 Parameter passin

throuh memor4

Multiple ccle

instructions4 Microproramme)

control4

(ess pipeline)4ontinued

RISC s CISCRISC s CISC

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 92101

32

RISC Simple instructions

fe5 in num9er4

ie) lenth

instructions4 Compleit in

compiler4

nl 29ADT9$E

instructions accessmemor4

e5 a))ressin mo)es4

CISC Man comple

instructions4

aria9le lenth

instructions4 Compleit in

microco)e4

Man instructions can

access memor4

Man a))ressinmo)es4

RISC s CISCRISC s CISC

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 93101

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 94101

Summar

Instruction Set Design IssuesInstruction Set Desi

gn Issues

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 95101

g

Instruction set )esin issues inclu)e here are operan)s store)lt

- reisters memor stac= accumulator

7o5 man eplicit operan)s are therelt

- 0 + 2 or amp

7o5 is the operan) location specifie)lt

- reister imme)iate in)irect 4 4 4

hat tpe gt sie of operan)s are supporte)lt

- 9te int float )ou9le strin ector4 4 4

hat operations are supporte)lt

- a)) su9 mul moe compare 4 4 4

More A+out 6eneral Purpose egistersMore A+out 6eneral Pu

rpose egisters

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 96101

h )o almost all ne5 architectures usePslt

eisters are much faster than memor eencache3

- eister alues are aaila9le imme)iatel

- hen memor isnt rea) processor must 5aitBstall3

eisters are conenient for aria9le storae

- Compiler assins some aria9les Dust to reisters

- More compact co)e since small fiel)s specifreisters

compare) to memor a))resses3Registers Cache

MemoryProcessor Disk

7hat perations are eeded7hat

perations are eeded

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 97101

3

Arithmetic E (oical

Inteer arithmetic A$$ SU MU(T $I S7IT

(oical operation AN$ NT

$ata Transfer - cop loa) store

Control - 9ranch Dump call return

loatin Point A$$ MU( $I 3 Same as arithmetic 9ut usuall ta=e 9ier operan)s

$ecimal - A$$$ CNT

Strin - moe compare search

raphics F piel an) erte compressionG)ecompression operations

Stacamps Architecture Pros and ConsStacamps Architecture Pros and Cons

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 98101

Stacamps Architecture Pros and ConsStacamps Architecture Pros and Cons

Pros oo) co)e )ensit implicit top of stac=3

(o5 har)5are re1uirements

as to 5rite a simpler compiler for stac= architectures

Cons Stac= 9ecomes the 9ottlenec=

(ittle a9ilit for parallelism or pipelinin

$ata is not al5as at the top of stac= 5hen nee) so a))itionalinstructions li=e TP an) SAP are nee)e)

$ifficult to 5rite an optimiin compiler for stac= architectures

Accumulators Architecture Pros and Cons

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 99101

Accumulators Architecture Pros and Cons

Pros U ery lo hardare reJuirements

U ltasy to design and understand

Cons U Accumulator becomes the bottlenec0

U 3ittle ability for parallelism or pipelining U igh memory traffic

Memory Memory Architecture Pros and Cons

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 100101

Memory3Memory Architecture Pros and Cons

Pros U 1eJuires feer instructions (especially if operands)

U ltasy to rite compilers for (especially if operands)

Cons U ery high memory traffic (especially if operands)

U ariable number of cloc0s per instruction

U Dith to operands more data movements are reJuired

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 101101

Memory3Register Architecture Pros and Cons

Pros U Some data can be accessed ithout loading first

U Instruction format easy to encode

U ood code density

Cons U 5perands are not eJuivalent (poor orthogonal)

U ariable number of cloc0s per instruction U May limit number of registers

Page 80: Chapter 2: Advanced computer Architecture

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 80101

Llo control and I25 instructions

$ 8ranch

$ Procedure call

$ Interrupts

I25 instructions$ Memory$mapped I25

U Most processors support memory$mapped I25

U 7o separate instructions for I25

$ Isolated I25 U Pentium supports isolated I25

U Separate I25 instructions

Ao7ort read from an IO ort

out o7ortA rte to an IO ort

5 Instruction 0ormats5 Instruction 0ormats

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 81101

To types

Lied$length$ 4sed by 1ISC processors

$ $bit 1ISC processors use $bits ide instructions U ltamples6 SPA1C MIPS PoerPC

ariable$length

$ 4sed by CISC processors

$ Memory operands need more bits to specify

5pcode

MaOor and eact operation

Examples of Instruction 0ormatsExam

ples of Instruction 0ormats

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 82101

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 83101

ISC e)uce) Instruction Set Computer 3

ersus

CISC Comple Instruction Set Computer3

RISC s CISCRISC s CISC

The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.

RISC systems access memory only with explicit load and store instructions.

In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.

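The difference between CISC and RISC becomes evident through the basic computer performance equation:

$$\frac{\text{time}}{\text{program}} = \frac{\text{time}}{\text{cycle}} \times \frac{\text{cycles}}{\text{instruction}} \times \frac{\text{instructions}}{\text{program}}$$

RISC systems shorten execution time by reducing the clock cycles per instruction.

CISC systems improve performance by reducing the number of instructions per program.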
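The performance equation is commonly written as time per program = instructions per program × cycles per instruction × time per cycle. The sketch below evaluates it for two made-up machines. The instruction counts, CPI values and clock rate are illustrative assumptions, not measurements.

```python
def cpu_time_seconds(instructions, cycles_per_instruction, clock_hz):
    """time/program = instructions/program * cycles/instruction * time/cycle."""
    if clock_hz <= 0:
        raise ValueError("clock_hz must be positive")
    return instructions * cycles_per_instruction / clock_hz


# Hypothetical workloads: the CISC-like build runs fewer instructions,
# the RISC-like build runs more instructions at a lower CPI.
cisc_time = cpu_time_seconds(instructions=1_000_000, cycles_per_instruction=4.0, clock_hz=100e6)
risc_time = cpu_time_seconds(instructions=1_500_000, cycles_per_instruction=1.2, clock_hz=100e6)
print(f"CISC-like: {cisc_time * 1e3:.1f} ms, RISC-like: {risc_time * 1e3:.1f} ms")
```

With these assumed numbers the RISC-like build executes 50% more instructions and still finishes sooner, because its CPI is much lower. Different assumptions can reverse the outcome.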

The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.

The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.

With fixed-length instructions, RISC lends itself to pipelining and speculative execution.

Consider the program fragments:

CISC                 RISC
mov ax, 10           mov ax, 0
mov bx, 5            mov bx, 10
mul bx, ax           mov cx, 5
              Begin: add ax, bx
                     loop Begin

The total clock cycles for the CISC version might be:
(2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles

While the clock cycles for the RISC version are:
(3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles

With the RISC clock cycle being shorter, RISC gives us much faster execution speeds.
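The cycle comparison can be checked with a few lines of arithmetic. This sketch assumes, for illustration, a CISC fragment of two one-cycle moves plus one 30-cycle multiply, and a RISC fragment that multiplies by repeated addition using three moves, five adds and five loop instructions of one cycle each. The function name and the tuple layout are hypothetical.

```python
def total_cycles(fragment):
    """Sum count * cycles_each over (mnemonic, count, cycles_each) entries."""
    return sum(count * cycles for _mnemonic, count, cycles in fragment)


# Assumed instruction counts and latencies, for illustration only.
cisc_fragment = [("mov", 2, 1), ("mul", 1, 30)]
risc_fragment = [("mov", 3, 1), ("add", 5, 1), ("loop", 5, 1)]

cisc_cycles = total_cycles(cisc_fragment)
risc_cycles = total_cycles(risc_fragment)
print(cisc_cycles, risc_cycles)  # 32 13
```

The RISC fragment executes more instructions (13 versus 3) but needs fewer cycles under these assumptions, which is the trade-off the performance equation describes.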

Because of their load-store ISAs, RISC architectures require a large number of CPU registers.

These registers provide fast access to data during sequential program execution.

They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.

Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.

This is how registers can be overlapped in a RISC system.

The current window pointer (CWP) points to the active register window.
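A toy model can make the overlap concrete. In the sketch below, each window has "in", "local" and "out" registers, and the caller's "out" registers are physically the same as the callee's "in" registers. The class name, the window sizes and the circular layout are assumptions chosen for illustration, and window overflow (spilling to memory) is not modelled.

```python
class RegisterWindows:
    """Toy model of overlapping register windows (sizes are assumptions)."""

    def __init__(self, windows=4, shared=2, local=2):
        self.windows = windows
        self.shared = shared            # registers in each overlap region
        self.local = local              # private registers per window
        self.stride = shared + local    # physical registers added per window
        self.regs = [0] * (windows * self.stride)
        self.cwp = 0                    # current window pointer

    def _index(self, kind, n):
        base = self.cwp * self.stride
        offset = {"in": 0, "local": self.shared, "out": self.stride}[kind]
        limit = self.local if kind == "local" else self.shared
        if not 0 <= n < limit:
            raise IndexError(f"{kind}{n} is outside the window")
        # The last window's "out" registers wrap around to window 0's "in" registers.
        return (base + offset + n) % len(self.regs)

    def read(self, kind, n):
        return self.regs[self._index(kind, n)]

    def write(self, kind, n, value):
        self.regs[self._index(kind, n)] = value

    def call(self):
        """Enter a subprogram: advance the CWP to the next window."""
        self.cwp = (self.cwp + 1) % self.windows

    def ret(self):
        """Leave a subprogram: move the CWP back to the caller's window."""
        self.cwp = (self.cwp - 1) % self.windows


rw = RegisterWindows()
rw.write("out", 0, 42)     # caller places an argument in its out0
rw.call()
print(rw.read("in", 0))    # callee sees 42 in its in0, with no memory traffic
```

Passing a parameter therefore costs no load or store: only the CWP changes, and the shared registers are reinterpreted by the callee.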

It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.

Some RISC systems provide more extravagant instruction sets than some CISC systems.

Some systems combine both approaches.

The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.

RISC
- Multiple register sets.
- Three operands per instruction.
- Parameter passing through register windows.
- Single-cycle instructions.
- Hardwired control.
- Highly pipelined.

CISC
- Single register set.
- One or two register operands per instruction.
- Parameter passing through memory.
- Multiple-cycle instructions.
- Microprogrammed control.
- Less pipelined.

Continued...

RISC
- Simple instructions, few in number.
- Fixed-length instructions.
- Complexity in compiler.
- Only LOAD/STORE instructions access memory.
- Few addressing modes.

CISC
- Many complex instructions.
- Variable-length instructions.
- Complexity in microcode.
- Many instructions can access memory.
- Many addressing modes.

Summary

Instruction Set Design Issues

Instruction set design issues include:

Where are operands stored?
- registers, memory, stack, accumulator

How many explicit operands are there?
- 0, 1, 2, or 3

How is the operand location specified?
- register, immediate, indirect, . . .

What type and size of operands are supported?
- byte, int, float, double, string, vector, . . .

What operations are supported?
- add, sub, mul, move, compare, . . .

More About General Purpose Registers

Why do almost all new architectures use GPRs?

Registers are much faster than memory (even cache)
- Register values are available immediately
- When memory isn't ready, the processor must wait ("stall")

Registers are convenient for variable storage
- Compiler assigns some variables just to registers
- More compact code, since small fields specify registers (compared to memory addresses)

[Figure: memory hierarchy (processor, registers, cache, memory, disk)]

What Operations are Needed

Arithmetic and Logical
- Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
- Logical operation: AND, OR, XOR, NOT

Data Transfer - copy, load, store

Control - branch, jump, call, return

Floating Point - ADDF, MULF, DIVF
- Same as arithmetic but usually take bigger operands

Decimal - ADDD, CONVERT

String - move, compare, search

Graphics - pixel and vertex, compression/decompression operations

Stack Architecture: Pros and Cons

Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures

Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures
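The points about implicit operands and extra reordering instructions can be seen in a small interpreter. This is a minimal sketch of a hypothetical stack machine. The opcode names and the tuple-based program format are assumptions, not a real ISA.

```python
def run_stack_program(program, memory):
    """Execute (opcode, operand) pairs on a toy stack machine; returns memory."""
    stack = []
    for opcode, operand in program:
        if opcode == "PUSH":            # only PUSH and POP name a memory operand
            stack.append(memory[operand])
        elif opcode == "POP":
            memory[operand] = stack.pop()
        elif opcode == "SWAP":          # reorder when data is not where it is needed
            stack[-1], stack[-2] = stack[-2], stack[-1]
        elif opcode in ("ADD", "SUB", "MUL"):
            right = stack.pop()         # operands are implicit: the top two entries
            left = stack.pop()
            if opcode == "ADD":
                stack.append(left + right)
            elif opcode == "SUB":
                stack.append(left - right)
            else:
                stack.append(left * right)
        else:
            raise ValueError(f"unknown opcode {opcode!r}")
    return memory


mem = {"a": 2, "b": 3, "c": 4, "d": 0, "e": 0}

# d = (a + b) * c : the arithmetic instructions carry no operand fields at all.
run_stack_program(
    [("PUSH", "a"), ("PUSH", "b"), ("ADD", None), ("PUSH", "c"), ("MUL", None), ("POP", "d")],
    mem,
)

# e = c - (a + b) : c arrives on top of the sum, so a SWAP is needed before SUB.
run_stack_program(
    [("PUSH", "a"), ("PUSH", "b"), ("ADD", None), ("PUSH", "c"), ("SWAP", None),
     ("SUB", None), ("POP", "e")],
    mem,
)
print(mem["d"], mem["e"])  # 20 -1
```

Every instruction reads or writes the top of the stack, so each one depends on the one before it. That dependency chain is why the stack becomes the bottleneck and why pipelining is hard.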

Accumulator Architecture: Pros and Cons

Pros
- Very low hardware requirements
- Easy to design and understand

Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic
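The memory-traffic cost can be counted directly in a toy model. The sketch below assumes a hypothetical single-accumulator machine in which every instruction names exactly one memory address. The opcode names and the temporary location `t` are illustrative.

```python
def run_accumulator_program(program, memory):
    """Run (opcode, address) pairs; return (memory, number_of_memory_accesses)."""
    acc = 0
    accesses = 0
    for opcode, address in program:
        if opcode == "LOAD":
            acc = memory[address]
        elif opcode == "ADD":
            acc += memory[address]
        elif opcode == "MUL":
            acc *= memory[address]
        elif opcode == "STORE":
            memory[address] = acc
        else:
            raise ValueError(f"unknown opcode {opcode!r}")
        accesses += 1                   # each instruction touches memory once
    return memory, accesses


# e = (a + b) * (c + d): the first sum must be spilled to a temporary t,
# because the single accumulator is needed again for the second sum.
mem = {"a": 2, "b": 3, "c": 4, "d": 5, "t": 0, "e": 0}
program = [
    ("LOAD", "a"), ("ADD", "b"), ("STORE", "t"),
    ("LOAD", "c"), ("ADD", "d"), ("MUL", "t"),
    ("STORE", "e"),
]
mem, accesses = run_accumulator_program(program, mem)
print(mem["e"], accesses)  # 45 7
```

Seven instructions produce seven memory accesses, two of which exist only to save and reload the intermediate sum. A machine with several general-purpose registers could keep that value in a register instead.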

Memory-Memory Architecture: Pros and Cons

Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)

Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required

Memory-Register Architecture: Pros and Cons

Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density

Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers

Page 83: Chapter 2: Advanced computer Architecture

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 83101

ISC e)uce) Instruction Set Computer 3

ersus

CISC Comple Instruction Set Computer3

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 84101

0

RISC s CISCRISC s CISC

The underlying philosophy of 1ISC machines is that asystem is better able to manage program eecutionhen the program consists of only a fe differentinstructions that are the same length and reJuire thesame number of cloc0 cycles to decode and eecute

1ISC systems access memory only ith eplicit loadand store instructions

In CISC systems many different 0inds of instructionsaccess memory ma0ing instruction length variableand fetch$decode$eecute time unpredictable

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 85101

The difference beteen CISC and 1ISC becomesevident through the basic computer performanceeJuation6

1ISC systems shorten eecution time by reducingthe cloc0 cycles per instruction

CISC systems improve performance by reducing thenumber of instructions per program

RISC s CISCRISC s CISC

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 86101

(

The simple instruction set of 1ISC machinesenables control units to be hardired for maimumspeed

The more comple$$ and variable$$ instruction set of

CISC machines reJuires microcode$based controlunits that interpret instructions as they are fetchedfrom memory This translation ta0es time

Dith fied$length instructions 1ISC lends itself topipelining and speculative eecution

RISC s CISCRISC s CISC

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 87101

mo1 a8 6 mo1 b8 6 mo1 c8

Be add a8 b8 loo Be

Consider the the program fragments6

The total cloc0 cycles for the CISC version might be6(amp mo1s c4cle) + ( mul 6 c4cles) = amp c4cles

Dhile the cloc0 cycles for the 1ISC version is6

( mo1s c4cle) + ( adds c4cle) + ( loos c4cle) = c4cles

Dith 1ISC cloc0 cycle being shorter 1ISC gives usmuch faster eecution speeds

mo1 a8 6 mo1 b8 mul b8 a8

CISC RISC

RISC s CISCRISC s CISC

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 88101

8ecause of their load$store ISAs 1ISC architecturesreJuire a large number of CP4 registers

These register provide fast access to data duringseJuential program eecution

They can also be employed to reduce the overheadtypically caused by passing parameters tosubprograms

Instead of pulling parameters off of a stac0 the

subprogram is directed to use a subset of registers

RISC s CISCRISC s CISC

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 89101

3

This is horegisters canbe overlappedin a 1ISCsystem

The currentindo pointer (CDP) pointsto the activeregister

indo

RISC s CISCRISC s CISC

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 90101

34

It is becoming increasingly difficult to distinguish1ISC architectures from CISC architectures

Some 1ISC systems provide more etravagantinstruction sets than some CISC systems

Some systems combine both approaches The folloing to slides summari-e the

characteristics that traditionally typify the differencesbeteen these to architectures

RISC s CISCRISC s CISC

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 91101

31

RISC Multiple reister sets4

Three operan)s perinstruction4

Parameter passinthrouh reister5in)o5s4

Sinle-ccle

instructions4 7ar)5ire)

control4

7ihl pipeline)4

CISC Sinle reister set4

ne or t5o reisteroperan)s per

instruction4 Parameter passin

throuh memor4

Multiple ccle

instructions4 Microproramme)

control4

(ess pipeline)4ontinued

RISC s CISCRISC s CISC

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 92101

32

RISC Simple instructions

fe5 in num9er4

ie) lenth

instructions4 Compleit in

compiler4

nl 29ADT9$E

instructions accessmemor4

e5 a))ressin mo)es4

CISC Man comple

instructions4

aria9le lenth

instructions4 Compleit in

microco)e4

Man instructions can

access memor4

Man a))ressinmo)es4

RISC s CISCRISC s CISC

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 93101

RISC s CISCRISC s CISC

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 94101

Summar

Instruction Set Design IssuesInstruction Set Desi

gn Issues

7232019 Chapter 2 Advanced computer Architecture

httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 95101

g

Instruction set design issues include:

Where are operands stored?
- registers, memory, stack, accumulator

How many explicit operands are there?
- 0, 1, 2, or 3

How is the operand location specified?
- register, immediate, indirect, ...

What type and size of operands are supported?
- byte, int, float, double, string, vector, ...

What operations are supported?
- add, sub, mul, move, compare, ...
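As a rough illustration of the "how many explicit operands" question, the sketch below writes C = A + B in four operand-storage styles and counts the explicit operands of each ADD. The mnemonics and register names are hypothetical, chosen only for illustration; they are not taken from any particular ISA.

```python
# Hypothetical instruction sequences for C = A + B in four operand-storage
# styles. Mnemonics and register names are illustrative assumptions.
SEQUENCES = {
    "stack": ["PUSH A", "PUSH B", "ADD", "POP C"],
    "accumulator": ["LOAD A", "ADD B", "STORE C"],
    "register-memory": ["LOAD R1, A", "ADD R1, B", "STORE C, R1"],
    "load-store": ["LOAD R1, A", "LOAD R2, B", "ADD R3, R1, R2", "STORE C, R3"],
}


def explicit_operands(instruction):
    """Count the comma-separated operands written after the mnemonic."""
    parts = instruction.split(None, 1)
    if len(parts) == 1:
        return 0  # e.g. a stack ADD names no operands at all
    return len(parts[1].split(","))


def add_operand_count(style):
    """Return the number of explicit operands of the ADD in one style."""
    for instruction in SEQUENCES[style]:
        if instruction.startswith("ADD"):
            return explicit_operands(instruction)
    raise ValueError(f"no ADD instruction in style {style!r}")


if __name__ == "__main__":
    for style in SEQUENCES:
        print(f"{style:16s} ADD operands: {add_operand_count(style)}  "
              f"instructions: {len(SEQUENCES[style])}")
```

In this sketch the stack ADD names no operands, the accumulator ADD names one, the register-memory ADD two, and the load-store ADD three, which lines up with the 0, 1, 2, or 3 choices above.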

More About General Purpose Registers

Why do almost all new architectures use GPRs?

Registers are much faster than memory (even cache)
- Register values are available immediately
- When memory isn't ready, processor must wait ("stall")

Registers are convenient for variable storage
- Compiler assigns some variables just to registers
- More compact code since small fields specify registers (compared to memory addresses)

[Figure: Processor (Registers), Cache, Memory, Disk]

What Operations are Needed

Arithmetic & Logical
- Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
- Logical operation: AND, NOT

Data Transfer - copy, load, store

Control - branch, jump, call, return

Floating Point - ADD, MUL, DIV (same as arithmetic but usually take bigger operands)

Decimal - ADDD, CONVERT

String - move, compare, search

Graphics - pixel and vertex, compression/decompression operations

Stack Architecture Pros and Cons

Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures

Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures
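To make the point about extra stack-shuffling instructions concrete, here is a minimal sketch of a toy stack machine in Python. The opcode names (PUSH, SUB, SWAP) and the `run` helper are hypothetical; the point is only that when operands sit on the stack in the wrong order, an extra SWAP is needed before the operation.

```python
def run(program):
    """Execute a list of (opcode, argument) pairs on a toy stack machine."""
    stack = []
    for opcode, arg in program:
        if opcode == "PUSH":
            stack.append(arg)
        elif opcode == "SWAP":
            # Exchange the two topmost entries.
            stack[-1], stack[-2] = stack[-2], stack[-1]
        elif opcode == "SUB":
            # Operands are implicit: pop the top two, push (second - top).
            top = stack.pop()
            second = stack.pop()
            stack.append(second - top)
        else:
            raise ValueError(f"unknown opcode {opcode!r}")
    return stack


# Goal: compute 10 - 3, but the values were pushed as 3 then 10.
without_swap = [("PUSH", 3), ("PUSH", 10), ("SUB", None)]
with_swap = [("PUSH", 3), ("PUSH", 10), ("SWAP", None), ("SUB", None)]

if __name__ == "__main__":
    print(run(without_swap))  # operands in the wrong order
    print(run(with_swap))     # one extra instruction fixes the order
```

The second program is one instruction longer purely to rearrange data, which is the kind of overhead the cons list describes.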

Accumulator Architecture Pros and Cons

Pros
- Very low hardware requirements
- Easy to design and understand

Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic

Memory-Memory Architecture Pros and Cons

Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)

Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required

Memory-Register Architecture Pros and Cons

Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density

Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers

Page 84: Chapter 2: Advanced computer Architecture

RISC vs. CISC

The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.

RISC systems access memory only with explicit load and store instructions.

In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.

RISC vs. CISC

The difference between CISC and RISC becomes evident through the basic computer performance equation:

time/program = (time/cycle) × (cycles/instruction) × (instructions/program)

RISC systems shorten execution time by reducing the clock cycles per instruction.

CISC systems improve performance by reducing the number of instructions per program.
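Numerically, the trade-off might look like the sketch below, which multiplies instructions per program, cycles per instruction, and time per cycle. The instruction counts, CPI values, and cycle times are made-up numbers for illustration, not measurements of any real machine, and `execution_time` is a hypothetical helper name.

```python
def execution_time(instructions, cycles_per_instruction, seconds_per_cycle):
    """time/program = instructions/program * cycles/instruction * time/cycle."""
    return instructions * cycles_per_instruction * seconds_per_cycle


# Hypothetical workloads: the CISC-style program uses fewer instructions,
# the RISC-style program uses fewer cycles per instruction.
cisc_time = execution_time(instructions=1_000, cycles_per_instruction=8,
                           seconds_per_cycle=2e-9)
risc_time = execution_time(instructions=3_000, cycles_per_instruction=1,
                           seconds_per_cycle=1e-9)

if __name__ == "__main__":
    print(f"CISC-style: {cisc_time:.2e} s")
    print(f"RISC-style: {risc_time:.2e} s")
```

With different assumed numbers the comparison could go the other way; the sketch only shows which factor each approach tries to shrink.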

RISC vs. CISC

The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.

The more complex (and variable) instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.

With fixed-length instructions, RISC lends itself to pipelining and speculative execution.

RISC vs. CISC

Consider the program fragments:

CISC:
    mov ax, 10
    mov bx, 5
    mul bx, ax

RISC:
    mov ax, 0
    mov bx, 10
    mov cx, 5
    Begin: add ax, bx
    loop Begin

The total clock cycles for the CISC version might be:
(2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles

While the clock cycles for the RISC version is:
(3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles

With the RISC clock cycle being shorter, RISC gives us much faster execution speeds.
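The cycle arithmetic for the two fragments can be checked with a short Python sketch. It assumes 1 cycle for each mov, add, and loop, 30 cycles for mul, and a loop body that runs 5 times; these are illustrative figures, not timings of a real processor. The `CYCLES` table and `total_cycles` helper are hypothetical names.

```python
# Assumed per-instruction costs in clock cycles (illustrative only).
CYCLES = {"mov": 1, "add": 1, "loop": 1, "mul": 30}


def total_cycles(executed):
    """Sum cycle costs over a list of (mnemonic, times_executed) pairs."""
    return sum(CYCLES[mnemonic] * count for mnemonic, count in executed)


# CISC-style fragment: two moves, then one multi-cycle multiply.
cisc = [("mov", 2), ("mul", 1)]

# RISC-style fragment: three moves, then an add/loop pair repeated 5 times
# to build the product by repeated addition.
risc = [("mov", 3), ("add", 5), ("loop", 5)]

if __name__ == "__main__":
    print("CISC cycles:", total_cycles(cisc))  # 2*1 + 1*30 = 32
    print("RISC cycles:", total_cycles(risc))  # 3*1 + 5*1 + 5*1 = 13
```

The RISC fragment executes more instructions but fewer total cycles under these assumed costs; a cheaper multiply or a larger loop count would change the balance.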

RISC vs. CISC

Because of their load-store ISAs, RISC architectures require a large number of CPU registers.

These registers provide fast access to data during sequential program execution.

They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.

Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.

RISC vs. CISC

This is how registers can be overlapped in a RISC system.

The current window pointer (CWP) points to the active register window.
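One way to picture overlapping register windows is the toy Python model below. It is a sketch under stated assumptions, not a description of any specific processor: the window count, the sizes of the shared and local groups, the direction in which the CWP moves on a call, and the `RegisterWindows` class itself are all hypothetical. The "out" registers of one window are the same physical registers as the "in" registers of the next, so a parameter is passed without copying.

```python
class RegisterWindows:
    """Toy model of overlapping register windows (sizes are assumptions)."""

    def __init__(self, windows=4, shared=8, local=8):
        self.windows = windows
        self.shared = shared      # registers shared between adjacent windows
        self.local = local        # registers private to one window
        self.stride = shared + local
        self.regs = [0] * (windows * self.stride)
        self.cwp = 0              # current window pointer

    def _physical(self, kind, index):
        """Map (kind, index) in the active window to a physical register."""
        base = self.cwp * self.stride
        offset = {"in": 0, "local": self.shared, "out": self.stride}[kind]
        return (base + offset + index) % len(self.regs)

    def write(self, kind, index, value):
        self.regs[self._physical(kind, index)] = value

    def read(self, kind, index):
        return self.regs[self._physical(kind, index)]

    def call(self):
        """Enter a subprogram: advance CWP so caller outs become callee ins."""
        self.cwp = (self.cwp + 1) % self.windows

    def ret(self):
        """Return: move CWP back to the caller's window."""
        self.cwp = (self.cwp - 1) % self.windows


if __name__ == "__main__":
    rw = RegisterWindows()
    rw.write("out", 0, 42)    # caller places a parameter in an out register
    rw.call()
    print(rw.read("in", 0))   # callee sees it as an in register, no copying
    rw.write("in", 0, 99)     # callee leaves a result in the same register
    rw.ret()
    print(rw.read("out", 0))  # caller reads the result from its out register
```

The model ignores what happens when calls nest deeper than the number of windows; handling that case would need some form of spilling to memory, which is left out here.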





Page 88: Chapter 2: Advanced computer Architecture

RISC vs CISC

Because of their load/store ISAs, RISC architectures require a large number of CPU registers.

These registers provide fast access to data during sequential program execution.

They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.

Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.

[Figure: overlapping register windows in a RISC system]

This is how registers can be overlapped in a RISC system.

The current window pointer (CWP) points to the active register window.
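The overlap between windows can be illustrated with a small simulation. This is only a sketch under stated assumptions: the class name `RegisterWindows`, the window count, and the group sizes (8 "in", 8 "local", 8 "out" registers) are hypothetical choices for illustration and are not taken from the slides. Window overflow and spilling to memory are not modeled.

```python
class RegisterWindows:
    """Toy model of overlapping register windows (all sizes are assumptions)."""

    def __init__(self, n_windows=8, n_in=8, n_local=8):
        self.n_windows = n_windows
        self.n_in = n_in                  # the "out" group has the same size as "in"
        self.n_local = n_local
        self.stride = n_in + n_local      # physical registers added per window
        self.regs = [0] * (self.stride * n_windows)
        self.cwp = 0                      # current window pointer

    def _phys(self, group, i):
        """Map (group, index) in the active window to a physical register."""
        base = self.cwp * self.stride
        if group == "in":
            off = i
        elif group == "local":
            off = self.n_in + i
        elif group == "out":
            off = self.stride + i         # same registers as the next window's "in"
        else:
            raise ValueError(group)
        return (base + off) % len(self.regs)

    def read(self, group, i):
        return self.regs[self._phys(group, i)]

    def write(self, group, i, value):
        self.regs[self._phys(group, i)] = value

    def call(self):
        self.cwp = (self.cwp + 1) % self.n_windows   # advance to the callee's window

    def ret(self):
        self.cwp = (self.cwp - 1) % self.n_windows   # go back to the caller's window


rw = RegisterWindows()
rw.write("out", 0, 42)        # caller places an argument in an "out" register
rw.call()
arg = rw.read("in", 0)        # callee finds it in its "in" register, no copy made
rw.write("in", 0, arg + 1)    # callee leaves a return value in the same place
rw.ret()
result = rw.read("out", 0)    # caller reads the return value
```

The point of the design is that a call only changes the CWP: the caller's "out" registers and the callee's "in" registers are the same physical storage, so parameters do not travel through memory.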

It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.

Some RISC systems provide more extravagant instruction sets than some CISC systems.

Some systems combine both approaches.

The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.

RISC
- Multiple register sets.
- Three operands per instruction.
- Parameter passing through register windows.
- Single-cycle instructions.
- Hardwired control.
- Highly pipelined.

CISC
- Single register set.
- One or two register operands per instruction.
- Parameter passing through memory.
- Multiple cycle instructions.
- Microprogrammed control.
- Less pipelined.

RISC
- Simple instructions, few in number.
- Fixed length instructions.
- Complexity in compiler.
- Only LOAD/STORE instructions access memory.
- Few addressing modes.

CISC
- Many complex instructions.
- Variable length instructions.
- Complexity in microcode.
- Many instructions can access memory.
- Many addressing modes.
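The load/store restriction on the RISC side can be made concrete with a toy interpreter. This is a minimal sketch, assuming a hypothetical three-instruction machine (`LOAD`, `STORE`, `ADD`) whose names and tuple encoding are invented for illustration; it computes C = A + B and counts how many instructions touch memory.

```python
def run_load_store(program, memory, n_regs=8):
    """Run a toy load/store program; return the number of memory accesses."""
    regs = [0] * n_regs
    mem_accesses = 0
    for instr in program:
        op = instr[0]
        if op == "LOAD":                  # LOAD rd, addr
            _, rd, addr = instr
            regs[rd] = memory[addr]
            mem_accesses += 1
        elif op == "STORE":               # STORE rs, addr
            _, rs, addr = instr
            memory[addr] = regs[rs]
            mem_accesses += 1
        elif op == "ADD":                 # ADD rd, rs1, rs2 (register operands only)
            _, rd, rs1, rs2 = instr
            regs[rd] = regs[rs1] + regs[rs2]
        else:
            raise ValueError(op)
    return mem_accesses


ls_mem = {"A": 2, "B": 3, "C": 0}
ls_program = [
    ("LOAD", 1, "A"),
    ("LOAD", 2, "B"),
    ("ADD", 3, 1, 2),      # three register operands, no memory access
    ("STORE", 3, "C"),
]
ls_accesses = run_load_store(ls_program, ls_mem)
```

Four instructions are needed, but only the loads and the store reach memory; the arithmetic instruction works entirely on registers, which is what keeps its timing simple.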


Summary

Instruction Set Design Issues

Instruction set design issues include:

Where are operands stored?
- registers, memory, stack, accumulator

How many explicit operands are there?
- 0, 1, 2, or 3

How is the operand location specified?
- register, immediate, indirect, ...

What type and size of operands are supported?
- byte, int, float, double, string, vector, ...

What operations are supported?
- add, sub, mul, move, compare, ...

More About General Purpose Registers

Why do almost all new architectures use GPRs?

Registers are much faster than memory (even cache)
- Register values are available immediately
- When memory isn't ready, processor must wait ("stall")

Registers are convenient for variable storage
- Compiler assigns some variables just to registers
- More compact code since small fields specify registers (compared to memory addresses)

[Figure: storage hierarchy: Processor (Registers), Cache, Memory, Disk]

What Operations are Needed

Arithmetic and Logical
- Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
- Logical operation: AND, OR, NOT

Data Transfer - copy, load, store

Control - branch, jump, call, return

Floating Point - ADDF, MULF, DIVF (same as arithmetic but usually take bigger operands)

Decimal - ADDD, CONVERT

String - move, compare, search

Graphics - pixel and vertex, compression/decompression operations

Stack Architecture: Pros and Cons

Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures

Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures
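The "implicit top of stack" point can be seen in a toy stack machine. This is a rough sketch, assuming a hypothetical instruction set (`PUSH`, `POP`, `ADD`, `MUL`) and a dictionary standing in for memory; none of these names come from the slides. It evaluates C = A + B.

```python
def run_stack(program, memory):
    """Execute (opcode, operand) pairs on a toy stack machine."""
    stack = []
    for op, arg in program:
        if op == "PUSH":                  # memory -> top of stack
            stack.append(memory[arg])
        elif op == "POP":                 # top of stack -> memory
            memory[arg] = stack.pop()
        elif op in ("ADD", "MUL"):        # operands are implicit: the top two entries
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b if op == "ADD" else a * b)
        else:
            raise ValueError(op)
    return memory


stack_mem = {"A": 2, "B": 3, "C": 0}
stack_program = [
    ("PUSH", "A"),
    ("PUSH", "B"),
    ("ADD", None),     # no operand field is needed at all
    ("POP", "C"),
]
run_stack(stack_program, stack_mem)
```

The arithmetic instruction carries no operand field, which is where the code density comes from. Every operation also goes through the same stack top, which is why the stack becomes the bottleneck.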

Accumulator Architecture: Pros and Cons

Pros
- Very low hardware requirements
- Easy to design and understand

Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic
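The memory-traffic cost can be illustrated with a toy accumulator machine. This is a minimal sketch, assuming a hypothetical one-address instruction set (`LOAD`, `ADD`, `MUL`, `STORE`) invented for illustration; it evaluates D = (A + B) * C and counts memory accesses.

```python
def run_accumulator(program, memory):
    """Run a one-address program; return the number of memory accesses."""
    acc = 0                               # the single implicit register
    accesses = 0
    for op, addr in program:
        if op == "LOAD":
            acc = memory[addr]
        elif op == "ADD":
            acc += memory[addr]
        elif op == "MUL":
            acc *= memory[addr]
        elif op == "STORE":
            memory[addr] = acc
        else:
            raise ValueError(op)
        accesses += 1                     # every instruction here touches memory once
    return accesses


acc_mem = {"A": 2, "B": 3, "C": 4, "D": 0}
acc_program = [("LOAD", "A"), ("ADD", "B"), ("MUL", "C"), ("STORE", "D")]
acc_accesses = run_accumulator(acc_program, acc_mem)
```

In this toy model every instruction names a memory operand, so the access count equals the instruction count, and every intermediate value passes through the one accumulator.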

Memory-Memory Architecture: Pros and Cons

Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)

Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required
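The trade-off between three-operand and two-operand forms can be sketched with a toy memory-memory machine. The opcodes `ADD3`, `ADD2`, and `MOV` and the access counts assigned to them are assumptions made for illustration (one access per memory operand read or written); both programs compute C = A + B.

```python
def run_mem_mem(program, memory):
    """Run a memory-memory program; return (instruction count, memory accesses)."""
    accesses = 0
    for op, dst, *src in program:
        if op == "ADD3":                  # dst = src0 + src1: two reads, one write
            memory[dst] = memory[src[0]] + memory[src[1]]
            accesses += 3
        elif op == "ADD2":                # dst = dst + src0: two reads, one write
            memory[dst] = memory[dst] + memory[src[0]]
            accesses += 3
        elif op == "MOV":                 # dst = src0: one read, one write
            memory[dst] = memory[src[0]]
            accesses += 2
        else:
            raise ValueError(op)
    return len(program), accesses


mm3_mem = {"A": 2, "B": 3, "C": 0}
three_operand = run_mem_mem([("ADD3", "C", "A", "B")], mm3_mem)

mm2_mem = {"A": 2, "B": 3, "C": 0}
two_operand = run_mem_mem([("MOV", "C", "A"), ("ADD2", "C", "B")], mm2_mem)
```

Under these assumptions the three-operand form needs one instruction and three accesses, while the two-operand form needs an extra move first, giving two instructions and five accesses for the same result.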

Memory-Register Architecture: Pros and Cons

Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density

Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers
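A toy register-memory machine shows both the benefit and the asymmetry. This is only a sketch, assuming a hypothetical instruction format `(opcode, register, address)` chosen for illustration; it computes C = A + B.

```python
def run_reg_mem(program, memory, n_regs=4):
    """Run a register-memory program; return the final register contents."""
    regs = [0] * n_regs
    for op, r, addr in program:
        if op == "LOAD":
            regs[r] = memory[addr]
        elif op == "ADD":                 # one register operand, one memory operand
            regs[r] += memory[addr]
        elif op == "STORE":
            memory[addr] = regs[r]
        else:
            raise ValueError(op)
    return regs


rm_mem = {"A": 2, "B": 3, "C": 0}
rm_program = [
    ("LOAD", 1, "A"),
    ("ADD", 1, "B"),      # B is used directly from memory, without a separate load
    ("STORE", 1, "C"),
]
rm_regs = run_reg_mem(rm_program, rm_mem)
```

The second operand is read straight from memory, which saves a load instruction. The two operands of `ADD` are not interchangeable, though: the destination must be a register and the register's old value is overwritten, which is the orthogonality cost listed above.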

Page 94: Chapter 2: Advanced computer Architecture

Summary

Instruction Set Design Issues

Page 95: Chapter 2: Advanced computer Architecture

Instruction set design issues include:

Where are operands stored?
- registers, memory, stack, accumulator

How many explicit operands are there?
- 0, 1, 2, or 3

How is the operand location specified?
- register, immediate, indirect, ...

What type & size of operands are supported?
- byte, int, float, double, string, vector, ...

What operations are supported?
- add, sub, mul, move, compare, ...
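As a rough illustration of the first two questions (where operands are stored and how many are named explicitly), the sketch below writes C = A + B for four common ISA classes and counts the explicit operands of the ADD instruction in each. The mnemonics, the register names R1 to R3, and the helper names are assumptions made for illustration; they are not taken from the slides or from any particular machine.

```python
# Hypothetical instruction sequences for C = A + B in four ISA classes.
# Mnemonics and register names (R1, R2, R3) are illustrative only.
SEQUENCES = {
    "stack": ["PUSH A", "PUSH B", "ADD", "POP C"],
    "accumulator": ["LOAD A", "ADD B", "STORE C"],
    "register-memory": ["LOAD R1, A", "ADD R1, B", "STORE C, R1"],
    "load-store": ["LOAD R1, A", "LOAD R2, B", "ADD R3, R1, R2", "STORE C, R3"],
}


def explicit_operands(instruction: str) -> int:
    """Count the operands written out after the mnemonic."""
    _, _, rest = instruction.partition(" ")
    return len([op for op in rest.split(",") if op.strip()])


def add_operand_count(sequence: list[str]) -> int:
    """Number of explicit operands on the ADD instruction of a sequence."""
    add_instruction = next(ins for ins in sequence if ins.startswith("ADD"))
    return explicit_operands(add_instruction)


for name, seq in SEQUENCES.items():
    print(f"{name:16s} {len(seq)} instructions, "
          f"ADD names {add_operand_count(seq)} operand(s)")
```

Under these assumptions the ADD instruction names 0, 1, 2, and 3 operands respectively, which matches the "0, 1, 2, or 3" answer listed above.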

More About General Purpose Registers

Page 96: Chapter 2: Advanced computer Architecture

Why do almost all new architectures use GPRs?

Registers are much faster than memory (even cache).
- Register values are available immediately
- When memory isn't ready, processor must wait ("stall")

Registers are convenient for variable storage.
- Compiler assigns some variables just to registers
- More compact code, since small fields specify registers (compared to memory addresses)

[Figure: memory hierarchy with Processor, Registers, Cache, Memory, Disk]
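The code-compactness point can be checked with simple arithmetic. The sketch below compares the size of a three-operand instruction whose operands are register numbers with one whose operands are full memory addresses. The 6-bit opcode, the 32 registers, and the 32-bit addresses are assumed values chosen only for illustration, and the function names are hypothetical.

```python
def register_field_bits(num_registers: int) -> int:
    """Bits needed to name one of num_registers registers."""
    return (num_registers - 1).bit_length()


def instruction_bits(opcode_bits: int, operand_bits: int, num_operands: int) -> int:
    """Total bits for an opcode plus num_operands equally sized operand fields."""
    return opcode_bits + operand_bits * num_operands


# Assumed parameters (not from the slides): 6-bit opcode, 32 registers,
# 32-bit memory addresses, three operands per instruction.
OPCODE_BITS = 6
reg_bits = register_field_bits(32)                           # 5 bits per field
with_registers = instruction_bits(OPCODE_BITS, reg_bits, 3)  # 6 + 3 * 5
with_addresses = instruction_bits(OPCODE_BITS, 32, 3)        # 6 + 3 * 32

print(with_registers, with_addresses)
```

With these numbers the register form needs 21 bits and the address form needs 102 bits, so the register version fits in a single 32-bit word while the address version does not.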

What Operations are Needed

Page 97: Chapter 2: Advanced computer Architecture

Arithmetic & Logical
- Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
- Logical operation: AND, OR, XOR, NOT

Data Transfer - copy, load, store

Control - branch, jump, call, return

Floating Point - ADDF, MULF, DIVF. Same as arithmetic but usually take bigger operands

Decimal - ADDD, CONVERT

String - move, compare, search

Graphics - pixel and vertex, compression/decompression operations

Page 98: Chapter 2: Advanced computer Architecture

Stack Architecture Pros and Cons

Pros:
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures

Cons:
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures
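The point about data not being at the top of the stack can be seen in a toy stack machine. This is a minimal sketch, assuming a hypothetical instruction set (PUSH, POP, SWAP, SUB) and a dictionary standing in for memory; it does not model any real stack ISA.

```python
def run_stack_program(program, memory):
    """Run a tiny stack-machine program and return the final stack."""
    stack = []
    for instruction in program:
        op, _, arg = instruction.partition(" ")
        if op == "PUSH":        # push the value of a memory variable
            stack.append(memory[arg])
        elif op == "POP":       # pop the top of stack into a memory variable
            memory[arg] = stack.pop()
        elif op == "SWAP":      # exchange the two topmost entries
            stack[-1], stack[-2] = stack[-2], stack[-1]
        elif op == "SUB":       # second-from-top minus top
            top = stack.pop()
            below = stack.pop()
            stack.append(below - top)
        else:
            raise ValueError(f"unknown instruction: {instruction}")
    return stack


memory = {"A": 10, "B": 3}
# B was pushed first, so computing A - B needs a SWAP before SUB.
program = ["PUSH B", "PUSH A", "SWAP", "SUB", "POP C"]
run_stack_program(program, memory)
print(memory["C"])
```

SUB itself names no operands, which is where the good code density comes from, but the extra SWAP is pure overhead caused by the operands arriving in the wrong order.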

Page 99: Chapter 2: Advanced computer Architecture

Accumulator Architecture Pros and Cons

Pros:
- Very low hardware requirements
- Easy to design and understand

Cons:
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic
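The memory-traffic drawback can be made concrete with a toy accumulator machine. This is a minimal sketch, assuming hypothetical LOAD, ADD, and STORE instructions that each name one memory operand; the access counter is an illustrative addition, not something a real machine exposes.

```python
def run_accumulator_program(program, memory):
    """Run a tiny accumulator-machine program.

    Returns the number of data-memory accesses the program made.
    """
    acc = 0
    accesses = 0
    for instruction in program:
        op, _, addr = instruction.partition(" ")
        if op == "LOAD":        # acc <- memory[addr]
            acc = memory[addr]
        elif op == "ADD":       # acc <- acc + memory[addr]
            acc += memory[addr]
        elif op == "STORE":     # memory[addr] <- acc
            memory[addr] = acc
        else:
            raise ValueError(f"unknown instruction: {instruction}")
        accesses += 1           # every instruction here touches memory once
    return accesses


memory = {"A": 1, "B": 2, "C": 3}
# D = A + B + C: every operand has to come from memory through the accumulator.
program = ["LOAD A", "ADD B", "ADD C", "STORE D"]
traffic = run_accumulator_program(program, memory)
print(memory["D"], traffic)
```

Because there is only one register, every instruction in this sketch makes a memory access, and each result must pass through the single accumulator before the next operation can start.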

Page 100: Chapter 2: Advanced computer Architecture

Memory-Memory Architecture Pros and Cons

Pros:
- Requires fewer instructions (especially if three operands)
- Easy to write compilers for (especially if three operands)

Cons:
- Very high memory traffic (especially if three operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required
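The trade-off between three-operand and two-operand memory-memory instructions can be sketched by counting instructions and memory accesses for C = A + B. The instruction names ADD3, ADD2, and MOVE and the per-instruction access counts are assumptions made for illustration only.

```python
def run_memory_memory(program, memory):
    """Execute memory-memory instructions; return the number of memory accesses."""
    accesses = 0
    for op, *operands in program:
        if op == "ADD3":        # ADD3 dst, src1, src2: dst <- src1 + src2
            dst, src1, src2 = operands
            memory[dst] = memory[src1] + memory[src2]
            accesses += 3       # two reads, one write
        elif op == "ADD2":      # ADD2 dst, src: dst <- dst + src
            dst, src = operands
            memory[dst] = memory[dst] + memory[src]
            accesses += 3       # two reads, one write
        elif op == "MOVE":      # MOVE dst, src: dst <- src
            dst, src = operands
            memory[dst] = memory[src]
            accesses += 2       # one read, one write
        else:
            raise ValueError(f"unknown instruction: {op}")
    return accesses


three_operand = [("ADD3", "C", "A", "B")]
two_operand = [("MOVE", "C", "A"), ("ADD2", "C", "B")]

mem3 = {"A": 4, "B": 5}
mem2 = {"A": 4, "B": 5}
accesses3 = run_memory_memory(three_operand, mem3)
accesses2 = run_memory_memory(two_operand, mem2)
print(len(three_operand), accesses3, len(two_operand), accesses2)
```

Both programs leave the same value in C, but in this sketch the two-operand form needs an extra MOVE and more memory accesses, which is the additional data movement the last point refers to.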

Page 101: Chapter 2: Advanced computer Architecture

Memory-Register Architecture Pros and Cons

Pros:
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density

Cons:
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers