CS.305 Computer Architecture
<local.cis.strath.ac.uk/teaching/ug/classes/CS.305>
Computer Abstractions and Technology
Adapted from Computer Organization and Design, Patterson & Hennessy, © 2005, and from slides kindly made available by Dr Mary Jane Irwin, Penn State University.
Instructions: Language of the Computer CS305_03/2
Instruction Sets
Language of the Machine
We'll be working with the MIPS instruction set architecture
• Similar to other architectures developed since the 1980s
• Almost 100 million MIPS processors manufactured in 2002
• Used by NEC, Nintendo, Cisco, Silicon Graphics, Sony, …
MIPS is a RISC
RISC - Reduced Instruction Set Computer
RISC philosophy
• fixed instruction lengths
• load-store instruction sets
• limited addressing modes
• limited operations
MIPS, Sun SPARC, HP PA-RISC, IBM PowerPC, Intel (Compaq) Alpha, …
Instruction sets are measured by how well compilers use them, as opposed to how well assembly language programmers use them
Design goals: speed, cost (design, fabrication, test, packaging), size, power consumption, reliability, memory space (embedded systems)
MIPS R3000 Instruction Set Architecture (ISA)
Instruction Categories
• Computational
• Load/Store
• Jump and Branch
• Floating Point (coprocessor)
• Memory Management
• Special
Registers
• R0 - R31
• PC
• HI
• LO
Instruction Formats
• R - Register
• I - Immediate
• J - Jump
MIPS Instruction Formats
Three basic formats:
R-format: op rs rt rd shamt funct
I-format: op rs rt 16-bit address/number
J-format: op 26-bit address
Simple instructions, all 32 bits wide
Very structured, no unnecessary baggage
Rely on compiler to achieve performance
• what are the compiler's goals? [Suggests another version of the acronym RISC ;-)]
Q: Why only three basic formats?
A: Design Principle #1…
Design Principle #1
Simplicity favours regularity
The fixed width and the limited number of instruction formats keep the hardware simple
This is one example of the first underlying principle of hardware design in action
Registers vs. Memory
[Figure: a processor (control and datapath) connected to memory and to I/O (input and output)]
The operands of arithmetic instructions must be registers
• only 32 registers provided
Compiler associates variables with registers
What about programs with lots of variables?
Memory Organization
Viewed as a large, single-dimension array, with an address.
A memory address is an index into the array
"Byte addressing" means that the index points to a byte of memory.
[Figure: memory drawn as an array of bytes at addresses 0, 1, 2, 3, 4, 5, 6, ..., each holding 8 bits of data]
Memory Organization
Bytes are nice, but most data items use larger "words"
For MIPS, a word is 32 bits or 4 bytes.
2^32 bytes with byte addresses from 0 to 2^32-1
2^30 words with byte addresses 0, 4, 8, ... 2^32-4
Words are aligned
• What are the 2 least significant bits of a word address?
[Figure: memory drawn as an array of words at byte addresses 0, 4, 8, 12, ..., each holding 32 bits of data]
Registers hold 32 bits of data
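The alignment question above can be checked with a few lines of Python. This is only a sketch: the helper names is_word_aligned and word_index are made up for illustration and are not part of MIPS or of any toolchain.

```python
WORD_BYTES = 4  # a MIPS word is 32 bits, i.e. 4 bytes


def is_word_aligned(byte_address: int) -> bool:
    """True when the two least significant address bits are both 0."""
    return (byte_address & 0b11) == 0


def word_index(byte_address: int) -> int:
    """Index of the word that contains this byte (drops the low 2 bits)."""
    return byte_address >> 2


# Byte addresses of the first four words: 0, 4, 8, 12
word_addresses = [i * WORD_BYTES for i in range(4)]
```

Every aligned word address is a multiple of 4, which is why its two low-order bits are always 00.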
Machine Language
Instructions, like registers and words of data, are also 32 bits long
• Example: add $t1,$s1,$s2
• Registers have numbers: $t1=9, $s1=17, $s2=18
Machine language encoding of the add above:
op     rs    rt    rd    shamt funct
000000 10001 10010 01001 00000 100000
Can you guess what the field names, such as 'op', stand for?
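As a sketch of how the six fields pack into one 32-bit word, the Python below rebuilds the encoding of add $t1,$s1,$s2. The function name encode_r is an assumption for this example; the field widths (6, 5, 5, 5, 5, 6 bits) and the field values are the ones given on the slides.

```python
def encode_r(op: int, rs: int, rt: int, rd: int, shamt: int, funct: int) -> int:
    """Pack the R-format fields (6, 5, 5, 5, 5, 6 bits) into a 32-bit word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


# add $t1,$s1,$s2 : op=0, rs=$s1=17, rt=$s2=18, rd=$t1=9, shamt=0, funct=32
word = encode_r(op=0, rs=17, rt=18, rd=9, shamt=0, funct=32)
bits = format(word, "032b")  # the word as a 32-character bit string
```

Note that the destination register, written first in assembly, ends up in the rd field, which is not the first register field of the machine word.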
MIPS Computational Operations
Computational (arithmetic and logical) instructions have 3 operands. Example:
C code: a = b + c
MIPS ‘code’: add a, b, c
(we’ll talk about registers in a bit)
“The natural number of operands for an operation like addition is three…requiring every instruction to have exactly three operands, no more and no less, conforms to the philosophy of keeping the hardware simple”
MIPS Arithmetic Instructions
MIPS assembly language arithmetic statement examples:
add $t0,$s1,$s2
sub $t0,$s1,$s2
Each arithmetic instruction performs only one operation
Each arithmetic instruction fits in 32 bits and specifies exactly three operands
• destination = source1 op source2
Those operands are all contained in the datapath's register file ($t0,$s1,$s2), indicated by $
Operand order is fixed (destination first in the assembly language statement)
MIPS Arithmetic
Remember "Simplicity favours regularity"
Of course this complicates some things...
C code: a = b + c + d;
MIPS 'code': add a, b, c
             add a, a, d
Each register contains 32 bits
Operands must be registers, but only 32 registers available
Q: Why only 32 registers?
A: Design Principle #2…
Design Principle #2
Smaller is Faster
Operands of arithmetic instructions cannot be arbitrary (program) variables; they must come from a limited number of special operands called registers.
One major difference between program variables and registers is the limited number of registers: 32 in MIPS.
A very large number of registers would increase the clock cycle time, as electronic signals take longer the further they have to travel.
This is one illustration of this second underlying principle of hardware design.
Aside: MIPS Register Conventions
Name      Register Number   Usage                                          Preserved on call?
$zero     0                 The constant value 0                           n.a.
$at       1                 Reserved for the assembler                     n.a.
$v0-$v1   2-3               Values for results and expression evaluation   no
$a0-$a3   4-7               Arguments                                      no
$t0-$t7   8-15              Temporaries                                    no
$s0-$s7   16-23             Saved                                          yes
$t8-$t9   24-25             More temporaries                               no
$k0-$k1   26-27             Reserved for the operating system              n.a.
$gp       28                Global pointer                                 yes
$sp       29                Stack pointer                                  yes
$fp       30                Frame pointer                                  yes
$ra       31                Return address                                 yes
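As a rough sketch (not an official tool), the table can be turned into a lookup from register name to register number. The function name build_register_map is an assumption made for this example; the names and numbers are those in the table.

```python
def build_register_map() -> dict:
    """Map each conventional MIPS register name to its number (0 to 31)."""
    names = (
        ["$zero", "$at", "$v0", "$v1"]
        + [f"$a{i}" for i in range(4)]      # $a0-$a3 -> 4-7
        + [f"$t{i}" for i in range(8)]      # $t0-$t7 -> 8-15
        + [f"$s{i}" for i in range(8)]      # $s0-$s7 -> 16-23
        + ["$t8", "$t9", "$k0", "$k1", "$gp", "$sp", "$fp", "$ra"]
    )
    return {name: number for number, name in enumerate(names)}


REGISTERS = build_register_map()
```

An assembler needs exactly this kind of table to fill the 5-bit rs, rt and rd fields from symbolic names.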
Aside: MIPS Register File
Holds thirty-two 32-bit registers
Two read ports and one write port
Registers are
Faster than main memory
• But register files with more locations are slower (e.g., a 64 word file could be as much as 50% slower than a 32 word file)
• Read/write port increase impacts speed quadratically
Easier for a compiler to use
Convenient places to hold variables
• code density improves (since registers are named with fewer bits than a memory location)
[Figure: register file with 32 locations of 32 bits each. Inputs: 5-bit src1 addr, src2 addr and dst addr, 32-bit write data, and a write control signal. Outputs: 32-bit src1 data and src2 data]
Recap: R-format Instructions
op opcode that specifies the operation
rs register file address of the first source operand
rt register file address of the second source operand
rd register file address of the result’s destination
shamt shift amount (for shift instructions)
funct function code augmenting the opcode
op rs rt rd shamt funct
6-bits 5-bits 5-bits 5-bits 5-bits 6-bits
Register Addressing Mode
The register address fields are rs, rt, and rd. Each field is 5-bits wide
op rs rt rd shamt funct
6-bits 5-bits 5-bits 5-bits 5-bits 6-bits
[Figure: register addressing. The rs, rt and rd fields of the instruction each select one register in the register file: Register ($rs), Register ($rt) and Register ($rd)]
Load and Store Instructions
Example:
C code: A[12] = h + A[8];
MIPS code:
lw $t0,32($s3)
add $t0,$s2,$t0
sw $t0,48($s3)
Destination is last in the store word assembly language statement
Remember arithmetic operands are registers, not memory!
Can't write: add 48($s3),$s2,32($s3)
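The offsets 32 and 48 are just the array indices 8 and 12 multiplied by the 4-byte word size. A minimal Python model of that sequence is sketched below; the dictionary-based memory, the base address 1000 and the data values are all made up for illustration.

```python
memory = {}  # toy memory: byte address -> 32-bit word (aligned addresses only)


def sw(value: int, offset: int, base: int) -> None:
    """Store word: Memory[base + offset] = value."""
    address = base + offset
    if address % 4 != 0:
        raise ValueError("word accesses must be 4-byte aligned")
    memory[address] = value & 0xFFFFFFFF


def lw(offset: int, base: int) -> int:
    """Load word: return Memory[base + offset] (0 if never written)."""
    address = base + offset
    if address % 4 != 0:
        raise ValueError("word accesses must be 4-byte aligned")
    return memory.get(address, 0)


s3 = 1000          # hypothetical base address of array A, held in $s3
s2 = 5             # hypothetical value of h, held in $s2
sw(37, 8 * 4, s3)  # set up A[8] = 37 for the demonstration

t0 = lw(32, s3)                # lw  $t0,32($s3)   -> $t0 = A[8]
t0 = (s2 + t0) & 0xFFFFFFFF    # add $t0,$s2,$t0   -> $t0 = h + A[8]
sw(t0, 48, s3)                 # sw  $t0,48($s3)   -> A[12] = $t0
```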
Our First Example
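Can we figure out the code?
void swap(int v[], int k)
{
  int temp;
  temp = v[k];
  v[k] = v[k+1];
  v[k+1] = temp;
}
swap:
  muli $2,$5,4    # $2 = k * 4 (muli is not a real MIPS instruction; sll $2,$5,2 has the same effect)
  add  $2,$4,$2   # $2 = address of v[k]
  lw   $15,0($2)  # $15 = v[k]
  lw   $16,4($2)  # $16 = v[k+1]
  sw   $16,0($2)  # v[k] = old v[k+1]
  sw   $15,4($2)  # v[k+1] = old v[k]
  jr   $31        # return to caller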
So far we’ve learned:
MIPS
• loading words but addressing bytes
• arithmetic on registers only
Instruction          Meaning
add $s1,$s2,$s3      $s1 = $s2 + $s3
sub $s1,$s2,$s3      $s1 = $s2 - $s3
lw $s1,100($s2)      $s1 = Memory[$s2+100]
sw $s1,100($s2)      Memory[$s2+100] = $s1
MIPS Load/Store Instruction Format
Consider load-word and store-word instructions and the design principle Simplicity Favours Regularity…
…so use another (existing) type of (32-bit) instruction format other than R-type: I-type for data transfer instructions
Example: lw $t0,32($s2)
op   rs   rt   16-bit number/offset
35   18   8    32
Q: Why only a 16-bit number/offset?
A: Design Principle #3…
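The I-format packing can be sketched in Python in the same way as the R-format. The names encode_i and decode_i are assumptions for this example; the sketch takes $t0 as register 8 and $s2 as register 18, as in the register conventions table, and 35 as the lw opcode.

```python
def encode_i(op: int, rs: int, rt: int, imm: int) -> int:
    """Pack the I-format fields (6, 5, 5, 16 bits) into a 32-bit word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


def decode_i(word: int) -> tuple:
    """Split a 32-bit word back into (op, rs, rt, imm) as unsigned fields."""
    return (word >> 26) & 0x3F, (word >> 21) & 0x1F, (word >> 16) & 0x1F, word & 0xFFFF


# lw $t0,32($s2) : op=35, rs=$s2=18 (base), rt=$t0=8 (destination), offset=32
lw_word = encode_i(op=35, rs=18, rt=8, imm=32)
```

For a load, the rt field names the destination register, so only 16 bits remain for the offset once op, rs and rt have taken 6 + 5 + 5 bits.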
Design Principle #3
Good Design Demands Good Compromises
A single (R-type) instruction format is not well suited to instructions, like lw and sw, that specify an address as well as register operands. If the address were to be allocated to one of the 5-bit fields, say, then such instructions could only address 32 (2^5) words!
The conflict between having instructions all the same length and the desire to have a single format leads to this third underlying principle of hardware design.
One compromise in MIPS is to have a small number of different fixed-width instruction formats rather than instructions of varying length. Multiple formats do complicate the hardware, but the complexity can be minimised by keeping them similar.
MIPS Load/Store Memory Addressing
MIPS has two basic data transfer instructions for accessing memory:
lw $t0, 4($s3) # load word from memory
sw $t0, 8($s3) # store word to memory
Data is loaded into (lw) or stored from (sw) a register in the register file, selected by a 5-bit address
The memory address, a 32-bit address, is formed by adding the contents of a base address register to an offset value
• A 16-bit field means access is limited to memory locations within a region of ±2^13 or 8,192 words (±2^15 or 32,768 bytes) of the address in the base register
• Note that the offset can be positive or negative
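A small Python sketch of the address calculation, assuming the 16-bit offset is treated as a two's complement number and sign-extended before the add. The function names and the base address used in the checks are illustrative only.

```python
def sign_extend_16(field: int) -> int:
    """Interpret a 16-bit field as a two's complement signed integer."""
    field &= 0xFFFF
    return field - 0x10000 if field & 0x8000 else field


def effective_address(base: int, offset_field: int) -> int:
    """Base register contents plus sign-extended offset, wrapped to 32 bits."""
    return (base + sign_extend_16(offset_field)) & 0xFFFFFFFF


# Smallest and largest byte offsets a 16-bit field can express
MIN_OFFSET = sign_extend_16(0x8000)   # -2^15
MAX_OFFSET = sign_extend_16(0x7FFF)   # +2^15 - 1
```

The byte range of -32,768 to +32,767 around the base register corresponds to 8,192 words on either side.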
Base (displacement) Addressing Mode
Base (displacement) addressing – operand is at the memory location whose address is the sum of a register and a 16-bit constant contained within the instruction
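[Figure: base addressing. The offset field of the instruction (op rs rt offset) is added to the contents of Register ($rs) to give the address of a byte, halfword or word in memory]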
Stored Program Concept
Instructions are bits
Programs are stored in memory to be read or written just like data
Fetch & Execute Cycle
• Instructions are fetched and put into a special register
• Bits in the register "control" the subsequent actions
• Fetch the "next" instruction and continue
Control
Decision making instructions alter the control flow, i.e., change the "next" instruction to be executed
MIPS conditional branch instructions:
bne $t0,$t1,Label
beq $t0,$t1,Label
Example:
C                 MIPS
if (i==j)         bne $s0,$s1,Label
  h = i + j;      add $s3,$s0,$s1
                  Label: ....
Control
MIPS unconditional branch instructions:
j label
Example:
C                 MIPS
if (i!=j)         beq $s4,$s5,Lab1
  h=i+j;          add $s3,$s4,$s5
else              j Lab2
  h=i-j;          Lab1: sub $s3,$s4,$s5
                  Lab2: ...
Can you build a simple for loop?
Recap:
Instruction          Meaning
add $s1,$s2,$s3      $s1 = $s2 + $s3
sub $s1,$s2,$s3      $s1 = $s2 - $s3
lw $s1,100($s2)      $s1 = Memory[$s2+100]
sw $s1,100($s2)      Memory[$s2+100] = $s1
bne $s4,$s5,Label    Next ins. at Label if $s4≠$s5
beq $s4,$s5,Label    Next ins. at Label if $s4=$s5
j Label              Next ins. at Label
Formats:
R-format op rs rt rd shamt funct
I-format op rs rt 16-bit address/number
J-format op 26-bit address
More Control 'Instructions'
We have: beq, bne, what about blt (branch-if-less-than)?
New instruction:
slt $t0,$s1,$s2
• if $s1 < $s2 then $t0 = 1 else $t0 = 0
Can use slt to synthesise "blt $s1,$s2,Label", so general control structures can now be built
Note that the assembler needs a register to do this, $at
Constants
Constant (immediate) operands are frequently used in programs, e.g.,
A = A + 4;
B = B + 1;
C = C - 16;
Possible approaches?
• put 'typical constants' in memory and load them.
• create hard-wired registers (like $zero) for constants like 1.
• have special instructions that contain constants!
Note: small constants are very common (>50% of operands)
Q: Which instruction format(s) to use?
A: See Design Principle #4…
Design Principle #4
Make the Common Case Fast
Analysis of a large variety of compiled programs reveals that the vast majority of constants used are quite small numbers: >90% fall within the range of a 16-bit two's complement integer.
The obvious choice is to use the I-type format for instructions that have this most common kind of constant as an operand.
Hence, these typical MIPS 'Immediate' instructions:
addi $sp,$sp,4 #$sp = $sp + 4
slti $t0,$s2,15 #$t0 = 1 if $s2<15
What about larger constants?
There must be a way to 'load' a 32-bit constant into a register. Compromise by using two instructions:
"Load Upper Immediate" (lui) instruction:
lui $t0,1010101010101010b
$t0  1010101010101010 0000000000000000   (lower half zero filled)
Followed by a "logical or" (ori) instruction:
ori $t0,$t0,1010101010101010b
$t0  1010101010101010 0000000000000000
ori  0000000000000000 1010101010101010
$t0  1010101010101010 1010101010101010
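The two-step construction can be mimicked in Python as a quick check. The functions lui, ori and load_constant below are toy models written for this example (load_constant in particular is a hypothetical helper), not the behaviour of any particular assembler.

```python
def lui(imm16: int) -> int:
    """Load upper immediate: imm16 goes in the top half, low half is zero."""
    return (imm16 & 0xFFFF) << 16


def ori(register: int, imm16: int) -> int:
    """Or immediate: the 16-bit constant is zero-extended, then ORed in."""
    return register | (imm16 & 0xFFFF)


def load_constant(value: int) -> int:
    """Build any 32-bit constant from its upper and lower 16-bit halves."""
    upper = (value >> 16) & 0xFFFF
    lower = value & 0xFFFF
    return ori(lui(upper), lower)


t0 = lui(0b1010101010101010)        # lui $t0,1010101010101010b
after_lui = t0
t0 = ori(t0, 0b1010101010101010)    # ori $t0,$t0,1010101010101010b
```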
Assembly Language vs. Machine Language
Assembly provides convenient symbolic representation
• much easier than writing down numbers
• e.g., destination first
Machine language is the underlying reality
• e.g., destination is no longer first
Assembly can provide 'pseudoinstructions'
• e.g., "move $t0,$t1" exists only in Assembly
• would be implemented by "add $t0,$t1,$zero"
When considering performance you should count real instructions
Addresses in Branches and Jumps
Instructions:
bne $t4,$t5,Label    Next instruction is at Label if $t4≠$t5
beq $t4,$t5,Label    Next instruction is at Label if $t4=$t5
j Label              Next instruction is at Label
Formats:
I-format: op rs rt 16-bit address
J-format: op 26-bit address
Addresses are not 32 bits
• How do we handle this with load and store instructions?
Addresses in Branches
Instructions:
bne $t4,$t5,Label    Next instruction is at Label if $t4≠$t5
beq $t4,$t5,Label    Next instruction is at Label if $t4=$t5
Format:
I-format: op rs rt 16-bit address (offset)
Could specify a register (like lw and sw did) and add it to address (offset).
Q: Which register?
A: Instruction Address Register (aka Program Counter - PC)
[Figure: PC-relative addressing. The offset field of the instruction (op rs rt offset) is combined with the Program Counter (PC) to give the branch destination]
Specifying Branch Destinations
Why PC?
its use is automatically implied by instruction
• PC gets updated (PC+4) during the fetch cycle so that it holds the address of the next instruction
limits the branch distance to -2^15 to +2^15-1 instructions from the (instruction after the) branch instruction, but most branches are local anyway (Principle of Locality).
[Figure: branch destination address calculation. The low order 16 bits of the branch instruction are sign-extended to 32 bits, 00 is appended (a shift left by two), and the result is added to PC+4 to give the 32-bit branch destination address]
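The calculation described above (sign-extend the 16-bit field, append 00, add to PC+4) can be sketched in Python as follows. The function names and the example addresses are assumptions made for illustration.

```python
def sign_extend_16(field: int) -> int:
    """Interpret a 16-bit field as a two's complement signed integer."""
    field &= 0xFFFF
    return field - 0x10000 if field & 0x8000 else field


def branch_target(branch_pc: int, offset_field: int) -> int:
    """Destination = (PC + 4) + (sign-extended offset << 2), wrapped to 32 bits."""
    return (branch_pc + 4 + (sign_extend_16(offset_field) << 2)) & 0xFFFFFFFF


def branch_offset(branch_pc: int, target: int) -> int:
    """16-bit field an assembler would store to reach target from branch_pc."""
    delta = target - (branch_pc + 4)
    if delta % 4 != 0:
        raise ValueError("branch targets must be word aligned")
    words = delta // 4  # the offset counts instructions, not bytes
    if not -2**15 <= words < 2**15:
        raise ValueError("target out of range for a 16-bit offset")
    return words & 0xFFFF
```

Counting in instructions rather than bytes is what stretches the 16-bit field to a range of 2^15 instructions in each direction.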
Addresses in Jumps
Instruction:
j Label    Next instruction is at Label
Format:
J-format: op 26-bit address
Jump instructions just use high order bits of PC
• A compromise: such jumps are limited by address boundaries of 256 MB, i.e. within blocks of 2^26 instructions.
[Figure: jump destination address formation. The upper 4 bits of the PC are concatenated with the low order 26 bits of the jump instruction and 00 to form the 32-bit address]
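A Python sketch of the concatenation in the figure, assuming the upper 4 bits are taken from PC+4 (the already-incremented PC). The name jump_target and the example addresses are illustrative only.

```python
def jump_target(jump_pc: int, address_field: int) -> int:
    """Upper 4 bits of PC+4, then the 26-bit field, then two zero bits."""
    upper4 = (jump_pc + 4) & 0xF0000000
    return upper4 | ((address_field & 0x03FFFFFF) << 2)


REGION_BYTES = (1 << 26) * 4  # 2^26 instructions of 4 bytes each
```

With the 26-bit field set to 2500, the low part of the address is 2500 * 4 = 10000, which matches the "j 2500 means go to 10000" convention used in the summary table.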
MIPS ISA So Far
Category                      Instr                        Op Code    Example              Meaning
Arithmetic (R & I format)     add                          0 and 32   add $s1, $s2, $s3    $s1 = $s2 + $s3
                              subtract                     0 and 34   sub $s1, $s2, $s3    $s1 = $s2 - $s3
                              add immediate                8          addi $s1, $s2, 6     $s1 = $s2 + 6
                              or immediate                 13         ori $s1, $s2, 6      $s1 = $s2 | 6
Data Transfer (I format)      load word                    35         lw $s1, 24($s2)      $s1 = Memory($s2+24)
                              store word                   43         sw $s1, 24($s2)      Memory($s2+24) = $s1
                              load byte                    32         lb $s1, 25($s2)      $s1 = Memory($s2+25)
                              store byte                   40         sb $s1, 25($s2)      Memory($s2+25) = $s1
                              load upper imm               15         lui $s1, 6           $s1 = 6 * 2^16
Cond. Branch (I & R format)   br on equal                  4          beq $s1, $s2, L      if ($s1==$s2) go to L
                              br on not equal              5          bne $s1, $s2, L      if ($s1 !=$s2) go to L
                              set on less than             0 and 42   slt $s1, $s2, $s3    if ($s2<$s3) $s1=1 else $s1=0
                              set on less than immediate   10         slti $s1, $s2, 6     if ($s2<6) $s1=1 else $s1=0
Uncond. Jump (J & R format)   jump                         2          j 2500               go to 10000
                              jump register                0 and 8    jr $t1               go to $t1
                              jump and link                3          jal 2500             go to 10000; $ra=PC+4
Review of MIPS Operand Addressing Modes
Register addressing – operand is in a register
Base (displacement) addressing – operand is at the memory location whose address is the sum of a register and a 16-bit constant contained within the instruction
• Register relative (indirect) with 0($a0)
• Pseudo-direct with addr($zero)
Immediate addressing – operand is a 16-bit constant contained within the instruction
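[Figure: the three operand addressing modes. Register: op rs rt rd funct, with the word operand in a register. Base: op rs rt offset, where the base register plus the offset gives the address of a word or byte operand in memory. Immediate: op rs rt operand]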
Review of MIPS Instruction Addressing Modes
PC-relative addressing – instruction address is the sum of the PC and a 16-bit constant contained within the instruction
Pseudo-direct addressing – instruction address is the 26-bit constant contained within the instruction concatenated with the upper 4 bits of the PC
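[Figure: the two instruction addressing modes. PC-relative: op rs rt offset, where the Program Counter (PC) plus the offset gives the address of the branch destination instruction in memory. Pseudo-direct: op jump address, where the upper bits of the Program Counter (PC) are concatenated (||) with the jump address to give the address of the jump destination instruction in memory]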
Instructions for Accessing Procedures
MIPS 'procedure call' instruction:
jal ProcedureAddress #jump and link
Saves PC+4 in register $ra ($31) to have a link to the next instruction for the procedure return
Instruction format (J-format):
jal: op = 000011, 26-bit address
Then can do procedure 'return' with a
jr $ra #return
Instruction format (R-format):
jr: op = 000000, rs, funct = 001000 (fields laid out as op rs rt rd shamt funct)
Aside: Spilling Registers
What if the callee needs more registers and/or the procedure is recursive?
• use a stack (a last-in-first-out queue) in memory for passing additional values or saving (recursive) return address(es)
One of the general registers, $sp, is used to address the stack (which "grows" from high address to low address)
Push (a register onto the stack):
subi $sp,$sp,4
sw $ra,0($sp)
Pop (a register off the stack):
lw $ra,0($sp)
addi $sp,$sp,4
[Figure: the stack in memory, running between low addr and high addr, with $sp pointing to the top of stack]
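The push and pop sequences can be modelled with a toy stack in Python. This is a sketch under stated assumptions: the class name Stack, the starting stack pointer and the return addresses are all invented for illustration. Note also that real MIPS hardware has no subi instruction, so an assembler would emit addi $sp,$sp,-4 for the push adjustment.

```python
class Stack:
    """Toy model of the MIPS stack: it grows from high addresses to low."""

    def __init__(self, initial_sp: int = 0x7FFFFFFC):  # hypothetical start value
        self.sp = initial_sp
        self.memory = {}  # byte address -> 32-bit word

    def push(self, value: int) -> None:
        self.sp -= 4                    # subi $sp,$sp,4 (i.e. addi $sp,$sp,-4)
        self.memory[self.sp] = value    # sw   value,0($sp)

    def pop(self) -> int:
        value = self.memory[self.sp]    # lw   value,0($sp)
        self.sp += 4                    # addi $sp,$sp,4
        return value


stack = Stack()
start_sp = stack.sp
stack.push(0x00400020)   # made-up return address saved by procedure B
stack.push(0x00400100)   # made-up return address saved by a nested call
first_popped = stack.pop()
second_popped = stack.pop()
```

The last address pushed is the first one popped, which is what lets nested or recursive calls unwind in the right order.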
Example: Nested Procedure Calls - MIPS code
A: ...
   ...
   jal B            # Call B, save return addr in $ra
   ...
B: ...
   ...              # Get ready to call C
   subi $sp,$sp,4   # Adjust ToS to make room to...
   sw $ra,0($sp)    # ...'push' the old return addr
   jal C            # Call C, save return addr in $31
   lw $ra,0($sp)    # Restore B's return address...
   addi $sp,$sp,4   # ...and re-adjust ToS ('pop')
   ...
   jr $ra           # Return to proc that called B
C: ...
   ...
   jr $ra           # Return to proc that called C
Passing Parameters to Procedures
Conventions for passing parameters - arguments - may vary from machine to machine, language to language, and even compiler to compiler.
MIPS uses $4 to $7 ($a0-$a3) to pass arguments.
There must also be a convention for preserving registers across procedure calls. The two usual conventions are:
• Caller save. The calling procedure (caller) has the responsibility for preserving affected registers. The called procedure (callee) can then modify any registers without constraint.
• Callee save. The callee has the responsibility for saving and restoring any registers that it might use. The calling procedure (caller) uses registers without worrying about their preservation.
MIPS Pseudoinstructions
In keeping with design principles the MIPS ISA does not contain complex instructions as these could compromise the performance of all instructions.
However, a MIPS compiler/assembler can synthesise 'pseudoinstructions' from common variations of real instructions. Such pseudoinstructions simplify translation and programming.
Pseudoinstructions give MIPS a richer set of assembly language instructions than those implemented by hardware.
The assembler reserves one register, $at, that is used in the synthesis of many pseudoinstructions.
For example…
Example MIPS Pseudoinstructions
Pseudoinstruction        Real MIPS
move $t0,$t1             add $t0,$t1,$zero
clear $s0                add $s0,$zero,$zero
blt $s1,$s2,label        slt $at,$s1,$s2
                         bne $at,$zero,label
bge $s1,$s2,label        slt $at,$s1,$s2
                         beq $at,$zero,label
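The table can be read as a set of rewrite rules. The Python below is a toy expander covering only these four pseudoinstructions; the function name expand and its string-based interface are assumptions for this sketch, and a real assembler handles many more cases.

```python
def expand(instruction: str) -> list:
    """Rewrite one pseudoinstruction into real MIPS instructions.

    Anything that is not one of the four known pseudoinstructions is
    returned unchanged as a single-element list.
    """
    mnemonic, _, rest = instruction.partition(" ")
    operands = [operand.strip() for operand in rest.split(",")]
    if mnemonic == "move":
        rd, rs = operands
        return [f"add {rd},{rs},$zero"]
    if mnemonic == "clear":
        (rd,) = operands
        return [f"add {rd},$zero,$zero"]
    if mnemonic == "blt":
        rs, rt, label = operands
        # $at = 1 if rs < rt; branch when $at is non-zero
        return [f"slt $at,{rs},{rt}", f"bne $at,$zero,{label}"]
    if mnemonic == "bge":
        rs, rt, label = operands
        # $at = 0 if rs >= rt; branch when $at is zero
        return [f"slt $at,{rs},{rt}", f"beq $at,$zero,{label}"]
    return [instruction]
```

Each branch pseudoinstruction costs two real instructions, which is why performance counts should be made on the expanded code.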
Summary of MIPS (RISC) Design Principles
Simplicity favours regularity
• fixed size instructions: 32 bits
• small number of instruction formats
• opcode always the first 6 bits
Good design demands good compromises
• three instruction formats
Smaller is faster
• limited instruction set
• limited number of registers in register file
• limited number of addressing modes
Make the common case fast
• arithmetic operands from the register file (load-store machine)
• allow instructions to contain immediate operands
Fallacies and Pitfalls
Fallacy: More powerful instructions mean higher performance.
• Such instructions often do more work than is required in the frequent case or don't match the requirements of the language.
Pitfall: To obtain the highest performance, write in assembly language.
• The increasing sophistication of modern compilers means that the gap between compiled code and 'hand-crafted' code is closing fast.
• Even if the gap isn't closed completely, the drawbacks of writing in assembly language are longer time spent coding and debugging, the loss in portability, and difficulty of maintenance.
Pitfall: Forgetting that sequential word addresses in memory differ by 4, not by 1.