COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface
5th Edition
Chapter 3 Arithmetic for Computers
Chapter 3 — Arithmetic for Computers — 2
Arithmetic for Computers
Operations on integers
  Addition and subtraction
  Multiplication and division
  Dealing with overflow
Floating-point real numbers
  Representation and operations
§3.1 Introduction
Integer Addition Example: 7 + 6
§3.2 Addition and Subtraction
Overflow if result out of range
  Adding +ve and –ve operands: no overflow
  Adding two +ve operands: overflow if result sign is 1
  Adding two –ve operands: overflow if result sign is 0
Integer Subtraction
Add negation of second operand
Example: 7 – 6 = 7 + (–6)
  +7: 0000 0000 … 0000 0111
  –6: 1111 1111 … 1111 1010
  +1: 0000 0000 … 0000 0001
Overflow if result out of range
  Subtracting two +ve or two –ve operands: no overflow
  Subtracting +ve from –ve operand: overflow if result sign is 0
  Subtracting –ve from +ve operand: overflow if result sign is 1
Dealing with Overflow
Overflow occurs when the result of an operation cannot be represented in 32 bits, i.e., when the sign bit contains a value bit of the result and not the proper sign bit
When adding operands with different signs or when subtracting operands with the same sign, overflow can never occur

Operation   Operand A   Operand B   Result indicating overflow
A + B       ≥ 0         ≥ 0         < 0
A + B       < 0         < 0         ≥ 0
A - B       ≥ 0         < 0         < 0
A - B       < 0         ≥ 0         ≥ 0
MIPS signals overflow with an exception (aka interrupt) – an unscheduled procedure call where the EPC contains the address of the instruction that caused the exception
2’s Comp - Detecting Overflow
When adding two's complement numbers, overflow will only occur if
  the numbers being added have the same sign
  the sign of the result is different
If we perform the addition

    a_{n-1} a_{n-2} ... a_1 a_0
  + b_{n-1} b_{n-2} ... b_1 b_0
  -----------------------------
    s_{n-1} s_{n-2} ... s_1 s_0

Overflow can be detected as
  $V = a_{n-1} \cdot b_{n-1} \cdot \overline{s_{n-1}} + \overline{a_{n-1}} \cdot \overline{b_{n-1}} \cdot s_{n-1}$
Overflow can also be detected as
  $V = c_n \oplus c_{n-1}$
where $c_{n-1}$ and $c_n$ are the carry in and carry out of the most significant bit
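The two detection rules above can be checked against each other with a short sketch. The function name `add_with_overflow` and the choice to pass operands as n-bit patterns are assumptions made for illustration, not part of the slides.

```python
def add_with_overflow(a, b, n=32):
    """Add two n-bit two's complement patterns.

    Returns (sum_bits, v_sign, v_carry), where v_sign uses the sign rule
    and v_carry uses the carry rule for overflow detection.
    """
    mask = (1 << n) - 1
    a &= mask
    b &= mask
    s = (a + b) & mask
    msb = n - 1
    a_s, b_s, s_s = (a >> msb) & 1, (b >> msb) & 1, (s >> msb) & 1
    # Sign rule: V = a.b.(not s) + (not a).(not b).s on the sign bits
    v_sign = (a_s & b_s & (s_s ^ 1)) | ((a_s ^ 1) & (b_s ^ 1) & s_s)
    # Carry rule: V = c_n XOR c_{n-1}
    c_out = (a + b) >> n                      # carry out of the MSB
    low = (1 << msb) - 1
    c_in = ((a & low) + (b & low)) >> msb     # carry into the MSB
    v_carry = c_out ^ c_in
    return s, v_sign, v_carry
```

With 4-bit operands, `add_with_overflow(0b0111, 0b0110, 4)` models 7 + 6, which does not fit in 4 signed bits, so both flags come back as 1.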
Unsigned - Detecting Overflow
For unsigned numbers, overflow occurs if there is carry out of the most significant bit
  $V = c_n$
For example,

      1001  = 9
    + 1000  = 8
    ------
    1 0001  = 1 (with a carry out of the most significant bit)

With the MIPS architecture
  Overflow exceptions occur for two’s complement arithmetic: add, sub, addi
  Overflow exceptions do not occur for unsigned arithmetic: addu, subu, addiu
Dealing with Overflow
Some languages (e.g., C) ignore overflow
  Use MIPS addu, addiu, subu instructions
Other languages (e.g., Ada, Fortran) require raising an exception
  Use MIPS add, addi, sub instructions
  On overflow, invoke exception handler
    Save PC in exception program counter (EPC) register
    Jump to predefined handler address
    mfc0 (move from coprocessor reg) instruction can retrieve EPC value, to return after corrective action
MIPS Arithmetic Logic Unit (ALU)
Must support the Arithmetic/Logic operations of the ISA
  add, addi, addiu, addu
  sub, subu
  mult, multu, div, divu
  sqrt
  and, andi, nor, or, ori, xor, xori
  beq, bne, slt, slti, sltiu, sltu
[Figure: ALU block with 32-bit inputs A and B, a 4-bit operation select m, a 32-bit result output, and 1-bit zero and ovf outputs]
With special handling for
  sign extend: addi, addiu, slti, sltiu
  zero extend: andi, ori, xori
  overflow detection: add, addi, sub
Arithmetic for Multimedia
Graphics and media processing operates on vectors of 8-bit and 16-bit data
  Use 64-bit adder, with partitioned carry chain
    Operate on 8×8-bit, 4×16-bit, or 2×32-bit vectors
  SIMD (single-instruction, multiple-data)
Saturating operations
  On overflow, result is largest representable value
  e.g., clipping in audio, saturation in video
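As a rough illustration of a partitioned, saturating add, the sketch below treats a 64-bit word as eight unsigned 8-bit lanes. The function name `saturating_add_u8x8` and the choice of unsigned lanes are assumptions; a real SIMD unit does all lanes in one step, whereas this loop only models the result.

```python
def saturating_add_u8x8(x, y):
    """Add two 64-bit words as eight independent unsigned 8-bit lanes.

    Each lane clamps at 255 instead of wrapping, and no carry crosses
    a lane boundary (the partitioned carry chain).
    """
    result = 0
    for lane in range(8):
        shift = 8 * lane
        a = (x >> shift) & 0xFF
        b = (y >> shift) & 0xFF
        total = min(a + b, 0xFF)   # saturate to the largest representable value
        result |= total << shift
    return result
```

For example, adding lanes 0xF0 and 0x20 gives 0xFF rather than the wrapped value 0x10.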
Multiply
Binary multiplication is a sequence of shifts and adds
[Figure: an n-bit multiplicand and an n-bit multiplier form a partial product array, which sums to a 2n-bit double precision product]
The partial product array can be formed in parallel and added in parallel for faster multiplication
Multiplication
Start with long-multiplication approach

        1000     multiplicand
      × 1001     multiplier
      ------
        1000
       0000
      0000
     1000
     -------
     1001000     product

Length of product is the sum of operand lengths
§3.3 Multiplication
Multiplication Hardware
[Figure: multiplier datapath; the product register is initially 0 and the multiplicand is zero extended (unsigned mul)]
Multiplication Example (Version 1)
Consider: $1100_2 \times 1101_2$, Product = $10011100_2$
4-bit multiplicand and multiplier are used in this example
Multiplicand is zero extended because it is unsigned

Iteration  Step                                  Multiplicand  Multiplier  Product
0          Initialize                            00001100      1101        00000000
1          Multiplier[0] = 1 => ADD              00001100      1101        00001100
           SLL Multiplicand and SRL Multiplier   00011000      0110        00001100
2          Multiplier[0] = 0 => Do Nothing       00011000      0110        00001100
           SLL Multiplicand and SRL Multiplier   00110000      0011        00001100
3          Multiplier[0] = 1 => ADD              00110000      0011        00111100
           SLL Multiplicand and SRL Multiplier   01100000      0001        00111100
4          Multiplier[0] = 1 => ADD              01100000      0001        10011100
           SLL Multiplicand and SRL Multiplier   11000000      0000        10011100
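The Version 1 procedure can be sketched in a few lines. This is a software model only; the name `multiply_v1` and the `n` parameter (operand width) are assumptions introduced for the example.

```python
def multiply_v1(multiplicand, multiplier, n=32):
    """Unsigned shift-and-add multiply, Version 1 style.

    The multiplicand lives in a 2n-bit register and shifts left;
    the multiplier shifts right; the 2n-bit product starts at 0.
    """
    narrow = (1 << n) - 1
    wide = (1 << (2 * n)) - 1
    mcand = multiplicand & narrow     # zero extended into 2n bits
    mplier = multiplier & narrow
    product = 0
    for _ in range(n):
        if mplier & 1:                # Multiplier[0] = 1 => ADD
            product = (product + mcand) & wide
        mcand = (mcand << 1) & wide   # SLL Multiplicand
        mplier >>= 1                  # SRL Multiplier
    return product
```

Calling `multiply_v1(0b1100, 0b1101, 4)` follows the same four iterations as the example and returns `0b10011100`.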
Observation on Version 1 of Multiply
Hardware in version 1 can be optimized
  Rather than shifting the multiplicand to the left, shift the product to the right
  Has same net effect and produces same results
Reduce Hardware
  Multiplicand register can be reduced to 32 bits
  We can also reduce the adder size to 32 bits
One cycle per iteration
  Shifting and addition can be done simultaneously
Version 2 of Multiplication Hardware
Product = HI and LO registers, HI = 0
Product is shifted right
Reduced 32-bit Multiplicand & Adder

Flowchart:
  Start
  1. Test Multiplier[0]
       = 1: HI = HI + Multiplicand
       = 0: no addition
  2. Shift Product = (HI, LO) Right 1 bit; Shift Multiplier Right 1 bit
  3. 32nd Repetition? No: go back to step 1. Yes: Done

[Figure: datapath with a 32-bit Multiplicand register, a 32-bit ALU producing a 33-bit sum (32 bits plus carry), a 64-bit HI/LO product register with add, write, and shift right controls, and a 32-bit Multiplier register that shifts right and feeds Multiplier[0] to Control]
Refined Version of Multiply Hardware
Eliminate Multiplier Register
Initialize LO = Multiplier, HI = 0
Product = HI and LO registers

Flowchart:
  Start: LO = Multiplier, HI = 0
  1. Test LO[0]
       = 1: HI = HI + Multiplicand
       = 0: no addition
  2. Shift Product = (HI, LO) Right 1 bit
  3. 32nd Repetition? No: go back to step 1. Yes: Done

[Figure: datapath with a 32-bit Multiplicand register, a 32-bit ALU producing a 33-bit sum (32 bits plus carry), and a 64-bit HI/LO register with add, write, and shift right controls; LO[0] feeds Control]
Multiply Example (Refined Version)
Consider: $1100_2 \times 1101_2$, Product = $10011100_2$
4-bit multiplicand and multiplier are used in this example
4-bit adder produces a 5-bit sum (with carry)
Multiplicand = 1100

Iteration  Step                                 Carry  Product = HI, LO
0          Initialize (LO = Multiplier, HI=0)          0000 1101
1          LO[0] = 1 => ADD                     0      1100 1101
           Shift Right Product = (HI, LO)              0110 0110
2          LO[0] = 0 => Do Nothing                     0110 0110
           Shift Right Product = (HI, LO)              0011 0011
3          LO[0] = 1 => ADD                     0      1111 0011
           Shift Right Product = (HI, LO)              0111 1001
4          LO[0] = 1 => ADD                     1      0011 1001
           Shift Right Product = (HI, LO)              1001 1100
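A minimal software model of the refined scheme, assuming a hypothetical function name `multiply_refined` and an operand width parameter `n`, might look like this. The carry out of the n-bit adder is shifted into the top of HI, as in the Carry column of the example.

```python
def multiply_refined(multiplicand, multiplier, n=32):
    """Unsigned multiply using only HI and LO; returns (HI, LO)."""
    mask = (1 << n) - 1
    mcand = multiplicand & mask
    hi, lo = 0, multiplier & mask          # LO = Multiplier, HI = 0
    for _ in range(n):
        carry = 0
        if lo & 1:                         # LO[0] = 1 => ADD
            total = hi + mcand             # n-bit adder, (n+1)-bit sum
            carry, hi = total >> n, total & mask
        # Shift (carry, HI, LO) right by 1 bit
        lo = (lo >> 1) | ((hi & 1) << (n - 1))
        hi = (hi >> 1) | (carry << (n - 1))
    return hi, lo
```

`multiply_refined(0b1100, 0b1101, 4)` returns `(0b1001, 0b1100)`, the HI and LO halves of the product in the example.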
Faster Multiplier
Uses multiple adders
  Cost/performance tradeoff
Can be pipelined
  Several multiplications performed in parallel
MIPS Multiplication
Two 32-bit registers for product
  HI: most-significant 32 bits
  LO: least-significant 32 bits
Instructions
  mult rs, rt / multu rs, rt
    64-bit product in HI/LO
  mfhi rd / mflo rd
    Move from HI/LO to rd
    Can test HI value to see if product overflows 32 bits
  mul rd, rs, rt
    Least-significant 32 bits of product → rd
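The HI/LO split and the HI test can be modelled as below for the unsigned case. The Python function names `multu` and `overflows_32_bits` are illustrative stand-ins, not MIPS syntax, and the signed `mult` case is not modelled here.

```python
def multu(rs, rt):
    """Model an unsigned 32 x 32 multiply; returns the (HI, LO) halves."""
    product = (rs & 0xFFFFFFFF) * (rt & 0xFFFFFFFF)
    return product >> 32, product & 0xFFFFFFFF


def overflows_32_bits(rs, rt):
    """For unsigned operands, the product fits in 32 bits only if HI is 0."""
    hi, _ = multu(rs, rt)
    return hi != 0
```

For instance, `multu(0xFFFFFFFF, 2)` gives HI = 1 and LO = 0xFFFFFFFE, so the product does not fit in a single 32-bit register.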
Unsigned Division (Paper & Pencil)
[Figure: worked paper-and-pencil long division of two unsigned binary numbers]
Division
Check for 0 divisor
Long division approach
  If divisor ≤ dividend bits
    1 bit in quotient, subtract
  Otherwise
    0 bit in quotient, bring down next dividend bit
Restoring division
  Do the subtract, and if remainder goes < 0, add divisor back
Signed division
  Divide using absolute values
  Adjust sign of quotient and remainder as required

              1001     quotient
    1000 ) 1001010     divisor ) dividend
          -1000
              10
              101
              1010
             -1000
                10     remainder

n-bit operands yield n-bit quotient and remainder
§3.4 Division
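The signed-division recipe above (divide absolute values, then fix the signs) can be sketched as follows. The function name `signed_divide` is hypothetical, and the rule that the remainder takes the sign of the dividend is an assumption stated here for concreteness; the slide only says to adjust signs "as required".

```python
def signed_divide(dividend, divisor):
    """Signed division via absolute values; returns (quotient, remainder)."""
    if divisor == 0:
        raise ZeroDivisionError("check for 0 divisor before dividing")
    q, r = divmod(abs(dividend), abs(divisor))   # divide using absolute values
    if (dividend < 0) != (divisor < 0):
        q = -q        # quotient is negative when operand signs differ
    if dividend < 0:
        r = -r        # assumption: remainder takes the sign of the dividend
    return q, r
```

Under that assumption, dividend = quotient × divisor + remainder holds for all four sign combinations, for example `signed_divide(-7, 2)` gives `(-3, -1)`.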
Division Hardware
[Figure: divider datapath, annotated "Initially dividend" and "Initially divisor in left half"]
Division Example
Dividend = 0111, Divisor = 0010
Optimized Divider
Uses two registers: HI and LO
Initialize: HI = Remainder = 0 and LO = Dividend
Shift (HI, LO) LEFT by 1 bit (also Shift Quotient LEFT)
  Shift the remainder and dividend registers together LEFT
Compute: Difference = Remainder – Divisor
If (Difference ≥ 0) then
  Remainder = Difference
  Set Least significant Bit of Quotient
Observation to Reduce Hardware:
  LO register can also be used to store the computed Quotient
Optimized Divider
Initialize: HI = 0, LO = Dividend
Results: HI = Remainder, LO = Quotient
Unsigned Integer Division Example
Example: $1110_2 / 0011_2$ (4-bit dividend & divisor)
Result Quotient = $0100_2$ and Remainder = $0010_2$
4-bit registers for Remainder and Divisor (4-bit ALU)
Divisor = 0011

Iteration  Step                                   HI    LO    Difference
0          Initialize                             0000  1110
1          1: Shift Left, Diff = HI - Divisor     0001  1100  1110
           2: Diff < 0 => Do Nothing              0001  1100
2          1: Shift Left, Diff = HI - Divisor     0011  1000  0000
           2: Rem = Diff, set lsb of LO           0000  1001
3          1: Shift Left, Diff = HI - Divisor     0001  0010  1110
           2: Diff < 0 => Do Nothing              0001  0010
4          1: Shift Left, Diff = HI - Divisor     0010  0100  1111
           2: Diff < 0 => Do Nothing              0010  0100
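The optimized divider can be modelled with two variables standing in for HI and LO. The function name `divide_hi_lo` and the width parameter `n` are assumptions for this sketch; the difference is computed as a plain Python integer rather than an n-bit two's complement value.

```python
def divide_hi_lo(dividend, divisor, n=32):
    """Unsigned divide with HI/LO registers; returns (remainder, quotient)."""
    if divisor == 0:
        raise ZeroDivisionError("check for 0 divisor before dividing")
    mask = (1 << n) - 1
    hi, lo = 0, dividend & mask            # HI = 0, LO = Dividend
    divisor &= mask
    for _ in range(n):
        # Step 1: shift (HI, LO) left by 1 bit
        hi = (hi << 1) | (lo >> (n - 1))
        lo = (lo << 1) & mask
        diff = hi - divisor                # Difference = Remainder - Divisor
        # Step 2: keep the difference only if it is non-negative
        if diff >= 0:
            hi = diff
            lo |= 1                        # set least significant bit of quotient
    return hi, lo                          # HI = Remainder, LO = Quotient
```

`divide_hi_lo(0b1110, 0b0011, 4)` returns `(0b0010, 0b0100)`, matching the remainder and quotient in the example.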
Combined Multiplier and Divider
One cycle per partial-remainder subtraction Looks a lot like a multiplier!
Same hardware can be used for both
Faster Division
Can’t use parallel hardware as in multiplier
  Subtraction is conditional on sign of remainder
Faster dividers (e.g., SRT division) generate multiple quotient bits per step
  Still require multiple steps
MIPS Division
Use HI/LO registers for result
  HI: 32-bit remainder
  LO: 32-bit quotient
Instructions
  div rs, rt / divu rs, rt
  No overflow or divide-by-0 checking
    Software must perform checks if required
  Use mfhi, mflo to access result
Representing Big (and Small) Numbers
What if we want to encode
  the approx. age of the earth? 4,600,000,000 or $4.6 \times 10^{9}$
  the weight in kg of one a.m.u. (atomic mass unit)? 0.0000000000000000000000000166 or $1.66 \times 10^{-27}$
There is no way we can encode either of them in a 32-bit integer
Floating point representation: $(-1)^{\text{sign}} \times F \times 2^{E}$
  Still have to fit everything in 32 bits (single precision)
Floating Point
Representation for non-integral numbers
  Including very small and very large numbers
Like scientific notation
  $-2.34 \times 10^{56}$ (normalized)
  $+0.002 \times 10^{-4}$ (not normalized)
  $+987.02 \times 10^{9}$ (not normalized)
In binary
  $\pm 1.xxxxxxx_2 \times 2^{yyyy}$
Types float and double in C
§3.5 Floating Point
Floating Point Standard
Defined by IEEE Std 754-1985
Developed in response to divergence of representations
  Portability issues for scientific code
Now almost universally adopted
Two representations
  Single precision (32-bit)
  Double precision (64-bit)
IEEE Floating-Point Format

  S | Exponent                        | Fraction
    | single: 8 bits, double: 11 bits | single: 23 bits, double: 52 bits

  $x = (-1)^{S} \times (1 + \text{Fraction}) \times 2^{(\text{Exponent} - \text{Bias})}$

S: sign bit (0 ⇒ non-negative, 1 ⇒ negative)
Normalized significand: 1.0 ≤ |significand| < 2.0
  Always has a leading pre-binary-point 1 bit, so no need to represent it explicitly (hidden bit)
  Significand is Fraction with the “1.” restored
Exponent: excess representation: actual exponent + Bias
  Ensures exponent is unsigned
  Single: Bias = 127; Double: Bias = 1023
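The format equation can be applied directly to a 32-bit pattern. The sketch below handles only normalized single-precision values; the function name `decode_single` is an assumption, and reserved exponents are rejected rather than interpreted.

```python
def decode_single(bits):
    """Decode a normalized single-precision pattern.

    Applies x = (-1)^S * (1 + Fraction) * 2^(Exponent - Bias) with Bias = 127.
    """
    sign = (bits >> 31) & 1
    exponent = (bits >> 23) & 0xFF          # 8-bit biased exponent
    fraction = (bits & 0x7FFFFF) / 2**23    # 23 fraction bits as a value in [0, 1)
    if exponent in (0, 255):
        raise ValueError("reserved exponent: not a normalized number")
    return (-1) ** sign * (1 + fraction) * 2.0 ** (exponent - 127)
```

For example, the pattern 0xBF400000 (S = 1, Exponent = 126, Fraction = 0.5) decodes to -0.75.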
IEEE Floating-Point Format
[Table: IEEE 754 encodings for exponent and fraction fields that are all 0 or all 1]
Single-Precision Range
Exponents 00000000 and 11111111 reserved
Smallest value
  Exponent: 00000001 ⇒ actual exponent = 1 – 127 = –126
  Fraction: 000…00 ⇒ significand = 1.0
  $\pm 1.0 \times 2^{-126} \approx \pm 1.2 \times 10^{-38}$
Largest value
  Exponent: 11111110 ⇒ actual exponent = 254 – 127 = +127
  Fraction: 111…11 ⇒ significand ≈ 2.0
  $\pm 2.0 \times 2^{+127} \approx \pm 3.4 \times 10^{+38}$
Double-Precision Range
Exponents 0000…00 and 1111…11 reserved
Smallest value
  Exponent: 00000000001 ⇒ actual exponent = 1 – 1023 = –1022
  Fraction: 000…00 ⇒ significand = 1.0
  $\pm 1.0 \times 2^{-1022} \approx \pm 2.2 \times 10^{-308}$
Largest value
  Exponent: 11111111110 ⇒ actual exponent = 2046 – 1023 = +1023
  Fraction: 111…11 ⇒ significand ≈ 2.0
  $\pm 2.0 \times 2^{+1023} \approx \pm 1.8 \times 10^{+308}$
Floating-Point Precision
Relative precision
  all fraction bits are significant
  Single: approx $2^{-23}$
    Equivalent to $23 \times \log_{10} 2 \approx 23 \times 0.3 \approx 6$ decimal digits of precision
  Double: approx $2^{-52}$
    Equivalent to $52 \times \log_{10} 2 \approx 52 \times 0.3 \approx 16$ decimal digits of precision
Floating-Point Example
Represent –0.75
  $-0.75 = (-1)^{1} \times 1.1_2 \times 2^{-1}$
  S = 1
  Fraction = $1000\ldots00_2$
  Exponent = –1 + Bias
    Single: –1 + 127 = 126 = $01111110_2$
    Double: –1 + 1023 = 1022 = $01111111110_2$
Single: 1011111101000…00
Double: 1011111111101000…00
Floating-Point Example
What number is represented by the single-precision float 11000000101000…00
  S = 1
  Fraction = $01000\ldots00_2$
  Exponent = $10000001_2$ = 129
  $x = (-1)^{1} \times (1 + 0.01_2) \times 2^{(129 - 127)} = (-1) \times 1.25 \times 2^{2} = -5.0$
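Both worked examples can be cross-checked against the machine's own encoding with Python's standard `struct` module. The helper names `float_bits`, `double_bits`, and `bits_to_float` are made up for this sketch.

```python
import struct


def float_bits(x):
    """Return the 32-bit single-precision pattern of x as a bit string."""
    (word,) = struct.unpack(">I", struct.pack(">f", x))
    return f"{word:032b}"


def double_bits(x):
    """Return the 64-bit double-precision pattern of x as a bit string."""
    (word,) = struct.unpack(">Q", struct.pack(">d", x))
    return f"{word:064b}"


def bits_to_float(bits):
    """Interpret a 32-character bit string as a single-precision value."""
    (x,) = struct.unpack(">f", struct.pack(">I", int(bits, 2)))
    return x
```

`float_bits(-0.75)` starts with `1 01111110 1` (sign, exponent, first fraction bit), and `bits_to_float("1" + "10000001" + "01" + "0" * 21)` gives -5.0.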
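Representable FP Numbers
[Figure: number line from –∞ to +∞. Negative numbers lie between $-2.0 \times 2^{+127}$ and $-1.0 \times 2^{-126}$; positive numbers lie between $+1.0 \times 2^{-126}$ and $+2.0 \times 2^{+127}$. Underflow regions sit on either side of ±0, and overflow regions lie beyond the largest magnitudes. Representable values are denser near zero and sparser toward the extremes. Marked points: underflow example, overflow example, midway example, typical example.]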
Zero
Biased exponent = 0, Fraction = 0, Hidden bit = 0
  $x = (-1)^{S} \times (0 + 0) \times 2^{-\text{Bias}} = \pm 0.0$
Two representations for 0 (single precision)
  +0.0  0x00000000
  -0.0  0x80000000
Denormal Numbers
Exponent = 000...0 ⇒ hidden bit is 0
  $x = (-1)^{S} \times (0 + \text{Fraction}) \times 2^{-(\text{Bias} - 1)}$
Smaller than normal numbers
  allow for gradual underflow, with diminishing precision
Denormalized Example
Find the decimal equivalent of IEEE 754 number 0 00000000 11011000000000000000000
Determine the sign: 0, the number is positive
Calculate the exponent
  exponent of all 0s means the number is denormalized
  exponent is therefore -126
Convert the significand to decimal
  remember to add the implicit bit (this is 0. because the number is denormalized)
  $(0.11011)_2 = (0.84375)_{10}$
0 00000000 11011000000000000000000 = $0.84375 \times 2^{-126}$
Infinities and NaNs
Exponent = 111...1, Fraction = 000...0
  ±Infinity
  Can be used in subsequent calculations, avoiding need for overflow check
Exponent = 111...1, Fraction ≠ 000...0
  Not-a-Number (NaN)
  Indicates illegal or undefined result
    e.g., 0.0 / 0.0
  Can be used in subsequent calculations
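The special encodings (zero, denormal, infinity, NaN) and the normal case can be told apart from the exponent and fraction fields alone. The sketch below is one possible single-precision classifier; the function name `classify_single` and the returned labels are assumptions for illustration.

```python
def classify_single(bits):
    """Classify a 32-bit single-precision pattern; returns (kind, value)."""
    sign = -1.0 if (bits >> 31) & 1 else 1.0
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    if exponent == 0xFF:
        if fraction == 0:
            return "infinity", sign * float("inf")   # Exponent all 1s, Fraction 0
        return "nan", float("nan")                   # Exponent all 1s, Fraction != 0
    if exponent == 0:
        if fraction == 0:
            return "zero", sign * 0.0                # both fields all 0s
        # Denormal: hidden bit is 0 and the exponent is fixed at 1 - 127 = -126
        return "denormal", sign * (fraction / 2**23) * 2.0 ** -126
    # Normal: hidden bit is 1, exponent is biased by 127
    return "normal", sign * (1 + fraction / 2**23) * 2.0 ** (exponent - 127)
```

Applied to the denormalized pattern 0 00000000 11011000000000000000000 (0x006C0000), it returns the kind "denormal" with value 0.84375 × 2^-126.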
NaNs
1. Operations with at least one NaN operand
2. Indeterminate forms
   divisions 0/0 and ±∞/±∞
   multiplications 0×±∞ and ±∞×0
   additions ∞ + (−∞), (−∞) + ∞ and equivalent subtractions
3. Real operations with complex results
   square root of a negative number
   logarithm of a negative number
   inverse sine or cosine of a number that is less than −1 or greater than +1
Floating-Point Addition
Consider a 4-digit decimal example
  $9.999 \times 10^{1} + 1.610 \times 10^{-1}$
1. Align decimal points
  Shift number with smaller exponent
  $9.999 \times 10^{1} + 0.016 \times 10^{1}$
2. Add significands
  $9.999 \times 10^{1} + 0.016 \times 10^{1} = 10.015 \times 10^{1}$
3. Normalize result & check for over/underflow
  $1.0015 \times 10^{2}$
4. Round and renormalize if necessary
  $1.002 \times 10^{2}$
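The 4-digit decimal result can be reproduced with Python's standard `decimal` module by limiting the context to four significant digits. Note the library adds exactly and rounds once at the end, rather than dropping digits during alignment as in step 1; for this example both routes agree. The variable names are illustrative.

```python
from decimal import ROUND_HALF_EVEN, Context, Decimal

# A context that keeps 4 significant decimal digits, like the example
four_digits = Context(prec=4, rounding=ROUND_HALF_EVEN)

a = Decimal("9.999E+1")   # 9.999 x 10^1
b = Decimal("1.610E-1")   # 1.610 x 10^-1

# The exact sum is 100.151; the context rounds it to 4 significant digits
total = four_digits.add(a, b)
```

Here `total` equals `Decimal("100.2")`, i.e. 1.002 × 10^2.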
Floating-Point Addition
Now consider a 4-digit binary example
  $1.000_2 \times 2^{-1} + (-1.110_2 \times 2^{-2})$ (0.5 + –0.4375)
1. Align binary points
  Shift number with smaller exponent
  $1.000_2 \times 2^{-1} + (-0.111_2 \times 2^{-1})$
2. Add significands
  $1.000_2 \times 2^{-1} + (-0.111_2 \times 2^{-1}) = 0.001_2 \times 2^{-1}$
3. Normalize result & check for over/underflow
  $1.000_2 \times 2^{-4}$, with no over/underflow
4. Round and renormalize if necessary
  $1.000_2 \times 2^{-4}$ (no change) = 0.0625
Floating Point Addition Example Consider adding: (1.111)2 × 2–1 + (1.011)2 × 2–3
For simplicity, we assume 4 bits of precision (or 3 bits of fraction)
Floating Point Addition Example Consider adding: (1.111)2 × 2–1 + (1.011)2 × 2–3
For simplicity, we assume 4 bits of precision (or 3 bits of fraction)
Exponents are not equal
Floating Point Addition Example Consider adding: (1.111)2 × 2–1 + (1.011)2 × 2–3
For simplicity, we assume 4 bits of precision (or 3 bits of fraction)
Exponents are not equal
Shift the significand of the lesser exponent right until its exponent matches the larger number
Floating Point Addition Example Consider adding: (1.111)2 × 2–1 + (1.011)2 × 2–3
For simplicity, we assume 4 bits of precision (or 3 bits of fraction)
Exponents are not equal
Shift the significand of the lesser exponent right until its exponent matches the larger number (1.011)2 × 2–3 = (0.1011)2 × 2–2 = (0.01011)2 × 2–1
Floating Point Addition Example Consider adding: (1.111)2 × 2–1 + (1.011)2 × 2–3
For simplicity, we assume 4 bits of precision (or 3 bits of fraction)
Exponents are not equal
Shift the significand of the lesser exponent right until its exponent matches the larger number (1.011)2 × 2–3 = (0.1011)2 × 2–2 = (0.01011)2 × 2–1
Difference between the two exponents = –1 – (–3) = 2
Floating Point Addition Example Consider adding: (1.111)2 × 2–1 + (1.011)2 × 2–3
For simplicity, we assume 4 bits of precision (or 3 bits of fraction)
Exponents are not equal
Shift the significand of the lesser exponent right until its exponent matches the larger number (1.011)2 × 2–3 = (0.1011)2 × 2–2 = (0.01011)2 × 2–1
Difference between the two exponents = –1 – (–3) = 2
So, shift right by 2 bits
Floating Point Addition Example Consider adding: (1.111)2 × 2–1 + (1.011)2 × 2–3
For simplicity, we assume 4 bits of precision (or 3 bits of fraction)
Exponents are not equal
Shift the significand of the lesser exponent right until its exponent matches the larger number (1.011)2 × 2–3 = (0.1011)2 × 2–2 = (0.01011)2 × 2–1
Difference between the two exponents = –1 – (–3) = 2
So, shift right by 2 bits Now, add the significands:
Carry
1.111 0.01011
10.00111
+
Addition Example – cont’d So, (1.111)2 × 2–1 + (1.011)2 × 2–3 = (10.00111)2 × 2–1
Addition Example – cont’d So, (1.111)2 × 2–1 + (1.011)2 × 2–3 = (10.00111)2 × 2–1
However, result (10.00111)2 × 2–1 is NOT normalized
Addition Example – cont’d So, (1.111)2 × 2–1 + (1.011)2 × 2–3 = (10.00111)2 × 2–1
However, result (10.00111)2 × 2–1 is NOT normalized
Normalize result: (10.00111)2 × 2–1 = (1.000111)2 × 20
Addition Example – cont’d So, (1.111)2 × 2–1 + (1.011)2 × 2–3 = (10.00111)2 × 2–1
However, result (10.00111)2 × 2–1 is NOT normalized
Normalize result: (10.00111)2 × 2–1 = (1.000111)2 × 20
In this example, we have a carry
So, shift right by 1 bit and increment the exponent
Addition Example – cont’d So, (1.111)2 × 2–1 + (1.011)2 × 2–3 = (10.00111)2 × 2–1
However, result (10.00111)2 × 2–1 is NOT normalized
Normalize result: (10.00111)2 × 2–1 = (1.000111)2 × 20
In this example, we have a carry
So, shift right by 1 bit and increment the exponent
Round the significand to fit in appropriate number of bits
We assumed 4 bits of precision or 3 bits of fraction
Addition Example – cont’d So, (1.111)2 × 2–1 + (1.011)2 × 2–3 = (10.00111)2 × 2–1
However, result (10.00111)2 × 2–1 is NOT normalized
Normalize result: (10.00111)2 × 2–1 = (1.000111)2 × 20
In this example, we have a carry
So, shift right by 1 bit and increment the exponent
Round the significand to fit in appropriate number of bits
We assumed 4 bits of precision or 3 bits of fraction
Round to nearest: (1.000111)2 ≈ (1.001)2
1.000 111 1
1.001
+
Addition Example – cont’d So, (1.111)2 × 2–1 + (1.011)2 × 2–3 = (10.00111)2 × 2–1
However, result (10.00111)2 × 2–1 is NOT normalized
Normalize result: (10.00111)2 × 2–1 = (1.000111)2 × 20
In this example, we have a carry
So, shift right by 1 bit and increment the exponent
Round the significand to fit in appropriate number of bits
We assumed 4 bits of precision or 3 bits of fraction
Round to nearest: (1.000111)2 ≈ (1.001)2
Renormalize if rounding generates a carry
1.000 111 1
1.001
+
Addition Example (cont'd)
So, $(1.111)_2 \times 2^{-1} + (1.011)_2 \times 2^{-3} = (10.00111)_2 \times 2^{-1}$
However, the result $(10.00111)_2 \times 2^{-1}$ is NOT normalized
Normalize result: $(10.00111)_2 \times 2^{-1} = (1.000111)_2 \times 2^{0}$
  In this example, we have a carry
  So, shift right by 1 bit and increment the exponent
Round the significand to fit in the appropriate number of bits
  We assumed 4 bits of precision or 3 bits of fraction
  Round to nearest: $(1.000111)_2 \approx (1.001)_2$
      1.000 111
    +     1
      1.001
  Renormalize if rounding generates a carry
Detect overflow / underflow
  If the exponent becomes too large (overflow) or too small (underflow), represent overflow/underflow accordingly
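The four steps above (align, add, normalize, round) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `fp_add`, the integer encoding of significands, and the choice of ties-to-even for "round to nearest" are not from the slides, and overflow/underflow detection is left out.

```python
FRAC_BITS = 3  # toy format from the slides: 4 bits of precision, 3 fraction bits


def fp_add(sig_a, exp_a, sig_b, exp_b, frac_bits=FRAC_BITS):
    """Add two positive normalized numbers given as (significand, exponent).

    A significand is an integer holding 1.fff, so 0b1111 stands for (1.111)_2.
    Returns the normalized, rounded (significand, exponent) of the sum.
    """
    # Step 1: align exponents. Scaling the larger operand up by `shift` bits
    # is equivalent to shifting the smaller one right, but keeps every bit.
    if exp_a < exp_b:
        sig_a, exp_a, sig_b, exp_b = sig_b, exp_b, sig_a, exp_a
    shift = exp_a - exp_b
    extra = shift                      # low-order bits beyond frac_bits
    exp = exp_a

    # Step 2: add the aligned significands.
    total = (sig_a << shift) + sig_b

    # Step 3: normalize. A carry gives 1x.xxx, so move the binary point left
    # by one place and increment the exponent.
    while total >> (frac_bits + extra) >= 2:
        extra += 1
        exp += 1

    # Step 4: round to nearest (ties to even) by dropping the extra bits.
    kept = total >> extra
    remainder = total & ((1 << extra) - 1)
    half = (1 << extra) >> 1
    if remainder > half or (extra > 0 and remainder == half and kept & 1):
        kept += 1
    if kept >> (frac_bits + 1):        # rounding generated a carry
        kept >>= 1
        exp += 1
    return kept, exp


# (1.111)_2 x 2^-1 + (1.011)_2 x 2^-3
sig, exp = fp_add(0b1111, -1, 0b1011, -3)
print(bin(sig), exp)
```

With the operands of the slide, the sketch returns the significand `0b1001` and exponent 0, that is $(1.001)_2 \times 2^{0}$.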
Floating Point Subtraction Example
Consider: $(1.000)_2 \times 2^{-3} - (1.000)_2 \times 2^{2}$
We assume again: 4 bits of precision (or 3 bits of fraction)
Shift right the significand of the lesser exponent
  Difference between the two exponents = 2 – (–3) = 5
  Shift right by 5 bits: $(1.000)_2 \times 2^{-3} = (0.00001000)_2 \times 2^{2}$
Convert subtraction into addition of the 2's complement
    + 0.00001 $\times 2^{2}$
    – 1.00000 $\times 2^{2}$
  Sign
    0 0.00001 $\times 2^{2}$
    1 1.00000 $\times 2^{2}$   (2's complement)
    1 1.00001 $\times 2^{2}$   (sum)
Since the result is negative, convert it from 2's complement to sign-magnitude
    – 0.11111 $\times 2^{2}$   (2's complement of the sum)
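The sign/significand manipulation above can be checked with ordinary integers. The sketch below is an assumption-laden illustration: the helper names `twos_complement` and `from_twos_complement` and the 7-bit width (1 sign bit, 1 integer bit, 5 fraction bits) are chosen here to match the bit patterns on the slide.

```python
WIDTH = 7  # 1 sign bit + 1 integer bit + 5 fraction bits, as on the slide


def twos_complement(value, width=WIDTH):
    """Encode a signed integer as a `width`-bit two's complement pattern."""
    return value & ((1 << width) - 1)


def from_twos_complement(bits, width=WIDTH):
    """Decode a `width`-bit two's complement pattern to a signed integer."""
    sign_bit = 1 << (width - 1)
    return bits - (1 << width) if bits & sign_bit else bits


# Aligned significands in units of 2**-5, both with exponent 2.
a = 0b000001   # 0.00001
b = 0b100000   # 1.00000

# Subtraction becomes addition of the two's complement of b.
total = (twos_complement(a) + twos_complement(-b)) & ((1 << WIDTH) - 1)
result = from_twos_complement(total)

print(format(twos_complement(-b), "07b"))  # pattern for -1.00000
print(format(total, "07b"))                # pattern of the sum
print(result)                              # signed value in units of 2**-5
```

The decoded result is –31 in units of $2^{-5}$, whose magnitude `0b11111` is the $0.11111$ of the sign-magnitude form.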
Subtraction Example (cont'd)
So, $(1.000)_2 \times 2^{-3} - (1.000)_2 \times 2^{2} = -(0.11111)_2 \times 2^{2}$
Normalize result: $-(0.11111)_2 \times 2^{2} = -(1.1111)_2 \times 2^{1}$
Round the significand to fit in the appropriate number of bits
  We assumed 4 bits of precision or 3 bits of fraction
  Round to nearest: $(1.1111)_2 \approx (10.000)_2$
      1.111 1
    +     1
     10.000
  Renormalize: rounding generated a carry
    $-(1.1111)_2 \times 2^{1} \approx -(10.000)_2 \times 2^{1} = -(1.000)_2 \times 2^{2}$
The result would have been exact if more fraction bits had been used
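To see the effect of the fraction width, the exact difference can be rounded at 3 and at 4 fraction bits using exact rational arithmetic. This is a sketch only; `round_normalized` is a hypothetical helper, and ties-to-even is assumed for "round to nearest".

```python
from fractions import Fraction


def round_normalized(x, frac_bits):
    """Round a nonzero Fraction to 1.f x 2**e with `frac_bits` fraction bits."""
    sign = -1 if x < 0 else 1
    mag, exp = abs(x), 0
    while mag >= 2:                        # normalize the magnitude into [1, 2)
        mag /= 2
        exp += 1
    while mag < 1:
        mag *= 2
        exp -= 1
    sig = round(mag * 2**frac_bits)        # Fraction rounds ties to even
    if sig == 2 ** (frac_bits + 1):        # rounding carried out: 10.000
        sig //= 2
        exp += 1
    return sign * Fraction(sig, 2**frac_bits) * Fraction(2) ** exp


exact = Fraction(1, 8) - Fraction(4)       # 1.000 x 2^-3 minus 1.000 x 2^2

print(exact)                               # exact difference
print(round_normalized(exact, 3))          # 3 fraction bits, as on the slide
print(round_normalized(exact, 4))          # one more fraction bit
```

With 3 fraction bits the difference –3.875 rounds to –4 (the $-(1.000)_2 \times 2^{2}$ of the slide); with 4 fraction bits it is represented exactly.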
Floating Point Subtraction Example
Consider: $(1.000)_2 \times 2^{-3} - (1.100)_2 \times 2^{-4}$
We assume again: 4 bits of precision (or 3 bits of fraction)
Shift right the significand of the lesser exponent
  Difference between the two exponents = –3 – (–4) = 1
  Shift right by 1 bit: $(1.100)_2 \times 2^{-4} = (0.1100)_2 \times 2^{-3}$
Convert subtraction into addition of the 2's complement
    + 1.0000 $\times 2^{-3}$
    – 0.1100 $\times 2^{-3}$
  Sign
    0 1.0000 $\times 2^{-3}$
    1 1.0100 $\times 2^{-3}$   (2's complement)
    0 0.0100 $\times 2^{-3}$   (sum)
Subtraction Example (cont'd)
So, $(1.000)_2 \times 2^{-3} - (1.100)_2 \times 2^{-4} = (0.0100)_2 \times 2^{-3}$
Normalize result: $(0.0100)_2 \times 2^{-3} = (1.0000)_2 \times 2^{-5}$
  For subtraction, we can have leading zeros
Round the significand to fit in the appropriate number of bits
  We assumed 4 bits of precision or 3 bits of fraction
  Answer: $(1.000)_2 \times 2^{-5}$
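Normalizing a result with leading zeros amounts to counting those zeros, shifting left by that count, and decreasing the exponent by the same amount. A minimal sketch, assuming an integer significand with `frac_bits` fraction bits (the name `normalize_left` is hypothetical):

```python
def normalize_left(sig, exp, frac_bits):
    """Shift a significand with leading zeros left until it reads 1.xxx.

    `sig` holds `frac_bits` fraction bits; assumes 0 < sig < 2**(frac_bits + 1),
    i.e. a nonzero value below 2.0 (no carry to handle).
    """
    if not 0 < sig < (1 << (frac_bits + 1)):
        raise ValueError("expected a nonzero significand below 2.0")
    # Distance between the leading 1 and the integer-bit position.
    leading_zeros = (frac_bits + 1) - sig.bit_length()
    return sig << leading_zeros, exp - leading_zeros


# (0.0100)_2 x 2^-3 from the slide, with 4 fraction bits
sig, exp = normalize_left(0b00100, -3, 4)
print(format(sig, "05b"), exp)
```

For the slide's value this gives the pattern `10000` with exponent –5, matching $(1.0000)_2 \times 2^{-5}$.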
FP Adder Hardware
Much more complex than integer adder
Doing it in one clock cycle would take too long
  Much longer than integer operations
  Slower clock would penalize all instructions
FP adder usually takes several cycles
  Can be pipelined
[Figure: FP Adder Hardware, with the stages labeled Step 1 through Step 4]
Floating-Point Multiplication
Consider a 4-digit decimal example
  $1.110 \times 10^{10} \times 9.200 \times 10^{-5}$
1. Add exponents
  For biased exponents, subtract bias from sum
  New exponent = 10 + (–5) = 5
2. Multiply significands
  $1.110 \times 9.200 = 10.212 \Rightarrow 10.212 \times 10^{5}$
3. Normalize result & check for over/underflow
  $1.0212 \times 10^{6}$
4. Round and renormalize if necessary
  $1.021 \times 10^{6}$
5. Determine sign of result from signs of operands
  $+1.021 \times 10^{6}$
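The decimal steps can be traced with integer significands. The sketch below is illustrative: `dec_fp_mul` is a hypothetical name, significands are scaled by $10^{3}$ (so 1.110 is stored as 1110), operands are assumed positive, and round-half-to-even is an assumption since the slide does not name a rounding rule.

```python
def dec_fp_mul(sig_a, exp_a, sig_b, exp_b, frac_digits=3):
    """Multiply two positive decimal floats d.ddd x 10**exp.

    Significands are integers scaled by 10**frac_digits (1.110 -> 1110).
    Returns the normalized, rounded (significand, exponent).
    """
    exp = exp_a + exp_b                    # Step 1: add exponents
    prod = sig_a * sig_b                   # Step 2: multiply significands
    extra = frac_digits                    # product has 2 * frac_digits fraction digits

    # Step 3: normalize to a single digit before the decimal point.
    while prod >= 10 ** (frac_digits + extra + 1):
        extra += 1
        exp += 1

    # Step 4: round by dropping the extra digits (half to even, an assumption).
    scale = 10 ** extra
    kept, remainder = divmod(prod, scale)
    half = scale // 2
    if remainder > half or (remainder == half and kept % 2 == 1):
        kept += 1
    if kept == 10 ** (frac_digits + 1):    # rounding generated a carry
        kept //= 10
        exp += 1
    return kept, exp                       # Step 5 (sign) is trivial here: both positive


print(dec_fp_mul(1110, 10, 9200, -5))
```

For the slide's operands the sketch returns `(1021, 6)`, that is $1.021 \times 10^{6}$.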
Floating-Point Multiplication
Now consider a 4-digit binary example
  $1.000_2 \times 2^{-1} \times -1.110_2 \times 2^{-2}$   (0.5 × –0.4375)
1. Add exponents
  Unbiased: –1 + –2 = –3
  Biased: (–1 + 127) + (–2 + 127) – 127 = –3 + 254 – 127 = –3 + 127
2. Multiply significands
  $1.000_2 \times 1.110_2 = 1.110_2 \Rightarrow 1.110_2 \times 2^{-3}$
3. Normalize result & check for over/underflow
  $1.110_2 \times 2^{-3}$ (no change) with no over/underflow
4. Round and renormalize if necessary
  $1.110_2 \times 2^{-3}$ (no change)
5. Determine sign: +ve × –ve ⇒ –ve
  $-1.110_2 \times 2^{-3} = -0.21875$
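The same five steps in binary, including the bias correction, might look as follows. This is a sketch under assumptions: `fp_mul` and the (sign, biased exponent, integer significand) triple are illustrative choices, the bias of 127 is taken from the slide, ties-to-even rounding is assumed, and over/underflow checks are omitted.

```python
BIAS = 127
FRAC_BITS = 3  # 4-bit significands 1.fff, as in the slide


def fp_mul(sign_a, bexp_a, sig_a, sign_b, bexp_b, sig_b):
    """Multiply two normalized values given as (sign, biased exponent, significand).

    Signs are 0 or 1; significands are integers holding 1.fff (0b1110 = 1.110).
    """
    # Step 1: add biased exponents and subtract the bias once.
    bexp = bexp_a + bexp_b - BIAS

    # Step 2: multiply significands; the product has 2 * FRAC_BITS fraction bits.
    prod = sig_a * sig_b
    extra = FRAC_BITS

    # Step 3: normalize. The product of two values in [1, 2) lies in [1, 4).
    if prod >> (FRAC_BITS + extra + 1):
        extra += 1
        bexp += 1

    # Step 4: round to nearest (ties to even), then renormalize if needed.
    kept = prod >> extra
    remainder = prod & ((1 << extra) - 1)
    half = 1 << (extra - 1)
    if remainder > half or (remainder == half and kept & 1):
        kept += 1
    if kept >> (FRAC_BITS + 1):
        kept >>= 1
        bexp += 1

    # Step 5: the result is negative when exactly one operand is negative.
    return sign_a ^ sign_b, bexp, kept


sign, bexp, sig = fp_mul(0, -1 + BIAS, 0b1000, 1, -2 + BIAS, 0b1110)
value = (-1) ** sign * (sig / 2**FRAC_BITS) * 2.0 ** (bexp - BIAS)
print(sign, bexp - BIAS, bin(sig), value)
```

For the slide's operands the result is sign 1, unbiased exponent –3 and significand `0b1110`, which evaluates to –0.21875.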
FP Arithmetic Hardware
FP multiplier is of similar complexity to FP adder
  But uses a multiplier for significands instead of an adder
FP arithmetic hardware usually does
  Addition, subtraction, multiplication, division, reciprocal, square-root
  FP ↔ integer conversion
Operations usually take several cycles
  Can be pipelined
FP Instructions in MIPS
FP hardware is coprocessor 1
  Adjunct processor that extends the ISA
Separate FP registers
  32 single-precision: $f0, $f1, … $f31
  Paired for double-precision: $f0/$f1, $f2/$f3, …
    Release 2 of the MIPS ISA supports 32 × 64-bit FP registers
FP instructions operate only on FP registers
  Programs generally don't do integer ops on FP data, or vice versa
  More registers with minimal code-size impact
FP load and store instructions
  lwc1, ldc1, swc1, sdc1
    e.g., ldc1 $f8, 32($sp)
FP Instructions in MIPS
Single-precision arithmetic
  add.s, sub.s, mul.s, div.s
    add.s $f0, $f1, $f6     # $f0 = $f1 + $f6
Double-precision arithmetic
  add.d, sub.d, mul.d, div.d
    add.d $f0, $f2, $f6     # $f0:$f1 = $f2:$f3 + $f6:$f7
    mul.d $f4, $f4, $f6
Single- and double-precision comparison
  c.xx.s, c.xx.d (xx is eq, lt, le, …)
  Sets or clears FP condition-code bit (0 to 7)
    c.lt.s $f3, $f4         # if ($f3 < $f4) cond = 1; else cond = 0
Branch on FP condition code true or false
  bc1t, bc1f
    bc1t TargetLabel        # if (cond == 1) go to TargetLabel
FP Example: °F to °C
C code:
  float f2c (float fahr) {
    return ((5.0/9.0)*(fahr - 32.0));
  }
  fahr in $f12, result in $f0, literals in global memory space
Compiled MIPS code:
  f2c: lwc1  $f16, const5($gp)      # $f16 = 5.0
       lwc1  $f18, const9($gp)      # $f18 = 9.0
       div.s $f16, $f16, $f18       # $f16 = 5.0 / 9.0
       lwc1  $f18, const32($gp)     # $f18 = 32.0
       sub.s $f18, $f12, $f18       # $f18 = fahr - 32.0
       mul.s $f0, $f16, $f18        # $f0 = (5.0/9.0) * (fahr - 32.0)
       jr    $ra                    # return
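The single-precision behavior of the compiled code can be imitated in Python by rounding to single precision after every operation. This is a rough sketch: `to_f32` is a hypothetical helper built on the standard `struct` module, and the operation order follows the div.s, sub.s, mul.s sequence of the MIPS listing.

```python
import struct


def to_f32(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def f2c(fahr):
    """Fahrenheit to Celsius, rounding to single precision after each step."""
    ratio = to_f32(to_f32(5.0) / to_f32(9.0))    # div.s $f16, $f16, $f18
    diff = to_f32(to_f32(fahr) - to_f32(32.0))   # sub.s $f18, $f12, $f18
    return to_f32(ratio * diff)                  # mul.s $f0, $f16, $f18


print(f2c(32.0))
print(f2c(212.0))
```

Because 5/9 is not exactly representable in binary, the result for 212 °F is close to, but need not be exactly, 100.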
Accurate Arithmetic
IEEE Std 754 specifies additional rounding control
  Extra bits of precision (guard, round, sticky)
  Choice of rounding modes
  Allows programmer to fine-tune numerical behavior of a computation
Not all FP units implement all options
  Most programming languages and FP libraries just use defaults
Trade-off between hardware complexity, performance, and market requirements
Support for Accurate Arithmetic
IEEE 754 FP rounding modes
  Always round up (toward +∞)
  Always round down (toward –∞)
  Truncate (toward zero)
  Round to nearest even (RTNE)
Rounding (except for truncation) requires the hardware to include extra F bits during calculations
  Guard bit: used to provide one F bit when shifting left to normalize a result (e.g., when normalizing F after division or subtraction)
  Round bit: used to improve rounding accuracy
  Sticky bit: used to support round to nearest even; is set to 1 whenever a 1 bit shifts (right) through it (e.g., when aligning F during addition/subtraction)
    Sticky bit = OR of the discarded bits
F = 1 . xxxxxxxxxxxxxxxxxxxxxxx G R S
Rounding Examples
Number       To +∞      To −∞      To Zero
+1.000101    +1.0010    +1.0001    +1.0001
-1.000111    -1.0001    -1.0010    -1.0001
+1.010110    +1.0110    +1.0101    +1.0101
+1.010010    +1.0101    +1.0100    +1.0100
-1.001110    -1.0011    -1.0100    -1.0011
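The directed rounding modes in the table map directly onto ceiling, floor and truncation of the scaled value. The sketch below uses exact rationals; `parse_binary` and `round_binary` are hypothetical helper names, and the mode strings are labels chosen here, not IEEE 754 identifiers.

```python
import math
from fractions import Fraction


def parse_binary(text):
    """Parse a signed binary literal such as '-1.000111' into a Fraction."""
    sign = -1 if text.startswith("-") else 1
    whole, frac = text.lstrip("+-").split(".")
    return sign * Fraction(int(whole + frac, 2), 2 ** len(frac))


def round_binary(x, frac_bits, mode):
    """Round Fraction `x` to a multiple of 2**-frac_bits in the given mode."""
    scaled = x * 2**frac_bits
    if mode == "up":                # toward +infinity
        n = math.ceil(scaled)
    elif mode == "down":            # toward -infinity
        n = math.floor(scaled)
    elif mode == "zero":            # truncate
        n = math.trunc(scaled)
    elif mode == "nearest_even":    # round to nearest, ties to even
        n = round(scaled)
    else:
        raise ValueError(f"unknown rounding mode: {mode}")
    return Fraction(n, 2**frac_bits)


x = parse_binary("-1.000111")
for mode in ("up", "down", "zero"):
    print(mode, float(round_binary(x, 4, mode)))
```

Each row of the table can be reproduced by calling `round_binary` with 4 fraction bits; for example, –1.000111 rounds to –1.0001 toward +∞ and toward zero, and to –1.0010 toward –∞.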
RTNE Examples
GRS - Action (after first normalization, if required)
  0xx - round down = do nothing (x means any bit value, 0 or 1)
  100 - this is a tie: round up if the bit just before G is 1, else round down = do nothing
  101 - round up (increment by 1)
  110 - round up (increment by 1)
  111 - round up (increment by 1)
Sig     GRS   Increment?   Rounded
1.000   000   N            1.000
1.101   000   N            1.101
1.000   100   N            1.000
1.001   100   Y            1.010
1.000   101   Y            1.001
1.111   110   Y            10.000
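The GRS action table reduces to a single condition: increment when G is 1 and either R or S is 1, or (on a tie) the last kept bit is 1. A minimal sketch, with `rtne` as a hypothetical function name and the significand passed as an integer:

```python
def rtne(sig, grs):
    """Round to nearest even using the guard, round and sticky bits.

    `sig` is the kept significand as an integer (0b1001 for 1.001);
    `grs` is the 3-bit value G R S. Returns (rounded significand, incremented?).
    """
    guard = (grs >> 2) & 1
    round_or_sticky = grs & 0b011
    last_kept_bit = sig & 1
    # 0xx: below half, do nothing. 100: tie, round up only if the last kept
    # bit is 1. 101, 110, 111: above half, round up.
    increment = guard == 1 and (round_or_sticky != 0 or last_kept_bit == 1)
    return sig + int(increment), increment


for sig, grs in [(0b1000, 0b000), (0b1001, 0b100), (0b1111, 0b110)]:
    rounded, inc = rtne(sig, grs)
    print(format(sig, "04b"), format(grs, "03b"), "Y" if inc else "N", bin(rounded))
```

A result such as `0b10000` (10.000) has carried out of the significand and still needs the renormalization step.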
Guard Bit
Guard bit: guards against loss of a significant bit
  Only one guard bit is needed to maintain accuracy of result
  Shifted left (if needed) during normalization as last fraction bit
Example on the need of a guard bit:
      1.00000000101100010001101 $\times 2^{5}$
    – 1.00000000000000011011010 $\times 2^{-2}$   (subtraction)
      1.00000000101100010001101 $\times 2^{5}$
    – 0.00000010000000000000001 1011010 $\times 2^{5}$   (shift right 7 bits)
    0 1.00000000101100010001101 $\times 2^{5}$
    1 1.11111101111111111111110 0 100110 $\times 2^{5}$   (2's complement)
    0 0.11111110101100010001011 0 100110 $\times 2^{5}$   (add significands)
    + 1.11111101011000100010110 1 001100 $\times 2^{4}$   (normalized)
  Guard bit (the single bit set apart after the 23 fraction bits): do not discard
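The bit patterns of this example can be reproduced with Python integers, which keep every shifted-out bit. This is a sketch for checking the arithmetic, not a model of the hardware; the variable names are illustrative.

```python
FRAC_BITS = 23  # single-precision fraction width

# Significands 1.f as 24-bit integers, copied from the slide.
a_sig = int("1" + "00000000101100010001101", 2)
b_sig = int("1" + "00000000000000011011010", 2)
a_exp, b_exp = 5, -2

# Align: b must move right by 7 bits. Scaling a left by 7 instead keeps the
# 7 bits that would otherwise fall off the end of b.
shift = a_exp - b_exp
diff = (a_sig << shift) - b_sig            # 23 + 7 = 30 fraction bits, exponent 5

# Normalize: count leading zeros relative to the integer-bit position.
width = FRAC_BITS + shift + 1              # bit length of a normalized value
leading_zeros = width - diff.bit_length()
exp = a_exp - leading_zeros
extra = shift - leading_zeros              # bits still held below the fraction

sig = diff >> extra                        # 24-bit normalized significand
tail = diff & ((1 << extra) - 1)           # bits below the fraction

print(format(diff, "031b"))
print(format(sig, "024b"), exp)
print(format(tail, "06b"))
```

The normalized significand printed here agrees with the last row of the slide, and its final fraction bit is the former guard bit that moved in during the one-bit left shift.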
Round and Sticky Bits
Two extra bits are needed for rounding
  Rounding performed after normalizing a result significand
  Round bit: appears after the guard bit
  Sticky bit: appears after the round bit (OR of all additional bits)
Consider the same example of previous slide:
    0 1.00000000101100010001101 $\times 2^{5}$
    1 1.11111101111111111111110 0 1 00110 $\times 2^{5}$   (2's complement)
    0 0.11111110101100010001011 0 1 00110 $\times 2^{5}$   (sum)
    + 1.11111101011000100010110 1 0 1 $\times 2^{4}$   (normalized)
  Bits after the fraction in the normalized row: guard bit (1), round bit (0), sticky bit (1)
  Sticky bit: OR-reduce of the remaining bits
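Extracting the guard, round and sticky bits from the bits below the fraction can be sketched as follows. The helper name `split_grs` and the integer encoding are assumptions for illustration; the sticky bit is computed as the OR-reduce of everything after the round bit.

```python
def split_grs(value, drop):
    """Split the `drop` low-order bits of `value` into guard, round, sticky.

    Returns (kept, guard, round_bit, sticky), where `kept` is `value` without
    its `drop` low-order bits.
    """
    kept = value >> drop
    tail = value & ((1 << drop) - 1)
    if drop < 3:                           # pad so G and R always exist
        tail <<= 3 - drop
        drop = 3
    guard = (tail >> (drop - 1)) & 1
    round_bit = (tail >> (drop - 2)) & 1
    sticky = int(tail & ((1 << (drop - 2)) - 1) != 0)   # OR of the rest
    return kept, guard, round_bit, sticky


# Normalized difference from the slide: 24-bit significand followed by the
# six bits 100110 that lie below the fraction.
value = int("1" + "11111101011000100010110" + "100110", 2)
kept, g, r, s = split_grs(value, 6)
print(format(kept, "024b"), g, r, s)
```

For the slide's value this yields G = 1, R = 0, S = 1, the three bits shown in the normalized row.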