Course Website: http://faculty.iitr.ac.in/~sudiproy.fcs/csn221_2015.html
Piazza Site: https://piazza.com/iitr.ac.in/fall2015/csn221
Dr. Sudip Roy
CSN‐221: COMPUTER ARCHITECTURE AND MICROPROCESSORS
Computer Arithmetic
(Lecture - 10)
Real Numbers:
Floating-Point Number Formats:
The term floating-point number refers to the representation of real numbers in binary form in computers.
The IEEE 754 standard defines floating-point representations.
General format: $\pm 1.bbbbb_{\text{two}} \times 2^{eeee}$, or $(-1)^S \times (1+F) \times 2^E$
Where
S = sign, 0 for positive, 1 for negative
F = fraction (or mantissa) as a binary integer; 1+F is called the significand
E = exponent as a binary integer, positive or negative (two's complement)
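The general format can be illustrated with a minimal Python sketch. The function name `decompose` is hypothetical and not from the lecture; it relies on the standard `math.frexp` call and only handles nonzero finite values.

```python
import math

def decompose(x):
    """Return (S, F, E) such that x == (-1)**S * (1 + F) * 2**E.

    Assumes x is nonzero and finite; zero, infinity and NaN have no
    normalized form.
    """
    if x == 0 or math.isinf(x) or math.isnan(x):
        raise ValueError("only nonzero finite values have a normalized form")
    s = 1 if x < 0 else 0
    m, e = math.frexp(abs(x))    # abs(x) == m * 2**e with 0.5 <= m < 1
    return s, 2 * m - 1, e - 1   # rescale so the significand lies in [1, 2)

print(decompose(-5.0))    # (1, 0.25, 2)
print(decompose(0.4375))  # (0, 0.75, -2)
```

Here -5.0 is read as $(-1)^1 \times (1 + 0.25) \times 2^2$.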
MIPS Single Precision:
(MIPS: Microprocessor without Interlocked Pipeline Stages)
Field layout (32 bits):
S: sign (bit 31) | E: 8-bit exponent (bits 23-30) | F: 23-bit fraction (bits 0-22)
-127 ≤ E ≤ 128, Max |E| ~ 128
Overflow: the exponent requires more than 8 bits because the magnitude is too large. The number can be positive or negative.
Underflow: the magnitude is too small to represent, so the negative exponent needed falls below the range of the 8-bit field. The number can be positive or negative.
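As a rough illustration of the 1 + 8 + 23 bit layout, the following Python sketch splits a value into its three stored fields. The helper name `float32_fields` is an assumption made for this example; the exponent it returns is the raw 8-bit field as stored in the word.

```python
import struct

def float32_fields(x):
    """Split x, rounded to single precision, into (S, E field, F field)."""
    # Pack as a big-endian 32-bit float, then reread the same bytes as an integer.
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                 # bit 31
    exponent = (bits >> 23) & 0xFF    # bits 23-30
    fraction = bits & 0x7FFFFF        # bits 0-22
    return sign, exponent, fraction

s, e, f = float32_fields(-5.0)
print(s, format(e, "08b"), format(f, "023b"))
# 1 10000001 01000000000000000000000
```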
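MIPS Double Precision:
Field layout (64 bits, held in two 32-bit words):
S: sign (bit 63) | E: 11-bit exponent (bits 52-62) | F: 52-bit fraction (bits 32-51, continued in bits 0-31)
-1023 ≤ E ≤ 1024, Max |E| ~ 1024
Overflow: the exponent requires more than 11 bits because the magnitude is too large. The number can be positive or negative.
Underflow: the magnitude is too small to represent, so the negative exponent needed falls below the range of the 11-bit field. The number can be positive or negative.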
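Overflow and underflow in double precision can be observed directly in Python, assuming (as on mainstream platforms) that the Python `float` type is an IEEE 754 double. The variable names below are illustrative only.

```python
import math
import sys

largest = sys.float_info.max           # (2 - 2**-52) * 2**1023
smallest_normal = sys.float_info.min   # 2**-1022

# Overflow: the result would need an exponent beyond the 11-bit field.
overflowed = largest * 2

# Underflow: the result is far smaller than any representable magnitude.
underflowed = smallest_normal * 2.0**-60

print(math.isinf(overflowed))  # True
print(underflowed)             # 0.0
```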
Alternative Representation of 4‐bit Integers:
IEEE 754 Floating Point Standard:
Biased exponent: the true exponent range [-127, 128] is changed to [0, 255].
The biased exponent is an 8-bit positive binary integer.
The true exponent is obtained by subtracting $127_{\text{ten}}$ or $01111111_{\text{two}}$.
True exponent = biased exponent - 127, for the 32-bit representation.
Bias = $2^{k-1} - 1$, in general (k = number of exponent bits).
The first bit of the significand is always 1: $\pm 1.bbbb \ldots b \times 2^E$
The 1 before the binary point is implicitly assumed.
The significand field represents the 23-bit fraction after the binary point.
Significand range is [1, 2), to be exact $[1, 2 - 2^{-23}]$.
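The bias formula and the two conversions can be checked with a few lines of Python. The function names are hypothetical; `k` is the width of the exponent field in bits.

```python
def bias(k):
    """Bias for a k-bit exponent field: 2**(k-1) - 1."""
    return 2 ** (k - 1) - 1

def to_biased(true_exponent, k=8):
    """Stored (biased) exponent for a given true exponent."""
    return true_exponent + bias(k)

def to_true(biased_exponent, k=8):
    """True exponent recovered from the stored (biased) exponent."""
    return biased_exponent - bias(k)

print(bias(8), bias(11))                  # 127 1023
print(to_biased(-127), to_biased(128))    # 0 255
print(to_true(129))                       # 2
```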
IEEE 754 Floating Point Standard:
Conversion to Decimal:
1 10000001 01000000000000000000000
(sign bit S | E, bits 23-30 | F, bits 0-22, normalized)
Sign bit is 1, so the number is negative.
Biased exponent is $2^7 + 2^0 = 129$.
Fraction is $F = 0.01_{\text{two}} = 0.25$.
The number is
$(-1)^S \times (1 + F) \times 2^{(\text{exponent} - \text{bias})} = (-1)^1 \times (1 + 0.25) \times 2^{(129 - 127)}$
$= -1 \times 1.25 \times 2^2$
$= -1.25 \times 4 = -5.0$
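The conversion above can be sketched in Python as follows. The name `decode_single` is hypothetical, and the sketch assumes a normalized number: it applies the formula directly and does not treat the reserved exponent patterns (all zeros, all ones) specially.

```python
def decode_single(bit_string):
    """Decode a 32-bit pattern 'S EEEEEEEE FFF...F' using the normalized formula."""
    bits = bit_string.replace(" ", "")
    if len(bits) != 32 or set(bits) - {"0", "1"}:
        raise ValueError("expected exactly 32 binary digits")
    sign = int(bits[0], 2)
    biased_exponent = int(bits[1:9], 2)
    fraction = int(bits[9:], 2) / 2**23   # value of the 23 bits after the binary point
    return (-1) ** sign * (1 + fraction) * 2.0 ** (biased_exponent - 127)

print(decode_single("1 10000001 01000000000000000000000"))  # -5.0
```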
Positive Zero in IEEE 754:
0 00000000 00000000000000000000000
(sign bit | biased exponent | fraction)
Read with the normalized formula, this pattern would be $+1.0 \times 2^{-127}$, the smallest positive number in the single-precision format. It is interpreted as positive zero.
A true exponent less than -126 is positive underflow; the result can be regarded as zero.
Negative Zero in IEEE 754:
1 00000000 00000000000000000000000
(sign bit | biased exponent | fraction)
Read with the normalized formula, this pattern would be $-1.0 \times 2^{-127}$, the negative number of smallest magnitude in the single-precision format. It is interpreted as negative zero.
A true exponent less than -126 is negative underflow; the result may be regarded as 0.
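The two zero patterns can be inspected from Python with the standard `struct` module. The helper name `float32_bits` is an assumption for this sketch.

```python
import math
import struct

def float32_bits(x):
    """Return the 32-bit single-precision pattern of x as a string of 0s and 1s."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    return format(bits, "032b")

print(float32_bits(0.0))          # 00000000000000000000000000000000
print(float32_bits(-0.0))         # 10000000000000000000000000000000
print(0.0 == -0.0)                # True: the two zeros compare equal
print(math.copysign(1.0, -0.0))   # -1.0: the sign bit is still observable
```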
Positive Infinity in IEEE 754:
0 11111111 00000000000000000000000
(sign bit | biased exponent | fraction)
Read with the normalized formula, this pattern would be $+1.0 \times 2^{128}$, larger than any representable positive number in the single-precision format. It is interpreted as +∞.
If the true exponent = 128 and the fraction ≠ 0, the pattern does not represent a number. It is called "not a number" or NaN.
Negative Infinity in IEEE 754:
1 11111111 00000000000000000000000
(sign bit | biased exponent | fraction)
Read with the normalized formula, this pattern would be $-1.0 \times 2^{128}$, more negative than any representable number in the single-precision format. It is interpreted as -∞.
If the true exponent = 128 and the fraction ≠ 0, the pattern does not represent a number. It is called "not a number" or NaN.
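The infinity and NaN patterns can be built from their fields and read back in Python. The helper `float32_from_fields` is hypothetical; the NaN fraction chosen here (only the top fraction bit set) is just one of many nonzero fractions.

```python
import math
import struct

def float32_from_fields(sign, biased_exponent, fraction):
    """Assemble a 32-bit pattern from its three fields and read it as a float."""
    bits = (sign << 31) | (biased_exponent << 23) | fraction
    (value,) = struct.unpack(">f", struct.pack(">I", bits))
    return value

pos_inf = float32_from_fields(0, 255, 0)
neg_inf = float32_from_fields(1, 255, 0)
nan = float32_from_fields(0, 255, 1 << 22)   # biased exponent 255, fraction != 0

print(pos_inf, neg_inf)   # inf -inf
print(math.isnan(nan))    # True
print(nan == nan)         # False: NaN is unordered, even against itself
```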
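IEEE 754 Floating Point Standard:
[Figure: number line of single-precision values, from left to right]
-∞ | negative overflow | expressible negative numbers, $-(2 - 2^{-23}) \times 2^{127}$ to $-2^{-126}$ | negative underflow | 0 | positive underflow | expressible positive numbers, $2^{-126}$ to $(2 - 2^{-23}) \times 2^{127}$ | positive overflow | +∞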
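The two boundary values on the number line can be checked against their bit patterns in Python. The helper name `float32_from_bits` and the variable names are assumptions for this sketch.

```python
import struct

def float32_from_bits(bits):
    """Interpret a 32-bit integer as a single-precision value."""
    (value,) = struct.unpack(">f", struct.pack(">I", bits))
    return value

# Biased exponent 254 (true exponent 127), fraction all ones.
largest = float32_from_bits(0x7F7FFFFF)
# Biased exponent 1 (true exponent -126), fraction zero.
smallest_normalized = float32_from_bits(0x00800000)

print(largest == (2 - 2**-23) * 2.0**127)   # True
print(smallest_normalized == 2.0**-126)     # True
```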
Floating Point Arithmetic:
Floating Point Addition and Subtraction:
0. Zero check
   - Change the sign of the subtrahend, i.e., convert to summation.
   - If either operand is 0, the other is the result.
1. Significand alignment: right shift the significand of the smaller exponent until the two exponents match.
2. Addition: add the significands and report an exception if overflow occurs. If significand = 0, return the result as 0.
3. Normalization
   - Shift the significand bits to normalize.
   - Report overflow or underflow if the exponent goes out of range.
4. Rounding
Example (4 Significant Fraction Bits):
Subtraction: $0.5_{\text{ten}} - 0.4375_{\text{ten}}$
Step 0: Floating point numbers to be added:
$1.000_{\text{two}} \times 2^{-1}$ and $-1.110_{\text{two}} \times 2^{-2}$
Step 1: Significand of lesser exponent is shifted right until exponents match:
$-1.110_{\text{two}} \times 2^{-2} \rightarrow -0.111_{\text{two}} \times 2^{-1}$
Step 2: Add significands, $1.000_{\text{two}} + (-0.111_{\text{two}})$. Result is $0.001_{\text{two}} \times 2^{-1}$
2's complement addition, one bit added for sign:
      01000
    + 11001
    -------
      00001
Example (Continued):
Step 3: Normalize, $1.000_{\text{two}} \times 2^{-4}$
No overflow/underflow since 127 ≥ exponent ≥ -126
Step 4: Rounding, no change since the sum fits in 4 bits.
$1.000_{\text{two}} \times 2^{-4} = (1+0)/16 = 0.0625_{\text{ten}}$
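The subtraction example can be traced with a small Python sketch of the addition steps. This is a toy model under stated assumptions, not the hardware algorithm: operands are hypothetical `(sign, significand, exponent)` tuples, the significand is an integer scaled by $2^3$ (so $1.110_{\text{two}}$ is `0b1110`), and bits shifted out are simply truncated instead of rounded.

```python
FRACTION_BITS = 3  # significand 1.bbb carries 4 significant bits

def fp_add(a, b):
    """Add two toy floating-point values given as (sign, significand, exponent)."""
    (sa, ma, ea), (sb, mb, eb) = a, b
    # Step 0: if either operand is zero, the other is the result.
    if ma == 0:
        return b
    if mb == 0:
        return a
    # Step 1: align by right-shifting the significand with the smaller exponent.
    if ea < eb:
        (sa, ma, ea), (sb, mb, eb) = (sb, mb, eb), (sa, ma, ea)
    mb >>= ea - eb  # assumption: shifted-out bits are truncated (no guard bits)
    # Step 2: signed addition of the aligned significands.
    total = (-ma if sa else ma) + (-mb if sb else mb)
    if total == 0:
        return (0, 0, 0)
    sign, mag, exp = (1 if total < 0 else 0), abs(total), ea
    # Step 3: normalize so the significand lies in [1, 2).
    while mag >= 2 << FRACTION_BITS:
        mag >>= 1
        exp += 1
    while mag < 1 << FRACTION_BITS:
        mag <<= 1
        exp -= 1
    if not -126 <= exp <= 127:
        raise OverflowError("exponent out of range")
    # Step 4: rounding is a no-op here because of the truncation above.
    return sign, mag, exp

def to_float(value):
    """Convert a toy (sign, significand, exponent) tuple to a Python float."""
    sign, mag, exp = value
    return (-1) ** sign * (mag / 2**FRACTION_BITS) * 2.0**exp

result = fp_add((0, 0b1000, -1), (1, 0b1110, -2))  # 0.5 + (-0.4375)
print(result, to_float(result))  # (0, 8, -4) 0.0625
```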
Floating Point Multiplication (Basic Idea):
1. Separate sign
2. Add exponents
3. Multiply significands
4. Normalize, round, check overflow/underflow
5. Replace sign
Floating Point Multiplication (Example):
Multiply $0.5_{\text{ten}}$ and $-0.4375_{\text{ten}}$ (answer = $-0.21875_{\text{ten}}$), or
Multiply $1.000_{\text{two}} \times 2^{-1}$ and $-1.110_{\text{two}} \times 2^{-2}$
Step 1: Add exponents: -1 + (-2) = -3
Step 2: Multiply significands:
      1.000
    × 1.110
    -------
       0000
      1000
     1000
    1000
    -------
    1110000
Product is 1.110000
Floating Point Multiplication (Example):
Step 3: Normalization: If necessary, shift significand right and increment exponent.
Normalized product is $1.110000 \times 2^{-3}$
Check overflow/underflow: 127 ≥ exponent ≥ -126
Step 4: Rounding: $1.110 \times 2^{-3}$
Step 5: Sign: Operands have opposite signs, so the product is $-1.110 \times 2^{-3}$
(Decimal value = $-(1 + 0.5 + 0.25)/8 = -0.21875_{\text{ten}}$)
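The multiplication example can be followed in a similar toy Python sketch. As before, the `(sign, significand, exponent)` tuples and the function name `fp_mul` are hypothetical, the significand is an integer scaled by $2^3$, and rounding is replaced by truncation for simplicity.

```python
FRACTION_BITS = 3  # significand 1.bbb carries 4 significant bits

def fp_mul(a, b):
    """Multiply two toy floating-point values given as (sign, significand, exponent)."""
    (sa, ma, ea), (sb, mb, eb) = a, b
    sign = sa ^ sb                      # opposite signs give a negative product
    if ma == 0 or mb == 0:
        return (sign, 0, 0)
    exp = ea + eb                       # add exponents
    product = ma * mb                   # has 2 * FRACTION_BITS fraction bits
    # Normalize: the product of two significands in [1, 2) lies in [1, 4).
    shift = FRACTION_BITS
    if product >= 2 << (2 * FRACTION_BITS):
        shift += 1                      # shift right once more ...
        exp += 1                        # ... and increment the exponent
    if not -126 <= exp <= 127:
        raise OverflowError("exponent out of range")
    mag = product >> shift              # assumption: truncate instead of rounding
    return sign, mag, exp

result = fp_mul((0, 0b1000, -1), (1, 0b1110, -2))    # 0.5 * (-0.4375)
sign, mag, exp = result
value = (-1) ** sign * (mag / 2**FRACTION_BITS) * 2.0**exp
print(result, value)  # (1, 14, -3) -0.21875
```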
Floating Point Addition and Subtraction Flowchart:
Floating Point Multiplication Flowchart:
That's all from Computer Arithmetic!