
1

Lecture 5

Floating Point Numbers

ITEC 1000 “Introduction to Information Technology”

2

Lecture Template:

Floating Point Numbers
Exponential Notation
Excess-50 Notation
Overflow and Underflow
Floating Point Calculations
Normalization in Floating Point
IEEE 754 Standard
Packed Decimal Format
Programming Considerations

3

Floating Point Numbers

Real numbers
Used in a computer when the number
is outside the integer range of the computer (too large or too small)
contains a decimal fraction

The range in PCs: $10^{-38} < \text{number} < 10^{+38}$, or more

4

Exponential Notation

The representations differ in that the decimal place – the “point” -- “floats” to the left or right (with the appropriate adjustment in the exponent).

The following are equivalent representations of 1,234

$123{,}400.0 \times 10^{-2}$

$12{,}340.0 \times 10^{-1}$

$1{,}234.0 \times 10^{0}$

$123.4 \times 10^{1}$

$12.34 \times 10^{2}$

$1.234 \times 10^{3}$

$0.1234 \times 10^{4}$
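
The equivalence of these forms can be checked mechanically. The following is a minimal sketch using Python's standard `decimal` module for exact arithmetic; the names `pairs` and `value` are illustrative only and are not part of the lecture.

```python
from decimal import Decimal

# (mantissa, exponent) pairs taken from the list of equivalent forms
pairs = [
    ("123400.0", -2),
    ("12340.0", -1),
    ("1234.0", 0),
    ("123.4", 1),
    ("12.34", 2),
    ("1.234", 3),
    ("0.1234", 4),
]


def value(mantissa: str, exponent: int) -> Decimal:
    """Return mantissa x 10**exponent as an exact decimal number."""
    return Decimal(mantissa) * Decimal(10) ** exponent


# Every pair should denote the same number, 1,234
all_equal = all(value(m, e) == Decimal(1234) for m, e in pairs)
print(all_equal)  # True
```

Moving the point one place to the left is compensated by adding 1 to the exponent, and the reverse, which is why every pair evaluates to the same number.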

5

Exponential Notation

Also called scientific notation

4 specifications required for a number
1. Sign (“+” in example)
2. Magnitude or mantissa (12345)
3. Sign of the exponent (“+” in $10^{5}$)
4. Magnitude of the exponent (5)

Plus
5. Base of the exponent (10)
6. Location of the decimal point (or, in another base, the radix point)

Example: $12345 = 12345 \times 10^{0} = 0.12345 \times 10^{5} = 123450000 \times 10^{-4}$

6

Parts of a Floating Point Number

$-0.9876 \times 10^{-3}$

Sign of mantissa: -
Location of decimal point: immediately before the first mantissa digit
Mantissa: 9876
Base: 10
Sign of exponent: -
Exponent: 3

7

Floating Point Format Specification

Integer format (8-digit word)
7 decimal digits and a sign
Range: $-9{,}999{,}999 \le I \le +9{,}999{,}999$

Floating point format (8-digit word)

SEEMMMMM

S: sign of the mantissa
EE: 2-digit exponent
MMMMM: 5-digit mantissa

8

Format

Mantissa: stored in sign-magnitude format
Assume decimal point located at the beginning of mantissa

Exponent stored in Excess-N notation:
Complementary notation
Pick middle value as offset, where N is the middle value
e.g., for the range 0..99, excess-50

Representation:               0    49    50    99
Exponent being represented: -50    -1     0    49
(magnitude increases from left to right)

9

Excess-50 notation

Excess-N representation: R = N + EE
Example 1: N = 50, EE = 38, R = 88
Example 2: N = 50, EE = -38, R = 12

Excess-50: Magnitude range
$0.00001 \times 10^{-50} \le \text{number} \le 0.99999 \times 10^{+49}$
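
A small sketch of the conversion R = N + E and its inverse. The function names and the `max_stored` parameter are assumptions made for illustration; the defaults correspond to the two-digit excess-50 case described here.

```python
def to_excess(exponent: int, offset: int = 50, max_stored: int = 99) -> int:
    """Return the stored representation R = N + E."""
    stored = offset + exponent
    if not 0 <= stored <= max_stored:
        raise OverflowError(f"exponent {exponent} does not fit in excess-{offset}")
    return stored


def from_excess(stored: int, offset: int = 50) -> int:
    """Recover the exponent E = R - N from its stored representation."""
    return stored - offset


print(to_excess(38))    # 88
print(to_excess(-38))   # 12
print(from_excess(12))  # -38
```

Because the stored value is never negative, exponents can be compared as plain unsigned numbers: a larger stored value always means a larger exponent.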

10

Overflow and Underflow

Possible for the number to be too large or too small for representation

Smallest representable magnitude: $0.00001 \times 10^{-50} = 10^{-55}$

11

Floating Point Format: Excess-50

First digit represents the sign of mantissa
0 is used as a “+” sign
5 is used as a “-” sign (arbitrarily)

Two next digits represent exponent in excess-50

Five last digits represent mantissa
Fixed decimal point located at the beginning

12

Examples

05324567 = $0.24567 \times 10^{3}$ = 245.67

54810000 = $-0.10000 \times 10^{-2}$ = -0.0010000

55555555 = $-0.55555 \times 10^{5}$ = -55555

04925000 = $0.25000 \times 10^{-1}$ = 0.025000
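
Decoding an SEEMMMMM word can be sketched in a few lines. This is an illustrative helper, not part of the lecture: the name `decode_excess50` is an assumption, and the standard `decimal` module is used so that the results are exact.

```python
from decimal import Decimal


def decode_excess50(word: str) -> Decimal:
    """Decode an 8-digit SEEMMMMM word into the decimal value it represents."""
    if len(word) != 8 or not word.isdigit():
        raise ValueError("expected exactly 8 decimal digits")
    if word[0] not in "05":
        raise ValueError("sign digit must be 0 (+) or 5 (-)")
    sign = 1 if word[0] == "0" else -1
    exponent = int(word[1:3]) - 50        # remove the excess-50 offset
    mantissa = Decimal("0." + word[3:])   # decimal point sits before the mantissa
    return sign * mantissa * Decimal(10) ** exponent


print(decode_excess50("54810000") == Decimal("-0.001"))  # True
print(decode_excess50("04925000") == Decimal("0.025"))   # True
```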

13

Normalization

Shift numbers left by decreasing the exponent until leading zeros are eliminated

Converting decimal number into standard format
1. Provide number with exponent (0 if not yet specified)
2. Increase/decrease exponent to shift decimal point to proper position
3. Decrease exponent to eliminate leading zeros on mantissa
4. Correct precision by adding 0’s or discarding/rounding least significant digits

14

Example 1: 246.8035

1. Add exponent: $246.8035 \times 10^{0}$

2. Position decimal point: $.2468035 \times 10^{3}$

3. Already normalized

4. Cut to 5 digits: $.24680 \times 10^{3}$

5. Convert number: 05324680

Sign: 0
Excess-50 exponent: 53
Mantissa: 24680

15

Example 2: $1255 \times 10^{-3}$

1. Already in exponential form: $1255 \times 10^{-3}$

2. Position decimal point: $0.1255 \times 10^{+1}$

3. Already normalized

4. Add 0 for 5 digits: $0.12550 \times 10^{+1}$

5. Convert number: 05112550


16

Example 3: -0.00000075

1. Exponential notation: $-0.00000075 \times 10^{0}$

2. Decimal point in position

3. Normalizing: $-0.75 \times 10^{-6}$

4. Add 0 for 5 digits: $-0.75000 \times 10^{-6}$

5. Convert number: 54475000
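
The conversion steps can be sketched as one function. This is a hedged illustration, not the lecture's own procedure: the name `encode_excess50` is hypothetical, the last step rounds half up (the slides allow either discarding or rounding), and zero is rejected because the slides do not say how it is stored.

```python
from decimal import Decimal, ROUND_HALF_UP


def encode_excess50(number: str) -> str:
    """Convert a decimal number (given as text) into an SEEMMMMM word."""
    value = Decimal(number)
    if value == 0:
        raise ValueError("zero is not covered by this sketch")
    sign_digit = "5" if value < 0 else "0"
    magnitude = abs(value)

    # Steps 2 and 3: choose the exponent so that 0.1 <= mantissa < 1
    exponent = magnitude.adjusted() + 1
    mantissa = magnitude * Decimal(10) ** (-exponent)

    # Step 4: keep 5 digits (pads with zeros or rounds, as needed)
    mantissa = mantissa.quantize(Decimal("0.00001"), rounding=ROUND_HALF_UP)
    if mantissa == 1:  # rounding carried over, e.g. 0.999996 -> 1.00000
        mantissa = Decimal("0.10000")
        exponent += 1

    stored_exponent = exponent + 50  # excess-50
    if not 0 <= stored_exponent <= 99:
        raise OverflowError("exponent outside the excess-50 range")
    digits = int(mantissa * Decimal(10) ** 5)
    return f"{sign_digit}{stored_exponent:02d}{digits:05d}"


print(encode_excess50("246.8035"))     # 05324680
print(encode_excess50("1255E-3"))      # 05112550
print(encode_excess50("-0.00000075"))  # 54475000
```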


17

Floating Point Calculations

Addition and subtraction
Exponent and mantissa treated separately
Exponents of numbers must agree
Align decimal points
Least significant digits may be lost
Mantissa overflow requires shifting the mantissa right again and increasing the exponent

18

Example

Add 2 floating point numbers:
  05199520
+ 04967850

Align exponents:
  05199520
  0510067850

Add mantissas; (1) indicates a carry: (1)0019850

Carry requires right shift: 05210019(850)

Round: 05210020

Check results

05199520 = $0.99520 \times 10^{1}$ = 9.9520

04967850 = $0.67850 \times 10^{-1}$ = 0.06785

Sum = 10.01985

In exponential form = $0.1001985 \times 10^{2}$

Precision lost
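
The align, add, shift and round sequence can be sketched with integer arithmetic. This is a simplified illustration that assumes both operands are positive and normalized; the function name `add_excess50` and the number of guard digits are choices made for this example, not something the lecture specifies.

```python
GUARD = 5  # extra digits kept during alignment so the result can be rounded


def add_excess50(a: str, b: str) -> str:
    """Add two positive SEEMMMMM words: align, add, shift on carry, round."""
    if a[0] != "0" or b[0] != "0":
        raise ValueError("this sketch handles positive operands only")
    exp_a, man_a = int(a[1:3]), int(a[3:]) * 10**GUARD
    exp_b, man_b = int(b[1:3]), int(b[3:]) * 10**GUARD

    # Align exponents: shift the mantissa of the smaller number to the right
    if exp_a < exp_b:
        exp_a, man_a, exp_b, man_b = exp_b, man_b, exp_a, man_a
    man_b //= 10 ** (exp_a - exp_b)

    exponent, total = exp_a, man_a + man_b
    unit = 10**GUARD                 # weight of the last mantissa digit kept
    if total >= 10 ** (5 + GUARD):   # carry out of the 5-digit mantissa
        unit *= 10                   # shift right by one digit ...
        exponent += 1                # ... and compensate in the exponent

    mantissa = (total + unit // 2) // unit   # round half up to 5 digits
    if mantissa == 10**5:            # rounding itself produced a carry
        mantissa //= 10
        exponent += 1
    if exponent > 99:
        raise OverflowError("exponent outside the excess-50 range")
    return f"0{exponent:02d}{mantissa:05d}"


print(add_excess50("05199520", "04967850"))  # 05210020
```

The result 05210020 stands for 10.020, while the exact sum is 10.01985: the digits shifted out of the 5-digit mantissa are the precision that is lost.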

19

Multiplication and Division

Mantissas: multiplied or divided
Exponents: added or subtracted

Normalization necessary to
Restore location of decimal point
Maintain precision of the result

Adjust excess value since added twice
Example: 2 numbers with exponent = 3, each represented as 53 in excess-50 notation
53 + 53 = 106
Since 50 added twice, subtract: 106 - 50 = 56

Maintaining precision: normalize and round the product

20

Example

Multiply 2 numbers: 05220000 x 04712500

Add exponents, subtract offset: 52 + 47 - 50 = 49

Multiply mantissas: 0.20000 x 0.12500 = 0.025000000

Normalize the results: 04825000

Check results

05220000 = $0.20000 \times 10^{2}$

04712500 = $0.125 \times 10^{-3}$

Product = $0.0250000000 \times 10^{-1}$

Normalizing and rounding = $0.25000 \times 10^{-2}$
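
A minimal sketch of the multiplication procedure, again using integer arithmetic on the digit fields. The function name `multiply_excess50` is an assumption for illustration, and operands with a zero mantissa are rejected because the slides do not cover them.

```python
def multiply_excess50(a: str, b: str) -> str:
    """Multiply two SEEMMMMM words: add exponents, multiply mantissas, normalize."""
    sign = "0" if a[0] == b[0] else "5"   # like signs give "+", unlike signs "-"
    # Both stored exponents contain the offset 50, so subtract it once
    exponent = int(a[1:3]) + int(b[1:3]) - 50
    # Two 5-digit mantissas give a product of up to 10 digits
    product = int(a[3:]) * int(b[3:])
    if product == 0:
        raise ValueError("zero mantissa is not covered by this sketch")

    # Normalize: shift left (and decrease the exponent) until no leading zero
    while product < 10**9:
        product *= 10
        exponent -= 1

    mantissa = (product + 50_000) // 100_000   # round half up to 5 digits
    if mantissa == 100_000:                    # rounding produced a carry
        mantissa = 10_000
        exponent += 1
    if not 0 <= exponent <= 99:
        raise OverflowError("exponent outside the excess-50 range")
    return f"{sign}{exponent:02d}{mantissa:05d}"


print(multiply_excess50("05220000", "04712500"))  # 04825000
```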

21

Floating Point in the Computer

Replace digits with “0” and “1” bits

Typical floating point format
32 bits provide range ~$10^{-38}$ to $10^{+38}$
8-bit exponent = 256 levels
Excess-128 notation
23 bits of mantissa: approximately 7 decimal digits of precision

22

IEEE 754 Standard

Most common standard for representing floating point numbers

Single precision: 32 bits, consisting of...
Sign bit (1 bit)
Exponent (8 bits)
Mantissa (23 bits)

Double precision: 64 bits, consisting of...
Sign bit (1 bit)
Exponent (11 bits)
Mantissa (52 bits)

23

Single Precision Format

32 bits, from left to right:
Sign of mantissa (1 bit)
Exponent (8 bits)
Mantissa (23 bits)

24

Double Precision Format

64 bits, from left to right:
Sign of mantissa (1 bit)
Exponent (11 bits)
Mantissa (52 bits)

25


Precision         Single (32 bit)                 Double (64 bit)
Sign              1 bit                           1 bit
Exponent          8 bits                          11 bits
Notation          Excess-127                      Excess-1023
Implied base      2                               2
Range             $2^{-126}$ to $2^{127}$         $2^{-1022}$ to $2^{1023}$
Mantissa          23 bits                         52 bits
Decimal digits    approx. 7                       approx. 15
Value range       $10^{-45}$ to $10^{38}$         $10^{-300}$ to $10^{300}$

26


32-bit Floating Point Value Definition

Exponent    Mantissa    Value
0           ±0          0
0           Not 0       $\pm 2^{-126} \times 0.M$
1-254       Any         $\pm 2^{E-127} \times 1.M$ (E = stored exponent)
255         ±0          $\pm\infty$
255         Not 0       special condition
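
The cases of the value definition can be written out as a decoder. This is a sketch for illustration only: `decode_single` and `reference_single` are hypothetical names, the "special condition" is returned as NaN, and the standard `struct` module serves as an independent cross-check.

```python
import math
import struct


def decode_single(bits: int) -> float:
    """Decode a 32-bit IEEE 754 single-precision pattern case by case."""
    if not 0 <= bits < 2**32:
        raise ValueError("expected a 32-bit pattern")
    sign = -1.0 if bits >> 31 else 1.0
    exponent = (bits >> 23) & 0xFF   # 8-bit stored exponent
    fraction = bits & 0x7FFFFF       # 23-bit stored mantissa M

    if exponent == 0:                # zero, or 0.M x 2**-126
        return sign * (fraction / 2**23) * 2.0**-126
    if exponent == 255:              # infinity, or a special condition (NaN)
        return sign * math.inf if fraction == 0 else math.nan
    return sign * (1 + fraction / 2**23) * 2.0 ** (exponent - 127)


def reference_single(bits: int) -> float:
    """Let the struct module interpret the same 32 bits, for comparison."""
    return struct.unpack(">f", bits.to_bytes(4, "big"))[0]


pattern = 0x41600000  # 0 10000010 11000000000000000000000
print(decode_single(pattern), reference_single(pattern))  # 14.0 14.0
```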

27

Normalization in Floating Point

Mantissa:
Must always start with “1”
Leading bit is not stored
Implied that it is located to the left of the binary point
Normalized form: 1.MMMMMMM…

E.g.:
Mantissa: 10100000000000000000000
Actual value: $1.101_2 = 1.625_{10}$

Exponent:
Formatted using Excess-127 notation
Base 2 is implied
Range: $2^{-126}$ to $2^{127}$

28

Excess Notation: Example

Represent exponent of $14_{10}$ in excess-127 form:

$127_{10}$ = $01111111_2$

$+\,14_{10}$ = $+\,00001110_2$

Representation = $10001101_2$ = $141_{10}$

29

Excess Notation: Example

Represent exponent of $-8_{10}$ in excess-127 form:

$127_{10}$ = $01111111_2$

$-\,8_{10}$ = $-\,00001000_2$

Representation = $01110111_2$ = $119_{10}$

30

Single Precision: Example

0 10000010 11000000000000000000000

Sign: 0 = positive mantissa

Exponent: $10000010_2 = 130$; 130 - 127 = 3

Mantissa: $1.11_2 = 1.75_{10}$

$+1.75 \times 2^{3} = 14.0$

or $+1.11_2 \times 2^{3} = +1110.0_2 = 14$

31

Single Precision: Exercise

What decimal value is represented by the following 32-bit floating point number?

1 10000010 11110110000000000000000

32

Single Precision: Exercise Answer

What decimal value is represented by the following 32-bit floating point number?

1 10000010 11110110000000000000000

Answer: -15.6875

33

Step by Step Solution

1 10000010 11110110000000000000000

To decimal form

Exponent: 130 - 127 = 3

Mantissa: 1.11110110000000000000000

1 + .5 + .25 + .125 + .0625 + 0 + .015625 + .0078125 = 1.9609375

$1.9609375 \times 2^{3} = 15.6875$

Sign bit 1 (negative): -15.6875

34

Step by Step Solution: Alternative Method

1 10000010 11110110000000000000000

To decimal form

Exponent: 130 - 127 = 3

Mantissa: 1.11110110000000000000000

Shift “point” 3 places to the right: 1111.10110000000000000000

Sign bit 1 (negative): -15.6875
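
The step-by-step method can be reproduced with exact fractions, which makes each term of the sum visible. A minimal sketch, assuming a normalized number (stored exponent between 1 and 254); the name `decode_fields` is illustrative only.

```python
from fractions import Fraction


def decode_fields(pattern: str) -> Fraction:
    """Decode 'S EEEEEEEE MMM...M' (fields separated by spaces) exactly."""
    sign_bit, exponent_bits, mantissa_bits = pattern.split()
    exponent = int(exponent_bits, 2) - 127   # remove the excess-127 offset
    # Implied leading 1, then 1/2, 1/4, 1/8, ... for each stored bit that is set
    significand = 1 + sum(
        Fraction(int(bit), 2 ** (position + 1))
        for position, bit in enumerate(mantissa_bits)
    )
    value = significand * Fraction(2) ** exponent
    return -value if sign_bit == "1" else value


result = decode_fields("1 10000010 11110110000000000000000")
print(float(result))  # -15.6875
```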

35

IBM floating point formats

36

Alpha floating point formats

37

Exercise: Floating Point Conversion

Express 3.14 as a 32-bit floating point number

(Note: only use 10 significant bits for the mantissa)

38

Exercise: Floating Point Conversion Answer

Express 3.14 as a 32-bit floating point number

(Note: only use 10 significant bits for the mantissa)

Answer: 0 10000000 10010001111000000000000

39

Detailed Solution: 3.14 to IEEE single precision

3.14 to binary (approx.): 11.001000111101

Normalize and delete the implied left-most “1”: 1001000111101

Exponent = 127 + 1 (the point moved 1 position when normalized) = 128: 10000000

Value is positive: sign bit = 0

0 10000000 10010001111010000000000

Verify the result!
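
One way to verify such a conversion is to let the machine produce the single-precision bit pattern and compare its leading bits with the hand-derived ones. A short sketch using the standard `struct` module; the helper name `single_fields` is an assumption for illustration.

```python
import struct


def single_fields(x: float) -> str:
    """Return the IEEE 754 single-precision pattern of x as 'S EEEEEEEE M...M'."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    text = format(bits, "032b")
    return f"{text[0]} {text[1:9]} {text[9:]}"


pattern = single_fields(3.14)
# The hand calculation kept only the first 13 mantissa bits
print(pattern.startswith("0 10000000 1001000111101"))  # True
```

The full pattern carries 23 mantissa bits, so its trailing bits differ from the zero-padded hand result: 3.14 has no finite binary expansion and the hand calculation was cut off early.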

40

Packed Decimal Format

Limited use, e.g. where precision is particularly important, as in accounting and business functions

Similar to BCD: four-bit representation of each digit
Stores two digits per byte

Supported by business-oriented languages like COBOL

Implemented in IBM System 370/390 and Compaq Alpha

41

Packed Decimal Format

Each decimal digit is stored in BCD
Two digits in a byte
The most significant digit is stored first, in the high-order bits of the first byte
Can store up to 31 digits in 16 bytes

The sign is stored in the low-order bits of the last byte
Binary 1100 represents “+”
Binary 1101 represents “-”
Binary 1111 represents unsigned number

Decimal point not stored: must be maintained by application software

42

Packed Decimal Format: Example 1

Decimal Value: 1 0 3 5 7, unsigned

Packed Decimal:
Byte 1: 0001 0000
Byte 2: 0011 0101
Byte 3: 0111 1111

43

Packed Decimal Format: Example 2

Decimal Value: - 9 0 4 1 3

Packed Decimal:
Byte 1: 1001 0000
Byte 2: 0100 0001
Byte 3: 0011 1101
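
Packing digits two per byte with a trailing sign code can be sketched as follows. The function name `pack_decimal` and the `SIGN_CODES` table are illustrative; padding with a leading zero digit when the digit count is even is an assumption of this sketch, since the slides only show examples with an odd number of digits.

```python
# Sign codes stored in the low-order four bits of the last byte
SIGN_CODES = {"+": 0b1100, "-": 0b1101, "": 0b1111}  # "" means unsigned


def pack_decimal(digits: str, sign: str = "") -> bytes:
    """Pack a string of decimal digits and a sign into packed decimal bytes."""
    if not digits.isdigit():
        raise ValueError("digits must contain only 0-9")
    if len(digits) > 31:
        raise ValueError("at most 31 digits fit in 16 bytes")
    nibbles = [int(d) for d in digits] + [SIGN_CODES[sign]]
    if len(nibbles) % 2:          # assumption: pad in front to fill whole bytes
        nibbles.insert(0, 0)
    # Combine each pair of 4-bit values into one byte, high-order digit first
    return bytes((high << 4) | low for high, low in zip(nibbles[0::2], nibbles[1::2]))


print(pack_decimal("10357").hex())       # 10357f
print(pack_decimal("90413", "-").hex())  # 90413d
```

In hexadecimal each digit of the output is one four-bit group, so the packed bytes can be read off directly: the decimal digits followed by the sign code.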

44

Integer vs. Floating Point: Programming Considerations

Integer advantages
Easier for computer to perform
Potential for higher precision
Faster to execute
Fewer storage locations to save time and space

Most high-level languages provide 2 or more different integer word sizes/formats:
Short integer (16 bits)
Long integer (64 bits)

45

Integer vs. Floating Point: Programming Considerations

Real numbers, if:
Variable or constant has fractional part
Numbers take on very large or very small values outside integer range

Program should use least precision sufficient for the task

Higher precision formats require more storage

Packed decimal attractive alternative for business applications

46

Computer humour
