
CSE 541 – Numerical Methods

Machine Representation of Numbers

April 21, 2023 OSU/CIS 541 2

Blazing Fast

• Computers are extremely fast at performing simple arithmetic.
– Count from one to 1,000,000,000,000 by one
- or -
– Execute on a Pentium-4 2GHz machine:

int i = 1;
while (1) {
    printf("%d\n", i++);
}

– Which reaches one trillion faster?


Blazing Fast

• Assume a human can count on average one number per second (bad assumption).

• Implies 1 trillion seconds

• Or about 32,000 years!!!!

• Assume that a 2GHz machine can increment by one each clock cycle.

• Implies 500 seconds!!!!
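Both estimates are a single division each. A minimal C sketch, assuming a 365.25-day year (the slides do not say which year length they use); the function names are hypothetical:

```c
/* Seconds needed to count to n at a given rate (counts per second). */
double seconds_to_count(double n, double rate_hz)
{
    return n / rate_hz;
}

/* Convert seconds to years, assuming a 365.25-day year. */
double seconds_to_years(double seconds)
{
    return seconds / (365.25 * 24.0 * 3600.0);
}
```

Calling seconds_to_years(seconds_to_count(1e12, 1.0)) gives a little under 32,000 years for the human, while seconds_to_count(1e12, 2e9) gives 500 seconds for the 2GHz machine.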


Machine Precision

• What is our current National Debt?
– (see http://www.brillig.com/debt_clock/)
• What is the annual revenue of a Fortune 100 company?
• How many molecules are in one mole of gas (under ideal assumptions)?
• How many pixels are displayed in a two-hour HDTV movie?
• What is the current balance of Bill Gates's checking account?
• How many ounces of Pepsi are consumed per year?
• How many bytes of storage does the Graphics Group own?


Machine Precision

• The point is that we live in a world with a wide range of scales.

• Computers have limited capabilities.

• Let me repeat that: Computers have limited capabilities.


Machine Precision

Computers have limited capabilities!!!


Accuracy

• Do we really need that much accuracy?
– American Express thinks so
– DoD thinks so (see web article on Patriot missiles)


Accuracy and Range

• 16-bit integers (common about 5 years ago): $2^{16} = 65{,}536$
• 32-bit integers: $2^{32} = 4{,}294{,}967{,}296 \ll 1{,}000{,}000{,}000{,}000$
• Only 31 bits if you need signed integers.
• 64-bit integers: $2^{64} \approx 1.84 \times 10^{19}$
• Still no Avogadro's number: $6.022 \times 10^{23}$


Blazing Fast

• A 2 GHz 32-bit machine may be blazing fast, but it will never reach one trillion either if care isn't taken!!!
• Forever >> 32,000 years
– Humans win!!!
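The range limits can be checked directly. The sketch below uses the C99 fixed-width types from <stdint.h>; the function names are hypothetical. An n-bit unsigned integer has $2^n$ distinct values, so its largest value is $2^n - 1$. An unsigned 32-bit counter wraps around to 0 after 4,294,967,295, while overflowing a signed int (as in the earlier counting loop) is undefined behavior in C.

```c
#include <stdint.h>

/* Largest value of an unsigned integer with the given width (1..64 bits). */
uint64_t max_unsigned(int bits)
{
    return bits >= 64 ? UINT64_MAX : (((uint64_t)1 << bits) - 1);
}

/* Advance a 32-bit unsigned counter by one; UINT32_MAX wraps around to 0. */
uint32_t next32(uint32_t i)
{
    return (uint32_t)(i + 1u);
}

/* Returns 1 if an unsigned counter of the given width can hold target. */
int reachable(uint64_t target, int bits)
{
    return target <= max_unsigned(bits);
}
```

With these, reachable(1000000000000ULL, 32) is 0 and reachable(1000000000000ULL, 64) is 1, so a 64-bit counter is one way to "take care". Even max_unsigned(64) stays below Avogadro's number.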


Accuracy and Standards

• What is int i and what is its range?

• ANSI C standard
– int – machine dependent
– short – machine dependent

Not much of a standard, huh?


Standard Range

• CORBA
– long – 32-bit signed
– long long – 64-bit signed

• Java - standardized


Advice

• Never use int, long in C / C++
• Use the preprocessor to define your own types

#define CISint int /* I want this to be 32-bit */
#define CISshort short /* 16-bit signed integer */

• Hence, if you port to another architecture and need to change it, you only change it in one place.
• Likewise, use each package's types: GLfloat, …
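A variation on this advice, not taken from the slides: C99 added <stdint.h>, whose fixed-width types can back the same project-wide names through typedef instead of the preprocessor. The sketch below assumes a C11 compiler (for _Static_assert) and a platform that provides int32_t and int16_t, which the standard makes optional but mainstream platforms supply.

```c
#include <limits.h>
#include <stdint.h>

/* Project-wide aliases: change them here, and nowhere else, when porting. */
typedef int32_t CISint;   /* exactly 32 bits, signed */
typedef int16_t CISshort; /* exactly 16 bits, signed */

/* Fail the build, rather than misbehave at run time, if a width is wrong. */
_Static_assert(sizeof(CISint) * CHAR_BIT == 32, "CISint must be 32 bits");
_Static_assert(sizeof(CISshort) * CHAR_BIT == 16, "CISshort must be 16 bits");
```

A typedef is a real type name, so it obeys scope rules and shows up in compiler diagnostics, which a #define does not.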


Numbers

• Whole numbers: 0,1,2,…

• Signed integers: …,-1,0,1,2,…

• Rational numbers: a/b, 1/3, etc.

• Irrational numbers: sqrt(2), e, pi, etc.

• Is there a difference between 0.1 and 1/3?


Representing Real Numbers

• In Base-10, we can express any number as a sum of weighted powers of ten.

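• Hence:

$$3.141526 = 3 \times 10^{0} + 1 \times 10^{-1} + 4 \times 10^{-2} + 1 \times 10^{-3} + \cdots$$

$$256 = 2 \times 10^{2} + 5 \times 10^{1} + 6 \times 10^{0}$$

$$0.75 = 7 \times 10^{-1} + 5 \times 10^{-2}$$

$$0.1 = 1 \times 10^{-1}$$

$$\frac{1}{3} = \sum_{i=1}^{\infty} 3 \times 10^{-i}$$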


Representing Real Numbers

• Using Base-2 or binary, what are our series?

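• What coefficients can we have?

$$256 = 1 \times 2^{8}$$

$$0.75 = 1 \times 2^{-1} + 1 \times 2^{-2}$$

$$0.1 = 1 \times 2^{-4} + 1 \times 2^{-5} + 1 \times 2^{-8} + \cdots$$

$$\frac{1}{3} = \;?$$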
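The binary coefficients can be generated by repeated doubling: each doubling shifts the next binary digit to the left of the point. This is a minimal sketch with a hypothetical function name. It assumes 0 <= x < 1, and the double passed in is itself only the nearest binary approximation of values such as 0.1 or 1/3.

```c
/* Write the first nbits binary digits of x (0 <= x < 1) after the binary
   point into out, as the characters '0' and '1', followed by a terminator.
   out must have room for nbits + 1 characters. */
void binary_fraction(double x, int nbits, char *out)
{
    for (int i = 0; i < nbits; i++) {
        x *= 2.0;               /* shift the next binary digit past the point */
        if (x >= 1.0) {
            out[i] = '1';
            x -= 1.0;           /* drop the digit just emitted */
        } else {
            out[i] = '0';
        }
    }
    out[nbits] = '\0';
}
```

For 0.75 the digits terminate after two places. For 0.1 and 1/3 they repeat forever, so neither has a finite binary expansion and both are rounded on storage. That is the sense in which 0.1 and 1/3 are alike on a binary machine.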


Limiting Precision

• Truncate
– Truncate to what?
  • Truncate to 8 significant digits.
– Okay: 3.1415926535897932384626433832795 → 3.1415926


Limiting Precision

• Round to nearest value: 3.1415926535897932384626433832795 → 3.1415927
• Rules:
– If the digit after the last retained digit is > 5, then round up.
– If the digit after the last retained digit is < 5, then truncate.
– If the digit after the last retained digit is = 5, then:
  • If any of the remaining digits is nonzero, round up.
  • Else, randomly round up or truncate.
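Truncation and rounding can be compared with a small sketch. It makes several simplifying assumptions that are not in the slides: values are nonnegative, the result is returned as an integer scaled by a power of ten, and ties are always rounded up rather than broken randomly. The function names are hypothetical.

```c
/* 10 raised to a small nonnegative integer power. */
double pow10i(int decimals)
{
    double p = 1.0;
    for (int i = 0; i < decimals; i++)
        p *= 10.0;
    return p;
}

/* Keep `decimals` digits after the point by chopping the rest;
   the result is the kept digits as an integer (scaled by 10^decimals). */
long long truncate_scaled(double x, int decimals)
{
    return (long long)(x * pow10i(decimals));
}

/* Same, but round to nearest (ties go up); valid for x >= 0 only. */
long long round_scaled(double x, int decimals)
{
    return (long long)(x * pow10i(decimals) + 0.5);
}
```

Keeping seven digits after the point of pi (eight significant digits), truncate_scaled yields the digits 31415926 and round_scaled yields 31415927, matching the two slides.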


Error

• We need to truncate rational and irrational numbers to make them more manageable.

• Absolute Error: $x - x'$, where $x'$ is the approximation of $x$.
– Typically want the absolute value: $|x - x'|$
• Relative Error: $\dfrac{|x - x'|}{|x|}$
• Can I have a very small absolute error, but a large relative error?
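The two error measures translate directly into code. This is a minimal sketch with hypothetical function names; rel_error assumes the exact value is nonzero.

```c
/* Absolute value, written out to keep the sketch free of library calls. */
double magnitude(double v)
{
    return v < 0.0 ? -v : v;
}

/* |x - approx| */
double abs_error(double x, double approx)
{
    return magnitude(x - approx);
}

/* |x - approx| / |x|, for x != 0 */
double rel_error(double x, double approx)
{
    return magnitude(x - approx) / magnitude(x);
}
```

This also answers the closing question: with x = 1e-6 and an approximation of 2e-6, the absolute error is only 1e-6, yet the relative error is 1.0, i.e. 100%.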


Condition of a Problem

• For a problem with input (data) x and output y, y = F(x). The problem is said to be well-conditioned if small changes in x lead to small changes in y.

• Otherwise, we say the problem is ill-conditioned.


Stability of an Algorithm

• Stability indicates the sensitivity of an algorithm for solving a problem.

• An algorithm is said to be stable if small changes in the input x lead to small changes in the output y.

• Otherwise, the algorithm is said to be unstable.


Condition and Stability

• Condition => data
• Stability => algorithm

• Ill-conditioned – very hard to get a good result with even the best algorithm.

• Stable – given good data (not ill-conditioned), the algorithm will not yield drastically different results if round-off or small noise is added to the input data.


Precision

$x = .256834 \times 10^{5}$

• The digit 2 is the most significant digit, while 4 is the least significant digit.

• Precision should not be confused with accuracy. Accuracy is how close your solution is to the actual solution, e.g., you missed the bull's-eye by 2 inches.

• Precision is how good your estimate of the accuracy is, e.g., 2.0" ± 0.003".


Precision

• If you miss the bulls-eye, but repeatedly hit the same location, then it is precise, but inaccurate.

• For instance, using Taylor’s series for the approximation of sin(x), I can obtain a precise number for values of x larger than 2, but the solution would be inaccurate.

• In a computer, precision is usually how many digits of accuracy your machine representation has (number of bits in the mantissa).


Loss of Significance

• Consider the error for x-y using 5 decimal digits of precision:

x = .3721448693
y = .3720214371
x' = .37214
y' = .37202
x' - y' = .00012
x - y = .0001234322


Loss of Significance

• The relative error is:

$$\frac{|(x - y) - (x' - y')|}{|x - y|} = \frac{.0000034322}{.0001234322} \approx 3 \times 10^{-2}$$

• However, the relative error of x' and y' is only $1.3 \times 10^{-5}$.

• We lost 3 significant digits.
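The numbers on this slide can be reproduced in double precision, which has enough digits to treat the ten-digit x and y as exact for this purpose. This is a sketch with hypothetical function names.

```c
/* Relative error of approx with respect to a nonzero exact value. */
double rel_err(double exact, double approx)
{
    double d = exact - approx;
    if (d < 0.0)
        d = -d;
    return d / (exact < 0.0 ? -exact : exact);
}

/* Relative error of the rounded difference xr - yr against x - y. */
double difference_rel_error(double x, double y, double xr, double yr)
{
    return rel_err(x - y, xr - yr);
}
```

With x = .3721448693, y = .3720214371 and the five-digit values .37214 and .37202, rel_err(x, .37214) is about 1.3e-5, while difference_rel_error is about 2.8e-2, which the slide rounds to 3e-2. The inputs were accurate to almost five digits, but the difference is accurate to fewer than two.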


Loss of Precision Theorem

• Examine the precision loss for x - y.
• Let x and y be normalized floating-point numbers, with x > y > 0. If $2^{-p} \le 1 - \dfrac{y}{x} \le 2^{-q}$ for some positive integers p and q, then at most p and at least q significant binary bits are lost in the subtraction.
• Rule: The closer two numbers, the greater the loss of significance.
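As a worked check (not from the slides), the theorem can be applied to the values from the loss-of-significance example, x = 0.3721448693 and y = 0.3720214371:

```latex
1 - \frac{y}{x} = \frac{0.0001234322}{0.3721448693} \approx 3.317 \times 10^{-4},
\qquad
2^{-12} \approx 2.441 \times 10^{-4} \;\le\; 3.317 \times 10^{-4} \;\le\; 2^{-11} \approx 4.883 \times 10^{-4}
```

So p = 12 and q = 11: between 11 and 12 bits are lost. At about 0.301 decimal digits per bit, that is roughly 3.3 to 3.6 decimal digits, which agrees with the three significant digits lost in that example.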


Avoiding loss of Precision

• Use double precision or higher

• Modify the calculations to remove subtraction of numbers close together.

• Consider $f(x) = \sqrt{x^2 + 1} - 1$ as x approaches 0.

• Reorder to remove the subtraction:

$$f(x) = \left(\sqrt{x^2 + 1} - 1\right) \frac{\sqrt{x^2 + 1} + 1}{\sqrt{x^2 + 1} + 1} = \frac{x^2}{\sqrt{x^2 + 1} + 1}$$
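A worked example of the difference between the two forms, assuming IEEE single precision with round-to-nearest and the illustrative value $x = 10^{-4}$ (neither assumption is from the slides); $\mathrm{fl}(\cdot)$ denotes the rounded machine result:

```latex
x = 10^{-4}: \quad x^2 = 10^{-8} < 2^{-24} \approx 5.96 \times 10^{-8}
\;\Rightarrow\; \mathrm{fl}(x^2 + 1) = 1
\;\Rightarrow\; \sqrt{1} - 1 = 0 \quad \text{(every significant digit is lost)}

\frac{x^2}{\sqrt{x^2 + 1} + 1} \approx \frac{10^{-8}}{1 + 1} = 5 \times 10^{-9}
\qquad \text{(true value} \approx 5 \times 10^{-9} - 1.25 \times 10^{-17}\text{)}
```

The original form returns 0, a 100% relative error. The reordered form is correct to roughly single-precision accuracy, because it never subtracts two nearly equal numbers.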


How do they do that?

• Microsoft Windows ships with a simple calculator for its 32-bit operating system.

• What would you expect if you typed in $2^{96}$?
– or [an expression combining $2^{96}$, 1, and 0.33452]
– $2^{96}$ = 79,228,162,514,264,337,593,543,950,336


Performance vs. Accuracy

• With a 2.2 GHz computer, you can do a lot of number crunching (two ops per clock cycle).
• Real-time applications
– Audio must be processed at 4 kHz
– One trillion time steps of crash simulation
• Decide whether accuracy or performance is more important
• User interactivity (e.g., calculator):
– No need to calculate things faster than the user's fat fingers can type them.
  • Use slower, but more accurate algorithms.


Integers

• Exact to within the machine's range:
– $\pm(2^{N-1} - 1)$ for an N-bit signed integer

• See the /usr/include/limits.h file
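The formula can be checked against the constants in <limits.h>. This is a sketch with hypothetical function names. It assumes two's-complement integers without padding bits, which holds on mainstream platforms, and a 64-bit long long.

```c
#include <limits.h>

/* Largest N-bit two's-complement signed value, 2^(N-1) - 1,
   for 2 <= nbits <= 64. */
long long signed_max(int nbits)
{
    return (long long)((1ULL << (nbits - 1)) - 1ULL);
}

/* Number of bits in an int on this machine (assumes no padding bits). */
int int_bits(void)
{
    return (int)(sizeof(int) * CHAR_BIT);
}
```

On such a platform, signed_max(int_bits()) equals INT_MAX. Two's complement also has one extra negative value, so INT_MIN is -INT_MAX - 1.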


Fixed-Point Numbers

• Used to store fractional (non-integer) numbers with a fixed range.

• Say I need values for the surface temperatures of the earth. Restrict the values to the range (-999,999).

• I could use signed integers, but would only have three significant digits.

• I can also represent any temperature with fixed point numbers of the form xxx.xxxxx to allow for eight significant digits.


Implementing Fixed-point Arithmetic

• Fixed point is easy to implement in most cases. You simply ignore the decimal point for all operations, use integers for your calculations, and then put the decimal point back in when it's needed (e.g., for I/O).
– 31.45330 => 3145330 (need to remember to divide by 100,000 later).
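A minimal sketch of this scheme for the xxx.xxxxx temperature format, with five digits after the point stored in a 32-bit integer. The type and function names are hypothetical. Values up to 999.99999 scale to less than $10^8$, which fits comfortably in 32 bits.

```c
#include <stdint.h>

#define FIXED_SCALE 100000 /* five digits after the decimal point */

typedef int32_t fixed_t;   /* stores value * FIXED_SCALE */

/* Convert from double, rounding to the nearest representable step. */
fixed_t to_fixed(double v)
{
    return (fixed_t)(v * FIXED_SCALE + (v < 0 ? -0.5 : 0.5));
}

/* Put the decimal point back in, e.g. for output. */
double from_fixed(fixed_t f)
{
    return (double)f / FIXED_SCALE;
}

/* Addition and subtraction are plain integer operations. */
fixed_t fixed_add(fixed_t a, fixed_t b)
{
    return a + b;
}

/* A product carries the scale twice, so divide one factor back out.
   The intermediate needs 64 bits; the result truncates toward zero. */
fixed_t fixed_mul(fixed_t a, fixed_t b)
{
    return (fixed_t)(((int64_t)a * b) / FIXED_SCALE);
}
```

Multiplication is one of the cases that is not quite "ignore the decimal point": the product of two scaled values has to be rescaled, and the intermediate result can exceed 32 bits.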


Floating Point Numbers

• Split into three parts:
– Sign
– Mantissa
– Exponent

• $+6.024 \times 10^{23}$


Floating-Point Numbers

• CPU architects have choices on how many bits to allocate to each component.

• Sign (±) – 1 bit

• Exponent – n-bits (Determines the range of numbers)

• Mantissa – whatever is left (Determines the precision of the numbers)


IEEE Floating Point

• Single-precision:
– Sign (±) – 1 bit (s)
– Exponent – 8 bits (e)
– Mantissa – 23 bits (m)

$$(-1)^{s} \times 2^{e - 127} \times (1.m)$$

• Notes:
– +0 and –0 are represented differently.
– +∞ and –∞, as well as NaNs, have special representations.
– The mantissa is in normalized form, meaning the first digit is a one. As such, it is not stored and we have 24 bits of precision.
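The three fields can be pulled out of a float by copying its bytes into a 32-bit integer. This sketch assumes float is the IEEE single-precision format described here, which is true on mainstream hardware but not guaranteed by the C standard. The struct and function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

typedef struct {
    unsigned sign;     /* s: 1 bit */
    unsigned exponent; /* e: 8 bits, biased by 127 */
    uint32_t mantissa; /* m: 23 stored bits; the leading 1 is implicit */
} FloatFields;

FloatFields decode_float(float f)
{
    uint32_t bits;
    FloatFields out;

    /* Copy the bytes instead of casting pointers, which avoids aliasing problems. */
    memcpy(&bits, &f, sizeof bits);
    out.sign     = (unsigned)(bits >> 31);
    out.exponent = (unsigned)((bits >> 23) & 0xFFu);
    out.mantissa = bits & 0x7FFFFFu;
    return out;
}
```

For 1.0f this gives s = 0, e = 127, m = 0. For -0.75f, which is $-1.1_2 \times 2^{-1}$, it gives s = 1, e = 126, and only the top mantissa bit set. Decoding 0.0f and -0.0f shows that they differ only in the sign bit.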


Limits for Floats

• Look at the C include file float.h

#define DBL_DIG 15 /* # of decimal digits of precision */
#define DBL_EPSILON 2.2204460492503131e-016 /* smallest such that 1.0+DBL_EPSILON != 1.0 */
#define DBL_MANT_DIG 53 /* # of bits in mantissa */
#define DBL_MAX 1.7976931348623158e+308 /* max value */
#define DBL_MAX_10_EXP 308 /* max decimal exponent */
#define DBL_MAX_EXP 1024 /* max binary exponent */
#define DBL_MIN 2.2250738585072014e-308 /* min positive value */
#define DBL_MIN_10_EXP (-307) /* min decimal exponent */
#define DBL_MIN_EXP (-1021) /* min binary exponent */
#define _DBL_RADIX 2 /* exponent radix */
#define _DBL_ROUNDS 1 /* addition rounding: near */

#define FLT_DIG 6 /* # of decimal digits of precision */
#define FLT_EPSILON 1.192092896e-07F /* smallest such that 1.0+FLT_EPSILON != 1.0 */
#define FLT_GUARD 0
#define FLT_MANT_DIG 24 /* # of bits in mantissa */
#define FLT_MAX 3.402823466e+38F /* max value */
#define FLT_MAX_10_EXP 38 /* max decimal exponent */
#define FLT_MAX_EXP 128 /* max binary exponent */
#define FLT_MIN 1.175494351e-38F /* min positive value */
#define FLT_MIN_10_EXP (-37) /* min decimal exponent */
#define FLT_MIN_EXP (-125) /* min binary exponent */
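DBL_EPSILON can also be found experimentally by halving a candidate until adding half of it to 1.0 no longer changes the sum. This sketch assumes IEEE double precision with round-to-nearest. The volatile qualifier forces each sum to be stored as a real double, so that extra-wide registers on some machines do not hide the rounding. The function name is hypothetical.

```c
#include <float.h>

/* Halve eps until 1.0 + eps/2 rounds back to 1.0; the eps that remains is
   the gap between 1.0 and the next larger double. */
double machine_epsilon(void)
{
    double eps = 1.0;

    for (;;) {
        volatile double sum = 1.0 + eps / 2.0;
        if (sum == 1.0)
            break;
        eps /= 2.0;
    }
    return eps;
}
```

Under these assumptions the loop stops at $2^{-52}$, the value float.h lists as DBL_EPSILON.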


Divide by zero

• What would happen when you ran this program?

#include <stdio.h>

int main(void)
{
    float x = 0.230134789;
    float y = 0.230134788;
    float inverse;

    inverse = 1.0 / (x-y);
    printf("%f\n", inverse);
    return 0;
}
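Whether this divides by zero depends on how x and y are rounded when stored. Near 0.23 adjacent floats are about 1.5e-8 apart, so two values that differ by 1e-9 can land on the same float. The sketch below assumes IEEE 754 arithmetic, where dividing a nonzero value by zero yields an infinity instead of stopping the program; the C standard on its own leaves division by zero undefined. The function names are hypothetical.

```c
#include <float.h>

/* 1 / (x - y) in single precision. If x and y round to the same float,
   the divisor is exactly zero. */
float inverse_of_difference(float x, float y)
{
    return 1.0f / (x - y);
}

/* Returns 1 if v is beyond the largest finite float, i.e. +infinity. */
int is_positive_infinity(float v)
{
    return v > FLT_MAX;
}
```

For the two constants in the program, x == y holds once both are stored as float, so under these assumptions the inverse is +infinity rather than the 1e9 that exact arithmetic would give.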


Resolve and Real Numbers

• One of the beautiful benefits of C++ is the ability to redefine or overload functions in a class hierarchy.

• With the CSE department’s Resolve language, you used the type, Real.

• This class (as well as Resolve) is used to ensure that you do not divide by zero.

• Therefore, it overloads the operator/ method, performs a quick (???) check, and then performs the divide (if safe).


Designing An Extended Precision Class

• C++ (and other object-oriented languages like Java), allow most of the basic language to be overloaded.

• You can use this feature to define new arithmetic for:
– limiting numbers: 0.0 -> 1.0, 32-63, …
– embedding debugging information: "print if value = pi"
– keeping track of the error associated with the operation. This is called Interval Analysis.
• Let's look at designing an extended precision class.


Extended Precision with C++

• Let’s say we want over 1000-bits of precision for our application.

• Furthermore, let’s assume our numbers encode temperatures and only range from -273° to 65,000 ° C.

• Key questions:
– What should our basic number format be?
– What operations are we supporting?
– How do we implement those operations?


Decimal Type

• C# has a new decimal type:
– 128-bit format
– Base-10 arithmetic!
– 28 significant figures
• That should handle Bill Gates's checkbook!!!
– Fixed-point, floating point or integer?
– Bit layout?


Other C# features

• Keywords checked or unchecked.
– Controls whether to throw an OverflowException or not.
– For integer arithmetic only.
– int wontThrow = unchecked( System.Int32.MaxValue + 1);
– int willThrow = checked( System.Int32.MaxValue + 1);
– Can also be used in blocks or as a compiler option.
• Many exceptions can be thrown and caught.
• Implicit type casting only happens when widening.
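C has no checked or unchecked keywords, and overflowing a signed int is undefined behavior, so the test has to be made before the addition. The following is a rough C analogue of a checked add. It is not how C# implements the feature, and the function name is hypothetical.

```c
#include <limits.h>

/* Add a and b if the result fits in an int. Returns 1 and stores the sum
   on success; returns 0 and leaves *sum untouched if it would overflow. */
int checked_add(int a, int b, int *sum)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return 0;
    *sum = a + b;
    return 1;
}
```

The comparisons are arranged so that they cannot overflow themselves: INT_MAX - b is safe when b is positive, and INT_MIN - b is safe when b is negative.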
