CSE 541 – Numerical Methods
Machine Representation of Numbers
April 21, 2023 OSU/CIS 541 2
Blazing Fast
• Computers are extremely fast at performing simple arithmetic.
– Count from one to 1,000,000,000,000 by ones
- or -
– Execute on a Pentium-4 2GHz machine:
int i = 1;
while (1) {
    printf("%d\n", i++);
}
– Which reaches one trillion faster?
Blazing Fast
• Assume a human can count on average one number per second (bad assumption).
• Implies 1 trillion seconds
• Or about 32,000 years!!!!
• Assume that a 2GHz machine can increment by one each clock cycle.
• Implies 500 seconds!!!!
Machine Precision
• What is our current national debt?
– (see http://www.brillig.com/debt_clock/)
• What is the annual revenue of a Fortune 100 company?
• How many molecules are in one mole of gas (under ideal assumptions)?
• How many pixels are displayed in a two-hour HDTV movie?
• What is the current balance of Bill Gates’s checking account?
• How many ounces of Pepsi are consumed per year?
• How many bytes of storage does the Graphics Group own?
Machine Precision
• The point is that we live in a world with a wide range of scales.
• Computers have limited capabilities.
• Let me repeat that: Computers have limited capabilities.
Machine Precision
Computers have limited capabilities!!!
Accuracy
• Do we really need that much accuracy?
– American Express thinks so
– DoD thinks so (see web article on Patriot missiles)
Accuracy and Range
• 16-bit integers (common about 5 years ago)
$2^{16} = 65{,}536$
• 32-bit integers
$2^{32} = 4{,}294{,}967{,}296 \ll 1{,}000{,}000{,}000{,}000$
• Only 31 bits if you need signed integers.
• 64-bit integers
$2^{64} \approx 1.84 \times 10^{19}$
• Still no Avogadro’s number: $6.022 \times 10^{23}$
Blazing Fast
• A 2GHz 32-bit machine may be blazing fast, but it will never reach one trillion either, if care isn’t taken!!!
• Forever >> 32,000 years
– Humans win!!!
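Why a 32-bit counter never gets there can be sketched in a few lines. This is only an illustration: it uses an unsigned 32-bit counter, because unsigned overflow is defined to wrap around, whereas overflow of the signed `int` in the earlier loop is undefined behavior in C and C++. The names `kOneTrillion`, `next32`, and `can_reach_one_trillion_32` are made up for this sketch.

```cpp
#include <cstdint>

// One trillion, held in a 64-bit type so the constant itself is exact.
constexpr std::uint64_t kOneTrillion = 1000000000000ULL;

// Increment a 32-bit counter once. Unsigned arithmetic wraps modulo 2^32,
// so the successor of 4,294,967,295 is 0.
std::uint32_t next32(std::uint32_t i) {
    return i + 1u;
}

// True only if the largest 32-bit value is at least one trillion (it is not).
bool can_reach_one_trillion_32() {
    return static_cast<std::uint64_t>(UINT32_MAX) >= kOneTrillion;
}
```

Since the counter wraps back to 0 after 2^32 - 1, a loop that waits for it to equal one trillion runs forever.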
Accuracy and Standards
• What is int i and what is its range?
• ANSI C standard
– int – machine dependent
– short – machine dependent
Not much of a standard, huh?
Standard Range
• CORBA
– long – 32-bit signed
– long long – 64-bit signed
• Java – standardized
Advice
• Never use int, long in C / C++
• Use the preprocessor to define your own types
#define CISint int      // I want this to be 32-bit
#define CISshort short  // 16-bit signed integer
• Hence, if you port to another architecture and need to change it, only change it in one place.
• Likewise, use each package’s types: GLfloat, …
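On compilers that provide the C99 / C++11 fixed-width integer types, the same advice can be followed without guessing what `int` means on each machine. The following is a sketch of that alternative, not the approach shown on the slide; the alias names simply reuse CISint and CISshort.

```cpp
#include <climits>
#include <cstdint>

// Project-wide aliases; change them here and nowhere else when porting.
typedef std::int32_t CISint;    // exactly 32-bit signed
typedef std::int16_t CISshort;  // exactly 16-bit signed

// Fail at compile time, rather than at run time, if an alias has the wrong width.
static_assert(sizeof(CISint) * CHAR_BIT == 32, "CISint must be 32 bits");
static_assert(sizeof(CISshort) * CHAR_BIT == 16, "CISshort must be 16 bits");
```

A typedef also respects scope and shows up in compiler diagnostics, which a `#define` does not.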
Numbers
• Whole numbers: 0,1,2,…
• Signed integers: …,-1,0,1,2,…
• Rational numbers: a/b, 1/3, etc.
• Irrational numbers: sqrt(2), e, pi, etc.
• Is there a difference between 0.1 and 1/3?
Representing Real Numbers
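• In Base-10, we can express any number as a sum of weighted powers of ten.
• Hence:
$3.141592\ldots = 3 \times 10^{0} + 1 \times 10^{-1} + 4 \times 10^{-2} + 1 \times 10^{-3} + \cdots$
$256 = 2 \times 10^{2} + 5 \times 10^{1} + 6 \times 10^{0}$
$0.75 = 7 \times 10^{-1} + 5 \times 10^{-2}$
$0.1 = 1 \times 10^{-1}$
$\frac{1}{3} = \sum_{i=1}^{\infty} 3 \times 10^{-i}$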
Representing Real Numbers
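• Using Base-2 or binary, what are our series?
• What coefficients can we have?
$256 = 1 \times 2^{8}$
$0.75 = 1 \times 2^{-1} + 1 \times 2^{-2}$
$0.1 = 1 \times 2^{-4} + 1 \times 2^{-5} + 1 \times 2^{-8} + \cdots$
$\frac{1}{3} = {?}$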
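One way to explore the binary series is to generate the digits by repeated doubling: each doubling shifts the next binary digit in front of the point. The helper below, `binary_fraction`, is a hypothetical name used for this sketch; it assumes 0 <= x < 1 and reports the digits of the stored double, which agree with the true expansion for the first few dozen places.

```cpp
#include <string>

// Return the first n binary digits after the point of x, for 0 <= x < 1.
// Doubling a double and subtracting 1 are exact operations here, so the
// digits are exactly those of the stored value.
std::string binary_fraction(double x, int n) {
    std::string bits;
    for (int k = 0; k < n; ++k) {
        x *= 2.0;
        if (x >= 1.0) {
            bits += '1';
            x -= 1.0;
        } else {
            bits += '0';
        }
    }
    return bits;
}
```

Calling `binary_fraction(0.75, 8)` terminates after two digits, while 0.1 and 1/3 both produce repeating patterns, so neither can be stored exactly in a finite number of bits.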
Limiting Precision
• Truncate
– Truncate to what?
• Truncate to 8 significant digits.
– Okay:
3.1415926535897932384626433832795
→ 3.1415926
Limiting Precision
• Round to nearest value:
3.1415926535897932384626433832795
→ 3.1415927
• Rules:
– If the digit after the last retained digit is > 5, then round up.
– If the digit after the last retained digit is < 5, then truncate.
– If the digit after the last retained digit is = 5, then:
• If any of the remaining digits is nonzero, round up.
• Else, randomly round up or truncate.
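The difference between truncating and rounding can be sketched with scaled integers. The two helper names below are hypothetical, and the sketch makes simplifying assumptions: x is non-negative, x * 10^7 fits in a long long, and exact ties are always rounded up rather than broken randomly as the rules above suggest.

```cpp
// Keep 7 decimal places of x (8 significant digits for a value like pi)
// and return the result scaled by 10^7, i.e. as an integer count of 1e-7 units.

// Truncation: drop everything after the seventh decimal place.
long long truncate_scaled(double x) {
    return static_cast<long long>(x * 1e7);
}

// Rounding: add half a unit in the last retained place, then drop the rest.
long long round_scaled(double x) {
    return static_cast<long long>(x * 1e7 + 0.5);
}
```

For x = 3.14159265358979 the first helper yields 31415926 and the second 31415927, matching the two values of pi shown on the slides.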
Error
• We need to truncate rational and irrational numbers to make them more manageable.
• Absolute Error: $x - x'$, where $x'$ is the truncated value of $x$
– Typically want the absolute value: $|x - x'|$
• Relative Error: $\frac{|x - x'|}{|x|}$
• Can I have a very small absolute error, but a large relative error?
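A short sketch makes the closing question concrete. It takes the absolute error as |x - x'| and the relative error as |x - x'| / |x|, with x' the approximation; the function names are made up for this illustration, and the relative error assumes x is nonzero.

```cpp
#include <cmath>

// Absolute error |x - x'| between a true value x and its approximation xa.
double abs_error(double x, double xa) {
    return std::fabs(x - xa);
}

// Relative error |x - x'| / |x|; assumes x != 0.
double rel_error(double x, double xa) {
    return abs_error(x, xa) / std::fabs(x);
}
```

With x = 1e-10 and x' = 2e-10 the absolute error is only about 1e-10, yet the relative error is about 1, that is, 100%. So the answer to the question is yes, whenever x itself is tiny.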
Condition of a Problem
• For a problem with input (data) x and output y, y = F(x). The problem is said to be well-conditioned if small changes in x lead to small changes in y.
• Otherwise, we say the problem is ill-conditioned.
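Conditioning can be probed numerically by perturbing the input slightly and comparing relative changes. The sketch below is one possible way to do that, not a standard routine; `amplification` and the two example functions are hypothetical names chosen for this illustration.

```cpp
#include <cmath>

// Ratio of the relative change in y = F(x) to the relative change in x,
// estimated from one perturbed evaluation. Values near 1 suggest a
// well-conditioned problem at x; large values suggest ill-conditioning.
// Assumes x, F(x), and rel_dx are all nonzero.
double amplification(double (*F)(double), double x, double rel_dx) {
    double y = F(x);
    double y_perturbed = F(x * (1.0 + rel_dx));
    return std::fabs((y_perturbed - y) / y) / std::fabs(rel_dx);
}

// Well-conditioned everywhere: relative changes pass through unchanged.
double scale_by_two(double x) { return 2.0 * x; }

// Ill-conditioned near x = 1, where the output blows up.
double near_pole(double x) { return 1.0 / (1.0 - x); }
```

At x = 0.999 with a relative perturbation of 1e-6, `scale_by_two` amplifies the change by a factor of about 1, while `near_pole` amplifies it by roughly 1000.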
Stability of an Algorithm
• Stability indicates the sensitivity of an algorithm for solving a problem.
• An algorithm is said to be stable if small changes in the input x lead to small changes in the output y.
• Otherwise, the algorithm is said to be unstable.
Condition and Stability
• Condition => data
• Stability => algorithm
• Ill-conditioned – very hard to get a good result with even the best algorithm.
• Stable – given good data (not ill-conditioned), the algorithm will not yield drastically different results if round-off or small noise is added to the input data.
Precision
$x = .256834 \times 10^{5}$
• The digit 2 is the most significant digit, while 4 is the least significant digit.
• Precision should not be confused with accuracy. Accuracy is how close your solution is to the actual solution. Missed the bulls-eye by 2 inches.
• Precision is how good your estimate of the accuracy is, 2.0”±0.003
Precision
• If you miss the bulls-eye, but repeatedly hit the same location, then it is precise, but inaccurate.
• For instance, using Taylor’s series for the approximation of sin(x), I can obtain a precise number for values of x larger than 2, but the solution would be inaccurate.
• In a computer, precision is usually how many digits of accuracy your machine representation has (number of bits in the mantissa).
Loss of Significance
• Consider the error for x-y using 5 decimal digits of precision:
x = .3721448693
y = .3720214371
x’ = .37214
y’ = .37202
x’ − y’ = .00012
x − y = .0001234322
Loss of Significance
• The relative error is:
$\frac{|(x - y) - (x' - y')|}{|x - y|} = \frac{.0000034322}{.0001234322} \approx 3 \times 10^{-2}$
• However, the relative error of x’ and y’ is only $1.3 \times 10^{-5}$.
• We lost 3 significant digits.
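The numbers in this example can be reproduced with a small sketch. It chops each operand to five decimal digits and compares relative errors before and after the subtraction; `chop5` and `rel_err` are hypothetical helpers, and `chop5` assumes its argument lies in (0, 1).

```cpp
#include <cmath>

// Keep only the first five decimal digits of a value in (0, 1) by chopping.
double chop5(double v) {
    return static_cast<long long>(v * 1e5) / 1e5;
}

// Relative error |exact - approx| / |exact|; assumes exact != 0.
double rel_err(double exact, double approx) {
    return std::fabs(exact - approx) / std::fabs(exact);
}
```

For x = .3721448693 and y = .3720214371, `rel_err(x, chop5(x))` is about 1.3e-5, while `rel_err(x - y, chop5(x) - chop5(y))` is about 2.8e-2: the operands are accurate to roughly five digits, but their difference is accurate to only about two.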
Loss of Precision Theorem
• Examine the precision loss for x − y.
• Let x and y be normalized floating-point numbers, with x > y > 0. If $2^{-p} \le 1 - \frac{y}{x} \le 2^{-q}$ for some positive integers p and q, then at most p and at least q significant binary bits are lost in the subtraction.
• Rule: The closer two numbers, the greater the loss of significance.
Avoiding loss of Precision
• Use double precision or higher
• Modify the calculations to remove subtraction of numbers close together.
• Consider $f(x) = \sqrt{x^2 + 1} - 1$ as x approaches 0.
• Reorder to remove the subtraction:
$f(x) = \left(\sqrt{x^2 + 1} - 1\right) \frac{\sqrt{x^2 + 1} + 1}{\sqrt{x^2 + 1} + 1} = \frac{x^2}{\sqrt{x^2 + 1} + 1}$
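The effect of removing the subtraction can be checked directly. The sketch below evaluates f(x) = sqrt(x^2 + 1) - 1 both ways; the function names are made up for this illustration, and the behavior described assumes ordinary IEEE double-precision arithmetic.

```cpp
#include <cmath>

// Direct evaluation: subtracts two nearly equal numbers when x is small.
double f_naive(double x) {
    return std::sqrt(x * x + 1.0) - 1.0;
}

// Reordered form x^2 / (sqrt(x^2 + 1) + 1): no subtraction of close values.
double f_stable(double x) {
    double x2 = x * x;
    return x2 / (std::sqrt(x2 + 1.0) + 1.0);
}
```

For x = 1e-9, x^2 + 1 rounds to exactly 1 in double precision, so the direct form typically returns 0, whereas the reordered form returns about 5e-19, which is the correct leading term x^2 / 2.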
How do they do that?
• Microsoft Windows ships with a simple calculator for its 32-bit operating system.
• What would you expect if you typed in $2^{96}$?
– or
– $2^{96}$ = 79,228,162,514,264,337,593,543,950,336
96
96
962 0.33452
1 2
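An exact 29-digit answer for 2^96 cannot come from a 32-bit or 64-bit integer, so some form of multi-word arithmetic is needed. The sketch below shows one simple way to get it, by repeatedly doubling an array of decimal digits. It is only an illustration of the idea; nothing here is claimed about how the Windows calculator is actually implemented, and `pow2_decimal` is a hypothetical name.

```cpp
#include <string>
#include <vector>

// Compute 2^n exactly as a decimal string by doubling a digit array n times.
// Digits are stored least-significant first.
std::string pow2_decimal(unsigned n) {
    std::vector<int> digits(1, 1);  // start from the number 1
    for (unsigned k = 0; k < n; ++k) {
        int carry = 0;
        for (std::size_t i = 0; i < digits.size(); ++i) {
            int v = digits[i] * 2 + carry;
            digits[i] = v % 10;
            carry = v / 10;
        }
        if (carry != 0) {
            digits.push_back(carry);  // the number grew by one digit
        }
    }
    std::string s;
    for (std::size_t i = digits.size(); i-- > 0;) {
        s += static_cast<char>('0' + digits[i]);
    }
    return s;
}
```

`pow2_decimal(96)` returns the same digits as the value quoted on the slide, without the thousands separators.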
Performance vs. Accuracy
• With a 2.2GHz computer, you can do a lot of number crunching (two ops per clock cycle).
• Real-time applications
– Audio must be processed at 4 kHz
– One trillion time steps of crash simulation
• Decide whether accuracy or performance is more important
• User interactivity (e.g., calculator):
– No need to calculate things faster than the user’s fat fingers can type them.
• Use slower, but more accurate algorithms.
Integers
• Exact to within machine’s range:
– $\pm(2^{N-1} - 1)$
• See the /usr/include/limits.h file
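The range formula can be checked against the limits the compiler reports. The helper below is a hypothetical name for this sketch and is restricted to widths from 2 to 63 bits so that the shift itself cannot overflow.

```cpp
#include <cstdint>

// Largest value of an n-bit signed integer, 2^(n-1) - 1, for 2 <= n <= 63.
std::int64_t max_signed(int n) {
    return (std::int64_t{1} << (n - 1)) - 1;
}
```

`max_signed(16)` is 32767 and `max_signed(32)` equals `INT32_MAX`. On two's-complement machines the most negative value is one further from zero, -2^(N-1), so the symmetric range above is the portable part of it.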
Fixed-Point Numbers
• Used to store real numbers with a fixed range.
• Say I need values for the surface temperatures of the earth. Restrict the values to the range (-999,999).
• I could use signed integers, but would only have three significant digits.
• I can also represent any temperature with fixed point numbers of the form xxx.xxxxx to allow for eight significant digits.
Implementing Fixed-point Arithmetic
• Fixed point is easy to implement in most cases. You simply ignore the decimal point for all operations, use integers for your calculations, and then put the decimal point back in when it’s needed (e.g., for I/O).
31.45330 => 3145330 (need to remember to divide by 100,000 later).
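A minimal sketch of this scheme, assuming five decimal places and 64-bit storage, might look as follows. The type `Fixed5` and its helpers are invented for this illustration. Addition works directly on the scaled integers; multiplication is the case that needs care, because the raw product carries the scale factor twice.

```cpp
#include <cstdint>

// Number of raw units per 1.0: five decimal places.
constexpr std::int64_t kScale = 100000;

// Fixed-point value stored as an integer count of 1e-5 units.
struct Fixed5 {
    std::int64_t raw;  // real value * 100000
};

// Addition needs no adjustment: both operands share the same scale.
Fixed5 add(Fixed5 a, Fixed5 b) {
    return Fixed5{a.raw + b.raw};
}

// The raw product is scaled by kScale twice, so divide once to restore it.
// Truncates toward zero; assumes the intermediate product fits in 64 bits.
Fixed5 mul(Fixed5 a, Fixed5 b) {
    return Fixed5{a.raw * b.raw / kScale};
}
```

Here 31.45330 is stored as `Fixed5{3145330}`, and the decimal point is restored only for I/O by dividing `raw` by 100,000.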
Floating Point Numbers
• Split into three parts:
– Sign
– Mantissa
– Exponent
• $+6.024 \times 10^{23}$
Floating-Point Numbers
• CPU architects have choices on how many bits to allocate to each component.
• Sign (±) – 1-bit
• Exponent – n-bits (Determines the range of numbers)
• Mantissa – whatever is left (Determines the precision of the numbers)
IEEE Floating Point
• Single-precision:
• Sign (±) – 1-bit (s)
• Exponent – 8-bits (e)
• Mantissa – 23-bits (m)
$(-1)^{s} \times 2^{e - 127} \times (1.m)$
• Notes:
• +0 and –0 are represented differently
• +∞, −∞, and NaNs have special representations
• The mantissa is in normalized form, meaning the first digit is a one. As such, it is not stored and we have 24-bits of precision.
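The three fields of a single-precision value, sign s (1 bit), exponent e (8 bits), and mantissa m (23 bits), can be pulled apart with shifts and masks. This sketch assumes `float` is a 32-bit IEEE 754 single, which holds on common platforms but is not guaranteed by the language; `FloatFields` and `decode` are hypothetical names.

```cpp
#include <cstdint>
#include <cstring>

struct FloatFields {
    std::uint32_t sign;      // s: 1 bit
    std::uint32_t exponent;  // e: 8 bits, biased by 127
    std::uint32_t mantissa;  // m: 23 bits, implicit leading 1 not stored
};

// Split a float into its sign, exponent, and mantissa bit fields.
FloatFields decode(float f) {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "this sketch assumes a 32-bit float");
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);  // well-defined way to read the bits
    return FloatFields{bits >> 31, (bits >> 23) & 0xFFu, bits & 0x7FFFFFu};
}
```

For 1.0f the fields are s = 0, e = 127, m = 0. For -0.75f, which is -1.1 (binary) times 2^-1, they are s = 1, e = 126, and m with only its top bit set.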
Limits for Floats
• Look at the C include file float.h
#define DBL_DIG 15 /* # of decimal digits of precision */
#define DBL_EPSILON 2.2204460492503131e-016 /* smallest such that 1.0+DBL_EPSILON != 1.0 */
#define DBL_MANT_DIG 53 /* # of bits in mantissa */
#define DBL_MAX 1.7976931348623158e+308 /* max value */
#define DBL_MAX_10_EXP 308 /* max decimal exponent */
#define DBL_MAX_EXP 1024 /* max binary exponent */
#define DBL_MIN 2.2250738585072014e-308 /* min positive value */
#define DBL_MIN_10_EXP (-307) /* min decimal exponent */
#define DBL_MIN_EXP (-1021) /* min binary exponent */
#define _DBL_RADIX 2 /* exponent radix */
#define _DBL_ROUNDS 1 /* addition rounding: near */
#define FLT_DIG 6 /* # of decimal digits of precision */
#define FLT_EPSILON 1.192092896e-07F /* smallest such that 1.0+FLT_EPSILON != 1.0 */
#define FLT_GUARD 0
#define FLT_MANT_DIG 24 /* # of bits in mantissa */
#define FLT_MAX 3.402823466e+38F /* max value */
#define FLT_MAX_10_EXP 38 /* max decimal exponent */
#define FLT_MAX_EXP 128 /* max binary exponent */
#define FLT_MIN 1.175494351e-38F /* min positive value */
#define FLT_MIN_10_EXP (-37) /* min decimal exponent */
#define FLT_MIN_EXP (-125) /* min binary exponent */
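The meaning of FLT_EPSILON can be tested directly. The sketch below assumes IEEE single precision with the default round-to-nearest mode; `changes_one` is a hypothetical helper, and the `volatile` store is there only to force the sum to be rounded to `float` before the comparison.

```cpp
#include <cfloat>

// True if adding delta to 1.0f produces a float different from 1.0f.
bool changes_one(float delta) {
    volatile float sum = 1.0f + delta;  // force rounding to single precision
    return sum != 1.0f;
}
```

`changes_one(FLT_EPSILON)` is true, while `changes_one(FLT_EPSILON / 2)` is false under round-to-nearest, because the exact sum lies halfway between 1.0f and the next float and the tie goes back to 1.0f. FLT_EPSILON is therefore best read as the gap between 1.0f and the next representable float.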
Divide by zero
• What would happen when you ran this program?
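#include <stdio.h>

int main(void) {
    float x = 0.230134789;
    float y = 0.230134788;
    float inverse;
    inverse = 1.0 / (x - y);   /* x and y differ only in the ninth decimal digit */
    printf("%g\n", inverse);
    return 0;
}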
Resolve and Real Numbers
• One of the beautiful benefits of C++ is the ability to redefine or overload functions in a class hierarchy.
• With the CSE department’s Resolve language, you used the type, Real.
• This class (as well as Resolve) is used to ensure that you do not divide by zero.
• Therefore, it overloads the operator/ method, performs a quick (???) check and then performs the divide (if safe).
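A checked divide of this kind can be sketched in a few lines of C++. The class below is a hypothetical stand-in written for this illustration; it is not the Resolve `Real` type, and throwing `std::domain_error` is just one possible policy for an unsafe divide.

```cpp
#include <stdexcept>

// Hypothetical checked real type; not the actual Resolve Real.
class Real {
public:
    explicit Real(double v) : value_(v) {}
    double value() const { return value_; }

    Real operator-(const Real& rhs) const {
        return Real(value_ - rhs.value_);
    }

    // Check the divisor first, then divide only if it is safe.
    Real operator/(const Real& rhs) const {
        if (rhs.value_ == 0.0) {
            throw std::domain_error("division by zero");
        }
        return Real(value_ / rhs.value_);
    }

private:
    double value_;
};
```

The check matters because a zero divisor is not always visible in the source. In IEEE single precision, 0.230134789 and 0.230134788 round to the same `float`, so their difference is exactly zero even though the decimal literals differ.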
Designing An Extended Precision Class
• C++ (and some other object-oriented languages) allows most of the basic operators to be overloaded.
• You can use this feature to define new arithmetic for:
– limiting numbers: 0.0 -> 1.0, 32-63, …
– embedding debugging information: “print if value=pi”
– keeping track of the error associated with the operation. This is called Interval Analysis.
• Let’s look at designing an extended precision class.
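As a warm-up for the class design, the "limiting numbers" use of overloading can be sketched as a type whose values always stay within 0.0 to 1.0. The class name `Unit` and the choice to clamp out-of-range results, rather than report an error, are assumptions made for this illustration.

```cpp
// Hypothetical number type whose value is always kept within [0.0, 1.0].
class Unit {
public:
    explicit Unit(double v) : v_(clamp(v)) {}
    double value() const { return v_; }

    // Ordinary arithmetic, with the result clamped back into range
    // by the constructor.
    Unit operator+(Unit other) const { return Unit(v_ + other.v_); }
    Unit operator*(Unit other) const { return Unit(v_ * other.v_); }

private:
    static double clamp(double v) {
        return v < 0.0 ? 0.0 : (v > 1.0 ? 1.0 : v);
    }
    double v_;
};
```

Code using `Unit` reads like ordinary arithmetic, for example `Unit(0.75) + Unit(0.5)`, yet the result saturates at 1.0 instead of leaving the allowed range.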
Extended Precision with C++
• Let’s say we want over 1000-bits of precision for our application.
• Furthermore, let’s assume our numbers encode temperatures and only range from −273 °C to 65,000 °C.
• Key questions:
– What should our basic number format be?
– What operations are we supporting?
– How do we implement those operations?
Decimal Type
• C# has a new decimal type:
– 128-bit format
– Base-10 arithmetic!
– 28 significant figures
• That should handle Bill Gates’s checkbook!!!
– Fixed-point, floating point or integer?
– Bit layout?
Other C# features
• Keywords checked or unchecked.
– Controls whether to throw an OverflowException or not.
– For integer arithmetic only.
– int wontThrow = unchecked( System.Int32.MaxValue + 1);
– int willThrow = checked( System.Int32.MaxValue + 1);
– Can also be used in blocks or as a compiler option.
• Many exceptions can be thrown and caught.
• Implicit type casting only happens when widening.
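C and C++ have no built-in equivalent of `checked`, but the same behavior can be approximated by testing for overflow before the addition is performed. The function below is a hand-written sketch of that idea, shown in C++ for comparison with the rest of the course code; `checked_add` is a hypothetical name, and `std::overflow_error` merely plays the role of C#'s OverflowException.

```cpp
#include <cstdint>
#include <limits>
#include <stdexcept>

// Add two 32-bit signed integers, throwing instead of overflowing.
// The range tests are arranged so that they cannot overflow themselves.
std::int32_t checked_add(std::int32_t a, std::int32_t b) {
    const std::int32_t hi = std::numeric_limits<std::int32_t>::max();
    const std::int32_t lo = std::numeric_limits<std::int32_t>::min();
    if ((b > 0 && a > hi - b) || (b < 0 && a < lo - b)) {
        throw std::overflow_error("32-bit integer overflow");
    }
    return a + b;
}
```

`checked_add(INT32_MAX, 1)` throws, while in-range sums such as `checked_add(1, 2)` are returned unchanged.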