Week 4 Number Systems

Upload: preston-marshall

Post on 27-Dec-2015


Week 4

Number Systems

Large numbers in metric

Bytes

The bits in a computer are grouped into larger units.

A group of eight bits is called a byte. A byte can be used to store 2^8 = 256 different values.

These values can be numeric values or alphanumeric values.
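As a quick check of the 2^8 figure, the short Python sketch below counts the bit patterns that fit in one byte. The variable names are illustrative only and are not part of the slides.

```python
# Count the distinct bit patterns that fit in one byte (8 bits).
BITS_PER_BYTE = 8
distinct_values = 2 ** BITS_PER_BYTE

# The smallest and largest patterns, written out as 8 binary digits.
lowest = format(0, "08b")
highest = format(distinct_values - 1, "08b")

print(distinct_values)  # 256
print(lowest, highest)  # 00000000 11111111
```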

Question

Large Numbers

ASCII

When bytes are used to hold characters, a code must be used to define which numeric value represents which character.

The most common code is the American Standard Code for Information Interchange (ASCII).

One byte can hold one character.

ASCII (Contd)

The letters of the alphabet and the ten characters 0 to 9 are called alphanumeric characters.

When there are a number of characters stored together they are called strings of characters.

In ASCII the code for the letter A is 65 (base 10) or 41 (base 16).

In ASCII the code for the string of alphanumeric characters 36 is the ASCII code for 3 followed by the ASCII code for 6.

ASCII (Contd)

The ASCII code for 3 is 51 (base 10) or 33 (base 16).

The ASCII code for 6 is 54 (base 10) or 36 (base 16).

Therefore it takes two bytes of memory to store the alphanumeric string 36, one byte for each character. Contrast this with the numeric value 36 (base 10), which is 24 (base 16) and fits in a single byte.
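The difference between the string 36 and the number 36 can be checked with a few lines of Python. This is only a sketch using the built-in `str.encode`, `ord` and `format` functions; the variable names are made up for illustration.

```python
# The two-character string "36" stored as ASCII: one byte per character.
text = "36"
encoded = text.encode("ascii")
codes_decimal = list(encoded)   # code of '3', then code of '6'
codes_hex = encoded.hex()       # the same two codes in base 16

# The numeric value 36 stored as a number: a single byte is enough.
number = 36
number_hex = format(number, "x")
number_bytes = number.to_bytes(1, "big")

print(codes_decimal, codes_hex, len(encoded))   # [51, 54] 3336 2
print(number_hex, len(number_bytes))            # 24 1
print(ord("A"), format(ord("A"), "x"))          # 65 41
```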

Contd

When a byte is used to represent a number, the 256 different byte values can be interpreted as all positive numbers ranging from 0 to 255 (base 10).

To allow for negative numbers, the possible range can instead be divided in half to represent negative and positive numbers ranging from -128 through +127 (base 10).

Contd

Numbers that are all positive are referred to as unsigned numbers:

0 to 255 (base 10)

Numbers that can be positive or negative are referred to as signed numbers:

-128 to +127 (base 10)
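The same byte can be read either way. The sketch below uses Python's built-in `int.from_bytes`, whose `signed=True` mode uses two's complement. That is an assumption here: the slides do not name the encoding, although the range -128 to +127 matches it. The helper names `as_unsigned` and `as_signed` are hypothetical.

```python
def as_unsigned(byte: bytes) -> int:
    """Interpret a single byte as an unsigned number (0 to 255)."""
    return int.from_bytes(byte, byteorder="big", signed=False)


def as_signed(byte: bytes) -> int:
    """Interpret a single byte as a signed number (-128 to +127).

    Assumes two's complement, which is what int.from_bytes uses.
    """
    return int.from_bytes(byte, byteorder="big", signed=True)


for pattern in (0b00000000, 0b01111111, 0b10000000, 0b11111111):
    raw = bytes([pattern])
    print(format(pattern, "08b"), as_unsigned(raw), as_signed(raw))
# 00000000 0 0
# 01111111 127 127
# 10000000 128 -128
# 11111111 255 -1
```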

A typical 1 byte register in a computer may look like the one below.

Contd

If the number is unsigned then the largest number that can be represented is 255 (base 10), or 2^8 - 1.

Why? There are 256 possible values in the range 0 to 255.

If the number is signed then bit 8 (the 2^7 position), the most significant bit, is reserved as the sign bit.

Contd

The sign bit tells whether the number is positive (+) or negative (-).

A “0” in the sign bit means that the number is positive.

A “1” in the sign bit means that the number is negative.
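Reading the sign bit amounts to shifting the byte right by seven places and keeping the lowest bit. A minimal sketch follows; `sign_bit` is a hypothetical helper name, not something from the slides.

```python
def sign_bit(value: int) -> int:
    """Return the most significant bit (the 2^7 position) of an 8-bit value."""
    if not 0 <= value <= 0xFF:
        raise ValueError("expected a value that fits in one byte")
    return (value >> 7) & 1


print(sign_bit(0b01111111))  # 0, so the number is positive
print(sign_bit(0b10000000))  # 1, so the number is negative
```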

WORD

To handle larger numbers, several bytes are used together as a unit, often called a word.

Different computers give different meanings to the term word: it could represent 2 bytes (16 bits), 4 bytes (32 bits) or 8 bytes (64 bits).

Word (Contd)

A two byte word has 2^16 different possible values.

The 16 bit word can be used to store unsigned numbers ranging from 0 to 65,535 (base 10).

Therefore the largest unsigned number available using 2 bytes is (2^16 -1).
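The 2 byte limit can be checked with Python's standard `struct` module, where the format code `H` means an unsigned 16 bit value. This is a sketch only; `fits_in_word` is a hypothetical helper name and the big-endian byte order (`>`) is an arbitrary choice.

```python
import struct

WORD_BITS = 16
largest_unsigned = 2 ** WORD_BITS - 1


def fits_in_word(value: int) -> bool:
    """Return True if value can be packed into an unsigned 2 byte word."""
    try:
        struct.pack(">H", value)
    except struct.error:
        return False
    return True


packed = struct.pack(">H", largest_unsigned)

print(largest_unsigned)                    # 65535
print(len(packed), packed.hex())           # 2 ffff
print(fits_in_word(largest_unsigned))      # True
print(fits_in_word(largest_unsigned + 1))  # False
```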

Extended ASCII

8 bit code, also known as Latin 1.

Mostly Latin letters plus the accents and diacritical marks used, for example, in French and German.

Problem: Russian and Arabic have totally different alphabets.

Code Page

A set of 256 characters for a particular language or group of languages.

E.g. International Standard (IS) 8859-2 handles Latin based Central European languages such as Czech, Polish and Hungarian.

E.g. International Standard (IS) 8859-3 handles the characters needed for Turkish, Maltese, Esperanto and Galician.

Code Page

Problem: it is impossible to mix languages on different pages.

Problem: does not cover Japanese or Chinese.
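To see why code pages cannot be mixed, the sketch below decodes one and the same byte under two different pages using Python's built-in codecs. The byte value 0xB1 is an arbitrary choice for illustration.

```python
# One byte, two code pages: the stored value is identical,
# but the character it stands for depends on the page in use.
raw = bytes([0xB1])

latin1 = raw.decode("iso-8859-1")  # Western European page
latin2 = raw.decode("iso-8859-2")  # Central European page

print(latin1)  # ±
print(latin2)  # ą
print(latin1 == latin2)  # False
```

A document has no room to say which page each byte belongs to, so text written with one page and read with another comes out wrong.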

UNICODE

Unicode is replacing ASCII as the coding standard for characters.

The basic idea, in the original design, is to assign every character and symbol a unique 16 bit value called a code point. If you have a 16 bit code how many symbols can you represent?

Contd

As the world's languages contain roughly 200,000 symbols, code points are a scarce resource. The resource must be carefully guarded and assigned with care.

Problem: Chinese and Japanese have fewer code points allocated than symbols needed. Many are not happy with compromises that have been made.

ASCII Problem

• While one byte can represent 256 names for letters, no one agrees on what letters the numbers 128-255 stand for.

• That’s why your Mac Cayuga font shows up as gibberish on a Windows computer.

– The number ‘250’ doesn’t mean the same thing on a Mac as it does on a Windows PC.
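The point about the number 250 can be tried out in Python. The sketch assumes the classic Mac Roman encoding for the Mac and Windows-1252 for the PC; the slides do not name the encodings, so these are stand-ins chosen for illustration.

```python
# The same stored number, read with two platform encodings.
raw = bytes([250])

on_mac = raw.decode("mac_roman")   # assumed classic Mac encoding
on_windows = raw.decode("cp1252")  # assumed Windows encoding

print(repr(on_mac), repr(on_windows))
print(on_mac == on_windows)  # False: the two platforms disagree
```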

UNICODE, Fixing ASCII Problem

• Unicode aims to provide a unique name for every letter ever used…on the planet.

• It has room for more than 1,000,000 names.

• Everyone agrees on what letters the names stand for.

– Technical details not discussed here: getting from names like ‘U+0061’ to the letter ‘a’ on your computer.
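Without going into how fonts and encodings work, the link between a name like U+0061 and a letter can be shown with Python's built-in `chr` and `ord` and the standard `unicodedata` module. `code_point_label` is a hypothetical helper written for this sketch.

```python
import unicodedata


def code_point_label(ch: str) -> str:
    """Format a single character's code point in the usual U+XXXX style."""
    return f"U+{ord(ch):04X}"


letter = chr(0x0061)  # from code point to character

print(letter)                    # a
print(code_point_label("a"))     # U+0061
print(unicodedata.name(letter))  # LATIN SMALL LETTER A
```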

UNICODE, Advantages

• (Not quite yet, but in the near future) when you type in Cayuga, it will appear as Cayuga on any other computer.

• The same goes for web pages…

Task for you

Unicode version 4 incorporates a 32 bit code version and therefore can now represent all of the world's characters in its character coding system.

If you wish to research this further, the Unicode URL is: http://www.unicode.org/

Thank You