Data Representation Kieran Mathieson

Post on 21-Dec-2015


Page 1: Data Representation Kieran Mathieson. Outline Digital constraints Data types Integer Real Character Boolean Memory address

Data Representation

Kieran Mathieson


Outline

Digital constraints
Data types
    Integer
    Real
    Character
    Boolean
    Memory address


Digital Constraints

All we have to work with are electronic components
Easier to build accurate digital circuits than analog
Encode data in ways that can be implemented using cheap electronic components


Digital Constraints

NOT gate


Digital Constraints

OR/NOR gate


Digital Constraints

NAND gate


Digital Constraints

An Adder
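The gate slides build up to the adder. The circuit diagram itself is not reproduced in this transcript, so the following is only a minimal sketch of the idea: a one-bit full adder built from NAND gates alone, chained into a ripple-carry adder. The function names and the 4-bit default width are illustrative assumptions, not taken from the slides.

```python
def nand(a: int, b: int) -> int:
    """NAND gate on single bits (each input is 0 or 1)."""
    return 0 if (a and b) else 1

# Every other gate below is derived from NAND only.
def not_(a: int) -> int:
    return nand(a, a)

def and_(a: int, b: int) -> int:
    return not_(nand(a, b))

def or_(a: int, b: int) -> int:
    return nand(not_(a), not_(b))

def xor(a: int, b: int) -> int:
    n = nand(a, b)
    return nand(nand(a, n), nand(b, n))

def full_adder(a: int, b: int, carry_in: int) -> tuple[int, int]:
    """Add three bits; return (sum_bit, carry_out)."""
    partial = xor(a, b)
    total = xor(partial, carry_in)
    carry_out = or_(and_(a, b), and_(partial, carry_in))
    return total, carry_out

def add_bits(x: int, y: int, width: int = 4) -> tuple[int, int]:
    """Ripple-carry addition of two unsigned values in a fixed width.

    Returns (result, final_carry). A final carry of 1 means the true
    sum did not fit in `width` bits.
    """
    carry = 0
    result = 0
    for i in range(width):
        bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= bit << i
    return result, carry

print(add_bits(5, 6))  # (11, 0)
print(add_bits(9, 8))  # (1, 1): 17 does not fit in 4 bits
```

The second call shows why a fixed number of bits matters: the carry out of the top bit has nowhere to go.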


Digital Constraints

A Memory Cell (1 bit)


Digital Constraints

Binary data: 1 and 0
Fixed number of binary places
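As a small illustration of "a fixed number of binary places", the sketch below writes a value as a fixed-width bit string and rejects values that do not fit. The 8-bit width and the helper name `to_bits` are assumptions chosen for the example.

```python
WIDTH = 8  # assumed number of binary places for this example

def to_bits(value: int, width: int = WIDTH) -> str:
    """Return `value` as a bit string of exactly `width` digits."""
    if not 0 <= value < 2 ** width:
        raise ValueError(f"{value} does not fit in {width} bits")
    return format(value, f"0{width}b")

print(to_bits(5))    # 00000101
print(to_bits(255))  # 11111111
print(2 ** WIDTH)    # 256 distinct bit patterns
```

With 8 places there are exactly 256 patterns, so anything outside 0 to 255 cannot be stored without changing the width or the encoding.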


Data Types

1. Integer

2. Real number

3. Character

4. Boolean

5. Memory address


Integers

An integer is a whole number (for example: 3, 5, 6)
Integers can be signed or unsigned
A signed integer uses one bit to represent the sign
The sign bit is the high-order bit
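The same bit pattern can mean different numbers depending on whether the high-order bit is read as a sign. The sketch below assumes two's complement, which is the usual signed encoding on current hardware but is not named in the slides; the helper names are hypothetical.

```python
def as_unsigned(bits: str) -> int:
    """Read a bit string as an unsigned integer."""
    return int(bits, 2)

def as_signed(bits: str) -> int:
    """Read a bit string as a two's complement signed integer (assumed encoding)."""
    value = int(bits, 2)
    if bits[0] == "1":              # high-order bit set: the value is negative
        value -= 1 << len(bits)
    return value

pattern = "11111011"
print(as_unsigned(pattern))  # 251
print(as_signed(pattern))    # -5
print(as_signed("01111111")) # 127, the largest signed 8-bit value
```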


Range and Overflow

If a value is too large to store in the available 32 or 64 bits, overflow occurs
The CPU typically flags overflow; whether it is treated as an error depends on the hardware and the programming language
To reduce the risk of overflow, some computers and programming languages define additional, wider data types such as double-precision (long) integers
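Python integers grow as needed and never overflow, so the sketch below simulates what a 32-bit two's complement register would hold after an addition. The helper name `wrap_int32` and the choice of silent wraparound are assumptions for illustration; real languages differ in whether they wrap, trap, or leave the result undefined.

```python
INT32_MAX = 2 ** 31 - 1
INT32_MIN = -(2 ** 31)

def wrap_int32(value: int) -> int:
    """Reduce `value` to the 32-bit two's complement range by wrapping."""
    value &= 0xFFFFFFFF                       # keep only the low 32 bits
    if value & 0x80000000:                    # sign bit set
        return value - 0x100000000
    return value

print(wrap_int32(INT32_MAX + 1))  # -2147483648
print(INT32_MAX + 1 <= 2 ** 63 - 1)  # True: the same sum fits in 64 bits
```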


Floating Point


Floating Point (IEEE Format)

Issues: range, overflow, underflow, precision, truncation
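The format diagram is not reproduced in this transcript. As a sketch, the code below splits an IEEE 754 single-precision value into its sign, exponent, and fraction fields using the standard `struct` module, then shows precision loss, overflow, and underflow with ordinary Python floats (which are double precision). The function name `float32_fields` is hypothetical.

```python
import struct

def float32_fields(x: float) -> tuple[int, int, int]:
    """Return (sign, biased exponent, fraction) of x as an IEEE 754 single."""
    (raw,) = struct.unpack(">I", struct.pack(">f", x))
    sign = raw >> 31                 # 1 bit
    exponent = (raw >> 23) & 0xFF    # 8 bits, biased by 127
    fraction = raw & 0x7FFFFF        # 23 bits
    return sign, exponent, fraction

print(float32_fields(1.0))    # (0, 127, 0)
print(float32_fields(-2.5))   # (1, 128, 2097152)

# Precision: 0.1 has no exact binary representation.
precision_ok = (0.1 + 0.2 == 0.3)
print(precision_ok)           # False

# Overflow: the result is too large for a double and becomes infinity.
overflowed = 1e308 * 10
print(overflowed)             # inf

# Underflow: the result is too small to represent and becomes zero.
underflowed = 1e-308 * 1e-100
print(underflowed)            # 0.0
```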


Characters

Mapping from a glyph to a number


Characters

The most common character code in computing is ASCII
    Has 128 characters (codes 0 to 127)
    Need 7 bits to represent ASCII characters
    * = 42, 0 = 48, A = 65
    Low numbers reserved for control characters
Some national variants of ASCII
    US version is often called US-ASCII
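The codes on this slide can be checked with Python's built-in `ord` and `chr`, which agree with ASCII for this range. The variable names are illustrative.

```python
# Character-to-number mapping for the examples on the slide.
codes = {ch: ord(ch) for ch in "*0A"}
print(codes)                      # {'*': 42, '0': 48, 'A': 65}

# Each code fits in 7 bits.
for ch, code in codes.items():
    print(ch, code, format(code, "07b"))

# Low numbers are control characters, such as tab and newline.
control_codes = {"tab": ord("\t"), "newline": ord("\n")}
print(control_codes)              # {'tab': 9, 'newline': 10}

# The mapping runs in both directions.
print(chr(65))                    # A
```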


ISO Latin 1

8-bit code
First 128 values (0-127) same as ASCII
Values 128-255 used for other characters (the printable ones occupy 160-255)

¡ ¢ £ ¤ ¥ ¦ § ¨ © ª « ¬ ® ¯ ° ± ² ³ ´ µ ¶ · ¸ ¹ º » ¼ ½ ¾ ¿ À Á Â Ã Ä Å Æ Ç È É Ê Ë Ì Í Î Ï Ð Ñ Ò Ó Ô Õ Ö × Ø Ù Ú Û Ü Ý Þ ß à á â ã ä å æ ç è é ê ë ì í î ï ð ñ ò ó ô õ ö ÷ ø ù ú û ü ý þ ÿ
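A short sketch of what "8-bit code" means in practice, using Python's built-in `latin-1` codec: every character becomes exactly one byte, the ASCII part is unchanged, and characters outside the set cannot be encoded. The helper name `fits_latin1` is hypothetical.

```python
text = "café"
latin1 = text.encode("latin-1")
hex_bytes = " ".join(f"{b:02x}" for b in latin1)
print(hex_bytes)       # 63 61 66 e9
print(len(latin1))     # 4: one byte per character

# The ASCII letters keep their ASCII codes.
same_as_ascii = "caf".encode("ascii") == "caf".encode("latin-1")
print(same_as_ascii)   # True

def fits_latin1(s: str) -> bool:
    """Return True if every character of s exists in ISO Latin 1."""
    try:
        s.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False

print(fits_latin1("ÿ"))  # True
print(fits_latin1("π"))  # False: Greek pi is not in Latin 1
```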


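Unicode

Multilingual character encoding standard encompassing all of the world's written languages.
Characters were originally coded using 16-bit strings; code points now run up to 10FFFF (hexadecimal), so some characters need more than 16 bits.
About 40,000 characters were represented in early versions; current versions define well over 100,000.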


UTF-8

Unicode characters occur with different frequencies
    Spaces are common
    Arabic characters are relatively uncommon
Represent common Unicode characters (the ASCII range) using one byte
Represent other characters using 2, 3, or 4 bytes
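UTF-8 uses between one and four bytes per character, with the length determined by the size of the code point. The hand-written encoder below is a sketch of that bit layout for illustration only; in real code the built-in `str.encode("utf-8")` should be used. The function name `utf8_encode` is hypothetical.

```python
def utf8_encode(code_point: int) -> bytes:
    """Encode one Unicode code point as UTF-8 (illustrative sketch)."""
    if not 0 <= code_point <= 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("not an encodable Unicode code point")
    if code_point < 0x80:          # 7 bits: 0xxxxxxx
        return bytes([code_point])
    if code_point < 0x800:         # 11 bits: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (code_point >> 6),
                      0x80 | (code_point & 0x3F)])
    if code_point < 0x10000:       # 16 bits: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (code_point >> 12),
                      0x80 | ((code_point >> 6) & 0x3F),
                      0x80 | (code_point & 0x3F)])
    # 21 bits: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (code_point >> 18),
                  0x80 | ((code_point >> 12) & 0x3F),
                  0x80 | ((code_point >> 6) & 0x3F),
                  0x80 | (code_point & 0x3F)])

for cp in (0x20, 0x3C0, 0x915, 0x10FFFD):
    encoded = utf8_encode(cp)
    # Compare against Python's own encoder.
    print(hex(cp), len(encoded), encoded == chr(cp).encode("utf-8"))
```

A space (20 hexadecimal) takes one byte, while the last code point in the loop takes four.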


UTF-8

Preserves ASCII characters

Code units in hexadecimal, leading zeros omitted:

Character          Glyph   UTF-8         UTF-16      UTF-32
small a            a       61            61          61
Greek pi           π       CF 80         3C0         3C0
Hindi ka           क       E0 A4 95      915         915
sup. private use           F4 8F BF BD   DBFF DFFD   10FFFD
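The table values can be reproduced with Python's built-in codecs. This is a sketch: the helper names are hypothetical, the big-endian codec variants are chosen so that no byte order mark is emitted, and the output keeps leading zeros, so pi prints as 03C0 in UTF-16.

```python
# Sample characters, written as escapes so the code points are explicit.
samples = {
    "small a": "\u0061",
    "Greek pi": "\u03c0",
    "Hindi ka": "\u0915",
    "supplementary private use": "\U0010FFFD",
}

def hex_units(data: bytes, unit_size: int) -> str:
    """Group bytes into code units of `unit_size` bytes and show them in hex."""
    units = [data[i:i + unit_size] for i in range(0, len(data), unit_size)]
    return " ".join(unit.hex().upper() for unit in units)

def encodings(ch: str) -> tuple[str, str, str]:
    """Return the (UTF-8, UTF-16, UTF-32) code units of one character."""
    return (
        hex_units(ch.encode("utf-8"), 1),
        hex_units(ch.encode("utf-16-be"), 2),
        hex_units(ch.encode("utf-32-be"), 4),
    )

for name, ch in samples.items():
    print(name, *encodings(ch), sep=" | ")
```

The last row needs two 16-bit units in UTF-16 (a surrogate pair) because its code point is larger than FFFF.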


Who Cares?

Different software uses different default character sets.
Need to specify a character set if you want to ensure that characters display correctly.
Windows uses CP-1252 by default
Files can contain character set information
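A sketch of what goes wrong when the writer and the reader assume different character sets: text saved as UTF-8 and then read as CP-1252 shows garbage in place of the accented letter. The sample word is an arbitrary choice.

```python
original = "café"

# One program writes the text as UTF-8 ...
utf8_bytes = original.encode("utf-8")

# ... and another reads the same bytes assuming CP-1252.
misread = utf8_bytes.decode("cp1252")
print(misread)  # cafÃ©

# Nothing is lost as long as the bytes are reinterpreted with the right set.
recovered = misread.encode("cp1252").decode("utf-8")
print(recovered == original)  # True
```

The letter é is two bytes in UTF-8 (C3 A9), and CP-1252 displays each of those bytes as a separate character.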


Browsers

<meta http-equiv="Content-Type" content="text/html; charset=windows-1252">

From the HTTP 1.1 specification:

When no explicit charset parameter is provided by the sender, media subtypes of the text type are defined to have a default charset value of ISO-8859-1 when received via HTTP. Data in character sets other than ISO-8859-1 or its subsets MUST be labelled with an appropriate charset value.
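The rule quoted above can be sketched as a small lookup: read the charset parameter from a Content-Type value and fall back to ISO-8859-1 when none is given. The sketch uses the standard library's `email.message.Message` to parse the header value; the function name `charset_of` and the constant name are hypothetical, and the fallback simply mirrors the quoted passage.

```python
from email.message import Message

# Default taken from the HTTP 1.1 passage quoted above.
HTTP_DEFAULT_CHARSET = "iso-8859-1"

def charset_of(content_type: str) -> str:
    """Return the charset named in a Content-Type value, lowercased.

    Falls back to HTTP_DEFAULT_CHARSET when no charset parameter is present.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset(failobj=HTTP_DEFAULT_CHARSET)

print(charset_of("text/html; charset=windows-1252"))  # windows-1252
print(charset_of("text/html"))                        # iso-8859-1
```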


UTF-8 Again

Many people are recommending UTF-8, since it is compact but can still represent lots of characters.
Client support will be spotty for years.
To test a client, go to:

http://www.w3.org/2001/06/utf-8-test/UTF-8-demo.html


Boolean

True/false
Can use one bit in theory
    But in practice computers fetch at least a byte at a time from memory, so a Boolean usually occupies a whole byte or more
In loosely typed languages, sometimes 0 is interpreted as false and anything else as true
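The sketch below touches each point on this slide: the storage a C-style Boolean takes on the current machine (queried through the standard `ctypes` module), the one-bit-per-flag packing that is possible in theory, and the "0 is false, anything else is true" convention. The helper name `pack_flags` is hypothetical.

```python
import ctypes

# Size in bytes of a C Boolean on this platform (at least one whole byte).
bool_size = ctypes.sizeof(ctypes.c_bool)
print(bool_size)

def pack_flags(flags: list[bool]) -> int:
    """Pack up to 8 Booleans into one byte, one bit per flag (bit 0 first)."""
    if len(flags) > 8:
        raise ValueError("at most 8 flags fit in one byte")
    byte = 0
    for position, flag in enumerate(flags):
        if flag:
            byte |= 1 << position
    return byte

print(pack_flags([True, False, True]))  # 5 (binary 101)

# Zero is false; any other number is true.
truthy = [bool(n) for n in (0, 1, 7, -1)]
print(truthy)  # [False, True, True, True]
```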


Memory Addresses

Represents an address in memory
A variable that holds an address is often called a pointer
Number of bytes needed for an address depends on how many address bits the CPU has (address space)
    Z80: 64K address space, 16-bit pointers
    Intel 8086: 1M address space, 20-bit pointers
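The relationship between address bits and address space is a power of two, which the sketch below checks for the two CPUs named on the slide. It also reports the pointer width of the machine running the code through the standard `ctypes` module; that value depends on the platform, so no expected output is shown for it. The function name is hypothetical.

```python
import ctypes

def address_space_bytes(address_bits: int) -> int:
    """Number of distinct byte addresses reachable with the given address width."""
    return 2 ** address_bits

print(address_space_bytes(16))  # 65536, the Z80's 64K
print(address_space_bytes(20))  # 1048576, the 8086's 1M

# Pointer width, in bits, on the machine running this code.
pointer_bits = ctypes.sizeof(ctypes.c_void_p) * 8
print(pointer_bits)
```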
