http://proglit.com


Upload: vienna

Post on 24-Feb-2016


TRANSCRIPT

Page 1: http:// proglit.com

http://proglit.com/

Page 2: http:// proglit.com

bits and text

Page 3: http:// proglit.com

BY

SA

Page 4: http:// proglit.com
Page 5: http:// proglit.com
Page 6: http:// proglit.com
Page 7: http:// proglit.com

byte (the size of a cell of addressable memory)

8 bits on all modern systems

octet = 8 bits

Page 8: http:// proglit.com

kilobyte

1,000 (10^3) bytes or 1,024 (2^10) bytes

Page 9: http:// proglit.com

megabyte

1,000,000 (10^6) bytes or 1,048,576 (2^20) bytes

Page 10: http:// proglit.com

gigabyte

1,000,000,000 (10^9) bytes or 1,073,741,824 (2^30) bytes

Page 11: http:// proglit.com

terabyte (10^12 bytes or 2^40 bytes)

petabyte (10^15 bytes or 2^50 bytes)

exabyte (10^18 bytes or 2^60 bytes)

zettabyte (10^21 bytes or 2^70 bytes)

Page 12: http:// proglit.com

kibibyte (2^10 bytes)

mebibyte (2^20 bytes)

gibibyte (2^30 bytes)
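
To see how far the decimal and binary readings drift apart, here is a small Python sketch. The table names `SI` and `IEC` and the helper `gap_percent` are made up for this illustration; the tebibyte entry (2^40 bytes) is added only to pair with the terabyte.

```python
# Decimal (SI) prefixes versus binary (IEC) prefixes, both measured in bytes.
SI = {"kilobyte": 10**3, "megabyte": 10**6, "gigabyte": 10**9, "terabyte": 10**12}
IEC = {"kibibyte": 2**10, "mebibyte": 2**20, "gibibyte": 2**30, "tebibyte": 2**40}


def gap_percent(decimal_size, binary_size):
    """Return how much larger the binary unit is than the decimal one, in percent."""
    return (binary_size - decimal_size) / decimal_size * 100


# Walk both tables in parallel (dicts keep insertion order).
for (si_name, si_size), (iec_name, iec_size) in zip(SI.items(), IEC.items()):
    gap = gap_percent(si_size, iec_size)
    print(f"{si_name}: {si_size:,} bytes, {iec_name}: {iec_size:,} bytes, gap {gap:.1f}%")
```

The gap grows with each prefix, which is why an ambiguous "gigabyte" matters more than an ambiguous "kilobyte".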

Page 13: http:// proglit.com

kilobit (10^3 bits or 2^10 bits)

megabit (10^6 bits or 2^20 bits)

gigabit (10^9 bits or 2^30 bits)

etc…

Page 14: http:// proglit.com

kilobit (kb)

kilobyte (kB)
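
The lowercase b and uppercase B differ by a factor of 8. A minimal Python sketch of the conversion follows; the function name `megabits_to_megabytes` is hypothetical, and the same decimal prefix is assumed on both sides.

```python
BITS_PER_BYTE = 8  # a byte is 8 bits on all modern systems


def megabits_to_megabytes(megabits):
    """Convert a quantity in megabits (Mb) to megabytes (MB)."""
    return megabits / BITS_PER_BYTE


# A link advertised as 100 megabits per second moves 12.5 megabytes per second.
print(megabits_to_megabytes(100))  # 12.5
```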

Page 15: http:// proglit.com

?

Page 16: http:// proglit.com

“banana”

b a n a n a

2 1 14 1 14 1
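
The numbering above (a = 1, b = 2, ... n = 14) can be sketched in Python as follows. The helper names `letter_numbers` and `letters_from` are invented for this example, and it only handles the letters a to z.

```python
def letter_numbers(word):
    """Map each letter to its position in the alphabet (a = 1, b = 2, ...)."""
    return [ord(ch) - ord("a") + 1 for ch in word.lower()]


def letters_from(numbers):
    """Reverse the mapping: turn alphabet positions back into letters."""
    return "".join(chr(n - 1 + ord("a")) for n in numbers)


print(letter_numbers("banana"))            # [2, 1, 14, 1, 14, 1]
print(letters_from([2, 1, 14, 1, 14, 1]))  # banana
```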

Page 17: http:// proglit.com

2 1 14 1 14 1


b a n a n a

Page 18: http:// proglit.com

“banana”

b a n a n a

52 97 4 97 4 97

Page 19: http:// proglit.com

character set (a mapping of characters to numbers)

ASCII (American Standard Code for Information Interchange)

128 characters
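
Python's built-in `ord` and `chr` expose this kind of mapping directly. A short sketch, where the helper names `to_codes` and `from_codes` are assumptions made for illustration:

```python
def to_codes(text):
    """Return the number assigned to each character of text."""
    return [ord(ch) for ch in text]


def from_codes(codes):
    """Rebuild the text from its character numbers."""
    return "".join(chr(n) for n in codes)


codes = to_codes("banana")
print(codes)                        # [98, 97, 110, 97, 110, 97]
print(from_codes(codes))            # banana
print(all(n < 128 for n in codes))  # True: every value fits in ASCII's 128 slots
```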

Page 20: http:// proglit.com

whitespace character (a character representing spacing)

Page 21: http:// proglit.com

“A banana”

A b a n a n a

65 32 98 97 110 97 110 97

Page 22: http:// proglit.com

whitespace character (a character representing spacing)

space, tab, linefeed, carriage return

Page 23: http:// proglit.com

control character (signals the reader of the text to take an action)

• LF (line feed)
• CR (carriage return)
• FF (form feed)
• BEL (bell)
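
As a rough check of which characters count as control characters, the sketch below looks up their ASCII numbers and asks Python's `unicodedata` module for the general category (`Cc` means control). The table `CONTROL` and the helper `is_control` are names made up for this example.

```python
import unicodedata

# The four control characters listed above, written as Python escape sequences.
CONTROL = {"BEL": "\a", "LF": "\n", "FF": "\f", "CR": "\r"}


def is_control(ch):
    """True if Unicode classifies ch as a control character (category Cc)."""
    return unicodedata.category(ch) == "Cc"


for name, ch in CONTROL.items():
    print(f"{name}: {ord(ch)} control={is_control(ch)}")

# Tab is both whitespace and a control character; space is whitespace only.
print(is_control("\t"))  # True
print(is_control(" "))   # False
```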

Page 24: http:// proglit.com

plain text (no formatting, only characters)

• no italics, underline, or bold
• no fonts, font sizes, or colors
• no margins, columns, or page breaks
• etc.

Page 25: http:// proglit.com

character (a unit of written language and notation)

glyph (an actual visual representation of a character)

j j

Page 26: http:// proglit.com

character encoding (scheme for representing characters as bits)

ASCII = 1 byte per character

c a t

99 97 116

0x63 0x61 0x74
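
The one-byte-per-character rule can be checked with Python's built-in `ascii` codec. This is a minimal sketch; the helper name `ascii_bytes` is hypothetical.

```python
def ascii_bytes(text):
    """Encode text as ASCII: exactly one byte per character."""
    return text.encode("ascii")


data = ascii_bytes("cat")
print(list(data))     # [99, 97, 116]
print(data.hex(" "))  # 63 61 74
print(len(data))      # 3 bytes for 3 characters

# Characters outside ASCII's 128 values cannot be encoded this way.
try:
    ascii_bytes("café")
except UnicodeEncodeError as err:
    print("not ASCII:", err.reason)
```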

Page 27: http:// proglit.com

Unicode (the world standard character set and its encodings)

U+0000 to U+10FFFF

Page 28: http:// proglit.com

U+0000 – U+FFFF plane 0, BMP (Basic Multilingual Plane)
U+10000 – U+1FFFF plane 1, SMP (Supplementary Multilingual Plane)
U+20000 – U+2FFFF plane 2, SIP (Supplementary Ideographic Plane)
U+30000 – U+DFFFF planes 3 to 13 currently unassigned
U+E0000 – U+EFFFF plane 14, SSP (Supplementary Special-purpose Plane)
U+F0000 – U+FFFFF plane 15, PUA (Private Use Area)
U+100000 – U+10FFFF plane 16, PUA (Private Use Area)
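
Each plane spans 0x10000 code points, so the plane number is simply the code point with its low 16 bits shifted away. A small Python sketch, with `plane` as an invented helper name:

```python
def plane(code_point):
    """Return the Unicode plane (0 to 16) that contains code_point."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("outside the Unicode range U+0000 to U+10FFFF")
    return code_point >> 16  # drop the 16 bits that index within the plane


print(plane(0x0041))    # 0  (BMP)
print(plane(0x1F600))   # 1  (SMP)
print(plane(0x20000))   # 2  (SIP)
print(plane(0x10FFFF))  # 16 (last private use plane)
```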

Page 29: http:// proglit.com

UTF-32 (4 bytes per character)

U+3FF01   0000_0000 0000_0011 1111_1111 0000_0001   00 03 FF 01

U+40077   0000_0000 0000_0100 0000_0000 0111_0111   00 04 00 77

U+0065   0000_0000 0000_0000 0000_0000 0110_0101   00 00 00 65
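
UTF-32 stores the code point itself as a 4-byte integer. The sketch below assumes big-endian byte order, matching the byte order shown above; `utf32_be` is a name chosen for this example.

```python
def utf32_be(code_point):
    """Encode one code point as 4 big-endian bytes (UTF-32BE)."""
    if not 0 <= code_point <= 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("not an encodable Unicode code point")
    return code_point.to_bytes(4, "big")


for cp in (0x3FF01, 0x40077, 0x0065):
    print(f"U+{cp:04X}", utf32_be(cp).hex(" ").upper())

# Python's own codec produces the same bytes.
print(utf32_be(0x40077) == chr(0x40077).encode("utf-32-be"))  # True
```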

Page 30: http:// proglit.com

UTF-16 (2 or 4 bytes per character)

U+0065   0000_0000 0110_0101   00 65

U+F10F   1111_0001 0000_1111   F1 0F

Page 31: http:// proglit.com

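UTF-16 (2 or 4 bytes per character)

1101_10xx xxxx_xxxx 1101_11xx xxxx_xxxx

1101_10 and 1101_11 are fixed; the first 4 x bits hold the plane number minus 1, and the remaining 16 x bits hold the character's position within the plane

U+3F010   1101_1000 1011_1100 1101_1100 0001_0000

U+10FF00   1101_1011 1111_1111 1101_1111 0000_0000

U+17711   1101_1000 0001_1101 1101_1111 0001_0001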

Page 32: http:// proglit.com

UTF-16 (2 or 4 bytes per character)

U+3F010   1101_1000 1011_1100 1101_1100 0001_0000   D8 BC DC 10

U+10FF00   1101_1011 1111_1111 1101_1111 0000_0000   DB FF DF 00

U+17711   1101_1000 0001_1101 1101_1111 0001_0001   D8 1D DF 11

surrogates: U+D800 to U+DFFF
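
The 4-byte form can be computed by subtracting 0x10000 and splitting the remaining 20 bits into two 10-bit halves. A minimal Python sketch, where `utf16_surrogates` is a hypothetical helper name and big-endian byte order is assumed for the comparison with Python's codec:

```python
def utf16_surrogates(code_point):
    """Split a code point above U+FFFF into its (high, low) surrogate pair."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points U+10000 to U+10FFFF need surrogates")
    offset = code_point - 0x10000      # leaves a 20-bit value
    high = 0xD800 | (offset >> 10)     # 1101_10 prefix + top 10 bits
    low = 0xDC00 | (offset & 0x3FF)    # 1101_11 prefix + bottom 10 bits
    return high, low


for cp in (0x3F010, 0x10FF00, 0x17711):
    high, low = utf16_surrogates(cp)
    print(f"U+{cp:04X} {high:04X} {low:04X}")

# Cross-check against Python's built-in encoder.
high, low = utf16_surrogates(0x3F010)
expected = high.to_bytes(2, "big") + low.to_bytes(2, "big")
print(expected == chr(0x3F010).encode("utf-16-be"))  # True
```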

Page 33: http:// proglit.com

UTF-8 (1 to 4 bytes per character)

U+0000 – U+007F: 0xxx_xxxx

U+0080 – U+07FF: 110x_xxxx 10xx_xxxx

U+0800 – U+FFFF: 1110_xxxx 10xx_xxxx 10xx_xxxx

U+10000 – U+10FFFF: 1111_0xxx 10xx_xxxx 10xx_xxxx 10xx_xxxx

Page 34: http:// proglit.com

UTF-8 (1 to 4 bytes per character)

U+0031: 0011_0001

U+0700: 1101_1100 1000_0000

U+86FF: 1110_1000 1001_1011 1011_1111

U+50000: 1111_0001 1001_0000 1000_0000 1000_0000
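
The bit patterns for each range can be filled in by hand with shifts and masks. The sketch below is one way to do it in Python, not the only one; `utf8_encode` and `as_bits` are names invented for this example.

```python
def utf8_encode(cp):
    """Encode one code point as UTF-8 using the shortest pattern that fits."""
    if not 0 <= cp <= 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not an encodable Unicode code point")
    if cp <= 0x7F:      # 0xxx_xxxx
        return bytes([cp])
    if cp <= 0x7FF:     # 110x_xxxx 10xx_xxxx
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp <= 0xFFFF:    # 1110_xxxx 10xx_xxxx 10xx_xxxx
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    # 1111_0xxx 10xx_xxxx 10xx_xxxx 10xx_xxxx
    return bytes([0xF0 | (cp >> 18),
                  0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])


def as_bits(data):
    """Format bytes in the slide's nibble notation, e.g. 1101_1100."""
    return " ".join(f"{b >> 4:04b}_{b & 0xF:04b}" for b in data)


for cp in (0x0031, 0x0700, 0x86FF, 0x50000):
    print(f"U+{cp:04X}: {as_bits(utf8_encode(cp))}")
```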

Page 35: http:// proglit.com

UTF-8 (1 to 4 bytes per character)

U+0031: (valid) 0011_0001

U+0031: (invalid, overlong: not the shortest form) 1111_0000 1000_0000 1000_0000 1011_0001
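
A conforming decoder rejects the longer form. Python's built-in UTF-8 codec behaves this way, as the sketch below shows; the helper name `is_valid_utf8` is an assumption made for illustration.

```python
def is_valid_utf8(data):
    """True if data decodes as well-formed UTF-8 under Python's strict decoder."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


shortest = bytes([0b0011_0001])                                        # U+0031 in 1 byte
overlong = bytes([0b1111_0000, 0b1000_0000, 0b1000_0000, 0b1011_0001])  # same value padded to 4 bytes

print(is_valid_utf8(shortest))  # True
print(is_valid_utf8(overlong))  # False
```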

Page 36: http:// proglit.com

text editor (a program for creating and editing text files)

• notepad
• vi/vim
• emacs

Page 37: http:// proglit.com

http://proglit.com/