bits and text (http://proglit.com/)

Post on 18-Dec-2015

TRANSCRIPT

  • Slide 1
  • http://proglit.com/
  • Slide 2
  • bits and text
  • Slide 3
  • BY SA
  • Slide 4
  • Slide 5
  • Slide 6
  • Slide 7
  • byte (the size of a cell of addressable memory): 8 bits on all modern systems; octet = 8 bits
  • Slide 8
  • kilobyte: 1,000 (10^3) bytes or 1,024 (2^10) bytes
  • Slide 9
  • megabyte: 1,000,000 (10^6) bytes or 1,048,576 (2^20) bytes
  • Slide 10
  • gigabyte: 1,000,000,000 (10^9) bytes or 1,073,741,824 (2^30) bytes
  • Slide 11
  • terabyte (10^12 bytes or 2^40 bytes); petabyte (10^15 bytes or 2^50 bytes); exabyte (10^18 bytes or 2^60 bytes); zettabyte (10^21 bytes or 2^70 bytes)
  • Slide 12
  • kibibyte (2^10 bytes); mebibyte (2^20 bytes); gibibyte (2^30 bytes)
  • Slide 13
  • kilobit (10^3 bits or 2^10 bits); megabit (10^6 bits or 2^20 bits); gigabit (10^9 bits or 2^30 bits); etc.
  • Slide 14
  • kilobit (kb); kilobyte (kB)
  • Slide 15
  • ?
  • Slide 16
  • banana: b a n a n a → 2 1 14 1 14 1
  • Slide 17
  • b a n a n a
  • Slide 18
  • banana: b a n a n a → 52 97 4 97 4 97
  • Slide 19
  • character set (a mapping of characters to numbers); ASCII (American Standard Code for Information Interchange): 128 characters
  • Slide 20
  • whitespace character (a character representing spacing)
  • Slide 21
  • "A banana" in ASCII: A (space) b a n a n a → 65 32 98 97 110 97 110 97
  • Slide 22
  • whitespace character (a character representing spacing): space, tab, line feed, carriage return
  • Slide 23
  • control character (signals an action response to the reader): LF (line feed), CR (carriage return), FF (form feed), BEL (bell)
  • Slide 24
  • plain text (no formatting, only characters): no italics, underline, or bold; no fonts, font sizes, or colors; no margins, columns, or page breaks; etc.
  • Slide 25
  • character (a unit of written language and notation); glyph (an actual visual representation of a character), e.g. j
  • Slide 26
  • character encoding (scheme for representing characters as bits); ASCII = 1 byte per character: c a t → 99 97 116 → 0x63 0x61 0x74
  • Slide 27
  • Unicode (the world standard character set and its encodings): code points U+0000 to U+10FFFF
  • Slide 28
  • U+0000 to U+FFFF: plane 0, BMP (Basic Multilingual Plane); U+10000 to U+1FFFF: plane 1, SMP (Supplementary Multilingual Plane); U+20000 to U+2FFFF: plane 2, SIP (Supplementary Ideographic Plane); U+30000 to U+3FFFF: plane 3, TIP (Tertiary Ideographic Plane); U+40000 to U+DFFFF: planes 4 to 13, currently unassigned; U+E0000 to U+EFFFF: plane 14, SSP (Supplementary Special-purpose Plane); U+F0000 to U+FFFFF: plane 15, PUA (Private Use Area); U+100000 to U+10FFFF: plane 16, PUA (Private Use Area)
  • Slide 29
  • UTF-32 (4 bytes per character): U+3FF01 = 0000_0000 0000_0011 1111_1111 0000_0001 = 00 03 FF 01; U+40077 = 0000_0000 0000_0100 0000_0000 0111_0111 = 00 04 00 77; U+0065 = 0000_0000 0000_0000 0000_0000 0110_0101 = 00 00 00 65
  • Slide 30
  • UTF-16 (2 or 4 bytes per character): U+0065 = 0000_0000 0110_0101 = 00 65; U+F10F = 1111_0001 0000_1111 = F1 0F
  • Slide 31
  • UTF-16 (2 or 4 bytes per character), 4-byte form: 1101_10xx xxxx_xxxx 1101_11xx xxxx_xxxx, where 1101_10 and 1101_11 are fixed and the 20 x bits hold the code point minus 0x10000 (4 plane bits, then 16 character bits). U+3F010 = 1101_1000 1011_1100 1101_1100 0001_0000; U+10FF00 = 1101_1011 1111_1111 1101_1111 0000_0000; U+17711 = 1101_1000 0001_1101 1101_1111 0001_0001
  • Slide 32
  • UTF-16 (2 or 4 bytes per character): U+3F010 = 1101_1000 1011_1100 1101_1100 0001_0000 = D8 BC DC 10; U+10FF00 = 1101_1011 1111_1111 1101_1111 0000_0000 = DB FF DF 00; U+17711 = 1101_1000 0001_1101 1101_1111 0001_0001 = D8 1D DF 11; surrogates: U+D800 to U+DFFF
  • Slide 33
  • UTF-8 (1 to 4 bytes per character): U+0000 to U+007F: 0xxx_xxxx; U+0080 to U+07FF: 110x_xxxx 10xx_xxxx; U+0800 to U+FFFF: 1110_xxxx 10xx_xxxx 10xx_xxxx; U+10000 to U+10FFFF: 1111_0xxx 10xx_xxxx 10xx_xxxx 10xx_xxxx
  • Slide 34
  • UTF-8 (1 to 4 bytes per character): U+0031: 0011_0001; U+0700: 1101_1100 1000_0000; U+86FF: 1110_1000 1001_1011 1011_1111; U+50000: 1111_0001 1001_0000 1000_0000 1000_0000
  • Slide 35
  • UTF-8 (1 to 4 bytes per character): U+0031 (valid): 0011_0001; U+0031 (invalid, because it is longer than the shortest possible form): 1111_0000 1000_0000 1000_0000 1011_0001
  • Slide 36
  • text editor (a program for creating and editing text files): notepad, vi/vim, emacs
  • Slide 37
  • http://proglit.com/
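
The bit layouts on the UTF-32, UTF-16, and UTF-8 slides can be checked by hand-encoding the same code points. The following is a minimal sketch in Python, not part of the original slides; the function names (`utf32_be`, `utf16_be`, `utf8`, `check_code_point`) are made up for illustration, and big-endian byte order is assumed for UTF-16 and UTF-32 because that is how the slides write the bytes.

```python
def check_code_point(cp: int) -> None:
    """Reject values outside U+0000..U+10FFFF and the surrogate range."""
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError("code point out of range")
    if 0xD800 <= cp <= 0xDFFF:
        raise ValueError("surrogates are not characters")


def utf32_be(cp: int) -> bytes:
    """UTF-32: the code point itself, stored in 4 bytes."""
    check_code_point(cp)
    return cp.to_bytes(4, "big")


def utf16_be(cp: int) -> bytes:
    """UTF-16: 2 bytes for plane 0, otherwise a 4-byte surrogate pair."""
    check_code_point(cp)
    if cp < 0x10000:
        return cp.to_bytes(2, "big")
    v = cp - 0x10000                  # 20 bits: 4 plane bits + 16 character bits
    high = 0xD800 | (v >> 10)         # 1101_10xx xxxx_xxxx (top 10 bits)
    low = 0xDC00 | (v & 0x3FF)        # 1101_11xx xxxx_xxxx (bottom 10 bits)
    return high.to_bytes(2, "big") + low.to_bytes(2, "big")


def utf8(cp: int) -> bytes:
    """UTF-8: always the shortest of the four layouts that fits."""
    check_code_point(cp)
    if cp < 0x80:                     # 0xxx_xxxx
        return bytes([cp])
    if cp < 0x800:                    # 110x_xxxx 10xx_xxxx
        return bytes([0xC0 | (cp >> 6),
                      0x80 | (cp & 0x3F)])
    if cp < 0x10000:                  # 1110_xxxx 10xx_xxxx 10xx_xxxx
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    return bytes([0xF0 | (cp >> 18),  # 1111_0xxx 10xx_xxxx 10xx_xxxx 10xx_xxxx
                  0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])


if __name__ == "__main__":
    # Code points taken from the slides; output is shown as hex bytes.
    for cp in (0x0031, 0x0700, 0x86FF, 0x50000):
        print(f"U+{cp:04X} UTF-8 : {utf8(cp).hex(' ').upper()}")
    for cp in (0x0065, 0xF10F, 0x3F010, 0x10FF00, 0x17711):
        print(f"U+{cp:04X} UTF-16: {utf16_be(cp).hex(' ').upper()}")
    for cp in (0x3FF01, 0x40077, 0x0065):
        print(f"U+{cp:04X} UTF-32: {utf32_be(cp).hex(' ').upper()}")
```

Each result can also be compared against Python's built-in codecs, for example `chr(0x3F010).encode("utf-16-be")`. The built-in UTF-8 decoder likewise rejects the four-byte form of U+0031 from slide 35, since only the shortest encoding of a code point is valid.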