Chapter 8_1
Bits and the "Why" of Bytes:Representing Information Digitally
Copyright © 2006 Pearson Addison-Wesley. All rights reserved. 8-2
Copyright © 2006 Pearson Addison-Wesley. All rights reserved. 8-3
Digitizing Discrete Information
• Digitize: Represent information with digits (numerals 0 through 9)
• Limitation of Digits– Alternative Representation: Any set of symbols
could represent phone number digits, as long as the keypad is labeled accordingly
• Symbols, Briefly– Digits have the advantage of having short names
(easy to say)• But computer professionals are shortening symbol
names (exclamation point is pronounced "bang")
Copyright © 2006 Pearson Addison-Wesley. All rights reserved. 8-4
Ordering Symbols
• Advantage of digits for encoding info is that items can be listed in numerical order
• To use other symbols, we need an ordering system (collating sequence)
– Agreed order from smallest to largest value
• In choosing symbols for encoding, consider how symbols interact with things being encoded
Copyright © 2006 Pearson Addison-Wesley. All rights reserved. 8-5
Encoding with Dice
• Consider a representation based on dice
– Single die has six sides, and the patterns on the sides can be used for digital representation
Copyright © 2006 Pearson Addison-Wesley. All rights reserved. 8-6
Encoding with Dice (cont'd)
• Consider representing the Roman alphabet with dice
– 26 letter, only 6 patterns on a die
– We use multiple patterns to represent each letter
– How many patterns are required?• 2 dice can produce 36 pattern sequences (6*6)
• 3 dice can produce 216 (6*6*6)
• n dice can produce 6n sequences
Copyright © 2006 Pearson Addison-Wesley. All rights reserved. 8-7
Encoding with Dice (cont'd)
Copyright © 2006 Pearson Addison-Wesley. All rights reserved. 8-8
Encoding with Dice (cont'd)
• Call each pattern produced by pairing two dice a symbol:– 1 and 1 == A– 1 and 2 == B– 1 and 3 == C– 1 and 4 == D– 1 and 5 == E– 1 and 6 == F– 2 and 1 == G– etc.
Copyright © 2006 Pearson Addison-Wesley. All rights reserved. 8-9
Encoding with Dice (cont'd)
Copyright © 2006 Pearson Addison-Wesley. All rights reserved. 8-10
Extending the Encoding
• 26 letters of the Roman alphabet are represented; 10 spaces are blank
• These spaces can be used for the Arabic numerals
• What is we need punctuation? We only have 36 available spaces in the two-dice system. How can we avoid going to a three-dice system?
Copyright © 2006 Pearson Addison-Wesley. All rights reserved. 8-11
The Escape: Creating More Symbols
• We can use the last two-dice symbol as "escape". It does not match any legal character, so it will never be needed for normal text digitization
• It indicates that the digitization is "escaping from the basic representation" and applying a secondary representation
• We could also use a "double escape" to enter a tertiary representation system
Copyright © 2006 Pearson Addison-Wesley. All rights reserved. 8-12
Copyright © 2006 Pearson Addison-Wesley. All rights reserved. 8-13
The Fundamental Representation of Information
• The fundamental patters used in IT come when the physical world meets the logical world
• The most fundamental form of information is the presence or absence of a physical phenomenon
• In the logical world, the concepts or true and false are important
– By associating true with the presence of a phenomenon and false with its absence, we use the physical world to implement the logical world, and produce information technology
Copyright © 2006 Pearson Addison-Wesley. All rights reserved. 8-14
The PandA Representation
• PandA is the mnemonic for "presence and absence"
• It is discrete (distinct or separable)—the phenomenon is present or it is not (true or false). There in no continuous gradation in between
Copyright © 2006 Pearson Addison-Wesley. All rights reserved. 8-15
A Binary System
• Two patterns—Present and Absent—make PandA a binary system
• We can give any names to these two patterns as long as we are consistent
• The PandA basic unit is known as a "bit" (short for binary digit)
Copyright © 2006 Pearson Addison-Wesley. All rights reserved. 8-16
Copyright © 2006 Pearson Addison-Wesley. All rights reserved. 8-17
Bits in Computer Memory
• Memory is arranged inside a computer in a very long sequence of bits (places where a phenomenon can be set and detected)
• Analogy: Sidewalk Memory
– Each sidewalk square represents a memory slot, and stones represent the presence or absence
– If a stone is on the square, the value is 1, if not the value is 0
Copyright © 2006 Pearson Addison-Wesley. All rights reserved. 8-18
Copyright © 2006 Pearson Addison-Wesley. All rights reserved. 8-19
Units of DataMeasures the size or volume of data
BIT: Single binary digit, either a 1 or a 0
BYTE: A group of eight bits, a character (Basic unit for measuring the size of the data)KILOBYTE: 210 or 1024 bytes (not 1000 used in scientific notation for the prefix kilo)MEGABYTE: 220 or 1,048,576 bytes (not 1000000 used in scientific notation for the prefix mega-)
WORD: the number of adjacent bits that are manipulated or stored as a unit in a particular computer system
Copyright © 2006 Pearson Addison-Wesley. All rights reserved. 8-20
Alternative PandA Encodings
• There are other ways to encode two states using physical phenomena
– Use stones on all squares, but black stones for one state and white for the other
– Use multiple stones of two colors per square, saying more black than white means 0 and more white than black means 1
– Stone in center for one state, off-center for the other
– etc.
Copyright © 2006 Pearson Addison-Wesley. All rights reserved. 8-21
Combining Bit Patterns
• Since we only have two patterns, we must combine them into sequences to create enough symbols to encode necessary information
• PandA has p=2 patterns, arranging them into n-length sequences, we can create 2n symbols
Copyright © 2006 Pearson Addison-Wesley. All rights reserved. 8-22
Copyright © 2006 Pearson Addison-Wesley. All rights reserved. 8-23
Hex Explained
• Recall in Chapter 4, we specified custom colors in HTML using hex digits– e.g., <font color ="#FF8E2A">
– Hex is short for hexadecimal, base 16
• Why use hex? Writing the sequence of bits is long, tedious, and error-prone
Copyright © 2006 Pearson Addison-Wesley. All rights reserved. 8-24
The 16 Hex Digits
• 0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F
• Sixteen digits (or hexits) can be represented perfectly by the 16 symbols of 4-bit sequences
• Changing hex digits to bits and back again:
– Given a sequence of bits, group them in 4's and write the corresponding hex digit
– Given hex, write the associated group of 4 digits
Copyright © 2006 Pearson Addison-Wesley. All rights reserved. 8-25
Copyright © 2006 Pearson Addison-Wesley. All rights reserved. 8-26
Copyright © 2006 Pearson Addison-Wesley. All rights reserved. 8-27