chapter 3 : data representation
Post on 09-Jan-2016
Chapter 3: Data Representation
Types of Data
Numbers
– 2324, -34.35, 34567890123.12345
Characters and symbols
– A, B, C, … Z, a, b, c, … z
– 0, 1, 2, 3 … 9, +, -, ), (, *, &, etc.
Images
– Photos, charts, drawings
Audio
– Sound, music, etc.
Video
– Video clips and movies
Instructions
– Computer instructions are coded in sequences of 0's and 1's
Binary Number System
Cheapest and simplest in design and engineering
Switch: on = 1; off = 0
Circuit: voltages
– 1.7 volts or higher = 1
– 0.0 volts to 1.3 volts = 0
– Voltages between 1.3 and 1.7 volts are avoided in design
Mathematics: binary numbers
– Using digits 0 and 1 only.
Decimal vs. Binary
Decimal # system
– 10 symbols: 1, 2, 3, … 9, 0
– Base = 10 (we have 10 fingers)
– Decimal number 2324 reads "2 thousands 3 hundreds twenty four".
Binary # system
– 2 symbols: 0 and 1
– Base = 2
– Binary number 1101 = ?
Decimal vs. Binary

Decimal # system: 2324
Digit:                    2        3       2      4
Position values (base):   10^3     10^2    10^1   10^0
Position values:          1000     100     10     1
Each digit represents:    2*1000   3*100   2*10   4*1
Value in decimal: 2*1000 + 3*100 + 2*10 + 4*1 = 2324D

Binary # system: 1101
Digit:                    1      1      0      1
Position values (base):   2^3    2^2    2^1    2^0
Position values:          8      4      2      1
Each digit represents:    1*8    1*4    0*2    1*1
Value in decimal: 1*8 + 1*4 + 0*2 + 1*1 = 13D
Storage Units
Binary digits: bits
8 bits = 1 byte
2^10 bytes = 1024 bytes = 1 kilobyte = 1KB
2^20 bytes = 2^10 KB = 1 megabyte = 1MB
2^30 bytes = 2^10 MB = 1 gigabyte = 1GB
2^40 bytes = 2^10 GB = 1 terabyte = 1TB
Representation of Numbers
Fixed-size-storage approach:
– Computers allocate a specified amount of space for a number
Integers
  1 bit: 0 to 1
  2 bits: 00, 01, 10, 11 = 0 to 3
  4 bits: 0000, 0001, 0010, … 1111 = 0 to 15
  1 byte: 0 to 255 (unsigned)
  2 bytes: -32768 to +32767 (signed)
  4 bytes: -2,147,483,648 to +2,147,483,647 (signed)
Note: with 4 bytes for integers, any number smaller than -2,147,483,648 or larger than 2,147,483,647 would be incorrectly represented.
Representation of Numbers
Binary representation of real numbers

Binary # system: 10.111
Digit:                    1      0      .   1        1        1
Position values (base):   2^1    2^0        2^-1     2^-2     2^-3
Position values:          2      1          1/2      1/4      1/8
Each digit represents:    1*2    0*1        1*0.5    1*0.25   1*0.125
Value in decimal: 2 + 1/2 + 1/4 + 1/8 = 2.875D
Representation of Numbers
Floating-point numbers for real numbers
– Three parts of representation:
  1. Sign (always 1 bit: 0 for + and 1 for -)
  2. Significant digits (e.g., six bits)
  3. The power of 2 for the leftmost digit (e.g., 3 bits)
– Example for binary -1111.01B
  Sign: 1 (negative)
  Significant digits: 111101B
  Power of 2: 011B
– Example for binary +100.1101B
  Sign: 0 (positive)
  Significant digits: 100110B
  Note: the last digit is lost, which is 1/16 in decimal
  Power of 2: 010B
Representation of Numbers
Single-precision floating-point numbers
  1. Sign (always 1 bit: 0 for + and 1 for -)
  2. Significant digits: 23 bits
  3. Exponent: 8 bits
Double-precision floating-point numbers
  1. Sign (always 1 bit: 0 for + and 1 for -)
  2. Significant digits: 52 bits
  3. Exponent: 11 bits
What you should know
– Computers can represent numbers only with limited accuracy.
  E.g., when you enter a 20-digit decimal # into a program that uses single precision, only about 7 digits are actually stored; the rest are lost.
– Real examples:
  Designing aircraft on p.35
  The Vancouver Stock Exchange Index on pp. 38-39
Representation of Numbers
// file: public_html/2005f-html/cil102/accuracy.c
#include <stdio.h>

int main() {
    int x, y, result;
    // x, y, and result all use 32 bits to represent integers
    // (-2,147,483,648 to +2,147,483,647)
    char op;
    int i;

    for (i = 0; i < 100; i++) {
        printf("please enter an expression:\n");
        // stop on malformed input or end of input
        if (scanf("%d %c %d", &x, &op, &y) != 3)
            break;

        if (op == '+')
            result = x + y;
        else if (op == '-')
            result = x - y;
        else {
            printf("Invalid operator!!\n");
            break;
        }
        printf("%d %c %d = %d\n", x, op, y, result);
    }
    return 0;
}
// When you enter 2000000000 + 500000000, the result is -1794967296
// (signed overflow is undefined in C; this is the typical wraparound result)
Representation of Numbers
Variable-size-storage approach:
– Allows a wide range of numbers to be stored accurately
– Needs significantly more time to process
– The fixed-size approach is more commonly used than the variable-size approach.
Representation of characters
There are no visual letters A, B, C, etc. stored in computers like we have in mind.
Letters and symbols are encoded in 8 bits (one byte) of 0's and 1's.
– The keyboard converts keys A, B, C, etc. to their corresponding codes, and
– the monitor converts the codes into visual letters A, B, C, etc. on screen.
Two commonly used coding schemes:
– ASCII: American Standard Code for Information Interchange
– EBCDIC: Extended Binary Coded Decimal Interchange Code
Representation of characters
Character    EBCDIC      ASCII
A            11000001    01000001
B            11000010    01000010
a            10000001    01100001
b            10000010    01100010
0            11110000    00110000
1            11110001    00110001
2            11110010    00110010
, (comma)    01101011    00101100
- (dash)     01100000    00101101
Representation of characters
Foreign characters: two approaches
– Use one byte per char
  E.g.,
  – ISO-8859-1 for Western (Roman)
  – ISO-8859-7 for Greek
  – ISO-2022-CN for simplified Chinese
  Webpage: use "META charset=…" to specify which encoding is used.
– Use two bytes per char/symbol
  16 bits have 65,536 combinations (characters)
  Unicode coding system
Representation of Images
A picture is treated as a matrix of dots, called pixels.
Representation of Images
The pixels are so small and close together that we cannot really see them as separate dots.
Resolution: dots per inch (dpi)
– 72 dpi for Web images
– 600 or 1200 dpi for professional printers or home photo printers
Representation of Images
The color of each pixel is represented using bits.
Black/White: one bit per pixel
– 1 = white and 0 = black
Gray scale: one byte per pixel
– 256 different degrees of gray (00000000 to 11111111)
– 00000000 black, 01111111 intermediate gray, 11111111 white
Color: three bytes per pixel
– Red, green, blue color
– One byte for the intensity of each of the three colors
– 256 possible red, 256 green, 256 blue
  Pure red: 11111111 for red byte, 00000000 for green and blue
  White: 11111111 for all three bytes
  Black: 00000000 for all three bytes
Representation of Images
Image storage: size
Gray scale: one byte per pixel
  E.g., a 3 x 5 inch picture with 300 dpi resolution
  3 * 300 = 900 pixels per column
  5 * 300 = 1500 pixels per row
  900 * 1500 = 1,350,000 pixels/picture
  Needed storage = 1,350,000 bytes/picture = about 1.3MB/picture
Color: three bytes per pixel
  E.g., a 3 x 5 inch picture with 300 dpi resolution
  3 * 300 = 900 pixels per column
  5 * 300 = 1500 pixels per row
  900 * 1500 = 1,350,000 pixels/picture
  Needed storage = 3 (bytes per pixel) * 1,350,000 = 4,050,000 bytes/picture = about 4MB/picture: TOO BIG
Representation of Images
Image compression
Color table
– Most pictures contain a small # of different colors
– Use a table to define colors that are actually used in the picture
– Each pixel has an index to the color table.
– Each image contains a color table and table indices
– Example
  For a picture with 100 different colors, the color table would contain 100 entries, one per color, with three bytes per entry. One byte can be used as the index to the table for each pixel.
Representation of Images
Drawing commands
– Draw a picture using basic commands
– Just as an artist draws using a pencil or a brush and other basic movements
– Example:
  A house is drawn by sketching various elements (doors, windows, walls), adding color to them, and moving them to the desired position.
Representation of Images
Data averaging or sampling
– Condense the size by selecting a smaller collection of information to store.
– Many different ways of sampling and data averaging
– An example: choose to store only every other pixel in an image (sampling), reducing the size to half. To display the full picture, the computer needs to fill in the missing data with, for example, the average of neighboring pixels (data averaging)
– The resulting picture cannot be as sharp as the original
– Lossy data compression
Image Formats
Commonly used image file formats - 1
– Bitmap (.bmp)
  Pixel-by-pixel storage of all color information for each pixel.
  Lossless representation
  Files are huge.
– Graphics Interchange Format (.gif)
  Uses one or more color tables (the color table technique)
  Each table contains 256 colors.
  Suitable for pictures with a small # (<256) of different colors (e.g., organization charts)
  Not suitable for pictures with shading (e.g., photos)
Image Formats
Commonly used image file formats - 2
– PostScript (.ps)
  Employs the drawing commands technique
  "moveto" sets the current position, "lineto" draws a line from the current position to a new one, and "arc" draws an arc given its center, radius, etc.
  General shapes can be used in multiple places
  Fonts can be reused.
  Useful when the picture can be rendered as a drawing or it contains many of the same elements (e.g., text in the same fonts)
– Joint Photographic Experts Group (JPEG) (.jpg)
  Uses the data averaging and sampling technique on 8*8 pixel blocks
  User determines the level of detail and clarity
  High-quality image: 8*8 blocks maintain their contents
  Low-quality image: info in 8*8 blocks is discarded, giving smaller files
Comparison between jpg, gif, and ps
Comparison of .jpg and .gif: http://www.siriusweb.com/tutorials/gifvsjpg/
More on .jpg and .gif: http://www.wfu.edu/~matthews/misc/jpg_vs_gif/JpgVsGif.html
Summary of Image Representations
Other commonly used formats
– TIFF: Tagged Image File Format
– PNG: Portable Network Graphics
– New formats will emerge
Understand the format and know the pros and cons
To learn: Google the format
Use programs (GIMP) to convert between formats
ADC and DAC
ADC: Analog to Digital Converter
Use 8 bits to represent voltage 0 to 5 volts
  Input = 5 volts, output = 1111 1111
  Input = 3 volts, output = 1001 1001 (153 = 3/5 * 255)
  Input = 0 volts, output = 0000 0000
[Figure: 5 volts into the ADC gives 1111 1111; 3 volts into the ADC gives 1001 1001]
ADC and DAC
DAC: Digital to Analog Converter
Use 8 bits to represent voltage 0 to 5 volts
  Input = 1111 1111, output = 5 volts
  Input = 1001 1001 (153), output = 3 volts (153/255 * 5)
  Input = 0000 0000, output = 0 volts
[Figure: 1111 1111 into the DAC gives 5 volts; 1001 1001 into the DAC gives 3 volts]
Analog Audio
[Figure: sound wave]
Digital Recording - 1
[Figure: digital recording at a low sample rate, and digital replaying]
Digital Recording - 2
[Figure: digital recording at a high sampling rate, and digital replaying]
Music CD
Sample rate: 44,100 samples/second
# of bits for height: 16 bits
# of channels: 2
Total bytes/sec:
  44,100 samples/s x 2 bytes/sample x 2 channels = 176,400 bytes/second
Total bytes on a 74-minute CD:
  176,400 bytes/sec * 74 minutes * 60 seconds/minute = 783,216,000 bytes (about 783 million bytes, or about 747MB at 2^20 bytes per MB)
MP3 Format
Compress the audio based on the following:
– People cannot hear sound at very low and very high frequencies
– When there are two sounds, people hear the louder one, not the softer one
– There are sounds humans hear better.
Lossy format
MP3 Quality
Bit rate: # of bits per second encoded in MP3
Bit rate range: 96 to 320 kbps (kilobits per second)
Quality
– 320 kbps: humans cannot tell the difference from the original music CD
– 120 kbps: like hearing music on the radio
– 160 kbps or higher: for a better experience
Music CD to MP3 Files
[Figure: music CD (finest quality), MP3 ripper, MP3 encoder or compressor, PC hard disk, data CD]
Listening to Music and MP3
[Figure: music CD (finest quality), data CD with MP3 files, music CD player, MP3 player]
Suggested Readings
1. How Analog and Digital Recording Works at http://electronics.howstuffworks.com/analog-digital.htm
2. How MP3 Files Work at http://computer.howstuffworks.com/mp31.htm
Summary: Chapter 3
Computers work in binary
Integers may be constrained in size
Real numbers may have limited accuracy
Computations may produce roundoff errors, affecting accuracy
Characters and languages are encoded in binary
Pictures are displayed pixel by pixel
Color table, drawing commands, and data averaging and sampling compression techniques
.bmp, .jpg, .gif, .ps formats
Audio representation: Music CD and MP3
Terminology
Binary vs. decimal
Position value
The base of a # system
Bit/byte/KB/MB/GB/TB
Integer binary #s
Real # in binary
Floating point numbers
Representational error
Roundoff errors
ASCII/EBCDIC/Unicode
Pixels
Dots per inch (dpi)
Bitmap
Color table
Data averaging
Data sampling
Data compression
.jpg, .bmp, .gif, .ps