Lecture 4: Data Formats

ITEC 1000 “Introduction to Information Technology”

Upload: wesley-andrews

Post on 13-Jan-2016



Lecture Template:

Data Forms
Data conversion and representation
Data Formats
Alphanumeric Data
Image Data
Audio Data
Data Input
Data Compression
Internal Computer Data Format

Data Forms

Human communication
  Includes language, images and sounds

Computers
  Process and store all forms of data in binary format

Conversion to a computer-usable representation is done using data formats
  Data formats define the different ways human data may be represented, stored and processed by a computer

Data conversion and representation

Data formats

Proprietary formats
  Unique to a product or company
  E.g., Microsoft Word, WordPerfect

Standards (evolve in two ways):
  Proprietary formats become de facto standards (e.g., Adobe PostScript)
  Invented by an international standards organization (e.g., Moving Picture Experts Group, MPEG)

Common Data Representations

Type of Data                  Standard(s)
Alphanumeric                  Unicode, ASCII, EBCDIC
Image (bitmapped)             GIF (Graphics Interchange Format), TIFF (Tagged Image File Format), PNG (Portable Network Graphics), JPEG
Image (object)                PostScript, SWF (Macromedia Flash), SVG
Outline graphics and fonts    PostScript, TrueType
Sound                         WAV, MP3, MIDI, WMA
Page description              PDF (Adobe Portable Document Format), HTML, XML
Video                         QuickTime, MPEG-2, RealVideo, WMV, AVI

Alphanumeric Data

Letters (r, T), number digits (0..9), punctuation (!, ;), special-purpose characters ($, &)

Four codes/standards to represent letters and numbers:
  BCD (Binary-Coded Decimal)
  Unicode
  ASCII (American Standard Code for Information Interchange)
  EBCDIC (Extended Binary Coded Decimal Interchange Code)

Standard Alphanumeric Formats

BCD (next 2 slides)
ASCII
EBCDIC
Unicode

Binary-Coded Decimal (BCD)

Four bits per digit

Digit   Bit pattern
0       0000
1       0001
2       0010
3       0011
4       0100
5       0101
6       0110
7       0111
8       1000
9       1001

Note: the following 6 bit patterns are not used:
1010, 1011, 1100, 1101, 1110, 1111

BCD: Example

7093₁₀ = ? (in BCD)

7    0    9    3
0111 0000 1001 0011
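The digit-by-digit conversion in this example can be sketched in a few lines of Python. This is only an illustration, not part of the original slides; the function name `to_bcd` is made up for the sketch.

```python
def to_bcd(number: int) -> str:
    """Encode a non-negative decimal integer as BCD, four bits per digit."""
    if number < 0:
        raise ValueError("this sketch handles non-negative integers only")
    # Each decimal digit is converted independently to its 4-bit pattern.
    return " ".join(format(int(digit), "04b") for digit in str(number))


print(to_bcd(7093))  # 0111 0000 1001 0011
```

Because every digit is encoded on its own, the six patterns 1010 to 1111 can never appear in the output.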

Standard Alphanumeric Formats

BCD
ASCII (next 13 slides)
EBCDIC
Unicode

ASCII Features

Developed by ANSI (American National Standards Institute)
Defined in ANSI document X3.4-1977
7-bit code
  8th bit is unused (or used for a parity bit or to indicate an “extended” character set)
2⁷ = 128 different codes
Two general types of codes:
  95 are “printing” codes (displayable on a console)
  33 are “control” codes (control features of the console or communications channel)
Represents the Latin alphabet, Arabic numerals and standard punctuation characters
  Plus a small set of accents and other European special characters (Latin-1 extension of ASCII)

ASCII Table

        000   001   010   011   100   101   110   111
0000    NUL   DLE   SP    0     @     P     `     p
0001    SOH   DC1   !     1     A     Q     a     q
0010    STX   DC2   "     2     B     R     b     r
0011    ETX   DC3   #     3     C     S     c     s
0100    EOT   DC4   $     4     D     T     d     t
0101    ENQ   NAK   %     5     E     U     e     u
0110    ACK   SYN   &     6     F     V     f     v
0111    BEL   ETB   '     7     G     W     g     w
1000    BS    CAN   (     8     H     X     h     x
1001    HT    EM    )     9     I     Y     i     y
1010    LF    SUB   *     :     J     Z     j     z
1011    VT    ESC   +     ;     K     [     k     {
1100    FF    FS    ,     <     L     \     l     |
1101    CR    GS    -     =     M     ]     m     }
1110    SO    RS    .     >     N     ^     n     ~
1111    SI    US    /     ?     O     _     o     DEL

ASCII Table

Column headings give the 3 most significant bits; row labels give the 4 least significant bits
  e.g., ‘a’ = 1100001 (column 110, row 0001)

ASCII Table

95 printing codes: SP (0100000) through ~ (1111110)
33 control codes: columns 000 and 001, plus DEL (1111111)
Alphabetic codes: A to Z (1000001 to 1011010) and a to z (1100001 to 1111010)
Numeric codes: 0 to 9 (0110000 to 0111001)
Punctuation, etc.: the remaining printing codes
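As a rough check on how the table is read, the sketch below splits a character's 7-bit code into its column (3 most significant bits) and row (4 least significant bits) and counts the two classes of codes. The helper name `describe` is an assumption made for this illustration.

```python
def describe(ch: str) -> tuple[str, str, str]:
    """Return (column bits, row bits, class) for a 7-bit ASCII character."""
    code = ord(ch)
    if code > 127:
        raise ValueError("not a 7-bit ASCII character")
    bits = format(code, "07b")
    # Codes 0-31 and 127 (DEL) are control codes; all others are printing codes.
    kind = "control" if code < 32 or code == 127 else "printing"
    return bits[:3], bits[3:], kind


printing = sum(1 for c in range(128) if describe(chr(c))[2] == "printing")
control = 128 - printing

print(describe("a"))      # ('110', '0001', 'printing')
print(printing, control)  # 95 33
```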

ASCII Table

Columns: most significant hexadecimal digit (MSD); rows: least significant hexadecimal digit (LSD)

LSD   0     1     2     3     4     5     6     7
0     NUL   DLE   SP    0     @     P     `     p
1     SOH   DC1   !     1     A     Q     a     q
2     STX   DC2   "     2     B     R     b     r
3     ETX   DC3   #     3     C     S     c     s
4     EOT   DC4   $     4     D     T     d     t
5     ENQ   NAK   %     5     E     U     e     u
6     ACK   SYN   &     6     F     V     f     v
7     BEL   ETB   '     7     G     W     g     w
8     BS    CAN   (     8     H     X     h     x
9     HT    EM    )     9     I     Y     i     y
A     LF    SUB   *     :     J     Z     j     z
B     VT    ESC   +     ;     K     [     k     {
C     FF    FS    ,     <     L     \     l     |
D     CR    GS    -     =     M     ]     m     }
E     SO    RS    .     >     N     ^     n     ~
F     SI    US    /     ?     O     _     o     DEL

e.g., ‘t’ = 74₁₆ = 111 0100

Example: “Hello, world”

Character   Binary    Hexadecimal   Decimal
H           1001000   48            72
e           1100101   65            101
l           1101100   6C            108
l           1101100   6C            108
o           1101111   6F            111
,           0101100   2C            44
(space)     0100000   20            32
w           1110111   77            119
o           1101111   6F            111
r           1110010   72            114
l           1101100   6C            108
d           1100100   64            100
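The three representations of the “Hello, world” example can be generated with a short Python sketch. It is an illustration only and relies on nothing beyond the built-in `ord` and `format` functions.

```python
text = "Hello, world"

for ch in text:
    code = ord(ch)  # ASCII code of the character
    # 7-bit binary, 2-digit hexadecimal, decimal
    print(repr(ch), format(code, "07b"), format(code, "02X"), code)

hex_string = "".join(format(ord(ch), "02X") for ch in text)
print(hex_string)  # 48656C6C6F2C20776F726C64
```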

Common Control Codes

Code   Hexadecimal code   Meaning
CR     0D                 carriage return
LF     0A                 line feed
HT     09                 horizontal tab
DEL    7F                 delete
NUL    00                 null

Standard Alphanumeric Formats

BCD
ASCII
EBCDIC (next 3 slides)
Unicode

EBCDIC

8-bit code
Developed by IBM
IBM and compatible mainframes only
Rarely used today (common in archival data)
Character codes differ from ASCII
Conversion software to/from ASCII available

         ASCII   EBCDIC
Space    20₁₆    40₁₆
A        41₁₆    C1₁₆
b        62₁₆    82₁₆
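The comparison above can be reproduced with Python's built-in codecs. This sketch assumes that the `cp037` codec (one IBM EBCDIC code page, for US/Canada) is representative of EBCDIC; other EBCDIC code pages differ in some positions. The helper name `codes` is made up for the illustration.

```python
def codes(ch: str) -> tuple[str, str]:
    """Return the (ASCII, EBCDIC cp037) codes of a character as hex strings."""
    ascii_code = ch.encode("ascii")[0]
    ebcdic_code = ch.encode("cp037")[0]  # assumption: cp037 stands in for EBCDIC
    return format(ascii_code, "02X"), format(ebcdic_code, "02X")


for ch in (" ", "A", "b"):
    print(repr(ch), *codes(ch))
```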

EBCDIC Table (1 out of 2)

EBCDIC Table (2 out of 2)

Standard Alphanumeric Formats

BCD
ASCII
EBCDIC
Unicode (next 2 slides)

Unicode

Most common 16-bit form represents 65,536 characters

ASCII Latin-1 is a subset of Unicode
  Values 0 to 255 in the Unicode table

Multilingual: defines codes for
  Nearly every character-based alphabet
  Large set of ideographs for Chinese, Japanese and Korean
  Composite characters for vowels and syllabic clusters required by some languages

Allows software modifications for local languages
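A small Python sketch can illustrate the two points above: characters in the 16-bit range occupy two bytes, and the first 256 Unicode values coincide with Latin-1. The sample characters are arbitrary choices for the illustration, and `utf-16-be` is used here as one concrete 16-bit encoding.

```python
samples = ["A", "\u00e9", "\u4e2d"]  # Latin letter, accented letter, CJK ideograph

for ch in samples:
    two_bytes = ch.encode("utf-16-be")  # 16-bit form, most significant byte first
    print(ord(ch), two_bytes.hex())

# Unicode values 0 to 255 match the Latin-1 byte values one for one.
latin1_matches = all(chr(i).encode("latin-1")[0] == i for i in range(256))
print(latin1_matches)  # True
```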

Two-byte Unicode Assignment Table

Collating Sequence

Collating sequence: the order of the codes in the representation table

Determines sorting and selection of the alphanumeric data

Collating sequences are different in ASCII and EBCDIC:
  Small letters precede capitals in EBCDIC; reverse in ASCII
  Numbers collate first in ASCII; in EBCDIC, last
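To see the effect on sorting, the sketch below orders the same three items by their ASCII codes and by their EBCDIC codes. It assumes Python's `cp037` codec as a stand-in for EBCDIC.

```python
items = ["b", "A", "1"]

# Sort by the numeric value of each character's code in the given encoding.
ascii_order = sorted(items, key=lambda s: s.encode("ascii"))
ebcdic_order = sorted(items, key=lambda s: s.encode("cp037"))

print(ascii_order)   # ['1', 'A', 'b']
print(ebcdic_order)  # ['b', 'A', '1']
```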

Two Classes of Codes

Printing characters
  Produce output on the screen or printer

Control characters
  Control position of output on screen or printer
  Cause action to occur
  Communicate status between computer and I/O device

Control Code Definitions (ASCII Table)

Escape Sequences

Extend the capability of the ASCII code set
For controlling terminals and formatting output
Defined by ANSI in documents X3.41-1974 and X3.64-1977
The escape code is ESC = 1B₁₆
An escape sequence begins with two codes: ESC [ (1B₁₆ 5B₁₆)

Escape Sequences: Examples

Erase display: ESC [ 2 J
Erase line: ESC [ K
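The two example sequences can be built as byte strings in Python to show which codes are actually sent. The constant names are chosen for this sketch only, and whether a given terminal honors the sequences depends on the terminal.

```python
ESC = "\x1b"     # the escape code, 1B hex
CSI = ESC + "["  # the two codes that begin a sequence: 1B 5B

ERASE_DISPLAY = CSI + "2J"
ERASE_LINE = CSI + "K"

# Show the hexadecimal codes of each sequence rather than sending it to a terminal.
print(ERASE_DISPLAY.encode("ascii").hex(" "))  # 1b 5b 32 4a
print(ERASE_LINE.encode("ascii").hex(" "))     # 1b 5b 4b
```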

Alphanumeric Input: Keyboard

Scan code
  Two different binary scan codes are generated: one when a key is struck and one when the key is released
  Converted to Unicode, ASCII or EBCDIC by software in the terminal or PC
  Received by the host as a stream of text and other characters, i.e., in the sequence typed

Advantages
  Easily adapted to different languages or keyboard layouts
  Separate scan codes for key press and release allow multiple-key combinations
    Examples: Shift and Control keys

Shift Key

Inhibits (clears) bit 5 in the ASCII code

Key(s)      ASCII code (bits 6 5 4 3 2 1 0)   Character
a           1 1 0 0 0 0 1                     a
Shift + a   1 0 0 0 0 0 1                     A

Control Key

Inhibits (clears) bits 5 and 6 in the ASCII code

Key(s)     ASCII code (bits 6 5 4 3 2 1 0)   Character
c          1 1 0 0 0 1 1                     c
Ctrl + c   0 0 0 0 0 1 1                     ETX (control code)
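The effect of the Shift and Control keys on a lowercase letter can be sketched with bit masks. This is an illustration of the bit arithmetic only, not of how any particular keyboard driver is written; the function names are assumptions.

```python
BIT5 = 0b0100000
BIT6 = 0b1000000


def shift(ch: str) -> str:
    """Clear bit 5 of a lowercase letter's ASCII code."""
    return chr(ord(ch) & ~BIT5)


def ctrl(ch: str) -> int:
    """Clear bits 5 and 6 of a letter's ASCII code, giving a control code."""
    return ord(ch) & ~(BIT5 | BIT6)


print(shift("a"))                # A
print(format(ctrl("c"), "07b"))  # 0000011 (ETX)
```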

Keyboard Input

Three letters are typed: “D”, “I”, “R”, followed by the carriage return

Four scan codes translated to ASCII binary codes: 1000100, 1001001, 1010010, 0001101

OCR (optical character recognition)

Scans text and inputs it as character data
Special OCR software required
Used to read specially encoded characters
  Example: magnetically printed check numbers
Attempts to recognize handwritten input (limited, only carefully printed)

Bar Code Readers

Used in applications that require fast, accurate and repetitive input with minimal employee training

Examples: supermarket checkout counters and inventory control

Alphanumeric data in a bar code (e.g., 780471 108801 90000) is read optically using a wand that converts the bars into electrical binary signals

A bar code translation module converts the binary input into a sequence of number codes, one code per digit, which are then translated to Unicode or ASCII

Other Alphanumeric Input

Magnetic stripe reader: alphanumeric data from credit cards

Voice
  Digitized audio recording is common, but conversion to alphanumeric data is difficult
  Requires knowledge of sound patterns in a language (phonemes) plus rules for pronunciation, grammar, and syntax

Image Data

Photographs, figures, icons, drawings, charts and graphs

Two approaches:
  Bitmap or raster images of photos and paintings with continuous variation (e.g., GIF, JPEG)
  Object or vector images composed of graphical shapes like lines and curves defined geometrically

Differences include:
  Quality of the image
  Storage space required
  Time to transmit
  Ease of modification

Image Input

Image scanning: the scanner moves over the image, converting it dot by dot into a stream of binary numbers (pixels) representing black or white, levels of gray, or a colour – bitmap image

Digital/video cameras – bitmap image

Pointing devices (mouse, pen) – object image

Bitmap Images

Each individual pixel (picture element) in a graphic is stored as a binary number

Pixel: a small area with an associated coordinate location
  Example: each point represented by a 4-bit code corresponding to 1 of 16 shades of gray

Bitmap Display

Monochrome: black or white
  1 bit per pixel
Gray scale: black, white or 254 shades of gray
  1 byte per pixel
Color graphics: 16 colors, 256 colors, or 24-bit true color (16.7 million colors)
  4, 8, and 24 bits per pixel respectively

Storing Bitmap Images

Frequently large files
  Example: 600 rows of 800 pixels with 1 byte for each of 3 colors gives a file of about 1.44 MB (roughly 1.5 MB)

File size affected by
  Resolution (the number of pixels per inch)
    Amount of detail affecting clarity and sharpness of an image
  Levels: number of bits for displaying shades of gray or multiple colors
  Palette: color translation table that uses a code for each pixel rather than the actual color value
  Data compression
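The storage arithmetic for bitmaps can be checked with a short sketch. It assumes an uncompressed image with no file header or row padding, which real file formats usually add; the function names are made up for the illustration.

```python
def bitmap_bytes(width: int, height: int, bits_per_pixel: int) -> int:
    """Bytes needed for an uncompressed bitmap, rounded up to a whole byte."""
    return (width * height * bits_per_pixel + 7) // 8


def color_count(bits_per_pixel: int) -> int:
    """Number of distinct values a pixel can take."""
    return 2 ** bits_per_pixel


print(bitmap_bytes(800, 600, 24))  # 1440000 bytes, about 1.44 MB
print(bitmap_bytes(800, 600, 1))   # 60000 bytes for monochrome
print(color_count(4), color_count(8), color_count(24))  # 16 256 16777216
```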

GIF (Graphics Interchange Format)

First developed by CompuServe in 1987
GIF89a enabled animated images
  Allows images to be displayed sequentially at fixed time intervals
Color limitation: 256
Image compressed by the LZW (Lempel-Ziv-Welch) algorithm
Preferred for line drawings, clip art and pictures with large blocks of solid color
Lossless compression

JPEG (Joint Photographic Experts Group)

Allows more than 16 million colors
Suitable for highly detailed photographs and paintings
Employs a special compression algorithm that
  Discards data to decrease file size and transmission time
  May reduce image resolution; tends to distort sharp lines

Other Bitmap Formats

TIFF (Tagged Image File Format): .tif (pronounced tif)
  Used in high-quality image processing, particularly in publishing

BMP (BitMaPped): .bmp (pronounced dot bmp)
  Device-independent format for the Microsoft Windows environment: pixel colors stored independent of output device

PCX: .pcx (pronounced dot p c x)
  Windows Paintbrush software

PNG (Portable Network Graphics): .png (pronounced ping)
  Designed to replace GIF and JPEG for Internet applications
  Patent-free
  Improved lossless compression
  No animation support

Object Images

Created by drawing packages or output from spreadsheet data graphs
Composed of lines and shapes in various colors
Computer translates geometric formulas to create the graphic
Storage space depends on image complexity
  Number of instructions to create lines, shapes, fill patterns
Movies Shrek and Toy Story use object images

Object Images

Based on mathematical formulas
  Easy to move, scale and rotate without losing shape and identity, as bitmap images may
Require less storage space than bitmap images
Cannot represent photos or paintings
Cannot be displayed or printed directly
  Must be converted to bitmap, since output devices other than plotters are bitmap devices

Popular Object Graphics Software

Most object image formats are proprietary
  File extensions include .wmf, .dxf, .mgx, and .cgm
Macromedia Flash: low-bandwidth animation
Micrografx Designer: technical drawings to illustrate products
CorelDraw: vector illustration, layout, bitmap creation, image-editing, painting and animation software
Autodesk AutoCAD: for architects, engineers, drafters, and design-related professionals
W3C SVG (Scalable Vector Graphics): based on the XML Web description language
  Not proprietary

PostScript

Page description language: list of procedures and statements that describe each of the objects to be printed on a page
  Stored in an ASCII or Unicode text file
  Interpreter program in computer or output device reads PostScript to generate the image

Scalable font support
  Font outline objects specified like other objects

PostScript Program

Representing Characters as Images

Characters stored in a format like Unicode or ASCII
  Text processed and stored primarily for content

Presentation requirements like font stored with the character
  Text appearance is the primary factor
  Example: screen fonts in Windows

Glyphs: Macintosh coding scheme that includes both identification and presentation requirements for characters

Bitmap vs. Object Images

Bitmap (Raster)                                      Object (Vector)
Pixel map                                            Geometrically defined shapes
Photographic quality                                 Complex drawings
Paint software                                       Drawing software
Larger storage requirements                          Higher computational requirements
Enlarging images produces jagged edges               Objects scale smoothly
Resolution of output limited by resolution of image  Resolution of output limited by output device

Video Images

Require massive amounts of data
  A video camera producing a full-screen 640 x 480 pixel true-color image at 30 frames/sec generates 27.65 MB of data per second
  A 1-minute film clip requires about 1.66 GB of storage

Options for reducing file size: decrease size of image, limit number of colors, reduce frame rate

Method depends on how video is delivered to users
  Streaming video: video displayed as it is downloaded from the Web server
    Example: video conferencing
  Local data (file on DVD or downloaded onto system) for higher quality
    MPEG-2: movie-quality images with high compression require substantial processing capability
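The data-rate figures for uncompressed video follow from simple multiplication, sketched here in Python. The sketch assumes 3 bytes per pixel for true color and decimal megabytes and gigabytes; the variable names are illustrative.

```python
WIDTH, HEIGHT = 640, 480
BYTES_PER_PIXEL = 3        # 24-bit true color
FRAMES_PER_SECOND = 30

bytes_per_frame = WIDTH * HEIGHT * BYTES_PER_PIXEL
bytes_per_second = bytes_per_frame * FRAMES_PER_SECOND
bytes_per_minute = bytes_per_second * 60

print(bytes_per_second / 1e6)  # 27.648 (MB per second)
print(bytes_per_minute / 1e9)  # 1.65888 (GB per minute)
```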

Audio Data

Transmission and processing requirements less demanding than those for video

Waveform audio: digital representation of sound

MIDI (Musical Instrument Digital Interface): instructions to recreate or synthesize sounds

Analog sound converted to digital values by an A-to-D converter

Waveform Audio

Sampling rate normally 50 kHz

Sampling Rate

Number of times per second that sound is measured during the recording process
  1000 samples per second = 1 kHz (kilohertz)
  Example: audio CD sampling rate = 44.1 kHz

Height of each sample saved as:
  8-bit number for radio-quality recordings
  16-bit number for high-fidelity recordings
  2 x 16 bits for stereo
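Combining the sampling rate with the sample size gives the storage needed for uncompressed waveform audio. The sketch below works this out for CD-quality stereo; the function name is an assumption for the illustration.

```python
def audio_bytes_per_second(sample_rate_hz: int, bits_per_sample: int, channels: int) -> int:
    """Bytes of uncompressed waveform audio produced per second."""
    return sample_rate_hz * bits_per_sample // 8 * channels


cd_rate = audio_bytes_per_second(44_100, 16, 2)
print(cd_rate)       # 176400 bytes per second
print(cd_rate * 60)  # 10584000 bytes per minute
```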

MIDI

Music notation system that allows computers to communicate with music synthesizers

Instructions that MIDI instruments and MIDI sound cards use to recreate or synthesize sounds
  Do not store or recreate speaking or singing voices
  More compact than waveform
  3 minutes = 10 KB

Audio Formats

MP3
  Derivative of MPEG-2 (ISO Moving Picture Experts Group)
  Uses psychoacoustic compression techniques to reduce storage requirements
  Discards sounds outside human hearing range: lossy compression

WAV
  Developed by Microsoft as part of its multimedia specification
  General-purpose format for storing and reproducing small snippets of sound

.WAV Sound Format

Data Compression

Compression: recoding data so that it requires fewer bytes of storage space

Compression ratio: the amount by which a file is shrunk

Lossless: inverse algorithm restores data to exact original form
  Examples: GIF, PCX, TIFF

Lossy: trades off data degradation for file size and download speed
  Much higher compression ratios, often 10 to 1
  Example: JPEG
  Common in multimedia

MPEG-2: uses both forms for ratios of 100:1

Compression Algorithms

Repetition
  0 5 8 7 0 0 0 0 3 4 0 0 0 becomes 0 1 5 8 7 0 4 3 4 0 3
  Each run of zeros is replaced by a 0 followed by the length of the run
  Example: large blocks of the same color

Pattern Substitution
  Scans data for patterns
  Substitutes new pattern, makes dictionary entry
  Example: 45 to 30 bytes plus dictionary
    Peter Piper picked a peck of pickled peppers.
    Dictionary patterns: Pe, pi, ed, er, ck, pe, Pi
    Characters left unsubstituted: t p a of l pp s.
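The repetition technique can be sketched as a zero-run encoder. This reads the slide's number sequence as a 13-value input (0 5 8 7 0 0 0 0 3 4 0 0 0) followed by its 11-value encoding (0 1 5 8 7 0 4 3 4 0 3), where each run of zeros becomes a 0 followed by the run length. That reading, and the function names, are assumptions of this illustration.

```python
def compress_zero_runs(values: list[int]) -> list[int]:
    """Replace each run of zeros with the pair (0, run length)."""
    out = []
    i = 0
    while i < len(values):
        if values[i] == 0:
            run = 0
            while i < len(values) and values[i] == 0:
                run += 1
                i += 1
            out.extend([0, run])
        else:
            out.append(values[i])
            i += 1
    return out


def expand_zero_runs(encoded: list[int]) -> list[int]:
    """Inverse of compress_zero_runs: restore the original values exactly."""
    out = []
    i = 0
    while i < len(encoded):
        if encoded[i] == 0:
            out.extend([0] * encoded[i + 1])
            i += 2
        else:
            out.append(encoded[i])
            i += 1
    return out


original = [0, 5, 8, 7, 0, 0, 0, 0, 3, 4, 0, 0, 0]
packed = compress_zero_runs(original)
print(packed)  # [0, 1, 5, 8, 7, 0, 4, 3, 4, 0, 3]
```

Because the expansion restores the input exactly, this is a lossless scheme; it only pays off when zeros tend to occur in runs longer than two.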

Internal Computer Data Format

All data stored as binary numbers
Interpreted based on
  Operations computer can perform
  Data types supported by programming language used to create application

Five Simple Data Types

Boolean: 2-valued variables or constants with values of true or false

Char: variable or constant that holds an alphanumeric character

Enumerated
  User-defined data types with possible values listed in the definition
  Type DayOfWeek = Mon, Tues, Wed, Thurs, Fri, Sat, Sun

Integer: positive or negative whole numbers

Real
  Numbers with a decimal point
  Numbers whose magnitude, large or small, exceeds the computer’s capability to store as an integer
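As a rough illustration, the five simple types map onto Python as follows. Python is used only as an example language here: it has no separate char type, so a one-character string stands in, and its integers are not limited in size the way the slide's integer type is. All variable names are made up for the sketch.

```python
from enum import Enum


class DayOfWeek(Enum):
    """Enumerated type: the possible values are listed in the definition."""
    MON = 1
    TUES = 2
    WED = 3
    THURS = 4
    FRI = 5
    SAT = 6
    SUN = 7


is_weekend: bool = False      # Boolean: true or false
initial: str = "T"            # Char: a one-character string stands in
today: DayOfWeek = DayOfWeek.WED
count: int = -42              # Integer: positive or negative whole number
avogadro: float = 6.022e23    # Real: decimal point, very large or very small magnitude

print(type(is_weekend).__name__, type(initial).__name__, today.name,
      type(count).__name__, type(avogadro).__name__)
```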