Fundamentals of Multimedia Slides


  • Slide 1: Multimedia Systems

  • Slide 2: What is Multimedia?

    When different people mention the term multimedia, they often have quite different, or even opposing, viewpoints.

    A PC vendor: a PC that has sound capability, a DVD-ROM drive, and perhaps the superiority of multimedia-enabled microprocessors that understand additional multimedia instructions.

    A consumer entertainment vendor: interactive cable TV with hundreds of digital channels available, or a cable TV-like service delivered over a high-speed Internet connection.

    A Computer Science (CS) student: applications that use multiple modalities, including text, images, drawings (graphics), animation, video, sound (including speech), and interactivity.

  • Slide 3: What is Multimedia?

    Multimedia is multiple forms of information content and information processing (e.g. text, audio, graphics, animation, video, interactivity) used to inform or entertain.

    Computer-controlled integration of text, graphics, drawings, still and moving images (video), animation, audio, and any other media where every type of information can be represented, stored, transmitted and processed digitally.

    Characterized by the processing, storage, generation, manipulation and rendition of multimedia information.

    Multimedia may be broadly divided into linear and non-linear categories. Linear content progresses without any navigation control for the viewer, such as a cinema presentation. Non-linear content offers user interactivity to control progress, as used in a computer game or in self-paced computer-based training.

  • Slide 4: History of Multimedia and Hypermedia

    Newspaper: perhaps the first mass communication medium; uses text, graphics, and images.

    Motion pictures: conceived of in the 1830s in order to observe motion too rapid for perception by the human eye.

    Wireless radio transmission: Guglielmo Marconi, at Pontecchio, Italy, in 1895.

    Television: the new medium for the 20th century; established video as a commonly available medium and has since changed the world of mass communications.

    The connection between computers and ideas about multimedia covers only a short period.

  • Slide 5: Characteristics of a Multimedia System

    A multimedia system has four basic characteristics:

    Multimedia systems must be computer controlled.

    Multimedia systems are integrated.

    The information they handle must be represented digitally.

    The interface to the final presentation of media is usually interactive.

  • Slide 6: Challenges of a Multimedia System

    Very high processing power: needed to deal with large data processing and real-time delivery of media.

    Multimedia-capable file system: needed to deliver real-time media, e.g. video/audio streaming; special hardware/software such as RAID technology is needed.

    Data representations/file formats that support multimedia: data representations/file formats should be easy to handle yet allow for compression/decompression in real time.

    Efficient and high I/O: input and output to the file subsystem needs to be efficient and fast, and needs to allow for real-time recording as well as playback of data, e.g. direct-to-disk recording systems.

    Special operating system: to allow access to the file system and process data efficiently and quickly; needs to support direct transfers to disk, real-time scheduling, fast interrupt processing, I/O streaming, etc.

    Storage and memory: large storage units (of the order of 50-100 GB or more) and large memory (50-100 MB or more).

    Network support: client-server operation is common, as multimedia systems are commonly distributed.

    Software tools: user-friendly tools are needed to handle media, design and develop applications, and deliver media.

  • Slide 7: Applications of Multimedia

    Multimedia finds its application in various areas, including:

    Advertisements

    Art

    Education

    Entertainment

    Engineering

    Medicine

    Mathematics

    Business

    Scientific research

    Spatial and temporal applications

  • Slide 8: Applications of Multimedia

    World Wide Web

    Hypermedia courseware

    Video conferencing

    Video-on-demand

    Interactive TV

    Groupware

    Home shopping

    Games

    Virtual reality

    Digital video editing and production systems

    Multimedia database systems

  • Slide 9: Topics

    Issues in Multimedia (Authoring and Design)

    Multimedia authoring versus programming: the difference between multimedia authoring and programming

    Multimedia application design: design stages, storyboarding

    Multimedia software tools: audio sequencing, image/graphics editing, animation, multimedia authoring

    Text: fonts and faces, character sets and alphabets, font editing and design tools

    Images/graphics: digital images, image data types, colors

    Audio: sound digitization, audio file formats

  • Slide 10: Topics contd.

    Video

    Text compression

    Image compression

    Audio compression

    Video compression

    Multimedia hardware & software

    Content-based multimedia retrieval

    Multimedia network communications

    Use of previous programming skills

  • Slide 11: Multimedia Authoring and Tools

    Multimedia authoring: creation of multimedia productions, sometimes called movies or presentations.

    Why should you use an authoring system?

    An authoring system has pre-programmed elements for the development of interactive multimedia software titles.

    Authoring systems vary widely in orientation, capabilities, and learning curve.

    There is no such thing as a completely point-and-click automated authoring system; authoring is actually just a speeded-up form of programming.

    We are mostly interested in interactive applications.

    We also have a look at still-image editors such as Adobe Photoshop, and simple video editors such as Adobe Premiere, since they help to create interactive multimedia projects.

    The level of interaction goes from no interactivity to virtual reality creation: control the pace (e.g. click Next), control the sequence, control the object.

  • Slide 12: Multimedia Authoring Paradigms/Methodology

    Multimedia authoring metaphors

    Multimedia production

    Multimedia presentation

    Automatic authoring

  • Slide 13: Multimedia Authoring Metaphors

    Scripting Language Metaphor: use a special language to enable interactivity (buttons, mouse, etc.), and to allow conditionals, jumps, loops, functions/macros, etc. E.g., a small Toolbook program is as below:

  • Slide 14: Topics contd. (repeats Slide 10)

  • Slide 15: Multimedia Authoring and Tools

  • Slide 16: Applications of Multimedia (repeats Slide 8)

  • Slide 17: Multimedia Authoring

    Multimedia authoring: creation of multimedia productions, sometimes called movies or presentations.

    Authoring involves the assembly and bringing together of multimedia, possibly with high-level graphical interface design and some high-level scripting.

    Programming involves low-level assembly, construction and control of multimedia, and involves real languages like C and Java.

    An authoring system has pre-programmed elements for the development of interactive multimedia software.

    Authoring systems vary widely in orientation, capabilities, and learning curve; there is no completely point-and-click automated authoring system.

    Authoring is a speeded-up form of programming, taking roughly 1/8 of programming development time.

  • Slide 18: Multimedia Authoring

    Focus is on interactive applications. Why?

    The level of interaction goes from no interactivity to virtual reality creation: control the pace (e.g. click Next), control the sequence, control the object, control the entire simulation.

    It also includes image editors such as Adobe Photoshop, and simple video editors such as Adobe Premiere, since they help to create interactive multimedia projects.

    In this section, we take a look at:

    Multimedia authoring metaphors

    Multimedia application production

    Automatic authoring
  • Slide 19: Multimedia Authoring Metaphors

    1. Scripting Language Metaphor

    Use a special language to enable interactivity (buttons, mouse, etc.), and to allow conditionals, jumps, loops, functions/macros, etc.

    Closest to programming.

    Tends to be longer in development time.

    Runtime speed is minimal.

    2. Iconic/Flow-control Metaphor

    Graphical icons are available in a toolbox, and authoring proceeds by creating a flowchart with icons attached.

    Speediest in development time and suited for short-time projects.

    Suffers least from runtime speed problems.

    Example script (Lingo):

    global gNavSprite
    on exitFrame
      go the frame
      play sprite gNavSprite
    end

  • Slide 20: Fig. 2.1: Authorware flowchart

  • Slide 21

    4. Hierarchical Metaphor

    Represented by embedded objects and iconic properties.

    User-controllable elements are organized into a tree structure.

    The learning curve is non-trivial.

    Often used in menu-driven applications.

    5. Frames Metaphor

    Like the Iconic/Flow-control Metaphor; however, links between icons are more conceptual, rather than representing the actual flow of the program.

    A very fast development system, but requires a good auto-debugging function.

  • Slide 22: Fig. 2.2: Quest frame

  • Slide 23

    7. Cast/Score/Scripting Metaphor

    Time is shown horizontally, like a spreadsheet: rows, or tracks, represent instantiations of characters in a multimedia production.

    Multimedia elements are drawn from a cast of characters, and scripts are basically event procedures or procedures that are triggered by timer events.

    Director, by Macromedia, is the chief example of this metaphor. Director uses the Lingo scripting language, an object-oriented event-driven language.

  • Slide 24: Multimedia Application Production

    The multimedia design phase consists of:

    Storyboarding: helps to plan the general organization or content of a presentation by recording and organizing ideas on index cards, or placed on a board/wall; ensures media are collected and organized.

    Flowcharting: adds navigation information to the storyboard, the multimedia concept structure and user interaction, followed by a detailed functional requirement specification.

    Prototyping and user testing.

    Parallel media production.

    Two types of design considerations also need to be made: multimedia content design and technical design.

  • Slide 25: Multimedia Content Design

    Content design deals with what to say and what vehicle to use.

    There are five ways to format and deliver your message. You can write it, illustrate it, wiggle it, hear it, and interact with it.

    Writing (scripting):

    Understand your audience and correctly address them.

    Keep your writing as simple as possible (e.g., write out the full message(s) first, then shorten).

    Make sure the technologies used complement each other.

    Illustrating (graphics):

    Make use of pictures to effectively deliver your messages.

    Create your own (draw, (color) scanner, PhotoCD, ...), or keep "copy files" of artworks.

    Graphic styles: fonts, colors.

  • Slide 26: Multimedia Content Design

    Graphics styles: human visual dynamics impact how presentations must be constructed.

    (a) Color principles and guidelines: some color schemes and art styles are best combined with a certain theme or style. A general hint is not to use too many colors, as this can be distracting.

    (b) Fonts: for effective visual communication in a presentation, it is best to use large fonts (i.e., 18 to 36 points), and no more than 6 to 8 lines per screen (fewer than on this screen!). Fig. 2.4 shows a comparison of two screen projections.

  • Slide 27: Fig. 2.4: Colours and fonts [from Ron Vetter]

  • Slide 28

    (c) A color contrast program: if the text color is some triple (R, G, B), a legible color for the background is that color subtracted from the maximum (here assuming max = 1):

    (R, G, B) -> (1 - R, 1 - G, 1 - B)    (2.1)

    Some color combinations are more pleasing than others; e.g., a pink background and forest-green foreground, or a green background and mauve foreground. Fig. 2.5 shows a small VB program (textcolor.exe) in operation.
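    The arithmetic of Eq. (2.1) is easy to check in code. Below is a minimal Python sketch (not from the slides; the function name is mine) using 8-bit channels, so the maximum of Eq. (2.1) is 255 instead of 1:

    def contrast_background(text_rgb):
        # Complement each channel per Eq. (2.1); channels are 8-bit here,
        # so the maximum is 255 rather than the slides' max = 1.
        return tuple(255 - c for c in text_rgb)

    # Forest-green text yields a pink-ish background, matching the
    # pleasing pink/forest-green pairing mentioned above.
    print(contrast_background((34, 139, 34)))  # (221, 116, 221)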

  • Slide 29: Fig. 2.5: Program to investigate colours and readability

  • Slide 30: Fig. 2.6: Colour wheel

    Fig. 2.6 shows a colour wheel, with opposite colours equal to (1 - R, 1 - G, 1 - B).

  • Slide 31: Wiggling (Animation)

    1. Types of animation:

    Character animation: humanize an object.

    Highlights and sparkles: pop a word in/out of the screen, sparkle a logo.

    Moving text.

    Video: live video or digitized video.

    2. When to animate: only animate when it has a specific purpose, e.g. to

    enhance emotional impact,

    make a point,

    improve information delivery,

    indicate the passage of time,

    provide a transition to the next subsection.

  • Slide 32: Video Transitions

    Video transitions signal scene changes. There are many different types of transitions:

    1. Cut: an abrupt change of image contents formed by abutting two video frames consecutively. This is the simplest and most frequently used video transition.

  • Slide 33

    2. Wipe: a replacement of the pixels in a region of the viewport with those from another video. Wipes can be left-to-right, right-to-left, vertical, horizontal, like an iris opening, swept out like the hands of a clock, etc.

    3. Dissolve: replaces every pixel with a mixture over time of the two videos, gradually replacing the first by the second. Most dissolves can be classified as one of two types: cross dissolve and dither dissolve.

  • Slide 34

    Type I: Cross Dissolve

    Every pixel is affected gradually. It can be defined by:

    D = (1 - alpha(t)) A + alpha(t) B    (2.2)

    where A and B are the color 3-vectors for video A and video B. Here, alpha(t) is a transition function, which is often linear:

    alpha(t) = k t, with k t_max = 1    (2.3)
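    As a concrete reading of Eqs. (2.2) and (2.3), here is a short numpy sketch (mine, not the slides'; it assumes the two videos arrive as aligned, same-sized RGB frames):

    import numpy as np

    def cross_dissolve(frame_a, frame_b, t, t_max):
        # Linear transition of Eq. (2.3): alpha = k*t with k = 1/t_max.
        alpha = t / t_max
        # Eq. (2.2): every pixel is a gradual mix of video A and video B.
        mixed = (1.0 - alpha) * frame_a + alpha * frame_b
        return mixed.astype(np.uint8)

    # Halfway through, each pixel is the average of the two frames.
    a = np.zeros((4, 4, 3))        # a black frame from video A
    b = np.full((4, 4, 3), 255.0)  # a white frame from video B
    print(cross_dissolve(a, b, t=5, t_max=10)[0, 0])  # [127 127 127]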

  • Slide 35

    Type II: Dither Dissolve

    Determined by alpha(t), increasingly more and more pixels in video A will abruptly (instead of gradually, as in Type I) change to video B.

  • Slide 36

    Fade-in and fade-out are special types of Type I dissolve: video A or B is black (or white). Wipes are special forms of Type II dissolve in which changing pixels follow a particular geometric pattern.

    Build-your-own transition: suppose we wish to build a special type of wipe which slides one video out while another video slides in to replace it: a slide (or push).

  • Slide 37

    (a) Unlike a wipe, we want each video frame not to be held in place, but instead to move progressively farther into (out of) the viewport.

    (b) Suppose we wish to slide VideoL in from the left, and push out VideoR. Figure 2.9 shows this process.

    Fig. 2.9: (a): VideoL. (b): VideoR. (c): VideoL sliding into place and pushing out VideoR.
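    A sketch of the slide (push) just described, under the same assumptions as the dissolve sketch above (aligned numpy frames, and a linear schedule for the offset, which is my choice):

    import numpy as np

    def push_from_left(frame_l, frame_r, t, t_max):
        # Offset grows over time: how far VideoL has slid into view.
        height, width, _ = frame_r.shape
        offset = int(width * t / t_max)
        out = np.empty_like(frame_r)
        # Neither frame is held in place: the visible part of VideoL is
        # its rightmost columns, and VideoR's columns are pushed right.
        out[:, :offset] = frame_l[:, width - offset:]
        out[:, offset:] = frame_r[:, :width - offset]
        return out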

  • Slide 38: Hearing (Audio)

    Types of audio in a multimedia application:

    Music: sets the mood of the presentation, enhances the emotion, illustrates points.

    Sound effects: make specific points, e.g., squeaky doors, explosions, wind, ...

    Narration: the most direct message, often effective.

  • Slide 39: Interactivity (Interacting)

    Interactive multimedia systems: people remember 70% of what they interact with.

    Menu-driven programs/presentations: often a hierarchical structure (main menu, sub-menus, ...).

    Hypermedia: less structured; cross-links between subsections of the same subject; nonlinear, with quick access to information; easier for introducing more multimedia features.

    Simulations / performance-dependent simulations: e.g., games such as SimCity, flight simulators.

  • Slide 40: Technical Design Issues

    1. Computer platform: much software is ostensibly portable, but cross-platform software relies on run-time modules which may not work well across systems.

    2. Video format and resolution: the most popular video formats (NTSC, PAL, and SECAM) are not compatible, so a conversion is required before a video can be played on a player supporting a different format.

    3. Memory and disk space requirements: at least 128 MB of RAM and 20 GB of hard-disk space should be available for acceptable performance and storage for multimedia programs.

  • Slide 41

    4. Delivery methods: not everyone/everywhere has rewritable DVD drives, as yet.

    CD-ROMs may not offer enough storage to hold a multimedia presentation. As well, access time for CD-ROM drives is longer than for hard-disk drives.

    Electronic delivery is an option, but depends on network bandwidth at the user side (and at the server). A streaming option may be available, depending on the presentation.

  • Slide 42: Automatic Authoring

    Hypermedia documents: generally, three steps:

    1. Capture of media: from text or using an audio digitizer or video frame-grabber; highly developed and well automated.

    2. Authoring: how best to structure the data in order to support multiple views of the available data, rather than a single, static view.

    3. Publication: i.e. presentation, the objective of the multimedia tools we have been considering.

  • Slide 43: Externalization versus Linearization

    (a) Fig. 2.12(a) shows the essential problem involved in communicating ideas without using a hypermedia mechanism.

    (b) In contrast, hyperlinks allow us the freedom to partially mimic the author's thought process (i.e., externalization).

    (c) Using, e.g., Microsoft Word, one creates a hypertext version of a document by following the layout already set up in chapters, headings, and so on. But problems arise when we actually need to automatically extract semantic content and find links and anchors (even considering just text and not images etc.). Fig. 2.13 displays the problem.

  • Slide 44: Fig. 2.12: Communication using hyperlinks [from David Lowe]

  • Slide 45

    (d) Once a dataset becomes large, we should employ database methods. The issues become focused on scalability (to a large dataset), maintainability, addition of material, and reusability.

    Fig. 2.13: Complex information space [from David Lowe].

  • Slide 46: Semi-automatic Migration of Hypertext

    The structure of hyperlinks for text information is simple: nodes represent semantic information, and these are anchors for links to other pages.

    Fig. 2.14: Nodes and anchors in hypertext [from David Lowe].

  • Slide 47: Hyperimages

    We need an automated method to help us produce true hypermedia:

    Fig. 2.15: Structure of hypermedia [from David Lowe].

  • Slide 48

    One can manually delineate syntactic image elements by masking image areas. Fig. 2.16 shows a hyperimage, with image areas identified and automatically linked to other parts of a document.

    Fig. 2.16: Hyperimage [from David Lowe].

  • Slide 49: 2.2 Some Useful Editing and Authoring Tools

    One needs real vehicles for showing understanding of the principles of, and for creating, multimedia. Straight programming in C++ or Java is not always the best way of showing your knowledge and creativity.

    Some popular authoring tools include the following:

    Adobe Premiere 6

    Macromedia Director 8 and MX

    Flash 5 and MX

    Dreamweaver MX

    Assignments for this section.

  • Slide 50

    2.2.1 Adobe Premiere

    2.2.2 Macromedia Director

    2.2.3 Macromedia Flash

    2.2.4 Dreamweaver

  • Slide 51

    At the convergence of technology and creative invention in multimedia is virtual reality: placing you inside a lifelike experience.

    Take a step forward, and the view gets closer; turn your head, and the view rotates.

    Reach out and grab an object; your hand moves in front of you. Maybe the object explodes in a 90-decibel crescendo as you wrap your fingers around it. Or it slips out from your grip, falls to the floor, and hurriedly escapes through a mouse hole at the bottom of the wall.

  • Slide 52

    In VR, your cyberspace is made up of many thousands of geometric objects plotted in three-dimensional space. The more objects, and the more points that describe the objects, the higher the resolution and the more realistic your view.

    As the user moves about, each motion or action requires the computer to recalculate the position, angle, size, and shape of all the objects that make up your view, and many thousands of computations must occur as fast as 30 times per second to seem smooth.

  • Slide 53: 2.3 VRML (Virtual Reality Modelling Language)

    Overview:

    (a) VRML was conceived at the first international conference of the World Wide Web as a platform-independent language that would be viewed on the Internet.

    (b) Objective of VRML: the capability to put coloured objects into a 3D environment.

    (c) VRML is an interpreted language; however, it has been very influential, since it was the first method available for displaying a 3D world on the World Wide Web.

  • Slide 54: History of VRML

    VRML 1.0 was created in May of 1995, with a revision for clarification called VRML 1.0C in January of 1996.

    VRML is based on a subset of the Open Inventor file format created by Silicon Graphics Inc.

    VRML 1.0 allowed for the creation of many simple 3D objects, such as a cube and sphere, as well as user-defined polygons. Materials and textures can be specified for objects to make the objects more realistic.

  • Slide 55

    The last major revision of VRML was VRML 2.0, standardized by ISO as VRML97.

    This revision added the ability to create an interactive world. VRML 2.0, also called Moving Worlds, allows for animation and sound in an interactive virtual world.

    New objects were added to make the creation of virtual worlds easier.

    Java and JavaScript have been included in VRML to allow for interactive objects and user-defined actions.

    VRML 2.0 was a large change from VRML 1.0, and the two are not compatible with each other. However, conversion utilities are available to convert VRML 1.0 to VRML 2.0 automatically.

  • Slide 56: VRML Shapes

    VRML contains basic geometric shapes that can be combined to create more complex objects. Fig. 2.28 displays some of these shapes.

    Fig. 2.28: Basic VRML shapes.

    The Shape node is a generic node for all objects in VRML.

    The Material node specifies the surface properties of an object. It can control what color the object is by specifying the red, green and blue values of the object.

  • Slide 57

    There are three kinds of texture nodes that can be used to map textures onto any object:

    1. ImageTexture: the most common one; can take an external JPEG or PNG image file and map it onto the shape.

    2. MovieTexture: allows the mapping of a movie onto an object; can only use MPEG movies.

    3. PixelTexture: simply means creating an image to use with ImageTexture within VRML.

  • Slide 58: VRML World

    Fig. 2.29 displays a simple VRML scene from one viewpoint: an openable-book VRML simple world!

    The position of a viewpoint can be specified with the position field, and it can be rotated from the default view with the orientation field.

    Also, the camera's angle for its field of view can be changed from its default 0.78 radians with the fieldOfView field.

    Changing the field of view can create a telephoto effect.

  • Slide 59: Fig. 2.29: A simple VRML scene

  • Slide 60

    Three types of lighting can be used in a VRML world:

    The DirectionalLight node shines a light across the whole world in a certain direction.

    A PointLight shines a light in all directions from a certain point in space.

    A SpotLight shines a light in a certain direction from a point.

    (RenderMan: a rendering package created by Pixar.)

    The background of the VRML world can also be specified using the Background node.

    A Panorama node can map a texture to the sides of the world. A panorama is mapped onto a large cube surrounding the VRML world.

  • Slide 61: Animation and Interactions

    The only method of animation in VRML is tweening, done by slowly changing an object that is specified in an interpolator node.

    This node will modify an object over time, based on the six types of interpolators: color, coordinate, normal, orientation, position, and scalar.

    (a) All interpolators have two fields that must be specified: the key and keyValue.

    (b) The key consists of a list of two or more numbers, starting with 0 and ending with 1, and defines how far along the animation is.

    (c) Each key element must be complemented with a keyValue element, which defines what values should change.

  • Slide 62

    To time an animation, a TimeSensor node should be used:

    (a) A TimeSensor has no physical form in the VRML world and just keeps time.

    (b) To notify an interpolator of a time change, a ROUTE is needed to connect two nodes together.

    (c) Most animation can be accomplished by routing a TimeSensor to an interpolator node, and then the interpolator node to the object to be animated.

    Two categories of sensors can be used in VRML to obtain input from a user:

    (a) Environment sensors: three kinds of environmental sensor nodes: VisibilitySensor, ProximitySensor, and Collision.

    (b) Pointing-device sensors: touch sensor and drag sensors.

  • Slide 63

  • Slide 64

    (f) Nodes can be named using DEF and used again later with the keyword USE. This allows for the creation of complex objects using many simple objects.

    A simple VRML example, to create a box in VRML; one can accomplish this by typing:

    Shape {
      geometry Box {}
    }

    The Box defaults to a 2-meter-long cube in the center of the screen. Putting it into a Transform node can move this box to a different part of the scene. We can also give the box a different color, such as red.

  • Slide 65

    Transform {
      translation 0 10 0
      children [
        Shape {
          geometry Box {}
          appearance Appearance {
            material Material {
              diffuseColor 1 0 0
            }
          }
        }
      ]
    }

  • Slide 66: Text

  • Slide 67: Introduction to Text

    Words and symbols in any form, spoken or written, are the most common system of communication, and deliver the most widely understood meaning.

    A typeface usually includes many type sizes and styles.

    A font is a collection of characters of a single size and style belonging to a particular typeface family.

    Typical font styles are boldface and italic.

    Other style attributes, such as underlining and outlining of characters, may be added at the user's choice.

  • Slide 68: Uses for Text in Multimedia

    Text is used in multimedia projects in many ways:

    Web pages

    Video

    Computer-based training

    Presentations

  • Slide 69: Uses for Text in Multimedia

    Text is also used in multimedia projects in these ways:

    Games rely on text for rules, chat, character descriptions, dialog, background story, and many more elements.

    Educational games rely on text for content, directions, feedback, and information.

    Kiosks use text to display information, directions, and descriptions.

  • Slide 70: Formatting Text

    Formatting text controls the way the text looks. You can choose:

    Fonts

    Text sizes and colors

    Text alignment

    Text spacing: line spacing or spacing between individual characters

    Advanced formatting: outlining, shadow, superscript, subscript, watermarks, embossing, engraving, or animation

    Text wraps

  • Slide 71: Typefaces

    Typefaces are characterized as serif or sans serif.

    Serif: Times, Times New Roman, Bookman; used for the body of text.

    Sans serif: Arial, Optima, Verdana; used for headings.

  • Slide 72: Guidelines for Using Fonts

    Avoid using many varying font styles in the same project.

    When possible, use fonts that come with both Windows and Mac OS.

    Use bitmap fonts in critical areas such as buttons, titles, or headlines.

  • Slide 73: More Tips for Using Fonts

    Use fancy or whimsical fonts sparingly, for special effects or emphasis.

    Keep paragraphs and line lengths short.

    Use bold, italic, and underlining options sparingly, for emphasis.

  • Slide 74: More Guidelines for Using Fonts

    Avoid using text in all uppercase letters.

    Use font, style options, size, and color consistently.

    Provide adequate contrast between text and background when choosing colors.

    Always check spelling and grammar.

  • Slide 75: Formatting for Screen Display

    Apply these guidelines to multimedia applications for display, rather than to printed documents:

    Test your presentation on monitors of several sizes.

    Avoid patterned backgrounds.

    Use small amounts of text on each screen display.

    Text for a presentation that will be viewed by a large group of people must be visible from the back of the room.

    For interactive displays, use consistent placement of hypertext links.

  • Slide 76: Character Sets and Alphabets

    ASCII character set:

    Uses 7-bit characters, giving 128 characters, including both lower- and uppercase letters, punctuation marks, Arabic numerals and math symbols.

    32 of these are control characters for device control messages, such as carriage return, line feed, tab and form feed.

    The extended ASCII character set uses 8 bits.

  • Slide 77: Character Sets and Alphabets

    Unicode character set:

    Uses a 16-bit architecture for multilingual text and character encoding.

    Unicode uses about 65,000 characters from all known languages and alphabets in the world.

    Where several languages share a set of symbols that have a historically related derivation, the shared symbols of each language are unified into a single set of symbols (called scripts).

  • Slide 78: Font Technologies

    Understanding font technologies can be important when creating multimedia projects. The most popular font technologies are:

    Scalable fonts: PostScript, TrueType, and OpenType.

    Bitmap fonts, which are not scalable but provide more control over the appearance of text.

  • Slide 79: Font Editing and Design Tools

    In some multimedia projects it may be required to create special characters. Using font editing tools it is possible to create special symbols and use them throughout the text.

    Software that can be used for editing and creating fonts:

    Fontographer

    FontMonger

    Cool 3D Text

  • Slide 80: Graphics and Image Data Representations

  • Slide 81: Why Use Images?

    To show information that is visual and can't easily be communicated except as an image, for instance maps, charts or diagrams.

    To clarify interpretation of information by applying color schemes or other visuals that help make meaning more obvious.

    To create an evident context for information by using images that your audience can associate with your tone or message.

  • Slide 82: Bitmap/Vector Images

    In a bitmap or raster image, visual data is mapped as spots of color, or pixels. The more pixels in a bitmap image, the finer the detail will be.

    Because photographs have high levels of detail and a variety of tones and colors, they are best represented as bitmap images. Scanners and digital cameras produce bitmap images.

    Vector or object-oriented graphics use mathematical formulas to describe outlines and fills for image objects.

    Vector graphics can be enlarged or reduced with no loss of data and no change in image quality; e.g. CorelDraw, Illustrator, FreeHand, AutoCAD & Flash create vector images.

  • Slide 83: What Is an Image?

    To digitize an image, the image is discretized both in terms of its spatial coordinates and its amplitude values.

    Discretization of the spatial coordinates (x, y) is called image sampling.

    Discretization of the amplitude values f(x, y) is called grey-level (intensity) quantization.

    A digital image is represented by a matrix of numeric values, each representing a quantized intensity value.

    When I is a two-dimensional matrix, then I(r, c) is the intensity value at the position corresponding to row r and column c of the matrix.

  • Slide 84: Image

    Each element of the array is called a pixel.

  • Slide 85: Pixel Neighbours

    A pixel p1 is a neighbour of another pixel p2 if their spatial coordinates (x1, y1) and (x2, y2) are not more than a unit distance apart. Types of neighbours often used in image processing:

    Horizontal neighbours

    Vertical neighbours

    Diagonal neighbours

    Arithmetic and logic operations:

    Addition: p1 + p2, used in image averaging.

    Subtraction: p1 - p2, used in image motion analysis and background removal.

    Multiplication: p1 * p2, used in colour and image shading operations.

    Division: p1 / p2, used in colour processing.

  • Slide 86: Color

    Reflection of light is simply the bouncing of light waves from an object back toward the light's source or other directions.

    Energy is often absorbed from the light (and converted into heat or other forms) when the light reflects off an object, so the reflected light might have slightly different properties.

    Light is the portion of electromagnetic radiation that is visible to the human eye.

    Visible light has a wavelength of about 400 to 780 nanometers.

    The adjacent frequencies of infrared on the lower end and ultraviolet on the higher end are still called light, even though they are not visible to the human eye.

  • Slide 87: Color

    Cameras store and reproduce light as images and video.

    The device consists of a box with a hole in one side. Light from an external scene passes through the hole and strikes a surface inside, where it is reproduced, upside-down, but with both color and perspective preserved.

    At first, the image was projected onto light-sensitive chemical plates; later it became chemical film, and now it is photosensitive electronics that can record images in a digital format.

  • Slide 88: Human Color Perception

    The retina contains two types of light-sensitive photoreceptors: rods and cones.

    The rods are responsible for monochrome perception, allowing the eyes to distinguish between black and white.

    The cones are responsible for color vision.

    In humans, there are three types of cones, maximally sensitive to long-wavelength, medium-wavelength, and short-wavelength light, or red, green and blue.

    The color perceived is the combined effect of stimuli to these three types of cone cells. Overall there are more rods than cones, so color perception is less accurate than black-and-white contrast perception.

  • Slide 89: Monochrome Images

    Each pixel is stored as a single bit (0 or 1), so such an image is also referred to as a binary image.

    It is also called a 1-bit monochrome image, since it contains no color.

    A 640 x 480 monochrome image requires 37.5 KB of storage.

  • Slide 90: 8-bit Gray-level Images

  • Slide 91: 8-bit Gray-level Images

    Each pixel has a gray value between 0 and 255.

    Each pixel is represented by a single byte; e.g., a dark pixel might have a value of 10, and a bright one might be 230.

    A 640 x 480 grayscale image requires over 300 KB of storage.

    8-bit Colour Images

    One byte for each pixel.

    Supports 256 out of the millions of colours possible; acceptable colour quality.

    Requires Colour Look-Up Tables (LUTs).

    A 640 x 480 8-bit colour image requires 307.2 KB of storage (the same as 8-bit greyscale).

  • Slide 92

  • Slide 93: 24-bit Color Images

  • Slide 94: 24-bit Color Images

    Each pixel is represented by three bytes (e.g., RGB).

    Supports 256 x 256 x 256 = 16,777,216 possible combined colours.

    A 640 x 480 24-bit colour image would require 921.6 KB of storage.

    Most 24-bit images are 32-bit images; the extra byte of data for each pixel is used to store an alpha value representing special-effect information.
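    The storage figures quoted on these slides follow directly from width x height x bits-per-pixel; a quick Python check (mine, not the slides'; note that 37.5 KB reads the byte count with 1 KB = 1024 bytes, while 307.2 KB and 921.6 KB read it in decimal kilobytes):

    def image_bytes(width, height, bits_per_pixel):
        # Raw (uncompressed) image size in bytes.
        return width * height * bits_per_pixel // 8

    print(image_bytes(640, 480, 1) / 1024)   # 37.5  (1-bit monochrome)
    print(image_bytes(640, 480, 8) / 1000)   # 307.2 (8-bit grey or colour)
    print(image_bytes(640, 480, 24) / 1000)  # 921.6 (24-bit colour)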

  • Slide 95: Assignment

  • Slide 96: Assignment

    How do CRT monitors create images?

    How do flat-panel displays create images?

    How do scanners digitize images?

    How do printers create images with colors?

    How can we create 3D images?

    Briefly describe the different image formats: GIF, JPEG, PNG, TIFF.

  • Slide 97

    Image resolution refers to the number of pixels in a digital image (higher resolution always yields better quality). Fairly high resolution for such an image might be 1,600 x 1,200, whereas lower resolution might be 640 x 480.

    Frame buffer: hardware used to store the bitmap. A video card (actually a graphics card) is used for this purpose. The resolution of the video card does not have to match the desired resolution of the image, but if not enough video card memory is available, then the data has to be shifted around in RAM for display.

    An 8-bit image can be thought of as a set of 1-bit bit-planes, where each plane consists of a 1-bit representation of the image at higher and higher levels of "elevation": a bit is turned on if the image pixel has a nonzero value that is at or above that bit level.

    Fig. 3.2 displays the concept of bit-planes graphically.

  • Slide 98: Fig. 3.2: Bit-planes for an 8-bit grayscale image

  • Slide 99: 3.2 Popular File Formats

    8-bit GIF: one of the most important formats because of its historical connection to the WWW and the HTML markup language, as the first image type recognized by net browsers.

    JPEG: currently the most important common file format.

  • Slide 100: GIF

    GIF standard: we examine the GIF standard because it is so simple, yet it contains many common elements.

    Limited to 8-bit (256) color images only, which, while producing acceptable color images, is best suited for images with few distinctive colors (e.g., graphics or drawings).

    The GIF standard supports interlacing: the successive display of pixels in widely spaced rows by a 4-pass display process.

    GIF actually comes in two flavors:

    1. GIF87a: the original specification.

    2. GIF89a: the later version; supports simple animation via a Graphics Control Extension block in the data, and provides simple control over delay time, a transparency index, etc.

  • Slide 101: GIF87

    For the standard specification, the general file format of a GIF87 file is as in Fig. 3.12.

    Fig. 3.12: GIF file format.

  • Slide 102

    The Screen Descriptor comprises a set of attributes that belong to every image in the file. According to the GIF87 standard, it is defined as in Fig. 3.13.

    Fig. 3.13: GIF screen descriptor.

  • Slide 103

    The Color Map is set up in a very simple fashion, as in Fig. 3.14. However, the actual length of the table equals 2^(pixel+1), as given in the Screen Descriptor.

    Fig. 3.14: GIF color map.

  • Slide 104

    Each image in the file has its own Image Descriptor, defined as in Fig. 3.15.

    Fig. 3.15: GIF image descriptor.

  • Slide 105

    If the interlace bit is set in the local Image Descriptor, then the rows of the image are displayed in a four-pass sequence (Fig. 3.16).

    Fig. 3.16: GIF 4-pass interlace display row order.

  • Slide 106

    We can investigate how the file header works in practice by having a look at a particular GIF image. Fig. 3.7 is an 8-bit color GIF image. In UNIX, issue the command:

    od -c forestfire.gif | head -2

    and we see the first 32 bytes interpreted as characters:

    G I F 8 7 a \208 \2 \188 \1 \247 \0 \0 \6 \3 \5
    J \132 \24 | ) \7 \198 \195 \ \128 U \27 \196 \166 & T

    To decipher the remainder of the file header (after GIF87a), we use hexadecimal:

    od -x forestfire.gif | head -2

    with the result:

    4749 4638 3761 d002 bc01 f700 0006 0305
    ae84 187c 2907 c6c3 5c80 551b c4a6 2654
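    The screen width and height sit immediately after the GIF87a signature as little-endian 16-bit integers; the following Python sketch (mine, not the slides') decodes the header bytes shown above:

    import struct

    # First 13 bytes of forestfire.gif, copied from the od output above.
    header = bytes([0x47, 0x49, 0x46, 0x38, 0x37, 0x61,   # "GIF87a"
                    0xd0, 0x02, 0xbc, 0x01,               # width, height
                    0xf7, 0x00, 0x00])                    # packed, bg, aspect

    signature = header[:6].decode('ascii')
    width, height = struct.unpack('<HH', header[6:10])
    # Low three bits of the packed byte give the 2^(pixel+1) table size.
    table_size = 2 ** ((header[10] & 0x07) + 1)

    print(signature, width, height, table_size)  # GIF87a 720 444 256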

  • Slide 107: JPEG

    JPEG: the most important current standard for image compression.

    The human vision system has some specific limitations, and JPEG takes advantage of these to achieve high rates of compression.

    JPEG allows the user to set a desired level of quality, or compression ratio (input divided by output).

    As an example, Fig. 3.17 shows our forestfire image with a quality factor Q=10%. This image is a mere 1.5% of the original size. In comparison, a JPEG image with Q=75% yields an image size 5.6% of the original, whereas a GIF version of this image compresses down to 23.0% of the uncompressed image size.

  • Slide 108: Fig. 3.17: JPEG image with low quality specified by user

  • Slide 109: PNG

    PNG format: standing for Portable Network Graphics; meant to supersede the GIF standard, and extends it in important ways.

    Special features of PNG files include:

    1. Support for up to 48 bits of color information, a large increase.

    2. Files may contain gamma-correction information for correct display of color images, as well as alpha-channel information for such uses as control of transparency.

    3. The display progressively displays pixels in a 2-dimensional fashion by showing a few pixels at a time over seven passes through each 8 x 8 block of an image.

  • Slide 110: TIFF

    TIFF stands for Tagged Image File Format.

    The support for attachment of additional information (referred to as tags) provides a great deal of flexibility.

    1. The most important tag is a format signifier: what type of compression etc. is in use in the stored image.

    2. TIFF can store many different types of image: 1-bit, grayscale, 8-bit color, 24-bit RGB, etc.

    3. TIFF was originally a lossless format, but now a new JPEG tag allows one to opt for JPEG compression.

    4. The TIFF format was developed by the Aldus Corporation in the 1980s and was later supported by Microsoft.

  • Slide 111: EXIF

    EXIF (Exchange Image File) is an image format for digital cameras:

    1. Compressed EXIF files use the baseline JPEG format.

    2. A variety of tags (many more than in TIFF) are available to facilitate higher-quality printing, since information about the camera and picture-taking conditions (flash, exposure, light source, white balance, type of scene, etc.) can be stored and used by printers for possible color-correction algorithms.

    3. The EXIF standard also includes a specification of a file format for audio that accompanies digital images. As well, it supports tags for information needed for conversion to FlashPix (initially developed by Kodak).

  • Slide 112: Audio

  • Slide 113: What is Sound?

    Sound is a wave phenomenon like light, but it is macroscopic and involves molecules of air being compressed and expanded under the action of some physical device.

    (a) For example, a speaker in an audio system vibrates back and forth and produces a longitudinal pressure wave that we perceive as sound.

    (b) Since sound is a pressure wave, it takes on continuous values, as opposed to digitized ones.

  • Slide 114

    (c) If we wish to use a digital version of sound waves, we must form digitized representations of audio information.

    The perception of sound in any organism is limited to a certain range of frequencies (20 Hz to 20,000 Hz for humans).

    Infrasound: elephants. Ultrasound: bats.

  • Slide 115: Digitization of Sound

    Digitization means conversion to a stream of numbers, and preferably these numbers should be integers for efficiency.
  • Slide 116: Fig. 6.1: An analog signal: continuous measurement of a pressure wave

  • Slide 117

    The graph in Fig. 6.1 has to be made digital in both time and amplitude. To digitize, the signal must be sampled in each dimension: in time, and in amplitude.

    (a) Sampling means measuring the quantity we are interested in, usually at evenly spaced intervals.

    (b) The first kind of sampling, using measurements only at evenly spaced time intervals, is simply called sampling. The rate at which it is performed is called the sampling frequency.

    (c) For audio, typical sampling rates are from 8 kHz (8,000 samples per second) to 48 kHz. This range is determined by the Nyquist theorem, discussed later.

    (d) Sampling in the amplitude or voltage dimension is called quantization.
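    To make (d) concrete, here is a minimal uniform-quantization sketch (the normalization to [-1, 1) and the rounding scheme are my assumptions, not the slides'):

    def quantize(sample, n_bits):
        # Map a sample in [-1, 1) to an N-bit signed integer code in
        # [-2^(N-1), 2^(N-1) - 1], the signed mapping used for SQNR later.
        levels = 2 ** (n_bits - 1)
        code = round(sample * levels)
        return max(-levels, min(levels - 1, code))

    print(quantize(0.5, 8))    # 64: half of full scale as an 8-bit code
    print(quantize(0.999, 8))  # 127: clipped at the largest positive code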

  • Slide 118: Fig. 6.2: Sampling and quantization. (a): Sampling the analog signal in the time dimension. (b): Quantization is sampling the analog signal in the amplitude dimension

  • Slide 119: Few Terminologies

    Regardless of what vibrating object is creating the sound wave, the particles of the medium through which the sound moves vibrate in a back-and-forth motion at a given frequency.

    The frequency of a wave refers to how often the particles of the medium vibrate when a wave passes through the medium. It is measured as the number of complete back-and-forth vibrations of a particle of the medium per unit of time. If a particle of air undergoes 1000 longitudinal vibrations in 2 seconds, then the frequency of the wave is 500 vibrations per second. A commonly used unit for frequency is the Hertz (abbreviated Hz), where

    1 Hertz = 1 vibration/second

  • Slide 120: Few Terminologies Contd.

  • Slide 121

    The sensation of a frequency is commonly referred to as the pitch. A high-pitch sound corresponds to a high-frequency sound wave, and a low-pitch sound to a low-frequency sound wave.

    Musically trained people are capable of detecting a difference in frequency between two separate sounds as little as 2 Hz, while most people only detect differences of about 7 Hz.

    Any two sounds whose frequencies make a 2:1 ratio are said to be separated by an octave.

  • Slide 122: Fourier Series

    The representation of a periodic function as an infinite sum of sinusoids.

    Harmonics: any series of musical tones whose frequencies are integral multiples of the frequency of a fundamental tone.
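    A small sketch of the idea behind Fig. 6.3: superposing a fundamental and its odd harmonics builds up a complex periodic signal (the 1/k weights are my choice; they make the sum approach a square wave):

    import math

    def superpose(t, fundamental_hz, num_harmonics):
        # Sum the fundamental plus odd harmonics, each weighted 1/k.
        return sum(math.sin(2 * math.pi * k * fundamental_hz * t) / k
                   for k in range(1, 2 * num_harmonics, 2))

    # With one term this is a pure sinusoid; more harmonics reshape it.
    print(round(superpose(0.25, 1.0, 1), 3))  # 1.0
    print(round(superpose(0.25, 1.0, 3), 3))  # 0.867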

  • Slide 123: Fig. 6.3: Building up a complex signal by superposing sinusoids

  • Slide 124: Digitization

  • Slide 125

    Thus, to decide how to digitize audio data, we need to answer the following questions:

    1. What is the sampling rate?

    2. How finely is the data to be quantized, and is the quantization uniform?

  • Slide 126: Nyquist Theorem

    The Nyquist theorem states how frequently we must sample in time to be able to recover the original sound.

    (a) Fig. 6.4(a) shows a single sinusoid: it is a single, pure frequency (only electronic instruments can create such sounds).

    (b) If the sampling rate just equals the actual frequency, Fig. 6.4(b) shows that a false signal is detected: it is simply a constant, with zero frequency.

    (c) If we sample at 1.5 times the actual frequency, Fig. 6.4(c) shows that we obtain an incorrect (alias) frequency that is lower than the correct one; it is half the correct one (the wavelength, from peak to peak, is double that of the actual signal).

    (d) Thus, for correct sampling we must use a sampling rate equal to at least twice the maximum frequency content in the signal. This rate is called the Nyquist rate.

  • Slide 127: Fig. 6.4: Aliasing

    (a): A single frequency.

    (b): Sampling at exactly the frequency produces a constant.

    (c): Sampling at 1.5 times per cycle produces an alias perceived frequency.

  • Slide 128

    Nyquist Theorem: if a signal is band-limited, i.e., there is a lower limit f1 and an upper limit f2 of frequency components in the signal, then the sampling rate should be at least 2(f2 - f1).

    Nyquist frequency: half of the Nyquist rate.

    Since it would be impossible to recover frequencies higher than the Nyquist frequency in any event, most systems have an antialiasing filter that restricts the frequency content of the input to the sampler to a range at or below the Nyquist frequency.

    The relationship among the sampling frequency, true frequency, and alias frequency is as follows:

    f_alias = f_sampling - f_true,  for f_true < f_sampling < 2 f_true    (6.1)
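    A numeric check of Eq. (6.1), and of the folding behaviour shown in Fig. 6.5 (the closed-form folding formula below is a standard identity, not stated on the slides):

    def apparent_frequency(f_true, f_sampling):
        # Lowest frequency whose samples match f_true exactly; for
        # f_true < f_sampling < 2*f_true this reduces to Eq. (6.1).
        return abs(f_true - f_sampling * round(f_true / f_sampling))

    print(apparent_frequency(6000, 8000))  # 2000: Eq. (6.1), 8000 - 6000
    print(apparent_frequency(3000, 8000))  # 3000: below the 4000 Hz fold
    print(apparent_frequency(2000, 1500))  # 500: the exercise on Slide 137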

  • Slide 129

    In general, the apparent frequency of a sinusoid is the lowest frequency of a sinusoid that has exactly the same samples as the input sinusoid. Fig. 6.5 shows the relationship of the apparent frequency to the input frequency.

    Fig. 6.5: Folding of a sinusoid frequency sampled at 8,000 Hz. The folding frequency, shown dashed, is 4,000 Hz.

  • Slides 130-135: figures showing a 1 Hz wave sampled at 2 Hz, 3 Hz, and 1.5 Hz, illustrating aliasing.


    Exercise


If the sampling rate is 4000 Hz, what is the highest sine-wave frequency that can be correctly recovered?

If the highest frequency is 4000 Hz, what is the minimum sampling rate?

What is the alias of a 2000 Hz wave frequency sampled at 1500 Hz?

    Signal to Noise Ratio (SNR)


The ratio of the power of the correct signal to that of the noise is called the signal-to-noise ratio (SNR), a measure of the quality of the signal.

The SNR is usually measured in decibels (dB), where 1 dB is a tenth of a bel. The SNR value, in units of dB, is defined in terms of base-10 logarithms of squared voltages, as follows:

    SNR = 10 log10 (V_signal^2 / V_noise^2) = 20 log10 (V_signal / V_noise)    (6.2)
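Eq. (6.2) is straightforward to compute. A small Python sketch (illustrative only; the 440 Hz tone and the noise level are assumptions for the demo):

    import numpy as np

    def snr_db(v_signal, v_noise):
        # Eq. (6.2): SNR in dB from (RMS) signal and noise voltages.
        return 20 * np.log10(v_signal / v_noise)

    print(snr_db(10.0, 1.0))   # 20.0 dB: signal voltage 10 times the noise

    # The same formula applied to sampled waveforms via their RMS values:
    rng = np.random.default_rng(0)
    signal = np.sin(2 * np.pi * 440 * np.arange(8000) / 8000)
    noise = 0.01 * rng.standard_normal(8000)
    print(snr_db(np.sqrt(np.mean(signal ** 2)), np.sqrt(np.mean(noise ** 2))))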


a) The power in a signal is proportional to the square of the voltage. For example, if the signal voltage V_signal is 10 times the noise voltage, then the SNR is 20 log10(10) = 20 dB.

b) In terms of power, if the power from ten violins is ten times that from one violin playing, then the ratio of power is 10 dB, or 1 B.

c) To remember: for power ratios the multiplier is 10; for signal voltage ratios it is 20.


The usual levels of sound we hear around us are described in terms of decibels, as a ratio to the quietest sound we are capable of hearing. Table 6.1 shows approximate levels for these sounds.

    Table 6.1: Magnitude levels of common sounds, in decibels

    Threshold of hearing 0

    Rustle of leaves 10

    Very quiet room 20

    Average room 40

    Conversation 60

    Busy street 70

    Loud radio 80

    Train through station 90

    Riveter 100

    Threshold of discomfort 120

    Threshold of pain 140

    Damage to ear drum 160

Signal to Quantization Noise Ratio (SQNR)


Aside from any noise that may have been present in the original analog signal, there is also an additional error that results from quantization.

(a) If voltages are actually in the range 0 to 1 but we have only 8 bits in which to store values, then effectively we force all continuous values of voltage into only 256 different values.

(b) This introduces a roundoff error. It is not really noise; nevertheless it is called quantization noise (or quantization error).


The quality of the quantization is characterized by the Signal to Quantization Noise Ratio (SQNR).

(a) Quantization noise: the difference between the actual value of the analog signal, for the particular sampling time, and the nearest quantization interval value.

(b) At most, this error can be as much as half of the interval.


(c) For a quantization accuracy of N bits per sample, the SQNR can be simply expressed:

    SQNR = 20 log10 (V_signal / V_quan_noise) = 20 log10 (2^(N-1) / (1/2))
         = 20 N log10 2 = 6.02 N (dB)    (6.3)

Notes:

(a) We map the maximum signal to 2^(N-1) - 1 (approximately 2^(N-1)) and the most negative signal to -2^(N-1).

(b) Eq. (6.3) is the Peak signal-to-quantization-noise ratio, PSQNR: peak signal and peak noise.


(c) The dynamic range is the ratio of maximum to minimum absolute values of the signal: Vmax/Vmin. The max abs. value Vmax gets mapped to 2^(N-1) - 1; the min abs. value Vmin gets mapped to 1. Vmin is the smallest positive voltage that is not masked by noise. The most negative signal, -Vmax, is mapped to -2^(N-1).

(d) The quantization interval is ΔV = (2 Vmax)/2^N, since there are 2^N intervals. The whole range Vmax down to (Vmax - ΔV/2) is mapped to 2^(N-1) - 1.

(e) The maximum noise, in terms of actual voltages, is half the quantization interval: ΔV/2 = Vmax/2^N.


6.02N is the worst case. If the input signal is sinusoidal, the quantization error is statistically independent, and its magnitude is uniformly distributed between 0 and half of the interval, then it can be shown that the expression for the SQNR becomes:

    SQNR = 6.02 N + 1.76 (dB)    (6.4)
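A quick numeric check of Eqs. (6.3) and (6.4) in Python (a sketch added for illustration, not part of the original slides):

    import numpy as np

    def sqnr_db(n_bits, sinusoidal=False):
        # Eq. (6.3): peak SQNR = 20 * N * log10(2) = 6.02 N dB.
        # Eq. (6.4): add 1.76 dB for a sinusoidal input with uniform error.
        base = 20 * n_bits * np.log10(2)
        return base + 1.76 if sinusoidal else base

    for n in (8, 16):
        print(n, round(sqnr_db(n), 2), round(sqnr_db(n, sinusoidal=True), 2))
    # 8  48.16 49.92
    # 16 96.33 98.09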

Linear and Non-linear Quantization

Linear format: samples are typically stored as uniformly quantized values.

Non-uniform quantization: set up more finely-spaced levels where humans hear with the most acuity.

Weber's Law, stated formally, says that equally perceived differences have values proportional to absolute levels:

    ΔResponse ∝ ΔStimulus / Stimulus    (6.5)

Inserting a constant of proportionality k, we have a differential equation that states:

    dr = k (1/s) ds    (6.6)

with response r and stimulus s.


Integrating, we arrive at a solution

    r = k ln s + C    (6.7)

with constant of integration C. Stated differently, the solution is

    r = k ln(s/s0)    (6.8)

where s0 is the lowest level of stimulus that causes a response (r = 0 when s = s0).

Nonlinear quantization works by first transforming an analog signal from the raw s space into the theoretical r space, and then uniformly quantizing the resulting values.

Such a law for audio is called μ-law encoding (or u-law). A very similar rule, called A-law, is used in telephony in Europe.

The equations for these very similar encodings are as follows:


μ-law:

    r = [sgn(s) / ln(1 + μ)] ln(1 + μ |s/sp|),   |s/sp| ≤ 1    (6.9)

A-law:

    r = [A / (1 + ln A)] (s/sp),                      |s/sp| ≤ 1/A
    r = [sgn(s) / (1 + ln A)] [1 + ln(A |s/sp|)],     1/A ≤ |s/sp| ≤ 1    (6.10)

    where sgn(s) = 1 if s > 0, and -1 otherwise,

and sp is the peak signal value. Fig. 6.6 shows these curves. The parameter μ is set to μ = 100 or μ = 255; the parameter A for the A-law encoder is usually set to A = 87.6.


Fig. 6.6: Nonlinear transform for audio signals.

The μ-law in audio is used to develop a nonuniform quantization rule for sound: uniform quantization of r gives finer resolution in s at the quiet end.
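As a rough illustration of Eq. (6.9), the following Python sketch compands and expands a signal with μ-law (not part of the original slides; the input is assumed already normalized to [-1, 1], i.e., s/sp):

    import numpy as np

    def mu_law_encode(s, mu=255.0):
        # Eq. (6.9), with s already normalized so that |s| <= 1.
        return np.sign(s) * np.log1p(mu * np.abs(s)) / np.log1p(mu)

    def mu_law_decode(r, mu=255.0):
        # Inverse transform: back from the r space to the raw s space.
        return np.sign(r) * ((1 + mu) ** np.abs(r) - 1) / mu

    s = np.linspace(-1, 1, 9)
    r = mu_law_encode(s)
    print(np.round(r, 3))                    # companded values: the quiet end is stretched
    print(np.allclose(mu_law_decode(r), s))  # True: the round trip recovers s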

    Audio Filtering


Prior to sampling and AD conversion, the audio signal is also usually filtered to remove unwanted frequencies. The frequencies kept depend on the application:

(a) For speech, typically from 50 Hz to 10 kHz is retained, and other frequencies are blocked by the use of a band-pass filter that screens out lower and higher frequencies.

(b) An audio music signal will typically contain from about 20 Hz up to 20 kHz.

(c) At the DA converter end, high frequencies may reappear in the output, because sampling and then quantization replace the smooth input signal with a series of step functions containing all possible frequencies.

(d) So at the decoder side, a low-pass filter is used after the DA circuit.

    Audio Quality vs. Data Rate


The uncompressed data rate increases as more bits are used for quantization. Stereo doubles the bandwidth needed to transmit a digital audio signal.

Table 6.2: Data rate and bandwidth in sample audio applications

    Quality     Sample Rate (kHz)  Bits per Sample  Mono/Stereo  Data Rate (uncompressed, kB/s)  Frequency Band (kHz)
    Telephone   8                  8                Mono         8                               0.200-3.4
    AM Radio    11.025             8                Mono         11.0                            0.1-5.5
    FM Radio    22.05              16               Stereo       88.2                            0.02-11
    CD          44.1               16               Stereo       176.4                           0.005-20
    DAT         48                 16               Stereo       192.0                           0.005-20
    DVD Audio   192 (max)          24 (max)         6 channels   1,200 (max)                     0-96 (max)
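The data-rate column follows directly from sample rate x bytes per sample x number of channels. A quick sketch (illustrative, not from the slides):

    def data_rate_kBps(sample_rate_khz, bits_per_sample, channels):
        # Uncompressed rate in kB/s: samples/s x bytes per sample x channels.
        return sample_rate_khz * 1000 * (bits_per_sample / 8) * channels / 1000

    print(data_rate_kBps(8, 8, 1))       # 8.0    -- Telephone row
    print(data_rate_kBps(22.05, 16, 2))  # 88.2   -- FM Radio row
    print(data_rate_kBps(44.1, 16, 2))   # 176.4  -- CD row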

    Synthetic Sounds


1. FM (Frequency Modulation): one approach to generating synthetic sound:

    x(t) = A(t) cos[ωc t + I(t) cos(ωm t + φm) + φc]    (6.11)
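A minimal Python sketch of Eq. (6.11), with a constant envelope A(t) = 1 and zero phase offsets (the sample rate, carrier, modulator, and modulation index are arbitrary demo values):

    import numpy as np

    sr = 8000                          # sample rate in Hz (assumed)
    t = np.arange(sr) / sr             # one second of sample times
    fc, fm, I = 440.0, 110.0, 2.0      # carrier, modulator, modulation index

    # Eq. (6.11) with A(t) = 1: a sinusoid argument inside a sinusoid.
    x = np.cos(2 * np.pi * fc * t + I * np.cos(2 * np.pi * fm * t))

    print(x[:5])                       # first few samples of the FM tone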


Fig. 6.7: Frequency Modulation. (a): A single frequency. (b): Twice the frequency. (c): Usually, FM is carried out using a sinusoid argument to a sinusoid. (d): A more complex form arises from a carrier frequency 2πt and a modulating frequency 4πt cosine inside the sinusoid.

    2. Wave Table synthesis:


A more accurate way of generating sounds from digital signals. Also known, simply, as sampling.

In this technique, the actual digital samples of sounds from real instruments are stored. Since wave tables are stored in memory on the sound card, they can be manipulated by software so that sounds can be combined, edited, and enhanced.

    Quantization and Transmission of Audio


Coding of Audio: quantization and transformation of data are collectively known as coding of the data.

a) For audio, the μ-law technique for companding audio signals is usually combined with an algorithm that exploits the temporal redundancy present in audio signals.

b) Differences in signals between the present and a past time can reduce the size of signal values and also concentrate the histogram of sample values (differences, now) into a much smaller range.


c) The result of reducing the variance of values is that lossless compression methods produce a bitstream with shorter bit lengths for more likely values.

In general, producing quantized sampled output for audio is called PCM (Pulse Code Modulation). The differences version is called DPCM (and a crude but efficient variant is called DM). The adaptive version is called ADPCM.

    Pulse Code Modulation


The basic techniques for creating digital signals from analog signals are sampling and quantization.

Quantization consists of selecting breakpoints in magnitude, and then re-mapping any value within an interval to one of the representative output levels.
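A minimal sketch of such a uniform quantizer in Python (illustrative only; it assumes input samples lie in [-v_max, v_max] and uses mid-interval reconstruction levels):

    import numpy as np

    def uniform_quantize(s, n_bits, v_max=1.0):
        # Map samples in [-v_max, v_max] into 2^N intervals, then output the
        # reconstruction level at the center of each interval.
        levels = 2 ** n_bits
        delta = 2 * v_max / levels                   # quantization interval
        idx = np.clip(np.floor(s / delta), -levels // 2, levels // 2 - 1)
        return (idx + 0.5) * delta

    s = np.array([-0.99, -0.2, 0.0, 0.31, 0.98])
    print(uniform_quantize(s, 3))   # 8 levels; error is at most delta/2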


    Fig. 6.2: Sampling and Quantization.



a) The set of interval boundaries are called decision boundaries, and the representative values are called reconstruction levels.

b) The boundaries for quantizer input intervals that will all be mapped into the same output level form a coder mapping.

c) The representative values that are the output values from a quantizer are a decoder mapping.

d) Finally, we may wish to compress the data, by assigning a bit stream that uses fewer bits for the most prevalent signal values (Chap. 7).


Every compression scheme has three stages:

A. The input data is transformed to a new representation that is easier or more efficient to compress.

B. We may introduce loss of information. Quantization is the main lossy step: we use a limited number of reconstruction levels, fewer than in the original signal.

C. Coding. Assign a codeword (thus forming a binary bitstream) to each output level or symbol. This could be a fixed-length code, or a variable-length code such as Huffman coding.


For audio signals, we first consider PCM for digitization. This leads to Lossless Predictive Coding as well as the DPCM scheme; both methods use differential coding. As well, we look at the adaptive version, ADPCM, which can provide better compression.

    PCM in Speech Compression


Assuming a bandwidth for speech from about 50 Hz to about 10 kHz, the Nyquist rate would dictate a sampling rate of 20 kHz.

(a) Using uniform quantization without companding, the minimum sample size we could get away with would likely be about 12 bits. Hence for mono speech transmission the bit-rate would be 240 kbps.

(b) With companding, we can reduce the sample size down to about 8 bits with the same perceived level of quality, and thus reduce the bit-rate to 160 kbps.

(c) However, the standard approach to telephony in fact assumes that the highest-frequency audio signal we want to reproduce is only about 4 kHz. Therefore the sampling rate is only 8 kHz, and the companded bit-rate thus reduces to 64 kbps.


However, there are two small wrinkles we must also address:

1. Since only sounds up to 4 kHz are to be considered, all other frequency content must be noise. Therefore, we should remove this high-frequency content from the analog input signal. This is done using a band-limiting filter that blocks out high, as well as very low, frequencies.

Also, once we arrive at a pulse signal, such as that in Fig. 6.13(a) below, we must still perform DA conversion and then construct a final output analog signal. But, effectively, the signal we arrive at is the staircase shown in Fig. 6.13(b).


Fig. 6.13: Pulse Code Modulation (PCM). (a) Original analog signal and its corresponding PCM signals. (b) Decoded staircase signal. (c) Reconstructed signal after low-pass filtering.


2. A discontinuous signal contains not just frequency components due to the original signal, but also a theoretically infinite set of higher-frequency components:

(a) This result is from the theory of Fourier analysis, in signal processing.

(b) These higher frequencies are extraneous.

(c) Therefore the output of the digital-to-analog converter goes to a low-pass filter that allows only frequencies up to the original maximum to be retained.


The complete scheme for encoding and decoding telephony signals is shown as a schematic in Fig. 6.14. As a result of the low-pass filtering, the output becomes smoothed; Fig. 6.13(c) above showed this effect.

Fig. 6.14: PCM signal encoding and decoding.

    Differential Coding of Audio


Audio is often stored not in simple PCM but instead in a form that exploits differences, which are generally smaller numbers, so they offer the possibility of using fewer bits to store.

(a) If a time-dependent signal has some consistency over time (temporal redundancy), the difference signal, subtracting the current sample from the previous one, will have a more peaked histogram, with a maximum around zero.
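A small Python sketch of this effect (an added illustration; the slowly varying sinusoid is an assumption for the demo):

    import numpy as np

    # A slowly varying signal: consecutive samples are highly correlated.
    t = np.arange(1000)
    signal = np.round(100 * np.sin(2 * np.pi * t / 200)).astype(int)

    # Difference signal d[n] = s[n] - s[n-1]; d[0] = 0 since s[0] is prepended.
    diff = np.diff(signal, prepend=signal[0])

    print(signal.min(), signal.max())  # -100 100: the full, wide range
    print(diff.min(), diff.max())      # roughly -4 .. 4: narrow, peaked at 0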


Fundamental Concepts in Video

Digital Video

One may be excused for thinking that the capture and playback of digital video is simply a matter of capturing each frame, or image, and playing them back in a sequence at 25 frames per second.

A single image or frame with a window size or screen resolution of 640 x 480 pixels and 24-bit colour (16.8 million colours) occupies approximately 1 MB of disc space.

Roughly 25 MB of disc space are needed for every second of video, and 1.5 GB for every minute.
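The arithmetic behind these (rounded) figures, as a quick Python sketch:

    width, height, bytes_per_pixel, fps = 640, 480, 3, 25

    frame_bytes = width * height * bytes_per_pixel
    print(frame_bytes / 2 ** 20)             # ~0.88 MB per frame
    print(frame_bytes * fps / 2 ** 20)       # ~22 MB per second of video
    print(frame_bytes * fps * 60 / 2 ** 30)  # ~1.3 GB per minute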

    The three basic problems of digital video

There are three basic problems with digital video: the size of the video window, the frame rate, and the quality of the image.

Size of video window

Digital video stores a lot of information about each pixel in each frame.

It takes time to display those pixels on your computer screen.

If the window size is small, then the time taken to draw the pixels is less. If the window size is large, there may not be enough time to display the image or single frame before it's time to start the next one.

Choosing an appropriate window size may not always produce a desirable result.

Frame Rates

Too many pixels and not enough time: depending on the size of the video window chosen, you may also be able to reduce file size by reducing the number of frames per second.

    5.1 Types of Video Signals


    Component video

Component video: Higher-end video systems make use of three separate video signals for the red, green, and blue image planes. Each color channel is sent as a separate video signal.

(a) Most computer systems use Component Video, with separate signals for the R, G, and B signals.

(b) For any color separation scheme, Component Video gives the best color reproduction since there is no crosstalk between the three channels.

(c) This is not the case for S-Video or Composite Video, discussed next. Component video, however, requires more bandwidth and good synchronization of the three components.

Composite Video (1 Signal)


Composite video: color (chrominance) and intensity (luminance) signals are mixed into a single carrier wave.

a) Chrominance is a composition of two color components (I and Q, or U and V).

b) In NTSC TV, e.g., I and Q are combined into a chroma signal, and a color subcarrier is then employed to put the chroma signal at the high-frequency end of the signal shared with the luminance signal.

c) The chrominance and luminance components can be separated at the receiver end, and then the two color components can be further recovered.

d) When connecting to TVs or VCRs, Composite Video uses only one wire, and video color signals are mixed, not sent separately. The audio and sync signals are additions to this one signal.

Since color and intensity are wrapped into the same signal, some interference between the luminance and chrominance signals is inevitable.

S-Video (2 Signals)


S-Video: as a compromise, S-Video (separated video, or Super-video, e.g., in S-VHS) uses two wires, one for luminance and another for a composite chrominance signal.

As a result, there is less crosstalk between the color information and the crucial gray-scale information.

The reason for placing luminance into its own part of the signal is that black-and-white information is most crucial for visual perception.

In fact, humans are able to differentiate spatial resolution in grayscale images with a much higher acuity than for the color part of color images.

As a result, we can send less accurate color information than must be sent for intensity information: we can only see fairly large blobs of color, so it makes sense to send less color detail.

    5.2 Analog Video


An analog signal f(t) samples a time-varying image. So-called progressive scanning traces through a complete picture (a frame) row-wise for each time interval.

In TV, and in some monitors and multimedia standards as well, another system, called interlaced scanning, is used:

a) The odd-numbered lines are traced first, and then the even-numbered lines are traced. This results in odd and even fields; two fields make up one frame.

b) In fact, the odd lines (starting from 1) end up at the middle of a line at the end of the odd field, and the even scan starts at a half-way point.


    Fig. 5.1: Interlaced raster scan

c) Figure 5.1 shows the scheme used. First the solid (odd) lines are traced, P to Q, then R to S, etc., ending at T; then the even field starts at U and ends at V.

d) The jump from Q to R, etc. in Figure 5.1 is called the horizontal retrace, during which the electronic beam in the CRT is blanked. The jump from T to U or V to P is called the vertical retrace.

Because of interlacing, the odd and even lines are displaced in time from each other. This is generally not noticeable except when very fast action is taking place on screen, when blurring may occur.

For example, in the video in Fig. 5.2, the moving helicopter is blurred more than is the still background.


Fig. 5.2: Interlaced scan produces two fields for each frame. (a) The video frame, (b) Field 1, (c) Field 2, (d) Difference of Fields.

Since it is sometimes necessary to change the frame rate, resize, or even produce stills from an interlaced source video, various schemes are used to de-interlace it.

a) The simplest de-interlacing method consists of discarding one field and duplicating the scan lines of the other field. The information in one field is lost completely using this simple technique.

b) Other, more complicated methods that retain information from both fields are also possible.

Analog video uses a small voltage offset from zero to indicate black, and another value, such as zero, to indicate the start of a line. For example, we could use a blacker-than-black zero signal to indicate the beginning of a line.


Fig. 5.3: Electronic signal for one NTSC scan line.

    Digital Video


The advantages of digital representation for video are many. For example:

(a) Video can be stored on digital devices or in memory, ready to be processed (noise removal, cut and paste, etc.), and integrated into various multimedia applications;

(b) Direct access is possible, which makes nonlinear video editing achievable as a simple, rather than a complex, task;

(c) Repeated recording does not degrade image quality;

(d) Ease of encryption and better tolerance to channel noise.

    Chroma Subsampling


Since humans see color with much less spatial resolution than they see black and white, it makes sense to decimate the chrominance signal.

Interesting (but not necessarily informative!) names have arisen to label the different schemes used.

To begin with, numbers are given stating how many pixel values, per four original pixels, are actually sent:

(a) The chroma subsampling scheme 4:4:4 indicates that no chroma subsampling is used: each pixel's Y, Cb and Cr values are transmitted, 4 for each of Y, Cb, Cr.

(b) The scheme 4:2:2 indicates horizontal subsampling of the Cb, Cr signals by a factor of 2. That is, of four pixels horizontally labelled as 0 to 3, all four Ys are sent, and every two Cbs and two Crs are sent, as (Cb0, Y0)(Cr0, Y1)(Cb2, Y2)(Cr2, Y3)(Cb4, Y4), and so on (or averaging is used).

(c) The scheme 4:1:1 subsamples horizontally by a factor of 4.

(d) The scheme 4:2:0 subsamples in both the horizontal and vertical dimensions by a factor of 2. Theoretically, an average chroma pixel is positioned between the rows and columns as shown in Fig. 5.6.

Scheme 4:2:0, along with other schemes, is commonly used in JPEG and MPEG (see later chapters in Part 2).


    Fig. 5.6: Chroma subsampling
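A minimal sketch of 4:2:0-style chroma averaging in Python (an added illustration; it assumes a chroma plane with even dimensions):

    import numpy as np

    def subsample_420(chroma):
        # Average each 2x2 block: one chroma sample replaces four,
        # halving both the horizontal and vertical dimensions.
        h, w = chroma.shape
        blocks = chroma.reshape(h // 2, 2, w // 2, 2)
        return blocks.mean(axis=(1, 3))

    cb = np.arange(16, dtype=float).reshape(4, 4)  # a toy 4x4 chroma plane
    print(subsample_420(cb))                       # the 2x2 averaged plane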


Lossless Compression Algorithms

    Introduction


Compression: the process of coding that will effectively reduce the total number of bits needed to represent certain information.

    Fig. 7.1: A General Data Compression Scheme.

Introduction (cont'd)


If the compression and decompression processes induce no information loss, then the compression scheme is lossless; otherwise, it is lossy.

Compression ratio:

    compression ratio = B0 / B1    (7.1)

where B0 is the number of bits before compression and B1 is the number of bits after compression.

Compression basically employs redundancy in the data:


Temporal -- in 1D data, 1D signals, audio, etc.

Spatial -- correlation between neighbouring pixels or data items.

Spectral -- correlation between colour or luminescence components. This uses the frequency domain to exploit relationships between frequency of change in data.

Psycho-visual -- exploit perceptual properties of the human visual system.

    Basics of Information Theory


The entropy of an information source with alphabet S = {s1, s2, . . . , sn} is:

    H(S) = Σ (i = 1..n) pi log2 (1/pi)    (7.2)

         = - Σ (i = 1..n) pi log2 pi      (7.3)

where pi is the probability that symbol si will occur in S.

The term log2 (1/pi) indicates the amount of information (the self-information as defined by Shannon) contained in si, which corresponds to the number of bits needed to encode si.

Distribution of Gray-Level Intensities


Fig. 7.2: Histograms for Two Gray-level Images.

Fig. 7.2(a) shows the histogram of an image with a uniform distribution of gray-level intensities, i.e., pi = 1/256 for all i. Hence, the entropy of this image is:

    log2 256 = 8    (7.4)

Fig. 7.2(b) shows the histogram of an image with two possible values (occurring with probabilities 1/3 and 2/3); its entropy is 0.92.
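Both figures are easy to verify with a few lines of Python (an added sketch of Eq. (7.3), not part of the original slides):

    import numpy as np

    def entropy(probs):
        # Eq. (7.3): H = -sum(p * log2 p); zero-probability symbols contribute 0.
        p = np.asarray(probs, dtype=float)
        p = p[p > 0]
        return -np.sum(p * np.log2(p))

    print(entropy([1 / 256] * 256))  # 8.0   -- the uniform image, Eq. (7.4)
    print(entropy([1 / 3, 2 / 3]))   # 0.918 -- the two-valued image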

    Entropy and Code Length

As can be seen in Eq. (7.3), the entropy is a weighted sum of the terms log2 (1/pi); hence it represents the average amount of information contained per symbol in the source S.

The entropy specifies the lower bound for the average number of bits needed to code each symbol in S, i.e.,

    H(S) ≤ l̄    (7.5)

where l̄ is the average length (measured in bits) of the codewords produced by the encoder.

Simple Repetition Suppression

For example, 894 followed by 32 zeros can be encoded compactly as 894f32, where f is a flag marking the start of a run and 32 is the repetition count.

Suppression of zeros in a file (Zero Length Suppression) is useful for:

Silence in audio data, pauses in conversation

Bitmaps

Blanks in text or program source files

Backgrounds in images

    Run-Length Coding


This encoding method is frequently applied to images (or pixels in a scan line).

For example:

111122233333311112222 can be encoded as: (1,4),(2,3),(3,6),(1,4),(2,4)
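A minimal run-length encoder in Python (an added sketch reproducing the example above):

    def run_length_encode(s):
        # Collapse consecutive repeats into (symbol, run_length) pairs.
        runs = []
        for ch in s:
            if runs and runs[-1][0] == ch:
                runs[-1][1] += 1
            else:
                runs.append([ch, 1])
        return [(sym, n) for sym, n in runs]

    print(run_length_encode("111122233333311112222"))
    # [('1', 4), ('2', 3), ('3', 6), ('1', 4), ('2', 4)]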

    Variable-Length Coding (VLC)

Representing symbols with variable-length bit strings: more frequent symbols get shorter codes.

Shannon-Fano Algorithm: a top-down approach

1. Sort the symbols according to the frequency count of their occurrences.

2. Recursively divide the symbols into two parts, each with approximately the same number of counts, until all parts contain only one symbol.
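A compact Python sketch of this recursion (an added illustration, not from the slides; ties in the split are broken toward the first balanced point):

    from collections import Counter

    def shannon_fano(counts):
        # counts: list of (symbol, count) pairs, sorted by decreasing count.
        if len(counts) == 1:
            return {counts[0][0]: ""}
        total = sum(c for _, c in counts)
        best_diff, split = None, 1
        # Choose the split point that best balances the two halves' counts.
        for i in range(1, len(counts)):
            left = sum(c for _, c in counts[:i])
            diff = abs(2 * left - total)
            if best_diff is None or diff < best_diff:
                best_diff, split = diff, i
        codes = {}
        for prefix, part in (("0", counts[:split]), ("1", counts[split:])):
            for sym, code in shannon_fano(part).items():
                codes[sym] = prefix + code
        return codes

    freq = sorted(Counter("HELLO").items(), key=lambda kv: -kv[1])
    print(shannon_fano(freq))
    # {'L': '0', 'H': '10', 'E': '110', 'O': '111'} -- matches Table 7.1 below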

    An Example: coding of HELLO

    Frequency count of the symbols in HELLO.

    Symbol H E L O

    Count 1 1 2 1


    Fig. 7.3: Coding Tree for HELLO by Shannon-Fano.

Table 7.1: Result of Performing Shannon-Fano on HELLO


Symbol   Count   log2(1/pi)   Code   # of bits used

L        2       1.32         0      2

H        1       2.32         10     2

E        1       2.32         110    3

O        1       2.32         111    3

TOTAL # of bits: 10


    Fig. 7.4 Another coding tree for HELLO by Shannon-Fano.

Table 7.2: Another Result of Performing Shannon-Fano on HELLO (see Fig. 7.4)


Symbol   Count   log2(1/pi)   Code   # of bits used

L        2       1.32         00     4

    H