multimedia - study.riazulislam.comstudy.riazulislam.com/uploads/3/9/8/5/3985970/slides_14.pdf ·...

Post on 13-Mar-2018

Multimedia

Content-based Multimedia Retrieval

Course Code 005636 (Fall 2017)

Prof. S. M. Riazul Islam, Dept. of Computer Engineering, Sejong University, Korea

E-mail: riaz@sejong.ac.kr

Contents

Overview of Content-based multimedia retrieval

Concepts of Content-based image retrieval

Audio Retrieval

Document Image Analysis and Retrieval

System Architecture

Content-based Image Retrieval (CBIR)

Searching for digital images in large databases

What kinds of databases?

What kinds of queries?

What constitutes a match?

How do we make such searches efficient?

Deep blue sky

Orange sunset

CBIR Applications

Art Collections

e.g. Fine Arts Museum of San Francisco

Medical Image Databases

CT, MRI, Ultrasound, The Visible Human

Scientific Databases

Earth Sciences

General Image Collections for Licensing

The World Wide Web

What is a Query?

An image you already have

A rough sketch you draw

A symbolic description of what you want

CBIR System

Image Features / Distance Measures

Offline Processing: Image Database → Images → Image Feature → Feature Space

Query: User → Query Image → Image Feature → Distance Measure (against the Feature Space) → Retrieved Images → User
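The CBIR flow on this slide (features extracted offline for the database, then a distance measure between the query's features and the stored ones) can be sketched as follows. This is a minimal illustration, not the course's implementation: the coarse RGB histogram, the Euclidean distance, the function names, and the tiny synthetic "images" are all assumptions made for the example.

```python
from math import sqrt

def color_histogram(pixels, bins_per_channel=4):
    """Quantize each RGB pixel into a coarse bin; return a normalized histogram."""
    hist = [0.0] * (bins_per_channel ** 3)
    for r, g, b in pixels:
        # Map each 0..255 channel value to a bin index in 0..bins_per_channel-1.
        rb = r * bins_per_channel // 256
        gb = g * bins_per_channel // 256
        bb = b * bins_per_channel // 256
        hist[(rb * bins_per_channel + gb) * bins_per_channel + bb] += 1.0
    total = len(pixels) or 1
    return [count / total for count in hist]

def euclidean_distance(a, b):
    """One possible distance measure between two feature vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def build_index(database):
    """Offline step: compute and store one feature vector per database image."""
    return {name: color_histogram(pixels) for name, pixels in database.items()}

def retrieve(query_pixels, index, top_k=2):
    """Online step: rank database images by distance to the query's features."""
    query_features = color_histogram(query_pixels)
    ranked = sorted(index, key=lambda name: euclidean_distance(query_features, index[name]))
    return ranked[:top_k]

# Hypothetical toy data: each "image" is just a list of (R, G, B) pixels.
database = {
    "deep_blue_sky": [(10, 40, 200)] * 90 + [(250, 250, 250)] * 10,
    "orange_sunset": [(240, 120, 20)] * 80 + [(60, 20, 10)] * 20,
    "green_field": [(30, 180, 40)] * 100,
}
index = build_index(database)
query = [(20, 50, 210)] * 100  # a mostly blue query image
results = retrieve(query, index)
print(results)
```

In a real system the index would be built once and stored, so that only the query's features are computed at search time.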

Features

Color (histograms, gridded layout, wavelets)

Texture (Laws, Gabor filters, LBP, polarity): an entity consisting of mutually related pixels and groups of pixels

Shape (What preprocessing must occur to get shape?)

Objects and their Relationships: This is the most powerful, but you have to be able to recognize the objects!

Artificial texture Natural texture
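Of the texture features listed above, LBP (local binary patterns) is the simplest to show in code. The sketch below assumes the basic 8-neighbor variant on a grayscale image stored as a list of rows; the function names and the two toy images are hypothetical, and production code would normally use an image library instead.

```python
def lbp_code(image, row, col):
    """Basic 8-neighbor LBP: set one bit per neighbor that is >= the center pixel."""
    center = image[row][col]
    # Neighbors visited clockwise, starting at the top-left.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if image[row + dr][col + dc] >= center:
            code |= 1 << bit
    return code

def lbp_histogram(image):
    """Texture feature: normalized histogram of LBP codes over interior pixels."""
    hist = [0] * 256
    for row in range(1, len(image) - 1):
        for col in range(1, len(image[0]) - 1):
            hist[lbp_code(image, row, col)] += 1
    total = sum(hist) or 1
    return [count / total for count in hist]

# Two tiny grayscale patches: a uniform one and a vertically striped one.
flat = [[5] * 4 for _ in range(4)]
stripes = [[0, 9, 0, 9] for _ in range(4)]

flat_hist = lbp_histogram(flat)
stripe_hist = lbp_histogram(stripes)
print(flat_hist[255], stripe_hist[34], stripe_hist[255])
```

The two patches produce different histograms, which is what lets a distance measure tell the textures apart.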

Research Objective

Image Database

Query Image

Retrieved Images

Images

Object-oriented Feature Extraction

User

Categories

Animals

Buildings
• Office Buildings
• Houses

Transportation
• Boats
• Vehicles

boat

A Taxonomy of Audio

Sound
• Music
  • Classical
    • Orchestra
    • String Quartet
    • Choir
    • Piano
  • Country
  • Disco
  • Hip Hop
  • Jazz
  • Rock
• Speech
  • Sports Announcer
  • Female
  • Male
• Other?
  • ?

Speech Recognition

Speech Recognition Knowledge Sources

Acoustic Modeling: Describes the sounds that make up speech

Lexicon: Describes which sequences of speech sounds make up valid words

Language Model: Describes the likelihood of various sequences of words being spoken
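To make the language model's role concrete (the likelihood of various sequences of words being spoken), here is a minimal sketch of a bigram model with add-one smoothing. The slides do not say which kind of language model is meant, so the bigram form, the smoothing, the function names, and the three-sentence corpus are all assumptions for illustration.

```python
from collections import Counter
from math import log

def train_bigram_model(sentences):
    """Count word contexts and word pairs, with sentence boundary markers."""
    contexts, bigrams, vocabulary = Counter(), Counter(), set()
    for sentence in sentences:
        words = ["<s>"] + sentence.split() + ["</s>"]
        vocabulary.update(words)
        contexts.update(words[:-1])
        bigrams.update(zip(words[:-1], words[1:]))
    return contexts, bigrams, len(vocabulary)

def log_probability(sentence, model):
    """Log-likelihood of a word sequence under the add-one-smoothed bigram model."""
    contexts, bigrams, vocab_size = model
    words = ["<s>"] + sentence.split() + ["</s>"]
    total = 0.0
    for previous, word in zip(words[:-1], words[1:]):
        total += log((bigrams[(previous, word)] + 1) / (contexts[previous] + vocab_size))
    return total

# Hypothetical training text.
corpus = ["we recognize speech", "they recognize speech", "we recognize text"]
model = train_bigram_model(corpus)

likely = log_probability("we recognize speech", model)
unlikely = log_probability("speech recognize we", model)
print(likely > unlikely)
```

The same words in an order never seen in training receive a much lower score, which is the information the recognizer uses to prefer plausible word sequences.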

Speech Recognition Process

Speech → Signal Processing → Phonetic Probability Estimator (Acoustic Model) → Decoder (Language Model) → Words

Inputs to the Decoder: Pronunciation Lexicon, Grammar
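The decoder stage in this process combines what the acoustic model reports with what the language model expects. A heavily simplified sketch is shown below: instead of searching over all word sequences, it only rescores two fixed hypotheses. The hypotheses, the probability values, and the `lm_weight` parameter are all made up for the example.

```python
from math import log

# Hypothetical acoustic scores: log P(audio | word sequence).
acoustic_log_likelihood = {
    "recognize speech": log(0.40),
    "wreck a nice beach": log(0.45),
}

# Hypothetical language model scores: log P(word sequence).
language_log_probability = {
    "recognize speech": log(0.01),
    "wreck a nice beach": log(0.0001),
}

def decode(hypotheses, acoustic, language, lm_weight=1.0):
    """Pick the hypothesis with the best combined acoustic and language score."""
    def combined_score(hypothesis):
        return acoustic[hypothesis] + lm_weight * language[hypothesis]
    return max(hypotheses, key=combined_score)

hypotheses = list(acoustic_log_likelihood)
best = decode(hypotheses, acoustic_log_likelihood, language_log_probability)
acoustic_only = decode(hypotheses, acoustic_log_likelihood,
                       language_log_probability, lm_weight=0.0)
print(best, "|", acoustic_only)
```

With the language model switched off (`lm_weight=0.0`), the acoustically closer but implausible phrase wins; with it on, the plausible phrase is chosen.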

Document Image Analysis

Recognize text (OCR)
• convert page images to Unicode
• machine-printed, handwritten

Analyze page layout geometry
• a 2-D problem (unlike speech, text)
• good ‘language-free’ algorithms

Capture logical structure
• output marked-up text (XML, etc.)
• exploit non-textual clues

Video/Image OCR Block Diagram

Video or Image → Text Area Detection → Text Area Preprocessing → Commercial OCR → UTF-8 Text

Text Detection

System Architecture

• Combine video, audio and text retrieval scores

Query → Retrieval Agents (Text, Image, Audio) → Text Score, Image Score, Audio Score → Final Score
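The slide does not say how the text, image, and audio scores are combined into the final score. One common option is a weighted sum, sketched below; the weights, the item names, and the per-agent scores are hypothetical, and an item missing from an agent's results is simply treated as contributing zero.

```python
def fuse_scores(agent_scores, weights):
    """Combine per-modality scores into one final score per item (weighted sum)."""
    total_weight = sum(weights.values())
    final = {}
    for modality, scores in agent_scores.items():
        for item, score in scores.items():
            contribution = weights[modality] * score / total_weight
            final[item] = final.get(item, 0.0) + contribution
    # Highest final score first.
    return sorted(final.items(), key=lambda pair: pair[1], reverse=True)

# Hypothetical scores returned by the three retrieval agents for one query.
agent_scores = {
    "text": {"clip_a": 0.9, "clip_b": 0.2},
    "image": {"clip_a": 0.4, "clip_b": 0.7},
    "audio": {"clip_b": 0.5, "clip_c": 0.6},
}
weights = {"text": 0.5, "image": 0.3, "audio": 0.2}

ranking = fuse_scores(agent_scores, weights)
for item, score in ranking:
    print(item, round(score, 2))
```

A weighted sum assumes the agents' scores are on comparable scales; if they are not, each agent's scores would need to be normalized before fusion.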

Q&A
