Audio Based Indexing and Retrieval in MUVIS
Post on 06-Apr-2018
8/3/2019 Audio Based Indexing and Retrieval in MUVIS
http://slidepdf.com/reader/full/audio-based-indexing-and-retrieval-in-muvis 1/12
Audio-Based Multimedia
Indexing and Retrieval
Framework in MUVIS
System Overview & Applications
by Serkan KIRANYAZ.
Tampere Univ. of Tech.
MUVIS Overview
[Figure: MUVIS system block diagram. Recoverable labels are listed below.]
Applications:
- MBrowser: query, browsing, video summarization, display
- DbsEditor: database creation, database editing, FeX-AFeX management, appending and deleting image and MM files, image and MM file type conversions
- AVDatabase: capturing, encoding, recording, AV database creation
Inputs: real-time video-audio; stored MM (video-audio); still images
File types shown: *.jpg, *.gif, *.bmp, *.pct, *.pcx, *.png, *.pgm, *.wmf, *.eps, *.tga, *.jp2, *.yuv
Database types: AV database, image database, hybrid database
Query items: an image, a frame, a video-audio clip
Shared components: FeX modules, AFeX modules, FeX & AFeX API, encoding-decoding-rendering, indexing, retrieval
MUVIS Multimedia
MUVIS Video
  File formats: AVI, MP4
  Frame size: any
  Frame rate: 1..25 fps
  Codecs: H263+, MPEG-4, YUV 4:2:0, RGB 24

MUVIS Audio
  File formats: MP3, AAC, AVI, MP4
  Channels: mono, stereo
  Sampling frequencies: 8, 11.025, 12, 16, 22.050, 24, 32, 44.1 kHz
  Codecs: MP3, AAC, G721, G723, PCM

MUVIS Images
  JPEG, JPEG 2K, BMP, TIFF, PNG, PCX, GIF, PCT, TGA, EPS, WMF, PGM
Audio-Based Multimedia Indexing
and Retrieval Framework for MUVIS
A global framework implementation that aims at a robust and generic solution for audio-based multimedia indexing and retrieval, specifically:
- Generic support for audio codecs
- Generic support for file formats
- Generic support for audio capturing and encoding parameters
- Generic support for AFeX framework parameters
The main objective is content-based retrieval of audio (by speaker, subject, or "sounds like..." similarity) that agrees with human judgment and aural perception.
Audio Indexing Scheme in MUVIS
[Figure: audio indexing flow, from the audio stream to the indexed key-frame (KF) feature vectors]
1. Classification & segmentation per granule/frame (classes: Silence, Music, Speech, Not Classified).
2. Audio framing in valid classes, with classification conversion (classes: Uncertain, Speech, Music, Not Classified).
3. AFeX operation per frame (AFeX module).
4. KF extraction via MST clustering.
Audio indexing: KF feature vectors per class (Speech, Music, Not Classified).
2. Audio Framing with Classification Conversion
[Figure: a sequence of classifications per granule/frame (M MMM MM M M SS S SSSSS SX XXXX) is converted into a final classification per audio frame: Music, Speech, or Uncertain]
Legend:
M: Music
S: Speech
X: Silence
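The conversion from granule labels to one label per audio frame can be sketched as follows. The slide does not state the actual decision rule, so this minimal sketch assumes one: a frame keeps a class only when all of its granules agree, and any mixture becomes Uncertain. The function name `classify_frames` and the fixed `granules_per_frame` parameter are hypothetical.

```python
from typing import List

# Granule symbols as used in the slide legend.
LABELS = {"M": "Music", "S": "Speech", "X": "Silence"}


def classify_frames(granules: str, granules_per_frame: int) -> List[str]:
    """Collapse per-granule labels into one label per audio frame.

    Assumed rule (not taken from the slides): a frame whose granules all
    share one label gets that label; a mixed frame is "Uncertain".
    """
    frames = []
    for start in range(0, len(granules), granules_per_frame):
        chunk = granules[start:start + granules_per_frame]
        if len(set(chunk)) == 1:
            frames.append(LABELS[chunk[0]])
        else:
            frames.append("Uncertain")
    return frames


print(classify_frames("MMMMMMSSSSSSXXXX", 4))
```

With four granules per frame, the second frame straddles the music-to-speech boundary and is therefore marked Uncertain under the assumed rule.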
Audio Feature Extraction (AFeX) Framework
The AFeX framework allows independent AFeX module(s) to be integrated into the MUVIS framework for audio-based indexing and retrieval.
[Figure: DbsEditor and MBrowser use AFex_API.h to call the functions exported by each AFex_*.DLL module]
AFex_Bind()
AFex_Init()
AFex_Extract()
AFex_GetDistance()
AFex_Exit()
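The plug-in idea behind the five entry points can be illustrated with a small Python analogue. The slides show only the function names, not the real signatures in AFex_API.h, so every argument list, return value, and the toy `EnergyModule` below are hypothetical and serve only to show how a host application might drive a module through bind, init, extract, and exit.

```python
from typing import Dict, List, Sequence


class EnergyModule:
    """Toy feature extractor exposing five entry points named after the slide.

    The argument lists are assumptions; the real DLL interface is not shown.
    """

    def bind(self) -> Dict[str, object]:
        # Describe the module to the host (hypothetical fields).
        return {"name": "energy", "vector_size": 1}

    def init(self, sampling_rate: int) -> None:
        self.sampling_rate = sampling_rate

    def extract(self, frame: Sequence[float]) -> List[float]:
        # One feature per frame: the mean squared amplitude.
        return [sum(x * x for x in frame) / len(frame)]

    def get_distance(self, a: Sequence[float], b: Sequence[float]) -> float:
        # Euclidean distance between two feature vectors.
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    def exit(self) -> None:
        self.sampling_rate = None


def index_frames(module, frames, sampling_rate: int) -> List[List[float]]:
    """Host-side loop: bind, init, extract per frame, then exit."""
    info = module.bind()
    module.init(sampling_rate)
    try:
        vectors = [module.extract(frame) for frame in frames]
    finally:
        module.exit()
    if any(len(v) != info["vector_size"] for v in vectors):
        raise ValueError("module returned a vector of unexpected size")
    return vectors


vectors = index_frames(EnergyModule(), [[1.0, -1.0], [0.0, 2.0]], 16000)
print(vectors)
```

Because the host touches the module only through these calls, a different extractor can be swapped in without changing the indexing loop, which is the integration property the slide describes.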
Key-Framing via MST Clustering
[Figure: MST clustering of the numbered audio frames of the utterance "Speech Lab"; the clusters are labeled 'S', 'p', 'ee', 'ch', 'L', 'a', 'b']
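Key-frame selection by MST clustering can be sketched in a few lines. The slides do not give the clustering details, so this sketch makes its own assumptions: Euclidean distance between frame feature vectors, a fixed number of clusters obtained by cutting the longest MST edges, and the frame nearest each cluster centroid as the key frame. The function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def dist(a: Sequence[float], b: Sequence[float]) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mst_edges(points: List[Sequence[float]]) -> List[Tuple[float, int, int]]:
    """Prim's algorithm on the complete graph; returns (weight, u, v) edges."""
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((best[u], parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    return edges


def key_frames(points: List[Sequence[float]], n_clusters: int) -> List[int]:
    """Return one key-frame index per cluster (assumed selection rule)."""
    if not 1 <= n_clusters <= len(points):
        raise ValueError("n_clusters must be between 1 and len(points)")
    # Keep the shortest edges; dropping the n_clusters - 1 longest splits the tree.
    kept = sorted(mst_edges(points))[:len(points) - n_clusters]
    root = list(range(len(points)))

    def find(i: int) -> int:
        while root[i] != i:
            root[i] = root[root[i]]
            i = root[i]
        return i

    for _, u, v in kept:
        root[find(u)] = find(v)
    clusters = {}
    for i in range(len(points)):
        clusters.setdefault(find(i), []).append(i)
    keys = []
    for members in clusters.values():
        dim = len(points[0])
        centroid = [sum(points[i][d] for i in members) / len(members) for d in range(dim)]
        keys.append(min(members, key=lambda i: dist(points[i], centroid)))
    return sorted(keys)


print(key_frames([[0.0], [1.0], [2.0], [10.0], [11.0], [12.0]], 2))
```

In the usage line the six one-dimensional vectors form two tight groups, so cutting the single longest MST edge separates them and the middle frame of each group is chosen.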
A Sample AFeX Module Imp.: MFCC
The MFCC (Mel-Frequency Cepstrum Coefficients) AFeX module provides generic feature vectors that are independent of the following parameters:
- Sampling frequency
- Number of audio channels (mono/stereo)
- Audio volume level
\[
c_i = \left(\frac{2}{P}\right)^{1/2} \sum_{j=1}^{P} \log(m_j)\, \cos\left(\frac{\pi i}{N}\,(j - 0.5)\right)
\]
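The cepstral step, a scaled cosine transform of the log filter-bank outputs m_j, can be sketched as below. This is a minimal sketch under stated assumptions: the m_j are taken as already computed positive filter-bank energies, the denominator inside the cosine (shown as N on the slide) is assumed equal to the number of bands P, and coefficients are computed for i = 1 upward. The function name `cepstral_coefficients` is hypothetical.

```python
import math
from typing import List, Sequence


def cepstral_coefficients(energies: Sequence[float], n_coeffs: int) -> List[float]:
    """Compute c_1 .. c_n from P positive filter-bank energies m_1 .. m_P.

    Assumption: the cosine denominator equals P (the slide writes N).
    """
    P = len(energies)
    logs = [math.log(m) for m in energies]
    scale = math.sqrt(2.0 / P)
    return [
        scale * sum(
            logs[j - 1] * math.cos(math.pi * i / P * (j - 0.5))
            for j in range(1, P + 1)
        )
        for i in range(1, n_coeffs + 1)
    ]


quiet = [1.0, 2.0, 4.0, 8.0]
loud = [10.0 * m for m in quiet]  # same spectrum shape, uniform gain
print(cepstral_coefficients(quiet, 3))
print(cepstral_coefficients(loud, 3))
```

Under the N = P assumption, a uniform gain adds the same constant to every log(m_j), and the cosine terms sum to zero over j for 1 <= i < P, so the two printed vectors agree up to rounding. Whether the MUVIS module relies on this property for its volume independence is not stated on the slide.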
Audio Retrieval in MUVIS
[Figure: the sub-feature vectors FV(0) ... FV(i) of the query clip are compared with the sub-feature vectors FV(0) ... FV(i) of a database clip, using feature vectors per class type and matching class types]
For each frame, a search is done to find a matching frame which gives minimum distance.
Audio Retrieval in MUVIS (cont.)
To run an audio-based query in MUVIS, an audio clip is chosen in a multimedia database and queried through the database, provided the database includes at least one audio feature.
Let NoS be the number of feature sets in a database and let NoF(s) be the number of sub-features of feature set s. Sub-features are obtained by changing the AFeX module parameters or the audio frame size during the audio feature extraction process.
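\[
D_i(s,f) =
\begin{cases}
\min_{j \in C_i} \left[ SD\left( QFV_i^{C_i}(s,f),\ DFV_j^{C_i}(s,f) \right) \right] & \text{if } j \in C_i \neq \emptyset \\
0 & \text{if } j \in C_i = \emptyset
\end{cases}
\]
\[
D(s,f) = \sum_{i \in C_i} D_i(s,f)
\]
\[
QD_c = \sum_{s}^{NoS} \sum_{f}^{NoF(s)} W(s,f) \times D(s,f)
\]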
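The query distance can be sketched in code as a weighted sum, over feature sets s and sub-features f, of per-frame minimum distances between same-class frames. The slides do not define the symbols, so this sketch reads QFV and DFV as query and database feature vectors, SD as a similarity distance, and W as a weight. Euclidean distance for SD, the `(s, f)` dictionary keys, and all function names are assumptions made for illustration.

```python
import math
from typing import Dict, List, Sequence, Tuple

Frame = Tuple[str, Sequence[float]]  # (class label, feature vector)
Key = Tuple[int, int]                # (feature set s, sub-feature f)


def sd(a: Sequence[float], b: Sequence[float]) -> float:
    """Similarity distance between two vectors (Euclidean, an assumption)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def sub_feature_distance(query: List[Frame], clip: List[Frame]) -> float:
    """D(s, f): sum over query frames of the minimum same-class distance."""
    total = 0.0
    for q_class, q_vec in query:
        candidates = [vec for c, vec in clip if c == q_class]
        if candidates:  # a class missing from the clip contributes 0
            total += min(sd(q_vec, vec) for vec in candidates)
    return total


def query_distance(
    query: Dict[Key, List[Frame]],
    clip: Dict[Key, List[Frame]],
    weights: Dict[Key, float],
) -> float:
    """QD: weighted sum of D(s, f) over all (s, f) pairs in `weights`."""
    return sum(
        w * sub_feature_distance(query[key], clip[key])
        for key, w in weights.items()
    )


query = {(0, 0): [("speech", [0.0, 0.0]), ("music", [1.0, 1.0])]}
clip = {(0, 0): [("speech", [3.0, 4.0]), ("speech", [0.0, 1.0])]}
print(query_distance(query, clip, {(0, 0): 2.0}))
```

In the usage lines the speech frame of the query finds its nearest speech frame at distance 1, the music frame has no counterpart in the database clip and contributes 0, and the weight of 2 gives a total of 2.0.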
Conclusions & Remarks
- Audio is important. Sometimes it bears more semantic and content information than video.
- Hence, the preliminary results show that audio-based retrieval is effective compared to visual retrieval (similar or better results).
- The classification and segmentation algorithm has recently been improved. A new approach based on fuzzy regions and semantic-rule-based classification with intra-segment boundary detection has been developed.