Star Challenge – multimedia search competition 2008

Page 1: Star Challenge – multimedia search competition 2008

Star Challenge – multimedia search competition 2008

NUS.SIGIR group

Luong Minh Thang & Zhao Jin

WING group meeting – 12 Sep, 2008

Page 2: Star Challenge – multimedia search competition 2008

Agenda

• About StarChallenge
• Approaches
  – Audio system
  – Video system
• Results

Page 3: Star Challenge – multimedia search competition 2008

Let’s start with a clip on Tai Chi!

Page 4: Star Challenge – multimedia search competition 2008

The Star Challenge

• International Competition organized by Singapore A*STAR

• Focus on Multimedia Search by Voice and Video

• Prize:
  – Free Trip to Singapore (blah!)
  – USD 100,000 (!!!)

Page 5: Star Challenge – multimedia search competition 2008

The Tasks

• Voice Search
  – AT1: Search by IPA (International Phonetic Alphabet)
  – AT2: Search by Example
  – AT3: Search for recurrent voice segments

• Video Search
  – VT1: Search by (single) Query Image
  – VT2: Search by Video Shot
  – VT3: Scene/Event Categorization

AT3 and VT3 were replaced by integrated search in the end

Page 6: Star Challenge – multimedia search competition 2008

Timeline

• Mar 31: Registration Deadline
  – Registered as adMIRer
  – 5 members from NUS-SIGIR
  – 56 teams registered in total

• June 18: 1st Knockout Round
  – AT1+AT2
  – 8 Teams qualified

Page 7: Star Challenge – multimedia search competition 2008

Timeline

• July 18: 2nd Knockout Round
  – VT1+VT2
  – 7 Teams qualified

• September 4: Qualifying Race
  – All four tasks with Integrated Search
  – Only 5 Teams would qualify

• October 23: Grand Final
  – On-site evaluation

Page 8: Star Challenge – multimedia search competition 2008

Audio system – general approach

• Use MFCC features, which represent speech well
• Use local alignment to align the two sequences (audio and query)
• Using the spectrogram, cut long audio into small segments for better matching

Short demo
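The local-alignment step could look roughly like the sketch below. It is a Smith-Waterman-style alignment over two sequences of feature vectors (standing in for MFCC frames). The cosine similarity, the `threshold` and the `gap` penalty are illustrative assumptions; the slides do not say which scoring scheme was used.

```python
import math

def cosine(u, v):
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def local_align(query, audio, threshold=0.8, gap=1.0):
    """Smith-Waterman-style local alignment of two vector sequences.

    Frames score (cosine - threshold), so only similar frames add to the score.
    threshold and gap are assumed values, not taken from the slides.
    Returns (best_score, end_index_in_query, end_index_in_audio), 1-based ends.
    """
    rows, cols = len(query) + 1, len(audio) + 1
    H = [[0.0] * cols for _ in range(rows)]
    best = (0.0, 0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            match = cosine(query[i - 1], audio[j - 1]) - threshold
            H[i][j] = max(0.0,
                          H[i - 1][j - 1] + match,  # align the two frames
                          H[i - 1][j] - gap,        # skip a query frame
                          H[i][j - 1] - gap)        # skip an audio frame
            if H[i][j] > best[0]:
                best = (H[i][j], i, j)
    return best

# Toy 2-dimensional "frames"; real MFCC vectors would have more coefficients.
query = [[1, 0], [0, 1]]
audio = [[1, 1], [1, 0], [0, 1], [1, 1]]
score, end_q, end_a = local_align(query, audio)
```

In this toy case the query matches the second and third audio frames, so the best local alignment ends at query frame 2 and audio frame 3.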

Page 9: Star Challenge – multimedia search competition 2008

Audio system – system overview

[System diagram. Components shown: test audio files, query audio files, speech recognizer, audio feature extractor, test MFCC vectors, query MFCC vectors, test text, query text, Lucene indexing, index data, alignment & matching, Lucene matching, query-test similarity matrix, heuristic fusion, results.]

Page 10: Star Challenge – multimedia search competition 2008

Audio system – Handle IPA

• "i n t r ^ s t r ei t": IPA query
• Translate to CMU phonemes: IH N T R AH S T R EY T
• INTEREST: IH N T R AH S T
• RATE: R EY T
• Query text: fed directly to the text module, and synthesized to an audio file for the audio module

IPA:  i   n   t   r   ^   s   ei  a:  @   o:
CMU:  IH  N   T   R   AH  S   EY  AA  AE  AO

IPA:  au  ai  b   tS  d   TH  e   e:  ei  au
CMU:  AW  AY  B   CH  D   DH  EH  ER  EY  AW
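As a rough illustration of the IPA handling, the sketch below maps the symbols from the table above to CMU phonemes and then splits the phoneme string into dictionary words. The two-word `LEXICON` and the segmentation routine are hypothetical; the slides show only the result (INTEREST, RATE), not how the words were recovered.

```python
# Symbol mapping copied from the slide's table (only the symbols this query needs).
IPA_TO_CMU = {"i": "IH", "n": "N", "t": "T", "r": "R", "^": "AH", "s": "S", "ei": "EY"}

# Hypothetical two-entry pronunciation lexicon; a real system would load a full dictionary.
LEXICON = {
    ("IH", "N", "T", "R", "AH", "S", "T"): "INTEREST",
    ("R", "EY", "T"): "RATE",
}

def to_cmu(ipa_query):
    """Translate a space-separated IPA query into a list of CMU phonemes."""
    return [IPA_TO_CMU[symbol] for symbol in ipa_query.split()]

def segment(phonemes, lexicon):
    """Split phonemes into lexicon words by dynamic programming; None if impossible."""
    n = len(phonemes)
    best = [None] * (n + 1)   # best[k] = a word list covering phonemes[:k]
    best[0] = []
    for end in range(1, n + 1):
        for start in range(end):
            if best[start] is None:
                continue
            word = lexicon.get(tuple(phonemes[start:end]))
            if word is not None:
                best[end] = best[start] + [word]
                break
    return best[n]

phonemes = to_cmu("i n t r ^ s t r ei t")
words = segment(phonemes, LEXICON)
```

Here `phonemes` is the CMU sequence from the slide and `words` is the query text that would go to the text module.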

Page 11: Star Challenge – multimedia search competition 2008

Audio system – overall performance

• Complete statistics are not available yet, but AT2 (query by example) reaches roughly 30-40% MAP and AT1 roughly 10%

• Let’s listen to a few queries …

SQ017.wav SQ029.wav SQ023.wav SQ036.wav
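For readers unfamiliar with MAP (mean average precision), the figure quoted above can be computed as in the sketch below. This is the textbook definition; the competition's exact evaluation variant (for example, any rank cutoff) is not stated in the slides, and the file names are made up.

```python
def average_precision(ranked, relevant):
    """Average of precision@k taken at each rank k where a relevant item appears."""
    hits, total = 0, 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0

def mean_average_precision(runs):
    """runs: list of (ranked_results, relevant_set) pairs, one per query."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)

# Hypothetical ranked results for two queries.
runs = [
    (["a.wav", "b.wav", "c.wav", "d.wav"], {"a.wav", "c.wav"}),  # AP = (1/1 + 2/3) / 2
    (["x.wav", "y.wav"], {"y.wav"}),                             # AP = 1/2
]
map_score = mean_average_precision(runs)
```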

Page 12: Star Challenge – multimedia search competition 2008

Video system – VT1 categories

• 1. Crowd (>10 people)
• 2. Building with sky as backdrop, clearly visible
• 3. Mobile devices including handphone/PDA
• 4. Flag
• 5. Electronic chart, e.g. stock charts, airport departure chart
• 6. TV chart Overlay, including graphs, text, powerpoint style
• 7. Person using Computer, both visible
• 8. Track and field, sports
• 9. Company Trademark, including billboard, logo
• 10. Badminton court
• 11. Swimming pool, sports
• 12. Closeup of hand, e.g. using mouse, writing, etc
• 13. Business meeting (> 2 people), mostly seated down, table visible
• 14. Natural scene, e.g. mountain, trees, sea, no people
• 15. Food on dishes, plates
• 16. Face closeup, occupying about 3/4 of screen, frontal or side
• 17. Traffic Scene, many cars, trucks, road visible
• 18. Boat/Ship, over sea, lake
• 19. PC Webpages, screen of PC visible
• 20. Airplane

Page 13: Star Challenge – multimedia search competition 2008

Video system - examples

16. Face closeup

2. Building with sky backdrop

9. Company trademark

3. Mobile devices

Page 14: Star Challenge – multimedia search competition 2008

Video system – VT2 categories

• 1. People entering/exiting door/car
• 2. Talking face with introductory caption
• 3. Fingers typing on a keyboard
• 4. Inside a moving vehicle, looking outside
• 5. Large camera movement, tracking an object, person, car, etc
• 6. Static or minute camera movement, people walking, legs visible
• 7. Large camera movement, panning left/right, top/down of a scene
• 8. Movie ending credits
• 9. Woman monologue
• 10. Sports celebratory hug

Page 15: Star Challenge – multimedia search competition 2008

Video system – general approach

[Pipeline diagram: test files -> classifiers -> classified category -> category filtering (using the query category) -> filtered test files -> matching (against the query file) -> matched test files]

Page 16: Star Challenge – multimedia search competition 2008

Video system - Training data size

Development data statistics

Category  Size      Category  Size
1         15        11        21
2         47        12        130
3         33        13        11
4         6         14        21
5         5         15        3
6         18        16        42
7         30        17        6
8         3         18        10
9         23        19        43
10        3         20        3

• Dev = 10% of the labelled data, Train = 90% of the labelled data
• Size varies significantly across categories

Page 17: Star Challenge – multimedia search competition 2008

Video system – classifier training

[Training diagram. Train key frames + categories are passed through four extractors: color extractor (color histogram, HSV and RGB), edge extractor (edge histogram), face detector (number of faces, size, positions) and layout extractor (segmentation info). Multi-class SVM training yields a color, edge, face and layout classifier. Each classifier is run on the dev key frames to obtain its recall per category, and these recall values are used as weights.]

Page 18: Star Challenge – multimedia search competition 2008

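Classifier recall/categories

Columns: face, edge, layout, hsv, rgb, lab. Blank cells are not marked in this transcript, so each row lists its non-empty recall values in column order without showing which columns they belong to.

Category  Recall values
1         0.02
2         0.21  0.61
3         1     0.15
4         (none)
5         (none)
6         0.77  0.55  0.41  0.33
7         0.26  0.83
8         (none)
9         0.26  0.17  0.4   0.5
10        0.33  1
11        0.76  0.71  0.81  0.35
12        0.3   0.57
13        0.27  0.33
14        0.14  0.52  0.25  0.18
15        (none)
16        0.23  0.16  0.11  0.14
17        (none)
18        0.18
19        0.34  0.48  0.34  0.28
20        (none)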

• Recall values are used as weights when fusing the different classifiers

• No error analysis or n-fold testing yet
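One simple way to use per-category recall as fusion weights is a weighted vote, sketched below. The voting rule and the recall numbers in the example are illustrative assumptions; the slides only state that recall is used as the weight.

```python
def fuse(predictions, recall):
    """Weighted vote over classifier outputs.

    predictions: {classifier_name: predicted_category}
    recall:      {classifier_name: {category: recall_on_dev_data}}
    Each classifier votes for its predicted category with a weight equal to its
    dev-set recall on that category (0.0 if unknown). Returns the winning category.
    """
    votes = {}
    for clf, category in predictions.items():
        weight = recall.get(clf, {}).get(category, 0.0)
        votes[category] = votes.get(category, 0.0) + weight
    return max(votes, key=votes.get) if votes else None

# Made-up recall values, for illustration only.
recall = {
    "face": {16: 0.5},
    "edge": {2: 0.3},
    "hsv": {2: 0.3},
}
predictions = {"face": 16, "edge": 2, "hsv": 2}
winner = fuse(predictions, recall)
```

In this example two weaker classifiers that agree on category 2 outvote the single stronger face classifier.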

Page 19: Star Challenge – multimedia search competition 2008

Video system – Category filtering & Matching

[Pipeline diagram. The test video and its key frames are passed through the color, edge, face, layout and motion extractors, giving: color histogram (HSV, RGB), edge histogram, number of faces with size and positions, segmentation info, and a motion histogram with camera & object motion. The color, edge, face and layout classifiers feed a classifier merger (weights from dev data). Category filtering with the query category gives filtered key frames, heuristic category filtering gives the filtered video, and matching against the query video/frames produces the results.]

Page 20: Star Challenge – multimedia search competition 2008

Video system – motion 1

Camera: panning left Camera: panning up

Object motion: moving Object motion: static

Page 21: Star Challenge – multimedia search competition 2008

Video system – motion 2

• Check whether most motion vectors are ~ 0: if so, the motion is static
• Otherwise, filter out all small motion vectors
• Categorize the remaining motion vectors into circular bins: histogram + main motion vector
• If the main motion vector dominates: camera motion is panning left, right, up or down
• To detect zooming, find a focus block/point
• Object motion is derived after removing camera motion
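The first four steps above can be sketched as follows. All thresholds, the four-bin layout and the y-axis-up convention are assumptions chosen for illustration. The function reports the dominant direction of the block motion vectors; mapping that to a pan direction (which may be the opposite direction), zoom detection and object motion are left out.

```python
import math

DIRECTIONS = ["right", "up", "left", "down"]  # assumes the y axis points up

def classify_motion(vectors, static_eps=0.5, static_frac=0.8, dominance=0.5):
    """Classify block motion vectors as 'static', a dominant direction, or 'none'.

    vectors: list of (dx, dy) motion vectors for one frame pair.
    static_eps, static_frac and dominance are assumed thresholds.
    """
    if not vectors:
        return "static"
    small = sum(1 for dx, dy in vectors if math.hypot(dx, dy) < static_eps)
    if small / len(vectors) >= static_frac:
        return "static"                      # most vectors are ~0
    big = [(dx, dy) for dx, dy in vectors if math.hypot(dx, dy) >= static_eps]
    hist = [0] * 4                           # circular histogram, 90 degrees per bin
    for dx, dy in big:
        angle = math.atan2(dy, dx) % (2 * math.pi)
        # Shift by 45 degrees so that each bin is centred on an axis direction.
        hist[int((angle + math.pi / 4) / (math.pi / 2)) % 4] += 1
    main = max(range(4), key=hist.__getitem__)
    if hist[main] / len(big) >= dominance:
        return DIRECTIONS[main]              # the main vector dominates
    return "none"

mostly_left = [(-2, 0)] * 9 + [(0, 0)]
mostly_still = [(0, 0)] * 9 + [(3, 3)]
scattered = [(2, 0), (0, 2), (-2, 0), (0, -2)]
```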

Page 22: Star Challenge – multimedia search competition 2008

Conclusion

• We have built a fully functional system within a short time and in an ad-hoc manner

• There is plenty of room for performance improvement and detailed analysis.

Page 23: Star Challenge – multimedia search competition 2008

Q & A?

•Thank you !!!
