predictive video retrieval · 2008-12-17 · come see our interactive search demo 0 0.05 0.10 0.15...
TRANSCRIPT
Predictive Video RetrievalA Matter of Trust
Bouke Huurnink
MediaMill
The TeamBouke Huurnink Jiyin He Koen van de SandeOrk de RooijCees SnoekMaarten de RijkeJan van GemertJasper UijlingsXirong LiIvo EvertsVladimir Nedovic
Michiel van LiemptRichard van Balen
FeiYanMuhammad Tahir
Krystian MikolajczykJosef Kittler
Jan-Mark GeusebroekTheo Gevers
Marcel WorringArnold Smeulders
Dennis Koelma
Come see our interactive search demo
0
0.05
0.10
0.15
0.20
0
0.05
0.10
0.15
0.20
Now with (inter)active learning!Presented by Ork de Rooij
UvA
Why predictive video retrieval?
• Video retrieval is multichannel problem:• Speech• Detectors• Examples
• Observations• Speech works for named entity topics• Detectors work when closely related to topic• Examples can also work pretty well
• We want to exploit this knowledge
Idea
• Predict which type of search - retrieval channel - we can trust for a topic
• Rerank results from this channel with secondary result information
Outline
• System description
• Result overview
• Analysis
• Conclusion
Our predictive system
Retrieval Channels
Speech Search
Detector Search
Example Search
Predict Trusted Channel
Reranking
Final Results
Information NeedFind shots of pieces
of paper with writing, typing, or printing,
filling more than half of the frame area.
Result Lists
Trusted Results
- Detector results
Secondary Results
- Speech results
Secondary Results
- Example results
Our predictive system
Retrieval Channels
Speech Search
Detector Search
Example Search
Predict Trusted Channel
Reranking
Final Results
Information NeedFind shots of pieces
of paper with writing, typing, or printing,
filling more than half of the frame area.
Result Lists
Trusted Results
- Detector results
Secondary Results
- Speech results
Secondary Results
- Example results
Our predictive system
Retrieval Channels
Speech Search
Detector Search
Example Search
Predict Trusted Channel
Reranking
Final Results
Information NeedFind shots of pieces
of paper with writing, typing, or printing,
filling more than half of the frame area.
Result Lists
Trusted Results
- Detector results
Secondary Results
- Speech results
Secondary Results
- Example results
Distribute ASR and MT over shot neighbourhood, then retrieval using language modelling approach
Pseudo active-learning, with positive examples from topic and 100 random negative examples from collection
Content based selection from 57 learned concepts, followed by unweighted score-based fusion
Our predictive system
Retrieval Channels
Speech Search
Detector Search
Example Search
Predict Trusted Channel
Reranking
Final Results
Information NeedFind shots of pieces
of paper with writing, typing, or printing,
filling more than half of the frame area.
Result Lists
Trusted Results
- Detector results
Secondary Results
- Speech results
Secondary Results
- Example results
Our predictive system
Retrieval Channels
Speech Search
Detector Search
Example Search
Predict Trusted Channel
Reranking
Final Results
Information NeedFind shots of pieces
of paper with writing, typing, or printing,
filling more than half of the frame area.
Result Lists
Trusted Results
- Detector results
Secondary Results
- Speech results
Secondary Results
- Example results
Named entity? Trust speech resultsDetector match? Trust detector resultsElse...trust example results
Our predictive system
Retrieval Channels
Speech Search
Detector Search
Example Search
Predict Trusted Channel
Reranking
Final Results
Information NeedFind shots of pieces
of paper with writing, typing, or printing,
filling more than half of the frame area.
Result Lists
Trusted Results
- Detector results
Secondary Results
- Speech results
Secondary Results
- Example results
Our predictive system
Retrieval Channels
Speech Search
Detector Search
Example Search
Predict Trusted Channel
Reranking
Final Results
Information NeedFind shots of pieces
of paper with writing, typing, or printing,
filling more than half of the frame area.
Result Lists
Trusted Results
- Detector results
Secondary Results
- Speech results
Secondary Results
- Example resultsTruncate result lists to top 1000
Eliminate all results not in trusted listCombine results with (weighted) Borda fusion
Query-class vs Prediction
Query-class Prediction
Query class determines retrieval strategy
Query features determine retrieval strategy
Focus on assigning query-class dependent weights
Focus on identifying trusted retrieval channel
Runs
• Speech channel only UvA-MM-6
• Detector channel only UvA-MM-5
• Example channel only supplementary
• Predictive reranking UvA-MM-4
• Predictive weighted reranking UvA-MM-3
0
0.01
0.02
0.03
0.04
0.05
0.06
0.07
Overall Automatic Search Performance
Predictive reranking
Detector channel Example channel Speech channel
Predictive weighted reranking
mea
n in
ferr
ed a
vera
ge p
reci
sion
All runs
0
0.01
0.02
0.03
0.04
0.05
0.06
0.07
Overall Automatic Search Performance
Predictive reranking
Detector channel Example channel Speech channel
Predictive weighted reranking
Predictive reranking outperforms individual channels
mea
n in
ferr
ed a
vera
ge p
reci
sion
All runs
0
0.01
0.02
0.03
0.04
0.05
0.06
0.07
Overall Automatic Search Performance
Predictive reranking
Detector channel Example channel Speech channel
Predictive weighted reranking
Predictive reranking outperforms individual channels
Weighting did not have big influence
mea
n in
ferr
ed a
vera
ge p
reci
sion
All runs
General findings
• 20 topics > 0.05 inferred average precision
• 1 speech topic
• 11 detector topics
• 8 example topics
• Accurately predicted 15 of 20 topics
A closer look
person opening doora bridge
people with trees and plantsface filling over half the frame
paper with writingpeople with a body of water
a mapvehicle moving away
people looking in microscopeperson watching television
people in a kitchena crowd of people outdoors
a classroom scenean airplane exterior
a plant that is the main objecta street scene at night
people at table with computerpeople in white lab coats
ships or boats in the waterman talking to camera indoors
inferred average precision0.1 0.2 0.3 0.4 0.50
Predictive w. rerankingDetector channelExample channelSpeech channel
A closer look
person opening doora bridge
people with trees and plantsface filling over half the frame
paper with writingpeople with a body of water
a mapvehicle moving away
people looking in microscopeperson watching television
people in a kitchena crowd of people outdoors
a classroom scenean airplane exterior
a plant that is the main objecta street scene at night
people at table with computerpeople in white lab coats
ships or boats in the waterman talking to camera indoors
A lot of variance between channels
inferred average precision0.1 0.2 0.3 0.4 0.50
Predictive w. rerankingDetector channelExample channelSpeech channel
person opening doora bridge
people with trees and plants
paper with writing
a map
people looking in microscope
people in a kitchena crowd of people outdoors
a classroom scenean airplane exterior
a plant that is the main objecta street scene at night
people at table with computerpeople in white lab coats
man talking to camera indoors
When prediction worked
Only trusted channel and reranked performance
shown
Predictive w. rerankingDetector channelExample channelSpeech channel
inferred average precision0.1 0.2 0.3 0.4 0.50
person opening doora bridge
people with trees and plants
paper with writing
a map
people looking in microscope
people in a kitchena crowd of people outdoors
a classroom scenean airplane exterior
a plant that is the main objecta street scene at night
people at table with computerpeople in white lab coats
man talking to camera indoors
When prediction worked
Only trusted channel and reranked performance
shown
Predictive reranking often close to or better than trusted channel
Predictive w. rerankingDetector channelExample channelSpeech channel
inferred average precision0.1 0.2 0.3 0.4 0.50
When prediction didn’t work
face filling over half the frame
people with a body of water
vehicle moving away
person watching television
ships or boats in the waterman talking to camera indoors
Predictive w. rerankingDetector channelExample channelSpeech channel
inferred average precision0.1 0.2 0.3 0.4 0.50
Only trusted channel and reranked performance
shown
When prediction didn’t work
face filling over half the frame
people with a body of water
vehicle moving away
person watching television
ships or boats in the waterman talking to camera indoors
Predictive w. rerankingDetector channelExample channelSpeech channel
Predictive reranking boosts trusted channel results
inferred average precision0.1 0.2 0.3 0.4 0.50
Only trusted channel and reranked performance
shown
Conclusions
Conclusions
Predictive retrieval works, even with simple reranking
Conclusions
Predictive retrieval works, even with simple reranking
Incorrect predictions have limited impact
Conclusions
Predictive retrieval works, even with simple reranking
Incorrect predictions have limited impact
Good ingredients are crucial:garbage in garbage out!
Beeld en Geluid Searches
20 20 uur 20 uur journaal aartsen afghanistan ajax algemene beschouwingen amsterdam
andere tijden avondjournaal balkenende beatrix buitenhof bush close up
de wereld draait door debat eenvandaag evn feyenoord gemeenteraadsverkiezingen
goedemorgen nederland hirsi ali holland sport holleeder internationale nieuwsuitwisseling irak
iran jeugdjournaal journaal journaal 20 kassa klokhuis
koefnoen koninginnedag kooten kopspijkers kro kruispunt langs de lijn libanon lijst 0 lingo man bijt hond
max catherine maxima mens milosevic miniatuur moszkowicz nederland kiest netwerk nieuwslicht
nioscoop nos nos journaal nova nps arena opsporing verzocht paul de leeuw pauw pauw en
witteman pauw witteman pechtold politie polygoon radar rembrandt rouvoet rutte saddam schepper
co schepper en co schipholbrand sesamstraat sonja spiritus sporen uit het oosten sport sportjournaal
studio sport tegenlicht televisie tros tv show twee vandaag uruzgan vandaag
verdonk verkiezingen voetbal vragenuur vragenuurtje vroege vogels wereld draait door wilders wouter bos
zembla zoekt en gij zult vinden zomergasten
Pondering
• What if we had more variety in query types?
General object queries
Named entity queries