cuni at mediaeval 2014 search and hyperlinking task: search task experiments

CUNI at MediaEval 2014 Search and Hyperlinking Task

Petra Galuščáková, Martin Kruliš, Jakub Lokoč and Pavel [email protected]

Faculty of Mathematics and Physics, Charles University in Prague

MediaEval, 17. 10. 2014

System Description

● The recordings are segmented into overlapping passages

● Segments are indexed using the Terrier IR Platform
  ● Language model
  ● Predefined settings, stopword removal, Porter stemmer

System Description Cont.

● Concatenation of the segment and corresponding metadata
  ● Title, episode title, description, short episode synopsis, service name and program variant
● Results filtering

Segmentation

● Fixed-length segmentation
  ● 60-second-long segments with a 10-second shift

● Decision Trees (DT)
  ● For each word in the transcript/subtitles, the model decides whether the segment ends after this word or not
  ● Model trained on manually marked segments from the Similar Segments in Social Speech Task at MediaEval 2013
  ● Features: cue word n-grams and cue tag n-grams, silence between words, division given in transcripts, letter case, and the output of the TextTiling algorithm
  ● 50- and 120-second-long segments
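The fixed-length segmentation above (60-second windows shifted by 10 seconds) can be sketched as follows. This is only an illustrative sketch: the function name, the parameter names, and the way the final window is clipped to the end of the recording are assumptions and are not taken from the described system.

```python
from typing import List, Tuple


def fixed_length_segments(
    duration: float, length: float = 60.0, shift: float = 10.0
) -> List[Tuple[float, float]]:
    """Return overlapping (start, end) windows, in seconds, covering a recording."""
    if length <= 0 or shift <= 0:
        raise ValueError("length and shift must be positive")
    segments: List[Tuple[float, float]] = []
    start = 0.0
    while start < duration:
        # Assumption: the last window is clipped to the end of the recording.
        end = min(start + length, duration)
        segments.append((start, end))
        if end >= duration:
            break
        start += shift
    return segments


segments = fixed_length_segments(100.0)
```

Because the shift is shorter than the window, consecutive segments share most of their content, which is what makes the overlap handling in the result lists relevant.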

Search Results

● Highest results achieved on the subtitles
● Generally: LIMSI > NST-Sheffield > LIUM
● Metadata improve the results
● Fixed-length segmentation outperforms the DT-based segmentation with 120-second-long segments
● 50-second-long DT segments outperform the fixed-length segments

Search Results - Overlapping

● All measures, except the MAP-tol measure, are notably higher in the experiments in which we did not remove partially overlapping segments

● These measures do not distinguish whether a user has already seen the retrieved segment
● Overlapping segments are expected to be less beneficial for the users
● The MAP-tol measure is not influenced by this behavior
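Removing partially overlapping segments from a ranked result list could look like the sketch below. The slides do not say which of two overlapping segments is kept, so the policy here (keep the higher-ranked one and drop any later segment that overlaps it in the same recording) is an assumption, as are all names.

```python
from typing import List, Tuple

# (recording id, start in seconds, end in seconds); hypothetical representation
Segment = Tuple[str, float, float]


def overlaps(a: Segment, b: Segment) -> bool:
    """True if both segments come from the same recording and share any time span."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def remove_overlapping(ranked: List[Segment]) -> List[Segment]:
    """Walk the list in rank order and keep a segment only if it overlaps nothing kept so far."""
    kept: List[Segment] = []
    for seg in ranked:
        if not any(overlaps(seg, k) for k in kept):
            kept.append(seg)
    return kept


ranked = [("v1", 0.0, 60.0), ("v1", 10.0, 70.0), ("v2", 0.0, 60.0), ("v1", 60.0, 120.0)]
filtered = remove_overlapping(ranked)
```

In this example the second result is dropped because it overlaps the first, while the segment that merely touches the first one at 60 s is kept.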

Hyperlinking

● Transformed to the Search Task: the query segment was transformed into a textual query by including all the words of the subtitles lying within the segment boundary

● Context: we used a 200-second-long passage before and after each segment

● Metadata: we concatenated corresponding metadata with each segment and each query segment

● We combined textual similarity with visual and prosodic similarity
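Turning a query segment into a textual query, with optional context and metadata, might be sketched like this. The word representation (a timestamp paired with a token), the function name, and the choice to append metadata as plain text are assumptions made for illustration only.

```python
from typing import List, Tuple

# (time in seconds, token); hypothetical subtitle representation
Word = Tuple[float, str]


def build_query(
    words: List[Word],
    seg_start: float,
    seg_end: float,
    context: float = 200.0,
    metadata: str = "",
) -> str:
    """Join all subtitle words inside the segment, widened by `context` seconds on both sides."""
    lo = seg_start - context
    hi = seg_end + context
    tokens = [token for time, token in words if lo <= time <= hi]
    if metadata:
        # Assumption: metadata is simply concatenated to the query text.
        tokens.append(metadata)
    return " ".join(tokens)


words = [(0.0, "a"), (250.0, "b"), (500.0, "c"), (900.0, "d")]
query = build_query(words, 400.0, 450.0, context=200.0, metadata="news")
```

The resulting string can then be submitted to the same index that serves the Search Task.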

Visual Similarity

Visual Similarity Cont.

● We use Feature Signatures and Signature Quadratic Form Distance

http://siret.ms.mff.cuni.cz
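For reference, the Signature Quadratic Form Distance between two feature signatures can be computed as in the sketch below. A feature signature is taken here to be a list of (centroid, weight) pairs. The Gaussian similarity function and the value of `alpha` are assumptions; the slides do not state which similarity function or parameters were used.

```python
import math
from typing import List, Sequence, Tuple

# A feature signature: list of (centroid, weight) pairs
Signature = List[Tuple[Sequence[float], float]]


def gaussian_similarity(a: Sequence[float], b: Sequence[float], alpha: float) -> float:
    """Similarity of two centroids: exp(-alpha * squared Euclidean distance)."""
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-alpha * d2)


def sqfd(sig_q: Signature, sig_o: Signature, alpha: float = 1.0) -> float:
    """SQFD = sqrt(w A w^T), with w = (w_q | -w_o) and A the centroid similarity matrix."""
    centroids = [c for c, _ in sig_q] + [c for c, _ in sig_o]
    weights = [w for _, w in sig_q] + [-w for _, w in sig_o]
    total = 0.0
    for ci, wi in zip(centroids, weights):
        for cj, wj in zip(centroids, weights):
            total += wi * wj * gaussian_similarity(ci, cj, alpha)
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(total, 0.0))


sig_a: Signature = [((0.0,), 1.0)]
sig_b: Signature = [((1.0,), 1.0)]
distance = sqfd(sig_a, sig_b)
```

Unlike a bin-by-bin histogram comparison, this distance handles signatures of different sizes, because the similarity matrix is built from the concatenated centroids of both signatures.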

Audio Similarity

Hyperlinking Results

● Metadata help; transcripts and post-filtering behave similarly to the Search Task

● Unlike in the Search sub-task, the segmentation employing the Decision Trees outperforms the fixed-length segmentation for most of the measures

● A consistent improvement when visual similarity is used

● A small improvement in MAP, P20, and MAP-tol measures when prosodic similarity is used

Results Cont.

Search Sub-task

Trans.  Segm.  SegLen.  Overlap  MAP     P5      P10     P20     MAP-bin  MAP-tol
Sub.    DT     120s     Yes      16.349  0.8400  0.8367  0.8433  0.3172   0.0515
Sub.    DT     50s      No       0.8028  0.7867  0.7667  0.6933  0.3199   0.2350

Hyperlinking Sub-task

Trans.  Segm.  Weights  Overlap  MAP     P5      P10     P20     MAP-bin  MAP-tol
Sub.    Fixed  Visual   Yes      4.1824  0.9667  0.9567  0.8967  0.3080   0.0996
Sub.    DT     Visual   No       0.8253  0.8867  0.8567  0.7383  0.2525   0.1991

Conclusion

● Need to be careful about overlapping of the retrieved results

● Fixed-length segmentation works well but could be outperformed by supervised ML-based methods

● Visual and audio modalities can improve text-based retrieval on the subtitles and transcripts

Thank you

This research is supported by the Czech Science Foundation, grant number P103/12/G084, Charles University Grant Agency GA UK, grant number 920913, and by SVV project number 260 104.
