semantic video tool - project assignment presentation

Semantic Video Tool IRTM - Assignment Presentation 28th May 2015 Daniele Di Mitri

Upload: daniele-di-mitri

Post on 14-Aug-2015




TRANSCRIPT

Page 1: Semantic Video Tool - Project Assignment Presentation

Semantic Video Tool

IRTM - Assignment Presentation

28th May 2015 Daniele Di Mitri

Page 2: Semantic Video Tool - Project Assignment Presentation

SVT in a nutshell

• video directory

• RESTful web application

• NLP semantic video analyzer

• video search engine

• analytics

Using the video transcripts!

Page 3: Semantic Video Tool - Project Assignment Presentation

Corpus

• 1391 video transcripts from TEDTalks

• from the 6 top categories

– technology, business, design, entertainment, science, global issues

• + related metainfo

– no. comments, no. views, category, length, etc.

Why TEDTalks?

• many documents

• similar length

• very good English

Page 4: Semantic Video Tool - Project Assignment Presentation
Page 5: Semantic Video Tool - Project Assignment Presentation

NLP operations

• Common NLP operations (NLTK)

– Tokenization (punctuation)

– POS Tagging

– Stemming

– Chunking

– Frequent unigrams, bigrams, trigrams

• Automatic summaries (in 2 sentences)

• TF-IDF based search (powered by scikit-learn)

• Popular video classification

• Anaphora resolution
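
The tokenization and frequent n-gram steps listed above can be sketched in a few lines. This is a minimal standard-library illustration, not the NLTK code used in the project; the helper names (`tokenize`, `ngrams`, `most_frequent`) and the sample sentence are made up for the example.

```python
import re
from collections import Counter

def tokenize(text):
    # Lowercase and keep runs of letters/apostrophes; punctuation is dropped.
    return re.findall(r"[a-z']+", text.lower())

def ngrams(tokens, n):
    # Slide a window of size n over the token list.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def most_frequent(tokens, n, k=3):
    # Count every n-gram and return the k most common with their counts.
    return Counter(ngrams(tokens, n)).most_common(k)

# Hypothetical transcript fragment, for illustration only.
transcript = "The brain is a machine, and the brain is a mystery."
tokens = tokenize(transcript)
top_bigrams = most_frequent(tokens, 2)
```

Calling `most_frequent(tokens, 1)` or `most_frequent(tokens, 3)` gives the unigram and trigram counts in the same way.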

Page 6: Semantic Video Tool - Project Assignment Presentation

TF-IDF based recommender system

[Figure: side-by-side panels labeled TED and SVT]
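
One common way to build a TF-IDF based recommender is to rank the other transcripts by cosine similarity to the current one. The sketch below assumes that approach and uses only the standard library; the project itself is described as using scikit-learn, so the function names, the IDF variant, and the toy documents here are illustrative assumptions.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    # docs: list of token lists. Returns one {term: weight} dict per document.
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    # The +1 keeps a non-zero weight for terms that occur in every document.
    idf = {t: math.log(n / df[t]) + 1.0 for t in df}
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: tf[t] * idf[t] for t in tf})
    return vectors

def cosine(a, b):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def recommend(index, vectors, k=2):
    # Rank every other document by similarity to the one at `index`.
    scores = [(cosine(vectors[index], v), i)
              for i, v in enumerate(vectors) if i != index]
    return [i for _, i in sorted(scores, reverse=True)[:k]]

# Toy "transcripts", for illustration only.
docs = [
    "robots learn to walk with machine learning".split(),
    "machine learning helps robots see".split(),
    "how to design a beautiful chair".split(),
]
vectors = tfidf_vectors(docs)
related = recommend(0, vectors, k=1)
```

Here `related` holds the index of the transcript most similar to the first one. The same similarity score can serve a search box by treating the query as one more short document.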

Page 7: Semantic Video Tool - Project Assignment Presentation

Why is Monica Lewinsky popular?

Popularity Rate = Log( (comments × views) / datediff(now, dateup) )

• Idea: mark docs with rate > 15 as «popular» and the rest as «unpopular»

• How: a TF-IDF + SVM pipeline to classify (work in progress)
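
The labelling step can be sketched as follows. The slide's formula is read here as the logarithm of comments times views divided by the number of days since upload; the multiplication, the natural logarithm, and the handling of zero counts are assumptions, since the slide does not spell them out. The input numbers are invented.

```python
import math
from datetime import date

POPULAR_THRESHOLD = 15  # from the slide: rate > 15 means "popular"

def popularity_rate(comments, views, uploaded, today):
    # Assumed reading: log of (comments * views) per day online.
    if comments == 0 or views == 0:
        return float("-inf")  # assumption: no activity counts as least popular
    days_online = max((today - uploaded).days, 1)  # avoid division by zero
    return math.log(comments * views / days_online)

def label(rate):
    # Binary target for a later classifier.
    return "popular" if rate > POPULAR_THRESHOLD else "unpopular"

# Hypothetical numbers, for illustration only.
rate = popularity_rate(comments=1000, views=5_000_000,
                       uploaded=date(2015, 3, 20), today=date(2015, 5, 28))
recent_label = label(rate)
```

The resulting labels would then be the targets for the TF-IDF + SVM classifier mentioned in the slide.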

Page 8: Semantic Video Tool - Project Assignment Presentation

Anaphora handling

• RegExp (\b(?![a-zA-Z]{2}\s)\w+|')+
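
To see what the pattern does, it can be run against a short sentence. The slide gives only the expression, so how it feeds into anaphora handling is not stated. As written, it matches words (with apostrophes joined in) and skips two-letter alphabetic words that are followed by whitespace, a set that includes many pronouns such as "it", "he" and "we". The helper name and the sample sentence are made up for the example.

```python
import re

# Pattern copied verbatim from the slide.
PATTERN = re.compile(r"(\b(?![a-zA-Z]{2}\s)\w+|')+")

def matches(text):
    # finditer + group(0) returns whole matches; findall would return only
    # the last repetition of the capturing group.
    return [m.group(0) for m in PATTERN.finditer(text)]

tokens = matches("it is a don't go home")
```

For this input, `tokens` is `['a', "don't", 'home']`: "it", "is" and "go" are dropped because each is a two-letter word followed by a space.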

Page 9: Semantic Video Tool - Project Assignment Presentation

Demo!