Agile Lab – RAI
A NEW KIND OF VIDEO OBJECT RECOGNITION SYSTEM
Dr. Alberto MESSINA – Researcher @CRIT-RAI
Alberto Firpo – CEO @Agile Lab
Agile Lab - What we do
SCALABLE TECHNOLOGIES
MACHINE LEARNING AND ARTIFICIAL INTELLIGENCE
Our background in image/video processing
Computer Vision “Mapping” Algorithms + Deterministic Matching (e.g., SIFT)
Computer Vision “Mapping” Algorithms + Deep Learning Decision (e.g., Selective Search + CNN)
Fully Deep Learning based (e.g., CNN, fast, SSD, YOLO)
More Flexibility
More Performance
More Use Cases
Performance drivers and trade-offs
• Near Real Time vs Batch processing
• Frame frequency
• Image Spatial Scaling
• Mean Average Precision
Some use cases – Deep Sat
• Precision: 0.98
• Recall: 0.91
• MAP: 0.78
Deep Sat: true positive
Deep Sat: true negative
Some use cases – Deep Logo
Precision (P) and recall (R) per class:
• Bmw: P: 0.97 R: 0.91
• Ferrari: P: 0.98 R: 0.85
• Ford: P: 1 R: 0.81
• HP: P: 1 R: 0.88
• nVidia: P: 1 R: 0.83
• Rolex: P: 0.97 R: 0.81
• Shell: P: 0.98 R: 0.80
• Apple: P: 1 R: 0.93
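The per-class figures above can be summarised in a few lines of Python. This is only an illustrative sketch: the F1 score and the macro averages are not on the slide and are added here as a convenient way to compare classes, and the names `deep_logo` and `f1_score` are hypothetical.

```python
# Per-class precision (P) and recall (R) exactly as quoted on the slide.
deep_logo = {
    "Bmw": (0.97, 0.91), "Ferrari": (0.98, 0.85),
    "Ford": (1.0, 0.81), "HP": (1.0, 0.88),
    "nVidia": (1.0, 0.83), "Rolex": (0.97, 0.81),
    "Shell": (0.98, 0.80), "Apple": (1.0, 0.93),
}


def f1_score(p: float, r: float) -> float:
    """Harmonic mean of precision and recall (not reported on the slide)."""
    return 2 * p * r / (p + r) if (p + r) > 0 else 0.0


# F1 per class, plus unweighted (macro) averages over the eight classes.
f1_by_class = {name: f1_score(p, r) for name, (p, r) in deep_logo.items()}
macro_p = sum(p for p, _ in deep_logo.values()) / len(deep_logo)
macro_r = sum(r for _, r in deep_logo.values()) / len(deep_logo)

for name, f1 in sorted(f1_by_class.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:8s} F1 = {f1:.3f}")
print(f"macro P = {macro_p:.4f}, macro R = {macro_r:.4f}")
```

Read this way, precision is uniformly high while recall varies more from class to class, so the recall column is what separates the classes.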
Online services today
Too much content …
METADATA
How to achieve automated content understanding*?
* at reasonable costs …
"Gina Lollobrigida sitting in a dark room"
All rights reserved – RAI Radiotelevisione Italiana
"faces": [
  {
    "age": 30,
    "gender": "Female",
    "faceRectangle": {"left": 352, "top": 186, "width": 55, "height": 55}
  },
  {
    "age": 5,
    "gender": "Male",
    "faceRectangle": {"left": 436, "top": 247, "width": 47, "height": 47}
  }
]
"tags": [
  "person", "sitting", "photo", "woman", "posing", "front", "holding",
  "young", "black", "man", "camera", "white", "girl", "dark", "standing",
  "wearing", "table", "room", "phone", "living", "computer", "shirt"
]
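As a rough illustration of how such automatically generated metadata might be consumed, the sketch below parses a response shaped like the fragment above. It assumes the "faces" and "tags" fields arrive together inside one JSON object (the slide only shows fragments), uses a shortened tag list, and the helper name `face_box` is hypothetical.

```python
import json

# Assumption: the two fragments are fields of a single JSON object.
# The tag list is abbreviated for the example.
raw = """
{
  "faces": [
    {"age": 30, "gender": "Female",
     "faceRectangle": {"left": 352, "top": 186, "width": 55, "height": 55}},
    {"age": 5, "gender": "Male",
     "faceRectangle": {"left": 436, "top": 247, "width": 47, "height": 47}}
  ],
  "tags": ["person", "sitting", "woman", "dark", "room"]
}
"""


def face_box(face: dict) -> tuple:
    """Convert left/top/width/height into (left, top, right, bottom)."""
    rect = face["faceRectangle"]
    return (rect["left"], rect["top"],
            rect["left"] + rect["width"], rect["top"] + rect["height"])


metadata = json.loads(raw)
boxes = [face_box(f) for f in metadata["faces"]]

for face, box in zip(metadata["faces"], boxes):
    print(f'{face["gender"]}, estimated age {face["age"]}, box {box}')
print("tags:", ", ".join(metadata["tags"]))
```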
shot at Santa Maria Antiqua
Why?
• Discovery of latent information through visual cues
• Impossible to annotate content from all possible points of view
• Discovery of unexpected correlations
• Linking and enriching
• Enabling user interaction in smart environments
• Tourism, culture
AgileRAI architecture
How?
AgileRAI video processing pipeline
[Diagram blocks]
• Input video
• MPEG CDVS Extractor
• CDVS Descriptors
• CDVS Matching
• Image Database
• Matching labels, e.g. (Rialto_Bridge)
• Monitoring
• Labels + timestamps, e.g. (RAI1_YYYY-MM-DDThh:mm:ss.s, Rialto_Bridge)
• Semantic annotation
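One way to picture the step from matching labels to "labels + timestamps" is sketched below. This is not the AgileRAI implementation: it assumes the matcher yields one optional label per sampled frame, collapses each run of identical labels into a single record, and formats the key after the (RAI1_YYYY-MM-DDThh:mm:ss.s, Rialto_Bridge) pattern on the slide. The function name, the sample frames, and the start time are all made up for the example.

```python
from datetime import datetime, timedelta
from itertools import groupby


def label_segments(frame_labels, start, channel="RAI1"):
    """Collapse per-frame matches into (channel_timestamp, label) pairs.

    frame_labels: iterable of (offset_seconds, label or None), in time order.
    One pair is emitted per run of consecutive frames with the same label,
    stamped with the time of the first frame in the run.
    """
    records = []
    for label, run in groupby(frame_labels, key=lambda item: item[1]):
        if label is None:  # frames with no match produce no record
            continue
        offset = next(run)[0]
        ts = start + timedelta(seconds=offset)
        # Tenths of a second, matching the "ss.s" pattern.
        stamp = ts.strftime("%Y-%m-%dT%H:%M:%S.") + str(ts.microsecond // 100000)
        records.append((f"{channel}_{stamp}", label))
    return records


# Hypothetical matcher output: one sampled frame every 0.5 s.
frames = [(0.0, None), (0.5, "Rialto_Bridge"), (1.0, "Rialto_Bridge"),
          (1.5, None), (2.0, "Rialto_Bridge")]
detections = label_segments(frames, start=datetime(2017, 1, 1, 20, 30, 0))
for record in detections:
    print(record)
```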
Semantic annotation pipeline
[Diagram blocks]
• Video processing pipeline
• Labels + timestamps, e.g. (RAI1_YYYY-MM-DDThh:mm:ss.s, Rialto_Bridge)
• Semantic enrichment
• ‘Detection’ ontology
• LOD repository, e.g. (dbpedia:Rialto_Bridge)
• Triplestore, e.g. (owl:sameAs http://dbpedia.org/resource/Rialto_Bridge)
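The enrichment step can be sketched as turning each (timestamp, label) pair into triples that link the detected label to its DBpedia resource via owl:sameAs. The sketch below emits N-Triples-style strings with plain Python. The `http://example.org/detection#` namespace and the `depicts` property are placeholders, since the slides do not show the actual ‘Detection’ ontology; only the owl:sameAs and DBpedia URIs are standard.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
DBPEDIA = "http://dbpedia.org/resource/"
EX = "http://example.org/detection#"  # hypothetical namespace, not from the slides


def enrich(detection):
    """Return N-Triples-style statements for one (timestamp_id, label) pair."""
    stamp_id, label = detection
    detection_node = f"<{EX}{stamp_id}>"
    label_node = f"<{EX}{label}>"
    return [
        # The detection at this timestamp shows the labelled entity
        # ('depicts' is a placeholder property name).
        f"{detection_node} <{EX}depicts> {label_node} .",
        # The local label is the same entity as the DBpedia resource.
        f"{label_node} <{OWL_SAME_AS}> <{DBPEDIA}{label}> .",
    ]


triples = enrich(("RAI1_2017-01-01T20:30:00.5", "Rialto_Bridge"))
for triple in triples:
    print(triple)
```

Once loaded into a triplestore, the owl:sameAs link is what allows a detection to be enriched with whatever the linked open data repository already knows about the entity.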
UC1: video browsing
UC2: dynamic semantic tagging
Efficient delivery is good but not enough
Content understanding and organisation is the key
Understanding, linking, enriching
Smart(er) content recommendation
Thanks
For contacts or questions: