fraunhofer fokus espri powerpoint masterfolien...2019/11/19  · dr. christopher krauss media & data...

  • Dr. Christopher Krauss

    Media & Data Science Lead, Future Applications and Media

    Fraunhofer FOKUS, Kaiserin-Augusta-Allee 31, 10589 Berlin, Germany

    [email protected]

    AI EVALUATION FRAMEWORKS IN THE MEDIA CONTEXT

    © Matthias Heyde / Fraunhofer FOKUS

  • © ESA/NASA

    Fraunhofer FOKUS

    Institute for Open Communication Systems

    Dr. Christopher Krauss

    Media & Data Science Lead

    @ Future Applications and Media

    • Since 2011 at Fraunhofer FOKUS:

    Managed multiple industry- and publicly funded projects

    • AI, Machine Learning & Recommender Systems

    • Applications for TV & Media

    • Technology Enhanced Learning

    • 2012: Master of Science @ Beuth University

    • Since 2014: Guest Lecturer @ Beuth & TU Berlin

    • 2018: Dr.-Ing./ PhD @ TU Berlin

  • [Figure: Content Creation; Display/ Playback]

  • MEDIA LIFECYCLE

    1 Needs Analysis

    2 Content Creation

    3 Adding Meta-Data

    4 Content Encoding & Storage

    5 Offering & Discovery

    6 Content Delivery

    7 Display/ Playback

    8 Usage Analytics

    See also: https://www.fokus.fraunhofer.de/en/fame/workingareas/ai

  • Agenda

    1 Deep Encode

    2 TV Audience Predictions

    (3) Smart Learning Recommendations

  • How to evaluate AI components?

    [Figure: Evaluation Framework diagram with the labels Data, Data Splitting, Input, Model, Output, Measurements]

  • A Typical Evaluation Framework

    n-fold cross-validation to summarize results:

    [Figure: the original data set is split three times (1., 2., 3.) into training data sets and a validation data set, with a different part serving as the validation data set each time]

    Root Mean Squared Error, Mean Absolute Error, …
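The n-fold scheme on this slide can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the helper names (`k_fold_indices`, `cross_validate`) and the placeholder "model" that always predicts the training mean are hypothetical and are not taken from the talk.

```python
import math


def k_fold_indices(n_samples, n_folds):
    """Yield (train_idx, val_idx) so that every sample is validated exactly once."""
    base, extra = divmod(n_samples, n_folds)
    start = 0
    for fold in range(n_folds):
        size = base + (1 if fold < extra else 0)
        stop = start + size
        val_idx = list(range(start, stop))
        train_idx = [i for i in range(n_samples) if i < start or i >= stop]
        yield train_idx, val_idx
        start = stop


def rmse(y_true, y_pred):
    """Root Mean Squared Error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean Absolute Error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def cross_validate(values, n_folds):
    """Average RMSE and MAE over all folds for a placeholder mean predictor."""
    rmse_scores, mae_scores = [], []
    for train_idx, val_idx in k_fold_indices(len(values), n_folds):
        train = [values[i] for i in train_idx]
        prediction = sum(train) / len(train)  # hypothetical stand-in for a real model
        y_true = [values[i] for i in val_idx]
        y_pred = [prediction] * len(y_true)
        rmse_scores.append(rmse(y_true, y_pred))
        mae_scores.append(mae(y_true, y_pred))
    return sum(rmse_scores) / n_folds, sum(mae_scores) / n_folds


mean_rmse, mean_mae = cross_validate([1, 2, 3, 4, 5, 6], n_folds=3)
print(f"RMSE={mean_rmse:.3f}  MAE={mean_mae:.3f}")
```

Averaging the per-fold scores is what "summarize results" refers to here; a real evaluation would swap the mean predictor for the model under test.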

  • 1 Deep Encode

  • Motivation

    Low complexity / High redundancy

    High complexity / Less redundancy

    [Figure: example content types: Animation, Countryside, Action, Sport]

    See also: https://www.fokus.fraunhofer.de/en/fame/deep-encode

  • Solution

    Context Aware Encoding

    AI-based image processing for content analysis

    • Automatic Scene Detection

    • Learning and suggestion of optimal settings

    Deep Learning for appropriate Encoding Ladders

    • Prediction of bitrate-VMAF pairs through Neural Networks (NN) & Random Forest Regression (RFR)

    Objective and subjective quality measurements

    • PSNR, VMAF & SSIM

    Automated Encoding Chain

    • Per Title, Per Segment, Per Scene Encoding

    [Figure labels: Context Aware Encoding (plus playback and network statistics); Per Scene Encoding; Conventional & Per Title Encoding]

  • Metrics

    How are costs measured?

    • Storage

    • Traffic

    What is a “good” quality? Is there a quality reference?

    Some Metrics:

    • Peak Signal-to-Noise Ratio (PSNR)

    • Structural Similarity Index (SSIM)

    • Video Multimethod Assessment Fusion (VMAF)

    [Figure: Quality gain; Optional representation to achieve a VMAF score of 93]

    See also: https://www.fokus.fraunhofer.de/en/fame/deep-encode
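Of the metrics listed on this slide, PSNR is the simplest to compute by hand: PSNR = 10 · log10(MAX² / MSE). The sketch below assumes 8-bit samples (peak value 255) passed as flat lists; the function name `psnr` and the sample values are illustrative and not taken from the talk. SSIM and VMAF are considerably more involved and are not reproduced here.

```python
import math


def psnr(reference, distorted, peak=255):
    """Peak Signal-to-Noise Ratio in dB between two equally long sample lists."""
    if len(reference) != len(distorted):
        raise ValueError("both inputs must have the same number of samples")
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals: no noise at all
    return 10 * math.log10(peak ** 2 / mse)


# Hypothetical 4-pixel example: every sample is off by 2, so MSE = 4.
print(round(psnr([100, 120, 140, 160], [102, 118, 142, 158]), 2))
```

A higher value means the encoded signal is closer to the reference, which is why a full-reference metric like this needs the original video as a quality reference.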

  • See also: https://www.fokus.fraunhofer.de/en/fame/deep-encode

    [Figure: workflow diagram with the labels Classic Per-Title Encoding (Test-Encodes); Video metadata, bitrate, VMAF; Content Analysis (Feature Extraction); Video Features; Model; Trained Model; Served Model; Predicted Bitrate-VMAF Pairs (Encoding Ladders)]

  • Results

    162 Reference Videos; 12,960 Test-Encodes; 10-fold cross-validation

    Quality: Average VMAF score

    Traffic: Average required bitrate in kbit/s

    Storage: Size in MB

    Values in parentheses: absolute | relative change versus the Reference Encoding Ladder

    Encoding Ladder                              Quality              Traffic                Storage
    Reference Encoding Ladder                    84.1                 2136                   816.2
    Quality Optimized Per-Title Encoding Ladder  88.6 (+4.5 | 5.4%)   2407 (+271 | 12.7%)    919.6 (+103.4 | 12.7%)
    Storage Optimized Per-Title Encoding Ladder  86.4 (+2.2 | 2.7%)   1360 (-776 | 36.3%)    346.5 (-469.6 | 57.6%)
    Per-Scene Encoding Ladder                    84.7 (+0.6 | 0.7%)   930 (-1206 | 56.5%)    233.5 (-582.7 | 71.4%)

    On average ~55% total delivery savings

    On average ~70% total storage savings

    On average ~60% higher quality scores for same bitrates

    0% Perceptual quality loss on highest quality

    See also: https://www.fokus.fraunhofer.de/en/fame/deep-encode
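The relative changes reported on this slide follow from simple arithmetic against the reference ladder. As a small worked check, the sketch below recomputes them for the Per-Scene Encoding Ladder from the rounded values shown on the slide; the dictionary keys and the helper `relative_change` are illustrative names, not part of the talk.

```python
# Values as printed on the results slide.
reference = {"vmaf": 84.1, "bitrate_kbps": 2136, "storage_mb": 816.2}
per_scene = {"vmaf": 84.7, "bitrate_kbps": 930, "storage_mb": 233.5}


def relative_change(value, baseline):
    """Signed change of value versus baseline, in percent."""
    return (value - baseline) / baseline * 100


changes = {
    key: round(relative_change(per_scene[key], reference[key]), 1)
    for key in reference
}
print(changes)  # {'vmaf': 0.7, 'bitrate_kbps': -56.5, 'storage_mb': -71.4}
```

Because the slide shows rounded averages, recomputing other rows from the printed values can differ from the printed deltas in the last digit.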

  • 2 TV Audience Predictions

  • Motivation

    See also: https://www.fokus.fraunhofer.de/en/fame/hbbtv-bm

  • Measurement Solution

    • Cloud-based real-time system for measuring TV reach and media usage in (mobile) applications

    • Tailor-made backend with extensive administration functions

    • Highly scalable, automated overall system

    • Detailed reporting and analytics toolkit

    See also: https://www.fokus.fraunhofer.de/en/fame/hbbtv-bm

  • Data


    See also: https://www.fokus.fraunhofer.de/en/fame/hbbtv-bm

  • Forecasting Solution


    Long short-term memory (LSTM)

    Multi-Step Ahead Lower Upper Bound Forecasting

    See also: https://www.fokus.fraunhofer.de/en/fame/hbbtv-bm

  • Results

    Training based on

    • two years of time-series data

    • on a per-second basis

    • incl. program meta-data

    Data Splitting

    • Fixed time-window cross-validation

    Custom Metric combining:

    • Multi-step ahead forecasting

    • Lower Upper Bound Estimations

    See also: https://www.fokus.fraunhofer.de/en/fame/hbbtv-bm

    [Figure: Average drop-off in commercial breaks; Forecast of a commercial break]

    Naive Approach: MASE = 5.08

    LSTM Forecast: MASE = 1.92

    Error reduction by 62.2%
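The slide does not spell out how its MASE values are scaled, so the following is only a sketch of the common textbook definition of the Mean Absolute Scaled Error: the forecast's mean absolute error divided by the mean absolute error of a one-step naive forecast on the training history. The function names and the toy series are hypothetical. The last line reproduces the reported error reduction from the two MASE values on the slide.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mase(history, actual, predicted):
    """MASE: forecast MAE scaled by the MAE of a one-step naive forecast on history."""
    naive_errors = [abs(history[i] - history[i - 1]) for i in range(1, len(history))]
    scale = sum(naive_errors) / len(naive_errors)
    return mean_absolute_error(actual, predicted) / scale


def error_reduction_percent(baseline_error, model_error):
    """Relative improvement of the model error over the baseline error, in percent."""
    return (baseline_error - model_error) / baseline_error * 100


# Toy audience series (hypothetical numbers, not data from the talk).
history = [10, 12, 11, 13]
print(mase(history, actual=[14, 12], predicted=[13, 13]))

# MASE values as reported on the slide: naive 5.08, LSTM 1.92.
print(round(error_reduction_percent(5.08, 1.92), 1))  # 62.2
```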

  • Playout-Side Ad Insertion Solution

    See also: https://www.fokus.fraunhofer.de/en/fame/hbbtv-bm

  • A good Evaluation-Framework? – Methodological Questions:

    In advance, define and describe well:

    1. Model: What is the problem and the goal of the approaches/ objects of investigation?

    2. Data: Which data set is used as input and how is the data set structured?

    3. Evaluation Framework: How are training and test data split?

    4. Evaluation Framework: What cross-validation method is used?

    5. Evaluation Framework: What metrics are used to prove whether an approach works “adequately”? Can the results be compared with other approaches?

    Inspired by: P. G. Campos, F. Díez, and I. Cantador. Time-aware recommender systems: a comprehensive survey and analysis of existing evaluation protocols. User Modeling and User-Adapted Interaction, 24(1-2):67–119, 2014.

  • 9th FOKUS MEDIA WEB SYMPOSIUM

    May 05–06, 2020, Berlin

    9th FOKUS Media Web Symposium – And Still Diving Deeper

    The FOKUS Media Web Symposium has taken place since 2010 and has developed into a well-received expert meeting for all topics related to video technologies. The conference, tutorials and workshops of the 9th FOKUS Media Web Symposium 2020 will cover deep insights into internet-delivered media, discussing the newest developments in media meets AI, media meets 5G and media meets scale. In between, coffee breaks and lunch offer the opportunity to network and visit demos and exhibits of Fraunhofer FOKUS and event partners.

    www.fokus.fraunhofer.de/go/mws

  • © ESA/NASA

    Fraunhofer FOKUS

    Institute for Open Communication Systems

    Thank you for your attention!

    Dr.-Ing. Christopher Krauss

    Media & Data Science Lead

    Future Applications and Media

    Tel. +49 (30) 34 63 – 72 [email protected]

    Fraunhofer Institute for Open Communication Systems FOKUS

    Kaiserin-Augusta-Allee 31

    10589 Berlin, Germany

    Tel: +49 (30) 34 63 – 7000

    Fax: +49 (30) 34 63 – 8000

    www.fokus.fraunhofer.de

  • (3) Smart Learning Recommendations

  • Motivation


  • Solution

    1. Definition of a time-dependent evaluation framework including a novel measurement value for timeliness

    2. Collection of appropriate learning activity data for offline evaluations on historical data

    3. Realization and comparison of Recommender System algorithms

  • Data

    CONTENT ACCESSES

    DURATION OF LEARNING

    TIME OF THE LESSON

    REQUIRED PREVIOUS KNOWLEDGE

    TEST RELEVANCE

    SELF-ASSESSMENTS

    EXERCISE ANSWERS

    FORGETTING MODEL

    OTHER CLASSMATES' DATA

  • Evaluation Framework

    • Instead of an n-fold cross-validation, a time-window cross-validation was applied

    • Increasing time-window

    • Fixed time-window

    Source: Krauss, Christopher. Time-Dependent Recommender Systems for the Prediction of Appropriate Learning Objects. Dissertation at TU Berlin, Germany, June 29, 2018. http://dx.doi.org/10.14279/depositonce-7119
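The difference between the two time-window variants can be illustrated with index ranges over time-ordered data. This is a minimal sketch, not the procedure from the dissertation: the function name `time_window_splits`, its parameters, and the choice to advance by one test window per step are all assumptions made for illustration.

```python
def time_window_splits(n_steps, train_size, test_size, increasing=False):
    """Yield ((train_start, train_stop), (test_start, test_stop)) index pairs.

    Training data always lies strictly before the test window, so no
    future information leaks into training. With increasing=True the
    training window keeps its start at 0 and grows; otherwise it has a
    fixed length and slides forward.
    """
    offset = 0
    while offset + train_size + test_size <= n_steps:
        train_stop = offset + train_size
        train_start = 0 if increasing else offset
        yield (train_start, train_stop), (train_stop, train_stop + test_size)
        offset += test_size


print(list(time_window_splits(10, train_size=4, test_size=2)))
print(list(time_window_splits(10, train_size=4, test_size=2, increasing=True)))
```

Unlike n-fold cross-validation, neither variant ever trains on observations that come after the test window, which is the property that matters for time-dependent data.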

  • Metrics

    • The new metric Timeliness Deviation (MATD) is introduced to measure the time deviation between the time of recommendation and the time when an item is relevant

    Source: Krauss, Christopher. Time-Dependent Recommender Systems for the Prediction of Appropriate Learning Objects. Dissertation at TU Berlin, Germany, June 29, 2018. http://dx.doi.org/10.14279/depositonce-7119
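The slide gives no formula for the metric, so the following is only a guess at its general shape: a mean of absolute differences between the time an item was recommended and the time it was actually relevant. The function name, the use of days as the time unit, and the sample values are assumptions; the exact definition is in the cited dissertation.

```python
def mean_absolute_timeliness_deviation(recommended_at, relevant_at):
    """Mean absolute gap between recommendation time and relevance time.

    Assumption: both lists hold one timestamp per recommended item,
    expressed in the same unit (here: days since course start).
    """
    if len(recommended_at) != len(relevant_at) or not recommended_at:
        raise ValueError("need two non-empty lists of equal length")
    gaps = [abs(rec - rel) for rec, rel in zip(recommended_at, relevant_at)]
    return sum(gaps) / len(gaps)


# Hypothetical example: three items recommended on days 1, 5 and 10,
# but actually relevant on days 2, 5 and 7.
print(mean_absolute_timeliness_deviation([1, 5, 10], [2, 5, 7]))
```

A value of 0 would mean every item was recommended exactly when it became relevant; larger values mean recommendations arrive too early or too late.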

  • Results

    Source: Krauss, Christopher. Time-Dependent Recommender Systems for the Prediction of Appropriate Learning Objects. Dissertation at TU Berlin, Germany, June 29, 2018. http://dx.doi.org/10.14279/depositonce-7119