
Page 1

Institute of Computer Science
Chair of Communication Networks

Prof. Dr.-Ing. P. Tran-Gia

Modeling YouTube QoE based on Crowdsourcing and Laboratory User Studies

Tobias Hoßfeld, Raimund Schatz

STSM 15.8.-30.9.2011

http://www3.informatik.uni-wuerzburg.de/research/fia

http://www3.informatik.uni-wuerzburg.de/staff/hossfeld

Page 2

QoE Issue: Waiting, Waiting, Waiting…

Stalling

Waiting Time Perception

Page 3

Research Activities Related to STSM

Application-Level Measurements
• bottleneck scenario with constant bandwidth
• video characteristics
• realistic stalling patterns

Monitoring and Stalling Detector
• heuristics fit QoS
• information extraction approach leads to exact QoE results

Optimization and Dimensioning
• initial delay (GI/GI/1): T0/D < 5%
• bandwidth provisioning: 120% of V
• TCP better than UDP in the bottleneck

QoE Modeling
• only stalling is relevant, not content, demographics, etc.
• users “accept” almost no or only short stalling
• crowdsourcing supports i:Lab

Slide annotations: video player parameter, initial buffer 2 sec; variable video bit rate V, high stalling frequency for V = B; QoE management; stalling lengths used in tests: 1–6 sec; mapping between QoS (e.g. bandwidth B) and QoE; stalling as key influence factor.
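The bottleneck scenario above (constant bandwidth B, video bit rate V, initial player buffer of 2 sec) can be sketched as a simple fluid model of playout buffering. This is an illustrative simulation only, not the measurement setup from the study; the function name and parameters are hypothetical:

```python
def simulate_stalling(B, V, duration, initial_buffer=2.0, dt=0.1):
    """Fluid-model sketch of playout buffering behind a constant-bandwidth
    bottleneck: download at rate B, play out at bit rate V (same units).
    Playback stalls when the buffer runs empty and resumes once
    `initial_buffer` seconds of video are buffered again."""
    buffered = 0.0              # seconds of video currently in the buffer
    played = 0.0                # seconds of video played out so far
    stalls = 0                  # mid-playback stalling events
    stalling = True             # start in the buffering state (initial delay)
    stall_time = 0.0            # total waiting time (incl. initial delay)
    while played < duration:
        buffered += (B / V) * dt            # video time downloaded this tick
        if stalling:
            stall_time += dt
            if buffered >= initial_buffer:  # buffer refilled: resume playback
                stalling = False
        else:
            play = min(dt, buffered, duration - played)
            played += play
            buffered -= play
            if buffered <= 0 and played < duration:
                stalling = True             # buffer empty: stalling event
                stalls += 1
    return stalls, stall_time
```

With 20% bandwidth headroom (B = 1.2 V, matching the 120% provisioning rule above) the buffer only grows after the initial delay, so no stalling occurs; with B below V the buffer drains periodically and stalling events accumulate.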

Page 4

Executive Summary of STSM

Workflow: Developed Test Design → Conducted Crowdsourcing Tests → Derived QoE Model, supported by Application Measurements and a Laboratory Study.

Crowdsourcing tests
• remote users
• ‘reliability’ questions
• application/user monitoring
• preloading of data

Data analysis
• identification of reliable users
• key influence factors via machine learning
• fitting with fundamental relationships

QoE model
• mapping function: stalling and QoE
• acceptance vs. perception
• comparison of crowdsourcing with laboratory results

Application measurements
• realistic parameters for temporal stimuli

Laboratory study
• reliable users
• different demographics
• different test setting, e.g. longer user tests

Page 5

Crowdsourcing Workflow

Challenge: identify unreliable QoE results.
Countermeasures:
• proper test design (gold standard data, consistency questions, content questions, application monitoring)
• filtering data and analyzing QoE results

[Figure: crowdsourcing workflow between employer and worker via the crowdsourcing platform: (1) submit task, (2) pull task, (3) complete task, (4)–(5) remuneration]

Methods also applicable to e.g. field trials!

Page 6

Crowdsourcing: Unreliable workers

LEVEL 1: ‘reliability’ questions
• wrong answers to content questions
• different answers to the same questions
• always selected the same option
• consistency questions: specified the wrong country/continent

LEVEL 2: ‘QoE’ questions
• did not notice stalling
• perceived non-existent stalling

LEVEL 3: ‘application/user’ monitoring
• did not watch all videos completely
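The three filter levels above can be sketched as a simple classification function applied per worker. The dictionary field names are hypothetical placeholders for illustration, not the study's actual data schema:

```python
def reliability_level(worker):
    """Classify a worker's submission by filter level (0 = reliable).
    `worker` is a dict of boolean checks; the keys below are
    illustrative names, not the study's real schema."""
    # LEVEL 1: failed 'reliability' questions
    if (worker["wrong_content_answer"]
            or worker["inconsistent_answers"]
            or worker["always_same_option"]
            or worker["wrong_location"]):       # consistency question failed
        return 1
    # LEVEL 2: failed 'QoE' questions
    if worker["missed_stalling"] or worker["phantom_stalling"]:
        return 2
    # LEVEL 3: failed application/user monitoring
    if not worker["watched_all_videos"]:
        return 3
    return 0  # passes all filters: rating is kept
```

Filtering then amounts to keeping only level-0 workers (or relaxing to levels ≤ 2 if, as noted below, application-layer monitoring turns out to be too strict).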

[Figure: percentage of workers per test (Mw1–Mw7, Facebook) classified at filter levels 0–3]

[Figure: SOS parameter a vs. ratio of fake users, for stalling lengths L = 1 s and L = 3 s, at filter levels 1–3]

• SOS hypothesis indicates unreliable tests
• Many user ratings rejected → further improvements required
• User warnings (“Test not done carefully”) → rejection rate decreased by about 50%
• Filtering may be too strict → application-layer monitoring not reliable

Page 7

Crowdsourcing vs. Laboratory Studies

[Figure: MOS vs. number of stalling events (0–6) with 4 seconds of stalling per event; crowdsourcing and laboratory curves]

Key influence factors on YouTube QoE: stalling frequency and stalling duration determine the user-perceived quality.

Lab studies within ACE 2.0 at FTW’s i:Lab. Similar shapes of the curves in the laboratory and crowdsourcing studies.

Page 8

Conclusions

Most relevant stimuli of Internet applications are temporal in nature

QoE models have to be extended in temporal dimension: stalling, waiting times, service interruptions

Gap between user perception and user acceptance, differences in lab and crowdsourcing (WG3)

‘Failed’ subjective studies for analysis of reliability (WG4)
Standards to detect unreliable subjects (WG5)

Crowdsourcing appears promising:
• tests are conducted fast and at low cost
• possibility to access different user groups (in terms of expectations/social background)
• but new challenges are imposed

WG1: “Web and cloud apps”
WG2: “Crowdsourcing”

Page 9

Outcome of STSM

“Quantification of YouTube QoE via Crowdsourcing” by Tobias Hoßfeld, Raimund Schatz, Michael Seufert, Matthias Hirth, Thomas Zinner, Phuoc Tran-Gia, IEEE International Workshop on Multimedia Quality of Experience - Modeling, Evaluation, and Directions (MQoE 2011), Dana Point, CA, USA, December 2011.

“FoG and Clouds: On Optimizing QoE for YouTube” by Tobias Hoßfeld, Florian Liers, Thomas Volkert, Raimund Schatz, accepted at the 5th KuVS GI/ITG Workshop “NG Service Delivery Platforms”, DOCOMO Euro-Labs, Munich, Germany.

“Quality of Experience of YouTube Video Streaming for Current Internet Transport Protocols” by Tobias Hoßfeld and Raimund Schatz, currently under submission at ACM Computer Communications Review; a technical report of University of Würzburg is available containing the numerical results, Technical Report No. 482: “Transport Protocol Influences on YouTube QoE”, July 2011.

" ‘Time is Bandwidth’? Narrowing the Gap between Subjective Time Perception and Quality of Experience” by Sebastian Egger, Peter Reichl, Tobias Hoßfeld, Raimund Schatz, submitted to IEEE ICC 2012 - Communication QoS, Reliability and Modeling Symposium

“Challenges of QoE Management for Cloud Applications” by Tobias Hoßfeld, Raimund Schatz, Martin Varela, Christian Timmerer, submitted to IEEE Communications Magazine, Special Issues on QoE management in emerging multimedia services

“Recommendations and Comparison of Subjective User Tests via Crowdsourcing and Laboratories for online video streaming”, intended for submission

“Impact of Fake User Ratings on QoE”, intended for Journal submission.