multimedia communications technical committee...
TRANSCRIPT
IEEE COMSOC MMTC Communications – Frontiers
http://mmc.committees.comsoc.org 1/56 Vol.13, No.1, January 2018
MULTIMEDIA COMMUNICATIONS TECHNICAL COMMITTEE http://www.comsoc.org/~mmc
MMTC Communications - Frontiers
Vol. 13, No. 1, January 2018
CONTENTS
Message from the MMTC Chair ......................................................................................3
SPECIAL ISSUE ON QoE Evaluation and Control in Immersive Multi-modal
Multimedia Applications ...................................................................................................4
Guest Editors: Pedro A. A. Assunção 1, 2 and Erhan Ekmekcioglu 3 ...................................4 1 Instituto de Telecomunicações, Portugal ..........................................................................4 2 Instituto Politécnico de Leiria, Leiria, Portugal ...............................................................4 3 Loughborough University London, United Kingdom ........................................................4
[email protected]; E.Ekmekcioglu@lboro.ac.uk....................................................................4
Evaluating QoE of Immersive Multisensory Experiences .............................................6
Niall Murray1,2, Yuansong Qiao2, Conor Keighrey1, Darragh Egan1, Débora Pereira
Salgado1, Gabriel Miro Muntean3, Christian Timmerer4, Oluwakemi A Ademoye5,
Gheorghita Ghinea6, Brian Lee2 ..........................................................................................6 1Dept. Of Electronics & Informatics, Faculty of Engineering & Informatics, Athlone
Institute of Technology, Ireland ...........................................................................................6 2Software Research Institute, Athlone Institute of Technology, Ireland ..............................6 3School of Electronic Engineering, Dublin City University, Ireland...................................6 4Dept. Of Information Technology, Alpen-Adria-Universitat Klagenfurt, Austria .............6 5Faculty of Architecture, Computing and Engineering, University of Wales Trinity St.
David, UK ............................................................................................................................6 6Dept. Of Computer Science, Brunel University, United Kingdom .....................................6
Psychophysiological Methods for Quality of Experience Research in Virtual Reality
Systems and Applications ................................................................................................14
Miguel Barreda-Ángeles, Rafael Redondo-Tejedor, Alexandre Pereda-Baños ................14
Eurecat – Technology Centre of Catalonia, Barcelona, Spain .........................................14
[email protected]; [email protected];
[email protected] .............................................................................................14
QoE Concerns and Measurement in Augmented Reality Applications ......................21
Patrick Seeling ...................................................................................................................21
Department of Computer Science, Central Michigan University, MI, USA
pseeling@ieee.org..............................................................................................................21
Emerging levels of immersive experience in MPEG-I video coding ...........................24
Dragorad Milovanovic, Dragan Kukolj ............................................................................24
Dept. of Computer Engineering, Faculty of Engineering, University of Novi Sad, Serbia
24
dragan.kukolj@rt-rk.com...................................................................................................24
IEEE COMSOC MMTC Communications – Frontiers
http://mmc.committees.comsoc.org 2/56 Vol.13, No.1, January 2018
Trends in QoE for immersive experiences.....................................................................27
Andrew Perkis, Sebastian Arndt ........................................................................................27
Department of Electronic Systems .....................................................................................27
The Norwegian University of Science and Technology, Trondheim, Norway ...................27
[email protected], [email protected] ................................................................27
SPECIAL ISSUE ON Content Caching and Sharing in Wireless Networks .............33
Guest Editors: ....................................................................................................................33
Zheng Chang, University of Jyväskylä, Finland ................................................................33
Zhenyu Zhou, North China Electric Power University, China ..........................................33
[email protected], [email protected] ............................................................33
Content Caching and Push in Small Cells with Renewable Energy ...........................36
Jie Gong .............................................................................................................................36
School of Data and Computer Science, .............................................................................36
Sun Yat-sen University, Guangzhou 510006, China ..........................................................36
Energy Efficiency Analysis of 5G Content Caching System ........................................41
Di Zhang1,2, Zhenyu Zhou2, Zhengyu Zhu1, Shahid Mumtaz4 ............................................41 1School of Information Engineering, Zhengzhou University, Zhengzhou, 450-001, China.
41 2Department of Electric and Computer Engineering, Seoul National University, Seoul,
151-742, Korea. .................................................................................................................41 3State Key Laboratory of Alternate Electrical Power System with Renewable Energy
Sources, School of Electrical and Electronic Engineering, North China Electric Power
University, Beijing, 102206, China....................................................................................41 4Instituto de Telecomunicações, Aveiro, 1049-001, Portugal. ..........................................41
Energy-Efficient Design for Latency-tolerant Content Delivery Networks ...............46
Thang X. Vu, Lei Lei, and Satyanarayana Vuppala ..........................................................46
The Interdisciplinary Centre for Security, Reliability and Trust (SnT), U niversity of
Luxembourg, 29 Avenue John ............................................................................................46
F. Kennedy, Luxembourg. Email: {thang.vu, lei.lei, satyanarayana.vuppala}@uni.lu ....46
Cooperative Content Caching and Distribution in Multihop D2D-V2V Networks ...51
Yahui Wang*, Zhenyu Zhou*, Houjian Yu*, and Chen Xu* ..............................................51
*School of Electrical and Electronic Engineering, North China Electric Power
University, Beijing, China..................................................................................................51
MMTC OFFICERS (Term 2016 — 2018) .....................................................................56
IEEE COMSOC MMTC Communications – Frontiers
http://mmc.committees.comsoc.org 3/56 Vol.13, No.1, January 2018
Message from the MMTC Chair
Dear MMTC colleagues and friends,
I hope this new year 2018 finds you well. On behalf of all the MMTC officers, I would like to wish all
of you and your families a very Happy New Year 2018. May this year bring you all health, happiness,
and prosperity.
I would also like to take this opportunity to thank those who were able to attend the MMTC meeting
at Globecom 2017. The Communication Software, Services and Multimedia Applications Symposium
(CSSMA) at Globecom and ICC is sponsored by MMTC and I encourage all of you to continue to be
actively involved in CSSMA and submit papers there. The next CSSMA will be at ICC 2018 in Kansas
City, Missouri, USA from May 20-24, 2018, and we hope to see you all there.
The MMTC officers would like to encourage members to be actively involved in the TC as well as
help recruit new members. Membership is open to all those who are interested, and more information
can be found at the TC website, http://mmc.committees.comsoc.org. MMTC provides members the
opportunity to actively serve the community by submitting nominations for associate editorship to
journals, special issue proposals, conference chairs, and ComSoc distinguished lecturers. In addition,
the TC can help assist members in the process for elevation to senior and fellow grades. MMTC will
also soon be seeking new officers since the current officer’s term will end in May 2018.
I would also like to point out that Dr. Wenwu Zhu, editor in chief for the IEEE Transactions on
Multimedia (TMM), has a call for nominations for new associate editors. As a sponsoring TC for TMM,
MMTC will be submitting up to 3 candidates for consideration to the TMM steering committee. We
encourage members who are interested to submit nominations by January 31, 2018 following the
instructions in the e-mail sent by Dr. Shiwen Mao. Nominations should include a two page CV,
supporting letters from two senior leaders in the community, and a one page summary of the member’s
past involvement with MMTC.
There will also soon be a call for two service awards. Please be on the lookout for the call and I
encourage you all to submit nominations.
Recently, the TC submitted nominations for Dr. Wanqing Li to serve as a MMTC representative to the
ICME steering committee, Dr. Liang Zhou to serve as the MMTC representative to the CCNC steering
committee, and Dr. Guosen Yue and Dr. Qing Yang as Globecom 19 CSSMA symposium co-chairs.
In addition, Dr. Shiwen Mao, who is the current MMTC chair, was elected as the Chair of the
Distinguished Lecturers' Selection Standing Committee as well as the Vice-Chair for the Technical
Educational and Activities Committee (TEA-C) for IEEE ComSoc for the 2018-2019 term.
Congratulations to all!
I would also like to thank the special issue editors and publication board for their hard work in
publishing this issue of MMTC Frontiers.
Sincerely,
Sanjeev Mehrotra
Vice-Chair, Multimedia Communications Technical Committee, IEEE Communications Society
IEEE COMSOC MMTC Communications - Frontiers
http://mmc.committees.comsoc.org 4/56 Vol.13, No.1, January 2018
SPECIAL ISSUE ON QoE Evaluation and Control in Immersive Multi-modal Multimedia
Applications
Guest Editors: Pedro A. A. Assunção 1, 2 and Erhan Ekmekcioglu 3
1 Instituto de Telecomunicações, Portugal 2 Instituto Politécnico de Leiria, Leiria, Portugal
3 Loughborough University London, United Kingdom
[email protected]; [email protected]
The overarching goal of immersive multimedia environments is to enable life-like experiences and interactions
between people and other media objects, in other words, to cement the gap between the physical and virtual entities.
Amongst the primary objectives is the facilitation of the sense of presence, through the use of multiple sensory
channels like vision, audio, haptics and (more recently) olfaction and taste. Applications that are based on immersive
and multimodal multimedia are used in multiple areas, including but not limited to media and entertainment, gaming,
healthcare (like remote robotic surgery, emergency remote intervention), transport (driver assistance, intelligent
navigation), commerce, digital manufacturing and design, and training. Research and development in various forms
of immersive multimedia is not new. Underpinning technologies, such as multi-camera and light-field capture and
processing systems; coding, distribution, and rendering of three-dimensional environments on high-end display
systems (video and spatial audio); and novel human-computer interfaces, have been widely researched. But especially
with the surge of consumer-grade Virtual Reality (VR) headsets in the past few years, and increasing availability of
mobile Augmented Reality (AR) systems that are powered by advanced object recognition, tracking and localisation
techniques, there has been a considerable rise in the innovation of new use cases for immersive multimedia.
Despite many advances to date, many challenges still remain to be addressed to ensure the practicality and widespread
adoption of emerging immersive experiences. The factors influencing the Quality of Experience (QoE), which has
been widely studied in the context of traditional multimedia applications, needs to be fully understood in the context
of multi-sensory immersive experiences. Not only the video, graphics and audio fidelity are included in the QoE, but
also the level of affective engagement, feel of presence, comfort of use without issues related to cyber-sickness,
intuitiveness of interaction means, and effectiveness of the storytelling approaches count towards the QoE of
immersive experiences. The impact of the constraints associated with the entire processing pipeline, like delay,
bandwidth, devices (e.g., graphical processing power, display resolution, field-of-view, multi-sensory
synchronization) should be modelled to predict the QoE of immersive experiences. On the other hand, the power of
other physiological signals that can be captured via lightweight wearable sensors in monitoring the affective state of
users and indirectly inferring the levels of real-time QoE should be harnessed in the design and control of immersive
applications. In this Special Issue, authors highlight their research findings and perspectives on the topic of Quality of
Experience in immersive multi-sensory multimedia applications.
The first contribution titled “Evaluating QoE of Immersive Multisensory Experiences” by N. Murray, et. al., outlines
a comprehensive overview of the authors’ past works addressing the evaluation of Quality of Experience in immersive
multi-sensorial experiences, in particular olfaction-enhanced multimedia. Authors suggest that understanding user’s
QoE of olfaction based applications is not trivial, and describe a number of research challenges for the multimedia
community in that respect.
The second contribution titled “Psychophysiological Methods for Quality of Experience Research in Virtual Reality
Systems and Applications” by M. Barreda-Ángeles, et. al., provides an outline of the use of psycho-physiological
sensing and measurement technologies on the QoE research associated with immersive applications. The authors
suggest that aspects such as spatial presence, attentional and emotional involvement, cognitive effort, or stress, are
important to explain how a user feels experiencing an immersive service, and psychophysiological-based methods
may be the best option currently available for them to be understood.
In his paper titled “QoE Concerns and Measurement in Augmented Reality Applications”, P. Seeling sheds light on
the current efforts in determining and modelling the QoE associated with Augmented Reality applications. The author
indicates that when determining the QoE in AR scenarios, media fidelity as well as the contrast and colours as a
IEEE COMSOC MMTC Communications - Frontiers
http://www.comsoc.org/~mmc/ 5/56 Vol.13, No.1, January 2018
consequence of the overlay of content with the real world should be considered. The combination of them is typically
neither readily determinable nor steady and therefore requires new considerations for perceptual models. Furthermore,
the use of brain-computer interfaces for measuring the QoE of AR applications is discussed.
The fourth contribution titled “Emerging levels of immersive experience in MPEG-I video coding”, written by D.
Milovanovic and D. Kukolj provides a summary of the current activities related to the MPEG development and
research activities of the emerging immersive media technologies. They briefly introduce the readers to the exploration
project MPEG-I (Coded representation of immersive media).
Finally, in their paper titled “Trends in QoE for immersive experiences”, A. Perkis and S. Arndt focus on the
importance of novel storytelling methods to fuel the success of emerging immersive media applications. The authors
indicate that digital storytelling is all about enhancing the QoE in the rich digital media, changing the passive viewer
into a participating, engaged and immersed user. The methods vary and include adding interactivity, increasing
dimension, mixing realities all the way through to creating content that triggers more senses. Their research focusses
on the creative aspects of new digital media, designing new platforms for combining art and technology and creating
immersive and interactive content in public spaces (an example project described in detail) and the measurement of
the QoE.
With this Special Issue we have no intent to present a complete picture on the state of the QoE measurement and
control for immersive multimodal multimedia services. However, we hope that the presented papers provide the
audience with a brief tutorial and valuable insight into the persisting challenges in the area, and predictions for the
future research.
Our special thanks go to all authors for their precious contributions to this Special Issue. We would also like to
acknowledge the gracious support from the Board of MMTC Communications - Frontiers.
Pedro A. A. Assunção received the Licenciado and M.Sc. degrees from the University of Coimbra,
in 1988 and 1993, respectively, and the Ph.D. in Electronic Systems Engineering from the University
of Essex, in 1998. He is currently professor of Electrical Engineering and Multimedia
Communication Systems at the Polytechnic Institute of Leiria and a senior researcher at the Institute
for Telecommunications, Portugal. His current research interests include high efficiency and 360-
degree, multi-view video and light field coding, multiple description and robust coding, error
concealment and quality evaluation. He is a senior member of the IEEE.
Erhan Ekmekcioglu received his Ph.D. degree from University of Surrey, UK, in 2010. Between
2010 and 2017 he worked as a post-doctoral researcher in University of Surrey and Loughborough
University, respectively, specialising in the field of video processing and communications. He is
currently a senior lecturer at the Institute for Digital Technologies, Loughborough University
London. His research interests include 2D/3D and multi-view video processing, coding, and
transport, quality of experience, immersive and interactive multimedia. He co-authored around 50
peer-reviewed research articles, book chapters, and a book on 3D-TV systems.
IEEE COMSOC MMTC Communications - Frontiers
http://www.comsoc.org/~mmc/ 6/56 Vol.13, No.1, January 2018
Evaluating QoE of Immersive Multisensory Experiences
Niall Murray1,2, Yuansong Qiao2, Conor Keighrey1, Darragh Egan1, Débora Pereira Salgado1,
Gabriel Miro Muntean3, Christian Timmerer4, Oluwakemi A Ademoye5, Gheorghita Ghinea6,
Brian Lee2
1Dept. Of Electronics & Informatics, Faculty of Engineering & Informatics, Athlone Institute of
Technology, Ireland 2Software Research Institute, Athlone Institute of Technology, Ireland
3School of Electronic Engineering, Dublin City University, Ireland 4Dept. Of Information Technology, Alpen-Adria-Universitat Klagenfurt, Austria
5Faculty of Architecture, Computing and Engineering, University of Wales Trinity St. David, UK 6Dept. Of Computer Science, Brunel University, United Kingdom
1. Introduction
Recent technological advances have led to a profound increase the quality of multimedia content, in addition to
different ways in interacting with and consuming it. Technologies such as Virtual Reality (VR), 360-degree video,
Augmented Reality (AR) and 3D audio aim to support novel immersive and interactive experiences. However, such
approaches towards immersion only stimulate two of the five human senses. Opportunities now exist to target the
human senses outside the traditional audio and visual, to include tactile, olfaction, and gustatory. Hence, it is possible
to develop applications that consider inputs across all senses, i.e., truly immersive and interactive multimedia
experiences. Such experiences may be influenced by the integration of different media formats, sensory modalities,
the context, the user and varying communication/delivery mechanisms; with the aim to increase the perceptual user
and quality of experience. Indeed such experiences are only possible by a multidisciplinary research approach which
involves (and is not limited to) multimedia, psychology (including experimental), human-computer interaction, social
computing and electronics among many others. In addition, the range of applications for virtual reality, augmented
reality, 360-degree video and multisensory experiences is quite diverse with related and unique research challenges.
Such domains include tele-presence, training/education, health, tourism, entertainment etc. Critical to the success of
these immersive and multisensory experiences (IMEx), is the fact that on a per application basis, it is crucial to
understand the perceptual user and the quality of experience (QoE).
In this context, the user QoE of IMEx is complex to model and as a research problem, is multifactorial and
multidimensional. QoE is defined in the QUALINET Whitepaper [1] as: “the degree of delight or annoyance of a
person whose experiencing involves an application, service, or system. It results from the person’s evaluation of the
fulfillment of his or her expectations and needs with respect to the utility and/or enjoyment in the light of the person’s
context, personality and current state”. QoE is a theoretical framework, it is a measurement-centered reflection of a
users’ perception of an application, system, or service. Therefore QoE aligns well with the multifactorial and
multidimensional challenge of modelling user perception of IMEx applications or services. A persons QoE in affect
by influencing factors, which are defined in [2][3] as being “any characteristic of a user, system, service or context
who actual state or setting may have influence of the QoE of the user”. There are a few articles that categorize such
IF’s in different manners [1][3][4] with commonality in the actual factors identified and explained. In [1] as per Fig.
1, the IF’s that effect user QoE are a function of the traditional QoS (device, network, content) metrics and
social/psychological aspects with an overarching categorization within the system, user and context factors.
In terms of olfaction-enhanced multimedia, the literature provides a number of key articles of how olfaction is and
can be employed in future multimedia applications [5][6][7][8][9][10][11][12]. Ghinea and Ademoye in [5] reviewed
works that employed olfaction as a media component in the areas of virtual reality and entertainment. They also
proposed potential future research directions of synchronization, olfactory display development and content
association. The authors of [6][7] presented the use of and potential for olfaction-enhanced multimedia applications
in areas such as education, training, e-health and virtual tourism as well as providing an overview of various
commercially available multisensory technologies. In [8], ten categories of smell experience were defined based on
feedback obtained from over 400 participants in a user study. Considering the rapid development of olfactory sensor
and display technology, olfaction-based multimedia applications are a realistic possibility
IEEE COMSOC MMTC Communications - Frontiers
http://www.comsoc.org/~mmc/ 7/56 Vol.13, No.1, January 2018
Figure 1. Factors influencing user Quality of Experience, adapted from [1]
technically and across a wide variety of application domains. [9][10][11] highlighted opportunities and challenges
around sensorial touch, taste and smell. They outlined key challenges around understanding sensory system processing
within context of HCI: which tactile, olfactory and gustatory experiences HCI designers should design for; designing
interfaces for sensory inputs e.g. olfaction but also interfaces that integrate multisensory experiences i.e. taste & smell.
Finally in [12], a suite of olfaction enhanced multimedia research challenges ranging from standardization, effects of
intensity and duration, application domains, delivery, display development as well as a key problem of methodologies
to evaluate QoE of olfaction enhanced multimedia was discussed.
2. IMEx QoE Methodologies
Evaluating user QoE of traditional media components is non-trivial and the addition of immersive and multisensorial
media components increases this challenge. No standardized methodology exists to conduct subjective quality
assessments of immersive & multisensorial media applications [12]. To date researchers have employed different
aspects of audiovisual standards [13][14] to assess user QoE. In terms of IMEx, the literature reports quality
assessment which can be based on two broad categories: implicit and explicit evaluations [15]. Explicit evaluations
require the user to report, post the experience, perceived quality using predefined scales (e.g. mean opinion score), or
open-ended questions. This has been dominant in efforts to capture user QoE of IMEx [9][16][17]. This said, the
literature also highlights numerous issues with explicit evaluations: time consuming; bias; and inaccuracies in
responses due to external factors [18][19].
Implicit evaluations aim to analyze the relationship between captured physiological measures and user QoE. They
have gained traction, in particular due to their real-time continuous nature. In [20], Engelke et al provide a survey of
psychophysiology-based QoE assessment across a range of multimedia applications. They highlight the advantages
and possible opportunities of capturing physiological data along with the psychological bases of perceptual and
cognitive processes. Further discussion is available is also available in [21][22]. Also in [21], the use of interaction
measures learning, effort required, response times, interaction, errors and satisfaction are employed. These all fall
within the human, system, and context domains of QoE and provide valuable objective data on user QoE from an
interaction perspective. In [12], specific to olfaction enhanced multimedia, the authors highlighted issues researchers
face from numerous perspectives including applicability (or lack of) existing audiovisual standards to evaluate user
QoE and lack of result comparability due to varying approaches, specific requirements of olfactory-based
multisensorial media applications, and novelty associated with these applications. Finally, based on the diverse
approaches in the literature and the collective experience of authors, [12] provides a tutorial and recommendations on
the key steps to conduct olfactory-based multisensorial media QoE evaluation.
3. IMEx QoE studies
In recent times, QoE studies involving IMEx have started to emerge. As mentioned earlier, these have typically fallen
within either implicit or explicit assessment approaches. In this section, we highlight some efforts we have made in
the recent past with respect to QoE studies of IMEx. Initially our work on user perception of olfaction-enhanced
IEEE COMSOC MMTC Communications - Frontiers
http://www.comsoc.org/~mmc/ 8/56 Vol.13, No.1, January 2018
multimedia was inspired by that of Ademoye et al. [22][24][25][26][27]. They instantiated a model first proposed by
Wikstrand [28]. This model considered defined multimedia quality at technical and user perspectives. As such it
proposed consideration of quality at three levels: network, media and content. The network level considered the effects
of transmission over communications networks on user perceptual quality; the media-level considered the influence
on perceptual quality of how the media is coded for transport; and the content-level is concerned with the transfer of
information and level of satisfaction between the video media and the user, i.e. level of enjoyment [28].
Our work to-date [29]-[36] has complimented and extended this by considering network level effects like delay and
jitter [29][33], defining a user profile based on age, gender and culture[30][31][32]; by analyzing the influence on
QoE of scent type [35] and audio masking effects [34] (both content level); and finally how multiple olfactory streams
[33] impact user QoE. The results to date have revealed a number of interesting findings. All of the studies were
performed with respect to olfaction enhanced multimedia QoE have involved explicit assessment approaches,
borrowing facets from various ITU-T standards [13][14] and various ISO sensory analysis standards [37][38][39]. In
this context, the interested reader can view recommendations we proposed on how to perform olfaction enhanced
multimedia QoE evaluations in [12] with respect to assessor screening and training; olfaction-enhanced multimedia
equipment; laboratory and experimental design as well as methodology.
In terms of the impact on user QoE of network influencing factors (delay and jitter) [29][33], we found that users were
quite tolerable to large inter media skew levels as per Fig. 2. Fig. 2 show the user responses to the question of whether
or not skew levels between olfaction and visual media were annoying (with 5 meaning they did not detect any skew,
4 meaning that they detected skew but it was not annoying; 3 slightly annoying and 3-1 varying degrees as of
annoyance with 1 being very annoying). As per Fig. 2, assessors were willing to accept skew levels of +10s when
olfaction was presented before video and 5s when olfaction was presented before video.
In terms of the influence of human factors on olfaction enhanced multimedia QoE, we considered age, gender and
culture as per Fig. 3, Fig. 4 and Fig. 5. In terms of rating the impairment caused by the existence of a synchronization
error, Fig. 3 details the assessor annoyance at varying levels of skews. Assessors rated olfaction before video more
annoying than olfaction after video. The female group were much more sensitive to skew with olfaction before video
than the male group, with both groups reported similar annoyance to skew with olfaction after video. As per Fig. 4,
the younger female group were the most sensitive to skew, with the male (20-30 yrs and 30-40 yrs) and female (30-
40 yrs) group similar in terms of the their rating of skews. The two older groups were the most tolerant to skew.
In terms of defining the temporal boundaries for synchronizing olfactory and video media based on human factors,
we define “in-sync” and “out-of-sync” regions. These boundaries are based on the findings that users were tolerable
to certain skew levels, they defined as “not annoying” (i.e. An impairment rating of above 3.5). It also considers
differences in perception based on gender, age and nationality. The in-synch region spans between a maximum skew
of 0s to -5s/-15s when olfaction is ahead of video, and a maximum skew of 0s to +10s/+15s when olfaction is after
video depending on the age and gender and nationality of the user.
In terms of considering the influence of content level factors i.e. scent type (Fig. 6) and the presence of audio (Fig. 7),
some interesting observations can be made. Firstly from Fig. 6, for each of the scent types, whether pleasant or
unpleasant, it is clear that assessors found scents presented after video less annoying than before video. This is
particularly exaggerated with the “unpleasant” scent types such as foul and burnt. As per [35], 21 statistically
significant differences exist across the different skew levels per scent type. For 15 of the 21 of these; one pleasant and
one maybe unpleasant/pleasant or unpleasant scent type were being compared. Further work on the reasons for this
are required, but initial investigation suggests that the content of the video scene was emphasized with the scent. In
terms of temporal boundaries for synchronization of olfaction enhanced multimedia considering scent type, different
temporal boundaries exist per scent. Again if we consider a MOS of 3.5 as the minimum required rating, for the foul
scent type, presentation from 0s up to +15s was not annoying, whereas, small skew levels (e.g. -5s) when olfaction
was presented before video were below this threshold.
The findings of Fig. 2 and Fig. 7 compare the differences in annoyance levels for users when video only was enhanced
with olfaction (Fig. 2) and audiovisual media was enhanced by olfaction (Fig. 7) across the various skew levels. As is
clear when comparing both figures, users were much more sensitive to skew level in the absence of the audio. In
addition, the results favor the no audio component presentation when the inter-media presentation is in synch. These
results suggest the presence of the audio media component of a multisensorial stream acts as a mask
IEEE COMSOC MMTC Communications - Frontiers
http://www.comsoc.org/~mmc/ 9/56 Vol.13, No.1, January 2018
Fig. 2. Analysis of annoyance level per skew between olfaction Fig. 3. Gender analysis of annoyance level per skew with
and visual media with confidence interval based Confidence intervals based on a 99% confidence level [30].
on 99% confidence level. [29]
Fig. 4. Gender/Age Analysis of Fig. 5. Nationality based analysis of Annoyance Level per Skew Annoyance Level per Skew [30] with confidence intervals based on a 99% confidence level [32]
Fig. 6: Assessor perception of skew type considering scent type [35]. Fig. 7: Assessor perception of skew between olfaction and
audiovisual media [24].
for potential synchronization issues between the olfaction media component and video, hiding some of their negative
effects from the user. These results support and complement the findings in [40].
This concludes our brief overview of our studies in the area of olfaction enhanced multimedia QoE. With an eye to
the future, it is clear that we are only scratching the surface in terms of our understanding the user QoE of IMEx.
There is a significant shortage of research in this area. The delivery of implicit and explicit datasets by the multimedia
community would be very much welcome. In particular, development of such datasets which facilitate analysis to
determine correlations would be very valuable. Considering section 2, it is our belief that we require significantly
more research on the use of psychophysiology-based QoE assessment [20]. In this context, further work to validate
and extend the recommendations we highlighted previously [12] is required. In addition, the type of physiological
sensors employed needs to ensure ecological validity of the data. A collaborative approach is required that
encompasses the multimedia community in addition to HCI, psychology, electronics among many others. Finally since
IEEE COMSOC MMTC Communications - Frontiers
http://www.comsoc.org/~mmc/ 10/56 Vol.13, No.1, January 2018
the range of application domains is so varied, another major challenge from a QoE perspective is how we can address
context based influencing factors which transcends all layers of Fig. 1. A potential approach here may involve the
development of models that estimate or predict QoE.
4. Conclusion
In this article, we have presented a brief overview of our findings with respect to understanding user QoE of olfaction-
enhanced multimedia. We have considered numerous influencing factors as part of QoE evaluations inclusive of
network transmission related effects; human factors and content factors. Understanding user QoE of Olfaction based
applications is non-trivial, and as such we have proposed a number of research challenges for the multimedia
community to consider addressing this research challenge.
Acknowledgement
This work was partly funded by the Irish Research Council New Foundations Scheme.
References
[1] Le Callet, P., Möller, S., and Perkis,A.. 2012. “Qualinet White Paper on Definitions of Quality of Experience“. European
Network on Quality of Experience in Multimedia Systems and Services (COST Action IC 1003).
[2] Ebrahimi, T. 2009. “Quality of Multimedia Experience: Past, Present and Future”, ACM Multimedia Conference (MM’09), pp.
3-4.
[3] Stankiewicz, R, Cholda, P., Jajszczyk, A. 2011 “QoX: What is it Really?” In IEEE Communications Magazine, vol. 49, no. 4,
pp 148–158.
[4] Reiter, U., Brunnstrom, K., Moor, K. D., Larabi, M-C., Pereira, M., Pinheiro, A., You, J., Zgank, A. 2014. “Factors influencing
Quality of Experience”. In Quality of Experience, Advanced Concepts, Applications and Methods.
[5] Ghinea, G., Ademoye. O. A., 2011. “Olfaction-enhanced multimedia: perspectives and challenges”. In Multimedia Tools and
Applications, vol. 55, no. 3, pp. 601-626
[6] P. T. Kovács, N. Murray, G. Rozinaj, Y. Sulema and R. Rybárová. 2015. "Application of immersive technologies for education:
State of the art," 2015 International Conference on Interactive Mobile Communication Technologies and Learning (IMCL),
Thessaloniki, pp. 283-288, doi: 10.1109/IMCTL.2015.7359604
[7] Murray, N., Qiao, Y., Lee, B., Karunakar, AK, Muntean, G.-M., “Olfaction enhanced multimedia: A survey of application
domains, displays and research challenges". In ACM Computing Surveys 48:4, 2016.
[8] Obrist, M., Tuch, A.N., Hornbaek, K., 2014. “Opportunities for Odor: experiences with smell and implications for
technology” In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 2843-2852.
doi>10.1145/2556288.2557008
[9] Obrist, M., Vi, C., Ranasinghe, N., Israr, A., Cheok, A., Spence, C., Gopalakrishnakone, P., 2016. “Sensing the Future of HCI:
Touch, Taste, and Smell User Interfaces”. In Interactions, Vol. 23, Issue 5, pp. 40-49, 2016.
[10] Obrist, M., Gatti, E., Maggioni, E., CT Vi, Velasco, C.. “Multisensory Experiences in HCI”. In IEEE Transactions on
Multimedia, Vol. 24, Issue 2, pp. 9-13, June 2017.
[11] Spence, C., Obrist, M., Velasco, C. and Ranasinghe, N. 2017. “Digitizing the chemical senses: Possibilities & Pitfalls”. In
International Journal of Human-Computer Studies, vol. 107, pp. 62-74. https://doi.org/10.1016/j.ijhcs.2017.06.003
[12] Murray, N., Ademoye, O. A., Ghinea, G., Muntean, G.-M. 2017. “A Tutorial for Olfaction-based Multisensorial Media
Application Design and Evaluation” In ACM Computing Surveys (CSUR), Vol. 50, Issue 5, Article No. 67,
doi>10.1145/3108243
[13] ITU-T BT.500. Methodology for the subjective assessment of the quality of television pictures, 2002.
[14] ITU-T P.910. Subjective video quality assessment methods for multimedia applications, 2008.
[15] J. Puig, A. Perkis, F. Lindseth and T. Ebrahimi, "Towards an Efficient Methodology for Evaluation of Quality of Experience in Augmented
Reality," in Quality of Multimedia Experience (QoMEX), 2012.
[16] J. Cha, M. Eid, A. Barghout, A. M. Rahm and A. El Saddik, "HugMe: Synchronous Haptic Teleconferencing," in ACM international
conference on Multimedia, 2009.
[17] A. Drachen, L. E. Nacke, G. Yannakakis and A. L. Pedersen, "Correlation between Heart Rate, Electrodermal Activity and Player Experience,"
in SIGGRAPH Symposium on Video Games, 2010.
[18] T. Hoßfeld, R. Schatz and S. Egger, "SOS: The Mos Is Not Enough!," in Quality of Multimedia Experience (QoMEX), 2011
[19] E. Kroupi, P. Hanhart, J.-S. Lee, M. Rerabek and T. Ebrahimi, "Modeling Immersive Media Experiences by Sensing Impact on Subjects,"
Multimedia Tools and Applications, vol. 75, p. 12409–12429, 2016.
[20] U. Engelke, D. P. Darcy, G. H. Mulliken, S. Bosse, M. G. Martini, S. Arndt, J.- N. Antons, K. Y. Chat, N. Ramzan and K. Brunnström,
"PsychophysiologyBased QoE Assessment: A Survey," IEEE Journal of Selected Topics in Signal Processing, 2017.
[21] Keighrey, C., Flynn, R., Murray, S., Brennan, S., and Murray, N. 2017. "Comparing user QoE of AR and VR applications using physiological
and interaction measurements”. In 25th ACM International Conference on Multimedia (ACM MM 2017), Thematic Workshop, Oct 2017, in Mountain View, CA.
IEEE COMSOC MMTC Communications - Frontiers
http://www.comsoc.org/~mmc/ 11/56 Vol.13, No.1, January 2018
[22] Keighrey, C., Flynn, R., Murray, S., and Murray, N. "A QoE Evaluation of Immersive Augmented and Virtual Reality Speech
& Language Assessment Applications". "A QoE evaluation of immersive augmented and virtual reality speech & language
assessment applications," 2017 Ninth International Conference on Quality of Multimedia Experience (QoMEX), Erfurt, 2017,
pp. 1-6.
doi: 10.1109/QoMEX.2017.7965656.
[23] Ademoye, O. A. and Ghinea. G. 2009. Synchronization of Olfaction-Enhanced Multimedia. IEEE Trans. Multimed., vol. 11,
no. 3, pp. 561–565.
[24] Ghinea, G. and Ademoye, O. A. 2010. “Perceived Synchronization of Olfactory Multimedia”. In IEEE Trans. on SYSTEMS,
MAN, AND CYBERNETICS – PART A: SYSTEMS AND HUMANS vol. 40, issue 4, pp. 657-663.
[25] Ghinea, G., Ademoye, O. A. 2012. “The Sweet Smell of Success: Enhancing Multimedia Applications with Olfaction”, ACM
Transactions on Multimedia Computing, Communications and Applications (TOMM), vol. 8, no. 1, article 2.
[26] Ghinea, G., Ademoye, OA. 2009. “Olfaction-enhanced multimedia: Bad for information recall?” International conference on
Multimedia and Expo (ICME), pp. 970-973.
[27] Ghinea, G. and Ademoye, OA. 2012. “User perception of media content association in olfaction-enhanced multimedia”. In
ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), vol. 8, no. 4, article 52.
[28] Wilkstrand, G. 2003. “Improving user comprehension and entertainment in wireless streaming media” In Introducing
Cognitive Quality of Service, Department of Computer Science, Umea, Sweden.
[29] Murray, N., Qiao, Y., Lee, B., Karunakar, AK, Muntean, G.-M. 2013. “Subjective Evaluation of Olfactory and Visual Media
Synchronization” In Proceedings of ACM Multimedia Systems conference. Feb 26 - March 1, Oslo, Norway.
[30] Murray, N., Qiao, Y., Lee, B., Muntean, G.-M., Karunakar, AK,. 2013. “Age and Gender Influence on Perceived Olfactory
and Visual Media Synchronization” In Proceedings of IEEE International Conference on Multimedia and Expo (ICME), San
Jose, CA, 2013, pp. 1-6.
doi: 10.1109/ICME.2013.6607467
[31] Murray, N., Lee, B., Qiao Y., and Muntean, G.-M. 2016. "The Influence of Human Factors on Olfaction based Mulsemedia
Quality of Experience" . In 8th International Conference on Quality of Multimedia Experience (QoMEX), Lisbon, 2016, pp. 1-
6.
doi: 10.1109/QoMEX.2016.7498975
[32] Murray, N., Qiao Y., Lee, B., and Muntean, G.-M. 2014. "User-Profile-Based Olfactory and Visual Media
Synchronization". ACM Transactions on Multimedia Computing Communications and Applications (TOMM), vol 10. issue
1s, article 11, doi>10.1145/2540994.
[33] Murray, N., Lee, B.,Qiao, Y., Muntean, G.-M.. 2014. “Multiple-Scent Enhanced Multimedia Synchronization” In ACM
Transactions on Multimedia Computing, Communications, and Applications (TOMM). Volume 11 Issue 1s, September
2014 Article No. 12 doi>10.1145/2637293.
[34] Ademoye, O.A., Murray, N., Muntean, G.-M. and Ghinea, G. 2016. "Audio Masking Effect on Inter-Component Skews in
Olfaction-Enhanced Multimedia Presentations". In ACM Transactions on Multimedia Computing Communications and
Applications (TOMM), vol. 12, issue 4, article 51.
[35] Murray, N., Lee, B., Qiao, Y., Muntean, G.-M., 2017. "The Impact of Scent Type on Olfaction-enhanced Multimedia Quality
of Experience". In IEEE Transactions on Systems, Man, and Cybernetics, vol. 47, issue 9, pp. 2503-2515, doi>
10.1109/TSMC.2016.2531654
[36] Egan, D., Keighrey, C., Barrett, J., Qiao, Y., Brennan, S., Timmerer, C., and Murray, N. "Subjective Evaluation of an Olfaction Enhanced
immersive Virtual Reality Environment" . In Proceedings of the 2nd International Workshop on Multimedia Alternate Realities, in Mountain View, CA., pp. 15-18, 2017, doi>10.1145/3132361.3132363
[37] ISO 5492:2008 Sensory analysis – Vocabulary, International Standards Organization (ISO)
[38] ISO/IEC 8589 Sensory analysis – General guidance for the design of test rooms.
[39] ISO 5496:2006 – Sensory Analysis – Methodology – Initiation and training of assessors in the detection and recognition of odours
[40] B.R. Brkic, A. Chalmers, K. Boulanger, S. Pattanaik, and J. Covington. 2009. “Cross-modal effects of smell on the real-time rendering of
grass” Spring Conference on Computer Graphics (SCCG), Budmerice, Slovakia, pp. 161-166.
Niall Murray is a Lecturer with the Faculty of Engineering and Informatics, in the Athlone Institute
of Technology (AIT), Ireland. He is founder (in 2014) and principal investigator (PI) in the truly
Immersive and Interactive Multimedia Experiences (tIIMEx) research group in AIT. He is a Science
Foundation Ireland (SFI) Funded Investigator (FI) in the Confirm Centre for Smart manufacturing
and an associate PI on the Enterprise Ireland funded Technology Gateway COMAND. His current
research interests include immersive and multisensory multimedia communication and applications,
multimedia signal processing, quality of experience, and wearable sensor systems. He has published
over 40 works in top-level international journals and conferences and book chapters. Further information available at:
www.niallmurray.info
IEEE COMSOC MMTC Communications - Frontiers
http://www.comsoc.org/~mmc/ 12/56 Vol.13, No.1, January 2018
Yuansong Qiao is a senior research fellow and a Science Foundation Ireland funded Investigator
working in the Software Research Institute (SRI) at Athlone Institute of Technology. He received
his Ph.D. in Computer Applied Technology from the Institute of Software, Chinese Academy of
Sciences, Beijing, China, in 2007. He received a BSc and an MSc in Solid Mechanics from Beijing
University of Aeronautics and Astronautics, China in 1996 and 1999 respectively. His research
interests include Information Centric Networking, Software Defined Networking, and networking
support for emerging multimedia and big data systems..
Conor Keighrey is a research candidate in the Athlone Institute of Technology (AIT), Ireland.
He received his BSc. in Computer Network Management and Cloud Infrastructure in 2016 and is
currently in pursuit of his PhD. His current research work focuses on understanding the key
influencing factors that affect quality of experience of emerging immersive multimedia
experiences (Augmented Reality and Virtual Reality).
Débora Pereira Salgado is a research candidate in the Athlone Institute of Technology (AIT),
Ireland. She has a BS. in Biomedical Engineering from the Universidade Federal de Uberlândia in
2017 and is currently in pursuit of his PhD. Her current research work focuses on evaluating the
utility and relationship of various physiological metrics and user quality of experience of emerging
immersive multimedia experiences.
Gabriel-Miro Muntean is an Associate Professor with the Dublin City University (DCU) School
of Electronic Engineering and co-director of the DCU Performance Engineering Laboratory,
Ireland. His research interests include quality, performance, and energy saving issues related to
multimedia and multiple sensorial media delivery, technology-enhanced learning, and other data
communications over heterogeneous networks. He has published more than 300 papers in top-level
international journals and conferences, and has authored three books and 16 book chapters and
edited six additional books. He is an Associate Editor for IEEE Transactions on Broadcasting, an
Editor for the IEEE Communications Surveys & Tutorials, and a reviewer for important international journals,
conferences, and funding agencies. He is project coordinator for the EU-funded project NEWTON
(http://newtonproject.eu).
Kemi Ademoye is a lecturer in computing at University of Wales Trinity Saint David. Her
research interests revolve around the areas of multisensory computing, user interfaces and
human-computer interaction, enterprise data integration and application development processes.
Kemi has an MSc and PhD from Brunel University..
Christian Timmerer is an associate professor in the Department of Information Technology
(ITEC), Multimedia Communication Group (MMC), Alpen-Adria-Universität Klagenfurt, Austria.
His research interests include the immersive multimedia communication, streaming, adaptation, and
Quality of Experience. He was the general chair of WIAMIS 2008 and QoMEX 2013 and has
participated in several EC-funded projects, notably DANAE, ENTHRONE, P2P-Next,
ALICANTE, SocialSensor, and COST IC1003 QUALINET. He also participated in ISO/MPEG
work for several years, notably in the area of MPEG-21, MPEG-M, MPEG-V, and MPEG-DASH.
He received his PhD in 2006 from the Alpen-Adria-Universität Klagenfurt. In 2012 he co-founded bitmovin.com to
provide professional services around MPEG-DASH.
Gheorghita Ghinea is a Professor of Mulsemedia Computing in the Computer Science
Department at Brunel University, United Kingdom. His research activities lie at the confluence of
Computer Science, Media and Psychology. In particular, his work focuses on the area of building
end-to-end communication systems incorporating user perceptual requirements. He has authored
IEEE COMSOC MMTC Communications - Frontiers
http://www.comsoc.org/~mmc/ 13/56 Vol.13, No.1, January 2018
over 300 articles in peer-reviewed journals and conferences and co-edited two books on Digital Multimedia
Perception and Design, and Multiple Sensorial Media Advance and Applications.
Brian Lee is director of the SRI in the Athlone IT. He joined AIT in August 2009, having previously
been Research manager in LM Ericsson in Ireland where he supervised a team of 20 researchers
investigating solutions in network management for Ericsson’s Operations Support System (OSS) for
mobile and fixed networks. He has over 20 years of experience in research and system design of
network management solutions for large scale telecommunications networks. He has participated in
many national and international research projects and is coordinator for the H2020 project on Cyber
security, Protective (https://protective-h2020.eu/). He holds a PhD from Trinity College Dublin in
the area of policy management applied to charging. His research interests focus on security and self-adaptive software
systems for network management.
IEEE COMSOC MMTC Communications - Frontiers
http://www.comsoc.org/~mmc/ 14/56 Vol.13, No.1, January 2018
Psychophysiological Methods for Quality of Experience Research in Virtual Reality
Systems and Applications
Miguel Barreda-Ángeles, Rafael Redondo-Tejedor, Alexandre Pereda-Baños
Eurecat – Technology Centre of Catalonia, Barcelona, Spain
[email protected]; [email protected]; [email protected]
1. Introduction
Virtual Reality (VR) technology has lived an impressive development in recent years. Although the term and
application has been there for decades, few years ago the density of integrated circuits and displays, the computer
performance and the memory capacity began to allow the construction of head mounted devices (HMD) to simulate
virtual spaces. For that, the HMD projects two spatially and temporal coherent video signals to each user´s eye. The
visual system is later responsible of forming a 3D scene by the so called stereopsis neural mechanism, resulting in an
intense immersive experience [1].
Although 3D audio technology is being relegated to a second plane in the VR field, it is lately that the importance of
localizing sound in 3D space is coming to an evidence to have profound immersive experiences. Analogously to the
Human Visual System (HVS), by means of dedicated filters of Head Related Transfer Function (HRTF) for specific
orientations, the ambisonics technology is able to recreate the perceptual auditory illusions in the Human Auditory
System (HAS). The effect turns in listening a sound from any pair of azimuth and elevation by a simply two channel
(stereo) audio streams, also called binaural listening [2].
VR is a transversal technology which spreads beyond the realm of video games, currently its major contributor in
terms of development and market incomes [3]. VR applications can be found in TV and cinema, documentary, medical
systems, education, museums, journalism, modelling, industrial processes or therapeutic treatment, just to name a few.
Technology companies, related to these fields to a greater or lesser extent, and specially the major international
companies able to provide VR-related technologies, are aware of such potentiality and their investments are said to be
raised from $3 billion in 2016 to $6 billion in 2017. Some recent key facts in 2017 are that Facebook dropped to $200
a wireless VR headset, HTC Vive and Samsung Gear glasses have also received a second important upgrade, Adobe
acquired the company Mettle for VR post-production, Apple announced a dedicated VR framework for developers
and acquired the VR plugins suite Dashwood, and Google released the VR glasses Daydream. Several IEC technical
committees (TCs) and their subcommittees (SCs) produce International Standards for hard- and software used in this
domain. For example:
- ISO/IEC JTC 1, the Joint Technical Committee of IEC and the International Organization for Standardization
(ISO), cover standardization for information technology.
- Subcommittee, ISO/IEC JTC 1/SC 24 works on interfaces for information technology-based applications
relating to computer graphics and virtual reality, image processing, environmental data representation,
support for mixed and augmented reality, and interaction with, and visual presentation of information.
- Sensors are vital components of VR technology. IEC TC 47 and its Subcommittees produce Standards for
microelectromechanical systems (MEMS), to ensure that sensors and such systems work reliably and
efficiently.
- The activities of IEC TC 100 contribute to the quality, performance and interoperability of audio, video and
multimedia systems and equipment
- IEC TC 110 covers electronic display devices and certain components, such as dashboard touchscreens in
cars
A key factor holding back the widespread adoption of VR by consumers is the ability of the current VR systems to
provide satisfactory user’s experiences [2]. Compared to more traditional audiovisual systems, VR systems not only
involve additional factors that may negatively impact the experience, but also have contexts of use in which completely
novel issues regarding user’s experience may arise. Optical limitations are present to conceive a coherent monoscopic
video capture from 360 multi-camera systems due to parallax effects [4]. This is even more challenging for
IEEE COMSOC MMTC Communications - Frontiers
http://www.comsoc.org/~mmc/ 15/56 Vol.13, No.1, January 2018
stereoscopic video format, which is probably eventually an ill-posed problem. Thus, image processing algorithms
based on optical flow and depth estimation aim at mitigating artifacts between stitched areas from different video-
streams, which may derive to ghost effects and perspective loses [5]. Analogously the most advanced ambisonics
microphones are compound of multiple capsules (32 for 4th order) and Youtube the major 360 video streaming
platform currently supports 1st order ambisonics. Moreover, reducing front, back and bottom confusion areas is still
challenging for audio processing. Another important limitation in perceptual terms is on the HMD side. The human
FoV for monocular vision is about 170°-175° Field of View (FoV) and most headset devices currently offer about
100°-110° [6]. Averaged pixel density is about 500 ppi which is still not enough, given an averaged human visual
acuity of 5 arc minutes. The Fresnel lenses are mounted probably to achieve thinner thickness for short distance focus
point. But such design suffers from clearly visible concentric circular-shaped artifacts. Among stereoscopic and
context-awareness, depth-from-focus is one of the HVS mechanisms to infer the relative size of objects in 3D scenario.
This last factor is simply not there since the HDM optical systems imply a fixed focus length about 3-4 cm[6]. Some
experimental optical systems allow several focal planes, but still far from the smooth focal transition experienced in
real life [7]. All these technical limitations are finally translated into perceptual incoherence of audio-visual stimuli.
A sound not aligned with its visual source, an object that vanishes or gets distorted as it moves or an impossible focus
perspective can cause brain confusion, fatigue and to the extreme be responsible for unpleasant experiences like
cybersickness.
The development of methods for analyzing user’s Quality of Experience (QoE) in VR systems and applications is
therefore a topic that deserves attention by researchers in the field. The aim of the present article is to provide an
overview of the current state of the art, and so to stress the advantages of such methods as well as the main faced
challenges.
2. Psychophysiological Methods and QoE Research
Psychophysiological methods can be defined as methods to provide information on psychological states of an
individual based on the analysis of his or her physiological responses. They include the measurement of activity
proxies of the peripheral nervous system, such as electrodermal activity (EDA), heart rate (HR) and heart rate
variability (HRV), electromyography (EMG, and, particularly, facial EMG), or respiration rate, as well as the analysis
of activity of the central nervous system, using techniques such as electroencephalography (EEG). Such signals reflect
diverse cognitive and emotional processes that can be useful to understand how the user thinks and feels in a certain
moment. For instance, increases in EDA have been related to experiences of emotional arousal, cognitive effort, or
stress, HRV is considered to reflect parasympathetic nervous activity associated to attentional focus and emotional
regulation, and facial EMG over certain facial muscles is adequate to represent the hedonic valence of the emotions
experienced by the user [5].
QoE is generally considered as a multidimensional concept that can be defined as “the delight of degree or annoyance
of the user of an application or service”, as resulting from “the fulfillment of his or her expectations with respect to
the utility and/or enjoyment of the application or service in the light of the user’s personality and current state” [3].
Research on QoE with immersive systems such as 3DTV has traditionally grounded on self-reported methods such as
psychophysics scales or questionnaires. However, despite the undeniable utility of such methods, in the last years
several researchers have highlighted their limitations, and made the case for the use of psychophysiological methods
[4] for exploring different dimensions of QoE.
Standard evaluations of stereoscopic image quality often rely on the psychophysical scaling methods initially proposed
in ITU recommendations BT1438, or ITU-R BT.500, which allow quantifying specific factors such as the degree of
perceived sharpness, or more general ones, such as overall image quality. Traditional self-reported methods usually
involve asking the user to make a judgment about the quality after watching a certain content (or conducting a certain
task), so forcing him or her to summarize in a single score an experience that happens over time. This involves losing
information on the temporal development of the quality of the experience, and also makes the scores subject to memory
issues and cognitive biases (e.g. the peak-end rule). In the best cases, continuous assessment systems involve the user
giving continuous judgments about the quality, which can overcome some of those limitations, but still require the
user to split his or her attention between the content and the assessment task, and hence may reduce ecological validity
of the test.
By contrast, psychophysiological signals can be collected with a high sampling rate during the experience, making it
easy to synchronize changes in the signals with events in the content or application, and also providing more objective
IEEE COMSOC MMTC Communications - Frontiers
http://www.comsoc.org/~mmc/ 16/56 Vol.13, No.1, January 2018
information of the temporal evolution of the subject’s experience without asking the participant to carry out any extra
effort. Psychophysiological evaluations are also able to capture psychological processes or states even if the user is
not aware of them, and somehow free of some cognitive biases that may affect self-reported methods (e.g. social
desirability). Due to these and other advantages, in the last years psychophysiological methods have begun to become
popular among researchers interested in having a picture of mental processes of the users, especially in areas like
media psychology or user experience research.
Much QoE research with multimedia content has focused on subjective judgments of the visual and auditory perceived
quality. In this context, there are evidences that some psychophysiological measures, specifically some brain potentials
measured through EEG, correlate with visual and auditory quality of the stimuli [6][7]. Particularly,
psychophysiological methods have a great potential for the measurement of visual fatigue and discomfort, which are
usually associated to the presence of visual distortions in stereoscopic contents. Regarding EEG, previous studies
suggest that the power of some frequency bands, such as alpha or beta bands, are related to visual fatigue, and that the
relationship between the power of different bands can be used as a metric for visual fatigue [8].
Other researchers have relied on physiological indicators of their indirect effects, such as changes in motivational or
emotional reactions towards the contents. For instance, low visual quality is associated with changes in the frontal
alpha asymmetry index, an EEG indicator of motivational approach [9]. Also the activity of the peripheral nervous
system, such as EDA or HR, has showed correlation with different levels of visual quality and users’ emotions while
experiencing 3D contents [10][11][12][13].
These antecedents may have an undoubtable utility also in the context of VR-QoE assessment, since many of the
problems of visual distortions found in more traditional stereoscopic contents are also present in VR. Furthermore,
VR involves new sources of distortion, such as video stitching, image freezing due to low rendering rates while moving
the head and general incoherent image formation of the perceived scene, which may cause visual fatigue and
discomfort, i.e. low QoE. However, the more relevant factor of QoE in VR and probably the one more differentiating
from other types of multimedia content, is the occurrence of cybersickness. Perceptual incoherence and long exposures
or simple high sensitivity can provoke symptoms such as headache, disorientation, nausea, eye-strain, sweating, or
vomiting, among others [14]. Although usually measured using self-reported measures, cybersickness has also
observable effects over psychophysiological variables such as delta and beta bands of EEG, heart period, respiration,
or blinking [15][16]. Cybersickness is associated to the presence of illusory self-motion (vection) in the represented
environment [18][19], but its etiology is yet not well understood. In this sense, the more traditional explanation has
been the one provided by sensory conflict theory [20], according to which the mismatch between the sensory cues
provided during illusory movement and the expectations resulting from previous experience causes the symptoms of
cybersickness. Alternatively, postural instability theory [21] argues that the reason behind motion sickness is not
sensory conflict itself, but the inability to achieve effective postural adaptations to conflicting visual, vestibular, and
proprioceptive information. Research has evidenced that body sway predicts the occurrence of cybersickness [22][23],
so providing support to this theory. Research on the user experience of VR can not only benefit from this research,
but also contribute significantly to it.
However one of the main advantages of VR systems compared to traditional systems is the ability to elicit in the user
a feeling of spatial presence, that is, the feeling of “being” in the virtual location. Presence is usually measured through
questionnaires [17], but there are important concerns about the validity and reliability of self-reported methods when
measuring such constructs [18]. An alternative is the use of psychophysiological methods that account for indirect
effects of the feeling of presence. For instance, the feeling of presence is assumed to be related to the elaboration of
mental model of the spatial properties of the environment [19], as well as related to stronger emotional responses to
events in the environment [20]. Thus, activity changes in brain areas related to spatial navigation (measured through
EEG), or more commonly enhanced emotional reactions [21] could be taken as indexes of spatial presence experiences
in VR environments. An interesting approach, in which the advantages provided by the temporal resolution of
psychophysiological measures are evident, consists of focusing on “breaks in presence” (BIP), that is, points in which
the feeling of presence momentary vanishes into attentional orienting responses (probably the reflecting attention
shifts from the virtual to the real world), which leaves a measurable trace on physiological signals such as HR or EDA
[22].
Another important high-level aspect of QoE in which psychophysiological methods can be crucial is in the
IEEE COMSOC MMTC Communications - Frontiers
http://www.comsoc.org/~mmc/ 17/56 Vol.13, No.1, January 2018
measurement of user’s engagement. Although it is generally considered as a key factor in describing the relationship
between users and technology, it is a blurry concept that has been defined in several ways (cf. [23]). Some recent
approaches tend to emphasize its relationship with cognitive, emotional, and motivational processes, such as, for
instance, motivational approach as measured by frontal alpha asymmetry [23]. Interestingly, it has been observed that
the consistency over time of attentional and emotional responses obtained from a small sample, is a good predictor of
behavioral measures of engagement of much larger audiences [24][25]. This open a venue for, grounding on an
operationalization of engagement in terms of between-subjects synchrony in terms of attentional and emotional
responses, broadening the scope of QoE research, by exploring not only how certain visual and auditory distortions
may affect not only to the individual satisfaction (“delight or annoyance with the application or service”), but also to
the users’ engagement in a more general fashion.
3. Future directions and challenges
The ability to provide information of the psychological processes even below the subject’s awareness, without any
voluntary effort from the user and high temporal resolution, makes psychophysiological methods especially suitable
for tackling aspects of QoE in VR environments compared to self-reported methods. Particularly, in experiences that
occur over time, such as videogames, training or psychological treatment, these methods can provide insights on
perceptions of visual and auditory quality, cybersickness degrees, presence feeling, or user engagement. This is
positively achieved barely affecting user’s attention and robust to subject’s memory biases or ability for introspection.
It is certainly an advantage since obviously the manipulation of physical tools for reporting subjective impressions is
complicated while the face is covered and reporting inside the VR world could bias the evaluation.
On the contrary, one limitation of psychophysiological methods is that, in their current state they are informative of
basic psychological constructs (e.g. physiological arousal, attentional effort), but they do not directly reflect higher-
level constructs such as presence, engagement, or satisfaction. A central challenge for researchers in QoE is therefore
to explore psychological models inferred from psychophysiological signals. In this sense, a recent work has
demonstrated that cybersickness can be predicted from changes in physiological indicators such as respiration,
stomach activity, and blinking [15]. However, there are no reports of predictive models of aspects such as feeling of
presence, enjoyment, or satisfaction, which are central to QoE. Some difficulties are: the lack of a prediction ground-
truth, since the operationalization of the psychological states usually relies on self-reported methods (e.g. presence
questionnaires), with the associated problems regarding validity, reliability, and temporal already mentioned. A
variety of application contexts in VR can also difficult to establish direct relationships between low-level
psychological processes and high-level inferences. For instance, the presence of negative emotions and stress is
probably an indication of bad experience for a training application, but may be the key factor in some entertainment
contexts. In order to overcome these issues, a possible alternative could rely on observable behavioral outcomes as a
proxy for the psychological inferences, as for instance choice-related metrics or use time or implicit association tests.
Additionally, the variability of psychophysiological signals makes unlikely that a single model based on
psychophysiological measures can accurately predict subjective aspects of QoE for the users. In this respect, a possible
approach is to focus on finding possible clusters of users with similar response patterns, learning models for each
cluster.
The unique way of experiencing VR contents creates specific challenges for the experimental design of QoE tests.
The fact that the user has the capacity of rotating the head and gazing in any direction can provoke that relevant events
are unnoticed by the user. Considering 6 degrees of freedom, where the user can move around the room, it is even
more critical. This highlights the need for adequate experimental designs aimed to find the right balance between the
user’s free behavior (in order to keep the ecological validity of the test) and the constrains imposed by the instructions
to accomplish the evaluation goals.
The physical characteristics of HMD itself impose some constraints to the type of psychophysiological measurements
can be taken. Thus, the headset hinders the use of EEG and the attachment of EMG face electrodes, hand controllers
may obstruct the placement of EDA sensors usually attached to fingers, or the highly wired environment can frustrate
the user’s movements. On this regard, recent wearable systems for psychophysiological recordings (e.g. Empatica,
Shimmer sensing) are a fundamental step forward to the incorporation of psychophysiological methods to VR-QoE.
A second generation of wireless VR glasses has been announced by the companies Oculus and HTC in 2018. There
exists methods for measuring HR by means of cameras and computer vision techniques (without attaching any
physical sensors) in the context of QoE research [26]. However, approaches based on the use of computer vision
techniques would need to address important challenges, such as that user’s face (whose detection is key in many cases)
is partially occluded by the headset or even the user’s arms. In that respect, although dedicated eye tracking devices
IEEE COMSOC MMTC Communications - Frontiers
http://www.comsoc.org/~mmc/ 18/56 Vol.13, No.1, January 2018
have been widely used in basic research for decades, especially for analyzing task accomplishment, it is not until now
that some companies are recently able to incorporate eye-tracking systems into VR glasses like FOVE or TOBII, and
presumably the most major VR companies will do in a recent future.
4. Conclusion
The challenges proposed to psychophysiology-based QoE research in VR environments not only encourage
researchers to rethink traditional QoE metrics and experimental evaluations, but also to walk further to the boundaries
of QoE as a research field. VR expands the use of audiovisual systems from entertainment to very disparate areas
including psychological treatment, data visualization, rehabilitation, and training, among others. The ‘delight’ or
‘annoyance’ of the users in such disparate contexts present several dimensions far beyond issues of perceptual quality
or visual discomfort, and that may hardly been understood grounding solely on subjective perceptions. Aspects such
as spatial presence, user’s attentional and emotional involvement, cognitive effort, or stress, are the key to explain
how a user senses, feels and thinks experiencing a certain application or service, and psychophysiological-based
methods may be the best option currently available for them to be understood. In combination with more traditional
self-reported approaches, as well as with other multimodal measurements, they can help not only to test how
satisfactory a certain VR system or application can be for some users from a consumer perspective, but also, in general
terms, to understand how a disruptive technology such as VR may impact several aspects of our lives.
References [1] Barnard, S. T. and M. A. Fischler (1987) Stereo Vision. Encyclopedia of Artificial Intelligence, pp. 1083-1090. John Wiley, New York
(USA).
[2] Begault, D. R., and Trejo, L. J. (2000). 3-D sound for Virtual Reality and Multimedia. National Aeronautics and Space Administration, Ames
Research Center.
[3] Virtual Reality (VR) Market Analysis By Device, By Technology, By Component, By Application By Region, And Segment Forecasts, 2014
– 2025, GVR-1-68038-831-2, May 2017.
[4] Jones, J. A., Swan II, J. E., Singh, G., Kolstad, E., & Ellis, S. R. (2008, August). The effects of virtual reality, augmented reality, and motion
parallax on egocentric depth perception. In Proceedings of the 5th symposium on Applied perception in graphics and visualization (pp. 9-14).
ACM.
[5] Anderson, R., Gallup, D., Barron, J. T., Kontkanen, J., Snavely, N., Hernández, C., ... & Seitz, S. M. (2016). Jump: virtual reality
video. ACM Transactions on Graphics (TOG), 35(6), 198.
[6] Guenter, Finch, Drucker, Tan, Snyder “Foveated 3D Graphics”, ACM SIGGRAPH Asia 2012.
[7] A. Maimone, G. Wetzstein, D. Lanman, M. Hirsch, R. Raskar, H. Fuchs "Focus 3D: Compressive Accommodation Display". ACM Trans.
Graph. 2013, presented at ACM SIGGRAPH 2014.
[8] M. Slater, M.V. Sánchez-Vives, “Enhancing our lives with immersive virtual reality”, Frontiers in Robotics and AI, vol. 3, art. 74, 2016.
[9] Perkins Coie & Upload, 2016 augmented and virtual reality survey report, 2016.
[10] K. Brunnström et al., “Qualinet White Paper on Definitions of Quality of Experience,” Mar. 2013, Qualinet White Paper on Definitions of
Quality of Experience, Novi Sad, March12, 2013
[11] U. Engelke et al., “Psychophysiology-based QoE assessment: A survey”, IEEE Journal of Selected Topics in Signal Processing, vol. 11, no.
1, pp 6-21, 2013.
[12] A. Lang, R.F. Potter, P. Bolls, “Where Psychophysiology Meets the Media: Taking the Effects Out of Mass Media Research”, in J. Bryant,
M.B. Oliver (Eds.), Media effects: Advances in theory and research, pp. 185-206, New York, NY, Routledge.
[13] S. Scholler, S. Bosse, M.S., Treder, B. Blankertz, G. Curio, K.R. Muller, T. Wiegand, “Toward a direct measure of video quality perception
using EEG”. IEEE Transactions on Image Processing, vol. 21, no. 5, pp. 2619-2629, 2012.
[14] J. N. Antons, R. Schleicher, S. Arndt, S. Moller, A.K. Porbadnigk, G. Curio, “Analyzing speech quality perception using
electroencephalography”. IEEE Journal of Selected Topics in Signal Processing, vol. 6, no. 6, pp. 721-731, 2012
[15] C. Chen, J. Wang, K. Li, Q. Wu, H. Wang, Z. Qian, N. Gu, “Assessment visual fatigue of watching 3DTV using EEG power spectral
parameters,” Displays, vol. 35, no. 5, pp. 266–272, 2014.
[16] E. Kroupi, P. Hanhart, J. Lee, M. Rerabek, T. Ebrahimi, “EEG correlates during video quality perception,” in Proc. IEEE 22nd Eur. Signal
Proc. Conf., pp. 2135– 2139, Sept. 2014.
[17] E. Kroupi, P. Hanhart, J.-S. Lee, M. Rerabek, and T. Ebrahimi, “Predicting subjective sensation of reality during multimedia consumption
based on EEG and peripheral physiological signals,” in Proc. Int. Conf. Multimedia and Expo, pp, 1-6, Sept. 2014.
[18] Hettinger, L. J., Berbaum, K. S., Kennedy, R. S., Dunlap, W. P., & Nolan, M. D. (1990). Vection and simulator sickness. Military Psychology,
2(3), 171.
[19] Bonato, F., Bubka, A., & Palmisano, S. (2009). Combined pitch and roll and cybersickness in a virtual environment. Aviation, space, and
environmental medicine, 80(11), 941-945.
[20] Reason, J. T. (1978). Motion sickness adaptation: a neural mismatch model. Journal of the Royal Society of Medicine, 71(11), 819.
[21] Riccio, G. E., & Stoffregen, T. A. (1991). An ecological theory of motion sickness and postural instability. Ecological psychology, 3(3), 195-
240.
IEEE COMSOC MMTC Communications - Frontiers
http://www.comsoc.org/~mmc/ 19/56 Vol.13, No.1, January 2018
[22] Merhi, O., Faugloire, E., Flanagan, M., & Stoffregen, T. A. (2007). Motion sickness, console video games, and head-mounted displays. Human
Factors, 49(5), 920-934.
[23] Munafo, J., Diedrick, M., & Stoffregen, T. A. (2016). The virtual reality head-mounted display Oculus Rift induces motion sickness and is
sexist in its effects. Experimental Brain Research, 1-13.
[24] M. Barreda-Ángeles, R. Pépion, E. Bosc, E., P. Le Callet, A. Pereda-Baños, “Exploring the effects of 3D visual discomfort on viewers'
emotions”, in Proc. IEEE Int. Conf. Image Processing, pp. 753-757, Oct. 2014.
[25] M. Barreda-Ángeles, R. Pépion, E. Bosc, E., P. Le Callet, A. Pereda-Baños, “How visual discomfort affects 3DTV viewers' emotional arousal”,
in Proc. 3DTV-CON Conf: The True Vision-Capture, Transmission and Display of 3D Video, pp. 1-4, Jul. 2014.
[26] J. J. LaViola, “A discussion of cybersickness in virtual environments”. ACM SIGCHI Bulletin, vol. 32, no. 1, pp. 47- 56, 2000.
[27] M. S. Dennison, A.Z. Wisti, M. D’Zmura, M, “Use of physiological signals to predict cybersickness”. Displays, vol. 44, pp 42-52, 2016.
[28] Y.Y. Kim, H. J. Kim, E. N. Kim, H. D. Ko, H. T. Kim, “Characteristic changes in the physiological components of cybersickness”.
Psychophysiology, vol 42, no. 5, pp. 616-625, 2005.
[29] Witmer, B. G., M. J. Singer, “Measuring presence in virtual environments: A presence questionnaire”. Presence: Teleoperators and virtual
environments, vol. 7, no. 3, pp. 225-240, 1998.
[30] M. Slater, “How colorful was your day? Why questionnaires cannot assess presence in virtual environments”. Presence: Teleoperators and
Virtual Environments, vol. 13, no. 4, pp. 484-493, 2004.
[31] W. Wirth et al., “A process model of the formation of spatial presence experiences”, Media Psychology, vol. 9, no. 3, pp. 493-525, 2007.
[32] J. Diemer, G.W. Alpers, H.M. Peperkorn, Y. Shiban, Y.,A. Mühlberger, “The impact of perception and presence on emotional reactions: A
review of research in virtual reality”, Frontiers in Psychology, vol. 6, art. 26, Jan. 2015.
[33] T. Baumgartner, L. Valko, M. Esslen, L. Jäncke, L, “Neural correlate of spatial presence in an arousing and noninteractive virtual reality: An
EEG and psychophysiology study”, CyberPsychology & Behavior, vol. 9, no. 1, pp. 30-45, 2006.
[34] B. Liebold, M. Brill, D. Pietschmann, F. Schwab, P. Ohler, “Continuous measurement of breaks in presence: Psychophysiology and orienting
responses”, Media Psychology, vol. 20, no. 3, pp. 477-501, 2017.
[35] I. Arapakis, M. Barreda-Angeles, A. Pereda-Baños, “Interest as a proxy of engagement in news reading: Spectral and entropy analyses of
EEG activity patterns”, IEEE Transactions on Affective Computing (online preprint), 2017.
[36] J.P. Dmochowski, M.A. Bezdek, B.P. Abelson, J.S. Johnson, E.H. Schumacher, L.C. Parra, “Audience preferences are predicted by temporal
reliability of neural processing”, Nature communications, vol. 5, 4567, 2014.
[37] C. Christoforou, S. Christou-Champi, F. Constantinidou, M. Theodorou, “From the eyes and the heart: A novel eye-gaze metric that predicts
video preferences of a large audience”, Frontiers in Psychology, vol. 6, art 579, 2015.
[38] M. Bonomi, M. Barreda-Ángeles, F. Battisti, G. Boato, P. Le Callet, M. Carli, “Towards QoE Estimation of 3D Contents Through Non-
invasive Methods”, in Proc. 3DTV-CON Conf: The True Vision-Capture, Transmission and Display of 3D Video, pp. 1-4, Jul. 2016.
IEEE COMSOC MMTC Communications - Frontiers
http://www.comsoc.org/~mmc/ 20/56 Vol.13, No.1, January 2018
Miguel Barreda-Ángeles is a researcher at the Digital Humanities unit at Eurecat – Technology
Centre of Catalonia. He received a Ph.D. in Social Communication from Pompeu Fabra University
(Spain) in 2014. His research interests include psychophysiology and user experience research.
Alexandre Pereda-Baños is a researcher at Eurecat’s Digital Humanities Unit, leading the research
line on perception and cognition. He research’s on the field of User Experience measurement and has
participated in several ICT projects such as 2020 3D media. He obtained a Ph.D. in Cognitive
Neuroscience from the Trinity College Institute of Neuroscience (Dublin, Ireland), and before joining
Barcelona Media, he worked at the Multisensory Research Group in the Universitat Pompeu Fabra
(Barcelona).
Dr. Rafael Redondo Tejedor is a researcher with the Eurecat’s Multimedia Technologies Unit mainly dedicated in
computer vision for virtual reality and interactive systems. He received his PhD in computer vision
from the Instituto de Óptica (CSIC) and Escuela Técnica Superior de Ingenieros de
Telecomunicación (ETSIT) at Universidad Politécnica de Madrid (UPM) in 2007. Later on he has
participated in international European projects for 3D medical image visualization (CSIC, 2009),
camera contrast enhancement (Imatrics, 2010), autofocus evaluation (Visilab University of Castilla
La Mancha, 2011), pollen recognition (Inspiralia 2013) and medical image quality evaluation
(UCLM, 2014). He also received a master degree in Sonology from the University of Pompeu Fabra
(UPF). Among his research fields are vision modeling, image and volumetric coding, time-frequency representations,
medical imaging and pattern classification.
IEEE COMSOC MMTC Communications - Frontiers
http://www.comsoc.org/~mmc/ 21/56 Vol.13, No.1, January 2018
QoE Concerns and Measurement in Augmented Reality Applications
Patrick Seeling
Department of Computer Science, Central Michigan University, MI, USA
1. QoE and Augmented Reality
Recent years have seen an emergence of a multitude of new Human-Computer Interface (HCI) types, especially in the
domain of wearable devices. Industry predictions, such as Gartner’s Hype Cycle, indicate that we will soon witness
the broad adaptation of Augmented Reality (AR) devices in the consumer and professional spaces, including industrial,
governmental, and military applications [1]. As already witnessed in today’s networks, the presentation of multimedia
content in fixed and mobile scenarios accounts for a significant portion of the overall network traffic. Industry
predictions indicate that this trend is highly likely to continue in the foreseeable future [2]. Jointly, these trends indicate
that a significant portion of future multimedia network traffic will be directed at content presentation in AR scenarios.
In addition to the typical quality and size trade-offs required for the timely display of traditional network-delivered
multimedia content, AR scenarios will likely require adaptations, including multidimensional/immersive views [3],
[4]. A quantification of the impact of these trade-offs generally is achieved by determining objective fidelity metrics
(Quality of Service, QoS) and by mapping them to subjective experience ratings (Quality of Experience, QoE).
Employing the QoE rather than QoS metrics alone has the inherent benefit of enabling network and content service
providers with the means of fine-tuning their offerings to customer expectations and, ultimately, willingness to pay.
The QoE is commonly determined using the Experience Sampling Method (ESM), e.g., using the NASA-TLX
approach [5], captured using Likert-type scales with individual subjects and combined into Mean Opinion Scores
(MOS), see, e.g., [6]. However, this active human in-the-loop approach is not feasible in applied scenarios and
mappings between the objectively determinable QoS and subjective QoE have emerged, such as the IQX Hypothesis
[7] or the Weber-Fechner Law in [8].
The AR environment, however, presents additional challenges. First, the content presentation in AR scenarios
commonly is performed using head-mounted devices (HMDs) to display content, which results in content displayed
close to the eye. Secondly, the presentation is overlapped with the real world, which results in an ad-hoc environment
without significant potential for ex-ante estimations. Intuitively, considerations that need to be taken into account
when determining the QoE in AR scenarios include the media fidelity in addition to contrast and colors as consequence
of the overlay of content with the real world [9]. The combination of both is typically neither readily determinable nor
steady and requires new considerations for perceptual models [10], [11].
Initial evaluations that strive to determine the AR QoE in steady environments for popular test images and video
sequences can be found in [12], [13]. Specifically for still images, we illustrate the difference between the traditional
(opaque) and AR (see-through) mode in Figure 1. As described in greater detail in [12], the effects of image
compression (QoS) and mean opinion scores (MOS, QoE) were denoted as Visual User Experience Difference
(VUED). Interestingly, higher ratings are attained in the AR display for higher qualities, while lower qualities exhibit
a reverse trend. For the two presentation modes, a model was presented that can be applied for estimations of the QoE
in AR settings, based on traditional display modes in a fairly steady environment, resulting even in predictability [14].
A remaining shortcoming for practical real-time evaluations, however, is the remaining active role of the human in-
the-loop that is required to determine the QoE interactively, especially in dynamic environments.
2. Measuring the QoE with EEG
During the same time frame as AR emerged, another form of HCI approaches began to garner interest from the
research community. Brain-Computer Interfaces (BCI) as a subset of HCI employ electroencephalography (EEG) to
measure brainwaves at several positions and derive further information from the different frequency bands at different
localities. While wet electrodes were common in the beginning, we now have reached a point in time where dry
electrodes can readily be placed on a human subject and provide information – all through commercially available
off-the-shelf devices.
In past research efforts, media quality was evaluated in the context of cognitive processes [15] in laboratory settings
with wet electrodes and continue to date [16]. Typically, EEG measurements at 300 to 500 ms after the stimulus,
IEEE COMSOC MMTC Communications - Frontiers
http://www.comsoc.org/~mmc/ 22/56 Vol.13, No.1, January 2018
Figure 2: Visual User Experience Difference (VUED) based on mean opinion scores different images presented traditionally and in AR. Please refer to [12] for additional details.
such as media display or quality changes. Approaches moving to dry electrodes are beginning to emerge in order to
determine the QoE in general [17]. The potential for a direct measurement has successfully been exploited in
traditional settings, even with dry electrodes, see, e.g., [18], [19].
Jointly with the commonly head-worn binocular vision augmenting devices, a new opportunity in determining the
QoE of device operators emerges. Specifically, little modifications of current AR devices could provide real-time or
close to real-time EEG measurements, as the physical contact of additional sensors on device wearers can be readily
realized with the head mounted device itself. We employed this approach in our own research, evaluating the
possibility of predicting traditional image display (AR) and spherical/immersive image display (SAR) QoE with a 4-
electrode headband. We illustrate some high-level results for the Mean Absolute Error (MAE) in Figure 2.
Corroborating intuition, the combination of all electrodes in the machine learning based prediction approach yields
the lowest overall errors. Simplifications by omitting individual sensors, however, maintain a fairly high level of
accuracy with a configuration that could readily incorporated into future HMD for AR. We refer the interested reader
to [20] for more details and overview of the publicly available data set.
3. QoE Pasa?
BCI with EEG seems poised to emerge in the realm of wearable devices as a future means of directly determining the
QoE from human subjects as they perform actions in the real world. However, other domains of multimedia content
presentation, such as Virtual Reality (VR), are ideal application scenarios as well. The head-worn nature of the devices
presents a unique opportunity to capture psycho-physiological sensor data from the device operator directly. This
enables a passive human in-the-loop approach in the determination of the QoE and subsequent service adjustment.
Consider future closed feedback loop scenarios that allow a passive human in-the-loop evaluation of the QoE through
EEG feedback loops. In turn, multimedia content delivery can be radically changed based on the directly determined
QoE without human subject interventions, but tailored to the situation and the individual subject.
While this is certainly an allure for personalized media services, significant challenges remain, even outside of the
BCI domain for a successful future implementation. The delivery of context-dependent network-delivered content to
(a) AR
(b) SAR
Figure 1: Mean Absolute Errors (MAE) for regular (AR) and spherical (SAR) images with averages and standard deviations for subject ratings
(QoE) and impairment level (QoS) prediction performance analysis of individual subjects.
IEEE COMSOC MMTC Communications - Frontiers
http://www.comsoc.org/~mmc/ 23/56 Vol.13, No.1, January 2018
devices in near real-time, however, represents a challenge and requires new paradigm considerations, especially
bandwidth and latency in access networks and clouds [21]. These new extreme low-latency services are currently also
referred to as the “tactile internet” - which brings an additional dimension for the future of QoE research.
References
[1] R. P. Spicer, S. M. Russell, and E. S. Rosenberg, “The mixed reality of things: emerging challenges for human-information interaction,” Proc.
SPIE Volume 10207, Next-Generation Analyst V, pp. 10207, May 2017.
[2] Cisco, Inc., “Cisco visual networking index: Forecast and methodology, 2016–2021,” Cisco Tech. Rep., Jun. 2017.
[3] C. Ozcinar, E. Ekmekcioglu, J. Calic, and A. Kondoz, “Adaptive delivery of immersive 3d multi-view video over the internet,” Multimedia
Tools and Applications, vol. 75, no. 20, pp. 12 431–12 461, Oct. 2016.
[4] L. A. da Silva Cruz, M. Cordina, C. J. Debono, and P. A. A. Assuncao, “Quality monitor for 3-d video over hybrid broadcast networks,” IEEE
Transactions on Broadcasting, vol. 62, no. 4, pp. 785–799, Oct. 2016.
[5] S. G. Hart, “Nasa-task load index (nasa-tlx); 20 years later,” in Proc. of the human factors and ergonomics society annual meeting, vol. 50, no.
9, pp. 904-908, Oct. 2006.
[6] J. M. Hektner, J. A. Schmidt, and M. Csikszentmihalyi, Experience sampling method: Measuring the quality of everyday life. Sage, 2007.
[7] M. Fiedler, T. Hossfeld, and P. Tran-Gia, “A generic quantitative relationship between quality of experience and quality of service,” IEEE
Network, vol. 24, no. 2, pp. 36–41, March/April 2010.
[8] P. Reichl, B. Tuffin, and R. Schatz, “Logarithmic laws in service quality perception: where microeconomics meets psychophysics and quality
of experience,” Telecommunication Systems, vol. 52, no. 2, pp. 587–600, Feb. 2013.
[9] M. A. Livingston, J. H. Barrow, and C. M. Sibley, “Quantification of Contrast Sensitivity and Color Perception using Head-worn Augmented
Reality Displays,” in Proc. of the IEEE Virtual Reality Conference (VR), Lafayette, LA, USA, Mar. 2009, pp. 115–122.
[10] Y. Jang and W. Woo, “Unified Visual Perception Model for context–aware wearable AR,” Proc. of the IEEE International Symposium on
Mixed and Augmented Reality (ISMAR), Adelaide, SA, Australia, Oct. 2013, pp. 1–4.
[11] E. Kruijff, J. E. Swan II, and S. Feiner, “Perceptual issues in augmented reality revisited,” in Proc. of IEEE and ACM International Symposium
on Mixed and Augmented Reality (ISMAR), Seoul, Korea, Oct. 2010, pp. 3–12.
[12] P. Seeling, “Visual user experience difference: Image compression impacts on the quality of experience in augmented binocular vision,” in
Proc. of IEEE Consumer Communications and Networking Conference (CCNC), Las Vegas, NV, USA, Jan. 2016, pp. 931–936.
[13] ——, “Towards quality of experience determination for video in augmented binocular vision scenarios,” Signal Processing: Image
Communication, vol. 33, no. 0, pp. 41–50, Apr. 2015.
[14] B. Bauman and P. Seeling, “Towards still image experience predictions in augmented vision settings,” in Proc. of the IEEE Consumer
Communications and Networking Conference (CCNC), Las Vegas, NV, USA, Jan. 2017, pp. 1–6.
[15] S. Scholler, S. Bosse, M. S. Treder, B. Blankertz, G. Curio, K.-R. Mueller, and T. Wiegand, “Toward a Direct Measure of Video Quality
Perception Using EEG,” IEEE Transactions on Image Processing, vol. 21, no. 5, pp. 2619–2629, May 2012.
[16] S. Bosse, K. R. Muller, T. Wiegand, and W. Samek, “Brain-computer interfacing for multimedia quality assessment,” in 2016 IEEE
International Conference on Systems, Man, and Cybernetics (SMC), Oct 2016, pp. 002834–002839.
[17] A.-N. Moldovan, I. Ghergulescu, S. Weibelzahl, and C. H. Muntean, “User-centered eeg-based multimedia quality assessment,” in 2013 IEEE
International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB), London, UK, June 2013, pp. 1-8.
[18] P. Davis, C. D. Creusere, and J. Kroger, “The effect of perceptual video quality on EEG power distribution,” in 2016 IEEE International
Conference on Image Processing (ICIP), Phoenix, AZ, USA, Sept. 2016, pp. 2420–2424.
[19] P. Arnau-Gonzalez, T. Althobaiti, S. Katsigiannis, and N. Ramzan, “Perceptual video quality evaluation by means of physiological signals,”
in 2017 Ninth International Conference on Quality of Multimedia Experience (QoMEX), Erfurt, Germany, May 2017, pp. 1–6.
[20] B. Bauman and P. Seeling, “Visual interface evaluation for wearables datasets: Predicting the subjective augmented vision image qoe and
qos,” Future Internet, vol. 9, no. 3, p. 40, Jul. 2017.
[21] Y. Y. Shih, W. H. Chung, A. C. Pang, T. C. Chiu, and H. Y. Wei, “Enabling low-latency applications in fog-radio access networks,” IEEE
Network, vol. 31, no. 1, pp. 52–58, January 2017.
Patrick Seeling is an Associate Professor with the Department of Computer Science at Central
Michigan University, USA. He received his Ph.D. degree from Arizona State University in 2005. His
research interests include user experiences in mixed realities, mobile systems and networking, and
engineering education.
IEEE COMSOC MMTC Communications - Frontiers
http://www.comsoc.org/~mmc/ 24/56 Vol.13, No.1, January 2018
Emerging levels of immersive experience in MPEG-I video coding
Dragorad Milovanovic, Dragan Kukolj
Dept. of Computer Engineering, Faculty of Engineering, University of Novi Sad, Serbia
1. Introduction
This article provides a summary of the current activities related to the MPEG development and research activities of
the emerging immersive media technologies (UHD high-resolution 4/8K video and HDR&WCG 10/12bpp picture
quality, 3D video formats, 360° panorama video, Augmented/Mixed/Virtual Reality, immersive games, Light Field
displays, plenoptic imaging, multi-sensorial media) [1]. Especially we outline the need for perceptual tools in
exploration project MPEG-I Coded representation of immersive media towards specification the new standard
ISO/IEC 23090 [2]. Most of these standards activities are currently in early phase. For each use case of immersive
video technology, we consider the supported features which determine the attained level of immersiveness. This level
is different among the considered technologies, and varies with technological complexity [3]. In some of the cases,
the technology is anticipated to be available in the not so close future. This fact is one of the motivations of works in
ISO/IEC MPEG group on MPEG-I project, which aims at standardization of immersive visual media in phases.
2. Levels of immersion and technological complexity
In June 2016, MPEG started working on MPEG-VR initiative to develop a roadmap and coordinate the various
activities related to immersive media within MPEG and to liaison also with other consortia working on innovative
products and services. Currently, MPEG-I project explores standards to digitally represent immersive media. The first
stage of MPEG-I Phase 1A, target the most urgent market needs, which is specification of 360 video projection
formats OMAF (Omnidirectional Media Application Format). The next Phase 1B (Doc.N17069 Requirements on
Phase 1B, July 2017) will the extend specification towards 3DoF+ applications. The Phase 2 (Doc.N17073
Requirements on 6DoF v1, July 2017) is intended to start from about 2019, aims at addressing 6DoF applications like
free viewpoint video (Table 1). Different levels of experience can be achieved by the user who may freely move his
head around three rotational axes 3DoF (yaw, pitch, roll), and along three translational directions 6DoF (left/right,
forward/backward, up/down) (Doc.W17285 Visual activities on 6DoF and Light Fields, Oct. 2017).
Table 1. MPEG-I features and technology in exploration.
Use case Technology Phase
3DoF
360° video
This phase aims to deliver a complete distribution system: Basic 360° streaming (possible
optimized Tiled Streaming with adequate support in HEVC ) and projection (monoscopic and
stereoscopic). OMAF (ISO/IEC 23090-2 Application format for omnidirectional media, 2017)
Phase1A
Oct. 2017
3DoF+
360° video
Focus on VR 360° with 3DoF, with some additional depth clues, that would allow moving the
viewpoint in a limited space. In addition, optimization in projection mapping, further motion-to-
photon delay reductions, optimizations for person-to-person communications as well as the
phase should have some quality definition and verification. FVC (Future Video Codec)
baseline (Doc.N17195 Joint Call for Proposals on Video Compression with capability beyond
HEVC, Oct. 2017). PCC (Doc.N17251 Report on Point Cloud Compression Call for Proposals,
Oct. 2017).
Phase1B
2019
6DoF
WindowedVR
Most important element new video codec with support for 6 DoF. Systems elements required in
support of 6DoF, as well as 3D graphics. Support for interaction with the virtual environment
(Doc.W17130 Exploration experiments: Windowed-6DoF, Oct. 2017).
Phase 2
2020
Augmented
reality
The additional requirements come from the fact that the renderer has to be aware of the real
environment. Dense sampling of 3D space with video signals. Metadata can specify aspects of
environment. AV signals can be ultra-realistic and augmented.
Hybrid Natural/Synthetic Scene Container (Doc.N17064 Requirements for MPEG-I hybrid
natural/synthetic scene data container v1, July 2017).
2021
Omnidirectional 360° video provides immersive experiences based on interactivity between the user and the content.
However, the market fragmentation due to lack of appropriate standards on storage and delivery format for such
content is becoming one of the strong concerns by the industry. A first set of MPEG-I specifications is required in
time for a market launch of products and services in 2018. It is highly likely that MPEG can deliver solutions that are
optimized in a longer time frame, which requires for more experimentations and development. Since many believe
that major market launch of VR 360° services will happen in 2020, a next set of specifications can be delivered in
2019. At the same time it is clear that there is a strong need for longer term work, notably in the video area, but
IEEE COMSOC MMTC Communications - Frontiers
http://www.comsoc.org/~mmc/ 25/56 Vol.13, No.1, January 2018
possibly also in the audio space on 6DoF content.
3. The need for perceptual tools and assessment
In order to immerse the user into virtual reality, the MPEG-I technology has to convince our senses. The most basic
requirement is to present vision to the eyes of the viewer. The level of immersion can be increased if the following
features are implemented:
✓ Rotation. The ability to look around freely with 3DoF (yaw, pitch, roll) allows human brain to construct holistic
model of the environment. This process is crucial to provide true immersive experience. Therefore, the views
presented to the user’s eyes should follow rotation of the head.
✓ Motion. The ability of the user to move in all directions 3DoF improves the level of immersion in two ways. First,
it enables motion parallax, which helps brain to perceive the depth and cope with occlusions. Second, it allows the
user to explore.
✓ Joint rotation and motion. Both of these features together grant the user with 6 degrees of freedom. Thanks to
the synergy, the user can witness the presented reality without bounds.
✓ Latency. Our brains are very vulnerable to the differences in time of perception of information coming from
different senses. The mismatch causes motion sickness, which can be avoided by minimizing overall latency of
the system.
✓ Binocular vision. The human visual system employs information from both of eyes to sense the depth of the scene.
Without the proper depth sensation, the scene is perceived as flat and unnatural.
✓ Resolution. For the example of HMDs, the displays are mounted very close to the user’s eyes and therefore the
amount of pixels must be sufficient in order to avoid aliasing. This is in particular important when the user moves
or rotates very slightly, which results in unnatural jumps of edges in the perceived image by single pixel distance.
✓ Self-embodiment. The ability to see parts of its own body convinces the user about being part of the presented
reality.
✓ Interactivity. Allowing the user to manipulate objects provides strong premises about integrity of the presented
reality.
In order to study immersive experience metrics and their measurability in immersive services, and develop a test
methodology of the technical specification for the intended use cases, MPEG-I is calling for video test material to
assess algorithm performance for different setups where information is combined from different cameras to generate
virtual views scene (Doc.N16766 Call for immersive visual test material, April 2017). Test material should comply
to the attributes as follow:
✓ General considerations. Still image and video sequences from both indoor and outdoor scenes can be submitted,
with sufficient complexity to test the limits of the algorithms under study - natural content is highly preferred over
computer-generated content. Color components, depth, and metadata are provided separately (particular for the
camera parameters). Types of cameras and camera array arrangements (highly dense array of images along a
predefined track: 2D linear with parallel cameras, 2D linear with convergent cameras, 2D cylindrical surface, 2D
spherical surface). Accurate temporal synchronization of multiple cameras is preferred.
✓ Omnidirectional video with depth data. The content should be captured with an arrangement of cameras that
records divergent views, preferably in an arrangement that supports the capture of a full 360° field of view. Both
the texture and depth data must be provided at the same resolution with an input greater than or equal to 4K, and
the same projection - preferably in the equirectangular projection.
✓ Divergent/convergent camera arrangement. Video material recorded with significant overlap preferably in an
arrangement that supports the capture of a full 360° field of view / volume of visual data. Both the intrinsic and
extrinsic camera parameters must also be provided.
✓ 2D camera array arrangement (following a planar, cylindrical or spherical surface). Dense video sequences are
particularly sought with a baseline distance between cameras not more than 20cm, and the distance from one end
of the array to the other end as wide as possible.
✓ Plenoptic cameras. Density of micro-lenses supposed to be large enough to ensure a good angular sampling of
the light field. Resolution of the plenoptic image should be no less than 15 mega-rays.
✓ Systems of simultaneous multiple acquisitions. Simultaneously acquire the same scene following the
specifications defined above.
IEEE COMSOC MMTC Communications - Frontiers
http://www.comsoc.org/~mmc/ 26/56 Vol.13, No.1, January 2018
Recently, MPEG-I established an ad-hoc group Immersive media quality evaluation with the goal to document
requirements for QoE in Phase 1B, collect test material, study existing methods for QoE assessment (Doc.W17135
Survey on assessing subjective quality of immersive media applications and services, Oct. 2017), study immersive
experience metrics (Doc.W17239 Immersive media metrics under considerations, Oct. 2017) and their measurability
in immersive services, and develop a test methodology (Doc.M40814 VR Experience metrics, July 2017). MPEG-I
CD Part 6 Jan. 2018 will specify immersive media metrics and measurement framework to enhance the immersive
media quality and experiences. This part also includes a client reference model with observation and measurement
points to define the interfaces for the collection of the metrics.
4. Conclusion
It can be summarized that the MPEG-I project develops technologies with various levels of immersiveness. The level
is different among the considered technologies, and varies with technological complexity. MPEG technically finalized
the first international standard for delivery and storage immersive media in OMAF (Omnidirectional MediA Format).
Next, the MPEG meeting in Oct. 2017 marked the first major step toward the FVC (Future Video Coding) standard
in the form of a joint call for proposals, which includes the testing of technology for 360° omnidirectional video
coding. Also, MPEG evaluates responses to call for proposals for Point Cloud Compression (PCC) and kicks off its
technical work in lossless or lossy coding of extremely large amounts of 3D data with applications in immersive real-
time communication and 6DoF virtual reality.
In order to study immersive experience metrics, their measurability in immersive services, and develop a test
methodology of the technical specification for the intended use cases, MPEG-I establish an ad-hoc group. The
mandate of the working group is to specify immersive media metrics and measurement framework to enhance the
immersive media quality and experiences in draft standard till January 2018.
References
[22] D.Milovanovic, D.Kukolj, "Recent advances in UHD video coding technology: High Dynamic Range and Wide Color Gamut", IEEE
COMSOC MMTC Communications - Frontiers, Special issue on Ultra-high definition video communications (Guest editors: P.A.Assuncao,
R.Vanam), Vol.11, No.1, Jan. 2016. pp.50-55.
[23] ISO/IEC JTC1/SC29 23090 Coded Representation of Immersive Media, Part 1 Technical report on immersive media, Part 2 Application
format for omnidirectional media, Part 3 Immersive video, Part 4 Immersive audio, Part 5 Point cloud compression, Part 6 Immersive media
metrics, (under development / exploration).
[24] Q.Huynh-Thu, P.Le Callet, M.Barkowsky, "Video quality assessment: From 2D to 3D - Challenges and future trends", in Proc. 17th IEEE
ICIP, Sep 2010, Hong Kong SAR China. pp.4025 – 4028.
Dragorad Milovanovic received the Dipl. Electr. Eng. and M.Sc. degree from Faculty of
Electrical Engineering, University of Belgrade, Serbia. He has working as research assistant in
DSP, R&D engineer in multimedia communications and PhD researcher. He has coauthored
reference-books for PH, Wiley and CRC Press as well as published more than 200 papers in
international journals and conference proceedings.
Dragan Kukolj (M’97-SM’06) received his Diploma degree in control engineering in 1982, MSc
degree in computer engineering in 1988, and PhD degree in control engineering in 1993, all from
the University of Novi Sad, Serbia. He is currently a Professor of computer-based systems with
Dept. of Computer Engineering, Faculty of Engineering, University of Novi Sad. His main
research interests include digital signal processing, video processing and machine learning. He
has published over 200 papers in referred journals and conference proceedings. Dr. Kukolj is the
coordinator of Intellectual Property Centre of University of Novi Sad.
IEEE COMSOC MMTC Communications - Frontiers
http://www.comsoc.org/~mmc/ 27/56 Vol.13, No.1, January 2018
Trends in QoE for immersive experiences
Andrew Perkis, Sebastian Arndt
Department of Electronic Systems
The Norwegian University of Science and Technology, Trondheim, Norway
[email protected], [email protected]
1. Introduction
Digital storytelling is at the heart of new digital media and the ability to tell stories in various formats for multiple
platforms is becoming increasingly important. The drive today is towards creating immersive and interactive digital
stories for a diversity of services and applications, spanning from pure entertainment through art, learning and training
towards edutainment and advertising. Sensor based digital storytelling targets the interactivity and feedback in a story
enabling the shift from a passive user to an active and engaged participant feeling a higher degree of affiliation to the
content by triggering more of their senses.
In traditional storytelling, the viewer always follows a structured and logical path through the story. This is known
as linear storytelling. In immersive and interactive stories on the other hand, the participants can interact with the
content and make their own path through the story. This is known as non-linear storytelling which we will refer to as
sensor based digital storytelling. Since the objective of sensor based digital storytelling is to create more immersive
narratives, it is important to notice that the more the sensor based digital story triggers all of our senses, the more
immersed the participants will be and the more presence they will feel in the story. Recent years have seen an extreme
growth in new technologies, such as the Head Mounted Displays (HMD) for Augmented Reality (AR) and Virtual
Reality (VR), which have immense possibilities of new and more immersive and interactive content. Also, the
evolution in technologies like motion tracking and haptic technologies has given new and interesting possibilities.
However, so far, the technologies lack content and applications in order to make us think ‘wow - this is it!’. The search
for these new applications allowing more immersive and interactive content has led the way to a new field of research
within immersive narratives and content creation. Sensor based digital storytelling is a part of this with the concept of
using sensors, visuals and audio to allow interactivity, multi-sensory stimuli and non-linear storylines.
In immersive and interactive content, it is important to engage the user in the best way, by using appropriate sensors
and communicate the interaction in a natural way having a good understanding of Human Computer Interaction and
being able to model and assess the quality of their experiences. Previously it has been common to measure the user’s
quality based on the Quality of Service (QoS) of the system, e.g. only measuring the delivery of the content, making
sure the delivery is error free and as such optimizing the sense of realness of the story, that the digital elements are
received as intended by the transmitter. In some cases, it is a good measure, but for sensor based digital storytelling
and many other modern applications it is not. Even though the system has high accuracy, low delay and is very stable,
meaning that the QoS is high, the end user might experience low Quality of Experience (QoE). For interactive and
immersive content, we should be optimizing the sense of being there rather than the sense of realness. The ICT and
digital media industry is embracing this and is currently changing towards becoming user centric attempting to
optimize the users’ QoE. Convergence of the media and ICT industry has led to a paradigm shift away from using the
network centric QoS as quality measure for networked media handling towards putting into place a quality measure
covering the end to end points in the multimedia system as well as considering context. This drive gives rise to the
problem of defining, modelling and measuring the QoE of immersive experiences.
2. Sensor based digital storytelling
The creative and media industry is all about content and its users consuming ever richer digital media on a plethora
of different devices over various networks. Often the situation of the user can be felt as in Figure 1. Digital storytelling
is all about enhancing the QoE in this rich digital media changing the viewer into a participating, engaged and
immersed user. The methods vary and include adding interactivity, increasing dimension, mixing realities all the way
through to creating content that triggers more senses. All this adds to the complexity and understanding of digital
storytelling.
IEEE COMSOC MMTC Communications - Frontiers
http://www.comsoc.org/~mmc/ 28/56 Vol. 13, No. 1, January 2018
A narrative starts as a creative process as an idea in the mind of someone. For the narrative to be usable, it has to be
transcribed into a form that can be eventually turned into an objective digital form. This process requires a trans-
disciplinary skill set and a set of tools. The design usually requires the use of electronics, sensors, actuators and tools
in the form of hardware and software. For the users, the final experience is concerned with issues such as visualization
and user interfaces. There are a lot of existing current practices to this which typically vary from business to business
in the creative and media industry such as publishing, news media, broadcasting, movies, gaming etc. The hardware
available for the users is also ever changing and in constant development, with the mobile devices on a strong rise and
the current trend of head mounted displays and rage about mixed realities. The quest is to create immersive and
interactive content the market is willing to pay for.
Figure 1. The immersive user.
Figure 1 provides a concept of an immersive user today, surrounded by her devices and constant digital expressions
all forming a sensor based digital story throughout the day. The immersive experience differs significantly from a
media experience, especially as the context is important for the user, however, the context often remains unknown for
the creator, content owner and provider. This is also true for the network conditions and device capabilities at the
user’s end. Another problem is differentiating between the technical quality assessment, measuring degradations due
to system parameters such as capture, media processing and network conditions as opposed to the actual aesthetic
quality intended by the creator. Immersive and interactive media supports natural interactions between people and
their environment. The media considered still consist of audio and visual presentations enriched by interactivity by
user interactions including traditional interactivity as well as novel methods such as haptics and explore use of other
media such as olfactory and taste. The ultimate goals are to digitally create real world presence and a sense of being
there as a measure of immersion in an Immersive Media Technology Experiences (IMTE). IMTE is a concept
incorporating several disciplines including Media Technology, Information and Communication Technology, and
Media Studies encompassing diverse core competencies covering fields such as communications, information
retrieval, entertainment and social networks.
There is often a misconception that this is new, which it is not. Some elements have been around for more than 50
years, such as Virtual reality, while other are based on long known principles such as transmedia storytelling.
3. QoE for immersive experiences
In order to find a measure for the user’s perceived quality of the received media presentation, we have been active
in the development shifting from using simple QoS as a measure of the quality to the broader concept of QoE. More
recently the definitions of QoE have been driven by the media processing and delivery community with close links to
other fields such as Psychology and social sciences. A formal definition is given in the Qualinet White paper published
in 2012 [1].
IEEE COMSOC MMTC Communications - Frontiers
http://www.comsoc.org/~mmc/ 29/56 Vol. 13, No. 1, January 2018
QoE assessment and modelling of immersive experience is multi-faceted problem touching upon many of the
intangible features of human experience which are rather difficult to sense, capture, interpret and/or interact with, let
alone its quality assessment and modelling. It is, however, of great importance because immersion is a major
psychological mechanism in media enjoyment which, if properly unveiled, can lead to significant improvement and
innovation in the value creation of media production and consumption.
Zhang et al. [2] propose a framework which aims to measure immersive experience from the QoE perspectives of
human factors, system factors and design factors. In human factors, Perrin et al. [3] predict and measure sense of
presence using subjective QoE measurements such as neuropsychological and physiological signals (EEG, ECG and
respiration). Using neurophysiological measures, user experiences can be measured less obtrusive. In addition, Antons
et al. Error! Reference source not found. showed that brain responses to quality reductions in some cases can be
more sensitive than behavioral data is. Redi et al. [5] discuss and evaluate the QoE of emerging display technologies
and AR/VR applications in immersive viewing experience. In design factors, Mansilla and Perkis [6] discuss and
evaluate the measurement of implicitly activated QoE judgment in storytelling and design, such as sensation
transference, thin slicing and priming effects.
These QoE assessment methods of immersive experiences reflect the fact that immersion is a multi-dimensional
construct and any attempts to measure and evaluate it must be implemented both at the technical level (i.e. system
factors) and/or by neuro-psycho-physiological means (i.e. human factors), and any mediation and interaction between
them (i.e. contextual factors). The methodology needs to be further advanced by including new elements such as
design factors, experiential factors and media factors, to foreground the unique sensory, perceptual and affective
experiences that are brought forth by an immersive experience.
3. Physiological measures for QoE
The advance of ever better and cheaper sensors makes it possible for the end user to purchase sensors that can measure
physiological parameters. Simple measuring devices, such as heart-rate monitors are already deeply integrated in many
wearables, and more sensors are about to become more popular. This makes it possible to use the sensors in two
different ways; either using them as an additional input for multimedia systems, meaning another contextual variable,
or using them as a feedback measure, measuring the users’ response to certain content.
In the first case, these kinds of measures give the creative industry the chance to use physiological signals as an
additional input, such that the user immerses even more into the story. This could include estimating the current
physical or emotional state, or the change of it, and using this information to offer even more suitable content for the
current situation which is not only based on time and/or location, but in addition on how the user is currently
feeling/how the current situation of the user is. On the other hand, it also could offer hints to the user, that they should
take a break now, because their cognitive capacity is decreasing, and thus giving the user a better experience in the
end.
Secondly, physiological measures can also be used to estimate the perceived level of QoE or how the users’ state is
changing during perception of different multimedia contents. Thus, not asking for a subjective feedback at the end of
the session, as done at the moment, but using the applied sensors for an estimate on how the state has been changing
during the use of the service. Measures of brain waves, i.e. electroencephalography, have shown that the cognitive
state is changing when users are exposed to longer low quality multimedia sequences, implying that the user is
becoming more fatigued Error! Reference source not found.. Furthermore, simpler measures such as e.g. heart rate
variability can give an indicator of the stress level for a user.
4. Use cases
4.1 Adressaparken
The results from quality assessments in multimedia communications let us extend our work moving into new digital
media enabling more immersive experiences. Our fist use case focuses on using lights, audio and visual presentations
and their interactions through sensor networks in public spaces. By creating our own content in the form of interactive
art installations, we are able to experiment on new ways of modeling and assessing QoE expanding the range from
pure audiovisual content to immersive and interactive content. Learning from the culture of the counter-establishment
IEEE COMSOC MMTC Communications - Frontiers
http://www.comsoc.org/~mmc/ 30/56 Vol. 13, No. 1, January 2018
and from remix innovations in art and digital media —often marked by a reframing of existing narratives from
alternative and innovative perspectives— we invoke the idea of a multiuse play space. The idea is to exploit a public
space facility to explore technological infrastructures and mobile materials that can be moved, combined, taken apart
and placed back together, and “placemaking” community interactions that can be reused and redesigned in a number
of ways. As an experimental platform, we developed a design-led development process for creating an exemplar
multiuse play space in Trondheim, Norway, resulting in Adressaparken. Adressaparken – an interactive installation
park – was designed and implemented in Trondheim, Norway as a platform for sensor based digital storytelling [8].
Adressaparken is a public park of around 1300 square meters surrounding the head office of the local newspaper –
Adresseavisen. The project is co-owned by the municipality of Trondheim, Adresseavisen and NTNU. More
information can be found at https://www.ntnu.edu/thepark/.
The current technical infrastructure of Adressaparken comprises 12 custom-built sensor boxes; eight reusable
support mounts for displays or screens; tree, tunnel, and river-side connectivity and power boxes; six outdoor speakers;
and two display projectors; and nine controllable walk-over LED light lines distributed all over the park. These 12
sensor boxes house smart sensors, mobile equipment, a sensor gateway, and power and Internet connections. All boxes
in the park contain power; USB ports; DMXS12 (a digital multiplex); a high-definition multimedia interface; a VGA;
RJ45 sockets; and Ethernet, Wi-Fi, and fiber-optic connections. We accommodate most of today's multimedia gadgets,
sensor connectivity, and power requirements in a secure, hammer-proof glass casing protected from extreme weather
conditions. We also installed sensors all over the park to monitor the temperature, air, light, sun, noise, pollution, and
presence of people. The driving force was to provide the city and its citizens with an arena for artistic experiences,
development of knowledge and a site for societal debates using sensor based digital stories as a primary focus in the
design process. The storytelling platform provides us with a unique platform for QoE assessment of new digital media
and immersive and interactive content where our users are the general public participating and experiences the story.
So far, we have achieved:
1. Through our successful design and placemaking methodologies, we have implemented a place that has the
characteristics of a successful public place for new digital media experiences;
2. Adressaparken promotes sociability by acting as a gathering place for frequent and meaningful interaction;
it offers activities, opportunities and immersive and interactive content for play;
3. It creates a thriving environment for art, technology, digital awareness, and cultural activities.
For experimentation and expressions, we are offering our Adressaparken infrastructures to anyone who would like to
collaborate, design and create their own projects and activities. Figure 2 shows the three exhibitions currently running
in the park. In \/\/iFi, we get tracked by our movement and smartphone signals and usage. What if we are visually
aware of that data and analytics? In Adressaparken visitors can connect to the \/\/iFi hotspots and get the story exposing
the hidden waves of our digital behaviors and movements in Adressaparken and plays with the digital jungle gym we
unconsciously already build. In Current the story is told thinking about the site of Adressaparken as a public space,
and as an interactive site with sensors that register information about the temperature of its environment. The
collaborative work is centered around interaction with natural spaces and phenomena connecting a glacier and working
with programming effects that could create connection between Adressaparken, a glacier and the digital field. We
thought of currents of traffic and people at Adressaparken, and ice and water being in movement as well as ways to
show fluidity and transformation happening in digital space. RGB Playspace is an open-ended facility for creative
empowerment that can be manipulated and transformed through play. This public art installation invites children and
adults alike to physically explore the interactive space to instantly become a composer of music and lights. RGB
Playspace is an example of art that brings technological change that can benefit everyone. There is a worldwide social
issue of digital awareness being not yet evenly distributed. By bringing art and technology into public spaces, the artist
hopes to pilot a new approach in bringing human community and social technology together - through playfulness and
humility in face of complexities.
IEEE COMSOC MMTC Communications - Frontiers
http://www.comsoc.org/~mmc/ 31/56 Vol. 13, No. 1, January 2018
Figure 2. Adressaparken (More information can be found at https://www.ntnu.edu/thepark/).
4.2 VisualMedia
Our second use case is from the traditional media industry. This industry has to adapt to the changes happening around
them. One point they have to embrace is to bring the news to different devices and platforms. This includes, TV,
websites, social media, or apps. The story needs to be adapted to each platform individually, as each of them is used
in a different way. In case of news stories for transmedia, it is required to have the content prepared differently for
each of the devices. The change in traditional broadcasting and media sector, and addressing the challenge to win back
the younger generation has also been the main focus of the VisualMedia project [9]. In order to immerse youngsters
and give them a voice, different approaches have been implemented. The project developed a workflow and tool, such
that broadcasters now have the possibility to easily search for content on different social media channels and put those
on live TV. Furthermore, VisualMedia offers accompanying apps that can be used by viewers for polls, published by
the TV channel, or interact with the TV show in different ways. Making the viewer an integral part of a TV show and
having visually appealing graphics. This will give the viewer the feeling of a better level of immersion in the show
and thus a better QoE. Figure 3 shows how user partner TVR (national TV in Romania) is using VisualMedia to
display 3D graphics of social media posts in their sports show.
Figure 3. Case scenario of Romanian national TV (TVR) within the VisualMedia project
IEEE COMSOC MMTC Communications - Frontiers
http://www.comsoc.org/~mmc/ 32/56 Vol. 13, No. 1, January 2018
5. Conclusions
The best way to be immersed is through a story, be it text, a play, music, a concert or a digital media experience. Thus,
stories can be analogue or digital, where our focus is on the digital stories. A key question in creating and improving
the best digital stories is how to model and assess such an immersive experience through refining and using the
measure Quality of Experience – QoE. As most immersive digital stories rely on some sort of sensors we have taken
this as a first approach and reviewed some recent trends in QoE for immersive experiences. Our research has focused
mainly on the creative aspects of new digital media, designing new platforms for combining art and technology and
creating immersive and interactive content in public spaces through Adressaparken and physiological measures for
QoE. Both areas are receiving a lot of attention in the quality evaluation community and are important aspects of
understanding immersion.
6 Acknowledgements
This work has partially been funded by the European Union's Horizon 2020 research and innovation programme under
grant agreement No 687800.
References [1] P. L. Callet, S. Möller and A. Perkis, “Qualinet White Paper on Definitions of Quality of Experience,” European Network on Quality of
Experience in Multimedia Systems and Services (COST Action IC 1003), Lausanne, Switzerland, Version 1.1, Jun. 2012. [2] Zhang, C., Hoel, A. S., & Perkis, A., “Quality of Immersive Experience in Storytelling: A Framework”. In Proc. IEEE Int. conf. on Quality of
Multimedia Experience (QoMEX 16), June 2016.
[3] Perrin, A. F., Řeřábek, M., & Ebrahimi, T., ”Towards prediction of Sense of Presence in immersive audiovisual communications” in
Electronic Imaging, pp. 1-8., 2016
[4] Antons, J.-N., Schleicher, R., Arndt, S., Möller, S., Porbadnigk, A. K. & Curio, G. (2012). Analyzing Speech Quality Perception using
Electro-Encephalography. Journal of Selected Topics in Signal Processing. IEEE, 721-731. [5] Redi, J. A., Zhu, Y., de Ridder, H., Heynderickx, I., ”How passive image viewers became active multimedia user”, in Visual Signal Quality
Assessment Springer International Publishing, 205, pp. 31-72.
[6] Mansilla, W. A., & Perkis, A., “Design and storytelling concepts in the quality of experience”, in Proc. IEEE Int. conf. on Quality of
Multimedia Experience (QoMEX 15), May 2015, pp. 1-6.
[7] Arndt, S., Antons, J.N., Schleicher, R., Möller, S. (2016). Using electroencephalography to analyze sleepiness due to low-quality audiovisual
stimuli. Signal Processing Image Communication (42). 120-129 [8] Wendy Ann Mansilla, Andrew Perkis, “Multiuse Playspaces: Mediating Expressive Community Places”, in IEEE MultiMedia vol. 24 no. 1,
2017, p. 12-16, (http://folk.ntnu.no/wendyann/Adressaparken_toolkit/).
[9] Arndt, S., Räty, V.P., Perkis, A. (2016). Opportunities of Social Media in TV Broadcasting. Proceedings of the 9th Nordic Conference on
Human-Computer Interaction. 123
Andrew Perkis received his Siv.Ing and Dr. Techn. Degrees in 1985 and 1994, respectively. In
2008, he received an executive Master of Technology Management in cooperation from NTNU,
NHH and NUS (Singapore). His current research focus is within the synergies of art and
technology (NTNU ARTEC), methods and functionality of content representation, quality
assessment and its use within the media value chain. His application focus is on art in public
spaces, sensor based digital storytelling, change management and business modelling for the creative and media
industry.
Sebastian Arndt is a post doc at the Norwegian University of Science and Technology, NTNU, in
Trondheim. He studied Computer Science at Technische Universität Berlin and received his diploma
in 2010. He received his doctoral degree (Dr.-Ing.) in 2015 from TU Berlin in the group of 'Quality
and Usability'. His current research is focusing on usability evaluation methods, and developing user
requirements (user-centered design) in the context of TV broadcasting which also includes evaluating
implemented systems. Furthermore, his research interests are in the area of Quality of Experience (QoE) of multimedia
and using (neuro)physiological methods to enhance well-being and quality of life.
IEEE COMSOC MMTC Communications - Frontiers
http://www.comsoc.org/~mmc/ 33/56 Vol. 13, No. 1, January 2018
SPECIAL ISSUE ON Content Caching and Sharing in Wireless Networks
Guest Editors:
Zheng Chang, University of Jyväskylä, Finland
Zhenyu Zhou, North China Electric Power University, China
[email protected], [email protected]
The proliferation of smart mobile devices and multimedia services has resulted in the explosive
increase of wireless data traffic. To cope with this challenge on data demand, caching content
and sharing at the network edge, including base stations and devices, has been proposed to
offload network traffic by pulling content closer to end users. With network edge caching, the
multimedia application data can be stored and shared in an efficient and distributed manner, and
thus can alleviate the burden of data transmission on the backhaul, improve the latency
performance and also save the battery of mobile devices. Meanwhile, due to its inherent nature
of caching and sharing at the network edge, there are many challenges ahead towards a reliable,
accurate and efficient mechanism for content caching and sharing among a number of mobile
devices. Specially, due to the large number of multimedia applications over massive devices and
rapid development of advance wireless technologies, content caching and sharing design is of
profound importance.
The 4 papers included in this special issue on content caching and sharing in wireless netwoirks
to address a number of noteworthy challenges and present the corresponding solutions and
suggestions. These contributions are made by authors who are renowned researchers in the field,
and the audience will find in these papers the research advances for enhanced content centric
wireless network for the multimedia services in terms of better efficiency and reliablity, among
many other metrics. Each of these 4 papers is briefly introduced in the following paragraphs.
As there is a trend to use renewable energy for the future heterogeneous networks, how to cope
the content caching with the energy harvesting capability is one of the most significant problems.
In contribution, “Content Caching and Push in Small Cells with Renewable Energy”, Jie Gong
explore the content information to design the joint caching and push mechanism in the small-cell
base stations (SBSs) powered by renewable energy. The problem is formulated as a Markov
decision process by exploring the features of content popularity and renewal and by taking into
consideration the energy consumption for both content fetch from core network and push to the
users. The objective is to minimize the number of requests which cannot be met by the SBSs.
Energy efficiency is a critical metric in the content caching system. “Energy Efficiency Analysis
of 5G Content Caching System” introduces a comprehensive system model for 5G with the
content caching mechanism. The authors introduce the caches into in-network router, BS and the
neighboring smart device sides with caching mechanism. With the smart portable devices, the
user can obtain its request contents from neighboring user’s caches. Additionally, it can also
obtain the request contents from in-network router or BS caches. The energy efficiency
performance of core network, long distance as well as short distance scenarios are then
investigated with achievable sum rate and consumed energy analyzes. It is a versatile model that
can be adopted by similar work as well.
IEEE COMSOC MMTC Communications - Frontiers
http://www.comsoc.org/~mmc/ 34/56 Vol. 13, No. 1, January 2018
Reducing the energy consumption of the wireless network is able to improve the energy
efficiency of the content caching system, while sometimes at the cost of longer transmission
delay. In contribution “”,Thang X. Vu, Lei Lei, and Satyanarayana Vuppala investigate the
energy efficiency performance of content delivery networks in which a data center serves
multiple users via a shared wireless medium. Focusing on latency-tolerant applications, the
authors propose energy-efficient precoding design and optimization that minimize the total
energy consumption while guaranteeing some given quality of service constraints. In particular,
an energy-buffering time trade-off is derived in a closed-form expression for single-user
scenarios, which reveals the impact of the key system parameters on the total energy
consumption. An energy minimization problem is formulated with a minimum mean square error
(MMSE)-based precoding design for multiple-user scenarios and addressed via a linear
approximation of the non-convex constraint.
Due to the inherent nature of vehicular network, the design of content caching and sharing is of
profound significance. In contribution, “Cooperative Content Caching and Distribution in
Multihop D2D-V2V Networks”, Yahui Wang, Zhenyu Zhou, Houjian Yu, and Chen Xu present
investigate how to achieve dependable content distribution in device to device (D2D) based
cooperative vehicular networks by combining big data based vehicle trajectory prediction with
coalition formation game based resource allocation, determine the formation of content
distribution groups with different lifetimes as a coalition formation game, and evaluate the delay
performance based on real-world map and realistic vehicular traffic. It can be observed that with
big data analytic capability, the content distribution scheme for vehicular network can
significantly improve the delay performance.
The guest editors would like to give our special thanks to all the authors for making contribution
to this special issue. We are also thankful to the MMTC Communications–Frontier Board for
providing helpful support.
Guest Editor Zheng Chang received the B.Eng. degree from Jilin University, Changchun, China in 2007,
M.Sc. (Tech.) degree from Helsinki University of Technology (Now Aalto University), Espoo,
Finland in 2009 and Ph.D degree from the University of Jyväskylä, Jyväskylä, Finland in 2013.
Since 2008, he has held various research positions at Helsinki University of
Technology, University of Jyväskylä and Magister Solutions Ltd in Finland.
He was a visiting researcher at Tsinghua University, China, from June to
August in 2013, and at University of Houston, TX, during from April to May
in 2015. He has been awarded by the Ulla Tuominen Foundation, the Nokia
Foundation and the Riitta and Jorma J. Takanen Foundation for his research
work. Currently he is working as a Assistant professor with University of
Jyväskylä and his research interests include cloud/edge computing, radio
resource allocation, and green communications. He is an Editor of IEEE
Access, Wireless Network and MMTC communication Frontier, and a guest
editor of IEEE Communications Magazine, IEEE Wireless Communications,
IEEE COMSOC MMTC Communications - Frontiers
http://www.comsoc.org/~mmc/ 35/56 Vol. 13, No. 1, January 2018
Wireless Communications and Mobile Computing, and IEEE Access. He serves as a TPC
member for numerous IEEE conferences, such as INFOCOM, ICC and Globecom, and reviewer
for major IEEE Journals, such as IEEE TVT, TWC, JSAC, TMC, ToN etc. He has received best
conference paper awards from IEEE APCC and IEEE TCGCC in 2017.
Zhenyu Zhou received his M.E. and Ph.D degree from Waseda University, Tokyo,
Japan in 2008 and 2011 respectively. From April 2012 to March 2013, he was
the chief researcher at Department of Technology, KDDI, Tokyo, Japan. From
March 2013 to now, he is an Associate Professor at School of Electrical and
Electronic Engineering, North China Electric Power University, China. He is
also a visiting scholar with Tsinghua-Hitachi Joint Lab on Environment-
Harmonious ICT at University of Tsinghua, Beijing from 2014 to now. He
served as an Associate Editor for IEEE Access, and a Guest Editor for IEEE Communications
Magazine and Transactions on Emerging Telecommunications Technologies. He also served as
workshop co-chair for IEEE ISADS 2015, and TPC member for IEEE Globecom, IEEE CCNC,
IEEE ICC, IEEE APCC, IEEE VTC, IEEE Africon, etc. He is a voting member of P1932.1
Working Group. He was the recipient of the IEEE Vehicular Technology Society "Young
Researcher Encouragement Award" in 2009, the “Beijing Outstanding Young Talent Award in
2016, the IET Premium Award in 2017, and the IEEE ComSoc Green Communications and
Computing Technical Committee 2017 Best Paper Award. His research interests include green
communications, vehicular communications, and smart grid communications. He is a senior
member of IEEE.
IEEE COMSOC MMTC Communications - Frontiers
http://www.comsoc.org/~mmc/ 36/56 Vol. 13, No. 1, January 2018
Content Caching and Push in Small Cells with Renewable Energy
Jie Gong
School of Data and Computer Science,
Sun Yat-sen University, Guangzhou 510006, China
Abstract: In this paper, we explore the content information to design the joint caching and push
mechanism in the small-cell base stations (SBSs) powered by renewable energy. The problem is
formulated as a Markov decision process by exploring the features of content popularity and
renewal and by taking into consideration the energy consumption for both content fetch from core
network and push to the users. The objective is to minimize the number of requests which cannot
be met by the SBSs. We adopt the policy iteration algorithm to obtain the optimal caching and
push policy. The performance gain of the proposed algorithm is shown in the numerical results.
1. Introduction
Recently, energy harvesting (EH) technology [1] has been considered as one of the candidate
technologies for green communications. However, due to the randomness of energy arrival and
limitations on the battery capacity, energy waste or shortage will occur when the energy harvesting
process and the traffic pattern mismatches with each other in either spatial or time domain. To
improve the efficiency of the harvested energy, one should adjust the power allocation policy using
the traffic information to re-shape the energy profile to match the traffic profile. As users may be
interested in the same content (latest news, popular videos and etc.), lots of repeated transmissions
can be reduced if the content information is fully utilized.
The content caching and push mechanism is viewed as a promising way to improve the efficiency
of content delivery in wireless network. To reduce the core network overhead, contents are
suggested to be cached at the small-cell BSs (SBSs) [2] with proactive caching. On the other hand,
with the improvement of data storage capacity, user devices are capable of storing large amount
of data. Hence, the content push mechanism [3] is developed based on wireless multicast. With
renewable energy, ref. [4] uses EH based SBSs to cache contents for the deployment flexibility
and energy consumption reduction, and ref. [5] designs the energy-aware resource allocation
algorithm with limited content cache. However, joint content caching and push policy design using
renewable energy is still an open problem.
In this paper, we combine the EH technology with the content caching and push by considering
EH powered SBSs under the GreenDelivery framework [6]. Specifically, with the non-negligible
energy consumption offetching contents from the core network, the SBS can not cache all the
contents due to the limited renewable energy. It needs to decide when to fetch, push or unicast
contents depending on the energy condition. We optimize the joint caching and push policy using
Markov decision process (MDP) [7] approach. Numerical results are provided to illustrate the
influence of cache size under different parameter settings as well as the tradeoff between the
number of cached contents in the SBS and the available energy for content push.
2. System Model and Problem Formulation
Consider a two-tier heterogeneous cellular network composed of a macro-cell BS (MBS) and a
second-tier small-cell with radius 𝑅, as shown in Fig. 1. The MBS is powered by the power grid
IEEE COMSOC MMTC Communications - Frontiers
http://www.comsoc.org/~mmc/ 37/56 Vol. 13, No. 1, January 2018
and the SBS is powered by renewable energy, and the harvested energy can be stored in a battery
with capacity 𝐵max. There is a dedicated wired/wireless backhaul link for the SBS to fetch contents
from the core network through the MBS, which consumes a fixed amount of energy 𝐸𝑓. The SBS
has a limited content cache size 𝑁. The content is transmitted with a constant data rate, and hence,
the transmission power depends on the distance between user and SBS.
Fig. 1. Two-tier heterogeneous cellular network
The system is slotted with time slot length 𝑇𝑠. The contents are assumed of the same length and
can be completely delivered in a time slot. The popularity of the contents is well fitted by the Zipf
distribution [8]. Specifically, the popularity of the 𝑖-th ranked content can be expressed as
𝑓𝑖 =1/𝑖𝑣
∑ 1𝑁𝑗=1 /𝑗𝑣
, (1)
where 𝑣 ≥ 0 is the skew parameter, N is the total number of contents. In additon, Assume in each
slot, a content leaves the system and is replaced by a new one with probability 𝑝𝑐 ∈ [0,1]. The
leaving content is uniformly chosen from 1,2, ⋯ , 𝑁.
There are two content cache states in this system, i.e., the number of cached contents in the SBS
��𝑘 and those at users ��𝑘, where 𝑘 is the time index. For optimality, the SBS and users always cache
the most popular contents. The SBS’s action includes: fetch a content from the MBS, unicast the
required content to a specific user, push a content to all users, or sleep. It can be denoted by 𝑢𝑘 =(��𝑘, ��𝑘), where ��𝑘 ∈ {0,1} indicates the fetch action, ��𝑘 ∈ {0,1,2} indicates sleep, unicast or push.
Notice that the backhaul link is orthogonal to the downlink unicast or push.
The user request is assumed to follow the Bernoulli distribution, i.e., there is a content request
with probability 𝑝𝑢 ∈ [0,1] in each time slot. The user request can be represented by 𝑄𝑘 = 𝑃𝑡(𝑑)𝑇𝑠,
where d is the transmission distance. The required energy for content push is 𝐸𝑝 = 𝑃𝑡(𝑅)𝑇𝑠. Set
𝑄𝑘 = 0 to indicate either there is no request or the content is in users' cache. The battery energy
state 𝐸𝑘 is updated as 𝐸𝑘+1 = min { 𝐵max, 𝐸𝑘 − 𝑈𝑘 + 𝐴𝑘}, where 𝑈𝑘 is the energy used for
transmission which satisfies 𝑈𝑘 ≤ 𝐸𝑘 , and 𝐴𝑘 is the harvested energy in period 𝑘 , which is
assumed i.i.d. with average ��. Our problem can be described as minimizing the ratio of user
requests handled by the MBS over the total user requests by adjusting the behavior of the SBS
under the energy constraint.
3. Optimal Policy Design
To find the optimal solution, we need to decide the SBS’s action based on the system state at the
beginning of each time slot. MDP [7], also termed as dynamic programming (DP), is an effective
IEEE COMSOC MMTC Communications - Frontiers
http://www.comsoc.org/~mmc/ 38/56 Vol. 13, No. 1, January 2018
tool to solve this type of problems and is widely used for the control optimization of stochastic
process. A standard MDP problem contains the following elements: state, action, cost function,
and state transition. In our problem, the state xk includes the battery state Ek, user request Qk, and
cache states ��𝑘, ��𝑘 in the SBS and users, the action uk includes fetch, push, unicast, and sleep, the
cost gk(xk, uk) is an indicator whether a user request is denied by the SBS. Then the optimization
problem can be re-written as
min lim𝐾→+∞
1
𝐾𝔼 [∑ 𝑔
𝐾−1
𝑘=0
(𝑥𝑘, 𝑢𝑘(𝑥𝑘))]. (2)
The expectation operation is taken over all the random parameters including energy arrival, user
request, and content update. The optimization is taken over all the possible policies {𝑢1, 𝑢2, ⋯ }. It
can be proved that there exists an optimal stationary policy 𝑢∗, and the optimal average cost 𝜆∗
together with some vector ℎ∗ = {ℎ∗(𝑥)|𝑥 ∈ 𝒮} satisfies the Bellman’s equation
𝜆∗ + ℎ∗(𝑥) = min𝑢∈𝒰(𝑥)
[𝑔(𝑥, 𝑢) + ∑ 𝑝𝑥→𝑦|𝑢
𝑦∈𝒮
ℎ∗(𝑦)]. (3)
Furthermore, if 𝑢∗(𝑥) attains the minimum value of (2) for each 𝑥, the stationary policy 𝑢∗ is
optimal. Based on the Bellman’s equation, the policy iteration algorithm can effectively solve the
problem. Suppose in the 𝑗-th step, we have a stationary policy denoted by 𝑢(𝑗). Based on this policy,
we perform policy evaluation step, i.e.,
𝜆(𝑗) + ℎ(𝑗)(𝑥) = 𝑔(𝑥, 𝑢(𝑗)(𝑥)) + ∑ 𝑝𝑥→𝑦|𝑢(𝑗)(𝑥)
𝑦∈𝒮
ℎ(𝑗)(𝑦) (4)
for ∀𝑥 ∈ 𝒮 to get the average cost 𝜆(𝑗) and vector ℎ(𝑗). As 𝑢(𝑗) may not be the optimal policy, we
subsequently perform policy improvement step to find the policy 𝑢(𝑘+1) which minimizes the right
hand side of Bellman’s equation
𝑢(𝑗+1)(𝑥) = arg min𝑢∈𝒰(𝑥)
[𝑔(𝑥, 𝑢) + ∑
𝑦∈𝒮
𝑝𝑥→𝑦|𝑢ℎ(𝑗)(𝑦)]. (5)
If 𝑢(𝑗+1) = 𝑢(𝑗), the algorithm terminates, and the optimal policy is obtained 𝑢∗ = 𝑢(𝑗). Otherwise,
repeat the procedure by replacing 𝑢(𝑗) with 𝑢(𝑗+1) . It is proved that the policy iteration
algorithmterminates in finite number of iterations.
4. Numerical Results
In this section, we run some numerical simulations for performance evaluation. We set the cell
radius 𝑅 = 50m, the required content delivery spectrum efficiency 𝑟0/𝑊 = 1bps/Hz, the pathloss
parameters 𝛽 = 10dB and 𝛼 = 2, and the Zipf parameter 𝑣 = 1. The maximum transmit power or
equivalently the transmit power for cell-edge user is set 𝑃𝑡(𝑅) = 1Watt. The channel coefficient
ℎ follows Rayleigh fading. The quantized battery capacity is set to 𝐸max = 12. The energy arrival
process follows a Poisson distribution with average arrival rate �� units of energy.
We compare the proposed algorithm with some heuristic algorithms to demonstrate the
importance of joint optimization of content caching and push. We consider the following baseline
algorithms: greedy fetch policy, in which the SBS always fetches contents as long as there is
sufficient energy and the cache in the SBS is not full, threshold fetch policy, in which the SBS
fetches contents if the cache size in the SBS does not achieve a pre-defined threshold 𝑁th, and
IEEE COMSOC MMTC Communications - Frontiers
http://www.comsoc.org/~mmc/ 39/56 Vol. 13, No. 1, January 2018
non-push policy, in which the SBS only unicasts the required contents to the users on demand.
Compared with greedy fetch policy, our optimal policy reduces the SBS blocking probability by
more than 40%. The results of threshold fetch policies with different 𝑁th show a tradeoff between
the number of cached contents in the SBS and the available energy for content push. In our settings,
the optimal threshold is 𝑁th = 5. In addition compared with non-push policy, the greedy fetch
policy can achieve more than 20% blocking probability reduction, which illustrates the great
performance improvement by push.
Fig. 2. Comparison with some heuristic algorithms. 𝒑𝒖 = 𝟎. 𝟓, 𝒑𝒄 = 𝟎. 𝟑, �� = 𝟎. 𝟔, 𝑬𝒇 = 𝑬𝐮𝐧𝐢𝐭, 𝑵 = 𝟖.
5. Conclusion
In this paper, content caching and push mechanism in EH-powered SBSs is jointly optimized. The
proposed policy iteration algorithm solves the problem, and the obtained optimal policy performs
much better than the heuristic greedy fetch policy and non-push policy. Using more energy to fetch
improves the content availability in the SBS, but degrades the energy availability for push/unicast.
The proposed optimal policy well balances the content availability and the energy availability.
REFERENCES [1]. D. Gunduz, K. Stamatiou, N. Michelusi, and M. Zorzi, “Designing intelligent energy harvesting
communication systems,” IEEE Communications Magazine, vol. 52, no. 1, pp. 210–216, Jan. 2014. [2]. N. Golrezaei, A. F. Molisch, A. G. Dimakis, and G. Caire, “Femtocaching and device-to-device
collaboration: A new architecture for wireless video distribution,” IEEE Communications Magazine, vol. 51, no. 4, pp. 142–149, Apr. 2013.
[3]. I. Podnar, M. Hauswirth, and M. Jazayeri, “Mobile push: Delivering content to mobile users,” in Proc. 22nd International Conference on Distributed Computing Systems Workshops, 2002.
[4]. N. Sharma, D. Krishnappa, D. Irwin, M. Zink, and P. Shenoy, “Greencache: Augmenting off-the-grid cellular towers with multimedia caches,” in Proc. ACM MMsys13, Feb. 2013.
[5]. A. Kumar and W. Saad, “On the tradeoff between energy harvesting and caching in wireless networks,” in IEEE International Conference on Communications (ICC), London, U.K., 2015.
[6]. S. Zhou, J. Gong, Z. Zhou, W. Chen, and Z. Niu, “Greendelivery: Proactive content caching and push with energy-harvesting-based small cells,” IEEE Communications Magazine, vol. 53, no. 4, pp. 142–149, Apr. 2015.
[7]. D. P. Bertsekas, Dynamic programming and optimal control, Volume II, 3rd edition. Athena Scientific Belmont, MA, 2005.
IEEE COMSOC MMTC Communications - Frontiers
http://www.comsoc.org/~mmc/ 40/56 Vol. 13, No. 1, January 2018
[8]. M. Cha, H. Kwak, P. Rodriguez, Y.-Y. Ahn, and S. Moon, “I tube, you tube, everybody tubes: Analyzing the world’s largest user generated content video system,” in Proc. 7th ACM SIGCOMM Conference on Internet measurement. ACM, 2007.
Jie Gong (S'09, M'13) received his B.S. and Ph.D. degrees in Department of Electronic
Engineering in Tsinghua University, Beijing, China, in 2008 and 2013, respectively.
From July 2012 to January 2013, he visited Institute of Digital Communications,
University of Edinburgh, Edinburgh, UK. During 2013-2015, he worked as a
postdoctorial scholar in Department of Electronic Engineering in Tsinghua University,
Beijing, China. He is currently an associate research fellow in School of Data and
Computer Science, Sun Yat-sen University, Guangzhou, Guangdong, China. He served
as workshop co-chair for IEEE ISADS 2015 and TPC member for the IEEE/CIC ICCC
2016/17, the IEEE WCNC 2017, the IEEE Globecom 2017, the IEEE CCNC 2017, and the APCC 2017.
He was a co-recipient of the Best Paper Award from IEEE Communications Society Asia-Pacific Board in
2013. He was selected as the IEEE Wireless Communications Letters (WCL) Exemplary Reviewer in 2016.
His research interests include Cloud RAN, energy harvesting technology and green wireless
communications.
IEEE COMSOC MMTC Communications - Frontiers
http://www.comsoc.org/~mmc/ 41/56 Vol. 13, No. 1, January 2018
Energy Efficiency Analysis of 5G Content Caching System
Di Zhang1,2, Zhenyu Zhou2,3, Zhengyu Zhu1, Shahid Mumtaz4
1School of Information Engineering, Zhengzhou University, Zhengzhou, 450-001, China.
2Department of Electric and Computer Engineering, Seoul National University, Seoul, 151-742,
Korea.
3State Key Laboratory of Alternate Electrical Power System with Renewable Energy Sources,
School of Electrical and Electronic Engineering, North China Electric Power University,
Beijing, 102206, China.
4Instituto de Telecomunicações, Aveiro, 1049-001, Portugal.
1. Introduction
Although various studies have been done with ICN's caching and sharing (CS) mechanism on energy efficiency (EE)
topic, it is found that the CS mechanism is separately discussed in prior work. A comprehensive performance
comparison of obtaining the request contents from caches located in in-network router, base station (BS) and
neighboring user side is still in its fancy. That is, obtaining the request contents from where, under what specific
condition, is still ambiguous. This inspires us to develop this treatise. To compare those three scenarios, it is assumed
the request content are distributed to the core router, BS, as well as neighboring users. Those distributions are defined
as the core router, long distance and short distance scenarios for the sake of convenient. The placement problem for
obtaining the request content is finally addressed with the analyzes and numerical results.
The contributions of this study are summarized as follows:
• A comprehensive redesigned system model is introduced for fifth generation (5G) with the CS mechanism.
That is, we introduce the caches into in-network router, BS and the neighboring smart device sides with CS
mechanism. With the smart portable devices, the user can obtain its request contents from neighboring user’s
caches. Additionally, it can also obtain the request contents from in-network router or BS caches.
• The EE performance of core network, long distance as well as short distance scenarios are investigated with
achievable sum rate and consumed energy analyzes. It is a versatile model that can be adopted by similar
work as well.
• Numerical results are used to answer the specific condition of where to obtain the request content problem.
It is found that the short distance scenario has best the EE performance, followed by the long distance and
core router scenarios. However, due to the limited battery of smart devices, in reality, long or core router
scenarios are more reasonable choices.
2. System model
In the proposed system here, massive multi-input-multi-output (MIMO) antenna array is selected as the outdoor BS.
It is assumed that one cell has 20 users with hundreds of massive MIMO antenna arrays, which is a widely used
assumption of massive MIMO BS in 5G [7]. The user’s requested contents can be either fulfilled by the contents
storing in the caches of neighboring users with smart devices, BS and the in-network core routers with CS mechanism.
In contrast, the request contents can be directly retrieved from the remote content server (on condition that there is no
requesting content cached in the cache). Detail information of the optimized system is given by Fig. 1. As shown, in
the system, the CS concept is comprehensively introduced to the neighboring user, BS as well as the in-network router
sides. This is different from the prior literature that separately investigates the CS mechanism from “in-network”, BS
or neighboring user regimes.
Suppose there are two users within one cell area, user A and user B, as shown by Fig. 1. In addition, user A, B, BS
and the in-network routers in the wired core network are capable of CS the temporary hot contents (which are visited
a lot). In this case, whenever user A has a content request, say “objective A request”, it can be obtained from the
caches named “Copy of A”. In contrast, obtaining it from the remote content server as the conventional system model
without the CS mechanism via back-haul links connected to the wireless and wired sections. Compared with obtaining
from the remote content server, the CS mechanism, once applied, can reduce the energy consumption via shorter
IEEE COMSOC MMTC Communications - Frontiers
http://www.comsoc.org/~mmc/ 42/56 Vol. 13, No. 1, January 2018
distance and less components that engaged in the transmission procedure, which yields better system EE performance.
However, by what scale the EE performance will be enhanced is still ambiguous. Moreover, the decision should be
clarified: say under what constraint, distributing the contents and obtaining them from where within the constraint of
this proposed system model should be set forth. This is the focused content distribution problem in this study, which
will be answered by the following sections.
Fig. 1. Description of the proposed system model.
B. Sum Rate
By following the prior analysis in [2], after the zero-forced beamforming (ZFBF), signal to interference plus noise
ratio (SINR) expression of user k will be
𝑆𝑁𝐼𝑅𝑘 = 𝜌𝑘
𝑀 − 𝑁
𝑁,
where 𝜌𝑘 is the signal to noise (SNR) of user k that can be given by 𝜌𝑘 = 𝑃𝑘
𝑃𝑛. Here 𝑃𝑘 , 𝑃𝑛 separately yields the
power of user k and the channel noise power. In addition, M, N the number of transmit and receive antenna. In this
case, by summarizing all the transmission rate within one cellular area, the achievable sum rate of one cellular area
will be
𝑅𝑠𝑢𝑚 = ∑ 𝑅𝑘
𝑁
𝑘=1
= ∑ 𝐵𝑙𝑜𝑔2 (1 + 𝜌𝑘
𝑀 − 𝑁
𝑁) ,
𝑁
𝑘=1
where B is the carrier bandwidth. Note that although the qualitative expression of 𝜌𝑘 has been given, but the specific
values of are still unknown. To settle down this, the following analysis will be employed.
3. Energy Efficiency Analysis
A. The long and short distance scenarios
We define obtaining the contents from caches of neighboring user, BS and core router as the short distance, long
distance and core router scenarios here in this study. By obtaining the requesting contents from neighboring user’s
caches and BSs, the transmission procedures are similar other than the distances that traveled through, thus we
comprehensively give their analysis first. With free space propagation model in hand, the SNR can be rewritten as
𝑃𝑘 =𝑃𝑡[
√𝐺𝑙𝜆
4𝜋𝑑]
2
𝑃𝑛.
Here 𝑝𝑡 is the emission power at the transmitter side, 𝐺𝑙 the coefficient of field radiation patterns in light of sight
(LoS) direction, 𝜆 the wavelength, d distance from transmitter to receiver, respectively. In line with prior work in
[3], the received noise power at receiver side can be given as
𝑃𝑛 = −174 + 10𝑙𝑜𝑔10𝐵 (𝑑𝐵𝑚). The achievable sum rate within one cellular area turns out to be
𝑅𝑠𝑢𝑚 = ∑ 𝐵𝑙𝑜𝑔2 (1 +𝑃𝑡[
√𝐺𝑙𝜆
4𝜋𝑑]
2
(𝑀−𝑁)
𝑁(−174+10𝑙𝑜𝑔10𝐵 (𝑑𝐵𝑚)))𝑁
𝑡=1 .
On condition that the requested content is obtained from the BS cache, power consumption can be estimated by
𝑃𝑤 = 𝑃𝑡 ,𝑡𝑜𝑡𝑎𝑙+ 𝑃𝑏𝑠 + 𝑃𝑅𝐹 + 𝑃𝑐𝑖𝑟𝑐𝑢𝑖𝑡 ,
IEEE COMSOC MMTC Communications - Frontiers
http://www.comsoc.org/~mmc/ 43/56 Vol. 13, No. 1, January 2018
where 𝑃𝑡 ,𝑡𝑜𝑡𝑎𝑙 is the power consumption of massive MIMO antenna array with 𝑃𝑡 ,𝑡𝑜𝑡𝑎𝑙 = ∑ 𝑃𝑡 ,𝑁𝑡=1 𝑃𝑏𝑠, 𝑃𝑅𝐹 , 𝑃𝑐𝑖𝑟𝑐𝑢𝑖𝑡
respectively denote the power consumption of massive MIMO array, BS machine room, radio frequency (RF) and
the circuit of this transmission procedure. Typically, RF power consumption is around 100 ∼ 200 mW, which is
ignored in the analysis. This gives the EE performance of long distance scenario
𝜂𝑒𝑒 ,𝑙𝑜𝑛𝑔 = 𝑅𝑠𝑢𝑚
𝑃𝑡,𝑡𝑜𝑡𝑎𝑙 + 𝑃𝑏𝑠 + 𝑃𝑐𝑖𝑟𝑐𝑢𝑖𝑡
.
In the short distance scenario, there is no power consumption from BS, circuit, by following a similar analysis
procedure, the EE expression can be given as
𝜂𝑒𝑒 ,𝑠ℎ𝑜𝑟𝑡 = 𝑅𝑠𝑢𝑚(4𝜋𝑑2)2
2𝑅𝑠𝑢𝑚
𝐵−1𝑃𝑛(√𝐺𝑙𝜆)
2.
It is worth to note that in short distance scenario, the power threshold of user equipment currently is around 1 ∼ 2
W.
B. The core router scenario
The power consumption of core router scenario can be estimated as
𝑃𝑐 = 𝑁𝑛 ∗ 𝑃𝑐𝑐 + (𝑁𝑛 + 1)𝑃𝑜 + 𝑃𝑤 , here 𝑁𝑛 is the number of core network equipment. Additionally, 𝑃𝑐𝑐 , 𝑃𝑜 are the power consumptions of one pair of
core network, optical fiber, respectively. The reason that needed optical fiber is 𝑁𝑛 + 1 giving 𝑁𝑛 is that, the optical
fiber link is need from BS to the first router by adopting a simple equal optical fiber distance from BS to core router,
core router to core router and core router to remote center. Moreover, 𝑃𝑐𝑐 can be given as
𝑃𝑐𝑐 = 𝑃𝑡𝑟𝑎𝑛𝑠 + 𝑃𝑝ℎ𝑦 + 𝑃𝑚𝑎𝑐 + 𝑃𝑡𝑝 + 𝑃𝑓𝑖 + 𝑃𝑚𝑒𝑚 + 𝑃𝑙𝑐 + 𝑃𝑝𝑠,
where 𝑃𝑡𝑟𝑎𝑛𝑠 , 𝑃𝑝ℎ𝑦 , 𝑃𝑚𝑎𝑐 , 𝑃𝑡𝑝 , 𝑃𝑓𝑖 , 𝑃𝑚𝑒𝑚 , 𝑃𝑙𝑐 , 𝑃𝑝𝑠 respectively denote the power of transceiver, physical player (such as
enconding/decoding, scrambing/descrambling, forward error correction (FEC)), Mac layer (such as mapping,
framing), transport profile/forward error (TP/FE) (such as packet processing, classifying), fabric interference, line
card, and the packet switch. On condition that one line card is used in the transmission with all the others turned off
to save energy, according to the estimation in [4], their values are 5.9 W, 3.4 W, 30.6 W, 183.6 W, 61.2 W, 13.6 W,
298.3 W, 224.4 W, respectively. By further taking the fan section into consideration, according to prior estimation
[5, 6], we have 𝑃𝑐𝑐′ =
100∗𝑃𝑐𝑐
67 . In this case, EE performance of core router scenario will be
𝜂𝑒𝑒 ,𝑐𝑜𝑟𝑒 = 𝑅𝑠𝑢𝑚
𝑃𝐶.
4. Simulation results
The simulation parameter that used here is given as Table I, in line with prior work [7, 8] and the 3GPP documents.
By comparing all the three scenarios, observation has that the short distance scenario consumed the least power
following by the long distance scenario and core router cache scenarios. In this regard, the short distance scenario
displays the best system EE performance, followed by the long distance scenario and core router scenario. All of
the cache-enabled scenarios display better EE performance than prior studies. In this case, ICN’s CS mechanism can
reduce the power consumption and enhance the system EE performance. In addition, while adopting the CS
mechanism, for distance less than 0.2 m (typically in the indoor environment), short distance scenario is a good
choice with the best system EE performance. Generally, in the outdoor environment, the long distance scenario can
be a more feasible choice.
Table I. Simulation parameters.
parameters values 𝐺𝑙 1
Carrier frequency f 1900 MHz User power threshold 𝑃𝑡ℎ 2 W
Carrier bandwidth B 20 MHz parameters values
transmit antenna number M 100 BS range 𝑑1 350 m
Receiver number N 20 BS power consumption 𝑃𝑏𝑠 400 W
Per user request 20 MBit/s 𝑃𝑐𝑖𝑟𝑐𝑢𝑖𝑡 160.8W
IEEE COMSOC MMTC Communications - Frontiers
http://www.comsoc.org/~mmc/ 44/56 Vol. 13, No. 1, January 2018
Fig. 2. EE performance of core router scenario.
Fig. 3. EE performances of short, long distance scenarios.
5. Conclusion
One comprehensive system model was introduced with CS mechanism in this paper. Based on the proposed system
model, EE performance analysis was comprehensively investigated with in-network core router, short and long
distance scenarios. Simulation results demonstrated that all the three scenarios displayed better EE performance
compared to the prior studies without the CS mechanism. While applying the CS mechanism in the outdoor
environment, long distance scenario is a more reasonable choice. In the indoor and other scenarios with closer user
to user distance, short distance scenario is better than the long distance scenario. Otherwise, the core router scenario
is a feasible choice compared to the without CS mechanism.
Reference
[1] E. Larsson, O. Edfors, F. Tufvesson, and T. Marzetta, “Massive MIMO for next generation wireless systems,”
IEEE Commun. Mag., vol. 52, no. 2, pp. 186–195, Feb. 2014.
[2] M. Jung, T. Kim, K. Min, Y. Kim, J. Lee, and S. Choi, “Asymptotic distribution of system capacity in multiuser
MIMO systems with large number of antennas,” in IEEE VTC, Jun. 2013, pp. 1–5.
[3] Q. Gu, “RF system design of transceivers for wireless communication.” Springer US, 2005.
[4] S. Aleksic, “Analysis of power consumption in future high-capacity network nodes,” J. Opt. Commun. and
Netw., vol. 1, no. 3, pp. 245–258, Aug. 2009.
[5] A. Greenberg, J. Hamilton, D. A. Maltz, and P. Patel, “The cost of a cloud: Research problems in data center
networks,” SIGCOMM Comput. Commun. Rev., vol. 39, no. 1, pp. 68–73, Dec. 2008.
[6] A. C. Orgerie, M. D. d. Assuncao, and L. Lefevre, “A survey on techniques for improving the energy efficiency
of large-scale distributed systems,” ACM Comput. Surv., vol. 46, no. 4, pp. 4701–4731, Mar. 2014.
[7] Z. Zhou, J. Gong, Y. He, and Y. Zhang, “Software defined Machine-to-Machine communication for smart
energy management,” IEEE Commun. Mag., vol. 55, no. 10, pp. 52–60, Oct. 2017.
[8] C. Lab, “Cloud radio network white paper (3rd edition),” Tech. Rep., Jun. 2014.
IEEE COMSOC MMTC Communications - Frontiers
http://www.comsoc.org/~mmc/ 45/56 Vol. 13, No. 1, January 2018
Di Zhang (S’13-M’17) received his Ph.D. degree with honor from Waseda University, Tokyo,
Japan (2013-2017), MSc. degree with honor from Central China Normal University, Wuhan,
China (2010-2013). Currently, he is an Assistant Professor with the Zhengzhou University,
Zhengzhou, 450-001, China, and also a Senior Researcher with the Information System
Laboratory, Department of Electrical and Computer Engineering, Seoul National University,
Seoul, 151-742, South Korea. He visited the State Key Laboratory of Alternate Electrical Power
System with Renewable Energy Sources, North China Electric Power University (2015-2017),
and the Advanced Communication Technology Laboratory, National Chung Hsing University
(2012). He served as the TPC member of several IEEE conferences, such as IEEE ICC, WCNC,
VTC, CCNC, Healthcom. His research interests include 5G, internet of things, vehicle communications, green
communications and signal processing.
Zhenyu Zhou (M’11-SM’17) received his M.E. and Ph.D degree from Waseda University,
Tokyo, Japan in 2008 and 2011 respectively. From April 2012 to March 2013, he was the chief
researcher at Department of Technology, KDDI, Tokyo, Japan. From March 2013 to now, he is
an Associate Professor at School of Electrical and Electronic Engineering, North China Electric
Power University, China. He is also a visiting scholar with Tsinghua-Hitachi Joint Lab on
Environment-Harmonious ICT at University of Tsinghua, Beijing from 2014 to now. He served
as an Associate Editor for IEEE Access, and a Guest Editor for IEEE Communications Magazine
and Transactions on Emerging Telecommunications Technologies. He also served as workshop
co-chair for IEEE ISADS 2015, and TPC member for IEEE Globecom, IEEE CCNC, IEEE ICC,
IEEE APCC, IEEE VTC, IEEE Africon, etc. He is a voting member of P1932.1 Working Group. He was the
recipient of the IEEE Vehicular Technology Society "Young Researcher Encouragement Award" in 2009, the
“Beijing Outstanding Young Talent Award in 2016, the IET Premium Award in 2017, and the IEEE ComSoc Green
Communications and Computing Technical Committee 2017 Best Paper Award. His research interests include green
communications, vehicular communications, and smart grid communications. He is a senior member of IEEE.
Zhengyu Zhu (S’12-M’17) received B.S. degree from Henan University in 2010, and received the Ph.D. degree
from Zhengzhou University, China, in 2017. Currently, He is with the School of Information Engineering,
Zhengzhou University, China. His research interests include information theory and signal processing for wireless
communications such as MIMO wireless network, physical layer security, wireless cooperative networks, internet of
things, vehicle communications, convex optimization techniques, and energy harvesting communication systems.
Shahid Mumtaz (SM’16) ([email protected]) has more than seven years of wireless industry
experience and is currently working as a senior research scientist and technical manager at
Instituto de Telecomunicações Aveiro (IT), Portugal, in the 4TELL group. Prior to his current
position, he worked as a research intern at Ericsson and Huawei Research Labs. He received his
M.Sc. and Ph.D. degrees in electrical and electronic engineering from Blekinge Institute of
Technology Karlskrona, Sweden, and the University of Aveiro. His research interests lie in the
field of architectural enhancements to 3GPP networks (i.e., LTE-A user plane and control plane
protocol stack, NAS, and EPC), 5G related technologies, green communications, cognitive
radio, cooperative networking, radio resource management, cross-layer design,
backhaul/fronthaul, heterogeneous networks, M2M and D2D communication, and baseband digital signal
processing. He has more than 80 publications in international conferences, journals, and book chapters.
IEEE COMSOC MMTC Communications - Frontiers
http://www.comsoc.org/~mmc/ 46/56 Vol. 13, No. 1, January 2018
Energy-Efficient Design for Latency-tolerant Content Delivery Networks
Thang X. Vu, Lei Lei, and Satyanarayana Vuppala
The Interdisciplinary Centre for Security, Reliability and Trust (SnT), U
niversity of Luxembourg, 29 Avenue John
F. Kennedy, Luxembourg. Email: {thang.vu, lei.lei, satyanarayana.vuppala}@uni.lu
Abstract
In this paper, we investigate the energy efficiency performance of content delivery networks in which a data center serves multiple users via a shared wireless medium. Focusing on latency-tolerant applications, we propose energy-efficient precoding design and optimization that minimize the total energy consumption while guaranteeing some given quality of service constraints. In particular, an energy-buffering time trade-off (EBT) is derived in a closed-form expression for single-user scenarios, which reveals the impact of the key system parameters on the total energy consumption. We then formulate an energy minimization problem with a minimum mean square error (MMSE)-based precoding design for multiple-user scenarios. In order to overcome the non-convexity of the formulated problem, we propose an iterative algorithm which solves the problem suboptimally via a linear approximation of the non-convex constraint. Finally, numerical results are presented to demonstrate the effectiveness of the proposed solution.
Index terms— Content delivery networks, precoding, energy efficiency, latency, optimization.
I. NTRODUCTION
Future content delivery networks will have to address stringent requirements of delivering content at
high speed and low latency due to the proliferation of mobile handsets and data-hungry applications. It
is predicted by Cisco that more than 70% of network traffic will be video in 2018. On the other hand,
only 5–10% of the files are frequently requested, which results in an inefficient utilization of network
resources of the conventional content delivery. One of the promising solutions to improve the resources
utilization is storing the content closer to users in distributed storage, which is referred to content
placement or caching [1]. Caching usually consists of two phases: placement and delivery. The placement
phase is executed during off-peak time when the network resources are redundant. In this phase, popular
content is duplicated and stored in the distributed caches in the network. The later usually occurs during
peak-traffic hours when the users’ demands are requested. If the requested content is available in the
user’s local storage, it can be served locally without being sent via the network. In this manner,
caching allows significant throughput reduction during peak-traffic time and thus reduces network
congestion [1–5].
The joint design of caching and physical layer design has attracted much attention recently. The
basic principle is to take into consideration the caching capacity at the edge nodes when designing
the signal transmission to improve the resources [6–9]. The authors in [6] study the trade-off between
energy consumption and backhaul load during the placement phase in heterogeneous networks. In [7],
a closed-form expression of the energy efficiency is derived showing essential impacts of caching. The
authors in [8] show that significant reduction in transmit power and fronthaul bandwidth can be obtained
via the careful design of cache-aware multicast beamforming and power allocation. In [11], the authors
study D2D networks in which the content can be cached at either small base stations or user nodes. A
joint content replacement and delivering scheme is developed to reduce the total energy cost taking into
account the fading channels. In [12], success delivery rate is studied in cluster-centric networks, which
group small base stations (SBSs) into disjoint clusters. The SBSs within one cluster share a cache
which is divided into two parts: one contains the most popular files, and one comprises different files
IEEE COMSOC MMTC Communications - Frontiers
http://www.comsoc.org/~mmc/ 47/56 Vol. 13, No. 1, January 2018
which are most popular locally. The authors in [13] study energy consumption based on an over
simplified model which assumes caching and transportation costs are linearly dependent on the number
of bits.
In this paper, we investigate the energy efficiency of content delivery networks in which a base station
(BS) is serving multiple users via a shared wireless channel. We focus on latency-tolerant applications
where the users can tolerate a reasonable delay before starting the requested service. First, we derive an
energy-buffering time trade-off (EBT) in a closed-form expression for single-user scenarios. From the
derived closed form, the impact of key system parameters on the total energy consumption is revealed.
We then formulate an optimization problem to minimize the total system energy usage for multiple-
user scenarios. In order to overcome the non-convexity of the formulated problem, we propose an
iterative algorithm which approximates the non-convex constraint by the first order approximation.
Finally, the effectiveness of the formulated problem is demonstrated via numerical results.
II. SYSTEM MODEL
We consider a content delivery network consisting of one BS equipped with L antennas serving K single-antenna users via a shared wireless medium, with K ≤ L, as depicted in Figure 1. The BS is connected to a data centre via high speed backhaul links. The BS is assumed to have full access to the content at the data centre, which contains N files of equal size of Q bits (in practice, unequal file size can be divided into trunks of subfiles which have the same size) and is denoted by F = {F1 , . . . , FN } the library. The users are equipped with a cache memory of size M (files) . We consider offline caching and focus on the energy consumption of the delivery phase [8].
A. Caching model
In this paper, we assume the content popularity follows a Zipf distribution [14]. The
probability of the i-th file being requested from a user is given as
where α is the skewness factor of the Zipf distribution.
IEEE COMSOC MMTC Communications - Frontiers
http://www.comsoc.org/~mmc/ 48/56 Vol. 13, No. 1, January 2018
In order to minimize the channel load, the users will cache the most popular files in their
cache. In particular, the first M
most popular files are prefetched at the user caches during the placement phase, which occurs
during off-peak time [1].
B. Signal transmission model
In the delivery phase, each user requests a file from the BS. First the user checks its own cache. If the requested file has been prefetched in its cache, it can be serve immediately. Otherwise, the requested file will be transmitted from the BS. Denote K’ as the subset of users whose requested files are not available in their cache. The BS will only transmit to these users in |K’ |. Obviously, |K’ | ≤ K .
We consider latency-tolerant applications, where the users can allow some buffering time after releasing their requests. Let θ denote a buffering time that the users can tolerate (the gap time between the moment the users send requests and when they can start the requested service, e.g., watching a video). Since the users can tolerate a buffering time θ, they will use this period to preload parts of the requested file to their buffer. Denote 𝐰𝑘
𝑏, 𝐰𝑘𝑡 ∈ CL×1 as the
precoding vector for user k during the buffering and transmission time, respectively. The received signal at user k is given as
where the superscript (b, t) represents the corresponding buffering time or transmission time, 𝑥𝑘 is the modulated signal of the requested file from user k, 𝑧𝑘 is Gaussian noise, and 𝐡𝑘 is the channel fading vector from the BS antennas to user k, which follows a circular-symmetric complex Gaussian distribution. Perfect channel state information (CSI) is assumed to be known at the BS. In practice, robust channel estimation can be achieved through the transmission of pilot sequences. We consider block fading channels and assume the channel coherence time is sufficient long to accommodate one request session [8]. By treating the interference as noise, the respective achievable information rate for user k ∈ K’ during the buffering time and transmission time are
where B is the channel bandwidth.
III. Problem Formulation
In this section, we consider an energy minimization problem with delay-tolerant design. The problem formulation is shown below. For more details, please refer to [10].
IV. NUMERICAL RESULTS
IEEE COMSOC MMTC Communications - Frontiers
http://www.comsoc.org/~mmc/ 49/56 Vol. 13, No. 1, January 2018
This section presents numerical results to demonstrate the derived optimization. The system parameters for simulations are as follows: B = 1 MHz, κ = −20 dB, σ2 = −10 dBm, Q = 48 Mbits, and the request rate r1 = · · · = rK = r = 4 Mbps which is corresponding to the expected serving time T = Q/r = 12 seconds, Ptot = 2 Watt.
Fig. 2a presents the EBT for the single-user scenario without caching, i.e., M = 0. It is observed that the analysis perfectly
matches simulation results. If the user does not allow any delay, it costs 0.58 Joule to send the
requested file. However, if the user can tolerate a delay of 0.8 seconds, the system can save
10% of the energy cost. Fig. 2b plots the energy consumption in multi-user systems under
two precoding designs for two cases: without caching, i.e., M = 0 (left subfigure), and with
a cache size M = 0.1N (right subfigure). The energy consumption is calculated based on the
optimial solution of the formulated problems in Section IV. It is shown that the MMSE-based
design is more efficient than the ZF-based design in the considered setting. In particular, the
MMSE design consumes approximately 10% less than the ZF design. It is also shown that with
a cache size equal to 10% of the library size, the system can significantly reduce 75% the
total energy usage. In all cases, increasing the tolerated latency results in less energy
consumption. We would remark that the average energy cost per user in this case (left subfigure)
is higher than in the single-user scenario since additional energy is required to mitigate inter-
user interference.
V. CONCLUSION
We have analysed the energy performance of cache-assisted content delivery networks in
which a date centre is serving users via shared wireless channels. First, we have derived an
energy-buffering time trade-off in a closed-form expression for single-user scenarios. We then
have formulated two optimization problems corresponding two linear precoding design for
multi-user systems to minimize the total system energy consumption taking into account an
allowable latency. The developed framework can be utilized as a guideline for system design
and optimization for latency-tolerant services.
REFERENCES
IEEE COMSOC MMTC Communications - Frontiers
http://www.comsoc.org/~mmc/ 50/56 Vol. 13, No. 1, January 2018
[1] S. Borst, V. Gupta, and A. Walid, “Distributed caching algorithms for content distribution networks,” in Proc. IEEE Int. Conf. Comput.
Commun., Mar. 2010, pp. 1–9.
[2] M. A. Maddah-Ali and U. Niesen, “Fundamental limits of caching,” IEEE Trans. Inf. Theory, vol. 60, no. 5, pp. 2856–2867, May 2014. [3] K. C. Almeroth and M. H. Ammar, “The use of multicast delivery to provide a scalable and interactive video-on-demand service,” IEEE J. Sel. Areas
Commun., vol. 14, no. 6, pp. 1110–1122, IEEE Trans. Inf. Theory. 1996. [4] M. Ji, G. Caire, and A. F. Molisch, “Fundamental limits of caching in wireless D2D networks,” IEEE Trans. Inf. Theory, vol. 62,
no. 2, pp. 849–869, Feb. 2016. [5] A. Sengupta, R. Tandon, and T. C. Clancy, “Fundamental limits of caching with secure delivery,” IEEE Trans. Info. Forensics and
Security, vol. 10, no. 2, pp. 355–370, Feb. 2015. [6] F. Gabry, V. Bioglio, and I.Land, “On energy-efficient edge caching in heterogeneous networks,” IEEE J. Sel. Areas Commun., vol. 34, no. 12, pp.
3288–3298, Dec. 2016. [7] D. Liu and C. Yang, “Energy efficiency of downlink networks with caching at base stations,” IEEE J. Sel. Areas Commun., vol. 34,
no. 4, pp. 907–922, Apr. 2016. [8] M. Tao, E. Chen, H. Zhou, and W. Yu, “Content-centric sparse multicast beamforming for cache-enabled cloud RAN,” IEEE Trans.
Wireless Commun., vol. 15, no. 9, pp. 6118–6131, Sept. 2016. [9] T. X. Vu, S. Chatzinotas, and B. Ottersten, “Energy-efficient design for edge-caching wireless networks: When is coded-caching beneficial?” in Proc.
IEEE Int. Workshop Signal Process. Adv. Wireless Commun., Jul. 2017, pp. 1–5. [10] T. X. Vu, L. Lei, and S. Vuppala, Energy-Efficient Design for Latency-tolerant Content Delivery Networks, in Proc. IEEE WCNC
Workshop, Barcelona, Apr. 2018, pp. 1-6. [11] M. Gregori, J. Gmez-Vilardeb, J. Matamoros, and D. Gndz, “Wireless content caching for small cell and D2D networks,” IEEE J. Sel.
Areas Commun., vol. 34, no. 5, pp. 1222–1234, May 2016. [12] Z. Chen, J. Lee, T. Q. Quek, and M. Kountouris, “Cooperative caching and transmission design in cluster-centric small cell networks,” IEEE Trans.
Wireless Commun., vol. 16, no. 5, pp. 3401 – 3415, May 2016. [13] Y. Xu, Y. Li, Z. Wang, T. Lin, G. Zhang, and S. Ci, “Coordinated caching model for minimizing energy consumption in radio access network,” in Proc.
IEEE Int. Conf. Commun., 2014, pp. 2406–2411. [14] L. Breslau, P. Cao, L. Fan, G. Phillips, and S. Shenker, “Web caching and Zipf-like distributions: Evidence and implications,” in IEEE INFOCOM, Mar.
1999, vol. 1, pp. 126–134.
Thang X. Vu was born in Hai Duong, Vietnam. He received the B.S. and the M.Sc., both in Electronics and Telecommunications Engineering, from the VNU University of Engineering and Technology, Vietnam, in June 2007 and September 2009, respectively, and the Ph.D. in Electrical Engineering from the University Paris-Sud, France, in January 2014. From 2007 to 2009, he was with the Department of Electronics and Telecommunications, VNU University of Engineering and Technology, Vietnam as a research assistant. In 2010, he received the Allocation de Recherche fellowship to study Ph.D. in France. From September 2010 to May 2014, he was with the Laboratory of Signals and Systems (LSS), a joint laboratory of CNRS, CentraleSupelec and University Paris-Sud XI, France. From July 2014 to January 2016, he was postdoctoral researcher with the Singapore University of Technology and Design (SUTD), Singapore. Currently, he is research associate at Interdisciplinary Centre for Security, Reliability and Trust (SnT), University of Luxembourg. His research interests are in the field of wireless communications, with particular interests of cache-assisted 5G, cloud radio access networks, resources allocation and optimization, cooperative diversity, channel and network decoding, and iterative decoding.
Lei Lei received the B.Eng. and M.Eng. degrees from Northwestern Polytechnical University, Xi’an, China, in 2008 and 2011, respectively. He obtained his Ph.D. degree in 2016 at the Department of Science and Technology, Linko ping University, Sweden. From Nov. 2016, he is a research associate at the Interdisciplinary Centre for Security, Reliability and Trust (SnT), University of Luxembourg. He was a research assistant at Institute for Infocomm Research (I2 R), A*STAR, Singapore, from June 2013 to December 2013. He received the IEEE Sweden Vehicular Technology-Communications-Information Theory (VT-COM-IT) joint chapter best student journal paper award in 2014. His current research interests include resource allocation and optimization in 4G/5G/satellite networks, wireless caching, energy-efficient communications.
Satyanarayana Vuppala received the Bachelor of Tech. degree with distinction in Computer Science and Engineering from JNTU Kakinada, India, in 2009, and the Master of Tech. degree in Information Technology from the National Institute of Technology, Durgapur, India, in 2011. He received the Ph.D. degree in Electrical Engineering from Jacobs University Bremen in January 2015. He was a Postdoctoral at IDCOM, University of Edinburg, UK, from 2015 to 2017. Since May 2017, he is a post-doctoral researcher at the Interdisciplinary Centre for Security, Reliability and Trust, University of Luxembourg. His research activities are mainly focused 5G Networks (Millimeter wave, Full-duplex, Non-orthogonal multiple access, D2D), Machine learning for Wireless Networks, and Internet of Things. He also works on physical, access, and network layer aspects of wireless security. He coauthored articles short-listed for the Best Paper Awards at the Asilomar Conference on Signals Systems and Computers in 2012, and 2014.
IEEE COMSOC MMTC Communications - Frontiers
http://www.comsoc.org/~mmc/ 51/56 Vol. 13, No. 1, January 2018
Cooperative Content Caching and Distribution in Multihop D2D-V2V Networks
Yahui Wang*, Zhenyu Zhou*, Houjian Yu*, and Chen Xu*
*School of Electrical and Electronic Engineering, North China Electric Power University,
Beijing, China.
1. Introduction
Device-to-device (D2D) communication, which allows direct content sharing over proximate peer-to-peer
links [1] and dependable vehicular connectivity [2]. And D2D based V2V (D2D-V2V) communication can
also realize effective data offloading, dependable service delivery and coordinated resource utilization by
exploring the cellular infrastructures with centralized intelligence [3].
A number of works have studied content distribution problems in conventional D2D networks including
relay networks [4], social networks [5], as well as mmWave cellular networks [6]. However, these works
are not suitable for the highly dynamic and unreliable D2D-V2V links. And as for works addressed the
content distribution problem in D2D based vehicular networks [7], [8], they have not consider the multi-
hop transmission scenario and vehicle trajectory prediction.
However, it imposes new challenges in D2D based vehicular content distribution. First, it is difficult to
form a content distribution group in fast-varying channel conditions and network topologies. Second, co-
channel interference should be carefully managed to satisfy the dependable timeliness requirements of
D2D-V2V communication. Thirdly, the multi-hop content distribution process involves a joint optimization
with peer discovery and spectrum allocation from a delay minimization perspective.
In this work, we investigate how to achieve dependable content distribution in D2D based cooperative
vehicular networks by combining big data based vehicle trajectory prediction with coalition formation game
based resource allocation, determine the formation of content distribution groups with different lifetimes
as a coalition formation game, and evaluate the delay performance based on real-world map and realistic
vehicular traffic by connecting SUMO with MATLAB via predefined standard interfaces.
2. System Model
Figure 1 The system model of D2D-V2V multihop content distribution
Figure 1 shows the system model of a D2D based cooperative vehicular network, which is composed of
a base station (BS), K cellular user equipments (CUEs), M vehicular content providers (V-TXs), and N
IEEE COMSOC MMTC Communications - Frontiers
http://www.comsoc.org/~mmc/ 52/56 Vol. 13, No. 1, January 2018
vehicular content requesters (V-RXs). Vehicle mobility pattern and trajectory prediction have been
studied in [9]-[11]. We adopt a multi-Kalman filter (MKF) based trajectory prediction approach with the
assistance of global positioning system (GPS) and geographic information system (GIS) big data
proposed in [9] to estimate the connection time between two vehicles in transmission process. We assume
that each CUE k
VC is allocated with one orthogonal uplink resource block (RB) RB
kC , which can be
reused simultaneously by at most one D2D-V2V multicast transmission.
We consider an example that V-TX TX
mV serves the content request of V-RX
RX
nV by reusing RB RB
kC .
On account of uplink spectrum reusing, TX
mV will create co-channel interference to the BS, while RX
nV
will suffer from the interference caused by CUE k
VC . To evaluate the content distribution performance,
we take the network average delay as a key measurement, which can be expressed as a function of SINR
and vehicle connection time. The corresponding transmission rate is defined as ,
k
m nr , then the transmission
delay of D2D-V2V link (TX
mV , RX
nV ) using RB RB
kC can be approximately calculated as
,
,
,,( | )
k
m nk
m n k
m nm nt
%
% (1)
where ,
,
k
m nk
m n
D
r % and D represents the size of the required content in bits. ,,( | )
k
m nm nt % is an indicator
function of connection time ,m nt and it is defined as 1 when
,m nt > ,
k
m n% . ,,( | )k
m nm nt % makes sure that the
connection time of two vehicles should be no less than the duration required to deliver the content.
In each content distribution group, the content is delivered simultaneously from a serving V-TX to
multiple V-RXs co-located within the same group. During the modeling process of vehicular content
distribution, there are two critical aspects that should be carefully considered. First, the numbers of V-
TXs and V-RXs vary over time rather than remain constant. The number of potential V-TXs increases
gradually as more and more V-RXs obtain the content. Second, the lifetime of each D2D-V2V content
distribution group is different from one another due to the diverse channel conditions and interference
levels.
The delay ,
K
m nT for RX
nV to obtain the content from TX
mV is composed of the delay required for TX
mV to
obtain the content, and the transmission delay from TX
mV to RX
nV .
We design a M N K matrix M N K to represent the set of optimization variables. Each element
, ,m n ko of the matrix M N K is a binary variable. If TX
mV and RX
nV form a D2D-V2V pair by using RB
kC ,
, , 1m n ko , and otherwise, , , 0m n ko . The formulated joint peer discovery, spectrum allocation, and route
selection problem is given by
, ,
, , ,,{ }
1min
m n k RX TX RBn RX m TX k RB
K
m n k m no
V v V v C c
o TN
(2)
It is noteworthy that , min
k V
m n and min
m C
k needs to be satisfied to ensure QoS requirements for
cellular links and D2D-V2V links. In addition, all of the V-RXs in the same group are related to the same
V-TX and the same RB.
4. Coalition Formation Game based Dependable Content Distribution and Simulation Results
In this section, we introduce how to formulate the original content distribution problem as a coalition
formation game and some fundamental concepts. And the proposed algorithm is evaluated in simulation
based on real-world road topology and realistic vehicular traffic.
IEEE COMSOC MMTC Communications - Frontiers
http://www.comsoc.org/~mmc/ 53/56 Vol. 13, No. 1, January 2018
In a coalition formation game, a set of game players seek to form cooperative content distribution
groups with the aim to reduce average network delay. Here, the game formulation is defined as a triplet
(T, P, U), where T is the player set defined as TX RX RBv v c , P is a collection of coalitions, and U
denotes the coalition utility. Furthermore, P is also defined as any arbitrary set of disjoint coalitions
mS T . If P spans the player set T, P can also be regarded as a partition of T. Although coalitions are
formed to achieve dependable content distribution, there may exist some V-RXs not included in P
because of the QoS and connection time constraints. To make the definition of coalition consistent, we
introduce the concept of solo coalition {RX
nV }, which contains only the unserved V-RX RX
nV .
During a coalition game, each V-RX tends to join an ideal coalition to maximize its individual payoff.
The RB occupied by the coalition can be released for new coalition formation if and only if all of the V-
RXs within that coalition have received the requested content. Hence, the objective of a coalition is to
minimize the average delay of all the coalition members. As a result, a V-RX may be refused by a
coalition if it dramatically decreases the coalition utility.
After obtaining the requested content, a V-RX can act as V-TX and join a new coalition to serve other
V-RXs in the next hop. A new D2D-V2V coalition can only be formed if a RB is willing to join this
coalition. A conflict arises when multiple RBs tend to join the same coalition. In this case, only the RB
with the highest payoff is allowed to join the coalition.
Based on the concepts of preference relation and the split and merge rule, the coalition formation game
based vehicular content distribution is implemented as follows.
Phase 1: Coalition formation initialization
Phase 2: Iterative coalition formation
Phase 3: Resource allocation and content dissemination
The algorithm terminates if either one of the following conditions is satisfied. One is that any RX RX
nV V has obtained the requested content. The other is any V-RX that has not obtained the content
yet cannot be served by any TX TX
mV V .
(a) (b)
Figure 2 The percentage of served V-RXs and average network delay performance
The simulation of content distribution is conducted by connecting SUMO with MATLAB through the
standard traffic control interference (TraCI) protocol. We compare the proposed algorithm with two
heuristic schemes, i.e., a non-cooperative content distribution scheme [12] and a random group formation-
based content distribution scheme [12].
IEEE COMSOC MMTC Communications - Frontiers
http://www.comsoc.org/~mmc/ 54/56 Vol. 13, No. 1, January 2018
Figure 2(a) shows the content distribution efficiency versus time. More rapid content distribution can be
achieved by the proposed algorithm during the beginning phases. In addition, the proposed algorithm
achieves better coverage performance when the content distribution process is finished since the route
selection, peer discovery, and spectrum allocation are jointly optimized in the proposed algorithms.
Figure 2(b) shows that the average network delay performance decreases monotonically with the number of
RBs. Adding more RBs can not only support numerous content distribution groups but also can introduce additional
diversity gain since there will be an increased opportunity for each group to select a better RB. Hence, the
performance gap demonstrates that the benefits brought by increasing the number of RBs can be better explored by
the proposed algorithm.
4. Conclusion
In this paper, we investigated the content distribution problem in D2D-based cooperative vehicular
networks and proposed a big data integrated coalition formation game approach to jointly optimize peer
discovery, route selection, and spectrum allocation from a delay minimization perspective. We conclude
that the proposed algorithm achieves the best content distribution efficiency and well explores the benefits
of adding more RBs. And it is more robust to the adverse impacts caused by multi-hop transmission.
References
[1] Z. Zhou, M. Dong, K. Ota, and C. Xu, “Energy-efficient matching for resource allocation in D2D enabled
cellular networks,” IEEE Trans. Veh.Technol., vol. 66, no. 6, pp. 5256–5268, June. 2017.
[2] X. Cheng, L. Yang, and X. Shen, “D2D for intelligent transportation systems: a feasibility study,” IEEE Trans.
Intell. Transp. Syst., vol. 16, no. 4, pp. 1784–1793, Aug. 2015.
[3] N. Cheng, H. Zhou, L. Lei, N. Zhang, Y. Zhou, X. Shen, and F. Bai,“Performance analysis of vehicular Device-
to-Device underlay communication,” IEEE Trans. Veh. Technol., vol. 66, no. 6, pp. 5409–5421, June. 2017.
[4] Y. Zhao and W. Song, “Truthful mechanisms for message dissemination via Device-to-Device
communications,” IEEE Trans. Veh. Technol., vol. pp, no. 99, pp. 1–1, July. 2017.
[5] C. Xu, C. Gao, Z. Zhou, Z. Chang, and Y. Jia, “Social network-based content delivery in Device-to-Device
underlay cellular networks using matching theory,” IEEE Access, vol. 5, pp. 924–937, Nov. 2016.
[6] N. Giatsoglou, K. Ntontin, E. Kartsakli, A. Antonopoulos, and C. Verikoukis, “D2D-aware device caching in
mmWave-cellular networks,” IEEE J. Sel. Areas Commun, vol. pp, no. 99, pp. 1–1, June. 2017.
[7] Z. Zhou, C. Gao, C. Xu, Y. Zhang, S. Mumtaz, and J. Rodriguez, “Social big data based content dissemination in
internet of vehicles,” IEEE Trans. Ind. Informat., vol. pp, no. 99, pp. 1–1, July. 2017.
[8] H. Li, B. Wang, Y. Song, and K. Ramamritham, “VeShare: a D2D infrastructure for real-time social-enabled
vehicle networks,” IEEE Wirel. Commun., vol. 23, no. 4, pp. 96–102, Aug. 2016.
[9] C. Barrios and Y. Motai, “Improving estimation of vehicles trajectory using the latest global positioning system
with Kalman filtering,” IEEE Trans. Instrum. Meas., vol. 60, no. 12, pp. 3747–3755, May. 2011.
[10] B. Barshan and H. F. Durrant-Whyte, “Inertial navigation systems for mobile robots,” IEEE Trans. Robot.
Autom., vol. 11, no. 3, pp. 328–342, June. 1995.
[11] R. Toledo, M. A. Zamora, B. Ubeda, and A. F. Gomez, “High integrity IMM-EKF based road vehicle
navigation with low cost GPS/INS,” IEEE Trans. Intell. Transp. Syst., vol. 8, no. 3, pp. 491–511, Sept. 2007.
[12] R. Mochaourab, E. Bjrnson, and M. Bengtsson, “Adaptive pilot clustering in heterogeneous massive MIMO
networks,” IEEE Trans. Wirel. Commun., vol. 15, no. 8, pp. 5555–5568, May. 2016.
Yahui Wang is currently pursuing the M.S. degree with North China Electric Power University,
China. Her research interests include resource allocation, interference management, and energy
management in D2D communications.
IEEE COMSOC MMTC Communications - Frontiers
http://www.comsoc.org/~mmc/ 55/56 Vol. 13, No. 1, January 2018
Zhenyu Zhou (M'11-SM'17) received his M.E. and Ph.D degree from Waseda University, Tokyo,
Japan in 2008 and 2011 respectively. From April 2012 to March 2013, he was the chief researcher
at Department of Technology, KDDI, Tokyo, Japan. From March 2013 to now, he is an Associate
Professor at School of Electrical and Electronic Engineering, North China Electric Power
University, China. He is also a visiting scholar with Tsinghua-Hitachi Joint Lab on Environment-
Harmonious ICT at University of Tsinghua, Beijing from 2014 to now. He served as an Associate
Editor for IEEE Access, and a Guest Editor for IEEE Communications Magazine and Transactions
on Emerging Telecommunications Technologies. He also served as workshop co-chair for IEEE ISADS 2015, and
TPC member for IEEE Globecom, IEEE CCNC, IEEE ICC, IEEE APCC, IEEE VTC, IEEE Africon, etc. He is a
voting member of P1932.1 Working Group. He was the recipient of the IEEE Vehicular Technology Society
"Young Researcher Encouragement Award" in 2009, the “Beijing Outstanding Young Talent Award in 2016, the
IET Premium Award in 2017, and the IEEE ComSoc Green Communications and Computing Technical Committee
2017 Best Paper Award. His research interests include green communications, vehicular communications, and smart
grid communications. He is a senior member of IEEE.
Houjian Yu is currently working towards the B.S. degree at North China Electric Power University,
China. His research interests include resource allocation, interference management, and energy
management in D2D communications.
Chen Xu (S’12-M’15) received the B.S. degree from Beijing University of Posts and
Telecommunications in 2010, and the Ph.D. degree from Peking University, Beijing, in 2015. She is
now a lecturer in School of Electrical and Electronic Engineering, North China Electric Power
University. Her research interests mainly include wireless resource allocation and management,
game theory, optimization theory, heterogeneous networks, and smart grid communication. She
served as a TPC member for IEEE Globecom, IEEE ICC, and IEEE ICCC. She received the best
paper award at the 2012 International Conference on Wireless Communications and Signal
Processing, and IEEE Leonard G. Abraham Prize in 2016.
IEEE COMSOC MMTC Communications - Frontiers
http://www.comsoc.org/~mmc/ 56/56 Vol. 13, No. 1, January 2018
MMTC OFFICERS (Term 2016 — 2018)
CHAIR STEERING COMMITTEE CHAIR
Shiwen Mao Zhu Li
Auburn University University of Missouri
USA USA
VICE CHAIRS
Sanjeev Mehrotra (North America) Fen Hou (Asia)
Microsoft University of Macau
USA China
Christian Timmerer (Europe) Honggang Wang (Letters&Member Communications)
Alpen-Adria-Universität Klagenfurt UMass Dartmouth
Austria USA
SECRETARY STANDARDS LIAISON
Wanqing Li Liang Zhou
University of Wollongong Nanjing Univ. of Posts & Telecommunications
Australia China
MMTC Communication-Frontier BOARD MEMBERS (Term 2016—2018)
Guosen Yue Director Huawei R&D USA USA
Danda Rawat Co-Director Howard University USA
Hantao Liu Co-Director Cardiff University UK
Dalei Wu Co-Director University of Tennessee USA
Zheng Chang Editor University of Jyväskylä Finland
Lei Chen Editor Georgia Southern University USA
Tasos Dagiuklas Editor London South Bank University UK
Melike Erol-Kantarci Editor Clarkson University USA
Kejie Lu Editor University of Puerto Rico at Mayagüez Puerto Rico
Nathalie Mitton Editor Inria Lille-Nord Europe France
Shaoen Wu Editor Ball State University USA
Kan Zheng Editor Beijing University of Posts & Telecommunications China