ieee internet of things journal, vol.xxx, no.xxx, month

14
IEEE INTERNET OF THINGS JOURNAL, VOL.XXX, NO.XXX, MONTH YEAR, DOI: 10.1109/JIOT.2021.3105100 1 Sidelink-aided Multi-quality Tiled 360 Virtual Reality Video Multicast Jianmei Dai, Member, IEEE, Guosen Yue, Senior Member, IEEE, Shiwen Mao, Fellow, IEEE, and Danpu Liu, Senior Member, IEEE Abstract—Mobile/wireless virtual reality (VR) services, espe- cially immersive 360 VR videos, have advanced unprecedentedly in recent years. However, the high bandwidth requirement of VR services has compounded the burden on wireless networks. Multicast is a high potential technique for alleviating the bandwidth requirement of 360 VR video streaming, but the multicast capacity is still constrained by the users with poor channel conditions, and it vanishes when the number of users increases while the number of base station (BS) antennas is fixed. To overcome the drawbacks of multicast, sidelink, which is an adaptation of the core LTE standard that allows the device-to-device (D2D) communications without going through a BS, can be utilized. In this paper, two sidelink-aided multicast scenarios (i.e., independent decoding and joint decoding) are studied for multi-quality tiled 360 VR video transmission. We propose a utility model for each scenario, and quality level selection, sidelink sender/receiver selection, and transmission resource allocation are optimized to maximize the total utility of all users under the bandwidth constraints as well as the quality smoothness constraints for multi-quality tiles. We then develop an iterative two-stage algorithm to obtain sub-optimal solutions to the formulated Mixed-Integer Nonlinear Programming (MINLP) problems. Simulation results demonstrate the advantage of the proposed solutions over several baseline schemes. Index Terms—Virtual reality (VR), Tiled 360 video, Multicast, Sidelink, Resource allocation. I. I NTRODUCTION T HE last few years have witnessed an unprecedented proliferation of interest in both academia and industry in mobile/wireless virtual reality (VR) services. It is forecasted that the VR ecosystem will become an $80 billion market and a trillion-dollar industry by the year 2025 and 2035, respectively [1], [2]. The immersive panoramic (i.e., 360 ) VR video is an important form of media created by the conver- gence of advanced VR technology and fast video processing This work is supported by the National Natural Science Foundation of China under Grant No.61971069, 61801051, and the Beijing Natural Science Foundation under Grant No.L202003. S. Mao’s work is supported in part by the NSF under Grant ECCS-1923717. Jianmei Dai is with Space Engineering University and Beijing Laboratory of Advanced Information Network, Beijing Key Laboratory of Network System Architecture and Convergence, Beijing University of Posts and Telecommu- nications, Beijing 100876, China (e-mail:jammy [email protected]). Danpu Liu is with Beijing Laboratory of Advanced Information Network, Beijing Key Laboratory of Network System Architecture and Convergence, Beijing University of Posts and Telecommunications, Beijing 100876, China (e-mail: [email protected]). Guosen Yue is with Futurewei Technologies, Bridgewater, NJ, USA (e-mail: [email protected]). Shiwen Mao is with the Department of Electrical and Computer En- gineering, Auburn University, Auburn, AL 36849-5201, USA (e-mail: [email protected]). DOI: 10.1109/JIOT.2021.3105100 hardware, which can provide a 360 view angle to enable an omni-directional viewport of the scene [3], [4]. However, delivering an entire 360 VR video incurs four to five times higher bandwidth requirements than that of a traditional video, since the general 360 VR video data rate at the 4K resolution is 50Mbps and the data rate with 8K resolution increases to 200Mbps [5]. To alleviate the burden of wireless networks, an effective method is to transmit user’s field-of-view (FoV) data instead of the entire 360 VR video. The FoV is defined as the portion of the 360 video that is in the user’s line of sight, and an FoV can be spatially divided into smaller parts referred to as “tiles,” which can be encoded into multiple versions, each at a different quality level [6]–[9]. Bandwidth consumption can be reduced by only sending the tiles in the FoV in high resolution, while sending other tiles in lower resolutions or not at all [10]. Multicast is another effective technique for decreasing the bandwidth requirement of 360 VR video streaming. In such an application, users wear an individual head-mounted device (HMD) to watch a 360 video, and the number of concurrent transmissions for the same content will be much larger than that of conventional videos, even in a small geographical area [11]. By sharing spectrum with many users requesting the same video segments or tiles, multicast is highly promising to mitigate the high demand on network resources [12]–[17]. However, the multicast capacity is limited by the users in poor channel conditions and vanishes when the number of users increases due to the fixed number of base station (BS) antennas [18], [19]. To overcome this issue, Device to Device (D2D) communications can be used as an auxiliary means to improve the multicast performance. The D2D communication technology in Long Term Evolution (LTE) was introduced by the 3rd Generation Partnership Program (3GPP) to allow devices in proximity to communicate directly with each other. Therefore, D2D communications can effectively alleviate the burden of the BS and reduce the transmission delay. In order to support LTE D2D, 3GPP defined the PC5 interface, a new direct link between user equipments (UEs) termed “sidelink” in the access stratum layer. UEs can utilize the sidelink to exchange information when they are in close proximity. As a new paradigm underlying cellular networks, sidelink commu- nications have been standardized to enhance the performance of cellular networks and support vehicle-to-everything (V2X) communications, which enable a vehicle to collaborate with other vehicles, devices, and infrastructure. Several enhanced multicast schemes for traditional services aided by sidelink have been proposed in the literature [20]–

Upload: others

Post on 23-May-2022

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: IEEE INTERNET OF THINGS JOURNAL, VOL.XXX, NO.XXX, MONTH

IEEE INTERNET OF THINGS JOURNAL, VOL.XXX, NO.XXX, MONTH YEAR, DOI: 10.1109/JIOT.2021.3105100 1

Sidelink-aided Multi-quality Tiled 360◦ VirtualReality Video Multicast

Jianmei Dai, Member, IEEE, Guosen Yue, Senior Member, IEEE, Shiwen Mao, Fellow, IEEE, andDanpu Liu, Senior Member, IEEE

Abstract—Mobile/wireless virtual reality (VR) services, espe-cially immersive 360◦ VR videos, have advanced unprecedentedlyin recent years. However, the high bandwidth requirement ofVR services has compounded the burden on wireless networks.Multicast is a high potential technique for alleviating thebandwidth requirement of 360◦ VR video streaming, but themulticast capacity is still constrained by the users with poorchannel conditions, and it vanishes when the number of usersincreases while the number of base station (BS) antennas isfixed. To overcome the drawbacks of multicast, sidelink, whichis an adaptation of the core LTE standard that allows thedevice-to-device (D2D) communications without going througha BS, can be utilized. In this paper, two sidelink-aided multicastscenarios (i.e., independent decoding and joint decoding) arestudied for multi-quality tiled 360◦ VR video transmission. Wepropose a utility model for each scenario, and quality levelselection, sidelink sender/receiver selection, and transmissionresource allocation are optimized to maximize the total utility ofall users under the bandwidth constraints as well as the qualitysmoothness constraints for multi-quality tiles. We then develop aniterative two-stage algorithm to obtain sub-optimal solutions tothe formulated Mixed-Integer Nonlinear Programming (MINLP)problems. Simulation results demonstrate the advantage of theproposed solutions over several baseline schemes.

Index Terms—Virtual reality (VR), Tiled 360◦ video, Multicast,Sidelink, Resource allocation.

I. INTRODUCTION

THE last few years have witnessed an unprecedentedproliferation of interest in both academia and industry in

mobile/wireless virtual reality (VR) services. It is forecastedthat the VR ecosystem will become an $80 billion marketand a trillion-dollar industry by the year 2025 and 2035,respectively [1], [2]. The immersive panoramic (i.e., 360◦) VRvideo is an important form of media created by the conver-gence of advanced VR technology and fast video processing

This work is supported by the National Natural Science Foundation ofChina under Grant No.61971069, 61801051, and the Beijing Natural ScienceFoundation under Grant No.L202003. S. Mao’s work is supported in part bythe NSF under Grant ECCS-1923717.

Jianmei Dai is with Space Engineering University and Beijing Laboratory ofAdvanced Information Network, Beijing Key Laboratory of Network SystemArchitecture and Convergence, Beijing University of Posts and Telecommu-nications, Beijing 100876, China (e-mail:jammy [email protected]).

Danpu Liu is with Beijing Laboratory of Advanced Information Network,Beijing Key Laboratory of Network System Architecture and Convergence,Beijing University of Posts and Telecommunications, Beijing 100876, China(e-mail: [email protected]).

Guosen Yue is with Futurewei Technologies, Bridgewater, NJ, USA (e-mail:[email protected]).

Shiwen Mao is with the Department of Electrical and Computer En-gineering, Auburn University, Auburn, AL 36849-5201, USA (e-mail:[email protected]).

DOI: 10.1109/JIOT.2021.3105100

hardware, which can provide a 360◦ view angle to enablean omni-directional viewport of the scene [3], [4]. However,delivering an entire 360◦ VR video incurs four to five timeshigher bandwidth requirements than that of a traditional video,since the general 360◦ VR video data rate at the 4K resolutionis 50Mbps and the data rate with 8K resolution increases to200Mbps [5].

To alleviate the burden of wireless networks, an effectivemethod is to transmit user’s field-of-view (FoV) data insteadof the entire 360◦ VR video. The FoV is defined as the portionof the 360◦ video that is in the user’s line of sight, and anFoV can be spatially divided into smaller parts referred to as“tiles,” which can be encoded into multiple versions, each at adifferent quality level [6]–[9]. Bandwidth consumption can bereduced by only sending the tiles in the FoV in high resolution,while sending other tiles in lower resolutions or not at all [10].

Multicast is another effective technique for decreasing thebandwidth requirement of 360◦ VR video streaming. In suchan application, users wear an individual head-mounted device(HMD) to watch a 360◦ video, and the number of concurrenttransmissions for the same content will be much larger thanthat of conventional videos, even in a small geographicalarea [11]. By sharing spectrum with many users requesting thesame video segments or tiles, multicast is highly promising tomitigate the high demand on network resources [12]–[17].

However, the multicast capacity is limited by the users inpoor channel conditions and vanishes when the number ofusers increases due to the fixed number of base station (BS)antennas [18], [19]. To overcome this issue, Device to Device(D2D) communications can be used as an auxiliary means toimprove the multicast performance. The D2D communicationtechnology in Long Term Evolution (LTE) was introducedby the 3rd Generation Partnership Program (3GPP) to allowdevices in proximity to communicate directly with each other.Therefore, D2D communications can effectively alleviate theburden of the BS and reduce the transmission delay. In orderto support LTE D2D, 3GPP defined the PC5 interface, a newdirect link between user equipments (UEs) termed “sidelink”in the access stratum layer. UEs can utilize the sidelink toexchange information when they are in close proximity. As anew paradigm underlying cellular networks, sidelink commu-nications have been standardized to enhance the performanceof cellular networks and support vehicle-to-everything (V2X)communications, which enable a vehicle to collaborate withother vehicles, devices, and infrastructure.

Several enhanced multicast schemes for traditional servicesaided by sidelink have been proposed in the literature [20]–

Page 2: IEEE INTERNET OF THINGS JOURNAL, VOL.XXX, NO.XXX, MONTH

IEEE INTERNET OF THINGS JOURNAL, VOL.XXX, NO.XXX, MONTH YEAR, DOI: 10.1109/JIOT.2021.3105100 2

[26]. The idea is to let the BS focus its transmission towardsa suitably selected group of users with good channel quality,while leaving a few others in outage [27], [28]. The sidelinksare enabled to allow the users with unfavorable channelconditions to be served by their peers. The overall transmissioncapacity can be thus improved. However, these schemes aredesigned for traditional services, and may not be suitable for360◦ VR videos since many unique features and challenges ofVR video streaming are not considered. For example, unlikecommon videos, different 360◦ VR frame tiles are of differentlevels of importance to the viewer experience, depending ontheir spatial positions in the user’s FoV. Apparently, the tileswhich are in the center of the FoV are more important andshould be delivered with a higher resolution, while the othersmay have a reduced resolution to save bandwidth and power.Furthermore, the tile smoothness should be considered inthe design of VR transmission systems, in other words, thedifference of quality between adjacent VR video tiles shouldbe bounded to ensure a good viewer experience. Generally,the smaller the boundary, the higher the user’s Quality ofExperience (QoE).

In this work, we study the problem of sidelink-aided mul-ticast optimization for 360◦ VR videos. Inspired by severalprior works [12], [13], [16], we aim to design a sidelink-aidedmulticast solution for multi-quality tiled 360◦ VR videos.Unlike these prior works, our solution is based on the jointconsideration of multicast downlinks and sidelinks. In thissystem, not all the users are served by the BS multicastlink; some users with poor channel conditions are selectedto be served by sidelinks. Under such a circumstance, tworesulting different decoding scenarios, i.e., an independentdecoding scenario and a joint decoding scenario, are modeledand analyzed. For each scenario, we optimize the tile qualitylevel selection, transmission resource allocation, and sidelinksender/receiver selection to maximize the total utility of allusers under the transmission resource constraints as well as thequality smoothness constraints for mixed-quality tiles. Bothproblems are challenging Mixed-Integer Nonlinear Program-ming (MINLP) problems. We propose a novel iterative two-stage algorithm to obtain sub-optimal, yet highly competitivesolutions, using greedy search and continuous relaxation. Tothe best of our knowledge, such effective design has not yetbeen analytically verified in the existing literature.

In summary, the novelty and technical contributions of thiswork are provided as follows.• We propose a sidelink-aided multicast system for multi-

quality tiled 360◦ VR video streaming from one BS tomultiple users, where both BS-user multicast link andsidelink are jointly utilized.

• We describe two sidelink-aided multicast scenarios, theindependent decoding scenario and the joint decodingscenario, and formulate a utility maximization problemfor each scenario under the transmission resource and tilequality smoothness constraints via optimized quality levelselection, transmission resource allocation, and sidelinksender/receiver selection.

• To tackle the NP-hardness of the formulated Mixed-Integer Nonlinear Programming (MINLP) problems, we

propose an iterative two-stage algorithm to obtain asuboptimal solution, using greedy search and continuousrelaxation, which exhibits super fast convergence perfor-mance in our evaluations.

• Our simulation results demonstrate that the proposed sub-optimal solutions can provide significantly better utilityperformance than several baseline schemes, which effec-tively enhance the user QoE compared to conventionalmulticasting strategies and the baseline schemes.

The remainder of this paper is organized as follows. Sec-tion II discusses the related work. Our system model andproblem statement are presented in Section III and Section IV,respectively. In Section V and Section VI, the proposed two-stage algorithms for the two scenarios and their simulationperformance validation are presented, respectively. Finally,Section VII concludes this paper.

II. RELATED WORK

Multicast is an up-and-coming solution to deal with thehigh network requirements of VR. Two optimal multicasttransmission schemes for tiled 360◦ VR video were stud-ied in [12] and [13], respectively. One was to maximizethe received video quality in orthogonal frequency divisionmultiple access (OFDMA) systems by optimizing subcarrier,transmission power, and transmission rate allocation [12].The other was to minimize the average transmission energyby optimizing transmission time and power allocation [13].In [14], the authors proposed a multicast DASH-based tiledstreaming solution, including a user’s viewports based tileweighting approach and a rate adaptation algorithm, to pro-vide an immersive experience for VR users. View synthesismulticast was analyzed in [15]. Further, the authors in [16]optimized the VR video quality level selection, transmissiontime allocation, and transmission power allocation to max-imize the utility under transmission time and power allo-cation constraints as well as quality smoothness constraintsfor mixed-quality tiles. A Multi-Session Multicasting (MSM)system for multi-quality tiled-video was proposed in [17],where user grouping, wireless-resource allocation, and tiled-video rate selection were jointly optimized.

Introducing D2D communications to the traditional mul-ticast schemes can not only relieve the burden of the BSbut greatly enhance resource utilization. Aiming to maxi-mize the system throughput, a two-stage multicast schemewas investigated in [20], and the system throughput couldbe improved with the help of mobile relays. The idea ofemploying network controlled reliable multicasting for D2Dcommunications was explored in [21], which was a scalableand efficient solution for file transfer and video streaming. TheD2D assisted cooperative multicast was investigated in [22]and the optimal time allocation between the multicast andretransmission stages was derived. A mixed infrastructure-to-device (I2D) multicast and D2D-relaying offloading schemebased on reinforcement learning was proposed in [23], aimingto reduce the redundant traffic while guaranteeing timelydelivery. Multi-hop D2D communication was studied in [24],with the objective of improving the multimedia multicast

Page 3: IEEE INTERNET OF THINGS JOURNAL, VOL.XXX, NO.XXX, MONTH

IEEE INTERNET OF THINGS JOURNAL, VOL.XXX, NO.XXX, MONTH YEAR, DOI: 10.1109/JIOT.2021.3105100 3

Level L

...Level l

...

Core NetworkCore Network

360° VR

Video Server

360° VR

Video ServerBackhaulBackhaul

Multicast Receiver

BS

Fair Downlink Channel

Poor Downlink Channel

Good Downlink Channel

Good Sidelink Channel

Fair Sidelink Channel

Poor Sidelink Channel

Level 1(1,1) (1,2) ... (1,N)

... ... (m,n) ...

(M,1) (M,2) ... (M,N)

Level 1Level(1,1)

11

(1,2) ... (1,N)( , )

...

( , )

... (m,n)

( , )

))) ...

(M,1))))))))) (M,2)

, )((

)))))) ...

))

(M,N)

Different Groups

Column

Row

Quality

Column

Row

Quality

Multicast ReceiverSidelink sender

Sidelink receiverSidelink receiver

Fig. 1. Illustration of the sidelink-aided multicast system for multi-qualitytiled 360◦ VR video. Each 360◦ VR video segment is partitioned into a gridof M × N non-overlapping rectangular tiles of the same size, and each tilehas multiple quality representations. A 360◦ VR video server providing videostreams connects to a base station (BS) through high-speed wired links. TheBS equipped with one transmit antenna aims to announce streaming servicesand deliver VR video tiles of different qualities to multiple users within itsrange, wearing single-antenna, head-mounted displays (HMDs). Different tilesat various quality levels can be transmitted to users by the BS via the multicastdownlink or by a selected user (sidelink sender) via sidelink.

sessions and transmissions in terms of throughput and meandownload time per client. A simple computationally efficientscheme was proposed for the D2D aided multicast scenario,in which only statistical channel knowledge at the transmitterwas required [25]. An enhanced scheme for selecting a subsetof UEs who cooperate to spread the common message acrossthe rest of the network via D2D retransmissions was proposedin the new light of precoding capabilities at the BS [26].

However, these existing schemes do not consider the mixed-quality and the smoothness of VR tiles, and thus may notachieve a good performance for VR streaming.

III. SYSTEM MODEL

A. Network Model and Problem Statement

A sidelink-assisted live streaming of popular 360◦ videosover 4G/5G cellular networks that support multicast services,such as Evolved Multimedia Broadcast and Multicast Services(eMBMS) in LTE networks [29], is considered in this paper.The system architecture is illustrated in Fig. 1 and the notationused in this paper is summarized in Table I.

In our system model, the 360◦ VR video segment is dividedinto a grid of M×N smaller non-overlapping rectangular tilesof the same size, where M and N represent the numbers oftiles in each column and each row, respectively. The (m,n)-th (1 ≤ m ≤ M and 1 ≤ n ≤ N ) tile refers to thetile in the m-th row and n-th column. Similar to regularmobile multimedia streaming and the model in [30], a 360◦

VR video server providing video streams is connected to aBS through high-speed wired links. The BS, equipped withone transmit antenna, aims at announcing streaming servicesand delivering VR video tiles of different qualities to aset U ∆

= {1, 2, ..., u, ..., |U|} of users wearing head-mounteddisplays (HMDs), that are within its coverage.

The users who request the same tiles forms one multi-cast group g, and there are at most |G| groups for all the

TABLE INOTATION

Symbol Description

b The maximum encoding rate of all the tilequality levels

G, |G|, g User group set, the maximum group numbers,g-th group

L, L, l Quality level set, the best quality level, andthe l-th quality level

M,N , (m,n) Tile numbers in each row and column, the(m,n)-th tile, 1 ≤ m ≤M, 1 ≤ n ≤ N

U , |U|, u Total user set, total user numbers, user indexUg , |Ug| User set of group g,user numbers of UgJg, |Jg|, jg Sidelink receiver set in Ug , sidelink receiver

numbers of Jg , sidelink receiver indexKg, |Kg|, kg Multicast user set in Ug , multicast user

numbers of Kg , user index in Kg

BDL, BDLg Overall bandwidth of the downlink of BS, the

downlink bandwidth allocated for group gBSL, BSL

g Overall bandwidth of the sidelink,the allocated sidelink bandwidth for group g

CDLg , CSL

g Multicast and sidelink capacity of group gPBS, PU Power of BS and userWm,n Overall weight of the (m,n)-th tileW global

m,n ,WFoVm,n Long term and FoV weight of the (m,n)-th

tilexm,n, ym,n Tile’s quality level selected by BS and

transcoded by sidelink sender∆ Quality difference between two adjacent tilesΦ,Φg Tiles set transmitted to all the user groups,

and the tiles set transmitted to user group g

users, where |G| ≤ min (|U|,M ×N). We denote G ={1, 2, ..., g, ..., |G|}, Ug , and |Ug| as the set of user groups,the user set in group g, and the number of users in group g,respectively. The special multicast groups where |Ug| = 1 (i.e.,the tiles are requested only by one user) are beyond the scopeof this paper, since the sidelink cannot be utilized in this case.

Tiles in each group are independently encoded, meaningthat the user can either decode them or fail to decode themvia the multicast downlink channel. We denote subset Kg

(with |Kg| users) as the users who can receive and decodeVR video tiles successfully via the multicast downlink, whilethe other subset Jg with |Jg| users can only decode the samevideo tiles correctly with the help of a sidelink sender. Thesidelink sender is a selected user in Kg who has successfullydecoded the video tiles and can help users in Jg . We haveKg ∪Jg = Ug , Kg ∩Jg = ∅, |Jg| = |Ug|− |Kg|, and Ug ⊆ U .In this context, based on the average channel qualities, e.g.,the wideband channel quality indicator (CQI) or the averagesignal-to-noise ratio (SNR), or the average rate of the sidelinks,which are reported by the users to the BS, the BS can allocateits radio resources to a selected subset of users of each groupwith optimized Kg , and one of which in turn can cooperate tospread the video data across the rest of the network.

The following two-phase scheme [25], [26], [31], [32] hasbeen adopted for data transmissions, taking two time slots.

1) Phase 1: the BS conveys the video tiles using orthogonalmulticast transmissions to users;

2) Phase 2: A sidelink sender is selected for each group

Page 4: IEEE INTERNET OF THINGS JOURNAL, VOL.XXX, NO.XXX, MONTH

IEEE INTERNET OF THINGS JOURNAL, VOL.XXX, NO.XXX, MONTH YEAR, DOI: 10.1109/JIOT.2021.3105100 4

for sidelink assisted transmissions. Since the frequenciesof multicast downlink and sidelink are different, thesidelink sender can receive new frames/packets fromthe BS while transmitting decoded frames/packets onthe sidelink in parallel. Assuming channel status doesnot change drastically in a short period of time (i.e.,coherence time), then mathematically, we can assumethat the two transmissions are simultaneous withoutviolating the causal status.

Specifically, we consider that the sidelink receivers candecode the tiles independently or jointly in Phase 2, whichare described in the following.

1) The Independent Decoding Scenario (hereinafter re-ferred to as Scenario InD): Scenario InD is illustrated inFig. 2. The sidelink sender in Kg decodes the tile, transcodesit (if needed), and broadcasts the tile to all the other usersin Jg , who receive the tile with the same quality or with adegraded quality. We use xm,n and ym,n to denote the originaltile quality level selected by the BS and the transcoded tilequality level transmitted by the sidelink sender, respectively.We then have

xm,n ≥ ym,n, xm,n ∈ L, ym,n ∈ L, (m,n) ∈ Φ, (1)

where L ∆= {1, 2, ..., l, ..., L} is the set of quality represen-

tation for each tile, considering the heterogeneous channelconditions of different users and the limited transmissionresource.

Video transcoding is the process of converting one encodingformat into another. This is necessary when the networkconditions between a sidelink sender and sidelink receiversare worse than the network conditions between the BS andmulticast receivers. Considering the latency-sensitive natureof VR services, conventional video transcoding methods, e.g.,FFmpeg based transcoding [33], are not feasible for thissystem since they do not operate in realtime. In [34], adistributed transcoder system for tiled streaming of ultra-high resolution 360◦ VR video was proposed, which has theadvantages of realtime transcoding of 16K videos. Therefore,with abundant computing capability, a realtime transcoder canbe deployed at the sidelink sender. As shown in Fig. 2(b),the video transcoding process is divided into two phases: (i)decoding the encoded data of quality level xm,n into theoriginal tiles, (ii) re-encoding these tiles at a lower qualitylevel ym,n.

2) Joint Decoding Scenario (hereinafter referred to asScenario JnD): In this scenario, we assume the theoreticaloptimal decoding at the physical layer based on the receivedsignals in two phases. Hence we apply the information theo-retic results, i.e., the achievable rate for the same informationthrough parallel Gaussian channels at different SNR levels(due to different channel gains).

The data process and time schedule of Scenario JnD areillustrated in Fig. 3(a) and Fig. 3(b). As shown in Fig. 3(a),the sidelink sender k∗g , which has decoded the tile, broadcaststhe same information with additional coded bits (i.e., with in-cremental redundancy) via the sidelink. As shown in Fig. 3(b),the users in Jg receive and perform joint decoding on both

received signals (i.e., from both the downlink and sidelink) todecode the tile with the same quality. We denote by zm,n thetile quality level both conveyed by the BS and the sidelinksender in this joint decoding scenario. Then we have

zm,n ∈ L, (m,n) ∈ Φ. (2)

The main objective of this paper is to determine whichqualities of the required tiles among the groups should beselected and transmitted by the BS and the sidelink senders,aiming to maximize the overall video quality of all the users.

B. Transmission ModelDifferent from [35], in the considered system, the multicast

downlink and broadcast sidelink are on different frequencychannels; thus the interference between the downlink andsidelink is negligible. We denote by BDL and BDL

g the overallmulticast downlink bandwidth available at the BS and thebandwidth allocated for user group g, respectively. We allocateBDL to all the groups, given by∑

g∈|G|

BDLg = BDL, BDL

g > 0. (3)

Let BSL and BSLg be the overall bandwidth for the sidelinks

and the bandwidth allocated for user group g, respectively.Similar to (3), we also have∑

g∈|G|

BSLg = BSL, BSL

g > 0. (4)

We consider shadow fading PLS and Rayleigh fading PLR

with free space path loss in this model, which follow thelog-normal distribution and complex Gaussian distribution,respectively. For user u ∈ U , the free space path loss iscalculated as

PL(x, u) = 32.4 + 20 lg(fx)(GHz) + 20 lg(dx,u)(m), (5)

where x can be either the BS or a user u′ (u′ 6= u, u′ ∈ U),and fx and dx,u denote the center frequency of x and distancebetween x and u, respectively. The received power of user ucan be expressed as

P (x, u) = Px +Gx +GU − PL(x, u)− PLS − PLR, (6)

where Gx and Px denote the transmit antenna gain and thetransmit power of the BS, PBS (or the transmit power of userdevice, PU), respectively. For simplicity, we assume that all theuser devices have the same antenna gain and transmit power.

1) Downlink and Multicast Capacity: We denoteSINRDL(BDL

g , kg) the average signal to interferenceplus noise ratio (SINR) for user kg on the downlink, whichis given by

SINRDL(BDLg , kg) =

P (BS, kg)

N0BDLg +

∑u′∈U/kg

P (u′, kg), (7)

where N0 is the one-sided power spectral density of whiteGaussian noise.

Subsequently, the downlink capacity between user kg andthe BS is given by

CDL(BDLg , kg) = BDL

g log2

(1 + SINRDL(BDL

g , kg)). (8)

Page 5: IEEE INTERNET OF THINGS JOURNAL, VOL.XXX, NO.XXX, MONTH

IEEE INTERNET OF THINGS JOURNAL, VOL.XXX, NO.XXX, MONTH YEAR, DOI: 10.1109/JIOT.2021.3105100 5

BSBS

SL Sender

Multicast

(m,n)

Tile Quality i

i ≤ l

(m,n)

Tile Quality l

Transcode

RetransmitSL Broadcast

Decode

(m,n)

Tile Quality l

Decode

(m,n)

Tile Quality i

i ≤ l

SL Receivers

(a) The data process

Time Slot 1 Time Slot 2

BSBS

SL Sender

SL Receivers

Multicast Video

Data 1 by

Downlink

Receive Video

Data 1

Decode (and

Transcode if

needed)

Video Data 1

Receive and Decode

the (Transcoded)

Video Data 1

Multicast Video

Data 2 by

Downlink

Receive Video

Data 2

Retransmit

(Transcoded)

Video Data 1 by

Sidelink

(b) Time schedule for the BS and users

Fig. 2. Illustration of Scenario InD: The sidelink sender that decodes the tile transcodes it (if needed) and broadcasts it at the same quality or a degradedquality to the sidelink receivers.

BSBS

Multicast

(m,n)

Tile Quality l

Incremental

redundancy

RetransmitSL

Broadcast

(m,n)Tile Quality l(m,n)

Tile Quality l

RJoint

Decode

Multicast

(m,n)

Tile Quality l

Decode

(m,n)

Tile Quality l

SL Sender SL Receivers

(a) Data Process

Time Slot 1 Time Slot 2

BSBS

SL Receivers

Multicast Video

Data 1 by

Downlink

Receive Video

Data 1

Decode

Video Data 1

Receive the

Video Data 1

with redundancy

Bits by Sidelink

Multicast Video

Data 2 by

Downlink

Receive Video

Data 2

Retransmit

Video Data 1

with redundancy

Bits by Sidelink

Receive and

Cache Video

Data 1 by

Downlink

add

redundancy

bits

Receive and

Cache Video

Data 2

Joint Decode the

Video Data 1

from Downlink

and Sidelink

SL SenderSL Sender

(b) Time Schedule for BS and Users

Fig. 3. Illustration of Scenario JnD. The sidelink sender broadcasts the same quality tile with additional coded bits to the sidelink receivers who performjoint decoding of the signals from both the downlink and sidelink.

The multicast downlink capacity CDLg of group g can be

described as follows, since it is limited by the users with poorchannel conditions.

CDLg = min

kg∈Kg

CDL(BDLg , kg). (9)

2) Sidelink Capacity: In a similar way, the capacity be-tween user kg and user jg on the sidelink is denoted byCSL(BSL

g , kg, jg), which can be derived as

CSL(BSL

g , kg, jg)

= BSLg log2

(1 + SINRSL

(BSL

g , kg, jg)),

(10)where SINRSL is the average SINR for user jg on the sidelink,and can be expressed as

SINRSL(BSL

g , kg, jg)

=P (kg, jg)

N0BSLg +

∑u′∈U/kg,jg

P (u′, jg).

(11)Correspondingly, the sidelink capacity CSL

g in groupg is limited by the best channel condition amongminjg∈Jg C

SL(BSL

g , kg, jg), which can be written as

CSLg = max

kg∈Kg

minjg∈Jg

CSL(BSLg , kg, jg). (12)

C. 360◦ VR Video ModelGenerally, a user watching a 360◦ VR video is interested in

only one FoV at a time, and different users may be watching

different FoVs at the same time. Following [36], we assumethat the FoV always contains complete tiles exactly; if part ofa tile is in the FoV, the entire tile will be included in the FoV.Therefore, there are always an integer number of tiles in theFoV. We denote by Φg the set of indices of tiles transmitted touser group g, and Φ ,

⋃g∈G Φg indicates that the user group

is formed following the tile groups.

To maintain a high user QoE, we assume that any two adja-cent tiles have similar quality levels (i.e., no abrupt changes).Therefore, the tile spatial smoothness is introduced in thismodel. In addition, considering the different popularity ofvideo tiles, weights are introduced to represent tile popularity.

1) Tile Spatial Smoothness: Following [16], the quality dif-ference between two adjacent tiles is bounded by a parameter∆ ∈ L∪{0}. In addition, considering that the first column ofa tile is adjacent to the last column of the previous tile, andthe first row of a tile is adjacent to the last row of the tileabove, we have∣∣xm,n − xm,(n+1) mod N

∣∣ ≤ ∆,

for (m,n) ∈ Φ, (m, (n+ 1) mod N) ∈ Φ (13)∣∣xm,n − x(m+1) mod M,n

∣∣ ≤ ∆

for (m,n) ∈ Φ, ((m+ 1) mod M,n) ∈ Φ (14)

Page 6: IEEE INTERNET OF THINGS JOURNAL, VOL.XXX, NO.XXX, MONTH

IEEE INTERNET OF THINGS JOURNAL, VOL.XXX, NO.XXX, MONTH YEAR, DOI: 10.1109/JIOT.2021.3105100 6

∣∣ym,n − ym,(n+1) mod N

∣∣ ≤ ∆

for (m,n) ∈ Φ, (m, (n+ 1) mod N) ∈ Φ (15)∣∣ym,n − y(m+1) mod M,n

∣∣ ≤ ∆

for (m,n) ∈ Φ, ((m+ 1) mod M,n) ∈ Φ (16)∣∣zm,n − zm,(n+1) mod N

∣∣ ≤ ∆

for (m,n) ∈ Φ, (m, (n+ 1) mod N) ∈ Φ (17)∣∣zm,n − z(m+1) mod M,n

∣∣ ≤ ∆

for (m,n) ∈ Φ, ((m+ 1) mod M,n) ∈ Φ. (18)

2) Tile Weight: Tile weight represents the degree of aspatial region in video segments that attracts attention of users(i.e., popularity). Consider that tiles with higher weight scoresrepresent regions with more attractive textures or objects,higher quality level should be used by the BS and the sidelinksender for such tiles.

In our model, the weight of a tile consists of two parts.From the long term perspective, the more the requesting users,the higher the weight. We denote by W global

m,n the long termweight of the (m,n)-th tile, which follows the N (0, 1) normaldistribution [36]. Usually the tiles at the same (m,n) positionin different segments have different weights. On the otherhand, for a tile in a user’s current FoV, its weight should takeinto account its position in the FoV [37], [38]: the tile shouldhave a higher weight when it is closer to the center of the FoV.We denote by WFoV

m,n such short term weight of the (m,n)-thtile. Noted that users may have their own interest in viewingvideos, and the same tile can be at either the center or marginin different users’ FoVs. Thus WFoV

m,n will be an arithmeticaverage weight of all the concurrent FoVs.

The overall weight of tile Wm,n is thus defined as theproduct of W global

m,n and WFoVm,n , given by

Wm,n = W globalm,n ·WFoV

m,n . (19)

D. Utility Model

The most popular metric to quantify the video QoE is thelogarithmic law definition [39]–[41], which is specified as

Q = α log

(βrir

), (20)

where ri and r denote the actual video rate and requested(maximum) video rate, respectively. The constant parametersα and β are both positive coefficients to normalize the utilityQ to remain between 0 and 1, and they can be empiricallydetermined for different applications and videos.

Similarly, we introduce the VR video utility model basedon a function of video tile qualities, since the tile quality levelis in direct proportion to video rate. Inspired by the modelsin [39]–[41], we denote Υind and Υjnd as the VR video tile

utility of Scenario InD and Scenario JnD, respectively, as

Υind(m,n) = Wm,nα log

(β|Kg|xm,n+(|Ug|− |Kg|)ym,n

|Ug|L

),

(21)

Υjnd(m,n) = Wm,nα log(βzm,n

L

). (22)

IV. PROBLEM FORMULATION

In this paper, we aim to achieve the maximum sum qualityfor all the tiles requested by all user groups by optimizingthe quality levels xm,n, ym,n, and zm,n, the multicast receiverusers |Kg| (i.e., the user with the worst wireless condition,kg,min), the sidelink sender k∗g , and the transmission resourceallocation BDL

g and BSLg . We formulate the optimization

problems in this section.For Scenario InD, the mathematical formulation of the

problem is given by

max{xm,n},{ym,n},{|Kg|},

{k∗g},{BDLg },{B

SLg }

Pind =1

|G|∑g∈|G|

∑(m,n)∈Φg

Υind(m,n)

(23a)

s.t. b∑

(m,n)∈Φg

xm,n ≤ CDLg , ∀g ∈ G (23b)

b∑

(m,n)∈Φg

ym,n ≤ CSLg , ∀g ∈ G (23c)

(1), (3), (4), (13), (14), (15), (16). (23d)

The constraints can be explained as follows. Con-straints (23b) and (23c) indicate that the overall data size ofrequested tiles for one user group should not exceed the ca-pacity of the multicast downlink and the sidelink, respectively;constraint (1) ensures that the quality level of tiles sent byBS is no less than that transmitted by the sidelink sender;constraints (3) and (4) enforce that both the BS and sidelinksender allocate all its downlink and sidelink bandwidth forall the groups, respectively; and constraints (13), (14), (15)and (16) specify the smoothness of nearby tiles.

For Scenario JnD, the problem formulation is given by

max{zm,n},{|Kg|},{k∗g},{B

DLg },{B

SLg }

Pjnd =1

|G|∑g∈|G|

∑(m,n)∈Φg

Υjnd(m,n)

(24a)

s.t. b∑

(m,n)∈Φg

zm,n ≤ Cjndg , ∀g ∈ |G| (24b)

(2), (3), (4), (17), (18), (24c)

where Cjndg in Problem (24a) is given by

Cjndg = min

{CDL

g , CDL(BDLg , jg) + CSL

g

}, (25)

which means that the overall data size of requested tiles for oneuser group should not exceed the minimum value of group’smulticast capacity and the sum of sidelink capacity and thedownlink capacity of the user with the worst wireless channel.

Both Problems (23a) and (24a) are Mixed-Integer NonlinearProgramming (MINLP) problems, which are hard to solve inpolynomial time. A general method for a suboptimal solution

Page 7: IEEE INTERNET OF THINGS JOURNAL, VOL.XXX, NO.XXX, MONTH

IEEE INTERNET OF THINGS JOURNAL, VOL.XXX, NO.XXX, MONTH YEAR, DOI: 10.1109/JIOT.2021.3105100 7

is to make a continuous relaxation of the discrete constraints.Specifically, constraints (1) and (2) can be relaxed as

xm,n ≥ ym,n, xm,n ∈ [1, L], ym,n ∈ [1, L], (m,n) ∈ Φ (26)zm,n ∈ [1, L], (m,n) ∈ Φ. (27)

Furthermore, k∗g and |Kg| are also discrete variables, whichare related to bandwidth allocation, and the selected tile qualitylevel xm,n, ym,n, and zm,n. It is hard to make a continuousrelaxation by common methods. Under this circumstance, anintuitive approach to solving the problems is exhaustive search,i.e., finding the maximum value corresponding to all thecombinations of k∗g and Kg , where g ∈ G. When k∗g and Kg arefixed, the original problems are reduced to a standard convexproblem and can be solved by many optimization tools.

Proposition 1. The computational complexity of the exhaus-tive search approach for Problem (23a) is O(|U||G|(Q2R2.5 +R3.5)), where Q = 2(MN + |G|) and R = 7MN + 2|G|.

Proof: When k∗g and |Kg| are fixed, Problem (23a) can besolved by popular optimization tools. We choose the SeDuMisolver for its low asymptotic computational complexity, whichis O(Q2R2.5 + R3.5), where Q and R are the numbersof decision variables and linear matrix inequalities (LMIs),respectively [42].

Then we can see that there are MN {xm,n}, MN {ym,n},|G| {BDL

g }, and |G| {BSLg } decision variables in Prob-

lem (23a). Thus we have Q = 2(MN + |G|). In addition,it is noted that (23b), (23c), (1), (3), (4), (13), (14), (15), and(16) are LMIs. Then we have R = 7MN + 2|G|.

For all the groups, Πg∈G(|Ug|−1) optimizations are neededto obtain the optimal solution, since there are |Ug|−1 possibleselections for k∗g and Kg in one group Ug . Considering that|Ug| ≤ |U|, the computational complexity of the exhaustivesearch approach is O(|U||G|(Q2R2.5 + R3.5)), where Q =2(MN + |G|) and R = 7MN + 2|G|.

Proposition 2. The computational complexity of the exhaus-tive search approach for Problem (24a) is O(|U||G|(Q2R2.5 +R3.5)), where Q = MN + 2|G| and R = 3MN + 2|G|.

Proof: Similar to Proposition 1, there are MN {zm,n},|G| {BDL

g }, and |G| {BSLg } decision variables in Prob-

lem (24a). Thus we have Q = MN + 2|G|. In addition, itis noted that (24b), (3), (4), (17), and (18) are LMIs, and wehave R = 3MN + 2|G|.

For all the groups, there are also |U||G| iterations for obtain-ing the optimal solution, and the computational complexityis O(|U||G|(Q2R2.5 + R3.5)), where Q = MN + 2|G| andR = 3MN + 2|G|.

V. TWO-STAGE OPTIMIZATION ALGORITHM

According to Propositions 1 and 2, the exhaustive searchapproach could be extremely time consuming and the optimalsolution cannot be found in polynomial time as the numberof users and the number of groups |G| increase. Therefore, inthis section, a two-stage optimization algorithm with muchlower computational complexity is proposed to compute asuboptimal solution. Since Problems (23a) and (24a) have

the same structure, we will focus on the design of the two-stage optimization algorithm for Problem (23a) for brevity.The algorithm for solving Problem (24a) has a similar design.

Specifically in Stage I, we fix the bandwidth allocation tofind a quasi-optimal solution k∗g and |Kg| with a sidelink re-ceivers and sender selecting algorithm. At Stage II, we executea bandwidth allocation and tile quality level selection schemebased on the k∗g and |Kg| solved in Stage I. Considering thatthe change of allocated bandwidth in each group may causesome changes on the multicast downlink rate and sidelink rate,and consequently, lead to changes on the value of k∗g and |Kg|,the solution procedure will iterate over these two stages untilthe result is convergent. The details of the two-stage approachare given in the following.

A. Stage I: Sidelink Receivers and Sender Selection

The goal of Stage I is to determine the specific sidelinksender k∗g and the number of users |Kg| for each user group g.Accordingly, the user with the worst channel condition in Kg

and Jg , which is denoted as kg,min and jg,min, respectively,can be identified. By virtually allocating fixed amounts ofbandwidth BDL

fixed and BSLfixed to each group, the original

problem can be decoupled into |G| independent problems. Thedecoupled problem of group g can be formulated as

max{xm,n},{ym,n},{|Kg|},{k∗g}

Υ†ind(g) =∑

(m,n)∈Φg

Υind(m,n)

(28a)

s.t. b∑

(m,n)∈Φg

xm,n ≤ CDL†g (28b)

b∑

(m,n)∈Φg

ym,n ≤ CSL†g (28c)

(13), (14), (15), (16), (26), (28d)

where CDL†g = minkg∈Kg

CDL(BDL

fixed , kg)

and CSL†g =

maxkg∈Kg

minjg∈Jg

CSL(BSL

fixed, kg, jg

).

The sidelink receivers and sender selection algorithm issummarized in Algorithm 1. Firstly, the users in group g aresorted according to their wireless channel conditions in thedescending order, and the information of users are re-storedin Ug . Secondly, we choose the top |Kg| users in Ug to computeCDL†

g and CSL†g , and the objective function (28a) is solved by

a standard optimization solver afterwards. To find the optimalobject value for (28a), |Kg| is changed from |Ug| − 1 to 1 bydetaching one user with the worst channel condition at a time.Thirdly, find k∗g and |Kg| according to the maximum valueamong the |Ug| − 1 results.

B. Stage II: Bandwidth Allocation and Tile Quality LevelSelection

The goal of Stage II is to properly allocate bandwidthand select the tile quality level. Based on the results of

Page 8: IEEE INTERNET OF THINGS JOURNAL, VOL.XXX, NO.XXX, MONTH

IEEE INTERNET OF THINGS JOURNAL, VOL.XXX, NO.XXX, MONTH YEAR, DOI: 10.1109/JIOT.2021.3105100 8

Algorithm 1 Sidelink Receivers and Sender Selection Algo-rithm (SRSSA)Input: Ug , |Ug|, BDL

fixed, and BSLfixed;

Output: k∗g , |Kg| (namely, kg,min), and jg,min;1: Sort Ug in descending order according to channel condi-

tion;2: for 1 ≤ |Jg| ≤ |Ug| − 1 do3: jg is the index of the user with the |Jg| worst channel

condition;4: for kg ∈ Kg do5: Compute CDL

(BDL

fixed , kg);

6: Compute CSL(BSL

fixed, kg, jg

);

7: end for8: Find min

kg∈Kg

CDL(BDL

fixed , kg);

9: Find maxkg∈Kg

CSL(BSL

fixed, kg, jg

);

10: Compute temporary Υ†ind(g) by a standard optimiza-tion solver;

11: end for12: Obtain k∗g , |Kg|, kg,min, and jg,min according to the

maximum value among all the Υ†ind(g)s;

Algorithm 2 Bandwidth Allocating Algorithm (BAA)Input: k∗g , |Kg|, kg,min and jg,min;Output: {x∗m,n} ,{y∗m,n}, {BDL∗

g } , and {BSL∗g };

1: Solve Problem (31a) with a convex optimization solver;

Algorithm 1, we rewrite constraints (23b) and (23c) as

b∑

(m,n)∈Φg

xm,n ≤ CDL(BDLg , kg,min) (29)

b∑

(m,n)∈Φg

ym,n ≤ CSL(BSLg , k∗g , jg,min), (30)

where k∗g , kg,min and jg,min are imported from Algorithm 1.Problem (23a) is then reformulated as follows.

max{x∗m,n},{y

∗m,n},

{BDL∗g },{BSL∗

g }

Υ‡ind =1

|G|∑g∈|G|

∑(m,n)∈Φg

Υind(m,n) (31a)

s.t. (3), (4), (13), (14), (15), (16), (29), (30), (26). (31b)

All the constraints in (31b) are convex, and the objectivefunction (31a) is concave. Thus Problem (31a) is a concaveoptimization, and can be solved effectively using a standardconvex optimization solver. The bandwidth allocation algo-rithm is presented in Algorithm 2.

C. Two-stage Iteration

Considering that a different bandwidth allocation for eachgroup usually leads to change of k∗g , |Kg| (namely, kg,min),and jg,min, we take multi-round iterations over the above twostages to obtain the optimal solution. The optimal solution canbe achieved after, at most, |G| · maxg∈|G||Ug| iterations. Theiterative procedure is shown in Algorithm 3.

Note that we have relaxed the discrete constraint (1)

Algorithm 3 Two-Stage Iteration Optimization AlgorithmInput: ε;Output: BDL∗

g , BSL∗g , x∗m,n, y∗m,n, k∗g , |Kg|, and Υ∗ind;

1: Set ` = 1, `max = |G| ×maxg∈|G||Ug|, and Υ∗ind = 0;2: repeat3: if ` == 1 then4: Obtain k∗g(`) and |Kg(`)| by Algorithm 1 withBDL

fixed and BSLfixed;

5: else6: Obtain k∗g(`) and |Kg(`)| by Algorithm 1 withBDL∗

g (`), BSL∗g (`);

7: end if8: Compute BDL∗

g (`), BSL∗g (`), x∗m,n(`), y∗m,n(`), and

Υ∗ind(`) by Algorithm 2 with k∗g(`) and |Kg(`)|;9: if Υ∗ind < Υ∗ind(`) then

10: Υ∗ind = Υ∗ind(`);11: BDL∗

g = BDL∗g (`) and BSL∗

g = BSL∗g (`);

12: k∗g = k∗g(`) and |Kg| = |Kg(`)|;13: x∗m,n = x∗m,n(`);14: y∗m,n = y∗m,n(`);15: end if16: ` = `+ 1;17: until ` ≥ `max

18: x∗m,n = bx∗m,nc;19: y∗m,n = by∗m,nc;

to a continuous constraint (26). Thus the solution afteriterations may not be feasible for Problem (23a). SinceΥ∗ind > Υind, we round down x∗m,n and y∗m,n to bx∗m,ncand by∗m,nc, where bx∗m,nc and by∗m,nc denote the greatestinteger less than or equal to x∗m,n and y∗m,n. Finally, the sub-optimal solution to Problem (23a) is denoted by {bx∗m,nc,by∗m,nc, k∗g , |Kg|, BDL∗

g , BSL∗g }. We have the following propo-

sition on computational complexity of the procedure.

Proposition 3. The computational complexity of the two-stagealgorithm for Scenario InD is O(|U|3−|U|+|G|2U2(Q2R2.5+R3.5)), where Q = 2G(MN + 2) and R = MN(2G+ 5).

Proof: There are two parts of computational complexityin the two-stage algorithm.

(i) The computational complexity of CDL†g and CSL†

g : Forall the groups, CDL†

g requires∑

g∈G |Ug|(|Ug| − 1)/2 itera-tions, since |Kg| starts from (|Ug|−1) to 1 in (|Ug|−1) loops.Considering that |Ug| ≤ |U|, the computational complexity ofCDL†

g is upper bounded by O(|U|(|U| − 1)).Computing CSL†

g takes∑|Ug|−1

|Kg|=1 |Kg|(|Ug| − |Kg|) itera-tions, which is (|Ug|3 − |Ug|)/6. Considering that |Ug| ≤ |U|,the computational complexity of CSL†

g is in the order ofO((|U|3 − |U|)).

The CDL†g and CSL†

g are computed jointly in Algorithm 1,and the computational complexity should be O((|U|3 − |U|)).

(ii) The computational complexity of solving the convexproblem: This is the computational complexity for findingthe optimal values for xm,n, ym,n, BDL

g , and BSLg . An

optimization solver is to be called∑

g∈|G| (|Ug| − 1) + 1times, and there are at most |G| · maxg∈|G||Ug| iterations. In

Page 9: IEEE INTERNET OF THINGS JOURNAL, VOL.XXX, NO.XXX, MONTH

IEEE INTERNET OF THINGS JOURNAL, VOL.XXX, NO.XXX, MONTH YEAR, DOI: 10.1109/JIOT.2021.3105100 9

Problem (23a), we have Q = 2G(MN +2) decision variablesand R = MN(2G+ 5) linear matrix inequalities (LMIs).

Therefore, the computational complexity of the two-stagealgorithm is O(|U|3 − |U|+ |G|2U2(Q2R2.5 +R3.5)).

Proposition 4. The space complexity of the two-stage algo-rithm for Scenario InD is O(|U|).

Proof: It requires at most |U| units of storage for com-puting CDL†

g and CSL†g , and the variables of the two-stage

algorithm are linearly proportional to the number of usersand the number of tiles in each segment. Therefore, the spacecomplexity of the two-stage algorithm is O(|U|).

D. Two-Stage Algorithm for Scenario JnD

Similar to Scenario InD, the two-stage algorithm can beutilized to solve Problem (24a) as well with the followingslight adaptations.• Compute CDL

(BDL

fixed , jg,min

).

• Compute Cjndg .

• The tile quality received by multicast receivers andsidelink receivers in the same group is consistent.

Proposition 5. The computational complexity of the two-stagealgorithm for Scenario JnD isO(|U|3−|U|+|G|2U2(Q2R2.5+R3.5)), where Q = |G|(MN + 4) and R = MN(|G|+ 2).

Proof: Similar to the two-stage algorithm for ScenarioInD, there are two parts of computational complexity in thetwo-stage algorithm for Scenario JnD.

The computational complexity of Cjndg in Stage I is

O((|U|3 − |U|)), since CDLg and CSL

g should be computed aswell in Scenario JnD.

There are Q = |G|(MN + 4) decision variables and R =MN(|G| + 2) linear matrix inequalities (LMIs) for solvingProblem (24a).

Therefore, the computational complexity of the two-stagealgorithm for Scenario InD is O(|U|3−|U|+|G|2U2(Q2R2.5+R3.5)).

VI. SIMULATION EVALUATION AND DISCUSSIONS

In this section, the performance of the proposed algorithmsfor the two scenarios are evaluated with simulations, respec-tively. The parameters used in our simulations are presentedin Table II. Specifically, we consider one cell with a coveragearea of 250×250m2. Fifty users are randomly distributed inthe cell area, at distances larger than 100m from the BSand no more than 6 m from each other, respectively. Similarto the standard LTE parapemers [43], the carrier frequencyof BS-user downlink and sidelink are 2.6GHz and 2.4GHz,respectively. We consider communications with 20MHz band-width for both the downlink and the sidelink. The log-normalshadowing has an 8dB standard deviation and the Rayleighfading follows a complex Gaussian process with an 1dBstandard deviation. The one-sided power spectral density ofwhite Gaussian noise is N0 = −134dBm/MHz. The videolibrary hosts 100 360◦ videos, each with 1000 frames. Each360◦ video frame contains M × N = 8 × 10 tiles, and each

TABLE IIDEFAULT SIMULATED PARAMETERS

Parameters Values

Cell area 250 m × 250 mDistance between BS and Users ≥ 100 mDistance among users [2, 65] mNumber of users 50Bandwidth of BS downlink 20 MHzBandwidth of sidelink 20 MHzN0 -134 dBm/MHzShadow fading Log-normal,

mean = 0, sigma=8Rayleigh fast fading complex Gaussian,

mean = 0, sigma=1Carrier Frequency of BS downlink 2.6 GHzCarrier Frequency of sidelink 2.4 GHzBS transmit power 46 dBmUser equipment transmit power 23 dBmAntenna gain User: -1 dBi;

BS: 16 dBiAntenna configure for BS SISO (1Tx 1Rx)Antennas for User 1 for downlink;

1 for sidelinkTiles per 360◦ video frame 80Frames per video 1000Number of videos 100Tile quality smoothness ∆ 1Number of user groups [10, 50]

user requires 40 tiles uniformly. There are L = 10 levels ofquality for each tile and b = 8.408× 105 [16].

The proposed algorithms for solving the problems of Sce-nario InD and Scenario JnD are termed by IndependentDecoding Algorithm (InDec) and Joint Decoding Algorithm(JnDec), respectively. The following baseline schemes are alsosimulated for comparison purpose.

1) The optimal solutions (Exhaustive-InD and Exhaustive-JnD), which are the upper bound of the proposed algo-rithms and are obtained by exhaustive search.

2) The equal-bandwidth allocation algorithm with sidelinktransmissions (Eq-SL).

3) The equal-bandwidth allocation without sidelink trans-missions (Eq-NoSL).

4) Bandwidth allocation optimization algorithm withoutsidelink transmissions (Ba-NoSL), which considers mul-ticast transmissions only. The Ba-NoSL algorithm is asimplified version of the Algorithm 1 proposed in [12],but the power is not optimized for a consistent perfor-mance comparison.

5) Random downlink (and sidelink) bandwidth allocation,which is termed “Random.”

The equal-resource allocation schemes (i.e., Eq-SL and Eq-NoSL) allocate the same amount of bandwidth resource toeach multicast downlink (and sidelink) transmission. In eachsimulation, 100 random independent channel realizations aregenerated using Matlab and the Sedumi tool box.

A. Complexity of the Algorithms

According to Propositions 1, 2, 3, and 5, the proposedalgorithms have a much lower complexity than that of the ex-

Page 10: IEEE INTERNET OF THINGS JOURNAL, VOL.XXX, NO.XXX, MONTH

IEEE INTERNET OF THINGS JOURNAL, VOL.XXX, NO.XXX, MONTH YEAR, DOI: 10.1109/JIOT.2021.3105100 10

TABLE IIIEXECUTION TIMES AND TOTAL UTILITIES OF THE FOUR PROPOSED

ALGORITHMS WITH BDL = BSL = 20 MHZ

Algorithms Computation time (s) Total utilityStage I Stage II

Exhaustive-InD 2,002,780.16 130Exhaustive-JnD 1,405,091.84 149

Algorithm InDec 17.01 1.91 122Algorithm JnDec 15.08 1.34 144

1 2 3 4 5 6 7 8 9 10

iterations

115

120

125

130

135

140

145

150

To

tal U

tilitie

s

Exhaustive-InD @ 20 MHz

Exhaustive-JnD @ 20 MHz

InDec @ 20 MHz

JnDec @ 20 MHz

Fig. 4. Convergence of the proposed algorithms with 20MHz downlink and20MHz sidelink bandwidth.

haustive search, and Algorithm JnDec has a lower complexitythan that of Algorithm InDec. In this simulation, we set afixed number of groups (i.e. |G| = 10) and fixed number ofgroup users (i.e. |Ug| = 5, g ∈ |G|) to verify the complexity ofthe algorithms. We find the simulation results are consistentwith the propositions. As shown in Table III, the proposedalgorithms save more than 20,000 times of execution timethan the exhaustive search based algorithms. We also find thatAlgorithm InDec is more time consuming while it has a similarutility performance as Algorithm JnDec. However, there arealso disadvantages for JnDec, since it requires users bufferthe received signals, which increases the receiver complexity.

B. Algorithm Convergence Performance

Fig. 4 shows the convergence of the two proposed algo-rithms with 20MHz multicast and 20MHz sidelink bandwidth.We can see that the sub-optimal solutions of both AlgorithmInDec and Algorithm JnDec are obtained after at most fouriterations. Both proposed iterative algorithms converge veryfast, and the total utilities they achieve are also close to theoptimal solutions produced by the time-consuming exhaustivesearch based algorithms, respectively.

2 2.5 3 3.5 4

Download Bandwidth (@ 20MHz Sidelink Bandwidth) 107

40

60

80

100

120

140

160

180

Tota

l U

tilit

ies

Exhaustive-JnD

JnDec

Exhaustive-InD

InDec

Eq-SL

Ba-NoSL

Eq-NoSL

Random

(a) Total utilities with different BDL.

2 2.5 3 3.5 4

Sidelink Bandwidth (@ 20MHz Download Bandwidth) 107

60

70

80

90

100

110

120

130

140

150

Tota

l U

tilit

ies

Exhaustive-JnD

JnDec

Exhaustive-InD

InDec

Eq-SL

Ba-NoSL

Eq-NoSL

Random

(b) Total utilities with different BSL.

Fig. 5. Total utilities achieved by the algorithms with different bandwidths.

C. Total Utility versus Bandwidth

Fig. 5 demonstrates the total utilities achieved by theproposed algorithms (InDec and JnDec), Exhaustive-InD algo-rithm, Exhaustive-JnD algorithm, Eq-SL algorithm, Eq-NoSLalgorithm, Ba-NoSL algorithm, and Random algorithm withrespect to BS downlink bandwidth BDL and sidelink band-width BSL, respectively. From Fig. 5(a), we can see that thetotal utilities of all the algorithms (except the Random algo-rithm) increase with BDL, while the two proposed algorithmsalways outperform the other baseline algorithms and achievenear optimal performance (as their total utilities are close tothe corresponding upper bounds, i.e., the utilities achieved byExhaustive-InD and Exhaustive-JnD, respectively). These re-sults demonstrate the effectiveness of the proposed suboptimalsolutions. It is shown that the proposed algorithms achieve12%, 32%, and 46% performance gain over Ba-NoSL, Eq-SL, and Eq-NoSL, respectively. For the Average bandwidthallocation algorithms, the utility performance of Algorithm Eq-SL is still 11% higher than that of Algorithm Eq-NoSL, which

Page 11: IEEE INTERNET OF THINGS JOURNAL, VOL.XXX, NO.XXX, MONTH

IEEE INTERNET OF THINGS JOURNAL, VOL.XXX, NO.XXX, MONTH YEAR, DOI: 10.1109/JIOT.2021.3105100 11

2 2.5 3 3.5 4

Download Bandwidth (@ 20MHz Sidelink Bandwidth) 107

110

120

130

140

150

160

170T

ota

l U

tilit

ies

JnDec

JnDec(no smooth)

InDec

InDec(no smooth)

Fig. 6. Total utilities performance of the proposed algorithms with respectto tile quality smoothness.

demonstrate the efficacy of sidelink transmissions.As shown in Fig. 5(b), our proposed algorithms also out-

perform the other baseline algorithms with increased sidelinkbandwidth. It is worth noting that the performance of JnDecno longer grows when the sidelink bandwidth exceeds 35MHz.The reason is that the video tile quality is limited by Cjnd

g

defined in (25), which is the minimum value of CDLg and

CDL(BDLg , jg) + CSL

g . As CSLg increases as more bandwidth

for the sidelink, Cjndg will be equal to CDL

g (a constantvalue, fixed 20MHz downlink bandwidth), and the total utilitywill become constant. Furthermore, we can see that the totalutilities of all the algorithms rise slowly with the increase ofBSL, since the tile quality sent by the sidelink should not behigher than that of the BS multicast downlink.

D. Total Utility Versus Smoothness

Obviously, the results of problems (23a) and (24a) will bedifferent if the tile quality smoothness constraints, i.e., (13)–(18), are not considered. However, the total utilities can still beobtained with the proposed two-stage optimization algorithms.The new algorithms without considering tile quality smooth-ness in Scenario JnD and InD are termed JnDec (no smooth)and InDec (no smooth), respectively. Fig. 6 and Fig. 7 illustratethe impact of tile quality smoothness on the total utility andthe average tile quality performance. We can see in Fig. 6that the total utilities of the JnDec (no smooth) and InDec (nosmooth) algorithms are lower than that of the JnDec and InDecalgorithms, respectively. In Fig. 7, the tile quality levels of thestandard JnDec and InDec algorithms are much smoother thanthat of JnDec (no smooth) and InDec (no smooth). The latterconsists of many high peaks and deep valleys, which greatlydegrade the user QoE.

E. Tile Quality Level Performance

As shown in Fig. 8, we present the average tile quality levelperformance among different algorithms at 20MHz downlinkand 20MHz sidelink bandwidth in the color block charts. The

chart with 8×10 blocks represents the video frame, and eachblock with different colors stands for one tile. The average tilequality level increases as the block color changes from blue toyellow. For example, the average tile quality level of a yellowblock is higher than those of blue blocks. We can see that thereare more block in yellow colors when the InDec and JnDecalgorithms are used, meaning that among all the comparedalgorithms, the proposed algorithms have the ability to selectthe highest quality representations for users, which effectivelyincreases the user QoE. Note that we do not show the averagetile quality level performance of the Random algorithm, whichis poor due to the stochastic behavior in different simulations.

Fig. 9 verifies the utility performance with different numbersof user groups, while the multicast downlink and the sidelinkbandwidths are both fixed at 20MHz. We can see that thetile quality level decreases when the number of groups isincreased from 10 to 50, since less bandwidth can be allocatedto each group as there are more groups. However, the proposedalgorithms still outperform the baseline algorithms, whichmeans that their tile quality levels are much higher than thoseof the baseline algorithms.

VII. CONCLUSION

In this paper, a sidelink-aided multicast system for multi-quality tiled 360◦ VR video was proposed to achieve high userexperience under the constraints of bandwidth resource and tilequality smoothness in overloaded situations. We formulatedoptimization problems that took into account the relationshipamong tile quality level selection, sidelink sender selection,and bandwidth allocation to maximize the total utility of allusers for the independent decode scenario and the joint decodescenario, respectively. Two-stage iterative algorithms were pro-posed to obtain sub-optimal solutions with low computationalcomplexity. The simulation results demonstrated the manyadvantages of the proposed algorithm by exploiting sidelinktransmissions for tiled 360◦ VR video multicast.

REFERENCES

[1] H. Bellini, “The real deal with virtual and augmented reality,”Feb. 2016. [Online]. Available: http://www.goldmansachs.com/our-thinking/pages/virtual-and-augmented-reality.html

[2] L. Graham, “Citi eyes a trillion-dollar industryin virtual reality technology,” Oct. 2016. [Online].Available: http://www.cnbc.com/2016/10/14/citi-eyesa-trillion-dollar-industry-in-virtual-reality-technology.html

[3] J. Dai, Z. Zhang, S. Mao, and D. Liu, “A view synthesis-based 360◦VR caching system over MEC-enabled C-RAN,” IEEE Trans. CircuitsSyst. Video Technol., vol. 30, no. 10, pp. 3843–3855, Oct. 2020.

[4] T. Zhang and S. Mao, “Joint power and channel resource optimizationin soft multi-view video delivery,” IEEE Access J., vol. 7, no. 1,pp.148 084–148 097, Dec. 2019.

[5] Huawei, Inc., “White paper: Hosted network require-ments oriented on VR service,” 2016. [Online]. Available:http://www.imxdata.com/archives/17346

[6] M. Hosseini and V. Swaminathan, “Adaptive 360 VR video streaming:Divide and conquer,” in Proc. IEEE ISM’16, San Jose, CA, Dec. 2016,pp. 107–110.

[7] M. Graf, C. Timmerer, and C. Mueller, “Towards bandwidth efficientadaptive streaming of omnidirectional video over http: Design, imple-mentation, and evaluation,” in Proc. ACM MMSys’17, Taipei, Taiwan,June 2017, pp. 261–271.

Page 12: IEEE INTERNET OF THINGS JOURNAL, VOL.XXX, NO.XXX, MONTH

IEEE INTERNET OF THINGS JOURNAL, VOL.XXX, NO.XXX, MONTH YEAR, DOI: 10.1109/JIOT.2021.3105100 12

Fig. 7. Average tile quality level performance of the proposed algorithms with respect to tile quality smoothness.

InDec

2 4 6 8 10

column

2

4

6

8

row

2

4

6

8

10JnDec

2 4 6 8 10

column

2

4

6

8

row

2

4

6

8

10Eq-SL

2 4 6 8 10

column

2

4

6

8

row

2

4

6

8

10Ba-NoSL

2 4 6 8 10

column

2

4

6

8

row

2

4

6

8

10Eq-NoSL

2 4 6 8 10

column

2

4

6

8

row

2

4

6

8

10

Fig. 8. Average tile quality performance of different algorithms when BDL = BSL = 20MHz.

InDec @ Groups = 10

2 4 6 8 10

column

2

4

6

8

row

2

4

6

8

10InDec @ Groups = 20

2 4 6 8 10

column

2

4

6

8

row

2

4

6

8

10InDec @ Groups = 30

2 4 6 8 10

column

2

4

6

8

row

2

4

6

8

10InDec @ Groups = 40

2 4 6 8 10

column

2

4

6

8

row

2

4

6

8

10InDec @ Groups = 50

2 4 6 8 10

column

2

4

6

8

row

2

4

6

8

10

(a) Average tile quality performance of InDecJnDec @ Groups = 10

2 4 6 8 10

column

2

4

6

8

row

2

4

6

8

10JnDec @ Groups = 20

2 4 6 8 10

column

2

4

6

8

row

2

4

6

8

10JnDec @ Groups = 30

2 4 6 8 10

column

2

4

6

8

row

2

4

6

8

10JnDec @ Groups = 40

2 4 6 8 10

column

2

4

6

8

row

2

4

6

8

10JnDec @ Groups = 50

2 4 6 8 10

column

2

4

6

8

row

2

4

6

8

10

(b) Average tile quality performance of JnDec

Fig. 9. Average tile quality performance of the proposed algorithm with respect to different numbers of user groups when BDL = BSL = 20MHz.

[8] A. Zare, A. Aminlou, M. M. Hannuksela, and M. Gabbouj, “HEVC-compliant tile-based streaming of panoramic video for virtual realityapplications,” in Proc. ACM Int. Conf. Multimedia 2016, Amsterdam,The Netherlands, Oct 2016, pp. 601–605.

[9] R. Skupin, Y. Sanchez, D. Podborski, C. Hellge, and T. Schierl,“HEVC tile based streaming to head mounted displays,” in Proc. IEEECCNC’17, Las Vegas, NV, Jan. 2017, pp. 613–615.

[10] D. V. Nguyen, H. T. T. Tran, A. T. Pham, and T. C. Thang, “An optimaltile-based approach for viewport-adaptive 360-degree video streaming,”IEEE J. Emerg. Sel. Topics Circuits Syst., vol. 9, no. 1, pp. 29–42, Mar.2019.

[11] Y. Bao, T. Zhang, A. Pande, H. Wu, and X. Liu, “Motion-prediction-

based multicast for 360-degree video transmissions,” in Proc. IEEESECON’17, San Diego, June 2017, pp. 1–9.

[12] C. Guo, Y. Cui, and Z. Liu, “Optimal multicast of tiled 360 VR video inOFDMA systems,” IEEE Commun. Lett., vol. 22, no. 12, pp. 2563–2566,Dec. 2018.

[13] C. Guo, Y. Cui, and Z. Liu, “Optimal multicast of tiled 360 VR video,”IEEE Wireless Commun. Lett., vol. 8, no. 1, pp. 145–148, Feb. 2019.

[14] H. Ahmadi, O. Eltobgy, and M. Hefeeda, “Adaptive multicast streamingof virtual reality content to mobile users,” in Proc. Thematic Workshopsof ACM MM 2017, Mountain View, CA, Oct. 2017, pp. 170–178.

[15] W. Xu, Y. Wei, Y. Cui, and Z. Liu, “Energy-efficient multi-view videotransmission with view synthesis-enabled multicast,” in 2018 IEEE

Page 13: IEEE INTERNET OF THINGS JOURNAL, VOL.XXX, NO.XXX, MONTH

IEEE INTERNET OF THINGS JOURNAL, VOL.XXX, NO.XXX, MONTH YEAR, DOI: 10.1109/JIOT.2021.3105100 13

Global Communications Conference (GLOBECOM). IEEE, 2018, pp.1–7.

[16] K. Long, C. Ye, Y. Cui, and Z. Liu, “Optimal multi-quality multicast for360 virtual reality video,” in Proc. IEEE GLOBECOM’18, Abu Dhabi,UAE, Dec. 2018, pp. 1–6.

[17] J. Park, J. Hwang, and H. Wei, “Cross-layer optimization for VR videomulticast systems,” in Proc. IEEE GLOBECOM’18, Abu Dhabi, UAE,Dec. 2018, pp. 1–6.

[18] N. Jindal and Z. Luo, “Capacity limits of multiple antenna multicast,”in Proc. IEEE ISIT’06, Seattle, WA, July 2006, pp. 1841–1845.

[19] N. D. Sidiropoulos, T. N. Davidson, and Zhi-Quan Luo, “Transmit beam-forming for physical-layer multicasting,” IEEE Trans. Signal Process.,vol. 54, no. 6, pp. 2239–2251, June 2006.

[20] F. Hou, L. X. Cai, P.-H. Ho, X. Shen, and J. Zhang, “A cooperativemulticast scheduling scheme for multimedia services in IEEE 802.16networks,” IEEE Trans. Wireless Commun., vol. 8, no. 3, pp. 1508–1519, 2009.

[21] J. Seppala, T. Kosksela, T. Chen, and S. Hakola, “Network controlleddevice-to-device (D2D) and cluster multicast concept for LTE and LTE-A networks,” in Proc. IEEE WCNC’11, Cancun, Mexico, Mar. 2011,pp. 986–991.

[22] C. Yin, Y. Wang, W. Lin, and J. Xu, “Device-to-device assisted two-stage cooperative multicast with optimal resource utilization,” in Proc.IEEE Globecom Workshops 2014, Austin, TX, Dec. 2014, pp. 839–844.

[23] F. Rebecchi, L. Valerio, R. Bruno, V. Conan, M. D. de Amorim, andA. Passarella, “A joint multicast/D2D learning-based approach to LTEtraffic offloading,” Elsevier Computer. Commun., vol. 72, pp. 26–37,Dec. 2015.

[24] G. Araniti, A. Orsino, L. Militano, G. Putrino, S. Andreev, Y. Kouch-eryavy, and A. Iera, “Novel D2D-based relaying method for multicastservices over 3GPP LTE-A systems,” in Proc. 2017 IEEE Int. Symp.Broadband Multimedia Syst. Broadcast., Cagliari, Italy, Mar. 2017, pp.1–5.

[25] T. V. Santana, R. Combes, and M. Kobayashi, “Device-to-device aidedmulticasting,” in Proc. IEEE ISIT’18, Vail, CO, June 2018, pp. 771–775.

[26] P. Mursia, I. Atzeni, D. Gesbert, and M. Kobayashi, “D2d-aided multi-antenna multicasting,” in ICC 2019 - 2019 IEEE International Confer-ence on Communications (ICC), 2019, pp. 1–6.

[27] V. Ntranos, N. D. Sidiropoulos, and L. Tassiulas, “On multicast beam-forming for minimum outage,” IEEE Trans. Wireless Commun., vol. 8,no. 6, pp. 3172–3181, June 2009.

[28] O. Mehanna, N. D. Sidiropoulos, and G. B. Giannakis, “Joint multicastbeamforming and antenna selection,” IEEE Trans. Signal Process.,vol. 61, no. 10, pp. 2660–2674, May 2013.

[29] D. Lecompte and F. Gabin, “Evolved multimedia broadcast/multicastservice (eMBMS) in LTE-advanced: Overview and Rel-11 enhance-ments,” IEEE Commun. Mag., vol. 50, no. 11, pp. 68–74, Nov. 2012.

[30] O. Eltobgy, “VRCast: Mobile streaming of live 360-degree videos,”Ph.D. dissertation, School of Computing Science, Simon Fraser Uni-versity, Dec. 2018.

[31] A. Khisti, U. Erez, and G. W. Wornell, “Fundamental limits and scalingbehavior of cooperative multicasting in wireless networks,” IEEE Trans.Inf. Theory, vol. 52, no. 6, pp. 2762–2770, June 2006.

[32] B. Sirkeci-Mergen and M. C. Gastpar, “On the broadcast capacity ofwireless networks with cooperative relays,” IEEE Trans. Inf. Theory,vol. 56, no. 8, pp. 3847–3861, Aug. 2010.

[33] Y. Chen, J. Zhu, T. A. Khan, and B. Kasikci, “CPU microarchitecturalperformance characterization of cloud video transcoding,” in Proc. 2020IEEE Int. Symp. Workload Characterization (IISWC), 2020, pp. 72–82.

[34] Y. Kim, J. Huh, and J. Jeong, “Distributed video transcoding systemfor 8K 360◦ VR tiled streaming service,” in Proc. 2018 Int. Conf.Information Commun. Technol. Convergence (ICTC), 2018, pp. 592–595.

[35] Y. Ren, F. Liu, Z. Liu, C. Wang, and Y. Ji, “Power control in D2D-basedvehicular communication networks,” IEEE Trans. Veh. Technol., vol. 64,no. 12, pp. 5547–5562, 2015.

[36] W. Huang, L. Ding, G. Zhai, X. Min, J.-N. Hwang, Y. Xu, and W. Zhang,“Utility-oriented resource allocation for 360-degree video transmissionover heterogeneous networks,” Elsevier Digit. Signal Process., vol. 84,no. 1, pp. 1–14, Jan. 2019.

[37] V. Sitzmann, A. Serrano, A. Pavel, M. Agrawala, D. Gutierrez, B. Masia,and G. Wetzstein, “Saliency in VR: How do people explore virtualenvironments?” IEEE Trans. Vis. Comput. Graphics, vol. 24, no. 4, pp.1633–1642, Apr. 2018.

[38] R. Monroy, S. Lutz, T. Chalasani, and A. Smolic, “Salnet360: Saliencymaps for omni-directional images with CNN,” Elsevier Signal Process.:Image Commun., vol. 69, pp. 26–34, Nov. 2018.

[39] W. Zhang, Y. Wen, Z. Chen, and A. Khisti, “QoE-driven cache man-agement for HTTP adaptive bit rate streaming over wireless networks,”IEEE Trans. Multimedia, vol. 15, no. 6, pp. 1431–1445, Oct. 2013.

[40] J. Park, J.-N. Hwang, Q. Li, Y. Xu, and W. Huang, “Optimal DASH-multicasting over LTE,” IEEE Trans. Veh. Technol., vol. 67, no. 5, pp.4487–4500, May 2018.

[41] J. Yang, J. Luo, D. Meng, and J.-N. Hwang, “QoE-driven resourceallocation optimized for uplink delivery of delay-sensitive VR videoover cellular network,” IEEE Access, vol. 7, pp. 60 672–60 683, May2019.

[42] D. Peaucelle, D. Henrion, Y. Labit, and K. Taitz, “User’s guidefor SeDuMi interface 1.04,” Sept. 2002. [Online]. Available:http://homepages.laas.fr/peaucell/software/sdmguide.pdf

[43] A. Awada, E. Lang, O. Renner, K. Friederichs, S. Petersen, K. Pfaffinger,B. Lembke, and R. Brugger, “Field trial of LTE eMBMS network for TVdistribution: Experimental results and analysis,” IEEE Trans. Broadcast.,vol. 63, no. 2, pp. 321–337, June 2017.

Jianmei Dai received a B.S. degree in Commu-nication Engineering and an M.S. degree in Com-munication and Information Systems in 2004 and2007, respectively, and the Ph.D degree in Informa-tion and Communication Engineering from BeijingUniversity of Posts and Telecommunications, Bei-jing, in 2021. He was a visiting scholar at AuburnUniversity, AL, USA from Apr. 2019 to Apr. 2020.He is currently a lecturer at SEU, Beijing. Hisresearch interests include optimization theory andits applications in wireless video transmission and

wireless networks.

Guosen Yue [S’03-M’04-SM’09] received the B.S.degree in physics and the M.S. degree in electri-cal engineering from Nanjing University, Nanjing,China, in 1994 and 1997, respectively, and the Ph.D.degree in electrical engineering from Texas A&MUniversity, College Station, TX, USA, in 2004. Hewas a Senior Research Staff at NEC LaboratoriesAmerica, Princeton, NJ, USA, where he conductedresearch on broadband wireless systems and mobilenetworks. From 2013 to 2015, he was with Broad-com Corporation, Matawan, NJ, USA, as a System

Design Scientist. He is now a Principal Research Staff at Futurewei Technolo-gies, Bridgewater, NJ, USA. His research interests are in the general areas ofwireless communications and signal processing. He has served as an AssociateEditor of the IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONSand EURASIP Research Letters in Communications, the Guest Editor ofEURASIP Journal of Wireless Communication and Networking special issueon interference management, Elsevier’s Physical Communication special issueon signal processing and coding. He served as the Symposium Co-chair forIEEE GlobeCom 2019, IEEE ICC 2010, the Track Co-chair for IEEE ICCCN2008, the steering committee member for IEEE RWS 2009. He also served asthe director of IEEE MMTC Communication Frontier. He is a senior memberof the IEEE.

Page 14: IEEE INTERNET OF THINGS JOURNAL, VOL.XXX, NO.XXX, MONTH

IEEE INTERNET OF THINGS JOURNAL, VOL.XXX, NO.XXX, MONTH YEAR, DOI: 10.1109/JIOT.2021.3105100 14

Shiwen Mao [S’99-M’04-SM’09-F’19] received hisPh.D. in electrical engineering from PolytechnicUniversity, Brooklyn, NY in 2004. After joiningAuburn University, Auburn, AL in 2006, he held theMcWane Endowed Professorship from 2012 to 2015and the Samuel Ginn Endowed Professorship from2015 to 2020 in the Department of Electrical andComputer Engineering. Currently, he is a professorand Earle C. Williams Eminent Scholar Chair, andDirector of the Wireless Engineering Research andEducation Center at Auburn University. His research

interest includes wireless networks, multimedia communications, and smartgrid. He is an Associate Editor-in-Chief of IEEE/CIC China Communications,an Area Editor of IEEE Transactions on Wireless Communications, IEEEInternet of Things Journal, IEEE Open Journal of the CommunicationsSociety, and ACM GetMobile, and an Associate Editor of IEEE Transac-tions on Cognitive Communications and Networking, IEEE Transactions onNetwork Science and Engineering, IEEE Transactions on Mobile Computing,IEEE Network, IEEE Multimedia, and IEEE Networking Letters, amongothers. He is a Distinguished Lecturer of IEEE Communications Societyand a Distinguished Lecturer of IEEE Council of RFID. He received theIEEE ComSoc TC-CSR Distinguished Technical Achievement Award in2019 and NSF CAREER Award in 2010. He is a co-recipient of the 2021IEEE Communications Society Outstanding Paper Award, the IEEE VehicularTechnology Society 2020 Jack Neubauer Memorial Award, the IEEE ComSocMMTC 2018 Best Journal Award and 2017 Best Conference Paper Award, theBest Demo Award of IEEE SECON 2017, the Best Paper Awards from IEEEGLOBECOM 2019, 2016 & 2015, IEEE WCNC 2015, and IEEE ICC 2013,and the 2004 IEEE Communications Society Leonard G. Abraham Prize inthe Field of Communications Systems.

Danpu Liu received the Ph.D. degree in communi-cation and electrical systems from Beijing Universityof Posts and Telecommunications, Beijing, China in1998. She was a visiting scholar at City Universityof Hong Kong in 2002, University of Manchester in2005, and Georgia Institute of Technology in 2014.She is currently working at the Beijing Key Lab-oratory of Network System Architecture and Con-vergence, Beijing University of Posts and Telecom-munications, Beijing, China. Her research involvedMIMO, OFDM as well as broadband wireless access

systems. She has published over 100 papers and 3 teaching books, andsubmitted 26 patent applications. Her recent research interests include 60GHzmmWave communication, wireless high definition video transmission andwireless sensor network.