data-driven control of cellular networks · iot sensor updates. challenges of network control 1....
TRANSCRIPT
Data-Driven Control of Cellular Networks
Sandeep Chinchali, Marco Pavone, Sachin Katti
Stanford University
Forecaster Controller
Data-Driven Network Control is Ubiquitous
Robotic Taxi Fleets
Optimal Control
Cell Congestion
Video Streaming
IoT Sensor Updates
Challenges of Network Control
1. Data-driven forecasts• What features/statistics are
needed for control?
2. Many Input Variables• Forecaster and Controller
3. Increasingly: • Data boundaries
Cell Congestion
Controller
Network Operator
Control Reward
Forecasted Features
Action
Controller Private StateForecaster Private State
Joint (Public) State
(Control API)
General Approach
Forecaster Controller
Video QoE
Private: User mobility Private: Buffer State
Past Throughputs
Bitrate
Future Throughputs(Risk-adjusted, ~30s)
Forecaster Controller
Video Streaming
Network Operator
Cloud Video Services
Passenger Wait Time,Ride Efficiency
Future Cell Congestion,Anomalies (~ hrs)
Taxi Routes
Private: User Location, Cell Demand
Private: Taxi locations, Outstanding Demand
Forecaster Controller
Robotic Taxi Fleet City-wide Congestion (Google Maps)
Network Operator
Taxi Operator
Approach: Reinforcement Learning (RL)
Forecaster Controller
Reinforcement Learning (RL)
Goal: Maximize the total reward
Agent Environment
Observe state 𝑠𝑡
Action 𝑎𝑡
Reward 𝑟𝑡
8Adapted from Pensieve (Sigcomm 18, Mao et. al.)
Cellular Network Traffic Scheduling (AAAI 2018)
9
Internet
Delay Sensitive
Real-time Mobile Traffic
Internet of Things (IoT)
Delay Tolerant(Map/SW updates)
Cellular Network Traffic Scheduling using Deep RL (AAAI 2018)
Why is IoT traffic scheduling hard?
IoT
Congestion
Acceptable Limit IoT Contending goals
• Max IoT data
• Loss to mobile traffic
• Network limits
Optimal Control
RL Schedules Sensor Updates1. Network State Space (Cell congestion forecasts)2. IoT Scheduler Actions (Traffic Rate)3. Operator policies/reward: efficient use of network
11
Num Users
IoT Scheduler
Network state + forecasts
Congestion
Cell efficiency IoT rate
RL Dynamics: Live Network Experiments
Agent Environment
Action
Network state
Reward
RL exploits transient dips in utilization
Controlled Congestion Utilization gain
13
Transient Dip
Application 2: Mobile Video Streaming
ABR agent
state
Neural Network
240P
480P
720P
1080P
policy πθ(s, a)
Observe state s
parameter θ
Figure from Pensieve (Sigcomm 18, Mao et. al.)
How will forecasts of network conditions improve ABR?
Palo Alto Cell Throughput Diversity
Insight: Foresight of true network condition helps
Solution: Dynamically splice specialized controllers (metaRL)
Switch CellFreeway
High-LevelTrace
Statistics[μ, 𝝈](API)
VLO
LO
MID
HI
metaRL
Bitrate
metaRL controllerForecaster
VLO
LO
MID
HI
Infer network condition from forecast statsPrivate: User Location
QoE
Palo Alto (Our data) + FCC/Norway (Pensieve)
metaRL
Generalize to FCC/Norway data from Pensieve
metaRL
Re-analysis of Pensieve (Sigcomm 18, Mao et. al.)
Linear QoE (hi-thpt) HD QoE (vlo-thpt)
Optimize Tail
Future work
1. Broad-vision for Time-Series Control• Data-driven forecasts/ control
strategies• Intrinsic data boundaries
2. Value/Price of Information used for Long-Term Control?
3. Privacy/Information Leakage
Questions: [email protected]
Forecaster Controller
Extra slides
Claim: Decouple but co-design predictor and controller
Why not end-to-end learning?
Why Decouple?1. Natural Data Boundaries2. Modularity (Re-use forecaster)
Why Co-design?1. Tune forecasts to control risk2. Robust Adversarial Training
22
End-to-end
State Action
End-to-end
State Action
Control API
Reward
Forecaster Controller
Private Info Private Info
ControllerForecaster
RL Formulation
Adversarial
Quantifying Sub-optimality Gap
With oracle knowledge of network condition Have to learn network condition
Value/Price of Timeseries Variables?
Gap
IoT Traffic Scheduling (AAAI 2018)
Diverse city-wide cell patterns
IoT Scheduler
Network State
IoT rate