Source: d2zmdbbm9feqrf.cloudfront.net/2016/anz/pdf/brkcol-2778.pdf
Evolution of Collaboration Protocols
Adrian Wang, TME, CTG
IEEE, IETF, H.460, ITU, ETSI,
TIPv7 ISO, XMPP…BGP, MPLS,
SCTP, MPEG4 over SCTP, IEEE
802.1 H.323, SIP, H.239, H.460,
TIPv6, TIPv7, IEEE, TIPv7 ISO,
Cisco, The Real Collaboration Leader
for Standardisation and Evolution of Technologies
Agenda
Protocols for Media
Protocols for Controls
Protocols for Cloud
Opus
Opus is an audio codec for speech and music
Introduction & Background
Yet another audio codec?!?!
Existing codecs: AAC-LD, G.722, G.722.1, G.728, G.711, G.729AB, PCM16, Vorbis, MP3
ITU-T definition (4 bands)
Signal Bandwidth & Sampling Rate Definitions
Abbreviation  Meaning        Pass-band      Sampling Rate
NB            Narrowband     300-3,400 Hz   8 kHz
WB            Wideband       50-7,000 Hz    16 kHz
SWB           Superwideband  50-14,000 Hz   32 kHz
FB            Fullband       20-20,000 Hz   48 kHz
• Flexible speech and audio codec
• Royalty-free
• Open source
• Standardised by the Internet Engineering Task Force (IETF) as RFC 6716 (September 2012)
Introduction & Background
What does it do?
CE based endpoint: 32 bit x 48 kHz = 1536 kbps After Encoding: 48 kbps!!
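The arithmetic on this slide can be checked with a short sketch. It assumes a single (mono) channel and 1 kbps = 1000 bit/s; the function name is illustrative only.

```python
# Raw PCM bitrate before encoding, using the figures quoted on the slide.
BITS_PER_SAMPLE = 32        # from the slide
SAMPLE_RATE_HZ = 48_000     # 48 kHz fullband sampling, from the slide
OPUS_KBPS = 48              # encoded bitrate quoted on the slide


def raw_pcm_kbps(bits_per_sample: int, sample_rate_hz: int, channels: int = 1) -> float:
    """Uncompressed bitrate in kbps (assumes 1 kbps = 1000 bit/s)."""
    return bits_per_sample * sample_rate_hz * channels / 1000


raw_kbps = raw_pcm_kbps(BITS_PER_SAMPLE, SAMPLE_RATE_HZ)
compression_ratio = raw_kbps / OPUS_KBPS

print(f"raw: {raw_kbps:.0f} kbps, encoded: {OPUS_KBPS} kbps, ratio: {compression_ratio:.0f}:1")
```

In other words, the encoder sends roughly one thirty-second of the raw sample data.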
Opus Characteristics
• Bit rate: 6 to 510kbps
• Sampling Rate: 8 to 48kHz
• CBR and VBR
• Narrowband to Fullband
• Mono/Stereo/Multichannel
• Frame Size: 2.5 to 60ms
• Frames are either mono or stereo
• Variable Complexity
Adaptive Bitrate vs Variable Bit Rate
• Adaptive Bit Rate changes the bitrate during a call based on network conditions
• Variable Bit Rate changes the bitrate during a call based on the amount of information in the audio stream (e.g. speech vs. silence)
• Both can be used at the same time
1. SILK
• Developed by Skype
• Based on Linear Prediction
• Efficient for Voice
• Up to 8 kHz audio bandwidth
2. CELT
• Developed by Xiph.Org
• Based on MDCT
• Good for universal audio/music
Merging Two Codecs
Opus Three Operating Modes
Mode    Codec Principle  Typical Bandwidth  Application      Similar as
LP      Modified SILK    Wideband           Voice            G729/iLBC
MDCT    CELT             Fullband           Music            AAC-LD
Hybrid  SILK + CELT      Fullband           Voice+Music Mix
Opus Internet Robustness
• Inband Forward Error Correction (FEC) – LP and Hybrid Mode only
  – Sends extra data (overhead) as a tool to rebuild lost packets
  – An important packet (chosen by the algorithm) carries a copy of the previous packet, re-encoded at a lower bitrate
  – Adds latency on the decoder side (it has to wait for the packet following the lost one…)
• Discontinuous Transmission (DTX)
  – Reduces the packet rate during silence
  – When enabled, only one frame every 400 ms is encoded
• Packet Loss Concealment (PLC)
  – Decoder side
  – Fills in DTX blanks (Opus will “synthesise” missing audio based on previous packets)
Internet Robustness Mechanisms available inside the Codec
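To put the DTX figure in perspective, the sketch below compares packet rates while talking and during silence. The 20 ms frame size and the one-frame-per-packet packing are assumptions for illustration; only the 400 ms interval comes from the slide.

```python
# Packet rate with and without DTX, assuming one Opus frame per packet.
FRAME_MS = 20            # assumed frame size; Opus allows 2.5 to 60 ms
DTX_INTERVAL_MS = 400    # from the slide: one frame every 400 ms when DTX is active


def packets_per_second(interval_ms: float) -> float:
    """Packets sent per second when one packet leaves every interval_ms."""
    return 1000 / interval_ms


talking_pps = packets_per_second(FRAME_MS)
silent_pps = packets_per_second(DTX_INTERVAL_MS)
reduction = 1 - silent_pps / talking_pps

print(f"talking: {talking_pps:.1f} pkt/s, silence with DTX: {silent_pps:.1f} pkt/s "
      f"({reduction:.0%} fewer packets)")
```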
OPUS Competitive View
Delay and Rate Coverage
Source: http://www.opus-codec.org/comparison/
P.862.2 Wideband PESQ scores
Figure: P.862.2 WB PESQ, ASTS n=10. Axes: PESQ score (4 to 5) and bit rate (20 to 70 kbps). Codecs compared: Opus @ 25 kbps VBR, AAC-LD @ 64 kbps CBR, G.722 @ 64 kbps CBR.
Comparison of Codec complexity
Cisco Video Endpoints & Opus
• Mandatory in WebRTC
• Mandatory in SPARK
• No Royalties
• Open Source (easy to implement without fuss)
• Key strategy to be Cloud connected
  – One codec for the Cloud
  – Interoperability without transcoding (avoids quality loss and delay)
Why is Opus important to Cisco Collab?
TC/CE Endpoint Opus Operating Mode
Mode    Codec Principle  Typical Bandwidth  Application      Similar as
LP      Modified SILK    Wideband           Voice            G729/iLBC
MDCT    CELT             Fullband           Music            AAC-LD
Hybrid  SILK + CELT      Fullband           Voice+Music Mix
TC/CE Endpoints & Opus Characteristics
• Bit rate: 48 kbps or 24 kbps
• Sampling Rate: 48kHz
• VBR
• Mono
• Frame Size: 10ms
• Variable Complexity for different HW
• CSR 11.0: available on Jabber, SX10, SX20, SX80, MX200G2, MX300G2, MX700 & MX800
• Post-CSR 11.0:
• In multipoint, with TelePresence Server
• Expanded endpoint support with DX, 7800 & 8800 IP phones
Opus & Collaboration System Release 11 ValueExtended
Collaboration Desktop – Opus support
Target the release post 10.2.5 – CE sw
Synergy will NOT support Opus
Collaboration Rooms – Opus support
MX200 G2 – 42’’
SX10
MX300 G2 – 55’’
SX20
MX700 – 55’’
MX800 Dual – 70’’
MX800 – 70’’
SX80
Supported From TC 7.1 + CE8 (MX700 Single/Dual + MX800 Single from TC 7.2 // MX800 Dual from TC 7.3.1)
1. AACLD
2. Opus
3. G722
4. G7221
5. G728
6. G711
7. AACLD (lower bitrate)
8. G729AB
9. G729
10. G729A
11. PCM16
CE Endpoint CapSet
• Opus is on by default but not the highest in the capset
• Command to remove AAC-LD such that Opus becomes the highest priority codec on CE endpoints:
xconfiguration Experimental CapsetFilter: "AAC-LD"
CE Endpoint CapSet
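The effect of the capset filter can be illustrated with a simplified preference walk. This is a sketch only: real SIP/H.323 capability negotiation is more involved, the far-end capability set is hypothetical, and treating the filter as removing both AACLD entries is an assumption.

```python
from typing import Iterable, Optional, Sequence

# Capability set order as listed on the slide (highest preference first).
CE_CAPSET = ["AACLD", "Opus", "G722", "G7221", "G728", "G711",
             "AACLD (lower bitrate)", "G729AB", "G729", "G729A", "PCM16"]


def pick_codec(local_prefs: Sequence[str],
               remote_caps: Iterable[str],
               filtered: Iterable[str] = ()) -> Optional[str]:
    """Return the first local preference the far end supports, skipping filtered codecs."""
    remote = set(remote_caps)
    blocked = set(filtered)
    for codec in local_prefs:
        if codec in remote and codec not in blocked:
            return codec
    return None


remote_caps = {"AACLD", "Opus", "G711"}   # hypothetical far-end capabilities

default_choice = pick_codec(CE_CAPSET, remote_caps)
filtered_choice = pick_codec(CE_CAPSET, remote_caps,
                             filtered={"AACLD", "AACLD (lower bitrate)"})

print(default_choice, filtered_choice)
```

With AAC-LD filtered out, Opus becomes the first match, which is the outcome the command above is meant to achieve.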
CUCM Codec Preference List
This is the new table for UCM 11.0
Planning to put Opus as top default
If Low Lossy is configured   If Lossy is configured
for Link Loss Type           for Link Loss Type
MP4A-LATM 128k               OPUS
AAC-LD (MP4A Generic)        MP4A-LATM 128k
MP4A-LATM 64k                AAC-LD (MP4A Generic)
MP4A-LATM 56k                MP4A-LATM 64k
L16 256k                     MP4A-LATM 56k
MP4A-LATM 48k                L16 256k
OPUS                         MP4A-LATM 48k
G.722 64k                    ISAC 32k
ISAC 32k                     AMR-WB(7k-24k)
MP4A-LATM 32k                MP4A-LATM 32k
AMR-WB(7k-24k)               G.722 64k
G.722.1 32k                  G.722.1 32k
G.722 56k                    G.722 56k
G.722.1 24k                  G.722.1 24k
G.722 48k                    G.722 48k
MP4A-LATM 24k                MP4A-LATM 24k
G.711 U-Law 64k              G.711 U-Law 64k
G.711 A-Law 64k              G.711 A-Law 64k
G.711 U-Law 56k              G.711 U-Law 56k
G.711 A-Law 56k              G.711 A-Law 56k
ILBC 16k                     ILBC 16k
G.728 16k                    G.728 16k
AMR-WB(5k-13k)               AMR-WB(5k-13k)
GSM Enhanced Full Rate 13k   GSM Enhanced Full Rate 13k
GSM Full Rate 13k            GSM Full Rate 13k
G.729 8k                     G.729 8k
G.729a 8k                    G.729a 8k
G.729b 8k                    G.729b 8k
G.729ab 8k                   G.729ab 8k
GSM Half Rate 6k             GSM Half Rate 6k
G.723.1 7k                   G.723.1 7k
Opus – Endpoint Registration Model
Registrar Opus supported SW Version
UCM 11.0
HCS/Huron [available at launch]
SPARK [available from day 1]
VCS/Expressway X8.6
CME (ISR routers)
Opus & Call Setup Protocol
Protocol TC/CE sw
Direct IP Calling
SIP
SPARK
H323
• Royalty free, Open Source Audio Codec from 2012
• Initiated to be a Single Framework for Speech and Music
• Combines the SILK and CELT codec
• Has three Operating Modes covering 5 bands:
  1) LP (SILK & Wideband)
  2) Hybrid (SILK + CELT)
  3) MDCT (CELT & Fullband)
Opus Summary
• Robustness Mechanisms:
  – Forward Error Correction (FEC)
  – Discontinuous Transmission (DTX)
  – Packet Loss Concealment (PLC)
• Several good wideband codecs exist today; Opus is expected to become pervasive due to WebRTC
• TC/CE based Endpoints support Opus
• Cisco wants to support wideband as a minimum; the direction is Opus for the entire collaboration endpoint portfolio
Opus Summary
H.265 / HEVC
H.265
• H.265 is a video compression standard
• HEVC (High Efficiency Video Coding)
• MPEG-H Part 2
• H.264(AVC)’s successor
• Under joint development by Joint Collaborative Team on Video Coding (JCT-VC)
• ISO/IEC Moving Picture Experts Group (MPEG)
• ITU-T Video Coding Experts Group (VCEG)
• Focus on higher resolutions and framerates – mostly >=720p
• Target was approximately 50% bitrate reduction over H.264 at a “reasonable” increase in complexity
H.265 Encoder Complexity
Figure: encoder complexity (1x, 2x, 5x) plotted against bandwidth (100% down to 50%) for H.264 (base profile), H.264 HP (high profile), and H.265 HEVC.
History of Video Compression Standardisation
Year  ITU-T          Neutral name  ISO/IEC
1988  H.261                        MPEG-1
1996  H.262, H.263                 MPEG-2
1998  H.263+                       MPEG-4 Part 2
2000  H.263++
2003  H.264          AVC           MPEG-4 Part 10
2007  H.264 SVC      AVC SVC       MPEG-4 Part 10 SVC
2009  H.264 MVC      AVC MVC       MPEG-4 Part 10 MVC
2013  H.265          HEVC          MPEG-H
2014  H.265 SVC/MVC  HEVC SVC/MVC  MPEG-H MVC/SVC
H.264 and H.265 Profiles
• H.264 – AVC – MPEG-4
• “Family of standards”
• Profiles are “family members”
• Profiles define coding tools and algorithms
• H.264 Profiles
• 2003: 3 profiles included same year as ratification (i.e. Baseline Profile)
• 2004: High Profile (HP)
• 2007: Scalable Video Coding (SVC)
• 2009: 16 profiles
• 2012: 21 profiles
• H.265 – HEVC – MPEG-H
• 2013: Main profile, Main 10 profile, Main still profile
• 2014: 24 additional profiles including 2 scalable profiles and one multi-view profile
• 2015: Screen content coding being finalised
Timeline
2003: H.264 Baseline; ‘05: KTA starts; ‘09: Call for Proposals; ‘10: Work starts; ‘13: V.1; ‘14: V.2; ‘15: V.3
• Version 1: regular 420 video, 8 and 10 bit
• Version 2: 3D Multiview, Range Extensions (high bit depth, 444, 422), Spatial scalability
• V.3: Screen Content Coding being finalised
• Fast track for High Dynamic Range standardisation this year
• New KTA for H.266 being established
H.265 HEVC Encoder Architecture
Encoder architecture
Partition into blocks
Predict blocks from other blocks
Transform into frequencies
Quantize (lossy, like rounding)
Entropy code (lossless, like zip)
Reconstruct as decoder would
Inverse Transform
Combine prediction & residuals
Filter to remove artifacts
Decoder architecture
Entropy decode (like unzip)
Reconstruct as encoder would
Inverse Transform
Combine prediction & residuals
Filter to remove artifacts
Source: “Overview of the HEVC Standard”, IEEE Trans. Cir. Sys. Video Tech., Vol. 22, No. 12, Dec 2012
Diagram blocks: Partition; Predict; Approximate; Entropy code; Reconstruct
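The split between the lossy and lossless stages can be sketched on a single row of samples. This is a toy illustration, not HEVC: it skips the frequency transform, uses zlib as a stand-in for HEVC's CABAC entropy coder, and all sample values and the step size are made up.

```python
import zlib


def quantize(residuals, step):
    """Lossy step: map each residual to the nearest multiple of step (like rounding)."""
    return [round(r / step) for r in residuals]


def dequantize(levels, step):
    """Inverse of quantize, up to the rounding error."""
    return [level * step for level in levels]


def entropy_code(levels):
    """Lossless step (like zip). Levels must fit in a signed byte for this toy packing."""
    return zlib.compress(bytes(level & 0xFF for level in levels))


def entropy_decode(blob):
    """Undo entropy_code exactly, restoring signed values."""
    return [b - 256 if b > 127 else b for b in zlib.decompress(blob)]


block      = [52, 55, 61, 66, 70, 61, 64, 73]   # hypothetical sample values
prediction = [50, 50, 60, 60, 70, 70, 60, 70]   # hypothetical prediction from other blocks
STEP = 4                                         # hypothetical quantizer step size

# Encoder: predict, take the residual, quantize, entropy code.
residuals = [s - p for s, p in zip(block, prediction)]
levels = quantize(residuals, STEP)
bitstream = entropy_code(levels)

# Decoder (the encoder runs the same reconstruction to stay in sync).
decoded_levels = entropy_decode(bitstream)
reconstructed = [p + r for p, r in zip(prediction, dequantize(decoded_levels, STEP))]

print("original:     ", block)
print("reconstructed:", reconstructed)
```

The entropy stage returns the quantized levels unchanged, while the reconstruction differs from the original by at most half a quantizer step per sample.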
• 1D Slices in raster scan order, same as H.264
• 2D Tiles for efficient memory bandwidth and parallel processing, resilience, and random access to regions of interest like faces
• Wavefronts for better compression efficiency and highly parallel processing
Video Compression Fundamentals: Partition
4 slices 9 tiles 4 threads in a wavefront
High-Level Partitions for application needs: parallelism, transport, resilience
Video Compression Fundamentals: Partition
• Low-Level Partitions for coding efficiency to match content feature sizes
Source: “HEVC Complexity and Implementation Analysis”, IEEE Trans. Cir. Sys. Video Tech., Vol. 22, No. 12, Dec 2012
H.264 16x16 Macro Blocks (MB)
Prediction up to 16x16, down to 4x4
Transform up to 8x8, down to 4x4
H.265 64x64 Coding Tree Units (CTU)
Prediction up to 64x64, down to 4x4
Transform up to 32x32, down to 4x4
Recursive quad-tree nesting of blocks
High Compression Efficiency
Larger block sizes
• 64x64 vs. 16x16 prediction (4x)
• 32x32 vs. 8x8 transform (4x)
• Recursive quadtrees vs. static sizes
• 10-20% bitrate savings
Better intra-picture spatial prediction
• 33 vs. 8 directional modes (4x)
• 15-20% bitrate savings
Better inter-picture temporal prediction
• Advanced Motion Vector Prediction (AMVP) with merging
• Higher sub-pixel precision and interpolation filters
Sample Adaptive Offset (SAO) filter for sharper edges, less banding and ringing
How is it achieved?
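The block-size comparison can be made concrete for a 1080p picture. The sketch below counts top-level blocks; the resolution is chosen for illustration. Note that the "4x" on the slide is per side, so the area ratio is 16x.

```python
import math


def blocks_per_picture(width: int, height: int, block: int) -> int:
    """Number of block x block units needed to cover the picture (partial blocks count)."""
    return math.ceil(width / block) * math.ceil(height / block)


WIDTH, HEIGHT = 1920, 1080                            # 1080p, chosen for illustration

macroblocks = blocks_per_picture(WIDTH, HEIGHT, 16)   # H.264 16x16 macroblocks
ctus = blocks_per_picture(WIDTH, HEIGHT, 64)          # H.265 64x64 coding tree units

print(f"H.264 macroblocks: {macroblocks}, H.265 CTUs: {ctus}, "
      f"ratio: {macroblocks / ctus:.0f}x")
```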
1080p30: H.265 @ 1 Mb/s vs. H.264 @ 1 Mb/s
1080p30: H.265 @ 500 kb/s vs. H.264 @ 1 Mb/s
Still Pictures
• H.265 Main Still Picture Profile (intra-picture spatial prediction) outperforms dedicated photo formats as well as prior video standards, in tests on 32 images.
• 43% smaller than JPEG
• 31% smaller than WebP
• 30% smaller than JPEG XR
• 23% smaller than JPEG 2000
• 16% smaller than H.264
Screen Content
• Class F test streams added to evaluate H.265 HEVC performance on screen content
• Coding tools adopted to improve screen content
• Lossless mode is pixel perfect
• Transform skip mode and flag
• Sample Adaptive Offset (SAO) filter to sharpen graphics/text edges
• RGB 4:4:4 colour format support expected in higher fidelity extensions
SlideEditing, 1280x720, 30 Hz
H.265/HEVC: The State of Play
Adoption of HEVC
• Increasingly supported by endpoints
• Widely supported in current generation smartphones via HW acceleration, e.g. iPhone 6
• We are likely to see HEVC via WebRTC
• Licensing is still uncertain
• MPEG-LA have terms (20c per codec), but not much IPR in the pool as yet
• Advance licence pool being formed, fees not yet announced
• Google continuing with VPx development, VP10 not far off
• IETF has launched NetVC codec development
H266 and all that
• Initial work on KTA for H265 began a decade ago
• A new KTA for H266 is likely to start soon
• Some contributions to ITU-T already showing ~10% gain:
• Even larger blocks (256x256)
• An additional loop filter
• Fancier motion modelling and motion data prediction
• Many companies actively researching this space
• A new standard could be in place by 2020
• Codec cycles are shortening
• Targets for UHD, WCG, HDR, 120-300 fps and more immersive experiences
Summary
• H.265 claims to cut BW requirements by 50%
• Improved quality: double the resolution at the bandwidth used today
• Same quality experience at half the network cost
• Things take time
• Will not see this effect immediately – available in 2014, improving in 2015, common by late 2016
• Need new HW platforms – and we are seeing these emerging now
• Encoder optimisation is time consuming
Agenda
Protocols for Media
Protocols for Controls
Protocols for Cloud
Cisco Multistream
Scalable Coding
• Encode a high fidelity source using multiple layers of increasing fidelity
• Main motivation is scalable conference servers
  – Switching vs. transcoding, trading flexibility for scale and speed
• Other benefits include rate adaptation and error resilience
• Drawbacks include interoperability and lower coding efficiency
Base Layer with lowest fidelity 360p 30Hz 0.3Mb/s
Spatial Enhancement Layer to increase resolution 720p 30Hz 1.0Mb/s
Temporal Enhancement Layer to increase frame rate 720p 60Hz 1.5Mb/s
Quality Enhancement Layer to increase bit rate 720p 60Hz 2.0Mb/s
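A switching server can use the layers above for rate adaptation by forwarding only as many layers as a receiver's link allows. The sketch below assumes the bitrates listed are cumulative (each figure includes the layers beneath it); the class and function names are illustrative.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class OperatingPoint:
    name: str
    resolution: str
    fps: int
    mbps: float   # assumed cumulative bitrate up to and including this layer


# Values from the slide, lowest fidelity first.
LAYERS = [
    OperatingPoint("base", "360p", 30, 0.3),
    OperatingPoint("spatial enhancement", "720p", 30, 1.0),
    OperatingPoint("temporal enhancement", "720p", 60, 1.5),
    OperatingPoint("quality enhancement", "720p", 60, 2.0),
]


def highest_point(available_mbps: float) -> Optional[OperatingPoint]:
    """Pick the highest layer whose cumulative bitrate fits the available bandwidth."""
    best = None
    for point in LAYERS:
        if point.mbps > available_mbps:
            break
        best = point
    return best


for link_mbps in (0.2, 0.5, 1.2, 5.0):
    chosen = highest_point(link_mbps)
    print(link_mbps, "Mb/s ->", chosen.name if chosen else "no video")
```

No transcoding is needed: the server simply drops the enhancement layers that do not fit.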
H.264 SVC in the Video Conferencing Industry
H.264 SVC Status and Challenges
- An emerging standard with benefits for balancing quality and bandwidth
- Loosely defined – each vendor has a different SVC implementation
- No backward compatibility - H.264 AVC is the industry norm
- Cisco H.264 SVC interoperability tested with Microsoft Lync 2013
B2B &
Intra-Enterprise
Interoperability?
H.264 SVC In the Industry
• Cisco WebEx has used H.264 SVC video for five years
• Cisco Video Conferencing Codecs (TC Software) all support native H.264 SVC as well as H.264 AVC
• Cisco VCS Control and VCS Expressway, plus the Cisco Expressway series, all support H.264 SVC to AVC gateway functionality
Simulcast SVC (SSVC) UCconfig Mode 0
• Advantages: better interoperability, lower aggregate and downstream bandwidth
• Drawbacks: upstream bandwidth overhead
Diagram labels: 360p video; HD, SD, CIF; Corporate LAN, Remote Office, Wifi Hotspot; Switch (Simulcast SVC); 180p, 360p, 720p
• Media processing – modifying the media packets in a media pipeline (often transcoding to a different codec, resolution, or quality)
• Switching – sending streams through a bridge without media processing
• Multistream – the ability to send and/or receive multiple streams to/from a single participant
• Simulcast – Packing together multiple streams within one pair of RTP/RTCP UDP network ports (both multiple resolutions/qualities and sources)
• Layout Composition – how the participants are presented (or not) when you see them in a meeting
Terminology
Transcoded Media Pipeline: Decode, Scale, Process, Compose, Encode (h264avc)
Real Transcoded Media Pipeline: Decode, Scale, Process, Compose, Encode (h264avc, h264avc, h264avc)
Switched Multistream Pipeline: Inputs, Scale, Process, Compose, Outputs
Hybrid Multistream Pipeline: Inputs or Decode, Scale, Process, Compose, Outputs or Encode
Diagram labels: Low bandwidth; 3rd party/SIP/H323; h264avc
• Gets the scaling benefits of switching
• Supports natively standard SIP/H323, 3rd party, and interop
• Optimizes the user experience without the trade-offs of transcoding and switching
• Handles variance in bandwidth and packet loss
• Supports any device limitation (processing power, camera, screen size, bandwidth)
The Beauty of the Hybrid Media Pipeline
Hybrid Media Pipelines vTS, CMP, and Acano!
Standard SIP
  Switched streams through bridge: N/A
  Multiple streams simulcasted (multiplexed): N/A
  Media-processing possible: Yes
  Layout Composition: on the bridge, always a composited stream
Multistream SIP
  Switched streams through bridge: Yes
  Multiple streams simulcasted (multiplexed): Yes
  Media-processing possible: Yes (if hybrid)
  Layout Composition: locally rendered on the endpoint, or on the bridge
ONLY possible if you own BOTH endpoints and infrastructure...!
Cisco MARI
MARI
Media Adaptation and Resiliency Implementation
Managed vs. Unmanaged Networks: Where do your media packets go?
Diagram labels: Call Control; Remote Sites; Central Site; On-premise UC Services; MPLS VPN; Cloud Services; Managed WAN; Internet; DMVPN; B2B; B2C; Home/Mobile Users; QoS-capable
How do you preserve user
experience when media
traverses the Internet?
Evolution of Collaboration Media Streams
Diagram labels: Multipoint Bridge; Temporal layers; Collaboration data; Multi-device sessions; Active cascading; Simulcast multistreaming; Adaptive video bitrate
Our Strategy
“Smart” Media Techniques
• Use media resilience to reduce impact of packet loss
• Apply rate adaptation to reduce network congestion
QoS Tools
• Consolidate mechanisms to identify Collaboration media
• Evolve classification and scheduling recommendations
Diagram labels: audio queue and video queue on a WAN link with EF, AF41, and AF42 markings; LTRF with Repair-P and FEC exchanges between encoder and decoder
Leverage media resilience and rate adaptation to enable pervasive video deployments through:
• simplified provisioning
• optimized bandwidth utilization
Design & Deployment
AUDIO
Operational bandwidth over time: G.729 (24 kbps) to AAC-LD (160 kbps)
Bandwidth:
– Constant bitrate (smooth)
– Small footprint
– Narrow operational range (1:6)
Loss-sensitive
Delay-sensitive
VIDEO
Operational bandwidth over time: 240p15 (150 kbps) to 1080p60 (6 Mbps)
Bandwidth:
– Variable bitrate (bursty)
– Medium/large footprint
– Wide operational range (1:40)
Loss-sensitive
Delay-sensitive
Video Traffic: Requirements and Profiles
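The operational ranges quoted for audio and video follow directly from the endpoint bitrates on the slide, as this short check shows. The function name is illustrative.

```python
# Lowest and highest bitrates from the slide, in kbps.
AUDIO_KBPS = (24, 160)     # G.729 to AAC-LD
VIDEO_KBPS = (150, 6000)   # 240p15 to 1080p60


def operational_range(low_kbps: float, high_kbps: float) -> float:
    """Ratio between the highest and lowest bitrate a stream may need."""
    return high_kbps / low_kbps


audio_ratio = operational_range(*AUDIO_KBPS)
video_ratio = operational_range(*VIDEO_KBPS)

print(f"audio 1:{audio_ratio:.1f}, video 1:{video_ratio:.0f}")
```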
Video Traffic: Video Encoding Basics
I-Frame: “Intra-coded” picture (frame 1)
– Entire picture encoded as a static image
– No reference to other frames
P-Frame: “Predicted” picture (frame 2)
– Based on a previously encoded frame (frame 1)
– Only the differences from that frame are encoded
P-Frame: “Predicted” picture (frame 3)
– Reference for prediction can be another P-Frame (frame 2)
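The I-frame/P-frame relationship can be sketched with a toy differential coder. Real codecs use motion-compensated prediction on blocks; here a "frame" is just a list of sample values and a P-frame stores only the positions that changed. All names and values are illustrative.

```python
def encode(frames):
    """Encode the first frame as an I-frame and every later frame as a P-frame."""
    encoded, previous = [], None
    for frame in frames:
        if previous is None:
            encoded.append(("I", list(frame)))   # whole picture, no reference
        else:
            changes = {i: v for i, (v, p) in enumerate(zip(frame, previous)) if v != p}
            encoded.append(("P", changes))       # only the differences
        previous = frame
    return encoded


def decode(encoded):
    """Rebuild every frame; each P-frame is applied on top of the previous picture."""
    frames, current = [], None
    for kind, payload in encoded:
        if kind == "I":
            current = list(payload)
        else:
            current = list(current)
            for index, value in payload.items():
                current[index] = value
        frames.append(current)
    return frames


frames = [[1, 1, 1, 1], [1, 2, 1, 1], [1, 2, 3, 1]]   # hypothetical pictures
encoded = encode(frames)
for kind, payload in encoded:
    print(kind, payload)
```

Because frame 3 is predicted from frame 2, losing frame 2 leaves the decoder unable to rebuild frame 3 correctly, which is the dependency the loss-recovery techniques in this section address.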
Video Traffic: Audio vs. Video Packet Distribution
Figure: packet sizes in bytes (200 to 1400) over time for audio packets (audio samples every 20 ms) and for video packets (P-Frame, I-Frame, P-Frame at 33 ms intervals).
Video Traffic: Bandwidth Usage: High-definition Video Call
Figure: bandwidth (kbps, 0 to 3500) over time (s) for an HD video call, 720p30 @ 1920 kbps (1792 kbps video + 128 kbps audio). Video bandwidth shown (including L3 overhead); the peaks are labelled I-Frames.
Video Traffic: Impact of Packet Loss on a Video Stream
Diagram: the encoder sends I1, P1, P2, P3, P4, P5; P3 is lost, the decoder goes Out of Sync (OOS), and a new I1 is sent. Visible effects: frozen video, artifacts, video pulsing.
Loss of a P-frame triggers request for a new I-frame
– Encoding and transmitting large I-frame takes time
– If any of the I-frame packets get lost, the process needs to restart
– I-frame creates burst that risks exacerbating network congestion (more packet loss!)
Flickering/pulsing of video when new I-frame arrives
– Video freeze or artifacts when multiple packets are lost
“Smart” Media Techniques: Goals and Solutions
Goals
• Make network congestion less likely to occur
• Recover more efficiently from packet loss
• Optimize use of available network resources
Mechanisms
• Media Resilience: Encoder Pacing, GDR, LTRF with Repair, FEC
• Rate Adaptation
Media Resilience: Encoder Pacing
Each frame must be packetized onto the wire in 33 ms
Endpoint packet scheduler disperses packets as evenly as possible
Large I-frames may need to be “spread” over 2 or 3 frame intervals
Encoder may then ‘skip’ 1-2 frames to stay within bitrate budget
Figure: packet sizes in bytes (200 to 1400) over time in 33 ms frame intervals for a P-Frame, I-Frame, P-Frame sequence, without and with pacing.
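The pacing rules above can be sketched as a per-frame byte budget. The 1792 kbps video rate at 30 fps is taken from the HD call example earlier in the deck; the I-frame size and the 1200-byte packet payload are hypothetical values chosen for illustration.

```python
import math


def frame_budget_bytes(video_kbps: float, fps: float) -> float:
    """Bytes the encoder may send per frame interval at the target bitrate."""
    return video_kbps * 1000 / 8 / fps


def intervals_needed(frame_bytes: int, budget_bytes: float) -> int:
    """Frame intervals a frame must be spread over to stay within budget."""
    return max(1, math.ceil(frame_bytes / budget_bytes))


def paced_offsets_ms(frame_bytes: int, payload_bytes: int, fps: float, intervals: int):
    """Send times (ms from frame start) that spread the packets evenly over the window."""
    packets = math.ceil(frame_bytes / payload_bytes)
    window_ms = intervals * 1000 / fps
    return [i * window_ms / packets for i in range(packets)]


I_FRAME_BYTES = 20_000                               # hypothetical large I-frame
budget = frame_budget_bytes(1792, 30)                # about 7.5 kB per 33 ms interval
spread = intervals_needed(I_FRAME_BYTES, budget)     # intervals the I-frame occupies
skipped_frames = spread - 1                          # frames skipped to hold the budget
offsets = paced_offsets_ms(I_FRAME_BYTES, 1200, 30, spread)

print(f"budget {budget:.0f} B/frame, spread over {spread} intervals, "
      f"skip {skipped_frames} frames, {len(offsets)} packets "
      f"about {offsets[1]:.1f} ms apart")
```

With these numbers the I-frame occupies three frame intervals and two frames are skipped, consistent with the "2 or 3 frame intervals" and "1-2 frames" on the slide.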
Avoiding Packet Loss: Gradual Decoder Refresh (GDR)
New I-frame causes traffic burst, which in turn can generate congestion
– If one I-frame packet gets dropped, the whole frame needs to be retransmitted
Gradual Decoder Refresh spreads “intra”-encoded picture data over N frames
– GDR frames contain a portion of “intra” macroblocks and a portion of predicted macroblocks
– Once all GDR frames have been received, decoder has fully refreshed the picture
Diagram legend: Encoder; Decoder; predicted portion; “intra”-macroblock portion
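One way to picture GDR is as a schedule that assigns a slice of the picture to be intra-coded in each frame. The sketch below refreshes macroblock columns left to right; the column-wise sweep, the 80-column picture (1280 pixels / 16), and N = 10 are assumptions for illustration, since the refresh pattern is an encoder choice.

```python
import math


def gdr_schedule(mb_columns: int, n_frames: int):
    """For each GDR frame, list the macroblock columns that are intra-coded.

    The remaining columns in that frame are predicted as usual.
    """
    per_frame = math.ceil(mb_columns / n_frames)
    return [
        list(range(k * per_frame, min((k + 1) * per_frame, mb_columns)))
        for k in range(n_frames)
    ]


schedule = gdr_schedule(mb_columns=80, n_frames=10)
refreshed = sorted(col for frame in schedule for col in frame)

for number, columns in enumerate(schedule, start=1):
    print(f"GDR frame {number}: intra columns {columns[0]}..{columns[-1]}")
```

After the last GDR frame every column has been intra-coded once, so the decoder has a fully refreshed picture without a single large I-frame burst.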
Media Resilience: Long Term Reference Frame (LTRF) with Repair
Keep encoder and decoder in sync with active feedback messages
– Encoder instructs decoder to store raw frames at specific sync points as Long-Term Reference Frames (part of H.264 standard)
– Decoder uses “back channel” (i.e. RTCP) to acknowledge LTRFs
When a frame is lost, encoder creates a “Repair” P-frame based on the last synchronised LTRF instead of generating a new I-frame
Diagram: the encoder sends LTRF1, P1, P2, P3, P4, P5 and the decoder returns ACK LTRF1; P3 is lost and the decoder reports OOS (P4).
– Long-Term Reference Frame (not actually sent on the wire)
– Repair P-Frame: built from last sync’ed LTRF
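The repair step can be sketched with the same toy model of frames as lists of samples. This is an illustration of the idea only, not the H.264 mechanism: `diff` and `patch` are hypothetical helpers, and the RTCP feedback messages are represented by comments.

```python
def diff(frame, reference):
    """Positions where frame differs from reference (a toy P-frame payload)."""
    return {i: v for i, (v, r) in enumerate(zip(frame, reference)) if v != r}


def patch(reference, delta):
    """Apply a toy P-frame payload on top of a reference picture."""
    picture = list(reference)
    for index, value in delta.items():
        picture[index] = value
    return picture


frames = [
    [0, 0, 0, 0, 0, 0],   # stored by both sides as LTRF1
    [1, 0, 0, 0, 0, 0],   # P1
    [1, 2, 0, 0, 0, 0],   # P2
    [1, 2, 3, 0, 0, 0],   # P3 (lost on the network)
    [1, 2, 3, 4, 0, 0],   # P4
]

ltrf = frames[0]           # decoder has acknowledged LTRF1 over the back channel
decoder_picture = ltrf
for n in (1, 2):           # P1 and P2 arrive and are applied normally
    decoder_picture = patch(decoder_picture, diff(frames[n], frames[n - 1]))

# P3 is lost, so P4 (predicted from P3) cannot be applied: decoder signals OOS.
# The encoder answers with a repair P-frame predicted from the acknowledged LTRF.
repair = diff(frames[4], ltrf)
decoder_picture = patch(ltrf, repair)

print("repair payload:", repair)
print("decoder back in sync:", decoder_picture == frames[4])
```

The repair payload covers only what changed since the LTRF, so it stays smaller than a full I-frame while still bringing the decoder back in sync.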
Media Resilience: Forward Error Correction (FEC)
Allows decoder to recover from limited amount of packet loss without losing synchronization
Can be applied at different levels (x FEC packets every N data packets) to protect “important” frames in lossy environments
Correction code can be basic (binary XOR) or more advanced (Reed-Solomon)
Trade-off is bandwidth increase; best suited for non-bursty loss
Diagram: encoder and decoder each apply a binary XOR over packets R1, R2, and FEC, shown alongside the LTRF and Repair-P frames.
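The basic binary XOR scheme can be sketched in a few lines: one parity packet protects a group of N data packets and lets the receiver rebuild any single lost packet in that group. This is a minimal illustration with equal-length payloads; real RTP FEC formats also protect packet lengths and headers, which this sketch omits.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(packets):
    """Build one FEC packet as the XOR of all data packets (zero-padded to equal length)."""
    size = max(len(p) for p in packets)
    return reduce(xor_bytes, (p.ljust(size, b"\0") for p in packets))


def recover(received, parity):
    """Rebuild the single missing packet (marked None) from the others plus the parity.

    The result is zero-padded to the parity length.
    """
    size = len(parity)
    present = (p.ljust(size, b"\0") for p in received if p is not None)
    return reduce(xor_bytes, present, parity)


packets = [b"R1-data!", b"R2-data!", b"R3-data!"]    # hypothetical payloads
fec = make_parity(packets)                           # 1 FEC packet every 3 data packets

arrived = [packets[0], None, packets[2]]             # R2 lost in transit
rebuilt = recover(arrived, fec)
print(rebuilt)
```

If two packets of the same group are lost, this scheme cannot recover either of them, which is why XOR parity suits non-bursty loss.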
Rate Adaptation: Key Idea
Receiver observes delay and packet loss over periods of time and signals back using RTCP Receiver Reports (RR)
Reports cause the sender to adjust bitrate so as to adapt to network conditions (downspeeding, upspeeding)
Two approaches possible:
– Sender-initiated adjustment based on RTCP Receiver Reports
– Receiver-initiated adjustment via call signaling (H.323 flow control, TMMBR, SIP Re-invite) or explicit request in RTCP message
Diagram: sender and receiver exchange RTCP reports (RR 1, RR 2, RR 3) as packet loss rises between t1 and t2; the sender is told to SLOW DOWN and lowers the video bitrate.
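A sender-initiated controller can be sketched as a function from the reported loss fraction to a new target bitrate. The thresholds and step factors below are hypothetical, not Cisco's algorithm; the 150 kbps to 6000 kbps limits reuse the video operational range quoted earlier in the deck.

```python
def adapt_bitrate(current_kbps: float, loss_fraction: float, *,
                  min_kbps: float = 150, max_kbps: float = 6000,
                  loss_high: float = 0.05, loss_low: float = 0.01,
                  down: float = 0.8, up: float = 1.05) -> float:
    """Return the next target bitrate given the loss fraction from a Receiver Report.

    All thresholds and factors are illustrative assumptions.
    """
    if loss_fraction > loss_high:
        target = current_kbps * down      # downspeed: the network looks congested
    elif loss_fraction < loss_low:
        target = current_kbps * up        # upspeed: probe gently for more bandwidth
    else:
        target = current_kbps             # hold steady inside the dead band
    return max(min_kbps, min(max_kbps, target))


bitrate = 2000.0
for report_loss in (0.00, 0.10, 0.12, 0.03, 0.00):   # hypothetical RR sequence
    bitrate = adapt_bitrate(bitrate, report_loss)
    print(f"loss {report_loss:.0%} -> {bitrate:.0f} kbps")
```

Cutting quickly and recovering slowly is a common design choice for congestion response, because it gives queues time to drain before the sender probes again.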
“Smart” Media Techniques: Support in Cisco Collaboration Devices
Columns: Endpoint / Bridge; Encoder Pacing; Rate Adaptation; FEC; LTRF Repair
89xx, 99xx future future --
DX future future
WebEx future
TX/IX future
Jabber
C/EX/MX/SX/Profile
TS (3.1) (3.1)
MCU (4.5) (4.5)
ClearPath
“Smart” Media Techniques
• Burstiness of traffic and mobility of the endpoints make deterministic provisioning for interactive video difficult for network administrators
• Media resilience mechanisms help mitigate impact of video traffic on the network and impact of network impairments on video
• Dynamic rate adaptation creates an opportunity for more flexible provisioning models for interactive video in Enterprise networks
• Media resilience and rate adaptation also help preserve user experience when video traffic traverses the Internet or non-QoS-enabled networks
Key Takeaways
Current QoS Approaches: Classification and Scheduling Considerations
Same DSCP for audio and video streams of a video call
– During congestion, audio and video streams are equally impacted
Different DSCPs for audio streams in video calls vs. voice calls
– Media stream identification difficult for multi-media mobile clients
Different queues for immersive/room system video and desktop video
– Complex provisioning, sub-optimal bandwidth usage
Diagram: on the WAN link, audio of voice calls is marked EF in the PQ; audio and video of Telepresence are marked CS4; audio and video of desktop video are marked AF41 in CBWFQ; other queues and a policer are also shown.
QoS Tools: Evolution of Classification Recommendations
Figure: previous vs. new DSCP markings for the audio and video streams of four device groups: Telepresence (CTS, TX, EX, C, MX, Profile, SX), voice phones, software/mobile (Jabber clients), and desktop video (99xx, 89xx, DX).
Markings shown: EF, CS4, and AF41 (previous); EF, AF41, and AF42 (new).
Traffic classes shown: VoIP Telephony; Real-Time Interactive; Multimedia Conferencing; “Opportunistic” Multimedia Conferencing.
QoS Tools: Evolution of Queuing Recommendations
Previous: audio of voice calls in the PQ (EF); audio and video of Telepresence in a CBWFQ or PQ class with a policer (CS4); audio and video of desktop video in a CBWFQ class (AF41); other queues on the WAN link.
New: audio of IP phones, audio of video endpoints, and audio of Jabber in the PQ (EF); video of video endpoints (AF41) and video of Jabber (AF42) in a single CBWFQ video queue, with WRED thresholds set to drop AF42 first and AF41 last; other queues on the WAN link.
Diagram label: BW assigned to LLQ classes
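For reference, the DSCP names used in these recommendations map to numeric code points as sketched below. The code point values follow the standard DiffServ definitions; the device and stream keys in the marking table are illustrative labels for the "new" recommendation, not configuration syntax.

```python
# Standard DiffServ code points (decimal) for the classes named on the slides.
DSCP = {"EF": 46, "AF41": 34, "AF42": 36, "CS4": 32}


def af_dscp(af_class: int, drop_precedence: int) -> int:
    """DSCP value of AFxy: class x in the upper bits, drop precedence y below it."""
    return 8 * af_class + 2 * drop_precedence


def tos_byte(dscp: int) -> int:
    """DSCP occupies the upper six bits of the IP ToS / Traffic Class byte."""
    return dscp << 2


# "New" marking recommendation, keyed by (device group, stream); keys are illustrative.
NEW_MARKING = {
    ("ip_phone", "audio"): "EF",
    ("video_endpoint", "audio"): "EF",
    ("video_endpoint", "video"): "AF41",   # prioritized video
    ("jabber", "audio"): "EF",
    ("jabber", "video"): "AF42",           # "opportunistic" video, dropped first by WRED
}

for (device, stream), name in NEW_MARKING.items():
    print(f"{device:15s} {stream:5s} {name:4s} DSCP {DSCP[name]:2d} ToS 0x{tos_byte(DSCP[name]):02X}")
```

AF41 and AF42 share a class and differ only in drop precedence, which is what lets a single queue with WRED thresholds drop AF42 before AF41.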
Custom QoS Settings for SIP Devices (new in CUCM 11.0)
UC Video endpoints, applicable DSCP settings:
• DSCP for Audio Calls
• DSCP for Video Calls
• DSCP for Audio Portion of Video Calls
TelePresence endpoints, applicable DSCP settings:
• DSCP for Audio Calls
• DSCP for TelePresence Calls
• DSCP for Audio Portion of TelePresence Calls
Unified CM QoS service parameters, Clusterwide Parameters (System – QoS):
• DSCP for Audio Calls
• DSCP for Video Calls
• DSCP for Audio Portion of Video Calls
• DSCP for TelePresence Calls
• DSCP for Audio Portion of TelePresence Calls
• …
Diagram: Unified CM delivers these settings to devices in the config file at registration, from the QoS service parameters or from the device SIP Profile (new in 11.0).
Custom QoS Settings For SIP Devices
SIP Profile (Defaults Modified for Example)
New
(CUCM 11.0)
Custom QoS Settings for SIP Devices (new in CUCM 11.0)
Diagram: two SIP Profiles (SIP Profile 1 and SIP Profile 2) applied across TelePresence endpoints, Jabber clients, and desktop video endpoints to separate Prioritized Video (AF41) from “Opportunistic” Video (AF42).
Network Integration – SDN: Dynamic Policy Management for Untrusted Devices (e.g., Jabber Clients)
Diagram labels: CUCM; Jabber Client; Traffic Queuing Application; Dynamic Policy Management; Cisco® APIC Enterprise Module (EM)
See BRKCOL-2616, “Enabling Quality of Service with Cisco SDN (2016 Melbourne)”, Thursday Mar. 10th at 12:50pm
Agenda
Protocols for Media
Protocols for Controls
Protocols for Cloud
WebRTC
About WebRTC
• What is WebRTC:
• WebRTC is an API definition being drafted by the World Wide Web Consortium (W3C)
• It is a free, open project that enables web browsers with Real-Time Communications (RTC) capabilities via simple JavaScript APIs
• What is the merit of WebRTC:
• WebRTC enables applications such as voice calling, video chat and P2P file sharing inside the browsers without plugins (or separate clients)
Interactive Voice and Video in your Browser Today...
But...
• Proprietary – no interoperability
• Requires 3rd party plugins
• Difficult to deploy (permissions, etc...)
• Not available on all platforms
Different Browsers, Different Plugins
• NPAPI – Netscape Plugin API
  • A cross-platform browser plugin architecture in Chrome, Firefox, and Safari
• ActiveX
  • A browser plugin architecture created by Microsoft based on its COM (Component Object Model) and OLE (Object Linking and Embedding) technologies
  • Internet Explorer
Browser Plugin Technologies stem from developments in the mid-nineties
“Today’s browsers are speedier, safer, and more capable than their ancestors. Meanwhile, NPAPI’s 90s-era architecture has become a leading cause of hangs, crashes, security incidents, and code complexity. Because of this, Chrome will be phasing out NPAPI support over the coming year.”
http://blog.chromium.org/2013/09/saying-goodbye-to-our-old-friend-npapi.html
And Mobile Browsers Are Not Extensible
• Native mobile apps are required
Key Features
• Media Stream:
  • WebRTC can carry a media source containing one or more synchronised Media Stream Tracks
  • Media should be converted to URL to be played by HTML5
• Get User Media: for capturing video and audio from webcam and microphone
• Peer Connection: easy, high-quality peer-to-peer audio/video calls
  • Peer-to-peer
  • Codec Control
  • Encryption
  • Bandwidth Management
• Data Channels:
  • p2p application data transfer (not supported by any browser yet)
WebRTC Standards
Standards Efforts
• RTCWeb Working Group (IETF)
  ‒ Cullen Jennings of Cisco is co-chair
• Defining how browsers communicate with others … largely re-using existing protocols
• Notable documents …
  ‒ draft-ietf-rtcweb-audio
  ‒ draft-ietf-rtcweb-data-channel
  ‒ draft-ietf-rtcweb-jsep
  ‒ draft-ietf-rtcweb-overview
  ‒ draft-ietf-rtcweb-qos
  ‒ draft-ietf-rtcweb-rtp-usage
  ‒ draft-ietf-rtcweb-security-arch
  ‒ draft-ietf-rtcweb-use-cases-and-requirements
• WebRTC Working Group (W3C)
  ‒ Cullen Jennings co-authors RTCWeb draft
  ‒ Keith Griffin co-authors Screen Share draft
• Defining how Web applications access browser real-time communications, i.e. APIs
• Notable documents …
  ‒ WebRTC 1.0: Real-time Communication Between Browsers
  ‒ Media Capture and Streams
  ‒ Media Capture Scenarios
WebRTC Native Browser Architecture
Collaboration Apps
WebRTC JavaScript API
WebRTC Native API (C++)
Session Management (SDP)
Voice Engine: Voice Codecs; Noise Reduction; Echo Cancellation
Video Engine: Video Codec; Jitter Buffer; Image Enhancements
Transport: Encryption / Security; Multiplexing; Connectivity (ICE, STUN, TURN); Packetization
Adapted from WebRTC architecture diagram
WebRTC Video Codec MTI Debate
• MTI = Mandatory to Implement
• Google proposed VP8 codec
• Other industry players proposed H.264
• 2 year standoff
• November 2014 decision – BOTH codecs are MTI
Project Thor
“a Project to Hammer Out a Royalty Free Video Codec”
Focused on developing next generation media formats, codecs and technologies in the public interest.
Founding members:
Cisco (Thor), Amazon, Netflix, Microsoft, Intel, Mozilla (Daala), Google (VP9)
http://aomedia.org
Open. Fast. Royalty-free.
Standards Technology Progress
CONVERGING
• Audio Codecs.. G.711, Opus
• Signaling: SDP-based offer/answer using JavaScript
• Firewall/NAT Traversal … ICE, STUN, TURN
• Media Encryption: DTLS-keyed SRTP
• Media Consent: ICE/STUN
• Identity: Identity Provider Model
• QoS … DiffServ Code Point markings to enhance WiFi, residential GWs, LTE links
• Both video codecs, VP8 and H.264, are supported; IETF decision made and implementation underway
‒ http://www.ietf.org/mail-archive/web/rtcweb/current/msg13432.html
‒ OpenH264
Working
• Cisco Announced Free OpenH264 Project
• Mozilla Firefox using OpenH264
• Chrome H.264 implementation underway
• Congestion Control …
‒ Goals = minimize latency, quick reaction, consistent data flow
• Screen/Application Sharing
Browser Support 2015 (source: iswebrtcreadyyet.com)
Browser Support 2016 (source: iswebrtcreadyyet.com)
Browser Market Share
http://en.wikipedia.org/wiki/Usage_share_of_web_browsers
Source           Chrome  Internet Explorer  Firefox  Safari  Opera  Others
Wikimedia        48.1%   17.5%              16.7%    4.8%    1.5%   11.4%†
W3Counter        42.5%   17.6%              15.6%    14.6%   3.2%   6.5%
StatCounter      49.7%   24.6%              18.0%    4.7%    1.5%   1.6%
NetApplications  22.7%   59.1%              11.9%    5.0%    0.9%   0.4%
Usage share of PC browsers for December 2014
Browser to Non-Browser Endpoint: High-level Real-time Communications Architecture
Diagram labels: Web Server; Web App via HTTP/HTTPS (e.g. HTML, CSS, JavaScript); Voice, Video via SRTP; SIP Proxy; GW to SIP; SIP
Jabber Guest Solution
Expressway
Core/ VCS -C
Expressway
Edge/ VCS -E
HTTP-based call control (ROAP)
SIP
RTP/SRTP
STUN/TURN
Jabber Guest …
• Serves up JavaScript call control based on URL
• For mobile, uses Cisco® app from app store or integrates it into a third-party app
• For laptop browsers, initiates H.264 plugin install as needed for Cisco or third-party web app
• Converts HTTP call request to SIP INVITE
Home Internet DMZ Enterprise
Jabber® Guest Cisco UCM
What can we really do with this technology?
REM Solution Architecture
Diagram zones: Home, Wi-Fi or 4G; Internet; DMZ; Enterprise
Diagram components: Mobiles; Browsers; Web & Mobile Apps (CSDK); Reverse Proxy (RP); Remote Expert Mobile Application Server (REAS); Remote Expert Mobile Media Broker (REMB); CUBE-E; Cisco Unified CM Cluster; Cisco Unified Contact Centre (UCCX or P/UCCE); Cisco Finesse; Enterprise Application Server; EPs
Diagram links: HTTPS/WSS; HTTPS; HTTP/S; HTTP; SIP; SIP / SIP TLS; RTP; DTLS / sRTP; CTI/Data; Media (Voice/Video)
WebRTC is Real
Summary for WebRTC
• WebRTC can change the way we communicate in browsers and on mobile and fixed endpoints
• Standards and industry direction continue to evolve
• Emerging interoperable proof points
• Enabling real product development as browsers adopt
• Progress and adoption are good, but there is much more to do
• Dependency on browser adoption and more
Closing Thought…
Q & A
Thank you