Some interesting directions in network coding
Muriel Médard
Electrical Engineering and Computer Science Department
Massachusetts Institute of Technology
Collaborators
MIT LIDS: Minji Kim, Minkyu Kim, Anna Lee, Devavrat Shah, Jay-Kumar Sundararajan
MIT CSAIL: Varun Aggarwal, Wenjun Hu, David Karger, Dina Katabi, Sachin Katti, Ben Leong, Una-May O’Reilly, Hariharan Rahul
MIT Broad Institute: Desmond Lun (previously LIDS)
Technical University of Munich: Ralf Koetter (previously UIUC)
University of Illinois Urbana-Champaign: Danail Traskov
California Institute of Technology: Michelle Effros, Tracey Ho (previously MIT LIDS, UIUC, Lucent)
Ecole Polytechnique Federale Lausanne (Switzerland): Christina Fragouli
Digital Fountain: Payam Pakzad (previously EPFL)
Samsung Advanced Institute of Technology: Chang Wook Ahn
BBN: Karen Haigh, Paul Rubel
Qualcomm: Niranjan Ratnakar (previously UIUC)
Overview
- Basic overview of network coding
- Network coding for erasures
- Limited network coding
- Network coding in multi-source multicast
- Network coding beyond multicast
Increasing functionality of network coding
Network coding
[Figure: butterfly network with source s, intermediate nodes t, u, w, x and sinks y, z; bits b1 and b2 enter at s]
Canonical example [Ahlswede et al. 00]
What choices can we make?
No longer distinct flows, but information
Network coding
[Figure: butterfly network in which the shared link out of w carries only b1]
Picking a single bit does not work
Time sharing does not work
No longer distinct flows, but information
Network coding
[Figure: butterfly network in which the shared link out of w carries b1 + b2]
Need to use algebraic nature of data
No longer distinct flows, but information
Must we consider the optimization of codes and network usage jointly?
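The butterfly example can be traced in a few lines. This is a minimal sketch, assuming the usual butterfly topology (s feeds t and u, both feed w, w feeds x, and x feeds the sinks y and z) and reading "+" as addition over GF(2), i.e. XOR. The function name is illustrative only.

```python
def butterfly(b1: int, b2: int) -> dict:
    """Trace bits b1, b2 from source s to sinks y and z with coding at w."""
    # s sends b1 to t and b2 to u.
    at_t, at_u = b1, b2
    # t forwards b1 to y and w; u forwards b2 to z and w.
    # w adds its two inputs over GF(2) and sends the sum to x.
    coded = at_t ^ at_u
    # x forwards the coded bit to both sinks.
    # y already has b1 and recovers b2; z already has b2 and recovers b1.
    y = (at_t, at_t ^ coded)
    z = (at_u ^ coded, at_u)
    return {"y": y, "z": z}


# Both sinks recover both bits for every input, using each link once.
for b1 in (0, 1):
    for b2 in (0, 1):
        out = butterfly(b1, b2)
        assert out["y"] == (b1, b2) and out["z"] == (b1, b2)
```

Forwarding only b1 or only b2 on the link out of w would leave one sink with a single bit, which is the failure the previous slide points out.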
Randomized network coding- multicast
To recover symbols at the receivers, we require sufficient degrees of freedom – an invertible matrix in the coefficients of all nodes
The realization of the determinant of the matrix will be non-zero with high probability if the coefficients are chosen independently and randomly
Probability of success over field F ≈ 1 − 1/|F|
Randomized network coding can use
- any multicast subgraph which satisfies the min-cut max-flow bound [Ho et al. 03]
- any number of sources, even when correlated [Ho et al. 04]
[Figure: node j with endogenous inputs Y(h→j) and Y(i→j) and exogenous input X(j); the output Y(j→m) is a linear combination of these inputs]
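The effect of field size on decodability can be checked empirically. The sketch below is a simplification and not the network analysis of [Ho et al. 03]: it assumes a receiver that collects k packets whose coefficient vectors are drawn uniformly at random from a prime field GF(p), and estimates how often the resulting k-by-k matrix is invertible. The function names and the choice of prime fields are assumptions made for illustration.

```python
import random


def rank_mod_p(rows, p):
    """Rank of a matrix over the prime field GF(p), by Gaussian elimination."""
    m = [list(r) for r in rows]
    rank = 0
    n_cols = len(m[0]) if m else 0
    for col in range(n_cols):
        # Find a row at or below `rank` with a nonzero entry in this column.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] % p), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = pow(m[rank][col], -1, p)  # modular inverse, Python 3.8+
        m[rank] = [(v * inv) % p for v in m[rank]]
        for r in range(len(m)):
            if r != rank and m[r][col] % p:
                f = m[r][col]
                m[r] = [(a - f * b) % p for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def decodable_fraction(k, p, trials, seed=0):
    """Fraction of random k-by-k coefficient matrices over GF(p) that are invertible."""
    rng = random.Random(seed)
    ok = 0
    for _ in range(trials):
        coeffs = [[rng.randrange(p) for _ in range(k)] for _ in range(k)]
        ok += rank_mod_p(coeffs, p) == k
    return ok / trials
```

Calling `decodable_fraction(4, 2, 300)` and `decodable_fraction(4, 257, 300)` shows the trend: over GF(2) most random matrices are singular, while over a field of a few hundred elements nearly all are invertible.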
Erasure reliability – single flow
With ε_AB and ε_BC the erasure probabilities on links AB and BC:
End-to-end erasure coding: capacity is (1 − ε_AB)(1 − ε_BC) packets per unit time.
As two separate channels: capacity is min(1 − ε_AB, 1 − ε_BC) packets per unit time.
- Can use block erasure coding on each channel. But delay is a problem.
Network coding: minimum cut is capacity
- For erasures, correlated or not, we can in the multicast case deal with average flows uniquely [Lun et al. 04, 05], [Dana et al. 04]:
- Nodes store received packets in memory
- Random linear combinations of memory contents sent out
- Delay expressions generalize Jackson networks to the innovative packets
- Can be used in a rateless fashion
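The gap between the two rates for the A to B to C chain can be illustrated numerically. This is a minimal sketch under stated assumptions: independent erasures with probabilities `e_ab` and `e_bc`, one transmission per link per time slot, and an idealized node B for which every buffered packet counts as one degree of freedom still missing at C. All names are hypothetical.

```python
import random


def end_to_end_rate(e_ab, e_bc):
    # B only forwards: a packet must survive both links.
    return (1 - e_ab) * (1 - e_bc)


def per_link_rate(e_ab, e_bc):
    # B stores and re-encodes: the weaker link is the bottleneck (min cut).
    return min(1 - e_ab, 1 - e_bc)


def simulate_store_and_recode(e_ab, e_bc, slots, seed=0):
    """Degrees of freedom delivered to C per slot when B buffers what it hears."""
    rng = random.Random(seed)
    at_b = 0  # degrees of freedom held at B that C does not have yet
    at_c = 0  # degrees of freedom delivered to C
    for _ in range(slots):
        # B transmits whenever it holds something new for C.
        if at_b > 0 and rng.random() > e_bc:
            at_b -= 1
            at_c += 1
        # A transmits a fresh packet in every slot.
        if rng.random() > e_ab:
            at_b += 1
    return at_c / slots
```

For example, with `e_ab = 0.2` and `e_bc = 0.3` the forwarding rate is 0.56 while the min-cut rate is 0.7, and the simulated throughput approaches the latter as the number of slots grows.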
Feedback for reliability
Parameters we consider:
• delay incurred at B: excess time, relative to the theoretical minimum, that it takes for k packets to be communicated, disregarding any delay due to the use of the feedback channel
• block size
• feedback: number of feedback packets used (feedback rate Rf = number of feedback messages / number of received packets)
• memory requirement at B
• achievable rate from A to C
Feedback for reliability
Follow the approach of Pakzad et al. 05, Lun et al. 06
Scheme V allows us to achieve the min-cut rate, while keeping the average memory requirements at node B finite
Note that the feedback delay for Scheme V is smaller than the usual ARQ (with Rf = 1) by a factor of Rf
Feedback is required only on link BC
Fragouli et al. 07
Interesting directions
Practical code design:
- Using small generation sizes may reduce the throughput and erasure-correcting benefits of mixing information packets
- Large generation sizes may incur unacceptable decoding delay at the receivers
- Can we consider issues of delay, memory and feedback overhead for interesting code designs?
- How do we take these issues into account when we use multicast rather than single flow approaches?
Parameter adaptation for delay-sensitive applications:
- Feedback from the receivers to the source can be used to adjust adaptively the generation size and maximize the number of packets successfully decoded within the delay specifications.
- The source response to this type of feedback is similar to TCP windows
- Can we build an entire TCP-style suite for single network coded flows?
Errors – see Ralf’s talk!
Difficulty of not allowing coding everywhere:
- Finding a minimal set of coding nodes or links is NP-hard
- Finding multicast codes when some nodes are not able to code is difficult
We associate a binary variable with each coefficient at a merging node:
- 0 indicates the associated coefficient is zeroed,
- 1 indicates the associated coefficient remains indeterminate.
For each assignment of binary values to the variables, we can verify the achievability of the target rate R and determine whether coding is required.
[Figure: example with elements numbered 1 to 6 and the binary assignment 0 1 0 1 1 0]
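The binary-variable idea can be tried on the smallest interesting case. The sketch below is an illustration, not the algorithm from the talk: it hard-codes the butterfly network, where the only merging node w has two incoming coefficients, and checks each assignment by evaluating the sinks' 2-by-2 transfer matrices at random nonzero values over a prime field. The field size `P` and the function names are assumptions.

```python
import random
from itertools import product

P = 257  # prime field size, chosen only for illustration


def det2(m):
    """Determinant of a 2-by-2 matrix over GF(P)."""
    return (m[0][0] * m[1][1] - m[0][1] * m[1][0]) % P


def achieves_rate_2(bits, trials=20, seed=0):
    """bits = (g1, g2): 0 zeroes that input coefficient at merging node w,
    1 leaves it indeterminate (evaluated here at a random nonzero value)."""
    rng = random.Random(seed)
    for _ in range(trials):
        a1 = rng.randrange(1, P) if bits[0] else 0
        a2 = rng.randrange(1, P) if bits[1] else 0
        # Sink y sees b1 directly plus the mix a1*b1 + a2*b2;
        # sink z sees b2 directly plus the same mix.
        y = [[1, 0], [a1, a2]]
        z = [[0, 1], [a1, a2]]
        if det2(y) and det2(z):
            return True
    return False


for bits in product((0, 1), repeat=2):
    feasible = achieves_rate_2(bits)
    # Coding happens at w only if more than one input coefficient is kept.
    coding_needed = feasible and sum(bits) > 1
    print(bits, feasible, coding_needed)
```

On this network only the assignment (1, 1) reaches rate 2, so coding at w cannot be avoided. On larger networks the same test is what a search over assignments would call as its feasibility check.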
Limited network coding with multicast
- LATA-X and ISP 1755 (Ebone) from Rocketfuel Project
- Randomly generated connected acyclic directed graphs with (20 nodes, 12 sinks, rate 4) and (40 nodes, 12 sinks, rate 3)
- Minimal 1: greedy approach [Fragouli Soljanin 06]
- Minimal 2: greedy approach [Langberg et al. 05]
              LATA-X        ISP 1755      (20,12,4)     (40,12,3)
              Best  Avg     Best  Avg     Best  Avg     Best  Avg
Proposed GA   0     0.35    0     0.25    0     1.20    0     1.05
Minimal 1     0     0.90    0     1.05    0     1.35    1     1.85
Minimal 2     0     1.10    0     0.80    0     1.85    0     1.90
Ratio         1     0.39    1     0.31    1     0.89    0     0.57
Performance of genetic algorithms
Decentralized operation
- Populations can be managed locally
- Cross-overs and mutations can be managed locally also
Some coordination is required for
- Fitness value calculation: feedback can be done efficiently
- Selection and pairing of chromosomes: can be calculated at the source and transmitted with the data on renewal of each generation.
[Figure: a population of binary chromosomes, e.g. 1 0 0 1 1 1 0 1 1 0 1 1 0 1, partitioned into segments that are managed locally at the nodes]
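Locally managed cross-over and mutation could look like the sketch below. It is an assumption-laden illustration: the slides do not say which genetic operators are used, so single-point cross-over and independent bit-flip mutation are swapped in here, and each chromosome is represented as a list of per-node segments. All names are hypothetical.

```python
import random


def crossover(parent_a, parent_b, rng):
    """Single-point cross-over of two equal-length bit lists (length >= 2)."""
    point = rng.randrange(1, len(parent_a))
    child_1 = parent_a[:point] + parent_b[point:]
    child_2 = parent_b[:point] + parent_a[point:]
    return child_1, child_2


def mutate(chromosome, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [bit ^ (rng.random() < rate) for bit in chromosome]


def local_crossover(segments_a, segments_b, rng):
    """Apply cross-over segment by segment, as each node could do on its own
    part of the two paired chromosomes without seeing the other segments."""
    pairs = [crossover(sa, sb, rng) for sa, sb in zip(segments_a, segments_b)]
    return [p[0] for p in pairs], [p[1] for p in pairs]
```

In such a scheme the nodes only need to agree on which chromosomes are paired, which matches the coordination the slide assigns to the source.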
Interesting problems
Interaction of coding and non-coding nodes:
- Should they just co-exist or cooperate?
- What algorithms can solve a joint routing/coding problem (in effect a constrained multicast network coding problem)?
Coding as a resource:
- Can we determine how to place our coding resources in the network?
- Should we turn coding on as needed?
Network coding – source coding – cooperation confluence
Network coding and distributed compression are intimately linked [Ho et al. 04] – we may envisage
- Network coding for correlated sources can make use of naturally occurring correlation
- Designing sources with correlation rather than straightforward replication as is done currently in mirrors
Coding and decoding melds erasure coding, multicast coding and compression
Rather than consider only shedding redundancy in networks, network coding points to using it and designing it intelligently
Optimization for multicast network coding
[Figure: network from a source to the sinks, with links labeled by vectors such as (1,1,0), (1,0,1) and (1,1,1)]
Index on receivers rather than on processes [Lun et al. 04]
Steiner-tree problem can be seen to be this problem with extra integrality constraints
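One common way to write an optimization indexed on receivers, in the spirit of [Lun et al. 04], is the linear program below. The notation is assumed here rather than taken from the slide: A is the set of links, N the set of nodes, T the set of sinks, s the source, R the multicast rate, a_ij and c_ij the cost and capacity of link (i,j), x_ij^(t) the flow toward sink t, and z_ij the coded rate actually placed on the link.

```latex
\begin{aligned}
\text{minimize}\quad & \sum_{(i,j)\in A} a_{ij}\, z_{ij} \\
\text{subject to}\quad
 & z_{ij} \ge x^{(t)}_{ij}, && \forall (i,j)\in A,\ t\in T, \\
 & \sum_{j:(i,j)\in A} x^{(t)}_{ij} - \sum_{j:(j,i)\in A} x^{(t)}_{ji} = \sigma^{(t)}_{i}, && \forall i\in N,\ t\in T, \\
 & 0 \le x^{(t)}_{ij} \le c_{ij}, && \forall (i,j)\in A,\ t\in T,
\end{aligned}
\qquad
\sigma^{(t)}_{i} =
\begin{cases}
 R & \text{if } i = s, \\
 -R & \text{if } i = t, \\
 0 & \text{otherwise.}
\end{cases}
```

Because z_ij only has to cover the largest per-sink flow rather than the sum of the flows, sinks share a link at no extra cost. Requiring the flows to be integral turns the program into the Steiner-tree problem.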
Joint versus separate coding
for each link (R = 3)
Joint (cost 9) Separate (cost 10.5)
[Lee et al. 07]
Interesting directions
Making use of the joint coding:
- Complexity goes up with the number of sources
- How much better does this perform than doing Slepian-Wolf first, followed by routing or network coding?
- How dependent is the design on knowing actual correlation parameters?
Practical code design for such schemes:
- Achievability comes from random code construction, uses minimum-entropy decoding
- Can we use the practical techniques that have yielded good results in Slepian-Wolf in this type of network coding?
Generalize mirror site design:
- Do not copy a whole site, but just certain portions
- How does this affect the storage in and operation of networks?
Going beyond multicast
- Can create algebraic setting for linear non-multicast connections [Koetter Medard 02, 03]
- In the non-multicast case, linear codes do not suffice [Dougherty et al. 05]
Limited code approaches: ability to use XOR
- Opportunistic XORs that are undone immediately (COPE) [Katabi et al. 05, 06]
- End-to-end XOR codes on 2 flows [Traskov et al. 06] using cycle approaches
- These approaches outperform routing by trivially subsuming it
- Generalizations to codes including more flows, intermediate decoding points or codes beyond XORs can be envisaged
- A plethora of elaborations can be developed, leading to increased complexity with further benefits – trade-off unclear
[Figure: net throughput (KB/s) versus number of flows (1 to 21), "No Coding" compared with "Our Scheme"]
[Figure: two nodes send packets a and b to a relay, which broadcasts a+b back to both]
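An opportunistic XOR that is undone immediately can be sketched for a two-way exchange through a relay. This is a toy illustration and not the COPE implementation: the endpoint names Alice and Bob, the equal-length packets, and the function name are all assumptions.

```python
def relay_exchange(a: bytes, b: bytes):
    """Alice holds a and wants b; Bob holds b and wants a; both reach only the relay."""
    assert len(a) == len(b), "sketch assumes equal-length packets"
    # Transmissions 1 and 2: Alice -> relay and Bob -> relay.
    # Transmission 3: the relay broadcasts a XOR b once.
    coded = bytes(x ^ y for x, y in zip(a, b))
    # Each endpoint undoes the XOR with the packet it already holds.
    alice_gets = bytes(x ^ y for x, y in zip(coded, a))
    bob_gets = bytes(x ^ y for x, y in zip(coded, b))
    # Plain forwarding would need 4 transmissions; this uses 3.
    return alice_gets, bob_gets, 3
```

For instance, `relay_exchange(b"abc", b"xyz")` returns `(b"xyz", b"abc", 3)`: each side recovers the other's packet from a single broadcast.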
A principled optimization approach to match or outperform routing
- An optimization that yields a solution that is no worse than multicommodity flow
- The optimization is in effect a relaxation of multicommodity flow – akin to Steiner tree relaxation for the multicast case
- A solution of the problem implies the existence of a network code to accommodate the arbitrary demands – the types of codes subsume routing
All decoding is performed at the receivers
We can provide an optimization, with a linear code construction, that is guaranteed to perform as well as routing [Lun et al. 04]
Optimization gives a set partition of {1, ..., M} that represents the sources that can be mixed (combined linearly) on links going into j
Demands of {1, ..., M} at t
Optimization for arbitrary demands with decoding at receivers
Coding and optimization
Sinks that receive a source process in C by way of link (j, i) either receive all the source processes in C or none at all
Hence source processes in C can be mixed on link (j, i) as the sinks that receive the mixture will also receive the source processes (or mixtures thereof) necessary for decoding
We step through the nodes in topological order, examining the outgoing links and defining global coding vectors on them (akin to [Jaggi et al. 03])
We can build the code over an ever-expanding front
We can go to coding over time by considering several flows for the different times – we let the coding delay be arbitrarily large
The optimization and the coding are done separately as for the multicast case, but the coding is not distributed
Fix the code approach – conflict hypergraph
There may be occasions when we are not willing to go to infinite code lengths, or the types of codes may be pre-determined in our network, with different codes at different nodes
In that case, we can adopt a conflict hypergraph representation of the effects of coding and allowable rate regions together
Recent development for considering intrinsic multicast in switches [Sundararajan et al. 04] and special fabrics [Caramanis et al. 04]
Provides a systematic approach to representing the capacity region of a coded system for arbitrary codes
Vertices:
- Define one vertex for each possible “composition of information” on every link
- The composition of information on a link is the net transfer function from the source messages to the symbol sent on the link
Edges:
- In a valid code, more than one vertex cannot be chosen corresponding to each link
- If the composition on an outgoing link at a node is incompatible with a set of incoming input compositions, then the corresponding vertices are connected by a hyperedge
Natural extension of switching approaches in networks
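The two rules for vertices and edges can be turned into a small validity check. This is a minimal sketch with a made-up encoding: vertices are strings of the form "link:composition", `link_of` maps each vertex to its link, and each hyperedge is a set of vertices that cannot all be chosen together. None of these names come from the talk.

```python
def is_valid_code(chosen, link_of, hyperedges):
    """Return True if the chosen vertices form a conflict-free selection.

    chosen     : set of vertex ids
    link_of    : dict mapping vertex id -> link id
    hyperedges : iterable of frozensets of vertex ids
    """
    links = [link_of[v] for v in chosen]
    if len(links) != len(set(links)):
        return False  # two compositions picked for the same link
    # A hyperedge is violated only when every one of its vertices is chosen.
    return not any(edge <= chosen for edge in hyperedges)


# Toy node with inputs in1, in2 and output out, carrying messages a and b.
link_of = {"in1:a": "in1", "in2:a": "in2", "in2:b": "in2", "out:a+b": "out"}
# The output a+b cannot be formed if both inputs carry only a.
hyperedges = [frozenset({"in1:a", "in2:a", "out:a+b"})]

print(is_valid_code({"in1:a", "in2:b", "out:a+b"}, link_of, hyperedges))
print(is_valid_code({"in1:a", "in2:a", "out:a+b"}, link_of, hyperedges))
```

The first selection is accepted and the second is rejected, since it contains the whole hyperedge. Selecting both "in2:a" and "in2:b" would also be rejected, by the one-vertex-per-link rule.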
Interesting directions
Design of codes:
- How far should we go?
- What are the advantages and disadvantages of fixing the lengths and fields ahead of time?
- Should we be looking at non-linear codes?
- Can we find some distributed approaches?
Performance evaluation:
- Can we use properties of certain conflict graphs to obtain capacity regions?
- Can we generalize the optimization approach, for instance when certain nodes can do intermediate decoding?
Interesting problems