Some interesting directions in network coding

Muriel Médard

Electrical Engineering and Computer Science Department

Massachusetts Institute of Technology

Collaborators

MIT LIDS: Minji Kim, Minkyu Kim, Anna Lee, Devavrat Shah, Jay-Kumar Sundararajan

MIT CSAIL: Varun Aggarwal, Wenjun Hu, David Karger, Dina Katabi, Sachin Katti, Ben Leong, Una-May O’Reilly, Hariharan Rahul

MIT Broad Institute: Desmond Lun (previously LIDS)

Technical University of Munich: Ralf Koetter (previously UIUC)

University of Illinois Urbana-Champaign: Danail Traskov

California Institute of Technology: Michelle Effros, Tracey Ho (previously MIT LIDS, UIUC, Lucent)

Ecole Polytechnique Federale Lausanne (Switzerland): Christina Fragouli

Digital Fountain: Payam Pakzad (previously EPFL)

Samsung Advanced Institute of Technology: Chang Wook Ahn

BBN: Karen Haigh, Paul Rubel

Qualcomm: Niranjan Ratnakar (previously UIUC)

Overview

- Basic overview of network coding
- Network coding for erasures
- Limited network coding
- Network coding in multi-source multicast
- Network coding beyond multicast

Increasing functionality of network coding

Network coding

[Figure: network with nodes s, t, u, w, x, y, z; links carry bits b1 and b2]

Canonical example [Ahlswede et al. 00]

What choices can we make?

No longer distinct flows, but information

Network coding

[Figure: same network, with only b1 sent over the shared link]

Picking a single bit does not work

Time sharing does not work


Network coding

[Figure: same network, with b1 + b2 sent over the shared link]

Need to use algebraic nature of data
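The coded solution on the canonical example can be checked in a few lines. The following is a minimal sketch, assuming the usual butterfly layout (s feeds t and u, both feed w, w feeds x, and the sinks y and z each have one direct link and one link from x); the function name `butterfly` is introduced here for illustration only.

```python
def butterfly(b1: int, b2: int) -> dict:
    """Push two source bits through the network with a single XOR at w."""
    # Assumed layout: s sends b1 to t and b2 to u; each forwards its bit on both out-links.
    t_out = b1
    u_out = b2
    # w hears b1 and b2 but has one unit-capacity link to x, so it sends their sum.
    w_out = t_out ^ u_out            # b1 + b2 over GF(2)
    x_out = w_out                    # x forwards the mixture to both sinks
    # Sink y holds b1 and b1 + b2; sink z holds b2 and b1 + b2.
    y = (t_out, t_out ^ x_out)       # recovers (b1, b2)
    z = (u_out ^ x_out, u_out)       # recovers (b1, b2)
    return {"y": y, "z": z}


# Both sinks recover both bits for every input, which neither
# "pick a single bit" nor time sharing achieves in one use of the network.
for b1 in (0, 1):
    for b2 in (0, 1):
        out = butterfly(b1, b2)
        assert out["y"] == (b1, b2) and out["z"] == (b1, b2)
```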


Must we consider the optimization of codes and network usage jointly?

Randomized network coding: multicast

To recover symbols at the receivers, we require sufficient degrees of freedom – an invertible matrix in the coefficients of all nodes

The realization of the determinant of the matrix will be non-zero with high probability if the coefficients are chosen independently and randomly

Probability of success over field F ≈ $1 - 1/|F|$

Randomized network coding can use:
- any multicast subgraph which satisfies min-cut max-flow bound [Ho et al. 03]
- any number of sources, even when correlated [Ho et al. 04]

[Figure: a node j forming its output as a linear combination of its endogenous inputs (Y) and its exogenous input (X)]
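The dependence of decodability on field size can be illustrated numerically. The sketch below makes a simplifying assumption that is not from the slides: the k x k matrix seen by a receiver is modeled as uniformly random over a prime field GF(p), whereas in a real network its entries are products and sums of local coefficients. The helper names are hypothetical.

```python
import random


def rank_mod_p(rows, p):
    """Rank of a matrix over the prime field GF(p) by Gaussian elimination."""
    m = [r[:] for r in rows]
    rank = 0
    for col in range(len(m[0])):
        pivot = next((r for r in range(rank, len(m)) if m[r][col] % p), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = pow(m[rank][col], -1, p)          # modular inverse (Python 3.8+)
        m[rank] = [(v * inv) % p for v in m[rank]]
        for r in range(len(m)):
            if r != rank and m[r][col] % p:
                f = m[r][col]
                m[r] = [(a - f * b) % p for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def decode_success_rate(k, p, trials, seed=0):
    """Fraction of random k x k coefficient matrices that are invertible."""
    rng = random.Random(seed)
    ok = 0
    for _ in range(trials):
        mat = [[rng.randrange(p) for _ in range(k)] for _ in range(k)]
        ok += rank_mod_p(mat, p) == k
    return ok / trials


def exact_invertible_probability(k, p):
    """Closed form for a uniform matrix: product over i = 1..k of (1 - p^-i)."""
    prob = 1.0
    for i in range(1, k + 1):
        prob *= 1 - p ** (-i)
    return prob


for p in (2, 17, 257):
    print(p, decode_success_rate(4, p, 2000), exact_invertible_probability(4, p))
```

Under this model the leading term of the closed form is 1 - 1/p, so the failure probability shrinks roughly in proportion to the inverse of the field size.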

Erasure reliability – single flow

End-to-end erasure coding: Capacity is $(1-\varepsilon_{AB})(1-\varepsilon_{BC})$ packets per unit time, where $\varepsilon_{AB}$ and $\varepsilon_{BC}$ are the erasure probabilities on links AB and BC.

As two separate channels: Capacity is $\min(1-\varepsilon_{AB},\, 1-\varepsilon_{BC})$ packets per unit time.
- Can use block erasure coding on each channel. But delay is a problem.

Network coding: minimum cut is capacity
- For erasures, correlated or not, we can in the multicast case deal with average flows uniquely [Lun et al. 04, 05], [Dana et al. 04]:
- Nodes store received packets in memory
- Random linear combinations of memory contents sent out
- Delay expressions generalize Jackson networks to the innovative packets
- Can be used in a rateless fashion
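The gap between the two capacities can be seen in a small simulation of the two-link line A, B, C. This is a sketch under stated assumptions rather than the scheme of the cited papers: the field is taken to be large enough that any combination sent by B is innovative at C whenever B holds more degrees of freedom than C, and all function names are hypothetical.

```python
import random


def end_to_end_capacity(eps_ab, eps_bc):
    """Rate when B only forwards and coding is done end to end."""
    return (1 - eps_ab) * (1 - eps_bc)


def min_cut_capacity(eps_ab, eps_bc):
    """Rate when B stores packets and sends fresh combinations."""
    return min(1 - eps_ab, 1 - eps_bc)


def simulate_recoding(eps_ab, eps_bc, slots, seed=0):
    """Track degrees of freedom held at B and C, one transmission per link per slot."""
    rng = random.Random(seed)
    rank_b = 0   # innovative packets stored at B
    rank_c = 0   # innovative packets received by C
    for _ in range(slots):
        # B sends a random combination of its memory; idealized as innovative
        # whenever B knew more than C at the start of the slot.
        b_can_help = rank_b > rank_c
        if rng.random() >= eps_ab:
            rank_b += 1
        if b_can_help and rng.random() >= eps_bc:
            rank_c += 1
    return rank_c / slots


rate = simulate_recoding(0.2, 0.3, 200_000)
print(rate, min_cut_capacity(0.2, 0.3), end_to_end_capacity(0.2, 0.3))
```

With erasure probabilities 0.2 and 0.3 the simulated rate settles near the min-cut value of 0.7, well above the end-to-end value of 0.56.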

Feedback for reliability

Parameters we consider:
• delay incurred at B: excess time, relative to the theoretical minimum, that it takes for k packets to be communicated, disregarding any delay due to the use of the feedback channel
• block size
• feedback: number of feedback packets used (feedback rate Rf = number of feedback messages / number of received packets)
• memory requirement at B
• achievable rate from A to C

Feedback for reliability

Follow the approach of Pakzad et al. 05, Lun et al. 06

Scheme V allows us to achieve the min-cut rate, while keeping the average memory requirements at node B finite

Note that the feedback delay for Scheme V is smaller than the usual ARQ (with Rf = 1) by a factor of Rf

feedback is required only on link BC

Fragouli et al. 07

Interesting directions

Practical code design:
- Using small generation sizes may reduce the throughput and erasure-correcting benefits of mixing information packets
- Large generation sizes may incur unacceptable decoding delay at the receivers
- Can we consider issues of delay, memory and feedback overhead for interesting code designs?
- How do we take these issues into account when we use multicast rather than single flow approaches?

Parameter adaptation for delay-sensitive applications:
- Feedback from the receivers to the source can be used to adjust adaptively the generation size and maximize the number of packets successfully decoded within the delay specifications.
- The source response to this type of feedback is similar to TCP windows
- Can we build an entire TCP-style suite for single network coded flows?

Errors – see Ralf’s talk!

Difficulty of not allowing coding everywhere:
- Finding a minimal set of coding nodes or links is NP-hard
- Finding multicast codes when some nodes are not able to code is difficult

We associate a binary variable with each coefficient at a merging node:
- 0 indicates the associated coefficient is zeroed,
- 1 indicates the associated coefficient remains indeterminate.

For each assignment of binary values to the variables, we can verify the achievability of the target rate R and determine whether coding is required.

[Figure: example network with nodes numbered 1 to 6 and the binary assignment 0 1 0 1 1 0]
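The "is coding required" half of that check can be sketched directly from the bit vector. The block layout below (three outgoing links, two incoming coefficients each) is a hypothetical reading of the assignment 0 1 0 1 1 0, and the rule that an outgoing link codes when at least two of its coefficients stay indeterminate is an assumption made here for illustration. The achievability test for the target rate R needs an algebraic or flow computation and is left out.

```python
from itertools import product


def coding_links(assignment, block_sizes):
    """Flag each outgoing link whose block still mixes two or more inputs."""
    flags, pos = [], 0
    for size in block_sizes:
        block = assignment[pos:pos + size]
        flags.append(sum(block) >= 2)   # assumed rule: >= 2 live coefficients means coding
        pos += size
    return flags


# Hypothetical layout: one block of two coefficients per outgoing link.
block_sizes = [2, 2, 2]
example = (0, 1, 0, 1, 1, 0)
print(coding_links(example, block_sizes))   # [False, False, False]

# Enumerate every assignment and count those that need no coding at all.
no_coding = sum(
    1
    for a in product((0, 1), repeat=sum(block_sizes))
    if not any(coding_links(a, block_sizes))
)
print(no_coding)   # 27 of the 64 assignments
```

The search space grows as 2 to the number of coefficients, which is why an exhaustive loop like this one only works for toy examples.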

Limited network coding with multicast

- LATA-X and ISP 1755 (Ebone) from Rocketfuel Project
- Randomly generated connected acyclic directed graphs with (20 nodes, 12 sinks, rate 4) and (40 nodes, 12 sinks, rate 3)
- Minimal 1: greedy approach [Fragouli Soljanin 06]
- Minimal 2: greedy approach [Langberg et al. 05]

               LATA-X        ISP 1755      (20,12,4)     (40,12,3)
               Best   Avg    Best   Avg    Best   Avg    Best   Avg
Proposed GA    0      0.35   0      0.25   0      1.20   0      1.05
Minimal 1      0      0.90   0      1.05   0      1.35   1      1.85
Minimal 2      0      1.10   0      0.80   0      1.85   0      1.90
Ratio          1      0.39   1      0.31   1      0.89   0      0.57

Performance of genetic algorithms

Decentralized operation
- Populations can be managed locally
- Cross-overs and mutations can be managed locally also
- Some coordination is required for
  - Fitness value calculation - feedback can be done efficiently
  - Selection and pairing of chromosomes - can be calculated at the source and transmitted with the data on renewal of each generation.
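One generation at a single node might look like the sketch below. It assumes that each node holds only its own block of every chromosome, that the pairing list arrives from the source with the data, and that crossover swaps whole blocks with probability 1/2; these operators and the name `node_step` are illustrative choices, not taken from the slides.

```python
import random


def node_step(blocks, pairing, mutation_rate, rng):
    """Apply crossover and mutation to this node's blocks only.

    blocks[c] is this node's slice of chromosome c; pairing is a list of
    (c1, c2) chromosome indices chosen at the source.
    """
    new_blocks = list(blocks)
    for c1, c2 in pairing:
        a, b = blocks[c1], blocks[c2]
        if rng.random() < 0.5:            # block-wise crossover: swap the whole block
            a, b = b, a
        # Bit-flip mutation, decided independently for every bit.
        new_blocks[c1] = [bit ^ (rng.random() < mutation_rate) for bit in a]
        new_blocks[c2] = [bit ^ (rng.random() < mutation_rate) for bit in b]
    return new_blocks


rng = random.Random(1)
local_blocks = [[0, 0, 1], [1, 1, 1], [0, 1, 0], [1, 0, 0]]
print(node_step(local_blocks, [(0, 1), (2, 3)], 0.05, rng))
```

No node needs another node's blocks to run this step; only fitness values and the pairing list have to be shared.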

[Figure: a population of binary chromosomes, e.g. 1 0 0 1 1 1 0 1 1 0 1 1 0 1, each split into blocks (here 1 0 0 1 1 1 and 0 1 1 0 1 1 0 1)]

Interesting problems

Interaction of coding and non-coding nodes:
- Should they just co-exist or cooperate?
- What algorithms can solve a joint routing/coding problem (in effect a constrained multicast network coding problem)?

Coding as a resource:
- Can we determine how to place our coding resources in the network?
- Should we turn coding on as needed?

Network coding - source coding - cooperation confluence

Network coding and distributed compression are intimately linked [Ho et al. 04]. We may envisage:
- Network coding for correlated sources can make use of naturally occurring correlation
- Designing sources with correlation rather than straightforward replication as is done currently in mirrors

Coding and decoding melds erasure coding, multicast coding and compression

Rather than consider only shedding redundancy in networks, network coding points to using it and designing it intelligently

Optimization for multicast network coding

[Figure: network from source to sinks with each link labeled by a vector: (1,1,0), (1,0,1) or (1,1,1)]

Index on receivers rather than on processes [Lun et al. 04]

Steiner-tree problem can be seen to be this problem with extra integrality constraints
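Indexing on receivers can be made concrete with a short calculation. The sketch assumes the standard reading of this formulation: each link carries one flow per sink, and the coded usage of the link is the maximum of those flows rather than their sum. The butterfly link names and unit costs are illustrative assumptions.

```python
def coded_link_usage(flows):
    """flows[link] holds one flow value per sink; coded usage is their maximum."""
    return {link: max(per_sink) for link, per_sink in flows.items()}


# Per-sink unit flows (sink y, sink z) on an assumed butterfly, rate 2 to each sink.
flows = {
    ("s", "t"): (1, 1), ("s", "u"): (1, 1),
    ("t", "y"): (1, 0), ("t", "w"): (0, 1),
    ("u", "w"): (1, 0), ("u", "z"): (0, 1),
    ("w", "x"): (1, 1),
    ("x", "y"): (1, 0), ("x", "z"): (0, 1),
}

usage = coded_link_usage(flows)
coded_cost = sum(usage.values())                               # unit cost per link
separate_cost = sum(sum(per_sink) for per_sink in flows.values())
print(coded_cost, separate_cost)                               # 9 12
```

Treating the two sinks' flows as separate commodities would need 12 units, while sharing through the maximum needs 9 and fits within unit link capacities.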

Joint versus separate coding

for each link (R = 3)

Joint (cost 9) Separate (cost 10.5)

[Lee et al. 07]


Making use of the joint coding:
- Complexity goes up with the number of sources
- How much better does this perform than doing Slepian-Wolf first, followed by routing or network coding?
- How dependent is the design on knowing actual correlation parameters?

Practical code design for such schemes:
- Achievability comes from random code construction, uses minimum-entropy decoding
- Can we use the practical techniques that have yielded good results in Slepian-Wolf in this type of network coding?

Generalize mirror site design:
- Do not copy a whole site, but just certain portions
- How does this affect the storage in and operation of networks?

Going beyond multicast

- Can create algebraic setting for linear non-multicast connections [Koetter Medard 02, 03]
- In the non-multicast case, linear codes do not suffice [Dougherty et al. 05]
- Limited code approaches: ability to use XOR
  - Opportunistic XORs that are undone immediately (COPE) [Katabi et al. 05, 06]
  - End-to-end XOR codes on 2 flows [Traskov et al. 06] using cycle approaches
- These approaches outperform routing by trivially subsuming it
- Generalizations to codes including more flows, intermediate decoding points or codes beyond XORs can be envisaged
- A plethora of elaborations can be developed, leading to increased complexity with further benefits – trade-off unclear

[Figure: net throughput (KB/s, 0 to 1800) versus number of flows (1 to 21), comparing "No Coding" with "Our Scheme"; alongside, an animation in which packets a and b are exchanged and combined as a+b]

A principled optimization approach to match or outperform routing

- An optimization that yields a solution that is no worse than multicommodity flow
- The optimization is in effect a relaxation of multicommodity flow – akin to Steiner tree relaxation for the multicast case
- A solution of the problem implies the existence of a network code to accommodate the arbitrary demands – the types of codes subsume routing
- All decoding is performed at the receivers
- We can provide an optimization, with a linear code construction, that is guaranteed to perform as well as routing [Lun et al. 04]

Optimization gives a set partition of {1, . . . , M} that represents the sources that can be mixed (combined linearly) on links going into j

Demands of {1, . . . , M} at t

Optimization for arbitrary demands with decoding at receivers

Coding and optimization

Sinks that receive a source process in C by way of link (j, i) either receive all the source processes in C or none at all

Hence source processes in C can be mixed on link (j, i) as the sinks that receive the mixture will also receive the source processes (or mixtures thereof) necessary for decoding

We step through the nodes in topological order, examining the outgoing links and defining global coding vectors on them (akin to [Jaggi et al. 03])

We can build the code over an ever-expanding front

We can go to coding over time by considering several flows for the different times – we let the coding delay be arbitrarily large

The optimization and the coding are done separately as for the multicast case, but the coding is not distributed
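The topological pass over the nodes can be sketched as follows. This shows only the ordering and the propagation of global coding vectors; it assumes an acyclic graph, all-ones local coefficients over GF(2), and a butterfly edge list chosen for illustration, and it does not implement the set-partition constraints described above.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def global_coding_vectors(edges, source_vectors, p=2):
    """Visit nodes in topological order and define a vector on each outgoing link."""
    preds = {}
    for tail, head in edges:
        preds.setdefault(tail, set())
        preds.setdefault(head, set()).add(tail)
    vectors = dict(source_vectors)
    for node in TopologicalSorter(preds).static_order():
        incoming = [vectors[e] for e in edges if e[1] == node]
        if not incoming:
            continue                      # source node: its vectors were given
        # Assumed local code: add all incoming global vectors modulo p.
        mixed = tuple(sum(col) % p for col in zip(*incoming))
        for e in edges:
            if e[0] == node:
                vectors[e] = mixed
    return vectors


EDGES = [("s", "t"), ("s", "u"), ("t", "y"), ("t", "w"), ("u", "w"),
         ("u", "z"), ("w", "x"), ("x", "y"), ("x", "z")]
vectors = global_coding_vectors(EDGES, {("s", "t"): (1, 0), ("s", "u"): (0, 1)})
print(vectors[("w", "x")])   # (1, 1)
```

Because every node is visited after all of its predecessors, each outgoing vector depends only on vectors that are already fixed, which is the "ever-expanding front" in the next lines.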

Fix the code approach – conflict hypergraph

- There may be occasions when we are not willing to go to infinite code lengths, or the types of codes may be pre-determined in our network, with different codes at different nodes
- In that case, we can adopt a conflict hypergraph representation of the effects of coding and allowable rate regions together
- Recent development for considering intrinsic multicast in switches [Sundararajan et al. 04] and special fabrics [Caramanis et al. 04]
- Provides a systematic approach of representing the capacity region of a coded system for arbitrary codes

Vertices:
- Define one vertex for each possible “composition of information” on every link
- The composition of information on a link is the net transfer function from the source messages to the symbol sent on the link

Edges:
- In a valid code, more than one vertex cannot be chosen corresponding to each link
- If the composition on an outgoing link at a node is incompatible with a set of incoming input compositions, then the corresponding vertices are connected by a hyperedge

Natural extension of switching approaches in networks
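The two validity rules translate into a short check on a chosen set of vertices. The sketch represents a vertex as a (link, composition) pair and a hyperedge as a frozenset of vertices; the three-link example and the string labels for compositions are hypothetical.

```python
def is_valid_code(chosen, hyperedges):
    """True if at most one vertex is chosen per link and no hyperedge is fully chosen."""
    links = [link for link, _ in chosen]
    if len(links) != len(set(links)):
        return False                      # two compositions picked for the same link
    return not any(edge <= chosen for edge in hyperedges)


# Hypothetical node with inputs e1, e2 and output e3.
# Sending a+b on e3 is incompatible with both inputs carrying only a.
hyperedges = [frozenset({("e1", "a"), ("e2", "a"), ("e3", "a+b")})]

good = {("e1", "a"), ("e2", "b"), ("e3", "a+b")}
bad = {("e1", "a"), ("e2", "a"), ("e3", "a+b")}
clash = {("e1", "a"), ("e1", "b"), ("e3", "a+b")}

print(is_valid_code(good, hyperedges))    # True
print(is_valid_code(bad, hyperedges))     # False
print(is_valid_code(clash, hyperedges))   # False
```

A valid code is then a stable set of the hypergraph that also respects the one-vertex-per-link rule.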


Design of codes:
- How far should we go?
- What are the advantages and disadvantages of fixing the lengths and fields ahead of time?
- Should we be looking at non-linear codes?
- Can we find some distributed approaches?

Performance evaluation:
- Can we use properties of certain conflict graphs to obtain capacity regions?
- Can we generalize the optimization approach, for instance when certain nodes can do intermediate decoding?

Interesting problems
