Data Persistence in Sensor Networks: Towards Optimal
Encoding for Data Recovery in Partial Network Failures
Abhinav Kamra, Jon Feldman, Vishal Misra and Dan Rubenstein
DNA Research Group, Columbia University
Motivation and Model
Typical Scenario of Sensor Networks
• Large number of nodes deployed to ``sense'' the environment
• Data collected periodically, pulled/pushed through a sink/gateway node
• Nodes prone to failure (disaster, battery life, targeted attack)
• Want data to survive individual node failures: ``Data Persistence''
Overview
• Erasure codes
• LT-Codes and the Soliton distribution
• Coding for failure-prone sensor networks
• Major results
• A brief sketch of proofs
• A case study of failure-prone sensor networks
Erasure Codes
[Figure: block diagram of an erasure code. A message of n blocks is fed to the encoding algorithm, which outputs cn encoded blocks. After transmission, the decoding algorithm recovers the original n-block message from any ≥ n received blocks.]
Luby Transform Codes
• Simple linear codes
• Improvement over “Tornado codes”
• Rateless codes
Erasure Codes: LT-Codes
F = b1 b2 b3 b4 b5 (n = 5 input blocks)
LT-Codes: Encoding
[Figure: input blocks F = b1 b2 b3 b4 b5; first encoded symbol c1 of E(F).]
1. Pick degree d1 from a pre-specified distribution. (d1 = 2)
2. Select d1 input blocks uniformly at random. (Pick b1 and b4)
3. Compute their sum (XOR).
4. Output the sum together with the component block IDs.
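The four encoding steps above can be sketched in Python. This is a minimal illustration, not the authors' code: block contents are modeled as integers so XOR is the `^` operator, and the degree distribution passed in is a hypothetical placeholder.

```python
import random

def lt_encode_symbol(blocks, degree_dist, rng=random):
    """Generate one LT encoded symbol from the input blocks.

    degree_dist is a list of (degree, probability) pairs.
    Returns the XOR of the chosen blocks and their IDs.
    """
    # 1. Pick a degree d from the pre-specified distribution.
    degrees, probs = zip(*degree_dist)
    d = rng.choices(degrees, weights=probs, k=1)[0]
    # 2. Select d input blocks uniformly at random (without replacement).
    ids = rng.sample(range(len(blocks)), d)
    # 3./4. Output their XOR together with the component block IDs.
    value = 0
    for i in ids:
        value ^= blocks[i]
    return value, ids

blocks = [0b00110, 0b10100, 0b01111, 0b11000, 0b00001]  # b1 .. b5
symbol, ids = lt_encode_symbol(blocks, [(1, 0.2), (2, 0.5), (3, 0.3)])
```

Repeating the call produces as many encoded symbols as desired, which is what makes the code rateless.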
LT-Codes: Encoding
[Figure: repeating the procedure yields the encoding E(F) = c1 c2 c3 c4 c5 c6 c7 from the input blocks F = b1 b2 b3 b4 b5.]
LT-Codes: Decoding
[Figure: decoding E(F) = c1 … c7 back to F = b1 b2 b3 b4 b5, step by step. A degree-1 encoded symbol directly reveals its input block (here b5). That block is then XORed out of every remaining symbol that contains it, e.g. c3 ← c3 ⊕ b5 and c4 ← c4 ⊕ b5, lowering those symbols' degrees. Newly exposed degree-1 symbols reveal further blocks (here b2), and the process repeats.]
Degree Distribution for LT-Codes
Soliton Distribution:
  π_1 = 1/N
  π_i = 1/(i(i−1)) for 1 < i ≤ N
• Average degree H(N) ≈ ln(N)
• In expectation: exactly one degree-1 symbol in each round of decoding
• Distribution very fragile in practice
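The Soliton distribution's two headline properties can be checked numerically. This sketch (my own, using exact rational arithmetic) confirms that the probabilities sum to 1 and that the average degree equals the harmonic number H(N):

```python
from fractions import Fraction

def ideal_soliton(N):
    # pi_1 = 1/N; pi_i = 1/(i*(i-1)) for 1 < i <= N
    pi = {1: Fraction(1, N)}
    for i in range(2, N + 1):
        pi[i] = Fraction(1, i * (i - 1))
    return pi

N = 128
pi = ideal_soliton(N)
# The 1/(i(i-1)) terms telescope to 1 - 1/N, so the total is exactly 1.
total = sum(pi.values())
# Average degree: 1/N + sum of 1/(i-1) for 2 <= i <= N, i.e. H(N) ~ ln(N).
avg_degree = sum(i * p for i, p in pi.items())
```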
Failure-prone Sensor Networks
All earlier works:
• How many encoded symbols are needed to recover all original symbols (all-or-nothing decoding)
Failure-prone networks:
• How many original symbols can be recovered from the given surviving encoded symbols
Iterative Decoder
[Figure: four received encoded symbols, each a combination of the originals x1 … x5 (one being x4 alone), and the recovered symbols x1, x3, x4.]
• 5 original symbols x1 … x5
• 4 encoded symbols received
• Each encoded symbol is the XOR of its component original symbols
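The iterative decoder above can be sketched in Python. This is a minimal illustration of the peeling process, not the paper's code: each encoded symbol is a (component-ID set, XOR value) pair over integer block values, and the example inputs are hypothetical.

```python
def peel_decode(received):
    """Iteratively recover original blocks from encoded symbols.

    received: list of (ids, value) pairs, where ids is the set of
    component block IDs and value is the XOR of those blocks.
    Returns {block_id: value} for every recoverable block.
    """
    symbols = [[set(ids), value] for ids, value in received]
    recovered = {}
    progress = True
    while progress:
        progress = False
        # A degree-1 symbol directly reveals its single component block.
        for ids, value in symbols:
            if len(ids) == 1:
                (b,) = ids
                if b not in recovered:
                    recovered[b] = value
                    progress = True
        # XOR every recovered block out of the higher-degree symbols,
        # lowering their degrees and possibly exposing new degree-1 symbols.
        for sym in symbols:
            for b in list(sym[0]):
                if b in recovered and len(sym[0]) > 1:
                    sym[0].discard(b)
                    sym[1] ^= recovered[b]
    return recovered

# Hypothetical example: c1 = b5, c2 = b4 ^ b5, c3 = b1 ^ b2
recovered = peel_decode([({5}, 7), ({4, 5}, 3), ({1, 2}, 6)])
# b5 = 7 is revealed directly; then b4 = 3 ^ 7 = 4; b1 ^ b2 never resolves
```

Note that decoding is partial by design: whatever cannot be peeled (here b1 and b2) simply stays unrecovered, which is exactly the failure-prone-network metric from the previous slide.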
Sensor Network Model
• Encoded symbols remaining: k
• Want to maximize r, the number of recovered original data symbols
• No idea a priori what k will be
Coding is bad for small k
• N original symbols, k encoded symbols received
• If k ≤ 0.75N, no coding required
[Plot: recovered symbols vs. k, simulated with N = 128]
Proof Sketch
Theorem: To recover the first N/2 symbols, it is best to not do any encoding.
Proof:
1. Let C(i, j) = expected symbols recovered from i symbols of degree 1 and j symbols of degree 2 or more.
2. C(i, j) ≤ C(i+1, j−1) if C(i, j) ≤ N/2
   a. Sort the given symbols in decoding order
   b. All degree-1 symbols will be decoded before other symbols
   c. The last symbol in decoding order will be of degree > 1 (see b.)
   d. Replace this symbol by a random degree-1 symbol
   e. The new degree-1 symbol is more likely to be useful
3. Hence, more degree-1 symbols => better output
4. No coding is best to recover any first N/2 symbols
5. All degree 1 => Coupon Collector's => ≈ 3N/4 symbols to recover N/2 distinct symbols
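Step 5's coupon-collector figure can be sanity-checked with a quick simulation (my own sketch, not from the paper): drawing uniform-random unencoded symbols with replacement, 3N/4 draws yield comfortably more than N/2 distinct originals on average.

```python
import random

def distinct_after(n, k, rng):
    """Number of distinct originals seen after k uniform draws with replacement."""
    return len({rng.randrange(n) for _ in range(k)})

rng = random.Random(0)
N = 128
trials = 2000
avg = sum(distinct_after(N, 3 * N // 4, rng) for _ in range(trials)) / trials
# With k = 3N/4 = 96 draws, the expected count is N*(1 - (1 - 1/N)**k),
# about 68 here, above the N/2 = 64 target.
```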
Ideal Degree Distribution
Theorem: To recover r data units such that r < jN/(j+1), the optimal degree distribution has symbols of degree j or less only.
Lower degrees are better for small k
• If k ≤ k_j, use symbols of degree up to j
• So, use k_j − k_{j−1} degree-j symbols in a close-to-optimal distribution
[Plot: N = 128]
Case Study: Single-sink Sensor Network
[Figure: four sensor nodes (1–4), each with local storage, connected to a sink. Nodes exchange symbols with each other; nodes 2 and 3 transfer new symbols to the sink.]
Case Study: Single-sink Sensor Network
• Network prone to failure
• Nodes store unencoded symbols at first, and higher degrees with time
• Sink receives low-degree symbols first and higher degrees as time goes on
Distributed Simulation: Clique Topology
• N = 128 nodes in a clique topology
• Sink receives one symbol per unit time
Distributed Simulation: Chain Topology
• N = 128 nodes in a chain topology: 1 – 2 – 3 – … – N
Related Work
Bulk data distribution: coding is useful
• Tornado codes (“Efficient Erasure Correcting Codes” by M. Luby et al., IEEE Transactions on Information Theory, vol. 47, no. 2, 2001)
• LT-Codes (“LT Codes” by M. Luby, FOCS 2002)
Reliable storage in sensor networks
• Decentralized erasure codes (“Ubiquitous Access to Distributed Data in Large-Scale Sensor Networks through Decentralized Erasure Codes” by A. Dimakis et al., IPSN 2005)
• Random linear coding (“How Good is Random Linear Coding Based Distributed Networked Storage?” by M. Medard et al., NetCod 2005)