On Fault Tolerance in Wireless Ad Hoc Networks, Seth Gilbert, Nancy Lynch Celebration, 2008
On Fault Tolerance in Wireless Ad Hoc Networks
Seth Gilbert
Nancy Lynch Celebration, 2008
Nancy Lynch
Through the years…
[Photos: late 1980's??, 1994, 1997, 2002-2008]
[Timeline: 1980, 1984, 1988, 1992, 1996, 2000, 2004, 2008]
FLP: Impossibility of distributed consensus with one faulty process
DLS: Consensus in the Presence of Partial Synchrony
LT: An Introduction to Input / Output Automata
Fault tolerance
Replication
Consistency
Formal Methods
Simulation Relations, Invariant-based Arguments
Timing
Increasingly complex, increasingly dynamic:
• Group communication / membership
• Publish / Subscribe
• Peer-to-peer systems
• Wireless ad hoc networks
The Virtual Infrastructure Project
The Virtual Infrastructure Project
Papers:
GeoQuorums: Implementing Atomic Memory in Mobile Ad Hoc Networks, DGLSW, DISC’03, DC’05
Virtual Mobile Nodes for Mobile Ad Hoc Networks, DGLSSW, DISC’03
Consensus and Collision Detectors in Wireless Ad Hoc Networks, CDGNN, PODC’05, DC’08
Timed Virtual Stationary Automata for Mobile Networks, DGLLN, Allerton’05, OPODIS’05
Autonomous Virtual Mobile Nodes, DGSSW, DIALM-POMC’05
A Middleware Framework for Robust Applications in Wireless Ad Hoc Networks, CDGN, Allerton’05
Reconciling the theory and practice of unreliable wireless broadcast, CDGLNN, ADSN’05
Self-Stabilizing Mobile Node Location Management and Message Routing, DLLN, SSS’05
Motion Coordination Using Virtual Nodes, LMN, CDC’05
The Virtual Node Layer: A Programming Abstraction for Wireless Sensor Networks, BGLNNS, WWWSNA’07
A Virtual Node-Based Tracking Algorithm for Mobile Networks, NL, ICDCS’07
Self-stabilization and Virtual Node Layer Emulations, NL, SSS’07
Secret Swarm Unit: Reactive k-Secret Sharing, DLY, IndoCrypt’07
Virtual Infrastructure for Collision-Prone Wireless Networks, CGL, PODC’08
Theses:
Virtual Infrastructure for Wireless Ad Hoc Networks, G, PhD 2007
Air Traffic Control Using Virtual Stationary Automata, B, MEng 2007
Simulation and Evaluation of the Reactive Virtual Node Layer, S, MEng 2008
Virtual Stationary Timed Automata for Mobile Networks, N, PhD 2008
In Progress:
Self-Stabilizing Robot Formations over Unreliable Networks, GLMN
Using Virtual Infrastructure to Adapt Wireline Protocols to MANET, W
Virtual Infrastructure Routing for Mobile Ad Hoc Networks, DN
Wireless Ad Hoc Networks
Scenarios:
• Sensor networks
— environmental monitoring
— intrusion detection
— border monitoring
— fire detection
• Social networks
— messaging
— conferences / events
— HikingNet
— TrafficNet
• Coordinated applications
— emergency response & military
— firefighting
— police response
— terrorism
Wireless ad hoc networks are really hard to use.
• Unreliable communication
• Unknown availability
• Noise
• Collisions
• Dynamic
• Unknown participants
• Unknown topology
• Fault prone
• Lost messages
Fixed Infrastructure
Deploy:
— Base stations
— Cell towers
— Servers
Problems:
— Too expensive
— Not feasible
Virtual Infrastructure
[Figure: axes labeled Unreliable / Reliable and Ad hoc / Fixed net]
Network Layers
[Figure: layer stack, top to bottom: Application; Service, Service; Middleware; Wireless Ad Hoc Network]
Network Layers
[Figure: layer stack, top to bottom: Application; Routing, Tracking; Virtual Infrastructure; Wireless Ad Hoc Network]
Building Virtual Infrastructure
Basic idea: replicated state machine
1. Each participant is a replica.
2. Replicas execute a consistency protocol
3. Leader / backup
4. Leader sends & receives messages for the virtual node
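The four points above can be illustrated with a minimal sketch. Everything named here (`Replica`, `VirtualNode`, `receive`, `fail`) is hypothetical, the state machine is a toy running sum, and the consistency protocol is reduced to delivering every message to every replica in the same order. It is not the emulator described in the papers, only an illustration of the leader / backup idea.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """One physical device emulating the virtual node (hypothetical sketch)."""
    device_id: int
    state: int = 0                      # toy state machine: a running sum
    log: list = field(default_factory=list)

    def apply(self, msg: int) -> None:
        # Deterministic transition: same inputs in the same order -> same state.
        self.log.append(msg)
        self.state += msg


class VirtualNode:
    """Leader / backup replication over the devices currently in a region."""

    def __init__(self, device_ids):
        self.replicas = {d: Replica(d) for d in device_ids}
        self.leader = min(self.replicas)    # assumption: lowest id leads

    def receive(self, msg: int) -> int:
        # Stand-in for the consistency protocol: every replica applies
        # the same message in the same order.
        for r in self.replicas.values():
            r.apply(msg)
        # Only the leader sends the virtual node's outgoing message.
        return self.replicas[self.leader].state

    def fail(self, device_id: int) -> None:
        # A device leaves or crashes; a backup takes over if it was the leader.
        del self.replicas[device_id]
        if device_id == self.leader and self.replicas:
            self.leader = min(self.replicas)


vn = VirtualNode([3, 1, 2])
vn.receive(5)
vn.fail(1)                  # the leader fails
reply = vn.receive(7)       # a backup now answers with the same state
```

Because every backup has applied the same inputs, the virtual node's reply is unaffected by the leader's failure.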
Today’s Questions
1. What is virtual infrastructure?
2. What can you do with it?
— Dynamic distributed coordination.
— Air traffic control
3. Does it really work?
— Two simulation studies: routing and address allocation.
Dynamic Distributed Coordination
Challenging problem:
o Highly dynamic environment
o Unreliable network
o Safety-critical applications
Ideal for Virtual Infrastructure solution:
o Static overlay
o Simpler, verifiable algorithms
o Fate-sharing
Dynamic Distributed Coordination
Note:
• Number of (non-failed) robots unknown.
• Location of other robots unknown.
• Pattern may change over time.
Dynamic Distributed Coordination
In each round:
1. All robots stop.
2. All robots send location info.
3. Coordinators exchange info.
4. Coordinators calculate.
5. Coordinators send out targets.
6. Robots move to target.
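The round structure can be sketched as a single function. This is a simplified illustration under several assumptions not in the slides: regions tile a one-dimensional line in fixed-size cells, coordinators exchange only per-region robot counts, movement is instantaneous, and `spread_in_region` is a placeholder rule rather than the talk's target calculation. All names are hypothetical.

```python
from collections import defaultdict

REGION_SIZE = 10.0  # assumption: regions tile a line in 10-unit cells


def region_of(x: float) -> int:
    return int(x // REGION_SIZE)


def run_round(positions: dict, calculate) -> dict:
    """One round; `positions` maps robot id -> location on a line."""
    # Steps 1-2: robots stop and report their location to the
    # coordinator of the region they are in.
    reports = defaultdict(dict)
    for robot, x in positions.items():
        reports[region_of(x)][robot] = x
    # Step 3: coordinators exchange info (here: per-region robot counts).
    counts = {region: len(robots) for region, robots in reports.items()}
    # Steps 4-5: each coordinator calculates and sends out targets.
    targets = {}
    for region, robots in reports.items():
        targets.update(calculate(region, robots, counts))
    # Step 6: robots move to their targets (assumed instantaneous here).
    return targets


def spread_in_region(region, robots, counts):
    """Placeholder rule: space the region's robots evenly across it."""
    step = REGION_SIZE / (len(robots) + 1)
    ordered = sorted(robots, key=robots.get)
    return {r: region * REGION_SIZE + step * (i + 1)
            for i, r in enumerate(ordered)}


new_positions = run_round({"a": 1.0, "b": 2.0, "c": 14.0}, spread_in_region)
```

Since each round recomputes all targets from freshly reported locations, nothing carries over from a previous round except the robots' positions.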
Dynamic Distributed Coordination
Calculating new targets
Rule 1: If only 1 robot, keep it.
Rule 2: If not on the curve and no neighbors on the curve: distribute evenly all but one.
Dynamic Distributed Coordination
Calculating new targets
Rule 3: If not on the curve: distribute among less populated neighbors on the curve.
Dynamic Distributed Coordination
Calculating new targets
Rule 4: If on the curve: distribute among less dense neighbors on the curve.
Dynamic Distributed Coordination
Calculating new targets
Rule 5: Distribute robots evenly on the curve in each region.
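As an illustration of Rule 5 only, the sketch below spaces n robots evenly by arc length along the part of the curve inside one region. Representing that part of the curve as a polyline of (x, y) vertices, and placing each robot at the midpoint of its slice, are assumptions made for the example; the function name is hypothetical.

```python
import math


def evenly_spaced_on_polyline(points, n):
    """Return n targets spaced evenly by arc length along a polyline.

    `points` is the part of the curve inside one region, given as a
    list of (x, y) vertices; this representation is an assumption.
    """
    if n <= 0:
        return []
    segments = list(zip(points, points[1:]))
    seg_lengths = [math.dist(p, q) for p, q in segments]
    total = sum(seg_lengths)
    targets = []
    for k in range(n):
        # Place the k-th robot at the midpoint of the k-th of n equal slices.
        remaining = total * (k + 0.5) / n
        for (p, q), length in zip(segments, seg_lengths):
            if length > 0 and remaining <= length:
                t = remaining / length
                targets.append((p[0] + t * (q[0] - p[0]),
                                p[1] + t * (q[1] - p[1])))
                break
            remaining -= length
    return targets


# An L-shaped piece of curve of total length 8, shared by 4 robots.
targets = evenly_spaced_on_polyline([(0, 0), (4, 0), (4, 4)], 4)
```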
Dynamic Distributed Coordination
Correctness
Step 1: Eventually, robots cease moving from regions “off the curve” to regions “on the curve”.
Step 2: If neighbor g is the most dense neighbor of u after time t, then u is less dense than g after time t+1.
Step 3: Eventually, robots remain always in the same region.
Dynamic Distributed Coordination
Self-stabilization
What happens when something goes wrong?
Too many lost messages
Too much churn
INCONSISTENT REPLICAS
Option 1: Design for the very, very worst case.
Option 2: Design a system that can recover from faults.
Emulating Virtual Infrastructure
Self-stabilization techniques
Leader Election:
o Heartbeats, timeouts
o Resolve leader competitions
Replica Consistency:
o Leader sends “checksums” of the state.
o If out-of-synch, then re-join.
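A rough sketch of the two techniques from a backup replica's point of view follows. The timeout value, the use of SHA-256 over canonical JSON as the "checksum", the lowest-id tie-break, and all class and method names are assumptions for illustration; the slides do not specify them.

```python
import hashlib
import json


def checksum(state: dict) -> str:
    """Digest of the replicated state (canonical JSON, then SHA-256)."""
    encoded = json.dumps(state, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()


class Backup:
    HEARTBEAT_TIMEOUT = 3.0  # seconds; an illustrative value

    def __init__(self, device_id: int, state: dict):
        self.device_id = device_id
        self.state = dict(state)
        self.last_heartbeat = 0.0
        self.joined = True

    def on_heartbeat(self, now: float, leader_checksum: str) -> None:
        self.last_heartbeat = now
        # Out of sync with the leader: drop local state and re-join.
        if checksum(self.state) != leader_checksum:
            self.state = {}
            self.joined = False

    def on_join_reply(self, leader_state: dict) -> None:
        # Re-joining copies the leader's state wholesale.
        self.state = dict(leader_state)
        self.joined = True

    def leader_suspected(self, now: float) -> bool:
        # No heartbeat for too long: start competing for leadership.
        return now - self.last_heartbeat > self.HEARTBEAT_TIMEOUT


def resolve_competition(candidate_ids):
    """Break ties between competing leaders; lowest id wins (assumption)."""
    return min(candidate_ids)


leader_state = {"round": 7, "targets": {"a": 1}}
b = Backup(4, {"round": 6, "targets": {}})       # corrupted / stale replica
b.on_heartbeat(now=10.0, leader_checksum=checksum(leader_state))
needs_rejoin = not b.joined
b.on_join_reply(leader_state)
```

The point of the design is that a replica never needs to know how it became inconsistent; any mismatch is repaired the same way, by re-joining.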
Building Virtual Infrastructure
Self-stabilization claims
Assume that:
o A is a self-stabilizing algorithm.
o A is designed for the virtual infrastructure abstraction.
o A is executed with the emulator.
o The system begins in an arbitrary (corrupt) state.
Then if the system is eventually well-behaved:
o From some point on, the state of A is as if it had really executed on a fixed infrastructure.
Dynamic Distributed Coordination
Summary
Coordination algorithm is self-stabilizing.
o In each round, all state is recalculated.
o Underlying virtual infrastructure emulation is self-stabilizing.
Implications:
o Converges to changing curve.
o Recovers from network instability, lost messages, etc.
Dynamic Distributed Coordination
Additional comments
Tina Nolte
Virtual Stationary Timed Automata for Mobile Networks
PhD 2008
Dynamic Distributed Coordination
Air traffic control
Free Flight
o No flight plan, no control towers!
o Each pilot chooses a route independently.
o More efficient:
—Adapt to wind currents.
—Avoid turbulence / bad weather.
Dynamic Distributed Coordination
Air traffic control
Goal: Free Flight
o Each pilot chooses a route independently.
o More efficient:
—Adapt to wind currents.
—Avoid turbulence / bad weather.
In the USA, minimum separation: 3 miles lateral distance OR 1000 feet altitude
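The separation rule stated above can be written as a small predicate. The flat local coordinate frame, the tuple layout, and the function name are assumptions for the example; only the two thresholds come from the slide.

```python
import math

MIN_LATERAL_MILES = 3.0      # from the slide
MIN_VERTICAL_FEET = 1000.0   # from the slide


def separated(a, b) -> bool:
    """a, b are (x_miles, y_miles, altitude_feet) in a flat local frame."""
    lateral = math.hypot(a[0] - b[0], a[1] - b[1])
    vertical = abs(a[2] - b[2])
    # Either condition alone is enough.
    return lateral >= MIN_LATERAL_MILES or vertical >= MIN_VERTICAL_FEET


close_but_stacked = separated((0, 0, 30000), (1, 1, 31000))
too_close = separated((0, 0, 30000), (1, 1, 30500))
```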
Dynamic Distributed Coordination
Additional comments
Matthew D. Brown
Air Traffic Control Using Virtual Stationary Automata
MEng, 2008
Today’s Questions
1. What is virtual infrastructure?
2. What can you do with it?
— Dynamic distributed coordination.
3. Does it really work?
— Two simulation studies.
Simulating Virtual Infrastructure
Study #1 — Routing / Geocast
— Custom-built simulator (python)
— Simple communication model
Study #2 — Address allocation (i.e., DHCP)
— ns2 simulator
— 802.11 MAC layer
GeoCast
Location-based routing
[Figure: geocast from a Source to a Destination region]
Location Service
Store current location at home
[Figure: Target geocasts its location to home regions hash(id, 1) and hash(id, 2)]
Location Service
Where are you?
[Figure: Source geocasts a query to one of the Target's home regions, hash(id, 1) or hash(id, 2)]
Routing
Point-to-point communication
Two step process:
1. Lookup destination location.
2. Geocast message to destination’s region.
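The location service and the two-step routing process can be sketched together. This is an illustration under assumptions: regions form a small grid, SHA-256 stands in for the unspecified `hash(id, k)` function, each device has two home regions (as in the figures), and a dictionary stands in for the state held by the virtual nodes. All names are hypothetical, and geocast delivery itself is not modeled.

```python
import hashlib

GRID = 4          # assumption: regions form a 4 x 4 grid
REPLICAS = 2      # the slides show hash(id, 1) and hash(id, 2)


def home_region(device_id: str, k: int) -> tuple:
    """Map (id, k) to a grid cell; SHA-256 is a stand-in hash function."""
    digest = hashlib.sha256(f"{device_id}:{k}".encode()).digest()
    index = int.from_bytes(digest[:4], "big") % (GRID * GRID)
    return divmod(index, GRID)


class LocationService:
    def __init__(self):
        # region -> {device id -> last reported region}; stands in for
        # the state held by the virtual node of each home region.
        self.store = {}

    def update(self, device_id: str, current_region: tuple) -> None:
        # "Store current location at home": geocast to every home region.
        for k in range(1, REPLICAS + 1):
            home = home_region(device_id, k)
            self.store.setdefault(home, {})[device_id] = current_region

    def lookup(self, device_id: str):
        # "Where are you?": ask the home regions until one answers.
        for k in range(1, REPLICAS + 1):
            entry = self.store.get(home_region(device_id, k), {})
            if device_id in entry:
                return entry[device_id]
        return None


def route(service: LocationService, dest_id: str, payload: str):
    # Step 1: look up the destination's location.
    region = service.lookup(dest_id)
    if region is None:
        return None
    # Step 2: geocast the message to that region (returned, not sent).
    return (region, payload)


svc = LocationService()
svc.update("node-17", (2, 3))
delivery = route(svc, "node-17", "hello")
```

Any sender can compute a device's home regions from its id alone, so no directory of where the directory lives is needed.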
Simulation Setup
Basic settings
Number of devices:
• 25 / 50 / 100
Velocity:
• 0-20 meters / second
Mobility model:
• Random waypoint
• Pause time: 100-900s
Simulation time:
• 1000 seconds
[Figure: 400 m x 400 m area; 250 m also marked]
Simulation Setup
Application settings
GeoCast:
• 10 send/receive pairs
• 1 msg every 5 secs
Routing:
• 10 send/receive pairs
• 1 msg every 0.5 secs
• 15 second simulation
Mobility and Density
[Plot: Percent of Time Non-Failed (20%-100%) vs. Pause Time (200-800); series: 25 devices, 50 devices, 100 devices]
When density is sufficient, virtual nodes work.
Leadership Changes
[Plot: Leadership Changes per Region (2-10) vs. Pause Time (200-800); 100 devices]
There is continuous turn-over in the leader.
Message Overhead
[Plot: Messages per Region per second (0.01-0.5) vs. Pause Time (200-800); series: Heartbeat, Join, Leader]
Most overhead is heartbeats. (Overhead is negligible.)
Geocast Latency Overhead
VN-GeoCast is 2-3 times slower than simple GeoCast.
[Plot: Latency (in seconds) (0.1-0.5) vs. Pause Time (200-800); 100 devices; legend includes simple Geocast]
Routing
End-to-end performance
Delivery Rate: 79%
Median Latency: 0.46 seconds
Average Latency: 0.58 seconds
Each message requires 3 GeoCast messages.
** devices=50, pausetime=400
Simulation Summary
Virtual nodes are stable if:
— sufficient density (e.g., 4/region), OR
— low-enough churn
Message overhead: negligible.
GeoCast latency overhead: factor of 2.
Routing: relatively slow.
Simulation Summary
Additional comments
Mike Spindel
Simulation and Evaluation of the Reactive Virtual Node Layer
MEng 2008
Simulating Virtual Infrastructure
Study #1 — Routing / Geocast
— Custom-built simulator (python)
— Simple communication model
Study #2 — Address allocation (i.e., DHCP)
— ns2 simulator
— 802.11 MAC layer
Address Allocation
Basic problem
— Mobile devices join and leave.
— Each device needs an address.
— Addresses should be assigned dynamically.
— Addresses should be unique.
Challenges:
• Highly dynamic.
• No central authority.
• Unreliable network.
• Limited address pool.
Simple Scheme
Each region is allocated a cache of addresses.
Basic protocol:
1. Client sends REQUEST
2. Server replies OFFER
3. Client sends ACQUIRE
4. Server replies ACK
Renew protocol:
1. Client sends RENEW
2. Server replies RACK
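The server side of this scheme, as run by one region's virtual node, might look like the sketch below. The message names follow the slide; the class, its methods, the lease bookkeeping, and the behavior on exhaustion or expiry are assumptions. The pool size of 30 and lease time of 400 seconds are the values used in this talk's simulation settings. Message forwarding between regions is not modeled.

```python
class AddressServer:
    """Virtual-node side of the protocol for one region (sketch)."""

    def __init__(self, pool, lease_time=400.0):
        self.free = list(pool)
        self.offered = {}   # client -> address offered but not yet acquired
        self.leases = {}    # client -> (address, expiry time)
        self.lease_time = lease_time

    def _expire(self, now):
        # Return addresses whose lease has run out to the free pool.
        for client, (addr, expiry) in list(self.leases.items()):
            if expiry <= now:
                del self.leases[client]
                self.free.append(addr)

    def on_request(self, client, now):
        self._expire(now)
        if client not in self.offered:
            if not self.free:
                return None                 # pool exhausted: no OFFER
            self.offered[client] = self.free.pop(0)
        return ("OFFER", self.offered[client])

    def on_acquire(self, client, addr, now):
        if self.offered.get(client) != addr:
            return None
        del self.offered[client]
        self.leases[client] = (addr, now + self.lease_time)
        return ("ACK", addr)

    def on_renew(self, client, now):
        self._expire(now)
        if client not in self.leases:
            return None                     # lease lapsed: start over
        addr, _ = self.leases[client]
        self.leases[client] = (addr, now + self.lease_time)
        return ("RACK", addr)


server = AddressServer(pool=range(1, 31))   # 30 addresses per region
offer = server.on_request("c1", now=0.0)
ack = server.on_acquire("c1", offer[1], now=1.0)
rack = server.on_renew("c1", now=300.0)     # renewed until t = 700
late = server.on_renew("c1", now=800.0)     # lease expired at t = 700
```

Uniqueness within a region follows from an address being in exactly one of the free list, the pending offers, or the active leases at any time.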
Message forwarding…
[Figure: messages between Client and Virtual Node: REQUEST, OFFER, ACQUIRE, ACK, then RENEW and RACK (each shown twice)]
Simulation Setup
Basic settings
Number of devices:
• 160
MAC Layer:
• 802.11
• Models collisions
Mobility model:
• Random waypoint
Simulation time:
• 40000 seconds
[Figure: 700 m x 700 m area; 250 m also marked]
Simulation Setup
Application settings
Number of addresses:
• 30 per region
Lease time:
• 400 seconds
Forwarding limit:
• 2 hop - REQUEST
• 2 hop - RACK
• Varying - RENEW
Simulation Setup
Simulation settings
                         Very Slow   Slow    Medium Slow   Medium Fast   Fast
Min. Speed (m/s)         0.365       0.73    1.46          2.92          7.3
Max. Speed (m/s)         1.48        2.92    5.84          11.68         29.2
Average Pause Time (s)   4400        2200    1100          550           220
Average Cross Time (s)   82.20       41.10   20.55         10.27         4.11
Message Overhead
                         Messages per 400 secs   Percent
Heartbeats               360                     76
Leader Request           24                      5
Leader Reply             50                      11
Synch-Request            20                      4
Synch-Reply              20                      4
Total Message Overhead   474
Maximum observed: Less than 2-4.5kbps
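The percentages in the table can be re-derived from the message counts. The short check below assumes the percentages are each count divided by the total and rounded to the nearest integer; it also computes the average message rate over the 400-second window, a figure not stated on the slide.

```python
counts = {
    "Heartbeats": 360,
    "Leader Request": 24,
    "Leader Reply": 50,
    "Synch-Request": 20,
    "Synch-Reply": 20,
}
total = sum(counts.values())
# Share of each message type, rounded to whole percent (assumed rounding).
percent = {name: round(100 * n / total) for name, n in counts.items()}
# Average emulator messages per second over the 400-second window.
per_second = total / 400
```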
Message Overhead
Different speeds
[Plot: other emulator messages per region (0-6000) at speeds very slow, slow, medium slow, medium fast, fast; series: LeaderRequest, LeaderReply, SYN_REQUEST, SYN_ACK msgs/region]
Message Overhead
Different densities
[Plot: other emulator messages per node (0-10000) vs. total number of nodes (40-120); series: LeaderRequest, LeaderReply, SYN_REQUEST, SYN_ACK msgs per region]
Protocol Performance
Different speeds
[Plot: messages per region (0-10000) at speeds very slow, slow, medium slow, medium fast, fast; series: allocations per client, messages per region]
Protocol Performance
Renewal cost
[Plot: delay per renewal (0-0.09) at speeds very slow, slow, medium slow, medium fast, fast; series: renewal delay]
Simulation Summary
Message overhead: still negligible.
— Even with collisions…
— Backoff…
— Bigger simulations…
Simple address allocation scheme:
— Reasonably efficient…
— Scales well…
Simulation Summary
Additional comments
Jiang Wu
Using Virtual Infrastructure to Adapt Wireline Protocols to MANET
Summary
What is virtual infrastructure?
Dynamic distributed coordination
Robotic motion coordination
Self-stabilization
(Preliminary) simulation results.
The Virtual Infrastructure Project
Distributed Algorithms
Focus on fault-tolerance
— Replication
— Consistency
— Agreement
Design principles
— Abstraction / layered design
— IOA / TIOA formalism
Classical techniques, modern networks
Seth Gilbert
George Varghese
Boaz Patt-Shamir
Jennifer Welch
Brian Coan
Kenneth Goldman
Shinya Umeno
Alex Cornejo
Mark Tuttle
Joshua Tauber
Eugene Stark
Rainer Gawlick
Alan Fekete
Victor Luchangco
Roberto Segala
Rui Fan
Tina Nolte
Sayan Mitra
Calvin Newport
Carl Livadas
Jim Burns
Roger Khazan
Roberto DePrisco
Congratulations, Nancy, and thank you!!
The End