comparison based dictionaries: fault tolerance versus i/o efficiency
DESCRIPTION
Comparison Based Dictionaries: Fault Tolerance versus I/O Efficiency. Gerth Stølting Brodal Allan Grønlund Jørgensen Thomas Mølhave University of Aarhus. ADS 2007, 3rd Bertinoro Workshop on Algorithms and Data Structures - PowerPoint PPT PresentationTRANSCRIPT
Comparison Based Dictionaries: Fault Tolerance versus I/O Efficiency
Gerth Stølting BrodalAllan Grønlund Jørgensen
Thomas Mølhave
University of Aarhus
ADS 2007, 3rd Bertinoro Workshop on Algorithms and Data StructuresUniversity Residential Centre of Bertinoro, Italy, September 30-October 5, 2007
Dictionaries: Fault Tolerance versus I/O EfficiencyBrodal, Jørgensen, Mølhave2
Binary Searching
Fault tolerance
I/O Efficiency
This talk
Future work
Dictionaries: Fault Tolerance versus I/O EfficiencyBrodal, Jørgensen, Mølhave3
Search(17)
4 7 10 13 14 15 16 18 19 23 25 26 27 29 30 31 32 33 34 36 38
17
O(log N) comparisons
Dictionaries: Fault Tolerance versus I/O EfficiencyBrodal, Jørgensen, Mølhave4
Search(17)
4 7 10 13 14 15 16 18 19 23 25 26 27 29 30 31 32 33 34 36 38
17?
9
soft memory error
Dictionaries: Fault Tolerance versus I/O EfficiencyBrodal, Jørgensen, Mølhave5
Faulty-Memory RAM ModelFinocchi and Italiano, STOC’04
Content of memory cells can get corrupted Corrupted and uncorrupted content cannot be distinguished O(1) safe registers Assumption: At most δ corruptions
Example: Sorting requires time Θ(N·log N+δ2)Finocchi, Grandoni, Italiano, ICALP‘06
Dictionaries: Fault Tolerance versus I/O EfficiencyBrodal, Jørgensen, Mølhave6
Faulty-Memory RAM:Searching
Lower bound
Upper bound
Θ(log N + δ) comparisons
Finocchi, Grandoni, Italiano, ICALP’06
Brodal, Fagerberg, Finocchi, Grandoni, Italiano, Jørgensen, Moruz, Mølhave, ESA’07
Dictionaries: Fault Tolerance versus I/O EfficiencyBrodal, Jørgensen, Mølhave7
Faulty-Memory RAM:Searching
17?
94 7 10 13 14 15 16 18 19 23 25 26 27 29 30 31 32 33 34 36 38
Problem? High confidenceLow confidence
Requirement: If there exists an uncorrupted element equal to the search key, we should find such an element
Dictionaries: Fault Tolerance versus I/O EfficiencyBrodal, Jørgensen, Mølhave8
Faulty-Memory RAM:Searching
When are we done (δ=3)?
Contradiction, i.e. at least one fault
If range contains at least δ+1 and δ+1 then there is at least one uncorrupted and , i.e. x must be contained in the range
Dictionaries: Fault Tolerance versus I/O EfficiencyBrodal, Jørgensen, Mølhave9
If verification fails → contradiction, i.e. ≥1 memory-fault → ignore 4 last comparisons → backtrack one level of search
Faulty-Memory RAM:Θ(log N + δ) Searching
1 12 233 44 55
Brodal, Fagerberg, Finocchi, Grandoni, Italiano, Jørgensen, Moruz, Mølhave, ESA’07
Dictionaries: Fault Tolerance versus I/O EfficiencyBrodal, Jørgensen, Mølhave10
Faulty-Memory RAM:Θ(log N + δ) Searching
1 12 233 44
Standard binary search + verification steps At most δ verification steps can fail/backtracking Detail: Avoid repeated comparison with the same
(wrong) element by grouping elements into blocks of size O(δ)
Brodal, Fagerberg, Finocchi, Grandoni, Italiano, Jørgensen, Moruz, Mølhave, ESA’07
Dictionaries: Fault Tolerance versus I/O EfficiencyBrodal, Jørgensen, Mølhave11
Faulty-Memory RAM: Reliable Values
Store 2δ+1 copies of value x - at most δ copies uncorrupted x = majority Time O(δ) using two safe registers (candidate and count)
δ=5 y y y x x y x x x y x
Candidate y y y y y y y – x – x
Count 1 2 3 2 1 2 1 0 1 0 1
Boyer and Moore ‘91
Dictionaries: Fault Tolerance versus I/O EfficiencyBrodal, Jørgensen, Mølhave12
Faulty-Memory RAM:Dynamic Dictionaries
Packed array Reliable pointers and keys Updates O(δ ·log2 N) Searches = fault tolerant O(log N+δ)
2-level buckets of size O(δ·log N) Root: Reliable pointers and keys Bucket search/update amortized O(log
N+δ)
... ...
Θ(δ·log N)elements
Itai, Konheim, Rodeh, 1981
Search and update amortized O(log N+δ)
...
Θ(δ) elements
Brodal, Fagerberg, Finocchi, Grandoni, Italiano, Jørgensen,
Moruz, Mølhave, ESA’07
Dictionaries: Fault Tolerance versus I/O EfficiencyBrodal, Jørgensen, Mølhave13
I/O Model
Dictionaries: Fault Tolerance versus I/O EfficiencyBrodal, Jørgensen, Mølhave14
I/O Model
N = problem size M = memory size B = I/O block size
One I/O moves B consecutive records from to disk Complexity = number of I/Os
CPU External
I/O
MemoryyromeM
Aggarwal and Vitter 1988
B
N
B
NBM /log Example: Sorting requires I/Os
Dictionaries: Fault Tolerance versus I/O EfficiencyBrodal, Jørgensen, Mølhave15
B-trees
O(logB N)
....
Ω(B)
Search path
Search and update O(logB N)
Dictionaries: Fault Tolerance versus I/O EfficiencyBrodal, Jørgensen, Mølhave16
Fault-Tolerance
versus
I/O Efficiency
Dictionaries: Fault Tolerance versus I/O EfficiencyBrodal, Jørgensen, Mølhave17
Lower Bound for Fault-Tolerant External Searching
Adversary argument
If Bε slabs per I/O → factor Bε reduction and B1-ε faults After k I/Os N/(Bε)k–k· B1-ε elements remain
Possible values
1
log1
BNB I/Os required [minimized wrt ε ]
Dictionaries: Fault Tolerance versus I/O EfficiencyBrodal, Jørgensen, Mølhave18
Randomized Upper Bound for Fault-Tolerant External Searching Sorted array + 2δ identical B-trees
(over N/(2δ) elements, stored in BFS layout) Search: Select random tree for each
node on search path + verification Probability no faults on path:
where Σβi≤δ
Search O(logB N+δ/B) expected
2
1
21
log
1
N
i
iB
....
Dictionaries: Fault Tolerance versus I/O EfficiencyBrodal, Jørgensen, Mølhave19
Sorted array
+ 2δ/B1-ε identical B-trees of degree Bε
+ B1-ε copies of each key + min/max Search: Verify against min/max in
each step – if fail, backtrack one
level and advance to next copy
Search I/Os
Deterministic Upper Bound for Fault-Tolerant External Searching
BB
NO B
1
log1
Dictionaries: Fault Tolerance versus I/O EfficiencyBrodal, Jørgensen, Mølhave20
Dynamic Fault-Tolerant External DictionariesStatic structure
+ Packed arrays
+ Buckets of size O(δ ·log3 N)
Deterministic
I/Os search and updates
Randomized
Expected O(logB N+δ/B) I/Os search and updates
1
log1
BNO B
... ...
Static
Dictionaries: Fault Tolerance versus I/O EfficiencyBrodal, Jørgensen, Mølhave21
Fault-tolerant external memory searching
I/Os
worst-case [minized wrt ε]
Randomized O(logB N+δ/B) I/Os
Conclusion
1
log1
BNB
Dictionaries: Fault Tolerance versus I/O EfficiencyBrodal, Jørgensen, Mølhave22
Future Work Fault Tolerance versus I/O Efficiency Randomized algorithms:
Memory faults in internal memory?
Sorting:
? ...
BB
N
B
NBM
2
/log
Dictionaries: Fault Tolerance versus I/O EfficiencyBrodal, Jørgensen, Mølhave23
THANKS