1
Routing with a Clue
Anat Bremler-Barr
Joint work with
Yehuda Afek & Sariel Har-Peled
Tel-Aviv University
2
IP Lookup
• IP lookup: given an IP address, determine the next hop for reaching that destination
• Fast address lookup is a key component of high-performance routers
Destination Address: 1011001101011011001111110101

Forwarding Table
Prefix          NxtHop
*               4
00*             12
011101110*      3
01111*          7
1*              2
10000001*       3
10110*          3
101111*         5
10110011*       2
10110011010*    4
3
Best Matching Prefix (BMP)
• Routers today use a kind of trie data structure (Patricia)
• BMP lookup is expensive; the cost is counted in the number of memory references
Forwarding table
Prefix   NxtHop
*        1
00*      2
000*     3
0001*    4
01*      2
1*       1
110*     3

[Figure: the trie built from this forwarding table, edges labelled 0 and 1]
Example: The BMP of 0000 is 000*
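To make the cost of a BMP lookup concrete, here is a minimal sketch of a plain binary trie built from the forwarding table on this slide. The names `Node`, `build_trie` and `bmp_lookup` are illustrative and not taken from the talk, and counting visited vertices is only a rough stand-in for memory references.

```python
class Node:
    """One trie vertex; next_hop is set only when the vertex is a prefix."""
    def __init__(self):
        self.child = {}
        self.next_hop = None


def build_trie(table):
    """Build a binary trie from a {prefix bits: next hop} table."""
    root = Node()
    for prefix, hop in table.items():
        node = root
        for bit in prefix:
            node = node.child.setdefault(bit, Node())
        node.next_hop = hop
    return root


def bmp_lookup(root, addr):
    """Return (BMP, next hop, vertices visited) for a bit-string address."""
    node, depth, visited = root, 0, 1
    best = ("", root.next_hop) if root.next_hop is not None else None
    for bit in addr:
        node = node.child.get(bit)
        if node is None:
            break
        depth += 1
        visited += 1
        if node.next_hop is not None:
            best = (addr[:depth], node.next_hop)  # longest match so far
    if best is None:
        return None, None, visited
    return best[0] + "*", best[1], visited


# Forwarding table from the slide; "" stands for the default prefix *.
TABLE = {"": 1, "00": 2, "000": 3, "0001": 4, "01": 2, "1": 1, "110": 3}
root = build_trie(TABLE)
print(bmp_lookup(root, "0000"))  # BMP 000*, next hop 3, 4 vertices visited
```

The walk has to continue until it falls off the trie, because a longer match may still follow a shorter one; that is why the number of visited vertices grows with prefix length.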
4
Three Approaches
• Faster data structures
  – [WVTP 97] log W (W=32)
  – [DBCP 97] [DKVZ 99] [CM99] [NK98] compression
  – [LSV 98] binary and 6-way search
  – [SV 98] version on tries
  – [CP99] caching
• Hardware
  – CAMs [MF 93]
  – Pipeline [GLM 98], [HZPS99]
• Reducing the need for IP lookup
  – Tag-Switching
  – CSR
  – ARIS
  – IP-Switching
  – threaded indices
  – MPLS
5
Our Approach
Routing with a Clue: Distributed IP-lookup
Clue: Telling a router where its upstream neighbor ended the IP lookup
[Figure: routers R1, R2, R3 and R4 along a path, each with its own trie]
In many cases the BMPs are identical
6
Distributed IP-lookup
[Figure: prefix length (1 to 32) along a path through the backbone, comparing the work in the trie with the work with a clue]
7
Distributed IP lookup: why?
Each router on the path to the destination performs a similar IP lookup:
• Same destination address
• Similar forwarding tables at neighboring routers
Number of prefixes:
         AT&T1     AT&T2
Equal    23,382    23,382
Diff     32        36,093
Total    23,414    60,475
8
Forwarding tables are similar
Number of prefixes:
         ISP-B1    ISP-B2
Equal    55,540    55,540
Diff     494       419
Total    56,034    55,959
9
Even if the routers are not neighbors
Number of prefixes:
         MaeEast   MaeWest
Equal    23,382    23,382
Diff     18,868    741
Total    42,250    24,123
10
Outline:
• Simple method
• Clue table and fields
• Advanced method
• Experimental results
• MPLS improvements
• Conclusions
11
In many cases the clue directly determines the best matching prefix:
When the clue vertex (in the trie of the receiving router):
(a) has no descendants,
(b) does not exist
[Figure: trie of R1 and trie of R2, illustrating cases (a) and (b)]
12
In other cases the clue is a good point to start the IP lookup
• Search from the vertex that corresponds to the clue
• If the search fails, forward according to the BMP of the clue string
[Figure: trie of R1 and trie of R2 for Dest. Add.: 001101010...]
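The two bullets of the simple method can be sketched as follows. This is an illustration under assumptions, not the authors' implementation: the trie is emulated with sets of bit strings, the table `R2` and its next-hop letters are hypothetical, and a real router would jump to the clue vertex through a stored pointer instead of recomputing the vertex set.

```python
def trie_vertices(table):
    """All trie vertices (as bit strings) implied by a forwarding table."""
    return {p[:i] for p in table for i in range(len(p) + 1)}


def longest_prefix(table, bits):
    """Longest prefix in `table` that is a prefix of `bits`, or None."""
    for i in range(len(bits), -1, -1):
        if bits[:i] in table:
            return bits[:i]
    return None


def lookup_with_clue(table, addr, clue):
    """Simple method: start the search at the clue vertex, not at the root."""
    assert addr.startswith(clue), "the clue is a prefix of the destination"
    verts = trie_vertices(table)
    best = None
    depth = len(clue)
    # Walk down from the clue vertex for as long as the trie continues.
    while depth <= len(addr) and addr[:depth] in verts:
        if addr[:depth] in table:
            best = addr[:depth]
        depth += 1
    if best is None:
        # The search failed (or the clue vertex does not exist):
        # forward according to the BMP of the clue string itself.
        best = longest_prefix(table, clue)
    return best, table[best]


# Hypothetical forwarding table of the receiving router; "" stands for *.
R2 = {"": "A", "00": "B", "0001": "C", "00010": "D",
      "0101": "E", "011": "F", "1": "G"}

print(lookup_with_clue(R2, "00010111", "0001"))  # search continues below the clue
print(lookup_with_clue(R2, "01001111", "01"))    # search fails, BMP of the clue is used
```

The result is always the true BMP at the receiving router: every prefix longer than the clue is examined by the walk, and every shorter candidate is a prefix of the clue string.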
13
Clue Table
[Figure: routers R1, R2 and R3, each holding a clue table]
14
The clue table
• An entry for each clue a router may receive from its neighbors
• Two pre-computed fields:
  – Ptr: pointer to the vertex that corresponds to the clue
  – FD: Final Decision (BMP, next hop), used:
    • when the clue fixes the BMP
    • when the search from the clue fails
    (the two cases are distinguished by the Ptr value)
Clue table size: estimated at 500KB
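A minimal sketch of how the two fields could be pre-computed for the simple method, assuming the receiver knows which prefixes its neighbor may send as clues. The function names and both tables are hypothetical; Ptr is modelled as the clue string itself, or `None` when no search is needed.

```python
def trie_vertices(table):
    """All trie vertices (as bit strings) implied by a forwarding table."""
    return {p[:i] for p in table for i in range(len(p) + 1)}


def longest_prefix(table, bits):
    """Longest prefix in `table` that is a prefix of `bits`, or None."""
    for i in range(len(bits), -1, -1):
        if bits[:i] in table:
            return bits[:i]
    return None


def build_clue_table(sender_prefixes, receiver_table):
    """Map each possible clue to (FD, Ptr) at the receiving router."""
    verts = trie_vertices(receiver_table)
    clue_table = {}
    for clue in sender_prefixes:
        # A search is only worthwhile if the clue vertex exists and has
        # something below it; otherwise the clue fixes the BMP.
        has_descendants = (clue + "0") in verts or (clue + "1") in verts
        ptr = clue if has_descendants else None
        fd = longest_prefix(receiver_table, clue)  # BMP of the clue string
        clue_table[clue] = (fd, ptr)
    return clue_table


# Hypothetical tables; "" stands for the default prefix *.
R1 = {"", "00", "0001", "01", "1"}
R2 = {"": "A", "00": "B", "0001": "C", "00010": "D",
      "0101": "E", "011": "F", "1": "G"}

for clue, (fd, ptr) in sorted(build_clue_table(R1, R2).items()):
    print(f"clue={clue + '*':6} FD={fd + '*':6} Ptr={ptr}")
```

A `None` Ptr covers both cases in which the clue fixes the BMP (no descendants, or no such vertex); a non-`None` Ptr means FD is only the fallback if the search from the clue fails.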
15
Example
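R1 -> R2

Forwarding table of R1 (Prefix column): *, 00*, 0001*, 01*, 1*
Forwarding table of R2 (Prefix column): *, 00*, 0001*, 00010*, 0101*, 011*, 1*

Clue table
Clue     FD
*
00*      00*
0001*
01*      *
1*       1*

[Figure: tries of R1 and R2, edges labelled 0 and 1; the Ptr column of the clue table is drawn as arrows into the trie of R2]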
16
Combining the clue table of several neighbors
• Ideally, a separate clue table at each interface
• If that is not possible, combine them into one table
17
Clue encoding in the header
• 5 bits (IPv4) give the length of the destination-address prefix that is the clue
Example:
Dest address: 10110011111101011011001111110101
Clue is coded: 10001
The clue: 10110011111101011
• The receiver needs a hash table to find the clue's entry
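The example on this slide can be reproduced with a few lines. This is a sketch under one assumption that the slide does not state: the 5-bit field holds the clue length directly, so only lengths 0 to 31 fit. The function names and the clue-table entry are hypothetical; a Python `dict` stands in for the hash table.

```python
def encode_clue(clue_len):
    """Encode the clue length in a 5-bit field (assumes lengths 0..31)."""
    if not 0 <= clue_len < 32:
        raise ValueError("clue length does not fit in 5 bits")
    return format(clue_len, "05b")


def decode_clue(dest_bits, code):
    """Recover the clue: the first int(code) bits of the destination."""
    return dest_bits[:int(code, 2)]


dest = "10110011111101011011001111110101"
code = encode_clue(17)
clue = decode_clue(dest, code)
print(code, clue)

# The receiver keys its clue table by the clue string, hence the hash table.
clue_table = {clue: {"FD": "hypothetical next hop", "Ptr": None}}
entry = clue_table.get(decode_clue(dest, code))
```

Because the clue is always a prefix of the destination address already in the header, only its length has to be carried.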
18
To avoid the hash table
• Replace the clue with a <clue, index> pair: the index (16 bits) points into the clue table
• Learning on the fly
Dest address: 10110011111101011011001111110101
Clue is coded: 10001
The clue: 10110011111101011
Index: 1110110101010110
Clue table entry at index 1110110101010110: Clue = 1011…01, Ptr, FD
Similar to the idea of [CV96]
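One way to read the <clue, index> scheme is sketched below. The details are assumptions rather than something the slide spells out: the table is a flat array of 2^16 slots, each slot stores the clue next to Ptr and FD so that the receiver can verify it, and "learning on the fly" is modelled as filling a slot the first time a clue arrives with that index. `IndexedClueTable` and `compute_entry` are hypothetical names.

```python
TABLE_SIZE = 1 << 16  # a 16-bit index


class IndexedClueTable:
    def __init__(self, compute_entry):
        # compute_entry: hypothetical callback, clue -> (ptr, fd); it stands
        # for the ordinary (slow) computation of a clue table entry.
        self.slots = [None] * TABLE_SIZE
        self.compute_entry = compute_entry
        self.misses = 0

    def get(self, clue, index):
        """Direct array access; the stored clue guards against stale slots."""
        slot = self.slots[index]
        if slot is not None and slot[0] == clue:
            return slot[1], slot[2]
        # Unknown or mismatching slot: compute the entry and learn it.
        self.misses += 1
        ptr, fd = self.compute_entry(clue)
        self.slots[index] = (clue, ptr, fd)
        return ptr, fd


table = IndexedClueTable(lambda clue: (None, clue))  # toy entry computation
index = int("1110110101010110", 2)
first = table.get("10110011111101011", index)
second = table.get("10110011111101011", index)
print(first == second, table.misses)
```

Only the first packet carrying a given <clue, index> pair pays for the slow path; later packets cost one array access plus a comparison.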
20
Advanced method, or: get more from a clue
[Figure: trie at R1 and trie at R2, marking the clue and the BMP of the clue; the clue is passed from R1 to R2]
21
Advanced Method - Claim:
No search is necessary if, on every path going down from the clue vertex in the trie of R2, a prefix of R1 is encountered before, or at the same vertex as, the first prefix of R2.
If the Claim holds for the received clue, the lookup costs one memory reference.
Otherwise the clue is problematic and needs further search.
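The Claim can be checked offline for every possible clue. The sketch below restates it as: a clue is problematic if some prefix of R2 strictly below the clue has no prefix of R1 on the path between the clue (exclusive) and itself (inclusive). The function name and both prefix sets are hypothetical, and the quadratic scan is only meant for illustration.

```python
def is_problematic(clue, sender_prefixes, receiver_prefixes):
    """True if the receiver must keep searching after receiving `clue`."""
    for p in receiver_prefixes:
        if len(p) > len(clue) and p.startswith(clue):
            # p is a receiver prefix below the clue vertex; look for a
            # sender prefix q on the path with clue < q <= p.
            covered = any(
                len(clue) < len(q) <= len(p) and p.startswith(q)
                for q in sender_prefixes
            )
            if not covered:
                return True
    return False


# Hypothetical prefix sets; "" stands for the default prefix *.
R1 = {"", "00", "0001", "01", "1"}
R2 = {"", "00", "0001", "00010", "0101", "011", "1"}

problematic = sorted(c for c in R1 if is_problematic(c, R1, R2))
print(problematic)
```

The reasoning behind the test: if R1 chose the clue as its BMP, the destination matches no longer prefix of R1, so it cannot reach any prefix of R2 that lies at or below such an R1 prefix. For a non-problematic clue the BMP at R2 is therefore the BMP of the clue string, which is pre-computed.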
22
Problematic Clues
Number of prefixes:
                 Problematic   Non-Prob   Share problematic
AT&T1->AT&T2     575           22,839     2.45%
AT&T2->AT&T1     52            60,423     0.09%
23
Problematic Clues
Number of prefixes:
                   Problematic   Non-Prob   Share problematic
ISP-B1->ISP-B2     66            55,968     0.12%
ISP-B2->ISP-B1     38            60,437     0.06%
24
Combining the clue table of several neighbors
• A clue table at each interface
• Advanced method: behavior depends on which neighbor the clue came from
Three solutions:
– Intersection
  • No search only if all neighbors agree that no further search is necessary
– Bit map
  • Join the clues of the neighbors in one table
  • Add a bit map to each clue entry: the j-th bit is set if further search is necessary when the clue came from the j-th neighbor
– Sub-tables
  • One table for the intersection: the clues that behave the same
  • A specific table per neighbor for the other clues
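The bit map solution can be sketched as below. The prefix sets of the two neighbors and all function names are hypothetical, and the problematic-clue test is the naive restatement of the Claim from the advanced method, not the authors' code.

```python
def is_problematic(clue, sender_prefixes, receiver_prefixes):
    """True if some receiver prefix below the clue has no sender prefix
    on the path between the clue (exclusive) and itself (inclusive)."""
    return any(
        len(p) > len(clue) and p.startswith(clue) and not any(
            len(clue) < len(q) <= len(p) and p.startswith(q)
            for q in sender_prefixes
        )
        for p in receiver_prefixes
    )


def build_bitmap_table(neighbors, receiver_prefixes):
    """Join the clues of all neighbors; bit j marks 'search needed' for
    clues arriving from neighbor j."""
    table = {}
    for j, sender in enumerate(neighbors):
        for clue in sender:
            bitmap = table.get(clue, 0)
            if is_problematic(clue, sender, receiver_prefixes):
                bitmap |= 1 << j
            table[clue] = bitmap
    return table


def needs_search(table, clue, j):
    return bool((table[clue] >> j) & 1)


# Hypothetical prefix sets; "" stands for the default prefix *.
RECEIVER = {"", "00", "0001", "00010", "0101", "011", "1"}
N0 = {"", "00", "0001", "01", "1"}
N1 = {"", "00", "0001", "00010", "01", "0101", "011", "1"}

table = build_bitmap_table([N0, N1], RECEIVER)
print(needs_search(table, "01", 0), needs_search(table, "01", 1))
```

The same clue `01*` requires a search when it comes from the first neighbor but not from the second, which is exactly the per-neighbor behavior the bit map records. The intersection solution corresponds to treating a clue as search-free only when its bit map is zero.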
25
Method of continued search
Trie, Patricia [S93, LS88], binary and 6-way search [LSV98], log W [WVTP97]
[Figure: trie at R2 showing the clue from R1, the prefixes of R1 and the prefixes of R2]
26
Experimental results
AT&T1 -> AT&T2
Memory references (Mem refs):
           Trie      Patricia   Binary    6-Way     LogW
Common     23.5899   21         17        7         3.448
Simple     2.0801    2.0565     2.1189    2.0456    3.0446
Advanced   1.0552    1.0442     1.049     1.019     1.0619
27
Experimental results
AT&T2 -> AT&T1
Memory references (Mem refs):
           Trie      Patricia   Binary    6-Way     LogW
Common     23.241    19         16        6         3.6339
Simple     2.0583    2.0396     2.0947    2.0367    3.0566
Advanced   1.011     1.0011     1.0011    1.004     1.0029
28
Multi Protocol Label Switching
“Route once switch many”
29
CLUE & MPLS
• MPLS still needs an IP lookup
  – Data driven: at set up
  – Control driven: at aggregation point
The label here is the clue
[Figure: path R1, R2, R3, R4, each router labelled 10.0.0/24, branching to R5 and R6 with 10.0.0.1/25 and 10.0.0.0/25]
30
Conclusions
• Simplicity
• 10 times faster
• Switching speed without major changes (like MPLS, but without traffic engineering)
• Heterogeneous IP networks
• No coordination
• No setup time
• Helps MPLS too
31
Future Work
• Clue with multidimensional classification
• Further experiments