encoding short ranges in tcam without expansion: efficient … · 2016-06-26 · –short tcp/udp...
TRANSCRIPT
Encoding Short Ranges in TCAM Without Expansion: Efficient Algorithm and Applications
Yotam HarcholThe Hebrew University of Jerusalem, Israel
Joint work with:
This research was supported by the European Research Council under the European Union’s Seventh Framework Programme (FP7/2007-2013)/ERC Grant agreement no 259085.
Anat Bremler-Barr David Hay Yacov Hel-OrIDC Herzliya,
IsraelHebrew University,
IsraelIDC Herzliya,
Israel
To appear in ACM SPAA 2016
TCAM – Ternary Content Addressable Memory
• TCAM is an associative memory module– Useful for parallel multidimensional prefix lookup
• Each entry may consist of 0, 1, and * bits
• Widely used in high-speed networking devices
2
P1
P2
Drop
P3
SRAM
011101001110001011
00110****110001***0111*****1100010**00*******110******01*******11000****
TCAM
1
0
1
2
3
4
5
0
1
2
3
4
5
Example: 2-dimensional IP lookup:
Source IP Dest. IP
What about ranges?(e.g., port numbers)
Range Encoding in Binary Representation
• Can ternary encode only certain intervals
• Or we need more than one entry (expansion)
• Exponential space blow-up when using multiple range fields3
00000
10001
20010
30011
40100
50101
60110
70111
81000
91001
101010
111011
121100
131101
141110
151111
01**
0***
00**
????
011*100*
????
0101011*100*1010
Encoding Short Ranges
• Ranges of arbitrary length – infeasible without expansion
• Ranges with bounded length – feasible!
• Useful for:– Short TCP/UDP ranges (>60% of the ranges in real life)
– QoS mechanisms (IP ToS/DSCP)
– Packet size classification (by categories)
– Timestamps and counters
– IP spoofing detection (using IP TTL)
– SDX – AS numbering
• Also useful for applications outside the networking domain4
Range Encoding on TCAM: Two Approaches
Lower bound for w-bits field without expansion: 2w-1 bits per range [Lakshminarayanan et al., 2005]
5
fenc([x, y]) = *1**1001**10
**1*10**101*
*1**100***1*
DBfenc( , [x, y]) = *1**100*10**1*10*01*
Database-independent encoding:
Database-dependent encoding:
Easy to update
Compact codes
Both schemes require expansion for a feasible code length
[Lakshminarayanan et al., 2005], [Bremler Barr et al., 2007]
[Liu, 2002], [Van Lunteren & Engbersen, 2003], [Chang & Su, 2007] , [Che et al., 2008], [Bremler Barr et al., 2009], [Rottenstreich & Keslassy, 2010], [Rottenstreich et al., 2013], [Kogan et al., 2014]
RENÉ – Range Encoding with No Expansion
6
fenc([x, y]) = *1**1001**10
Database independent - easy to update
No row expansion
Near-optimal TCAM space usage
Encoding function for short ranges:
Useful for packet classification applications
Useful for high dimensional nearest neighbor search
Binary-Reflected Gray Code (BRGC)
Hamming distance between each two adjacent points = 1
7
0
0
0
0
0
0 0 0
1
1
1 1
1 1 1 1
00000
10001
20011
30010
40110
50111
60101
70100
1
0 0
0
1 1
81100
91101
101111
111110
121010
131011
141001
151000
0 0 0 01 1 1 1
0
0
0
0
0 0 0
1
1 1
1 1 1 1
0000
1001
2011
3010
4110
5111
6101
7100
0
0 0
1
1 1
000
101
211
310
0 1
0 1
Build recursively by reflecting binary code:
Range Encoding withBinary-Reflected Gray Code
• With ternary BRGC we can encode some of the ranges of length h=2k (k∈N)
• What about other (red) ranges?
8
00**
00000
1
0001
20011
3
0010
40110
5
0111
60101
7
0100
81100
9
1101
101111
11
1110
121010
13
1011
141001
15
1000
01** 11** 10**
0*1* *10* 1*1* *00**00**
Layers
• Layer: a set of disjoint consecutive ranges
• Two (blue) layers can be encoded with a BRGC-based ternary word
– Other (red) layers need more bits
9
00** 01** 11** 10**
0*1* *10* 1*1* *00**00**
00000
1
0001
20011
3
0010
40110
5
0111
60101
7
0100
81100
9
1101
101111
11
1110
121010
13
1011
141001
15
1000
Cover Ranges
• Cover range of R: the smallest blue range that fully contains R
– In red layers: No two consecutive ranges in the same layer are fully contained in a cover range
Given the cover range, one bit is enough to differentiate ranges in the same layer
10
R
0***cover(R)
0 1 0
00000
1
0001
20011
3
0010
40110
5
0111
60101
7
0100
81100
9
1101
101111
11
1110
121010
13
1011
141001
15
1000
Single Bit Range Index
• Red ranges require different encoding
• Add one bit for each red layer– Bit value alternates between ranges in the same layer
– Bits of other red layers are set to *
– For blue layers, these bits are always *
11
00000
1
0001
20011
3
0010
40110
5
0111
60101
7
0100
81100
9
1101
101111
11
1110
121010
13
1011
141001
15
1000
0* 1* 0* 1* 0*
*1 *1*0*0 *0
** ******
** ****** **
0* 1* 0* 1* 0*
*1 *1*0*0 *0
Encoding Scheme for Ranges
• Encoding of a single range of length h=2k is:
• Value encoding is now:
12
Ternary BRGCBit of
L1. . .
Bit ofLh/2-1
w bits h-2 bits
Bit ofLh-1
Bit ofLh/2+1
. . .Blue ranges: BRGC(R)
Red ranges: BRGC(cover(R))
BRGCBit of
L1. . .
Bit ofLh/2-1
w bits h-2 bits
Bit ofLh-1
Bit ofLh/2+1
. . .
Toy Example
13
Optimization
• For ranges of length h=2k
k-1 LSBs of the BRGC part are always *
• Code can be shorten to w-log(h)+h-1 bits
14
00000|00
1
0001|10
20011|10
3
0010|11
40110|11
5
0111|01
60101|01
7
0100|00
81100|00
9
1101|10
101111|10
11
1110|11
121010|11
13
1011|01
141001|01
15
1000|00
00**|** 01**|** 11**|** 10**|**
0***|1* *1**|0* 1***|1* *0**|0**0**|0*
0*1*|** *10*|** 1*1*|** *00*|***00*|**
0***|*1 *1**|*0 1***|*1 *0**|*0*0**|*0
00000|00
1
0001|10
20011|10
3
0010|11
40110|11
5
0111|01
60101|01
7
0100|00
81100|00
9
1101|10
101111|10
11
1110|11
121010|11
13
1011|01
141001|01
15
1000|00
00**|** 01**|** 11**|** 10**|**
0***|1* *1**|0* 1***|1* *0**|0**0**|0*
0*1*|** *10*|** 1*1*|** *00*|***00*|**
0***|*1 *1**|*0 1***|*1 *0**|*0*0**|*0
Encoding Multiple Range Lengths
• For ranges of length 3, intersect two ranges of length 4:
15
00*|**
0000|00
1
000|10
2001|10
3
001|11
4011|11
5
011|01
6010|01
7
010|00
8110|00
9
110|10
10111|10
11
111|11
12101|11
13
101|01
14100|01
15
100|00
01*|** 11*|** 10*|**
0**|1* *1*|0* 1**|1*
0*1|** *10|** 1*1|**
0**|*1 *1*|*0 1**|*1
*0*|0*
*00|**
*0*|*0*0*|*0
*00|**
*0*|0*
5 7
01*|**
*1*|0*
4 7
5 8
01*|** Π *1*|0* = 01*|0*
We present the conjunction operator Π:
For two ranges R1, R2: tcode(v) ≈ tcode(R1) Π tcode(R2) if and only if v∈R1 ∩ R2
(also: tcode(R1) Π tcode(R2) = ⊥ R1 ∩ R2 = ∅)
(⊥ means ‘undefined’, and if ai Π bi = ⊥ then a Π b = ⊥)
Theoretical Bound
16
TCA
M b
its
use
d p
er r
ange
fie
ld(l
og 2
scal
e)
Maximal range length(log2 scale)
For w=16
The Nearest Neighbor Search Problem
• Input:- A set of data points- A query point (or a series of those)
(points are in a discrete space)
• Output:Data point closest to the query point
• The Curse of Dimensionality: Hard problem for high dimensions(even over 10 or so)
17
The Nearest Neighbor Search Problem
• Encode cubes on data points
– Given a query point, single TCAM lookup returns the smallest cube that contains it
– No TCAM entry expansion in high dimension
18
DataQuery
The Nearest Neighbor Search Problem
• Or instead, save TCAM space
– Encode cubes on the query point
– Query TCAM with growing cubes
– Find first data point to match
19
DataQuery
Experiment on a Real TCAM
• Currently - no evaluation board for TCAM
• Instead, we used a commodity network switch with a TCAM– Switch has 48 ports of 1Gbps each (1.5 Million packet per second)
– TCAM is 192 bits wide
20
00*1**001***011***01*1**00*1**00*1**00****0***1*0****1
Src. MAC Dest. MAC Src. IP Dest. IP
OpenFlowflow_mod
Queriesas packets
OpenFlowCounters
d-Dimensional Cube Representation:
Single port:1.5 Million Queries
Per Second!
1st dimension 2nd dimension 3rd dimension 4th dimension 5th dimension 6th dimension 7th dimension
Conclusions
21
Database independent - easy to update
No row expansion
Near-optimal TCAM space usage
Encoding function for short ranges:
Useful for packet classification applications
Useful for high dimensional nearest neighbor search
fenc([x, y]) = *1**1001**10
Questions?
Thank you.
22