overview filtering architecturespaul_o/courses/filters_arch.pdf · cambridge university press,...

42
Filtering Architectures Olivier Paul LOR department Overview Introduction Network level Packet Filtering. Packet classification Fragmentation Circuit level Packet Filtering Stateful filtering. TCP stream reassembly. Directing traffic to filters Application level Packet Filtering Protocol parsing. Pattern matching. Encryption. Packet filter examples. A few references …That helped me setting up this course. Design and Performance of the OpenBSD Stateful Packet Filter. D. Hartmeier Freenix 2002. Real Stateful TCP Packet Filtering in IP Filter, Guido van Rooij, Usenix 01, 2001. Algorithms for routing lookups and packet classification, Pankaj Gupta, PhD Thesis, Stanford, 2000. Flexible pattern matching in strings, Navarro, Raffinot, Cambridge University Press, 2002. Handbook of exact string matching algorithms, Charras, Lecroq, King’s college London publications, 2004. PF: http://www.benzedrine.cx/pf.html IPFILTER: http://coombs.anu.edu.au/ipfilter/ NETFILTER: http://www.netfilter.org/ Introduction Application Architectures Network Circuit Network level Filter • Reminder Introduction Application Architectures Network Circuit

Upload: others

Post on 09-Aug-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Overview Filtering Architecturespaul_o/Courses/filters_arch.pdf · Cambridge University Press, 2002. • Handbook of exact string matching algorithms, Charras, ... • Hierarchical

Filtering Architectures

Olivier Paul

LOR department

Overview

• Introduction• Network level Packet Filtering.

– Packet classification– Fragmentation

• Circuit level Packet Filtering– Stateful filtering.– TCP stream reassembly.– Directing traffic to filters

• Application level Packet Filtering– Protocol parsing.– Pattern matching.– Encryption.

• Packet filter examples.

A few references

• …That helped me setting up this course.• Design and Performance of the OpenBSD Stateful Packet

Filter. D. Hartmeier Freenix 2002.• Real Stateful TCP Packet Filtering in IP Filter, Guido van

Rooij, Usenix 01, 2001.• Algorithms for routing lookups and packet classification,

Pankaj Gupta, PhD Thesis, Stanford, 2000.• Flexible pattern matching in strings, Navarro, Raffinot,

Cambridge University Press, 2002.• Handbook of exact string matching algorithms, Charras,

Lecroq, King’s college London publications, 2004.• PF: http://www.benzedrine.cx/pf.html• IPFILTER: http://coombs.anu.edu.au/ipfilter/• NETFILTER: http://www.netfilter.org/

Introduction

ApplicationArchitectures

NetworkCircuit

Network level Filter

• Reminder

Introduction

ApplicationArchitectures

NetworkCircuit

Page 2: Overview Filtering Architecturespaul_o/Courses/filters_arch.pdf · Cambridge University Press, 2002. • Handbook of exact string matching algorithms, Charras, ... • Hierarchical

Introduction

• What packet filtering does ?– Provide a form of access control:

• a security service used to protect resources against unauthorized use.

– Decide if a “communication” is allowed according to packet content.

– How to relate packet filtering to access control?• Resources implies a form a resources identification

– Network services = resources.

• Unauthorized implies some form of user identification– Network address = users.

• Degree of trust in those addresses can change widely depending on the location of the filtering device.

Introduction

ApplicationArchitectures

NetworkCircuit

Introduction

• Packet filtering cons:– Communication authorization not based on strong authentication

information.• IP address can be changed (User, NAT).

– Little user related information.• One address = many users.

– Little resource related information.• No strong link between service and port number.

• Some port number are attributed dynamically.

• (We’ll see more about this later).

Introduction

ApplicationArchitectures

NetworkCircuit

Introduction

• Packet filtering pros:– Simplicity.

• Easy to manage.– Does not require authentication management or deployment in applications.

– “Single” point of service.

• Widely available (no standard interoperability problem) and cheap.

– Fast.• Authentication: up to 1 Gbit/s.

• Packet filtering: up to 100 Gbit/s.

– Sole solution to some specific problems.• Only solution to discard information before authentication.

Introduction

ApplicationArchitectures

NetworkCircuit

Introduction

• Where can you find packet filters ?– Security devices:

• Firewalls.

• VPN gateway.

– Network devices.• Switches.

• Routers.

• ADSL/Cable modems.

– End systems• Most UNIX's.

• Windows 2000-.

Introduction

ApplicationArchitectures

NetworkCircuit

Page 3: Overview Filtering Architecturespaul_o/Courses/filters_arch.pdf · Cambridge University Press, 2002. • Handbook of exact string matching algorithms, Charras, ... • Hierarchical

Network level Filter

• Reminder

Introduction

ApplicationArchitectures

NetworkCircuit

Packet Classification

• Motivation– Differentiate traffic treatment

• QoS related functionality – CAR, Marking, …

– Policy based routing.

• Security related functionality – Access control lists, VPN, …

– Accounting and billing• Obtain Traffic sample

– Flow based, Packet based, …

Introduction

ApplicationArchitectures

NetworkCircuit

Classification

• Problem to solve– Using several fields in packet header

0 1 2 3

0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1

+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -+-+-+-+-+-+-+

|Version| IHL | Type of Service | Total Length |

+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -+-+-+-+-+-+-+

| Identification |Flags| Fragment Offset |

+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -+-+-+-+-+-+-+

| Time to Live | Protocol | Header Checksum |

+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -+-+-+-+-+-+-+

| Source Address |

+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -+-+-+-+-+-+-+

| Destination Address |

+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -+-+-+-+-+-+-+

| Options | Padding |

+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -+-+-+-+-+-+-+

Introduction

ApplicationArchitectures

NetworkCircuit

Classification

• Problem to solve– Using several fields in packet header ...

0 1 2 3

0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1

+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -+-+-+-+-+-+-+

| Source Port | Destination Port |

+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -+-+-+-+-+-+-+

| Sequence Number |

+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -+-+-+-+-+-+-+

| Acknowledgment Number |

+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -+-+-+-+-+-+-+

| Data | | U|A|P|R|S|F | |

| Offset| Reserved | R|C|S|S|Y|I | Window |

| | | G|K|H|T|N|N | |

+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -+-+-+-+-+-+-+

| Checksum | Urgent Pointer |

……………………………

Introduction

ApplicationArchitectures

NetworkCircuit

Page 4: Overview Filtering Architecturespaul_o/Courses/filters_arch.pdf · Cambridge University Press, 2002. • Handbook of exact string matching algorithms, Charras, ... • Hierarchical

Classification

• Problem to solve– … and a policy carrying on these fields ...

• Example

– … Decide which action applies to packet.• Action depends on application domain.

– Example. permit, deny, rate limit to 1Mb/s, mark yellow, encrypt, ...

access-list 101 permit ip host 192.168.200.1 host 192.168.204.2access-list 101 permit ip host 192.168.203.1 host 192.168.203.2

Introduction

ApplicationArchitectures

NetworkCircuit

Classification

• In a more formal way:Given a Policy P

P = {R0, … Ri, … Rn}

Ri = (Oi,Ci,Ai),

Ci = {Ci0, … Cij, …, Cip}

Cij = (Lij, Uij)

And given a Packet T

T = {V 0,…Vp}

Find Rk ∈ P such that ∀ e ∈ [0..p],

Lke < Ve < Uke

And there is no k’ such that

Ok’ < Ok

Introduction

ApplicationArchitectures

NetworkCircuit

Classification

find the rectangleto which the point belongs to

• The good news is:– Problem is related to computational geometry problem.

Point location problem (2D version)

Introduction

ApplicationArchitectures

NetworkCircuit

Classification

• The bad news is:– Problem does not have a nice solution in the general case

• Best solution in term of temporal complexity:

TC in O(log n), SC in O(nk)

• Best solution in term of spatial complexity:

SC in O(n), TC in O(log k-1 n)

n k Spatial Complexity Temporal Complexity10 5 10 243

100 5 100 168071000 5 1000 100000

n k Spatial Complexity Temporal Complexity10 5 100000 3

100 5 10. E9 71000 5 1.E15 10

• Overmars, Van der Stappen [Journal of Algorithms 96]

Introduction

ApplicationArchitectures

NetworkCircuit

Page 5: Overview Filtering Architecturespaul_o/Courses/filters_arch.pdf · Cambridge University Press, 2002. • Handbook of exact string matching algorithms, Charras, ... • Hierarchical

Classification

• Usual complexities– Firewall:

• 5-6 fields.

• 2k rules.

– Microflows (e.g. RSVP - edge):• 4 fields.

• 128k-1M rules.

– Filtering router:• 5 fields.

• 16-128k rules.

Introduction

ApplicationArchitectures

NetworkCircuit

Address lookup

Dest. Addr Mask Outgoing Port139.100.198.23 /32 5139.100.198.0 /24 4139.100.0.0 /16 1

139.100.198.23 139.100.198/24

139.100/16

51

4

A

B

C

ABC

• Longest match

If a packet satisfies C then it also satisfies B and AWhat we want is to find the strongest condition on the dest. address

Packet, @S, @D

Introduction

ApplicationArchitectures

NetworkCircuit

Address lookup

• Interesting question.– Is the lookup problem a sub-problem of packet classification ?

• Answer is yes.– Address A and Netmask M can be coded as a range

• Cij = (Lij, Uij), Lij = (A AND M), Uij = ((A AND M) AND (NOT M))

– Prefix length can be coded as priority.• Mask(Ai) > Mask(Aj) => O(Ai) < O(Aj)

– Action can be coded as output port.

Introduction

ApplicationArchitectures

NetworkCircuit

Packet Classification

• Practical methods– Linear search.

– Ternary CAM.

• Practical architectures– Caching.

• Other methods– Lookup methods extensions.

• Grid of tries [Srin99]

• Fat Inverted Segment Tree [Fel00]

– Parallelization:• Bitmap intersection [Lak98]

– Heuristics:• Recursive Flow classification [Gup99]

• Hierarchical Intelligent Cuttings [Gup99]

– General improvements:• Heap on trie [Gup00]

– Specific instances of the problem.

– Many more...

Introduction

ApplicationArchitectures

NetworkCircuit

Page 6: Overview Filtering Architecturespaul_o/Courses/filters_arch.pdf · Cambridge University Press, 2002. • Handbook of exact string matching algorithms, Charras, ... • Hierarchical

Packet Classification

• Practical methods– Linear search.

– Ternary CAM.

• Practical architectures– Caching.

• Other methods– Lookup methods extensions.

• Grid of tries [Srin99]

• Fat Inverted Segment Tree [Fel00]

– Parallelization:• Bitmap intersection [Lak98]

– Heuristics:• Recursive Flow classification [Gup99]

• Hierarchical Intelligent Cuttings [Gup99]

– General improvements:• Heap on trie [Gup00]

– Specific instances of the problem.

– Many more...

Introduction

ApplicationArchitectures

NetworkCircuit

Linear Search

• CT : O(n.d), CS : O(n.d), insertion O(1).

If @S=A and @D=C and SP=C and DP=D then permitIf @S=A and @D=B and SP=C and DP=E then denyIf @S=A and @D=B then permitIf @S=A and @D=C then deny

Flow

A C C C

Deny

Introduction

ApplicationArchitectures

NetworkCircuit

Content Addressable Memory

00d003eae45f

CAMlocation

SRAMResult

|location|

Match/ No match

• CAM – Extension of lookup answer.

Introduction

ApplicationArchitectures

NetworkCircuit

Content Addressable Memory

• Internal structure

00d003eae40000d003eae40100d003eae43d00d003eae44e00d003eae45f

00d003eae45f

...

Memory content

OR

=?=?=?=?=?

=?

CAM

match/no match

Encode Location

Introduction

ApplicationArchitectures

NetworkCircuit

Page 7: Overview Filtering Architecturespaul_o/Courses/filters_arch.pdf · Cambridge University Press, 2002. • Handbook of exact string matching algorithms, Charras, ... • Hierarchical

Ternary CAM

TCAMxbits bitspattern

Address

x bits pattern PolicyAllowing joker bits.

1?000?10?111?0?

• Ternary CAM behaviour

Introduction

ApplicationArchitectures

NetworkCircuit

• TCAM limitation– Problem: Only work with dots (x-tuple) not ranges

• Sub-Problem: TCAM use masks, rules use ranges.

• Example a single access list using port range 1024-1028

0000 0010 0000 0000

0000 0010 0000 0001 0000 0010 0000 00??

0000 0010 0000 0010 0000 0010 0000 0?00

0000 0010 0000 0011

0000 0010 0000 0100

• Generates 2 entries in TCAM !

Ternary CAM

Introduction

ApplicationArchitectures

NetworkCircuit

Ternary CAM

• TCAM limitation– Problem: Gets worse when concatenating patterns:

– E.g.: 10<x<14, 10<y<14x y

101? 101?

110? 110? 9 different patterns

1110 1110

– Up to (2w-2) d resulting patterns.

– Known as prefix blow-up.

Introduction

ApplicationArchitectures

NetworkCircuit

Content Addressable Memory

• CAM advantages:– Fast: 6-10ns access time.

– Simple.

– Multi-fields.

• CAM limitations: – Chip complexity: Grows linearly with

number of entries.

– Power consumption: 7-15w/MB (SRAM 0.25w/MB)

– Size: 2MB (SRAM ~8-16MB)

– Cost: 100$/MB (SRAM ~5$/MB)

– Number of entries (Size/pattern size)

– Pattern size (0-~500 bits)

Introduction

ApplicationArchitectures

NetworkCircuit

Page 8: Overview Filtering Architecturespaul_o/Courses/filters_arch.pdf · Cambridge University Press, 2002. • Handbook of exact string matching algorithms, Charras, ... • Hierarchical

Packet Classification

• Practical methods– Linear search.

– Ternary CAM.

• Practical architectures– Caching.

• Other methods– Lookup methods extensions.

• Grid of tries [Srin99]

• Fat Inverted Segment Tree [Fel00]

– Parallelization:• Bitmap intersection [Lak98]

– Heuristics:• Recursive Flow classification [Gup99]

• Hierarchical Intelligent Cuttings [Gup99]

– General improvements:• Heap on trie [Gup00]

– Specific instances of the problem.

– Many more...

Introduction

ApplicationArchitectures

NetworkCircuit

Caching

• Take advantage of temporal redundancies.

Rule Classifier

SlowMemory

CacheMemory

Flow@S,@D,PS,PD

Hit

Miss

Update (LRU)

Hit

Introduction

ApplicationArchitectures

NetworkCircuit

Caching

• Why does it work?– Temporal redundancies:

• Packets often belong to a flow– Set of packet with similar characteristics.

» (Source, Destination) x (Address, Port).

• Knowing which action applies to the first packet provides the answer to any following packet with same characteristics.

– In practice, filtering devices often see the same communications:• Ex: permit tcp src-ip our_net dst-ip any src-port >1024 dst -port 80

1. Often apply the same rules.

2. Rules often apply to the same fields.

Introduction

ApplicationArchitectures

NetworkCircuit

Caching

• What type of cache ?– Content:

• Rules: (packet classification algorithm unchanged).– Latest rules are kept in faster memory.

» Registers, L1 cache, L2 cache…

• Instantiated rules: (packet classification is changed).– Latest connection are kept in faster memory.

– Size.

– Update algorithm.• LRU.

Introduction

ApplicationArchitectures

NetworkCircuit

Page 9: Overview Filtering Architecturespaul_o/Courses/filters_arch.pdf · Cambridge University Press, 2002. • Handbook of exact string matching algorithms, Charras, ... • Hierarchical

Packet Classification

• Practical methods– Linear search.

– Ternary CAM.

• Other methods– Lookup methods extensions.

• Grid of tries [Srin99]

• Fat Inverted Segment Tree [Fel00]

– Parallelization:• Bitmap intersection [Lak98]

– Heuristics:• Recursive Flow classification [Gup99]

• Hierarchical Intelligent Cuttings [Gup99]

– General improvements:• Heap on trie [Gup00]

– Specific instances of the problem.

– Many more...

• Practical architectures– Caching.

Introduction

ApplicationArchitectures

NetworkCircuit Multi-dimensional

radix tries• Binary trie - base-2 radix trie.

– Mono-dimensional search structure.• Vertexes carry 1 bit value, leaves carry stored values.

• Searching the trie = finding a suite of vertexes that lead to a valid leaf.

– Example:

Rule Source AddrR1 101*R2 10*R3 1*R4 01*

R3

R2

R4R1

Introduction

ApplicationArchitectures

NetworkCircuit

Multi-dimensional radix tries

• Multidimensional tries.– Extension to binary tries.

• Pb: In fields 1, 2, …we may have overlap between intervals.

• Example

Rule Source Addr Dest. AddrR1 101* 01*R2 10* 10*R3 1* 0*R4 01* 111*

• 101* ⊂10* ⊂ 1*

001 010 011 100 101 110111

001

010

011

100

101

110

111

ExtensionsExamples

Introduction

Conclusion

TheoryIssues

Exist. App.Related Pb.

Introduction

ApplicationArchitectures

NetworkCircuit Multi-dimensional

radix tries• Superset propagation.

– If I i,k ⊂ I j,k then create Tk+1 using Ii,k+1 and Ij,k+1

– Last match is the good one.

R3

R2

R4 R1

Rule Source Addr Dest. AddrR1 101* 01*R2 10* 10*R3 1* 0*R4 01* 111*

R4 R2

R3R3

R2

R3

R1

0

1

*

*

Ex: 101, 000

Introduction

ApplicationArchitectures

NetworkCircuit

Page 10: Overview Filtering Architecturespaul_o/Courses/filters_arch.pdf · Cambridge University Press, 2002. • Handbook of exact string matching algorithms, Charras, ... • Hierarchical

Multi-dimensional radix tries

Rule Source Addr Dest. AddrR1 101* 01*R2 10* 10*R3 1* 0*R4 01* 111*

R3R2

R4R1

R4

R1

R3

R2

Depth first search and backtrack in the first trie if no match to explore every possibilityCost: O(w2)

• Backtracking

Ex: 101, 000

Introduction

ApplicationArchitectures

NetworkCircuit

Grid of Tries

Rule Source Addr Dest. AddrR1 101* 01*R2 10* 10*R3 1* 0*R4 01* 111*

R3R2

R4R1

R4

R1

R3

R2

00

1Using shortcuts

Time Complexity O(w)Memory Complexity O(N . w)

Would that work for larger dimensions ?

• Two dimensional structure.– Idea: Use shortcut to prevent backtracking.

• Srinivasan & al. [SIGCOMM98]

Ex: 101, 000

Introduction

ApplicationArchitectures

NetworkCircuit

Packet Classification

• Practical methods– Linear search.

– Ternary CAM.

• Other methods– Lookup methods extensions.

• Grid of tries [Srin99]

• Fat Inverted Segment Tree [Fel00]

– Parallelization:• Bitmap intersection [Lak98]

– Heuristics:• Recursive Flow classification [Gup99]

• Hierarchical Intelligent Cuttings [Gup99]

– General improvements:• Heap on trie [Gup00]

– Specific instances of the problem.

– Many more...

• Practical architectures– Caching.

Introduction

ApplicationArchitectures

NetworkCircuit

R0

Bitmap Intersection

• Ex: Laksham, Stiliadis [SIGCOMM98], TC : O(log(2n+1)), SC : O(d.n2)).

110 111&

110

010

110

010

011

001

000

000 001 101 111 011 010

2n + 1 n fields array R1

Log(n) +1

Log(2n+1) +1Binary Tree

Search

Flow

R1

R2

X

Introduction

ApplicationArchitectures

NetworkCircuit

Page 11: Overview Filtering Architecturespaul_o/Courses/filters_arch.pdf · Cambridge University Press, 2002. • Handbook of exact string matching algorithms, Charras, ... • Hierarchical

Bitmap Intersection

000

01 11 10

Original

New

001 101 111 011 010000

11 01

• Optimizing bitmap storage.– 2N+1 changes in the bitmap in each dimension at most.

– Usually 1 bit change between 2 adjacent intervals.

– Idea: • Store original bitmap and then the location of the bit that changed.

• Bit changed can be stored on log(N) bits (since the size of each bitmap is N).

– Example:

Introduction

ApplicationArchitectures

NetworkCircuit

Packet Classification

• Practical methods– Linear search.

– Ternary CAM.

• Other methods– Lookup methods extensions.

• Grid of tries [Srin99]

• Fat Inverted Segment Tree [Fel00]

– Parallelization:• Bitmap intersection [Lak98]

– Heuristics:• Recursive Flow classification [Gup99]

• Hierarchical Intelligent Cuttings [Gup99]

– General improvements:• Heap on trie [Gup00]

– Specific instances of the problem.

– Many more...

• Practical architectures– Caching.

Introduction

ApplicationArchitectures

NetworkCircuit

Recursive Flow Classification

• Gupta, McKeown [SIGCOMM99]

If @S=A ET @D=B ET PS=C et PD=D

If @S=B ET @D=A ET PS=D et PD=C

If @S=A ET @D=B ET PS=F et PD=D

If @S=B ET @D=A ET PS=D et PD=F

0 0 00 00

1 1 01 01

0 0 10 00

1 1 01 10

0 0 00 00

1 1 01 01

0 0 10 00

1 1 01 10

0 00

1 01

0 10

1 11

00

01

10

11

Introduction

ApplicationArchitectures

NetworkCircuit Recursive Flow

Classification

• TC : O(1), SC : ?, Insertion 10s

A

B

0

1

A

B

1

0

@S @D

CF

00

10

PS

D 01

CF

01

10

PD

D 0

packet (A,B,F,D)

00

11

0

1

@

0000

0110

0001

P

1011

10000101

000

111

0001

R

1011

010101Step 1

Step 2

Etape 3

0

0

10

00

0

10

10 R2R2

Introduction

ApplicationArchitectures

NetworkCircuit

Page 12: Overview Filtering Architecturespaul_o/Courses/filters_arch.pdf · Cambridge University Press, 2002. • Handbook of exact string matching algorithms, Charras, ... • Hierarchical

Network level Filter

• Reminder

Introduction

ApplicationArchitectures

NetworkCircuit

Fragmentation Management

• Without state maintenance.– RFC 1858.

– RFC 3128.

• With state maintenance.– With reassembly

• RFC 791.

• RFC 815.

– Without reassembly

Introduction

ApplicationArchitectures

NetworkCircuit

Fragmentation Management

• RFC 3128 (eg: most routers).– Filters packets depending on the packet characteristics without keeping a

state.

– IP packets with less than 8 bytes of data have to be dropped.• 8 bytes is the minimal size in RFC 791.

• Prevents inability to classify first IP packet due to lack of information.

– TCP packets with 8 bytes of data have to be dropped.• Prevents packets without TCP flag information.

– TCP packets with offset=1 (8 bytes offset).• Prevents flag information to be reassembled differently between packet filter and

end host.

Introduction

ApplicationArchitectures

NetworkCircuit

Fragmentation Management

• RFC 3128(eg: most routers).– Pro:

• Cheap: Does not necessitate state maintenance in the packet filter.

• Efficient: Insures that transport level information is considered in a non ambiguous way.

– Cons:• Based on the assumption that end host will drop packets when the transport

information cannot be obtained.– Some packets are not filtered.

» Ex: Send only fragments with Offset>1, Data size > 8 and no transport information.

– Some attacks can take advantage vulnerable reassembly mechanisms:

» DoS attacks on memory.

» Buffer overflows using offset larger than 65kBytes.

Introduction

ApplicationArchitectures

NetworkCircuit

Page 13: Overview Filtering Architecturespaul_o/Courses/filters_arch.pdf · Cambridge University Press, 2002. • Handbook of exact string matching algorithms, Charras, ... • Hierarchical

Fragmentation Management

IP

TL=1000

M=0

OFF=0

ID=1234

Deny

TCP

Permit

TCPIP

TL=56

M=1

OFF=20

ID=1234

Deny

TCPIP

TL=56

M=1

OFF=20

ID=1234

• RFC 3128 (eg: most routers).– Cons:

• Application level information can still be reassembled ambiguously.

Windows NT BSD

IP

TL=1000

M=0

OFF=0

ID=1234

Permit

TCP

Permit

TCPIP

TL=56

M=1

OFF=20

ID=1234

Deny

TCPIP

TL=56

M=1

OFF=20

ID=1234

Introduction

ApplicationArchitectures

NetworkCircuit

Fragmentation Management

• With reassembly (eg: pf).RFC 815. – Reassemble context lookup.

Reassemble Context

[Src_addr,Dst_addr,Proto,ID]

Introduction

ApplicationArchitectures

NetworkCircuit

Fragmentation Management

• With reassembly (eg: pf). RFC 815.

Frag->first=OFF*8Frag->last=OFF*8+TLFrag->M

64 kbytes

Hole->first=64324Hole->last=65324Hole->previousHole->next

Hole->first=512Hole->last=1024Hole->previousHole->next

512 1024 64324 65324

Fragments can be stored in buffer(fragment size>Hole struct)

Introduction

ApplicationArchitectures

NetworkCircuit

Fragmentation Management

Introduction

ApplicationArchitectures

NetworkCircuit

• With reassembly (eg: pf). RFC 815.For(hole = hole_head; hole != NULL; hole = hole->ne xt) {

if (frag->first>hole->last) continue;

if (frag->last<hole->first) continue;

hole_tmp = new(HOLE);

if (frag->first>hole->first) {

hole_tmp->first = hole->first;

hole_tmp->last = frag->last-1;

}

if (frag->last<hole->last) {

hole_tmp->first = fragment->last+1;

hole_tmp->last = hole->last;

}

attach(hole_tmp); free(hole);

}

Page 14: Overview Filtering Architecturespaul_o/Courses/filters_arch.pdf · Cambridge University Press, 2002. • Handbook of exact string matching algorithms, Charras, ... • Hierarchical

Fragmentation Management

• Without reassembly (eg: ipfilter)– Drop fragments as long as no transport level information is in the cache.

– Drop fragments that do not comply with RFC 3128.

– When receiving fragment with full transport level information store transport information and fragmentation information in the cache.

– Fragments are then matched to the content of the cache.• Verify that fragments do not overlap – crop/drop overlapping fragments.

• Verify that fragments are not duplicated – drop duplicates.

• Verify that fragments will not take advantage of known vulnerabilities.

Introduction

ApplicationArchitectures

NetworkCircuit

Fragmentation Management

• Without reassembly (eg: ipfilter)– Pros

• Prevents some forms of attack.

• More resilient to DoS attacks.

– Cons• Does not insure a non ambiguous reassembly at the end (depends on amount of

cropping done in the filter).

• Assumes reordering is always illegitimate.– Packets can be reordered naturally.

» Packets follow different paths in the network.

» Buggy IP stacks (e.g. linux 2.4) send fragments in reverse order.

Introduction

ApplicationArchitectures

NetworkCircuit

Circuit level Filter

• Reminder

Introduction

ApplicationArchitectures

NetworkCircuit

Circuit Level Filters

• Two main types of architectures:– Circuit level proxy.

• Usually implemented in user space.• Proxy acts as a server for the client, as a client for the server.• Usually not transparent.

– Stateful packet filter.• Extension to packet filters, implemented in the kernel.• Mostly Transparent.

– Similarities:• Try to avoid some attacks

– Packets/segment insertion/evasion.– Fingerprinting.

by using transport level information.and a state

• Can bind user with connection if associated with authentication.

Introduction

ApplicationArchitectures

NetworkCircuit

Page 15: Overview Filtering Architecturespaul_o/Courses/filters_arch.pdf · Cambridge University Press, 2002. • Handbook of exact string matching algorithms, Charras, ... • Hierarchical

Circuit Level Filters

– Circuit level proxy

TCP

IP

Proxy

PCB PCB

State

Incoming Connection

Outgoing Connection

Introduction

ApplicationArchitectures

NetworkCircuit

– Stateful packet filter

Stateful Filter

Connection

State

• State location

Circuit level Filter

• Reminder

Introduction

ApplicationArchitectures

NetworkCircuit

Stateful packet filtering

State automaton for IPTables

• State checking– What type of state are we talking about ?

• Connection exists between hosts.

• State automaton.– Maintain a state for each side.

– Reject packets that do not satisfy TCP state automaton transitions.

Introduction

ApplicationArchitectures

NetworkCircuit

Stateful packet filtering• State checking

– What type of state are we talking about ?• TCP Initial Sequence number exchange.

Client Server

SYN 4982/32

SYN, 1835/32 ACK 4983/32

ACK 1836/32

SYN 4982/32

SYN, 1835/32 ACK 4983/32

ACK 1836/32

Client Server

SYN 4982/32

SYN, 1835/32 ACK 4983/32

ACK 1836/32

SYN 88746/32

SYN, 589632/32 ACK 88747/32

ACK 589633/32

Accepted Rejected

ACK 4984/32 ACK 4984/32

Introduction

ApplicationArchitectures

NetworkCircuit

Page 16: Overview Filtering Architecturespaul_o/Courses/filters_arch.pdf · Cambridge University Press, 2002. • Handbook of exact string matching algorithms, Charras, ... • Hierarchical

Stateful packet filtering• State checking

– What type of state are we talking about ?• TCP Initial Sequence number evolution.

– Common practice up to 1995: RFC 793» Increase ISN by one every 4 microseconds.

– Alternative practice: 4.4BSD» Increase ISN by 64000 every 1/2 second.

• 1995: Steve Bellovin paper about connection Hijacking.– If ISN is easy to guess it is easy to find segments that can be accepted by the

receiver (and a firewall).– If ISN follows a well defined evolution pattern you can infer new ISN from

older ones.

• “Most” systems now use RNG to generate ISN.• However slightly increases the chance to have old packets accepted.

Introduction

ApplicationArchitectures

NetworkCircuit

Stateful packet filtering• State checking

– What type of state are we talking about ?• TCP Sequence number guessing.

Windows 98 SE

x[t] = seq[t] - seq[t-1]

y[t] = seq[t-1] - seq[t-2]

z[t] = seq[t-2] - seq[t-3]

OpenBSD 2.8

Author: Michal Zalewski

Introduction

ApplicationArchitectures

NetworkCircuit

Stateful packet filtering• State checking

– What type of state are we talking about ?• A reminder about TCP flow control mechanisms:

SYN J/32

SYN, K/32 ACK J+1/32

J+1:J+Ws+1/32,ACK K+1/32

K+1:K+W d+1/32 ACK J+Ws+1/32

J+Ws+1:J+2Ws +1/32,ACK K+Wd+1/32

K+W d+1:K+2Wd+1/32 ACK J+2Ws+1/32

...

ss ss+ns

ackdsd sd+nd

acks

Introduction

ApplicationArchitectures

NetworkCircuit

Stateful packet filtering• State checking

– What type of state are we talking about ? • Filtering process: Source side.

ss+ns <= max (ack d+Ws)

ss >= max (ack d) >= max(s s+ns) - max(W s)Data

Ackack d <= max (s s+ns)

ack d >= max (s s+ns) - MAXWIN

K+1:K+W d+1/32 ACK J+Ws+1/32

J+Ws+1:J+2Ws +1/32,ACK K+Wd+1/32

K+W d+1:K+2Wd+1/32 ACK J+2Ws+1/32

ss ss+ns

ackdsd sd+nd

acks

Introduction

ApplicationArchitectures

NetworkCircuit

Page 17: Overview Filtering Architecturespaul_o/Courses/filters_arch.pdf · Cambridge University Press, 2002. • Handbook of exact string matching algorithms, Charras, ... • Hierarchical

Stateful packet filtering

• State checking– What type of state are we talking about ?

• Filtering process: Destination side.

sd+nd <= max (ack s+Wd)

sd >= max (ack s) >= max(s d+nd) - max(W d)Data

Ackack s <= max (s d+nd)

ack s >= max (s d+nd) - MAXWIN

K+1:K+W d+1/32 ACK J+Ws+1/32

J+Ws+1:J+2Ws +1/32,ACK K+Wd+1/32

K+W d+1:K+2Wd+1/32 ACK J+2Ws+1/32

ss ss+ns

ackdsd sd+nd

acks

Introduction

ApplicationArchitectures

NetworkCircuit

Stateful packet filtering

• State checking– What type of state are we talking about ?

• What about connectionless protocols (UDP, ICMP) ?– Automaton approach still applies.– Timers play a bigger role to match traffic to an existing communication.

» “Connection“ timeout.

Client ServerRejected

64

1024

1024

1024

1024

1024

Timer

Introduction

ApplicationArchitectures

NetworkCircuit

Circuit level Filter

• Reminder

Introduction

ApplicationArchitectures

NetworkCircuit

Stateful packet filtering

• State lookup– An opportunity to Improve performance

• Packet classification is hard.• It would be nice if we could avoid it.• Ideas:

– The same action should be applied to all packets belonging to the same connection.– Some parameters are known (source, dest.) x (address, ports) after first packet is received.– We can change part of the problem from matching rules made of intervals to matching

rules made of single values.– We need to know if a packet belongs to a connection.

• Number of such values depends on the number of concurrent connections, not the number of rules.

Introduction

ApplicationArchitectures

NetworkCircuit

Page 18: Overview Filtering Architecturespaul_o/Courses/filters_arch.pdf · Cambridge University Press, 2002. • Handbook of exact string matching algorithms, Charras, ... • Hierarchical

Stateful packet filtering

• State/Rule classification relations

Classifier

RuleClassifier

StateClassifier

Flow@S,@D,PS,PD

Hit

Miss

Update (LRU)

Hit

Introduction

ApplicationArchitectures

NetworkCircuit

Stateful packet filtering

• State lookup– An opportunity to Improve performance

• Popular data structures:– Trees: balanced tree (e.g. pf), Splay trees (e.g. stream4).

» Average/worst complexity: Search O(log n), update O(log n).

– Linked Lists.

» Average/worst complexity: Search O(n), update O(1).

– Hashing (e.g. ipfilter).

» Average complexity: Search O(1), update O(1).

Introduction

ApplicationArchitectures

NetworkCircuit

Hashing

(@S,@D,SP,DP)Hash

Function

Pointern bits

Class. Rule

2n array

Actionvalue

Policy storage

(@S,@D,SP,DP) SRAM ActionHashFunction

pointer

Packet classification

SRAM

• Hash function + SRAM.

– Temporal complexity O(1)/O(n), Spatial Complexity O(2n).

– 64 Mbits, 4-10 ns.

Introduction

ApplicationArchitectures

NetworkCircuit

Hashing

(@S,@D,SP,DP)Hash

Function

Pointern bits

Class. Rule 1

2n array

Policy storage SRAM

(@S ’,@D ’,SP ’,DP ’)

Class. Rule 2

Since n < rule size we can get some collisions !You need to linearly test every rule in the collision queue !

R1 R2

queue

• Hash function limitations

Introduction

ApplicationArchitectures

NetworkCircuit

Page 19: Overview Filtering Architecturespaul_o/Courses/filters_arch.pdf · Cambridge University Press, 2002. • Handbook of exact string matching algorithms, Charras, ... • Hierarchical

Hashing

(@S,@D,SP,DP)Hash

Function

Pointern-1 bits

Class. Rule 1

2n-1 array

Policy storage

SRAM

(@S ’,@D ’,SP ’,DP ’)

Class. Rule 2

R1 R2

queue

2n-1 array

R2

queue

H1

H2

Fill the queue with the less elements

• Usual tricks with hash functions

– Use several functions

• Andrei Broder, Michael Mitzenmacher, INFOCOM 2000

Introduction

ApplicationArchitectures

NetworkCircuit

Stateful packet filtering

• State lookup– Question:

• Can we limit ourselves to searching the connection matching (src addr, src port, dstaddr, dst port) ?

– First problem: Returning packets.• Carry (dst addr, dst port, src addr, src port)

Introduction

ApplicationArchitectures

NetworkCircuit

Stateful packet filtering

• State lookup– Second problem: 192.168.200.1

TFTP Client: 1042

192.168.204.2TFTP server: 69

Joe Alice

192.168.200.1 192.168.204.2

1042 69

Introduction

ApplicationArchitectures

NetworkCircuit

192.168.200.1TFTP Client: 1042

192.168.204.2TFTP server: 69

Joe Alice

192.168.204.2 192.168.200.1

ICMP Port unreachable

X

UDP Packet

1

2

Stateful packet filtering

• State lookup– Second problem:

• We would like to be able to accept the returning ICMP if it can be associated with an outgoing connection.

– Need to parse states that may not be directly related to incoming packets.

• Should work with other problems– ICMP network unreachable (source different).

– FTP (Protocol different).

Introduction

ApplicationArchitectures

NetworkCircuit

Page 20: Overview Filtering Architecturespaul_o/Courses/filters_arch.pdf · Cambridge University Press, 2002. • Handbook of exact string matching algorithms, Charras, ... • Hierarchical

Circuit level Filter

• Reminder

Network Intrusion Detection: Evasion, Traffic Normalization, and End-to-End Protocol Semantics, Mark Handley and al., USENIX Security 2001.

Introduction

ApplicationArchitectures

NetworkCircuit

Normalization• IP Level

# IP Field Normalization Performed1 Version Non-IPv4 packets dropped.2 Header Len Drop if hdr len too small.3 Header Len Drop if hdr len too large.4 Diffserv Clear field.5 ECT Clear field.6 Total Len Drop if tot len link layer len.7 Total Len Trim if tot len link layer len.8 IP Identifier Encrypt ID. 9 specific protocols. Protocol Pass packet to TCP,UDP,ICMP handlers.10 Frag offset Reassemble fragmented packets.11 Frag offset Drop if offset + len 64KB.12 DF Clear DF.13 DF Drop if DF set and offset 0.14 Zero flag Clear.15 Src addr Drop if class D or E.16 Src addr Drop if MSByte=127 or 0.17 Src addr Drop if 255.255.255.255.18 Dst addr Drop if class E.19 Dst addr Drop if MSByte=127 or 0.20 Dst addr Drop if 255.255.255.255.21 TTL Raise TTL to configured value.22 Checksum Verify, drop if incorrect.23 IP options Remove IP options. 24 IP options Zero padding bytes

Introduction

ApplicationArchitectures

NetworkCircuit

Normalization• ICMP Level

# ICMP Type Normalization Performed1 Echo request Drop if destination is a multicast or broadcast address.2 Echo request Optionally drop if ping checksum incorrect.3 Echo request Zero “code” field.4 Echo reply Optionally drop if ping checksum incorrect.5 Echo reply Drop if no matching request.6 Echo reply Zero “code” field.7 Source quench Optionally drop to prevent DoS.8 Destination Unr. Unscramble embedded scrambled IP identifier.

Introduction

ApplicationArchitectures

NetworkCircuit

Normalization• TCP Level

# TCP Field Normalization Performed1 Seq Num Enforce data consistency in retransmitted segments.2 Seq Num Trim data to window.3 Seq Num Cold-start: trim to keep-alive.4 Ack Num Drop ACK above sequence hole.5 SYN Remove data if SYN=1.6 SYN If SYN=1 & RST=1, drop.7 SYN If SYN=1 & FIN=1, clear FIN.8 SYN If SYN=0 & ACK=0 & RST=0,drop.9 RST Remove data if RST=1.10 RST Make RST reliable.11 RST Drop if not in window.12 FIN If FIN=1 & ACK=0, drop.13 PUSH If PUSH=1 & ACK=0, drop.14 Header Len Drop if less than 5.15 Header Len Drop if beyond end of packet.19 Window Remove window withdrawals.20 Checksum Verify, drop if incorrect.23 URG If URG=1 & ACK=0, drop.24 MSS option If SYN=0, remove option.25 MSS option Cache option, trim data to MSS.26 WS option If SYN=0, remove option.

Introduction

ApplicationArchitectures

NetworkCircuit

We’ll look at thisone on the next slides

Page 21: Overview Filtering Architecturespaul_o/Courses/filters_arch.pdf · Cambridge University Press, 2002. • Handbook of exact string matching algorithms, Charras, ... • Hierarchical

TCP Reassembly Problems

• Retransmissions – Classic timeout

A (x-x+s)

B (x+s-x+2s)

C (x+2s-x+3s) Ack x+s

Ack x+s

B (x+s-x+2s)

A

BC

Introduction

ApplicationArchitectures

NetworkCircuit

TCP Reassembly Problems

• Retransmissions – Fast retransmit

A (x-x+s)

B (x+s-x+2s)

C (x+2s-x+3s)

D (x+3s-x+4s)

Ack x+s

Ack x+s

Ack x+sE (x+4s-x+5s)

Ack x+s

B (x+s-x+2s)

A

BCDE

3 duplicates AcksNo timeout

Introduction

ApplicationArchitectures

NetworkCircuit

TCP Reassembly Problems• Retransmissions – Fast retransmit – why 3 ?

A (x-x+s)

B (x+s-x+2s)

C (x+2s-x+3s)Ack x+s

Ack x+s

Ack x+3s

A

BC

1 duplicate AckNo timeoutNo Loss

Introduction

ApplicationArchitectures

NetworkCircuit

TCP Reassembly Problems

• Can we remove all ambiguities ?– Retransmission/Overlap alternatives:

• Discard data that has already been seen.– Pb: If data is lost between filter and destination, retransmission is dropped,

» transmission is no longer reliable.– You can guess the need for a retransmission by looking at 3 duplicate acks and validate

segments carrying ack value.» Pb: Does not work on retransmission with classic timeout.

• Discard data that has already been acknowledged.– Pb: If ack is lost after filter, sender retransmits at timeout.

» Risk to have connection total timeout + reset. – Pb: sender can simultaneously send several time the same segment with different content.

» Receiver can still reassemble data differently from filter.» does not resolve ambiguities !!

Introduction

ApplicationArchitectures

NetworkCircuit

Page 22: Overview Filtering Architecturespaul_o/Courses/filters_arch.pdf · Cambridge University Press, 2002. • Handbook of exact string matching algorithms, Charras, ... • Hierarchical

TCP Reassembly Problems• Can we remove all ambiguities and be non blocking ?

– Not really � !• But we can remove some of them.• We have the same problem for any stateful protocol �

– Answers:• Use proxy like architecture where traffic is acknowledged locally.• Use knowledge about the destination to reassemble its way. (active mapping or

configuration). (Not sufficient)• Use additional traffic between filter and destination to know state:

– Send additional RST to make sure RST arrived at destination.– Use TCP keepalive to know if data packet was received.– not sufficient – (e.g. URG data problem).

• Rewrite TCP segments to ensure unambiguous retransmission.– Very costly and not sufficient – (e.g. URG data problem).

• Generate alerts on suspicious events.– Overlaps are frequent but overlaps with different content aren’t.

Introduction

ApplicationArchitectures

NetworkCircuit

TCP Reassembly Problems

• Question:– Some Firewall Manufacturer makes a firewall that looks for signatures {Tk} k=0..s of attacks

in packets. He claims he can solve the overlap problem by just considering the two possible reassembled strings. If Si and S’i have been received between the beginning of the TCP flow S0S1…Si-1 and the end of the TCP flow Si+1Si+2…Sn. He claims he just needs to consider both strings:

S0S1…Si-1 Si Si+1Si+2…Sn

S0S1…Si-1 S’i Si+1Si+2…Sn

And perform the comparison with {Tk} k=0..s

Is that a valid approach ?

Introduction

ApplicationArchitectures

NetworkCircuit

TCP Stream Reassembly• Enforce Data consistency

– Problem somehow similar to packets reassembly.– But segmentation problem much more frequent.

• Fragmented traffic: ~0.5% Shannon & al. ACM TON 2002.• Segmented traffic: Any TCP traffic (75-80% existing traffic).

– Duplicate fragments are not supposed to exist.

– Is TCP reassembly costly ?• Number of states. Depends on:

– Number of concurrent connections.– Average number of holes per connection.– Size of holes.– Holes duration.

– How do holes appear ?• Packet Loss.• Reordering (routes changes/parallel path network processors).• Attacks.

Introduction

ApplicationArchitectures

NetworkCircuit

TCP Reassembly cost• Dharmapurikar & Paxson, USENIX Security 2005:

– 3-20% connections include holes.

– 95-97% connections include one hole.

Introduction

ApplicationArchitectures

NetworkCircuit

Page 23: Overview Filtering Architecturespaul_o/Courses/filters_arch.pdf · Cambridge University Press, 2002. • Handbook of exact string matching algorithms, Charras, ... • Hierarchical

• Dharmapurikar and Paxson, USENIX Security 2005:– Possible attacks:

• Generate multiple holes per connection.– Limit number of concurrent holes per connection.

• Generate multiple connections with reduced number of holes:– Limit number of connections with hole per IP address.

– Store data once connection is established.

• Generate reduced number of connections with holes per IP address.– Reclaim memory by freeing random connection state.

– Cost (limiting possibility of attacks) :• 30kB-120kB/connection.

TCP Reassembly cost

Introduction

ApplicationArchitectures

NetworkCircuit

Circuit level Filter

• Reminder

Introduction

ApplicationArchitectures

NetworkCircuit

Circuit level Filters Position

• Proxy

Destination

Proxy

Browser

Configuration ServerDHCP Server

Filter

DNS Server

Introduction

ApplicationArchitectures

NetworkCircuit

• Stateful Packet Filter

Proxy specific problems

Introduction

ApplicationArchitectures

NetworkCircuit

• Directing flows to the proxy– Manual configuration

• Administrator or user specifies proxy address and port to application.

– Automated Configuration• Mostly for browsers.

• Using configuration scripts– Proxy Auto-Config

– Web Proxy Autodiscovery Protocol.

– Transparent proxying

Page 24: Overview Filtering Architecturespaul_o/Courses/filters_arch.pdf · Cambridge University Press, 2002. • Handbook of exact string matching algorithms, Charras, ... • Hierarchical

Proxy specific problems

1. Administrator/User configures browser with configuration server address.

2. When browser is launched configuration scripts is obtained from server.

3. Script is cached and used for each request.

4. For every request script is executed and provides proxy address.

5. Request is sent to the proxy.

Introduction

ApplicationArchitectures

NetworkCircuit

• Proxy Auto-Config– Proxy.pac file (independent from browser)– Javascript program providing proxy address depending on destination address.– Obtained from web server provided through browser configuration.– You still need to configure browsers �

Proxy specific problems

• Web Proxy Auto discovery Protocol. – Automates web server configuration address retrieval.

– Using DHCP (preferred)1. DHCP Discover with Tag #252 (WPAD auto-proxy-config)

2. DHCP Offer with Tag #252.

3. DHCP Request or DHCP Inform if DHCP Information already obtained.

4. DHCP Ack with Tag #252 and associated URL.

5. Use DNS to convert URL to IP address.

– Using DNS• Queries wpad.localdomain.name from leaf to root.

– wpad.rc105.lor.int-evry.fr/proxy.pac

– wpad.lor.int-evry.fr

– wpad.int-evry.fr

– Many associated security issues.

Introduction

ApplicationArchitectures

NetworkCircuit

Proxy specific problems

Introduction

ApplicationArchitectures

NetworkCircuit

• Directing flows from the proxy– Packets sent to the proxy.

– Proxy must obtain final destination address/port.• Fixed proxy configuration.

– Proxy sets up connection to predefined destination.

– Works well for dedicated reverse proxies (one proxy per service).

• Information passed by the client– In band transport.

» Protocol naturally carries duplicate information.

» Eg. HTTP

GET / HTTP/1.1Accept: */*Accept-Language: frAccept-Encoding: gzip, deflateUser-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Wind ows NT 5.0; Q312461)Host: www.google.comConnection: Keep-Alive

Proxy specific problems

Introduction

ApplicationArchitectures

NetworkCircuit

• Directing flows from the proxy• Information passed by the client

– Out of band transport.

» Separate protocol used to carry final destination information.

» Eg. SOCKS

» Transparent to user (except socks library configuration).

Application

Socket

TCP/IP

Ethernet

SOCKS Library

Socket

TCP/IP

Ethernet

Serveur SOCKS

Kernel

User SpaceApplication

Socket

TCP/IP

Ethernet

User ViewTrue Communications

SOCKS Library

Socket LibrarySocket Library Socket Library

• SOCKS commands:• Establish TCP connection.• Request Port binding.

• SOCKS responses:• Request Granted• Request Rejected/Failed

Page 25: Overview Filtering Architecturespaul_o/Courses/filters_arch.pdf · Cambridge University Press, 2002. • Handbook of exact string matching algorithms, Charras, ... • Hierarchical

Proxy specific problems

Introduction

ApplicationArchitectures

NetworkCircuit

• Directing flows from the proxy• Information passed by the client

– In band transport.

» Extended protocol used to carry final destination information.

» Eg. FTP Proxy

» Not Transparent to user.

ftp> open 157.159.100.21 2121Connected to 157.159.100.21.220 server ready - login pleaseName (157.159.100.21:paul_o):[email protected] password requiredPassword:230 login acceptedRemote system type is UNIX.Using binary mode to transfer files.ftp> ls200 ok, port allocated150 Here comes the directory listing.drwxr-xr-x 18 ftp ftp 4096 Apr 19 08:55 mirror1226 Directory send OK.

Aug 28 18:11:23 prius ftp.proxy[31009]: info: monitor mode: off, ccp: <unset>Aug 28 18:11:39 prius ftp.proxy[31009]: connected to server: ftp.int-evry.frAug 28 18:11:39 prius ftp.proxy[31009]: login accepted: [email protected]

User View Server Logs

Proxy specific problems

Introduction

ApplicationArchitectures

NetworkCircuit

• Transparency– Ability to direct packets to a proxy without knowledge from clients.

– Incoming packets carry true destination address.

– Simplifies end systems configuration.

– Does not apply to reverse proxying.

– Also known as intercepting proxy.

– Two alternatives:• Transparency through NAT.

• Transparency through internal redirection.

Proxy specific problems

• Transparency through NAT1. Client IP layer layer routes packet toward

destination.

2. NAT: Dst Address, Dst Port -> Proxy Address, Proxy port

3. IP layer routes packet toward proxy device.

4. Not much

5. Proxy must obtain destination address to establish connection1. In band redundant destination information

(e.g. HTTP).

2. Dialog with device carrying NAT.

6. NAT: Proxy Address, Proxy Port -> SrcAddress, Src Port

Introduction

ApplicationArchitectures

NetworkCircuit

Proxy specific problems

Introduction

ApplicationArchitectures

NetworkCircuit

• Transparency through internal redirection.

1. Client IP layer layer routes packet toward destination.

Proxy is bound to socket with special attribute (e.g. IP_TRANSPARENT).

2. Network level filter Forward packet to local socket. Packet remains unchanged. Structure associate with packet is marked as send to transparent proxy.

3. IP layer matches address/port and mark to local socket. (Not the traditional IP behavior)

4. Proxy is bound to local socket and receives packets. Learns destination address and port from socket to initiate communication with true destination.

5. NAT: Proxy Address, Proxy Port -> Src Address, Src Port.

Source: http://www.balabit.com/downloads/files/tproxy/README.txt

Page 26: Overview Filtering Architecturespaul_o/Courses/filters_arch.pdf · Cambridge University Press, 2002. • Handbook of exact string matching algorithms, Charras, ... • Hierarchical

Proxy specific problems

Introduction

ApplicationArchitectures

NetworkCircuit

• Transparency through internal redirection (alternative).

1. Client IP layer layer routes packet toward destination.

Proxy is bound to special socket (e.g. divert).

2. Network layer filter forwards packet to local socket. Mark data structure of packet. Packet remains unchanged.

3. IP layer sends packet to special socket identified by mark. (Not the traditional IP behavior)

4. Proxy receives full packets. Learns true source/destination from packet content Creates packet with correct characteristics and send it. Proxy has to implement IP/UDP/TCP protocols.

5. Nothing: packet already carries correct source/destination information.

Application level Filter

• Overview

Filter

Filtering Policy

Application-LevelInformation

Pattern Matching

Incomming Packets

Authorized Packets

Internal Interface

External Interface

StateLookup

StateChecking

Protocol Analysis

Normalization

Introduction

ApplicationArchitectures

NetworkCircuit

Dealing with encryption

• SSL Man in The Middle proxy

Introduction

ApplicationArchitectures

NetworkCircuit

1. TCP Connections with proxy1. TCP handshake between client and proxy.

2. TCP handshake between proxy and destination.

Dealing with encryption

• SSL Man in The Middle proxy

Introduction

ApplicationArchitectures

NetworkCircuit

3. SSL/TLS Negotiations1. Proxy gets certificate from destination server.

2. Proxy validates destination server certificate (uses cache as speed up).

3. Proxy negotiates shared key keyproxy_serverwith destination server.

4. Proxy provides locally generated certificate to client.

5. Client accepts certificate.

6. Proxy negotiates shared key keyclient_proxywith client.

Page 27: Overview Filtering Architecturespaul_o/Courses/filters_arch.pdf · Cambridge University Press, 2002. • Handbook of exact string matching algorithms, Charras, ... • Hierarchical

Dealing with encryption

• SSL Man in The Middle proxy

Introduction

ApplicationArchitectures

NetworkCircuit

4. Encrypted traffic1. Client sends traffic to proxy.

2. Proxy decrypts traffic using keyclient_proxy.

3. Eventually changes traffic content to hide that proxy is used.• E.g. HTTP Proxy Connection: Keep-Alive -> Connection: Keep-Alive

4. Proxy encrypts traffic using keyproxy_server.

Dealing with encryption

Introduction

ApplicationArchitectures

NetworkCircuit

• SSL Man in The Middle proxy

Application level Filter

• Overview

Filter

Filtering Policy

Application-LevelInformation

Pattern Matching

Incomming Packets

Authorized Packets

Internal Interface

External Interface

StateLookup

StateChecking

Protocol Analysis

Normalization

Introduction

ApplicationArchitectures

NetworkCircuit

Application level Filter

• Protocol Analysis: Two kinds of problems:– Given a flow of data determine which protocol is used (traffic recognition).– Given a flow of data determine the structure and value of the information

exchanged (message parsing).

Introduction

ApplicationArchitectures

NetworkCircuit

Page 28: Overview Filtering Architecturespaul_o/Courses/filters_arch.pdf · Cambridge University Press, 2002. • Handbook of exact string matching algorithms, Charras, ... • Hierarchical

Traffic recognition

• Recent field of research due to high amount of obfuscation used by some applications (e.g. P2P).

• Existing approaches:– Port number – application mapping.

– Matching well known patterns associated with popular protocols.

– Using heuristics to separate different traffic classes.

Introduction

ApplicationArchitectures

NetworkCircuit

Port Mapping

• Some applications use well known port numbers.– Advantages:

• Simple to express.

• Speed.

Introduction

ApplicationArchitectures

NetworkCircuit

Source: A Longitudinal Study of P2P TrafficClassification, Alok Madhukar Carey Williamson, MASCOTS 2006

Up to 60% of the traffic unknown– Drawbacks:• Accuracy.

Pattern Matching

• From: protocol specifications/ protocol exchanges extract patterns characterizing protocol.– Eg: Kazaa (from l7-filter):

• ^get (/.download/[ -~]*|/.supernode[ -~]|/.status[ -~]|/.network[ -~]*|/.files|/.hash=[0-9a-f]*/[ -~]*) http/1.1|user-agent: kazaa|x-kazaa(-username|-netwo rk|-ip|-supernodeip|-xferid|-xferuid|tag)|^give [0-9][0-9][0 -9][0-9][0-9][0-9][0-9][0-9]?[0-9]?[0-9]?

• Maintains connections states.

• Examines packets independently.

• Once a packet matches the pattern, associate application tag with connection.

Introduction

ApplicationArchitectures

NetworkCircuit

Pattern Matching

• Advantages:– Accuracy (for P2P traffic) (source (1): Accurate, Scalable In-Network

Identification of P2P Traffic Using Application Signatures Subhabrata Sen & al, WWW 2005)

• Drawbacks:– Performance. (Source: High Speed Deep Packet Inspection with Hardware

Support, Fan Yu, PhD thesis, Nov. 2006)• Less than 10Mb/s for l7-filter on 750Mhz P3.

– False negative (source : (1)): 1-10% depending on protocol.

Introduction

ApplicationArchitectures

NetworkCircuit

Page 29: Overview Filtering Architecturespaul_o/Courses/filters_arch.pdf · Cambridge University Press, 2002. • Handbook of exact string matching algorithms, Charras, ... • Hierarchical

Pattern Matching

• Open Source Tools examples:– Mostly under linux.

– L7-filter: http://l7-filter.sourceforge.net/• ~70 signatures. Many with false positives/negatives.

– IPP2P: http://www.ipp2p.org/• Mostly P2P oriented. ~10 signatures.

• Is it useful ?– In controlled network environment applications should be configurable to use

well known ports and traffic to unknown ports should be dropped.

– Mostly useful in environment where users cannot be tightly controlled (ISP, university campus, …).

– Or to search tunneled traffic.

Introduction

ApplicationArchitectures

NetworkCircuit

Matching Transport layer patterns

• Problems with Pattern Matching approaches:– Expensive. Not (yet) fitted for very high bandwidth.

– Must access data in packets: privacy issues.

– Requires heavy changes in routers/packet filters architectures.

Introduction

ApplicationArchitectures

NetworkCircuit

• Ideas: Use information provided by Network/Transport layers.– Use flow monitoring.

– Use information widely associated with flows:• Number of packets per flow,

• Mean packet size, mean data packet size.

• Mean inter-arrival time of packets.

• …

– Use machine learning techniques • Supervised learning/Unsupervised learning.

∏=

=

=

=

==

=

ni

ii

nn

n

nn

n

n

n

nn

n

nn

CFPFFp

CpFFCp

FFp

FFFCFFpFFCFpFCFpCFpCpFFCp

FFp

FFCFFpFCFpCFpCp

FFp

FCFFpCFpCpFFCp

FFp

CFFpCpFFCp

111

1

32142131211

1

213121

1

1211

1

11

)/(),...,(

)(),...,/(

),...,(

),,,/,...,(),,/(),/()/()(),...,/(

),...,(

),,/,...,(),/()/()(

),...,(

),/,...,()/()(),...,/(

),...,(

)/,...,()(),...,/(

Matching Transport layer patterns

Introduction

ApplicationArchitectures

NetworkCircuit

• Internet Traffic Identification using Machine Learning, Jeffrey Herman & al. IEEE Globecom 06.– Example: Using naïve bayes classifier.

• Set of n variables describing communications (flows): Called features: F0...Fn

• Set of classes (e.g. HTTP, FTP, POP, P2P, …): C

• We want to compute P(C/ F1, …, Fn):

(Bayes theorem)

(independence assumption: p(Fi/C,Fj)=p(Fi/C)

(cond. prob.)

Matching Transport layer patterns

),...,/( 11 nn vFvFcCp ===

Introduction

ApplicationArchitectures

NetworkCircuit

• Internet Traffic Identification using Machine Learning, Jeffrey Herman & al. IEEE Globecom 06.– Example: Using naïve bayes classifier.

• When receiving a flow caracterized by F1=v1,…Fn=vn look for c maximizing:

• P(Fi=vi/C=c) can be easily obtained using labeled training data:

• Naïve Bayes classifiers provide ~80% correct classification.

– Using more complex classification techniques can yield 90-95% correct classification.

cC

cCvFcCvFp

ii

ii ==∩=

=== )/(

Page 30: Overview Filtering Architecturespaul_o/Courses/filters_arch.pdf · Cambridge University Press, 2002. • Handbook of exact string matching algorithms, Charras, ... • Hierarchical

Matching Transport layer patterns

• Can we infer more than the protocol type ?– Once you know the protocol you can apply similar techniques to obtain

application level information.

Introduction

ApplicationArchitectures

NetworkCircuit

– Example: Requin a tool for fast web traffic classification. Paul & al. IEEE Globecom 05.

• Using bayesian networks.

• For HTTP parameters matching.

• Alternatives– Heuristics on port/addresses relations (BLINC, multilevel traffic classification

in the dark, Karagiannis & al. SIGCOMM 05).

Protocol Messages Parsing

• Protocols are specific forms of languages.– Formal constructs that are used to parse languages can be used to parse

protocols.• State automaton, formal grammars …

– Automated techniques can be build parsers from formal descriptions.

Introduction

ApplicationArchitectures

NetworkCircuit

InformalMessages

Description

FormalDescription

ParserCode

ParserExecutable

ProgrammerParser

generator compiler

– This part just takes care of the lexical/syntactical validity of messages received.

– Many parsers are built by hand for performance reasons.

Application level Filter

• Overview

Filter

Filtering Policy

Application-LevelInformation

Pattern Matching

Incomming Packets

Authorized Packets

Internal Interface

External Interface

StateLookup

StateChecking

Protocol Analysis

Normalization

Introduction

ApplicationArchitectures

NetworkCircuit

Normalization

• Usually written by hand.

• Main protocols:– HTTP:

• % encoding (& double encoding).

• Unicode encoding.

• Directory traversal.

• Upper-lower case.

• Whitespace/Line feed

– SMTP/DNS:• Unicode encoding.

• Upper-lower case.

Introduction

ApplicationArchitectures

NetworkCircuit

Page 31: Overview Filtering Architecturespaul_o/Courses/filters_arch.pdf · Cambridge University Press, 2002. • Handbook of exact string matching algorithms, Charras, ... • Hierarchical

Normalization

• Main protocols:– Telnet/FTP/SSH

• Control Codes (Backspace, Bell, CR, tabulation, …).• Commands (set of codes that are use to control the communication: termination,

echo mode, window size, flow control, set environment variables, …).

Introduction

ApplicationArchitectures

NetworkCircuit

Application level Filter

• Overview

Filter

Filtering Policy

Application-LevelInformation

Pattern Matching

Incomming Packets

Authorized Packets

Internal Interface

External Interface

StateLookup

StateChecking

Protocol Analysis

Normalization

Introduction

ApplicationArchitectures

NetworkCircuit

Pattern Matching

• Several types of matching problems– String matching.

• String = finite sequence of characters over finite alphabet.

• Matching = search all occurrences of string over a text (longer string).

Introduction

ApplicationArchitectures

NetworkCircuit

DenyTest.htmlGETHTTP80>1024157.159.226.132*1ActionURIMethodProtoDst PortSrc PortDst AddressSrc AddressR#

Pattern Matching

• Several types of matching problems– Multiple strings matching.

• Multiple strings = Set of strings.• Matching = search all strings occurrences over a text reading the text once.

Introduction

ApplicationArchitectures

NetworkCircuit

PermitToast.htmlGETHTTP80>1024157.159.226.132*2DenyTest.htmlGETHTTP80>1024157.159.226.132*1ActionURIMethodProtoDst PortSrc PortDst AddressSrc AddressR#

Page 32: Overview Filtering Architecturespaul_o/Courses/filters_arch.pdf · Cambridge University Press, 2002. • Handbook of exact string matching algorithms, Charras, ... • Hierarchical

Pattern Matching

• Several types of matching problems

Introduction

ApplicationArchitectures

NetworkCircuit

– Regular expressions matching.• Regular expression = concatenation, union, repetitions of strings or regular

expressions.• Matching = search all regular expressions occurrences over a text reading the text

once.

Permit.*.htm|.*.htmlGETHTTP80>1024157.159.226.132*2Deny.*.htm|.*.htmlPOSTHTTP80>1024157.159.226.132*1ActionURIMethodProtoDst PortSrc PortDst AddressSrc AddressR#

– Why not just using regular expressions for every search ?

Pattern Matching

• Several types of matching solutions– Prefix based.

• Start from pattern beginning.• Check character by character if we have a match.

Introduction

ApplicationArchitectures

NetworkCircuit

A B D C A E F D Q A B D D C C D

A B D F– Suffix based.

• Start from pattern end.• Check character by character if we have a match.

A B D C A E F D Q A B D D C C D

A B D F– Prefix/Suffix based.

• Start from either end.• Check character by character if we have a match.

Single String Matching

• Naïve search (prefix based).

Introduction

ApplicationArchitectures

NetworkCircuit

A B D C A B A B Q A B D F C C D

A B D F

– When we don’t have a match we shift by 1

w

A B D C A B A B Q A B D F C C D

A B D F

w

– Until number of character matched = size of the window.

A B D C A B A B Q A B D F C C D

A B D F

w

Single String Matching

• Morris-Pratt (prefix based).

Introduction

ApplicationArchitectures

NetworkCircuit

A B D F

– In this case we can shift by 4 since there is no part in w that matches a prefix of the pattern.

w

A B D C A B A B Q A B D F C C D

A B D C A B A B A F B D F C C D

A B A F

– In this case we can only shift by 2 since a part of w matches the AB prefix of the pattern.

• We can start the search at B since A has already been recognized.

A B A F

Page 33: Overview Filtering Architecturespaul_o/Courses/filters_arch.pdf · Cambridge University Press, 2002. • Handbook of exact string matching algorithms, Charras, ... • Hierarchical

Single String Matching

• Morris-Pratt (prefix based).– In general how much can we shift ?

• The number of characters to align the prefix to the suffix found.

Introduction

ApplicationArchitectures

NetworkCircuit

A B D A B F

A B D A B B A B Q A B D F C C D

w

i+1bb

Shift(i+1) = ((i + 1) – 1) – b = i – b

– The shift only depends on the content of the pattern !

Single String Matching

• Morris-Pratt (prefix based).– In general how much can we shift ?

• Pre-compute border b: the longest prefix v of pattern p that is also a suffix u during a search, u = v != p.

Introduction

ApplicationArchitectures

NetworkCircuit

A B D A b = A, if we match this pattern, we can shift 3

A B D A B b = AB, if we match this pattern, we can shift 3

• The largest shift we can make reaching character i of the pattern:– Shift[i+1] = i – length of the border[i], Shift[1] = 1.

– We can pre-compute a table giving the size of the shift if we have a mismatch.

A B D A B

1 1 2 3 3

b = /, if we match this pattern, we can shift 3A B D

A B A B

1 1 2 2

A

2

A B D A B F

F

3

B

2

A A A A A

1 1 1 1 1

F

1

Single String Matching

• Morris-Pratt (prefix based).– Example

Introduction

ApplicationArchitectures

NetworkCircuit

A B D C A B A B Q A B A F C C D

A B A

A

A B A F

1 1 2 2

A

A B A F

A B A F• Knuth-Morris-Pratt.– Avoid by noticing that D/Q can’t match A since the character tested is a

prefix of the pattern.

A B A

A

Single String Matching

• Boyer-Moore (suffix based).

Introduction

ApplicationArchitectures

NetworkCircuit

A B D B A B A B Q C A B A B A D

G B A B

w

A

– Full Good suffix shift.• If we know that a suffix of the pattern is repeated within the pattern.

• We can shift the pattern to that position.

• Since we have a mismatch between A and D the character before the searched suffix must be different from A.

– If the pattern had been ABABA we could have shifted 5 characters.

A B D B A B A B Q C A B A B A D

G B A B A

w

Page 34: Overview Filtering Architecturespaul_o/Courses/filters_arch.pdf · Cambridge University Press, 2002. • Handbook of exact string matching algorithms, Charras, ... • Hierarchical

Single String Matching

• Boyer-Moore (suffix based).

Introduction

ApplicationArchitectures

NetworkCircuit

A B D B A B A B Q C A B A B A D

A C A B

w

A

– Partial Good suffix shift.• The full suffix is not repeated in the pattern but

• A part of the recognized text suffix is also a prefix of the pattern

• We want to align the prefix with this suffix.

• Full good suffix prevails over partial good suffix.

• There’s no need to align internal A since it is not preceded by B.w

A B D B A B A B Q C A B A B A D

A C A B A

Single String Matching

• Boyer-Moore (suffix based).

Introduction

ApplicationArchitectures

NetworkCircuit

A B D B A B A B Q C A B A B A D

A D A D

w

A

– Full window Bad character shift.• The mismatched character (B) doesn’t occur in the rest of the pattern.

• We can shift by the length of the window.

• Any i position shift (i<w) would generate a mismatch.

w

A B D B A B A B Q C A B A B A D

A D A D A

Single String Matching

• Boyer-Moore (suffix based).

Introduction

ApplicationArchitectures

NetworkCircuit

A B D B A B A B Q C A B A B A D

B B A D

w

A

– Partial window Bad character shift.• The mismatched character occurs in the rest of the pattern.

• We want to align the first of these characters to the mismatched character.

• Any shorter shift would generate a mismatch.

w

A B D B A B A B Q C A B A B A D

B B A D A

Single String Matching

• Boyer-Moore (suffix based).

Introduction

ApplicationArchitectures

NetworkCircuit

– Combining the shifts.Max(

– GS(p) = min ( GSs(p): Good full suffix shift,

GSp(p): Good Partial suffix shift )

– BS(c) = max (BFs(p): Full window Bad character shift

BPs(p): Partial window Bad character shift )

)

– Example: Pattern = GCAGAGAG, Alphabet = G,C,A,T

G C A G

8 8 8 2

A G A G

8 4 8 1

G C A G

7 7 7 7

A G A G

7 7 7 7

G C A T

2 6 1 8

GSs GSp BS

The only suffix that is also a prefix is G

Non(G)AG appears is CAGNon(G)AGAG appears is CAGAG

Page 35: Overview Filtering Architecturespaul_o/Courses/filters_arch.pdf · Cambridge University Press, 2002. • Handbook of exact string matching algorithms, Charras, ... • Hierarchical

Single String Matching

• Horspool (any order based).

Introduction

ApplicationArchitectures

NetworkCircuit

– Simplification of Boyer-Moore.

– Only uses bad character shift.

– Can work in any order (prefix or suffix based).

– Works well with large alphabets/small patterns.

– Reduces preprocessing and memory cost.

Multiple String Matching

• Aho-Corasick (prefix based).– Extension of KMP.– Builds automaton from set of searched patterns.

• Root node, Intermediate node• Transitions.• Example: “toast.html|test.html|test.htm|toast.htm|oast”

Introduction

ApplicationArchitectures

NetworkCircuit

0

1

4

t

3e

o

6 9s t

12 14 16. h t

18 20m l

7 10 13s t

15 17 19. h t

21 22m la

2o

5 8 11s ta

Multiple String Matching

• Aho-Corasick (prefix based).• Supply function.

– State reached when reading the longest prefix that is also a suffix of the window.

Introduction

ApplicationArchitectures

NetworkCircuit

0

1

4

t

3e

o

6 9s t

12 14 16. h t

18 20m l

7 10 13s t

15 17 19. h t

21 22m la

2o

5 8 11s ta

Multiple String Matching

• Aho-Corasick (prefix based).• Terminal nodes: Node reached when a pattern is recognized.

– A part of a pattern can be a pattern:» We can have multiple terminal nodes for a given pattern.» Any node pointing to a terminal node with the supply function is also terminal.

Introduction

ApplicationArchitectures

NetworkCircuit

0

1

4

t

3e

o

6 9s t

12 14 16. h t

18 20m l

7 10 13s t

15 17 19. h t

21 22m la

2o

5 8 11s ta

Page 36: Overview Filtering Architecturespaul_o/Courses/filters_arch.pdf · Cambridge University Press, 2002. • Handbook of exact string matching algorithms, Charras, ... • Hierarchical

Multiple String Matching

• Aho-Corasick (prefix based).– Functions:

• Transition function T(node, character).• Supply function S(node).

– Searching occurrences:Current = root node.

While there is a character x to readWhile (T(Current, x)==0)&&(S(Current)!=0)

Current = S(Current);If (T(Current, x)!=0)

Current = T(Current, x);

ElseCurrent = root node;

If Current is terminal mark occurrence;

Introduction

ApplicationArchitectures

NetworkCircuit

Multiple String Matching

• Aho-Corasick (prefix based).• Search example: toatestoast.htes

Introduction

ApplicationArchitectures

NetworkCircuit

0

1

4

t

3e

o

6 9s t

12 14 16. h t

18 20m l

7 10 13s t

15 17 19. h t

21 22m la

2o

5 8 11s ta

Multiple String Matching

• Some other important algorithms– Horspool

• Efficiency reduces as the number of strings increases since the likelyhood of bad character shift decreases exponentially with number of strings.

– Wu-Manber• Improves over Horspool.• Uses character blocks in order to increase bad character shift efficiency.• Uses hash table in order to avoid storing |�|B shifting values. (B: number of characters

per block, �: Alphabet).

Introduction

ApplicationArchitectures

NetworkCircuit

Regular expression Matching

• Definition.– A regular expression is a string on the set of symbols �, { � , | , . , * , ( , ) } which

is recursively defined as the empty character�, a character� ∈ �, and (RE0) , (RE0 . RE1) , (RE0 | RE1) and (RE0*) where RE0 and RE1 are regular expressions.

• Example: ((A.B)*((A.C)|(A.D)) = ((AB)*((AC)|(AD))

• Usually (RE0 . RE1) is simplified to RE0RE1

Introduction

ApplicationArchitectures

NetworkCircuit

Page 37: Overview Filtering Architecturespaul_o/Courses/filters_arch.pdf · Cambridge University Press, 2002. • Handbook of exact string matching algorithms, Charras, ... • Hierarchical

Regular expression Matching

• Definition.– The language represented by a regular expression RE is a set of strings over �

which is recursively defined as follows:• If RE is � then L(RE) = {�}, the empty string.

• If RE is � ∈ � then L(RE) = {�}, a one character string.

• If RE is of the form (RE0) then L(RE) = L(RE0)

• If RE is of the form (RE0 . RE1) then L(RE) = L(RE0) . L(RE1) where X.Y is the set of strings w=x.y with x ∈X and y ∈Y. L(RE) is the language formed by the concatenationof strings generated by languages L(RE0) and L(RE1).

• If RE is of the form (RE0 | RE1) then L(RE) = L(RE0) ∪ L(RE1). L(RE) is the language formed by the union of L(RE0) and L(RE1).

• If RE is of the form (RE0*) then L(RE) = L(RE0)* = ∪i�0 L(RE0)i where L(R0)

0 = {�} and L(R0)

i = L(R0).L(R0)i-1

Introduction

ApplicationArchitectures

NetworkCircuit

Regular expression Matching

• Regular expression matching process.

Introduction

ApplicationArchitectures

NetworkCircuit

Regular expression

Parse Tree NFA DFA

determinizationNFA constructionParsing

• Parse tree generation.

( ( ( A . B ) * ) . ( C | D ) )

A

.

B

|*

C D.

– Using lexical & syntactic analyzers.

– Building analyzer by hand.

Regular expression Matching

• Non-deterministic Finite state Automaton.– Thompson automaton

Introduction

ApplicationArchitectures

NetworkCircuit

( ( ( A . B ) * ) . ( C | D ) )

A

.

B

|*

C D.

A I FA

A B

.I F

AI F

B

I F��

C D

| I FC

I FD

IF

�I

F�

F� I�

A

* I FA

I F

� �

F I

Non determinism:• � transitions or• Several transitions from a state with the same label

Regular expression Matching

• Non-deterministic Finite state Automaton.– Thompson automaton

Introduction

ApplicationArchitectures

NetworkCircuit

( ( ( A . B ) * ) . ( C | D ) )

A

.

B

|*

C D.

I FA

I FB

I FC

I FD

IF

�I

F�

F� I�I

F

F

I� ��

I

A BC

DF

� �

� ��

Page 38: Overview Filtering Architecturespaul_o/Courses/filters_arch.pdf · Cambridge University Press, 2002. • Handbook of exact string matching algorithms, Charras, ... • Hierarchical

Regular expression Matching

Introduction

ApplicationArchitectures

NetworkCircuit

Text to read: ABABAABC

I

1 2A

3B

5 6C

8D

F�

4�

7�

� ��

�Letters Red Active States/ I/ 1,5,7A 2AB 1,5,7ABA 2ABAB 1,5,7ABABA 2ABABAA Failure

• Non-deterministic Finite state Automaton.– Searching with Thompson automaton.

• Several states can be simultaneously active.

• The automaton only recognizes the language described by the regexp.

• We need transitions allowing us to be in initial state when a pattern is not recognized.

Regular expression Matching

Introduction

ApplicationArchitectures

NetworkCircuit

Text to read: ABABAABC

I

1 2A

3B

5 6C

8D

F�

4�

7�

� ��

�Letters Red Active States/ I/ 1,5,7,IA 2,IAB 1,5,7,IABA 2,IABAB 1,5,7,IABABA 2,IABABAA IABABAA 1,5,7,IABABAA 2,IABABAAB 1,5,7,IABABAABC F,I

• Non-deterministic Finite state Automaton.– Searching with Thompson automaton.

– The NFA recognizes ((((A|B)|C)|D)*.(((A.B)*).(C|D)))

A

BC

D

Regular expression Matching

• Deterministic Finite state Automaton.– � closure definition:

• The � closure of a state s in an NFA E(s) is the set of states of the NFA that can bereached from s with� transitions.

– � closure example:

Introduction

ApplicationArchitectures

NetworkCircuit

I

1 2A

3B

5 6C

8D

F�

4�

7�

� ��

E(I) = {I, 1, 4, 5, 7}, E(3) = {3, 1, 4, 5, 7}, E(4) = {4, 5, 7}, E(6) = {6, F}, E(8) = {8, F}

E(1) = {1}, E(2) = {2}, E(5) = {5}, E(7) = {7}

Regular expression Matching

• Deterministic Finite state Automaton.– E(I) is the initial state of the DFA.

Build_state(S)

For every � ∈ �, For every NFA states s ∈ S,

If there is a transition � between s and s’, s’ ∈ E(y) then

Add E(y) to DFA state T;

Build_transition(S,T, �);If (T has not been already treated) then

Build_state(T);

– A state is final if an NFA final state is included in it.

Introduction

ApplicationArchitectures

NetworkCircuit

E(I) E(4)E(6)

E(8)E(1)

E(2)

E(5)

E(7)E(3) E(I) = {I, 1, 4, 5, 7}, E(3) = {3, 1, 4, 5, 7},

E(4) = {4, 5, 7}, E(6) = {6, F},

E(8) = {8, F}, E(1) = {1}, E(2) = {2},

E(5) = {5}, E(7) = {7}

A

B

A

CD

C

D

Page 39: Overview Filtering Architecturespaul_o/Courses/filters_arch.pdf · Cambridge University Press, 2002. • Handbook of exact string matching algorithms, Charras, ... • Hierarchical

Regular expression Matching

• Deterministic Finite state Automaton.– Searching

• We have the same problem we met using the NFA.

• The DFA should have been built from the modified NFA we saw earlier.

Introduction

ApplicationArchitectures

NetworkCircuit

E(I) E(6)

E(8)

E(2) E(3)

A

B

A

CD

C

D

Text to read: ABABAABC

C

B

B

E(I) E(6)

E(8)

E(2) E(3)

A

B

A

CD

C

D

A

D

String Matching Complexities

Best Case Worst Case Average Case Precomputation MemoryNaive O(n) O(n.m) O(n.m) O(0) O(0)MP/KMP O(n) O(n) O(n) O(m) O(m)Boyer-Moore O(n/m) O(n.m) O(n) O(m+|�|) O(m+|�|)Horspool O(n/m) O(n.m) O(n) O(m) O(|�|)Aho-Corasick O(n) O(n+nocc) O(n+nocc) O(|P|) O(|P|.|�|)Wu Manber O(n/m.B) O(n.m) O(n) O(|�|B) O(|�|B)NFA O(n.m) O(n.m) O(n.m) O(m) O(m)

DFA O(n) O(n) O(n) O(2m) O(2m)

Introduction

ApplicationArchitectures

NetworkCircuit

– Bit level parallelism:• Takes advantage of multi-characters comparison on modern processors (16-128

bits).– Divides search time by memory bus width.

• Transforms search in bitlevel operations.

• Complexities

n size of text ��� size of alphabet nocc number of occurrencesm size of pattern |P| sum of the sizes of the patterns in the set of patterns P B Size of blocks

String Matching Complexities

Introduction

ApplicationArchitectures

NetworkCircuit

• Performance– Lack of representative database for filtering patterns

• Usually evaluated using – snort pattern IDS database

– ClamAV pattern antivirus database.

– Software implementations• 0.1-2Gb/s.

• Usually using boyer-moore, aho-corasick and wu-manber.

• probabilistic techniques (Bloom filters, pre-filtering).

– Hardware implementations• 0.5-20Gb/s.

• Usually implement NFA/DFA optionally using TCAM.

• Usually evaluated using– Number Look Up Table/Stored bytes ratio.

– Number Look Up Table/Throughput ratio.

Application level Filter

• Overview

Filter

Filtering Policy

Application-LevelInformation

Pattern Matching

Incomming Packets

Authorized Packets

Internal Interface

External Interface

StateLookup

StateChecking

Protocol Analysis

Normalization

Introduction

ApplicationArchitectures

NetworkCircuit

Page 40: Overview Filtering Architecturespaul_o/Courses/filters_arch.pdf · Cambridge University Press, 2002. • Handbook of exact string matching algorithms, Charras, ... • Hierarchical

Application level Filter

• Overview

Filter

Filtering Policy

Application-LevelInformation

Pattern Matching

Incomming Packets

Authorized Packets

Internal Interface

External Interface

StateLookup

StateChecking

Protocol Analysis

Normalization

Introduction

ApplicationArchitectures

NetworkCircuit

State Checking

• Several aspects (complementary).– Checking that protocol messages sequencing is compliant with protocol

specification.

– Checking that message does not contain known attacks.• Similar to knowledge based IDS.

– Checking that protocol usage is compliant with legitimate users behavior.• Similar to behavior based IDS.

Introduction

ApplicationArchitectures

NetworkCircuit

State Checking

• Protocol messages sequencing checks.– Formal descriptions can be used to build code automatically and verify

protocol properties.• SDL (Specification and Description Language).

• LOTOS (Language Of Temporal Ordering Specification).

• …

Introduction

ApplicationArchitectures

NetworkCircuit

InformalProtocol

Description

FormalProtocol

Description

ParserCode

ParserExecutable

ProgrammerParser

generator Compiler

Authentication

• Motivations– Improve connection between users and addresses.

Authentication Server

� User authenticates to authentication server.

Authentication server relates IP address and user name to generate filtering rules according to users rights. Rules are sent to filtering module.

User communicates with external device.

Introduction

ApplicationArchitectures

NetworkCircuit

Page 41: Overview Filtering Architecturespaul_o/Courses/filters_arch.pdf · Cambridge University Press, 2002. • Handbook of exact string matching algorithms, Charras, ... • Hierarchical

Examples

• Ipfilter– http://coombs.anu.edu.au/~avalon/ip-filter.html.

– Works on most Unixes.

– Public source licence (Not BSD).

• Functionnalities– Stateless Packet filtering.

– True stateful packet filtering.

– NAT (NAT, PAT).

– Transparent proxying.

• Organization– Filtering rules stored in a configuration file.

– Software uses configuration file to configure filtering module.

Espace utilisateur

Noyau

Carte Ethernet

Driver Ethernet ARP

Telnet

Protocoles (IP, UDP, TCP, ICMP...)

Sockets

IPFilter

Ipf

Ipfstat

FTP

DNS

HTTP

InterfaceIPFilter

Introduction

ApplicationArchitectures

NetworkCircuit

IPFilter• Architecture

INV

+-------------------------+--------------------------+| V || Network Address Translation || | || authenticated | || +-------<---------+ || | V || V IP Accounting || | V || | Fragment Cache Check--+ || V V V || | Packet State Check-->+ || | +->--+ | | || | | | V | || V groups Firewall check V || | | | | | || | +--<-+ | | || +---------------->|<-----------+ || V || +---<----+ || function | || +--->----+ || V |

+--|---<--- fast-route ---<--+ || | V || +-------------------------+--------------------------+| || pass only| VV [KERNEL TCP/IP Processing]

Introduction

ApplicationArchitectures

NetworkCircuit

IPFilter

• Fragment cache check.– Ipfilter does not deal with packets when frag#0 does not contain transport

information.• These packets are not considered legitimate.

• Fragment #0 contains IP+Transport Information.

– A state is kept in the state table iff:• Fragment #0 matches classification “permit” rule.

• Rule indicates that fragment state should be kept.

– Fragment #n n>0 are• Matched against fragment states.

– If a match is found:» Used to update fragment states table.

» Either forwarded to destination bypassing other checks or dropped.

– If no match is found:» Dropped.

– All forwarded fragments are sent as is (no reassembly).

Introduction

ApplicationArchitectures

NetworkCircuit

IPFilter• Architecture

INV

+-------------------------+--------------------------+| V || Network Address Translation || | || authenticated | || +-------<---------+ || | V || V IP Accounting || | V || | Fragment Cache Check--+ || V V V || | Packet State Check-->+ || | +->--+ | | || | | | V | || V groups Firewall check V || | | | | | || | +--<-+ | | || +---------------->|<-----------+ || V || +---<----+ || function | || +--->----+ || V |

+--|---<--- fast-route ---<--+ || | V || +-------------------------+--------------------------+| || pass only| VV [KERNEL TCP/IP Processing]

Introduction

ApplicationArchitectures

NetworkCircuit

Page 42: Overview Filtering Architecturespaul_o/Courses/filters_arch.pdf · Cambridge University Press, 2002. • Handbook of exact string matching algorithms, Charras, ... • Hierarchical

IPFilter

V [KERNEL TCP/IP Processing]| || +-------------------------+--------------------------+| | | || | V || | Fragment Cache Check--+ || | | | || | V V || | Packet State Check-->+ || | | | || | V | |V | Firewall Check | || | | V || | |<-----------+ || | V || | IP Accounting || | | || | V || | Network Address Translation || | | || | V || +-------------------------+--------------------------+| || pass onlyV |+--------------------------->|

VOUT

• Architecture

Introduction

ApplicationArchitectures

NetworkCircuit

Conclusion

• Packet filters efficiency depends on many parameters.– Location.

– Filtering process.

– Administration.

– Interaction with other security components.

• Packet filters challenges.– Dynamic applications (no fixed ports).

– End to end encryption.

– Performance (Internal bus, states management, dynamic policies).

– New protocols (STCP, …)

Introduction

ApplicationArchitectures

NetworkCircuit

Thank You !