Hackito Ergo Sum 2011: Capture me if you can!
DESCRIPTION
My slides for the Hackito Ergo Sum 2011 conference in Paris
TRANSCRIPT
Capture me if you can!
Sebastien Tricaud, Picviz Labs
Hackito Ergo Sum (Paris, France) 2011
1/54
$ whoami
• Sebastien Tricaud
• Picviz Labs Director
• Picviz Labs is the editor of Picviz Inspector, a data-mining software for security
• Honeynet Project CTO
• 15 years of various IDS implementations
2/54
1 Introduction
2 Network Capture
3 Logs Capture
4 CUDA
5 Visualization
6 Conclusion
3/54
Context
Once upon a time. . .
Two days ago, at CERIAS, Mr. Neal Ziring said:
The attack data is often lost in the noise of events
4/54
Context
Mr. Neal Ziring is currently a technical director in the Information Assurance Directorate (IAD), at NSA. The IAD provides cryptographic, network, and operational security products and services to protect and defend national security systems.
5/54
Talk objective
How capture can be performed and managed to effectively find incidents (attacks, document leaks, etc.) in large networks.
6/54
Find incidents in large networks: Network traffic
1 Capture all the traffic
2 Someone reports an incident
3 Run Snort on the captured traffic
• Examples from two countries:
  • 30 Gb of Netflow traffic per 24 hours for a country of 20 million people (about 1,700 events/s; 510,000 events/5 min)
  • 5 min Netflow capture on the main backbone of a country of 45 million people: 3 million events/5 min
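As a quick sanity check of these figures, the per-second and per-five-minute rates can be converted into each other. A minimal sketch; the two helper names are made up for this illustration and are not from the talk:

```c
#include <stdint.h>

/* Events seen in a window of `seconds`, assuming a steady per-second rate. */
uint64_t events_in_window(uint64_t events_per_sec, uint64_t seconds)
{
    return events_per_sec * seconds;
}

/* Average events per second over a window of `seconds` (integer division). */
uint64_t events_per_second(uint64_t events, uint64_t seconds)
{
    return events / seconds;
}
```

With these, events_in_window(1700, 300) gives the 510,000 events per 5 minutes quoted for the first country, and events_per_second(3000000, 300) gives 10,000 events/s for the second one, roughly six times the first rate.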
7/54
2 Network Capture
8/54
Capture with libpcap
u_char *packet;
struct timeval packet_tv;
struct pcap_pkthdr pheader;
...
packet = (u_char *) pcap_next(pcaph, &pheader);
while (packet) {
    packet_tv = pheader.ts;
    t = packet_tv.tv_sec;
    strtime = ctime(&t);
    if (ntohs(ether->eth_type) == ETH_TYPE_IP) {
        ip = (struct ip_hdr *) (packet + ETH_HDR_LEN);
        ...
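The cast in the loop above assumes the frame really is Ethernet carrying IPv4 and is long enough. Below is a minimal, libpcap-free sketch of that step with bounds checks, working on a raw byte buffer such as the one pcap_next() returns. ETH_HDR_LEN and ETH_TYPE_IP are given their standard values here; struct ipv4_info and parse_ipv4_frame() are names invented for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define ETH_HDR_LEN 14      /* dst MAC (6) + src MAC (6) + EtherType (2) */
#define ETH_TYPE_IP 0x0800  /* EtherType value for IPv4 */

/* Hypothetical container for the IPv4 fields we care about (host byte order). */
struct ipv4_info {
    uint8_t  header_len;  /* IPv4 header length in bytes */
    uint8_t  protocol;    /* 6 = TCP, 17 = UDP, ... */
    uint32_t src;
    uint32_t dst;
};

static uint16_t read_be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

static uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Returns 0 on success, -1 if the frame is too short or is not IPv4. */
int parse_ipv4_frame(const uint8_t *frame, size_t len, struct ipv4_info *out)
{
    const uint8_t *ip;
    uint8_t ihl;

    if (len < ETH_HDR_LEN + 20)                   /* minimal IPv4 header is 20 bytes */
        return -1;
    if (read_be16(frame + 12) != ETH_TYPE_IP)     /* EtherType sits at offset 12 */
        return -1;

    ip = frame + ETH_HDR_LEN;
    if ((ip[0] >> 4) != 4)                        /* version nibble */
        return -1;
    ihl = (uint8_t)((ip[0] & 0x0f) * 4);          /* header length in 32-bit words */
    if (ihl < 20 || len < (size_t)ETH_HDR_LEN + ihl)
        return -1;

    out->header_len = ihl;
    out->protocol   = ip[9];
    out->src        = read_be32(ip + 12);
    out->dst        = read_be32(ip + 16);
    return 0;
}
```

Reading fields byte by byte avoids both alignment problems and the need for ntohs()/ntohl() on the extracted values.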
9/54
How does libpcap work?
• Layer 2
• Packet copied! (ahah)
• Apply a BPF filter
• Get the data
10/54
Netfilter QUEUE (nfqueue)
11/54
DAQ
(Awesome) Data Acquisition Library written by Sourcefire.
Available from http://www.snort.org
Unifies:
• AFPacket
• ipqueue
• netfilter_queue
• libpcap
12/54
Other ways to capture
• Daemonlogger: relies on libpcap
• Streams (git clone git://git.carnivore.it/streams.git): relies on libpcap just for BPF
• Various works from Luca Deri with PF_RING
• Using GPGPU
13/54
Now you (perhaps) got your packet!
The packet is captured, fine! However:
• It can be fragmented
• If you run signature matching, UTF-8 encoding can bypass it
• A protocol like RPC needs to be decoded
• The attack can be located at different DoD model levels
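One well-known form of the encoding bypass is the overlong UTF-8 sequence: the bytes 0xC0 0xAF decode to "/" in a permissive decoder, so a signature looking for the raw bytes "../" never sees them. The sketch below is an illustration under that assumption; raw_match() and fold_overlong() are invented names, and the lenient decoder stands in for a permissive target application (strict UTF-8 decoders reject overlong forms).

```c
#include <stddef.h>
#include <string.h>

/* Naive signature check: look for the raw bytes of `sig` anywhere in `buf`. */
int raw_match(const unsigned char *buf, size_t len, const char *sig)
{
    size_t n = strlen(sig);
    size_t i;

    for (i = 0; n > 0 && i + n <= len; i++)
        if (memcmp(buf + i, sig, n) == 0)
            return 1;
    return 0;
}

/*
 * Lenient normalisation: fold overlong 2-byte sequences (lead byte 0xC0 or
 * 0xC1 followed by a continuation byte) back to the ASCII byte they encode.
 * Everything else is copied unchanged. `out` must hold at least `len` bytes.
 * Returns the number of bytes written.
 */
size_t fold_overlong(const unsigned char *in, size_t len, unsigned char *out)
{
    size_t i, o = 0;

    for (i = 0; i < len; i++) {
        if ((in[i] == 0xC0 || in[i] == 0xC1) &&
            i + 1 < len && (in[i + 1] & 0xC0) == 0x80) {
            out[o++] = (unsigned char)(((in[i] & 0x1F) << 6) | (in[i + 1] & 0x3F));
            i++;  /* skip the continuation byte */
        } else {
            out[o++] = in[i];
        }
    }
    return o;
}
```

On the payload '.', '.', 0xC0, 0xAF, 'e', 't', 'c', raw_match() for "../" fails on the wire bytes and succeeds once fold_overlong() has been applied, which is why the capture side has to normalise the same way the target does.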
14/54
Fragmentation
Let’s have a look at Linux:
• IPv4: linux-src/net/ipv4/ip_fragment.c
• IPv6: linux-src/net/ipv6/reassembly.c
How it is performed in IPv4:
• Defragmentation happens with the function ip_defrag()
• Called only by:
  • ip_local_deliver()
  • ip_call_ra_chain: only if the socket is tied to an interface
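To make the idea of defragmentation concrete, here is a minimal user-space sketch of reassembly by fragment offset. It is not the kernel's ip_defrag(): it assumes all fragments belong to one datagram, arrive without overlaps or duplicates, and fit in the output buffer. struct fragment and reassemble() are names invented for this example; the flag and mask values are those of the IPv4 header.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define IP_MF      0x2000  /* "more fragments" flag in the flags/offset field */
#define IP_OFFMASK 0x1fff  /* fragment offset, expressed in 8-byte units */

/* Hypothetical view of one captured fragment. */
struct fragment {
    uint16_t frag_off;     /* flags + offset field, host byte order */
    const uint8_t *data;   /* payload carried by this fragment */
    size_t len;            /* payload length in bytes */
};

/*
 * Copies every fragment to its offset in `out`.
 * Returns the reassembled length, or -1 if the last fragment is missing,
 * bytes are missing, or the datagram does not fit in `cap` bytes.
 */
long reassemble(const struct fragment *frags, size_t n, uint8_t *out, size_t cap)
{
    size_t total = 0;     /* known once the fragment with MF cleared is seen */
    size_t received = 0;  /* payload bytes copied so far */
    int have_last = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        size_t off = (size_t)(frags[i].frag_off & IP_OFFMASK) * 8;

        if (off + frags[i].len > cap)
            return -1;
        memcpy(out + off, frags[i].data, frags[i].len);
        received += frags[i].len;

        if (!(frags[i].frag_off & IP_MF)) {  /* last fragment fixes the total size */
            total = off + frags[i].len;
            have_last = 1;
        }
    }
    /* Without overlaps or duplicates, a complete datagram has received == total. */
    if (!have_last || received != total)
        return -1;
    return (long)total;
}
```

Fragments can be passed in any order; a signature engine that inspects each fragment separately would miss a pattern split across two of them, which is the reason to reassemble before matching.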
15/54
• Linux does not defragment upon FORWARD
• Netfilter may do it
• modprobe nf_conntrack_ipv4
16/54
We captured, we want evils!
Snort gives us several ways to find the evil:
• Binary: content:"|0A 00 00 01 85 04 00 00 80|root|00|" (sid:1775)
• Simple pattern: content:"fuck fuck fuck" (sid:1316)
• PCRE: pcre:"/^\x3c(REQIMG|RVWCFG)\x3e/ism" (sid:2460)
Problem: how does Snort manage its pattern matching algorithms along with PCRE? Is each PCRE tried on each packet?
17/54
snort PCRE lookup
• Long patterns are easier to find
• PCRE and pattern matching within Snort:
  • Search for the longest pattern in each signature
  • function fpAddLongestContent() in fpcreate.c
  • The traffic is prequalified (MPSE)
  • Rules are sequentially tested
  • The PCRE option is ignored until the complete rule test after the prequalification
• PCRE uses its own DFA/NFA
⇒ The fewer PCREs we have, the better off we are.
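The prequalify-then-regex idea can be sketched in a few lines. This is not Snort's code: a single strstr() stands in for the multi-pattern search engine, a POSIX extended regex stands in for PCRE, payloads are assumed to be NUL-terminated strings, and struct rule, rule_matches() and regex_runs are names invented for the example.

```c
#include <regex.h>
#include <string.h>

/* Hypothetical rule: one fixed string plus one regular expression. */
struct rule {
    const char *content;  /* longest fixed string of the signature */
    const char *pcre;     /* POSIX extended regex standing in for the PCRE */
};

int regex_runs = 0;       /* counts how often the expensive step actually ran */

/* Returns 1 on match, 0 on no match, -1 if the regex does not compile. */
int rule_matches(const struct rule *r, const char *payload)
{
    regex_t re;
    int rc;

    /* Prequalification: cheap fixed-string search first. */
    if (strstr(payload, r->content) == NULL)
        return 0;                      /* regex never evaluated */

    /* Only prequalified payloads pay for the regular expression. */
    if (regcomp(&re, r->pcre, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    regex_runs++;
    rc = regexec(&re, payload, 0, NULL, 0);
    regfree(&re);
    return rc == 0;
}
```

With a made-up rule such as content "/etc/passwd" and regex "^GET [^ ]*/etc/passwd", an ordinary "GET /index.html HTTP/1.1" request is rejected by the string search alone and regex_runs stays at zero. A real engine would compile each regex once at load time rather than per packet; it is compiled inline here only to keep the sketch short.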
18/54
Netflow
• It is easier to investigate with connection flow
• Looking at TCP SYN is better for understanding than the whole SYN>SYN-ACK>ACK>PSH>PSH-ACK, etc.
• Streams was designed to help you there
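A flow record collapses many packets into one line per connection. The sketch below shows the general idea with a 5-tuple key and packet/byte counters; it is not how Streams or any Netflow exporter is implemented. The flows are unidirectional, the table uses a linear search for brevity (a real collector would hash the key), and all type and function names are made up for the illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_FLOWS 256

/* Hypothetical 5-tuple identifying a unidirectional flow. */
struct flow_key {
    uint32_t src, dst;
    uint16_t sport, dport;
    uint8_t  proto;
};

struct flow {
    struct flow_key key;
    uint64_t packets;
    uint64_t bytes;
};

struct flow_table {
    struct flow flows[MAX_FLOWS];
    size_t count;
};

static int same_key(const struct flow_key *a, const struct flow_key *b)
{
    return a->src == b->src && a->dst == b->dst &&
           a->sport == b->sport && a->dport == b->dport &&
           a->proto == b->proto;
}

/* Accounts one packet of `len` bytes; returns its flow, or NULL if the table is full. */
struct flow *flow_add_packet(struct flow_table *t, const struct flow_key *k, uint32_t len)
{
    size_t i;

    for (i = 0; i < t->count; i++) {
        if (same_key(&t->flows[i].key, k)) {
            t->flows[i].packets++;
            t->flows[i].bytes += len;
            return &t->flows[i];
        }
    }
    if (t->count == MAX_FLOWS)
        return NULL;
    t->flows[t->count].key = *k;
    t->flows[t->count].packets = 1;
    t->flows[t->count].bytes = len;
    return &t->flows[t->count++];
}
```

The table must be zero-initialised before use. Feeding it every packet of a TCP session leaves a single entry per direction, which is the view an investigator usually wants first.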
19/54
3 Logs Capture
20/54
Logs
Logs are heavily used in forensic work for cybercrime investigations
Question: who cares about logs? Their weaknesses, normalization, etc.?
21/54
SSH default accounts testing
sshd[6574]: error: PAM: Authentication failure for root from 192.168.12.2
sshd[6574]: error: PAM: Authentication failure for guest from 192.168.12.2
sshd[6574]: error: PAM: Authentication failure for printer from 192.168.12.2
sshd[6574]: error: PAM: Authentication failure for lp from 192.168.12.2
sshd[6574]: error: PAM: Authentication failure for admin from 192.168.12.2
22/54
Detection dilemma
1 Detecting
  • A user enumeration is more likely to get caught and correlated
  • Use tools like OSSEC and get it right in your mailbox
  • OSSEC and other tools like it need logs to analyze and detect things
2 Common weaknesses of log analyzers
  • Signature based
  • PCRE based (with PCRE weaknesses as well, but this is for another talk)
  • Needs food == Needs logs
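A user enumeration like the sshd failures shown earlier can be correlated with very little code, as long as the log lines exist and keep their format. The sketch below is a toy, not how OSSEC works: it follows a single source address, relies on the exact "Authentication failure for USER from IP" wording, and every name in it is invented for the example.

```c
#include <stdio.h>
#include <string.h>

#define MAX_USERS 16

/* Hypothetical per-source state: which account names were tried. */
struct source_stats {
    char ip[46];
    char users[MAX_USERS][33];
    int  user_count;
};

/* Extracts user and source address; `user` needs 33 bytes, `ip` 46. Returns 1 on success. */
int parse_auth_failure(const char *line, char *user, char *ip)
{
    const char *p = strstr(line, "Authentication failure for ");

    if (p == NULL)
        return 0;
    return sscanf(p, "Authentication failure for %32s from %45s", user, ip) == 2;
}

/*
 * Feeds one log line. Tracks the first source address seen and returns the
 * number of distinct account names that address has tried so far.
 */
int track_failure(struct source_stats *s, const char *line)
{
    char user[33], ip[46];
    int i;

    if (!parse_auth_failure(line, user, ip))
        return s->user_count;              /* not a failure line: ignored */
    if (s->ip[0] == '\0')
        strcpy(s->ip, ip);                 /* first source seen becomes the tracked one */
    else if (strcmp(s->ip, ip) != 0)
        return s->user_count;              /* other sources are out of scope here */

    for (i = 0; i < s->user_count; i++)
        if (strcmp(s->users[i], user) == 0)
            return s->user_count;          /* same account tried again */
    if (s->user_count < MAX_USERS)
        strcpy(s->users[s->user_count++], user);
    return s->user_count;
}
```

An alert could be raised once the returned count passes some threshold (three distinct accounts, say, which is an arbitrary choice). The dependency on the exact log wording is precisely the "needs food" weakness: change or suppress the log line and the detector goes blind.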
23/54
Know Your Enemy
Log analyzer enemy == Configurable log
24/54
Squid
Log Format configuration
logformat squid %ts.%03tu %6tr %>a %Ss/%03>Hs %<st %rm %ru %un %Sh/%<A %mt
Log Format options
...
[http::]rm Request method (GET/POST etc)
[http::]ru Request URL
[http::]rp Request URL-Path excluding hostname
...
25/54
ProFTPd
Log with mod_log
Log Format configuration
LogFormat default "%h %l %u %t \"%r\" %s %b"
Log Format options
%A - Anonymous username (password given)
%a - Remote client IP address
%b - Bytes sent for request
26/54
Apache
Log with mod_log_config
Log Format configuration
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
Cool options!
• %b did you see this %b?
• %b: Size of response in bytes, excluding HTTP headers. In CLF format, i.e. a '-' rather than a 0 when no bytes are sent.
• It is possible to exploit this weakness
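The slide does not spell out the exploit, so the following is only one plausible reading: an analyzer that expects a number in the %b position fails to parse, and therefore silently skips, any line where the field is '-'. The sketch contrasts a naive parser with a tolerant one; both function names are invented, and `tail` is assumed to be the part of the log line that follows the quoted request (status, then size).

```c
#include <stdio.h>
#include <string.h>

/* Naive analyzer: expects "<status> <bytes>" with both fields numeric. */
int parse_naive(const char *tail, int *status, long *bytes)
{
    return sscanf(tail, "%d %ld", status, bytes) == 2;
}

/* Tolerant analyzer: accepts the CLF '-' and treats it as zero bytes. */
int parse_clf(const char *tail, int *status, long *bytes)
{
    char size[32];

    if (sscanf(tail, "%d %31s", status, size) != 2)
        return 0;
    if (strcmp(size, "-") == 0) {
        *bytes = 0;
        return 1;
    }
    return sscanf(size, "%ld", bytes) == 1;
}
```

For the tail "304 -", parse_naive() reports failure while parse_clf() returns status 304 with 0 bytes. Any request that makes the server send no body therefore lands in a line the naive analyzer never evaluates.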
27/54
Log misuse 0-day
A log misuse 0-day is:
• an application that fails to properly log information it could have logged
• log injection
• incorrectly logged information
There is NO log misuse 0-day database!
28/54
Simple Log misuse 0-day
Back on ProFTPd, remember:
Log Format options
%A - Anonymous username (password given)
password given = gets anything
Code managing the password
#define PR_TUNABLE_PATH_MAX 1024
char arg[PR_TUNABLE_PATH_MAX+1] = {'\0'};

case META_ANON_PASS:
  argp = arg;
  pass = pr_table_get(session.notes, "mod_auth.anon-passwd", NULL);
  if (!pass)
    pass = "UNKNOWN";

  sstrncpy(argp, pass, sizeof(arg));
→ Remote log injection possible, in /var/log/proftpd/auth.log
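The copy above writes whatever the client supplied straight into the log buffer. A generic mitigation, shown here as a sketch and not as ProFTPd's actual fix, is to neutralise control characters before the value reaches the log file, so that a supplied string cannot start a new, forged log line. sanitize_for_log() is a name invented for the example.

```c
#include <stddef.h>

/*
 * Copies `src` into `dst` (capacity `cap`, always NUL-terminated when cap > 0),
 * replacing ASCII control characters, including CR and LF, with '?'.
 * Returns the number of characters written, excluding the terminator.
 */
size_t sanitize_for_log(char *dst, size_t cap, const char *src)
{
    size_t i = 0;

    if (cap == 0)
        return 0;
    for (; src[i] != '\0' && i + 1 < cap; i++) {
        unsigned char c = (unsigned char)src[i];

        dst[i] = (c < 0x20 || c == 0x7f) ? '?' : (char)c;
    }
    dst[i] = '\0';
    return i;
}
```

This only stops line splitting. Text that imitates other fields on the same line still gets through, so analyzers should also quote or delimit user-controlled values.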
29/54
Log misuse database
Actually there is CWE. . .
• Common Weakness Enumeration
• CWE-778: Insufficient Logging
"When a security-critical event occurs, the software either does not record the event or omits important details about the event when logging it."
30/54
CVE examples
• CVE-2003-1566: Microsoft IIS 5.0 does not log requests that use the TRACK method, which allows remote attackers to obtain sensitive information without detection.
• CVE-2007-3730: OpenVMS does not log the source IP.
• CVE-2008-1203: Adobe ColdFusion 8 and ColdFusion MX7 do not log failed connection attempts on the administrative interface.
• . . .
Those CVEs are still under review
31/54
YASA! (Yet Another Stealth Attack)
Ever seen this attack?
66.249.65.39 - - [28/Mar/2007:03:08:46 +0200] "GET /index.html HTTP/1.1" 404 394 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
32/54
4 CUDA
33/54
My laptop has an NVIDIA GeForce GT 420M
• 96 CUDA cores
• Memory bandwidth: 25.6 GB/s
• A thread block can run up to 512 threads
34/54
CUDA architecture
35/54
CUDA processing flow
36/54
Capture using CUDA: NetGPU
Available from http://code.google.com/p/netgpu
37/54
5 Visualization
38/54
Problems with SIEM and Intrusion Detection
• Capture is complex
• Rulesets are required: always after the problem
• Too many false positives
39/54
Why Visualization
Handle large data without extracting known events to correlate yourself.
40/54
Secviz
Visualization community website: http://www.secviz.org
41/54
Circos
42/54
Limitation
Enough with limitations.
43/54
How many events are in this picture?
44/54
Discover a successful attack in less than one minute
46/54
6 Conclusion
52/54
Conclusion
• Data are obviously lost in the noise of events today
• If we are creative, we may be able to solve this issue
• We have some technical limitations, we need to find ways to get around them
• We have some technical solutions (hint: SIEM), we need to find ways to get around them
• I strongly believe visualization has a great role to play in it
53/54
Questions?
• Email: [email protected]
• Company website: http://www.picviz.com
• Twitter: @tricaud
• Blog: http://logviz.blogger.com
54/54