DoS operational efforts
2015/09/14
Michael Hare
UW System Network
https://stats.uwsys.net/
Protecting the control plane
• Cisco 4500X, Cisco 1921:
  • SNMP/VTY ACLs
• Juniper MX:
  • In addition to SNMP/VTY ACLs, we also ACL NTP, Radius, DNS and routing [BFD, BGP, IGMP, LDP, MSDP, OSPF, PIM, RSVP, VRRP] as tightly as possible. [Happy to share specifics offline, just ask]
  • The routing engine also has to handle things like ARP, LACP, NDP, PVSTP, so packets that make it past all of the above are subject to compliance policers that sit on the punt path between the forwarding engine and the routing engine. This Juniper feature is called “ddos-protection” and ensures the routing engine doesn’t fall over during abuse [DoS] or accident [bridge loop]
Juniper DDoS protection
• Does NOT work on the forwarding path, only packets destined to be handled by the routing engine.
• Classifies punt path packets into various categories. [ARP, BGP, etc]
• Packet rate policer, burst, logging and detection [flow, IFL] tweakable per category.
• Operational data is collected over XML every 5 minutes. We use this data to set parameters, detect policed packet events, etc.
Juniper DDoS protection: example
• syslog of policed packet event
  • r-uwsuperior-hub-2: %DAEMON-4-DDOS_SCFD_FLOW_FOUND: A new flow of protocol ARP:aggregate on irb.157 with source addr -- -- -- is found at 2015-09-14 09:58:55 CDT
• Graph [all stats available in GNMIS]
  • https://stats.uwsys.net/cgi-bin/shorten.fcgi?i=102&c=9a401e0b465505c4
• Settings:
  • output of “r-uwsuperior-hub-2-re0> show ddos-protection protocols arp” is too big for the powerpoint.
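As a rough illustration of how a policed-flow syslog line like the one above could be turned into structured data for alarming, here is a minimal Python sketch. The regular expression is derived only from the single sample line on this slide, so other DDOS_SCFD message variants may not match; the function and field names are hypothetical.

```python
import re

# Pattern built from the one sample message on the slide (an assumption:
# other ddos-protection messages may be worded differently).
FLOW_FOUND = re.compile(
    r"^(?P<host>\S+): %DAEMON-\d-DDOS_SCFD_FLOW_FOUND: "
    r"A new flow of protocol (?P<group>[^:\s]+):(?P<ptype>\S+) "
    r"on (?P<ifl>\S+) with source addr (?P<source>.+?) "
    r"is found at (?P<when>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})"
)


def parse_flow_found(line):
    """Return a dict of fields from a FLOW_FOUND message, or None if no match."""
    match = FLOW_FOUND.match(line)
    return match.groupdict() if match else None


SAMPLE = (
    "r-uwsuperior-hub-2: %DAEMON-4-DDOS_SCFD_FLOW_FOUND: A new flow of "
    "protocol ARP:aggregate on irb.157 with source addr -- -- -- "
    "is found at 2015-09-14 09:58:55 CDT"
)

event = parse_flow_found(SAMPLE)
print(event)
```

The source address in the sample is redacted, so the sketch simply captures whatever text sits in that position.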
Monitoring: data collection
• SNMP and ICMP monitoring of availability and usage via FIDO
• XML data collection via screen scraping [separate from FIDO]
• IPFIX flow export [1:256 sampling rate] anycasted with samplicator to nfcapd
• Juniper firewall filter counters [XML collection]
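Because flows are sampled at 1:256, byte counts seen by the collector have to be scaled up before they approximate real traffic. A minimal sketch of that arithmetic, assuming simple linear scaling (the function name and the 300-second interval are illustrative, not taken from the collection software):

```python
SAMPLING_RATE = 256  # 1:256, as stated on the slide


def estimate_bps(sampled_bytes, interval_seconds, sampling_rate=SAMPLING_RATE):
    """Scale sampled bytes to an estimated bits-per-second rate."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return sampled_bytes * sampling_rate * 8 / interval_seconds


# 1,500,000 sampled bytes over a 300 second bin
estimate = estimate_bps(1_500_000, 300)
print(f"{estimate / 1e6:.2f} Mbps")
```

The estimate is statistical: short or small flows may be missed entirely at this sampling rate, so it is most meaningful for larger aggregates.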
SNMP/ICMP/XML specifics
• FIDO is running active/active from Madison and Milwaukee
• FIDO stores time series data on a variety of datapoints
  • ICMPv4/ICMPv6, ifTable [discards, errors, octets, packets], CPU
• XML scraping is running active/passive due to Juniper control plane limits; however, the resultant data is stored in both Madison and Milwaukee. We use XML where SNMP fails us [or is too burdensome]
  • CoS queue monitoring, firewall filter stats, DOM, BGP route stats, temperature monitoring
IPFIX
• IPv4/IPv6 only [no family MPLS]
• Collecting ingress on all IP interfaces on MX2010s
• Collecting ingress on all IP customer handoffs on MX104s; not all IP traffic between UWs must pass through an MX2010.
• We can collect on egress on IP customer handoffs on MX104s. We are replaying collected flows [delayed 5 minutes] to at least one UW campus.
• NFDUMP/NFSEN/home grown software in use for flow analysis and storing data into RRDs
Flow rrd data
• Lots of categories tracked in time series:
  • On-net vs on-net [across the entire system], v4 vs v6, etc:
  • Commodity vs Research vs Peering:
    • https://stats.uwsys.net/cgi-bin/shorten.fcgi?i=103&c=8cd1d65a85db5d8f
  • Subnet level, Protocol and Port info
    • Subnet: https://stats.uwsys.net/cgi-bin/shorten.fcgi?i=104&c=0fd94f6c9e85890a
    • Protocol: https://stats.uwsys.net/cgi-bin/shorten.fcgi?i=105&c=8cc1cb05ef6b01b7
    • Port: https://stats.uwsys.net/cgi-bin/shorten.fcgi?i=107&c=954532f579e74027
  • AS Statistics [but CDNs limit the usefulness]
  • External subnets: https://stats.uwsys.net/cgi-bin/shorten.fcgi?i=108&c=79084aed22e51be8
  • Domain / host info: utility hampered by the method chosen [PTR lookups]. Would need all client DNS lookups to be all seeing: -> https://flows.uwsys.net/cgi-bin/DNSQuery.fcgi
Juniper Firewall Filters
• Per-interface statistics enabled
• Anything that can be matched in a Juniper ACL term can be counted
• Anything that can be counted can be policed, but only by bps, not pps. Bps limit appears to be a user interface limitation, not hardware limitation
• Time series data is fed into a thresholding engine to do fast but crude anomalous traffic detection
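The slides don't describe how the thresholding engine decides what is anomalous. As a minimal sketch of what "fast but crude" detection could look like, the Python below flags a datapoint when it exceeds a multiple of the recent median. The multiplier, the floor, and the function name are all assumptions chosen for illustration.

```python
from statistics import median


def is_anomalous(history, latest, factor=5.0, floor=100.0):
    """Crude check: is `latest` more than `factor` times the recent median?

    history: recent per-interval rates (e.g. pps for one counter)
    floor:   minimum baseline, so near-idle counters don't alarm on noise
    """
    if not history:
        return False  # nothing to compare against yet
    baseline = max(median(history), floor)
    return latest > factor * baseline


recent_syn_pps = [1000, 1200, 900, 1100, 1050]
print(is_anomalous(recent_syn_pps, 9000))  # well above 5x the median
print(is_anomalous(recent_syn_pps, 2000))  # elevated, but under threshold
```

Using a median rather than a mean keeps one earlier spike from inflating the baseline, at the cost of reacting slowly to sustained shifts.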
Juniper Firewall Filters example:
term count-flag-syn {
    from {
        tcp-flags syn;
    }
    then {
        count :count:tcp:flag-syn;
        next term;
    }
}
term dns-packetsize-1400-to-9999 {
    from {
        packet-length 1400-9999;
        port 53;
    }
    then {
        count :count:dns-packetsize-1400-to-9999;
        next term;
    }
}
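Counters like these are cumulative, so they only become useful time series once two successive samples are turned into rates. A minimal sketch of that step, assuming each collection yields a timestamp plus packet and byte totals for a counter (the `Sample` type and its field names are hypothetical, not the collector's actual data model):

```python
from typing import NamedTuple, Optional, Tuple


class Sample(NamedTuple):
    timestamp: float  # seconds since epoch
    packets: int      # cumulative packet count
    octets: int       # cumulative byte count


def rates(prev: Sample, curr: Sample) -> Optional[Tuple[float, float]]:
    """Return (pps, bps) between two samples, or None if they can't be compared."""
    dt = curr.timestamp - prev.timestamp
    # A counter that went backwards was cleared or the filter was reloaded.
    if dt <= 0 or curr.packets < prev.packets or curr.octets < prev.octets:
        return None
    pps = (curr.packets - prev.packets) / dt
    bps = (curr.octets - prev.octets) * 8 / dt
    return pps, bps


before = Sample(timestamp=0, packets=1_000, octets=64_000)
after = Sample(timestamp=300, packets=31_000, octets=1_984_000)
print(rates(before, after))
```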
A crazy amount of counters: control plane
group='fw-inet-protect-re'
:accept:igmp:accepted :accept:udp:ldp-discover :accept:igmp-ldp-igmp :accept:tcp:ldp-unicast :accept:ospf:accepted :accept:pim:accepted :accept:rsvp:accepted :accept:tldp-discover :accept:vrrp :accept:dns :accept:mpls-traceroute:accepted :accept:udp:ntp :accept:radius :accept:bfd-multihop :accept:bfd :accept:tcp:BGP :discard:udp:ntp :accept:icmp:accepted :accept:tcp:MSDP :accept:udp:traceroute
group='fw-inet-remote-access' :accept:tcp-ftp :accept:udp:SNMP :accept:tcp:SSH :discard:remote-access :accept:tcp:established
A crazy amount of counters: forwarding plane
group='fw-bridge-count-traffic' :count:arp :count:broadcast :count:ipv4 :count:ipv6 :count:multicast-v4 :count:multicast-v6
group='fw-inet-block-application-traffic' :count:udp:ntp-other :accept:udp:ntp :accept:udp:ssdp-dns :accept:udp:ssdp
group='fw-inet-block-bogons' :discard:bogons
group='fw-inet-count-cos-traffic-input' :count:cos-assured-forwarding :count:cos-best-effort
group='fw-inet-count-traffic' :count:esp:traffic :count:fragment :count:icmp:traffic :count:tcp:flag-ack :count:tcp:flag-fin :count:tcp:flag-psh :count:tcp:flag-rst :count:tcp:flag-syn :count:tcp:flag-urg :count:tcp:traffic :count:udp:rpc :count:udp:traffic :count:udp:zero :count:dns-packetsize-0-to-64 :count:dns-packetsize-65-to-576 :count:dns-packetsize-1400-to-9999 :count:packetsize-0-to-64 :count:packetsize-65-to-128 :count:packetsize-129-to-512 :count:packetsize-513-to-1500 :count:packetsize-1501-to-9999
• Reporting and thresholding of RRD datapoints integrated into FIDO and other generated reports
  • https://stats.uwsys.net/cgi-bin/rrd_reports.cgi
• GNMIS: silly name, web 1.0 looks, but lots of data
  • https://stats.uwsys.net/cgi-bin/gnmis.fcgi
• Alarm, syslog and report analysis provides feedback to changing/improving Juniper ddos-protection, firewall filters, etc
• Significant events shared with [email protected] [should there be a security specific list?]
New fun: DNS DoS
• Volumetric UDP port 53 attacks: three observed this semester so far [UW Eau Claire, UW Madison, UWC Rock Co]
• UDP packet sizes > 1400 bytes are generally well under 1 Mbps
• TCP is more likely for EDNS and DNSSEC, although fragmented UDP is RFC compliant
• US-CERT TA15-240A, “Controlling Outbound DNS Access”, recommends allowing only enterprise DNS via the border
• We could configure different port 53 policers based on destination address [ie, unfettered or very loose access for enterprise DNS but stricter policers for random internal host port 53 traffic, etc]
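To make the last idea concrete, here is a small Python sketch of the selection logic only: pick a policer by whether the destination is a known enterprise DNS server. The addresses are documentation prefixes and the policer names are invented for illustration; on the routers this would be expressed as firewall filter terms, not Python.

```python
import ipaddress

# Hypothetical enterprise DNS servers (documentation address space).
ENTERPRISE_DNS = [
    ipaddress.ip_network("192.0.2.53/32"),
    ipaddress.ip_network("2001:db8::53/128"),
]

# Hypothetical policer names.
LOOSE_POLICER = "dns-enterprise-loose"
STRICT_POLICER = "dns-default-strict"


def policer_for(destination):
    """Choose which port 53 policer applies to traffic toward `destination`."""
    dst = ipaddress.ip_address(destination)
    if any(dst in net for net in ENTERPRISE_DNS):
        return LOOSE_POLICER
    return STRICT_POLICER


for addr in ("192.0.2.53", "192.0.2.99", "2001:db8::53"):
    print(addr, "->", policer_for(addr))
```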
FIN
• I have a powerpoint on MPLS, but don’t have time to present. This presentation and others are available at:
https://stats.uwsys.net/other/
Things I want to talk about if I have time, part 1: COS
• Current CoS queue setup
  • network-control: strict 5% of line rate, then competes with best-effort
  • assured-forwarding: [ip based admission h323 endpoints]
    • 1G links: strict 20% of line rate, then competes with best-effort
    • 10G links: strict 10% of line rate, then competes with best-effort
    • 100G links: strict 5% of line rate, then competes with best-effort
  • expedited-forwarding: 20% of line rate, then competes with best-effort [currently, all mpls VPNs]
  • best-effort: what's left
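To put numbers on those percentages, the sketch below works out the bandwidth each queue is allotted before it has to compete with best-effort, per link speed. The percentages come from the slide; the table layout and function name are just for illustration.

```python
# Percent of line rate per queue, keyed by link speed in Mbps (from the slide).
QUEUE_PERCENT = {
    1_000:   {"network-control": 5, "assured-forwarding": 20, "expedited-forwarding": 20},
    10_000:  {"network-control": 5, "assured-forwarding": 10, "expedited-forwarding": 20},
    100_000: {"network-control": 5, "assured-forwarding": 5,  "expedited-forwarding": 20},
}


def allotted_mbps(link_mbps):
    """Return {queue: Mbps} allotted before the queue competes with best-effort."""
    percents = QUEUE_PERCENT[link_mbps]
    return {queue: link_mbps * pct // 100 for queue, pct in percents.items()}


for speed in QUEUE_PERCENT:
    shares = allotted_mbps(speed)
    reserved = sum(shares.values())
    print(f"{speed} Mbps link: {shares}, total {reserved} Mbps")
```

On a 1G link the three queues add up to 45% of line rate; on 10G and 100G links that falls to 35% and 30%, because the assured-forwarding share shrinks as link speed grows.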
Things I want to talk about if I have time, part 1: COS
• Current CoS classification setup
  • Core interfaces retain marking across all families [ip, ethernet, mpls]
  • Untrusted [customer] IP traffic is classified into best-effort unless they are admitted by prefix-list into assured-forwarding [h323]
  • Untrusted [customer] ethernet traffic [bridging] is classified into best-effort
  • l2circuit MPLS traffic is transmuted as follows:
    • ip TOS equivalent 0,1 [includes dscp] = best-effort
    • ip TOS equivalent 2-7 [includes dscp] = expedited forwarding
  • E-VPN will likely be treated like l2circuit from a CoS perspective
• Q: Is this what we want?
  • Do we want DC backup to be demoted to best effort? Or less than best effort?
  • It's OK to have different CoS policies per VPN
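One way to read the l2circuit mapping is as a function of IP precedence, the top three bits of the TOS byte, which also covers DSCP because DSCP occupies the top six bits. That reading is an assumption, as is the function name; a minimal Python sketch under it:

```python
def l2circuit_forwarding_class(tos_byte):
    """Map an IP TOS byte to a forwarding class via its 3-bit precedence.

    Assumes "ip TOS equivalent 0,1" on the slide means precedence 0-1.
    """
    if not 0 <= tos_byte <= 255:
        raise ValueError("TOS byte must be 0-255")
    precedence = tos_byte >> 5  # top three bits
    return "best-effort" if precedence <= 1 else "expedited-forwarding"


# DSCP values shifted into TOS-byte position (DSCP << 2)
for name, dscp in (("default", 0), ("CS1", 8), ("AF21", 18), ("EF", 46)):
    print(name, "->", l2circuit_forwarding_class(dscp << 2))
```

Under this reading anything marked CS2 or higher, including all AF2x and above, lands in expedited-forwarding, which is relevant to the question of whether bulk traffic such as DC backup should be demoted.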
Things I want to talk about if I have time, part 2: MPLS
• Using BGP more specific advertisements to ride out a DoS attack
  • uwsys.net MX104s as licensed can handle several hundred ‘more specific’ routes per UW [down to host route]
  • More specific routes stay local to uwsys.net AS and are used for ingress traffic steering
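More specific routes steer traffic because forwarding follows the longest matching prefix. The sketch below shows that selection with a covering /24 and a host route for an attacked address; the prefixes are documentation ranges and the path labels are hypothetical.

```python
import ipaddress


def best_route(routes, destination):
    """Longest-prefix match: return the label of the most specific matching route."""
    dst = ipaddress.ip_address(destination)
    matches = [(net, label) for net, label in routes.items() if dst in net]
    if not matches:
        return None
    return max(matches, key=lambda item: item[0].prefixlen)[1]


routes = {
    ipaddress.ip_network("198.51.100.0/24"): "normal-path",
    # Host route for the address under attack, kept local to the AS.
    ipaddress.ip_network("198.51.100.10/32"): "alternate-path",
}

print(best_route(routes, "198.51.100.10"))  # follows the host route
print(best_route(routes, "198.51.100.20"))  # still follows the /24
```

Only the attacked address is steered; the rest of the /24 keeps its usual path.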