phd2013 lyamin: High packet rate on x86-64, clearing the 14.88 Mpps bar


Upload: alexander-lyamin

Post on 24-May-2015


TRANSCRIPT

Page 1

High packet rate on x86-64: clearing the 14.88 Mpps bar

<[email protected]>

Page 2

01.2012-05.2013: 7594 incidents total

[Chart: monthly share of incidents (0-100%), 2012-01 through 2013-05; series: spoof, full-connect]

Page 3

What's in fashion?

• UDP flood and amplification
• TCP (SYN (open|closed|firewalled) | ACK)
• ICMP flood (smurf)

L7 is out of style

Page 4

Who is to blame?

Page 5

Those damn aliens

static unsigned int tcp_timeouts[TCP_CONNTRACK_TIMEOUT_MAX] __read_mostly = {
    [TCP_CONNTRACK_SYN_SENT]    = 2 MINS,
    [TCP_CONNTRACK_SYN_RECV]    = 60 SECS,
    [TCP_CONNTRACK_ESTABLISHED] = 5 DAYS,
    [TCP_CONNTRACK_FIN_WAIT]    = 2 MINS,
    [TCP_CONNTRACK_CLOSE_WAIT]  = 60 SECS,
    [TCP_CONNTRACK_LAST_ACK]    = 30 SECS,
    [TCP_CONNTRACK_TIME_WAIT]   = 2 MINS,
    [TCP_CONNTRACK_CLOSE]       = 10 SECS,
    [TCP_CONNTRACK_SYN_SENT2]   = 2 MINS,
/* RFC1122 says the R2 limit should be at least 100 seconds.
   Linux uses 15 packets as limit, which corresponds
   to ~13-30min depending on RTO. */
    [TCP_CONNTRACK_RETRANS]     = 5 MINS,
    [TCP_CONNTRACK_UNACK]       = 5 MINS,
};
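A rough way to see why these defaults hurt under a spoofed flood: at steady state, the number of tracked entries is about the rate of new flows times the timeout of the state they get stuck in. The sketch below is only a back-of-the-envelope model. The function name and the 1 Mpps flood rate are assumptions chosen for illustration, not figures from the talk; the timeouts are the ones from the table above.

```python
# Back-of-the-envelope model: how many conntrack entries a flood of
# never-completed connections keeps alive at steady state.
# The flood rate in the example is an illustrative assumption.

TIMEOUTS_SECONDS = {
    "SYN_SENT": 2 * 60,               # 2 MINS
    "SYN_RECV": 60,                   # 60 SECS
    "ESTABLISHED": 5 * 24 * 60 * 60,  # 5 DAYS
}


def steady_state_entries(new_flows_per_second, state):
    """Little's law: entries = arrival rate * lifetime of each entry."""
    return new_flows_per_second * TIMEOUTS_SECONDS[state]


if __name__ == "__main__":
    rate = 1_000_000  # assumed: 1 Mpps of SYNs, each with a unique spoofed tuple
    print(steady_state_entries(rate, "SYN_SENT"))
```

Under these assumptions a lone spoofed SYN occupies a table slot for two minutes, so the table size needed grows linearly with the flood rate.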

Page 6

Who else is to blame?

top - 08:16:23 up 39 min,  1 user,  load average: 0.44, 0.16, 0.79
Tasks: 158 total,   2 running, 156 sleeping,   0 stopped,   0 zombie
Cpu0  :  0.0%us,  0.0%sy,  0.0%ni, 89.3%id,  0.0%wa,  0.0%hi, 10.7%si,  0.0%st
Cpu1  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu2  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu3  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu4  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu5  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu6  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu7  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu8  :  0.0%us,  0.0%sy,  0.0%ni, 49.8%id,  0.0%wa,  0.0%hi, 50.2%si,  0.0%st
Cpu9  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu10 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu11 :  0.0%us,  1.0%sy,  0.0%ni, 99.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu12 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu13 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu14 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu15 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  32921100k total,  4598792k used, 28322308k free,    15496k buffers
Swap:        0k total,        0k used,        0k free,    83252k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
   39 root      20   0     0    0    0 R  100  0.0   0:27.91 [ksoftirqd/8]
 1401 root      20   0     0    0    0 S    8  0.0   0:03.05 [kpktgend_8]
 5346 root      20   0     0    0    0 S    2  0.0   0:00.34 [kworker/8:0]
 5740 root      20   0 19356 1472 1076 R    1  0.0   0:00.12 top

Page 7

Who else is to blame?

YOU (you forgot to tune the network stack)

Page 8

A spherical server in a vacuum

Intel(R) Xeon(R) CPU E5-2670 x2
X520-DA2 (Intel® 82599ES)

Vanilla Linux 3.7.9

modprobe ixgbe (3.13.10) RSS=8

Page 9

What to do?

AFFINITY > BALANCER

% /etc/init.d/irqbalancer stop

% grep eth8 /proc/interrupts
123:  19   0   0   0   0   0   0   0   1   0   0   0   0   0   0   0  PCI-MSI-edge  eth8-TxRx-0
124:   0  15   0   0   0   0   0   0   1   0   0   0   0   0   0   0  PCI-MSI-edge  eth8-TxRx-1
125:   0   0  15   0   0   0   0   0   1   0   0   0   0   0   0   0  PCI-MSI-edge  eth8-TxRx-2
126:   0   0   0  15   0   0   0   0   1   0   0   0   0   0   0   0  PCI-MSI-edge  eth8-TxRx-3
127:   0   0   0   0  15   0   0   0   1   0   0   0   0   0   0   0  PCI-MSI-edge  eth8-TxRx-4
128:   0   0   0   0   0  15   0   0   1   0   0   0   0   0   0   0  PCI-MSI-edge  eth8-TxRx-5
129:   0   0   0   0   0   0  17   0   1   0   0   0   0   0   0   0  PCI-MSI-edge  eth8-TxRx-6
130:   0   0   0   0   0   0   0  15   1   0   0   0   0   0   0   0  PCI-MSI-edge  eth8-TxRx-7
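The listing shows each eth8 queue interrupt landing on its own CPU (queue 0 on CPU 0, queue 1 on CPU 1, and so on). One way to get such a layout is to write a one-bit hexadecimal CPU mask into /proc/irq/<irq>/smp_affinity for every queue. The sketch below only computes and prints those masks; the helper name is hypothetical, and the IRQ range 123-130 is taken from the listing above rather than discovered at run time.

```python
def affinity_masks(first_irq, queues):
    """Pin queue i (IRQ first_irq + i) to CPU i; return {irq: hex mask}."""
    return {first_irq + i: format(1 << i, "x") for i in range(queues)}


if __name__ == "__main__":
    for irq, mask in affinity_masks(123, 8).items():
        # Printed as shell commands only; nothing is written to /proc here.
        print(f"echo {mask} > /proc/irq/{irq}/smp_affinity")
```

Static pinning of this kind only holds while no balancing daemon rewrites the masks, which is presumably why the balancer is stopped first.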

Page 10

Better?

top - 07:40:25 up 3 min,  1 user,  load average: 4.61, 1.29, 0.44
Tasks: 164 total,   9 running, 155 sleeping,   0 stopped,   0 zombie
Cpu0  :  0.0%us,  0.0%sy,  0.0%ni, 49.8%id,  0.0%wa,  0.0%hi, 50.2%si,  0.0%st
Cpu1  :  0.0%us,  0.0%sy,  0.0%ni, 49.8%id,  0.0%wa,  0.0%hi, 50.2%si,  0.0%st
Cpu2  :  0.0%us,  0.0%sy,  0.0%ni, 49.8%id,  0.0%wa,  0.0%hi, 50.2%si,  0.0%st
Cpu3  :  0.0%us,  0.0%sy,  0.0%ni, 49.8%id,  0.0%wa,  0.0%hi, 50.2%si,  0.0%st
Cpu4  :  0.0%us,  0.0%sy,  0.0%ni, 49.8%id,  0.0%wa,  0.0%hi, 50.2%si,  0.0%st
Cpu5  :  0.0%us,  0.0%sy,  0.0%ni, 49.8%id,  0.0%wa,  0.0%hi, 50.2%si,  0.0%st
Cpu6  :  0.0%us,  0.0%sy,  0.0%ni, 49.8%id,  0.0%wa,  0.0%hi, 50.2%si,  0.0%st
Cpu7  :  0.0%us,  0.0%sy,  0.0%ni, 49.8%id,  0.0%wa,  0.0%hi, 50.2%si,  0.0%st
Cpu8  :  0.0%us,  0.0%sy,  0.0%ni, 90.2%id,  0.0%wa,  0.0%hi,  9.8%si,  0.0%st
Cpu9  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu10 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu11 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu12 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu13 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu14 :  0.0%us,  1.0%sy,  0.0%ni, 99.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu15 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  32921100k total,  4597288k used, 28323812k free,    15340k buffers
Swap:        0k total,        0k used,        0k free,    83240k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
   15 root      20   0     0    0    0 R   96  0.0   0:46.06 [ksoftirqd/2]
   23 root      20   0     0    0    0 R   96  0.0   0:46.04 [ksoftirqd/4]
   11 root      20   0     0    0    0 R   95  0.0   0:46.04 [ksoftirqd/1]
   19 root      20   0     0    0    0 R   95  0.0   0:46.03 [ksoftirqd/3]
   27 root      20   0     0    0    0 R   95  0.0   0:46.02 [ksoftirqd/5]
   31 root      20   0     0    0    0 R   95  0.0   0:46.08 [ksoftirqd/6]
   35 root      20   0     0    0    0 R   95  0.0   0:46.04 [ksoftirqd/7]
    3 root      20   0     0    0    0 R   93  0.0   0:45.23 [ksoftirqd/0]

Page 11

Even better?

# ethtool -K eth8 ntuple on
# ethtool -U eth8 flow-type udp4 action -1
Added rule with ID 8189
# ethtool -u eth8
8 RX rings available
Total 1 rules

Filter: 8189
    Rule Type: UDP over IPv4
    Src IP addr: 0.0.0.0 mask: 255.255.255.255
    Dest IP addr: 0.0.0.0 mask: 255.255.255.255
    TOS: 0x0 mask: 0xff
    Src port: 0 mask: 0xffff
    Dest port: 0 mask: 0xffff
    VLAN EtherType: 0x0 mask: 0xffff
    VLAN: 0x0 mask: 0xffff
    User defined: 0x0 mask: 0xffffffffffffffff
    Action: Drop

Page 12

Even better! (this is ~14.88 Mpps of UDP)

Tasks: 163 total,   1 running, 162 sleeping,   0 stopped,   0 zombie
Cpu0  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu1  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu2  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu3  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu4  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu5  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu6  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu7  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu8  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu9  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu10 :  1.0%us,  0.0%sy,  0.0%ni, 99.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu11 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu12 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu13 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu14 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu15 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  32921100k total,  4374344k used, 28546756k free,     7700k buffers
Swap:        0k total,        0k used,        0k free,    24036k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 4348 root      20   0 19356 1476 1076 R    1  0.0   0:00.03 top
    1 root      20   0  4120  688  588 S    0  0.0   0:01.22 init [3]
    2 root      20   0     0    0    0 S    0  0.0   0:00.00 [kthreadd]
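For reference, the ~14.88 Mpps figure is the theoretical packet rate of a 10 Gbit/s Ethernet link carrying minimum-size frames. The short calculation below is a sketch of that arithmetic; the constant and function names are made up for this example, and the 10 Gbit/s link speed is assumed from the X520-DA2 adapter named earlier.

```python
LINK_BITS_PER_SECOND = 10 * 10**9  # assumed: one 10 Gigabit Ethernet port
MIN_FRAME_BYTES = 64               # minimum Ethernet frame, FCS included
PREAMBLE_SFD_BYTES = 8             # preamble plus start-of-frame delimiter
INTERFRAME_GAP_BYTES = 12          # mandatory idle gap between frames


def max_packets_per_second(frame_bytes=MIN_FRAME_BYTES):
    """Line-rate packets per second for frames of the given size."""
    wire_bytes = frame_bytes + PREAMBLE_SFD_BYTES + INTERFRAME_GAP_BYTES
    return LINK_BITS_PER_SECOND / (wire_bytes * 8)


if __name__ == "__main__":
    pps = max_packets_per_second()
    print(f"{pps / 1e6:.2f} Mpps, {1e9 / pps:.1f} ns per packet")
```

Each 64-byte frame occupies 84 bytes of wire time, which leaves roughly 67 ns of processing budget per packet at line rate.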

Page 13

Say hello to Flow Director

The flow director filters identify specific flows or sets of flows and routes them to specific queues. The flow director filters are programmed by FDIRCTRL and all other FDIR registers. The 82599 shares the Rx packet buffer for the storage of these filters.

Page 14

What Flow Director can do

• Perfect match filters — The hardware checks a match between the masked fields of the received packets and the programmed filters. Masked fields should be programmed as zeros in the filter context. The 82599 supports up to 8 K - 2 perfect match filters.

• Signature filters — The hardware checks a match between a hash-based signature of the masked fields of the received packet. The 82599 supports up to 32 K - 2 signature filters.
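The idea behind a perfect match filter can be modelled in software as a field-by-field comparison under a mask. The sketch below is a hypothetical illustration of that idea only: the `Rule` type, the "care" mask convention (1 bits are compared, 0 bits ignored) and the queue numbers are assumptions for the example and do not reproduce the 82599 register layout or the ethtool mask convention.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    value: int  # expected bits of the field
    care: int   # 1 bits are compared, 0 bits are ignored


def matches(field, rule):
    """Exact comparison of the bits selected by the care mask."""
    return (field & rule.care) == (rule.value & rule.care)


def first_match(packet, filters):
    """packet: {field name: int}; filters: list of (queue, {field name: Rule})."""
    for queue, fields in filters:
        if all(matches(packet[name], rule) for name, rule in fields.items()):
            return queue
    return None  # no filter hit; the NIC would fall back to its default steering


def ip(text):
    return int(ipaddress.ip_address(text))


if __name__ == "__main__":
    filters = [(3, {"dst_ip": Rule(ip("10.1.0.0"), 0xFFFFFF00),
                    "dst_port": Rule(53, 0xFFFF)})]
    print(first_match({"dst_ip": ip("10.1.0.7"), "dst_port": 53}, filters))
```

A signature filter differs in that it compares a hash of the masked fields instead of the fields themselves, trading exactness for a larger number of filters.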

Page 15

What perfect filters can match (instantaneously)

• VLAN
• proto
• src_ip/mask
• src_port
• dst_ip/mask
• dst_port
• Flexible 2-byte tuple anywhere in the first 64 bytes of the packet (FRAME!)

Page 16

Not so perfect

(a Flow Director miscarriage)

• Consume RX buffer memory (256/512)
• Cannot do IF-THEN
• Masks are GLOBAL for signature filters
• 64 bytes is frustratingly little
• Supported by ethtool (perfect, buggy) and PF_RING (signature only)

But even for that much: thank you, Intel!

Page 17

Flex Filters

(miscarriages of the RSS implementation)

• 128 bytes of the packet (FRAME!)
• 6 filters
• Briefly disabled while being accessed (R|W)
• No publicly available userland configurator

Page 18

What about TCP SYN?

• SYN without a sequence number
• SYN without MSS
• … and other blunders where a signature can be derived within the first 128 bytes
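To make the two signatures above concrete, here is a small software sketch that inspects a raw TCP header. It reads "SYN without a sequence number" as a SYN whose sequence number is zero, which is an interpretation rather than something the slide spells out; the function names and anomaly labels are hypothetical. A hardware filter would match fixed byte offsets in the frame instead of parsing options like this.

```python
import struct

SYN = 0x02
OPT_EOL, OPT_NOP, OPT_MSS = 0, 1, 2


def has_option(options, kind):
    """Walk the TCP option list and report whether `kind` is present."""
    i = 0
    while i < len(options):
        k = options[i]
        if k == OPT_EOL:
            break
        if k == OPT_NOP:
            i += 1
            continue
        if i + 1 >= len(options) or options[i + 1] < 2:
            break  # malformed option length, stop parsing
        if k == kind:
            return True
        i += options[i + 1]
    return False


def tcp_syn_anomalies(tcp_header):
    """Return anomaly labels for a raw TCP header; [] if it is not a SYN."""
    _src, _dst, seq, _ack, off_flags, _win, _csum, _urg = struct.unpack(
        "!HHIIHHHH", tcp_header[:20])
    if not off_flags & SYN:
        return []
    header_len = (off_flags >> 12) * 4  # data offset is counted in 32-bit words
    found = []
    if seq == 0:
        found.append("zero-seq")
    if not has_option(tcp_header[20:header_len], OPT_MSS):
        found.append("no-mss")
    return found
```

For example, a bare 20-byte SYN header with sequence number 0 yields both labels, while a SYN carrying an MSS option and a non-zero sequence number yields none.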

Page 19

What about a perfect TCP SYN?

It hurts to die at 400 kpps…

Page 20

Post mortem

# ========
#
# Samples: 19K of event 'cycles'
# Event count (approx.): 12923232073
#
# Overhead      Command      Shared Object              Symbol
# ........  ...........  .................  ..................
#
    78.74%  ksoftirqd/0  [kernel.kallsyms]  [k] _raw_spin_lock
            |
            --- _raw_spin_lock
               |
               |--98.84%-- tcp_v4_rcv
               |          ip_local_deliver_finish
               |          ip_local_deliver
               |          ip_rcv_finish
               |          ip_rcv
               |          __netif_receive_skb
               |          netif_receive_skb
               |          napi_skb_finish
               |          napi_gro_receive
               |          0xffffffffa005c134
               |          net_rx_action
               |          __do_softirq
               |          run_ksoftirqd
               |          smpboot_thread_fn
               |          kthread
               |          ret_from_fork

Page 21

net/ipv4/tcp_ipv4.c

process:
    if (sk->sk_state == TCP_TIME_WAIT)
        goto do_time_wait;

    if (unlikely(iph->ttl < inet_sk(sk)->min_ttl)) {
        NET_INC_STATS_BH(net, LINUX_MIB_TCPMINTTLDROP);
        goto discard_and_relse;
    }

    if (!xfrm4_policy_check(sk, XFRM_POLICY_IN, skb))
        goto discard_and_relse;
    nf_reset(skb);

    if (sk_filter(sk, skb))
        goto discard_and_relse;

    skb->dev = NULL;

    bh_lock_sock_nested(sk);
    ret = 0;
    if (!sock_owned_by_user(sk)) {

[dd]

    }
    bh_unlock_sock(sk);

    sock_put(sk);

    return ret;

Page 22

Someone is to blame again...

• SYN flood detector
• TCP Cookie Transactions
• MD5SUM

Page 23

Let's hack together an Innovative Kludge!

*rawpost
:POSTROUTING ACCEPT [15:1548]
-A POSTROUTING -s 10.1.0.0/24 -o eth8 -j RAWSNAT --to-source 10.10.40.3/32
COMMIT
# Completed on Mon May 20 04:47:30 2013
# Generated by iptables-save v1.4.16.3 on Mon May 20 04:47:30 2013
*raw
:PREROUTING ACCEPT [28:2128]
:OUTPUT ACCEPT [18:2056]
-A PREROUTING -d 10.10.40.3/32 -m cpu --cpu 0 -j RAWDNAT --to-destination 10.1.0.1/32
-A PREROUTING -d 10.10.40.3/32 -m cpu --cpu 1 -j RAWDNAT --to-destination 10.1.0.2/32
-A PREROUTING -d 10.10.40.3/32 -m cpu --cpu 2 -j RAWDNAT --to-destination 10.1.0.3/32
-A PREROUTING -d 10.10.40.3/32 -m cpu --cpu 3 -j RAWDNAT --to-destination 10.1.0.4/32
-A PREROUTING -d 10.10.40.3/32 -m cpu --cpu 4 -j RAWDNAT --to-destination 10.1.0.5/32
-A PREROUTING -d 10.10.40.3/32 -m cpu --cpu 5 -j RAWDNAT --to-destination 10.1.0.6/32
-A PREROUTING -d 10.10.40.3/32 -m cpu --cpu 6 -j RAWDNAT --to-destination 10.1.0.7/32
-A PREROUTING -d 10.10.40.3/32 -m cpu --cpu 7 -j RAWDNAT --to-destination 10.1.0.8/32
COMMIT
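The ruleset appears to give every CPU its own destination address: a packet for 10.10.40.3 that is processed on CPU n is rewritten statelessly to 10.1.0.(n+1), so, presumably, each CPU ends up on a separate listening socket and no longer fights over one socket lock. The eight PREROUTING rules follow a regular pattern, which the sketch below generates. The helper name is hypothetical, and the option spellings (`-m cpu --cpu`, `-j RAWDNAT --to-destination`) are assumed from the cpu match and the xtables-addons RAWNAT targets, since the slide extraction dropped the dashes.

```python
import ipaddress


def rawdnat_rules(public_ip, first_local_ip, cpus):
    """One raw-table rule per CPU: CPU n gets first_local_ip + n."""
    base = ipaddress.ip_address(first_local_ip)
    return [
        f"-A PREROUTING -d {public_ip}/32 -m cpu --cpu {cpu} "
        f"-j RAWDNAT --to-destination {base + cpu}/32"
        for cpu in range(cpus)
    ]


if __name__ == "__main__":
    print("\n".join(rawdnat_rules("10.10.40.3", "10.1.0.1", 8)))
```

The matching RAWSNAT rule on the way out restores 10.10.40.3 as the source, so the peer only ever sees the public address.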

Page 24

WIN!

[Chart: throughput in kPPS (0-3000) versus number of RSS queues (1-16)]

Page 25

Questions?

Page 26

For dessert

What happens if you send a packet to a port nobody is listening on?

And what if you send lots and lots of packets?

Page 27

Linux 3.5.7

[Chart: throughput in kPPS (0-600) versus number of RSS queues (1-16)]

Page 28

net/ipv4/ip_output.c

    bh_lock_sock(sk);
    inet->tos = arg->tos;
    sk->sk_priority = skb->priority;
    sk->sk_protocol = ip_hdr(skb)->protocol;
    sk->sk_bound_dev_if = arg->bound_dev_if;
    ip_append_data(sk, &fl4, ip_reply_glue_bits, arg->iov->iov_base, len, 0,
                   &ipc, &rt, MSG_DONTWAIT);
    if ((skb = skb_peek(&sk->sk_write_queue)) != NULL) {
        if (arg->csumoffset >= 0)
            *((__sum16 *)skb_transport_header(skb) + arg->csumoffset) =
                csum_fold(csum_add(skb->csum, arg->csum));
        skb->ip_summed = CHECKSUM_NONE;
        ip_push_pending_frames(sk, &fl4);
    }
    bh_unlock_sock(sk);

Page 29

Thank you, Eric!

commit be9f4a44e7d41cee50ddb5f038fc2391cbbb4046
Author: Eric Dumazet <[email protected]>
Date:   Thu Jul 19 07:34:03 2012 +0000

    ipv4: tcp: remove per net tcp_sock

    tcp_v4_send_reset() and tcp_v4_send_ack() use a single socket
    per network namespace.

    This leads to bad behavior on multiqueue NICS, because many cpus
    contend for the socket lock and once socket lock is acquired, extra
    false sharing on various socket fields slow down the operations.

    To better resist to attacks, we use a percpu socket. Each cpu can
    run without contention, using appropriate memory (local node)
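The structural change described in the commit message, one shared object behind a lock versus one object per CPU, can be sketched as a toy model. The classes below are hypothetical and only illustrate the shape of the change: a counter stands in for the socket, threads stand in for CPUs, and Python threads will not reproduce the kernel's performance effect.

```python
import threading


class SharedCounter:
    """One object for everybody: every update takes the same lock."""

    def __init__(self):
        self.lock = threading.Lock()
        self.value = 0

    def add(self, worker, n=1):
        with self.lock:  # all workers contend here
            self.value += n

    def total(self):
        return self.value


class PerWorkerCounter:
    """One slot per worker: a worker only ever writes its own slot."""

    def __init__(self, workers):
        self.slots = [0] * workers

    def add(self, worker, n=1):
        self.slots[worker] += n  # no lock: no other worker writes this slot

    def total(self):
        return sum(self.slots)


def run(counter, workers, iterations):
    """Start `workers` threads that each call add() `iterations` times."""
    def work(worker):
        for _ in range(iterations):
            counter.add(worker)

    threads = [threading.Thread(target=work, args=(w,)) for w in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.total()
```

Both variants produce the same total; the per-worker variant simply never needs the lock, which is the property the percpu socket provides.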

Page 30

Thank you, Eric!

[Chart: throughput in kPPS (0-4500) versus number of RSS queues (1-16)]

Page 31

To hell with the musketeers, it's FRIDAY!