How VXLAN works on L2 and across L3 networks

VXLAN By Anand Nande


Post on 08-Jan-2017



AGENDA

○ What is VXLAN?
○ Why VXLAN?
○ How does it work?
○ So now we can migrate VMs across subnets?
○ What about routing across VXLANs?
○ Any performance impact?
○ Demo

What is it?
Virtual eXtensible Local Area Network

● Tunneling protocol co-developed by a group of companies.

● Original description as per the RFC:

“Used to address the need for overlay networks within virtualized data centers accommodating multiple tenants. The scheme and the related protocols can be used in networks for cloud service providers and enterprise data centers.”

● Encapsulation method to extend L2 traffic over an L3 network. In its original form, VXLAN relies on IP multicast in the underlay to carry broadcast, unknown-unicast and multicast traffic. It uses traditional flood-and-learn for MAC learning and Address Resolution Protocol (ARP) resolution.

Why VXLAN?
What problems does it address and how am I selling it

It is intended to solve the following issues:

● Limitations imposed by Spanning Tree and VLAN ranges
  ○ STP blocks the use of links to avoid the replication and looping of frames.
    ■ This is a problem for DC admins, who pay for ports and links that then sit unused.
    ■ Resiliency through multipathing is not available.
  ○ 2^12 = 4096
    ■ A 12-bit VLAN ID is used to divide the network into broadcast domains.
    ■ A few IDs in this range are reserved (for example 0 and 4095), so fewer than 4096 are usable.

● Multi-tenant environments
  ○ Can the need for multiple VLANs per tenant be met within the 4096 limit?
  ○ L3 networks are not a comprehensive solution: two tenants might use the same set of Layer 3 addresses within their networks.

● Inadequate table sizes at the ToR switch

How does it work?
Let’s not draw conclusions yet... there’s a lot more to it

● VXLAN encapsulates Layer 2 frames into Layer 3 packets (using UDP)

● Adds a 24-bit VXLAN Network Identifier (VNI), which allows for up to 2^24 (about 16 million) unique segments

● VXLAN Segments are built between VXLAN Tunnel Endpoints (VTEPs)...more on this in coming slides

● Does not need to be understood by the physical networking devices in between: the transport that carries the encapsulated frames only needs to be an IP-based network.
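To put the two identifier sizes side by side, here is a small arithmetic sketch in Python. The variable names are only for illustration; the bit widths are the 12-bit VLAN ID and 24-bit VNI mentioned in the slides.

```python
# Size of the VLAN ID space vs. the VXLAN VNI space.
VLAN_ID_BITS = 12   # 802.1Q VLAN ID field
VNI_BITS = 24       # VXLAN Network Identifier field

vlan_ids = 2 ** VLAN_ID_BITS
vni_ids = 2 ** VNI_BITS

print(f"VLAN IDs: {vlan_ids}")
print(f"VNIs:     {vni_ids}")
print(f"VNIs per VLAN ID: {vni_ids // vlan_ids}")
```

In other words, the VNI space is as large as giving every possible VLAN ID its own full set of 4096 segments.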

VXLAN Encapsulation

The final packet that goes out on L3. Note how the inner MAC frame is the payload here, and how an outer MAC header, outer IP header, UDP header and VXLAN header are added.

The VNI is the only distinguishing identifier for a VXLAN segment

The L2 Ethernet Frame
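As a rough sketch of the VXLAN header itself, the Python below packs and unpacks the 8-byte layout described in RFC 7348 (8 flag bits with the "I" bit set, 24 reserved bits, the 24-bit VNI, 8 reserved bits). The function names are made up for illustration, and the outer Ethernet/IP/UDP headers are left out on purpose: in practice the sending host's network stack adds them.

```python
import struct

VXLAN_FLAG_VNI_VALID = 0x08  # the "I" flag from RFC 7348


def pack_vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header for a given VNI."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags in the top byte, 24 reserved bits left at zero.
    # Word 2: VNI in the top 24 bits, 8 reserved bits left at zero.
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def unpack_vni(header: bytes) -> int:
    """Extract the VNI from the first 8 bytes of a VXLAN payload."""
    word1, word2 = struct.unpack("!II", header[:8])
    if not (word1 >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI-valid flag is not set")
    return word2 >> 8


def encapsulate(inner_frame: bytes, vni: int) -> bytes:
    """Prefix an inner L2 frame with a VXLAN header (UDP payload only)."""
    return pack_vxlan_header(vni) + inner_frame


header = pack_vxlan_header(10)
print(header.hex())
print(unpack_vni(header))
```

The inner frame travels untouched behind the header, which is why the VM's own MAC addresses survive the trip across the L3 network.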

VTEPs across an L3 network

[Diagram: two hypervisors connected over an L3 network, hosting VMs VM1-1, VM2-1, VM1-2 and VM2-2.]

Hypervisor 1: VTEP-1, 10.0.0.1
Hypervisor 2: VTEP-2, 10.0.0.2

VXLAN segment-1 w/ VNI=10 (172.16.1.0/24)
  IP=172.16.1.10    MAC=52:54:00:0e:08:b3
  IP=172.16.1.12    MAC=52:54:00:30:de:e3

VXLAN segment-2 w/ VNI=20 (192.168.1.0/24)
  IP=192.168.1.100  MAC=00:0C:29:2F:32:A0
  IP=192.168.1.111  MAC=00:0C:29:2F:23:A0

VTEPs are responsible for encapsulating and decapsulating packets at both ends of the wire

VXLAN Multicast group

Forwarding tables of the three VTEPs in the group (VTEP-1's, VTEP-2's and VTEP-3's tables). Each table lists the MAC addresses learned behind the other two VTEPs.

MAC                VNI ID  REMOTE VTEP
52:54:00:a0:1b:bb  10      192.168.122.186
52:54:00:8a:bd:ff  10      192.168.122.101

MAC                VNI ID  REMOTE VTEP
52:54:00:60:18:f9  10      192.168.122.141
52:54:00:8a:bd:ff  10      192.168.122.101

MAC                VNI ID  REMOTE VTEP
52:54:00:a0:1b:bb  10      192.168.122.186
52:54:00:60:18:f9  10      192.168.122.141
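A table like the ones above can be modelled as a mapping from (VNI, MAC) to a remote VTEP address. The Python sketch below is a toy model of flood-and-learn, not how any particular VTEP implements it. The class name and the multicast group address 239.1.1.1 are assumptions for illustration; the MAC and VTEP addresses are taken from the first table.

```python
class VtepTable:
    """Toy flood-and-learn table: (VNI, MAC) -> remote VTEP IP."""

    def __init__(self):
        self.entries = {}

    def learn(self, vni, src_mac, remote_vtep):
        # Called when a decapsulated frame reveals where a MAC lives.
        self.entries[(vni, src_mac.lower())] = remote_vtep

    def next_hop(self, vni, dst_mac, flood_group):
        # Known MAC: unicast to its VTEP. Unknown MAC: flood to the group.
        return self.entries.get((vni, dst_mac.lower()), flood_group)


FLOOD_GROUP = "239.1.1.1"  # hypothetical multicast group address

table = VtepTable()
table.learn(10, "52:54:00:a0:1b:bb", "192.168.122.186")
table.learn(10, "52:54:00:8a:bd:ff", "192.168.122.101")

# A learned MAC goes straight to its VTEP; an unknown one is flooded.
print(table.next_hop(10, "52:54:00:a0:1b:bb", FLOOD_GROUP))
print(table.next_hop(10, "52:54:00:60:18:f9", FLOOD_GROUP))
```

Keying on the VNI as well as the MAC is what lets two tenants reuse the same MAC or IP addresses without colliding.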

Packet capture on one of the VTEPs

[Screenshot: Wireshark capture] An additional Wireshark plugin is required to analyse the UDP payload here

So now can we migrate VMs across subnets?

..the answer is ‘YES - we can’

What's the most important thing when we migrate a VM?

- The VM’s IP and MAC addresses should not change, even after the VM has been migrated to a new hypervisor.

- We use the reference controller, which learns the new VM placement using OVSDB.

# ps -ef | grep controller | grep -v color
root 1396 1325 0 Feb17 pts/3 00:00:01 controller -v ptcp:6633

- This reference controller is located on each VTEP.

VM communication via VTEP post-migration

[Diagram: the same two-hypervisor topology after migration, showing VM2-1, VM1-1 and VM1-2.]

Hypervisor 1: VTEP-1, 10.0.0.1
Hypervisor 2: VTEP-2, 10.0.0.2

172.16.1.0/24, VNI=10
  IP=172.16.1.10    MAC=52:54:00:0e:08:b3
  IP=172.16.1.12    MAC=52:54:00:30:de:e3

VNI=20
  IP=192.168.1.100  MAC=00:0C:29:2F:32:A0

* The VM's port is updated to point to the new host.

* When the port is updated by Nova, Neutron executes ML2 mechanism drivers, including l2pop, which then sends RPC messages to the new host.

Flows related to VM1-1 are removed on VTEP-1 and added on VTEP-2
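From the point of view of the other VTEPs, the migration amounts to repointing one forwarding entry while the VM's own addresses stay the same. The Python sketch below shows only that idea. It is not the l2pop implementation: the `migrate` helper is hypothetical, and 52:54:00:0e:08:b3 is used as the migrated VM's MAC purely for illustration.

```python
def migrate(fdb, vni, mac, new_vtep):
    """Repoint a (VNI, MAC) entry to a new VTEP; return the old VTEP."""
    key = (vni, mac.lower())
    old_vtep = fdb.get(key)
    fdb[key] = new_vtep
    return old_vtep


# Forwarding entries as seen by some other VTEP: (VNI, MAC) -> VTEP IP.
fdb = {(10, "52:54:00:0e:08:b3"): "10.0.0.1"}

old = migrate(fdb, 10, "52:54:00:0e:08:b3", "10.0.0.2")

# The key (the VM's VNI and MAC) is unchanged; only the VTEP moved.
print(old, "->", fdb[(10, "52:54:00:0e:08:b3")])
```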

What about routing across VXLANs ?

Let's look at the two-way communication that happens

physical:virtual

[VLAN-to-VXLAN]

- Also known as ‘VXLAN bridging’, which extends the L2 domain over a vast L3 network.

- The VTEPs are responsible here for checking the VNI in the VXLAN header. If it matches one of the VNIs in the VTEP's table, the packet is forwarded to the relevant VXLAN segment.

- The switch must be capable of handling VXLAN.
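The VNI check described above can be sketched as a lookup in a VLAN-to-VNI mapping held by the bridging VTEP. This Python is a minimal illustration only: the VLAN IDs 100 and 200 and the function names are assumptions, while VNIs 10 and 20 are the ones used elsewhere in the slides.

```python
# Hypothetical bridging table on a VTEP: local VLAN ID <-> VNI.
VLAN_TO_VNI = {100: 10, 200: 20}
VNI_TO_VLAN = {vni: vlan for vlan, vni in VLAN_TO_VNI.items()}


def on_vxlan_receive(vni):
    """Return the local VLAN to bridge into, or None to drop the packet."""
    return VNI_TO_VLAN.get(vni)


def on_vlan_receive(vlan):
    """Return the VNI to encapsulate with, or None if the VLAN is unmapped."""
    return VLAN_TO_VNI.get(vlan)


print(on_vxlan_receive(10))   # a VNI present in the table
print(on_vxlan_receive(30))   # an unknown VNI is not forwarded
print(on_vlan_receive(200))
```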

virtual:physical

[VXLAN-to-VLAN]

- VLAN to router
- Hardware VTEP (router) only does bridging
- On the physical router:
  - in_ports, out_ports, loopback_ports
    1. in_ports > bridge into a VLAN > lo_ports
    2. router_fib > next_hop > physical_world
- Performance loss because of the loopback

Any Performance Impact?
Cliché but true: “Your loss is someone else's gain”

● An overlay network is simply a computer network which is built on top of another network.

● The added flexibility comes at a cost: the additional processing overhead for encapsulating and decapsulating packets. This consumes CPU resources and degrades network performance, especially on high-speed connections.

● By introducing hardware offloading capabilities that can be found in some of today’s modern NICs, the added overhead for packet processing can be offloaded to the NIC hardware, resulting in improved CPU utilization and higher throughput.

● Improve performance of VXLAN w/ hardware offload: https://access.redhat.com/articles/1390483
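Besides CPU cost, encapsulation also adds bytes to every packet. The Python below is a back-of-the-envelope sketch of that overhead, assuming an IPv4 underlay with no IP options, no outer VLAN tag, and an underlay MTU of 1500 bytes (all assumptions, not figures from the slides).

```python
# Header sizes in bytes (IPv4 underlay, no options, no outer VLAN tag).
OUTER_ETHERNET = 14
OUTER_IPV4 = 20
OUTER_UDP = 8
VXLAN_HEADER = 8
INNER_ETHERNET = 14

# Bytes added in front of the original L2 frame.
encap_overhead = OUTER_ETHERNET + OUTER_IPV4 + OUTER_UDP + VXLAN_HEADER

# The underlay MTU covers everything after the outer Ethernet header,
# so the inner IP packet must leave room for the other headers.
underlay_mtu = 1500
inner_ip_mtu = (underlay_mtu - OUTER_IPV4 - OUTER_UDP
                - VXLAN_HEADER - INNER_ETHERNET)

print(f"Encapsulation overhead: {encap_overhead} bytes")
print(f"Largest inner IP packet: {inner_ip_mtu} bytes")
```

Under these assumptions, either the VM-side MTU is lowered or the underlay MTU is raised to avoid fragmentation.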

Demo
Let’s put this into action by creating a simple topology

[Diagram: demo topology with three hosts at 192.168.122.101, 192.168.122.186 and 192.168.122.141. One part of the diagram is marked: "Ignore for the purpose of the demo. You can do this later as an advanced exercise."]

[On your KVM host or laptop]

# yum install virt-manager libvirt* qemu-kvm openvswitch
# cd /var/lib/libvirt/images

## Get the mininet vmdk image from the mininet website

## Write spawn.sh (shown under "Hands on")

# chmod +x spawn.sh
# ./spawn.sh <vmname>

(Note: you need to spawn 2 mininet VMs, so run the above script twice.)

Hands on

# cat > spawn.sh << 'EOM'
> #!/bin/sh
> export VM_NAME=$1
> export portgroup=$2
> IMAGES_BASE=/var/lib/libvirt/images
> cp $IMAGES_BASE/mininet.vmdk $IMAGES_BASE/$VM_NAME.vmdk
>
> virt-install -r 256 \
>   -n $VM_NAME \
>   --vcpus=1 \
>   --import \
>   --autostart \
>   --memballoon virtio \
>   --network network=host-only \
>   --disk $IMAGES_BASE/$VM_NAME.vmdk
> EOM

(The heredoc delimiter is quoted so that $1, $2 and $VM_NAME are written into the script literally instead of being expanded while the file is created.)

[Inside your Mininet VM-1]

Create a simple topology:
mininet@mininet-vm:~$ sudo mn --topo single,1

Create a vxlan interface on each side:
mininet> sh ovs-vsctl add-port s1 vxlan

Set up the vxlan interface on each side:
mininet> sh ovs-vsctl set interface vxlan type=vxlan option:remote_ip=192.168.122.101

Set up IPs for the hosts:
mininet> h1 ifconfig h1-eth0 10.0.0.1

[Inside your Mininet VM-2]

mininet@mininet-vm:~$ sudo mn --topo single,1

mininet> sh ovs-vsctl add-port s1 vxlan

mininet> sh ovs-vsctl set interface vxlan type=vxlan option:remote_ip=192.168.122.186

mininet> h1 ifconfig h1-eth0 10.0.0.2

Advanced VXLAN configuration

Manually assign VXLAN Network ID (VNI) and/or OpenFlow port number

e.g. VNI=5566, OF_PORT=9

# ovs-vsctl set interface vxlan type=vxlan option:remote_ip=192.168.122.186 option:key=5566 ofport_request=9

Assign VNI flow by flow (Useful when implementing multi-tenant environment)

# ovs-vsctl set interface vxlan type=vxlan option:remote_ip=140.113.215.200 option:key=flow ofport_request=9

and then setup your flow entries by

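# ovs-ofctl add-flow s1 "in_port=1,actions=set_field:5566->tun_id,output:9"

# ovs-ofctl add-flow s1 "in_port=9,tun_id=5566,actions=output:1"

(The flow is quoted so that the shell does not treat the ">" in "->" as an output redirection.)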
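For a multi-tenant setup, the same pair of flows is repeated for each tenant with a different VNI. The Python sketch below only builds the `ovs-ofctl add-flow` command lines for one tenant and prints them; it does not run them. The `tenant_flows` helper and its parameter names are hypothetical, and the flow syntax is copied from the two commands above.

```python
import shlex


def tenant_flows(bridge, access_port, tunnel_port, vni):
    """Build the two ovs-ofctl commands that tie an access port to a VNI."""
    to_tunnel = (f"in_port={access_port},"
                 f"actions=set_field:{vni}->tun_id,output:{tunnel_port}")
    from_tunnel = (f"in_port={tunnel_port},tun_id={vni},"
                   f"actions=output:{access_port}")
    return [
        ["ovs-ofctl", "add-flow", bridge, to_tunnel],
        ["ovs-ofctl", "add-flow", bridge, from_tunnel],
    ]


flows = tenant_flows("s1", access_port=1, tunnel_port=9, vni=5566)
for argv in flows:
    # shlex.join quotes each argument so the line is safe to paste in a shell.
    print(shlex.join(argv))
```

Keeping each command as an argument list means it could later be handed to something like `subprocess.run` without the shell ever seeing the `->` in the flow.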

References

- IETF RFC: tools.ietf.org/html/rfc7348
- Scott Lowe’s blog: http://blog.scottlowe.org/
- David Mahler's YouTube channel: https://www.youtube.com/user/mahler711
- VMware VXLAN Performance Evaluation: https://www.vmware.com/files/pdf/techpaper/VMware-vSphere-VXLAN-Perf.pdf
- TCP/IP over VXLAN Bandwidth Overheads: http://packetpushers.net/vxlan-udp-ip-ethernet-bandwidth-overheads/